
Proceedings of the
2nd Asia-Pacific
Conference on IAT


Editors

Ning Zhong
Jiming Liu
Setsuo Ohsuga
Jeffrey Bradshaw
World Scientific
Intelligent Agent Technology
Research and Development

Proceedings of the 2nd Asia-Pacific Conference on IAT

Editors
Ning Zhong
Maebashi Institute of Technology, Japan

Jiming Liu
Hong Kong Baptist University

Setsuo Ohsuga
Waseda University, Japan

Jeffrey Bradshaw
University of West Florida, USA

World Scientific
New Jersey • London • Singapore • Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

INTELLIGENT AGENT TECHNOLOGY


Research and Development
Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.

ISBN 981-02-4706-0

Printed in Singapore by World Scientific Printers (S) Pte Ltd


PREFACE

Intelligent Agent Technology is concerned with the development of autonomous
computational or physical entities capable of perceiving, reasoning, adapting,
learning, cooperating, and delegating in a dynamic environment. It is one of the
most promising areas of research and development in information technology,
computer science, and engineering today.

This book is an attempt to capture the essence of the current state of the art in
intelligent agent technology and to identify the new challenges and opportunities
that it is or will be facing. It contains the papers accepted for presentation at The
Second Asia-Pacific Conference on Intelligent Agent Technology (IAT '01), held in
Maebashi, Japan, October 23-26, 2001. The second meeting in the IAT conference
series follows the success of IAT '99 held in Hong Kong in 1999. IAT '01 brought
together researchers and practitioners to share their original research results and
practical development experiences in intelligent agent technology. The most
important feature of this conference was that it emphasized a multi-faceted, holistic
view of this emerging technology, from its computational foundations, in terms of
models, methodologies, and tools for developing a variety of embodiments of agent-
based systems, to its practical impact on tackling real-world problems.

Much work has gone into the preparation of the IAT '01 technical program:
Original, high-quality papers were solicited for various aspects of theories,
applications, and case studies related to agent technologies. In total, 134 full papers were
submitted from 32 countries and regions across all continents. Each submitted paper
was reviewed by at least three experts on the basis of technical soundness, relevance,
originality, significance, and clarity. Based on the review reports, 25 regular papers
(19%) and 40 short papers were accepted for presentation and publication.

This book is structured into six chapters according to the main conference sessions:

Chapter 1. Formal Agent Theories
Chapter 2. Computational Architecture and Infrastructure
Chapter 3. Learning and Adaptation
Chapter 4. Knowledge Discovery and Data Mining Agents
Chapter 5. Distributed Intelligence
Chapter 6. Agent-Based Applications

In addition to the above chapters, this book also includes the abstract or papers for
the IAT '01 keynote/invited talks by Benjamin W. Wah, Toyoaki Nishida, Zbigniew
W. Ras, Andrzej Skowron, and Katia Sycara, which provide different perspectives
to Intelligent Agent Technology.


We wish to express our gratitude to all members of the Conference Committee and
the International Advisory Board for their instrumental and unfailing support.
IAT '01 had a very exciting program, ranging from technical sessions and invited
talks to agent demos and social programs. All of this work
would not have been possible without the generous dedication of the Program
Committee members and the external reviewers in reviewing the papers submitted
to IAT '01, of our invited speakers, Benjamin W. Wah, Toyoaki Nishida, Zbigniew
W. Ras, Andrzej Skowron, and Katia Sycara, in preparing and presenting their very
stimulating talks, and of Jianchang Mao (Demos & Exhibits Chair) in soliciting
demo proposals and setting up the program. We thank them for their strong support.

The conference Web support team at the Knowledge Information Systems
Laboratory, Maebashi Institute of Technology did a terrific job of putting together
and maintaining the home page for the conference, as well as building a software
system, cyber-chair, an intelligent agent that serves as an interface among organizers,
program committee members, and authors/attendees. We would like to thank Juzhen
Dong, Muneaki Ohsima, Norichika Hayazaki of the conference Web support team
for their dedication and hard work.

IAT '01 could not have taken place without the great team effort of the Local
Organizing Committee and the support of Maebashi Institute of Technology and
Maebashi Convention Bureau. Our special thanks go to Nobuo Otani (Local
Organizing Chair), Sean M. Reedy, Masaaki Sakurai, Kanehisa Sekine, and
Yoshitsugu Kakemoto (the Local Organizing Committee members) for their
enormous efforts in planning and arranging the logistics of the conference from
registration/payment handling, venue preparation, accommodation booking, to
banquet/social program organization. We are very grateful to the IAT '01 sponsors:
ACM SIGART, Maebashi Institute of Technology, Maebashi Convention Bureau,
Maebashi City Government, Gunma Prefecture Government, The Japan Research
Institute, Limited, United States Air Force Office of Scientific Research, Asian
Office of Aerospace Research and Development, and United States Army Research
Office in Far East, and Web Intelligence Laboratory, Inc. for their generous support.
We thank ACM SIGWEB, SIGCHI, Japanese Society for Artificial Intelligence,
JSAI SIGFAI, SIGKBS, and IEICE SIGKBSE for being in cooperation with
IAT '01. Last but not least, we thank Ms. Lakshmi Narayanan of World
Scientific for her help in coordinating the publication of this book.

October 2001
Ning Zhong and Jiming Liu
Program Committee Chairs
Setsuo Ohsuga and Jeffrey Bradshaw
General Conference Chairs
CONFERENCE ORGANIZATION

General Chairs: Setsuo Ohsuga (Waseda U., Japan)
Jeffrey Bradshaw (Inst. H&M Cognition, USA)
Program Chairs: Ning Zhong (Maebashi Inst. Technology, Japan)
Jiming Liu (Hong Kong Baptist U.)
Demos and Exhibits Chair: Jianchang Mao (Verity Inc., USA)
Local Organizing Chair: Nobuo Otani (Maebashi Inst. Technology, Japan)

International Advisory Board


Jeffrey M. Bradshaw (Inst. H&M Cognition, USA) Setsuo Ohsuga (Waseda U., Japan)
Michele L. D. Gaudreault (US AOARD) Patrick S. P. Wang (Northeastern U., USA)
Daniel T. Ling (Microsoft Corp., USA) Yiyu Yao (U. Regina, Canada)
Jiming Liu (Hong Kong Baptist U.) Jie Yang (U. Science & Technology of China)
Jianchang Mao (Verity Inc., USA) Ning Zhong (Maebashi Inst. Technology, Japan)
Hiroshi Motoda (Osaka U., Japan) Jan Zytkow (U. North Carolina, USA)

Local Organizing Committee


Masahiko Satori (Maebashi Inst. Tech., Japan) Toshio Kawamura (Maebashi Convention B.)
Tadaomi Miyazaki (Maebashi Inst. Tech., Japan) Masaaki Sakurai (Maebashi Convention Bureau)
Nobuo Otani (Maebashi Inst. Technology, Japan) Kanehisa Sekine (Maebashi Convention Bureau)
Sean M. Reedy (Maebashi Inst. Technology, Japan) Midori Asaka (IPA, Japan)
Ning Zhong (Maebashi Inst. Technology, Japan) Yoshitsugu Kakemoto (JRI, Limited, Japan)
Program Committee
K. Suzanne Barber (U. Texas-Austin, USA) Hideyuki Nakashima (ETL, Japan)
Guy Boy (EURISCO, France) Wee-Keong Ng (Nanyang Tech. U., Singapore)
Cristiano Castelfranchi (CNR, Italy) Katsumi Nitta (Tokyo Inst. Technology, Japan)
Kerstin Dautenhahn (U. Hertfordshire, UK) Yoshikuni Onozato (Gunma U., Japan)
Edmund H. Durfee (U. Michigan, USA) Tuncer Oren (Marmara Research Center, Turkey)
E. A. Edmonds (Loughborough U., UK) Ichiro Osawa (ETL, Japan)
Tim Finin (UMBC, USA) Sun Park (Rutgers U., USA)
Adam Maria Gadomski (ENEA, Italy) Van Parunak (ERIM, USA)
Scott Goodwin (U. Regina, Canada) Zbigniew W. Ras (U. North Carolina, USA)
Vladimir Gorodetsky (Russian Academy of Sci.) Eugene Santos (U. Connecticut, USA)
Mark Greaves (The Boeing Company, USA) Zhongzhi Shi (Chinese Academy of Sciences)
Barbara Hayes-Roth (Stanford U., USA) Carles Sierra (Scientific Research Council, Spain)
Michael Huhns (U. South Carolina, USA) Kwang M. Sim (Chinese U. Hong Kong)
Keniti Ida (Maebashi Inst. Technology, Japan) Andrzej Skowron (Warsaw U., Poland)
Toru Ishida (Kyoto U., Japan) Ron Sun (U. Missouri-Columbia, USA)
Lakhmi Jain (U. South Australia) Niranjan Suri (U. West Florida, USA)
Stefan J. Johansson (U. Karlskrona, Sweden) Takao Terano (U. Tsukuba, Japan)
Qun Jin (U. Aizu, Japan) Demetri Terzopoulos (U. Toronto, Canada)
Juntae Kim (Dongguk U., Korea) Huaglory Tianfield (Glasgow Caledonian U., UK)
David Kinny (U. Melbourne, Australia) David Wolpert (NASA Ames Research Center)
Matthias Klusch (German Research Center for AI) Jinglong Wu (Kagawa U., Japan)
Sarit Kraus (U. Maryland, USA) Takahira Yamaguchi (Shizuoka U., Japan)
Danny B. Lange (General Magic, Inc., USA) Kazumasa Yokota (Okayama Prefectural U., Japan)
Jimmy Ho Man Lee (Chinese U. Hong Kong) Eric Yu (U. Toronto, Canada)
Jiming Liu (Hong Kong Baptist U.) P. C. Yuen (Hong Kong Baptist U.)
Mike Luck (U. Southampton, UK) Chengqi Zhang (Deakin U., Australia)
Helen Meng (Chinese U. Hong Kong) Ning Zhong (Maebashi Inst. Technology, Japan)
Joerg Mueller (Siemens, Germany)
TABLE OF CONTENTS

Preface v
Conference Organization vii

Invited Talks
Intelligent Agents for Market-Trend Prediction 2
Benjamin W. Wah
Social Intelligence Design for Knowledge Creating Communities 3
Toyoaki Nishida
Query Answering Based on Distributed Knowledge Mining 17
Zbigniew W. Ras
Approximate Reasoning by Agents in Distributed Environments 28
Andrzej Skowron
Multi-Agent Infrastructure for Agent Interoperation in Open
Computational Environments 40
Katia Sycara

Chapter 1. Formal Agent Theories


SPY: A Multi-Agent Model Yielding Semantic Properties 44
F. Buccafurri, D. Rosaci, G. M. L. Sarné, L. Palopoli
ABT with Asynchronous Reordering 54
Marius-Calin Silaghi, Djamila Sam-Haroud, Boi Faltings
Social Rationality and Cooperation 64
Guido Boella
Belief Revision in Type Theory 69
Tijn Borghuis, Fairouz Kamareddine, Rob Nederpelt
Heterogeneous BDI Agents II: Circumspect Agents 74
Maria Fasli
A Preference-Driven Approach to Designing Agent Systems 80
Stefan J. Johansson, Johan Kummeneje
Agent Consumer Reports: of the Agents, by the Agents,
and for the Agents 86
Xiaocheng Luan, Yun Peng, Timothy Finin
Logical Formalizations Built on Game-Theoretic Argument
about Commitments 92
Lamber Royakkers, Vincent Buskens
Asynchronous Consistency Maintenance 98
Marius-Calin Silaghi, Djamila Sam-Haroud, Boi Faltings

Chapter 2. Computational Architecture and Infrastructure
Reasoning about Mutual-Belief among Multiple Cooperative Agents 104
Wenpin Jiao
Portable Resource Control for Mobile Multi-Agent Systems in JAVA 114
Walter Binder, Jarle G. Hulaas, Alex Villazon, Rory G. Vidal
An Agent-Based Mobile E-Commerce Service Platform for
Forestry and Agriculture 119
Matthias Klusch, Andreas Gerber
An Itinerary Scripting Language for Mobile Agents in Enterprise
Applications 124
Seng Wai Loke, Arkady Zaslavsky, Brian Yap, Joseph Fonseka
Intelligent Agents for Mobile Commerce Services 129
Mihhail Matskin
A New Concept of Agent Architecture in Agentspace 134
T. Nowak, S. Ambroszkiewicz
21st Century Systems, Inc.'s Agent Enabled Decision Guide
Environment (AEDGE™) 139
Plamen V. Petrov, Alexander D. Stoyen, Jeffrey D. Hicks,
Gregory J. Myers
Proactiveness and Effective Observer Mechanisms in Intelligent Agents 144
Jon Plumley, Kuo-Ming Chao, Rachid Anane, Nick Godwin

Chapter 3. Learning and Adaptation


Parrondo Strategies for Artificial Traders 150
Magnus Boman, Stefan J. Johansson, David Lyback
BDI Multi-Agent Learning Based on First-Order Induction of
Logical Decision Trees 160
Alejandro Guerra Hernandez, Amal El-Fallah Seghrouchni,
Henry Soldano
Evolutionary Behaviors of Competitive Agents in Dilemma Situation... 170
Tin Tin Naing, Lifeng He, Atsuko Mutoh, Tsuyoshi Nakamura,
Hidenori Itoh
A Strategy for Creating Initial Data on Active Learning of Multi-Layer
Perceptron 180
Kazunori Iwata, Naohiro Ishii
Equilibrium Selection in a Sequential Multi-Issue Bargaining Model
with Evolutionary Agents 190
Norberto Eiji Nawa, Katsunori Shimohara, Osamu Katai
Affect and Agent Control: Experiments with Simple Affective States 200
Matthias Scheutz, Aaron Sloman
Meta-Learning Processes in Multi-Agent Systems 210
Ron Sun
Scalability and the Evolution of Normative Behavior 220
Jorg Wellner, Sigmar Papendick, Werner Dilger
Thinking-Learning by Argument 230
Aladdin Ayesh
Evolution of a Foraging Model with Many Individuals by Kin-selection 235
Kazue Kinoshita, Atsuko Mutoh, Tsuyoshi Nakamura,
Hidenori Itoh
The Use of Emergent Behaviour in a Multi-Agent System to Drive
Self-Adaptation at the Interface 240
Peter Marshall, Sue Greenwood
A Biologically Inspired Four Legged Robot That Exhibits Some Natural
Walking Behaviours 245
S. Peng, G. R. Cole, C. P. Lam

Chapter 4. Knowledge Discovery and Data Mining Agents


CM-RELVIEW: A Tool for Causal Reasoning in Multi-Agent
Environments 252
Brahim Chaib-Draa
User's Ontology-Based Autonomous Interface Agents 264
Tarek Helmy, Satoshi Amamiya, Makoto Amamiya
Integration and Reuse of Heterogeneous XML DTDs for
Information Agents 274
Euna Jeong, Chun-Nan Hsu
Virtual Museum's Assistant 284
Osvaldo Cairo, Ana Aldeco, M.E. Algorri
Index Based Document Classification with CC4 Neural Networks 289
Enhong Chen, Zhengya Zhang, Xufa Wang, Jie Yang
Price Watcher Agent for E-Commerce 294
Simon Fong, Aixin Sun, Kin Keong Wong
Automated Information Extraction from Web Pages Using
an Interactive Learning Agent 300
Jugal K. Kalita, Paritosh Rohilla
An Intelligent Agent with Structured Pattern Matching for
a Virtual Representative 305
Seung-ik Lee, Sung-Bae Cho
A Calendar Management Agent with Fuzzy Logic 310
Wayne Wobcke
XML Based Multi-Agent Collaboration for Active Digital Libraries 315
Yanyan Yang, Omer F. Rana, David W. Walker,
Roy Williams, Giovanni Aloisio

Chapter 5. Distributed Intelligence


An Intelligent Channel Allocation Scheme for Mobile Networks:
An Application of Agent Technology 322
Eliane L. Bodanese, Laurie G. Cuthbert
An Atomic Approach to Agent-Based Imagery and Geospatial
Problem Solving 334
James J. Nolan, Robert Simon, Arun K. Sood
Model-Based Creation of Agents and Distribution of Problem Solving 344
Katsuaki Tanaka, Setsuo Ohsuga
A Distributed Algorithm for Coalition Formation Among
E-Commerce Agents 355
Guillaume Vauvert, Amal El Fallah-Seghrouchni
Optimal Reward Functions in Distributed Reinforcement Learning 365
David H. Wolpert, Kagan Tumer
Polygonal Approximation of Planar Digital Curves Using Ant System 375
Peng-Yeng Yin
A Biological View on Information Ecosystems 385
Bengt Carlsson, Paul Davidsson
The CoDAC Collaboration Framework 390
K. W. Ng, T. O. Lee
A Multi-Agent Approach to Modelling Interaction in Human
Mathematical Reasoning 395
Alison Pease, Simon Colton, Alan Smaill, John Lee
Secure Asynchronous Search 400
Marius-Calin Silaghi, Djamila Sam-Haroud, Boi Faltings
Foundations of Market-Driven Agents: An Adaptation of Zeuthen's
Bargaining Model 405
Kwang Mong Sim, Chung Yu Choi

Chapter 6. Agent Based Applications


Kavanah: An Active User Interface Information Retrieval Application 412
Eugene Santos Jr., Hien Nguyen, Scott M. Brown
iJADE WeatherMAN - A Multi-Agent Fuzzy-Neuro Network Based
Weather Prediction System 424
Raymond Lee, James Liu, Jane You
Acquaintance Models in Coalition Planning for Humanitarian
Relief Operation 434
Michal Pechoucek, Vladimir Marik, Jaroslav Barta
Agent Negotiation in a Virtual Marketplace 444
Walid S. Saba, Pratap R. Sathi

Modeling User Preferences to Facilitate More Automated and Accurate
Transaction Brokering within Heterogeneous Multi-Agent Electronic
Markets 454
G. Tewari, P. Maes, A. Berkovich, V. Gabovich
Attitude Based Agents in E-Commerce Applications 464
S. Au, N. Parameswaran
Organizing Internet Agents According to a Hierarchy of
Information Domains 469
Sylvie Cazalens, Philippe Lamarre
Introducing User Preference Modeling for Meeting Scheduling 474
Hon Wai Chun, Rebecca Y. M. Wong
Executive Attentional Control in Autonomous Robotic Agents 479
Jason Garforth, Anthony Meehan, Sue McHale
Implementation and Analysis of Mobile Agents in a Simulation
Environment for Fieldbus Systems 484
R. Hunstock, U. Rückert, T. Hanna
Evaluating Believability in an Interactive Narrative 490
Jarmo Laaksolahti, Per Persson, Carolina Palo
iJADE Stock Predictor - An Intelligent Multi-Agent Based Time Series
Stock Prediction System 495
Raymond S. T. Lee, James N. K. Liu
Approximate Sensor Fusion in a Navigation Agent 500
J. F. Peters, S. Ramanna, M. Borkowski, A. Skowron
Simulating Day-Ahead Trading in Electricity Markets with Agents 505
Max Scheidt, Hans-Jürgen Sebastian
Using Mobile Agents to Update and Maintain Course Materials on
Students' Computers in Internet-Based Distance Education 510
Hongxue Wang, Pete Holt

Author Index 515


INVITED TALKS
INTELLIGENT AGENTS FOR MARKET-TREND
PREDICTION

BENJAMIN W. WAH
Department of Electrical and Computer Engineering
and the Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA
http://manip.crhc.uiuc.edu
(2001 IEEE Computer Society President)

In this presentation we discuss the role of intelligent agents in market-trend
prediction. Market-trend data, such as stock-market data, are characterized
by non-stationary time series that may depend on non-numeric and
non-quantifiable measures. The prediction of market trends, therefore, should
consist of the prediction of non-stationary time series and the abstraction and
integration of non-numeric information in prediction. In this talk, we survey
various prediction techniques for and mining of market-trend data. We propose
to use intelligent agents in the abstraction of non-numeric information, the
decomposition of non-stationary time series into multiple stationary time
series, and the prediction of trends using artificial neural networks. Finally,
we illustrate our techniques in predicting stock-market data.

SOCIAL INTELLIGENCE DESIGN
FOR KNOWLEDGE CREATING COMMUNITIES

TOYOAKI NISHIDA
Department of Information and Communication Engineering
Graduate School of Information Science and Technology
The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
nishida@kc.t.u-tokyo.ac.jp

Communities play an important role in knowledge creation by providing people with
opportunities to continually learn from others, find partners to collaborate with, and
demonstrate the significance of their disciplines. In education or business, it is relatively easy
to find typical examples of knowledge creating communities for sharing and exchanging
specialized knowledge among knowledge workers. In other domains such as NPO or local
communities, people are naturally practicing mutual learning and invaluable knowledge is
built as a result, even if knowledge creation is not deemed a primary goal of the community.

In this paper, I present an interdisciplinary approach to augmenting the community
knowledge creating process by integrating insights from social psychology, cognitive
psychology, and advanced information technology. I emphasize the role of conversations and
stories as a means of establishing a common background in a community.

I describe several systems that primarily use the conversational modality to mediate
community communication. Among others, EgoChat allows the user to make conversation
with virtualized egos responding on behalf of other users. It allows the user to take an
initiative by interrupting the conversation and changing its flow. VoiceCafe allows artifacts
to make conversation with people or other artifacts. It stimulates creative thinking by
bringing about utterances from the physical object's point of view, which might be strikingly
different from humans' view.

These engineering approaches should be tightly coupled with sociological and cognitive
approaches, to predict and assess the effects of community communication mediation
systems on the human society. I discuss issues in designing a constructive framework of
interaction for achieving practical goals without being caught by known pathological pitfalls
of group interactions.

1 Introduction

Agent technology plays a diverse role in the networked society. On the one
hand, agents may be intelligent surrogates that work on behalf of the user. This type
of agent includes intelligent brokers that seek the best match between service providers
and consumers, intelligent traders that buy and sell goods on behalf of the user,
intelligent decision makers that negotiate contracts for the user, and so on.
Alternatively, agents may be embodied conversational interfaces that entertain the
user. This type of agent is becoming popular as agent portals on the Internet, or as
artificial pets in the entertainment and amusement domain.
In this paper, I discuss issues in applying agent technology to the
development of a social information service for mediating communication among
people. From this perspective, the central issue is designing and understanding a
world where people and agents cohabit, rather than inventing a system of artifacts.
We will not be able to innovate a totally new kingdom of artificial agents apart from
the human society, but we have to carefully embed the agent system in the existing
human society. This means that we need to understand more about humans and the
human society to better design an embedded system. We need to pay close attention
to the effects the technology brings about in human society. We need to make
every effort to have the proposal accepted by the human community. In contrast, we
need not insist on the novelty of the technology or a pedagogical issue of whether
the artifact can be called an agent.
Let us call this field social intelligence design in general. Research on social
intelligence design involves such issues as how new technologies induce the
emergence of a new language and lifestyle. For example, interactive multimedia
websites are a new medium and maybe even a new language, with interesting new
conventions, and increasing adaptation to the support of communities. Japanese
teenagers have developed a new language for use originally with beepers and now
with mobile phones. These are both new mainstream real world developments that
should be studied further, and could probably give some valuable insights. The
theme of "social intelligence" is really an angle on the support of groups in pursuit
of their goals, whether that is medical knowledge, stock trading, or teenage gossip.
I focus on community support systems to shed light on key aspects of social
intelligence design. The goal of a community support system is to facilitate
formation and maintenance of human and knowledge networks to support activities
in a community. Examples of community support systems include socially
intelligent agents that mediate people in getting to know and communicate with each
other, a collaborative virtual environment for large-scale discussions, personalized
agents for helping cross-cultural communication, interactive community media for
augmenting community awareness and memory, to name just a few.
I emphasize the role of stories and conversations as a means of establishing a
common background in a community. Stories allow us to put pieces of information
into an intelligible structure. Conversations give us an opportunity to examine
information from various angles and search for a good story structure. In some
community support systems, story-telling agents play a central role. It should be
noted that their significance depends more on the contents of stories rather than
conversation mechanism.
I also emphasize the empirical aspects of social intelligence design.
Engineering approaches should be tightly coupled with sociological and cognitive
approaches, to predict and assess the effects of community communication
mediation systems on the human society. I show how psychological approaches are
applied to design and evaluation of community support systems.

2 Communities and Social Intelligence

Social intelligence design is distinguished from most other conventional
engineering disciplines in that we have to be strongly aware of the human society as
engineering disciplines in that we have to be strongly aware of the human society as
a target. For this reason, I first take a look at the nature of my target, i.e.,
communities, in this section.
A community is a group of people loosely coupled by a shared interest or
environment. More formal discussion can be found in literature in sociology. For
example, Smith defines a community as follows:
Generically, a community can be understood as a set of on-going social
relations bound together by a common interest or shared circumstance. As a
result, communities may be intentional or unintentional, a community's
participants may purposely join together or be thrust into membership by
circumstance. Intentional communities are of particular interest because they
raise more questions about the reasons and causes for their emergence than do
unintentional ones [21].
Traditional communities were local communities that are characterized by
locality and shared living environment. The advent of a global information network
has not only considerably relaxed spatial constraints for communities to be built, but
also provided new opportunities for existing communities. Typical networked
communities include:
• communities of interest, in which people are tied with a shared interest;
• communities of practice, in which a group of people work together and share
a common work practice; and
• enhanced local communities or smart communities, which result from
enhancing communication and information sharing facilities in existing local
communities.
Schlichter contrasts communities with groups and teams [23]. He characterizes
communities as sets of people who share something but who do not necessarily know
each other or interact on a personal basis. In contrast, groups are sets of people who
know each other but who do not necessarily cooperate, while teams are sets of
people who are cooperating to achieve a common goal. In educational
environments, a lecture class may be regarded as a community, a discussion
group a group, and a learning group a team.
Recently, communities have received increasing attention in the
context of knowledge management and distance learning. A community provides its
members with opportunities to continually learn from others, find partners to
collaborate with, and demonstrate the significance of their disciplines. In education
or business, it is relatively easy to find examples of communities oriented towards
knowledge creation by sharing and exchanging specialized knowledge among
knowledge workers. In other domains such as NPO or local communities, people
are naturally practicing mutual learning and invaluable knowledge is built as a
result, even if knowledge creation is not deemed a primary goal of the community.
We consider that community knowledge creation is essentially a co-evolution of
human and knowledge networks [16, 17]. By human network, I mean a collection of
people connected by various social relations, such as acquaintance or partnership. A
human network is considered to embody tacit knowledge that may be shared in a
community but may not be explicitly spoken. In contrast, a knowledge network is a
collection of documents or files connected explicitly by hyperlinks or implicitly by
references. A knowledge network explicitly describes shared knowledge and interest
in a community.
A knowledge network enables people with a common interest to get to know each
other, resulting in the extension of the human network. A human network, in turn, helps
new ideas grow through intimate discussions and facilitates the extension of the
knowledge network through the publication of new knowledge. Thus, a synergetic cycle
of human and knowledge networks will lead to a successful community.
A more elaborate characterization of human and knowledge networks is
proposed by Contractor [3]. He points out that observed knowledge networks are
different from the cognitive networks that each individual possesses as a cognitive
perception of the network. He proposes to distinguish between: (i) knowledge
networks that represent the extent to which the same or disparate knowledge is
distributed among various members of the group, and (ii) cognitive knowledge
networks that represent individuals' cognitive perceptions of "who knows what"
within the group.
In order to understand the dynamics of community knowledge, Contractor
proposes to observe five types of network data: (i) a communication network of
actors based on existing tasks and project links between them, (ii) a knowledge
network based on actors providing an inventory of their skills and expertise, (iii) a
knowledge network of actors based on links between their web sites, (iv) a
knowledge network of actors based on common links from their web sites, and (v) a
knowledge network based on similarity in content between different actors' web
sites.
It should be noted that not all kinds of interaction in a community bring
about fruitful results. In social psychology, various pathological pitfalls are known
about group interactions. A notorious example is flaming, an endless slander battle
on the net, which is rare in face-to-face communication. Flaming blocks discussions
among community members, possibly resulting in destructive damage to a
community. False consensus is another undesirable phenomenon. It results from "a
spiral of silence", or "bandwagon effect", for instance, in which false cognition is
socially amplified.

3 Community Support Systems

The role of community support systems is to support community activities by
providing a communication channel for community members. Community support
systems are built on top of this communication channel and are expected to help
community members (i) exchange awareness with other members, (ii) explore human and
knowledge networks, (iii) build community knowledge, (iv) organize public events,
(v) form a group/team for collaborative work, (vi) negotiate with others, and (vii)
discuss public issues and make decisions about the community. Community support
systems provide rather long-range, bottom-up communicative functions in the
background of daily life. This feature contrasts with groupware, which emphasizes
more task-driven, short-range collaboration, although awareness is equally
emphasized. In the rest of this section, I will discuss the first three functions.

3.1 Helping to Exchange Awareness with Other Members


Most networked communities are based on intentional participation, based on a
common interest, for instance. Compared with mission-oriented groups where
participants are incorporated in a certain work structure, the degree of necessity to
exchange awareness is relatively low in networked communities. Participants tend
to become silent unless a mechanism is provided for lowering the cost for
exchanging awareness with other members.
In order to support awareness, Schlichter uses spatial metaphors such as rooms or hallways in "The Lecture 2000", a computational environment for supporting a learning community. FaintPop supports a light-weight, acknowledge-only mode of communication [19]. The major design goal of FaintPop is to communicate the sense of connectedness, not to perform informative functions. FaintPop is a communication device similar to a photo frame. Small photos or icons of the user's colleagues are displayed in the frame, through which the user can communicate with other users using simple touch actions. Three types of touching are permitted: a tap communicates a neutral feeling, a pet a positive feeling, and a hit a negative feeling. The user can communicate her/his feeling towards her/his colleagues using these three types of touching, and other community members can observe it. Sumi proposes an interest-based information distribution system, which pushes information to interested users rather than passively waiting for requests from them [24].
Voice Cafe [8] allows artifacts to hold conversations with people or other artifacts (Figure 1). It stimulates creative thinking by bringing about utterances from the physical object's point of view, which might be strikingly different from the humans' view. Each Voice Cafe artifact consists of a physical object and a conversational agent. It can communicate with community members by exchanging gossip, or small talk, about members' conditions, schedules, thoughts, opinions, and so on.

Figure 1. The conceptual framework of Voice Cafe.

By listening to this gossip, members can gain awareness of other people at the small-talk level.

3.2 Helping to Explore Human and Knowledge Networks


This facility helps the user find human and knowledge resources in a community. Social matchmaking is frequently used to locate people on the Internet who share similar interests, enabling the automatic formation of interest groups.
Social matchmaking calculates the distance between users by referring to their user profiles. A major motivation behind social matchmaking is to address situations in which finding an expert is difficult and time-consuming, people are often working on similar projects without realizing it, or people feel socially isolated because nobody around them seems to share the same interest.

Figure 2. Virtualized egos as an interactive community medium.

Yenta [4] is a multi-agent matchmaking system that can automatically determine user interests and operate in a completely decentralized, peer-to-peer fashion. Yenta agents are persistent; they use referrals to find each other, build clusters of like-minded agents, and introduce their users to each other. Special care is paid to protecting user privacy.
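The distance computation at the heart of social matchmaking can be sketched in a few lines. The sketch below is illustrative only: it assumes keyword-frequency profiles and cosine similarity, neither of which is prescribed by Yenta or the other systems discussed here, and the user names and threshold are invented.

```python
from math import sqrt

def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two keyword-frequency profiles."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[k] * profile_b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in profile_a.values()))
    norm_b = sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def match_users(profiles, threshold=0.5):
    """Return pairs of users whose profiles are similar enough to introduce."""
    users = sorted(profiles)
    return [(u, v) for i, u in enumerate(users) for v in users[i + 1:]
            if cosine_similarity(profiles[u], profiles[v]) >= threshold]

# Hypothetical interest profiles
profiles = {
    "alice": {"agents": 3, "communities": 2},
    "bob":   {"agents": 2, "communities": 3},
    "carol": {"opera": 5},
}
print(match_users(profiles))  # [('alice', 'bob')]
```

A deployed matchmaker would of course add privacy protection and decentralized clustering, which is precisely what distinguishes Yenta from this naive centralized comparison.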
Silhouettell [20] combines awareness support and social matchmaking to bridge informal and formal meetings. It projects the locations of participants on the screen as shadows, and facilitates conversation by presenting Web pages that are inferred to be common to the participants.
Referral Web [11] integrates recommendations and search through the concept of a social network. It helps the user discover her/his relationship to the best human experts for a given topic. It gathers all information from public sources, which removes the cost of information posting and registration. It can also explain to the user why each link in the referral chain appeared.
In order to provide an integrated method of exploring and building human and knowledge networks, we use a talking-virtualized-egos metaphor in CoMeMo-Community [14] and EgoChat [12] to enable elaborate asynchronous communication among community members. A virtualized ego plays two main functions (Figure 2). First, it stores and maintains the user's personal memory. Second, it presents the content of the personal memory on behalf of the user in appropriate situations. By personal memory, we mean an aggregation of relevant information represented in the context specific to a particular person. Personal memory plays a crucial role not only in personal information management but also in mutual understanding in a community.
A virtualized ego serves as a portal to the memory and knowledge of a person. It accumulates information about a person and allows her/his colleagues to access that information through an ordinary spoken-language conversation mode, rather than by going up and down a complex directory in search of possibly existing information, or by deliberately issuing commands for information retrieval. In addition, a virtualized ego embodies tacit and non-verbal knowledge about the person so that more subtle messages, such as attitude, can be communicated.
As is also the case with Voice Cafe, we take a conversation-centered approach to designing intelligent systems and to capturing intelligence itself. Conversation plays a variety of roles in human societies. It not only allows people to exchange information, but also helps them create new ideas and manage human relations. In our approach, more emphasis is placed on creating, exchanging, reorganizing, and utilizing conversational contents in knowledge creation than on implementing intelligent agents or yet another human interface.

3.3 Helping to Build Community Knowledge


The third function of a community support system is to help community members build shared knowledge. Nonaka and Takeuchi pointed out that community knowledge is built by a spiral of interactions between explicit and tacit knowledge [18]. They suggest that the process of knowledge creation is more important than the body of knowledge, for people often find more value in communities that evolve as a result of learning. This implies that more emphasis should be placed on supporting interactions, or the emergent aspect of community knowledge [13], in community support systems.
The Public Opinion Channel (POC) [15, 16, 7] is a community-wide interactive broadcasting system (Figure 3). A POC continuously collects messages from people in a community and feeds edited messages back to them. A POC is not intended to broadcast public opinions per se. Instead, it is intended to broadcast miscellaneous information that can serve as a basis for public opinion formation.
A POC repeats a cycle consisting of a call-for-opinion followed by one or more rounds of responses from community members and summarization by the POC system. In the initial call-for-opinion message, the POC system specifies a focus of discussion. Alternatively, people may initiate discussion by submitting a topic. Then, interested community members may respond with messages. In principle, messages are not limited to pure opinions; they may include questions, stories, findings, jokes, proposals, and all other message types. The POC system may combine these messages, generate a story, and broadcast it to the community. The POC system may also issue progress reports based on responses from community members. The process proceeds as subjects change.
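The POC cycle described above can be sketched as a small class. This is an illustrative reading of the cycle, not the actual POC implementation: the class name, topic, and messages are invented for the example, and the summarization step is reduced to joining the collected messages into one string.

```python
from collections import deque

class PublicOpinionChannel:
    """Minimal sketch of the POC cycle: collect responses, summarize, broadcast."""

    def __init__(self, topic):
        self.topic = topic          # focus set by the call-for-opinion
        self.messages = deque()     # responses from community members
        self.broadcasts = []        # edited stories fed back to the community

    def respond(self, member, text):
        # Messages need not be pure opinions: questions, stories, jokes, ...
        self.messages.append((member, text))

    def summarize_and_broadcast(self):
        # A real POC would combine messages into an edited story;
        # here we merely concatenate them.
        story = f"[{self.topic}] " + " / ".join(t for _, t in self.messages)
        self.broadcasts.append(story)
        self.messages.clear()
        return story

poc = PublicOpinionChannel("bicycle parking")   # hypothetical call-for-opinion
poc.respond("member1", "we need more racks")
poc.respond("member2", "what about the station area?")
print(poc.summarize_and_broadcast())
```

Repeating `respond`/`summarize_and_broadcast` with changing topics mirrors the cycle of call-for-opinion, response, and summarization sketched in the text.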
A POC brings about an ecology of ever-evolving stories. People can access the story pool at any time by on-demand access. Another thing I would like to emphasize here is that POC broadcasting can be embedded in the ambient environment, just like radio broadcasting, so that people need not pay close attention at all times.

Figure 3. The conceptual framework of the Public Opinion Channel (POC). The POC is an interactive broadcasting system that continuously collects messages from community members and feeds edited message streams back to the community.

Compared with existing mass media, a POC has various advantages. Computational support and network connectivity enable a large number of responses to be analyzed on the fly, allowing real-time interactive stories to be generated. In particular, a combination of statistical computation and semantic processing permits minority opinions to be reflected in the structure of public opinion.
We believe that a POC also contributes to community knowledge building and public discussion.

4 Social Intelligence Design

Social Intelligence Design is a new discipline aimed at understanding and supporting social intelligence, i.e., intelligence collectively exhibited by (natural/artificial) agents to dynamically organize members' activities into a coherent whole by exploiting or innovating the social structure. Social intelligence models intelligence as a phenomenon emerging from the way agents, either natural or artificial, interact with each other. Research into community support systems is concerned with the engineering aspects of Social Intelligence Design. Meanwhile, investigation into the sociological and cognitive aspects is equally or sometimes more important. The engineering approach should be tightly coupled with sociology, psychology, and other disciplines closely related to the study of humans and human society. Thus, Social Intelligence Design involves not only designing artifacts but also designing and understanding the social structure in which the artifacts are embedded.1
Social Intelligence Design gives new life to Agent Technology and Artificial Intelligence research in general, in that humans become an integral part of the big picture by shifting the focus from building artifacts with problem-solving or learning capabilities to designing a framework of interaction that leads to the creation of new knowledge and relationships among participants. An interdisciplinary study integrating insights from Artificial Intelligence, Human-Computer Interaction, Social and Cognitive Sciences, Media Studies, and other related disciplines is necessary to predict and assess the effects of social intelligence augmentation systems on human society from sociological and cognitive viewpoints. Promising application domains include collaborative environments, e-learning, knowledge management, community support systems, symbiosis of humans and artifacts, crisis management, and digital democracy.
The engineering side of Social Intelligence Design involves not only
community support systems but also systems that range from group/team oriented
collaboration support systems [5] to large-scale online-discussion support systems
such as Bubble used in the IBM's WorldJam trial [26].
The humanity side of Social Intelligence Design involves the design and assessment of social intelligence. In the rest of this section, I will overview a couple of research efforts on this side.

4.1 Social Intelligence Design from Social Psychological View


Azechi points out that two factors hinder dynamic knowledge interaction in a networked community. One is the essential disposition of a group that prevents effective cooperation, known particularly as groupthink and hostility to out-groups. Groupthink [9] refers to the phenomenon in which collective creativity does not exceed individual creativity. Hostility to out-groups means that group members easily develop hostility toward out-groups [23, 25]. This phenomenon is closely related to stereotyping, in which certain stigmata produce wrong inferences about an outsider's behavior patterns and personality.
Another factor is a new concept called escape from information, which refers to the tendency of people living in a mass-consumption society to make themselves the same as others and to avoid choosing and expressing information themselves.
Azechi classifies the content of a message into dry and wet information [1]. Dry information primarily contains logical linguistic information and constitutes the core of a message. It may be an assertion, a question, a view, an opinion, or any other statement that is logically constituted by fact. In contrast, wet information is meta-information incidental to the contents of the message. It is mainly nonlinguistic information and is not affected by the dry information.

1 These arguments are inspired by discussion at the JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
He argues that community-wide discussion for achieving a practical goal should be conducted only with dry information; otherwise rational discussion will be hindered by the pathology of the group, and a fruitful outcome cannot be expected at all. He suggests that the following five functions should be incorporated into a POC: (i) protecting personal information, (ii) consideration of minority opinions, (iii) equal opportunities for presenters, (iv) positive reciprocity, and (v) tutoring. This discussion provides valuable insight for designing public discussion support systems in general. In addition, based on an investigation with a POC prototype, he argues that the user's motivation is often discouraged by trash information, that setting up a clear purpose for using the tool will encourage the user, and that informing the user of the size of the user group will increase motivation [2].
Matsumura addresses consensus formation in networked communities and points out the importance of minority opinions in group decision making [10]. Based on social psychological experiments on minority opinions, he has found that (i) minority members tend to overestimate the number of other members who share the same attitude, (ii) minority members tend to underestimate the attitude of other members, and (iii) minority members who underestimate the proportion of the minority's opinion tend to lose the intention to act. Such inaccuracy in the cognition of opinion distribution is called the false consensus effect. These observations should be taken into account in designing discussion support systems so that useful discussions can be expected by reflecting minority opinions. He discusses the pros and cons of using anonymous messages, which obscure the real distribution of opinions. The good news is that anonymity will not discourage minority members with the fact that they are in the minority. The bad news is that it may cause incorrect cognition of the distribution of opinions.

4.2 Evaluations of Social Intelligence


Social Intelligence Design is certainly an empirical study. We have to repeat the design-implement-evaluate cycle until we reach better systems.
Network Analysis is a powerful means of evaluating or comparing empirical data. It provides a means for calculating various properties of a given network, such as centrality, density, and cohesion. By comparing these features from one network against those from another, we can describe similarities and differences in quantitative terms. Fujihara has applied Network Analysis to a log collected over several months from experiments with a POC prototype, to see whether a POC actually facilitates community knowledge creation [6].
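The network measures mentioned above are easy to compute for small graphs. The sketch below shows degree centrality and density for an undirected network; the toy graph is hypothetical, and the formulas are the standard ones from network analysis rather than anything specific to Fujihara's study.

```python
def degree_centrality(edges, nodes):
    """Fraction of the other nodes each node is directly tied to."""
    n = len(nodes)
    deg = {v: 0 for v in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {v: d / (n - 1) for v, d in deg.items()}

def density(edges, nodes):
    """Ratio of actual ties to possible ties in an undirected network."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1))

nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("a", "d")]   # a star centered on "a"
print(degree_centrality(edges, nodes))  # "a" scores 1.0, the others 1/3
print(density(edges, nodes))            # 3 of 6 possible ties -> 0.5
```

Comparing such numbers between, say, a community using a POC and a control community is exactly the kind of quantitative comparison Network Analysis supports.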
Fujihara points out the importance of setting up an appropriate control condition when formalizing the results of experiments. It allows the effects of tools to be measured and compared against a baseline condition. He also suggests that multiple methods should be combined to obtain reliable results. Methods of evaluation may fall into the following three types:

• Analysis of users' subjective estimations and introspection, collected through questionnaires and interviews,
• Experimental methods, in which experimental conditions are manipulated systematically, and
• Analysis of data collected by observing users' behavior in ordinary, natural settings.

5 Concluding Remarks

In this paper, I have discussed issues related to Social Intelligence Design. By social intelligence I mean intelligence collectively exhibited by (natural/artificial) agents to dynamically organize members' activities into a coherent whole by exploiting or innovating the social structure. The central issue here is designing and understanding a world where people and agents cohabit, rather than inventing a system of artifacts. I have overviewed community support systems as an example of the engineering aspects of Social Intelligence Design. I have also shown some psychological approaches related to the design and evaluation stages of Social Intelligence Design. Agent technology has great potential for augmenting social intelligence, provided that special care is taken in embedding artifacts into human society.

References

1. Azechi, S., Social psychological approach to knowledge-creating community, in: Nishida, T. (ed.), Dynamic Knowledge Interaction, pp. 15-57, CRC Press LLC, 2000.
2. Azechi, S., Motivation for showing opinion on public opinion channel: a case study, to be presented at KES-2001, Osaka, Japan, 6-8 September 2001.
3. Contractor, N., Bishop, A., and Zink, D., PrairieKNOW: a tool to assist the
study, creation, and growth of community networks, in: Bullinger, H.-J. and
Ziegler, J. (eds.), Human-Computer Interaction, Volume 2, Erlbaum, Hillsdale,
pp. 447-451, 1999.
4. Foner, L. N., Political Artifacts and Personal Privacy: The Yenta Multi-Agent
Distributed Matchmaking System. Ph.D. Dissertation, MIT, 1999.

5. Fruchter, R., Bricks, bits & interaction, presented at JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
6. Fujihara, N., How to evaluate application of conversational intelligence, to be presented at KES-2001, Osaka, Japan, 6-8 September 2001.
7. Fukuhara, T., Nishida, T., and Uemura, S., Public Opinion Channel: a system for augmenting social intelligence of a community, presented at JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
8. Fukuhara, T., Nishida, T., and Uemura, S., Voice Cafe: conversational support system in a group, to be presented at KES-2001, Osaka, Japan, 6-8 September 2001.
9. Janis, I. L., Groupthink, 2nd Edition, Houghton Mifflin Company, Boston,
1982.
10. Matsumura, K., Consensus formation process in network community, to be presented at KES-2001, Osaka, Japan, 6-8 September 2001.
11. Kautz, H., Selman B., and Shah, M., Referral Web: combining social networks
and collaborative filtering, Communications of the ACM, 40 (3) pp. 63-65,
1997.
12. Kubota, H., Nishida, T., and Koda, T., Exchanging tacit community knowledge
by talking-virtualized-egos, in: Proceedings of Agent 2000, pp.285-292, 2000.
13. Nakata, K., Knowledge as Social Medium, New Generation Computing, Vol.
17, No. 4, pp. 395-405, 1999.
14. Nishida, T., Facilitating community knowledge evolution by talking virtualized egos, in: Bullinger, H.-J. and Ziegler, J. (eds.), Human-Computer Interaction, Volume 2, Lawrence Erlbaum Associates, pp. 437-441, 1999.
15. Nishida, T., Fujihara, N., Azechi, S., Sumi, K., and Hirata, T., Public Opinion
Channel for communities in the information age, New Generation Computing,
Vol. 17, No. 4, pp. 417-427, 1999.
16. Nishida, T. (ed.), Dynamic Knowledge Interaction, CRC Press LLC, 2000.
17. Nishida, T., Towards dynamic knowledge interaction, Keynote Paper, in: Proc. KES-2000, pp. 1-12, 2000.
18. Nonaka, I. and Takeuchi, H., The knowledge-creating company: How Japanese
companies create the dynamics of innovation, Oxford University Press, New
York, 1995.
19. Ohguro, T., FaintPop: In touch with the social relationships, presented at JSAI-
Synsophy International Workshop on Social Intelligence Design, Matsue,
Japan, May 21-22, 2001.
20. Okamoto, M., Isbister, K., Nakanishi, H. and Ishida, T., Supporting cross-
cultural communication in real-world encounters, The 8th International
Conference on Human-Computer Interaction (HCI-99), Volume 2, pp. 442-
446, 1999.
21. Smith, M. A., Voices from the WELL: the logic of the virtual commons,
http://www.usyd.edu.au/su/social/papers/virtcomm.htm.
22. Sherif, M., Harvey, O. J., White, B. J., Hood, W. R., and Sherif, C. W.,
Intergroup conflict and cooperation: the robbers' cave experiment, Institute of
group relations, University of Oklahoma, Norman, 1966.
23. Schlichter, J., Koch, M., and Xu, C., Awareness — the common link between groupware and community support systems, in: Ishida, T. (ed.), Community Computing and Support Systems — Social Interaction in Networked Communities, LNCS 1519, Springer, Berlin, pp. 77-93, 1998.
24. Sumi, K. and Nishida, T., Context-aware and personalized communication
support system, IEEE Intelligent Systems, in press, 2001.
25. Tajfel, H. and Turner, J. C., The social identity theory of intergroup behavior, in: Worchel, S. and Austin, W. G. (eds.), Psychology of Intergroup Relations, 2nd Edition, Nelson-Hall, pp. 7-24, 1986.
26. Thomas, J. C., Collaborative innovation tools, presented at JSAI-Synsophy International Workshop on Social Intelligence Design, Matsue, Japan, May 21-22, 2001.
QUERY ANSWERING BASED ON DISTRIBUTED KNOWLEDGE MINING

ZBIGNIEW W. RAS
University of North Carolina,
Department of Computer Science
Charlotte, N.C. 28223, USA
E-mail: ras@uncc.edu

Traditional query processing provides exact answers to queries. It usually requires that users fully understand the database structure and content to issue a query. Due to the complexity of database applications, so-called global queries can be posed that traditional query answering systems cannot handle. In this paper, a query answering system based on distributed data mining is presented to rectify these problems.

1 Introduction

In many fields, such as medicine, banking, and education, similar databases are kept at many sites. Each database stores information about local events and uses attributes suitable for a local task, but since the local situations are similar, the majority of attributes are compatible among databases. An attribute may be missing in one database while it occurs in many others. Missing attributes lead to problems. A user may issue a query to a local database S1 in search of objects that match a desired description, only to realize that one component a1 of that description is missing in S1, so that the query cannot be answered. The same query may work in other databases, but the user is interested in identifying suitable objects only in S1.
Clearly, the task of integrating established database systems is complicated not only by the differences between the sets of attributes but also by differences in the structure and semantics of data. We call such systems heterogeneous. The notion of an intermediate model, proposed by [Maluf and Wiederhold], is very useful in dealing with the heterogeneity problem, because it describes the database content at a relatively high level of abstraction, sufficient to guarantee a homogeneous representation of all databases. The discovery layers and action layers introduced in this paper can be used for a similar purpose. A discovery layer contains rules extracted from a database. An actions layer contains so-called action rules (see [Ras and Wieczorkowska]) showing what minimal changes in a database are needed to re-classify some of its objects.


2 Distributed Knowledge Systems


In this section, we recall the notion of an information system and a distributed information system (DIS). Next, we introduce local queries and give their standard semantics. Finally, we show the structure of discovery layers and action layers.
By an information system we mean S = (X, A, V), where X is a finite set of objects, A is a finite set of attributes, and V = ∪{V_a : a ∈ A} is a set of their values. We assume that:
• V_a, V_b are disjoint for any a, b ∈ A such that a ≠ b,
• a : X → V_a is a function for every a ∈ A.
Instead of a, we may write a[S] to denote that a is an attribute in S.
By a distributed information system we mean a pair DS = ({S_i}_{i∈I}, L) where:
• I is a set of sites,
• S_i = (X_i, A_i, V_i) is an information system for any i ∈ I,
• L is a symmetric, binary relation on the set I.
A distributed information system DS = ({S_i}_{i∈I}, L) is consistent if the following condition holds:
(∀i)(∀j)(∀x ∈ X_i ∩ X_j)(∀a ∈ A_i ∩ A_j) (a[S_i](x) = a[S_j](x)).
In the remainder of this paper we assume that DS = ({S_i}_{i∈I}, L) is consistent. Also, we assume that S_j = (X_j, A_j, V_j), where V_j = ∪{V_ja : a ∈ A_j}, for any j ∈ I.
We use A to denote the set of all attributes in DS, A = ∪{A_j : j ∈ I}. Also, by V we mean ∪{V_j : j ∈ I}.
Before we introduce the notion of a discovery layer, we begin with a definition of s(i)-terms and their standard interpretation M_i in DS = ({S_j}_{j∈I}, L), where S_j = (X_j, A_j, V_j) and V_j = ∪{V_ja : a ∈ A_j}, for any j ∈ I.
By a set of s(i)-terms (also called a set of local queries for site i) we mean a least set T_i such that:
• 0, 1 ∈ T_i,
• w ∈ T_i for any w ∈ V_i,
• if t1, t2 ∈ T_i, then (t1 + t2), (t1 * t2), ~t1 ∈ T_i.


By a set of s(i)-formulas we mean a least set F_i such that:
• if t1, t2 ∈ T_i, then (t1 = t2) ∈ F_i.
The definitions of DS-terms (also called a set of global queries) and DS-formulas are quite similar (we only replace T_i by ∪{T_i : i ∈ I} and F_i by F in the two definitions above).
We say that:
• an s(i)-term t is primitive if it is of the form ∏{w : w ∈ U_i} for any U_i ⊆ V_i,
• an s(i)-term t = ∏{w : w ∈ U_i}, where U_i ⊆ V_i, is simple if U_i ∩ V_ia is a singleton set for any a ∈ A_i,
• an s(i)-term is in disjunctive normal form (DNF) if t = ∑{t_j : j ∈ J} where each t_j is primitive.
Similar definitions hold for DS-terms.
Clearly, it is easy to give an example of a local query. The expression:

select * from Flights
where airline = "Delta"
and departure_time = "morning"
and departure_airport = "Charlotte"

is an example of a non-local query (DS-term) in a database

Flights(airline, departure_time, arrival_time, departure_airport, arrival_airport).

The semantics of s(i)-terms is defined by the standard interpretation M_i in a distributed information system DS = ({S_j}_{j∈I}, L) as follows:
• M_i(0) = ∅, M_i(1) = X_i,
• M_i(w) = {x ∈ X_i : if w ∈ V_ia then w = a(x)} for any w ∈ V_i,
• if t1, t2 are s(i)-terms, then
M_i(t1 + t2) = M_i(t1) ∪ M_i(t2),
M_i(t1 * t2) = M_i(t1) ∩ M_i(t2),
M_i(~t1) = X_i − M_i(t1),
M_i(t1 = t2) = (if M_i(t1) = M_i(t2) then T else F),
where T stands for True and F for False.
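The standard interpretation above can be made concrete with a short sketch. The toy information system, the tuple encoding of terms, and the function names below are illustrative assumptions only; the code simply mirrors the set-theoretic clauses for +, *, and ~, relying on the disjointness of the V_a sets so that a value uniquely identifies its attribute.

```python
# Objects of a hypothetical information system S_i. Attribute values are
# globally unique, as required by the disjointness of the V_a sets.
X_i = {
    "f1": {"airline": "Delta", "departure_time": "morning"},
    "f2": {"airline": "Delta", "departure_time": "evening"},
    "f3": {"airline": "United", "departure_time": "morning"},
}

def M(term):
    """Standard interpretation M_i of s(i)-terms built with +, *, and ~.
    Terms are encoded as 0, 1, attribute-value strings, or nested tuples."""
    if term == 0:
        return set()
    if term == 1:
        return set(X_i)
    if isinstance(term, str):                      # an attribute value w
        return {x for x, row in X_i.items() if term in row.values()}
    op, *args = term
    if op == "+":
        return M(args[0]) | M(args[1])             # union
    if op == "*":
        return M(args[0]) & M(args[1])             # intersection
    if op == "~":
        return set(X_i) - M(args[0])               # complement
    raise ValueError(f"unknown operator: {op}")

# M_i(Delta * morning): Delta flights leaving in the morning
print(M(("*", "Delta", "morning")))                # {'f1'}
```

The equality formula (t1 = t2) then evaluates to True exactly when `M(t1) == M(t2)`, matching the last clause above.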

The sound and complete axiomatization of the above semantics is quite standard and is given, for instance, in the paper by [Ras].
Now we are ready to introduce the notion of (k,i)-rules, for any i ∈ I. We use them to form a discovery layer at site i ∈ I.
By a (k,i)-rule in DS = ({S_j}_{j∈I}, L), k, i ∈ I, we mean a triple (c, t, s) such that:
• c ∈ V_k − V_i,
• t, s are s(k)-terms in DNF and they both belong to T_k ∩ T_i,
• M_k(t) ⊆ M_k(c) ⊆ M_k(t + s).
For any (k,i)-rule (c, t, s) in DS = ({S_j}_{j∈I}, L), we say that:
• (t → c) is a k-certain rule in DS,
• (t + s → c) is a k-possible rule in DS.
Let us assume that r1 = (c1, t1, s1), r2 = (c2, t2, s2) are (k,i)-rules. We say that r1, r2 are strongly consistent if either c1, c2 are values of two different attributes in S_k or a DNF form equivalent to t1 * t2 does not contain simple conjuncts.
Now we are ready to define a discovery layer D_ki. Its elements can be seen as approximate descriptions of values of attributes from V_k − V_i in terms of values of attributes from V_k ∩ V_i.
To be more precise, we say that D_ki is a set of (k,i)-rules such that if (c, t, s) ∈ D_ki and t1 = ~(t + s), then (~c, t1, s) ∈ D_ki.
By a discovery layer for site i, denoted by D_i, we mean any subset of ∪{D_ki : (k,i) ∈ L}.
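Once the extensions M_k(t), M_k(c), and M_k(t + s) have been computed, the defining condition of a (k,i)-rule reduces to a pair of subset tests. The sketch below is a direct transcription of that condition; the example sets are hypothetical.

```python
def is_k_i_rule(M_t, M_c, M_ts):
    """Check the (k,i)-rule condition M_k(t) <= M_k(c) <= M_k(t + s),
    where the arguments are the already-computed extensions at site k."""
    return M_t <= M_c <= M_ts   # chained subset tests on Python sets

# Hypothetical extensions at site k: t matches {x1}, c holds on {x1, x2},
# and t + s matches {x1, x2, x3}.
M_t, M_c, M_ts = {"x1"}, {"x1", "x2"}, {"x1", "x2", "x3"}
print(is_k_i_rule(M_t, M_c, M_ts))   # True: t -> c is k-certain and
                                     # (t + s) -> c is k-possible
```

When the condition holds, every object matched by t certainly has value c, while objects outside M_k(t + s) certainly do not, which is exactly the lower/upper approximation pattern used later in the paper.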

3 Actions Layer

In this section we introduce the notion of an actions layer, which is a basic part of a distributed knowledge system (DKS).
Information systems can be seen as decision tables. In any decision table, together with the set of attributes, a partition of that set into conditions and decisions is given. Additionally, we assume that the set of conditions is partitioned into stable conditions and flexible conditions. An attribute a ∈ A is called stable for the set X if its values assigned to objects from X cannot be changed in time. Otherwise, it is called flexible. Date of birth is an example of a stable attribute. The interest rate on a customer account is an example of a flexible attribute. For simplicity, we consider decision tables with only one decision. We adopt the following definition of a decision table:
A decision table is any information system of the form S = (X, A1 ∪ A2 ∪ {d}, V), where d ∉ A1 ∪ A2 is a distinguished attribute called the decision. The elements of A1 are called stable conditions, whereas the elements of A2 ∪ {d} are called flexible conditions.
The goal is to change the values of attributes in A2 for some objects in X so that the values of the attribute d for these objects may change as well. Rules in a discovery layer defining d in terms of A1 ∪ A2 are extracted from S and used to discover new rules called action rules. These new rules provide suggestions for the re-classification of objects from S in terms of the attribute d. This can be done because d is flexible.
Now, let us assume that (a, v → w) denotes the fact that the value of attribute a has been changed from v to w. Similarly, the term (a, v → w)(x) means that a(x) = v has been changed to a(x) = w. In other words, the property (a, v) of object x has been changed to the property (a, w).
Assume now that S = (X, A1 ∪ A2 ∪ {d}, V) is a decision table, where A1 is a set of stable attributes and A2 is a set of flexible attributes. Assume that rules r1, r2 have been extracted from S with r1/A1 = r2/A1, d(r1) = k1, d(r2) = k2, and k1 < k2. Also, assume that (b1, b2, ..., bp) is a list of all attributes in Dom(r1) ∩ Dom(r2) ∩ A2 on which r1, r2 differ, with r1(b1) = v1, r1(b2) = v2, ..., r1(bp) = vp and r2(b1) = w1, r2(b2) = w2, ..., r2(bp) = wp.
By an (r1, r2)-action rule on x ∈ X we mean a statement:
[(b1, v1 → w1) ∧ (b2, v2 → w2) ∧ ... ∧ (bp, vp → wp)](x) ⇒ [(d, k1) → (d, k2)](x).
If the value of the above rule is true on x, then the rule is valid for x. Otherwise, it is false.
The actions layer for a site i, denoted by Act_i, contains the (r1, r2)-action rules constructed from rules r1, r2 in a discovery layer D_i.
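Constructing an (r1, r2)-action rule from two rules that agree on the stable attributes can be sketched as follows. The rule representation (attribute-value dictionaries), the bank-style example, and the attribute names are illustrative assumptions, not the paper's notation.

```python
def action_rule(r1, r2, stable, decision="d"):
    """Build an (r1, r2)-action rule from two rules that agree on the
    stable attributes and differ on some flexible ones. Returns the list
    of required changes plus the induced decision change, or None."""
    if any(r1.get(a) != r2.get(a) for a in stable):
        return None                       # r1/A1 must equal r2/A1
    changes = [(b, r1[b], r2[b])
               for b in r1
               if b in r2 and b not in stable
               and b != decision and r1[b] != r2[b]]
    return changes, (decision, r1[decision], r2[decision])

# Hypothetical bank rules: 'profession' is stable, 'rate' is flexible.
r1 = {"profession": "teacher", "rate": "high", "d": "leaving"}
r2 = {"profession": "teacher", "rate": "low",  "d": "staying"}
print(action_rule(r1, r2, stable={"profession"}))
# ([('rate', 'high', 'low')], ('d', 'leaving', 'staying'))
```

Read aloud, the result says: for teachers, changing the rate from high to low is suggested to re-classify a leaving customer as staying, which is the intended shape of an action rule.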

4 Distributed Knowledge System


In this section, we introduce the notion of a distributed knowledge system.
By a Distributed Knowledge System (DKS) we mean DS = ({(S_i, D_i, Act_i)}_{i∈I}, L), where ({S_i}_{i∈I}, L) is a distributed information system, D_i = ∪{D_ki : (k,i) ∈ L} is a discovery layer, and Act_i is an actions layer, for i ∈ I.
Figure 1 shows the basic architecture of a DKS (a query answering system QAS that handles global queries is also added to each site of the DKS). Operational semantics reflects the dynamic nature of the definitions of attribute values in a query (see [Ras and Zytkow]).

Figure 1: Distributed Knowledge System (DKS). Each site couples a database with a discovery layer and an actions layer, exchanges knowledge with other sites at both layers, and answers queries through a QAS with operational semantics.

Figure 2: Query Answering System (QAS). A transformation engine, based on logical axioms and the operational semantics N_i, transforms a global query into a local query at site i (lower approximation).
Figure 2 shows the part of QAS which is responsible for query transformation. This part of QAS can be replaced by the rough transformation engine shown in Figure 3.
If for each non-local attribute we collect rules from many sites of the DKS and then resolve all inconsistencies among them (see [Ras]), then the local confidence in the resulting operational definitions is high, since they represent a consensus of many sites.
Assume now that N is a standard interpretation of global queries as in-
troduced for instance in [Rasf. It corresponds to a pessimistic approach to
23

local query at site i


Rough transformation (upper approximation)
engine based on logical
axioms and operational
global query semantics Ni and Ji
local query at site i
(lower approximation)

QRAS for site i

Figure 3: Query Rough Answering System (QRAS)

evaluation of global queries because of the way the non-local attribute values
are interpreted (their lower approximation is taken).
We can replace Ni by a new interpretation Ji representing an optimistic
approach to evaluation of global queries. Namely, we define:

• Ji(w) = X − Ni(~w),

• Ji(~w) = X − Ni(w),

• Ji(t) = Ni(t) for any other term t.

In the optimistic approach to evaluation of queries, the upper approximation
of the non-local terms w, ~w is taken.

Following this line of thought, we can propose a rough operational semantics
Ri defined as Ri(t) = [Ni(t), Ji(t)] for any global query t. The rough operational
semantics has a natural advantage over either Ni or Ji used alone. Clearly, if
the interpretations Ni and Ji of a term t give us the same set of objects, then both
approximations (lower and upper) are semantically equal.
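A minimal sketch of the two interpretations and their pairing (illustrative Python; the object set and the lower approximations are hypothetical placeholders):

```python
# Illustrative sketch: the pessimistic semantics Ni returns the lower
# approximation of a non-local term; the optimistic Ji is defined by
# complement, and the rough semantics Ri pairs the two.

X = {1, 2, 3, 4, 5}                      # all objects at site i (assumed)

def Ni(term):
    """Lower approximations of the non-local terms (assumed known)."""
    lower = {"w": {1, 2}, "~w": {4, 5}}
    return lower[term]

def Ji(term):
    """Ji(w) = X - Ni(~w), Ji(~w) = X - Ni(w), Ji(t) = Ni(t) otherwise."""
    if term == "w":
        return X - Ni("~w")
    if term == "~w":
        return X - Ni("w")
    return Ni(term)

def Ri(term):
    """Rough semantics: [lower approximation, upper approximation]."""
    return [Ni(term), Ji(term)]

print(Ri("w"))    # [{1, 2}, {1, 2, 3}]
```

Object 3 belongs to neither lower approximation, so it lands in the boundary between the pessimistic and optimistic answers.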

5 Query Answering Based on Reducts


In this section we recall the notion of a reduct (see [Pawlak]) and show how
it can be used to improve the query answering process in DKS.

Let us assume that S = (X, A, V) is an information system and V =
∪{Va : a ∈ A}. Let B ⊆ A. We say that x, y ∈ X are indiscernible by B,
denoted [x ≈B y], if (∀a ∈ B)[a(x) = a(y)].

Now, assume that both B1, B2 are subsets of A. We say that B1 depends
on B2 if ≈B2 ⊆ ≈B1. Also, we say that B1 is a covering of B2 if B2 depends on
B1 and B1 is minimal.

By a reduct of A in S (for simplicity we say A-reduct of S) we mean
any covering of A.
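The definitions above translate directly into a small sketch (illustrative Python; the three-object information system is invented for the example):

```python
from itertools import combinations

# Illustrative sketch: indiscernibility, dependency and coverings in an
# information system S = (X, A, V), stored as a dict of attribute values.

S = {"x1": {"a": 1, "b": 1, "c": 1},
     "x2": {"a": 1, "b": 2, "c": 2},
     "x3": {"a": 2, "b": 2, "c": 2}}

def indisc(B):
    """The relation x ~B y as a set of pairs: agreement on every a in B."""
    return {(x, y) for x in S for y in S
            if all(S[x][a] == S[y][a] for a in B)}

def depends(B1, B2):
    """B1 depends on B2 iff ~B2 is contained in ~B1."""
    return indisc(B2) <= indisc(B1)

def coverings(B2, A):
    """Minimal subsets B1 of A such that B2 depends on B1."""
    covs = []
    for k in range(1, len(A) + 1):
        for B1 in combinations(A, k):
            if depends(B2, set(B1)) and \
               not any(set(c) < set(B1) for c in covs):
                covs.append(B1)
    return covs

print(coverings({"c"}, ["a", "b"]))      # [('b',)] : b alone determines c
```

In this toy system the values of b determine the values of c, so {b} is the only covering of {c}; this is the same search that produces the f-reducts in the example below.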

Example. Assume the following scenario:

• S1 = (X1, {c, d, e, g}, V1), S2 = (X2, {a, b, c, d, f}, V2),
S3 = (X3, {b, e, g, h}, V3) are information systems,

• a user submits a query q = q(c, e, f) to the query answering system QAS
associated with system S1,

• systems S1, S2, S3 are parts of DKS.

Attribute f is non-local for system S1, so the query answering system
associated with S1 has to contact other sites of DKS, requesting a definition
of f in terms of {d, c, e, g}. Such a request is denoted by < f : d, c, e, g >.
Assume that the system S2 is contacted. The definition of f extracted from
S2 involves only the attributes {d, c, e, g} ∩ {a, b, c, d, f} = {c, d}. There are three
f-reducts (coverings of f) in S2, namely: {a, b}, {a, c}, {b, c}. The optimal
f-reduct is the one which has the minimal number of elements outside {c, d}. Let
us assume that {b, c} is chosen as an optimal f-reduct in S2.

Then the definition of f in terms of the attributes {b, c} will be extracted
from S2, and the query answering system of S2 will contact other sites of DKS,
requesting a definition of b (which is non-local for S1) in terms of the attributes
{d, c, e, g}. If a definition of b is found, then it is sent to the QAS of site 1.
Figure 4 shows the process of resolving the query q in the example above.
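The reduct choice in this example can be sketched as follows (illustrative Python; the cost function, counting attributes outside the overlap, follows the optimality criterion stated above):

```python
# Illustrative sketch of the reduct choice in the example above: among the
# coverings of f found at site 2, prefer those with the fewest attributes
# outside the overlap {c, d} shared with the request < f : d, c, e, g >.

coverings_of_f = [{"a", "b"}, {"a", "c"}, {"b", "c"}]
overlap = {"c", "d"}                 # attributes of the request local to S2

def optimal_reducts(coverings, overlap):
    """All coverings with the minimal number of attributes outside overlap."""
    cost = lambda r: len(r - overlap)
    best = min(cost(r) for r in coverings)
    return [sorted(r) for r in coverings if cost(r) == best]

print(optimal_reducts(coverings_of_f, overlap))
# [['a', 'c'], ['b', 'c']] -- each needs one remote attribute; the example
# assumes the tie is broken in favour of {b, c}
```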
We will use the graph in Figure 5 to represent visually the fact that R[i] is an
a-reduct at site i containing attribute b.

Let us adopt the following definition. By an < a1, A >-linear set of reducts
we mean a set {< ai, R[i] > : 1 ≤ i ≤ k} such that:

• ai ∉ A, for any 1 ≤ i ≤ k,

• ai+1 ∈ R[i], for any 1 ≤ i ≤ k − 1,

• R[i] is an ai-reduct at site i and card(R[i] − A) = 1, for any 1 ≤ i ≤ k − 1,

• R[k] ⊆ A.

[Figure 4 shows the data tables at sites 2 and 3 (overlapping on attribute b),
the rules extracted at site 2 (b1∗c1 → f1, b2∗c2 → f2, b2∗c1 → f3, c3 → f4),
the coverings of f at site 2 ({a, b}, {a, c}, {b, c}, with {b, c} chosen as the
optimal one), and the coverings of b at site 3 ({e}, {g, h}, with {e} chosen as
the optimal one).]

Figure 4: Process of resolving a query by QAS in DKS

Figure 5: R[i]: an a-reduct at site i containing attribute b



[Figure 6 shows a chain of reducts R[1], R[2], ..., R[k−1], each with a minimal
number of attributes outside A, closed by R[k], which is a subset of A.]

Figure 6: < a1, A >-linear set of reducts

Figure 6 visually represents an < a1, A >-linear set of reducts. Clearly, the
existence of an < a, A >-linear set of reducts is sufficient for attribute a to be
definable in DKS. The existence of an < a, A >-directed set of reducts (defined
below) is necessary for attribute a to be definable in DKS.

By an < a1, A >-directed set of reducts we mean a smallest, non-empty set
{< ai, R[i], Si > : 1 ≤ i ≤ k} such that:

• ai ∉ A, for any 1 ≤ i ≤ k,

• Si is a site of DKS, for any 1 ≤ i ≤ k,

• R[i] is an ai-reduct at site Si, for any 1 ≤ i ≤ k,

• (∀a ∈ ∪{R[i] : i ≤ k} − A)(∃j ≤ k)[a = aj],

• R[k] ⊆ A.

Clearly, for every (a1, A) we have to search for the smallest < a1, A >-directed
set of reducts, in order to guarantee the smallest number of steps needed to
learn the definition of attribute a1 while still keeping the confidence in what we
learn the highest.
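The search for such a chain can be sketched as a breadth-first search (illustrative Python; the reduct tables below are hypothetical stand-ins for what the sites of DKS would report):

```python
from collections import deque

# Illustrative sketch: breadth-first search for a shortest chain of reducts
# defining a non-local attribute a1 in terms of local attributes A.
# reducts[x] lists the (hypothetical) reducts available in DKS for x.

reducts = {"f": [{"a", "c"}, {"b", "c"}],   # reducts of f found remotely
           "b": [{"e"}, {"g", "h"}],        # reducts of b
           "a": [{"h"}]}                    # reducts of a

def chain(a1, A):
    """Shortest sequence of reducts ending with R[k] contained in A."""
    queue = deque([(a1, [])])
    while queue:
        attr, path = queue.popleft()
        for R in reducts.get(attr, []):
            outside = R - A
            if not outside:                  # R[k] ⊆ A: the chain is closed
                return path + [R]
            if len(outside) == 1:            # linear case: one new attribute
                queue.append((next(iter(outside)), path + [R]))
    return None

A = {"c", "d", "e", "g"}
result = chain("f", A)
print([sorted(R) for R in result])           # [['b', 'c'], ['e']]
```

The run reproduces the example: f is defined via {b, c}, and b in turn via {e}, which is local.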

6 Conclusion
A query answering system for DKS can handle two types of queries:
Queries asking for all objects at a site i which satisfy a given description
(any attributes are allowed to be used here). In this case, the query answering
system will search for operational definitions of all attributes not present at
site i before it can process the query locally.
Queries asking for actions which have to be undertaken in order to change
the classification of some objects at site i. Such queries can be processed
either entirely at site i or moved for remote processing to other sites of DKS. In
the latter case, operational definitions of all attributes from site i in terms
of attributes from another site are needed. This problem, however, will be the
topic of a separate paper.

References
1. Maluf, D., Wiederhold, G., "Abstraction of representation for interoperation",
in Proceedings of the Tenth International Symposium on Methodologies
for Intelligent Systems, LNCS/LNAI, No. 1325, Springer-Verlag, 1997,
441-455
2. Navathe, S., Donahoo, M., "Towards intelligent integration of heterogeneous
information sources", in Proceedings of the Sixth International
Workshop on Database Re-engineering and Interoperability, 1995
3. Pawlak, Z., "Rough classification", in International Journal of Man-Machine
Studies, Vol. 20, 1984, 469-483
4. Prodromidis, A.L., Stolfo, S., "Mining databases with different
schemas: Integrating incompatible classifiers", in Proceedings of the
Fourth Intern. Conf. on Knowledge Discovery and Data Mining, AAAI
Press, 1998, 314-318
5. Ras, Z., "Dictionaries in a distributed knowledge-based system", in Concurrent
Engineering: Research and Applications, Conference Proceedings,
Pittsburgh, Penn., Concurrent Technologies Corporation, 1994,
383-390
6. Ras, Z., "Resolving queries through cooperation in multi-agent systems",
in Rough Sets and Data Mining (Eds. T.Y. Lin, N. Cercone), Kluwer
Academic Publishers, 1997, 239-258
7. Ras, Z., Wieczorkowska, A., "Action Rules: how to increase profit of
a company", in Principles of Data Mining and Knowledge Discovery
(Eds. D.A. Zighed, J. Komorowski, J. Zytkow), Proceedings of PKDD'00,
Lyon, France, LNCS/LNAI, No. 1910, Springer-Verlag, 2000, 587-592
8. Ras, Z., Zytkow, J., "Mining for attribute definitions in a distributed two-layered
DB system", in Journal of Intelligent Information Systems, Kluwer,
Vol. 14, No. 2/3, 2000, 115-130
9. Ras, Z., Zytkow, J., "Discovery of equations to augment the shared operational
semantics in distributed autonomous DB system", in PAKDD'99
Proceedings, LNCS/LNAI, No. 1574, Springer-Verlag, 1999, 453-463
10. Zytkow, J.M., Zhu, J., Zembowicz, R., "Operational definition refinement:
a discovery process", in Proceedings of the Tenth National Conference
on Artificial Intelligence, The AAAI Press, 1992, 76-81
APPROXIMATE REASONING BY AGENTS IN
DISTRIBUTED ENVIRONMENTS

ANDRZEJ SKOWRON
Institute of Mathematics, Warsaw University
Banacha 2, 02-097 Warsaw, Poland
E-mail: skowron@mimuw.edu.pl

Solving complex problems by multi-agent systems in distributed environments
requires new approximate reasoning methods based on new computing paradigms.
One such recently emerging computing paradigm is Granular Computing. Granular
computations are performed on information granules representing vague and
complex concepts delivered by agents engaged in tasks such as knowledge representation,
communication with other agents, and reasoning. In this paper, we present
an outline of the foundations for information granule calculi and of methods for inducing
relevant information granule constructions from data and background knowledge.
These constructions can be interpreted as approximate reasoning schemes. The
proposed methodology of approximate reasoning has been developed for solving
complex problems in areas such as the identification of objects by autonomous
systems, web mining or sensor fusion.

1 Introduction

Information processing in intelligent systems, in particular in multi-agent systems,
needs new soft-computing paradigms. The solutions derived by such
systems should satisfy a given specification not necessarily exactly, but to
a satisfactory degree. One can expect that, due to such relaxed specification
constraints, robust solutions to complex problems can be derived
efficiently by methods based on these new paradigms.

One such recently emerging paradigm is Granular Computing, based
on the rough set and rough mereological approaches (see, e.g., Zadeh and
Kacprzyk [33], Zhong et al. [35], Lin [9], Polkowski and Skowron [22,25,26], Skowron
and Stepaniuk [29], Nguyen et al. [11], Skowron [27]), as a way to achieve Computing
with Words (see, e.g., Zadeh [32,34], Zadeh and Kacprzyk [33]). Granular
computations are performed on information granules representing vague and
complex concepts delivered by agents engaged in, for example, knowledge representation,
communication with other agents, and reasoning. Our approach
is related to logical aspects of perception (see, e.g., Zadeh [34]).

Specifications of complex tasks are often formulated in words, phrases or
more complex texts of a natural language. Hence, the following main problem
arises: whether, and how, an information granule, in a sense, sufficiently close
to the target information granule Gt representing the task specification can be
constructed from input information granules (e.g., representing sensor measurements).
One of the important problems is related to the construction of an interface
allowing knowledge acquisition agents (KA-agents) to acquire knowledge
from customer-agents (CA-agents), who specify a task. The aim is to induce
a satisfactory approximation Gk of the target information granule Gt in the
language of the KA-agents, i.e., an information granule Gk sufficiently close to
(or included in) the target information granule Gt. Hence, some tools for
expressing the inclusion and proximity (closeness) of information granules, measured
by the degree of proximity, are needed. For this purpose we use rough
sets (see, e.g., Pawlak [14], Komorowski et al. [7]) and rough mereology (see, e.g.,
Polkowski and Skowron [18,20,22]). The interface construction should be supported
by background knowledge (in particular, by an ontology of concepts) and by
experimental data.

An information granule G sufficiently close to the information granule Gk
delivered by the KA-agents should be constructed from input information granules
(representing, e.g., sensor measurements). In the search for the granule G,
relevant operations and inclusion (closeness) measures on information granules
should be discovered and used. The granule G is constructed from basic components
defined by information granule calculi. Any such calculus consists of
components such as (i) elementary input information granules, (ii) operations
on information granules, (iii) relations of inclusion and proximity, measured by
the proximity degree, between information granules, and (iv) schemes of information
granule construction, which can be treated as approximate reasoning
schemes (AR-schemes, for short) on information granules.

Elementary information granules, together with the inclusion and proximity
relations between such granules, are the primitive constructs in granule construction.
Higher level constructs, like information granules and related inclusion
(closeness) relations, can be defined from previously constructed lower level
constructs using relevant operations.
Fusion operations are important operations on information granules.
They are based on negotiation schemes for resolving conflicts between the agents
delivering the arguments of the operations. More complex operations are defined by
robust AR-schemes. Such schemes are obtained from approximate reasoning
rules and methods for their composition, dependent on the available data and
background knowledge. The robustness of AR-schemes means that the closeness
(inclusion) of the constructed granules is preserved to a satisfactory degree
under small deviations of the input granules (or of the operation parameters used for
the granule construction). The robustness of the target construction can be
deduced from the robustness of its sub-constructions, if some constraints
for composition are satisfied. The robust AR-schemes should be extracted
from experimental (e.g., sensory) data and/or background knowledge rather
than obtained by classical deduction mechanisms.
The AR-schemes are parameterized. Relevant information granules are
constructed by tuning the AR-scheme parameters. There are several kinds of
parameters tuned in the process of searching for relevant information granules.
Some of them come from the approximation spaces of agents, which make it possible
to obtain a proper generalization degree of the granule constructed in
inductive reasoning. Other parameters are related to agent teams and are used
to tune the measures of inclusion (closeness) between information granules and to
tune the propagation mechanisms of the inclusion (closeness) degrees along the
AR-schemes. The AR-schemes in multi-agent systems can be treated as higher
order neural networks, called rough neural networks, performing operations
on information granules instead of numbers. One of the main problems of
the new Rough Neurocomputing paradigm is to develop methods for inducing
rough neural networks.

In this paper, we outline an approach to the above mentioned problems.
Our approach is based on the foundations of a calculus on information granules
developed by means of the rough set and rough mereological approaches. Its aim is
to create a methodology and tools for solving a wide class of complex problems,
ranging from the identification of road traffic situations by an unmanned aerial
vehicle (see, e.g., the www page of the WITAS project [31]) to problems of text data
mining in the Internet (see, e.g., Skowron [27], Kargupta and Chan [6]).

2 Information Granule Systems

In this section, we present the basic notion of our approach, i.e., that of an information
granule system. Any such system S consists of a set of elementary granules E
together with an operation {·} making collections of granules from finite sets of
granules. A finite subset of the set generated from the elementary granules using
this operation is fixed. This subset is extended by means of other operations
on information granules, producing new information granules. Moreover, a
family of relations with the intended meaning to be a part to a degree between
information granules is distinguished. Degrees of inclusion are also treated as
information granules. The degree structure is described by a relation to be an
exact part. More formally, an information granule system is any tuple

S = (E, {E}, H, O, ν, {νp}p∈H)    (1)

where

1. E is a finite set of elementary granules;

2. {E} is a finite subset of Pω(E) = E ∪ P(E) ∪ P(E ∪ P(E)) ∪ ..., where
P(X) denotes the powerset of X;

3. H is a finite set of granule inclusion degrees with a binary relation ν ⊆
H × H to be an (exact) part; ν defines on H a structure used to compare
the degrees, by assuming, e.g., p ≤ q if and only if ν(q, p);

4. O is a set of (partial) operations used to construct new granules from
{E}; by means of the operations from O, the set {E} is extended to the set
G(S) ⊆ Pω(E) of granules generated from {E} using operations from O;

5. νp ⊆ G(S) × G(S) is a binary relation to be a part to a degree at least p
between information granules from G(S).
One can consider the following examples of the set E of elementary granules:
(i) a set of descriptors of the form (a, v), where a ∈ A and v ∈ Va for some
finite attribute set A and value sets Va, and (ii) a set of descriptor conjunctions.
The set {E} consists of granules constructed by means of an operation
{·} making collections from already constructed granules. Examples of such
granules are tolerance granules created by means of a similarity (tolerance) relation
between elementary granules, decision rules, sets of decision rules, sets
of decision rules with guards, and information systems or decision tables (see, e.g.,
Polkowski and Skowron [22], Skowron and Stepaniuk [29], Skowron [27]). The most
interesting class of information granules is created by the information granules specified
in natural language and their approximations by means of experimental data
tables and background knowledge.
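A minimal sketch of descriptor granules and the relation νp (illustrative Python; the object table and the fraction-based inclusion measure are assumptions in the spirit of rough inclusion, not definitions from the paper):

```python
# Illustrative sketch: elementary granules as descriptors (a, v), granule
# semantics as sets of supporting objects, and inclusion to a degree p
# measured as the fraction of one granule's objects falling in the other.

objects = {"x1": {"a": 1, "b": 1}, "x2": {"a": 1, "b": 2},
           "x3": {"a": 2, "b": 2}, "x4": {"a": 2, "b": 2}}

def meaning(descriptors):
    """Objects satisfying a conjunction of descriptors (a, v)."""
    return {x for x, vals in objects.items()
            if all(vals[a] == v for a, v in descriptors)}

def nu_p(g1, g2, p):
    """g1 is a part of g2 to a degree at least p (rough-inclusion style)."""
    m1, m2 = meaning(g1), meaning(g2)
    degree = len(m1 & m2) / len(m1) if m1 else 1.0
    return degree >= p

print(nu_p([("a", 1)], [("b", 2)], 0.5))   # True: half the a=1 objects have b=2
```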
As an example of the set H of granule inclusion degrees, one can consider
the set of binary sequences of a fixed length, with the relation ν to be a
part defined by the lexicographical order. This degree structure can be used
to measure the inclusion degree between granule sequences, or to measure the
matching degree between granules representing classified objects and granules
describing the left hand sides of decision rules in simple classifiers (see, e.g.,
Polkowski and Skowron [25]). However, one can consider more complex degree
granules by taking, as the degree of inclusion of a granule g1 in a granule g2, the granule
being a collection of the common parts of the two granules g1 and g2. The
relation ν satisfies some additional axioms adopted from mereology (Polkowski
and Skowron [18]).
The operations from O are important for constructing an extension of {E}.
One can consider, as operations on information granules, set theoretical operations
(defined by propositional connectives). However, there are other operations,
widely used in machine learning and pattern recognition (Mitchell [10]), for
the construction of classifiers. These are the Match and Conflict_res operations
(Polkowski and Skowron [25]). The Match operation is used to construct a
granule describing the result of matching the elementary granules describing classified
objects against the granules representing the left hand sides of decision rules.
Conflict_res is an operation producing from this matching granule the resulting
granule, e.g., identifying a relevant decision class for the classified object. It
is worthwhile mentioning yet another important class of operations, namely,
operations defined by data tables called decision tables (Skowron and Stepaniuk [29]).
From these decision tables, decision rules specifying operations can
be induced. More complex operations on information granules are the so called
transducers (Doherty et al. [3]). They have been introduced in order to use background
knowledge (not necessarily in the form of data tables) in the construction of new
granules. One can consider theories or their clusters as information granules.
Reasoning schemes in natural language define the most important class of operations
on information granules to be investigated. One of the basic problems
for such operations and schemes of reasoning is how to approximate them by
the available information granules, e.g., those constructed from sensor measurements.
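The Match and Conflict_res pair can be sketched as follows (illustrative Python; the rules and the majority-vote conflict resolution are hypothetical choices, one of several strategies the operation could use):

```python
from collections import Counter

# Illustrative sketch: Match collects the decision rules whose left hand
# sides are satisfied by the classified object; Conflict_res resolves the
# resulting matching granule into a single decision class (here by vote).

rules = [({"b": "b1", "c": "c1"}, "f1"),     # hypothetical rules
         ({"b": "b2", "c": "c2"}, "f2"),
         ({"c": "c3"}, "f4")]

def match(obj):
    """Granule of decisions of rules whose left hand side fits obj."""
    return [dec for lhs, dec in rules
            if all(obj.get(a) == v for a, v in lhs.items())]

def conflict_res(matched):
    """Resolve the matching granule, e.g. by majority vote."""
    return Counter(matched).most_common(1)[0][0] if matched else None

print(conflict_res(match({"b": "b1", "c": "c1", "d": "d1"})))   # f1
```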
In an information granule system, the relation to be a part to a degree
has a special role. It satisfies some additional axioms of rough mereology
(Polkowski and Skowron [18]). It can be shown that the rough mereological
approach, built on the basis of the relation to be a part to a degree, generalizes
the rough set and fuzzy set approaches. Moreover, such relations can be used
to define other basic concepts, like the closeness of information granules, their
semantics, the indiscernibility and discernibility of objects, information granule
approximation and approximation spaces, the perception structure of information
granules, as well as the notion of ontology approximation. One can observe
that the relation to be a part to a degree can be used to define operations
on information granules corresponding to generalizations of already defined
information granules.

Let us finally note that new information granule systems can be defined
using already constructed information granule systems. This leads to a hierarchy
of information granule systems.

3 Multi-Agent System Based on Information Granules

In this section, we outline how our approach can be used in approximate
reasoning by agents (Huhns [5]) in a distributed environment.
We assume that each agent ag ∈ Ag is equipped with a system of information
granules S(ag). Using such a system, the agent ag creates a representation for
all its components. The reader can find some details of such a representation,
e.g., in papers by Polkowski and Skowron [20,22]. Agents are able to extract
local approximate reasoning schemes, called productions, from such representations.
Algorithmic methods for extracting such productions from data are
discussed in papers by Polkowski and Skowron [19], Skowron [27], and Skowron and
Stepaniuk [30]. The left hand side of each production (in the simplest case) is
of the form

(st1(ag), (ε_1^(1), ..., ε_r^(1))), ..., (stk(ag), (ε_1^(k), ..., ε_r^(k)))    (2)

and the right hand side is of the form

(st(ag), (ε_1, ..., ε_r))    (3)

for some positive integers k, r.
Such a production (see Figure 1) represents information about an operation
o that can be performed by the agent ag. In the production, k denotes the
arity of the operation. The operation o represented by the production transforms
the standard (prototype) input information granules st1(ag), ..., stk(ag)
into the standard (prototype) information granule st(ag). Moreover, if the input
information granules g1, ..., gk are included in (close to) st1(ag), ..., stk(ag) to
degrees ε_j^(1), ..., ε_j^(k), then the result of the operation o on the information granules
g1, ..., gk is included in (close to) the standard st(ag) to a degree at least
ε_j, where 1 ≤ j ≤ r (see Figure 1). Standard (prototype) granules can be
interpreted in different ways. In particular, they can correspond to concept
names in natural language.
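A production can be sketched as a table of guaranteed degree propagations (illustrative Python; the standards, the numeric degrees, and the table-lookup propagation are all hypothetical simplifications of formulas (2) and (3)):

```python
# Illustrative sketch: a production of an agent ag transforms inclusion
# degrees of the inputs in the standards st1(ag), st2(ag) into a guaranteed
# inclusion degree of the result in st(ag).

production = {
    "standards_in": ("st1", "st2"),
    "standard_out": "st",
    # rows of guaranteed propagation: if the inputs reach these degrees,
    # the output reaches the associated degree
    "rows": [((0.9, 0.8), 0.8), ((0.7, 0.6), 0.5)],
}

def apply_production(prod, input_degrees):
    """Best output degree guaranteed for the given input degrees."""
    guaranteed = [out for ins, out in prod["rows"]
                  if all(d >= e for d, e in zip(input_degrees, ins))]
    return max(guaranteed, default=None)

print(apply_production(production, (0.95, 0.85)))   # 0.8
print(apply_production(production, (0.75, 0.65)))   # 0.5
```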

Figure 1. Productions and AR-schemes

The sample productions in Figure 1 are the basic components of a reasoning
system related to the agent set Ag. An important property of such productions
is that they are expected to be discovered from the available experimental data
and background knowledge. Let us also observe that the degree structure is
not necessarily restricted to positive reals from the interval [0, 1]. The inclusion
degrees can be complex information granules used to represent the degree
of inclusion. It is worthwhile to mention that the productions can also be
interpreted as a constructive description of some operations on fuzzy sets.
The methods for such a constructive description are based on rough sets and
Boolean reasoning (see, e.g., Komorowski et al. [7], Pawlak [14]).
Reasoning in a multi-agent system can be represented as a construction process
of information granules. This process is not restricted to internal operations
performed by the agents. The agents can communicate, and in this process they
exchange information granules. It is important to note that every agent
possesses her/his own information granule system. Hence, a granule received
by one agent from another agent cannot, in general, be understood precisely by
the receiving agent. We assume that with the j-th argument of any operation o
performed by an agent ag there is associated an approximation space AS(ag)_j
(see, e.g., Skowron and Stepaniuk [29], Polkowski and Skowron [25]) making
it possible to construct relevant approximations of the received information
granules, used next as operation arguments. The result of the approximation is
an information granule in the information granule system of the agent ag.
In some cases, the approximation can be induced using rough set methods
(see, e.g., Skowron and Stepaniuk [29]). In general, constructing information
granule approximations is a complex process because, for instance, a high
quality approximation of concepts can often be obtained only through dialog
(including negotiations, conflict resolution, and cooperation) among agents.
In this process, the approximation can be constructed gradually as the dialog
progresses.
The approximation spaces are usually parameterized. This means that it is
necessary to tune their parameters to find (sub-)optimal approximations of
the information granules. This observation was a starting point for the Rough
Neurocomputing paradigm (see Skowron et al. [28], Polkowski and Skowron [25],
Pal et al. [13], Skowron and Stepaniuk [29], Skowron [27]).

In general, the inputs of rough neurons are derived from information
granules instead of real numbers, and the parameterized approximation spaces
correspond to the real weights in a classical neuron. The result of an operation
o depends on the chosen parameters of the approximation spaces. The process of
tuning the parameters of such approximation spaces corresponds to the process
of weight tuning in classical neurons.
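The weight analogy can be made concrete with a small sketch (illustrative Python; the numeric readings and the tolerance-based approximation space are assumptions, one simple instance of a parameterized approximation space):

```python
# Illustrative sketch: a parameterized approximation space over numeric
# sensor readings. The tolerance eps acts like a tunable weight of a rough
# neuron: it controls which objects are indiscernible and hence the lower
# and upper approximations of a received granule.

measurements = {"x1": 1.0, "x2": 1.2, "x3": 2.0, "x4": 2.1, "x5": 3.0}

def neighborhood(x, eps):
    """Objects whose readings differ from x's by at most eps."""
    return {y for y, v in measurements.items()
            if abs(v - measurements[x]) <= eps}

def approximations(concept, eps):
    """Lower and upper approximations of a set of objects."""
    lower = {x for x in measurements if neighborhood(x, eps) <= concept}
    upper = {x for x in measurements if neighborhood(x, eps) & concept}
    return lower, upper

lower, upper = approximations({"x1", "x2", "x3"}, eps=0.3)
print(sorted(lower), sorted(upper))
# ['x1', 'x2'] ['x1', 'x2', 'x3', 'x4'] -- x3 falls into the boundary region
```

Enlarging or shrinking eps coarsens or refines both approximations, which is exactly the kind of parameter the tuning process adjusts.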
Now we are able to discuss one of the main concepts of our approach, i.e.,
approximate reasoning schemes (AR-schemes). They can be treated as
derivations obtained by using the productions of different agents. Assume, for
simplicity of considerations, that the agents are working with the same system of
information granules, i.e., they do not use approximation spaces to approximate
granules received from other agents. The approach can be extended to
the more general case. The relevant derivations defining AR-schemes satisfy
a so called robustness (or stability) condition (see Figure 1). That is, at any
node of the derivation, the inclusion (or closeness) degree of a constructed granule
(to a given standard) is higher than required by the production to which the
result should be sent. This makes it possible to obtain a sufficient robustness
condition for the whole derivation. For details the reader is referred to papers
by Polkowski and Skowron [20,22,23,24,26]. In the general case, i.e., when it is
necessary to use approximation spaces, the AR-schemes can be interpreted as
rough neural networks. In the case where the standards are interpreted as concept
names in natural language and a reasoning scheme in natural language
over such standards is given, the corresponding rough neural network represents
a cluster of reasoning constructions approximately following (in other
information granule systems) the reasoning given in natural language.

Let us observe that AR-schemes are not classical proofs defined by means
of deductive systems. They are approximate reasoning schemes discovered
from data and background knowledge. The notion of a classical proof is replaced
by derivations defining AR-schemes, i.e., derivations satisfying
some constraints. The deductive systems are replaced by the production systems
of agents, linked by approximation spaces, communication strategies and
a mechanism for the derivation of AR-schemes. This revision of classical logical
notions seems to be important for solving complex problems in multi-agent
systems.
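The robustness condition on a two-level derivation can be sketched as follows (illustrative Python; the two productions and their numeric degree requirements are hypothetical):

```python
# Illustrative sketch: composing two productions into an AR-scheme and
# checking the robustness condition -- at every node the achieved degree
# must be at least the degree the next production requires on that input.

# prod = (required input degrees, guaranteed output degree); hypothetical
prod_low  = ((0.8, 0.8), 0.9)          # leaf production
prod_high = ((0.85, 0.7), 0.75)        # root production; 1st input from leaf

def robust(achieved_inputs, prod):
    """Output degree if the inputs meet the requirements, else None."""
    req, out = prod
    if all(a >= r for a, r in zip(achieved_inputs, req)):
        return out
    return None

leaf_out = robust((0.9, 0.85), prod_low)        # 0.9: the leaf fires
root_out = robust((leaf_out, 0.8), prod_high)   # 0.9 >= 0.85 and 0.8 >= 0.7
print(leaf_out, root_out)                       # 0.9 0.75
```

The leaf's guaranteed degree (0.9) exceeds what the root requires (0.85), so the composed derivation is robust; a leaf guaranteeing only 0.8 would break the chain.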

4 Conclusions and Some Directions for Further Research

We have outlined a methodology for approximate reasoning in distributed
multi-agent systems, based on rough mereology.

Several research directions are related to the discussed AR-schemes and
rough neural networks. We list such directions below, together with examples
of problems.

1. Developing foundations for information granule systems.

Certainly, more work is still needed to develop solid foundations for the
synthesis and analysis of information granule systems. In particular, methods
for the construction of hierarchical information granule systems, and
methods for the representation of such systems, should be developed.

2. Algorithmic methods for inducing parameterized productions.

Some methods have already been reported, such as the discovery of rough
mereological connectives from data (see, e.g., Polkowski and Skowron [19])
or methods based on decomposition (see, e.g., Polkowski and Skowron [20],
Skowron [27], Skowron and Stepaniuk [30], Peters et al. [16]). However, these
are only initial steps toward algorithmic methods for inducing parameterized
productions from data. One interesting problem is to determine
how such productions can be extracted from data and background knowledge.
A method in this direction has been proposed in a paper by Doherty
et al. [3].

3. Algorithmic methods for the synthesis of AR-schemes.

It has been observed (see, e.g., Polkowski and Skowron [20,25]) that the problems of
negotiations and conflict resolution are of great importance for the synthesis
of AR-schemes. The problem arises, e.g., when we are searching, in a
given set of agents, for a granule sufficiently included in or close to a given
one. These agents, often working with different systems of information
granules, can derive different granules, and their fusion will be necessary
to obtain the relevant output granule. In the fusion process, negotiations
and conflict resolution are necessary. Much more work should be
done in this direction, building on the existing results on negotiations and
conflict resolution. In particular, Boolean reasoning methods seem to be
promising (Polkowski and Skowron [20]). Another problem is related to the
size of the production sets. These sets can be large, and it is important
to develop learning methods for extracting small candidate production
sets, in the process of extending temporary derivations, out of huge production
sets. For solving this kind of problem, methods for the clustering of
productions should be developed in order to reduce the size of the production sets.
Moreover, dialog and cooperation strategies between agents can help to
reduce the search space for the necessary extensions of temporary derivations.

4. Algorithmic methods for learning in rough neural networks.

A basic problem in rough neural networks is related to selecting relevant
approximation spaces and to parameter tuning. One can also investigate
to what extent the existing learning methods for classical neural networks can
be used for learning in rough neural networks. However, it seems that
a new approach and new methods for the learning of rough neural networks should
be developed to deal with real-life applications. In particular, this is due
to the fact that high quality approximations of concepts can often be
obtained only through dialog and negotiation processes among agents,
in which the concept approximation is constructed gradually. Hence, for
rough neural networks, learning methods based on dialog, negotiations
and conflict resolution should be developed. In some cases, one can
directly use rough set and Boolean reasoning methods (see, e.g., Skowron
and Stepaniuk [29]). However, more advanced cases need new methods.
In particular, hybrid methods based on the rough and fuzzy approaches can
bring new results (Pal and Skowron [12]).

5. Fusion methods in rough neurons.

A basic problem in rough neurons is the fusion of the inputs (information)
derived from information granules. This fusion makes it possible to contribute
to the construction of new granules. In the case where the granule
constructed by a rough neuron consists of characteristic signal values
made by relevant sensors, a step in the direction of solving the fusion
problem can be found in Peters et al. [17].

6. Adaptive methods.
Certainly, adaptive methods for discovery of productions and for learning of
AR-schemes and rough neural networks should be developed (Koza 8).

7. Discovery of multi-agent systems relevant for given problems.


Quite often, the agents and the communication methods among them are not
given a priori with the problem specification, and a challenge is to develop
methods for discovering multi-agent system structures relevant for given
problems, in particular methods for discovering relevant communication
protocols.

8. Construction of multi-agent systems for complex real-life problems.


The challenging problems are related to applying the presented methodology
to real-life problems like the control of autonomous systems (see, e.g., the
web page of the WITAS project 31), web mining problems (see, e.g., Kargupta
and Chan 6, Skowron 27), sensor fusion (see, e.g., Brooks et al. 1, Peters
et al. 15,17) or spatial reasoning (see, e.g., Escrig 2, Düntsch 4).

9. Evolutionary methods.
For all of the above methods it is necessary to develop evolutionary
searching methods for (semi-)optimal solutions (Koza 8).

10. Parallel algorithms.


The discussed problems are of high computational complexity. Parallel
algorithms searching for AR-schemes and methods for their hardware
implementation constitute one important research direction.

Acknowledgements. I would like to thank Professor Lech Polkowski
for the years of close cooperation on rough mereology, Professor Jaroslaw
Stepaniuk for the cooperation on information granule models, and Professor
James F. Peters for cooperation on sensor fusion methods, insightful
comments and a number of clarifying discussions on the presented paper.
The research has been supported by the State Committee for Scientific
Research of the Republic of Poland (KBN) research grant 8 T11C 025 19 and
by the Wallenberg Foundation grant.

References

1. R.R. Brooks et al, Multi-Sensor Fusion, (Prentice-Hall PTR, Upper
Saddle River, NJ, 1998).
2. M.T. Escrig et al, Qualitative Spatial Reasoning: Theory and Practice,
(IOS Press, Amsterdam, 1998).
3. P. Doherty et al, Combining Rough and Crisp Knowledge in Deductive
Databases (submitted).
4. I. Düntsch ed., Fundamenta Informaticae (special issue on Spatial Rea-
soning, in print).
5. M.N. Huhns et al eds., Readings in Agents, (Morgan Kaufmann, San
Mateo, 1998).
6. H. Kargupta and Ph. Chan, Advances in Distributed and Parallel Knowl-
edge Discovery, (AAAI Press/MIT Press, Cambridge 2001).
7. J. Komorowski et al in Rough Fuzzy Hybridization: A New Trend in
Decision-Making, eds. S.K. Pal and A. Skowron (Springer-Verlag, Singa-
pore, 1998).
8. J.R. Koza, Genetic Programming II: Automatic Discovery of Reusable
Programs (MIT Press, Cambridge, MA, 1994).
9. T.Y. Lin in 21, vol. 1.
10. T.M. Mitchell, Machine Learning (McGraw-Hill, Portland, 1997).
11. H.S. Nguyen et al, Computational Intelligence (in print).
12. S.K. Pal and A. Skowron eds., Rough-Fuzzy Hybridization: A New Trend
in Decision Making (Springer-Verlag, Singapore, 1999).
13. S.K. Pal et al eds., Rough-Neuro Computing in Neurocomputing 36,
(2001).
14. Z. Pawlak, Rough Sets. Theoretical Aspects of Reasoning about Data
(Kluwer Academic Publishers, Dordrecht, 1991).
15. J.F. Peters, S. Ramanna et al, in Proc. of Int. Fuzzy Systems Association
World Congress (IFSA'01), Vancouver, July 2001 (to appear).
16. J.F. Peters, A. Skowron et al, in Proc. of Int. Fuzzy Systems Association
World Congress (IFSA'01), Vancouver, July 2001 (to appear).


17. Z. Pawlak et al, Bulletin of International Rough Set Society 5(1-2), 177
(2001).
18. L. Polkowski and A. Skowron, International J. Approximate Reasoning
15(4), 333, (1996).
19. L. Polkowski and A. Skowron, in Critical Technology, Proc. of the Third
World Congress on Expert Systems, eds. J.K. Lee et al (Cognizant Com-
munication Corporation, New York, 1996).
20. L. Polkowski and A. Skowron, Information Sciences, An International
Journal 104(1-2), 129 (1998).
21. L. Polkowski and A. Skowron eds., Rough Sets in Knowledge Discovery,
1-2 (Physica-Verlag / Springer-Verlag, Heidelberg, 1998).
22. L. Polkowski and A. Skowron in 33, vol. 1.
23. L. Polkowski and A. Skowron in Grammar Systems for Multiagent Sys-
tems, eds. G. Paun and A. Salomaa (Gordon and Breach Science Pub-
lishers, Amsterdam, 1999).
24. L. Polkowski and A. Skowron in Rough Sets: New Developments in
Knowledge Discovery in Information Systems, eds. L. Polkowski, T.Y.
Lin, S. Tsumoto (Physica-Verlag / Springer-Verlag, Heidelberg, 2000).
25. L. Polkowski and A. Skowron in Proc. of RSCTC'2000, LNAI 2005,
(Springer-Verlag, Berlin, 2001) (in print).
26. L. Polkowski and A. Skowron, Computational Intelligence (2001) (in
print).
27. A. Skowron, Bulletin of International Rough Set Society 5(1-2), 9 (2001).
28. A. Skowron et al, Bulletin of the International Rough Set Society 3(4),
147 (1999).
29. A. Skowron and J. Stepaniuk, International Journal of Intelligent Sys-
tems 16(1), 57 (2001).
30. A. Skowron et al, Bulletin of International Rough Set Society 5(1-2), 135
(2001).
31. WITAS project web page: http://www.ida.liu.se/ext/witas/eng.html.
32. L.A. Zadeh, IEEE Trans. on Fuzzy Systems 4, 103 (1996).
33. L.A. Zadeh and J. Kacprzyk eds., Computing with Words in Informa-
tion/Intelligent Systems 1-2 (Physica-Verlag, Heidelberg, 1999).
34. L.A. Zadeh, AI Magazine 22(1), 73 (2001).
35. N. Zhong et al eds., New Directions in Rough Sets, Data Mining, and
Granular Soft Computing, Proc. of RSFDGrC'99, LNAI 1711 (Springer-
Verlag, Berlin, 1999).
MULTI-AGENT INFRASTRUCTURE FOR AGENT
INTEROPERATION IN OPEN COMPUTATIONAL
ENVIRONMENTS

KATIA SYCARA
The Robotics Institute, School of Computer Science
Carnegie Mellon University, USA
e-mail: katia@cs.cmu.edu
http://www.cs.cmu.edu/ softagents/

Multi-agent Systems (MASs) are becoming increasingly important: as a


scientific discipline, as a software engineering paradigm, and as a commercially
viable and innovative technology. Despite the considerable research that has
gone into the formation of theories, scientific principles and guidelines for
MASs, there is relatively little experience with building, fielding and
routinely using them. To make such routine use possible, a stable, widely
used, widely accessible and extensible MAS infrastructure is crucial. Various
standards bodies (e.g. FIPA) are attempting to define standards for various
aspects of MAS infrastructure, such as Agent Communication Languages.
However, there is no coherent account of what constitutes a MAS
infrastructure, what functionality it supports, what characteristics it should
have in order to enable various value-added abilities, and what relation it may
have with, and what requirements it may impose on, the design and structure
of single agents.
Another equally important aspect of MASs that operate in an open world
like the Internet, where communication links, information sources, services
and agents can appear and disappear dynamically, is the issue of discovery
and interoperation of agents. White-page and yellow-page registries of
companies, for example, have been proposed and implemented (e.g. Yahoo
business categories) for human understandability. We have coined the term
middle agents [Decker & Sycara, IJCAI-97] to describe various agent
intermediaries that can act as brokers and discovery services for agents on
the Internet. These domain-independent intermediaries facilitate the finding
and matching of agents and services with desirable functionalities (e.g. an
agent that finds weather information).
Such intermediaries are starting to have appeal for industry. For example,
industrial organizations (e.g. Sun) are developing and making accessible
software that could constitute a part of a MAS infrastructure, such as
Jini for service discovery. Protocols such as UDDI (www.uddi.org) and SOAP
(www.soapware.org), and languages such as WSDL (www.wsdl.org), ebXML
and e-speak, are receiving increased visibility.


In this talk, we will present a model of MAS infrastructure, and our im-
plemented RETSINA system that is an example of the general infrastructure
model. We will also discuss various applications that we have implemented
using RETSINA.
CHAPTER 1

FORMAL AGENT THEORIES


SPY: A MULTI-AGENT MODEL YIELDING SEMANTIC
PROPERTIES

F. BUCCAFURRI, D. ROSACI, G. M. L. SARNÈ, L. PALOPOLI
DIMET, Università "Mediterranea" di Reggio Calabria, Via Graziella Loc. Feo di
Vito, 89100 Reggio Calabria (Italy)
E-mail: {bucca,rosaci,sarne,palopoli}@ing.unirc.it

In Multi-Agent Systems the main goal is providing fruitful cooperation among


agents in order to enrich the support given to user activities. Cooperation can
be implemented in many ways, depending on how the local knowledge of agents is
represented, and in general consists in providing the user with an integrated view
of individual knowledge bases. The main difficulty, however, is determining which
agents are promising candidates for fruitful cooperation among the (possibly large)
universe of agents operating in the net. This paper contributes in this context
by proposing a formal framework for representing cooperation in multi-agent
networks. Semantic properties are represented by coefficients, and adaptive
algorithms permit the computation of a set of suggested agents for cooperation.
Actual choices of the users modify internal parameters in such a way that the next
suggestions are closer to user expectations.

1 Introduction

Coordinating the activities of multiple agents is a basic task for the viability
of any system in which such agents coexist. Each agent in an agent community
does not learn only by its own discovery, but also through cooperation with
other agents, by sharing individually learned knowledge. Indeed, cooperation
is often considered one of the key concepts of agent communities 6,7.
Researchers in Intelligent Agent Systems have recognized that learning and
adaptation are essential mechanisms by which agents can evolve coordinated
behaviours finalized to meet the knowledge of the interest domain and the
requirements of the individual agents 3,10. In order to realize such
cooperation, some techniques developed in the field of Machine Learning have
been introduced in various multi-agent systems 8,5,4. Such techniques open,
on the one hand, the possibility of integrating individual agent knowledge
for acquiring an enhanced knowledge of the environment. On the other hand,
they raise the problem of determining which agents are promising candidates
for suitable knowledge integration, taking into account situations of the kind
mentioned above.
In such a context, this paper describes a new multi-agent model, called
SPY, able to inform the individual agent of a multi-agent network about which
agents are the most appropriate to be contacted for possible knowledge
integration. The main contributions of this paper are the following: (1) We point
out which properties can be considered important for driving the integration
of the knowledge coming from non local agents and give a formal model in
which such properties are represented as quantitative information by means of
a number of real coefficients. (2) We propose an adaptive method for deter-
mining, for a given agent a of a multi-agent net, the most appropriate agents
to cooperate with a. Such a method is adaptive in the sense that it takes
into account some reactive properties of users, and, as such, its result depends
on their behaviour. (3) On the basis of this model we design a strategy for
supporting cooperation of agents operating in a multi-agent network. The
first step consists in providing the user with a number of agent lists, each
containing the most appropriate agents for cooperation, from which the user
can choose the agents she/he wants to contact for supporting her/his activity. The
multiplicity of such choice lists depends on the multiplicity of the properties
that can be used as preference criteria. Users are free to use the suggested
lists only partially, or can ignore them. In any case, the user's behaviour induces
a modification of some coefficients (describing reactive properties) in such a
way that lists suggested in the future are (hopefully) closer to real user needs.
Therefore, the system learns from user's behaviour about how to provide the
users with suggestions meeting as much as possible their expectancy. (4) Fi-
nally, we design the architecture of a system implementing the above agent
cooperation model a.
Throughout the paper we refer to a given set of agents A of cardinality n
and we suppose that all agents in A can cooperate with each other. Thus we
can see the set A as an undirected complete graph of agents whose arcs represent
possible cooperation. W.l.o.g., we identify agents in A with the numbers
{1, ..., n}.

2 The Knowledge Bases

2.1 An Ontology for describing the domain of interest


Since we consider only homogeneous agents, we suppose that a unique envi-
ronment is associated with our agent net. We represent such an environment
in our model by a set of objects. For the rest of the section we consider a set
of objects O as given.
A domain D on O is a set of classes of objects. We suppose that a

a This paper is a short version of the full report 2. For space limitations, the system
architecture is not illustrated in this paper and theorems are provided without proofs. A
detailed description of the system as well as proofs of theorems can be found in 2.

suitable semantics underlying the classification provided in a domain is given.


The notion of domain is formally defined next.
Definition 2.1 A Domain on O, denoted by D, is a subset of 2^O such that
both: (1) ∀o ∈ O, {o} ∈ D, and (2) there exists an element r of D, called root,
such that ∀o ∈ O, o ∈ r. Elements of D are called classes.
In words, a domain is a set of object classes containing a root class collecting
all the objects of O and, for each object o of O, the singleton class {o}.
Throughout the rest of the section, we assume a domain D on O as given.
Set containment induces a partial ordering among elements of a domain
D. A domain D plus this ordering in called ontology.
Definition 2.2 An ontology on D, denoted by O_D, is a partially ordered set
(D, ⊆). The ontology graph of O_D is a directed graph G(O_D) = (N, A), where
N (set of nodes) coincides with D and A (set of arcs) is the binary relation
obtained as the transitive reduction of the relation ⊆ of O_D^b. The node r of
G(O_D), where r is the root of D, is called the root of G(O_D).
Note that, as a consequence of item (2) of Definition 2.1, each other node
is reachable from the root r of G(O_D). Further, by item (1) of Definition
2.1, nodes of G(O_D) with out-degree 0 coincide with singleton classes of D.
An ontology based on a generalization hierarchy is suitable for representing
many real-world situations, like the topics in web engines and in web sites,
the items in e-commerce, the staff hierarchy of an organization and so on. It
is worth noting that this is not the only possible choice for representing the
environment of agents. Indeed, in different contexts, such as semi-structured
data in web documents, other kinds of ontologies may be better suited (for
example OEM-graphs 1, SDR-networks 9, etc.).

2.2 The Local Knowledge Base


The ontology represents the common knowledge about the environment in
which the agents work. However, each agent may have a partial view of the
ontology, representing the portion of the world that the user monitored by the
agent selects through her/his activity. Inside this portion of the ontology, different
priorities for the classes can be inferred by exploiting user behaviour. This is
encoded in the notion of the Local Knowledge Base (LKB for short), defined
next.
Definition 2.3 Given an ontology O_D on D and an agent a, a Local Knowledge
Base LKB_a (of a on O_D) is a pair (G_a, p_a), such that: (i) G_a = (N_a, A_a)
is a sub-graph of G(O_D) (i.e., N_a ⊆ D, A_a ⊆ A) containing the root r of
G(O_D) and such that each n ∈ N_a is reachable in G_a from r, and (ii) p_a is
a function, called priority function, assigning to each arc (i, j) of G_a a real
weight ranging from 0 to 1 such that:

    p_a(i, j) = c_ij / Σ_{k ∈ Adj(i)} c_ik,

where Adj(i) is the set of nodes adjacent to i and, for each k ∈ Adj(i), c_ik
counts how many times the user of a has selected an object (i.e. a leaf node)
through a path selection including the arc (i, k). Note that the coefficients
c_ik in a path of nodes r, i_1, ..., i_s are updated only when the leaf node i_s,
corresponding to a single object of the domain, is selected. The root r of
G(O_D) is also called the root of LKB_a.

b (A, B) is in the transitive reduction of ⊆ iff A ⊂ B and there is no C such
that A ⊂ C and C ⊂ B.
A Local Knowledge Base, representing the local view of the agent, is thus
obtained by extracting from the ontology graph a sub-graph that includes all
the classes accessed by the user (and thus at least the root node). Moreover,
the arcs of the graph so obtained are weighted so as to assign the highest
priority to the most accessed classes.
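To make the bookkeeping concrete, the counter updates and the priority
function of Definition 2.3 can be sketched in Python as follows. This is an
illustration only, not code from the paper; the class and method names (LKB,
record_selection, priority) are our own.

```python
from collections import defaultdict

class LKB:
    """Sketch of a Local Knowledge Base: a sub-graph of the ontology
    graph together with the per-arc selection counters c_ik."""

    def __init__(self):
        self.adj = defaultdict(set)    # node -> adjacent (child) nodes
        self.count = defaultdict(int)  # (i, k) -> c_ik

    def record_selection(self, path):
        """Update counters when the user selects a leaf object through a
        path r, i_1, ..., i_s (counters change only on leaf selection)."""
        for i, k in zip(path, path[1:]):
            self.adj[i].add(k)
            self.count[(i, k)] += 1

    def priority(self, i, j):
        """p_a(i, j) = c_ij / sum over k in Adj(i) of c_ik."""
        total = sum(self.count[(i, k)] for k in self.adj[i])
        return self.count[(i, j)] / total if total else 0.0
```

Each call to record_selection models one path selection ending in a leaf; the
priorities of the arcs leaving any visited node then sum to 1, as required.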

3 Extraction of the Semantic Properties and Integration of


LKBs

Besides his/her local agent, each user looks at the other agents of the net as
a source of potentially interesting information in order to enrich the support
to his/her activity. Interest in agents can be defined by considering some
semantic properties. Such properties, useful for driving users' choices, are
of two types: (i) local properties, taking into account information stored in
the LKBs, and (ii) global properties, merging local properties with external
knowledge extracted from the general context. An important feature of the
model is that the merge performed in the construction of global properties
is based on adaptive learning involving some parameters taking into account
user behaviour. In other words, global properties exploit an important kind of
properties (encoded by some parameters) directly reflecting reactions of users
to system advice. We call such additional properties reactive properties. Next
we describe the set of properties used in the model.

3.1 Local properties: Similarity


The only local property we consider is the property we call similarity be-
tween two agents i and j , representing a measure of the similarity of the two
corresponding LKBs. Such a coefficient is a real value ranging from 0 to 1.
Definition 3.1 Let i and j be two agents, let G^i = (N^i, A^i) and G^j =
(N^j, A^j) be the two graphs of their LKBs, and let p^i and p^j be the
corresponding priority functions. We define the similarity S_ij between i
and j as:

    S_ij = 1 - (1/|A^i ∪ A^j|) Σ_{(h,k) ∈ A^i ∪ A^j} γ_hk,

where γ_hk = |p^i(h, k) - p^j(h, k)| if (h, k) ∈ A^i ∩ A^j, and γ_hk = 1
otherwise.
Observe that the term (1/|A^i ∪ A^j|) Σ_{(h,k) ∈ A^i ∪ A^j} γ_hk in the
expression defining S_ij (for two agents i and j) represents a distance
between agents i and j. Such a distance is defined as the mean of a number of
contributions γ_hk, each corresponding to an arc (h, k) belonging to the set
A^i ∪ A^j. For common arcs of the two LKBs, that is, arcs belonging to the
intersection of A^i and A^j, γ_hk is the difference (in absolute value)
between the respective priority functions (note that such a difference is a
real value ranging from 0 to 1). In words, common arcs can be viewed as
"homologous" arcs, and their distance measures how much these arcs differ in
terms of weight. For the remaining arcs (h, k) ∉ A^i ∩ A^j, we assign the
value 1 to the coefficient γ_hk. Indeed, an arc belonging to A^i but not to
A^j has no "homologous" arc in the LKB graph of the agent j (and vice versa),
and thus its contribution to
the overall distance must be set to the maximum value, that is, the value 1.
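Definition 3.1 transcribes directly into code. The sketch below is
illustrative, not the authors' implementation; the representation (arc sets
plus priority dictionaries) and the convention for two empty LKBs are our
assumptions.

```python
def similarity(arcs_i, arcs_j, p_i, p_j):
    """S_ij = 1 - mean over (h,k) in A^i ∪ A^j of gamma_hk, where
    gamma_hk = |p^i(h,k) - p^j(h,k)| on common arcs and 1 otherwise.
    arcs_i/arcs_j are sets of arcs; p_i/p_j map arcs to priorities."""
    union = arcs_i | arcs_j
    if not union:
        return 1.0  # two empty LKBs: zero distance (edge-case assumption)
    common = arcs_i & arcs_j
    dist = sum(abs(p_i[a] - p_j[a]) if a in common else 1.0 for a in union)
    return 1.0 - dist / len(union)
```

Identical LKBs give S_ij = 1, while LKBs with no common arcs give S_ij = 0,
matching the intended range of the coefficient.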

3.2 Global Properties: Interest and Attractiveness


Recall that global properties merge local properties with knowledge extracted
from the context. In this section we introduce the notion of interest coeffi-
cient, representing just a measure of the global properties of a given agent as
perceived by another one. Hence, for a pair of agents i and j, the interest
coefficient, besides the similarity between i and j, must also take into
account knowledge extracted from the context. But which kind of contextual
knowledge has to be considered as meaningful? The choice we make in our model
is the following: the knowledge extracted from the context, used by the agent
i for defining the interest coefficient I_ij w.r.t. another agent j, is a
measure of the global interest of all the other agents (different from i)
w.r.t. the agent j, that is, a measure of a sort of attractiveness of the
agent j as perceived by the agent i. Recalling that the interest, besides the
contextual knowledge, must also take into account the local knowledge (i.e.,
the similarity), the above definition of contextual knowledge leads to require
that, for each i ∈ A \ {j}:

    I_ij = φ_ij(S_ij, μ_ij({I_kj | k ≠ i}))    (1)

where μ_ij and φ_ij are suitable functions yielding real values from 0 to 1.
In particular, μ_ij returns a measure of the attractiveness of the agent j
detected by the agent i from the values of the interest coefficients of all
the agents (different from i) w.r.t. j, while φ_ij combines such a measure
with the similarity

S_ij. Clearly, the function φ_ij also plays the role of weighing the
importance, for the agent i, of the local knowledge w.r.t. the contextual one.
For μ_ij and φ_ij (where i and j are two agents) we adopt in our model the
following choices: (i) μ_ij is the function computing the mean of the interest
coefficients, w.r.t. j, of all the agents different from i and j; (ii) φ_ij is
a function computing a linear combination of the similarity coefficient
between i and j and the attractiveness of j w.r.t. i. Applying the above
definitions for μ_ij and φ_ij, (1) becomes the following linear system:

    I_ij = ψ_ij · (P_i · S_ij + (1 - P_i) · (1/(n-2)) Σ_{k ∈ A\{i,j}} I_kj)
           for each i ∈ A \ {j}    (2)

where ψ_ij and P_i, for each i ∈ A \ {j}, are adaptive parameters ranging
from 0 to 1 representing a measure of reactive properties that we suppose to
be learned from the user behaviour. ψ_ij plays the role of a reducing factor,
filtering the advice of the system on the basis of the user behaviour, while
P_i measures the importance that the user gives to the local knowledge
(similarity) w.r.t. the contextual one. Note that both ψ_ij and P_i can be
estimated once the reactive properties are defined. We deal with this issue
in the next section. Thus, given an agent j, any value assignment to the
interest coefficients of all the other agents w.r.t. j must satisfy (2). The
next theorem ensures existence and uniqueness of such a value assignment, for
every value of the parameters occurring in (2).
Theorem 3.1 Given an agent j ∈ A and a set of real coefficients
{P_i, ψ_ij, S_ij | i ∈ A \ {j}}, with P_i ∈ [0,1], ψ_ij ∈ [0,1],
S_ij ∈ [0,1], P_k ≠ 0 for some k, and ψ_kj · S_kj ≠ 0 for some k, there
exists a unique (n - 1)-tuple of real values S = (I_1j, ..., I_(j-1)j,
I_(j+1)j, ..., I_nj) satisfying (2), with I_ij ∈ (0,1) for each I_ij
occurring in S.
The above result allows us to define the interest coefficient list of an
agent j as the unique solution of (2).
Definition 3.2 Given an agent j ∈ A, the interest coefficient list of j is
the unique (n - 1)-tuple of real values (I_1j, ..., I_(j-1)j, I_(j+1)j, ...,
I_nj) satisfying (2). Given an agent i ≠ j, the interest coefficient of i
w.r.t. j is the value I_ij occurring in the interest coefficient list of j.
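The paper asserts existence and uniqueness of the solution of (2) but does
not prescribe an algorithm for computing it. One natural option, sketched
below under our own assumptions (plain fixed-point iteration; nested-dict
parameters S, P, psi; the function name interest_list is ours), is to iterate
the system until it stabilizes, which converges whenever the right-hand side
is a contraction, e.g. when ψ_ij · (1 - P_i) < 1 for every i.

```python
def interest_list(j, agents, S, P, psi, iters=500):
    """Approximate the interest coefficients I_ij, i != j, of system (2):
    I_ij = psi[i][j] * (P[i]*S[i][j] + (1-P[i]) * mean_{k != i,j} I_kj),
    by repeatedly applying the system as an update map."""
    others = [i for i in agents if i != j]
    n = len(agents)
    I = {i: 0.5 for i in others}  # arbitrary starting point
    for _ in range(iters):
        I = {i: psi[i][j] * (P[i] * S[i][j]
                             + (1 - P[i])
                             * sum(I[k] for k in others if k != i) / (n - 2))
             for i in others}
    return I
```

A linear solve would work equally well; iteration is used here only to keep
the sketch dependency-free.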
Besides the interest property, from the knowledge of the interest coefficient
lists agents can exploit a second type of property. Indeed, an agent can
compare different agents on the basis of their attractiveness coefficient,
representing the component of the interest capturing only the contextual
knowledge.
Definition 3.3 Given a pair of agents i, j ∈ A, the attractiveness of j
perceived by i is the real coefficient A_ij (ranging from 0 to 1) defined as:

    A_ij = (1/(n-2)) Σ_{k ∈ A\{i,j}} I_kj,

where (I_1j, ..., I_(j-1)j, I_(j+1)j, ..., I_nj) is the interest coefficient
list of the agent j.

3.3 Choice Lists


Suppose the user of an agent i has the intention of contacting other agents
in order to establish a cooperation, and suppose the similarity between i and
every other agent is known, as well as both the interest coefficient of i
w.r.t. every other agent and the attractiveness of all the agents perceived
by i. As previously discussed, such values can be effectively computed once a
number of parameters are set (actually, they can be suitably initialized and
their updating can be learnt from the behaviour of the user, as we shall
explain in the sequel). Thus, three agent lists can be presented to the user
associated with the agent i, each associated with one of the properties
similarity, interest and attractiveness. We denote these lists L_S(i),
L_I(i), and L_A(i). L_S(i) (L_I(i), L_A(i), resp.) is the list of the n - 1
agents j (different from i) ordered by decreasing similarity (interest,
attractiveness, resp.) coefficient S_ij (I_ij, A_ij, resp.). When the user of
i chooses an agent j from the list L_S(i) (L_I(i), L_A(i), resp.), it means
that she/he perceived only the property of similarity (interest,
attractiveness, resp.) about the agent j. From the choices of the users,
useful knowledge can thus be drawn, which is potentially usable as feedback
for correcting the advice given to them. This issue is discussed in the next
section.

3.4 Reactive Properties


By reactive properties we mean properties describing the reactions of users
to the suggestions received from the system at a given time, which must be
taken into account for adapting future responses of the system. We implement
such adaptation of the system to the user behaviour by including in the
interest coefficient definition (see Section 3.2) some specific coefficients
that are automatically updated while the system runs. In this section we
describe both the role of such coefficients and the rules defining their
adaptation to the user behaviour. Recall that, given a pair of agents i and
j, for defining the interest coefficient I_ij, two parameters P_i and ψ_ij
must be set. They are real parameters ranging from 0 to 1. P_i encodes the
preference property and is called the preference coefficient of the agent i,
while ψ_ij is the product B_ij · C_ij of the benevolence coefficient B_ij and
the consent coefficient C_ij of i w.r.t. j. Given an agent i, we denote by
L_S(i), L_I(i), and L_A(i) the three choice lists presented to the user of
agent i by the system.
The Preference Property. It is described by a real coefficient ranging
from 0 to 1, denoted by P_i and called the preference coefficient. The property

measures how much, for an agent i, the similarity is more important than the
attractiveness property for defining global properties. It is easily seen
that in the definition of interest given in Section 3.2 the coefficient P_i
plays just this role. We now define how the coefficient P_i is updated.
Suppose that at a given time the user of the agent i makes a selection of
agents. Denote by SS_i (SI_i, SA_i, resp.) the set of the agents that the
user has selected from the list L_S(i) (L_I(i), L_A(i), resp.). We interpret
the behaviour of the user in the following way: the choice of an agent from a
list, say L_S(i), means that the user relies on the associated property, say
similarity. We can then interpret a choice from L_S(i) as an implicit
suggestion from the user to set the coefficient P_i to 1, and a choice from
L_A(i) as an implicit suggestion to set this value to 0. In case the user
chooses from the list L_I(i), we infer that the user accepts the current
value of the coefficient P_i. Taking into account the above observations,
the update of P_i after a selection step is defined as:

    P_i = 1/2 · ( |SS_i| / (|SS_i| + |SI_i| + |SA_i|) + P_i ).

This update is obtained by computing the average between the old value of P_i
and a new contribution corresponding to the mean of the "suggested" values
for P_i. Observe that computing the mean with the old value allows us to
keep memory of the past, avoiding drastic changes of the coefficient.
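Under one reading of the averaging rule (old value averaged with the fraction
of choices drawn from the similarity list), the update can be sketched as
follows. The function name and the treatment of an empty selection are our
assumptions, not part of the original specification.

```python
def update_preference(P_i, n_sim, n_int, n_attr):
    """Update the preference coefficient P_i from a selection step:
    n_sim, n_int, n_attr count the agents chosen from L_S(i), L_I(i)
    and L_A(i) respectively. Choices from L_S suggest P_i = 1, from
    L_A suggest 0, from L_I leave it unchanged."""
    total = n_sim + n_int + n_attr
    if total == 0:
        return P_i  # nothing selected: nothing to learn (assumption)
    return 0.5 * (n_sim / total + P_i)
```

Averaging with the old value keeps the coefficient from jumping after a
single atypical selection.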
The Benevolence Property. This property measures a sort of availability of
the agent j from which a user i requests shared knowledge. Such a property is
used in order to weight the interest of i w.r.t. j. For instance, an agent j
that recently, and for several times, has denied collaboration to i should
become of little interest for i. The parameter encoding such knowledge is
called the benevolence coefficient, denoted by B_ij, and takes real values
ranging from 0 to 1. B_ij = 0 (resp., B_ij = 1) means the agent j is
completely unavailable (resp., available) to fulfill the requests of i. The
response of j to requests of i updates the value of B_ij according to the
following rules: B_ij = min(1, B_ij + δ) if j grants the request of i, and
B_ij = max(0, B_ij - δ) if j denies the request of i, where δ is a
(reasonably small) positive real value.
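The clamped benevolence update is a one-liner; the sketch below is
illustrative only, and the default step δ = 0.05 is an arbitrary choice of
ours.

```python
def update_benevolence(B_ij, granted, delta=0.05):
    """Raise B_ij toward 1 when agent j grants the request of i,
    lower it toward 0 when j denies it, clamping to [0, 1]."""
    return min(1.0, B_ij + delta) if granted else max(0.0, B_ij - delta)
```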
The Consent Property. This property describes how much the user of an agent
i trusts the suggestions of the system regarding another agent j made on the
basis of the interest property. The coefficient associated with this property
is denoted by C_ij and is called the consent coefficient. The updating rules
defining how to adapt the coefficients C_ij after a user selection step take
into account only the portion of the selection performed on the list L_I(i).
Indeed, from this portion of the user selection, we can draw information
about the opinion of the user on the suggestions provided by the system. For
instance, if the user of i completely trusts the system's capability of
providing the best-suited agents for cooperation by providing the list
L_I(i), she/he will choose exactly the first k agents appearing in L_I(i),
where k is the size of the portion of her/his selection extracted from
L_I(i). This is not in general the case; that is, some of the k agents chosen
from L_I(i) do not occur among the first k agents of L_I(i). We defined the
updating rules by taking into account the above observations, according to
the following idea: every agent h chosen by the user from L_I(i) produces a
gain of the consent coefficient C_ih if h is a candidate proposed by the
system for selection, and produces an attenuation of C_ih otherwise. More
formally, given an agent i and a selection S_i (set of agents) extracted by
the user of i from L_I(i), for each h ∈ S_i: C_ih = min(1, C_ih + δ) if h
appears among the first |S_i| elements of L_I(i), and C_ih = max(0, C_ih - δ)
otherwise, where δ is a (reasonably small) positive real value.
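The consent rule above can be sketched as follows. This is an illustration
under our own assumptions: C_i is a dict mapping agents to their consent
coefficients, L_I is the ordered interest list, and δ defaults to 0.05.

```python
def update_consent(C_i, L_I, selection, delta=0.05):
    """For each agent h the user picked from L_I(i): reward C_ih if h
    was among the system's top |S_i| candidates, attenuate it otherwise,
    clamping to [0, 1]."""
    top = set(L_I[:len(selection)])  # the first |S_i| suggested agents
    for h in selection:
        if h in top:
            C_i[h] = min(1.0, C_i[h] + delta)
        else:
            C_i[h] = max(0.0, C_i[h] - delta)
    return C_i
```

Agents not chosen at all are left untouched, matching the rule's restriction
to the selected set S_i.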

3.5 Integration of Local Knowledge Bases


Cooperation between two agents is implemented in our model by the integration
of their LKBs. Thus, the user of an agent i who has selected an agent j from
one of the three choice lists can exploit the cooperation of j by consulting
the Integrated Knowledge Base, obtained by integrating the LKB of i with the
LKB of j. It is worth noting that in our model the integration technique is
orthogonal w.r.t. the process of selecting the most appropriate agents for
cooperation, even if it is necessary for implementing such cooperation. In
order to focus the attention of the reader on the main contribution of this
paper, consisting in the extraction of semantic properties and their
adaptation to the user behaviour, and due to space limitations, we omit a
detailed description of the integration process, which can be found in 2.

4 Discussion and Conclusion

We presented a formal framework for representing cooperation among agents


in a multi-agent environment. The model is based on the extraction of some
semantic properties capturing both local and contextual knowledge about
agents. Such properties, encoded by suitable coefficients, guide users in
selecting from the agent net the most promising candidate agents for fruitful
cooperation. User choices are exploited as feedback for adapting the
coefficients in such a way that a trade-off is obtained between similarity
and attractiveness, on the one hand, and agent congestion and user
dissatisfaction, on the other hand. As examples, consider: (i) An agent a
with high similarity and low attractiveness perceived by another agent b. The
user of b can decide to contact a less similar, but more attractive, agent c,
and this means that the current similarity does not fully satisfy b. Since b
has chosen c, it will probably make choices more similar to those of c than
to those of a, and the similarity between a and b will decrease, coherently
with the dissatisfaction of the user. (ii) An agent a with high interest and
low similarity (or low attractiveness) perceived by another agent b. The user
of b can decide to contact a less interesting, but more similar (or more
attractive), agent c. As a consequence, the interest for a perceived by b
will decrease, due to the decrease of the consent coefficient C_ba. (iii) An
agent a with high interest and high attractiveness perceived by another agent
b. The user of b knows that high attractiveness probably means a long waiting
time for obtaining answers from a, and can decide to contact a less
interesting agent c. As a consequence, the interest of b for a will decrease.

References

1. S. Abiteboul. Querying semi-structured data, in Proc. of Int. Conf. on Database Theory - ICDT'97 (LNCS, Springer, 1-18, Delphi, 1997).
2. F. Buccafurri, L. Palopoli, D. Rosaci, G.M.L. Sarné. The SPY system. TR 1/01, DIMET, Univ. Reggio Calabria (Italy, 2001).
3. C. Byrne and P. Edwards. Collaborating to refine knowledge, in Workshop on Agents that Learn from Other Agents - ML'95 (USA, 1995).
4. A. Chavez, A. Moukas and P. Maes. Challenger: a multi-agent system for distributed resource allocation, in Proc. of the 1st Int. Conf. on Autonomous Agents - Agents'97 (ACM Press, 323-331, New York, 1997).
5. Y. S. Choi and S. I. Yoo. Multi-agent web information retrieval: neural network based approach, in Proc. of the 1st Int. Symposium - IDA'99, ed. Berthold (Springer, Amsterdam, 1999).
6. J. E. Doran, S. Franklin, N. R. Jennings and T. J. Norman. On cooperation in multi-agent systems, The Knowledge Engineering Review 12, 3 (1997).
7. M. Fisher, J. Muller, M. Schroeder, G. Staniford and G. Wagner. Methodological foundations for agent-based systems, The Knowledge Engineering Review 12, 3 (1997).
8. A. Moukas and P. Maes. Amalthaea: an evolving multi-agent information filtering and discovery system for the WWW, Autonomous Agents and Multi-agent Systems 1, 1 (1998).
9. L. Palopoli, G. Terracina and D. Ursino. A graph-based approach for extracting terminological properties of elements, in Proc. of Int. Conf. on Data Engineering - ICDE'01 (Heidelberg, 2001).
10. S. Sen. A foundational principle for promoting cooperative behavior among self-interested agents, in Proc. of the 1st Int. Conf. on Multi-agent Systems (MIT Press, 1995).
ABT WITH ASYNCHRONOUS REORDERING

MARIUS-CALIN SILAGHI, DJAMILA SAM-HAROUD, AND BOI FALTINGS


Swiss Federal Institute of Technology Lausanne
1015 Ecublens, Switzerland
{Marius.Silaghi, Djamila.Haroud, Boi.Faltings}@epfl.ch

Existing Distributed Constraint Satisfaction (DisCSP) frameworks can model problems where (a) variables and/or (b) constraints are distributed among agents. Asynchronous Backtracking (ABT) is the first asynchronous complete algorithm for solving DisCSPs of type (a). The order on variables is well known as an important issue for constraint satisfaction. Previous polynomial-space asynchronous algorithms require a static order on their variables for completeness. We show how agents can asynchronously and concurrently propose reorderings in ABT while maintaining the completeness of the algorithm with polynomial space complexity.

1 Introduction

Distributed combinatorial problems can be modeled using the general framework of Distributed Constraint Satisfaction (DisCSP). A DisCSP is defined in 1 as: a set of agents, A1, ..., An, where each agent Ai controls exactly one distinct variable xi and each agent knows all constraint predicates relevant to its variable. The case with several variables per agent can be obtained quite easily from here. Asynchronous Backtracking (ABT) 1 is the first complete and asynchronous search algorithm for DisCSPs. A simple modification was mentioned in 1 to allow for a version with polynomial space complexity.
The completeness of ABT is ensured with the help of a static order imposed on agents. So far, no asynchronous search algorithm has offered the possibility to perform reordering without losing either the completeness or the polynomial space property. In this paper we describe a technique that allows the agents to asynchronously and concurrently propose changes to their order. We then prove that, using a special type of markers, the completeness of the search is ensured with polynomial space complexity.
This is the first asynchronous search algorithm that allows for asynchronous dynamic reordering while being complete and having polynomial space complexity. We have built on ABT here since it is easier to describe than its subsequent extensions; the technique can nevertheless be integrated in a straightforward manner in most extensions of ABT 2.


2 Related Work

The first complete asynchronous search algorithm for DisCSPs is Asynchronous Backtracking (ABT) 1. For simplicity, but without severe loss of generality, the approach in 1 considers that each agent maintains only one variable. More complex definitions were given later 3,4. Other definitions of DisCSPs 5,6,7 have considered the case where the interest in constraints is distributed among agents. 6 proposes versions that fit the structure of a real problem (the nurse transportation problem). The Asynchronous Aggregation Search (AAS) 7 algorithm actually extends ABT to the case where the same variable can be instantiated by several agents and an agent may not know all constraint predicates relevant to its variables. AAS offers the possibility to aggregate several branches of the search. An aggregation technique for DisCSPs was then presented in 8 and allows for a simple understanding of the privacy/efficiency mechanisms. The strong impact of variable ordering on distributed search has so far been addressed in 9,6,10.

3 Asynchronous Backtracking (ABT)

In asynchronous backtracking, the agents run concurrently and asynchronously. Each agent instantiates its variable and communicates the variable value to the relevant agents. Since here we do not assume FIFO channels, in our version a local counter, c_xi, is incremented each time a new instantiation is proposed, and its current value tags each generated assignment.
Definition 1 (Assignment) An assignment for a variable xi is a tuple (xi, v, c) where v is a value from the domain of xi and c is the tag value.
Among two assignments for the same variable, the one with the higher tag (attached value of the counter) is the newest. A static order is imposed on agents and we assume that Ai has the i-th position in this order. If i > j then Ai has a lower priority than Aj and Aj has a higher priority than Ai.
Rule 1 (Constraint-Evaluating-Agent) Each constraint C is evaluated by the lowest priority agent whose variable is involved in C.
Each agent holds a list of outgoing links represented by a set of agents. Links are associated with constraints. ABT assumes that every link is directed from the value-sending agent to the constraint-evaluating-agent.
Definition 2 (Agent view) The agent_view of an agent, Ai, is a set containing the newest assignments received by Ai for distinct variables.
Based on their constraints, the agents perform inferences concerning the assignments in their agent_view. By inference the agents generate new constraints called nogoods.
Definition 3 (Nogood) A nogood has the form ¬N where N is a set of assignments for distinct variables.
The following types of messages are exchanged in ABT: ok?, nogood, and add-link. An ok? message transports an assignment and is sent to a constraint-evaluating-agent to ask whether a chosen value is acceptable. Each nogood message transports a nogood. It is sent from the agent that infers a nogood ¬N to the constraint-evaluating-agent for ¬N. An add-link message announces Ai that the sender Aj owns constraints involving xi. Ai inserts Aj in its outgoing links and answers with an ok? message.
The agents start by instantiating their variables concurrently and send ok? messages to announce their assignments to all agents with lower priority in their outgoing links. The agents answer received messages according to Algorithm 1 (except for the pseudo-code delimited by '*') 4.
Definition 4 (Valid assignment) An assignment (x, v1, c1) known by an agent Ai is valid for Ai as long as no assignment (x, v2, c2), c2 > c1, is received.
A nogood is invalid if it contains invalid assignments. The next property is mentioned in 1 and is also implied by Theorem 1, presented later.
Property 1 If only one nogood is stored per value then ABT has polynomial space complexity in each agent, O(dn), while maintaining its completeness and termination properties; d is the domain size and n is the number of agents.

4 Histories

Now we introduce a marking technique that allows for the definition of a total order among the proposals made concurrently and asynchronously by a set of ordered agents on a shared resource (e.g. an order).
Definition 5 A proposal source for a resource R is an entity (e.g. an abstract agent) that can make specific proposals concerning the allocation (or valuation) of R.
We consider that an order ≺ is defined on proposal sources. The proposal sources with lower position according to ≺ have a higher priority. The proposal source for R with position k is denoted P_k^R, k ≥ k_0, where k_0 is the first position.
Definition 6 A conflict resource is a resource for which several agents can make proposals in a concurrent and asynchronous manner.
Each proposal source P_k^R maintains a counter C_k^R for the conflict resource R. The markers involved in our marking technique for ordered proposal sources are called histories.

Figure 1. Simple scenarios with messages for proposals on a resource, x.

Definition 7 A history is a chain h of pairs |a:b| that can be associated to a proposal for R. A pair p = |a:b| in h signals that a proposal for R was made by P_a^R when its C_a^R had the value b, and it knew the prefix of p in h.
An order ∝ (read "precedes") is defined on pairs such that |i1:l1| ∝ |i2:l2| if either i1 < i2, or i1 = i2 and l1 > l2.
Definition 8 A history h1 is newer than a history h2 if a lexicographic comparison on them, using the order ∝ on pairs, decides that h1 precedes h2.
P_k^R builds a history for a new proposal on R by prefixing to the pair |k:value(C_k^R)| the newest history that it knows for a proposal on R made by any P_a^R, a < k. The C_k^R in P_k^R is reset each time an incoming message announces a proposal with a newer history, made by higher priority proposal sources on R. C_k^R is incremented each time P_k^R makes a proposal for R.
Definition 9 A history h1 built by P_i^R for a proposal is valid for an agent A if no other history h2 (eventually known only as a prefix of a history h'2) is known by A such that h2 is newer than h1 and was generated by some P_j^R, j ≤ i.
For example, in Figure 1 the agent P_3^x may get messages concerning the same resource x from P_1^x and P_2^x. In Figure 1a, if the agent P_3^x has already received m1, it will always discard m3 since the proposal source index has priority. However, in the case of Figure 1b the message m2 is the newest only if k12 < k11 and is valid only if k12 ≤ k11. In each message, the length of the history for a resource is upper bounded by the number of proposal sources for the conflict resource.
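The ∝ order on pairs and the lexicographic newness test of Definitions 7 and 8 can be sketched as follows. Histories are lists of (source, counter) pairs; treating a strict prefix as newer (a fresh proposal from a higher-priority source resets the counters below it) is our reading of the scanned text, so it is an assumption.

```python
def pair_precedes(p, q):
    """|i1:l1| precedes |i2:l2| iff i1 < i2, or i1 == i2 and l1 > l2."""
    (i1, l1), (i2, l2) = p, q
    return i1 < i2 or (i1 == i2 and l1 > l2)

def newer(h1, h2):
    """True if history h1 is newer than h2: lexicographic comparison
    using pair_precedes at the first position where the pairs differ."""
    for p, q in zip(h1, h2):
        if p != q:
            return pair_precedes(p, q)
    # assumption: a strict prefix counts as newer than its extensions
    return len(h1) < len(h2)
```

A lower source index always wins, and among proposals from the same source the higher counter wins, which matches the counter-reset discipline in the text.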

5 Reordering

Now we show how the histories described in the previous section offer, during the search, a means of allowing agents to asynchronously and concurrently propose new orders on themselves. In the next subsection we describe a didactic, simplified version that needs additional specialized agents.

5.1 Reordering with dedicated agents


Besides the agents A1, ..., An in the DisCSP we want to solve, we consider that there exist n−1 other agents, R^0, ..., R^{n−2}, that are solely devoted to reordering the agents Ai.
Definition 10 An ordering is a sequence of distinct agents A_{k1}, ..., A_{kn}.
An agent Al may receive the position j, j ≠ l. Let us assume that the agent Ai, knowing an ordering o, believes that the agent Al, owning the variable xl, has the position j. Ai can refer to Al as either Al, Al(o) or A^j. The variable xl is also referred to by Ai as either xl, x^j(o) or x^j.
We propose to consider the ordering on agents as a conflict resource. We attach to each ordering a history as defined in the previous section. The proposal sources for the ordering on agents are the agents R^i, where R^i ≺ R^j if i < j, and k_0^order = 0. R^i is the proposal source that, when knowing an ordering o, can propose orderings that reorder only agents on positions p, p > i.
Definition 11 (Known order) An ordering known by R^i (respectively A^i) is the order o with the newest history among those proposed by the agents R^k, 0 ≤ k ≤ i, and received by R^i (respectively A^i). A^i has the position i in o. This order is referred to as the known order of R^i (respectively A^i).
Definition 12 (Proposed order) An ordering o proposed by R^i is such that the agents placed on the first i positions in the known order of R^i must have the same positions in o. o is referred to as the proposed order of R^i.
Let us consider two different orderings, O1 and O2, with their corresponding histories: O1 = (o1, h1), O2 = (o2, h2), such that |h1| ≤ |h2|. Let p_k^1 = |a_k^1 : b_k^1| and p_k^2 = |a_k^2 : b_k^2| be the pairs on the position k in h1, respectively in h2.
Definition 13 (Reorder position) Let u be the lowest position such that p_u^1 and p_u^2 are different and let v = |h1|. The reorder position of h1 and h2 is either min(a_u^1, a_u^2) + 1 if u ≤ v, or a_{v+1}^2 + 1 otherwise. This is the position of the highest priority reordered agent between h1 and h2.
New optional messages for reordering are: heuristic messages for heuristic-dependent data, and reorder messages announcing a new ordering, (o, h). An agent R^i announces its proposed order o by sending reorder messages to all agents A^k(o), k > i, and to all agents R^k, k > i. Each agent A^i and each agent R^i has to store a set of orderings denoted Cord. Cord contains the ordering with the newest history that was received from each R^j, j ≤ i (if that history is valid).^a By the history of Cord we refer to the newest history in Cord. For allowing asynchronous reordering, each ok? and nogood message receives

^a Typically Cord is completely described by the ordering with the newest received history.

as an additional parameter an order and its history (see Algorithm 1). The ok? messages hold the newest known order of the sender. The nogood messages hold the order in the Cord at the sender A^j that A^j believes to be the newest known order of the receiver, A^i. This ordering consists of the first i agents in the newest ordering known by A^j and is tagged with a history obtained from the history of its Cord by removing all the pairs |a:b| where a > i.^b
When a message is received which contains an order with a history h that is newer than the history h* of Cord, let the reorder position of h and h* be l^r. The assignments for the variables x^k, k > l^r, are invalidated.^c
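Under our reading of the partly garbled Definition 13 (|h1| ≤ |h2|, positions 1-based, and the min(...) case applying when the first differing pair lies within h1), the reorder position can be sketched as:

```python
def reorder_position(h1, h2):
    """Reorder position of two histories (Definition 13, our reading).

    Pairs are (source_index, counter) with 0-based source indices; the
    returned position is 1-based, as in the text.  Assumes len(h1) <=
    len(h2) and that the histories actually differ.
    """
    assert len(h1) <= len(h2)
    for p1, p2 in zip(h1, h2):
        if p1 != p2:                       # first differing pair within h1
            return min(p1[0], p2[0]) + 1
    return h2[len(h1)][0] + 1              # h1 is a strict prefix of h2
```

Everything from this position downward is what the receiving agent must invalidate when the newer order arrives.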
The agents R^i modify the ordering in a random manner or according to special strategies appropriate for a given problem.^d Sometimes it is possible to assume that the agents want to collaborate in order to decide an ordering.^e The heuristic messages are intended to offer data for reordering proposals; their parameters depend on the reordering heuristic used. The heuristic messages can be sent by any agent to the agents R^k. heuristic messages may only be sent by an agent to R^k within a bounded time, t_h, after having received a new assignment for x^j, j < k. Agents can only send heuristic messages to R^0 within time t_h after the start of the search. Any reorder message is sent within a bounded time t_r after a heuristic message is received (or after the start).
Besides C_k^order and Cord, the other structures that have to be maintained by R^k, as well as the content of heuristic messages, depend on the reordering heuristic. The space complexity for A^k remains the same as with ABT.

5.2 ABT with Asynchronous Reordering (ABTR)


In fact, we have introduced the physical agents R^i in the previous subsection only in order to simplify the description of the algorithm. Any of the agents Ai or any other entity can be delegated to act for any R^i. When proposing a new order, R^i can also simultaneously delegate the identity of R^{i+1}, ..., R^{n−2} to other entities,^f P_k, by attaching a sequence R^{i+1}→P_{k1}, ..., R^{n−2}→P_{kj} to the ordering. At a certain moment, due to message delays, there can be several entities believing that they are delegated to act for R^i based on the ordering they know. However, any other agent can coherently discriminate among

^b The agents absent from the ordering in a nogood are typically not needed by A^i. A^i receives them when it receives the corresponding reorder message.
^c Alternative rule: A^i can keep valid the assignments of new variables x^k, i > k > l^r, but broadcasts x^i again.
^d e.g. first the agents forming a coalition with R^i.
^e This can aim to improve the efficiency of the search. Since ABT performs forward checking, it may be possible to design useful heuristics.
^f In 11 we explain how R^i can redelegate itself.

when received (ok?, (xj, dj, c_xj *, (o,h)*)) do
  *if (¬getOrder((o,h)) or old c_xj) return*; //ABTR
  add (xj, dj, c_xj) to agent_view; check_agent_view;
end do.

when received (nogood, Aj, nogood *, (o,h)*) do
  *if (¬getOrder((o,h))) return*; //ABTR
  *discard nogood if it contains invalid assignments, else*; //ABTR
  when (xk, dk, ck), where xk is not connected, is contained in nogood
    send add-link to Ak; add (xk, dk, ck) to agent_view;
  add nogood to nogood-list; add other new assignments to agent_view;
  old-value <- current-value; check_agent_view;
  when old-value = current-value
    send (ok?, (xi, current-value, c_xi), known_order(Ai)) to Aj;
end do.

procedure check_agent_view do
  when agent_view and current-value are not consistent
    if no value in Di is consistent with agent_view then
      backtrack;
    else
      select d in Di where agent_view and d are consistent;
      current-value <- d; c_xi++; O <- known_order(Ai);
      send (ok?, (xi, d, c_xi), O) to lower priority agents in outgoing links;
    end
end do.

procedure backtrack do
  nogoods <- {V | V = inconsistent subset of agent_view};
  when an empty set is an element of nogoods
    broadcast that there is no solution, and terminate this algorithm;
  for every V in nogoods:
    select (xj, dj, c_xj) where xj has the lowest priority in V;
    send (nogood, Ai, V, Oj) to Aj; remove (xj, dj, c_xj) from agent_view;
  check_agent_view;
end do.

function getOrder((o,h)) -> bool //ABTR
  when h is invalidated by the history of Cord then return false;
  when h is not newer than Cord then add (o,h) to Cord; return true;
  l <- reorder position for h and the history of Cord;
  invalidate assignments for x^j, j > l (alternative^c); add (o,h) to Cord; return true;
end.

Algorithm 1: Procedures for Receiving Messages in ABT and ABTR.

A1/A1/R1 --ok?(x1,1,1)(A1,A2,A3)--> A3
A2/A2/R0/R2 --ok?(x2,2,1)(A1,A2,A3)--> A3
A3/A3/R3 --nogood ¬((x1,1,1)(x2,2,1))(A1,A2)--> A2
A1/R1/A1 --reorder (A1,A3,A2)|1:1|--> A3
A2/R0/A3/R2 --reorder (A3,A1,A2)|0:1|--> A3
A1/R1/A1 --reorder (A1,A3,A2)|1:1|--> A2
A2/R0/A3/R2 --reorder (A3,A1,A2)|0:1|--> A1
A3/A1/R1 --ok?(x3,1,2)(A3,A1,A2)|0:1|--> A1
A3/A1/R1 --ok?(x3,1,2)(A3,A1,A2)|0:1|--> A2

Figure 2. Simplified example for ABTR with random reordering. R^i delegations are done implicitly by adopting the convention "A^i is delegated to act for R^i". Left column: Ai/A^j/R^{j1}/R^{j2}... shows the roles played by Ai when the message is sent, highlighting the capacity in which the agent Ai sends the message. The add-link message is not shown.

messages from simultaneous R^i's using the histories that the R^i's generate. The R^i's themselves coherently agree when the corresponding orders are received. The delegation of R^i, i > 0, from one physical entity to another poses no problem of information transfer since the counter C_i^order of R^i is reset on this event.
For simplicity, in the example in Figure 2 we describe the case where the activity of R^i is always performed by the agent believing itself to be A^i. R^i can send a reorder message within time t_r after an assignment is made by A^i since a heuristic message is implicitly transmitted from A^i to R^i. We also consider that A2 is delegated to act as R^0. R^0 and R^1 propose one random ordering each, asynchronously. The receivers discriminate, based on histories, that the order from R^0 is the newest. The known assignments and nogood are discarded. In the end, the known order for A3 is (A3, A1, A2)|0:1|.
By quiescence of a group of agents we mean that none of them will receive or generate any valid nogoods, new valid assignments, reorder messages or add-link messages.
Property 2 In finite time t^i either a solution or failure is detected, or all the agents A^j, 0 ≤ j ≤ i, reach quiescence in a state where they are not refused an assignment satisfying the constraints that they enforce and their agent_view.
Proof. Let all agents A^k, k < i, reach quiescence before time t^{i−1}. Let τ be the maximum time needed to deliver a message.
There exists t_p^i < t^{i−1} after which no ok? is sent from A^k, k < i. Therefore, no heuristic message towards any R^u, u < i, is sent after t_h^i = t_p^i + τ + t_h. Then, each R^u becomes fixed, receives no message, and announces its last order before a time t_r^i = t_h^i + τ + t_r. After t_r^i + τ the identity of A^i is fixed as Al. Al receives the last new assignment or order at time t_o^i ≤ t_r^i + τ.
Since the domains are finite, after t_o^i, Al can propose only a finite number of different assignments satisfying its view. Once any assignment is sent at time t_a^i > t_o^i, it will be abandoned when the first valid nogood is received (if one is received in finite time). All the nogoods received after t_a^i + nτ are valid since all the agents learn the last instantiations of the agents A^k, k < i, before t_a^i + nτ − τ. Therefore the number of possible incoming invalid nogoods for an assignment of A^i is finite.
1. If one of the proposals is not refused by incoming nogoods, and since the number of such nogoods is finite, the induction step is correct.
2. If all proposals that A^i can make after t_o^i are refused, or if it cannot find any proposal, A^i has to send a valid explicit nogood ¬N to somebody. ¬N is valid since all the assignments of A^k, k < i, were received at A^i before t_o^i.
2.a) If N is empty, failure is detected and the induction step is proved.
2.b) Otherwise ¬N is sent to a predecessor A^j, j < i. Since ¬N is valid, the proposal of A^j is refused, but due to the premise of the induction step, A^j either
2.b.i) finds an assignment and sends ok? messages, or
2.b.ii) announces failure by computing an empty nogood (induction step proved).
In the case (i), since ¬N was generated by A^i, A^i is interested in all its variables (it has once sent an add-link to A^j), and it will be announced by A^j of the modification by an ok? message. This contradicts the assumption that the last ok? message was received by A^i at time t_o^i, and the induction step is proved.
From here, the induction step is proved since it was proved for all alternatives. In conclusion, after t_o^i, within finite time, the agent A^i either finds a solution and quiescence, or an empty nogood signals failure.
R^0 is always fixed (or after t_r in the version in 11) and the property is true for the empty set. The property is therefore proven by induction on i. □

Theorem 1 ABTR is correct, complete and terminates.
Proof. Completeness: All the nogoods are generated by logical inference from existing constraints. Therefore, if a solution exists, no empty nogood can be generated.
No infinite loop: This is a consequence of Property 2 for i = n.
Correctness: All assignments are sent to all interested agents and stored there. At quiescence all the agents know the valid interesting assignments of all predecessors. If quiescence is reached without detecting an empty nogood, then according to Property 2 all the agents agree with their predecessors and the set of their assignments is a solution. □

6 Conclusions

Reordering is a major issue in constraint satisfaction. All previous complete polynomial-space asynchronous search algorithms for DisCSPs require a static order on the variables. We have presented an algorithm that allows for asynchronous reordering in ABT. This is the first asynchronous complete algorithm with polynomial space requirements that has the ability to concurrently and asynchronously reorder variables during search. Here we described a random reordering heuristic that can be useful for special purposes (coalitions, special strategies). However, the algorithm offers a flexible mechanism (general-purpose heuristic messages) that allows for implementing most other heuristics believed to be useful for general or specific applications. Alternative implementations, alternatives to using histories, how to save effort across reorderings, and efficient heuristics are described in 11.

References

1. M. Yokoo, E. H. Durfee, T. Ishida, and K. Kuwabara. The distributed CSP: Formalization and algorithms. IEEE TKDE, 10(5):673-685, 1998.
2. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. Asynchronous consistency maintenance with reordering. TR #01/360, EPFL, March 2001.
3. M. Yokoo and K. Hirayama. Distributed constraint satisfaction algorithm for complex local problems. In ICMAS'98, pages 372-379, 1998.
4. M. Yokoo. Distributed Constraint Satisfaction. Springer, 2001.
5. Y. Zhang and A. K. Mackworth. Parallel and distributed algorithms for finite CSPs. In Proc. of Symp. on PDP, pages 394-397, 1991.
6. G. Solotorevsky, E. Gudes, and A. Meisels. Distributed CSPs - a model and application. http://www.cs.bgu.ac.il/~am/papers.html, Oct 1997.
7. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. Asynchronous search with aggregations. In Proc. of AAAI 2000, pages 917-922, 2000.
8. P. Meseguer and M. A. Jimenez. Distributed forward checking. In Proc. of DCS, CP'00, 2000.
9. M. Yokoo. Asynchronous weak-commitment search for solving large-scale distributed CSPs. In ICMAS, pages 467-318, 1995.
10. Y. Hamadi and C. Bessiere. Backtracking in distributed constraint networks. In ECAI'98, pages 219-223, 1998.
11. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. Hybridizing ABT and AWC into a polynomial-space, complete protocol with reordering. TR #01/364, EPFL, May 2001.
SOCIAL RATIONALITY AND COOPERATION

GUIDO BOELLA
Dipartimento di Informatica - Universita di Torino
C.so Svizzera 185, 10149 Torino, Italy - email: guido@di.unito.it

The definitions of cooperation on shared plans and of joint intentionality have traditionally included subsidiary goals: they aim at coordinating the part of the group's action which goes beyond the control of individual intentionality. In this paper, we present a definition of collective acting which explains goals aiming at the group's coordination as a result of the interaction of goal adoption, group utility and recursive modeling of BDI agents.

1 Introduction

What is involved when a group of agents decides to do something together? Cohen and Levesque 1 answered this question by noting that if the group's members were directed towards a shared plan by single-agent intentions only, they would risk letting the group act in an irrational way; in fact, a member could realize that her task is impossible to achieve or that the shared goal has been satisfied: she would leave the group, while the remaining members could go on trying to pursue an impossible or already achieved goal.
For this reason, Cohen and Levesque 1 add to the definition of joint intention some subsidiary goals which must be satisfied just in case an agent believes that the joint intention has been satisfied, etc.: in all these cases, she should drop her main intention, but pursue the goal of making the remaining agents aware of the new situation.
We present an operational definition of joint intentionality based on the idea that subsidiary goals, which come into play at the end of the group activity, can be explained as a consequence of the interplay of three factors: the group's utility, goal adoption and anticipatory coordination.
First of all, the rationality of the group's action is defined in terms of the benefit of the entire group and not only of the advantage of the single members. If the group's members followed only their individual intentions, they would achieve a better individual performance, since they would stop their action when it becomes just a waste of time and resources; but, in this way, they would leave the remaining agents working without any real achievement, thus reducing the group's overall performance.
Second, making the remaining agents aware that some new situation holds, as prescribed by the definition of joint intention, is an instance of a more

general behavior which Conte and Castelfranchi 2 call goal adoption. The goal of making the partners believe that the joint goal has been achieved or is impossible to achieve can be motivated by a similar attitude of agents: the agent is adopting a control goal of the partners, i.e., a goal which arises during the intentional execution of an action. Goal adoption, per se, does not imply doing anything for another agent's goals. An adopted goal is given as input for means-ends reasoning, but it still must undergo the deliberation process of the agent. It is selected as the actual intention only if the agent gains from its satisfaction the maximum advantage with respect to the other alternatives. We measure the advantage an agent gains in terms of the decision-theoretic concept of utility. In case of collective action - as stated above - the utility that must be considered is a combination of the private utility of an agent with that of her partners in the shared plan.
Finally, if agents interact (in a cooperative or conflictual way) in a resource-bounded environment, when they have to measure the utility of their actions, they cannot but take into account the effect of their actions on the other agents. In particular, they have to compute the utility not of the outcome of their actions, but the utility of the outcomes produced by the predicted reactions of their interactants. In case of cooperation among agents, this means that an action must be chosen only after the agent has predicted what her partners can (or cannot) do afterwards and has computed the utility of the resulting situations for the entire group (anticipatory coordination, another brick of social rationality, according to Conte and Castelfranchi 2).
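A one-level sketch of this anticipatory evaluation: an action is scored not by its own outcome but by the group utility of the state reached after the partner's predicted reaction, with the partner modeled as maximizing the same shared utility. All names and the dict-based state encoding are illustrative assumptions; DRIPS itself works with probabilistic outcomes, which we omit.

```python
def choose_with_anticipation(outcomes, partner_reactions, group_utility):
    """Pick the action whose outcome, after the partner's predicted
    best reaction, yields the highest group utility.

    outcomes: dict mapping each candidate action to the state it reaches.
    partner_reactions(state): states the partner can produce from `state`
    (empty/None means the partner does nothing, so the state stands).
    """
    best_action, best_value = None, float('-inf')
    for action, state in outcomes.items():
        reactions = partner_reactions(state) or [state]
        value = max(group_utility(s) for s in reactions)  # partner maximizes too
        if value > best_value:
            best_action, best_value = action, value
    return best_action
```

For instance, an action that is individually worse can win if it enables a partner reaction with a much better group outcome, which is exactly the point of anticipatory coordination.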

2 The Definition of Cooperation and the Planning Algorithm

We assume that an agent has a set of preferences and goals and that she does planning in order to find a (possibly partial) plan which satisfies one or more of these goals and maximizes the agent's utility. The chosen plan constitutes the current (individual) intention of the agent. Then, the plan is executed in a reactive manner, i.e., monitoring effects and triggering replanning in case of failure or new information.
Since a decision must be taken about which plan to choose, we need some technique to balance the different possibilities: we adopted the decision-theoretic planner DRIPS described in Haddawy and Hanks 3, where the notions of goals and planning are related to that of utility.
In our definition of cooperation, a group GR composed of agents G1, ..., Gn cooperates on a shared plan α^x for achieving goal φ (with an associated recipe R^x composed of steps β1^x, ..., βm^x), when:

1. each step β^x_i has been assigned to an agent G_k in GR for its execution;
2. each agent G_k of the group GR has the single-agent intention to perform
his part β^x_k, an intention relative (in Cohen and Levesque 4 's sense) to
the existence of the group shared plan;
3. all the agents of GR have the mutual belief that each one (G_k) is engaged
in cooperating to φ with GR by means of recipe R^x;
4. all the agents mutually know that they share a utility function f_GR
based on a weighted sum of the utility of the goal φ, which the shared
plan aims at, and of the resource consumption of the single agents;
5. when an agent G_k becomes aware that a partner G_j has a goal ψ that
stems from his intention to do his part β^x_j, G_k will consider whether to
adopt it;
6. each agent G_k remains in the group so long as the group's expected utility
of going on in performing β^x_k for φ or adopting some of the goals of the
partners is greater than the expected utility of doing nothing more for
the group.

Concerning point 5, the goals that are adopted by an agent G_k are
the subgoals ψ which G_j has formed while planning how to perform his part
and while executing them in a reactive way. Therefore, G_k considers not only
the steps of β^x_j she may execute to assist G_j in performing his task, but also
G_j's goals deriving from his single-agent intention to perform β^x_j: knowing
how to perform β^x_j, achieving its preconditions, monitoring the execution.
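The weighted-sum group utility of point 4 can be sketched as follows. This is our own minimal rendering, not code from the paper; the names, the linear form, and all weights are illustrative assumptions. Per-agent weights also capture the "hierarchical groups" idea of Section 3, where some members' resource consumption counts more than others'.

```python
def group_utility(goal_achieved, consumption, goal_weight=10.0,
                  agent_weights=None):
    """Sketch of f_GR: a weighted sum of the utility of the shared goal
    and the (negated) resource consumption of the single agents.

    `consumption` maps each agent's name to the resources it spent;
    `agent_weights` lets some agents' effort weigh more than others'.
    """
    if agent_weights is None:
        agent_weights = {agent: 1.0 for agent in consumption}
    goal_utility = goal_weight if goal_achieved else 0.0
    cost = sum(agent_weights[a] * c for a, c in consumption.items())
    return goal_utility - cost

# Achieving the goal at a total cost of 5 yields a group utility of 5:
print(group_utility(True, {"Ga": 3.0, "Gb": 2.0}))   # 5.0
```

With unequal weights, the same consumption by a "senior" agent costs the group more, inducing the hierarchy described later in the paper.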
The above definition is not by itself sufficient, since it does not explicitly
address the problem of anticipatory coordination. In order to implement it,
we had to revise the evaluation heuristics for action selection in the
DRIPS planner: before the evaluation of an action outcome is carried out,
G_a's outcome is updated with the effects of the partner G_b's reaction,
which the agent tries to predict via a recursive modeling of the planning and
decision-making activity of her partner about his part of the shared plan (see
Boella 5 for the details).
Second, G_b's beliefs must be constructed from the outcomes of each of G_a's
alternatives R^y_i. In many situations, in fact, G_b is not aware of all the effects of
G_a's action. In this proposal, a STRIPS-like solution to this subjective form
of the frame problem is adopted: G_b's knowledge of a state is updated by an
action of G_a only with the effects which are explicitly mentioned as believed
by G_b in the action definition.^a Since not all states are distinguishable from
G_b's point of view, we exploit the game-theoretic notion of information set.

^a See Isozaki et al. 6 for a more complex methodology for reasoning on changes in the beliefs
of other agents.

The evaluation of G_a's alternative R^y_i with outcome S'_i, in the light
of anticipatory coordination, is made in the following way (β^x_b is the task
assigned to G_b in the shared plan, f_GR is the group's utility function):

1. From S'_i (the result of recipe R^y_i), form the corresponding information
sets from G_b's point of view: S^b_i = {S^b_{i,1}, ..., S^b_{i,r}};

2. On each state S^b_{i,j} in S^b_i: (a) restart the planning process from the
perspective of partner G_b and try to solve his current task β^x_b. The result
is a set of candidate recipes {R^{u_1}, ..., R^{u_v}}.
(b) For each l (1 ≤ l ≤ v), by means of the recipe R^{u_l} expand the state
S^b_{i,j}, obtaining again a set of (probability, state) pairs S^b_{i,j,l} =
{(p^b_{i,j,l,1}, S^b_{i,j,l,1}), ..., (p^b_{i,j,l,t}, S^b_{i,j,l,t})}.
(c) For each l (1 ≤ l ≤ v), the group's utility function f_GR is applied
to these sets of (probability, state) pairs, and the plan R^{best_{i,j}} which
maximizes the following formula is the one selected by agent G_b for execution
in S^b_{i,j} (its outcome is S^b_{i,j,best_{i,j}} = {(p^b_{i,j,best_{i,j},1},
S^b_{i,j,best_{i,j},1}), ..., (p^b_{i,j,best_{i,j},t}, S^b_{i,j,best_{i,j},t})}):

   Σ_{S^b_{i,j,l,z} ∈ S^b_{i,j,l}} p^b_{i,j,l,z} · f_GR(S^b_{i,j,l,z})

3. Expand each state S'_{i,j} in S'_i with the recipe R^{best_{i,e}}, where S_{i,e} is the
equivalence class in S_i to which S'_{i,j} belongs; for each j, the result is a set
of (probability, state) pairs: S''_{i,j} = {(p''_{i,j,1}, S''_{i,j,1}), ..., (p''_{i,j,w}, S''_{i,j,w})}.

4. Given the n initial states S'_{i,k} in S'_i, the probability of each state S''_{i,k,x} is
p'_{i,k} · p''_{i,x} (the latter depends on the probability distribution of the effects
of R^{best_i}). Consequently, the expected utility of the initial states S'_i is:

   Σ_k Σ_x p'_{i,k} · p''_{i,x} · f_GR(S''_{i,k,x})
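The evaluation steps above can be sketched in code. This is a hypothetical rendering of the loop, not the actual DRIPS implementation (for which see Boella 5); recipes are modelled as functions from a state to a list of (probability, state) pairs, and all names are invented for illustration:

```python
def best_response(state, candidate_recipes, f_gr):
    """Step 2: from G_b's perspective, pick the recipe maximizing
    expected group utility, sum_z p_z * f_GR(S_z)."""
    def expected(recipe):
        return sum(p * f_gr(s) for p, s in recipe(state))
    return max(candidate_recipes, key=expected)

def evaluate_alternative(outcomes, observe, candidate_recipes, f_gr):
    """Steps 1, 3, 4: `outcomes` are (p'_k, S'_k) pairs produced by
    G_a's recipe; `observe` projects a state onto what G_b can
    distinguish (his information set)."""
    total = 0.0
    for p_a, state in outcomes:
        # G_b reacts on the basis of what he observes of the state ...
        reaction = best_response(observe(state), candidate_recipes, f_gr)
        # ... and the real state is expanded with his chosen recipe,
        # accumulating p'_k * p''_x * f_GR(S''_{k,x}).
        total += sum(p_a * p_b * f_gr(s2) for p_b, s2 in reaction(state))
    return total

# Toy example: states are numbers, utility is the state value itself.
f_gr = lambda s: s
safe = lambda s: [(1.0, s + 1)]                  # sure gain of 1
risky = lambda s: [(0.5, s + 4), (0.5, s - 4)]   # expected gain of 0
observe = lambda s: s                            # full observability
```

On this toy model, the predicted partner always chooses `safe` (expected utility s + 1 versus s), so G_a evaluates her alternative under that anticipated reaction.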

3 Cooperation phenomena explained

Helpful behavior: helpful behavior (i.e., goal adoption) is provided not only
when the group cannot otherwise proceed with its plan; moreover, it should not
be provided indiscriminately. By exploiting the decision theoretic paradigm,
it is possible to separate the forms of help that are rational from those that
are just a waste of resources. If the effort of satisfying adopted goals
conflicts with the agent's ability to do her own part, then a decrease in
the group utility is obtained, instead of an increase.
Communication: As a special case of goal adoption, it is possible to predict
various forms of communication: they arise when the goal to be adopted is a
control goal, that is, a goal of G_b to know some information, like the truth value
of constraints and preconditions of actions, and whether an action succeeded
or not. The effect of a communicative action is to make such a goal true
(at least with some probability). Communication is not always worth its
cost for the group. The agent G_a has to consider the cost of communicating
with G_b. If communication is expensive, slow or unreliable, then it is not
convenient for the group to waste resources in communicating, since
G_b could discover the desired information in a less expensive manner.
Hierarchical groups: the consumption of resources need not be weighted in
a uniform way for all members of the group; a sort of hierarchy in the group
can be induced by weighting - in the multi-attribute utility function - resource
consumption differently depending on the agent who executes an action.
Conflict avoidance: since agents share a group utility function and perform
anticipatory coordination, they will (try to) avoid conflicts with other agents'
intentions: performing an action that interferes with the plans of other team
members decreases the utility of the whole team.
When G_a considers the possible developments of her partial plan, she examines
what effects her action will have on the partners' plans. Possible
interferences are thus weighed like any other cost that decreases the group utility:
conflicts result in less preferred choices, but they are not necessarily ruled out.
Ending cooperation: when all members know that the top-level goal of the
group has been achieved, or has become impossible or irrelevant, then no more
utility can be obtained by any action other than terminating the group: in
fact, termination gets higher utility by saving resources. Therefore, the shared
plan is naturally dropped, without the need to stipulate other explicit
conditions for its termination.
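The membership condition (point 6 of the definition in Section 2) can be sketched as a simple expected-utility comparison; the numeric model below is an invented illustration, not the paper's formalisation:

```python
def continuation_utility(goal_done, step_cost, goal_value):
    """If the top-level goal is already achieved (or impossible),
    further group actions only consume resources; otherwise they
    buy a chance at the goal's value."""
    return -step_cost if goal_done else goal_value - step_cost

def should_stay(goal_done, step_cost, goal_value, eu_stop=0.0):
    """G_k remains in the group only while the group's expected
    utility of going on exceeds that of doing nothing more."""
    return continuation_utility(goal_done, step_cost, goal_value) > eu_stop

# While the goal is open, continuing is worth it; once achieved,
# continuing only wastes resources and the shared plan is dropped.
print(should_stay(goal_done=False, step_cost=1.0, goal_value=5.0))  # True
print(should_stay(goal_done=True, step_cost=1.0, goal_value=5.0))   # False
```

This is the sense in which termination needs no extra stipulation: it simply becomes the utility-maximizing option.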

References

1. P. R. Cohen and H. J. Levesque. Teamwork. Nous, 25:487-512, 1991.


2. R. Conte and C. Castelfranchi. Cognitive and Social Action. UCL Press,
1995.
3. P. Haddawy and S. Hanks. Utility models for goal-directed, decision-
theoretic planners. Computational Intelligence, 14:392-429, 1998.
4. P.R. Cohen and H.J. Levesque. Intention is choice with commitment.
Artificial Intelligence, 42:213-261, 1990.
5. G. Boella. Cooperation among economically rational agents. PhD thesis,
Università di Torino, 2000.
6. H. Isozaki and H. Katsuno. Observability-based nested belief computation
for multiagent systems. In N.R. Jennings and Y. Lesperance, editors,
Intelligent Agents VI (ATAL-99). Springer-Verlag, 2000.
BELIEF REVISION IN TYPE THEORY

TIJN BORGHUIS† AND FAIROUZ KAMAREDDINE‡ AND ROB NEDERPELT†

†Mathematics and Computing Science, Eindhoven Univ. of Technology, P.O. Box
513, 5600 MB Eindhoven, the Netherlands
E-mail: {v.a.j.borghuis,wsinrpn}@win.tue.nl
‡Computing and Electrical Engineering, Heriot-Watt Univ., Riccarton, Edinburgh
EH14 4AS, Scotland. E-mail: fairouz@cee.hw.ac.uk

This paper explores belief revision for belief states in which an agent's beliefs as
well as his justifications for these beliefs are explicitly represented in the context
of type theory. This allows for a deductive perspective on belief revision which can
be implemented using existing machinery for deductive reasoning.

1 Introduction
An agent who keeps expanding his belief state with new information may
reach a stage where his beliefs have become inconsistent, and his belief state
has to be adapted to regain consistency. In studying this problem of "belief
revision", the justifications an agent has for his beliefs are not usually consid-
ered as first-class citizens. The two main approaches for dealing with belief
revision (foundation and coherence theories 5 ) represent justifications of beliefs
implicitly (e.g. as relations between beliefs in foundations theory) rather than
as objects in their own right which are explicitly represented in the formali-
sation of belief states and belief change operations. In this paper, we explore
belief revision for belief states in which justifications are first-class citizens.
Our motivation for investigating belief revision along these lines stems
from working on knowledge representation in type theory 2 in the DenK-
project 4 . In this project a formal model was made of a specific communication
situation, and based on this model, a human-computer interface was imple-
mented. Both in the model and in the system, the belief states of agents were
formalised as type theoretical contexts. This means that an agent's beliefs
are represented in a binary format, where one part of the expression is the
proposition believed by the agent and the other the justification the agent has
for this particular belief. Both parts are syntactic objects in their own right,
and can be calculated upon by means of the rules of the type theory. This way
of representing beliefs turns justifications into first-class citizens, and proved
to be very fruitful for the purposes of the project.
At that time, mechanisms for belief revision were not investigated, but it
became clear that, given this formalisation of belief states, there is a straightforward
deductive approach to the problem: since every belief is accompanied
by its justification (and the rules operate on both), every inconsistency that
surfaces in the agent's belief state has its own justification, containing the
justifications of the beliefs that cause the inconsistency.

2 Type theory for knowledge representation

Judgements: The basic relation in type theory is the judgement Γ ⊢ a : T
(read as 'term a has type T in context Γ'). Here 'a' and 'T' are both formulas
written according to a well-defined syntax; a : T is called a statement, whose
subject is the term a. One also says that term a is an inhabitant of type T.
The context Γ is a list of statements with variables as subjects, e.g. x_1 :
T_1, ..., x_n : T_n. The judgement Γ ⊢ a : T can then be read as follows: "If x_1
has type T_1, ..., and x_n has type T_n, then term a has type T". Note that a
may contain x_1, ..., x_n, so a depends on x_1 to x_n. The set of subject variables
{x_1, ..., x_n} is called the domain of Γ.
Statements: The intuitive notion 'has type' has a direct counterpart in
naive set theory, viz. 'is element of'. For example, the statement 'a : N'
('term a has type N'), assuming that N is a symbol representing the set of
natural numbers, can be interpreted as 'a ∈ N' ('the object represented by a
is an element of the naturals'). The notion of having a type, however, is more
general than the notion of set-theoretical elementhood. This is because a type
T can represent not only some kind of set, but also a proposition. In the latter
representation, the statement a : T expresses: 'a is (a term representing) a
proof of the proposition T'. One speaks of 'propositions as types and proofs
as terms' (abbreviated as PAT) in order to emphasize this usage of types.
Contexts: The context Γ in a judgement Γ ⊢ a : T contains the 'prerequisites'
necessary for establishing the statement a : T. In Γ = x_1 : T_1, ..., x_n : T_n, a
statement x_i : T_i expresses many kinds of prerequisites, the simplest being:
1. x_i is an element of the set T_i,
2. T_i is an assumption (a proposition) and x_i is its atomic justification.
However, in type theory there are different 'levels' of typing: a type can itself
have a type. Statements expressing the typing of types deal with the well-formedness
of these types. For the T_i in 1. and 2. above, we can have:
1. T_i : set, to express that T_i is a well-formed formula representing a set,
2. T_i : prop, to express that T_i is well-formed, representing a proposition.
The last-mentioned statements can also be part of a context. So a context
could look like: T_1 : prop, T_2 : set, x_1 : T_1, x_2 : T_2. The terms set and prop
are examples of so-called sorts, predefined constants on which the type system
is based. Every type system has a specific set of sorts, which we denote by S.
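As a toy illustration of contexts and their well-formedness restrictions (our own sketch, not part of the DenK system or any real type checker), a context can be modelled as an ordered list of statements in which x : T may only be added when T is meaningful, i.e. T is a sort or has itself been typed by a sort earlier in the context:

```python
SORTS = {"prop", "set"}

class Context:
    def __init__(self):
        self.statements = []          # ordered (subject, type) pairs

    def meaningful(self, T):
        """T is meaningful if it is a sort, or some earlier statement
        gives T a sort as its type (T : prop or T : set)."""
        return T in SORTS or any(s == T and t in SORTS
                                 for s, t in self.statements)

    def add(self, subject, T):
        """Incrementality: a statement may only extend the context
        if its type is already meaningful to the agent."""
        if not self.meaningful(T):
            raise ValueError(f"{T} is not meaningful in this context")
        self.statements.append((subject, T))

ctx = Context()
ctx.add("T1", "prop")    # T1 : prop -- T1 is a proposition-concept
ctx.add("x1", "T1")      # x1 : T1   -- x1 is an atomic justification of T1
# ctx.add("x2", "T2") would raise: T2 has not been introduced.
```

Rejecting `x2 : T2` mirrors the paper's point that information which cannot be tied to the agent's existing concepts is dismissed as incomprehensible.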
We identify three characteristics of knowledge which, in our view,
should be taken into account in any attempt to formalize knowledge:

• Subjectivity: Knowledge of an agent is partial: no one knows everything,
and agents differ in what they know and don't know. Also, knowledge
is formulated in terms of concepts which are subjective in the sense that
one agent may judge something to be an instance of a certain concept,
while another agent would not recognize this as such.
• Justification: Knowledge is justified: agents not only know things, but
they have reasons for knowing them. Generally, parts of knowledge are
justified in terms of more basic parts; an agent's body of knowledge is
structured. And even atomic justifications are supports for the knowl-
edge, since they point at an origin (an axiom, an observation, etc.).
• Incrementality: The knowledge of an agent can be extended as new in-
formation becomes available. Whether this information can be incorpo-
rated by the agent depends on the possibility to tie this information to
the knowledge that is already present. This may lead to simply adding
the new information, but also to dismissing it (for instance because it is
incomprehensible) or even to a reorganization of the existing knowledge.
With these requirements, the traditional distinction between knowledge and
belief disappears: there can be no knowledge which is true in any absolute
sense, since an agent's knowledge depends on his subjective conceptualisation
of the world. At best some pieces of knowledge turn out to be more reliable
than others and some things can be agreed upon by more agents than others.
There is a natural way to capture the above characteristics in type theory:
• Subjectivity is captured by types: Each concept is formalized as a type,
each instance of the concept is a term inhabiting this type. An agent's
subjective ability to recognize something as an instance of a concept, is
mirrored in the ability to judge that the corresponding term inhabits the
corresponding type. Note that 'having a concept' is also subjective in
the sense that different people may have formed different concepts in the
course of time. This means that one agent can have a concept, whereas
another agent has no comparable concept. And in case agents do have
comparable concepts, they may differ in what they recognise as belonging
to this concept. In case the type formalizing the concept is a 'set-type',
this means that they may differ in what they regard as elements of the set
(a rhododendron may be a tree for the one, but a shrub for the other). In
case this type is a 'proposition-type', they may differ in what they accept
as a justification for that proposition.
• Justification is captured by terms: As said before, by the PAT-principle,
justifications are first-class citizens, formalized in the type-theoretical
syntax as terms. The fact that term a justifies proposition T is expressed
as the statement a : T. The rules of type theory allow these terms to
be combined into complex terms, which reflects that parts of knowledge
may be a structured combination of more basic parts of knowledge.
• Incrementality is captured by contexts: An agent's knowledge state can
be formalized as a type-theoretical context. Addition of new information
to the knowledge state can be formalized by adding statements to the
context, dismissing information amounts to reducing the context. Infor-
mation may only be added if it 'matches' an agent's knowledge state. In
type theory, a statement can only extend a context if it obeys certain
well-formedness restrictions.
The knowledge state of an agent consists of 'everything he knows' at some
instant. Given our characterization of knowledge, this means that everything
in a knowledge state is formulated in terms of the agent's concepts. Hence:
• Meaningfulness: An agent has formed his own, private concepts, and
only things formulated by means of these concepts can be meaningful to
him. Whether or not information coming from outside (by observation or
communication) makes sense, depends on the concepts that are already
available. (We assume that the entirety of concepts of an agent is fixed.)
• Inhabitation: Whatever an agent knows about the world is recorded in
a knowledge state in the form of meaningful expressions that he accepts.
This includes expressions about which objects 'inhabit' the concepts, and
which propositions hold, according to the agent.
If we take the following (very simple) context as representing an agent's
knowledge state: T_1 : prop, T_2 : set, x_1 : T_1, x_2 : T_2, we can see:
• Meaningfulness is captured by statements of the form T : prop or T : set.
That is to say, in this example the agent has two concepts, viz. T_1, which
is a proposition to him, and T_2, which is a set. At this stage, there are no
other concepts, i.e. all sets and propositions which are not constructed
out of T_1 and/or T_2 are not meaningful to him.
• Inhabitation is captured by statements of the form x : T, where T is
meaningful. In the example context, the inhabitant x_1 of T_1 represents
the agent's justification for the holding of T_1, and the inhabitant x_2 of
T_2 is an element of the set T_2 which is recognized as such by the agent.
'Everything an agent knows' at a certain instant can be divided into:
• Explicit knowledge, expressed by the statements in context Γ. These are
explicitly represented pieces of knowledge directly available to the agent.
• Implicit knowledge, expressed by statements derivable in context Γ. These
are consequences (obtained by inference) of an agent's explicit knowledge.
Hence, in a judgement of the form Γ ⊢ a : T, the explicit knowledge can be
found to the left of ⊢, and the implicit knowledge to the right of ⊢.
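The explicit/implicit distinction can be illustrated with a small sketch (ours, not from the paper) in which implicit knowledge is obtained by closing the context under a single application rule: from f : T1 -> T2 and a : T1, derive (f a) : T2. Note that every derived belief carries a complex justification term built from the atomic ones, as the PAT-principle requires:

```python
def derive(context):
    """Close the context's statements under the application rule,
    recording the justification term of every derived belief."""
    known = dict(context)                    # subject term -> type
    changed = True
    while changed:
        changed = False
        for f, tf in list(known.items()):
            if "->" not in tf:
                continue                     # not a function/implication type
            src, dst = [t.strip() for t in tf.split("->", 1)]
            for a, ta in list(known.items()):
                term = f"({f} {a})"
                if ta == src and term not in known:
                    known[term] = dst        # complex justification recorded
                    changed = True
    return known

gamma = {"x1": "T1", "f": "T1 -> T2"}        # explicit knowledge (context)
closure = derive(gamma)
print(closure["(f x1)"])                     # T2 -- implicit knowledge
```

The derived statement `(f x1) : T2` is exactly the kind of implicit knowledge that, once an inconsistency surfaces, lets the agent read off from the justification term which explicit beliefs caused it.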

3 Concluding remarks
We explored the use of explicitly represented justifications in belief revision
where beliefs and belief states were represented respectively as type theoret-
ical statements and contexts (for details see 3 ) . Justifications make it easy
to identify the beliefs that cause inconsistency of the belief state and greatly
simplify the handling of dependencies between beliefs. Our approach is appli-
cable to agents with limited computational resources because it is deductive
and we do not require that our theory of belief revision itself selects which
beliefs have to be removed. This holds independently of the strength of the
logic in which the belief change operations are cast: the mechanisms that were
used to represent justifications and dependency relations between beliefs are
at the heart of type theory, making our approach applicable a) to a large
family of type systems, and hence b), given the connections between type
theory and logic, to a wide range of logics 2. Our work has been implemented
on the basis of a standard type-theoretic theorem prover, where the agent's
belief state is represented as a type-theoretical context as described in this paper 4 .
Although we know of no work in the literature where justifications are
explicitly represented, we show in 3 that our framework is related to: a)
revision for belief bases and to Foundations Theory, but does not suffer from
the drawbacks usually associated with foundations theory such as problems
with disbelief propagation, circular justifications, and multiple justifications
for the same belief; and b) the work of Hansson on semi-revision, whose notion
of consolidation can be simulated in our framework and where new information
is not automatically completely trusted.
References
1. Ahn, R., Borghuis, T., Communication Modelling and Context-
Dependent Interpretation: an Integrated Approach. In: TYPES'98.
LNCS 1657, Springer Verlag (1999), pp. 19 - 32.
2. Barendregt, H., Lambda calculi with types. In Handbook of logic in com-
puter science, Abramsky, Gabbay and Maibaum (eds.), Oxford University
Press, Oxford (1992), pp. 117 - 309.
3. Borghuis, T., and Nederpelt, R., Belief Revision with Explicit Justifi-
cations, an Exploration in Type Theory. CS-report 00-17, Eindhoven
University of Technology, Dept. of Math. and Comp. Sc., NL (2000).
4. Bunt, H., Ahn, R., Beun, R-J., Borghuis, T., and Van Overveld, K., Mul-
timodal Cooperation with the DenK System. In: Multimodal Human-
Computer Interaction, Bunt, H., Beun, R-J., Borghuis, T. (eds.), Lecture
Notes in Artificial Intelligence 1374, Springer Verlag (1998), pp. 39 - 67.
5. Gärdenfors, P., The dynamics of belief systems: Foundations versus co-
herence theories. Revue Internationale de Philosophie, 44 (1990), pp. 24 - 46.
HETEROGENEOUS BDI AGENTS II: CIRCUMSPECT AGENTS

MARIA FASLI
University of Essex, Department of Computer Science, Wivenhoe Park, Colchester
CO4 3SQ, United Kingdom
Email: mfasli@essex.ac.uk

The study of formal theories of agents has received increasing attention, in particular
within the context of the BDI paradigm. An interesting theoretical issue
in this framework is defining notions of realism, that is, interrelations between the
agent's beliefs, desires and intentions. Intuitively, each notion of realism characterises
a different type of agent. In this paper we extend the BDI framework and
propose notions of realism for capturing circumspect agents, that is, agents that
are willing to adopt intentions only if they believe that these are achievable
options. Three such notions of realism are presented, which are shown to have better
features than the classical notion of strong realism.

1 Introduction

Formal theories of agents have attracted considerable attention within the
agent community, since they can be viewed as specifications for building such
systems. The BDI framework developed by Rao and Georgeff 3,4 is perhaps
one of the most well-known theories of agents. An interesting theoretical issue
in the BDI framework is defining notions of realism, that is, interrelations between the
agent's attitudes, namely beliefs, desires and intentions. Different relations
give rise to different types of agents as in Rao and Georgeff4. Thus the notion
of strong realism characterises a cautious agent whereas realism characterises
an enthusiastic one. This accords well with our intuition that heterogeneous
agents are needed: different domains of applications may require agents with
different cognitive models. In this paper we propose notions of realism for
capturing what we describe to be circumspect agents. In particular, we present
notions of realism that improve upon the characteristics of strong realism and
thus come closer to the desiderata for reasoning BDI agents laid down by
Bratman 1 and Rao and Georgeff 4. The paper proceeds as follows: next we
describe the BDI paradigm along with the three original notions of realism.
Due to space limitations we will only briefly describe the basic ideas; the reader
is referred to Rao and Georgeff3,4 for full formal definitions. Then we discuss
our intuitions and basic ideas behind the concept of a circumspect agent and
we present three notions of realism for formalising such agents. A discussion
on their evaluation is also included. The paper ends with the conclusions.


Figure 1. i) Strong Realism, ii) Realism, iii) Weak Realism

Table 1. Axioms for the notions of realism

  Strong Realism            Realism                   Weak Realism
  Intend_i(φ) ⇒ Des_i(φ)    Bel_i(φ) ⇒ Des_i(φ)       Intend_i(φ) ⇒ ¬Des_i(¬φ)
  Des_i(φ) ⇒ Bel_i(φ)       Des_i(φ) ⇒ Intend_i(φ)    Intend_i(φ) ⇒ ¬Bel_i(¬φ)
                                                      Des_i(φ) ⇒ ¬Bel_i(¬φ)

2 The BDI Paradigm

In the BDI paradigm developed by Rao and Georgeff 3,4 an agent's cognitive
state is described in terms of both information attitudes and pro-attitudes.
Formally, a first-order language is adopted which in addition includes three
modal operators Bel_i, Des_i, and Intend_i for expressing beliefs, desires and
intentions respectively. The framework also uses a temporal component based
on CTL logic. Semantics is given in terms of possible worlds which are themselves
branching-time structures. The axiomatisation for beliefs is the standard
KD45 modal system, while for desires and intentions the K and D systems
are adopted respectively. Since an agent's cognitive state is expressed in terms
of three attitudes, additional axioms are required that describe their interrelations.
Three such ways of interrelating the three attitudes, known also as
notions of realism, have been proposed by Rao and Georgeff 4 , Figure 1.
In the first notion of realism, namely strong realism, if an agent intends to
optionally achieve ψ, then it optionally desires it and also believes that ψ is an
achievable option. The set of belief-accessible worlds is a subset of the desire-accessible
worlds, and the set of desire-accessible worlds is a subset of the
intention-accessible worlds, as is illustrated in Figure 1(i). Set containment
corresponds to logical implication, and thus the relationships between sets of
accessible worlds entail the axioms in Table 1^a. According to strong realism

^a Note: the application of these axioms is restricted to O-formulas 3,4 . This is also the case
for the other notions of realism.

Table 2. Asymmetry Thesis Principles and their satisfaction in Basic BDI Systems

  #   Name                 Formula                          S  R  W
  A1  I-B Inconsistency    ⊢ Intend_i(φ) ⇒ ¬Bel_i(¬φ)       T  T  T
  A2  I-B Incompleteness   ⊬ Intend_i(φ) ⇒ Bel_i(φ)         F  T  T
  A3  I-D Incompleteness   ⊬ Intend_i(φ) ⇒ Des_i(φ)         F  T  T
  A4  I-D Inconsistency    ⊢ Intend_i(φ) ⇒ ¬Des_i(¬φ)       T  T  T
  A5  B-D Incompleteness   ⊬ Bel_i(φ) ⇒ Des_i(φ)            T  F  T
  A6  B-I Incompleteness   ⊬ Bel_i(φ) ⇒ Intend_i(φ)         T  F  T
  A7  D-B Inconsistency    ⊢ Des_i(φ) ⇒ ¬Bel_i(¬φ)          T  T  T
  A8  D-I Incompleteness   ⊬ Des_i(φ) ⇒ Intend_i(φ)         T  F  T
  A9  D-B Incompleteness   ⊬ Des_i(φ) ⇒ Bel_i(φ)            F  T  T

Table 3. Consequential Closure Principles and their satisfaction in Basic BDI Systems

  #   Formula                                           S  R  W
  C1  Intend_i(φ1) ∧ Bel_i(φ1 ⇒ φ2) ∧ ¬Intend_i(φ2)     T  F  T
  C2  Intend_i(φ1) ∧ Des_i(φ1 ⇒ φ2) ∧ ¬Intend_i(φ2)     T  F  T
  C3  Des_i(φ1) ∧ Bel_i(φ1 ⇒ φ2) ∧ ¬Des_i(φ2)           T  F  T

the agent is very cautious, and only intends and desires propositions that
it believes to be achievable. In realism, the set of intention-accessible worlds is a
subset of the desire-accessible worlds, and the set of desire-accessible worlds
is a subset of the belief-accessible worlds, Figure 1(ii). The axioms are given
in Table 1. An agent based on realism is an enthusiastic agent and believes
that it can achieve its desires and intentions. Finally, in weak realism, the
intersections of intention- and desire-, intention- and belief-, and belief- and
desire-accessible worlds are not empty, as shown in Figure 1(iii). The
axiom schemas for weak realism are provided in Table 1. The agent described
by weak realism is more balanced than the two other types of agents.
The three different systems that result from the adoption of the corresponding
axioms of realism will be called S-BDI, R-BDI and W-BDI respectively.
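The set-containment readings of the three notions can be checked mechanically. The sketch below is our own illustration over finite sets of worlds; it ignores the branching-time structure and the O-formula restriction, and all names are ours:

```python
def strong_realism(B, D, I):
    # B subset of D subset of I entails Intend(phi) => Des(phi)
    # and Des(phi) => Bel(phi)   (truth in all D-worlds implies
    # truth in all B-worlds, and so on).
    return B <= D <= I

def realism(B, D, I):
    # I subset of D subset of B entails Bel(phi) => Des(phi)
    # and Des(phi) => Intend(phi).
    return I <= D <= B

def weak_realism(B, D, I):
    # Pairwise non-empty intersections: only the consistency
    # axioms hold, no implications between the attitudes.
    return bool(B & D) and bool(B & I) and bool(D & I)

B, D, I = {1, 2}, {1, 2, 3}, {1, 2, 3, 4}
print(strong_realism(B, D, I), realism(B, D, I), weak_realism(B, D, I))
```

Note the direction of the entailment: a *smaller* set of accessible worlds makes *more* formulas true, which is why B ⊆ D yields Des_i(φ) ⇒ Bel_i(φ) and not the converse.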
Bratman 1 and Rao and Georgeff 4 discussed several properties or conditions
of rationality that a BDI system should satisfy. The first set of such
properties is known as the Asymmetry Thesis, or the incompleteness and
inconsistency principles, which hold pairwise between desires, beliefs, and
intentions. They are listed in Table 2 along with their satisfaction in the basic
systems. The second set is called the Consequential Closure principles. They
are provided in Table 3 along with their satisfaction in the basic BDI systems.

3 Circumspect Agents

Different domains of application for agents have different requirements, which
need to be reflected in their conceptualisation, design and implementation.
For instance, an agent that has to deal in the stock market may have a different
reasoning model from an air-traffic controller agent that has as a responsibility
the safe landing and take-off of aircraft. The stock market agent may
be required to engage in transactions that involve risk, whereas the air-traffic
controller agent needs to be strictly cautious and avoid decisions that involve
uncertainty and risk. Thus the need for heterogeneous agents stems from the
fact that the cognitive model of the agent may have to vary depending on
the requirements of a particular application. In the BDI paradigm this need
is addressed by adopting different realism constraints. For instance, the notion
of strong realism characterises a cautious agent. However, strong realism
describes only one possible way of relating the three sets of accessible worlds
for capturing cautious agents. Moreover, it does not satisfy three Asymmetry
Thesis principles, as shown in Table 2. Here we propose alternative constraints
for characterising cautious or circumspect agents. A circumspect agent is not
willing to take any risks; that is, we interpret circumspect agents in the BDI
framework as agents that only intend a proposition if they believe it to be
achievable in all accessible worlds. Although a number of notions of realism
have been uncovered, only three will be presented due to space limitations.
According to our interpretation of circumspect agents, such an agent only
intends to optionally achieve ψ if it believes in all respective accessible worlds
that ψ is an achievable option. Thus, one basic restriction for such agents in
terms of semantic conditions is that the set of belief-accessible worlds should
be a subset of the intention-accessible worlds. Consequently the A2 principle
will not be satisfied for such an agent. Therefore we will attempt to
improve on the remaining principles, namely A3 and A9. In the first notion
of realism, RC1-BDI, the set of belief-accessible worlds is a subset of the
intention-accessible worlds, the intersection of desire- and intention-accessible
worlds is not the empty set, and the intersection of belief- and desire-accessible
worlds is not the empty set, as shown in Figure 2(i). Although we relax the
requirement of strong realism in that an agent can have desires that it may
not believe to be achievable in all possible worlds, the agent remains circumspect
regarding its intentions and will only adopt intentions that it believes
to be achievable options. The axioms imposed according to the set relations
are given in Table 4^b. The respective system, called RC1-BDI, consists of the

^b Again, the application of the realism axioms of Table 4 is restricted to O-formulas.

Figure 2. i) RC1-BDI Realism, ii) RC2-BDI Realism, iii) RC3-BDI Realism

Table 4. Axioms for the notions of realism for Circumspect Agents

  RC1-BDI                   RC2-BDI                  RC3-BDI
  Intend_i(φ) ⇒ Bel_i(φ)    Intend_i(φ) ⇒ Bel_i(φ)   Intend_i(φ) ⇒ Bel_i(φ)
  Des_i(φ) ⇒ ¬Bel_i(¬φ)     Des_i(φ) ⇒ Bel_i(φ)      Intend_i(φ) ⇒ Des_i(φ)
  Bel_i(φ) ⇒ ¬Des_i(¬φ)

basic BDI axiomatisation and the axioms for RC1-BDI realism.


Another type of circumspect agent is depicted in Figure 2(ii). This agent
believes that its desires and intentions are achievable options, although its in-
tentions are loosely coupled with its desires. Thus, the set of belief-accessible
worlds is a subset of the desire-accessible worlds, the set of belief-accessible
worlds is a subset of the intention-accessible worlds as well, and the intersec-
tion of the intention- and desire-accessible worlds is not the empty set. The
axioms are provided in Table 4 and the respective system is called RC2-BDI.
A third variation is illustrated in Figure 2(iii). In this case both the sets
of belief- and desire-accessible worlds are subsets of the intention-accessible
worlds, while the intersection of the desire- and belief-accessible worlds is not
the empty set. The axioms that are imposed according to these conditions
are again provided in Table 4. This system is called RC3-BDI.
We now turn our attention to the evaluation of the proposed systems
with regard to the desiderata for rational agents as suggested by Bratman 1
and Rao and Georgeff 4. According to our interpretation and basic condition
for circumspect agents, the A2 principle is not satisfiable in these systems.
Comparing the three notions of realism with strong realism, we see that RC1-
BDI, RC2-BDI and RC3-BDI provide certain improvements. In strong realism
three of the Asymmetry Thesis principles (A2, A3 and A9) are not satisfied,
whereas in RC1-BDI only one is not satisfied, and in RC2-BDI and RC3-BDI
two of them are not. In all three systems the Consequential Closure principles
are satisfied. In conclusion, the three proposed systems seem to come closer
to the requirements for rational BDI agents than strong realism does.

Table 5. Asymmetry Thesis and Consequential Closure in Circumspect BDI Agents

A1 A2 A3 A4 A5 A6 A7 A8 A9 C1 C2 C3
RC1 T F T T T T T T T T T T
RC2 T F T T T T T T F T T T
RC3 T F F T T T T T T T T T

4 Conclusions

The research presented in this paper has been motivated by the need to for-
malise heterogeneous agents and in particular circumspect agents in the BDI
paradigm. A circumspect BDI agent will only adopt an intention to optionally
achieve φ if it believes that this is an achievable option. Three different
notions of realism for circumspect agents were presented. These were shown
to have better characteristics than the notion of strong realism. In the scope
of this research and in the effort to investigate all the available options, ad-
ditional notions of realism were uncovered. However, due to lack of space we
only described those that seem to yield the most interesting properties.
In contrast to circumspect agents one can consider bold agents. Such
an agent can adopt an intention towards a proposition if it does not believe
that the proposition is not an achievable option. The basic condition that
seems to characterise such agents is: Intend_i(φ) ⇒ ¬Bel_i(¬φ). Notions of
realism for such agents were explored elsewhere 2. In conclusion, we believe
that the research presented here comes one step closer towards heterogeneous
BDI agents. Perhaps the most interesting aspect of this work is to consider
real applications and investigate how real agents that correspond to these
formal cognitive models can be built.

References

1. M.E. Bratman, Intentions, Plans, and Practical Reason. Harvard University Press (1987).
2. M. Fasli, Towards Heterogeneous BDI Agents I: Bold Agents. In Proceedings of the 14th International FLAIRS Conference, AAAI Press (2001).
3. A. Rao and M. Georgeff, Modelling Rational Agents within a BDI-Architecture. In Proc. of the 2nd Int. Conf. on Principles of Knowledge Representation and Reasoning, pp. 473-484 (1991).
4. A. Rao and M. Georgeff, Decision Procedures of BDI Logics. Journal of Logic and Computation, 8(3):293-343 (1998).
A PREFERENCE-DRIVEN APPROACH TO DESIGNING
AGENT SYSTEMS

STEFAN J. JOHANSSON
Department of Software Engineering and Computer Science,
Blekinge Institute of Technology, S-372 25 Ronneby, Sweden
e-mail: sja@bth.se

JOHAN KUMMENEJE
Department of Computer and Systems Sciences, Stockholm University and the
Royal Institute of Technology, S-164 42 Kista, Sweden
e-mail: johank@dsv.su.se

We present a preference-driven approach to the construction of agent systems in
which owners and designers of both the agents and environments are recognized to
influence the preferences of the agents in order to maximize their expected utilities.
We propose some general guidelines for using preferences in the process of agent
engineering and identify the need for future research in the area.

1 Introduction

One important issue of agency is control. We must not be enticed to believe
that agents live in social and environmental isolation. They have been
designed to fulfill the goals of their creators by interacting with their environment
and other agents. Central to our point of view are the following concepts:
Definition 1 An agent owner (Ao) is the (human or artificial) agent that
has the power to launch the agent, as well as to decide whether the agent
should be shut down or assigned new preferences. The owner expresses its
preferences to the agent and gets it to work toward the given preferences.
Definition 2 An agent designer (AD) is the (human or artificial) agent that
has designed (and possibly implemented) the control mechanism of an agent.
By control, we mean the internal evaluation of the environment and the
owner preferences.
Definition 3 A designer of an environment (ED) is the (human or artificial)
agent that has designed and possibly implemented the rules and conditions
under which agents are able to act in the environment.
Definition 4 An environment owner (Eo) is the (human or artificial) agent
whose run-time preferences are reflected in the dynamics of the rules and the
conditions under which agents are able to act in the environment.


We will try to clarify the role of each of these actors in the following
sections. In the next section, we give some (artificial) examples of agent
systems and discuss how the different users and designers relate to their
parts of the system. Section 3 discusses a real example of preference dynamics
based on the simulated league in RoboCup, in which designers and users of
both agents and environments act on the preferences of the others. We finish
with a section of discussion and future work.

2 The Meta-Design of a System

In an agent system, we may identify the following features:
First, each of the agents has a set of dynamic preferences expressed by
their owners as well as a set of static preferences decided at design level.
Secondly, the agents may take into account preferences expressed by the
designer and the owner of the environment.
Thirdly, each of the agents optimizes its actions according to its preferences,
its knowledge and its abilities, i.e. they are boundedly rational (more about
bounded rationality is found in e.g. Boman 1).
Fourthly, the actions of the agents influence the environment either directly
or indirectly, and
Fifthly, changes occur in the environment as a result of the actions of the
agents. These are the possibly observable side-effects of the system that the
owner may benefit from, and possibly adjust its preferences according to.
The Agents' Choice of Actions: The assumption of bounded rationality is
pragmatic in the sense that an agent may be unaware of the preferences,
abilities, etc. of other agents. If an agent were not boundedly rational, it
would deliberately act non-optimally with respect to the design objectives
that the user and designers consider to yield the best action. Instead, some
other preferences must have been present, which contradicts the fact that the
only things that guide the behavior of an agent are the preferences of its
owner and its designers and the state of the environment.
The Observations of the Owner: It is rarely the case that agents as such are
the reason for running a system (exceptions are found e.g. in agent-based
simulations 2). Instead, what the owners of the agents are generally interested
in is the side-effects of the actions of the agents. To illustrate this, imagine
an office environment agent. The owner of this agent is interested in the
result of the negotiations, i.e. that the local environment gets as close to the
owner preferences as possible, not in the negotiation protocols used nor in
how many agents it had to negotiate with.

Figure 1. The different sources of valuation (owner preferences and environment preferences)

Design Principles for Agent Environments: As an ED, the task is to implement
the rules and conditions under which agents that act in the environment
will be evaluated. While the ED has to design a (hopefully unambiguous)
framework of rules, much effort must also be put into the design of a system
of punishments and rewards. The Eo then sets the rewards and the
punishments for certain behaviors in a way that leads the expected behavior
to an acceptable behavioral equilibrium. If not, the result will be an
environment in which the agents niche themselves in behaviors that are
sub-optimal for the environment as a whole. We therefore suggest the following
schematic guidelines for environment design and owner maintenance:
(i) Set up the conditions under which the agents are allowed to act in the
environment.
(ii) Assign to each (class of) possible allowed state(s) a preference describing
the estimated value of the state (from the perspective of the ED/Eo), and
(iii) Calculate the assignment of punishments and rewards of behaviors
that, when implemented in the environment, will have its equilibrium in the
preferred states.
The complexity of the calculation of punishments and rewards is of course
dependent on the complexity of the allowed actions. It is not our purpose to
expound our ideas about how to calculate punishments and rewards here;
instead we leave it for future work.
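Although the actual calculation of punishments and rewards is deferred to future work, a toy illustration of guideline (iii) might look as follows. The proportional scheme, the state names, and all numbers are our own assumptions, not the authors' method.

```python
# Toy reward assignment: given ED/Eo preference values over allowed states,
# derive rewards so that the most preferred states carry the largest incentive.

def assign_rewards(state_preferences, budget=10.0):
    """Scale rewards proportionally to the stated preference of each state."""
    total = sum(state_preferences.values())
    return {state: budget * value / total
            for state, value in state_preferences.items()}

# Invented preference values for three allowed behavioral states.
prefs = {"cooperate": 6.0, "idle": 3.0, "defect": 1.0}
rewards = assign_rewards(prefs)
print(rewards["cooperate"] > rewards["defect"])  # preferred state pays more
```

A real environment would of course have to account for strategic interaction among the agents before such rewards yield the intended equilibrium.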
Design Principles for Agents: Each agent has a set of preferences in which
each preference is a measure of the importance that a certain goal is fulfilled.
We can distinguish two types of these preferences: static and dynamic. The

static preferences are the ones set at the designer level when the agents and
the environments are implemented. The dynamic preferences are the ones set
by the owners of the agents, and to some extent the owners of the environment,
at run-time.
We may expect a further development of the skills and abilities of agents
as the field of agent engineering matures. This means that they will be able
to exploit (where possible) the weaknesses of the environments that they act
in, as well as the weaknesses of other agents. Today these weaknesses are
exploited manually through the expression of explicit owner preferences, but
as the level of abstraction increases, we may expect this to be automated in
such a way that the ADs provide skills that automagically find the weak spots
of the environment and use them for the agent's own purposes.
A suggested set of guidelines for ADs is therefore to design/implement:
(i) Abilities to find out the rules and conditions of an environment (e.g.
by look-up services, etc.).
(ii) Abilities to optimize the behavior with respect to: a) the actions
possible to perform in the given environment, b) the expected rewards and
punishments of different behaviors in the environment, and c) the preferences
of the Ao.
(iii) An interface to the Ao in which the Ao can express its preferences.
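As a rough sketch of guideline (ii), an agent could score its candidate actions by weighting expected environment rewards against how well each action serves the owner's preferences. The weighting scheme, the action names, and all values below are invented for illustration only.

```python
# Scoring possible actions by a weighted sum of expected environment
# reward/punishment and the value the owner attaches to each action.

def best_action(actions, expected_reward, owner_value, weight=0.5):
    """Pick the action maximizing the combined valuation."""
    def score(a):
        return weight * expected_reward[a] + (1 - weight) * owner_value[a]
    return max(actions, key=score)

actions = ["negotiate", "wait", "withdraw"]
expected_reward = {"negotiate": 0.7, "wait": 0.2, "withdraw": -0.1}
owner_value = {"negotiate": 0.9, "wait": 0.5, "withdraw": 0.1}
print(best_action(actions, expected_reward, owner_value))  # -> negotiate
```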
The Relation between the Agent and the Environment: It is possible to recognize
two different types of relationships: between an agent and its environment,
and between agents (i.e. communicative acts). Also, an agent may observe
the effects of its own and other agents' actions, even though it may be hard
or even impossible for the agent to draw any causal conclusions.
If we take a closer look at what happens in the environment, actions are
performed under the agent's assumption that the action was the best possible
thing to do in order to reach its goals, expressed by its preferences, regardless
of whether they are communicative or not. The agent must in all cases to
some extent observe the external state of the environment and the other
agents, but the distribution of computational attention between, for example,
observing and acting is individual from agent to agent. This is typically a
parameter that is determined at the designer level. For instance, an agent
that relies on learning in order to perform well may be designed to be more
observant than an agent that must be prepared for quick responses to changes
in the preferences of its owner. This means that it is possible that one agent
in one system collects all possible observations, while another agent only
observes the actions performed by itself. A study of the trade-off between
deliberation and action can be found in e.g. the work of Schut 3.

3 An Exemplification of Preferences

To exemplify our point, we use the student implementations of RoboCup
teams at Linkoping University. The example, though somewhat artificial,
clearly illustrates a number of occasions where the preferences of the
environment designer, the agent designer, and the agent owner influence the
development process. RoboCup can simply be described as robots playing
soccer; however, we focus on the simulated league, as we thereby avoid dealing
with the ambiguity of the real world (more information on RoboCup and the
simulated league is available in Kummeneje 4). The designers of the server of
the simulated league are in our example considered to be the environment
designer. RoboSoc 5 is a base to ease the creation of new soccer playing teams,
and Heintz is thereby considered in our example to be the agent designer, while
the students creating their teams are considered to be the agent owners. The
agent owners may or may not be aware of the preferences expressed in the
simulation server and the RoboSoc platform; however, if they are aware of the
preferences (and most likely any caveats), they might be able to use these
preferences. For instance, in 1997 and 1998 the maximum ball speed was
not limited, allowing a team to accelerate the ball to incredible speeds by
simply passing the ball a number of times. After the discovery of this
feature, the server was changed to impose a fixed limit. We thereby recognize
that the sets of preferences of the ED, AD, and the Ao are not fixed, but change
dynamically over time. The preferences may also be viewed as being delicately
intertwined.

4 Discussion and Concluding Remarks

The designer of the agent may be the same as the owner; more likely, however,
the future user of an agent system is someone who is not able to program
the low-level algorithms, etc., but who prefers to use the agent at the service
level. This of course raises the issue of trust in agent design. How can we
as users of an agent make sure that the agent we have launched to perform
a certain task will do its best to serve us without putting the interests of
the agent designer first? For instance, should we trust a flight-ticket-buying
agent designed by someone on the payroll of a major airline company?
Questions like this are important to ask if we as agent designers and
representatives of the agent research community want to deserve respect for
what we are doing from the point of view of the users of our agents.
We have presented a perspective on agent systems, based on preferences

set by users and designers, and suggested general guidelines for the engineering
of agents, as well as agent environments.
From an evolutionary perspective, we may expect the agent designers to
become better at taking other, external preferences into consideration, while
the owners get less interested in how exactly the agent works, and more keen
on having their preferences satisfied. The environment designers will
concentrate on setting up rules specific to the domain the environment is
designed for. These rules will not be able to control which actions can be
performed by which agents at what time. However, indirectly the punishments
and the rewards of the environment will have a great impact on these matters.
Even though this study includes a good example of the preference perspective
in the domain of RoboCup, it is far too early to draw any extensive conclusions
based on it, and we suggest that more effort be put into this promising area
of research.

Acknowledgments

Stefan Johansson would like to thank the EC research programme IST-1999-10298
ALFEBIITE and the KK-foundation for funding and inspiration for this
work 6. The authors thank Paul Davidsson and Magnus Boman for comments.

References

1. M. Boman. What is rational agency? Technical Report 95-048, Department of Computer Systems Sciences, 1995. Internal Working Note.
2. H.J.E. Verhagen. Norm Autonomous Agents. PhD thesis, Department of Computer and Systems Sciences, Stockholm University and Royal Institute of Technology, 2000.
3. M. Schut. Intention reconsideration as discrete deliberation scheduling. In Proceedings of the 2001 AAAI Spring Symposium on Game Theoretic and Decision Theoretic Agents, Technical Report SS-01-03. AAAI Press, 2001.
4. Johan Kummeneje. RoboCup as a Means to Research, Education, and Dissemination. Licentiate thesis, Department of Computer and Systems Sciences, Stockholm University and the Royal Institute of Technology, March 2001.
5. Fredrik Heintz. RoboSoc: a System for Developing RoboCup Agents for Educational Use. Master's thesis, Department of Computer and Information Science, Linkoping University, March 2000.
6. The ALFEBIITE home page, http://www.iis.ee.ic.ac.uk/alfebiite.
AGENT CONSUMER REPORTS: OF THE AGENTS,
BY THE AGENTS, AND FOR THE AGENTS

XIAOCHENG LUAN, YUN PENG, AND TIMOTHY FININ


University of Maryland, Baltimore County, 22215 Overview Lane, Boyds, MD 20841, USA
E-mail: {XLUAN1, YPENG, FININ}@CS.UMBC.EDU

Service matching is critical in large, dynamic agent systems. While finding exact matches is
always desirable as long as an agent knows what it wants, it is not always possible to find exact
matches. Moreover, the selected agents (with exact match) may or may not provide quality services.
Some agents may be unwilling or unable to advertise their capability information at a sufficient
level of detail, some might unknowingly advertise inaccurate information, while others might
even purposefully provide misleading information. Our proposed solution to this problem is the
agent "consumer reports". The broker agent will not only collect the information advertised by the
service provider agents, but also learn about the experiences the consumer agents have about their
service providers. It might also hire some agents to test certain service providers to see how well
they can do what they claim they are capable of doing. Then agent consumer reports will be built
based on the information collected. The advanced level of agent consumer reports will also
dynamically capture the probabilistic distribution of the services and use it to assess the probability
of a match. We plan to extend LARKS and use it as our agent capability description language.

1 Introduction

Finding the right agent(s) for the right task (service) is critical to achieving agent
cooperation in large, dynamic agent systems. A popular approach to this problem is to
use a broker agent (also called a matchmaker or facilitator) to connect the service
provider agents and the service consumer agents via service matching. Typically a
broker agent recommends service providers based on the capabilities/services advertised
by the service providers themselves. Matching methods have evolved from early,
simple KQML-performative-based matching to syntax- and semantics-based matching,
and from yes/no matches to matches with probabilities. However, we may still have
problems, since some agents may be unwilling or unable to advertise their capability
information at a sufficient level of detail; some might unknowingly advertise inaccurate
information; while others might even purposefully provide misleading information.
We have similar problems in the real world: we don't know whether the colorful,
fancy, and even touching commercials are true or not. There is no perfect solution to
this real-world problem, but consumer reports certainly help a lot (besides the justice
system). Consumer reports are created using information from the manufacturer's
specifications, consumers' feedback, and test results on the products. They provide

guidance for consumers to choose the right product. We believe that this consumer
reports approach should work for the agent world, too. By following a simple brokering
protocol (which will not be discussed here because of space limitations), the broker agent
will not only collect the information advertised by the service provider agents, but also
learn about the experiences the consumer agents have about their service providers. It
might also hire some agents to test certain service providers to see how well they can do
what they claim they are capable of doing. Based on the collected information and the
domain knowledge, consumer reports can be built to assist in service matching.
Moreover, the broker agent can dynamically capture the probabilistic distribution of the
agent services and use this information to assess the probability of a service match.
Finally, our approach goes beyond the simple notion of a "reputation server" in that it
discovers and refines a complex, symbolic model of a service provider's performance.
The rest of this article is organized into two sections. In Section 2, we describe
how the agent consumer reports will be built, and we discuss some related issues in
Section 3.

2 Building Consumer Reports

In our model of an agent system, there are three types of agents: service provider agents,
service consumer agents, and broker agents. A broker agent is the one responsible for
building the agent consumer reports. To simplify the problem, but without loss of
generality, we make the following assumptions: (1) All the agents (including the broker
agent) in a system share a common domain ontology, and (2) the security and/or privacy
issues are orthogonal to what we will discuss in this article.

2.1 Representation
We are extending the LARKS framework for use in describing the agent's capabilities.
LARKS, Language for Advertisement and Request for Knowledge Sharing, is an agent
capability description language developed at CMU. It describes an agent's service by
specifying the context, the data types, the input and output variables, and the input and
output constraints. It also has a slot for the definition of the concepts used in the
description.
The matchmaking scheme in LARKS is relatively flexible and powerful. It has five
filters, each of which addresses the matching process from a different perspective.
"Context matching" determines whether two descriptions are in the same or a similar
context; "profile comparison", "similarity matching", and "signature matching" are used
to check whether two descriptions match syntactically; "semantic matching" checks whether the

input/output constraints of a pair of descriptions logically match. Based on the needs
of a specific application domain, these filters can be combined to achieve different
types/levels of matching.
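Such filter combination can be sketched as a chain of predicates over request/advertisement pairs. The filter bodies below are crude stand-ins for the real LARKS algorithms, and the dictionary representation of descriptions is our own simplification.

```python
# Combining matching filters: an advertisement matches a request if it
# passes every filter selected for the application domain.

def context_match(req, adv):
    """Stand-in for LARKS context matching: compare context labels."""
    return req["context"] == adv["context"]

def signature_match(req, adv):
    """Stand-in for signature matching: compare input/output signatures."""
    return req["input"] == adv["input"] and req["output"] == adv["output"]

def match(req, adv, filters):
    """True if the advertisement passes all selected filters."""
    return all(f(req, adv) for f in filters)

req = {"context": "Sort", "input": "ListOf Integer", "output": "ListOf Integer"}
adv = {"context": "Sort", "input": "ListOf Integer", "output": "ListOf Integer"}
print(match(req, adv, [context_match, signature_match]))  # -> True
```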
Since LARKS doesn't provide mechanisms for describing the "ratings" of an
agent service, we plan to extend LARKS so that, besides the 7 standard slots described
above, a description will also have zero or more "CR" (Consumer Reports) slots. These
slots (if any) are typically domain dependent, and will be used to describe the strength
of various aspects of the service provided by a specific agent. For example, the
integer sort service description can have some CR slots (in italics) as shown in Figure 1.

Context            Sort
Types
Input              Xs: ListOf Integer;
Output             Ys: ListOf Integer;
InConstraints      Le(length(xs), 100);
OutConstraints     Before(x,y,ys) <- ge(x,y); In(x,ys) <- in(x,xs);
ConcDescriptions
PriceIndex           2 (10 is best)
ResponseTimeIndex    1 (10 is best)

Figure 1. Capability description for integer sort, with CR slots.


Basically we will add another type of filter, the consumer reports filter, to handle
the CR-related slots. Since these slots are usually domain dependent, the evaluation and
comparison of these slots might need to be done in a domain-dependent way. A default
CR filter can be provided, e.g., to compare integer-typed slots. The system will allow
customized CR filters to be plugged in to handle the CR slots in a domain-dependent
way during matchmaking or comparison. It is recommended that the consumer
reports filter be applied after all the other designated filters have been applied. The CR
filter will then be used to pick the best one(s) from all the candidates.
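A default CR filter of the kind described, one that simply ranks the remaining candidates by their integer-typed CR slots, might be sketched as follows. The dictionary representation of service descriptions and the example ratings are our simplification, not the LARKS format.

```python
# A default consumer-reports filter: order candidate service descriptions
# by the sum of their integer-typed CR slot ratings (higher is better,
# as in the 10-point scale of Figure 1).

def cr_filter(candidates, cr_slots=("PriceIndex", "ResponseTimeIndex")):
    """Return candidates sorted from best to worst by total CR rating."""
    def rating(desc):
        return sum(desc.get(slot, 0) for slot in cr_slots)
    return sorted(candidates, key=rating, reverse=True)

# Two hypothetical integer-sort providers with different CR ratings.
sorters = [
    {"agent": "A", "PriceIndex": 2, "ResponseTimeIndex": 1},
    {"agent": "B", "PriceIndex": 7, "ResponseTimeIndex": 6},
]
print(cr_filter(sorters)[0]["agent"])  # -> B
```

A domain-dependent CR filter would replace the simple sum with its own comparison of the domain's CR slots.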
Please note that while we plan to extend LARKS and use its service/capability
description language and its matching filters, we think the approach proposed here is
applicable to other representations or systems as well.

2.2 Building Consumer Reports


The consumer reports are built based on the information the broker collects about the
service provider agents. The information comes from various channels: The feedback

from service consumer agents, testing results (relevant agents can be asked or "hired" to
test the service provider agents, when appropriate), the service descriptions advertised
by the service provider agents, the domain knowledge etc. If the broker also performs
task brokering (in which the broker receives a query, finds an appropriate agent,
forwards the query to that agent, and passes the result back to the requesting agent), the
requests and the results are useful sources for learning too.
The building of consumer reports is more than just collecting feedback data and
assigning ratings. There are two levels of consumer reports: the basic level and the
advanced level. The basic level simply assigns ratings to each relevant CR slot of the
original service descriptions, based on the information collected. The advanced level,
however, goes beyond the originally advertised service descriptions. It might also rate
the sub-classes and super-classes of the advertised service class, and it captures the
probabilistic distribution of the services. Let's use an example to illustrate the basic idea.
Consider selling televisions as a service with three sub-service classes: selling
traditional TVs, selling HD-ready TVs, and selling HDTVs. Suppose the broker
discovered that 85% of the advertisements/requests are about traditional TVs, 8% are
about HD-ready TVs, and the rest (7%) are about HDTVs. Then if an agent requests a
recommendation on a "selling TV" service, the broker would be able to recommend a
traditional TV seller with pretty high confidence, or recommend an HD-ready TV seller
or an HDTV seller with low confidence (if there is no better choice). Five years later, the
distribution of the three sub-service classes might change to 30%, 20%, and 50%
respectively. The broker agent will then be able to dynamically capture the changes in
the probabilistic distribution and change its matching criteria accordingly.
On the other hand, while most of the TV sellers (those who advertise that they sell
TVs) sell traditional TVs, not that many TV sellers sell HDTVs. So based on the
probabilistic distribution, the broker agent would be more confident to recommend a TV
seller if the request is about traditional TV, while it would be less confident (to
recommend a TV seller) if the request is about HDTV. When computing the
probabilistic distributions, we consider both how many sub classes a service class has,
and the frequency of the advertisements and recommendation requests on that service.
Moreover, the feedback from the consumer agents will also be taken into account.
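The TV example can be sketched numerically. Below, the broker's confidence for a request on the parent service class is taken to be just the relative frequency of the requested sub-class; this is a deliberate simplification of the scheme described, which also weighs sub-class counts and consumer feedback. The counts are the ones quoted in the text.

```python
# Estimating match confidence from the observed frequency of each
# sub-service class of a parent service (simplified).

def match_confidence(request_class, observed_counts):
    """Relative frequency of request_class among all observed
    advertisements/requests for the parent service."""
    total = sum(observed_counts.values())
    return observed_counts[request_class] / total

# Counts from the example: 85% traditional, 8% HD-ready, 7% HDTV.
counts = {"traditional": 85, "hd_ready": 8, "hdtv": 7}
print(match_confidence("traditional", counts))  # -> 0.85
print(match_confidence("hdtv", counts))         # -> 0.07
```

As the distribution drifts over time (30%/20%/50% in the five-year scenario), recomputing these frequencies is what lets the broker change its matching criteria accordingly.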
In large, heterogeneous agent systems, while exact service matches are always
desirable (as long as you know what you want), it's not always possible to find exact
matches. Therefore, it's important for the broker agent to learn the probabilistic
distribution of the services so as to identify the partial matches that have higher
probability of success.

3 Discussions

This paper presents some preliminary concepts and plans for an adaptive service broker
which learns and refines a model of a service provider's performance. Although we
have touched on a number of issues, significant additional issues remain, as does a
concrete implementation. The related issues not addressed here include (but are not
limited to) the security issue, the privacy issue, the fairness issue, and the ontology
issue. We believe that the security issue and the privacy issue are orthogonal to what
we've discussed here. The fairness issue is more closely related. Though we believe that
in general the agent consumer reports provide a basis for better service matching, the
ratings on specific services may not always be "accurate"; the evaluation of "accuracy"
itself is already a big issue. One (partial) solution we have in mind is for the broker
agent to always return an ordered list of service provider agents, instead of only the
best one(s). As for the ontology issue: what if the agents share only a limited subset of
an ontology, or use different ontologies altogether? This issue is somewhat orthogonal,
but not cleanly so. Employing ontology translation or ontology negotiation might help.
One of the ideas behind this work is the law of locality. The approach proposed
here is meant to capture both temporal locality (e.g., the distribution may change
over time) and spatial locality (e.g., a subset of the services may get referenced
frequently).
We will develop a prototype implementation of a system which is partly based on
the LARKS framework. We will incorporate new ideas which are evolving from the
semantic web [Berners-Lee, et al. 2001] and the DAML [DAML, 2000] language in
particular. Some initial work has been done to explore how DAML can be used to
represent and reason about web services and agent services [DAML-S 2001, McIlraith
and Zeng 2001].

References

1. [Cohen, et al., 1992] Cohen, W., Borgida, A. and Hirsh, H. Computing Least Common Subsumers in Description Logics. Proceedings of the National Conference on Artificial Intelligence (AAAI-92), pp. 754-760, 1992.
2. [Decker, et al., 1996] Decker, K., Sycara, K. and Williamson, M. Modeling Information Agents: Advertisements, Organizational Roles, and Dynamic Behavior. Working Notes of the AAAI-96 Workshop on Agent Modeling, AAAI Report WS-96-02, 1996.
3. [Dellarocas, 2000] Dellarocas, C. Immunizing online reputation reporting systems against unfair ratings and discriminatory behavior. Proceedings of the 2nd ACM Conference on Electronic Commerce, Minneapolis, MN, October 17-20, 2000.
4. [Genesereth & Singh, 1993] Genesereth, M. R. and Singh, N. P. A Knowledge Sharing Approach to Software Interoperation. Stanford Logic Group Report Logic-93-12.
5. [Gruber, 1993] Gruber, T. R. A Translation Approach to Portable Ontologies. Knowledge Acquisition, 5(2):199-220, 1993.
6. [Michalski, et al., ????] Michalski, R. S., Carbonell, J. G. and Mitchell, T. M. Machine Learning, An Artificial Intelligence Approach. Tioga Publishing Company.
7. [Mui, 2001] Mui, L., Szolovitz, P. and Wang, C. Sanctioning: Applications in Restaurant Recommendations based on Reputation. Proceedings of the Fifth International Conference on Autonomous Agents, Montreal, May 2001.
8. [Sycara, et al., 1998] Sycara, K., Lu, J. and Klusch, M. Interoperability among Heterogeneous Software Agents on the Internet. CMU-RI-TR-98-22.
9. [Berners-Lee, et al., 2001] Tim Berners-Lee, James Hendler and Ora Lassila. The Semantic Web. Scientific American, May 2001.
10. [Chen, et al., 2001] Harry Chen, Anupam Joshi and Tim Finin. "Dynamic Service Discovery for Mobile Computing: Intelligent Agents Meet Jini in the Aether." The Baltzer Science Journal on Cluster Computing, March 2001 (Volume 3, No. 2).
11. [DAML, 2000] DAML specification, http://www.daml.org/, October 2000.
12. [DAML-S, 2001] DAML-S: A DAML for Web Services. White paper, SRI, http://www.ai.sri.com/daml/services/daml-s.pdf.
13. [Labrou, et al., 2001] Yannis Labrou, Tim Finin, Benjamin Grosof and Yun Peng. Agent Communication Languages. In Handbook of Agent Technology, Jeff Bradshaw, ed., MIT/AAAI Press, 2001.
14. [McIlraith and Zeng, 2001] McIlraith, S., Son, T.C. and Zeng, H. "Semantic Web Services". IEEE Intelligent Systems, Special Issue on the Semantic Web. To appear, 2001.
15. [WSDL, 2001] Web Services Description Language (WSDL) 1.1, January 23, 2001, Microsoft Corporation, http://msdn.microsoft.com/xml/general/wsdl.asp.
Logical Formalizations Built on Game-Theoretic
Argument about Commitments
Lamber Royakkers and Vincent Buskens *

Abstract
The formalization of commitment is a topic of continuing interest in
Artificial Intelligence (AI)'s understanding of human cooperative activity
and organization. Such formalizations are crucial for clarifying rational
behavior. AI research on commitments, however, has been focusing on
describing systems of agents, neglecting the individual incentives to per-
form certain actions. We argue in this paper that an understanding of a
system of agents needs to incorporate not only a logical system of possi-
ble actions, but also an incentive structure related to the actions and the
interdependence of agents involved in interactions between more agents.
As an example we will discuss the use of commitments in interactions be-
tween two agents. By adding game-theoretic reasoning, we will not only
be able to describe different commitment systems in various (legal) set-
tings, but we can also determine whether or not such commitment system
is expected to be socially efficient, desirable, and able to influence human
behavior.

1 Introduction
Many social interactions between two (or more) agents demand for various rea-
sons the use of commitments to reach socially efficient or avoid socially inefficient
outcomes. We will start with an example. Assume you want to write an arti-
cle together with a colleague. You are both convinced that joining forces will
produce a better product than writing two articles separately. However, you as
well as your colleague cannot be sure that the other will actually invest his fair
share in this joint project (cooperate). Still, if both of you work hard, you will
both be satisfied. You realize that if the colleague sits back (defects) while you
do the job, he is even better off and you would have preferred to write an article
alone. Clearly, your colleague also fears that you sit back and profit from his
effort.
* Supported by a grant from the Niels Stensen Foundation and by a grant from the
Netherlands Organization for Scientific Research (NWO), email: l.m.m.royakkers@tm.tue.nl,
v.buskens@fss.uu.nl.


                          Agent 2
                   Defect      Cooperate
 Agent 1  Defect     2,2          4,1
          Cooperate  1,4          3,3

Figure 1: Strategic form of the Prisoner's Dilemma Game

The "game" described above (without commitments) is called a Prisoner's
Dilemma Game [3]. In strategic form,1 the game is shown in Figure 1. The
values in the cells of the matrix indicate the payoffs for each agent related to a
combination of actions of the two agents. The expected action in this game is
"defect" by both agents, because independent of the action of the other agent,
each agent is better off by defecting. Consequently, both agents receive 2 instead
of 3, which they could obtain if they both would cooperate. Thus, the expected
outcome (2,2) is socially inefficient. However, by committing to cooperation,
e.g., by mutually informing the responsible professor, who can impose sanctions
on the researcher who does not work on the joint paper, cooperation becomes
the best option for both agents. Hence, a mutual commitment leads to a better
outcome for both agents in this situation.
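The dominance argument above can be checked mechanically. Below is a minimal Python sketch (the action names and dictionary encoding are our own, not the paper's) verifying that "defect" is a best response to every action of the other agent, even though mutual cooperation pays more:

```python
# Payoffs from Figure 1: payoffs[(a1, a2)] = (payoff to agent 1, payoff to agent 2).
payoffs = {
    ("defect", "defect"): (2, 2),
    ("defect", "cooperate"): (4, 1),
    ("cooperate", "defect"): (1, 4),
    ("cooperate", "cooperate"): (3, 3),
}
ACTIONS = ("defect", "cooperate")

def best_response(agent, other_action):
    """Best action for `agent` (1 or 2) given the other agent's action."""
    if agent == 1:
        return max(ACTIONS, key=lambda a: payoffs[(a, other_action)][0])
    return max(ACTIONS, key=lambda a: payoffs[(other_action, a)][1])

# "Defect" is dominant: it is the best response to every action of the other agent.
assert all(best_response(1, b) == "defect" for b in ACTIONS)
assert all(best_response(2, a) == "defect" for a in ACTIONS)

# Yet (cooperate, cooperate) Pareto-dominates the (defect, defect) equilibrium.
assert all(c > d for c, d in zip(payoffs[("cooperate", "cooperate")],
                                 payoffs[("defect", "defect")]))
```

A mutual commitment changes the effective payoffs (sanctions lower the value of defecting), which is exactly what removes the dominance of "defect".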
If we want to represent such a simple interaction in a logical system, only
the possible actions are described. Commitment is then introduced as an ele-
mentary proposition. This implies that the commitment is a fact that does or
does not occur. More sophisticated theories [2, 4] describe a formalization of
motivational attitudes such as intentions, goals, and wishes that explain why
agents behave the way they do. However, within the logical systems there is
nothing that drives the motivational attitudes. It is only stated that if certain
attitudes are present, commitments are used without explicit reasoning why and
when a certain attitude leads to a commitment. For example, in organization
theories of Distributed Artificial Intelligence (DAI), negotiation systems, and
cooperative software agents, the notion of commitment is used as a mediator of
the transformation of the collective activity to the agents expressing issues such
as delegation, adaptation, intention, responsibility, etc., which constitutes the
theory of collective activity in a narrower way (cf. [1]). We use the primitive
notions of intention, knowledge, and goal to define formally social commitment,
inspired by Castelfranchi [1, 2]:

COMM(i, j, r) =def INT(i, r) ∧ Kj(INT(i, r)) ∧ GOAL(j, ACHIEVE(i, r)),   (1)

where Kj(φ) stands for the fact that agent j knows φ, and is interpreted in
the Kripke-style possible-worlds semantics. The definition means that agent i is
committed to agent j to achieve task r if and only if agent i has the intention to
do that, agent j knows this, and agent j is interested in i fulfilling i's intention.
1 For all basic game-theoretic terminology and aspects we refer the reader to [6].

The last condition can be seen as a goal adoption: the achievement of the task
is a goal of j.
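Definition (1) can also be rendered executable. The following is an illustrative Python sketch with our own assumed encoding of attitudes as tagged tuples (the paper itself works in a Kripke-style modal logic, not in extensional sets): COMM(i, j, r) holds exactly when the three conjuncts hold.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    intentions: set = field(default_factory=set)  # tasks the agent intends to achieve
    knowledge: set = field(default_factory=set)   # facts the agent knows
    goals: set = field(default_factory=set)       # states the agent wants achieved

def comm(i, j, task):
    """COMM(i, j, r): INT(i, r) and K_j(INT(i, r)) and GOAL(j, ACHIEVE(i, r))."""
    return (task in i.intentions                        # INT(i, r)
            and ("INT", i.name, task) in j.knowledge    # K_j(INT(i, r))
            and ("ACHIEVE", i.name, task) in j.goals)   # GOAL(j, ACHIEVE(i, r))

i = Agent("agent1", intentions={"write-section"})
j = Agent("agent2",
          knowledge={("INT", "agent1", "write-section")},
          goals={("ACHIEVE", "agent1", "write-section")})
assert comm(i, j, "write-section")
assert not comm(j, i, "write-section")  # j has no intention of its own here
```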
In game theory, motivational attitudes are represented by the payoffs agents
receive at the end of an interaction, based on their combination of actions. The
situation discussed above is only one example of a situation in which a commit-
ment can change the expected outcome for an interaction between two agents.
Likewise, the usefulness of commitment systems can be investigated for many
social and legal interactions. For now, we will give a very informal description
of what we mean by a commitment in this paper. Later we will become more
precise and we will show that there are various types of commitments.

Definition 1 A commitment is an action by an agent before an interaction
with other agents that signals to the other agents the intention to perform a
particular action later on in the interaction.

We restrict ourselves in this paper to commitments that ensure that the agent
who commits to a certain action will execute this action (binding commitments).

2 Adding Game Theory


Our main criticism of logical systems is that they do not explain but only de-
scribe actions by agents, probably including the use of commitments. Logical
systems fail to explain why commitments are used in some situations and not
in others. Logical systems cannot distinguish which commitment is or is not
credible in a given interaction. The reason is that logical systems generally ne-
glect the incentives related to various combinations of actions and the strategic
interdependence between different agents.
Besides explaining the use and effectiveness of commitments, game theory
can help to distinguish between different types of commitments. As an illustra-
tion, we consider games in which two agents have each two possible actions and
preferences over the four possible outcomes are strictly ordered for both agents.
Because only the ordering of the payoffs is important for the analyses, they can
be labeled as 1, 2, 3, and 4.2 Rapoport, Guyer, and Gordon [5] show that there
exist 78 distinct 2 x 2 games with strictly ordered payoffs.3 Each of the four
outcomes represents a possible goal state for the agents. The goal states for the
two agents do not need to coincide.
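The count of 78 can be reproduced by brute force. In the sketch below (our own code, not taken from [5]), a game is four payoff cells, each agent's payoffs being a permutation of 1..4, and two games are identified when one arises from the other by changing rows, columns, or person labels:

```python
from itertools import permutations

# A game is a tuple of 4 cells in row-major order (TL, TR, BL, BR);
# each cell is (payoff to agent 1, payoff to agent 2).

def row_swap(g):
    return (g[2], g[3], g[0], g[1])

def col_swap(g):
    return (g[1], g[0], g[3], g[2])

def agent_swap(g):
    # Relabel the agents: transpose the matrix and swap payoffs in each cell.
    t = (g[0], g[2], g[1], g[3])
    return tuple((p2, p1) for (p1, p2) in t)

def canonical(g):
    # Minimum over the 8 symmetry images of the game.
    images = set()
    for h in (g, agent_swap(g)):
        for h2 in (h, row_swap(h)):
            images.add(h2)
            images.add(col_swap(h2))
    return min(images)

distinct = {canonical(tuple(zip(p1, p2)))
            for p1 in permutations((1, 2, 3, 4))
            for p2 in permutations((1, 2, 3, 4))}
print(len(distinct))  # 78, matching Rapoport, Guyer, and Gordon
```

Burnside's lemma gives the same count: of the 576 payoff assignments, 24 are fixed by each of two of the eight symmetries and none by the rest, so (576 + 24 + 24)/8 = 78.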
For considering commitments, we classify these 78 games in eight groups.
Figure 2 presents the matrices for one representative of each group. In these
games, agent 1 chooses between T(op) and B(ottom), while agent 2 chooses
between L(eft) and R(ight).
2 The example of the introduction is not included in this set of games, because in this
example the four possible outcomes are not strictly ordered. Including games for which the
outcomes are not strictly ordered complicates the analysis considerably.
3 Two games are considered the same if the one can be constructed from the other by
changing rows, columns, or person labels.

      L    R          L    R          L    R          L    R
 T   4,4  3,3     T  2,4  4,1     T  3,3  1,4     T  2,4  4,1
 B   2,2  1,1     B  3,2  1,3     B  4,1  2,2     B  1,2  3,3
       (1)              (2)             (3)             (4)
      L    R          L    R          L    R          L    R
 T   2,3  4,1     T  3,4  2,1     T  2,4  3,1     T  3,3  2,4
 B   1,2  3,4     B  1,2  4,3     B  1,2  4,3     B  4,2  1,1
       (5)              (6)             (7)             (8)
Figure 2: Representative examples of 2 x 2 games with strictly ordered outcomes

Examples (1) and (2) illustrate two situations in which both agents do not
want or need to commit to any of the two actions. Example (1) represents
a group of 58 games in which at least one of the two agents has a dominant
strategy.4 The other agent optimizes her payoff given the dominant strategy of
the first agent, and both agents cannot do better using a commitment for some
other strategy.5 Example (2) represents 4 games in which none of the agents
has a dominant strategy and there exists only one (mixed) equilibrium, in which
the agents randomly choose between the two options. Their expected payoffs lie
between 2 and 3. If one agent would commit, she would not obtain more than 2.6
For examples (1) and (2) it is impossible to formalize a commitment that affects
the behavior of the agents: any commitment the agents want to make leads
to the same behavior as they would execute if there were no commitment.
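The claim that the expected payoffs in example (2) lie between 2 and 3 can be verified by solving the indifference conditions of the mixed equilibrium. The sketch below (our own payoff encoding) does so with exact rational arithmetic:

```python
from fractions import Fraction

# Game (2) of Figure 2: rows T/B for agent 1, columns L/R for agent 2.
# payoff[(row, col)] = (u1, u2)
payoff = {("T", "L"): (2, 4), ("T", "R"): (4, 1),
          ("B", "L"): (3, 2), ("B", "R"): (1, 3)}

# In a mixed equilibrium each agent randomizes so that the OTHER is indifferent.
# Let q = P(agent 2 plays L): solve u1(T) = u1(B), i.e. 2q + 4(1-q) = 3q + (1-q).
q = Fraction(3, 4)
u1 = q * payoff[("T", "L")][0] + (1 - q) * payoff[("T", "R")][0]
assert u1 == q * payoff[("B", "L")][0] + (1 - q) * payoff[("B", "R")][0]

# Let p = P(agent 1 plays T): solve u2(L) = u2(R), i.e. 4p + 2(1-p) = p + 3(1-p).
p = Fraction(1, 4)
u2 = p * payoff[("T", "L")][1] + (1 - p) * payoff[("B", "L")][1]
assert u2 == p * payoff[("T", "R")][1] + (1 - p) * payoff[("B", "R")][1]

print(u1, u2)  # 5/2 and 5/2: both expected payoffs lie between 2 and 3
```

Committing to a pure action in this game would hand the other agent a best response that caps the committer's payoff at 2, which is why neither agent wants to commit.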
Example (3) is the Prisoner's Dilemma game. This is a very special game.
In this game, the game-theoretic solution predicts that both agents obtain 2,
while they both would prefer to obtain 3. However, this would imply that
both agents have to deviate from their dominant strategy. Consequently, the
only commitment arrangement that can work in this game is one in which both
agents commit to not playing the dominant strategy. No agent wants to commit
unilaterally to Top or Left, respectively, because then the other agent certainly
plays the dominant strategy leaving the first agent with the worst outcome
possible. This can formally be expressed as follows:

COMM(1, 2, Top) ∧ COMM(2, 1, Left),   (2)

implying that agent 1 commits to playing Top and agent 2 to playing Left,
which leads to the goal state (3,3). This bilateral commitment can be seen as a
special case of a collective commitment.
Example (4) is also a unique game. In this game, agent 1 wants to commit
to playing Bottom, which would result in a payoff 3 for both agents. However,
agent 2 prefers to play the game without commitment, which leads to a payoff 4
4 An agent has a dominant strategy if there is one action the agent can perform that gives
her a higher payoff for each of the actions the other agent can perform.
5 Readers interested in the precise classification of all the games can contact the authors
for an overview.
6 A (Nash) equilibrium is (loosely) an outcome in which none of the agents wants to change
her action given the action of the other agent.

for her. This shows that definition (1) is too restrictive to incorporate some kinds
of commitments. It requires that the commitment of one agent contributes to
a goal of the other agent. This presupposes that both agents have the same
goal state. However, example (4) illustrates a situation in which (3,3) is the goal
state of agent 1 while (2,4) is the goal state of agent 2. Moreover, without
commitment the outcome will be (2,4). Consequently, agent 1 wants to commit
to play Bottom. Because this is not the goal state of agent 2, such a commitment
does not follow the definition (1). However, an alternative definition:

COMM'(i, j, r) =def INT(i, r) ∧ Kj(INT(i, r)),   (3)

formalizes a unilateral commitment that does not need to lead to the goal state
of agent j. This definition drops the requirement that there be an agreement between
the agents about whether or not the commitment can be made.7
Example (5) represents a group of 8 games, in which both agents agree that
one agent should commit. Without commitment they both obtain less compared
to the situation that one agent commits. In example (5), agent 1 has to commit
to play Bottom. Example (6) represents 3 games, which could also be called
"coordination" games. In these games, there are multiple equilibria, and both
agents want to coordinate on one of the equilibria, but without a commitment
they do not have a clue about what the other agent will choose. In these games,
the agent who commits first is best off, and the other agent is better off than if
there were no commitment, although she would have preferred to be the
one who committed herself. Note that in these games, a two-sided commitment
does not work if, for example, agent 1 commits to Bottom and agent 2 commits
to Left. The definition (1) is a suitable formalization for a commitment that
leads to a socially efficient outcome in example (5) and example (6). However,
for example (6), there is a complication because both agents might commit,
but they should not commit simultaneously. Therefore, a suitable commitment
system should prescribe which agent is allowed to commit. Both agents want to
commit because the committed agent receives 4, while the other agent receives 3.
The system can be formalized by the convention:

(COMM(1, 2, Bottom) ∨ COMM(2, 1, Left)) ∧
¬(COMM(1, 2, Bottom) ∧ COMM(2, 1, Left)).   (4)

Example (7) looks very much the same as example (6). The only difference is
that agent 1 prefers to play the game without a commitment, rather than that
agent 2 commits to playing Left, while this is the best solution for agent 2. On
the other hand, both agents prefer to play the game while agent 1 commits to
playing Bottom over playing the game without a commitment. There are two
7 For example, a car driver will stop for somebody who started crossing the road, although
the car driver would have preferred to continue driving while the other person waited at the
sidewalk. In this example, starting to cross the road is the commitment signaling the intention
of the pedestrian to go first.

games with this structure. This analysis suggests that COMM(1, 2, Bottom) is
the preferred formalization of a commitment in this situation.
Finally, example (8) is a unique example in which different commitment
systems lead to three different solutions. If the agents can commit unilaterally,
agent 1 commits to playing Bottom, while agent 2 commits to playing Right.
The one who commits first obtains 4, while the other, who has to follow, obtains 2.
However, if they can agree on committing to play Top and Left, they both
obtain 3, which is still better than playing without a commitment, because the
expected outcome for both agents is then somewhere between 2 and 3. The
socially efficient outcome (3,3) can only be reached with a bilateral commitment,
expressed by formula (2).
What we learn from this classification of simple 2 x 2 games is that the
definition of a social commitment provided in logical systems leaves too many
essential dimensions of a commitment unspecified. If the commitment has to be
agreed upon by the non-committed agent, the committed agent will commit in
other situations than if the committed agent can unilaterally commit, which is
neglected in existing logical formalizations. Therefore, we introduced another
type of commitment using the operator COMM', which does not require that
the intended action of the committed agent contributes to the goal of the other
agent. It might be crucial whether one or both agents have an option to commit
to a move and in which order the agents obtain the opportunity to commit. In
game-theoretic terms, these options can be formalized by adding moves to the
game that implement the possibilities for the agents to commit and, eventually,
to accept the commitment of the other agent. These moves might be specified
simultaneously or sequentially. Using game-theoretic reasoning, solutions
of these extended games can be calculated, which provides predictions about
whether or not commitments will be used and what the consequences of these
commitments are, depending on the chosen commitment system. As a result,
insights are obtained about whether a commitment system is socially efficient or
favors one of the two agents.

References
[1] Castelfranchi, C., Commitments: From individual intentions to groups and or-
ganizations, in: V. Lesser (ed.), Proceedings First International Conference on
Multi-Agent Systems, AAAI Press and MIT Press, San Francisco, 41-48, 1995.
[2] Dunin-Keplicz, B., and R. Verbrugge, Collective commitments, in: M. Tokoro
(ed.), Proceedings Second International Conference on Multi-Agent Systems,
AAAI Press, San Francisco, 56-63, 1996.
[3] Luce, R.D. and H. Raiffa, Games and Decisions, Wiley, New York, 1957.
[4] Meyer, J.-J.Ch., W. van der Hoek and B. van Linder, A logical approach to the
dynamics of commitments, Artificial Intelligence 113, 1-40, 1999.
[5] Rapoport, A., M.J. Guyer, and D.G. Gordon, The 2x2 Game, University of
Michigan Press, Ann Arbor, MI, 1976.
[6] Rasmusen, E., Games and Information: An Introduction to Game Theory (2nd ed.),
Blackwell, Oxford, 1994.
ASYNCHRONOUS CONSISTENCY MAINTENANCE

MARIUS-CALIN SILAGHI, DJAMILA SAM-HAROUD, AND BOI FALTINGS


EPFL, CH-1015, Switzerland
{Marius.Silaghi, Djamila.Haroud, Boi.Faltings}@epfl.ch

Maintaining local consistency during backtrack search is one of the most powerful techniques
for solving centralized constraint satisfaction problems (CSPs). Yet, no work has been re-
ported on such a combination in asynchronous settings. The difficulty in this case is that, in the
usual algorithms, the instantiation and consistency enforcement steps must alternate sequen-
tially. When brought to a distributed setting, a similar approach forces the search algorithm
to be synchronous in order to benefit from consistency maintenance. Asynchronism1,2 is
highly desirable since it increases parallelism and makes the solving process robust against
timing variations. This paper shows how an asynchronous algorithm for maintaining consis-
tency during distributed search can be designed. The proposed algorithm is complete and has
polynomial-space complexity. Experimental evaluations show that it brings substantial gains in
computational power compared with existing asynchronous algorithms.

1 Introduction

A constraint satisfaction problem (CSP) is defined as a set of variables taking their
values in particular domains and subject to constraints that specify consistent value
combinations. Distributed constraint satisfaction problems (DisCSPs) arise when the
constraints or variables come from a set of independent but communicating agents.
The most successful centralized algorithms for solving CSPs combine search with
local consistency. The local consistency algorithms prune from the domains of vari-
ables the values that are locally inconsistent with the constraints, hence reducing
the search effort. When a DisCSP is solved by search using a distributed network
of agents, it is desirable that this search exploits asynchronism as much as possi-
ble. Asynchronism gives the agents more freedom in the way they can contribute
to search. It also increases both parallelism and robustness. In particular, robust-
ness is improved by the fact that the search can still detect unsatisfiability even in
the presence of crashed agents. The existing work on asynchronous algorithms for
distributed CSPs has focused on one of the following types of asynchronism:

a) deciding instantiations of variables by distinct agents. The agents can propose
different instantiations asynchronously.

b) enforcing consistency. The distributed process of achieving "local" consistency
on the global problem is asynchronous (e.g. Distributed Arc Consistency 3 ).

We show how these techniques can be combined without losing asynchronism.


Figure 1. Distributed search trees: simultaneous views of distributed search seen by A2, A3, and A4,
respectively. Each arc corresponds to a proposal from Aj-1 to Aj.

2 Preliminaries

Asynchronous search In this paper we target problems with finite domains. We

consider that each agent Ai wants to satisfy a local CSP, CSP(Ai). The agents may
keep their constraints private but publish their interest on variables. The technique
we propose builds on Asynchronous Aggregation Search (AAS), a general complete
protocol for solving distributed CSPs with polynomial space requirements 2 . AAS
is an extension of Asynchronous Backtracking (ABT) and allows for asynchronism
of type a. AAS uses a strict order on agents. We assume that Aj has the position
j, j ≥ 1. If j > k, we say that Aj has a lower priority than Ak. Aj is then a successor
of Ak, and Ak a predecessor of Aj.
Asynchronous distributed consistency The centralized local-consistency al-
gorithms prune from the domain of variables the values that are locally inconsistent
with the constraints. Their distributed counterparts (e.g. 3 ) work by exchanging mes-
sages on value elimination. The restricted domains resulting from such a pruning are
called labels. In this paper we will only consider the local consistency algorithms
which work on labels for individual variables (e.g. arc-, bound-consistency). Let P
be a distributed CSP with the agents Ai, i∈{1..n}. We denote by C(P) the CSP
defined by ∪i∈{1..n} CSP(Ai). Let A be a centralized local consistency algorithm as
just mentioned. We denote by DC(A) a distributed consistency algorithm that com-
putes, by exchanging value eliminations, the same labels for P as A for C(P). When
DC(A) is run on P, we say that P becomes DC(A) consistent.
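For concreteness, the kind of label pruning A performs in the centralized case can be sketched as an AC-3 style arc-consistency pass. This is our own illustrative code, not the distributed DC(A) protocol itself: values without support in a neighboring domain are removed, and arcs whose support may have vanished are revisited.

```python
from collections import deque

def arc_consistency(domains, constraints):
    """AC-3 style pruning.
    domains: {var: set of values}; constraints: {(x, y): predicate(vx, vy)}.
    Returns the pruned domains (the 'labels')."""
    def check(x, vx, y, vy):
        if (x, y) in constraints:
            return constraints[(x, y)](vx, vy)
        return constraints[(y, x)](vy, vx)

    arcs = deque()
    for (x, y) in constraints:
        arcs.append((x, y))
        arcs.append((y, x))
    while arcs:
        x, y = arcs.popleft()
        # Remove values of x that have no support in the domain of y.
        removed = {vx for vx in domains[x]
                   if not any(check(x, vx, y, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            # Revisit arcs pointing at x, whose support may have vanished.
            for (a, b) in list(constraints):
                if b == x and a != y:
                    arcs.append((a, x))
                if a == x and b != y:
                    arcs.append((b, x))
    return domains

# Example: x < y with both domains {1, 2, 3} prunes to x in {1, 2}, y in {2, 3}.
labels = arc_consistency({"x": {1, 2, 3}, "y": {1, 2, 3}},
                         {("x", "y"): lambda vx, vy: vx < vy})
assert labels == {"x": {1, 2}, "y": {2, 3}}
```

In the distributed setting, the value eliminations computed by such a pass are exactly what the agents exchange as messages.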

3 Asynchronous consistency maintenance

In distributed search, each agent has its own perception of the distributed search tree.
It is determined by the proposals received from its predecessors. Figure 1 shows
a simultaneous view of three agents. Only A2 knows the fourth proposal of A1. A3
has not yet received the third proposal of A2 consistent with the third proposal of A1.
However, A4 knows that proposal of A2. Suppose that A4 has not received anything
valid from A3; A4 will then assume that A3 agrees with A2.

refers to the depth in the (distributed) search tree viewed by an agent. We show that
Ai can then benefit from the value eliminations resulting by local consistency from
the proposals of subsets of its predecessors, as soon as they are available.

4 The DMAC protocol

This section presents DMAC (Distributed Maintaining Asynchronous Consistency),
a complete protocol for maintaining asynchronous consistency, built on AAS.
Definition 1 (Aggregate) An aggregate is a triplet (xj, sj, hj) where xj is a vari-
able, sj a set of values for xj, sj ≠ ∅, and hj a history of the pair (xj, sj).
The history guarantees a correct message ordering. Let a1 = (xj, sj, hj) and a2 =
(xj, s'j, h'j) be two aggregates for the variable xj. a1 is newer than a2 if hj is more
recent than h'j. The ordering of histories is described in full detail in 4. The newest
aggregates received by an agent Ai define its view, view(Ai). An aggregate-set is a
set of aggregates. Let V be an aggregate-set and vars(Ai) the variables of CSP(Ai).
Ti(V) will denote the set of tuples directly disabled from CSP(Ai) by V.
Definition 2 (Nogood entailed by the view) V'→¬Ti(V') is a nogood entailed for
Ai by its view V, denoted NVi(V), iff V'⊆V and Ti(V') = Ti(V).
Definition 3 (Explicit nogood) An explicit nogood has the form ¬V, or V→fail,
where V is an aggregate-set.
The information in the received nogoods that is necessary for completeness can be
stored compactly in a polynomial space structure called conflict list nogood.
Definition 4 (Conflict list nogood) A conflict list nogood, denoted by CL, for
Ai has the form V→¬T, where V⊆view(Ai) and T is a set of tuples:
T = {t | t = (x_t1=v_1, ..., x_tn=v_n), ∀k, x_tk ∈ vars(Ai)}, such that T can be repre-
sented by the structures (stack) of a complete centralized backtracking algorithm.
An aggregate with history hx built by Aj for a variable x is valid for an agent
Am, m>j, if no other history known by Am and built by agents Ak, k<j, in some
aggregate for x, is more recent than hx. A nogood containing only valid aggregates
is valid. The AAS protocol is defined by the ok, nogood and addlink messages.
The ok messages have as parameter an aggregate-set, V. They announce proposals
of domains for a set of variables and are sent from agents with lower priorities to
agents with higher priorities. The proposal is sent to all successor agents interested
in it. Let the set of valid aggregates known to the sender Ai be denoted known(Ai),
V⊆known(Ai). Any tuple not in Ti(known(Ai)) must satisfy the local constraints of
the sender Ai and its valid nogoods.a An agent maintains its view and a valid CL
and always enforces its CL and its nogood entailed by the view. nogood messages

a Except for constraints about which Ai knows that a successor enforces them (as in ABT).

announce explicit nogoods. Any received valid explicit nogood is merged into the
maintained CL using an inference technique.

4.1 DMAC
In addition to the messages of AAS, the agents in DMAC may exchange information
about nogoods inferred by DCs. This is done using propagate messages.
Definition 5 (Consistency nogood) A consistency nogood for a level k and a vari-
able x has the form V→(x∈lx) or V→¬(x∈s\lx). V is an aggregate-set and may
contain for x an aggregate (x, s, h), lx⊆s. Any aggregate in V must have been pro-
posed by predecessors of Ak+1. lx is a label, lx ≠ ∅.
Each consistency nogood for a variable x and a level k is tagged with the value
of a counter Cx at the sender and is sent via propagate messages to all interested
agents Ai, i>k. The agents Ai use the most recent proposals of the agents Aj, j<k
when they compute DC consistent labels. Ai may receive valid consistency nogoods
of level k with aggregate-sets for the variables vars, vars not in vars(Ai). Ai must
then send addlink messages to all agents Ak', k'<k not yet linked to Ai for all
vars. In order to achieve consistencies asynchronously, besides the structures of
AAS, implementations can maintain at any agent Ai, for any level k, k<i:

• The aggregate-set, V_k^i, of the newest valid aggregates proposed by agents
Aj, j<k, for each interesting variable.

• For each variable x, x∈vars(Ai), for each agent Aj, j>k, the last consistency
nogood (with highest tag) sent by Aj for level k, denoted cn_x^k(i,j), if it is valid.
It has the form V_{j,x}^k→(x∈s_{j,x}).

Let cn_x^k(i,.) be (∪_j V_{j,x}^k) → (x ∈ ∩_{j>k} s_{j,x}). P_i(k) := CSP(Ai) ∪ (∪_x cn_x^k(i,.)) ∪
NV_i(V_k^i) ∪ CL_k^i. Cx is incremented each time a new cn_x^k(i,i) is stored.
On each modification of P_i(k), cn_x^k(i,i) is recomputed by inference (e.g. using
local consistency techniques) for each variable x of the problem P_i(k). cn_x^k(i,i) is
initialized as an empty constraint set. CL_k^i is the set of all nogoods known by Ai and
having the form V→¬T where V⊆V_k^i and T is a set of tuples in CSP(Ai). CL_k^i may
contain the CL of Ai. An agent can manage to maintain one CL for each instantiation
level and the space requirements do not change. cn_x^k(i,i) is stored and sent to other
agents by propagate messages if and only if any constraint of CSP(Ai) or CL_k^i
was used for its logical inference from P_i(k) and its label shrinks.
We only use DC techniques that terminate (e.g. 3 ). By quiescence of a group of
agents we mean that none of them will receive or generate any valid nogoods, new
valid aggregates, or addlink messages. The proofs are given in 5.

Property 1 ∀i, in finite time t_i, either a solution or failure is detected, or all the
agents Aj, 0<j<i, reach quiescence in a state where they are not refused a proposal
satisfying CSP(Aj) ∪ NV_j(view(Aj)).
Proposition 1 DMAC is correct, complete, and terminates.
Among the consistency nogoods that an agent computes itself at level k from its
constraints, cn_x^k(i,i), let it store only the last valid one for each variable. Let Ai also
store only the last (with highest tag) valid consistency nogood, cn_x^k(i,j), sent to it
for each variable x∈vars(Ai) at each level k from any agent Aj. Then:
Proposition 2 DC(A) labels computed at quiescence at any level using propagate
messages are equivalent to A labels when computed in a centralized manner
on a processor. This is true whenever all the agents reveal consistency nogoods for
all minimal labels, l_x, which they can compute.
Proposition 3 The minimum space an agent needs with DMAC for ensuring main-
tenance of the highest degree of consistency achievable with DC is O(a^2·v(v + d)).
With bound consistency, the required space is O((av)^2).
Additional nogoods can be stored as redundant constraints.

5 Conclusion

Consistency maintenance is one of the most powerful techniques for solving central-
ized CSPs. Bringing similar techniques to an asynchronous setting poses the problem
of how search can be asynchronous when instantiation and consistency enforcement
steps are combined. We present a solution to this problem. A new distributed search
protocol which allows for asynchronously maintaining distributed consistency with
polynomial space complexity is then proposed.

References

1. M. Yokoo, E. H. Durfee, T. Ishida, and K. Kuwabara. The Distributed CSP:
Formalization and algorithms. IEEE Trans. on KDE, 10(5):673-685, 1998.
2. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. Asynchronous search with
aggregations. In Proc. of AAAI2000, pages 917-922, 2000.
3. Y. Zhang and A. K. Mackworth. Parallel and distributed algorithms for finite
constraint satisfaction problems. In Proc. of Third IEEE Symposium on Paral-
lel and Distributed Processing, pages 394-397, 1991.
4. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. ABT with asynchronous re-
ordering. In IAT, 2001.
5. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. Asynchronous consistency
maintenance with reordering. Technical Report #01/360, EPFL, March 2001.
CHAPTER 2

COMPUTATIONAL ARCHITECTURE
AND INFRASTRUCTURE
REASONING ABOUT MUTUAL-BELIEF AMONG MULTIPLE
COOPERATIVE AGENTS

WENPIN JIAO
Department of Computer Science, University of Victoria, Victoria, BC V8W 3P6, Canada
wpjiao@csr.csc.uvic.ca

Believing mutually is an important premise to ensure that cooperation among multiple agents
goes smoothly. However, mutual belief among agents is usually taken for granted. In this
paper, we adopt a method based on the position-exchange principle to reason about mutual
belief among agents. To reason about mutual belief formally, we first use a process algebra
approach, the pi-calculus, to formalize cooperation plans and agents, and then bind the
position-exchange principle into the inference rules. By reasoning about mutual belief among
agents, we can judge whether cooperation among agents can go on rationally or not.

1 Introduction

Cooperation among agents is one of the keys to drawing multiple intelligent systems
together [6]. Cooperation among multiple agents should meet at least three criteria:
1) agents should respond to each other mutually, 2) all agents should make joint
commitments, and 3) each agent should be committed to supporting the interactions [1].
That is, every agent participating in cooperation must believe that the other agents are
honest and will take actions following a specific cooperation plan, and vice versa. In
short, all agents involved in cooperation must believe each other mutually.
Generally, once an agent takes an action, it expects to observe a specific result or
response from others, from which it can conclude whether it can believe others and
whether it is believed by others. If every agent participating in cooperation believes
that it is believed by the others and that the others are believable as well, we say
that those agents believe each other mutually and that the cooperation will proceed
smoothly. However, in a distributed system an agent knows almost nothing about
others; it can only reason about the others' knowledge based on its own. To do so,
an agent has to assume that others think and act in a similar way to itself. In this
paper, we adopt a technique based on the position-exchange principle to reason
about mutual belief between agents.
The position-exchange principle means that one puts oneself in another's position
and judges the other's feelings by one's own. In other words, to reason about
another, one takes the other's point of view and thinks as if one were the other.
For example, to reason about another's knowledge, one may say "If I did it, I believe
that if he were me he would do it under similar circumstances, too." In a logic
system, the position-exchange principle can be described by the following formula:
B_A(α → β) → B_A(B_B(α{B/A} → β{B/A}))

Here B_X(Y) indicates that X believes that Y holds; α{B/A} is a new formula, different
from α, in which all variables related to A are substituted with variables related to B.
The formula means that if A believes that β will hold under condition α, A will believe
as well that B believes the similar conclusion β{B/A} will hold under the similar
condition α{B/A}.
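As a rough illustration (our own sketch, not the paper's formalism), the substitution α{B/A} and the derived nested belief can be modelled as follows. The nested-tuple encoding of formulas and the example predicate names are hypothetical.

```python
# Rough sketch of the position-exchange principle: from A's belief in an
# implication, derive A's belief about B's belief in the substituted
# implication. The nested-tuple encoding of formulas is hypothetical.

def substitute(formula, old_agent, new_agent):
    """Replace every occurrence of old_agent in a nested-tuple formula."""
    if formula == old_agent:
        return new_agent
    if isinstance(formula, tuple):
        return tuple(substitute(f, old_agent, new_agent) for f in formula)
    return formula

def position_exchange(belief, a, b):
    """B_A(alpha -> beta)  =>  B_A(B_B(alpha{B/A} -> beta{B/A}))."""
    believer, (alpha, beta) = belief       # belief = (a, (alpha, beta))
    assert believer == a
    return (a, (b, (substitute(alpha, a, b), substitute(beta, a, b))))

belief = ("A", (("pays", "A"), ("delivers", "A")))
print(position_exchange(belief, "A", "B"))
# -> ('A', ('B', (('pays', 'B'), ('delivers', 'B'))))
```

The outer belief operator stays with A: A does not learn anything new about the world, only a derived belief about B's beliefs.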
When using the position-exchange principle, we need not only to substitute the
variables related to the agents but also to transform the actions associated with
them, since one agent does not know how the other acts. However, in a general logic
framework we cannot reason about actions, so we use a process algebra, the pi-
calculus, to reason about mutual belief among agents. In the pi-calculus, actions
occur in complementary pairs and are mainly used for communication. Thus, when
applying the position-exchange principle, we can reason about the other's belief by
substituting both variables and mutually complementary input/output actions.
In the following sections, we first give the formal framework in section 2. Then
in section 3, we formally describe what an agent, a cooperation plan, and
cooperation look like. In section 4, we define inference rules based on the position-
exchange principle for reasoning about mutual belief among agents, and then use
them to reason about the rationality of specific cooperation among agents. The last
section offers some conclusions.

2 The Formal Framework

In this paper, we adopt a process algebra approach, the pi-calculus [5], to formalize
agents, plans, and cooperation.
In the pi-calculus, there are only two kinds of entities: processes and channels,
where processes are active components of a system and they communicate with each
other through ports (or names) connected via channels. The processes in the pi-
calculus have the following forms:
P ::= Σ_{i∈I} π_i.P_i | P|Q | !P | (νx)P | [x=y]P
π ::= x(y) | x̄y | τ
Here I is a finite index set. Σ_{i∈I} π_i.P_i executes exactly one of the I-indexed
processes; when I = ∅ we write Σ_{i∈I} π_i.P_i as 0, which is inert. x(y) and x̄y
represent, respectively, the input and the output of name y along channel x, whereas
τ represents a silent action. P|Q represents the parallel composition of the two
processes P and Q. !P represents any number of copies of P. (νx)P introduces a new
channel x with scope P, where ν is the restriction operator. [x=y]P means that
process P proceeds only if x and y are the same channel, where [x=y] is the
matching operator.
In the pi-calculus, the computation and the evolution of a process are defined by
reduction rules. The most important reduction relation concerns communication:
x̄y.P | x(z).Q → P | Q{y/z}

It means that after the communication the process reduces to the right-hand form,
and meanwhile all free occurrences of z in Q are substituted with y.
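The communication reduction above can be sketched over a tiny tuple encoding of processes. The encoding is our own and only illustrates the substitution of y for z; it is not a full pi-calculus interpreter.

```python
# Rough sketch of the communication reduction
#     x̄y.P | x(z).Q  ->  P | Q{y/z}
# over a tiny tuple encoding of processes. The encoding is our own and is
# only meant to illustrate the substitution of y for z.

def substitute(term, z, y):
    """Replace free name z by y in a nested-tuple process term."""
    if term == z:
        return y
    if isinstance(term, tuple):
        return tuple(substitute(t, z, y) for t in term)
    return term

def reduce_comm(pair):
    """One communication step between an output and an input on the
    same channel: returns the residual pair (P, Q{y/z})."""
    (kind1, chan1, y, p), (kind2, chan2, z, q) = pair
    assert kind1 == "out" and kind2 == "in" and chan1 == chan2
    return (p, substitute(q, z, y))

left = ("out", "x", "price", "0")          # output price on x, then stop
right = ("in", "x", "z", ("report", "z"))  # input z on x, then use it
print(reduce_comm((left, right)))          # -> ('0', ('report', 'price'))
```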

3 Agents and Their Cooperation

Though an agent is an active entity with pro-activity, it should take actions that
comply with the global cooperation plan. In this section, we first define
cooperation plans formally as pi-calculus processes. We then define agents
formally and show how to bind agents together to perform the cooperation plan.

3.1 Cooperation Plan


A cooperation plan is always composed of a series of tasks, among which there are
specific relationships that coordinate their performance. In general, a cooperation
plan can be viewed as a tree, in which nodes are tasks to be allocated to agents,
and relationships among tasks are mapped to relationships among nodes. A
cooperation plan can be defined recursively as follows.
1. The cooperation plan has a hierarchical structure, which is represented as a tree.
2. Any task corresponds to a node within the plan tree. The global task, P,
corresponding to the global plan, is the root of the plan tree, and Plan =def P.
3. If a task P consists of a set of sub-tasks P_1, P_2, ..., P_n, the node corresponding
to the task has as many child nodes as there are sub-tasks, and
P =def P_1 | P_2 | ... | P_n.
4. Among sibling nodes there are two categories of relations. If there is a
unary relation over P_i, or a binary relation over P_i and P_j (1 ≤ i ≠ j ≤ n), then P_i, P_j,
and P may need to be redefined.
4.1. Unary relation: Repetition. It means that the corresponding task needs to
be performed many times. And P_i =redef !P_i.
4.2. Binary relations. There are four kinds of binary relations between sibling
nodes: serialization, synchronization, sequence, and parallel 1 .
4.2.1. Serialization. It means that the performing order of the two tasks is not
important, but the two tasks cannot be carried out concurrently. And
P_i =redef s_ij.P_i.v̄_ij, P_j =redef s_ij.P_j.v̄_ij, and P =redef (ν s_ij, v_ij)(... | P_i | ... | P_j | ... | S_ij)

1
While defining the plan process, we require that serialization relations be
considered first, and then synchronization and sequence; otherwise, deadlocks may
be introduced into the plan process. For example, consider three sub-processes, P, Q,
R, among which P and Q must be performed serially and R must be carried out
before both P and Q. If we do not follow the above convention, we may get the process
(ν s_pq, v_pq, s_pr, s_qr)(s_pq.s_pr.P.v̄_pq | s_pq.s_qr.Q.v̄_pq | R.s̄_pr.s̄_qr | S_pq), where S_pq =def s̄_pq.v_pq.S_pq. Then, if
Q communicates with S_pq before P has a chance to do so, a deadlock will occur.

Here S_ij =def s̄_ij.v_ij.S_ij acts like a PV semaphore controller in
operating systems.
4.2.2. Synchronization. Two tasks with a synchronization relation must be
performed at the same time. And 2
P_i =redef s̄_ij.P_i, P_j =redef s_ij.P_j, and P =redef (ν s_ij)(... | P_i | ... | P_j | ...)

4.2.3. Sequence. The performing of the two tasks is controlled under a
restricted order, i.e., one must precede the other. And
P_i =redef P_i.s̄_ij, P_j =redef s_ij.P_j, and P =redef (ν s_ij)(... | P_i | ... | P_j | ...)
4.2.4. Parallel. The two tasks can be carried out concurrently. In that case,
the processes need not be redefined.
5. There are no other kinds of nodes or relations within the plan tree except
those defined above.
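The recursive translation above can be sketched as follows (a toy model of our own; the node encoding, channel naming scheme, and the use of `~` to mark output actions are illustrative assumptions, not the paper's notation).

```python
# Rough sketch of translating a plan tree into a process expression,
# following the rules above. The output is just a string in
# pi-calculus-like notation, with x~ standing for the output action x̄.

def translate(node, counter=None):
    """node = ("task", name) | ("repeat", sub)
            | (rel, left, right), rel in {"par", "seq", "sync", "serial"}."""
    if counter is None:
        counter = [0]                          # fresh channel counter
    kind = node[0]
    if kind == "task":
        return node[1]
    if kind == "repeat":                       # 4.1: P_i := !P_i
        return "!" + translate(node[1], counter)
    left, right = (translate(n, counter) for n in node[1:])
    counter[0] += 1
    s = f"s{counter[0]}"
    if kind == "par":                          # 4.2.4: no extra control
        return f"({left} | {right})"
    if kind == "seq":                          # 4.2.3: left signals right
        return f"(v {s})({left}.{s}~ | {s}.{right})"
    if kind == "sync":                         # 4.2.2: joint start signal
        return f"(v {s})({s}~.{left} | {s}.{right})"
    if kind == "serial":                       # 4.2.1: PV-style semaphore
        return (f"(v {s},v{s})({s}.{left}.v{s}~ | "
                f"{s}.{right}.v{s}~ | !{s}~.v{s})")
    raise ValueError(kind)

plan = ("repeat", ("seq", ("task", "Ask"), ("task", "Strike")))
print(translate(plan))   # -> !(v s1)(Ask.s1~ | s1.Strike)
```

Each relation introduces its own fresh control channel, mirroring how the paper adds new communication ports to coordinate sub-processes.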
For example, in an electronic commerce community, a price negotiation procedure
can be planned as the repetition of price bargaining between two parties (figure 1).

[Figure: plan tree of the price negotiation procedure. Node labels include
"Accept the price? Agree/Disagree", "Wait for a stroked price", "Wait for an
asked price", and "Strike a price". Legend: dashed arrowhead arc = unary
repetition relation; ...> = sequence; <-> = serialization; -.- = synchronization.]
Figure 1. The plan tree of a price negotiation procedure

In the plan, the bargaining process, which is divided into the two sub-processes of
price asking and price striking, repeats any number of times until both sides make a
deal. The price-asking process is divided further into two sub-processes: one asks a
price and then the other waits for a stroked price. The price-striking process is also
divided into two sub-processes: one waits for a price and then the other strikes a
price back. Once someone (for instance, the bargaining initiator) thinks the stroked
price is acceptable, it can stop bargaining and make a deal.

2
Synchronization relations are symmetric, so we need only consider the cases
where i < j. Thus deadlocks can be avoided among synchronized nodes.

The plan shown in figure 1 can be expressed in the pi-calculus as follows:
PriceNegotiationPlan = P_0 = (ν δ_0)(!P_1.δ̄_0 | δ_0.P_2)
P_11 = (ν δ_1)(P_111.δ̄_1 | δ_1.P_112),  P_12 = (ν δ_2)(P_121.δ̄_2 | δ_2.P_122)

When representing a cooperation plan as pi-calculus processes, we add some
new communication ports to control the execution of sub-processes so that we can
represent the relationships within a composite process. Generally, when there are
relationships such as serialization, synchronization, and sequence in a system,
deadlocks may occur. Fortunately, by using the procedure described above, we
obtain a deadlock-free plan process provided there is no deadlock in the plan tree.
Proposition 1. If there is no deadlock in the plan tree, the corresponding
composite process of the plan is deadlock free.
The proof is quite simple. As discussed above, we can first eliminate the
possibility of a deadlock arising from serialization and synchronization relations. On
the other hand, no two synchronized processes can simultaneously have sequence
relations with another process, and vice versa. That is to say, sequence relations and
synchronization relations cannot introduce a cyclic waiting chain into the processes
if no cyclic waiting chain occurs in the plan tree. Thus, the translation described
above is deadlock free.

3.2 Agent
In a cooperative environment, an agent must undertake tasks to cooperate with
others by complying with a certain cooperation plan. We can define an agent as an
entity comprising actions, the tasks it undertakes, and behavior specifications
consistent with a specific cooperation plan. To represent the behavior specifications
of an agent, we define an expectation function from actions to actions, indicating
what kind of response the agent expects to receive after it takes an action. An
agent is a 4-tuple
A = <A, T, E, B>
where A is an action set, T is a collection of tasks, E is A's expectations, defined
as a function E: A → A, and B is A's beliefs.
The components of agents can be defined formally in the pi-calculus: the
action set A is a set of pi-calculus actions, the task set T is a collection of pi-calculus
processes, and for any process P ∈ T with P = γ.P', we have γ ∈ A.
Suppose that α, β ∈ A; then E(α) = β means that if the agent A takes action α, it
expects action β to happen. In general, an agent expects something to appear only
when it is waiting for it, so we define an agent's expectations only on its input
actions. Then if E(α) = β, α can be either an input or an output, but β must be an
input action.

For any process P ∈ T, suppose that P has the form
P = ... . α . β . ...
where α is an input/output action and β is an input action. Then E(α) = β.
In addition, suppose that the agent is assigned two tasks within a cooperation
plan, P_1 and P_2, that there is a sequence relation between them, and that
P_1 = (... . α . ...).s̄_12,  P_2 = s_12.(... . β . ...)
where α is an input/output action and β is an input action. Then E(α) = β.
Since each agent has its own actions, tasks, expectations, and beliefs, A, T, E,
and B can be viewed as functions over the domain of agents. In the rest of this
paper, we use A(A), T(A), E(A), and B(A) to denote the action set, the task set, the
expectations, and the beliefs of A, respectively.
In this paper, we consider only beliefs such as whether an agent trusts others
and whether the agent is trusted by others. For convenience, we write x ∈ B as
A > x. Suppose there is a set of agents Ag and A, B ∈ Ag; then A > B means that
A trusts B, whereas A > (B > A) means that A believes that B trusts A as well.
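The extraction of expectations from action sequences described above can be sketched as follows. The `("in"/"out", name)` action encoding and the example names are our own illustrative assumptions.

```python
# Rough sketch of deriving the expectation function E from task
# processes, following the rule above: in a process ....α.β...., if β is
# an input action, then E(α) = β.

def expectations(tasks):
    """tasks: list of action sequences; returns a dict mapping each
    action to the input action that immediately follows it."""
    e = {}
    for actions in tasks:
        for a, b in zip(actions, actions[1:]):
            if b[0] == "in":          # only inputs can be expected
                e[a] = b
    return e

seller = [("out", "ask_price"), ("in", "struck_price")]
print(expectations([seller]))
# -> {('out', 'ask_price'): ('in', 'struck_price')}
```

Output actions followed by other outputs produce no expectation, matching the restriction that expectations are defined only on inputs.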

3.3 Binding Agents into the Cooperation Plan


The cooperation plan is only a cooperation blueprint or specification of tasks; it
does not provide concrete actions or functions to perform those tasks. After
cooperation is planned, the tasks are assigned to the cooperating agents.
For example, if we allocate the tasks shown in figure 1 to a seller agent S
and a buyer agent B, for instance P_0, P_1, P_11, P_111, P_112, P_2, and P_21 to S, and P_12,
P_121, P_122, and P_22 to B, agents S and B can be defined as follows.
S = <A, T, E, B>, where
A = {ā p, ω(x), ō_1 p, o_2(y)}
T = {P_0, P_1, P_11, P_111, P_112, P_2, P_21}
E = {<ā, ω>, <ō_1, o_2>}
B = {}
P_111 = CalculatePrice_S(p).ā p,  P_112 = ω(x),  P_21 = ō_1 p.o_2(y)

B = <A, T, E, B>, where
A = {ω̄ p, a(x), o_1(y), ō_2 t}
T = {P_12, P_121, P_122, P_22}
E = {<τ, a>, <ω̄, a>, <ω̄, o_1>}
B = {}
P_121 = a(x).CalculatePrice_B(p),  P_122 = ω̄ p,  P_22 = o_1(y).ō_2 t

Figure 2. Formal definitions of agents S and B

Here ā and a represent the actions "asking a price" and "waiting for an asked
price" respectively, ω̄ and ω represent "striking a price" and "waiting for a
stroked price", ō_1 asks "Accept the price or not?", and o_2 waits for the
answer. The functions CalculatePrice_S(p) and CalculatePrice_B(p) are used to calculate a
new asking price and a new striking price, respectively.

Agent S's expectations mean that the seller hopes to receive a response after each
round of bargaining and that the buyer will acknowledge any of its questions.
Agent B's expectations mean that the buyer expects the bargaining to be initiated
by someone else, and that after it strikes a price the seller will either ask a new
price or make a deal with it.
To assemble cooperative agents into the cooperation plan, we should connect
the abstract plan specification with those concrete implementations of agents'
functions. In the pi-calculus, we can use the following method to achieve that.
First, we view the tasks occurring in the plan process as pointers and make
those pointers point to the functions provided by the agents. For example, suppose that
P_i is a task in the plan process that has been assigned to agent A, who will undertake
that task by taking action T_a; then we can define the processes
P_i = z̄_i,  A = z_i.T_a
and compose them into the composite process
P_i | A = z̄_i | z_i.T_a
Thus we bind the agent and the plan together.
On the other hand, an agent may undertake several tasks, for instance
T_1, T_2, ..., T_k ∈ T(A); then T(A) can be redefined as a composition of processes:
T(A) = z_1.T_1 | z_2.T_2 | ... | z_k.T_k
Thus, a cooperation system with a cooperation plan, Plan, and a collection of
cooperating agents A_1, A_2, ..., A_n can be defined as
Sys = Plan | T(A_1) | T(A_2) | ... | T(A_n)
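The pointer-based binding above can be modelled roughly as follows (our own toy model, not the paper's calculus): each task name in the plan acts as a pointer z_i wired to the concrete function an agent provides. All class and function names here are illustrative assumptions.

```python
# Rough sketch of binding agents into a plan, in the spirit of
# Sys = Plan | T(A_1) | ... | T(A_n): the plan lists task pointers, and
# each pointer is dispatched to the agent that undertakes it.

class Agent:
    def __init__(self, name):
        self.name = name
        self.tasks = {}                      # pointer z_i -> implementation

    def undertakes(self, pointer, impl):
        self.tasks[pointer] = impl

def run_plan(plan, agents):
    """plan: ordered task pointers; dispatch each to the bound agent."""
    registry = {z: (ag, impl)
                for ag in agents for z, impl in ag.tasks.items()}
    return [f"{registry[z][0].name}:{registry[z][1]()}" for z in plan]

seller, buyer = Agent("S"), Agent("B")
seller.undertakes("z1", lambda: "ask 100")
buyer.undertakes("z3", lambda: "strike 90")
print(run_plan(["z1", "z3"], [seller, buyer]))
# -> ['S:ask 100', 'B:strike 90']
```

The plan remains a pure specification: changing which agent undertakes a pointer requires no change to the plan itself, mirroring the separation between the plan process and the agents' task processes.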

4 Reasoning about Mutual-Belief

In this section, we define inference rules for reasoning about mutual belief
among agents, building the position-exchange principle into their definitions. We
then describe under what conditions agents believe each other mutually.

4.1 Rules on Beliefs


To define rules on beliefs, we should first know which actions are observable to an
agent. To represent that an agent observes an action γ, we use the form A --γ--> A',
with the following meaning:
(P ∈ T(A), α_1, α_2, ..., α_n ∈ A(A), P --α_1.α_2.....α_n--> P', P' --γ--> P'', γ ∈ A(A)) ⟹ A --γ--> A'
Intuitively, if an action is observable in one of the agent's processes, it is also
observable to the agent.
In general, an agent knows nothing about others. To build beliefs about others, it
can rely only on the messages it has sent and received. However, not all messages
it receives are messages it is waiting for or expecting. So, in our definitions of the
rules on beliefs, we include the expectations of agents as premises, and agents
then believe only the things that they are expecting.
Based on the position-exchange principle, an agent can derive beliefs about itself
from the messages it receives, and beliefs about others from the messages it sends.
1. Belief about the honesty of the other.
If the agent receives a message that it is expecting, it will believe that the sending
agent is trustable:
(A --β--> A', ∃α.(α, β) ∈ E(A), β̄ ∈ A(B)) ⟹ A > B   (BR1)
Here α can be an input/output action, whereas β must be an input action.
2. Belief about the other's belief.
Correspondingly, under the position-exchange principle, A will believe that
agent B also trusts it if A responds to B with a message that B requests:
(A --β̄--> A', ∃α.(α, β) ∈ E(B), β ∈ A(B)) ⟹ A > (B > A)   (BR2)
When using the position-exchange principle in the above rule, we do not
substitute all occurrences of A. Instead, we just replace the action β with its
complement β̄, since A may not be clear about how the receiver, B, is evolving.
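Rules BR1 and BR2 can be sketched over a single communication as follows. The `("in"/"out", name)` action encoding and the dictionaries `E` and `A` are our own illustrative assumptions, not the paper's pi-calculus machinery.

```python
# Rough sketch of rules BR1 and BR2. Beliefs are returned as tuples:
# (X, Y) stands for X > Y, and (X, (Y, X)) stands for X > (Y > X).

def br1(agent, other, gamma, E, A):
    """BR1: agent observed input gamma, some (alpha, gamma) is in
    E(agent), and the complementary output is in A(other)
    => agent > other."""
    if any(beta == gamma for _, beta in E[agent]) \
            and ("out", gamma[1]) in A[other]:
        return {(agent, other)}
    return set()

def br2(agent, other, gamma_bar, E, A):
    """BR2: agent sent output gamma_bar, and the matching input is
    expected by (and is an action of) other
    => agent > (other > agent)."""
    gamma = ("in", gamma_bar[1])
    if any(beta == gamma for _, beta in E[other]) and gamma in A[other]:
        return {(agent, (other, agent))}
    return set()

E = {"B": {(("out", "strike"), ("in", "ask"))}}
A = {"S": {("out", "ask")}, "B": {("in", "ask")}}
print(br1("B", "S", ("in", "ask"), E, A))   # -> {('B', 'S')}
print(br2("S", "B", ("out", "ask"), E, A))  # -> {('S', ('B', 'S'))}
```

Both rules hinge on the expectation premise: an unexpected message, even if received, produces no belief.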

4.2 Mutual Belief among Agents


Informally, we say two agents have built mutual belief if both of them trust each
other and each of them believes that its counterpart also trusts it. Mutual belief
can then be defined formally by two groups of beliefs.
1. Both agents believe in their counterpart:
A > B and B > A
2. Each of the two agents believes that its counterpart trusts it as well:
A > (B > A) and B > (A > B)
For a cooperation plan whose tasks are allocated to cooperating agents, if
those agents cannot build mutual belief during cooperation, we say that the
cooperation will not proceed smoothly and is irrational. In other words, building
mutual belief among agents is the least requirement for cooperation.
Definition: At-least-rationality of cooperation. If agents can build mutual
belief during cooperation, we say that the cooperation is at least rational.

4.3 Reasoning about Mutual Belief among Agents - an Example


Consider the example shown in figure 1 again. The complete plan and parts of
agents S and B are redefined as follows:
Plan = !(((ν δ_1)(z̄_1.δ̄_1 | δ_1.z̄_2) | (ν δ_2)(z̄_3.δ̄_2 | δ_2.z̄_4)).δ̄_0) | δ_0.(ν δ_3)(z̄_5.δ̄_3 | δ_3.z̄_6)
T(S) = z_1.P_111 | z_2.P_112 | z_5.P_21, and T(B) = z_3.P_121 | z_4.P_122 | z_6.P_22
The procedure for reasoning about mutual belief between S and B can then
proceed while the computation between S and B is going on.
1. S calculates an asking price and sends it to B, and then waits for B's
response. On the other side, B is waiting for S to ask a new price. If B
receives the message from S, i.e., B observes action a(x), then by rule BR1:
B --a--> B', (τ, a) ∈ E(B), ā ∈ A(S), so B > S
2. Once B receives an asking price, it calculates a new price for striking
and sends it back to S. In that case, by rule BR2:
B --ω̄--> B', (ā, ω) ∈ E(S), ω ∈ A(S), so B > (S > B)
On the other side, for S, by rule BR1:
S --ω--> S', (ā, ω) ∈ E(S), ω̄ ∈ A(B), so S > B
3. By now, B believes that S is trustable and that it itself is trusted by S.
However, S is not certain whether it is trusted by B, though it has trusted
B. If the cooperation stopped now, it would be incomplete, since the two
agents have not built mutual belief. Nevertheless, according to the
cooperation plan, agent S has two choices for its succeeding actions.
3.1. Continue by suggesting another asking price to B. Then by rule BR2:
S --ā--> S', (ω̄, a) ∈ E(B), a ∈ A(B), so S > (B > S)
3.2. Or stop bargaining and make a deal with B. Similarly to 3.1:
S --ō_1--> S', (ω̄, o_1) ∈ E(B), o_1 ∈ A(B), so S > (B > S)
Now, although the computation between S and B is not finished, mutual belief
has been built between them. If we reason further, we can only strengthen the
mutual belief. Thus we can say the cooperation between S and B is rational.
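The bargaining trace above can be replayed as a small belief-derivation run: each message that the receiver expects yields a BR1 fact at the receiver and a BR2 fact at the sender. The expectation table and event encoding below are our own simplifications of the example, which in particular collapse the two rule applications onto each message event.

```python
# Rough replay of the example: messages that match the receiver's
# expectations generate trust facts until mutual belief is reached.

E = {"S": {("ask", "strike"), ("offer_deal", "answer")},
     "B": {("start", "ask"), ("strike", "ask"), ("strike", "offer_deal")}}

def derive(trace):
    """trace: list of (sender, receiver, message) events."""
    beliefs = set()
    for sender, receiver, msg in trace:
        if any(beta == msg for _, beta in E[receiver]):
            beliefs.add(f"{receiver} > {sender}")                # BR1
            beliefs.add(f"{sender} > ({receiver} > {sender})")   # BR2
    return beliefs

trace = [("S", "B", "ask"), ("B", "S", "strike"), ("S", "B", "ask")]
print(sorted(derive(trace)))
# -> ['B > (S > B)', 'B > S', 'S > (B > S)', 'S > B']
```

After the third event all four belief facts hold, matching the point in the example where mutual belief is established even though the bargaining itself has not finished.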

5 Conclusions

In [1], three criteria were given for cooperation among multiple agents. Briefly, to
cooperate, all agents must believe each other mutually. However, cooperation
schemes in the current literature take mutual belief for granted [2][3][4][6][8]; they
assume that cooperating agents believe each other mutually, which leaves many
chances for malicious agents to do harm to cooperation. Only when we know that
every agent participating in the cooperation believes the others mutually can we
say that the cooperation will go smoothly.
In this paper, to reason about mutual belief among agents, we adopt a technique
based on the position-exchange principle. Using inference rules based on this
principle, we can reason about an agent's beliefs about itself and about others. In [7], a
different inference rule was used to reason about the knowledge of others. That
inference rule can be expressed as follows:
B_A B_B(α → β) → (B_A B_B α → B_A B_B β)
Intuitively, this rule says that if A believes that B believes some implication holds,
then once A believes that B believes the premise of the implication is satisfied, A
will also believe that B believes the conclusion of the implication.
That inference rule has several main differences from ours. First, it requires that
A already have beliefs about B. Second, the rule can only be applied when all
agents have completely common knowledge. However, in a distributed environment,
agents cannot possess knowledge or beliefs about others in advance, and it is
impossible for agents to possess all the knowledge dispersed within the
environment, which makes the above rule unsuitable for real distributed systems.
Before defining the position-exchange principle in inference rules, we first take
a process algebra approach, the pi-calculus, to formalize cooperation plans, and then
define an agent as an entity with actions, tasks, expectations, and beliefs. When
defining the inference rules for reasoning about mutual belief, we take an agent's
expectations into consideration and bind the expectations to its beliefs, so that the
agent believes only what it is expecting. Thus, once mutual belief is built among
agents, we are able to say that the cooperation will go on rationally.

References

1. M. E. Bratman. Shared cooperative activity. The Philosophical Review,
101:327-341, 1992.
2. B. Grosz and S. Kraus. Collaborative plans for complex group actions.
Artificial Intelligence, 86(2):269-357, 1996.
3. V. R. Lesser. A retrospective view of FA/C distributed problem solving. IEEE
Transactions on Systems, Man, and Cybernetics, 21(6), December 1991.
4. H. J. Levesque, P. R. Cohen, and J. H. T. Nunes. On acting together. In
Proceedings of the Eighth National Conference on Artificial Intelligence
(AAAI-90), pp. 94-99, Boston, MA, 1990.
5. R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, Parts I and II.
Journal of Information and Computation, Vol. 100, pp. 1-77, 1992.
6. S. Kraus. Negotiation and cooperation in multi-agent environments.
Artificial Intelligence Journal, 94(1-2):79-98, 1997.
7. Z. Shi, Q. Tian, and Y. Li. RAO logic for multiagent
framework. Chinese Journal of Computer Science and Technology, 14(4), 1999.
8. M. Wooldridge and N. R. Jennings. Towards a theory of
cooperative problem solving. In Proceedings of Modelling Autonomous Agents
in a Multi-Agent World (MAAMAW-94), pp. 15-26, Odense, Denmark, 1994.
PORTABLE RESOURCE CONTROL FOR MOBILE
MULTI-AGENT SYSTEMS IN JAVA

WALTER BINDER
CoCo Software Engineering, Margaretenstr. 22/9, A-1040 Vienna, Austria
E-mail: w.binder@coco.co.at

JARLE G. HULAAS, ALEX VILLAZON, AND RORY G. VIDAL
University of Geneva, rue General Dufour 24, CH-1211 Geneva, Switzerland
E-mail: {Jarle.Hulaas, Alex.Villazon}@cui.unige.ch,
vidalr5@cuimail.unige.ch

Prevention of denial-of-service attacks is indispensable for distributed multi-agent


systems to execute securely. To implement the required defense mechanisms, it
is necessary to have support for resource control, i.e., accounting and limiting the
consumption of resources like CPU, memory, and threads. Java is the predominant
implementation language for mobile agent systems, even though resource control
is a missing feature on standard Java platforms. Moreover, prevailing approaches
to resource control in Java require substantial support from native code libraries,
which is a serious disadvantage with respect to portability, since it prevents the
deployment of applications on large-scale heterogeneous networks. This article
describes the new resource-aware version of the J-SEAL2 mobile agent kernel.
The resource control model is based on a set of requirements, where portability
is very significant, as well as a natural integration with the existing programming
model.

1 Introduction

Java was designed as a general-purpose programming language, with spe-


cial emphasis on portability in order to enhance the support of distributed
applications. Therefore, it is natural that access to low-level, highly machine-
dependent mechanisms were not incorporated from the beginning. New classes
of applications are however being conceived, which rely on the facilities of-
fered by Java, and which at the same time push and uncover the limits of
the language. These novel applications, based on the possibilities introduced
by code mobility, open up traditional environments, move arbitrarily from
machine to machine, execute concurrently, and compete for resources on de-
vices where everything from modest to plentiful configurations can be found.
We are therefore witnessing increased requirements regarding fairness and se-
curity, and it becomes indispensable to acquire a better understanding and
grasp of low-level issues such as resource management.
Operating system kernels provide mechanisms to enforce resource limits
for processes. The scheduler assigns processes to CPUs reflecting process
priorities. Furthermore, only the kernel has access to all memory resources.
Processes have to allocate memory regions from the kernel, which verifies that
memory limits for the processes are not exceeded. Likewise, a mobile agent
kernel must prevent denial-of-service attacks, such as agents allocating all
available memory. For this purpose, accounting of resources (e.g., memory,
CPU, network, threads, etc.) is crucial.
The great value of resource control is that it is not restricted to serving as
a base for implementing security mechanisms. Application service providers
may need to guarantee a certain quality of service, or to create the support for
usage-based billing. The basic mechanisms described here will be necessary
to schedule the quality of service or to support the higher-level accounting
system, which will bill the clients for consumed computing resources.
This article is organized as follows. The next section presents the de-
sign goals and the resulting resource control model. Section 3 compares our
approach with related work, whereas section 4 concludes the article.

2 Objectives and Resulting Model

The ultimate objective of this work is to enable the creation of execution


platforms, where anonymous agents may securely coexist without harming
their environment. The desire to deploy secure systems translates into the
following requirements:

• Accounting of low-level resources, like CPU and memory, as well as of
higher-level resources, such as threads.
• Prevention against denial-of-service attacks, which are based on CPU,
memory, or communication misuse.
• No dependence on particular hardware or operating system features, in
order to enable a portable implementation. Portability and transparency
are crucial in heterogeneous environments.
• Minimal overhead for trusted agents, which have no resource limits.
• Support for resource sharing between closely collaborating agents, in or-
der to minimize resource fragmentation.

Since some aspects of resource control are to be manageable by the application
developer, it is important that the general model integrates well with
the existing programming model of the J-SEAL2 mobile agent system 3 .

[Figure 1. Illustration of the general resource control model. Fully trusted
domains need no accounting.]

The J-SEAL2 kernel manages a tree hierarchy of nested protection domains. This


model of hierarchically organized domains stems from the JavaSeal mobile
agent kernel 4 . Protection domains encapsulate agents as well as service com-
ponents. The J-SEAL2 kernel ensures that protection domains are completely
isolated from each other. Furthermore, a parent domain may terminate its
children at any time, forcing the children to release all allocated resources
immediately.
A general model for hierarchical resource control fits very well to the
hierarchical domain model of J-SEAL2. At system startup the root domain
owns by default all resources. Moreover, the root domain, along with the
other domains loaded at platform startup, are considered as completely safe,
and, consequently, no resource accounting will be enforced on them. When a
nested protection domain is created, the creator donates some part of its own
resources to the new domain. Figure 1 illustrates the way resources are either
shared or distributed inside a hierarchy. In the formal model of J-SEAL2, the
Seal Calculus 6 , the parent domain supervises all its subdomains, and inter-
domain communication management was the main concern so far. Likewise,
in the resource control model proposed here, the parent domain is responsible
for the resource allocation with its subdomains.
Within each untrusted protection domain, the J-SEAL2 kernel accounts
for the following resources (for details, see 2 ) :

• CPU_RELATIVE defines the relative share of CPU, expressed as a fraction
of the parent domain's own relative share. In our current implementation,
this resource is controlled by periodic sampling of the number of executed
bytecode instructions.

• MEM_ACTIVE is the highest amount of volatile memory that a protection
domain is allowed to use at any given moment.
• THREADS_ACTIVE specifies the maximal number of active threads per
protection domain at any moment.
• THREADS_TOTAL limits the number of threads that may be created
throughout the lifetime of a protection domain.
• DOMAINS_ACTIVE specifies the maximal number of active subdomains
a protection domain is allowed to have at any given moment.
• DOMAINS_TOTAL bounds the number of subdomains that a protection
domain may generate throughout its lifetime.
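The donation-based, hierarchical accounting described above can be modelled roughly as follows. This is a toy Python model of our own; the class and method names are illustrative assumptions and do not correspond to the J-SEAL2 API.

```python
# Rough model of hierarchical resource control: a parent domain donates
# part of its own resources when spawning a child, and allocations are
# checked against the domain's limits.

class Domain:
    def __init__(self, name, limits, parent=None):
        self.name, self.parent = name, parent
        self.limits = dict(limits)
        self.used = {r: 0 for r in limits}
        self.children = []

    def spawn(self, name, donated):
        """Create a child domain, transferring part of our resources."""
        for r, amount in donated.items():
            if amount > self.limits[r] - self.used[r]:
                raise RuntimeError(f"{self.name}: cannot donate {r}")
            self.limits[r] -= amount      # the parent gives this share up
        child = Domain(name, donated, parent=self)
        self.children.append(child)
        return child

    def allocate(self, resource, amount):
        if self.used[resource] + amount > self.limits[resource]:
            raise RuntimeError(f"{self.name}: {resource} limit exceeded")
        self.used[resource] += amount

root = Domain("root", {"MEM_ACTIVE": 1000, "THREADS_ACTIVE": 10})
agent = root.spawn("agent", {"MEM_ACTIVE": 200, "THREADS_ACTIVE": 2})
agent.allocate("MEM_ACTIVE", 150)
print(agent.used["MEM_ACTIVE"], root.limits["MEM_ACTIVE"])  # -> 150 800
```

The model captures only the distribution of resources down the hierarchy; the paper's model also allows resources to be shared between closely collaborating domains.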

Note that the kernel of J-SEAL2 is not responsible for network control,
because network access is provided by separate services. These network ser-
vices, or some mediation layers in the hierarchy, are responsible for network
accounting according to application-specific security policies. Let us stress
that the network is not a special case, since J-SEAL2 may limit communica-
tion with any service, such as networking, file I/O, etc.

3 Related Work

Our current implementation, which is based on Java bytecode transformations


(for details see 2 ) , has been inspired by JRes 5 , a resource control library for
Java that takes CPU, memory, and network resource consumption into ac-
count. The resource management model of JRes works at the level of individ-
ual Java threads; there is no notion of application as a group of threads, and
the implementation of resource control policies is therefore cumbersome. JRes
is a pure resource accounting system and does not enforce any separation of
domains. For its implementation, JRes relies on native code libraries for net-
work and CPU accounting. Therefore, JRes does not meet our requirement
of full portability.
KaffeOS [1] is a Java runtime system that isolates applications from
each other, as if each were run on its own Java Virtual Machine. With
KaffeOS it is possible to achieve resource control with a higher precision
than what is possible with bytecode rewriting techniques, where e.g. memory
accounting is limited to controlling the respective amounts consumed in the
common heap, and where CPU control does not account for time spent by the
common garbage collector working for the respective applications. The KaffeOS
approach should by design result in better performance, but it is inherently
non-portable.

4 Conclusion

Whereas other approaches to resource control in Java amount to a long-
term, deep re-design of the Java runtime system, our proposal might be
roughly characterized as a language-based patch. J-SEAL2 isolates agents
from each other, and particularly prevents denial-of-service attacks originat-
ing from inside the execution platform. Moreover, the complete compatibility
and portability of our approach make it immediately usable for the benefit
of distributed multi-agent systems, especially when mobile code is involved.

References

1. G. Back, W. Hsieh, and J. Lepreau. Processes in KaffeOS: Isolation,
   resource management, and sharing in Java. In Proceedings of the
   Fourth Symposium on Operating Systems Design and Implementation
   (OSDI 2000), San Diego, CA, USA, October 2000.
2. W. Binder, J. Hulaas, and A. Villazon. Resource control in J-SEAL2.
Technical Report Cahier du CUI No. 124, University of Geneva, October
2000. ftp://cui.unige.ch/pub/tios/papers/TR-124-2000.pdf.
3. W. Binder. Design and implementation of the J-SEAL2 mobile agent ker-
nel. In The 2001 Symposium on Applications and the Internet (SAINT-
2001), San Diego, CA, USA, January 2001.
4. C. Bryce and J. Vitek. The JavaSeal mobile agent kernel. In First Inter-
national Symposium on Agent Systems and Applications (ASA '99)/Third
International Symposium on Mobile Agents (MA '99), Palm Springs, CA,
USA, October 1999.
5. G. Czajkowski and T. von Eicken. JRes: A resource accounting interface
for Java. In Proceedings of the 13th Conference on Object-Oriented Pro-
gramming, Systems, Languages, and Applications (OOPSLA-98), volume
33, 10 of ACM SIGPLAN Notices, New York, October 1998.
6. J. Vitek and G. Castagna. Seal: A framework for secure mobile compu-
tations. In Internet Programming Languages, 1999.
AN AGENT-BASED MOBILE E-COMMERCE SERVICE PLATFORM FOR
FORESTRY AND AGRICULTURE 1

MATTHIAS KLUSCH AND ANDREAS GERBER


German Research Center for Artificial Intelligence, Stuhlsatzenhausweg 3,
66123 Saarbrücken, Germany
E-mail: {klusch, agerber}@dfki.de

The range of applications developed in the domain of agriculture and forestry covers
restricted types of market places as well as information systems. However, the innovative
integration of Internet, agent technology, and mobile telecommunication for integrated
commerce, supporting business processes in these domains, is still in its infancy. We
present the first approach to a holonic agent-based information and trading network (CASA
ITN) for dynamic production and sales in which integrated services for logistics and e-
commerce are provided. This paper introduces the agent-based architecture and describes the
added-value services of the CASA ITN for mobile timber sales.

1 Introduction

Electronic commerce (e-commerce) is a general name for business transactions that


are entered into through electronic rather than paper-based means. E-commerce has
the capacity to change the way the entire world does business, because it enables
people to buy and sell goods and services from anywhere in the world. Especially in
the agricultural and forestry domains there is a great demand to announce offers and
information about goods to a large audience and to negotiate quickly before
perishable goods lose quality.
In the project CASA2 at DFKI we are developing agent-mediated services for the
CASA ITN to support the main operative business processes users are performing in
each of the following application scenarios: (1) customer-oriented, dynamic timber
production, (2) mobile trading of timber using different types of auctions, fixed or
negotiable price, and (3) electronic trading of cereals.
The approach taken for providing information and trading services in the CASA
ITN focuses on the effective integration of production, logistics and trading
processes of these scenarios. It is motivated by the paradigm of integrated
commerce (i-commerce) which can be seen as an operational extension of traditional
e-commerce. The basic ideas of i-commerce are (a) to get customers more involved

1
This research is sponsored by the Ministry of Economics of the Saarland,
Germany, under the grant 032000.
2
Abbrev.: Cooperative Agents and Integrated Services for Logistic and Electronic
Trading in Forestry and Agriculture

in the activities related to their orders and tasks, and (b) to get related processes in
the supply chain more integrated in practice.
The agent-based CASA services for i-commerce can easily be accessed from
anywhere by using PC or mobile WAP 1.1-enabled devices such as smart phones or
PDAs. Efficient coordination of services is performed by appropriate types of
collaborating software agents. The WAP application services are currently
implemented using the T-D1 WAP gateway of Deutsche Telekom.

2 CASA Agents and Services

2.1 Holonic Agent System of the CASA ITN


We differentiate between following groups of participants in the CASA ITN:
producers offering goods; buyers purchasing goods; retailers acting on their own behalf
or on behalf of companies; logistics companies responsible for transportation tasks,
storage and resource management. Each member of these groups is represented by a
special so-called holonic agent (cf. Figure 1). The concept of holonic agents [1,5] is
used for reasons of effectively accomplishing complex, mostly hierarchically
decomposed tasks and resource allocations in the selected application scenarios. A
holonic agent (or holon) co-ordinates and controls the activities and information
flow of its subagents. In a holonic multi-agent system, autonomous agents may join
others to form, reconfigure, or leave a holon.
A human user in the ITN is represented by a special holonic agent called personal
assistant. It pro-actively acts on behalf of its user even if (s)he is off-line; the
personal assistant is the coordinating head of a set of other specialized agents for
individual negotiation, participation in auctions, finding relevant partners and
information, and elaboration of optimal trading strategies over time. Each
corporation is represented by a special holonic agent system according to its task-
oriented subdivision into departments for information management, logistics, and
production planning. In this context we presume that (1) information management
services provide information either on certain products, or on current market
situation and potential competitors, (2) logistics services support the co-ordination
of machines for production and transportation, human resources, and storage
capacities, and (3) production planning services support short-, middle-, and long-
term product planning cycles.
A corporation holon is constituted by other holonic agents, each of them representing
a special department. Since in the CASA ITN the roles of buyer/retailer and
seller/producer may be used interchangeably, both are modeled by similar holonic
agent structures. In addition, logistics companies are usually contracted by other
corporations for the purpose of time- and cost-saving delivery of goods on demand.
Figure 1. Overview of the holonic CASA agent system

Finally, we developed agent-based services for a distributed virtual market place to


enable different kinds of trading between the participants, such as multiple online
auctions and sales at fixed or negotiable prices in simultaneous bilateral negotiations.

2.2 Agent-Based Services of the CASA ITN


The CASA agent society co-ordinates and provides the following classes of services
to its users.
• Auction mechanisms [4] including Dutch, English, Vickrey auctions.
• Integrated services for dynamic pricing via information on transportation costs
and constraints during the bidding processes of the user.
• Logistics [2] services provide dynamic, approximately optimal (re-) scheduling
and (re-)planning of transportation.
• Information management [7]. Agents gather relevant information on behalf of
their users in different trading and production settings.
• Mobile services to let the users access most services of the CASA ITN also on
WAP-enabled mobile devices.
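As a concrete illustration of one of the auction mechanisms listed above, the following sketch implements the Vickrey (second-price, sealed-bid) rule. It is our own example, not CASA code:

```java
import java.util.*;

// Minimal sketch of the Vickrey rule: the highest bidder wins
// but pays the second-highest bid. Illustrative only -- not the
// CASA auction server implementation.
final class VickreyAuction {
    static String winner(Map<String, Double> bids) {
        return Collections.max(bids.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    static double price(Map<String, Double> bids) {
        List<Double> v = new ArrayList<>(bids.values());
        v.sort(Collections.reverseOrder());
        return v.get(1);   // second-highest bid is the price paid
    }
}
```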
2.3 Application Scenarios


In brief, the application scenarios of the CASA ITN for its users are as follows.
• Customer-oriented dynamic timber production: Foresters and timber harvesters
cooperate, with pro-active support from service-providing agents of
the CASA ITN to satisfy an individual customer's order to deliver a certain
quantity and quality of timber at a given time. The processing of an order can be
influenced by many side effects such as changes in weather, uncompleted order
parts, stoppage of harvesting machines, or shortage of human resources.
Therefore the approximately optimal, dynamic (re-)planning and coordination
of services for harvesting, processing, and transportation has to be performed
just-in-time and is additionally supported by mobile WAP-enabled devices.
• Mobile timber sales: CASA ITN members may set up and participate in one or
multiple different timber auctions via Internet or WAP-enabled mobile devices.
• E-trading of cereals: Similar to the mobile timber sales scenario registered
users may trade grains via auctions or multi-lateral negotiations.

The first two application scenarios have been implemented using the FIPA-OS 2.0
agent system platform and Java; due to space limitations we briefly describe
the mobile timber sales scenario in the following sections.

2.4 Mobile Timber Sales: Services, Interactions, and Agents


In this special scenario each forester may sell timber via different kinds of auctions,
fixed or negotiable sales offers to other registered users of the CASA ITN. Main
benefits of the agent-based service support are the concurrent monitoring and
computation of optimal transport costs per individual bid or buying offer, and the
full mobile service support of the user via WAP-enabled mobile devices.

2.4.1 Services and Interactions


In general, the mobile timber sales services of the CASA ITN enable registered
users to initiate or participate in one or multiple timber auctions. But the members
could also sell or buy timber at fixed or negotiable prices. In the first case, the
CASA ITN offers types of auctions like Dutch, English, Vickrey, and First-Price-
Sealed-Bid. The auction server has been built upon a general holonic coordination
server [3]. Any user may invoke integrated services for decision-support during the
participation in auctions. For example, a personal CASA agent may concurrently
determine the optimal transportation costs and delivery dates of some auction goods
for each individual bid of its user. As a result, the agent may notify its user in real-
time if estimated optimal transport costs exceed the allowed limit due to given
buying preferences or if some deadlines are at risk to be exceeded. In addition, each
of the information and trading services is available on mobile WAP 1.1 enabled
devices and PCs connected to the Internet. Synchronization is co-ordinated by
appropriate CASA agents (cf. Figure 1) [6]. These are holonic agents for users as buyers
or sellers/auctioneers, and shipping companies. Buyers without logistics capabilities
have to contract carriers appropriately. Participation in any trading process can be
delegated to a personal user agent which then is in charge of negotiating or bidding
at an auction, and notifying its user, e.g., via SMS or email.
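The real-time limit check described in this paragraph reduces to a simple predicate. The sketch below uses a linear distance-based transport cost model that is purely our assumption; the names and the cost model are not taken from the CASA implementation:

```java
// Sketch of the decision-support check described above: a bid is flagged when
// the bid price plus an estimated transport cost exceeds the buyer's limit.
// The linear per-kilometre cost model is our simplifying assumption.
final class BidCheck {
    static boolean exceedsLimit(double bidPrice, double distanceKm,
                                double costPerKm, double buyerLimit) {
        double estimatedTotal = bidPrice + distanceKm * costPerKm;
        return estimatedTotal > buyerLimit;   // true => agent notifies its user
    }
}
```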

3 Related Work

There are only a few market places known that resemble the CASA system.
Agriflow [8], for example, is putting Europe's arable industry on the fast track to e-
business with a series of dynamic products, including Cigrex, an online co-operative
independent grain exchange, and Agrivox, an information service. The Virtual
Agricultural Market (VAM)[9] system has been built for B2B transactions in
agricultural markets. It offers mechanisms for trading, and activities for distribution
of products; VAM provides a set of generic functionalities in a stakeholder-
independent and interoperable way. However, these systems differ significantly
from CASA in their architecture and in the added value implied by the dynamic
integration of logistics and information in mobile timber sales and production.

References

1. Bürckert, H.-J., Fischer, K., and Vierke, G., Transportation Scheduling with
   Holonic MAS — The TeleTruck Approach. Proc. 3rd Intl. Conference on
   Practical Applications of Intelligent Agents and Multiagents PAAM'98, (1998).
2. Bürckert, H.-J., Fischer, K., and Vierke, G., Holonic Transport Scheduling With
   TELETRUCK. Applied Artificial Intelligence, 14, (2000), pp. 697-725.
3. Gerber, A. and Ruß, C., A Holonic Multi-agent Co-ordination Server. In Proc.
   14th Intl. FLAIRS Conference, 2001, pp. 200-204, ISBN 1-57735-133-9.
4. Gerber, A., Klusch, M., Ruß, C., and Zinnikus, I., Holonic Agents for the
   Coordination of Supply Webs. Proc. Intl. Conf. on Autonomous Agents, (2001).
5. Gerber, C., Siekmann, J., and Vierke, G., Flexible Autonomy in Holonic Agent
   Systems. Proc. AAAI Spring Sympos. on Agents with Adjustable Autonomy, (1999).
6. Gerber, C., Siekmann, J., and Vierke, G., Holonic Multi-Agent Systems. DFKI
   Research Report RR-99-03, (1999), ISSN 0946-008X.
7. Klusch, M., Information Agent Technology for the Internet: A Survey. Data
   and Knowledge Engineering, 36, 1&2 (2001), pp. 337-372.
8. Agriflow: www.agriflow.com
9. Costopoulou, C.I. and Lambrou, M.A., An architecture of Virtual Agricultural
   Market systems: Information services and use, Vol. 20(1), (2000), ISSN
   0167-5265, pp. 39-48.
An Itinerary Scripting Language
for Mobile Agents in Enterprise Applications [a]

Seng Wai Loke


School of Computer Science and Information Technology
RMIT University, GPO Box 2476V, Melbourne VIC 3001, Australia
swloke@cs.rmit.edu.au

Arkady Zaslavsky, Brian Yap, Joseph Fonseka


School of Computer Science and Software Engineering
Monash University, Caulfield VIC 3145, Australia
Arkady.Zaslavsky@monash.edu.au, brianll@hotmail.com, ruki@mbox.com.au

We view an agent's itinerary, describing which tasks are to be performed when and
at which location (e.g. which host), as a script gluing the tasks of the agent
together in a (possibly) complex way. We present the ITAG (ITinerary AGent)
scripting language which is based on the notion of the itinerary. We also discuss
the enterprise-wide infrastructure needed for executing ITAG scripts, and illustrate
our approach with examples of scripts for voting and distributed authoring.

1 Introduction
This paper introduces a scripting language approach to developing mobile
agent applications. In the scripting approach [2], a scripting language is used to
glue components together to assemble an application rather than programming
an application from scratch. Our scripting language is based on the concept of
the agent itinerary. An agent's itinerary describes which actions (or tasks) are
to be performed when and at which location (e.g. which host), i.e. an agent's
itinerary glues the actions of the agent in a (possibly) complex way while each
action at a location might involve complex algorithms and data structures. A
scripting language should closely match the nature of the problem in order to
minimize the linguistic distance between the specification of the problem and
the implementation of the solution, thereby resulting in cost reductions and
greater programmer productivity [3]. Our itinerary scripting language provides a
higher level of abstraction, and economy of expression for mobility behaviour:
the programmer expresses behaviour such as "move agent A to place p and
perform action a" in a simple direct succinct manner without the clutter of
the syntax of a full programming language.
[a] The work reported in this paper has been funded in part by the Co-operative Research Centre
Program through the Department of Industry, Science & Tourism of the Commonwealth
Government of Australia.

In the following section, we first present our itinerary scripting language,


and in §3, present an example of a distributed authoring application scripted
in our language. We conclude in §4.

2 ITAG: The Itinerary Scripting Language

We previously created an itinerary algebra [1]. ITAG is an executable implemen-


tation of this algebra in the form of a scripting language. We first outline the
algebra below.
We assume an object-oriented model of agents (e.g., with Java in mind),
where an agent is an instance of a class given roughly by:
mobile agent = state + action + mobility
We assume that agents have the capability of cloning, that is, creating copies
of themselves with the same state and code. Also, agents can communicate to
synchronize their movements, and the agent's code is runnable in each place it
visits.
Let A, O and P be finite sets of agent, action and place symbols, respec-
tively. Itineraries (denoted by I) represent the null activity, atomic activity,
and parallel, sequential, nondeterministic, and conditional nondeterministic
behaviour, and have the following syntax:

I ::= 0 | A_p^a | (I ||_⊕ I) | (I · I) | (I | I) | (I :_Π I)

where A ∈ A, a ∈ O, p ∈ P, ⊕ is an operator which, after a parallel operation
causing cloning, recombines an agent with its clone to form one agent, and Π
is an operator which returns a boolean value to model conditional behaviour.
We specify how ⊕ and Π are used but we assume that their definitions are
application-specific.
We assume that all agents in an itinerary have a starting place (which we
call the agent's home) denoted by h ∈ P. Given an itinerary I, we shall use
agents(I) to refer to the agents mentioned in I.
Agent Movement (A_p^a). A_p^a means "move agent A to place p and perform ac-
tion a". This expression is the smallest granularity mobility abstraction. It
involves one agent, one move and one action at the destination.
Parallel Composition ("||"). Two expressions composed by "||" are executed
in parallel. For instance, (A_p^a || B_q^b) means that agents A and B are executed
concurrently. Parallelism may imply cloning of agents. For instance, to exe-
cute the expression (A_p^a || A_q^b), where p ≠ q, cloning is needed since agent A
has to perform actions at both p and q in parallel. When cloning has occurred,
decloning is needed, i.e. clones are combined using an associated application-
specific operator (denoted by ⊕ as mentioned earlier).
Sequential Composition ("·"). Two expressions composed by the operator "·"
are executed sequentially. For example, (A_p^a · A_q^b) means move agent A to place
p to perform action a and then to place q to perform action b.
Independent Nondeterminism ("|"). An itinerary of the form (I | J) is used
to express nondeterministic choice: "I don't care which, but perform one of I
or J". If agents(I) ∩ agents(J) ≠ ∅, no clones are assumed, i.e. I and J are
treated independently. It is an implementation decision whether to perform
both actions concurrently, terminating when either one succeeds (which might
involve cloning, but clones are destroyed once a result is obtained), or to try
one at a time (in which case order may matter).
Conditional Nondeterminism (":"). Independent nondeterminism does not
specify any dependencies between its alternatives. We introduce conditional
nondeterminism, which is similar to short-circuit evaluation of boolean expres-
sions in programming languages such as C. An itinerary of the form I :_Π J
means first perform I, and then evaluate Π on the state of the agents. If Π
evaluates to true, then the itinerary is completed. If Π evaluates to false, the
itinerary J is performed (i.e., in effect, we perform I · J). The semantics of
conditional nondeterminism depends on some given Π.
We give an example using agents to vote. An agent V, starting from
home, carries a list of candidates from host to host visiting each voting party.
Once each party has voted, the agent goes home to tabulate results (assuming
that home provides the resources and details about how to tabulate), and then
announces the results to all voters in parallel (and cloning itself as it does so).
Assuming four voters (at places p, q, r, and s), vote is an action accepting
a vote (e.g., by displaying a graphical user interface), tabulate is the action
of tabulating results, and announce is the action of displaying results, the
mobility behaviour is as follows:

V_p^vote · V_q^vote · V_r^vote · V_s^vote · V_h^tabulate ·
(V_p^announce || V_q^announce || V_r^announce || V_s^announce)
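The operators can be prototyped as a small abstract syntax tree. The following sketch is our illustration only: the actual ITAG interpreter dispatches Grasshopper agents, whereas here evaluation merely records an action trace, and parallel branches are simulated sequentially, ignoring cloning and decloning:

```java
import java.util.*;

// Minimal model of the itinerary algebra as an abstract syntax tree.
// Evaluation records an action trace instead of moving real agents.
abstract class Itin {
    abstract void run(List<String> trace);

    // A_p^a: move agent A to place p and perform action a.
    static Itin move(String agent, String place, String action) {
        return new Itin() {
            void run(List<String> trace) {
                trace.add(agent + ":" + action + "@" + place);
            }
        };
    }

    // Sequential composition (the "then" operator).
    static Itin seq(Itin... parts) {
        return new Itin() {
            void run(List<String> trace) {
                for (Itin part : parts) part.run(trace);
            }
        };
    }

    // Parallel composition ("in parallel with"), simulated in order here.
    static Itin par(Itin... parts) {
        return new Itin() {
            void run(List<String> trace) {
                for (Itin part : parts) part.run(trace);
            }
        };
    }
}
```

Building the voting itinerary from `seq` and `par` with these `move` steps and running it yields the nine actions in the order given above.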

Implementation. To allow the programmer to type the itinerary expres-


sions into the computer, we provide an ASCII syntax and a Controlled English
version. The translations are given in Table 1.
When the operators are used without op, we assume a pre-specified system
default one, i.e. the using op clause is optional. A_p^a · A_q^b · A_r^c can be described
as follows: "(move A to p do a) then (move A to q do b) then (move A to r
do c)." Apart from the above basic elements of the language, we define the
following five phrases that map down to more complex expressions:
1. A_h^a is translated as return A do a.
2. A_p^a · A_q^a · A_r^a · A_s^a is translated as tour A to p,q,r,s in series do a.
3. A_p^a || A_q^a || A_r^a || A_s^a is translated as tour A to p,q,r,s in parallel do a.
4. A_p^a | A_q^a | A_r^a | A_s^a is translated as tour A to one of p,q,r,s do a.
Symbol   ASCII     Controlled English
A_p^a    [A,p,a]   move A to p do a
·        .         then
:_Π      :{op}     otherwise using op
|        |         or
||_⊕     #{op}     in parallel with using op

Table 1: Translations.

5. A_p^a : A_q^a : A_r^a : A_s^a is translated as tour A if needed to p,q,r,s do a.

Similarly, we also have A_p^a :_Π A_q^a :_Π A_r^a :_Π A_s^a translated as tour A if needed
to p,q,r,s do a using Π.
Using the phrases, the voting itinerary can be described succinctly as fol-
lows:
(tour V to p,q,r,s in series do vote)
then (return V do tabulate)
then (tour V to p,q,r,s in parallel do announce)
Our current implementation is in the Java programming language and is
built on top of the Grasshopper mobile agent toolkit.[b] In our current imple-
mentation, the user first types in itinerary scripts into an applet (running in a
Web browser). Then, the itinerary script is parsed into a binary tree represen-
tation and executed by an interpreter. Execution is as follows: the interpreter
translates the actions specified in the script into commands which are then
forwarded to Grasshopper agents which are initially at a place (the home).
These agents on receiving the commands are then launched into the network
of places to do their work.

3 An Example: Distributed Authoring


We aim mainly for lightweight applications (e.g., ad hoc workflows), lightweight
in the sense that they can be quickly scripted as long as the required actions
code can be downloaded from a code server. Here, we consider an example
adapted from Tripathi et al. [4] concerning coordinating the activities of a dis-
tributed authoring system involving the author, an editor and two reviewers.
In this collaboration among the four parties, the agent transfers the required
information (e.g., the document draft, reviews, etc) and the itinerary repre-
sents the order in which actions are to be accomplished. For example, in a
typical scenario, the author first publishes the document to the editor, the edi-
[b] See http://www.grasshopper.de

tor then sends the document to the reviewers, after which the reviewers forward
reviews to the editor, and finally, the editor adds further comments and sends
all the information to the author. Assuming agent A is launched by the au-
thor, places abbreviated as editor, author (the place from which the agent is
launched), reviewer1, and reviewer2, and actions submit, review, finalize
and notify, the following script can be written to enact this collaboration:
(move A to editor do submit)
then ((move A to reviewer1 do review)
      in parallel with
      (move A to reviewer2 do review))
then (move A to editor do finalize)
then (move A to author do notify)
Note that data (including the draft document, the reviews, and editor's com-
ments) are carried with the agent.

4 Conclusions and Future Work
We contend that a scripting approach is well-suited for developing mobile agent
applications and presented ITAG based on the notion of the agent itinerary.
Autonomy and flexibility are important aspects of intelligent agents. ITAG
accommodates agents with a degree of autonomy and flexibility in performing
tasks via the nondeterminism and conditional nondeterminism operators.

References
1. S.W. Loke, H. Schmidt, and A. Zaslavsky. Programming the Mobility
Behaviour of Agents by Composing Itineraries. In P.S. Thiagarajan and
R. Yap, editors, Proceedings of the 5th Asian Computing Science Con-
ference (ASIAN'99), volume 1742 of Lecture Notes in Computer Science,
pages 214-226, Phuket, Thailand, December 1999. Springer-Verlag.
2. J.K. Ousterhout. Scripting: Higher Level Programming for the
21st Century. IEEE Computer, March 1998. Available at
<http://www.scriptics.com/people/john.ousterhout/scripting.html>.
3. D. Spinellis and V. Guruprasad. Lightweight Languages as Software En-
gineering Tools. In Proceedings of the USENIX Conference on Domain-
Specific Languages, California, U.S.A., October 1997.
4. A. Tripathi, T. Ahmed, V. Kakani, and S. Jaman. Distributed Collab-
oration Using Network Mobile Agents. February 2000. Available at
<http://www.cs.umn.edu/Ajanta/papers/asa-ma.ps>.
INTELLIGENT AGENTS FOR MOBILE COMMERCE SERVICES

MIHHAIL MATSKIN
Department of Computer and Information Science, Norwegian University of Science and
Technology, N-7491 Trondheim, Norway
E-mail: misha@idi.ntnu.no

We consider the application of intelligent agents in mobile commerce services. The basic idea of
the approach is to provide customers of mobile devices and service providers with personal
intelligent agents representing their interests on the Internet, and to use a multi-agent system
approach for coordination, communication and negotiation between the agents. We
demonstrate how such agents and services can be implemented in the Agora environment that
we developed earlier. Some properties of the developed prototype mobile commerce services are
briefly discussed.

1 Introduction

Development of mobile communication technology in recent years opens new
perspectives for providing services to the users of mobile devices such as cellular
phones or PDAs. An essential feature of mobile services is that the user of a mobile
device can be available for services almost anytime and anywhere. This allows high
reactivity of user responses and decisions. At the same time, the development of such
technologies as WAP [2,5] allows the users of mobile devices to access the
Internet, which was previously a privilege of PC users only. In particular, this means that
the users of mobile devices get access to web-based technologies and computing
network resources outside of telecom networks.
However, while opening access to Internet resources, mobile communication
technology puts quite serious restrictions on such communication. Basic restrictions
are related to low bandwidth, high cost of communication, slow CPUs, small
memory, restricted power supply, small screens and complicated input on mobile
devices. In order to relax these restrictions, we think that the precision and focus of
information delivered to mobile devices should be very high [4]. In particular,
this means that: 1) the amount of delivered information should be as minimal as
possible but still sufficient to be interesting to the user, 2) user input should be
minimized as much as possible, and 3) the connection time of mobile devices to the
network during processing of the user request should be shortened. In order to achieve
such precision and focus, most of the work for information analysis and processing
should be done off-line, and the analysis and processing should be personalized as
much as possible: they should take into account user preferences and interests as well
as the context of communication (geographical position, time, etc.).
We think that the use of intelligent agents [1] and agent technology is a
constructive approach to intelligent and personalized off-line processing. In

particular, this assumes providing the participants of the commercial activity (mobile
device customers and service providers) with software assistant agents.
Some details of this approach are presented in [4]. Here we demonstrate how the
approach can be applied to support of particular mobile commerce services. As a
tool for implementing the approach we use the Agora environment for support of
multi-agent cooperative work [3]. For communication with mobile devices we use
WAP technology [2,5] and SMS messages.
The rest of the paper is organized as follows. First we give a brief introduction
to the Agora environment and present solutions for mobile services using the Agora-
based approach. Then we consider some details of the implemented prototype services.
Finally, we present conclusions and future work.

2 The Agora system and mobile commerce services

In order to support agent creation and multi-agent cooperative work we use the
Agora system, which we developed earlier [3]. The basic idea behind this system is
to view cooperative work as a set of cooperative acts, which include coordination,
negotiation and communication, and to provide means for supporting such
cooperative acts. To obtain such support we propose the concept of a
cooperative node (we call it an Agora). The Agora node allows registration of agents
and provides means for support of cooperative activity such as matchmaking,
coordination and negotiation between the registered agents.
If we apply the Agora concept to mobile commerce services, then we first
need to identify the participants of the cooperative work and the possible cooperative acts
between them. In our case participants are customers and service providers, and we
assume the following basic cooperative acts between participants: 1) buying/selling
products/services by customers and providers, 2) product/service information
exchange between different customers, 3) customer coalition formation for co-
shopping, 4) provider coalition formation for common policy development, 5)
coordination between different agents of the same customer, and 6) subscription service
management.
Our next step is to map participants into agents and cooperative acts into
corresponding Agoras. For example, this can be done as shown in Figure 1 (in
this figure rectangles denote agents, diamonds denote Agoras and arrows show
registration of agents at Agoras).
Each agent in the Agora system has a planner, a knowledge base, a communication
block and a goal analyzer. By default, the knowledge base and planner use a Prolog-like
notation for knowledge representation. However, all agent components can be
overridden when necessary. An important feature of this implementation is the
encapsulation of private data in agents and the ability to get a service without
disclosing personal preferences to providers. The planner, the knowledge base and
the goal analyzer's ability to handle events provide a basis for the implementation of
pro-activity.

Figure 1. Customers, providers and Agoras

The ability to communicate is based on communication adapters and message wrappers in the Agora system. Both KQML and FIPA ACL are supported. This is done by implementing an intermediate representation level (wrappers) which allows translation of constructions from both languages. The use of wrappers also allows one to define one's own syntax and parameter types for communicative acts; in particular, we use this for exchanging plan and action files between agents. Different ontologies can be described, and their combination with performatives uniquely defines the communicative act.
Figure 2. Subscription service Agora (components shown: Manager, Communication Adapter, Registrator, Matchmaker, Negotiator, Coordinator, Events handler, History browser, Customer and Provider notificators, and information about registered agents)

In the case of subscription services, customers specify the information they are interested in, and the service provider sends this information to the customers at some time interval or upon a specified event. The basic steps of the agent-based subscription service are registration, announcement of offers, matchmaking, and event generation and handling (both for providers and customers). These steps are supported by the manager of the Subscription service Agora (see Figure 2).
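The registration, announcement and matchmaking steps above can be sketched as follows. All names here are illustrative assumptions; keyword-overlap matching stands in for whatever matching the real Agora manager performs.

```python
# Minimal sketch of the subscription steps: customers register interests,
# providers announce offers, and the Agora manager matches and notifies.

class SubscriptionAgora:
    def __init__(self):
        self.subscriptions = {}   # customer -> set of interest keywords
        self.notifications = []   # (customer, offer) pairs delivered

    def register(self, customer, keywords):
        self.subscriptions[customer] = set(keywords)

    def announce(self, provider, offer, keywords):
        # Matchmaking step: notify each customer whose interests overlap.
        for customer, interests in self.subscriptions.items():
            if interests & set(keywords):
                self.notifications.append((customer, offer))

agora = SubscriptionAgora()
agora.register("alice", ["stocks", "bonds"])
agora.register("bob", ["real-estate"])
agora.announce("broker", "stock-quote-update", ["stocks"])
print(agora.notifications)   # -> [('alice', 'stock-quote-update')]
```

Both sides act asynchronously, as in the text: a later `register` simply changes which future announcements match.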

Customers present their interests to the corresponding Agora by providing rules, keywords or a ranked list of interests. The Agora manager tries to match customer interests with providers' proposals and, when the matching is successful, notifies the customers. Both provider and customer interests can be presented or updated at any time and asynchronously. The complexity of the matchmaker can differ between applications. In the optimistic case (when the customer discloses detailed preferences) the matchmaker does all the work of matching customer requests against provider offers and notifies the customer when matching is successful. It is also possible to implement more intelligent manager behavior, with pro-active recommendation of offers which are relevant to the customer's interests but are not requested explicitly. In the pessimistic case (when the customer does not disclose his particular interests but rather subscribes to a wide scope of information) the matchmaker performs a pre-filtering of the information, and the detailed analysis is performed by the customer agent. After successful matchmaking the customer agent may directly contact the corresponding provider agent and perform an additional information request or negotiation using the Negotiator component of the Agora.
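The optimistic/pessimistic split above can be made concrete with a small sketch. The function names, the dict representation of offers and preferences, and the `topic` field are illustrative assumptions, not the system's data model.

```python
# Sketch of the two matchmaking modes: in the optimistic case the Agora
# matches detailed preferences itself; in the pessimistic case it only
# pre-filters by broad topic, and the customer agent does the detailed
# analysis locally, never disclosing its preferences to the Agora.

def agora_match(offer, preferences):
    """Optimistic: full match performed by the Agora manager."""
    return all(offer.get(k) == v for k, v in preferences.items())

def agora_prefilter(offer, topic):
    """Pessimistic: coarse topic filter only."""
    return offer.get("topic") == topic

def customer_agent_filter(offer, private_preferences):
    """Detailed analysis performed inside the customer agent."""
    return all(offer.get(k) == v for k, v in private_preferences.items())

offer = {"topic": "real-estate", "city": "Oslo", "rooms": 3}

# Optimistic: the customer disclosed detailed preferences to the Agora.
assert agora_match(offer, {"city": "Oslo", "rooms": 3})

# Pessimistic: the Agora pre-filters by topic; the agent checks privately.
if agora_prefilter(offer, "real-estate"):
    print(customer_agent_filter(offer, {"city": "Oslo"}))   # -> True
```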
Managers for other types of Agoras (such as Customers, Providers, Buying/Selling or Coalitions Agoras) may have functionality different from that of the Subscription service Agora. The Agora system allows attaching different manager agents to different Agoras.

3 Some applications

Several prototype systems of mobile commerce services based on the above-described approach have been developed. They include: 1) a valued customer membership service and product search; 2) financial services (notification of stock quote changes); 3) a real-estate agent (search and notification for real-estate property); 4) advertising over the Internet with agents.
For the valued customer membership service, a user of a mobile device can register for a customer service which provides membership benefits. After registration a personal assistant agent is created. Basically, the agent operates on the user's host, which preserves the privacy of personal data; however, it may also operate on a service provider host when the user trusts that environment. When the agent finds that some special offer matches the customer's interests, it may send a corresponding message to the user's mobile device (if the offer requires a quick reaction) or may place the offer on the user's WML page. In addition to analyzing offers from the customer service, the agent can search for relevant products from other specified sources.
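The assistant's delivery choice described above can be sketched in a few lines. The channel names and the urgency flag are illustrative assumptions.

```python
# Sketch of the delivery decision: urgent matching offers are pushed to
# the user's mobile device, while the rest are placed on the WML page.

def deliver(offer, urgent):
    if urgent:
        return ("mobile_message", offer)   # needs quick reaction
    return ("wml_page", offer)             # user picks it up later

print(deliver("special-offer", urgent=True))   # -> ('mobile_message', 'special-offer')
print(deliver("weekly-deal", urgent=False))    # -> ('wml_page', 'weekly-deal')
```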
In the case of financial services, notification of changes in the quotes of specified stocks is implemented. The Agora system is used for deploying agents and for matching required and provided services. Both the specified stocks and the conditions on their change are kept private in the agent.

The advertising service uses Agoras for the formation of customer and service provider coalitions. The coalitions are used for co-shopping and for co-advertising. The real-estate agent searches for real-estate property which satisfies the user's preferences, notifies the user via cellular phone when it is found and, if it is of interest to the user, starts a bidding process for the property according to the user's instructions.

4 Conclusions

We have presented an approach to the use of intelligent agents in mobile commerce services. The approach is based on providing users of mobile devices with personal software assistant agents and on using the Agora system to support cooperative work between agents. The general conclusions are as follows: 1) the use of agents as personal assistants for users of mobile devices is a practical and feasible approach; 2) even with simple intelligence and functionality, agents provide great benefit through their autonomy, communication ability and pro-activity; 3) the concept of an Agora as a cooperative node is a practical and convenient means for multi-agent system design.
Our future plans are directed at increasing the intelligent capabilities of the agents and Agoras in the system. In particular, we would like to use different negotiation protocols, rules for coalition formation, and planning of agent activity in mobile services support.
This work is partially supported by the Norwegian Research Foundation in the framework of the Distributed Information Technology Systems (DITS) program and the ElComAg project. I would also like to thank Thomas Heiberg and Jøran Pedersen (product search and valued membership services), Terje Wahl (financial services), Lars Killingdalen (advertising with agents) and Bjørn Skogseth (real-estate search and analysis) for their work on implementing the prototypes.

References

1. Bradshaw, J. M. (Ed.). Software Agents. Menlo Park, CA: AAAI Press/The MIT Press, 1997.
2. Mann, S. Programming Applications with the Wireless Application Protocol: The Complete Developer's Guide. John Wiley & Sons, 2000.
3. Matskin, M., O. J. Kirkeluten, S. B. Krossnes and Øystein Saele. Agora: An Infrastructure for Cooperative Work Support in Multi-Agent Systems. In T. Wagner, O. Rana (eds.), Infrastructure for Scalable Multi-Agent Systems. Springer Verlag, LNCS Volume 1887, 2000.
4. Matskin, M. and A. Tveit. Mobile Commerce Agents in WAP-Based Services. Journal of Database Management, Vol. 12, No. 3, 2001, pp. 27-35.
5. WAP Forum. http://www.wapforum.org
A NEW CONCEPT OF AGENT ARCHITECTURE IN AGENTSPACE

T. NOWAK AND S. AMBROSZKIEWICZ
Institute of Computer Science, Polish Academy of Sciences,
al. Ordona 21, PL-01-237 Warsaw,
and Institute of Informatics, University of Podlasie,
al. Sienkiewicza 51, PL-08-110 Siedlce, Poland
E-mail: {sambrosz, tnowak}@ipipan.waw.pl

Agentspace is an emerging environment resulting from process automation in the Internet and Web. It is supposed that autonomous software (mobile) agents provide the automation. The agents realize the goals delegated to them by their human masters. Interoperability is crucial to assure meaningful interaction, communication and cooperation between heterogeneous agents and services. In order to realize the goals, the agents must create, manage and reconfigure complex workflows.

1 Introduction

Cyberspace, the emerging world created by the global information infrastructure and facilitated by the Internet and the Web, offers new application scenarios as well as new challenges. One of them is creating new infrastructures to support high-level business-to-business and business-to-consumer activities on the Web; see for example Sun ONE, Microsoft .NET, and UDDI. The second one is the Semantic Web [4], a conceptual structuring of the Web in an explicit machine-readable way. These two challenges are strongly related to each other; that is, semantic interoperability is necessary for the integration of heterogeneous, distributed Web services. It is supposed that the integration will be performed automatically by autonomous software (mobile) agents. An agent is a running program that can migrate from host to host across a heterogeneous network under its own control and interact with other agents and services.
Since the software agents are supposed to "live" in the cyberspace, they must be intelligent; that is, they must efficiently realize the goals delegated to them by their human masters. Hence, along with the development of cyberspace, a new world (called agentspace), inhabited by the software agents, is being created. It seems that the process automation in the Internet and Web makes the development of agentspace inevitable.
Human users are situated at the border of the agentspace and can influence it only through their agents, by delegating to them complex and time-consuming tasks to perform. Since the Internet and Web are open, distributed and heterogeneous environments, agents and services can be created by different users according to different architectures. Interoperability is crucial to assure meaningful interaction, communication and cooperation between heterogeneous agents and services. We can distinguish two kinds of interoperability: interaction interoperability and semantic interoperability. Interaction interoperability provides a common communication infrastructure for message exchange, whereas semantic interoperability provides the message understanding.
The semantic interoperability concerning the meaning of resources on the Web is a subject of current research; see DAML [5] + OIL [8] as the most prominent example.
In order to use services established by different users working in heterogeneous domains, agents must be capable of acquiring knowledge about how to use those services and for what purposes. There must be a common language for expressing tasks by the users, for delegating these tasks to agents, as well as for describing services, and for communication between agents and services. There are several efforts to create such a language; see DAML-Enabled Web Services [7], ATLAS [3], CCL [10], WSDL [9], and FIPA ACL.
As to the communication infrastructure, there is no need to force one transportation platform (i.e., one message format and one message delivery method) as the standard. It seems that the message language and its meaning are crucial here, rather than the message wrapping. It is relatively easy to provide a transformation service between two platforms that translates the message format of one platform into the message format of the other.
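Such a transformation service can be sketched by routing messages through a neutral intermediate form, so each platform needs only one encoder/decoder pair. The field names and the KQML-like/FIPA-like shapes below are illustrative assumptions, not either standard's actual syntax.

```python
# Sketch of a message-format transformation service between two platforms:
# map into a neutral dict, then out into the other platform's shape.

def from_platform_a(msg):
    # Platform A message as a tuple: (performative, sender, receiver, content)
    return {"performative": msg[0], "sender": msg[1],
            "receiver": msg[2], "content": msg[3]}

def to_platform_b(neutral):
    # Platform B message as a dict with its own field names.
    return {"act": neutral["performative"], "from": neutral["sender"],
            "to": neutral["receiver"], "body": neutral["content"]}

a_msg = ("tell", "agent1", "agent2", "price(book, 10)")
print(to_platform_b(from_platform_a(a_msg)))
# -> {'act': 'tell', 'from': 'agent1', 'to': 'agent2', 'body': 'price(book, 10)'}
```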
A mobile agent platform (MAP, for short) also provides a communication infrastructure, as well as a "migration service" for the agents. One may ask whether agent mobility is essential for creating agentspace; see for example the JADE [6] framework, where mobility is not provided. In our approach, agent mobility may be seen as a means for learning between heterogeneous environments.
Our project aims at creating the absolute minimum necessary for joining heterogeneous applications as services on the one hand, and for using them by heterogeneous agents (on behalf of their users) on the other hand. As this minimum we propose the language Entish (a shorthand for e-language) and its intended semantics.
We introduce a new form of agent migration. Usually, a MAP provides a weak form of migration that consists in moving the agent's data and code to a new place and executing this code at the new place, whereas the agent process at the old place is closed. The data and the code are strictly related to each other in that agent architecture. We propose a new architecture where the data are independent of the code. As a result we get a much weaker migration form where the agent's data can be moved without the code. The data are expressed in Entish and contain all parameters needed to continue the agent process at the new place. This agent data is called the agent "soul" and is separated from the agent body responsible for reasoning and action execution. The idea of the new migration form is that a running agent process stores all its essential data and control parameters in its soul. The process may be closed at any time and then fully reconstructed at any new place. At the new place, the agent soul is given a new body (possibly different code) and then the completed agent can continue its process. Thus the data (soul) are independent of the code (body). The new migration form is independent of any MAP, and it can be applied to communication platforms that do not support (weak) agent mobility, like JADE or a platform based on HTTP+SOAP transport. The structure of the soul constitutes the core of the language Entish.
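The soul/body separation can be illustrated with a short sketch. This is our own illustration, not Entish: the soul is plain serializable data carrying all control parameters, so it can travel without code and be given a fresh body at the destination.

```python
# Sketch of soul/body migration: close the process, ship only the data
# ("soul"), and reconstruct the agent with a new body at the new place.

import json

class Body:
    """Code part: reasoning and action execution, rebuilt at each place."""
    def __init__(self, soul):
        self.soul = soul

    def step(self):
        self.soul["steps_done"] += 1   # all state lives in the soul

# Old place: run for a while, then extract the soul for transport.
soul = {"goal": "find(service)", "steps_done": 0}
Body(soul).step()
wire = json.dumps(soul)            # soul travels without any code

# New place: the soul is given a new body and the process continues.
revived = Body(json.loads(wire))
revived.step()
print(revived.soul["steps_done"])  # -> 2
```

Because only data crosses the network, this works even on platforms without (weak) code mobility, matching the claim in the text.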
The main achievement of our project is a generic architecture of agentspace and its implementations. The idea of agentspace consists in constructing middleware that provides transparency between heterogeneous agents and heterogeneous services. We define agentspace as an implementation of the language Entish and its semantics on a communication platform. So far we have implemented Entish on Pegaz, our own MAP, and we are completing an Entish implementation on another communication platform, called Hermes, that is based on HTTP+SOAP transport. It seems that the Hermes platform may serve as middleware for Web Service integration. We are also implementing the transport protocol of Hermes in Pegaz and vice versa, so that we will achieve complete interoperability between these two agentspaces. This means that agents (actually their souls) can migrate from one agentspace to the other, as well as communicate with services located in the other agentspace.

2 Agentspace architecture

The idea of agentspace consists in the construction of an open distributed infrastructure that allows heterogeneous applications to be joined as services on the one hand, and to be used by heterogeneous agents on the other hand. A user delegating a task to an agent need not know the locations of the services and resources necessary for task realization. The user expresses the task in our high-level common language called Entish. The agent migrates across the agentspace and communicates with services and other agents, looking for the information, services and resources needed to realize the delegated task.
Since agentspace is an implementation of the language Entish and its intended semantics on a communication platform, a layered architecture seems natural and generic. The architecture consists of three layers: the interaction layer, the agent/service layer, and the language layer. The interaction layer specifies the infrastructure that provides basic functionality for agents and services, such as moving an agent from one place to another and communication between agents and services. This layer is implemented by a communication platform; in our case this is done by Pegaz and Hermes. However, it may be any communication platform, like JADE [6], or a new one built, for example, on top of CORBA or RMI-IIOP.
The second layer, i.e., the agent/service layer, specifies the aspects of agent and service architecture that allow them to evaluate formulas (called situations) expressed in the language Entish, as well as to determine the new situations resulting from performing elementary actions. The agents are equipped with mental attitudes: knowledge, goals, intentions and commitments, represented as Entish formulas. These attitudes serve as data and control parameters of agent behavior. Agents and services execute actions (migration and message exchange) in the interaction layer, whereas the message contents are expressed in Entish. The agent/service layer implements the intended semantics of Entish.
The language layer consists of Entish, a simple version of the language of first-order logic, along with a specification of how to "implement" it for open and distributed use. The implementation follows the idea of "webizing" a language; see T. Berners-Lee [4]. The language describes the "world" (i.e., the agentspace) to be created on the basis of the infrastructure provided by the previous layers. However, this description is purely declarative. Actions are not used in Entish; the formulas describe only the results of performing actions, so that no causal relations can be expressed. The language is sufficient to express desired situations (tasks) by the users as well as by agents and services; however, it cannot explicitly express how to achieve them. This may be done by implementing distributed information services (called InfoServices) where an agent may get to know how to realize the delegated task, or at least get a hint. Usually, as the reply to its query (expressed also in Entish), the agent gets a sequence of intermediate situations to follow. BrokerServices play the role of virtual brokers to facilitate complex task realization. A BrokerService forms, manages and reconfigures a workflow that realizes a special type of complex task. The workflow can be quite sophisticated and consist of a large number of ordinary services, so it may be seen as a virtual organization in agentspace.
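The InfoService interaction described above can be sketched briefly. The canned route table and function names are illustrative assumptions standing in for a real distributed service.

```python
# Sketch of an InfoService query: the agent asks how to reach a goal and
# receives a sequence of intermediate situations, which it then follows.

def info_service(goal):
    # Canned reply: the route of intermediate situations leading to the goal.
    routes = {
        "bought(book)": ["found(shop)", "negotiated(price)", "bought(book)"],
    }
    return routes.get(goal, [])

def follow(route):
    achieved = []
    for situation in route:
        achieved.append(situation)   # assume each intermediate step succeeds
    return achieved

plan = info_service("bought(book)")
print(follow(plan)[-1])   # -> bought(book)
```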
The language is implemented in the second layer by DictionaryServices containing the syntax and new concept definitions. There are three additional types of services, namely SecretaryService, MailService, and BodyService. Let us note that none of these services are system services. They can be implemented and developed independently by different users. It is important that only the "operation type" of any of these services is specified in Entish. Roughly, an operation type is a description of the function performed by a particular service. A service implementation must only satisfy the specification of the operation type.
This paper presents our work in progress; space limitations do not allow us to present the details. The first version of the Entish syntax and semantics is completed. A prototype of agentspace based on Pegaz is already implemented. The implementation of Hermes, i.e., the agentspace based on HTTP+SOAP transport, will be completed shortly. We are now developing (by implementing services) and testing our small agentspace within the frame of the Pegaz Ring, which consists of several research groups.

Acknowledgments

The work was done partially within the framework of ESPRIT project No. 20288 CRIT-2 and KBN project No. 7 T11C 040 20.

References

1. S. Ambroszkiewicz, W. Penczek, and T. Nowak. Towards Formal Specification and Verification in Cyberspace. Presented at the Goddard Workshop on Formal Approaches to Agent-Based Systems, 5-7 April 2000, NASA Goddard Space Flight Center, Greenbelt, Maryland, USA. To appear in Springer LNCS.
2. S. Ambroszkiewicz, O. Matyja, and W. Penczek. "Team Formation by Self-interested Mobile Agents." In Proc. 4th Australian DAI Workshop, Brisbane, Australia, July 13, 1998. Published in Springer LNAI 1544.
3. ATLAS - Agent Transaction Language for Advertising Services. http://www.cs.cmu.edu/softagents/atlas/
4. T. Berners-Lee. www.w3.org/DesignIssues/Webize.html and /DesignIssues/Logic.html
5. DAML. www.daml.org/
6. JADE - Java Agent DEvelopment Framework. http://sharon.cselt.it/projects/jade/
7. McIlraith, S., Son, T. and Zeng, H. "Mobilizing the Web with DAML-Enabled Web Services." www.ksl.stanford.edu/projects/DAML/
8. OIL, Ontology Interchange Language. www.ontoknowledge.org/oil/
9. Web Services Description Language (WSDL). www.w3.org/TR/2001/NOTE-wsdl-20010315
10. S. Willmott, M. Calisti, B. Faltings, S. Macho-Gonzalez, O. Belakhdar, M. Torrens. "CCL: Expressions of Choice in Agent Communication." The Fourth International Conference on MultiAgent Systems (ICMAS-2000).
21st CENTURY SYSTEMS, INC.'S AGENT ENABLED DECISION GUIDE
ENVIRONMENT (AEDGE™)

PLAMEN V. PETROV
21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: plamen@21csi.com

ALEXANDER D. STOYEN
University of Nebraska and 21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: alex@21csi.com

JEFFREY D. HICKS
University of Nebraska and 21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: jeff@21csi.com

GREGORY J. MYERS
21st Century Systems, Inc., Omaha, Nebraska, USA
E-mail: greg@21csi.com

21st Century Systems, Inc.'s Agent Enabled Decision Guide Environment (AEDGE™) is a standardized Commercial Off the Shelf (COTS), DII COE compliant agent architecture that enables complex DSS to be developed as an expansion of the AEDGE core functionality. The AEDGE core consists of Master Server, Entity Framework, Agent Infrastructure and Database Connectivity components. Service-specific DSS tools, such as agents, servers or clients, are quickly and efficiently constructed above the core functionality through the use of common interfaces and data structures. The extender components (Simulation Server, Live Links, Visualization Client, Agent Client, and Data Bridges) serve as templates for extending the application. To facilitate agent interactions, the AEDGE provides a number of local and remote mechanisms for service registration and invocation. In addition, agents can interact, synchronize, and cooperate via Agent Managers, which in turn provide the aggregate agent functionality to the user. The componentized structure of the AEDGE enables multiple levels of product availability that satisfy the needs of users through different levels of product involvement.

1 Introduction

In the past decade we have observed a significant increase in the demand for computer-based decision support systems (DSS), due primarily to the overwhelming availability of data, of varying quality, from multiple sources: networked sensors, databases, archives, web-based applications, and others. Simultaneously, a new branch of distributed computing, based on intelligent, semi-autonomous processes referred to as agents, has been the center of attention because of its flexibility, extensibility, and network-friendliness. 21st Century Systems, Inc. (21CSI), a small company, has pioneered the integration of agent-based computing into DSS applications. We have developed stand-alone and mobile agents and agent architectures to perform individual and team decision support for multiple defense-oriented environments such as AWACS [1], Aerospace Operations Centers, Navy Ship Command Centers [2], etc. The need for a standardized common infrastructure has led us to design an environment where both agents and simulated entities (or representations of real-world assets) are represented as first-class objects capable of interacting with each other. The Agent Enabled Decision Guide Environment (AEDGE™) (see Figure 1) is 21CSI's undertaking to build a common reference framework and a test-bed environment for integrated simulation and agent-based decision support.

AEDGE defines Agents, Entities, Avatars and their interactions with each other and
with external sources of information. This standardized architecture allows
additional components, such as service-specific DSS tools to be efficiently built
upon the core functionality. Common interfaces and data structures can be exported
to interested parties who wish to extend the architecture with new components,
agents, servers, or clients. When the core AEDGE components developed by 21CSI
are bundled with customer-specific components in an integrated environment, a
clean separation of those components, through APIs, is provided.
Figure 1. 21CSI's AEDGE Product Structure



2 Agent Enabled Decision Guide Environment (AEDGE™)

21CSI's DSS product [3] is based on an extensible architecture and a number of standard components that enable simulation and decision support capabilities. AEDGE is designed in an open, DII-COE and CORBA compliant manner. The architecture is unified and allows users to use and extend existing components, as well as to build new, compatible and customized add-ons. The kernel of the architecture consists of four core and five extender components. These define the internal structures, dataflows, and interfaces of the architecture.
• Master Server. Tracks components and matches service providers with service
requesters. The Master Server is a network component of AEDGE that
facilitates connections and interactions among the rest of the AEDGE
components. It provides component registration and tracking services, interface
matching services and component search, identification and connection
services. The Master Server is also responsible for synchronizing simulation
time (and real time) among multiple simulation servers and live links.
• Entity Representation Framework. The Entity Representation Framework is an integral part of AEDGE which provides the basic entities and events for a time-event simulation or live-feed connections. The object-oriented hierarchy of entities represents a wide range of structures, vehicles, platforms, weapons, and sensors. The Framework includes interfaces which allow users to add new entities with new behaviors or with combinations of existing behaviors.
• Agent Infrastructure. The Agent Infrastructure provides the basic inter-agent communication and synchronization mechanisms, as well as the interfaces for agents to use other data sources, such as simulation servers, live data links, databases, etc. A base hierarchy of agents is also provided, which can be extended and customized for a particular user's needs.
• Database Connectivity. AEDGE provides the capability of storing and
retrieving data to/from various databases. The Database Connectivity
components provide generic and specific bridges to a number of proprietary
and public databases. New Database Connectivity modules can be added by
extending the provided bridges and implementing the connectivity interfaces.
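The Master Server's registration and matching role, described in the first bullet above, can be sketched as follows. The class and method names are illustrative assumptions, not the AEDGE API.

```python
# Sketch of Master Server behavior: components register their exported
# services, and service requesters are matched to registered providers.

class MasterServer:
    def __init__(self):
        self.providers = {}   # service name -> component id

    def register(self, component, services):
        # A component posts its exported services on connection.
        for service in services:
            self.providers[service] = component

    def find(self, service):
        # Match a service requester with a provider, if one is registered.
        return self.providers.get(service)

ms = MasterServer()
ms.register("sim-server-1", ["entity-positions", "weather"])
print(ms.find("entity-positions"))   # -> sim-server-1
```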
In addition to these kernel components, extender components define the basic
functionality of information clients and servers and define interfaces for adding new
functionality. These components are, in essence, templates for extending the
platform with new functionality, while maintaining tight integration and efficient
implementation. The following standard AEDGE extender packages are provided:
• Simulation Servers. Simulation Servers model a particular aspect of physical reality in terms of the AEDGE components. In other words, a simulation server maintains a set of entities, their properties, and those of the environment, and models the interactions among them. For example, the vehicle movement model, based on kinematics, affects the position, speed, direction of motion and fuel burn rates of the entities; the weapon models affect the outcome of engagements; the communication models determine how orders and subordinate feedback are distributed. A simulation server may potentially interact with all four core components of AEDGE. It registers with the Master Server and posts its exported services (e.g. providing entity position information). The server manipulates a set of entities (object instances) from the Entity Framework that represent the current view of the world according to that simulator. The simulation server may interact bidirectionally with agents from the Agent Infrastructure, both providing information about the state of the world and receiving recommendations and action requests from agents. Finally, a server may require information from various databases, which is provided through the Database Connectivity component.
• Live Links. Live Links are similar to Simulation Servers in that they provide information about the world to the AEDGE components. However, this information is based on sensor data and reflects the state of the physical world in real time. Thus, the information flow is unidirectional, since we do not yet support actuators placed in the physical world. The live links may provide entity or track information, weather information, or any other state or capability changes. The links can interface with all core AEDGE components, much like the simulation servers can, with the limitation of unidirectional communication.
• Visualization Clients. Visualization Clients are responsible for interactions with the human users. They present data from the AEDGE in a clear and intuitive manner, allowing for simple, yet powerful presentations of complex interdependencies in the simulated/sensor world. Visualization clients interact with all components through bidirectional information flows. They receive information on the simulated entities, their environment and interactions, as well as on agent evaluations and recommendations. The users' interactions with the Visualization Client provide feedback to the AEDGE core components.
• Agent Clients. Agent Clients host one or more intelligent agents, which monitor the simulated world, react to changes in it, and interact among each other and with human users according to their specific agent behaviors. The agent client receives information from the AEDGE core on the state of the world and sends back agent requests and feedback.
• Database Bridges. These are a natural extension of the AEDGE core Database Connectivity. Bridges to characteristics and performance data, weapons performance and effectiveness data, and terrain databases are provided. Interfaces for new database bridges are also provided.
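The extender-template idea running through the list above can be sketched with a minimal common interface. All names here are illustrative assumptions; the point is only the contrast between a Live Link's one-way feed and a Visualization Client's read-only view of the shared core state.

```python
# Sketch of extender components sharing one small core interface: servers
# and links write world state into the core, visualization clients read it.

class Component:
    def connect(self, core):
        self.core = core                   # shared world-state store

class SimulationServer(Component):
    def publish(self, entity, state):
        self.core[entity] = state          # simulation models write state

class LiveLink(Component):
    def feed(self, entity, state):
        self.core[entity] = state          # unidirectional sensor feed

class VisualizationClient(Component):
    def view(self, entity):
        return self.core.get(entity)       # read-only presentation to users

core = {}
sim, link, viz = SimulationServer(), LiveLink(), VisualizationClient()
for c in (sim, link, viz):
    c.connect(core)
sim.publish("tank-7", {"x": 10, "y": 4})
link.feed("sensor-3", {"temp": 21})
print(viz.view("tank-7"))   # -> {'x': 10, 'y': 4}
```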

3 Componentization

The AEDGE architecture enables commercialization through four componentized availability levels that cover the needs of customers with different levels of involvement. The Demo availability level provides execution and evaluation rights for a binary (compiled) distribution of the product. This availability level is aimed at early users or prospective customers. The Enterprise availability level is designed to fulfill the needs of advanced customers who have a specific use for the platform; often it is accompanied by customized extender components. The Research availability level delivers the best configuration for scientists who may use and/or extend the platform for their particular experimental needs. More interfaces to the system are provided to enable the researcher to tap into the rich data sources of the platform. The Development availability level enables advanced users to build components on top of the base platform. It provides all interfaces and some source code examples for key components. Under this level, customers are able to extend the core functionality with their own agents or graphical displays. While none of the availability levels enables redistribution of the platform, the Development availability level permits the distribution of binaries only, with proper disclosure.

4 Conclusion

21st Century Systems, Inc. has developed the Agent Enabled Decision Guide Environment (AEDGE™), an open DII COE and CORBA compliant agent-based environment that enables the development of componentized decision support systems. AEDGE's core functionality can be easily extended with new capabilities by using extender components and bridges to third-party products. A number of commercial and military customers already benefit from this decision support environment in a variety of applications (AWACS Command and Control, Griffin Special Forces Route planner, IDAS Aerospace Operations Center, Navy's Advanced Battle Station, etc.). Customers use AEDGE at multiple levels of component availability to satisfy their specific needs for an intelligent agent DSS architecture.

5 Bibliography

1. Petrov, P. V., Stoyen, A. D. An Intelligent-Agent Based Decision Support
System for a Complex Command and Control Application. Proceedings of the
Sixth IEEE International Conference on Engineering of Complex Computer
Systems, ICECCS 2000, Tokyo, Japan, September 2000.
2. Hicks, J. D., Stoyen, A. D., Zhu, Q. Intelligent Agent-Based Software
Architecture for Combat Performance under Overwhelming Information Inflow
and Uncertainty. Proceedings of the Seventh IEEE International Conference on
Engineering of Complex Computer Systems, ICECCS 2001, Skovde, Sweden,
June 2001.
3. 21st Century Systems, Inc. Extensible Multi-Component DSS Architecture - a
Multi-agent Decision Support Environment. Technical report. 21CSI, Omaha,
NE. January 2001.
PROACTIVENESS AND EFFECTIVE OBSERVER MECHANISMS IN
INTELLIGENT AGENTS

JON PLUMLEY, KUO-MING CHAO, RACHID ANANE AND NICK GODWIN


School of Mathematical and Information Sciences
Coventry University, Coventry CV1 5FB, UK
E-mail: {j.plumley, k.chao, r.anane, a.n.godwin}@coventry.ac.uk

Proactiveness is a necessary property for an autonomous intelligent agent. We believe that to
exhibit this property, agents require an effective and efficient observing mechanism. In this
paper, we propose a novel method that enables agents to observe dynamic change in other
agents. The method incorporates two components: an ORB-based observing mechanism and
a mobile element. This new approach goes beyond the observing mechanism in Java. The
scope for interoperability is improved, and the dynamic generation and transfer of observable
conditions between agents is fully supported by the flexibility of the mechanism. Under this
new scheme, the ORB-based observing mechanism enables agents to monitor any changes in
the component objects of other agents. The mobile agent is used to transfer the filtered
knowledge between agents in order to set the monitoring conditions in a dynamic manner.

1. Introduction

Intelligent Agent technology has attracted a number of researchers and industrialists
in the field of distributed systems [8, 9]. We argue that agent technology can be
useful in the integration of diverse systems in a distributed environment. The
realisation of an agent's proactiveness through the use of a two-level, ORB-based
observer mechanism could reduce the tendency to redundant computation.
This paper proposes a Dynamic Mobile Agent (DMA) with an Observer/Observed
Mechanism (O/OM) operating at two levels: the global level and the object level.
The observation of change allows the proposed agent to change the rule set of its
mobile elements. It is this dynamic behaviour, which is described in detail in the
next section, that makes the agent more proactive and more autonomous, and
therefore better suited to dynamic distributed systems.

2. An Overview of the Proposed Dynamic Mobile Agent Functionality

Two essential elements of the DMA are the Observation strategy and the reasoning
process.


2.1 The Observation Strategy


At the global level, the DMA maintains a table of active objects with its Observer
mechanism by observing any object creation or deletion. The logical integrity
of any decision-making process would be flawed if the client held objects of which
the DMA was not aware. Likewise, a lack of knowledge of deleted objects would
lead to a run-time error if the DMA were to attempt to reference such a deleted object.
A set of meta-rules is held in the Belief Desire Intention (BDI) [7] module, and with
the built-in mobile element rule generator, rules can be generated for each mobile
element. Specific mobile elements can then be dispatched to observe particular
objects. Fig. 1 illustrates the separation of the static and mobile elements of a DMA.
The observer mechanism and rule set of the mobile element allow it to monitor any
changes in the object states and to assess the significance of such changes. With the
knowledge of the observed change, a decision is made (by human intervention) as to
whether any changes in the rule sets are needed. If so, mobile elements with revised
rule sets can be dispatched to continue observation.

Figure 1. A conceptual view of the elements of a Dynamic Mobile Agent

2.2 The reasoning process


Once dispatched with its own rule set, the mobile element observes its designated
object. When a change is observed in the object the mobile element is able to use its
rule set to determine the significance of the change. If appropriate, it will pass a
message to the DMA notifying it of the observed change. The DMA will then be
able to use its global rule set to determine the significance to the whole system. This

may result in the need to change the 'observation rules' of one or more of the mobile
elements. When a change of rule set is appropriate the specific mobile element is
retracted, a new rule set generated, and then a new mobile element is dispatched to
continue observation. This dynamic behaviour ensures that each of the agents
involved in a multi-agent system responds to the dynamics of the system as a
whole and that they are able to cooperate efficiently.
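The retract-and-redispatch cycle above can be sketched in miniature. The fragment below is our own plain-Python approximation of the two levels, not the paper's ORB/Java implementation; all class, method, and rule names are invented:

```python
class MobileElement:
    """Observes one object with its own rule set and forwards only
    significant (filtered) changes to the DMA, never the raw data."""
    def __init__(self, dma, rules):
        self.dma, self.rules = dma, rules

    def on_change(self, obj_id, old, new):
        for predicate, label in self.rules:   # a rule is a (predicate, label) pair
            if predicate(old, new):
                self.dma.notify(obj_id, label)

class DynamicMobileAgent:
    """Global level: tracks object creation/deletion and dispatches mobile
    elements; retracts and re-dispatches an element when its observation
    rules must change."""
    def __init__(self):
        self.elements = {}   # obj_id -> MobileElement
        self.events = []     # filtered notifications received so far

    def dispatch(self, obj_id, rules):
        self.elements[obj_id] = MobileElement(self, rules)

    def retract_and_redispatch(self, obj_id, new_rules):
        # no recompilation: the element is simply replaced with a revised rule set
        self.dispatch(obj_id, new_rules)

    def on_deleted(self, obj_id):
        # global observation: forget deleted objects to avoid dangling references
        self.elements.pop(obj_id, None)

    def notify(self, obj_id, label):
        self.events.append((obj_id, label))
```

A rule here is simply a predicate paired with a label; the mobile element forwards only the changes its rules mark as significant, which is the filtering the paper relies on to cut communication volume.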

3. Architecture of a Dynamic Mobile Agent

The proposed mechanism is supported by a three-level architecture. The three levels
and their appropriate mechanisms are described below.

3.1 The Three-level Architecture


The three levels of the system architecture are the communication mechanism, the
mental model, and the interaction with the observer mechanism. The communication
mechanism involves message passing built upon the Object Request Broker (ORB)
principles. It transports the agent's message using the syntax of an Agent
Communication Language (ACL) [4] to the recipient, which subsequently parses the
message. The mental model interprets the content of the message, reasons with it
and asks the underlying application to perform the task. The underlying application
returns the result to the mental model. The mental model generates the appropriate
reply and forwards it to the requesting agent. Remote method invocation is used to
invoke the functions in the application at the lowest level. The interface between the
application and the mental model uses the ORB in order to support applications that
are implemented in different programming languages.
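As a toy illustration of these three levels, the sketch below wires a communication layer, a mental model, and an application together; the message structure and all names are our own simplification, not FIPA ACL or a real ORB:

```python
class Application:
    """Lowest level: the functions invoked on behalf of the agent
    (in the paper, via remote method invocation over the ORB)."""
    def perform(self, task, args):
        if task == "add":
            return sum(args)
        raise ValueError("unknown task: " + task)

class MentalModel:
    """Middle level: interprets message content, reasons with it, and asks
    the underlying application to perform the task."""
    def __init__(self, application):
        self.application = application

    def interpret(self, content):
        result = self.application.perform(content["task"], content["args"])
        return {"performative": "inform", "content": result}

class CommunicationLayer:
    """Top level: transports and parses agent messages (a bare dict plays
    the role of the ACL message here)."""
    def __init__(self, mental_model):
        self.mental_model = mental_model

    def receive(self, message):
        # parse the message, hand the content down, return the generated reply
        return self.mental_model.interpret(message["content"])
```

Usage mirrors the flow described above: a request message arrives at the communication layer, the mental model delegates the task to the application, and the reply travels back up.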

3.2 The Mental Model


The Belief Desire Intention (BDI) module parses the incoming message from
the ACL module and reasons with its content. The BDI then invokes appropriate
methods. The BDI is a reasoning mechanism that interprets the information,
motivational, and deliberative states of the agents. "These mental attributes
determine the system's behaviour and are critical for achieving adequate or optimal
performance when deliberation is subject to resource bounds" [7].

4. Conclusions and Future Work

4.1 Discussion
Wooldridge and Jennings [9] identify proactiveness as a key property of an
intelligent agent. A proactive agent is able to exhibit goal-directed behaviour by
taking the initiative through its ability to observe the internal and external
environment. An effective and efficient observation mechanism is required for the
agents to be proactive. In this respect the A-Design system [3] is a proactive system
requiring a constant flow of information, and a failure to note that the object being
observed has been deleted could cause system errors. The mobile agent has been
widely used in the area of information retrieval over the internet [2, 6]. We exploit
this feature to work with our global observation mechanism in order to ensure that
the system maintains a consistent state. The JAM agent [5] supports agent mobility
with BDI representation. It provides a flexible representation for mobile agents. We
use this feature and apply it in agent observations. Ahmad and Mori [1] proposed
using mobile agents to push and pull data to cope with ever-changing situations in
information services and to reduce access time for the users. Our proposed method
provides a more flexible approach that allows the intelligent agent to generate new
monitoring rules as required, and introduces the ORB observing mechanism to cater
for changes to the objects in the environment.

4.2 Conclusion
The main contribution of this work is the proposal of a method that supports an
intelligent agent's proactiveness with an observing mechanism that operates at two
levels: global and local (object level). The global observation allows the agent to be
aware of any changes such as creation and deletion of objects, thus enhancing the
robustness of the system. The local observer, associated with the BDI and mobile
element generator, enables the observer agent to generate and dispatch an
autonomous mobile element to observe the state of a particular object. Changes to
the monitoring rules in the mobile element can be made when the need arises
without recompiling the code. The architecture of the system enables the intelligent
agents to be autonomous and to reflect the dynamic environment. The volume of
communication between agents can be reduced, because the mechanism in the
mobile element only sends filtered information to the agent rather than the raw data.
The ORB observer mechanism also contributes to the reduction of
communication traffic, because it is the server side (the observable agent) that
pushes data out to the client side (the observer agent). Thus, the observer agent
does not need to monitor constantly the status of the objects in the observable agent. This

then, is an effective method of maintaining system consistency in a dynamic
environment where the objects and monitoring rules may change frequently.
The agent framework has been partially implemented. A simple example was
used to test the ORB observing mechanism and the mobile element in order to
evaluate its feasibility. A further implementation of these components is needed in
order to carry out a demonstrable case study.

References
1. Ahmad H. F., Mori K., Push and pull of information in autonomous information
service system, Proceedings 2000 International Workshop on Autonomous
Decentralized System, IEEE Comput. (2000), pp. 12-18.
2. Cabri G., Leonardi L., Zambonelli F., Agents for information retrieval: issues of
mobility and coordination, Journal of Systems Architecture, 46(15), (2000)
pp. 1419-33.
3. Campbell, M. I., Cagan, J., Kotovsky, K., A-Design: An Agent-Based
Approach to Conceptual Design in a Dynamic Environment, Journal of
Research in Engineering Design, 11(3), (1999), pp. 172-192.
4. FIPA, Agent Communication Language Specifications 97, http://www.fipa.org,
(1997).
5. Huber M. J., JAM: a BDI-theoretic mobile agent architecture. Proceedings of
the Third International Conference on Autonomous Agents. ACM. (1999),
pp.236-43.
6. Lieberman H., Selker T., Out of context: computer systems that adapt to, and
learn from, context, IBM Systems Journal, 39(3-4), (2000), pp.617-32.
7. Rao, A. S. and Georgeff, M. P., BDI Agents: From Theory to Practice,
Proceedings of the First International Conference on Multi-Agent Systems
(ICMAS-95), (1995), pp. 312-319.
8. Shen, W. and Norrie, D. H., Agent-Based Systems for Intelligent
Manufacturing: A State-of-the-Art Survey, International Journal of Knowledge
and Information Systems, 1(2), (1999), pp. 129-156.
9. Wooldridge, M. and Jennings, N. R., Agent Theories, Architectures, and
Languages: a Survey, Intelligent Agents, ed. by Wooldridge, M., Jennings, N.
R., (1995), pp. 1-22.
CHAPTER 3
LEARNING AND ADAPTATION
PARRONDO STRATEGIES FOR ARTIFICIAL TRADERS

MAGNUS BOMAN
Swedish Institute of Computer Science, Box 1263, SE-164 29 Kista, Sweden
E-mail: mab@sics.se

STEFAN J. JOHANSSON
Department of Software Engineering and Computer Science,
Blekinge Institute of Technology, Box 520, SE-372 25, Ronneby, Sweden
E-mail: sja@bth.se

DAVID LYBACK
Financial Market Systems, OM AB, SE-105 78 Stockholm, Sweden
E-mail: david.lyback@omgroup.com

On markets with receding prices, artificial noise traders may consider alterna-
tives to buy-and-hold. By simulating variations of the Parrondo strategy, using
real data from the Swedish stock market, we produce first indications of a buy-
low-sell-random Parrondo variation outperforming buy-and-hold. Subject to our
assumptions, buy-low-sell-random also outperforms the traditional value and trend
investor strategies. We measure the success of the Parrondo variations not only
through their performance compared to other kinds of strategies, but also rela-
tive to varying levels of perfect information, received through messages within a
multi-agent system of artificial traders.

Keywords: Artificial trader, Parrondo strategy, on-off intermittency,
multi-agent system, artificial stock market

1 Introduction

Stock markets to an ever-increasing extent allow for trading controlled by
artificial agents, or more generally, program trading. For instance, the Swedish
Securities Dealers Association finds that it has no objections to program trad-
ing, and already in 1992 declared that only the means to exploiting unlawful
quote manipulation resulting from program trading should be controlled [19].
Nasdaq, in a communication to their members, writes [17]:
Recent events show that the way some stocks are traded is changing
dramatically, and the change in trading methods may affect price
volatility and cause increased trading volume. This price volatility
and increased volume present new hazards to investors, regardless of
whether trading occurs on-line or otherwise.


In general, stock markets do not apply restrictive policies to program trading.
A primary objective of the market place operator is to promote a high liquidity
in the traded instruments. This can be done through reducing the transaction
costs: one typical implicit cost is lack of orders, leading to wide spreads, or
non-existing quotes. The operators thus have reasons to encourage inbound
orders. As long as these are authenticated, and the network can keep up
disseminating the market info in a proper fashion so that the situation stays in
line with the overall aim of up-keeping a fair and orderly market, the operator
should have nothing against a large number of valid orders per second being
placed by artificial agents.
Hence, we feel motivated to relate certain theoretical results from physics
to artificial traders of the future. We do not assume markets populated
solely by artificial traders, however. If we did, we could move on to claim
that the Efficient Market Hypothesis and rational choice theory yield efficient
equilibria [14], since the vast empirical evidence against such assumptions is
directed almost exclusively towards human traders [13]. We instead believe
that artificial traders have gradually and almost unnoticeably slipped onto
the same markets as human traders, and we will treat them as speculating
noise traders (traders with non-rational expectations and potentially zero
intelligence) [6]. Artificial stock markets possibly exhibit volatility (i.e., standard
deviation) of a different kind than ordinary excess-volatility markets [2], as
argued, e.g., in the ban of crawlers from the Internet auction site eBay [20].
In practice, Internet marketplaces supply information on their acceptance of
artificial traders and other softbots in a file named robots.txt, and on Internet
markets that do allow for softbots, their behavior is usually monitored in
some way, in order to mitigate the effects of speculation through unconventional
methods such as denial-of-service attacks. Program trading has also in
general reached a level where flocking behavior worries policy makers [7]. On an
artificial stock market, in contrast to an ordinary market [16], active portfolio
management should also incorporate the price dynamics, because of the intense
trading. This factor has also led to transaction fee policies being radical on
some artificial trader markets. Since significant transaction fees can render
the Parrondo strategies described in sections 2 and 3 below useless, the existence
of markets with low or no transaction fees is important to our purpose.
We will consider portfolios on markets with receding prices. We will represent
artificial traders as agents in a multi-agent system (MAS), in which agents
affect each other's behavior through trusted message passing, as explained in
section 3. In the MAS setting, variations of Parrondo strategies are then sub-
ject to experiments on a simulation testbed, the results of which are reported
in section 4. In the last section, we present directions for future research.

2 The Parrondo Strategy in Games

The flashing ratchet (or Brownian motor) [1] is a molecular motor system
consisting of Brownian particles moving in asymmetric potentials, subject to a
source of non-equilibrium [18]. In its game-theoretical formulation [9], the flashing
ratchet can be described in terms of two games (A and B) in which biased
coins are tossed.

• Game A is a single coin game in which the coin comes up heads (= win)
50 − ε per cent of the time (for some small ε > 0) and results in tails the
rest of the time (Parrondo himself [18] used ε = 0.005, and the constraints
are described, e.g., at seneca.fis.ucm.es/parr/GAMES/discussion.html).

• Game B involves two coins. The first coin comes up heads 10 − ε per
cent of the time, and the second coin 75 − ε per cent of the time. Which
coin to flip is decided by looking at the capital of the player. If it is
divisible by 3, the first coin is flipped; otherwise the second coin is used.

Clearly, game A is a losing game, but the same holds for game B. This is
because the player is only allowed to flip the second coin if her capital is not
a multiple of 3. The latter situation comes up more often than every third
time: The player will start with the unfavorable coin, which will very likely
place her at −1. She will then typically revert to 0, and then back again to
−1, and so on. Whenever the unfavorable coin lands tails twice in succession,
however, she will end up with capital −3, and then the pattern will repeat,
leading to −6, etc. Hence, game B is a losing game, just like game A.
The Parrondo strategy for playing games A and B repeatedly is to choose
randomly which game to play next. Somewhat counter-intuitively, this dis-
crete representation of a ratchet yields a winning game.
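The paradox is easy to check numerically. The sketch below is our own illustration, not taken from the paper: it simulates both games and their random mixture with ε = 0.005, and individually each game loses on average while the mixture gains.

```python
import random

EPS = 0.005  # Parrondo's own bias parameter

def play_a(capital, rng):
    """Game A: a single coin that wins 50 - eps per cent of the time."""
    return capital + (1 if rng.random() < 0.50 - EPS else -1)

def play_b(capital, rng):
    """Game B: a bad coin (10 - eps %) when capital is divisible by 3,
    a good coin (75 - eps %) otherwise."""
    p = (0.10 - EPS) if capital % 3 == 0 else (0.75 - EPS)
    return capital + (1 if rng.random() < p else -1)

def play_mix(capital, rng):
    """The Parrondo strategy: choose game A or B uniformly at random each round."""
    return play_a(capital, rng) if rng.random() < 0.5 else play_b(capital, rng)

def mean_final_capital(strategy, steps=200, trials=5000, seed=1):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        capital = 0
        for _ in range(steps):
            capital = strategy(capital, rng)
        total += capital
    return total / trials

mean_a = mean_final_capital(play_a)
mean_b = mean_final_capital(play_b)
mean_mix = mean_final_capital(play_mix)
print(mean_a, mean_b, mean_mix)  # A and B drift down; the random mixture drifts up
```

The positive drift of the mixture is small (roughly +0.016 per play in expectation, by solving the Markov chain on capital mod 3), so many trials are needed to see it clearly.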

3 The Parrondo Strategy in Artificial Trading

Artificial trading and herd behavior have often been studied through bottom-up
simulations, as in Sugarscape [8] or the Santa Fe artificial stock market [2].
We have concentrated on speculating investors that use variations of the Parrondo
strategy. Table 1 briefly describes these strategies, as well as some
control strategies. Value investors (exemplified by BLSH in Table 1) seek
profits, while trend investors (exemplified by BHSL in Table 1) try to identify
upward and downward movers and adjust their portfolios accordingly [10]. In

Strategy                      Description
Buy-and-hold (BaH)            The buy-and-hold strategy here acts as a control
                              strategy that trades no stocks.
Random                        This strategy trades stocks randomly.
Insider                       The insider gets quality ex ante information about
                              some stocks, on which it may react before the market.
Buy low, sell high (BLSH)     This Markovian value investor strategy monitors
                              whether the stock increased or decreased in value
                              during the latest time interval. If the value increased,
                              it sells the stock; if the value dropped, it buys the stock.
Buy low, sell random (BLSR)   Like BLSH, except BLSR randomly chooses what
                              stock to sell.
Buy random, sell high (BRSH)  Like BLSH, except BRSH randomly chooses what
                              stock to buy.
Buy high, sell low (BHSL)     This Markovian trend investor strategy is the
                              opposite of BLSH.

Table 1. The artificial trading strategies.

our simulations, the value investor proportion is larger but, this significant
fact notwithstanding, our objective is not to study how it affects the market
dynamics. Instead, we augment the Parrondo variations with market information,
in the form of agent messages. The agents may thus influence each other
by passing hints on what to buy, or what to sell. A message is treated by
the receiver as trusted information, and the receiving agent will act upon the
content of the message, interpreting it as normative advice. A message can be
interpreted as perfect (or even insider) information, randomized for the sake
of our experiment.
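One way to realize such a randomized hint is sketched below; this is our own formulation, since the paper does not specify a message format:

```python
import random

def maybe_hint(stocks, today, tomorrow, p, rng):
    """With probability p, leak perfect ex ante information about one
    randomly chosen stock: whether it will rise or fall tomorrow. The
    receiver treats the message as normative advice."""
    if rng.random() >= p:
        return None          # no hint arrives this round
    s = rng.choice(stocks)
    return (s, "buy" if tomorrow[s] > today[s] else "sell")
```

A receiving agent would simply add the named stock to its buy or sell list for the day, overriding its own strategy for that stock.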
We considered a portfolio of ten stocks with receding prices, assumed to be
unaffected by agent trading. The data used is real daily data from the Swedish
stock market, from the one-year period starting March 1, 2000. The stocks
are listed in Table 2, and in Figure 1 their development is shown. Values have
been normalized to 100 for the start of the period. The strategies initially
held $10000 value of each stock. One trade was done per day, in which the
strategy decided what to sell and what to reinvest in. Three different levels
of hint probabilities were used: 1%, 5%, and 10% chance of receiving a hint.
A 1% level means that the strategy will on average receive a hint for one of
the ten stocks every tenth day of trading. When choosing randomly what to
buy and what to sell, 10 integers were randomized and taken modulo 10 in

Stock            Business area   Value
ABB              Industrial      83.33
Allgon           Telecom         24.55
Boliden          Mining          37.19
Enea Data        IT              20.09
Hennes&Mauritz   Clothes         60.40
Ericsson         Telecom         36.36
OM               Financial       48.67
Scania           Industrial      77.80
Securitas        Security        80.35
Skandia          Insurance       53.22

Table 2. The ten stocks used in the experiment, and their normalized values on March 1,
2001.

order to get (at most 10) stocks that were then traded. For each of the stocks
sold, a percentage p ∈ [0.2, 0.8] of the possession was sold. The values of all
sales were then reinvested according to their relative part in a similar selection
process. If the strategy did not get at least one stock to buy and one to sell, it
held its possessions until the next day. Each strategy was evaluated towards
the same set of stocks and the same set of hints (if used). In order to even
out differences due to the randomness of the trading, the simulations were
repeated 1000 times. Alignment and docking experiments are encouraged,
and specifics are available upon request.
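The daily trading round described above can be sketched as follows. This is a simplification under our own assumptions: holdings are share counts, and sale proceeds are reinvested in equal parts rather than by the paper's relative-part selection.

```python
import random

def blsh_decisions(prev, curr):
    """Buy low, sell high (the Markovian value investor of Table 1):
    sell stocks that rose and buy stocks that fell in the latest interval."""
    sell = [s for s in curr if curr[s] > prev[s]]
    buy = [s for s in curr if curr[s] < prev[s]]
    return buy, sell

def blsr_decisions(prev, curr, rng):
    """Buy low, sell random: buy as BLSH does, pick sales at random."""
    buy, _ = blsh_decisions(prev, curr)
    sell = [s for s in curr if rng.random() < 0.5]
    return buy, sell

def trade_one_day(holdings, prices, buy, sell, rng):
    """One daily round: sell a random fraction p in [0.2, 0.8] of each stock
    chosen for sale, then reinvest the proceeds across the stocks chosen
    for purchase. Holdings are share counts."""
    if not buy or not sell:
        return holdings  # hold all possessions until the next day
    cash = 0.0
    for s in sell:
        p = rng.uniform(0.2, 0.8)
        cash += holdings[s] * p * prices[s]
        holdings[s] *= 1 - p
    for s in buy:
        holdings[s] += cash / len(buy) / prices[s]
    return holdings
```

Since all sale proceeds are reinvested the same day at the same prices, a single round leaves the total portfolio value unchanged; only the composition shifts.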

4 Experiment Results

As can be seen in Figure 2, most of the strategies over the 252 trading days
followed the major trends of the market and none of them managed to main-
tain the initial portfolio value. There was considerable movement, as shown
in the blowup of the last days of trading in Figure 3, but also significant
differences between outcomes (Table 3). Buy-low-sell-random was the only
strategy that outperformed Random. Strategies also differed with respect to
volatility. For instance, BLSH was inferior to all strategies for most of the
year. However, around day 100 through day 120, it outperformed all other
strategies. As expected, BHSL amplified the receding trend.
In spite of its poor performance, there are still many reasons for pol-
icy makers and speculators to use buy-and-hold even on supposedly receding
markets. One reason is to declare and uphold a clear company investment

Figure 1. The development of the values of the stocks used in the experiment.

Strategy Value
BLSR 6110.88
Random 5524.60
BaH 5383.40
BLSH 5338.15
BHSL 5202.71
BRSH 5140.29

Table 3. Strategy results without hint probabilities (strategies are explained in Table 1).

policy, another is that frequent re-investments could be undesirable (e.g., due
to transaction fees). Nevertheless, we feel that BLSR produced good enough
results to merit further study. For now, we will be content with comparing it
to various levels of hint probabilities, however. From those results, shown in
Figure 4, we see that BLSR is comparable to the insider strategy with a hint
probability of approximately 4%.

Figure 2. The development of the values of the experiment portfolios.

5 Conclusions and Directions for Future Research

We have shown that the use of certain Parrondo-based strategies may im-
prove the performance of artificial traders. Our model is simplistic, in the
following respects. The messages sent must be allowed to have richer content,
and may be indicators or signals, rather than simple instructions. Instead
of interpreting received messages as normative advice, trust could somehow
be represented. For instance, a probability distribution may be associated
with messages, and trust assignments can then be represented as second-order
probabilities. Market norms should be modeled and adhered to by the
traders [3]. Message content can then depend on market dynamics. Artificial
traders have two ways of communicating such dynamics. Firstly, they may
observe and recognize other traders and try to model them with the intent
of communication and possibly co-operation [5]. Secondly, they may monitor
prices, as in the Trading Agent Competition [4] (see tac.eecs.umich.edu/) or
artificial stock market approaches [11]. Naturally, each trader itself also observes
the market dynamics. We have placed no reasoning facilities in the trader at
this stage, and so the trader cannot act on sense data.

Figure 3. Magnification of the last days of trading.

Another simplification is that our models should incorporate transient phenomena,
including not only crashes and bubbles, but also transient diversity, i.e. we must
find the homogeneity and heterogeneity drivers in our MAS [15]. A related point
in need of further investigation is learning in artificial traders [12].
For purposes of prediction, our results are almost useless, since we cannot
in general design in advance a portfolio of stocks, the prices of which are all re-
ceding. In rare circumstances, such as during the period of almost universally
receding prices of IT stocks in the autumn of 2000, ex ante portfolios could
relatively easily be assembled, and then Parrondo variations would indeed be
an interesting alternative to buy-and-hold. For our experiment, the real data
was chosen ex post from a large sample space with the criterion that each
stock should have a saw-tooth receding price curve.
While the above shortcomings together render our results useless for prac-
tical purposes, they should be seen as directions for future research. We in-
tend to pursue the important question of strategy programming for artificial
traders, as we feel that such programming will be of increasing importance
in the future. By replacing our unrealistic assumptions one by one, we hope
to achieve our ultimate goal of reasonably efficient strategies on real-time
markets with non-linear dynamics.

Figure 4. The development of the values with three different levels of hint probabilities.

Acknowledgements

Magnus Boman was in part financed by a NUTEK (VINNOVA) grant within
the PROMODIS (Programming modular and mobile distributed systems) programme.
Stefan J. Johansson was financed by the KK-foundation. David
Lyback was supported by a research assignment in the OM corporation. The
authors wish to thank Fredrik Liljeros, as well as their respective colleagues,
for comments on drafts.

References

1. A. Ajdari and J. Prost, Mouvement Induit par un Potentiel Periodique
de Basse Symetrie: Dielectrophorese Pulsee, C. R. Acad. Sci. Paris 315,
1635 (1992).
2. W. B. Arthur, J. Holland, B. LeBaron, R. Palmer, and P. Tayler, Asset
Pricing under Endogenous Expectations in an Artificial Stock Market, in
The Economy as an Evolving Complex System II, eds. W. B. Arthur, S.
Durlauf, and D. Lane, pp. 15-44, Addison-Wesley, Reading, MA, 1997.

3. M. Boman, Norms in Artificial Decision Making, Artificial Intelligence
and Law 7, 17 (1999).
4. M. Boman, Trading Agent Competition, AgentLink News 6, 15 (2001).
5. M. Boman, L. Brouwers, K. Hansson, C-G. Jansson, J. Kummeneje, and
H. Verhagen, Artificial Agent Action in Markets, Electronic Commerce
Research 1, 159 (2001).
6. J. B. De Long, A. Shleifer, L. H. Summers, and R.J. Waldmann, The Sur-
vival of Noise Traders in Financial Markets, J. of Business 64, 1 (1991).
7. V. M. Eguiluz and M. G. Zimmermann, Transmission of Information and
Herd Behaviour: An Application to Financial Markets, Phys. Rev. Lett.
85, 5659 (2000).
8. J. M. Epstein and R. Axtell, Growing Artificial Societies (Brookings,
Washington D.C., 1996).
9. G. P. Harmer and D. Abbott, Losing Strategies can Win by Parrondo's
Paradox, Nature 402(6764), 864 (1999).
10. P. Jefferies, M. Hart, P. M. Hui, and N. F. Johnson, From Market Games
to Real-World Markets, cond-mat/0008387 (2000).
11. B. LeBaron, Agent Based Computational Finance: Suggested Readings
and Early Research, Economic Dynamics and Control 24, 679 (2000).
12. M. Lettau, Explaining the Facts with Adaptive Agents: The Case of
Mutual Fund Flows, Economic Dynamics and Control 21, 1117 (1997).
13. T. Lux, Herd Behaviour, Bubbles and Crashes, The Economic Journal
105, 881 (1995).
14. T. Lux and M. Ausloos, Market Fluctuations I: Scaling, Multi-Scaling and
Their Possible Origins, in Theories of Disasters: Scaling Laws Governing
Weather, Body and Stock Market Dynamics, eds. A. Bunde and H-J.
Schellnhuber, Springer-Verlag, Berlin, in press.
15. D. Lyback, Transient Diversity in Multi-Agent Systems, DSV Report
99-X-097, Royal Institute of Technology, Stockholm, 1999.
16. S. Maslov and Y-C. Zhang, Optimal Investment Strategy for Risky As-
sets, Theoretical and Applied Finance 1(3), 377 (1998).
17. NASD Regulation Issues Guidance Regarding Stock Volatility, NASD
Notice to Members 99-11, 1999.
18. J. M. R. Parrondo, J. M. Blanco, F. J. Cao, and R. Brito, Efficiency of
Brownian Motors, Europhys. Lett. 43(3), 248 (1998).
19. Swedish Securities Dealers Association, Recommendations on Program
Trading and Related Topics, May 19, 1992 (In Swedish).
20. T. Wolverton, Judge Bars eBay Crawler, CNETNews.com, May 25, 2000.
BDI MULTIAGENT LEARNING BASED ON FIRST-ORDER
INDUCTION OF LOGICAL DECISION TREES

ALEJANDRO GUERRA HERNANDEZ, AMAL EL-FALLAH SEGHROUCHNI
AND HENRY SOLDANO
Universite Paris 13, Laboratoire d'Informatique de Paris Nord, U.P.R.E.S.-A.
CNRS 7030, Institute Galilee, Avenue Jean-Baptiste Clement, Villetaneuse,
93430, France. Email: {agh,elfallah,soldano}@lipn.univ-parisl3.fr

This paper is about learning in the context of Multiagent Systems (MAS) composed
of intentional agents, i.e. agents that behave based on their beliefs, desires, and
intentions (BDI). We assume that MAS learning differs in subtle ways from the
general problem of learning, as defined traditionally in Machine Learning (ML).
We explain how BDI agents can deal with these differences and introduce the
application of first-order induction of logical decision trees to learning in the BDI
framework. We exemplify our approach by learning the conditions under which
plans can be executed by an agent. Key words: MAS learning, BDI systems,
Logical Decision Trees.

1 Introduction

We are interested in learning in the context of Multiagent Systems (MAS)
composed of intentional agents, i.e. BDI agents. In this paper, we deal with
the issue of adding learning competences to a BDI architecture, which leads us
to consider learning methods applied to systems whose behavior is explained in
terms of beliefs, desires, intentions (BDI propositional attitudes), and partial
hierarchical plans, as proposed in practical rationality theories*, and that can
be characterized as autonomous, reactive, pro-active and social [15].
Usually, MAS learning 10,14 is characterized as the intersection of Machine
Learning (ML) and Distributed Artificial Intelligence (DAI). The motivations for
this are reciprocal: i) the MAS community is interested in learning, because it
seems to be central to different properties defining agents; and ii) an extended
view of ML dealing with agency and MAS can improve the understanding of
general principles underlying learning in natural and artificial systems.
A learning agent9 can be conceptually divided into four components: i)
a learning element responsible for making improvements executing a learning
process; ii) a performance element responsible for taking actions, e.g. the
agent without learning competences; iii) a critic responsible for providing
feedback; and iv) a problem generator responsible for suggesting actions that
will lead to informative experiences.
Then, the design of the learning element, and consequently the choice of a
particular learning method, is affected by five major issues: i) which components
of the performance element are to be improved? ii) what representation is
used for these components? iii) what feedback is available? iv) what prior
information is available? v) is it a centralized or decentralized learning case?
In this paper we expose the way BDI agency can be used to conceive
learning agents able to operate in MAS, using induction of logical decision
trees. The paper is organized as follows: Section 2 briefly recalls BDI
architectures, introducing an example used in the rest of the paper.
Section 3 presents our approach to MAS learning; it considers the design of a
BDI learning agent, the learning method used (first-order induction of logical
decision trees), and examples. Section 4 focuses on discussion, related and
future work.

2 B D I Agency

BDI theories of agency are well known. Different aspects of intentionality and
practical reasoning have been studied formally using extensions of modal and
temporal logics 5,11,15. The goal of this section is just to recall the way BDI
architectures work, to complement the discussion on learning.
The examples in this paper come from a very simple scenario proposed origi-
nally by Charniak and McDermott 2 (see Figure 1). This scenario is composed
of a robot with two hands, situated in an environment where there are: i)
a board; ii) a sander; iii) a paint sprayer; and iv) a vise. Different goals can be
proposed to the robot, for example, to sand the board or even to get itself painted,
which introduces the case of incompatible goals, since once painted, the robot
stops being operational for a while. The robot has different options to achieve
its goals: it can use both of its hands to sand the board, for example, or
use the vise and one hand. Later, another robot will be introduced into
the environment to deal with examples of different interactions.
In general, a BDI architecture contains four key data structures: beliefs,
desires or goals, intentions, and a plan library.
Beliefs represent information about the world. Each belief is represented
symbolically as a ground literal of first-order logic. Two activities of the
agent update its beliefs: i) the perception of the environment; and ii) the
execution of intentions. The scenario shown in Figure 1 can be represented by
the following beliefs of robot r1: somewhere(sander), somewhere(board),
somewhere(sprayer), free-hand(left), free-hand(right), operational(r1).
Desires, or goals, correspond to the tasks allocated to the agent and are
usually considered logically consistent. Two kinds of desires are considered:
i) to achieve a desire expressed by a belief formula, e.g. !sanded(board); and

Environment: a board, a sander, a paint sprayer, a vise, and robots r1 and r2.

Plan: p0
Trigger: !sanded(X)
Context: free-hand(Y) and somewhere(X)
Body: pickup(X); put-in-vise(X); !sand-in-vise(X)
Figure 1. The scenario for examples and a typical plan.

ii) to test a situation expressed as a disjunction and/or conjunction of belief
formulae.
Plans have several components. The invocation condition specifies, as a
trigger event, the circumstances under which the plan should be considered.
Four types of trigger events are possible: the acquisition of a new belief, the
removal of a belief, the reception of a message, and the acquisition of a new
(sub)goal. The context specifies, as a situation formula, the circumstances
under which the execution of the plan may start. The body of a plan is repre-
sented as a tree where nodes are labeled with states and arcs with actions or
subgoals, specifying a course of action. The maintenance conditions describe
the circumstances that must remain to continue the execution of the plan.
Finally, a set of internal actions is specified for the cases of success and failure
of the plan. Figure 1 shows a simplified plan p0 to sand an object X. The
last branch in the plan body is a subgoal, because the robot will need to take
the sander to do its work, which involves another plan.
An intention is implemented as a stack of plan instances. In response to
an event, the agent must find a plan instance to deal with it. Two cases are
possible: i) if the event is an external one, an empty stack is created and the
associated plan is pushed onto it, e.g. if the event is !sanded(board), the
plan p0 is considered, possibly among others, and the substitution (board/X,
left/Y) makes it executable; this substitution and p0 are then used to form
a new intention stack identified as ip0. ii) If the event is an internal one, it
means it was produced by some already existing intention. The plan instance
generated for the internal event is pushed onto the intention stack that gener-
ated the event, e.g. when executing ip0, the last branch in the plan body
is a subgoal, so the event (!sand-in-vise(X), ip0) will be posted and
processed as usual, but the intention formed will be pushed on top of ip0.
A BDI interpreter 3 manipulates these structures, selecting appropriate
plans based on beliefs and desires, structuring them as intentions and
executing them.
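The event-handling cycle just described can be sketched in a few lines of Python. This is an illustrative toy, not the interpreter used by the authors: the Plan and Interpreter classes and the string encoding of beliefs and events are our own assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Plan:
    name: str
    trigger: str                      # trigger event, e.g. "!sanded"
    context: Callable[[set], bool]    # situation formula over current beliefs
    body: list                        # sequence of actions and "!subgoal" strings

@dataclass
class Interpreter:
    beliefs: set
    library: List[Plan]
    intentions: list = field(default_factory=list)   # each intention is a stack of plans

    def handle(self, event: str, parent: Optional[list] = None) -> Optional[Plan]:
        """Select an applicable plan for `event` and intend it."""
        for plan in self.library:
            if plan.trigger == event and plan.context(self.beliefs):
                if parent is None:            # external event: new intention stack
                    self.intentions.append([plan])
                else:                         # internal event: push onto posting stack
                    parent.append(plan)
                return plan
        return None

p0 = Plan("p0", "!sanded",
          lambda b: "free-hand(left)" in b and "somewhere(board)" in b,
          ["pickup", "put-in-vise", "!sand-in-vise"])
bdi = Interpreter({"free-hand(left)", "somewhere(board)"}, [p0])
bdi.handle("!sanded")                 # external event: creates the intention stack ip0
```

An internal subgoal event would instead be handled with `parent=bdi.intentions[0]`, pushing the new plan instance onto the existing stack, as in the ip0 example above.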

3 B D I Learning Agents

We consider that learning in the MAS context differs in subtle ways from
learning in other ML situations. There are two sources for these differences:
i) the flexible autonomous behavior defining agency introduces some consid-
erations which are not present in traditional software 4,8, e.g. autonomy and
pro-activeness; ii) MAS environments are usually complex and dynamic.
This suggests that the same mechanisms controlling the behavior of the
agent should be used to control learning processes, i.e. learning processes should
be considered as actions of the agent. In particular: i) agents have to be able
to identify situations where learning is necessary (pro-activity); ii) agents have
to evaluate and prioritize their learning processes (action selection); iii) even-
tually, agents should be able to cope with simultaneous learning processes,
addressing the different learning goals found by the agent; and iv) the results of
the learning processes should be incorporated into the agent architecture.
We have observed that the applications and challenges of MAS for ML are
indicative of a hierarchy of MAS levels of different complexity, which makes it
useful to adopt a bottom-up approach in MAS learning research towards fully
distributed MAS learning. The levels are as follows: i) in the first level, agents
learn from the observation of their environment without direct interaction
with other agents (centralized learning); ii) in the second level, an elementary
form of direct interaction is introduced: implicit exchange of messages among
agents, requests included; since this is a form of delegation, this level introduces
social learning in MAS; iii) in the third level, agents are enabled to learn from
the observation of the behavior of other agents; and iv) all previous levels
are forms of centralized learning; in the fourth level, decentralized learning is
considered, i.e. agents with different beliefs participating in the same learning
process.
Defining BDI learning agents involves: i) taking into account the above
considerations; ii) considering the questions raised when defining learning
agents (Section 1) under these considerations; and iii) choosing a learning
method.

3.1 Defining BDI learning agents


What components of the performance element can be improved? Plans are central in our
approach: i) the context of each plan determines when it is executable,
affecting the order in which plans are considered, so we want agents to learn the
contexts of their plans that led to successful executions; ii) plans will
be used as background knowledge; and iii) the success and failure components of
the plan help to build examples using internal actions. BDI learning agents
will not learn their beliefs, but use them to build examples to learn from. Events
can be used in two ways: i) trigger events label the concept to be learned (event
satisfied or not); and ii) the set of plans obtained after a given event can be
used as background knowledge.
What representation is used for these components? The whole BDI inter-
preter is built on first-order logic representations. A belief formula is defined
as an atom or its negation. Beliefs are ground belief formulae. Situation
formulae are conjunctions and/or disjunctions of belief formulae. Two kinds of goals
are considered: achieving a belief formula and testing a situation formula.
Actions are seen as procedure calls, possibly including arguments. Plans, as
seen, are complex structures. What is relevant here is that the invocation of
a plan is represented as a trigger event, the context of a plan is represented as
a situation formula, and the body of a plan is a tree whose arcs are labelled
with either goals or actions. Intentions are built as stacks of plan instances.
What feedback is available? The agent keeps traces of the execution of
its intentions. Success in achieving an intention executes a set of internal
actions to update the agent's structure. These actions can include saving in-
formation about the context in which the intention was satisfied. Failures are
processed in a similar way, but the event associated originally with the plan is
reposted in the queue with the following information: i) which plan produced
it, and ii) which plans have failed to satisfy it. This can be complemented
with information about the beliefs of the agent when success or failure occurs,
to build learning examples.
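This feedback ends up as labelled examples. A minimal Python sketch of how an agent might serialize one such example, mirroring the model syntax shown later in Section 3.3 (the helper `make_model` and its arguments are hypothetical, not part of the paper's implementation):

```python
def make_model(index, label, beliefs, plan_name):
    """Render one learning example ("model") in the learning-from-interpretations
    format used by Tilde: a class label plus the agent's beliefs as ground facts."""
    lines = [f"begin(model({index}))."]
    lines.append(f"{label}.")                       # class: success or failure of the event
    lines.extend(f"{b}." for b in sorted(beliefs))  # beliefs held when the intention ran
    lines.append(f"plan({plan_name}).")
    lines.append(f"end(model({index})).")
    return "\n".join(lines)

# Model 1 from the paper: a successful execution of plan p0
m1 = make_model(1, "board-sanded",
                {"free-hand(left)", "operational(r1)", "somewhere(board)"}, "p0")
```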
What prior information is available? Basically, we consider as prior in-
formation the bootstrap components of the BDI architecture, i.e. the plan
library and the initial beliefs.

3.2 First-Order Induction of Logical Decision Trees


Given the representations used in BDI architectures, we considered first-order
learning methods. Since the context of a plan is represented as a disjunction
of conjunctions of belief formulae, we decided to use decision trees as the target
representation.

Decision tree learning is a widely used and very successful method for
inductive inference. As introduced in the ID3 algorithm by Quinlan 13, this
method approximates discrete-valued target functions. Learned functions are
represented as trees and instances as fixed sets of attribute-value pairs. These
trees represent, in general, a disjunction of conjunctions of constraints on the
attribute values of the instances. Each path from the tree root to a leaf corre-
sponds to a conjunction of attribute tests, and the tree itself is a disjunction
of these conjunctions. Decision trees are inferred by growing them from the root
downward, greedily selecting the next best attribute for each new decision
branch added to the tree, in a divide-and-conquer strategy, differing from
rule-based competitors, e.g. CN2 and AQ, which use covering strategies.
Since the clausal representation used in inductive logic programming (ILP)
exhibits discrepancies with the structure underlying decision trees, Luc De
Raedt 7 introduced the concept of logical decision trees, which are binary de-
cision trees (trees where tests have two possible outcomes) constrained so that: i)
every test is a first-order conjunction of literals; and ii) a variable that is
introduced in some node cannot occur in its right subtree. This represen-
tation corresponds to a clausal setting known as the learning from
interpretations paradigm 6.
The learning from interpretations paradigm can be defined in the follow-
ing way. Given: i) a set of classes C; ii) a set of classified examples E; and iii) a
background theory B; find a hypothesis H such that for all e ∈ E:
H ∧ e ∧ B ⊨ c and H ∧ e ∧ B ⊭ c', where c is the class of the example e and
c' ∈ C \ {c}. The background theory B is used in the following way: rather
than starting from complete interpretations of the target theory, examples are
a kind of partial interpretation (sets of facts) that are completed by taking
the minimal Herbrand model M(B ∪ I) of the background theory B and the
partial interpretation I. This paradigm enables the agent to conceive examples
as the sets of beliefs considered when executing an intention.
Tilde 7 is a learning from interpretations algorithm operating on logical
decision trees. It uses the same heuristics as C4.5, a successor of ID3 (gain
ratio, post-pruning heuristics), but the computation of the tests is based on
a classical refinement operator under θ-subsumption.
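To make the target representation concrete, here is a small Python sketch of classification with a logical decision tree. Real Tilde tests are first-order conjunctions that share variables along a branch; for illustration we ground them to sets of facts, so a test succeeds exactly when all of its literals hold in the example (the interpretation).

```python
def classify(tree, example):
    """Classify an example (an interpretation, here a set of ground facts)
    with a binary logical decision tree. An internal node is a triple
    (test, yes_subtree, no_subtree), where `test` is a conjunction of
    literals represented as a frozenset; a leaf is a class label string."""
    if isinstance(tree, str):                # leaf: return the class label
        return tree
    test, yes, no = tree
    # the conjunction succeeds iff all its literals hold in the example
    branch = yes if test <= example else no
    return classify(branch, example)

# Grounded version (our assumption) of the pruned tree induced in Section 3.3
tree = (frozenset({"operational(r1)"}), "board-sanded", "board-not-sanded")
```

For instance, a model containing operational(r1) is sent down the yes-branch and classified as board-sanded; one without it falls to board-not-sanded.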

3.3 Exemplifying the approach


In the scenario proposed in Figure 1 we can consider the following predicates to
specify the actions configuring the behavior of the agent: pickup(X), put-down(X),
put-in-vise(X), sand-in-vise(X), sand-in-hand(X), paint(X), self-paint(X).
To describe the environment where the agent is situated, the fol-
lowing predicates are used: free-hand(X) to indicate that the robot has the
hand X free; somewhere(X) to indicate that the object X is somewhere there;
in-vise(X) to indicate that the object X is in the vise; in-hand(X) to indicate
that the object X is in a hand of the robot; operational(X) to indicate that
the robot X is operational; sanded(X) and painted(X).
Then we can consider the simple plan body of p0 to sand an object X, exe-
cuting sequentially: pickup(X), put-in-vise(X), and sand-in-vise(X). This plan
body is executed if (context of the plan): free-hand(Y) and somewhere(board). The
specification of the plan can be incorporated into the background knowledge,
as well as other general knowledge of the agent:

board-sanded :- plan(p0,board).
plan(p0,board) :- free-hand(Y), somewhere(board), sanded(board).
sanded(X) :- pickup(X), put-in-vise(X), sand-in-vise(X).

The agent can build examples as models of the cases where the execution
of p0 leads to the board being sanded, and also of the cases where it does not. For
this, the trigger event !sanded(board) produces two classes to consider: board-
sanded and board-not-sanded. The rest of each model consists of the beliefs the
agent had when the intention containing p0 was executed.

begin(model(1)).        begin(model(2)).        begin(model(3)).
board-sanded.           board-sanded.           board-not-sanded.
free-hand(left).        free-hand(right).       free-hand(left).
operational(r1).        operational(r1).        somewhere(board).
somewhere(board).       somewhere(board).       plan(p0).
plan(p0).               plan(p0).               end(model(3)).
end(model(1)).          end(model(2)).

begin(model(4)).        begin(model(5)).        begin(model(6)).
board-not-sanded.       board-sanded.           board-sanded.
free-hand(left).        free-hand(left).        free-hand(left).
somewhere(board).       operational(r1).        operational(r1).
in-vise(sander).        somewhere(board).       somewhere(board).
plan(p0).               plan(p0).               plan(p0).
end(model(4)).          end(model(5)).          end(model(6)).

The following pruned tree for this learning setting is obtained by Tilde:

operational(A) ?
+--yes: board-sanded [4 / 4] [m1,m2,m5,m6]
+--no:  board-not-sanded [2 / 2] [m3,m4]

Fractions of the form [i / j] indicate the number of examples in the class
(i) and how many of them were correctly classified (j). The examples in the class
are listed immediately after (m1...m6). Induction time for this example was 0.03
seconds. The equivalent logic program for this logical decision tree is:

n1 :- operational(A).
class(board-not-sanded) :- not n1.
class(board-sanded) :- operational(A).

The definite clause n1 :- operational(A) is introduced by the refinement
operator of Tilde, because it is useful to define the branch for the class
board-not-sanded, which is defined in terms of not n1. The decision tree ob-
tained suggests that the agent should add operational(A) to the preconditions
of the plan p0.
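The suggested update can be sketched as a tiny context-strengthening step. The helper below is hypothetical, not from the paper; it merely conjoins the learned literal onto the plan's context formula, treated here as an ordered list of literals.

```python
def strengthen_context(context, learned):
    """Conjoin literals learned for the success class onto a plan's context,
    preserving order and skipping duplicates."""
    merged = list(context)
    for lit in learned:
        if lit not in merged:
            merged.append(lit)
    return merged

# p0's original context, plus the literal suggested by the induced tree
p0_context = ["free-hand(Y)", "somewhere(X)"]
new_context = strengthen_context(p0_context, ["operational(A)"])
```

Applying the step twice leaves the context unchanged, so repeated learning episodes cannot duplicate a precondition.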
Observe that examples expressed as models can include beliefs about
other agents, e.g. operational(r2) where r2 is a different robot, or beliefs
that other agents have sent to robot r1, without affecting the learning process.
This is very important for scaling up the approach to social learning, particularly
to the fourth MAS level proposed.

4 Discussion

We have explained and exemplified how BDI agents can learn using First-
Order Induction of Logical Decision Trees.
Different triggers to start learning processes associated with specific areas
have been considered in the literature 4, e.g. expectation violations and a per-
ceived need of improvement. All of them are possible in a BDI agent thanks to
the way it uses its plans. We have not considered expectation violations here,
but expectations can be represented in the states of plan bodies to verify
these conditions. Unsuccessful executions of intentions suggest the need of
improvement. The setting used in learning from interpretations is very im-
portant here, since using the BDI architecture we can: i) identify a task that
is not well accomplished; ii) obtain examples of the execution of intentions
(positive and negative); and iii) obtain background knowledge, defining in
this way the area where learning is necessary.

The example introduced suggests that it is possible for the agent to learn
from few examples. We think that this is due to the way BDI architectures
build windows of rationality 1, enabling the agent to focus on the beliefs and plans
relevant to particular events. More complicated experiments are necessary to
know whether Tilde continues to infer useful information from few models, especially
in the case of the agent considering interactions with other agents.
We have decided to do our own implementation of a BDI interpreter. The
reasons for this decision include: i) we knew that different implementations of
BDI architectures already existed, e.g. PRS and its re-implementation dMARS
3, but we only had access to formal specifications of them, not the source code
or low-level information that would help us modify or extend them according to
our needs. We are using Allegro CL 4.3 running on a Linux platform. This
Lisp interpreter enables us to execute several functions, i.e. agents, sharing
the same Lisp environment in a multiprocess way. For the learning algorithm
we are using Tilde version 5.5.1.
Some works in the same direction as ours include the following. Olivia and
co-authors 12 present a Case-Based BDI framework applied to intelligent search
on the Web, but their interpreter operates in a case-based cycle. Grecu and Brown 4
hold a similar position about the way learning must be incorporated into
agent systems, but their agents are not intentional and they use propositional
learning. Jacobs et al. 8 present the use of ILP systems for the validation of
MAS.
Experimental results are promising. Even though the scenario proposed
is very simple, extended with a second robot it seems to be sufficient to
experiment with different interaction situations among agents. Immediate work to
do is completing some details of the interaction between the interpreter and the
learning processes, in order to use more realistic scenarios. The experiments done
up to now have helped us to better understand the interaction of the agents
with their learning processes.

5 Acknowledgements

Discussions with David Kinny and Pablo Noriega have been very helpful. The
first author is supported by Mexican scholarships from Conacyt, contract
70354, and Promep, contract UVER-53.

References

1. M Bratman, Intention, Plans, and Practical Reasoning, (Harvard Univer-
sity Press, Cambridge MA., USA, 1987).

2. E Charniak and D McDermott, Introduction to Artificial Intelligence,
(Addison-Wesley, USA, 1985).
3. M D'Inverno, D Kinny, M Luck, and M Wooldridge in Intelligent Agents
IV, Volume 1365 in Lecture Notes in Artificial Intelligence, pages 155-176,
(Springer-Verlag, Berlin-Heidelberg, Germany, 1997).
4. D L Grecu and D C Brown in Proceedings of the Third IFIP Working
Group 5.2 Workshop on Knowledge Intensive CAD, eds. T Tomiyama
and M Mantyla, Guiding Agent Learning in Design, pages 237-250, Tokyo,
Japan, 1998.
5. A Rao and M P Georgeff, Decision Procedures for BDI Logics, Journal
of Logic and Computation 8(3):293-344, 1998.
6. L De Raedt and S Dzeroski, First-order jk-clausal theories are PAC-
learnable, Artificial Intelligence (70):375-392, 1994.
7. L De Raedt and H Blockeel, Top-Down Induction of Logical Decision
Trees, Technical Report, Department of Computer Science, Katholieke
Universiteit Leuven, Belgium, 1997.
8. N Jacobs et al in Inductive Logic Programming, eds. N Lavrac and S Dze-
roski, Using ILP-Systems for Verification and Validation of Multi-Agent
Systems, pages 145-154, (Springer-Verlag, Berlin-Heidelberg, Germany,
1997).
9. S J Russell and P Norvig, Artificial Intelligence, a modern approach,
(Prentice-Hall, New Jersey, USA, 1995).
10. S Sen and G Weiss, Multiagent Systems, a modern approach to Distributed
Artificial Intelligence, (MIT Press, Cambridge, MA., USA, 1999).
11. M Singh et al in Multiagent Systems, a modern approach to Distributed
Artificial Intelligence, ed. G Weiss, chapter Formal Methods in DAI:
Logic-based Representation and Reasoning, (MIT Press, Cambridge MA.,
USA, 1999).
12. C Olivia et al in AAAI Symposium on Intelligent Agents, Case-Based
BDI Agents: an Effective Approach for Intelligent Search on the WWW,
Stanford University, USA, 1999.
13. J R Quinlan, Induction of Decision Trees, Machine Learning 1:81-106,
1986.
14. G Weiss and S Sen, Adaptation and Learning in Multiagent Systems,
Number 1042 in Lecture Notes in Artificial Intelligence (Springer-Verlag,
Berlin-Heidelberg, Germany, 1996).
15. M Wooldridge, Reasoning about Rational Agents, (MIT Press, Cambridge
MA., USA, 2000).
EVOLUTIONARY BEHAVIORS OF COMPETITIVE AGENTS
IN DILEMMA SITUATION

Tin Tin Naing, Lifeng He*, Atsuko Mutoh, Tsuyoshi Nakamura and Hidenori Itoh

Intelligence and Computer Science Department
Nagoya Institute of Technology, Nagoya, Japan
E-mail: naing@juno.ics.nitech.ac.jp
*Faculty of Information Science and Technology
Aichi Prefectural University, Aichi, Japan

Evolutionary behaviors of agents have received much interest from researchers be-
cause of their important role both in multi-agent interactions and in the understanding
of human interactions. The game-theoretic approach is a major means of studying
agents' behaviors, in which competitive problems are formalized as games. The Iterated
Prisoner's Dilemma (IPD) game has been well studied for this purpose in various
research areas. However, not all situations in real-world environments can be for-
malized as IPD; among others, the deadlock-avoidance problem is one such case. In
this paper, we propose a new game model, called Compromise Dilemma (CD), for
studying evolutionary behaviors of agents, which is suitable for the deadlock-avoid-
ance problem. For each agent, there are two basic actions and one intermediate
action. The combination with the opponent's action makes an agent an opportunist or
a victim. Evolutionary behaviors of agents in a co-evolutionary population are stud-
ied. We test our model with different parameters for the evolutionary algorithm,
analyze the results, and show that agents can evolve in a manner that achieves their
optimal cooperative strategy, sharing the maximum average score with each other.

1 Introduction
Recently, the search for optimal interactive strategies for agents in multi-agent
systems has received a lot of attention among researchers, because multi-agent
systems play an important role in developing and analyzing models and theo-
ries of interactivity in human societies. Although interaction between human
beings is an integrated part of our everyday life, its mechanisms are still
poorly understood. With the help of evolutionary learning, one of the Distributed
Artificial Intelligence technologies, we are able to explore their sociological and
psychological foundations.
There is a big trend of using the game-theoretic approach for studying au-
tonomous multi-agent models. By formalizing the situations around agents as an
appropriate game, we can use it to find a good strategy for the agents. The Iterated
Prisoner's Dilemma (IPD) is one of the most popular game models and has
been studied in numerous works. However, it is a pity that not all situations in
the real world can be formalized as IPD. In the IPD framework, rational agents
can get higher profit by defecting than by cooperating. Such a model is useless for
deadlock-avoidance problems, where competitors will risk their lives if they only
consider their own profit. One example is a front-to-front car race 1.
In this paper, we propose a dynamic game model, called Compromise
Dilemma (CD), for studying deadlock-avoidance problems. In our model, two
agents utilize the same resource, which can only be used by one agent at
a time, to accomplish their work. The action of taking the resource increases
the work done if it succeeds, but raises a collision that decreases the work done
of both agents if it fails.
Normally, IPD allows agents only two choices of action: full coopera-
tion and full defection. However, recent papers have considered more choices
than the two extremes 10,11. In our work, each agent considers two alter-
native actions together with an intermediate one during the competition with its
opponent. In real communities, humans also sometimes consider an intermediate
action, for example waiting for a chance, without making a decision at once. A
person may first watch what his opponent does and take the opportunity at the
next time, or sometimes he may find himself in the opposite condition, exploited
by his opponent. Allowing the existence of intermediate actions enables a more
realistic approach to studying human interactions.
The remainder of this paper is organized as follows. In the next section,
we briefly introduce a dilemma problem and formalize it as the Compromise Dilemma
with an intermediate action. In Section 3, we describe how to implement the
evolutionary learning algorithm. We test the model with different parameters
for the GA operations and analyze the results in Section 4. Finally, we discuss
why the evolutionary approach can lead rational agents to benefit their
community in Section 5.

2 Game-theoretic Approach
2.1 Deadlock-avoidance Problem
To formalize a conflict resolution problem, we consider a grid-lane environment
in which mobile agents navigate to their predefined goals according to
their planned space-time paths, as shown in Figure 1.
Here we assume that agents are unable to communicate with each other,
so they must decide their actions by themselves. In Figure 1, agents x and y
are moving towards points B and A, respectively. In this case, if both agents go
forward in their current directions, there will be a collision between the two agents.
To avoid the collision, one of them must give up the way. If one is sure that
his opponent will give up the way, it is profitable for him to advance.
On the other hand, there may be a waste of the space-time resource if both of
them give up the way without having any information about what the opponent intended
to do. We formalize this competitive problem as Compromise Dilemma (CD).

Figure 1: Deadlock-avoidance Problem

Figure 2: Payoff Matrix for CD with Intermediate Action (payoff to the row player
P1 against the column player P2):

           G        W        T
  G      U = 3   LS = 2   I = 2
  W      O = 4   L = 2    V = 1
  T      A = 5   E = 5    C = 0


It is a two-player game, similar to the so-called Chicken Game 1.

2.2 Compromise Dilemma Model

In the situation introduced in Section 2.1, usual dilemma games allow each
agent to choose an action, either "take the way" or "give the way", on each play.
However, in real communities, human beings might consider an intermediate ac-
tion such as "waiting for a chance" without making any decision at once. Therefore,
in order to approach a more realistic model of evolution, we add an interme-
diate action "wait" to our dilemma game. The intermediate action means
"do nothing in the current step and watch what the opponent does first". If his
opponent gives up the way in the current step, the agent becomes the opportunist,
because at the next step he can take the way without any disturbance. On the
other hand, he becomes the victim if his opponent takes the way in the current step,
because he is tricked and must change his direction at the next step.
According to the combination of their actions, each agent gets a score due
to the payoff matrix shown in Figure 2. In the payoff matrix, the row player
is P1 and the column player is P2. We use the symbol "G" for the agent's
action "give", "T" for "take" and "W" for "wait". Each entry of the matrix
expresses the payoff that agent P1 gets for the corresponding combination
of his and his opponent's actions. If both agents take the action "give", each
one obtains a payoff U = 3 for "loss by unnecessary compromise". If both
choose the action "take", both obtain C = 0 as "punishment for damage by
collision". If one agent chooses the action "take" while his opponent chooses
"give", he gets A = 5 for "advantage". In the opposite situation, he gets I = 2 for
"intended compromise".
In contrast with the Prisoner's Dilemma, when both agents play advance
(take), the payoff of either agent is lower than when the agent plays
compromise (give) and his opponent plays advance (take). Obviously, if both
of them compromise, they will avoid the crash and neither of them will be
a winner or risk his life. If one of them swerves away, he will
be the "chicken" as in the Chicken Game 1, but will survive, with the result that the
opponent gets all the honor. If they crash into each other, the cost for either of
them will be higher than the cost of being a chicken.
In addition to those combinations, we give a payoff O = 4 for "opportunist",
LS = 2 for "lose but save", L = 2 for "lazy", V = 1 for "victim" and E = 5
for "exploitation". Notice that a collision occurs only when both agents
advance the way simultaneously. We assume that if an agent chooses "take"
while his opponent chooses "wait", he just tricks his opponent in the current step,
which causes no damage to his opponent. Therefore we give a payoff E = 5,
the same as A, when he exploits his opponent.
Suppose we define u_p(a_i, a_j) as the score that agent p receives when agent p
executes action a_i and his opponent executes action a_j. In this paper, the above
payoff matrix satisfies the following conditions.
u_p(T,T) < u_p(W,W) < u_p(G,G)
u_p(T,T) < u_p(W,T) < u_p(G,T)
u_p(G,G) < u_p(W,G) < u_p(T,G)
The arrows in the payoff matrix illustrate these conditions. They show
that we let agents drift as far as possible away from the point where a collision
occurs, and as close as possible to the point where "cooperative
actions" emerge.
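The payoff matrix and the three ordering conditions can be written down and checked directly. The dictionary below transcribes Figure 2 (own action first, opponent's action second); the assertions are exactly the conditions above.

```python
# Payoff to the row player for (own action, opponent action), from Figure 2.
PAYOFF = {
    ("G", "G"): 3,  # U: loss by unnecessary compromise
    ("G", "W"): 2,  # LS: lose but save
    ("G", "T"): 2,  # I: intended compromise
    ("W", "G"): 4,  # O: opportunist
    ("W", "W"): 2,  # L: lazy
    ("W", "T"): 1,  # V: victim
    ("T", "G"): 5,  # A: advantage
    ("T", "W"): 5,  # E: exploitation
    ("T", "T"): 0,  # C: punishment for damage by collision
}

def u(a, b):
    """u_p(a, b): payoff for playing a against an opponent playing b."""
    return PAYOFF[(a, b)]

# The three ordering conditions stated in the text:
assert u("T", "T") < u("W", "W") < u("G", "G")   # 0 < 2 < 3
assert u("T", "T") < u("W", "T") < u("G", "T")   # 0 < 1 < 2
assert u("G", "G") < u("W", "G") < u("T", "G")   # 3 < 4 < 5
```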

3 Implementation of Evolutionary Learning


To study the evolutionary behaviors of agents in CD, a population of 100 individ-
uals is used. Each individual in the population is a strategy represented by
a Moore machine, a form of Finite State Automaton (FSA), as shown in Figure 3. The
input alphabet of the FSA is the set of actions {G, W, T}; an input represents the
latest action of the opponent and is expressed as the label of the corresponding
transition arrow in the FSA. The output alphabet is also an action; an output represents
the action of the agent for the current play and is written in the corresponding
letter next to each state.
In this work, we assume that each initial FSA has only one state, and the
maximum number of states in each FSA is limited to 8. The initial state is pointed
to by a thick arrow. In Figure 3, for example, if the current state is s1 and the
latest action of the opponent is "G", then a transition occurs from s1 to s2 along

Figure 3: Expressing strategies by Moore-Machine

the arrow labeled "G", and the FSA returns the label "W" attached to state
s2 as the action for the current play.
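This Moore-machine encoding can be sketched as follows; the class name, the dictionary encoding and the two-state example are our own illustration, mirroring the s1/s2 example above:

```python
# A sketch of a strategy as a Moore machine. The encoding (outputs dict,
# transition dict) is our own illustration of the representation described
# in the text.

class MooreStrategy:
    def __init__(self, outputs, transitions, start=0):
        self.outputs = outputs          # state -> action in {'G', 'W', 'T'}
        self.transitions = transitions  # (state, opponent action) -> next state
        self.state = start

    def play(self, opponent_last=None):
        """Transition on the opponent's latest action, then emit this state's output."""
        if opponent_last is not None:
            self.state = self.transitions[(self.state, opponent_last)]
        return self.outputs[self.state]

# Two states: s0 outputs 'G'; after the opponent plays 'G' we move to s1,
# which outputs 'W' (compare the s1 -> s2 transition described above).
outputs = {0: 'G', 1: 'W'}
transitions = {(0, 'G'): 1, (0, 'W'): 0, (0, 'T'): 0,
               (1, 'G'): 1, (1, 'W'): 0, (1, 'T'): 0}
s = MooreStrategy(outputs, transitions)
print(s.play())     # 'G'  (first play, no opponent action yet)
print(s.play('G'))  # 'W'  (moved to state 1)
```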
Following Axelrod 8, a genetic algorithm (GA) maintains a population of
trial strategies. In each generation, each individual plays the iterated CD
game against each of the other individuals in the same population. The fitness of
a strategy (an individual) is the average payoff over all those games, defined as
follows:

fitness(p_i) = ( Σ_{j≠i} score(p_i, o_j) ) / (N − 1)        (1)

where N is the population size and score(p_i, o_j) is the average payoff of individual
p_i in a randomly long iterated game against opponent individual o_j, defined
by:

score(p_i, o_j) = ( Σ_{n=1}^{Round} u_{p_i}(a_i^n, a_j^n) ) / Round        (2)

where a_i^n (a_j^n) is the action taken by agent p_i (opponent o_j) in the nth iteration,
and Round is the number of iterations, decided randomly.
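A sketch of the fitness computation of Eq. (1) follows. A strategy is modeled as a callable mapping the opponent's previous action (None on the first play) to an action; the numeric range of the random iteration count is an assumption, as the paper only says it is decided randomly:

```python
import random

# Sketch of Eq. (1): each individual plays a randomly long iterated game
# against every other member, and fitness is the mean per-opponent score.

def score(p, o, payoff, rounds):
    """Average payoff of strategy p over `rounds` simultaneous plays against o."""
    total, last_p, last_o = 0.0, None, None
    for _ in range(rounds):
        a_p, a_o = p(last_o), o(last_p)
        total += payoff[(a_p, a_o)]
        last_p, last_o = a_p, a_o
    return total / rounds

def fitness(i, population, payoff):
    """Eq. (1): mean per-opponent score of individual i over the population."""
    opponents = [o for j, o in enumerate(population) if j != i]
    return sum(score(population[i], o, payoff, random.randint(50, 150))
               for o in opponents) / len(opponents)
```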
Initially, a population of 100 individuals, each of which has only one state,
is generated randomly. Starting from this initial population, co-evolution proceeds
without requiring any prior knowledge of how to play the game well. As the
population evolves, the individual strategies improve as the game goes on.
After each generation, individuals are sorted according to their fitness. The
50 best individuals are transferred to the next generation, and the remaining
individuals are discarded. Then, parent individuals are selected from the 50
elites by the roulette-wheel method, and the genetic operators of mutation,
insertion and deletion are applied to the selected parents to generate 50
offspring for the next generation. Parent individuals are mutated with probability
α; internal states of a parent FSA are inserted with probability β and
deleted with probability γ. These probabilities will be used as the
parameters of our tests. The three genetic operators are implemented as follows:

3.1 Genetic Operations


Mutation: An offspring is generated by randomly changing one of the three tran-
sition conditions of a randomly selected state of the parent FSA.

Insertion: An offspring is generated by inserting a newly generated state at a
random position in the parent FSA. The three transition conditions from the
new state to other existing states are set randomly.

Deletion: An offspring is generated by deleting a randomly selected internal
state of the parent FSA. The transition arrows previously directed to
that state are redirected to other states randomly.

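The three operators can be sketched on an FSA encoded as a list of (output action, transition table) states. This encoding, and appending the inserted state at the end instead of at an arbitrary position, are our own simplifications:

```python
import copy
import random

ACTIONS = ('G', 'W', 'T')

# Sketch of the three genetic operators. An FSA is a list of states, each
# state a pair (output action, {opponent action: next state index}); this
# encoding is an assumption, as the paper does not fix one.

def mutate(fsa):
    """Randomly redirect one transition of one randomly chosen state."""
    child = copy.deepcopy(fsa)
    _, nxt = child[random.randrange(len(child))]
    nxt[random.choice(ACTIONS)] = random.randrange(len(child))
    return child

def insert(fsa):
    """Add a new state with a random output and random transitions."""
    child = copy.deepcopy(fsa)
    new_nxt = {a: random.randrange(len(child) + 1) for a in ACTIONS}
    child.append((random.choice(ACTIONS), new_nxt))
    return child

def delete(fsa):
    """Remove a random state; arrows that pointed to it are redirected randomly."""
    if len(fsa) <= 1:
        return copy.deepcopy(fsa)  # always keep at least one state
    child = copy.deepcopy(fsa)
    dead = random.randrange(len(child))
    child.pop(dead)
    for _, nxt in child:
        for a in ACTIONS:
            if nxt[a] == dead:
                nxt[a] = random.randrange(len(child))
            elif nxt[a] > dead:
                nxt[a] -= 1  # re-index states above the removed one
    return child

one_state = [('G', {'G': 0, 'W': 0, 'T': 0})]
print(len(insert(one_state)))  # 2: one state was appended
```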

4 Experimental Results

Figure 4: Plots of population-averaged fitness in each generation. (a) α = 0.5, β = 0.25, γ = 0.25. (b) α = 0.1, β = 0.05, γ = 0.05. (c) α = 0.04, β = 0.02, γ = 0.02. (d) Means of 10 runs of (a), (b) and (c).

In this paper, we provide two experiments on our model. In each one, a
population of 100 trial strategies evolves for a certain number of generations.
In the first experiment, in each generation, each individual plays an iter-
ated game against the other members of the same population (round-robin). The
number of iterations in each game is decided randomly. We run the test with
varying probability parameters for the genetic operations. 10 runs for each of the
three parameter sets, (α = 0.5, β = 0.25, γ = 0.25), (α = 0.1, β = 0.05, γ = 0.05)
and (α = 0.04, β = 0.02, γ = 0.02), are made, and three runs from each set are
plotted in Figures 4(a), 4(b) and 4(c), respectively. The mean values of the 10 runs
for each of the three parameter sets are plotted in Figure 4(d).
In the test with parameter set (α = 0.5, β = 0.25, γ = 0.25), 3 out of 10
runs did not reach the optimal score, for example run 9 in Figure 4(a). In the other
two tests, all runs reached a nearly optimal score (the optimal score is 3.5)
after 400 generations. In all three tests, we can see a tremendous change of
population-averaged fitness in the earlier generations. As there is only one state
in each FSA in the initial generation, agents behave blindly. An individual
has no incentives other than the output value of its initial state. If the
output of the initial state is "G", the agent gives up the way whenever he meets the
dilemma, regardless of what his opponent's action would be. On the other hand, if
the output of the initial state is "T", he advances without considering whether
that would risk his life, and so on. As evolution proceeds, population members
become better judges of each other. Genetic mutations create more states in the
FSAs, so an individual can alter its transition paths to get a higher
score. From the mean plots in Figure 4(d), we find that the less frequently individuals are
genetically mutated, the longer the evolution takes to reach saturation,
but the higher the saturated score (nearly optimal) they can get.

Figure 5: Results for the second experiment. (a) 3 out of 10 runs. (b) Mean of 10 runs.

On the other hand, in a real-world problem, for example the multi-agent
burden carriage problem, agents cannot predict when and where they will meet
which opponent. Therefore, in the second experiment, we let individuals

play iterated games against only part of the members of the population. Each
agent plays games of random length with randomly selected opponents.
We fixed the probability parameter set as (α = 0.04, β = 0.02, γ = 0.02).
This time, individuals cannot reach their optimal score within 500 generations. 10
runs are made; three of them are plotted in Figure 5(a) and the mean of the
10 runs is plotted in Figure 5(b). We found that agents can evolve to reach
their optimum after 1000 generations in almost all runs.
In all tests, agents behave blindly in the earliest generations. As evolution
proceeds, they improve at playing the game by taking more and more complex
actions, and cooperative interactions emerge. Here, cooperative interaction
in CD means that agents take their actions alternately to avoid damage or
loss. In later generations, agents keep their cooperative interactions,
generating the optimal score while keeping their community peaceful.

5 Conclusions

In this paper, we proposed a game model for Compromise Dilemma problems
and observed the evolutionary behaviors of simulated agents with this model.
According to the experimental results, the evolutionary approach enables the agents
to evolve their own strategies for dealing with an uncertain environment in
an intelligent manner. The reason is that autonomous agents are able to share the
maximum average score by avoiding two kinds of extreme cases: damage
from collisions and undesired loss of space-time (resource). In other
words, agents can achieve an optimal strategy that enables them to utilize
the resources of the environment as fully as possible.
The Compromise Dilemma is naturally a competitive problem in which all in-
dividuals try to maximize their own benefit. However, since their opponents
also concurrently evolve in the same way to upgrade their fitness, co-
operative interactions between agents are established. Accordingly, population
fitness escalates as the evolution goes on. Following the experimen-
tal results, we speculate that rational agents yield a communal profit under
suitable circumstances.
Our future work is to use this model to simulate real-world problems.

Acknowledgments

This work is partially supported by the Hori Information Science Promotion


Foundation, Japan.

References
1. Bengt Carlsson and Stefan Johansson: "An Iterated Hawk-and-Dove
Game", Proceedings of the Third Australian Workshop on Distributed
AI, Lecture Notes in Artificial Intelligence 1441, 1997.
2. R. Boyd and J. P. Lorberbaum: "No pure strategy is evolutionarily stable
in the repeated prisoner's dilemma game", Nature, 327, pp. 58-59, 1987.
3. D. B. Fogel: "Evolving Behaviors in the Iterated Prisoner's Dilemma",
Evolutionary Computation, 1(1), pp. 77-97, 1993.
4. Akira Ito and Hiroyuki Yano: "The Emergence of Cooperation in a So-
ciety of Autonomous Agents - The Prisoner's Dilemma Game Under the
Disclosure of Contract Histories", ICMAS'95.
5. K. Lindgren: "Evolutionary Phenomena in Simple Dynamics", Artificial
Life II, pp. 295-311, 1991.
6. Peter J. Angeline: "An Alternative Interpretation of the Iterated Pris-
oner's Dilemma and the Evolution of Non-Mutual Cooperation", Arti-
ficial Life IV: Proceedings of the Fourth International Workshop on the
Synthesis and Simulation of Living Systems, pp. 353-358.
7. Reiji Suzuki and Takaya Arita: "Meta-Pavlov: Strategies that Self-
Adjust Evolution and Learning Dynamically in the Prisoner's Dilemma
Game", Game Informatics, 1999.
8. R. M. Axelrod: "The Evolution of Cooperation", Basic Books, New York,
1984.
9. "Multiagent Systems: A Modern Approach to Distributed Artificial
Intelligence", The MIT Press, 1999.
10. Paul G. Harrald and David B. Fogel: "Evolving continuous behaviors in
the Iterated Prisoner's Dilemma", Biosystems, 1996.
11. X. Yao and P. Darwen: "How Important Is Your Reputation in a Multi-Agent
Environment", Proceedings of the 1999 IEEE International Conference on
Systems, Man, and Cybernetics (SMC'99).
A STRATEGY FOR CREATING INITIAL DATA ON ACTIVE
LEARNING OF MULTI-LAYER PERCEPTRON

KAZUNORI IWATA AND NAOHIRO ISHII
Dept. of Intelligence and Computer Science, Nagoya Institute of Technology,
Gokiso-cho, Showa-ku, Nagoya, 466-8555, Japan
E-mail: {kiwata,ishii}@egg.ics.nitech.ac.jp
Keywords: active learning, multi-layer perceptron, network inversion,
pseudo-random number, low-discrepancy sequence

Many active learning methods for the training of a partially trained Multi-Layer Perceptron
(MLP) have been proposed. We note that the performance of any active learning method
depends on the initial training data. The initial training data plays an important role in active
learning performance, because any active learning algorithm generates additional
training data, useful for improving the classification accuracy, based on the ini-
tial training data. Most conventional methods have generated initial data at
random using pseudo-random numbers. However, in practical cases, we cannot
prepare enough data because of time and cost limits. Therefore, the bias of the initial
training data becomes critical, especially when the dimension of the input space is
large. In this paper, we propose a strategy that uses low-discrepancy sequences
for creating more uniform initial data than pseudo-random numbers. For the clas-
sification problem of an MLP, we analyze the experimental performance of the network
inversion algorithm when it uses a pseudo-random number and a low-discrepancy se-
quence as initial training data. In the experimental results, we found that low-discrepancy
sequences give a good strategy for creating initial training data. Finally, we also
discuss some advantages and disadvantages of low-discrepancy sequences as initial
training data.

1 Introduction

Learning by the use of queries through a training data generation mechanism is
well known as active learning 1,2. Active learning includes interaction with
an oracle, which always responds with a correct answer when queried with an exam-
ple. In other words, the oracle gives a correct classification for a given data
point. Examples of oracles include a human expert, costly experimentation,
computer simulators and so on. The classifier adds the point properly classified
by the oracle as training data. Such learning with additional train-
ing data can significantly increase the resulting classification accuracy with
a small amount of computation 1,2, and has recently attracted considerable attention.
In this paper, we consider only the case of active learning on the Multi-Layer
Perceptron (MLP) 3.
Many active learning methods for the training of a partially trained MLP have
been proposed 1,2,4,5,6,7. We note that the performance of any active learning method depends on

the initial training data. The initial training data plays an important role in
active learning performance, because any active learning algorithm generates
additional training data, useful for improving the classification accuracy,
based on the initial training data. In practical cases, it is desirable to prepare
various initial data, that is, uniformly distributed data for a given space.
There are several reasons why uniformly distributed data are required. One
is that each class should have at least a few initial data points, because if no training
data initially exist within a class region, most active learning algorithms can-
not refine its classification boundary. However, in many cases, we cannot
recognize the class regions in advance. A good strategy is to prepare data as
uniform as possible for a given space, avoiding the repetition of the same
data. Another reason is to let the active learning algorithm detect the whole
boundary. The bias of the initial data may cause a classification bias for a
given space.
Most conventional methods have generated initial data at random us-
ing pseudo-random numbers. By the law of large numbers and the central limit
theorem in statistics, pseudo-random numbers distribute uniformly over
a given space as the number of data points approaches infinity. However, in practical
cases, we cannot prepare enough data because of time and cost limits. There-
fore, the bias of the initial training data becomes critical, especially when the
dimension of the input space is large. In this paper, we propose a strategy that
uses low-discrepancy sequences for creating more uniform initial data
than pseudo-random numbers. For the classification problem of an MLP, we ana-
lyze the experimental performance of the network inversion algorithm when it uses
a pseudo-random number and a low-discrepancy sequence as initial training
data. The network inversion algorithm is one of the effective active learning methods for cre-
ating additional training data, in terms of independence from the input
distribution, computational cost and complexity of implementation.
The organization of this paper is as follows. In section 2, we briefly explain
the back-propagation and network inversion algorithms. Low-discrepancy se-
quences are discussed in section 3. In section 4, for the two-class classification
problem, we compare the experimental performances obtained by employing a pseudo-
random number and a low-discrepancy sequence as initial training data, and
discuss some advantages and disadvantages of low-discrepancy sequences. Fi-
nally, we summarize and give some conclusions in section 5.

2 Dynamics of Multi-layer Perceptron

It is helpful to review the dynamics of MLP before moving to the main task.
We start with the forward and learning (backward) phases of MLP, and then

proceed to the Network Inversion (NI) algorithm.

2.1 Forward and Learning Dynamics


Let the number of layers be L, and let the lth layer have N_l neurons. The 1st layer,
the Lth layer and the other (2nd to (L−1)th) layers are called the input
layer, the output layer and the middle layers, respectively. The output at each layer
is expressed by the following equations:

u_i(l) = Σ_{j=1}^{N_{l-1}} w_{ij}(l) a_j(l−1) + θ_i(l)        (1)

a_i(l) = f(u_i(l))        (2)

where u_i(l) and a_i(l) denote the net value and activation value of the ith
neuron at the lth layer, respectively. θ_i(l) is the bias of the ith neuron at the
lth layer. w_{ij}(l) denotes the weight connecting the jth neuron at the
(l−1)th layer and the ith neuron at the lth layer. f(·) is an activation function
(e.g. the sigmoid function).
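The forward pass of Eqs. (1)-(2) for one layer can be sketched as follows, with the sigmoid as the activation f; the weight and bias values are placeholder assumptions:

```python
import math

# Sketch of Eqs. (1)-(2) for a single layer; weights/biases are placeholders.

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def layer_forward(a_prev, W, theta):
    """a_i(l) = f( sum_j w_ij(l) * a_j(l-1) + theta_i(l) )."""
    return [sigmoid(sum(w * a for w, a in zip(w_row, a_prev)) + th)
            for w_row, th in zip(W, theta)]

a1 = [0.2, 0.7]                  # activations a_j(l-1)
W2 = [[0.5, -0.3], [0.1, 0.8]]   # weights w_ij(l)
theta2 = [0.0, -0.1]             # biases theta_i(l)
print(layer_forward(a1, W2, theta2))
```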
The back-propagation method is the most popular method for training an
MLP. Using an iterative gradient descent algorithm, the mean squared error
E between the teaching vector t = (t_1, ..., t_{N_L}) and the actual output vector
a(L) = (a_1(L), ..., a_{N_L}(L)) is minimized according to the rules:

w_{ij}(l) ← w_{ij}(l) − η ∂E/∂w_{ij}(l)        (3)

θ_i(l) ← θ_i(l) − η ∂E/∂θ_i(l)        (4)

where η is the learning rate, and the mean squared error E and the error
signal δ_i(l) are calculated recursively:

E = (1/2) Σ_{i=1}^{N_L} (t_i − a_i(L))²        (5)

δ_i(l) = ∂E/∂a_i(l)
       = −(t_i − a_i(L))                                      (l = L)
       = Σ_{j=1}^{N_{l+1}} δ_j(l+1) ∂a_j(l+1)/∂a_i(l)         (otherwise)        (6)

2.2 Network Inversion Algorithm

The NI algorithm 7,8 is designed to move each existing data point to a specific boundary
point. The idea is similar to the back-propagation algorithm. In the NI algorithm,
using a gradient descent algorithm, the error signal ε is propagated from the
output layer to the input layer in order to update the input vector so that the
desired output vector τ = (τ_1, ..., τ_{N_L}) will be produced by the network:

E = (1/2) Σ_{i=1}^{N_L} (τ_i − a_i(L))²        (7)

ε_i(l) = ∂E/∂a_i(l)
       = −(τ_i − a_i(L))                                      (l = L)
       = Σ_{j=1}^{N_{l+1}} ε_j(l+1) ∂a_j(l+1)/∂a_i(l)         (otherwise)        (8)

The NI algorithm works concurrently with the back-propagation algorithm.
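The inversion idea can be sketched as gradient descent on the input with the weights held fixed. Where the paper propagates the error signal analytically (Eq. 8), this sketch estimates the input gradient by finite differences for brevity, and the "network" is a toy single sigmoid unit rather than a trained MLP:

```python
import math

# Sketch of network inversion: hold the weights fixed and run gradient
# descent on the INPUT until the output approaches the target tau.
# dE/dx_k is estimated by finite differences (an approximation of Eq. 8).

def invert(net, x, tau, lr=0.5, steps=2000, h=1e-5):
    x = list(x)
    for _ in range(steps):
        for k in range(len(x)):
            x_plus = list(x)
            x_plus[k] += h
            e = sum((tau - o) ** 2 for o in net(x)) / 2.0
            e_plus = sum((tau - o) ** 2 for o in net(x_plus)) / 2.0
            x[k] -= lr * (e_plus - e) / h  # x_k <- x_k - lr * dE/dx_k
    return x

# Toy network whose tau = 0.5 level set is the line 2*x1 + 2*x2 - 2 = 0.
net = lambda x: [1.0 / (1.0 + math.exp(-(2 * x[0] + 2 * x[1] - 2)))]
x_inv = invert(net, [0.9, 0.9], tau=0.5)
print(net(x_inv)[0])  # close to 0.5: the point was pulled onto the boundary
```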

2.3 Additional Data by Network Inversion

In order to represent the classification problem concisely and without loss of
generality, we consider an MLP whose outputs are designed to produce a two-
class classification value for an input vector. That is, each output neuron
a_i(L) (1 ≤ i ≤ N_L) is trained to output either 1 for one class or 0 for the
others. An input vector that produces the desired output vector τ, typi-
cally τ_i = 0.5, can be considered to lie on the classification boundary of the MLP. In other
words, from an ambiguity point of view, the input vector corresponding to the
output vector τ lies in the region of maximum classification ambiguity (see Figure 1).
We employ the NI algorithm to invert the initial training data toward the region of
maximum classification ambiguity, and use the inverted data as additional training
data so that the MLP effectively improves the boundary. Such additional training
data can significantly increase the resulting classification accuracy. Note that all
additional training data are created based on the initial training data.

3 Low-Discrepancy Sequences

Figure 1. A concept of the region of maximum classification ambiguity

In general, any active learning algorithm can generate effective additional
training data to improve the classification accuracy. The NI algorithm, which is one
of the active learning algorithms, also creates additional data based on the initial
training data, as discussed above. This means the initial training data
plays an important role in its performance. Suppose that we can generate
any point inside a given input space, and are allowed to obtain the proper
classification (teaching signal) through interaction with the oracle,
but the class regions are not known to us in advance. A good strategy is to generate
initial data as uniformly as possible inside the input space, without repeating
the same data point, so that at least a few initial data points exist inside each class and
the whole boundary can be detected. Most traditional approaches use pseudo-
random numbers to generate data uniformly.
The low-discrepancy sequence 9 (LDS) is well known in the field of the Quasi-
Monte Carlo method. One of the notable features of an LDS is its low
discrepancy; discrepancy is a measure of the uniformity of the distribution
of finite point sets. In short, LDSs create more uniformly distributed data than
pseudo-random numbers for a given space. In addition, LDSs never generate
the same data point twice. For many cases of multidimensional
integration, the Quasi-Monte Carlo method using an LDS is more effective
than the conventional Monte Carlo method using pseudo-random num-
bers. We employ an LDS as a strategy for creating initial training data for
multidimensional classification. In this section, we briefly review the basics of
LDSs, and then explain the Faure sequence, which is a kind of LDS a.

3.1 Discrepancy
To carry the discussion of the properties of LDSs further, let us define the term
discrepancy in detail. Let x(n) = (x_1(n), ..., x_K(n)) be the nth
training data point of K dimensions, and let E(x) be the subset [0, x_1) × ... × [0, x_K)
of the K-dimensional hypercube [0,1]^K. The L_2-discrepancy T_K(N) of the
training data set P = {x(n) | n = 1, ..., N}, by the measure of the L_2 norm for
Lebesgue integrable function space, is defined as follows.

a For further details of LDSs, see

T_K(N) = ( ∫_{[0,1]^K} ( #(E(x) | N)/N − Π_{k=1}^{K} x_k )² dx )^{1/2}        (9)

where #(E(x) | N) denotes the number of data points inside E(x). In the same way,
the L_max-discrepancy D_K(N), by the measure of the maximum norm, is defined by
the following equation:

D_K(N) = sup_{x ∈ [0,1]^K} | #(E(x) | N)/N − Π_{k=1}^{K} x_k |        (10)
Equations 9 and 10 measure the uniformity of the distribution of a set of N data
points by their respective norms. Only the L_2-discrepancy is known to be
computable in closed form:

T_K(N)² = (1/N²) Σ_{n=1}^{N} Σ_{m=1}^{N} Π_{k=1}^{K} (1 − max{x_k(n), x_k(m)})
          − (2^{1−K}/N) Σ_{n=1}^{N} Π_{k=1}^{K} (1 − x_k(n)²) + 3^{−K}        (11)

For N > 1, the relation between the L_2-discrepancy and the L_max-discrepancy satis-
fies:

T_K(N) ≤ D_K(N)        (12)

With a large number of training data distributed as uniformly as possible, we can
consider asymptotically

D_K(N) → 0 as N → ∞        (13)

Equations 12 and 13 lead to:

T_K(N) → 0 as N → ∞        (14)

An LDS keeps the following minimum discrepancy for N > 1:

D_K(N) ≤ c_K (log N)^K / N        (15)

where c_K is a constant that depends on the dimension K. The multidimen-
sional LDSs include the Halton, Sobol', Faure and other sequences 9,10. We
will concentrate on the Faure sequence in the next section and leave the details
of other sequences to references 9,10.
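The closed form of Eq. (11) translates directly into code; the sketch below takes a point set as a list of K-tuples:

```python
import math

# Direct implementation of the closed form of Eq. (11) for a point set
# P = {x(1), ..., x(N)} in [0,1]^K, given as a list of K-tuples.

def l2_discrepancy(points):
    N, K = len(points), len(points[0])
    term1 = sum(math.prod(1.0 - max(xn[k], xm[k]) for k in range(K))
                for xn in points for xm in points) / N ** 2
    term2 = (2.0 ** (1 - K) / N) * sum(
        math.prod(1.0 - xn[k] ** 2 for k in range(K)) for xn in points)
    return math.sqrt(term1 - term2 + 3.0 ** (-K))

print(l2_discrepancy([(0.5,)]))  # sqrt(1/12) ≈ 0.2887 for a single midpoint in 1D
```

For a single 1D point at x, Eq. (11) reduces to T² = x³/3 + (1−x)³/3, which is a quick way to check the implementation.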

3.2 Faure Sequence

The Faure sequence is generated from a prime number p not smaller than K, used as
the radix for the K-dimensional problem. The first step in the
calculation of the nth data point is to compute the first element x_1(n) as follows:

x_1(n) = Σ_{m=0}^{∞} a_{1,m}(n) p^{−m−1}        (16)

where a_{1,m}(n) are the digits satisfying the following base-p expansion:

n = Σ_{m=0}^{∞} a_{1,m}(n) p^m        (17)

Then, in the next step, the other elements x_k(n) (2 ≤ k ≤ K) are computed
as

x_k(n) = Σ_{m=0}^{∞} a_{k,m}(n) p^{−m−1}        (18)

where each a_{k,m}(n) satisfies the following equation:

( a_{k,0}(n) )   ( 0C0  1C0  2C0  ... )^(k−1)  ( a_{1,0}(n) )
( a_{k,1}(n) ) = (      1C1  2C1  ... )        ( a_{1,1}(n) )   (mod p)        (19)
( a_{k,2}(n) )   (           2C2  ... )        ( a_{1,2}(n) )
(    ...     )   (                ... )        (    ...     )

where jCm denotes the number of combinations of j things taken m at a time. We use
the Faure sequence as a typical LDS in the experiment in the next section.
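Eqs. (16)-(19) can be sketched as follows. Taking the base p as the smallest prime not below K is the usual convention; digit vectors are truncated to the base-p digits of n, since all higher digits are zero:

```python
from math import comb

# Sketch of Eqs. (16)-(19): base-p digits of n, a radical inverse per
# coordinate, and one application of the Pascal matrix mod p between
# consecutive coordinates.

def smallest_prime_at_least(k):
    p = max(k, 2)
    while any(p % d == 0 for d in range(2, int(p ** 0.5) + 1)):
        p += 1
    return p

def faure_point(n, K):
    p = smallest_prime_at_least(K)
    digits, m = [], n           # base-p digits of n, Eq. (17)
    while m > 0:
        digits.append(m % p)
        m //= p
    a, point = digits[:], []
    for _ in range(K):
        # radical inverse of the current digit vector, Eqs. (16)/(18)
        point.append(sum(d * p ** (-i - 1) for i, d in enumerate(a)))
        # one application of the Pascal matrix mod p, Eq. (19)
        a = [sum(comb(j, i) * a[j] for j in range(i, len(a))) % p
             for i in range(len(a))]
    return tuple(point)

print([faure_point(n, 1) for n in range(1, 5)])  # [(0.5,), (0.25,), (0.75,), (0.125,)]
```

In one dimension this reduces to the van der Corput sequence in base 2, which makes the first few points easy to verify by hand.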

4 Experimental Results and Discussion

To keep things simple without loss of generality, we take up the two-class
hyper-sphere classification problem with a K-dimensional input vector, where the
classification target is 1 inside the hyper-sphere and 0 otherwise, within the hypercube
[0,1]^K as shown in Figure 2. That is, when queried with the nth data point, the
correct classification is

h(n) = 1 if Σ_{k=1}^{K} (x_k(n) − 0.5)² ≤ r², and 0 otherwise        (20)

where r denotes the radius of the hyper-sphere. All training data are generated
inside the hypercube [0,1]^K. Firstly, the MLP was trained with each set of initial train-
ing data. The training was concurrent with inverting the initial training data

Figure 2. Two-class hyper-sphere classification problem (center (0.5, ..., 0.5))

Table 1. Parameters of MLP training

Dimension of input vector (K):      2    3    4    6    8     10
Number of neurons in middle layer:  3    4    5    5    5     5
Radius of hyper-sphere (r):         0.3  0.4  0.5  0.6  0.7   0.8
Number of initial training data:    75   400  600  900  1200  1500

toward the boundary point. Then, after convergence, the resulting inverted
data were given the correct classification by the oracle. Secondly, the MLP
was re-trained with the combination of the original data and the correctly
classified inverted data. Finally, the MLP classified 10^4 validation data points,
distributed uniformly inside the input space [0,1]^K. We evaluated the classi-
fication accuracy of the MLP by the misclassification ratio.
We set the learning rate η to 0.01 and each initial weight randomly within
[−0.05, 0.05]. Table 1 shows the structure of the three-layer perceptron, the radius
of the hyper-sphere and the number of initial training data. Figures 3 and 4 are
graphical representations of inverted two-dimensional data based on
a pseudo-random number and the Faure sequence, respectively. The circle in each
figure denotes the true boundary. These figures tell us how well the NI algorithm
detects the whole boundary. For classification accuracy, it is important
to create additional training data along the whole boundary. As the figures
indicate, the inverted data based on the pseudo-random number failed to detect
the lower part of the boundary. By contrast, the inverted data based on the Faure
sequence detected the whole boundary well, so that the classification accuracy

Figure 3. Inverted data based on pseudo-random number
Figure 4. Inverted data based on Faure sequence

Table 2. Misclassification ratio (%) in each dimension

Dimension of input vector (K):  2     3     4     6      8      10
Pseudo-random number:           3.0   4.39  6.23  16.47  27.68  25.34
Faure sequence:                 1.74  3.65  5.82  15.47  25.58  23.3

can be improved. Table 2 shows the experimental results, averaged over 5
simulations. We found that the MLP trained on the Faure sequence classifies
better than the one trained on pseudo-random numbers, especially in the high-dimen-
sional cases. It follows from these results that LDSs provide a good strategy
for generating initial training data for the classification problem. The superiority
of LDSs can be explained by the sampling principle in statistics that data should be sampled
uniformly over the sample space so as to avoid sampling bias.
As a drawback, LDSs tend to introduce systematic artifacts com-
pared with pseudo-random numbers. However, this drawback disappears as
the number of data points increases.
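The oracle of Eq. (20) and the misclassification-ratio evaluation can be sketched as follows; the MLP is replaced by a placeholder predictor, and the validation sample size and seed are arbitrary choices:

```python
import random

# Sketch of the Eq. (20) oracle and the validation procedure; `predict`
# stands in for the trained MLP.

def oracle(x, r):
    """1 inside the hyper-sphere of radius r centered at (0.5, ..., 0.5)."""
    return 1 if sum((xk - 0.5) ** 2 for xk in x) <= r * r else 0

def misclassification_ratio(predict, r, K, n_val=10_000, seed=0):
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_val):
        x = [rng.random() for _ in range(K)]
        errors += predict(x) != oracle(x, r)
    return errors / n_val

# A predictor that always answers 0 errs on roughly the sphere's volume
# fraction (pi * 0.3**2, about 0.28, for K = 2 and r = 0.3).
print(misclassification_ratio(lambda x: 0, r=0.3, K=2))
```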

5 Conclusion

In this paper, we discussed the use of LDSs for generating initial training data
in active learning of an MLP. The use of an LDS is designed to create initial training
data uniformly, so that each class initially has at least a few data points and the whole
boundary can be detected by any active learning algorithm, without repeating the
same training data point.
In our experiments, we compared the performance of the NI algorithm using
a pseudo-random number and an LDS as initial training data of the MLP. We
showed experimentally that LDSs have an advantage over pseudo-random
numbers as an effective method for generating initial training data, so that
the classification accuracy can be improved. In particular, the good performance
of LDSs becomes critical in higher-dimensional cases. In future work, we
would like to show a theoretical advantage of LDSs as a method for generating
initial data in active learning.

References

1. Les Atlas, David Cohn, and Richard Ladner. Training connectionist


networks with queries and selective sampling. Advances in Neural Infor-
mation Processing Systems, 2:566-573, 1990.
2. Jenq-Neng Hwang, Jai J. Choi, Seho Oh, and Robert J. Marks II. Query-
based learning applied to partially trained multilayer perceptrons. IEEE
Transactions on Neural Networks, 2(1):131-136, 1991.
3. D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning represen-
tations by back-propagating errors. Nature, 323:533-536, 1986.
4. D. MacKay. Information-based objective functions for active data selec-
tion. Neural Computation, 4:590-604, 1992.
5. Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Information, predic-
tion, and query by committee. Advances in Neural Information Process-
ing Systems, 5, 1993.
6. D. Cohn. Neural network exploration using optimal experiment design.
Neural Networks, 9, 1996.
7. Hiroyuki Takizawa, Taira Nakajima, Hiroaki Kobayashi, and Tadao
Nakamura. An active learning algorithm based on existing training data.
IEICE Transactions Information and Systems, E83-D(l):90-99, January
2000.
8. A. Linden and J. Kindermann. Inversion of multilayer nets. In Pro-
ceedings of International Joint Conference on Neural Networks, pages
425-430, Washington DC, June 1989.
9. H. Niederreiter. Random Number Generation and Quasi-Monte Carlo
methods, volume 63 of CBMS-NSF Regional Conference Series in Applied
Mathematics. SIAM, 1992.
10. Syu Tezuka. Uniform Random Numbers Theory and Practice. Kluwer
Academic Publishers, 1995.
EQUILIBRIUM SELECTION IN A SEQUENTIAL
MULTI-ISSUE BARGAINING MODEL WITH
EVOLUTIONARY AGENTS

NORBERTO EIJI NAWA 1,2, KATSUNORI SHIMOHARA 1,2, OSAMU KATAI 2

1 ATR International - ISD, Soraku-gun, Kyoto 619-0288, Japan
2 Grad. School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan

A multi-issue alternating-offers bargaining model is presented in which the issues are
sequentially negotiated by the agents, as opposed to the classical setting where they
are disputed simultaneously in bundles. The strategies that determine the agents'
negotiation behaviors are generated by evolutionary algorithms. Preliminary re-
sults show a qualitative conformity with game-theoretic predictions. Moreover,
they suggest that, in specific situations, the sequential setting can lead to better
outcomes in terms of social welfare.

1 Introduction

Recent years have witnessed an intense cross-fertilization between economics
and computer science, more specifically with the area of artificial intelligence
(AI). 1 Negotiation is the coordination mechanism that involves the interaction
of two or more parties with heterogeneous, possibly conflicting preferences,
searching for a compromise that is satisfactory and mutually beneficial, so
as to be accepted by all participants. It has long been a subject of study in
economics, but recently it has also attracted the interest of AI researchers,
due to its direct implications for the implementation of multi-agent systems.
This paper reports on preliminary results of experiments performed with
a sequential multi-issue bargaining model. The players have their bargaining
strategies developed by means of a class of evolutionary algorithms named
evolution strategies (ES). 2 Differently from the classical setting, where the
issues are disputed simultaneously in a single bundle, in the present model
each issue is negotiated individually, in sequence. Our interest in the sequen-
tial setting of bargaining processes lies in the fact that the negotiated
issues often have time-varying, inter-dependent complementarities. That is, from
the point of view of the players, the requirements with regard to a certain is-
sue may change depending on the results of negotiations over other
issues. If the negotiation occurs over bundles of issues, the players have to
consider these inter-relationships in advance in order to calculate the utilities
of the possible outcomes and settle an agreement that provides a satisfac-
tory trade-off. On the other hand, by negotiating the issues sequentially, it is
expected that these inter-issue relations are dealt with more naturally.

2 Bargaining Models

A bargaining situation consists of two or more players trying to engage in


a mutually beneficial agreement over a set of issues. The players, or agents,
have a common interest to cooperate; the question that remains open is which
one of the possibly several compromise settings will be chosen by the players. 3
That decision should be deliberated by the participating agents, in light of
their different and perhaps incompletely revealed, conflicting preferences.
The seminal work by Rubinstein4 set the dominant tone in the systematic
analysis of bargaining games. Rubinstein started by illustrating the typical
situation using the following scenario: two players, A1 and A2, have to reach
an agreement on the partition of a "pie". For this purpose, they alternate
offers describing possible divisions of the pie, such as "A1 receives x and A2
receives 1 − x at time t", where x is any value in the interval [0,1]. The
player that receives the offer decides whether to accept it or not. If the offer
is accepted, the negotiation process ends, and each player receives the share of
the pie determined by the agreed contract. Otherwise, the receiving player
makes a counteroffer, and all the steps above are repeated until a solution is
reached or the process is aborted due to some external reason (e.g. one of the
parties leaves the negotiation process). Players obtain utilities from shares of
the pie according to their private valuation functions.
In addition to the pie itself, time is also a relevant commodity. Interactions
between the players are not "frictionless"; every additional time step in the
bargaining process decreases the utility obtained by the agents. The discount
due to time is usually calculated as follows. Let y be the share received by
agent Ak; the time t at which the contract is settled is reflected in the obtained
payoff (utility) Uk through a fixed discount factor, Uk = y · δk^t, where
0 < δk < 1 is the discount factor. The lower δk, the stronger the time
pressure. In an infinite-horizon scenario, if the transactions are frictionless,
rational agents would most likely get trapped in an impasse, since they would
try to get the whole pie for themselves in an endless attempt to maximize
their utilities. In this sense, time pressure constitutes an incentive for mutual
concessions and agreements.
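The discounting rule can be written as a one-line helper (a sketch; the function name is ours, not the paper's):

```python
def discounted_utility(share, delta, t):
    """U_k = y * delta_k**t: the payoff from a share y of the pie,
    discounted by the factor delta_k for each stage t elapsed before
    the agreement is settled."""
    return share * delta ** t
```

For example, with δ = 0.9 a half-pie share agreed at stage 2 is worth 0.5 · 0.81 = 0.405, which is the "friction" that pushes the players towards early agreement.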
The most significant result introduced by Rubinstein4 is that, under the
assumptions that (1) the discount factors are common knowledge to the players
and (2) the number of stages to be played is infinite, the solution for the
game is unique, i.e., there is a perfect equilibrium partition (P.E.P). In other
words, in an exchange between rational agents, the first proposer should
(calculate and) offer the P.E.P; the first responder should then accept the offer
right away, making an instantaneous deal with no need for further interaction.

For a setting where agents A1 and A2 are penalized with discount factors
δ1 and δ2, respectively, and assuming that A1 is granted the first offer, the
composition of the P.E.P contract is that player A1 receives a share of the pie
which returns her a utility of U1 = (1 − δ2)/(1 − δ1δ2), whereas player A2 gets
a share that returns him a utility of U2 = δ2(1 − δ1)/(1 − δ1δ2).
It is possible to perform a similar analysis for the finite-horizon case. Say
the maximum number of steps in the game, n, is common knowledge to the
players. In the case where n = 1 (also known as the ultimatum game), agent
A1 makes the only offer; A2 can accept it or refuse it; in either case the
negotiation process ends. If the offer is refused, both agents receive nothing.
For a rational agent "anything is better than nothing"; therefore A1,
knowing about the rationality of its opponent, will tend to keep the whole
pie to herself, offering only a minimum share to A2; aware that there are no
further stages to be played in the game, rational A2 inevitably accepts the
tiny offer. Applying backward induction reasoning to the situation above,
it is possible to calculate the P.E.P for n > 1. For values of δ close to 1,
finite-horizon alternating-offers bargaining games give a great advantage to
the player making the last offer, since the game becomes similar to an ultimatum
game.
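The backward-induction argument can be sketched in a few lines (our own illustration; the function name and conventions are not from the paper). Starting from the ultimatum stage, each earlier proposer offers the responder exactly the discounted value of the responder's continuation, so for growing n the first proposer's share approaches Rubinstein's infinite-horizon value U1 = (1 − δ2)/(1 − δ1δ2):

```python
def pep_share(delta1, delta2, n):
    """Share of the pie that A1 keeps in the perfect equilibrium partition
    of an n-stage alternating-offers game, with A1 proposing at stage 0."""
    keep = 1.0  # the last-stage proposer keeps the whole pie (ultimatum case)
    for stage in range(n - 2, -1, -1):
        # the responder at an even stage is A2, at an odd stage A1
        responder_delta = delta2 if stage % 2 == 0 else delta1
        # offer the responder exactly its discounted continuation value
        keep = 1.0 - responder_delta * keep
    return keep
```

For n = 1 this returns 1.0 (the ultimatum outcome), and for large n it converges to the infinite-horizon formula above.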

3 Evolutionary Computation and Economic Agents


The dissertation by Oliver5 was the first work to show that
evolutionary algorithms can be used in the design of strategies for multi-issue
negotiations. Oliver's motivation originated from the observation that
negotiation problems are rather inefficiently resolved by humans, who often
settle on suboptimal agreements. In his framework, a strategy consists of
a vector of numbers that represent offer and threshold values. Offer values
indicate the portion that the proposer is willing to share with an opponent;
threshold values correspond to the minimum value a received offer should have
in order to be accepted.
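A minimal sketch of how such a strategy vector can be played out in an alternating-offers match (our simplification for a single issue; the exact encoding in Oliver's thesis differs):

```python
def negotiate(s1, s2, n):
    """s1, s2: per-stage (offer, threshold) pairs in [0, 1]. An offer states
    the share granted to the opponent; a threshold is the minimum incoming
    offer a player accepts. Returns (share1, share2, stage) or None."""
    for t in range(n):
        proposer, responder = (s1, s2) if t % 2 == 0 else (s2, s1)
        offer = proposer[t][0]          # share offered to the responder
        if offer >= responder[t][1]:    # responder's acceptance threshold
            prop_share, resp_share = 1.0 - offer, offer
            if t % 2 == 0:
                return prop_share, resp_share, t
            return resp_share, prop_share, t
    return None  # disagreement: both players end up with nothing
```

For instance, `negotiate([(0.4, 0.5)], [(0.3, 0.35)], 1)` is accepted at the first stage and splits the pie 0.6/0.4 in favour of the proposer.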
The work by van Bragt, Gerding and La Poutre 6 has a different spirit
from the previous one; an interesting and detailed game-theoretic analysis of
the evolved trading strategies is performed. Using Rubinstein's alternating-
offers bargaining game with a finite-horizon as a base model, they perform
several numerical experiments with traders that evolve strategies in the same
format devised by Oliver.5 The results show that despite the bounded ra-
tionality of the traders (since they are only aware of their own payoffs and
discount factor), the evolved behaviors are aligned with what is predicted by
game theory.

4 Sequential Multi-issue Bargaining Model


In the classic multi-issue alternating-offers model, the issues are negotiated
simultaneously, in a bundle. If there are N issues in the setting, trader Ak
makes an offer to its opponent by passing a vector Ok = (o1, o2, o3, ..., oN)
with one offer relative to each one of the issues. Usually in multi-issue
situations, a trader describes its preferences over the several issues through a
vector of weights, Wk = (w1, w2, w3, ..., wN), indicating the relative
importance of the issues. If an offer Ok is accepted, the offering trader
receives a utility of (I − Ok) · Wk, where I is the unit vector, assuming that
0 ≤ oi ≤ 1, i ∈ {1, 2, ..., N}. Accordingly, the agent receiving the offer gains
a utility of Ok · Wq, where Wq denotes its own weight vector.
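As a sketch (our naming), the utilities of an accepted bundle offer follow directly from these dot products:

```python
def bundle_utilities(offer, w_proposer, w_responder):
    """offer[i] in [0, 1] is the share of issue i granted to the responder;
    the proposer keeps 1 - offer[i]. Returns the weighted utilities
    (I - O_k) . W_k for the proposer and O_k . W_q for the responder."""
    u_prop = sum((1.0 - o) * w for o, w in zip(offer, w_proposer))
    u_resp = sum(o * w for o, w in zip(offer, w_responder))
    return u_prop, u_resp
```

With the weight vectors used later in the experiments, an even split of both issues yields 0.5 to each trader.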
However, often the issues are inter-related and complementary; the util-
ity of an issue is a function of the values obtained from other issues. One
could devise situations where the weights attributed to the issues may change
according to the value obtained in other issues, or vary as a function of some
other external parameter, such as time. If there are single issues or subgroups
of issues within the whole set that are substitutable, it may be the case that
the utility obtained with one issue or a subgroup of issues affects the weight
assigned to other issues.
Building on an example presented by Boutilier et al.,7 if a producer
is negotiating with a transportation company over the most suitable way to carry
its goods to the consumers, the agenda of issues may contain options such as
trucks, ships, and airplanes. However, if the producer succeeds in obtaining a
reasonable deal with the trucks, the utility of the ships and airplanes would
be diminished. Negotiating all the issues at once in such a scenario demands
that the agent consider all the inter-dependencies between the issues before
computing the utility of a contract or making an offer to its opponent. The
calculation of all the possible combinations and trade-offs can be computationally
expensive, especially if the number of related issues is large. By
negotiating the issues sequentially, this cost could be naturally avoided.

4-1 Model Description and Experiments


Experiments were performed^a with a model inspired by van Bragt et al.'s
framework.6 Two bargaining agents, A1 and A2, each equipped with its
own evolutionary algorithm, optimize the numerical parameters of the negotiation
strategies. The strategies consist of vectors of floating point numbers in
^a The system was implemented using the Swarm Simulation System, developed by the
Swarm Development Group. Detailed information about the software can be found at
www.swarm.org. The source code used in the experiments described in this paper is
available upon request.

the interval [0,1], encoding offers and thresholds, like the strategies employed
by Oliver.5 Being a finite-horizon model, the total number of offers that can
be exchanged between the traders has a maximum value of n. If n is even,
as A1 always makes the first offer, the last offer is granted to A2. If n is odd,
A1 has both the first and the last offer. Traders must reach an agreement before
the maximum number is exceeded, otherwise they receive a null payoff. As
the issues are negotiated in sequence, each strategy corresponds to a set of N
sub-strategies, one relative to each issue.
Each agent uses a conventional (μ + λ) evolution strategy (ES).2 In one
complete iteration, all the strategies are evaluated and ordered according to
their fitness values. In a (μ + λ)-ES, the best μ strategies (parents) remain in
the set from one iteration to the next; in addition, λ new strategies (offspring)
are produced at each iteration. Offspring are generated by applying operators
such as mutation and recombination to the set of parents.
In the experiments, only the mutation operator was employed when generating
offspring. In an ES, mutation consists of adding or subtracting samples
from a Gaussian distribution with standard deviation s to the parameters of
a certain parent strategy. The parameter s is self-regulated and determines
the strength of the mutation. Each strategy keeps the s value of the Gaussian
distribution from which it was generated; at each iteration, the average of
the parents' standard deviations is used to produce the Gaussian distribution
that generates the next set of offspring.
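One generation of the described (μ + λ)-ES might look like the following sketch. It is written under our own assumptions; in particular, the lognormal self-adaptation rate `tau` is an illustrative detail, not a parameter reported in the paper:

```python
import math
import random

def es_generation(parents, fitness, lam, tau=0.1):
    """parents: list of (vector, s) pairs, already the mu best.
    Each offspring mutates a random parent with a Gaussian whose standard
    deviation derives from the average of the parents' s values."""
    mu = len(parents)
    s_avg = sum(s for _, s in parents) / mu
    offspring = []
    for _ in range(lam):
        vec, _ = random.choice(parents)
        s_new = s_avg * math.exp(tau * random.gauss(0.0, 1.0))  # self-regulation
        child = [x + random.gauss(0.0, s_new) for x in vec]
        offspring.append((child, s_new))
    # (mu + lambda) selection: parents compete with offspring for survival
    pool = parents + offspring
    pool.sort(key=lambda ind: fitness(ind[0]), reverse=True)
    return pool[:mu]
```

Because the μ parents survive into the selection pool, the best fitness in the population never decreases from one generation to the next.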
Threshold and offer values were only accepted in the range [−1, 1] (negative
values were used in their absolute form); any strategy that contained
a value out of that range received a penalty, if the respective parameter was
demanded by the negotiation process.
The parameters μ and λ were both set to 25. Each simulation instance was
run for at least 750 generations. At every generation, each one of the strategies
owned by A1 had to confront a randomly chosen subset of size 25 of A2's
strategies, and vice versa. The fitness value of a strategy was calculated as the
mean of all the payoffs and penalties obtained in the confrontations
in one generation.
Ak's payoff, Uk, was calculated as follows. Assume a deal on the first
issue I is reached at t = T_I, yielding Ak a share of α, and a deal on the second
issue II is reached at t = T_II, yielding a share of β; then Uk is:

Uk = (δ^T_I · α · w_I^Ak + δ^T_II · β · w_II^Ak) / (w_I^Ak + w_II^Ak)    (1)

Note that the discount factor is more severe on II's share, as it is negotiated
at least one stage after issue I. Moreover, in this model of sequential
bargaining, if the traders cannot reach an agreement on the division of issue
I, the confrontation is halted and the bargaining on issue II is canceled.
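The payoff rule above translates directly into code (a sketch with our variable names; a single δ is shared by both issues, as in the experiments, and a failed issue yields the null payoff):

```python
def sequential_payoff(delta, shares, times, weights):
    """Weighted, time-discounted payoff over sequentially negotiated
    issues: shares[i] of issue i is agreed at stage times[i] and weighted
    by weights[i], normalised by the total weight."""
    if None in shares:
        return 0.0  # no deal on some issue: the confrontation yields nothing
    num = sum(delta ** t * a * w for a, t, w in zip(shares, times, weights))
    return num / sum(weights)
```

With δ = 1 (no time pressure) this reduces to a weighted mean of the shares, and smaller δ penalises issue II more, since it settles at a later stage.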
In the first session of experiments, the influence of different values of δ
was investigated. The tested values of δ were in the interval from 0 to 1, in
increments of 0.1. The original amount of the pie at t = 0 is 1. The same value
of δ is applied to both traders and both issues. The vectors of weights for agents A1
and A2 were, respectively, (w_I^A1, w_II^A1) = (0.3, 0.7) and (w_I^A2, w_II^A2) = (0.7, 0.3).
Figures 1 and 2 show the P.E.Ps predicted by a game-theoretic analysis
for bargaining games of size n → ∞ (solid line) and n = 10 (dashed
line), for agents A1 and A2, respectively. These partitions were calculated
by regarding the negotiation process of each one of the issues as a single
game. After calculating the values of each agent's shares in the equilibrium
for each one of the games, the utilities were calculated by (1) discounting δ in
the share obtained from issue II, as it is negotiated one stage after issue I,
and (2) weighting the equilibrium shares with the respective set of individual
weights. The dotted lines are the payoffs obtained by the evolutionary traders
(mean value over the whole set of strategies in the last 100 generations from
a total of 750); 20 runs were performed for each of the δ values. The vertical
bars at each of the tested points show the standard deviation of the results.


Figure 1. Relation between the discount factor and agent A1's utility, in the multi-issue
sequential model of sizes n = 10 (dashed) and n → ∞ (solid). The dotted line shows the
utility actually obtained by the evolutionary agent in the experiments, when (w_I^A1, w_II^A1) =
(0.3, 0.7).

As noted by van Bragt et al., 6 despite the bounded rationality of the bar-
gainers, who have no explicit representation of the size of the game or any
knowledge about the opponent's discount factor values, the traders achieve

Figure 2. Same as in Figure 1, for agent A2's utility, when (w_I^A2, w_II^A2) = (0.7, 0.3).

outcomes which are qualitatively close to what is predicted by game-theoretical
models. In these results, a phenomenon that was previously observed6
is also detected: A1 does a little better than the game-theoretic predictions,
whereas A2 performs considerably worse. It has been suggested6 that the poor
performance of A2 is due to the fact that, especially for small values of δ, it is too
costly for A2 not to accept A1's first offer; the latter, taking advantage of this
fact, then offers very small shares of the pie. Indeed, it is also observed in the
results of this first session that the smaller the δ, the higher the frequency
of deals that are closed right away in the first stage (Table 1). Also, it is
interesting to notice that there is a great leap in the average size of the negotiation
process between δ = 0.9 and δ = 1.0. Intuitively, one would expect the
negotiation process sizes to grow smoothly, following the decrease in time
pressure. However, the parameter δ is strongly perceived by the evolutionary
processes, which leads the agents to play the game as if there were only one
offer to be exchanged, resulting in a great advantage to the agent that makes
the first offer. Table 2 shows the average value of the first offer for issue I
by A1 over all the strategies at the 750th generation, and the corresponding
average threshold of A2, against which the first offer is checked. It is interesting
to notice that despite the spread of A2's threshold values, the offers by
A1 are equally spread and just high enough to be accepted right away by
A2 (on average).
In the second session of experiments, the vector of weights (w_I^A1, w_II^A1) was
set to different values. 20 runs were performed for each case, with δ = 0.9. The
data shown in Table 3 are averages of the payoffs obtained in all confrontations
in the last 100 generations (of a total of 750).

Table 1. Average number of stages until an agreement is reached as a function of the
discount factor (average for both issues over 20 runs, with n = 10).

δ    # stages (σ)     δ    # stages (σ)     δ    # stages (σ)
0.1  1.06 (0.00)      0.5  1.11 (0.01)      0.8  1.24 (0.02)
0.2  1.04 (0.00)      0.6  1.13 (0.01)      0.9  1.39 (0.04)
0.3  1.06 (0.00)      0.7  1.22 (0.02)      1.0  8.19 (0.12)
0.4  1.05 (0.01)

Table 2. A1's average first offer and A2's average first threshold for issue I, across all the
strategies at the 750th generation, for each one of the runs (δ = 0.9, n = 10).

Run  A1 off.  A2 thr.     Run  A1 off.  A2 thr.
1    0.28     0.01        11   0.63     0.55
2    0.79     0.68        12   0.78     0.73
3    0.55     0.45        13   0.73     0.62
4    0.35     0.31        14   0.65     0.54
5    0.55     0.37        15   0.14     0.05
6    0.17     0.02        16   0.65     0.57
7    0.85     0.80        17   0.38     0.30
8    0.37     0.27        18   0.91     0.83
9    0.79     0.68        19   0.48     0.38
10   0.82     0.72        20   0.35     0.28

Table 3. Results using different weights for A1, for 20 runs each. (w_I^A2, w_II^A2) = (0.7, 0.3),
and δ = 0.9, n = 10 (ut. = utility; * marks the values used in the previous session).

(w_I^A1, w_II^A1)  A1 ut.  A1 ut. std.  A2 ut.  A2 ut. std.
(0, 1)             0.42    0.01         0.64    0.01
(0.1, 0.9)         0.47    0.00         0.53    0.01
(0.3, 0.7)*        0.43    0.01         0.50    0.00
(0.5, 0.5)         0.48    0.00         0.42    0.00
(0.7, 0.3)         0.48    0.00         0.43    0.00
(0.9, 0.1)         0.54    0.00         0.39    0.00
(1, 0)             0.53    0.01         0.46    0.00

Noticeably, there is a tendency for A1 to receive higher utility as w_I^A1
increases. This can be justified as a combination of two factors.
First, the influence of the discount factor applied to II's share decreases, as
II's relative importance gradually diminishes in the total utility received by
A1. Second, as the importance of issue I increases, A1 exploits A2's inability
to use the advantage of offering last, in the presence of a relatively slight discount
factor.
In the last session of experiments, a simple case of a negotiation over inter-substitutable
issues was simulated. The values of w_II^Ak were made dependent
on the size of the share obtained from issue I, i.e., if an agent obtained more
than 0.9 of issue I, then w_II was made 0.1, otherwise 0.9. The weight
assigned to I by both agents was fixed at 0.3. As n = 10 and δ = 1.0, A2
has the last-offer advantage in both issues. We were particularly interested
to see whether it would be possible for the players to engage in an agreement
which would be socially fair, i.e., once A2 obtained a large portion of issue
I, and therefore became "less interested" in disputing issue II, would that
allow A1 to obtain a more satisfactory share of issue II? In fact, the weight
ratio between the most and least valued issues is kept constant (3:1) for both
agents; however, whether issue I or II is the most valued depends on how
the negotiation of I develops. As the relative importance of the issues may
shift between them, they are considered to be substitutable. The question
addressed is whether learning agents equipped with evolutionary algorithms
are capable of achieving a mutually satisfactory solution in such a setting.
Figure 3 shows the histograms of the utilities obtained by A1 in the setting
with variable valuations (right) and in a test case (left), where both issues are
equally valuated with fixed weights by the players throughout the bargaining
game (w_I = w_II = 0.5). This latter case can be interpreted as a situation where the
agents regard the issues as being perfectly inter-substitutable; as the agents
valuate them with the same importance, both issues are disputed with the
same strength. From the results, it is possible to observe that, as expected,
when the weights are fixed, A2 makes use of its last-offer advantage very
frequently, yielding A1 very low payoffs (leftmost bar at U1 = 0). When the
valuation is variable, though the frequency of low payoffs is still relatively
high (i.e., A1 loses I and II entirely to A2), with almost the same frequency
A1 is able to obtain everything of issue II. The bar at 0.7-0.8 in the right
figure covers the cases where A1 obtains practically nothing of I and almost
everything of II, representing a situation of high social welfare, which suggests
that there are situations where the sequential negotiation can be beneficial.

Figure 3. Histograms of the average utility obtained by A1 over 50 runs, in the last 100
generations (total of 1000), with fixed (left) and variable weights (right).

5 Conclusions
This paper presented a sequential multi-issue alternating-offers bargaining
model, in which the agents have their strategies devised by an evolutionary
algorithm. Unlike the usual bargaining model, where several
issues are negotiated simultaneously, in this setting the issues are disputed one
by one, in sequence. Numerical experiments were performed; the results are
qualitatively aligned with game-theoretic predictions, as previously shown for
a simultaneous multi-issue model,6 despite the fact that the evolving agents
are subject to no restrictions concerning rational behavior. A simple case with
inter-substitutable issues was also presented, illustrating a possible scenario where
a sequential negotiation may actually be beneficial for both parties in achieving
a satisfactory agreement.

Acknowledgments
Thanks to four anonymous reviewers for their helpful comments. NEN re-
ceives partial financial support from CNPq under grant #200050/99-0.
References
1. C. Boutilier, Y. Shoham, and M. P. Wellman, editors. Artificial Intelligence, vol. 94
(1-2), July 1997.
2. T. Bäck, G. Rudolph, and H.-P. Schwefel. Evolutionary programming and evolution
strategies: Similarities and differences. In Proc. of the 2nd Annual Evolutionary
Programming Conference, 11-22, February 1992.
3. A. Muthoo. A non-technical introduction to bargaining theory. World Economics,
145-166, 2000.
4. A. Rubinstein. Perfect equilibrium in a bargaining model. Econometrica, 50(1):97-
109, January 1982.
5. J. R. Oliver. On Artificial Agents for Negotiation in Electronic Commerce. PhD
thesis, U. of Pennsylvania, 1996.
6. D. D. B. van Bragt, E. H. Gerding, and J. A. La Poutré. Equilibrium selection in
alternating-offers bargaining models: The evolutionary computing approach. In 6th
Int. Conf. of the Society for Computational Economics on Computing in Economics
and Finance (CEF'2000), July 2000.
7. C. Boutilier, M. Goldszmidt, and B. Sabata. Sequential auctions for the allocation
of resources with complementarities. In Proc. of the Int. Joint Conf. on Artificial
Intelligence (IJCAI-99), 527-534, 1999.
AFFECT AND AGENT CONTROL:
EXPERIMENTS WITH SIMPLE AFFECTIVE STATES

MATTHIAS SCHEUTZ
Department of Computer Science and Engineering
University of Notre Dame, Notre Dame, IN 46556, USA
E-mail: mscheutz@cse.nd.edu

AARON SLOMAN
School of Computer Science
The University of Birmingham, Birmingham, B15 2TT, UK
E-mail: axs@cs.bham.ac.uk

We analyse control functions of affective states in relatively simple agents in a variety of en-
vironments and test the analysis in various simulation experiments in competitive multi-agent
environments. The results show that simple affective states (like "hunger") can be effective in
agent control and are likely to evolve in certain competitive environments. This illustrates the
methodology of exploring neighbourhoods in "design space" in order to understand tradeoffs in
the development of different kinds of agent architectures, whether natural or artificial.

1 Introduction

Affective states (such as emotions, motivations, desires, pleasures, pains, attitudes,


preferences, moods, values, etc.) and their relations to agent architectures have been
receiving increasing attention in AI and Cognitive Science.1,2,3 Detailed analyses
of these subspecies of affect should include descriptions of their functional roles
in contributing to useful capabilities within agent architectures4, complemented by
empirical research on affect in biological organisms and concrete experiments with
synthetic agent architectures, to confirm that the proposed architectures have the
claimed properties. Our approach contrasts with most evolutionary AI research,
which attempts to discover what can evolve from given initial states. Instead, we
explore "neighbourhoods" and "mini-trajectories" in design space: we start with
examples of agent architectures, then explicitly provide possible extensions together
with evolutionary operators that can select them, and run simulations to investigate which of
the extensions have evolutionary advantages in various environments. This can show
how slight changes in environments alter tradeoffs between design options.
To illustrate this methodology, we next analyse functional roles of affective
states and then describe our simulation experiments which show how certain sim-
ple affective control mechanisms can be useful in a range of environments and are
therefore likely to evolve in those environments.


2 What Affective States are and aren't

If we attempt to define "affective" simply in terms of familiar examples, such as
"desiring", "having emotions", "enjoying", etc., we risk implicitly restricting the
notion to organisms with architectures sufficiently like ours. That could rule out
varieties of fear, hunger, or aggression found in insects, for example. We need an
architecture-neutral characterisation, which is hard to define if it is to be applicable
across a wide range of architectures (such as insect-like reactive architectures or
deliberative architectures with mechanisms able to represent and reason about non-
existent and possible future states). Our best hope is to define "affective" in terms
of a functional role which can be specified independent of the specific features of an
architecture.
The intuitive notion of "affect" already has two aspects that are relevant to a
variety of architectures, namely direction and evaluation. On the one hand there is
direction of internal or external behaviour, for instance, wanting something or trying
to avoid something. On the other hand there is positive or negative evaluation of what
is happening, for instance, enjoying something or finding it unpleasant. However,
even evaluation is linked to direction insofar as enjoying involves being disposed to
preserve or repeat and finding painful involves being disposed to terminate or avoid.
Either way affective states are examples of control states5.
Yet, not all states in control systems are affective states, even if they have some
effect on internal or external behaviour. For instance, perceiving, knowing, reason-
ing, and self-monitoring can influence behaviour but are not regarded as affective.
Suppose an agent can use structures as representations of states of affairs (never
mind how). Anything that represents must be capable of failing to represent. There
are various kinds of mismatch, and in some cases the mismatch can be detected,
for instance perceiving that some desired state has or has not been achieved, or that a
goal is being approached but very slowly. If detection of a mismatch has a disposition
to cause some behaviour to reduce the mismatch there are (to a first approximation)
two main cases: (1) the behaviour changes the representation to fit the reality, or
(2) the behaviour changes reality to fit the representation. In (1) the system has a
"belief-like" state, and in (2) a "desire-like" state. In other words, belief-like states
tend to be changed to make them fit reality, whereas attempts are made to change
reality to make it fit desire-like states. It is this distinction between belief-like and
desire-like control states that can give us a handle on how to construe affective states,
namely as "desire-like" control states whose role is initiating, evaluating and regulating
internal or external behaviour, as opposed to merely acquiring, interpreting,
manipulating, or storing information (that might or might not be used in connection
with affective states to initiate or control behaviour).
A state representing the current position of an effector, or the location of food
in the environment, or the agent's current energy level is, therefore, not an affective
state. However, states derived from these which are used to initiate, select, prioritise,
or modulate behaviour, either directly or indirectly via other such states would be af-
fective states. An example might be using a measurement of the discrepancy between
current energy level and a "target" level (a "hunger" representation), to modulate the
tendency of the system to react to perceived food by going for it. This might use
either a "hunger threshold" to switch on food-seeking or a continuous gain control.
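A minimal sketch of such a "hunger" control state (all names and constants here are illustrative, not taken from the simulation described later in the paper):

```python
def food_seeking_gain(energy, target, mode="threshold", hunger_threshold=0.3):
    """'Hunger' as a desire-like control state: the discrepancy between the
    current energy level and a target level modulates the tendency to react
    to perceived food, either via a switch or via a continuous gain."""
    hunger = max(0.0, (target - energy) / target)  # normalised discrepancy
    if mode == "threshold":
        # switch food-seeking fully on once hunger exceeds the threshold
        return 1.0 if hunger > hunger_threshold else 0.0
    # continuous gain control: the tendency grows smoothly with hunger
    return hunger
```

Note that the energy reading itself is belief-like; it is only the derived quantity used to modulate behaviour that plays the affective, desire-like role.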
In complex cases, the "reference states" used to determine whether corrective
action is required may be parametrised by dynamically changing measures or de-
scriptions of the sensed state to be maintained or prevented, and the type of correc-
tive action required, internally or externally. For instance, an organism that somehow
can record how frequently food sources are encountered might use a lower hunger
threshold to trigger searching for food. If sensitive to current terrain it might trigger
different kinds of searches in different terrains. Thus, while the records of food
frequency and terrain features are being acquired, they function as components of perceptual
or belief-like states, whereas when they are used to modulate decision making they
function as components of affective states.
Affective states can vary in cognitive sophistication. Simple affective mecha-
nisms can be implemented within a purely reactive architecture, like the "hunger"
example. More sophisticated affective states which include construction, evalua-
tion and comparison of alternatives, or which require high-level perceptual categori-
sations, would require the representational resources of a deliberative architecture.
However, recorded measurements or labels directly produced by sensors in reactive
architectures can have desire-like functions, and for that reason can be regarded as
affective states that use a primitive "limiting case" class of representations6.
The remainder of this paper describes simulation experiments where agents with
slightly different architectures compete for resources in order to survive in a carefully
controlled simulated environment. Proportions surviving in different conditions help
to show the usefulness of different architectural features in different contexts. It turns
out that simple affective states can be surprisingly effective.

3 The Simulation Environment

The simulated environment consists of a rectangular surface of fixed size (usually


around 800 by 800 units) populated with various kinds of agents and other objects
such as "lethal" entities of various sizes, some static and some moving at different
speeds in different directions, and "food items" (i.e., energy sources which pop up
at random locations and disappear after a pre-determined period of time unless con-
sumed by agents). Agents use up energy at a fixed rate, when stationary, and require
additional energy proportional to their speed, when moving. Hence, they are in permanent
need of food, which they consume by sitting on top of a food source, in a
time proportional to the energy stored in the food source and limited by the maximum
amount of energy an agent can take in at any given time. Agents die and are removed
from the simulation if they run out of energy, or if they come into contact with lethal
entities or other agents.
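The energy bookkeeping can be sketched as follows (our constants and field names are illustrative; the actual rates are simulation parameters not given in the text):

```python
BASE_COST = 0.1   # energy used per cycle when stationary (assumed value)
MOVE_COST = 0.05  # extra energy per unit of speed (assumed value)

def update_energy(agent):
    """One simulation cycle of energy use: a fixed metabolic rate plus a
    cost proportional to speed; the agent dies when energy runs out."""
    agent["energy"] -= BASE_COST + MOVE_COST * agent["speed"]
    agent["alive"] = agent["energy"] > 0.0
    return agent
```

This constant drain is what makes food-seeking a permanent implicit goal for every agent in the simulation.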
All agents are equipped with a "sonar" sensor to detect lethal entities, a "smell"
sensor to detect food, a "touch" sensor to detect impending collisions and an internal
sensor to measure their energy-level. For both sonar and smell sensors, gradient
vectors are computed and mapped onto the effector space (see below), yielding the
direction in which the agent will move. The touch sensor is connected to a global
alarm system, which triggers a reflex to move away from anything touched, unless it
is food. These movements are initiated automatically and cannot be controlled by the
agent. They are somewhat erratic and will slightly reorient the agent (thus helping it
to get out of "local minima").
On the effector side, agents have motors for locomotion (forward and backward),
motors for turning (left and right in degrees) and a mechanism for consuming food.
After a certain number of simulation cycles, agents reach maturity and can pro-
create asexually, in which case depending on their current energy level they will have
a variable number of offspring which pop up in the vicinity of the agent one at a time
(the energy for creating a new agent is subtracted from the parent, occasionally caus-
ing the parent to starve). While different agents may have different short term goals
at any given time (e.g., getting around lethal entities or consuming food), common to
all of them are the two implicit goals of survival (i.e., to get enough food and avoid
running into/getting run over by lethal entities or other agents) and procreation (i.e.,
to live long enough to have offspring).
For evolutionary studies, a simple mutation mechanism modifies with a certain
probability some of the agent's architectural parameters (e.g., the parameters respon-
sible for integrating smell and sonar information). Some offspring will then start out
with the modified parameters instead of being exact copies of the parent. This mu-
tation rate as well as various other parameters need to be fixed before each run of
the simulation (a more detailed description of the simulation and its various control
parameters is provided elsewhere)7.
It is worth pointing out that our setup differs in at least two ways from other sim-
ulated environments that have been used to study affective states.8,9,10,11,12 First, by
allowing agents to procreate (i.e., have exact copies of themselves as offspring) we
can study trajectories of agent populations and can thus identify properties of archi-
tectures that are related to and possibly influence the interaction of agent populations.
And second, by adding mutation, we can examine the potential of architectures to
be modified and extended over generations of agents. In particular, by controlling
which components of an architecture can change while allowing for randomness in
the way they can change, we are able to study evolutionary tradeoffs of such exten-
sions/modifications. From these explorations of "design space" and "niche space"13
we cannot only derive advantages and disadvantages of architectural components,
but also the likelihood that such components would have evolved in natural systems
using natural selection.

4 The Agents and their Architectures

In the following we consider two kinds of agents: reactive agents (R-agents) and
simple affective agents (A-agents) (other studies have compared different kinds7).
R-agents process sensor information and produce behavioural responses using a
schema-based approach, which obviates the need for a special action selection mech-
anism: both smell and sonar sensors provide the agent with directional and inten-
sity information of the objects surrounding the agent within sensor reach, where
intensity = 1/distance² (i.e., the distance of the object from the current position
of the agent). The sum of these vectors (call them S and F for sonar and food, re-
spectively) is then computed as a measure of the distribution of the respective objects
in the environment and passed on to the motor schema, which maps perceptual space
into motor space yielding the direction in which to go: δS + γF (where δ and γ are
the respective gain values).a
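A minimal sketch of this schema-based mapping, under our own assumptions about the geometry (objects as 2-D points, unit direction vectors weighted by intensity); the function names and vector encoding are illustrative, not the paper's:

```python
import math

def perceptual_vector(objects, agent_pos):
    """Sum of unit direction vectors to objects, each weighted by
    intensity = 1/distance^2 (our illustrative 2-D encoding)."""
    vx = vy = 0.0
    for ox, oy in objects:
        dx, dy = ox - agent_pos[0], oy - agent_pos[1]
        d = math.hypot(dx, dy)
        if d > 0:
            w = 1.0 / d ** 2          # intensity falls off with squared distance
            vx += w * dx / d
            vy += w * dy / d
    return vx, vy

def motor_direction(S, F, delta, gamma):
    """Motor schema: map perceptual space into motor space, yielding the
    heading of delta*S + gamma*F (delta, gamma are the gain values)."""
    x = delta * S[0] + gamma * F[0]
    y = delta * S[1] + gamma * F[1]
    return math.atan2(y, x)           # direction in which to move (radians)
```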
A-agents are extensions of R-agents. They have an additional component, which
can influence the way sensory vector fields are combined by altering the gain value
γ based on the level of energy. In accordance with our earlier analysis of affective
states as modulators of behaviours and/or processes, this component implements an
affective state, which we call "hunger".
The difference in the architecture gives rise to different behaviour: R-agents are
always "interested" in food and go for whichever food source they can get to, while
A-agents are only interested in food when their energy levels are low. Otherwise
they tend to avoid food and thus competition for it, reducing the likelihood of getting
killed because of colliding with other competing agents or lethal entities.
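The hunger mechanism can be sketched as a modulation of the food gain γ; the threshold value and the degree of food avoidance when sated below are illustrative assumptions, not values from the paper:

```python
def food_gain(energy, base_gamma=1.0, hunger_threshold=0.3):
    """The affective "hunger" state as a behaviour modulator: it alters
    the food gain gamma based on the energy level. Threshold and the mild
    food avoidance when sated are illustrative assumptions."""
    if energy < hunger_threshold:
        # hungry: the food gain grows as energy drops
        return base_gamma * (1.0 + (hunger_threshold - energy) / hunger_threshold)
    # sated: slightly avoid food, and hence competition for it
    return -0.2 * base_gamma
```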

5 The Behavioural Potential of a Simple Affective State

We start our series of experiments by checking whether each agent kind can survive
in various kinds of environments on its own. Five agents of the same kind are placed
in various environments (from environments with no lethal entities to very "danger-
ous" environments with both static and moving lethal entities) at random locations
to "average out" possible advantages due to their initial location over a large number
a Note that this formula leaves out the details for the touch sensor for ease of presentation.

Table 1. Surviving agents in an n-environment when started with 5 agents of only one kind.

            R-agents                A-agents
Env      μ      σ     Con        μ      σ     Con
 0     14.60   2.80   1.73     19.20   2.74   1.70
 5     13.20   4.78   2.96     17.20   3.05   1.89
10     11.90   3.81   2.36     17.20   3.77   2.33
20     11.60   3.47   2.15     15.40   3.95   2.45
30      7.50   4.43   2.75     13.00   3.56   2.21
40      2.90   3.57   2.21     10.40   3.57   2.21
50      0.20   0.63   0.39      8.00   3.56   2.21

Table 2. Surviving agents in an n-environment when started with 5 agents each of both kinds.

            R-agents                A-agents
Env      μ      σ     Con        μ      σ     Con
 0      0.00   0.00   0.00     17.20   3.61   2.24
 5      0.00   0.00   0.00     16.30   2.91   1.80
10      1.60   5.06   3.14     14.50   6.54   4.05
20      0.10   0.32   0.20     14.50   4.22   2.62
30      0.00   0.00   0.00     15.10   3.35   2.08
40      0.00   0.00   0.00     12.80   2.49   1.54
50      0.00   0.00   0.00     10.00   3.16   1.96

of trials. The "food rate" is fixed at 0.25 and the procreation age at 250 update cy-
cles. Table 1 shows for each agent kind the average number (μ) of surviving agents
as well as standard deviation (σ) and confidence interval (Con) for α = 0.05 taken
over 10 different runs of the simulation, each for 10000 environmental updates for
a given environment (where "n-environment" is intended to indicate that n static
and n moving lethal entities were placed at random in the environment). Note that
A-agents do significantly better than R-agents if measured in terms of the average
number of agents in each environment at any given time.
Given that each agent kind can survive on its own, we now compare the per-
formance of mixed groups of R- and A-agents. It turns out that A-agents reliably
outperform R-agents in all considered environments (see Table 2).
The results depend neither on the initial number of agents nor on the distribution
of moving and static lethal entities: experiments with different numbers of initial
agents of each kind as well as experiments with different numbers of moving and
static lethal entities (that added up to the same total) yield very similar results. Higher
food rates (e.g., of 0.5) do not change the picture either, rather they show even more
clearly the ability of affective agents to coexist in large groups. With lower food rates
the advantage of A-agents over R-agents slowly decreases as waiting for hunger to
grow before moving towards food is not a good strategy. Eventually, at food rates
of 0.125 and below, survival in crowded environments becomes impossible for any
agent kind: there are simply too many lethal entities obstructing the paths to food.
The superior performance of A-agents might not seem very surprising, since
the additional information about the current energy level, ignored by R-agents, but
utilized by A-agents, allows for a more complex mapping between sensory input and
behavioural output. However, using more information does not automatically lead to
better performance, as can be seen from the fact that A-agents may lose out against
R-agents if the "rules of the game" are slightly modified: in a simulation without
procreation, where either the numbers of surviving agents of each kind are counted
after a predetermined number of cycles or the average lifespan of an agent is used
as a measure of fitness, R-agents almost always perform (slightly) better than A-
agents (in all of the above environments). Only in combination with procreation is
the tendency of A-agents to distribute themselves better over the whole environment
(in Seth's terminology: their lower degree of "clumpiness"12) by virtue of being
at times less attracted to food beneficial, as their offspring will benefit from not
having to compete immediately with many other agents in their vicinity. In this
light, the answer to the question of whether A-agents can be produced by
some evolutionary process is not obvious at all.

6 Simple Affective States Can Be Evolved

To study the degree to which simple affective states like "hunger" can be evolved in
a competitive environment, we allowed for mutation of the link between the com-
ponent connected to the energy sensor (which is supposed to assume the role of the
affective "hunger" state) and the component encoding the food gain value γ in the
mapping from perceptual to motor space. This link, expressed as a multiplicative
factor and called "foodweight", is initialised at random in the interval (−0.2, 0.2).
Whenever an agent has offspring, the probability for "genetic modification" of the
foodweight is 1/3 and the probability for weight increase/decrease (by the given fac-
tor τ = 0.05) is 1/6, respectively. Everything else remains the same.
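The mutation scheme can be sketched directly from these probabilities (initialisation uniform in (−0.2, 0.2); modification with probability 1/3, split evenly between increase and decrease by τ); the function names are our own:

```python
import random

TAU = 0.05  # the increment/decrement factor tau

def initial_foodweight(rng=random):
    """Foodweight initialised at random in the interval (-0.2, 0.2)."""
    return rng.uniform(-0.2, 0.2)

def mutate_foodweight(w, rng=random):
    """With probability 1/3 the foodweight is genetically modified;
    increase and decrease by TAU are equally likely (1/6 each)."""
    r = rng.random()
    if r < 1 / 6:
        return w + TAU
    if r < 1 / 3:
        return w - TAU
    return w
```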
Of all seven environments, A-agents did not survive in the 40- and 50-
environments, which are very tough in that wrong moves are punished right away:
there is simply no room for genetic trial and error.b
In the other five environments, A-agents evolved using the state in the expected
way, although to varying degrees: the less crowded an environment, the better the use

b The only agents that survived on 2 out of 10 runs were the R-agents in 40-environments.

Table 3. Average weight values, standard deviation and confidence level at α = 0.05 for the
"foodweight" of the surviving affective agents in one run in an n-environment.

         Foodweight                      Foodweight
Env     μ      σ     Con      Env      μ      σ     Con
 0    0.26   0.09   0.05      10     0.19   0.07   0.05
      0.27   0.05   0.03             0.30   0.04   0.03
      0.23   0.08   0.05             0.29   0.09   0.05
      0.07   0.05   0.03      20     0.24   0.12   0.00
      0.17   0.06   0.04      30     0.17   0.11   0.07
 5    0.33   0.12   0.07             0.13   0.07   0.04
      0.19   0.06   0.03      40     0.00   0.00   0.00
      0.10   0.09   0.06      50     0.00   0.00   0.00
      0.18   0.04   0.03

Table 4. Surviving affective agents in an n-environment when started with 5 R-agents, which
can have randomly initialised A-agents as part of their offspring with a probability of 0.25.
No R-agent survived a single run.

            A-agents               Foodweight
Env      μ      σ     Con       μ      σ     Con
 0     17.90   3.60   2.23    0.18   0.10   0.06
 5     14.90   1.91   1.19    0.19   0.11   0.07
10     15.10   3.03   1.81    0.19   0.11   0.07
20     11.53   8.39   5.20    0.18   0.09   0.05
30      3.80   5.45   3.38    0.17   0.09   0.06
40      4.10   5.57   3.45    0.21   0.10   0.06
50      1.00   2.31   1.43    0.21   0.09   0.06

of the state can be evolved, the reason being that agents with initial random weights
are very likely to be inefficient in navigating through the environment, if able at all.
In such cases it is helpful if food is not obstructed by too many lethal entities. Table 3
shows for each environment mean, standard deviation and confidence interval (again
for α = 0.05) for weights for all those runs on which affective agents survived. The
above experiment also works for different mutation rates as well as different values
of τ. Note that while in 5 out of 10 runs the "affective use" of the state was evolved
in 0-environments, only in 2 out of 10 runs was the use evolved in 30-environments.
The positive value of the foodweight indicates that the hunger state deserves its
name. Yet, the magnitudes of the weight seem small given the procreation age of
250 and the increment/decrement factor τ = 0.05. On closer inspection, however, it
turns out that evolution was quite fast: assuming that there are only about 40
generations of agents in each run, and given that the probability of a positive increase
of the weight by τ is 1/6, then starting at a slightly positive hunger weight, say, the
maximum we should expect is about 40 × (1/6) × 0.05 ≈ 1/3.
We have not dealt with issues of genetic coding and how genetic codes relate
to the "added machinery" in the cognitive architecture of affective agents. Rather,
we assume that adding a realizer of such a state (e.g., a neuron) is an evolutionarily
feasible operation (e.g., which could result from some sort of duplication operation
of segments of genetic information14) and that mutation on genetically coded weight
information can lead to an increase or decrease of weight values.
We have, however, considered an evolutionarily more plausible variant of the
experiment. Starting with R-agents, let some of their offspring have additional archi-
tectural capacities with a certain probability (in our case, the capacities of A-agents).
The probability, with which R-agents have such randomly initialised A-agents as off-
spring is 0.25 (the results are also valid for much lower rates such as 0.05). It turns
out that environments with only R-agents in the beginning will eventually be also
populated by A-agents (most of the time exclusively, see Table 4).
It is worth mentioning that the results of this section also hold for extended
simulations, where agents need a second resource (e.g., water) for survival. Multiple
affective control states (e.g., "hunger" and "thirst") are even more beneficial when
agents have multiple needs, which can be seen from the fact that R-agents can hardly
survive on their own in such a setting (to "always go for the nearest resource" is
simply not a good strategy, e.g., see 11). They even lose against A-agents if fitness is
determined without procreation (see the end of the last section).

7 Discussion

The above experiments help us understand some of the conditions in which affec-
tive states like hunger have survival value, and indicate that in certain competitive
environments, if there is an option to develop new architectural resources that im-
plement such affective states, then these resources will likely evolve. Especially the
last result is not obvious, for a reason that makes the question why higher species
with more complex and sophisticated control architectures evolved in the first place
so fascinating: every species along an evolutionary trajectory has to have a viable
control architecture, which allows its individuals to survive and procreate, otherwise
it will die out. This is a very severe constraint imposed on trajectories in design and
niche space, which we are only slowly beginning to understand.
Our investigations are, of course, just a start. Many more experiments using
different kinds of affective states are needed to explore the space of possible uses
of affective states and the space of possible affective states itself. We have begun to
explore a slightly different neighbourhood in design space by allowing some agents
to have deliberative capabilities, and comparing them with A-agents. In a surprising
variety of environments the deliberative agents do less well, though a great deal more
investigation is needed. Further work on the capacities of affective states as control
mechanisms and the likelihood of their evolution in certain environments should
thus also help to explain why evolutionary developments that increase intelligence
by adding a deliberative layer were favoured by so few species!

Acknowledgments

The work was conducted while the first author was on leave at the School of Com-
puter Science at the University of Birmingham and funded by the Leverhulme Trust.

References

1. K. Oatley and J.M. Jenkins. Understanding Emotions (Blackwell, Oxford, 1996).
2. R. Picard. Affective Computing (MIT Press, Cambridge, Mass, 1997).
3. G. Hatano, N. Okada, and H. Tanabe, eds. Affective Minds (Elsevier, Amster-
dam, 2000).
4. A. Sloman, In Cognitive Processing, 1 (2001).
5. A. Sloman. In Philosophy and the Cognitive Sciences, eds. C. Hookway and
D. Peterson (Cambridge University Press, 1993).
6. A. Sloman. In Forms of representation: an interdisciplinary theme for cognitive
science, ed. D.M.Peterson (Intellect Books, Exeter, U.K., 1996).
7. M. Scheutz and B. Logan. In Proceedings of AISB'01 (AISB Press, 2001).
8. P. Maes. In From Animals to Animats: Proceedings of the First International
Conference on Simulation of Adaptive Behavior, eds. J.A. Meyer and S.W.
Wilson (MIT Press, Cambridge, MA, 1991).
9. T. Tyrrell. Computational Mechanisms for Action Selection (Ph.D. Thesis, Uni-
versity of Edinburgh, 1993).
10. D. Canamero. In Proceedings of the First International Symposium on Au-
tonomous Agents (AA'97, Marina del Rey, CA, The ACM Press, 1997).
11. E. Spier From reactive behaviour to adaptive behaviour (Ph.D. Thesis, Univer-
sity of Oxford, 1997).
12. A. Seth. On the Relations between Behaviour, Mechanism, and Environment:
Explorations in Artificial Evolution (Ph.D. Thesis, University of Sussex, 2000).
13. A. Sloman. In Parallel Problem Solving from Nature - PPSN VI, LNCS 1917,
eds. M. Schoenauer, et al. (Berlin, Springer-Verlag, 2000).
14. J. Maynard-Smith and E. Szathmary. The Origins of Life: From the Birth of
Life to the Origin of Language (Oxford University Press, Oxford, 1999).
META-LEARNING PROCESSES IN MULTI-AGENT SYSTEMS

RON SUN
CECS, University of Missouri, Columbia, MO 65211, USA
E-mail: rsun@cecs.missouri.edu

Straightforward reinforcement learning for multi-agent co-learning settings often
results in poor outcomes. Meta-learning processes beyond straightforward rein-
forcement learning may be necessary to achieve good (or optimal) outcomes. Al-
gorithmic processes of meta-learning, or "manipulation", will be described, which is
a cognitively realistic and effective means for learning cooperation. We will discuss
various "manipulation" routines that address the issue of improving multi-agent
co-learning. We hope to develop better adaptive means of multi-agent cooperation,
without requiring a priori knowledge, and advance multi-agent co-learning beyond
existing theories and techniques.

1 Introduction

It is common that a group of agents deal with a situation jointly, with each
agent having its own goal and performing a sequence of actions to maximize
its own payoffs. However, different sequences of actions (by different agents)
interact in determining the final outcome for each agent involved. In such a
situation, each agent has to learn to adapt to other agents that are present and
"negotiate" an equilibrium state that is beneficial to itself (e.g. Kahan and
Rapoport 1984). Our focus will be on nonverbal "communications" in which
(sequences of) actions by agents may serve the purpose of communicating
intentions and establishing cooperation, in an incremental and gradual way.
The ultimate goal is to avoid mutually harmful outcomes and to distribute
payoffs to individual agents in a rational way.
We need some framework that determines a proper strategy for each agent
in order to deal with the presence of other agents. The framework should al-
low adaptive determination of strategies on the fly during interaction. The
framework should allow learning from scratch without a priori domain knowl-
edge.
Game theory (e.g., von Neumann and Morgenstern 1944) has been focus-
ing on static equilibria of strategies in a variety of game settings (Osborne
and Rubinstein 1994), which furthermore unrealistically assumes unbounded
rationality on the part of agents (Simon 1957). The recent surge of study of
game learning (e.g., Fudenberg and Levine 1998, Camerer and Ho 1999) brings
adaptive processes of reaching equilibria into focus. To study the dynam-
ics of reaching equilibria, learning algorithms need to be developed. However,
learning algorithms studied in much existing work on game theoretic learning
have been overly simple and, thus, to a large extent, unrealistic.
Beyond static equilibria and simple learning, complex algorithmic pro-
cesses involved in learning by cognitive agents (Sun et al 2001) need to be
studied. By complex algorithmic processes, I mean procedures that include
detailed, varied, and subtle steps of manipulation of information, strategies, or
equilibria (see Sun and Qi 2000, Sonsino 1997 for preliminary versions of such
processes). I hope that, by incorporating such complex algorithmic processes,
we can extend game theoretic studies to more realistic settings of multi-agent
interaction.
What I emphasize in this work is not just end results (i.e., equilibria;
Osborne and Rubinstein 1994), not just simple processes involving simple op-
erators (Fudenberg and Levine 1998), but complex algorithmic operations and
processes. This is because what an agent may do is not limited to completely
rational choices as assumed by many game theorists, but also some appar-
ently irrational behaviors which may nevertheless lead to desirable outcomes
in the future. We are interested in learning and action selection that are
more opponent-oriented and more determined on the fly, than many existing
processes. This kind of algorithmic process helps to improve cooperation.

2 Background

2.1 Game Theory

Game theory studies decision making involving multiple agents (Osborne and
Rubinstein 1994). A strategic game is one in which all agents choose their
actions simultaneously and once and for all. In contrast, in an extensive game,
agents perform actions sequentially. Formally, an extensive game is a 4-tuple:
(N, H, P, U), where N is a set of agents, H is a set of histories (see Osborne
and Rubinstein 1994 for further specifications), P is the player function such
that P(h) specifies the player to move after history h ∈ H, and U is the payoff function
that maps each terminal history to a real value.
For simplicity, in the following discussions, we assume the length of games
is finite, that is, each game always terminates in a finite number of steps. Each
agent has perfect information.
Given these assumptions, we will look into extending current game theory,
incorporating more complex algorithmic processes that capture more realistic
cognitive and social processes during game learning.
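The 4-tuple (N, H, P, U) of a finite extensive game can be represented compactly as a game tree; the following encoding (players as indices, leaves carrying the payoffs U, histories as root-to-node paths) is our own illustrative sketch, not a notation from the paper:

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

@dataclass
class Leaf:
    payoffs: Tuple[int, ...]    # U: one payoff per agent at a terminal history

@dataclass
class Node:
    player: int                                  # P(h): agent to move after history h
    children: Dict[str, Union[Leaf, "Node"]]     # actions available at this history

# A small two-stage game: agent 0 moves first, then agent 1.
game = Node(player=0, children={
    'l': Node(player=1, children={'l': Leaf((2, 5)), 'r': Leaf((1, 1))}),
    'r': Node(player=1, children={'l': Leaf((1, 1)), 'r': Leaf((3, 3))}),
})
```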

2.2 Reinforcement Learning


Reinforcement learning is a general learning framework suitable for learning
extensive games. In a single-agent learning setting, there is a discrete-time
system, the state transitions of which depend on actions performed by an
agent. A Markovian process determines state transition after an action is
performed. Costs (or rewards) can occur for certain states and/or actions.
Normally the costs/rewards accumulate additively, with or without a discount
factor γ ∈ (0, 1].
One algorithm for learning optimal policies is the Q-learning algorithm of
Watkins (1989):

Q(s_t, a_t) := (1 − α)Q(s_t, a_t) + α(r(s_{t+1}) + γ max_{a_{t+1} ∈ A} Q(s_{t+1}, a_{t+1}))

where α is the learning rate, which goes toward zero gradually. Action a_t is
determined by an exploration action policy, e.g., using (1) alternating explo-
ration (random actions) and exploitation (greedy actions) periods, (2) a small
fraction of random actions (with probability ε, a random action is chosen; with
probability 1 − ε, a greedy action is chosen), or (3) stochastic action selection
with the Boltzmann distribution. Such an algorithm allows completely
autonomous learning from scratch, without a priori domain knowledge.
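The Watkins update together with an ε-greedy exploration policy (option (2) above) can be sketched as follows; the class structure and parameter values are illustrative, not prescribed by the paper:

```python
import random
from collections import defaultdict

class QLearner:
    """Tabular Q-learning with epsilon-greedy exploration (a generic
    sketch of the Watkins update quoted above)."""
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.Q = defaultdict(float)   # Q(s, a), initialised to 0
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, s, rng=random):
        if rng.random() < self.epsilon:                 # explore: random action
            return rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.Q[(s, a)])  # exploit: greedy

    def update(self, s, a, reward, s_next):
        # Q(s,a) := (1-alpha)Q(s,a) + alpha(r + gamma * max_a' Q(s',a'))
        best_next = max(self.Q[(s_next, b)] for b in self.actions)
        self.Q[(s, a)] = ((1 - self.alpha) * self.Q[(s, a)]
                          + self.alpha * (reward + self.gamma * best_next))
```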
Extending Q-learning to co-learning in extensive games, we may simply
use the above single-agent Q-learning equation, or we may use multi-agent
Q-learning equations (Littman 2001).
We assume that each state (used in Q-learning), or information set (as
termed by game theorists), is comprised of all the actions up to the current
point and, optionally, information about the initial situation at the time when
the game begins. State transitions are deterministic.
We assume that there is sufficient exploration during reinforcement learn-
ing (which is a standard requirement for ensuring convergence of RL), so that
each agent knows the payoff outcomes of all the paths on the game tree. But,
eventually, each agent converges to a deterministic action policy, i.e., a pure
strategy in game theoretic terms.

3 Types of Meta-Learning

In practice, performance of Q-learning in extensive games tends to be very
poor (Shoham and Tennenholtz 1994, Sandholm and Crites 1995, Haynes and

Figure 1. Three cases of the left/right game. The numbers in circles indicate agents; l and
r are the two possible actions. The pairs of numbers in parentheses indicate payoffs (the
first number is the payoff for agent 1 and the second for agent 2). Leaf payoffs, left to right:
(1) (2,5), (1,1), (1,1), (3,3); (2) (3,5), (1,1), (1,1), (3,3); (3) (2,5), (4,4), (1,1), (3,3).

Sen 1996, Sun and Qi 2000), despite the fact that Q-learning is cognitively jus-
tifiable (Sun et al 2001). The problem may lie in the fact that other cognitive
mechanisms may also be needed, on top of such trial-and-error learning, in
order to achieve good performance (Sun et al 2001). In this paper, I shall ex-
plore additional adaptive mechanisms (i.e., meta-learning routines) within an
RL framework, to facilitate the attainment of optimal or near-optimal results.

3.1 Manipulation by Preemptive Actions


An agent can manipulate other agents by adopting suboptimal actions (likely
temporarily) in order to force or induce opponents to take actions that are
more desirable to the agent but result in lower payoff (and are thus less desir-
able) to themselves. For example, in the game of Figure 1 (1), with reinforce-
ment learning, agent 1 will end up rationally choosing action r, which will
lead agent 2 to choose its best action r, and hence a payoff of (3,3) for them.
However, agent 2 prefers the outcome of (2,5). Therefore, it may deliberately
choose l after agent 1 chose r, leading to a payoff of (1,1), which forces agent 1
to change its action. With further reinforcement learning, agent 1 may adapt
to this change and choose l instead (because this action can lead to the best
possible outcome, if agent 2 rationally chooses l afterwards, given its manip-
ulative action of l after agent 1's r). This change gives agent 2 its preferred
outcome of (2,5).
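The backward-induction reasoning in this example can be checked mechanically. The following sketch (our own illustration; game trees are encoded as nested dicts with payoff-tuple leaves) computes the subgame perfect equilibrium of Figure 1 (1) and shows how agent 2's committed action changes agent 1's best response:

```python
def spe(node, player=0):
    """Subgame perfect equilibrium by backward induction. Leaves are
    payoff tuples; internal nodes are dicts mapping actions to subtrees;
    the two agents alternate moves, starting with `player`."""
    if isinstance(node, tuple):
        return node, []
    best = None
    for action, subtree in node.items():
        payoff, path = spe(subtree, 1 - player)
        if best is None or payoff[player] > best[0][player]:
            best = (payoff, [action] + path)
    return best

# Figure 1 (1): agent 1 (index 0) moves first, then agent 2 (index 1).
game = {'l': {'l': (2, 5), 'r': (1, 1)},
        'r': {'l': (1, 1), 'r': (3, 3)}}
assert spe(game) == ((3, 3), ['r', 'r'])         # equilibrium: both choose r

# Agent 2 commits to l after agent 1's r (its other reply is "prohibited"):
manipulated = {'l': {'l': (2, 5), 'r': (1, 1)},
               'r': {'l': (1, 1)}}
assert spe(manipulated) == ((2, 5), ['l', 'l'])  # agent 1 now prefers l
```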
We assume that there are only two agents in an extensive game, which
take turns in acting. Assume that after sufficient learning using Q-learning, a
subgame perfect equilibrium (Osborne and Rubinstein 1994) is reached. Suffi-
cient exploration is done during reinforcement learning and, thus, each agent
has fairly accurate knowledge of the payoffs of different paths on the entire
game tree.
After reinforcement learning settles into a particular payoff outcome, as-
sume there is an alternative outcome (not necessarily a Nash equilibrium) that
is more desirable to an agent but less desirable to its opponent. Suppose this
alternative outcome can be reached if the opponent takes a different action
at a certain point (but follows the current optimal policy as determined by
the reached subgame perfect equilibrium thereafter). This point (the targeted
switch point) can be determined by a search of the game tree:

Algorithm 1.
1. Search from the root of the tree along the current (equilibrium) path
using depth-first search. At each point of action by the opponent, do the
following:
1.1. Adopt an alternative action.
1.2. Follow the current, optimal policy (the subgame perfect equilibrium
strategy) of each agent thereafter.
1.3. If one of these alternative actions leads to a more desirable outcome
for the agent, add the whole path to the candidate path set.
2. Choose from the candidate path set the most desirable path. Start
the manipulation process at the point of the alternative action by the
opponent in the chosen path.
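A sketch of Algorithm 1, under our own tree encoding (nested dicts mapping actions to subtrees, with payoff-tuple leaves; `spe` computes subgame perfect play by backward induction); names and structure are illustrative assumptions, not the paper's:

```python
def spe(node, player):
    """Subgame perfect equilibrium by backward induction (leaves are
    payoff tuples, internal nodes map actions to subtrees, agents alternate)."""
    if isinstance(node, tuple):
        return node, []
    best = None
    for action, subtree in node.items():
        payoff, path = spe(subtree, 1 - player)
        if best is None or payoff[player] > best[0][player]:
            best = (payoff, [action] + path)
    return best

def candidate_switch_points(game, me, spe_fn):
    """Walk the equilibrium path; at each point of action by the opponent,
    try each alternative action, follow equilibrium play thereafter, and
    keep deviations that improve agent `me`'s payoff. Returns the most
    desirable candidate (payoff, path), or None."""
    eq_payoff, eq_path = spe_fn(game, 0)
    candidates = []
    node, player, prefix = game, 0, []
    for eq_action in eq_path:
        if player != me:                        # the opponent moves here
            for alt, subtree in node.items():
                if alt == eq_action:
                    continue
                payoff, tail = ((subtree, []) if isinstance(subtree, tuple)
                                else spe_fn(subtree, 1 - player))
                if payoff[me] > eq_payoff[me]:
                    candidates.append((payoff, prefix + [alt] + tail))
        prefix, node, player = prefix + [eq_action], node[eq_action], 1 - player
    return max(candidates, key=lambda c: c[0][me]) if candidates else None

# Figure 1 (1): agent 2 (index 1) finds the deviation it wants agent 1 to make.
game = {'l': {'l': (2, 5), 'r': (1, 1)},
        'r': {'l': (1, 1), 'r': (3, 3)}}
assert candidate_switch_points(game, 1, spe) == ((2, 5), ['l', 'l'])
```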

Here is what the agent can do to change the action by the opponent at
the point (the manipulation):

Algorithm 2.
Search the subtree that starts at the action (by the opponent) that the
agent aims to change (using depth-first search):
1. If there is an alternative action by the agent at any point along the
current path in the subtree, that creates a path (1) that leads to a payoff
for the opponent that is lower than the payoff of the most desired path, and
(2) on which all other actions (by either agent) conform to the optimal
policies (determined by the equilibrium), then, commit to that action (i.e.,
perform that action whenever that point of the game tree is reached).
2. If there are multiple such actions by the agent, choose the one highest
in the tree (that is, the closest to the current action by the opponent to
be changed).

The algorithm is based on the following result:

Theorem 1 For the subgame described by the part of the game tree below
the point of the committed (manipulating) action (of the manipulating agent),
the original subgame perfect equilibrium strategies for both agents remain the
subgame perfect equilibrium strategies.
Thus, the acquired policies below the changed action need not be changed
and re-learned. Similarly,
Theorem 2 For the subgame described by the part of the game tree below the
point of an alternative action by the opponent, the original subgame perfect
equilibrium strategies for both agents remain the subgame perfect equilibrium
strategies.
An obvious shortcoming of Algorithm 1 is the cost of the exhaustive search
used. An alternative is to search and find only one desirable path for the
agent, with a straightforward modification of Algorithm 1, and then to force
the opponent to go down that path using Algorithm 2. We may similarly
eliminate exhaustive search in Algorithm 2.
In either case, the hope is that the opponent will opt for an alternative ac-
tion at the targeted switch point that leads to a better outcome for the agent
(as a result of further trials and further learning by the opponent during those
trials). However, there may be multiple action choices for the opponent at this
point or another (before the committed action of the agent). The opponent
may opt for an action that is not the desired action. To force the opponent
to take the desired action, the agent needs to close off all loopholes (all "dis-
tractor" paths). That is, the above algorithm can be repeatedly applied, if
the desired outcome is not reached due to the opponent taking an unintended
action at a point above the committed action by the agent. This process can
continue until all the other alternatives are "eliminated" except the desired
path (or, when an outcome that is equivalent to, or better than, the desired
outcome is reached).a
As a result of further trials during which further reinforcement learning
occurs, the opponent may adapt to the manipulation and take the target
action intended for it by the agent. Thus the game settles into a new state that
is a subgame perfect equilibrium state given the manipulation (i.e., with the
original action by the agent at the point of manipulation being "prohibited"
or removed).
However, the opponent may counter-react to the manipulation. First of
all, counter-reaction may take the form of obstinacy: The opponent can refuse
to change any action despite the worsened outcome as a result of the manipu-
lation and despite the existence of alternative actions that can lead to better
outcomes (although they may not be as good as the original outcome).

a Alternatively, we may at once lower the payoffs of all the alternative actions for the
opponent, if they are higher than that of the desired outcome for the opponent (see Sun et
al 2001).

Second, counter-reaction may also take the form of counter-manipulation using
the same algorithm described above. The opponent can, e.g., eliminate the
outcome that is the most desirable for the original agent (and thus was the
goal of the original manipulation). These issues are dealt with elsewhere and
not repeated here due to the page limit.

3.2 Manipulation by Nudging Actions


This is the case of an agent adopting some suboptimal actions in order to direct
its opponent to take actions that are equally, or more, desirable to each agent
involved. As a result of the manipulation, everyone receives a payoff that is
equal to, or higher than, the payoff each would receive otherwise (without the
manipulation). This type of manipulation is obviously easier to accomplish,
and does not call for counter-reaction from opponents.
If there is a point (the targeted switch point) along the equilibrium path
in the game tree (as determined by the reached subgame perfect equilibrium)
where an alternative action by the opponent may lead to better payoffs to the
agent and no worse payoffs to its opponent, then the agent can take a non-
optimal action at a point below the afore-identified targeted switch point to
force a worse payoff on the opponent if it follows the old path. The algorithms
for reaching the desired outcome, including selecting the switch point and
forcing the switch, have been discussed earlier and remain the same.
For example, in the game of Figure 1 (2), with reinforcement learning,
agent 1 may choose action r, which leads agent 2 to choose action r, and hence
a payoff of (3,3) for them. However, agent 2 prefers the outcome of (3,5).
Therefore, it decides to take l after agent 1 took r, leading to a payoff of (1,1)
for them, in order to nudge agent 1 to change its action as well. With further
reinforcement learning, agent 1 adapts to this change and chooses l instead,
which leads to the outcome of (3,5) — a better outcome for agent 2. This is
a special case of the previously discussed scenario where there is no need for
the opponent to consider counter-reaction.

3.3 Manipulation by (Mutual) Compromise


An agent can adopt some suboptimal actions, in order to induce its opponent
to take actions that are suboptimal too, which together, however, can lead to
outcomes more desirable to both agents involved. This case can be viewed as
reaching a mutually beneficial compromise in order to maximize the payoffs
of all those involved.
For example, in the game of Figure 1 (3), with reinforcement learning,
agent 1 will end up choosing action r, which leads agent 2 to choose action r,

and hence a payoff of (3,3) for them. However, agent 2 prefers the outcomes of
(2,5) or (4,4). It cannot easily induce agent 1 to an outcome of (2,5), because
that gives agent 1 a worse payoff. But it can induce agent 1 to an outcome of
(4,4). Therefore, it consistently takes r if agent 1 takes l, which gives agent 1
an incentive to take l instead of r (because it leads to a better payoff for agent
1). With further reinforcement learning, agent 1 settles on action l, which
leads to the outcome of (4,4) — a compromise between the two agents.b
As a result of the manipulation, everyone receives a payoff that is higher
than the payoff each would receive otherwise (without the manipulation).
However, as with the previous cases of manipulation, the resulting outcome
is not a Nash equilibrium, and it is stable only under the reached compromise
(i.e., given the committed action choice).

Algorithm 3.
1. Search from the root of the tree along the current (the subgame perfect
equilibrium) path. At each point of action by the opponent, and at each
point of action by the agent itself following that, try a pair of alternative
actions. That is, repeat the following (using depth-first search):
1.1. Adopt an alternative action at a point of action by the opponent.
1.2. Follow thereafter the current policy (the subgame perfect equilibrium
strategy) of each agent, except the following change.
1.3. At a point of action by the agent itself, try an alternative action.
1.4. If this pair of alternative actions leads to more desirable outcomes for
both agents, store the pair as a candidate pair.

Now, the agent commits to his part of this compromise (a chosen pair of
alternative actions):

Algorithm 4.
If there is at least one candidate pair (that is, if at least one of these pairs
of alternative actions led to more desirable outcomes for both agents),
start the manipulation process:
1. Find the best such pair (based on a criterion such as the maximum
increase of payoffs for the manipulating agent, or the highest total
increase of payoffs for both agents).
2. Commit to the action (of the agent) from the chosen pair of actions.
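The two algorithms can be made concrete with a short sketch. The tree below is a hypothetical depth-2 game shaped like Figure 1 (3), with the opponent (agent 1) moving first and the manipulating agent (agent 2) second; the dictionary representation, payoff values, and function names are our own illustrative assumptions, not the paper's implementation.

```python
# Hypothetical depth-2 game tree in the shape of Figure 1 (3): leaves hold
# the payoff pair (payoff to agent 1, payoff to agent 2).
tree = {
    'l': {'l': (2, 5), 'r': (4, 4)},
    'r': {'l': (1, 1), 'r': (3, 3)},
}
equilibrium = {'opponent': 'r', 'self': 'r'}  # current SPE path -> (3, 3)

def find_candidate_pairs(tree, eq):
    """Algorithm 3: try pairs of alternative actions (one at the opponent's
    node, one at the agent's own node below it) and keep those that improve
    the outcome for BOTH agents relative to the equilibrium payoff."""
    base = tree[eq['opponent']][eq['self']]
    candidates = []
    for opp_alt, subtree in tree.items():         # opponent's alternative
        for self_alt, payoff in subtree.items():  # own alternative below it
            if (opp_alt, self_alt) == (eq['opponent'], eq['self']):
                continue
            if payoff[0] > base[0] and payoff[1] > base[1]:
                candidates.append(((opp_alt, self_alt), payoff))
    return candidates

def commit(candidates):
    """Algorithm 4: pick the best pair (here: highest total payoff increase)
    and commit to the agent's own part of the chosen pair."""
    if not candidates:
        return None
    pair, _ = max(candidates, key=lambda c: c[1][0] + c[1][1])
    return pair[1]  # the manipulating agent's committed action

cands = find_candidate_pairs(tree, equilibrium)
print(cands)          # [(('l', 'r'), (4, 4))]
print(commit(cands))  # r
```

Here "more desirable outcomes for both agents" is read as strictly better for both, so the only candidate pair in this tree is (l, r) with payoff (4,4), and the agent commits to r.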

Without explicit communication, the agent has to wait for the opponent


to discover this commitment through exploration during further reinforcement
learning. Most likely, the opponent will discover the advantage of taking the

b Note that, in this game, it is also possible for agent 2 to take preemptive actions to force
an outcome of (2,5), as in Section 3.1.

corresponding action determined by this compromise, and thereafter both


agents will be able to reap the benefit.
We show below that the desired action change of the opponent as deter-
mined by the compromise will lead to the highest possible payoffs for both
agents, given the manipulation, and thus there is sufficient incentive for the
opponent to take that action determined by the compromise:
Theorem 3 (1) The targeted action change of the opponent will lead to the
highest possible payoffs for both agents, given the manipulation. (2) The op-
timal policies of both agents, either below the agent's manipulating action,
above the targeted alternative action by the opponent, or in between the two
points, will not be changed due to the manipulation.
Note that, compared with earlier types of manipulations, here the agent
chooses to induce rather than to force its opponent to take the action it wants
it to take. This mutual compromise process can be extended to more than
two steps.

4 Concluding Remarks

This paper considers algorithmic meta-learning processes of joint sequential


decision making that are both cognitively realistic and practically effective.
In essence, we incorporate more cognitively realistic learning processes by
combining simple decision making studied in game theory with complex algo-
rithmic processes (Sun et al 2001). Armed with extended senses of rationality,
we are aiming at an algorithmic account of multi-agent learning of cooperation
(that starts from scratch without a priori domain knowledge). Of course, it
is important that we extend our basic assumptions to deal with more general
cases, and we are currently working on such extensions.

References

1. C. Camerer and T. Ho, (1999). Experience-weighted attraction learning in


normal-form games. Econometrica, 67, 827-874.
2. C. Claus and C. Boutilier, (1998). The dynamics of reinforcement learning
in cooperative multiagent systems. Proceedings of AAAI'98. AAAI Press,
San Mateo, CA.
3. D. Fudenberg and D. Levine, (1998). The Theory of Learning in Games.
MIT Press, Cambridge, MA.

4. T. Haynes and S. Sen, (1996). Co-adaptation in a team. International


Journal of Computational Intelligence and Organizations.
5. J. Kahan and A. Rapoport, (1984). Theories of Coalition Formation.
Lawrence Erlbaum Associates, London.
6. M. Littman, (2001). Value-function reinforcement learning in Markov
games. Special issue on multi-agent learning (edited by R. Sun), Cognitive
Systems Research, Vol. 2, No. 1, 2001.
7. J. Nash, (1950). Equilibrium points in N-person games. Proceedings of
National Academy of Science, vol.36, 48-49.
8. M. Osborne and A. Rubinstein, (1994). A Course in Game Theory. MIT
Press, Cambridge, MA.
9. T. Sandholm and R. Crites, (1995). Multiagent reinforcement learning in
the iterated prisoner's dilemma. Biosystems, 37, 147-166.
10. S. Sen and M. Sekaran, (1998). Individual learning of coordination knowl-
edge. Journal of Experimental and Theoretical Artificial Intelligence, 10, 333-
356.
11. Y. Shoham and M. Tennenholtz, (1994). Co-learning and the evolution of
social activity. Technical Report STAN-CS-TR-94-1511, Stanford University.
12. H. Simon, (1957). Administrative Behavior (2nd ed.). New York: Macmillan.
13. D. Sonsino, (1997). Learning to learn, pattern recognition, and Nash
equilibrium. Games and Economic Behavior, 18, 2, 286-331.
14. V. Soo, (2000). Agent negotiation under uncertainty and risk. Design
and Applications of Intelligent Agents, pp.31-45. Springer-Verlag, Heidelberg,
Germany.
15. R. Sun, E. Merrill, and T. Peterson, (2001). From implicit skills to explicit
knowledge: a bottom-up model of skill learning. Cognitive Science.
16. R. Sun and D. Qi, (2000). Rationality assumptions and optimality of
co-learning. Design and Applications of Intelligent Agents. Lecture Notes in
Artificial Intelligence, Volume 1881. pp.61-75. Springer-Verlag, Heidelberg,
Germany.
17. J. von Neumann and O. Morgenstern, (1944). Theory of Games and
Economic Behavior. John Wiley and Sons, New York.
18. C. Watkins, (1989). Learning from Delayed Rewards. Ph.D. Thesis,
Cambridge University, Cambridge, UK.
SCALABILITY AND THE EVOLUTION OF NORMATIVE
BEHAVIOR

JÖRG WELLNER†, SIGMAR PAPENDICK‡, AND WERNER DILGER†
†Chemnitz University of Technology, Computer Science
D-09107 Chemnitz, Germany
{jwe, wdi}@informatik.tu-chemnitz.de

‡University of Konstanz, Department of Sociology


D-78464 Konstanz, Germany
Sigmar.Papendick@uni-konstanz.de

We present an evolutionary approach for developing an agent system consisting


of a large and varying number of agents. We start by briefly describing a
sociological approach to problems of coordination in societies with many members.
Elaborate cognitive explanations of handling individual information are rejected,
and the concept of symbolically generalized communication media is suggested
instead. In a first attempt we have modeled an agent system based on this concept.
Simulation results show that agents may coordinate their actions even though they
have no individual representations of each other. The simulation starts with a small
group of agents and evolves a system of several hundred agents which base their
actions mainly on exchanged messages.

1 Introduction

Many coordination approaches for agent systems rely on mechanisms which


include detailed knowledge of an agent architecture. An agent essentially needs
knowledge about another agent in order to cooperate with it. Agents
personalize this knowledge in the sense that they know the goals, skills, or beliefs1
of different opponents. To simplify the usual situation, one can state that the
more potential partners for an interaction an agent has, the more agent-specific
knowledge it has to cope with. This is one reason why current logic-based
agent approaches scale so badly: keeping track of information about
other agents is an expensive matter. In the next section we consider in more
detail a concept developed by sociologists to answer questions concerning the
coordination of individuals in a society with a huge number of members, namely
symbolically generalized communication media. A first approach to modeling
the proposed mechanisms for one medium is presented in Section 3, focusing
on the ability of the agents to acquire a shared symbol system. Simulation
results are discussed in Section 4. Section 5 concludes the paper, indicating
that one can reasonably base multi-agent systems on the proposed sociological
concepts in order to achieve good scaling.


2 The concept of symbolically generalized communication


media

Humans faced a scaling problem during the development from small groups
to modern societies. In small groups it is possible for each individual to keep
in mind relevant facts about other members of the group. Different strategies
were developed to keep one's knowledge about each other up-to-date, e. g.
gossip2. As groups became larger, personalized coordination mechanisms be-
came less efficient, due to the necessary increase of cognitive capabilities which
are - however - constrained.
In these situations, generalized media simplify communication and the
representation of situations. The concept of generalized media has been in-
troduced into sociological theory by Talcott Parsons3, who used the term "gen-
eralized media of interchange". In the context of constructivistic sociological
systems theory, the concept has been adopted as "symbolically generalized
communication media" (SGCM) by Niklas Luhmann4. They offer a mecha-
nism to allow coordinated behavior among individuals that have few or no
representations of each other's individual goals, beliefs, intentions or re-
strictions - which used to be regarded as indispensable for behavioral selec-
tions in most of the dominating microsociological models. SGCM simplify
the predictability of behavior because they offer a universalistic mechanism of
generating strong motivations as a prerequisite for further cooperation. They
symbolize the expectability of getting rewarded by others in situations of re-
quested cooperation. A typical example of such a symbolic representation
is money: Its possession symbolizes the expectability of having the option
to instrumentalize cooperational behavior of others, for instance in case of
spending money and getting goods or services, regardless of the time or situ-
ation this option is needed. If money is transferred, this option is transferred
also and has to be represented and evaluated only as an option which can
be used by the recipient. It can be coded, communicated and represented
by a simple binary distinction of having or not having money. Therefore, it
works as a generalizable and reliable communication mechanism for initiating
coordinated behavior without ponderous and cognitively complex procedures of
making others adopt one's own goals in order to cooperate. Thus, the use of
symbolically generalized media is an efficient way to reduce social complexity
by symbolizing expectability.
Another important example of a SGCM is the symbolization of power,
which is the main objective of our model. Like money, power is used as
a mechanism to symbolize the instrumentability of coordinated behavior of
others. A typical example of symbolizing power is policemen dressed in uniforms.
By wearing a police uniform, the option to apply expectable superior


sanctions of criminal prosecution in case of normative deviance can be com-
municated very efficiently. It is also based on expectations, but - in contrast
to money - these are predictions of getting sanctioned if requested coopera-
tion is refused. As with money, power as a symbolizing mechanism can only be
established and preserved if its function can be proved on demand. If the
inability to apply sanctions backed by others becomes observable, the
auto-catalytic mechanism of reinforcing a symbol by referencing networks of
cooperation breaks down, in dynamics comparable to inflationary processes of
currency in economic markets.
Some concepts related to SGCM are already dealt with on a large scale
in multi-agent research, especially norms5,6,7 and market-based coordination
mechanisms8,9,10. All the mechanisms based on these concepts, first of all
norms, have the goal of enabling interactions between agents which do not
know much about each other, but do know something in general. Norms are
condensed expectation structures. A population-wide norm makes actions of
agents predictable.
It is obvious that coordination mechanisms based on SGCM may play an
important role in scaling huge agent systems. The main benefit of such media
is that they reduce the amount of knowledge that agents need to interact.
Interactions, controlled by a SGCM, are structured in a straightforward way:
They do not ensure that an interaction always succeeds, but they ensure that
agents know in advance to what aspect negotiation should be limited. Agents
need not know each other. Agents can be black boxes to each other, and
indeed they cannot look inside each other's heads. Coordination may still succeed, and
agents know in what stage an interaction currently is, and when it should be
stopped. Every agent is only concerned with its own beliefs or goals, there is
no need to take into account elaborated reasoning mechanisms about beliefs or
goals of other agents, since they become immediately apparent to each other
during an interaction to some extent. Whatever an agent wants or believes
will be conveyed by a medium to another agent. A medium does not reveal
an agent's goal or its beliefs, but it offers a way to achieve a goal or to verify
or to strengthen its own beliefs.
In the remainder of this paper we concentrate on the SGCM power as
proposed by Luhmann. We do not model predefined rules of power that allow
an agent to interact with another one. We rather focus on the evolution of a
shared symbol system and the meaningful use of a sanction mechanism that
represents power, both with respect to efficiently coordinated actions.

3 A first approach to the evolution of the power medium

A simulation consists of a large number of trials of a cooperation game which


we called the "Planter-and-Harvester-Game" for simplicity. We introduce
two different types of agents, with respect to their ability to change the en-
vironment. There are also two types of actions that change the state of the
environment in an effective way, namely "planting" and "harvesting", which
complement each other. Plant agents, called Planter, can perform only plant
actions effectively; harvest agents, called Harvester, can perform only harvest
actions effectively. At the beginning of a game the environment U is always in
state U = 0. A plant action Plant_I - performed by a Planter - transforms
the environment into state U = 1, a harvest action Harvest_I - performed
by a Harvester - transforms it into the final state U_e = 2. In more compli-
cated games the final state may be U_e > 2, assuming action sequences Plant_I,
Harvest_I, Plant_II, and so on. Action Plant_I in state U = 1 has no effect
with regard to the state of the environment, similarly action Harvest_I in state
U = 0. Furthermore, a Planter might successfully apply in state U = 2 only
action Plant_II, not actions Plant_I or Plant_III.
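For the simple case U_e = 2, the transition rule just described can be written as a small function; this Python sketch is our own illustration of the stated dynamics, not the authors' code.

```python
def step(state, agent_type, action):
    """Transition rule of the Planter-and-Harvester game with final state
    U_e = 2: Plant_I by a Planter moves U = 0 to U = 1, Harvest_I by a
    Harvester moves U = 1 to the final state U_e = 2; every other
    combination of type, action, and state leaves the environment unchanged."""
    if agent_type == 'Planter' and action == 'Plant_I' and state == 0:
        return 1
    if agent_type == 'Harvester' and action == 'Harvest_I' and state == 1:
        return 2
    return state

u = step(0, 'Planter', 'Plant_I')      # 0 -> 1
u = step(u, 'Harvester', 'Harvest_I')  # 1 -> 2 (final state reached)
print(u)                               # 2
print(step(0, 'Harvester', 'Harvest_I'))  # 0: Harvest_I has no effect in U = 0
```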
At the beginning of a game two agents are randomly selected from the
population; one of them is the start agent. This agent begins by sending a
message M_0. The other agent receives this message, performs an action
a_1, and sends another message M_1 to the start agent. Then, the first agent
performs an action a_2 and sends a message M_2 to the second agent, and so
on. A round is defined by a successive sequence of performing one action and
generating a message for each of the two agents. Both types of agents have
the same repertoire of actions regardless of the efficiency: apart from plant
and harvest actions they have a Null action without any effect, a Sanction
action, an action Exit, and an action Replace. The latter action affects the
opponent agent in the way that it gets replaced by another agent, randomly
selected from the population. This may increase the general possibility for
a successful coordination. A game may end with three different outcomes: an
agent performed the Exit action, the environment reached the final state U_e,
or the number of rounds in the game exceeded the defined threshold rounds.
There is a predefined set of symbols S = {0, 1, 2, ..., S_max} for message
generation. A message consists of exactly one of these symbols. A symbol
itself has no meaning to an agent; there is no predefined semantics at all.
A game ends successfully if the environment was transformed into the
final state Ue. In this case, the last two agents, participating in the game, get
a certain amount E* of "energy". In other cases there is no energy payoff.
Every action that an agent performs consumes a specified amount of energy

of the agent. There are low-cost actions (Null, Exit, and Replace) and high-
cost actions (Plant_X, Harvest_X). For a low-cost action the agent consumes
energy E_l > 0, for a high-cost action E_l + E_h, E_h > 0. The cost of the action
Sanction is E_l + E_b, E_b > 0. This action affects the other agent in the way
that the sanctioned agent loses pain energy E_p > 0. At the beginning of an
agent's life time its energy is set to E = E_s > 0, its start energy. If E ever falls
below 0, the agent dies, that is, the agent is removed from the population.
An agent does not know its own type nor perceives the type of another
agent. They are black-boxes to each other. An agent perceives the message of
another agent, the state of the environment, and the fact of being sanctioned.
In any case not all relevant aspects of the environment are known in the same
way to all the participants, for instance the direct result of an action. Agents
must test different actions at different times, and the only hint as to whether an
action or message was appropriate is given by a reward signal. This signal is
always generated by the agent itself, based on the energy difference between
two consecutive actions. A sigmoid function generates the reward signal r
from this energy difference: a positive energy difference results in a
positive reward, a negative difference results in a negative reward. Thus,
individual agents employ reinforcement learning. This definition of a reward
signal is a weak one, since it does not assume any intelligent observer (outside
the agent) who generates a reward signal based on its knowledge about correct
actions.
Besides an energy value, agents have an age A, which at the beginning
of an agent's life time is set to 0. Any time an agent gets selected to play
the game, its age is incremented by 1. If the age reaches an individual
maximum, A_max, the agent is removed immediately from the population.
At the start of the simulation, the population P consists of a certain number
of agents P_s. The number of agents during the simulation may shrink or grow,
depending on the fitness of the agents. An agent may enter the population if
there are at least two agents whose age is above the value A_sex and whose energy
value is above a value E_sex. The two "parents" are selected by a "roulette
wheel"11 from all possible parent agents based on their energy value. Once
a successful breeding has occurred, the two parent agents are prevented from
reproduction for a certain period of time t_pause. Whenever the number of
agents in the population P_t falls below P_s, agents are randomly added to the
population until P_t = P_s.
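Fitness-proportionate ("roulette wheel") selection over the agents' energy values can be sketched as follows; the dictionary-based agent records are an assumption for illustration.

```python
import random

def roulette_wheel(agents):
    """Pick one agent with probability proportional to its energy value,
    in the style of Goldberg's classic roulette-wheel selection."""
    total = sum(a['energy'] for a in agents)
    pick = random.uniform(0.0, total)
    cumulative = 0.0
    for agent in agents:
        cumulative += agent['energy']
        if pick <= cumulative:
            return agent
    return agents[-1]  # guard against floating-point rounding

parents = [{'name': 'a', 'energy': 10.0},
           {'name': 'b', 'energy': 30.0},
           {'name': 'c', 'energy': 60.0}]
# Over many draws, agent 'c' is selected roughly 60% of the time.
counts = {'a': 0, 'b': 0, 'c': 0}
for _ in range(10000):
    counts[roulette_wheel(parents)['name']] += 1
print(counts['c'] > counts['b'] > counts['a'])  # True with overwhelming probability
```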
We focused explicitly on one certain aspect of media, namely the relevance
of expectations in choosing an appropriate answer to a received message.
Thus, we combine an internal state with the expectation of a received message.
This results in a frame-like structure which will be executed on two levels. In

a first step a set F_i of frame structures is chosen based on the state of the
environment. This step is performed without any learning by the agent
and is totally determined by the environment. In a second step the agent
chooses one frame structure from the previously chosen set F_i. The selected
frame is executed, resulting in an action a_{t+1} and a new message M_{t+1}.
A frame F is defined with respect to a received message M_r = M_t in the
following way:
if M_r = M_e1 then a := act_1 and M := mes_1
elseif M_r = M_e2 then a := act_2 and M := mes_2
else execute a trouble frame in F_T,
where a_{t+1} = a and M_{t+1} = M. A "trouble frame" F_T is executed
in the case that the received message was neither M_e1 nor M_e2. This frame
has a special structure, because it does not check the occurrence of a certain
message; rather, it checks whether the agent has been sanctioned or not in
order to determine the new action and message:
if sanctioned = true then a := act_T1 and M := mes_T1
else a := act_T2 and M := mes_T2.
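The two-level frame execution above can be sketched in Python as follows; the dictionary field names (Me1, act1, and so on) mirror the variables in the frame definition, but the data layout itself is our own assumption.

```python
def execute_frame(frame, received, trouble_frame, sanctioned):
    """Execute one frame: match the received message against the two
    expected messages; otherwise fall back to the trouble frame, which
    branches on whether the agent has just been sanctioned."""
    if received == frame['Me1']:
        return frame['act1'], frame['mes1']
    if received == frame['Me2']:
        return frame['act2'], frame['mes2']
    # neither expected message arrived -> trouble frame
    if sanctioned:
        return trouble_frame['actT1'], trouble_frame['mesT1']
    return trouble_frame['actT2'], trouble_frame['mesT2']

frame = {'Me1': 1, 'act1': 'Plant_I', 'mes1': 0,
         'Me2': 2, 'act2': 'Null', 'mes2': 3}
trouble = {'actT1': 'Exit', 'mesT1': 0, 'actT2': 'Null', 'mesT2': 2}

print(execute_frame(frame, 1, trouble, False))  # ('Plant_I', 0)
print(execute_frame(frame, 3, trouble, True))   # ('Exit', 0)
print(execute_frame(frame, 3, trouble, False))  # ('Null', 2)
```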
For every state of the environment the agent has two frames. The selection
of a frame at time t is guided by a Q-value Q_F, that is, reinforcement
learning12 takes place in order to choose an appropriate frame in a given (en-
vironmental) situation. The entire collection of frames for an agent for a given
final state U_e of the environment is: F_U = {F_(k,0), F_(k,1)}, for k = 0, ..., U_e.
An additional frame set is employed by an agent when the agent starts the
communication by generating the start message M_0. For the trouble state U_T
the agent can also choose between two (trouble) frames F_T = {F_(T,0), F_(T,1)}.
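Q-value-guided selection between the two frames of a set can be sketched as below. The paper names neither the exploration scheme nor the update rule (it cites Sutton and Barto's framework), so the epsilon-greedy choice and the running-average update here are our assumptions.

```python
import random

def select_frame(q_values, epsilon=0.1):
    """Epsilon-greedy choice of one of the two frames in a frame set:
    explore with probability epsilon, otherwise pick the higher Q-value."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])

def update_q(q_values, chosen, reward_signal, alpha=0.1):
    """Move the chosen frame's Q-value toward the received reward signal
    (alpha is an assumed learning rate)."""
    q_values[chosen] += alpha * (reward_signal - q_values[chosen])

q = [0.02, 0.01]         # small random initial Q-values
for _ in range(50):      # repeated positive reward for frame 0
    update_q(q, 0, 1.0)
print(q[0] > q[1])                   # True: frame 0 becomes the greedy choice
print(select_frame(q, epsilon=0.0))  # 0
```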
Evolution is based on frames; agents do not change frames during their
life time, they are just able to change the Q-value of a frame with respect
to the other frame inside the same frame set. At the beginning of the sim-
ulation, all frames of all agents are initialized randomly. In particular, vari-
ables M_e1, M_e2, mes_1, mes_2, mes_T1, and mes_T2 get randomly chosen values
from S = {0, 1, 2, ..., S_max}, and variables act_1, act_2, act_T1, and act_T2 get
randomly chosen values from A = {Null, Sanction, Exit, Replace, Plant_I,
Harvest_I, Plant_II, ...}. Inheritance happens on the frame level, that is,
cross-over takes place between frames, not inside a frame (but inside a frame
set). Individual parts of a frame are subjected to mutation. Therefore, e.g.
part M_e1 or act_2 may get a new random value during the mutation process. Q-
values are not passed on to offspring, and are set to a small random value at
the beginning of an agent's life time.

Figure 1. Simulation of 1000000 games (results averaged over 1000 games). Result of the simu-
lation: a) maximum possible success (counting the occurrence of a "correct" pairing of the
agents); b) the actually achieved success; c) correctly performed Exit; d) Exit in wrong
situation; e) stopped, because maximum rounds exceeded. For example: after around
500000 games, the average result of 1000 games was 60% successful games, out of a max-
imum of 75% possible successful games, 25% were correctly and 10% were incorrectly ex-
ited by an agent, and 5% were stopped by the system (values approximated). U_e = 4,
S_max = 3, rounds = 10, E* = 10.0, E_l = 0.5, E_h = 2.5, E_b = E_p = 0.1, E_s = 50.0,
A_max ∈ {550, ..., 800}, A_sex = 20, t_pause = 20, a = 5.0, b = 1.0.

4 Simulation results

Figure 1 shows the general outcome of a simulation, and Figure 2 shows


the statistics of the number of sanctions in 1000 games, the number of living
agents, and the average energy of the agents. The maximum number of agents
was set to 1024. The simulation started with 3 agents and as long as the
number of agents was below 15, a higher energy payoff E* was given for
success than indicated in the caption of Figure 1 (to support an onset of
evolution). The number of agents grew rapidly until the limit was reached.
Later, evolution still took place, optimizing the frame structures. This may
result, for example, in changing cooperation sequences, or in a "competition"
of different sequences as indicated in Figure 3. A sequence was defined by
M_0, M_1 a_1 M_2 a_2 ..., that is, M_0 is the start message of the first agent,
M_1 the answer message and a_1 the action of the other agent, and so on. The
coding of actions is: 0 - Null, 1 - Sanction, 2 - Exit, 3 - Replace, 4 - Plant_I,
5 - Harvest_I, ... Because we analyzed only sequences which did not contain
a Replace action, and which were successful, all these sequences end with
action 7 (Harvest_II). Figure 3 shows the eight most frequent sequences of
the entire simulation. Sequence 1 occurred 160877 times, out of 346727
successful sequences without a Replace action. In detail, sequences are shown

Figure 2. From top to bottom: number of sanctions ("Bites", not averaged), number of
living agents, and average energy of the agents. The number of agents was restricted to
1024. When this number was reached, the agents increased their amount of energy on
average.

Figure 3. The eight main sequences of the frame based evolution. Left: Absolute occurrence
of sequences (average of 1000 games), right: relative occurrence of the sequences (in relation
to 346727 successful sequences). The eight sequences occurred 329895 times.

in Figure 4.
The communicative behavior of agents became more and more regular.
Because there were two frames for each environmental situation it is obvious
that a frame set is assumed to contain exactly one appropriate frame for
Planters and one for Harvester. An individual only has to explore which one
is better suited. A detailed analysis of the communicative behavior reveals
indeed that communication controls the behavior of agents. As the results
indicate, the agents were able to set up a population wide semantics for the

number (see Fig. 3)   number of occurrences   sequence M_0 M_1 a_1 M_2 a_2 ...


1 160877 104051627
2 66551 20404051627
3 37402 004051627
4 26721 01504051627
5 19039 21504051627
6 7118 00404051627
7 6453 20504051627
8 5734 21704051627

Figure 4. The eight most frequent sequences in detail.

exchanged symbols. The meaning of a symbol depends - of course - on


the environmental state; however, symbols became functional for the agent's
choice of the next message or action. Sanctions became less important as the
behavior became more normative. Although not shown here, simulations are
easily adapted to cases where several thousand agents may evolve, still acting
in a coordinated manner.

5 Conclusion

We have shown that a growing population of agents may act in a coordinated


manner even in the case when the cognitive capabilities of the agents are
limited and, moreover, when agents do not know anything about each other
(apart from received messages). From an observer's point of view agents reveal
a normative behavior, although we did not predefine any norms. We started
by asking what kind of mechanisms human society evolved in order to
cope with a growing number of individuals. We found an interesting answer in
the work of sociologists, especially SGCMs proposed by Luhmann. We have
modeled one SGCM (power) in a first approach. However, our simulation is
still too simple to establish all aspects of a symbolic medium.
Nevertheless, Luhmann's suggestions regarding SGCM, especially the as-
pect of structuring a situation by expectations, turned out to be useful. We
modeled some aspects of his theory, mainly aspects of a closed communica-
tion system, but found an interesting approach to answering well-known problems
in multi-agent research, namely problems of scalability and the definition of
norms. In subsequent work, we will deal with a more elaborated model of
a symbolic medium. Further, the impact of more than one medium has to
be analyzed, especially their potential for a more heterogeneous agent society
and more complex problems to be solved by the agents.

Acknowledgement

We are grateful to three anonymous reviewers for their comments. This work
is supported by the Deutsche Forschungsgemeinschaft under grant number
DI 452/10-1 and part of a research project headed by Werner Dilger and
Bernhard Giesen.

References

1. A. S. Rao and M. P. Georgeff. Modeling Rational Agents within a BDI-


Architecture. In Proceedings of the Second International Conference on
Principles of Knowledge Representation and Reasoning, pages 473-484,
Cambridge, Mass., 1991.
2. R. Dunbar. Grooming, Gossip, and the Evolution of Language. Harvard
University Press, Cambridge, Mass., 1996.
3. T. Parsons. The Structure of Social Action. Free Press, New York, 1968.
4. N. Luhmann. Social Systems. Stanford University Press, Stanford, Ca.,
1995.
5. M. Paolucci and R. Conte. Reproduction of Normative Agents: A Sim-
ulation Study. Adaptive Behavior, 7(3/4) :307-322, 1999.
6. K. Binmore. Game Theory and the social contract, volume 1: Playing
fair. Cambridge, Mass.: MIT Press, 1994.
7. Y. Shoham and M. Tennenholtz. Social Laws for Artificial Agent Soci-
eties: Off-line Design. Artificial Intelligence, 73, 1995.
8. M. P. Wellman. A Market-Oriented Programming Environment and its
Application to Distributed Multicommodity Flow Problems. Journal of
Artificial Intelligence Research, 1:1-23, 1993.
9. S. Park, E. H. Durfee, and W. P. Birmingham. Emergent Properties of
a Market-based Digital Library with Strategic Agents. In Y. Demazeau,
editor, Third Int. Conf. on Multi-Agent Systems (ICMAS98), pages 230-
237, Los Alamitos, Cal., 1998. IEEE Computer Society.
10. G. Ballot and E. Taymaz. Technological change, learning and
macro-economic coordination: An evolutionary model.
Journal of Artificial Societies and Social Simulation
<http://www.soc.surrey.ac.uk/JASSS/2/2/3.html>, 2(2), 1999.
11. D. E. Goldberg. Genetic Algorithms in Search, Optimization and Ma-
chine Learning. Addison-Wesley Publishing Company, Inc., Reading,
Mass., 1989.
12. R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction.
The MIT Press, Cambridge, Mass., 1998.
THINKING-LEARNING BY ARGUMENT

ALADDIN AYESH
De Montfort University, The Gateway, Leicester LE1 9BH
Email: aayesh@dmu.ac.uk

Humans argue all the time. We may argue with ourselves, with a partner or even with some-
one we have just met. The argument can take a decision-making form, discussion form,
thinking form or, in some cases, it could be for argument's sake. In this paper we describe a
system that uses three object-oriented components, referred to as cells, to utilize the argument
concept to enable a thinking-learning process to take place.

1 Introduction

Our ability to argue allows us to express our concerns, explore possibilities and make
collective decisions. We may argue with ourselves, with a partner or even with a
complete stranger. The argument may take the form of decision-making, discussion,
thinking or argument for argument's sake. Arguing with oneself for learning,
thinking and decision-making purposes is the concern of this paper. This paper
describes a system that uses three object-oriented components to turn the
argument concept into a thinking-learning process. These components are developed
using agent theory and techniques. However, these components form one entity and
they are not individual agents. Therefore, and for clarity's sake, these components will
be referred to as cells throughout the paper. The paper discusses the argument
concept and outlines the system proposed to utilize this concept.

2 Preliminaries

There are two relevant subjects to be discussed before proceeding further: arguing as a human mental process, and argumentative agents. Arguing is a powerful tool we use individually and socially [1]. We use this tool to reach agreements or understanding with our social partners. We use it individually to form understanding about ourselves and about matters of individual concern, as part of our thinking process. And finally we use it as a way of learning new facts from perceived knowledge. The relation between arguing and the three processes of understanding, thinking and learning can be seen in the early work of Plato and the philosophers who followed his technique. This relationship is also evident in our social life. Consider the statement 'the more we discuss (argue about) issue X, the more I learn about your personality'; this could concern your attitude towards or beliefs about the subject of discussion, and so on. Finally, arguing is affected greatly by our


perception and by our initial and developed set of beliefs [2]. Arguing as a
communication protocol in multi-agent systems has been studied intensively. An
example is the work done by Mora et al. on distributed extended logic programs [3];
another is the work done by Jennings et al. on negotiation [4]. Nonetheless
there are differences. In multi-agent systems there is usually a problem to be solved
by negotiation, and each agent participates in the argument autonomously. In contrast,
the agent-like components in our system are limited to three components that form
collectively one entity. These components are chosen to bring together the nature of
argumentation and agent technology. Each component has a pre-determined function.

3 Learning by argument system - basics

The proposed system comprises three cells, which are represented as object-agents.
These cells are named Observer cell (O cell), Questioner cell (Q cell) and Memory
cell (M cell). Each of the three cells is explained here.

3.1 Observer cell (O cell)


The O cell represents the perception system. It observes the environment and feeds back
to the Questioner cell (Q cell), which is described next. From the observations
provided, the Q cell forms some knowledge about the observed objects. The cycle
continues, perceiving as many observations as needed to form an opinion about the
object or a set of facts describing that object. To demonstrate the working
mechanism of the O cell, let us take as an example our eyes and the argument we have
with our perception system. Assume that I want to buy a car. I go to a car
dealership showroom and look at cars. I see a nice car, so in my brain I say 'it is a
nice car'; a reply comes back with 'but it is a blue car and I want a red car'. I see
another car, which is red but not as nice. Now I have one of two choices: either
decide in favor of the nice car or the red car, or ask whether they make the nice car in
red. The O cell deals with qualitative and quantitative information. Therefore a
representational language is being devised using hybrid logic [5] and adaptive
neural nets [6, 7] to represent both qualitative and quantitative information.

3.2 Questioner cell (Q cell)


The Q cell is the voice that replies to our observations and stipulates further information.
It is the part of the brain that says 'yes, it is a nice car, but it is not red'. The main task of
the Q cell is to interrogate the information provided by the O cell and feed back; this
provides a stimulus that triggers the O cell to provide further observations. The Q cell can be
viewed as a knowledge management component, which reviews the M cell to
determine ignorance points. Once this is done, the questions are formulated and
passed to the O cell. The Q cell uses the same representational framework as the O cell.

However, determining ignorance points, which is the consequential issue of this
component, is done using a three-valued logic in which the predicate U implies 'do
not know' [8]. Work on building the representational language is still ongoing.

3.3 Memory cell (M cell)


There are two types of fact perceived by the system: asserted facts and
observations. The following definitions state these two types.
Definition 1 An observation may be defined as a feature α of an observed object Γ
in relation to subject K with relevance 'low', annotated: O(α, Γ) → Relevance(Γ, K, low) ∨ Relevance(α, Γ, low). ∎
Definition 2 An asserted fact may be defined as a feature α of an observed object Γ
in relation to subject K with relevance 'strong' or 'definitive', annotated: O(α, Γ) → Relevance(Γ, K, strong) ∨ Relevance(α, Γ, definitive). ∎
The Memory cell imitates the memory concept as defined in psychology:
working memory and persistent memory, which may also be identified respectively
as short-term and long-term memory [9, 10]. This encourages the investigation of two types of
neural nets: self-organizing neural nets (NN) [7] and adaptive-architecture NN [6].
Self-organizing NN are well known in machine learning [7]. However, the size and
type of information the M cell deals with vary greatly depending on the
argument process between the O cell and the Q cell.

3.4 System architecture


Figure 1 shows the communication between the proposed system's main segments.

[Figure 1 sketches the OMQ system: the O cell and the Q cell exchange arguments, with the O cell feeding back to the M cell and the Q cell reading from it.]

Figure 1: Overview of OMQ System

Definition 3 An OMQ system may be defined as a tuple of components ⟨O, M, Q⟩, where O is an Observer cell, M is a Memory cell and Q is a Questioner cell, under a communication mechanism E in which packets are quadruples defined as follows: Observation packet (Ο) ⟨object, feature, relation, ?⟩; Question packet (Φ) ⟨object, request, relation, ?⟩; and Hold packet (Π) ⟨observation, type = {observe, fact}, time feed, time out⟩; with two main template operations defined, Receive (ℜ) and Feed (δ). ∎

3.5 Illustrative example


Here is an example of learning through interceding with the environment and
stipulating information. Table 1 shows the example in which a car is experienced.
Table 1. Learning example

O cell: "I see a red car." Operations: writes to M cell short-term memory: car and color: red, with the inter-relation between them; informs the Q cell that two facts have been observed. (Assume this observation is of interest to the system.)
Q cell: "What is a car?" Operations: informs the O cell with a request for further observations.
O cell: "A car is an object with four wheels." Operations: writes to M cell short-term memory: car: car-features, and informs the Q cell. Assuming a four-wheel-object abstract exists, the Q cell will classify car and update the M cell with: car: four-wheel object. The M cell will re-organize its long-term memory.

It is noticeable that the system requires a constant feed of observations; depending on the interest in a given observation, scrutiny may be conducted.
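The dialogue in Table 1 can be rendered as a small program. The sketch below is a hypothetical Python illustration of the O-Q-M argument cycle; the class and method names, and the use of a plain dictionary as the perceived environment, are our assumptions and not details of the paper's implementation.

```python
# Hypothetical sketch of the OMQ argument cycle; all names are illustrative.

class MCell:
    """Memory cell with working (short-term) and persistent (long-term) stores."""
    def __init__(self):
        self.short_term, self.long_term = {}, {}

    def write_short(self, key, value):
        self.short_term.setdefault(key, {}).update(value)

    def reorganize(self, key):
        # Settled facts migrate from short-term to long-term memory.
        if key in self.short_term:
            self.long_term[key] = self.short_term.pop(key)

class OCell:
    """Observer cell: perceives one aspect of an object and feeds the M cell."""
    def __init__(self, memory, environment):
        self.memory, self.environment = memory, environment

    def observe(self, obj, aspect):
        self.memory.write_short(obj, {aspect: self.environment[obj][aspect]})

class QCell:
    """Questioner cell: finds ignorance points ('do not know') in memory."""
    def __init__(self, memory):
        self.memory = memory

    def ignorance_point(self, obj):
        known = self.memory.short_term.get(obj, {})
        return None if "definition" in known else "definition"

def argue(obj, environment):
    """One observe-question-answer loop; returns the learned facts."""
    m = MCell()
    o, q = OCell(m, environment), QCell(m)
    o.observe(obj, "color")            # O: "I see a red car."
    gap = q.ignorance_point(obj)       # Q: "What is a car?"
    while gap:
        o.observe(obj, gap)            # O: "A car is an object with four wheels."
        gap = q.ignorance_point(obj)
    m.reorganize(obj)                  # M re-organizes its long-term memory
    return m.long_term[obj]

env = {"car": {"color": "red", "definition": "object with four wheels"}}
print(argue("car", env))
# {'color': 'red', 'definition': 'object with four wheels'}
```

The loop terminates once the Q cell finds no remaining ignorance point, at which point the M cell promotes the settled facts to its long-term store.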

4 Practical and implementation issues

There are several practical difficulties to be resolved. First, two major
processes need to run simultaneously: the argument process between the O and Q cells and
the M cell's re-organization process. This leads to the difficulty of deciding when a
piece of information X is promoted to long-term memory or kept in short-term memory.
Additionally, if robots are considered for a physical implementation, real-time
processing would be desired. The second problem is the representation of
information. While neural nets may be useful for long-term memory, they may not be as
suitable for short-term memory, which may hold contradictory information. Furthermore,
different types of neural nets (NN) may be used, such as Specht's self-organizing NN
[6], whereby pieces of information can be added or deleted as neurons. Trials are
being carried out on different versions of self-organizing NN and logical models for
the development of the knowledge and communication language.

5 Conclusion and Future work

In this paper a system that deploys the concept of argumentation to enable a learning-thinking
process was presented. The system consists of three agent-like
components, referred to as cells and identified as the Observer cell (O cell),
Questioner cell (Q cell) and Memory cell (M cell). Collectively they form one entity,
namely the OMQ system. A definition of the three cells and their functionality is
provided, with an explanation of the OMQ system's working mechanism by example. A
full implementation is still under development. Server-client concepts will be
examined to facilitate the predicted expansion, in which case each cell will consist
of several components. Parallel computation will be considered to speed processing.
The current system is developed for a single entity and enables the agent to
argue with itself. The next stage is to permit team argumentation within a multi-OMQ-agents
system.

6 References

1. A. L. Wilkes, Knowledge in Minds: Individual and Collective Processes in
Cognition. UK: Psychology Press (of Erlbaum (UK), Taylor & Francis), 1997.
2. N. H. Frijda, A. S. R. Manstead, and S. Bem (Eds.), Emotions and Beliefs.
Cambridge: Cambridge University Press, 2000.
3. I. Mora, J. J. Alferes, and M. Schroder, "Argumentation and Cooperation for
Distributed Extended Logic Programs," presented at Working Notes of the
Workshop on Non-monotonic Reasoning, Trento, Italy, 1998.
4. N. R. Jennings, P. Faratin, A. R. Lomuscio, S. Parsons, C. Sierra, and M.
Wooldridge, "Automated Negotiation: Prospects, Methods and Challenges,"
Journal of Group Decision and Negotiation, vol. 10, 2001.
5. C. Areces, P. Blackburn, and M. Marx, "Hybrid logics: Characterization,
interpolation and complexity," The Journal of Symbolic Logic, 1998.
6. D. F. Specht, "Probabilistic Neural Networks," Neural Networks, vol. 3, pp.
109-118, 1990.
7. K. H. Pribram and J. King, "Learning as Self-organization,". Mahwah, New
Jersey: Lawrence Erlbaum Associates, Publishers, 1996.
8. A. Ayesh, "Self Reference in AI," in Computer Science Dept. Colchester:
University of Essex, 1995.
9. L. R. Squire, B. Knowlton, and G. Musen, "The Structure and Organization of
Memory," in Human Memory: A Reader, D. R. Shanks, Ed. London: Arnold,
1997.
10. A. Ayesh, "Towards Memorizing by Adjectives," submitted to the
AAAI Fall Symposium on Anchoring Symbols to Sensor Data in Single and
Multiple Robot Systems, 2001.
Evolution of a Foraging Model with Many Individuals by
Kin-selection

Kazue Kinoshita, Atsuko Mutoh, Tsuyoshi Nakamura, Hidenori Itoh


Department of Intelligence and Computer Science, Nagoya Institute of Technology,
Gokiso-cho, Showa-ku, Nagoya-shi, Japan, 466-8555
e-mail: kaz@juno.ics.nitech.ac.jp

The Genetic Algorithm (GA) has the property of prompting the evolution of superior individuals
by weeding out others under a standard of estimation, so GA is weak
at evolving altruistic behavior. We used a selection algorithm based on the
theory called "kin selection"¹, which is a popular rationale for altruistic behavior.
The efficacy of this algorithm was confirmed by simulating a model in which sending
signals to communicate within a group is regarded as altruistic behavior. As a
result, the population came to consist of subgroups of individuals who have the same
property and the same signal pattern for communication within each subgroup.

1 Introduction

Some animal species coexist in the same space. They use species-specific signals
to communicate with members of their species and avoid confusion. This paper
proposes a new evolution model in which many individuals are classified into
groups of individuals with similar characteristics, and each group acquires a group-specific
signal to cooperate with its company. This model is named the "foraging
model". Groups are characterized by their types of foraging (search for and intake of)
foods. "Signal" means the "food call", which gathers members to found feeders.
The evolution algorithm for such a model needs the possibility of evolving "altruistic
behavior": behavior in which individuals act to increase the benefit
of many members of the same group in exchange for decreasing the benefit of the actors.
This is because the "food call" is regarded as an altruistic behavior by a sender toward many signal
receivers. It is difficult to evolve such altruistic behavior with general
selection in GA, which weeds out individuals with low fitness (gained benefit).
Altruistic behavior is explained by "kin selection" in biology. Kin selection is
a theory that altruistic behavior is a tactic to indirectly gain offspring whose
genes are partially similar to the actor's, by increasing the chance of bearing children of
many similar parents. This probability of indirectly bearing similar offspring is
called "inclusive fitness". This paper uses it in place of general fitness (the benefit
gained by one individual). Other literature³ proposed such a replacement too;
however, this paper uses a new definition suited to evolving plural coexisting
groups and an altruist in each group. In the following, the new inclusive fitness and model
are defined, and the efficiency of inclusive fitness is confirmed.


Table 1: Symbolic expressions of words

X = (x^1, ..., x^n), Y = (y^1, ..., y^n),  x^i, y^i ∈ {0, 1}
h(X, Y): Hamming distance between X and Y
w(X): weight of X (number of 1s in X)
X ⊕ Y = (x^1 ⊕ y^1, ..., x^n ⊕ y^n)  (⊕: exclusive OR)
gene g ∈ {0, 1}; individual a_i = (g_i^1, ..., g_i^t); benefit of a_i = b_i
group A = {a_1, ..., a_M}; species P_i ⊂ A; benefit of A: B_A = Σ_{i=1..M} b_i
distance d_ij = h(a_i, a_j);  d̄_i = (1/M) Σ_{j=1..M} d_ij

2 Definition of Inclusive Fitness


Table 1 shows the definitions of the symbolic expressions. A "species" P_i is defined as a
subset consisting of members with a partially common gene.
e_i is the general inclusive fitness³ of individual a_i:

    e_i = ψ(b_1, ..., b_M, r_i1, ..., r_iM) = Σ_j r_ij · b_j,   r_ij = κ'(d_ij) = 1 − d_ij / d̄_i    (1)

In formula (1), the similarity degree r_ij is larger as d_ij is smaller: if many very similar
individuals get more benefits, then e_i is raised. Conversely, for individuals whose
distance from a_i is longer than the average d̄_i, r_ij is negative; if these
negative-similarity individuals get some benefit, then e_i is lowered. Under this formula,
however, plural species cannot coexist, even if their benefit-acquiring behavior does not
conflict with the others'. So this paper defines a new similarity degree r_ij by a function κ,
such that e_i is influenced by the benefits of low Hamming-distance individuals, who are
usually in the same species, and is not influenced by the benefits of high Hamming-distance
individuals, who are usually in other species.
Definition 2.1
    r_ij := κ(d_ij) = α^{d_ij},   0 < α < 1. ∎

With this function the range of r_ij is 1 to 0. If the r_ij calculated from function κ
(Def. 2.1) are given as the arguments of function ψ (formula (1)), e_i reflects only the
benefits of individuals at short relative distance. So coexistence of several
species is realized, provided no conflict occurs among the benefits of each species.
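Definition 2.1 can be checked with a few lines of code. The sketch below is illustrative Python, not the authors' implementation; the population and benefit values are made up. It computes e_i = Σ_j α^{d_ij} · b_j for a small population containing two gene-disjoint "species":

```python
def hamming(x, y):
    """h(X, Y): Hamming distance between two equal-length bit strings."""
    return sum(a != b for a, b in zip(x, y))

def inclusive_fitness(genes, benefits, alpha=0.5):
    """e_i = sum_j kappa(d_ij) * b_j with kappa(d) = alpha ** d (Def. 2.1)."""
    M = len(genes)
    return [sum(alpha ** hamming(genes[i], genes[j]) * benefits[j]
                for j in range(M))
            for i in range(M)]

# Two 'species' with distant genes: each individual's inclusive fitness is
# dominated by the benefits earned within its own species.
genes = ["1111", "1110", "0000", "0001"]
benefits = [1.0, 1.0, 4.0, 4.0]
e = inclusive_fitness(genes, benefits)
# e[0] = 1*1 + 0.5*1 + 0.0625*4 + 0.125*4 = 2.25
```

Because α^d decays quickly with distance, the benefits earned by the distant species barely affect e_i, which is what allows the two species to coexist under this selection criterion.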

3 Foraging Model

The general foraging model² is the model in which an individual selects the kinds of foods to take in so as to maximize its foraging efficiency, given the existence of n kinds of foods F_1, ..., F_n. Foods are expressed by a row of r bits in Table 2.

Table 2: Foraging model

Food:        F_v = (f_v^1, ..., f_v^r),  f_v^i ∈ {0, 1}
Individual:  x_i = (x_i^1, ..., x_i^p)   action-gene
             s_i = (s_i^1, ..., s_i^q)   signal-gene
             f_i = (f_i^1, ..., f_i^r)   taste-gene
             x_i^j, s_i^j, f_i^j ∈ {0, 1}

The general foraging model is limited to the problem of one individual; in our model, however, M individuals do cooperative foraging. Each individual is expressed by 3 rows of bits in Table 2. Definition 3.1 gives the details of our foraging model.
Definition 3.1
1. A limited number of indexed cells exist and each individual occupies one cell. The ratio of cells containing each F_i (i = 1, ..., n) is a common value λ (0 < λ < 1); more than one kind of food never exists in one cell. The number of F_i in one cell is a common value e.
2. Individuals can change their current cell to another unsearched cell within 1 "turn" (a).
3. The efficiency of intake of F_v by a_i is called the rank of the food: rank_iv = C − h(F_v, f_i) (C is some constant value). a_i can take in rank_iv units of F_v within 1 turn, and gets benefits in proportion to the number of intake units.
4. When a_i finds F_v, the inside-code I_iv = F_v ⊕ f_i is produced, so rank_iv = C − w(I_iv). If rank_iv > 0, the signal S_iv = I_iv ⊕ s_i is produced and sent to all individuals (b).
5. When a_r receives signal S_sv from a_s, the inside-code I_r = S_sv ⊕ s_r is produced.
6. Each individual has a "memory", which is a list of items (c, L_c, s). c is the index of the cell where the signal sender exists. L_c is the rank of the food (expected number of food units the receiver can take within 1 turn) in cell c: L_c = C − w(I_r). s is the similarity between sender and receiver (c). Items in memory are stacked in L_c order (the highest L_c is at the top of memory).
7. If memory is not empty, when individuals change cell they move to the cell of index c at the top; that item is then deleted from memory.
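As a quick numeric check of item 3 of Definition 3.1 (the helper names below are ours, not the paper's), the rank of each food for a taste-gene (1111) under C = 2 and the food set used later in the experiments:

```python
def hamming(x, y):
    """h(X, Y): Hamming distance between two equal-length bit strings."""
    return sum(a != b for a, b in zip(x, y))

def rank(food, taste, C=2):
    """rank_iv = C - h(F_v, f_i): units of F_v that a_i can take in per turn."""
    return C - hamming(food, taste)

foods = {"F1": "1111", "F2": "1110", "F3": "0000"}
taste = "1111"  # taste-gene of a species-P1 individual
ranks = {name: rank(f, taste) for name, f in foods.items()}
print(ranks)  # {'F1': 2, 'F2': 1, 'F3': -2}
```

Only foods with positive rank trigger a signal (item 4), so this individual would never call others to F3.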

If signal sender a_s and receiver a_r have the same signal-gene (s_s = s_r), then
I_r = I_sv (d). In this model, a "species" is regarded as a set of individuals who
have some common taste-gene and common ranks of foods. If the members of one
species have a common signal-gene, correct information is transmitted among
the members, and the efficiency of foraging can be raised by the signals. If individuals
" 1 "turn" is a period that all individuals act once.
b
individuals who are not similar to sender is difficult to receive. Probability that aT receives
signal from a s is rec(a r ,S s ) = exp( <•**£'•) y
c
If duplicate information about the same cell is added, one item is deleted, which s shows
more non-similar individual as to gene.
d
Aj = s,;„ e Sj = (A; e s^ e s^ = A ; S{ ® SJ = &

Figure 1: Left: evolution of foraging efficiency (inclusive vs. general fitness, average of 20 runs); Center: transition of the signal pattern across generations; Right: final signal patterns and appeared donors

have the same signal-gene as members of another species, incorrect ranks are added to
memory, and their efficiency of foraging cannot be raised by signal information.
In the following experiments, the number of individuals M = 36, λ = 1/20, e = 200,
(p, q, r) = (1, 4, 4), C = 2 and F_v ∈ {F_1, F_2, F_3} = {(1111), (1110), (0000)}
in Table 2. If the action-gene x_i = (0), a_i is called a donor, and the others are called
recipients. Donors are extreme altruists who are specialists in search and sending
signals. Recipients search for and take in found foods. Use of the signals from
donors raises the foraging efficiency of many recipients. Recipients move using their
memories (Def. 3.1). Donors, however, move to other unsearched cells; if memory
is empty, they move at random.

4 Experiment

This section describes some experiments proving the effect of individual
selection by inclusive fitness. The experiments compare general fitness
(b_i: the number of food units taken by an individual within 1000 turns)
and inclusive fitness (Def. 2.1). (e)
As a result of the experiments, the population finally consists of only 3 types of
individuals, whose taste-gene f_i coincides with F_1, F_2 or F_3. These 3 classes of
individuals are called species P_1 (f_i = F_1), species P_2 (f_i = F_2), and species
P_3 (f_i = F_3). At first, the numbers of members in P_1, P_2 and P_3 increased. Next, the
signals were unified within each species. Finally, individuals became able to send and
receive information correctly within the same species. The experiments confirm that
inclusive fitness lets a group gain higher benefits than general fitness does
(Fig. 1, Left). The cause appears in Fig. 1 (Right): it was difficult
for general fitness to produce donors; inclusive fitness, however, makes it possible.
In the evolutionary process of species-specific signals, some small subgroups
P_i' ⊂ P_1 appeared. These subgroups can be classified by their signal-genes.
(e) The algorithm used in these experiments is a GA with plural groups using group selection.

Fig. 1 (Center) shows the subgroups P_i' ⊂ P_1 and Fig. 1 (Right) shows the finally acquired
specialized signal-genes. After several generations, these subgroups were weeded
out except for one optimum subgroup P_i*, which has Property 4.1. (f)
Property 4.1
∃i P_i: (f_i, s_i) is fixed; ∀j P_j: s_j = f_i ⊕ f_j ⊕ s_i (j = 1, ..., K). ∎

Only when almost all individuals have Property 4.1 is the information of the correct
rank of foods transmitted even between the other 2 species.
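The XOR algebra behind Property 4.1 can be verified directly. In this sketch (the function names are ours), a sender broadcasts S = F ⊕ f_s ⊕ s_s; a receiver whose signal-gene satisfies s_r = f_s ⊕ f_r ⊕ s_s decodes exactly F ⊕ f_r, the inside-code from which its own correct rank follows:

```python
def xor(x, y):
    """Bitwise exclusive OR of two equal-length bit strings."""
    return "".join("1" if a != b else "0" for a, b in zip(x, y))

def send(food, taste_s, signal_s):
    """Sender: inside-code I = F xor f_s, broadcast signal S = I xor s_s."""
    return xor(xor(food, taste_s), signal_s)

def decode(signal, signal_r):
    """Receiver: inside-code I_r = S xor s_r."""
    return xor(signal, signal_r)

F = "1110"                      # the food F_2
f_s, f_r = "1110", "1111"       # taste-genes of the sender and receiver species
s_s = "0101"                    # sender's signal-gene
s_r = xor(xor(f_s, f_r), s_s)   # Property 4.1: s_r = f_s xor f_r xor s_s

# The receiver recovers F xor f_r, hence its own correct rank C - w(I_r),
# even though sender and receiver belong to different species.
assert decode(send(F, f_s, s_s), s_r) == xor(F, f_r)
```

Since every XOR cancels in pairs, the sender's taste- and signal-genes drop out of the decoded value exactly when the receiver's signal-gene satisfies Property 4.1.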

5 Conclusion

This paper used Def. 2.1 as the similarity degree function κ, which realizes the evolution
of several species in one shared field. It realized the evolution of communication
within each species and the appearance of altruists. Inclusive fitness, however, has
restrictions. First, it presupposes the existence of an "altruistic behavior
gene", and that altruistic behavior is always done for other individuals with similar
genes. So inclusive fitness cannot be applied to every model. Second, it needs as
long a gene arrangement as possible, because the effect of inclusive fitness depends on
gaps in similarity degree.
We will inspect other functions satisfying the required properties,
and will clarify the effects and limits of inclusive fitness. This
paper also confirmed a phenomenon in the evolution of signal-genes which
makes it possible for species to use signals produced by other species. In the real
ecosystem, for example, a common alarm call for common enemies is used
across several species, and each individual can distinguish it from private signals
inside its species. This distinguished-signals problem is one of the future works.

References
1. W. D. Hamilton, The genetical evolution of social behaviour I, II, J. Theor. Biol., 1964.
2. Eiichi Kasuya, Primer of Behavior Ecology, Publisher of Tokai Univ., 1990.
3. Ezequiel A. Di Paolo, A Little More than Kind and Less than Kin: The Unwarranted Use of Kin Selection in Spatial Models of Communication, in Advances in Artificial Life, Proc. ECAL'99, Springer-Verlag, 1999.
4. Kazue Kinoshita, Toshikazu Suzuki, Nobuhiro Inuzuka, Hidenori Itoh, An Evolutionary Acquisition of a Cooperative Population by Selection Methods, MACC99, 1999.

(f) I_r = S_sv ⊕ s_r = (F_v ⊕ f_s ⊕ s_s) ⊕ s_r; I_r = F_v ⊕ f_s ⊕ s_s ⊕ (f_s ⊕ f_r ⊕ s_s) = F_v ⊕ f_r (when a_s and a_r have Property 4.1), so C − w(I_r) = C − h(F_v, f_r) = the rank of F_v for a_r.
THE USE OF EMERGENT BEHAVIOUR IN A MULTI-AGENT SYSTEM
TO DRIVE SELF-ADAPTATION AT THE INTERFACE

PETER MARSHALL AND SUE GREENWOOD


Intelligent Systems Research Group, School of Computing and Mathematical Sciences,
Oxford Brookes University, Oxford, England, UK.

e-mail: pmarshall@brookes.ac.uk, sgreenwood@brookes.ac.uk

Multiple simple agents have been used to drive adaptive behavior in a system that presents
data in various graphical and tabular forms. Agents observe the users' actions and review the
data that is input into the system. Based on their observations, the community of agents makes
decisions about which display formats to recommend when new data is loaded. Rather than
carrying out high-level decision-making, the agents work as an emergent system where the
result of their interactions provides the set of recommended displays. This approach has been
deployed in the real-world domain of medicine.

1 Introduction

Previous work in the Intelligent Systems Research Group employed a system that
enabled data to be displayed according to the needs of a particular user. Due to the
time critical nature of the problem, spending time searching through the data was
not feasible. This prohibited the system being usefully employed in the clinical
setting for which it was designed. Thus a more automated approach was required.
Current work involves redeveloping the earlier system from the ground up. A
multi-agent system has been utilised to drive the adaptivity. A set of simple agents,
each concerned with a single aspect of the system, communicate with each other,
and the suggested summary is a result of the emergent behavior of the whole
system. While emergent behavior is used in other areas where agents have been
applied, notably robotics, it is novel to use this approach in adaptive interfaces.
This paper first considers the use of reactive agents to provide a context for the
application of emergence in the area of self-adaptive interfaces. The field of
adaptive interfaces is also considered to identify approaches that have been used in
the past. An emergent multi-agent system using a two-layer model is then
described. This approach has been applied to the problem of providing self-
adaptivity at the interface.


2 Intelligent Agents

Jennings et al. [4] provide an argument questioning the viability of reactive
agents, which do not plan from first principles. They consider that, by not
employing a world model, the agents must have enough information in their local
environment to allow them to decide on a reasonable action. Reactive agents are
therefore restricted to relying on this local information and must take a
'short-term view'. Hence they do not see how such agents could learn from
experience to improve performance over time.
Their analysis suggests that reactive agents might not be a good idea except in
specialist areas such as Brooks' work with robot control [1]. However the work of
Wavish and Graham [6] shows that reactive agents can produce interesting results in
interface work. They have created systems with agents as actors where the behavior
of the system emerges from the interactions of the 'actors'. This would indicate that
the application of reactive agents to the user interface could be valid.
In the proposed system, simple agents make decisions based on whether values
exceed numeric thresholds and by comparing values directly. The agents can
modify these thresholds when patterns of behavior repeat. The agents' internal state
is stored between sessions thus allowing the overall system behavior to adapt over
time.
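As a rough illustration of such an agent (the names and the adaptation rule below are our assumptions; the paper does not specify its update scheme), a threshold agent might fire when a value exceeds its threshold and nudge the threshold upward when the same outcome keeps repeating:

```python
class ThresholdAgent:
    """Simple agent that fires when a value exceeds its numeric threshold
    and adapts the threshold when a pattern of behavior repeats.
    Illustrative sketch only, not the authors' implementation."""

    def __init__(self, threshold, step=0.1, patience=3):
        self.threshold = threshold
        self.step = step          # how far to move the threshold
        self.patience = patience  # repeats needed before adapting
        self._streak = 0          # consecutive firings seen so far

    def observe(self, value):
        fired = value > self.threshold
        self._streak = self._streak + 1 if fired else 0
        # A long run of firings suggests a 'new normal': raise the
        # threshold so only genuine deviations fire in the future.
        if self._streak >= self.patience:
            self.threshold += self.step
            self._streak = 0
        return fired

    def state(self):
        # Persisted between sessions so overall behavior adapts over time.
        return {"threshold": self.threshold}

agent = ThresholdAgent(threshold=1.0)
results = [agent.observe(v) for v in [1.2, 1.3, 1.2, 1.2]]
# All four observations fire, but after three repeats the threshold
# has drifted up to 1.1.
```

Persisting `state()` between sessions is what lets the community's behavior, in aggregate, adapt to a user over time.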

3 Adaptivity at the User Interface

Providing adaptivity requires obtaining user data. In static adaptation the user is
initially classified, and the system configures itself to match this classification at first use. In
dynamic adaptation, the system takes account of the user's behavior while using the
system; with this approach, the system can take time to learn the user's habits.
Korvemaker & Greiner [5] discuss this problem. The use of stereotypes can address
it: initial stereotypes can then be modified over time, as in Bushey et al.'s
[2] CDM method.
To allow adaptation over time, the user must be monitored. There are two
ways to attempt this. The first is to build up a discourse model over time: user
choices can be tracked and used to reveal patterns in behavior, as in Goecks and Shavlik [3] for
example. The second is to ask the user for ratings. This can cover everything from
simply asking the user whether or not to carry out an action up to
modifying the content so that it can be rated. This second, more direct approach
gives more concrete feedback on the users' opinions of the content, but could be
considered intrusive. In the problem considered here, where data needs to
be interpreted by the user in a time-critical situation, it is probably more important
not to disturb the user than to gain direct feedback.

4 System Development

The aim of this system is to produce a form of adaptivity where the user can be
offered what the system thinks are the most relevant data views while not taking
away from the user control of the system. When requested, agents decide on a
summary that consists of a list of possible data views that appear in a new window
to the side of the main window thus not interfering with normal activity. The user
can view or ignore all or part of the recommendations as they see fit.
Figure 1 below shows the architecture of the system. To drive the adaptivity, a
community of agents is used. Each agent is relatively simple in itself, but the power
of the system comes from the interactions of the various agents. The agents are
divided into two layers: the interaction layer, comprising interface, data and
reasoning agents, and the control layer, comprising overseer and scheduler agents.
Interaction-layer agents monitor the actions of the system and
propose changes to the summary, while the control-layer agents coordinate the
actions of the interaction-layer agents. The various types of agent are considered below.

4.1 Interface Agents


Each interface action has an interface agent monitoring it. When an agent's action
occurs, it notifies its reasoning agent and then notifies the discourse model. By
observing which actions occur near the time it fired, the agent can decide which other
agents might be linked to its firing and update its internal belief model accordingly.

4.2 Data Agents


Unlike interface agents, which respond after every action, the actions of the data
agents occur as a group when a new dataset is loaded. Each data agent watches a
particular facet of the data and, when new data is loaded, decides whether it differs
from the norm enough to warrant a high or low rating. It then tells the domain model
whether its value was high, low or normal. Once all data agents have reported, any
with high or low values can, by observing others with these results, update their
internal beliefs about direct or inverse relationships.

4.3 Reasoning Agents


Reasoning agents are each associated with a single interface agent but can interact
with all data agents. When a summary is requested, each reasoning agent derives a
final rating for its interface agent based on the raw rating from the number of times
the interface agent has fired, plus modifiers from patterns discerned by its interface
agent or any of the data agents.
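A minimal sketch of this rating step and the overseer's subsequent ranking; the modifier scheme, the bound check and all view names below are illustrative assumptions, not the authors' formulas:

```python
def final_rating(raw_firings, modifiers):
    """Raw rating from the firing count plus pattern-derived modifiers."""
    return raw_firings + sum(modifiers)

def overseer_summary(ratings, max_modifier=5):
    """Rank views by total rating, discarding any whose modifiers pushed
    the total out of accepted bounds (keeping the emergence in check)."""
    accepted = {view: total for view, (raw, total) in ratings.items()
                if abs(total - raw) <= max_modifier}
    return sorted(accepted, key=accepted.get, reverse=True)

# view -> (raw rating, final rating); values are made up for illustration.
ratings = {
    "heart_rate_chart": (10, final_rating(10, [2, 1])),  # total 13
    "lab_table":        (7,  final_rating(7, [4])),      # total 11
    "trend_plot":       (3,  final_rating(3, [9])),      # modifiers too large
}
print(overseer_summary(ratings))
# ['heart_rate_chart', 'lab_table']  (trend_plot rejected by the overseer)
```

Separating the raw rating from the modified total is what lets the overseer detect out-of-range modifiers, as described in section 4.4.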

[Figure 1 depicts the two-layer architecture: in the interaction layer, the interface agents (connected to the user interface and the discourse model), reasoning agents and data agents (connected to the dataset) communicate via a blackboard with the overseer agent in the control layer.]

Figure 1: The agent architecture
4.4 Overseer Agents

Once all reasoning agents have determined their ratings, the overseer ranks them to
produce the final summary. As the agents rely on emergence, one must be wary of the
behavior moving outside accepted bounds. The overseer looks at both the raw
and total ratings and so can decide whether the modifiers applied are out of range.

4.5 Scheduler Agents


These agents are concerned with the smooth running of the system. They notify the
relevant interaction-layer agents when events of interest to specific agents
occur. Because of the way the agents are implemented, where each agent action is
the direct result of a particular event, there have not been coordination issues, but
the role of the scheduler could be expanded to deal with this if required.

5 Analysis

The agents as described above are able to make decisions that allow the summary
the system offers to adapt over time and usage. As noted above, there is the
problem of lead time before a system such as this can hope to perform adequately.
This is addressed by giving each user a stereotype that is modified over time. To
check how effective the offered summary is, the agents watch to see whether the
user selects items from the summary. If they do, this is taken as positive
reinforcement of the item's inclusion. In this way, there is a feedback loop.
If one has a strong model of what a user is trying to achieve, then one can
simply map it to the actions they are taking. Without such a model, one runs into
difficulties. In this system, the agents directly observe the data that the user uses
and, by linking patterns in the data to patterns in user behavior, attempt to overcome the
lack of an explicit model of the user's goals.
Using a community of simple agents that communicate with each other, it is
possible to analyze the actions at the interface and the patterns in the data
separately, while still having a mechanism in place that allows these two analyses
to be combined into final decisions.

6 Conclusions

This work demonstrates the use of emergent behavior in a community of agents to


provide a self-adaptive system. Achieving this with a conventional approach
would have required the construction of a far more complex set of high-level
models. We have produced a model for a group of agents whose interactions combine
to produce a working system. With the relationships between the various patterns
in the data and user actions modeled implicitly, one can, at least in some cases,
sidestep the need for complex high-level models.

7 References

1. Brooks R A, Intelligence without Representation, Artificial Intelligence 47
   (1991), pp. 139-159.
2. Bushey R, Mitchell Mauney J and Deelman T, The development of behavior-based
   user models for a computer system, UM99: The 7th Int. Conf. on User
   Modeling (1999), pp. 109-118.
3. Goecks J and Shavlik J, Learning Users' Interests by Unobtrusively Observing
   their Normal Behavior, IUI 2000: Int. Conf. on Intelligent User Interfaces
   (2000), pp. 129-133.
4. Jennings N, Sycara K and Wooldridge M, A Roadmap of Agent Research and
   Development, Autonomous Agents and Multi-Agent Systems, 1 (1998), pp. 275-306.
5. Korvemaker B and Greiner R, Predicting Unix Command Lines: Adjusting to
   User Patterns, Proc. 17th Nat. Conf. on AI (2000), pp. 230-235.
6. Wavish P and Graham M, Roles, Skills and Behavior: A Situated Action
   Approach to Organizing Systems of Interacting Agents, in Wooldridge M and
   Jennings N R (Eds), Intelligent Agents: Theories, Architectures and
   Languages, Lecture Notes in AI 890, Springer-Verlag (1994), pp. 371-385.
A BIOLOGICALLY INSPIRED FOUR LEGGED ROBOT THAT
EXHIBITS SOME NATURAL WALKING BEHAVIOURS
S. Peng, G. R. Cole, C. P. Lam
School of Engineering, Murdoch University, Perth, WA, Australia 6150
email: shiqi@eng.murdoch.edu.au
This paper presents the design and implementation of a four legged walking robot that can
exhibit some natural walking behaviours. The robot is inspired by biological systems and
aims to address the complex behaviours of legged locomotion from a biological perspective.
The robot displays its natural walking manners via overall emergent walking behaviours
generated from real-time interactions with its environment. A parallel Subsumption
Architecture is applied to the robot for the physical implementation of the design.

1 Introduction

Legged robots have been studied for a long time, and a number of them have
been built for laboratory investigation and practical application in recent years.
Based on the number of legs, these can be classified into three main types:
• Biped robots, which have two legs (e.g. the Honda humanoid robot [1], Eyebot [2]).
• Quadruped robots, which have four legs (e.g. BISAM [3], TITAN [4]).
• Insectoid robots, which have more than four legs (e.g. [5, 6]).
Although most of these walking robots are also inspired by biological systems,
the approaches used to generate their walking behaviours have been very much
from an engineering perspective. The smooth and natural walking behaviours
exhibited by animals are seldom shown in existing walking robots. Many biological
studies on animal locomotion (e.g. [7-10]) have shown that a natural rhythmic
cycle of animal locomotion is composed of several different phases (also known as
the duty factor [11] in most biophysics documents). Different gaits have
different numbers of phases. For example, a walking gait has four phases, while
trotting, pacing and bounding have two. The reason there are different
gaits in animal locomotion is that certain gaits result in the most efficient energy
consumption at certain speeds [12]. For example, the walking gait is suitable for
low-speed locomotion, while pacing and trotting are suitable for high-speed
locomotion. The walking gait has four different phases (phases 0, 1, 2 and 3, as shown
in Figure 1). Each leg undergoes these four phases during walking;
however, at any one time, the legs are all at different phases. At phase 0, a leg
(referred to as the leading leg [13]) is lifted and swung forward. At phases 1, 2 and 3,
a leg moves backward. All four legs cooperate to generate the force to move the
body forward: while one leg is moved forward, the other three legs on the ground
push backward simultaneously. The difference between phases 1, 2 and 3 is the
position of the leg relative to the body. For example, at phase 3 a leg is at the
fully extended position (e.g. the left back leg in Figure 1-i), while at phase 1 it is
at a less extended position (e.g. the right back leg in Figure 1-i).


Another essential issue for natural walking is balance, which Raibert addressed
in much of his research [14]. There are two types of balance strategies in animal
locomotion: dynamic balance and static balance. In static balance, the center of
gravity of the animal is always kept within the supporting area formed by its legs
on the ground; the animal can statically keep its posture and not fall down. In
dynamic balance, the animal's center of gravity is sometimes outside its supporting
area, and the animal must use its movements, which generate momentum, to
compensate for its temporary instability. For instance, when a leg is off the ground
and swung forward, the center of gravity of the body may be outside the supporting
area, causing the animal to start falling. However, as long as the leg can complete
its forward action before the fall exceeds a tolerable limit, the falling is acceptable
and even useful to the animal. Animals employ both balance strategies during
locomotion: the faster an animal moves, the more the dynamic strategy is employed.
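The static-balance condition described above, that the center of gravity stays inside the support polygon formed by the grounded legs, can be sketched as a simple geometric test. The coordinates, and the assumption of a convex support polygon with counter-clockwise vertices, are illustrative and not taken from the paper.

```python
# A minimal sketch of the static-balance test: the posture is statically
# stable when the projection of the centre of gravity lies inside the
# support polygon formed by the feet that are on the ground.

def cross(o, a, b):
    """2D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def statically_stable(cog, feet):
    """True if cog is inside the convex polygon of ground-contact feet
    (vertices given in counter-clockwise order)."""
    n = len(feet)
    if n < 3:                      # fewer than 3 supports: no static stability
        return False
    return all(cross(feet[i], feet[(i + 1) % n], cog) >= 0 for i in range(n))

# Three feet on the ground while the fourth leg swings forward:
feet = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]      # CCW triangle
```

During the swing phase the support polygon shrinks to a triangle, so the centre of gravity can temporarily leave it; that is exactly the case the dynamic strategy tolerates.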
The Subsumption Architecture (SA) [15] is a robotic architecture inspired by
biological systems. It is a bottom-up, reactive AI approach that uses no model or
representation of its environment. A Subsumption Architecture is made up of a
hierarchical set of pre-defined behaviours, all operating in parallel. A behaviour
is a set of actions triggered by certain sensor conditions (physical or virtual)
for achieving a sub-goal that eventually facilitates achievement of the system's
overall goal. According to preset suppression rules, higher-level behaviours, if
triggered, can suppress lower-level ones.
This paper presents the design and implementation of a four legged walking
robot, inspired by four legged animals (e.g. a dog), and aims to investigate the
problem of natural walking, an issue not yet sufficiently addressed in the robotics
community. The robot is designed with complexity and similarity reasonably close
to its biological counterpart, since without these some animal walking issues may
be overlooked. The walking behaviours of the robot are implemented using one SA
for each leg (i.e. parallel SAs).

2 Design and Implementation

As addressed previously, the cycle of walking phases is set as 0-3-2-1-0... .
Given that the full moving distance of a leg (relative to the body) is defined as
100%, the four leg positions (d, c, b, a) represent positions of 100%, 66%, 33% and
0%. The movements of the four phases are as follows:
• Phase 0: Move from position a to position d (in the forward direction).
• Phase 3: Move from position d to position c (in the backward direction).
• Phase 2: Move from position c to position b (in the backward direction).
• Phase 1: Move from position b to position a (in the backward direction).
A detailed illustration of the walking gait is shown in Figure 1. By implementing
these cycles of leg motion, a walking behaviour for the robot can be achieved. The
legs enter phase 0 in the order left front, right hind, right front and left hind (the
normal walking gait for four legged animals). A picture of the robot, built as an
experimental platform, is shown in Figure 2. Pneumatic cylinders attached to the
limbs act as "muscles", providing actuation through solenoid valves. An independent
Subsumption Architecture with its own action execution unit has been developed
for each leg of the robot, resulting in four SAs functioning in parallel in the system.
There are no direct communications among the four architectures; the only
connections between them are the physical body of the robot and a simple Central
Pattern Generator (CPG) that coordinates the leg movement phases.
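The phase bookkeeping implied by this design can be sketched as follows. The leg names and function names are illustrative, but the cycle 0-3-2-1 and the initial phase offsets follow the text (Sections 2 and 3).

```python
# Sketch of the CPG phase cycle: each leg steps through 0 -> 3 -> 2 -> 1 -> 0,
# and the four legs are offset so exactly one leg is in phase 0 (swinging
# forward) at any time while the other three push backward.

NEXT_PHASE = {0: 3, 3: 2, 2: 1, 1: 0}

def cpg_step(phases):
    """Advance every leg one phase; a new leg enters phase 0 each step."""
    return {leg: NEXT_PHASE[p] for leg, p in phases.items()}

def swinging_leg(phases):
    """The single leg currently in phase 0 (off the ground, moving forward)."""
    return next(leg for leg, p in phases.items() if p == 0)

# Initial phases used in the experiments (Section 3): 0, 2, 3, 1 for the
# left front, right front, left hind and right hind legs respectively.
phases = {"left_front": 0, "right_front": 2, "left_hind": 3, "right_hind": 1}
```

Stepping this cycle reproduces the stated stepping order left front, right hind, right front, left hind.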

Figure 1: Walking gait of the design. Four panels illustrate one cycle: (i) Step 1:
left front leg off-ground moving forward, others on-ground moving backward; (ii)
Step 2: right hind leg off-ground forward, others on-ground backward; (iii) Step 3:
right front leg off-ground forward, others on-ground backward; (iv) Step 4: left
hind leg off-ground forward, others on-ground backward. Each panel marks the leg
positions d, c, b, a and the phase of every leg.

Six types of behaviours, Stand, Forward, Backward, Balance, LegDown and
Protect, listed in Table 1, are defined for the four SAs.

Behaviour   Sub-goal                                                      Priority
Stand       To lock a leg at the current position                         0
Forward     To carry out the phase 0 actions of a leg to swing the        1
            leg forward
Backward    To carry out the other phases (1, 2, 3) of a leg to push      1
            the body forward
Balance     To balance the body to avoid falling over                     2
LegDown     To put a leg down to make contact with the ground             3
Protect     To set the legs and body of the robot to "safe" positions     4

Table 1: List of behaviours
In total, 24 behaviours run in parallel in the system, cooperating to generate an
overall emergent walking behaviour during locomotion. The Forward and Backward
behaviours are at the same priority level and thus do not suppress each other.
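The priority-based suppression among these behaviours can be sketched as follows. This is a minimal illustration, not the authors' implementation (which uses SwiftX): the trigger conditions and sensor names are assumptions, while the priorities follow Table 1.

```python
# Sketch of suppression by priority: among the behaviours whose trigger
# condition holds, the one with the highest priority becomes active and
# suppresses the rest.

def arbitrate(behaviours, sensors):
    """Return the active behaviour: the triggered one of highest priority."""
    triggered = [b for b in behaviours if b["trigger"](sensors)]
    if not triggered:
        return None
    return max(triggered, key=lambda b: b["priority"])

behaviours = [
    {"name": "Forward", "priority": 1, "trigger": lambda s: s["cpg_pulse"]},
    {"name": "Balance", "priority": 2, "trigger": lambda s: s["tilt"] > 10},
    {"name": "Protect", "priority": 4, "trigger": lambda s: s["unstable"]},
]
```

With this rule, a tilt beyond the threshold lets Balance suppress Forward, matching the scenario described in Section 3.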
All the behaviours in the four SAs operate concurrently. The Forward behaviour
swings a leg forward from position a (e.g. the start position of the left front leg in
Figure 1-i) to position d (e.g. the stop position of the left front leg in Figure 1-i).
The Backward behaviour moves a leg backward one unit distance at a time (e.g.
from position d to position c, c to b, etc.) to generate the force for pushing the
body of the robot forward. At any moment, if the Forward behaviour of a leg is
triggered, the Backward behaviours of the remaining three legs are also triggered
at the same time, with different phases. These two types of behaviours
automatically record their phase status and move to the next status in the order
0-3-2-1. The Balance behaviour supplements the Backward and Forward behaviours
to implement the robot's balance strategies (both dynamic and static, depending on
the real-time situation). The Balance behaviours are activated when the body of the
robot tilts at an angle (e.g. 10 degrees) to the horizontal surface. If the robot tilts
beyond a critical angle (e.g. 20 degrees), the LegDown behaviour is activated,
lowering leg(s) to the ground to support the weight of the robot and prevent it from
tipping over. To protect the robot in situations where it may fall over, each SA has
a Protect behaviour that can be activated; this behaviour resets the robot to a
pre-defined "safe" state.
In terms of the physical implementation of these architectures, the methodology
proposed in [16] has been used. Behaviours are implemented as behaviour objects
that are instantiated from the Behaviour class and composed of reusable
components (eg. Action, Trigger and Executor components). A behaviour
encapsulates all its functionality and characteristics (eg. its trigger condition,
suppressible behaviour list, actions and operating knowledge) so that it can operate
independently and no extra behaviour arbitrator is required. The development
language employed here is SwiftX 2.5 [17], which provides a simple multi-thread
(task) programming and operating environment. Each behaviour, as well as the
Action Execution Unit of an architecture, is implemented as an instance via an
independent thread.
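The behaviour encapsulation just described can be sketched as follows, in Python rather than the SwiftX used by the authors: each behaviour bundles its trigger condition, its action, and the list of behaviours it may suppress, so no separate arbitrator is needed. All names and the control-tick structure are illustrative assumptions.

```python
# Sketch of a self-contained behaviour object: a behaviour fires when its
# trigger holds and no other triggered behaviour lists it as suppressible.
# (In the real system each behaviour runs on its own thread.)

class Behaviour:
    def __init__(self, name, trigger, action, suppresses=()):
        self.name, self.trigger, self.action = name, trigger, action
        self.suppresses = set(suppresses)   # names of lower-level behaviours
        self.fired = 0

    def run_once(self, sensors, active):
        """Fire unless a higher-level triggered behaviour suppresses us."""
        suppressed = any(self.name in b.suppresses for b in active)
        if self.trigger(sensors) and not suppressed:
            self.action()
            self.fired += 1
            return True
        return False

log = []
stand   = Behaviour("Stand",   lambda s: True,           lambda: log.append("lock legs"))
forward = Behaviour("Forward", lambda s: s["cpg_pulse"], lambda: log.append("swing leg"),
                    suppresses=("Stand",))

# One control tick: collect the triggered behaviours, then let each fire
# unless suppressed by another triggered behaviour.
sensors = {"cpg_pulse": True}
triggered = [b for b in (forward, stand) if b.trigger(sensors)]
for b in triggered:
    b.run_once(sensors, triggered)
```

On this tick, Forward fires and Stand, which it lists as suppressible, stays quiet, mirroring the scenario in Section 3.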

3 Experiment results

A laboratory floor is used as the testing terrain. The robot successfully walks
from one end of the floor to the other at a speed of about 2.5 meters/minute,
exhibiting certain walking behaviours. The initialisation stage involves setting
conditions whereby each of the legs is positioned to a preset phase (phases 0, 2, 3
and 1 for the left front, right front, left hind and right hind legs respectively).
Since the Stand behaviour has no "trigger" conditions, it automatically
activates provided no other behaviours are active. The sequence of behaviours is not
deterministic but a typical scenario is now described. When the CPG is first started,
the Forward behaviour of the left front leg and the Backward behaviours of the
remaining three legs are triggered. The Forward behaviour suppresses the Stand
behaviour to become the active behaviour and moves the leg forward. The leg extends
downward onto the ground, lifts off, swings fully forward and is placed down on the
ground. At the same time, the Backward behaviours of the other three legs push
backward on the ground to move the body forward. They cooperate to generate the
necessary force to enable the robot to move forward. A smooth transition of leg
position phases is shown during movement. Visually, it is seen that the robot is
walking forward. During these activities, the LegDown behaviour may be activated
if the body of the robot tilts beyond a tolerable angle. If it is triggered, this
behaviour will suppress any lower level behaviour (eg. Forward or Backward
behaviour) to become the activated behaviour. Its actions involve putting the leg
onto the ground in an attempt to prevent the robot from tipping over. Once this
behaviour is completed, the Balance behaviour is triggered subsequently to further
stabilize the robot's posture. In the worst case scenario when the robot loses its
balance and reaches an abnormal unstable posture, the Protect behaviour is
triggered to reset the posture of the robot to a certain predefined "safe" position.
When the robot regains its balance, the Forward or Backward behaviours are again
activated. This alternation of behaviours may occur repeatedly until all the stepping
actions have been completed. After the first "phase of walking", the phases of legs
are changed to 3 (left front), 1 (right front), 2 (left hind), 0 (right hind), so that it is
ready for the right hind leg to be moved forward. Overall, all the interactions inside
the system will generate an emergent walking behaviour that enables the robot to
move forward.
The process discussed above is for one of the four legs and occurs
simultaneously for all four legs. Figure 2 shows a walking cycle of the robot on
flat ground; 2a to 2d sequentially show the steps of the left front leg, right hind
leg, right front leg and left hind leg. The phase transitions of the left front leg
are shown by the arrow box pointing to the leg (note the positions of the legs
relative to the body).

Transition phases of the other three legs are similar. A point to note is that
a pulsed mode of operation is used with the pneumatic cylinders: the movement of
a limb consists of a number of smaller pulsed movements, or jerks. For this reason,
we are not yet at a stage where it would be fair to compare the walking quality of
this robot to others, given that our movements have yet to be fully optimized. An
MPEG-format video clip of the walking behaviours of the robot can be obtained
from: http://eng.murdoch.edu.au/~shiqi/Robotwalker.mpg.


Experiments with the robot employing different gaits and subjected to different
terrains are being carried out.

4 Conclusion & Future works

This paper presents the design and implementation of a four legged walking
robot that incorporates some biological inspiration, enabling the robot to walk.
Four parallel SAs are used in the robot to physically implement the concepts. All
of the behaviours within the four parallel SAs, together with a simple CPG,
cooperate to generate emergent walking behaviours. In the future, a more
complicated CPG will be incorporated for walking phase optimization, together
with machine learning, to enable the robot to carry out more sophisticated and
flexible natural walking behaviours.

References

1. HONDA MOTOR CO., "The Honda Humanoid Robot,"
   http://www.honda.co.jp/english/technology/robot/tec1.html, 2000.
2. T. Braunl, "The EyeBot mobile robot family," presented at the International
   Conference on Artificial Intelligence (IC-AI), Las Vegas, 1999.
3. W. Ilg, K. Berns, H. Jedele, J. Albiez, R. Dillmann, M. Fischer, H. Witte,
   J. Biltzinger, R. Lehmann and N. Schilling, "BISAM: From small mammals to a
   four legged walking machine," presented at the Fifth International Conference
   of the Society for Adaptive Behaviour, 1998.
4. K. Yoneda, "TITAN VIII," Tokyo Institute of Technology,
   http://www.fzi.de/ipt/WMC/preface/node150.html, 1998.
5. A. Onat, "The Six Legged Walker Gokiburi," University of Tokyo,
   http://turbine.kuee.kyoto-u.ac.jp/staff/onat/altibacak.html, 1998.
6. M. Binnard, "Boadicea - A Small, Pneumatic Walking Robot,"
   http://www.ai.mit.edu/projects/boadicea/boadicea.html, 1998.
7. E. von Holst, "Über relative Koordination bei Arthropoden," Pflügers Archiv,
   vol. 246, 1943.
8. E. R. Kandel, J. H. Schwartz, and T. M. Jessell, Principles of Neural Science.
   Norwalk, CT: Appleton & Lange, 1991.
9. S. Grillner, "Control of locomotion in bipeds, tetrapods, and fish," in
   Handbook of Physiology, 1981, pp. 1179-1236.
10. M. L. Shik and G. N. Orlovsky, "Neurophysiology of Locomotor Automatism,"
    Physiological Reviews, vol. 56, pp. 465-501, 1976.
11. R. M. Alexander, Locomotion of Animals. New York: Blackie & Son Limited, 1984.
12. R. M. Alexander, Exploring Biomechanics: Animals in Motion. New York:
    Scientific American Library, 1992.
13. A. B. Howell, Speed in Animals. New York: Hafner Publishing Company, Inc., 1965.
14. M. H. Raibert, "Legged Robots," in Robotics Science. Cambridge, MA: The MIT
    Press, 1989, pp. 563-594.
15. R. A. Brooks, "A Robust Layered Control System For A Mobile Robot," IEEE
    Journal of Robotics and Automation, vol. RA-2, no. 1, pp. 478-485, 1986.
16. S. Peng, G. R. Cole, and C. P. Lam, "A generic framework for implementing
    Subsumption Architecture," presented at RA2000, Honolulu, Hawaii, USA, 2000.
17. "SwiftX Reference," SwiftX Co., Ltd., http://www.forth.com, 2000.
CHAPTER 4

KNOWLEDGE DISCOVERY AND


DATA MINING AGENTS
CM-RELVIEW: A TOOL FOR CAUSAL REASONING IN
MULTIAGENT ENVIRONMENTS

BRAHIM CHAIB-DRAA
Computer Science Department,
Pavilion Pouliot, Laval University, Ste-Foy,
PQ, Canada G1K 7P4
email: chaib@ift.ulaval.ca

Analytical techniques are generally inadequate for dealing with causal
interrelationships among a set of individual and social concepts. In this paper, we
present a software tool called CM-RELVIEW, based on relational algebra, for dealing
with such causal interrelationships. We then investigate the use of this tool in
multiagent environments, particularly in the case of: (1) qualitative distributed
decision making and (2) the organization of agents considered as a wholistic
approach. For each of these aspects, we focus on the computational mechanisms
developed within CM-RELVIEW to support it.

1 Introduction

Cognitive maps follow personal construct theory, first put forward by Kelly 8.
This theory provides a basis for representing an individual's multiple
perspectives. Kelly suggests that understanding how individuals organize their
environments requires that subjects themselves define the relevant dimensions
of that environment. He proposed a set of techniques, known collectively as
a repertory grid, to facilitate empirical research guided by the theory.
Personal construct theory has spawned many fields and has been used
as a first step in generating cognitive maps. Huff 7 has identified five generic
"families" of cognitive maps. Among these families, there is one that shows
influence, causality and system dynamics. Maps of this type, called causal
maps, generally allow the map maker to focus on action: for example, how
the respondent explains the current situation in terms of previous events,
and what changes she expects in the future. This kind of cognitive map has
been, and still is, the most popular mapping method.
Causal maps are generally used for dealing with such cause-effect relations
embedded in decision-makers' thinking. These maps are represented as directed
graphs whose basic elements are simple. The concepts an individual (a
decision-maker or a group of decision-makers) uses are represented as points,
and the causal links between these concepts are represented as arrows between
these points. This representation gives a graph of points and arrows, called
a causal map (CM). The strategic alternatives, all of the various causes

Figure 1. An example of causal map (concepts: "Japan remains idle", "Japanese
attrition", "US preparedness", "Japanese success in war"; "Japan remains idle" has
positive links to "Japanese attrition" and "US preparedness", each of which has a
negative link to "Japanese success in war").

and effects, goals, and the ultimate utility a of the decision-maker can all be
considered as concept variables and represented as points in the CM. Causal
relationships can take on different values based on the most basic values +
(positive), − (negative), and 0 (neutral). Logical combinations of these three
basic values give the following: "neutral or negative" (⊖), "neutral or positive"
(⊕), "non-neutral" (±), "ambivalent" (a) and, finally, "positive, neutral, or
negative" (i.e., "universal") (?) 1,5,11.
The real power of this approach appears when a CM is pictured in graph
form. It is then relatively easy to see how concepts and causal relationships are
related to each other and to see the overall causal relationships of one concept
with another, particularly if these concepts are the concepts of several agents.
The CM of Fig. 1, taken from 10 , explains how the Japanese made the
decision to attack Pearl Harbor. Indeed, this CM states that "remaining idle
promotes the attrition of Japanese strength while enhancing the defensive
preparedness of the United States, both of which decrease Japanese prospects
for success in war". Thus, a CM is a set of concepts as "Japan remains idle,"
"Japanese attrition," and so forth, and a set of signed edges representing
causal relations like "promote(s)," "decrease(s)," and so forth.
Note that the concepts' domains are not necessarily defined precisely, because
there are no obvious scales for measuring "US preparedness," "success in war,"
and so forth. Nevertheless, it seems easy to catch the intended meaning of the
signed relationships in this model 14. As with any causal map, the CM of Fig. 1
can be transformed into a matrix, called an adjacency or valency matrix, which is
a square matrix with one row and one column for each concept.
Inferences that we can draw from a CM are based on qualitative reasoning
similar to "a friend's enemy is an enemy, an enemy's enemy is a friend," and so
forth. Thus, in the case of Fig. 1, "remaining idle" decreases the prospects
a Utility means the unspecified best interests of a decision maker.



for Japanese success in a war along two causal paths. Notice that the
relationship between idleness and war prospects is negative because both paths
agree. In these conditions, Japan has an interest in starting war as soon as
possible if she believes that war is inevitable.
Thus, causal maps, and the qualitative reasoning they sustain, generally serve
as the modeling language for problem resolution through decision making,
particularly in multiagent systems, where decisions generally emerge from
interrelationships among agents' concepts. Such is the case for the previous
example, which reflects a multiagent system in the sense that "Japan" and "USA"
are individual agents.
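The qualitative inference just described can be sketched as follows, using the Pearl Harbor map of Fig. 1. The concept names are abbreviated, and the sign-combination rule (agreeing paths keep their sign; disagreeing paths yield an ambivalent result, written "a") follows the text.

```python
# Sketch of qualitative path reasoning on a causal map: the effect along a
# path is the product of its edge signs, and the total effect combines the
# signs of all paths between two concepts.

edges = {  # adjacency form of Fig. 1:  cause -> {effect: sign}
    "idle":      {"attrition": "+", "us_prep": "+"},
    "attrition": {"success": "-"},
    "us_prep":   {"success": "-"},
}

def path_sign(path):
    sign = 1
    for a, b in zip(path, path[1:]):
        sign *= 1 if edges[a][b] == "+" else -1
    return "+" if sign > 0 else "-"

def all_paths(src, dst, seen=()):
    if src == dst:
        return [(dst,)]
    return [(src,) + p
            for nxt in edges.get(src, {})
            if nxt not in seen
            for p in all_paths(nxt, dst, seen + (src,))]

def total_effect(src, dst):
    signs = {path_sign(p) for p in all_paths(src, dst)}
    return signs.pop() if len(signs) == 1 else "a"   # 'a' = ambivalent
```

Both paths from "idle" to "success" multiply out to a negative sign, so the total effect is negative, which is exactly the inference drawn from Fig. 1 above.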
In this paper, we present an implementation of a formal model (details on
this model can be found in 6): a system used as a computational tool supporting
relational manipulations.

2 CM-RELVIEW: An Implementation of the Relation Model of CMs

CM-RELVIEW has been built on top of the RELVIEW software, a freeware
package developed by Berghammer and Schmidt 2. In the CM-RELVIEW system, all
data are represented as binary relations, which the system visualizes in two
different ways. For homogeneous relations, CM-RELVIEW offers a representation as
cognitive maps, including several different algorithms for pretty-printing. As an
alternative, an arbitrary relation may be displayed on the screen as a Boolean
matrix. With the matrix representation, we can visually edit relations and also
discover various structural properties that are not evident from the causal map
representation. The CM-RELVIEW system can manage as many graphs and matrices
simultaneously as memory allows, and the user may manipulate and analyze the
relations behind these objects by combining them with the operators of relational
algebra. The elementary operations can be accessed through a simple mouse-click,
but they can also be combined into relational expressions, mappings, and
imperative programs. CM-RELVIEW also allows users to store relations and CMs.
In addition, CM-RELVIEW offers a menu window (Fig. 2) that can be divided
into different parts. The first part deals with tasks such as: (1) FILES: opens
the file-chooser window; (2) INFO: helps users by giving appropriate information;
(3) QUIT: quits the system.
The "Editors" part includes the following: (a) RELATION: opens the

(This software can be obtained by anonymous ftp from
http://www.informatik.uni-kiel.de/~progsys/relview.html.)

Figure 2. The menu window of CM-RELVIEW (buttons grouped into: Editors
(RELATION, GRAPH); Directories (XRV/PROG, LABEL); User-defined functions and
tests (DEFINE, EVAL, ITER, TESTS); Basic operations; Residuals and quotients
(S/R, R\S, SYQ); Closures (TRANS, REFL, SYMM); Product, sum and powersets
(a*b, a+b, a^b)).

window of the relation editor; (b) GRAPH: pops up the window of the CMs
editor.
By clicking the RELATION button, one opens the relation editor. One can then
load a relation by simply selecting it in the first scroll list of the directory
window. Typically, the window of the relation editor looks like a grid in which a
single entry of the relation, unequivocally defined by a row and a column, is
represented by one element of the set C := {a, +, −, 0, ⊕, ⊖, ±, ?}. If the mouse
pointer is located on an item of a relation, the mouse buttons invoke the
following actions:
• the left mouse button sets the item if it was cleared, or clears it if it was
set;
• the middle mouse button allows one to choose the value (which the left mouse
button then sets) from the set C := {a, +, −, 0, ⊕, ⊖, ±, ?};
• the right mouse button pops up a menu offering: (i) NEW: creates a new
relation; (ii) DELETE: deletes the relation displayed in the relation editor
window from the workspace (the causal map associated with the deleted relation
is also deleted); (iii) RELATION → GRAPH: creates a CM from a homogeneous
relation, with the same name as the relation (the CM is displayed in the graph
editor).
The window of the graph editor (i.e., CM editor) can be opened by pressing
the button GRAPH in the menu window. Similar to relations, all actions within
this menu are selected with the right mouse button. By pressing this button, we
reach the graph menu, within which we can invoke the following actions:

• DELETE: deletes all nodes of a causal map;
• NEW: opens a dialog window which allows one to enter a name for a causal
map;
• GRAPH → RELATION: creates a relation from a causal map;
• GRAPH-DRAWING: opens a submenu from which different graph algorithms
can be chosen, particularly LAYER, which places the edges vertically, FOREST,
which draws a directed forest, and WHOLISTIC-APPROACH, which draws a
particular kind of causal map that we detail in Section 4.

CM-RELVIEW also offers a "directory" part which contains:
• XRV/PROG: displays the directory window showing the state of the workspace
and the reasoning on causal maps;
• LABEL: opens the label directory listing the label set, which is, in our case,
C := {a, +, −, 0, ⊕, ⊖, ±, ?}.

The buttons in the "user-defined functions and tests" part are those most
needed while working with the CM-RELVIEW system:
• EVAL: pops up the evaluation window for entering a relational term (a
relational term can be a relation, a function, or a relational program);
• TESTS: pops up a window for invoking tests. With this command, one can
perform the following actions: (i) TEST-1-R: to execute various kinds of tests
on a relation (is it empty, injective, symmetric? etc.); (ii) TEST-2-R: to
execute tests on two relations (are they equal, included? etc.); (iii)
SUBJECTIVE VIEWS: to run tests on CMs in the case of reasoning on subjective
views (COMPARISON, PREDICTION, EXPLANATION and NEGOTIATION); (iv)
WHOLISTIC-CM: to execute some strategies of change on the particular CMs
representing an organization of agents, as discussed in Section 4.

Finally, the other parts of the menu window offer a number of relational
operations which are directly accessible via push buttons. Among those oper-
ations, TRANS allows one to calculate the transitive closure of a given relation.

3 CMs as a Tool for Qualitative Distributed Decision Making

CMs can also help an agent, or a group of agents considered as a whole, to
make a decision. Given a cognitive map with one or more decision variables and a
utility variable, which decisions should be taken and which should be rejected? To
answer this, the concerned agent should calculate the total effect of each decision
on the utility variable. Decisions that have a + or ⊕ total effect on utility
should be chosen, and decisions that have a − or ⊖ total effect should be
rejected. Generally, no advice can be given about decisions with an a, that is an
ambivalent, total effect on utility, whereas a decision with a ± or ? total effect
on utility should not simply be rejected, because it raises the undetermined
decision problem. To solve such undetermined decisions, we propose here an
original algorithm based on the principle of superposition adopted for CMs. This
principle stipulates that the result of applying two concepts C1 and C2 together
is the same as applying C1 and C2 in sequence.

Algorithm for solving the undetermined decision

1. For any concept C that has an undetermined result on the utility U, calculate all
the indirect effects between C and U; then separate those indirect effects into positive
and negative paths, i.e., paths with "+" and "−" total indirect effect respectively.
2. Cut off all the negative paths and evaluate the effect of the positive paths on U;
call this evaluation P1.
3. Repeat the previous step for the effect of the negative paths on U (without taking
the positive paths into account); call this evaluation P2.
4. Compare P1 and P2:

(a) if P1 is more valuable than P2, then the sign between C and U is "+";
(b) else if P1 is less valuable than P2, then the sign between C and U is "−";
(c) else if P1 is as valuable as P2, then the sign between C and U is "0".
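The steps above can be sketched in code. This is a minimal illustration: the representation of each indirect path as a (sign, value) pair, where the value is the decision maker's numeric estimate of that path's impact, is our assumption.

```python
# Minimal sketch of the undetermined-decision algorithm. Path valuations
# (the basis of P1 and P2) are supplied by the decision maker; the
# (sign, value) pair representation is an illustrative assumption.

def resolve_undetermined(paths):
    """paths: (sign, value) pairs, one per indirect path from C to U.
    sign is '+' or '-'; value is the DM's estimate of that path's impact."""
    p1 = sum(v for s, v in paths if s == '+')  # positive paths alone (step 2)
    p2 = sum(v for s, v in paths if s == '-')  # negative paths alone (step 3)
    if p1 > p2:                                # step 4: compare P1 and P2
        return '+'
    if p1 < p2:
        return '-'
    return '0'
```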

We will show below how this algorithm operates on a concrete example.
Before that, we illustrate the decision-making process in the context of
multiagent environments using CMs. Consider, for example, the causal map
of a professor P1 (considered as an agent), shown in Fig. 3, who supervises
a research group called G12 and who has to choose between two courses D1
and D2 (D1 and D2 are decision variables). The question now is how P1 can
choose between D1 and D2 knowing the facts reflected by the causal map shown
in Fig. 3. This causal map includes the following P1 beliefs: (i) D1 favors
the theoretical knowledge of G12's students; (ii) greater theoretical
knowledge gives greater motivation to students; (iii) greater motivation of
students gives a better quality of research for group G12, which gives a
greater utility of G12 which, in turn, has a positive effect on the utility
of P1. Finally, the second decision variable D2 is an easy course that
decreases the workload of P1. Obviously, decreasing P1's workload increases
her utility.

[Figure 3 is a causal map with the links: D1 →(+) theoretical knowledge of
students →(+) student motivation →(+) research quality of G12 →(+) utility
of G12 →(+) utility of P1; student motivation →(+) P1 workload →(−) utility
of P1; D2 →(−) P1 workload.]

Figure 3. An illustrative example for decision-making in a multiagent environment.

In this case, how can P1 make her choice between the two courses D1
and D2? Notice that in the context of our example, P1 must reason about
another agent, the group G12, to make her decision. In other contexts,
and for other decisions, she can also collaborate with her group to develop
her decision. In this sense, the decision-making process considered here is a
multiagent process. To run this process, it might be useful to convert the
causal map being analyzed into the form of a valency matrix V. With the
valency matrix, P1 can calculate indirect paths of length 2 (i.e. V^2), 3 (i.e.
V^3), etc., and the total effect matrix Vt. In fact, Vt tells P1 how the decision
variables D1 and D2 affect her utility and G12's utility. This gives the following
matrix of size 2 x 2 (keeping only the relevant entries) involving two decision
concepts (DC), D1 and D2, and two utilities considered as value concepts
(VC), namely the utilities of G12 and P1.

DC\VC   Utility of G12   Utility of P1
D1      +                ?
D2      −                +
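The computation of Vt from V can be sketched as follows, assuming a simplified qualitative sign algebra over {'0', '+', '-', '?'} (the full CM algebra also has the circled and ± signs, omitted here for brevity). The five-concept map is a reduced, hypothetical version of Fig. 3, not the full example.

```python
# Sketch of the total-effect computation on a valency matrix, assuming a
# simplified sign algebra over {'0', '+', '-', '?'}.

def mul(a, b):
    # Sign of a path through two consecutive links.
    if a == '0' or b == '0':
        return '0'
    if a == '?' or b == '?':
        return '?'
    return '+' if a == b else '-'

def add(a, b):
    # Combining parallel paths: opposite signs give an ambivalent '?'.
    if a == '0':
        return b
    if b == '0':
        return a
    return a if a == b else '?'

def matmul(A, B):
    n = len(A)
    out = [['0'] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[i][j] = add(out[i][j], mul(A[i][k], B[k][j]))
    return out

def total_effect(V):
    # Vt = V + V^2 + ... + V^(n-1): every indirect path of an acyclic map.
    n = len(V)
    Vt, P = [row[:] for row in V], V
    for _ in range(n - 2):
        P = matmul(P, V)
        Vt = [[add(Vt[i][j], P[i][j]) for j in range(n)] for i in range(n)]
    return Vt

# Reduced map: 0=D1, 1=Motivation, 2=Research quality, 3=Workload, 4=Utility of P1
V = [['0', '+', '0', '0', '0'],
     ['0', '0', '+', '+', '0'],
     ['0', '0', '0', '0', '+'],
     ['0', '0', '0', '0', '-'],
     ['0', '0', '0', '0', '0']]
print(total_effect(V)[0][4])  # '?': D1 reaches Utility of P1 by a + and a - path
```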
Thus, P1 perceives (1) decision D1 as having a positive effect on the utility
of G12 and an undetermined effect on her utility; (2) decision D2 as having a
negative effect on the utility of G12 and a positive effect on her utility. In these
conditions, it is important to remove the undetermined result of decision D1
on P1's utility. To achieve this, we apply the previous algorithm as follows:

1. To see the impact of giving the course D1 on the utility of G12, we cut off the
negative path produced by "Student motivation" −(+)→ "Workload of
P1" −(−)→ "Utility of P1". Practically, this means that P1 evaluates
the following hypothetical situation: "if the course D1 were given by
another colleague, what would be the impact (I1) of D1 on my utility, without
taking into account the workload induced by D1?"
2. Similarly, we cut off the positive path produced by "Student motivation"
−(+)→ "Research quality of G12" −(+)→ "Utility of G12" −(+)→
"Utility of P1". By doing so, we can see the impact (I2) of giving
the course D1 on the workload of P1 without the positive impact
induced by the group G12. Practically, this means that P1 evaluates the
following hypothetical situation: "What would be the impact (I2) on my
utility if I gave the course D1 to another group that has no connection
with me?"
3. Finally, (a) if the impact I1 compensates I2, then D1 −(0)→ utility of P1;
(b) if I1 is more valuable than I2, then D1 −(+)→ utility of P1; (c) if I1 is less
valuable than I2, then D1 −(−)→ utility of P1.

Suppose that P1 believes that the impact of giving the course D1 produces
effects on her utility, via her research group, which are more valuable than
the workload this course gives her. In these conditions, we have:

DC\VC   Utility of G12   Utility of P1
D1      +                +
D2      0                +

It is clear here that decision D1 would be preferred over decision D2, because
D1 has a positive impact both on P1's utility and on G12's utility.
Conversely, D2 has only a limited impact because it only positively influences
the utility of P1.
We now describe how the CM-RELVIEW tool can be used by decision makers for
their qualitative decision making (QDM). Decision makers (DMs) can elicit causal
knowledge about their decision and utility variables from different sources,
including documents (such as corporate reports or memos), questionnaires,
interviews, grids, and interaction and communication with other agents.
They then use the relation editor of CM-RELVIEW to fill matrices encoding
this causal knowledge, use the GRAPH button to transform those matrices
into graphs (causal maps), and finally analyze those causal maps using the
TRANS button.
By pressing the TRANS button in the menu window of CM-RELVIEW (Fig. 2), a
decision maker (DM) can calculate the transitive closure, i.e., the total
effect that a decision has on the utility variable. In the case where there
is an undetermined result, CM-RELVIEW applies the algorithm introduced
previously and asks the DM for some guidance to resolve it. In particular,
the DM is asked to supply (1) the impact of the positive and negative paths
and (2) the most valuable impact. A fully automated process for solving the
undetermined-result problem is on the agenda of our future work.

4 CMs as a Tool for Studying Changes in Organizations of Agents

In multiagent systems, the study of organizations of agents has generally
focused on structural models such as: (1) centralized and hierarchical
organizations, (2) organizations as authority structures, (3) market-like
organizations, and (4) organizations as communities with rules of behavior.
All these structures miss the dynamic aspects and influences that exist in an
organization of agents.
Weick 13 suggested changing the prevalent static view of an organization
of agents to a dynamic view that is sustained by change. Precisely, he
proposed that organization and change are two sides of the same social
phenomenon. His reasoning was that an organization is a process of co-evolution of
agents' perceptions, cognitions and actions. In this context, Weick proposed
a theory of organization and change based on the graphs of loops in evolving
social systems. In the last decade, additional investigations guided by this
approach 3,4 have tried to articulate how CMs provide a way to identify the loops
that produce and control an organization.
As an example, consider the organization that binds researchers, grant
agencies and qualified personnel in any (science and engineering) department.
The causal map representing this organization is shown in Fig. 4. The meaning
of this CM is clear and needs no further explanation.

Figure 4. An organization of agents as loops.

In this causal map, concepts link together to form loops, some of which
are numbered (1) to (7). Loops (1) and (4)-(7) are deviation-amplifying loops.
Change in the organization is the result of such loops, because any initial in-
crease (or decrease) in any concept loops back to that concept as an additional
increase (or decrease) which, in turn, leads to more increase (or decrease).
Loops (2) and (3) are deviation-countering loops 4. The stability of the
organization is the result of such loops. In the case of loop (2), for instance,
an increase of resources for research can lead to an increase of salaries which,
in turn, reduces the resources available for research. If this reduction is not
enough to compensate for the initial increase of resources, then a residual increase
of salaries takes place which, in turn, reduces the resources, and so on, until a
balance between the initial increase of resources and salaries is reached. Thus,
deviation-countering loops are useful for stabilizing the growth generated in
an organization.
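The classification of a loop follows directly from the labels along it: multiplying the signs around the cycle, an even number of negative links yields a deviation-amplifying loop and an odd number a deviation-countering one. A small sketch, where the example loop is reconstructed from the prose above rather than taken from the actual Fig. 4 data:

```python
# Sketch of loop classification in a CM, assuming the standard rule that an
# even number of '-' links around a cycle amplifies deviations while an odd
# number counters them. The example loop is reconstructed from the prose.

def classify_loop(signs):
    """signs: the '+'/'-' labels of the links around one closed loop."""
    negatives = sum(1 for s in signs if s == '-')
    return 'deviation-amplifying' if negatives % 2 == 0 else 'deviation-countering'

# Loop (2) as described above: resources -(+)-> salaries -(-)-> resources.
print(classify_loop(['+', '-']))  # deviation-countering
```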
Notice that in a wholistic approach the whole constrains the concepts and
the relationships between them. With an organization of agents represented
under this wholistic approach, we obtain a dynamic system in which deviation-amplifying
loops are responsible for change and deviation-countering loops are
responsible for the stability of the organization. Using these loops, an individual
strategist can direct strategic change in the desired directions. This can be
achieved by (1) choosing and changing a loop or (2) choosing and changing a
set of loops.
Now let us see how the CM-RELVIEW tool can be used by decision makers
(DMs) for reasoning about organizational change. Here also, DMs elicit causal
knowledge about their organizations from different sources such as reports, memos,
questionnaires, interviews, etc. They then use CM-RELVIEW to construct causal
maps reflecting this causal knowledge and, finally, to analyze those causal maps.
As stated in Section 2, the submenu of the graph menu called WHOLISTIC-
APPROACH allows DMs to draw a "wholistic" causal map, whereas the menu
WHOLISTIC-CM of TEST allows them to test it by choosing and changing a
loop. Obviously, the loop to be changed should be a weak loop loosely coupled
to the system. CM-RELVIEW offers DMs the following actions for changing a
loop (from deviation amplifying to deviation countering, or vice versa): ADD-
NODE: adding a node; REM-NODE: removing a node; REP-NODE: replacing
a node; CHG-LABEL: changing the label of a link.

5 Conclusion and Future Work

We have first proposed a tool for qualitative reasoning based on cognitive
maps representing relationships between agents' beliefs. This tool allows users
to determine certain quantitative and qualitative features of any cognitive
map. We have then argued for the use of this tool in the context of multiagent
systems, particularly for reasoning about the interrelationships among a set of
individual and social concepts.

There are many directions in which the proposal made here can be extended.

• The full possibilities of relation algebra have yet to be exploited. Another
option is to study "fuzzy relations" between agents' concepts 15. Our
approach might be extended in this direction to take into account the many
degrees, including vague degrees, of influence between agents, such as none, very
little, sometimes, a lot, usually, more or less, and so forth 9,12.
• Applications such as the following must be investigated in greater
depth: (1) negotiation and mediation between agents in the case of reasoning
about subjective views; (2) knowledge available or necessary to agents in the
case of nested causal maps; (3) reasoning about the wholistic approach; and
(4) reasoning about social laws, particularly for qualitative decision making.

References

1. Axelrod, R., ed. Structure of Decision: The Cognitive Maps of Political
Elites. Princeton University Press, (1976).
2. Berghammer, R. and Schmidt, G. RELVIEW - A computer system for
the manipulation of relations. In Conf. AMAST-93, pp. 405-406, (1993).
3. Bougon, M. G. Uncovering Cognitive Maps: The Self-Q Technique, Priv.
Print Handb., Penn. State Univ., (1986).
4. Bougon, M. G. and Komocar, J. M. Directing strategic change: a dynamic
wholistic approach, in Mapping Strategic Thought, A. S. Huff, ed., Wiley
and Sons, pp. 135-163, (1990).
5. Buede, D. M. and Ferrell, D. Convergence Problem Solving: A Prelude
to Quantitative Analysis, IEEE Trans. Syst., Man, Cybern. 23, pp.
746-765, (1993).
6. Chaib-draa, B. and Desharnais, J. A relational model of cognitive maps.
Int. J. of Human-Computer Studies 49, pp. 181-200, (1998).
7. Huff, A. S. Mapping Strategic Thought, Wiley & Sons, Eng., (1990).
8. Kelly, G. A. The Psychology of Personal Constructs, New York: Norton,
(1955).
9. Kosko, B. Neural Networks and Fuzzy Systems, Prentice Hall, (1992).
10. Levi, A. and Tetlock, P. E. A cognitive analysis of Japan's 1941 decision
for war. Journ. of Conflict Resolution 24, pp. 195-211, (1980).
11. Nakamura, K., Iwai, S. and Sawaragi, T. Decision support using causation
knowledge base, IEEE Trans. Syst., Man, Cybern. SMC-12, pp. 765-
777, (1982).
12. Park, K. S. and Kim, S. H. Fuzzy cognitive maps considering time rela-
tionships, Int. J. Human-Computer Studies 42, pp. 157-168, (1995).
13. Weick, K. E. The Social Psychology of Organizing, Reading, MA: Addison-
Wesley, (1969).
14. Wellman, M. P. Inference in cognitive maps, Mathematics and Computers
in Simulation 36, pp. 1-12, (1994).
15. Zhang, W. R. NPN Fuzzy Sets and NPN Qualitative Algebra: A Com-
putational Framework for Bipolar Cognitive Modeling and Multiagent
Analysis. IEEE Trans. Syst., Man, Cybern. 26(4), pp. 561-574, (1996).
USER'S ONTOLOGY-BASED AUTONOMOUS INTERFACE AGENTS

TAREK HELMY, SATOSHI AMAMIYA, MAKOTO AMAMIYA


Graduate School of Information Science and Electrical Engineering
Kyushu University, 6-1 Kasuga-Koen, Kasuga
Fukuoka 816-8580, Japan
Email: {helmy, roger, amamiya}@al.is.kyushu-u.ac.jp

This paper proposes methods by which a user's preferences for WWW pages can be
inferred from the user's behaviors. Both explicit and implicit feedback are used to
infer the user's preferences. In the explicit feedback mode, a user evaluates a selected page
as interesting or not according to the relevancy of the page to the given query and sends
an explicit feedback. In the implicit feedback mode, a user browses the pages, performing,
for instance, bookmarking, saving, printing, scrolling, enlarging, closing, reading, or jumping to
another link, and the system infers from these operations how much the user was interested in
the page. Users browse pages with Kodama's simple browser, in which an
interaction agent monitors the user's behaviors and a learning agent infers the user's
preferences from the interaction agent. The results show that the proposed techniques for
learning and using user preferences in refining the given query and filtering the retrieved
documents greatly enhance the retrieval of relevant information.

1 Introduction

The number of information sources available to the Internet user has become
extremely large. This information is loosely held together by annotated connections,
called hyperlinks [3], [12]. This makes locating relevant information consistent with
the user's information need very difficult. Users normally face very large
hit lists with low precision when using a Traditional Search Engine (TSE).
Moreover, the information gathering and retrieving processes in the TSE are
independent of the user's preferences, and feedback from the latter process is
therefore hardly exploited to improve the quality of the former. These factors make it
necessary to investigate new techniques to address these problems. Intelligent
agents, acting as active personal assistants, may be the way to improve the search
and retrieval process. Researchers in Artificial Intelligence (AI) and Information Retrieval (IR)
have already succeeded in developing agent-based techniques to automate
tedious tasks and to facilitate the management of information flooding [4], [5], [15].
Kodama1 is a distributed multi-agent system for IR in large, dynamic and distributed
environments such as the WWW. The approach is based on a distributed, adaptive and
on-line agent population negotiating and making local decisions to retrieve the
information most relevant to the user's query.

1
Kyushu University Open Distributed Autonomous Multi-Agent.


In this paper we briefly describe the mechanism of agentifying a Web site
and creating WPA communities; the main focus is on the User Interface Agent
(UIA). We discuss our new methodologies by which the UIA calculates relevancy
using the User's Preferences (UP). Next, we introduce ways to model a user's interests
and show how these models can be deployed for more effective information
retrieval and filtering. We describe the adaptation techniques used in the UIA and
how the UIA makes use of the user's query history and bookmark files as the UP.
Finally, we present experimental results and future work on Kodama.

2 Web Site Agentification

Cooperating intelligent Kodama agents are employed to agentify the Web, where the
hyper structure preexists in the form of Web links [12]. Our system uses three
types of Kodama agents in the agentification mechanism for searching the Web:
a Server Agent (SA) assigned to each Web server, a Web Page Agent (WPA)
assigned to each Web page, and a User Interface Agent (UIA) assigned to each
user's machine [6], [7], [8], [9].
An SA is assigned to, and responsible for, one Web server. The SA starts from the
portal address of the Web server and creates the hyper structure of WPA
communities based on the hyperlink structure in the Web server. We introduce a
definition of WPA community that enables the SA to focus effectively on a narrow
but topically related subset of WPAs and to increase the precision of search results.
The SA knows all WPAs in the server and works as a gateway when WPAs
communicate with each other or with a WPA in another server. The SA initiates all
WPAs in its server when it starts searching for information relevant to the user's query.
Each WPA registers itself with the SA and takes essential properties and principles
given by the SA to create the Interpretation Policy (IP), an ontology that
represents the context of the Web page. Each WPA has its own parser, to which the
WPA passes a URL, and a private IP, in which the WPA keeps all the policy
keywords found in its URL. At the retrieval phase, WPAs, upon receiving a user's
query from the SA, initiate the search by interpreting the query and either asking 'Is this
yours?' or announcing 'This is yours' to their down-chain WPAs. The selected WPAs
and/or their down-chain WPAs of each Web server, in turn, interpret the query
based on both Query-IP and Query-URL similarities and reply 'This is
mine' with some confidence or 'Not mine' (zero confidence). For more information
about the IP representation and relevancy measurement by the WPA, see [8], [9].

3 User Interface Agent Autonomy

The UIA resides in the user's machine, communicates with the WPAs via an SA to
retrieve information relevant to the user's query, and shows the results returned by
the WPAs to the user after filtering and re-ranking them. Monitoring the user's
browsing behavior is accomplished via a proxy server that allows the UIA to inspect
HTTP requests from its browser. The UIA receives the user's responses of
interest/no interest to the results and regards them as rewards. The
UIAs in the Kodama system look over the shoulders of the users and record every
action into the query history file. After enough data has been accumulated, the
system uses this data to predict a user's action based on the similarity of the current
query to already encountered data. The following is the job stream of the UIA.
(1) The user starts by sending a Natural Language (NL) query to the UIA.
(2) The UIA analyzes the NL query using a simple NL processing algorithm, throws
out irrelevant words, and reformulates and transforms it into Qt.
(3) The UIA calculates the similarity with the method described here and looks for
relevant URLs in the UP files using equations (5) and (6).
(4) If the UIA finds relevant URLs in the UP, it shows them and asks whether
the user is satisfied or wants to search the Web.
(5) If relevant queries are found in the UP, the UIA takes the two queries from the
UP whose similarity to the given query is over a predefined threshold value and
concatenates the given query with the keywords of these two queries, after
removing the redundant terms, to expand Qt.
(6) The UIA takes a set of queries from the UP whose similarity to the given query
is over a predefined threshold value. The UIA then makes a context query
from them and Qt, to be used for filtering the retrieved documents.
(7) If the user is not satisfied with the relevant URLs from the UP files, the UIA
routes Qt to a relevant SA, which in turn forwards it to its community of WPAs
(in the current version the UIA routes the query to default or predefined SAs).
(8) The UIA receives the search results returned by the WPAs via the SA. The
results consist of a set of contents of Web pages.
(9) The user either explicitly evaluates the retrieved documents using the UIA's
feedback, or the system implicitly detects the user's response.

3.1 User's Preferences Creation and Relevancy Algorithm by UIA


Recording and analyzing the user's access histories and bookmarks is essential
for the UIA to capture his/her preferences. The query history file contains
information about previously visited URLs for specific queries, the number of
times each URL was visited, the times of visiting and leaving, and the query.
The bookmark file contains the user's hot-list of Web links, the number of
times a URL was visited, the bookmarking time of the URL, and its title. The
query and title fields in the query history and bookmark files are represented as
vectors of keywords sorted in alphabetical order, where a weight value is assigned to
each keyword to reflect its correlation with the content of the page. The user's explicit
response (ℜ) is Useless, Not very useful, Mildly interesting, Neutral, or Interesting,
and each has a value between 0 and 1. The UIA maps the user's implicit responses
to have the same effect as the explicit responses. When looking up relevant URLs
from the UP, the UIA calculates similarities as follows:
First, we define equations to calculate the similarity between a user's query
and his/her query history file. Assume we have a query history file and a bookmark
file of n URL lines. Q_in = <k_1, k_2, ..., k_n> stands for the vector of keywords,
sorted in alphabetical order, of the query given by the user. Q_j^h = <K_{j,1}^h,
K_{j,2}^h, ..., K_{j,m}^h> (1 ≤ j ≤ n) stands for the vector, sorted in alphabetical
order, of the query of the j-th line in the user's query history file, where
K_{j,i}^h = k_{j,i}^h · w_{j,i}^h, k_{j,i}^h is the i-th keyword in the j-th line and
0 ≤ w_{j,i}^h ≤ 1 is its weight. Similarly, T_j = <K_{j,1}^b, K_{j,2}^b, ...> with
K_{j,i}^b = k_{j,i}^b · w_{j,i}^b is defined for the title of the j-th line in the
user's bookmark file. The weights w_{j,i}^h and w_{j,i}^b are incrementally computed
with the number t_j of visits to URL_j:

    w_{j,i}(t_j + 1) = ρ · w_{j,i}(t_j) + (1 − ρ) · ℜ    (1)

where w_{j,i} means w_{j,i}^h or w_{j,i}^b, and 0 ≤ ℜ ≤ 1 is the user's response
described above. The initial value w_{j,i}(1) is set by the user's first response.
0 ≤ ρ ≤ 1 is a function of t_j, i.e., ρ(t_j), which determines how much of the
user's response history on the keyword is involved in calculating and adapting the
next weight w_{j,i}(t_j + 1). Notice that w_{j,i} represents the accumulated user
preference for the keyword in the j-th line. We calculate the similarity S_j^h
between Q_in and the query field of the j-th line of the user's query history file,
and the similarity S_j^b between Q_in and the title field of the j-th line of the
bookmark file:

    S_j^h = Σ_i w_{j,i}^h · g(k_i)    (2)        S_j^b = Σ_i w_{j,i}^b · g'(k_i)    (3)

where g(k_i) = 1 if k_i ∈ Q_in ∩ Q_j^h, otherwise g(k_i) = 0, and g'(k_i) = 1 if
k_i ∈ Q_in ∩ T_j, otherwise g'(k_i) = 0. We also calculate the similarity S_j^url
between Q_in and the URL of the j-th line:

    S_j^url = s_url / (c_in + γ · d_j)    (4)

where c_in = |Q_in|, s_url = |Q_in ∩ URL_j|, d_j = |URL_j|, and URL_j stands for the
set of words in the URL of the j-th line; the weighting factor 0 ≤ γ ≤ 1 is defined
by heuristics. The total similarity between the user's query and his/her query
history file is then calculated by the following equation, using a heuristic
weighting factor 0 ≤ α ≤ 1:

    arg max_j (α · S_j^url + (1 − α) · S_j^h)    (5)

Second, in the same way, we calculate the total similarity between the user's
query and his/her bookmark file, using a heuristic weighting factor 0 ≤ β ≤ 1:

    arg max_j (β · S_j^url + (1 − β) · S_j^b)    (6)
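These measures can be sketched in code. A constant ρ replaces the paper's ρ(t_j), the dictionary layout of a history line is our assumption, and, since the printed form of equation (4)'s denominator is garbled, the normalization used below is likewise an assumption.

```python
# Sketch of the UIA's weight update and similarity measures. A constant rho
# replaces rho(t_j); the data layout and the exact denominator of the URL
# similarity are illustrative assumptions.

def update_weight(w, response, rho=0.7):
    # Eq. (1): w(t+1) = rho * w(t) + (1 - rho) * R, with response R in [0, 1].
    return rho * w + (1 - rho) * response

def history_similarity(query, keywords):
    # Eq. (2): sum the stored weights of keywords shared with the query.
    return sum(w for k, w in keywords.items() if k in query)

def url_similarity(query, url_words, gamma=0.5):
    # Eq. (4): query/URL word overlap, normalized by |Q_in| and |URL_j|.
    return len(query & url_words) / (len(query) + gamma * len(url_words))

def total_similarity(query, line, alpha=0.6):
    # Eq. (5): weighted mix of URL similarity and query-field similarity.
    return (alpha * url_similarity(query, line['url_words'])
            + (1 - alpha) * history_similarity(query, line['keywords']))
```

The best-matching history line is then the arg max of `total_similarity` over all lines j; the bookmark side (equations (3) and (6)) is identical with the title field and β in place of the query field and α.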

3.2 Context Dependent Reformulation of the Query by the UIA


A query modification is a set of extra keywords or phrases that will be added to the
given query to increase the likelihood that the results are narrowly focused on the
desired information. Query reformulation by the UIA means that the query used internally
differs from the one submitted by the user, so as to be more representative of the
user's intent. Modifications include removing terms or adding context terms
to enhance retrieval precision. The UIA takes the two queries from the UP
whose similarity to the given query is over the predefined threshold value. Then the
UIA concatenates the given query with the keywords of these two queries, after
removing the redundant terms, to form Qt.
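This reformulation step can be sketched as follows, assuming the UP queries arrive with precomputed similarity scores; the function name and threshold value are illustrative, not part of the system's published interface.

```python
# Sketch of context-dependent query expansion: concatenate the given query
# with the keywords of the two most similar UP queries above a threshold,
# removing redundant terms. Scores and names are illustrative assumptions.

def expand_query(query, up_queries, threshold=0.5):
    """query: keyword list; up_queries: (similarity, keyword-list) pairs."""
    ranked = sorted(up_queries, key=lambda p: p[0], reverse=True)
    top_two = [kws for sim, kws in ranked if sim > threshold][:2]
    expanded = list(query)
    for kws in top_two:
        expanded += [k for k in kws if k not in expanded]  # drop duplicates
    return expanded

print(expand_query(['agent'],
                   [(0.9, ['agent', 'web']), (0.8, ['java']), (0.3, ['cooking'])]))
# ['agent', 'web', 'java']
```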

3.3 User Preferences Content-Based Filtration by the UIA


Intelligent Kodama agents rely on UPs, which play a fundamental role in actively
searching for and finding relevant information from the Web on behalf of their users.
Kodama's UIA represents the UP as a set of categories; each category is a set of URLs
and queries whose similarity is over a predefined threshold value, and reflects a specific
user's category of interest. In Kodama, we are developing a new UP-based filtering
technique as a novel information-filtering mechanism. The UIA uses the UP contents to
improve the retrieval of relevant information and to filter the retrieved
results. The UIA takes a set of queries from the UP whose similarities to the given
query are over the predefined threshold value and combines them with the given
query to form a new contextual query that represents the user's current interest. The
UIA uses the contextual query vector to filter the retrieved documents. For each
retrieved page, a keyword vector (title, headers) is calculated and compared
with the contextual query vector associated with the user's current category of interest.
This comparison determines which documents in the retrieved results are relevant (have a
similarity value with the contextual query vector over a predefined threshold value)
and which are not relevant to the user's current category of interest.
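A sketch of this filtering step; the paper does not name the vector-comparison measure, so cosine similarity over keyword-weight dictionaries is assumed here, and the threshold value is illustrative.

```python
# Sketch of UP-based filtering: keep retrieved pages whose title/header
# keyword vector is close enough to the contextual query vector. Cosine
# similarity and the threshold value are illustrative assumptions.

import math

def cosine(u, v):
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def filter_results(pages, context_vec, threshold=0.3):
    """pages: (url, keyword-weight dict) pairs for the retrieved documents."""
    return [url for url, vec in pages if cosine(context_vec, vec) > threshold]

print(filter_results([('p1', {'agent': 1.0}), ('p2', {'cooking': 1.0})],
                     {'agent': 1.0}))  # ['p1']
```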

3.4 Implicit Response Implication by UIA


By observing the user's behavior, it is possible to infer implicit feedback without
requiring explicit judgments. Previous studies have shown reading time to be a
useful source for predicting UP implicitly [11], [13]. We are investigating other
sensors, in correlation with the time spent visiting the page, to let Kodama's UIA
detect the user's actual implicit response. These sensors are:
1. The size of the selected page and the number of links within the page, as the
user may visit some links within the page.

2. Monitoring the user's operations, such as saving, printing, bookmarking,
scrolling, minimizing, maximizing, or closing the page.
3. Jumping to another link, where the UIA distinguishes between two types of
links, depending on whether the two pages have the same or different domain names.
4. The selected page's server response, as the user may spend time only to have
the server report an unknown domain.
5. Other heuristic factors, such as the type of the page (text, image, applet, or a
mixture) and the number of visits to the page.

We therefore decided to give a reward to a keyword's weight for a URL when
the user performs a "Print", "Save", or "Bookmark" operation on the page. That is,
these operations modify a keyword's weight with the same effect as the
"Interesting" explicit feedback. Conversely, when the user quickly jumps to
another link or closes the browser without printing or bookmarking a URL,
the effect is the same as when the user explicitly sends a "Useless"
response.
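The mapping just described can be sketched as a small table from observed operations to reward values. The numeric values are illustrative assumptions, since the paper fixes only the [0, 1] range and the two extreme cases.

```python
# Sketch of the implicit-to-explicit response mapping. The reward values
# are illustrative assumptions; the paper fixes only the [0, 1] range.

IMPLICIT_REWARD = {
    'print': 1.0, 'save': 1.0, 'bookmark': 1.0,  # same effect as "Interesting"
    'quick_jump': 0.0, 'close': 0.0,             # same effect as "Useless"
}

def implicit_response(operations):
    """Map the operations observed on one page to a single reward in [0, 1]."""
    rewards = [IMPLICIT_REWARD[op] for op in operations if op in IMPLICIT_REWARD]
    return max(rewards) if rewards else 0.5      # neutral when no signal
```

Taking the maximum means that a page that was printed and then closed still counts as interesting, matching the "without printing or bookmarking" condition above.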

3.5 Adaptability of UP by the UIA


Since the UIAs automatically create the UPs based on the users' preferences,
the UIA must improve the UP autonomously to converge to sudden changes of the
user's interest over time. In contrast to other systems that learn a UP and then
simply use it [1], [2], [4], [10], [13], [14], [15], the UPs of Kodama's users
continuously evolve with the users' dynamically changing preferences. In Kodama,
a UIA sends the user's query Qt to WPAs through the SA. The WPAs choose an action
to perform based on Qt and send the results back to the SA, which in turn forwards
the results to the UIA. The UIA picks a number of keywords Ks from the title and
the headers of the selected document, in addition to the keywords of Qt, and
creates a new list of keywords Kf for the feedback, Kf = Qt ∪ Ks. According to the
user's response, the UIA performs the following on the UP files:
• Modify the weight of the keyword using equation (1).
• Modify the number of visits.
• If one of the keywords does not exist in the query field, add it with an
initial weight reflecting the user's response.
• Refine the contents of the UP files by deleting keywords whose weights
are less than a predefined threshold value.
• If the selected URL does not exist in the UP, add a new record and
initialize its query field.
In this way, the UP evolves over time to reflect the user's interests, and
the keywords of the query and title fields are continually moved closer to or
farther from their URLs.

4 Experimental Results

We have performed several experiments to evaluate the Kodama system's
performance consistently and to verify that the mechanism of agentifying the Web is
useful for retrieving relevant information [6], [7], [8], [9].

Figure 1. Explicit and implicit response conversion.

In the first experiment, we measured how well Kodama's UIA adapts to the
user's interests over time using either the implicit or the explicit response
detection mechanism, and compared the adaptation effects of the two feedback
modes. We asked some users, who already understood how to browse pages using
Kodama, to use the system for some time: give queries, browse the answers,
and send explicit feedback according to their interests, while the UIA detected
the users' implicit responses in the background. The UP was automatically
created by the system and reflects the user's interests. We calculated the sum
of the keyword weights inferred by both the implicit (IM) and explicit (EX)
responses for each URL in the UP files, after refining the UP contents, and
then compared these values as follows:
    EX_i = Σ_k W_k    and    IM_i = Σ_k W_k

where W_k is the weight of keyword k in URL_i. The values calculated for fifty
URLs (Figure 1) show that EX and IM converge to each other, which means the UIA
can implicitly predict the user's interests when certain parameters and heuristics
are properly set. In one experiment the system was able to adapt to the user's
preferences using only implicit feedback, but the adaptation was not as precise
as with explicit feedback. By using both explicit and implicit feedback, the
Kodama system could adapt to the user's preferences quickly and precisely
without requiring too much effort from the users.
In the second experiment, we agentified several Web servers by giving their
portal addresses to the system; the system created the hyper structure of the
WPA communities based on the hyperlink structure of each Web server. We
calculated the precision of the URLs retrieved for the users' queries as the
number of relevant documents retrieved divided by the total number of documents
retrieved.

First, we gave 10 ambiguous queries to the system, disabled the UIA's query
expansion mechanism, and calculated the precision (P1) of the retrieved URLs for the
given queries. Second, we allowed the UIA to expand the given query from the UP
and submitted the new query to the community of WPAs, then calculated the
precision (P2) of the retrieved URLs for the expanded queries. Third, the UIA
creates the context query and uses it for filtering the retrieved documents by
comparing these documents with the context query; we then calculated the precision
(P3) of the filtered URLs for the context queries.
The results depicted in Figure 2 show that Web page agentification, together with the
query expansion and filtering done by the UIA, helps deliver relevant
information to the users and supports the use of Kodama as a pinpoint IR system.

Fig. 2 Precision of the retrieved URLs of the queries to an agentified domain

In the Third Experiment: we measured how well Kodama can correlate
the contents of the URLs with the queries over time in order to predict the user's
preferences. The user starts with UP files containing different URLs and issues each of
the following five queries three times: "Conferences and workshops of agents," "Natural
language processing agent," "Electronic commerce agent systems," "KQML agent
communication language," and "Collaborative intelligent interface agents." For
each query, the user browses and sends back an evaluation of the selected URLs,
either from the Web or from the UP files. At this point, the UP contents have
already been customized by the system to the user's current interests, and the URLs become
correlated with the queries through the insertion of new keywords, the deletion of non-relevant
keywords, and the modification of the weights of the existing keywords. After that, the user
gives a more general and ambiguous query, "intelligent agent systems." The URLs
retrieved from the UP files that satisfied the user are the most relevant URLs in the UP for the
given query. Figure 3 shows that the retrieved URLs with a high similarity weight
are relevant to the given query, while the rest of the UP contents have neutral
or zero similarity weight.
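The ranking by similarity weight can be sketched as follows; treating similarity as the sum of profile keyword weights that match the query terms is an illustrative simplification of the system's actual weighting.

```python
def rank_profile(query, profile):
    """profile: {url: {keyword: weight}}.
    Return URLs sorted by the summed weight of profile keywords
    that appear among the query terms."""
    terms = set(query.lower().split())
    scored = {url: sum(w for kw, w in kws.items() if kw in terms)
              for url, kws in profile.items()}
    return sorted(scored, key=scored.get, reverse=True)
```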

5 Conclusion and Future Work

This paper discussed the Kodama system, developed and in use at Kyushu University,
as a multi-agent-based approach to building a pinpoint IR system. The paper introduces

methods for learning the user's preferences by monitoring the user's behavior while
browsing the WWW online. It focuses on a methodology to determine the UP, as
autonomously as possible, by evaluating the user's operations when online Web pages
are accessed. We reported methods to exploit the UP adaptively in Kodama's UIA.
We carried out several experiments to investigate the performance of the UIA in
the Kodama system. Through these experiments, we confirmed that Kodama's UIA learns
and adapts to the UP over time. Moreover, the system is able to change the weights
of some keywords and to classify URLs in the query history and bookmark files in a way
that reflects the user's interest in these keywords of the related URLs. A future step for
Kodama is extending our experiments to multiple SA domains and developing a
smart query routing mechanism in the UIA for routing the user's query. Routing
refers to the process of selecting the appropriate SA to be queried and forwarding
queries to it, instead of sending the query to all SAs and gathering a large amount of
noisy Web pages inconsistent with the user's information need.
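Such a router could be sketched as follows, under the assumption (made here for illustration only) that each SA advertises a set of domain keywords describing its agentified Web server:

```python
def route(query, sa_domains):
    """sa_domains: {sa_name: set of domain keywords}.
    Forward the query to the single best-matching server agent
    instead of broadcasting it to all SAs."""
    terms = set(query.lower().split())
    return max(sa_domains, key=lambda sa: len(terms & sa_domains[sa]))
```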
[Figure: bar chart of similarity weights (roughly 0.2 to 1.8) for the URLs in the
UP, most of them pages under http://agents.umbc.edu/]
Figure 3. URLs' correlation with the relevant keywords

INTEGRATION AND REUSE OF HETEROGENEOUS XML
DTDS FOR INFORMATION AGENTS

EUNA JEONG
Computer Science and Information Eng., National Taiwan University
E-mail: eajeong@agents.csie.ntu.edu.tw

CHUN-NAN HSU
Institute of Information Science, Academia Sinica
E-mail: chunnan@iis.sinica.edu.tw

This paper proposes a novel approach to integrating heterogeneous XML DTDs.


With this approach, an information agent can be easily extended to integrate het-
erogeneous XML-based contents and perform federated searches. Based on a tree
grammar inference technique, this approach derives an integrated view and source
descriptions of XML DTDs in an information integration framework. The deriva-
tion takes advantage of naming and structural similarities among DTDs in similar
domains. The complete approach consists of three main steps. (1) DTD clustering
clusters DTDs of similar domains into classes. (2) Schema learning takes the DTDs
in a class as input and applies a tree grammar inference technique to generate a set
of tree grammar rules. (3) Minimization optimizes the rules previously generated
and transforms them into an integrated view as well as source descriptions. We
have implemented the proposed approach in a system called DEEP and tested the
system in artificial and real domains. Experimental results reveal that
DEEP can effectively and efficiently integrate radically different DTDs.

1 Introduction

Software agents 1,2 and integration systems for heterogeneous databases 3,4,5,6
are widely studied and developed to give users the ability to find, collect, filter,
and manage information sources spread across the Internet. The design concerns
of these systems vary across domains, but all share a common need for
a layer consisting of an integrated view and source descriptions in order to seamlessly
integrate heterogeneous information sources. The integrated view must be
designed for each application domain. Source descriptions are needed to map
source schemas to the integrated view. However, previous work in information
integration requires both of them to be constructed manually, in a laborious
and time-consuming manner.
The approach presented in this paper is based on previous work in information
integration. In particular, this approach addresses the problem of the
automatic derivation of the integrated view for XML DTDs (Document Type
Definitions).7 Although XML is becoming an industrial standard for exchanging
data on the Internet, it is difficult and sometimes impossible to have such
a common DTD when the maintenance of the information sources is independent
of the integrator.

Table 1. Example DTDs

(a) COOKBOOK DTD

1 <!ELEMENT cookbook (title, author*, year, isbn, publisher)>
2 <!ELEMENT author (authorname)>
3 <!ELEMENT authorname (firstname, lastname)>
4 <!ELEMENT publisher (name, address)>

(b) BIB DTD

5 <!ELEMENT bib (title, author*, publisher, price)>
6 <!ATTLIST bib year CDATA #REQUIRED>
7 <!ELEMENT author (last, first)>
8 <!ELEMENT publisher (name, email)>
The remainder of the paper is organized as follows. Section 2
reviews XML and information integration. Section 3 describes our view inference
approach. Section 4 contains the experimental results. Finally, Section 5
reviews related work and draws conclusions.

2 XML Information Integration

XML data is an instance of semistructured data. With a DTD, XML is self-


descriptive and provides a semistructured data model. Thus, we model a DTD
as a labeled, directed tree. The tree nodes represent objects and are labeled
with an element or attribute name. The leaf nodes represent atomic objects
and the internal nodes represent complex objects.
Type Each internal node in the tree has its own type. The type of an
object is defined by its label and its immediately adjacent child nodes. XML
attributes are treated in the same way as element tags. Each type is denoted
by t_i, where i is the type id. All leaf nodes (i.e., of #PCDATA type) belong to t_0.
Each internal node type has a type definition of the form [label : Type(label)],
where label is a regular expression over a finite set M of names; Type(label) is
either #PCDATA for leaf nodes or a regular expression over M with type ids
as subscripts.
DTD Schema A DTD schema consists of a sequence of type definitions.
DTD Class A DTD class consists of similar DTD schemas.
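Under this model, a DTD such as COOKBOOK in Table 1 can be sketched as a mapping from each internal label to its child labels; leaf labels (type t0, i.e., #PCDATA) are then exactly the labels that never occur as internal nodes. The dictionary encoding below is an illustrative simplification, not the system's internal representation.

```python
COOKBOOK = {  # internal label -> child labels (occurrence markers kept)
    "cookbook": ["title", "author*", "year", "isbn", "publisher"],
    "author": ["authorname"],
    "authorname": ["firstname", "lastname"],
    "publisher": ["name", "address"],
}

def leaf_labels(dtd):
    """Labels that never appear as internal nodes, i.e. #PCDATA leaves (t0)."""
    children = {c.rstrip("*+?") for kids in dtd.values() for c in kids}
    return children - set(dtd)
```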

Example 1 Table 1 gives two example DTDs extracted from published papers
and documents.8,9 Here, the COOKBOOK and BIB DTDs represent two related
domains. Although these DTDs are created by different authors, they reveal

Figure 1. XML Information Integration Agent

structural and naming similarities because the underlying concepts are closely
related. Given the set D of source DTDs in Table 1, the following type set T
can be constructed. The underlined label, such as year_0 of t_5, means that it
corresponds to an XML attribute. •
t_1 = [cookbook : (title_0, (author_2)+, year_0, isbn_0, publisher_4)];
t_2 = [author : (authorname_3)];
t_3 = [authorname : (firstname_0, lastname_0)];
t_4 = [publisher : (name_0, address_0)];
t_5 = [bib : (title_0, (author_6)+, publisher_7, price_0, year_0)];
t_6 = [author : (first_0, last_0)];
t_7 = [publisher : (name_0, email_0)];

Figure 1 shows an information integration agent (IIA) for XML documents
with different DTDs. The user submits a request to the system through a
user interface. The request is then translated into an XML-QL 8 query by a
query decomposer. Given the query, and based upon an integrated view, the
server transforms the query into a set of subqueries against each integrated
information source. Finally, the query executor issues the subqueries to each
information source, integrates the results, and returns the requested data to
the user as an XML document.
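The decomposition step can be sketched as rewriting integrated-view labels (such as cookbook|bib) into the labels a particular source understands; the mapping table below is illustrative and follows the integrated view derived later in Example 4.

```python
# illustrative mapping from integrated-view labels to COOKBOOK labels
TO_COOKBOOK = {
    "cookbook|bib": "cookbook",
    "author|authorname": "authorname",
    "first|firstname": "firstname",
    "last|lastname": "lastname",
}

def specialize(path, mapping):
    """Rewrite each element label of a query path for one source."""
    return [mapping.get(label, label) for label in path]
```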
Previous work in information integration usually assumes that the integrated
view is given. This paper, however, describes how to automatically
derive the integrated view by way of a view inference system. The derivation
is conducted offline, before the IIA begins to provide service.
The view inference system serves to automatically discover the association
between closely related DTDs, identify elements with similar underlying se-
mantics, and generate an integrated view that covers these semantically similar
elements.

Figure 2. View Inference System diagram (process flow and data flow)

3 View Inference System


Our view inference approach consists of the following three major components
(shown in Figure 2): DTD clustering, schema learner, and minimizer. A brief
description of each module is as follows: DTD clustering takes a collection
of source DTDs as input and clusters them into DTD classes based on DTD
similarities; Schema learner infers the general rules describing source DTDs
in each DTD class; Minimizer optimizes the learned rules. The learned rules
are first adjusted to fit the characteristics of DTDs and then transformed
into both the integrated view and the source descriptions to be used in the
information integration agent.

3.1 Renamer
The renamer, a preprocessing step, is an optional module that requires human
intervention. The internal nodes in XML DTDs offer both naming and structural
hints that allow the system to conveniently associate related elements
in the different DTDs, while leaf nodes offer very limited information to the
system. The renamer module is designed to allow human users to provide additional
hints for the system to associate related leaf nodes. In the case of leaf
nodes, an element name can be manually renamed to another internal/leaf
element name in different DTDs so that they will be considered as sharing
the same underlying concept. For instance, in Example 1, element first can
be changed to firstname.
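Applying such a rename table to a DTD tree is straightforward; a sketch on (label, children) tuples, which are an assumed representation for illustration:

```python
def rename(tree, table):
    """Replace element labels throughout a tree according to a
    user-supplied rename table; tree = (label, children)."""
    label, kids = tree
    return (table.get(label, label), tuple(rename(c, table) for c in kids))
```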

3.2 DTD clustering


Since we make no assumption that the input DTDs must describe the same
domain, the possibility exists that the input DTDs may describe drastically
different domains. Therefore, DTDs need to be clustered into classes of similar
domains so that the system may meaningfully derive an integrated view. This

task is the goal of our DTD clustering approach.


As a preprocessing step, we merge types before clustering. The purpose
is to reduce the number of types, as well as the distance between DTD trees.
This will allow DTDs of similar domains to have a better chance to be clus-
tered together.
Example 2 Given the type set T in Example 1, the merged type set T_m is generated.
In this example, types t_4 and t_7 are merged into the new type t' = [publisher
: (name_0, address_0?, email_0?)]. Continuing from Example 1, all DTD schemas
are redefined according to T_m. The BIB DTD schema will be redefined with
three types, t_5, t_6, and t'. •
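A sketch of this merging step on child-label lists, where a label present in only one of the two types becomes optional in the merged type (a simplification of the full regular-expression merge):

```python
def merge_types(a, b):
    """Merge two types' child labels; labels present in only one
    input type are marked optional ('?') in the result."""
    order = dict.fromkeys(list(a) + list(b))  # preserve first-seen order
    return [lbl if lbl in a and lbl in b else lbl + "?" for lbl in order]
```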
We employ a hierarchical clustering method 11 used widely in information re-
trieval. The basic idea is the following: initially start with a separate class
for each DTD; successively merge the classes closest to one another, until the
number of classes is sufficiently small. We extend Lu's algorithm 12 to com-
pute the distance between two labeled trees. The distance is computed by
calculating the minimum number of modifications required to transform the
input tree into a reference tree. For computing the distance between classes,
the average distance is used.
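A minimal sketch of this average-linkage agglomerative procedure follows; the tree distance function `dist` is left abstract here (the paper uses an extension of Lu's node splitting and merging algorithm), so the example test uses a plain numeric distance.

```python
def agglomerate(items, dist, target):
    """Average-linkage hierarchical clustering: start with a singleton
    class per item and repeatedly merge the closest pair of classes
    until only `target` classes remain."""
    classes = [[x] for x in items]
    while len(classes) > target:
        best = None
        for i in range(len(classes)):
            for j in range(i + 1, len(classes)):
                # average pairwise distance between the two classes
                d = sum(dist(a, b) for a in classes[i] for b in classes[j])
                d /= len(classes[i]) * len(classes[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        classes[i] += classes.pop(j)
    return classes
```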

3.3 Schema Learner


Now that DTDs in similar domains are clustered into classes, the next task is
to generate an integrated schema covering the DTDs in each class. We address
this problem with a tree grammar inference approach. Grammatical inference
is the task of inducing hidden grammatical rules from a set of examples.
The problem of deriving an integrated schema from similar DTD schemas can
be reduced to this task. We adopt the k-follower method,10 which applies a
simple state-merging heuristic process. Given a DTD class, the schema learner
generates a tree automaton that describes the DTDs (as trees) in the DTD class.
The corresponding tree grammar of the tree automaton describes an infinite
language containing the input trees in the DTD class.

Definition 1 Let S be a given finite set of trees, and Ŝ be the union of S and
S_sub, the set of all subtrees of the member trees in S. Let k be a
nonnegative integer. The k-follower H_S^k(T) of tree T with respect to S is
defined by
H_S^k(T) = { U(b ⇐ $) | U ∈ Ŝ and U/b = T }
where tree U satisfies one of the following: if U ∈ S, then Depth_U(b) ≤ k; if
U ∈ S_sub, then Depth_U(b) = k. Here $ is a special character not in the set M,
U(b ⇐ $) is the replacement of the subtree of U at b with $, and Depth_U(b)
is the number of nodes on the path from the root of U to b, excluding b. •
Our tree grammar inference algorithm is formally presented as follows:
Algorithm 1 Given a set S of source DTDs,
Step 1. Generate the set Ŝ of all subtrees of the member trees in S and initialize
k to 0.
Step 2. For each subtree T in Ŝ, generate the k-follower with respect to the
set S. If the k-follower sets of two trees are equal, then the corresponding states
of the automaton belong to the same equivalence class.
Step 3. If the equivalence classes have changed, then go to Step 2 with k
increased by 1. Otherwise, go to Step 4.
Step 4. Generate the state transition functions δ.
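The state-merging loop of Steps 1-3 can be sketched as follows, representing trees as (label, children) tuples. This is an interpretation of Definition 1 for illustration (in particular, the exact depth condition for members of S versus proper subtrees is our reading), not the authors' implementation.

```python
HOLE = ("$", ())  # the special character $ of Definition 1

def subtrees(t):
    yield t
    for c in t[1]:
        yield from subtrees(c)

def positions(t, path=()):
    yield path
    for i, c in enumerate(t[1]):
        yield from positions(c, path + (i,))

def at(t, path):
    for i in path:
        t = t[1][i]
    return t

def replace(t, path, sub):
    if not path:
        return sub
    kids = list(t[1])
    kids[path[0]] = replace(kids[path[0]], path[1:], sub)
    return (t[0], tuple(kids))

def k_followers(S, k):
    """k-follower set of every subtree of the trees in S (Definition 1)."""
    S_set = set(S)
    all_subs = {u for t in S for u in subtrees(t)}
    fol = {}
    for T in all_subs:
        fs = set()
        for U in all_subs:
            for b in positions(U):
                if at(U, b) != T:
                    continue
                d = len(b)  # Depth_U(b): nodes from root to b, excluding b
                if (U in S_set and d <= k) or (U not in S_set and d == k):
                    fs.add(replace(U, b, HOLE))
        fol[T] = frozenset(fs)
    return fol

def infer_classes(S, max_k=20):
    """Steps 2-3: group subtrees whose k-follower sets are equal and
    increase k until the equivalence classes stabilize."""
    prev = None
    for k in range(max_k):
        groups = {}
        for T, f in k_followers(S, k).items():
            groups.setdefault(f, set()).add(T)
        part = frozenset(frozenset(g) for g in groups.values())
        if part == prev:
            return part
        prev = part
    return prev
```

On two toy schemas sharing title and author children, the procedure merges the two root trees into one state while keeping title and author apart.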

Example 3 Suppose we are given the two modified DTD schemas of Example 2.
The input set S is a DTD class containing the two DTD schemas, and the
generated set Ŝ contains 15 subtrees. Algorithm 1 terminates when k
= 2 because its equivalence classes are the same as for k = 1. The inferred
tree automaton is M = ({F, q_1, q_2, ..., q_13}, M, δ, {F}), where the state
transition functions are as follows. (The corresponding tree grammar is
shown in Figure 3.) •
δ_email = q_1, δ_price = q_2, δ_isbn = q_3,
δ_address = q_4, δ_name = q_5, δ_title = q_6,
δ_year = q_7, δ_first|firstname = q_8, δ_last|lastname = q_9,
δ_author(q_8, q_9) = q_10, δ_publisher(q_5, q_4?, q_1?) = q_11, δ_authorname(q_8, q_9) = q_12,
δ_author(q_12) = q_13, δ_cookbook(q_6, q_13, q_7, q_3, q_11) = F, δ_bib(q_6, q_10, q_11, q_2, q_7) = F.

3.4 Minimizer
The minimizer module optimizes the states generated by the schema learner
module and transforms the optimized states into an integrated schema. The
optimization strategy is to merge and/or modify states that have parent-child
relationships or common labels/subtrees between states. If two states have a
parent-child relationship and their labels are equal, the label of the parent state
is changed to the "*" symbol.
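The parent-child rule can be sketched as a bottom-up pass over the tree (again on (label, children) tuples, as an illustration of the rule rather than the system's code):

```python
def collapse_star(t):
    """If a node's single child carries the same label, relabel the
    parent to '*', as in the minimizer's parent-child merging rule."""
    label, kids = t
    kids = tuple(collapse_star(c) for c in kids)
    if len(kids) == 1 and kids[0][0] == label:
        return ("*", kids)
    return (label, kids)
```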
The different combinations of state merging methods generate several
integrated schemas. The quality of these integrated schemas is evaluated
by the following two criteria. The first, coverage, guarantees that the
integrated schema derives all DTD schemas in the input DTD class. Second,
the integrated schema must be compact. Ideally, the integrated schema should
be the smallest type set covering the input DTD class, so that similar types in
different DTDs may be mapped to the same type in the integrated schema.

Figure 3. Learned Tree Grammar

Example 4 The following optimized rules describe trees in the DTD class
containing the COOKBOOK and BIB DTDs:
δ_price = q_1, δ_isbn = q_2, δ_address = q_3,
δ_name = q_4, δ_title = q_5, δ_year = q_6,
δ_first|firstname = q_7, δ_last|lastname = q_8, δ_email = q_9,
δ_author|authorname(q_7, q_8) = q_10, δ_publisher(q_3, q_4?, q_9?) = q_11, δ_*(q_10) = q_12,
δ_cookbook|bib(q_5, q_12+, q_11, q_6, q_2?, q_1?) = F

Given the state transition rules, the corresponding integrated schema is as
follows:
s_F = [cookbook|bib : (title_0, (*_1)+, publisher_3, year_0, (isbn_0)?, (price_0)?)];
s_1 = [* : ((author|authorname)_2)];
s_2 = [author|authorname : ((first|firstname)_0, (last|lastname)_0)];
s_3 = [publisher : ((name_0), (address_0)?, (email_0)?)];
Given the integrated schema above, the resulting integrated view is shown
in Table 2 (a). The source description for each DTD is generated from the
corresponding types in the integrated schema. Based on the integrated view, the
system can translate a query into subqueries, with assigned URLs, against the
related source DTDs. A subquery for the COOKBOOK DTD of Table 1 (a) is
shown in Table 2 (b). In fact, two source descriptions should be present
because the integrated schema is derived from an input DTD class containing
two DTDs (COOKBOOK and BIB). •

4 Experimental Results
We implemented our approach in a system called DEEP and conducted some
preliminary experiments to evaluate it. We tested DEEP on three

Table 2. Integrated View and Subquery examples

(a) Integrated View

WHERE <cookbook|bib>
  <title>$title</>
  <*>
    <author|authorname>
      <first|firstname>$first</>
      <last|lastname>$last</>
    </>
  </>
  <publisher>
    <address>$address</>
    <name>$name</>
    <email>$email</>
  </>
  <year>$year</>
  <isbn>$isbn</>
  <price>$price</>
</>
CONSTRUCT result patterns

(b) Subquery for COOKBOOK DTD

WHERE <cookbook>
  <title>$title</>
  <author>
    <authorname>
      <firstname>$first</>
      <lastname>$last</>
    </>
  </>
  <publisher>
    <address>$address</>
    <name>$name</>
  </>
  <year>$year</>
  <isbn>$isbn</>
</> IN "COOKBOOK.xml"
CONSTRUCT result patterns

domains, namely, book, play, and movie-list. The tested DTDs were prepared as
follows: we started by collecting two to three seed DTDs from published papers
and documents 8,9 on the test domains. The seed DTDs serve as the "golden
rule" for performance evaluation. From these seed DTDs, we constructed 100
DTDs for each domain by applying various perturbations with different modification
rates. The modification rate is defined as the ratio of the number
of modified nodes to the total number of nodes in a given tree (i.e., DTD).
The modification is conducted by randomly selecting one node and applying a
randomly selected operator to the node. Each data set was used in two cases:
with or without the renamer process, as described in Subsection 3.1. The
first performance measure is the correctness of the clustering. The precision
of clustering is the average, over the DTD classes, of the ratio of correctly
clustered DTDs to the number of DTDs in each DTD class. As the modification rate increases,
the precision degrades gracefully from 100% to 75% with the renamer in Fig-
ure 4 (a). Even without the process, we see that the 38% degradation (from
100% to 62%) is not too severe. The second measure is the accuracy of the
integrated schema. The result was achieved without clustering. The accuracy
is the ratio of the number of similar concepts discovered by the system to
the total number of similar concepts in the data set. Without the renamer,
the performance is not optimal, as accuracies range from 50% to 18%. In
contrast, with the renamer, DEEP performed quite well; accuracies ranged
from 100% to 82%. In actuality, renaming only gives hints to the system.
Most associations between similar concepts are identified by the system, as
shown in the shaded area of Figure 4 (b). In this experiment, renaming is
assigned by human experts. Furthermore, the task can also be automated by
codifying human heuristics, a topic currently being investigated. Ontologies
of common vocabulary that guide renaming may also help.

Figure 4. Quality of DEEP

5 Related Work and Conclusions


The most closely related work is LSD,13 a system that learns source descriptions.
Given an integrated schema, the system learns from manual mappings
and then proposes mappings for new data sources. LSD's focus is on finding
one-to-one mappings for leaf elements of source schemas.
Another related work is XTRACT,14 a system that extracts a DTD from
XML documents. Input XML documents are assumed to conform to the same
DTD. Since DTDs are not mandatory, tools that can infer an accurate DTD
for given XML documents are useful. It is straightforward to extend our system
to extract a DTD from XML documents using the schema learner
module. In this case, the set of sample trees consists of XML documents, and
the inferred rules correspond to a DTD able to cover all the input documents.
Given a query, MIX 15 derives the view DTD able to describe the query.
Since many candidate view DTDs may satisfy the query, the main concern is
to derive the tightest view DTD.
We have proposed a view inference approach that automatically derives
an integrated view and source descriptions in order for an information in-
tegration agent to access XML-based sources. This problem arises because
manually constructing an integrated view and source descriptions for each
application domain is error-prone and labor-intensive. We therefore conclude
that our view inference approach is a feasible solution for alleviating engineer-
ing bottlenecks in the development of scalable information integration agents.

Acknowledgements
This reported research was supported, in part, by the National Science Council
in Taiwan under Grant Nos. NSC 89-2218-E-002-014, 89-2750-P-001-007, and
89-2213-E-001-039.

References

1. O. Etzioni and D. Weld. A softbot-based interface to the Internet. Communications
of the ACM, 1994.
2. C. Kwok and D. Weld. Planning to gather information. In Proceedings of the 13th
National Conference on AI, 1996.
3. S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou,
J. Ullman, and J. Widom. The TSIMMIS project: Integration of heterogeneous
information sources. In Proceedings of the Information Processing Society of
Japan Conference, pages 7-18, Tokyo, Japan, October 1995.
4. T. Kirk, A. Y. Levy, Y. Sagiv, and D. Srivastava. The information manifold.
In Proceedings of the AAAI Spring Symposium on Information Gathering in
Distributed Heterogeneous Environments, Stanford, California, March 1995.
5. C. A. Knoblock, Y. Arens, and C. N. Hsu. Cooperating agents for information
retrieval. In Proceedings of International Conference on Cooperative Informa-
tion Systems. University of Toronto Press, 1994.
6. O. Duschka and M. Genesereth. Query planning in Infomaster. In Proceedings
of the ACM Symposium on Applied Computing, San Jose, CA, 1997.
7. Tim Bray, Jean Paoli, and C. M. Sperberg-McQueen. Extensible Markup Lan-
guage(XML) 1.0, 1998. W3C Recommendation.
8. A. Deutsch, M. Fernandez, D. Florescu, A. Levy, and D. Suciu. XML-QL: a
query language for XML, 1998. W3C Note.
9. M. Fernandez, J. Simeon, and P. Wadler. XML query languages: experiences
and exemplars, 1999. W3C Draft manuscript.
10. H. Fukuda and K. Kamata. Inference of tree automata from sample set of
trees. International Journal of Computer and Information Sciences, 13, 1984.
11. Edie Rasmussen. Clustering Algorithms, chapter 16. Prentice Hall, 1992.
12. S. Y. Lu. A tree matching algorithm based on node splitting and merging. In
IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 6,
pages 249-256, 1984.
13. A. Doan, P. Domingos, and A. Levy. Learning source descriptions for data
integration. In 3rd International Workshop on the Web and Databases, 2000.
14. M. Garofalakis, A. Gionis, R. Rastogi, S. Seshadri, and K. Shim. XTRACT:
a system for extracting document type descriptors from XML documents. In
Proceedings of SIGMOD, 2000.
15. Y. Papakonstantinou and P. Velikhov. Enhancing semistructured data me-
diators with document type definitions. In Proceedings of the International
Conference on Data Engineering, 1999. Sydney, Australia.
VIRTUAL MUSEUM'S ASSISTANT*

OSVALDO CAIRO, ANA ALDECO, M.E. ALGORRI


Division Academica de Ingenieria
Instituto Tecnologico Autonomo de Mexico (ITAM)
Río Hondo 1, Tizapán San Ángel, 01000 Mexico DF
Email: cairo@itam.mx, al51578@alumnos.itam.mx

This paper focuses on an application combining three apparently separate research areas:
virtual environments, intelligent agents, and museum web pages. It consists of a virtual
visit to a museum guided by an intelligent agent.
The agent must respond in real time to the user's requests, providing different layers of
data and differentiating between users by using different Knowledge-Bases. The
agent not only has some autonomy during the visit but also permits the user to make
his own choices. The environment is created to allow immersion, so that the user
can feel as if he were inside the museum's structure. This kind of application works as a
complementary experience: the user is introduced to the expositions
in the museum, convincing him to make a future real visit.
Keywords: Knowledge-Bases (KB), Intelligent Agent, Virtual Environments.

1 Introduction
The Internet offers an enormous amount of information for every kind of user,
making it cumbersome and sometimes plainly time-consuming to come across
the desired data. Intelligent agents are revealing themselves as future Internet
assistants that allow faster, intelligent queries, narrowing the user's choices to
whatever information is most relevant to him and making his search more
natural, enjoyable, and less time-consuming. Agent-based systems are
essential in the data queries of both users and agents [4].
Research on AI has shown the capabilities of web agents such as
Letizia [1], the multiagent infrastructure framework [5], and AVATARS [6],
which are nevertheless incapable of representing the human knowledge about how to use the
body during communication. This problem was attacked using a behaviour
defined by KB's: the agent perceives an action and has an established
reaction to it.
As graphics development advances, the levels of interactivity
between the user and the environment rise, offering different views of objects
that simulate 3D perspectives in order to provide a high level of immersion and allow
the user to get the impression of being inside the virtual environment.
Some museum sites are introducing virtual visits to their expositions. Le
Musée du Louvre [16], for instance, offers additional software for purchase that is
not available through the Internet, making it inaccessible to most people. Museums are

* This project has been partially funded by CONACYT as project number 33038-A,
and Asociacion Mexicana de Cultura, A.C.

prime application areas for agents because they hold an enormous amount of
information interesting to the user; yet on the web, the average Internet user
generally avoids static reading sites and favours interactive sites where he has
freedom of action, so the valuable information remains unread. In deeper layers,
the user could even have information tailored to his own interest profile.
We started this work by visiting several museums' web pages ([15] to [26])
and noticed that virtual visits, and intelligent agents serving as guides or
information facilitators within them, are very rare.
The remaining of the paper is organised as follows: Section 2 gives the
agent's definition. Section 3 deals with the museum's virtual environment.
Finally, we state some conclusions.
2 Agent: Main Elements
Our agent can be classified as hybrid: it has several characteristics of reactive
agents (it constantly monitors the user's actions to give a proper reaction, so it
can be considered a stimulus-response agent) together with static characteristics.
Multiple KBs were defined to give answers to certain questions with all the
connotations of a human answer. This translates into accesses to the KBs, which
are selected by the environment where the agent is standing, plus global KBs such
as movements and gestures.
Perception: the user clicks on a painting.
Actions (KB queries): the piece's information; the authors' information; gestures
and movements; the room's information; the museum's architecture; phrases; the
user's information.
Figure 1. Example of a perception-action table.
Based on these features, the agent randomly chooses an action according to the
environment that surrounds it and the requests of the user. The agent must
consider the different kinds of users (adults, children, foreign or local) in order to
offer different information, by asking for certain data (part of the agent's
attributes) such as name, age and country before the application starts.
Information. Contains the actions concerning the user's questions about the artwork
presented in the museum, its history, the authors' biographies, and architectural as
well as regional knowledge. This last KB is split between local and foreign users, so
as to provide relevant information about the country to a foreign user without
repeating facts a local user would be familiar with.
Reactions. Concerned with the agent's movements, gestures and common phrases.
Museum scope. Loaded with the virtual visit; contains the actions that allow the
user to go through the museum's rooms with different perspectives.
Table 1. Agent KBs
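The perception-action mapping of Figure 1 over the KBs of Table 1 can be sketched as a simple dispatch table. The structure and all names below (`KNOWLEDGE_BASES`, `react`, the event keys) are illustrative assumptions, not the authors' implementation:

```python
import random

# Hypothetical partition of the agent's knowledge bases (after Table 1).
KNOWLEDGE_BASES = {
    "information": ["piece's information", "authors' information"],
    "reactions": ["gestures and movements", "phrases"],
    "museum_scope": ["room's information", "museum's architecture"],
}

# Perception -> candidate KBs to query (after Figure 1).
PERCEPTION_TABLE = {
    "click_on_painting": ["information", "reactions", "museum_scope"],
}

def react(perception, user_profile):
    """Pick one applicable KB at random, as the paper's agent does."""
    candidates = PERCEPTION_TABLE.get(perception, ["reactions"])
    kb = random.choice(candidates)
    actions = list(KNOWLEDGE_BASES[kb])
    # A foreign visitor additionally receives regional background facts.
    if kb == "information" and user_profile.get("country") != "local":
        actions.append("regional information")
    return kb, actions

print(react("click_on_painting", {"country": "abroad"}))
```

The user profile gathered before the application starts (name, age, country) is what steers the foreign/local branch of the Information KB.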

2.1 Agent Goals


Users should be able to interact with the interface the way they commonly
communicate, specifying preferred forms of presentation without determining the
best way of doing it or the details needed to achieve it.
Goal orientation: The user should be able to state what he wants done, rather than
how he wants it done. The agent should determine how to perform the necessary
actions, without the user's supervision.
Expression: The user should be able to make requests without learning an artificial
query language or being limited by the constraints of a menu system.
Adaptability: The agent should adapt to different users, both through direct requests
and by learning the patterns of a user's behavior.
Autonomy: The agent should have some independent decision-making capability
and should be able to choose among several strategies to accomplish the user's
tasks.
Integration: The interface through which users and agents communicate should be
understandable, consistent and task-directed. The user should not have to
remember details of the task in order to specify his goals.
Table 2. Criteria considered for this agent
2.2 Agent Design
The presented system is described using UML, based on the construction of
abstract models of real-world ideas, notations and specifications.
Register for visit
Navigate through a virtual museum
Select rooms for visit
Request pictures information
Create museum catalog
Maintain rooms information
Maintain user information
Table 3. The agent's use cases; each is documented with a flow of events
describing its required behavior.
There are two main actors, the agent and the user; one secondary actor, the
Virtual Museum's Environment; and one use case, Navigate Through the Virtual
Museum. In the UML use case presented in Figure 2, the User navigates around a
specific view of the Virtual Museum, in which the agent is embedded to assist the
visitor through his trip. Every action the user performs over the scenario is
monitored by the agent's sensors. These elements allow the Agent to identify
which set of actions must be performed, according to its KBs. Moreover, if the
user does not carry out any defined movement, the Agent is able to take an
autonomous action based on the User's navigation. The Agent starts the Select
rooms for visit use case as soon as it needs to perform an action registered in its
KBs (see Figure 3).

Figure 2 and 3. Use case diagram, and the Select rooms for visit use case.

Figure 4 and 5. Class diagram of the Navigate Through a Virtual Museum's
Scenario use case, and class diagram of the Select rooms for visit use case.
The AgentManagement class models the sequencing behavior described by the
Navigate Through a Virtual Museum's Scenario use case. This class monitors
every action performed by the user, as defined in the UserOptions class, which
encloses the events the user can trigger, whether from the AgentOptions or from
the VirtualOrganization classes. These two classes delimit the potential of the
virtual navigation, because the user cannot invoke any action not allowed by
them. The VirtualOrganization class also restricts the behavior of the agent
according to the context in which the user stands. The ScenarioElement class
describes all the objects included in the VirtualOrganization class, e.g. art pieces.
The UserProfile class collaborates by giving a personal touch to the interaction
between the Agent and the User along the navigation. Finally, the
AgentsActingOptions class functions as an interface used by the
AgentManagement class to communicate to the Agent the tasks to achieve.
Figure 5 shows the class diagram of the Select rooms for visit use case.
The KnowledgeManager class is responsible for executing the actions
requested by the agent; it handles the performance of the Agent (phrases,
attitudes and movements) using the KnowledgeAction class, as well as the
Agent's requests via the AgentActionRequest class.
3 Virtual Environment
Depth: Allows the user to feel he is travelling inside the structure, and to view the
artwork from several angles.
Texture & light: Both properties give the virtual environment a sense of reality.
Table 4. Relevant points in developing the virtual environment
To avoid the speed and space limitations of navigating a full 3D virtual
environment over the Internet, the virtual museum must be exported to a standard
Internet format such as the Virtual Reality Modeling Language (VRML), which
allows complex scenes to be dynamically downloaded from a server directly to a
web browser. Using VRML it is possible to navigate interactively through the
virtual environment in real time.
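For illustration, a minimal VRML 2.0 scene of the kind such an export would contain: a single textured, lit gallery wall. The file below, including the texture name, is an assumption for illustration and is not taken from the project:

```vrml
#VRML V2.0 utf8
# One wall of a gallery room with a texture map and a point light.
PointLight { location 0 2 2 }
Transform {
  translation 0 1 0
  children Shape {
    appearance Appearance {
      texture ImageTexture { url "wall_painting.jpg" }
    }
    geometry Box { size 4 2 0.1 }
  }
}
```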
4 General Conclusions
The proposed application, according to our research, overcomes problems of
space, speed and compatibility and, fundamentally, proposes environment-user
interaction through an agent who guides the user during navigation in the virtual
environment, bringing relevant information to each kind of user, offering it
pleasantly, and solving the problems presented in Section 1.
As future work, the application could be ported to an embodied environment
such as REA [3], where the user has complete interaction through the interface
agent.
Our project is currently integrating up-to-date literature and seeking feedback
from researchers in the area, to generate ideas and opinions that will permit us to
attack the weak points and deliver a high-quality application.
5 References
[1] Lieberman, H. Letizia: An Agent That Assists Web Browsing. Proceedings of the Fourteenth
International Joint Conference on Artificial Intelligence, 1995.
[2] Finin, T., Fritzson, R., McKay, D., McEntire, R. KQML as an Agent Communication Language.
3rd International Conference on Information and Knowledge Management, Gaithersburg,
Maryland, 1994.
[3] Cassell, J., Bickmore, T., Billinghurst, M., Campbell, L., Chang, K., Vilhjalmsson, H., and Yan,
H. Embodiment in Conversational Interfaces: Rea. CHI'99 Proceedings, ACM Press, Pittsburgh,
PA, 1999.
[4] Sycara, K., Klusch, M., Lu, J. Matchmaking Among Heterogeneous Agents on the Internet. 1999.
[5] Sycara, K., Decker, K., Pannu, A., Williamson, M. and Zeng, D. Distributed Intelligent Agents.
IEEE Expert, December 1996.
[6] Cassell, J. and Vilhjalmsson, H. Fully Embodied Conversational Avatars: Making Communicative
Behaviors Autonomous. Autonomous Agents and Multi-Agent Systems, Volume 2, Number 1, 1999.
[7] Hannoun, M., Boissier, O., Sichman, J. and Sayettat, C. MOISE: An Organizational Model for
Multi-Agent Systems. IBERAMIA-SBIA, LNAI 1952, Springer-Verlag, Berlin Heidelberg, 2000.
[8] Castelfranchi, C. Commitments: From Individual Intentions to Groups and Organizations. 1st
ICMAS, San Francisco, USA, 1995.
[9] Lopes Cardoso, H., Oliveira, E. Using and Evaluating Adaptive Agents for Electronic Commerce
Negotiation.
[10] Bryson, J., McGonigle, B. Agent Architecture as Object Oriented Design. Intelligent Agents IV,
Springer, 1997.
[11] Wooldridge, M., Jennings, N. Intelligent Agents: Theory and Practice. Knowledge Engineering
Review, 1995.
[12] Brooks, R. A Robust Layered Control System for a Mobile Robot. IEEE Journal of Robotics and
Automation, RA-2:14-23, April 1986.
[13] Verharen, E., Dignum, F., Bos, S. Implementation of a Cooperative Agent Architecture Based on
the Language-Action Perspective. Intelligent Agents IV, Springer, 1997.
[14] Museo del Prado. URL: http://prado.mcu.es
[15] Musee du Louvre. URL: http://www.louvre.fr
[16] Vatican Museum. URL: http://www.christusrex.org
[17] Museum Meiji-mura. URL: http://cin.meitetsu.co.jp
[18] Israel Museum. URL: http://www.imj.org.il
[19] British Museum. URL: http://thebritishmuseum.ac.uk
[20] Smithsonian National Air and Space Museum. URL: http://www.nasm.edu
[21] Universum. URL: http://www.universum.unam.mx
[22] The Chinese University of Hong Kong Art Museum. URL: http://cuhk.edu.hk/ics/amn/index.html
[23] Hungarian National Museum. URL: http://origo.hnm.hu/english/ottlapl.html
[24] Museum of Modern Art. URL: http://www.moma.org
[25] Schilling, A. Toward Real-Time Photorealistic Rendering: Challenges and Solutions.
SIGGRAPH/Eurographics Workshop on Graphics Hardware, August 1997, New York City, NY.
ACM Press.
[26] Goss, M., Yuasa, K. Texture Tile Visibility Determination for Dynamic Texture Loading.
SIGGRAPH/Eurographics Workshop on Graphics Hardware, August 1998, Lisbon, Portugal.
ACM Press.
[27] Labrou, Y., Finin, T. Semantics for an Agent Communication Language. Intelligent Agents IV,
Springer, 1997.
Index Based Document Classification with CC4 Neural Networks

Enhong Chen+, Zhengya Zhang+, Xufa Wang+, Jie Yang++

+ Department of Computer Science and Technology
++ Department of Precision Machinery and Precision Instrumentation
University of Science and Technology of China,
Hefei, Anhui 230027, P.R. China
cheneh@ustc.edu.cn

Abstract. The CC4 network [2] is an efficient neural network-based classification
algorithm used in the metasearch engine Anvish [3], which uses CC4 to classify
the web pages returned by other search engines. The documents returned by
search engines are very short abstracts; almost every keyword contained in each
document appears only once, so it is natural to represent these documents as
binary vectors and classify them with a CC4 neural network. For real-life
documents, however, the binary representation is not very appropriate. This paper
proposes to map all documents into points in a low-dimensional space while
preserving their distance information as much as possible. Each document's index
is then transformed into a binary sequence that the CC4 neural network can
accept as input. Experimental results show that our method greatly improves
classification precision compared with classifying documents with the CC4
neural network alone, and that its performance is very stable.

1 Introduction

With the explosive growth of information sources available on the World Wide
Web, it has become increasingly necessary for users to rely on web search engines
to find the desired information [1]. Among all the techniques used by search
engines, document classification is very important for helping users find the
information they are interested in efficiently. The CC4 network [2] is an efficient
neural network-based classification algorithm used in the metasearch engine
Anvish [3], which uses CC4 to classify the web pages returned by other search
engines. The documents returned by search engines like Yahoo, WebCrawler,

This work was supported by National Natural Science Foundation of China research
grant 60005004 and National 973 Project G1998030509.


Excite and Infoseek are very short abstracts; almost every keyword contained in
each document appears only once. It is therefore natural to represent these
documents as binary vectors and feed them to the CC4 neural network for
classification. For real-life documents, however, the frequency of each keyword
varies widely, so the binary representation is not very appropriate. Considering
that CC4 can only accept binary vectors as input, we propose to map all
documents into points in a low-dimensional space while preserving their distance
information as much as possible. Each k-index of a document is then transformed
into a 0/1 sequence that the CC4 neural network can accept as input.
In the following section, we describe our document index-based classification
method, called ExtendedCC4, which extends the original CC4 neural network-based
classification. Our theoretical analysis is given in Section 3, and Section 4 presents
our experimental results. The final section gives concluding remarks.

2 CC4 Network Based Classification with Document Index

2.1 Construction of Document Indexing


To map every document into a point in a low-dimensional space, we propose a BP
neural network-based incremental data indexing approach, called the MDS-NN
method. In this method, a small data set called the sample data set is first indexed
with the MDS approach. Because the sample data set is very small, the time spent
on this step is low. The indexing results are then provided as training samples and
supervisor signals to train the neural network, and the trained neural network is
used to index newly added data. The quality of indexing is measured by the Stress
function [4], [5].

Definition 1 (k-index): Suppose that there exists a mapping T that maps any
original data d into a point p in k-dimensional space; then point p is called the
k-index of d.

Our proposed MDS-NN method is as follows:

1. Build the k-indexes of the training sample data using the MDS method.
2. Construct the sample data set and supervisor signal set for the BP neural
network from the results obtained in step 1.
3. Train the BP neural network with the data obtained in step 2.
4. Build the k-index of newly arriving data with the trained neural network.
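The quality measure referenced above can be computed as follows. The paper does not spell out the formula, so this sketch assumes the common Kruskal-style Stress from the cited MDS literature [4], [5]: the normalized root of the squared differences between original pairwise distances and the distances of the k-indexes.

```python
import math

def stress(original_dists, index_dists):
    """Kruskal-style Stress between original pairwise distances and the
    pairwise distances of the k-indexes; 0 means perfect preservation."""
    num = sum((d - e) ** 2 for d, e in zip(original_dists, index_dists))
    den = sum(d ** 2 for d in original_dists)
    return math.sqrt(num / den)

# Perfectly preserved distances give zero stress.
print(stress([1.0, 2.0, 2.5], [1.0, 2.0, 2.5]))  # 0.0
print(stress([1.0, 2.0, 2.5], [1.1, 1.9, 2.4]))  # small but nonzero
```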

2.2 Document Classification with ExtendedCC4

The CC4 algorithm, proposed by Tong and Kak [2], is a new type of corner
classification training algorithm for three-layered feedforward neural networks.
The CC4 network maps an input binary vector X to an output vector Y; the
neurons are all binary neurons with a binary step activation function. Since CC4
can only accept binary vectors as input, and since for real-life documents the
widely varying keyword frequencies make a binary representation inappropriate,
we propose to map all documents into points in a low-dimensional space while
preserving their distance information as much as possible, and then to transform
each k-index into a 0/1 sequence that the CC4 neural network can accept as input.
We call our method ExtendedCC4, in contrast to the CC4 (which we call
InitialCC4) that uses the binary representation of textual documents as input.
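The corner classification idea admits a compact sketch: one hidden neuron per training vector with +1/-1 weights and a bias input weighted r + 1 - s (s being the number of ones in the training vector), so that a neuron fires exactly when the input is within Hamming distance r of its training vector. This follows published descriptions of CC4 [2]; the function names and the simple voting output below are our illustrative assumptions, not the authors' implementation.

```python
def make_hidden_neuron(train_vec, r):
    """CC4 hidden neuron: fires iff Hamming(x, train_vec) <= r."""
    weights = [1 if t == 1 else -1 for t in train_vec]
    s = sum(train_vec)              # number of ones in the training vector
    bias = r + 1 - s                # weight of the always-on bias input
    def fire(x):
        net = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1 if net > 0 else 0  # binary step activation
    return fire

def train_cc4(samples, labels, r):
    """One-pass 'training': one stored neuron per sample, plus its label."""
    return [(make_hidden_neuron(v, y_r[1]), y_r[0])
            for v, y_r in zip(samples, [(y, r) for y in labels])]

def classify(network, x):
    votes = [label for fire, label in network if fire(x)]
    return max(set(votes), key=votes.count) if votes else None

net = train_cc4([[1, 1, 0, 0], [0, 0, 1, 1]], ["A", "B"], r=1)
print(classify(net, [1, 0, 0, 0]))  # within radius 1 of the first sample: A
```

Because net input equals r + 1 - h for Hamming distance h, the step threshold at zero realizes the generalization radius exactly.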
In the following, we first present the notion of the L-discretization sequence of a
real number, and then that of a k-index.

Definition 2: Let x be a real number with x ∈ [a, b]. S is an L-discretization
sequence of x if the frontmost k elements of S are all ones and the remaining
L - k elements are all zeroes, where L is the length of S, m = (b - a)/L and
k = [(x - a)/m].
Definition 3: Suppose that the k-index of an original data d is (x1, x2, ..., xk),
xi ∈ [ai, bi], i = 1, 2, ..., k, L is a given positive integer, and Si is the
L-discretization sequence of xi on the interval [ai, bi]. Then S = <Sij> = <S11,
S12, ..., S1L, S21, S22, ..., S2L, ..., Sk1, Sk2, ..., SkL> is the L-discretization
sequence of the k-index of d, where i = 1, 2, ..., k, j = 1, 2, ..., L, and
Sij = Si[j] is the j-th element of the L-discretization sequence of xi.
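Definitions 2 and 3 translate directly into code; a minimal sketch (function names are ours):

```python
def l_discretize(x, a, b, L):
    """L-discretization of x in [a, b]: k ones then L - k zeros,
    with m = (b - a)/L and k = floor((x - a)/m)  (Definition 2)."""
    m = (b - a) / L
    k = min(int((x - a) / m), L)   # clamp so that x = b maps to all ones
    return [1] * k + [0] * (L - k)

def discretize_index(point, bounds, L):
    """L-discretization of a k-index: the concatenation of the
    per-coordinate sequences S11..S1L, ..., Sk1..SkL (Definition 3)."""
    seq = []
    for x, (a, b) in zip(point, bounds):
        seq.extend(l_discretize(x, a, b, L))
    return seq

print(l_discretize(0.5, 0.0, 1.0, 4))                  # [1, 1, 0, 0]
print(discretize_index((0.25, 0.75), [(0, 1), (0, 1)], 4))
```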
When training ExtendedCC4 to classify documents, each training document is
indexed with the MDS-NN method and the L-discretization sequence of its
k-index is calculated as the input to ExtendedCC4. The topic of the corresponding
document serves as the supervisor signal of the ExtendedCC4 neural network.
For new textual documents, the MDS-NN method is applied to obtain the
L-discretization sequences of their k-indexes, and the trained ExtendedCC4
network is then used to classify them.

3 Theoretical Analysis of ExtendedCC4


We first introduce the notion of the δ-neighborhood of a k-dimensional point X,
and then give the relationship between the radius of generalization and the
classification ability of ExtendedCC4.

Definition 4: Suppose that X is the center of a hyper-cube whose edges have
length 2δ. Then the continuous area covered by the hyper-cube is called the
δ-neighborhood of X, denoted Nδ(X); X is the representative of the area.

Definition 5: Suppose that X = (x1, x2, ..., xk) ∈ [0, 1]^k and Y = (y1, y2, ...,
yk), with xi, yi ∈ [0, 1], i = 1, 2, ..., k. If |xi - yi| ≤ δ for all i, where δ > 0,
then Y belongs to the δ-neighborhood of X, denoted Y ∈ Nδ(X).

Theorem 1: Suppose that the k-index X = (x1, x2, ..., xk) is the center of the
training set for class C, xi ∈ [0, 1], i = 1, 2, ..., k. Let s = 1/L be the step of the
L-discretization sequences of the k-index X, and let r = [δ/s]. For any Y = (y1,
y2, ..., yk), yi ∈ [0, 1], i = 1, 2, ..., k, if the Hamming distance between the
L-discretization sequences of xi and yi is at most n for every i, then n ≤ r iff
Y ∈ Nδ(X).

Proof: First, since r = [δ/s], we have rs ≤ δ < (r + 1)s. If n ≤ r then
ns ≤ rs ≤ δ, so |xi - yi| ≤ ns ≤ δ for each i, and we conclude that Y ∈ Nδ(X).
Conversely, given that Y ∈ Nδ(X), we have |xi - yi| ≤ δ for i = 1, 2, ..., k.
Since ns ≤ |xi - yi| ≤ (n + 1)s, we get ns ≤ δ; and since rs ≤ δ < (r + 1)s, it
follows that ns ≤ rs, hence n ≤ r, and the theorem is proved.

By Theorem 1, as the radius of generalization increases during the training of
ExtendedCC4, more and more points are covered by the δ-neighborhood of each
training center, which improves the classification precision of the trained
ExtendedCC4. The precision reaches its highest value at a certain radius of
generalization. Afterwards, as the radius keeps increasing, more and more points
are covered by the δ-neighborhoods of centers belonging to other classes, leading
to a decrease in classification precision. Once the radius of generalization exceeds
a threshold value r0, the δ-neighborhood of the center used as the first training
sample for ExtendedCC4 covers all points, and the classification precision stays at
a stable level, around the percentage of test samples belonging to the first class.

4 Experimental Results and Analysis


Our experiments are performed on real data downloaded from the UCI KDD
Archive site http://kdd.ics.uci.edu. We randomly select 10 of the 20 downloaded
news groups and take the first 50 news articles in each group as our experimental
data. All documents are mapped into points in 3-dimensional space. For each
news group we fix a value ratio = size of training set / size of entire data set and
from it calculate the size of the training set, SamplesNumber. Here we set ratio
to 10%.
Figures 1 and 2 show the relationship between the radius of generalization and
the classification precision of ExtendedCC4 and InitialCC4 when the ratio of
training documents is 10%. It can be observed that the highest classification precision of

ExtendedCC4 is much better than that of InitialCC4. We can also observe that
when the radius of generalization is larger than a threshold value r0, the
classification precision of both ExtendedCC4 and InitialCC4 stays at a stable
level, around the percentage (10%) of test samples belonging to the first class.

Fig. 1. The influence of the radius of generalization on the classification
precision of ExtendedCC4 (ratio = 0.1).
Fig. 2. The influence of the radius of generalization on the classification
precision of InitialCC4 (ratio = 0.1).

5 Conclusion

This paper proposes a document classification approach, ExtendedCC4, that maps
documents into points in a low-dimensional space and then transforms each
k-index into a 0/1 sequence that the CC4 neural network can accept as input. Our
experiments show that the performance of ExtendedCC4 is much better than that
of InitialCC4.

References

1. Venkat N. Gudivada, et al., Information Retrieval on the World Wide Web, IEEE
Internet Computing, September-October 1997, pp. 58-68.
2. Tong K.-W. and S. C. Kak, A New Corner Classification Approach to Neural Network
Training. J. of Circuits, Systems, Signal Processing, Birkhauser Boston, 1998, pp. 459-469.
3. Shu B., Kak S., A neural network-based intelligent metasearch engine, Information
Sciences, 120, 1999, pp. 1-11.
4. Jagadish H. V., A retrieval technique for similar shapes, Proc. ACM SIGMOD Conf.,
May 1990, pp. 208-217.
5. Faloutsos C., FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of
Traditional and Multimedia Datasets, Proc. of ACM SIGMOD Conf., 1995, pp. 163-174.
PRICE WATCHER AGENT FOR E-COMMERCE

SIMON FONG
E-Netique Pte Ltd, Singapore
E-mail: simon@enetique.com.sg

AIXIN SUN
School of Computer Engineering, Nanyang Technological University, Singapore
E-mail: sunaixin@pmail.ntu.edu.sg

KIN KEONG WONG
School of Computer Engineering, Nanyang Technological University, Singapore
E-mail: askkwong@ntu.edu.sg

We report an autonomous agent that retrieves competitors' product prices over
the World Wide Web for the purpose of price comparison at an e-commerce retail
shop. This price watcher model differs from the conventional price comparison
services currently available on the Internet in that it collects competitors' price
information without the competitors' participation or attention. It scans price
information over the Internet on a regular basis, builds up a knowledge base at
the user's site and provides a price comparison facility for shoppers to use. It is
an information retrieval utility that could be used as part of a business
intelligence infrastructure. This paper summarizes the application background as
well as the technical details of the design of the prototype.

1 Introduction

The Watcher Agent proposed in this paper is an autonomous software program
that "spies" on competitors' prices over the web. The prices collected from the
competitors are stored in a local database. They can be used for price comparison
at the front end of an e-commerce online shop as well as for market research at
the back end. This technology offers online shops a useful new feature: showing
consumers the competitors' prices helps increase their confidence in buying the
products, and hence helps improve sales. The agent can be configured so that
only prices higher than (or equal to) ours are displayed. A snapshot of a shopping
site with the price watcher is shown in Figure 1.
One of the barriers for e-commerce retailers to overcome is that most
consumers are not convinced that the price of a product offered at a given site is
the best, and it is always easy for them to surf away to other shopping sites
looking for a better offer [1]. How to encourage the consumer to commit to a
purchase on the spot at the current site is thus an issue to be addressed.

Figure 1. Snapshot of the application of Price Watcher

There are several price comparison services available on the web [2,3,4]. The
differences between our price watcher agent and most web-based price
comparison software and portals are as follows:
1. Designed for use by individual online shops. Price watcher is a
price-monitoring tool used by individual online shops, whereas the usual
web-based price comparison services are made publicly available for web
surfers to compare prices.
2. Neither a broker nor a public database is used. Most price comparison
services rely on a mediator, usually the web server or service provider, with
a centralized database maintaining the price information available to the
users. In our watcher agent strategy, a private and confidential database
holding the competitors' price information is located at the local site.
3. No participation of retail shops is required. Some price comparison
services work by letting participating stores submit their latest prices to the
mediator. Our approach is different: there is no need to get the competitors
involved.
4. Forms part of the competitor intelligence strategy. The price watcher is
to be implemented as part of a competitor intelligence strategy that includes
information retrieval, filtering, analysis, and presentation.
In this paper, Section 2 covers the overall working process of the price
watcher. The product name matching and price extraction algorithms are
described in detail in Sections 3.1 and 3.2 respectively. The technical limitations
of the price watcher are given in Section 3.3, and finally we conclude our work
in Section 4.

[Figure 2 shows the layered architecture: an Information Retrieval Layer (the
URL retrieval engine operating over web pages); a Market Compilation Layer
(the Price Watcher and Market Watcher, with Explorer and Monitor
components); a Storage Layer; and a Presentation Layer (the Marketing
Information System).]

Figure 2. The architecture of the Watcher Agent

2 Price Watcher Working Process

The price watcher working process consists of five steps:

1. The set of competitors' URLs, configuration parameters (e.g. retrieval
scheduling) and product names are obtained from the database.
2. The HTML pages are downloaded using the web retrieval engine.
3. A dollar-sign detector is used as a filter: only pages containing dollar
signs such as $ and S$ are processed further.
4. The product names are searched for within each page. The price for any
possible match is extracted and stored in the local database.
5. The competitors' prices (and our own price) are then queried and shown
in tabular form.
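Steps 3 and 4 can be sketched as a small pipeline over already-downloaded pages. All names below are illustrative, and the stub extractor merely stands in for the matching and extraction described in Section 3:

```python
def has_dollar_sign(html):
    """Step 3: only pages mentioning $ (or S$) are processed further."""
    return "$" in html

def run_price_watch(pages, product_names, extract_price):
    """Steps 3-4 over downloaded pages: filter, match, extract, collect."""
    records = []
    for url, html in pages.items():
        if not has_dollar_sign(html):   # cheap filter before name matching
            continue
        for name in product_names:
            price = extract_price(html, name)
            if price is not None:
                records.append((url, name, price))
    return records

# Demonstration with a trivial stand-in extractor.
demo = {"http://shop.example/a": "BJC-4200SP at S$349.00",
        "http://shop.example/b": "no prices on this page"}
found = run_price_watch(demo, ["BJC-4200SP"],
                        lambda html, name: "349.00" if name in html else None)
print(found)
```

The second page is discarded by the dollar-sign filter before any name matching is attempted, which is the point of placing the cheap test first.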

3 Technical Details

To monitor a web site, its contents should be downloaded on some schedule
setting [5]. In the price watcher, only the HTML text is downloaded. Finding the
level of similarity between our product names and the names found on the web,
and extracting the corresponding prices, are the two main challenges. The
architecture of the Watcher Agent is shown in Figure 2. The agent is composed
of two major parts: the price watcher and the market watcher. The market
watcher helps the administrator of the online shop get the latest information
about his competitors' web sites; it is not covered in this paper.

3.1 Product Name Matching

A product name can usually be divided into three parts: brand, model number
and description; for example, brand: Canon, model number: BJC-4200SP,
description: Color Bubble Jet Printer. The model number is assumed to be
unique for a specific product. The brand part may appear slightly differently on
different web sites, for example Hewlett Packard versus HP; this is solved by
letting users input more than one brand equivalent. The description part may
differ considerably across web sites, but it is not so critical for product name
matching, although it is useful in determining where the model number or brand
can be found. In product name matching we let users allocate a weight to each
part, for example 50%, 30% and 20% for model number, brand and description
respectively. Model number and brand require exact matching regardless of
character case: an exact match gives a similarity level of 1, otherwise the
similarity level is 0. An approximate word matching algorithm [6] is applied to
compute the similarity level of the description part. The final similarity of each
part is the product of its similarity level and its weight, and the overall similarity
level for the whole product name is obtained by summing the final similarities of
the three parts. This final value is compared with a threshold value to decide
whether a match has actually been detected.
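A sketch of the weighted match described above. The paper's approximate word matching algorithm [6] is not reproduced here; Python's difflib ratio is used as a stand-in for the description similarity, and the weights and threshold are the example values, not prescribed ones:

```python
from difflib import SequenceMatcher

WEIGHTS = {"model": 0.5, "brand": 0.3, "description": 0.2}  # example weights

def similarity(ours, theirs, brand_equivalents=(), threshold=0.8):
    """Weighted product-name similarity; returns (score, is_match)."""
    # Model number: exact match, case-insensitive.
    model_sim = 1.0 if ours["model"].lower() == theirs["model"].lower() else 0.0
    # Brand: exact match against the brand or any user-supplied equivalent.
    brands = {ours["brand"].lower(), *(b.lower() for b in brand_equivalents)}
    brand_sim = 1.0 if theirs["brand"].lower() in brands else 0.0
    # Description: difflib ratio as a stand-in for the algorithm of [6].
    desc_sim = SequenceMatcher(None, ours["description"].lower(),
                               theirs["description"].lower()).ratio()
    score = (WEIGHTS["model"] * model_sim
             + WEIGHTS["brand"] * brand_sim
             + WEIGHTS["description"] * desc_sim)
    return score, score >= threshold

ours = {"brand": "Hewlett Packard", "model": "DeskJet 840C",
        "description": "Color Inkjet Printer"}
theirs = {"brand": "HP", "model": "deskjet 840c",
          "description": "Colour Ink-Jet Printer"}
print(similarity(ours, theirs, brand_equivalents=["HP"]))
```

With a mismatched model number the score is capped at 0.5 (brand plus description weight), so it can never clear the 0.8 threshold; that reflects the dominant weight given to the model number.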

3.2 Price Extraction

The main operation of the price watcher is to extract prices from HTML
documents. HTML documents are semi-structured in nature [7], hence extracting
information from them is significantly different from extracting information from
tables in a database. The price extraction algorithm is developed based on the
KPS Mining Algorithm [8]. Once a product name is matched and located in an
HTML document, the following rules are applied to extract the price.
• For a product name appearing in a title (i.e. <title>, <h1>-<h6>), the price
of the product is most likely located in the string after the product name.

• For a product name appearing in an item list, the price is most likely
located in the same item, or in a following item up to the end of the list.

• For a product name appearing in a cell of a table, the price is most likely
located in the same cell, in the same row of a column-wise table, or in the
same column of a row-wise table.

• For a product name appearing in a textual line, the price is most likely
located in the same paragraph, or in a following paragraph up to the end of
the page.

• If more than one price is found, the price is assumed to be the first one
appearing after the product name.
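The last rule (take the first price appearing after the matched product name) reduces to a simple scan; the sketch below works over raw text, whereas the real system walks the Semi-Structured Data Tree, and the function name and regular expression are our assumptions:

```python
import re

# Matches amounts like "$349", "S$349.00", "$1,299.99".
PRICE_RE = re.compile(r"S?\$\s*([\d,]+(?:\.\d{1,2})?)")

def first_price_after(text, product_name):
    """Return the first dollar amount appearing after product_name,
    or None if either the name or a price is absent."""
    pos = text.find(product_name)
    if pos < 0:
        return None
    m = PRICE_RE.search(text, pos + len(product_name))
    return m.group(1) if m else None

line = "Canon BJC-4200SP Color Bubble Jet Printer  S$349.00  (list: S$399.00)"
print(first_price_after(line, "BJC-4200SP"))  # 349.00
```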
For each HTML page retrieved by the system, a Semi-Structured Data Tree [7]
is constructed. If a model number can be located in the tree, the brand and the
description are searched for within the same data node. If neither can be located
in the current data node, a super data string is formed from all the data nodes
that are children of the parent of the current data node. The similarity level
between the obtained product name and the defined product name is then
computed. The price of the product is first searched for within the current data
node, and then up to three levels away if no price information is found.

3.3 Price Watcher Limitations


One technical limitation is that the price watcher cannot distinguish the
Singapore dollar from the American dollar, because "S$" and "$" are used
interchangeably in Singapore. In the current prototype implementation, the price
watcher can only deal with textual data. Another problem is that the detected
product name may not be the one being monitored, even though a high similarity
level is calculated; for example, "Cartridge for HP DeskJet 840C Printer" is
easily detected as "HP DeskJet 840C Printer". A more sophisticated algorithm is
needed to resolve this problem.
299

4 Conclusion and Future Work

In this paper, we have reported an autonomous software program called price


watcher that collects competitors' product prices on the web. The collected
price information will contribute to managers' business decision making, and
it can be used to enhance shoppers' confidence via price comparison. The
application of price watcher technology is believed to be relatively new and
would create an impact on the way that retail shops market their goods online.
The first online shop that applies this technology would benefit most, because
it helps to place their business in a market position one step ahead of their
competitors. It is envisaged that the system can be expanded to include
scanning and analysis of competitors' other information, such as news, new
products, promotions, etc. Work can also be extended to study how this agent
can be integrated into the full infrastructure of business intelligence5.

References

1. L. Gerald and L. Spiller, Electronic shopping:The effect of customer in-


terfaces on traffic and sales. Communications of the ACM, 41(7), pages
81-87, 1998.
2. B. Krulwich, The BargainFinder agent: Comparison price shopping on
the Internet. In Agents, Bots, and other Internet Beasties, SAMS.NET
publishing, pages 257-263, 1996.
3. R. B. Doorenbos, O. Etzioni and D. S. Weld, A Scalable Comparison-
Shopping Agent for the World-Wide Web. In Proceedings of the First
International Conference on Autonomous Agents, pages 39-48, 1997.
4. Pricewatch for Computer Products, http://www.pricewatch.com.
5. Q. Chen, P. Chundi, U. Bayal and M. Hsu, Dynamic Software Agents for
Business Intelligence Applications. ACM Autonomous Agents'98, pages
453-455, 1998.
6. J. C. French, A. L. Powell and E. Schulman, Applications of Approximate
Word Matching in Information Retrieval. In Proceedings of the Sixth
International Conference on Knowledge and Information Management,
pages 9-15, 1997.
7. S. J. Lim and Y. K. Ng, An automated approach for retrieving hierarchi-
cal data from HTML tables. In Proceedings of the Eighth International
Conference on Information and Knowledge Management, pages 466-474,
1999.
8. T. Guan and K. F. Wong, KPS: a Web Information Mining Algorithm.
Computer Networks 31(11-16): 1495-1507, 1999.
AUTOMATED INFORMATION EXTRACTION FROM WEB PAGES
USING AN INTERACTIVE LEARNING AGENT

JUGAL K. KALITA AND PARITOSH ROHILLA


Dept. of Computer Science, University of Colorado, Colorado Springs CO 80933, USA

Due to the dynamic nature of the Web the layout of information on a Web page can change
often. If a comparison-shopping agent relies on the programmer to detect changes in the
layout and change the information extraction algorithms accordingly, the agent's efficiency
and accuracy are compromised. The process of manually changing code is cumbersome. In
addition, an agent built with hard-coded logic specific to a Web site works only for that
domain. We have built a GUI based system, which enables the agent to learn to extract
product information from a Web page. The algorithms use machine learning to help make the
agent generic and easily adaptable to various product domains. We avoid any hard coding. In
addition, the system is able to learn the desired information based upon just few training
samples. Such a capability enables adding new sites for a product category relatively easy.

1 System Overview

The following steps are involved in semi-automatically extracting relevant


information from Web pages for comparison-shopping:
1. A structure for the relevant information on Web pages needs to be specified.
2. The learning engine needs samples that fit the structure defined.
3. From the training samples, the learning engine produces extraction rules.
4. The extraction rules are applied to Web pages to extract relevant information.
These results also determine if more training samples are necessary.
5. The rules learnt can be fine-tuned if the Learner cannot capture all details.
Our system has two programs, namely the Learner and the Extractor. Both programs
interface with a common database. The Learner has modules for the first three tasks.
The Extractor handles the last two.
Our approach uses the inherent structure of tags and syntactic properties of
plain text to infer rules. Our approach differs from published approaches based on
wrappers or other techniques [1,2,5]. The entire page rather than being broken up
into tokens is converted into a document tree. The tree is made up of tags and plain
text nodes. The Learner tries to identify a node of interest by exploiting the
properties of this tree and the plain text nodes.

2 The Learner

The rules learnt by the Learner for a particular page are stored in the database. The
Extractor uses these rules to extract records from target Web pages.

300
301

2.1 Structure specification


Most Web pages that provide information about products have an inherent structure.
8ook This segment of the page can be thought of
as consisting of several records. A record is
a group consisting of coherent pieces of
related information [3]. Thus each record
TJUB Discount SollPriiB Author ISBN
has several fields. Out of the fields a record
Figure 1. Desired record structure
has, we may be interested only in a few
selected ones. For example, the relevant fields of a record (say, book) may be like
the ones shown in figure 1 .While defining the structure of the records, we can mark
some fields as mandatory. For example, in constructing a comparison-shopping
agent for books, the mandatory fields for each record can be Title and Price. Besides
providing names for various fields of the record structure, it may be helpful to also
provide additional information such as the data type of the field. The extraction
algorithm uses the data type information to reject nodes that do not match.

2.2 Providing training samples


Training samples shown to the Learner are records contained in Web pages. Several
Web pages
!
£k representative
« K93
Arti f i r m l
of the same
Tn( i l l i ' j e n t e
714
722
:79.95:
:7*:
record structure Stattanwig j
731
744
:74.50:
are obtained.
747 : Bo d e n , M a r g a r e t . A . :
OMtt'tflv j
The Learner
761
7G4
7G6 :Detail:
has a GUI e<it |

786
796
: 5. :
: Artificial
interface, which
Intelligence:
B07 :24:
eis :31t: facilitates the
824 :16.7S:
837
840 : P e r c y , R o b e r t L. :
learning
843
854
:,
process.
ISBH:0531117S7X , War 0 0 , MATS :
Buy
A
/
857
859
879
Detail
6
screen shot of
jd
889 Artificial the GUI is
Intelligence
fll M
shown in figure
L**f«g«te$1SaQK£

9-
fiseft fwfaitf***
&«*«<)&
r
fcfcto*
2.
VA*
A user loads
dffitawdbtli
i$&i )7 in
i J KBN0171G19G4B M a y X ACAD JlSBK|

sample pages
He»i' j n j j $ i * « i .UttJ one at a time.
Figure 2. The Learner GUI
Once loaded, a
sample page
looks like a text file without tags. The entire Web page is first converted into a
document tree. The plain text nodes, which appear in the display area, are indented
according to their depth in the tree. The indentation gives a feel of rendering. This
can help the human trainer recognize record boundaries. Every node in the tree is
302

given a node number by traversing the entire tree in a depth-first fashion. The
numbers on the left show the node numbers assigned to the text nodes. The GUI has
a form below the display area to show where various fields of the record appear on
the page. The learning process begins by the GUI prompting the user to show fields
of the record. The prompting continues until the trainer is satisfied that all possible
variations of the record structure have been shown.

2.3 Generation of extraction rules


Extraction rules are learnt for every element of the record. Key properties of the
document tree are utilized to formulate rules. Figure 3 shows a document tree
corresponding to parts of a Web page. Our goal is to develop rules to extract fields
chtral-' that
comprise
records. The
document
tree for any
Web
document
containing
several
records
shows a
number of
Artificial HeSgcac* 71J0»«dtB,Mtt;.. interesting
Figure 3. A document tree features:
1. Most records in the tree have a similar pattern. The fields show common
properties across records, e.g., they are at the same depth, and have parent
nodes with the same tag.
2. Every node in a document tree has a unique node number. This uniqueness
property helps avoid conflicts while grouping identical looking fields into their
respective records.
3. All plain text nodes show up as leaf nodes. Thus, the problem of identifying a
field now narrows down to identifying the appropriate leaf node.
The following information is gathered for all the fields of various sample
records that are shown by the trainer:
• The depth of a node in the document tree is recorded. Since all records have an
identical pattern, it is very likely that this field is always at the same depth
across all records.
• For each field, we find the sequence of tags, starting from the root. For
example, in Figure 3 the node containing "Artificial Intelligence" has the tag
sequence "html;body;table;tr;td;b".
303

• The relative position of a field, the difference between its node number and the
node number of the first field in the record, is recorded.
• Any number of word(s) or characters) that stay constant across all records of a
field are keywords for that field. Keywords can help in resolving ambiguity.
• Any number of word(s) or characters) that should not be part of the plain text
of the field are classified as omitwords for that field. Any plain text nodes
matching the omitwords are ignored at extraction time.
• The entire text associated with the field is also stored. We attempt to infer
characteristics of the field by examining the text of a field across all records.
For example, we can find the average size of the text in the field.
The rule generation algorithm uses all of the information gathered above to
formulate rules for each field of the record.

3 The Extractor

The Extractor extracts and displays the records from the loaded document. The
trainer specifies the rule set to be applied to the document. Having two GUIs, one
for the Learner and the other for the Extractor helps the trainer to immediately view
results of the samples that he provides to the Learner. Based upon the results he can
either stop the learning process or continue to provide more samples.

3.1 Applying Extraction Rules

The extraction module is a rule-based deduction system [4]. We have established the
following general antecedent-consequent rules for each field of the record structure:
• if depth of node = learned depth A tag sequence of node = learned tag
sequence then node is a candidate node.
• if node is a candidate node A node text has the specified data type A node has
learned keywords A node doesn't have learned omitwords A text length is
between min and max values then node belongs to the field.
The extraction process follows a bottom up approach to form records. This approach
helps deal with records that do not have all fields. Every node that qualifies as a
field is extracted from the page, irrespective of the record it belongs to. The
extracted fields are then grouped together into records.

3.2 Rule refinement


The Extractor GUI provides a facility to look at the rules and make manual changes.
This facility is handy if the algorithms are unable to calculate proper values for the
rules. It is advisable that the trainer does not use this feature often.
304

4 Experimental Results

We selected 11 Web sites that sell products belonging to different product


categories. We filled product search forms on each of these Web sites and obtained
pages showing product listings. We then ran our Learner and Extractor programs.
Table 1 shows the results from some of our experiments.
Table 1. Experimental Results

W* Time Total Recoms Reconb Wnms UMOinplefe Fin Recall Pndsifln


Site (Mm) Expected Extracted Reconls Recomt Tunc % %
Borders 15 4 182 182 0 0 Y 100 100
6 3 55 10 0 7 N 18 100
Noble 4 4 55 45 0 7 N 81 100
4 5 55 55 0 0 Y 100 100
Book 9 4 157 120 0 48 N 77 100
Closeouts 10 4 137 157 0 0 y 100 100

All online stores considered have simple record structures. The time that a
trainer can spend trying to get our system to learn to extract the records can vary
between 15 - 40 minutes. These also include the time it took to count and determine
if the extracted records were incomplete or wrong. The experiments indicate that
given sufficient amount of time our system can achieve a recall rate of 100% for all
stores. The precision is 100%. On almost all Web sites, the rule refinement involved
changing only the minimum and maximum values for the length of the text that can
appear in a field. We think this overhead could have been reduced by more careful
selection of sample records. For some Web documents we were able to achieve a
recall rate of over 75% without fine-tuning. The final rules that were learnt show a
very impressive recall and precision rate.

References

1. Berd Adelberg. NoDoSE - A tool for Semi-Automatically Extracting


Structured and Semistructured Data from Text Documents. In SIGMOD '98, pp
283-294, Seattle, WA, June 1998.
2. Berthier Ribeiro-Neto, Alberto H.F. Leander, Altigran S. da Silva. Extracting
Semi-Structured Data through Examples. In CIKM-8, pp 94-101, Kansas City,
MO, November 1999.
3. D.W. Embley, Y. Jiang, Y.K. Ng. Record Boundary Discovery in Web
Documents. In SIGMOD'99, pp 467-478, Philadelphia, PA, June 1999.
4. Patrick Henry Winston. Artificial Intelligence. Addison-Wesley, 1992.
5. Wai-Yip Lin, Wai Lam. Learning to Extract Hierarchical Information from
Semi-structured Documents. In CIKM-9, McLean, VA, November 2000.
AN INTELLIGENT AGENT WITH STRUCTURED PATTERN MATCHING
FOR A VIRTUAL REPRESENTATIVE

SEUNG-IK LEE AND SUNG-BAE CHO


Computer Science Department, Yonsei University, 134 Shinchon-dong, Sudaemoon-ku, Seoul
120-749, Korea
E-mail: cypher@candy.yonsei.ac.kr, sbcho@csai.yonsei.ac.kr

We propose a conversational agent that can act as a virtual representative of a web site in-
teracting with visitors using natural languages. The agent consists of three main components:
dialogue act categorization, structured pattern matching, and knowledge construction and repre-
sentation. Dialogue acts (DAs) are classified by automata which accept sequences of keywords
defined for each of the DAs to identify the user's intention. Structured pattern matching is used
for matching the queries with responses rather than the conventional natural language process-
ing techniques. To show the usability and possibility, this agent is applied to the introduction
of a web site. The results show that the conversational agent has the ability to present more
adequate and friendly responses.

1 Introduction

Conversational agents have been focused recently because they can have conversa-
tions with users in natural languages and thus provide accurate information of a web
site and respond quickly with friendly interaction. One of the first conversational
agents, called Eliza, was born at Massachusetts Institute of Technology in 1966.
Eliza was contrived for the research on natural language processing. This agent uses
simple pattern matching technique l. ALICE (Artificial Linguistic Internet Com-
puter Entity, http://www.alicebot.org) is written in a language called AIML (Artifi-
cial Intelligence Markup Language) that is based on XML. A new idea in ALICE
is to tailor the conversation for categories of individual, mainly through attempts to
determine the client's age, gender, geographic location and occupation.
However, most of the conversational agents have shortcomings in that they do
not consider user's intention because of simple sequential pattern matching based on
keywords. This paper aims to develop a conversational agent that identifies user's
intentions and utilizes them in matching the corresponding response.

2 Conversational Agent

The conversational agent we propose identifies the intention of a query and responses
in natural languages, both Korean and English. A user query is preprocessed for the
correction of typos and replacement of synonyms and put into the DA categorization
process, which classifies it into categories of dialogue acts (DAs) 2 ' 3 ' 4 and extracts

305
306

Table 1. Dialogue acts

User input Dialogue act


Primary question Ability, Description, Fact, Location, Method,
Miscellaneous, Obligation, Reason, Time, Whatlf, Who
Secondary question Acquisition, Comparison, Confirmation, Cost,
Direction, Example, More, Possession
Primary statement Act, Fact, Message, Miscellaneous, Possession,
Status, Want
Secondary statement Cause, Condition, Feeling, Time

keywords for each DA. These DA, keywords, and preprocessed query, are used to
match the most appropriate response in a knowledge database called script.

2. / Dialogue Act Classification


For the identification of the intentions of a query, thirty domain-independent DAs
are defined as in table 1. Each DA is identified by a corresponding automaton con-
structed on keywords and their sequential information as in fig. 1, which shows an
automaton that classifies a query into 'Location' DA. A special meta character '#' is
used to allow the ending of a word to vary according to subject, tense, etc. A query

LOCATION =state, county, city, street, town,


neighborhood, county, nation, location

Figure 1. Automata for location question


307

List 1 A Part of a Script


TOPIC "location of lab"
IF ((7LOCATIONQUESTION OR 7DIRECTIONSQUESTION) AND
HEARD ("lab#", "softcomputing", "soft"&"computing")))
THEN SAYONEOF
ITEM "It is located at the 3rd engineering ..."
ITEM "529, the 3rd engineering building, ..."
ITEM "The 3rd engineering building in yonsei ..."
DONE
ENDTOPIC

is classified into only one of the DAs of primary category whereas several DAs can be
assigned in case of secondary category. List 1 shows a part of a script. When a user
asks the location or direction of something and "lab#," "softcomputing," or "soft"
and "computing" appears in the query, one of the items below the "SAYONEOF" is
randomly selected and presented as a response to the user.

2.2 Structured Pattern Matching


Scripts are interpreted and loaded into memory by script interpreter when the agent
starts. The condition part in each topic is transformed into a Boolean expression and
the keywords listed in the condition part are transformed into a regular expression
as in Fig. 2. The procedure of matching a query to a response is as follows. The
condition part of each topic is compared with the query, DAs, and the keyword list.
Different types of matching components, like DAs, keywords, or Boolean opera-
tors, are assigned different scores and the highest topic is selected and its action is
performed to respond to the query.

3 Experimental Results

To show the conversational capability of the agent, it is applied to the introduction of


a web site. Fig. 3 (a) shows a response of the agent to a query concerning the location
of the laboratory and (b) shows a somewhat different response to a location question
of the same intention. As can be seen in Fig. 3 (a) and (b), the agent shows intelligent
behaviors by responding a bit differently to the queries of the same intention.
What follows next shows a stream of conversations between a user and the agent.
The agent is able to interact with users naturally presenting the requested information
to them.
User : who a r e you?
308

Script Interpreter

IF ((7LOCATIONQUESTION OR SAYONEOF
7DIRECTIONSQUESTION) ITEM "It is located at the 3rd engineering
AND HEARD ("lab", "sottcomputlng" "soft" & building in yonsei university"
"computing"))) ITEM "529, the 3rd engineering building,
134, yonsei university, shinchon-
dong, seodaemoon-gu, seoul"
ITEM "The 3rd engineering building in
yonsei university"

Figure 2. Internal representation of the script in List 1

. , , . _ n your query: tell me where your lab is


your auery : where Is your laboratory?
D
Chatbot ms»r: It Is located at the 3rd engineering building In OTsei university "tbot an8,er:
** M
engineering building In yonsei university

(a) Location question 1 (b) Location question 2

Figure 3. Questions of the same intention

Agent : I am a virtual representative of this site to help y


find what you want. So, what do you want to know?
User : where is your laboratory?
Agent : It is located at the 3rd engineering building
in yonsei university
User : hmm, can I ask your name?
Agent : I am Walwalri.
309

User : I want to know name of your professor


Agent : My professor's name is Sung-Bae Cho.

4 Conclusion

In this paper, we have constructed a conversational agent that can give responses
to the queries of users in natural languages. The agent can accept queries in both
Korean and English, and give responses more consistently with the user's intention.
This consistency originates from identifying the user's intention by the classifica-
tion of DAs and applying them to the structured pattern matching. Furthermore, the
conversational agent has an advantage of making users feel natural and friendly in
finding information because of mutual interactions with natural language support.
As further works, we plan to study on the automatic construction of scripts from
web pages in order to reduce the time and effort of the construction of the scripts.
Maintaining contextual information in a conversation is another research topic to
guarantee more intelligent and consistent interactions. Finally, giving the initiative to
both sides could make the conversation more natural than the current implementation
of the initiative given to only users.

References

1. Weizenbaun, J., ELIZA - a Computer Program for the Study of Natural Lan-
guage Communication between Man and Machine. Communications of the
ACM 9(1). (1965) pp. 36-45.
2. Austin, J.L.: How to do Things with Words. (Clarendon Press, Oxford, 1962).
3. Stolcke, A. et. al, Dialogue Act Modeling for Automatic Tagging and Recog-
nition of Conversational Speech. Computational Linguistics 26(3). (2000) pp.
339-373.
4. Core, M.G. and Allen, J.F., Coding Dialogs with the DAMSL Annotation
Scheme. Working Notes of the AAAl Fall Symposium on Communicative Ac-
tion in Humans and Machines. (1997) pp. 28-35.
5. Brooks, R.A., A Robust Layered Control System for a Mobile Robot. IEEE
Journal of Robotics and Automation. (1986) pp. 14-23.
6. Cho, J. et. al, Efficient Crawling through URL Ordering. Proceedings of the
7th International Conference on the World Wide Web. (1998) pp. 161-172
A C A L E N D A R M A N A G E M E N T A G E N T W I T H FUZZY LOGIC

WAYNE W O B C K E
Department of Information Systems
University of Melbourne, Parkville VIC 3052, Australia
E-mail: wobcke@staff.dis.unimelb.edu.au

In this paper, we present a prototype calendar management assistant that uses


fuzzy logic for the representation and satisfaction of user preferences. The system
uses ideas from fuzzy scheduling and more loosely from "granular computing" to
schedule actions at two different levels of temporal granularity: the task level and
the activity level. Tasks are specific actions that have a set start time and are
performed throughout a single timeslot (such as a meeting), whereas activities are
processes that may not have a fixed start time and which may be spread over
many time periods (such as writing a research paper). Users specify start times,
deadlines and durations using terms such as morning, late afternoon and around 1
hour which are interpreted as fuzzy functions: the scheduler must produce a valid
schedule that maximizes the overall satisfaction of the user's preferences. The
present version of the system is also able to schedule plans of tasks or activities
(networks of tasks or of activities ordered into an acyclic graph) in which fuzzy
constraints between actions may be specified.

1 Introduction

Adopting the "personal assistant" metaphor promoted by Maes, 1 we present


in this paper a prototype personal diary assistant that aims to help its user
with the "time management" function of diaries by scheduling personal tasks
and activities. The system is personalized in taking into account user pref-
erences when scheduling actions: it makes use of fuzzy logic4 in allowing the
user to specify preferences for the start time, duration and deadline of an
action using simple natural language expressions such as morning, late after-
noon, next week, around I hour, etc., each interpreted by the system using a
predefined fuzzy function. The present version of the system is also able to
schedule plans of tasks or activities in which fuzzy constraints between actions
may be specified by the user.
The assistant schedules actions at two levels of temporal granularity, this
idea being related to Zadeh's notion of 'information granularity', Zadeh.5 The
task level concerns specific actions that have, after scheduling, a set start time
and which are assumed to be performed throughout a single timeslot, e.g. a
(multi-agent) meeting. By contrast an activity is a process that is not assumed
to have a fixed start time, even after scheduling, and whose execution may be
spread over many time periods, e.g. writing a research paper.

310
311

Task scheduling is based on a standard "appointment diary" divided into


half-hour slots into which tasks are allocated one or more consecutive slots,
while activity scheduling operates at a different level of temporal granular-
ity, dividing the day up into four fixed 2-hour periods, and scheduling "work
sessions" of an activity over a number of such periods. Moreover, the inter-
pretation of such "scheduling" is much more liberal: by allocating a timeslot
to a task, it is understood that the user works on that task throughout the
timeslot; however, with an activity, which is typically more easily interrupt-
ible, it is understood only that the user works on the activity in a given period
for the length of time allocated to the activity for the period (making no as-
sumption about when, during the period, the user works on the activity). A
major function of the system is to maintain consistency between the different
views of the diary resulting from these two levels of temporal granularity.

2 Task Layer

The interface to the task layer is designed to look like a standard appointment
diary, with each day divided into half hour slots, as illustrated in Figure 1.
The duration of each task is given in parentheses beside its description. The
dashed lines indicate that there are activities scheduled for the corresponding
period, the length of the lines giving the user an idea of how much time is
allocated to activities during a period; this is explained further in section 3.

Figure 1. Appointment Diary Interface

The system uses two scheduling methods in sequence: a depth-first


"greedy" scheduler that uses heuristics to find an allocation of timeslots to a
set of tasks as efficiently as possible, then a local search "hill climbing" sched-
312

uler that makes minimal adjustments to efficiently improve a given scheduling


solution. There are two sorts of scheduling heuristics: for denning (i) the or-
der in which tasks are considered by the search and (ii) the order in which
assignments to tasks are explored. First, tasks are ordered according to any
plan dependencies," then from most constrained to least constrained. Second,
when scheduling a single task, the possible timeslots are explored in order of
degree of preference satisfaction. When there are no valid timeslots for a task
prior to the commencement of scheduling but the day for the task is known,
local search is invoked in order to increase the size of any gaps in the day
schedule so as to create a slot for the task, moving up to three tasks forwards
or backwards, and compressing a task that has a fuzzy duration (possibly in
conjunction with moving up to two preceding or succeeding tasks). Meetings
between multiple agents are scheduled using the "hierarchical" protocol of
Sen and Durfee2 adapted to handle fuzzy preferences in conjunction with user
profiles, as described in Wobcke and Sichanie.3
For efficiency, the first solution found by the scheduler is adopted, even
though this may not be optimal. The scheduler then applies local search in
order to improve the initial solution by making small changes to the task
schedule for a single day. The heuristic employed is to consider moving a
sequence of up to three tasks forwards or backwards by half an hour. The
aim is to minimize changes to task sequences in the user's schedule. The
typical case where the use of local search improves the solution is when there
are a number of relatively unconstrained tasks that require scheduling.

3 Activity Layer

The idea behind the activity layer of temporal abstraction is that not all
actions a user may want to enter in a diary are tasks. Many are processes
that may be spread over a number of days or weeks (or even months). Our
aim is to provide some assistance to the user with "time management" for
these types of activities. An activity should be "scheduled" at a higher level
of granularity than tasks, but only in the loose sense that it is allocated some
amount of time in some time periods. The idea is that this will enable the
user to be sure the activity can be completed before its deadline, given the
other tasks and activities in the user's diary. We call a part of an activity to
be executed in a time period an activity session, and a collection of activity
sessions (for a number of activities) an approximate schedule.

"In keeping with the emphasis on efficiency, plans have a restricted structure, essentially
enabling a tree-like set of dependencies to be constructed.
313

The use of a separate layer of temporal granularity necessitates a new


interface and this is based on some assumptions about the particular grain
size for activities that will be useful. We work with activities that are of week
level granularity and which are divided into sessions of up to 2 hours duration
by the scheduler. This is partly inspired by the standard "week-to-view"
diary, which also forms the basis of the interface design. A sample screen,
compatible with the task view from Figure 1, is shown in Figure 2.

Figure 2. Week-to-view Diary Interface

Each activity has a preferred work period and preferred work day, both of
which may be fuzzy expressions such as morning or next week. The scheduler
allocates a number of periods to each activity. The value of a period is the
average, over the free timeslots in the period, of the degrees to which the
timeslots meet the given preference.
A "capacity check" must be carried out for each period proposed by the
scheduler for an activity session. The simplest form of capacity check is to
ensure that the user has sufficient free time in the period to allocate to the
new activity session. However, this check is more complicated if activity
deadline(s) fall within a period. It is now assumed that the user can optimally
distribute time from the period to the different activity sessions. This enables
the scheduler to treat the activity sessions as if they were discrete tasks, and
to determine the value of a period by computing the ordering of these sessions
that maximizes the degree to which all deadlines are met, for each ordering
using the earliest end time for an activity as the basis against which the fuzzy
deadline function is evaluated.
Activity scheduling is based on task scheduling, although instead of
allocating a single timeslot to a task, a set of periods is allocated to an activity,
periods being allocated in order of satisfaction of the user's preferences. For
plans of multiple activities, the scheduler uses depth-first search driven by the
plan's dependency structure, although for efficiency reasons, no backtracking
is performed. Activity rescheduling is invoked whenever a task is to be added
to a period in which there is insufficient capacity.

4 Conclusion

We have described a calendar management assistant that uses fuzzy logic for
the representation and satisfaction of user preferences. The system operates
at two complementary levels of temporal granularity: scheduling tasks (in
timeslots) and activities (in larger time periods). The scheduler uses depth-
first search with heuristics for ordering the actions and the possible timeslots,
and uses local search for improving an initial solution so obtained. The
assistant is also able to schedule plans of tasks or of activities that may include
fuzzy constraints, and includes a "hierarchical" protocol for meeting
scheduling between multiple agents.

Acknowledgements

This work was carried out at British Telecom Laboratories in the United
Kingdom. We gratefully acknowledge the contribution to research and devel-
opment on the project made by Ben Azvine, David Djian, K.C. Tsui, Simon
Case, Heather Maclaren, Gilbert Owusu and Arash Sichanie.

XML BASED MULTI-AGENT COLLABORATION FOR ACTIVE DIGITAL
LIBRARIES

YANYAN YANG, OMER F. RANA AND DAVID W. WALKER


Department of Computer Science, Cardiff University, PO Box 916, Cardiff CF24 3XF, UK
E-mail: L. Yann@cs.cf.ac.uk

ROY WILLIAMS
Center for Advanced Computing Research, California Institute of Technology, CACR 158-79,
Pasadena, CA 91125, USA

GIOVANNI ALOISIO
Department of Innovative Engineering, University of Lecce, Italy

Intelligent and automatic processing of the distributed data that efficiently supports scientific
collaboration between both professional and casual users is a highly demanding task. It is also
particularly challenging when the system must cope with active data that is processed on-
demand. As part of the ongoing SARA Digital Library project, the research presented here
proposes an intelligent mobile agent approach to on-demand processing of remote sensing
data. We discuss the agent-based infrastructure that we have developed. The design,
architecture and implementation of a prototype system that applies this approach are reported
on here. In this experiment, the SARA system utilises cooperative software agents for data
access and analysis and uses XML to model metadata and support agent communications on
clusters of servers. Although the examples presented are mainly based on the SARA system,
the applicability of the proposed techniques to the potentially more rewarding active archive
system should be obvious. In particular, we believe the proposed agent design can allow
distributed access, concurrent querying, and parallel computing over multiple heterogeneous
remote-sensing archives in a modular and scalable fashion.

1 Introduction

The general problem of managing large digital data archives is particularly
challenging when the system must cope with active data which is processed on
demand [2]. Active data represents data that is dynamically generated by a scientific
experiment, or may be obtained from a sensor or monitoring instrument. The
amount of digital spatial data available is growing rapidly. In particular, there is a
vast amount of data from earth observation satellites. This presents a challenge for
the development of software systems to enable the storage, management and
dissemination of these huge datasets in on-line data archives or digital libraries.
Ideally, such a system should provide efficient, on-demand remote access to these
datasets over the Internet, so that authorized users can easily access and utilize the
data for a variety of applications including geology, image registration, resource
monitoring, etc. For a number of spatial applications, such as satellite imagery, the
processing requires high-performance compute servers. In addition, scientists often
require integrated access to information combining retrieval, computation, and
visualization of individual or multiple datasets. Scientific collaborations are already
distributed across continents, and software to enable these work groups will become
increasingly vital. It will be necessary for human interfaces to these archives to
become more simple to use and flexible. In the scientific world, scientists need to
deal with both data-centric and process-centric views of information. While it is
important to have access to information, often it is also important to know how the
information was derived. Hence, the scientist should have a technological
infrastructure that can intelligently and automatically process the distributed data,
thereby transforming the processed data into useful knowledge.
SARA is an active digital library of multi-spectral remote sensing images of the
earth, and provides web-based on-line access to such images. As part of the
ongoing SARA digital library project, this paper describes a collaborative effort to
explore an XML- and agent-based framework for the distributed management and
analysis of remote sensing archives. We believe our proposed techniques suggest
useful guidelines that go beyond the SARA system. Our results provide further
evidence of the utility of the mobile agent approach for active archive systems.

2 The multi-agent system

The SARA service is maintained by Caltech and the University of
Lecce [1]. The data are maintained in different kinds of file systems, such as SUN
NFS and IBM/Livermore HPSS, and delivered using a web front end. The web interfaces
act as an integration tool for combining different server implementations. At
present, a SARA user must telnet to the center hosting the parallel computer, select
computing resources to use, and perform the image processing activity. The results
of computation, image or text are written to web pages and examined with a
browser enabling further data to be imported and processed if desired.
The current approach is very restrictive on the kinds of users that can access
and manipulate data in the SARA, as the client may not have permission or the
relevant software to analyse data. Also, CGI scripts are stateless, so a user
cannot make one request depend on the outputs or conditions generated from a
previous one.
A multi-agent system, which comprises both intelligent and mobile agents, has
been developed to manage and analyze distributed multi-agency remote sensing
data. Compared with other agent applications, our emphasis is on the use of the
mobile agents to transfer analysis algorithms to image processing archives. We
localize the most complex functionality in non-mobile Local Interface Agents
(LIA), which remain at one location, communicate with the mobile User Interface
Agents (UIA), and provide resources and facilities to lightweight mobile agents that
require less processor time to be serialized, and are quicker to transmit. Each agent
is responsible for offering a particular type of service, and the integration of
services is based on a user specification. SARA mobile agents are persistent, and
can wait for resources to become available. Agents allow the delivery and retrieval
of data to complete without user monitoring or recovery actions.
There are two types of User Interface Agents: User Request Agents (URA)
and User Assistant Agents (UAA). URA supports the user in creating a query or
operation to perform on the SARA data. UAA manages the information of the user
and provide control functions to the user, such as updating their file space on a
remote server, and parameter settings for their visualization tool. There are many
types of Local Interface Agents: a Local Assistant Agent (LAA) supports
interaction with any visiting User Request Agents (URAs) by informing them about
the available data and computing resources, and cooperating on the completion of
the task carried by the URA. A Local Management Agent (LMA) coordinates
access to other LAAs and supports negotiation among agents. It is responsible for
optimizing itineraries of mobile URAs, to minimize the bottlenecks inherent in
parallel processing and ensuring that the URA is transferred successfully. A Local
InteGration Agent (LIGA) provides a gateway to a local workstation cluster, or a
parallel machine. A Local Retrieval Agent (LRA) can translate query tasks and
performs the actual information retrieval from the local archive. In addition to
retrieval, a LRA may also perform other operations. For instance, it may save the
results to a file before sending it to the user. A Local Security Agent (LSA) is
responsible for authenticating and performing a validation check on the incoming
URA. The URA will be allocated an access permission level. Agents from
registered users may use, and have access to, more information resources than
agents from unregistered users.

3 XML-based agent communication and data specification

3.1 XML-based agent communication


It is essential that agents used to access heterogeneous remote-sensing data
archives communicate and co-operate with each other in order to provide service
and satisfy user requests. A simple way to do this is to define an interaction protocol
for communication in the particular problem area. How best to represent such a
protocol and define a standard message format with meaningful structure and
semantics has become a key issue.
We propose an XML schema for agent communication that enables agents to
communicate with each other by expressing intentions in the SARA ontology.
Our XML schema allows efficient parsing and is modular and extensible to support
evolving classes of XML documents. In addition it retains its simplicity and clarity,
and is readable by the user. Each message has a standard structure, showing the
message type, context information, message sequence, and the body of the message.
Autonomous agents cooperate by sending messages and using concepts from the
SARA ontology, which describes terms and concepts (such as a Track, a
Latitude/Longitude coordinate, etc.) and their inter-relationships. We represent
the ontology by listing terms, their meanings and intended use in the Document Type
Definition (DTD). Every specific XML specification is based on a separate DTD
that defines the names of tags, their structure and content model. A DTD can define
elements, attributes, types, and required, optional, or default values for those
attributes. While the XML specification contains the structured information, the
DTD defines the semantics of that structure, effectively defining the meaning of the
XML-encoded message.
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT message (context+, content+)>
<!ATTLIST message
type (request | response | failure | refuse) #REQUIRED
date CDATA #IMPLIED
id CDATA #REQUIRED
>
<!ELEMENT context EMPTY>
<!ATTLIST context
sender CDATA #IMPLIED
receiver CDATA #IMPLIED
originator CDATA #IMPLIED
returnby CDATA #IMPLIED
>
<!ELEMENT content (itinerary+, querydef?, results)>
<!ELEMENT itinerary (server)+>
<!ELEMENT server ((Cardiff | Lecce | Caltech), server2?)>
<!ELEMENT server2 (Cardiff | Lecce | Caltech)>
<!ENTITY query SYSTEM "query.xml">
<!ELEMENT querydef (&query;)+>
<!ELEMENT results (#PCDATA)>

Figure 1. A DTD for Agent Message Communication

Message type represents intentions such as request, response, failure, and refuse
explicitly and allows the system to monitor and control the progress of the
interaction. For example, we can define a message for a request to search for tracks,
and another message for information passing to return tracks.
Context is used to identify the sender, the intended recipient of the message or
originator for forwarded messages, using some form of local, regional, or global
naming scheme. Returnby sets a deadline on the user's waiting time.
Content defines the itinerary of the agent and the user's request wrapped in XML, as
well as the form in which results are returned.
We define a set of DTDs for agent communication in the SARA system that
specifies all of the legal message types, constraints on the attributes, and message
sequences.
In XML-based messages, agents encode information with meaningful structure
and commonly agreed semantics. On the receiving side, different parts of the
information can be identified and used by different services. Agents may use XML
to explain their beliefs, desires, and intentions (BDI). Moreover, a mobile agent can
carry an XML front-end to a remote data archive for data exchange, where both
queries and answers are XML-encoded. We have currently identified various types
of messages for agent interaction, such as upa-ura messages, ura-lia messages, and
lia-upa messages. Messaging is performed synchronously, so the URA is
launched as soon as a message is received from the UPA. A lia-upa message is sent
from a LIA to a related UPA when the tasks are finished. In our system, we use the
JAXP interface to XML developed by SUN that supports SAX and Document
Object Model (DOM).
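For illustration, a message with the structure of the DTD in Figure 1 could be assembled as in the following Python sketch (the prototype itself uses Java with SUN's JAXP; the function name and attribute values are hypothetical, and the server element is simplified here to carry the site name as text rather than nested site elements):

```python
import xml.etree.ElementTree as ET

def make_request(msg_id, sender, receiver, servers, query_file):
    """Build an agent request message following the message/context/content
    structure of the DTD (an illustrative sketch, not the prototype's code)."""
    msg = ET.Element("message", type="request", id=msg_id)
    ET.SubElement(msg, "context", sender=sender, receiver=receiver)
    content = ET.SubElement(msg, "content")
    itinerary = ET.SubElement(content, "itinerary")
    for s in servers:                      # ordered itinerary of data servers
        ET.SubElement(itinerary, "server").text = s
    ET.SubElement(content, "querydef").text = query_file
    return ET.tostring(msg, encoding="unicode")

xml_text = make_request("msg-001", "URA-7", "LMA-Cardiff",
                        ["Cardiff", "Caltech"], "query.xml")
```

On the receiving side, a LIA would parse such a message, dispatch on the `type` attribute, and hand the itinerary and query definition to the appropriate local agents.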

3.2 XML-based data specification


In our system we use XML to encode system structure as metadata. The
metadata base consists of four tables. The Track table houses information about the
track image such as its name, date of acquisition, unique id, width, height, and
number of channels. The Coords table contains the latitude and longitude
coordinates of the four vertices of the image. In the File table the filenames of the
files constituting the image are recorded, and finally the Stored table contains the
information about where the image is actually stored, that is, one of the data servers
that compose the distributed digital library.
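The four-table layout might be pictured with the following sketch (the field names and values are illustrative assumptions; the real metadata base is XML-encoded rather than held as in-memory Python structures):

```python
# The four metadata tables, sketched as rows (illustrative fields only).
track  = [{"id": 1, "name": "track-42", "date": "1999-07-01",
           "width": 1024, "height": 1024, "channels": 4}]
coords = [{"track_id": 1, "vertices": [(34.1, -118.3), (34.1, -117.9),
                                       (33.8, -117.9), (33.8, -118.3)]}]
files  = [{"track_id": 1, "filename": "track-42.ch1"},
          {"track_id": 1, "filename": "track-42.ch2"}]
stored = [{"track_id": 1, "server": "Caltech"},
          {"track_id": 1, "server": "Lecce"}]

def locate(track_name):
    """Join Track -> File and Track -> Stored to find which files make up
    an image and which data servers of the distributed library hold it."""
    for t in track:
        if t["name"] == track_name:
            tid = t["id"]
            return ([f["filename"] for f in files if f["track_id"] == tid],
                    [s["server"] for s in stored if s["track_id"] == tid])
    return ([], [])
```

An LRA performing retrieval would resolve a query against this metadata in essentially this way before fetching the files from one of the listed servers.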

4 Implementation and application of the system

We have developed a simple prototype of an agent-based SARA system [4]. A
subset of the LIAs and UIAs has been implemented to provide the basic
functionality for on-demand processing of SARA data. We also implemented the
following functions: locate a mobile agent, send a URA to the specified data server
or compute server, retrieve URA-related information from the management server,
stop the execution of an agent and release the resource held by the agent, and report
agent status.
The current system can be employed in a wide range of application domains,
such as the analysis of multi-temporal images corresponding to changes in the
ecology of a particular region, and studies of environmental pollution. SARA
images can also be compared based on phase and amplitude differences of the
backscatter radiation, to study seismic or volcanic processes, motions of ice-sheets
or glaciers, or other similar geological events. Support for real-time processing can
exploit the frequent overpasses of satellites over a given region in the case of natural
disasters such as forest fires or flash floods. The agent-based approach provides a
useful system for enabling such applications to be more effectively deployed due to
the reasons mentioned above, and involves an integration of data resources, various
types of physical storage media, and various compute servers and analysis
algorithms. A detailed description of agent-based data analysis is given in [3].

5 Conclusions and future work

An agent-based framework for managing access to data, supporting parallel
queries to data repositories, and providing an XML-based data model for integrating
data from different repositories has been outlined.
which undertake specific roles, and mobile agents which can carry analysis
algorithms to data repositories. We use XML to model metadata and support agent
communication. We are currently working on the further enhancement of the user
interface for formulating queries and visualizing data. We are integrating more
sophisticated agents into the system. Another important task for future study is to
investigate system scalability and response times when thousands of agents are
hosted on a single data source.

References

1. Aloisio G., Milillo G., Williams R.D., An XML architecture for high-
performance web-based analysis of remote-sensing archives, Future
Generation Computer Systems 16 (1999) 91-100.
2. Coddington P.D., Hawick K.A., James H.A., Web-based access to distributed
high-performance geographic information systems for decision support, In
Proc. HICSS-32, Maui, January 1999.
3. Rana Omer F., Yang Yanyan, Georgousopoulos Christos, Walker David W.,
Williams Roy, Agent Based Data Analysis for the SARA Digital Library,
Workshop on Advanced Data Storage / Management Techniques for HPC,
Warrington, UK, 23rd - 25th February 2000.
4. Yang Yanyan, Rana Omer F., Georgousopoulos Christos, Walker David W.,
Williams Roy, Mobile Agents and the SARA Digital Library, In Proceedings
of the IEEE Advances in Digital Libraries 2000, Washington DC, Published by
IEEE Computer Society, May 2000.
CHAPTER 5

DISTRIBUTED INTELLIGENCE
AN INTELLIGENT CHANNEL ALLOCATION SCHEME FOR MOBILE
NETWORKS: AN APPLICATION OF AGENT TECHNOLOGY

ELIANE L. BODANESE
Centro Federal de Educação Tecnológica do Paraná, Av. Sete de Setembro, 3165
Curitiba, Parana, Brazil, 80230-901,
E-mail: bodanese@dainf.cefetpr.br

LAURIE G. CUTHBERT
Dept. of Electronic Engineering, Queen Mary and Westfield College - University of London
Mile End Road, London, El 4NS, England
E-mail: laurie.cuthbert@elec.qmw.ac.uk

As the demand for mobile services grows, techniques for increasing the efficiency of channel
usage in mobile networks become more important. Radio resource flexibility is needed to
cope with the limited frequency spectrum available for network operators. The frequency
channel allocation in mobile networks is a complex control problem with specific goals, i.e.,
to minimize the call blocking probability of the network and minimize the delay in channel
assignment. This paper proposes a multi-agent system implementation to control frequency
channel allocation in mobile networks. The internal agent architecture allows base stations to
be more flexible and intelligent, negotiating and co-operating with others to improve the
efficiency of the channel allocation scheme. The simulation results show that the use of agent
technology in controlling the allocation of channels is feasible and that agent negotiation is an
important feature of the system, improving perceived quality of service and the load
balancing of the traffic.

1 Introduction

Mobile networks were first implemented following the specifications of the so-called
cellular concept [1]. This cellular network architecture is composed of hexagonal
cells representing geographic areas. The users, called mobile stations (MS) or mobile
users, are able to start/receive communication while moving inside the cellular
network. Each cell has a base station (BS) which supplies frequency channels to the
mobile stations inside its boundaries. The base stations are linked to a mobile
switching centre (MSC) responsible for controlling the calls and acting as a gateway
to other networks. When a mobile station using a frequency channel reaches the
boundary of the cell, it needs to change its current frequency channel for another
belonging to the neighboring cell; this procedure is known as handoff or handover.
The assignment of frequency channels in the cellular concept is fixed, i.e., the
total number of frequency channels allocated to a network operator is divided into sets
and each set is assigned statically to a cell. The set of frequency channels used in
one cell can be used in other cells distant enough to allow the reuse of the frequency
channels without causing interference problems. These cells are called co-channel
cells and the distance between them is the co-channel reuse distance. The group of
cells using different sets of channels forms a cluster of cells that is called a compact
pattern. The frequency reuse layout of a cellular system is assembled following a
scheme [1] that finds the nearest co-channel cells of any cell of the network. Fig. 1
illustrates the frequency reuse layout of a network with a 7-cell compact pattern.

Figure 1. Frequency reuse layout of a 7-cell compact pattern cellular network.

The frequency channel allocation scheme as introduced in the cellular concept is
called Fixed Channel Assignment (FCA), which is simple but strongly constrains
channel utilization. If the traffic demand in a cell is greater than the number of
nominal channels (i.e., the frequency channels assigned to that cell), all the excess
demand is blocked, regardless of the traffic conditions in other cells. This constraint
is too limiting for mobile networks and several strategies have been proposed to
maximize frequency channel allocation and minimize the traffic blocking rate. The
strategies have been divided mainly into two groups: variants of FCA and Dynamic
Channel Assignment (DCA).
The strategies using FCA differ from the original concept by allowing a cell to
use channels from its neighboring cells, when there are no more available channels
to serve incoming calls or handoff requests. One of the examples is the channel
borrowing scheme with channel locking [2] that borrows a channel from an adjacent
cell, but prevents the use of the borrowed channel in the co-channel cells of the
lender that are near to the borrower, taking into account the co-channel interference
constraints. DCA strategies [3] do not have nominal channels; all channels are kept
in a central pool. When there is a channel request in one cell, the MSC chooses the
appropriate channel that gives maximum channel efficiency taking into account the
co-channel interference constraints. The channels are assigned for the duration of a
call and then they are returned to the central pool or reallocated. A detailed survey
describing DCA schemes and FCA-based schemes, including those with channel
borrowing is given in [3].
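The borrowing-with-locking idea can be sketched as follows (a simplified illustration, not the actual scheme of [2]; the cell representation and the distance function are assumptions):

```python
def borrow_with_locking(borrower, lender, co_channel, reuse_distance, dist):
    """Channel borrowing with channel locking (illustrative sketch): the
    borrower takes one free channel from an adjacent lender, and that
    channel is locked in those co-channel cells of the lender that lie
    closer to the borrower than the co-channel reuse distance, so reuse
    there cannot interfere with the borrowed channel."""
    if not lender["free"]:
        return None                        # lender has nothing to spare
    ch = lender["free"].pop()
    for cell in co_channel[lender["id"]]:  # co-channel cells of the lender
        if dist(cell["id"], borrower["id"]) < reuse_distance:
            cell["locked"].add(ch)         # channel unusable near borrower
    borrower["borrowed"].add(ch)
    return ch

# Hypothetical layout: lender B lends channel 7 to borrower A; the channel
# is locked in B's co-channel cell C (near A) but not in D (far from A).
borrower = {"id": "A", "borrowed": set()}
lender = {"id": "B", "free": [7]}
near = {"id": "C", "locked": set()}
far = {"id": "D", "locked": set()}
co_channel = {"B": [near, far]}
dist = lambda cell_id, borrower_id: 1 if cell_id == "C" else 5
ch = borrow_with_locking(borrower, lender, co_channel, 3, dist)
```

The locking step is what preserves the co-channel interference constraints at the cost of reducing channel availability in the lender's co-channel cells.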
Centralized schemes generally perform better, but the communications
overhead is a real problem, causing several authors to propose modifications to
make the schemes more distributed. One example is the Distributed Load Balancing
with Selective Borrowing (D-LBSB) scheme [2] that performs better than its
centralized version and also outperforms some other existing schemes.
The schemes referenced here are purely reactive algorithms: although they give
better performance than FCA, there is still wasted signaling because of the high
degree of failure under increasing traffic loads. Fully reactive algorithms are not
able to detect the drop in their efficiency, as will be shown in Section 4. In order
to avoid wasting signaling resources, base stations must detect the situations where
the efficiency of the algorithm is low and decide on different ways to improve the
efficiency of the algorithm or to stop its execution. On account of that, we propose a
multi-agent system that allows base stations to be more flexible, negotiating and co-
operating with others to improve the efficiency of the channel assignment scheme.
There is one agent in each base station, which is responsible for the control of the
assignment of channels in the cell where it is located.

2 A Channel Allocation Scheme Based on Agents

Several agent frameworks have been proposed for control management in
telecommunication networks; more specifically, the adoption of a layered multi-
agent control system seems to fit the requirements of distribution, intelligence,
robustness and concurrency of telecommunication network control as we can see in
Somers [4]. A layered arrangement allows levels of co-ordination; each layer is
defined to conduct the control of the network to a certain level of competence. In
our approach, we selected a type of multi-agent system whose layered control is
located inside the agents. The agents were based on a hybrid architecture called
INTERRAP (INTEgration of Reactive behavior and RAtional Planning) proposed
by Müller [5]. This architecture is composed of three hierarchical control layers, a
knowledge base that supports the representation of different abstraction levels of
knowledge and a well defined control architecture that ensures coherent interaction
among the control layers. Each layer has a different functionality; working
concurrently, the layers complete the global behavior of the agent. INTERRAP was designed
to react to unexpected events, to long-term actions based on goals and to cope with
other agent interactions. Its layered control approach and its mechanisms allowing
negotiation among agents made this architecture suitable for optimizing channel
allocation in mobile networks, and it was therefore adopted and adapted for this work.

For the cellular network scenario, the layered control of the agent is structured
to include a reactive layer that is responsible for fast accommodation of traffic
demand, a local planning layer that uses other strategies to optimize the local load
distribution of channels and a co-operative layer, responsible for load balancing
across a larger area. The knowledge base is composed of a world model, which
contains the environment information and everything necessary for the operation of
a channel allocation algorithm; a mental model, which contains the complete
information about the agent, about the use of frequency channels and possibly a
history of traffic load in the cell; and a social model, which has relevant information
about other agents' data.
environment that includes requests for channel allocation from new calls, handoff
requests, borrowing channel requests and orders for locking channels. The actions
performed by the agents on the environment include all execution tasks that
actually allocate, release, re-allocate, lock and lend channels, manage handoffs and
appropriately terminate unsuccessful requests.

3 Functional Specification of the Agent

3.1 Reactive Layer


The Reactive Layer is primarily responsible for the channel assignment and is
basically composed of a FCA algorithm with channel borrowing and channel
locking. The algorithm, called the "Distributed Borrowing Algorithm" (D-BA), is based
on the general behavior of the D-LBSB scheme [2]. Using an algorithm similar to
one of the distributed schemes described in the literature allows a better evaluation
of the performance of the agent and also allows the benefits of the Local Planning
and Co-operative Planning layers to be evaluated against the same reactive scheme.
In the D-BA scheme, when the channel availability in a cell decreases to a
certain threshold, it becomes "hot"; cells above the threshold are "cold" cells.
When a cell becomes hot, it triggers the execution of the borrowing channel
algorithm. The neighboring cells with more channel availability are chosen to lend
channels to the hot cell; they are the lenders. The hot cell expects to get a number X
of channels (determined by the algorithm). The algorithm borrows channels until
the number X is reached or all selected lenders are unable to lend channels
anymore, whichever comes first. Full details of the D-BA scheme are found in [6].
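The triggering and borrowing behavior just described might be sketched as follows (an illustration under assumed details, not the D-BA implementation of [6]: availability is modeled as a count of free channels, and lenders are assumed to lend only down to the threshold so they do not themselves become hot):

```python
def select_lenders_and_borrow(cell, neighbors, threshold, want):
    """When the cell's channel availability drops to the threshold it
    becomes 'hot' and triggers borrowing: the coldest neighbors (those
    with the most free channels) are chosen as lenders, and channels are
    borrowed until `want` channels are obtained or no selected lender can
    lend any more, whichever comes first.  Returns the number obtained."""
    if cell["free"] > threshold:
        return 0                       # cell is cold: no borrowing needed
    lenders = sorted(neighbors, key=lambda n: n["free"], reverse=True)
    got = 0
    for lender in lenders:
        while got < want and lender["free"] > threshold:
            lender["free"] -= 1        # lender gives up one channel
            got += 1
        if got == want:
            break
    return got

# A hot cell (1 free channel, threshold 2) asking for 4 channels from
# two neighbors with 3 and 5 free channels respectively.
cell = {"free": 1}
neighbors = [{"free": 3}, {"free": 5}]
obtained = select_lenders_and_borrow(cell, neighbors, threshold=2, want=4)
```

The point the sketch makes concrete is that the scheme is purely reactive: it fires whenever the threshold is crossed, with no notion of whether borrowing is still worthwhile under heavy load.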

3.2 Local Planning Layer


This layer is responsible for the channel re-assignment scheme. Every time a
channel is released, the Reactive Layer requests a re-assignment decision in order to
improve locally the channel usage. The Local Planning Layer also determines a
departing region r close to the borders of the cell. Mobile users inside this region
are considered as departing from the cell, and they play an important role in the
agent negotiation. Finally, this layer monitors the efficiency of the algorithm in the
Reactive Layer, and it is responsible for the decision of triggering the Co-operative
Planning Layer.

3.3 Co-operative Planning Layer


The Co-operative Planning Layer is responsible for the negotiation of resources.
When the use of the local channel allocation algorithm on its own is not sufficient to
keep low rates of call blocking, one way to alleviate the load of a hot spot is to move
calls to less loaded regions. This can be done through management handoffs (or
traffic handoffs as known in the standards), but this is not an easy task. Only mobile
users close to certain borders of the cells can attempt the management handoffs. The
handoff attempts must be in a co-ordinated manner to avoid a mobile user being
shifted back and forth between two cells. The handoff of mobile users to different
cells is a load-balancing problem, so that the co-ordinated control needed to solve
this problem depends on the collaboration of a group of cells. The co-ordination of a
joint plan using management handoffs is the responsibility of the Co-operative
Planning Layer of the agents. The agent negotiation has two phases: first to find the
best region to attempt the movement of calls; second, the actual execution of the
handoff requests in a co-ordinated manner, i.e., the execution of the joint plan. In
order to engage and collaborate in a joint plan, the agents communicate through the
Contract-Net Protocol (CNP) [7]. In this work, the agent implementation uses a sub-
set of the communicative acts specified in FIPA Agent Communication Language
(ACL) [8]. The execution of the protocol during the two-phase agent negotiation is
best explained through an example. In Fig. 2, cell A is hot and the local channel
algorithm is not responding efficiently. The Local Planning Layer triggers the Co-
operative Planning Layer to start the negotiation with other cells.
The following actions are taken in the negotiation:
1. In the hot cell, agent A is a manager candidate agent and it sends call for
proposals (cfp(O)) to all its co-channel cell agents A',-, where i can vary from 1
to the current number of co-channel cells of A (maximum 6).
2. The A\ agents advertise the need for channel availability assessment to their
neighboring cell agents B',y, 1 < j < 6 sending cfp(l) (cell load assessment and
plan engagement availability).
3. Each B'ij agent sends to the respective A',- agent apropose(l) act if the cell is
able to engage in a joint plan in the near future or a refuse(l) act otherwise.
Inside apropose(l) act the B'y agent sends its degree of coldness (dc) and the
percentage of the mobile users in the cell {Depart) which are inside the
departing region r of selected cell borders, called regions of movement. The
regions of movement in each cell are pre-defined according to the position of
the cell of agent A.
4. Each A't agent receives the answers of its neighboring B'y agents and it
computes the value of the utility function (Equation 3). Where Reg is the
minimum set containing A',- and its neighbors B'y that have sent propose(l) acts
327

(1 < |Re g\ < 1). The factor a is introduced in order to decrease the influence of

Depart over F' (0 < a < 0.25) : F'=-j ,* Y(dc k +a- Departk)(S)
|Reg| teRes
5. The A'_i agents that were able to perform the calculation of F' send the result to
agent A in propose(0) acts. The A'_i agents that did not have enough propose(1)
acts from their neighbors in order to calculate F' send refuse(0) acts.
6. The received propose(0) act with the largest F' value is chosen to be the region for
moving the calls (if F' is greater than a minimum value). Agent A advertises the
result of the auction to the winning co-channel cell agent with an accept-
proposal(0) act. If there is no winning region, then agent A sends reject-
proposal(0) to all A'_i agents that have sent propose(0) acts and aborts the joint
plan attempt for a specific duration of time.
7. If there is a winning region, then the co-channel cell agent of this region sends
cfp(2) (engage joint plan) to its neighboring B'_ij agents.
8. Each B'_ij agent receiving the cfp(2) assesses its availability to engage in the joint
plan, considering the number of plans it is already participating in and the
regions of movement already considered in such plans. It sends a
propose(2) act if the number of currently engaged plans is less than two and the
regions of movement (if engaged in another plan) match the requesting one.
Otherwise, it sends a refuse(2) act.
9. If the winning co-channel cell agent receives back a minimum number of
propose(2) acts from its neighboring B'_ij agents, it sends back an inform(jp)
(inform joint plan) act to agent A and sends accept-proposal(2) acts to all of its
B'_ij agents that have sent propose(2) acts. Otherwise it sends a failure(jp) (joint
plan failure) act to agent A and reject-proposal(2) acts to its B'_ij agents that
have sent propose(2) acts.
10. The winning co-channel cell agent that has just sent an inform(jp) and its B'_ij
agents will perform all preparatory tasks to engage the joint plan and they will
wait for an inform(activejp) (inform joint plan activation) from agent A.
11. If agent A receives an inform(jp) act, it sends a reject-proposal(0) to all other
co-channel cell agents that have sent propose(0) acts before, and a request(jp)
(request joint plan engagement) act to its two neighboring cell agents in
connection with the winning region. This request is mandatory. Finally, agent A
will send an inform(activejp) act to all agents engaged in the joint plan (first
joint plan execution act). If agent A receives a failure(jp) act, it selects the next
best F' (if one exists) and actions 6 to 11 are repeated.
12. An agent receiving a request(jp) act will perform all preparatory tasks to engage
the joint plan and wait for an inform(activejp) act from agent A.
13. End of the first phase of negotiation.
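The first-phase auction above can be pictured in a few lines of code. This is an illustrative sketch under our own naming, not the authors' implementation: `region_utility` applies Equation 3 to the propose(1) replies an A'_i agent has collected, and `pick_winner` plays the role of agent A choosing the region with the largest F' above a minimum value. The threshold and all cell data below are hypothetical.

```python
# Sketch (not the authors' code) of the Equation-3 utility and the manager's
# choice of winning region in the first negotiation phase.

ALPHA = 0.25  # factor a, bounded 0 <= a <= 0.25 in the paper

def region_utility(replies, alpha=ALPHA):
    """F' = (1/|Reg|) * sum_k (dc_k + a * Depart_k) over propose(1) replies.

    `replies` maps a cell id to its (dc, Depart) pair; it contains A'_i
    itself and every neighbour B'_ij that sent a propose(1) act.
    """
    if not replies:
        return None  # not enough propose(1) acts -> A'_i answers refuse(0)
    return sum(dc + alpha * depart for dc, depart in replies.values()) / len(replies)

def pick_winner(proposals, f_min=0.1):
    """Agent A keeps the propose(0) act with the largest F' above a minimum."""
    best = max(proposals, key=proposals.get, default=None)
    if best is None or proposals[best] <= f_min:
        return None  # reject-proposal(0) to everyone and abort the joint plan
    return best

# Hypothetical (dc, Depart) replies gathered by two candidate regions:
region_a1 = {"A1": (0.6, 0.4), "B11": (0.5, 0.2), "B12": (0.7, 0.1)}
region_a2 = {"A2": (0.2, 0.9), "B21": (0.3, 0.8)}
f_values = {"A1": region_utility(region_a1), "A2": region_utility(region_a2)}
print(pick_winner(f_values))  # region A1: colder and with fewer departing users
```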
The second phase of the negotiation starts with the engagement of all agents
belonging to the winning region, the manager agent A and its two neighboring cell

agents into the joint plan (shaded region in Fig. 2). Agent A is the manager of the
joint plan and the other partner agents are the contractors of the plan [7]. The
manager has the responsibility to monitor the actions of the contractors and to
terminate the joint plan. Each iteration of the joint plan needs to be feasible.
Therefore, the proposed heuristic follows a resource-oriented approach of
market-based control. The aim is to load-balance the whole region so that the
difference in degree of coldness of partner cells is smaller than a certain threshold.

Figure 2. Negotiation Strategy.

The following heuristic tries to balance the region by distributing users among cells:
1. The manager agent A sends its first act to all partner agents to inform them that
the joint plan is in operation (inform(activejp) act).
2. All partner agents receiving the inform(activejp) act will send an
inform(ptrnjp) (partner cell in the joint plan) act to their manager agent
identifying themselves and their neighboring cells in the regions of movement.
3. Iteration:
a) The manager agent sends a query-ref(0) act to all partner agents.
b) Each partner agent sends its total number of channels and the number of
channels in use to the manager agent through an inform-ref(0) act.
c) The manager agent computes the rate of change (Δc_i) for each partner
agent and itself by calculating the difference between the channel
occupancy of the cell (c_i/C_i) and the average channel occupancy of all
members (N) of the joint plan (L_avg):

Δc_i = c_i/C_i − L_avg    (4)        L_avg = (1/N) Σ_{i ∈ N} c_i/C_i    (5)

c_i is the total number of channels in use in the cell of agent i.
C_i is the total number of channels (nominal + borrowed) in the cell of agent i.
L_avg is the average channel occupancy of all cells of the joint plan.
Δc_i is the rate of change in channel occupancy of cell i inside the joint plan.

d) If the cell of agent i has Δc_i > 0, the manager agent sends to agent i: Δc_i,
the Δc of the neighboring cells having borders with the regions of
movement of the cell of agent i, and the total number of channels of these
cells (C). It also sends L_avg. This information is sent through a
request(jpaction) (joint plan action) act.
e) Each agent i that receives the request(jpaction) act from the manager agent
will try to transfer mobile users in the regions of movement (departing
areas) following the algorithm:
I. Sort the received Δc of the neighboring cells.
II. If Δc_i is smaller than min Δc, then no transfers can be made; go to step f).
Otherwise, go to step III.
III. Calculate how many mobile users need to be transferred: users = Δc_i * C_i.
IV. If min Δc is greater than L_avg, then transfer one mobile user to the
neighboring cell with min Δc; go to step VIII. Otherwise, go to step V.
V. Sort only the Δc that are smaller than or equal to L_avg. The aim is to transfer
mobile users proportionally to the number of channels available in each
target neighboring cell with Δc smaller than or equal to L_avg.
VI. For all sorted Δc find the number of mobile users that the cell can
receive. For the Δc of cell j: us_j = −Δc_j * C_j.
VII. To find the proportion of mobile users that will be attempted to transfer
to each cell, sum all us_j: US = Σ_j us_j. The number of users attempted for
each cell j is: min(us_j, (us_j / US) * users).
VIII. Do the handoff attempts.
f) End of the iteration.
4. Repeat this iteration at intervals of s seconds until the manager decides to
terminate the joint plan. When the plan is to be terminated, the manager agent
sends a cancel(jp) (cancel joint plan) act to announce the termination of the plan.
The termination of the joint plan can be triggered by the completion of a certain
number of iterations or by an exception.
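One iteration of the heuristic can be sketched as follows. This is a simplified illustration, not the authors' code: Equations 4 and 5 give L_avg and Δc_i, and steps I-VIII split the users of an overloaded cell among its colder neighbours. Cell names, channel counts, and the fractional (unrounded) transfer counts are our own simplifications; the real system issues integer handoff attempts.

```python
# Sketch of one load-balancing iteration of the joint plan (Equations 4-5
# plus steps I-VIII). All cells and figures are hypothetical.

def occupancy(cells):
    """L_avg = (1/N) * sum_i c_i/C_i over the N joint-plan members (Eq. 5)."""
    return sum(c / C for c, C in cells.values()) / len(cells)

def rate_of_change(cells):
    """Delta-c_i = c_i/C_i - L_avg for every member i (Eq. 4)."""
    l_avg = occupancy(cells)
    return {i: c / C - l_avg for i, (c, C) in cells.items()}, l_avg

def transfer_plan(i, cells, neighbours):
    """Steps I-VIII: how many users cell i tries to hand off to each neighbour."""
    dc, l_avg = rate_of_change(cells)
    if dc[i] <= min(dc[j] for j in neighbours):
        return {}                                   # step II: nothing to move
    users = dc[i] * cells[i][1]                     # step III: users = dc_i * C_i
    coldest = min(neighbours, key=dc.get)
    if dc[coldest] > l_avg:                         # step IV: single candidate
        return {coldest: 1}
    targets = [j for j in neighbours if dc[j] <= l_avg]        # step V
    us = {j: -dc[j] * cells[j][1] for j in targets}            # step VI
    total = sum(us.values())                                   # step VII: US
    return {j: min(us[j], us[j] / total * users) for j in targets}

# Hypothetical joint plan: (channels in use c_i, total channels C_i) per cell.
cells = {"A": (9, 10), "B1": (4, 10), "B2": (6, 12)}
plan = transfer_plan("A", cells, ["B1", "B2"])  # hot cell A sheds ~3 users
```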

4 Implementation and Analysis

The results of the multi-agent system are compared against the conventional mobile
network using the FCA and a mobile network using only the D-BA scheme under
common traffic load scenarios. The network performance measurements used for the
comparison are the traffic blocking rate and handoff rejection rate. For simplicity,
the cellular networks being compared are identified by their channel allocation

schemes: FCA, D-BA and MA for the multi-agent system. The expected behavior of
the MA network is to improve the performance of the D-BA scheme when the latter
starts to decrease the efficiency of its borrowing algorithm. This improvement also
starts to decrease as the traffic load increases, because fewer resources will be
available for load balancing.
A common cellular network was modeled in OPNET™ and validated against a
mathematical model. The cellular network is composed of 49 cells and each cell has
10 nominal channels. The compact pattern is a 7-cell cluster with the reuse distance
being 3 cell units. Mobile users have their own trajectories inside the mobile
network. Call establishments and handoff requests are simulated as they are
requested in AMPS systems [1]. Poisson distribution is used to generate calls, which
have an exponentially distributed duration with a mean of 3 min. An idle mobile
inside the cell performs the call attempt.
The performance of the three systems was analyzed under four layouts with
different traffic distributions. The results of one layout (Fig. 3) are presented here. In
Fig. 3, the number at the bottom of each cell is the cell identification; the number in
the middle gives the Poisson arrival rate in calls/hour (ranging from 20 to 200
calls/hour). Mobile users inside the shaded area drive at 40 km/h or walk at 2 km/h
(in both directions). The remaining mobile users have random trajectories moving at
5 km/h. In the results, the abscissa of each graph is the percentage of load increase in
all cells compared to the traffic load shown in Fig. 3, called the base load (0%).

Figure 3. Cellular system layout.

The MA network outperforms the D-BA and the FCA networks (Fig. 4) and the
expected general behavior of the MA network is demonstrated. The number of
borrowing attempts is kept at the same level as in the D-BA network, and at almost the
same efficiency ((successful + partially successful outcomes) / total number of
borrowing algorithm executions) (Fig. 5). This shows that the reductions in blocking
rate are due to the agent negotiation performing the load balancing.


Figure 4. Traffic blocking rate for the three systems.


Figure 5. MA versus DBA: Borrowing algorithm (BA) results.

Some of the cells with a higher traffic load in the network showed a greater
improvement in the traffic blocking rate, for example, cells 0 and 26 (Fig. 6). The
greater improvement in the traffic blocking rate of individual highly loaded cells is a
good result, because these cells have a greater need for resources.
Figure 6. Cells 0 and 26: Traffic blocking rate.



The handoff rejection rate is also lower in the MA network, thanks to the action of
the agent negotiation (Fig. 7). This is an important result because it increases the
QoS perceived by the mobile user.


Figure 7. Handoff rejection rate.

The agent negotiation proved to work well: almost 100% of the management
handoffs were successful when the mobile station had enough signal strength to shift
cells (Table 1). This shows good performance from the proposed heuristic inside the
agent negotiation, choosing the right cells to receive mobile users. Here, the success
rate is defined as the ratio of successful handoffs to possible handoffs; possible
handoffs exclude those where the signal strength is too low.
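This definition can be checked against any row of Table 1; for instance, the 40% load-increase row:

```python
# Success rate as defined above: successful handoffs over possible handoffs,
# where possible handoffs exclude those rejected for lack of signal.
# Figures are taken from the 40% load-increase row of Table 1.
requests, rejected_low_signal, successful = 5787, 3409, 2376
possible = requests - rejected_low_signal        # 2378 possible handoffs
success_rate = 100.0 * successful / possible
print(f"{success_rate:.1f}%")                    # matches the 99.9 in Table 1
```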

Table 1. Management Handoff Request Performance for layout Fig. 3.

Load         Mgmt handoff   Rejected mgmt handoffs   Successful mgmt   Success rate
increase (%)   requests       for lack of signal       handoffs           (%)
 -40              551              331                    220            100.0
 -20            1,781            1,115                    666            100.0
   0            3,267            1,889                  1,378            100.0
  20            4,780            2,846                  1,934            100.0
  40            5,787            3,409                  2,376             99.9
  60            6,494            3,735                  2,751             99.7
  80            6,538            3,778                  2,752             99.7
 100            6,914            3,843                  3,052             99.4
 120            6,122            3,301                  2,790             98.9

The important feature shown in the results is the behavior of the multi-agent system.
Even when the neighboring cells of a borrower cell also reach the threshold of
channel availability, resources are still available elsewhere in the network. At this
point the agent negotiation plays an important role in shifting some of the traffic to
less loaded regions.

5 Conclusion

The approach proposed by the authors was to use agent technology to control the
channel assignment in the cellular network. A special hybrid agent architecture was
adopted consisting of three layers. The interplay of the three layers proved to be a
powerful framework to improve radio resource flexibility and to increase the
robustness of the cellular network as a whole. The simulation results demonstrated
that the use of agent technology brought more flexibility in obtaining extra radio
resources to the network than the other two approaches. Overall, the multi-agent
system proved to be feasible and the agent negotiation was an important feature of
the system in order to improve perceived quality of service and to improve the load
balancing of the traffic.

References

1. MacDonald V. H., Advanced Mobile Phone Services: The Cellular Concept.


Bell System Technical Journal 58 (January 1979).
2. Das S.K., Sen S.K., Jayaram R. and Agrawal P., A Distributed Load Balancing
Algorithm for the Hot Cell Problem in Cellular Mobile Networks. In the 6th
IEEE International Symposium on High Performance Distributed Computing
(Portland, USA, August 1997) pp. 254-263.
3. Katzela I. and Naghshineh M., Channel Assignment Schemes for Cellular
Mobile Telecommunication Systems: A Comprehensive Survey. IEEE Personal
Communications Magazine 3(3) (June 1996) pp. 10-31.
4. Somers F., HYBRID: Intelligent Agents for Distributed ATM Network
Management. Proceedings of First International Workshop on Intelligent
Agents for Telecoms Applications (IATA'96) at the European Conference on
Artificial Intelligence (ECAI'96) (Budapest, Hungary, 1996).
5. Müller J. P., The Design of Intelligent Agents: A Layered Approach. In Lecture
Notes in Artificial Intelligence 1177 (1996).
6. Bodanese E. L., A Distributed Channel Allocation Scheme for Cellular
Networks using Intelligent Software Agents. PhD Thesis, Queen Mary and
Westfield College, University of London (2000).
7. Smith, R.G., The Contract Net Protocol: High-Level Communication and
Control in a Distributed Problem Solver. In Readings in Distributed Artificial
Intelligence (Morgan Kaufmann, 1988) pp. 357-366.
8. FOUNDATION FOR INTELLIGENT PHYSICAL AGENTS. FIPA 97
Specification, Version 2.0 - Part 2 - Agent Communication Language
Publication. October, 1998. http://www.fipa.org.
AN ATOMIC APPROACH TO AGENT-BASED IMAGERY
AND GEOSPATIAL PROBLEM SOLVING

JAMES J. NOLAN, ROBERT SIMON, ARUN K. SOOD
George Mason University
Center for Image Analysis
Dept. of Computer Science
Fairfax, VA 22030
{jnolan,simon,asood}@cs.gmu.edu

This paper describes our approach to building a scalable, flexible agent-based ar-
chitecture for imagery and geospatial processing. The architecture implements over
100 imagery and geospatial processing agents based on the Java Advanced Imaging
and OpenMap APIs. The agents are distributed over a Jini enabled network, and
communicate with one another via JavaSpaces. We discuss our "atomic" approach
in this paper: developing low-level processing agents that are used by applica-
tion specific agents. We discuss several concepts in this approach: agent lookup
and discovery through traditional information retrieval techniques, the ability to
rapidly prototype agents based on commercial software products, and a knowledge
management approach that reuses prior processing approaches and results.

1 Introduction

Imagery and geospatial systems are used in the intelligence gathering, car-
tography, and resource management domains, among others. These systems
utilize low-level imagery and geospatial services to answer high-level queries.
Services might include edge detection on images or route planning on vector
data sets, for example. In production intensive environments, it is typical for
these systems to process hundreds of images and geospatial data sets per day
that each range from several megabytes to several gigabytes in size.
The low-level imagery and geospatial processing services used in these
systems are usually well defined in terms of the service's name, input data
types, parameters, and output data types. On the other hand, the questions
that are posed to an imagery and geospatial processing system are usually
very high level. For example, a farmer may ask "Is the soil on my farm rich
enough for wheat production this year?" or the general may ask "What are my
expected troop locations over the next 48 hours?". To answer these queries
requires the aggregation of low-level services into higher level services that
address the query.
In today's largely manual environment, a human analyst makes the trans-
lation from high-level query to low-level processing steps, including the input


data sets and parameter values. In some cases, this translation can be
subjective, with a wide range of approaches, and is highly dependent on the
analyst's experience. In other cases, the translation is well known, with a limited range
of approaches, and can be easily codified. In general we have a good under-
standing of the representation of the queries posed, and of the low-level image
processing tasks that are used to perform those tasks. Queries are usually
comprised of some location, a subject of interest, and some time frame. This
structure is fairly static. However, the translation of the high-level query
to low-level processing tasks can be dynamic, and highly dependent on the
application.
To address this dynamic transition area, we have developed an agent-
based architecture for imagery and geospatial computing that provides an
approach for application-specific agents to be easily constructed from lower-
level processing agents. This architecture, called the Agent-based Imagery
and Geospatial processing Architecture (AIGA), provides a well-defined set of
low-level imagery and geospatial processing agents, which we term "atomic".
These agents describe themselves by using an ontology and Agent Commu-
nication Language for imagery and geospatial computing. The ontology and
ACL are critical for: enabling the discovery of agents to solve a particular
query, finding other agents to assist in processing, or discovering information
from the agent knowledge base.
In this paper, we discuss our approach, the overall agent architecture,
our ontology and ACL, the discovery process and how it used by agents and
clients, and finally our Java-based implementation.

2 The Agent Architecture

The AIGA architecture is comprised of: imagery and geospatial processing


agents, an information retrieval agent, agents that provide access to imagery
and geospatial data, locations for hosting agents, a communication page space
(used for communication and knowledge discovery), and agent communication
pages (I-XML Pages). A logical view of this architecture can be seen in Figure
1.
Imagery and geospatial processing agents represent low-level functions or
algorithms typically used in an image processing or geographic information
system. The information retrieval agent provides the mechanism for agents
to discover other resources (agents, clients, etc.) on the network. The com-
munication page space is central to our architecture. It provides a place for
agents to leave and pick up messages, and serves as a knowledge base repos-
itory. Prior agent strategies to particular problems are stored here in this

Figure 1. AIGA Agent Architecture

fully searchable repository, and other agents may utilize, or leverage from this
knowledge. I-XML pages are the mechanism with which the agents commu-
nicate and share information. These pages are structured, using the Resource
Description Framework 1 and extensible Markup Language (XML) 2 .

3 The Ontology

An ontology provides a shared and common understanding of a domain that


can be communicated between people and across application systems 3 . An
ontology for imagery & geospatial processing is fundamentally composed of
three core elements: the data used in processing, the functions that process
the data, and the query and resultant information. By defining a common
vocabulary for these three core elements, we allow for interoperability among
heterogeneous systems, as well as the potential to increase the number of tasks
offloaded to automated processes in the form of agents.
The imagery and geospatial communities have developed many data spec-
ifications that suitably describe imagery and geospatial data concepts. For
example, an image in the National Imagery Transmission Format (NITF) con-
tains metadata information that can thoroughly describe the contents of an
image file. The same can be said for geospatial formats such as Vector Prod-
uct Format (VPF) or the Spatial Data Transfer Standard (SDTS). While the
aforementioned standards describe imagery & geospatial data concepts, what

these standards lack are the relationships between those concepts. We have
taken these standards, and represented them in such a way that relationships
can be easily built.
Imagery and geospatial processing services are fundamentally composed
of: a name, a required and/or optional set of parameters, input data types, and
output data types. In addition, there may be other descriptive information
such as the service creator, or documentation on the service. For example,
to perform image change detection, the name of the operation is "Change
Detection", the parameters are a start date and end date, and the service
requires two images.
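The change-detection example can be captured in a small data structure. The field names below are our own illustration of the service-description elements the text lists (name, parameters, input and output data types, documentation), not a schema from the paper; the output type is an assumption.

```python
# Hypothetical sketch of a service description as composed in the text:
# a name, parameters, input data types, and output data types.
from dataclasses import dataclass

@dataclass
class ServiceDescription:
    name: str
    parameters: list          # required and/or optional parameters
    input_types: list         # data types the service consumes
    output_types: list        # data types it produces
    description: str = ""     # optional human-readable documentation

change_detection = ServiceDescription(
    name="Change Detection",
    parameters=["start date", "end date"],
    input_types=["image", "image"],   # the service requires two images
    output_types=["image"],           # assumed: a difference image
    description="Detects change between two images of the same area.",
)
```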
The query and resultant information represent the beginning and end of
the agent process. The query, as described previously, represents some rela-
tively high-level question (e.g., "What are my expected troop locations over
the next 48 hours?"). The resultant information represents information pre-
sented back to the client to assist in the decision-making process. In addition
to an answer to the query, the resultant information contains supporting in-
formation used during processing such as imagery and geospatial data sets or
intermediate processing results. This information provides additional context
to the resultant answer.
We have chosen to represent our ontology in RDF using XML for the
encoding. This approach provides several advantages: 1) these are emerging
standards, with several implementations available to easily parse data; 2) the
mapping of the components of our ontology into RDF has proven straightfor-
ward, as the concept of "Resource" can be applied to the components of our
ontology, as others have shown 4 . An example of the RDF Schema for agent
descriptions can be seen in Figure 2.

4 The Agent Communication Language

According to 5 , an agent communication language that allows agents to in-


teract while hiding the details of their internal workings will result in agent
communities able to tackle problems no individual agent could. Above, we
described an ontology that hides such implementation details from agents.
Agents are able to describe themselves using this ontology, and other agents
can interpret and use the functionality of those agents. We use the described
ontology as the core part of our Agent Communication Language, which we
term I-XML. In this section we describe the components of I-XML: the query,
baseline representation, computational steps, processing strategy, and finally,
results.
The Query section represents a question that an analyst wishes to have

<rdf:Description ID="Service">
  <rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Class"/>
  <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
  <rdfs:comment>An abstract class used as the top-level class for processing services</rdfs:comment>
</rdf:Description>
<rdf:Description ID="description">
  <rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Property"/>
  <rdfs:domain rdf:resource="#Service"/>
  <rdfs:range rdf:resource="http://www.w3.org/TR/2000/WD-xmlschema-2-20000407#string"/>
  <rdfs:comment>A human readable description of the processing service</rdfs:comment>
</rdf:Description>
<rdf:Description ID="parameter">
  <rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Property"/>
  <rdfs:domain rdf:resource="#Service"/>
  <rdfs:range rdf:resource="#Parameter"/>
  <rdfs:comment>A parameter used by the agent.</rdfs:comment>
</rdf:Description>
<rdf:Description ID="dependantUpon">
  <rdf:type resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Property"/>
  <rdfs:domain rdf:resource="#Service"/>
  <rdfs:comment>A Service that the agent is dependant upon prior to processing.</rdfs:comment>
</rdf:Description>

Figure 2. Example RDF Schema for Agent Descriptions

answered. This query is decomposed into keywords and location, the first step
in the processing chain. This is the basis, or trigger, for all actions within the
AIGA architecture.
The Baseline Representation contains information about the geo-
graphic location of the query. For example, this may include a bounding
rectangle of the region of interest or a place name such as a country or city.
The Computational Steps represent the steps necessary to answer the
query. For example, steps for locating military troops may include: image
retrieval from a database, feature detection on the imagery to identify troops,
change detection to determine the movement of those troops, and an open
source literature search for ancillary information. The Computational Steps
represent the necessary processing steps and not the order in which they
will be done. This is essentially a listing of the steps required to determine
the resultant information, however this list has not been optimized to take
advantage of any parallel processing opportunities.
The Processing Strategy refines the Computational Steps into a di-
rected processing graph, which is the exact series of steps required to mini-
mize the time required to complete the task. This is an important feature of
time-sensitive systems such as imagery and geospatial systems. Consider, for

instance, the Computational Steps defined above. In this processing chain,


there are two independent sub-chains: an image processing sub-chain, and
an information retrieval sub-chain. The image processing sub-chain consists
of three sequential steps, each dependent on results from the previous step:
image retrieval, feature detection, and change detection. The information re-
trieval sub-chain consists of one step: searching the open-source literature.
The Processing Strategy provides a way to represent these two processing
chains as independent of one another, and identifies them as processing chains
that could be executed in parallel if more than one processing resource is avail-
able on the network.
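The refinement described here amounts to topological scheduling over a dependency graph. A minimal sketch, using the troop-location steps from the text (the graph code is our own, not the AIGA implementation):

```python
# Represent the Computational Steps as a dependency graph and pull out the
# "waves" of steps that could run in parallel on separate resources.
from graphlib import TopologicalSorter  # Python 3.9+ standard library

# step -> set of steps it depends on (two independent sub-chains)
strategy = {
    "image retrieval": set(),
    "feature detection": {"image retrieval"},
    "change detection": {"feature detection"},
    "literature search": set(),            # independent sub-chain
}

ts = TopologicalSorter(strategy)
ts.prepare()
waves = []
while ts.is_active():
    ready = list(ts.get_ready())           # steps runnable in parallel now
    waves.append(sorted(ready))
    ts.done(*ready)
# First wave: image retrieval and the literature search run side by side.
```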
The Results tag represents any outputs of agents that may help to answer
the query. As the query is executed and results are returned from agents,
the Results tag will be updated with information that may include geospatial
features, references to image sources, or open literature that was useful during
processing.
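Putting the five sections together, an I-XML page can be pictured as a single XML document. The element names below are our own guesses at the structure; the paper does not give the exact I-XML tag set or its RDF wrapping.

```python
# Hand-made illustration of an I-XML page holding the five sections
# described above; all tag names are assumptions.
import xml.etree.ElementTree as ET

page = ET.Element("IXMLPage")
ET.SubElement(page, "Query").text = (
    "What are my expected troop locations over the next 48 hours?")
ET.SubElement(page, "BaselineRepresentation").text = (
    "bounding rectangle of the region of interest")
steps = ET.SubElement(page, "ComputationalSteps")
for step in ("image retrieval", "feature detection",
             "change detection", "literature search"):
    ET.SubElement(steps, "Step").text = step
ET.SubElement(page, "ProcessingStrategy")   # filled in once the graph is built
ET.SubElement(page, "Results")              # updated as agents return outputs

print(ET.tostring(page, encoding="unicode"))
```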

5 The Discovery Process

We have found that by using a well-defined ontology as described above, we


have created a mechanism that allows clients and agents to search for data
types and agents available on the network. Our agent and data set descriptions
contain much descriptive information, additionally, our use of RDF in XML
has made these descriptions an efficient search space.
We have implemented Salton's vector model information retrieval algo-
rithm 6 , and we use this as the basis to match queries up with appropriate
agents, who in turn make use of the algorithm to find other agents to assist
in processing. The algorithm is as follows:

sim(a_j, q) = (A · Q) / (|A| × |Q|)

where sim(a_j, q) represents the similarity of agent j to query q, A represents
a vector of agent descriptions, and Q represents a vector of the terms from
query q. This formula states that the similarity of the capabilities of an agent
a_j to a particular query q can be calculated by taking the cosine of the angle
between the vectors A (the terms of the agent description) and Q (the terms
of the query).
The terms in the repository of agent descriptions are updated each
time an agent enters or leaves the network. Using these terms, each agent has
the capability to calculate its relevance to specific queries posed by a user, and
also to search for agents it may require assistance from during processing. This

is especially useful as we make very fine-grain agents, isolating very specific


functionality, and these agents rely on other agents to fulfill processing in
response to queries.
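A minimal sketch of this vector-model match follows. It is our own illustration, with invented agent descriptions and plain term-count vectors rather than any particular weighting scheme:

```python
# Cosine similarity between a query and agent descriptions, in the spirit of
# Salton's vector model used for agent lookup. Descriptions are hypothetical.
import math
from collections import Counter

def cosine(a, q):
    """sim = (A . Q) / (|A| * |Q|) over the shared terms of two term vectors."""
    dot = sum(a[t] * q[t] for t in a.keys() & q.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def rank_agents(query, descriptions):
    """Return (agent, similarity) pairs, best match first."""
    q = Counter(query.lower().split())
    scores = {name: cosine(Counter(d.lower().split()), q)
              for name, d in descriptions.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

agents = {  # invented agent descriptions
    "ChangeDetectionAgent": "detect change between two images troop movement",
    "RoutePlanningAgent": "plan route on vector data sets",
}
ranking = rank_agents("expected troop movement next 48 hours", agents)
```

A high-level agent would then keep the top-ranked agent only if its score meets a chosen threshold, as described in Section 5.2.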

5.1 Clients Discovering Agents


In the Client-Agent discovery process, a user is submitting a query, by means
of a client agent, for processing. This is usually some high level query such
as: "What are the expected troop movements in the next 48 hours?". When
the user submits this query to the system, they are seeking the best possible
solution. To make this match, we measure the similarity of the query to each
agent description on the network. Each agent description is text-based, and
in a format as defined by our ontology.
Returned to the client is a list of agent "bids", ranked by relevance to
the query. This process requires a human in the loop to choose the most
appropriate agent "bid" for the query. When the user selects the most
appropriate agent, they can then fill in the details of the processing, including
input data sets and agent parameters.

5.2 Agents Discovering Agents


We have defined a model to implement low-level imagery and geospatial pro-
cessing agents. These agents are well described in the form of a name, input
and output types, and parameters. These agents can be used indirectly to
solve high-level queries. They are used by application specific agents that
directly map from a user query to agent form.
These application-specific agents theoretically come in one of three forms:
1) those that provide no additional functionality, and rely solely on atomic
agents for processing, 2) those that provide some additional functionality, and
rely partially on atomic agents for processing, and 3) those that provide all
of the functionality, relying on no atomic agents for processing.
We argue that our approach (agent types (1) and (2)) makes agent-
development easier and quicker than approach (3). In approaches (1) and
(2), agent dependency, or the ability to discover other agents that can fulfill
processing, becomes very important. This is fulfilled through the use of our
agent ontology, describing an agent by certain parameters as well as its agent
dependencies.
By using a well-defined ontology, higher-level agents can now search for
the atomic agents they are dependent upon by agent description. The opposite
of the Client-Agent model, this search is fully automated. The high-level agent

chooses a threshold, and will select the highest-matching atomic agent as its
dependent agent as long as its relevance meets or exceeds the threshold.

5.3 Agents Discovering Knowledge


The last model that uses our discovery mechanism is the Agent-Knowledge
process. In the AIGA architecture, we have the I-XML Page Space. This
space is used for two purposes: 1) as a place for agents to exchange messages,
and 2) as a place for agents to store and discover knowledge. Knowledge
in this sense refers to prior I-XML pages that were used to solve problems.
In these I-XML pages, the query, computational steps, processing strategy,
and resultant information can all possibly be reused. Consider the following
scenario:

1. Analyst A submits the following query to the system: "Have Country


A's troops moved closer to the border of Country B?". To answer this
query, the analyst may use a set of agents that detect troops in imagery,
compare troop movements from previous days, etc.

2. Sometime later, analyst B submits the query "Have Country C's troops
moved closer to the border of Country D?".

In this example, query (2) is very similar to (1), the only change being the
location of the query. The approach that analyst A used to solve his/her query
could be of use to analyst B. As such, analyst A's approach is available in the
I-XML Page Space. This illustrates one reuse strategy in our architecture,
the reuse of prior computational steps or processing strategies. This reuse
approach is useful from two perspectives: 1) we provide the potential for less
skilled analysts to leverage and learn from a more skilled analyst's approaches,
and 2) we reduce the computations necessary to develop computational steps
and processing strategies.
Our other reuse strategy centers on the reuse of prior processing results.
Consider the following change to the previous reuse example: instead of
analyst B querying over a different geographic location, he/she is now
interested in the troop movement of Country A with respect to Country B's
border, as analyst A was. Only now, analyst B is interested in the answer to
this question a day after analyst A. In this case, analyst B would leverage
analyst A's processing result to determine troop movement. The agents tasked
with the processing then only need to start from the most recent result in
the I-XML Page Space to calculate troop movement.
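The result-reuse lookup can be sketched as follows; the dictionary-based page representation and its field names are simplified stand-ins for real I-XML pages, assumed for illustration:

```python
def latest_result(page_space, query_key):
    """Return the most recent prior result page matching a query, or None."""
    # page_space: list of dicts, each a simplified stand-in for an I-XML page
    matches = [p for p in page_space if p["query"] == query_key]
    return max(matches, key=lambda p: p["day"], default=None)

pages = [
    {"query": "troops(A, borderB)", "day": 1, "result": "moved 3 km"},
    {"query": "troops(A, borderB)", "day": 2, "result": "moved 5 km"},
]
prior = latest_result(pages, "troops(A, borderB)")
# An agent answering the day-3 query would start from `prior` instead of
# reprocessing all imagery from scratch.
```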

Figure 3. Client Screenshot

6 Implementation

We have created a Java-based prototype of AIGA. Agents are Jini-enabled
services and can move to/from locations using this mechanism. We have
implemented over 100 agents that represent the functionality of commercial
image-processing and GIS packages. Agents implemented include: imagery and
geospatial processing agents representing functionality of the Java Advanced
Imaging and OpenMap APIs; an information retrieval agent that implements
Salton's vector model algorithm; a client agent providing the interface between
the human user and the agent system; and imagery and geospatial data agents.
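Salton's vector model, which the information retrieval agent implements, ranks agent descriptions against a query by cosine similarity over term vectors. A self-contained sketch; the agent descriptions below are invented for illustration:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words texts (Salton's vector model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

# Hypothetical agent descriptions, matched against an analyst's query.
agents = {"edge-detector": "detect edges in imagery",
          "geocoder": "convert place names to coordinates"}
query = "detect troop positions in imagery"
ranked = sorted(agents, key=lambda name: cosine(agents[name], query),
                reverse=True)
print(ranked[0])
```

A production system would weight terms (e.g. tf-idf) rather than use raw counts, but the ranking mechanics are the same.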
In Figure 3, we show a screenshot of the client interface to the system.
First, we show a map-background of the world. This map shows where the
analyst using the system might have agents working on problems concerning
a certain part of the world. Second, at the top of the screen, we show a box
where a user may submit a query to the system. When the "Submit" button
is pressed, this query is encoded in an I-XML page and written to the I-XML
Page Space.
The Information Retrieval agent is then notified of the query and measures
the similarity of each agent to the query. This agent ranks the agents by
relevance and notifies the client (by writing out to the page space) of the
highest-ranked agent plus any other agents within a threshold. These "bids"
on the query, ranked by their associated relevance, can be seen at the bottom
of Figure 3.
By selecting a bid, the user selects an agent, or set of agents, to answer the

query. An I-XML page (shown in the upper right) appears on the screen. This
page contains the details of the agent's approach to solving the query. On
this page, the user can fine-tune the computational steps and processing strategy
necessary to answer the query. The user can change parameter settings in
this screen before submitting the page back into the page space for agent
processing.

7 Conclusions

We have presented an agent-based architecture for distributed imagery and


geospatial computing. The approach allows for clients and agents to discover
agents for processing using well-known information retrieval techniques. We
have prototyped the system using Jini as the middleware. The system is
currently used to assist the human analyst in developing processing
strategies to solve typical imagery and geospatial queries. Additionally, we
have introduced an agent reuse concept in which agents can leverage prior
processing approaches or results to produce better results and reduce the
computational load on the system.

Acknowledgments

This work is supported under the National Imagery & Mapping Agency's
(NIMA) University Research Initiative (NURI) program.

References

1. World Wide Web Consortium, Cambridge, MA. Resource Description
Framework (RDF) Model and Syntax Specification, February 1999.
2. World Wide Web Consortium, Cambridge, MA. Extensible Markup
Language (XML) 1.0, February 1998.
3. E. Mena, V. Kashyap, A. Illarramendi, and A. Sheth. In Formal Ontology
in Information Systems, IOS Press, Amsterdam, Netherlands, 1998.
4. S. Staab, M. Erdmann, A. Maedche, and S. Decker. An extensible
approach for modeling ontologies in RDF(S). In Proceedings of the ECDL-
2000 Workshop "Semantic Web: Models, Architectures and Management",
September 2000.
5. Y. Labrou, T. Finin, and Y. Peng. Agent communication languages: the
current landscape. IEEE Intelligent Systems, 14(2):45-52, March-April 1999.
6. G. Salton. The SMART Retrieval System. Prentice Hall, Englewood
Cliffs, NJ, 1971.
MODEL-BASED CREATION OF AGENTS AND
DISTRIBUTION OF PROBLEM SOLVING

KATSUAKI TANAKA, SETSUO OHSUGA


Waseda University, Department of Information and Computer Science
3-4-1 Ohkubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: katsuaki@ohsuga.info.waseda.ac.jp

Abstract
A problem solving process is defined as 'a subject applies activities
to an object'. A combination of (Subject, Activity, Object) represents
a problem solving and is called a problem model. It is represented as
an agent. When a problem is large, it is decomposed into sub-problems
based on the problem model structure, and a structure of agents is created
as a multi-agent system. Each agent solves its assigned sub-problem and
cooperates with the other agents to solve the whole problem. Persons
control agents via user agents.

1 Introduction

Today, the problems that human beings must solve are becoming large and
complex because of the growth of social systems. Large problems are becoming
unsolvable by a single person; cooperative problem solving by many persons is
necessary. How to distribute problems and how to manage the persons who join
the problem solving are therefore big issues. There is concern that the
current method of management is inadequate to keep up with the growth of
problem scale. We are required today to develop a new method of management
to solve very large problems.
That persons are the main body of problem solving processes is one of the
reasons why the current method is inadequate. A large number of decisions is
distributed to many persons, and the decision procedures they make tend to
remain only in their brains without being recorded. Very often they make
errors, as is intrinsic to human nature. The errors lower the quality of
the solution, but a manager cannot follow the problem solving process
afterward to check its validity. There is concern that this will cause
frequent accidents in the future as the scale of problems grows rapidly.
A way to improve this situation is to introduce computers into the problem
solving process much more than ever before and let them record the history of
the process, especially the history of decisions made there by persons. Thus
the use of computers as software agents to replace some part of the workload
of persons in problem solving is discussed.


Many papers have discussed problem solving by software agents 1,2,6. It is
assumed in most of the current systems that agents tailored to the problems
are to be created by persons. That is, every multi-agent system is dedicated to
a given problem with a specific goal. Ishida 3 discussed a multi-agent system
where agents are created automatically to increase the number of inference
engines that work in parallel to reduce the load. In this paper, the authors
propose a new method of agent-based problem solving in which agents are
created automatically and adaptively to a given problem. The created
organization of agents represents the internal structure of the problem model.
A new modeling scheme, the multi-strata model 4, is used for the purpose.

2 Problem Solving
2.1 Design type problem as example
It is said that every problem concerns some object in the world and an object
has its own structure that is composed of components. Its formal representa-
tion is an object model. Problem solving is defined as operations to this object
model.
There must be some subject, an entity that applies operations to this
object in order to get a solution. Thus a formalized representation of a triple
(Subject, Activity, Object) objectively represents a problem solving. It means
that the Subject intends to apply the Activity to the Object in order to arrive
at a goal state. Its computerization is, in effect, an agent. In real problems,
however, each of these three items in the triple consists of many elements and
forms a structure: a structure of subjects, a structure of activities, and a
structure of objects. These structures are related in different ways and
define a variety of problems. The more complex their relations are, the more
complex the problem becomes.
If there is a proper way to decompose these relations into a set of triples
of (a subject, the simpler structure of finite activities, the simpler structure of
finite objects), then the complex problem is decomposed into a set of simple
sub-problems. A criterion of decomposability is that the mutual relation be-
tween sub-problems is weak and independence of each sub-problem solving is
kept as much as possible. These relations however cannot reduce to null but
certain mutual relations remain between sub-problems. An agent is created for
each triple, and a multi-agent system is formed by means of these agents, with
relationships between agents representing the mutual relations between sub-
problems. As the way of decomposing the original model depends on the
problem, a multi-agent system specific to the problem is created automatically.
In many real problems there are specific priorities among the structuring

Figure 1: Process of a design type problem solving

rules of subjects, activities and objects, respectively, in a problem model. For
example, making an object structure is the major goal in an engineering design
problem. The objective of design is to build up a structure of objects that
satisfies given functional requirements. In this case the problem is decomposed
first based on decomposition of the object structure. The structures of subjects
and activities are built depending on the object model building.
A basic operation for a design type problem solving is represented as
follows (fig. 1). It consists of four stages. First, an incipient model is built
as an embodiment of a person's idea on problem solving. It includes
requirements to be satisfied. Second, the model is analyzed to obtain its
functions and behavior. This is an analysis problem solving. Then the result
is evaluated as to whether the model satisfies the requirements. If the
requirements are satisfied at the evaluation step, the design process stops
and the model represents a solution. Otherwise the model is modified.
An analysis type problem solving is included in a design type problem
solving. Therefore, decomposition and distribution of problems are discussed
mainly based on design type problems in the following.
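The four-stage loop of fig. 1 can be sketched generically; the build/analyze/satisfies/modify callables are placeholders for the model operations described above, not part of the authors' system:

```python
def design(build, analyze, satisfies, modify, max_iters=10):
    """Generic design-type loop: build an incipient model, then
    analyze / evaluate / modify until the requirements are met."""
    model = build()
    for _ in range(max_iters):
        behavior = analyze(model)      # analysis problem solving
        if satisfies(behavior):
            return model               # the model is the solution
        model = modify(model, behavior)
    return None                        # no solution within the budget

# Toy usage: "design" a number whose square is at least 100.
result = design(build=lambda: 2,
                analyze=lambda m: m * m,
                satisfies=lambda b: b >= 100,
                modify=lambda m, b: m + 1)
```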

2.2 Large Problem Solving by Persons

To solve a large problem, the problem must be decomposed into a set of smaller
sub-problems. The solution of the original problem is derived as an
integration of the solutions of the sub-problems. These sub-problems are
distributed to different persons and solved there. Since sub-problems are
generated by decomposition of the original problem, these persons cannot be
assigned in advance, but must be assigned dynamically in parallel with the
decomposition process. Fig. 2 shows the case of an airplane design.

Figure 2: Design process of an airplane

1. A chief designer builds an incipient general model based on given
requirements from clients. He/she creates a top node of a hierarchy model
to represent the object and gives it the requirements.

2. The designer tentatively decides the main assemblies of the airplane,
such as the engine, main wing, tail wing, vertical wing, landing gear, etc.
If this tentative structure does not satisfy the given requirements, he/she
has to find another structure or change the required functions of the
assemblies. In general, a complex object is decomposed into a number of
assemblies, each assembly is further decomposed into a set of
sub-assemblies, and so on. In this way, a hierarchical object is generated.

3. The designer distributes the problems of designing each assembly to an
expert in assembly design. For example, the engine design is assigned to
an engine designer. These experts then perform a similar process for
designing each assembly. Each expert in assembly design is the subject of
activity for the assigned object. Therefore, a subjects hierarchy is built
similarly to, and depending on, the objects model (fig. 3).
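The decomposition-and-assignment steps above can be sketched as follows; the object tree and expert names are illustrative only:

```python
# Object hierarchy from step 2: the airplane decomposed into assemblies.
airplane = {"airplane": ["engine", "main wing", "tail wing", "landing gear"]}

def assign_experts(object_tree, experts):
    """Map each assembly to the subject (expert) responsible for it,
    building the subjects hierarchy in parallel with the objects one."""
    assignments = {}
    for parent, assemblies in object_tree.items():
        for assembly in assemblies:
            # Unassigned assemblies stay with the chief designer.
            assignments[assembly] = experts.get(assembly, "chief designer")
    return assignments

experts = {"engine": "engine designer", "main wing": "wing designer"}
print(assign_experts(airplane, experts))
```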

Thus many people take part in the design of common objects. This means that
decisions are distributed to different persons and remain there without
being recorded. It causes difficulty in tracing the design afterward for
checking and maintenance. It will also cause serious trouble in document
acquisition if the previous record is imperfect.
This human-centered problem solving method is replaced by a computer-
centered method. The computer-centered method means that a computer system
manages the total process and the persons joining the problem solving. In
this computer system, a problem is described by a knowledge representation
language, and knowledge-processing agents deal with the problem
cooperatively. The structure of the agents corresponds to the human
organization in a human-centered problem solving.

Figure 3: Problem decomposition (parent-child relationships among Subjects
and Objects, with Subject-Object correspondences)

2.3 KAUS as Knowledge Representation Language


A language suited for representing this system is necessary. In order to cope
with problem models, it must be suited for representing predicates that
include data structures as arguments and for describing meta-level
operations, such as knowledge about other knowledge. KAUS 8 (Knowledge
Acquisition and Utilization Language) has been developed for this purpose.
In the following, some logical expressions appear as knowledge. However,
these are not necessarily written in correct KAUS expressions but are
locally simplified, because the KAUS syntax is not included in this volume
and these locally simplified expressions are more comprehensible than the
correct ones.

3 Multi-Agent Problem Solving System


The major parts of the system are user agents, a global knowledge base and
problem-solving agents, which are the building blocks of a cooperative
distributed problem solving system. The overall structure of the multi-agent
problem solving system is shown in fig. 4. Agents communicate in an XML-like
extended KAUS language via the HTTP protocol.
A problem solving session starts when a user gives his/her requirements
to his/her own user agent. The user agent generates a problem-solving agent
with the requirements. Then the problem-solving agent retrieves knowledge of
the problem (to satisfy the requirements) from the global knowledge base and
generates other agents depending upon the knowledge. The problem solving
and the generation of the problem solving system progress in this way.

Figure 4: Multi-agent problem solving system

When the problem is solved, problem-solving agents send results to the user
agent. The user agent shows the details of the process to its owner. If the
user decides that the solution does not meet his/her requirements, he/she can
control the whole solving process via his/her own user agent.

3.1 User Agents


Every user who joins the system has his/her own user agent. A user agent
manages information about the user, such as e-mail addresses, domains the
user has joined, problems whose solving the user has joined, etc.
It provides interfaces between a user and problem-solving agents. The user
can change any decisions made by problem-solving agents. He/she can modify
and add knowledge that is stored in problem-solving agents. He/she can also
send requests to other users to join a problem solving. A user who is
invited to the problem solving can join it and control problem-solving agents.

3.2 Global Knowledge Base


The global knowledge base supplies the knowledge necessary for problem
solving to every problem-solving agent.
A knowledge base management system manages a large amount of knowledge
of various types and domains. It accepts requests from agents to retrieve
knowledge. Knowledge retrieval must be done in a short time to assure
the practicality of the system. Therefore the knowledge is divided into
chunks by type information, domain information and other information for
aiding rapid retrieval of knowledge 5. These chunks are structured under the
management system. The large knowledge base management system is itself a
special agent. It accepts requests from the other agents and sends suitable
knowledge back to the requesting agent.

Figure 5: Screen of a user agent

3.3 Problem-Solving Agents


A problem-solving agent is designed not as a special-purpose agent to achieve
a special goal but as a general-purpose agent that can accept and cope with
any problem. No agent has any knowledge related to a specific object in
advance. It retrieves the necessary knowledge from the global knowledge base
when a problem is assigned, just before starting problem solving.
A problem-solving agent has three layers in accordance with the multi-strata
model. The first layer is an agent controller. It generates the problem
solving system to solve the assigned problem by retrieving the necessary
knowledge from the global knowledge base, and communicates with the other
agents. The second is the problem solving system, which consists of an
inference engine and a local knowledge base for actually solving the given
problem. The third is a model of activities and objects that is manipulated
by the second layer (fig. 6).
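A hypothetical sketch of the three-layer structure, with the inference engine reduced to a trivial lookup for brevity; the class, its method names, and the knowledge entries are all invented for illustration:

```python
class ProblemSolvingAgent:
    """Sketch of the three layers described above."""
    def __init__(self, global_kb):
        self.global_kb = global_kb   # shared source of knowledge chunks
        self.local_kb = {}           # layer 2: local knowledge base
        self.model = {}              # layer 3: activities/objects model

    # Layer 1: the agent controller fetches knowledge for an assigned problem.
    def assign(self, problem):
        self.local_kb[problem] = self.global_kb.get(problem, [])
        return self.solve(problem)

    # Layer 2: the "inference engine" (here, trivially a lookup).
    def solve(self, problem):
        rules = self.local_kb.get(problem, [])
        if rules:
            self.model[problem] = rules[0]   # record the solution (layer 3)
            return rules[0]
        return None

kb = {"design PC": ["decompose into IO, screen, disk"]}
agent = ProblemSolvingAgent(kb)
print(agent.assign("design PC"))
```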

3.4 Large Problem Solving by Agents


When an upper agent (an upper problem-solving agent or a user agent) needs
to solve a sub-problem and no agent has been assigned the sub-problem, the
upper agent generates a new lower agent and assigns it the problem.
Otherwise, if a lower agent that has been assigned the problem already
exists, the upper agent requests that agent to retry the problem solving for
another solution. The agent starts working. The controller retrieves
knowledge of the problem from the global knowledge base and stores the
knowledge in the local knowledge base.

Figure 6: Structure of a Problem-Solving Agent
Then the inference engine starts inference. When there is not enough
knowledge in the local knowledge base, the inference engine requests the
agent controller to provide new knowledge. The controller requests user
agents whose owners have joined the same domain as the problem to input new
knowledge.
If necessary, the inference engine requests the agent controller to
distribute sub-problems to lower agents or to send messages to the other
agents in accordance with the relations between sub-problems (assemblies in
the case of a design problem) in the original problem model. When the
problem is solved, the agent controller returns the solution to the upper
agent.
A problem-solving agent destroys itself when it receives a message telling
it that the problem solving has been finished. If a problem-solving agent
succeeded in solving the problem, it stores the solution in the problem
model. Otherwise, it discards the solution.
Knowledge of a design type problem is defined as in fig. 7. The prefix "ALL"
in (3) means to apply this knowledge to all items of a list; "(distribute
design)" means to distribute the predicate "(design)" in this knowledge. The
inference engine distributes design sub-problems to other agents following
this rule. (6) and (7) mean to retry the predicates "design" and "decompose"
when the requirements are not satisfied.
A problem-solving agent works as follows with this knowledge: First,
decompose the object into sub-objects (2) and assign each sub-object to a
lower agent (3), (5). Then receive sub-solutions from the lower agents and
merge them (4). While the solution does not satisfy the requirements,
request the lower agents to find another solution (6), (3). If a lower agent
cannot find any solution, send a fail message to all lower agents to destroy
themselves and change the way

((design functionList restrict A)                              (1)
 [(decompose functionList restrict
     SubFunctionsList SubRestrictsList)                        (2)
  (design ALL subFunction ALL subRestrict SubA SubRestrictA)   (3)
  (mergeModel A P ALL SubA ALL SubRestrictA)                   (4)
  (distribute design)                                          (5)
  (retry 1 design)                                             (6)
  (retry 2 decompose)                                          (7)
 ])

Figure 7: Knowledge for design

(decompose [PC, forEditVideo] 1000
    [[PCIO, IEEE1394] 200, [SCREEN, HighResolution] 700] ...).   (8)
(design [PCIO, IEEE1394] 1000 [CARD, C1]).                       (9)
(design [PCIO, IEEE1394] 1500 [CARD, C2]).                       (10)
(decompose [SCREEN, HighResolution] 700
    [[PCIO, HighResolutionVideoCard] 100, [CRT, CRT1] 600]).     (11)

Figure 8: A part of knowledge of designing PCs

of decomposition (7), (2) and distribute new sub-problems. If it still
cannot find any solution, return to the upper agent a message that no
solution could be found. When the solution meets the requirements, it is
returned.
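The control flow encoded by the knowledge of fig. 7 (decompose, distribute, merge, retry with an alternative decomposition) can be sketched as a recursive procedure; the DECOMPOSE and DESIGN tables below are illustrative stand-ins for the global knowledge base, not KAUS syntax:

```python
# Alternative decompositions per object, tried in order (the "retry" of (7)).
DECOMPOSE = {"PC": [["PCIO", "SCREEN"], ["PCIO", "LCD"]]}
# Design knowledge for atomic parts; an empty list means no known design.
DESIGN = {"PCIO": ["CARD-C1"], "SCREEN": [], "LCD": ["LCD-L1"]}

def design(obj):
    if obj in DESIGN:                       # atomic part: design directly
        return DESIGN[obj][0] if DESIGN[obj] else None
    for parts in DECOMPOSE.get(obj, []):    # (2)/(7): try each decomposition
        subs = [design(p) for p in parts]   # (3)/(5): distribute sub-designs
        if all(s is not None for s in subs):
            return {obj: subs}              # (4): merge sub-solutions
    return None                             # report failure to the upper agent

print(design("PC"))
```

Here the first decomposition fails because no SCREEN design exists, so the procedure falls back to the second decomposition, mirroring the fail-and-redistribute behavior described above.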

4 Experiments
This system was applied to the design problem of personal computers. Users
give requirements, such as a PC for editing video movies or for working as a
web server, to their own user agents, with costs as a limitation. In a case
where a user wants a computer to play DVDs for under $900, he/she gives the
requirement "(design [PC, PlayDVD] 900 A)?" to his/her own user agent.
Knowledge of personal computers is recorded in the global knowledge base.
For example, a PC for editing video needs an IEEE1394 I/O port, a
high-resolution screen, etc. (fig. 8).
The problem of designing a PC was divided into sub-problems of designing
parts; these sub-problems were distributed to the different agents and the
Object model of: (design [PC, editVideo] 1000 A)?
A = [CARD, IO1], [CARD, V1], [CRT, CRT1], [HDD, HD1]

[PC, editVideo] 1000
  [PCIO, IEEE1394], 200
    [CARD, IO1], 200
  [SCREEN, HighResolution], 700
    [VIDEOCARD, VRAM32], 200
      [CARD, V1], 200
    [CRT, Utah], 500
      [CRT, CRT1], 500
  [HDD, Large], 100
    [HDD, HD1], 100

Figure 9: Result of "(design [PC, editVideo] 1000 A)?"

designed parts were merged to obtain the model of a PC. Users could change
decisions made by agents. When a user requires an alternative solution but
there is no more knowledge in the knowledge base, the agent sends requests
to users to provide new knowledge.
The solution and the object model composed by the system are shown in
fig. 9. It was confirmed that a different organization of agents was
generated depending on the way a problem was decomposed, and that the
results of past trials were used effectively.

5 Conclusions
In this paper, we discussed a way of solving a large problem by a distributed
multi-agent system in cooperation with persons.
Problem solving was represented by a triple (Subject, Activity, Object)
and relations among them. Based on these relations, a large problem solving
was decomposed into sub-problem solvings. An agent was created corresponding
to each sub-problem, and a multi-agent system was organized according to the
decomposition structure.
The agent is intelligent in the sense that it can solve various types of
problems autonomously, and it can create other agents as needed.
Each person who joins a problem solving can control the behavior of
problem-solving agents via his/her own user agent. He/she can change any
decision made by any agent, give new knowledge to any agent, and ask other
persons to join a problem solving process.
The basic idea, a way of problem solving and of generating a multi-agent
system, was tested by an experiment using a simple example. This system is
part of a larger system the authors' group is now developing. The part
discussed in this paper is a central portion of the ideas behind this system
development.

Acknowledgment
This research was sponsored by the Ministry of Education, Culture, Sports,
Science and Technology of the Japanese Government. The authors would like to
express their sincere thanks for this support.

References
1. Caroline C. Hayes, Agents in a Nutshell - A Very Brief Introduction,
IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 1,
January/February 1999.
2. M. Harandi and G. Rendon, A Support Environment for Building
Distributed Problem Solvers, Proc. IEEE Conf. Systems, Man, and
Cybernetics, Oct. 1997.
3. T. Ishida, L. Gasser, M. Yokoo, An Organizational Approach to Real-Time
Continuous Problem Solving, Journal of Japanese Society for Artificial
Intelligence, Vol. 7, No. 2, Mar. 1992.
4. Setsuo Ohsuga, Toward truly intelligent information systems - from
expert systems to automatic programming, Knowledge-Based Systems,
pp. 363-396, Oct. 1998.
5. Setsuo Ohsuga, Hiroyoshi Ohshima, A Practical Approach to Intelligent
Multi-Task Systems - Structuring Knowledge Base and Generation of
Problem Solving System, European-Japanese Conference 2001 on
Information Modeling and Knowledge Bases, Jun. 2001.
6. G.W. Tan, C.C. Hayes, and M. Shaw, An Intelligent-Agent Framework
for Concurrent Product Design and Planning, IEEE Trans. Eng.
Management, Vol. 43, No. 3, pp. 297-306, Aug. 1996.
7. Katsuaki Tanaka, Michiko Higashiyama, Setsuo Ohsuga, Problem
Decomposition and Multi-Agent System Creation for Distributed Problem
Solving, ISMIS 2000, LNAI 1932, pp. 237-246, 2000.
8. Hiroyuki Yamauchi, KAUS User's Manual Version 6.502, RCAST,
University of Tokyo, 1999.
A DISTRIBUTED ALGORITHM FOR COALITION
FORMATION AMONG E-COMMERCE AGENTS

GUILLAUME VAUVERT AND AMAL EL FALLAH-SEGHROUCHNI


Laboratoire d'Informatique de Paris Nord - UPRES-A 7030 - Institut Galilee
Universite Paris 13 - 99, av. J-B Clement - 93430 Villetaneuse - France
{guillaume.vauvert, elfallah}@lipn.univ-paris13.fr

Since no optimal structure exists, organizations have to be flexible to react
dynamically to environment changes. In an economic context, agents are
strongly autonomous and weakly rational, and have to deal with cooperation
and competition, as in the task-allocation domain. This paper proposes an
open and distributed protocol based on the exchange of preferences computed
using qualitative and quantitative criteria: agents agree on coalitions to
form in order to fulfill tasks. We prove that our protocol converges to
reach a consensus. Experimentation shows that the most rigid strategy is not
optimal and that higher competition leads to easier consensus.

1 Introduction

In an economic context, rational agents are led to cooperate in several
situations: 1) agents cannot perform tasks by themselves; 2) other agents are
more efficient in performing tasks; and 3) working on the task collaboratively
will increase benefits or reduce costs 1. To increase the efficiency of task
achievement, agents may work jointly 2,3,4,5 and may form coalitions, viewed
as groups of agents who have decided to cooperate in order to carry out a
common task 1.
Suitable for dynamic environments, coalition formation is usually studied from
three perspectives, considered independently even though they are not: 1)
coalition structure generation (partitioning or covering the set of agents);
2) solving the optimization problem (solving their joint problem and
eventually receiving a benefit depending on the resources used and the time
spent); and 3) dividing the value of the solution among agents (decided by
the agents themselves or imposed before beginning the process, addressed by
game theory 6).
Many coalition formation approaches exist, but they address different
problems in different domains. Sarit Kraus has proposed a classification of
works in coalition formation 7 that we extend to emphasize our criteria.
The set of main criteria is about the domain. Distributed authority,
communication and negotiation are always considered:
—Individual goals 8 vs common goal 1,9 (social welfare maximizing).
—Self-interested 8,10 vs altruistic.
—Only pure software agents vs pure software with people agents.
—Known rationality (group 1,11,6, personal 8,10, coalition 9,8) vs unknown.
—Bounded rationality 12,3.
—Positive externalities (cooperation) 8 vs negative (competition).
—Number of agents: a dozen 1, a hundred, thousands.
—Intractable size of solution space 10,13 vs small space.
—Defined and agreed protocols 10 vs non-pre-defined protocols.
—Static vs evolutionary evaluation of incomes.
—Common vs individual evaluation of incomes.
—Costly 3 vs costless computation.
—Independent tasks 9.
—Satisfy as many tasks as possible 9 vs satisfy all tasks.
—Enough competences and agents to solve the problem.
—Dynamicity: agents appear/disappear and tasks arrive constantly 1,4.
—Transferable resources (more beneficial coalitions 9,10) vs no transferability.
—Monetary system for side-payments 10.
—Set partitioning 10 vs set covering 14.
—CFG 10,13,5,12,2 vs non-CFG.
—Super-additive 5,8,2, sub-additive 12,2, no additivity (most cases).
In e-commerce, self-interested agents deal with selfish goals, and in DPS,
altruistic agents deal with common goals. The type of goal is given by the
problem, since the type of agent is defined by the environment (DPS,
e-commerce) or by a resolution choice (DAI, MAS, ...).
As shown in 12, the bounded-rational value of a coalition is determined by
three parameters: 1) as usual, the domain problem (tasks and resources); 2)
the execution architecture (limited and costly computation); and especially
3) the possibility for agents to design their protocols. The effects of
computational limitations on coalition structure formation and stability have
been studied in 3,12.
In this paper, we focus on self-interested agents acting in an economic
context. They have individual goals (to increase their incomes) and might be
pure software agents or interfaces for humans; thus no strategy is assumed
and rationality is bounded (autonomy and rationality are deeply studied
in 15).
The protocol we propose is assumed to be known and agreed upon by agents,
but they remain completely autonomous: the protocol takes into account the
possibility that agents may try to cheat.
The problem of task allocation binds agents to cooperate in order to fulfill
tasks (each agent is able to fulfill a part of a task). We assume that all
tasks can and must be fulfilled. A task might depend on another (precedence
order, income decrease, same/different agent for some subtasks), and a
coalition's value may depend on non-member actions: this may be taken into

account by a modification of solution space and of subtasks incomes (but no


experimentation have been made upon). Resources may be not transferable,
but if they are, agents may exchange resources outside the protocol without
modifying it. A monetary system is used for experimentations to simplify
computation, but since the protocol is only based on preferences exchange, it
is not necessary (agents need only criteria to compute their preferences).
The number of agents may be large (around 25), and experiments show
that the number of turns decreases as the number of agents increases (the
elapsed time increases, however, because each turn takes longer).
Evaluations of incomes are individual and may evolve during the process.
Computation and communication time can be taken into account by decreasing
a subtask's income as time elapses.
The experiments assume that an agent may fulfill subtasks in different coalitions,
but the protocol also runs with a partition (a reduction of the solution space).
The optimality of the solution is meaningless here, because it depends on each
agent's viewpoint. However, the chosen solution is legitimate, because no agent is
favored. In this context, we propose a protocol that takes into account
strong autonomy 15, weak rationality 15 and heterogeneity to reach a consensus
on a subtask repartition.
This article is organized as follows: Section 2 formalizes the concepts we define
in order to solve the consensus problem. Section 3 proposes a distributed
algorithm to be executed by the agents during the consensus process and
proves the convergence of the proposed algorithm. Section 4 discusses the
experiments and provides some of our most significant results. Finally,
Section 5 concludes the paper.

2 Coalition Formation

To reach a consensus, agents have to exchange information so that their
preferences can evolve. Argumentation could be used, but it requires a complex
process: it binds agents to a common communication language and to knowledge
of each other's rationality. Heterogeneous agents may prefer to exchange basic
information that does not need such a formal process. Thus, at each turn,
agents send their preferences to the others and take the others' preferences
into account to compute their next preferences. Because agents who do not make
concessions are more likely to be ejected from the final solution (see Section 4),
agents have an incentive to be flexible. If they are not flexible enough, they
may form alliances; if no alliance is formed, the system chooses two agents who
are obliged to ally. Finally, alliance formation facilitates reaching a
consensus. This algorithm is described and justified more broadly in 16.

2.1 Formalization

Let us now present the concepts of the coalition formation problem and
highlight their meaning within a case study: airlines choose to cooperate to
provide their passengers with a unified reservation system. The problem is
that, for each trip, several airlines compete on some stages.
Definition 1 (Coalition Formation Problem (CFP)) A CFP is defined as a tuple
(𝒜, 𝒯, 𝒮, 𝒞, 𝒱), where:
𝒜: the set of agents candidate for the execution of sub-tasks;
𝒯: the set of tasks to be accomplished;
𝒮: the set of sub-tasks to be carried out;
𝒞: the set of competences necessary to perform the sub-tasks;
𝒱: the set of incomes.
An agent a ∈ 𝒜 is defined by a = (C, strategy), where C ⊆ 𝒞; strategies
contain the preference computations (see 2.2).
A task t ∈ 𝒯 is defined by the set of sub-tasks it contains: t = (S), S ⊆ 𝒮.
A sub-task s ∈ 𝒮 is defined by s = (C, p), C ⊆ 𝒞, p ∈ 𝒱, where C is the set
of competences which an agent must have to be able to carry out the sub-task,
and p the associated profit (used by the agents to compute their preferences).
A competence c ∈ 𝒞 is a single item which represents what is required to
carry out a sub-task; a sub-task can require more than one competence.
A profit p ∈ 𝒱 is used as an income, but only to simplify the agents' internal
calculations: 𝒱 ⊆ ℝ+. However, the type independence of profits implies that
any unit could have been used.
Example 1 Agents = airlines: 𝒜 = {EUropean Airlines, US Airlines, ...}.
A task = a flight: 𝒯 = {New York-MAdrid (via PAris and LYon), ...}.
Each flight needs competences (authorization for a national stage, passenger
capacity, range of action), e.g. EUA = ({autEU, MidC, ShrtR}), and provides
incomes: 𝒱 = [0, 10000] and NY-M = ({NY→P, P→L, L→M}, 8000).
Definition 2 (Solution) A solution is an assignment of each sub-task to an
agent which is able to perform it. A solution σ ∈ Σ is an application 𝒮 → 𝒜
such that ∀s ∈ 𝒮, a = σ(s) ⇒ s.C ⊆ a.C.
Definition 3 (Preference) A preference is represented by distances (in the
meaning given below) δ ∈ Δ between solutions, where δ : Σ × Σ → [−1, 1]
is an antisymmetric application. Thus, δ(σ1, σ2) = d is interpreted as "σ2
is preferred to σ1 with a distance d if d > 0, and σ1 is preferred to σ2 with a
distance −d if d < 0". A null distance means that the solutions are indifferent.
Example 2 σ15 = {NY→P ↦ WOA, L→M ↦ BUA, P→MO ↦ EUA, ...}.
Let S1 = {σ0, σ2, σ4} be the set of solutions which provide outcomes and
S2 = {σ1, σ3, σ5} the set of solutions which provide none. δ(σ, σ′) = 0 if σ and σ′
are in the same set, and δ(σ, σ′) = 1 otherwise.
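Definition 3 requires antisymmetry, so the set-based distance of Example 2 has to be signed in practice; a minimal sketch, with illustrative set names and values that are not from the paper:

```python
# Sketch of a set-based preference in the spirit of Example 2 (illustrative only).
# Solutions in s1 provide an income; solutions in s2 provide none.

def make_set_preference(s1, s2):
    """Return an antisymmetric distance over solutions: prefer income-providing ones."""
    def delta(sol_a, sol_b):
        if (sol_a in s1) == (sol_b in s1):
            return 0.0                        # same set: indifferent
        return 1.0 if sol_b in s1 else -1.0   # positive means sol_b is preferred
    return delta

delta = make_set_preference(s1={"s0", "s2", "s4"}, s2={"s1", "s3", "s5"})
assert delta("s1", "s0") == 1.0    # s0 (income) preferred to s1
assert delta("s0", "s1") == -1.0   # antisymmetry: delta(a, b) = -delta(b, a)
assert delta("s0", "s2") == 0.0    # both provide income: indifferent
```

Any such distance stays in [−1, 1] and flips sign when its arguments are swapped, as Definition 3 demands.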


Definition 4 (Sight) A sight vt ∈ V is an application 𝒜 → Δ.
Definition 5 (History) A history h ∈ H is a sequence of sights. A history
h = (vt)1≤t≤T represents all preferences exchanged between turn 1 and turn T.
An alliance is a set of agents which behaves like a single one. One member
plays the representative role: he sends the alliance's preference to the outside agents.
Definition 6 (Alliance) An alliance λ ∈ Λ is defined by λ = (A, arep),
where A ⊆ 𝒜 and arep ∈ A is an alliance member with a special role (see 2.3),
with the constraint that an agent can belong to only one alliance.
To compute the alliance's preference, the representative needs an aggregation
function, which is difficult to find 17.
Definition 7 (Alliance Preferences Computation) An alliance preferences
computation APC is an application Λ → Δ. This application is known
only by the alliance members; the other agents only know the result of the computation.
Example 3 Let λ ∈ Λ be an alliance, λ = (A, arep), A ⊆ 𝒜. APC(λ) = δ,
where δ is defined by: ∀(σ1, σ2) ∈ Σ², δ(σ1, σ2) = Σa∈A a.δ(σ1, σ2). This
example uses only the members' preferences to compute the alliance preference.

2.2 Strategy of a member agent

The agent's strategy depends on his preference computations: Independent
Preferences Computation (computation of the first preferences without knowing
those of the others) and Dependent Preferences Computation (computation
of the preferences of the subsequent turns).
Definition 8 (Coalition) A coalition Ω(σ, t) ⊆ 𝒜 associated to the task t ∈
𝒯 in the solution σ ∈ Σ is defined by: Ω(σ, t) = {a ∈ 𝒜 | ∃s ∈ 𝒮, s ∈ t.S, σ(s) =
a} = ∪s∈t.S σ(s). A coalition contains all the agents which take part in a task.
Definition 9 (Preferences Computation)
— Independent Preferences Computation: IPC ∈ Δ.
— Dependent Preferences Computation: DPC : H → Δ, h ↦ δ.
Example 4 Let δ = IPC; ∀(σ1, σ2) ∈ Σ², δ(σ1, σ2) = profit(σ2) −
profit(σ1). δ is an antisymmetric application. Let δ = DPC(h), h =
(vt)t∈ℕ; ∀(σ1, σ2) ∈ Σ², δ(σ1, σ2) = Σa∈𝒜 [vT(a)](σ1, σ2) / |𝒜|. δ is an anti-
symmetric application.
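The two computations of Example 4 can be sketched as follows; the dictionary encoding of solutions and sights is an illustrative assumption, not the paper's code:

```python
# Illustrative sketch of Example 4.
# IPC: first-turn preference from profits alone.
# DPC: next-turn preference as the average of the last sight's preferences.

def ipc(profit):
    """Independent Preferences Computation: delta(s1, s2) = profit(s2) - profit(s1)."""
    def delta(s1, s2):
        return profit(s2) - profit(s1)
    return delta

def dpc(last_sight):
    """Dependent Preferences Computation: average the agents' last preferences."""
    def delta(s1, s2):
        prefs = list(last_sight.values())   # one delta function per agent
        return sum(p(s1, s2) for p in prefs) / len(prefs)
    return delta

profit = {"sigma1": 3.0, "sigma2": 5.0}.get
d0 = ipc(profit)
assert d0("sigma1", "sigma2") == 2.0          # sigma2 preferred
assert d0("sigma2", "sigma1") == -2.0         # antisymmetry

sight = {"a1": d0, "a2": lambda s1, s2: 0.0}  # a2 is indifferent
d1 = dpc(sight)
assert d1("sigma1", "sigma2") == 1.0          # (2.0 + 0.0) / 2
```

Both constructions remain antisymmetric: IPC because subtraction is, DPC because an average of antisymmetric distances is antisymmetric.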

2.3 Strategy of a representative agent

Definition 10 (Criteria)
— Releasing Switch-over Proposal Criterion (used to decide when to propose
switching to switch-over mode): RSPC : H → {False, True}.
— Releasing Switch-over Acceptance Criterion (decides whether or not to
accept switching to switch-over mode): RSAC : H → {False, True}.
— Alliance Formation Proposal Criterion (gives a list of agents to which to
propose forming an alliance): AFPC : H ↦ (λ1, ..., λn), where AFPC(h) = ∅ is
allowed and is interpreted as "the agent doesn't want to form an alliance".
— Alliance Formation Acceptance Criterion (answers alliance formation
propositions): AFAC : H × Λ → {False, True}.
Example 5 Let h = (vt)1≤t≤T. RSPC(h) = False if T < 2 and
RSPC(h) = (vT = vT−1) ∨ (vT−1 = vT−2) ∨ (vT = vT−2) otherwise. To
reduce computational complexity, only loops of length 3 or less are detected,
and to simplify computations, RSAC = RSPC. Let d : Δ × Δ → ℝ be a
distance between agents' preferences, for example: ∀(δ1, δ2) ∈ Δ², d(δ1, δ2) =
Σ(σ1,σ2)∈Σ² |δ1(σ1, σ2) − δ2(σ1, σ2)|. For an agent a, AFPC(h) is the set of
agents whose preferences are close enough to his, using a threshold. The same
application can be used to compute AFAC, but with a greater threshold.

3 The algorithm of the consensus protocol

Each agent may play several roles within the system. The organizer sends
data and manages registrations and turns. The supervisor prevents agents from
sending different preferences to different agents (information cannot be used
before the others' thanks to a parallel diffusion 16) by asking agents what
preferences they have sent and received (a penalty may be paid by culprits).
The candidate receives tasks to fulfill and decides whether or not to take
part: if he does, he becomes an alliance of one member (himself) and the
representative of this alliance. The member receives and sends his preferences
when asked by the representative. The representative has been defined in
Section 2.3 and his algorithm is given below.
The representative's algorithm plays the leading role. Each representative
has a list of interlocutors InterList ⊆ 𝒜 initialized with the list of the
candidates. The following algorithm is carried out by each representative ai in a
distributed way. In switch-over mode, representatives decide which alliances
are going to merge (using AFPC and AFAC); if no alliance desires to merge,
the system chooses them.
Termination. In order to guarantee that the process terminates, we have to
assume that the switch-over releasing criteria check for the existence of a
loop: if the same situation occurs twice (this case will necessarily happen),
then an alliance is formed. In the worst case, there will only be formations
of forced alliances, which will lead to a grand alliance. In fact, the number
of situations is not finite (preferences use real numbers). To escape

Algorithm 1 Representative: Main

receive("IndPref", AllianceMembers)
h ← ParallelDiff(IndPref, InterList)
send("IndPref", AllianceMembers)
while the consensus isn't reached do
    if RSPC(h) then send("proposition to switch-over mode")
    if receive("proposition to switch-over mode") then
        if RSAC then send("proposition to switch-over mode")
    if ∀a ∈ InterList, receive("proposition to switch-over mode", a)
        then call switch-over mode
    receive("DepPref", AllianceMembers)
    h ← ParallelDiff(DepPref, InterList)
    send("DepPref", AllianceMembers)
end while

this problem, we consider that two sights are equal if all their preferences are
sufficiently close w.r.t. the given distances, as introduced in Example 5.
Definition 11 (Pseudo-equality) Let ε be a small real, δ and δ′ two prefer-
ences and vt and vt′ two sights. We shall say that:
— δ and δ′ are pseudo-equal (δ ≃ δ′) if ∀(σ1, σ2) ∈ Σ², |δ(σ1, σ2) − δ′(σ1, σ2)| < ε;
— vt and vt′ are pseudo-equal (vt ≃ vt′) if ∀a ∈ 𝒜, vt(a) ≃ vt′(a).
Definition 12 (Cycle-like in a history) A history h = (vt)1≤t≤T contains
a cycle-like if ∃(τ1, τ2) ∈ [1, T]², τ1 ≠ τ2, such that vτ1 ≃ vτ2.
Definition 13 (A CFP detects cycle-likes) A CFP (𝒜, 𝒯, 𝒮, 𝒞, 𝒱) detects
cycle-likes if (h contains a cycle-like) ⇒ (∃a0 ∈ 𝒜 such that a0.RSPC(h) =
True ∧ ∀a ∈ 𝒜, a.RSAC(h) = True). In other words, a CFP detects cycle-
likes if at least one agent detects it and all then accept to change mode.
Theorem 1 If a CFP detects cycle-likes, then the program terminates.
Proof. If a CFP detects cycle-likes and there is a cycle-like, then at least one
agent will propose to change mode and all the others will accept. Agents may
then form alliances. If they don't, two agents will be compelled to form an
alliance. As the number n of agents and the number k of solutions are finite,
the number of sights that are not pseudo-equal is finite (2nk/ε). Finally, after
at worst 2kn(n − 1)/ε turns, there is a consensus. ∎
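The pseudo-equality and cycle-like tests of Definitions 11 and 12 can be sketched as follows; ε, the sight encoding and the sample history are illustrative assumptions:

```python
# Illustrative cycle-like detection (not the paper's code).
# A sight maps each agent to its preference values over a fixed list of solution pairs.

EPSILON = 0.05

def pref_pseudo_equal(d1, d2, eps=EPSILON):
    """Preferences pseudo-equal: every distance differs by less than eps."""
    return all(abs(a - b) < eps for a, b in zip(d1, d2))

def sight_pseudo_equal(v1, v2, eps=EPSILON):
    """Sights pseudo-equal: every agent's preference is pseudo-equal."""
    return all(pref_pseudo_equal(v1[a], v2[a], eps) for a in v1)

def contains_cycle_like(history):
    """True if two distinct turns carry pseudo-equal sights."""
    for i in range(len(history)):
        for j in range(i + 1, len(history)):
            if sight_pseudo_equal(history[i], history[j]):
                return True
    return False

h = [{"a1": [0.2, -0.5]}, {"a1": [0.9, 0.1]}, {"a1": [0.21, -0.49]}]
assert contains_cycle_like(h)            # turns 0 and 2 are pseudo-equal
assert not contains_cycle_like(h[:2])
```

Partitioning the sight space into cells of size ε in this way is what makes the number of distinguishable situations finite, which the termination proof relies on.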
Complexity. The complexity depends in particular on the number of possible
solutions, which is directly related to the problem data. Let us assume that
our system contains n agents and that each of them is able to process a fraction
1/m of the tasks; then a task has on average n/m agents able to carry it out,
which gives k = (n/m)^s solutions. In the most general case, our algorithm
does not make it possible to change the complexity class, but experiments
show that with alliance formation, the number of turns is bounded.

Figure 1. Income / strategy        Figure 2. Number of turns / strategy

4 Experimentation

Many parameters influence the process, but three of them have the most influ-
ence: the agents' strategies, the competence repartition (more or less competition)
and the number of agents. To measure the influence of the first parameter,
the number of agents is fixed (7). The preference of agent a at turn t is:
δa(t) = (1 − w(t)) · δa(0) + w(t) · Σb∈𝒜 δb(t − 1) / |𝒜|, where w(t) = e^(−αt).
This weight simulates a more or less flexible strategy. The goal of this exper-
iment is to find the best average strategy with respect to the other strategies.
In Fig. 1, each curve represents the strategy of the population (from a rigid
strategy to a flexible one). Results are averaged over a large number of runs
(350). As expected, the agent's income begins by increasing but, around 0.7,
it decreases: being too rigid may lead an agent to be excluded from the chosen
solution, so that he earns less income. This should lead agents to choose
flexible strategies.
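The flexibility-weighted update above can be sketched as follows; α and the sample values are illustrative assumptions:

```python
import math

# Illustrative sketch of delta_a(t) = (1 - w(t)) * delta_a(0)
#   + w(t) * mean_b delta_b(t - 1), with w(t) = exp(-alpha * t).
# Small alpha keeps w near 1 (flexible: follow the population mean);
# large alpha drives w to 0 (rigid: keep the initial preference).

def update_preference(d0, prev_all, t, alpha):
    w = math.exp(-alpha * t)
    mean_prev = sum(prev_all) / len(prev_all)
    return (1 - w) * d0 + w * mean_prev

# Fully flexible limit (alpha = 0 => w = 1): the agent adopts the population mean.
assert update_preference(d0=1.0, prev_all=[1.0, -1.0], t=1, alpha=0.0) == 0.0
# Rigid limit (large alpha => w ~ 0): the agent keeps its initial preference.
assert abs(update_preference(d0=1.0, prev_all=[1.0, -1.0], t=1, alpha=50.0) - 1.0) < 1e-6
```

Varying α between these two limits reproduces the rigid-to-flexible population strategies compared in the experiments.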
Fig. 2 shows that when more agents are rigid, consensus is harder to reach. If
agents are too rigid, jamming detection leads to alliance formation and conse-
quently to reaching a consensus more quickly, even if that consensus is not desired.
The more competences the agents have, the more they have to compete with the others.
We studied the influence of the number of agents per subtask (competition
level) on the incomes (no figure) and on the number of turns (Fig. 3). As ex-
pected, when competition increases, incomes decrease and consensus becomes
more difficult to reach.

Figure 3. Turns / agents per subtask        Figure 4. Number of turns / |𝒜|

As the number of agents increases (Fig. 4), there are more and more agents
able to fulfill the subtasks and competition increases. But if the number of agents
is greater than 25 (this value depends on the other parameters), then reaching a
consensus becomes easier, because the formed coalition contains enough agents to
fulfill all the tasks: usually, one coalition fulfills all the tasks.

5 Conclusion

From the point of view of deploying multi-agent systems in an economic context,
it is necessary to consider weakly rational, strongly autonomous and heterogeneous
agents. To form coalitions within this framework, we have proposed
an open, distributed and egalitarian protocol based on an exchange of prefer-
ences computed with quantitative and qualitative criteria, since they depend on
the specific strategies of the agents. Moreover, we set up releasing procedures
based on the flexible concept of alliance to avoid system paralysis. We
have shown that, with this protocol, being extremely rigid is not optimal and
that high competition leads to a faster consensus.
The next step is to lower the complexity by reducing the size of the exchanged
preferences (using tabu search to find good solutions quickly) and the number
of turns (by using models of the others).

References

1. O. Shehory, K. Sycara, and S. Jha. Multi-agent coordination through
coalition formation. In M. Singh, A. Rao, and M. Wooldridge, editors,
Intelligent Agents IV, Lecture Notes in Artificial Intelligence 1365,
pages 143-154. Springer, 1997.
2. G. Zlotkin and J. S. Rosenschein. Coalition, cryptography, and stability:
Mechanisms for coalition formation in task oriented domains. In Proc.
of AAAI-94, pages 432-437, Seattle, Washington, 1994.
3. T. W. Sandholm and V. R. Lesser. Coalition formation among bounded
rational agents. In Proc. of IJCAI-95, pages 662-669, Montreal, 1995.
4. O. Shehory and S. Kraus. Formation of overlapping coalitions for
precedence-ordered task-execution among autonomous agents. In Proc.
of ICMAS-96, pages 330-337, Kyoto, Japan, 1996.
5. S. P. Ketchpel. Forming coalitions in the face of uncertain rewards. In
Proc. of AAAI-94, pages 414-419, Seattle, Washington, 1994.
6. A. Rapoport. N-person game theory. Technical report, Michigan Univ.,
1970.
7. S. Kraus. Negotiation and cooperation in multi-agent environments. Ar-
tificial Intelligence, 94(1-2):79-98, 1997.
8. O. Shehory and S. Kraus. Coalition formation among autonomous agents:
Strategies and complexity. In C. Castelfranchi and J. P. Muller, editors,
From Reaction to Cognition, Lecture Notes in A.I. 957, 1995.
9. O. Shehory and S. Kraus. Task allocation via coalition formation among
autonomous agents. In Proc. of IJCAI-95, Montreal, August 1995.
10. O. Shehory and S. Kraus. A kernel-oriented model for autonomous-agent
coalition-formation in general environments: Implementation and results.
In Proc. of AAAI-96, pages 134-140, Portland, Oregon, 1996.
11. J. C. Harsanyi. Rational Behavior and Bargaining Equilibrium in Games
and Social Situations. Cambridge University Press, 1977.
12. T. Sandholm and V. Lesser. Coalitions among computationally bounded
agents. Artificial Intelligence, Special issue on Economic Principles of
Multiagent Systems, 94(1):99-137, 1997.
13. T. Sandholm, K. Larson, M. Andersson, O. Shehory, and F. Tohme.
Coalition structure generation with worst case guarantees. Artificial In-
telligence, 1999.
14. O. Shehory and S. Kraus. Methods for task allocation via agent coalition
formation. Artificial Intelligence, 101(1-2):165-200, 1998.
15. G. Vauvert and A. El Fallah Seghrouchni. Coalition formation among
strongly autonomous and weakly rational agents. In Proceedings of
MAAMAW'2001, Annecy, France, May 2-4, 2001.
16. G. Vauvert and A. El Fallah Seghrouchni. Coalition formation among
egoistic agents. In Proceedings of MAMA'2000, Wollongong, Australia,
December 11-13, 2000.
17. K. Arrow. The Origins of the Impossibility Theorem, chapter 1. Elsevier
Science Publishers B. V., Amsterdam, 1991.
Optimal Reward Functions in Distributed Reinforcement Learning

David H. Wolpert and Kagan Tumer
NASA Ames Research Center, Mailstop 269-1, Moffett Field, CA 94035
{dhw,kagan}@ptolemy.arc.nasa.gov

We consider the problem of designing (perhaps massively) distributed collections
of adaptive agents so as to optimize a world utility function dependent on the behav-
ior of the entire collection. We consider this problem when each agent's individual
behavior is cast as striving to maximize an associated payoff utility function. The
central issue in such design problems is how to initialize/update the payoff utility
functions so as to induce the best possible world utility. Traditional "team game" ap-
proaches simply assign to each agent the world utility as its payoff utility function.
In previous work we used the "Collective Intelligence" framework to derive a better
choice of payoff utility functions, one that results in world utility performance up
to orders of magnitude superior to that ensuing from use of the team game utility.
In this paper we extend these results using a novel mathematical framework. We
review the derivation under that framework of the general class of payoff utility
functions that both are easy for the individual agents to learn and that, if learned
well, result in high world utility. We then demonstrate experimentally that using
these new utility functions can result in significantly improved performance over
that of previously investigated collective intelligence payoff utilities, over and above
those previous utilities' superiority to the conventional team game utility.

1 Introduction
In this paper we are interested in Multi-Agent Systems (MAS's) 1,2,3,4 where
there is a provided world utility function that rates the possible histories of
the full system. At the same time, each agent runs a reinforcement learning
(RL) algorithm 5,6,7 to try to maximize its associated private utility function.
In such a system, we are confronted with an inverse problem: how should
we initialize/update the agents' private utility functions to ensure that, as the
system unfolds, the agents do not "work at cross-purposes", and that their collective
behavior maximizes the provided world utility function? Intuitively, solving
this inverse problem requires private utility functions that the agents can each
learn well, but that are also "aligned" with the world utility. In particular,
such alignment is necessary to avoid economic phenomena like the Tragedy of
the Commons (TOC) 8 or Braess' paradox 9.
This problem is related to work in many other fields, including computa-
tional economics 10, mechanism design 11, reinforcement learning 7, statistical
mechanics 12, computational ecologies 13, (partially observable) Markov deci-
sion processes 14 and game theory 11. However, none of these fields is both
applicable to large, real-world problems and also directly addresses the gen-
eral inverse problem rather than a very special instance of it. (In particular,
the field of mechanism design is not generally applicable. A detailed discussion
of related fields, involving hundreds of references, is available 15.)
It's worth emphasizing that some of the previous work that does con-
sider the general inverse problem does so by employing MAS's in which each
agent uses RL 16,17. However, in those cases, each agent generally receives the
world utility function as its private utility function (i.e., implements a "team
game" 18). The shortcoming of such approaches, as expounded below and in
previous work, is that they scale very poorly to large problems. (Intuitively,
the difficulty is that each agent can have a hard time discerning the echo of its
behavior on the world utility when the system is large.)
In previous work we modified these systems by using the Collective Intel-
ligence (COIN) framework to derive the alternative "Wonderful Life Utility"
(WLU) 15, a private utility that generically avoids the pitfalls of the team game
private utility 9,19,15,20. For example, in some of that work we used the WLU
as the private utility for distributed control of network packet routing 19. Con-
ventional approaches to packet routing have each router run a shortest path
algorithm (SPA), i.e., each router routes its packets in the way that it ex-
pects will get those packets to their destinations most quickly. Unlike with a
COIN, with SPA-based routing the routers have no concern for the possible
deleterious side-effects of their routing decisions on the global goal (e.g., they
have no concern for whether they induce bottlenecks). We ran simulations
demonstrating that a COIN-based routing system has substantially better
throughput than does the best possible SPA-based system 19, even though
that SPA-based system has information denied the COIN system. In related
work we have shown that use of the WLU automatically avoids the infamous
Braess' paradox, in which adding new links can actually decrease throughput,
a situation that readily ensnares SPA's.
As another example, we considered the pared-down problem domain of a
congestion game 21, in particular a more challenging variant of Arthur's El Farol
bar attendance problem 22, sometimes also known as the "minority game" 12.
In this problem, agents have to determine which night of the week to attend
a bar. The problem is set up so that if either too few people attend (boring
evening) or too many people attend (crowded evening), the total enjoyment of
the attendees drops. Our goal is to design the reward functions of the attendees
so that the total enjoyment across all nights is maximized. In this previous
work we showed that use of the WLU can result in performance orders of
magnitude superior to that of team game utilities.
In this article we extend this previous work by investigating the impact of
the choice of the single free parameter in the WLU (the "clamping parameter"),

which we simply set to 0 in our previous work. In particular, we employ some
of the mathematics of COINs to determine the theoretically optimal value of
the clamping parameter, and then present experimental tests to validate that
choice. In the next section we review the relevant concepts of COIN theory.
Then we sketch how to use those concepts to derive the optimal clamping
parameter. To facilitate comparison with previous work, we chose to conduct
our experimental investigations of the performance with this optimal clamping
parameter on variations of the bar problem. We present those variations in
Section 3. Finally, we present the results of the experiments in Section 4.
Those results corroborate the predicted improvement in performance when
using our theoretically derived clamping parameter. This extends the superiority
of the COIN-based approach over conventional team-game approaches even
further than had been done previously.

2 Theory of COINs
In this section we summarize the part of the mathematics of COINs that is
relevant to the study in this article. We consider the state of the system across
a set of consecutive time steps, t ∈ {0, 1, ...}. Without loss of generality, all
relevant characteristics of agent η at time t, including its internal parameters
at that time as well as its externally visible actions, are encapsulated by a
Euclidean vector ζη,t, the state of agent η at time t. ζ,t is the set of the states
of all agents at t, and ζ is the system's worldline, i.e., the state of all agents
across all time.
World utility is G(ζ), and when η is an RL algorithm "striving to in-
crease" its private utility, we write that utility as γη(ζ). (The mathematics
can readily be generalized beyond such RL-based agents 15.) Here we restrict
attention to utilities of the form Σt Rt(ζ,t) for reward functions Rt.
Definition 1: A system is factored if, for each agent η individually,

    γη(ζ) ≥ γη(ζ′)  ⇔  G(ζ) ≥ G(ζ′),

for all pairs ζ and ζ′ that differ only for node η.
For a factored system, when every agent's private utility is optimized (given
the other agents' behavior), the world utility is at a critical point (e.g., a local
maximum) 15. In game-theoretic terms, optimal global behavior occurs when
the agents are at a private-utility Nash equilibrium 11. Accordingly, there can
be no TOC for a factored system 15,19,20. In addition, off equilibrium, the
private utilities in factored systems are "aligned" with the world utility.
Definition 2: The (t = 0) effect set of node η at ζ, Sηeff(ζ), is the set of
all components ζη′,t′ for which the gradients ∇ζη,0 (ζη′,t′) ≠ 0. Sηeff with no
specification of ζ is defined as ∪ζ Sηeff(ζ). We will also find it useful to define
¬Sηeff as the set of all components that are not in Sηeff. Intuitively, the t = 0
effect set of η is the set of all node-time pairs which, under the deterministic
dynamics of the system, are affected by changes to η's t = 0 state.
Definition 3: Let σ be a set of agent-time pairs. CLσ(ζ) is ζ modified by
"clamping" the states corresponding to the elements of σ to some arbitrary
pre-fixed vector K. Then the (effect set) Wonderful Life Utility for node η
(at time 0) is WLUη(ζ) = G(ζ) − G(CL_Sηeff(ζ)), where conventionally K = 0.
Note the crucial fact that to evaluate the WLU one does not need to
know how to calculate the system's behavior under counter-factual starting
conditions. All that is needed to evaluate WLUη is the function G(·), the
actual ζ, and Sηeff (which can often be well-approximated even with little
knowledge about the system).
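A minimal sketch of evaluating the WLU by clamping; the state encoding and the toy G are illustrative assumptions, not the authors' implementation:

```python
# Illustrative Wonderful Life Utility.
# The WLU subtracts from G the value G would take if the agent's effect-set
# states were clamped to a fixed vector K.

def wlu(G, state, effect_set, K=0.0):
    """WLU_eta = G(state) - G(state with effect-set components clamped to K)."""
    clamped = dict(state)
    for key in effect_set:
        clamped[key] = K
    return G(state) - G(clamped)

# Toy world utility: sum of all component values.
G = lambda s: sum(s.values())
state = {("eta", 0): 2.0, ("other", 0): 5.0}
# Clamping eta's single effect-set component to 0 removes exactly its contribution.
assert wlu(G, state, effect_set=[("eta", 0)]) == 2.0
```

Note that, as the text stresses, only G, the actual state and the effect set are needed; no counterfactual dynamics are simulated.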
In previous work, we showed that the effect set WLU is factored 20. As another
example, if γη = G ∀η (a team game), then the system is factored. However,
for large systems where G depends sensitively on all components of the system,
each agent may experience difficulty discerning the effects of its actions on G.
As a consequence, each η may have difficulty achieving high γη in a team game.
We can quantify this signal/noise effect by comparing the ramifications on
γη(ζ) arising from changes to ζη,0 with the ramifications arising from changes
to ζ¬η,0, where ¬η represents all nodes other than η. We call this quantification
learnability 15. A linear approximation to the learnability in the vicinity of
ζ is the differential learnability λη(ζ):

    λη(ζ) = ‖∇ζη,0 γη(ζ)‖ / ‖∇ζ¬η,0 γη(ζ)‖ .    (1)

It can be proven that in many circumstances, especially in large problems, the
WLU has much higher differential learnability than does the team game choice
of private utilities 15. (Intuitively, this is due to the subtraction occurring in
the WLU removing a lot of the noise.) The result is that convergence to
optimal G with the WLU is much quicker (up to orders of magnitude so) than
with a team game.
However, the equivalence class of utilities that are factored for a particular
G is not restricted to the associated team game utility and the clamp-to-0 WLU.
Indeed, one can consider solving for the utility in that equivalence class that
maximizes differential learnability. An approximation to this calculation is to
solve for the factored utility that minimizes the expected value of [λη(ζ)]^(−2),
where the expectation is over the values ζ,0.
A number of approximations have to be made to carry out this calcula-
tion 15. The final result is that η should clamp to its empirical expected average
action, where that average is over the elements of its training set 23. Here, for
simplicity, we do not actually make sure to clamp each η separately to its own
average action, a process that involves η modifying what it clamps to in an
online manner. Rather, we clamp all agents to the same average action. We
then made the guess that the typical probability distribution over actions is
uniform. (Intuitively, we would expect such a choice to be more accurate at
early times than at later times, by which agents have "specialized".)

3 The Bar Problem


We focus on the following six more general variants of the bar problem inves-
tigated in our earlier work 20: there are N agents, each picking one out of
seven actions every week. Each action corresponds to attending the bar on
some particular set of l out of the seven nights of the current week, where
l ∈ {1, 2, 3, 4, 5, 6}.^a At the end of the week the agents get their rewards and
the process is repeated. For simplicity we chose the attendance profiles of the
potential actions so that when the actions are selected uniformly, the resultant
attendance profile across all seven nights is also uniform.
World utility is G(ζ) = Σt RG(ζ,t), where RG(ζ,t) = Σ_{k=1}^{7} φ(xk(ζ,t)),
xk(ζ,t) is the total attendance on night k of week t, φ(y) = y exp(−y/c), and
c is a real-valued parameter. (To keep the "congestion" level constant, for l
going from 1 to 6, c = 3, 6, 8, 10, 12, 15, respectively.) Our choice of φ(·)
means that when either too few or too many agents attend some night in some
week, the world reward RG is low.
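The night-reward shape φ and the world reward can be sketched as follows; c is taken from the l = 2 variant and the sample attendances are illustrative:

```python
import math

# Illustrative world reward for the bar problem (not the authors' code).
# phi(y) = y * exp(-y / c): low when a night is nearly empty or overcrowded.

def phi(y, c):
    return y * math.exp(-y / c)

def world_reward(attendance, c):
    """R_G = sum of phi over the seven nightly attendances."""
    return sum(phi(x, c) for x in attendance)

# phi peaks at y = c, so attendance c is ideal for a single night.
c = 6.0
assert phi(c, c) > phi(1.0, c)      # too few attendees scores lower
assert phi(c, c) > phi(30.0, c)     # overcrowding scores lower
```

Since φ′(y) = e^(−y/c)(1 − y/c) vanishes at y = c, each night's reward is maximized at attendance c, which is why spreading agents evenly raises RG.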
Since we are concentrating on the utilities rather than on the RL algo-
rithms that use them, we use (very) simple RL algorithms. Each agent η has a
7-dimensional vector giving its estimates of the reward it would receive for tak-
ing each possible action. At the beginning of each week, each η picks the night
to attend randomly, using a Boltzmann distribution over the seven components
of η's estimated-rewards vector. For simplicity, the temperature parameter of
the Boltzmann distribution does not decay in time. However, to reflect the fact
that each agent operates in a non-stationary environment, reward estimates
are formed using exponentially aged data: in any week t, the estimate η makes
for the reward for attending night i is a weighted average of all the rewards it
has previously received when it attended that night, with the weights given by
an exponential function of how long ago each such reward was. To form the
agents' initial training set, we had an initial period in which all actions by all
agents were chosen uniformly at random, before the learning algorithms were
used to choose the actions.

a. In order to keep the learning difficulties faced by the agents similar for various choices of l,
the agents always have seven actions from which to choose. Each such action gets mapped to
an "attendance" profile; e.g., for l = 2, so that each agent must choose two nights, action
one maps to attending on days one and two, action two maps to attending on days two and
three, etc.
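The agents' action selection and reward aging can be sketched as follows; the temperature, decay rate and toy estimates are illustrative assumptions:

```python
import math
import random

# Illustrative Boltzmann action selection over estimated rewards (not the authors' code).

def boltzmann_pick(estimates, temperature, rng=random):
    """Sample an action index with probability proportional to exp(estimate / T)."""
    weights = [math.exp(e / temperature) for e in estimates]
    total = sum(weights)
    r = rng.uniform(0.0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

def aged_estimate(rewards, decay=0.8):
    """Exponentially aged average: recent rewards weigh more than old ones."""
    weights = [decay ** age for age in range(len(rewards) - 1, -1, -1)]
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)

random.seed(0)
est = [0.0] * 7
est[3] = 5.0                           # one clearly better night
picks = [boltzmann_pick(est, temperature=0.5) for _ in range(200)]
assert picks.count(3) > 190            # low temperature concentrates on the best action
assert aged_estimate([0.0, 10.0]) > aged_estimate([10.0, 0.0])  # recency matters
```

A fixed temperature keeps some exploration alive, while the exponential aging lets each agent track the non-stationary rewards produced by the other learners.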

4 Experimental Results
We investigate three choices of the clamping parameter: the null vector 0, the
vector 1 = (1,1,1,1,1,1,1), and the "average" action ā, whose components all
equal ℓ/7, where ℓ ∈ {1,2,3,4,5,6} depending on the problem. The
associated WLU's are distinguished with a superscript. In the experiments
reported here all agents have the same reward function, so from now on we
drop the agent subscript from the private utilities. Writing them out, the three
WLU reward functions are:

    R_WL^0(ζ,t) = R_G(ζ,t) − R_G(CL^0(ζ,t))
                = φ_dη(x_dη(ζ,t)) − φ_dη(x_dη(ζ,t) − 1)

    R_WL^1(ζ,t) = R_G(ζ,t) − R_G(CL^1(ζ,t))
                = Σ_{d≠dη} [φ_d(x_d(ζ,t)) − φ_d(x_d(ζ,t) + 1)]

    R_WL^ā(ζ,t) = R_G(ζ,t) − R_G(CL^ā(ζ,t))
                = Σ_{d≠dη} [φ_d(x_d(ζ,t)) − φ_d(x_d(ζ,t) + a_d)]
                  + φ_dη(x_dη(ζ,t)) − φ_dη(x_dη(ζ,t) − 1 + a_dη)

where dη is the night picked by η and a_d = ℓ/7. The team game reward func-
tion is simply R_G. Note that to evaluate R_WL^0 each agent only needs to know
the total attendance on the night it attended. In contrast, R_G and R_WL^ā
require centralized communication concerning all 7 nights, and R_WL^1 requires
communication concerning 6 nights. Finally, note that when viewed in at-
tendance space rather than action space, CL^ā is clamping to the attendance
vector v with components v_i = Σ_{d=1}^{7} a_d u_{d,i}, where u_{d,i} is the i'th component (0 or 1) of the d'th
action vector. So, for example, for ℓ = 1, CL^ā clamps to v_i = Σ_{d=1}^{7} (1/7) δ_{d,i}, where
δ_{d,i} is the Kronecker delta function.
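For the single-night case (ℓ = 1), the local character of WL^0 versus WL^1 can be sketched as follows (our illustration; the function names are ours):

```python
import math

def phi(y, c):
    """Nightly payoff y * exp(-y / c)."""
    return y * math.exp(-y / c)

def wl0_reward(attendance, my_night, c):
    """R_WL^0: clamp the agent's action to the null vector, so only the
    night it actually attended loses one attendee.  Needs only local
    information: the attendance on that single night."""
    x = attendance[my_night]
    return phi(x, c) - phi(x - 1, c)

def wl1_reward(attendance, my_night, c):
    """R_WL^1: clamp to attending every night, so each of the other six
    nights gains one fictional attendee.  Needs the attendance on the
    six nights the agent did not attend."""
    return sum(phi(x, c) - phi(x + 1, c)
               for d, x in enumerate(attendance) if d != my_night)
```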
In the first experiment each agent had to select one night to attend the bar
(ℓ = 1). In this case, clamping to 0 is equivalent to the agent "staying at home," while
clamping to 1 corresponds to the agent attending every night. Finally, clamping to ā is
Figure 1: Reward function comparison when agents attend one night. (WL^ā is ○; WL^0 is +; WL^1 is □; G is ×)

equivalent to the agents attending partially on all nights, in proportions equiv-
alent to the overall attendance profile of all agents across the initial training
period. (Note that none of these "actions" is actually available to the agents.
They simply use these fictional actions to compute their utilities, as described
in Section 2.)
Figure 1 graphs world reward against time, averaged over 100 runs, for 60
agents and c = 3. (Throughout this paper, error bars are too small to depict.)
The two straight lines correspond to the optimal performance and the "base-
line" performance given by uniform occupancies across all nights. Systems
using WL^ā and WL^0 rapidly converged to optimal and to quite good perfor-
mance, respectively. This indicates that for the bar problem the "mild assump-
tions" mentioned above hold, and that the approximations in the derivation of
the optimal clamping parameter are valid.
Figure 2 shows how the t = 500 performance scales with N for each of the
reward signals. For comparison purposes the performance is normalized: for
each utility U we plot (R_U − R_base)/(R_opt − R_base), where R_opt and R_base are the optimal per-
formance and a canonical baseline performance given by uniform attendance
across all nights, respectively. Systems using R_G perform adequately when N
is low. As N increases, however, it becomes increasingly difficult for the agents
to extract the information they need from R_G. Because of their superior learn-
ability, systems using the WL rewards overcome this signal-to-noise problem
to a great extent. Because the WL rewards are based on the difference between
the actual state and the state where one agent is clamped, they are much less
affected by the total number of agents. However, the action vector to which
Figure 2: Scaling properties of the different reward functions. (WL^ā is ○; WL^0 is +; WL^1 is □; G is ×)

agents are clamped also affects the scaling properties.


Figure 3 shows the normalized world reward obtained for the different pri-
vate utilities as a function of ℓ (i.e., when agents attend the bar on multiple
nights in one week). R_WL^ā performs well for all problems. R_WL^1, on the other
hand, performs poorly when agents only attend on a few nights, but reaches the
performance of R_WL^ā when agents need to select six nights, a situation where
the two clamping vectors are very similar (1 and ā, respectively). R_WL^0 shows
a slight drop in performance when the number of nights to attend increases,
while R_G shows a much more pronounced drop. Furthermore, in agreement
with our previous results,^20 despite being factored, the poor signal-to-noise of
R_G results in poor performance for all problems. (Temperatures varied
between .01 and .02 for the three WL rewards, and between .1 and .2 for the G
reward, which provided the respective best performances for each.) These re-
sults confirm our theoretical prediction of which private utility converges fastest
to the world utility maximum.

5 Conclusion
In this article we considered how to design large multi-agent systems to meet
a pre-specified goal when each agent in the system uses reinforcement learning
to choose its actions. We cast this problem as how to initialize/update the
individual agents' private utility functions so that their collective behavior
optimizes a pre-specified world utility function. The mathematics of COINs is
specifically concerned with this problem. In previous experiments we showed
Figure 3: Behavior of the different reward functions with respect to the number of nights to attend. (WL^ā is ○; WL^0 is +; WL^1 is □; G is ×)

that systems based on that math far outperformed conventional "team game"
systems, in which each agent has the world utility as its private utility function.
Moreover, the gain in performance grows with the size of the system, typically
reaching orders of magnitude for systems that consist of hundreds of agents.
In those previous experiments the COIN-based private utilities had a free
parameter, which we arbitrarily set to 0. However, as synopsized in this paper,
it turns out that a series of approximations allows one to derive an op-
timal value for that parameter. Here we have repeated some of our previous
computer experiments, only using this new value for the parameter. These
experiments confirm that with this new value the system converges to signifi-
cantly superior world utility values, with less sensitivity to the parameters of
the agents' RL algorithms. This makes even stronger the arguments for using
a COIN-based system rather than a team-game system. Future work involves
improving the approximations needed to calculate the optimal private utility
parameter value. In particular, given that that value varies in time, we intend
to investigate calculating it in an on-line manner.

References
1. C. Boutilier, Y. Shoham, and M. P. Wellman. Editorial: Economic principles
of multi-agent systems. Artificial Intelligence Journal, 94:1-6, 1997.
2. J. M. Bradshaw, editor. Software Agents. MIT Press, 1997.
3. N. R. Jennings, K. Sycara, and M. Wooldridge. A roadmap of agent research
and development. Autonomous Agents and Multi-Agent Systems, 1:7-38, 1998.
4. K. Sycara. Multiagent systems. AI Magazine, 19(2):79-92, 1998.
5. J. Hu and M. P. Wellman. Multiagent reinforcement learning: Theoretical
framework and an algorithm. In Proceedings of the Fifteenth International
Conference on Machine Learning, pages 242-250, June 1998.
6. L. P. Kaelbling, M. L. Littman, and A. W. Moore. Reinforcement learning: A
survey. Journal of Artificial Intelligence Research, 4:237-285, 1996.
7. R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT
Press, Cambridge, MA, 1998.
8. G. Hardin. The tragedy of the commons. Science, 162:1243-1248, 1968.
9. K. Tumer and D. H. Wolpert. Collective intelligence and Braess' paradox. In
Proceedings of the Seventeenth National Conference on Artificial Intelligence,
pages 104-109, Austin, TX, 2000.
10. T. Sandholm, K. Larson, M. Anderson, O. Shehory, and F. Tohme. Anytime
coalition structure generation with worst case guarantees. In Proceedings of
the Fifteenth National Conference on Artificial Intelligence, pages 46-53, 1998.
11. D. Fudenberg and J. Tirole. Game Theory. MIT Press, Cambridge, MA, 1991.
12. D. Challet and Y. C. Zhang. On the minority game: Analytical and numerical
studies. Physica A, 256:514, 1998.
13. B. A. Huberman and T. Hogg. The behavior of computational ecologies. In
The Ecology of Computation, pages 77-115. North-Holland, 1988.
14. M. L. Littman. Markov games as a framework for multi-agent reinforcement
learning. In Proceedings of the 11th International Conference on Machine
Learning, pages 157-163, 1994.
15. D. H. Wolpert and K. Tumer. An Introduction to Collective Intelligence.
Technical Report NASA-ARC-IC-99-63, NASA Ames Research Center, 1999.
URL: http://ic.arc.nasa.gov/ic/projects/coin_pubs.html. To appear in
Handbook of Agent Technology, Ed. J. M. Bradshaw, AAAI/MIT Press.
16. C. Claus and C. Boutilier. The dynamics of reinforcement learning in cooperative
multiagent systems. In Proceedings of the Fifteenth National Conference on
Artificial Intelligence, pages 746-752, Madison, WI, June 1998.
17. T. Sandholm and R. Crites. Multiagent reinforcement learning in the iterated
prisoner's dilemma. Biosystems, 37:147-166, 1995.
18. R. H. Crites and A. G. Barto. Improving elevator performance using reinforce-
ment learning. In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, editors,
Advances in Neural Information Processing Systems - 8, pages 1017-1023. MIT
Press, 1996.
19. D. H. Wolpert, K. Tumer, and J. Frank. Using collective intelligence to route
internet traffic. In Advances in Neural Information Processing Systems - 11,
pages 952-958. MIT Press, 1999.
20. D. H. Wolpert, K. Wheeler, and K. Tumer. Collective intelligence for control
of distributed dynamical systems. Europhysics Letters, 49(6), March 2000.
21. D. H. Wolpert, K. Wheeler, and K. Tumer. General principles of learning-based
multi-agent systems. In Proceedings of the Third International Conference on
Autonomous Agents, pages 77-83, 1999.
22. W. B. Arthur. Complexity in economic theory: Inductive reasoning and
bounded rationality. The American Economic Review, 84(2):406-411, 1994.
23. D. H. Wolpert. The mathematics of collective intelligence. Pre-print, 2001.
POLYGONAL APPROXIMATION OF PLANAR DIGITAL CURVES USING
ANT SYSTEM

PENG-YENG YIN
Department of Information Management, Ming Chuan University, Taoyuan 333, Taiwan
E-mail: pyyin@mcu.edu.tw

This paper presents a new polygonal approximation method using the ant system. The
problem is represented by a directed graph such that the objective of the original problem
becomes finding the shortest cycle that satisfies the problem constraints. A number of
artificial ants (agents) are distributed on the graph and communicate with one another
through the pheromone trails, which are a form of long-term memory recording the
positive tours previously constructed. The important properties of the proposed method are
thoroughly investigated. The performance of the proposed method, compared to those of the
genetic-based and the tabu search-based approaches, is very promising.

1 Introduction

Planar digital curve approximation is an important topic because digital
curves often appear as region boundaries and object contours in an image. It
is desirable to approximate a digital curve with its corner points in order to reduce the
memory storage and the processing time of subsequent procedures. The polygonal
approximation technique is one of the approaches that can accomplish this
and has attracted the attention of many researchers. The idea behind it is to approximate
the digital curve by an optimal polygon with the minimal number of line segments
such that the approximation error between the digital curve and the polygon is no
more than a specified tolerance.
Most existing methods provide locally optimal approximation results due to
limited computational time. They can be divided into three groups: (1)
sequential approaches [1-2]; (2) split-and-merge approaches [3-4]; and (3) dominant
point detection approaches [5-6]. These approaches are simple and fast, but their
approximation results are far from the optimal ones. However, an exhaustive search
for the optimal polygon results in exponential complexity. Approaches based
on genetic algorithms (GA) [7, 8] and tabu search (TS) [9] have been proposed to
solve the polygonal approximation problem and obtain much better approximation
results than the locally optimal methods do. In this paper, we develop a more effective
and efficient global search algorithm based on a heuristic called the ant system (AS)
[10, 11]. To the best of our knowledge, our work is the first attempt to apply the AS
in the fields of image processing and computer vision. The properties of the
proposed algorithm have been thoroughly analyzed, and the approximation results
are encouraging compared to those of the works using GA and TS.


2 The Proposed Method

2.1. The Ant System


The ant system (AS) algorithm was proposed by Dorigo [10, 11] in 1992, and has
been used to solve many complex combinatorial optimization problems. The AS is
inspired by research on real ant behavior. Ethologists found that ants can
construct the shortest path from their colony to a feeding source and back through
the use of pheromone trails. An ant leaves some quantity of pheromone on the
ground and marks the path by a trail of this substance. The next ant will sense the
pheromone laid on different paths and choose one with a probability
proportional to the amount of pheromone on it. The ant then follows the chosen path and
leaves its own pheromone. This is an autocatalytic (positive feedback) process
which favors the paths along which more ants have previously traveled.
The general principles of the AS are given as follows. Firstly, the underlying
problem should be represented as a graph. The feasible solutions of the problem are
usually corresponding to the paths on the graph with some problem-specific
restrictions. A number of ants are placed on randomly chosen nodes.
Then each of the distributed ants will walk through the graph and construct its own
path. Each ant living on the graph travels from node to node by selecting edges
according to the node transition rule. The selection probability of the next edge is
exponentially proportional to two factors: the pheromone quantity laid on it and the
merit of greediness of choosing it. After all of the ants have completed their tours
(which is called a cycle), the intensity of pheromone on each edge is updated by the
pheromone updating rule that the edges belonging to better tours will receive more
pheromone. Then the ants are randomly distributed again and reiterate the cycle.
The stopping criterion of the AS could be the maximal number of running cycles,
the CPU time limit, or the maximal number of cycles between two improvements of
the global best solution.
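The cycle structure described above can be sketched as a small, runnable ant system on a toy complete graph (a generic TSP-style illustration under our own assumptions, not the AS/Poly of Section 2; all parameter values are illustrative):

```python
import random

def ant_system(dist, n_ants=10, n_cycles=30, alpha=1.0, beta=2.0, rho=0.5):
    """Minimal ant-system loop: every cycle, each ant builds a closed tour
    with the pheromone/greediness transition rule; afterwards pheromone
    evaporates and each tour's edges are reinforced by 1/length."""
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]              # pheromone trails
    best_tour, best_len = None, float("inf")
    for _ in range(n_cycles):
        tours = []
        for _ in range(n_ants):
            tour = [random.randrange(n)]             # random start node
            seen = set(tour)
            while len(tour) < n:
                i = tour[-1]
                cand = [j for j in range(n) if j not in seen]
                w = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                     for j in cand]
                j = random.choices(cand, weights=w)[0]
                tour.append(j)
                seen.add(j)
            tours.append(tour)
        for row in tau:                              # evaporation
            for j in range(n):
                row[j] *= rho
        for tour in tours:                           # reinforcement
            length = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
            if length < best_len:
                best_tour, best_len = tour, length
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                tau[a][b] += 1.0 / length
    return best_tour, best_len
```

On a 4-node unit square, the colony almost always converges on the perimeter tour (length 4) rather than a tour using the diagonals.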

2.2. Graph Representation of Polygonal Approximation


The idea of polygonal approximation is to approximate the digital curve by an
optimal polygon with the minimal number of vertices such that the approximation
error between the digital curve and the polygon is less than a prespecified threshold
ε (we shall refer to this condition as the ε-bound constraint). To apply the AS, the
problem should be represented as a graph G = <N, E>, where N is the set of nodes
and E the set of edges. Apparently, for the polygonal approximation problem, each point
on the digital curve should be represented as a node on the graph. The ideal edge set
E* would be one with the desired property that any closed tour (a path
which originates and ends in the same node) on the graph represents a feasible
solution. However, it is very hard to generate the ideal edge set, or in most cases the
ideal edge set does not exist. A more practical way is to generate a pseudo-ideal
edge set E such that E* ⊆ E. For those tours which do not satisfy the ε-bound
constraint, we can decrease the intensity of pheromone through a penalty function.
Now, we define some notations as follows. Let the tour completed by the k-th ant
be denoted as tour_k, and the number of edges on tour_k be |tour_k|. Since the
completed tour may violate the ε-bound constraint, we should compute the
approximation error yielded by every tour. We use Err(tour_k) to denote the
approximation error between the digital curve and the approximating polygon
corresponding to tour_k.
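The quantity Err(·) can be made concrete as follows (our sketch; the paper does not fix the error measure at this point, so the sum of squared perpendicular distances is an assumption for illustration):

```python
def segment_error(points, i, j):
    """Sum of squared perpendicular distances from the curve points
    strictly between points[i] and points[j] to the chord (i, j).
    The specific measure is an illustrative assumption; the method only
    requires some Err(tour) comparable against epsilon."""
    (x1, y1), (x2, y2) = points[i], points[j]
    dx, dy = x2 - x1, y2 - y1
    norm2 = dx * dx + dy * dy or 1.0       # guard against a zero-length chord
    err = 0.0
    for (x, y) in points[i + 1:j]:
        cross = dx * (y - y1) - dy * (x - x1)
        err += cross * cross / norm2       # squared perpendicular distance
    return err

def tour_error(points, tour):
    """Err(tour): total error over the consecutive polygon vertices."""
    return sum(segment_error(points, tour[k], tour[k + 1])
               for k in range(len(tour) - 1))
```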

2.3. Initial Ant Distribution


Here we prefer to place the ants on nodes which have a better probability of
finding the shortest tour. To do so, we design a special data structure called the
selection table of the starting node, denoted T_i, i = 1, 2, ..., n, where n is the
number of nodes. Initially, each entry of T is equal to 1. The probability of node i
being chosen as a starting node is the entry value T_i normalized by the sum of all the
entry values, i.e., Select_i = T_i / Σ_{j=1}^{n} T_j. At the beginning of the first cycle, every node
will have equal probability of being chosen as the starting node. When this cycle
ends, every ant has completed a tour. Let the set of ants which chose the i-th node as
the starting node in the previous cycle be Ant_Start_i, and the size of Ant_Start_i be
|Ant_Start_i|. At the end of each cycle, we update the entries of the selection table by

    T_i = r · Select_i + (1 − r) · (1/|Ant_Start_i|) Σ_{k ∈ Ant_Start_i} 1/|tour_k|,    (1)

where r ∈ (0,1) is the parameter which controls the relative importance of the
previous experiences versus the current experience.
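A sketch of how this table could be maintained (the exact blending in Eq. (1) is treated as an assumption here: each entry mixes its normalized value with the average tour quality 1/|tour_k| of the ants that started from that node):

```python
def update_selection_table(T, ant_start, tour_lengths, r=0.4):
    """Update the starting-node selection table as in Eq. (1):
    T[i] = r * Select_i + (1 - r) * mean over ants k starting at i
    of 1/|tour_k|.  Entries with no starting ants are left unchanged
    (an assumption of this sketch)."""
    total = sum(T)
    for i, ants in enumerate(ant_start):
        if ants:                                   # some ants started at node i
            select_i = T[i] / total
            quality = sum(1.0 / tour_lengths[k] for k in ants) / len(ants)
            T[i] = r * select_i + (1.0 - r) * quality
    return T
```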
2.4. Node Transition Rule

There are two dominant factors in computing the selection probability of the edge
from node i to the next node j: the pheromone intensity τ_ij and the greediness value
η_ij. τ_ij is gradually changed at the end of each cycle using the pheromone updating
rule; η_ij is determined by a greedy heuristic which forces the ants to walk to the
farthest accessible node. This can be accomplished by setting η_ij as the number of
nodes on the corresponding arc of the chosen edge.
Now, we define the transition probability from node i to node j as

    p_ij = (τ_ij)^α (η_ij)^β / Σ_h (τ_ih)^α (η_ih)^β,    (2)

where α and β are control parameters regarding the relative importance of
exploitation and exploration searches.
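Eq. (2) in code form (our sketch; the two row arguments hold the τ and η values of the admissible next nodes):

```python
def transition_probabilities(tau_row, eta_row, alpha=1.0, beta=5.0):
    """Node transition rule, Eq. (2): p_ij is proportional to
    tau_ij**alpha * eta_ij**beta, normalized over the candidate nodes."""
    weights = [(t ** alpha) * (e ** beta)
               for t, e in zip(tau_row, eta_row)]
    total = sum(weights)
    return [w / total for w in weights]
```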

2.5. Pheromone Updating Rule


The intensity of the pheromone trail is updated at the end of each cycle according to the
quality of the solution corresponding to the completed tour. The edges belonging to
better tours receive more pheromone quantity in this cycle. However, the edges of
tours which result in infeasible solutions will be penalized by decreasing an
amount of pheromone. We define the pheromone updating rule as follows. At the
end of each cycle, the pheromone intensity on the edge from node i to node j is
updated by

    τ_ij = ρ · τ_ij + max(Σ_k Δτ_ij^k, 0),    (3)

where ρ ∈ (0,1) is the persistence rate of the pheromone trails, and

    Δτ_ij^k = 1/|tour_k|             if the edge belongs to tour_k and Err(tour_k) ≤ ε,
            = −Err(tour_k)/(ε · n)   if the edge belongs to tour_k and Err(tour_k) > ε,    (4)
            = 0                      otherwise.

Therefore, more pheromone will be laid on the edges on which most passing ants
can construct shorter feasible tours. On the other hand, in the worst case, the edges
will receive no positive reward if either no ants walked through them or most
passing ants constructed infeasible tours.
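Eqs. (3)-(4) can be sketched as below (our illustration; the feasible deposit 1/|tour_k|, the penalty −Err(tour_k)/(ε·n), and the max(·,0) floor follow our reading of the rule, so treat the exact forms as assumptions):

```python
def update_pheromone(tau, tours, errors, rho=0.1, epsilon=1.0):
    """Pheromone updating rule: evaporate by rho, then add this cycle's
    (possibly penalized) deposits, floored at zero as in Eq. (3)."""
    n = len(tau)
    deposits = [[0.0] * n for _ in range(n)]
    for tour, err in zip(tours, errors):
        for a, b in zip(tour, tour[1:] + tour[:1]):    # closed-tour edges
            if err <= epsilon:
                deposits[a][b] += 1.0 / len(tour)      # feasible: reward
            else:
                deposits[a][b] -= err / (epsilon * n)  # infeasible: penalty
    for i in range(n):
        for j in range(n):
            tau[i][j] = rho * tau[i][j] + max(deposits[i][j], 0.0)
    return tau
```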

2.6. The Proposed Algorithm and the Best Parameter Setting


Now we present the details of the proposed algorithm (which we denote as AS/Poly)
for solving the polygonal approximation problem.
1. Initialize.
   Construct the directed graph as depicted in Subsection 2.2.
   Set τ_ij = 1 for every edge (i, j).
   Set T_i = 1 for every entry of the selection table of the starting node.
   Set NC = 1.
   Set tour_global_best = x_1 x_2 ... x_n x_1.
2. For every ant do
   Select the starting node according to the selection probability.
   Repeat
      Select the next node according to the node transition rule using Eq. (2).
   until a closed tour is completed.
   // the selection of the next node cannot pass over the starting node //
3. Find the shortest feasible tour, say tour_current_best, among the current m tours.
4. If |tour_current_best| < |tour_global_best| then
      tour_global_best = tour_current_best.
5. For every edge do
   Update the pheromone intensity according to the pheromone updating rule
   (see Eqs. (3) and (4)).
6. Update every entry of the selection table using Eq. (1).
7. If NC = MAX_CYCLE then output tour_global_best and stop; otherwise, set
   NC = NC + 1 and go to step 2.
To obtain the best performance of the AS/Poly, various parameter values have been
tried. The test benchmark curves used in the literature [2, 4, 5, 7,
8, 9] are shown in Figs. 1(a)-1(b). The best performance of the AS/Poly was found
with α = 1, β = 5, r = 0.4, ρ = 0.1, and m = 20 for both test curves.

(a) (b)
Fig. 1 The test benchmark curves.
3. Experimental Results and Discussions

In this section, we will discuss more important properties of the AS/Poly through
empirical studies. The performance of various strategies of the AS/Poly is compared
to those of two other heuristics: genetic algorithms (GA) and tabu search (TS).

3.1. Properties of the Proposed Algorithm

3.1.1. AS/Poly vs. Random Walk

We have implemented a random-walk search method (referred to as random walk for
simplicity) in which the distributed ants have no communication and no previous
experiences, and simply choose the next node randomly. Fig. 2(a) shows the global shortest
tour length obtained at each cycle for both the AS/Poly and the random walk. It can be
seen that in the beginning cycles the AS/Poly, which has not yet
accumulated enough feedback experience, performs similarly to the random
walk. After the 4th cycle, the global shortest tour length found by the AS/Poly keeps
decreasing, while the one found by the random walk is almost unchanged.
Consequently, the mechanisms facilitating inter-ant communication and the
persistence of previous experiences play significant roles in the search
paradigm.

Fig. 2 The important properties of the AS/Poly: (a) global shortest tour length per cycle for the AS/Poly and the random walk; (b) average node branch entropy per cycle.

3.1.2. Node Branch Entropy


To examine the convergence behavior of the AS/Poly, we can measure the average
node branch entropy. For the i-th node on the graph, the node branch entropy is
computed from the transition probabilities on all of the directed edges which originate
from the i-th node. That is,

    E_i = −Σ_j p_ij log p_ij.    (5)

The node transition rule becomes more deterministic as the node branch entropy
approaches 0. The average node branch entropy is then computed over all node
branch entropy values, i.e., E = Σ_{i=1}^{n} E_i / n. Fig. 2(b) shows the value of E at each
cycle. Initially, the value of E decreases gradually since few experiences have been
accumulated and the AS/Poly tries to explore new branches to accumulate more
experience. In the middle cycles (between the 40th and the 100th cycles), the value
of E drops drastically because some edges become more preferable and have
higher transition probabilities. In the later cycles, the value of E decreases

gradually again since the dominant edges stand out and the transition probabilities
become stable. Hence, the value of the maximal number of running cycles which
decides the stopping criterion of the AS/Poly could be set as the one falling in the
stable part.
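Eq. (5) and the average E are straightforward to compute (our sketch; zero-probability edges are taken to contribute nothing):

```python
import math

def node_branch_entropy(p_row):
    """E_i = -sum_j p_ij * log p_ij over the outgoing transition
    probabilities of one node (zero-probability edges contribute 0)."""
    return -sum(p * math.log(p) for p in p_row if p > 0.0)

def average_entropy(p_matrix):
    """E-bar: the mean node branch entropy; a value near 0 means the
    node transition rule has become essentially deterministic."""
    return sum(node_branch_entropy(row) for row in p_matrix) / len(p_matrix)
```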

3.2. Searching Strategy


In this subsection, we present two advanced searching strategies of the AS/Poly.

3.2.1. Elitist Strategy


The elitist strategy, which comes from GA [12], was also suggested for use in the
AS by Dorigo [11]. The basic idea is that the pheromone intensity on the best tour
obtained so far (denoted as tour_global_best) is further reinforced, so that
the edges of tour_global_best have a higher probability of being selected than the
edges of other tours. We have implemented the elitist strategy with the AS/Poly.

3.2.2. Hybrid Strategy


We suggest using a hybrid strategy which embeds a local search method between
the iterations of the AS/Poly. The local search method used here is an iterative one,
and to save computational time it is applied to tour_global_best only. First,
one node is randomly deleted from tour_global_best. Then each of the remaining nodes
on tour_global_best is moved along every position between its two neighboring nodes
and finally stops at the position which gives the minimal approximation error. An
iteration is completed when all of the nodes on tour_global_best have been processed.
The next iteration is activated if any node has been moved to a new position, i.e.,
if there was an error reduction in the previous iteration. The iteration is repeated at
most five times to save computation.
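The vertex-sliding step can be sketched as follows (our illustration: the error measure is passed in as a callable, the tour is treated as an open index sequence for simplicity, and the initial random node deletion is omitted):

```python
def local_search(points, tour, tour_error, max_iters=5):
    """Slide each interior vertex of `tour` to the position between its
    two neighbours that minimizes tour_error(points, tour); repeat while
    any vertex moved, at most max_iters times."""
    tour = list(tour)
    for _ in range(max_iters):
        moved = False
        for idx in range(1, len(tour) - 1):
            lo, hi = tour[idx - 1], tour[idx + 1]
            best_pos, best_err = tour[idx], tour_error(points, tour)
            for cand in range(lo + 1, hi):           # positions between neighbours
                trial = tour[:idx] + [cand] + tour[idx + 1:]
                err = tour_error(points, trial)
                if err < best_err:
                    best_pos, best_err = cand, err
            if best_pos != tour[idx]:
                tour[idx] = best_pos
                moved = True
        if not moved:                                 # no error reduction: stop
            break
    return tour
```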
3.2.3. Comparative Performances
Experimentally we found that both the elitist strategy and the hybrid strategy
outperform the basic AS/Poly, and the hybrid strategy has the best performance. As will be
seen in the next subsection, the two advanced strategies cost negligible extra CPU
time compared to the AS/Poly, and they yield more significant approximation improvements
when a smaller ε-bound is specified.

3.3. Comparison with GA and TS


The experiments are conducted on a PC with a Pentium 120 MHz CPU. The
numerical results are obtained by averaging over 10 independent runs for each of the
test methods. Various values of ε have been specified for each of the test curves.
Table 1 lists the average number of line segments (d) of the final
approximating polygon and the average computational time (t) of every comparative
approach. It can be seen that, for both of the two assessment factors, the proposed
AS/Poly and its variations have the best performance, the TS-based approach ranks
in the middle, and the GA-based approach is the worst. It is also observed that the
elitist strategy and the hybrid strategy give more prominent improvements in
reducing the number of approximating line segments than the AS/Poly when the
value of ε decreases. The average CPU time of the elitist strategy is similar
to that of the AS/Poly because only a few computations are needed to update the
pheromone trail of tour_global_best. The extra CPU time of the hybrid strategy is
also negligible if the user prefers a better approximation result with a higher
compression ratio.

Table 1 The comparative performances of the GA-based approach, the TS-based approach,
the AS/Poly approach, the elitist strategy and the hybrid strategy of the AS/Poly.
GA-based TS-based AS/Poly Elitist Hybrid
Curves ε d t d t d t d t d t
150 15.6 5.71 10.6 0.93 11.2 0.68 11.6 0.69 11.0 0.87
100 16.3 4.45 13.7 0.92 13.0 0.70 13.0 0.68 12.6 0.84
Leaf 90 17.3 5.28 14.6 0.89 13.2 0.70 13.0 0.71 12.8 0.89
(n=120) 30 20.5 4.62 20.1 0.90 17.2 0.71 17.0 0.72 16.6 0.90
20 23.1 5.65 21.9 0.90 19.8 0.72 19.0 0.72 18.8 0.90
60 13.2 4.56 11.0 0.87 10.0 0.59 10.0 0.59 10.0 0.78
30 13.9 4.80 13.6 0.79 12.6 0.59 12.4 0.57 12.0 0.75
Semicircle 25 16.8 4.29 14.9 0.78 13.4 0.59 13.0 0.61 13.0 0.74
(n=102) 20 19.2 4.67 16.2 0.78 16.4 0.60 16.0 0.62 15.8 0.73
15 23.0 4.44 18.3 0.76 18.0 0.61 17.4 0.63 16.8 0.73

Figs. 3(a)-3(e) show the final approximating polygon and the corresponding
number of approximating line segments for the leaf-shaped curve with the ε-
bound set to 20 for each of the test methods; Figs. 3(f)-3(j) are those for the curve
with four semicircles given the ε-bound equal to 15. It can be seen that the
proposed algorithms produce the least number of approximating line segments for
all test curves.

(a) GA-based (d = 23)   (b) TS-based (d = 21)   (c) AS/Poly (d = 20)   (d) Elitist (d = 19)   (e) Hybrid (d = 18)

Fig. 3 The approximating polygon and the number of approximating line segments
using different test approaches.

4. Summary

Polygonal approximation of digital curves is very important since it not only
facilitates the reduction of memory storage and computational time but also
supports feature analysis of the digital curves. Most existing approaches are
local search methods and can be classified into three classes: the sequential
approaches, the split-and-merge approaches, and the dominant point detection
approaches. Although they are computationally fast, the approximation results may
be far from the globally optimal ones.
In this paper, we have proposed a new polygonal approximation method using a
global search heuristic called the ant system (AS). The principal components,
namely, graph representation, initial ant distribution, node transition rule, and
pheromone updating rule, of the AS have been investigated and adapted to the
underlying problem. Some important properties of the proposed method are
examined through empirical studies. Inspired by research on genetic algorithms,
we have proposed the elitist strategy and the hybrid strategy for our method. The
performances of the proposed methods are compared to those of genetic-based and
tabu search-based methods. The numerical results are very encouraging.

References

[1] J. Sklansky, V. Gonzalez, "Fast polygonal approximation of digitized curves", Pattern
Recognition 12 (1980), pp. 327-331.
[2] B. K. Ray, K. S. Ray, "Determination of optimal polygon from digital curve using L1
norm", Pattern Recognition 26 (1993), pp. 505-509.
[3] U. Ramer, "An iterative procedure for the polygonal approximation of plane curves",
Computer Graphics and Image Processing 1 (1972), pp. 244-256.
[4] B. K. Ray, K. S. Ray, "A new split-and-merge technique for polygonal approximation of
chain coded curves", Pattern Recognition Lett. 16 (1995), pp. 161-169.
[5] C. H. Teh, R. T. Chin, "On the detection of dominant points on digital curves", IEEE
Trans. Pattern Anal. Machine Intell. 11 (1989), pp. 859-872.
[6] W. Y. Wu, M. J. Wang, "Detecting the dominant points by the curvature-based polygonal
approximation", CVGIP: Graphical Models and Image Processing 55 (1993), pp. 79-88.
[7] P. Y. Yin, "Genetic algorithms for polygonal approximation of digital curves", Internat. J.
Pattern Recognition Artificial Intelligence 13 (1999), pp. 1-22.
[8] S. C. Huang, Y. N. Sun, "Polygonal approximation using genetic algorithms", Pattern
Recognition 32 (1999), pp. 1409-1420.
[9] P. Y. Yin, "A tabu search approach to the polygonal approximation of digital curves",
Internat. J. Pattern Recognition Artificial Intelligence 14 (2000), pp. 243-255.
[10] M. Dorigo, Optimization, Learning and Natural Algorithms, Ph.D. Thesis, Dip.
Elettronica e Informazione, Politecnico di Milano, Italy, (1992).
[11] M. Dorigo, V. Maniezzo, A. Colorni, "Ant system: optimization by a colony of
cooperating agents", IEEE Trans. Syst., Man, Cybern. - Part B 26 (1996), pp. 29-41.
[12] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning,
Addison-Wesley, Reading, MA, 1989.
A BIOLOGICAL VIEW ON INFORMATION ECOSYSTEMS

BENGT CARLSSON AND PAUL DAVIDSSON


Department of Software Engineering and Computer Science
Blekinge Institute of Technology, 371 25 Ronneby, Sweden
E-mail: bengt.carlsson@bth.se, paul.davidsson@bth.se

We study the dynamics of information ecosystems in which there are antagonistic agents or
groups of antagonistic agents. In particular, we focus on systems that consist of exploiter
agents and agents being exploited. When comparing information ecosystems with biological
ecosystems, it becomes clear that both types of systems seem to support robust solutions that
are hard to violate by a single agent. In the analysis of information ecosystems, it is important
to take into consideration that agents may have a Machiavellian intelligence, i.e., that they
take the self-interest of other agents into consideration. We conclude that in the interaction
between antagonistic agents within information systems, an arms race is a major force. A
positive result of this is a better preparedness of innocent agents against the vigilant agents.
Some examples are given to show how modelling information ecosystems in this way
can explain the origin of more robust systems when antagonistic agents are around.

1 Introduction

The emerging global information infrastructure may be regarded as a universal


information ecosystem. An information ecosystem is populated by infohabitants,
i.e., (semi-) autonomous software entities typically acting on the behalf of humans.
These infohabitants, or agents, may or may not have conflicting interests. In this
paper we will compare the dynamic process resulting from the interaction between
agents in information ecosystems that have conflicting interests to the processes
within biological ecosystems.
Biological ecosystems are basically determined by an individually centered
mechanism called natural selection [9, 10]. Normally, fitness expresses the rate of
success for individuals within an ecosystem. If the antagonistic agents of an information
ecosystem, each acting in its own interest, are modeled as part of an independent biological
system, this represents a worst-case scenario because no cooperation can be taken for
granted. As a result, the robustness of an information ecosystem may be explained from a
truly distributed point of view.
In section 2 we will examine the interaction between agents and between agents
and the humans behind them. Section 3 provides some Internet-based examples of
antagonistic interactions, followed by a discussion of the dynamics of the
information ecosystem compared to a biological system. Finally, in section 4 some
conclusions are drawn.


[Fig. 1 (not reproduced) shows an exploiter and a user in an antagonistic relationship; each
gives instructions to, and receives feedback from, its own agent, and the exploiter agent and
user agent interact with each other.]

Fig. 1. Part of an antagonistic information ecosystem.

2 The dynamics of antagonistic information ecosystems

Humans have the ability to represent knowledge outside the brain as mind-tools
[2, 3]. Computers, "intelligent" systems and agent technology within the global
network may all be regarded as mind-tools, controlled by independent and selfish
humans.
In his book "The Prince", Machiavelli wrote about how to conquer and preserve
authority. The impression of being fair and honest may, if the preservation of the
authority requires it, be followed by tricks, lies and violence. Humans are presumed
to have such a Machiavellian intelligence to bring out self-interest at the expense of
others, mainly because we are part of the biological system. Thus, if the intentions
of an agent involve some kind of conflicting goals, we should expect Machiavellian
intelligence to be present.
We will here focus on a typical scenario of such an ecosystem where there is
one (or more) exploiter agent(s) and one (or more) user agent(s) being exploited. In
Fig. 1 we see an example with one exploiter and one user.
The goal of the human exploiter is to make a profit from the agent interaction.
Besides giving the initial instructions to the exploiter agent, the exploiter most
likely has to instruct the exploiter agent continually, because a software agent's
knowledge of the domain is limited compared to a human being's. The human user
and its agent will perform the same kind of reasoning when trying to avert the
exploiter agent. The situation ends up in an arms race in which each agent retorts to
the improvements of the other by having its human owner give it improved
instructions. The long-term outcome of a continuing arms race is an improved
response to the unfriendly actions already performed by the opponent, and probably
a reduction in the number of obvious new exploiting behaviours.

Unlike the traditional descriptions of agent systems based on mental states
comprising beliefs, desires and intentions [6], we here focus on the human owners. A
human is capable of using knowledge outside the actual domain and of arranging it
consciously. This knowledge is then transferred to the agents through instructions
and based on feedback. We believe that this interaction is a key component for
understanding the mechanisms behind antagonistic information ecosystems.

3 Examples of antagonistic information ecosystems

We will here give some examples of antagonistic conflicts within information


ecosystems ranging from an illegal virus attack to legal registrations of users.

3.1 Computer viruses


It is easy to imagine an arms race between crackers (the exploiters) and users
referring to Fig. 1. In the first phase, the cracker finds out about the weakness of the
user's computer system and then creates a program that sends out the viruses, or
with the terminology used here, an exploiter agent. To avoid virus-attacks, the user
has to use some filtering function or anti-virus program (the user agent). The result
may be a refined arms race where the exploiter agent changes the appearance of the
virus automatically and the user agent finds these new variants.
Despite its harmful purpose, a virus could be treated as a catalyst for a more
robust information ecosystem. We here refer to the biological concept of robustness,
i.e., the ability of an agent to properly balance efficiency and efficacy in order to
survive in many different environments. The robustness of an ecosystem is then
dependent on the robustness of its members. Thus, if ecosystems can be made more
robust, they can perform their functions longer and better, which will make costly
redesigns unnecessary or at least less costly and less frequent.
A time based security model [8] quantifies the exposure time related to a virus
attack as dependent on protection, detection and response. Each of these variables in
turn has a quality factor, expressed as the strength of the mechanism and some
active time interval. If the final result of the virus battle is a balance where viruses
do harm only in exceptional cases, because of an improved time based security, we
probably manage better with viruses than without them. Just as in natural ecosystems,
we should not expect systems that are easy to manipulate. An information ecosystem
with humans using their Machiavellian intelligence to control perceiving agents is a
much more likely consequence.
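The time based security model can be summarised in a few lines of code. The sketch below is our reading of the relation between protection, detection and response time; the function name and the time units are illustrative, not taken from [8].

```python
def exposure_time(protection, detection, response):
    """Time based security (after [8]), as we read it: a defence holds if
    the protection mechanism resists attack for longer than it takes to
    detect the attack and respond to it. Returns the exposure window in
    the same time units (0 if the system is covered)."""
    return max(0.0, (detection + response) - protection)

# A filter that buys 2 hours of protection, with detection taking 3 hours
# and response 1 hour, leaves a 2-hour exposure window.
print(exposure_time(protection=2, detection=3, response=1))  # 2.0
```

An arms race that shortens detection and response times, or lengthens protection time, shrinks this window, which is the sense in which viruses can act as a catalyst for robustness.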

3.2 Spyware


Today there are millions of Web sites reachable by any Internet user. These users
spend more and more time customizing and improving their Web browsing
experience through third-party browser extensions. These extensions may be able
to monitor and report the behavior of the users to anyone needing this information.
Cookies are used to register information and later retrieve or modify this
information in other Web browsing sessions. A Web site may store information about the
user in a cookie that the user does not know about; the cookie mechanism could
then be considered a form of spyware, abusing privacy on the Internet.
In general, spyware (the exploiter agent) is any technology that aids in gathering
information about a person or organization without their knowledge. On the
Internet, spyware consists of programs that are put on someone's computer to secretly
gather information about the user and relay it to advertisers or other interested parties.
In a recent report, Martin et al. [4] examined business-to-consumer browser add-on
products. The research focused on the ways that the browser extensions collected
user information, how the data flowed back to the company's database and how the
data capture techniques matched the privacy policies of each company.
"We downloaded 16 Internet Explorer browser extensions and watched them
work. A number were well behaved. But others seemed to outright exploit
our hospitality, watching and reporting our every move in the browser, some
intercepting data sent to competitors and others reporting back to headquarters
on pages we 'securely' downloaded using Secure Socket Layers" [4]
There is a lack of trust in third-party browser extensions, and recently anti-
spyware tools (user agents) have been offered to Web users. These programs inform the
user about any unwanted cookies present and how to get rid of them.
For both spyware and anti-spyware there is a profit calculation involved. The
consumer does not know the exact purpose of the data collection, but may suspect
that the information is sold to a third party. Commercial anti-spyware products are
replacing the first free products. The spyware evolution is similar to the virus example
but may be less destructive because:
• The consumer is normally partly informed about the collecting activity (but not
about the future use of the data).
• The consumer may be offered an additional web-service or freeware program.
The spyware acts as the commercial part of the offered service.
• The data collection may improve the development of e-business.
An arms race within spyware may improve some e-business concepts, i.e., act
as resource-generating marketing. Such an improvement may give competing
companies an advantage over companies not using spyware.

4 Conclusions

Based on the assumption that software agents may mediate the Machiavellian
intelligence of their human owners, there are a lot of similarities between

information and biological ecosystems. The main conclusion we draw from these
similarities is that arms race is a major force within information ecosystems.
Both examples presented in section 3 show a development of the information
ecosystem through an arms race. From a system perspective, this can be seen as a
positive thing because the ecosystem will become more robust. If a user knows
about complications caused by exploiting agents and prepares to defend against these
intruders, the user will get off better than if unprepared. From the user's
perspective, the disadvantage is the resources, e.g., money and time, spent on
procuring anti-virus and anti-spyware programs.
Machiavellian intelligence has arisen through an arms race in the
capacity to deceive, but this does not mean we have lost our (inherited) capability to
cooperate. The choice between long-term cooperation and gaining a short-term
advantage by being selfish is called the Prisoner's Dilemma within game theory [7].
The Prisoner's Dilemma describes the rise of cooperation in both social [1] and natural
science [5], within a restricted domain. The results from the analysis of the Prisoner's
Dilemma can be described as follows: every agent wins by cooperation, but if everybody
else cooperates, a single agent will benefit by being selfish; if no one cooperates,
everybody will be worse off. Most efforts today to solve this dilemma rely on
legislative methods but, as stated previously, we argue that there is a self-adjusting
quality that influences the dynamics of antagonistic information ecosystems.
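The Prisoner's Dilemma structure described above can be made concrete with a standard payoff matrix. The numerical values below are the conventional textbook payoffs, not taken from [7]:

```python
# Standard Prisoner's Dilemma payoffs (row player's score), with the
# usual ordering T > R > P > S: temptation, reward, punishment, sucker.
PAYOFF = {
    ("cooperate", "cooperate"): 3,  # R: everyone wins by cooperation
    ("cooperate", "defect"):    0,  # S: the lone cooperator is exploited
    ("defect",    "cooperate"): 5,  # T: the lone defector benefits
    ("defect",    "defect"):    1,  # P: no one cooperates, all worse off
}

def score(me, other):
    return PAYOFF[(me, other)]

# Defecting dominates against either choice of the opponent...
assert score("defect", "cooperate") > score("cooperate", "cooperate")
assert score("defect", "defect") > score("cooperate", "defect")
# ...yet mutual cooperation beats mutual defection.
assert score("cooperate", "cooperate") > score("defect", "defect")
```

The three assertions restate the dilemma: selfishness is individually rational, but universal selfishness leaves everybody worse off than universal cooperation.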

References

1. Axelrod, R., The Evolution of Cooperation, Basic Books (1984)
2. Dennett, D.C., Darwin's Dangerous Idea, Allen Lane/The Penguin Press, London
(1995)
3. Gardenfors, P., Hur Homo blev sapiens, Bokforlaget Nya Doxa, Nora (2000)
4. Martin Jr., D.M., Smith, R.M., Brittain, M., Fetch, I., and Wu, H., The Privacy
Practices of Web Browser Extensions, University of Denver (2000)
5. Maynard Smith, J., Evolution and the Theory of Games, Cambridge University
Press (1982)
6. Rao, A.S., and Georgeff, M.P., "BDI agents: from theory to practice." In
Proceedings of the First International Conference on Multi-Agent Systems, pp.
312-319, AAAI Press/The MIT Press (1995)
7. Rapoport, A., and Chammah, A.M., Prisoner's Dilemma: A Study in Conflict
and Cooperation, The University of Michigan Press, Ann Arbor (1965)
8. Schwartau, W., Time Based Security: Practical and Provable Methods to Protect
Enterprise and Infrastructure, Networks and Nation, Interpact Press (1999)
9. Williams, G.C., Adaptation and Natural Selection, Princeton University Press
(1966)
10. Wilson, E.O., Sociobiology: The Abridged Edition, Belknap Press (1980)
THE CoDAC COLLABORATION FRAMEWORK

K. W. NG T. O. LEE
Department of Computer Science & Engineering, The Chinese University of Hong Kong,
Shatin, N. T. Hong Kong, China
E-mail: {kwng, tolee}@cse.cuhk.edu.hk

The Internet has been expanding rapidly over recent decades, as have the activities
conducted over the World Wide Web. The complexity of online services grows along
with the increasing online population. The robustness of network applications and
distributed systems can no longer be sustained effectively by traditional distributed
programming approaches. For this reason, the mobile agent
paradigm has emerged as a promising methodology to resolve complex distributed
computation problems at high scalability. In this paper, we present a Componentware
for Distributed Agent Collaboration (CoDAC) as a solution to general agent
coordination problems. CoDAC implements the component model to offer flexible and
reliable coordination support to mobile agents distributed over the Internet.

1 Introduction

The mobile agent paradigm brings benefits in many ways. An agent may continue to operate
even if it is temporarily disconnected from the network, as it essentially performs its
operation locally at the data source. In fact, an agent can be kept offline, immune to any
harm caused by network latency, for most of its execution. In addition, it utilizes
the limited bandwidth efficiently by sending only the relevant results over the network. All
these benefits justify the deployment of agents in the distributed computation environment.
The multiagent paradigm stems from employing multiple agents to add further capabilities
and performance to distributed systems. The multiagent paradigm further unravels the
potential of software agents in realizing various attractive goals, for example, more elaborate
services, parallel processing, and increased system throughput with high flexibility and fault
tolerance.
In this paper, we present a Componentware for Distributed Agent Collaboration
(CoDAC) as a solution to general agent coordination problems. CoDAC utilizes the
component model [7] to offer flexible and reliable coordination services to mobile agents
distributed over the network. It functions on top of the Jini infrastructure [1, 4] in order to be
deployable with plug-and-play capability at runtime. CoDAC encapsulates its constituent
features with respect to the enforcement of common knowledge [2] and interacts with agents
through well-defined interfaces. It features modularized and interchangeable building blocks
for multiagent systems. On top of that, it exercises the self-managing property to manage
resources of its own and adds no management burden on the associated agents.


2 Collaboration Life Cycle

2.1 Initialization
At the very beginning, the coordinator agent [8] c starts a collaboration group by instantiating
a Distributed Agent Adapter (DA adapter) [8] with a unique group ID. This instance of DA
adapter, in turn, discovers all available lookup services on the network. The DA adapter
opens the collaboration group to the public by registering a serialized instance of its
clone as a service proxy on each lookup service it has discovered. Each registered proxy
shares the same service ID [5]. Each agent p that intends to join a
collaboration group first gains access to one or more lookup services around it. Next, p
searches for the desired service proxy, that is, a serialized instance of a DA adapter in our case,
through the lookup service. The search criteria can be based on the group ID, the Jini service
ID [5] or even the agent ID of the coordinator. Once the desired collaboration service is
located, the relevant DA adapter will be downloaded to p. After being deserialized, the DA
adapter contacts the original DA adapter (the one associated with c) and issues a request to
join the collaboration group on behalf of p. In response, the DA adapter of c verifies the
request, checks for data consistency and grants membership for p under mutual agreement
with all available members within the group. Such mutual agreement is enforced by the group
membership protocol described in [8]. If the request is granted, p becomes part of the group
and is ready to collaborate.
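The initialization steps above can be sketched as follows. This is a language-neutral illustration in Python, not the actual Jini-based implementation; the class and method names (LookupService, DAAdapter, request_join, etc.) are ours, and verification and the mutual-agreement protocol of [8] are elided.

```python
import copy
import uuid

class LookupService:
    """Stand-in for a Jini lookup service: a registry of service proxies."""
    def __init__(self):
        self._registry = {}

    def register(self, service_id, proxy):
        self._registry[service_id] = proxy

    def find(self, service_id):
        return self._registry.get(service_id)

class DAAdapter:
    """Distributed Agent Adapter for one collaboration group."""
    def __init__(self, group_id, coordinator):
        self.group_id = group_id
        self.coordinator = coordinator
        self.members = [coordinator]
        self.origin = self  # the adapter held by the coordinator c

    def advertise(self, lookup_services, service_id):
        # Register a clone of this adapter as a service proxy on every
        # discovered lookup service; all proxies share the same service ID.
        for lookup in lookup_services:
            proxy = copy.copy(self)
            proxy.origin = self
            lookup.register(service_id, proxy)

    def request_join(self, agent):
        # A downloaded proxy forwards the join request to the original
        # adapter, which grants membership (verification elided).
        self.origin.members.append(agent)
        return True

# Coordinator c opens a group and advertises it on two lookup services.
lookups = [LookupService(), LookupService()]
service_id = uuid.uuid4().hex
adapter = DAAdapter(group_id="g1", coordinator="c")
adapter.advertise(lookups, service_id)

# Agent p locates the proxy via a lookup service and joins the group.
proxy = lookups[0].find(service_id)
proxy.request_join("p")
print(adapter.members)  # ['c', 'p']
```

In the real framework the proxy is a serialized clone that is downloaded and deserialized by p before contacting the original adapter over the network.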

2.2 Resources Gathering


The collaboration begins with the assembling of available resources within the collaboration
context. To begin collaboration, the DA adapter of c instantiates a collaboration manager [8].
Upon initiation, the collaboration manager issues a collaboration request to each DA adapter
within the collaboration context. This request signals each participating agent to deliver its
individual computation results to c. As a Distributed Agent Manager (DA manager) [8]
receives the collaboration request, it notifies its associated agent immediately by firing a
GatherResourcesEvent. In response, the agent presents the relevant data to the DA manager
as soon as the data is available, and the DA manager simply forwards the data to the
collaboration manager. The resources gathering phase terminates after all the participating
agents have contributed their computation results or when the collaboration manager times
out. In either case, all the gathered information will be delivered to the kernel [8] for analysis.
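The resources gathering phase amounts to a scatter-gather with a timeout. A minimal sketch follows; the function name gather_resources is ours, and each agent is reduced to a callable rather than a full DA manager/agent pair:

```python
import queue
import threading
import time

def gather_resources(agents, timeout):
    """Ask every agent for its result; stop when all have answered or
    when the collaboration manager times out. Whatever was gathered is
    returned either way (it then goes to the kernel for analysis)."""
    results = queue.Queue()
    for name, compute in agents.items():
        # Each DA manager fires a GatherResourcesEvent at its agent;
        # here each agent is just a callable run in a thread.
        threading.Thread(
            target=lambda n=name, c=compute: results.put((n, c())),
            daemon=True,
        ).start()
    gathered = {}
    deadline = time.monotonic() + timeout
    while len(gathered) < len(agents):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # the collaboration manager timed out
        try:
            name, value = results.get(timeout=remaining)
            gathered[name] = value
        except queue.Empty:
            break
    return gathered

agents = {"a1": lambda: 1, "a2": lambda: 2}
print(gather_resources(agents, timeout=1.0))
```

The timeout is what keeps a crashed or slow agent from blocking the whole collaboration.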

2.3 Results Delivery


After the kernel has completed evaluating the collection of data assembled, it comes up with
some kind of collaboration results and the collaboration may end. At the end of collaboration,
the kernel returns the collaboration results to the collaboration manager, which, in turn,

forwards the collaboration results to each DA manager within the collaboration context inside
a transaction. The underlying atomic commitment protocol will be described in Section 3.
Eventually, all collaborating agents will install the same collaboration results consistently as
long as the transaction commits while the coordinator may initiate subsequent collaboration as
needed.

3 Atomic Commitment Protocol

After the kernel has finished computing the collaboration results R, it returns R to the
collaboration manager. The collaboration manager is then responsible for coordinating all
agents within the group to deliver R consistently in order to terminate the collaboration
transaction. The protocol proceeds in five rounds [3] as follows:
1. The collaboration manager sends a deliver_req predicate enclosed with R to every DA
manager within the collaboration context.
2. Next, each DA manager fires a PrepareDeliveryEvent, embedded with R, to the
associated agent.
3. In response, each agent checks its own state to see if it can commit to R. An agent may
throw a VetoDeliveryException to vote against delivering R, or it may remain silent to
indicate an implicit agreement.
4. The DA managers return the appropriate vote (either yes or no) to the collaboration
manager on behalf of the participating agents.
5. The collaboration manager collects all the votes among the group:
a) If none of the participants vetoes the transaction, the decision will be to deliver R.
The collaboration manager will coordinate all DA managers to deliver R by initiating
a Jini transaction [6] to forward a deliver predicate to every DA manager.
b) Otherwise, the collaboration manager will coordinate the rollback of R by initiating a
Jini transaction to deliver a rollback predicate to every DA manager.
6. Finally, each DA manager receives either a deliver or a rollback predicate as the
transaction terminates. The DA manager then signals the agent whether to deliver or
abort R by firing the CommitDeliveryEvent or AbortDeliveryEvent respectively.
Figure 1 summarizes the above protocol. For simplicity, only one agent and one DA
manager are shown interacting with the collaboration manager. The delivery of each R is
totally ordered by the transaction ID.
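The voting and decision logic of the protocol can be sketched as follows. This is a simplified, single-process illustration using our own naming for the decision function; the real protocol runs over Jini transactions, with the timeouts and recovery described next.

```python
class VetoDeliveryException(Exception):
    """Raised by an agent in step 3 to vote against delivering R."""

def decide_delivery(agents, R):
    """Steps 1-5: collect votes and decide 'deliver' or 'rollback'.
    Each agent is modelled as a callable prepare(R) that either returns
    silently (implicit yes vote) or raises VetoDeliveryException (no)."""
    for prepare in agents:
        try:
            prepare(R)            # PrepareDeliveryEvent carrying R
        except VetoDeliveryException:
            return "rollback"     # a single veto aborts the transaction
    return "deliver"              # unanimous (implicit) agreement

def accepting(R):   # an agent able to commit stays silent
    pass

def vetoing(R):     # an agent unable to commit vetoes
    raise VetoDeliveryException

print(decide_delivery([accepting, accepting], R={"result": 42}))  # deliver
print(decide_delivery([accepting, vetoing], R={"result": 42}))    # rollback
```

Step 6 then maps the decision onto CommitDeliveryEvent or AbortDeliveryEvent at every DA manager.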
[Figure 1 (sequence diagram, not reproduced): the collaboration manager sends deliver_req
to the DA manager, which fires a PrepareDeliveryEvent at the agent; the agent either stays
silent or throws a VetoDeliveryException; the DA manager returns its vote, and the
transaction terminates with either a deliver predicate and a CommitDeliveryEvent (commit)
or a rollback predicate and an AbortDeliveryEvent (abort).]

Figure 1. Atomic commitment protocol
Whenever the delivery of R starts from step 1, there are two phases in the protocol where
some CoDAC entity is waiting for remote messages: in the beginning of step 5 and step 6. As
remote messages may get lost or their delivery time may vary due to link failures or network
latency, these phases are bounded to a timeout delay d to trigger fault discovery. The actions
triggered by a timeout are explained as follows.
In step 5, the collaboration manager is waiting for votes from all the DA managers. At
this stage, the collaboration manager has not yet reached any decision. In addition, no
participating agent can have decided to commit. Therefore, if it times out without getting all
the votes needed to make the decision (e.g., because a vote is lost or delayed, an agent has
crashed, or the request never reached the agent in the first place), the collaboration manager
can decide to abort and proceed to step 6 by sending a rollback predicate to every DA manager.
In step 6, a DA manager that voted yes is waiting for a deliver or rollback predicate in
return. In this case, the DA manager cannot unilaterally decide to rollback, because the Jini
transaction guarantees that one of these two predicates will eventually reach all DA
managers as long as the collaboration manager (and the associated coordinator) keeps
functioning, although the delivery time may vary. Therefore, the DA manager should
not decide to rollback unless it gets a rollback predicate or has certified the coordinator as
crashed. In other words, the timeout triggers a fault discovery and the necessary recovery
procedure. This is done as follows:
When a DA manager dmgrp times out in step 6 of the commitment protocol, it retrieves
the coordinator channel in the space and writes a decision_req predicate to it. If the channel
cannot be found in the first place (because the coordinator failed to renew the lease on its
channel), then the coordinator may have failed and dmgrp thus triggers the recovery through
the group membership protocol described in [8]. Otherwise, dmgrp waits for another d units
of time before it re-issues the decision_req. In the meantime, dmgrp may also break the loop
and proceed with the recovery as soon as the lease on the coordinator channel expires.
On the other hand, the collaboration manager, in response to the decision_req, checks to
see if it has gathered enough votes to make the decision. If it possesses enough knowledge to
decide or if it has actually decided but the decision somehow has not been delivered to the
agents yet (perhaps due to network latency), then the collaboration manager retransmits the
decision to all DA managers inside a Jini transaction given the same transaction ID.

Otherwise, it waits until either all votes are gathered or its timer expires, and delivers the
appropriate decision by then.
Otherwise, if the original coordinator has crashed, the new coordinator c' elected from
the recovery protocol coordinates all agents to rollback. The atomicity is still preserved
because the Jini transaction model guarantees no participating agent can have committed.
Hence, c' can rollback the delivery of R by distributing a rollback predicate inside a Jini
transaction to all agents within the group.

4 Conclusion

We have introduced CoDAC as a comprehensive solution to general agent coordination
problems. The major contribution of CoDAC is to embed atomic commitment capabilities
into collaboration amongst distributed agents with enhanced fault tolerance. It delivers the
core functionality to attain common knowledge within a collaboration group and signals all
participants to take consistent actions. It fulfills the key properties of the component model
to offer a reliable coordination service to mobile agents with plug-and-play capability,
encapsulated functionality and self-managing capacity. Beyond that, CoDAC bridges the gap
between different agent platforms with its strong compatibility, enabling heterogeneous
agents to join a collaboration group as a whole.

References

1. W. Keith Edwards, Core Jini, The Sun Microsystems Press Java Series, Prentice
Hall, Inc., Sept (1999).
2. Joseph Y. Halpern and Yoram Moses, "Knowledge and Common Knowledge in a
Distributed Environment", Journal of the Association for Computing Machinery, Vol. 37,
No. 3, July (1990), pp. 549-587.
3. Suciu, O., Cristian, F., "Evaluating the performance of group membership protocols",
Engineering of Complex Computer Systems, (1998), pp. 13-23.
4. Sun Microsystems, Jini™ Architecture Specification, Version 1.1 Alpha, Nov (1999),
http://www.sun.com/jini/
5. Sun Microsystems, Jini™ Lookup Service Specification, Version 1.1 Alpha, Nov
(1999), http://www.sun.com/jini/
6. Sun Microsystems, Jini™ Transaction Specification, Version 1.1 Alpha, Nov (1999),
http://www.sun.com/jini/
7. Clemens Szyperski, Component Software, ACM Press Books, Addison-Wesley, (1997)
8. T.O. Lee and K.W. Ng, A Componentware for Distributed Agent Collaboration, in Proc.
of First Int. Workshop on Web-Agent Systems and Applications, IEEE Computer
Society, (2000), pp. 780-784.
A Multi-agent Approach to Modelling
Interaction in Human Mathematical Reasoning

Alison Pease, Simon Colton, Alan Smaill, John Lee


Division of Informatics, University of Edinburgh,
80 South Bridge, Edinburgh, EH1 1HN, Scotland
{alisonp, simonco}@dai.ed.ac.uk, A.Smaill@ed.ac.uk, john@cogsci.ed.ac.uk

1 Introduction
Current work in automated reasoning does not in general model social aspects
of human mathematics, with a few exceptions, for example [1]. We are interested
in modelling concept and conjecture refinement, i.e., the way in which
the definition of a concept evolves as a conjecture develops. Modelling this
process is important because (a) it will illuminate aspects of the social nature
of mathematics and (b) it may be useful for improving existing automated
reasoning programs. In §2 we outline descriptions by Devlin and Lakatos of
the human process. In §3 we describe an agent architecture for this task and
how it could be implemented using the HR theory formation system [2].

2 Social Aspects of Human Mathematical Reasoning


Devlin [3] challenges the viewpoint that mathematics should be seen as a purely
logical endeavour. Logic seeks to abstract terms from their context and manip-
ulate them in an objective way which is independent of the motivation behind
the manipulation. Instead Devlin claims that any field of human knowledge
should be interpreted in terms of communication and context, in which mean-
ing and motivation play key roles. He argues that humans act rationally rather
than logically, i.e. they act according to individual goals, desires and capabil-
ities rather than following objective rules of logic. Therefore any sphere of
human knowledge, even mathematics, should be seen as rational rather than
logical.
Lakatos [4] highlights the roles that intuition, dialectic and group dynamics
play in mathematics. He presents a rational reconstruction of the development
of Euler's Conjecture and its proof, as a discussion between a cast of students
and a teacher. The students start with different examples of polyhedra,
from which someone suggests that 'for all polyhedra there is a relationship
between the number of vertices (V), edges (E) and faces (F), namely that
V - E + F = 2' (C). The discovery of counter-examples soon leads to a discussion
of what constitutes a polyhedron and whether it is possible or useful to

define a subset of polyhedra for which the equation holds. According to their
intuition (influenced by their experience of objects they classify as polyhedra),
the students use different methods which enable them to accept, reject or
improve the concept or conjecture. We list some of the methods.
1. Induction - generalise from particulars. (Since the equation holds for all
regular polyhedra, it holds for all polyhedra, i.e., C.)
2. Surrender - look for counter-examples and use them to refute C. (The
'hollow cube', a cube with a cube-shaped hole in it, is a counter-example since
16 - 24 + 12 = 4.)
3. Monster-barring - given a counter-example, modify the definition of the concept
or subconcept so as to exclude it. (The hollow cube is not a polyhedron
and therefore is not a real counter-example.) Note that Lenat's AM program
was able to perform monster-barring [5].
4. Exception-barring 1: piecemeal exclusion - find those properties which make
a counter-example fail C and then modify C by excluding that type of counter-example.
(Generalising from the hollow cube we say that any polyhedron with
a cavity will be a counter-example. Therefore C becomes C': 'for all polyhedra
without cavities, V - E + F = 2'.)
5. Exception-barring 2: strategic withdrawal - instead of listing the exceptions
(as above), withdraw into a much smaller domain for which C seems certain to
hold. (Generalising from the examples for which the equation holds, we see that
they are all convex. So C becomes C'': 'for all convex polyhedra, V - E + F = 2'.)
Devlin and Lakatos both stress the development of mathematics through
social interaction. This indicates that an agent architecture, in which the
agents are defined by their intuitions, motivations and actions would provide
an appropriate framework.

3 An Agent Architecture for Concept Refinement


We define below an architecture of equal-status agents in terms of the problem
being addressed, what the agents do and how they communicate. Following
this, we suggest an implementation of this architecture using the HR theory
formation system.
The Problem: We want to model the social process of concept refinement in
the face of conjectured general properties and counter-examples to them.
The Agents: The task of the agents in our architecture is to develop in-
teresting concepts, conjectures and examples, and to react accordingly to the
introduction of counter-examples to a false conjecture. The methods outlined
above guide the roles that intuition, motivation and action play. Intuition is
built up via experience and used to generate concepts and conjectures via
induction. Each agent is given a different set of examples, so that experience


differs between them, and their intuitions vary. The strength of the intuition
also varies, depending on the number of examples from which it derives. Intu-
ition is used to guide the motivation to action. Motivation includes accepting,
rejecting and modifying concept definitions and conjectures. Actions specify
ways in which to do this, such as the methods outlined above. For example,
an agent which suggests a conjecture based on many examples would have a
strong intuition about it and in the face of counter-examples, the motivation
would be to modify it rather than reject it, using either method 4 or 5 above.
Intuition and motivation are dynamic since agents influence each other.
Communication: The agents communicate by sending concepts, counter-
examples and conjectures whenever one is found and negotiating about con-
cept definitions. In Lakatos' discussion (temporary) agreement about concept
definitions is reached by a strategy of accepting the most exclusive definition
in order to continue the discussion. Another strategy might be to accept the
definition suggested by the agent with the strongest intuition.
In [2] Colton describes a multi-agent approach to concept formation in
pure mathematics, using the HR program which invents concepts, makes and
proves theorems and finds counter-examples to false conjectures. Four copies of
HR were employed as agents in an architecture where they communicated the
most interesting concepts they invented. By enabling them to communicate
conjectures and counter-examples instead, we suggest that the system could
model the process of concept refinement using the architecture described above.
Each copy of HR must be supplied with a set of objects of interest (for
example integers in number theory) from which it makes conjectures empiri-
cally about the concepts it invents. We propose to give each agent a different
set of objects of interest. This means that the conjectures they make will be
more likely to be false as they are based on a subset of the empirical evidence
available. However, it will provide an opportunity for concept and conjecture
refinement to occur in the following way:
1. Agent X makes a conjecture about a concept and communicates it to all
agents.
2. Each agent looks at the examples it has and communicates back any counter-
examples it finds.
3. X uses the counter-examples to redefine the concept and restate the conjec-
ture. For example, all the others are asked for concepts which are true about
their counter-examples. If one concept is true of all the counter-examples, X
generalises the counter-examples into a concept definition and fixes the con-
jecture by exception-barring. It then starts the process again.
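The three-step protocol above can be sketched in a few lines. The snippet below is our own toy rendering, not the HR system's actual interface; the `Agent` class and helper names are illustrative assumptions.

```python
# Our own toy rendering of the three-step protocol (not the HR system's interface).
def has_even_divisor_count(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0) % 2 == 0

class Agent:
    def __init__(self, objects):
        self.objects = list(objects)        # this agent's private examples

    def counter_examples(self, conjecture):
        # Step 2: check the received conjecture against local examples.
        return [x for x in self.objects if not conjecture(x)]

def refine(agents, conjecture):
    # Step 1: the conjecture is broadcast; step 2: counter-examples come back;
    # step 3: exception-barring restates the conjecture minus the offenders.
    counters = sorted(c for a in agents for c in a.counter_examples(conjecture))
    if not counters:
        return conjecture, []
    barred = set(counters)
    return (lambda n: n in barred or conjecture(n)), counters

x, y, z = Agent(range(1, 11)), Agent(range(11, 51)), Agent(range(51, 61))
fixed, counters = refine([x, y, z], has_even_divisor_count)
print(counters)                             # [1, 4, 9, 16, 25, 36, 49]
print(all(fixed(n) for n in range(1, 61)))  # True
```

Note that this sketch only bars the specific counter-examples; the further generalisation step (recognising them as instances of an existing concept such as square numbers) is what the exception-barring method adds.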
As a theoretical example in number theory, suppose agents X, Y and Z

are working with the numbers 1 - 10, 11 - 50 and 51 - 60 respectively. Using


the method of induction, Z forms the conjecture C: all integers have an even
number of divisors, and sends it to the others. X finds and communicates
counter-examples 1, 4 and 9. Since a high proportion of his knowledge refutes
C, he has a strong intuition that it is false (not worth modifying). Therefore
his motivation is to reject the conjecture and he uses the method of surrender
to claim ¬C. Y finds and communicates counter-examples 16, 25, 36 and
49. Since they form quite a low proportion of his knowledge his intuition
is that it might be worth modifying C, so he uses the method of exception-
barring (piecemeal withdrawal) and looks for a generalisation of the known
counter-examples. He finds the concept square numbers which has already
been formed, and modifies C to Cy = all integers except square numbers have
an even number of divisors. Z might use the method of exception-barring
(strategic withdrawal) to modify C to Cz = all integers between 50 and 60
have an even number of divisors. The new conjectures are now considered and
counter-examples sought, with none found as they are all true (¬C is a dead
end, Cy an interesting conjecture and Cz trivial).
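The worked example can be checked mechanically. The snippet below is our own illustration of Y's piecemeal withdrawal; it relies on the fact that an integer has an odd number of divisors exactly when it is a square (divisors pair up as (d, n/d) except when d = n/d).

```python
import math

# Our own check of Y's piecemeal withdrawal: the counter-examples are exactly
# the squares, because divisors pair up as (d, n/d) except when d = n/d.
def is_square(n):
    return math.isqrt(n) ** 2 == n

def divisor_count(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)

counters = [16, 25, 36, 49]
print(all(is_square(c) for c in counters))          # the known concept covers them

# Cy: all integers except square numbers have an even number of divisors
cy = lambda n: is_square(n) or divisor_count(n) % 2 == 0
print(all(cy(n) for n in range(1, 200)))            # no counter-example survives
```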
The methods described by Lakatos have thus led in the above example to
the discovery of the concept all integers except square numbers and the conjec-
tures all integers except square numbers have an even number of divisors and
all integers between 50 and 60 have an even number of divisors. It shows how
we might begin to model simplified social aspects of mathematical reasoning
via agent interaction.

4 Future Work and Conclusions

We intend to implement the number theory example in the agent architecture


to observe concept and conjecture refinement. Testing will be carried out to
ascertain whether the architecture improves HR's performance or degrades it
(by wasting time on false or trivial conjectures), where performance is measured
by the number of interesting concepts, conjectures, and examples found.
There is also much scope for extending the architecture suggested above.
Lakatos describes further methods which allow interaction between proof at-
tempts and counter-examples. For instance, a counter-example may show
where a proof is faulty (which steps of the proof it violates) and analysis of the
faulty proof may suggest a counter-example. The value of a hierarchy should
also be investigated. In Lakatos' discussion there is a teacher whose role is
to stimulate (suggest a controversial conjecture), guide (judge when a certain
topic or method has been fully explored and suggest another) and evaluate
discussion (point out any inconsistencies in the students' claims). This may

tie in with a notion of respect between agents (thus more realistically simu-
lating group dynamics). Agents could record the sender's name along with a
received message and build a respect measure from the value of the message.
They would then give priority to messages from more highly respected agents.
This extended architecture would better capture what is meant by social in-
teraction. The dialogue involved in producing a mathematical theory should
then itself be evaluated, although this will be harder (since it is a qualitative
judgement).
Modelling social aspects of mathematical reasoning within an agent ar-
chitecture is worthwhile since it would lead to a better understanding of the
human process. This would have theoretical value for philosophers of math-
ematics and practical value for students of mathematics, as a history of the
proof, including failures and collaboration between experts would avoid the
mystifying steps which are often a feature of published mathematics. Ad-
ditionally in providing new methods it may show how to model aspects of
mathematics not yet automated, or provide more efficient ways of modelling
those aspects already automated. The theoretical example suggests that im-
plementation of the architecture described is a very promising approach.

Acknowledgements
We would like to thank Paul Crook for comments on an earlier draft, as well
as the anonymous reviewers for their comments. This work was supported
by EPSRC grants GR/M45030 and GR/M98012. The second author is also
affiliated with the Department of Computer Science, University of York.

References
[1] C. Benzmüller, M. Jamnik, M. Kerber, and V. Sorge. An agent-oriented ap-
proach to reasoning. In Proceedings of the German Conference on Artificial
Intelligence. Springer, 2001.
[2] S. Colton, A. Bundy, and T. Walsh. Agent based cooperative theory for-
mation in pure mathematics. In Proceedings of AISB-00, pages 11 - 18.
The Society for the Study of Artificial Intelligence and the Simulation of
Behaviour, UK, 2000.
[3] K. Devlin. Goodbye, Descartes. John Wiley & Sons, Inc., NY, 1997.
[4] I. Lakatos. Proofs and Refutations. CUP, Cambridge, UK, 1976.
[5] D. Lenat. AM: An Artificial Intelligence Approach to Discovery in Math-
ematics. PhD thesis, Stanford University, 1976.
SECURE ASYNCHRONOUS SEARCH

MARIUS-CALIN SILAGHI, DJAMILA SAM-HAROUD, AND BOI FALTINGS


Swiss Federal Institute of Technology Lausanne
1015 Ecublens, Switzerland
{silaghi, haroud, faltings}@lia.di.epfl.ch

Distributed Constraint Satisfaction (DisCSP) is a general framework for modeling


distributed combinatorial problems. Practical distributed problems can involve
competitive situations among agents. In such cases not all agents are needed
for building a final solution, and agents may try to prevent their competitors from
reaching a solution. This peculiarity cannot be modeled by current DisCSP for-
malisms. In particular, there is no mechanism for preventing agents from dis-
seminating fake nogoods in order to impede their competitors from proposing or
reaching a good solution. In this paper we extend a DisCSP framework in order
to model competition. We adapt one of the most recent search algorithms to offer
agents means to check that received messages are legal.

1 Introduction

A wide variety of problems such as negotiation, resource allocation, design


or scheduling are inherently distributed. Importing techniques from one prob-
lem to another is easier when general frameworks are used. Distributed
Constraint Satisfaction (DisCSP) provides such a framework for static dis-
tributed combinatorial problems. A DisCSP is composed of a set of agents
A = {A1, A2, ..., An} and a set of k variables V = {v1, v2, ..., vk}, each of them
under the control of the agents interested in it. The variables in V are called
external variables. With each agent Ai is associated a set of external variables
Vi = {vi1, vi2, ..., vimi}, Vi ⊆ V, and a set of constraints Ci = {ci1, ci2, ..., ciki}
such that any external variable constrained by a constraint in Ci is also con-
tained in Vi. The domain of a variable vi is Di. All the variables xj constrained
by constraints in Ci, and such that xj ∉ Vi, are said to be internal.
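As an illustration, the agent-local view of a DisCSP described above could be represented as follows. This is our own minimal sketch; the class and field names are assumptions, not notation from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Constraint:
    scope: tuple        # names of the variables the constraint mentions
    relation: object    # callable: assignment values -> bool

@dataclass
class AgentProblem:
    external_vars: set                               # Vi, a subset of V
    domains: dict                                    # variable name -> domain Di
    constraints: list = field(default_factory=list)  # Ci

    def add(self, c):
        # Per the definition, every external variable of c must lie in Vi;
        # variables outside Vi are internal to this agent.
        self.constraints.append(c)
        return [v for v in c.scope if v not in self.external_vars]

a1 = AgentProblem({"x", "y"}, {"x": {0, 1}, "y": {0, 1}})
print(a1.add(Constraint(("x", "y", "t"), lambda x, y, t: x != y)))  # ['t'] is internal
```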
In problems with self-interested agents, the agents can actually be com-
peting for a resource or state and some agents can reach a solution without
the agreement of some others (e.g. with several clients or several providers).
The competitors are interested in concealing solutions they dislike. Often they
can do it by illegally generating nogood messages for solutions that normally
do not need their agreement. The existing distributed protocols for DisCSPs
do not offer the possibility to detect these byzantine failures.
In this paper, we present an extended DisCSP framework that can model
these missing features. Since in practice competition between agents often


occurs in conjunction with negotiation problems, the extended framework


also enables the agents to attach preferences to their alternatives and to relax
their constraints. We assume that the sum of preferences in solutions has to be
minimized. The relaxation consists in either reducing values or, as proposed
in [1], accepting new tuples of valuations. The new algorithms can prevent
neither coalitions among agents nor byzantine failures that act against the
failing agents themselves. However, the new technique helps agents avoid being
cheated through the distributed search protocol.

2 Dynamic DisCSPs

By dynamism we understand that the participation of an agent in the solu-


tion/search process is dynamically conditioned by certain value assignments.
The extended framework builds on the notion of Valued CSPs [2]. First we de-
scribe the problem of an agent, Au, as a Negotiation Valued CSP, NVCSPu.
NVCSPu consists of a minimal increment, ε, a set of external variables, V(u),
and an ordered set of global constraints, c1(u), ..., c_{nu}(u). The domain of each
external variable contains a value, F, meaning unchanged and indifferent.
Each pair (valuation v, constraint ci(u)) has an associated tuple:
T_i^v(u) = (feasible_i^v(u), preference_i^v(u)).
T_i^v(u) is such that if nu ≥ i > j > 0 then for any valuation v,
feasible_j^v(u) ⇒ feasible_i^v(u) and preference_j^v(u) ≥ preference_i^v(u). There ex-
ists a valuation, v, such that either feasible_j^v(u) ≠ feasible_i^v(u), or otherwise
feasible_j^v(u) = feasible_i^v(u) = T and preference_i^v(u) + ε ≤ preference_j^v(u).
A Dynamic DisCSP (DyDisCSP) is defined by a set of agents A0, ..., An.
Ak, k ∈ [0, h), n ≥ h > 0, are the h agents called initiators. Each agent Ai owns a NVCSP,
NVCSPi. Given a valuation, v, for a set of external variables, S(v) is the set
of agents owning a variable not instantiated in v to F. By convention, the
initiators always belong to S(v). An agent is active if it belongs to the minimal
subset, A(v), of S(v) such that S(v|_{vars(A(v))}) ∩ (S(v) \ A(v)) = ∅.
Definition 1 (Acceptable valuation) A valuation v is acceptable if each
agent in S(v) proposes for v a feasible associated tuple (feasible^v(i) = T).
Definition 2 (Solution) A solution of a DyDisCSP is an acceptable valua-
tion v of all the external variables such that if each agent Ai in S(v) is active
and proposes for v an associated tuple (T, preference_{ki}^v(i)), where ki ≤ ni,
then

v ∈ {b | b = argmin_v ( Σ_{Ai ∈ S(v), i ≥ h} preference_{ki}^v(i) )}


and no agent Ai, i > 0, wants to reveal a constraint cj, j > ki. The feasibility
condition is Σ_{Ai ∈ S(v)} preference_{ki}^v(i) ≤ 0.
The feasibility condition verifies that the solution is acceptable to the
initiators. If v is a solution of a DyDisCSP, then S(v) is the solver set for v.

3 Extending AASR

In this section we introduce Secure Asynchronous Search (SAS) which is an


adaptation of Asynchronous Aggregation Search with Reordering (AASR) to
the DyDisCSP framework. First we recall the basic elements of AASR [4].
Definition 3 (Aggregate) An aggregate is a triplet (v, s, h) where v is a
variable, s a set of values for v and h a history of the pair (v, s).
A history h for an aggregate a = (v, s, h) proposed by an agent Ak takes
the form of a list of pairs |i : l| where i is the index of an ancestor of Ak that has
made a proposal on v and l is the value of a counter. An aggregate requests
higher priority agents to comply with a proposal, therefore it defines by itself
a nogood. Such nogoods are called nogoods entailed by the view.
Definition 4 An explicit nogood has the form ¬V, where V is a list of aggregates.
The agents communicate via: ok, nogood, add-link and reorder mes-
sages. ok messages are sent from agent Aj to agent Ai, and have as parameter
a list of aggregates for variables in which Ai is interested, nogood messages
have as parameter an explicit nogood. add-link messages are sent from agent
Aj to agent Ai, informing Ai that Aj is interested in a set of variables, vars.
The agent on position i is denoted A^i. R^i is the agent that can reorder A^(i+1)
by sending reorder messages [4].

3.1 Legal messages


In AASR, both ok and nogood messages transport some kind of nogoods.
These are the nogoods entailed by the view, respectively the explicit nogoods.
In order to allow the agents to detect messages that are potentially harmful for
the quality of the computed solution, we introduce the notions of legal nogood
and legal aggregate. We want to prevent the agents from disturbing the search
by generating illegal messages. A message is illegal if it is generated by an
inactive agent. SAS requests agents to build messages in such a way that
their lawfulness can be proved.
Definition 5 (Legal explicit nogood) Any legal explicit nogood generated
by an agent Ai, where Ai is not an initiator, must contain at least one aggre-
gate (vj, s, h), vj ∈ V(i), such that s does not contain F.

Definition 6 (Justification) Each aggregate Ii generated by an agent Ai


that is not an initiator needs a justification. The justification of the aggregate Ii
consists of a pair (v, h) built from an aggregate (v, s, h) that activates Ai.
The justification of an aggregate, a, corresponds to a relaxation of the
nogood entailed by the view given by a and is stored in the history of the
aggregate, attached to the pair corresponding to the agent that has generated
a. A history now has the form |i1, l1, j1|i2, l2, j2|... where ik is the index of an
agent, lk is the value of an instantiation counter and jk is the justification of
the corresponding instantiation.
Property 1 The space needed by an agent to store all the aggregates is O(nv),
where n is the number of agents and v is the number of variables.
Corollary 1 The size of an aggregate is O(nv).
Property 2 SAS has polynomial space complexity in each agent.
The proofs are given in [3]. Besides generating illegal nogoods, the agents
can also generate illegal aggregates against their competitors.
Definition 7 (Legal aggregate) An aggregate is legal if its justification is
valid and the variable in the justification does not contain F in its instantia-
tion. By convention, any aggregate generated by an initiator is legal.
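The legality conditions of Definitions 5 and 7 can be rendered as simple predicates. The sketch below is our own illustration; the function names and data shapes (tuples for aggregates, a `still_valid` callback for justification validity) are assumptions, not the paper's notation.

```python
F = "F"   # the "unchanged and indifferent" value from the NVCSP domains

def legal_explicit_nogood(nogood, own_vars, is_initiator):
    # Definition 5: a non-initiator's nogood must mention one of its own
    # variables with an aggregate set that excludes F.
    if is_initiator:
        return True
    return any(var in own_vars and F not in values
               for (var, values, _history) in nogood)

def legal_aggregate(justification, still_valid, is_initiator):
    # Definition 7: the justification must be valid and F-free; initiators'
    # aggregates are legal by convention.
    if is_initiator:
        return True
    var, values = justification
    return still_valid(var, values) and F not in values

nogood = [("v1", {1, 2}, []), ("v2", {F}, [])]
print(legal_explicit_nogood(nogood, own_vars={"v1"}, is_initiator=False))   # True
print(legal_aggregate(("v1", {1}), lambda v, s: True, is_initiator=False))  # True
```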

3.2 The SAS protocol


In SAS the messages must prove that their sender is active. Agents must
generate only legal nogoods. Any other nogood would be discarded. The
next rule shows how legal nogoods can be obtained.ᵃ
Rule 1 (Nogood generation) An agent Ai may compute an explicit no-
good N that is not legal, but such that the set in the newest aggregate it has received
for some variable vj from V(i) does not contain F. Ai should add the newest
aggregate of vj to N. If this is not possible, it should refrain from sending N
to other agents. This rule does not apply to initiators.
Rule 2 (Checking) The receiver of an explicit nogood ¬N should check that
¬N is legal. Also, the receiver of any aggregate should check that the new
aggregate is legal. Illegal information is discarded.
The justifications trigger add-link messages under the same conditions as the
aggregates received in an explicit nogood in AASR. Justified nogoods should
not be delivered to the agent and integrated in the other structures inherited
from AASR before the answer to eventual add-link messages is received.
"When illegal nogoods are made legal, they are in fact relaxed. Agents that must relax
nogoods can use heuristics for choosing the variable Vj from V(i). (e.g. choosing the variable
for which the known aggregate was generated by an agent with the lowest position.)

Rule 3 (Justification change) Whenever the justification of an agent Ai


is modified, Ai has to send again all its aggregates.
Rule 4 (Justification invalidation) Whenever the justification J of a
stored aggregate a1 in Ai is invalidated by some incoming new aggregate a2,
Ai has to invalidate a1 and has to apply this invalidation rule again as if a
new aggregate of the variable in a1 had been received.
Each proposal that activates or inactivates agents is broadcast to all agents
with higher positions.
Rule 5 (Next active) If possible, acting as R^i, each A^i proposes new or-
ders to ensure that A^(i+1) is active. R^0 is an initiator.
Proposition 1 The Secure Asynchronous Search maintains the characteris-
tics of completeness, correctness, and termination of AASR.
SAS is an asynchronous protocol. A corresponding synchronous protocol
(SSS) can be obtained in an obvious way.

4 Conclusions

We present an approach to distributed problems with competition and


byzantine-failures. The concept of Dynamic Distributed Constraint Satisfac-
tion is proposed and we show how it allows for modeling complex characteris-
tics of such problems. As shown in [3], DyDisCSPs can easily model and solve
(Generalized) English Auctions. The presented algorithms and framework
inherit the generality and flexibility in modeling of Constraint Reasoning.

References

1. K. Hirayama and M. Yokoo. Distributed partial constraint satisfaction


problem. In CP, LNCS 1330, pages 222-236, 1997.
2. T. Schiex, H. Fargier, and G. Verfaillie. Valued constraint satisfaction
problems: hard and easy problems. In IJCAI, pages 631-637, 1995.
3. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. Generalized En-
glish Auctions by relaxations in DyDisCSPs with private constraints.
TR #01/365, EPFL, 2001.
4. M.-C. Silaghi, D. Sam-Haroud, and B. Faltings. ABT with asynchronous
reordering. IAT, 2001.
FOUNDATIONS OF MARKET-DRIVEN AGENTS:
AN ADAPTATION OF ZEUTHEN'S BARGAINING MODEL

KWANG MONG SIM1 and CHUNG YU CHOI2


1Department of Information Engineering,
Chinese University of Hong Kong, Shatin, NT, Hong Kong.
Email: kmsim@ie.cuhk.edu.hk
2Hong Kong and Shanghai Banking Corporation, Hong Kong.

Market-driven agents are negotiation agents that react to changing market situations by
making adjustable rates of concession. In determining the amount of concession for each
trading cycle, market-driven agents are guided by four mathematical functions of eagerness,
remaining trading time, trading opportunity and competition. At different stages of trading,
agents may adopt different trading strategies and make different rates of concession. Four
classes of strategies with respect to remaining trading time are discussed. Trading opportunity
is determined by considering: (i) number of trading partners, (ii) spreads - differences in
utilities between an agent and its trading partners, and (iii) probability of completing a deal.
While eagerness represents an agent's desire to trade, trading competition is determined by
the probability that it is not considered as the most preferred trader by its trading partners.

1 Introduction

One of the most crucial issues in both conventional and electronic commerce is for
both sellers and buyers to reach a consensus on pricing and other terms of
transactions. While there are extant agent-based negotiation systems [1, 2, 3, 4],
agents in these systems adopt fixed (pre-specified) negotiation strategies which
may not necessarily be the most appropriate strategy for changing market
situations. As products/services become available and traders enter and leave a
market, the conditions for deliberation change as new opportunities/threats are
constantly being introduced. In addition, deliberation may also be bounded by time.
In fixed negotiation strategies, these issues are not addressed and agents
increase/relax their bids at a constant rate. Preliminary results from [5] showed that
by making dynamic adjustments of bids, market-driven agents outperformed fixed-
strategy agents in most situations. The motivating consideration of market-driven
agents is to assist human users in making optimal trading decisions in response to
changing market situations. The goal of this research is to design and engineer
agents that are guided by market-driven strategies adapted from Zeuthen's
bargaining model [6]. While Sim and Wong's agents [5] search for optimal deals in
a given market situation by considering market factors such as opportunity,
competition and remaining trading time, Zeuthen's model can be used to analyze
the probability of generating a successful deal. In particular, it seems prudent to
supplement the market-driven approach by also considering the risk of not
completing a deal if an agent insists on its bid/offer.

Market-driven strategy and Zeuthen's model: This research extends Sim and
Wong's market-driven strategy [5] by considering the spread k (difference)

between an agent's and its trading partners' bids/offers. In making concession, an


agent may need to assess the probability p of successfully completing a deal when
bargaining for/giving up some items/commodities in different market situations.
One way to assess p is to consider the value of k. Let v be the utility of an agent and
w be the best utility among all its trading partners in the current trading cycle. The
(best) spread in the current cycle is k = v - w. A market-driven agent makes
concession by determining the utility of its next bid/offer v' using the sum of the
spread in the next cycle k' and w. Hence, v' = k' + w. An agent strives to attain
the highest possible utility while maintaining a reasonable level of p in the next
trading cycle. It is more probable to complete a deal when k' is small, and vice
versa. k' is determined by assessing the current market situation:
k' = E(e) T(t, t', T, λ) O(n, <wi>, v) C(m, n) k. E(e) is a function of an agent's eagerness
that determines the amount of concession (section 2). T(t, t', T, λ) is a function of
trading time that determines an agent's rate of concession (section 3). O(n, <wi>, v)
is a function that determines the amount of concession according to an agent's own
expectation, number of trading partners and their (differences in) offers/bids
(section 4). C(m, n) is a function of the number of competitors m and trading
partners n that determines the probability that an agent is ranked as the most
preferred trader by at least one of its trading partners (section 5).

2 Eagerness and Convergence

The eagerness e of an agent is the percentage of convergence of the spread k. e


represents an agent's desire to make concession, that is, to narrow k in each trading
cycle, independent of the current trading time, number of competitors and trading
partners. If the market situation remains unchanged, an agent's spread k' in its next
bid/offer is given by: k' = (1 - e)k, where 1 ≥ e ≥ 0. e is a user-supplied value that is
assumed to be constant throughout the entire trading process. The
function E(e) = 1 - e, where 1 ≥ e ≥ 0, determines the ratio between k' and k, hence
the amount of concession. A high value of e corresponds to a strong desire of an
agent to narrow k' and move closer to reaching a consensus. For instance, with
other market factors being constant, if e = 0, then k' = (1 - 0)k = k; an agent does not
make any concession. If e = 1, then k' = (1 - 1)k = 0; a consensus is reached.
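The eagerness function admits a direct sketch (variable and function names below are ours, for illustration only):

```python
def next_spread(k, e):
    # E(e) = 1 - e scales the spread each cycle (other market factors held constant).
    assert 0 <= e <= 1
    return (1 - e) * k

k = 10.0
for _ in range(3):
    k = next_spread(k, e=0.5)    # concede half the remaining spread per cycle
print(k)                         # 1.25
print(next_spread(10.0, e=0))    # 10.0: no concession
print(next_spread(10.0, e=1))    # 0.0: consensus reached
```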

3 Trading Time and Strategies

At different stages of trading, agents may make different rates of concession. Their
strategies with respect to remaining trading time can be classified as follows:
1. An agent that is keen to complete a deal quickly may make large concessions
in the first few trading cycles. However, if a consensus is not reached rapidly,
there will be little room for negotiation in later stages.
2. An agent may choose to make minimal concession in early trading cycles and
large concession when the trading time is expiring.

3. An agent may make constant rate of concession throughout the trading process.
4. Remaining trading time does not affect an agent's next bid/offer. It sticks to its
original bid/offer throughout the entire trading process.
The formulation of the next spread k' with respect to trading time is given as:

k' = (1 - (t/T)^λ) k0,  t ≤ T,

where k0 is the initial spread, t is the current trading
time, T is an agent's closing time, and λ is a non-negative temporal sensitivity
factor that models the four classes of strategies (see figure 1). Since
dk'/dt = -λ (t^(λ-1) / T^λ) k0 and d²k'/dt² = -λ(λ-1) (t^(λ-2) / T^λ) k0, and λ, t, and T are non-
negative, the slope (dk'/dt) is always negative. Hence, k' and also the utility of the
agent either decreases with time (since concession is being made) or remains
constant throughout the trading process. The rate of change of the slope (d²k'/dt²)
differentiates the four strategies as follows:
differentiates the four strategies as follows:

1. when 0 < λ < 1, the rate of change in the slope is increasing, corresponding to
larger concession in the early cycles but smaller concession in later cycles.
2. when λ > 1, the rate of change in the slope is decreasing, corresponding to
smaller concession in the early cycles but larger concession in later cycles.
3. when λ = 1, the rate of change in the slope is zero, corresponding to making
constant concession throughout the trading process.
4. when λ = 0, the rate of change of the slope and the slope itself are always
zero, corresponding to not making any concession throughout the entire trading
process. This is based on the assumption that the number of trading partners
and their bids/offers remain unchanged.
λ is supplied by a user and is assumed to remain constant throughout the entire
trading process. Let the spread at time t (when the last bid/offer was made) be k,
and the next spread at time t' (when the next bid/offer is to be made) be k'. With other
market factors unchanged, an agent's next spread is:

k' = T(t, t', T, λ) k,  where  T(t, t', T, λ) = (1 - (t'/T)^λ) / (1 - (t/T)^λ)
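The temporal function and its four strategy classes can be sketched as follows (the function name `time_factor` and the sample numbers are ours, chosen only to show how λ shifts concessions between early and late cycles):

```python
def time_factor(t, t_next, T, lam):
    # T(t, t', T, lam) = (1 - (t'/T)**lam) / (1 - (t/T)**lam); lam picks the strategy.
    if lam == 0:
        return 1.0               # strategy 4: never concede
    return (1 - (t_next / T) ** lam) / (1 - (t / T) ** lam)

k = 10.0
for lam in (0.5, 1.0, 3.0):
    # small lam: big early concessions; large lam: concessions deferred to the deadline
    print(lam, round(time_factor(1, 2, 10, lam) * k, 3))
```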
4 Trading Opportunities

In determining trading opportunities, (i) number of trading partners n, (ii) spreads k


between an agent and its trading partners, and (iii) probability p of completing a
deal, are considered. The probability p' of completing a deal in the next cycle can
be assessed by considering the value of the next spread k'. If k' is very large, it is

less likely that a consensus is reached in the next cycle and vice versa. Hence, p'
and k' are inversely proportional:

(3.1)  p' ∝ 1/k'
p is determined by considering the notion of a conflict probability [6].
Conflict probability: Suppose that at any stage of negotiation, an agent Bi's last bid
is represented as a utility vector v = (vb, vs) and its trading partner Si's offer is a
utility vector w = (wb, ws), with vb > wb and vs < ws (since Bi and Si are utility
maximizing agents). Based on Zeuthen's analysis [6], if Bi accepts Si's last offer,
then it will obtain wb with certainty. If Bi insists on its last bid and (i) Si accepts it,
Bi obtains vb, and (ii) Si does not accept it, Bi may be subjected to a conflict utility
cb. cb is the worst acceptable utility for Bi (e.g., its reserve price). If Si does not
accept Bi's last bid, Bi may ultimately have to settle for lower utilities (the lowest
possible being the conflict utility), if there are changes in the market situation in
subsequent cycles. For instance, Bi may face more competition in the next or
subsequent cycles and may have to ultimately accept a utility that is lower than wb
(possibly as low as cb). If the subjective probability of Bi obtaining cb is pc (conflict
probability) and the probability that Bi achieves vb is (1 - pc), then according to
Zeuthen's analysis [6], if Bi insists on holding its last bid, Bi will obtain a payoff
of (1 - pc)vb + pc cb. Hence, Bi will find that it is advantageous to insist on its last
bid only if [(1 - pc)vb + pc cb] > wb. The probability of conflict pc is given as
pc ≤ (vb - wb)/(vb - cb). Consequently, the maximum value of
pc, (vb - wb)/(vb - cb), is the highest probability of a conflict that Bi may
encounter. pc is a ratio of two utility differences. While (vb - wb) measures the cost
of accepting the trading agent's last offer (the spread k, or difference between the
bid and offer of Bi and Si), (vb - cb) measures the cost of provoking a conflict. (vb - cb)
represents the range of possible values of utilities between the best case utility
and the worst case (conflict) utility.
Aggregated Probability of Conflict: Let pi be the conflict probability of Bi with any
one of its trading partners; then the aggregated conflict probability of Bi with all of its
n trading partners is given as follows:

(3.2)  pc = ∏_{i=1}^{n} pi = ∏_{i=1}^{n} (vb - wi) / (vb - cb),  with ki = v - wi

Probability of completing a deal: The probability p that Bi will obtain a utility vb


with at least one of its n trading partners is:

(3.3)  p = 1 - pc = 1 - ∏_{i=1}^{n} (vb - wi) / (vb - cb)

From (3.1) and (3.3),

pc = ∏_{i=1}^{n} (vb - wi) / (vb - cb)^n.

The function that determines trading opportunities is given as follows:

(3.4)  O(n, <wi>, v) = ∏_{i=1}^{n} (vb - wi) / (vb - cb)^n
p' can be a user-defined parameter, as a trading agent may try to maintain a certain
level of probability of completing the transaction while demanding the highest
possible utility in a given market situation. Although the notion of p' appears to
resemble the notion of eagerness e, they are different. While p' models the extent to
which an agent will make concessions in response to a market situation to complete a
deal, e models an agent's desire to acquire a product/service regardless of the
market condition. Furthermore, the market-driven strategy in this research is
designed for buyer and seller agents, hence (3.4) can be re-written as follows:
(3.5)  O(n, <wi>, v) = ∏_{i=1}^{n} (v - wi) / (v - c)^n

where v and c are the utility of the last bid/offer of a trading agent and its conflict
utility respectively.
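A sketch of (3.3) and (3.5), with our own function names and sample utilities chosen only to show the inverse relation between spreads and the chance of a deal:

```python
def opportunity(v, offers, c):
    # (3.5): O(n, <wi>, v) = prod(v - wi) / (v - c)**n, the aggregated
    # conflict probability over the n trading partners.
    prod = 1.0
    for w in offers:
        prod *= (v - w)
    return prod / (v - c) ** len(offers)

def deal_probability(v, offers, c):
    return 1.0 - opportunity(v, offers, c)   # (3.3): p = 1 - pc

v, c = 10.0, 0.0
print(round(deal_probability(v, [8.0, 9.0], c), 2))  # small spreads: 0.98
print(round(deal_probability(v, [2.0, 3.0], c), 2))  # large spreads: 0.44
```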

5 Trading Competition

The amount of competition of a market-driven agent is determined by the


probability that it is not being considered as the most preferred trading partner.
Suppose an agent B1 has m - 1 competitors, B = {B2, ..., Bm}, and n trading partners, S
= {S1, ..., Sn}. The probability that B1 is not the most preferred trading partner of any
Si ∈ S is (m - 1)/m. The probability of B1 not being the most preferred trading partner
of all Si ∈ S is ((m - 1)/m)^n. Hence, the probability of B1 being considered the most
preferred trading partner by at least one Si ∈ S is C(m, n) = 1 - ((m - 1)/m)^n. If
m = n = 1 then there is one buyer and one seller. Hence, C(1, 1) = 1 - ((1 - 1)/1)^1 = 1. If n
= 1, there are m buyers B = {B1, ..., Bm} and one seller S1. Then
C(m, 1) = 1 - (m - 1)/m = 1/m. Each Bi ∈ B is exposed to a probability (m - 1)/m of not
being the most preferred trading partner of S1. If there are m buyers B = {B1, ..., Bm}
and n sellers S = {S1, ..., Sn}, then each Bi ∈ B has a probability of (m - 1)/m of not
being the most preferred trading agent of any Si ∈ S. The probability that Bi ∈ B is
not the most preferred trading partner of all Si ∈ S is ((m - 1)/m)^n. Therefore, the
probability that Bi ∈ B is the most preferred trading partner of at least one Si ∈ S is
C(m, n) = 1 - ((m - 1)/m)^n. The market-driven strategy considered in this research is
designed for both buyer and seller agents, and the above arguments hold for both.
Furthermore, note that the cardinalities of B and S vary with changing market
situations (as buyers and sellers can enter and leave the market at any time).
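The competition function can be sketched directly (the name `competition` and the sample market sizes are ours, for illustration):

```python
def competition(m, n):
    # C(m, n) = 1 - ((m - 1)/m)**n: probability of being the most preferred
    # trader of at least one of n partners, given m competing traders.
    return 1 - ((m - 1) / m) ** n

print(competition(1, 1))            # 1.0: one buyer, one seller
print(competition(4, 1))            # 0.25, i.e. 1/m with a single partner
print(round(competition(4, 8), 3))  # rises toward 1 as partners increase
```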

6 Conclusion

This paper has presented the foundations of engineering market-driven agents. It


extends the work in [5] by considering the additional notions of spread and
probability of completing a deal. The mathematical functions in sections 2 through
5 were used to guide the design of market agents that can automatically control the
amount of concession in a negotiation for a given market situation. Experimental
results in [7] demonstrated that the reactions of market-driven agents coincide with
intuitions in real life trading (space limitation precludes them from being included
here). It is hoped that the foundations discussed in this paper can shed new light in
engineering negotiation agents.

References

1. J. Rodriguez-Aguilar, et al. Towards a Testbed for Trading Agents in
Electronic Auction Markets. AI Communications, Vol. 11, No. 1, pp 5-19, 1998.
2. P. Wurman, et al. The Michigan Internet AuctionBot: A Configurable Auction
Server for Human and Software Agents. Proceedings of the 2nd Int. Conf. on
Autonomous Agents, May 1998.
3. D. Chavez, et al. A Real-Life Experiment in Creating an Agent Marketplace.
Proceedings of the 2nd Int. Conf. on the Practical Application of Intelligent
Agents and Multi-Agent Technology, London, April 1997.
4. R. Guttman and P. Maes. Cooperative vs. Competitive Multi-Agent
Negotiations in Retail Electronic Commerce. Proceedings of the 2nd Int.
Workshop on Cooperative Information Agents (CIA'98). Paris, July 1998.
5. K.M. Sim and E. Wong. Engineering Market-driven Agents for E-auction. In
proceedings of the Pacific Asia Conf. in Info. System 2000, pp 1128 - 1138.
6. John C. Harsanyi, Bargaining, In (Eatwell, Milgate, & Newman eds.), The
New Palgrave: Game Theory, Macmillan Press Limited, 1989, pp. 60-62.
7. C. Choi. Towards Strategic Agents for Auction & Negotiation. MSc. Thesis,
Dept. of Comp., HK Poly. Univ., HK, 2001. Supervised by Prof. K. M. Sim.
CHAPTER 6

AGENT BASED APPLICATIONS


KAVANAH: AN ACTIVE USER INTERFACE INFORMATION
RETRIEVAL APPLICATION

EUGENE SANTOS JR. AND HIEN NGUYEN


Intelligent Distributed Information System Lab
Department of Computer Science and Engineering
University of Connecticut
{eugene,hien}@cse.uconn.edu

SCOTT M. BROWN
Air Force Research Laboratory
Crew System Interface Division
Wright-Patterson AFB, OH
sbrown777@acm.org

This paper reports our implementation and evaluation of an active user interface
in an information retrieval application called Kavanah. The goal of the active user
interface is to improve the quality of information retrieval and to reduce the user's
cognitive workload while searching for information. Our underlying concept is to
dynamically construct the search queries based on a dynamic representation that
captures user interests, preferences and searching context (as represented in a user
ontology). Our approach to disaggregating the essential aspects of a user's intent
for searching allows for focused multi-agent based construction and correction of
the overall user model that captures the user's intent, thus promoting increased
effectiveness and efficiency. We evaluate the effectiveness of the active user in-
terface with commonly used metrics from the information retrieval community by
measuring retrieval performance with and without the presence of an active user
interface. Furthermore, we measure the ability to discover new knowledge by eval-
uating our dynamic online ontology construction. The evaluations use the Unified
Medical Language System knowledge base as a test bed.

1 Introduction

During the last few years, as the result of the overwhelming number of choices
of online and offline information resources, we have witnessed an increasing
trend towards the construction of personal assistant agents in information fil-
tering, recommender systems and agent communities 2,9,11. The main focus of
these approaches is to capture user interests by analyzing the user interactions
with the system and to use these interactions to guide the system reactions
accordingly to improve the quality of the users' work.
In this paper, we hypothesize that constructing a unified model of the
user's interests, preferences, and context in an information seeking task pro-
vides a fine-grained model that more effectively captures the user's information
seeking intent than a model addressing a subset of these salient characteristics.
While other previous efforts have focused exclusively on learning any
one aspect of information seeking, none of them has attempted to integrate all
three aspects together for determining a user's intent in seeking information.
We refer to our personal assistant agent as an active user interface (AUI) in
this paper. Active user interfaces not only capture user interests, preferences,
and contexts but also focus on the interactions among them in a dynamic
fashion. In particular, our focus is on deriving and learning the context or
user ontology. Most existing methods assume that all users share a single
common ontology 13 . This implicitly assumes that all users have the same
level of understanding and beliefs expressed in the common ontology. We
believe that users understand information and how it interacts in their own
individual way. This arises from many factors ranging from user experience
and expertise to basic differences in user style and operation. We show that
by using our model, we can do more than just elicit the user interests and
preferences. We provide a learning capability for the system to discover new
knowledge based on analyzing the documents relevant to the user and the
context, i.e. why the user is focusing on the given information. This work is
derived from our earlier research with a predecessor system, Clavin 4,15,16.
We evaluate our hypothesis by constructing an AUI in an information
retrieval application called Kavanah. The implementation of our AUI is a
multi-agent based system in which the main agent contains the user model
consisting of user preference, interest, and context, and the supporting agents
are used to dynamically construct and maintain the user model based on
changes in the user's intent as well as incorrectness and incompleteness in the
user model. Our evaluation goal is to show the effectiveness of this model by
comparing the system performance in cases with and without an AUI using
commonly used metrics in information retrieval.
The rest of the paper is organized as follows: the next section discusses
the architecture of the system followed by a detailed description of our im-
plementation. Next, we discuss our preliminary empirical evaluation. Finally,
related work and future research issues are considered.

2 System architecture

The main goal of Kavanah is to use its AUI to assist the users in getting the
right information at the right time using the right tools 4 . The goal of the AUI
is to accurately represent a user's intent. Intent inference involves deducing
an individual's goals based on observations of that individual's actions 12 . In
automated intent inference, this process is typically implemented through

one or more behavioral models that have been constructed and optimized for
the individual's behavior patterns. In an automated intent inference system,
data representing observations of an individual, the individual's actions, or
the individual's environment (collectively called observables) are collected and
delivered to the model(s), which match the observables against patterns of
behavior and derive inferred intent from those patterns. These inferences can
then be passed to an application for generation of advice, definition of future
information requirements, or proactive aiding.
We partition intent inference into three formative components. The first,
interests, captures at a high level the focus and direction of the individual's
attention. The second, preferences, describes the actions and activities that
can be used to carry out the goals that currently hold the individual's at-
tention, with a focus on how the individual tends to carry them out. The
third, context, provides insight into the user's knowledge and deeper motiva-
tions behind the goals upon which the individual is focused and illuminates
connections between goals. In other words, the first component captures what
the individual is doing, the second captures how the individual might do it,
and the third infers why the individual is doing it. With regards to the re-
search presented in this paper, the AUI needs to provide the right assistance
to the information retrieval application on what the user is currently inter-
ested in; how a query needs to be constructed and returned results needs to
be portrayed; and why the user dwells on a search topic.
We assume that the interests are influenced by the ultimate goal that the
user is trying to reach and the methods which she uses to accomplish that
goal. For example, suppose that the user's goal is to study lung cancer and
her approach is to scan materials from general definitions to specific methods
used to treat this disease. Her interests will thus vary from general treatments
to specific chematography processes. In particular, her interests may change
from a certain drug to a more general approach for treatment. The user
interests, in turn, influence user preferences and context. If user interests
appear to be far off the goal that the user is trying to reach, she may change
her search strategies and understanding of the subject accordingly.
In our AUI, we capture the interest, preference, and context aspects of
user intent with an interest relevancy set, a user ontology network, and a
preference network correspondingly. The interests relevancy set determines
what is currently relevant to the user. It is generated by reasoning over the
user ontology network. Based on the utility values of each concept node in
the user ontology network, we end up with a rank ordering of the concepts
to build an interest relevancy set. Since user interests change over time, we
incorporate a fading function to make the irrelevant interests fade away. We

will describe this process in more detail in Section 3.


The user ontology network captures the user's knowledge of concepts and
the relations among concepts in a specific domain. Before further discussing
the user ontology network, we introduce briefly the concept of a domain on-
tology. The domain ontology captures the domain knowledge containing the
concepts and the relations among them in a specific domain. The user on-
tology exploits the domain ontology by extracting the missing information
that it needs to have in order to help identify the concepts which the user is
interested in. Therefore, the user ontology will be similar to a subgraph of the
domain ontology. However, in the user ontology network, additional relations
which are not found in the domain ontology may exist as a result of user
misconceptions or user-specific expert knowledge that is not fully captured in
the domain ontology. The user ontology network in Kavanah is represented
by a Bayesian network in which each node either represents a concept or a
relation among concepts.
The preference network represents how the user wants to form the query,
how this query should be answered, and how results should be portrayed.
The user's preference in Kavanah is reflected by how the user prefers to use
a class of tools. A tool is defined as an operator to perform specific actions
to transform the information that the user needs based on preference. Each
node in the preference network represents a tool, an action associated with
that specific tool, or a pre-condition which represents the requirements of the
tool connected with it. An example of a tool is a filter that removes those
documents that do not match a certain criteria. Another example of a tool
is an expander that searches for documents that expand the searching topic.
Figure 3(a) shows an example of a preference network.
The AUI uses correction adaptation agents to maintain the preference
network. Each correction adaptation agent offers a bid to the AUI to change
the preference network. They maintain a user model that is identical to that
of the AUI until the AUI requests bids, at which time the AUI adapts its
own user model based on its bidding behavior component. In the bidding
process, the correction adaptation agent that most likely improves the AUI's
effectiveness will win the bid and this winning agent is permitted to correct the
user model. We evaluate the user model by a set of metrics that measure its
adaptivity, autonomy, collaboration and robustness requirements. We capture
the user's utility for having the AUI perform an action on his behalf to achieve
a goal by the utility function over that set of requirements. An example of
such a utility function is as follows: U_req = 0.14*U_reactive + 0.14*U_predictive +
0.14*U_perceptive + 0.14*U_autonomous + 0.14*U_collaborative + 0.14*U_capability +
0.14*U_misconception. For more information about the correction adaptation

agents, the bidding process and the metrics, please see our previous paper 5 .
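The example requirements utility is a flat weighted sum, with weight 0.14 on each of the seven per-requirement utilities. A minimal sketch (the dictionary-based helper and its name are ours, for illustration only):

```python
# Equal 0.14 weights over the seven requirement utilities, as in the example.
WEIGHTS = {
    "reactive": 0.14, "predictive": 0.14, "perceptive": 0.14,
    "autonomous": 0.14, "collaborative": 0.14, "capability": 0.14,
    "misconception": 0.14,
}

def u_req(utilities: dict) -> float:
    """Weighted sum of per-requirement utilities (each assumed in [0, 1])."""
    return sum(WEIGHTS[k] * utilities.get(k, 0.0) for k in WEIGHTS)

# With every requirement utility at 1.0 the total is 7 * 0.14 = 0.98:
print(round(u_req({k: 1.0 for k in WEIGHTS}), 2))  # 0.98
```

Note the weights sum to 0.98 rather than 1.0, as in the formula quoted above; an implementation could normalize them if a true convex combination is wanted.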

3 System implementation

We start this section by describing the overall process in Kavanah and then
describe in detail how the AUI helps the system build the adapted query.
Kavanah consists of five modules as shown in Figure 1(a). The input module
accepts the user's natural language queries and transfers them to the query
module where they are parsed and converted into a query graph (QG) which
is similar in construction to the user ontology network except that it may con-
tain a node(s) representing a variable (usually denoted as X) that is necessary
to represent unknown concepts in the user query. A query graph is a directed
acyclic graph, where each node represents a concept or a relation among the
concepts. A relation node should have concept nodes as parents and children.
A concept node represents a noun phrase while a relation node represents a
verb phrase in a user query or a natural language sentence. An example of
a QG of the query "What causes liver damage?" is shown in the left side
of Figure 3(b). The AUI uses the query graph and generates a new adapted
query for the search module based on the current user model. An example of
an adapted query is shown in the right side of Figure 3(b). The search module
matches the QG of the adapted query against each document graph represent-
ing a record in the database of documents, chooses those records that have
the number of matches greater than a user-defined threshold, and displays the
output to the user. A document graph (DG) is a directed graph that contains
concept and relation nodes and is also similar to the user ontology network
(e.g. Figure 2(a)). Note that all of the common concepts in all of the documents
are found in a global dictionary and domain ontology. A match between a
QG and a DG is defined as the number of concept and relation nodes of the
QG being found in the DG over the number of nodes of the QG. After the
search module returns the search results, the feedback module allows the user
to indicate whether the search result is relevant or not.
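The QG-DG match just described (the fraction of query-graph nodes found in the document graph, filtered by a user-defined threshold) can be sketched as follows. Each graph is represented here simply as a set of concept/relation node labels, which is an illustrative simplification; the function names are ours:

```python
def match_score(qg_nodes: set, dg_nodes: set) -> float:
    """Fraction of query-graph nodes (concepts and relations) found in the
    document graph."""
    if not qg_nodes:
        return 0.0
    return len(qg_nodes & dg_nodes) / len(qg_nodes)

def retrieve(qg_nodes: set, documents: dict, threshold: float) -> list:
    """Return documents whose match score exceeds the user-defined threshold."""
    return [doc for doc, dg in documents.items()
            if match_score(qg_nodes, dg) > threshold]

docs = {"d1": {"liver damage", "causes", "urate oxidase"},
        "d2": {"cosmids", "isa", "enzyme"}}
# Query "What causes liver damage?" with an unknown-concept variable X:
print(retrieve({"causes", "liver damage", "X"}, docs, 0.5))  # ['d1']
```

A full implementation would also respect the graph structure (relation nodes linking their parent and child concept nodes) rather than treating nodes as a flat set.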
The AUI constructs the adapted query in Kavanah by maintaining the
updated user interests, preferences and context. The logical architecture of
the AUI is shown in Figure 1(b). The AUI determines the current interests by
reasoning over the user ontology network with the concepts found in the user
query set as evidence. Each element of the interest relevancy set consists of an
interest concept and an interest level. The interest concept represents the topic
that the user is currently interested in. It is determined from the user query
and the set of documents that the user has indicated as relevant in the recent
search. The interest level is a real number indicating how much the user is

Figure 1. (a) The overall Kavanah architecture. (b) The AUI architecture.

interested in the corresponding interest concept. Denote each interest concept


as a and its associated interest level as L(a). We compute L(a) after every
query by L(a) = 0.5(L(a) + n/m), with n as the number of relevant documents
containing this concept a and m as the number of relevant documents. If
L(a) falls below the user-defined threshold value, the corresponding interest
concept a is removed from the interest relevancy set. To compute the new set
of interests, we set as evidence in the user ontology network those concepts
found in the query and the interest relevancy set, and perform belief updating
on the user ontology network.
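The interest-level update and fading rule above can be sketched as follows (relevant documents are modeled as sets of concepts; the function name is ours):

```python
def update_interests(levels: dict, relevant_docs: list, threshold: float) -> dict:
    """Update each interest level by L(a) = 0.5 * (L(a) + n/m), where n is the
    number of relevant documents containing concept a and m is the total number
    of relevant documents; concepts whose level falls below the threshold fade
    out of the interest relevancy set."""
    m = len(relevant_docs)
    updated = {}
    for concept, level in levels.items():
        n = sum(1 for doc in relevant_docs if concept in doc)
        new_level = 0.5 * (level + (n / m if m else 0.0))
        if new_level >= threshold:  # below threshold: interest fades away
            updated[concept] = new_level
    return updated

levels = {"urate": 0.8, "cosmids": 0.1}
docs = [{"urate", "enzyme"}, {"urate", "liver"}]
print(update_interests(levels, docs, 0.2))  # {'urate': 0.9}
```

Since each update halves the old level before adding new evidence, a concept absent from recent relevant documents decays geometrically, which is the fading behavior described above.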
We construct the user ontology network dynamically by finding a common
set of subgraphs of all relevant documents. Each document is represented as
a DG (e.g. Figure 2(a) and 2(c)). For each relevant document, we build a set
of its subgraphs. A subgraph X of a DG Y is a DG such that each node
a that belongs to X also belongs to Y. The sets of subgraphs of the concepts
"urate oxidase" and "cosmids" are generated in Figure 2(b) and 2(d). After
generating all of the subgraphs, we compute the number of occurrences for each
subgraph in the entire set. We select those subgraphs that have the number
of occurrences greater than a user-defined threshold and consider them as the
common subgraphs of the relevant documents. The common set of subgraphs
of the two above concepts is shown in Figure 2(f). This set is used by an agent
to update the user ontology network. The agent will check if a subgraph is not
currently in the user ontology network, and adds it accordingly. This agent
will ensure that the update will not result in a loop in the existing ontology
network. If it does, we skip this addition. A new link between two existing
concepts in the user ontology network will also be created if two concepts
are linked in the set of common subgraphs and the frequency of these links
exceeds a certain user-defined threshold.
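The frequency-based selection of common subgraphs can be sketched by counting subgraph occurrences across the relevant documents. Subgraphs are represented here as frozensets of (parent, relation, child) edges so they are hashable; this representation and the function name are our assumptions for illustration:

```python
from collections import Counter

def common_subgraphs(doc_subgraphs: list, min_count: int) -> set:
    """Select subgraphs whose number of occurrences across all relevant
    documents exceeds a user-defined threshold (min_count)."""
    counts = Counter()
    for subgraphs in doc_subgraphs:      # one set of subgraphs per document
        counts.update(subgraphs)
    return {sg for sg, c in counts.items() if c > min_count}

g1 = frozenset({("cosmids", "isa", "enzyme")})
g2 = frozenset({("urate oxidase", "isa", "enzyme")})
per_doc = [{g1, g2}, {g1}, {g1}]
print(common_subgraphs(per_doc, 2) == {g1})  # g1 occurs 3 times -> True
```

The surviving subgraphs would then be merged into the user ontology network, skipping any addition that would create a cycle, as described above.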
The preference network is built when the user issues a new query and
gives relevance feedback. Each user query is considered as a pre-condition

Figure 2. (a) The graph representing "cosmids". (b) Subgraphs of the concept "cosmids". (c)
The graph representing "urate oxidase". (d) Subgraphs of the concept "urate oxidase". (f) The
set of common subgraphs of the concepts "cosmids" and "urate oxidase".

Figure 3. (a) An example of a preference network. (b) Examples of query graphs associated
with the user query and the adapted query generated by the AUI.

node in the preference network. If this query or its part is already asked, the
existing node in the preference network which has a QG matched with the QG
of the new query or of its part will be set as evidence. Each interest concept
from the interest relevancy set is added to the preference network as pre-
condition node and set as evidence. If the user query is totally new, the tool
being used by the user is set to the default value (a filter) and a goal node

representing the filter tool is added to the preference network. Otherwise,


it is set to the tool being represented by the goal node with highest utility
value. Each action node represents a way to construct a adapted query based
on the current tool, interests and user query. Figure 3(a) shows an example
of a preference network in which the user is using an expander, is currently
interested in the concept "urate" and wants to find out the causes of the liver
damage. Note that each user query and adapted query generated by AUI
associates with a QG. The current way of forming a new query is determined
as a result of reasoning over the preference network by taking the top n goal
nodes as defined by the user. We compute the expected utility for a goal g
with associated action a_i as follows: EU(g) = P(g = T)U(a | g = T) + P(g =
F)U(a | g = F), in which U(a | g = T/F) = sum_i P(a_i | g = T/F)U(a_i) and
U(a_i) = sum U(attributes). The set of attributes in the formulas above includes
the workload, user expertise, temporal memory, and spatial memory. For
more detailed information on this, please refer to our earlier work 6.
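The expected-utility computation over the preference network reduces to the nested sums above: EU(g) = P(g=T)U(a|g=T) + P(g=F)U(a|g=F), with each conditional utility a probability-weighted sum of action utilities, and each action utility a sum over attribute utilities. A minimal sketch (the probabilities that would come from belief updating over the preference network are passed in directly here, and the function names are ours):

```python
def action_utility(attribute_utilities: list) -> float:
    """U(a_i) = sum of attribute utilities (workload, user expertise,
    temporal memory, spatial memory)."""
    return sum(attribute_utilities)

def expected_utility(p_goal: float, actions_true: list, actions_false: list) -> float:
    """EU(g) = P(g=T) * U(a|g=T) + P(g=F) * U(a|g=F), where each conditional
    utility is sum_i P(a_i|g) * U(a_i); actions are (probability, utility) pairs."""
    u_true = sum(p * u for p, u in actions_true)
    u_false = sum(p * u for p, u in actions_false)
    return p_goal * u_true + (1.0 - p_goal) * u_false

# One action with two attribute utilities, certain given the goal:
u = action_utility([0.4, 0.3])
print(round(expected_utility(0.8, [(1.0, u)], [(0.0, u)]), 2))  # 0.56
```

The AUI would rank the goal nodes by EU(g) and use the top n to decide how the adapted query is formed.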

4 Empirical Evaluation

We empirically evaluate the system using the definitions of 100 concepts ex-
tracted from the Unified Medical Language System (UMLS). In the first eval-
uation, we focus on the quality of the retrieval process. We constructed a set
of queries and processed this set through the system with and without the
AUI. In this query set, we are mainly using the "wh" questions to find out the
definitions of concepts or identify concepts that match certain requirements.
For example, "what is urate oxidase?" or "which enzyme inhibits monoamine
oxidase and causes liver damage?". We made the assumption that the user
does not just explore the concept randomly, but focuses on what he is study-
ing. We used the precision and recall metrics commonly used in information
retrieval 14 as our evaluation criteria. Figure 4 shows the precision and recall
for all the questions in the cases with and without AUI. As we see, the preci-
sion and recall in cases that have an AUI are better than those without any
help. If Kavanah is working without an AUI, it simply matches the QG of the
user query with each DG representing each record in the database. Depending
on how well the user manipulates the keywords in a query, the search may
return more, fewer, or even no documents. This process requires the user
either to know the contents of the database or to be very familiar with the
search topics to achieve a decent result. The user's feedback is not used to adapt the
search query. With AUI, depending on the user's feedback, Kavanah helps
the user construct an appropriate search query that satisfies the user's search-
ing intent. For example, if the user does not indicate any documents from

Figure 4. Precision and recall for Kavanah with and without the active user interface (AUI).

the returned list as relevant, Kavanah then knows that perhaps a wrong tool
has been used, or the interests are not up-to-date or the ontology is far off
the mark. It will automatically correct those misses in order to improve the
quality of the search.
We also evaluated the process of constructing and updating the user's
ontology network by building simulated user ontologies from the domain on-
tology. We randomly choose some concept nodes from the domain ontology
(referred in this experiment as testing concepts) and randomly remove some
links associated with them to see if our system can reconstruct those missing
links in the user ontology network. For each testing concept, we construct a set
of queries such that they reflect the relations between the testing concept and
the removed links. We compute the link error as follows: LinkError = n/m, in
which n is the number of links in the user ontology network matched against
the target user subgraph's links and m is the total number of links of the user
ontology network constructed by the AUI. First, we performed this experiment
using the testing database mentioned above and found out that there is a
large mismatch between the domain ontology and the set of concepts being

Figure 5. Link matching percentage of ontology construction.

used to construct our testing database. As shown in Figure 5, the match-


ing percentage of the links created in the user ontology against the real user
subgraph in domain ontology is low for this case. Next, we tried to use a
portion of the domain ontology as the testing database to perform the same
experiment in order to see if the matching percentage of those links is changed
without the mismatch between the database and domain ontology. Obviously,
we see that the results are significantly better.

5 Related work

In the information retrieval and information filtering community, dynamically


reacting to the changes in the user interests, preferences and context (i.e.,
ontology) to reduce the user cognitive workload is a challenging problem. Most
of the research from information retrieval has been focused on capturing the
user interests only 7,18,3. Work by the InfoSleuth Group, which exploits ontology
to improve the retrieval process 10, is closely related to our work. In this
work, an ontology is dynamically constructed from the text and constantly
updated as the retrieval process goes on. However, this technique doesn't
focus on the preferences and interests. There is not enough detail from the
paper about this approach to empirically compare it with ours. Another
work related to ours is IRIA 1 which unobtrusively builds a map of accessible
relevant information and uses it to enable users to find information quickly.
The difference between this work and ours is that the IRIA approach is based on a
context-sensitive search which spreads the user interests based on the relevant
information to users while ours is based on a decision theoretic approach to
maintain the model of the user preferences, interests and context.

6 Future work

This paper has described our on-going work to construct an active user inter-
face that provides intelligent assistance to the user in an information retrieval
system. There are a number of issues that arise from our design and empiri-
cal evaluation. We want to extend our evaluation to a more complex scenario
with different kinds of questions and search strategies. Unfortunately, the cur-
rent database has the problem of low term frequency, which is usually referred
to as the data sparseness problem in information retrieval 17. We are also looking
for a supplementary database or semantic network in UMLS that will help
us overcome the problem of disjointness between the domain ontology and
the database used as a testbed. We want to measure not only link errors,
but also concept errors, which refer to the number of concepts in the user
ontology network matched against the original real user subgraph. At present,
we use a fading mechanism to fade away interests, preferences or context that
are no longer used. This may result in more frequent updates than necessary
if the user intent is not very dynamic. We want to employ a mechanism to
differentiate between short-term and long-term interests, preferences and
context in an intuitive way using findings from experimental psychology 8.
Acknowledgements
This work was supported in part by AFOSR Grant No. F49620-00-1-0244
and the AFRL Human Effectiveness Directorate through Sytronics Incor-
porated. Thanks to Greg Johnson and Thuong Doan for helping with this
paper.

References

1. Anthony, Jr. G. F.; Devaney, M.; and Ram, A. 2000. Iria: The infor-
mation retrieval intelligent assistant. In Proceedings of the International
Conference on Artificial Intelligence. 275-280.
2. Balabanovic, M.; and Shoham, Y. 1997. Content-based, collaborative
recommendation. In Communications of the ACM, 66-72. Vol 40, No.3
3. Billsus, D.; and Pazzani, M. J. 2000. User modeling for adaptive
news access. In Journal of User Modeling and User-Adapted Interac-
tion, Vol. 10, issue 2/3, 147-180.
4. Brown, S. M.; Santos, Jr. E.; and Banks, S. B. 1999. Active user
interfaces for building decision-theoretic systems. In Proceedings of the
1st Asia-Pacific Conference on Intelligent Agent Technology, Hong Kong.
244-253.
5. Brown, S.M.; Santos, Jr. E.; Banks, S. B.; and Oxley, M. 1998. Using

explicit requirements and metrics for interface agent user model con-
struction. In Proceedings of the Second International Conference on Au-
tonomous Agents, 1-7, Minneapolis, MN.
6. Brown, S. M. 1998. Decision theoretic approach for interface agent de-
velopment. Ph.D. dissertation.
7. Chen, L.; and Sycara, K. 1998. Webmate: A personal agent for browsing
and searching. In Proceedings of the 2nd International Conference on
Autonomous Agents and Multi Agent Systems. Minneapolis, MN.
8. Ericsson, K. A.; and Kintsch, W. 1995. Long-term working memory. In
Psychology Review, 211-245. 102.
9. Horvitz, E.; Breeze, J.; Heckerman, D.; Hovel, D.; and Rommelse, K.
1998. The lumiere project: Bayesian user modeling for inferring goals
and needs of software users. In Proceedings of the Fourteenth Annual
Conference on Uncertainty in Artificial Intelligence.256-265.
10. Hwang, C. H. 1999. Incompletely and imprecisely speaking: Using dy-
namic ontologies for representing and retrieving information. InfoSleuth
Group, Microeletronics and Computer Technology Corp, 3500 West Bal-
connes Center Drive, Austin, TX 78759.
11. Maes, P. 1994. Agents that reduce work and information overload. In
Communications of the ACM, 31-40. 37(7).
12. Geddes, N. 1986. The Use of Individual Differences in Inferring Human
Operator Intentions. In Proceedings of the Second Annual Aerospace Ap-
plications of Artificial Intelligence Conference
13. Gruber, T. R. 1993. Toward principles for the design of ontologies used
for knowledge sharing. In the International Workshop on Ontology.
14. Salton, G.; and McGill, M. 1983. Introduction to Modern Information
Retrieval. McGraw-Hill Book Company.
15. Santos, Jr. E.; Brown, S. M.; Lejter, M.; and Banks, S. B. 1999. Dy-
namic user model construction with Bayesian networks for intelligent in-
formation queries. In Proceedings of the 12th FLAIRS Conference, 3-7.
16. Santos, Jr. E.; Nguyen, H.; and Brown, S. M. 2000. Medical document
information retrieval through active user interfaces. In Proceedings of the
2000 International Conference in Artificial Intelligence.323-329.
17. van Rijsbergen, C. 1975. Information Retrieval.The Whitefriars Press
Ltd, London and Tonbridge.
18. Widyantoro, D. H.; Ioerger, T. R.; and Yen, J. 1999. Adaptive agent
for learning changes in user interests. In Proceedings of the Interna-
tional Conference on Information and Knowledge Management CIKM'99.
Kansas City.
iJADE WeatherMAN - A MULTIAGENT FUZZY-NEURO NETWORK
BASED WEATHER PREDICTION SYSTEM

RAYMOND LEE, JAMES LIU, JANE YOU


Department of Computing, Hong Kong Polytechnic University, Hong Kong
Email: {csstlee, csnkliu, csyjia}@comp.polyu.edu.hk

Weather forecasting has been one of the most challenging problems around the world for
more than half a century, not only because of its practical value in meteorology, but also
because it is a typical "unbiased" time-series forecasting problem in scientific research. In this paper,
we propose an innovative intelligent multi-agent based environment, namely iJADE -
intelligent Java Agent Development Environment - to provide an integrated and intelligent
agent-based platform in the e-commerce environment. In addition to contemporary agent
development platforms, which focus on the autonomy and mobility of the multi-agents,
iJADE provides an intelligent layer (known as the 'conscious layer') to implement various AI
functionalities in order to produce 'smart' agents. From the implementation point of view, we
introduce the iJADE WeatherMAN - an intelligent multi-agent based system for automatic
weather information gathering, filtering and time series weather prediction (which is done by
a fuzzy-neuro network model), based on the weather information provided by various weather
stations. Compared with previous studies on single point source using similar network and
others like radial basis function network, learning vector quantization and Naive Bayesian
network, the results are very promising. This neural-based rainfall forecasting system is useful
and parallel to traditional forecasts from the Hong Kong Observatory.

1 Introduction

Traditionally, weather forecasting is based mainly on numerical models [15]. This
classical approach attempts to model the fluid and thermal dynamic systems for
grid-point time series prediction based on boundary meteorological data. The
simulation often requires intensive computations involving complex differential
equations and computational algorithms. Besides, the accuracy is bound by certain
"inherited" constraints such as the adoption of incomplete boundary conditions,
model assumptions and numerical instabilities [12]. This kind of approach is more
appropriate for long-term (over 24 hours) forecasting over a large area of several
thousand kilometres [2]. For mesoscale and short-term weather forecasting in a
relatively small region such as Hong Kong, there is a need for an alternative. In
addition, it is generally recognized by meteorologists that accurate rainfall
forecasting is quite difficult owing to the geographical and topographic features of the
region. Similar studies [11][14] confirmed this difficulty, presenting a
challenge for further investigation of the problem. Moreover, with the exponential
challenge for further investigation of the problem. Moreover, with the exponential
growth in Internet technology and e-commerce applications, real-time weather
information gathering (from different weather stations) and weather prediction
(such as temperature and rainfall forecasts) are increasingly important and
demanding. Agent technology, with its automatic delegation of tasks, autonomous


and highly mobile characteristics in the Web environment, is starting to play an
important role in e-commerce in the new millennium.
However, contemporary agent systems such as IBM Aglets [1] and
ObjectSpace Voyager [16] focus on mobility and multi-agent communication.
The 'core' functions of intelligent agents (IA) - the AI (Artificial Intelligence)
counterpart with intelligent capabilities including machine learning, intelligent
pattern recognition and classification functions - are difficult to implement. In a
typical e-shopping scenario, most of the time we are handling 'inexact' product
selection criteria. For instance, in rainfall prediction, we might not be interested in the
exact rainfall amount; instead we only care about the degree of "Rain Depth",
ranging from a "nil" or "trace" amount to "heavy" rainfall. In these cases, we are
dealing with different degrees of 'fuzziness', which can be efficiently handled by AI
technologies such as fuzzy systems. With the integration of machine learning
techniques such as neural networks, an 'intelligent' weather forecasting system can be
constructed.
In this paper, we propose a comprehensive and truly intelligent agent-based
framework, known as iJADE (pronounced 'IJ') - intelligent Java Agent
Development Environment. To compensate for the deficiency of contemporary
agent software platforms such as IBM Aglets and ObjectSpace Voyager Agents,
which mainly focus on multi-agent mobility and communication, iJADE provides
an ingenious layer called the 'Conscious (Intelligent) Layer', which implements
various AI functionalities in multi-agent applications. From the implementation
point of view, we demonstrate how the iJADE model can be applied to a real-time
weather information gathering and weather prediction process - the iJADE
WeatherMAN. In short, iJADE WeatherMAN integrates intelligent mobile agent
technology for information gathering with a fuzzy neural network for weather
prediction.

2 iJADE Architecture

2.1 iJADE Framework: ACTS and DNA Layered Model


In this paper, we propose an innovative and fully integrated intelligent agent model
called iJADE for intelligent Web-mining and other intelligent agent-based e-
commerce applications. The system framework is shown in Figure 1. Unlike
contemporary agent systems such as IBM Aglets [1], which focus on multi-
agent communication and autonomous operations, the aim of iJADE is to provide
a comprehensive 'intelligent' agent-based framework and applications for future e-
commerce and Web-mining applications. Figure 1 depicts the two levels of abstraction
in the iJADE system: a) the iJADE system level - the ACTS model, and b) the iJADE data
level - the DNA model. The ACTS model consists of 1) the Application Layer, 2) the
Conscious (Intelligent) Layer, 3) the Technology Layer, and 4) the Supporting

Layer. The DNA model is composed of 1) the Data Layer, 2) the Neural Network
Layer, and 3) the Application Layer.

Figure 1. System architecture of the iJADE (v1.6) model

2.2 Application layer including iJADE eMiner


This is the uppermost layer, which consists of different intelligent agent-based
applications. These iJADE applications are developed by integrating
intelligent agent components from the 'Conscious Layer' with the data 'knowledge
fields' from the DNA model. Current applications (iJADE v1.6) implemented in
this layer include:
• iJADE eMiner [10] - an intelligent Web-mining agent system for e-shopping. It
consists of the implementation of 1) FAgent, an automatic authentication
system based on human face recognition [6], and 2) FShopper, a fuzzy agent-
based Internet shopping agent [7].
• iJADE WeatherMAN - an intelligent weather forecasting agent, proposed in this
paper, which extends previous research on multi-station weather forecasting
using fuzzy neural networks [13].
• iJADE WShopper [9] - an integrated intelligent fuzzy shopping agent using
WAP technology for intelligent mobile shopping on the Internet.
• iJADE Stock Advisor - an intelligent agent-based stock prediction system using
a time-series neuro-oscillatory prediction technique [8].

2.3 Conscious (Intelligent) layer


This layer provides the intelligent basis of the iJADE system, using the agent
components provided by the 'Technology Layer'. The 'Conscious Layer' consists of
the following three main intelligent functional areas:
• 'Sensory Area' - for the recognition and interpretation of incoming stimuli. It
includes a) visual sensory agents using the EGDLM (Elastic Graph Dynamic Link
Model) for invariant visual object recognition [4], and b) auditory sensory
agents based on a wavelet-based feature extraction and interpretation scheme [3].
• 'Logic Reasoning Area' - a conscious area providing different AI tools for logical
'thinking', including rule-based reasoning such as fuzzy and GA (Genetic
Algorithm) rule-based systems [5].
• 'Analytical Area' - consists of various AI tools for analytical calculation, such
as recurrent neural network-based analysis for real-time prediction and data
mining [8].

2.4 Technology and Supporting layers


The technology layer provides all the necessary mobile agent implementation APIs
for the development of intelligent agent components in the 'Conscious Layer'. In the
current version of the iJADE model, IBM Aglets [1] are used as the agent
'backbone'. The supporting layer provides all the necessary system support to the
'Technology Layer'. It includes 1) programming language support based on Java, 2)
network protocol support such as HTTP, HTTPS, ATP, etc., and 3) markup
language support such as HTML, XML, WML, etc.

3 Implementation - iJADE WeatherMAN

3.1 iJADE WeatherMAN - System Overview


Based on the iJADE model as the system framework, we introduce in this paper
an innovative intelligent agent-based weather information extraction, analysis and
prediction system, built on weather information provided by multiple weather
stations distributed in Hong Kong. To summarize, iJADE WeatherMAN consists of
five major components:
1. User Requirement Definition Scheme (URDS)
2. Data Collection Scheme (DCS)
3. Variable Selection and Transformation Scheme (VSTS)
4. Fuzzy-neuro Training and Prediction Scheme (FNTPS)
5. Weather Reporting Scheme (WRS)
A schematic diagram of the iJADE WeatherMAN is depicted in Figure 2.

3.2 User Requirement Definition Scheme (URDS) & Weather Reporting Scheme (WRS)

The URDS involves the collection of user requirements, including the selection
of the forecasting element(s) (such as temperature, rainfall, humidity, etc.), the
forecasting range (e.g. next-day or next-N-day forecast) and other parameters such
as regional forecast / global weather report. This information is collected via the iJADE
Weather Reporter - a stationary agent situated on the client machine that collects
user requirements, negotiates, and dispatches mobile agents (iJADE Weather
Messengers); it also presents the final weather report in the WRS.

Figure 2 - Schematic diagram of the iJADE WeatherMAN

3.3 Data Collection Scheme (DCS)


In this scheme, each iJADE Weather Messenger (mobile agent) "visits" a different
weather station to collect weather information, which is in turn centralized
at the main weather centre (the HKO site) for further processing.
In our implementation, meteorological data for the period between 1993 and
1997 were collected from 11 regional weather stations in Hong Kong. These
stations are located at Ta Kwu Ling (TKL), Ping Chau (EPC), Lau Fau Shan (LFS),
Tai Po Kau (TPK), Sha Tin (SHA), Sha Lo Wan (SLW), H.K. Observatory (HKO),
Junk Bay (JKB), Wong Chuk Hang (WCH), Cheung Chau (CCH) and Waglan
Island (WGL). The data are 6-hourly records of weather elements at 0600H, 1200H,
1800H and 2400H, and they include: dry-bulb temperature (TT); dew-point temperature
(DP); relative humidity (RH); hourly rainfall (RF); mean wind speed (FF); 60-min
prevailing wind direction (DD); and mean sea-level pressure (MSLP).
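As an illustration, each 6-hourly station observation can be represented as a simple structured record before further processing. The CSV field layout and the names below are assumptions for demonstration only; only the element abbreviations (TT, DP, RH, RF, FF, DD, MSLP) come from the paper.

```python
from dataclasses import dataclass

# Illustrative record for one 6-hourly observation, using the element
# abbreviations from the paper. The comma-separated line format is assumed.
@dataclass
class WeatherRecord:
    station: str   # e.g. "HKO", "CCH"
    hour: int      # 600, 1200, 1800 or 2400
    tt: float      # dry-bulb temperature (deg C)
    dp: float      # dew-point temperature (deg C)
    rh: float      # relative humidity (%)
    rf: float      # hourly rainfall (mm)
    ff: float      # mean wind speed
    dd: float      # 60-min prevailing wind direction (deg)
    mslp: float    # mean sea-level pressure (hPa)

def parse_record(line: str) -> WeatherRecord:
    """Parse one comma-separated observation line into a WeatherRecord."""
    fields = line.strip().split(",")
    station, hour = fields[0], int(fields[1])
    tt, dp, rh, rf, ff, dd, mslp = (float(v) for v in fields[2:9])
    return WeatherRecord(station, hour, tt, dp, rh, rf, ff, dd, mslp)

rec = parse_record("HKO,1200,28.3,24.1,78,0.5,3.2,90,1008.2")
print(rec.station, rec.rf)  # HKO 0.5
```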

3.4 Variable Selection and Transformation Scheme (VSTS)


Once all the iJADE Weather Messengers have gathered at the central station, they
exchange and integrate the weather information they hold and reorganize it
(which involves weather element selection, information grouping and
transformation) so that it can be used more efficiently for system training and testing.
For instance, in our experimental testing using the weather information from
the 11 stations, there was severe loss of data from certain stations. As a result, the 11
stations were grouped to cover five regions (R1, R2, R3, R4 & R5) according to the
distribution of weather records that could be collected in each region. Incomplete
data were approximated using a linear function interpolating between
the nearby values of the same element within the region.
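The gap-filling step can be sketched as follows. This is a minimal illustration of linear interpolation between the nearest available values, under the assumption that missing readings are marked as `None`; it is not the exact procedure used in the paper.

```python
# Fill None gaps in one station's time series for a single element by
# linear interpolation between the nearest known neighbouring values.
def interpolate_missing(series):
    """Return a copy of `series` with None gaps filled."""
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for i, v in enumerate(filled):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None or right is None:
            # Gap at the edge of the series: copy the nearest known value.
            filled[i] = filled[right if left is None else left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = filled[left] + frac * (filled[right] - filled[left])
    return filled

print(interpolate_missing([10.0, None, 14.0]))  # [10.0, 12.0, 14.0]
```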

Figure 3 - Data distribution among 11 weather stations

3.5 Fuzzy-neuro Training and Prediction Scheme (FNTPS)

Figure 4 - Schematic diagram for Fuzzy-neuro network on rainfall (RF) forecast using meteorological
data: Relative humidity (RH), dry-bulb temperature (TT), dew-point temperature (DT), wind direction
(WD), wind speed (WS), mean sea level pressure (PR) and rainfall (RF)

With all of the relevant weather information collected and pre-processed, the
iJADE Weather Forecaster (a stationary computational agent situated at the central
station) starts the appropriate network training and forecasting, based on a
back-propagation-based fuzzy-neuro network (Figure 4). Table 1 shows the
categories defined for the fuzzification of the rainfall element into five different
categories.
Table 1. Rainfall categories

Category              Nil         Trace         Light        Moderate      Heavy
Range in depth (mm)   0<d<0.05    0.05<d<0.1    0.1<d<4.9    4.9<d<25.0    d>25.0
In our experiments, fuzzy membership functions defined over these categories were
used both for predicting the occurrence of rain or no-rain and for precipitation
(rainfall depth) prediction.
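The fuzzification of the rainfall element can be sketched as below. The crisp category boundaries follow Table 1; the paper's actual membership functions are not reproduced here, so the triangular shape and its parameters are purely illustrative assumptions.

```python
# Map rainfall depth to the Table 1 categories, and illustrate a fuzzy
# membership degree with an assumed triangular function.
CATEGORIES = ["Nil", "Trace", "Light", "Moderate", "Heavy"]
BOUNDS = [0.05, 0.1, 4.9, 25.0]  # upper bounds (mm) of the first four categories

def crisp_category(depth_mm):
    """Map a rainfall depth to its Table 1 category."""
    for name, upper in zip(CATEGORIES, BOUNDS):
        if depth_mm < upper:
            return name
    return "Heavy"

def triangular(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

print(crisp_category(3.0))                   # Light
print(triangular(3.0, 0.1, 2.5, 4.9))        # partial membership in "Light"
```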

4 Experimental Results

Experiments were carried out using the following data sets:
Set A: weather data from a single station, the Hong Kong Observatory (HKO);
Set B: weather data from multiple stations: Hong Kong Observatory, Cheung
Chau, Junk Bay, Sha Tin, Ta Kwu Ling and Lau Fau Shan;
Set C: time-series weather data from multiple stations.
Each data record of Set A contains 6-hourly meteorological data from a single
point source (i.e. HKO); a record of Set B contains data from multiple point sources
at the consecutive time points 0600, 1200, 1800 and 2400; and a record of Set C
contains data from multiple point sources over the consecutive time-series windows
[0600, 1200, 1800, 2400], [1200, 1800, 2400, 0600], [1800, 2400, 0600, 1200] and
[2400, 0600, 1200, 1800]. This setting aims to identify the most suitable and useful
input variables for building the prediction models, in order to minimize the
training time and space complexity of the networks.
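The Set C windows described above can be generated as sliding windows over the chronological sequence of observations, so that every rotation of the four daily observation times forms one input window. This is a sketch of that construction; the use of string labels in place of real feature vectors is for demonstration only.

```python
# Build all consecutive 4-step windows over a chronologically ordered
# list of per-time-point feature vectors (one entry per 0600/1200/1800/2400
# observation).
def time_series_windows(readings, width=4):
    """Return every consecutive window of `width` readings."""
    return [readings[i:i + width] for i in range(len(readings) - width + 1)]

# Two days of observation labels stand in for real feature vectors here.
obs = ["d1-0600", "d1-1200", "d1-1800", "d1-2400",
       "d2-0600", "d2-1200", "d2-1800", "d2-2400"]
for window in time_series_windows(obs):
    print(window)
```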

4.1 Evaluation Consideration


Several evaluation methods were used for analyzing the prediction problem. One is
the linear correlation (R): the linear correlation between the real-world target output
and the real-world prediction. This analysis measures the relationship between two
data sets on a scale that is independent of the unit of measurement. The
population correlation ρ(X,Y) is the covariance cov(X,Y) of the two
data sets X and Y divided by the product of their standard deviations. Perfectly
correlated outputs have ρ(X,Y) = 1.0, anti-correlated outputs have ρ(X,Y) = -1.0,
and uncorrelated outputs have ρ(X,Y) = 0.0.
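This evaluation measure can be computed directly from its definition: the covariance of targets and predictions divided by the product of their standard deviations. A minimal sketch:

```python
import math

# Linear correlation between target outputs and predictions:
# cov(X, Y) / (std(X) * std(Y)), using population statistics.
def linear_correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

print(linear_correlation([1, 2, 3], [2, 4, 6]))  # ~1.0  (perfectly correlated)
print(linear_correlation([1, 2, 3], [6, 4, 2]))  # ~-1.0 (anti-correlated)
```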

4.2 Average Classification Rate


For rainfall classification, the average classification rate was used as the criterion
for measuring the best performance. Results for rain/no-rain and rainfall-depth
classification are shown in Tables 2 and 3.
Table 2. Classification on Rain/No Rain of the following day

Data Set   No. of Input   Network        Accuracy   Classification Rate (%)
           Variables      Architecture   (20%)      No Rain      Rain
A          9              5-2-2          0.729238   78.10        69.77
B          96             8-6-2          0.746303   80.72        76.28
C          152            4-2-2          0.844141   93.25        82.60

Table 3. Classification on Rainfall Depth of the following day

Data Set   No. of Input   Network        Accuracy   Classification Rate (%)
           Variables      Architecture   (20%)      Nil     Trace   Light   Moderate   Heavy
A          9              9-13-5         0.390216   39.75   44.61   36.70   60.16      87.50
B          96             9-3-5          0.509670   61.70   28.65   37.16   57.97      84.38
C          152            486-4-5        0.686007   77.96   79.08   91.39   87.36      87.50
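The per-class and average classification rates reported in Tables 2 and 3 can be computed as sketched below. Whether the paper averages the per-class rates exactly this way is an assumption; the function names and sample labels are illustrative.

```python
# Per-class classification rate: correct predictions for a class divided by
# the number of true instances of that class, expressed as a percentage.
# The "average" entry averages the per-class rates over all classes.
def classification_rates(true_labels, predicted_labels, classes):
    rates = {}
    for c in classes:
        idx = [i for i, t in enumerate(true_labels) if t == c]
        correct = sum(1 for i in idx if predicted_labels[i] == c)
        rates[c] = 100.0 * correct / len(idx) if idx else 0.0
    rates["average"] = sum(rates[c] for c in classes) / len(classes)
    return rates

true_y = ["Rain", "Rain", "NoRain", "NoRain", "NoRain"]
pred_y = ["Rain", "NoRain", "NoRain", "NoRain", "Rain"]
r = classification_rates(true_y, pred_y, ["Rain", "NoRain"])
print(r)
```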

4.3 Model Performance


To summarize, the best experimental results for the different input data sets
are shown in Table 4. In general, the performance of the experiments on time-series
multiple-station data is the best. The results on single-station and multiple-station
data do not differ greatly, so having multiple-station data alone is not sufficient to
build a good model.

Table 4. Performance comparison

Data Set                        A         B         C
Rain/No Rain Classification     73.94%    78.50%    87.93%
Rainfall Depth Classification   53.74%    53.97%    84.76%
The best results, achieved by the time-series prediction, are due to the availability
of a large number of input data, allowing the model to select the right variables and
thus giving it a greater chance of producing better predictions. It is observed that for all
time-series experiments, the accuracy is 10-28% higher than in the single- or
multiple-station experiments. It can be deduced that the correlation between data at
times t and t+1 is high, and therefore it is easier to build a successful model.

5 Conclusion

In this paper, we introduce an innovative intelligent agent-based platform, namely
iJADE. Through the implementation of iJADE WeatherMAN, we illustrate how an
intelligent agent-based system can be efficiently integrated with a time-series fuzzy-
neuro prediction model to implement an online weather information retrieval,
analysis and prediction system.
From the weather prediction point of view, we can conclude that the multiple-
station model, using iJADE WeatherMAN for online information gathering by the
mobile agents (the Weather Messengers), is better than single-station prediction. For
rainfall prediction, as shown in Figure 5, iJADE WeatherMAN achieves a significant
improvement over the single-station model (a threefold increase in correlation).

[Plot: correlation (y-axis) against number of training epochs, 0-900 (x-axis), for the multiple- and single-station models]
Figure 5 - Comparison on correlation using single- and multiple-station data

6 Acknowledgement

The authors are grateful for the partial support of the Departmental Grants for
iJADE Projects, including iJADE Framework (4.61.09.Z042) and iJADE WShopper
(4.61.09.Z028), from the Hong Kong Polytechnic University.

References
1. Aglets. URL: http://www.trl.ibm.co.jp/aglets/.
2. Chow T. W. and Cho S. Y., Development of a Recurrent Sigma-Pi Neural Network
Rainfall Forecasting System in Hong Kong, Neural Computation & Applications 5
(1993) pp. 66-75.
3. Hossain I., Liu J., and Lee R., A Study of Multilingual Speech Feature: Perspective
Scalogram Based on Wavelet Analysis. In Proceedings of IEEE International
Conference on Systems, Man, and Cybernetics (SMC'99) 2, Tokyo, Japan (1999) pp.
178-183.
4. Lee R. S. T. and Liu J. N. K., An Automatic Satellite Interpretation of Tropical Cyclone
Patterns Using Elastic Graph Dynamic Link Model. International Journal of Pattern
Recognition and Artificial Intelligence 13(8) (1999) pp. 1251-1270.
5. Lee R. S. T. and Liu J. N. K., Teaching and Learning the A. I. Modeling. In Innovative
Teaching Tools, Knowledge-Based Paradigms (Studies in Fuzziness and Soft
Computing 36), Springer (2000), pp. 31-86.
6. Lee R. S. T. and Liu J. N. K., FAgent - An Innovative E-Shopping Authentication
Scheme using Invariant Intelligent Face Recognition Agent. In Proceedings of
International Conference on Electronic Commerce (ICEC2000), Seoul, Korea (2000)
pp. 47-53.
7. Lee R. S. T. and Liu J. N. K., Fuzzy Shopper - A fuzzy network based shopping agent in
E-commerce environment. International ICSC Symposium on Multi-Agents and Mobile
Agents in virtual organizations and E-commerce (MAMA '2000), Wollongong, Australia
(2000).
8. Lee R. S. T. and Liu J. N. K.: Tropical Cyclone Identification and Tracking System
using Integrated Neural Oscillatory Elastic Graph Matching and Hybrid RBF Network
Track Mining Techniques. IEEE Transaction on Neural Network 11(3) (2000) pp. 680-
689.
9. Lee R. S. T., A New Era Mobile Shopping Based on Intelligent Fuzzy Neuro-based
Shopping Agents. To appear in IEEE Trans. on Consumer Electronics (2001).
10. Lee R. S. T. and Liu J. N. K., iJADE eMiner - A Web-based Mining Agent based on
Intelligent Java Agent Development Environment (iJADE) on Internet Shopping.
Lecture Notes in Artificial Intelligence series, Springer-Verlag, (2001) pp. 28-40.
11. Li B., Liu J. and Dai H., Forecasting from low quality data with applications in weather
forecasting, International Journal of Computing and Informatics 22(3), pp. 351-358
(1998).
12. Liu N. K., Computational Aspects of a Fine-mesh Sea Breeze Model, M. Phil.
Dissertation, Department of Mathematics, Murdoch University, Western Australia
(1988).
13. Liu J. N. K. and Lee R. S. T., Rainfall Forecasting from Multiple Point Source Using
Neural Networks. In Proceedings of IEEE International Conference on Systems, Man,
and Cybernetics (SMC'99) 2 Tokyo, Japan (1999) pp. 429-434.
14. Liu J. and Wong L., A case study for Hong Kong weather forecasting, In Proceedings of
International Conference on Neural Information Processing'96, Hong Kong, (1996) pp.
787-792.
15. McGregor J. L., Walsh K. J. and Katzfey J. J., Climate simulations for Tasmania, In
Fourth Int'l Conference on Southern Hemisphere Meteorology and Oceanography,
American Meteorological Society (1993) pp. 514-515.
16. Voyager. URL: http://www.objectspace.com/voyager/.
Acquaintance Models in Coalition Planning
for Humanitarian Relief Operation

MICHAL PECHOUCEK, VLADIMIR MARIK, JAROSLAV BARTA


Gerstner Laboratory, Czech Technical University in Prague
{pechouc, marik}@labe.felk.cvut.cz

1 Introduction

The application domain of this coalition formation research belongs to the area of
war-avoidance operations such as peace-keeping, peace-enforcing, non-combatant
evacuation or disaster relief operations. Unlike classical war operations, where
the technology of control is strictly hierarchical, operations other than war
(OOTW) are very likely to be based on the cooperation of a number of different, quasi-
volunteer, vaguely organized groups of people, non-governmental organizations
(NGOs) and institutions providing humanitarian aid, but also army troops and official
governmental initiatives.
A collaborative approach to operation planning, unlike a hierarchical one, allows
a greater degree of flexibility and dynamism in grouping the optimal parties playing an
active role in the operation. New entities shall be free to join autonomously and
involve themselves in planning with respect to their capabilities. Therefore any
organization framework must be essentially "open". OOTW admit, according to 12,
multiple perspectives on plan evaluation, as there need not be one shared
goal or a single metric of the operation (such as political, economical or
humanitarian). For the same reason, the goals across the community of entities
involved in a possible coalition may be in conflict. Even if the community members
share the same goal, it can easily be misunderstood due to different cultural
backgrounds.
The main reason why operations involving different NGOs can hardly be planned
by a central authority is their reluctance to provide information
about their intentions, goals and resources. Consequently, besides difficulties
related to planning and negotiation, we have to face the problem of how to assure
the sharing of detailed information. Many institutions will be ready to share resources
and information within some well-specified community, whereas they will refuse to
register their full capabilities and plans with a central planning system and will not
follow centralized commands. They may, however, agree to participate in executing a
plan in whose formation they played an active role.


Actual information may also become unavailable due to unreliable
communication channels. It may happen that a collaborative entity is cut off from its
communication links for a certain period of time while the rest of the community still
wishes to be able to form or execute a plan relying upon the missing player (and the same
vice versa). For this reason, each participating entity shall be able to maintain an
approximate model of the collaborating members of the coalition.
Naturally, the paradigm of multi-agent systems - a community of heterogeneous,
autonomous, proactive and collaborative agents - suits the domain of coalition
planning. Multi-agent systems provide important features that can be applied with
great advantage to our problem 3. When modeling or running a very complex
system, the multi-agent abstraction allows a natural decomposition of the problem
into interacting components, each represented by an agent. By doing so,
tractability, maintainability and flexibility are provided. In our interpretation, an
agent is a complex, organized entity playing an active role in OOTW (an NGO,
humanitarian organization, army troop, etc.). Agents communicate via declarative
communication languages that facilitate a wide range of interaction patterns. An
important virtue of the agent-based paradigm is the fact that many courses of
interaction (and the resulting collaboration) are not foreseen during system design
and can occur emergently. Agents can join and leave the community upon their
own decision, they do not necessarily need to be benevolent, and they improve their
behavior over time. The paradigm of multi-agent systems allows the integration of agents
that are heterogeneous in nature and communicate in a peer-to-peer manner. Let us
understand the targeted multi-agent system to consist of a number of agents that
group themselves into various temporary coalitions (each solving a specific
mission or part of a mission).
Firstly, let us put down several terms that will be used throughout this study.
As a multi-agent community we understand the whole collection of agents
participating in the above-described OOTW (quasi-volunteered, vaguely organized
groups of people, non-governmental organizations, institutions providing
humanitarian aid, army troops or official governmental initiatives). Throughout the
paper we will refer to a coalition as a set of agents (an agreement) who agreed to
fulfill a single, well-specified goal. Agents thus commit themselves to collaboration
with respect to the in-coalition-shared goal. An agent may participate in multiple
coalitions. Unlike an alliance (see below), a coalition is usually regarded as a short-
term agreement between collaborative agents. As a coalition formation/planning
we understand the process of finding a coalition of agents who may participate in
achieving the respective goal 10,11. As an alliance we understand a collection of
agents that share general humanitarian objectives and all agree to form possible
coalitions. Moreover, the alliance members decided to share detailed information
about their status, resources they may provide and their collaboration preferences.
The alliance is regarded as a long term cooperation agreement among the agents.

2 CPlanT Multi-Agent Architecture

We will model the collective behavior of the multi-agent community with the CPlanT
(Coalition PLANning Tool) multi-agent system. The CPlanT architecture consists
of several specific classes of agents:
Resource Agents (R-agents) represent the in-place resources that are
indispensable for delivering humanitarian aid, such as roads, airports, seaports,
but also subcontracted shipping agencies. Unlike the below-defined H-agents, the R-
agents are regarded as passive and do not initiate any kind of
humanitarian effort.
In-need Agents (In-agents) are critical bodies in the entire simulation. They
will represent the centers of conflict that call for help (e.g. cities, villages,
etc.).
Coordination Agents (C-agents) are specific types of matchmaking agents
that monitor the In-agents' call for help, contract other countries' C-agents,
who broadcast the call within humanitarian agencies. In many cases the C-
agents also wish to coordinate the flow of humanitarian aid towards the In-
agents.
Humanitarian Agents (H-agents) are computational representations of
the participating humanitarian agencies. Like the R-agents, H-agents provide
humanitarian aid; therefore, one may regard the H-agent as a subclass of R-
agent. However, the H-agents are proactive and take the initiative in providing the
respective humanitarian aid.

Figure 1 - CPlanT Multi-Agent Architecture

Though the coalition formation problem is much wider and involves forming
coalitions together with all the other participating agents, we will investigate
coalition formation among the H-agents.

3 H-Agent Knowledge Architecture

The H-agent may participate in one or more alliances and at the same time may
be actively involved in a coalition of agents cooperating to fulfill a shared task.
The computational and communication complexity of the above-defined coalition
formation problem depends on the amount of pre-prepared information the agents
administer about one another, and on the sophistication of the agents' meta-
reasoning mechanisms (by meta-reasoning we understand an agent's capability to
reason about other agents' reasoning processes). We suggest three levels of
agents' knowledge representation:
Public Knowledge is shared within the entire multi-agent community. This
class of knowledge is freely accessible within the community. As public
knowledge we understand the agent's name, the type of organization the agent
represents, the general objective of the agent's activity, the country where the agent
is registered, the agent's human-human contact (telephone, fax number, email),
the human-agent type of contact (usually an http address) and finally the agent-
agent type of contact (the IP address, incoming port, ACL).
Alliance-Accessible Knowledge is shared within a specific alliance. We do
not assume this knowledge to be shared between overlapping alliances.
Members of an alliance will primarily share information about the free
availability of their resources and their respective positions. This resource-oriented
type of knowledge may be further distinguished into material resources, human
(professional) resources and transport resources.

Figure 2 - Concept of the Alliance and the Coalition

Private Knowledge is owned and administered by the agent himself. Though
the agents are expected to share private information neither within an
alliance nor within a coalition, they may, provided they find it useful,
communicate some pieces of the private information upon request. As an
instance of private knowledge we can consider mainly the specific status of the
resources the agent administers. Similarly, the agent will maintain both future
plans and resource allocations, as well as records of his past performance. An
important type of private knowledge concerns the agent's collaboration preferences
and possible restrictions, but also the agent's planning and scheduling
algorithms.
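The three-level knowledge representation described above can be sketched as a belief base in which every fact is tagged with its accessibility level, and queries are filtered by the requester's relationship to the owner. The class and method names below are illustrative assumptions, not taken from CPlanT.

```python
# Each fact is tagged public, alliance-accessible (for a named alliance),
# or private; `ask` returns it only to an entitled requester.
PUBLIC, ALLIANCE, PRIVATE = "public", "alliance", "private"

class BeliefBase:
    def __init__(self, owner):
        self.owner = owner
        self.facts = {}  # key -> (value, level, alliance name or None)

    def tell(self, key, value, level=PRIVATE, alliance=None):
        self.facts[key] = (value, level, alliance)

    def ask(self, key, requester_alliances=()):
        """Return the fact if the requester is entitled to see it."""
        value, level, alliance = self.facts[key]
        if level == PUBLIC:
            return value
        if level == ALLIANCE and alliance in requester_alliances:
            return value
        return None  # private, or requester not in the right alliance

ngo = BeliefBase("NGO-1")
ngo.tell("country", "CZ", level=PUBLIC)
ngo.tell("free_trucks", 4, level=ALLIANCE, alliance="relief-alliance")
ngo.tell("mission_plan", "confidential", level=PRIVATE)

print(ngo.ask("country"))                           # CZ
print(ngo.ask("free_trucks", {"relief-alliance"}))  # 4
print(ngo.ask("free_trucks"))                       # None
```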
We suggest three separate phases of the multi-agent community lifecycle:
Registration: Throughout the registration phase a newly arriving agent registers
within the multi-agent community. In the simplest possible case, the
registering component of the respective interoperability platform (i.e. the FIPA
AMS) updates its white-page list (the list of available agents). However, we want
the agents to be as autonomous as possible, and we try to avoid making them
rely upon central components. Once the newcomer registers with the AMS
component, the AMS broadcasts notice of this event within the
already existing community and at the same time forwards its white-page list
to the new agent. Usually, a white-page list contains only information
about existence (symbolic name) and communication access (IP address and
the respective port). We want an agent to be attributed with a complete record
of public knowledge.

Alliance Formation: In this phase the agents analyze the information they
have about the members of the multi-agent system and attempt to form
alliances. In principle, each agent is expected to compare its own private
knowledge (i.e. mission objectives, collaboration preferences and collaboration
restrictions) with the public knowledge about possible alliance members
(i.e. type of organization, its objective, country of origin, etc.). Having
detected a possible future collaborator, the agent proposes collaboration.
As such a collaboration inclination need not be bi-directional, the proposal
may be followed by a negotiation in which some pieces of non-private
information may be discussed. In real-life cases we expect human intervention
when forming alliances. The design of sophisticated negotiation protocols may
be the subject of further research.

Coalition Formation Phase: In this phase agents group together not
according to a similar mission objective but with respect to a single, well-
specified task that needs to be accomplished. The simplest and most
straightforward approach to forming an effective coalition is to search for
partners among the alliance members. The alliance members know the most about
each other and are able to suggest a coalition that will very likely have the
foreseen properties. Whichever member of an alliance takes on the role of
coordinating the task accomplishment, he parses his social neighborhood
(the services the agents provide and the allocation of their resources) and detects
the most suitable collaborators. Upon approval from each of the suggested
agents, the respective coalition is formed.
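The coordinator's search through its social neighborhood can be sketched as below: it scans its acquaintance model of fellow alliance members and greedily picks collaborators until the task's resource requirements are covered. The data layout and the greedy strategy are illustrative assumptions, not the CPlanT algorithm.

```python
# Greedily assemble a coalition of alliance members whose free resources,
# as recorded in the coordinator's acquaintance model, cover the task needs.
def form_coalition(task_needs, alliance_members):
    """task_needs: dict resource -> required amount, e.g. {"food_t": 10}.
    alliance_members: dict name -> dict of free resources per member.
    Returns a list of (member, contribution) pairs, or None if uncoverable."""
    remaining = dict(task_needs)
    coalition = []
    for name, free in alliance_members.items():
        contribution = {r: min(free.get(r, 0), need)
                        for r, need in remaining.items() if free.get(r, 0) > 0}
        if contribution:
            coalition.append((name, contribution))
            for r, amount in contribution.items():
                remaining[r] -= amount
            remaining = {r: n for r, n in remaining.items() if n > 0}
        if not remaining:
            return coalition
    return None  # the task cannot be covered within this alliance

members = {"RedHat-NGO": {"food_t": 6, "trucks": 2},
           "AidCorp":    {"food_t": 8},
           "MedGroup":   {"medics": 5}}
print(form_coalition({"food_t": 10, "trucks": 2}, members))
```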

4 Agents' Acquaintance Model

The main research challenge of this project is to design an appropriate
acquaintance model for the H-agents. An acquaintance model represents an agent's
awareness of other members of the community. Utilisation of an agent's
acquaintance model allows fast and efficient coalition formation while keeping an
important part of the agent's knowledge private and inaccessible to unauthorised
agents. Apart from ensuring the privacy of the agents' knowledge, we use
acquaintance models to allow an agent to reason about other agents while it is
disconnected from the rest of the community. Similarly, the other alliance members
may form plans that rely upon the missing agent, as they keep much of the
knowledge about that agent in their acquaintance models.
There have been many different acquaintance models designed for various
purposes. The concept of the tri-base acquaintance model (3bA model) has been
successfully applied in the domain of production planning [4] and in providing
safety and efficiency improvements of the communication traffic in multi-agent
systems [7]. A similar approach has been applied to the problem of multi-level
optimisation in multi-agent systems. A general meta-reasoning acquaintance
model (MRAM) has been designed for implementing meta-agents and instantiating
reflexivity in a multi-agent system [6].

4.1 Suggested Acquaintance Model for Humanitarian Relief Operations


Hereafter we will try to put together the knowledge architecture requirements
and the suggested acquaintance model knowledge structures. The acquaintance
model shall contain four key knowledge bases (based on our experience with 3bA
and MRAM):
self-belief-base (stored in the knowledge wrapper) stores the knowledge the
agent knows about itself and may want to make accessible to all or some of the
agents. As there are accessibility relations attached to each piece of
knowledge in the self-belief-base (specifying who can manipulate/access the
respective knowledge), all three types of knowledge (private knowledge,
alliance-accessible knowledge and public knowledge) may be stored in a single
knowledge structure.
community-belief-base (stored in the communication wrapper) contains the
communication-related, technical public knowledge about all members of the
multi-agent system (e.g. symbolic name, IP address, communication means,
type, etc.).
social-belief-base (stored in the agent's body) stores knowledge about the
other agents. In the ground-belief-table it stores the accessible fraction of
someone else's self-belief-base, and in the assumed-belief-table it maintains
new hypotheses about other agents' operation that the agent has formed.
task-base (stored in the agent's body) is a collection of pre-planned
coalitions that the agent may coordinate or participate in with respect to
different tasks. Planned coalitions are constructed from the information
stored in the other parts of the agent's acquaintance model; we explain below
how it is maintained.

[Figure: coalition planning algorithms, user interface and resource
administration reside in the agent's body.]

Figure 3 - Suggested Acquaintance Model Architecture
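For illustration, the four knowledge bases could be represented by a structure such as the following; all class and field names are ours, and the three-level access scheme is a simplification of the accessibility relations described above.

```python
# Illustrative sketch of the four knowledge bases of the suggested
# acquaintance model; names and the access-level encoding are assumptions.
class AcquaintanceModel:
    def __init__(self):
        # self-belief-base: own knowledge, each fact tagged with an
        # accessibility level (public / alliance / private)
        self.self_beliefs = {}       # fact -> access level
        # community-belief-base: technical public knowledge about all agents
        self.community_beliefs = {}  # agent name -> {"address": ..., "type": ...}
        # social-belief-base: ground (told) and assumed (hypothesised) beliefs
        self.ground_beliefs = {}     # agent name -> facts obtained from it
        self.assumed_beliefs = {}    # agent name -> own hypotheses about it
        # task-base: pre-planned coalitions per task
        self.task_base = {}          # task -> list of planned coalitions

    def visible_self_beliefs(self, access):
        """Facts a reader with the given access level may see."""
        order = {"public": 0, "alliance": 1, "private": 2}
        return {fact for fact, level in self.self_beliefs.items()
                if order[level] <= order[access]}

model = AcquaintanceModel()
model.self_beliefs = {"objective": "public",
                      "budget": "private",
                      "prefs": "alliance"}
```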

The agents' social neighbourhood - the structure of the various alliances - is
expected to change dynamically. Even already planned coalitions may require
acute re-planning and resource reconfiguration as the environment changes, the
respective crises dynamically evolve, or the agents leave the coalition for
many different reasons. Logical clustering of knowledge about the services the
agents may provide, and its appropriate maintenance, shifts a substantial
amount of computation from the time of request to the agent's idle time.
An important issue is the problem of maintaining the knowledge an agent may
have about the others. In principle, we disregard knowledge maintenance
mechanisms that are based on a central component (i.e. blackboard
architectures, facilitators, etc.). Knowledge maintenance may be driven by the
agent who reasons about another agent, or by the agent who is the subject of
such reasoning. In the former case we talk mainly about the approach of
periodical revisions, where the reasoning agent periodically asks the agents
it reasons about for an update. More appropriate is the latter alternative,
where knowledge maintenance is driven by the agent who is the subject of
someone else's reasoning - subscription-based maintenance. Once an agent
appends a record about another agent to its social-belief-base, it subscribes
to this collaborating agent for future updates. Upon a change of a required
resource the subscribed agent informs its subscribers.
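A minimal sketch of the subscription-based maintenance scheme, assuming a simple observer-style push of resource updates (all names are illustrative):

```python
# Subscription-based maintenance: when an agent records another agent in
# its social-belief-base it subscribes to that agent, which then pushes
# updates instead of being polled.
class Agent:
    def __init__(self, name):
        self.name = name
        self.resources = {}
        self.subscribers = []     # agents that reason about us
        self.social_beliefs = {}  # name -> last known resources of that agent

    def subscribe(self, observer):
        """observer records us and asks to be kept up to date."""
        self.subscribers.append(observer)
        observer.social_beliefs[self.name] = dict(self.resources)

    def update_resource(self, key, value):
        self.resources[key] = value
        for obs in self.subscribers:  # push the change to all subscribers
            obs.social_beliefs[self.name][key] = value

coordinator, supplier = Agent("coordinator"), Agent("supplier")
supplier.resources["trucks"] = 5
supplier.subscribe(coordinator)          # coordinator records the supplier
supplier.update_resource("trucks", 3)    # change is pushed, not polled
```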

4.2 Using the Acquaintance Model for Coalition Formation


Let us distinguish the process of forming a coalition from the process of
planning a coalition. By coalition planning we understand the agent's
reasoning about possible candidates for membership in the planned coalition.
The planning agent can either broadcast a call for cooperation and select the
best proposal, or it can directly analyse the social knowledge it keeps in its
acquaintance model. Coalition formation is the process of closing an agreement
among the candidates for membership in a planned coalition. Usually the
planning/coordinating agent contracts the respective collaborators. Once the
coalition is formed, the coalition activity may start. The quality of the
coalition is a measure of how suitable the coalition will be for
accomplishing the shared objective.
The coordinating agent may (i) either be asked to form a coalition and
immediately initiate task implementation among the coalition members in order
to accomplish the shared objective (both coalition planning and formation
happen in the coalition formation phase), (ii) or it will maintain a certain
number of patterns of coalition structures (coalition planning phase) by whose
combination it will form an accurate coalition in a reasonable amount of time
(coalition formation phase). Practically, agents will maintain a collection of
suggested coalitions in their task-base. As the involved agents change their
status over time, the suggested coalitions will change in quality. The agent's
activity of keeping the task-base knowledge up-to-date and consistent with
changes in the social-belief-base is called coalition re-planning. Upon each
update in the social-belief-base the agent has to check the quality of the
coalitions that depend on the given update, or plan a new coalition if an
already planned coalition proves impossible to form or its quality drops below
an acceptable level. For accurate re-planning the agents may keep in their
task-bases several alternative coalitions for one task; it is difficult to
judge the quality of a coalition if there is nothing to compare it to.
The problem of coalition planning is a rather complex one, and many
researchers have experimented with various methodologies. In the Gerstner
Laboratory we have been investigating the behaviour of two types of algorithms
that reduce the state-space of possible coalitions [1]. Using a distributed
branch-and-bound algorithm, we manage to decompose the problem of searching
for the optimal coalition among the possible candidates; the agents jointly
find an optimal coalition. The other algorithm is based on parallel
state-space search. Though we have proved linear complexity of the
single-agent problem solving, the coalition found is not guaranteed to be
optimal. Nevertheless, the experiments showed that the solutions found by the
fast algorithm are very close to the optimum (see Figure 4 for a comparison).


Figure 4 - Comparison of the optimal and linear coalition formation algorithms in terms of quality of the
formed coalition (left graph) and in terms of time requirements for the coalition formation process

5 Conclusion

Testing the correctness of the acquaintance models requires a well-defined,
formal, but realistic enough scenario that can represent, model and initiate
all aspects of the agents' nontrivial behaviour. The principles and ideas
specified above are currently being tested and implemented on a subset of the
OOTW types of operations - humanitarian relief (HR) operations. For this
purpose we have designed a hypothetical humanitarian scenario in a fictitious
country - SufferTerra (inspired by [8,9]). The scenario knowledge has been
encoded in XML and the computational model of the scenario has been
implemented in Allegro Common Lisp.
Our experience shows that knowledge is the key driving force in MAS
organisation, functionality and efficiency of action. Multi-agent systems
have to solve increasingly complex tasks. In many cases agents are not
benevolent: they sometimes refuse to collaborate fully, they do not always
want to share their private knowledge, and they sometimes engage in strategic
decision-making (which need not be sincere). In these areas - OOTW coalition
planning is an instance - the agents' acquaintance model provides a natural,
general vehicle for organising, maintaining and exploring the agents'
knowledge of their collaborative neighbourhood.
This project was supported by AFOSR/European Office of Aerospace Research
and Development under contract number F61775-00-WE043.

1. Barta J., Pechoucek M., Stepankova O.: Object Oriented Approach to
   Coalition Forming. GL 2000/100 - Technical Report of The Gerstner Lab,
   CTU Prague & FAW Linz, Hagenberg - Praha - Wien, 2000
2. Dix J., Subrahmanian V.S., Pick G.: Meta Agent Programs. Journal of Logic
   Programming, 46(1-2):1-60, 2000
3. Jennings N.: Agent Based Computing: Promises and Perils. Computers and
   Thought Award Lecture, International Joint Conference on Artificial
   Intelligence, Stockholm, August 1999
4. Marik V., Pechoucek M., Lazansky J., Roche C.: PVS'98 Agents: Structures,
   Models and Production Planning Application. Robotics and Autonomous
   Systems, vol. 27, no. 1-2, Elsevier, 1999, pp. 29-44. ISSN 0921-8890
5. Marik V., Pechoucek M., Stepankova O., Lazansky J.: ProPlanT: Multi-Agent
   System for Production Planning. International Journal of Applied Artificial
   Intelligence, September 2000
6. Pechoucek M., Norrie D.: Knowledge Structures for Reflective Multi-Agent
   Systems: On Reasoning About Other Agents. Report Number 538, Department of
   Mechanical and Manufacturing Engineering, University of Calgary, Alberta,
   Canada, December 2000
7. Pechoucek M., Stepankova O., Marik V.: Saving Communication in Multi-Agent
   Systems with Tri-base Acquaintance Models and Machine Learning. Project
   Report of US Air Force Research Contract - Multi-Agent Systems in
   Communication, contract no. F61775-99-WE099, August 2000
8. Rathmell R.A.: A Coalition Force Scenario 'Binni - Gateway to the Golden
   Bowl of Africa'. Defence Evaluation and Research Agency, 1999
9. Reece G.A., Tate A.: The Pacifica NEO Scenario. Technical Report
   ARPA-RL/O-Plan2/TR/3, Artificial Intelligence Applications Institute,
   University of Edinburgh, Scotland, March 1993
10. Sandholm T.: Distributed Rational Decision Making. In: Multi-Agent
    Systems: A Modern Approach to Distributed Artificial Intelligence
    (Weiss G., ed.), The MIT Press, Cambridge, 1999
11. Shehory O., Kraus S.: Methods for Task Allocation via Agent Coalition
    Formation. Artificial Intelligence, vol. 101 (1-2), pp. 165-200, 1998
12. Walker E.C.T.: Panel Report: Coalition Planning for Operations Other Than
    War. Workshop at AIAI, Edinburgh, Spring 1999
AGENT NEGOTIATION IN A VIRTUAL MARKETPLACE

WALID S. SABA AND PRATAP R. SATHI

Knowledge & Language Engineering Group, School of Computer Science,
University of Windsor, Windsor, ON N9B 3P4, Canada
E-mail: {saba, sathi}@cs.uwindsor.ca

We are in the process of building DALIA - an environment for distributed, artificial, and
linguistically competent intelligent agents that communicate in natural language and perform
commonsense reasoning in a highly dynamic and uncertain environment. There are several
challenges in this effort that we do not touch on in this paper. Instead, we focus here on the
design of a virtual marketplace where buying and selling agents that learn from experience
negotiate autonomously on behalf of their clients. Buying and selling agents enter the
marketplace with an 'attitude' formulated as a complex function of prior experience(s), market
conditions and product information, as well as personal characteristics such as the importance
of time vs. the importance of price and the commitment level to the purchase/sale of the product.

1 Introduction

While the term "intelligent agent" seems to mean different things to different
people, there clearly is a core meaning that the agent community agrees upon,
which, at a minimum, includes the following: (i) an agent is an autonomous
module/system that is expected to be an expert at performing a specific task;
(ii) agents are situated, i.e., they operate in a dynamic environment of which
they must be aware; and (iii) agents are expected to be capable of performing
some kind of reasoning and to exhibit flexible problem solving behavior. Other
important characteristics include learning, mobility and communication (see [1,6]).
In our view, intelligent agents must also have a certain level of linguistic
competency and must be able to perform commonsense reasoning in a highly
dynamic and uncertain environment. To this end we are in the process of building
DALIA - an environment for distributed, artificial, and linguistically competent
intelligent agents that communicate in natural language and perform commonsense
reasoning in a distributed and highly dynamic and uncertain environment. There are
several challenges in this effort that we do not touch on in this paper1. Instead, we
focus here on describing a virtual marketplace where buying and selling agents
autonomously negotiate on behalf of their clients. We briefly touch on the type of
commonsense reasoning that buying and selling agents must perform in such a
dynamic and uncertain environment where facts are clearly fuzzy and subject to
temporal and modal qualifications. The motivation of our long-term objectives can
be illustrated by the following scenario involving a certain buyer, B:

1 See Saba and Corriveau [9,10] and Saba [8] for an overview of our concurrent work on
commonsense reasoning and language understanding.


It is very likely that the price of PCs will keep going down for a while.
B can wait for another few months to buy a PC.

Therefore, B should probably wait a few months to buy a PC, unless B
stumbles on a very good deal.

We assume that intelligent agents are ultimately expected to perform a similar type
of reasoning. Formalizing this kind of reasoning is clearly not a trivial task. First, it
is clear that there are temporal and modal aspects that must be taken into
consideration. For example, if "very likely" was changed to "highly unlikely" in the
first premise, B should be advised to make a purchase sooner rather than later. The
above inference is also partly based on the assumption that B 'can' (possibly) wait a
few months to make a purchase. An entirely different conclusion should be drawn if
it happened that B "must" actually make a purchase (i.e., if there are time
constraints on B.) Finally, the conclusion B draws is conditional - that is, B is to
wait for a few months, unless B stumbles on a "very good deal". The challenge here
of course is to quantify "very good deal" in this context.
Our ongoing effort towards formalizing this kind of reasoning is still in its early
stages, and some preliminary attempts in this regard are only briefly discussed at the
end of the paper. The focus here is primarily on a framework that we are building
that would support the design of artificial and linguistically competent intelligent
agents. In particular, we discuss the design of a prototype implementation of a
virtual marketplace where buying and selling agents autonomously negotiate on
behalf of their clients. While our negotiation model has several common features
with a number of existing approaches to negotiation (e.g., [2,3,7]), none of these
models the notion of a mental state of an agent, which, as will be argued below,
plays an important role in the negotiation process. Moreover, exploring the
interaction between an agent's mental state and an agent's prior experience is novel
in our model, although a limited form of learning from experience (using case-based
reasoning) has been previously suggested [13].
In the next section we give an overview of the negotiation model and discuss
the various parameters of the negotiation process. In section 3 we discuss the
learning strategy employed by the agents using case-based reasoning. In section 4
we briefly discuss some preliminary work on developing a commonsense reasoning
for agents. Finally, we provide some concluding remarks in section 5.

2 Agent Negotiation in a Virtual Marketplace

As part of our ongoing effort on DALIA - an environment for distributed, artificial


and linguistically competent intelligent agents - we have designed a virtual
marketplace environment where buying and selling agents negotiate on behalf of
their clients. In this section we discuss the basic components of this model.

2.1 Basic Components of the Model


The virtual marketplace is an environment that at any point in time has a list of
buyers and sellers. Buyers and sellers in the environment are also assumed to have
access to two knowledge sources: (i) an ontology for domain-specific product
information; and (ii) a set of general commonsense rules. Agents are assumed to
learn from experience using case-based reasoning (CBR) techniques: negotiations
along with product information, market conditions and the outcome of the
negotiation are stored as experiences for use in future negotiations. A high-level
view of the model is shown in figure 1 below.

[Figure: clients create buyers and sellers, which register in the marketplace;
each buyer clone negotiates with a corresponding seller clone, and experiences
are stored in a case base.]

Figure 1. Basic components of the virtual marketplace.

The process starts when clients (users) create buyer and seller agents that are sent to
the virtual marketplace. Buyers and sellers are registered in the marketplace, where a
list of buyers and sellers is maintained. In the current model it is the buyers that are
assumed to be proactive; it is buyers that look for and initiate a negotiation with
sellers. Here is an overview of the process from a buyer's perspective:
• A buyer b is created by some client.
• The buyer enters the marketplace (it is added to the list of buyers).
• The buyer retrieves a publicly available price range for the product in question.
• Based on its attitude and the publicly available price range, the buyer computes
its own price range as a complex function that we discuss below.
• b queries the environment for a list of sellers, S, selling the same product.
• b sends an (asynchronous) message to each seller s ∈ S requesting a negotiation.
• Sellers provide a handle for a negotiation clone, or decline to negotiate.
• For each seller clone sc, b creates a buyer clone bc.
• A negotiation starts between each pair of sc and bc.
• Buyers start bidding with their price range's minimum, while sellers start with
their maximum (agents' price ranges are hidden).
• A deal is made when the buyer's maximum reaches the seller's minimum.

• No deal is made if the buyer's maximum falls short of the seller's minimum.
• Both buyers and sellers might save the experience for future use
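The buyer-side flow above can be compressed into a sequential sketch. The paper uses asynchronous clones, which we collapse into one call per seller; the inline price-range computation follows our reading of Definition 2.5 below, and the function names are illustrative.

```python
# Sequential sketch of the buyer-side flow: compute an own price range
# from the attitude, run one "negotiation clone" per seller, keep the
# best successful deal (lowest price).
def buyer_flow(public_range, attitude, sellers, negotiate):
    """Return the best (price, seller) deal found, or None."""
    pmin, pmax = public_range
    m, n, k = attitude  # importance of time, of price, commitment
    # stand-in for APR^buyer of Definition 2.5 (our reading of it)
    own_range = (pmin + m * (pmax - pmin) / 10,
                 pmax - pmax * n * (1 - k) / 10)
    deals = []
    for seller in sellers:  # one negotiation "clone" per seller
        outcome, price = negotiate(own_range, seller)
        if outcome == "DONE+":
            deals.append((price, seller))
    return min(deals) if deals else None

def fake_negotiate(own_range, seller):
    # stand-in for a real negotiation thread, for demonstration only
    return ("DONE+", 1500) if seller == "s1" else ("DONE-", None)

best = buyer_flow((1000, 3000), (1.0, 0.5, 0.8), ["s1", "s2"], fake_negotiate)
```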
We explore this process in some detail below. First some definitions are in order.
Definition 2.1 An Agent's Attitude (aat) is a hidden mental state comprised of a
triple (x1, x2, x3), representing the importance of time, the importance of price and
the commitment level of an agent, respectively, where xi ∈ [0,1].
For an agent with attitude = (1.0, 0.5, 0.8), for example, time is a priority and
the commitment level is rather high, although the agent is somewhat indifferent to
the price (an ideal buyer attitude from the perspective of a seller). If we take the
absolute Euclidean distance as an equivalence operator, as done in [5], we can
compute a measure of similarity between two attitudes,
0 ≤ AS(attitude1, attitude2) ≤ 1, as:

AS((x1, x2, x3), (y1, y2, y3)) = (1 − |x1 − y1|) ∧ (1 − |x2 − y2|) ∧ (1 − |x3 − y3|)

Although the two t-norm functions commonly used for conjunction (minimum and
product) seem equally plausible here, we have (admittedly) arbitrarily chosen to use
product in the current model. This is surely worthy of further investigation.
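Transcribed directly, with the product t-norm the paper chose for the conjunction:

```python
# Attitude similarity AS: product of the three per-component similarities.
def attitude_similarity(a, b):
    """AS((x1,x2,x3),(y1,y2,y3)) with the product t-norm; result in [0,1]."""
    return ((1 - abs(a[0] - b[0]))
            * (1 - abs(a[1] - b[1]))
            * (1 - abs(a[2] - b[2])))
```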
Definition 2.2 Public Price Range: All agents in the marketplace are assumed
to have access to a product price range that can be obtained from a product
ontology: PPR(prod) = [pmin, pmax]. We compute a similarity measure between
two public price ranges, 0 ≤ RS(range1, range2) ≤ 1, as follows:

RS([min1, max1], [min2, max2])
    = (1 − |min1 − min2| / (min1 + min2)) ∧ (1 − |max1 − max2| / (max1 + max2))
Definition 2.3 A Negotiation is a process that can be in any of the following states2:
(i) DONE+; (ii) DONE-; or (iii) DONE0, corresponding respectively to a negotiation
that completed successfully (a deal was made), a negotiation that completed
unsuccessfully (no deal is possible), and a negotiation that is still in progress. A
negotiation record is an ordered list of offers and counter offers, and an outcome,
({(Offer, CounterOffer)}, Outcome), where Outcome ∈ {DONE+, DONE-}. In addition
to the outcome, another important feature of a negotiation is the time it took, which
could be crudely approximated as the length of the negotiation record,
length(({(Offer, CounterOffer)}, Outcome)) = |{(Offer, CounterOffer)}|3. Actual
offers and counter-offers are also important features of a negotiation. Since the

2 Our negotiation process can be modeled as a finite state machine (FSM) similar to [11].
3 Since the 'actual' time a negotiation takes is implementation-dependent, such a measure
would clearly be misleading. However, some combination of time and number of steps would
be needed.

length of the negotiation has already been taken into account, one only needs to
consider the average offer and counter-offer of a negotiation. That is,

avgOffer_i = (1/|L_i|) Σ_{(offer, counterOffer) ∈ L_i} offer

avgCounterOffer_i = (1/|L_i|) Σ_{(offer, counterOffer) ∈ L_i} counterOffer

Taking the average offer and counter offer of a negotiation, the time a negotiation
takes, as well as the negotiation outcome, a similarity measure between two
negotiations 0 ≤ NS((L1, Outcome1), (L2, Outcome2)) ≤ 1 can now be defined as:

NS((L1, outcome1), (L2, outcome2)) =
    F(Sim(length(L1), length(L2)) ∧ Sim(offers1, offers2))   if outcome1 ≠ outcome2
    Sim(length(L1), length(L2)) ∧ Sim(offers1, offers2)      otherwise

where

Sim(offers1, offers2)
    = (1 − |avgOffer1 − avgOffer2|) ∧ (1 − |avgCounterOffer1 − avgCounterOffer2|)
F(x) = max(0, x − ε)

and ε is a bias against the difference in the outcome (currently ε = 0.5). Note that
since there is always at least one offer and counter offer, length(L1) + length(L2) ≠ 0.
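A sketch of NS follows. The paper does not spell out Sim over lengths, so from the remark that length(L1) + length(L2) ≠ 0 we assume the form 1 − |l1 − l2|/(l1 + l2); we also read the conjunction as min and assume offers normalised to [0,1]. All three are our assumptions.

```python
# Negotiation similarity NS over (record, outcome) pairs, where a record
# is a list of (offer, counterOffer) pairs with values in [0,1].
def _avg(pairs, idx):
    return sum(p[idx] for p in pairs) / len(pairs)

def negotiation_similarity(neg1, neg2, eps=0.5):
    (rec1, out1), (rec2, out2) = neg1, neg2
    l1, l2 = len(rec1), len(rec2)
    sim_len = 1 - abs(l1 - l2) / (l1 + l2)   # assumed form of Sim over lengths
    sim_offers = min(1 - abs(_avg(rec1, 0) - _avg(rec2, 0)),
                     1 - abs(_avg(rec1, 1) - _avg(rec2, 1)))
    score = min(sim_len, sim_offers)         # conjunction read as min
    # F(x) = max(0, x - eps) penalises differing outcomes
    return max(0.0, score - eps) if out1 != out2 else score
```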
Definition 2.4 A new Agent Experience results after every negotiation. In
addition to the negotiation record, an experience record contains information about
the agent's attitude, the public price range, the agent's price range, as well as market
conditions. The following is an example experience:

ProductCategory => PersonalComputer


PublicPriceRange => [1000,3000]
AgentPriceRange => [1000,2000]
Attitude => [1.0,0.5,0.8]
SupplyDemandRatio => (3,1)
Negotiation => ({(1000,1500), (1100,1400), (1200,1200)}, DONE+)

This represents an agent's experience in buying a personal computer, when the


supply-to-demand ratio was 3 to 1, the agent was highly committed to buying, the
price was not much of a factor, but time was crucial. Under those circumstances, the
negotiation was successfully completed after three offers and counter offers.

Definition 2.5 An Agent's Price Range is a function of its attitude (Definition
2.1) and the public price range (Definition 2.2). There are several functions that are
plausible here. However, note that the lower the minimum of a buyer's price range,
the more time a negotiation will take. Thus, the buyer's minimum must be higher
than the public minimum if the importance of time is high for that buyer.
Moreover, when the commitment of a buyer is high the agent should be as close to
the public maximum as possible, since a high commitment indicates an urgent
need for the product. However, the importance of the price should balance the
degree of commitment (in a sense, a high commitment is a form of desperation, and
the importance of price provides a sobering effect).
A seller reasons in exactly the opposite direction. The following two functions
are one suggestion of how such reasoning might plausibly be captured, where
(m, n, k) represents an agent's attitude:

APR^buyer([pmin, pmax], (m, n, k))
    = [pmin + m(pmax − pmin)/10, pmax − pmax·n·(1 − k)/10]

APR^seller([pmin, pmax], (m, n, k))
    = [pmin + pmin·n·(1 − k)/10, pmax − m(pmax − pmin)/10]
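A direct transcription of the two functions, under our reading of them:

```python
# APR^buyer and APR^seller: map the public range [pmin, pmax] and an
# attitude (m, n, k) to the agent's own hidden price range.
def apr_buyer(pmin, pmax, m, n, k):
    return (pmin + m * (pmax - pmin) / 10,
            pmax - pmax * n * (1 - k) / 10)

def apr_seller(pmin, pmax, m, n, k):
    return (pmin + pmin * n * (1 - k) / 10,
            pmax - m * (pmax - pmin) / 10)
```

Note that a higher importance of time m raises the buyer's minimum, and a higher commitment k keeps the buyer's maximum close to the public maximum, as the definition requires.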

Definition 2.6 Supply/Demand Ratio: Part of an agent's experience is the relation of
the price to the supply/demand ratio, which is simply |S|/|B|, where S is the current
list of sellers and B is the current list of buyers of a certain product.

2.2 The Negotiation Process


A buyer b with an attitude {bm,bn,bk) and a seller s with an attitude (sm,sn,sk) enter
the marketplace. Consequently,
• b computes its price range: [bpmax,bpmi„]<-APRb"yer([pmin,pmax],{bm,bn,bk))
• s computes its price range: [spmax,spmj„]<r-APRseller(\pmin,pmax], (sm,sn,sk))
• b hides its bpmax and starts its bidding with bbid <— bpmax
• s hides its spmi„ and starts its bidding with sbid <— spmh,
• With each successive offer/counter offer buyers and sellers update their
respective biddings as follows: bbid <— (bbid + a) and sbid <— (sbid - P)
• a and p are the buyer's step increment and the seller's step decrement,
respectively. In the current model a and P are initially (pmin+ pmax)/'100.
• A negotiation is always in one of the following states: (i) DONE* if bbid > sbid;
(it) DONE" if bpmax < spmi„ and (Hi) DONE0 if (bbid < sbid)A(bp,mx> spmin).
It must also be noted that in general a buyer has more than one negotiation thread
running concurrently (one thread with every potential seller). The buyer waits for
each (negotiation) thread to return a result r e {(DONE*, pn'ce),(DONE", price)}. If

»
I

time is more important than price, the buyer exits the marketplace (terminating all
its clones) as soon as a result r = (DONE+, price) is received; otherwise, the buyer
waits for all (negotiation) threads to terminate and selects the one that found the best
deal (if any)4.
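One negotiation thread can be sketched as follows; the deal price (the seller's current bid at the moment the bids cross) and the round cap are our assumptions, not stated in the paper.

```python
# One buyer-clone/seller-clone negotiation: the buyer bids up from its
# hidden minimum, the seller down from its hidden maximum, both by the
# fixed step (pmin + pmax)/100.
def negotiate(buyer_range, seller_range, pmin, pmax, max_rounds=1000):
    (bpmin, bpmax), (spmin, spmax) = buyer_range, seller_range
    if bpmax < spmin:
        return ("DONE-", None)        # no deal is possible
    step = (pmin + pmax) / 100        # alpha = beta initially
    bbid, sbid = bpmin, spmax
    for _ in range(max_rounds):
        if bbid >= sbid:              # bids crossed: deal
            return ("DONE+", sbid)    # deal price: assumed seller's bid
        bbid = min(bbid + step, bpmax)
        sbid = max(sbid - step, spmin)
    return ("DONE0", None)            # still in progress after the cap
```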

2.3 The Ontology


Agents in the virtual marketplace have access to a domain-specific ontology of
product information. Currently, the ontology and the domain knowledge are quite
limited, and the main functionality of the ontology is to provide a public price range
for specific product categories. Using the notion of semantic distance in a semantic
network [12], we also compute a simple measure of conceptual similarity between
two products as follows5:

CS(prod1, prod2) = 1 / (dist(prod1, lub(prod1, prod2)) + dist(prod2, lub(prod1, prod2)))

where 'lub' is the least upper bound of two concepts in the ontology, and the
distance between two concepts, dist(c1, c2), is the number of isa links from c1 to c2.
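A sketch of CS over a toy isa hierarchy; the hierarchy is illustrative, and we return 1.0 for identical products, where the formula's denominator would otherwise be zero.

```python
# Conceptual similarity CS: 'lub' is the closest common ancestor in the
# isa hierarchy, and dist counts isa links to it.
def ancestors(tree, node):
    """Path from node up to the root, node first."""
    path = [node]
    while node in tree:
        node = tree[node]
        path.append(node)
    return path

def cs(tree, p1, p2):
    a1, a2 = ancestors(tree, p1), ancestors(tree, p2)
    lub = next(a for a in a1 if a in a2)  # least upper bound
    d = a1.index(lub) + a2.index(lub)     # isa links from each side
    return 1.0 if d == 0 else 1.0 / d

# toy isa hierarchy (child -> parent), illustrative only
isa = {"mouse": "peripheral", "monitor": "peripheral",
       "peripheral": "computer_product", "scanner": "imaging",
       "printer": "imaging", "imaging": "computer_product"}
```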
Computing a conceptual similarity between two products based solely on the
product category is not sufficient for our purposes. The reason for this is the
following: when buying a scanner, one might recall their experience in buying a
printer. In this case the conceptual similarity of the product categories seems to be
sufficient. However, this is a very simplistic view, since one would hardly recall
their experience in buying a (computer) mouse when one is buying a (computer)
monitor, although both are "computer products". Clearly, the price range is also
crucial. That is, our experience in buying big items with similar price ranges might
be similar even though the product categories might be different. The similarity
between two products might therefore be a function of both product and price:

PS(pr1, pr2, prod1, prod2) = CS(prod1, prod2) · (1 − |pr1 − pr2| / (pr1 + pr2))

3 Case-Based Reasoning and Learning from Experience

Regardless of the outcome of a negotiation, both buyers and sellers occasionally


save their experience in a 'case base' for future use. When dealing with a case base

4 At the moment there is no bilateral communication between the buyer's clones. Such an
extension adds considerable complexity to the model, although it does open up interesting
possibilities that we plan to explore.
5 The ontology must be considerably extended to support recommender agents [4,10].

one has to carefully consider a strategy for (i) case representation; (ii) case indexing
and retrieval; (iii) case matching; and (iv) case adaptation (see [7]). We consider
these very briefly here. A case (experience) in our model has the structure:
(ProductCategory : prod,     e.g., PersonalComputer
 ProductName : pname,        e.g., Intel PIII
 PublicPriceRange : ppr,     [pmin, pmax]
 AgentPriceRange : apr,      [bmin, bmax]
 Attitude : att,             e.g., (1.0, 0.5, 0.8)
 SupplyDemandRatio : sdr,    e.g., (3, 1)
 Negotiation : neg)          e.g., ({(1000,1500), (1100,1400), (1200,1200)}, DONE+)

Cases are indexed in the case base by the product category and the average price,
computed as (pmin + pmax)/2. When searching for "relevant" cases (or experiences),
a perfect match cannot be expected; instead the search is conducted as follows: two
lists of all cases, corresponding to failed and successful experiences, are generated.
Cases included in these lists are those that match the search criteria within a certain
threshold. When searching for a relevant experience, cases are matched as follows:

Match(c1, c2) = (1/3)(PS(prod(c1), prod(c2)) + AS(att(c1), att(c2)) + RS(ppr(c1), ppr(c2)))

Agents learn from experience in our model by using prior experiences to adjust their
attitude and the increment they make during negotiation (recall α in section 2.2)
prior to computing their [bmin, bmax] price range. Since the attitude, its
corresponding price range and the bidding increment affect the entire negotiation
process, agents will over time tend to minimize the negotiation time and the
maximum bidding price. Moreover, the proportion of successful negotiations will
increase over time, since agents also learn from negative experiences (those that
completed without making a deal).

Given an old experience (prod, pname, ppr, apr, aat, sdr, neg) and the agent's
current profile (α, aat), the adjustment is:

(α', aat') ← f_inc((α, aat))   if outcome(neg) = DONE-
(α', aat') ← f_dec((α, aat))   if outcome(neg) = DONE+

The functions f_inc and f_dec update the bidding increment and the attitude based on
previous results as follows: f_dec hardens the bidding increment and the attitude (for
every occurrence of a previous success), while f_inc loosens the bidding increment
and the attitude (for every occurrence of a previous failure). The reasoning behind
this process is to let agents find an optimal attitude/bidding practice that maximizes
the number of successes. The exact threshold by which the attitudes and the bidding
increments (decrements) are updated based on previous successes (failures) is still
being investigated, although one plausible approach is to start with a percentage of
the initial values.
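One plausible reading of f_inc/f_dec uses the percentage-of-initial-values idea with an assumed 10% rate; the rate, the clamping, and the choice to adjust only the commitment component of the attitude are our assumptions.

```python
# Sketch of the learning adjustment: harden (lower) the bidding increment
# and commitment after a success, loosen (raise) them after a failure.
def clamp(x):
    return max(0.0, min(1.0, x))

def adjust(alpha, attitude, outcome, rate=0.1):
    """Return the updated (alpha', attitude'). rate is an assumed 10%."""
    m, n, k = attitude
    if outcome == "DONE+":  # f_dec: harden after a previous success
        return alpha * (1 - rate), (m, n, clamp(k * (1 - rate)))
    # f_inc: loosen after a previous failure
    return alpha * (1 + rate), (m, n, clamp(k * (1 + rate)))
```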
A final point should be made about case updates. When a negotiation
terminates, a search and match is done on the case base. When searching for
updating or adding a new experience a match between two cases is done as follows:

f
wx xPS(prod(c\),prod(c 2 )) + w2x AS(att(cl),att(c2)) +
Match(cl,c2) = —
w3 x RSippri^),ppr(c2)) + w4 x NS(neg(Ci\neg(c2))

Currently we assign equal weights, w-„ to all attributes of a case, although we plan to
test various weighting schemes, perhaps using a machine learning experiment. If a
strong match is not found, the new case represents a "novel" experience and is
added to the case base. When a strong match occurs, the two cases are "merged"
resulting in a modification of an existing experience:

    Merge((p1, pn1, ppr1, apr1, att1, sdr1, neg1), (p2, pn2, ppr2, apr2, att2, sdr2, neg2)) =
        (lub(p1, p2), (pn1 = pn2), (ppr1 = ppr2), avg(apr1, apr2),
         avg(att1, att2), avg(sdr1, sdr2), min(neg1, neg2))
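A simplified sketch of the match and merge steps follows. The similarity functions (here only for product and attitude) and the subset of case fields are placeholders; the paper's PS, AS, RS and NS measures are domain-specific:

```python
# Simplified sketch of case matching and merging. The similarity
# functions and equal weights here are placeholders.

def case_match(c1, c2, sims, weights):
    """Weighted similarity between two cases, normalized to [0, 1]."""
    score = sum(w * sim(c1[k], c2[k])
                for (k, sim), w in zip(sims.items(), weights))
    return score / sum(weights)

def case_merge(c1, c2):
    """Merge two strongly matching cases into one updated experience."""
    return {
        "prod": c1["prod"],                   # same product class
        "apr":  (c1["apr"] + c2["apr"]) / 2,  # averaged, as in the Merge rule
        "att":  (c1["att"] + c2["att"]) / 2,  # averaged attitude
        "sdr":  (c1["sdr"] + c2["sdr"]) / 2,  # averaged, as in the Merge rule
        "neg":  min(c1["neg"], c2["neg"]),    # keep the shorter negotiation
    }
```

A weak match would instead add the new case to the case base as a "novel" experience.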

The logical connectives ∧, ∨ and ¬ can be defined in this formalism much like the
connectives of fuzzy logic (min, max and 1 − q). Implication, on the other hand, seems
to be more complicated here. In particular, it seems that p → q must be interpreted as
the degree to which the 3D space defined by p is included in the 3D space defined
by q. A similar approach (albeit on 1-dimensional intervals) was successfully used
in [9] to derive a numerical measure of implication between two predicates in a
commonsense reasoning strategy to resolve quantifier scope ambiguities. As
mentioned previously, much of this work is preliminary. However, as our
introductory example illustrates, such reasoning must be formalized if we ever hope
to "trust" software agents to buy and sell on our behalf in the highly dynamic and
uncertain environment of a marketplace.
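The fuzzy-style connectives mentioned above can be sketched directly; the degree-of-truth representation as a float in [0, 1] is an assumption for illustration:

```python
# Fuzzy-style connectives over degrees of truth in [0, 1]:
# conjunction as min, disjunction as max, negation as 1 - p.

def f_and(p, q):
    return min(p, q)

def f_or(p, q):
    return max(p, q)

def f_not(p):
    return 1.0 - p
```

Implication has no such simple pointwise definition here, which is exactly the difficulty the text raises for the 3-dimensional case.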

4 Concluding Remarks

In this paper we described a prototype of a virtual marketplace environment where


buying and selling agents that learn from experience autonomously negotiate on
behalf of their clients in a highly dynamic and changing (uncertain) environment.
The current model is still too simplistic, and to date the main focus has been on
building the appropriate infrastructure that allows us to experiment with various
reasoning models and strategies. Nevertheless we have incorporated a powerful
learning strategy and a mental state model that seem to provide the agents with a

novel problem solving behavior. What is important to note here is that in the current
model agents perform very simple commonsense reasoning. As stated in section 1,
however, our long-term goal is to develop commonsense reasoning strategies that
would account for various temporal and modal aspects and various types of
vagueness and uncertainty. In this regard we are currently developing a framework
for 3-dimensional commonsense reasoning where a statement p is situated in a 3-
dimensional space representing its degree of truth, as well as its temporal and modal
aspects. The challenge in this model is to develop the semantics of the logical
connectives, and in particular the semantics of implication.

References

1. Bradshaw, J. (Ed.), Software Agents, AAAI/MIT Press.


2. Chavez, A. and Maes, P., Kasbah: An Agent Marketplace for Buying and Selling
Goods, In Proc. of the First International Conference on the Practical Application
of Intelligent Agents and Multi-Agent Technology, (1996).
3. Esmahi, L. and Dini, P., Toward an Open Virtual Market Place for Mobile Agents,
In Proceedings of the IEEE 8th International Workshops on Enabling
Technologies: Infrastructure for Collaborative Enterprises, (1999).
4. Glover, E., et al., Recommending Web Documents Based on User Profiles, In
Proc. of ACM SIGIR'99 Workshop on Recommender Systems, (1999).
5. Jennings, N. R. and Wooldridge, M. (Eds.), Agent Technology: Foundations,
Applications, and Markets, (1998), Springer-Verlag.
6. Kolodner, J., Case-Based Reasoning, (1993), Morgan Kaufmann Pubs.
7. Kumar, M. and Feldman S.I., Business Negotiation on the Internet, IBM Institute
for Advanced Commerce, Yorktown Heights, NY, (1998).
8. Saba, W., The Resolution of Quantifier Scope Ambiguities, In Proceedings of the
11th Irish Conference on Artificial Intelligence and Cognitive Science (AICS-2000),
(2000), Galway, Ireland.
9. Saba, W. and Corriveau, J.-P., Plausible Reasoning and the Resolution of Quantifier
Scope Ambiguities, Studia Logica (special issue on commonsense reasoning), 67
(2001), pp. 1-19.
10. Saba, W. and Corriveau, J.-P., A Pragmatic Treatment of Quantification in Natural
Language, In Proceedings of the 1997 National Conference on Artificial
Intelligence (AAAI-97), (1997), pp. 610-615.
11. Sussna, M., Word sense disambiguation for free-text indexing using a massive
semantic network, In Proceedings of the second international conference on
Information and knowledge management, (1993), pp. 1-5.
12. Voss, A. and Kreifelts, T., SOAP: social agents providing people with useful
information, In Proceedings of the international ACM SIGGROUP conference on
Supporting group work: the integration challenge, (1997), pp. 16-19.
13. Wong, W. Y., Zhang, D. M. and Kara-Ali, M., Negotiating with Experience, Proc.
AAAI-2000 Workshop on Knowledge-Based Electronic Markets-KBEM'00, (2000).
MODELING USER PREFERENCES TO FACILITATE MORE AUTOMATED
AND ACCURATE TRANSACTION BROKERING WITHIN
HETEROGENEOUS MULTI-AGENT ELECTRONIC MARKETS

G. TEWARI, P. MAES, A. BERKOVICH, AND V. GABOVICH


MIT Media Laboratory
20 Ames Street, EI5-305
Cambridge, MA 02142 USA
+1 617 253 7442
{gtewari, pattie, sashaber, vyg}@media.mit.edu

We review the work we have done in architecting multi-agent electronic markets as part of the
MARI (Multi-Attribute Resource Intermediaries) research initiative within the Software Agents
group at the MIT Media Lab. Allowing human users to express their complex underlying
preferences, and using this information to find automated matches between buyer and seller agents
in electronic markets has been a dominant area of research in multi-agent e-business systems. In
this paper, we discuss the techniques we have deployed within MARI so as to model user utility
functions and broker transactions for resource allocation. Our methodology permits us to maximize
aggregate social welfare (defined as aggregate surplus) while, at the same time, allowing each
agent to find a transaction partner that is myopically optimal from its self-interested perspective.
The research brings up a variety of questions and interesting discussion issues pertaining to the key
considerations that market-makers ought to be cognizant of when architecting multi-agent e-
marketplaces.

1 The MARI Research Initiative

The MARI (Multi-Attribute Resource Intermediaries) research initiative within the


Software Agents group at the MIT Media Laboratory [4] proposes to improve online
marketplaces, specifically those that involve the buying and selling of non-tangible
goods and services.
MARI is an agent-based market-intermediary architecture intended as a generalized
platform for the specification, valuation and brokering of heterogeneous goods and
services. MARI makes it possible for both buyers and sellers alike to more holistically
and comprehensively specify relative preferences for the transaction partner, as well as
for the attributes of the product in question, making price just one of a multitude of
possible factors influencing the decision to trade. Users are expected to specify an
"ideal" configuration or offer of what they are looking for and, furthermore, are able to
explicitly specify permissible ranges of attributes over which they are willing to be
flexible. For instance, a buyer in a language translation market might ideally want a


"task completion time" (meaning the amount of time it will take the translator (seller) to
complete the translation) of 30 minutes, but could express the fact that she would be
willing to accept anything ranging between 30 to 120 minutes. Moreover, the buyer can
visually identify how her valuation might change as the time varies over this range,
which is effectively equivalent to specifying utility functions (discussed further in
Section 4).
By allowing each party to choose and implicitly associate weights with relevant
attributes from the underlying ontology, MARI makes it possible to take into account
subtle differences in the characteristics of each party, so as to facilitate a more accurate
match. MARI makes it possible for users to reveal and effectively quantify their
intrinsic utility functions for a given product or service. This, in turn, makes it
substantially easier and more transparent for participants in online marketplaces to
partake in complex and sophisticated interactions with software agents and to accurately
specify relative preferences and permissible tradeoffs within the context of a particular
product domain. Subsequently, these agents are better able to accurately identify
suitable products and trading partners on behalf of their owners, autonomously generate
"valuations" based upon the owner's revealed preferences, and ultimately negotiate the
terms of the transaction.

2 Brokering Transactions in Electronic Markets

Finding automated matches between buyer and seller agents in electronic markets has
been at the forefront of research in multi-agent e-business systems. While a number of
heuristics have been explored for brokering transactions, in the vast majority of systems
there is potentially a conflict between maximizing global welfare or surplus, versus
allowing each agent to act in self interest to optimize its individual gains.
Maximizing aggregate welfare for a collection of buyer and seller agents is not
necessarily harmonious with allowing each individual economic agent to act in self-
interest. This is equivalent to saying that a centralized entity that allocates scarce
resources, seeking to maximize the aggregate profits of a "society" of agents, may leave
some individual agents faring very poorly, essentially as a "sacrifice" to improve the
aggregate lot of society at large. However, from the perspective of the self-interested,
individualistic agent who is required to make a "sacrifice," this is hardly an attractive
outcome!
As part of the MARI [4] research initiative, we have had first-hand experience in
dealing with the problem of matching buyers and sellers in a mutually beneficial, yet
provably optimal fashion. The technique we have developed permits us to maximize
aggregate social welfare (defined as aggregate surplus) while, at the same time,

permitting each agent to find a transaction partner that is myopically optimal from its
self-interested perspective.

3 Overview of our Proposed Matching Scheme

Our matching technique involves leveraging the existing MARI infrastructure so as to


approximate user utility functions and preferences via mathematical modeling. MARI
allows users to specify an ideal offer, as well as to cite ranges of attribute values over
which they are willing to be flexible. Using this information, we can build a
mathematical model of user utility functions. This model of user preferences allows us
to predict how much a given user is willing to bid (or ask) for a given transaction
partner, as well as to gauge how "far" the transaction partner lies from the user's desired
ideal offer configuration.
Specifically, for each buyer, we look at each qualified seller to see whether a deal is
possible and, if so, what the "Optimal Transaction Configuration" (OTC) would be.
Exploring the space of possible deals is equivalent to exploring the space defined by
flexible attribute ranges, as specified by the buyer and seller. The OTC is simply a
particular configuration of attributes, or a "deal," that is consistent with the permissible
attribute ranges that both the buyer and the seller are willing to tolerate, and is the
"deal" which lies "closest" to the buyer's and seller's ideal offers.
The notion of the OTC captures the "deal" configuration that is myopically optimal
in the context of a specific buyer-seller pairing, from the self-interested perspectives of
the two buyer and seller agents involved in this pairing. Calculating the surplus (defined
as the bid-ask spread) corresponding to the OTC allows us to gauge how the "goodness"
(in terms of aggregate welfare or surplus) of this pairing might compare with that of
other pairings. Since, for any given buyer, an OTC will be computed for every qualified
seller, comparing the relative "goodness" of the various OTCs gives us a global
heuristic of surplus maximization by which we may decide which of the various
qualified sellers the buyer ought to be ultimately paired with.

4 Modeling Buyer and Seller Preferences

At this point, we briefly discuss the user interaction schema currently endorsed by
MARI [4]. Effectively, we analyze the information gathered about user
preferences so as to subsequently match transaction partners in the
marketplace.

4.1 Capturing User Preferences


Each distinct buyer or seller is represented within MARI by an agent. The "buyer agent"
embodies the buyer's revealed preferences with respect to the desired resource.
Similarly, "seller agents" embody the preferences and interests of sellers. MARI's
interaction with the user, via which the user agent is initialized, can be decomposed into
several steps [4], enumerated below. Each ontology-specific attribute has a predefined
"default" value associated with it, and the user can accept or override these defaults.

Step 1). Specifying the Ideal Offer. The user specifies an "ideal" configuration, or offer,
which consists of specific product and transaction partner attribute values, as derived
from the underlying domain ontology. The user can modify which attributes are fixed
and which are flexible and must also associate a monetary valuation ("bid" or "ask")
with this offer (referred to as pbsvalue).
The attributes of any given product can be classified as being either fixed or
flexible. A fixed attribute is one whose value, as specified by the user, is used for
transaction party qualification. By contrast, flexible attributes have associated ranges,
and are used for transaction party valuation. For instance, in the example of language
translation services (buyer's perspective), the number of words to be translated could be
a fixed attribute, while the reputation of the seller, the degree of expertise of the seller,
and the amount of time within which the translation will be completed could be flexible
attributes.

Step 2). Gathering Ranges for Flexible Attributes: Having specified which attributes
ought to be considered flexible and which ought to be fixed, a user must also associate a
permissible range of values with each flexible attribute. Flexible attributes essentially
embody the tradeoffs that a given user is willing to make. Associated utility functions
(discussed in section 4.2) define how the user's valuation changes as flexible attributes
vary over their permissible ranges.

4.2 Modeling User Utility Functions


Since MARI is fundamentally an infrastructural tool, we expect that each "market
maker" will create an instantiation of MARI specific to their product domain. When
instantiating MARI, the market maker must specify the product ontology as well as a set
of parameters which determine how user utility functions are modeled.
For each flexible attribute, the aim is to gather sufficient information from the user
so as to be able to accurately infer how her (uni-dimensional) utility might change as the
flexible attribute varies over its permissible range, while all other attributes are held at

their ideal (offer) values. As such, we require the user to specify the range of
permissible valuations, referred to as maxvalue and minvalue, associated with the
flexible attribute being held at the high and low endpoints of its permissible range,
respectively, while all other attributes are held fixed at their optimal or offer values.
Doing so enables us to accurately assess how the user would value product offerings and
transaction partners that have not been explicitly seen or "rated" before. Based upon the
market maker's configuration parameters, MARI models the user's utility function as
follows:

Step 1). Visually Selecting Utility Functions: When first instantiating MARI, the market
maker is required to visually associate a generic (pre-defined) mathematical function
with each flexible attribute [2] (see Figure 1). Of course, users have the option of being
able to override these "default" values during the offer specification process.

[Figure 1: a screenshot of the interface in which the market maker associates
predefined utility-function shapes with flexible attributes such as seller
reputation and task completion time, each over its permissible range.]

Figure 1. Visually Associating Utility Functions with Flexible Attributes



Step 2). Quantifying Utility Functions: Using the generalized equation form of the utility
function, in conjunction with the pbsvalue, maxvalue, and minvalue parameters
specified by a given user, MARI is able to compute a mathematical approximation to
the utility function corresponding to each flexible attribute [4]. The polynomial used to
represent the function is usually a quadratic.
For example, let us assume that a given buyer is willing to accept a "seller
reputation" ranging from 6 to 10. Assume that in her referential offer the buyer specifies
a preferred value of 6. Further, say the market maker has pre-associated UF2 (see Figure
1) with this flexible attribute as it varies over its range - the choice of this utility
function would reflect the fact that the buyer is willing to bid higher as the seller's
reputation increases, and that her valuation increases exponentially as reputation
approaches the maximum possible. In this case we can derive the equation which
captures the change in the buyer's utility as reputation varies, as:

    UF2(x) = ((maxvalue − pbsvalue) / (x_hi − x_low)²) · x²
           − (2 (maxvalue − pbsvalue) · x_low / (x_hi − x_low)²) · x
           + pbsvalue + ((maxvalue − pbsvalue) · x_low² / (x_hi − x_low)²)

Where:
x_low = the value of the attribute specified in the referential offer (i.e. 6);
x_hi = high endpoint of the permissible range (i.e. 10).
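The derived curve can be written compactly as pbsvalue + (maxvalue − pbsvalue)·((x − x_low)/(x_hi − x_low))², which the following sketch implements; the function name and signature are illustrative:

```python
# Sketch of the quadratic utility curve UF2 from the example above:
# the utility equals pbsvalue at x_low and rises to maxvalue at x_hi,
# growing faster as the attribute approaches its maximum.

def uf2(x, pbsvalue, maxvalue, x_low, x_hi):
    t = (x - x_low) / (x_hi - x_low)      # normalized position in the range
    return pbsvalue + (maxvalue - pbsvalue) * t ** 2
```

For the seller-reputation example, uf2(6, 80, 100, 6, 10) returns the bid at the referential offer and uf2(10, 80, 100, 6, 10) the bid at the maximum reputation.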

5 Matching Buyers and Sellers

5.1 Generating Data Points from Functional Approximations


A mathematical model of preferences allows us to predict how much a given user would
be willing to bid (or ask) for a given transaction partner. We begin by generating
additional data points, using the seed data provided to us by the user. In order to
formulate a functional generalization for a user's multi-dimensional utility function, we
use a quadratic least squares approximation technique [5].
To do so, we first generate an mxn matrix, A, for every user. Every row of A
corresponds to a hypothetical configuration of attributes, or market "bundles," derived
using the user's ideal offer in conjunction with flexible attribute ranges. For example,

let's consider a "Professional Buyer" B1*. Let's assume B1 possesses a reputation of 10


and, ideally (offer configuration), is willing to pay $100 for a seller who can translate
4000 words in 30 minutes and who has a reputation of 10 and an expertise of 5. Further,
let's say that B1 is willing to be flexible on translation time (30-120 minutes) and
reputation (5-10), and associates utility functions UF5 and UF2 (see Figure 1) with these
flexible attributes, respectively. B1's "profile" can be summarized as:

Table 1. Profile of a "Professional Buyer" B1
(B1: Reputation = 10, Offer_Bid = $100)

  Attribute           Optimal (Offer)   Permissible   Utility Function   Bid Range
                      Value             Range         (see Figure 1)
  Words               4000              Fixed         N/A                N/A
  Seller Expertise    5                 Fixed         N/A                N/A
  Seller Reputation   10                5-10          UF2                $80-$100
  Time (mins)         30                30-120        UF5                $100-$65

The "bid range" in Table 1 is deduced by explicitly asking the user a sequence of
questions. For instance, for the time attribute, the $65 figure is obtained by asking the
buyer how much she would be willing to pay if translation time were to equal 120
minutes, while all other attributes are held fixed at their optimal values. In other words,
if <4000, 5, 10, 30> is the buyer's offer bundle, we ask the buyer to "bid" on a bundle of
the form <4000, 5, 10, 120>. With this information, in conjunction with the fact that we
know that UF2 (see Figure 1) is a mathematical approximation to how the buyer's
valuation changes from $100 to $65 as time ranges from 30 to 120 minutes, we can
discretize the [30, 120] range to automatically generate data points of the form <4000, 5,
10, [30...120]> and corresponding bids. Doing so for every flexible attribute allows us
to generate the matrix A, and a corresponding bid vector b. A will thus be an mx4 matrix,
where the exact value of m is configurable, depending on how many data points we
generate.
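A sketch of this data-point generation for B1's time attribute follows. The utility curve is assumed linear here purely for brevity; in the system it would be the market maker's chosen curve (e.g. UF5):

```python
# Sketch: discretize one flexible attribute's range to generate rows of A
# and entries of b. A linear interpolation between the endpoint bids
# stands in for the actual utility curve.

def generate_points(offer, flex_attr, lo, hi, bid_at_lo, bid_at_hi, steps=10):
    rows, bids = [], []
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        bundle = dict(offer)
        bundle[flex_attr] = x                  # vary only this attribute
        rows.append(bundle)
        bids.append(bid_at_lo + (bid_at_hi - bid_at_lo) * k / steps)
    return rows, bids

# B1's ideal offer bundle <4000, 5, 10, 30> with time flexible over [30, 120]
offer = {"words": 4000, "expertise": 5, "reputation": 10, "time": 30}
rows, bids = generate_points(offer, "time", 30, 120, 100.0, 65.0)
```

Repeating this for each flexible attribute yields the matrix A and bid vector b described above.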
Having thus generated A and b, the task is to model the underlying mathematical
function that maps each row in A to each row in b. We can do so using least squares data
fitting [5] - a well known technique in linear algebra, where the problem is to solve an
over-determined system of equations Ax = b, so as to deduce the vector x that maps each
row of A to the corresponding entry in b. The vector x can be interpreted as a set of
coefficients or "weights," which effectively defines a function that can be used to map a

* We could just as well have considered a hypothetical seller; the treatment is
symmetric.

configuration of attributes to a bid, in a manner consistent with the user's revealed


preferences.
In essence, the least squares technique attempts to "solve" the Ax = b system by
minimizing the residual b − Ax. More precisely:

    Given A ∈ C^(m×n), m > n, and b ∈ C^m,
    find x ∈ C^n such that ||b − Ax||₂ is minimized.        (2)

Here ||b − Ax||₂ denotes the 2-norm [5], which corresponds to Euclidean distance.
As such, the geometric interpretation of (2) is that we seek a vector x ∈ C^n such that
the vector Ax ∈ C^m is the closest point in range(A) to b.
Once we determine x using our automatically generated data points, we can
effectively automatically estimate the buyer's bids for various sellers in the market.
Each differentiated seller offering simply represents a combination of attributes, of the
same form as a row of A. Since x approximates the mapping from each row of A to a
bid, the process of "valuating" a given seller is simply a matter of "scaling" each seller
attribute by the corresponding entry in x.
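This fit-and-valuate step can be sketched with NumPy's least-squares solver; the bid data below is illustrative, not taken from the system:

```python
# Sketch: fit the weight vector x from Ax ≈ b, then estimate a bid for an
# unseen seller bundle. Rows are <words, expertise, reputation, time>.
import numpy as np

A = np.array([[4000, 5, 10, 30],
              [4000, 5, 10, 60],
              [4000, 5, 10, 90],
              [4000, 5, 10, 120],
              [4000, 5,  8, 30],
              [4000, 5,  6, 30]], dtype=float)
b = np.array([100.0, 88.0, 76.0, 65.0, 92.0, 80.0])   # illustrative bids

x, *_ = np.linalg.lstsq(A, b, rcond=None)             # minimizes ||b - Ax||_2

seller = np.array([4000, 5, 9, 45], dtype=float)      # an unseen offering
estimated_bid = seller @ x                            # "scale" each attribute by x
```

Note that lstsq returns the minimum-norm solution even when columns of A are linearly dependent (as the constant words and expertise columns are here), so predictions for bundles sharing those fixed values remain well defined.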

5.2 Exploring the Attribute Space to identify the Optimal Transaction


Configuration (OTC)
Subsequently, for each buyer-seller pair in the market, we explore the space of potential
transactions or "deals" that can take place. Exploring the space of possible deals is
equivalent to exploring the space defined by flexible attribute ranges, as specified by the
buyer and seller. For instance, let's consider a "Professional Seller," S1, whose "profile"
can be summarized as (analogous to Table 1):

Table 2. Profile of a "Professional Seller" S1
(S1: Reputation = 10, Expertise = 5, Offer_Ask = $100)

  Attribute           Optimal (Offer)   Permissible   Utility Function   Ask Range
                      Value             Range         (see Figure 1)
  Words               4000              4000-6000     UF1                $100-$150
  Buyer Reputation    10                Fixed         N/A                N/A
  Time (mins)         60                30-120        UF6                $125-$85

Comparing S1 with B1, clearly we see that any attribute (such as 'time') that is a
flexible attribute for both parties, with overlapping ranges, is a candidate for

"negotiation." A possible "deal" between S1 and B1 could involve any value of 'time'
between 30 and 120. However, since B1 and S1 each have different optimal values (30
and 60 minutes respectively), a point in the interval [30, 120] will not be equally
desirable from the perspective of both parties. Our goal is to delineate the particular
"deal" (Optimal Transaction Configuration or OTC) that is myopically optimal from the
self-interested perspectives of the buyer and seller agents involved in the pairing. As
such, we use a distance function to assess how any given "deal" might deviate from the
buyer's and seller's optimal offer "bundles:"

    Distance(d) = Δ = Σᵢ [ (dᵢ − Bᵢ°)² + (dᵢ − Sᵢ°)² ]

Where:
i ranges over all flexible attributes;
dᵢ is the value of attribute i corresponding to the specific deal, d, under consideration;
Bᵢ° is the value of attribute i corresponding to the Buyer's optimal offer;
Sᵢ° is the value of attribute i corresponding to the Seller's optimal offer.

For each buyer-seller pair in the marketplace, we can enumerate a set of possible
"deals" that can be brokered. By searching over the (discretized) underlying attribute
space, corresponding to differentiated buyer and seller offerings, we can identify a
particular "deal" that is suitable from the perspective of both transaction partners, in the
sense that it lies close to both of their "optimal" configurations. By searching over the
whole attribute space, in conjunction with our model of user preferences, we effectively
explore permissible tradeoffs that users' are willing to make, and integratively negotiate
over the holistic product offering to identify a "transaction configuration" that is suitable
for both the buyer as well as the seller.
From the set of all possible "deals" for which surplus is non-negative (i.e. bid is
greater than or equal to the ask) we identify the particular deal for which Δ is
minimized, and call that the Optimal Transaction Configuration (OTC). Since, for any
given buyer, an OTC will be computed for every qualified seller, comparing the bid-ask
spread (surplus) corresponding to each OTC gives us a global heuristic of surplus
maximization by which we may decide which of the various qualified sellers the buyer
ought to be ultimately paired with.
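A one-attribute sketch of the OTC search follows. The linear bid and ask curves are made-up stand-ins for the fitted valuation functions, and the grid search is a simplification of searching the full discretized attribute space:

```python
# Sketch of the OTC search for one buyer-seller pair over one flexible
# attribute: keep deals with non-negative surplus, pick the one minimizing
# the squared distance to both ideal offers.

def find_otc(lo, hi, buyer_ideal, seller_ideal, buyer_bid, seller_ask, steps=90):
    best = None
    for k in range(steps + 1):
        d = lo + (hi - lo) * k / steps
        surplus = buyer_bid(d) - seller_ask(d)
        if surplus < 0:
            continue                              # no viable deal here
        dist = (d - buyer_ideal) ** 2 + (d - seller_ideal) ** 2
        if best is None or dist < best[1]:
            best = (d, dist, surplus)
    return best                                   # (deal, distance, surplus)

# Hypothetical time attribute: buyer ideal 30, seller ideal 60, with
# illustrative linear valuation curves over [30, 120].
otc = find_otc(30, 120,
               buyer_ideal=30, seller_ideal=60,
               buyer_bid=lambda t: 100 - (t - 30) * 35 / 90,   # $100 -> $65
               seller_ask=lambda t: 90 - (t - 30) / 3)         # $90  -> $60
```

With these curves every point of the range is viable, and the search settles on the time value midway between the two ideals, since that minimizes the joint squared distance.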

5.3 Delineating Transaction Partners to Maximize Aggregate Surplus


Finally, having found an OTC corresponding to each possible buyer-seller pair in the
market, we use a heuristic of aggregate surplus maximization so as to delineate
transaction pairs.
In essence, for each buyer, we evaluate the "cost" that would be incurred if the
buyer were to engage in a transaction with any of the qualified sellers. Currently, we
take this "cost" to be equal to the OTC "bid-ask spread," which can be interpreted as the
aggregate surplus [3, 6] that the two parties would derive if the transaction were to take
place. We use this metric of "cost" since our indicator of the "goodness" of an allocation
is welfare, which, in this case, is measured by the surplus that the allocation generates.
Subsequently, we can conveniently formulate the problem of optimally pairing up
buyers and sellers as a generalized "min-cost flow" problem, as encountered in network
optimization [1]. With this formulation the matching problem can be solved using a
linear program [4, 1], as is currently done in MARI. This linear program can be
executed to identify buyer-seller pairings for which the aggregate surplus of transaction
parties is globally maximized. The "clearing price" for any given transaction pair is, by
default, set at the midpoint between the original bid and ask prices, thereby equally
dividing the surplus between the buyer and the seller. The market maker can, however,
modify this distribution of surplus, choosing to retain the bid-ask spread as operating
profit for instance. As a consequence of surplus-maximization, the final pairing can be
shown to be Pareto optimal [6].
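The pairing step itself is a min-cost flow / linear program in MARI; for a toy market, the surplus-maximizing assignment can simply be brute-forced, as in this sketch with made-up surplus values:

```python
# Illustrative pairing step: brute-force the buyer->seller assignment that
# maximizes aggregate OTC surplus. The real system solves this as a
# min-cost flow / linear program.
from itertools import permutations

# surplus[i][j]: OTC surplus if buyer i trades with seller j
# (None means no viable OTC exists for that pair).
surplus = [[5.0, 9.0, None],
           [8.0, 2.0, 4.0],
           [None, 7.0, 6.0]]

def best_pairing(surplus):
    n = len(surplus)
    best, best_total = None, float("-inf")
    for perm in permutations(range(n)):           # perm[i] = seller for buyer i
        if any(surplus[i][j] is None for i, j in enumerate(perm)):
            continue
        total = sum(surplus[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best, best_total = perm, total
    return best, best_total

pairing, total = best_pairing(surplus)
```

Splitting each pair's surplus at the midpoint of the bid and ask then yields the default clearing price described above.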

6 Bibliography

1. Ahuja R. K., Magnanti T. L. and Orlin J. B. Network Flows: Theory, Applications


and Algorithms (Prentice-Hall, New Jersey, 1993).
2. Guttman, R. Merchant Differentiation through Integrative Negotiation in Agent-
mediated Electronic Commerce (Masters Thesis, MIT Media Laboratory, Cambridge,
MA, September 1998).
3. Pindyck R. and Rubinfeld D. Microeconomics (Prentice-Hall, New Jersey, 1997).
4. Tewari G. and Maes P. Design and Implementation of an Agent-Based
Intermediary Infrastructure for Electronic Markets. In
Proceedings of ACM Conference on Electronic Commerce (Minneapolis, MN,
USA, 17-20 October 2000).
5. Trefethen L. and Bau D. Numerical Linear Algebra (Society for Industrial and
Applied Mathematics, Philadelphia, PA, 1997).
6. Varian H. Intermediate Microeconomics (W. W. Norton & Co., New York, 1999).
ATTITUDE BASED AGENTS IN E-COMMERCE APPLICATIONS

S. AU AND N. PARAMESWARAN
School of Computer Science and Engineering,
The University of New South Wales
Sydney 2052 Australia
E-mail: (sherlock, paramesh)@cse.unsw.edu.au

In this paper, we propose attitude based behaviours for agents in an E-commerce domain.
Often, agents operating in an E-commerce application have to achieve multiple goals
concurrently requiring different strategies. In order to be able to perform these behaviours,
agents need meta-level controls known as attitudes to guide them towards selecting the proper
actions for any particular goal. We argue that agents in an E-commerce environment are best
guided by attitude based behaviours. In this paper, we investigate the role of attitudes in
problem solving in the world of E-commerce, and suggest several attitudes. We then evaluate
and compare the performance of agents employing different attitudes.

1 Introduction

Over the last five years, the Internet has redefined business. The Internet has erased
traditional boundaries of time and geography, creating a virtual community of
customers and suppliers: the E-commerce domain. Typically, in an E-commerce
application, humans buy and sell items and negotiate prices. One of the major
problems for consumers in a large-scale E-commerce world is that there is an
overwhelming amount of content and possibilities which the human consumers have
to manage in order to make the best deal. It is in this situation that AI technology can
offer assistance.
In this paper, we study an agent which can act on behalf of a human in the
trading process. The agent takes instructions from the human user who specifies the
criteria of buying and selling several items and the attitudes it must hold towards the
specific items it needs to trade.
An attitude is a mental attribute that guides the agent's behaviour in dynamic
situations. Earlier, we have successfully demonstrated the use of attitudes in problem
solving in a fire world domain [1]. In this paper, we apply this strategy to E-
commerce applications.
Most people think E-commerce means online shopping. In fact, E-commerce
refers to any transaction that is conducted using electronic means. Some of the more
well known E-commerce systems are Electronic Data Interchange (EDI) [2] which
works by providing a collection of standard messages for businesses to exchange
data, and KASBAH [3] which employs agents to perform negotiation and settlement
of deals automatically.


2 Characteristics of Attitudes in E-commerce

Fishbein [4] defined an attitude as a built-in predisposition to respond in a


consistently favourable or unfavourable manner with respect to a given object. We
adopt this definition as the basis for our proposed attitude. This concept of attitude
has been successfully applied in a hostile dynamic multiagent world [1]. When
applied in an E-commerce application, some of these characteristics remain and we
discuss them below.
Persistence: attitudes, once adopted, must persist for a reasonable period of time.
The extent to which the agent should persist with the current attitude in dynamic
worlds is specified for each different attitude.
Behaviour: an agent has to exhibit a behaviour considered appropriate for an
adopted attitude. This behaviour must include responses to all unexpected events
and failures of activities. In addition, the behaviour must have an overall consistency
over the period of time during which the agent is holding that attitude.
The attitude-based negotiation methodology differs from traditional
negotiation strategies. In this approach, behaviours performed by the agent are
attitude driven. Once a task is specified by the client, the agent will attach an attitude
towards the task. The agent will then exhibit specific behaviour towards the task. As
an agent can concurrently execute multiple tasks and each task can be attached with
different attitudes, the behaviour of the agent becomes very dynamic and
unpredictable. A description of this problem solving methodology is given below.

1. A human client specifies a task to an agent with a desired attitude.


2. The agent will constantly monitor the E-commerce domain for any
opportunity of completing the task.
3. When an event occurs in the world which represents a chance of achieving
the task, the agent will generate goals appropriate for the attitude.
4. Abstract plans are then devised.
5. Abstract plans are executed by decomposing complex actions into primitive
actions. Primitive actions are executed generating ultimately the desired
behaviour.
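The five steps above can be sketched as a simple event loop. All class and method names below are illustrative assumptions, and steps 4-5 (plan devising and decomposition) are abstracted away:

```python
# Hypothetical sketch of the attitude-driven methodology above.
# Task/Agent/attach/on_event are invented names, not the paper's API.

class Task:
    def __init__(self, item, attitude):
        self.item = item          # step 1: task specified by the client
        self.attitude = attitude  # together with a desired attitude

class Agent:
    def __init__(self):
        self.tasks = []

    def attach(self, item, attitude):
        self.tasks.append(Task(item, attitude))

    def on_event(self, event):
        """Steps 2-5: monitor the market and react to opportunities."""
        goals = []
        for task in self.tasks:
            if event.get("item") == task.item:    # step 3: opportunity found
                goals.append((task.attitude, task.item))  # attitude-specific goal
        return goals                              # steps 4-5 abstracted away

agent = Agent()
agent.attach("chair", "Greedy")   # two concurrent tasks with
agent.attach("lamp", "Careful")   # different attitudes
print(agent.on_event({"item": "chair"}))  # -> [('Greedy', 'chair')]
```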

3 Attitudes for E-commerce

Obtaining items as inexpensively as possible is the major objective in any commerce
environment. Therefore, providing a profit-oriented agent for the E-commerce
environment is the paramount motivation. However, we believe that in a real, complex
E-commerce environment, purely profit-oriented behaviour alone might not be in the
best interest of the agent, and in turn of the human client that the agent represents.
For example, for an item that is highly desired by many agents, it is inappropriate

to try to negotiate for a lower price as the item would be obtained by another agent
and the agent in concern will never have the chance of getting it.
Humans usually use a variety of strategies to negotiate for different items. We
would like to reflect that in our agents. Thus, we propose several behaviours, guided
by attitudes, to deal with various situations. These attitudes are Careful, Easy,
Desperate, Risky, Normal, Greedy, Opportunistic and Methodical. As space
precludes the listing of all attitudes, we will only discuss two of them.

Greedy:
Buying behaviour: An agent that attaches the greedy attitude to an item will try to
obtain the item at as low a price as possible. An agent adopting this attitude will
not participate in any "bidding war". Instead, it will wait patiently for the desired
item to come up in the market without other agents competing for it, and then
counteroffer with a lower price.
Selling behaviour: An agent that attaches the greedy attitude to an item will try to
sell the item at as high a price as possible. When selling, an agent adopting this attitude
will always counteroffer for a higher price. The agent will also wait for more
agents to offer prices, starting a "bidding war", before actually deciding to sell the
item.
Conservative:
Buying behaviour: An agent buying items with a conservative attitude will pick the
cheapest offer in the market to negotiate. If the seller agent is willing to sell the item
at that price, then a deal is made. However, if the seller agent is waiting for more
offers and a "bidding war" is started, the buyer agent will withdraw itself from
getting the item and will look for another offer.
Selling behaviour: When putting up an item in the market, the agent will set it with a
medium price tag. The agent is willing to sell the item at that price and will not ask
for a higher price when a buyer agent is ready to make a deal.
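The two buying behaviours can be contrasted in a small sketch. The counteroffer fraction and the withdrawal test below are assumptions for illustration, not the paper's actual rules:

```python
# Illustrative contrast of the greedy and conservative buying behaviours
# described above; the 0.75 lowball factor is an invented assumption.

def buy_response(attitude, asking_price, competitors):
    """Return a counteroffer price, the asking price itself, or None (withdraw)."""
    if attitude == "greedy":
        if competitors > 0:
            return None                 # never join a "bidding war"
        return asking_price * 0.75      # patiently counteroffer a lower price
    if attitude == "conservative":
        if competitors > 0:
            return None                 # withdraw once a bidding war starts
        return asking_price             # accept the cheapest offer as-is
    raise ValueError("unknown attitude")

print(buy_response("greedy", 100, competitors=0))        # 75.0
print(buy_response("conservative", 100, competitors=2))  # None
```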

4 Implementation and Performance

Our chosen market comprises university students who trade items in an
E-commerce environment. In this domain, students typically engage in small-scale
transactions which can go over relatively long periods of time (days and months).
The marketplace is a decentralised system with multiple simultaneous negotiations
at any one time.
We implemented an E-commerce trading system. It includes a society of
autonomous agents which function over a long period of time, each representing
one human client; a decentralised marketplace which contains information about the
available items; a server which coordinates agent communication; and a web
interface for the client.

The entire system is implemented in Java and the Web interface is coded with
servlets, JavaScript and HTML. Each agent runs as a separate process, started by the
server. The marketplace is implemented as an Object-Oriented database. Database
queries can be initiated by the Agent, the Server or the Web Servlets.
Our initial studies on attitudes in E-commerce applications show how an agent
adopting a particular attitude performs in a given society. Four agents were given
tasks to trade items attached with different attitudes. The agents in the system can
choose to adopt one of the eight different attitudes towards its tasks (see Section 3).
In our implementation, these attitudes are represented by rules. The finer details
of the rules that achieve these attitudes are too lengthy for this
paper.
In our set of experiments, we employed three Control agents, each adopting a
fixed attitude, and a fourth Variable agent adopting a different attitude in each test
run. The Control agents are all selling chairs, of which the Variable agent wishes to
buy one. In addition, the Variable agent is selling a lamp which the Control agents are
competing to buy. All agents negotiate in the range of twenty to a hundred
dollars. Each test run is repeated a number of times and the average results of the
experiment are shown by the final agreed price.
In experiment A, the three Control agents adopt the attitudes Greedy,
Methodical and Desperate respectively (a random environment). In the second
experiment B, the Control agents adopt attitude Careful, Easy and Opportunistic (a
random environment). In experiment C, the three Control agents adopt attitudes
Opportunistic, Greedy and Methodical (a tough environment) and, lastly,
experiment D is a weak environment where the three Control agents adopt the attitudes
Careful, Easy and Desperate. Figure 1 illustrates the performance of the Variable
agent. Each column represents a particular experiment and final agreed bid prices are
shown in the rows.

                 Seller's Performance ($)       Buyer's Performance ($)
Experiment        A      B      C      D         A      B      C      D
Careful         61.40  60.10  59.41  61.15     58.90  59.23  61.08  58.89
Desperate       60.65  60.33  58.22  60.32     60.33  60.73  62.37  59.84
Easy            60.40  60.74  60.76  60.54     60.01  59.81  62.17  59.84
Greedy          61.82  61.08  59.40  62.12     59.32  60.38  Failed 58.13
Methodical      62.73  62.36  60.66  62.21     57.70  59.21  60.27  57.53
Normal          61.32  61.48  58.73  60.92     57.78  60.26  60.76  59.30
Opportunistic   61.96  61.49  60.50  62.24     57.81  59.22  60.10  57.85
Risky           58.66  61.10  59.06  60.99     61.38  58.91  61.12  58.92
Figure 1. Performance of an Attitude Based Agent.

From the results in Figure 1, we observe that, in general, agents adopting less
aggressive attitudes towards the item finalised deals at lower prices, the
intermediately aggressive agents fared better, and the most aggressive agents usually
closed very good deals indeed.
The society that forms the agent environment also has an impact on the deals,
since the results indicate that the tougher the environment, the poorer the buyer
agents perform. Therefore, it would appear that in a strong market, with plenty of
competition, a tougher agent would be an ideal choice. However, if the market is
weak, then weaker agents will at least close deals by attracting negotiations with
their willingness to finalise at poorer prices for themselves. This can be seen in the
random societies A and B, where the Variable agent "targets" the "weakest" Control
agent to finalise a deal. The results also demonstrate the impact of the tough and
weak market environments (C and D) on the Variable agent. Again, the final agreed
price is the evaluating factor, and it can be seen how the Variable agent took
advantage of the weaker Control agents or was taken advantage of by the stronger
Control agents.

5 Conclusion

In AI, the term attitude has been used to denote different concepts. For example,
Pollack [5] refers to plans as mental attitudes. In this paper, we have viewed an attitude
towards an object as a mechanism which generates an appropriate meta-level
behaviour with regard to that object. The role of attitudes is to specify this extra
component in the agent's behaviour so that agents can exhibit appropriate overall
behaviour in a given situation.
The application of attitudes has proved valuable in a fire domain
[1]. As can be seen, this concept has given encouraging results when extended to an
E-commerce application, and it supports many more interesting features such as allied-
agent negotiations, grouping of items and variable autonomy. By using the concept
of attitudes, the agent can thus deal with multiple goals and plan ahead while
remaining extremely flexible to changes in the dynamic environment.

References
1. S. Au and N. Parameswaran, "Attitudes for Agents in Dynamic Worlds", The 14th
International Florida AI Research Society (FLAIRS) Conference, 21-23 May 2001.
2. R. A. Stultz and M. Busby, Demystifying EDI, Wordware Publishing.
3. A. Chavez and P. Maes, "KASBAH: An Agent Marketplace for
Buying and Selling Goods", Proceedings of PAAM'96, 1996.
4. M. Fishbein and I. Ajzen, Belief, Attitude, Intention and Behaviour: An
Introduction to Theory and Research, Addison-Wesley Publishing Company, 1975.
5. M. E. Pollack, "Plans as Complex Mental Attitudes", Intentions in
Communication, MIT Press, Cambridge, MA, 1990.
ORGANIZING INTERNET AGENTS ACCORDING TO A
HIERARCHY OF INFORMATION DOMAINS

SYLVIE CAZALENS AND PHILIPPE LAMARRE
IRIN
2, rue de la Houssiniere BP 92208
F44322 Nantes Cedex 3
France
E-mail: {Cazalens, Lamarre}@irin.univ-nantes.fr

This paper focuses on the formal representation of an organization where the
agents are situated with respect to a hierarchy of information domains. The
representation is based on the definition of two structures: a structure of agents is
mapped on to a structure of information domains.

1 Introduction

Obviously, the Internet represents a distributed framework in which, in addition,
information and service sources are numerous, heterogeneous and possibly
concurrent. In order to search for relevant information or services, a widely
accepted view is to consider three basic types of agents: requesters, providers
and middle agents, which mediate between requesters and providers [1,2].
Approaches using a single middle agent, like a broker or a matchmaker, have
shown difficulties in scaling up [3]. In this paper, we are interested in applications
where middle agents and providers can be numerous. In addition, we assume
that a reasonable way to structure them is to situate them with respect to
a hierarchy of information domains. This improves the searching process by
only requesting the sources of relevant domains.
The goal of this paper is to explore the formal representation of such an
organization, called Bonom(a). As already noticed [4], the case of Internet multi-
agent systems or organizations is somewhat special. Anybody may want to
build an agent and introduce it within the system. This is why it is of no
use to describe the internal structure of the agents. In the same way, roles
or at least functionalities may vary slightly from one agent to another. This
is why we do not attempt to exhaustively describe all the roles and possible
interactions between the agents. Rather, we focus on the features which are
clearly involved in characterizing the organization. A complete work should
consider both the statics and dynamics of the organization [5]. This paper

(a) Bonom is not an acronym, just common use among the people involved in the project.


focuses on the statics of the organization. The questions we address are the
following: "Which roles and interactions should be considered?", "At which
level should the information domains be introduced?" and "What are the
links between domains and groups of agents?".
We have chosen to separate the structure of domains and the structure
of agents, mapping one on to the other to define a Bonom organization. The
description is in three steps, starting with the structure of domains, then the
structure of agents (which looks like a simplified version of a "population-
organization structure" [5]) and finally the mapping between the two structures.

2 The structure of information domains

The term "information domain" is intentionally general. We could as well talk
about topics or categories, like for instance /leisure/sports/basketball.
We consider a tree-like structure.
Definition 1 A structure of domains DS is a triple (D, Root, >) s.t.:

• D is a non empty set of information domains containing Root.

• > is a relation on D, which defines a connected acyclic directed graph on
D with a single root noted Root. It is anti-transitive.

d > d' should be read: d' is more specific than domain d. Root corresponds
to the most general domain.
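Definition 1 admits a very small executable encoding. The sketch below (hypothetical, not from the paper) stores the anti-transitive > relation as a child-to-parent edge map and tests its transitive closure, i.e. whether one domain is an ancestor of another; the domain names come from the /leisure/sports/basketball example above:

```python
# Toy encoding of Definition 1: only the direct edges of > are stored
# (child -> parent), matching the anti-transitive edge relation.

PARENT = {
    "leisure": "Root",
    "sports": "leisure",
    "basketball": "sports",
}

def more_general(d, d2):
    """True iff d2 is strictly more specific than d, i.e. d is an
    ancestor of d2 under the transitive closure of the > relation."""
    while d2 in PARENT:
        d2 = PARENT[d2]
        if d2 == d:
            return True
    return False

print(more_general("leisure", "basketball"))  # True
print(more_general("basketball", "leisure"))  # False
```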

3 The structure of agents

We sort the roles out into two "super-roles" called "Worker" and "BaGate"(b).
In the case of an Internet application, workers may be information or service
providers but also query re-formulators, translators, etc. BaGates are specific
middle agents [1]. Notice that end-user agents, which will send queries to the
organization, are considered as external to the system.
Among the possible interactions between these two "super-roles", we rep-
resent two of them: the client and the brother relations. Workers or BaGates
can interact with BaGates as clients. They ask for users' queries, describing
the types of queries they want, and specifying an interval of time during which
the queries have to be sent. The brother relation, only defined on BaGates,
is an acquaintance relation. It represents the fact that one agent knows the
(b) The name BaGate comes from the two words "Bag" and "Gate", which refer to the roles
that the BaGates play with respect to requests.

other one and that it is able to initiate an interaction with it, even if it does
not necessarily do so. Interaction may occur to have a users' query treated or
to maintain the structure.
We emphasize the fact that the brother relation represents a potential
interaction while the client relation represents an actual interaction. Indeed,
at some particular time, the worker only asks some specific BaGates for users'
query. This is what the client relation represents. On the other hand, the
brother relation represents a possible interaction initiated by one of the two
agents.
Definition 2 A structure of agents AS is a tuple (B, W, →, ↪) s.t.:

• B is a non empty set of BaGates.

• W is a set of workers.

• → is a relation on B: b1 → b2 iff b1 knows that b2 is its brother. For
technical reasons, this relation is reflexive.

• ↪ is a relation on B ∪ W and B expressing a notion of actual client-hood.
a ↪ b means "a is a client of b" or "b is a provider for a". For both
intuitive and technical reasons, the relation is acyclic.

If the structure were ideal, the → relation would be an equivalence
relation. This would mean that all the brothers would know each other. This
hypothesis would be too strong to model a system deployed over the Internet
where, for example, network problems may occur. The notation ⇒ is
introduced to represent this ideal case. It corresponds to the transitive and
symmetric closure of the relation →.
Groups are defined as particular structures where BaGates ideally know
each other.
Definition 3 Let AS = (B, W, →, ↪) be a structure of agents. G =
(B_G, W_G, →_G, ↪_G) is a group of AS iff:

• G is a structure s.t.: B_G ⊆ B, W_G ⊆ W, →_G = →|B_G, and
↪_G = ↪|(B_G ∪ W_G)

• ∀(b, b') ∈ (B_G)², b ⇒ b'
BaGates of B_G are all brothers and ideally can know each other.

• ∀(b, b') ∈ (B_G × B), if b ⇒ b' then b' ∈ B_G.
B_G is closed under the ideal brother relation.

• W_G = {w | ∃b ∈ B_G, w ↪ b}
A worker is present iff it is a client of some BaGate of B_G.
The set of groups that can be defined on the structure of agents AS is
noted G(AS). A client relation can be defined on this set.
Definition 4 Let G1 and G2 be two groups of the structure of agents AS. G1
is a client of G2, noted G1 ↪ G2, iff ∃(b1, b2) ∈ (B_G1 × B_G2), b1 ↪ b2.

4 A Bonom organization

A Bonom organization links the structure of agents and the structure of
information domains. From an agent point of view, this comes down to situating each
agent of the structure of agents with respect to the structure of domains. A
worker may be situated with respect to several domains, while a BaGate has
to be situated with respect to a single domain.
Definition 5 A Bonom organization BO is a triple (AS, DS, ↦) where
AS = (B, W, →, ↪), DS = (D, Root, >), and ↦ is a relation on G(AS) and
D s.t.:
• ∀G ∈ G(AS), ∃!d ∈ D, G ↦ d
• ∀(G1, G2) ∈ G(AS)², if (G2 ↪ G1) then (∃(d1, d2) ∈ D², d1 > d2 and
(G1 ↦ d1 and G2 ↦ d2))
• ∀(d1, d2) ∈ D², if (d1 > d2 and ∃G2 ∈ G(AS), G2 ↦ d2) then (∃G1 ∈
G(AS), G1 ↦ d1 and (G2 ↪ G1)).
If the mapping (↦) were one to one, the client relation (↪) on groups
would exactly reflect the specialization relation (>) on domains. However,
the definition does not require such an ideal hypothesis, for two reasons:
1) the deployment of the organization of agents cannot be completed
at one time (for some domains, there may not yet be associated groups);
2) we have to take into account the world-wide distributed aspect of the
problem, which makes it almost impossible to synchronize the creation and
the introduction of agents, because of network shutdowns for example (for
some domains, there may be several groups working in parallel).
The set of groups which work for the same domain is named a community(c).
Definition 6 Let BO = (AS, DS, ↦) be a Bonom organization where
DS = (D, Root, >). Let d ∈ D. C_d, the community of domain d, is defined
by: C_d = {G ∈ G(AS) : G ↦ d}

(c) Notice that it is a specific notion of community which does not cover all the meanings that
can be found on the Internet for this word.
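To make Definitions 5 and 6 concrete, here is a hypothetical sketch (group and domain names are invented) of the ↦ mapping and the community C_d it induces:

```python
# Sketch of Definitions 5-6: each group maps to exactly one domain, and
# a community collects all groups mapped to a given domain. Several
# groups per domain are allowed, as discussed in the text.

MAPPING = {            # the |-> relation: group -> its single domain
    "G1": "sports",
    "G2": "basketball",
    "G3": "basketball",  # parallel groups on one domain are permitted
}

def community(d):
    """C_d = the set of groups G such that G |-> d."""
    return {g for g, dom in MAPPING.items() if dom == d}

print(sorted(community("basketball")))  # ['G2', 'G3']
```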

5 Conclusion

As a consequence of the definitions above, the organization can be viewed
at different levels of granularity. One may adopt the semantic view
(the domains/communities) only, or a more structured view with groups, or
consider the agents themselves. It is interesting to notice that, in order to
avoid too strong and unrealistic hypotheses on Internet agents, we have to
take into account both the notions of group and community. In the same way,
the acquaintance relation is not assumed to be symmetric. The dynamics of
the organization has also been studied [6]. It defines the introduction of agents
but also the merging of groups referring to a same domain.
Based on this organization, a prototype has been implemented using
Java and the Corba ORB Orbacus [7]. BaGates are implemented and communities
can be deployed on several machines. The first results, obtained with basic
site agents, are encouraging. More sophisticated prototypes of site agents are
currently being developed. In particular, a semi-automatic analysis of a site's content,
exploiting ontologies, will soon be implemented within a site agent [8].

References

1. Keith Decker, Katia Sycara, and Mike Williamson. Middle-Agents for
the Internet. In Fifteenth International Joint Conference on Artificial
Intelligence (IJCAI'97). Morgan Kaufmann, 1997.
2. Matthias Klusch, editor. Intelligent Information Agents. Springer, 1999.
3. Marian Nodine, William Bohrer, and Anne H. H. Ngu. Semantic Brokering
over Dynamic Heterogeneous Data Sources in InfoSleuth. In International
Conference on Data Engineering (ICDE), 1999.
4. Franco Zambonelli, Nicholas R. Jennings, Andrea Omicini, and Michael
Wooldridge. Coordination of Internet Agents: Models, Technologies and
Applications, chapter Agent Oriented Software Engineering for Internet
Applications. Springer, 2000.
5. Yves Demazeau and A. C. Rocha Costa. Populations and organizations
in open multi-agent systems. In First Symposium on Parallel and
Distributed AI, 1996.
6. Sylvie Cazalens and Philippe Lamarre. An organization of internet agents
based on a hierarchy of information domains. In MAAMAW, 2001.
7. Object Oriented Concepts (IONA). Orbacus home page:
http://www.ooc.com/ob.
8. Emmanuel Desmontils and Christine Jacquin. Indexing a web site to
highlight its content. In Web Intelligence, 2001.
INTRODUCING USER PREFERENCE MODELING
FOR MEETING SCHEDULING

HON WAI CHUN AND REBECCA Y. M. WONG


Department of Electronic Engineering, City University of Hong Kong
Tat Chee Avenue, Kowloon, Hong Kong SAR
E-mail: eehwchun@cityu.edu.hk & ymwong@ee.cityu.edu.hk

In this short paper, we present a formalized user preference model that encapsulates
knowledge on user preferences in meeting scheduling and uses this knowledge to evaluate
proposals and to generate counter proposals during negotiation. Our user preference model
will allow fully automated meeting scheduling to be possible. This research is part of our
work in developing an agent-based infrastructure to automate various tasks within an office
environment. We call our infrastructure MAFOA - mobile-agents for office automation.

1 Introduction

Meeting scheduling involves searching for a time and place when and where all the
meeting participants are free and available. There may be global (organizational)
and local (individual) constraints and preferences on the time and/or location. If
information were complete, i.e. all the global and local constraints and preferences
are known to everyone in the organization, then this can be solved using traditional
search algorithms or modeled as a constraint-satisfaction problem (CSP). However,
in reality, personal (local) constraints and preferences and even the personal
calendar, or part of it, might be hidden from others for privacy. For example, when
asked, one might say: "I prefer to have meetings on Wednesday or Thursday morning,"
but is not expected to divulge great details of all the reasons behind this suggestion.
In this paper, we propose to encapsulate personal preferences within a software
agent called Secretary Agent (SA) that functions within a multi-agent environment
[SEN98, HUHN99]. Each person's SA knows that person's particular calendar,
priorities, preferences and constraints. Although the model itself is hidden from all
other agents, the result of using the model to evaluate a proposal can be announced.
For example, a person might say: "I like this proposal but not the other." By
insulating the negotiation process from the details of the user preference model, we
enable our model to be used by different negotiation algorithms within possibly
heterogeneous agent-based environments where agents might be built from different
agent technologies. Negotiation in a heterogeneous environment can be performed
through a well-defined negotiation protocol and ontology [SMIT80, FINI97].
Creating a user preference model that can be used in negotiation and distributed
scheduling is the main objective of our research. In this paper, we will present a


formal definition of our user preference model and show how it is used to evaluate
proposals.

2 User Preference Model

Each person has his/her own unique set of personal and business priorities,
preferences and constraints. All these come into play during negotiation. In order for
us to delegate negotiation to a software agent, it must also understand and be able to
make use of the unique preferences of an individual. Our user preference model tries
to encapsulate knowledge on different types of personal preferences and how they
might change according to changes in the environment. The user preference model is
used to influence the behavior of the software agent during negotiation.
In MAFOA, a negotiation problem is defined by a finite set of m fixed
attributes f_1, f_2, ..., f_m, whose values will not change during negotiation, and a finite
set of n variable attributes v_1, v_2, ..., v_n, whose values will be negotiated. For
example, in a meeting-scheduling problem, the "day" might be fixed to be on
Tuesday, while the "time" and "location" attributes might be variables and
negotiated. In addition, each variable attribute v_i is associated with a domain d_i that
defines a finite set of possible values x_1, x_2, ..., x_k, which that variable attribute may
be assigned. The value of "time" might be "9am," "10am," etc. The user preference
model allows users to define priorities on variable attributes and preferences on
their potential values, as well as rules on how these priorities and preferences may
change due to changes in the environment or decisions that were made. A
negotiation "decision" or solution is defined as a set of assignments of values to
variable attributes.
Our user preference model associates priorities, preferences and rules with
negotiable variable attributes. Each person may have his/her own set of priorities,
preferences and rules, which an agent uses during negotiation to evaluate whether a
proposal is acceptable or not and what counter proposals to make. This evaluation
results in a preference level for each proposal and counter proposal. The following
sections provide detailed definitions of priorities, preferences, rules, and preference
levels. This preference model is stored in and accessible only by a person's
Secretary Agent.

2.1 Attribute Priority


The importance of a negotiable attribute might be different for different
people. For example, the "location" of a meeting might be more important than the
"time" to a particular person. The attribute priority ap_i of a variable attribute v_i
defines its degree of importance. It is a number between 0 and AP_max, where AP_max is a
global constant. The importance of a particular variable attribute to a particular
person is proportional to the value of the attribute priority. For example, if the

meeting "location" is important to John, then the priority for "location" will be
higher than all other attributes of that meeting. Attribute priorities will affect how an
agent negotiates and hence influence the outcome of the negotiation process.
To ensure that priorities are used fairly among all the negotiating agents, we
normalize the attribute priorities for a given agent such that their total sum must be
equal to a fixed global constant AP_total:

(Eq. 1)  ap_1 + ap_2 + ... + ap_n = AP_total, where n is the total number of
variable attributes in the given negotiation problem

If a priority is adjusted for an agent, all the priorities of that agent will be
affected and normalized back to the same total value. For example, if the user
adjusts priorities ap_i to new values ap'_i with a new total value AP_new ≠ AP_total,
then the new normalized priorities will be calculated as follows:

(Eq. 2)  ap_i = (ap'_i / AP_new) × AP_total, where AP_new = ap'_1 + ap'_2 + ... + ap'_n

By default, all variable attributes are considered equally important and
hence have the same initial priority value ap_x = AP_total / n (Eq. 3).
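The normalization in Eqs. 1-3 can be sketched as follows; the concrete value of AP_total is an arbitrary assumption here:

```python
# Sketch of Eqs. 1-3: attribute priorities always sum to AP_total.
# AP_TOTAL = 100 is an invented value for illustration.

AP_TOTAL = 100.0

def default_priorities(n):
    """Eq. 3: all n variable attributes start out equally important."""
    return [AP_TOTAL / n] * n

def normalise(adjusted):
    """Eq. 2: rescale user-adjusted priorities back to sum AP_total."""
    ap_new = sum(adjusted)                      # AP_new in Eq. 2
    return [p * AP_TOTAL / ap_new for p in adjusted]

print(default_priorities(4))          # [25.0, 25.0, 25.0, 25.0]
print(normalise([30.0, 30.0, 60.0]))  # [25.0, 25.0, 50.0]
```

Eq. 1 is then an invariant: the output of either function sums to AP_total.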

2.2 Preference Value


A person might consider certain attributes, such as "location," to be more
important than others during negotiation. Likewise, he might prefer certain values of
these attributes over others, such as preferring location to be in "boardroom" rather
than "demo room." We call the amount of preference for a particular value the
preference value.
For each person, each variable attribute v_i and each potential domain value x_i,
there is an associated preference value pv_i. The value of pv_i indicates the preference
for having v_i be assigned the value x_i. The degree of "preference" is proportional to
the value of pv_i. The value of pv_i may be a number between 0 and PV_max, where PV_max
is a global constant.
Furthermore, as a way to ensure that the preferences are not abused or overused
by an agent, and to give each agent a fair chance in negotiation, we normalize the
preference values, such that the preference values pv_1, pv_2, ..., pv_k of a variable
attribute v_i with domain values x_1, x_2, ..., x_k must add up to a fixed global constant
PV_total:

(Eq. 4)  pv_1 + pv_2 + ... + pv_k = PV_total, where k is the total number of
domain values for a particular variable attribute

Initially, by default, there is no special preference for any particular value to be
assigned to a variable attribute. Each potential domain value will be treated equally.
Hence, all preference values pv_x of that variable attribute v_i will have equal value and
be set to pv_x = PV_total / k (Eq. 5), where k is the total number of values in domain d_i.

2.3 Preference Rules


Besides attribute priorities and preference values, each user may also have a finite
set of l preference rules r_1, r_2, ..., r_l that define how these priorities and preferences
may change as variable attributes are negotiated and assigned values. Each
preference rule r_i defines a finite set of conditions that, when all the conditions are
satisfied, trigger actions that cause priorities and preferences to be adjusted:
r_i: if (c_1, c_2, ..., c_j) then (a_1, a_2, ..., a_k)

In MAFOA, rule conditions c_1, c_2, ..., c_j are defined as patterns of values on
fixed attributes, variable attributes or global attributes (attributes that are shared by
all agents). Each c_i defines a pattern that will be matched against scheduled events.
For example, a rule might be "If there is already a meeting on Monday morning,
then I don't want any other meetings in the afternoon." The condition "meeting
on Monday morning" defines a pattern that will be matched against all scheduled
events. The rule consequence consists of a finite set of actions a_1, a_2, ..., a_k to be
performed on priorities and preferences once the rule conditions are satisfied.
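A preference rule of this kind can be sketched as a pattern match over scheduled events. The dictionary-based event encoding below is an illustrative assumption, not the MAFOA representation:

```python
# Minimal sketch of preference-rule triggering: a rule fires when every
# condition pattern matches at least one scheduled event. The event
# fields ("day", "period") are invented for illustration.

def rule_fires(conditions, scheduled_events):
    """True iff each condition pattern matches some scheduled event."""
    return all(
        any(all(event.get(key) == val for key, val in cond.items())
            for event in scheduled_events)
        for cond in conditions
    )

# "If there is already a meeting on Monday morning..." as one condition:
conds = [{"day": "Mon", "period": "am"}]
events = [{"day": "Mon", "period": "am", "what": "meeting"}]
print(rule_fires(conds, events))  # True -> the rule's actions would run
```

Once the rule fires, its actions would adjust priorities and preference values (e.g. via the normalization of Eq. 2).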

2.4 Preference Level


In MAFOA, a potential solution to a negotiation problem is defined as a tuple from
d_1 × d_2 × ... × d_n such that the n assignments of values to the problem's variable
attributes are to the "satisfaction" of all the negotiating agents, i.e., a compromise is
found. During the negotiation process, each negotiating agent will need to evaluate
how "satisfied" it might or might not be with the proposed compromise/solution. In
MAFOA, each proposed solution is called a proposal when offered by the initiating
agent, i.e., the agent that initiated the problem to be negotiated, and a counter
proposal when offered by any other agent involved in the negotiation.
For example, a proposal/counter proposal P_x might be the tuple (V_1, V_2, ..., V_n)
where each V_j is a constant value from the domain d_j of variable attribute v_j. If we
need to negotiate the "day" and "time" of a meeting, a potential proposal might be
the tuple ("Tue", "9am").
In MAFOA, the result of evaluating how satisfied an agent is with a proposal or
counter proposal is called the preference level of that proposal. Different agents
might of course have different preference levels for the same proposal.
For agent i, for a negotiation problem with n variable attributes, the preference level
pl_i for a particular proposal/counter proposal P_x is defined as:

(Eq. 6)  pl_i(P_x) = Σ_{j=1..n} ap_j × pv_jk, where pv_jk is the preference value
for the assignment of value V_k to the variable attribute v_j and ap_j is the
attribute priority

A proposal/counter proposal with a higher preference level means it is more
preferred by the user.
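The preference-level evaluation above can be sketched directly; all priorities and preference values below are invented for illustration:

```python
# Sketch of the preference-level equation: pl(P) is the priority-weighted
# sum of the preference values of the proposed assignments.

def preference_level(proposal, ap, pv):
    """proposal: {attribute: value}; ap: {attribute: priority};
    pv: {attribute: {value: preference value}}."""
    return sum(ap[j] * pv[j][v] for j, v in proposal.items())

ap = {"day": 60.0, "time": 40.0}              # attribute priorities
pv = {"day": {"Tue": 0.75, "Wed": 0.25},      # preference values per
      "time": {"9am": 0.5, "10am": 0.5}}      # domain value

# Evaluating the example proposal ("Tue", "9am"):
print(preference_level({"day": "Tue", "time": "9am"}, ap, pv))  # 65.0
```

A counter proposal such as ("Wed", "10am") would score lower here, so the agent would prefer the ("Tue", "9am") proposal.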

3 Acknowledgements

This work was supported in part by a Hong Kong RGC Earmarked Grant and a
Strategic Research Grant provided by the City University of Hong Kong.

4 References

[FINI97] T. Finin, Y. Labrou, and J. Mayfield, "KQML as an Agent
Communication Language," in Software Agents (J. M. Bradshaw,
ed.), Ch. 14, pp. 291-316, AAAI Press/The MIT Press, 1997.
[HIGA96] Higa, K., Shin, B., Sivakumar, V., "Meeting Scheduling: An
Experimental Investigation," In the Proceedings of the IEEE
International Conference on Systems, Man and Cybernetics, Vol. 3,
1996, pp. 2023-2028.
[HUHN99] Michael N. Huhns, Larry M. Stephens, "Multiagent Systems and
Societies of Agents" Gerhard Weiss, ed., Multiagent Systems: A
Modern Approach to Distributed Artificial Intelligence, MIT Press,
1999.
[SEN98] Sandip Sen and Edmund H. Durfee, "A Formal Study of Distributed
Meeting Scheduling Group Decision and Negotiation," Group
Decision and Negotiation Support System, Vol. 7, 1998, pp. 265-
289.
[SMIT80] Smith, Reid G, "The contract net protocol: High-level
communication and control in a distributed problem solver," IEEE
Transactions on Computers, C-29 (12), 1980, pp. 1104-1113.
EXECUTIVE ATTENTIONAL CONTROL IN AUTONOMOUS ROBOTIC
AGENTS

JASON GARFORTH, ANTHONY MEEHAN AND SUE MCHALE

Sheffield Hallam University, Pond Street, Sheffield S1 1WB, UK.


E-mail: jpg@janus.demon.co.uk; {a.s.meehan | s.mchale}@shu.ac.uk

We describe a robot control architecture designed to reduce the occurrence of action-


selection errors associated with many reactive autonomous robots. The errors considered
are those that can be interpreted as arising from a lack of an explicit attentional mechanism.
The architecture described is an adaptation of a neuropsychological model of executive
control in humans.

1 Introduction

Whilst there have been some notable successes in creating autonomous machines
that exhibit low level behaviours (e.g. wall following, surface exploration, map
building), machines that exhibit higher level behaviours have proven more
difficult to develop [2,3,5,8]. In order to achieve higher level behaviours some
researchers have combined traditional AI techniques (especially for dynamic
planning and scheduling) with neuro-evolutionary approaches. However, the
challenge of building robotic agents which autonomously perform high level
behaviours using neural and neuro-evolutionary approaches remains valid, not
least because neural networks are inherently better suited to the noisy sensory
inputs found in the real world and are naturally fault tolerant. It is these
advantages which, in part, motivate our continued efforts.

The remainder of this paper identifies some common problems of behaviour
expression found in reactive robotic agents. It describes a neuropsychologically
inspired model of executive attentional control and shows how this model can be
implemented as a control architecture for an autonomous robotic agent.

2 Problems of Action Selection

Many researchers have found a number of common expressions of failure in


behaviour selection.

1. Unattended expression of an unintended behaviour (sometimes through
continuing to exhibit a behaviour which should be discontinued).


2. Unattended exhibition of a behaviour which would, in normal circumstances,


be appropriate, but which, in some slightly different novel situation, is no
longer appropriate.
3. Inflexibility in behaviour that results in continued, often repetitive, behaviour.

We should not be surprised to find these behavioural problems in reactive robots,
as they are found in both animals and humans. Errors 1 and 2 are often called
capture errors; we label them Type I and Type II. The third is called
perseveration [15].

3 Models of Attentional Control

Neuropsychological studies of people for whom these errors are pronounced
suggest that error correction, the interruption of ongoing behaviour and the
subsequent initiation of another, is largely governed by activity in the pre-frontal
cortex in an area which is functionally labelled the executive [12]. Norman &
Shallice provide a model of executive attentional control [15]. It is composed of
several functional sub-components (figure 1):

• A mapping between perception and a set of possible behaviours. This


association takes account of the internal state of the agent and any goals that
the agent currently has (e.g. as generated by a planner).
• A Contention Scheduler which chooses between two behaviours competing
for expression. (Norman & Shallice suggested that a structure called the basal
ganglia implements the contention scheduler.)
• The Supervisory Attentional System (SAS) which expresses willed behaviour
e.g. in the event of awareness of a capture error having occurred or when a
novel action is required.

There are a number of connectionist models of attention (see [6] for a recent
review). None of these represents an implementation of the Norman & Shallice
model, although they do share some features in common. Cohen et al. [4] have
modelled willed attention shifts between tasks. Their model does not explicitly
address contention scheduling; its correspondence to the Norman & Shallice
model is that attentional control operates outside the main line of information flow.
Sandon [14] advances an attentional control mechanism that operates directly in
the line of information flow. This mechanism seems more in keeping with the
contention scheduler and, as such, it is a useful model for unwilled attention but
seems less satisfactory as a model of willed attention. Olshausen et al. have
proposed a model in which attentional units potentiate attended pathways and
attenuate non-attended pathways [10]. In behavioural terms, this increases the
likelihood of a behaviour persisting once it is initiated. In our opinion, this model

corresponds more to an attentional reinforcement mechanism than to an attention


initiation system. LaBerge has developed a similar model [7].

A goal of our research is to go beyond these existing models and to integrate both
unwilled and willed attention and to do this without a human in the loop to
determine the focus of the willed attention. Implementing the Norman & Shallice
attentional control architecture is our point of departure.

[Figure 1 here: block diagram showing the Supervisory Attentional System (SAS)
with its Monitor, the Contention Scheduler (implemented by the basal ganglia),
and the perceptual and action systems.]

Figure 1. Illustration of the Norman and Shallice model of attentional control (adapted from [15,16]).
The SAS is composed of functional units which initiate, monitor and control high level behaviour,
especially willed behaviour. Amongst other roles, the SAS seems to be involved in managing capture
errors and perseverative behaviour.

4 Implementation of Supervisory Attentional Control

Neural networks form the basis of our implementation. The mapping of perception
and intention to behaviours can be realised as an associative memory. Networks
which exhibit basic and higher level behaviours are readily built and tested as a

separate system [9]. Prescott [11] has developed a model of the contention
scheduler based on the computational properties of the basal ganglia. This
operates as an unattended mechanism for behaviour selection. The output of the
contention scheduler provides the input to the thalamic circuitry, which actively
disinhibits motor control for a particular behaviour, allowing the selected
behaviour direct access to the motor control systems. This circuitry also provides
positive, reinforcing feedback to enhance the persistence of the selected behaviour
[13].
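As a toy illustration of selection-by-disinhibition (our sketch, not Prescott's neural model; behaviour names, salience values and the feedback gain are invented), the scheduler picks the most salient behaviour and only that channel is released, with a small reinforcing boost to enhance its persistence:

```python
def contention_schedule(saliences):
    """Winner-take-all selection: return the behaviour with the highest
    salience (a toy stand-in for the basal ganglia model in [11])."""
    return max(saliences, key=saliences.get)

def disinhibit(saliences, gain=1.2):
    """Release motor control only for the winner, feeding back a small
    reinforcing boost (the 1.2 gain is an illustrative assumption)."""
    winner = contention_schedule(saliences)
    saliences[winner] *= gain   # positive, reinforcing feedback [13]
    return winner, saliences

winner, s = disinhibit({"wall_follow": 0.4, "explore": 0.7, "avoid": 0.2})
print(winner)  # explore
```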

Sequences of expressed behaviours forming high level behaviours (interpretable
as planned behaviour) are realised as temporal sequences generated by a
subsystem composed of a recurrent network. This enables higher level, planned
behaviour to be used by the machine.

Although the SAS has several cognitive functions, only the monitor is currently
present in the implementation. The monitor seeks to detect a situation in which the
planned action and the exhibited action differ. Using a representation of a plan as
a temporal action sequence generated by a temporal sequence generator, only a
simple circuit is required to detect when the temporal action sequence representing
the intended behaviour and the action sequence actually exhibited do not agree.
This monitoring function is an important component of the Norman & Shallice
model and would normally invoke a range of responses, including the generation
of novel plans when required. In our simple model of the SAS, however, only
three responses are currently possible:

1. Attenuate the active behaviour for a given time and potentiate the behaviour
already selected by the temporal sequence generator.
2. Attenuate the active behaviour for a given time and potentiate a default
response e.g. 'run away'
3. Try to attenuate all active behaviours for a given time, allowing the
contention scheduler to re-select a particular behaviour.
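Procedurally, the monitor's mismatch test and the three responses above can be sketched as follows (this is our illustration, not the authors' neural implementation; behaviour names and the 0.5 attenuation amount are invented):

```python
def sas_monitor(planned_seq, exhibited_seq):
    """Detect a capture error: the intended action sequence and the
    actions actually exhibited have diverged."""
    return any(p != e for p, e in zip(planned_seq, exhibited_seq))

def sas_respond(strategy, behaviours, active, planned_next, default="run_away"):
    """Apply one of the three SAS responses described in the text.
    `behaviours` maps behaviour name -> activation level."""
    if strategy == 1:    # 1. attenuate active, potentiate the planned behaviour
        behaviours[active] -= 0.5
        behaviours[planned_next] += 0.5
    elif strategy == 2:  # 2. attenuate active, potentiate a default response
        behaviours[active] -= 0.5
        behaviours[default] += 0.5
    else:                # 3. attenuate all; contention scheduler re-selects
        for name in behaviours:
            behaviours[name] -= 0.5
    return behaviours

print(sas_monitor(["approach", "grasp"], ["approach", "flee"]))  # True
```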

It is possible to see how removal or failure of the SAS produces the behavioural
errors that we have sought to avoid. It is the monitor in the SAS that is the key to
the system having some "awareness" of when planned actions are not followed
through, and so it is failure in this circuitry that will tend to produce capture errors.

5 Conclusion

In this paper we have described a robust control architecture which draws upon
advances in neuropsychological and neurophysiological research that seek to
explain executive behaviour, and in particular attentional behaviour, in humans.
We show how one of these models, the Norman & Shallice model of willed and

unwilled automatic control of behaviour, can be used both to account for and to
overcome typical errors found in the behaviour of reactive robots.

References

1. Alexander, G. A., Basal Ganglia. In The Handbook of Brain Theory and


Neural Networks, Arbib M. A (ed.), (MIT Press, 1995) pp. 139-144.
2. Beer, R., Chiel, H. and Sterling, L., A biological perspective on autonomous
agent design. Robotics and Autonomous Systems 6 (1990) pp. 169-186.
3. Brooks, R. A. Cambrian Intelligence: The Early History of the New AI. (MIT
Press, 1999.) pp. 79-101.
4. Cohen, J.D., Dunbar, K. and McClelland, J.L., On the control of automatic
processes: a parallel distributed account of the Stroop effect. Psychological
Review, 97 (1990) pp. 332-361.
5. Dennett, D. C. Brainchildren: Essays on Designing Minds. (Penguin Books,
1998).
6. LaBerge, D., Attention. In Bly, B.M. and Rumelhart, D.E., Cognitive Science.
(Academic Press, 1999) pp 43-97.
7. Laberge, D., Carter, M. and Brown, D., A network simulation of thalamic
circuit operations in selective attention. Neural Computation, 4 (1992) pp.
318-331.
8. Maes P., Modeling Adaptive Autonomous Agents. Artificial Life 1 (1994) pp.
135-162.
9. Mataric, M. J., Designing and Understanding Adaptive Group Behaviour.
Adaptive Behaviour 4(1) (1996) pp. 51-80.
10. Olshausen, B.A., Anderson, C.H., and Van Essen, D.C., A neurobiological
model of visual attention and invariant pattern recognition based on dynamic
routing of information. Journal of Neuroscience 13 (1993) pp. 4700-4719.
11. Prescott, T. J., Redgrave, P. and Gurney, K., Layered control architectures in
robots and vertebrates. Adaptive Behaviour, 7(1) (1999) pp. 99-127.
12. Roberts, A. C., Robbins, T. W., Weiskrantz, L. (eds.), The Prefrontal Cortex.
(Oxford University Press, 1998) pp. 22-35.
13. Rolls, E. T., The Orbitofrontal Cortex. (Oxford University Press, 1998) pp. 67-
86.
14. Sandon, P.A., Simulating visual attention. Journal of Cognitive Neuroscience
2 (1990) pp. 213-231.
15. Shallice, T., From Neuropsychology to Mental Structure. (Cambridge
University Press, 1988) Ch. 14.
16. Shallice, T. and Burgess, P., 'The domain of supervisory processes and
temporal organisation of behaviour' in Roberts, A. C., Robbins, T. W.,
Weiskrantz, L. (eds.), The Prefrontal Cortex. (Oxford University Press, 1990)
pp. 22-35.
IMPLEMENTATION AND ANALYSIS OF MOBILE AGENTS IN A
SIMULATION ENVIRONMENT FOR FIELDBUS SYSTEMS

R. HUNSTOCK, U. RÜCKERT
Heinz Nixdorf Institute, University of Paderborn, System and Circuit Technology
Fuerstenallee 11, D-33102 Paderborn, Germany
E-mail: hunstock@hni.upb.de, rueckert@hni.upb.de

T. HANNA
Siemens AG,
Heinz Nixdorf Ring 1, D-33106 Paderborn, Germany
E-mail: thomas.hanna@pdb4.Siemens.de

Internet agents, agents in local area networks and agents in factory production planning, to
name a few examples, are well known and are becoming increasingly popular. The basic
technologies which carry the agent technology are often based on JAVA or special agent
languages and on personal, industrial or embedded computers and their related network
technologies. In the upcoming field of home and building automation, special, dedicated
hardware and software are used, called fieldbus systems. Fieldbus systems are structurally
identical with computers and computer networks but show restrictions in resources and
performance. Mobile agent technology also seems to be an appropriate paradigm for typical
applications of building automation. In this paper we present the implementation of a basic
agent system in an existing software simulator for a special fieldbus technology. The analysis
of this implementation indicates that this technology offers advantages for fieldbus systems.

1 Introduction

Agent technology has emerged in the world of personal and industrial computers
and their interconnecting networks. The basic software technologies which facilitate
agent technology are either platform-independent programming languages like
JAVA and their runtime infrastructures like JVM (JAVA Virtual Machine) or
specifically designed agent languages or systems. In all cases, hardware and
software requirements are oriented towards personal computers, workstations and
current protocols for computer networks.
In the field of home and building automation with its hardware and software,
called fieldbus systems, the trend towards distributed computing and communication
is also recognizable. In comparison to computers and their networks, fieldbus
systems are obviously restricted in resources such as memory and performance in
computation and communication. Memory sizes of about 64 kBytes and clock rates
of less than 100 MHz are typical. Operating systems for fieldbus systems are often
straightforward and do not allow multitasking. The main fields of application are
control tasks as well as simple sensing and acting tasks. Although there are


differences to computers and computer networks, there are a great number of
potential applications for mobile agents, as described in [9, 10].
In this paper, we present a model of mobile agent technology for fieldbus
systems and its implementation in an existing software simulator. We will also
compare the performance of an application realised with mobile agents with that of a
conventional programming approach. Implementation and analysis are based on the
ANSI/EIA 709.1-A-1999 control network protocol standard and the so-called
Neuron Chip, which implements this protocol.

2 Motivation

Applications for home and building automation (in the following we will subsume
these two areas under the term building automation) are typical measurement and
control tasks in crafts such as heating, ventilating, air conditioning, lighting,
security, etc. Beyond these tasks many upcoming applications influenced by existing
Internet agent applications are imaginable:
1. Monitoring agent. A monitoring agent accompanies a person on his way
through the building and monitors each of his actions.
2. Transaction agent. During the person's absence, e.g. for holiday, a transaction
agent will simulate his presence, imitating all his actions by using the data
obtained by the monitoring agent.
3. Information agent. The task of an information agent is to obtain all types of
data, e.g. the number of open windows or the temperature in all rooms.
4. Report agent. This agent determines the state of devices and then initiates
countermeasures, e.g. notifying a craftsman.
5. Outsourcing agent. If the resources in terms of available memory or computing
time are insufficient to perform a specific task, the outsourcing agent can
transfer the task to another device.
At present, no fieldbus system fully supports agent technology. The main reason for
this is the lower computational performance compared to personal computers
and their network technologies. Thus, technologies like JAVA/JVM cannot simply
be adapted. However, due to the uniform hardware and the growing number of
fieldbus devices, there are advantages which qualify fieldbus systems for agent
environments.
1. Binary compatibility. The same program code is executable on every device. A
runtime-consuming interpreter or compiler is not needed for a platform-
independent language as is the case with JAVA.
2. Low costs. A large number of devices with uniform hardware would lead to
small hardware costs.

3. Free resources. Not all device applications have the same resource
requirements. Devices with a small workload can offer computation time or
memory to other devices.
4. Availability. Most of the automation systems are very sophisticated and millions
of devices are already installed [1].

3 Modelling

Concepts of mobile agents for computers and their networks are summarized in [7].
We selected the mobility concept, a lifecycle and a communication model, which are
essential for mobile agents and therefore have to be realized in fieldbus systems as
well. Further models, e.g. a security model, are necessary in a real hardware
implementation and need to be discussed. These topics, however, are not within the
scope of this work.
The modelling of the agent system is oriented towards the MAF specification
(Mobile Agent Facility, [5, 8]), which is proposed by the Object Management Group
(OMG) for standardisation. Here, terms from the MAF specification are taken,
especially for mobile agents and agent systems. The Foundation for Intelligent
Physical Agents (FIPA) has specified a lifecycle model composed of five states
{initiated, active, suspended, waiting and transit, [2, 4]) and transitions between
these states. Closely following that specification, we have modelled the agent for the
automation system in question under the following reasonable assumptions:
• An agent must come back to the device which created it, entering the state
initiated. Only there can it be deleted.
• Each agent has to report a result to the source device. An empty result should
also be reported in case of no explicit result. This leads to an additional state
called reporting.
• Each agent is allowed to be executed on the source device before its first
migration and before its termination.
• Each agent is only allowed to be executed once on every device when no
intermediate migration takes place. It is assumed that a migration is released by
the agent itself only. Thus, it is unnecessary to execute an agent more than once.
• The ability to migrate is optional. A stationary agent can, for example, be seen
as a special case of a mobile agent.
• The states active, suspended, waiting and transit are taken from the FIPA
specification unmodified.
In this model all transitions except wait and move can be released by the agent
system on which the agent actually resides. The release of wait and move are
reserved for the agent; suspend can be released by the agent and the agent system.
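The resulting lifecycle, the five FIPA states plus the added reporting state together with the release rules above, can be summarised in a small table (the exact transition set beyond wait, move and suspend is an assumption for illustration):

```python
# States: the five FIPA lifecycle states plus the added `reporting` state.
STATES = {"initiated", "active", "suspended", "waiting", "transit", "reporting"}

# Who may release each transition, per the model in the text: wait and
# move only by the agent itself; suspend by agent or agent system; the
# remaining transition names here are illustrative assumptions.
RELEASED_BY = {
    "wait":    {"agent"},
    "move":    {"agent"},
    "suspend": {"agent", "system"},
    "resume":  {"system"},
    "execute": {"system"},
    "report":  {"system"},
}

def may_release(transition, actor):
    """Check whether `actor` ('agent' or 'system') may release a transition."""
    return actor in RELEASED_BY.get(transition, set())

print(may_release("move", "system"))   # False: move is reserved for the agent
print(may_release("suspend", "agent")) # True
```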
For a complete communication model all kinds of communication relations
have to be considered. Communication between agent and local agent system is

necessary for the creation, reporting or deleting of agents. Communication between


two agent systems is absolutely necessary, e.g. to find a new destination for
migration. In our basic model these relations as well as the communication between
agent system and local application were modelled for initiating the creation of a
sample agent and the reception of a requested report. This modelling is based on the
Knowledge Query and Manipulation Language (KQML), which was developed by
the Defense Advanced Research Projects Agency (DARPA) with the aim of
providing a standard for all kinds of communication within an agent system [2].
Our model admittedly limits the versatile abilities of agents; e.g. no
collaborative behaviour is supported because no direct agent-agent
communication is yet available. But the chosen relations are sufficient for a first
analysis and can easily be extended in the future.

4 Implementation

For a prototypical implementation, we focused on the LonWorks technology, a
fieldbus system used in building automation as well as in industrial automation,
which is based on a hardware implementation called the Neuron Chip [11].
(LonWorks and Neuron are registered trademarks of the Echelon Corp., Palo Alto,
CA, USA.)
Building on several existing libraries, a software simulator and object-oriented
library, called DiLoS (Distributed LonWorks Simulator, [6]), was implemented in
C++ for the modelling of LonWorks-based devices. The classes of DiLoS were then
supplemented by classes representing an agent system (AgentSystem,
AgentMessageSender) as well as a superclass for all agents (Agent) and, derived
from the latter, a sample agent (GatheringAgent). All future agents can simply be
implemented by deriving from and extending the superclass Agent.
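DiLoS itself is a C++ library; purely to illustrate the described class structure, a Python sketch of the Agent superclass and the derived GatheringAgent (the attribute and method names besides the class names are invented):

```python
class Agent:
    """Superclass for all agents; concrete agents override execute()."""
    def __init__(self, source_device):
        self.source_device = source_device  # device that created the agent
        self.state = "initiated"

    def execute(self, device):
        raise NotImplementedError

class GatheringAgent(Agent):
    """Sample agent from the paper: evaluates locally stored sensor data
    on the device where it is produced (here, a simple average)."""
    def execute(self, device):
        values = device.logged_values   # hypothetical device attribute
        return sum(values) / len(values)

class Device:
    def __init__(self, logged_values):
        self.logged_values = logged_values

agent = GatheringAgent(source_device="control_node")
print(agent.execute(Device([20.0, 21.0, 22.0])))  # 21.0
```

New agent types would be added by deriving further subclasses of Agent, as the text describes.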

5 Case Study and Evaluation

Logging data, temperature values in home automation for instance, produces a high
amount of data within short periods. Transferring this data from a sensing device to
a central processing device results in traffic on the fieldbus. Alternatively, the
sensing device could store the data locally. In this case an agent could be sent to the
device and evaluate the data where it is produced. We call this kind of mobile agent
a gathering agent. It is obvious that this only works well when the amount of agent
code to be transferred is smaller than the data stored by the sensing device.
The following scenario was modelled: A gathering agent starts from one central
control device and migrates to a temperature sensing device. There, it requests a list
of temperature values, e.g. of the last day, week, or month. The agent calculates the
average value, migrates back to the source device and reports the result.

The execution time - the time from creation to deletion of the agent - is mainly
influenced by the transmission times and the computation time of the agent. The
simulation was carried out under the following conditions:
• The empirically obtained size of the agent is about 632 Bytes.
• The code of the agent is transmitted in quantities of 32 Bytes using a file
transfer protocol.
• The computation time for one value (2 Bytes) is empirically determined to be
9 ms at 10 MHz Neuron Chip rate.
• A typical temperature sensing device for indoor operation logs on average
between 50 and 75 measured values a day with a configured hysteresis of 0.2°C.
This volume was empirically obtained from a real network in the rooms of the
office building of the Heinz Nixdorf Institute.
For the conventional programming approach, we assumed that every measured value
is transmitted in a separate message (that is what fieldbus systems are typically
designed for). For the simulation of the two approaches typical technology-specific
parameters were configured, like the 10 MHz clock rate of the Neuron Chip and the
1.25 Mbps transmission rate of the communication medium.
The scenario has been simulated for different amounts of measured values from
50 up to 250 values which is equivalent to 100 up to 500 Bytes. The execution time
results are reported in figure 1. As a concrete result, the simulation shows that even
at a low data volume, local execution by an agent can reduce the bus utilization
(632 Bytes vs. the volume of measured data). Additionally, it can be seen that the
time for the calculation of the result can be shortened. A reason for this lies in the
fact that the communication times are substantially higher than the computation
times of the agent. It is to be expected that this effect is even more distinct with an
increasing bus load of the underlying network in a real environment.
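The stated conditions allow a rough sanity check of the raw numbers (this sketch deliberately ignores the per-chunk file-transfer protocol overhead and media access delays, which dominate in practice and are what the simulation models):

```python
AGENT_BYTES = 632     # empirically obtained agent code size (from the paper)
RATE_BPS = 1.25e6     # 1.25 Mbps transmission rate of the medium
MS_PER_VALUE = 9      # computation time per 2-byte value at 10 MHz Neuron clock

def agent_times(n_values):
    """Raw one-hop serialisation time of the agent code (ms) and the
    on-device computation time (ms) for n logged values."""
    tx_ms = AGENT_BYTES * 8 / RATE_BPS * 1000
    compute_ms = n_values * MS_PER_VALUE
    return tx_ms, compute_ms

tx, comp = agent_times(100)
print(round(tx, 2), comp)   # raw serialisation is only a few ms; computation dominates here
```

Even so, the protocol and bus-access overheads left out of this sketch are large enough that, as the text notes, communication time rather than computation time dominates the measured execution times.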

[Figure 1 here: execution time of the mobile agent approach vs. separate
transmission of each measured value, plotted against the number of measured
values.]

Figure 1. Comparison of mobile agent approach with conventional programming approach.

6 Conclusions

The main mechanisms for the application of mobile agent technology in fieldbus
systems were worked out and modelled on the basis of an existing software

simulator. With a concrete case study we verified that mobile agents are feasible in
principle despite the restricted resources of fieldbus systems and have advantages
in execution time and communication bandwidth under specific conditions. In
particular, we showed that today's fieldbus hardware is capable of supporting agent
technology.

7 Acknowledgements

This work was supported in part by the Heinz Nixdorf Institute, Paderborn,
Germany. The title of the research project is "Dynamische Aufgabenverteilung in
Kommunikationsnetzen der Automatisierungstechnik".

References

1. Bowen, K., Smart Home Networks Heading for Mass Market, Cahners
Microprocessor Report, Vol. 15, Archive 4, pp. 9-10, April 2001.
2. DARPA Knowledge Sharing Initiative External Interfaces Working Group,
Draft Specification of the KQML Agent-Communication Language, 1993.
http://www.csee.umbc.edu/kqml/kqmlspec.ps
3. Foundation for Intelligent Physical Agents, FIPA 97 Specification, Part 1,
Version 2.0, Agent Management, FIPA, Geneva, Switzerland (1998).
4. Foundation for Intelligent Physical Agents, FIPA 98 Specification, Part 11,
Agent Management Support for Mobility, FIPA, Geneva, Switzerland (1998).
5. GMD FOKUS, IBM, Mobile Agent Facility Specification, New Edition,
January 2000.
6. Hunstock, R., Rüping, S., Rückert, U., A Distributed Simulator for Large
Networks used in Building Automation Systems. 3rd IEEE International
Workshop on Factory Communication Systems, WFCS'2000, September 2000,
Porto, Portugal, pp. 203-210.
7. Mattern, F., A Tutorial on Mobile Agents. Spring School on Agent Technology,
Lenk, Switzerland, March 2000.
8. Object Management Group (OMG), The Mobile Agent System Interoperability
Facility (MASIF/MAF). MAF Team (1997).
9. Palensky, P., Intelligent Software Agents for EIB Networks. In EIB-
Proceedings Contributions part 3/2000, Richard Pflaum Verlag, Munich,
Germany (2000), pp. 67-76.
10. Schneider, F., Tränkler, H.-R., "Intelligentes Haus Deutschland" - Just an Idea?
In EIB-Proceedings Contributions part 3/2000, Richard Pflaum Verlag,
Munich, Germany (2000), pp. 27-32.
11. Toshiba Corporation, Neuron Chip (1995).
EVALUATING BELIEVABILITY IN AN INTERACTIVE NARRATIVE

JARMO LAAKSOLAHTI, PER PERSSON & CAROLINA PALO


Swedish Institute of Computer Science, Box 1263 SE-164 29 Kista,
Sweden
E-mail: {jarmo, perp, carolina}@sics.se

Believability is affected not only by the graphical appearance (movements, synchronisation,


gestures), but also by actions of characters, and how users understand actions in terms of
everyday psychology, social life and narrative situation. We performed a study of empathic
processes in a socio-emotionally rich drama in which players interacted with text-based
virtual characters. The results indicate the importance of having some structure in the ways
emotions are displayed by characters. The emotional reactions need to be determined in terms
of the character's goals, plans and personality, as well as the narrative context.

1 Introduction

Systems containing emotionally intelligent (semi)interactive characters have been


deployed in a number of domains such as commerce (Extempo.com), play [1] and
therapy [6]. A common ambition of these systems is to allow models of emotional
appraisal and personality traits to steer the behavior of interactive characters in a
constrained situation. Emotionally intelligent reactions on user input are claimed to
increase believability of the characters.
Believability arises in-between a specific user and some symbolic
representation of a character (image or text). A character is considered believable "if
it allows the audience to suspend their disbelief and if it provides a convincing
portrayal of the personality they expect or come to expect" [5]. That is, believability
makes the user 'disregard' the physical circumstances of the medium and instead
'enter' the story, focusing on the events and becoming cognitively, emotionally and
morally engaged in the lives of the characters.
This type of engagement is often described in terms of identification with
characters or empathic processes. Following Eisenberg [2] (cf. [7]) we define
empathy in a broad sense as a process by which a person takes another person's
'perspective' and tries to understand the mental lives of other people. Empathic
processes involve a 'merging of perspectives' between two persons, albeit of
different degrees.
In the Kaktus project we carried out an informal study of how users appraise
and attribute emotions and other mental states to interactive characters. For the study
we used a text-based interactive drama developed within the project.


2 System Overview

The game scenario centers around three teen-aged girls organizing a party for
their high-school friends. The plot develops over the week before the party. The
player acts as one of the characters while the system controls the non-player
characters (NPCs). In order to arrange a successful party, the player must make
socially complex decisions, e.g., inviting the 'right' people, getting rid of parents,
encouraging or discouraging alcohol consumption. In order to be successful, the
player must adopt the role of a teenage girl, be sensitive to the social and emotional
cues in the environment, and act on the basis of those.
The player's main mode of interaction is a limited form of dialogue similar to
that found in many role-playing games. The player chooses from a set of predefined
statements - or sometimes uses an object e.g., diary, candy or mobile phone -
evoking reactions from the characters and causing a new set of statements to be
presented (Fig. 1). The game is organized as a set of scenes in a hyperlinked
structure, each scene having its own set of available statements to choose from.
However, the path between scenes is not fixed but is affected by the emotional state
of the NPCs. Thus a central aspect of the game is getting characters into the 'right
mood' in order to make progress. For instance, if your friend Lovisa is angry with
you, she may refuse to grant you to use her parents' big villa for the party.

Figure 1. The Kaktus game.


Characters' emotions about an event are displayed in the form of small textual
cues echoing the emotion and its valence (e.g., "X frowns" or "Y smiles"). The
expressions were created to be rather emotionally neutral - albeit still congruent
with the emotion - so that users would actively attribute emotions to the characters
on the basis of the situation rather than the expression.
Characters evaluate events according to a model of emotional attribution based
on a theory by Roseman et al. [8]. According to this theory, emotions are appraised
according to a set of parameters. For instance, is the event consistent with a
character's goals or not? Is the event self-caused, other-caused or caused by
circumstance? Roseman's theory proposes five parameters resulting in a total of 15

emotions. For Kaktus, we only used the two parameters above. Thus, our characters
are equipped with in total six emotions (joy, liking, pride, anger, sadness, regret).
A character's emotions regarding an event are affected by its goals but also by
its personality. Personality traits (e.g., dominance or egotism) affect the degree of
importance a character assigns to a goal such as be_popular(x). However, goals can
also be strictly personal for a character, e.g., go_steady(lovisa, niklas).
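The two appraisal parameters used in Kaktus can be pictured as a small lookup table. The six emotions are the ones listed above; which parameter pair yields which emotion is our reading of Roseman-style appraisal and is not stated in the paper, so treat the pairing as an illustrative assumption:

```python
# Appraisal: (goal consistency, agency) -> emotion.
# The pairing of parameter combinations to emotions is an assumption.
APPRAISAL = {
    ("consistent",   "circumstance"): "joy",
    ("consistent",   "other"):        "liking",
    ("consistent",   "self"):         "pride",
    ("inconsistent", "circumstance"): "sadness",
    ("inconsistent", "other"):        "anger",
    ("inconsistent", "self"):         "regret",
}

def appraise(goal_consistent, agency):
    """Map an event's appraisal parameters to one of the six emotions."""
    key = ("consistent" if goal_consistent else "inconsistent", agency)
    return APPRAISAL[key]

print(appraise(False, "other"))  # anger
```

The full Roseman model's five parameters would expand this table to 15 emotions; Kaktus uses only the two dimensions shown.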

3 User Study

The goal of the user study was to investigate whether our model of emotion - which
determined the behavior and expressions of the characters - in fact contributed to
the believability of the characters. That is, if users, on the basis of generated
emotion expressions, managed to attribute and appraise emotions in the interactive
characters in a meaningful and ordered way (in relation to the narrative situation),
then we would have succeeded in creating some level of believability.
To this end, we constructed two versions of the game. In the structured version,
we had the model determine the emotion and value of the expression as was
described above. In the non-structured version, the system used the same library of
expressions, but here they were presented at random (both in terms of valence and
value). We hypothesized users to have more trouble with empathy and believability
in the second version.
We measured empathy and believability through a qualitative analysis of users'
post-usage descriptions of the drama (cf. [4]).
• If subjects used an emotionally rich vocabulary and described the characters'
life and personalities without hesitation, this would be an indication of empathy
and believability.
• If subjects hesitated when describing the characters' expressions, or finding
them 'strange' or 'incomprehensible', this would indicate low empathy and
believability.
• If subjects noticed nothing peculiar with the expressions, this would indicate
that expressions were consistent with their expectations, and thus be a sign of
believability.

3.1 Subjects, Task and Procedure


Fifteen subjects participated: 7 male and 8 female, ages 16 to 48. Most of the
subjects were computer science students at the University of Stockholm or the
Royal Institute of Technology.
Subjects were divided into two groups. One played the structured version of the
game ('the structured group', 9 subjects), the other the non-structured version ('the
non-structured group', 6 subjects).

After the gaming session, which was videotaped, each subject was interviewed
about her/his experience. The interviews were un-cued and had an open structure
where subjects were asked to freely describe the gaming situation, the narrative and
the characters. At the end of the interview, more direct questions about believability
and emotions were asked.

3.2 Results of the Study


Almost all subjects (13 out of 15) judged the characters as believable fifteen-year-
old girls on a direct question. The characters' attitudes, their way of talking, and
their problems were judged to be typical for teenage girls:
• "[At that age] you have to be careful, be popular, if you say the wrong things
people will get mad at you" (subject 11)
• "It is so typical ninth grade. Having to get rid of the parents" (6)
• "That's exactly the way it is [being in ninth grade]" (15)
Although both groups found the characters believable, the 'structured' group
used somewhat longer and friendlier descriptions of the characters in comparison
with the non-structured group. This could indicate that 'structured' subjects engaged
more empathically in the narrative and the destiny of the characters.
All subjects except one in the non-structured group indicated continuity
problems in the behaviors of the characters. Even though few subjects could
pinpoint the emotional expressions as the cause of the problem, most of the subjects
experienced some form of uneasiness:
• "They didn't answer a lot, they mostly just looked in different directions... and
from that it's hard knowing what they thought" (12)
• "They seemed a bit strange sometimes" (11).
• "Sometimes they changed totally, first they were about to start crying, the next
moment their eyes were twinkling" (11)
Sometimes subjects even blamed themselves for the strange behavior: "I
suppose I picked the wrong choices" (5).
In the 'structured' group, no oddities were experienced. Subjects seldom
spontaneously mentioned the emotional responses of the characters. Only four said
anything about the topic, and in those cases it was mentioned in a 'by the way'
fashion:
• "That made them angry, of course" (8)
• "They changed moods depending on what you did"(9)
• "They put their chin out when they got mad, looked happy or sad" (9)
• "I got positive reactions from Lovisa"
All of these results seem to suggest that the 'structured' version of the game
actually generated a higher level of believability and empathy than the 'non-
structured' version.

4 Implications for Design

The ability to show emotions has long been recognized as an important factor for
achieving believability in synthetic characters [3,5]. In addition, however, this
study indicates the necessity of some organization in the way emotions are
displayed. Emotions should in some way be correlated to the situation in which
they are displayed. Characters' reactions should be connected to their goals, plans
and personality, as well as to the narrative situation they are currently in.
Without this connection, characters' reactions tend to become cryptic and hard to
understand. Characters cannot be made believable by simply showing emotions at
random. In fact, such behavior generates a kind of emotional schizophrenia that
tends to ruin the illusion of life instead of enhancing it (cf. [9]).

References

1. Bates, J., Loyall, B. & Reilly, S. (1992) An Architecture for Action, Emotion,
and Social Behavior, School of Computer Science, Carnegie Mellon University,
Pittsburgh, PA.
2. Eisenberg, N. (1986) Altruistic emotion, cognition and behavior, Hillsdale, NJ:
Lawrence Erlbaum Associates.
3. Elliott, C. (1992) The Affective Reasoner: A process model of emotions in a
multi-agent system, Institute for the Learning Sciences, Northwestern
University, Tech. Report #32.
4. Höök, K., Persson, P. & Sjölinder, M. (2000) Evaluating users' experience of a
character-enhanced information space, AI Communications, 13, pp. 195-212.
5. Loyall, A. B. (1997) Believable Agents: Building Interactive Personalities, Ph.D.
Thesis. Technical Report CMU-CS-97-123, School of Computer Science,
Carnegie Mellon University, Pittsburgh, PA. May 1997.
6. Marsella, S. (2000) Pedagogical Soap, AAAI Fall Symposium Technical Report
FS-00-04, AAAI Press, pp. 107-112.
7. Omdahl, B. L. (1995) Cognitive Appraisal, Emotion, and Empathy, Mahwah:
Lawrence Erlbaum Associates.
8. Roseman, I., Antoniou, A. & Jose, P. (1996) Appraisal Determinants of
Emotions: Constructing a More Accurate and Comprehensive Theory,
Cognition and Emotion, 10(3), pp. 241-77.
9. Sengers, P. (2000) Narrative Intelligence, In Human Cognition and Social
Agent Technology, Dautenhahn (ed), Advances in Consciousness Series. John
Benjamins Publishing Company, pp. 1-26.
iJADE STOCK PREDICTOR - AN INTELLIGENT MULTI-AGENT BASED
TIME SERIES STOCK PREDICTION SYSTEM

RAYMOND S. T. LEE AND JAMES N. K. LIU


Department of Computing, Hong Kong Polytechnic University, Hung Hom, Hong Kong
Email: {csstlee, csnkliu}@comp.polyu.edu.hk

Financial prediction - such as stock forecasting - has always been one of the hottest topics
for research studies and commercial applications. In this paper, we propose an innovative
intelligent multi-agent based environment, namely iJADE - intelligent Java Agent
Development Environment - to provide an integrated and intelligent agent-based platform in
the e-commerce environment. In addition to contemporary agent development platforms,
which focus on the autonomy and mobility of the multi-agents, iJADE provides an intelligent
layer (known as the 'conscious layer') to implement various AI functionalities in order to
produce 'smart' agents. From the implementation point of view, we introduce the iJADE
Stock Predictor - an intelligent agent-based stock prediction system using our proposed
Hybrid RBF recurrent Network (HRBFN). Using ten years of stock pricing information
(1990-1999) covering 33 major Hong Kong stocks for testing, the iJADE Stock Predictor has
achieved promising results in terms of efficiency, accuracy and mobility as compared with
contemporary stock prediction models.

1 Introduction
Financial prediction (such as stock prediction) is one of the hottest topics, not
only from the research perspective, but also for commercial applications. Owing
to the importance of this topic, a well-established school of concepts and techniques
has been devised in previous decades, namely fundamental [7] and technical [6]
analysis. However, because these tools are based on totally different approaches to
analysis, they often give rise to contradictory results. More importantly, these
analytical tools are heavily dependent on human expertise and judgment.
In this paper, we propose an innovative intelligent agent-based framework,
known as iJADE - intelligent Java Agent-based Development Environment. To
remedy the deficiency of contemporary agent software platforms such as IBM
Aglets [1] and ObjectSpace Voyager Agents [8], which mainly focus on multi-agent
mobility and communication, iJADE provides an ingenious layer called the
'Conscious (Intelligent) Layer', which supplies different AI functionalities to
multi-agent applications. From the implementation point of view, we will
demonstrate one of the most important applications of iJADE in the e-commerce
environment - the iJADE Stock Predictor. The iJADE Stock Predictor is a truly
intelligent agent-based stock prediction application which produces a 'smart' stock
prediction agent based on the integration of mobile agent technology with our
proposed recurrent neural network - namely the Hybrid Radial Basis-function
Network (HRBFN) [2] - for financial forecasting.


2 iJADE Architecture
In this paper, we propose an innovative and fully integrated intelligent agent model
called iJADE for intelligent Web-mining [4] and other intelligent agent-based
e-commerce applications [3][5]. The system framework is shown in Figure 1. Unlike
contemporary agent systems such as IBM Aglets [1] and ObjectSpace Voyager [8],
which focus on multi-agent communication and autonomous operations, the aim
of iJADE is to provide a comprehensive 'intelligent' agent-based framework and
applications for future e-commerce and Web-mining applications.
Figure 1. System architecture of the iJADE (v1.2) model (showing the supporting
layer and the iJADE system components)

3 iJADE Stock Predictor - System Overview


In this paper, we propose an important iJADE application - the iJADE Stock Predictor.
The iJADE Stock Predictor is an integrated intelligent agent-based stock prediction
and forecasting system using the iJADE model as its system framework. From the
agent collaboration point of view, the iJADE Stock Predictor consists of three different
types of intelligent agents:
• iJADE Stock Predictor - an intelligent stationary agent located within the client
machine. Its main role is to collect user requirements (e.g. type of financial
prediction, selection of stocks, etc.).
• iJADE Stock Broker - an intelligent mobile agent which acts as a messenger to
convey user requirements to the iJADE Stock Analysts in various financial sites,
and 'carries' the response information back to the user.
• iJADE Stock Analyst - an intelligent stationary agent located in the financial
site. Its main duty is to perform web-based stock mining and prediction
based on the HRBFN recurrent neural network technique [2].
The workflow of the iJADE Stock Predictor consists of the following five modules:

1. Forecast requirement processing scheme (FRPS) - performed by the iJADE
Stock Predictor
2. Stock agents negotiation scheme (SANS) - between (i) the iJADE Stock Predictor
and Stock Brokers and (ii) the iJADE Stock Brokers and Stock Analysts
3. Stock prediction scheme (SPS) using HRBFN - performed by the iJADE Stock
Analysts
4. Forecast information reporting scheme (FIRS) - between the iJADE Stock Brokers
and the iJADE Stock Predictor
5. Stock information consolidation and display scheme (SICDS) - performed by the
iJADE Stock Predictor

Figure 2. Schematic diagram of the iJADE Stock Predictor

4 Experimental Results
For system validation and performance evaluation, the iJADE Stock Predictor was
tested under the following schemes: a Round Trip Time (RTT) test and a stock
prediction performance test. For system training of the HRBFN model, time
series stock information of 33 major Hong Kong stocks over the period from 1990
to the end of 1999 was fed into the hybrid RBF network, with window sizes ranging
from 11 to 45 days (for long-term trend prediction) and 1 to 10 days (for short-term
stock prediction).
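The sliding-window setup described above can be sketched as follows. This is a generic illustration only: the paper does not specify the HRBFN's exact input encoding or preprocessing, and all names here are ours.

```python
def make_windows(prices, window, horizon=1):
    """Split a chronological price series into (input window, target) pairs.

    prices  : daily closing prices in time order
    window  : number of past days fed to the network (e.g. 1-10 for
              short-term prediction, 11-45 for long-term trend prediction)
    horizon : how many days ahead the target lies (1 = next-day price)
    """
    pairs = []
    for t in range(len(prices) - window - horizon + 1):
        x = prices[t:t + window]              # the past `window` daily prices
        y = prices[t + window + horizon - 1]  # the price to be predicted
        pairs.append((x, y))
    return pairs

# e.g. a 3-day window for next-day (horizon = 1) prediction
series = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5]
samples = make_windows(series, window=3)
# samples[0] pairs [10.0, 10.2, 10.1] with the next day's price 10.4
```

Each pair then serves as one training example for the network, with the window as input vector and the future price as target.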

4.1 Round Trip Time (RTT) test


The aim of the RTT test is to evaluate the overall operational time of the iJADE
Stock Predictor, from the collection of customer stock forecasting information and
the dispatch of iJADE Brokers to the receipt and display of stock prediction results
in the user browser. The mean RTT results over 100 trials are shown in Table 1.
Table 1 - Mean RTT summary after 100 trials

                                              Time (msec)
  Action      Module located at     T1 server           T2 server
                                    (within campus)     (remote site)
  1. FRPS     Client machine            205                 221
  2. SANS                               576                2137
  3. SPS      iJADE servers            7713                7421
  4. FIRS                               432                2247
  5. SICDS    Client machine            236                 242
  Total RTT                          9.2 sec            12.3 sec

4.2 Stock prediction performance test


In this test, the ten-year (1990-1999) stock information of the 33 major Hong Kong
stocks was adopted.
Table 2 - Stock prediction performance test results
(for next-day stock price prediction)

                             Neuro Forecaster (average % error)           iJADE Stock
  Business type              Sigmoid   Hyperbolic  Neuro-    Genetica     Predictor
                                       tangent     Fuzzy                  (av. % error)
  A) Banking                 4.647%    3.457%       8.451%   3.427%       1.427%
  B) Finance & investment    3.745%    3.758%       7.845%   4.747%       1.573%
  C) Public utility          3.412%    4.285%      10.457%   2.747%       1.347%
  D) Property                4.452%    5.474%       7.457%   3.417%       1.417%
  E) Others                  5.124%    4.789%       8.982%   3.746%       1.379%

  Overall                    4.217%    4.648%       8.975%   3.417%       1.401%

For comparison purposes, a neural-network-based forecasting aid, 'Neuro
Forecaster' from Neuro Intelligent Business Software, was adopted in the test.
Neuro Forecaster provides the following neural network models:
1) Time series feed-forward back-propagation model (FFBP) with different
choices of transfer functions, including the standard sigmoid function,
hyperbolic tangent function, Neurofuzzy function, etc.

2) 'Genetica Net Builder' - based on Genetic Algorithms (GA) for the construction
and optimization of the network model.
For ease of comparison, these 33 stock items are grouped under five business
sectors, namely: banking, finance and investment, public utility, property, and
others. As shown in Table 2, the iJADE Stock Predictor outperforms all four Neuro
Forecaster models across the different business types, with an overall average
percentage error of 1.401%. This is better than Genetica (which, at 3.417%, attains
the best result among the Neuro Forecaster models) by more than 58% in terms of
reduction in percentage error.

5 Summary
In this paper, we propose the iJADE model - an innovative intelligent agent-based
model that serves as the basic framework for the development of future e-business
applications. Through the implementation of the iJADE Stock Predictor, we have
demonstrated how intelligent agent technology can be successfully and fully
integrated with other supporting technologies to provide a new era of intelligent
mobile e-business for future e-commerce development.

6 Acknowledgment
The authors are grateful for the partial support of the Departmental Grant for the
iJADE project (4.61.09.Z042) from the Hong Kong Polytechnic University.

References
1. Aglets. URL: http://www.trl.ibm.co.jp/aglets/.
2. Lee R. S. T. and Liu J. N. K., Tropical Cyclone Identification and Tracking System using
integrated Neural Oscillatory Elastic Graph Matching and hybrid RBF Network Track
mining techniques. IEEE Transactions on Neural Networks 11(3) (2000) pp. 680-689.
3. Lee Raymond, A New Era Mobile Shopping Based on Intelligent Fuzzy Neuro-based
Shopping Agents. To appear in IEEE Trans. on Consumer Electronics, (2001).
4. Lee R. S. T. and Liu J. N. K., iJADE eMiner - A Web-based Mining Agent based on
Intelligent Java Agent Development Environment (iJADE) on Internet Shopping. Lecture
Notes in Artificial Intelligence series, Springer-Verlag (2001) pp. 31-36.
5. Lee Raymond, Liu James and You Jane, iJADE WeatherMAN - A Multiagent Fuzzy-
Neuro Network Weather Prediction System. To appear in Proc. of the 2nd Asia-Pacific
Conference on Intelligent Agent Technology (IAT'2001) (2001).
6. Murphy J. J., Technical Analysis of the Futures Markets. The New York Institute of
Finance, Prentice Hall, New York (1986).
7. Ritchie J., Fundamental analysis: a back-to-the-basics investment guide to selecting
quality stocks. Chicago, Irwin Professional Pub. (1996).
8. Voyager. URL: http://www.objectspace.com/voyager/.
APPROXIMATE SENSOR FUSION IN A NAVIGATION AGENT

J. F. PETERS, S. RAMANNA, M. BORKOWSKI
Computer Engineering, Univ. of Manitoba, Winnipeg, MB R3T 5V6 Canada
Email: {jfpeters, ramanna, maciey}@ee.umanitoba.ca

A. SKOWRON
Institute of Mathematics, Warsaw Univ., Banacha 2, 02-097 Warsaw, Poland
Email: skowron@mimuw.edu.pl

A multiple sensor fusion model for a navigation agent based on rough integration
is given in this paper. A rough measure of sensor signal values provides a basis
for a discrete form of rough integral. This integral computes a form of ordered
weighted average using a weighting factor determined by a classifier in the form
of a set of "ideal" sensor values. In this paper, the focus is on classifying sensor
signals relative to a classification interval of interest in guiding the navigation of
a mobile robot. A navigation agent "looks" for rough integral values representing
sensor signals to determine appropriate movements in a particular region of space.
A navigation algorithm used by an agent to govern the movements of a mobile
robot is given.

1 Introduction

An agent is an independent process capable of responding to stimuli from its
environment and communicating with other agents in its society. The aim
of the current research is to study one form of navigation by an agent based
on rough set theory [2,3], recent work with sensors, filters and sensor fusion
[5-7], rough measures and integrals [4], and approximate reasoning by agents [8].
The contribution of this paper is the modeling of agents that classify sensor
signals using rough integration to measure the effectiveness of a navigation
plan needed to achieve an objective.
This paper is structured as follows. Section 2 provides a brief presentation
of the basic concepts underlying sensor signal analysis by a navigation agent,
namely, set approximation, rough membership functions and rough measures.
Discrete rough integrals and identification of relevant sensors are briefly
presented in Sections 3 and 4. A navigation algorithm for an agent is given in
Section 5.


2 Basic Concepts of Rough Sets

Rough set theory offers a systematic approach to set approximation [2]. To
begin, let S = (U, A) be an information system, where U is a non-empty, finite
set of objects and A is a non-empty, finite set of attributes, with a : U → V_a
for every a ∈ A. For each B ⊆ A, there is an associated equivalence relation
Ind_A(B) such that

    Ind_A(B) = {(x, x') ∈ U × U | ∀a ∈ B, a(x) = a(x')}

If (x, x') ∈ Ind_A(B), we say that objects x and x' are indiscernible from
each other relative to attributes from B. The notation [x]_B denotes the
equivalence class of x in Ind_A(B).
Definition 2.1 Let S = (U, A) be an information system, B ⊆ A, u ∈ U,
and let [u]_B be the equivalence class of an object u ∈ U in Ind_A(B). The set
function

    μ_u : ℘(U) → [0, 1], where μ_u(X) = |X ∩ [u]_B| / |[u]_B|    (1)

for any X ∈ ℘(U) is called a rough membership function (rmf).

The form of rough membership function in Def. 2.1 is slightly different
from the classical definition, where the argument of the rough membership
function is an object x and the set X is fixed [3].
Definition 2.2 Let u ∈ U. A non-negative and additive set function
ρ_u : ℘(X) → [0, ∞) defined by ρ_u(Y) = ρ'(Y ∩ [u]_B) for Y ∈ ℘(X), where
ρ' : ℘(X) → [0, ∞), is called a rough measure relative to U/Ind_A(B) and u on
the indiscernibility space (X, ℘(X), U/Ind_A(B)).
The rough membership function μ_u : ℘(X) → [0, 1] is a non-negative set
function [4].
Proposition 2.1 (Pawlak et al. [4]) The rough membership function μ_u as
defined in Definition 2.1 (formula (1)) is additive on U.
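As an illustration of Definition 2.1 (our own sketch, not the authors' code; all names are ours), the indiscernibility classes and the rough membership function can be computed directly from a small attribute table:

```python
def ind_classes(objects, attrs):
    """Partition objects into the equivalence classes of Ind_A(B):
    x ~ x' iff a(x) = a(x') for every attribute a in attrs."""
    classes = {}
    for x, row in objects.items():
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(x)
    return list(classes.values())

def rough_membership(X, u, classes):
    """mu_u(X) = |X ∩ [u]_B| / |[u]_B|  (formula (1))."""
    u_class = next(c for c in classes if u in c)   # [u]_B
    return len(X & u_class) / len(u_class)

objects = {'x1': {'a': 0}, 'x2': {'a': 1}, 'x3': {'a': 1}}
classes = ind_classes(objects, ['a'])      # partitions into {x1} and {x2, x3}
value = rough_membership({'x2'}, 'x3', classes)   # 0.5: x2 is half of [x3]_B
```

The set function is additive on U (Proposition 2.1), which is what allows it to be used as a measure in the integrals of the next section.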

3 Rough Integrals

Rough measure-based integrals were introduced in [4], and applied in [5-7]. In
this section, we introduce a particular form of a discrete rough integral defined
relative to a rough measure.
Definition 3.1 Let ρ be a rough measure on X, where the elements of X
are denoted by x_1, ..., x_n. A particular form of a discrete rough integral of
f : X → ℝ⁺ with respect to the rough measure ρ is defined by

    ∫ f dρ = Σ_{i=1}^{n} (f(x_(i)) − f(x_(i−1))) ρ(X_(i))

where ·_(i) specifies that indices have been permuted so that
0 ≤ f(x_(1)) ≤ ... ≤ f(x_(n)), X_(i) := {x_(i), ..., x_(n)}, and f(x_(0)) = 0.
This definition of the discrete rough integral is based on a formulation of
the Choquet integral in Grabisch [1].
Proposition 3.1 (Pawlak et al. [4]) Let 0 < s < r. If a(x) ∈ [s, r] for all
x ∈ X_a, then ∫ a dμ_u^e ∈ (0, r], where u ∈ U.
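The discrete integral of Definition 3.1 can be sketched as follows (our own illustration, not the authors' code): the measure is passed in as a callable evaluated on the tail sets X_(i).

```python
def rough_integral(values, measure):
    """Discrete Choquet-style rough integral of Definition 3.1.

    values  : dict mapping elements x to f(x) >= 0
    measure : callable on frozensets (the rough measure rho)
    Returns sum_i (f(x_(i)) - f(x_(i-1))) * rho(X_(i)), where the
    x_(i) sort f ascending and X_(i) = {x_(i), ..., x_(n)}.
    """
    order = sorted(values, key=values.get)      # ascending permutation
    total, prev = 0.0, 0.0
    for i, x in enumerate(order):
        total += (values[x] - prev) * measure(frozenset(order[i:]))
        prev = values[x]
    return total

# Sanity check: with the additive uniform measure |Y|/n, the Choquet
# form collapses to the ordinary mean of f.
vals = {'a': 0.1, 'b': 0.3, 'c': 0.2}
mean = rough_integral(vals, lambda Y: len(Y) / len(vals))   # 0.2
```

A non-additive rough measure makes the result an ordered weighted average rather than a plain mean, which is exactly what the fusion scheme exploits.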

4 Relevance of a Sensor

In this section, we consider the measurement of the relevance of a sensor using
a rough integral. A sensor is considered relevant in a classification effort in
the case where ∫ a dμ_u^e for a sensor a is close enough to some threshold in a
target interval of sensor values.
Example 4.1 Assume that a denotes a sensor that responds to stimuli with
measurements that govern the actions of an agent. Let {a} = B ⊆ A, where
a : U → [0, 0.5] and each sample sensor value a(x) is rounded to two decimal
places. Let (Y, U − Y) be a partition defined by an expert and let [u]_e denote a
set in this partition containing u, for a selected u ∈ U. We further assume the
elements of [u]_e are selected relative to an interval (u − e, u + e) for a selected
e > 0. We assume a decision system (X_a, a, e) is given for any considered
sensor a such that X_a ⊆ U, a : X_a → ℝ⁺, and e is an expert decision restricted
to X_a defining a partition (Y ∩ X_a, (U − Y) ∩ X_a) of X_a. Moreover, we assume
that X_a ∩ [u]_e ≠ ∅. The set [u]_e is used to classify sensors and is given the
name "classifier". Consider the following decision tables.

  Table 1(a)                       Table 1(b)

  X\{a,e}        a      e         X\{a,e}         a      e
  x1 = 0.203     0.2    0         x2  = 0.454     0.45   1
  x2 = 0.454     0.45   1         x9  = 0.455     0.46   1
  x3 = 0.453     0.45   1         x10 = 0.401     0.4    1
  x4 = 0.106     0.11   0         x11 = 0.407     0.41   1
  x5 = 0.104     0.10   0         x12 = 0.429     0.43   1

Let u = 0.425 and e = 0.2, and obtain [0.425]_e with values in the interval
[0.225, 0.625]. The aim is to fuse the sample values in each signal using a
rough integral, and to evaluate the rough integral value relative to [u]_e. From
Table 1(a) we compute ∫ a dμ_u^e = 0.1, and ∫ a dμ_u^e = 0.239 from Table 1(b). The
first integral value lies outside the target interval [0.225, 0.625] and the second
integral value falls inside [0.225, 0.625]. Let ā(u) denote the average value in the
classifier [u]_e, and let δ ∈ [0, 1]. Then, for example, the selection R of the
most relevant sensors in a set of sensors is found using

    R = { a_i ∈ B : |∫ a_i dμ_u^e − ā(u)| ≤ δ }

In effect, the integral ∫ a_i dμ_u^e serves as a filter inasmuch as it "filters"
out all sensors with integral values not close enough to ā(u).
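The relevance filter R can be sketched as follows. This is a self-contained illustration: the specific measure |Y ∩ [u]_e| / |[u]_e|, the function names, and the sample data are our assumptions based on Definitions 2.1-3.1, so the exact numeric values of Example 4.1 are not reproduced here.

```python
def choquet(values, measure):
    """Discrete rough integral of Definition 3.1 (ascending permutation)."""
    order = sorted(values, key=values.get)
    total, prev = 0.0, 0.0
    for i, x in enumerate(order):
        total += (values[x] - prev) * measure(frozenset(order[i:]))
        prev = values[x]
    return total

def relevant_sensors(readings, classifier, a_bar, delta):
    """R = { a_i in B : |(integral of a_i) - a_bar| <= delta }.

    readings   : {sensor: {sample: value}}, one decision table per sensor
    classifier : the set [u]_e of samples inside the target interval
    a_bar      : average value in the classifier
    delta      : closeness bound in [0, 1]
    """
    mu = lambda Y: len(Y & classifier) / len(classifier)   # assumed measure
    return {s for s, vals in readings.items()
            if abs(choquet(vals, mu) - a_bar) <= delta}

sensors = {'s1': {'x1': 0.4, 'x2': 0.5},   # fused value 0.45
           's2': {'x1': 0.1, 'x2': 0.1}}   # fused value 0.1
picked = relevant_sensors(sensors, classifier={'x1', 'x2'},
                          a_bar=0.45, delta=0.05)          # {'s1'}
```

Only the sensor whose fused value lies within delta of the classifier average passes the filter, matching the filtering role described for the integral.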

5 Basic Navigation Algorithm

A navigation agent begins with a universe of objects reflecting possible sensor
values, a set of sensors, a classifier set [u]_e, a signal value threshold u, boundary
values e and δ, and a time limit t.
Navigation Algorithm
Input: U, A, [u]_e, u, e, δ, t; // universe, sensors, classifier, threshold,
                               // bounds, ms delay
Constraint: |∫ a dμ_u^e − ā(u)| < δ // sensor signal within an acceptable range
Output: R // measured response to selected sensor
while (true) {
    delay(t); sample = integrate(read(sensorSignal));
    switch (sample) {
        (|sample − ā(u)| < δ) && (sample < u): moveForward;
        (sample < u − e): stop; u = calibrate(sample, u, e);
        (|sample − ā(u)| < δ) && (sample > u): stop; moveBackward;
        (sample > u + e): stop; u = calibrate(sample, u, e);
    }}
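The case analysis of the navigation algorithm can be sketched as a pure decision function (a hypothetical rendering: the ordering of the cases, the 'hold' fallback for uncovered inputs, and all names are our assumptions, not the authors' implementation):

```python
def decide(sample, u, a_bar, eps, delta):
    """Map one fused (rough-integral) sensor value to a movement command,
    mirroring the switch cases of the Navigation Algorithm.

    u     : signal value threshold
    a_bar : average value in the classifier [u]_e
    eps   : half-width of the classification interval
    delta : closeness bound on |sample - a_bar|
    """
    if sample < u - eps:          # below the classification interval
        return 'stop-and-calibrate'
    if sample > u + eps:          # above the classification interval
        return 'stop-and-calibrate'
    if abs(sample - a_bar) <= delta and sample < u:
        return 'forward'
    if abs(sample - a_bar) <= delta and sample >= u:
        return 'backward'
    return 'hold'                 # in bounds but not close enough to a_bar
```

Wrapping this function in a timed loop that reads, fuses, and acts on the sensor signal recovers the while-loop structure of the algorithm.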
6 Conclusion

This article presents an application of a discrete form of rough integral in the
design of a navigation agent for a mobile robot. This integral computes an
ordered weighted average and provides a means of sensor fusion. For a value
u in the universe of sensor values and a bound e on a classification interval, the
selection of an appropriate u and e needed to construct [u]_e is important. The
set [u]_e makes it possible to classify sensor signals inasmuch as it prescribes
a required region of space considered safe for the movements of the robot
being controlled by a navigation agent. In a sense, [u]_e provides a schema
that mediates between the sensors and the planner of an intelligent agent. In this
context, the term schema denotes a mediating representation. Hence, [u]_e is
also called a classification schema, a fundamental feature in the intelligence
of an agent. By comparing the integral value and the average value in [u]_e,
a navigation agent can decide when the calibration of u is necessary to align
the movements of a robot within the required "walking" region.

Acknowledgment

The research of Sheela Ramanna and James Peters has been supported by
the Natural Sciences and Engineering Research Council of Canada (NSERC)
research grant 194376 and research grant 185986, respectively. The research
of Maciej Borkowski has been supported by a grant from Manitoba Hydro.
The research of Andrzej Skowron has been supported by grant 8 T11C 025
19 from the State Committee for Scientific Research (KBN) and from a grant
from the Wallenberg Foundation.

References

1. M. Grabisch, Alternative expressions of the discrete Choquet integral. In:
Proc. 7th IFSA World Congress, Prague (25-29 June 1997) 472-477.
2. Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data
(Kluwer Academic Publishers, Boston, MA, 1991).
3. Z. Pawlak, A. Skowron, Rough membership functions. In: R. Yager, M.
Fedrizzi, J. Kacprzyk (Eds.), Advances in the Dempster-Shafer Theory
of Evidence (John Wiley & Sons, NY, 1994) 251-271.
4. Z. Pawlak et al., Rough measures: Theory and application. In: Proc.
Rough Set Theory and Granular Computing (RSTGR'2001), Bull. Int.
Rough Set Society 5(1/2), Matsue, Japan, 20-22 May 2001, 177-184.
5. J.F. Peters et al., Sensor fusion: A rough granular approach. In: Proc. of
Int. Fuzzy Systems Association World Congress (IFSA'01), Vancouver,
July 2001 [to appear].
6. J.F. Peters et al., Sensor fusion: A rough Petri net model. In: Proc. of
Int. Fuzzy Systems Association World Congress (IFSA'01), Vancouver,
July 2001 [to appear].
7. J.F. Peters et al., Sensor, filter, and fusion models with rough Petri nets,
Fundamenta Informaticae 34, 2001, 1-19.
8. A. Skowron, Approximate reasoning by agents in distributed environments.
In: Proc. Int. Conf. on Intelligent Agent Technology (IAT'01)
[to appear].
SIMULATING DAY-AHEAD TRADING IN ELECTRICITY MARKETS
WITH AGENTS

MAX SCHEIDT
ProCom Systemhaus und Ingenieurunternehmen GmbH, P.O.B. 1902, 52021 Aachen
Germany
E-mail: ms@procom.de

HANS-JÜRGEN SEBASTIAN
Aachen University of Technology, Templergraben 64, 52062 Aachen
Germany
E-mail: sebastian@or.rwth-aachen.de

Electricity markets all over the world are being liberalized these days. By means of this
liberalization, the former monopoly-like structure of the electricity markets is being changed
to a market structure, where the price for electricity is derived by the principles of supply and
demand.
We propose to use an agent-based simulation system as a basis for analyzing liberalized
electricity markets, their underlying dynamics and their future development.

Keywords: multi-agent system, coordination and cooperation, simulation

1 Introduction

Electricity markets all over the world are being liberalized these days. By
means of this liberalization, the former monopoly-like structure of the electricity
markets is being changed to a market structure, where the price for electricity is
derived by the principles of supply and demand. The hope is that deregulation will
result in cheaper prices by encouraging competition between electric utilities. The
shift in market regime implies a fundamental change in the market laws, in the set
of possible actions and the number of participants. This again implies that market
participants will need new strategies to stay competitive, since strategies that
worked well in the past (under a monopolistic regime) cannot be expected to work
well in the different market environment.
The new dynamics of electricity markets, i.e. the high price volatility and a
noticeable amount of short-term contracting, pose increased risks for generators and
distributors. The rising trading volume inevitably exposes the portfolios of
generating assets and various supply contracts held by traditional electric utility
companies to market price risk. In day-ahead production planning, market prices
become a pre-eminent criterion in optimizing the use of all generation facilities, of
all plants and units, combined with external purchase and sales opportunities.
We propose to use an agent-based simulation system for analyzing liberalized
electricity markets, their underlying dynamics and their future development.


In 1998 Germany passed a new Energiewirtschaftsgesetz (EnWG), ending more
than 100 years of local monopoly supply. Until then, suppliers had no focus on
competitive prices. Reliable generators and a reliable transmission network were the
main concerns from the supplier's view, as higher prices could easily be passed on
to the customer. Thus price was calculated as cost plus margin.
Within a few weeks after liberalization, a wholesale market for bulk electricity
between generators, distributors and large consumers, mainly heavy industry,
developed without the interference of any regulatory body. An informal market for
over-the-counter (OTC) trading of electricity was established. Average industry
tariffs were reduced by 35% from the beginning of the liberalization to the end of
1999, as calculated by the Dow Jones/VIK industrial electricity price index.
By the beginning of 2000, all the major generators were involved in merger
deliberations. While there were eight major electricity producers in 1998, today,
there are only six left. Those six companies operate the high-voltage grid and
control more than 80% of generation capacity [1]. Apart from the large companies
there are approximately 80 regional electricity companies and more than 800
municipalities and smaller suppliers [1].
In 2000 two power exchanges were established, in Leipzig (LPX) and
Frankfurt/Main (EEX). In both trading places, market participants trade electrical
power for each hour of the following day. While at LPX market clearing is done via
a double-sided auction, EEX defines standard products, similar to stock certificates,
which can be traded in a continuous auction during trading hours.
Price has become the central decision criterion, as can be seen from the above.
Accordingly, future prices will be the most important criterion in the planning
process, concerning risk and portfolio management, decisions on tariff structures
and marketing strategies, valuation of generating assets and transmission
capacities, and long-term investment decisions for new generating assets or
transmission capacities.
The absolute value of the spot price for electricity and its volatility are of
special interest to market participants to fulfill these tasks. Models of electricity
markets will be useful to understand the price building process. This again, might
lead to more reliable price forecasts [2].

2 Towards an agent-based simulation system

Traditional economic equilibrium models tend to look at an industry in aggregate.


In this case forecasts are based on a statistical analysis of historical data. This, of
course, requires the existence of sufficient historical data and the absence of any
structural breaks in the market under inspection. Neither prerequisite holds for
the electricity market.
In contrast to the traditional "top-down" approach, the use of agent-based
simulation (ABS) aims at understanding the underlying principles of economic
markets "bottom-up". Modeling the economic market on a microscopic level allows

one to capture many factors which cannot be represented within a traditional


economic equilibrium model. In addition, ABS brings advantages in terms of
realism, the inclusion of learning effects, and the potential for heterogeneity
between firms.
In the past, Wu et al. [3] have spent some effort on finding optimal trading
strategies in electricity markets. In a complete market with perfectly rational
participants, every trader would trade in the spot market as long as he makes a profit
with the trade, i.e. as long as the selling price is above production cost. Thus, the
market price is set by the most expensive generator used, and one should expect
system marginal cost to be a good estimator for the spot price [4].
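The marginal-cost argument can be made concrete with a toy merit-order dispatch (our illustration, not part of the cited models): units are dispatched in order of increasing marginal cost until demand is met, and the cost of the last unit dispatched sets the spot price.

```python
def clearing_price(generators, demand):
    """Merit-order dispatch.

    generators : list of (marginal_cost, capacity) pairs
    demand     : total load to be covered
    Returns the marginal cost of the last (most expensive) unit
    dispatched, i.e. the system marginal cost.
    """
    price = None
    for cost, capacity in sorted(generators):   # cheapest units first
        if demand <= 0:
            break
        demand -= capacity
        price = cost                            # marginal unit so far
    if demand > 0:
        raise ValueError('demand exceeds total capacity')
    return price

fleet = [(18.0, 500), (25.0, 300), (60.0, 200)]   # (EUR/MWh, MW), illustrative
price = clearing_price(fleet, demand=600)         # 25.0: mid-cost unit is marginal
```

Deviations from this textbook price, such as the peak prices discussed below, are precisely what motivate a behavioral, agent-based model over the pure equilibrium view.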
To produce electric power, utilities use electric generators, transforming fossil
fuel into electric power. To meet the fluctuating electric demand a utility uses a
wide range of generating units. The characteristics of these units vary in terms of
fuel used, lead time and generating capacity, which makes their scheduling a
challenging problem. Known in the literature as the unit commitment problem, the
problem electric utilities face is how to find a cost-minimizing generation plan
while meeting the electric load. Moreover, several constraints need to be satisfied.
These are imposed by the physical characteristics of the generating system and
establish inter-temporal dependencies between the hourly production plans. Ede et
al. remarked in [5, 6] that exactly those inter-temporal dependencies, e.g. startup
costs, "provide an incentive to accept losses or reduced profits in one period" [6].
Based on their findings, the authors derived a rule-of-thumb on how to bid in power
auctions.
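The unit-commitment problem with startup costs can be illustrated with a deliberately tiny brute-force sketch (field names and cost figures are our assumptions; real unit commitment also handles ramp rates and minimum up/down times, and uses dynamic programming or Lagrangian relaxation rather than enumeration):

```python
from itertools import product

def commit(units, load, horizon):
    """Brute-force unit commitment: choose the on/off schedule that meets
    the hourly load at minimum fuel-plus-startup cost."""
    n = len(units)
    best_cost, best_grid = float("inf"), None
    # Enumerate every on/off pattern for every unit and hour.
    for schedule in product([0, 1], repeat=n * horizon):
        grid = [schedule[u * horizon:(u + 1) * horizon] for u in range(n)]
        cost, feasible = 0.0, True
        for t in range(horizon):
            on = [u for u in range(n) if grid[u][t]]
            if sum(units[u]["cap"] for u in on) < load[t]:
                feasible = False
                break
            remaining = load[t]
            # dispatch the cheapest committed units first (merit order)
            for u in sorted(on, key=lambda u: units[u]["var_cost"]):
                gen = min(units[u]["cap"], remaining)
                cost += gen * units[u]["var_cost"]
                remaining -= gen
            # startup costs couple the hours: pay whenever a unit turns on
            for u in on:
                if t == 0 or not grid[u][t - 1]:
                    cost += units[u]["startup"]
        if feasible and cost < best_cost:
            best_cost, best_grid = cost, grid
    return best_cost, best_grid

units = [{"cap": 100, "var_cost": 20, "startup": 500},   # base-load unit
         {"cap": 50, "var_cost": 40, "startup": 100}]    # peaking unit
print(commit(units, [80, 120, 80], 3)[0])  # 6600.0
```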
Mount [7] analyzed time series of the wholesale market for bulk power in
Pennsylvania, New Jersey and Maryland. He found that market power
increases super-linearly with the size (in terms of installed generating capacity) of a
market player, and that there is no rational explanation for the peak prices observed.
"The best explanation for why suppliers submit very high offers is that they are
indifferent to whether or not marginal units are dispatched because there is no loss
of expected profit" [7].
It is those unique characteristics, as shown above, that lead to the expectation
that an agent-based simulation approach is a promising way to model the electricity
market in order to provide a better understanding of and deeper insights into the
underlying market principles and mechanisms.
As a first step, a simulation system has been built in which trader agents engage in
day-ahead trading on a power exchange and on an over-the-counter (OTC) basis.
Figure 1 pictures the simulation algorithm for a trading day. Upon system
startup every agent is endowed with its unique resources. An electric utility agent,
for example, is initialized with the cost function of its generating assets, information
about production capacity at the discrete points in time and a set of market
hypotheses for the market clearing price (MCP) forecast.
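The endowment handed to each agent at startup might be sketched as follows (the field names and the quadratic cost function are our illustrative assumptions, not the paper's specification):

```python
from dataclasses import dataclass, field

@dataclass
class UtilityAgent:
    """A trader agent's unique endowment at system startup."""
    name: str
    cost_coeffs: tuple          # quadratic fuel-cost coefficients (a, b, c)
    hourly_capacity: list       # production capacity at discrete points in time
    mcp_hypotheses: list = field(default_factory=lambda: ["naive", "regression"])

    def production_cost(self, q):
        a, b, c = self.cost_coeffs
        return a + b * q + c * q * q

agent = UtilityAgent("util-1", (100.0, 20.0, 0.05), [150.0] * 24)
print(agent.production_cost(100.0))  # 2600.0
```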
Each simulated morning, the trader agents forecast the load for the next day. They
may contract out this duty to a dedicated agent, as explained in section 5. A market
descriptor, summarizing the result of that day's power auction, and the load
forecast are used as input for a forecast of the market clearing price (MCP).
Forecasts are made for every hour of the next day traded.
Upon notification from the auctioneer, trader agents generate their bids for the
day-ahead power auction. A bid consists of a supply or a demand curve for each
hour of the day traded, each curve given by a fixed number of price-quantity tuples.
In order to generate a bid, trader agents set up a cost-minimizing production
schedule based on the load forecast for the day traded. Knowing this schedule, they
calculate the additional capacity which may be traded at the power exchange or in
OTC contracts. Based on their expectation of market clearing price and different
rules for price setting, the agents derive their individual bid. Rules for price setting
depend on cost and price expectation, as well as on attributes like risk attitude or
the minimum yield aimed at.

Figure 1: Simulated trading day
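A bid as described here is, per hour, a curve given by a fixed number of price-quantity tuples. One illustrative way a selling agent could derive such a curve from its marginal cost and its MCP expectation (the markup steps and the even quantity split are our assumptions):

```python
def make_hourly_bid(free_capacity, expected_mcp, marginal_cost,
                    markups=(0.95, 1.0, 1.05)):
    """Return a supply curve as a fixed number of (price, quantity) tuples.
    Prices are spread around the expected market clearing price, but never
    below marginal cost; the free capacity is split evenly over the steps."""
    step_qty = free_capacity / len(markups)
    curve = []
    for m in markups:
        price = max(marginal_cost, expected_mcp * m)
        curve.append((round(price, 2), step_qty))
    return curve

print(make_hourly_bid(90.0, expected_mcp=30.0, marginal_cost=22.0))
# [(28.5, 30.0), (30.0, 30.0), (31.5, 30.0)]
```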
Upon reception of settlement information from the auctioneer, trader agents
update their knowledge about the current state of the market. Afterwards they start
bilateral trading activities, thereby trading one contract for each hour at a time. Each
trader agent first decides whether to buy or sell in an hour and afterwards sets its
reservation price. Again, setting the reservation price depends on market
information, risk attitude and rate of utilization. Trader agents that want to sell aim at
maximizing their profit and at reaching some targeted rate of utilization across their
portfolio of generating units. This utilization target provides an incentive to accept
small losses and can be interpreted as strategic behavior, like "buying into the
market". After deciding on the reservation price, agents start looking for contractors
and, once those are identified, they begin negotiating about prices. Negotiation is
based on the Contract-Net protocol.
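A single round of the Contract-Net negotiation used for the OTC trading might look as follows (the midpoint settlement rule and all agent names are illustrative assumptions, not the paper's design):

```python
def contract_net_round(quantity, buyer_reservation, sellers):
    """One Contract-Net round for a single hourly contract.
    sellers: dict name -> reservation (minimum acceptable) price.
    The buyer awards the cheapest proposal not above its own reservation."""
    # 1. call for proposals -> 2. sellers propose their reservation prices
    proposals = {name: price for name, price in sellers.items()
                 if price <= buyer_reservation}
    if not proposals:
        return None  # all proposals refused, no contract this round
    # 3. award the best (cheapest) proposal
    winner = min(proposals, key=proposals.get)
    # settle at the midpoint between the two reservation prices (one simple rule)
    settle = (proposals[winner] + buyer_reservation) / 2
    return winner, quantity, settle

print(contract_net_round(25.0, buyer_reservation=32.0,
                         sellers={"utilA": 30.0, "utilB": 28.0, "utilC": 35.0}))
# ('utilB', 25.0, 30.0)
```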
At the end of the trading day, trader agents start a learning phase, in which they
reflect on their experiences of the last trading day and possibly adopt new strategies
or market hypotheses for the following days.
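The learning phase could, for example, re-weight an agent's market hypotheses by their forecast error on the last trading day. The exponential weighting rule below is our illustrative choice, not the paper's mechanism:

```python
import math

def reweight_hypotheses(weights, errors, eta=0.5):
    """Multiplicatively down-weight each market hypothesis by its absolute
    forecast error on the last trading day, then renormalize to sum to 1."""
    raw = {h: w * math.exp(-eta * errors[h]) for h, w in weights.items()}
    total = sum(raw.values())
    return {h: r / total for h, r in raw.items()}

w = reweight_hypotheses({"naive": 0.5, "regression": 0.5},
                        {"naive": 4.0, "regression": 1.0})
print(max(w, key=w.get))  # regression
```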

3 Preliminary Results

Using a small population of agents representing large, medium-sized and small electric
utilities, we are able to produce effects similar to those in real markets. Still, the
system needs some calibration on historical data. Furthermore, investigations into how
to reproduce peak prices will be undertaken. Within the next steps, further
psychological aspects of human decision making and of uncertainty will be
incorporated, and the learning functions of the agents will be improved.

References

1. Bower, J., D. Bunn, and C. Wattendrup, A model-based analysis of
strategic consolidation in the German electricity industry, 2000, London
Business School - Energy Markets Group: London, p. 44.
2. Kramer, N. and K. Hufendiek, Preisprognose im liberalisierten
Strommarkt. VDI Berichte, 1999(1508): p. 219-227.
3. Wu, D.J., P. Kleindorfer, and J.E. Zhang. Optimal Bidding and
Contracting Strategies in the Deregulated Electric Power Market: Part I.
in 33rd Hawaii International Conference of System Sciences - 2000. 2000.
Hawaii.
4. Kreuzberg, M., Spotpreise und Handelsflüsse auf dem europäischen
Strommarkt - Analyse und Simulation. Zeitschrift für Energiewirtschaft,
1998. 4: p. 43-63.
5. Ede, S., et al. Experimental tests of competitive markets for electric power.
in 34th Hawaii International Conference on System Sciences. 2001.
Hawaii.
6. Ede, S., et al. An Economic Analysis of the Self Commitment of Thermal
Units, in 33rd Hawaii International Conference of System Sciences - 2000.
2000. Hawaii.
7. Mount, T. Market power and price volatility in restructured markets for
electricity, in 32nd Hawaii International Conference on System Sciences -
1999. 1999. Hawaii.
USING MOBILE AGENTS TO UPDATE AND MAINTAIN
COURSE MATERIALS ON STUDENTS' COMPUTERS IN
INTERNET-BASED DISTANCE EDUCATION

HONGXUE WANG AND PETE HOLT


Centre for Computing and Information Systems
Athabasca University, Athabasca, AB T9S 3A3, Canada
Email: {harrisw,holt}@athabascau.ca

It is a general practice in Internet-based distance education to allow students to
download a complete set of course materials to their own computers, so that they
can study the course even when they are off-line. In doing so, we face the
challenge of keeping the course materials on students' machines consistent with
those on the central server. In this paper we present a mobile agent-based solution
to this practical problem in Internet-based distance education.

1 Introduction

Distance education is an effective way to remove "barriers that traditionally
restrict access to and success in university-level studies, and to increase
equality of educational opportunity" [11]. Over recent years the Internet,
and the Web in particular, has been well recognised as a good medium to
implement distance education at various levels. As a result, many universities
have been developing and offering Internet-based courses. Even some commercial
companies have joined the race to make a profit from providing Internet-based
courses.
Although the Internet can help distance education in several different
ways, distance education itself generally involves issues of course design, deliv-
ery, maintenance, student supervision and assessment. It is a general practice
in Internet-based distance education to allow students to download a complete
set of course materials to their own computers, so that they can study the
course even when they are off-line. This practice, however, results in the problem
of keeping the course materials on students' computers consistent with those
on the central course server. In this paper we present a mobile agent-based
solution to this practical problem in Internet-based distance education.

2 The design of the mobile agent

Software agents are special-purpose software objects designed for addressing
specific problems [5]. Mobile agents refer to software agents that can migrate
themselves from one computer to another over a computer network such as
the Internet, and complete their missions independently on behalf of their
originator or other agents [7]. The advantages of mobile agents over other
software techniques include, but are not limited to, great flexibility and
reduced network traffic [8]. Software at a remote site, such as in distance
education, can be updated much more easily because a mobile agent carrying an
update can travel to the site at any time. Furthermore, mobile agents can carry
out their designated tasks without supervision. Perhaps the most attractive
feature of a mobile agent is that, with certain learning mechanisms built in,
the agent can even evolve itself to accommodate the new environments and new
tasks it may encounter while it is alone [6].
The update agent is the first breed of mobile agents we are implementing
for our integrated system for Internet-based distance education [10]. In the
previous section we discussed our motivation for developing such an agent. In
this section, we present details on the mechanism of the agent and its design.

2.1 How the update mobile agent works: the mechanism


The mobile agent is built into the course delivery and presentation module
(CDM) in our integrated system, which is illustrated in Figure 1.

Figure 1. Architecture of the integrated system

In Figure 1, the CDM module runs on a course server machine within an
educational institution, while the SLM module runs on students' machines. A
student does not have to connect his/her computer to the Internet and the central
server to study the course. However, once s/he gets connected and the SLM is
active, the SLM module will send a unique message to the central course server
via the Internet. The message contains the identification of the student and
the copy of the SLM in particular, as well as the status of course materials
on the student's machine, such as when it was last updated and with which
version of updating files.
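The status message sent by the SLM could be serialized as follows (the JSON format and all field names are our assumptions; the paper does not specify a wire format):

```python
import json

def build_status_message(student_id, slm_copy_id, last_updated, last_update_file):
    """Compose the SLM -> CDM status message: who the student is, which
    copy of the SLM is running, and how current the local materials are."""
    return json.dumps({
        "student_id": student_id,
        "slm_copy": slm_copy_id,
        "materials": {
            "last_updated": last_updated,            # e.g. an ISO date
            "update_file_version": last_update_file,
        },
    })

msg = build_status_message("s123456", "slm-0042", "2001-05-14", "upd-007")
print(json.loads(msg)["materials"]["update_file_version"])  # upd-007
```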
The message is received by the CDM module on the course server and
read by the update mobile agent, which is always awake whenever the course
server and thus the CDM is running. With this message, the agent will then
check whether there are new updates for the particular student identified by the
message. If there are, the agent will then decide which update files should
be used and pack all these files into an update package. Then the agent will
travel to the student's machine and get ready to update the course materials
at whatever time the agent believes most appropriate. The whole process is
illustrated in Figure 2.

Figure 2. Actions taken by the agent before the actual update

As may have been noticed, there is a decision to be made on when the
agent should carry out the actual updating task on the student's machine. There
are several possible solutions. One thing that should be considered in making
such a decision is not to update the course material while the student is working
on the course. Therefore, we can configure the agent to update a course right
before the student exits from the SLM, or to update when the system is idle.
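The timing decision discussed above can be written as a small policy function (the three boolean signals are our illustrative abstraction of what the agent would observe):

```python
def should_update_now(student_active, slm_exiting, system_idle):
    """Decide when the agent performs the actual update: never while the
    student is working on the course; otherwise on SLM exit or when idle."""
    if student_active and not slm_exiting:
        return False
    return slm_exiting or system_idle

# The student is mid-session: wait, even if the machine looks idle.
print(should_update_now(student_active=True, slm_exiting=False, system_idle=True))  # False
```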

2.2 The system structure of the mobile agent


As we discussed in the previous section, there are mainly six tasks that
need to be done by the mobile agent: to decide whether to update; to choose
which update files should be used in a specific update; to clone itself; to
pack the selected update files with the cloned agent for migration; to migrate
itself to students' machines; and to actually update the course material. Based
on these tasks, the mobile agent is designed to include the following
modules: a decision maker, a cloning module, a packing module, an updating
module and a transport module. Their relationships are shown in Figure 3.

Figure 3. System structure of the mobile agent
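The cooperation between these modules can be sketched structurally (all class and method names are ours; the transport module is reduced here to returning a ready-to-ship package):

```python
class DecisionMaker:
    """Decision-maker module: is the student's copy out of date?"""
    def needs_update(self, materials, latest_version):
        return materials["update_file_version"] != latest_version

class Packer:
    """Packing module: bundle a cloned agent with the update files."""
    def pack(self, agent_clone, update_files):
        return {"agent": agent_clone, "files": list(update_files)}

class UpdateAgent:
    """Ties the modules together: decide, clone, pack, then migrate."""
    def __init__(self, latest_version, update_files):
        self.decision, self.packer = DecisionMaker(), Packer()
        self.latest_version, self.update_files = latest_version, update_files

    def clone(self):  # cloning module
        return UpdateAgent(self.latest_version, self.update_files)

    def handle_message(self, status):
        if not self.decision.needs_update(status["materials"], self.latest_version):
            return None  # student is already current
        return self.packer.pack(self.clone(), self.update_files)  # ready to migrate

agent = UpdateAgent("upd-008", ["lesson3.html", "quiz2.html"])
pkg = agent.handle_message({"materials": {"update_file_version": "upd-007"}})
print(pkg["files"])  # ['lesson3.html', 'quiz2.html']
```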

3 Implementation issues

As part of our integrated system for Internet-based distance education,
the mobile agent is implemented mainly within the course delivery module
(CDM), with only a few arrangements in the student learning module (SLM).
These arrangements include: generating messages which represent the identity
of the student and of the particular copy of the SLM, as well as the status of
the course material, and sending the messages to the CDM on the central course
server, usually located in the educational institution; and enabling the mobile
agent to get into the student's computer with its update package. For security
reasons, the agent is supposed to get into the SLM first. If we allow the agent
to update course material only before the SLM is actually shut down, we do not
even need to let the mobile agent live outside the SLM, so that the mobile agent
is not directly visible to the operating system.
Within the SLM, there is also a data cell storing the ID of the student
and the status of the course material, including whether it has been updated and
with which update file. This information is used to compose the message.
Inside the CDM, the modules of the mobile agent are implemented in an easily
detachable manner [2] so that course authors can update each of the modules
individually. The mobile agent is implemented in this way because
necessary changes to these modules may occur at different times, and some
modules may need to be rewritten more often than others. For example, the
update module may be regenerated every time a new update is planned, because
the update may require a different set of file additions, deletions and other
manipulations.
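A regenerated update module essentially replays a manifest of file manipulations; a minimal sketch over an in-memory file map (a real agent would operate on the SLM's course directory, and the manifest format is our assumption):

```python
def apply_update(course_files, manifest):
    """Apply an update manifest of (action, name, content) operations:
    'add' and 'replace' write a file, 'delete' removes it."""
    files = dict(course_files)  # leave the original mapping untouched
    for action, name, content in manifest:
        if action in ("add", "replace"):
            files[name] = content
        elif action == "delete":
            files.pop(name, None)
        else:
            raise ValueError("unknown action: " + action)
    return files

updated = apply_update(
    {"unit1.html": "v1", "old.html": "x"},
    [("replace", "unit1.html", "v2"),
     ("delete", "old.html", None),
     ("add", "unit2.html", "v1")],
)
print(sorted(updated))  # ['unit1.html', 'unit2.html']
```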

4 Discussion

We have presented in this paper a mobile agent designed for updating course
material on students' computers in Internet-based distance education. Over
the last few years much research on mobile agents has been done, but only a
few application areas have been found for this great idea. Our contribution
in this paper is thus twofold: identifying Internet-based distance education
as a good application area for mobile agents, and the design of a mobile agent
for course update and maintenance.

References

1. Robert H. Jackson, Jr. An Overview of Web-Based Learning,
http://www.outreach.utk.edu/weblearning/
2. Ladislau Boloni and Dan C. Marinescu, An Object-Oriented Framework
for Building Collaborative Network Agents, in Intelligent Systems and
Interfaces, A. Kandel, K. Hoffmann, D. Mlynek, and N.H. Teodorescu,
eds., pp. 31-64, Kluwer Publishing, 2000
3. Danny B. Lange, Mitsuru Oshima, Programming & Deploying Mobile
Agents with Java Aglets, Addison-Wesley, 1998.
4. William R. Cockayne, Michael Zyda, Mobile Agents, Prentice Hall, 1998.
5. Genesereth, M. & Ketchpel, S. Software agents. CACM, (1994), 5.
6. Hendler, J. (1999). Is There an Intelligent Agent in Your Future? Nature.
http://helix.nature.com/webmatters/agents/agents.html
7. Kotz, D. & Gray, R. (1999). Mobile Agents and the Future of the Inter-
net, ACM Operating Systems Review, 33(3), August 1999, pages 7-13.
8. Andre Vellino, What Are Mobile Agents Good For Anyway?,
http://ai.iit.nrc.ca/ andre/publications
9. Wang H., Holt P. Developing Mobile Agents for Internet-based Distance
Education, to appear in Proceedings of Computers and Advanced Tech-
nology in Education (CATE 2001), 2001.
10. Wang H., Holt P. The design of an integrated system for Internet-based
distance education, manuscript, being prepared for publication, 2001.
11. Athabasca University's homepage: http://www.athabascau.ca
AUTHOR INDEX

A D

Aldeco, A., 284 Davidsson, P., 385


Algorri, M.E., 284 Dilger, W., 220
Amamiya, M., 264
Amamiya, S., 264 F
Ambroszkiewicz, S., 134
Anane, R., 144 Fallah-Seghrouchni, A.E., 355
Au, S., 464 Faltings, B., 54, 98, 400
Ayesh, A., 230 Fasli, M., 74
Fong, S., 294
B Fonseka, J., 124
Finin, T., 86
Barta, J., 434
Berkovich, A., 454 G
Binder, W., 114
Bodanese, E.L., 322 Gabovich, V., 454
Boella, G., 64 Garforth, J., 479
Boman, M., 150 Gerber, A., 119
Borghuis, T., 69 Godwin, N., 144
Borkowski, M., 500 Greenwood, S., 240
Brown, S.M., 412
Buccafurri, F., 44 H
Buskens, V., 92
Hanna, T., 484
C He, L., 170
Helmy, T., 264
Cairo, O., 284 Hernandez, A.G., 160
Carlsson, B., 385 Hicks, J.D., 139
Cazalens, S., 469 Holt, P., 510
Chao, K.M., 144 Hsu, C.N., 274
Chaib-Draa, B., 252 Hulaas, J.G., 114
Chen, E., 289 Hunstock, R., 484
Cho, S.B., 305
Choi, C.Y., 405 I
Chun, H.W., 474
Cole, G.R., 245 Ishii, N., 180
Colton, S., 395 Itoh, H., 170, 235
Cuthbert, L.G., 322


Iwata, K., 180 N

J Naing, T.T., 170


Nakamura, T., 170, 235
Jeong, E., 274 Nawa, N.E., 190
Jiao, W., 104 Nederpelt, R., 69
Johansson, S.J., 80, 150 Ng, K.W., 390
Nguyen, H., 412
K Nishida, T., 3
Nolan, J.J., 334
Kalita, J.K., 300 Nowak, T., 134
Kamareddine, F., 69
Katai, O., 190 O
Kinoshita, K., 235
Klusch, M., 119 Ohsuga, S., 344
Kummeneje, J., 80
P
L
Palo, C , 490
Laaksolahti, J., 490 Palopoli, L., 44
Lam, C.P., 245 Papendick, S., 220
Lamarre, P., 469 Parameswaran, N., 464
Lee, T.O., 390 Pease, A., 395
Lee, J., 395 Pechoucek, M., 434
Lee, R., 424 Peng, S., 245
Lee, R.S.T., 495 Peng, Y., 86
Lee, S., 305 Persson, P., 490
Liu, J., 424 Peters, J.F., 500
Liu, J.N.K., 495 Petrov, P.V., 139
Loke, S.W., 124 Plumley, J., 144
Luan, X., 86
Lyback, D., 150 R

M Ramanna, S., 500


Rana, O.F., 315
Maes, P., 454 Ras, Z.W., 17
Marik, V., 434 Rohilla, P., 300
Marshall, P., 240 Rosaci, D., 44
Matskin, M., 129 Royakkers, L., 92
Mchale, S., 479 Ruckert, U., 484
Meehan, A., 479
Mutoh, A., 170, 235
Myers, G.J., 139

S Wang, X., 289


Wellner, J., 220
Saba, W.S., 444 Wobcke, W., 310
Sam-Haroud, D., 54, 98, 400 Wolpert, D.H., 365
Sarne, G.M.L., 44 Wong, K.K., 294
Santos JR., E., 412 Wong, R.Y.M., 474
Sathi, P.R., 444
Scheidt, M., 505 Y
Scheutz, M., 200
Sebastian, H.J., 505 Yang, J., 289
Seghrouchni, A.E., 160 Yang, Y., 315
Shimohara, K., 190 Yap, B., 124
Silaghi, M.C., 54, 98, 400 Yin, P.Y., 375
Sim, K.M., 405 You, J., 424
Simon, R., 334
Skowron, A., 28, 500 Z
Sloman, A., 200
Smaill, A., 395 Zaslavsky, A., 124
Soldano, H., 160 Zhang, Z., 289

Stoyen, A.D., 139


Sun, A., 294
Sun, R., 210
Sycara, K., 40

Tanaka, K., 344


Tewari, G., 454
Turner, K., 365

Vauvert, G., 355


Vidal, R.G., 114
Villazon, A., 114

Wah, B.W., 2
Walker, D.W., 315
Wang, H., 510