
COMP310

Multi-Agent Systems
Chapter 1 - Introduction

Dr Terry R. Payne
Department of Computer Science
Five Trends in the History of Computing
• ubiquity;
• interconnection;
• intelligence;
• delegation;
• human-orientation.
2 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Ubiquity
• Continual reduction in cost of computing makes it
possible to introduce processing power into places and
devices that would have once been uneconomic.

• As processing capability spreads, sophistication (and


intelligence of a sort) becomes ubiquitous.

• What could benefit from having a processor embedded in


it?
3 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Home Automation Wars
• Apple announced “HomeKit” in 2014, & rolled out
full App support in iOS10 in Sept 2016

• Siri-driven HomePod released in February 2018

• Amazon launched “Echo” in the


UK on 26th Sept, 2016

• Google announced
“Home” in May 2016, with
a launch date planned in
Nov 2016
4 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Interconnection
• Computer systems no longer stand alone, but are networked
into large distributed systems.

• Internet an obvious example, but networking is spreading its


ever-growing tentacles.

• Since distributed and concurrent systems have become the


norm, some researchers are putting forward theoretical
models that portray computing as primarily a process of
interaction.
5 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intelligence
• The complexity of tasks that we are capable of
automating and delegating to computers has grown
steadily
• Many of these tasks are ones that can be thought of as requiring a good
deal of intelligence

• If you don’t feel comfortable with this definition of


“intelligence”, it’s probably because you are a human. . .

6 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Delegation
• Computers are doing more for us . . . without
our intervention

• We are giving control to computers, even in


safety critical tasks

• One example:
• fly-by-wire aircraft, where the machine’s judgment may be
trusted more than an experienced pilot

• Next on the agenda:


• fly-by-wire cars, intelligent braking systems, cruise control
that maintains distance from car in front. . .
7 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Human Orientation
• The movement away from machine-oriented views of
programming toward concepts and metaphors that more
closely reflect the way we ourselves understand the world
• Programmers (and users!) relate to the machine differently

• Programmers conceptualize and implement software in


terms of ever higher-level – more human-oriented –
abstractions

8 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Abstractions
• Remember: most important developments in computing are based on new abstractions. Programming has progressed through:
• machine code;
• assembly language;
• machine-independent programming languages;
• sub-routines;
• procedures & functions;
• abstract data types;
• objects;
to agents, as intentional systems, which represent a further, and increasingly powerful, abstraction.

• Just as moving from machine code to higher-level languages brings an efficiency gain, so does moving from objects to agents.

• The following 2006 paper claims that developing complex applications using agent-based methods leads to an average saving of 350% in development time (and up to 500% over the use of Java):
• S. Benfield, Making a Strong Business Case for Multiagent Technology, Invited Talk at AAMAS 2006.

9 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Other Trends in Computer Science

• the Grid/Cloud;
• ubiquitous computing;
• semantic web.

10 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


The Grid/Cloud
• The Grid aims to develop massive-scale open distributed systems, capable of effectively and automatically deploying and redeploying computational (and other) resources to solve large computational problems:
• huge datasets;
• huge processing requirements.

• Current Grid research focussed mainly on middleware


11 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
The Grid and MAS
‘The Grid and agent communities are both pursuing the development of
such open distributed systems, albeit from different perspectives. The
Grid community has historically focussed on [. . . ] “brawn”:
interoperable infrastructure and tools for secure and reliable resource
sharing within dynamic and geographically distributed virtual
organisations (VOs), and applications of the same to various resource
federation scenarios.
In contrast, those working on agents have focussed on “brains”, i.e., on
the development of concepts, methodologies and algorithms for
autonomous problem solvers that can act flexibly in uncertain and
dynamic environments in order to achieve their objectives.’
(Foster et al, 2004)
12 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
The Grid and MAS

‘[P]opulations of computing entities – hardware and software - will


become an effective part of our environment, performing tasks that
support our broad purposes without our continual direction, thus
allowing us to be largely unaware of them. The vision arises because
the technology begins to lie within our grasp. This tangle of concerns,
about future systems of which we have only hazy ideas, will define a
new character for computer science over the next half-century.’
(Milner, 2006)

13 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


The Semantic Web
• The semantic web aims to annotate web
sites with semantic markup: information in a
form processable by computer, typically
relating to the content of the web site.

• The idea is that this markup will enable


browsers (etc.) to provide richer, more
meaningful services to users.
14 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Berners Lee on the Semantic Web
‘I have a dream for the Web [in which computers] become
capable of analysing all the data on the Web – the content,
links, and transactions between people and computers. A
‘Semantic Web’, which should make this possible, has yet
to emerge, but when it does, the day-to-day mechanisms
of trade, bureaucracy and our daily lives will be handled
by machines talking to machines. The ‘intelligent agents’
people have touted for ages will finally materialise.’
(Berners-Lee, 1999)

15 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agents: A First Definition

• An agent is a computer system that is


capable of independent (autonomous) action
on behalf of its user or owner
• I.e. figuring out what needs to be done to satisfy design
objectives, rather than constantly being told.

16 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Multi-Agent Systems: A First Definition
• A multiagent system is one that consists of a
number of agents, which interact with one-
another.

• In the most general case, agents will be acting


on behalf of users with different goals and
motivations.

• To successfully interact, they will require the


ability to cooperate, coordinate, and negotiate
with each other, much as people do.
17 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
A Vision: Autonomous Space Probes
• When a space probe makes its long flight from
Earth to the outer planets, a ground crew is usually
required to continually track its progress, and
decide how to deal with unexpected eventualities.
• This is costly and, if decisions are required quickly, it is simply
not practicable.
• For these reasons, organisations like NASA are seriously
investigating the possibility of making probes more autonomous
• giving them richer decision making capabilities and responsibilities.

• This is not fiction: NASA’s DS1 did it 20 years ago


in 1998!
18 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
A Vision: Internet Agents
• Searching the Internet for the answer
to a specific query can be a long and
tedious process.
• So, why not allow a computer program — an agent — to do
searches for us?
• The agent would typically be given a query that would require
synthesising pieces of information from various different Internet
information sources.
• Failure would occur when a particular resource was unavailable,
(perhaps due to network failure), or where results could not be
obtained.

19 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


The Micro and Macro Problems
• Agent design (the micro problem):
• How do we build agents that are capable of independent, autonomous action, in order to successfully carry out the tasks that we delegate to them?

• Society design (the macro problem):
• How do we build agents that are capable of interacting (cooperating, coordinating, negotiating) with other agents, in order to successfully carry out the tasks that we delegate to them, particularly when the other agents cannot be assumed to share the same interests/goals?

20 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Some Views of the Field
• Agents as a paradigm for software engineering:
• Software engineers have derived a progressively better understanding of
the characteristics of complexity in software. It is now widely recognised
that interaction is probably the most important single characteristic of
complex software.

• Agents as a tool for understanding human societies:


• Multiagent systems provide a novel tool for simulating societies, which
may help shed some light on various kinds of social processes.

21 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Some Views of the Field
• Agents are the achievable bit of the AI project:
• The aim of Artificial Intelligence as a field is to produce general human-level
intelligence. This requires a very high level of performance in lots of areas:
• Vision
• Natural language understanding/generation
• Reasoning

• Building an agent that can perform well on a narrowly defined task in a


specific environment is much much easier (though not easy).

• Systems like Deep Space 1 and the Autonomous Asteroid Exploration


Project show that this is possible.
22 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Objections to MAS
• Isn’t it all just Distributed/Concurrent
Systems?

• Isn’t it all just AI?


• Isn’t it all just Economics/Game Theory?
• Isn’t it all just Social Science?
23 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Summary
• This has been a brief introduction to “An Introduction to Multiagent Systems”.

• We have argued that MAS are:
• a natural development of computer science;
• a natural means to handle ever more distributed systems; and
• not science fiction :-)

• We also made a first definition of agent and multiagent system.

Class Reading (Chapter 1): “Readings in Distributed Artificial Intelligence”, Bond, A. H. and Gasser, L., editors (1988). Morgan Kaufmann Publishers: San Mateo, CA (Introduction).
This article is probably the best survey of the problems and issues associated with multiagent systems. Most of the issues are fundamentally still open!
24 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
COMP310
Multi-Agent Systems
Chapter 2 - Intelligent Agents

Dr Terry R. Payne
Department of Computer Science
What is an Agent?
• The main point about agents is that they are autonomous: capable of independent action.

“... An agent is a computer system that is situated in some


environment, and that is capable of autonomous action in that
environment in order to meet its delegated objectives...”

• It is all about decisions


• An agent has to choose what action to perform.
• An agent has to decide when to perform an action.

2 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agent and Environment
[Figure: the agent receives percepts from the environment through its sensors (Perception), makes a Decision, and acts on the environment through its effectors/actuators (Action); feedback flows back through perception.]
3 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Autonomy
• There is a spectrum of autonomy

Simple machines (no autonomy)  ⟷  People (full autonomy)

• Autonomy is adjustable
• Decisions handed to a higher authority when this is beneficial
4 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Simple (Uninteresting) Agents
• Control Systems
• Example: Thermostat (physical environment)
• delegated goal is maintain room temperature
• actions are heat on/off

• Software Demons
• Example: UNIX biff program (software environment)
• delegated goal is monitor for incoming email and flag it
• actions are GUI actions.

• They are trivial because the decision making they do is trivial.


5 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Agents and Objects

• Are agents just objects by another


name?
“... Agents are objects
• Object: with attitude...”

• encapsulates some state;


• communicates via message passing;
• has methods, corresponding to operations that
may be performed on this state.

6 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Differences between Agents and Objects
• Agents are autonomous:
• agents embody a stronger notion of autonomy than objects; in particular, they decide for themselves whether or not to perform an action on request from another agent.

• Agents are smart:
• capable of flexible (reactive, pro-active, social) behaviour – the standard object-oriented model has nothing to say about such types of behaviour.

• Agents are active:
• not passive service providers; a multi-agent system is inherently multi-threaded, in that each agent is assumed to have at least one thread of active control.

“Objects do it because they have to! Objects do it for free!
Agents do it because they want to! Agents do it for personal gain!”
7 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Aren’t agents just expert systems
by another name?
• Expert systems are typically disembodied ‘expertise’ about some (abstract) domain of discourse.

• agents are situated in an environment:
• MYCIN is not aware of the world — the only information it obtains is by asking the user questions.

• agents act:
• MYCIN does not operate on patients.

MYCIN is an example of an expert system that knows about blood diseases in humans. It has a wealth of knowledge about blood diseases, in the form of rules. A doctor can obtain expert advice about blood diseases by giving MYCIN facts, answering questions, and posing queries.


8 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intelligent Agents and AI
• When building an agent, we simply want a system that can choose the
right action to perform, typically in a limited domain.

• We do not have to solve all the problems of AI to build a useful agent:

“...a little intelligence goes a long way!..”

• Oren Etzioni, speaking about the commercial experience of NETBOT, Inc:

“... We made our agents dumber and dumber and dumber . . . until finally they made money...”

9 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Properties of Environments
• Since agents are in close contact with their environment, the properties
of the environment affect agents.
• Also have a big effect on those of us who build agents.

• Common to categorise environments along some different dimensions.


• Fully observable vs partially observable
• Deterministic vs non-deterministic
• Static vs dynamic
• Discrete vs continuous
• Episodic vs non-episodic
• Real Time

10 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Properties of Environments

• Fully observable vs partially observable.


• An accessible or fully observable environment is one in which the agent can obtain complete,
accurate, up-to-date information about the environment’s state.
• Most moderately complex environments (including, for example, the everyday physical world and
the Internet) are inaccessible, or partially observable.
• The more accessible an environment is, the simpler it is to build agents to operate in it.

11 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Properties of Environments

• Deterministic vs non-deterministic.
• A deterministic environment is one in which any action has a single guaranteed effect — there is
no uncertainty about the state that will result from performing an action.
• The physical world can to all intents and purposes be regarded as non-deterministic.
• We'll follow Russell and Norvig in calling environments stochastic if we quantify the non-
determinism using probability theory.
• Non-deterministic environments present greater problems for the agent designer.

12 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Properties of Environments

• Static vs dynamic.
• A static environment is one that can be assumed to remain unchanged except by the
performance of actions by the agent.
• A dynamic environment is one that has other processes operating on it, and which hence
changes in ways beyond the agent’s control.
• The physical world is a highly dynamic environment.
• One reason an environment may be dynamic is the presence of other agents.

13 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Properties of Environments
• Discrete vs continuous.
• An environment is discrete if there are a fixed,
finite number of actions and percepts in it.
• Otherwise it is continuous

• Russell and Norvig give a chess game as an


example of a discrete environment, and taxi
driving as an example of a continuous one.

• Often we treat a continuous


environment as discrete for simplicity
14 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Properties of Environments
• Episodic vs non-episodic
• In an episodic environment, the performance of an agent is dependent on a number of discrete
episodes, with no link between the performance of an agent in different scenarios.
• An example of an episodic environment would be an assembly line where an agent had to spot defective
parts.
• Episodic environments are simpler from the agent developer’s perspective because the agent
can decide what action to perform based only on the current episode — it need not reason
about the interactions between this and future episodes.
• This is related to the Markov property.
• Environments that are not episodic are sometimes called non-episodic or sequential.
• Here the current decision affects future decisions.
• Driving a car is sequential.

15 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Properties of Environments

• Real time
• A real time interaction is one in which time plays a part in evaluating an agent's performance
• Such interactions include those in which:
• A decision must be made about some action within a given time bound
• Some state of affairs must occur as quickly as possible
• An agent has to repeat some task, with the objective to repeat the task as often as possible

16 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Intelligent Agents

• We typically think of an intelligent agent as exhibiting three types of behaviour:
• Reactive (environment aware)
• Pro-active (goal-driven);
• Social Ability.

17 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Reactivity
• If a program’s environment is guaranteed to be fixed, the
program need never worry about its own success or
failure

• Program just executes blindly.


• Example of fixed environment: compiler.

• The real world is not like that: most environments are


dynamic and information is incomplete.
18 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Reactivity

• Software is hard to build for dynamic domains: program


must take into account possibility of failure
• ask itself whether it is worth executing!

• A reactive system is one that maintains an ongoing


interaction with its environment, and responds to changes
that occur in it (in time for the response to be useful).

19 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Proactiveness
• Reacting to an environment is easy
• e.g., stimulus → response rules

• But we generally want agents to do things for us.


• Hence goal directed behaviour.

• Pro-activeness = generating and attempting to achieve


goals; not driven solely by events; taking the initiative.
• Also: recognising opportunities.

20 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Social Ability
• The real world is a multi-agent environment: we cannot go
around attempting to achieve goals without taking others into
account.
• Some goals can only be achieved by interacting with others.
• Similarly for many computer environments: witness the INTERNET.

• Social ability in agents is the ability to interact with other


agents (and possibly humans) via cooperation, coordination,
and negotiation.
• At the very least, it means the ability to communicate. . .
21 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Social Ability: Cooperation

• Cooperation is working together as


a team to achieve a shared goal.

• Often prompted either by the fact that


no one agent can achieve the goal
alone, or that cooperation will obtain
a better result (e.g., get result faster).

22 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Social Ability: Coordination
• Coordination is managing the
interdependencies between
activities.

• For example, if there is a non-


sharable resource that you want to
use and I want to use, then we need
to coordinate.

23 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Social Ability: Negotiation
• Negotiation is the ability to reach
agreements on matters of common
interest.

• For example:
• You have one TV in your house; you want to watch a
movie, your housemate wants to watch football.
• A possible deal: watch football tonight, and a movie
tomorrow.

• Typically involves offer and counter-offer,


with compromises made by participants.
24 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Some Other Properties...
• Mobility
• The ability of an agent to move. For software agents this movement is around an electronic network.

• Rationality
• Whether an agent will act in order to achieve its goals, and will not deliberately act so as to prevent its goals being achieved.

• Veracity
• Whether an agent will knowingly communicate false information.

• Benevolence
• Whether agents have conflicting goals, and thus whether they are inherently helpful.

• Learning/adaption
• Whether agents improve performance over time.

25 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agents as Intentional Systems
• When explaining human activity, it is often useful to make statements
such as the following:
• Janine took her umbrella because she believed it was going to rain.
• Michael worked hard because he wanted to possess a PhD.

• These statements make use of a folk psychology, by which human


behaviour is predicted and explained through the attribution of attitudes
• e.g. believing, wanting, hoping, fearing ...

• The attitudes employed in such folk psychological descriptions are


called the intentional notions.

26 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Dennett on Intentional Systems
• The philosopher Daniel Dennett coined the term
intentional system to describe entities:
“... whose behaviour can be predicted by the method
of attributing belief, desires and rational acumen...”

• Dennett identifies different ‘grades’ of intentional system:


“... A first-order intentional system has beliefs and desires (etc.)
but no beliefs and desires about beliefs and desires...
... A second-order intentional system is more sophisticated; it
has beliefs and desires (and no doubt other intentional states)
about beliefs and desires (and other intentional states) — both
those of others and its own...”

• Is it legitimate or useful to attribute beliefs, desires, and


so on, to computer systems?
27 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
McCarthy on Intentional Systems

• John McCarthy argued that there are occasions when the


intentional stance is appropriate:
28 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
McCarthy on Intentional Systems
“... To ascribe beliefs, free will, intentions, consciousness, abilities, or wants to a machine is
legitimate when such an ascription expresses the same information about the machine that it
expresses about a person. It is useful when the ascription helps us understand the structure of the
machine, its past or future behaviour, or how to repair or improve it. It is perhaps never logically
required even for humans, but expressing reasonably briefly what is actually known about the state of
the machine in a particular situation may require mental qualities or qualities isomorphic to them.
Theories of belief, knowledge and wanting can be constructed for machines in a simpler setting than
for humans, and later applied to humans. Ascription of mental qualities is most straightforward for
machines of known structure such as thermostats and computer operating systems, but is most
useful when applied to entities whose structure is incompletely known ...”

29 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
What can be described with the intentional stance?

• As it turns out, more or less anything can. . . consider a light switch:

“... It is perfectly coherent to treat a light switch as a


(very cooperative) agent with the capability of
transmitting current at will, who invariably transmits
current when it believes that we want it transmitted
and not otherwise; flicking the switch is simply our
way of communicating our desires …”
Yoav Shoham

• But most adults would find such a description absurd!


• Why is this?

30 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Intentional Systems

• It provides us with a familiar, non-technical way of


understanding and explaining agents.
31 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
What can be described with the intentional stance?

• The answer seems to be that while the intentional stance description is


consistent:
“... it does not buy us anything, since we essentially
understand the mechanism sufficiently to have a simpler,
mechanistic description of its behaviour ...” (Yoav Shoham)

• Put crudely, the more we know about a system, the less we need to rely on
animistic, intentional explanations of its behaviour.

• But with very complex systems, a mechanistic explanation of their behaviour may not be practicable.
• As computer systems become ever more complex, we need more powerful abstractions and
metaphors to explain their operation — low level explanations become impractical.
• The intentional stance is such an abstraction.

32 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agents as Intentional Systems
• So agent theorists start from the (strong) view of agents as intentional systems: ones whose simplest consistent description requires the intentional stance.

• This intentional stance is an abstraction tool...
• ... a convenient way of talking about complex systems, which allows us to predict and explain their behaviour without having to understand how the mechanism actually works.

• Most important developments in computing are based on new abstractions:
• procedural abstraction, abstract data types, objects, etc.

• Agents, and agents as intentional systems, represent a further, and increasingly powerful, abstraction.

So why not use the intentional stance as an abstraction tool in computing — to explain, understand, and, crucially, program computer systems, through the notion of “agents”?
33 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Agents as Intentional Systems
• There are other arguments in favour of this idea...

1. Characterising Agents
• It provides us with a familiar, non-technical way of understanding and explaining agents.

2. Nested Representations
• It gives us the potential to specify systems that include representations of other systems.
• It is widely accepted that such nested representations are essential for agents that must cooperate with other agents.
• “If you think that Agent B knows x, then move to location L”.

Example (North by Northwest): Eve Kendell knows that Roger Thornhill is working for the FBI. Eve believes that Philip Vandamm suspects that she is helping Roger. This, in turn, leads Eve to believe that Philip thinks she is working for the FBI (which is true). By pretending to shoot Roger, Eve hopes to convince Philip that she is not working for the FBI.
34 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agents as Intentional Systems

• There are other arguments in favour of this idea...


3.Post-Declarative Systems
• In procedural programming, we say exactly what a system should do;
• In declarative programming, we state something that we want to achieve, give the system
general info about the relationships between objects, and let a built-in control mechanism (e.g.,
goal-directed theorem proving) figure out what to do;
• With agents, we give a high-level description of the delegated goal, and let the control
mechanism figure out what to do, knowing that it will act in accordance with some built-in
theory of rational agency.

35 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Post-Declarative Systems
• What is this built-in theory?
• Method of combining:
• What you believe about the world.
• What you desire to bring about

• Establish a set of intentions


• Then figure out how to make these happen.
[Image: DS1 seen 2.3 million miles from Earth]

36 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Abstract Architectures for Agents
• Assume the world may be in any of a finite set $E$ of discrete, instantaneous states: $E = \{e, e', \ldots\}$

• Agents are assumed to have a repertoire of possible actions, $Ac$, available to them, which transform the state of the world: $Ac = \{\alpha, \alpha', \ldots\}$
• Actions can be non-deterministic, but only one state ever results from an action.

• A run, $r$, of an agent in an environment is a sequence of interleaved world states and actions:
$$r : e_0 \xrightarrow{\alpha_0} e_1 \xrightarrow{\alpha_1} e_2 \xrightarrow{\alpha_2} e_3 \xrightarrow{\alpha_3} \cdots \xrightarrow{\alpha_{u-1}} e_u$$
37 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Abstract Architectures for Agents (1)
• When actions are deterministic each state has only one possible
successor.
• A run would look something like the following:

[Figure: a run on a grid in which the agent moves North, then North.]
38 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Abstract Architectures for Agents (2)
• When actions are deterministic each state has only one possible
successor.
• A run would look something like the following:

[Figure: a run on the grid in which the agent moves North, then East.]
39 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Abstract Architectures for Agents

[Figure: the two North moves shown as a graph of states connected by actions.]

We could illustrate
this as a graph...

40 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Abstract Architectures for Agents

[Figure: the same grid, now with non-deterministic North moves leading to several possible successor states.]

When actions are non-


deterministic a run (or
trajectory) is the same, but
the set of possible runs is
more complex.

41 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Runs
• In fact it is more complex still, because all of the runs we
pictured start from the same state.

• Let: $\mathcal{R}$ be the set of all such possible finite sequences (over $E$ and $Ac$); $\mathcal{R}^{Ac}$ be the subset of these that end with an action; and $\mathcal{R}^{E}$ be the subset of these that end with a state.

• We will use r,r′,... to stand for the members of R


• These sets of runs contain all runs from all starting states.

42 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Environments
• A state transformer function represents the behaviour of the environment: $\tau : \mathcal{R}^{Ac} \to 2^{E}$
• Note that environments are...
• history dependent: the next state is not only dependent on the most recent action of the agent; an earlier action may also be significant.
• non-deterministic: there is some uncertainty about the result.

• If $\tau(r) = \emptyset$ there are no possible successor states to $r$, so we say the run has ended. (“Game over.”)

• An environment $Env$ is then a triple $Env = \langle E, e_0, \tau \rangle$ where $E$ is a set of states, $e_0 \in E$ is the initial state, and $\tau$ is the state transformer function. (A minimal code sketch of this triple follows below.)
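To make the triple concrete, here is a minimal Python sketch (all names and the example transformer are illustrative assumptions, not from the slides):

```python
# Sketch of Env = <E, e0, tau> with a history-dependent, non-deterministic transformer.
from typing import Callable, FrozenSet, Tuple, Union

State = str
Action = str
Run = Tuple[Union[State, Action], ...]   # alternating e0, a0, e1, a1, ...

class Environment:
    def __init__(self, states: FrozenSet[State], e0: State,
                 tau: Callable[[Run], FrozenSet[State]]):
        self.states = states   # E: the set of environment states
        self.e0 = e0           # initial state
        self.tau = tau         # maps a run ending with an action to the possible next states

    def successors(self, run: Run) -> FrozenSet[State]:
        # The empty set means the run has ended ("game over").
        return self.tau(run)

# Illustrative two-state environment: "toggle" flips the most recent state.
def tau(run: Run) -> FrozenSet[State]:
    last_state, last_action = run[-2], run[-1]
    if last_action == "toggle":
        return frozenset({"on" if last_state == "off" else "off"})
    return frozenset()

env = Environment(frozenset({"on", "off"}), "off", tau)
print(env.successors(("off", "toggle")))   # frozenset({'on'})
```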

43 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agents
• We can think of an agent as being a function which maps runs (ending with a state) to actions: $Ag : \mathcal{R}^{E} \to Ac$

• Thus an agent makes a decision about what action to


perform
• based on the history of the system that it has witnessed to date.

• Let Ag be the set of all agents.


44 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
System
• A system is a pair containing an agent and an
environment.

• Any system will have associated with it a set of possible


runs
• We denote the set of runs of agent Ag in environment Env by:

R(Ag, Env)

• Assume that this only contains runs that have ended.


45 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Systems
Formally, a sequence $(e_0, \alpha_0, e_1, \alpha_1, e_2, \ldots)$ represents a run of an agent $Ag$ in environment $Env = \langle E, e_0, \tau \rangle$ if:

1. $e_0$ is the initial state of $Env$;

2. $\alpha_0 = Ag(e_0)$; and

3. for $u > 0$,
$e_u \in \tau((e_0, \alpha_0, \ldots, \alpha_{u-1}))$ and
$\alpha_u = Ag((e_0, \alpha_0, \ldots, e_u))$
46 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Why the notation?
• Well, it allows us to get a precise handle on some ideas about
agents.
• For example, we can tell when two agents are the same.

• Of course, there are different meanings for “same”. Here is one


specific one.
Two agents are said to be behaviourally equivalent with respect to $Env$ iff $\mathcal{R}(Ag_1, Env) = \mathcal{R}(Ag_2, Env)$.

• We won’t be able to tell two such agents apart by watching what


they do.
47 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Deliberative Agents
[Figure: a grid world in which the same state can be reached via different routes (different sequences of North, East and West moves).]

Potentially the agent will reach a different decision when it reaches the same state by different routes.
48 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Purely Reactive Agents
• Some agents decide what to do without reference to their
history
• they base their decision making entirely on the present, with no reference
at all to the past.

• We call such agents purely reactive:


$$action : E \to Ac$$

• A thermostat is a purely reactive agent:
$$action(e) = \begin{cases} \text{off} & \text{if } e = \text{temperature OK} \\ \text{on} & \text{otherwise.} \end{cases}$$
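A minimal sketch of this purely reactive agent in Python (assuming a simple string-valued environment state, which is an illustrative choice rather than anything from the slides):

```python
# Purely reactive: the action depends only on the current state, not on the run so far.
def thermostat_action(e: str) -> str:
    return "off" if e == "temperature OK" else "on"

print(thermostat_action("temperature OK"))  # off
print(thermostat_action("too cold"))        # on
```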
49 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Reactive Agents
[Figure: the same grid world as before.]

A reactive agent will always do the same thing in the same state.

50 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Purely Reactive Robots

• A simple reactive program for a robot might be:


• Drive forward until you bump into something. Then, turn to the right.
Repeat.

51 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agents with State

[Figure: an agent with internal state, situated in an environment: the see function turns environment states into percepts, the next function updates the internal state, and the action function selects the action to perform on the environment.]
52 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Perception
• The see function is the agent’s ability to observe its
environment, whereas the action function represents the
agent’s decision making process.

• The output of the see function is a percept: $see : E \to Per$
• ...which maps environment states to percepts.

• The agent has some internal data structure, which is typically


used to record information about the environment state and
history.

• Let I be the set of all internal states of the agent.


53 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Actions and Next State Functions
• The action-selection function action is now defined as a mapping from internal states to actions: $action : I \to Ac$

• An additional function next is introduced, which maps an internal state and percept to an internal state: $next : I \times Per \to I$

• This says how the agent updates its view of the


world when it gets a new percept.
54 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Agent Control Loop
1. Agent starts in some initial internal state i0 .
2. Observes its environment state e, and generates a percept see(e).
3. Internal state of the agent is then updated via next function, becoming
next(i0 , see(e)).

4. The action selected by the agent is action(next(i0 , see(e))).


This action is then performed.
5. Goto (2).
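A small Python rendering of this loop; see, next and action are the functions just defined, while get_environment_state and perform are hypothetical hooks standing in for the surrounding system:

```python
# Sketch of the control loop for an agent with internal state.
def run_agent(i0, see, next_fn, action, get_environment_state, perform, steps=100):
    i = i0                              # 1. start in the initial internal state i0
    for _ in range(steps):              # 5. repeat (bounded here so the sketch terminates)
        e = get_environment_state()     # 2. observe the environment state e...
        p = see(e)                      #    ...and generate the percept see(e)
        i = next_fn(i, p)               # 3. update the internal state: next(i, see(e))
        perform(action(i))              # 4. select and perform action(next(i, see(e)))
```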

55 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Tasks for Agents

• We build agents in order to carry out tasks


for us.
• The task must be specified by us. . .

• But we want to tell agents what to do


without telling them how to do it.
• How can we make this happen???

56 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Utility functions
• One idea:
• associate rewards with states that we want agents to bring about.
• We associate utilities with individual states
• the task of the agent is then to bring about states that maximise utility.

• A task specification is then a function which associates a real number with every environment state: $u : E \to \mathbb{R}$

57 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Local Utility Functions
• But what is the value of a run...
• minimum utility of state on run?
• maximum utility of state on run?
• sum of utilities of states on run?
• average?

• Disadvantage:
• difficult to specify a long term view when assigning utilities to individual states.

• One possibility:
• a discount for states later on. This is what we do in reinforcement learning.
58 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example of local utility function
• The goal is to select actions to maximise future rewards.
• Each action results in moving to a state with some assigned reward.
• Allocation of that reward may be immediate or delayed (e.g. until the end of the run).
• It may be better to sacrifice immediate reward to gain more long-term reward.

• We can illustrate this with a simple 4x3 grid environment:
• every state has reward r = -0.04 unless stated otherwise; one terminal state has r = +1 👍 and another has r = -1 👎.
• What actions maximise the reward?
59 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example of local utility function
• Assume the environment is deterministic: the agent is guaranteed to end up in the intended cell (i.e. probability = 1.0), and r = -0.04 unless stated otherwise.

• The optimal solution is:
• [Up, Up, Right, Right, Right]

• The additive reward is:
• r = (-0.04 x 4) + 1.0
• r = 1.0 - 0.16 = 0.84
• i.e. the utility gained is the sum of the rewards received.

• The negative (-0.04) reward incentivises the agent to reach its goal as soon as possible.

60 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Sequential Decision Making
Returning to our earlier example (r = -0.04 unless stated otherwise):

• Now assume the environment is non-deterministic: the agent may fail to reach its intended cell, moving as intended with probability 0.8 but sideways with probability 0.1 in each direction.

• Probability of reaching the goal if every step succeeds:
• p = 0.8^5 = 0.32768

• The agent could also reach the goal accidentally by going the wrong way round:
• p = 0.1^4 x 0.8 = 0.0001 x 0.8 = 0.00008

• Final probability of reaching the goal: p = 0.32776

• The utility gained depends on the route taken.
• We will see later how to compute this...
• Reinforcement Learning builds upon this type of model. (A quick arithmetic check follows below.)
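A quick check of the arithmetic on this slide (assuming, as above, a 0.8 chance of moving as intended at each of the five steps, and 0.1 for each sideways slip):

```python
p_intended = 0.8 ** 5            # probability of the intended 5-step route, ~0.32768
p_wrong_way = (0.1 ** 4) * 0.8   # four sideways slips then one intended move, ~0.00008
print(round(p_intended, 5), round(p_wrong_way, 5), round(p_intended + p_wrong_way, 5))
# 0.32768 8e-05 0.32776
```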
61 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Utilities over Runs
• Another possibility: assign a utility not to individual states, but to runs themselves: $u : \mathcal{R} \to \mathbb{R}$
• Such an approach takes an inherently long term view.
• Other variations:
• incorporate probabilities of different states emerging.

• To see where utilities might come from, let’s look at an example.


62 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Utility in the Tileworld
• A simulated two-dimensional grid environment on which there are agents, tiles, obstacles, and holes.

• An agent can move in four directions:
• up, down, left, or right
• If it is located next to a tile, it can push it.

• Holes have to be filled up with tiles by the agent.
• An agent scores points by filling holes with tiles, with the aim being to fill as many holes as possible.

• TILEWORLD changes with the random appearance and disappearance of holes.

[Figure: the agent starts to push a tile towards the hole; but then the hole disappears; later, a much more convenient hole appears (bottom right).]
63 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Utilities in the Tileworld
• Utilities are associated over runs, so that more holes filled
is a higher utility.
• The utility function is defined as follows:
$$\hat{u}(r) = \frac{\text{number of holes filled in } r}{\text{number of holes that appeared in } r}$$
• Thus:
• if agent fills all holes, utility = 1.
• if agent fills no holes, utility = 0.

• TILEWORLD captures the need for reactivity and for the


advantages of exploiting opportunities.
64 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Expected Utility
• To denote the probability that run $r$ occurs when agent $Ag$ is placed in environment $Env$, we write: $P(r \mid Ag, Env)$
• In a non-deterministic environment, for example, this can be computed from the probability of each step.

For a run $r = (e_0, \alpha_0, e_1, \alpha_1, e_2, \ldots)$:
$$P(r \mid Ag, Env) = P(e_1 \mid e_0, \alpha_0)\, P(e_2 \mid e_1, \alpha_1) \cdots$$

and clearly:
$$\sum_{r \in \mathcal{R}(Ag, Env)} P(r \mid Ag, Env) = 1.$$

65 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Expected Utility
• The expected utility (EU) of agent $Ag$ in environment $Env$ (given $P$, $u$) is then:
$$EU(Ag, Env) = \sum_{r \in \mathcal{R}(Ag, Env)} u(r)\, P(r \mid Ag, Env).$$

• That is, for each run we compute the utility and multiply it
by the probability of the run.

• The expected utility is then the sum of all of these.
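As a minimal sketch (assuming each possible run has already been enumerated with its utility and probability; the helper name is illustrative):

```python
# EU(Ag, Env) = sum over runs of u(r) * P(r | Ag, Env)
def expected_utility(runs):
    """runs: iterable of (utility, probability) pairs, one pair per possible run."""
    return sum(u * p for u, p in runs)

print(expected_utility([(1.0, 0.25), (0.0, 0.75)]))   # 0.25
```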


66 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Expected Utility
• The probability of a run can be determined from individual
actions within a run
• Using the decomposability axiom from Utility Theory

“... Compound lotteries can be reduced to simpler ones using the laws of probability. Known as the “no fun in gambling” axiom, as two consecutive lotteries can be compressed into a single equivalent lottery...”

[Figure: the compound lottery in which $e_0$ leads to $e_1$ with probability $p$ and to $e_2$ with probability $(1-p)$, and $e_2$ in turn leads to $e_3$ with probability $q$ and to $e_4$ with probability $(1-q)$, is equivalent to the single lottery in which $e_0$ leads to $e_1$ with probability $p$, to $e_3$ with probability $(1-p)q$, and to $e_4$ with probability $(1-p)(1-q)$.]
67 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Optimal Agents
• The optimal agent $Ag_{opt}$ in an environment $Env$ is the one that maximizes expected utility:
$$Ag_{opt} = \arg\max_{Ag \in \mathcal{AG}} EU(Ag, Env)$$

• Of course, the fact that an agent is optimal does not


mean that it will be best; only that on average, we can
expect it to do best.
68 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example 1

Consider the environment $Env_1 = \langle E, e_0, \tau \rangle$ defined as follows:
$$E = \{e_0, e_1, e_2, e_3, e_4, e_5\}$$
$$\tau(e_0 \xrightarrow{\alpha_0}) = \{e_1, e_2\}$$
$$\tau(e_0 \xrightarrow{\alpha_1}) = \{e_3, e_4, e_5\}$$

There are two agents possible with respect to this environment:
$$Ag_1(e_0) = \alpha_0 \qquad Ag_2(e_0) = \alpha_1$$

The probabilities of the various runs are as follows:
$$P(e_0 \xrightarrow{\alpha_0} e_1 \mid Ag_1, Env_1) = 0.4$$
$$P(e_0 \xrightarrow{\alpha_0} e_2 \mid Ag_1, Env_1) = 0.6$$
$$P(e_0 \xrightarrow{\alpha_1} e_3 \mid Ag_2, Env_1) = 0.1$$
$$P(e_0 \xrightarrow{\alpha_1} e_4 \mid Ag_2, Env_1) = 0.2$$
$$P(e_0 \xrightarrow{\alpha_1} e_5 \mid Ag_2, Env_1) = 0.7$$

Assume the utility function $u_1$ is defined as follows:
$$u_1(e_0 \xrightarrow{\alpha_0} e_1) = 8$$
$$u_1(e_0 \xrightarrow{\alpha_0} e_2) = 11$$
$$u_1(e_0 \xrightarrow{\alpha_1} e_3) = 70$$
$$u_1(e_0 \xrightarrow{\alpha_1} e_4) = 9$$
$$u_1(e_0 \xrightarrow{\alpha_1} e_5) = 10$$

What are the expected utilities of the agents for this utility function?
69 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example 1 Solution

From the question, the transition function is defined as $\tau(e_0 \xrightarrow{\alpha_0}) = \{e_1, e_2\}$ and $\tau(e_0 \xrightarrow{\alpha_1}) = \{e_3, e_4, e_5\}$. The probabilities of the various runs (two for the first agent and three for the second) are given in the question, along with the probability of each run occurring. Given the definition of the utility function $u_1$, the expected utilities of agents $Ag_1$ and $Ag_2$ in environment $Env_1$ can be calculated using:
$$EU(Ag, Env) = \sum_{r \in \mathcal{R}(Ag, Env)} u(r)\, P(r \mid Ag, Env).$$

This is equivalent to calculating, over the runs, the sum of the product of the utility of each run with the probability of performing that run; i.e.

• Utility of $Ag_1$ = $(0.4 \times 8) + (0.6 \times 11) = 9.8$

• Utility of $Ag_2$ = $(0.1 \times 70) + (0.2 \times 9) + (0.7 \times 10) = 15.8$

Therefore agent $Ag_2$ is optimal. (A short check of these numbers in code follows below.)
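A short check of these numbers (a sketch; the utilities and probabilities are copied from the example):

```python
eu = lambda runs: sum(u * p for u, p in runs)    # (utility, probability) per run
runs_ag1 = [(8, 0.4), (11, 0.6)]
runs_ag2 = [(70, 0.1), (9, 0.2), (10, 0.7)]
print(eu(runs_ag1), eu(runs_ag2))                # ~9.8 and ~15.8, so Ag2 is optimal
```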


70 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example 2

Consider the environment $Env_1 = \langle E, e_0, \tau \rangle$ defined as follows:
$$E = \{e_0, e_1, e_2, e_3, e_4, e_5\}$$
$$\tau(e_0 \xrightarrow{\alpha_0}) = \{e_1, e_2\}$$
$$\tau(e_1 \xrightarrow{\alpha_1}) = \{e_3\}$$
$$\tau(e_2 \xrightarrow{\alpha_2}) = \{e_4, e_5\}$$

There are two agents, $Ag_1$ and $Ag_2$, with respect to this environment:
$$Ag_1(e_0) = \alpha_0 \qquad Ag_1(e_1) = \alpha_1$$
$$Ag_2(e_0) = \alpha_0 \qquad Ag_2(e_2) = \alpha_2$$

The probabilities of the various runs are as follows:
$$P(e_0 \xrightarrow{\alpha_0} e_1 \mid Ag_1, Env_1) = 0.5$$
$$P(e_0 \xrightarrow{\alpha_0} e_2 \mid Ag_1, Env_1) = 0.5$$
$$P(e_1 \xrightarrow{\alpha_1} e_3 \mid Ag_1, Env_1) = 1.0$$
$$P(e_0 \xrightarrow{\alpha_0} e_1 \mid Ag_2, Env_1) = 0.1$$
$$P(e_0 \xrightarrow{\alpha_0} e_2 \mid Ag_2, Env_1) = 0.9$$
$$P(e_2 \xrightarrow{\alpha_2} e_4 \mid Ag_2, Env_1) = 0.4$$
$$P(e_2 \xrightarrow{\alpha_2} e_5 \mid Ag_2, Env_1) = 0.6$$

Assume the utility function $u_1$ is defined as follows:
$$u_1(e_0 \xrightarrow{\alpha_0} e_1) = 4$$
$$u_1(e_0 \xrightarrow{\alpha_0} e_2) = 3$$
$$u_1(e_1 \xrightarrow{\alpha_1} e_3) = 7$$
$$u_1(e_2 \xrightarrow{\alpha_2} e_4) = 3$$
$$u_1(e_2 \xrightarrow{\alpha_2} e_5) = 2$$

What are the expected utilities of the agents for this utility function?

71 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Example 2 solution

[Figure: the runs of each agent drawn as trees.
Agent 1: $e_0 \xrightarrow{\alpha_0} e_1$ with $p = 0.5$, then $e_1 \xrightarrow{\alpha_1} e_3$ with $p = 1.0$; or $e_0 \xrightarrow{\alpha_0} e_2$ with $p = 0.5$.
Agent 2: $e_0 \xrightarrow{\alpha_0} e_1$ with $p = 0.1$; or $e_0 \xrightarrow{\alpha_0} e_2$ with $p = 0.9$, then $e_2 \xrightarrow{\alpha_2} e_4$ with $p = 0.4$ or $e_2 \xrightarrow{\alpha_2} e_5$ with $p = 0.6$.]
72 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example 2 solution

[Figure: the same trees, annotated with the probability of each complete run.
Agent 1: $e_0 \to e_1 \to e_3$ with $p = 0.5 \times 1.0 = 0.5$; $e_0 \to e_2$ with $p = 0.5$.
Agent 2: $e_0 \to e_1$ with $p = 0.1$; $e_0 \to e_2 \to e_4$ with $p = 0.9 \times 0.4 = 0.36$; $e_0 \to e_2 \to e_5$ with $p = 0.9 \times 0.6 = 0.54$.]
73 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example 2 solution

[Figure: the same trees, now annotated with the utility of each run, found by summing the utilities of the steps along that run.
Agent 1: $e_0 \to e_1 \to e_3$ has $u = 4 + 7 = 11$; $e_0 \to e_2$ has $u = 3$.
Agent 2: $e_0 \to e_1$ has $u = 4$; $e_0 \to e_2 \to e_4$ has $u = 3 + 3 = 6$; $e_0 \to e_2 \to e_5$ has $u = 3 + 2 = 5$.]
74 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example 2 solution

           Run        Utility   Probability
Agent 1:   e0 → e3    u = 11    p = 0.5
           e0 → e2    u = 3     p = 0.5
Agent 2:   e0 → e1    u = 4     p = 0.1
           e0 → e4    u = 6     p = 0.36
           e0 → e5    u = 5     p = 0.54

$EU(Ag_1) = (11 \times 0.5) + (3 \times 0.5) = 5.5 + 1.5 = 7$

$EU(Ag_2) = (4 \times 0.1) + (6 \times 0.36) + (5 \times 0.54) = 0.4 + 2.16 + 2.7 = 5.26$

So $Ag_1$ is optimal. (A sketch of how these run probabilities and utilities compose from the individual steps follows below.)
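The run probabilities and utilities in the table compose from the individual steps: the probability of a run is the product of its step probabilities, and its utility is the sum of its step utilities. A sketch of that composition, with the step data copied from the example (function names are illustrative):

```python
def run_prob_and_utility(steps):
    """steps: list of (step_probability, step_utility) pairs along one run."""
    p, u = 1.0, 0.0
    for step_p, step_u in steps:
        p *= step_p
        u += step_u
    return p, u

ag1_runs = [[(0.5, 4), (1.0, 7)],   # e0 -> e1 -> e3
            [(0.5, 3)]]             # e0 -> e2
ag2_runs = [[(0.1, 4)],             # e0 -> e1 (the run ends: Ag2 selects no action in e1)
            [(0.9, 3), (0.4, 3)],   # e0 -> e2 -> e4
            [(0.9, 3), (0.6, 2)]]   # e0 -> e2 -> e5

for runs in (ag1_runs, ag2_runs):
    print(sum(u * p for p, u in map(run_prob_and_utility, runs)))
# ~7.0 for Ag1 and ~5.26 for Ag2, so Ag1 is optimal.
```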
75 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Bounded Optimal Agents
• Some agents cannot be implemented on some computers
• The number of actions possible on an environment (and consequently the number of
states) may be so big that it may need more than available memory to implement.

• We can therefore constrain our agent set to include only those


agents that can be implemented on machine m:
$$\mathcal{AG}_m = \{Ag \mid Ag \in \mathcal{AG} \text{ and } Ag \text{ can be implemented on } m\}.$$
• The bounded optimal agent, $Ag_{bopt}$, with respect to $m$ is then...
$$Ag_{bopt} = \arg\max_{Ag \in \mathcal{AG}_m} EU(Ag, Env)$$
76 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Predicate Task Specifications
• A special case of assigning utilities to histories is to assign
0 (false) or 1 (true) to a run.
• If a run is assigned 1, then the agent succeeds on that run, otherwise it
fails.

• Call these predicate task specifications.


• Denote a predicate task specification by $\Psi$: $\Psi : \mathcal{R} \to \{0, 1\}$

77 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Task Environments
• A task environment is a pair $\langle Env, \Psi \rangle$, where $Env$ is an environment, and the task specification $\Psi$ is defined by: $\Psi : \mathcal{R} \to \{0, 1\}$

• Let the set of all task environments be denoted by $\mathcal{TE}$.
• A task environment specifies:
• the properties of the system the agent will inhabit;
• the criteria by which an agent will be judged to have either failed or
succeeded.
78 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Task Environments
• To denote the set of all runs of the agent $Ag$ in environment $Env$ that satisfy $\Psi$, we write:
$$\mathcal{R}_\Psi(Ag, Env) = \{r \mid r \in \mathcal{R}(Ag, Env) \text{ and } \Psi(r) = 1\}.$$

• We then say that an agent $Ag$ succeeds in task environment $\langle Env, \Psi \rangle$ if
$$\mathcal{R}_\Psi(Ag, Env) = \mathcal{R}(Ag, Env)$$
• In other words, an agent succeeds if every one of its runs satisfies the specification.

We could also write this as: $\forall r \in \mathcal{R}(Ag, Env)$, we have $\Psi(r) = 1$. However, this is a bit pessimistic: if the agent fails on a single run, we say it has failed overall.
A more optimistic idea of success is: $\exists r \in \mathcal{R}(Ag, Env)$, we have $\Psi(r) = 1$, which counts an agent as successful as soon as it completes a single successful run.
79 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
The Probability of Success
• If the environment is non-deterministic, then $\tau$ returns a set of possible states.
• We can define a probability distribution across the set of states.
• Let $P(r \mid Ag, Env)$ denote the probability that run $r$ occurs if agent $Ag$ is placed in environment $Env$.
• Then the probability $P(\Psi \mid Ag, Env)$ that $\Psi$ is satisfied by $Ag$ in $Env$ is simply:
$$P(\Psi \mid Ag, Env) = \sum_{r \in \mathcal{R}_\Psi(Ag, Env)} P(r \mid Ag, Env)$$
(A minimal sketch of this sum follows below.)
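A minimal sketch of this sum, assuming runs come as (run, probability) pairs and Ψ is given as a Python predicate (both assumptions are illustrative):

```python
def probability_of_success(runs, psi):
    """P(Psi | Ag, Env): sum the probabilities of exactly those runs that satisfy psi."""
    return sum(p for r, p in runs if psi(r) == 1)
```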

80 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Achievement and Maintenance Tasks
• The idea of a predicate task specification is admittedly
abstract.

• It generalises two common types of tasks, achievement


tasks and maintenance tasks:
1. Achievement tasks: Are those of the form “achieve state of affairs φ”.
2. Maintenance tasks: Are those of the form “maintain state of affairs ψ”.

81 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Achievement and Maintenance Tasks
• An achievement task is specified by a set $G$ of “good” or “goal” states: $G \subseteq E$.
• The agent succeeds if it is guaranteed to bring about at least one of these states (we don’t care which, as all are considered good).
• That is, the agent succeeds in an achievement task if it can force the environment into one of the goal states $g \in G$.

• A maintenance task is specified by a set $B$ of “bad” states: $B \subseteq E$.
• The agent succeeds in a particular environment if it manages to avoid all states in $B$ — if it never performs actions which result in any state in $B$ occurring.
• In terms of games, the agent succeeds in a maintenance task if it ensures that it is never forced into one of the fail states $b \in B$.

82 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Summary
• This chapter has looked in detail at what constitutes an intelligent agent.
• We looked at the properties of an intelligent agent and the properties of the environments in which it may operate.
• We introduced the intentional stance and discussed its use.
• We looked at abstract architectures for agents of different kinds; and
• Finally, we discussed what kinds of task an agent might need to carry out.

• In the next chapter, we will start to look at how one might program an agent using deductive reasoning.

Class Reading (Chapter 2): “Is it an Agent, or Just a Program?: A Taxonomy for Autonomous Agents”, Stan Franklin and Art Graesser. ECAI '96 Proceedings of the Workshop on Intelligent Agents III, Agent Theories, Architectures, and Languages, pp. 21-35.
This paper informally discusses various different notions of agency. The focus of the discussion might be on a comparison with the discussions in this chapter.
83 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
COMP310
Multi-Agent Systems
Chapter 3 - Deductive Reasoning Agents

Dr Terry R. Payne
Department of Computer Science
Agent Architectures
• Pattie Maes (1991):
“... [A] particular methodology for building [agents]. It specifies how . . . the agent can be decomposed into the construction of a set of component modules and how these modules should be made to interact. The total set of modules and their interactions has to provide an answer to the question of how the sensor data and the current internal state of the agent determine the actions . . . and future internal state of the agent. An architecture encompasses techniques and algorithms that support this methodology ...”

• Leslie Kaelbling (1991):
“... [A] specific collection of software (or hardware) modules, typically designated by boxes with arrows indicating the data and control flow among the modules. A more abstract view of an architecture is as a general methodology for designing particular modular decompositions for particular tasks ...”
2 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Classes of Architecture
• 1956–present: Symbolic Reasoning Agents
• Agents make decisions about what to do via symbol manipulation.
• In its purest expression, this approach proposes that agents use explicit logical reasoning in order to decide what to do.

• 1985–present: Reactive Agents


• Problems with symbolic reasoning led to a reaction against this
• led to the reactive agents movement, 1985–present.

• 1990-present: Hybrid Agents


• Hybrid architectures attempt to combine the best of reasoning and reactive architectures.
3 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Symbolic Reasoning Agents
• The classical approach to building agents is to view them as
a particular type of knowledge-based system, and bring all
the associated methodologies of such systems to bear.
• This paradigm is known as symbolic AI.

• We define a deliberative agent or agent architecture to be


one that:
• contains an explicitly represented, symbolic model of the world;
• makes decisions (for example about what actions to perform) via symbolic
reasoning.

4 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Two issues
The Transduction Problem: identifying objects is hard!!!
The transduction problem is that of translating the real world into an accurate, adequate symbolic description, in time for that description to be useful. This has led on to research into vision, speech understanding, learning…

The Representation/Reasoning Problem: representing objects is harder!
How to symbolically represent information about complex real-world entities and processes, and how to get agents to reason with this information in time for the results to be useful. This has led on to research into knowledge representation, automated reasoning, planning…

Most researchers accept that neither problem is anywhere near solved.


5 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
The representation / reasoning problem
• The underlying problem with knowledge representation/
reasoning lies with the complexity of symbol manipulation
algorithms.
• In general many (most) search-based symbol manipulation algorithms of
interest are highly intractable.
• Hard to find compact representations.

• Because of these problems, some researchers have looked


to alternative techniques for building agents; we look at
these later.
6 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Deductive Reasoning Agents
• How can an agent decide what to do using theorem proving?
• Basic idea is to use logic to encode a theory stating the best action to perform
in any given situation.

• Let:
• ρ be this theory (typically a set of rules);
• ∆ be a logical database that describes the current state of the world;
• Ac be the set of actions the agent can perform;
• ∆ ⊢ρ φ means that φ can be proved from ∆ using ρ.

7 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Deductive Reasoning Agents
• How does this fit into the abstract description we talked about last time?
• The perception function is as before:

see : E → Per
• of course, this is (much) easier said than done.

• The next state function revises the database ∆ :

next : D × Per → D
(where D is the set of all such databases)
• And the action function?
• Well a possible action function is on the next slide.

8 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Action Function
for each α ∈ Ac do                /* try to find an action explicitly prescribed */
    if ∆ ⊢ρ Do(α) then
        return α
    end-if
end-for

for each α ∈ Ac do                /* try to find an action not excluded */
    if ∆ ⊬ρ ¬Do(α) then
        return α
    end-if
end-for

return null                       /* no action found */

9 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
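To make the control structure concrete, here is a minimal Python sketch of the action function above (an illustration, not code from the course): a naive forward-chaining prove() stands in for the theorem prover ∆ ⊢ρ φ, facts and rule consequents are ground tuples, and the negative conclusion ¬Do(α) is approximated by a separate ("NotDo", α) atom rather than true logical negation.

# Minimal sketch of the deductive action-selection loop (illustrative only).

def prove(delta, rules, goal):
    """Forward-chain over rules of the form (antecedents, consequent)
    until no new facts appear; return True if `goal` is derived."""
    facts = set(delta)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if consequent not in facts and all(a in facts for a in antecedents):
                facts.add(consequent)
                changed = True
    return goal in facts

def action(delta, rules, actions):
    # First, try to find an action that is explicitly prescribed.
    for alpha in actions:
        if prove(delta, rules, ("Do", alpha)):
            return alpha
    # Otherwise, try to find an action that is not explicitly excluded.
    for alpha in actions:
        if not prove(delta, rules, ("NotDo", alpha)):
            return alpha
    return None  # no action found

# Tiny usage example with one rule: In(0,0) ∧ Dirt(0,0) → Do(suck)
rules = [([("In", 0, 0), ("Dirt", 0, 0)], ("Do", "suck"))]
delta = [("In", 0, 0), ("Dirt", 0, 0)]
print(action(delta, rules, ["suck", "forward", "turn"]))   # -> "suck"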


An example: The Vacuum World
The Vacuum World
The goal is for the robot to clear up all the dirt.

Uses 3 domain predicates in this exercise:
In(x,y) agent is at (x,y)
Dirt(x,y) there is dirt at (x,y)
Facing(d) the agent is facing direction d

Possible Actions:
Ac = {turn, forward, suck}
Note: turn means “turn right”

[Figure: a 3 x 3 grid world, with squares indexed (0,0) to (2,2).]

10 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


The Vacuum World
[Figure: a state-transition diagram showing some possible runs of the system. Each state lists the facts In(x, y), Facing(d) and Dirt(x, y) that currently hold, and transitions are labelled with the actions forward, suck and turn.]

With the system as depicted above, here are some possible ways that the system might run.
11 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
The Vacuum World

• Rules ρ for determining what to do:

In(0, 0) ∧ Facing(north) ∧ ¬Dirt(0, 0) → Do(forward)
In(0, 1) ∧ Facing(north) ∧ ¬Dirt(0, 1) → Do(forward)
In(0, 2) ∧ Facing(north) ∧ ¬Dirt(0, 2) → Do(turn)
In(0, 2) ∧ Facing(east) → Do(forward)

• ... and so on!

• Using these rules (+ other obvious ones), starting at (0, 0) the robot will clear up dirt.

(The domain predicates In(x,y), Dirt(x,y) and Facing(d), and the actions Ac = {turn, forward, suck}, are as on the previous slide; turn means “turn right”.)
12 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
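As an illustration only (not part of the original slides), the same rules can be written as condition/action pairs over a set of ground facts; the tuple encoding of facts and the helper name vacuum_rules are assumptions made for this sketch.

# Illustrative encoding of a few of the vacuum-world rules ρ.

def vacuum_rules(delta):
    """Return the prescribed action for the current database delta,
    following the spirit of the rules above."""
    rules = [
        (lambda d: ("In", 0, 0) in d and ("Facing", "north") in d
                   and ("Dirt", 0, 0) not in d,                   "forward"),
        (lambda d: ("In", 0, 1) in d and ("Facing", "north") in d
                   and ("Dirt", 0, 1) not in d,                   "forward"),
        (lambda d: ("In", 0, 2) in d and ("Facing", "north") in d
                   and ("Dirt", 0, 2) not in d,                   "turn"),
        (lambda d: ("In", 0, 2) in d and ("Facing", "east") in d, "forward"),
        # A "suck whenever there is dirt here" rule, and the rules for the
        # remaining squares, would be added in the same way.
    ]
    for condition, act in rules:
        if condition(delta):
            return act
    return None

delta = {("In", 0, 0), ("Facing", "north"), ("Dirt", 0, 2), ("Dirt", 1, 2)}
print(vacuum_rules(delta))   # -> "forward"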
The Vacuum World
• Problems:
• how to convert video camera input to Dirt(0, 1)?
• decision making assumes a static environment:
• calculative rationality.
• decision making using first-order logic is undecidable!

• Typical solutions:
• weaken the logic;
• use symbolic, non-logical representations;
• shift the emphasis of reasoning from run time to design time.

13 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agent-oriented programming
• Yoav Shoham introduced “agent-oriented programming” in
1990:

“... new programming paradigm, based on a societal view of computation ...”

• The key idea:


• directly programming agents in terms of intentional notions
• like belief, desire, and intention
• Adopts the same abstraction as humans

• Resulted in the Agent0 programming language


14 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Agent0
• AGENT0 is implemented as an extension to LISP.
• Each agent in AGENT0 has 4 components:
• a set of capabilities (things the agent can do);
• a set of initial beliefs;
• a set of initial commitments (things the agent will do); and
• a set of commitment rules.

• The key component, which determines how the agent acts, is the
commitment rule set.
• Each commitment rule contains
• a message condition;
• a mental condition; and
• an action.

15 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agent0 Decision Cycle
On each decision cycle . . .
• The message condition is matched against the messages the agent has received;
• The mental condition is matched against the beliefs of the agent.
• If the rule fires, then the agent becomes committed to the action (the action gets added to the agents
commitment set).

Actions may be . . .
• Private
• An externally executed computation
• Communicative
• Sending messages

Messages are constrained to be one of three types . . .
• requests
• To commit to action
• unrequests
• To refrain from action
• informs
• Which pass on information
16 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
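A simplified Python sketch of this decision cycle (it is not AGENT0, which is a LISP extension): each commitment rule pairs a message condition and a mental condition with an action, and a fired rule adds that action to the agent's commitment set. The message dictionary format, belief tuples and example rule are all assumptions made for the illustration.

# Illustrative sketch of an AGENT0-style decision cycle.

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class CommitmentRule:
    msg_condition: Callable[[dict], bool]     # matched against a received message
    mental_condition: Callable[[set], bool]   # matched against the agent's beliefs
    action: str

@dataclass
class Agent:
    beliefs: set
    rules: List[CommitmentRule]
    commitments: List[str] = field(default_factory=list)

    def decision_cycle(self, messages):
        for msg in messages:
            for rule in self.rules:
                # If both conditions hold, the rule fires and the agent
                # becomes committed to the action.
                if rule.msg_condition(msg) and rule.mental_condition(self.beliefs):
                    self.commitments.append(rule.action)

# Example: commit to "send-report" when a friend REQUESTs it and we believe we can.
rule = CommitmentRule(
    msg_condition=lambda m: m["type"] == "request" and m["content"] == "send-report",
    mental_condition=lambda b: ("friend", "alice") in b and ("can", "send-report") in b,
    action="send-report",
)
agent = Agent(beliefs={("friend", "alice"), ("can", "send-report")}, rules=[rule])
agent.decision_cycle([{"type": "request", "sender": "alice", "content": "send-report"}])
print(agent.commitments)   # -> ['send-report']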
Commitment Rules

• This rule may be paraphrased as follows:
• if I receive a message from agent which requests me to do action at time, and I believe that:
• agent is currently a friend;
• I can do the action;
• at time, I am not committed to doing any other action,
• then commit to doing action at time.

A commitment rule:
COMMIT(
  ( agent, REQUEST, DO(time, action)
  ),                                 ;;; msg condition
  ( B,
    [now, Friend agent] AND
    CAN(self, action) AND
    NOT [time, CMT(self, anyaction)]
  ),                                 ;;; mental condition
  self,
  DO(time, action)
)

17 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


PLACA
• A more refined implementation was developed by Becky
Thomas, for her 1993 doctoral thesis.

• Her Planning Communicating Agents (PLACA) language was


intended to address one severe drawback to AGENT0
• the inability of agents to plan, and communicate requests for action via high-
level goals.

• Agents in PLACA are programmed in much the same way


as in AGENT0, in terms of mental change rules.
19 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
PLACA: Mental Change Rule
• If:
• someone asks you to xerox something x at time t and you can, and you don’t believe that they’re a VIP, or that you’re supposed to be shelving books

• Then:
• adopt the intention to xerox it by 5pm, and
• inform them of your newly adopted intention.

A PLACA Mental Change Rule:
(((self ?agent REQUEST (?t (xeroxed ?x)))
  (AND (CAN-ACHIEVE (?t xeroxed ?x)))
  (NOT (BEL (*now* shelving)))
  (NOT (BEL (*now* (vip ?agent))))
  ((ADOPT (INTEND (5pm (xeroxed ?x)))))
  ((?agent self INFORM
     (*now* (INTEND (5pm (xeroxed ?x)))))))
20 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Concurrent MetateM
• Concurrent METATEM is a multi-agent
language, developed by Michael Fisher
• Each agent is programmed by giving it a temporal logic
specification of the behaviour it should exhibit.
• These specifications are executed directly in order to generate the behaviour
of the agent.

• Temporal logic is classical logic augmented by


modal operators for describing how the truth
of propositions changes over time.
• Think of the world as being a number of discrete states.
• There is a single past history, but a number of possible
futures
• all the possible ways that the world might develop.
21 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
MetateM Agents
• A Concurrent MetateM system
contains a number of agents (objects)
• Each object has 3 attributes:
• a name
For example, a ‘stack’ object’s interface:
• an interface stack(pop, push)[popped, stackfull]
• a MetateM program

• An agent’s interface contains two sets: {pop, push} = messages received

• messages the agent will accept; {popped, stackfull} = messages sent

• messages the agent may send.

22 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


MetateM
• The root of the MetateM concept is Gabbay’s separation
theorem:
• Any arbitrary temporal logic formula can be rewritten in a logically equivalent “past implies future” form.

• Execution proceeds by a process of continually matching


rules against a “history”, and firing those rules whose
antecedents are satisfied.
• The instantiated future-time consequents become commitments which must
subsequently be satisfied.

23 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
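The following toy sketch (an assumption-laden illustration, not Concurrent MetateM itself) shows the flavour of this execution style: rules whose past-time condition holds of the history fire, their future-time consequents become commitments, and the next state is built by satisfying outstanding commitments.

# Illustrative "past implies future" execution step.

def metatem_step(history, rules, commitments):
    """Fire every rule whose past-time condition holds of the most recent
    state, add its future-time consequent as a commitment, then build the
    next state by (naively) satisfying all outstanding commitments."""
    last = history[-1] if history else set()
    for past_condition, future_prop in rules:
        if past_condition(last):
            commitments.add(future_prop)
    next_state = set(commitments)   # satisfy every commitment immediately
    commitments.clear()
    history.append(next_state)
    return next_state

# Rule: if "apologise(you)" held in the previous state, make "friends(us)" true next.
rules = [(lambda s: "apologise(you)" in s, "friends(us)")]
history = [{"apologise(you)"}]
print(metatem_step(history, rules, set()))   # -> {'friends(us)'}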


Examples
□important(agents) means “it is now, and will always be true that agents are important”

◊important(ConcurrentMetateM) means “sometime in the future, ConcurrentMetateM will be important”

⧫important(Prolog) means “sometime in the past it was true that Prolog was important”

(¬friends(us)) U apologise(you) means “we are not friends until you apologise”

○apologise(you) means “tomorrow (in the next state), you apologise”

●apologise(you) ⇒ ○friends(us) means “if you apologised yesterday, then tomorrow we will be friends”

friends(us) S apologise(you) means “we have been friends since you apologised”

24 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Summary
• This chapter has focussed on Agent Architectures and general approaches to programming an agent.
• We defined the notion of symbolic reasoning agents, and discussed...
• ...how deductive reasoning can be achieved through the use of logic; and
• ...the Transduction and Representation Problems
• We introduced the concept of Agent Oriented Programming, and looked at examples of AOP languages, including:
• Agent0 and PLACA
• Concurrent MetateM and temporal logic

Class Reading (Chapter 3): “Agent Oriented Programming”, Yoav Shoham. Artificial Intelligence Journal 60(1), March 1993. pp51-92.
This paper introduced agent-oriented programming and throughout the late 90s was one of the most cited articles in the agent community. One of the main points was the notion of using mental states, and it introduced the programming language Agent0.

• In the next chapter, we will consider the merits


of practical reasoning agents.
25 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
COMP310
Multi-Agent Systems
Chapter 4 - Practical Reasoning Agents

Dr Terry R. Payne
Department of Computer Science
Pro-Active Behaviour
• Previously we looked at:
• Characteristics of an Agent and its Environment
• The Intentional Stance
• Translating the Formal Agent model into a Deductive Logic framework

• We said:
• An intelligent agent is a computer system capable of flexible autonomous action in some environment.
• Where by flexible, we mean:
• reactive;
• pro-active;
• social.

• This is where we deal with the “proactive” bit, showing how we can program
agents to have goal-directed behaviour.

!2 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


What is Practical Reasoning?
• Practical reasoning is reasoning directed towards actions — the process of
figuring out what to do:

“... Practical reasoning is a matter of weighing conflicting


considerations for and against competing options, where
the relevant considerations are provided by what the agent
desires/values/cares about and what the agent
believes...” (Bratman)

• Distinguish practical reasoning from theoretical reasoning.


• Theoretical reasoning is directed towards beliefs.
!3 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
The components of Practical Reasoning
• Human practical reasoning consists of two activities:
• deliberation:
• deciding what state of affairs we want to achieve
• the outputs of deliberation are intentions;
• means-ends reasoning:
• deciding how to achieve these states of affairs
• the outputs of means-ends reasoning are plans.

• Intentions are a key part of this.


• The interplay between beliefs, desires and intentions defines how the model
works
!4 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intentions in Practical Reasoning
1. Intentions pose problems for agents, who need to determine ways of achieving
them.
If I have an intention to φ, you would expect me to devote resources to deciding how to bring
about φ.

2. Intentions provide a “filter” for adopting other intentions, which must not conflict.
If I have an intention to φ, you would not expect me to adopt an intention ψ that was
incompatible with φ.

3. Agents track the success of their intentions, and are inclined to try again if their
attempts fail.
If an agent’s first attempt to achieve φ fails, then all other things being equal, it will try an alternative plan to achieve φ.
!5 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intentions in Practical Reasoning
4. Agents believe their intentions are possible.

That is, they believe there is at least some way that the intentions could be brought about.

5. Agents do not believe they will not bring about their intentions.

It would not be rational of me to adopt an intention to φ if I believed I would fail with φ.

6. Under certain circumstances, agents believe they will bring about their
intentions.

• If I intend φ, then I believe that under “normal circumstances” I will succeed with φ.
!6 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intentions in Practical Reasoning
7. Agents need not intend all the
expected side effects of their
intentions.
If I believe φ → ψ and I intend that φ, I do not necessarily intend ψ also.

• Intentions are not closed under


implication. I may believe that going to the dentist
involves pain, and I may also intend to
• This last problem is known as the side effect go to the dentist — but this does not
or package deal problem. imply that I intend to suffer pain!
!7 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intentions are Stronger than Desire
“... My desire to play basketball this afternoon is merely a potential
influencer of my conduct this afternoon. It must vie with my other
relevant desires [. . . ] before it is settled what I will do.

In contrast, once I intend to play basketball this afternoon, the matter


is settled: I normally need not continue to weigh the pros and cons.

When the afternoon arrives, I will normally just proceed to execute my


intentions...”
Michael E. Bratman (1990)

!8 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Means-ends Reasoning/Planning
• Planning is the design of a course of action that will achieve some desired goal.
• Basic idea is to give a planning system:
• (representation of) goal/intention to achieve;
• (representation of) actions it can perform;
• (representation of) the environment;
• and have it generate a plan to achieve the goal.

• This is automatic programming.

[Figure: a planner takes the task/goal/intention, the possible actions and the environment state as inputs, and produces a plan to achieve the goal.]
!9 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Means-ends Reasoning/Planning
• Don't have to directly tell the system what to do!
• Let it figure out how to achieve the goal on its own!

[Figure: the same planner diagram as on the previous slide.]
!10 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
STRIPS Planner

• STRIPS
• The Stanford Research Institute Problem Solver
• Used by Shakey, the robot
• Developed by Richard Fikes and Nils Nilsson in 1971 at SRI
International

!11 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Representations
• Question: How do we represent. . .
• goal to be achieved;
• state of environment;
• actions available to agent;
• plan itself.

• Answer: We use logic, or something that


looks a lot like logic.

!12 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Blocksworld
• We’ll illustrate the techniques with reference to the blocks world.
• A simple (toy) world, in this case one where we consider toys

[Figure: block A stacked on top of block B, with block C beside them on the table.]

• The blocks world contains a robot arm,


3 blocks (A, B and C) of equal size, and
a table-top.

• The aim is to generate a plan for the


robot arm to build towers out of blocks.
!13 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Blocksworld
• The environment is represented by an ontology.

Blocksworld Ontology
On(x,y) object x on top of object y
OnTable(x) object x is on the table
Clear(x) nothing is on top of object x
Holding(x) arm is holding x

• The closed world assumption is used
• Anything not stated is assumed to be false.

Representation of the following blocks:
Clear(A)
On(A, B)
OnTable(B)
Clear(C)
OnTable(C)
ArmEmpty

• A goal is represented as a set of formulae. The goal:
{OnTable(A), OnTable(B), OnTable(C), ArmEmpty}

!14 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Blocksworld Actions
• Each action has:
• a name: which may have arguments;
• a pre-condition list: list of facts which must be true for
action to be executed;
• a delete list: list of facts that are no longer true after action
is performed;
• an add list: list of facts made true by executing the action.

• Each of these may contain variables.


• What is a plan?
• A sequence (list) of actions, with variables replaced by
constants.
!15 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Blocksworld Actions
Stack(x, y)
pre  Clear(y) ∧ Holding(x)
del  Clear(y) ∧ Holding(x)
add  ArmEmpty ∧ On(x, y)
The stack action occurs when the robot arm places the object x it is holding on top of object y.

UnStack(x, y)
pre  On(x, y) ∧ Clear(x) ∧ ArmEmpty
del  On(x, y) ∧ ArmEmpty
add  Holding(x) ∧ Clear(y)
The unstack action occurs when the robot arm picks an object x up from on top of another object y.

Pickup(x)
pre  Clear(x) ∧ OnTable(x) ∧ ArmEmpty
del  OnTable(x) ∧ ArmEmpty
add  Holding(x)
The pickup action occurs when the arm picks up an object x from the table.

PutDown(x)
pre  Holding(x)
del  Holding(x)
add  OnTable(x) ∧ ArmEmpty ∧ Clear(x)
The putdown action occurs when the arm places the object x onto the table.
!16 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
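A minimal Python sketch of these operators (an illustration, not a planner): each schema is a precondition, delete and add list of fact templates, and applying a ground action to a belief set checks the precondition, removes the delete list and adds the add list. The dictionary encoding and helper names are assumptions made for this sketch.

# STRIPS-style action schemas and a function to apply a ground action.

STRIPS_ACTIONS = {
    ("Stack", "x", "y"): {
        "pre": [("Clear", "y"), ("Holding", "x")],
        "del": [("Clear", "y"), ("Holding", "x")],
        "add": [("ArmEmpty",), ("On", "x", "y")],
    },
    ("UnStack", "x", "y"): {
        "pre": [("On", "x", "y"), ("Clear", "x"), ("ArmEmpty",)],
        "del": [("On", "x", "y"), ("ArmEmpty",)],
        "add": [("Holding", "x"), ("Clear", "y")],
    },
    ("Pickup", "x"): {
        "pre": [("Clear", "x"), ("OnTable", "x"), ("ArmEmpty",)],
        "del": [("OnTable", "x"), ("ArmEmpty",)],
        "add": [("Holding", "x")],
    },
    ("PutDown", "x"): {
        "pre": [("Holding", "x")],
        "del": [("Holding", "x")],
        "add": [("OnTable", "x"), ("ArmEmpty",), ("Clear", "x")],
    },
}

def instantiate(templates, bindings):
    """Replace variables in a list of fact templates with constants."""
    return [tuple(bindings.get(term, term) for term in fact) for fact in templates]

def apply_action(beliefs, name, *args):
    """Check the precondition, then remove the delete list and add the add list."""
    key, schema = next((k, v) for k, v in STRIPS_ACTIONS.items() if k[0] == name)
    bindings = dict(zip(key[1:], args))
    if not all(p in beliefs for p in instantiate(schema["pre"], bindings)):
        raise ValueError(f"precondition of {name}{args} not satisfied")
    beliefs = beliefs - set(instantiate(schema["del"], bindings))
    return beliefs | set(instantiate(schema["add"], bindings))

b0 = {("Clear", "A"), ("On", "A", "B"), ("OnTable", "B"),
      ("Clear", "C"), ("OnTable", "C"), ("ArmEmpty",)}
b1 = apply_action(b0, "UnStack", "A", "B")
print(("Holding", "A") in b1, ("Clear", "B") in b1)   # -> True True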
Using Plans
To get from the initial state on the left (A on B, with B and C on the table) to the goal state on the right (A on B on C)... we need this set of actions:

UnStack(A, B)
PutDown(A)
Pickup(B)
Stack(B, C)
Pickup(A)
Stack(A, B)

(The Stack, UnStack, Pickup and PutDown action definitions are as given on the previous slide.)

!17 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Plan Validity

• Thus, a plan is simply a sequence of


steps

• However, how can we:


• Generate the plan?
• Ensure that it is correct?

!18 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Formal Representation
• Let’s relate the STRIPS model back to the formal description of an agent we talked
about before.
• This will help us to see how it fits into the overall picture.

• As before we assume that the agent has a set of actions Ac, and we will write
individual actions as α1, α2 and so on.

• Now the actions have some structure, each one has


preconditions Pαi, add list Aαi, and delete list Dαi, for each αi ∈ Ac:

αi = ⟨Pαi, Dαi, Aαi⟩

• A plan is just a sequence of actions, where each action is one of the actions from Ac:
π = (α1, . . . , αn)
!19 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Formal Representation
• A planning problem is therefore a triple ⟨B0, Ac, I⟩:
• B0 is the set of beliefs the agent has about the world.
• Ac is the set of actions, and
• I is a goal (or intention)

• Since actions change the world, any rational agent will change its beliefs
about the world as a result of carrying out actions.
• Thus, a plan π for a given planning problem will be associated with a sequence of sets of
beliefs:
B0 --α1--> B1 --α2--> · · · --αn--> Bn
• In other words at each step of the plan the beliefs are updated by removing the items in the
delete list of the relevant action and adding the items in the add list.

!20 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Formal Representation
• A plan π is said to be acceptable with respect to the problem ⟨B0, Ac, I⟩ if and only if, for all 1 ≤ j ≤ n, Bj-1 ⊨ Pαj
• In other words, the pre-requisites for each action have to be true right
before the action is carried out.
• We say this because the pre-conditions don’t have to be in Bj-1, we just have to be able to prove
the pre-conditions from Bj-1.

• A plan π is correct if it is acceptable, and: Bn ⊨ I


• In other words, it is correct if it is acceptable and the final state makes the
goal true.

!21 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
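The acceptability and correctness conditions above translate almost directly into code. In this sketch, ground actions are assumed to be given as (pre, delete, add) sets of atoms, and ⊨ is treated as set containment, which suffices for conjunctions of ground atoms:

# Checking acceptability and correctness of a plan against an initial belief set.

def is_acceptable(b0, plan):
    """True iff the precondition of every action holds right before it is executed."""
    beliefs = set(b0)
    for pre, delete, add in plan:
        if not pre <= beliefs:          # B_{j-1} must entail the precondition
            return False
        beliefs = (beliefs - delete) | add
    return True

def is_correct(b0, plan, intention):
    """True iff the plan is acceptable and the final belief set entails the goal."""
    if not is_acceptable(b0, plan):
        return False
    beliefs = set(b0)
    for pre, delete, add in plan:
        beliefs = (beliefs - delete) | add
    return intention <= beliefs         # B_n must entail I

# Example: a one-step plan that picks up A from the table.
b0 = {"Clear(A)", "OnTable(A)", "ArmEmpty"}
pickup_A = ({"Clear(A)", "OnTable(A)", "ArmEmpty"},   # precondition
            {"OnTable(A)", "ArmEmpty"},               # delete list
            {"Holding(A)"})                           # add list
print(is_correct(b0, [pickup_A], {"Holding(A)"}))     # -> True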
Example - Question 2.b

This is an example question from the Mock paper.

!22 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
A plan ⇡ is a sequence of actions, where each action results in changing the
set of beliefs the agent has, until the final set of beliefs matches that of the
↵1 ↵2 ↵n
intentions, such that B0 ! B1 ! · · · ! Bn . Therefore, a planner will explore
all the di↵erent possible sequences of actions to determine which one will result
in the final set of intentions.
In this solution, only those actions that result in the final solution are given,
with the set of beliefs that result in each step presented. The aim is to start
with an initial set of beliefs, B0 , and arrive at a final set of beliefs, Bn which
corresponds to the intentions given in the question - i.e.

Beliefs B0: Clear(B), Clear(C), On(C, A), OnTable(A), OnTable(B), ArmEmpty
Intention i: Clear(A), Clear(B), On(B, C), OnTable(A), OnTable(C), ArmEmpty

[Figure: the initial state, with C on top of A and B beside them on the table.]
The solution is given on the next slide. In each case, the beliefs that hold prior
to the action are given in bold, and the beliefs that are new after the action are
also presented in bold.
!23 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Beliefs B0: Clear(B), Clear(C), On(C, A), OnTable(A), OnTable(B), ArmEmpty
    Action: Unstack(C, A)
Beliefs B1: Clear(B), Clear(C), OnTable(A), OnTable(B), Holding(C), Clear(A)
    Action: PutDown(C)
Beliefs B2: Clear(B), Clear(C), OnTable(A), OnTable(B), Clear(A), OnTable(C), ArmEmpty
    Action: Pickup(B)
Beliefs B3: Clear(B), Clear(C), OnTable(A), Clear(A), OnTable(C), Holding(B)
    Action: Stack(B, C)
Beliefs B4: Clear(B), OnTable(A), Clear(A), OnTable(C), ArmEmpty, On(B, C)
The beliefs B4, once rearranged, are now equivalent to the intentions.
!24 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Action Definitions are important!!!

One of these definitions works. The other doesn't!

!25 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Implementing Practical Reasoning Agents

• A first pass at an implementation of a practical reasoning agent:

Agent Control Loop Version 1
1. while true
2.     observe the world;
3.     update internal world model;
4.     deliberate about what intention to achieve next;
5.     use means-ends reasoning to get a plan for the intention;
6.     execute the plan
7. end while

• For now we will not be concerned with stages 2 or 3.
• These are related to the functions see and next from the earlier lecture notes.

!26 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Implementing Practical Reasoning Agents
• see is as before:
• see: E → Percept

• Instead of the function next…
• which took a percept and used it to update the internal state of an agent
• …we have a belief revision function:
• brf: 𝒫(Bel) × Percept → 𝒫(Bel)
• 𝒫(Bel) is the power set of beliefs
• Bel is the set of all possible beliefs that an agent might have.

[Figure: the agent's see, action and next/state components interacting with the environment via percepts and actions.]
!27 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Implementing Practical Reasoning Agents
• Problem:
• deliberation and means-ends reasoning processes are not instantaneous.
• They have a time cost.

• Suppose that deliberation is optimal
• The agent selects the optimal intention to achieve, then this is the best thing for the agent.
• i.e. it maximises expected utility.

• So the agent selects an intention to achieve that would have been optimal at the time it observed the world.
• This is calculative rationality.

• The world may change in the meantime.
• Even if the agent can compute the right thing to do, it may not do the right thing.
• Optimality is hard.

!28 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Implementing Practical Reasoning Agents
• Let’s make the algorithm more formal with the algorithm opposite
• where I ⊆ Int, i.e. the set of intentions,
• plan() is exactly what we discussed above,
• brf() is the belief revision function,
• and execute() is a function that executes each action in a plan.

Agent Control Loop Version 2
1. B := B0;            /* initial beliefs */
2. while true do
3.     get next percept ρ;
4.     B := brf(B, ρ);
5.     I := deliberate(B);
6.     π := plan(B, I);
7.     execute(π)
8. end while

• How might we implement these functions?
!29 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Deliberation
• How does an agent deliberate?
• begin by trying to understand what the options available to
you are;
• choose between them, and commit to some.

• Chosen options are then intentions.


• The deliberate function can be decomposed
into two distinct functional components:
• option generation; and
• filtering.
!30 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Option Generation and Filtering
• Option Generation
• In which the agent generates a set of possible alternatives
• Represent option generation via a function, options(), which takes the agent’s current beliefs and current intentions, and from them determines a set of options
• desires

options : 𝒫(Bel) × 𝒫(Int) → 𝒫(Des)

• Filtering
• In which the agent chooses between competing alternatives, and commits to achieving them.
• In order to select between competing options, an agent uses a filter() function.
• intentions

filter : 𝒫(Bel) × 𝒫(Des) × 𝒫(Int) → 𝒫(Int)

!31 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Implementing Practical Reasoning Agents
Agent Control Loop Version 3

1. B := B0;
2. I := I0;
3. while true do
4.     get next percept ρ;
5.     B := brf(B, ρ);
6.     D := options(B, I);
7.     I := filter(B, D, I);
8.     π := plan(B, I);
9.     execute(π)
10. end while

!32 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
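Rendered as Python, the structure of Version 3 looks roughly as follows (a schematic sketch only: see, brf, options, filter, plan and execute are assumed to be supplied by the application; filter is spelt filter_ to avoid shadowing the Python builtin):

# Schematic rendering of Agent Control Loop Version 3.

def bdi_loop_v3(b0, i0, see, brf, options, filter_, plan, execute):
    B, I = b0, i0
    while True:
        rho = see()                 # get next percept ρ
        B = brf(B, rho)             # belief revision
        D = options(B, I)           # option generation: desires
        I = filter_(B, D, I)        # deliberation: commit to intentions
        pi = plan(B, I)             # means-ends reasoning
        execute(pi)                 # carry out the whole plan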
Under Commitment

“... Some time in the not-so-distant future, you


are having trouble with your new household
robot. You say “Willie, bring me a beer.” The
robot replies “OK boss.” Twenty minutes later,
you screech “Willie, why didn’t you bring me
that beer?” It answers “Well, I intended to get
you the beer, but I decided to do something
else.” Miffed, you send the wise guy back to
the manufacturer, complaining about a lack of
commitment...”

P. R. Cohen and H. J. Levesque (1990). Intention is choice with commitment. Artificial intelligence, 42(2), 213-261.
!33 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Over Commitment

“... After retrofitting, Willie is returned, marked “Model C:


The Committed Assistant.” Again, you ask Willie to bring
you a beer. Again, it accedes, replying “Sure thing.”
Then you ask: “What kind of beer did you buy?” It
answers: “Genessee.” You say “Never mind.” One
minute later, Willie trundles over with a Genessee in its
gripper...”

P. R. Cohen and H. J. Levesque (1990). Intention is choice with commitment. Artificial intelligence, 42(2), 213-261.
!34 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Wise Guy ???
“... After still more tinkering, the manufacturer sends
Willie back, promising no more problems with its
commitments. So, being a somewhat trusting customer,
you accept the rascal back into your household, but as a
test, you ask it to bring you your last beer. [. . . ]
The robot gets the beer and starts towards you. As it
approaches, it lifts its arm, wheels around, deliberately
smashes the bottle, and trundles off. Back at the plant,
when interrogated by customer service as to why it had
abandoned its commitments, the robot replies that
according to its specifications, it kept its commitments as
long as required — commitments must be dropped when
fulfilled or impossible to achieve. By smashing the bottle,
the commitment became unachievable...”

P. R. Cohen and H. J. Levesque (1990). Intention is choice with commitment. Artificial intelligence, 42(2), 213-261.
!35 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Degrees of Commitment
• Blind commitment
• A blindly committed agent will continue to maintain an intention until it believes the
intention has actually been achieved. Blind commitment is also sometimes referred to as
fanatical commitment.

• Single-minded commitment
• A single-minded agent will continue to maintain an intention until it believes that either the
intention has been achieved, or else that it is no longer possible to achieve the
intention.

• Open-minded commitment
• An open-minded agent will maintain an intention as long as it is still believed possible.
!36 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Degrees of Commitment
• An agent has commitment both to:
• ends (i.e., the state of affairs it wishes to bring about), and
• means (i.e., the mechanism via which the agent wishes to achieve the state of affairs).

• Currently, our agent control loop is overcommitted, both to means and ends.
• Modification: replan if ever a plan goes wrong.
• However, to write the algorithm down we need to refine our notion of plan execution.

!37 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Degrees of Commitment

• The previous version was blindly committed to its means and its ends

• If π is a plan, then:
• empty(π) is true if there are no more actions in the plan.
• hd(π) returns the first action in the plan.
• tail(π) returns the plan minus the head of the plan.
• sound(π, I, B) means that π is a correct plan for I given B.

Agent Control Loop Version 4
1. B := B0;
2. I := I0;
3. while true do
4.     get next percept ρ;
5.     B := brf(B, ρ);
6.     D := options(B, I);
7.     I := filter(B, D, I);
8.     π := plan(B, I);
9.     while not empty(π) do
10.        α := hd(π);
11.        execute(α);
12.        π := tail(π);
13.        get next percept ρ;
14.        B := brf(B, ρ);
15.        if not sound(π, I, B) then
16.            π := plan(B, I)
17.        end-if
18.    end-while
19. end-while
!38 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Degrees of Commitment

• Blind Commitment
• Makes the control loop more reactive, able to change intention when the world changes.
• i.e. it is not committed to its means (line 16)

• Still overcommitted to intentions (ends).
• Never stops to consider whether or not its intentions are appropriate.

(Agent Control Loop Version 4 is as on the previous slide.)
!39 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Single Minded Commitment

• Modification:
• stop to determine whether intentions have succeeded or whether they are impossible

• Our agent now gets to reconsider its intentions once every time around the outer control loop (line 9), i.e., after:
• it has completely executed a plan to achieve its current intentions; or
• it believes it has achieved its current intentions; or
• it believes its current intentions are no longer possible.

Agent Control Loop Version 5
1. B := B0;
2. I := I0;
3. while true do
4.     get next percept ρ;
5.     B := brf(B, ρ);
6.     D := options(B, I);
7.     I := filter(B, D, I);
8.     π := plan(B, I);
9.     while not (empty(π) or succeeded(I, B) or impossible(I, B)) do
10.        α := hd(π);
11.        execute(α);
12.        π := tail(π);
13.        get next percept ρ;
14.        B := brf(B, ρ);
15.        if not sound(π, I, B) then
16.            π := plan(B, I)
17.        end-if
18.    end-while
19. end-while
!40 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Open Minded Commitment

• In the previous version, our agent reconsiders its intentions once every time around the outer control loop
• In this new version, our agent also reconsiders its intentions after every action (lines 15 & 16)

• But this intention reconsideration is costly!

Agent Control Loop Version 6
1. B := B0;
2. I := I0;
3. while true do
4.     get next percept ρ;
5.     B := brf(B, ρ);
6.     D := options(B, I);
7.     I := filter(B, D, I);
8.     π := plan(B, I);
9.     while not (empty(π) or succeeded(I, B) or impossible(I, B)) do
10.        α := hd(π);
11.        execute(α);
12.        π := tail(π);
13.        get next percept ρ;
14.        B := brf(B, ρ);
15.        D := options(B, I);
16.        I := filter(B, D, I);
17.        if not sound(π, I, B) then
18.            π := plan(B, I)
19.        end-if
20.    end-while
21. end-while
!41 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intention Reconsideration

• A dilemma:
• an agent that does not stop to reconsider its intentions sufficiently often will continue to attempt to achieve its intentions even after it is clear that they cannot be achieved, or that there is no longer any reason for achieving them;
• an agent that constantly reconsiders its intentions may spend insufficient time actually working to achieve them, and hence runs the risk of never actually achieving them.

• Solution: incorporate an explicit meta-level control component, that decides whether or not to reconsider.

Agent Control Loop Version 7
1. B := B0;
2. I := I0;
3. while true do
4.     get next percept ρ;
5.     B := brf(B, ρ);
6.     D := options(B, I);
7.     I := filter(B, D, I);
8.     π := plan(B, I);
9.     while not (empty(π) or succeeded(I, B) or impossible(I, B)) do
10.        α := hd(π);
11.        execute(α);
12.        π := tail(π);
13.        get next percept ρ;
14.        B := brf(B, ρ);
15.        if reconsider(I, B) then
16.            D := options(B, I);
17.            I := filter(B, D, I);
18.        end-if
19.        if not sound(π, I, B) then
20.            π := plan(B, I)
21.        end-if
22.    end-while
23. end-while
!42 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
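For comparison, here is a schematic Python rendering of Version 7, showing where the meta-level reconsider() check sits relative to plan execution and the soundness test (all helper functions are assumed to be supplied by the application; this is a sketch of the control structure, not a full BDI implementation):

# Schematic rendering of Agent Control Loop Version 7.

def bdi_loop_v7(b0, i0, see, brf, options, filter_, plan, execute,
                succeeded, impossible, reconsider, sound):
    B, I = b0, i0
    while True:
        rho = see()
        B = brf(B, rho)
        D = options(B, I)
        I = filter_(B, D, I)
        pi = plan(B, I)                     # π is a list of actions
        while pi and not succeeded(I, B) and not impossible(I, B):
            alpha = pi[0]                   # hd(π)
            execute(alpha)
            pi = pi[1:]                     # tail(π)
            rho = see()
            B = brf(B, rho)
            if reconsider(I, B):            # meta-level control: worth deliberating?
                D = options(B, I)
                I = filter_(B, D, I)
            if not sound(pi, I, B):         # plan no longer correct for I given B
                pi = plan(B, I)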
Intention Reconsideration
• The possible interactions between meta-level control and
deliberation are:
Situation Chose to Changed Would have reconsider(. . .)
number deliberate? intentions? changed intentions? optimal?
1 No — No Yes
2 No — Yes No
3 Yes No — No
4 Yes Yes — Yes

• An important assumption: cost of reconsider(. . .) is


much less than the cost of the deliberation process itself.
!43 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Intention Reconsideration

(The table of situations and the inner loop of Agent Control Loop Version 7 are repeated here from the previous slides.)

• In situation (1), the agent did not choose to deliberate, and as a consequence, did not choose to change intentions. Moreover, if it had chosen to deliberate, it would not have changed intentions. In this situation, the reconsider(. . .) function is behaving optimally.

• In situation (2), the agent did not choose to deliberate, but if it had done so, it would have changed
intentions. In this situation, the reconsider(. . .) function is not behaving optimally.

• In situation (3), the agent chose to deliberate, but did not change intentions. In this situation, the
reconsider(. . .) function is not behaving optimally.

• In situation (4), the agent chose to deliberate, and did change intentions. In this situation, the
reconsider(. . .) function is behaving optimally.
!44 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Optimal Intention Reconsideration
• Kinny and Georgeff experimentally investigated the effectiveness of intention reconsideration strategies.

• Two different types of reconsideration strategy


were used:
• bold agents: never pause to reconsider intentions, and
• cautious agents: stop to reconsider after every action.

• Dynamism in the environment is represented


by the rate of world change, γ.
• Experiments were carried out using Tileworld.
!45 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Optimal Intention Reconsideration
• If γ is low (i.e., the environment does not change quickly), then bold agents do well compared to cautious ones.
• This is because cautious ones waste time reconsidering their commitments while bold agents are busy working towards, and achieving, their intentions.

• If γ is high (i.e., the environment changes frequently), then cautious agents can outperform bold agents.
• This is because they are able to recognise when intentions are doomed, and also to take advantage of serendipitous situations and new opportunities when they arise.

[Figure: performance of the bold agent and the cautious agent as γ ranges from low to high.]

• When planning costs are high, this advantage can be


eroded.
!46 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Implemented BDI Agents:
Procedural Reasoning System
• We now make the discussion even more concrete by introducing an actual agent architecture: the Procedural Reasoning System (PRS).

• In the PRS, each agent is equipped with a plan library, representing that agent’s procedural knowledge: knowledge about the mechanisms that can be used by the agent in order to realise its intentions.

• The options available to an agent are directly determined by the plans an agent has: an agent with no plans has no options.

• In addition, PRS agents have explicit representations of beliefs, desires, and intentions, as above.

[Figure: the PRS architecture; percepts feed the agent's beliefs, an interpreter mediates between the beliefs, plan library, desires and intentions, and the intentions drive actions.]
!47 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example PRS (JAM) System
• The agent possesses a number of pre-compiled plans (constructed manually)
• Each plan contains:
• a goal - the postcondition of the plan
• a context - the precondition of the plan
• a body - the course of action to carry out

• When an agent starts, goals are pushed onto the intention stack.
• This stack contains all of the goals that are pending
• A set of facts or beliefs are maintained and updated as the agent achieves different goals
• The agent then deliberates (i.e. selects the most appropriate goal to adopt).
• This is achieved using meta level plans, or utilities
• When utilities are used, the agent selects the goal with the highest value

GOALS:
ACHIEVE blocks_stacked;

FACTS:
// Block1 on Block2 initially so need
// to clear Block2 before stacking.
FACT ON "Block1" "Block2";
FACT ON "Block2" "Table";
FACT ON "Block3" "Table";
FACT CLEAR "Block1";
FACT CLEAR "Block3";
FACT CLEAR "Table";
FACT initialized "False";
!48 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example PRS (JAM) System
• This is the plan for the top level goal:
• ACHIEVE blocks_stacked.

• Note that the body contains a mix of instructions and goals.

• When executing, the goals will be added to the intention stack

Plan: {
NAME: "Top-level plan"
DOCUMENTATION:
  "Establish Block1 on Block2 on Block3."
GOAL:
  ACHIEVE blocks_stacked;
CONTEXT:
BODY:
  EXECUTE print "Goal: Blk1 on Blk2 on Blk3 on Table.\n";
  EXECUTE print "World Model at start is:\n";
  EXECUTE printWorldModel;
  EXECUTE print "ACHIEVEing Block3 on Table.\n";
  ACHIEVE ON "Block3" "Table";
  EXECUTE print "ACHIEVEing Block2 on Block3.\n";
  ACHIEVE ON "Block2" "Block3";
  EXECUTE print "ACHIEVEing Block1 on Block2.\n";
  ACHIEVE ON "Block1" "Block2";
  EXECUTE print "World Model at end is:\n";
  EXECUTE printWorldModel;
}

!49 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example PRS (JAM) System
• This plan also has a utility associated with it
• This is used by the agent during the deliberation phase
• The plan can also determine actions to execute if it fails

Plan: {
NAME: "Stack blocks that are already clear"
GOAL:
  ACHIEVE ON $OBJ1 $OBJ2;
CONTEXT:
BODY:
  EXECUTE print "Making sure " $OBJ1 " is clear\n";
  ACHIEVE CLEAR $OBJ1;
  EXECUTE print "Making sure " $OBJ2 " is clear.\n";
  ACHIEVE CLEAR $OBJ2;
  EXECUTE print "Moving " $OBJ1 " on top of " $OBJ2 ".\n";
  PERFORM move $OBJ1 $OBJ2;
UTILITY: 10;
FAILURE:
  EXECUTE print "\n\nStack blocks failed!\n\n";
}

!50 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example PRS (JAM) System
• This plan includes an EFFECTS field
• This determines what the agent should do once the agent has succeeded in executing all of the BODY instructions.

Plan: {
NAME: "Clear a block"
GOAL:
  ACHIEVE CLEAR $OBJ;
CONTEXT:
  FACT ON $OBJ2 $OBJ;
BODY:
  EXECUTE print "Clear " $OBJ2 " from on top of " $OBJ "\n";
  EXECUTE print "Move " $OBJ2 " to table.\n";
  ACHIEVE ON $OBJ2 "Table";
EFFECTS:
  EXECUTE print "Clear: Retract ON " $OBJ2 " " $OBJ "\n";
  RETRACT ON $OBJ2 $OBJ;
FAILURE:
  EXECUTE print "\n\nClearing block " $OBJ " failed!\n\n";
}
!51 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Example PRS (JAM) System
• I’ll leave it as an exercise to work out what plans will be executed
• If you have a solution and want me to check, let me know.

• The Java version of JAM and further details/documentation are available from Marcus Huber’s website:
• http://www.marcush.net/IRS/irs_downloads.html

Plan: {
NAME: "Move a block onto another object"
GOAL:
  PERFORM move $OBJ1 $OBJ2;
CONTEXT:
  FACT CLEAR $OBJ1;
  FACT CLEAR $OBJ2;
BODY:
  EXECUTE print "Performing low-level move action"
  EXECUTE print " of " $OBJ1 " to " $OBJ2 ".\n";
EFFECTS:
  WHEN : TEST (!= $OBJ2 "Table") {
    EXECUTE print "-Retract CLEAR " $OBJ2 "\n";
    RETRACT CLEAR $OBJ2;
  };
  FACT ON $OBJ1 $OBJ3;
  EXECUTE print "-move: Retract ON " $OBJ1 " " $OBJ3 "\n";
  RETRACT ON $OBJ1 $OBJ3;
  EXECUTE print "-move: Assert CLEAR " $OBJ3 "\n";
  ASSERT CLEAR $OBJ3;
  EXECUTE print "-move: Assert ON " $OBJ1 " " $OBJ2 "\n\n";
  ASSERT ON $OBJ1 $OBJ2;
FAILURE:
  EXECUTE print "\n\nMove failed!\n\n";
}

!52 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Summary
• This lecture has covered a lot of ground on practical reasoning.
• We started by discussing what practical reasoning was, and how it relates to intentions.
• We then looked at planning (how an agent achieves its desires) and how deliberation and means-ends reasoning fit into the basic agent control loop.
• We then refined the agent control loop, considering commitment and intention reconsideration.
• Finally, we looked at an implemented system (the textbook discusses a couple of others).

Class Reading (Chapter 4): “Plans and resource-bounded practical reasoning”, Michael E. Bratman, David J. Israel, Martha E. Pollack. Computational Intelligence 4: 1988. pp349-355.
This is an interesting, insightful article, with not too much technical content. It introduces the IRMA architecture for practical reasoning agents, which has been very influential in the design of subsequent systems.

• Next Lecture - we start looking at AgentSpeak and the Jason framework to exploit BDI
!53 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
COMP310
Multi-Agent Systems
Chapter 5 - Reactive and Hybrid Architectures

Dr Terry R. Payne
Department of Computer Science
Reactive Architectures
• There are many unsolved (some would say insoluble) problems
associated with symbolic AI.
• These problems have led some researchers to question the viability of the whole
paradigm, and to the development of reactive architectures.
• Although united by a belief that the assumptions underpinning mainstream AI are in
some sense wrong, reactive agent researchers use many different techniques.

• In this chapter, we look at alternative architectures that better


support some classes of agents and robots
• At the end, we then examine how hybrid architectures exploit the best aspects of deliberative and reactive ones

2 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


General Control Architecture
• So far, we have viewed the control architecture of an agent as one that:
• Perceives the environment
• Revises its internal state, identifying beliefs and desires
• Selects actions from its intention and plan
• Acts, possibly changing the environment

• Intention Reconsideration is important in highly dynamic environments

[Figure: the agent's see, action and next/state components interacting with the environment via percepts and actions.]

3 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Agent Control Loop as Layers
Sensors → Perception → Revise Internal State (Beliefs/Desires) → Select Intention and Plan → Execute Action → Actuators

The classic “Sense/Plan/Act” approach breaks it down serially like this.


4 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Behaviours
[Figure: behaviour-based control. Sensors feed a set of parallel behaviours (Communicate Data, Discover new area, Detect goal position, Avoid Obstacles, Follow right/left wall), whose outputs are combined by a coordination/fusion stage (Σ), e.g. fusion via vector summation, and sent to the actuators.]

• Behaviour based control sees things differently
• Behavioural chunks of control each connecting sensors to actuators
• Implicitly parallel
• Particularly well suited to Autonomous Robots
5 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Behaviours
• Range of ways of combining behaviours.
• Some examples:
• Pick the “best”
• Sum the outputs
• Use a weighted sum
• Flakey redux used a fuzzy combination which produced a nice integration of outputs.

[Figure: the coordination/fusion stage (Σ) feeding the actuators, e.g. fusion via vector summation.]
6 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
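As a concrete illustration of fusion via weighted vector summation (the behaviour names, sensor keys and weights here are invented for the sketch, not taken from any of the systems discussed): each behaviour maps the sensor readings to a 2-D motion vector, and the fused actuator command is the weighted sum of those vectors.

# Weighted vector-summation fusion of two simple behaviours.

def avoid_obstacles(sensors):
    # Push away from the nearest obstacle (vector pointing away from it).
    ox, oy = sensors["nearest_obstacle"]
    return (-ox, -oy)

def goto_goal(sensors):
    # Pull towards the goal position.
    gx, gy = sensors["goal"]
    return (gx, gy)

BEHAVIOURS = [(avoid_obstacles, 2.0),   # higher weight: safety dominates
              (goto_goal, 1.0)]

def fuse(sensors):
    """Weighted vector summation of all behaviour outputs."""
    vx = sum(w * b(sensors)[0] for b, w in BEHAVIOURS)
    vy = sum(w * b(sensors)[1] for b, w in BEHAVIOURS)
    return (vx, vy)

print(fuse({"nearest_obstacle": (1.0, 0.0), "goal": (4.0, 3.0)}))  # -> (2.0, 3.0)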
Subsumption Architecture
• A subsumption architecture is a hierarchy of
task-accomplishing behaviours.
• Each behaviour is a rather simple rule-like structure.
• Each behaviour ‘competes’ with others to exercise control
over the agent.
• Lower layers represent more primitive kinds of behaviour, (such
as avoiding obstacles), and have precedence over layers
further up the hierarchy.

• The resulting systems are, in terms of the amount of computation they do, extremely simple.
• Some of the robots do tasks that would be impressive if they were accomplished by symbolic AI systems.

(Rodney Brooks’ “subsumption architecture” was originally developed on the robot Genghis.)
7 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Brooks Behavioural Languages
• Brooks proposed the following three
theses:
1. Intelligent behaviour can be generated without
explicit representations of the kind that
symbolic AI proposes.
2. Intelligent behaviour can be generated without
explicit abstract reasoning of the kind that
symbolic AI proposes.
3. Intelligence is an emergent property of certain
complex systems.

8 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Brooks Behavioural Languages
• He identified two key ideas that have informed his
research:
1. Situatedness and embodiment: ‘Real’ intelligence is situated in the world,
not in disembodied systems such as theorem provers or expert systems.
2. Intelligence and emergence: ‘Intelligent’ behaviour arises as a result of an
agent’s interaction with its environment. Also, intelligence is ‘in the eye of
the beholder’; it is not an innate, isolated property.

• Brooks built several agents (such as Genghis) based on


his subsumption architecture to illustrate his ideas.
9 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Subsumption Architecture
• It is the piling up of layers that gives the approach its power.
• Complex behaviour emerges from simple components.
• Since each layer is independent, each can independently be:
• Coded / Tested / Debugged
• Can then assemble them into a complete system.

[Figure: the behaviour-based control diagram from earlier: sensors feed parallel behaviours whose outputs are fused (e.g. via vector summation, Σ) and sent to the actuators.]
10 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Abstract view of a Subsumption Machine
• Layered approach based on levels of competence
• Higher level behaviours inhibit lower levels

• Augmented finite state machine:

[Figure: an augmented finite state machine: sensors feed a behaviour model that drives the actuators, with reset, inhibition and suppression inputs.]

11 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Emergent Behaviour
Putting simple behaviours together leads to synergies.

[Figure: combining “obstacle avoidance” with “forward motion with a slight bias to the right” yields wall following as an emergent behaviour.]
12 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Emergent behaviour
• Important but not well-understood phenomenon
• Often found in behaviour-based/reactive systems
• Agent behaviours “emerge” from interactions of rules with environment.
• Sum is greater than the parts.
• The interaction links rules in ways that weren’t anticipated.

• Coded behaviour: In the programming scheme
• Observed behaviour: In the eyes of the observer
• There is no one-to-one mapping between the two!
• When observed behaviour “exceeds” programmed behaviour, then we have emergence.

Emergent Flocking
Flocking is a classic example of emergence, e.g. Reynolds’ “Boids”, or Matarić’s “nerd herd”.
Each agent uses the following three rules:
1. Don’t run into any other robot
2. Don’t get too far from other robots
3. Keep moving if you can
When run in parallel on many agents, the result is flocking.
13 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Toto
• Maja Matarić’s Toto is based on the subsumption architecture
• Can map spaces and execute plans without the need for a symbolic
representation.
• Inspired by “…the ability of insects such as bees to identify shortcuts
between feeding sites…”

• Each feature/landmark is a set of sensor readings


• Signature

• Recorded in a behaviour as a triple:


• Landmark type
• Compass heading
• Approximate length/size

• Distributed topological map.


14 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Toto
• Whenever Toto visited a particular
landmark, its associated map behaviour
would become activated
• If no behaviour was activated, then the landmark was
new, so a new behaviour was created
• If an existing behaviour was activated, it inhibited all
other behaviours

• Localization was based on which


behaviour was active.
• No map object, but the set of behaviours clearly
included map functionality.
15 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Steels’ Mars Explorer System

Objective: To explore a distant planet, and in particular, to collect samples of a precious rock. The location of the samples is not known in advance, but it is known that they tend to be clustered.

• Steels’ Mars explorer system
• Uses the subsumption architecture to achieve near-optimal cooperative performance in a simulated ‘rock gathering on Mars’ domain
• Individual behaviour is governed by a set of simple rules.
• Coordination between agents can also be achieved by leaving “markers” in the environment.

16 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Steels’ Mars Explorer System

1. For individual (non-cooperative) agents, the lowest-level behaviour (and hence the behaviour with the highest “priority”) is obstacle avoidance:
   if detect an obstacle then change direction

2. Any samples carried by agents are dropped back at the mother-ship:
   if carrying a sample and at the base then drop sample

3. If not at the mother-ship, then navigate back there:
   if carrying a sample and not at the base then travel up gradient
   • The “gradient” in this case refers to a virtual “hill” radio signal that slopes up to the mother ship/base.

4. Agents will collect samples they find:
   if detect a sample then pick sample up

5. An agent with “nothing better to do” will explore randomly. This is the highest-level behaviour (and hence lowest level “priority”):
   if true then move randomly

(Rules are numbered from 1, the highest priority, down to 5, the lowest.)
17 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
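These five rules map naturally onto a prioritised rule list; the sketch below (an illustration with invented state keys, not Steels' original code) checks conditions from highest priority (1) to lowest (5) and returns the action of the first behaviour that fires.

# Prioritised behaviour rules for a single (non-cooperative) Mars explorer.

MARS_EXPLORER_RULES = [
    (lambda s: s["obstacle_detected"],                    "change direction"),     # 1
    (lambda s: s["carrying_sample"] and s["at_base"],     "drop sample"),          # 2
    (lambda s: s["carrying_sample"] and not s["at_base"], "travel up gradient"),   # 3
    (lambda s: s["sample_detected"],                      "pick sample up"),       # 4
    (lambda s: True,                                      "move randomly"),        # 5
]

def decide(state):
    """Return the action of the highest-priority behaviour whose condition fires."""
    for condition, action in MARS_EXPLORER_RULES:
        if condition(state):
            return action

state = {"obstacle_detected": False, "carrying_sample": True,
         "at_base": False, "sample_detected": False}
print(decide(state))   # -> "travel up gradient"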
Steels’ Mars Explorer System

• Existing strategy works well when samples are distributed randomly across the terrain.
• However, samples are located in clusters
• Agents should cooperate with each other to locate clusters
• Solution to this is based on foraging ants.
• Agents leave a “radioactive” trail of crumbs when returning to the mother ship with samples.
• If another agent senses this trail, it follows the trail back to the source of the samples
• It also picks up some of the crumbs, making the trail fainter.
• If there are still samples, the trail is reinforced by the agent returning to the mother ship (leaving more crumbs)
• If no samples remain, the trail will soon be erased.

Rule 3': if carrying samples and not at the base then drop 2 crumbs and travel up gradient.
Rule 4.5: if sense crumbs then pick up 1 crumb and travel down gradient
18 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Situated Automata
• Approach proposed by Rosenschein and Kaelbling.
• An agent is specified in a rule-like (declarative) language.
• Then compiled down to a digital machine, which satisfies the declarative specification.

• This digital machine can operate in a provable time bound.


• Reasoning is done off line, at compile time, rather than online at run time.

• The theoretical limitations of the approach are not well


understood.
• Compilation (with propositional specifications) is equivalent to an NP-complete
problem.
• The more expressive the agent specification language, the harder it is to compile it.
19 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Situated Automata
• An agent is specified by perception and action
• Two programs are used to synthesise agents:
1. RULER specifies the perception component (see opposite)
2. GAPPS specifies the action component
• Takes a set of goal reduction rules and a top-level goal (symbolically specified) and generates a non-symbolic program

RULER takes as its input three components…
“…[A] specification of the semantics of the [agent’s] inputs (“whenever bit 1 is on, it is raining”); a set of static facts (“whenever it is raining, the ground is wet”); and a specification of the state transitions of the world (“if the ground is wet, it stays wet until the sun comes out”). The programmer then specifies the desired semantics for the output (“if this bit is on, the ground is wet”), and the compiler … [synthesises] a circuit whose output will have the correct semantics… All that declarative “knowledge” has been reduced to a very simple circuit…”
Kaelbling, L.P. (1991) A Situated Automata Approach to the Design of Embedded Agents. SIGART Bulletin, 2(4): 85-88

20 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Limitations of Reactive Systems

• Although there are clear advantages to Reactive Systems, there are also limitations!
   • If a model of the environment isn’t used, then sufficient information about the local environment is needed for determining actions.
   • As actions are based on local information, such agents inherently take a “short-term” view.
   • Emergent behaviour is very hard to engineer or validate; typically a trial-and-error approach is ultimately adopted.
   • Whilst agents with few layers are straightforward to build, models using many layers are inherently complex and difficult to understand.

21 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Hybrid Architectures
• Many researchers have argued that neither a completely deliberative nor a completely reactive approach is suitable for building agents.
• They have suggested using hybrid systems, which attempt to marry classical and alternative approaches.
• An obvious approach is to build an agent out of two (or more) subsystems:
   • a deliberative one, containing a symbolic world model, which develops plans and makes decisions in the way proposed by symbolic AI; and
   • a reactive one, which is capable of reacting to events without complex reasoning.

22 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Hybrid Architectures
• Often, the reactive component is given some kind of precedence over the deliberative one.
• This kind of structuring leads naturally to the idea of a layered architecture, of which InterRap and TouringMachines are examples.
   • In such an architecture, an agent’s control subsystems are arranged into a hierarchy…
   • …with higher layers dealing with information at increasing levels of abstraction.
• A key problem in such architectures is what kind of control framework to embed the agent’s subsystems in, to manage the interactions between the various layers.
23 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Hybrid Architectures
• Horizontal layering.
   • Layers are each directly connected to the sensory input and action output.
   • In effect, each layer itself acts like an agent, producing suggestions as to what action to perform.
• Vertical layering.
   • Sensory input and action output are each dealt with by at most one layer.

24 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Ferguson - TouringMachines

• The TouringMachines architecture consists of perception and action subsystems.
• These interface directly with the agent’s environment, and three control layers, embedded in a control framework, which mediates between the layers.
[Figure: sensor input feeds a perceptual sub-system, which feeds the modelling, planning and reactive layers; these feed an action subsystem that produces actions, with a control subsystem mediating between the layers.]

25 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Ferguson - TouringMachines
• The reactive layer is implemented as a set of situation-action rules, à la subsumption architecture.
• The planning layer constructs plans and selects actions to execute in order
to achieve the agent’s goals.

rule-1: kerb-avoidance
if
is-in-front(Kerb, Observer) and
speed(Observer) > 0 and
separation(Kerb, Observer) < KerbThreshHold
then
change-orientation(KerbAvoidanceAngle)

26 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Ferguson - TouringMachines
• The modelling layer contains symbolic representations
of the ‘cognitive state’ of other entities in the agent’s
environment.
• The three layers communicate with each other and are embedded in a control framework, which uses control rules.
• Such control structures have become common in robotics.

censor-rule-1:
if
entity(obstacle-6) in perception-buffer
then
remove-sensory-record(layer-R, entity(obstacle-6))

27 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Real World Example: Stanley
• Won the 2005 DARPA Grand Challenge
• Used a combination of the subsumption
architecture with deliberative planning
• Consists of 30 different independently operating
modules across 6 layers
   • Global Services Layer
   • User Interface Layer
   • Vehicle Interface Layer
   • Planning and Control Layer
   • Perception Layer
   • Sensor Interface Layer
“The key challenge… was not one of action, but one of perception…”
28 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Question 2.a

29 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018


Q2.a answer
The TouringMachines architecture is an example of a hybrid architecture that combines reactive behaviour with deliberative, or pro-active, behaviour. It consists of three layers, each of which operates in parallel. Each has
access to the perceptual sub-system, which is responsible for converting the percepts obtained from sensor input
into predicates that can be used for reasoning. In addition, each layer can result in the generation of actions that
can then be executed. A control subsystem is responsible for monitoring the incoming percepts held in the
perceptual sub-system, and then determining which of the actions (if any) should be executed from the different
layers; i.e. it determines which layer is responsible for controlling the agent. In particular, the control subsystem
can suppress sensor information going to certain layers, or it can censor actions generated by the different layers.
The reactive layer is responsible for responding to changes in the environment (in a similar way to Brooks’
subsumption architecture). A set of situation-action rules are defined, which then fire if they match the sensor input.
For example, if the agent is controlling an autonomous vehicle and it detects a kerb unexpectedly in front of the
vehicle, it can stop (or slow down) and turn to avoid the kerb.
The planning layer is responsible for determining the actions necessary to achieve the agent's goals. Under
normal operation, this layer determines what the agent should do. This is done by making use of a set of planning
schema, relating to different goals, and then performing the necessary actions. Note that no low level planning is
performed.
The modelling layer represents the various entities in the world. It is responsible for modelling the world, including
other agents, and for determining the agent’s goals, or planning goals that resolve any conflicts with other
agents if such conflicts are detected. Whenever a goal is generated, it is passed on to the planning layer, which
then determines the final actions.

Although several of the details here are from your notes, much more description was
originally given in the lecture, and is also available from the course text book.
30 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018
Summary
• This lecture has looked at two further kinds of agent:
   • Reactive agents; and
   • Hybrid agents.
• Reactive agents build complex behaviour from simple components.
• Hybrid agents try to combine the speed of reactive agents with the power of deliberative agents.

Class Reading (Chapter 5):
“A Robust Layered Control System for a Mobile Robot,” Rodney A. Brooks. IEEE Journal of Robotics and Automation, 2(1), March 1986, pp. 14-23. (Also MIT AI Memo 864, September 1985.)
A provocative, fascinating article, packed with ideas. It is interesting to compare this with some of Brooks’ later - arguably more controversial - articles.
31 Copyright: M. J. Wooldridge, S.Parsons and T.R.Payne, Spring 2013. Updated 2018

IT4899
Multi-Agent Systems
Chapter 11 - Multi-Agent Interactions
Dr. Nguyen Binh Minh
Department of Information Systems

What are Multi-Agent Systems?
• A multiagent system contains a number of agents that:
   • interact through communication;
   • are able to act in an environment;
   • have different “spheres of influence” (which may coincide); and
   • will be linked by other (organisational) relationships.
• We will look at how agents decide how to interact in competitive situations.
[Figure: an environment populated by agents, each with its own sphere of influence; the key marks agents, interactions, spheres of influence and organisational relationships.]
1 2 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Utilities and Preferences
• Our Assumptions:
   • Agents are assumed to be self-interested, i.e. they have preferences over how the environment is.
   • Assume Ω = {ω1, ω2, . . .} is the set of “outcomes” that agents have preferences over.
• We capture preferences by utility functions, represented as real numbers (ℝ):
   ui : Ω → ℝ
   uj : Ω → ℝ
• Utility functions lead to preference orderings over outcomes, e.g.:
   ω ≽i ω′ means ui(ω) ≥ ui(ω′)
   ω ≻i ω′ means ui(ω) > ui(ω′)
   • where ω and ω′ are both possible outcomes in Ω.
• Utility is not money; it is just a way to encode preferences.

Multiagent Encounters
• We need a model of the environment in which these agents will act...
   • Assume we have just two agents: Ag = {i, j}.
   • Agents simultaneously choose an action to perform, and as a result of the actions they select, an outcome in Ω will result.
   • The actual outcome depends on the combination of actions.
   • Assume each agent has just two possible actions that it can perform: Ac = {C, D}, where
      • C (“cooperate”) and
      • D (“defect”).
• Environment behaviour is given by a state transformer function τ (introduced in Chapter 2).
3 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 4 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Multiagent Encounters
• Here is a state transformer function τ(i, j):
   τ(D,D) = ω1   τ(D,C) = ω2   τ(C,D) = ω3   τ(C,C) = ω4
   • This environment is sensitive to the actions of both agents.
• With this state transformer, neither agent has any influence in this environment:
   τ(D,D) = ω1   τ(D,C) = ω1   τ(C,D) = ω1   τ(C,C) = ω1
• With this one, the environment is controlled by j:
   τ(D,D) = ω1   τ(D,C) = ω2   τ(C,D) = ω1   τ(C,C) = ω2

Rational Action
• Suppose we have the case where both agents can influence the outcome, and they have the following utility functions:
   ui(ω1)=1   ui(ω2)=1   ui(ω3)=4   ui(ω4)=4
   uj(ω1)=1   uj(ω2)=4   uj(ω3)=1   uj(ω4)=4
• With a bit of abuse of notation:
   ui(D,D)=1   ui(D,C)=1   ui(C,D)=4   ui(C,C)=4
   uj(D,D)=1   uj(D,C)=4   uj(C,D)=1   uj(C,C)=4
• Then agent i’s preferences are (C, C) ≽i (C, D) ≻i (D, C) ≽i (D, D).
• In this case, what should i do?
   • i prefers all outcomes that arise through C over all outcomes that arise through D.
   • Thus C is the rational choice for i (see the sketch after this slide).
5 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 6 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
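The “what should i do?” reasoning above can be checked mechanically. A minimal Python sketch, assuming the τ and utility values given on the slide; names such as tau and u_i are illustrative.

# Outcomes, state transformer and i's utilities from the "Rational Action" example.
tau = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
u_i = {"w1": 1, "w2": 1, "w3": 4, "w4": 4}   # agent i's utility over outcomes

def outcomes_for_i(action):
    """Outcomes i can bring about by playing `action`, whatever j does."""
    return [tau[(action, j_action)] for j_action in ("C", "D")]

worst_via_C = min(u_i[w] for w in outcomes_for_i("C"))
best_via_D  = max(u_i[w] for w in outcomes_for_i("D"))
print(worst_via_C, best_via_D)     # 4 1
print(worst_via_C >= best_via_D)   # True: every C-outcome beats every D-outcome, so C is rational for i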


Payoff Matrices
• We can characterise the previous scenario in the payoff matrix shown below:

                i defects     i cooperates
   j defects    i=1, j=1      i=4, j=1
   j cooperates i=1, j=4      i=4, j=4

   • Agent i is the column player and gets the upper reward in a cell.
   • Agent j is the row player and gets the lower reward in a cell.
   • For example, where i cooperates and j defects, i gains a utility of 4 whereas j gains a utility of only 1.
   • Recall i’s preferences: (C, C) ≽i (C, D) ≻i (D, C) ≽i (D, D).
• Actually there are two matrices here, one (call it A) that specifies the payoff to i and another, B, that specifies the payoff to j.
• Sometimes we’ll write the game as (A, B) in recognition of this.

Solution Concepts
• How will a rational agent behave in any given scenario?
• Play. . .
   • dominant strategy;
   • Nash equilibrium strategy;
   • Pareto optimal strategies;
   • strategies that maximise social welfare.

7 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 8 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Dominant Strategies
• Given any particular strategy s (either C or D) that agent i can play, there will be a number of possible outcomes.
• We say s1 dominates s2 if every outcome possible by i playing s1 is preferred over every outcome possible by i playing s2.
• Thus in the game shown in the previous payoff matrix, C dominates D for both players.
• A rational agent will never play a dominated strategy,
   • i.e. a strategy that is dominated by (and thus inferior to) another.
• So in deciding what to do, we can delete dominated strategies.
   • Unfortunately, there isn’t always a unique un-dominated strategy.
   • A sketch of this dominance check follows the slide.

9 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 10 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
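The definition of dominance translates directly into code. A minimal Python sketch under the payoff convention used in these slides (a mapping from (i’s action, j’s action) to i’s utility); the values are those of the example game above.

# Does strategy s1 dominate s2 for agent i?
u_i = {("C", "C"): 4, ("C", "D"): 4, ("D", "C"): 1, ("D", "D"): 1}

def dominates(s1, s2, payoff, opponent_actions=("C", "D")):
    """True if every outcome of playing s1 is preferred to every outcome of s2."""
    worst_s1 = min(payoff[(s1, j)] for j in opponent_actions)
    best_s2  = max(payoff[(s2, j)] for j in opponent_actions)
    return worst_s1 > best_s2

print(dominates("C", "D", u_i))   # True: C dominates D for i in this game
print(dominates("D", "C", u_i))   # False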


Nash Equilibrium
• In general, we will say that two strategies s1 and s2 are in Nash equilibrium (NE) if:
   • under the assumption that agent i plays s1, agent j can do no better than play s2;
      • i.e. if I drive on the left side of the road, you can do no better than also driving on the left!
   • under the assumption that agent j plays s2, agent i can do no better than play s1.
      • i.e. if you drive on the left side of the road, I can do no better than also driving on the left!
• Neither agent has any incentive to deviate from a Nash Equilibrium (NE).
(John Forbes Nash, Nobel Laureate in Economics, was portrayed by Russell Crowe in the film “A Beautiful Mind”.)

Nash Equilibrium
• Consider the payoff matrix below:

                i defects     i cooperates
   j defects    i=5, j=3      i=1, j=2
   j cooperates i=0, j=2      i=0, j=1

• Here the Nash equilibrium (NE) is (D, D).
• In a game like this you can find the NE by cycling through the outcomes, asking if either agent can improve its payoff by switching its strategy (see the sketch after this slide).
• Thus, for example, (C, D) is not a NE because i can switch its payoff from 1 to 5 by switching from C to D.
11 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 12 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
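The procedure just described (cycle through the outcomes and ask whether either agent can improve its payoff by switching) is easy to write down. A minimal Python sketch; the payoff dictionaries encode the example matrix on this slide, keyed by (i’s action, j’s action).

ACTIONS = ("C", "D")
u_i = {("D", "D"): 5, ("C", "D"): 1, ("D", "C"): 0, ("C", "C"): 0}
u_j = {("D", "D"): 3, ("C", "D"): 2, ("D", "C"): 2, ("C", "C"): 1}

def pure_nash_equilibria(u_i, u_j, actions=ACTIONS):
    equilibria = []
    for ai in actions:
        for aj in actions:
            i_ok = all(u_i[(ai, aj)] >= u_i[(alt, aj)] for alt in actions)
            j_ok = all(u_j[(ai, aj)] >= u_j[(ai, alt)] for alt in actions)
            if i_ok and j_ok:          # neither agent can gain by deviating
                equilibria.append((ai, aj))
    return equilibria

print(pure_nash_equilibria(u_i, u_j))   # [('D', 'D')]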


Nash Equilibrium
• More formally:
   • A strategy pair (i, j) is a pure strategy Nash Equilibrium solution to the game (A, B) if:
      for all i′: A[i′, j] ≤ A[i, j], and
      for all j′: B[i, j′] ≤ B[i, j].
• Unfortunately:
   • Not every interaction scenario has a pure strategy Nash Equilibrium (NE).
   • Some interaction scenarios have more than one pure strategy Nash Equilibrium (NE).

Nash Equilibrium
• The first game below has two pure strategy NEs, (C, C) and (D, D).
   • In both cases, a single agent can’t unilaterally improve its payoff.

                i defects     i cooperates
   j defects    i=5, j=3      i=1, j=2
   j cooperates i=0, j=2      i=3, j=3

• The second game below has no pure strategy NE.
   • For every outcome, one of the agents will improve its utility by switching its strategy.
   • We can find a form of NE in such games, but we need to go beyond pure strategies.

                i defects     i cooperates
   j defects    i=2, j=1      i=1, j=2
   j cooperates i=0, j=2      i=1, j=1
13 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 14 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Mixed Strategy Nash equilibrium
• Matching Pennies
   • Players i and j simultaneously choose the face of a coin, either “heads” or “tails”.
   • If they show the same face, then i wins, while if they show different faces, then j wins.

                i heads       i tails
   j heads      i=1, j=-1     i=-1, j=1
   j tails      i=-1, j=1     i=1, j=-1

• NO pair of strategies forms a pure strategy NE:
   • whatever pair of strategies is chosen, somebody will wish they had done something else.
• The solution is to allow mixed strategies:
   • play “heads” with probability 0.5
   • play “tails” with probability 0.5.
• This is a Mixed Nash Equilibrium strategy (see the sketch after this slide).

Mixed Strategy Nash equilibrium
• Consider the game Rock/Paper/Scissors:
   • Paper covers rock
   • Scissors cut paper
   • Rock blunts scissors
• This has the following payoff matrix:

                i rock        i paper       i scissors
   j rock       i=0, j=0      i=1, j=0      i=0, j=1
   j paper      i=0, j=1      i=0, j=0      i=1, j=0
   j scissors   i=1, j=0      i=0, j=1      i=0, j=0

• What should you do?
   • Choose a strategy at random!
15 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 16 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
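The claim that the 50/50 mix is an equilibrium can be checked by computing expected payoffs. A minimal Python sketch using the matching pennies payoffs above; the variable names are illustrative.

# Expected payoff for i in matching pennies when j mixes 50/50.
u_i = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
p_j = {"H": 0.5, "T": 0.5}            # j's mixed strategy

def expected_payoff(i_action, opponent_mix):
    return sum(prob * u_i[(i_action, j_action)]
               for j_action, prob in opponent_mix.items())

print(expected_payoff("H", p_j), expected_payoff("T", p_j))   # 0.0 0.0
# i is indifferent between H and T, so it cannot gain by deviating:
# the 50/50 mix is a (mixed strategy) Nash equilibrium.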


Mixed Strategies
• A mixed strategy has the form:
   • play α1 with probability p1
   • play α2 with probability p2
   • ...
   • play αk with probability pk,
   • such that p1 + p2 + ··· + pk = 1.
• Nash’s Theorem: Nash proved that every finite game has a Nash equilibrium in mixed strategies.
   • (Unlike the case for pure strategies.)
   • So this result overcomes the lack of solutions; but there still may be more than one Nash equilibrium. . .

Pareto Optimality
• An outcome is said to be Pareto optimal (or Pareto efficient) if:
   • there is no other outcome that makes one agent better off without making another agent worse off.
   • If an outcome is Pareto optimal, then at least one agent will be reluctant to move away from it (because this agent will be worse off).
   • If an outcome ω is not Pareto optimal, then there is another outcome ω′ that makes everyone as happy, if not happier, than ω.
• “Reasonable” agents would agree to move to ω′ in this case.
   • Even if I don’t directly benefit from ω′, you can benefit without me suffering.
• The game below has one Pareto efficient outcome, (D, D): there is no solution in which either agent does better.

                i defects     i cooperates
   j defects    i=5, j=3      i=1, j=2
   j cooperates i=0, j=2      i=0, j=1
17 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 18 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Social Welfare
• The social welfare of an outcome ω is the sum of the utilities that each agent gets from ω:
   sw(ω) = Σ over i in Ag of ui(ω)
• Think of it as the “total amount of money in the system”.
• As a solution concept:
   • it may be appropriate when the whole system (all agents) has a single owner (then the overall benefit of the system is important, not individuals).
   • It doesn’t consider the benefits to individuals.
   • A very skewed outcome can maximise social welfare.
• In both of the games below, (C, C) maximises social welfare (a sketch computing social welfare and Pareto optimal outcomes follows this slide):

   Game 1:          i defects     i cooperates
   j defects        i=2, j=2      i=1, j=1
   j cooperates     i=3, j=3      i=4, j=4

   Game 2:          i defects     i cooperates
   j defects        i=2, j=2      i=1, j=1
   j cooperates     i=3, j=3      i=7, j=0

Competitive and Zero-Sum Interactions
• Where the preferences of agents are diametrically opposed we have strictly competitive scenarios.
• Zero-sum encounters are those where utilities sum to zero:
   ui(ω) + uj(ω) = 0 for all ω ∈ Ω.
   • Zero-sum encounters are bad news: for me to get positive utility you have to get negative utility! The best outcome for me is the worst for you!
   • Zero-sum encounters in real life are very rare . . . but people tend to act in many scenarios as if they were zero sum.
• Most games have some room in the set of outcomes for agents to find (somewhat) mutually beneficial outcomes.
19 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 20 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
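Both Pareto optimality and social welfare can be computed directly from the payoffs. A minimal Python sketch using Game 1 from the Social Welfare slide; the function names are illustrative.

# Payoffs keyed by (i_action, j_action); values are (u_i, u_j).
payoffs = {("D", "D"): (2, 2), ("C", "D"): (1, 1), ("D", "C"): (3, 3), ("C", "C"): (4, 4)}

def pareto_optimal(payoffs):
    def dominated(w):
        # w is dominated if some other outcome makes at least one agent
        # better off and neither agent worse off.
        return any(all(b >= a for a, b in zip(payoffs[w], payoffs[v]))
                   and any(b > a for a, b in zip(payoffs[w], payoffs[v]))
                   for v in payoffs if v != w)
    return [w for w in payoffs if not dominated(w)]

def social_welfare(payoffs):
    return {w: sum(us) for w, us in payoffs.items()}

sw = social_welfare(payoffs)
print(pareto_optimal(payoffs))     # [('C', 'C')]
print(max(sw, key=sw.get))         # ('C', 'C') maximises social welfare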


The Prisoner’s Dilemma
Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
• if one confesses and the other does not (C,D) or (D,C), the confessor will be freed, and the other will be jailed for three years;
• if both confess (D,D), then each will be jailed for two years.
Both prisoners know that if neither confesses (C,C), then they will each be jailed for one year.
As neither man wants to admit to being guilty, cooperation means not confessing!

• Payoff matrix for the prisoner’s dilemma:

                i defects     i cooperates
   j defects    i=2, j=2      i=1, j=4
   j cooperates i=4, j=1      i=3, j=3

   • Top left: If both defect, then both get punishment for mutual defection.
   • Top right: If i cooperates and j defects, i gets the sucker’s payoff of 1, while j gets 4.
   • Bottom left: If j cooperates and i defects, j gets the sucker’s payoff of 1, while i gets 4.
   • Bottom right: Reward for mutual cooperation (i.e. neither confesses).

What should you do?
• The individual rational action is to defect.
   • This guarantees a payoff of no worse than 2, whereas cooperating may yield a payoff of only 1 (the sucker’s payoff).
   • So defection is the best response to all possible strategies: both agents defect, and get payoff = 2.
• But intuition says this is not the best outcome:
   • Surely they should both cooperate and each get a payoff of 3!
• This is why the Prisoner’s Dilemma game is interesting:
   • The analysis seems to give us a paradoxical answer.
• Solution concepts for this game:
   • The dominant strategy here is to defect.
   • (D, D) is the only Nash equilibrium.
   • All outcomes except (D, D) are Pareto optimal.
   • (C, C) maximises social welfare.

21 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 22 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


The Prisoner’s Dilemma
• This apparent paradox is the fundamental problem of multi-agent interactions.
   • It appears to imply that cooperation will not occur in societies of self-interested agents.
• Real world examples:
   • nuclear arms reduction - “why don’t I keep mine?”
   • free rider systems - public transport, file sharing;
      • in the UK - television licenses.
      • in the US - funding for NPR/PBS.
• The prisoner’s dilemma is ubiquitous.
• Can we recover cooperation?

Arguments for Recovering Cooperation
• Conclusions that some have drawn from this analysis:
   • the game theory notion of rational action is wrong!
   • somehow the dilemma is being formulated wrongly.
• Arguments to recover cooperation:
   • Altruism (we are not all Machiavelli)
   • The other prisoner is my twin!
   • Program equilibria and mediators
   • The shadow of the future. . .
23 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 24 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


We are not all Machiavelli
• Lots of “altruism” is something else:
   • Either there is some delayed reciprocity; or
   • There are mechanisms to punish defection.
   • There is a reason why HMRC (or the IRS in the US) audits people’s taxes :-)
• Altruism may be something that makes us feel good.
   • This is why we are prepared to pay for it.
“... We aren’t all that hard-boiled, and besides, people really do act altruistically ...”

The Other Prisoner is My Twin
• Argue that both prisoners will think alike and decide that it is best to cooperate.
   • If they are twins, they must think along the same lines, right?
   • (Or they have some agreement that they won’t talk.)
• Well, if this is the case, we aren’t really playing the Prisoner’s Dilemma!
• Possibly more to the point is that if you know the other person is going to cooperate, you are still better off defecting.
25 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 26 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Program Equilibria
• The strategy you really want to play in the prisoner’s dilemma is: “I’ll cooperate if he will”.
• Program equilibria provide one way of enabling this.
• Each agent submits a program strategy to a mediator, which jointly executes the strategies.
   • Crucially, strategies can be conditioned on the strategies of the others.

   Player 1 (P1):  If (P1 == P2) { do(C) } else { do(D) }  stop
   Player 2 (P2):  If (P1 == P2) { do(C) } else { do(D) }  stop
   Mediator outcome: P1:C  P2:C

• The best response to this program:
   • submit the same program, giving an outcome of (C, C)!

   Player 1 (P1):  If (P1 == P2) { do(C) } else { do(D) }  stop
   Player 2 (P2):  do(D)  stop
   Mediator outcome: P1:D  P2:D

The Iterated Prisoner’s Dilemma (The Shadow of the Future)
• Play the game more than once.
   • If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate.
   • If you defect, you can be punished (compared to the co-operation reward).
   • If you get suckered, then what you lose can be amortised over the rest of the iterations, making it a small loss.
• Cooperation is (provably) the rational choice in the infinitely repeated prisoner’s dilemma.
   • (Hurrah!)
• But what if there are a finite number of repetitions?
27 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 28 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Backwards Induction
• But. . . suppose you both know that you will play the game exactly n times.
   • On round n − 1, you have an incentive to defect, to gain that extra bit of payoff.
   • But this makes round n − 2 the last “real” round, and so you have an incentive to defect there, too.
   • This is the backwards induction problem.
• Playing the prisoner’s dilemma with a fixed, finite, pre-determined, commonly known number of rounds, defection is the best strategy.
   • That seems to suggest that you should never cooperate.
• So how does cooperation arise? Why does it make sense?
   • As long as you have some probability of repeating the interaction, co-operation can have a better expected payoff.
   • As long as there are enough co-operative folk out there, you can come out ahead by co-operating.

Axelrod’s Tournament
• Suppose you play the iterated prisoner’s dilemma (IPD) against a range of opponents.
• What approach should you choose, so as to maximise your overall payoff?
   • Is it better to defect, and hope to find suckers to rip-off?
   • Or is it better to cooperate, and try to find other friendly folk to cooperate with?
• Robert Axelrod (1984) investigated this problem, with a computer tournament for programs playing the iterated prisoner’s dilemma.
   • Axelrod hosted the tournament and various researchers sent in approaches for playing the game.


29 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 30 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Strategies in Axelrod’s Tournament
• Surprisingly, TIT-FOR-TAT won.
   • But don’t read too much into this :-)
• Some of the entries (a sketch of a few of them follows this slide):
   • Tit-For-Tat: 1. On round u = 0, cooperate. 2. On round u > 0, do what your opponent did on round u − 1.
   • ALL-D: “Always defect” - the hawk strategy.
   • Tester: On the 1st round, defect. If the opponent retaliated, then play TIT-FOR-TAT. Otherwise intersperse cooperation and defection.
   • JOSS: As TIT-FOR-TAT, except periodically defect.
• In scenarios like the Iterated Prisoner’s Dilemma (IPD) tournament…
   • …the best approach depends heavily on what the full set of approaches is.
• TIT-FOR-TAT did well because there were other players it could co-operate with.

Recipes for Success in Axelrod’s Tournament
• Don’t be envious:
   • Don’t play as if it were zero sum!
• Be nice:
   • Start by cooperating, and reciprocate cooperation.
• Retaliate appropriately:
   • Always punish defection immediately, but use “measured” force;
   • don’t overdo it.
• Don’t hold grudges:
   • Always reciprocate cooperation immediately.
31 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 32 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
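A few of the tournament entries are easy to sketch. This is illustrative Python, not Axelrod’s original tournament code; the payoffs are the prisoner’s dilemma values used earlier in this chapter, keyed by (my action, their action).

import random

PD = {("C", "C"): 3, ("C", "D"): 1, ("D", "C"): 4, ("D", "D"): 2}

def tit_for_tat(my_history, their_history):
    return "C" if not their_history else their_history[-1]

def all_d(my_history, their_history):
    return "D"

def joss(my_history, their_history):
    # Like TIT-FOR-TAT, but occasionally defects anyway.
    return "D" if random.random() < 0.1 else tit_for_tat(my_history, their_history)

def play(strategy_a, strategy_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strategy_a(hist_a, hist_b), strategy_b(hist_b, hist_a)
        score_a += PD[(a, b)]
        score_b += PD[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, all_d))   # TIT-FOR-TAT loses only the first round; mutual defection follows
print(play(tit_for_tat, tit_for_tat))   # sustained mutual cooperation

Running a round-robin over such functions is essentially what the tournament did: the interesting observation is how much a strategy’s score depends on which other strategies are present.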


Stag Hunt
• A group of hunters goes stag hunting.
• If they all stay focussed on the stag, they will catch it and all have a lot of food.
• If some of them head off to catch rabbits, the stag will escape.
   • In this case the rabbit hunters will have some small amount of food and the (remaining) stag hunters will go hungry.
• What should each hunter do?
“...You and a friend decide it would be a great joke to show up on the last day of school with some ridiculous haircut. Egged on by your clique, you both swear you’ll get the haircut. A night of indecision follows. As you anticipate your parents’ and teachers’ reactions [...] you start wondering if your friend is really going to go through with the plan. Not that you don’t want the plan to succeed: the best possible outcome would be for both of you to get the haircut. The trouble is, it would be awful to be the only one to show up with the haircut. That would be the worst possible outcome. You’re not above enjoying your friend’s embarrassment. If you didn’t get the haircut, but the friend did, and looked like a real jerk, that would be almost as good as if you both got the haircut...” - Mike Wooldridge

Stag Hunt
• Stag Hunt Payoff Matrix:

                i defects     i cooperates
   j defects    i=2, j=2      i=1, j=3
   j cooperates i=3, j=1      i=4, j=4

• Two Nash equilibrium solutions, (C, C) and (D, D).
   • If you know I’ll co-operate, the best you can do is to co-operate as well.
   • If you know I’ll defect, then that is the best you can do as well.
• Social welfare is maximised by (C, C).
• The only Pareto efficient outcome is (C, C).
• The difference from the prisoner’s dilemma is that now it is better if you both co-operate than if you defect while the other co-operates.
• As usual with Nash equilibrium, the theory gives us no real help in deciding what the other party will do.
   • Hence the worrying about the haircut.
• The same scenario occurs in mutinies and strikes.
   • We would all be better off if our hated captain is deposed, but if some of us give in, we will all be hanged.
33 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 34 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Game of Chicken
The game of chicken gets its name from a rather silly, macho “game” that was supposedly popular amongst juvenile delinquents in 1950s America; the game was immortalised by James Dean in the 1950s film Rebel without a Cause. The purpose of the game is to establish who is bravest of the two players. The game is played by both players driving their cars at high speed towards a cliff. The idea is that the least brave of the two (the “chicken”) will be the first to drop out of the game by jumping out of the speeding car. The winner is the one who lasts longest in the car. Of course, if neither player jumps out of the car, then both cars fly off the cliff, taking their foolish passengers to a fiery death on the rocks that undoubtedly lie at the foot of the cliff.
• Chicken has the following payoff matrix (jump = cooperate, stay in the car = defect):

                i defects     i cooperates
   j defects    i=1, j=1      i=2, j=4
   j cooperates i=4, j=2      i=3, j=3

• Difference to the prisoner’s dilemma:
   • Mutual defection is the most feared outcome,
   • whereas the sucker’s payoff is most feared in the prisoner’s dilemma.
• There is no dominant strategy.
• Strategy pairs (C, D) and (D, C) are Nash equilibria.
   • If I think you will stay in the car, I should jump out.
   • If I think you will jump out of the car, I should stay in.
• All outcomes except (D, D) are Pareto optimal.
• All outcomes except (D, D) maximise social welfare.

Other Symmetric 2x2 Games
• Given the 4 possible outcomes of (symmetric) cooperate/defect games, there are 24 possible orderings on outcomes.
• The textbook lists them all, but here we give the combinations (a sketch classifying a game from its ordering follows this slide).
• These are more abstract descriptions of the games than the payoff matrices we considered.
• All payoff matrices consistent with these preference orders are instances of the games.

First, cases with dominant solutions:
• Cooperation dominates:
   CC ≻i CD ≻i DC ≻i DD
   CC ≻i CD ≻i DD ≻i DC
• Deadlock (you will always do best by defecting):
   DC ≻i DD ≻i CC ≻i CD
   DC ≻i DD ≻i CD ≻i CC
Games that we looked at in detail:
• Prisoner’s dilemma: DC ≻i CC ≻i DD ≻i CD
• Chicken: DC ≻i CC ≻i CD ≻i DD
• Stag Hunt: CC ≻i DC ≻i DD ≻i CD
35 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 36 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
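The orderings listed above can be used to name a symmetric 2×2 game automatically. An illustrative Python sketch covering only the orderings given on this slide; orderings are written best outcome first, from agent i’s point of view.

KNOWN_GAMES = {
    ("DC", "CC", "DD", "CD"): "prisoner's dilemma",
    ("DC", "CC", "CD", "DD"): "chicken",
    ("CC", "DC", "DD", "CD"): "stag hunt",
}

def classify(ordering):
    """Map i's preference ordering over outcomes to a named game, if known."""
    return KNOWN_GAMES.get(tuple(ordering), "other symmetric 2x2 game")

print(classify(["DC", "CC", "DD", "CD"]))   # prisoner's dilemma
print(classify(["CC", "CD", "DC", "DD"]))   # other (one of the 'cooperation dominates' orderings)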


Summary
• This chapter has looked at agent interactions, and one approach to characterising them.
   • The approach we have looked at here is that of game theory, a powerful tool for analysing interactions.
   • We looked at the solution concepts of Nash equilibrium and Pareto optimality.
   • We then looked at the classic Prisoner’s Dilemma, and how the game can be analysed using game theory.
   • We also looked at the iterated Prisoner’s Dilemma, and other canonical 2 × 2 games.

Class Reading (Chapter 11):
“The Evolution of Cooperation”, R. Axelrod. Basic Books, New York, 1984.
This is a book, but it is quite easy going, and most of the important content is conveyed in the first few chapters. Apart from anything else, it is beautifully written, and it is easy to see why so many readers are enthusiastic about it. However, read it in conjunction with Binmore’s critiques, cited in the course text.
37 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


IT4899
Multi-Agent Systems
Chapters 6/7 - Ontologies & Communication
Dr. Nguyen Binh Minh
Department of Information Systems

Social Behaviour
• Previously we looked at:
   • Deductive Agents
   • Practical Reasoning, and BDI Agents
   • Reactive and Hybrid Architectures
• We said:
   • An intelligent agent is a computer system capable of flexible autonomous action in some environment, where by flexible we mean:
      • reactive
      • pro-active
      • social
• This is where we deal with the “social” bit, showing how agents communicate and share information.

1 2 Copyright: Binh Minh Nguyen, 2019


Agent Communication
• In this lecture, we cover macro-aspects of intelligent agent technology, and those issues relating to the agent society, rather than the individual:
   • communication:
      • speech acts; KQML & KIF; FIPA ACL
   • ontologies:
      • the role of ontologies in communication
      • aligning ontologies
      • OWL
• There are some limited things that one can do without communication, but they are…, well…, limited!!!
• Most work on multiagent systems assumes communication.

Speech Acts
• Austin’s 1962 book “How to Do Things with Words” is usually taken to be the origin of speech acts.
   • (John Langshaw Austin, who published the book in 1962.)
• Speech act theories are pragmatic theories of language, that is, theories of language use:
   • they attempt to account for how language is used by people every day to achieve their goals and intentions.
• Austin noticed that some utterances are rather like ‘physical actions’ that appear to change the state of the world.

3 Copyright: Binh Minh Nguyen, 2019 4 Copyright: Binh Minh Nguyen, 2019


Speech Acts
• Paradigm examples are:
   • declaring war;
   • naming a child;
   • “I now pronounce you man and wife” :-)
• But more generally, everything we utter is uttered with the intention of satisfying some goal or intention.
• A theory of how utterances are used to achieve intentions is a speech act theory.
   • Proposed by John Searle, 1969.

Speech Acts: Searle
• In his 1969 book Speech Acts: an Essay in the Philosophy of Language, John R. Searle identified:
   • representatives:
      • such as informing, e.g., ‘It is raining’
   • directives:
      • attempts to get the hearer to do something, e.g., ‘please make the tea’
   • commissives:
      • which commit the speaker to doing something, e.g., ‘I promise to...’
   • expressives:
      • whereby a speaker expresses a mental state, e.g., ‘thank you!’
   • declarations:
      • such as declaring war or naming.
6 Copyright: Binh Minh Nguyen, 2019 7 Copyright: Binh Minh Nguyen, 2019


Speech Acts: Searle
• There is some debate about whether this (or any!) typology of speech acts is appropriate.
• In general, a speech act can be seen to have two components:
   • a performative verb:
      • (e.g., request, inform, . . . )
   • propositional content:
      • (e.g., “the door is closed”)
• For example, with the propositional content “the door is closed”:
   • request: speech act = “please close the door”
   • inform: speech act = “the door is closed!”
   • inquire: speech act = “is the door closed?”
   • Each of the above speech acts results from the same propositional content, but with a different performative.

Plan Based Semantics
• How does one define the semantics of speech acts?
   • When can one say someone has uttered, e.g., a request or an inform?
• Cohen & Perrault (1979) defined the semantics of speech acts using the precondition-delete-add list formalism of planning research.
   • Just like the STRIPS planner (a sketch of this treatment follows the slide).
• Note that a speaker cannot (generally) force a hearer to accept some desired mental state.

The semantics for “request”: request(s, h, φ)
• precondition:
   • s believes h can do φ (you don’t ask someone to do something unless you think they can do it)
   • s believes h believes h can do φ (you don’t ask someone unless they believe they can do it)
   • s believes s wants φ (you don’t ask someone unless you want it!)
• post-condition:
   • h believes s believes s wants φ (the effect is to make them aware of your desire)
8 Copyright: Binh Minh Nguyen, 2019 9 Copyright: Binh Minh Nguyen, 2019
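Cohen and Perrault’s precondition/effect treatment of request can be rendered as a simple check over belief sets. This is an illustrative Python sketch of the rule shown above, not their formalism; representing beliefs as plain strings is an assumption made for brevity.

# STRIPS-like treatment of request(s, h, phi): check preconditions on the
# speaker's beliefs, then apply the "add" effect to the hearer's beliefs.
def request(speaker_beliefs, hearer_beliefs, phi):
    preconditions = {
        f"h can do {phi}",             # s believes h can do phi
        f"h believes h can do {phi}",  # s believes h believes h can do phi
        f"s wants {phi}",              # s believes s wants phi
    }
    if not preconditions <= speaker_beliefs:
        return False                   # preconditions not met; act not applicable
    # post-condition ("add" effect): h becomes aware of s's desire
    hearer_beliefs.add(f"s believes s wants {phi}")
    return True

s_beliefs = {"h can do close(door)", "h believes h can do close(door)", "s wants close(door)"}
h_beliefs = set()
print(request(s_beliefs, h_beliefs, "close(door)"), h_beliefs)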


KQML and KIF
• We now consider agent communication languages (ACLs).
   • ACLs are standard formats for the exchange of messages.
   • “… [developing] protocols for the exchange of represented knowledge between autonomous information systems…” - Tim Finin, 1993
• One well known ACL is KQML, developed by the DARPA-funded Knowledge Sharing Effort (KSE).
• The ACL proposed by the KSE was comprised of two parts:
   • the message itself: the Knowledge Query and Manipulation Language (KQML); and
   • the body of the message: the Knowledge Interchange Format (KIF).

KQML and KIF
• KQML is an ‘outer’ language, that defines various acceptable ‘communicative verbs’, or performatives.
• Example performatives:
   • ask-if (‘is it true that. . . ’)
   • perform (‘please perform the following action. . . ’)
   • tell (‘it is true that. . . ’)
   • reply (‘the answer is . . . ’)
• KIF is a language for expressing message content, or domain knowledge.
   • It can be used to write down ontologies.
   • KIF is based on first-order logic.
10 Copyright: Binh Minh Nguyen, 2019 11 Copyright: Binh Minh Nguyen, 2019


KQML & Ontologies
• In order to be able to communicate, agents need to agree on the words (terms) they use to describe a domain.
   • This is always a problem where multiple languages are concerned.
• A formal specification of a set of terms is known as an ontology.
• The DARPA Knowledge Sharing Effort project has associated with it a large effort at defining common ontologies.
   • Software tools like Ontolingua, etc., exist for this purpose.
• We’ve previously discussed the use of ontologies and semantics…

Blocksworld
• The environment is represented by an ontology.
12 Copyright: Binh Minh Nguyen, 2019 13 Copyright: Binh Minh Nguyen, 2019


Ontologies
• The role of an ontology is to fix the meaning of the terms used by agents.
   • “… An ontology is a formal definition of a body of knowledge. The most typical type of ontology used in building agents involves a structural component. Essentially a taxonomy of class and subclass relations coupled with definitions of the relationships between these things …” - Jim Hendler
• How do we do this? Typically by defining new terms in terms of old ones.
• Let’s consider an example.

Ontologies
• Alice:
   • Did you read “Prey”?
• Bob:
   • No, what is it?
• Alice:
   • A science fiction novel. Well, it is also a bit of a horror novel. It is about multiagent systems going haywire.
14 Copyright: Binh Minh Nguyen, 2019 15 Copyright: Binh Minh Nguyen, 2019


Ontologies
• What is being conveyed about “Prey” here?
   1. It is a novel
   2. It is a science fiction novel
   3. It is a horror novel
   4. It is about multiagent systems
• Alice assumes that Bob knows what a “novel” is, what “science fiction” is and what “horror” is.
• She thus defines a new term, “Prey”, in terms of ones that Bob already knows.
• Types of objects:
   • Classes: collections of things with similar properties.
   • Instances: specific examples of classes.
   • Relations: describe the properties of objects and connect them together.
   • Note that we also have these types of objects in languages such as Java, or modelling frameworks such as ER Diagrams. Such languages and frameworks also support inheritance.

Ontologies
• Part of the reason this interaction works is that Bob has knowledge that is relevant.
   • Bob knows that novels are fiction books:
      • “novel” is a subclass of “fiction book”.
   • Bob knows things about novels: they have
      • authors,
      • publishers,
      • publication dates, and so on.
• Because “Prey” is a novel, it inherits the properties of novels. It has an author, a publisher, a publication date.
   • Instances inherit attributes from their classes.
16 Copyright: Binh Minh Nguyen, 2019 17 Copyright: Binh Minh Nguyen, 2019


Ontology Inheritance
• Classes also inherit.
   • Classes inherit attributes from their super-classes.
   • If “novel” is a subclass of “fiction book”, then “fiction book” is a superclass of “novel”.
• Fiction books are books.
• Books are sold in bookstores.
• Thus fiction books are sold in bookstores.

Ontologies
• A lot of knowledge can be captured using these notions (a small sketch follows the slide).
   • We specify which class “is-a” sub-class of which other class.
   • We specify which classes have which attributes.
• An axiomatic theory can also be included to support inference:
   • if socrates is [an instance of a] man, and all men are mortal…
   • … we can infer that socrates is mortal!
• This structure over knowledge is called an ontology.
   • A knowledge base is an ontology with a set of instances.
• A huge number of ontologies have been constructed.
18 Copyright: Binh Minh Nguyen, 2019 19 Copyright: Binh Minh Nguyen, 2019
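The class/instance/inheritance picture above is essentially what a small frame system provides. A minimal Python sketch (illustrative data structures, not OWL or any particular tool): classes inherit attributes from their super-classes, and instances inherit from their classes.

# A tiny ontology: classes with super-classes and attributes, plus instances.
classes = {
    "book":         {"super": None,           "attrs": {"sold_in": "bookstores"}},
    "fiction book": {"super": "book",         "attrs": {"fictional": True}},
    "novel":        {"super": "fiction book", "attrs": {"has": ["author", "publisher", "publication date"]}},
}
instances = {"Prey": {"is_a": "novel", "attrs": {"author": "Michael Crichton"}}}

def attributes_of(cls):
    """Collect attributes of a class and all of its super-classes."""
    attrs = {}
    while cls is not None:
        attrs = {**classes[cls]["attrs"], **attrs}   # subclass values take precedence
        cls = classes[cls]["super"]
    return attrs

def describe(instance):
    inst = instances[instance]
    return {**attributes_of(inst["is_a"]), **inst["attrs"]}

print(describe("Prey"))
# "Prey" inherits sold_in='bookstores' and the general properties of novels,
# alongside its own author attribute.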


An ontology of threats
[Figure: an example application ontology of threats.]

Ontologies
• In general there are multiple ontologies at different levels of detail:
   • Application ontology
      • Like the threat ontology (see opposite).
   • Domain ontology
   • Upper ontology
      • Contains very general information about the world.
• The more specific an ontology, the less reusable it is.
20 Copyright: Binh Minh Nguyen, 2019 21 Copyright: Binh Minh Nguyen, 2019


Multiple Ontologies
• Application and domain ontologies will typically overlap.
   • This is illustrated by the challenges of facilitating interoperability between similar ontologies.
   • Ontologies can be seen as perspectives on a domain: a single domain may have an intended representation in the real world that is not perfectly represented by any single formal ontology. Many separate ontologies then emerge, based on different contexts…
• Different knowledge systems can be integrated to form merged knowledge bases.
   • The Ontology Alignment Evaluation Initiative provides a test suite of similar ontologies used to test out alignment systems that link different ontologies representing the same domain. The conference test suite consists of 21 ontology pairs.
• But in many cases, all that is needed is to be understood! (Pat Hayes, 2001, in conversation)

Modelling and Context
• The problem with modelling ontologies is that different designers have different contexts, and requirements.
   • The Architect: when modelling a bridge, important characteristics include tensile strength, weight, load, etc.
   • The Military: when modelling a bridge, important characteristics include what munitions are required to destroy it!

22 Copyright: Binh Minh Nguyen, 2019 23 Copyright: Binh Minh Nguyen, 2019


Ontologies, Alignments and Correspondences
• Different ontological models exist for overlapping domains.
   • These are modelled implicitly, or explicitly by defining entities (classes, roles etc), typically using some logical theory, i.e. an Ontology.
• Alignment Systems align similar ontologies.
[Figure: two ontologies linked by an alignment, which is a set of correspondences between their entities.]

Aligning Agents’ Ontologies
• Agent alignment is inherently decentralised!

24 Copyright: Binh Minh Nguyen, 2019 25 Copyright: Binh Minh Nguyen, 2019


Correspondence Inclusion Dialogue
• Correspondence Inclusion Dialogue (CID)
   • Allows two agents to exchange knowledge about correspondences to agree upon a mutually acceptable final alignment AL.
   • This alignment aligns only those entities in each agents’ working ontologies, without disclosing the ontologies, or all of the known correspondences.
• Assumptions:
   1. Each agent knows about different correspondences from different sources.
   2. This knowledge is partial, and possibly ambiguous; i.e. more than one correspondence exists for a given entity.
   3. Agents associate a utility (Degree of Belief) κc with each unique correspondence.
[Figure: the dialogue protocol between Alice and Bob, with moves join, assert, object, reject, accept, matched-close and endassert.]

Picking the right correspondences
• Quality vs Quantity
   • Do we maximise coverage?
      • Preferable when merging the whole ontology.
   • Do we find the “best” mappings?
      • Preferable when aligning specific signatures.
[Figure: example correspondences ⟨article,draft,≣⟩, ⟨publication,paper,≣⟩ and ⟨article,paper,≣⟩ between the entities article, draft, publication and paper, each with an associated degree of belief (0.5, 0.7, 0.6).]
26 Copyright: Binh Minh Nguyen, 2019 27 Copyright: Binh Minh Nguyen, 2019


OWL - Web Ontology Language
• A general purpose family of ontology languages for describing knowledge.
• Originated from the DARPA Agent Markup Language Program.
   • A follow-up to DARPA Control of Agent Based Systems (CoABS).
• Based on description logics.
• Various flavours with different expressivity / computability.
• Different syntaxes: XML, Turtle, Manchester Syntax…
• Underpins the Semantic Web.

OWL Example
<NS1:geographicCoordinates rdf:nodeID='A179'/>
<NS1:mapReferences>North America</NS1:mapReferences>
<NS1:totalArea>9629091</NS1:totalArea>
<NS1:landArea>9158960</NS1:landArea>
<NS1:waterArea>470131</NS1:waterArea>
<NS1:comparativeArea>about half the size of Russia; about three-tenths the size of Africa; about half the size of South America (or slightly larger than Brazil); slightly larger than China; about two and a half times the size of Western Europe</NS1:comparativeArea>
<NS1:landBoundaries>12034</NS1:landBoundaries>
<NS1:coastline>19924</NS1:coastline>
<NS1:contiguousZone>24</NS1:contiguousZone>
<NS1:exclusiveEconomicZone>200</NS1:exclusiveEconomicZone>
<NS1:territorialSea>12</NS1:territorialSea>
<NS1:climate>mostly temperate, but tropical in Hawaii and Florida, arctic in Alaska, semiarid in the great plains west of the Mississippi River, and arid in the Great Basin of the southwest; low winter temperatures in the northwest are ameliorated occasionally in January and February by warm chinook winds from the eastern slopes of the Rocky Mountains</NS1:climate>
<NS1:terrain>vast central plain, mountains in west, hills and low mountains in east; rugged mountains and broad river valleys in Alaska; rugged, volcanic topography in Hawaii</NS1:terrain>
28 Copyright: Binh Minh Nguyen, 2019 29 Copyright: Binh Minh Nguyen, 2019


OWL and Services
• OWL-S is an upper level service ontology developed to describe agent based (or semantic web) services.
   • The Profile is used for discovery:
      • inputs / outputs etc.
   • The Process Model provides a planning formalism.
   • The Grounding links to the syntactic messaging.

KQML / KIF
• After that digression, we can return to the KQML/KIF show.
• KQML is an agent communication language. It provides a set of performatives for communication.
• KIF is a language for representing domain knowledge.
   • It can be used to write down ontologies.
   • KIF is based on first-order logic.
• Given that, let’s look at some examples.
30 Copyright: Binh Minh Nguyen, 2019 31 Copyright: Binh Minh Nguyen, 2019


KQML/KIF Example
(stream-about
   :sender A
   :receiver B
   :language KIF
   :ontology motors
   :reply-with q1
   :content m1
)
A asks B for information about motor 1, using the ontology (represented in KIF) about motors.

(tell
   :sender B
   :receiver A
   :in-reply-to q1
   :content (= (torque m1) (scalar 12 kgf))
)
(tell
   :sender B
   :receiver A
   :in-reply-to q1
   :content (= (status m1) normal)
)
(eos
   :sender B
   :receiver A
   :in-reply-to q1
)
B responds to A’s query q1. Two facts are sent: 1) that the torque of motor 1 is 12 kgf; and 2) that the status of the motor is normal. The ask stream is terminated using the eos performative.

Problems with KQML
• The basic KQML performative set was fluid.
   • Different implementations were not interoperable.
• Transport mechanisms for messages were not precisely defined.
   • Again - interoperability.
• The semantics of KQML were not rigorously defined.
   • The resulting ambiguity impaired interoperability!
• There were no commissives in the language.
   • Without the ability to commit to a task, how could agents coordinate behaviour?
• The performative set was arguably ad hoc and overly large.

32 Copyright: Binh Minh Nguyen, 2019 33 Copyright: Binh Minh Nguyen, 2019


FIPA ACL
• More recently, the Foundation for Intelligent Physical Agents (FIPA) started work on a program of agent standards.
   • The centrepiece is an ACL.
• The basic structure is quite similar to KQML:
   • The number of performatives was reduced to 20.
   • A formal semantics has been defined for the language, using the language SL.
      • SL can represent beliefs, desires and uncertain beliefs, as well as actions.

FIPA ACL example:
(inform
   :sender agent1
   :receiver agent5
   :content (price good200 150)
   :language sl
   :ontology hpl-auction
)

Performatives
• The FIPA ACL performatives, grouped by purpose:
   • passing information: confirm, disconfirm, inform, inform-if, inform-ref
   • requesting information: query-if, query-ref, subscribe
   • negotiation: accept-proposal, cfp, propose, reject-proposal
   • performing actions: agree, cancel, refuse, request, request-when, request-whenever
   • error handling: failure, not-understood
34 Copyright: Binh Minh Nguyen, 2019 35 Copyright: Binh Minh Nguyen, 2019


“Inform” and “Request”
• “Inform” and “Request” are the two basic performatives in FIPA. Others are macro definitions, defined in terms of these.
• The meaning of inform and request is defined in two parts:
   • pre-condition: what must be true in order for the speech act to succeed.
   • “rational effect”: what the sender of the message hopes to bring about.
• FIPA “Inform” Performative: the content is a statement. The pre-condition is that the sender:
   • holds that the content is true;
   • intends that the recipient believe the content;
   • does not already believe that the recipient is aware of whether the content is true or not.
   • The speaker only has to believe that what he says is true.
• FIPA “Request” Performative: the content is an action. The pre-condition is that the sender:
   • intends the action content to be performed;
   • believes the recipient is capable of performing this action;
   • does not believe that the recipient already intends to perform the action.
   • The last of these conditions captures the fact that you don’t speak if you don’t need to.

Communication in AgentSpeak
• AgentSpeak agents communicate using a simpler structure than KQML/ACL.
• Messages received typically have the form ⟨sender, performative, content⟩:
   • sender: the AgentSpeak term corresponding to the agent that sent the message,
      • i.e. an agentID.
   • performative: this represents the goal the sender intends to achieve by sending the message,
      • e.g. tell, achieve, askOne, tellHow etc.
   • content: an AgentSpeak formula or message body,
      • which varies depending on the performative.
36 Copyright: Binh Minh Nguyen, 2019 37 Copyright: Binh Minh Nguyen, 2019


Messages in Jason
• Messages are passed through the use of internal actions that are pre-defined in Jason.
• The most typically used are:
   .send(receiver, performative, content)
   .broadcast(performative, content)
   • where receiver, performative and content relate to the elements in the message.
• The .send action sends messages to specific agents.
   • The receiver can be a single agentID, or a list of agentIDs.
• The .broadcast action sends the message to all agents registered in the system.

Handling messages in Jason
• In the internal Jason architecture:
   • Messages are delivered into the agent’s “mailbox”.
      • This is done automatically by the customisable checkMail method,
      • which passes them on to the AgentSpeak interpreter.
   • One message is processed during each reasoning cycle.
      • A customisable message selection function (SM) selects the next message to process.
   • A selection process (SocAcc) determines if the message should be rejected.
      • For example, ignoring messages from a certain agent.
      • Think of this as a spam filter.
   • If the message goes through, Jason will interpret it according to precise semantics,
      • by generating new events pertaining to the goal and belief bases, and in turn, triggering plans.
38 Copyright: Binh Minh Nguyen, 2019 39 Copyright: Binh Minh Nguyen, 2019


Performatives in Jason
• Sharing Beliefs (Information Exchange)
   • tell and untell
      • The sender intends the receiver (not) to believe the literal in the content to be true, and that the sender believes it.
• Delegate an Achievement Goal (Goal Delegation)
   • achieve and unachieve
      • The sender requests the receiver (not) to try and achieve a state-of-affairs where the content of the message is true.
• Sharing Plans (Deliberation)
   • tellHow and untellHow
      • The sender requests the receiver (not) to include within their plan library the plan in the message content.
   • askHow
      • The sender wants to know the receiver’s applicable plans for the triggering event in the message content.
• Delegate a Test Goal (Information Seeking)
   • askOne and askAll
      • The sender wants to know whether the receiver knows (askOne) if the content is true (i.e. is there a single answer) or for all answers (askAll).

Semantics of tell / untell (Information Exchange)

Cycle # | sender (s) actions                 | recipient (r) belief base     | recipient (r) events
1       | .send(r, tell, open(left_door))    |                               |
2       |                                    | open(left_door)[source(s)]    | ⟨+open(left_door)[source(s)], ⊤⟩
3       | .send(r, untell, open(left_door))  |                               |
4       |                                    |                               | ⟨-open(left_door)[source(s)], ⊤⟩

• Note that events are represented internally as a tuple: ⟨event, intention⟩.
   • This associates an event with the intention that generated it.
   • With communication, there is no intention responsible for the event.
      • Thus, we indicate this with the ⊤ symbol.
40 Copyright: Binh Minh Nguyen, 2019 41 Copyright: Binh Minh Nguyen, 2019


Semantics of achieve / unachieve (Goal Delegation)

Cycle # | sender (s) actions                    | recipient (r) intentions            | recipient (r) events
1       | .send(r, achieve, open(left_door))    |                                     |
2       |                                       |                                     | ⟨+!open(left_door)[source(s)], ⊤⟩
3       |                                       | !open(left_door)[source(s)]         |
3       | .send(r, unachieve, open(left_door))  | !open(left_door)[source(s)]         |
4       |                                       | <<< intention has been removed >>>  |

• Note that the intention is adopted after the goal is added.
• With unachieve, the internal action .drop_desire(open(left_door)) is executed.

Semantics of askOne / askAll (Information Seeking)

Cycle # | sender (s) actions             | recipient (r) actions                                 | sender (s) events
1       | .send(r, askOne, open(Door))   |                                                       |
2       |                                | .send(s, tell, open(left_door))                       |
3       |                                |                                                       | ⟨+open(left_door)[source(r)], ⊤⟩
4       | .send(r, askAll, open(Door))   |                                                       |
5       |                                | .send(s, tell, [open(left_door), open(right_door)])   |
6       |                                |                                                       | ⟨+open(left_door)[source(r)], ⊤⟩ ⟨+open(right_door)[source(r)], ⊤⟩

• Here r’s belief base contains: open(left_door), open(right_door).
42 Copyright: Binh Minh Nguyen, 2019 43 Copyright: Binh Minh Nguyen, 2019


Semantics of Deliberation
• .send(receiver, tellHow, “@p ... : ... <- ...”)
   • adds the plan to the plan library of the receiver, with its plan label @p, e.g.:
      .send(r, tellHow, “@pOD +!open(Door) : not locked(Door) <- turn_handle(Door); push(Door); ?open(Door).”)
• .send(receiver, untellHow, @p)
   • removes the plan with the plan label @p from the plan library of the receiver, e.g.:
      .send(r, untellHow, “@pOD”)
• .send(receiver, askHow, Goal-addition-event)
   • requires the receiver to pass back all plans relevant to the triggering event in the content, e.g.:
      .send(r, askHow, “+!open(Door)”)

Handling performatives
• Jason implements plans for each of the performatives.
   • This is more elegant than hard coding within the interpreter.
   • It allows the agent developer to introduce new performatives when necessary.
   • Existing performatives can be overridden.
• The goal !kqml_received is created whenever a message is received.
• Predefined plans can be found in the Jason distribution in src/asl/kqmlPlans.asl:

/* ---- achieve performatives ---- */
@kqmlReceivedAchieve
+!kqml_received(KQML_Sender_Var, achieve, KQML_Content_Var, KQML_MsgId)
   <- .add_annot(KQML_Content_Var, source(KQML_Sender_Var), CA);
      !!CA.
44 Copyright: Binh Minh Nguyen, 2019 45 Copyright: Binh Minh Nguyen, 2019


Creating performatives
• Defining the tell_rule performative:
   • This simple example illustrates how a new performative for sharing rules could be written.
   • It is based on the example code tell_rule in the Jason distribution.
   • Two agents are defined:
      • receiver, which implements the plan for the new performative tellRule;
      • sender, which sends two messages using tellRule.
   • No environment is used in this multi-agent system (MAS).

// tell_rule.mas2j
/* This example shows how to customise the KQML to add a new performative,
   identified by "tellRule", used by one agent to send rules like "a :- b & c"
   to another agent. */
MAS tell_rule {
   infrastructure: Centralised
   agents:
      receiver;
      sender;
   aslSourcePath: "src/asl";
}

Creating performatives
// Agent sender in project tell_rule
/* Initial goals */
!start.

/* Plans */
+!start : true
   <- // ask the receiver to achieve the goal test
      .send(receiver, achieve, test);
      // send a list with a single rule
      .send(receiver, tellRule, [{ a :- b & c }]);
      // ask the receiver to achieve the goal test
      .send(receiver, achieve, test).

// Agent receiver in project tell_rule
/* Initial beliefs */
b.
c.

/* Plans */
+!test : a <- .print("Yes, a is true").
+!test <- .print("Don't know if a is true").

// customisation of KQML performative tellRule
+!kqml_received(A, tellRule, Rules, _)
   <- .print("Received rule(s) ", Rules, " from ", A);
      for ( .member(R, Rules) ) {
         +R[source(A)];
      }
      // get all rules and print them
      .relevant_rules(_, LR);
      .print("Rules: ", LR).

46 Copyright: Binh Minh Nguyen, 2019 47 Copyright: Binh Minh Nguyen, 2019


Jade Jade
• The FIPA ACL provides a language for writing • In JADE, agents are Java threads running in a “container”.
messages down.
• It says nothing about how they are passed between agents.

• Several software platforms have been


developed to support ACL-based
communication.
• One of the most widely used is JADE.

• Provides transparent (from the perspective of • All containers register with the main container
the agent designer) transport of ACL messages
48 Copyright: Binh Minh Nguyen, 2019 49 Copyright: Binh Minh Nguyen, 2019


JADE Main Container Alternative Semantics


• The main container does the following: • There is a problem with the “mental state” semantics that have
• Maintains the container table which lists all the containers and their been proposed for the FIPA ACL.
contact information. • This also holds for KQML.
• Maintains a list of all the agents in the system (including location and status).
• Hosts the agent management system (AMS) which names agents as well • How do we know if an agent’s locutions conform to the
as creating and destroying them. specification?
• Hosts the directory facilitator which provides a yellow pages allowing • As Wooldridge pointed out, since the semantics are in terms of an agent’s
agents to be identified by the services they provide. internal state, we cannot verify compliance with the semantics laid down by
FIPA.
• In practice, this means that we cannot be sure that an agent is being sincere.
• See http://jade.tilab.com/ for more details. • Or, more importantly, we cannot detect if it is being insincere.

50 Copyright: Binh Minh Nguyen, 2019 51 Copyright: Binh Minh Nguyen, 2019


Alternative Semantics Summary


• Singh suggested a way to deal with this. • This lecture has discussed some
Class Reading (Chapter 7):

• Rather than define the conditions on a locution in terms of an agent’s mental state, base it on
something external to the agent. aspects of agent ontologies and Agent Communication Languages:
Rethinking the Principles”, Munindar P.
Singh. IEEE Computer: 1998, pp40-49.
communication between agents.
• Move from a “mentalistic” semantics to a social semantics. • It has focussed on the interpretation of locutions/ This is an overview of the state of the
• How? performatives as speech acts, and some art in agent communication (as of
1998), and an introduction to the key
suggestions for what performatives one might use. challenges, particularly with respect to

• Take an agent’s utterances as commitments. • Examples of communication were also given in AgentSpeak /
the semantics of agent
communication.
Jason (Chapter 6 of Bordini et al.)
• But what does it mean to say that “if an agent utters an inform then it is committing to the truth
of the proposition that is the subject of the utterance”? • There is much more to communication than this. . .
• . . . but this kind of thing is required as a “transport layer” to
• Doesn’t stop an agent lying, but it allows you to detect when it does. support the kinds of thing we will talk about later.

52 Copyright: Binh Minh Nguyen, 2019 53 Copyright: Binh Minh Nguyen, 2019


IT4899 Multi-Agent Systems
Chapter 8 - Working Together
Dr. Nguyen Binh Minh
Department of Information Systems

Working Together
• Why and how do agents work together?
• Since agents are autonomous, they have to make decisions at run-time, and be capable of dynamic coordination
• Overall they will need to be able to share:
  • Tasks
  • Information
• If agents are designed by different individuals, they may not have common goals
• Important to make a distinction between:
  • benevolent agents and
  • self-interested agents
1 2 Copyright: Nguyen Binh Minh, Spring 2019.


Agent Motivations Cooperative Distributed Problem Solving


• Benevolent Agents • Self Interested Agents
• If we “own” the whole system, we can • If agents represent the interests of
design agents to help each other individuals or organisations, (the more “... CDPS studies how a loosely coupled network of problem solvers can work
whenever asked general case), then we cannot make together to solve problems that are beyond their individual capabilities. Each
problem solving node in the network is capable of sophisticated problem-solving,
• In this case, we can assume agents are the benevolence assumption
and can work independently, but the problems faced by the nodes cannot be
benevolent: our best interest is their • Agents will be assumed to act to completed without cooperation. Cooperation is necessary because no single node
best interest further their own interests, possibly
at the expense of others. different nodes might have expertise solving different parts of the problem….”
• Problem-solving in benevolent systems
• Potential for conflict (Durfee et al., 1989).
is Cooperative Distributed Problem
Solving (CDPS) • May complicate the design task enormously.

• Benevolence simplifies the system design task • Strategic behaviour may be required —
enormously! we will cover some of these aspects in
• We will talk about CDPS in this lecture later lectures

3 Copyright: Nguyen Binh Minh, Spring 2019. 4 Copyright: Nguyen Binh Minh, Spring 2019.


Coherence and Coordination Task Sharing and Result Sharing


• How does a group of agents work together to solve
problems?
• Coherence: “... how well the [multiagent] system
Decomposition

• We can measure coherence in terms of solution


behaves as a unit along some
dimension of evaluation...”
• CDPS addresses the following:
quality, how efficiently resources are used, (Bond and Gasser, 1988).
• Problem decomposition
• How can a problem be divided into smaller tasks for distribution amongst agents?
conceptual clarity and so on.
• Sub-problem solution Solution
• How can the overall problem-solving activities of the agents be optimised so as to

• Coordination:
produce a solution that maximises the coherence metric?

“... the degree. . . to which [the • What techniques can be used to coordinate the activity of the agents, thus avoiding
destructive interactions?
agents]. . . can avoid ‘extraneous’
• If the system is perfectly coordinated, agents will activity [such as] . . . synchronizing • Answer synthesis
not get in each other’s way, in a physical or a Synthesis
and aligning their activities...” • How can a problem solution be effectively synthesised from subproblem results?

metaphorical sense. (Bond and Gasser, 1988).


• Let’s look at these in more detail.
5 Copyright: Nguyen Binh Minh, Spring 2019. 6 Copyright: Nguyen Binh Minh, Spring 2019.


Problem Decomposition Sub-problem Solution


• The overall problem is divided • Clearly there is some processing
into smaller sub-problems. to do the division. • The sub-problems derived in the
• This is typically a recursive/hierarchical • How this is done is one design choice.
previous stage are solved.
process.
• Subproblems get divided up also. • Another choice is who does the • Agents typically share some information during
• The granularity of the subproblems is this process.
division.
important
• Is it centralised?
• At one extreme, the problem is decomposed to
atomic actions • Which agents have knowledge of task • A given step may involve two agents
• In ACTORS, this is done until we are at the level of
individual program instructions.
structure? synchronising their actions.
• Who is going to solve the sub-
problems? • eg. box pushing

7 Copyright: Nguyen Binh Minh, Spring 2019. 8 Copyright: Nguyen Binh Minh, Spring 2019.


Solution Synthesis Solution Synthesis


• Given this model of cooperative problem Task Sharing
solving, we have two activities that are likely Task 1
to be present:
• In this stage solutions to sub- • task sharing:
problems are integrated. • components of a task are distributed to component agents;
Task 1.1 Task 1.2 Task 1.3

• Again this may be hierarchical • how do we decide how to allocate tasks to agents?
• result sharing: Result Sharing
• information (partial results etc) is distributed.
• Different solutions at different levels of • how do we assemble a complete solution from the parts?

abstraction.
• An agent may well need a solution to both
these problems in order to be able to function
in a CDPS environment.
9 Copyright: Nguyen Binh Minh, Spring 2019. 10 Copyright: Nguyen Binh Minh, Spring 2019.


Task Sharing & the Contract Net Recognition


• A well-known task-sharing protocol for task • In this stage, an agent recognises it has a
allocation is the contract net. problem it wants help with.
Recognition

• The contract net includes five stages:


1. Recognition; • Agent has a goal, and either. . . I have a problem

2. Announcement; • realises it cannot achieve the goal in isolation


3. Bidding; • i.e. it does not have capability;
4. Awarding; • realises it would prefer not to achieve the goal in
5. Expediting. isolation (typically because of solution quality, deadline,
etc)
• The textbook describes these stages in
procedural terms from the perspective of an • As a result, it needs to involve other
individual agent. agents.
11 Copyright: Nguyen Binh Minh, Spring 2019. 12 Copyright: Nguyen Binh Minh, Spring 2019.


Announcement Bidding
• In this stage, the agent with the task sends • Agents that receive the announcement
out an announcement of the task which decide for themselves whether they wish to
Announcement Bidding
includes a specification of the task to be bid for the task.
achieved.
• Factors:
• Specification must encode: • agent must decide whether it is capable of expediting
• description of task itself (maybe executable) task;
• any constraints (e.g., deadlines, quality constraints) • agent must determine quality constraints & price
information (if relevant).
• meta-task information (e.g., “ . . . bids must be submitted
by . . . ”)
• If they do choose to bid, then they submit
• The announcement is then broadcast. a tender.
13 Copyright: Nguyen Binh Minh, Spring 2019. 14 Copyright: Nguyen Binh Minh, Spring 2019.


Awarding & Expediting The Contract Net via FIPA Performatives


• Agent that sent task announcement • The FIPA ACL was designed to be
must choose between bids & decide Awarding and able to capture the contract net.
who to “award the contract” to. Expediting
• cfp (call for proposals):
• The result of this process is communicated to • Used for announcing a task;
agents that submitted a bid.
• propose, refuse:
• The successful contractor then expedites the task.
• Used for making a proposal, or declining to make a proposal.
• accept, reject:
• May involve generating further manager- • Used to indicate acceptance or rejection of a proposal.
contractor relationships: • inform, failure:
• sub-contracting. • Used to indicate completion of a task (with the result) or
• May involve another contract net. failure to do so.
15 Copyright: Nguyen Binh Minh, Spring 2019. 16 Copyright: Nguyen Binh Minh, Spring 2019.


CNP in Jason: the MAS CNP in Jason: silentpartner


• The Contract Net Protocol • An agent that doesn’t
(CNP) in AgentSpeak / Jason respond
• Six Agents 1.MAS contractNetProtocol {
2. infrastructure: Centralised 3.
• Line 4: Initial belief that contractor is 1. // Agent silentpartner in project contractNetProtocol
2.
• One Contractor who initiates the CNP 4. the initiator. 3. / / the name of the agent playing initiator in the CNP
5. agents:
4. plays(initiator,contractor).
• Three agents that fully participate in the 6.
7.
contractor / / The CNP Initiator
[mindinspector="gui(cycle,html,history)"];
• Line 8: A belief that In is the agent contractor 5.
protocol 8. participant #3; / / The 3 service providers generates a message to In introducing the 6. / / send a message to the initiator introducing the
7. / / agent as a participant
9. refusenik; // Participant who always refuses agent.
• One agent that always refuses 10. silentpartner; // A Participant that doesn't answer
8.
9.
+plays(initiator,In)
: .my_name(Me)
• One agent that announces itself and then goes
11. • Using the internal action .my_name() 10. <- .send(In,tell,introduction(participant,Me)).
12. aslSourcePath:
silent “src/asl"; • A message is then sent to contractor 11.
12. / / Nothing else

13. }
• This example also illustrates the mind But at that point, nothing else is done
inspector • So, no response to any message
• A way to examine an agent’s beliefs etc.
17 Copyright: Nguyen Binh Minh, Spring 2019. 18 Copyright: Nguyen Binh Minh, Spring 2019.


CNP in Jason: refusenik CNP in Jason: participant


• An agent that says no • A participant agent 1. // Agent participant in project contractNetProtocol
2.

• Line 4: Initial belief that contractor is • Lines 8-14: introduce the agent to the initiator 3. / / gets the price for the product,
4. / / a random value between 100 and 110.
// Agent refusenik in project contractNetProtocol
the initiator.
1.
2. • Line 5: rule that generates a random price for 5. price(_Service,X) :- .random(R) & X = ( 10*R)+100.
6.
3. / / the name of the agent playing initiator in the CNP its service 7. / / the name of the agent playing initiator in the CNP
• Line 8: A belief that In is the agent contractor 4. plays(initiator,contractor). 8. plays(initiator,contractor).
generates a message to In introducing the 5. 9.
6. / / send a message to the initiator introducing the
agent. 7.
8.
/ / agent as a participant
• Bidding - Line 17: Plan @c1 10. // send a message to the initiator introducing the
11. // agent as a participant

+plays(initiator,In)
12. +plays(initiator,In)
Line 13: A CfP message from an 9. : .my_name(Me)
• On receipt of a cfp message from agent A 13. : .my_name(Me)
10. <- .send(In,tell,introduction(participant,Me)).
initiator agent will generate a refuse 11. (line 17)
14.
15.
<- .send(In,tell,introduction(participant,Me)).
12. / / plan to answer a CFP
message 13. +cfp(CNPId,_Service)[source(A)] • Where A is the initiator, and where the agent can 16. / / answer to Call For Proposal
generate a price for the requested task 17.@c1 +cfp(CNPId,Task)[source(A)]
14. : plays(initiator,A)
18.: plays(initiator,A) & price(Task,Offer)
15. <- .send(A,tell,refuse(CNPId)).
• The agent keeps a mental note of its 19. <- / / remember my proposal
20. +proposal(CNPId,Task,Offer);
proposal (line 19) 21. .send(A,tell,propose(CNPId,Offer)).
• Responds to CfP by making an offer (line 21)
19 Copyright: Nguyen Binh Minh, Spring 2019. 20 Copyright: Nguyen Binh Minh, Spring 2019.


CNP in Jason: participant CNP in Jason: contractor


• Expediting - Line 25: Plan @r1 • The contractor agent
• Handling Accept messages 1. // Agent participant in project contractNetProtocol • The rule all_proposals checks that 1. // Agent contractor in project contractNetProtocol
• The agent responds to the addition of the belief 2. …
accept_proposal() the number of the proposals received 2.
3. // Initial beliefs and rules
23. / / Handling an Accept message
• The agent prints a success message for the contract, 24. @r 1 +accept_proposal(CNPId) is equal to the number of 4. all_proposals_received(CNPId)
5.
: proposal(CNPId,Task,Offer) :- .count(introduction(participant,_),NP) &
by retrieving the belief regarding the proposal 25.
26. <- .print("My proposal ‘“, Offer,"' won CNP “, introductions 6. / / number of participants
7.
• Note that there is nothing here to actually do the task. 27. CNPId, " for “, Task, “! "). • The predicate will only be true for this equality 8.
.count(propose(CNPId,_)[source(_)], NO) &
28. / / do the task and report to initiator / / number of proposes received
9.
29. • Note in the default run of the system with the
10.
.count(refuse(CNPId)[source(_)], NR) &
/ / number of refusals received
30. / / Handling a Reject message silentpartner agent, this predicate will never be
• Line 32: Plan @r2
11. NP = NO + NR.
31. @r2 +reject_proposal(CNPId) true!!! 12.
32. <- .print("I lost CNP ",CNPId, ".");
13. // Initial goals
33. / / clear memory
• The initial achievement goal, !startCNP(), is
• Handling Reject messages 34. -proposal(CNPId,_,_).
created:
14.! startCNP(1,fix(computer)).
15.
• The agent responds to the addition of the belief • with an Id of 1, and the task fix(computer)
16. / /! startCNP(2,banana).
accept_proposal()
• The agent prints a failure message and deletes the
proposal from memory.
21 Copyright: Nguyen Binh Minh, Spring 2019. 22 Copyright: Nguyen Binh Minh, Spring 2019.


Checking beliefs using the mind inspector CNP in Jason: contractor


• CNP Announcement
• The mind inspector can be used to • Plan for +!startCNP() 1. / / A gent contractor in project contractNetProtocol
check the internal state of the • Line 21: wait for participants to introduce 2. …
themselves 18. / / start the CNP
contract agent • Line 23: track the current state of the protocol 19. +!startCNP(Id,Task)
20. <- .print("Waiting participants for task ",Task,"...");
• In the example opposite: 21.
22.
.wait(2000); // wait participants introduction
// remember the state of the CNP
• the number of introductions (4) is equal to the number of 23. +cnp_state(Id,propose);
proposals (3) and the number of refusals (1) 24. .findall(Name,introduction(participant,Name),LP);
25. .print("Sending CFP to ",LP);
• Note that in this run, we removed the agent 26.
27.
.send(LP,tell,cfp(Id,Task));
// the deadline of the CNP is now + 4 seconds
silentpartner from the agent community 28. // (or all proposals were received)
29. .wait(all_proposals_received(CNPId), 4000, _);
• the other slides in this set assume that this agent does 30. !contract(Id).
participate!!!

23 Copyright: Nguyen Binh Minh, Spring 2019. 24 Copyright: Nguyen Binh Minh, Spring 2019.


CNP in Jason: contractor CNP in Jason: contractor


• CNP Announcement
• Plan for +!startCNP() 1. / / A gent contractor in project contractNetProtocol 1. / / A gent contractor in project contractNetProtocol
• Line 21: wait for participants to introduce 2. … 2. …
themselves 18. / / start the CNP 18. / / start the CNP
• Line 23: track the current state of the protocol 19. +!startCNP(Id,Task)
20. <- .print("Waiting participants for task ",Task,"...");
19. +!startCNP(Id,Task)
20. <- .print("Waiting participants for task ",Task,"...");
• Line 24: get a list of the agents that introduced 21.
22.
.wait(2000); // wait participants introduction
// remember the state of the CNP
21. .wait(2000); // wait participants introduction
themselves 22. // remember the state of the CNP
23. +cnp_state(Id,propose); 23. +cnp_state(Id,propose);
• Find all beliefs for the predicate introduction and 24. .findall(Name,introduction(participant,Name),LP); 24. .findall(Name,introduction(participant,Name),LP);
unify the variable Name for each 25. .print("Sending CFP to ",LP); 25. .print("Sending CFP to ",LP);
26. .send(LP,tell,cfp(Id,Task)); 26. .send(LP,tell,cfp(Id,Task));
• Construct a list LP of all of the unified values of 27. // the deadline of the CNP is now + 4 seconds 27. // the deadline of the CNP is now + 4 seconds
Name 28. // (or all proposals were received) 28. // (or all proposals were received)
29. .wait(all_proposals_received(CNPId), 4000, _); 29. .wait(all_proposals_received(CNPId), 4000, _);
• Line 26: Send cfp messages to each agent in 30. !contract(Id). 30. !contract(Id).
the list LP
• …
25 Copyright: Nguyen Binh Minh, Spring 2019. 26 Copyright: Nguyen Binh Minh, Spring 2019.


CNP in Jason: contractor CNP in Jason: contractor


• CNP Announcement • CNP Awarding
• Plan for +!startCNP() • Plan for @lc1 +!contract() 1.
2.
/ / A gent contractor in project contractNetProtocol

1. / / A gent contractor in project contractNetProtocol
• Trigger only if we are in the propose state for the
• … 2. …
contract CNPId 3 2 . / / this plan needs to be atomic so as not to accept
3 3 . / / proposals or refusals while contracting
• Line 26: Send cfp messages to each agent in 18. / / start the CNP
• Change the cnp_state to signify that we are 34. @lc1[atomic] +!contract(CNPId)
the list LP 19. +!startCNP(Id,Task)
20. <- .print("Waiting participants for task ",Task,"..."); awarding the contract (line 37) 35. : cnp_state(CNPId,propose)
36. <- -cnp_state(CNPId,_);
• Line 29: Wait until all of the proposals have 21.
22.
.wait(2000); // wait participants introduction
// remember the state of the CNP
• Lines 38-44: Create a list L of offer(O,A) predicates 37. +cnp_state(CNPId,contract);
been received, or we have a timeout of 4s and find the winner 38. .findall(offer(O,A),propose(CNPId,O)[source(A)],L);
23. +cnp_state(Id,propose);
39. .print("Offers are ",L);
• Note that the rule all_proposals_received() fails 24. .findall(Name,introduction(participant,Name),LP); • Find all of the predicates propose() for the contact Id
25. .print("Sending CFP to ",LP); from each agent A, and extract the offer O from each 40. // constrain the plan execution to at least one offer
when the agent slientpartner is in the MAS 41. L \== [];
26. .send(LP,tell,cfp(Id,Task));
• Ensure the list has at least one entry (line 41) 42. // sort offers, the first is the best
• However, we recover by waiting for 4 seconds 27. // the deadline of the CNP is now + 4 seconds
43. .min(L,offer(WOf,WAg));
28. // (or all proposals were received) • The winning offer is the one from L with the lowest offer
• Line 30: Create the achievement goal to award 29. .wait(all_proposals_received(CNPId), 4000, _); WOf
44.
45.
.print("Winner is ",WAg," with ",WOf);
!announce_result(CNPId,L,WAg);
the contract for Id 30. !contract(Id).
• Create the goal to announce the result (line 45) 46. -+cnp_state(CNPId,finished).

• Change the cnp_state to signify that we are finished


(line 46)
27 Copyright: Nguyen Binh Minh, Spring 2019. 28 Copyright: Nguyen Binh Minh, Spring 2019.


CNP in Jason: contractor CNP in Jason: contractor


• CNP Awarding • CNP Awarding 1. / / A gent contractor in project contractNetProtocol

• Alternate Plan for +!contract() • The awarding process is recursive 2. …

• The goal was created on line 45 of the plan @lc1 44. …


• An alternate plan exists if we are not in the right 1. / / A gent contractor in project contractNetProtocol
45. !announce_result(CNPId,L,WA g);
context; this does nothing (line 49) 2. … • If the head of the list L is the winner WAg, then the 46. …
plan on line 58 is satisfied
• Plan for -!contract()
39. …
40. / / constrain the plan execution to at least one offer • An accept_proposal belief is sent to the winner
54.
55.
/ / Terminate the recursion when we have no more
/ / agents participating in the CFP
41. L \== [ ] ;
• If we delete the goal contract() then we know 42. … • The goal announce_result is then called on the tail of 56. +!announce_result(_,[],_).
the list of agents L 57.
something failed, and thus a message is 58. / / announce to the winner
4 8 . / / nothing todo, the current phase is not 'propose'
generated
49.@lc2 +!contract(_).
• If the head of the list L is not the winner WAg, then 59. +!announce_result(CNPId,[offer(_,WA g)|T],WA g)
the plan on line 63 is satisfied 60. <- .send(WAg,tell,accept_proposal(CNPId));
• This can occur if there were no viable contracts 50.
51. -!contract(CNPId) 61. !announce_result(CNPId,T,WAg).
proposed (i.e. if the constraint on line 41 was • A reject_proposal belief is sent to the agent 62.
52. <- .print(“CNP ",CNPId," has failed!"). 63. / / announce to others
violated) • Again, the goal announce_result is then called on the 64. +!announce_result(CNPId,[offer(_,LA g)|T],WAg)
remaining agents (the tail of L) 65. <- .send(LAg,tell,reject_proposal(CNPId));

• To terminate the recursion 66. !announce_result(CNPId,T,WAg).

• Line 55 triggers with a call on an empty list


29 Copyright: Nguyen Binh Minh, Spring 2019. 30 Copyright: Nguyen Binh Minh, Spring 2019.


CNP in Jason Issues for Implementing Contract Net


• Here is the trace of the agents
• Note that each agent’s output is preceded by the agent name.
• How to…
• ... specify tasks?
• ... specify quality of service?
• ... decide how to bid?
• ... select between competing offers?
• … differentiate between offers based on multiple
criteria?

31 Copyright: Nguyen Binh Minh, Spring 2019. 32 Copyright: Nguyen Binh Minh, Spring 2019.


Deciding how to bid Deciding how to bid


• At some time t a contractor i is already scheduled to carry out a set of tasks τᵢᵗ.
• Contractor i also has resources eᵢ.
• Then i receives an announcement of task specification ts, which is for a set of tasks τ(ts).
• The cost to i to carry these out is cᵢ(τ(ts)).
• The marginal cost of carrying out τ(ts) will be:

      μᵢ(τ(ts) | τᵢᵗ) = cᵢ(τ(ts) ∪ τᵢᵗ) − cᵢ(τᵢᵗ)

  • that is, the difference between carrying out what it has already agreed to do and what it has already agreed plus the new tasks.

• Due to synergies, this is often not just cᵢ(τ(ts))
  • in fact, it can be zero — the additional tasks can be done for free.
  • Think of the cost of giving another person a ride to work.
• As long as μᵢ(τ(ts) | τᵢᵗ) < eᵢ, the agent can afford to do the new work, and so it is rational for the agent to bid for the work.
  • Otherwise not.
• You can extend the analysis to the case where the agent gets paid for completing a task.
  • And to considering the duration of tasks.
• A worked code sketch of this bidding rule follows below.
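
• A minimal Python sketch of this bidding rule (illustrative only — the cost function, task names and resource figure below are invented placeholders, not part of the contract net itself):

    # Illustrative sketch of the bidding rule (hypothetical cost model).

    def cost(tasks):
        """Placeholder cost function c_i: each task costs 10 units, with a
        20% discount per additional task to model synergies."""
        n = len(tasks)
        if n == 0:
            return 0.0
        return 10 * n * max(0.2, 1 - 0.2 * (n - 1))

    def marginal_cost(new_tasks, committed):
        """mu_i(tau(ts) | tau_i_t) = c_i(tau(ts) U tau_i_t) - c_i(tau_i_t)."""
        return cost(new_tasks | committed) - cost(committed)

    def should_bid(new_tasks, committed, resources):
        """Bid only if the marginal cost of the new work is within the resources e_i."""
        return marginal_cost(new_tasks, committed) < resources

    committed = frozenset({"deliver_a"})        # tau_i_t: what i already agreed to do
    announced = frozenset({"deliver_b"})        # tau(ts): the newly announced tasks
    print(marginal_cost(announced, committed))                # 6.0 with this toy cost model
    print(should_bid(announced, committed, resources=15.0))   # True, so bidding is rational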
33 Copyright: Nguyen Binh Minh, Spring 2019. 34 Copyright: Nguyen Binh Minh, Spring 2019.


Results Sharing Results Sharing in Blackboard Systems


• In results sharing, agents provide each other • The first scheme for cooperative problem
with information as they work towards a solving: was the blackboard system.
solution. • Results shared via shared data structure (BB).
• Multiple agents (KSs/KAs) can read and write to BB.
• It is generally accepted that results sharing • Agents write partial solutions to BB.
improves problem solving by:
• Independent pieces of a solution can be cross-checked.
• Blackboards may be structured as a hierarchy.
• Combining local views can achieve a better overall view.
• Mutual exclusion over BB required ⇒ bottleneck.
• Shared results can improve the accuracy of results.
• Sharing results allows the use of parallel resources on a
• Not concurrent activity.
problem.
• Compare:
• The following are examples of results sharing. • LINDA tuple spaces, JAVASPACES.
35 Copyright: Nguyen Binh Minh, Spring 2019. 36 Copyright: Nguyen Binh Minh, Spring 2019.


Result Sharing in Subscribe/Notify Pattern Handling Inconsistency


• Common design pattern in OO systems:
subscribe/notify. • A group of agents may have inconsistencies in their:
• An object subscribes to another object, saying “tell me • Beliefs
when event e happens”. • Goals or intentions
• When event e happens, original object is notified.

The Centibots robots collaborate to • Inconsistent beliefs arise because agents have different views
• Information pro-actively shared between map a space and find objects.
of the world.
objects.
• May be due to sensor faults or noise or just because they can’t see everything.
• Objects required to know about the interests
of other objects ⇒ inform objects when • Inconsistent goals may arise because agents are built by
different people with different objectives.
relevant information arises.
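
• A minimal sketch of the subscribe/notify pattern in Python (illustrative; the EventBus name and the map-update event are invented for the example, not taken from any particular agent platform):

    # Minimal subscribe/notify sketch: subscribers register interest in an event
    # name and are called back when that event is published.

    from collections import defaultdict

    class EventBus:
        def __init__(self):
            self._subscribers = defaultdict(list)   # event name -> list of callbacks

        def subscribe(self, event, callback):
            # "Tell me when event e happens."
            self._subscribers[event].append(callback)

        def publish(self, event, payload):
            # Pro-actively push the information to every interested object.
            for callback in self._subscribers[event]:
                callback(payload)

    bus = EventBus()
    bus.subscribe("map_updated", lambda area: print("Agent received partial map:", area))
    bus.publish("map_updated", {"cell": (3, 4), "occupied": False})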
37 Copyright: Nguyen Binh Minh, Spring 2019. 38 Copyright: Nguyen Binh Minh, Spring 2019.


Handling Inconsistency Coordination


• Coordination is managing dependencies between agents.
• Three ways to handle inconsistency (Durfee et al.) • Any thoughts on resolving the following?
• Do not allow it!
• For example, in the contract net the only view that matters is that of the manager agent.
• Resolve inconsistency 1. We both want to leave the room through the same door.
We are walking such that we will arrive at the door at the
• Agents discuss the inconsistent information/goals until the inconsistency goes away. same time. What do we do to ensure we can both get
• We will discuss this later (argumentation). through the door?

• Build systems that degrade gracefully in the face of 2. We both arrive at the copy room with a stack of paper to
photocopy. Who gets to use the machine first?
inconsistency.

39 Copyright: Nguyen Binh Minh, Spring 2019. 40 Copyright: Nguyen Binh Minh, Spring 2019.


Coordination Social Norms


• Von Martial suggested that positive coordination is: • Societies are often regulated by (often
• Requested (explicit) unwritten) rules of behaviour.
• Non-requested (implicit)
• Example:
• Non-requested coordination relationships can be as follows. • A group of people is waiting at the bus stop. The bus
• Action equality: arrives. Who gets on the bus first?
• We both plan to do something, and by recognising this one of us can be saved the effort.

• Consequence: • Another example:


• What I plan to do will have the side-effect of achieving something you want to do. • On 34th Street, which side of the sidewalk do you walk along?
• Favor:
• What I plan to do will make it easier for you to do what you want to do. • In an agent system, we can design the
norms and program agents to follow them,
• Now let’s look at some approaches to coordination. or let norms evolve.
41 Copyright: Nguyen Binh Minh, Spring 2019. 42 Copyright: Nguyen Binh Minh, Spring 2019.


Offline Design

• Recall how we described agents before: Ag : Rᴱ → Ac
  • As a function which, given a run ending in a state, gives us an action.
• A constraint is then a pair ⟨E′, α⟩
  • where E′ ⊆ E is a set of states, and α ∈ Ac is an action.
  • This constraint says that α cannot be done in any state in E′.
• A social law is then a set of these constraints.

• We can refine our view of an environment.
  • Focal states, F ⊆ E, are the states we want our agent to be able to get to.
  • From any focal state e ∈ F it should be possible to get to any other focal state e′ ∈ F (though not necessarily right away).
• A useful social law is then one that does not prevent agents from getting from one focal state to another.

A useful social law that prevents collisions (Wooldridge p. 177, from Shoham and Tennenholtz):
1. On even rows the robots move left, while in odd rows the robots move right.
2. Robots move up when in the rightmost column.
3. Robots move down when in the leftmost column of even rows or the second rightmost column of odd rows.
Not necessarily efficient (O(n²) steps to get to a specific square).
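
• The constraint-based definition of a social law above can be read directly as code. A small illustrative Python sketch (my own rendering, with invented toy states and actions, not from the textbook), where a constraint is a pair ⟨E′, α⟩ and a social law is a set of such pairs:

    # Sketch: a constraint is a pair (E', alpha) forbidding action alpha in every
    # state of E'; a social law is a set of such constraints.

    def forbids(law, state, action):
        """True if some constraint <E', alpha> in the law bans `action` in `state`."""
        return any(state in states and action == alpha for states, alpha in law)

    def legal_actions(law, state, actions):
        """The actions an agent may still choose in `state` under the social law."""
        return [a for a in actions if not forbids(law, state, a)]

    # Toy example (invented states/actions): 'enter' is forbidden while a junction is busy.
    law = {(frozenset({"junction_busy"}), "enter")}
    print(legal_actions(law, "junction_busy", ["enter", "wait"]))   # ['wait']
    print(legal_actions(law, "junction_clear", ["enter", "wait"]))  # ['enter', 'wait']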

43 Copyright: Nguyen Binh Minh, Spring 2019.


44 Copyright: Nguyen Binh Minh, Spring 2019.


Emergence Joint Intentions


• We can also design systems in which social laws • Just as we have individual intentions, we can have joint intentions for
emerge. Simple majority:
a team of agents.
Agents pick the shirt they have seen the
most.
“... Agents have both a red t-shirt and a blue t-shirt
and wear one. Goal is for everyone to end up with
Simple majority with types: • Levesque defined the idea of a joint persistent goal (JPG).
the same color on. In each round, each agent
Agents come in two types. When they meet
an agent of the same type, agents pass their
• A group of agents have a collective commitment to bring about some goal φ, “move the
meets one other agent, and decides whether or memories. Otherwise they act as simple couch”.
not to change their shirt. During the round they
only see the shirt their pair is wearing — they don’t
majority.
• Also have motivation ψ, “Simon wants the couch moved”.
get any other information...” Highest cumulative reward:
Agents can “see” how often other agents
T-shirt Game (Shoham and Tennenholtz)
(some subset of all the agents) have matched • The mental states of agents mirror those in BDI agents.
their pair. They pick the shirt with the largest
number of matches. • Agents don’t believe that φ is satisfied, but believe it is possible.
• Agents maintain the goal φ until a termination condition is reached.
• What strategy update function should they use?
45
Copyright: Nguyen Binh Minh, Spring 2019. 46 Copyright: Nguyen Binh Minh, Spring 2019.


Joint Intentions
Multiagent Planning
• The termination condition is that it is mutually believed that: “... You and I have a mutual
• Another approach to coordination is to explicitly plan
• goal φ is satisfied; or belief that p if I believe p and you what all the agents do.
• goal φ is impossible; or believe p and I believe that you
believe p and I believe that you • For example, come up with a large STRIPS plan for all the agents
• the motivation ψ is no longer present believe that I believe p and ...” in a system.
• The termination condition is achieved when an agent realises
that, the goal is satisfied, impossible and so on.
• Could have:
• But it doesn’t drop the goal right away. • Centralised planning for distributed plans.
• Instead it adopts a new goal — to make this new knowledge mutually • One agent comes up with a plan for everybody
believed.
• This ensures that the agents are coordinated. • Distributed planning
• A group of agents come up with a centralised plan for another group of agents.
• They don’t stop working towards the goal until they are all • Distributed planning for distributed plans
appraised of the situation.
• Agents build up plans for themselves, but take into account the actions of others.
• Mutual belief is achieved by communication.
47 Copyright: Nguyen Binh Minh, Spring 2019. 48 Copyright: Nguyen Binh Minh, Spring 2019.


Multiagent Planning Summary


• In general, the more decentralized it is, the harder it is. • This lecture has discussed how to get agents Class Reading (Chapter 8):
working together to do things.
• Key assumption: benevolence “Distributed Problem Solving and Planning”,
• Georgeff proposed a distributed version of STRIPS. E.H. Durfee. In Weiss, G. ed.: Multiagent
• Agents are working together, not in competition. Systems,1999, pp121-164.
• New list: during
• Specifies what must be true while the action is carried out.
• We discussed a number of ways of having agents This is a detailed and precise
• This places constraints on when other agents can do things. introduction to distributed problem
decide what to do, and make sure that their work solving and distributed planning, with
many useful pointers into the literature.
is coordinated.
• Different agents plan to achieve their goals using these operators and then do:
• A typical system will need to use a combination of these ideas.
• Interaction analysis: do different plans affect one another?
• Safety analysis: which interactions are problematic?
• Interaction resolution: treat the problematic interactions as critical sections and enforce mutual • Next time, we will go on to look at agents being in
exclusion. competition with one another.

49 Copyright: Nguyen Binh Minh, Spring 2019. 50 Copyright: Nguyen Binh Minh, Spring 2019.


Social Choice
• We continue thinking in the same framework as the previous chapter:
• multiagent encounters
IT4899 • game-like interactions
• participants act strategically
Multi-Agent Systems
Chapter 12 - Making Group Decisions • Social choice theory is concerned with group decision making.
• Agents make decisions based on their preferences, but they are aware of other agents’
Dr. Nguyen Binh Minh preferences as well.
Department of Information Systems
• Classic example of social choice theory: voting
• Formally, the issue is combining preferences to derive a social outcome.
1 2 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Components of a Social Choice Model Preference Aggregation


• Assume a set Ag = {1,...,n} of voters. • The fundamental problem of social choice theory is that…
• These are entities who express preferences. Preference Order Example • …different voters typically have different preference orders!
• Voters make group decisions with respect to a set Suppose
“... given a collection of preference orders, one for each voter, how do we combine these to
Ω = {ω1,ω2,…} of outcomes. Ω = {pear, plum, banana, orange}
derive a group decision, that reflects as closely as possible the preferences of voters? ...”
then we might have agent i with
• Think of these as the candidates.
preference order:
• If |Ω| = 2, we have a pairwise election. (banana, plum, pear, orange) • We need a way to combine these opinions into an overall
meaning decision.
banana ≻i plum ≻i pear ≻i orange
• Each voter has preferences over Ω • What social choice theory is about is finding a way to do this.
• Two variants of preference aggregation:
• An ordering over the set of possible outcomes Ω. • social welfare functions
• Sometimes we will want to pick one, most preferred candidate. • social choice functions
• More generally, we may want to rank, or order these candidates.
3 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 4 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Social Welfare Function Social Choice Function


• Let Π(Ω) be a set of preference orderings • Sometimes, we just want to select one of the
over Ω possible candidates, rather than a social order.
• This gives a social choice function (see opposite)
• A social welfare function takes voter preferences and
• For example, a local by-election or presidential election
produces a social preference order.
• That is it merges voter opinions and comes up with an order over the
candidates. • In other words, we don’t get an ordering out of a
social choice function but, as its name suggests,
we get a single choice.
• We let ≻ denote to the outcome of a social • Of course, if we have a social welfare function, we also have a
welfare function: ω ≻ ω′ social choice function.

• which indicates that ω is ranked above ω′ in the social


ordering • For the rest of this chapter…
• Example: combining search engine results, collaborative filtering, • …we’ll refer to both both social choice and social welfare
collaborative planning, etc. functions as voting procedures.
5 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 6 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Voting Procedures: Plurality Strategic Manipulation by Tactical Voting


• Social choice function: selects a single Anomalies with Plurality • Suppose agent i wants ω1 to win, but otherwise prefers ω2 over ω3
outcome.
Suppose: • i.e. its preferences are: ω1 ≻i ω2 ≻i ω3
• Each voter submits preferences.
|Ag| = 100 and Ω = {ω1, ω2, ω3}
• Each candidate gets one point for every preference order
• However:
with:
that ranks them first.
40% voters voting for ω1
30% of voters voting for ω2 • you believe 49% of voters have preferences: ω2 ≻ω1 ≻ω3
• Winner is the one with largest number of points. 30% of voters voting for ω3 • and you believe 49% have preferences: ω3 ≻ω2 ≻ω1
• Also known in the UK as first past the post, or relative
majority With plurality, ω1 gets elected even
• Example: Political elections in UK. though a clear majority (60%) prefer • You may do better voting for ω2, even though this is not your
another candidate! true preference profile.
• If we have only two candidates, then plurality is • This is tactical voting: an example of strategic manipulation of the vote.
a simple majority election
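
• A short Python sketch of plurality on the 100-voter example (illustrative; the full preference orders behind the two 30% blocks are assumptions made only to complete the ballots):

    # Plurality (first-past-the-post): each ballot gives one point to its top choice.

    from collections import Counter

    def plurality_winner(ballots):
        """ballots: list of preference orders (tuples, most-preferred first)."""
        scores = Counter(ballot[0] for ballot in ballots)
        return scores.most_common(1)[0][0], scores

    # The 100-voter example: 40 rank w1 first, 30 rank w2 first, 30 rank w3 first.
    ballots = ([("w1", "w2", "w3")] * 40 +
               [("w2", "w3", "w1")] * 30 +
               [("w3", "w2", "w1")] * 30)

    winner, scores = plurality_winner(ballots)
    # w1 wins on 40 first-choice votes, even though 60 voters ranked another candidate first.
    print(winner, dict(scores))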
7 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 8 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Condorcet’s Paradox Sequential Majority Elections


Nicolas de Caritat, marquis de
Condorcet (1743-1794)
• Suppose Ag = {1,2,3} and Ω = {ω1,ω2,ω3} with: • One way to improve on plurality voting is to reduce a general voting
ω1 ≻1 ω2 ≻1 ω3 scenario to a series of pairwise voting scenarios.
ω3 ≻2 ω1 ≻2 ω2
ω2 ≻3 ω3 ≻3 ω1 Linear Sequential Pairwise Elections Balanced Binary Tree
One agenda for the election between Ω = {ω1, ω2, ω3, We can also organise this as a balanced binary tree.
• For every possible candidate, there is another candidate ω4} is ω2, ω3, ω4, ω1 – An election between ω1 and ω2.
that is preferred by a majority of voters! First we have an election between ω2 and ω3. – An election between ω3 and ω4.
The winner enters an election with ω4. – An election between the two winners.
• If we pick ω1, two thirds of the voters prefer ω3 to ω1.
The winner of that faces ω1. Rather like the Final Four
• If we pick ω3, two thirds of the voters prefer ω2.
• In a democracy, it seems inevitable that ? w1
we can’t choose an outcome that will • If we pick ω2, it is still the case that two thirds of the voters prefer a ? w4
make everyone happy. different candidate, in this case ω1 to the candidate we picked. ? w1 w4 w1

• Condorcet’s paradox tells us that in some ? ? w2 w4

situations, no matter which outcome we


choose, a majority of voters will be
• This is Condorcet’s paradox: there are situations in which: ? w4 w2 w4

unhappy with the outcome. • no matter which outcome we choose, a majority of voters will be w1 w2 w3 w4 w1 w2 w3 w4

w2 w3 w2 w3
unhappy with the outcome chosen.
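
• The paradox can be checked mechanically by counting each pairwise contest. An illustrative Python sketch for the three preference orders above:

    # Pairwise majority contests for the Condorcet-paradox profile.

    from itertools import combinations

    prefs = {1: ["w1", "w2", "w3"],   # w1 >1 w2 >1 w3
             2: ["w3", "w1", "w2"],   # w3 >2 w1 >2 w2
             3: ["w2", "w3", "w1"]}   # w2 >3 w3 >3 w1

    def majority_prefers(a, b):
        """Return the outcome preferred by a majority of voters in an a-vs-b election."""
        votes_a = sum(1 for order in prefs.values() if order.index(a) < order.index(b))
        return a if votes_a > len(prefs) / 2 else b

    for a, b in combinations(["w1", "w2", "w3"], 2):
        print(f"{a} vs {b}: majority prefers {majority_prefers(a, b)}")
    # w1 beats w2, w2 beats w3, but w3 beats w1: a cycle, so no Condorcet winner.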
9 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 10 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Linear Sequential Pairwise Elections Anomalies with Sequential Pairwise Elections


• Here, we pick an ordering of the outcomes – the agenda Majority Graphs • Suppose:
• 33 voters have preferences: ω1 ≻* ω2 ≻* ω3
– which determines who plays against who. A directed graph with:
• vertices = candidates • 33 voters have preferences: ω3 ≻* ω1 ≻* ω2
• For example, if the agenda is: • an edge (i, j) if i would be at j is a
• 33 voters have preferences ω2 ≻* ω3 ≻* ω1
simple majority election.

ω2 , ω3 , ω4 , ω1
• Then for every candidate, we can fix an
A compact representation of voter
preferences. With an odd number of voters
• then the first election is between ω2 and ω3...
(no ties) the majority graph is such that:
• The graph is complete.
agenda for that candidate to win in a
• ... and the winner goes on to the second election with ω4 ... • The graph is asymmetric. sequential pairwise election!

• ... and the winner of this election goes in the final election with ω1.
The graph is irreflexive.
Such a graph is called a tournament, a
nice summarisation of information about • This idea is easiest to illustrate using a
voter preferences.
majority graph.
11 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 12 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Majority Graph Example Agendas and Majority Graphs


• Given the previous example: • This is another example of a majority graph in which every
• with agenda (ω3, ω2, ω1), ω1 wins ω1 ω2 outcome is a possible winner
• i.e. the winner of ω3 vs ω2 is ω2, which is beaten by ω1
ω1 ω2
ω1 wins with agenda ω2 wins with agenda ω3 wins with agenda ω4 wins with agenda
• with agenda (ω1, ω3, ω2), ω2 wins (ω3, ω4, ω2, ω1) (ω1, ω3, ω4, ω2) (ω1, ω4, ω2, ω3) (ω1, ω3, ω2, ω4)
• i.e. the winner of ω1 vs ω3 is ω3, which is beaten by ω2 ω3 vs ω4 ω1 vs ω3 ω1 vs ω4 ω1 vs ω3
→ω3 vs. ω2 →ω1 vs. ω4 →ω1 vs. ω2 →ω1 vs. ω2
• with agenda (ω1, ω2, ω3), ω3 wins ω3 →ω3 vs. ω1 →ω1 vs. ω2 →ω2 vs. ω3 →ω2 vs. ω4
• 1) →ω1 2) →ω2 3) →ω3 4) →ω4
i.e. the winner of ω1 vs ω2 is ω1, which is beaten by ω3 ω3 ω4

• Since the graph contains a cycle, it turns • Note, that there may be multiple agendas that result in the
out that we can fix whatever result we want. same winner:
• All we have to do is to pick the right order of the • ω1 also wins with agenda (ω4, ω2, ω3, ω1)
elections.
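
• An illustrative Python sketch of a linear sequential pairwise election driven by a majority graph; the beats relation below is reconstructed from the four example agendas listed above:

    # Linear sequential pairwise election driven by a majority graph.

    # (a, b) in beats means a would beat b in a pairwise majority vote.
    beats = {("w2", "w1"), ("w1", "w3"), ("w1", "w4"),
             ("w3", "w2"), ("w4", "w2"), ("w3", "w4")}

    def pairwise(a, b):
        return a if (a, b) in beats else b

    def sequential_winner(agenda):
        """Play the agenda left to right: the running winner meets the next candidate."""
        winner = agenda[0]
        for challenger in agenda[1:]:
            winner = pairwise(winner, challenger)
        return winner

    for agenda in [("w3", "w4", "w2", "w1"), ("w1", "w3", "w4", "w2"),
                   ("w1", "w4", "w2", "w3"), ("w1", "w3", "w2", "w4")]:
        print(agenda, "->", sequential_winner(agenda))
    # Each agenda produces a different winner: w1, w2, w3 and w4 respectively.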
13 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 14 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Condorcet Winners The Slater Ranking


• Now, we say that a result is a possible winner if there is an agenda
that will result in it winning overall. • The Slater rule is interesting because it considers:
• The majority graph helps us determine this. • the question “of which social ranking should be selected”, as
ω1 ω2
ω1 wins with agenda
(ω1, ω3, ω4, ω2)
ω1 wins with agenda
(ω1, ω4, ω2, ω3)
• “the question of trying to find a consistent ranking that is as close to the
ω1 vs ω3 ω1 vs ω4 majority graph as possible”
... etc...
→ω1 vs. ω4 →ω1 vs. ω2 • i.e. one that does not contain cycles
→ω1 vs. ω2 →ω1 vs. ω3
2) →ω1 3) →ω1 ω3 ω4

• To determine if ωi is a possible winner, we have to find, for every • Think of it as:


other ωj, if there is a path from ωi to ωj in the majority graph. • If we reversed some edges in a graph, which ordering minimises this
inconsistency measure
• This is computationally easy to do.

15 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 16 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Inconsistency Measure The Slater Ranking


ω1 ω2 Inconsistent Consistent

• Consider this majority graph (upper) • Remember that the following ranking ω1 ω2 ω1 ω2
• No cycles, therefore the ranking ω1 ≻* ω3 ≻* ω2 ≻* ω4 is acceptable: Consistent
• The graph is consistent has a cost of 1
ω3 ω4
• ω 1 ≻* ω2 ≻* ω3 ≻* ω4
ω3 ω4 ω3 ω4
• By flipping the single edge (ω4, ω1) we would have a consistent
• This majority graph (lower) has cycles Inconsistent Consistent
graph.
• We can have a ranking where one candidate beats another,
although it would loose in a pairwise election ω1 ω2 ω1 ω2 Inconsistent Consistent

• ω1 ≻* ω2 ≻* ω3 ≻* ω4 even though ω4 beats ω1 in a pairwise election


• By flipping the edge (ω4, ω1) we would have a consistent graph
• Consider the alternate ranking: ω1 ω2 ω1 ω2

• ω 1 ≻* ω2 ≻* ω4 ≻* ω3
• As this is the only edge we would need to flip, we say ω3 ω4 ω3 ω4 • In this case, we would have to flip two edges (ω4, ω1) and (ω3,
ω4) giving a cost of 2 giving
the cost of this order is 1. ω3 ω4 ω3 ω4
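
• The inconsistency cost can be computed by counting majority-graph edges that point against a proposed ranking. An illustrative Python sketch, using one majority graph consistent with the cyclic example above (it agrees with ω1 ≻* ω2 ≻* ω3 ≻* ω4 everywhere except that ω4 beats ω1):

    # Slater-style inconsistency cost: how many majority-graph edges disagree
    # with a proposed ranking (i.e. would have to be flipped to remove cycles).

    from itertools import permutations

    edges = {("w1", "w2"), ("w1", "w3"), ("w2", "w3"),
             ("w2", "w4"), ("w3", "w4"), ("w4", "w1")}

    def cost(ranking):
        """Number of edges (a beats b) where the ranking nevertheless puts b above a."""
        pos = {c: i for i, c in enumerate(ranking)}
        return sum(1 for a, b in edges if pos[b] < pos[a])

    print(cost(["w1", "w2", "w3", "w4"]))   # 1  (only (w4, w1) must be flipped)
    print(cost(["w1", "w2", "w4", "w3"]))   # 2  ((w4, w1) and (w3, w4))

    # The Slater ranking is any ranking of minimal cost (NP-hard in general,
    # but brute force is fine for four candidates).
    print(min(permutations(["w1", "w2", "w3", "w4"]), key=cost))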
17 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 18 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


The Slater Ranking Borda Count


• One reason plurality has so many anomalies is that Example of Borda Count
Assume we have three voters with
it ignores most of a voter’s preference orders: it
preferences:
only looks at the top ranked candidate.
• The Slater ranking is the one with minimal • The Borda count takes whole preference order into account. ω2 ≻1 ω1 ≻1 ω3
cost ω3 ≻2 ω2 ≻2 ω1
• i.e. calculate the cost of each ordering and find the one
• Suppose we have k candidates - i.e. k = |Ω| ω1 ≻3 ω2 ≻3 ω3
• For each candidate, we have a variable, counting the strength of
with the minimal cost opinion in favour of this candidate.
The Borda count of ω2 is 4:
• Computing the ordering with minimal Slater ranking is NP- •

If ωi appears first in a preference order, then we increment the count for ωi by k − 1;
2 from the first place vote of voter 1.
we then increment the count for the next outcome in the preference order by k − 2,
hard • 1 each from the second place votes of
. . . , until the final candidate in the preference order has its total incremented by 0.
voters 2 and 3.

• After we have done this for all voters, then the What are the Borda counts of the other
candidates?
totals give the ranking.
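
• The worked Borda example is easy to verify in code. An illustrative Python sketch for the three voters in the box:

    # Borda count: with k candidates, a ballot gives k-1 points to its first choice,
    # k-2 to the second, ..., 0 to the last.

    from collections import Counter

    def borda(ballots):
        k = len(ballots[0])
        scores = Counter()
        for ballot in ballots:
            for rank, candidate in enumerate(ballot):
                scores[candidate] += (k - 1) - rank
        return scores

    ballots = [("w2", "w1", "w3"),   # voter 1
               ("w3", "w2", "w1"),   # voter 2
               ("w1", "w2", "w3")]   # voter 3

    print(borda(ballots))   # Counter({'w2': 4, 'w1': 3, 'w3': 2})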
19 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 20 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Alternative Vote (AV) Alternative Vote (AV)


• A social choice voting method
William Robert Ware
(1832-1915) Round 1
Votes 1st choice 2nd choice 3rd choice 4th choice
• Also known as Instant Runoff Voting (IRV)
7 action horror comedy drama
23 voters chose their
favourite movie
• Results in a single winner 5 comedy action horror drama
genres.

2 drama horror comedy action Majority (i.e. >50%)

• Unlike Plurality voting, voters in IRV rank


will be 12 or more
5 comedy drama action horror votes
4 horror action drama comedy
the candidates in order of preference. • Used in national elections in several
• Counting proceeds in rounds, with the last place
countries, including:
• Members of the Australian House of Round 1 Round 2 Round 3
In the first round, we consider all
candidate being eliminated, until there is a majority Representatives and most Australian of the 1st choice votes
action 7
vote state legislatures
comedy 5+5=10 As drama received the fewest
• The President of India, and members
votes, we eliminate this and
of legislative councils in India drama 2
reallocate the overall votes.
4

• Offers a solution to Condorcet’s paradox The President of Ireland horror

21 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 22 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Alternative Vote (AV) Alternative Vote (AV)


Round 2 Round 3
Votes 1st choice 2nd choice 3rd choice 4th choice 23 voters chose their Votes 1st choice 2nd choice 3rd choice 4th choice 23 voters chose their
7 action horror comedy drama favourite movie 7 action horror comedy drama favourite movie
genres. genres.
5 comedy action horror drama 5 comedy action horror drama
2 drama horror comedy action Majority (i.e. >50%) 2 drama horror comedy action Majority (i.e. >50%)
will be 12 or more will be 12 or more
5 comedy drama action horror votes 5 comedy drama action horror votes
4 horror action drama comedy 4 horror action drama comedy

In the second round, we allocate In the third round, we allocate


Round 1 Round 2 Round 3 Round 1 Round 2 Round 3
the 2 votes for drama to the next the 6 votes for horror to the next
action 7 7 choice, which is horror action 7 7 7+4=11 choices: 2 votes to comedy,
comedy 5+5=10 10 comedy 5+5=10 5+5=10 5+2+5=12 and 4 to action
drama 2 —— However, horror now has the drama 2 —— ——
fewest votes, and is eliminated Comedy now has the majority
horror 4 4+2=6 horror 4 4+2=6 ——
votes
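
• The three rounds above can be reproduced with a short elimination loop. An illustrative Python sketch of the basic instant runoff rule (tie-breaking details are omitted, as they do not arise in this example):

    # Instant Runoff (Alternative Vote): eliminate the last-placed candidate each
    # round and transfer those ballots to their next surviving choice.

    from collections import Counter

    def irv_winner(ballots):
        remaining = set(ballots[0])
        while True:
            # First choices among the candidates still in the race.
            tally = Counter(next(c for c in b if c in remaining) for b in ballots)
            top, votes = tally.most_common(1)[0]
            if votes > len(ballots) / 2:        # strict majority (12+ of 23 here)
                return top, tally
            remaining.remove(min(tally, key=tally.get))

    # The 23-voter movie-genre example from the slides.
    ballots = ([("action", "horror", "comedy", "drama")] * 7 +
               [("comedy", "action", "horror", "drama")] * 5 +
               [("drama",  "horror", "comedy", "action")] * 2 +
               [("comedy", "drama",  "action", "horror")] * 5 +
               [("horror", "action", "drama",  "comedy")] * 4)

    print(irv_winner(ballots))   # comedy wins with 12 votes in the third round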

23 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 24 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Desirable Properties of Voting Procedures The Pareto Condition


The Pareto Property
• Can we classify the properties we If everybody prefers ωi over ωj, then ωi
should be ranked over ωj in the social • Recall the notion of Pareto efficiency from the previous
want of a “good” voting procedure? outcome.
lecture.
• An outcome is Pareto efficient if there is no other outcome that makes
• Three key properties: Condorcet Winner
If ωi is a Condorcet winner, then ωi should
one agent better off without making another worse off.
• The Pareto property; always be ranked first. • In voting terms, if every voter ranks ωi above ωj then ωi ≻ ωj.
• The Condorcet Winner condition; Independence of Irrelevant
• Independence of Irrelevant Alternatives (IIA). Alternatives (IIA) • Satisfied by plurality and Borda but not by sequential
Whether ωi is ranked above ωj in the social
outcome should depend only on the relative majority.
• We should also avoid dictatorships! orderings of ωi and ωj in voters profiles.

25 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 26 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


The Condorcet winner condition Independence of irrelevant alternatives


• Recall that the Condorcet winner is an outcome that would • Suppose there are a number of candidates including ωi and ωj
beat every other outcome in a pairwise election. and voter preferences make ωi ≻ ωj.
• A Condorcet winner is a strongly preferred outcome. • Now assume one voter k changes preferences, but still ranks ωi ≻k ωj
• The independence of irrelevant alternatives (IIA) condition says that however ≻
• The Condorcet winner condition says that if there is a changes, ωi ≻ ωj still.
Condorcet winner, then it should be ranked first. • In other words, the social ranking of ωi and ωj should depend only on the way
• Seems obvious. they are ranked in the ≻ relations of the voters.

• However, of the ones we’ve seen, only sequential majority • Plurality, Borda and sequential majority do not satisfy IIA.
satisfies it.
27 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 28 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

27 28

14
11/29/2023

Dictatorship Theoretical Results


• Not a desirable property, but a useful notion to define.
• We have now explored several social choice functions
• A social welfare function f is a dictatorship if for some agent i: f(≻1, …, ≻n) = ≻i • Do any of these satisfy our desirable properties (i.e. Pareto, etc)?
• No - according to Arrow’s Theorem

• Furthermore, voters can benefit by strategically


• In other words the output is exactly the preference order of the single “dictator” agent i.
misrepresenting their preferences, i.e., lying – tactical
voting
• Plurality and the Borda count are not dictatorships. • Are there any voting methods which are non-manipulable, in the sense that
• But, dictatorships satisfy the Pareto condition and IIA. voters can never benefit from misrepresenting preferences?
• No - according to the Gibbard-Satterthwaite Theorem
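
• As a trivial illustrative rendering in Python (my own, not a definition from the lecture), a dictatorship is the social welfare function that simply returns the dictator's own preference order:

    # A dictatorship: the social outcome is exactly agent i's preference order.

    def dictatorship(i):
        """Return a social welfare function dictated by voter i."""
        def swf(profile):            # profile: dict voter -> preference order (tuple)
            return profile[i]
        return swf

    profile = {1: ("w1", "w2", "w3"), 2: ("w3", "w1", "w2"), 3: ("w2", "w3", "w1")}
    print(dictatorship(2)(profile))  # ('w3', 'w1', 'w2'), whatever voters 1 and 3 say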

29 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 30 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

29 30

15
11/29/2023

Theoretical Results Computational Complexity to the Rescue


• Arrow’s Theorem
• For elections with more than 2 candidates the only voting procedure satisfying
the Pareto condition and IIA is a dictatorship • However…
• in which the social outcome is in fact simply selected by one of the voters. • Gibbard-Satterthwaite only tells us that manipulation is possible in
• This is a negative result: there are fundamental limits to democratic decision principle.
making! • It does not give any indication of how to misrepresent preferences.

• Bartholdi, Tovey, and Trick showed:


• The Gibbard-Satterthwaite Theorem • that there are elections that are prone to manipulation in principle, but where manipulation was
computationally complex.
• The only non-manipulable voting method satisfying the Pareto property for
elections with more than 2 candidates is a dictatorship. • “Single Transferable Vote” is NP-hard to manipulate!

• In other words, every “realistic” voting method is prey to strategic manipulation…


31 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 32 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Summary

• In this lecture we have looked at mechanisms for group decision making.
• This has been a bit stylised — we looked at how, if a group of agents ranks a set of outcomes, we might create a consensus ranking.
  • This does have a real application in voting systems.
  • Social choice mechanisms are increasingly used in real systems as a way to reach consensus.
• We looked at the behaviour of some existing voting systems and some theoretical results for voting systems in general.
  • Most of these results were pretty negative.
• Lots we didn't have time to cover — another area with lots of active research.

Class Reading (Chapter 12):
"The computational difficulty of manipulating an election", J.J. Bartholdi, C.A. Tovey and M.A. Trick. Social Choice and Welfare, Vol. 6, 227-241, 1989.

This is the article that prompted the current interest in computational aspects of voting. It is a technical scientific article, but the main thrust of the article is perfectly understandable without a detailed technical background.

33 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


IT4899
Multi-Agent Systems
Chapter 13 - Forming Coalitions

Dr. Nguyen Binh Minh
Department of Information Systems

Cooperative Game Theory

• So far we have taken a game-theoretic view of multi-agent interactions.
  • The Prisoner's Dilemma suggests that cooperation should not occur, as the conditions required are not present:
  • Binding agreements are not possible.
  • Utility is given to individuals based on individual action.

• These constraints do not necessarily hold in the real world.
  • Contracts, or collective payments, can facilitate cooperation, leading to Coalition Games and Cooperative Game Theory.

2 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Coalitional Games

• Coalitional games model scenarios where agents can benefit by cooperating.

• Sandholm (et al., 1999) identified the following stages:

  Coalitional Structure Generation
  Deciding in principle who will work together. It asks the basic question: Which coalition should I join?
  The result partitions the agents into disjoint coalitions; the overall partition is a coalition structure.

  Solving the optimization problem of each coalition
  Deciding how to work together, and how to solve the "joint problem" of a coalition. It also involves finding how to maximise the utility of the coalition itself, and typically involves joint planning etc.

  Dividing the benefits
  Deciding "who gets what" in the payoff. Coalition members cannot ignore each other's preferences, because members can defect: ...if you try to give me a bad payoff, I can always walk away...
  We might want to consider issues such as fairness of the distribution.

3 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Formalising Cooperative Scenarios

• A Characteristic Function Game (CFG) is represented as the tuple G = ⟨Ag, 𝛎⟩.

• From this, we form a coalition C ⊆ Ag
  • Singleton: where a coalition consists of a single member
  • Grand Coalition: where C = Ag (i.e. all of the agents)

• Each coalition has a payoff value, defined by the characteristic function 𝛎
  • i.e. if 𝛎(C) = k then the coalition will get the payoff k if they cooperate on some task

  % Representation of a Simple Characteristic Function Game
  % List of Agents
  1,2,3
  % Characteristic Function
  1 = 5
  2 = 5
  3 = 5
  1,2 = 10
  1,3 = 10
  2,3 = 10
  1,2,3 = 25

4 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
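
The listing above is the slide's own textual format; as an illustration, the same game can be written down directly in Python. This is a minimal sketch, not part of the original slides.

    from itertools import combinations

    AGENTS = (1, 2, 3)

    # Characteristic function of the simple game from the slide,
    # keyed by (frozen) sets of agents.
    v = {
        frozenset({1}): 5,
        frozenset({2}): 5,
        frozenset({3}): 5,
        frozenset({1, 2}): 10,
        frozenset({1, 3}): 10,
        frozenset({2, 3}): 10,
        frozenset({1, 2, 3}): 25,   # the grand coalition
    }

    def value(coalition):
        """Payoff the coalition gets if its members cooperate (empty set -> 0)."""
        return v.get(frozenset(coalition), 0)

    # Enumerate every non-empty coalition over the 3 agents (2^3 - 1 = 7 of them)
    for size in range(1, len(AGENTS) + 1):
        for c in combinations(AGENTS, size):
            print(set(c), "->", value(c))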


Characteristic Function Games

• The objective is to join a coalition that the agent cannot object to
  • This involves calculating the characteristic function for different games

• Sandholm (1999) showed that:
  • If the game is superadditive, i.e. 𝛎(U) + 𝛎(V) ≤ 𝛎(U⋃V) for all disjoint U, V ⊆ Ag:
  • the coalition that maximises social welfare is the Grand Coalition
  • If the game is subadditive, i.e. 𝛎(U) + 𝛎(V) > 𝛎(U⋃V) for all disjoint U, V ⊆ Ag:
  • the coalitions that maximise social welfare are singletons

• However, as some games are neither subadditive nor superadditive:
  • the characteristic function value calculations need to be determined for each of the possible coalitions!
  • This is exponentially complex

5 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Which Coalition Should I Join?

• Assuming that we know the characteristic function and the payoff vector, what coalition should an agent join?
  • An outcome x for a coalition C in game ⟨Ag, 𝛎⟩ is a vector of payoffs to the members of C, x = ⟨x1, . . . , xk⟩, which represents an efficient distribution of the payoff to those members
  • Where "efficient" means the whole value of the coalition is distributed: x1 + · · · + xk = 𝛎(C)
  • Example: if 𝛎({1, 2}) = 20, then possible outcomes are: ⟨20,0⟩, ⟨19,1⟩, ⟨18,2⟩ … ⟨1,19⟩, ⟨0,20⟩

• Thus, the agent should only join a coalition C which is:
  • Feasible: the coalition C really could obtain the payoff being distributed; and
  • Efficient: all of the payoff is allocated

6 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
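
As a quick sanity check on these definitions, the following sketch (not from the slides) tests superadditivity of the example game by brute force over all disjoint pairs of non-empty coalitions.

    from itertools import combinations

    def is_superadditive(agents, v):
        """Check v(U) + v(V) <= v(U | V) for all disjoint non-empty U, V."""
        coalitions = [frozenset(c) for n in range(1, len(agents) + 1)
                      for c in combinations(agents, n)]
        for U in coalitions:
            for V in coalitions:
                if U & V:
                    continue  # only disjoint pairs are constrained
                if v[U] + v[V] > v[U | V]:
                    return False
        return True

    # Characteristic function from the earlier slide
    agents = (1, 2, 3)
    v = {frozenset({1}): 5, frozenset({2}): 5, frozenset({3}): 5,
         frozenset({1, 2}): 10, frozenset({1, 3}): 10, frozenset({2, 3}): 10,
         frozenset({1, 2, 3}): 25}
    print(is_superadditive(agents, v))  # True, e.g. v({1}) + v({2,3}) = 15 <= 25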


Which Coalition Should I Join?

• However, there may be many coalitions
  • Each has a different value under the characteristic function
  • Agents prefer coalitions that are as productive as possible

• Therefore a coalition will only form if all the members prefer to be in it
  • i.e. they don't defect to a more preferable coalition

• Therefore:
  • "which coalition should I join?" can be reduced to "is the coalition stable?"
  • Is it rational for all members of coalition C to stay with C, or could they benefit by defecting from it?
  • There's no point in me joining a coalition with you, unless you want to form one with me, and vice versa.

7 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Stability and the Core

• Stability can be reduced to the notion of the core
  • Stability is a necessary but not sufficient condition for coalitions to form
  • i.e. unstable coalitions will never form, but a stable coalition isn't guaranteed to form

• The core of a coalitional game is the set of feasible distributions of payoff to members of a coalition that no sub-coalition can reasonably object to
  • Intuitively, a coalition C objects to an outcome if there is some other outcome that makes all of its members strictly better off
  • Formally, C ⊆ Ag objects to an outcome x = ⟨x1, . . . , xn⟩ for the grand coalition if there is some outcome x′ = ⟨x1′, . . . , xk′⟩ for C such that xi′ > xi for all i ∈ C

• The idea is that an outcome is not going to happen if somebody objects to it!
  • i.e. if the core is empty, then no coalition can form

8 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


The Core and Fair Payoffs

• Sometimes the core is non-empty, but is it "fair"?

• Suppose we have Ag = {1, 2}, with a characteristic function in which 𝛎({2}) = 5 and 𝛎({1, 2}) = 20.

• The outcome ⟨20, 0⟩ (i.e., agent 1 gets everything) will not be in the core, since agent 2 can object; by working on its own it can do better, because 𝛎({2}) = 5.
• However, outcome ⟨14, 6⟩ is in the core, as agent 2 gets more than by working on its own, and thus has no objection.

• But is it "fair" on agent 2 to get only a payoff of 6, if agent 1 gets 14???

9 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Sharing the Benefits of Cooperation

• The Shapley value is the best known attempt to define how to divide the benefits of cooperation fairly.
  • It does this by taking into account how much an agent contributes.

• The Shapley value of agent i is the average amount that i is expected to contribute to a coalition.

• The Shapley value is the one that satisfies the following axioms:

  Symmetry
  Agents that make the same contribution should get the same payoff, i.e. the amount an agent gets should only depend on their contribution.

  Dummy Player
  These are agents that never have any synergy with any coalition, and thus only get what they can earn on their own.

  Additivity
  If two games are combined, the value an agent gets should be the sum of the values it gets in the individual games.

10 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
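
A small Python sketch (not from the slides) that brute-forces whether a payoff vector for the grand coalition is in the core of the two-agent game above. The slide only states 𝛎({2}) = 5 and, implicitly, 𝛎({1, 2}) = 20; the value 𝛎({1}) = 5 below is an assumption made for illustration.

    from itertools import combinations

    # v({1}) = 5 is an assumption; the slide only gives v({2}) = 5 and v({1,2}) = 20.
    v = {frozenset({1}): 5, frozenset({2}): 5, frozenset({1, 2}): 20}
    agents = (1, 2)

    def in_core(x):
        """x maps agent -> payoff for the grand coalition.

        The outcome is in the core if it distributes exactly v(Ag) and no
        coalition C could obtain more on its own than its members get under x.
        """
        if sum(x.values()) != v[frozenset(agents)]:
            return False  # not an efficient distribution
        for size in range(1, len(agents) + 1):
            for c in combinations(agents, size):
                if v[frozenset(c)] > sum(x[i] for i in c):
                    return False  # coalition c objects
        return True

    print(in_core({1: 20, 2: 0}))  # False: agent 2 objects, since v({2}) = 5 > 0
    print(in_core({1: 14, 2: 6}))  # True: no coalition can do better on its own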


Marginal Contribution

• The Shapley value for an agent is based on the marginal contribution of that agent to a coalition (for all permutations of coalitions)
  • Let δi(C) be the amount that agent i adds by joining a coalition C ⊆ Ag
  • i.e. the marginal contribution of i to C is defined as δi(C) = 𝛎(C⋃{i}) - 𝛎(C)
  • Note that if δi(C) = 𝛎({i}) then there is no added value from i joining C, since the amount i adds is the same as i would earn on its own

• The Shapley value for i, denoted φi, is the value that agent i in Ag is given in the game ⟨Ag, 𝛎⟩

11 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Shapley Axioms: Symmetry

• Agents that make the same contribution should get the same payoff
  • The amount an agent gets should only depend on their contribution
• Agents i and j are interchangeable if δi(C) = δj(C) for every C ⊆ Ag \ {i, j}

• The symmetry axiom states:
  • If i and j are interchangeable, then φi = φj

12 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Shapley Axioms: Dummy Player

• Agents that never have any synergy with any coalition, and thus only get what they can earn on their own.
• An agent i is a dummy player if δi(C) = 𝛎({i}) for every C ⊆ Ag \ {i}
  • i.e. the agent only adds to a coalition what it could get on its own

• The dummy player axiom states:
  • If i is a dummy player, then φi = 𝛎({i})

13 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Shapley Axioms: Additivity

• If two games are combined, the value an agent gets should be the sum of the values it gets in the individual games
  • i.e. an agent doesn't gain or lose by playing more than once
• Let G1 = ⟨Ag, 𝛎1⟩ and G2 = ⟨Ag, 𝛎2⟩ be games with the same agents
  • Let i ∈ Ag be one of the agents
  • Let φ1i and φ2i be the value agent i receives in games G1 and G2 respectively
  • Let G1+2 = ⟨Ag, 𝛎1+2⟩ be the game such that 𝛎1+2(C) = 𝛎1(C) + 𝛎2(C)

• The additivity axiom states:
  • The value φ1+2i of agent i in game G1+2 should be φ1i + φ2i

14 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Shapley value

• Recall that we stated:
  • The Shapley value for an agent is based on the marginal contribution of that agent to a coalition (for all permutations of coalitions)
  • The marginal contribution can depend on the order in which an agent joins a coalition
  • This is because an agent may make a larger contribution if it is the first to join than if it is the last!

• For example, if Ag = {1,2,3} then the set of all possible orderings, 𝚷(Ag), is given as
  • 𝚷(Ag) = {(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)}

• We have defined the marginal contribution of i to C as δi(C) = 𝛎(C⋃{i}) - 𝛎(C)

• The Shapley value for i is defined as the average of i's marginal contributions, taken over all of the orderings in 𝚷(Ag).

15 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Shapley Example

16 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
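
The formula and the worked example on these two slides did not survive extraction. The standard definition consistent with the description above is φi = (1 / |Ag|!) Σo∈𝚷(Ag) δi(Ci(o)), where Ci(o) is the set of agents that precede i in ordering o. The sketch below (not from the slides) computes this for the simple three-agent game introduced earlier.

    from itertools import permutations
    from math import factorial

    # Characteristic function of the example game from the earlier slide
    v = {frozenset(): 0,
         frozenset({1}): 5, frozenset({2}): 5, frozenset({3}): 5,
         frozenset({1, 2}): 10, frozenset({1, 3}): 10, frozenset({2, 3}): 10,
         frozenset({1, 2, 3}): 25}
    agents = (1, 2, 3)

    def shapley(i):
        """Average marginal contribution of agent i over all join orderings."""
        total = 0
        for order in permutations(agents):
            before = frozenset(order[:order.index(i)])  # agents that joined before i
            total += v[before | {i}] - v[before]        # delta_i applied to that set
        return total / factorial(len(agents))

    print({i: shapley(i) for i in agents})
    # The game is symmetric, so each agent gets 25/3 = 8.33..., and the three
    # values sum to v(grand coalition) = 25.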


Representing Coalitional Games

• It is important for an agent to know if the core of a coalition is non-empty

• Problem: a naive, obvious representation of a coalitional game (like the listing of the simple Characteristic Function Game shown earlier) is exponential in the size of Ag.
• Such a representation is:
  • utterly infeasible in practice; and
  • so large that it renders comparisons to this input size meaningless
• An n-player game consists of 2^n - 1 coalitions
  • e.g. a 100-player game would require about 1.2 x 10^30 lines

17 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Representing Characteristic Functions?

• Two approaches to this problem:
  • try to find a complete representation that is succinct in "most" cases
  • try to find a representation that is not complete but is always succinct

• A common approach:
  • interpret the characteristic function over a combinatorial structure.

• We look at two possible approaches:
  • Induced Subgraph and Marginal Contribution Networks

18 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Induced Subgraph

• Represent 𝛎 as an undirected graph on Ag, with integer weights wi,j between nodes i, j ∈ Ag

• The value of a coalition C is then the sum of the weights wi,j over the edges with both endpoints in C
  • i.e., the value of a coalition C ⊆ Ag is the weight of the subgraph induced by C

[Figure: a weighted graph over the agents A, B, C, D; the worked examples on the slide include 𝛎({A,B,C}) = 3 + 2 = 5 and 𝛎({B,D}) = 5 + 1 = 6.]

19 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Induced Subgraph

• The representation is succinct, but not complete
  • there are characteristic functions that cannot be captured using this representation

• Determining emptiness of the core is NP-complete
  • Checking whether a specific distribution is in the core is co-NP-complete

• The Shapley value can be calculated in polynomial time
  • i.e. an agent gets half the income from the edges in the graph to which it is attached.

20 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
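
A minimal sketch (not from the slides) of the induced subgraph representation. The slide's actual graph is not fully recoverable from the extraction, so the edge weights below are made up for illustration; the half-the-incident-edges rule for the Shapley value is the one stated above.

    # Undirected edge weights between agents; illustrative values only.
    weights = {
        frozenset({"A", "B"}): 3,
        frozenset({"A", "C"}): 2,
        frozenset({"B", "D"}): 5,
        frozenset({"C", "D"}): 1,
    }

    def coalition_value(coalition):
        """v(C) = total weight of the edges with both endpoints inside C."""
        c = set(coalition)
        return sum(w for edge, w in weights.items() if edge <= c)

    def shapley(agent):
        """In an induced subgraph game, an agent's Shapley value is half the
        total weight of the edges it is attached to."""
        return sum(w for edge, w in weights.items() if agent in edge) / 2

    print(coalition_value({"A", "B", "C"}))  # 3 + 2 = 5
    print(shapley("B"))                      # (3 + 5) / 2 = 4.0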


Marginal Contribution Nets

• Calculating the Shapley value for marginal contribution nets is similar to that for induced subgraphs
  • Again, Shapley's symmetry axiom applies to each agent
  • The contributions from agents in the same rule are equal
• The additivity property means that:
  • we calculate the Shapley value for each rule
  • and sum over the rules to calculate the Shapley value for each agent
• Handling negative values requires a different method

22 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Coalition Structure Generation

• In addition to representing the characteristic function, there is the challenge of calculating its values!
  • Remember, for a set of n agents Ag, there will be 2^n - 1 distinct coalitions
• Shehory & Kraus (1998) proposed a method whereby the agents distributed the calculation amongst themselves
  • This resulted in a communication overhead, in coordinating which agent calculated the characteristic function value for which coalition
  • Rahwan & Jennings (2007) proposed the DCVC approach for allocating coalition value calculations to agents without the need for communication
  • However, agents could be incentivised to misrepresent the calculations for those coalitions in which they were not a member
  • This was resolved by Riley, Atkinson, Dunne & Payne (2015) through the use of (n,s)-sequences

23 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
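
The marginal contribution nets representation itself (from Ieong & Shoham, this chapter's class reading) is not spelled out in the extracted slides, so the following is only a rough sketch of the idea under that reading: a game is given by a set of pattern-to-value rules, and a coalition's value is the sum of the values of the rules it satisfies. The rules below are made up for illustration.

    # Rough sketch of a marginal contribution net: each rule is
    # (required_agents, forbidden_agents, value); a coalition C satisfies a
    # rule if it contains every required agent and no forbidden one.
    # The rules themselves are made up for illustration.
    rules = [
        ({"a", "b"}, set(), 5),   # "a and b together are worth 5"
        ({"b"}, {"c"}, 2),        # "b without c adds 2"
        ({"c"}, set(), 4),        # "c alone is worth 4"
    ]

    def value(coalition):
        c = set(coalition)
        return sum(val for required, forbidden, val in rules
                   if required <= c and not (forbidden & c))

    print(value({"a", "b"}))       # 5 + 2 = 7
    print(value({"a", "b", "c"}))  # 5 + 4 = 9 ("b without c" no longer fires)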


Summary

• In this lecture we have looked at mechanisms for identifying coalitions.
  • The notion of a stable coalition game was presented, through the idea of the Core.
  • The Shapley Value was then introduced, to determine the contribution that different agents make to a coalition.

• The problem of representing coalitional games and characteristic functions was then discussed, including:
  • Induced Subgraphs
  • Marginal Contribution Nets.

• We finally talked about Coalition Structure Generation.
  • This is again an active research area, especially from a game-theoretic and computational complexity perspective.

Class Reading (Chapter 13):
"Marginal contribution nets: A compact representation scheme for coalition games", S. Ieong and Y. Shoham. Proceedings of the Sixth ACM Conference on Electronic Commerce (EC'05), Vancouver, Canada, 2005.

This is a technical article (but a very nice one), introducing the marginal contribution nets scheme.

28 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


IT4899Q
Multi-Agent Systems
Chapter 14 - Allocating Scarce Resources

Dr. Nguyen Binh Minh
Department of Information Systems

Overview

• Allocation of scarce resources amongst a number of agents is central to multiagent systems.
• A resource might be:
  • a physical object
  • the right to use land
  • computational resources (processor, memory, . . . )

• It is a question of supply vs demand
  • If the resource isn't scarce…, or if there is no competition for the resource...
  • ...then there is no trouble allocating it
  • If demand is greater than supply
  • then we need to determine how to allocate it

2 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Overview

• In practice, this means we will be talking about auctions.
  • These used to be rare (and not so long ago).
  • However, auctions have grown massively with the Web/Internet
  • Frictionless commerce

• It is now feasible to auction things that weren't previously profitable:
  • eBay
  • Adword auctions

3 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

What is an auction

• Auctions are effective in allocating resources efficiently
  • They can also be used to reveal truths about bidders

• An auction is concerned with traders and their allocations of:
  • Units of an indivisible good; and
  • Money, which is divisible.

• Assume some initial allocation.
• Exchange is the free alteration of allocations of goods and money between traders

"... An auction is a market institution in which messages from traders include some price information — this information may be an offer to buy at a given price, in the case of a bid, or an offer to sell at a given price, in the case of an ask — and which gives priority to higher bids and lower asks..."
This definition, as with all this terminology, comes from Dan Friedman.

4 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Types of value

• There are several models, embodying different assumptions about the nature of the good:
  • Private Value / Common Value / Correlated Value

  Private Value
  The good has a value to me that is independent of what it is worth to you. For example: John Lennon's last dollar bill.

  Common Value
  The good has the same value to all of us, but we have differing estimates of what it is. With a common value, there is a risk that the winner will suffer from the winner's curse, where the winning bid in an auction exceeds the intrinsic value or true worth of the item.

  Correlated Value
  Our values are related. The more you're prepared to pay, the more I should be prepared to pay.

• Each trader has a value or limit price that they place on the good.
  • Limit prices have an effect on the behaviour of traders

5 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Auction Protocol Dimensions

• Winner Determination
  • Who gets the good, and what do they pay?
  • e.g. first- vs second-price auctions

• Open Cry vs Sealed-Bid
  • Are the bids public knowledge?
  • Can agents exploit this public knowledge in future bids?

• One-Shot vs Iterated Bids
  • Is there a single bid (i.e. one-shot), after which the good is allocated?
  • If multiple bids are permitted, then:
  • Does the price ascend, or descend?
  • What is the terminating condition?

6 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


English Auction

• This is the kind of auction everyone knows.
  • The typical example is sell-side.
• Buyers call out bids; bids increase in price.
  • In some instances the auctioneer may call out prices, with buyers indicating that they agree to such a price.
• The seller may set a reserve price, the lowest acceptable price.
• The auction ends:
  • at a fixed time (internet auctions); or when there is no more bidding activity.
  • The "last man standing" pays their bid.

Classified in the terms we used above:
  • First-price
  • Open-cry
  • Ascending
Around 95% of internet auctions are of this kind. The classic use is the sale of antiques and artwork.
Susceptible to:
  • Winner's curse
  • Shills

7 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Dutch Auction

• Also called a "descending clock" auction
  • Some auctions use a clock to display the prices.
• Starts at a high price, and the auctioneer calls out descending prices.
  • One bidder claims the good by indicating that the current price is acceptable.
• Ties are broken by restarting the descent from a slightly higher price than the one at which the tie occurred.
• The winner pays the price at which they "stop the clock".

Classified in the terms we used above:
  • First-price
  • Open-cry
  • Descending
High volume (since the auction proceeds swiftly). Often used to sell perishable goods:
  • Flowers in the Netherlands (e.g. Aalsmeer)
  • Fish in Spain and Israel
  • Tobacco in Canada

8 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


First-Price Sealed-Bid Auction

• In an English auction, you get information about how much a good is worth.
  • Other people's bids tell you things about the market.
• In a sealed-bid auction, none of that happens
  • at most you know the winning price after the auction.
• In the First-Price Sealed-Bid (FPSB) auction the highest bid wins as always
  • As its name suggests, the winner pays that highest price (which is what they bid).

Classified in the terms we used above:
  • First-price
  • Sealed Bid
  • One-shot
Governments often use this mechanism to sell treasury bonds (the UK still does, although the US recently changed to second-price sealed bids). Property can also be sold this way (as in Scotland).

9 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Vickrey Auction

• The Vickrey auction is a sealed-bid auction.
  • The winning bid is the highest bid, but the winning bidder pays the amount of the second highest bid.
  • This sounds odd, but it is actually a very smart design.
  • We will talk about why it works later.
• It is in the bidders' interest to bid their true value.
  • incentive compatible, in the usual terminology.
• However, it is not a panacea, as the New Zealand government found out in selling radio spectrum rights
  • Due to interdependencies in the rights, that led to strategic bidding:
  • one firm bid NZ$100,000 for a license, and paid the second-highest price of only NZ$6.

Classified in the terms we used above:
  • Second-price
  • Sealed Bid
  • One-shot
Historically used in the sale of stamps and other paper collectibles.

10 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Why does the Vickrey auction work?

• Suppose you bid more than your valuation.
  • You may win the good.
  • If you do, you may end up paying more than you think the good is worth.
• Suppose you bid less than your valuation.
  • You stand less chance of winning the good.
  • However, even if you do win it, you will end up paying the same.

11 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Proof of dominance of truthful bidding

• Let 𝜐i be the bidding agent i's value for an item, and bi be the agent's bid.
• The payoff for bidder i is 𝜐i - maxj≠i bj if bi > maxj≠i bj (i wins and pays the best rival bid), and 0 otherwise.

• Assume bidder i bids bi > 𝜐i (i.e. overbids)
  • If maxj≠i bj < 𝜐i, then the bidder would win whether or not the bid was truthful. Therefore the strategies of bidding truthfully and overbidding have equal payoffs.
  • If maxj≠i bj > bi, then the bidder would lose whether or not the bid was truthful. Again, both strategies have equal payoffs.
  • If 𝜐i < maxj≠i bj < bi, then the strategy of overbidding would win the auction, but the payoff would be negative (as the bidder will have overpaid). A truthful strategy would yield a payoff of zero.

12 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Proof of dominance of truthful bidding

• Let 𝜐i be the bidding agent i's value for an item, and bi be the agent's bid.
• The payoff for bidder i is as before: 𝜐i - maxj≠i bj if bi > maxj≠i bj, and 0 otherwise.

• Assume bidder i bids bi < 𝜐i (i.e. underbids)
  • If maxj≠i bj > 𝜐i, then the bidder would lose whether or not the bid was truthful. Therefore the strategies of bidding truthfully and underbidding have equal payoffs.
  • If maxj≠i bj < bi, then the bidder would win whether or not the bid was truthful. Again, both strategies have equal payoffs.
  • If bi < maxj≠i bj < 𝜐i, then only the strategy of truth-telling would win the auction, with a positive payoff (as the bidder pays the best rival bid, which is below its valuation). An underbidding strategy would yield a payoff of zero.

13 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Collusion

• None of the auction types discussed so far is immune to collusion
  • A grand coalition of bidders can agree beforehand to collude:
  • propose to artificially lower bids for a good
  • obtain the good for less than its true value
  • share the profit
• An auctioneer could employ bogus bidders
  • Shills could artificially increase bids in open-cry auctions
  • This can result in the winner's curse

14 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
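
To make the case analysis above concrete, here is a small simulation sketch (not from the slides) comparing the average payoff of truthful bidding, overbidding and underbidding in a second-price auction against randomly drawn rival bids; the value and bid offsets are arbitrary.

    import random

    def second_price_payoff(my_bid, my_value, rival_bids):
        """Vickrey payoff: win if my bid is the highest, pay the best rival bid."""
        best_rival = max(rival_bids)
        return my_value - best_rival if my_bid > best_rival else 0.0

    random.seed(0)
    my_value = 10.0
    totals = {"truthful": 0.0, "overbid": 0.0, "underbid": 0.0}
    for _ in range(100_000):
        rivals = [random.uniform(0, 20) for _ in range(3)]
        totals["truthful"] += second_price_payoff(my_value, my_value, rivals)
        totals["overbid"] += second_price_payoff(my_value + 5, my_value, rivals)
        totals["underbid"] += second_price_payoff(my_value - 5, my_value, rivals)

    # Truthful bidding is never beaten on average: overbidding sometimes wins at
    # a loss, while underbidding sometimes misses a profitable win.
    print({k: round(v / 100_000, 3) for k, v in totals.items()})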


Combinatorial Auctions

• A combinatorial auction is an auction for bundles of goods.
• A good example of bundles of goods is spectrum licences.
  • For the 1.7 to 1.72 GHz band for Brooklyn to be useful, you need a license for Manhattan, Queens, and Staten Island.
  • Most valuable are the licenses for the same bandwidth.
  • But a different bandwidth license is more valuable than no license
  • a phone will still work, but it will be more expensive.
• (The FCC spectrum auctions, however, did not use a combinatorial auction mechanism.)

15 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Combinatorial Auctions

• Define a set of items to be auctioned as Z = {z1, . . . , zm}.
• Given a set of agents Ag = {1,...,n}, the preferences of agent i are given by a valuation function 𝜐i : 2^Z → ℝ over bundles of goods.
  • If that sounds to you like it would place a big requirement on agents to specify all those preferences, you would be right.
• If 𝜐i(∅) = 0, then we say that the valuation function for i is normalised.
  • i.e. agent i does not get any value from an empty allocation
• Another useful idea is free disposal: 𝜐i(Z1) ≤ 𝜐i(Z2) whenever Z1 ⊆ Z2.
  • In other words, an agent is never worse off having more stuff.

16 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


Allocation of Goods

• If we design the auction, we get to say how the allocation is determined.
• A desirable property would be to maximize social welfare.
  • i.e. maximise the sum of the utilities of all the agents.

17 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Maximising Social Welfare

• Combinatorial auctions can be viewed as different social choice functions, with different outcomes relating to different allocations of goods.
• We can define a social welfare function over allocations (see below).

18 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
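
The social welfare formula on the slide did not survive extraction; given the allocation notation Z1, . . . , Zn used on the next slide, it is presumably the sum of the agents' valuations of their own bundles:

    sw(Z1, . . . , Zn) = Σi∈Ag 𝜐i(Zi)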


Defining a Combinatorial Auction

• Given this, we can define a combinatorial auction.
  • Given a set of goods Z and a collection of valuation functions 𝜐1,...,𝜐n, one for each agent i ∈ Ag, the goal is to find an allocation Z1*, … Zn* that maximises sw.
• Figuring this out is called the winner determination problem.

19 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Winner Determination

• How do we do this?
• Well, we could get every agent i to declare their valuation: v̂i
  • The hat denotes that this is what the agent says, not what it necessarily is.
  • Remember that the agent may lie!
• Then we just look at all the possible allocations and figure out what the best one is.

• One problem here is representation: the valuations 𝜐i : 2^Z → ℝ are exponential.
  • A naive representation is impractical.
  • In a bandwidth auction with 1122 licenses we would have to specify 2^1122 values for each bidder.
• Searching through all the allocations is computationally intractable.

20 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
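
A brute-force sketch (not from the slides) of winner determination on a toy instance: enumerate every way of giving each good to one of the agents (or to nobody) and keep the allocation with the highest total declared value. The goods and valuation numbers are made up; the point is that this only works for tiny inputs, which is exactly the intractability issue noted above.

    from itertools import product

    goods = ("x", "y")
    # Declared valuations over bundles (illustrative numbers only)
    valuations = {
        1: {frozenset(): 0, frozenset({"x"}): 3, frozenset({"y"}): 1, frozenset({"x", "y"}): 8},
        2: {frozenset(): 0, frozenset({"x"}): 4, frozenset({"y"}): 4, frozenset({"x", "y"}): 5},
    }

    def winner_determination():
        best, best_sw = None, float("-inf")
        # Assign each good independently to agent 1, agent 2, or nobody (None)
        for assignment in product([1, 2, None], repeat=len(goods)):
            bundles = {i: frozenset(g for g, a in zip(goods, assignment) if a == i)
                       for i in valuations}
            sw = sum(valuations[i][bundles[i]] for i in valuations)
            if sw > best_sw:
                best, best_sw = bundles, sw
        return best, best_sw

    print(winner_determination())
    # -> ({1: frozenset({'x', 'y'}), 2: frozenset()}, 8): giving both goods to
    #    agent 1 maximises the (declared) social welfare.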


Bidding Languages

• Rather than exhaustively listing valuations, allow bidders to construct valuations from the bits they want to mention.
• An atomic bid β is a pair (Z, p), where Z is a set of goods and p a price.
  • A bundle Z′ satisfies a bid (Z, p) if Z ⊆ Z′.
  • In other words, a bundle satisfies a bid if it contains at least the things in the bid.
• Atomic bids define valuations: 𝜐β(Z′) = p if Z′ satisfies β, and 0 otherwise.
• Atomic bids alone don't allow us to construct very interesting valuations.

21 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

XOR Bids

• With XOR bids, we pay for at most one bundle.
• A bid β = (Z1, p1) XOR … XOR (Zk, pk) defines a valuation function 𝜐β as follows:
  • I pay nothing if your allocation Z′ doesn't satisfy any of my bids.
  • Otherwise, I will pay the largest price of any of the satisfied bids.
• XOR bids are fully expressive, that is, they can express any valuation function over a set of goods.
  • To do that, we may need an exponentially large number of atomic bids.
  • However, the valuation of a bundle can be computed in polynomial time.

• Example: B1 = ({a,b}, 3) XOR ({c, d}, 5)
  "…I would pay 3 for a bundle that contains a and b but not c and d. I will pay 5 for a bundle that contains c and d but not a and b, and I will pay 5 for a bundle that contains a, b, c and d..."
  From this we can construct the valuation:
  𝜐B1({a}) = 0
  𝜐B1({b}) = 0
  𝜐B1({a, b}) = 3
  𝜐B1({c, d}) = 5
  𝜐B1({a, b, c, d}) = 5

22 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
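
A small sketch (not from the slides) that evaluates an XOR bid, reproducing the B1 example above.

    def xor_valuation(bid, bundle):
        """bid: list of (items, price) atoms combined by XOR.
        Pay the largest price among the satisfied atoms, or 0 if none is satisfied."""
        bundle = set(bundle)
        satisfied = [price for items, price in bid if set(items) <= bundle]
        return max(satisfied, default=0)

    B1 = [({"a", "b"}, 3), ({"c", "d"}, 5)]
    for bundle in [{"a"}, {"a", "b"}, {"c", "d"}, {"a", "b", "c", "d"}]:
        print(sorted(bundle), "->", xor_valuation(B1, bundle))
    # ['a'] -> 0, ['a', 'b'] -> 3, ['c', 'd'] -> 5, ['a', 'b', 'c', 'd'] -> 5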


OR Bids

• With OR bids, we are prepared to pay for more than one bundle.
• A bid β = (Z1, p1) OR … OR (Zk, pk) defines k valuations for different bundles.
• An allocation of goods Z′ is valued by choosing a set W of the atomic bids such that:
  • Every bid in W is satisfied by Z′
  • No good appears in more than one of the chosen bundles; i.e. Zi ∩ Zj = ∅ for all i, j where i ≠ j
  • No other subset W′ satisfying the above conditions has a greater total price.
  The value of Z′ is then the sum of the prices of the bids in W.

23 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

OR Bids

• Here is another example!
  • B3 = ({e, f, g}, 4) OR ({f, g}, 1) OR ({e}, 3) OR ({c, d}, 4)
• This gives us:
  𝜐B3({e}) = 3
  𝜐B3({e, f}) = 3
  𝜐B3({e, f, g}) = 4
  𝜐B3({b, c, d, f, g}) = 4 + 1 = 5
  𝜐B3({a, b, c, d, e, f, g}) = 4 + 4 = 8
  𝜐B3({c, d, e}) = 4 + 3 = 7
• Remember that if more than one bundle is satisfied, then you pay for each of the bundles satisfied.
  • Also remember free disposal, which is why the bundle {e, f} satisfies the bid ({e}, 3): the agent doesn't pay extra for f.

24 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


OR Bids

• OR bids are strictly less expressive than XOR bids
  • Some valuation functions cannot be expressed:
  • e.g. 𝜐({a}) = 1, 𝜐({b}) = 1, 𝜐({a,b}) = 1

• OR bids also suffer from computational complexity
  • Given an OR bid β and a bundle Z, computing 𝜐β(Z) is NP-hard

25 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

Winner Determination

• Determining the winner is a combinatorial optimisation problem (NP-hard)
  • But this is a worst-case result, so it may be possible to develop approaches that are either optimal and run well in many cases, or approximate (within some bounds).
• The typical approach is to code the problem as an integer linear program and use a standard solver.
  • This is NP-hard in principle, but often provides solutions in reasonable time.
  • Several algorithms exist that are efficient in most cases.
• Approximate algorithms have been explored
  • Few solutions have been found with reasonable bounds.
• Heuristic solutions based on greedy algorithms have also been investigated
  • e.g. ones that try to find the largest bid to satisfy, then the next, etc.

26 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


The VCG Mechanism

• Auctions are easy to strategically manipulate
  • In general we don't know whether the agents' declared valuations are their true valuations.
  • Life would be easier if they were…
  • … so can we make them true valuations?

• Yes!
  • In a generalization of the Vickrey auction:
  • the Vickrey/Clarke/Groves Mechanism

• The mechanism is incentive compatible: telling the truth is a dominant strategy.

27 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

The VCG Mechanism

• Recall that we could get every agent i to declare their valuation: v̂i
  • where the hat denotes that this is what the agent says, not what it necessarily is.
  • The agent may lie!

28 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018


The VCG Mechanism

• With the VCG mechanism, each agent pays out the cost (to the other agents) of it having participated in the auction.
  • It is incentive compatible for exactly the same reason as the Vickrey auction was before.
  • No agent can benefit by declaring anything other than its true valuation.

29 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

The VCG Mechanism

• To understand this, think about VCG with a singleton bundle:
  • The only agent that pays anything will be the agent i that has the highest bid.
  • But if it had not participated, then the agent with the second highest bid would have won.
  • Therefore agent i "compensates" the other agents by paying this second highest bid.

• Therefore we get a dominant strategy for each agent that guarantees to maximise social welfare.
  • i.e. social welfare maximisation can be implemented in dominant strategies

30 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
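
A rough sketch (not from the slides) of the payment idea just described, for the single-item case where VCG reduces to the Vickrey second-price rule: the winner pays the value the others lose because of its participation.

    def vcg_single_item(declared_bids):
        """declared_bids: dict agent -> declared value for one indivisible good.
        Returns (winner, payments). With a single item, the 'cost imposed on the
        others' is the best losing bid, so VCG is exactly the second-price rule."""
        winner = max(declared_bids, key=declared_bids.get)
        others = [b for a, b in declared_bids.items() if a != winner]
        payments = {a: 0.0 for a in declared_bids}
        # Without the winner the others could have obtained max(others); with the
        # winner present they obtain 0, so the winner pays the difference.
        payments[winner] = max(others) if others else 0.0
        return winner, payments

    print(vcg_single_item({"i": 10, "j": 7, "k": 4}))
    # -> ('i', {'i': 7.0, 'j': 0.0, 'k': 0.0})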


Summary

• Allocating scarce resources comes down to auctions.
• We looked at a range of different simple auction mechanisms:
  • English auction
  • Dutch auction
  • First-price sealed-bid
  • Vickrey auction
• Then we looked at the popular field of combinatorial auctions.
  • We discussed some of the problems in implementing combinatorial auctions.
• And we talked about the Vickrey/Clarke/Groves mechanism, a rare ray of sunshine on the problems of multiagent interaction.

Class Reading (Chapter 14):
"Expressive commerce and its application to sourcing: How to conduct $35 billion of generalized combinatorial auctions", T. Sandholm. AI Magazine, 28(3): 45-58, 2007.

This gives a detailed case study of a successful company operating in the area of computational combinatorial auctions for industrial procurement.

31 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

31

16
1/3/2022

Cooperative Game Theory


• So far we have taken a game theoretic view of multi-agent
interactions
IT4899 • Prisoner’s Dilemma suggests that cooperation should not occur, as the
conditions required are not present:
Multi-Agent Systems • Binding agreements are not possible
Chapter 13 - Forming Coalitions • Utility is given to individuals based on individual action

Dr. Nguyen Binh Minh


Department of Information Systems • These constraints do not necessarily hold in the real world
• Contracts, or collective payments can facilitate cooperation, leading to
Coalition Games and Cooperative Game Theory
2 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018
1
1 2

1
1/3/2022

Coalitional Games Formalising Cooperative Scenarios


• Coalitional games model scenarios where agents can • A Characteristic Function Game (CFG) is % R epresentation of a Simple
benefit by cooperating. represented as the tuple: G = ⟨ Ag,𝛎 ⟩ % Characteristic F unction Game

% List of Agents
• Sandholm (et. al., 1999) identified the following stages: 1,2,3
% Characteristic F unction
1 =5
Coalitional Structure Solving the optimization Dividing the benefits 2 =5
3 =5
Generation problem of each coalition Deciding “who gets what” in the • From this, we form a coalition C  Ag 1,2 = 10
Deciding in principle who will work Deciding how to work together, and payoff. Coalition members cannot 1,3 = 10
together. It asks the basic question: how to solve the “joint problem” of a ignore each other’s preferences, • Singleton: where a coalition consists of a single member 2,3 = 10
Which coalition should I join? coalition. It also involves finding how because members can defect: • Grand Coalition: where C = Ag (i.e. all of the agents) 1,2,3 = 25

The result: partitions agents into


to maximise the utility of the coalition
itself, and typically involves joint
...if you try to give me a bad payoff, I
can always walk away...
• Each coalition has a payoff value, defined by the
disjoint coalitions. planning etc.
We might want to consider issues
characteristic function 𝛎
The overall partition is a such as fairness of the distribution.
coalition structure. • i.e. if 𝛎 ( C ) = k then the coalition will get the payoff k if
they cooperate on some task
3 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 4 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

3 4

2
1/3/2022

Characteristic Function Games Which Coalition Should I Join?


• The objective is to join a coalition that the agent cannot object to • Assuming that we know the characteristic function and the payoff
• This involves calculating the characteristic function for different games vector, what coalition should an agent join?
• An outcome x for a coalition C in game ⟨Ag, 𝛎 ⟩ is a vector of payoffs to members of C, such
that x = ⟨x1, . . . , xk⟩ which represents an efficient distribution of payoff to members of Ag
• Sandholm (1999) showed that: • Where “efficient” means:
• If the game is superadditive: if 𝛎 (U) + 𝛎 (U) < 𝛎 (U⋃V)
• The coalition that maximises social welfare is the Grand Coalition

• If the game is subadditive: if 𝛎 (U) + 𝛎 (U) >𝛎 (U⋃V) • Example: if 𝛎 ({1, 2}) = 20, then possible outcomes are: ⟨20,0⟩, ⟨19,1⟩, ⟨18,2⟩ … ⟨1,19⟩, ⟨0,20⟩

• The coalitions that maximis social welfare are singletons

• However as some games are neither subadditive or superadditive: • Thus, the agent should only join a coalition C which is:
• the characteristic function value calculations need to be determined for each of the possible coalitions! • Feasible: the coalition C really could obtain some payoff than an agent could not object to; and
• This is exponentially complex
• Efficient: all of the payoff is allocated
5 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 6 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

5 6

3
1/3/2022

Which Coalition Should I Join? Stability and the Core


• However, there may be many coalitions • Stability can be reduced to the notion of the core
• Each has a different characteristic function • Stability is a necessary but not sufficient condition for coalitions to form

• Agents prefer coalitions that are as productive as possible


• i.e. Unstable coalitions will never form, but a stable coalition isn’t guaranteed to form

• Therefore a coalition will only form if all the members


prefer to be in it
• The core of a coalitional game is the set of feasible distributions of payoff to
• I.e. they don’t defect to a more preferable coalition members of a coalition that no sub-coalition can reasonably object to
• Intuitively, a coalition C objects to an outcome if there is some other outcome that makes all of them
strictly better off
• Therefore: • Formally, C  Ag objects to an outcome x = ⟨x1, . . . , xn⟩ for the grand coalition if there is some outcome
• “which coalition should I join?” can be reduced to “is x′ = ⟨x1′, . . . , xk′⟩ for C such that: xi′ > xi for all i  C
the coalition stable?”
• Is it rational for all members of coalition C to stay with C, or could they
benefit by defecting from it? • The idea is that an outcome is not going to happen if somebody objects to it!
• There's no point in me joining a coalition with you, unless you want to • i.e. if the core is empty, then no coalition can form
form one with me, and vice versa.
7 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 8 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

7 8

4
1/3/2022

The Core and Fair Payoffs Sharing the Benefits of Cooperation


Symmetry
• Sometimes the core is non-empty but is it “fair”? • The Shapley value is best known
Agents that make the same contribution should
get the same payoff. I.E. the amount an agent
• Suppose we have Ag = {1, 2}, with the following Characteristic Function: gets should only depend on their contribution.
attempt to define how to divide
benefits of cooperation fairly. Dummy Player
• It does this by taking into account how much an These are agents that never have any synergy
• The outcome ⟨20, 0⟩ (i.e., agent 1 gets everything) will not be in the core, since agent 2 can agent contributes.
with any coalition, and thus only get what they
can earn on their own.
object; by working on its own it can do better, because 𝛎 ({2}) = 5
• The Shapley value of agent i is the average
• However, outcome ⟨14, 6⟩ is in the core, as agent 2 gets more than working on its own, amount that i is expected to contribute to a
and thus has no objection. Additivity
coalition. If two games are combined, the value an agent


gets should be the sum of the values it gets in
The Shapley value is one that satisfies the
• But is it “fair” on agent 2 to get only a payoff of 6, if agent 1 gets 14??? axioms opposite!
the individual games.

9 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 10 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

9 10

5
1/3/2022

Marginal Contribution Shapley Axioms: Symmetry


• The Shapley value for an agent is based on the marginal contribution • Agents that make the same
of that agent to a coalition (for all permutations of coalitions)
contribution should get the same
• Let δi(C) be the amount that agent i adds by joining a coalition C  Ag payoff
• i.e. the marginal contribution of i to C is defined as δi(C) = 𝛎 (C⋃{i}) - 𝛎 (C) • The amount an agent gets should only depend
on their contribution
• Note that if δi(C) = 𝛎 ({i}) then there is no added value from i joining C since the amount
i adds is the same as if i would earn on its own • Agents i and j are interchangeable if δi(C) = δj(C)
for every C  Ag \ {i, j}
• The Shapley value for i, denoted φi, is the value that agent i in Ag is
given in the game ⟨Ag, 𝛎 ⟩ • The symmetry axiom states:
• If i and j are interchangeable, then φi = φj
11 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 12 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

11 12

6
1/3/2022

Shapley Axioms: Dummy Player Shapley Axioms: Additivity


• Agents that never have any synergy • If two games are combined, the value an agent gets should be the
sum of the values it gets in the individual games
with any coalition, and thus only get • I.e. an agent doesn’t gain or loose by playing more than once
what they can earn on their own. • Let G1 = ⟨Ag, 𝛎 1⟩ and G2 = ⟨Ag, 𝛎 2⟩ be games with the same agents
• An agent is a dummy player if δi(C) = 𝛎 ({i}) • Let i  Ag be one of the agents
for every C  Ag \ {i} • Let φ1i and φ2i be the value agent i receives in games G1 and G2 respectively
• i.e. an agent only adds to a coalition what it could get on its
own • Let G1+2 = ⟨Ag, 𝛎 1+2⟩ be the game such that 𝛎 1+2(C) = 𝛎 1(C) + 𝛎 2(C)

• The dummy player axiom states: • The additivity axiom states:


• The value φ1+2i of agent i in game G1+2 should be φ1i + φ2i
• If i is a dummy player, then φi = 𝛎 ({i})
13 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 14 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

13 14

7
1/3/2022

Shapley value Shapley Example


• Recall that we stated:
• The Shapley value for an agent is based on the marginal contribution of that agent to a coalition (for all
permutations of coalitions)
• The marginal contribution can be dependent on the order in which an agent joins a coalition
• This is because an agent may have a larger contribution if it is the first to join, than if it is the last!

• For example, if Ag = {1,2,3} then the set of all possible orderings, 𝚷 (Ag) is given
as
• 𝚷 (Ag) = {(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)}

• We have defined the marginal contribution of i to C as δi(C) = 𝛎 (C⋃{i}) - 𝛎 (C)

• The Shapley value for i is defined as:


15 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

15 16

8
1/3/2022

Representing Coalitional Games Representing Characteristic Functions?


% R epresentation of a Simple
• It is important for an agent to know if % Characteristic F unction Game • Two approaches to this problem:
the core of a coalition is non-empty % List of Agents
1,2,3
• try to find a complete representation that is succinct in “most” cases
• Problem: a naive, obvious representation of a % Characteristic F unction
1 =5
• try to find a representation that is not complete but is always succinct
coalitional game is exponential in the size of Ag. 2 =5
3 =5
• Now such a representation is: 1,2 = 10
1,3 = 10 • A common approach:
• utterly infeasible in practice; and 2,3 = 10
1,2,3 = 25 • interpret characteristic function over a combinatorial structure.
• so large that it renders comparisons to this input size
meaningless

• An n-player game consists of 2n-1 coalitions • We look at two possible approaches:


• e.g. a 100-player game would require 1.2 x 1030 lines • Induced Subgraph and Marginal Contribution Networks
17 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 18 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

17 18

9
1/3/2022

Induced Subgraph Induced Subgraph


3
3 3 • Representation is succinct, but not complete
• Represent 𝛎 as an undirected A B A B
A B
graph on Ag, with integer weights • there are characteristic functions that cannot be captured using

wi,j between nodes i, j  Ag


𝛎 ({A,B,C}) 2 this representation
2 = 2 1
• Value of coalition C is then: 3+2 = 5 C • Determining emptiness of the core is NP-complete
C C D • Checking whether a specific distribution is in the core is co-NP-
4 complete

B 5
• i.e., the value of a coalition C  Ag Weighted Graph • Shapley value can be calculated in polynomial
is the weight of the subgraph time
induced by C 1
D 𝛎 ({B,D}) = 5+1 =
D
𝛎 ({D}) =
6 • i.e. an agent gets half the income from the edges in the
5 5 5
graph to which it is attached.
19 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 20 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

19 20

10
1/3/2022

Marginal Contribution Nets Coalition Structure Generation


• Calculating the Shapley value for • In addition to representing the characteristic function, there is the challenge
of calculating them!
marginal contribution nets is similar
• Remember, for a set of n agents in Ag, there will be 2n-1 distinct coalitions
to that for induced subgraphs
• Again, Shapley’s symmetry axiom applies to each • Shehory & Kraus (1998) proposed a method whereby agents distributed
agent the calculation amongst themselves
• The contributions from agents in the same rule is equal
• Resulted in a communication overhead, in coordinating which agent calculated the characteristic
• The additivity property means that: function value for which coalition
• we calculate the Shapley value for each rule • Rahwan & Jennings (2007) proposed the DVCD approach for allocating coalition value
calculations to agents without the need for communication
• sum over the rules to calculate the Shapley value for each agent
• However, agents could be incentivised to mis-represent the calculations for those coalitions in which they were not a member
• Handling negative values requires a different
• This was resolved by Riley, Atkinson, Dunne & Payne (2015) through the use of (n,s)-sequences
method
22 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 23 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

22 23

11
1/3/2022

Summary
• In this lecture we have looked at mechanisms for identifying Class Reading (Chapter 13):
coalitions.
• The notion of a stable coalition game was presented, through the idea of a
“Marginal contribution nets: A compact
Core.
representation scheme for coalition games”,
• The Shapley Value was then introduced, to determine the contribution that S. Ieong and Y. Shoham. Proceedings of
different agents may have on a coalition. the Sixth ACM Conference on Electronic
Commerce (EC’05), Vancouver, Canada,
2005.
• The problem of representing coalitional games and
characteristic functions was then discussed, including: This is a technical article (but a very
• Induced Subgraphs nice one), introducing the marginal
• Marginal Contribution Nets. contribution nets scheme.

• We finally talked about Coalition Structure Generation


• This is again an active research area, especially from a
game-theoretic and computational complexity perspective.
28 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

28

12
1/18/2022

Overview
• Allocation of scarce resources amongst a number
of agents is central to multiagent systems.
IT4899Q • A resource might be:
Multi-Agent Systems • a physical object
• the right to use land
Chapter 14 - Allocating Scarce Resources • computational resources (processor, memory, . . . )

Dr. Nguyen Binh Minh • It is a question of supply vs demand


Department of Information Systems • If the resource isn’t scarce…, or if there is no competition for
the resource...
• ...then there is no trouble allocating it

• If there is a greater demand than supply


• Then we need to determine how to allocate it
2 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

1 2

1
1/18/2022

Overview What is an auction


• In practice, this means we will be talking • Auctions are effective in allocating resources
about auctions. efficiently
• These used to be rare (and not so long ago). • They also can be used to reveal truths about bidders “... An auction is a market institution in
which messages from traders include
• However, auctions have grown massively with the some price information — this

Web/Internet • Concerned with traders and their allocations of: information may be an offer to buy at a
given price, in the case of a bid, or an
• Frictionless commerce • Units of an indivisible good; and offer to sell at a given price, in the case
of an ask — and which gives priority to
• Money, which is divisible. higher bids and lower asks...”

• Now feasible to auction things that


weren’t previously profitable: • Assume some initial allocation. This definition, as with all this
terminology, comes from Dan Friedman

• eBay • Exchange is the free alteration of allocations of


• Adword auctions goods and money between traders
3 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 4 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

3 4

2
1/18/2022

Types of value Auction Protocol Dimensions


• There are several models, embodying Private Value • Winner Determination
different assumptions about the nature Good has an value to me that is independent • Who gets the good, and what do they pay?
of what it is worth to you. • e.g. first vs second price auctions
of the good. • For example: John Lennon’s last dollar bill.

• Private Value / Common Value / Correlated Value • Open Cry vs Sealed-bid


• Common Value
With a common value, there is a risk that the winner will suffer • Are the bids public knowledge?
from the winner’s curse, where the winning bid in an auction The good has the same value to all of us, but
exceeds the intrinsic value or true worth of an item • Can agents exploit this public knowledge in future bids?
we have differing estimates of what it is.
• Winner’s curse.
• One-shot vs Iterated Bids
• Each trader has a value or limit price Correlated Value • Is there a single bid (i.e. one-shot), after which the good is
allocated?
that they place on the good. Our values are related.
• If multiple bids are permitted, then:
• Limit prices have an effect on the behaviour of • The more you’re prepared to pay, the more
I should be prepared to pay. • Does the price ascend, or descend?
traders • What is the terminating condition?
5 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 6 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

5 6

3
1/18/2022

English Auction Dutch Auction


• This is the kind of auction everyone knows. English Auction
• Also called a “descending clock” auction Dutch Auction
• Typical example is sell-side.
• Some auctions use a clock to display the prices.

• Buyers call out bids, bids increase in price.


• In some instances the auctioneer may call out prices with buyers
• Starts at a high price, and the auctioneer
indicating they agree to such a price. calls out descending prices.

Classified in the terms we used above: Classified in the terms we used above:
One bidder claims the good by indicating the current
• The seller may set a reserve price, the lowest • First-price
• Open-cry price is acceptable.
• First-price
• Open-cry
acceptable price. • Ascending • Descending
Around 95% of internet auctions are of this kind.
• Ties are broken by restarting the descent from a slightly higher High volume (since auction proceeds swiftly). Often
price than the tie occurred at.
The classic use is the sale of antiques and artwork. used to sell perishable goods:
• Auction ends: Susceptible to: • Flowers in the Netherlands (eg. Aalsmeer)
• • Winner’s curse • Fish in Spain and Israel.
• The winner pays the price at which they
at a fixed time (internet auctions); or when there is no more
• Shills • Tobacco in Canada.
bidding activity.
• The “last man standing” pays their bid. “stop the clock”.
7 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 8 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

7 8

4
1/18/2022

First-Price Sealed-Bid Auction Vickrey Auction


• In an English auction, you get information FPSB • The Vickrey auction is a sealed bid auction. Vickrey Auction
about how much a good is worth. • The winning bid is the highest bid, but the winning bidder pays the
• Other people’s bids tell you things about the market.
amount of the second highest bid.

• This sounds odd, but it is actually a very smart design.


• In a sealed bid auction, none of that • Will talk about why it works later.
happens Classified in the terms we used above:

• at most you know the winning price after the auction. • First-price
• Sealed Bid
• It is in the bidders’ interest to bid their true value.
Classified in the terms we used above:
• One-shot • incentive compatible in the usual terminology.
• Second-price
• In the First-Price Sealed-Bid (FPSB) auction
Governments often use this mechanism to sell
• Sealed Bid
treasury bonds (the UK still does, although the US
recently changed to Second-Price sealed Bids).
• However, it is not a panacea, as the New Zealand • One-shot
the highest bid wins as always Property can also be sold this way (as in Scotland). government found out in selling radio spectrum rights Historically used in the sale of stamps and other
paper collectibles.
• As its name suggests, the winner pays that highest price • Due to interdependencies in the rights, that led to strategic bidding,
(which is what they bid). • one firm bid NZ$100,000 for a license, and paid the second-highest price of only NZ$6.
9 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018 10 Copyright: M. J. Wooldridge, S. Parsons and T.R. Payne, Spring 2013. Updated 2018

9 10

5
1/18/2022

Why does the Vickrey auction work? Proof of dominance of truthful bidding
• Suppose you bid more • Suppose you bid less than • Let 𝜐 i be the bidding agent i’s value for an item, and bi be the agent’s bid
than your valuation. your valuation. • The payoff for bidder i is:

• You may win the good. • You stand less chance of winning
• If you do, you may end up paying the good.
more than you think the good is • However, even if you do win it,
worth. you will end up paying the same.
• Assume bidder i bids bi > 𝜐 i (i.e. overbids)
• If maxj≠i bj < 𝜐 i, then the bidder would win whether or not the bid was truthful. Therefore the
strategies of bidding truthfully and overbidding have equal payoffs
• If maxj≠i bj > bi, then the bidder would loose whether or not the bid was truthful. Again, both
strategies have equal payoffs
• If 𝜐 i < maxj≠i bj < bi, then the strategy of overbidding would win the action, but the payoff would
be negative (as the bidder will have overpaid). A truthful strategy would pay zero.
Proof of dominance of truthful bidding (continued)

• Let 𝜐i be the bidding agent i's value for an item, and bi be the agent's bid
• The payoff for bidder i is as before: 𝜐i − maxj≠i bj if i wins, and 0 otherwise.
• Assume bidder i bids bi < 𝜐i (i.e. underbids)
  • If maxj≠i bj > 𝜐i, then the bidder would lose whether or not the bid was truthful. Therefore the strategies of bidding truthfully and underbidding have equal payoffs.
  • If maxj≠i bj < bi, then the bidder would win whether or not the bid was truthful. Again, both strategies have equal payoffs.
  • If bi < maxj≠i bj < 𝜐i, then only the strategy of truthtelling would win the auction, with a positive payoff (as the bidder would have won at a price below their true value). An underbidding strategy would pay zero.

Collusion

• None of the auction types discussed so far are immune to collusion
• A grand coalition of bidders can agree beforehand to collude
  • Propose to artificially lower bids for a good
  • Obtain true value for good
  • Share the profit
• An auctioneer could employ bogus bidders
  • Shills could artificially increase bids in open cry auctions
  • Can result in winner's curse
Combinatorial Auctions

• A combinatorial auction is an auction for bundles of goods.
• A good example of bundles of goods are spectrum licences.
  • For the 1.7 to 1.72 GHz band for Brooklyn to be useful, you need a license for Manhattan, Queens, Staten Island.
  • Most valuable are the licenses for the same bandwidth.
  • But a different bandwidth license is more valuable than no license
    • a phone will work, but will be more expensive.
• (The FCC spectrum auctions, however, did not use a combinatorial auction mechanism)

• Define a set of items to be auctioned as: 𝒵 = {z1, ..., zm}
• Given a set of agents Ag = {1,...,n}, the preferences of agent i are given with the valuation function 𝜐i : 2^𝒵 ⟶ ℝ, assigning a value to every bundle Z ⊆ 𝒵.
  • If that sounds to you like it would place a big requirement on agents to specify all those preferences, you would be right.
• If 𝜐i(∅) = 0, then we say that the valuation function for i is normalised.
  • i.e. Agent i does not get any value from an empty allocation
• Another useful idea is free disposal: if Z1 ⊆ Z2 then 𝜐i(Z1) ≤ 𝜐i(Z2).
  • In other words, an agent is never worse off having more stuff.
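These definitions are easy to make concrete. The sketch below is illustrative only (the dictionary-based representation and the function names are assumptions, not the chapter's notation): it stores a valuation explicitly as a table over bundles and checks normalisation and free disposal. The explicit table also makes the size problem visible, since it needs one entry per subset of the goods.

```python
# A minimal sketch: a valuation as an explicit table over all bundles, plus
# checks for the two properties defined above (normalisation, free disposal).
from itertools import chain, combinations

def all_bundles(goods):
    """All subsets of the set of goods (the powerset)."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(goods, r) for r in range(len(goods) + 1))]

def is_normalised(valuation):
    """v(empty bundle) == 0."""
    return valuation.get(frozenset(), 0) == 0

def has_free_disposal(valuation):
    """Z1 subset of Z2 implies v(Z1) <= v(Z2): more stuff never hurts."""
    bundles = list(valuation)
    return all(valuation[z1] <= valuation[z2]
               for z1 in bundles for z2 in bundles if z1 <= z2)

goods = {"a", "b"}
v = {bundle: (3 if bundle == {"a", "b"} else 0) for bundle in all_bundles(goods)}
print(is_normalised(v), has_free_disposal(v))   # True True
```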
Allocation of Goods

• If we design the auction, we get to say how the allocation is determined.
• Combinatorial auctions can be viewed as different social choice functions, with different outcomes relating to different allocations of goods.

Maximising Social Welfare

• A desirable property would be to maximize social welfare.
  • i.e. maximise the sum of the utilities of all the agents.
• We can define a social welfare function: sw(Z1, ..., Zn) = Σ i∈Ag 𝜐i(Zi)
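As a concrete companion to that definition, here is a minimal sketch (illustrative names only) of the social welfare function as a plain sum of the agents' valuations of their allocated bundles.

```python
# A minimal sketch of the social welfare function: the sum, over all agents,
# of each agent's valuation of its allocated bundle.
def social_welfare(allocation, valuations):
    """allocation: dict agent -> bundle (frozenset of goods)
    valuations: dict agent -> function from bundle to number"""
    return sum(valuations[i](allocation[i]) for i in allocation)

valuations = {1: lambda Z: 3 if "a" in Z else 0, 2: lambda Z: len(Z)}
allocation = {1: frozenset({"a"}), 2: frozenset({"b", "c"})}
print(social_welfare(allocation, valuations))   # 3 + 2 = 5
```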
Defining a Combinatorial Auction

• Given this, we can define a combinatorial auction.
• Given a set of goods 𝒵 and a collection of valuation functions 𝜐1,...,𝜐n, one for each agent i ∈ Ag, the goal is to find an allocation Z1*, …, Zn* that maximises sw.
• Figuring this out is called the winner determination problem.

Winner Determination

• How do we do this?
• Well, we could get every agent i to declare their valuation: v̂i
  • The hat denotes that this is what the agent says, not what it necessarily is.
  • Remember that the agent may lie!
• Then we just look at all the possible allocations and figure out what the best one is (a brute-force sketch follows this list).
• One problem here is representation: valuations are exponential, since 𝜐i : 2^𝒵 ⟶ ℝ.
  • A naive representation is impractical.
  • In a bandwidth auction with 1122 licenses we would have to specify 2^1122 values for each bidder.
• Searching through all possible allocations is computationally intractable.
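To make the "look at all possible allocations" idea concrete, here is a deliberately naive brute-force sketch (illustrative only): it enumerates every assignment of goods to agents and keeps the one with the highest social welfare, which is exactly what becomes intractable as the number of goods grows.

```python
# Brute-force winner determination: each good is given to exactly one agent
# (or to nobody, encoded as None); keep the allocation maximising social welfare.
from itertools import product

def brute_force_wdp(goods, valuations):
    agents = list(valuations)
    best_alloc, best_sw = None, float("-inf")
    for assignment in product(agents + [None], repeat=len(goods)):
        alloc = {i: frozenset(g for g, owner in zip(goods, assignment) if owner == i)
                 for i in agents}
        sw = sum(valuations[i](alloc[i]) for i in agents)   # social welfare
        if sw > best_sw:
            best_alloc, best_sw = alloc, sw
    return best_alloc, best_sw

# Example: agent 1 only values the pair {a, b}; agent 2 only values a.
valuations = {1: lambda Z: 3 if {"a", "b"} <= Z else 0,
              2: lambda Z: 2 if "a" in Z else 0}
print(brute_force_wdp(["a", "b"], valuations))   # agent 1 wins {a, b}, sw = 3
```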
Bidding Languages

• Rather than exhaustive evaluations, allow bidders to construct valuations from the bits they want to mention.
• An atomic bid β is a pair (Z, p) where Z ⊆ 𝒵
• A bundle Z′ satisfies a bid (Z, p) if Z ⊆ Z′.
  • In other words a bundle satisfies a bid if it contains at least the things in the bid.
• Atomic bids define valuations: 𝜐β(Z′) = p if Z′ satisfies (Z, p), and 0 otherwise.
• Atomic bids alone don't allow us to construct very interesting valuations.

XOR Bids

• With XOR bids, we pay for at most one bundle.
• A bid β = (Z1, p1) XOR … XOR (Zk, pk) defines a valuation function 𝜐β as follows:
  • I pay nothing if your allocation Z′ doesn't satisfy any of my bids
  • Otherwise, I will pay the largest price of any of the satisfied bids.
• Example: B1 = ({a, b}, 3) XOR ({c, d}, 5)
  • "…I would pay 3 for a bundle that contains a and b but not c and d. I will pay 5 for a bundle that contains c and d but not a and b, and I will pay 5 for a bundle that contains a, b, c and d…"
  • From this we can construct the valuation:
    𝜐B1({a}) = 0
    𝜐B1({b}) = 0
    𝜐B1({a, b}) = 3
    𝜐B1({c, d}) = 5
    𝜐B1({a, b, c, d}) = 5
• XOR bids are fully expressive, that is they can express any valuation function over a set of goods.
  • To do that, we may need an exponentially large number of atomic bids.
  • However, the valuation of a bundle can be computed in polynomial time.
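A minimal sketch of that polynomial-time computation (illustrative names only): under an XOR bid, the value of a bundle is just the largest price among the satisfied atomic bids, or zero if none is satisfied.

```python
# Value of a bundle under an XOR bid: the largest price of any satisfied
# atomic bid, or 0 if none is satisfied.
def xor_value(xor_bid, bundle):
    """xor_bid: list of (set_of_goods, price) atomic bids; bundle: set of goods."""
    satisfied = [price for goods, price in xor_bid if set(goods) <= set(bundle)]
    return max(satisfied, default=0)

B1 = [({"a", "b"}, 3), ({"c", "d"}, 5)]            # ({a,b}, 3) XOR ({c,d}, 5)
print(xor_value(B1, {"a", "b"}))                   # 3
print(xor_value(B1, {"a", "b", "c", "d"}))         # 5 (largest satisfied price)
print(xor_value(B1, {"a"}))                        # 0
```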
OR Bids

• With OR bids, we are prepared to pay for more than one bundle.
• A bid β = (Z1, p1) OR … OR (Zk, pk) defines k valuations for different bundles.
• An allocation of goods Z′ is assigned a value given a set W of atomic bids such that:
  • Every bid in W is satisfied by Z′
  • No goods appear in more than one bundle; i.e. Zi ∩ Zj = ∅ for all i, j where i ≠ j
  • No other subset W′ satisfying the above conditions is better: 𝜐β(Z′) is then the sum of the prices of the bids in W (a brute-force sketch of this computation follows below).

• Here is another example!
  • B3 = ({e, f, g}, 4) OR ({f, g}, 1) OR ({e}, 3) OR ({c, d}, 4)
• This gives us:
  𝜐B3({e}) = 3
  𝜐B3({e, f}) = 3
  𝜐B3({e, f, g}) = 4
  𝜐B3({b, c, d, f, g}) = 4 + 1 = 5
  𝜐B3({a, b, c, d, e, f, g}) = 4 + 4 = 8
  𝜐B3({c, d, e}) = 4 + 3 = 7
• Remember that if more than one bundle is satisfied, then you pay for each of the bundles satisfied.
• Also remember free disposal, which is why the bundle {e, f} satisfies the bid ({e}, 3) as the agent doesn't pay extra for f.
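The sketch promised above (illustrative only): it searches over all sets of satisfied, pairwise-disjoint atomic bids and returns the best total price, reproducing the 𝜐B3 values listed in the example. The powerset loop also hints at why this computation is hard in general.

```python
# Brute-force value of a bundle under an OR bid: the best total price over
# all sets W of satisfied, pairwise-disjoint atomic bids.
from itertools import combinations

def or_value(or_bid, bundle):
    """or_bid: list of (set_of_goods, price); bundle: set of goods."""
    satisfied = [(frozenset(g), p) for g, p in or_bid if set(g) <= set(bundle)]
    best = 0
    for r in range(1, len(satisfied) + 1):
        for W in combinations(satisfied, r):
            goods_used = [g for g, _ in W]
            disjoint = all(g1.isdisjoint(g2)
                           for i, g1 in enumerate(goods_used)
                           for g2 in goods_used[i + 1:])
            if disjoint:
                best = max(best, sum(p for _, p in W))
    return best

B3 = [({"e", "f", "g"}, 4), ({"f", "g"}, 1), ({"e"}, 3), ({"c", "d"}, 4)]
print(or_value(B3, {"c", "d", "e"}))                        # 7 (= 4 + 3)
print(or_value(B3, {"a", "b", "c", "d", "e", "f", "g"}))    # 8 (= 4 + 4)
```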
OR Bids (continued)

• OR bids are strictly less expressive than XOR bids
• Some valuation functions cannot be expressed:
  • 𝜐({a}) = 1, 𝜐({b}) = 1, 𝜐({a, b}) = 1
• OR bids also suffer from computational complexity
  • Given an OR bid β and a bundle Z, computing 𝜐β(Z) is NP-hard

Winner Determination

• Determining the winner is a combinatorial optimisation problem (NP-hard)
  • But this is a worst case result, so it may be possible to develop approaches that are either optimal and run well in many cases, or approximate (within some bounds).
• Typical approach is to code the problem as an integer linear program and use a standard solver.
  • This is NP-hard in principle, but often provides solutions in reasonable time.
  • Several algorithms exist that are efficient in most cases
• Approximate algorithms have been explored
  • Few solutions have been found with reasonable bounds
• Heuristic solutions based on greedy algorithms have also been investigated
  • e.g. that try to find the largest bid to satisfy, then the next, etc. (a sketch follows below)
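One simple reading of that greedy idea is sketched below (illustrative only; real heuristics typically also weight bids by bundle size and other factors): repeatedly accept the highest-priced remaining bid whose goods are still free. The result can be far from optimal, which is the price paid for speed.

```python
# A greedy heuristic for winner determination over atomic bids: accept bids in
# decreasing order of price, skipping any bid that conflicts with goods already
# allocated.
def greedy_wdp(bids):
    """bids: list of (bidder, set_of_goods, price). Returns accepted bids."""
    accepted, allocated = [], set()
    for bidder, goods, price in sorted(bids, key=lambda b: b[2], reverse=True):
        if not (set(goods) & allocated):          # goods still unallocated?
            accepted.append((bidder, goods, price))
            allocated |= set(goods)
    return accepted

bids = [("ann", {"a", "b"}, 5), ("bob", {"b", "c"}, 4), ("carol", {"c"}, 2)]
print(greedy_wdp(bids))   # ann gets {a,b} and carol gets {c}; bob's bid conflicts
```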
The VCG Mechanism

• Auctions are easy to strategically manipulate
• In general we don't know whether the agents' valuations are true valuations.
  • Recall that we could get every agent i to declare their valuation v̂i, where the hat denotes that this is what the agent says, not what it necessarily is. The agent may lie!
• Life would be easier if they were…
• … so can we make them true valuations?
• Yes!
  • In a generalization of the Vickrey auction.
  • The Vickrey/Clarke/Groves Mechanism
• Mechanism is incentive compatible: telling the truth is a dominant strategy.
The VCG Mechanism (continued)

• With the VCG, each agent pays out the cost (to the other agents) of it having participated in the auction.
  • It is incentive compatible for exactly the same reason as the Vickrey auction was before.
  • No agent can benefit by declaring anything other than its true valuation
• To understand this, think about VCG with a singleton bundle
  • The only agent that pays anything will be the agent i that has the highest bid
  • But if it had not participated, then the agent with the second highest bid would have won
  • Therefore agent i "compensates" the other agents by paying this second highest bid
• Therefore we get a dominant strategy for each agent that guarantees to maximise social welfare.
  • i.e. social welfare maximisation can be implemented in dominant strategies
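The payment rule described above can be sketched in a few lines (illustrative only, reusing a brute-force allocation search, so it is only sensible for tiny examples): each agent pays the welfare the *other* agents would have obtained had it not taken part, minus the welfare the others obtain in the chosen allocation. On a single item with truthful declarations this reduces to the Vickrey auction.

```python
# A minimal VCG sketch: compute the welfare-maximising allocation, then charge
# each agent the externality it imposes on the others.
from itertools import product

def best_allocation(goods, valuations):
    agents = list(valuations)
    best, best_sw = None, float("-inf")
    for assignment in product(agents + [None], repeat=len(goods)):
        alloc = {i: frozenset(g for g, owner in zip(goods, assignment) if owner == i)
                 for i in agents}
        sw = sum(valuations[i](alloc[i]) for i in agents)
        if sw > best_sw:
            best, best_sw = alloc, sw
    return best, best_sw

def vcg_payments(goods, valuations):
    alloc, _ = best_allocation(goods, valuations)
    payments = {}
    for i in valuations:
        others = {j: v for j, v in valuations.items() if j != i}
        _, sw_without_i = best_allocation(goods, others)
        sw_others_with_i = sum(valuations[j](alloc[j]) for j in others)
        payments[i] = sw_without_i - sw_others_with_i
    return alloc, payments

# Single item "a": with truthful declarations this is just a Vickrey auction.
valuations = {1: lambda Z: 10 if "a" in Z else 0,
              2: lambda Z: 7 if "a" in Z else 0}
print(vcg_payments(["a"], valuations))   # agent 1 wins and pays 7; agent 2 pays 0
```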
Summary

• Allocating scarce resources comes down to auctions
• We looked at a range of different simple auction mechanisms.
  • English auction
  • Dutch auction
  • First price sealed bid
  • Vickrey auction
• Then we looked at the popular field of combinatorial auctions.
  • We discussed some of the problems in implementing combinatorial auctions.
• And we talked about the Vickrey/Clarke/Groves mechanism, a rare ray of sunshine on the problems of multiagent interaction.

Class Reading (Chapter 14):

"Expressive commerce and its application to sourcing: How to conduct $35 billion of generalized combinatorial auctions", T. Sandholm. AI Magazine, 28(3): 45-58 (2007).

This gives a detailed case study of a successful company operating in the area of computational combinatorial auctions for industrial procurement.
COMP310
Multi-Agent Systems
Chapter 16 - Argumentation

A/Prof. Nguyen Binh Minh Ph.D.
Department of Computer Science

Overview

• How do agents agree on what to believe?
  • In a court of law, barristers present a rationally justifiable position based on the arguments put forward.
  • If all of the evidence and arguments were consistent, there would be no disagreement
  • But often, the positions are contradictory or inconsistent
  • We need principled techniques for dealing with inconsistency
• Argumentation involves dealing with inconsistencies with beliefs of multiple agents
  • Sometimes they are obvious
    • I believe p; but you believe ¬p
  • Other times they are implicit
    • I believe p, and p ⟶ q. However, you believe ¬q
Argumentation

• Argumentation provides principled techniques for resolving inconsistency.
  • Or at least, sensible rules for deciding what to believe in the face of inconsistency.
• If we are presented with p and ¬p it is not clear what we should believe.
  • There can be many rational positions, so which is the best?
  • If I believe p and you believe ¬p then ∅ is a rational position
  • Or we could just accept one and discard the other (i.e. {p} or {¬p})
• We focus here on two approaches to automated argumentation
  • Abstract argumentation, which examines how arguments co-exist
  • Deductive argumentation, which exploits logical reasoning

Types of Argument

• Argumentation involves putting forward arguments for and against propositions
  • together with justifications for these arguments
• Michael Gilbert (1994) identified four modes of argument in human argumentation:
  • Logical mode — akin to a proof, and is deductive in nature. "If you accept that A and that A implies B, then you must accept that B".
  • Emotional mode — appeals to feelings and attitudes. "How would you feel if it happened to you?"
  • Visceral mode — the physical and social aspect of human reasoning; e.g. stamping one's feet to indicate strength of feeling. "Cretin!"
  • Kisceral mode — appeals to the mystical or religious. "This is against Christian teaching!"
• Typically, law courts prohibit emotional and visceral modes of argumentation
  • But in other contexts (e.g. in families) emotional arguments may be permissible
Abstract Argumentation

• An abstract argument system is:
  • A collection of arguments together with a relation "⟶" saying what attacks what.
  • Systems like this are called Dung-style (or Dungian) after their inventor.
• Arguments are presented as abstract symbols
  • The meaning of an argument is irrelevant
• If accepting one argument q means rejecting another argument p, we say that:
  • q attacks argument p
  • q is a counterexample of p; or
  • q is an attacker of p.
  • This can be written as (q, p) or alternatively q ⟶ p
  • However, we are not actually concerned as to what p and q are.
• If this seems too abstract, here are some arguments we'll be looking at:
  • p : Since the weather today is sunny, I'm going to go out on my bike.
  • q : Since today is a weekday and I have to go to work, I can't go out on my bike.
  • r : Since today is a holiday, I don't have to go to work.
  • s : Since I took the day off, I don't have to go to work.

Dung's Argumentation System

• A set of Dung-style arguments is represented as a tuple ⟨Σ, ⊳⟩:
  • Σ is a (possibly inconsistent) set of arguments
  • ⊳ is a set of attacks between arguments in Σ
  • (𝜑, ψ) ∈ ⊳ denotes the relationship: 𝜑 attacks ψ
• For example: ⟨{p, q, r, s}, {(r, q), (s, q), (q, p)}⟩
  • There are four arguments, p, q, r, s (as above)
  • There are three attacks:
    • r attacks q
    • s attacks q
    • q attacks p
• The question is, given this, what should we believe?
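A minimal sketch (illustrative only) of how such a framework can be represented in Python, using the p/q/r/s running example:

```python
# A Dung-style framework <Sigma, attacks> as a set of argument names and a set
# of (attacker, target) pairs, with a helper to look up attackers.
arguments = {"p", "q", "r", "s"}
attacks = {("r", "q"), ("s", "q"), ("q", "p")}

def attackers_of(arg):
    """All arguments that attack `arg`."""
    return {a for a, b in attacks if b == arg}

print(attackers_of("q"))   # {'r', 's'}
print(attackers_of("p"))   # {'q'}
```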
Conflict Free Positions

• A position S ⊆ Σ is a set of arguments
  • A position can be inconsistent - it is just a selection of arguments
• A position S is conflict free if no member of S attacks another member of S.
  • The position is internally consistent
• If an argument a is attacked by another a′, then it is defended by a′′ if a′′ attacks a′
• The conflict-free sets in the p/q/r/s system are:
  • ∅, {p}, {q}, {r}, {s}, {r, s}, {p, r}, {p, s}, {r, s, p}

Mutually Defensive Positions

• A position S is mutually defensive if every element of S that is attacked is defended by some element of S.
  • Self-defence is allowed
• These positions are mutually defensive:
  • ∅, {r}, {s}, {r, s}, {p, r}, {p, s}, {r, s, p}
  • For example, {p, r} is mutually defensive: its only attacked member is p (attacked by q), and r defends p because r attacks q.
• Note that {p} and {q} are not mutually defensive
  • p is attacked by q and nothing in {p} attacks q; p's defenders, r and s, lie outside the position
  • q is attacked by r and s and has no defender in {q}
Admissible Positions

• A position that is conflict free and mutually defensive is admissible.
  • Recall that a position S is conflict free if no member of S attacks another member of S.
  • Also recall that a position S is mutually defensive if every element of S that is attacked is defended by some element of S.
• All of the following positions are admissible:
  • ∅, {r}, {s}, {r, s}, {p, r}, {p, s}, {r, s, p}
• Admissibility is a minimal notion of a reasonable position:
  • It is internally consistent and defends itself against all attackers
  • It is a coherent, defendable position

Preferred Extension

• A preferred extension is a maximal admissible set.
  • Adding another argument will make it inadmissible.
• A position S is a preferred extension if S is admissible and no superset of S is admissible.
  • Thus, in our example, ∅ is not a preferred extension, because it has admissible supersets such as {r}.
  • {p, r, s} is a preferred extension because adding q (the only remaining argument) would make it inadmissible.
• A set of arguments always has a preferred extension
  • The empty set ∅ is always an admissible position.
  • If there are no other admissible positions, then it will be the maximal admissible set.
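For small frameworks these definitions can be checked by brute force. The sketch below (illustrative only) enumerates all positions of the running example and filters for the conflict-free, admissible and preferred (maximal admissible) ones.

```python
# Brute-force enumeration of positions for the p/q/r/s framework.
from itertools import chain, combinations

arguments = {"p", "q", "r", "s"}
attacks = {("r", "q"), ("s", "q"), ("q", "p")}

def positions(args):
    """All subsets of the set of arguments."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(args, r) for r in range(len(args) + 1))]

def conflict_free(S):
    return not any((a, b) in attacks for a in S for b in S)

def defends(S, a):
    """Every attacker of a is itself attacked by some member of S."""
    return all(any((d, attacker) in attacks for d in S)
               for attacker, target in attacks if target == a)

def admissible(S):
    return conflict_free(S) and all(defends(S, a) for a in S)

admissible_sets = [S for S in positions(arguments) if admissible(S)]
preferred = [S for S in admissible_sets
             if not any(S < T for T in admissible_sets)]
print([set(S) for S in preferred])   # [{'p', 'r', 's'}] for this example
```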
Preferred Extension: pathological cases

• The following examples are pathological cases:
  • [Figure: two arguments a and b attacking each other] These two arguments are mutually attacking. As either could attack the other, there are two preferred extensions: {a} and {b}.
  • [Figure: a three-argument cycle a ⟶ b ⟶ c ⟶ a] With an odd number of arguments attacking in a cyclic pattern, there can be no consistent state. Thus, the preferred extension is ∅.
  • [Figure: a and b attack each other, both attack c, and c attacks d] In this case, a and b are mutually attacking, and thus there will be at least two preferred extensions. As they both attack c, d is defended. Therefore, we have the two extensions: {a, d} and {b, d}.

Preferred Extension: a larger example

• With a larger set of arguments it is exponentially harder to find the preferred extension.
  • n arguments have 2^n possible positions.
• [Figure: an eight-argument framework over a, b, c, d, e, f, g, h]
• The set of arguments above has two preferred extensions: {a, b, d, f} and {c, e, g, h}.
  • Note that d and e mutually attack each other.
  • Therefore we have two maximal admissible sets, depending on whether d attacks e, or e attacks d.
Preferred Extension (continued)

• In contrast:
  • [Figure: a modified eight-argument framework over a, b, c, d, e, f, g, h]
  • The set of arguments above has only one preferred extension: {a, b, d, f}
  • Both c and e are now attacked by d and neither is defended
  • Therefore neither can be within an admissible set

Credulous and sceptical acceptance

• To improve on preferred extensions we can define:
  • An argument is sceptically accepted if it is a member of every preferred extension; and
  • An argument is credulously accepted if it is a member of at least one preferred extension
• Clearly anything that is sceptically accepted is also credulously accepted.
• In our original example (the p/q/r/s framework):
  • p, r and s are all sceptically accepted
  • q is neither sceptically nor credulously accepted
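Given the preferred extensions, both notions are simple set operations, as in this small sketch (illustrative only; the extensions used below are the ones computed in the examples above).

```python
# Sceptical acceptance: in every preferred extension (intersection).
# Credulous acceptance: in at least one preferred extension (union).
def sceptically_accepted(preferred_extensions):
    return set.intersection(*preferred_extensions) if preferred_extensions else set()

def credulously_accepted(preferred_extensions):
    return set.union(*preferred_extensions) if preferred_extensions else set()

print(sceptically_accepted([{"p", "r", "s"}]))    # {'p', 'r', 's'}
print(sceptically_accepted([{"a"}, {"b"}]))       # set(): neither is in every extension
print(credulously_accepted([{"a"}, {"b"}]))       # {'a', 'b'}
```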
Grounded Extension

• A grounded extension is the least questionable set.
  • Accept only the arguments that one cannot avoid accepting
  • Reject only the arguments that one cannot avoid rejecting
  • Abstain as much as possible.
• This gives rise to the most sceptical (or least committed) semantics
• Arguments are guaranteed to be acceptable if they aren't attacked.
  • There is no reason to doubt them - they are IN
• Arguments attacked by those that are IN are therefore unacceptable
  • They are OUT — delete them from the graph.
• Continue until the graph doesn't change.
• The grounded extension is the set of IN arguments
  • The grounded extension for our p/q/r/s example is {r, s, p}
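The labelling procedure just described is easy to turn into code. The sketch below (illustrative only) repeatedly marks unattacked arguments IN, deletes everything they attack, and stops when nothing changes; on the p/q/r/s example it returns {r, s, p} as stated above.

```python
# Grounded extension via the IN/OUT labelling procedure described above.
def grounded_extension(arguments, attacks):
    args, atts = set(arguments), set(attacks)
    IN = set()
    changed = True
    while changed:
        changed = False
        unattacked = {a for a in args if not any(t == a for _, t in atts)}
        new_in = unattacked - IN
        if new_in:
            IN |= new_in
            # Everything attacked by an IN argument is OUT: delete it from the graph.
            OUT = {t for s, t in atts if s in IN}
            args -= OUT
            atts = {(s, t) for s, t in atts if s in args and t in args}
            changed = True
    return IN

print(grounded_extension({"p", "q", "r", "s"},
                         {("r", "q"), ("s", "q"), ("q", "p")}))   # {'r', 's', 'p'}
```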
Grounded Extensions

• [Figure: a larger framework over the arguments a, b, c, d, e, f, g, h, i, j, k, l, m, n, p, q]
• Consider computing the grounded extension of this graph.
• We can say that:
  • h is not attacked, so h is IN.
  • h is IN and attacks a, so a is OUT.
  • h is IN and attacks p, so p is OUT.
  • h is IN and attacks e, so e is OUT.
  • p is OUT and is the only attacker of q, so q is IN.
  • g is not attacked, so g is IN.
  • g is IN and attacks d, so d is OUT.
  • g is IN and attacks p (which is also attacked by h), so p is OUT.
  • b is no longer attacked, and so b is IN.
• We can't say anything about:
  • m, k, l as they attack each other in a cycle
  • c as the status of m is not known
  • i, j as they mutually attack each other
  • n as the status of i or j is not definitively known
  • f as the status of n is unknown
• The grounded extension is {b, g, h, q}
Full Example #1

• [Figure: a framework over the arguments A, B, C, D, E]
• Conflict Free:
  • ∅, {A, D}, {A, E}, {B, C}
  • These are the only positions that exist with no attack relations
• Mutually Defensive:
  • ∅, {B, C}
  • {A, D} is not mutually defensive, because neither are defended from C
  • {A, E} is not mutually defensive, because A does not defend E from an attack by D
• Admissible:
  • ∅, {B, C}
  • These are the only positions that are both conflict free and mutually defensive
• Preferred Extensions:
  • {B, C}
• Credulously & Sceptically Accepted:
  • B, C
• Grounded Extension:
  • ∅
  • Every argument is attacked by at least one other argument, so it is not possible to determine any arguments that are IN (and consequently other arguments that are OUT)

Full Example #2

• [Figure: a framework over the arguments b, c, d, e, f, g, h]
• Admissible:
  • ∅, {b, d}, {c, e}, {e, h}, {d, f}, {d, h}, {b, d, f}, {b, d, h}, {c, e, h}, {d, f, h}, {b, d, f, h}
• Preferred Extensions:
  • {b, d, f, h}
  • {c, e, h}
• Credulously Accepted:
  • b, c, d, e, f
• Sceptically Accepted:
  • h
• Grounded Extension:
  • ∅
Deductive Argumentation

• Abstract argumentation models arguments as atomic, indivisible entities
  • However, arguments have a structure, which can be exploited when reasoning
• In deductive argumentation, the arguments are modelled using logical formulae
  • Argumentation models defeasible reasoning
• [Toulmin's argument scheme: the Support, together with a Qualifier, leads to the Claim; the step is licensed by a Warrant, which rests on Backing, unless a Rebuttal applies]
• Conclusions can be rebutted; premises (and warrants) can be challenged.

• Stephen Toulmin was a British philosopher who devoted his work to the analysis of moral reasoning. Throughout his writings, he sought to develop practical arguments which can be used effectively in evaluating the ethics behind moral issues. His works were later found useful in the field of rhetoric for analyzing rhetorical arguments.

• The elements of Toulmin's scheme:
  • Claim (Conclusion): A conclusion whose merit must be established.
  • Ground/Support (Fact, Evidence, Data): A fact one appeals to as a foundation for the claim.
  • Warrant (Rule, Axiom): A statement authorising movement from the ground to the claim.
  • Backing: Credentials designed to certify the statement expressed in the warrant; backing must be introduced when the warrant itself is not convincing enough to the readers or the listeners.
  • Rebuttal: Statements recognising the restrictions which may legitimately be applied to the claim.
  • Qualifier: Words or phrases expressing the speaker's degree of force or certainty concerning the claim. Such words or phrases include "probably", "possible", "impossible", "certainly", etc.
Deductive Argumentation (continued)

• The basic form of deductive arguments is: Σ ⊢ (S, p)
  • Σ is a (possibly inconsistent) set of logical formulae;
  • p is a sentence or proposition, i.e. a logical formula known as the conclusion; and
  • S is the grounds or support, i.e. a set of logical formulae such that:
    • S ⊆ Σ
    • S ⊢ p; and
    • There is no S′ ⊂ S such that S′ ⊢ p
• Often we just write the argument as (S, p)
• Example:
  Σ = {
    human(Socrates)
    human(Heracles)
    father(Heracles, Zeus)
    father(Apollo, Zeus)
    divine(X) ➝ ¬mortal(X)
    human(X) ➝ mortal(X)
    father(X, Zeus) ➝ divine(X)
    ¬(father(X, Zeus) ➝ divine(X))
  }
  • Therefore, the following argument Arg1 holds:
    Arg1 = ({human(Socrates), human(X) ➝ mortal(X)}, mortal(Socrates))
  • I.e. S = {human(Socrates), human(X) ➝ mortal(X)} and p = mortal(Socrates)

• Argumentation takes into account the relationship between arguments.
  • Let (S1, p1) and (S2, p2) be arguments from some database Σ
  • Then (S1, p1) can be attacked in one of two ways:
• Rebut
  • (S2, p2) rebuts (S1, p1) if p2 ≡ ¬p1.
  • i.e. the conclusions attack or contradict each other
• Undercut
  • (S2, p2) undercuts (S1, p1) if p2 ≡ ¬q1 for some q1 ∈ S1.
  • i.e. the conclusion p2 attacks some formula q1 in the support for p1
• Example (using the same Σ):
  • Given the argument Arg2:
    Arg2 = ({human(Heracles), human(X) ➝ mortal(X)}, mortal(Heracles))
  • The argument Arg3 rebuts Arg2:
    Arg3 = ({father(Heracles, Zeus), father(X, Zeus) ➝ divine(X), divine(X) ➝ ¬mortal(X)}, ¬mortal(Heracles))
  • The argument Arg4 undercuts Arg3:
    Arg4 = ({¬(father(X, Zeus) ➝ divine(X))}, ¬(father(X, Zeus) ➝ divine(X)))
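A much-simplified sketch of rebut and undercut (illustrative only: formulae are plain strings and "is the negation of" is matched purely syntactically, so this is nowhere near a real theorem prover), reproducing the Heracles example:

```python
# Arguments are (support, conclusion) pairs; negation is checked syntactically,
# allowing the negated body to be wrapped in parentheses.
def contradicts(f1, f2):
    for a, b in ((f1, f2), (f2, f1)):
        if a.startswith("¬"):
            body = a[1:]
            if body.startswith("(") and body.endswith(")"):
                body = body[1:-1]
            if body == b:
                return True
    return False

def rebuts(arg2, arg1):
    """(S2, p2) rebuts (S1, p1) if the conclusions contradict each other."""
    return contradicts(arg2[1], arg1[1])

def undercuts(arg2, arg1):
    """(S2, p2) undercuts (S1, p1) if p2 contradicts some member of S1."""
    return any(contradicts(arg2[1], q) for q in arg1[0])

arg2 = ({"human(Heracles)", "human(X)➝mortal(X)"}, "mortal(Heracles)")
arg3 = ({"father(Heracles,Zeus)", "father(X,Zeus)➝divine(X)",
         "divine(X)➝¬mortal(X)"}, "¬mortal(Heracles)")
arg4 = ({"¬(father(X,Zeus)➝divine(X))"}, "¬(father(X,Zeus)➝divine(X))")

print(rebuts(arg3, arg2))      # True: conclusions contradict each other
print(undercuts(arg4, arg3))   # True: arg4's conclusion negates part of arg3's support
```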
Attack and Defeat

• Deductive argumentation connects to the abstract ideas we were just looking at.
• A rebuttal or undercut between two arguments becomes the attack in a Dungian system.
  • Note that a rebut is symmetrical
  • Causes problems with some kinds of extension.
• Once we have identified attacks, we can look at preferred extensions or grounded extensions to determine what arguments to accept.

Another Example

• Argument x: here is one deductive argument.
  • a denotes "We recycle"
  • b denotes "We save resources"
  • a → b denotes "If we recycle, then we save resources"
  • Formally we get: ({a, a → b}, b)
• Argument y: a second argument, that conflicts with the first.
  • c denotes "Recycled products are not used"
  • a ∧ c → ¬b denotes "If we recycle and recycled products are not used then we don't save resources"
  • Formally we get: ({a, c, a ∧ c → ¬b}, ¬b)
• Argument z: a third argument, that conflicts with the second.
  • d denotes "We create more desirable recycled products"
  • d → ¬c denotes "If we create more desirable recycled products then recycled products are used"
  • Formally we get: ({d, d → ¬c}, ¬c)
• x and y rebut each other. z undercuts y.
Different Dialogues

• With an appropriate choice of language, argumentation can capture all of these kinds of dialogue:
  • Information seeking (Personal ignorance): "Tell me if p is true."
  • Inquiry (General ignorance): "Can we prove p?"
  • Persuasion (Conflict of opinions): "You're wrong to think p is true."
  • Negotiation (Conflict of interest): "How do we divide the pie?"
  • Deliberation (Need for action): "Where shall we go for dinner?"

Persuasion Dialogues

• We have two agents, P and C, each with some knowledge base, ΣP and ΣC.
• Each time one makes an assertion, it is considered to be an addition to its commitment store, CS(P) or CS(C).
  • Thus: P can build arguments from ΣP ∪ CS(C) …
  • … and C can use ΣC ∪ CS(P).
• Commitment stores are information that the agent has made public.
• We assume that dialogues start with P making the first move.
• The outcomes, then, are:
  • P generates an argument both classify as IN, or
  • C makes P's argument OUT.
• Can use this for negotiation if the language allows you to express offers.
Persuasion Dialogues (continued)

• A typical persuasion dialogue would proceed as follows:
  • P has an acceptable argument (S, p), built from ΣP, and wants C to accept p.
  • P asserts p.
  • C has an argument (S′, ¬p).
  • C asserts ¬p.
  • P cannot accept ¬p and challenges it.
  • C responds by asserting S′.
  • P has an argument (S′′, ¬q) where q ∈ S′, and challenges q.
  • ...
• This process eventually terminates when
  ΣP ∪ CS(P) ∪ CS(C)  and  ΣC ∪ CS(C) ∪ CS(P)
  provide the same set of IN arguments and the agents agree.
• Clearly here we are looking at grounded extensions.
Summary

• This chapter has looked at argumentation as a means through which agents can reach agreement.
• Argumentation allows for more complex interactions than the negotiation mechanisms we looked at last chapter.
• Argumentation can be used for a range of tasks that include negotiation.
  • Also allows for inquiry, persuasion, deliberation.

Class Reading (Chapter 16):

"An introduction to argumentation semantics", Pietro Baroni, Martin Caminada and Massimiliano Giacomin, The Knowledge Engineering Review, 26(4): 365-410, December 2011.

This paper reviews Dung's original notions of complete, grounded, preferred, and stable semantics, as well as subsequently proposed notions like semi-stable, ideal, stage, and CF2 semantics, considering both the extension-based and the labelling-based approaches with respect to their definitions.