INTRODUCTION TO MULTIAGENT SYSTEMS, 2012
If you have read any of my work before, you already know the drill: this document is a best-effort, brief,
inconclusive summary of the course’s material that I compiled for my own use. If it happens to help
you, that’s great, but don’t blame me for missing something. Comments are welcome at sashag@cs.
This text is organized in the same order the material was presented in class. However, I used the
second edition of Wooldridge’s “Introduction to Multiagent Systems” book, so the book’s chapters
are not presented in order.
Trends in computing:
Operate independently
Represent our best interests
Cooperate, negotiate and reach agreements
An agent is a computer program capable of independent action on behalf of its owner. It figures out
what it has to do by itself; it does not have to be told. A multiagent system consists of several
interacting agents that cooperate, coordinate and negotiate.
The field of multiagent systems deals with agent design (micro) and society design (macro). The
field borrows from:
Introduction to Multiagent Systems 2012 Sasha Goldshtein, sashag@cs
Agents must be prepared for failure because they have incomplete control of their environment.
Classification of environments:
Additional challenges:
AN INTELLIGENT AGENT
Agents must strike a balance between reactive and proactive behavior; must be flexible.
Intentional stance: describe agents using mental states – beliefs, desires, intentions:
Useful when the system is sufficiently complex and its structure is not completely known
Not useful if we have a simpler, mechanistic description
FORMAL MODEL
Environment states E
Agent actions Ac
Run – sequence of environment states and actions
State transformer function τ – maps a run that ends with an agent action to a set of
possible environment states (non-deterministic)
Environment Env – triple of E, initial state, τ
Agent Ag – maps runs that end with an environment state to an action (deterministic)
o Purely reactive agent – function from E to Ac (no history)
o Stateful agent – maintains state, the function is from state to Ac
Set of runs of agent in environment R(Ag, Env) – terminated runs only
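The distinction between purely reactive and stateful agents can be sketched in code (a toy illustration; the states, actions and names are mine, not part of the formal model):

```python
# Minimal sketch of the run-based formal model: a purely reactive agent
# maps the current environment state to an action, while a stateful agent
# threads internal state through its decisions. All names are illustrative.

def reactive_agent(state):
    """Purely reactive: a function from E to Ac, no history."""
    return "flee" if state == "danger" else "explore"

def stateful_agent(internal, state):
    """Stateful: updates internal state from the percept, then acts on it."""
    internal = internal + [state]          # next-state function
    action = "flee" if "danger" in internal else "explore"
    return internal, action

# A run is an interleaved sequence of environment states and actions.
run = []
internal = []
for state in ["calm", "danger", "calm"]:
    internal, action = stateful_agent(internal, state)
    run += [state, action]
```

Note that the stateful agent keeps fleeing on the third step because "danger" remains in its accumulated state, while the reactive agent would revert to exploring.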
TASK SPECIFICATION
The optimal agent in an environment maximizes the expected utility. Bounded optimality
introduces the constraints of a given machine into the equation.
It’s often hard to reason in terms of utility. Instead, can use predicate task specifications that map
a run to either 1 or 0. Two common types of tasks:
Achievement task G – in every run, bring about one (or more) of the states in G
Maintenance task B – in every run, stay away from all the states in B
Synthesizing agents for task environments: take an environment and task specification and
generate an agent. Ideal synthesis is both sound and complete.
Definition: a system situated within an environment that senses it and acts upon it, pursuing an
agenda so as to effect what it senses in the future.
Agents can be classified based on these properties – control mechanism, environment, mobility,
reasoning, purpose &c.
Try to deduce some action a to perform – prove Do(a) – from the internal database
If nothing is deduced, try to find an action a such that its negation ¬Do(a) cannot be
deduced from the internal database
Perform an action and modify the state of the database accordingly
Repeat
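The loop above can be sketched with set membership standing in for real theorem proving (a toy sketch; all names are mine):

```python
# Toy version of the deductive agent control loop. The "database" is a set
# of formulae; proving Do(a) is simulated by simple membership tests.

def choose_action(database, actions):
    # 1. Try to deduce Do(a) for some action a.
    for a in actions:
        if ("Do", a) in database:
            return a
    # 2. Otherwise, find an action whose negation cannot be deduced.
    for a in actions:
        if ("NotDo", a) not in database:
            return a
    return None  # no consistent action found

db = {("NotDo", "wait")}
action = choose_action(db, ["wait", "move"])  # "move": NotDo(move) not deducible
db.add(("Did", action))  # 3. perform the action, update the database; repeat
```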
Shoham: an agent is any entity described using mental terms such as beliefs, capabilities, choices,
commitments.
Planning to provide:
AGENT0 agents:
Capabilities
Initial beliefs, initial commitments
Commitment rules – message condition, mental condition (beliefs), action (commitment)
o Interpreted at runtime, do not require theorem proving
o To ensure easy belief consistency checking with new facts, facts are very restricted
All of these have a temporal component – “agent a believes that at time t, agent b will believe…” &c.
AGENT0 supports private (internal) and public messages: request, unrequest, inform, refrain.
Sending messages and acting upon them is constrained by the agent’s mental state, e.g.:
PLACA – subsequent work that introduces the ability to plan and communicate via high-level goals;
similar to AGENT0.
CONCURRENT METATEM
Rules are based on temporal logic operators such as ~open(door) U request(you), meaning the
door is not open until you request it, or ⦿request(you) → ◊open(door), meaning that if you just
requested it, the door will open at some point in the future.
Agent execution revolves around constructing proofs of satisfiability for the rules.
Practical reasoning consists of figuring out what to do – it is directed towards action and not
theoretical beliefs. Consists of:
DELIBERATION
Properties of intentions:
Intentions are stronger than desires – they lead to action, although can be overridden
Intentions persist over time, you try again if you fail
Intentions can be dropped if unrealistic or if goals change
Intentions constrain future reasoning, e.g. incompatible intentions
Intentions influence future beliefs, e.g. for reasoning post-achievement of the intention
Agents must believe their intentions are achievable
Agents will not adopt as intentions something they believe to be inevitable
Agents need not intend all possible consequences of their intentions – package deal
PLANNING
Means-ends reasoning (planning) produces from beliefs and intentions a plan of concrete actions.
STRIPS planner:
STRIPS works using a goal stack – this is limiting (also in PRS), can’t pursue multiple goals at once:
must finish with a certain goal before moving to a sibling.
Plans can be generated from a plan library – for each plan, see if it brings about the intentions
required.
[*] Constant deliberation is expensive, but blindly executing in a changing world is expensive, too –
cheap meta-level control determines whether it’s worthwhile to reconsider: bold vs. cautious
agents in a world with varying rate of environment change (static/dynamic environment).
BDI LOGIC
Path quantifiers: A x means “on all paths, x is true”, E x means “on some paths, x is true”.
Axioms of KD45:
1. Bel(p → q) → (Bel p → Bel q)
2. Bel p → ¬Bel ¬p
3. Bel p → Bel Bel p
4. ¬Bel p → Bel ¬Bel p
5. If p and p → q, then q
6. If p is a theorem, then so is Bel p
CTL* notation:
BDI logic:
IRMA has a plan library and explicit representations of beliefs, desires and intentions.
Consists of:
Planning relies on BDI database and a plan library constructed by the programmer. Plans in PRS
have goals (post-conditions), contexts (pre-conditions) and bodies (actions).
Actions can consist of sub-goals to be achieved, loops. Planning starts with a single top-level goal.
Deliberation between competing options is done using meta-level plans (or utilities for plans).
PRS executes plans using an intention stack – no branching, which means only one goal can be
pursued at a time. [This is limiting, current AI uses graph search algorithms instead.]
When dealing with deliberation and means-ends reasoning, it is critical to remember that agents
are resource-bounded. The longer the agent deliberates, the more likely the results of its
deliberation are to become stale.
Mechanisms for limiting reasoning time:
REACTIVE AGENTS
Advantages:
Disadvantages:
SUBSUMPTION ARCHITECTURE
Brooks’ subsumption architecture proposes that intelligent behavior emerges without explicit
symbolic representations and reasoning.
The system is constructed in layers of task-accomplishing behaviors, simple rules of the form
situation → action. Layers can suppress/replace inputs or inhibit outputs of other layers (e.g.
collision avoidance will inhibit output of exploration layer if collision is imminent).
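The suppress/inhibit idea can be sketched as a priority-ordered rule list (a toy sketch; the percept format and all names are mine):

```python
# Sketch of a subsumption-style controller: behaviors are (condition, action)
# rules ordered by priority; when a higher-priority layer fires, it inhibits
# the output of the layers below it. Names are illustrative.

def subsumption_step(percept, layers):
    for condition, action in layers:   # highest-priority layer first
        if condition(percept):
            return action
    return "idle"

layers = [
    (lambda p: p["obstacle_distance"] < 1.0, "avoid"),    # collision avoidance
    (lambda p: True,                         "explore"),  # default exploration
]

a1 = subsumption_step({"obstacle_distance": 0.5}, layers)  # avoidance fires
a2 = subsumption_step({"obstacle_distance": 5.0}, layers)  # exploration fires
```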
Layers are constructed as levels of competence; for a mobile robot, e.g.: level 0 – avoid obstacles, level 1 – wander aimlessly, level 2 – explore the world.
Advantages:
Building the system in layers means a working system is available when level 0 is done
Layers can work in parallel
Failure, errors or slowness at higher levels does not cause the entire system to fail – it
merely degrades
SITUATED AUTOMATA
Agents are specified in a logic of beliefs and goals, but compiled to a digital circuit that does no
symbolic manipulation. Agents consist of perception component and action component.
The primary problem is around the compilation process, which can be extremely expensive or even
theoretically impossible.
OTHER IDEAS
HYBRID AGENTS
Organize the system into layers: some layers capable of reactive behavior, some layers capable of
proactive behavior. Control flow:
Horizontal layering – each layer connected to sensory input and action output, means
there are many complex interactions and mediation required
Vertical layering – sensory input and action output dealt with by one layer (each), simpler;
can be one-pass or two-pass
TOURINGMACHINES
A control subsystem decides which layer should have control of the agent – censorship rules for
inputs (inhibition) and outputs.
INTERRAP
Each layer has a knowledge base of corresponding information – raw sensory input, plans, models
of other agents, &c. – depending on layer.
Bottom-up activation – lower layer passes control to higher layer because it doesn’t know what to
do. Top-down execution – higher layer uses lower layer to execute goals. Input and output are
linked only to the lowest layer.
Agents have multiple and sometimes overlapping spheres of influence, and are linked by
hierarchical and other relationships.
Simultaneous actions by agents on the environment result in a certain outcome. Agents have
preferences, expressed as utility functions over outcomes.
The relationship between money and utility is complicated: sometimes plenty of money brings
little utility; risk-related factors also affect the “value” of money vs. utility.
Payoff matrices, with each cell showing (payoff to i, payoff to j):

Game 1:
                i defects    i cooperates
j defects       4, 4         1, 4
j cooperates    4, 1         1, 1

Game 2:
                i defects    i cooperates
j defects       3, 2         1, 2
j cooperates    4, 1         3, 3
SOLUTION CONCEPTS
Dominant strategy – a strategy s is dominant for player i if no matter what player j chooses, i will
do at least as well with that strategy. This is the obvious choice, when it exists.
[In Game 1, D is a dominant strategy for both players.]
Nash equilibrium – strategies s, t for players i, j respectively are in Nash equilibrium if (a) under
the assumption i plays s, j can do no better than play t, and (b) under the assumption j plays t, i can
do no better than play s. There might be 0, 1, or several NEs.
[In Game 2, (D,D) is a Nash equilibrium.]
Mixed strategies – introducing an element of randomness: play s with probability p, play t with
probability q, &c. There is always a NE with mixed strategies.
[In rock-paper-scissors, playing each with probability ⅓ is a strategy that is in NE with itself.]
Sub-game perfect equilibrium – convert the game to extensive form, where players move
sequentially, in order; a sub-game perfect equilibrium is a NE in every sub-game.
(Or: work backwards from all the ends of the game, identify best responses, and move backward
to previous player’s decision, etc.)
Maximizing social welfare – an outcome maximizes social welfare if the sum of gained utilities is
maximal over the outcomes. When all players have the same owner, this is ideal.
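For 2×2 games these solution concepts can be checked mechanically. A sketch using the Prisoner’s Dilemma payoffs from these notes (the encoding and function names are mine):

```python
from itertools import product

# Check dominant strategies and pure Nash equilibria in a 2x2 game.
# payoff[(si, sj)] = (utility to i, utility to j); moves are "C"/"D".

def dominant_for_i(payoff):
    """Return i's (weakly) dominant strategy if one exists, else None."""
    for s in "CD":
        other = "D" if s == "C" else "C"
        if all(payoff[(s, sj)][0] >= payoff[(other, sj)][0] for sj in "CD"):
            return s
    return None

def pure_nash(payoff):
    """All pure-strategy profiles where neither player can gain by deviating."""
    eq = []
    for si, sj in product("CD", repeat=2):
        i_ok = all(payoff[(si, sj)][0] >= payoff[(d, sj)][0] for d in "CD")
        j_ok = all(payoff[(si, sj)][1] >= payoff[(si, d)][1] for d in "CD")
        if i_ok and j_ok:
            eq.append((si, sj))
    return eq

# Prisoner's Dilemma payoffs (mutual defection 2/2, mutual cooperation 3/3).
pd = {("D", "D"): (2, 2), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("C", "C"): (3, 3)}
```

Running these on `pd` shows D is dominant and (D, D) is the only pure NE, matching the analysis below.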
PRISONER’S DILEMMA
Prisoner’s Dilemma, with each cell showing (payoff to i, payoff to j):
                i defects    i cooperates
j defects       2, 2         0, 5
j cooperates    5, 0         3, 3
Defection is a dominant strategy for both players, and (D,D) is the only Nash equilibrium.
However, this outcome is not Pareto efficient and does not maximize social welfare – utility is
wasted.
In an infinite iterated tournament, the shadow of punishment encourages cooperation, and the
rational behavior is to cooperate.
In a finite iterated tournament when the number of rounds is known, defection is still the
dominant strategy (backwards induction argument). This can be “fixed” by making the number of
rounds unknown in advance.
AXELROD’S TOURNAMENT
1980 tournament: programs competed against each other in Prisoner’s Dilemma games, 200
rounds each.
Some strategies:
Tit-for-tat won the tournament – the key is that it interacted with like-minded strategies that
favored cooperation. [Interestingly, it also won the second tournament!]
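An iterated tournament like Axelrod’s can be sketched in a few lines; the payoff values (2/2 mutual defection, 3/3 mutual cooperation, 5 temptation, 0 sucker) follow the Prisoner’s Dilemma matrix in these notes, and all function names are mine:

```python
# Iterated Prisoner's Dilemma sketch: tit-for-tat against always-defect.
PAYOFF = {("D", "D"): (2, 2), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("C", "C"): (3, 3)}

def tit_for_tat(history):
    return "C" if not history else history[-1]   # copy opponent's last move

def always_defect(history):
    return "D"

def play(p1, p2, rounds):
    h1, h2, s1, s2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = p1(h2), p2(h1)   # each player sees the other's past moves
        u1, u2 = PAYOFF[(m1, m2)]
        s1, s2 = s1 + u1, s2 + u2
        h1.append(m1); h2.append(m2)
    return s1, s2
```

Against always-defect, tit-for-tat loses only the first round; against a like-minded cooperator it earns the mutual-cooperation payoff every round, which is the source of its tournament success.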
Characteristic “rules”:
Program equilibria – submit a program that plays the game. Programs can compare themselves to
each other. In Prisoner’s Dilemma – equilibrium formed by “cooperate if the other program is
identical to yours, otherwise defect”.
Stag hunt – payoff ordering u(C,C) > u(D,C) ≥ u(D,D) > u(C,D) – there are two NEs, mutual defection and
mutual cooperation.
It can only emerge if the players might meet again – “the shadow of the future”
Typically, a discount parameter is applied to payoff from future interactions
Ecological approach to Axelrod’s tournament: if a strategy’s share of the population grows with
its success, the “non-nice” programs become extinct and tit-for-tat grows faster than any other.
Evolution of cooperation:
1. Cooperation can evolve even in a world of unconditional defection where there are small clusters of
cooperating individuals
2. Reciprocity can thrive in a world of many different strategies
3. Cooperation based on reciprocity can protect itself from invasion by other strategies
Invasion: a strategy is invaded by a new strategy if the newcomer gets a higher score with a native
than a native gets with another native. [Idea: the natives will then begin adopting the newcomer’s
strategy, until the old strategy is evolutionally eliminated.]
Invasion in clusters: for a sufficiently large cluster of newcomers with a new strategy, new
results are possible:
Convergence/guaranteed success
Maximizing social welfare – there is no “wasted” utility
Pareto efficiency
Individual rationality – no agent is asked to give up a better outcome
Stability – no one has incentive to defect
Simplicity
Distribution – can be executed in a distributed system, no single point of failure
Public/private value – whether the item’s value is inherent to it, or specific to the bidder
+ Correlated value – depends also on what the other bidders estimate the item to be worth
Winner determination – highest bidder, &c.
Payment – first-price, second-price, everyone pays what they bid
Open cry/sealed-bid – whether agents see each other’s bids
Bidding mechanism – one-shot, multiple rounds: ascending, descending
All protocols discussed are subject to bidder collusion. Collusion can be prevented by modifying
the protocol so that bidders can’t identify each other.
All protocols discussed are subject to auctioneer manipulation (e.g. lie about second highest bid
in Vickrey auction). Lies can be prevented by using a trusted third party or signing bids.
[Sections 14.3 and 14.4 skipped, as they were not discussed in class and are not in the first edition.]
[When multiple issues (for negotiation) are involved, the problems are considerably more complicated.
The chapter examines only single-issue, symmetric, one-to-one negotiation.]
DOMAIN THEORY
Worth-oriented domains: utility function rates acceptability of states, agents plan to maximize it.
[Note: negotiation is not over a single issue, but over states and how to reach them – plans, goals.]
ALTERNATING OFFERS
The protocol: make offers, in turn, until someone accepts. If not accepted – go for conflict deal.
Assuming that the conflict deal is the worst option, when the number of rounds is fixed, the player
in the last round has all the power – the other player would rather accept than go for the conflict
deal.
The same applies to an infinite number of rounds: using the strategy “always propose that I get
everything and reject anything else” is in NE with accepting that proposal. If you’re convinced
that is what your opponent is doing, you should accept.
To model impatient players, use a discount factor for each agent – the value of the good is reduced
by the discount factor with each round. In a fixed or infinite number of rounds, there is still a NE
strategy – the more patient the agent, the “bigger slice” he gets.
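The claim about patience can be made concrete with the standard alternating-offers (Rubinstein) result, which these notes do not derive: with per-round discount factors d1 and d2, the first proposer’s equilibrium share is (1 − d2) / (1 − d1·d2), and the rest goes to the responder. A sketch (names are mine):

```python
# Rubinstein alternating-offers split: the proposer's equilibrium share.
# A more patient responder (d2 closer to 1) leaves the proposer less.

def rubinstein_share(d1, d2):
    return (1 - d2) / (1 - d1 * d2)

impatient = rubinstein_share(0.5, 0.5)   # symmetric case: 0.5 / 0.75 = 2/3
patient_2 = rubinstein_share(0.5, 0.9)   # responder is more patient, so the
                                         # proposer's slice shrinks
```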
Negotiation decision function – use a function that tells you how much to bid in every round. Can
decrease price quickly or slowly, &c.
TASK ALLOCATION
Formal model:
A deal d dominates another deal e if it is not worse than e for any agent and better than e
for at least one agent (in terms of utility)
o We prefer non-dominated deals, which are Pareto optimal
A deal is individually rational if it weakly dominates the conflict deal
o If there is no such deal, it makes no sense to bargain at all
The negotiation set is the set of individually rational and Pareto optimal deals
The protocol terminates, but can proceed for an exponential number of rounds. Also, it isn’t clear
how to start and how to concede.
Zeuthen strategy:
This protocol terminates with a Pareto optimal deal, but still may proceed for an exponential
number of rounds. The Zeuthen strategy is in NE with itself, which makes it a good “open”
protocol.
[Weakness: this requires the agents to know the opponent’s risk, and hence its agreement space.]
DECEPTION
Phantom task – pretend you have been allocated a task that you have not; decoy task – a phantom
task that you can produce on demand, so it is also verifiable.
Hidden task – pretend that you have not been allocated a task that you have.
[Section 15.4 skipped, as it was not discussed in class and does not appear in the first edition.]
Why argumentation?
ABSTRACT ARGUMENTS
Abstract arguments can be described as a directed graph, where an edge (a, b) means that a
attacks b. If c attacks a in this case, then we say c defends b.
Begin with “in” arguments – those that don’t have any attackers
Eliminate arguments attacked by them – they are “out”
Continue until the remaining graph no longer changes – the surviving “in” arguments form the grounded extension (GE)
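The procedure above can be sketched as a fixed-point computation (a toy sketch; names are mine):

```python
# Compute the grounded extension of an abstract argument graph: arguments
# with no surviving attacker are "in"; anything they attack is "out";
# repeat on the remaining graph until nothing changes.

def grounded_extension(arguments, attacks):
    """attacks is a set of (a, b) pairs meaning a attacks b."""
    remaining, in_args = set(arguments), set()
    while True:
        new_in = {a for a in remaining
                  if not any(b == a and x in remaining for (x, b) in attacks)}
        if not new_in:
            return in_args
        out = {b for (x, b) in attacks if x in new_in}
        in_args |= new_in
        remaining -= new_in | out

# a attacks b, c attacks a: c is unattacked ("in"), a is "out", so b is
# defended by c and ends up "in" as well.
ge = grounded_extension({"a", "b", "c"}, {("a", "b"), ("c", "a")})
```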
DEDUCTIVE ARGUMENTS
Deductive arguments are pairs (S, C) where S is a subset of the database of logical formulae known to be
true, C can be proved from S, and S is consistent and minimal.
Rebuttal – (S1, C1) rebuts (S2, C2) if C1 ≡ ¬C2 (the conclusions contradict)
Undercut – (S1, C1) undercuts (S2, C2) if C1 ≡ ¬s for some s ∈ S2 (attacks a premise)
[We tend to prefer arguments that can be undercut to arguments that can be rebutted, because
undercuts can be “fixed” by choosing a different support set.]
A dialogue between agents is a sequence of arguments that cannot be repeated and that attack one
another. It ends when no further moves are possible.
Types of dialogues:
Persuader is a system for labor negotiation, exchanging proposals and counter-proposals over
multiple issues (wages, pensions, &c.):
Determines violated goals, tries to find compensating actions, and generates
arguments
Models argument severity – e.g. an appeal to status quo is preferred to threats or
prevailing practice
Because agents are autonomous, negotiation is central to multiagent systems – an agent has to
convince other agents to act in a particular way.
Topics in negotiation:
[Also, it is unclear how users instruct agents to negotiate – attitude, autonomy, duration &c.]
General model:
Heuristic methods:
In a voting setting, a set of agents vote over a set of candidates. Each agent can either choose one
candidate, or provide a complete ordering of its preferences over all candidates.
Then, use voting procedures:
VOTING METHODS
Plurality voting – the winner is the candidate that appears first in the largest number of
preference orders.
Sequential majority – series of pairwise elections, the winner proceeds to the next round; can be
organized in a list or a tree (like the World Cup).
Majority graph – directed graph of candidates, an edge (a, b) means a would beat b in direct
competition. If the graph is a cycle, we can fix the order so that anyone can win!
Condorcet winner – overall winner in any possible order of pair-wise elections (there is an edge
from w to every other node in the majority graph).
Borda count – for each preference order over k candidates, the first candidate gets k−1 points, the second k−2,
and so on; the last candidate gets 0.
Takes into account all the preference information, not just the first element
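Borda count as described can be sketched directly (the function name and example votes are mine):

```python
# Borda count: with k candidates, a voter's first choice gets k-1 points,
# the second k-2, ..., the last 0; the highest total score wins.

def borda(preferences):
    """preferences: list of rankings, each a list from best to worst."""
    k = len(preferences[0])
    scores = {}
    for ranking in preferences:
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (k - 1 - position)
    return scores

votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "c", "a"]]
result = borda(votes)  # a: 2+2+0 = 4, b: 1+0+2 = 3, c: 0+1+1 = 2
```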
Slater ranking – choose ordering that needs the minimal number of fixes to be consistent with the
majority graph (e.g. when the graph is a cycle, there are many orderings that need just one fix).
Copeland – a’s score is the number of candidates a beats in pairwise election; select the candidate
with the highest score.
Kemeny rule – produces a ranking that minimizes the total number of disagreements compared to
voter preferences (summed over all voters).
Plurality with run-off – select top two by plurality voting, and then match selected pair.
Black’s procedure – if there’s a Condorcet winner, choose it; otherwise, use Borda count.
Single transferable vote (STV) – the election proceeds in rounds: each vote counts for its
highest-ranked remaining candidate, and the candidate with the fewest votes is eliminated.
Dodgson’s rule – elect candidate with minimum number of exchanges between adjacent
candidates needed to make it a Condorcet winner.
[Somewhat similar to Slater ranking.]
Pareto condition – if every voter ranks w above u, the voting method should not choose u (should
not prefer u to w).
[Sequential pairwise voting violates this criterion.]
Condorcet winner condition – if there is a Condorcet winner, the voting method should choose it
(prefer it to other candidates).
[Borda count violates this criterion.]
Smith’s generalized Condorcet criterion – if the candidates can be partitioned into sets A and B
such that every candidate from A beats every candidate from B pairwise, the rule should not elect a
candidate from B.
Independence of irrelevant alternatives (IIA) – if no voter changes the relative ranking of w and u
(even if other candidates are reordered), the outcome should still rank w and u in the same way.
[This is a strong requirement that we might want to lift – e.g. Borda count does not satisfy it.]
Monotonicity – if x is a winner under a voting rule and one or more voters change their
preferences in a way favorable to x, then x should still be the winner.
[Plurality with run-off violates this criterion.]
IMPOSSIBILITY THEOREMS
Arrow’s impossibility theorem: for more than two candidates, any voting procedure satisfying
the Pareto condition and IIA is a dictatorship.
Manipulable voting procedure – a voting function f is manipulable if some voter, by reporting a
preference order other than its true one, can obtain an outcome it truly prefers to the outcome
under truthful voting.
Gibbard-Satterthwaite theorem: for more than two candidates, any voting procedure satisfying
the Pareto condition that is not manipulable is a dictatorship.
MANIPULATION TECHNIQUES
Each voter chooses a point at which to build a grocery store, and wants it close to his house.
Nonmanipulable solution: choose the median point.
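A minimal sketch of the median mechanism (names and the example numbers are mine):

```python
# Facility location on a line: choosing the median of the reported points is
# strategyproof; no voter can move the outcome closer to their own house by
# misreporting their location.

def median_location(points):
    s = sorted(points)
    return s[len(s) // 2]   # for an odd number of voters, the true median

honest = median_location([0, 4, 10])        # the median voter is at 4
# The voter at 10 exaggerates to 100, but cannot drag the median upward:
manipulated = median_location([0, 4, 100])
```

The exaggeration changes nothing: the chosen point is the median voter's location either way, which is exactly why the mechanism gives no incentive to lie.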
SHAPLEY VALUE
Shapley value – divides the coalition’s payoff so that the more an agent contributes, the higher its reward.
Axiomatically defined:
Symmetry – agents that make the same contribution get the same payoff
Dummy player – a player that contributes to any coalition only what it would make on its
own should get exactly that
Additivity – a player does not gain or lose from playing more than once
Shapley value of i is his average marginal contribution over all orders in which the coalition could have formed:

φᵢ = (1/|N|!) · Σ_π [ v(S_π(i) ∪ {i}) − v(S_π(i)) ]

…where S_π(i) is the set of all agents that appear before i in the permutation π, and the sum ranges over all permutations π of the agents.
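The formula can be checked by brute-force enumeration over permutations; a sketch with an illustrative toy game (the game and all names are mine):

```python
from itertools import permutations

# Shapley value by direct enumeration: average over all orderings of each
# agent's marginal contribution to the agents that came before it.

def shapley(agents, v):
    """v maps a frozenset coalition to its value."""
    total = {a: 0.0 for a in agents}
    perms = list(permutations(agents))
    for pi in perms:
        before = set()
        for a in pi:
            total[a] += v(frozenset(before | {a})) - v(frozenset(before))
            before.add(a)
    return {a: total[a] / len(perms) for a in agents}

# Toy game: any coalition containing both 1 and 2 is worth 10, else 0.
value = lambda c: 10 if {1, 2} <= c else 0
phi = shapley([1, 2, 3], value)   # 1 and 2 split the value; 3 is a dummy
```

The result illustrates the axioms: agents 1 and 2 are symmetric and split the payoff, while agent 3 is a dummy player and receives 0.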
Weighted subgraph – represent the game as a weighted graph; v(C) is the total weight of the edges
whose endpoints are both in C
Marginal contribution nets – represent the game as a set of rules, pattern → value, where the
pattern is a conjunction of agents such as a ∧ b
Weighted voting game – see below
[Weighted subgraphs and weighted voting games are incomplete representations; marginal
contribution nets are complete, but can have exponentially many rules. Note that in all cases but
weighted voting games, checking whether the core is empty or whether a payoff is in the core are hard
– NP, co-NP and similar classes.]
VOTING GAMES
Weighted voting games – each agent i has a weight wᵢ; if the sum of weights in a coalition meets or exceeds a
quota q, the coalition wins.
Shapley-Shubik power index – Shapley index adapted to weighted voting games. For agent i, it is
the fraction of times i’s vote is decisive in any permutation of the votes.
In the game <8; 5, 5, 3> all members have equal power, ⅓, because it takes any two
members to decide the vote.
In the game <8; 5, 3, 1> the member with weight 1 has no power, it is never the decisive
vote. Permutations: (5, 3, 1), (5, 1, 3), (3, 5, 1), (3, 1, 5), (1, 3, 5), (1, 5, 3) – both 5 and 3 have
voting power of ½, they are each pivotal half the time.
In the game <8; 5, 3, 3> – permutations: (5, 3, 3), (5, 3, 3), (3, 5, 3), (3, 3, 5), (3, 3, 5), (3, 5, 3)
– 5 is pivotal 4 times, each 3 is pivotal once; 5 has power ⅔ and the 3’s have power ⅙ each.
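The worked examples above can be reproduced by enumeration (a sketch; names are mine):

```python
from itertools import permutations

# Shapley-Shubik power index: the fraction of orderings in which an agent's
# weight is the one that pushes the running total past the quota.

def shapley_shubik(weights, quota):
    n = len(weights)
    pivots = [0] * n
    perms = list(permutations(range(n)))
    for pi in perms:
        running = 0
        for agent in pi:
            running += weights[agent]
            if running >= quota:   # this agent's vote is decisive
                pivots[agent] += 1
                break
    return [p / len(perms) for p in pivots]

index = shapley_shubik([5, 3, 3], 8)   # the game <8; 5, 3, 3> from the notes
```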
k-weighted voting games – there are multiple voting games with different weights, and a coalition
has to win all of them to win the total game (e.g. Senate + Congress voting on a law, or sub-
committee approval followed by the general body voting).
[Sections 13.4.2, 13.5, 13.6 skipped, as they were not discussed in class and do not appear in the first
edition.]
A language is required for expressing voter preferences over multiple outcomes (e.g. type of
meal, time of meal, cost of meal). CP-net is such a language – specifies dependent preferences.
Envy-freeness criterion – every agent prefers the bundle allocated to it to any other agent’s
bundle. (This still might not be Pareto efficient, and deciding whether such an allocation exists is
NP-hard.)
Kidney exchanges – finding cycles of sequentially compatible donors for transplants; NP-hard
when a maximum cycle length greater than 2 is imposed.
Combinatorial auction – multiple resources are sold, and bidders indicate a value for every
subset of the resources. Again, expressive bidding languages are required to prevent exponential
explosion of bids, e.g.:
XOR language – ({a}, 5) XOR ({b, c}, 10) – a is worth 5, {a, b} is worth 5, {b} is worth
nothing, {b, c} is worth 10, {a, b, c} is worth 10
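The XOR semantics in the example can be sketched as a valuation function (names are mine):

```python
# Valuation under the XOR bidding language: a bundle is worth the value of
# the best single atomic bid it satisfies, since at most one atomic bid of
# an XOR bid may apply.

def xor_value(bundle, atomic_bids):
    """atomic_bids: list of (required_items, value) pairs."""
    applicable = [v for items, v in atomic_bids if items <= bundle]
    return max(applicable, default=0)

bid = [({"a"}, 5), ({"b", "c"}, 10)]     # ({a}, 5) XOR ({b, c}, 10)
v_ab = xor_value({"a", "b"}, bid)        # only ({a}, 5) applies
v_abc = xor_value({"a", "b", "c"}, bid)  # both apply, take the best
```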
Winner determination becomes NP-hard and inapproximable
Can proceed in multiple rounds, in which case we strive to minimize communication
Donations – build a framework that conditions a donation on the total amount donated (this
makes donors feel better about their donation).
Can also condition the donation on the total amount donated by a certain closed social
group, making an even stronger stimulus.
Prediction market – trading securities that pay out if an event occurs; then the price of the
security represents the probability of the event (in the audience’s eyes).
The combinatorial explosion when multiple outcomes are involved complicates things.
Agents aren’t objects – one agent can’t force another to perform an action. It can, however,
communicate as to influence another agent’s actions.
Locutionary – the act of making an utterance that is grammatically correct and meaningful
Illocutionary – action performed by saying something, its meaning (request, inform &c.)
Perlocutionary – effect of the act (action, knowledge, inspiration, fear &c.)
PLAN-BASED THEORY
Speech acts have preconditions and postconditions based on beliefs, abilities and wants
(adapted from STRIPS planner).
Examples:
Request – preconditions:
o Speaker believes hearer can perform the action
o Speaker believes the hearer believes he can perform the action
o Speaker believes he wants the action to be performed
Request – postconditions:
o Hearer believes the speaker believes the speaker wants the action to be performed
COMMUNICATION LANGUAGES
Knowledge query and manipulation language (KQML) – envelope format for messages, does not
concern the content.
Criticism:
FIPA Agent Communication Language (ACL) – defines outer language for messages with 20
performatives and allows any language for the content.
[Actual implementations exist, e.g. Java framework called JADE.]
o Example: request requires the agent believes the recipient is capable of the action,
and believes the recipient has not yet performed it
[Section 7.2.3 skipped, as it was not discussed in class and does not appear in the first edition.]
Speech acts are operators (actions) that have preconditions, effects and bodies – their effects
are on the models speakers and hearers maintain. [Models – beliefs, intentions &c.]
A plan-based theory takes an initial set of beliefs and goals and leads to the generation of plans
for speech acts to issue.
Preconditions:
o cando precondition: S Believe(H cando ACT) ∧ S Believe(H Believe(H cando ACT))
Later refined to just “H cando ACT”, makes composition more flexible
o want precondition: S Believe(S want request-instance)
Effects:
o H Believe(S Believe(S want ACT))
Preconditions:
o cando precondition: S Believe(P)
Later refined to just “P”
o want precondition: S Believe(S want inform-instance)
Effects:
o H Believe(S Believe(P))
Further operators are defined to plan questions: informref for wh-questions and informif for
yes/no questions. Issuing a request to inform is the way to ask such a question.
Finally, side effects from operators must also be modeled – e.g. after receiving a request the
hearer knows that its preconditions hold (on the speaker’s side).
Unlike distributed systems, agents have different (possibly contradictory) goals and must
dynamically coordinate their activities.
Cooperative Distributed Problem Solving (CDPS) – the entire system is designed to solve a
problem together, common interest (benevolent), but no single node has enough knowledge,
expertise, or processing power to solve alone.
This is not just parallelism – in parallelism, nodes are homogeneous and “stupid”
Coherence – how well the system behaves as a unit: efficiency, solution quality, performance.
Decompose the big problem into sub-problems – may involve several levels
Solve the sub-problems (possibly by different agents) – may involve sharing of information
Synthesize an answer from the sub-solutions – again, may be hierarchical
Task sharing involves decomposition and allocation of tasks to agents. If agents are self-
interested, this may involve negotiation/auctions.
[Adaptations: retry if no one bid, revise the announcement e.g. with costs/requirements, decompose
the problem in a different way. Another variation is for contractors to announce availability instead of
the manager announcing tasks.]
Each agent maintained skills and interests (hypotheses requiring inquiry) for itself and its
acquaintances in the network
Messages were of type request, response and inform
When evaluating hypotheses, agents checked their acquaintances recursively and broadcast
information to interested agents
BLACKBOARD
INCONSISTENCIES
Inconsistencies arise w.r.t. beliefs (information) and intentions (goals), and are inevitable.
FA/C – functionally accurate/cooperative: do the best you can with partial information.
COORDINATION TYPOLOGY
PARTIAL GLOBAL PLANNING
Agents exchange information to reach common conclusions – partial planning because a plan for
the entire problem is impossible, global planning because agents exchange local plans and
construct bigger plans from them.
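A rough sketch of the plan-exchange idea; the (deadline, task) representation and the merge-by-sorting rule are illustrative assumptions, not PGP’s actual algorithm:

```python
# Each agent ships its local plan (an ordered list of steps) to the others;
# a receiver merges them into a larger, still partial, view of the global plan.
def merge_plans(local_plans):
    # each plan step is (deadline, task); interleave by deadline
    merged = [step for plan in local_plans for step in plan]
    return sorted(merged)

pgp = merge_plans([[(3, "scan-east"), (7, "report")],
                   [(1, "scan-west"), (5, "track")]])
# → [(1, 'scan-west'), (3, 'scan-east'), (5, 'track'), (7, 'report')]
```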
JOINT INTENTIONS
A group of agents has a joint commitment to the overall goal – this commitment is persistent. A
commitment has a social convention – when it can be dropped, how to behave, &c.
An agent that believes the goal is impossible, or that the motivation for it no longer holds, tries
to persuade the others of this, so that the joint persistent goal (JPG) can be dropped.
ARCHON – a system that has explicit rules modeling cooperation, such as “if the goal is satisfied
then abandon all related activities and inform others”.
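The quoted ARCHON-style rule can be sketched as code; the agent representation and the function name here are invented for illustration:

```python
# "If the goal is satisfied then abandon all related activities and inform
# others" – a toy rendering with agents as plain dicts.
def on_goal_satisfied(agent, goal):
    # abandon all activities related to the satisfied goal
    agent["activities"] = [a for a in agent["activities"] if a["goal"] != goal]
    # inform the other team members that the goal is satisfied
    return [("inform", peer, ("satisfied", goal)) for peer in agent["peers"]]
```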
TEAMWORK MODEL
1. Recognition – some agent recognizes a potential for cooperation w.r.t. one of its goals.
2. Team formation – the agent recruits others to help, and they agree on the joint goal.
3. Plan formation – the team plans together (possibly negotiating) how to achieve the goal.
4. Team action – the agents execute the plan (see JPG).
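The four stages can be walked through in a toy pipeline; agents are plain dicts, and the even split in step 3 is an invented stand-in for real joint planning:

```python
def teamwork(initiator, goal, others):
    # 1. Recognition: is there potential for cooperation? (i.e. the
    #    initiator cannot achieve the goal alone)
    if initiator["capacity"] >= goal["size"]:
        return {initiator["name"]: goal["size"]}
    # 2. Team formation: recruit agents willing to commit to the joint goal
    team = [initiator] + [a for a in others if a["willing"]]
    # 3. Plan formation: here, trivially split the goal evenly among members
    share = goal["size"] // len(team)
    # 4. Team action: each member executes its share (here, just records it)
    return {a["name"]: share for a in team}
```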
MUTUAL MODELING
MACE – a system that introduces acquaintance models, modeling other agents: class, name, role,
skills, goals, plans. This helps agents coordinate.
NORMS AND SOCIAL LAWS
Norms and social conventions dictate agent behavior in certain situations and make it easier to
plan and coordinate.
Offline design – norms are hardwired into the agent. Obviously, not everything can be predicted
and the dynamic nature of systems calls for greater flexibility.
Emergence – conventions emerge from within a group of agents interacting with each other.
Strategy update functions dictate how agents modify their norms.
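One plausible strategy update function (an assumption on my part – the course may have listed others) is majority-based update: an agent switches to the convention it has observed most often in recent interactions. A minimal sketch:

```python
from collections import Counter

def update_strategy(current, observed):
    """Switch to the most frequently observed convention; ties keep current."""
    if not observed:
        return current
    counts = Counter(observed)
    best, n = counts.most_common(1)[0]
    return best if n > counts.get(current, 0) else current

update_strategy("left", ["right", "right", "left"])  # → "right"
```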
PLANNING
Centralized planning – a master agent develops a plan and distributes it to the slave agents
Distributed planning – a group cooperates to form a centralized plan, but they do not
necessarily execute it
Distributed planning for distributed plans – a group cooperates to form individual
plans and to coordinate their activities; negotiation may be required
Merging plans:
Use generalized STRIPS notation (pre-, post-conditions) and a during list – conditions that
must hold during the action’s execution
Interaction analysis – identify how plans interact with each other (dependencies)
Safety analysis – identify harmful interactions
Interaction resolution – apply mutual exclusion to unsafe plan interactions
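The during list and the safety analysis step can be sketched as follows; the representation is an assumed simplification of generalized STRIPS, and all names are mine:

```python
# Two actions interact harmfully if one's effects destroy conditions the
# other needs before or during its execution; such pairs must be serialized
# (mutual exclusion) during interaction resolution.
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str
    pre: set = field(default_factory=set)
    during: set = field(default_factory=set)   # must hold throughout execution
    delete: set = field(default_factory=set)   # conditions the action destroys

def unsafe(a, b):
    """Safety analysis: is it harmful to run a and b concurrently?"""
    return bool(a.delete & (b.pre | b.during) or b.delete & (a.pre | a.during))

drill = Action("drill", pre={"clamped"}, during={"clamped"}, delete={"smooth"})
unclamp = Action("unclamp", delete={"clamped"})
# unsafe(drill, unclamp) → True: "clamped" must hold while drilling
```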
Iterative planning – instead of centralized merging, agents generate successively more refined
plans and interact with each other if necessary (e.g. if interference is detected).
MOBILE AGENTS
Mobile agents execute on a remote host, on behalf of a user elsewhere on the network – good for
minimizing latency, for operating over low-bandwidth networks, &c.
Problems:
Kinds of mobile code:
Autonomous – agents decide when and where to go; e.g. the TELESCRIPT language has a go
instruction, agents can meet at a host and cooperate, and they have limited resources and travel permits
On-demand – code executed when the host demands it (e.g. a Java applet)
Active mail – piggybacks on email; scripts are executed as part of email parsing
Alternative to result-sharing blackboard: assign an agent to each resource and have it process the
contending demands for the resource. Contention issues are settled before any subsequent work is
performed – this avoids useless work later.
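A toy sketch of this scheme, with a priority order standing in for whatever policy the resource agent actually uses to settle contention:

```python
# One agent owns the resource and settles all contending demands up front,
# so losing requesters never start work that would later be thrown away.
def resource_agent(demands):
    """demands: list of (priority, requester); grant access in priority order."""
    schedule = sorted(demands)               # settle contention before any work
    return [requester for _, requester in schedule]

resource_agent([(2, "painter"), (1, "driller")])  # → ['driller', 'painter']
```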
Reduce communication by using the task decomposition structure – e.g. agents close to each
other in a mapping effort communicate with each other; in general, use the organization structure
for communication.