Software Agents: What Is An Agent?

CC383/CE835 Agent Technology for e-Commerce
Software Agents
Maria Fasli mfasli@essex.ac.uk

What is an agent?
- A piece of software (and/or hardware) that acts on behalf of the user
- Unfortunately, there is no unique and universally accepted definition of what constitutes an agent
- Different characteristics are important for different domains of application

Defining agents
- An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors (Russell and Norvig 2003)
- Agents are active, persistent (software) components that perceive, reason, act and communicate (Huhns and Singh 1997)
- An entity that functions continuously in an environment in which other processes take place and other agents exist (Shoham 1997)
- Autonomous agents are computational systems that inhabit some complex environment, sense and act autonomously in this environment, and by doing so realize a set of goals or tasks that they are designed for (Maes 1995)

Characteristics of agents
Although there is no agreement on a definitive list of characteristics for agents, the most important seem to be:
- Autonomy
- Proactiveness
- Reactiveness
- Social ability

Autonomy
- Difficult to pin down exactly: how self-ruled is the agent really?
- An autonomous agent is one that can interact with its environment without the direct intervention of other agents and has control over its own actions and internal states
- The less predictable an agent is, the more autonomous it appears to an external observer
- Absolute autonomy (complete unpredictability) may not be desirable: a travel agent may exceed the allocated budget
- Autonomy can be restricted via social norms

Proactiveness
- Proactive (goal-directed) behaviour: an agent actively seeks to satisfy its goals and further its objectives
- The simplest form is writing a procedure or method, which involves:
  - Preconditions that need to be satisfied for the procedure/method to be executed
  - Postconditions, which are the effects of the correct execution of the procedure
- If the preconditions are met and the procedure executes correctly, then the postconditions will be true
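
The precondition/postcondition view of a goal-directed procedure can be illustrated with a minimal Python sketch. The `Procedure` class, the `book_trip` example and the state keys (`budget`, `price`, `booked`) are hypothetical names chosen for illustration; they do not come from the lecture.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]  # hypothetical world state: named numeric facts


@dataclass
class Procedure:
    """A goal-directed procedure guarded by a pre- and a postcondition."""
    name: str
    precondition: Callable[[State], bool]
    body: Callable[[State], State]
    postcondition: Callable[[State], bool]

    def execute(self, state: State) -> State:
        # The procedure may only run if its precondition holds.
        if not self.precondition(state):
            raise ValueError(f"precondition of {self.name!r} not met")
        new_state = self.body(state)
        # After a correct execution the postcondition must be true.
        if not self.postcondition(new_state):
            raise RuntimeError(f"postcondition of {self.name!r} violated")
        return new_state


def _book(state: State) -> State:
    # Pay for the trip and mark it as booked (returns a new state).
    return {**state, "budget": state["budget"] - state["price"], "booked": 1.0}


book_trip = Procedure(
    name="book_trip",
    precondition=lambda s: s["budget"] >= s["price"],
    body=_book,
    postcondition=lambda s: s["booked"] == 1.0 and s["budget"] >= 0,
)
```

Calling `book_trip.execute({"budget": 500.0, "price": 300.0, "booked": 0.0})` returns a state in which the trip is booked and the budget is reduced; with an insufficient budget the call is refused before the body runs.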

Reactiveness
- Goal-directed behaviour, as epitomised by the execution of procedures, makes two limiting assumptions:
  - While the procedure executes, the preconditions remain valid
  - The goal, and the conditions for pursuing it, remain valid at least until the procedure terminates
- These assumptions are not realistic in dynamic, complex and uncertain environments
- Agents must not blindly attempt to achieve their goals, but should perceive their environment and any changes that affect their goals, and respond accordingly
- Building an agent that achieves a balance between proactive and reactive behaviour is difficult

Social ability
- Agents are rarely isolated entities; they usually live and act in an environment with other agents, human or software
- Social ability means being able to operate in a multi-agent environment and to coordinate, cooperate, negotiate and even compete with others
- This social dimension of the notion of agency must address many difficult situations which are not yet fully understood, even in the context of human behaviour

Agents as intentional systems
- Trying to understand and analyze the behaviour of complex agents in a natural, intuitive and efficient way is a nontrivial task
- Methods that abstract away from the mechanistic and design details of a system may be more convenient
- Intentional stance: ascribing human mental attitudes to a system (anthropomorphism), e.g. beliefs, desires, knowledge, wishes
- How useful/legitimate is this approach? McCarthy explains:
  To ascribe certain beliefs, knowledge, free will, intentions, consciousness, abilities or wants to a machine or computer program is legitimate when such an ascription expresses the same information about the machine that it expresses about a person. It is useful when the ascription helps us understand the structure of the machine, its past or future behaviour, or how to repair or improve it.

Agents as intentional systems (cont.)
- The intentional stance provides us with a powerful abstraction tool: the behaviour of systems whose structure is unknown can be explained
- Computer systems or programs are treated as rational agents
- An agent possesses knowledge or beliefs about the world it inhabits, it has desires and intentions, and it is capable of performing a set of actions
- An agent uses practical reasoning: based on its information about the world and its chosen desires and intentions, it selects an action which will lead it to the achievement of one of its goals

Making decisions
- We require software agents to whom complex tasks and goals can be delegated
- Agents should be smart so that they can make decisions and take actions to successfully complete tasks and goals
- Endowing the agent with the capability to make good decisions is a nontrivial issue

[Figure: agent and environment, linked by sensory information and action.]
A simple view of an agent

- Environment states: $S = \{s_1, s_2, \ldots\}$
- Perception: $see: S \rightarrow P$
- An agent has an internal state ($IS$) which is updated by percepts: $next: IS \times P \rightarrow IS$
- An agent can choose an action from a set $A = \{a_1, a_2, \ldots\}$: $action: IS \rightarrow A$
- The effects of an agent's actions are captured via the function $do$: $do: A \times S \rightarrow S$
The control loop of such an agent would look as follows:
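
A minimal Python sketch of such a loop, assuming the four functions see, next, action and do introduced above are supplied by the caller. The function `run_agent`, the parameter name `next_state` (used because `next` is a Python built-in) and the thermostat example are illustrative assumptions, not part of the lecture.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")    # environment states
P = TypeVar("P")    # percepts
IS = TypeVar("IS")  # internal states
A = TypeVar("A")    # actions


def run_agent(
    see: Callable[[S], P],
    next_state: Callable[[IS, P], IS],
    action: Callable[[IS], A],
    do: Callable[[A, S], S],
    s0: S,
    is0: IS,
    steps: int,
) -> List[S]:
    """Run the perceive / update / decide / act cycle for a fixed number of steps."""
    s, internal = s0, is0
    trace = [s]
    for _ in range(steps):
        p = see(s)                         # perceive the environment
        internal = next_state(internal, p) # update the internal state
        a = action(internal)               # choose an action
        s = do(a, s)                       # the action changes the environment
        trace.append(s)
    return trace


# Hypothetical example: a thermostat that heats until the room reaches 20 degrees.
trace = run_agent(
    see=lambda temp: "cold" if temp < 20 else "ok",
    next_state=lambda internal, percept: percept,  # remember only the last percept
    action=lambda internal: "heat" if internal == "cold" else "idle",
    do=lambda a, temp: temp + 1 if a == "heat" else temp,
    s0=17,
    is0="ok",
    steps=5,
)
```

Here the environment state is the room temperature, and `trace` records the sequence of states visited, starting from the initial one.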
Characteristics of the environment
- The nature of the environment has a direct impact on the design of an agent and its decision-making process
- It can be characterized as:
  - Fully or partially observable
  - Deterministic or stochastic
  - Static or dynamic
  - Episodic or sequential
  - Discrete or continuous
  - Single-agent or multi-agent

Fully vs partially observable environments
- Observability describes access to information about the world
- In a fully observable environment an agent has complete access to the state of the environment and can observe any changes as they occur
- Most realistic environments are only partially observable
- Partial observability can be attributed to noise in the agent's sensors or to perceptual aliasing
- Partial observability is important in multi-agent systems, as it also affects what the agent knows about the other agents
- The more information an agent has about its world, the easier it is to choose and perform the best action
Deterministic vs stochastic environments

- Deterministic environments
  - The next state is completely determined by the current state and the actions performed by the agent
  - The outcome of an agent's actions is uniquely defined, so there is no need to stop and reconsider
- Most environments are stochastic
  - There is a random element that decides how the world changes
  - Limited sphere of influence: the effects of an agent's actions are not known in advance
  - An agent's actions may even fail
- Stochasticity complicates agent design
Static vs dynamic environments

- Static environments
  - The world only changes through the actions of the agent itself
  - If an agent perceives the world at time $t_0$ and performs no action until $t_1$, the world will not change
- Dynamic environments
  - The world constantly changes
  - Even when an agent is executing an action $a$ with a precondition $p$ which holds true before the execution, $p$ may no longer be true at some point during the execution
  - The outcome of an agent's action cannot be guaranteed, as other agents and the environment itself may interfere

Static vs dynamic environments (cont.)
- It is more difficult to build agents for dynamic environments
- Issue: the agent needs to gather information often enough to keep its information about the world up to date; how often depends on the rate of change of the environment
- An agent also needs to take into account the other agents, and synchronize and coordinate its actions with theirs in order to avoid interference and conflicts

Episodic vs sequential environments
- In an episodic environment
  - A cycle of perception and then action is considered to be a separate episode
  - The agent's performance depends only on the current episode
  - The agent need not worry about the effects of its actions on subsequent episodes and need not think ahead
- In sequential environments each decision made affects the next one
- How an environment is characterized depends on the level of abstraction
Discrete vs continuous environments

- In a discrete environment
  - There is a fixed, finite number of actions and percepts
  - In principle, one can enumerate all possible states and the best action to perform in each of them, though this is not practical
- In continuous environments
  - The number of states may be infinite
Single-agent vs multi-agent environments

- In a single-agent environment there is one agent operating, whereas in multi-agent environments there are many agents that interact with each other
- At times, however, objects or entities that we would not normally consider as agents may have to be modelled as such
  - Nature may be modelled as an agent
- Usually any entity/object that affects or influences the behaviour of the agent under consideration needs to be regarded as an agent
Open environments
- The most complex class of environments comprises those that are partially observable, stochastic, dynamic, sequential, continuous and multi-agent
- These are known as open environments

Performance measure
- Objective: develop agents that perform well in their environments
- A performance measure indicates how successful an agent is
- Two aspects: how and when
Performance measure (cont.)
- How is performance assessed?
  - Different performance measures will be suitable for different types of agents and environments
  - Contrast a trading agent with a vacuum-cleaning agent
  - Objective performance measures are defined by us as external observers of a system
- When is performance assessed?
  - The timing of the assessment is important
  - Continually, periodically or one-shot

Goal states
- One possible way to measure how well an agent is doing is to check that it has achieved its goal
- There may be a number of different action sequences that will enable an agent to satisfy its goal
- A good performance measure should allow the comparison of different world states or sequences of states
Preferences and utilities
- Agents need to be able to express preferences over different goal states
- Each state $s$ can be associated with a utility $u(s)$ for each agent
- The utility is a real number which indicates the desirability of the state for the agent
- For two states $s$ and $s'$, agent $i$:
  - prefers $s$ to $s'$ if and only if $u(s) > u(s')$
  - is indifferent between the two states if and only if $u(s) = u(s')$
- The agent's objective is to bring about states of the environment that maximise its utility
- This is the subject of decision theory

Maximum Expected Utility
- In a stochastic environment the performance of an action may bring about any one of a number of different outcomes
- The expected utility of an action $a$ is:
  $$EU(a) = \sum_{s'} P(s' \mid s, a)\, u(s')$$
- The agent then chooses to perform the action $a^*$ which has the maximum expected utility (MEU):
  $$a^* = \arg\max_{a} \sum_{s'} P(s' \mid s, a)\, u(s')$$
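
The MEU rule can be illustrated with a short Python sketch. The outcome model, the utilities and the action names (`bid_high`, `bid_low`) are made-up values for a hypothetical trading agent; only the rule itself (weight each outcome's utility by its probability, then pick the action with the largest sum) comes from the text above.

```python
from typing import Dict

Outcomes = Dict[str, float]  # resulting state -> probability of reaching it


def expected_utility(outcomes: Outcomes, utility: Dict[str, float]) -> float:
    """Sum of the outcome utilities, weighted by their probabilities."""
    return sum(p * utility[s] for s, p in outcomes.items())


def meu_action(model: Dict[str, Outcomes], utility: Dict[str, float]) -> str:
    """Return the action whose expected utility is largest."""
    return max(model, key=lambda a: expected_utility(model[a], utility))


# Hypothetical utilities of three outcome states.
utility = {"profit": 10.0, "break_even": 0.0, "loss": -20.0}

# Hypothetical outcome distribution for each available action.
model = {
    "bid_high": {"profit": 0.5, "loss": 0.5},
    "bid_low": {"profit": 0.25, "break_even": 0.75},
}

best = meu_action(model, utility)
```

In this example the risky action has the larger possible gain but the lower expected utility, so the rule selects the cautious one.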
Rationality
- A complete specification of the utility function allows rational decisions when:
  - there are conflicting goals, only some of which can be accomplished: the utility function indicates the appropriate tradeoff
  - there are several goals that the agent can endeavour to achieve, none of which can be achieved with certainty: the utility provides a way in which the likelihood of success can be evaluated against the importance of the goals
- Definition: an ideal rational agent performs actions that are expected to maximize its performance measure
- What is rational at any given time depends on:
  - the performance measure that determines the degree of success
  - everything that the agent has perceived so far
  - what the agent expects to perceive and happen in the future
  - what the agent knows about the environment
  - the actions that the agent can perform
Bounded rationality
- Making a decision requires computational power and memory, and computation takes time
- Agents are resource-bounded and this has an impact on their decision-making process: optimal decision making may not be possible
- Ideal rationality may be difficult to achieve
- Bounded rationality:
  - restrictions on the types of options may be imposed
  - the time/computation for option consideration may be limited
  - the search space may be pruned
  - the option selected may be strategically inferior to the optimal one

Rational decision making and optimal policies
- An agent is situated in an environment where there may be other agents present
- Time can be measured in discrete time points $T = \{1, 2, \ldots\}$
- The environment is in a state $s_t$ at time $t$, and the set of world states is denoted by $S$
- The agent perceives its environment through a percept $p_t$
- The agent affects its environment by performing an action $a_t$
- A policy is a complete mapping from states to actions: given a state, a policy tells an agent what action to perform
- An agent may take into account information about the past and the future
Taking into account the past
- Information about the past: percept and action pairs $(p_t, a_t)$
- An agent's policy: $\pi((p_1, a_1), (p_2, a_2), \ldots, p_t) = a_t$
- This can be problematic: (i) the history may be too large, and (ii) computing an optimal policy would be nontrivial from a computational complexity point of view

Markov environments
- In some environments, the state of the world at time $t$ provides a complete description of the history before $t$, hence $p_t = s_t$
- All the information necessary to decide on an optimal action is in $p_t$
- Such a world state is said to be Markov or to have the Markov property
- An agent's policy is then: $\pi(p_t) = a_t$ or $\pi(s_t) = a_t$
- An agent that can ignore the past in this way is called a reactive agent, and its policy reactive or memoryless
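
The difference between a policy over the whole history and a reactive (memoryless) policy can be sketched in Python as follows. The percepts (`clear`, `wall`), the actions and the rule "reverse after two walls in a row" are hypothetical and serve only to show that the first policy needs the past while the second does not.

```python
from typing import Callable, Dict, List, Tuple

Percept = str
Action = str
History = List[Tuple[Percept, Action]]  # earlier (percept, action) pairs
Policy = Callable[[History, Percept], Action]

# Reactive policy: a plain lookup from the current percept to an action.
REACTIVE: Dict[Percept, Action] = {"clear": "forward", "wall": "turn"}


def reactive_policy(history: History, percept: Percept) -> Action:
    """Memoryless: the history argument is ignored."""
    return REACTIVE[percept]


def history_policy(history: History, percept: Percept) -> Action:
    """Uses the past: a second wall in a row triggers a different action."""
    if percept == "wall" and history and history[-1][0] == "wall":
        return "reverse"
    return REACTIVE[percept]


def run(policy: Policy, percepts: List[Percept]) -> List[Action]:
    """Feed a percept sequence to a policy, recording the growing history."""
    history: History = []
    for p in percepts:
        a = policy(history, p)
        history.append((p, a))
    return [a for _, a in history]


percepts = ["clear", "wall", "wall", "clear"]
reactive_actions = run(reactive_policy, percepts)
history_actions = run(history_policy, percepts)
```

The two policies agree except at the third step, where only the history-based policy can tell that the agent has already hit a wall once.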
Taking into account the future
- In a discrete world the agent performs an action $a$ at each time point $t$, and the world changes as a result at $t+1$
- A transition model $T(s, a, s')$ describes how the world state $s$ changes as a result of an action $a$ being performed
- In a deterministic environment the transition model maps $(s, a)$ to a single resulting state $s'$
  - A number of approaches exist to plan ahead
- In a stochastic environment the transition model is $T(s, a, s') = P(s' \mid s, a)$
  - Graph search is not applicable, as there is uncertainty about the transitions between states

MDPs and POMDPs
- The problem of calculating an optimal policy in a fully observable, stochastic environment with a transition model that satisfies the Markov property is called a Markov decision problem (MDP)
- In a partially observable environment, $p_t$ provides limited information and the agent cannot determine which state the world is really in
- The problem of calculating an optimal policy in a partially observable environment is called a partially observable Markov decision problem (POMDP)
- Methods for solving MDPs are not directly applicable to POMDPs
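
The two kinds of transition model described above, deterministic and stochastic, can be sketched as lookup tables in Python. The state names (`s1`, `s2`, `s3`), the single action `go` and the probabilities are illustrative assumptions.

```python
import random
from typing import Dict, Tuple

State = str
Action = str

# Deterministic model: each (state, action) pair has exactly one successor.
DETERMINISTIC: Dict[Tuple[State, Action], State] = {
    ("s1", "go"): "s2",
    ("s2", "go"): "s3",
}

# Stochastic model: each (state, action) pair has a distribution over successors.
STOCHASTIC: Dict[Tuple[State, Action], Dict[State, float]] = {
    ("s1", "go"): {"s2": 0.8, "s1": 0.2},
    ("s2", "go"): {"s3": 0.8, "s2": 0.2},
}


def T(s: State, a: Action, s_next: State) -> float:
    """Transition probability P(s_next | s, a); 0.0 for unlisted transitions."""
    return STOCHASTIC.get((s, a), {}).get(s_next, 0.0)


def sample_next(s: State, a: Action, rng: random.Random) -> State:
    """Draw a successor state according to the stochastic model."""
    dist = STOCHASTIC[(s, a)]
    states = list(dist)
    weights = [dist[x] for x in states]
    return rng.choices(states, weights=weights, k=1)[0]
```

With the deterministic table a planner can follow a unique path such as s1, s2, s3. With the stochastic table the same action sequence yields only a distribution over paths, which is why the result of `sample_next(s, a, rng)` can differ from one call to the next.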
Optimal policies in MDPs
- Given an MDP, one can calculate a policy from the transition model (probabilities) and the utility function
- Assume a stochastic, single-agent world with transition model $P(s' \mid s, a)$. The agent should choose an optimal action $a^*$:
  $$a^* = \arg\max_{a} \sum_{s'} P(s' \mid s, a)\, u(s')$$
- The optimal policy is:
  $$\pi^*(s) = \arg\max_{a} \sum_{s'} P(s' \mid s, a)\, u(s')$$
- But to be able to calculate the policy we need to know the utilities of all states

Example
- The agent can move North, South, East and West
- Bumping into a wall leaves the position unchanged
- Fully observable and stochastic: every action moves the agent in the intended direction with probability $p = 0.8$, but with probability $p = 0.2$ the agent moves at right angles to the intended direction
- Only the utilities of the terminal states are known

[Figure: grid world with a start state and two terminal states with utilities +1 and -1.]
- The utility function needs to be based on a state sequence, $u([s_0, \ldots, s_n])$, instead of a single state
- To use the MEU rule, the utility function needs to be separable: $u([s_0, \ldots, s_n]) = R(s_0) + u([s_1, \ldots, s_n])$, where $R$ is called the reward function
- The immediate reward for each of the non-terminal states is -0.04
- The utility $u(s)$ of a state is defined as:
  $$u(s) = R(s) + \max_{a} \sum_{s'} P(s' \mid s, a)\, u(s')$$
- The rewards of the terminal states are propagated out through all the other states
- This is known as the Bellman equation and forms the basis of dynamic programming
- Calculating the utilities in dynamic programming is an n-step decision problem
Value Iteration
- Two other methods of calculating optimal policies in MDPs are:
  - Value iteration: starting with a transition model and a reward function, calculate a utility for each state, and then use these utilities to create a policy
  - Policy iteration: start with some policy, then repeatedly calculate the utility function for that policy and use it to calculate a new policy, and so on
- A policy usually converges long before the utility function does
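
A minimal Python sketch of value iteration on a grid world of the kind used in the example. Several details are assumptions, because the lecture text does not fix them: a 4x3 grid with one blocked cell at (2, 2), terminal states +1 at (4, 3) and -1 at (4, 2), the 0.2 sideways probability split evenly (0.1 to each side), and no discounting. The resulting numbers depend on these assumptions and need not match the figures of the example.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (column, row), both starting at 1

COLS, ROWS = 4, 3                        # assumption: 4x3 grid
WALL = {(2, 2)}                          # assumption: one blocked cell
TERMINAL = {(4, 3): 1.0, (4, 2): -1.0}   # assumption: terminal positions
STEP_REWARD = -0.04                      # reward of every non-terminal state
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
SIDEWAYS = {"N": "EW", "S": "EW", "E": "NS", "W": "NS"}  # right-angle slips
STATES = [(c, r) for c in range(1, COLS + 1)
          for r in range(1, ROWS + 1) if (c, r) not in WALL]


def move(s: Cell, d: str) -> Cell:
    """Successor cell for direction d; bumping into a wall keeps the position."""
    c, r = s[0] + MOVES[d][0], s[1] + MOVES[d][1]
    inside = 1 <= c <= COLS and 1 <= r <= ROWS and (c, r) not in WALL
    return (c, r) if inside else s


def expected_utility(s: Cell, a: str, u: Dict[Cell, float]) -> float:
    """Expected utility of action a in s: 0.8 intended, 0.1 to each side."""
    left, right = SIDEWAYS[a]
    return (0.8 * u[move(s, a)]
            + 0.1 * u[move(s, left)]
            + 0.1 * u[move(s, right)])


def value_iteration(eps: float = 1e-6, max_iter: int = 10_000) -> Dict[Cell, float]:
    """Repeat the Bellman update until the utilities stop changing."""
    u = {s: TERMINAL.get(s, 0.0) for s in STATES}
    for _ in range(max_iter):
        new_u = dict(u)
        for s in STATES:
            if s not in TERMINAL:
                new_u[s] = STEP_REWARD + max(
                    expected_utility(s, a, u) for a in MOVES)
        delta = max(abs(new_u[s] - u[s]) for s in STATES)
        u = new_u
        if delta < eps:
            break
    return u


def greedy_policy(u: Dict[Cell, float]) -> Dict[Cell, str]:
    """Pick, in every non-terminal state, the action with the highest expected utility."""
    return {s: max(MOVES, key=lambda a: expected_utility(s, a, u))
            for s in STATES if s not in TERMINAL}


utilities = value_iteration()
policy = greedy_policy(utilities)
```

The terminal utilities stay fixed while the utilities of the other states are recomputed in each sweep, so the +1 and -1 values gradually propagate outwards; the policy is then read off the converged utilities.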
Example (cont.)
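[Figure: initial utilities (rewards) of the grid world, with terminal states +1 and -1, and the utilities after 1 iteration.]

The final utilities are:

[Figure, panels (a) and (b): surviving values 0.87, 0.93, 0.82, 0.76, 0.78, 0.72 in (a) and 0.49 in (b).]

The best policy then is:

[Figure: best policy for the grid world, with terminal states +1 and -1.]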