

Reference book :
Artificial Intelligence: A Modern Approach
Lecture by
Zaib un Nisa

 An agent is anything that can be viewed as
perceiving its environment through
sensors and acting upon that
environment through actuators.

Various Agents
 Human Agent: sensors are eyes, ears and other organs; actuators are hands, legs, mouth and other body parts
 Robotic Agent: sensors are cameras, infrared range finders and other devices; actuators are various motors
 Software Agent: receives keystrokes, file contents and network packets as sensory input; acts on the environment by displaying on screen, writing files and sending network packets
Agent’s Concepts
 Percept
 The agent’s perceptual inputs at any given instant: the information the agent is observing from its environment
 Percept Sequence
 The complete history of everything the agent has ever perceived
 The agent’s choice of action at any given instant can depend on the entire percept sequence observed to date
 By specifying the agent’s choice of action for every possible percept sequence, we have said more or less everything there is to say about the agent
 The agent’s behavior is described by the “agent function”, which maps any given percept sequence to an action
 By tabulating the agent function that describes any given agent, we come up with a very large table, possibly infinite
 This table is an external characterization of the agent; internally, the agent has an agent program which is responsible for implementing the agent’s function
 The agent function is an abstract mathematical description; the agent program is a concrete implementation, running on the agent’s architecture
Example : Vacuum-cleaner
 This
world is so simple that we can describe
everything about it
 This particular world has just two locations,
squares A and B
 The vacuum agent perceives which square it
is in and whether there is dirt in the square
 Can choose to move left, move right and
suck up the dirt or do nothing
 One very simple agent function: if current
square is dirty, then suck, otherwise move to
the other square.
Tabulation of Agent function
Percept Sequence                            Action
{A, Clean}                                  Right
{A, Dirty}                                  Suck
{B, Clean}                                  Left
{B, Dirty}                                  Suck
{A, Clean}, {A, Clean}                      Right
{A, Clean}, {A, Dirty}                      Suck
...                                         ...
{A, Clean}, {A, Clean}, {A, Clean}          Right
{A, Clean}, {A, Clean}, {A, Dirty}          Suck
...                                         ...
Agent Program
function REFLEX-VACUUM-AGENT ([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
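A runnable version of this pseudocode might look like the following Python sketch; the tuple form of the percept is an assumed convention:

```python
# Simple reflex vacuum agent: decides using only the current percept.
# Percept format (location, status) is an assumed convention.
def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:  # location == "B"
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(reflex_vacuum_agent(("A", "Clean")))  # Right
```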
 Various agents can be described by filling in the right-hand side of the table in various ways
 Which is the right way to fill in the table?
 What makes an agent good or bad?
 Intelligent or stupid?
Good Behavior: The Concept
Of Rationality
 Rational Agent
 One that does the right thing
 Every entry in the table for the agent function is filled in correctly
 What does it mean to do the right thing?
 The right action is the one that causes the agent to be most successful
 So we need some way to measure success
 in order to determine what is rational

Performance Measure
 It embodies the criterion for success of an agent’s behavior
 When an agent is placed in an environment, it
generates a sequence of actions
 These actions are generated according to the
percepts it receives
 This sequence of actions causes the environment to
go through a sequence of states
 If the sequence is desirable, then the agent has performed well
Performance Measure
 There is not one fixed measure suitable for all agents
 If we ask the agent … it may not answer, or it will
delude itself
 So the performance measure is devised by the designer
who is designing the agent
 As a general rule, it is better to design performance
measures according to what one actually wants in the
environment, rather than according to how one thinks the agent
should behave
Performance Measure
 The selection of a performance measure is not always easy
 In the vacuum-cleaner example: “clean floor” is based
on average cleanliness over time
 The same average cleanliness can be achieved by two
different agents
 One does a mediocre job all the time
 The other cleans energetically but takes long breaks
 Which one is better?
 A reckless life of highs and lows, or a safe but
humdrum existence?
 The answer is up to you
 What is rational at any given time depends on
four things
 The performance measure that defines the criterion of success
 The agent’s prior knowledge of the environment
 The actions that the agent can perform
 The agent’s percept sequence to date
Definition of Rational Agent
 For each possible percept sequence,
 a rational agent should select an action that is
expected to maximize its performance measure,
 given the evidence provided by the percept
sequence and whatever built-in knowledge the
agent has
 Consider the example of the vacuum-cleaner agent
that cleans the square if it is dirty and moves
to the other square if it is not
 Is this a rational agent?
 First we need to check what the performance measure is
 What is known about the environment?
 What sensors and actuators does the agent have?
Let us assume the following
 The performance measure awards one point for each clean square
at each time step, over a lifetime of 1000 time steps
 The “geography” of the environment is known a priori, but the dirt
distribution and the initial location of the agent are not
 Clean squares stay clean and sucking cleans the current square
 The Left and Right actions move the agent left and right except when this
would take the agent outside the environment, in which case the agent remains
where it is
 The only available actions are Left, Right, Suck, and NoOp (do nothing)
 The agent correctly perceives its location and whether that location
contains dirt
 We claim that under these circumstances the agent is indeed rational
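Under the assumptions above, the claim can be checked with a toy simulation; the dict-based representation of dirt and location is an assumption made for illustration:

```python
# Toy simulation of the two-square vacuum world under the stated
# performance measure: one point per clean square at each time step,
# over a lifetime of 1000 steps.
def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

def simulate(dirt, location, steps=1000):
    dirt = dict(dirt)  # e.g. {"A": True, "B": True}; True means dirty
    score = 0
    for _ in range(steps):
        score += sum(1 for is_dirty in dirt.values() if not is_dirty)
        percept = (location, "Dirty" if dirt[location] else "Clean")
        action = reflex_vacuum_agent(percept)
        if action == "Suck":
            dirt[location] = False
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
    return score

# Both squares dirty, agent starts in A: it cleans both within 3 steps
# and then collects 2 points per step for the rest of its lifetime.
print(simulate({"A": True, "B": True}, "A"))  # 1996
```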
Omniscience (knowing everything / having all knowledge)
 We need to be careful to distinguish between
omniscience and rationality
 Being rational means being able to act sensibly given what the agent knows
 An omniscient agent knows the actual outcome of
its actions and can act accordingly; but omniscience is
impossible in reality
 Rationality maximizes expected performance
(the performance that can be expected given the percepts to date)
 Omniscience maximizes actual performance
(the performance actually achieved as a result of the agent’s actions)
 Suppose I am walking along one side of a road and I see an
old friend on the other side. There is no traffic on the
road and I am not otherwise engaged, so, being rational, I
cross the street. Meanwhile, at 33,000 feet, a door
falls off a passing airliner, and before I make it to the other
side of the street I am flattened.
 Was I rational to cross the street?
 Rationality is not the same as perfection.
 You can maximize expected performance, but not actual
performance; perfection would require omniscience
 Retreating from a requirement of perfection is not
just a question of being fair to agents
 If we expect an agent to do what turns out to be the best action after the
fact, it will be impossible to design an agent to fulfill
this specification
 Rationality does not require omniscience; the rational
choice is based on the percept sequence to date
 We must also ensure that we haven’t inadvertently
allowed the agent to engage in unintelligent activities
 e.g. allowing it to cross a busy road without looking
both ways
Rationality : Information
Gathering + Learning
 Doing actions in order to modify future percepts is sometimes called
“information gathering”
 An example of information gathering is the exploration that must
be undertaken by a vacuum-cleaning agent in an initially
unknown environment
 The definition of rationality requires that a rational agent should not
only gather information, but also learn as much as possible from what it
perceives
 The agent’s initial configuration could reflect some prior
knowledge of the environment, but as the agent gains
experience this may be modified and augmented
Rationality: Autonomy
 Successful agents split the task of computing the agent function into
three different periods
 When the agent is being designed, some of the computation is
done by its designers
 When it is deliberating (considering/choosing) its
next action, the agent does more computation
 As it learns from experience, it does more computation to
decide how to modify its behavior
 If the agent relies on the prior knowledge given by its designer
rather than on its own percepts, we say that the agent lacks autonomy
 A rational agent should be autonomous: it should learn what it
can, to compensate for partial or incorrect prior knowledge
The Nature Of Environments
 Task environments are the “problems” to which agents are the “solutions”
 How do we specify a task environment?
 There are many types of task environment, and the type directly
affects the appropriate design of the agent program
 Under the heading of task environment we’ll group
 The performance measure,
 The environment,
 The agent’s actuators and
 Sensors
 We’ll call it PEAS (Performance, Environment, Actuators, Sensors)
PEAS Description
Agent Type: Taxi driver
 Performance Measure: safe, fast, legal, comfortable trip, maximize profit
 Environment: roads, other traffic, pedestrians, customers
 Actuators: steering, accelerator, brake, signal, horn, display
 Sensors: cameras, speedometer, GPS, accelerometer, engine sensors
Properties of Task Environment
 The range of task environments that arise in AI is vast
 We’ll identify a small number of dimensions along which to
categorize task environments
 These dimensions determine
 The appropriate design and the applicability of each
of the principal families of techniques (for agent
implementation)
Dimensions of Task Environments
These definitions are informal
 Fully observable vs Partially observable
 Deterministic vs Stochastic (determined by a random distribution of probabilities)
 Episodic vs Sequential
 Static vs Dynamic
 Discrete vs Continuous
 Single-agent vs Multiagent
Fully vs Partially Observable
 If the agent’s sensors give it access to the
complete state of the environment at each point
in time, then we say the task environment is fully observable
 Fully observable environments are
convenient because the agent need not
maintain any internal state to keep track of
the world
 An environment may be partially observable
because of noisy and inaccurate sensors
Deterministic (certain) vs
Stochastic (determined by a random
distribution of probabilities / not certain)
 If the next state of the environment is completely
determined by the current state and the action executed
by the agent, then the environment is deterministic;
otherwise it is stochastic
 An agent need not worry about uncertainty in a fully
observable, deterministic environment
 If the environment is partially observable, then it may
appear to be stochastic
 This is often the case when the environment is complex,
making it hard to keep track of all the unobserved aspects
 Taxi driving is clearly stochastic
 One cannot predict the behavior of traffic
 Tires can blow out, or the engine can seize up without warning
 The vacuum world described in the previous example is deterministic
 Variations can include stochastic elements such as
randomly appearing dirt and an unreliable suction mechanism
 If the environment is deterministic except for the actions
of other agents, we say the environment is strategic
Episodic vs Sequential
 In an episodic environment, the agent’s experience is divided into atomic episodes
 Each episode consists of the agent perceiving and
then performing a single action
 The next episode does not depend on the actions taken
in previous episodes
 The choice of action in each episode depends only on the
episode itself
 The current decision does not affect the future
 When one episode completes the next one starts, with
no effect of one episode on another, as
happens in a semester system
 In a sequential environment, the current decision could affect all future decisions
 Example
 Chess and taxi driving are sequential: in both,
short-term actions have long-term consequences
 A chess championship can be episodic,
because the actions taken in one game of chess do
not affect the others
Static vs Dynamic
 If the environment can change while the agent is
deliberating (considering what to do), we say the
environment is dynamic for that agent
 Otherwise it is static
 Static environments are easy to deal with
 The agent need not keep track of the environment while
making a decision
 It need not worry about the passage of time
 A dynamic environment is continuously asking the agent
what to do
 If the agent hasn’t decided yet, that counts as deciding to do nothing
Static vs Dynamic
 If the environment does not change with the passage
of time but the agent’s performance score does,
then the environment is semi-dynamic
 Example
 Taxi driving is clearly dynamic
 Chess, when played with a clock, is semi-dynamic
 Crossword puzzles are static
Discrete vs Continuous
 The discrete/continuous distinction can be applied to
the state of the environment
 To the way time is handled
 To the percepts and actions of the agent
 Example
 A chess game has a finite number of distinct states
 Chess also has a discrete set of percepts and actions, so it
is an example of a discrete environment
 Taxi driving is a continuous-state and continuous-time problem
Single-Agent vs Multiagent
 An agent solving a crossword puzzle by itself
is in a single-agent environment
 Playing chess is in a two-agent environment
 Suppose we have two agents A and B
 If B’s behavior is best described as maximizing a
performance measure whose value depends on
agent A’s behavior, then B is treated as an agent
 Example
 In chess, the opponent entity B is trying to
maximize its performance measure, which, by the
rules of chess, minimizes the performance measure of A:
chess is a competitive multiagent environment
 In the taxi-driving environment, avoiding collisions
maximizes the performance measure of all agents, so it is
a partially cooperative multiagent environment
Task Environment           Observable  Deterministic  Episodic    Static   Discrete    Agents
Crossword puzzle           Fully       Deterministic  Sequential  Static   Discrete    Single
Chess with a clock         Fully       Strategic      Sequential  Semi     Discrete    Multi
Poker                      Partially   Strategic      Sequential  Static   Discrete    Multi
Backgammon                 Fully       Stochastic     Sequential  Static   Discrete    Multi
Taxi driving               Partially   Stochastic     Sequential  Dynamic  Continuous  Multi
Medical diagnosis          Partially   Stochastic     Sequential  Dynamic  Continuous  Single
Image analysis             Fully       Deterministic  Episodic    Semi     Continuous  Single
Part-picking robot         Partially   Stochastic     Episodic    Dynamic  Continuous  Single
Refinery controller        Partially   Stochastic     Sequential  Dynamic  Continuous  Single
Interactive English tutor  Partially   Stochastic     Sequential  Dynamic  Discrete    Multi
The Structure of Agents
 The job of AI is to design the agent program
 The agent program implements the agent function mapping
percepts to actions
 This program runs on some sort of computing device with
physical sensors and actuators; we call this the architecture
 Agent = Architecture + Program
 The program we choose has to be one that is appropriate
for the architecture
 If the program is going to recommend actions like Walk, the
architecture had better have legs
The Structure of Agents
 The architecture makes the percepts from the
sensors available to the program
 Runs the program
 And feeds the program’s action choices to the
actuators as they are generated
Agent Program
They take
 The current percept as input from the sensors and
 Return an action to the actuators
Difference b/w agent program and agent function
 Agent program: takes the current percept as input
 Agent function: takes the entire percept history
 If the agent’s actions depend on the entire percept
sequence, the agent will have to remember the percepts
Agent Program
function TABLE-DRIVEN-AGENT (percept) returns an action
  static: percepts, a sequence, initially empty
          table, a table of actions, indexed by percept
                 sequences, initially fully specified

  append percept to the end of percepts
  action ← LOOKUP(percepts, table)
  return action
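A Python sketch of this table-driven scheme might look like the following; the toy table entries are illustrative assumptions, not a full specification:

```python
# Table-driven agent (a sketch): it appends each percept to the stored
# sequence and looks the whole sequence up in a table mapping percept
# sequences to actions.
def make_table_driven_agent(table):
    percepts = []  # the "static" percept sequence, initially empty
    def agent(percept):
        percepts.append(percept)
        return table.get(tuple(percepts))  # action <- LOOKUP(percepts, table)
    return agent

# A tiny, partially specified table for the vacuum world.
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Dirty"), ("A", "Clean")): "Right",
}
agent = make_table_driven_agent(table)
print(agent(("A", "Dirty")))  # Suck
print(agent(("A", "Clean")))  # Right (looked up under the full 2-percept sequence)
```

Note that the table must be indexed by the entire percept history, which is why it grows so explosively for any realistic environment.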
Explanation of Agent program
 The agent program on the previous slide
keeps track of the percept sequence
 It then uses the sequence to index into a table of actions to
decide what to do
 The table represents explicitly the agent function
that the agent embodies
 To build a rational agent in this way, we must
construct a table that contains the appropriate action
for every possible percept sequence
 The lookup table for chess would have at least 10^150 entries
 The number of atoms in the observable universe is less
than 10^80
 The daunting size of these tables means that
 No physical agent in this universe will have the space to store the table
 The designer would not have time to create such a table
 No agent could ever learn all the right table entries from its experience
 Even if the environment is simple enough to yield a table of feasible
size, the designer has no guidance about how to fill in the table
 Despite all this, TABLE-DRIVEN-AGENT does do what
we want: it implements the desired agent function
 Challenge: the key challenge for AI is to find out how to write programs
that produce rational behavior from a small amount of
code rather than from a large number of table entries
 Huge tables of square roots have been replaced by
a five-line program using Newton’s method
 Can AI do for general intelligent behavior what
Newton did for square roots?
 We believe the answer is yes
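For illustration, the square-root program alluded to above can be written with Newton's method in a few lines of Python:

```python
# Newton's method for square roots: a few lines of iteration replace
# an enormous lookup table of precomputed values.
def newton_sqrt(x, tolerance=1e-10):
    guess = x if x > 1 else 1.0
    while abs(guess * guess - x) > tolerance:
        guess = (guess + x / guess) / 2  # Newton update for g^2 - x = 0
    return guess

print(newton_sqrt(2.0))  # approximately 1.41421356...
```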
Agent Program Types
Four basic kinds of agent programs
 Simple Reflex Agents
 Model-Based Agents
 Goal-Based Agents
 Utility-Based Agents
Simple-Reflex Agents
 The simplest kind of agent is the “simple reflex agent”
 These agents select actions on the basis of the
current percept, ignoring the rest of the percept history
 The vacuum agent whose agent function is tabulated is
a simple reflex agent
 Because its decision is based only on the current
location and on whether that location contains dirt
Tabulation of Agent function
Percept Sequence                            Action
{A, Clean}                                  Right
{A, Dirty}                                  Suck
{B, Clean}                                  Left
{B, Dirty}                                  Suck
{A, Clean}, {A, Clean}                      Right
{A, Clean}, {A, Dirty}                      Suck
...                                         ...
{A, Clean}, {A, Clean}, {A, Clean}          Right
{A, Clean}, {A, Clean}, {A, Dirty}          Suck
...                                         ...
Simple Reflex Agent’s Program
function REFLEX-VACUUM-AGENT ([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
Simple Reflex Agent
 The vacuum agent program is very small
compared to the corresponding table
 The most obvious reduction comes from ignoring the
percept history, which cuts the number of
relevant percept sequences from 4^T to just 4
Reflex Agent: Condition-Action Rules
 If car-in-front-is-braking then initiate-braking
 Humans also have many such condition-action rules,
some of which are learned responses (as for
driving)
 Some are innate reflexes (such as blinking
when something approaches the eye)
Simple Reflex Agent
 The program given in the previous slides is
specific to the vacuum environment
 A more general and flexible approach is to
build a general-purpose interpreter for
condition-action rules
 Then to create rule sets for specific task environments
Schematic Diagram of Simple
Reflex Agent
(Diagram: the sensors report “what the world is like now”; the condition-action rules then determine “what action I should do now”, which is passed to the actuators.)
Simple Reflex Agent
 The previous diagram gives the structure of the
general program in schematic form
 It shows how the condition-action rules allow the agent
to make the connection from percept to action
Simple Reflex Agent:
Condition action rule
function SIMPLE-REFLEX-AGENT (percept) returns an action
  static: rules, a set of condition-action rules

  state ← INTERPRET-INPUT(percept)
  rule ← RULE-MATCH(state, rules)
  action ← RULE-ACTION[rule]
  return action
 The INTERPRET-INPUT function generates an abstracted
description of the current state from the percept
 RULE-MATCH returns the first rule in the set of rules that matches the
given state description
 The description in terms of rules and matching is conceptual; an actual
implementation could be in the form of Boolean logic gates
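Such a rule interpreter might be sketched in Python as follows; the encoding of rules as (predicate, action) pairs is an assumption made for illustration:

```python
# General-purpose interpreter for condition-action rules (a sketch).
def interpret_input(percept):
    # Here the percept is already an abstract state description; in a
    # real agent this would build the state from raw sensor data.
    return percept

def simple_reflex_agent(percept, rules):
    state = interpret_input(percept)
    for condition, action in rules:  # RULE-MATCH: first rule whose condition fires
        if condition(state):
            return action
    return "NoOp"  # no rule matched

# A rule set for the vacuum world, encoded for this interpreter.
rules = [
    (lambda s: s["status"] == "Dirty", "Suck"),
    (lambda s: s["location"] == "A", "Right"),
    (lambda s: s["location"] == "B", "Left"),
]
print(simple_reflex_agent({"location": "A", "status": "Dirty"}, rules))  # Suck
```

The interpreter itself is task-independent; only the rule set changes between environments.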
Simple Reflex Agent
 Reflex agents are very simple, so their
intelligence is very limited
 They are best suited to fully observable environments
 They can be used only if the correct decision
can be made on the basis of the current percept alone
Model-Based Reflex Agent
 It is used for environments that are only partially
observable
 The most effective way to handle partial observability
is to keep track of the part of the world the agent cannot see now
 The agent must maintain some sort of internal
state that depends on the percept history
 That internal state can reflect at least some of the unobserved
aspects of the current state
 e.g., keeping a map of an area when there is fog
Model-Based Reflex Agent
 Updating the internal state information as time goes by
requires two kinds of knowledge
 This knowledge has to be encoded in the agent program
 First, we need some information about how the
world evolves independently of the agent
 e.g., an overtaking car will be closer behind than it was a
few moments ago
 Second, how the agent’s own actions affect the world
 e.g., when the agent turns the steering wheel clockwise, the car turns to
the right
Model-Based Agent
 This knowledge about “how the world works”,
whether implemented in simple Boolean circuits
or in complete scientific theories, is called a
model of the world
 An agent that uses such a model is called
a model-based reflex agent
(Diagram: the internal state, together with knowledge of “how the world evolves” and “what my actions do”, is combined with the current percept to determine “what the world is like now”; the condition-action rules then choose “what action I should do now”, which is passed to the actuators.)
Model-Based Agent
 The previous diagram shows how the current
percept is combined with the old internal
state to generate the updated description of
the current state
Model-Based Agent’s Program
function REFLEX-AGENT-WITH-STATE (percept) returns an action
  static: state, a description of the current world state
          rules, a set of condition-action rules
          action, the most recent action, initially none

  state ← UPDATE-STATE(state, action, percept)
  rule ← RULE-MATCH(state, rules)
  action ← RULE-ACTION[rule]
  return action
Model-Based Agent
 The interesting part is the function UPDATE-STATE
 It is responsible for creating the new internal state description
 It interprets the new percept in the light of existing
knowledge about the state
 It uses information about “how the world
evolves” to keep track of the unseen parts of the world
 It also knows what the agent’s actions do
to the state of the world
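As a rough illustration, UPDATE-STATE and the surrounding loop might be sketched in Python for the vacuum world; the dict-based internal state and the percept format are assumptions:

```python
# Model-based reflex agent sketch: the internal state remembers the
# last-known status of each square, so the agent retains information
# about the square it cannot currently see.
def update_state(state, action, percept):
    # action is unused in this trivial model, but kept to mirror the
    # UPDATE-STATE(state, action, percept) signature in the pseudocode.
    location, status = percept
    new_state = dict(state)
    new_state["location"] = location
    new_state[location] = status  # remember what we observed here
    return new_state

def model_based_agent():
    state = {"location": None, "A": "Unknown", "B": "Unknown"}
    action = None
    def step(percept):
        nonlocal state, action
        state = update_state(state, action, percept)
        here = state["location"]
        if state[here] == "Dirty":
            action = "Suck"
        else:
            action = "Right" if here == "A" else "Left"
        return action
    return step

agent = model_based_agent()
print(agent(("A", "Dirty")))  # Suck
print(agent(("A", "Clean")))  # Right
```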
Goal-Based Agent
 Knowing about the current state of the environment is not always enough
to decide what to do
 As well as a current state description, the agent needs some sort
of goal information
 This goal information describes situations that are desirable
 Sometimes goal-based action selection is straightforward
 When goal satisfaction results immediately from a single action
 Sometimes it is tricky, when the agent has to consider long sequences
of twists and turns to find a way to achieve the goal
 Search and Planning are the subfields of AI devoted to this kind of job
 e.g., winning a chess game
Goal-Based Agent
(Diagram: the model (“how the world evolves”, “what my actions do”) predicts “what it will be like if I do action A”; comparing these predictions against the goals determines “what action I should do now”, which is passed to the actuators.)
Goal-Based Agent
 Decision making of this kind is fundamentally
different from condition-action rules
 It involves consideration of the future
 What will happen if I do such and such?
 Will that make me happy?
 In reflex agents this information is not represented explicitly
 Reflex agents have built-in rules that map directly from
percepts to actions
Goal-Based Agent
 Example
 A reflex agent brakes when it sees that the car in front has its brake
lights on
 A goal-based agent, when it sees that the car in front has its brake lights on,
has the goal of not hitting the car; it will achieve this goal by
slowing down
 Goal-based agents are less efficient, but more flexible
 Flexible because the knowledge that supports their decisions is represented
explicitly and can be modified
 If it starts raining, the agent can update its knowledge about applying the
brakes; this will cause all the relevant behaviors to change to suit
the new conditions
 For a reflex agent we would have to rewrite many condition-action rules
Utility-Based Agent
 Goals alone are not enough to generate high-quality
behavior in most environments
 Goals provide only a crude binary distinction
between “happy” and “unhappy” states
 Whereas a more general performance measure
should allow a comparison of different world states according to
exactly how happy they would make the agent
 For example, there are many action sequences that
will get the taxi to its destination
 But some are quicker, safer, more reliable or cheaper
than others
Utility-Based Agent
 The term “happy” is not very scientific, so it is replaced by the term utility
 If one world state is preferred to another, then it has
higher utility for the agent
 A utility function maps a state (or a sequence of states)
to a real number, which describes the associated
degree of happiness
 A utility function allows rational decisions in two kinds
of cases where goals are inadequate (incomplete, not enough)
 First, when there are conflicting goals, only some of which
can be achieved (e.g. speed and safety)
 Second, when there are several goals that the agent can aim
for, none of which can be achieved with certainty
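The idea can be sketched in Python; the predicted outcomes and the utility weights below are purely illustrative assumptions:

```python
# Utility-based action selection (a sketch): a utility function maps
# predicted outcome states to real numbers, and the agent chooses the
# action with the highest predicted utility.
def choose_action(actions, predict, utility):
    return max(actions, key=lambda a: utility(predict(a)))

def predict(action):
    # Hypothetical predicted outcomes for two taxi routes.
    outcomes = {
        "highway":    {"minutes": 20, "accident_risk": 0.30},
        "back_roads": {"minutes": 35, "accident_risk": 0.05},
    }
    return outcomes[action]

def utility(outcome):
    # Trades off speed against safety; the weight 100 is an assumption
    # that encodes how much the agent values safety over time saved.
    return -outcome["minutes"] - 100 * outcome["accident_risk"]

print(choose_action(["highway", "back_roads"], predict, utility))  # back_roads
```

Changing the weight in the utility function changes the trade-off: this is exactly the kind of comparison that a bare goal ("reach the destination") cannot express.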
Learning Agents
 How do agent programs come into being?
 Turing proposes a method: to build learning
machines and then to teach them
 This is a preferred method for creating “state-of-the-art”
systems
 Learning allows the agent to operate in initially
unknown environments
 And to become more competent than its initial knowledge alone would allow
(Diagram: the sensors feed the critic, which compares what it observes against a fixed performance standard and sends feedback to the learning element; the learning element in turn modifies the performance element.)
Learning Agent : Conceptual Components
A learning agent can be divided into four
conceptual components
 Learning Element
 Performance Element
 Critic
 Problem Generator
Learning Element
o It is responsible for making improvements
o It makes improvements on the basis of the
feedback that it gets from the critic
Performance Element
 It is responsible for selecting external actions
 The performance element is what we have
previously considered to be the entire agent
 It takes in percepts and decides on actions
Critic
 It is responsible for giving feedback on how the
agent is doing
 It determines how the performance element should
be modified to do better in the future
 The critic tells the learning element how well the
agent is doing with respect to a fixed performance standard
 The critic is necessary because the percepts
themselves provide no indication of the agent’s success
Example : Critic
 A chess program could receive a percept
indicating that it has checkmated its opponent
 But it needs a performance standard to know
that this is a good thing; the percept itself does not
say so (by this we are also trying to create a sense of doing
well/happiness or doing badly/unhappiness in agents)
 It is important that the performance standard
be fixed
Design of Learning Agent
 The design of the learning element depends very
much on the design of the performance element
 When trying to design an agent that learns a
certain capability, the first question is not:
 “How am I going to get it to learn this?” but rather
 “What kind of performance element will my agent
need to do this once it has learned how?”
Problem Generator
 The last component of the learning agent is the Problem Generator
 It is responsible for suggesting actions that will lead to new and
informative experiences
 The point here is that if the performance element had its way, it
would keep on doing the actions that are best, given what it knows
 But if the agent is willing to explore a little, and do some
suboptimal actions in the short run
 These actions may help the agent perform much better in the
long run
 It is the job of the problem generator to suggest these exploratory actions
Example : Automated Taxi
 The performance element consists of whatever
 Collection of knowledge
 And procedures
 The taxi has for selecting its driving actions
 The taxi goes out on the road and drives, using this performance element
 The critic observes the world and passes information along to the
learning element: e.g. if the taxi makes a quick left turn across
three lanes of traffic, the critic observes the shocking language
used by the other drivers
 The critic observes the reaction in the environment to the actions
of the agent; through the critic, the agent learns, but based on
experience
 When the problem generator is involved, learning is performed
through exploration
Example : Automated Taxi
 From this experience, the learning element is
able to formulate a rule saying that was a bad
action; the performance element is modified
by installing the new rule
 The problem generator might identify certain
areas of behavior in need of improvement
and suggest experiments
 Such as trying out the brakes on different road
surfaces under different conditions