3 views

Uploaded by api-19643506

save

You are on page 1of 20

Utility theory

Chapter 16, AIMA

The utility function U(S)

• An agent’s preferences between different

states S in the world are captured by the

Utility function U(S).

**• If U(Si) > U(Sj) then the agent prefers state Si
**

before state Sj

• If U(Si) = U(Sj) then the agent is indifferent

between the two states Si and Sj

Maximize expected utility

A rational agent should choose the action

that maximizes the the agent’s expected

utility (EU):

**EU ( A | E) = ∑ P ( Resulti ( A) | Do( A), E)U ( Resulti ( A))
**

i

**Where Resulti(A) enumerates all the possible
**

resulting states after doing action A.

The basis of utility theory

Notation:

A ≻ B A is preferred to B

A ∼ B The agent is indifferent between A and B

A ≿ B The agent prefers A to B, or is indifferent

between them.

A lottery is described with

L = [ p1 , C1 ; p2 , C2 ; ; pn , Cn ]

The six axioms of utility theory

Orderability (A ≻ B) ∨ (B ≻ A) ∨ (A ∼ B) You must make a decision

Transitivity (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)

Continuity (A ≻ B ≻ C) ⇒ ∃p [p,A; 1-p,C] ∼ B

Substitutability (A ∼ B) ⇒ [p,A; 1-p,C] ∼ [p,B; 1-p,C]

Monotonicity (A ≻ B) ⇒ (p ≥ q ⇔ [p,A; 1-p,B] ≿ [q,A; 1-q,B])

Decomposability [p,A; 1-p,[q,B; 1-q,C]] ∼ [p,A; (1-p)q,B; (1-p)(1-q),C]

**It follows from these axioms that there exists a real-valued function U
**

that operates on states such that

U(A) > U(B) ⇔ A ≻ B

U(A) = U(B) ⇔ A ∼ B

The St. Petersburg ”paradox”

You are offered to play the following game (bet):

You flip a coin repeatedly until you get your first

heads. You will then be paid $2 to the power of

every flip you made, including the final one (the

price matrix is below).

**How much are you willing to pay to participate
**

(participation is not free)?

Toss Winning

H $2

TH $4

TTH $8

TTTH $16

... ...

The St. Petersburg ”paradox”

What is the expected winning in this betting

game?

N

2kN N

Winning N

= ∑ P (k )W (k ) =∑ k = ∑1 = N

k =1 k =1 2 k =1

lim Winning N

=∞

N →∞

**A rational player should be willing to pay
**

any sum of money to participate...

...if $ = Utility

The students in previous years’ classes have offered $4 or less on average...

The St. Petersburg ”paradox”

Bernoulli (1738): The utility of money is not

money; it is more like log(Money).

N N

ln(2 k ) N

k

Utility N

= ∑ P (k )U (k ) =∑ k = ln(2)∑ k

k =1 k =1 2 k =1 2

1 N 1

N +1

= 2 ln(2) 1 − − N

2 2

lim Utility N = 2 ln(2)

N →∞

Mr Beard’s utility curve General ”human nature” utility curve

Risk averse

Risk seeking

Lottery game 1

**You can choose between
**

$1,000,000 alternatives A and B:

A A) You get $1,000,000

for sure.

B) You can participate in

a lottery where you

$5,000,000 can win up to $5 mill.

0.1

B

0.89

$1,000,000

0.01

$0

Lottery game 2

$5,000,000

0.1

**C You can choose between
**

$0 alternatives C and D:

0.9 C) A lottery where you

can win $5 mill.

D) A lottery where you

can win $1 mill.

D

0.11

$1,000,000

0.89

$0

Lottery preferences

• People should select A and D, or B and C. Otherwise they

are not being consistent...

**U ( A) − U ( B) = U ($1M ) − 0.1U ($5M ) − 0.89U ($1M ) − 0.01U ($0) =
**

0.11U ($1M ) − 0.1U ($5M ) − 0.01U ($0)

**U ( D) − U (C ) = 0.11U ($1M ) + 0.89U ($0) − 0.1U ($5M ) − 0.9U ($0) =
**

0.11U ($1M ) − 0.1U ($5M ) − 0.01U ($0)

**Allais paradox. Utility function does not capture a human’s fear of
**

looking like a complete idiot.

**In last years classes, fewer than 50% have been consistent...
**

Form of U(S)

If the value of one attribute does not influence

one’s opinion about the preference for another

attribute, then we have mutual preferential

independence and can write:

V ( X 1 , X 2 ) = V1 ( X 1 ) + V2 ( X 2 )

Where V(X) is a value function (expressing the [monetary] value)

Example: The party problem

We are about to give a

wedding party. It will be

Rain Relieved

held during summer-

time.

Should we be outdoors or

In ¬Rain Regret

indoors?

The party is such that we

can’t change our minds Rain Disaster!

on the day of the party Out

(different locations for

indoors and outdoors). Perfect!

¬Rain

What is the rational

decision?

**Example adapted from Breese & Kooler 1997
**

Example: The party problem

The value function: Assign

a numerical (monetary)

Rain Relieved

value to each outcome.

U = 1.88

(We avoid the question on how

this is done for the time being)

In ¬Rain Regret

U = 1.41

**Location Weather Value
**

Rain Disaster!

Indoors Sun $25 Out

U = 0.00

Indoors Rain $75

Outdoors Sun $100 ¬Rain Perfect!

Outdoors Rain $0 U = 2.00

Let U(S) = log[V(S)+1]

**Example adapted from Breese & Kooler 1997
**

Example: The party problem

Get weather statistics for your

location in the summer (June).

Rain Relieved

U = 1.88

Location P(Rain)

In ¬Rain Regret

Stockholm, Sweden 18/30 U = 1.41

Bergen, Norway 19/30

San Fransisco, USA 1/30 Rain Disaster!

Seattle, USA 9/30 Out

U = 0.00

Paris, France 14/30

Haifa, Israel 0/30 ¬Rain Perfect!

U = 2.00

Rain probabilities from Weatherbase

www.weatherbase.com/

**Example adapted from Breese & Kooler 1997
**

Example: The party problem

Example:

Stockholm, Sweden

Rain Relieved

U = 1.88

18 12

EU (in) = × 1.88 + ×1.41 = 1.7

30 30

In ¬Rain Regret

18 12

EU (out ) = × 0.00 + × 2.00 = 0.8 U = 1.41

30 30

Rain Disaster!

Be indoors! Out

U = 0.00

¬Rain Perfect!

U = 2.00

**Example adapted from Breese & Kooler 1997
**

Example: The party problem

Example:

San Fransisco, California

Rain Relieved

U = 1.88

1 29

EU (in) = × 1.88 + × 1.41 = 1.4

30 30

In ¬Rain Regret

1 29

EU (out ) = × 0.00 + × 2.00 = 1.9 U = 1.41

30 30

Rain Disaster!

Be outdoors! Out

U = 0.00

¬Rain Perfect!

U = 2.00

The change from outdoors to indoors

occurs at P(Rain) > 7/30

**Example adapted from Breese & Kooler 1997
**

Decision network for

the party problem

Decision represented

by a rectangle

**Chance (random Location
**

variable)

represented by an

oval.

Happy

Weather U

guests

Utility function

represented by a

diamond.

The value of information

• The value of a given piece of information

is the difference in expected utility value

between best actions before and after

information is obtained.

**• Information has value to the extent that it
**

is likely to cause a change of plan and to

the extent that the new plan will be

significantly better than the old plan.

- Decision Making RenderUploaded byIvan Chiu
- chap1Uploaded byNhím Biển
- Eco 2020 i Midterm 2013 f SolutionsUploaded byalbertwing1010
- Last-Place Aversion and Low-Income Opposition to Income Redistribution- NBER 2011JulyUploaded bychgeorge8
- Economics-Demand and Consumer BahaviourUploaded byManasi D Cuty Pie
- Utility FunctionUploaded byPreston Lee
- Of EconomicUploaded byscottlinyupeng
- Chapter 4 - Theory of Consumer BehaviourUploaded byArup Saha
- Lecture I-II: Motivation and Decision Theory - Markus M. MÄobiusUploaded bykuchbhirandom
- Output SpssUploaded bydwi
- bba-103Uploaded bySubhan Uddin Khattak
- 1 The Attention Economy: Measuring the Value of Free Goods on the InternetUploaded byAustin Drukker
- Economics Quiz 1Uploaded byCharles Abel
- Kirk Hamilton (Sustainable Development the Hartwick Rule and Optimal Growth)Uploaded byNeyda Ibanez Cox
- Ho RandfixdUploaded byricky5ricky
- Procrastination with Time-Consistent PreferencesUploaded byMasterKnowHow
- XAT 2009_Solutions.pdfUploaded byAbhishek Ranjan
- Consumer BehaviourUploaded byraemu26
- Silabus Statistika Terapan S2 '14Uploaded byDhani Aldila Putra
- Conjoint Analysis of Treatment PreferenceUploaded bymaleticj
- 1-s2.0-0377221793903013-mainUploaded byPatricia Ordoñez
- 10.1.1.320.1538.pdfUploaded byMustafa Kamal
- maths of risk takingUploaded byravikr.pranavam9308
- Homework1 SolUploaded byRajat Sen
- ParetoUploaded byRodrigo Gualva
- Optimization Methods ConstrainedUploaded byGerardo Valencia
- flier specon 141Uploaded byapi-246703341
- DBUploaded byAnkur Dhirawat
- 2.05 Regression How Good is the LineUploaded byAisha Chohan
- Biometrik2-week3aUploaded by-

- lecture5Uploaded byapi-19643506
- lecture9Uploaded byapi-19643506
- lecture4Uploaded byapi-19643506
- lecture7Uploaded byapi-19643506
- lecture12Uploaded byapi-19643506
- lecture6Uploaded byapi-19643506
- lecture10Uploaded byapi-19643506
- lecture13Uploaded byapi-19643506
- AIMA_ch14Uploaded byapi-19643506
- AIMA_ch18Uploaded byapi-19643506
- lecture2Uploaded byapi-19643506
- AIMA_ch13Uploaded byapi-19643506
- AIMA_ch9Uploaded byapi-19643506
- lecture3Uploaded byapi-19643506
- lecture1Uploaded byapi-19643506