Rational decisions

Chapter 16

Outline

♦ Rational preferences
♦ Utilities
♦ Money
♦ Multiattribute utilities
♦ Decision networks
♦ Value of information

Preferences

An agent chooses among prizes (A, B, etc.) and lotteries, i.e., situations with uncertain prizes

Lottery L = [p, A; (1 − p), B]

Notation:
  A ≻ B    A preferred to B
  A ∼ B    indifference between A and B
  A ≿ B    B not preferred to A

Rational preferences

Idea: preferences of a rational agent must obey constraints.
Rational preferences ⇒ behavior describable as maximization of expected utility

Constraints:
  Orderability      (A ≻ B) ∨ (B ≻ A) ∨ (A ∼ B)
  Transitivity      (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)
  Continuity        A ≻ B ≻ C ⇒ ∃ p  [p, A; 1 − p, C] ∼ B
  Substitutability  A ∼ B ⇒ [p, A; 1 − p, C] ∼ [p, B; 1 − p, C]
  Monotonicity      A ≻ B ⇒ (p ≥ q ⇔ [p, A; 1 − p, B] ≿ [q, A; 1 − q, B])

Rational preferences contd.

Violating the constraints leads to self-evident irrationality

For example, an agent with intransitive preferences can be induced to give away all its money:
  If B ≻ C, then an agent who has C would pay (say) 1 cent to get B
  If A ≻ B, then an agent who has B would pay (say) 1 cent to get A
  If C ≻ A, then an agent who has A would pay (say) 1 cent to get C
(Figure: the cycle A, B, C with a 1c payment on each step.)
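The money-pump argument can be made concrete. Below is a minimal sketch (not from the slides; the items and amounts are illustrative) of an agent with the cyclic preferences A ≻ B ≻ C ≻ A being charged one cent per trade:

```python
# Illustrative money pump: cyclic (intransitive) preferences let an
# adversary charge the agent 1 cent per swap, indefinitely.
prefers = {("A", "B"), ("B", "C"), ("C", "A")}   # pairs (X, Y) meaning X ≻ Y

def money_pump(holding="C", cents=99):
    """Repeatedly offer whatever the agent strictly prefers to its holding."""
    trades = 0
    while cents > 0:
        offer = next(x for x in "ABC" if (x, holding) in prefers)
        holding, cents, trades = offer, cents - 1, trades + 1
    return holding, trades

print(money_pump())   # ('C', 99): 99 cents poorer, holding what it started with
```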

Maximizing expected utility

Theorem (Ramsey, 1931; von Neumann and Morgenstern, 1944):
Given preferences satisfying the constraints, there exists a real-valued function U such that
  U(A) ≥ U(B)  ⇔  A ≿ B
  U([p1, S1; … ; pn, Sn]) = Σi pi U(Si)

MEU principle: Choose the action that maximizes expected utility

Note: an agent can be entirely rational (consistent with MEU) without ever representing or manipulating utilities and probabilities
E.g., a lookup table for perfect tic-tac-toe
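A minimal sketch of the MEU principle in code (the lottery representation and the example numbers are assumptions for illustration): each action is a list of (probability, outcome) pairs, and we pick the action with the highest expected utility.

```python
def expected_utility(lottery, U):
    """U([p1, S1; ...; pn, Sn]) = sum_i pi * U(Si)."""
    return sum(p * U(s) for p, s in lottery)

def meu_action(actions, U):
    """Choose the action whose lottery maximizes expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a], U))

actions = {"safe":  [(1.0, 40)],
           "risky": [(0.5, 100), (0.5, 0)]}
print(meu_action(actions, U=lambda s: s))         # risk-neutral: 'risky'
print(meu_action(actions, U=lambda s: s ** 0.5))  # concave U: 'safe'
```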

Utilities

Utilities map states to real numbers. Which numbers?

Standard approach to assessment of human utilities: compare a given state A to a standard lottery Lp that has
  “best possible prize” u⊤ with probability p
  “worst possible catastrophe” u⊥ with probability (1 − p)
and adjust lottery probability p until A ∼ Lp

E.g.,  pay $30  ∼  [0.999999, continue as before; 0.000001, instant death]

Student group utility

For each x, adjust p until half the class votes for the lottery (M = 10,000)

(Plot: assessed indifference probability p, from 0.0 to 1.0, against prize $x from 0 to 10,000.)
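The adjust-until-indifferent loop can be sketched as a bisection on p. Here a known utility function stands in for the human subject (purely an assumption for illustration; in practice each comparison is a question put to the subject):

```python
def assess_p(U, state, best, worst, tol=1e-6):
    """Bisect on p until U(state) equals the standard lottery's expected
    utility p*U(best) + (1-p)*U(worst).
    Requires U(worst) <= U(state) <= U(best)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        if p * U(best) + (1 - p) * U(worst) < U(state):
            lo = p      # lottery still worse than the state: raise p
        else:
            hi = p      # lottery already better: lower p
    return (lo + hi) / 2

# With normalized utilities, the assessed p is itself the utility of the state.
print(assess_p(U=lambda x: x ** 0.5, state=2500, best=10000, worst=0))  # ~0.5
```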

Utility scales

Normalized utilities: u⊤ = 1.0, u⊥ = 0.0

Micromorts: one-millionth chance of death; useful for Russian roulette, paying to reduce product risks, etc.

QALYs: quality-adjusted life years; useful for medical decisions involving substantial risk

Note: behavior is invariant w.r.t. positive linear transformation
  U′(x) = k1 U(x) + k2   where k1 > 0

With deterministic prizes only (no lottery choices), only ordinal utility can be determined, i.e., a total order on prizes

Decision networks

Add action nodes and utility nodes to belief networks to enable rational decision making

(Figure: airport-siting network with action node Airport Site; chance nodes Air Traffic, Litigation, Construction, Deaths, Noise, Cost; utility node U.)

Algorithm:
  For each value of action node
    compute expected value of utility node given action, evidence
  Return MEU action
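A minimal sketch of this algorithm with hypothetical numbers (the sites, outcomes, and probabilities below are invented, not the airport model from the figure):

```python
# P(outcome | action) after absorbing the evidence; U maps outcomes to utilities.
P = {"site_1": {"low_noise": 0.6, "high_noise": 0.4},
     "site_2": {"low_noise": 0.3, "high_noise": 0.7}}
U = {"low_noise": 10.0, "high_noise": 2.0}

def best_action(P, U):
    """For each action, compute sum_o P(o | action) * U(o); return the MEU action."""
    eu = lambda a: sum(p * U[o] for o, p in P[a].items())
    best = max(P, key=eu)
    return best, eu(best)

print(best_action(P, U))   # ('site_1', 6.8)
```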

Money

Money does not behave as a utility function

Given a lottery L with expected monetary value EMV(L), usually U(L) < U(EMV(L)), i.e., people are risk-averse

Utility curve: for what probability p am I indifferent between a prize x and a lottery [p, $M; (1 − p), $0] for large M?

Typical empirical data, extrapolated with risk-prone behavior:
(Plot: utility +U against money +$, with empirical data points between −$150,000 and $800,000.)
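Risk aversion falls out of any concave utility curve. A small illustration (log utility is an assumption here, not the empirical curve above):

```python
import math

lottery = [(0.5, 1_000_000), (0.5, 0)]            # [0.5, $1M; 0.5, $0]
U = lambda x: math.log1p(x)                       # concave: diminishing returns

emv = sum(p * x for p, x in lottery)              # EMV(L) = 500,000
eu_lottery = sum(p * U(x) for p, x in lottery)    # ~6.91
print(U(emv) > eu_lottery)                        # True: U(EMV(L)) > U(L)
```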

Multiattribute utility

How can we handle utility functions of many variables X1 … Xn?
E.g., what is U(Deaths, Noise, Cost)?

How can complex utility functions be assessed from preference behaviour?

Idea 1: identify conditions under which decisions can be made without complete identification of U(x1, …, xn)

Idea 2: identify various types of independence in preferences and derive consequent canonical forms for U(x1, …, xn)

Strict dominance

Typically define attributes such that U is monotonic in each

Strict dominance: choice B strictly dominates choice A iff
  ∀ i  Xi(B) ≥ Xi(A)   (and hence U(B) ≥ U(A))

(Figure: with deterministic attributes, the region above and to the right of A in (X1, X2) space dominates A, and B lies in it; with uncertain attributes, choices A–D occupy regions of (X1, X2) space, and dominance holds only when one region lies entirely above and to the right of another.)

Strict dominance seldom holds in practice
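Even when strict dominance cannot settle the choice, it can prune it. A minimal sketch of the dominance test (attributes are assumed scaled so that higher is better):

```python
def dominates(B, A):
    """B weakly beats A on every attribute (choices are tuples of values)."""
    return all(b >= a for b, a in zip(B, A))

def undominated(choices):
    """Discard any choice dominated by some other choice."""
    return [c for c in choices
            if not any(dominates(o, c) and o != c for o in choices)]

choices = [(3, 5), (4, 6), (2, 9)]    # (X1, X2) per choice, higher is better
print(undominated(choices))           # [(4, 6), (2, 9)]; (3, 5) is pruned
```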

Stochastic dominance

(Figure: probability densities and the corresponding cumulative distributions of negative cost for two sites S1 and S2.)

Distribution p1 stochastically dominates distribution p2 iff
  ∀ t  ∫_{−∞}^{t} p1(x) dx ≤ ∫_{−∞}^{t} p2(x) dx

If U is monotonic in x, then A1 with outcome distribution p1 stochastically dominates A2 with outcome distribution p2:
  ∫_{−∞}^{∞} p1(x) U(x) dx ≥ ∫_{−∞}^{∞} p2(x) U(x) dx

Multiattribute case: stochastic dominance on all attributes ⇒ optimal
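The cumulative-distribution test translates directly to code. A sketch for discrete distributions on a shared, increasing grid of outcomes (the grid and the numbers are illustrative):

```python
def cdf(probs):
    """Running sums of a discrete distribution."""
    total, out = 0.0, []
    for p in probs:
        total += p
        out.append(total)
    return out

def stochastically_dominates(p1, p2):
    """p1 dominates p2 iff p1's CDF is everywhere <= p2's CDF."""
    return all(c1 <= c2 for c1, c2 in zip(cdf(p1), cdf(p2)))

S1 = [0.1, 0.2, 0.7]   # mass shifted toward higher outcomes
S2 = [0.3, 0.4, 0.3]
print(stochastically_dominates(S1, S2))   # True
```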

Stochastic dominance contd.

Stochastic dominance can often be determined without exact distributions, using qualitative reasoning

E.g., construction cost increases with distance from city:
  S1 is closer to the city than S2 ⇒ S1 stochastically dominates S2 on cost
E.g., injury increases with collision speed

Can annotate belief networks with stochastic dominance information:
  X →⁺ Y (X positively influences Y) means that
  for every value z of Y’s other parents Z,
  ∀ x1, x2  x1 ≥ x2 ⇒ P(Y | x1, z) stochastically dominates P(Y | x2, z)

Label the arcs + or –

(Exercise, shown over several slides with progressively more arcs labeled: a car-insurance belief network with nodes Age, SocioEcon, GoodStudent, ExtraCar, RiskAversion, SeniorTrain, Mileage, VehicleYear, MakeModel, DrivingSkill, DrivingHist, Antilock, DrivQuality, Airbag, Cushioning, Ruggedness, Accident, Theft, OwnDamage, CarValue, HomeBase, AntiTheft, OtherCost, OwnCost, MedicalCost, LiabilityCost, PropertyCost. Each arc is labeled + or − according to whether the parent positively or negatively influences the child.)

Preference structure: Deterministic

X1 and X2 preferentially independent of X3 iff preference between ⟨x1, x2, x3⟩ and ⟨x1′, x2′, x3⟩ does not depend on x3

E.g., ⟨Noise, Cost, Safety⟩:
  ⟨20,000 suffer, $4.6 billion, 0.06 deaths/mpm⟩ vs.
  ⟨70,000 suffer, $4.2 billion, 0.06 deaths/mpm⟩

Theorem (Leontief, 1947): if every pair of attributes is P.I. of its complement, then every subset of attributes is P.I. of its complement: mutual P.I.

Theorem (Debreu, 1960): mutual P.I. ⇒ ∃ additive value function:
  V(S) = Σi Vi(Xi(S))

Hence assess n single-attribute functions; often a good approximation
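Under mutual P.I., assessment thus reduces to n single-attribute value functions. A sketch with invented value functions for airport-style attributes (the coefficients are assumptions, not assessed values):

```python
value_fns = {
    "deaths": lambda d: -1000.0 * d,    # deaths per million passenger miles
    "noise":  lambda n: -0.001 * n,     # people suffering
    "cost":   lambda c: -10.0 * c,      # $ billions
}

def additive_value(state):
    """Debreu's form: V(S) = sum_i Vi(Xi(S)); state maps attribute -> level."""
    return sum(value_fns[a](x) for a, x in state.items())

s1 = {"deaths": 0.06, "noise": 20_000, "cost": 4.6}
s2 = {"deaths": 0.06, "noise": 70_000, "cost": 4.2}
print(additive_value(s1), additive_value(s2))   # -126.0 -172.0: s1 preferred
```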


Preference structure: Stochastic

Need to consider preferences over lotteries:
X is utility-independent of Y iff preferences over lotteries in X do not depend on y

Mutual U.I.: each subset is U.I. of its complement
  ⇒ ∃ multiplicative utility function:
  U = k1U1 + k2U2 + k3U3
      + k1k2U1U2 + k2k3U2U3 + k3k1U3U1
      + k1k2k3U1U2U3

Routine procedures and software packages exist for generating preference tests to identify various canonical families of utility functions
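For concreteness, the three-attribute multiplicative form as a function (the k's and single-attribute utilities would come from the preference tests just mentioned; the numbers here are placeholders):

```python
def multiplicative_utility(u1, u2, u3, k1, k2, k3):
    """U = k1U1 + k2U2 + k3U3 + k1k2U1U2 + k2k3U2U3 + k3k1U3U1 + k1k2k3U1U2U3."""
    return (k1*u1 + k2*u2 + k3*u3
            + k1*k2*u1*u2 + k2*k3*u2*u3 + k3*k1*u3*u1
            + k1*k2*k3*u1*u2*u3)

print(multiplicative_utility(0.8, 0.5, 0.9, k1=0.4, k2=0.3, k3=0.2))
```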

Value of information

Idea: compute value of acquiring each possible piece of evidence
Can be done directly from decision network

Example: buying oil drilling rights
  Two blocks A and B, exactly one has oil, worth k
  Prior probabilities 0.5 each, mutually exclusive
  Current price of each block is k/2
  “Consultant” offers accurate survey of A. Fair price?

Solution: compute expected value of information
  = expected value of best action given the information
    minus expected value of best action without information

Survey may say “oil in A” or “no oil in A”, prob. 0.5 each (given!)
  = [0.5 × value of “buy A” given “oil in A”
     + 0.5 × value of “buy B” given “no oil in A”] − 0
  = (0.5 × k/2) + (0.5 × k/2) − 0 = k/2

General formula

Current evidence E, current best action α
Possible action outcomes Si, potential new evidence Ej

  EU(α | E) = max_a Σi U(Si) P(Si | E, a)

Suppose we knew Ej = ejk; then we would choose α_ejk s.t.

  EU(α_ejk | E, Ej = ejk) = max_a Σi U(Si) P(Si | E, a, Ej = ejk)

Ej is a random variable whose value is currently unknown
⇒ must compute expected gain over all possible values:

  VPI_E(Ej) = ( Σk P(Ej = ejk | E) EU(α_ejk | E, Ej = ejk) ) − EU(α | E)

(VPI = value of perfect information)
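A minimal sketch of this formula, instantiated on the oil-rights example with k = 1 (the payoff encoding is an assumption: buying a block costs k/2 and returns k if it holds the oil):

```python
k = 1.0
actions = ["buy A", "buy B"]

def utility(action, oil_in):
    """Pay k/2 for the named block; it pays out k iff it holds the oil."""
    return (k if action[-1] == oil_in else 0.0) - k / 2

def best_eu(p_oil_in_A):
    """EU of the best action under a given belief about block A."""
    return max(p_oil_in_A * utility(a, "A")
               + (1 - p_oil_in_A) * utility(a, "B") for a in actions)

# Survey outcomes "oil in A" / "no oil in A" have prior prob. 0.5 each and
# push the posterior to 1.0 or 0.0; VPI = expected post-survey EU minus prior EU.
vpi = 0.5 * best_eu(1.0) + 0.5 * best_eu(0.0) - best_eu(0.5)
print(vpi)   # 0.5 = k/2, matching the slides
```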

Properties of VPI

Nonnegative—in expectation, not post hoc
  ∀ j, E  VPI_E(Ej) ≥ 0

Nonadditive—consider, e.g., obtaining Ej twice:
  VPI_E(Ej, Ek) ≠ VPI_E(Ej) + VPI_E(Ek)   in general

Order-independent
  VPI_E(Ej, Ek) = VPI_E(Ej) + VPI_{E,Ej}(Ek) = VPI_E(Ek) + VPI_{E,Ek}(Ej)

Note: when more than one piece of evidence can be gathered, maximizing VPI for each to select one is not always optimal
⇒ evidence-gathering becomes a sequential decision problem

Qualitative behaviors

(a) Choice is obvious, information worth little
(b) Choice is nonobvious, information worth a lot
(c) Choice is nonobvious, information worth little

(Figure: three sketches of P(U | Ej), one per case, with the utilities U1 and U2 of the two choices marked in each panel.)
