
Chapter 4

Using Data in Decisions


4.1 Data and State of Nature
• Suppose that the decision maker can perform an experiment in which
the state of nature determines the generation of the data, and so the
density function of the data depends on that state of nature.

• The data may consist of a single value of a random variable or a
random sample. For a general discussion we shall use the symbol X to
refer to the data. Thus, in the problem to be considered, X is either a
single random variable or a random vector.

• It is important to realize that, in order for the data to carry information
about the state of nature 𝜃, the density function of X must depend on 𝜃.
We shall denote the density function of X by 𝑓(𝑥; 𝜃).
Chapter 4: Using Data in Decisions 2
4.2 Decision Rules
• Let (Θ, 𝐴, 𝐿) be a decision problem, let X be a random variable (or
vector) defined on a probability space (Ω, ℑ, 𝑃), and let 𝑆𝑋 = 𝑋(Ω).

• A procedure for using data as an aid in decision making involves a
rule that assigns one of the available actions to each possible value of
the data. Such a rule, a function 𝑑 from 𝑆𝑋 to 𝐴, is called a decision rule.

Note
It is always assumed that 𝑑(𝑋) is a random variable.
Example 4.1
On one particular morning, a UTAR student who is enrolled in the course UECM2233
has to take one of the following actions:
𝑎1 : Go with umbrella and 𝑎2 : Go without umbrella.
Suppose the states of nature are
𝜃1 : rain and 𝜃2 : no rain.
Let X denote a rain indicator where
X = 0 means no rain and X = 1 means rain.

(a) List out all the pure decision rules.


(b) Suppose the student has another alternative, 𝑎3 : Stay at home and skip lectures.
Identify the new pure decision rules.



Solution
(a) There will be four pure decision rules:

(b) The set of pure decision rules expands from four rules to nine.



• In general, if the action space contains 𝑘 actions and the observable random variable X
takes 𝑛 values, then there are 𝑘^𝑛 distinct pure decision rules.

• Of the 𝑘^𝑛 possible rules, some are sensible, some are foolish, some ignore the
data, and some use it wrongly. In other words, some are good decision rules and
some are bad ones.

• When a given decision rule 𝑑 is used, the loss incurred depends not only on the
state of nature that governs but also on the value of the data X; hence the loss
incurred is a random variable, 𝐿(𝜃, 𝑑(𝑋)).
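
The count of pure decision rules can be checked by brute-force enumeration. The sketch below treats a pure decision rule as a mapping from each data value to an action and lists all such mappings for the umbrella problem of Example 4.1. The function name `pure_decision_rules` is a hypothetical helper, and the order in which rules are generated need not match the numbering 𝑑1, 𝑑2, ... used in the examples.

```python
from itertools import product

def pure_decision_rules(actions, data_values):
    """Return every map from data values to actions, as a list of dicts."""
    return [dict(zip(data_values, choice))
            for choice in product(actions, repeat=len(data_values))]

# Two actions and two data values: 2^2 rules.
rules_two_actions = pure_decision_rules(["a1", "a2"], [0, 1])
# Adding a3 (stay at home): 3^2 rules.
rules_three_actions = pure_decision_rules(["a1", "a2", "a3"], [0, 1])

print(len(rules_two_actions), len(rules_three_actions))
for rule in rules_two_actions:
    print(rule)
```

Rules that map every data value to the same action are the ones that ignore the data.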

Note
The risk function was introduced by Wald to unify existing approaches for the
evaluation of statistical procedures from the frequentist standpoint. It focuses
on the long-term performance of a decision rule in a series of repetitions of
the decision problem.

• The set of all pure decision rules will be denoted by 𝐷. Pure actions are
decision rules that ignore the value of the observed random variable, and hence
𝐴 can be regarded as a subset of 𝐷. That is,

𝐴 ⊆ 𝐷,
and 𝑅(𝜃, 𝑑) = 𝐿(𝜃, 𝑎) when 𝑑 is the rule that always takes action 𝑎.

• With the risk function and pure decision rules, the no-data decision problem
(Θ, 𝐴, 𝐿) is extended to the statistical decision problem (Θ, 𝐷, 𝑅).
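
As a minimal sketch of how a risk function might be computed, the code below takes the risk to be the expected loss, 𝑅(𝜃, 𝑑) = Σ 𝐿(𝜃, 𝑑(𝑥)) 𝑓(𝑥; 𝜃), summed over the values 𝑥 of a discrete X. The loss table and density used here are made-up numbers chosen for illustration only; they are not the tables of the examples in this chapter.

```python
from itertools import product

# Hypothetical loss table L(theta, a) -- NOT taken from the examples.
loss = {("theta1", "a1"): 0, ("theta1", "a2"): 5,
        ("theta2", "a1"): 1, ("theta2", "a2"): 0}

# Hypothetical density f(x; theta) of the rain indicator X.
density = {"theta1": {0: 0.2, 1: 0.8},
           "theta2": {0: 0.7, 1: 0.3}}

def risk(theta, rule):
    """Expected loss: sum over x of L(theta, d(x)) * f(x; theta)."""
    return sum(loss[(theta, rule[x])] * p for x, p in density[theta].items())

# All pure decision rules for two actions and two data values.
rules = [dict(zip([0, 1], c)) for c in product(["a1", "a2"], repeat=2)]
for rule in rules:
    print(rule, [round(risk(t, rule), 3) for t in ("theta1", "theta2")])
```

For the two constant rules, the computed risk equals the loss of the corresponding pure action, which is the sense in which 𝐴 sits inside 𝐷.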



Example 4.2
Consider again the rain or no rain problem of Example 4.1. For the sake of
simplicity we assume that the loss table of the decision problem is as follows:

Let X denote a rain indicator, where X = 0 means no rain and X = 1 means rain,
and the density function of X is assumed to be



(a) Find the risk function.
(b) Suppose that the student has another alternative: to stay at home (𝑎3) and
skip the lecture. Assume that the following loss table gives the losses for the
resulting decision problem.

(i) List all the pure decision rules.
(ii) Find the risk function.



Solution
(a)



Solution
(b)(i)

(ii)



Example 4.3
Upon entering the KB Block building, Dr Wong, whose office is on the 8th floor, can
either go down to the basement and ride the lift up, or walk up to the second floor and
ride the rest of the way. The lift is either working (𝜃1) or not working (𝜃2). If it is
working, walking up costs energy, and walking down costs no energy. Assume that the loss
function of this problem is given as follows:



Suppose that he can see the lights near the button that summons the lift. The data X here
might be the number of lights showing, a quantity that takes the values 0, 1, or 2.

Suppose now that the density function of X is given by



(a) List all the pure decision rules.
(b) Find the risk function.
Solution
(a)

(b)



From Example 4.3, several observations should be made.

• Pure decision rules 𝑑1 and 𝑑8 , which ignore the data, give risk points which are
exactly the same as the corresponding loss points for pure actions 𝑎1 and 𝑎2 ,
respectively.

• The straight line joining the risk points of 𝑑1 and 𝑑8 consists of the loss points of
the mixed actions of 𝑎1 and 𝑎2 .

• The data then provide risk points which clearly dominate these no-data loss points.
Indeed the risk points of 𝑑2 and 𝑑5 are admissible points.

• Decision rules 𝑑4 , 𝑑6 and 𝑑7 are worse than rules that make no use of the data.



Conclusion:
An intelligent use of data can reduce the expected losses under all states
of nature, whereas using the data foolishly can increase them.

• When the state space contains more than two elements, it is difficult to
visualize the loss points of the pure decision rules.

• In this case, one can only compare, if possible, the risks of the decision
rules numerically.



Example 4.4
Consider a statistical decision problem with loss table

Suppose that we can observe a binary random variable X whose distribution is given
as follows:



(a) List all the pure decision rules.
(b) Find the risk function.
(c) Comment on the risk function obtained.

Solution
(a)

(b)

(c)
• 𝑑1 has the smallest risk under 𝜃1 and 𝜃2. However, it incurs the largest risk
under 𝜃3.
• 𝑑3 has smaller risks than 𝑑2.
• Note that 𝑅(𝜃3, 𝑑3) > 𝑅(𝜃3, 𝑑4) and 𝑅(𝜃𝑖, 𝑑3) < 𝑅(𝜃𝑖, 𝑑4), 𝑖 = 1, 2.



4.3 Dominance and Admissibility
Let 𝑑1 and 𝑑2 be two nonrandomized decision rules from the statistical decision
problem (Θ, 𝐷, 𝑅).



• If the state space contains only two elements, the admissibility of a decision rule can be
determined graphically.
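
When there are more than two states, or simply as a numerical check, dominance can be tested directly on the risk values. The sketch below assumes the usual definition: one rule dominates another if its risk is no larger under every state and strictly smaller under at least one. The risk points are hypothetical numbers, not those of any example in this chapter, and the check compares pure rules only, so a rule that survives it may still be dominated by a mixed rule.

```python
def dominates(r1, r2):
    """True if risk vector r1 is nowhere larger than r2 and somewhere smaller."""
    return (all(a <= b for a, b in zip(r1, r2))
            and any(a < b for a, b in zip(r1, r2)))

def admissible_pure_rules(risks):
    """Names of rules not dominated by any other pure rule in `risks`."""
    return [name for name, r in risks.items()
            if not any(dominates(other, r)
                       for other_name, other in risks.items()
                       if other_name != name)]

# Hypothetical risk points (R(theta1, d), R(theta2, d)).
risks = {"d1": (0.0, 1.0), "d2": (4.0, 0.7), "d3": (1.0, 0.3), "d4": (5.0, 0.0)}
print(admissible_pure_rules(risks))
```

With these made-up numbers, only the second rule is dominated (by the third).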



Example 4.5
Consider a decision problem with loss table given by

Suppose the decision maker can observe a random variable with the following
probability function:

Determine the admissible and inadmissible decision rules.



Solution
The set of pure decision rules is tabulated as follows

It follows that decision rules 𝑑1 , 𝑑3 and 𝑑4 are admissible, and decision rule 𝑑2 is
inadmissible.



Example 4.6
Reconsider the decision problem stated in Example 4.5.

Suppose the probability function of the data is

Determine the admissible and inadmissible decision rules.



Solution

The risk functions of the nonrandomized decisions 𝑑2 and 𝑑3 are



When mixed decision rules are taken into consideration, the admissible decision
rules are those whose risk points lie on the line segment joining the risk
points of 𝑑1 and 𝑑3, or on the line segment joining the risk points of 𝑑3 and 𝑑4. All other
decision rules are inadmissible.
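
The risk point of a mixed decision rule is the weighted average of the risk points of the pure rules it mixes, which is why mixtures of two rules trace out the line segment between their risk points. A minimal sketch, with the hypothetical helper `mixed_risk` and made-up risk points that are not those of Example 4.6:

```python
def mixed_risk(weights, risk_points):
    """Risk point of a mixed rule: the weight-averaged risk in each state."""
    n_states = len(risk_points[0])
    return tuple(sum(w * r[i] for w, r in zip(weights, risk_points))
                 for i in range(n_states))

# Hypothetical risk points (R(theta1, d), R(theta2, d)) of two pure rules.
point_a = (0.0, 2.0)
point_b = (1.0, 0.3)

# Mixing the two rules with equal probability gives the midpoint of the segment.
midpoint = mixed_risk((0.5, 0.5), [point_a, point_b])
print(midpoint)
```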



4.4 Minimax Principle
• As in the no-data case, it is necessary to devise a scheme of preferences so that
one can select the most desirable decision rule under this ordering. The minimax
principle again provides a numerical measure of decision rules, namely, the
maximum risk over the various states of nature.

• The selection of a decision rule, with the knowledge of the risk function but with
the state of nature unknown, is exactly the same problem – mathematically – as
the selection of an action, knowing the loss function but not the state of nature.

• The selection of a decision rule is more complicated because the set of
nonrandomized decision rules is usually larger than the set of pure actions, and
because the risk functions must first be calculated from the given losses. However,
the effective use of data can reduce losses.
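
Once the risks are tabulated, the pure minimax rule can be found by taking the maximum risk of each rule over the states and then choosing the rule with the smallest maximum. The sketch below does this for a hypothetical risk table; the numbers are illustrative and are not those of the examples that follow. If two rules tie, this version simply returns the first one it meets.

```python
def pure_minimax(risks):
    """Return (rule name, its maximum risk) minimizing the maximum risk."""
    worst = {name: max(r) for name, r in risks.items()}
    best = min(worst, key=worst.get)
    return best, worst[best]

# Hypothetical risk points (R(theta1, d), R(theta2, d)).
risks = {"d1": (0.0, 2.0), "d2": (4.0, 0.7), "d3": (1.0, 0.3), "d4": (5.0, 0.0)}
print(pure_minimax(risks))
```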
Example 4.7
The risks of the four pure decision rules are given as follows:

Determine the pure minimax decision rule.



Solution

So 𝑑4 is the pure minimax decision rule.



• If nature consists of two states, graphical techniques can be employed to determine
the minimax decision rule.
• Plot the risk points of the decision rules on the plane.



Example 4.8
Consider a decision problem with loss table given by

Determine the pure minimax decision rule.



Solution

So 𝑑2 is the pure minimax decision rule.

Example 4.9
Consider a statistical decision problem whose loss table and distribution of the data X
under the various states are given as follows:

Determine the minimax mixed decision rule.



Solution

The risks of the above decision rules are tabulated as follows:



It follows from the figure below that the minimax randomized decision rule is of the form
𝑝̃ = (0, 0, 𝑝, 1 − 𝑝), where 𝑝 satisfies the equation

𝑅∗(𝜃1, 𝑝̃) = 𝑅∗(𝜃2, 𝑝̃)
0.2𝑝 + 5(1 − 𝑝) = 2.6𝑝 + (1 − 𝑝)
𝑝 = 5/8

Thus, the minimax mixed decision rule is
𝑝̃ = (0, 0, 5/8, 3/8).
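
The arithmetic of this last step can be verified with exact fractions. The sketch below reads the four risk values off the coefficients of the equation above, assuming that 0.2 and 2.6 are the risks of 𝑑3 and that 5 and 1 are the risks of 𝑑4 under 𝜃1 and 𝜃2 respectively. It only solves the equal-risk condition; that the solution is minimax still rests on the graphical argument.

```python
from fractions import Fraction

# Risks inferred from the equation in the text (an assumption about
# which coefficient belongs to which rule and state).
r_d3 = (Fraction(1, 5), Fraction(13, 5))  # (R(theta1, d3), R(theta2, d3)) = (0.2, 2.6)
r_d4 = (Fraction(5), Fraction(1))         # (R(theta1, d4), R(theta2, d4)) = (5, 1)

# Solve p*a1 + (1-p)*b1 = p*a2 + (1-p)*b2 for p, where a = r_d3 and b = r_d4:
# p = (b2 - b1) / ((a1 - b1) - (a2 - b2))
p = (r_d4[1] - r_d4[0]) / ((r_d3[0] - r_d4[0]) - (r_d3[1] - r_d4[1]))

# Risk of the mixed rule under each state; the two should coincide.
risk_theta1 = p * r_d3[0] + (1 - p) * r_d4[0]
risk_theta2 = p * r_d3[1] + (1 - p) * r_d4[1]
print(p, 1 - p, risk_theta1, risk_theta2)
```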
