Witwatersrand University
School of Economic and Business Sciences
Financial Economics Honours Notes
2017
Lecturer: Tendai Gwatidzo, PhD.
1. Simple and Compound Lotteries
A simple gamble/lottery offers certain payoffs with certain probabilities, e.g., G(x, y; j) offers
payoffs x with probability j and y with probability 1-j. With more than two payoffs the simple
lottery can be written as: G(x1, x2, …, xN; p1, p2, …, pN), where pi is the probability that xi
occurs. Please note that there is no standardized notation for lotteries, so where necessary I
adopt this particular notation.
A simple lottery assigns probabilities P = (p1, …, pN) to outcomes xi (where i = 1, …, N) such that p1 + p2 + … + pN = 1.
So for every lottery we have two main arguments: a set of outcomes or payoffs (xi in our
case) and a set of probabilities associated with each outcome (pi in our case).
A simple lottery has only deterministic outcomes and a compound lottery has some
outcomes that are also lotteries.
An example of a compound lottery is: A ‘fair’ coin is flipped and the individual then plays
out lottery L2 if heads and lottery L3 if tails. This lottery can be stated as: G(L2, L3; 0.5).
More formally, a compound lottery assigns probabilities a1, …, aK to one or more simple lotteries L1, …, LK and can be stated as G[L1, …, LK; a1, …, aK], with ai ≥ 0 for all i and a1 + a2 + … + aK = 1.
Note that ai is the probability that Li will occur.
So given a compound lottery G[L1, …, LK; a1, …, aK], R* is the reduced lottery that generates the same probability distribution over outcomes, and it is defined as follows.
For each outcome x ∈ X:
R*(x) = a1L1(x) + a2L2(x) + … + aKLK(x)
That is, to get a probability of some outcome x in the reduced lottery, you multiply the
probability that each lottery Li arises (ai) by the probability that Li assigns to the outcome x
[Li(x)] and then add over all i.
Example 1
Consider the following two gambles: If I roll a fair die and the number that comes up is less
than 3, I get R120, otherwise I get nothing. Call this first lottery L1. So L1 = G(120, 0; 1/3).
If I flip a fair coin and it comes up heads I get R100, and if it comes up tails, I get nothing.
Call this second lottery L2. So L2 = G(100, 0; 0.5).
In this case K = 2 (we have two lotteries), and we have two a’s (a1 and a2), the probabilities
of choosing L1 and L2, respectively. How do we determine a1 and a2? We could say a1 is the
probability of choosing one of your 4 cars. So a1=1/4 and a2 = ¾ (Since a1 + a2 =1). So the
compound lottery is now: G[L1, L2; ¼, ¾]. This compound lottery can also be written as:
G[(120, 0; 1/3), (100, 0; 0.5); 1/4, 3/4].
We can reduce the above compound lottery into a reduced lottery as follows:
R*(120) = α1·L1(120) + α2·L2(120) = (1/4)(1/3) + (3/4)(0) = 2/24
R*(100) = α1·L1(100) + α2·L2(100) = (1/4)(0) + (3/4)(1/2) = 9/24
R*(0) = α1·L1(0) + α2·L2(0) = (1/4)(2/3) + (3/4)(1/2) = 13/24
Where α1 is the probability that L1 occurs and L1(120) is the probability that L1 assigns to outcome 120; similarly, α2 is the probability that L2 occurs and L2(120) is the probability that L2 assigns to outcome 120.
The following figure shows the two simple lotteries, the compound lottery, and the reduced lottery R*.
[Figure: Lottery L1 pays 120 with probability 1/3 and 0 with probability 2/3; lottery L2 pays 100 with probability 1/2 and 0 with probability 1/2. The compound lottery plays L1 with probability 1/4 and L2 with probability 3/4. The reduced lottery R* pays 0 with probability 13/24, 100 with probability 9/24 and 120 with probability 2/24.]
Example 2
Consider simple lotteries p and q. Assume that both p and q are equally likely to be played.
Lottery p has payoffs (x1, x2, x3) = (0, 2, 1) and respective probabilities (p1, p2, p3) = (0.5, 0.2,
0.3). i.e., Lp = [0, 2, 1; 0.5, 0.2, 0.3]
Lottery q has payoffs (y1, y2) = (2, 3) with probabilities (q1, q2) = (0.6, 0.4), i.e.,
Lq = [2, 3; 0.6, 0.4].
Combining the sets of outcomes yields the reduced lottery r with payoffs:
(z1, z2, z3, z4) = (0,1,2,3).
The probabilities of each of these outcomes of r are obtained by taking the linear
combination of the probabilities in the original lotteries:
Outcome 2 has probability 0.2 in lottery p and 0.6 in lottery q, so it has probability 0.5(0.2) + 0.5(0.6) = 0.4 in the reduced lottery r.
Outcome 1 has probability 0.3 in p and 0 in q, so that outcome has probability 0.5(0.3) + 0.5(0) = 0.15.
Outcome 0 has probability 0.5 in p and 0 in q, so that outcome has probability 0.5(0.5) + 0.5(0) = 0.25.
Outcome 3 has probability 0 in p and 0.4 in q, so that outcome has probability 0.5(0) + 0.5(0.4) = 0.2.
Thus the reduced lottery r has outcomes (z1, z2, z3, z4) = (0, 1, 2, 3) with probabilities (r1, r2, r3, r4) = (0.25, 0.15, 0.4, 0.2). That is, R* = [0, 1, 2, 3; 0.25, 0.15, 0.4, 0.2].
The probability associated with an outcome in the compound lottery is a linear combination
of the probabilities for this outcome from the simple lotteries.
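This linear-combination rule can be sketched in code. The sketch below is a minimal illustration, and the representation of a simple lottery as a dictionary mapping outcomes to probabilities, as well as the function name, are my own choices:

```python
# Reduce a compound lottery G[L1, ..., LK; a1, ..., aK] to the reduced lottery
# R*(x) = a1*L1(x) + ... + aK*LK(x). A simple lottery is represented here as a
# dict mapping each outcome to its probability (an assumed representation).

def reduce_lottery(lotteries, alphas):
    r = {}
    for L, a in zip(lotteries, alphas):
        for outcome, prob in L.items():
            r[outcome] = r.get(outcome, 0.0) + a * prob
    return r

# Example 2: lotteries p and q, each played with probability 0.5
p = {0: 0.5, 2: 0.2, 1: 0.3}
q = {2: 0.6, 3: 0.4}
r = reduce_lottery([p, q], [0.5, 0.5])
print(r)  # outcomes 0, 1, 2, 3 with probabilities 0.25, 0.15, 0.4, 0.2
```

Running the same function on Example 1 (with α's of 1/4 and 3/4) reproduces the probabilities 13/24, 9/24 and 2/24 from the figure.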
In our example, the consequentialist hypothesis requires that I view the compound lottery G[L1, L2; 1/4, 3/4] and the reduced lottery R* = [0, 100, 120; 13/24, 9/24, 2/24] as equivalent.
More generally, it requires that I view any two lotteries that generate the same reduced lottery as
equivalent. Figure 2 illustrates a case where two different compound lotteries generate the same
reduced lottery. The consequentialist hypothesis requires that I regard both of them as equivalent.
[Figure 2: Two compound lotteries, Cp and Cq, built from simple lotteries over the outcomes (x1, x2, x3), among them L2 = (x1, x2, x3; 2/8, 3/8, 3/8), L3 = (x1, x2, x3; 1/2, 0, 1/2) and q2 = (x1, x2, x3; 5/6, 1/12, 1/12), with mixing probabilities of 1/2 and 1/3. Both generate the same reduced lottery L(7/12, 1/8, 7/24).]
2. Expected Utility and the Machina Triangle
The VNM expected utility function is linear in probabilities. Assume three outcomes x1, x2
and x3, such that x1 < x2 < x3. Assume that outcomes x1, x2 and x3 occur with probabilities
p1, p2 and p3, respectively. So p1 + p2 + p3=1 and p2 = 1 – p1 – p3.
We can represent the lotteries in a unit triangle in a (p1, p3) plane known as the Machina
triangle.
[Figure 3: The Machina triangle in the (p1, p3) plane, with p3 = prob(x3) on the vertical axis and p1 = prob(x1) on the horizontal axis, both running from 0 to 1. Point B lies directly above point A, point C lies directly to the left of A, and preferences increase to the north-west.]
Moving up from A to B increases p3 (the chance of the highest outcome), reduces p2, and leaves p1 unchanged. So the lottery at point B is preferred to that at point A.
Moving from point A to C reduces p1, increases p2 and leaves p3 unchanged. Since p2 has gone up, the lottery at point C is preferred to that at A. Why? While p3 has remained constant, p2 (the chance of a better outcome) has gone up and p1 (the chance of a lower payoff) has gone down.
Expected utility increases when the probability of the best outcome increases and/or the probability of the worst outcome falls; expected utility must therefore increase as we move north-westerly from one indifference curve (IC) to another.
Generally, all north-west movements lead to dominating lotteries that will be preferred.
Figure 4: Points on the Machina Triangle
[Figure: the (p1, p3) triangle with corners (1, 0, 0) at the bottom right, (0, 1, 0) at the origin and (0, 0, 1) at the top. Marked points include (0, 1/2, 1/2) on the vertical axis and the interior point (1/3, 1/6, 1/2); p1 = 1/3 and p1 = 2/3 are marked on the horizontal axis.]
Along an indifference curve, expected utility is constant:
p1·U(x1) + p2·U(x2) + p3·U(x3) = c (a constant)
Substituting p2 = 1 − p1 − p3 and solving for p3 gives p3 = a + b·p1, where
a = [c − U(x2)] / [U(x3) − U(x2)] and b = [U(x2) − U(x1)] / [U(x3) − U(x2)]
Note that in the probability triangle the outcomes are a fixed and exhaustive list while the probabilities vary; the numbers U(xi) are constants independent of pi.
Therefore [U(x3) − U(x2)] and [U(x2) − U(x1)] are positive constants.
So the slope of an IC is positive and constant; hence ICs in the probability plane are straight lines, each with slope b.
Figure 5: Indifference Curves in the Machina Triangle
[Figure: six parallel, upward-sloping indifference curves IC1 to IC6 in the (p1, p3) triangle, with IC1 nearest the bottom-right corner (1, 0, 0) and IC6 nearest the top corner (0, 0, 1). Points A, B and C are marked, with B directly to the right of A and C directly to the left of A.]
IC1 has the least preferred gambles and IC6 has the most preferred gambles. Why?
If you are at point A and you move to B, then p1 increases, p3 remains constant, and p2 decreases (since p2 = 1 − p1 − p3). So the difference between the gambles at A and B is that at point B, p1 has gone up and p2 has gone down. Since p3 did not change, the gamble at A is preferred to that at B: the probability of a better outcome has gone down while that of an inferior outcome has gone up.
Thus moving from IC1 to IC6 increases the chance of better outcomes and is thus preferred.
Along an iso-expected-value line (IEL), the expected value of the prospect is constant:
p1·x1 + p2·x2 + p3·x3 = c (a constant)
Substituting p2 = 1 − p1 − p3 and solving for p3 gives p3 = a + b·p1, where
a = (c − x2) / (x3 − x2) and b = (x2 − x1) / (x3 − x2)
Since x1 < x2 < x3, the slope of an IEL is also positive, and since x1, x2 and x3 are given, the slope is also constant.
Figure 6: Iso-Expected Lines in the Machina Triangle
[Figure: parallel, upward-sloping iso-expected lines in the (p1, p3) triangle with corners (1, 0, 0), (0, 1, 0) and (0, 0, 1).]
Northeast movements along the IELs do not change the expected value of the prospect but
they increase p1 and p3 (probabilities of tail outcomes, x1 and x3). Such movements increase
risk and are called “pure risk increases”.
Figure 7: Indifference Curves and Iso-Expected Lines for a Risk Averse Individual
[Figure: in the (p1, p3) triangle, the indifference curves of a risk-averse individual are steeper than the iso-expected lines.]
If the slope of the ICs is smaller than the slope of the IELs, then the individual is risk loving (that is, the ICs are generated by the convex utility function of a risk-loving individual).
Figure 8: Indifference Curves and Iso-Expected Lines for a Risk Loving Individual
[Figure: in the (p1, p3) triangle, the indifference curves of a risk-loving individual are flatter than the iso-expected lines.]
Figure 9: Indifference curves and Iso-expected lines for a risk neutral individual
[Figure: in the (p1, p3) triangle, the indifference curves of a risk-neutral individual coincide with the iso-expected lines.]
Proposition 1: A concave utility function’s ICs are steeper than the corresponding IELs
Proof:
To prove this proposition we must show that if the utility function is concave (that is, the individual is risk averse), then its ICs are steeper than the IELs.
Figure 10: A Concave Utility Function
[Figure: a concave utility function U(X), with utility levels U(x1) < U(x2) < U(x3) at consumption levels x1 < x2 < x3.]
When consumption of X changes from x1 to x2, utility changes from U(x1) to U(x2). So marginal utility is:
[U(x2) − U(x1)] / (x2 − x1)
When consumption of X changes from x2 to x3, utility changes from U(x2) to U(x3). So marginal utility is:
[U(x3) − U(x2)] / (x3 − x2)
Since x1 < x2 < x3, diminishing marginal utility implies that marginal utility at higher values of X is smaller than that at lower values. So:
[U(x2) − U(x1)] / (x2 − x1) > [U(x3) − U(x2)] / (x3 − x2)
Multiplying both sides by (x2 − x1) / [U(x3) − U(x2)] yields:
[U(x2) − U(x1)] / [U(x3) − U(x2)] > (x2 − x1) / (x3 − x2)
Recall that [U(x2) − U(x1)] / [U(x3) − U(x2)] is the slope of an IC and (x2 − x1) / (x3 − x2) is the slope of an IEL.
Therefore: slope of IC > slope of iso-expected line.
In our last lecture it was shown that if the second derivative of a utility function is negative
(that is, we have diminishing marginal utility), the individual is risk averse. It can be shown
that risk aversion implies diminishing marginal utility (DMU).
Figure 11: Concave Utility Function
[Figure: a concave utility function with two gambles, A and B, marked on it.]
But gamble B is riskier since it has a greater spread between its bad and good outcomes. So a risk-averse individual will go for A.
Risk aversion (a concave utility function) implies that, for quantities X < Y and a small increment σ > 0:
U(X + σ) − U(X) > U(Y + σ) − U(Y)
Dividing by σ yields:
[U(X + σ) − U(X)] / σ > [U(Y + σ) − U(Y)] / σ
The LHS of the above inequality is simply marginal utility: the change in total utility over the change in X. The RHS is also marginal utility. As σ approaches zero, the LHS approaches U′(X) and the RHS approaches U′(Y).
Therefore, as σ approaches zero, U′(X) > U′(Y), as required by the law of diminishing marginal utility. That is, marginal utility at higher quantities is smaller than marginal utility at lower quantities.
Consider an individual whose current income is W, which he/she can invest in an asset that
yields the following payoffs, (W-σ) with probability 0.5 and (W+σ), with probability 0.5.
If the individual loses σ then the loss in utility (L) is: L = U(W) - U(W-σ)
If the individual wins σ then the gain in utility (G) is: G = U(W+σ) - U(W)
Even though the gain or loss in wealth is the same, DMU implies that G < L.
Since the probability of a loss/gain is 0.5, the above implies that: 0.5G < 0.5 L
0.5[U(W+σ) − U(W)] < 0.5[U(W) − U(W−σ)]
0.5U(W+σ) − 0.5U(W) < 0.5U(W) − 0.5U(W−σ)
Taking the U(W+σ) and U(W−σ) terms to one side and the U(W) terms to the other side yields:
0.5U(W+σ) + 0.5U(W−σ) < U(W)
The LHS is the expected utility of wealth for an asset that yields (W+σ) with probability 0.5 and (W−σ) with probability 0.5. The RHS is the utility of wealth when there is no gambling, that is, the utility of a sure amount. So the expected utility of the asset is less than the utility of the sure amount W. That is, a risk-averse individual prefers a sure amount to a gamble. This is a consequence of DMU of wealth. Therefore DMU of wealth implies risk aversion.
End of Proof
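The conclusion of the proof, 0.5U(W+σ) + 0.5U(W−σ) < U(W), is easy to verify for a concrete concave utility. A sketch with U(W) = √W and illustrative numbers (both assumptions of mine):

```python
import math

# DMU implies 0.5*U(W + s) + 0.5*U(W - s) < U(W) for a concave U.
# U(W) = sqrt(W); the wealth and stake below are illustrative assumptions.
W, s = 100.0, 36.0
eu_gamble = 0.5 * math.sqrt(W + s) + 0.5 * math.sqrt(W - s)
u_sure = math.sqrt(W)

print(eu_gamble, u_sure)   # the fair gamble is worth less than the sure amount
assert eu_gamble < u_sure
```

The gap between the two values is the premium a risk-averse individual would pay to avoid the gamble.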
3. Stochastic Dominance
The approach to the maximization of expected utility discussed in earlier lectures is based on
the assumption that the preferences of decision makers are known, easily obtained or
quantified. In a number of cases one may be confronted with the necessity of making a
prediction about a decision maker’s preferences between risky prospects with limited or no
knowledge of the underlying utility function. Under these conditions the decision-theoretic
approach is of limited value.
Stochastic Dominance (SD) is introduced to help solve this problem. It helps to resolve risky
choices while making the weakest possible assumptions. The most general form of SD
makes no assumptions about the form of the probability distribution.
Stochastic dominance is said to occur if the expected utility of one risky prospect exceeds
the expected utility of another for all possible utility functions within a defined class.
Under the SD approach one is interested in defining selection rules that minimize the admissible set of risky prospects by discarding those that are dominated. The set of risky prospects that are found not to be dominated according to some rule(s) is referred to as the stochastically efficient set. SD thus helps to isolate a smaller set of prospects by excluding those that are inferior. This is important for decision analysts since it reduces the number of alternatives requiring explicit consideration.
It utilizes the fundamental property that decision makers prefer low probabilities to be
associated with less preferred outcomes and high probabilities to be associated with more
preferred outcomes.
Assumptions in traditional stochastic dominance:
1. Individuals are utility maximizers - stochastic dominance assumes expected utility
maximization.
2. Two alternatives are to be compared (and these are mutually exclusive alternatives; that is, one or the other must be chosen, not a convex combination of both).
3. The stochastic dominance is developed based on population distributions.
We focus on First Order Stochastic Dominance (FOSD) and Second Order Stochastic
Dominance (SOSD).
Consider two prospects: prospect f and prospect g. Assume x is the level of wealth and f(x)
is the probability of obtaining different levels of wealth in alternative f.
Assume that g(x) gives the probability of each level of wealth for alternative g. Thus f(x)
and g(x) are probability density functions under alternatives f and g, respectively.
The expected utility for prospect f is: ∫u(x)f(x)dx
The expected utility for prospect g is: ∫u(x)g(x)dx
We may then write the difference in expected utility between the two prospects as follows:
∫u(x)f(x)dx − ∫u(x)g(x)dx
If f is preferred to g, then:
∫u(x)f(x)dx − ∫u(x)g(x)dx > 0
Thus prospect f stochastically dominates prospect g.
If g is preferred to f, then:
∫u(x)f(x)dx − ∫u(x)g(x)dx < 0
Thus prospect g stochastically dominates prospect f.
If one is indifferent between f and g, then:
∫u(x)f(x)dx − ∫u(x)g(x)dx = 0
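These expected-utility integrals can be approximated numerically. A sketch using a midpoint Riemann sum and two illustrative uniform densities of my own choosing (f uniform on [1, 2] and g uniform on [0, 1], so f plainly gives higher wealth):

```python
import math

def expected_utility(u, density, lo, hi, n=10000):
    # Midpoint Riemann-sum approximation of the integral of u(x)*density(x)dx
    h = (hi - lo) / n
    return sum(u(lo + (i + 0.5) * h) * density(lo + (i + 0.5) * h)
               for i in range(n)) * h

u = math.sqrt
f = lambda x: 1.0 if 1.0 <= x <= 2.0 else 0.0   # uniform density on [1, 2]
g = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0   # uniform density on [0, 1]

eu_f = expected_utility(u, f, 0.0, 2.0)
eu_g = expected_utility(u, g, 0.0, 2.0)
print(eu_f - eu_g)       # positive: f is preferred for this u
assert eu_f - eu_g > 0
```

The sign of the difference, not its size, is what the dominance comparison uses.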
1. First Order Stochastic Dominance (FOSD) – the least restrictive of the three. It assumes only that for every investor more is better: investors prefer more to less.
2. Second Order Stochastic Dominance (SOSD) – additionally assumes that investors are risk averse.
Proof of the FOSD rule:
Assume two investment alternatives: alternative f and g. Alternative f is preferred to g if
expected utility from f is greater than that from g.
Let X be wealth, f(x) be the probability of each level of wealth under alternative f and g(x)
the probability of wealth under alternative g.
Or, in terms of the densities: ∫u(x)f(x)dx − ∫u(x)g(x)dx ≥ 0,
which implies that: ∫u(x)[f(x) − g(x)]dx ≥ 0 ……………[1]
Integration by parts is very useful when integrating products. Recall from integration by
parts that if you have two functions v(x) and w(x) then;
[v(x)w(x)]′ = v′(x)w(x) + v(x)w′(x)
Taking the indefinite integral of each side and using the rule for integrating a sum gives:
v(x)w(x) = ∫v′(x)w(x)dx + ∫v(x)w′(x)dx
so that ∫v(x)w′(x)dx = v(x)w(x) − ∫v′(x)w(x)dx
Let v(x) = u(x) and w′(x) = f(x) − g(x), which implies that:
w(x) = ∫[f(x) − g(x)]dx = ∫f(x)dx − ∫g(x)dx = F(x) − G(x) ……………[2]
Therefore:
∫u(x)[f(x) − g(x)]dx = u(x)[F(x) − G(x)] − ∫u′(x)[F(x) − G(x)]dx ……………[3]
The first term, u(x)[F(x) − G(x)], is evaluated at the limits of integration and is zero. Why? Because at both limits F(x) = G(x): every CDF equals 0 at the lower limit and 1 at the upper limit.
Let us look at the last term: −∫u′(x)[F(x) − G(x)]dx.
Recall from the assumption of nonsatiation that u’(x) > 0, thus this term takes its sign from
the [F(x) – G(x)] term. The term gives the difference between two cumulative probability
distributions. The difference between F(x) and G(x) should be negative or zero for all x.
That is the cumulative probability distribution of f must always lie on or to the right of the
cumulative probability distribution of g.
Thus the term u′(x) > 0 has nothing to do with the overall sign of the expression, as it is always a positive multiplier.
For any value of x there is an area between g and f but the area under g is greater than the
area under f. When this is true for all x we conclude that f dominates g.
Thus the theorem is proven. QED
Given two distributions f and g, f dominates g by FOSD when the decision maker has positive marginal utility of wealth for all x [u′(x) > 0] and, for all x, the cumulative probability under the f distribution is less than or equal to the cumulative probability under the g distribution. This requires that for all x the cumulative probability distribution for f always lies on or to the right of the cumulative probability distribution for g.
Meaning/intuition
Recall the meaning of the cumulative distribution function (CDF).
F(x) ≤ G(x) means that for every x the cumulative probability of getting some given level of
wealth or higher is greater under f than under g.
Consider a wealth level of X = 5. Under g the probability that wealth is less than or equal to 5 is 0.5. That is, P(X ≤ 5) = 0.5. This implies that the probability that wealth is greater than 5 is 1 − P(X ≤ 5) = 0.5. That is, the probability of getting a higher payoff is 0.5.
Under f the probability that X is less than or equal to 5 is 0.3. That is, P(X ≤ 5) = 0.3. This implies that the probability that X is greater than 5 is 1 − P(X ≤ 5) = 0.7. That is, the probability of obtaining a higher payoff is 0.7.
Thus the probability of higher wealth, at 0.7, is greater under f than under g, at 0.3. Thus one
is likely to get a better payoff under f than under g. Thus f dominates g.
The implication is that the mean of f is greater than the mean of g or that for every level of
probability you make at least as much money under f as you do under g.
This allows one to characterize the choices between two risky distributions for every utility
maximizer that prefers more wealth to less.
[Figure: the CDFs of f and g. At a wealth of x = 5, F(5) = 0.3 and G(5) = 0.5; the CDF of f lies on or below the CDF of g for all x.]
One weakness of the FOSD is that the above characterization is difficult if the distributions
intersect. To solve this problem we introduce SOSD.
Under SOSD, f dominates g for every decision maker with u′(x) > 0 and u″(x) < 0 if the area under F up to x never exceeds the area under G up to x, that is, if ∫[F(t) − G(t)]dt ≤ 0 for all x, the integral running over all values of t up to x.
Proof:
Basically we need to show that the expected utility under f is greater than or equal to that under g. That is:
∫u(x)f(x)dx − ∫u(x)g(x)dx ≥ 0
∫u(x)f(x)dx − ∫u(x)g(x)dx = ∫u(x)[f(x) − g(x)]dx
= −∫u′(x)[F(x) − G(x)]dx ……………(4)
(the boundary term u(x)[F(x) − G(x)] vanishes as before). Integrating by parts a second time splits expression (4) into two terms:
A = −u′(x)·∫[F(t) − G(t)]dt and B = ∫u″(x){∫[F(t) − G(t)]dt}dx,
where each inner integral runs over all values of t up to x.
Let us look at part A first: since u′(x) > 0, for A to be positive the inner integral must be negative, i.e.,
∫[F(t) − G(t)]dt ≤ 0 for all x.
For part B, since u″(x) < 0 for a risk averter, B is also positive under the same condition.
Third Order Stochastic Dominance (TOSD) involves applying integration by parts once more, this time to part B of the SOSD proof.
If f dominates g by TOSD, it dominates for every risk averter with diminishing absolute risk aversion.
In general, the FOSD, SOSD, and TOSD orderings for a decision problem have certain
properties or relations in common. These include:
2. Partial ordering – the dominance relations imply that the set of utility functions
comprising a class for one degree of SD contains the set of utility functions
comprising a class for a higher degree of SD, but not conversely. That is, if A dominates B by FOSD, then A dominates B by SOSD, and A dominates B by TOSD also, but the reverse does not hold.
3. Necessary conditions – the necessary conditions for FOSD, SOSD and TOSD to hold
are that the lower bound of the cumulative distribution function for the dominant
prospect not be less than for the dominated prospect, and the mean of the dominant
prospect not be less than that of the dominated prospect.
Step 1: Take the outcomes for all probability distributions and arrange them in order.
Step 2: Write the frequencies of observations against each of the x levels for each
distribution. Some of these frequencies will usually be zero if for example an x level is
observed in one distribution and not in the other.
Step 3: Form the CDF starting at the first value of x.
Step 4: Do the comparisons.
FOSD Example 1
Suppose you are given two investment alternatives with returns (outcomes) and probabilities
as given in the following table. Which one do you prefer?
We shall use FOSD to answer this question.
It is obvious that A is preferred to B. This is not because the investor will always obtain a higher return in A than in B, but because for every return the odds of obtaining that return or better are higher with A than with B. This is much clearer in the following table, which compares the CDFs for the two alternatives.
Return    CDF A    CDF B
7         0        1/3
8         1/3      1/3
9         1/3      2/3
10        2/3      2/3
11        2/3      1
12        1        1
[Figure: Alternatives A and B. The step-function CDFs over returns 7 to 12; the CDF of A lies on or below the CDF of B everywhere, and the two curves never cross.]
NB: CDF of A is never greater than the CDF of B. The two curves do not cross and A does
not lie above B.
FOSD – states that if investors prefer more to less, and if CDF of A is never greater than
the CDF of B, then A is preferred to B.
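The FOSD rule just stated can be mechanized: compute the two CDFs and check that the CDF of A never exceeds the CDF of B. A sketch using the returns from the example (the representation and function names are my own):

```python
def cdf(dist, points):
    # dist maps outcome -> probability; evaluate the CDF at each point
    return [sum(p for x, p in dist.items() if x <= pt) for pt in points]

def fosd(a, b):
    # a dominates b by FOSD if CDF_A(x) <= CDF_B(x) at every outcome x
    points = sorted(set(a) | set(b))
    return all(ca <= cb for ca, cb in zip(cdf(a, points), cdf(b, points)))

A = {8: 1/3, 10: 1/3, 12: 1/3}   # investment A from the example above
B = {7: 1/3, 9: 1/3, 11: 1/3}    # investment B from the example above
print(fosd(A, B), fosd(B, A))    # A dominates B; B does not dominate A
```

The same helper applies directly to the class-exercise distributions below.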
Alternative A              Alternative B
Outcome   Probability      Outcome   Probability
6         1/4              5         1/4
8         1/4              9         1/4
10        1/4              10        1/4
12        1/4              12        1/4
Class Exercises
Do the following exercises in groups of two
1. Consider the following investments. What can be said about their desirability using
FOSD and SOSD?
(a)
A                      B                      C
Probability  Outcome   Probability  Outcome   Probability  Outcome
0.2          4         0.1          5         0.4          6
0.3          6         0.3          6         0.3          7
0.4          8         0.2          7         0.2          8
0.1          10        0.3          8         0.1          10
                       0.1          9
(b)
A                      B                      C
Probability  Outcome   Probability  Outcome   Probability  Outcome
0.4          3         0.1          5         0.1          5
0.3          4         0.2          6         0.1          7
0.1          6         0.1          8         0.2          8
0.1          7         0.2          9         0.2          9
0.1          9         0.4          10        0.4          11
If one assumes that the data is normally distributed, then f dominates g if:
1. The mean of f is greater than or equal to the mean of g: μf ≥ μg, and
2. The standard deviation of f is smaller than or equal to the standard deviation of g: σf ≤ σg.
4.1 Portfolios
One problem likely to be faced in stochastic dominance is the potential presence of
portfolios. Namely, one may be looking at two stochastic prospects which are not mutually
exclusive but which may be correlated, i.e., if one was looking at two crops one might find
that wheat and corn perform differently in different weather conditions because they utilize
different growing seasons. Here we investigate the question of what happens with the
correlation. The question that we deal with is: if f dominates g, under what conditions will f also dominate all convex combinations of f and g? We here follow the two-moment dominance rules procedure of McCarl et al. (1987). This portfolio-based investigation requires us to make some additional assumptions so that we may generate analytical results. We will rely on the moment-based normality Generalised Stochastic Dominance rule, which states that a normally distributed prospect f dominates a normally distributed prospect g whenever the two conditions stated above hold: μf ≥ μg and σf ≤ σg.
Now we want to see the conditions under which prospect f will not only dominate prospect
g but also h (where h is the convex combination of f and g).
A convex combination is written according to the following formula:
h = λf + (1 − λ)g, where 0 ≤ λ ≤ 1 ……………(5)
We also know that when we form prospect h from prospects f and g, its mean and variance are given by:
μh = λμf + (1 − λ)μg ……………(6)
σh² = λ²σf² + (1 − λ)²σg² + 2λ(1 − λ)ρσfσg
Since f dominates g, we have μf ≥ μg and σf² ≤ σg².
Now what we need to do is investigate the more general dominance conditions between f
and h and try to find conditions under which those conditions will hold given some arbitrary
f and g.
The first rule that we will investigate is the relationship between the means. Notice that the definition of the mean as expressed above allows us to write the following:
μh = λμf + (1 − λ)μg ≤ λμf + (1 − λ)μf = μf ……………(7)
or
μf ≥ μh
This arises since μf ≥ μg. Thus, uniformly, μf ≥ μh, and so the first of the two dominance rules is always satisfied.
Examining the second dominance rule is more complicated. Here we need to investigate the relationship between the variance of f and the variance of h. Writing σg = kσf, where k ≥ 1 (since f dominates g, σf ≤ σg), the variance of h becomes:
σh² = σf²[λ² + (1 − λ)²k² + 2λk(1 − λ)ρ] ……………(10)
For f to dominate h we need σh² ≥ σf², that is:
σf²[λ² + (1 − λ)²k² + 2λk(1 − λ)ρ] ≥ σf²
λ² + (1 − λ)²k² + 2λk(1 − λ)ρ ≥ 1
λ² + (1 − λ)²k² + 2λk(1 − λ)ρ − 1 ≥ 0
The expression on the left-hand side is zero at λ = 1, or at:
λ = (k² − 1) / (k² − 2ρk + 1) ……………(13)
Now if we wish to preclude convex combinations we wish the equality of the variances to hold somewhere outside the realm of feasible convex combinations. So what we wish to have is that the λ in equation (13) be greater than or equal to 1. That is,
(k² − 1) / (k² − 2ρk + 1) ≥ 1
Thus k² − 1 ≥ k² − 2ρk + 1, so that:
2ρk ≥ 2
ρ ≥ 1/k
Therefore ρ ≥ σf/σg (recall σg = kσf, which implies that k = σg/σf, or 1/k = σf/σg).
Thus we have the restriction that the correlation coefficient must be greater than or equal to the ratio of the standard deviations of the two prospects.
The significance of this result is that we now have a condition under which we are certain
that if f dominates g via a second degree stochastic dominance then f will dominate all
potential convex combinations of f and g.
This equation has several other implications. Namely, if the items are perfectly correlated then we are always safe, because we know that σf is always less than or equal to σg. Thus, if ρ = 1 it is always going to be greater than or equal to the ratio of the standard deviations. Similarly, if ρ is zero or negative then there is no way that one can ever guarantee that all the convex combinations are dominated.
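The condition ρ ≥ σf/σg can be verified by brute force: evaluate σh over a grid of λ and check whether any convex combination has a smaller standard deviation than f. A sketch with illustrative moments of my own choosing:

```python
import math

def sigma_h(lam, s_f, s_g, rho):
    # Standard deviation of the convex combination h = lam*f + (1 - lam)*g
    var = (lam**2 * s_f**2 + (1 - lam)**2 * s_g**2
           + 2 * lam * (1 - lam) * rho * s_f * s_g)
    return math.sqrt(var)

s_f, s_g = 2.0, 4.0        # illustrative: f dominates g, so s_f <= s_g
boundary = s_f / s_g       # the critical correlation, 0.5 here

# At or above the boundary no convex combination has a smaller standard
# deviation than f ...
for rho in (boundary, 0.8, 1.0):
    assert all(sigma_h(i / 100, s_f, s_g, rho) >= s_f for i in range(101))

# ... but below it some combinations do beat f on risk
assert any(sigma_h(i / 100, s_f, s_g, 0.0) < s_f for i in range(101))
print("condition rho >= s_f/s_g verified on a grid")
```

With ρ = 0, for example, λ = 0.8 gives σh ≈ 1.79 < σf = 2, so f no longer dominates every combination.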
Regret Theory (RT) - RT retains the true EUH probabilities but amends the VNM utility
function in such a way that the decision maker (DM) compares the outcome in each state
with outcomes in other states that might have occurred but did not.
The St Petersburg Paradox
A fair coin is tossed until a head appears. If the head appears on the first toss the payoff is
R1.
If the head appears on the second toss, then the payoff is R2.
If the head appears on the third toss, then the payoff is R4.
If the head appears on the fourth toss, then the payoff is R8, etc, etc.
We keep on tossing until a head appears; if it takes k tosses to get the head, the payoff is R2^(k−1).
The set of outcomes in this reduced lottery is the set {1, 2, 4, 8, …, 2^(k−1), …}.
How much would you be willing to pay for this gamble? How much you are willing to pay
may depend on what you will win on average from the game.
With probability ½ you win R1 right away, with probability ¼ you win R2, with probability 1/8 you win R4, …, with probability 1/2^k you win R2^(k−1).
The expected payoff of the game is therefore:
E = (1/2)(1) + (1/4)(2) + (1/8)(4) + … = 1/2 + 1/2 + 1/2 + … = ∞
Because the expected payoff is infinite, people should be willing to participate in the game no matter
how large the price of the gamble. In reality very few people would be ready to pay as much
as R100 for a ticket, hence the paradox.
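A quick simulation of the game (my own sketch) illustrates the tension: any single play usually pays little, yet the sample mean keeps drifting upward as the number of plays grows.

```python
import random

def st_petersburg_payoff(rng):
    # Toss a fair coin until the first head; k tosses pay 2**(k - 1)
    k = 1
    while rng.random() < 0.5:   # tails: keep tossing
        k += 1
    return 2 ** (k - 1)

rng = random.Random(42)          # fixed seed for reproducibility
payoffs = [st_petersburg_payoff(rng) for _ in range(100_000)]
print(sum(payoffs) / len(payoffs))   # sample mean grows with the number of plays
```

Most draws pay R1 or R2, but rare long runs of tails produce huge payoffs that keep the sample mean from settling down.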
It can be shown that a decision maker that chooses P1 over P2 must also choose Q1 over Q2.
If a decision-maker chooses P1 over P2 then according to the expected utility theory (EUT):
Thus equation 1 becomes:
Note that the LHS of equation (6) gives us E[U(W)] from gamble Q1 and RHS gives us the
E[U(W)] from gamble Q2.
According to the Expected Utility Theory an expected utility maximizer who prefers P1 to P2
ought to prefer Q1 to Q2 but when faced with the above set of gambles there are a number
of people who choose P1 over P2 and choose Q2 over Q1, contradicting the independence
axiom and the EUT predictions. This apparent contradiction is called the Allais Paradox.
Editing Phase - The editing phase is the preliminary analysis of the offered prospects.
People order outcomes. That is, people decide which outcomes they see as identical,
set a reference point and then consider lesser outcomes as losses and greater ones as
gains.
Evaluation Phase – In this phase the prospect with the highest value is chosen from the edited prospects. People behave as if they compute a value (utility) based on the potential outcomes and their respective probabilities, and then choose the alternative with the higher utility. Under prospect theory, probabilities are replaced by decision weights.
According to the expected utility theory the expected utility of the gamble is calculated as
follows:
Expected utility of gamble L = p1U(x1) + … + pnU(xn) = Σ piU(xi), summing over i = 1, …, n
According to the expected utility theory we compare different gambles’ expected utilities of
wealth and choose the one with the highest value.
According to Kahneman and Tversky (1979) the overall value of a prospect is expressed in terms of two scales, w(pi) and v(xi), where w(pi) is a probability decision weight function and v(xi) is a value function. The overall utility function (or value of the prospect) is then stated as follows:
U = w(p1)v(x1) + w(p2)v(x2) + … + w(pn)v(xn) = Σ w(pi)v(xi), summing over i = 1, …, n
In this case U is the overall utility function of the outcomes to the decision maker, xi‘s are
the potential outcomes, pi‘s are the respective probabilities. v(xi) is a value function that
assigns a value to the outcome.
Value Function [V(x)] – The value function assigns a number v(x) to each outcome x,
which reflects the subjective value of that outcome. Recall that outcomes are defined relative
to a reference point, which serves as a zero point, so v(x) measures deviations from that
reference point. The value function that passes through a reference point is S-shaped and
asymmetrical. People hate losses, so losses hurt more than gains feel good (loss aversion). That is, v(x) < −v(−x) for x > 0.
The value function is concave above the reference point [v″(x) < 0 for x > 0] and convex below it [v″(x) > 0 for x < 0]. Meaning, it is concave for gains and convex for losses. The value function is also steeper for losses than for gains, so that v′(x) < v′(−x) for x ≥ 0.
The further an outcome is from the reference point, the smaller the impact of a further change; the impact of a change is most pronounced around the reference point. This is called the principle of diminishing sensitivity.
It is important to note that w(p) is not a probability measure, and Kahneman and Tversky
prove that w(p) + w(1-p) is frequently less than 1. The weighting function, w(p), relates
decision weights to stated probabilities. It is an increasing function of p, with w(0) = 0 and
w(1) = 1. However, people tend to overweight very small probabilities, like 0.001, so that
w(p) > p for very small p. For large p decision weights are generally lower than
corresponding probabilities [w(p) < p]. In the figure below at p = 0.5, p = w(p). For p less
than 0.5, w(p) is greater than p and for p greater than 0.5 w(p) is less than p. For example,
when p = 0.25, w(p) = w(0.25) = 0.45 and when p = 0.75, w(p) = w(0.75) = 0.6.
Also, the weighting function is concave near 0 and convex near 1. This implies that a change
in probability from 0.9 to 1 has a greater impact than a change from 0.6 to 0.7. Also, a
change in probability by 0.001 from 0.01 to 0.011 has a greater impact than a change from
0.25 to 0.251. So the impact of a change in probability diminishes with its distance from the
boundaries.
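One commonly used closed form with exactly these properties is the weighting function from Tversky and Kahneman's later (1992) work, w(p) = p^γ / [p^γ + (1−p)^γ]^(1/γ). Note this is an illustration, not the function behind the numbers quoted above (its crossover point lies below 0.5):

```python
def w(p, gamma=0.61):
    # Tversky-Kahneman (1992) probability weighting; gamma = 0.61 is their
    # estimate for gains (used here purely for illustration)
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

assert w(0.0) == 0.0 and w(1.0) == 1.0
assert w(0.001) > 0.001            # small probabilities are overweighted
assert w(0.9) < 0.9                # large probabilities are underweighted
# A 0.1 jump to certainty matters more than a 0.1 change in the middle:
assert w(1.0) - w(0.9) > w(0.7) - w(0.6)
```

The assertions mirror the four properties in the text: the endpoint conditions, overweighting of small p, underweighting of large p, and the greater impact of changes near the boundaries.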
[Figure: left panel, the S-shaped value function, concave over gains and convex (and steeper) over losses, passing through the reference point; right panel, the probability weighting function w(p) plotted against the 45-degree line, crossing it at p = 0.5, with w(0.25) = 0.45 and w(0.75) = 0.6.]
Consider the gamble/prospect L(x, y, 0; p, q, 1−p−q), where the individual receives x with probability p, y with probability q, and nothing with probability 1−p−q, where p + q ≤ 1. We say that:
1. An offered prospect is strictly positive if its outcomes are all positive, i.e. x > 0, y > 0
and p + q = 1.
2. An offered prospect is strictly negative if all its outcomes are negative, i.e. x < 0, y <
0 and p + q = 1.
3. A prospect is regular if it is neither strictly positive nor strictly negative, i.e., if either x ≤ 0 ≤ y or p + q < 1.
So for a regular prospect, i.e. either p + q < 1 or, without loss of generality, x ≤ 0 ≤ y, then:
V(x, p; y, q) = w(p)v(x) + w(q)v(y) + w(1-p-q)v(0) = w(p)v(x) + w(q)v(y)
where v(0) = 0, w(0) = 0, and w(1) = 1. V is defined on prospects, while v is defined on
outcomes.
The evaluation of strictly positive or strictly negative prospects follows a different rule,
described below:
If p + q = 1 and either x > y > 0 or x < y < 0, then:
V(x, p; y, q) = v(y) + w(p)[v(x) − v(y)],
That is, the value of a strictly positive prospect equals the value of the riskless component
plus the differences between the values of the two outcomes, multiplied by the weight
associated with the more extreme outcome. Note that a decision weight is applied only to
the risky component, not the riskless one.
Consider the following prospect: (300, 200; 0.8) which gives you outcome 300 with a
probability 0.8 and an outcome 200 with probability 0.2. So you can win 300 or 200, so
either way you are guaranteed a minimum of 200. We can therefore say that 200 is riskless
since after the gamble you will have a minimum of 200 in your pocket. So we can
decompose the prospect (300, 200; 0.8) into a sure gain of 200 and a risky prospect (100, 0;
0.8).
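Evaluating this decomposition numerically requires concrete v and w functions; the ones below are illustrative stand-ins with the right shapes (an S-shaped value function with loss aversion and an inverse-S weighting function), not Kahneman and Tversky's estimated forms:

```python
# Illustrative stand-ins (assumptions, chosen for their shape only)
def v(x):
    return x ** 0.5 if x >= 0 else -2.25 * (-x) ** 0.5   # losses loom larger

def w(p):
    return p ** 0.65 / (p ** 0.65 + (1 - p) ** 0.65) ** (1 / 0.65)

# Strictly positive prospect (300, 200; 0.8): riskless component v(200) plus
# the weighted risky increment, V = v(200) + w(0.8) * [v(300) - v(200)]
V = v(200) + w(0.8) * (v(300) - v(200))
print(V)   # lies between v(200) and v(300)
```

Only the increment v(300) − v(200) is weighted by w(0.8); the riskless component v(200) enters at full value, exactly as the rule above prescribes.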
Scenario 2
Now choose between gambles C and D:
Gamble C: 11% chance of winning R100, 89% chance of winning R0
Gamble D: 10% chance of winning R500, 90% chance of winning R0