
Chapter 5


will give the Agent a constant wage w_S = w_F = w such that u(w) = Ū, and he will get an expected utility

W_0 = p x_S + (1 - p) x_F - w

The difference between W_1 and W_0 can then be rewritten as

W_1 - W_0 = (P - p)(x_S - x_F) + w - P w_S - (1 - P) w_F


Since the wages do not depend on x_S and x_F, it appears that if success
is much more attractive than failure for the Principal (x_S - x_F is high),
he will choose to get the Agent to work. (The reader is asked in exercise 5.1 to prove that then x_S - w_S > x_F - w_F at the optimum, with the
surplus from success shared between the Agent and the Principal.)
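This comparison is easy to check numerically. The sketch below uses made-up values for the success probabilities (P with effort, p without), the outcomes, and the wages; only the algebraic identity matters.

```python
# Two-action example: with effort, success has probability P and the
# Agent gets the contingent wages (w_S, w_F); without effort, success
# has probability p < P and the Agent gets the constant wage w.
# All parameter values are made up for illustration.
P, p = 0.8, 0.4          # success probabilities with and without effort
x_S, x_F = 100.0, 20.0   # Principal's gross outcomes
w_S, w_F = 30.0, 10.0    # contingent wages when effort is induced
w = 12.0                 # constant wage when effort is not induced

W1 = P * (x_S - w_S) + (1 - P) * (x_F - w_F)   # Principal induces effort
W0 = p * x_S + (1 - p) * x_F - w               # Principal does not

# The rewritten difference from the text:
diff = (P - p) * (x_S - x_F) + w - P * w_S - (1 - P) * w_F
assert abs((W1 - W0) - diff) < 1e-9
```

When x_S - x_F is large, the first term dominates and inducing effort is worthwhile, as in the text.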

5.2 The Standard Model

We consider here the standard model in a discrete version. The
Agent can choose between n possible actions: a_1, ..., a_n. These
actions produce one among m outcomes, which we denote x_1, ..., x_m.
The outcome a priori is a signal that brings information on the
action the Agent chooses. To simplify matters, we identify it as the surplus from the Principal-Agent relationship.7 (We will return to this
assumption in section 5.3.4.)
The stochastic relationship between the chosen action and the outcome is often called a "technology." The idea here is that when the
Agent chooses action a_i, the Principal observes outcome x_j with a
probability p_ij that is positive.8 Because the only variable that is publicly observed is the outcome, contracts must take the form of a wage
that depends on the outcome. If the Principal observes outcome x_j,
he will pay the Agent a wage w_j and keep x_j - w_j.

7. For instance, in an employer-employee relationship, a will be the effort and x the resulting production or profit.
8. If some of the probabilities p_ij were zero, the Principal could use this information to
exclude some actions. Suppose that action a_i is the first-best optimal action and that
p_ij = 0 for some j. The Principal then can fine the Agent heavily when the outcome is
x_j, since the fact that he observes x_j signals that the Agent did not choose the optimal
action a_i. This type of strategy will even allow the Principal to implement the first-best:
if moreover p_kj > 0 for all k ≠ i, then the choice of any a_k other than a_i will expose the
Agent to a large fine, thus effectively deterring him from deviating. This was noted
early on by Mirrlees (1975, published 1999); it is the reason why I exclude this case.

A general specification for the Agent's von Neumann-Morgenstern utility function would be u(w, a). However, the action
would then affect the Agent's preferences toward risk, which would
complicate the analysis.9 Therefore we will assume that the Agent's
utility is separable in income and action. Moreover, it is always possible to renormalize the actions so that their marginal cost is constant.
Thus in the standard model we take the Agent's utility function to be

u(w) - a

where u is increasing and concave. We can assume that the Principal
is risk-neutral, as done in most of the literature. The Principal's von
Neumann-Morgenstern utility function then is

x - w

5.2.1 The Agent's Program

When the Principal offers the Agent a contract (w_j), the Agent chooses
his action by solving the following program:

max_{i = 1, ..., n}  ( Σ_{j=1}^m p_ij u(w_j) - a_i )

If the Agent chooses a_i, then the (n - 1) incentive constraints

Σ_{j=1}^m p_ij u(w_j) - a_i ≥ Σ_{j=1}^m p_kj u(w_j) - a_k    (IC_k)

must hold for k = 1, ..., n and k ≠ i.

9. It may then, for instance, be optimal for the Principal to give higher wages if it reduces the
Agent's disutility of effort, so that the individual rationality constraint may not be
binding at the optimum.

We can assume that the Agent will accept the contract only if it
gives him a utility no smaller than some Ū, which represents the
utility the Agent can obtain by breaking his relationship with the
Principal for his next-best opportunity. The participation constraint
(the individual rationality constraint) can in this case be written

Σ_{j=1}^m p_ij u(w_j) - a_i ≥ Ū    (IR)

if the Agent's preferred action is a_i.
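The Agent's program is easy to simulate. The sketch below uses a made-up 3×3 technology p_ij, made-up action costs a_i, and assumes u(w) = √w (any increasing concave u would do):

```python
# The Agent's program for a made-up 3x3 technology: he picks the action
# a_i maximizing  sum_j p_ij * u(w_j) - a_i.  We assume u(w) = sqrt(w).
import math

p = [[0.6, 0.3, 0.1],    # p[i][j]: probability of outcome j under action i
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
a = [0.0, 0.1, 0.3]      # action costs
u = math.sqrt            # increasing and concave, as required

def best_action(w):
    """Index of the Agent's expected-utility-maximizing action."""
    values = [sum(p[i][j] * u(w[j]) for j in range(3)) - a[i]
              for i in range(3)]
    return max(range(3), key=values.__getitem__)

# A steeply increasing wage schedule makes the costliest action attractive,
# while a constant wage can only implement the least costly action.
assert best_action([4.0, 9.0, 16.0]) == 2
assert best_action([9.0, 9.0, 9.0]) == 0
```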

5.2.2 The Principal's Program

The Principal should choose the contract w_1, ..., w_m that maximizes
his expected utility, while taking into account the consequences of
this contract on the Agent's decision:

max_{(w_j), a_i}  Σ_{j=1}^m p_ij (x_j - w_j)

under

(IC_k),  k = 1, ..., n and k ≠ i    (λ_k)
(IR)    (μ)

where a_i is the action chosen at the optimum and the symbols in
parentheses represent the (nonnegative) multipliers associated with
the constraints. The maximization therefore is with respect to the wages
(w_j) and the action a_i, which the Principal indirectly controls.

If we fix a_i, the Lagrangian of the maximization problem is

L(w, λ, μ) = Σ_{j=1}^m p_ij (x_j - w_j) + Σ_{k≠i} λ_k [ Σ_{j=1}^m p_ij u(w_j) - a_i - Σ_{j=1}^m p_kj u(w_j) + a_k ] + μ [ Σ_{j=1}^m p_ij u(w_j) - a_i - Ū ]

Differentiating it with respect to w_j and regrouping terms, we get

1/u'(w_j) = μ + Σ_{k≠i} λ_k (1 - p_kj/p_ij)    (E)

At the first-best, we would get the efficient risk-sharing rule: the ratio
of the marginal utilities of the Principal and the Agent is constant, which implies that the wage itself is constant:

1/u'(w_j) = μ

where μ is chosen so that the constraint (IR) is an equality.


The difference between the two equations above
comes from the fact that some multipliers λ_k are positive. That is,
some incentive constraints may be active, so some actions a_k give the
Agent the same expected utility as a_i. In equilibrium at least one of
the λ_k must be positive (otherwise, we can neglect the incentive constraints, and the moral hazard problem will be moot); w_j then
depends on j through the terms p_kj/p_ij.
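To see how (E) shapes the wage schedule, take u(w) = √w, so that 1/u'(w) = 2√w and the wage solving (E) is w_j = (right-hand side/2)². The multipliers μ and λ_k below are made-up values, not the solutions of any Principal's program; the sketch only illustrates the comparative statics of (E):

```python
# Equation (E):  1/u'(w_j) = mu + sum_{k != i} lambda_k * (1 - p_kj/p_ij).
# With u(w) = sqrt(w), 1/u'(w) = 2*sqrt(w), so w_j = (rhs_j / 2)**2.
# mu and the lambda_k are illustrative numbers, taken as given here.
p = [[0.6, 0.3, 0.1],    # p[k][j]; the implemented action is i = 1
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
i = 1
mu = 3.0
lam = {0: 0.5, 2: 0.0}   # multipliers on the incentive constraints (IC_k)

def wage(j):
    rhs = mu + sum(l * (1 - p[k][j] / p[i][j]) for k, l in lam.items())
    return (rhs / 2) ** 2

w = [wage(j) for j in range(3)]
# Outcomes with low likelihood ratios p_0j/p_1j get the higher wages:
assert w[0] < w[1] < w[2]
```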
The p_kj/p_ij terms play a fundamental role in the analysis of moral
hazard. They can be interpreted by analogy with mathematical statistics. The Principal's problem likewise consists, in part, of inferring
the action the Agent will choose given the observed outcome. In statistical terms the Principal must estimate the "parameter" a from the
observation of the "sample" x. This parameter can be obtained by way
of the maximum likelihood estimator, which is the a_k such that k
maximizes the probability p_kj. The next two statements are therefore
equivalent:

a_i is the maximum likelihood estimator of a given x_j

and

for all k = 1, ..., n, the likelihood ratio p_kj/p_ij is no greater than 1.

By analogy, then, the p_kj/p_ij quantities can be called "likelihood
ratios," and because of this analogy we can interpret equation (E).
Fix the optimal action a_i. Because all multipliers λ_k are nonnegative
and the function 1/u' is increasing, the wage w_j associated with outcome x_j will be higher when a greater number of likelihood ratios
p_kj/p_ij is smaller than 1. This wage is therefore higher when a_i is the
maximum likelihood estimator of a given x_j. Because the wage in fact
depends on a weighted sum of the likelihood ratios, this argument is,
of course, not airtight.10 Still, the intuition is important and basically
right: the Principal will give the Agent a high wage when he observes
an outcome from which he can infer that the action taken was the
optimal one; however, he will give the Agent a low wage if the outcome shows it unlikely that the Agent chose the optimal action.
Before we study the properties of the optimal contract, let us consider briefly an alternative approach popularized by Grossman and
Hart (1983). They solve the Principal's maximization program in
two stages:
• For any action a_i, they minimize the cost for the Principal of implementing it. This amounts to minimizing the wage bill

Σ_{j=1}^m p_ij w_j

under the incentive constraints and the participation constraint.


• They then choose the action that maximizes the difference between
the expected benefit from action a_i, or

Σ_{j=1}^m p_ij x_j

and the cost-minimizing wage bill.

10. The reader should check that with only two actions (n = 2), the argument holds
as given in the text.
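The two stages can be mimicked by brute force on a small example. The technology, action costs, outcomes, and reservation utility below are all made up, and the search runs over a coarse wage grid rather than a proper optimization over the utilities v_j = u(w_j):

```python
# The Grossman-Hart two-stage approach, by brute force on a small grid.
# Stage 1: for each action, find the cheapest wage schedule that
# implements it (incentive and participation constraints).
# Stage 2: pick the action with the highest expected benefit net of
# the wage bill.  All numbers are illustrative.
import itertools, math

p = [[0.6, 0.3, 0.1],    # p[i][j]: probability of outcome j under action i
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
a = [0.0, 0.1, 0.3]       # action costs
x = [10.0, 20.0, 40.0]    # outcomes (surplus)
U_bar = 1.0               # reservation utility
u = math.sqrt
grid = range(16)          # candidate wages 0, 1, ..., 15

def EU(i, w):             # Agent's expected utility from action i under w
    return sum(p[i][j] * u(w[j]) for j in range(3)) - a[i]

def min_cost(i):          # stage 1: cheapest schedule implementing a_i
    best = None
    for w in itertools.product(grid, repeat=3):
        if EU(i, w) >= U_bar and all(EU(i, w) >= EU(k, w) for k in range(3)):
            cost = sum(p[i][j] * w[j] for j in range(3))
            if best is None or cost < best[0]:
                best = (cost, w)
    return best

# stage 2: maximize expected benefit minus cost-minimizing wage bill
profit = {i: sum(p[i][j] * x[j] for j in range(3)) - min_cost(i)[0]
          for i in range(3)}
i_star = max(profit, key=profit.get)
```

On this example the most costly action is worth implementing because it shifts probability strongly toward the high outcome.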

The Grossman-Hart approach is clearly equivalent to the approach
we used above, and in some ways it may be more enlightening.

5.2.3 Properties of the Optimal Contract

Let x_1 < ... < x_m and a_1 < ... < a_n. We are interested here in how
the wage w_j depends on the outcome j. We know that when the
action is observable and the Principal is risk-neutral, w is constant.
If, more generally, the Principal is risk-averse with a concave von
Neumann-Morgenstern utility function v, then the ratios of marginal utilities

v'(x_j - w_j) / u'(w_j)

are independent of j at the first-best.11 We see that the first-best
wage w_j must be an increasing function of j. This property is likewise
desirable for the second-best wage schedule. It is natural for the
wage to be higher when the surplus to be shared is higher. Recall
that we obtained such a result for the two-action, two-outcome
example in section 5.1.
It turns out that it is only possible to show in general (see
Grossman and Hart 1983) that
1. w_j cannot be uniformly decreasing in j,
2. neither can (x_j - w_j),
3. there exist i and j such that w_j > w_i and x_j - w_j > x_i - w_i.
The proofs are fairly complex and will be omitted here. However,
these results are obviously far removed from what common sense
tells us. For instance, they do not exclude an optimal wage schedule
in which wages decrease in part of the range. The usefulness of these
three results for our purpose appears when there are only two

11. This is known in the literature as Borch's rule.



possible outcomes: success or failure. The optimal wage schedule
can then be written as

w_1 = w
w_2 = w + s(x_2 - x_1)

The Agent receives a base wage w and a bonus proportional to the
increase in the surplus in case of success. Result 3 above
shows that the bonus rate s must satisfy 0 < s ≤ 1: wages increase
with the outcome, but not so fast that they exhaust the whole
increase in the surplus.
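With made-up numbers, the two-outcome schedule reads:

```python
# Two-outcome wage schedule: base wage plus a bonus proportional to
# the surplus increase on success, with bonus rate 0 < s <= 1.
# All values are illustrative.
x1, x2 = 20.0, 100.0     # failure and success outcomes
w_base, s = 10.0, 0.25   # base wage and bonus rate

w1 = w_base
w2 = w_base + s * (x2 - x1)

# Wages rise with the outcome, but by less than the surplus does.
assert w2 > w1 and (w2 - w1) <= (x2 - x1)
```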
When there are more than two outcomes, we cannot obtain more
positive results without adding structure to the technology that produces the outcome (the probabilities p_ij). The outcome has a dual
role in this model: it represents the global surplus to be shared, and
it also signals to the Principal the action taken by the Agent. The
shape of the solution is therefore determined by the properties of
this signal, which is what we already saw in our study of likelihood
ratios.
Let us return to (E), the equation that defines the optimal contract:

1/u'(w_j) = μ + Σ_{k≠i} λ_k (1 - p_kj/p_ij)    (E)

As the left-hand side of (E) increases in w_j, w_j will increase in j if, and
only if, the right-hand side of (E) increases in j as well. In other
words, we need to assume that a high action increases the probability of getting a high outcome at least as much as it increases the
probability of getting a low outcome:

∀k < i, ∀l < j,  p_ij/p_kj ≥ p_il/p_kl

This condition is called the monotone likelihood ratio condition (MLRC).
It amounts to assuming that for all k < i, the likelihood ratio p_ij/p_kj

increases with the outcome j. Exercise 5.6 asks you to show that
MLRC implies another commonly used comparison of probability
distributions, first-order stochastic dominance. First-order stochastic dominance just states that as a increases, the cumulative distribution
function of outcomes moves to the right: however one defines a
good outcome, the probability of a good outcome increases in a.
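Both conditions are mechanical to check for a given discrete technology. The sketch below tests MLRC on a made-up 3×3 technology and verifies, on the same example, the first-order stochastic dominance property it implies:

```python
# Check MLRC for a discrete technology p[k][j], and check first-order
# stochastic dominance on the same example.  The technology is made up
# and happens to satisfy both conditions.
p = [[0.6, 0.3, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]

def satisfies_mlrc(p):
    # For every pair k < i, the likelihood ratio p_ij / p_kj must be
    # nondecreasing in the outcome index j.
    n, m = len(p), len(p[0])
    for k in range(n):
        for i in range(k + 1, n):
            ratios = [p[i][j] / p[k][j] for j in range(m)]
            if any(r2 < r1 for r1, r2 in zip(ratios, ratios[1:])):
                return False
    return True

def fosd(p):
    # Higher actions shift the cdf of outcomes to the right: cumulative
    # probabilities are nonincreasing in the action index.
    n, m = len(p), len(p[0])
    for i in range(n - 1):
        cdf_lo = [sum(p[i][:j + 1]) for j in range(m)]
        cdf_hi = [sum(p[i + 1][:j + 1]) for j in range(m)]
        if any(h > l + 1e-12 for l, h in zip(cdf_lo, cdf_hi)):
            return False
    return True

assert satisfies_mlrc(p) and fosd(p)
```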

Since the multipliers λ_k are nonnegative, MLRC allows us to state
that the λ_k(1 - p_kj/p_ij) terms in (E) are increasing in j if k < i and
decreasing otherwise. We are done if we can find a condition
whereby the multipliers λ_k are all zero when k is greater than i, that
is, when the only active incentive constraints are those that prevent
the Agent from choosing actions less costly than the optimal action.
Note that if i = n, in which case the Principal wants to implement
the most costly action, then we are indeed done. When there are two
possible actions (the choice is work or not work, and the Principal wants the Agent to work), the MLRC is enough to ensure that
the wage increases in the outcome. In the general case Grossman
and Hart proposed12 the convexity of the distribution function condition
(CDFC):13 the cumulative distribution function of the outcome
should be convex in a on {a_1, ..., a_n}. More precisely, for i < j < k and
λ ∈ [0, 1] such that

a_j = λ a_i + (1 - λ) a_k

the CDFC says that

∀l = 1, ..., m,  Σ_{h=1}^l p_jh ≤ λ Σ_{h=1}^l p_ih + (1 - λ) Σ_{h=1}^l p_kh

One rough interpretation of this new condition is that the returns to the
action are stochastically decreasing, but this must be taken with a bit

12. Both MLRC and CDFC appear in earlier work by Mirrlees.
13. Some authors call this a concavity of the distribution function condition, meaning
that the decumulative distribution function (one minus the cumulative distribution
function) is concave.

of skepticism. CDFC really has no clear economic interpretation,
and its validity is much more doubtful than that of MLRC.14 The
main appeal of CDFC is that it allows us to obtain the result we seek,
as we will now show.
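Before turning to the proof, note that CDFC, like MLRC, is mechanical to check numerically. The sketch below uses a made-up technology with evenly spaced action costs, so that the middle action is a convex combination of the two others:

```python
# CDFC check: whenever a_j = lam*a_i + (1-lam)*a_k with i < j < k, the
# cdf of outcomes under a_j must lie below the convex combination of
# the cdfs under a_i and a_k.  Actions and technology are illustrative.
a = [0.0, 0.1, 0.2]                 # evenly spaced: a[1] = 0.5*a[0] + 0.5*a[2]
p = [[0.6, 0.3, 0.1],
     [0.4, 0.35, 0.25],
     [0.2, 0.4, 0.4]]

def cdf(row):
    out, total = [], 0.0
    for q in row:
        total += q
        out.append(total)
    return out

def satisfies_cdfc(a, p, i, j, k):
    lam = (a[k] - a[j]) / (a[k] - a[i])   # so a[j] = lam*a[i] + (1-lam)*a[k]
    F_i, F_j, F_k = cdf(p[i]), cdf(p[j]), cdf(p[k])
    return all(fj <= lam * fi + (1 - lam) * fk + 1e-12
               for fi, fj, fk in zip(F_i, F_j, F_k))

assert satisfies_cdfc(a, p, 0, 1, 2)
```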
Let a_i be the optimal action. It is not difficult to see that there must
exist a k < i such that the multiplier λ_k is positive. If all λ_k were zero
for k < i, then the optimal wage would be the same if the choice of
possible actions were restricted to A = {a_i, ..., a_n}. But the optimal
wage would then be constant, since a_i is the least costly action in A.
Now a constant wage can only implement action a_1 and not a_i in the
global problem, so this conclusion makes no sense.
Consider the problem in which the Agent is restricted to choosing
an action in {a_1, ..., a_i}, and let w be the optimal wage schedule. In this problem a_i is the costliest action, and MLRC therefore implies that w_j
increases in j. We will show that w stays optimal if we allow the
Agent to choose from the unrestricted set of actions {a_1, ..., a_n}.
Assume, to the contrary, that there exists a k > i such that the Agent
prefers to choose a_k:

Σ_{j=1}^m p_kj u(w_j) - a_k > Σ_{j=1}^m p_ij u(w_j) - a_i

and let l be the index of an action less costly than a_i and whose associated multiplier λ_l is nonzero, so that

Σ_{j=1}^m p_ij u(w_j) - a_i = Σ_{j=1}^m p_lj u(w_j) - a_l

There exists a λ ∈ [0, 1] such that

a_i = λ a_k + (1 - λ) a_l

14. Consider the slightly different model in which there is a continuous set of
outcomes given by x = a + ε, where ε is some random noise with probability distribution function F. The returns to the action are
constant here; however, CDFC is equivalent to the density f being nondecreasing, not a very appealing property.

We can therefore apply CDFC: denoting by P_ij = Σ_{h=1}^j p_ih the
cumulative distribution function of the outcome,

∀j = 1, ..., m,  P_ij ≤ λ P_kj + (1 - λ) P_lj

We deduce from this

Σ_{j=1}^m p_ij u(w_j) - a_i = Σ_{j=1}^{m-1} P_ij (u(w_j) - u(w_{j+1})) + u(w_m) - a_i
  ≥ λ [ Σ_{j=1}^{m-1} P_kj (u(w_j) - u(w_{j+1})) + u(w_m) - a_k ]
    + (1 - λ) [ Σ_{j=1}^{m-1} P_lj (u(w_j) - u(w_{j+1})) + u(w_m) - a_l ]
  = λ [ Σ_{j=1}^m p_kj u(w_j) - a_k ] + (1 - λ) [ Σ_{j=1}^m p_lj u(w_j) - a_l ]
  > Σ_{j=1}^m p_ij u(w_j) - a_i

which is absurd, given the definitions of a_k and a_l. The wage schedule w
therefore is the optimal solution in the global problem, and this concludes our proof, because w is increasing.
The general logic that should be drawn from this analysis is that
the structure of the simplest moral hazard problem is already very
rich and that it is therefore dangerous to trust one's intuition too
much. It is not necessarily true, for instance, that the second-best
optimal action is less costly for the Agent than the first-best optimal
action. It may not be true either that the expected profit of the Prin-
cipal increases as the Agent becomes more "productive" (in the
sense of first-order stochastic dominance) whatever action he
chooses. 15 The literature contains many negative results of this sort.

15. Exercise 5.3 provides a counterexample.
