
An Introduction to

Mathematical
Economics
Part 2

Loglinear
Publishing
Michael Sampson
Copyright © 2001 Michael Sampson.
Loglinear Publications: http://www.loglinear.com Email: mail@loglinear.com.

Terms of Use
This document is distributed "AS IS" and with no warranties of any kind, whether express or
implied.

Until November 1, 2001 you are hereby given permission to print one (1) and only one hardcopy
version free of charge from the electronic version of this document (i.e., the pdf file) provided
that:

1. The printed version is for your personal use only.


2. You make no further copies from the hardcopy version. In particular, no photocopies, electronic copies, or any other form of reproduction.
3. You agree never to sell the hardcopy version to anyone else.
4. You agree that if you ever give the hardcopy version to anyone else, this page (in particular the Copyright Notice and the Terms of Use) is included and the person to whom the copy is given accepts these Terms of Use.

Until November 1, 2001 you are hereby given permission to make (and if you wish sell) an
unlimited number of copies on paper only from the electronic version (i.e., the pdf file) of this
document or from a printed copy of the electronic version of this document provided that:

1. You agree to pay the author a royalty of either $3.00 Canadian or $2.00 US per copy within 60 days of making the copies, or to destroy after 60 days any copies for which you have not paid the royalty. Payment can be made either by cheque or money order and should be sent to the author at:

Professor Michael Sampson


Department of Economics
Concordia University
1455 de Maisonneuve Blvd W.
Montreal, Quebec
Canada, H3G 1M8

2. If you intend to make five or more copies, or if you can reasonably expect that five or more copies of the text will be made, then you agree to notify the author before making any copies, by Email at sampson@loglinear.com or by fax at 514-848-4536.
3. You agree to include on each paper copy of this document, at the same page number as this page in the electronic version: 1) the above Copyright Notice, 2) the URL http://www.loglinear.com, and 3) the Email address sampson@loglinear.com. You may then, if you wish, remove these Terms of Use from the paper copies you make.
Contents

0.1 Preface

1 Total Differentials
1.1 Endogenous and Exogenous Variables
1.2 Structural and Reduced Forms of a Model
1.2.1 The Structural Form of a Model
1.2.2 The Reduced Form of a Model
1.2.3 Implicit and Explicit Functions
1.2.4 Calculating Derivatives
1.3 Total Differentials for Beginners
1.3.1 A Recipe for Calculating $\frac{dy}{dx}$
1.4 Total Differentials for Intermediates
1.4.1 A Recipe for Calculating $\frac{\partial y}{\partial x_j}$
1.4.2 Example 1: A Simple Linear Function
1.4.3 Example 2
1.4.4 Example 3: The Bivariate Normal Density
1.4.5 Example 4: Labour Demand
1.5 Total Differentials for Experts
1.5.1 Systems of Implicit Functions
1.5.2 A Recipe for Calculating $\frac{\partial y_i}{\partial x_j}$
1.5.3 Example 1
1.5.4 Example 2: Supply and Demand I
1.5.5 Example 3: Supply and Demand II
1.5.6 Example 4: The IS/LM Model
1.6 Total Differentials and Optimization
1.6.1 A General Discussion
1.6.2 A Quick Derivation of the Main Principle
1.6.3 Profit Maximization
1.6.4 Constrained Optimization
1.6.5 Utility Maximization
1.6.6 Cost Minimization

2 The Envelope Theorem
2.1 Unconstrained Optimization
2.1.1 Profit Maximization in the Short-Run I
2.1.2 Envelope Theorem I
2.1.3 Profit Maximization in the Short-Run II
2.1.4 Profit Maximization in the Long-Run
2.1.5 Envelope Theorem II
2.1.6 A Recipe for Unconstrained Optimization
2.1.7 The Profit Function
2.1.8 Homogeneous Functions
2.1.9 Concavity and Convexity
2.1.10 Concavity, Convexity and Homogeneity of $f^*(x)$
2.1.11 Properties of the Profit Function
2.1.12 Some Symmetry Results
2.2 Constrained Optimization
2.2.1 Envelope Theorem III: Constrained Optimization
2.2.2 A Recipe for Constrained Optimization
2.2.3 Concavity, Convexity and Homogeneity of $f^*(x)$
2.2.4 Cost Minimization
2.2.5 Properties of the Cost Function
2.2.6 Utility Maximization
2.2.7 Expenditure Minimization

3 Integration and Random Variables
3.1 Introduction
3.2 Discrete Summation
3.2.1 Definitions
3.2.2 Example 1: $\sum_{i=1}^{n} aX_i$
3.2.3 Example 2: $\sum_{i=1}^{n} (aX_i + bY_i)$
3.2.4 Example 3: $\sum_{i=1}^{n} (aX_i + b)$
3.2.5 Example 4: $\sum_{i=1}^{n} (aX_i + b)^2$
3.2.6 Example 5: $\sum_{i=1}^{n} \left(X_i - \bar{X}\right) = 0$
3.2.7 Manipulating the Sigma Notation
3.2.8 Example 1: $\sum_{i=1}^{n} (aX_i + bY_i)^2$
3.2.9 Example 2: $\sum_{i=1}^{n} (aX_i + b)^2$
3.2.10 Example 3: $\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2$
3.3 Integration
3.3.1 The Fundamental Theorem of Integral Calculus
3.3.2 Example 1: $\int_1^{10} x^2\,dx$
3.3.3 Example 2: $\int_0^5 e^{-x}\,dx$
3.3.4 Integration by Parts
3.3.5 Example 1: $\int_0^4 xe^{-x}\,dx$
3.3.6 Example 2: $\Gamma(n)$
3.3.7 Integration as Summation
3.3.8 $\int_a^b dx$ as a Linear Operator
3.3.9 Example 1: $\int_a^b (cf(x) + dg(x))^2\,dx$
3.3.10 Example 2: $\int_a^b (f(x) + d)^2\,dx$
3.4 Random Variables
3.4.1 Introduction
3.4.2 Discrete Random Variables
3.4.3 The Bernoulli Distribution
3.4.4 The Binomial Distribution
3.4.5 The Poisson Distribution
3.4.6 Continuous Random Variables
3.4.7 The Uniform Distribution
3.4.8 The Standard Normal Distribution
3.4.9 The Normal Distribution
3.4.10 The Chi-Squared Distribution
3.4.11 The Student's t Distribution
3.4.12 The F Distribution
3.5 Expected Values
3.5.1 $E[X]$ for Discrete Random Variables
3.5.2 Example 1: Die Roll
3.5.3 Example 2: Binomial Distribution
3.5.4 $E[X]$ for Continuous Random Variables
3.5.5 Example 1: The Uniform Distribution
3.5.6 The Standard Normal Distribution
3.5.7 $E[\,\cdot\,]$ as a Linear Operator
3.5.8 The Rules for $E[\,\cdot\,]$
3.5.9 Example 1: $E\left[(3X + 4Y)^2\right]$
3.5.10 Example 2: $E\left[(3X + 4)^2\right]$
3.5.11 Example 3: $X \sim N\left[\mu, \sigma^2\right]$
3.6 Variances
3.6.1 Definition
3.6.2 Some Results for Variances
3.6.3 Example 1: Die Roll
3.6.4 Example 2: Uniform Distribution
3.6.5 Example 3: The Normal Distribution
3.7 Covariances
3.7.1 Definitions
3.7.2 Independence
3.7.3 Results for Covariances
3.8 Linear Combinations of Random Variables
3.8.1 Linear Combinations of Two Random Variables
3.8.2 The Capital Asset Pricing Model (CAPM)
3.8.3 An Example
3.8.4 Sums of Uncorrelated Random Variables
3.8.5 An Example
3.8.6 The Sample Mean
3.8.7 The Mean and Variance of the Binomial Distribution
3.8.8 The Linear Regression Model for Beginners
3.8.9 Linear Combinations of Correlated Random Variables
3.8.10 The Linear Regression Model for Experts

4 Dynamics
4.1 Introduction
4.2 Trigonometry
4.2.1 Degrees and Radians
4.2.2 The Functions $\cos(x)$ and $\sin(x)$
4.2.3 The Functions $\tan(x)$ and $\arctan(x)$
4.3 Complex Numbers
4.3.1 Introduction
4.3.2 Arithmetic with Complex Numbers
4.3.3 Complex Conjugates
4.3.4 The Absolute Value of a Complex Number
4.3.5 The Argument of a Complex Number
4.3.6 The Polar Form of a Complex Number
4.3.7 De Moivre's Theorem
4.3.8 Euler's Formula
4.3.9 When does $(a + bi)^n \to 0$ as $n \to \infty$?
4.4 First-Order Linear Difference Equations
4.4.1 Definition
4.4.2 Example
4.4.3 Stability
4.4.4 A Macroeconomic Example
4.4.5 The Cobweb Model
4.5 Second-Order Linear Difference Equations
4.5.1 The Behavior of $Y_t$ with Complex Roots
4.5.2 Example 1: Real Roots and Stable
4.5.3 Example 2: Real Roots and Unstable
4.5.4 Example 3: Complex Roots and Stable
4.5.5 Example 4: Complex Roots and Unstable
4.5.6 A Macroeconomic Example
4.6 First-Order Differential Equations
4.6.1 Definitions
4.6.2 The Case where $b = 0$
4.6.3 The Case where $b \neq 0$
4.6.4 Example 1
4.6.5 Example 2
4.7 Second-Order Differential Equations
4.7.1 Solution
4.7.2 Stability with Real Roots
4.7.3 Stability with Complex Roots
4.7.4 Example 1: Stable with Real Roots
4.7.5 Example 2: Unstable with Real Roots
4.7.6 Example 3: Stable with Complex Roots
4.7.7 Example 4: Unstable with Complex Roots

0.1 Preface

Thanks go to my Economics 326 students from the Winter 1999 and Winter 2000 semesters for working so hard on the first and second versions of these lecture notes. I would in particular like to thank Tamarha Louis-Charles, Dana Al-Aghbar, Christopher Martin and Samnang Om for pointing out many typos and errors.
Chapter 1

Total Differentials

1.1 Endogenous and Exogenous Variables

In economics we mostly work with mathematical models. By their very nature these models contain variables, which can be divided into two classes: 1) endogenous variables and 2) exogenous variables.

Endogenous variables (from Greek, endo: within, and genous: born, hence generated from within the model) are those variables which the model attempts to explain.

In a supply and demand model, for example, both the equilibrium price $P$ and quantity $Q$ are determined by the model, that is, at the intersection of the supply and demand curves. They are therefore endogenous variables.

Exogenous variables (from Greek, exo: from outside, hence generated from outside the model) are those variables whose determination the model says nothing about, but which nevertheless influence the endogenous variables.

An example of an exogenous variable in a supply and demand model would be the weather. Bad weather shifts the supply curve of, say, oranges to the left, causing $P$ to go up and $Q$ to go down. Weather therefore influences $P$ and $Q$, but $P$ and $Q$ do not (presumably) influence the weather. A supply and demand model says nothing about how the weather is determined.

Often policy variables such as taxes, government expenditure or the money supply are exogenous variables.

Mathematically you can think of the exogenous variables as the $x$'s or the independent variables, while endogenous variables are the dependent variables or the $y$'s.

Typically in economics we do not use $x$'s and $y$'s in our notation, and so you will often need to think carefully about which are the exogenous variables and which are the endogenous variables in any particular model.

1.2 Structural and Reduced Forms of a Model

1.2.1 The Structural Form of a Model

When we first write down an economic model, that is, before doing any complicated derivations, it is often the case that the endogenous variables $y = [y_i]$ (say an $n \times 1$ vector) and exogenous variables $x = [x_i]$ (say an $m \times 1$ vector) are all jumbled together and appear on both sides of the $=$ sign as:

$$M(y, x) = N(y, x).$$

This form of the model is often referred to as the structural form of the model, since it is in this form that the basic assumptions (or structure) of the model are most apparent.

1.2.2 The Reduced Form of a Model

In economics we are often interested in how changes in the $x$'s, the exogenous variables, affect the $y$'s, the endogenous variables.

For example, we might ask how increasing the tax rate $t$ (an exogenous variable) affects the price $P$ or quantity $Q$ (endogenous variables).

It is often hard to perform such experiments with the structural form of the model. The problem is that the $y$'s and $x$'s are all jumbled together. It would instead be more useful to have the reduced form of the model, which is:

$$y = h(x)$$

so that $y$ is by itself on the left-hand side and some function of $x$ is on the right-hand side.

Given knowledge of the form of $h(x)$ we can consider any number of changes in the $x$'s and then see how the $y$'s change.

An Example

Suppose $y$ is the number of workers that a firm hires, $x_1$ is the price of the good that the firm sells, and $x_2$ is the nominal wage. From a model we find that the marginal revenue of hiring an extra worker is:

$$M(y, x_1, x_2) = x_1 y^{-\frac{1}{2}}$$

while the marginal cost is:

$$N(y, x_1, x_2) = x_2 y^{\frac{1}{2}}.$$

Equating marginal revenue $M$ and marginal cost $N$, we arrive at a structural model $M(y, x_1, x_2) = N(y, x_1, x_2)$ or:

$$x_1 y^{-\frac{1}{2}} = x_2 y^{\frac{1}{2}}.$$

To get the reduced form we need to get $y$ alone on the left-hand side. In this case this is not hard to do. By a little algebra we find that:

$$y = h(x_1, x_2) = \frac{x_1}{x_2}.$$

Suppose $x_1 = x_2 = 1$. We then know that $y = 1$. Consider three experiments: 1) doubling $x_1$ while keeping $x_2$ fixed, 2) doubling $x_2$ while keeping $x_1$ fixed, and 3) doubling both $x_1$ and $x_2$. We see from the reduced form that the first experiment results in doubling $y$ to $2$, the second experiment results in halving $y$ to $\frac{1}{2}$, while in the third experiment nothing happens to $y$.
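Since the unscrambling here is routine algebra, it can also be delegated to a computer algebra system. The following is a minimal sketch (in Python with the sympy library; the choice of tools is our illustration, not part of the original notes) that recovers the reduced form and reruns the three experiments:

```python
import sympy as sp

y, x1, x2 = sp.symbols('y x1 x2', positive=True)

# Structural form: marginal revenue = marginal cost,
# written as an implicit function g(y, x1, x2) = 0.
g = x1 * y**sp.Rational(-1, 2) - x2 * y**sp.Rational(1, 2)

# Solve for y to obtain the reduced form y = h(x1, x2).
h = sp.solve(g, y)[0]
print(h)                         # x1/x2

# The three experiments, starting from x1 = x2 = 1 (so y = 1):
print(h.subs({x1: 2, x2: 1}))    # doubling x1: y = 2
print(h.subs({x1: 1, x2: 2}))    # doubling x2: y = 1/2
print(h.subs({x1: 2, x2: 2}))    # doubling both: y = 1
```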

1.2.3 Implicit and Explicit Functions

Closely related to the notion of structural and reduced forms are implicit and explicit functions.

Definition 1 An implicit function is any function written as:

$$g(y, x) = 0$$

so that both the $y$'s and $x$'s appear jumbled together on the left-hand side and zero is on the right-hand side.

Given the structural form of a model it is very easy to rewrite it as an implicit function. Thus from the structural form $M(y, x) = N(y, x)$ we can find the implicit function $g(y, x)$ as:

$$g(y, x) = M(y, x) - N(y, x) = 0.$$

Using the example of $M(y, x_1, x_2) = x_1 y^{-\frac{1}{2}}$ and $N(y, x_1, x_2) = x_2 y^{\frac{1}{2}}$ we have:

$$g(y, x_1, x_2) = x_1 y^{-\frac{1}{2}} - x_2 y^{\frac{1}{2}} = 0.$$

Definition 2 An explicit function unscrambles the $y$'s and $x$'s so that $y$ is alone on the left-hand side, as:

$$y = h(x).$$

Thus an explicit function and the reduced form are essentially the same thing.

Given an explicit function or reduced form it is always possible to write it as an implicit function as:

$$g(y, x) = y - h(x) = 0.$$

However, it is often impossible to find an explicit function or the reduced form from an implicit function or structural model.
For example, given the implicit function:

$$g(y, x) = y \ln(y) - \ln(x) = 0, \quad x > 1$$

there is no way of getting $y$ alone on the left-hand side. The best we could do is:

$$y \ln(y) = \ln(x)$$

or

$$y^y = x.$$

Nevertheless, $g(y, x) = y \ln(y) - \ln(x) = 0$ is a perfectly respectable function. We can, for example, show that it is increasing and concave. We could even plot it, as shown below.

[Figure: plot of $y$ against $x$ for $1 \le x \le 5$, with $y$ rising from $1$ to about $1.8$.]

$g(y, x) = y \ln(y) - \ln(x) = 0$ for $x > 1$

1.2.4 Calculating Derivatives

In economics we are often interested in calculating the partial derivative:

$$\frac{\partial y_i}{\partial x_j}$$

which tells us the relationship between the $j^{th}$ exogenous variable and the $i^{th}$ endogenous variable when all other exogenous variables are held constant. This corresponds to the experiment of keeping all exogenous variables except $x_j$ constant, changing $x_j$ by a small amount, and seeing how much $y_i$ changes.

Definition 3 If $\frac{\partial y_i}{\partial x_j} > 0$ then a positive relationship exists between $x_j$ and $y_i$; that is, an increase (decrease) in $x_j$ will result in an increase (decrease) in $y_i$.

Definition 4 If $\frac{\partial y_i}{\partial x_j} < 0$ then a negative relationship exists between $x_j$ and $y_i$; that is, an increase (decrease) in $x_j$ will result in a decrease (increase) in $y_i$.

In the example above, where we found that:

$$y = h(x_1, x_2) = \frac{x_1}{x_2}$$

we have:

$$\frac{\partial y}{\partial x_1} = \frac{1}{x_2} > 0, \qquad \frac{\partial y}{\partial x_2} = -\frac{x_1}{(x_2)^2} < 0.$$

Thus there is a positive relationship between $x_1$ and $y$ and a negative relationship between $x_2$ and $y$.

Since it is often not possible to find the reduced form of the model, that is, we often cannot unscramble the model enough to get $y$ alone on the left-hand side with $h(x)$ on the right-hand side, we cannot always calculate $\frac{\partial y_i}{\partial x_j}$ in the usual manner.

Fortunately, no matter how jumbled up the structural form of the model or the implicit function, it is still possible to calculate $\frac{\partial y_i}{\partial x_j}$ indirectly using total differentials. In what follows we will learn this technique, starting with the simplest cases and building up to the more general.

We begin with beginners' total differentials, where there is one exogenous variable $x$ and one endogenous variable $y$. We then move on to intermediate total differentials, where there are many exogenous variables but still only one endogenous variable. Finally we move on to expert total differentials, where there are both many endogenous and many exogenous variables.

1.3 Total Differentials for Beginners

Suppose we are given an implicit function:

$$g(y, x) = 0$$

where both $y$ and $x$ are scalars, so that there is one endogenous variable $y$ and one exogenous variable $x$. Under one condition this defines an explicit function:

$$y = h(x).$$

The reason we know this is the Implicit Function Theorem, which states that:

Theorem 5 (Implicit Function Theorem I) The explicit function $h(x)$ is guaranteed to exist as long as:

$$\frac{\partial g(y, x)}{\partial y} \neq 0.$$

The problem then is to calculate:

$$\frac{dy}{dx} = h'(x).$$

We do this by taking the total differential of $g(y, x)$, which is:

Definition 6 The total differential is defined as:

$$\frac{\partial g(y, x)}{\partial y} dy + \frac{\partial g(y, x)}{\partial x} dx = 0.$$

The total differential can be used to calculate $\frac{dy}{dx}$ using the following recipe:

1.3.1 A Recipe for Calculating $\frac{dy}{dx}$

Step 1. If it is not already clear to you, identify the endogenous variable $y$ and the exogenous variable $x$ in your model.

Step 2. If the structural model is not written as an implicit function of the form $g(y, x) = 0$, rewrite it so that it is in this form.

Step 3. Take the partial derivative of $g(y, x)$ with respect to $y$ and call this $a$; thus $a = \frac{\partial g(y, x)}{\partial y}$. Verify that $a \neq 0$, which ensures that the explicit function exists by the Implicit Function Theorem. Multiply $a$ by $dy$ to get:

$$a \times dy.$$

Step 4. Take the partial derivative of $g(y, x)$ with respect to $x$ and call this $b$; thus $b = \frac{\partial g(y, x)}{\partial x}$. Multiply $b$ by $dx$ to get:

$$b \times dx.$$

Step 5. Add the results of Steps 3 and 4 together and set them equal to zero to get the total differential:

$$a \times dy + b \times dx = 0.$$

Step 6. Solve for the ratio $\frac{dy}{dx}$ to get:

$$\frac{dy}{dx} = -\frac{b}{a}$$

or alternatively, using the definitions of $a$ and $b$:

$$\frac{dy}{dx} = -\frac{\partial g(y, x)/\partial x}{\partial g(y, x)/\partial y}.$$

Note that if $a = 0$ this ratio would be undefined.
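Steps 3 through 6 are mechanical, so they can be automated. Here is a minimal sketch, assuming Python with sympy is available (the code is our illustration, not part of the recipe itself):

```python
import sympy as sp

x, y = sp.symbols('x y')

def dy_dx(g):
    """Steps 3-6 of the recipe for an implicit function g(y, x) = 0."""
    a = sp.diff(g, y)          # Step 3: a = dg/dy, coefficient on dy
    b = sp.diff(g, x)          # Step 4: b = dg/dx, coefficient on dx
    if sp.simplify(a) == 0:    # Implicit Function Theorem check
        raise ValueError("a = 0: no explicit function is guaranteed")
    return sp.simplify(-b / a) # Step 6: dy/dx = -b/a

# A generic illustration: g(y, x) = y**3 + x*y + x**2 = 0.
print(dy_dx(y**3 + x*y + x**2))   # -(2*x + y)/(x + 3*y**2)
```

The worked examples that follow apply exactly these steps by hand.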

Example 1: A Simple Linear Function

Let us begin with an easy structural model written as:

$$3Q - 4R = 9$$

where $Q$ is, say, the quantity of apples produced in an orchard and $R$ is the amount of rainfall.

Applying Step 1 of the recipe, $Q$ is the endogenous variable and $R$ is the exogenous variable. This follows from the common-sense notion that while rainfall might affect apple production, apple production is unlikely to affect rainfall.

Applying Step 2, we can write this as an implicit function by putting the $9$ on the other side of the equal sign to get:

$$g(Q, R) = 3Q - 4R - 9 = 0.$$

From Steps 3, 4 and 5, the total differential for $g(Q, R)$ is:

$$3\,dQ - 4\,dR = 0$$

since:

$$a = \frac{\partial g(Q, R)}{\partial Q} = 3, \qquad b = \frac{\partial g(Q, R)}{\partial R} = -4.$$

Note that $a = 3 \neq 0$, and so the explicit function exists by the Implicit Function Theorem.

Solving for $\frac{dQ}{dR}$ from the total differential, it then follows that:

$$\frac{dQ}{dR} = \frac{4}{3}.$$

This is the same answer we would get from working with the explicit form of the function or the reduced form. We obtain this by solving $g(Q, R) = 3Q - 4R - 9 = 0$ for $Q$ to obtain:

$$Q = \frac{4}{3}R + 3$$

from which it easily follows that $\frac{dQ}{dR} = \frac{4}{3}$.

Example 2

Consider the implicit function:

$$g(y, x) = \ln(y) + \frac{1}{2}x^2 + \frac{1}{2}\ln(2\pi) = 0.$$

This can be written explicitly, or as a reduced form, as:

$$y = h(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$$

which is the standard normal density. From $h(x)$ we can directly calculate $h'(x)$ as:

$$\frac{dy}{dx} = -x \underbrace{\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}}_{=y} = -xy.$$

Let us now attempt to obtain the derivative via the total differential. Using Steps 3, 4 and 5 we obtain $a = \frac{\partial g(y, x)}{\partial y} = \frac{1}{y}$ and $b = \frac{\partial g(y, x)}{\partial x} = x$, so that the total differential is:

$$\frac{1}{y}dy + x\,dx = 0.$$

Solving for $\frac{dy}{dx}$ we then get:

$$\frac{dy}{dx} = -xy.$$

Example 3

Suppose we were given a structural equation of the form:

$$y^y = x.$$

Here, from the notation, $y$ is the endogenous variable and $x$ is the exogenous variable. Following Step 2, by taking the $\ln(\cdot)$ of both sides we can write this as an implicit function:

$$g(y, x) = y \ln(y) - \ln(x) = 0$$

which we have already seen cannot be written as an explicit function $y = h(x)$.^1

From Steps 3, 4 and 5 the total differential of $g(y, x)$ is:

$$(1 + \ln(y))\,dy - \frac{1}{x}dx = 0$$

since:

$$a = \frac{\partial g(y, x)}{\partial y} = 1 + \ln(y), \qquad b = \frac{\partial g(y, x)}{\partial x} = -\frac{1}{x}.$$

Solving for $\frac{dy}{dx}$:

$$\frac{dy}{dx} = \frac{1}{x(1 + \ln(y))}.$$

We can use this result to show that $0 < \frac{dy}{dx} < 1$ for $x > 1$, and hence that a positive relationship exists between $x$ and $y$. This follows since:

$$x > 1 \implies y \ln(y) = \ln(x) > 0$$

implies that $y > 1$, since if $y < 1$ then $\ln(y) < 0$. Therefore from $x > 1$ and $1 + \ln(y) > 1$ we have:

$$x > 1 \implies 0 < \frac{dy}{dx} = \frac{1}{x(1 + \ln(y))} < 1.$$

^1 Note that we could just as correctly have used $g(y, x) = y^y - x = 0$; that is, there are many different correct ways of writing down an implicit function.

As an exercise, differentiate this one more time with respect to $x$ and show that:

$$\frac{d^2 y}{dx^2} = -\frac{1}{x(1 + \ln(y))}\left(\frac{1}{x} + \frac{1}{y(1 + \ln(y))}\frac{dy}{dx}\right) < 0$$

so that the function is concave, as the plot above illustrates.
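Since no explicit $h(x)$ exists, a numerical check is reassuring. A small sketch (Python with sympy, assumed available) reproduces $\frac{dy}{dx}$ symbolically and evaluates it at a test point, $x = 2$:

```python
import sympy as sp

x = sp.Symbol('x', positive=True)
y = sp.Function('y')
z = sp.Symbol('z', positive=True)

# g(y, x) = y*ln(y) - ln(x) = 0, with y treated as a function of x.
eq = y(x) * sp.log(y(x)) - sp.log(x)

# Differentiate the identity with respect to x and solve for y'(x).
dydx = sp.solve(sp.diff(eq, x), y(x).diff(x))[0]
print(sp.simplify(dydx))          # 1/(x*(log(y(x)) + 1))

# At x = 2, y solves y*ln(y) = ln(2), so y is about 1.56 and
# dy/dx is about 0.35, inside (0, 1) as the argument above shows.
y_at_2 = sp.nsolve(z * sp.log(z) - sp.log(2), z, 1.5)
print(sp.N(dydx.subs({y(x): y_at_2, x: 2})))
```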

Example 4: The Demand for Labour in the Short-Run

Suppose a firm faces a short-run production function $Q = f(L)$, where $f'(L) > 0$ and $f''(L) < 0$, and where $Q$ is output and $L$ is the amount of labour. If $w = \frac{W}{P}$ is the real wage, then the firm will choose the $L^*$ which satisfies:

$$f'(L^*) = w.$$

Here $w$ is the exogenous variable and $L^*$ is the endogenous variable. We can write this structural form of the model as an implicit function as:

$$g(L^*, w) = f'(L^*) - w = 0.$$

This defines the labour demand function, or the explicit function or the reduced form, as:

$$L^* = L^*(w).$$

The total differential is:

$$f''(L^*)\,dL^* - dw = 0$$

since:

$$a = \frac{\partial g(L^*, w)}{\partial L^*} = f''(L^*), \qquad b = \frac{\partial g(L^*, w)}{\partial w} = -1.$$

Note that $a < 0$, and so the explicit function exists. The coefficient on $dL^*$ is negative since the diminishing marginal product of labour implies that $f''(L^*) < 0$.

Solving for the ratio $\frac{dL^*}{dw}$ we find that:

$$\frac{dL^*}{dw} = \frac{1}{f''(L^*)} < 0$$

and so the labour demand curve is downward sloping.

1.4 Total Differentials for Intermediates

Consider now an implicit function where there are $m$ exogenous variables $x_1, x_2, \ldots, x_m$ (which we may write as an $m \times 1$ vector $x$) and one endogenous variable $y$. The implicit function is now written as:

$$g(y, x_1, x_2, \ldots, x_m) = 0$$

which determines the explicit function:

$$y = h(x_1, x_2, \ldots, x_m)$$

as long as the Implicit Function Theorem is satisfied, which states that:

Theorem 7 (Implicit Function Theorem II) The explicit function $y = h(x_1, x_2, \ldots, x_m)$ is guaranteed to exist as long as:

$$\frac{\partial g(y, x_1, x_2, \ldots, x_m)}{\partial y} \neq 0.$$

We wish to calculate $\frac{\partial y}{\partial x_j} = \frac{\partial h(x_1, x_2, \ldots, x_m)}{\partial x_j}$ in order to determine how a change in the $j^{th}$ exogenous variable $x_j$ will affect the endogenous variable $y$. To do this from the implicit function $g(y, x) = 0$ we calculate the total differential, defined by:

Definition 8 The total differential of $g(y, x) = 0$ is given by:

$$a \times dy + b_1 dx_1 + b_2 dx_2 + \cdots + b_m dx_m = 0$$

where $a, b_1, b_2, \ldots, b_m$ are defined in the following recipe:

1.4.1 A Recipe for Calculating $\frac{\partial y}{\partial x_j}$

We proceed as follows:

Step 1. If it is not already clear to you, identify the endogenous variable $y$ and the exogenous variables $x_1, x_2, \ldots, x_m$ in your model.

Step 2. Write the structural form of the model as an implicit function of the form:

$$g(y, x_1, x_2, \ldots, x_m) = 0.$$

This generally means just collecting all terms on the left-hand side of the structural model and setting them equal to zero.

Step 3. Take the partial derivative of $g(y, x)$ with respect to $y$ and call this $a$. Make sure that $a \neq 0$, so that the explicit function exists by the Implicit Function Theorem. Multiply $a$ by $dy$ to get:

$$a \times dy.$$

Step 4. Take the partial derivative of $g(y, x)$ with respect to $x_1$ and call this $b_1$. Multiply $b_1$ by $dx_1$ to get $b_1 dx_1$. Repeat this for $x_2, x_3, \ldots, x_m$ and add the results together to get:

$$b_1 dx_1 + b_2 dx_2 + \cdots + b_m dx_m.$$

Step 5. Add the results of Steps 3 and 4 together and set them equal to zero to get the total differential:

$$a \times dy + b_1 dx_1 + b_2 dx_2 + \cdots + b_m dx_m = 0.$$

Step 6. To get the partial derivative $\frac{\partial y}{\partial x_j}$, set all $dx$'s equal to zero except $dx_j$, and then replace the remaining $d$'s in the total differential with $\partial$'s. Thus we set $dy = \partial y$, $dx_j = \partial x_j$ and $dx_i = 0$ for $i \neq j$ in the total differential to get:

$$a\,\partial y + b_j\,\partial x_j = 0.$$

Now solve for the ratio $\frac{\partial y}{\partial x_j}$:

$$\frac{\partial y}{\partial x_j} = -\frac{b_j}{a} = -\frac{\partial g(y, x)/\partial x_j}{\partial g(y, x)/\partial y}.$$

1.4.2 Example 1: A Simple Linear Function

Consider the structural model:

$$3Q - 4R - 7S = 9$$

where $Q$ is the production of apples, $R$ is rainfall and $S$ is the amount of sunshine. Following Step 1, $Q$ is the endogenous variable and $R$ and $S$ are the two exogenous variables.

The structural model can be re-written as an implicit function as:

$$g(Q, R, S) = 3Q - 4R - 7S - 9 = 0.$$

Following Steps 3, 4 and 5 the total differential is:

$$3\,dQ - 4\,dR - 7\,dS = 0$$

since:

$$a = \frac{\partial g(Q, R, S)}{\partial Q} = 3, \quad b_1 = \frac{\partial g(Q, R, S)}{\partial R} = -4, \quad b_2 = \frac{\partial g(Q, R, S)}{\partial S} = -7.$$

Suppose we wish to calculate $\frac{\partial Q}{\partial S}$. Following Step 6, first set $dR = 0$ and then replace the remaining $d$'s with $\partial$'s, so that $dQ = \partial Q$ and $dS = \partial S$. We then obtain:

$$3\,\partial Q - 7\,\partial S = 0.$$

Solving for the ratio $\frac{\partial Q}{\partial S}$ then yields:

$$\frac{\partial Q}{\partial S} = \frac{7}{3}.$$

In this example we could also have calculated $\frac{\partial Q}{\partial S}$ from the reduced form. From $3Q - 4R - 7S - 9 = 0$ you can arrive at the reduced form (or explicit function) as:

$$Q = h(R, S) = \frac{4}{3}R + \frac{7}{3}S + 3$$

from which it is trivial to calculate:

$$\frac{\partial Q}{\partial S} = \frac{7}{3}.$$

1.4.3 Example 2

Consider the implicit function:

$$g(y, x_1, x_2) = x_1 y^{-\frac{1}{2}} - x_2 y^{\frac{1}{2}} = 0.$$

We have already seen that this has a reduced form:

$$y = h(x_1, x_2) = \frac{x_1}{x_2}$$

from which it is easy to calculate $\frac{\partial y}{\partial x_1}$ and $\frac{\partial y}{\partial x_2}$.

Let us find these partial derivatives from the implicit function. The total differential is:

$$-\frac{1}{2}\left(x_1 y^{-\frac{3}{2}} + x_2 y^{-\frac{1}{2}}\right) dy + y^{-\frac{1}{2}} dx_1 - y^{\frac{1}{2}} dx_2 = 0$$

since:

$$a = \frac{\partial g(y, x_1, x_2)}{\partial y} = -\frac{1}{2} x_1 y^{-\frac{3}{2}} - \frac{1}{2} x_2 y^{-\frac{1}{2}}, \quad b_1 = \frac{\partial g(y, x_1, x_2)}{\partial x_1} = y^{-\frac{1}{2}}, \quad b_2 = \frac{\partial g(y, x_1, x_2)}{\partial x_2} = -y^{\frac{1}{2}}.$$

To calculate $\frac{\partial y}{\partial x_1}$, set $dx_2 = 0$ and replace the remaining $d$'s with $\partial$'s, so that $dy = \partial y$ and $dx_1 = \partial x_1$. We then obtain:

$$-\frac{1}{2}\left(x_1 y^{-\frac{3}{2}} + x_2 y^{-\frac{1}{2}}\right) \partial y + y^{-\frac{1}{2}} \partial x_1 = 0.$$

Solving for the ratio $\frac{\partial y}{\partial x_1}$ yields:

$$\frac{\partial y}{\partial x_1} = \frac{2y^{-\frac{1}{2}}}{x_1 y^{-\frac{3}{2}} + x_2 y^{-\frac{1}{2}}}.$$

As an exercise, use the reduced form $y = \frac{x_1}{x_2}$ to prove that the above expression for $\frac{\partial y}{\partial x_1}$ is equal to $\frac{1}{x_2}$.

To calculate $\frac{\partial y}{\partial x_2}$, set $dx_1 = 0$ and replace the remaining $d$'s with $\partial$'s to obtain:

$$-\frac{1}{2}\left(x_1 y^{-\frac{3}{2}} + x_2 y^{-\frac{1}{2}}\right) \partial y - y^{\frac{1}{2}} \partial x_2 = 0.$$

Solving for $\frac{\partial y}{\partial x_2}$ yields:

$$\frac{\partial y}{\partial x_2} = -\frac{2y^{\frac{1}{2}}}{x_1 y^{-\frac{3}{2}} + x_2 y^{-\frac{1}{2}}} < 0.$$

As an exercise, use the reduced form $y = \frac{x_1}{x_2}$ to prove that the above expression for $\frac{\partial y}{\partial x_2}$ is equal to $-\frac{x_1}{(x_2)^2}$.
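Both exercises can be checked mechanically. A sketch (Python with sympy, assumed available) forms $a$, $b_1$ and $b_2$, computes $-b_j/a$, and substitutes the reduced form $y = x_1/x_2$:

```python
import sympy as sp

x1, x2, y = sp.symbols('x1 x2 y', positive=True)

g = x1 * y**sp.Rational(-1, 2) - x2 * y**sp.Rational(1, 2)
a  = sp.diff(g, y)    # coefficient on dy
b1 = sp.diff(g, x1)   # coefficient on dx1
b2 = sp.diff(g, x2)   # coefficient on dx2

# partial y / partial x_j = -b_j/a, then substitute the
# reduced form y = x1/x2 to simplify.
for b in (b1, b2):
    print(sp.simplify((-b / a).subs(y, x1 / x2)))
# Output: 1/x2 and then -x1/x2**2, as the exercises claim.
```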

1.4.4 Example 3: The Bivariate Normal Density

Consider the implicit function:

$$g(y, x_1, x_2) = \ln(y) + \frac{1}{2}\left(x_1^2 + x_2^2\right) + \ln(2\pi) = 0$$

from which we can find the explicit function:

$$y = h(x_1, x_2) = \frac{1}{2\pi} e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)}.$$

This is the bivariate standard normal density, which is plotted below:

[Figure: surface plot of the bell-shaped density over $-3 \le x_1, x_2 \le 3$, peaking at about $0.16$ at the origin.]

$$y = h(x_1, x_2) = \frac{1}{2\pi} e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)}$$

If we start from the explicit function we can directly calculate:

$$\frac{\partial y}{\partial x_1} = -x_1 \frac{1}{2\pi} e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)} = -x_1 y$$

$$\frac{\partial y}{\partial x_2} = -x_2 \frac{1}{2\pi} e^{-\frac{1}{2}\left(x_1^2 + x_2^2\right)} = -x_2 y.$$

If instead we calculate the total differential, we find that:

$$\frac{1}{y}dy + x_1 dx_1 + x_2 dx_2 = 0$$

since:

$$a = \frac{\partial g(y, x_1, x_2)}{\partial y} = \frac{1}{y}, \quad b_1 = \frac{\partial g(y, x_1, x_2)}{\partial x_1} = x_1, \quad b_2 = \frac{\partial g(y, x_1, x_2)}{\partial x_2} = x_2.$$

To calculate $\frac{\partial y}{\partial x_1}$, set $dy = \partial y$, $dx_1 = \partial x_1$ and $dx_2 = 0$ in the total differential to obtain:

$$\frac{1}{y}\partial y + x_1 \partial x_1 = 0$$

so that solving for the ratio $\frac{\partial y}{\partial x_1}$ we find:

$$\frac{\partial y}{\partial x_1} = -x_1 y.$$

To calculate $\frac{\partial y}{\partial x_2}$, set $dy = \partial y$, $dx_2 = \partial x_2$ and $dx_1 = 0$ in the total differential to obtain:

$$\frac{1}{y}\partial y + x_2 \partial x_2 = 0$$

so that solving for the ratio $\frac{\partial y}{\partial x_2}$ we find:

$$\frac{\partial y}{\partial x_2} = -x_2 y.$$

1.4.5 Example 4: Labour Demand

Suppose a firm faces a short-run production function $Q = f(L)$, where $Q$ is output and $L$ is the amount of labour. With price $P$ and nominal wage $W$, profits $\pi$ as a function of $L$ are given by:

$$\pi(L) = P f(L) - WL.$$

The profit-maximizing $L^*$ then satisfies the first-order condition:

$$\pi'(L^*) = 0 = P f'(L^*) - W$$

with second-order condition:

$$P f''(L^*) < 0.$$

Here $P$ and $W$ are the exogenous variables and $L^*$ is the endogenous variable. Note that the first-order condition gives us the appropriate implicit function:

$$g(L^*, W, P) = P f'(L^*) - W = 0.$$

This defines the labour demand function, or the explicit function or reduced form, as:

$$L^* = L^*(W, P).$$

From the implicit function $g(L^*, W, P) = 0$ the total differential is:

$$P f''(L^*)\,dL^* - dW + f'(L^*)\,dP = 0$$

since:

$$a = \frac{\partial g(L^*, W, P)}{\partial L^*} = P f''(L^*), \quad b_1 = \frac{\partial g(L^*, W, P)}{\partial W} = -1, \quad b_2 = \frac{\partial g(L^*, W, P)}{\partial P} = f'(L^*).$$

Note that the coefficient on $dL^*$ is negative from the second-order condition; that is, $a = P f''(L^*) < 0$.

To calculate $\frac{\partial L^*}{\partial W}$, set $dP = 0$ in the total differential and replace the remaining $d$'s with $\partial$'s, so that $dL^* = \partial L^*$ and $dW = \partial W$, to get:

$$P f''(L^*)\,\partial L^* - \partial W = 0$$

or:

$$P f''(L^*)\,\partial L^* = \partial W.$$

Dividing both sides by $\partial W$ and solving for $\frac{\partial L^*}{\partial W}$:

$$\frac{\partial L^*(W, P)}{\partial W} = \frac{1}{P f''(L^*(W, P))} < 0$$

and so the labour demand curve is downward sloping.

To calculate $\frac{\partial L^*}{\partial P}$, set $dW = 0$ in the total differential and replace the remaining $d$'s with $\partial$'s, so that $dL^* = \partial L^*$ and $dP = \partial P$, to get:

$$P f''(L^*)\,\partial L^* + f'(L^*)\,\partial P = 0.$$

Solving, we find that:

$$\frac{\partial L^*}{\partial P} = -\frac{f'(L^*)}{P f''(L^*)} > 0$$

since $f'(L^*) > 0$ (the marginal product of labour is positive) and $P f''(L^*) < 0$ from the second-order condition.
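To see these signs in a concrete case, take the assumed technology $f(L) = \sqrt{L}$ (our choice for illustration; the argument above needs only $f' > 0$ and $f'' < 0$). The first-order condition can then be solved explicitly and the comparative statics checked directly:

```python
import sympy as sp

L, W, P = sp.symbols('L W P', positive=True)

f = sp.sqrt(L)                       # assumed production function
g = P * sp.diff(f, L) - W            # first-order condition g(L*, W, P) = 0

L_star = sp.solve(g, L)[0]           # explicit labour demand: P**2/(4*W**2)
print(L_star)

# Signs of the comparative statics match the total-differential results:
print(sp.diff(L_star, W))            # -P**2/(2*W**3) < 0
print(sp.diff(L_star, P))            # P/(2*W**2)    > 0

# The same answers come from -b1/a and -b2/a evaluated at L = L*:
a, b1, b2 = sp.diff(g, L), sp.diff(g, W), sp.diff(g, P)
print(sp.simplify((-b1 / a).subs(L, L_star)))
print(sp.simplify((-b2 / a).subs(L, L_star)))
```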
1.5 Total Differentials for Experts

1.5.1 Systems of Implicit Functions

Suppose we have a model where there are $n$ endogenous variables $y_1, y_2, y_3, \ldots, y_n$ and $m$ exogenous variables $x_1, x_2, \ldots, x_m$. We can write this more compactly by letting $y = [y_i]$ be an $n \times 1$ vector of the endogenous variables and $x = [x_i]$ an $m \times 1$ vector of exogenous variables.

Suppose you are given $n$ implicit functions:

$$g_i(y, x) = 0, \quad \text{for } i = 1, 2, \ldots, n.$$

This defines $n$ reduced form or explicit functions given by:

$$y_i = h_i(x), \quad \text{for } i = 1, 2, \ldots, n$$

as long as the Implicit Function Theorem is satisfied, which states that:

Theorem 9 (Implicit Function Theorem III) The explicit functions $y_i = h_i(x_1, x_2, \ldots, x_m)$ for $i = 1, 2, \ldots, n$ are guaranteed to exist as long as:

$$\det[A] \neq 0$$

where:

$$A = \begin{bmatrix} \frac{\partial g_1(y,x)}{\partial y_1} & \frac{\partial g_1(y,x)}{\partial y_2} & \cdots & \frac{\partial g_1(y,x)}{\partial y_n} \\ \frac{\partial g_2(y,x)}{\partial y_1} & \frac{\partial g_2(y,x)}{\partial y_2} & \cdots & \frac{\partial g_2(y,x)}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial g_n(y,x)}{\partial y_1} & \frac{\partial g_n(y,x)}{\partial y_2} & \cdots & \frac{\partial g_n(y,x)}{\partial y_n} \end{bmatrix}.$$

For example, suppose that we have a structural model where apple production $Q_1$ and honey production $Q_2$ are related to the amount of rainfall $R$, sunshine $S$ and temperature $T$ as:

$$3Q_1 + 2Q_2 + 3S + 5T = 2R - 7$$
$$Q_1 - 2Q_2 + 6R - 3T = -2S + 4.$$

Here $Q_1$ and $Q_2$ are the endogenous variables and $R$, $S$ and $T$ are the exogenous variables. Note that we have two equations and two endogenous variables.

This can then be written as two implicit functions as:

$$g_1(Q_1, Q_2, R, S, T) = 3Q_1 + 2Q_2 - 2R + 3S + 5T + 7 = 0$$
$$g_2(Q_1, Q_2, R, S, T) = Q_1 - 2Q_2 + 6R + 2S - 3T - 4 = 0.$$

Here:

$$A = \begin{bmatrix} \frac{\partial g_1}{\partial Q_1} & \frac{\partial g_1}{\partial Q_2} \\ \frac{\partial g_2}{\partial Q_1} & \frac{\partial g_2}{\partial Q_2} \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix}$$

and:

$$\det\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix} = -8 \neq 0$$

and so the explicit functions:

$$Q_1 = h_1(R, S, T)$$
$$Q_2 = h_2(R, S, T)$$

are guaranteed to exist by the Implicit Function Theorem.


We might therefore want to calculate $\frac{\partial Q_2}{\partial S} = \frac{\partial h_2(R, S, T)}{\partial S}$; that is, how sunshine affects honey production.

In general we might want to calculate $\frac{\partial y_i}{\partial x_j}$, in order to determine the nature of the relationship between the $j^{th}$ exogenous variable $x_j$ and the $i^{th}$ endogenous variable $y_i$. We do this by calculating the total differential, defined as:

Definition 10 The total differential of $g_i(y, x) = 0$ for $i = 1, 2, \ldots, n$ is defined as:

$$A\,dy + B\,dx = 0$$

where:

$$dy = \begin{bmatrix} dy_1 \\ dy_2 \\ \vdots \\ dy_n \end{bmatrix}, \qquad dx = \begin{bmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_m \end{bmatrix}$$

and $A$ is the $n \times n$ matrix defined above and $B$ is an $n \times m$ matrix given by:

$$B = \begin{bmatrix} \frac{\partial g_1(y,x)}{\partial x_1} & \frac{\partial g_1(y,x)}{\partial x_2} & \cdots & \frac{\partial g_1(y,x)}{\partial x_m} \\ \frac{\partial g_2(y,x)}{\partial x_1} & \frac{\partial g_2(y,x)}{\partial x_2} & \cdots & \frac{\partial g_2(y,x)}{\partial x_m} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial g_n(y,x)}{\partial x_1} & \frac{\partial g_n(y,x)}{\partial x_2} & \cdots & \frac{\partial g_n(y,x)}{\partial x_m} \end{bmatrix}.$$

The recipe for calculating the total differential, and from it $\frac{\partial y_i}{\partial x_j}$, is as follows:

1.5.2 A Recipe for Calculating $\frac{\partial y_i}{\partial x_j}$

Step 1. Identify the $n$ endogenous variables $y_1, y_2, \ldots, y_n$ and the $m$ exogenous variables $x_1, x_2, \ldots, x_m$ in your model.
Step 2. If the structural model is not written as a system of implicit functions, write them in the form:

$$g_1(y_1, y_2, \ldots, y_n, x_1, x_2, \ldots, x_m) = 0$$
$$g_2(y_1, y_2, \ldots, y_n, x_1, x_2, \ldots, x_m) = 0$$
$$\vdots$$
$$g_n(y_1, y_2, \ldots, y_n, x_1, x_2, \ldots, x_m) = 0.$$

Note you should have as many implicit functions as endogenous variables!

Step 3. Take the partial derivative of the first implicit function $g_1(y, x) = 0$ with respect to $y_1$. Call this $a_{11}$, so that $a_{11} = \frac{\partial g_1(y,x)}{\partial y_1}$. Multiply $a_{11}$ by $dy_1$ to get $a_{11} dy_1$. Continue this with the remaining endogenous variables $y_2, y_3, \ldots, y_n$ and add the results together to obtain:

$$a_{11} dy_1 + a_{12} dy_2 + \cdots + a_{1n} dy_n.$$

Note that $a_{1j} = \frac{\partial g_1(y,x)}{\partial y_j}$ is the coefficient on $dy_j$ in the first implicit function.

Step 4. Take the partial derivative of $g_1(y, x)$ with respect to $x_1$ and call this $b_{11}$. Multiply $b_{11}$ by $dx_1$ to get $b_{11} dx_1$. Repeat for $x_2, x_3, \ldots, x_m$ to get:

$$b_{11} dx_1 + b_{12} dx_2 + \cdots + b_{1m} dx_m.$$

Step 5. Add the results of Steps 3 and 4 together and set them equal to zero. This gives you the total differential for the first implicit function:

$$a_{11} dy_1 + a_{12} dy_2 + \cdots + a_{1n} dy_n + b_{11} dx_1 + b_{12} dx_2 + \cdots + b_{1m} dx_m = 0.$$

Now repeat this for the second implicit function to get:

$$a_{21} dy_1 + a_{22} dy_2 + \cdots + a_{2n} dy_n + b_{21} dx_1 + b_{22} dx_2 + \cdots + b_{2m} dx_m = 0$$

where $a_{2j} = \frac{\partial g_2(y,x)}{\partial y_j}$ and $b_{2j} = \frac{\partial g_2(y,x)}{\partial x_j}$.

Keep doing this for all $n$ implicit equations to obtain $n$ total differentials:

$$a_{11} dy_1 + a_{12} dy_2 + \cdots + a_{1n} dy_n + b_{11} dx_1 + b_{12} dx_2 + \cdots + b_{1m} dx_m = 0$$
$$a_{21} dy_1 + a_{22} dy_2 + \cdots + a_{2n} dy_n + b_{21} dx_1 + b_{22} dx_2 + \cdots + b_{2m} dx_m = 0$$
$$\vdots$$
$$a_{n1} dy_1 + a_{n2} dy_2 + \cdots + a_{nn} dy_n + b_{n1} dx_1 + b_{n2} dx_2 + \cdots + b_{nm} dx_m = 0$$

where $a_{ij} = \frac{\partial g_i(y,x)}{\partial y_j}$ and $b_{ij} = \frac{\partial g_i(y,x)}{\partial x_j}$ are the coefficients on $dy_j$ and $dx_j$ in the $i^{th}$ total differential.

Thus there are $n$ total differentials, one for each implicit equation.

Step 6. To get $\frac{\partial y_i}{\partial x_j}$ from the total differential, set all $dx$'s equal to zero except $dx_j$ and replace all remaining $d$'s with $\partial$'s to obtain:
$$a_{11} \partial y_1 + a_{12} \partial y_2 + \cdots + a_{1n} \partial y_n + b_{1j} \partial x_j = 0$$
$$a_{21} \partial y_1 + a_{22} \partial y_2 + \cdots + a_{2n} \partial y_n + b_{2j} \partial x_j = 0$$
$$\vdots$$
$$a_{n1} \partial y_1 + a_{n2} \partial y_2 + \cdots + a_{nn} \partial y_n + b_{nj} \partial x_j = 0.$$

Now put the terms involving $\partial x_j$ on the right-hand side to get:

$$a_{11} \partial y_1 + a_{12} \partial y_2 + \cdots + a_{1n} \partial y_n = -b_{1j} \partial x_j$$
$$a_{21} \partial y_1 + a_{22} \partial y_2 + \cdots + a_{2n} \partial y_n = -b_{2j} \partial x_j$$
$$\vdots$$
$$a_{n1} \partial y_1 + a_{n2} \partial y_2 + \cdots + a_{nn} \partial y_n = -b_{nj} \partial x_j$$

or:

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} \partial y_1 \\ \partial y_2 \\ \vdots \\ \partial y_n \end{bmatrix} = -\begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix} \partial x_j.$$

Divide both sides by the scalar $\partial x_j$ and re-write this in matrix notation as:

$$A \frac{\partial y}{\partial x_j} = -b_j$$

where $\frac{\partial y}{\partial x_j}$ and $b_j$ are $n \times 1$ column vectors given respectively by:

$$\frac{\partial y}{\partial x_j} = \begin{bmatrix} \frac{\partial y_1}{\partial x_j} \\ \frac{\partial y_2}{\partial x_j} \\ \vdots \\ \frac{\partial y_n}{\partial x_j} \end{bmatrix} \quad \text{and} \quad b_j = \begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix}.$$

Note that $b_j$ is the $j^{th}$ column of the matrix $B$.

To solve for $\frac{\partial y_i}{\partial x_j}$, use Cramer's rule to obtain:

$$\frac{\partial y_i}{\partial x_j} = \frac{\det[A_i(-b_j)]}{\det[A]}$$

where $A_i(-b_j)$ is the matrix you obtain by replacing the $i^{th}$ column of $A$ with the vector $-b_j$.
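The Cramer's-rule formula in Step 6 is only a few lines of linear algebra. Here is a minimal numerical sketch (Python with numpy, assumed available), previewing the numbers that Example 1 below works out by hand:

```python
import numpy as np

def partial(A, b_j, i):
    """dy_i/dx_j = det(A_i(-b_j)) / det(A), where A_i(-b_j) is A with
    its i-th column replaced by -b_j (i is 0-indexed here)."""
    Ai = A.copy()
    Ai[:, i] = -b_j
    return np.linalg.det(Ai) / np.linalg.det(A)

# Apple/honey system of Example 1: A holds the coefficients on
# (dQ1, dQ2); the column of B for dS is b_S = (3, 2).
A = np.array([[3.0,  2.0],
              [1.0, -2.0]])
b_S = np.array([3.0, 2.0])
print(partial(A, b_S, 1))   # dQ2/dS = 0.375 = 3/8
```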
1.5.3 Example 1

Given the apple/honey production example above, we have two linear implicit equations:

$$g_1(Q_1, Q_2, R, S, T) = 3Q_1 + 2Q_2 - 2R + 3S + 5T + 7 = 0$$
$$g_2(Q_1, Q_2, R, S, T) = Q_1 - 2Q_2 + 6R + 2S - 3T - 4 = 0.$$

Suppose we wish to calculate $\frac{\partial Q_2}{\partial S}$.

Using Total Differentials

Following Steps 3, 4 and 5 we have:

$$a_{11} = \frac{\partial g_1}{\partial Q_1} = 3, \quad a_{12} = \frac{\partial g_1}{\partial Q_2} = 2,$$
$$b_{11} = \frac{\partial g_1}{\partial R} = -2, \quad b_{12} = \frac{\partial g_1}{\partial S} = 3, \quad b_{13} = \frac{\partial g_1}{\partial T} = 5$$

so that the total differential for $g_1(Q_1, Q_2, R, S, T)$ is:

$$3\,dQ_1 + 2\,dQ_2 - 2\,dR + 3\,dS + 5\,dT = 0.$$

Similarly, since:

$$a_{21} = \frac{\partial g_2}{\partial Q_1} = 1, \quad a_{22} = \frac{\partial g_2}{\partial Q_2} = -2,$$
$$b_{21} = \frac{\partial g_2}{\partial R} = 6, \quad b_{22} = \frac{\partial g_2}{\partial S} = 2, \quad b_{23} = \frac{\partial g_2}{\partial T} = -3$$

the total differential for $g_2(Q_1, Q_2, R, S, T)$ is:

$$dQ_1 - 2\,dQ_2 + 6\,dR + 2\,dS - 3\,dT = 0.$$

Putting these two total differentials together, we have:

$$3\,dQ_1 + 2\,dQ_2 - 2\,dR + 3\,dS + 5\,dT = 0$$
$$dQ_1 - 2\,dQ_2 + 6\,dR + 2\,dS - 3\,dT = 0.$$

To calculate $\frac{\partial Q_2}{\partial S}$ from the total differential, set $dR = dT = 0$ and replace the remaining $d$'s with $\partial$'s, so that $dS = \partial S$, $dQ_1 = \partial Q_1$ and $dQ_2 = \partial Q_2$, to obtain:

$$3\,\partial Q_1 + 2\,\partial Q_2 + 3\,\partial S = 0$$
$$\partial Q_1 - 2\,\partial Q_2 + 2\,\partial S = 0$$

or in matrix notation:

$$\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} \partial Q_1 \\ \partial Q_2 \end{bmatrix} = \begin{bmatrix} -3 \\ -2 \end{bmatrix} \partial S.$$

Dividing both sides by $\partial S$ we then get:

$$\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} \frac{\partial Q_1}{\partial S} \\ \frac{\partial Q_2}{\partial S} \end{bmatrix} = \begin{bmatrix} -3 \\ -2 \end{bmatrix}.$$

Solving by Cramer's rule then yields:

$$\frac{\partial Q_2}{\partial S} = \frac{\det\begin{bmatrix} 3 & -3 \\ 1 & -2 \end{bmatrix}}{\det\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix}} = \frac{3}{8}.$$

Similarly, if we wish to calculate $\frac{\partial Q_1}{\partial R}$, set $dS = dT = 0$ in the total differential and replace the remaining $d$'s with $\partial$'s, so that $dR = \partial R$, $dQ_1 = \partial Q_1$ and $dQ_2 = \partial Q_2$. We therefore obtain:

$$3\,\partial Q_1 + 2\,\partial Q_2 - 2\,\partial R = 0$$
$$\partial Q_1 - 2\,\partial Q_2 + 6\,\partial R = 0$$

or in matrix notation:

$$\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} \partial Q_1 \\ \partial Q_2 \end{bmatrix} = \begin{bmatrix} 2 \\ -6 \end{bmatrix} \partial R.$$

Dividing both sides by $\partial R$:

$$\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} \frac{\partial Q_1}{\partial R} \\ \frac{\partial Q_2}{\partial R} \end{bmatrix} = \begin{bmatrix} 2 \\ -6 \end{bmatrix}.$$

Solving by Cramer's rule then yields:

$$\frac{\partial Q_1}{\partial R} = \frac{\det\begin{bmatrix} 2 & 2 \\ -6 & -2 \end{bmatrix}}{\det\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix}} = -1.$$

From the Reduced Form

Since the structural model is linear, we can easily find the reduced form using matrix algebra. We thus have from the implicit functions:

$$\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} + \begin{bmatrix} -2 & 3 & 5 \\ 6 & 2 & -3 \end{bmatrix} \begin{bmatrix} R \\ S \\ T \end{bmatrix} + \begin{bmatrix} 7 \\ -4 \end{bmatrix} = 0$$

so that:

$$\begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} = -\begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix}^{-1} \begin{bmatrix} -2 & 3 & 5 \\ 6 & 2 & -3 \end{bmatrix} \begin{bmatrix} R \\ S \\ T \end{bmatrix} - \begin{bmatrix} 3 & 2 \\ 1 & -2 \end{bmatrix}^{-1} \begin{bmatrix} 7 \\ -4 \end{bmatrix} = \begin{bmatrix} -1 & -\frac{5}{4} & -\frac{1}{2} \\ \frac{5}{2} & \frac{3}{8} & -\frac{7}{4} \end{bmatrix} \begin{bmatrix} R \\ S \\ T \end{bmatrix} + \begin{bmatrix} -\frac{3}{4} \\ -\frac{19}{8} \end{bmatrix}.$$

Writing these out, we get the reduced form:

$$Q_1 = h_1(R, S, T) = -R - \frac{5}{4}S - \frac{1}{2}T - \frac{3}{4}$$
$$Q_2 = h_2(R, S, T) = \frac{5}{2}R + \frac{3}{8}S - \frac{7}{4}T - \frac{19}{8}$$

so that:

$$\frac{\partial Q_1}{\partial R} = -1, \qquad \frac{\partial Q_2}{\partial S} = \frac{3}{8}.$$

1.5.4 Example 2: Supply and Demand I

First consider a demand and supply model given by:

$$Q^d = D(P), \quad D'(P) < 0 \quad \text{(the demand curve)}$$
$$Q^s = S(P), \quad S'(P) > 0 \quad \text{(the supply curve)}$$
$$Q^s = Q^d = Q \quad \text{(the equilibrium condition)}.$$

Here there are no exogenous variables; the first equation says that the demand curve slopes downward, the second that the supply curve slopes upward, and the third that in equilibrium supply equals demand.

We can re-write this as two implicit functions with two endogenous variables $Q$ and $P$ as:

$$g_1(Q, P) = Q - D(P) = 0$$
$$g_2(Q, P) = Q - S(P) = 0.$$

At this stage there are no exogenous variables.

Let us now introduce two exogenous variables: a tax $T_H$ paid by households and a tax $T_F$ paid by firms, so that the model becomes:

$$Q^d = D(P + T_H)$$
$$Q^s = S(P - T_F)$$
$$Q^s = Q^d = Q$$

or as implicit functions:

$$g_1(Q, P, T_H, T_F) = Q - D(P + T_H) = 0$$
$$g_2(Q, P, T_H, T_F) = Q - S(P - T_F) = 0.$$

This determines the equilibrium quantity $Q$ and price $P$ as functions of $T_H$ and $T_F$ as:

$$Q = h_1(T_H, T_F)$$
$$P = h_2(T_H, T_F).$$

To simplify the notation, define:

$$D_P \equiv D'(P + T_H)$$
$$S_P \equiv S'(P - T_F).$$

To find the effect of the tax $T_H$ on the endogenous variables $P$ and $Q$, we calculate the total differential, which is:

$$dQ - D_P dP - D_P dT_H = 0$$
$$dQ - S_P dP + S_P dT_F = 0$$

since:

$$a_{11} = \frac{\partial g_1(Q, P, T_H, T_F)}{\partial Q} = 1, \quad a_{12} = \frac{\partial g_1(Q, P, T_H, T_F)}{\partial P} = -D_P,$$
$$b_{11} = \frac{\partial g_1(Q, P, T_H, T_F)}{\partial T_H} = -D_P, \quad b_{12} = \frac{\partial g_1(Q, P, T_H, T_F)}{\partial T_F} = 0$$

and:

$$a_{21} = \frac{\partial g_2(Q, P, T_H, T_F)}{\partial Q} = 1, \quad a_{22} = \frac{\partial g_2(Q, P, T_H, T_F)}{\partial P} = -S_P,$$
$$b_{21} = \frac{\partial g_2(Q, P, T_H, T_F)}{\partial T_H} = 0, \quad b_{22} = \frac{\partial g_2(Q, P, T_H, T_F)}{\partial T_F} = S_P.$$

To find $\frac{\partial Q}{\partial T_H}$, set $dT_F = 0$ and replace all the remaining $d$'s in the total differential with $\partial$'s to obtain:

$$\partial Q - D_P \partial P - D_P \partial T_H = 0$$
$$\partial Q - S_P \partial P = 0$$

or:

$$\begin{bmatrix} 1 & -D_P \\ 1 & -S_P \end{bmatrix} \begin{bmatrix} \partial Q \\ \partial P \end{bmatrix} = \begin{bmatrix} D_P \\ 0 \end{bmatrix} \partial T_H.$$

Dividing both sides by $\partial T_H$ we obtain:

$$\begin{bmatrix} 1 & -D_P \\ 1 & -S_P \end{bmatrix} \begin{bmatrix} \frac{\partial Q}{\partial T_H} \\ \frac{\partial P}{\partial T_H} \end{bmatrix} = \begin{bmatrix} D_P \\ 0 \end{bmatrix}$$

so that from Cramer's rule:

$$\frac{\partial Q}{\partial T_H} = \frac{\det\begin{bmatrix} D_P & -D_P \\ 0 & -S_P \end{bmatrix}}{\det\begin{bmatrix} 1 & -D_P \\ 1 & -S_P \end{bmatrix}} = \frac{-D_P S_P}{-S_P + D_P} < 0$$

so that the tax on households reduces the equilibrium quantity.

1.5.5 Example 3: Supply and Demand II

There are of course many more variables than taxes which shift demand and supply curves. Let $a$ be any variable which shifts the demand curve to the right and $b$ any variable which shifts the supply curve to the right. We can write a more general supply and demand model as:

$$Q^d = D(P, a), \quad D_P \equiv \frac{\partial D(P, a)}{\partial P} < 0, \quad D_a \equiv \frac{\partial D(P, a)}{\partial a} > 0$$
$$Q^s = S(P, b), \quad S_P \equiv \frac{\partial S(P, b)}{\partial P} > 0, \quad S_b \equiv \frac{\partial S(P, b)}{\partial b} > 0$$
$$Q^s = Q^d = Q.$$

For example, if $a = T_H$ and $b = T_F$ and $D(P, a) = D(P + T_H)$ and $S(P, b) = S(P - T_F)$, then we would have the previous supply and demand model.

Here the exogenous variables are $a$ and $b$ and the endogenous variables are $Q$ and $P$. Written as a system of implicit equations we have:

$$g_1(Q, P, a, b) = Q - D(P, a) = 0$$
$$g_2(Q, P, a, b) = Q - S(P, b) = 0.$$

The total differential is then:

$$dQ - D_P dP - D_a da + 0\,db = 0$$
$$dQ - S_P dP + 0\,da - S_b db = 0.$$

Suppose we want to see what happens if the demand curve shifts to the right while the supply curve stays fixed. Set $db = 0$, $da = \partial a$, $dQ = \partial Q$ and $dP = \partial P$, so that from the total differential:

$$\begin{bmatrix} 1 & -D_P \\ 1 & -S_P \end{bmatrix} \begin{bmatrix} \frac{\partial Q}{\partial a} \\ \frac{\partial P}{\partial a} \end{bmatrix} = \begin{bmatrix} D_a \\ 0 \end{bmatrix}$$

so that using Cramer's rule:

$$\frac{\partial Q}{\partial a} = \frac{\det\begin{bmatrix} D_a & -D_P \\ 0 & -S_P \end{bmatrix}}{\det\begin{bmatrix} 1 & -D_P \\ 1 & -S_P \end{bmatrix}} = \frac{-D_a S_P}{-S_P + D_P} > 0$$

$$\frac{\partial P}{\partial a} = \frac{\det\begin{bmatrix} 1 & D_a \\ 1 & 0 \end{bmatrix}}{\det\begin{bmatrix} 1 & -D_P \\ 1 & -S_P \end{bmatrix}} = -\frac{D_a}{-S_P + D_P} > 0.$$

Thus anything which shifts the demand curve to the right increases $Q$ and $P$. Verify this using a diagram!
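A concrete check with assumed linear curves, $D(P, a) = a - P$ and $S(P, b) = b + P$ (our choice for illustration; any $D_P < 0 < S_P$ would do), confirms both signs:

```python
import sympy as sp

P, Q, a, b = sp.symbols('P Q a b')

# Assumed linear curves: demand Q = a - P, supply Q = b + P.
demand = sp.Eq(Q, a - P)
supply = sp.Eq(Q, b + P)

eq = sp.solve([demand, supply], [Q, P])
print(eq)   # {Q: a/2 + b/2, P: a/2 - b/2}

# A rightward demand shift (da > 0) raises both Q and P:
print(sp.diff(eq[Q], a), sp.diff(eq[P], a))   # 1/2, 1/2
```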

1.5.6 Example 4: The IS/LM Model

The Goods and Services Market

In the IS/LM model the goods and services market is described by:

$$c^d = c(y), \quad 0 < c'(y) < 1$$
$$i^d = i(r), \quad i'(r) < 0$$
$$g^d = g_o$$
$$y^d = c^d + i^d + g^d$$
$$y = y^d$$

where $c^d$ is consumption demand (with $c'(y)$ the marginal propensity to consume), $i^d$ is investment demand (which depends negatively on the rate of interest), $g^d$ is government expenditure, $y^d$ is aggregate demand and $y$ is GNP.

Here government expenditure is exogenous and equal to some constant $g_o$ (the $o$ subscript here is used to denote exogenous variables), while all other variables are endogenous, i.e., $c$, $i$, $y$ and $r$.

In equilibrium we require that $y = y^d = c(y) + i(r) + g_o$, so that:

$$y = c(y) + i(r) + g_o$$

which is the structural form of the IS curve. We can write this as an implicit function as:

$$g_1(y, r, g_o, m_o^s) = y - c(y) - i(r) - g_o = 0$$

where $y$ and $r$ are the endogenous variables, and $g_o$ and $m_o^s$ (the not-yet-introduced exogenous money supply) are the exogenous variables.

The total differential for the IS curve is then:

$$(1 - c'(y))\,dy - i'(r)\,dr - dg_o + 0\,dm_o^s = 0.$$

The Slope of the IS Curve

Let us now show that the IS curve is downward sloping. The IS curve is the set of all $(r, y)$ combinations which satisfy this implicit function for fixed values of $g_o$ and $m_o^s$. We can therefore think of $y$ as the independent or exogenous variable and $r$ as the dependent or endogenous variable, and take the total differential with $dg_o = dm_o^s = 0$ to obtain:

$$(1 - c'(y))\,dy - i'(r)\,dr = 0.$$

Solving for the slope of the IS curve, given by $\frac{dr}{dy}$, we obtain:

$$\frac{dr}{dy} = \frac{1 - c'(y)}{i'(r)} < 0$$

since $1 - c'(y) > 0$ and $i'(r) < 0$. Thus the IS curve is downward sloping.

The Money Market

The money market is described by:

$$m^d = k(y) + l(r), \quad k'(y) > 0, \quad l'(r) < 0$$
$$m^s = m_o^s$$

where $k(y)$ is the transactions demand for money, $l(r)$ is the speculative demand for money, which depends negatively on $r$, and $m_o^s$ is the exogenous supply of money.

In equilibrium $m^s = m^d$, so that we have:

$$m_o^s = k(y) + l(r)$$

which is the structural form of the LM curve. We can write the LM curve as an implicit function as:

$$g_2(y, r, g_o, m_o^s) = k(y) + l(r) - m_o^s = 0.$$

This has a total differential given by:

$$k'(y)\,dy + l'(r)\,dr + 0\,dg_o - dm_o^s = 0.$$

The Slope of the LM Curve

Let us now show that the LM curve is upward sloping. The LM curve is the set of all $(r, y)$ combinations which satisfy this implicit function for fixed values of $g_o$ and $m_o^s$. We can therefore think of $y$ as the independent or exogenous variable and $r$ as the dependent or endogenous variable, and take the total differential with $dg_o = dm_o^s = 0$ to obtain:

$$k'(y)\,dy + l'(r)\,dr = 0$$

and solve for $\frac{dr}{dy}$, the slope of the LM curve, as:

$$\frac{dr}{dy} = -\frac{k'(y)}{l'(r)} > 0$$

since $k'(y) > 0$ and $l'(r) < 0$.
The IS and LM Curves as Implicit Functions

Taken together, the IS and LM equations determine the structural form of the model, with two endogenous variables $y$ and $r$ and two exogenous variables $g_o$ and $m_o^s$. We can write the IS and LM equations respectively as:

$$g_1(y, r, g_o, m_o^s) = y - c(y) - i(r) - g_o = 0$$
$$g_2(y, r, g_o, m_o^s) = k(y) + l(r) - m_o^s = 0.$$

These two equations then determine a reduced form:

$$y = f_1(g_o, m_o^s)$$
$$r = f_2(g_o, m_o^s).$$

We wish to calculate such partial derivatives as $\frac{\partial y}{\partial g_o}$, which tells us how GNP responds to changes in government expenditure, or $\frac{\partial r}{\partial m_o^s}$, which tells us how interest rates respond to changes in the money supply.

The Total Differential

The total differential for the IS/LM model is given by:

$$(1 - c'(y))\,dy - i'(r)\,dr - dg_o = 0$$
$$k'(y)\,dy + l'(r)\,dr - dm_o^s = 0$$

since:

$$a_{11} = \frac{\partial g_1}{\partial y} = 1 - c'(y), \quad a_{12} = \frac{\partial g_1}{\partial r} = -i'(r),$$
$$b_{11} = \frac{\partial g_1}{\partial g_o} = -1, \quad b_{12} = \frac{\partial g_1}{\partial m_o^s} = 0$$

and:

$$a_{21} = \frac{\partial g_2}{\partial y} = k'(y), \quad a_{22} = \frac{\partial g_2}{\partial r} = l'(r),$$
$$b_{21} = \frac{\partial g_2}{\partial g_o} = 0, \quad b_{22} = \frac{\partial g_2}{\partial m_o^s} = -1.$$

The Effect of Changes in Government Expenditure

To find $\frac{\partial y}{\partial g_o}$, set $dm_o^s = 0$ and replace the $d$'s by $\partial$'s to obtain:

$$(1 - c'(y))\,\partial y - i'(r)\,\partial r - \partial g_o = 0$$
$$k'(y)\,\partial y + l'(r)\,\partial r = 0$$

or, writing this in matrix notation:

$$\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix} \begin{bmatrix} \partial y \\ \partial r \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \partial g_o.$$
Dividing both sides by $\partial g_o$ we obtain:

$$\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix} \begin{bmatrix} \frac{\partial y}{\partial g_o} \\ \frac{\partial r}{\partial g_o} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$

Finally, using Cramer's rule:

$$\frac{\partial y}{\partial g_o} = \frac{\det\begin{bmatrix} 1 & -i'(r) \\ 0 & l'(r) \end{bmatrix}}{\det\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix}} = \frac{l'(r)}{(1 - c'(y))\,l'(r) + i'(r)\,k'(y)}$$

so that simplifying we have:

$$\frac{\partial y}{\partial g_o} = \frac{1}{(1 - c'(y)) + \frac{i'(r)\,k'(y)}{l'(r)}}.$$

Since $0 < c'(y) < 1$ it follows that $0 < 1 - c'(y) < 1$, so the first term in the denominator is positive. The second term, $\frac{i'(r)k'(y)}{l'(r)}$, represents the crowding-out effect and is also positive, since $i'(r) < 0$, $l'(r) < 0$ and $k'(y) > 0$. We therefore conclude that $\frac{\partial y}{\partial g_o} > 0$, so that an increase in government expenditure increases GNP.

Using Cramer's rule to solve for $\frac{\partial r}{\partial g_o}$ we find that:

$$\frac{\partial r}{\partial g_o} = \frac{\det\begin{bmatrix} 1 - c'(y) & 1 \\ k'(y) & 0 \end{bmatrix}}{\det\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix}} = \frac{-k'(y)}{\det\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix}} > 0$$

so that an increase in government expenditure increases interest rates.

It follows from this that investment falls when government expenditure increases. Using the chain rule we have:

$$\frac{\partial i(r(g_o, m_o^s))}{\partial g_o} = i'(r(g_o, m_o^s))\,\frac{\partial r(g_o, m_o^s)}{\partial g_o} < 0$$

since $i'(r) < 0$.

The Effect of Changes in the Money Supply

Similarly, to find $\frac{\partial y}{\partial m_o^s}$, set $dg_o = 0$ and replace the $d$'s by $\partial$'s to obtain:

$$(1 - c'(y))\,\partial y - i'(r)\,\partial r = 0$$
$$k'(y)\,\partial y + l'(r)\,\partial r - \partial m_o^s = 0.$$
or:

$$\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix} \begin{bmatrix} \frac{\partial y}{\partial m_o^s} \\ \frac{\partial r}{\partial m_o^s} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

Using Cramer's rule it follows that:

$$\frac{\partial y}{\partial m_o^s} = \frac{\det\begin{bmatrix} 0 & -i'(r) \\ 1 & l'(r) \end{bmatrix}}{\det\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix}} > 0$$

and:

$$\frac{\partial r}{\partial m_o^s} = \frac{\det\begin{bmatrix} 1 - c'(y) & 0 \\ k'(y) & 1 \end{bmatrix}}{\det\begin{bmatrix} 1 - c'(y) & -i'(r) \\ k'(y) & l'(r) \end{bmatrix}} < 0$$

so that an increase in the money supply increases GNP and decreases interest rates. (Verify this!)

It follows from this that investment increases when the money supply increases, since using the chain rule we have:

$$\frac{\partial i(r(g_o, m_o^s))}{\partial m_o^s} = i'(r(g_o, m_o^s))\,\frac{\partial r(g_o, m_o^s)}{\partial m_o^s} > 0$$

since $i'(r) < 0$.
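To attach numbers to these multipliers, here is a sketch with assumed local derivatives (our values, chosen only to respect the sign restrictions $0 < c' < 1$, $i' < 0$, $k' > 0$, $l' < 0$):

```python
import numpy as np

# Assumed (not the text's) marginal responses at the equilibrium:
c1 = 0.8    # c'(y): marginal propensity to consume
i1 = -20.0  # i'(r) < 0
k1 = 0.5    # k'(y) > 0
l1 = -25.0  # l'(r) < 0

# Matrix of coefficients on (dy, dr) in the total differential:
A = np.array([[1 - c1, -i1],
              [k1,      l1]])

def multipliers(b):
    """Solve A @ [dy/dx, dr/dx] = -b for one exogenous variable x,
    where b is that variable's column of B."""
    return np.linalg.solve(A, -b)

dy_dg, dr_dg = multipliers(np.array([-1.0, 0.0]))   # government expenditure
dy_dm, dr_dm = multipliers(np.array([0.0, -1.0]))   # money supply
print(dy_dg, dr_dg)   # about 1.667 and 0.033: fiscal expansion raises y and r
print(dy_dm, dr_dm)   # about 1.333 and -0.013: monetary expansion raises y, lowers r
```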

1.6 Total Differentials and Optimization

1.6.1 A General Discussion

Economics deals with rational behavior, and rational behavior typically means maximizing or minimizing something. For example, utility and profits get maximized while costs get minimized.

Maximization and minimization imply first-order and second-order conditions. First-order conditions say that some expression of the endogenous and exogenous variables must equal zero, and so directly give implicit functions from which we can take total differentials.

Second-order conditions tell us something about a symmetric matrix, the Hessian $H$. For example, with an unconstrained maximization problem we know that $H$ must be negative definite, while for an unconstrained minimization problem $H$ must be positive definite.

It turns out that when we take the total differential, the coefficients on the $dy$'s, the endogenous variables, always come from $H$, the Hessian from the second-order conditions. This turns out to be invaluable in signing partial derivatives, since if we use Cramer's rule $\det[H]$ will always appear in the denominator.
1.6.2 A Quick Derivation of the Main Principle

Let $y$ be an $n \times 1$ vector of endogenous variables and $x$ an $m \times 1$ vector of exogenous variables. Suppose that our structural model involves either maximizing or minimizing:

$$f(y, x).$$

The optimal $y$, say $y^*$, will then satisfy the first-order conditions:

$$\frac{\partial f(y^*, x)}{\partial y_i} = 0, \quad \text{for } i = 1, 2, \ldots, n.$$

Here $y^*$ is the vector of endogenous variables and $x$ is the vector of exogenous variables. The first-order conditions then define a set of $n$ implicit functions given by:

$$g_i(y^*, x) = \frac{\partial f(y^*, x)}{\partial y_i} = 0, \quad \text{for } i = 1, 2, \ldots, n.$$

This determines the reduced form of $n$ equations:

$$y_i^* = y_i^*(x), \quad \text{for } i = 1, 2, \ldots, n.$$

From the second-order conditions we have that the Hessian:

$$H = \left[\frac{\partial^2 f(y^*, x)}{\partial y_i \partial y_j}\right]$$

is negative definite for a maximum and positive definite for a minimum.

Taking the total differential of $g(y^*, x) = 0$ we find that:

$$H\,dy + B\,dx = 0$$

where the $n \times n$ matrix $H$ and the $n \times m$ matrix $B$ are given by:

$$H(y^*, x) = \left[\frac{\partial^2 f(y^*, x)}{\partial y_i \partial y_j}\right], \qquad B = \left[\frac{\partial^2 f(y^*, x)}{\partial y_i \partial x_j}\right].$$

Using Cramer's rule, we then find as before that:

$$\frac{\partial y_i}{\partial x_j} = \frac{\det[H_i(-b_j)]}{\det[H]}.$$

From the second-order conditions we know that $H$ is either negative definite (for a maximum) or positive definite (for a minimum), and hence non-singular. Furthermore, we can determine the sign of $\det[H]$; for example, $\det[H] > 0$ if $H$ is positive definite. This can be used in determining whether $\frac{\partial y_i}{\partial x_j}$ is positive or negative when we use Cramer's rule.

While this derivation has skipped some details and is perhaps hard to follow, the essential point is easy to grasp, and it is this: when taking the total differential from an unconstrained optimization problem, the matrix of coefficients on the $dy$'s is identical to the Hessian $H$ from the second-order conditions. This fact can then be used in signing partial derivatives like $\frac{\partial y_i}{\partial x_j}$.

1.6.3 Pro…t Maximization


Consider the problem of profit maximization in the long run for a perfectly competitive firm with technology Q = F(L, K), where π is profits, Q is output, P is price, L is labour, K is capital, W is the nominal wage and R is the rental cost of capital. Profits are thus given by:
$$\pi(L, K, P, W, R) = P F(L, K) - WL - RK.$$
From the first-order conditions we have:
$$P\frac{\partial F(L^*, K^*)}{\partial L} - W = 0, \qquad P\frac{\partial F(L^*, K^*)}{\partial K} - R = 0.$$
Here L* and K* are the endogenous variables and P, W and R are the exogenous variables. The first-order conditions thus define two implicit functions:
$$g_1(L^*, K^*, P, W, R) = P\frac{\partial F(L^*, K^*)}{\partial L} - W = 0$$
$$g_2(L^*, K^*, P, W, R) = P\frac{\partial F(L^*, K^*)}{\partial K} - R = 0$$
which determine the reduced form:
$$L^* = L^*(P, W, R), \qquad K^* = K^*(P, W, R).$$
To simplify the notation let us make the following definitions:
$$F_L \equiv \frac{\partial F(L^*, K^*)}{\partial L}, \quad F_K \equiv \frac{\partial F(L^*, K^*)}{\partial K},$$
$$F_{LL} \equiv \frac{\partial^2 F(L^*, K^*)}{\partial L^2}, \quad F_{LK} \equiv \frac{\partial^2 F(L^*, K^*)}{\partial L \partial K}, \quad F_{KK} \equiv \frac{\partial^2 F(L^*, K^*)}{\partial K^2}.$$
The second-order conditions are then that the Hessian H given by:
$$H = \begin{bmatrix} P F_{LL} & P F_{LK} \\ P F_{LK} & P F_{KK} \end{bmatrix}$$
is negative definite. Therefore:
$$P F_{LL}(L^*, K^*) < 0, \quad P F_{KK}(L^*, K^*) < 0, \quad \det[H] > 0.$$

Taking the total differential we find that:
$$P F_{LL}\,dL^* + P F_{LK}\,dK^* + F_L\,dP - dW = 0$$
$$P F_{LK}\,dL^* + P F_{KK}\,dK^* + F_K\,dP - dR = 0$$
so that in matrix notation we have:
$$\underbrace{\begin{bmatrix} P F_{LL} & P F_{LK} \\ P F_{LK} & P F_{KK} \end{bmatrix}}_{=H} \begin{bmatrix} dL^* \\ dK^* \end{bmatrix} + \begin{bmatrix} F_L & -1 & 0 \\ F_K & 0 & -1 \end{bmatrix} \begin{bmatrix} dP \\ dW \\ dR \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
Note that the matrix of coefficients on the vector of dL* and dK*, the endogenous variables, is H, the Hessian from the second-order conditions.
Suppose now we wish to calculate ∂L*/∂W and ∂K*/∂W. Set dP = dR = 0 and change the remaining d's to ∂'s so that dL* = ∂L*, dK* = ∂K*, and dW = ∂W; put the terms involving ∂W on the right-hand side, and divide by ∂W to obtain:
$$\underbrace{\begin{bmatrix} P F_{LL} & P F_{LK} \\ P F_{LK} & P F_{KK} \end{bmatrix}}_{=H} \begin{bmatrix} \frac{\partial L^*}{\partial W} \\ \frac{\partial K^*}{\partial W} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$

Using Cramer's rule we find that:
$$\frac{\partial L^*}{\partial W} = \frac{\det\begin{bmatrix} 1 & P F_{LK} \\ 0 & P F_{KK} \end{bmatrix}}{\det[H]} = \frac{P F_{KK}}{\det[H]} < 0$$
since P F_{KK} < 0 and det[H] > 0 from the second-order conditions. Note how we have used the second-order conditions to show that the long-run labour demand curve is downward sloping.
From this we can also show using Cramer's rule that:
$$\frac{\partial K^*}{\partial W} = \frac{\det\begin{bmatrix} P F_{LL} & 1 \\ P F_{LK} & 0 \end{bmatrix}}{\det[H]} = \frac{-P F_{LK}}{\det[H]}$$
which can be either positive or negative depending on the sign of F_{LK}.
Suppose now we calculate ∂L*/∂R and ∂K*/∂R. From the total differential we can show that:
$$\begin{bmatrix} P F_{LL} & P F_{LK} \\ P F_{LK} & P F_{KK} \end{bmatrix} \begin{bmatrix} \frac{\partial L^*}{\partial R} \\ \frac{\partial K^*}{\partial R} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
so that:
$$\frac{\partial L^*}{\partial R} = \frac{\det\begin{bmatrix} 0 & P F_{LK} \\ 1 & P F_{KK} \end{bmatrix}}{\det[H]} = -\frac{P F_{LK}}{\det[H]}$$
$$\frac{\partial K^*}{\partial R} = \frac{\det\begin{bmatrix} P F_{LL} & 0 \\ P F_{LK} & 1 \end{bmatrix}}{\det[H]} = \frac{P F_{LL}}{\det[H]} < 0.$$

Thus, as one would expect, ∂K*/∂R < 0, so that the long-run demand curve for capital is downward sloping.
Comparing the expressions for ∂K*/∂W and ∂L*/∂R we find that:
$$\frac{\partial L^*}{\partial R} = \frac{\partial K^*}{\partial W} = -\frac{P F_{LK}}{\det[H]}.$$
This (very surprising!) symmetry result often occurs in these kinds of problems.
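One can check this symmetry numerically. The sketch below (an addition, not from the original) assumes a Cobb-Douglas technology F(L, K) = L^{1/3} K^{1/6}, solves the first-order conditions with scipy, and compares the two cross effects by finite differences:

```python
import numpy as np
from scipy.optimize import fsolve

alpha, beta, P = 1/3, 1/6, 1.0          # assumed technology, alpha + beta < 1

def foc(z, W, R):
    L, K = np.exp(z)                     # solve in logs so L, K stay positive
    return [P*alpha*L**(alpha-1)*K**beta - W,   # P F_L - W = 0
            P*beta*L**alpha*K**(beta-1) - R]    # P F_K - R = 0

def demands(W, R):
    return np.exp(fsolve(foc, [0.0, 0.0], args=(W, R)))

W0, R0, h = 1.0, 1.0, 1e-6
L0, K0 = demands(W0, R0)
dL_dR = (demands(W0, R0 + h)[0] - L0) / h
dK_dW = (demands(W0 + h, R0)[1] - K0) / h
print(dL_dR, dK_dW)    # the two cross effects agree, as the symmetry result predicts
```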

1.6.4 Constrained Optimization


Often in economics we are maximizing or minimizing subject to a constraint. Suppose we wish to choose y to maximize or minimize f(y, x) subject to the constraint h(y, x) = 0.
This leads to the Lagrangian:
$$\mathcal{L}(\lambda, y, x) = f(y, x) + \lambda h(y, x)$$
and the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*, y^*, x)}{\partial \lambda} = h(y^*, x) = 0,$$
$$\frac{\partial \mathcal{L}(\lambda^*, y^*, x)}{\partial y_i} = \frac{\partial f(y^*, x)}{\partial y_i} + \lambda^* \frac{\partial h(y^*, x)}{\partial y_i} = 0, \quad i = 1, 2, \ldots, n.$$
These n + 1 first-order conditions lead to n + 1 implicit functions given by:
$$g_1(\lambda^*, y^*, x) = h(y^*, x) = 0,$$
$$g_{i+1}(\lambda^*, y^*, x) = \frac{\partial f(y^*, x)}{\partial y_i} + \lambda^* \frac{\partial h(y^*, x)}{\partial y_i} = 0, \quad i = 1, 2, \ldots, n.$$
Here the endogenous variables are λ* and y*, while x is the vector of exogenous variables. This in turn determines the reduced form:
$$\lambda^* = \lambda^*(x), \qquad y_i^* = y_i^*(x) \text{ for } i = 1, 2, \ldots, n.$$
The second-order conditions lead to the (n+1) × (n+1) matrix H which is the Hessian of $\mathcal{L}(\lambda, y, x)$ evaluated at λ*, y*.
Consider now taking the total differential. We will then find that:
$$H\,dZ^* + B\,dx = 0$$
where:
$$dZ^* = \begin{bmatrix} d\lambda^* \\ dy^* \end{bmatrix}$$
is the (n+1) × 1 vector of endogenous variables.
The essential point is easy to grasp and is this:
When taking the total differential from a constrained optimization problem, the matrix of coefficients on dλ and the dy's is identical to the Hessian H from the second-order conditions. This fact can then be used in signing partial derivatives like ∂y_i/∂x_j.

1.6.5 Utility Maximization


Consider a household with a utility function:
$$U(X_1, X_2) = U_1(X_1) + U_2(X_2)$$
where:
$$U_1'(X_1) > 0, \quad U_1''(X_1) < 0, \qquad U_2'(X_2) > 0, \quad U_2''(X_2) < 0$$
and with income Y and prices P_1 and P_2.
The Lagrangian for the constrained maximization problem is then:
$$\mathcal{L}(\lambda, X_1, X_2) = U_1(X_1) + U_2(X_2) + \lambda(Y - P_1 X_1 - P_2 X_2)$$

which yields the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*, X_1^*, X_2^*)}{\partial \lambda} = Y - P_1 X_1^* - P_2 X_2^* = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, X_1^*, X_2^*)}{\partial X_1} = U_1'(X_1^*) - \lambda^* P_1 = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, X_1^*, X_2^*)}{\partial X_2} = U_2'(X_2^*) - \lambda^* P_2 = 0.$$
Here λ*, X_1* and X_2* are the endogenous variables while Y, P_1 and P_2 are the exogenous variables. This leads to 3 implicit functions:
$$g_1(\lambda^*, X_1^*, X_2^*, Y, P_1, P_2) = Y - P_1 X_1^* - P_2 X_2^* = 0$$
$$g_2(\lambda^*, X_1^*, X_2^*, Y, P_1, P_2) = U_1'(X_1^*) - \lambda^* P_1 = 0$$
$$g_3(\lambda^*, X_1^*, X_2^*, Y, P_1, P_2) = U_2'(X_2^*) - \lambda^* P_2 = 0$$
which determine the reduced form:
$$\lambda^* = \lambda^*(Y, P_1, P_2), \quad X_1^* = X_1^*(Y, P_1, P_2), \quad X_2^* = X_2^*(Y, P_1, P_2).$$

Note that λ* > 0 since, from the first-order conditions and U_1'(X_1) > 0:
$$\lambda^* = \frac{U_1'(X_1^*)}{P_1} = \frac{U_2'(X_2^*)}{P_2} > 0.$$
The second-order condition for this problem is that the Hessian of $\mathcal{L}$ evaluated at (λ*, X_1*, X_2*), given by:
$$H = \begin{bmatrix} 0 & -P_1 & -P_2 \\ -P_1 & U_1''(X_1^*) & 0 \\ -P_2 & 0 & U_2''(X_2^*) \end{bmatrix}$$

satisfies:
$$\det[H] = \det\begin{bmatrix} 0 & -P_1 & -P_2 \\ -P_1 & U_1''(X_1^*) & 0 \\ -P_2 & 0 & U_2''(X_2^*) \end{bmatrix} = -(P_1)^2 U_2''(X_2^*) - (P_2)^2 U_1''(X_1^*) > 0.$$

Taking the total differential we find that:
$$0\,d\lambda^* - P_1\,dX_1^* - P_2\,dX_2^* + dY - X_1^*\,dP_1 - X_2^*\,dP_2 = 0$$
$$-P_1\,d\lambda^* + U_1''\,dX_1^* + 0\,dX_2^* - \lambda^*\,dP_1 = 0$$
$$-P_2\,d\lambda^* + 0\,dX_1^* + U_2''\,dX_2^* - \lambda^*\,dP_2 = 0$$
where to simplify the notation we define:
$$U_1'' \equiv U_1''(X_1^*), \quad U_2'' \equiv U_2''(X_2^*).$$


Suppose we wish to calculate ∂X_1*/∂P_1. Then setting dP_2 = dY = 0 and changing all the d's to ∂'s we obtain:
$$0\,\partial\lambda^* - P_1\,\partial X_1^* - P_2\,\partial X_2^* - X_1^*\,\partial P_1 = 0$$
$$-P_1\,\partial\lambda^* + U_1''\,\partial X_1^* + 0\,\partial X_2^* - \lambda^*\,\partial P_1 = 0$$
$$-P_2\,\partial\lambda^* + 0\,\partial X_1^* + U_2''\,\partial X_2^* = 0.$$
Writing this in matrix notation and dividing both sides by ∂P_1 we obtain:
$$\underbrace{\begin{bmatrix} 0 & -P_1 & -P_2 \\ -P_1 & U_1'' & 0 \\ -P_2 & 0 & U_2'' \end{bmatrix}}_{=H} \begin{bmatrix} \frac{\partial\lambda^*}{\partial P_1} \\ \frac{\partial X_1^*}{\partial P_1} \\ \frac{\partial X_2^*}{\partial P_1} \end{bmatrix} = \begin{bmatrix} X_1^* \\ \lambda^* \\ 0 \end{bmatrix}$$
so that:
$$\frac{\partial X_1^*}{\partial P_1} = \frac{\det\begin{bmatrix} 0 & X_1^* & -P_2 \\ -P_1 & \lambda^* & 0 \\ -P_2 & 0 & U_2'' \end{bmatrix}}{\det[H]} = \frac{P_1 X_1^* U_2'' - (P_2)^2 \lambda^*}{\det[H]} < 0$$
since det[H] > 0 from the second-order conditions, U_2'' < 0 and λ* > 0.
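As a quick check of this sign result, here is a sympy sketch (not part of the original text) that assumes the specific log utility U(X_1, X_2) = ln X_1 + ln X_2, which satisfies the curvature conditions above, solves the first-order conditions in closed form, and differentiates the resulting demand:

```python
import sympy as sp

P1, P2, Y, lam, X1, X2 = sp.symbols('P1 P2 Y lam X1 X2', positive=True)

# assumed utility: U1(X1) = ln(X1), U2(X2) = ln(X2), so U'' < 0 as required
focs = [Y - P1*X1 - P2*X2,      # budget constraint
        1/X1 - lam*P1,          # U1'(X1*) = lam* P1
        1/X2 - lam*P2]          # U2'(X2*) = lam* P2
sol = sp.solve(focs, [lam, X1, X2], dict=True)[0]

print(sol[X1])                  # Y/(2*P1)
print(sp.diff(sol[X1], P1))     # -Y/(2*P1**2) < 0, as derived above
```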

1.6.6 Cost Minimization

For which of you, desiring to build a tower, doth not first sit down and count the cost... Luke 14:28

Suppose a firm wishes to minimize costs C given by:
$$C = WL + RK$$
subject to the constraint that Q units are produced, where:
$$Q = F(L, K).$$
This leads to the Lagrangian:
$$\mathcal{L}(\lambda, L, K) = WL + RK + \lambda(Q - F(L, K))$$
and the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*, L^*, K^*)}{\partial \lambda} = Q - F(L^*, K^*) = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, L^*, K^*)}{\partial L} = W - \lambda^* \frac{\partial F(L^*, K^*)}{\partial L} = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, L^*, K^*)}{\partial K} = R - \lambda^* \frac{\partial F(L^*, K^*)}{\partial K} = 0.$$
Here there are three endogenous variables λ*, L*, K* and three exogenous variables: Q, W, R.
The first-order conditions lead to three implicit functions:
$$g_1(\lambda^*, L^*, K^*, Q, W, R) = Q - F(L^*, K^*) = 0$$
$$g_2(\lambda^*, L^*, K^*, Q, W, R) = W - \lambda^* \frac{\partial F(L^*, K^*)}{\partial L} = 0$$
$$g_3(\lambda^*, L^*, K^*, Q, W, R) = R - \lambda^* \frac{\partial F(L^*, K^*)}{\partial K} = 0$$
which determine the reduced form:
$$\lambda^* = \lambda^*(Q, W, R), \quad L^* = L^*(Q, W, R), \quad K^* = K^*(Q, W, R).$$
L*(Q, W, R) and K*(Q, W, R) are the conditional factor demands for labour and capital.
To simplify the notation define:
$$F_L \equiv \frac{\partial F(L^*, K^*)}{\partial L}, \quad F_K \equiv \frac{\partial F(L^*, K^*)}{\partial K},$$
$$F_{LL} \equiv \frac{\partial^2 F(L^*, K^*)}{\partial L^2}, \quad F_{LK} \equiv \frac{\partial^2 F(L^*, K^*)}{\partial L \partial K}, \quad F_{KK} \equiv \frac{\partial^2 F(L^*, K^*)}{\partial K^2}$$
in which case the second-order condition is that the matrix:
$$H = \begin{bmatrix} 0 & -F_L & -F_K \\ -F_L & -\lambda^* F_{LL} & -\lambda^* F_{LK} \\ -F_K & -\lambda^* F_{LK} & -\lambda^* F_{KK} \end{bmatrix}$$



satisfies:
$$\det[H] < 0.$$
Taking the total differential we find that:
$$0\,d\lambda^* - F_L\,dL^* - F_K\,dK^* + dQ = 0$$
$$-F_L\,d\lambda^* - \lambda^* F_{LL}\,dL^* - \lambda^* F_{LK}\,dK^* + dW = 0$$
$$-F_K\,d\lambda^* - \lambda^* F_{LK}\,dL^* - \lambda^* F_{KK}\,dK^* + dR = 0.$$
Suppose we wish to calculate ∂L*/∂W. Setting dR = dQ = 0 and changing the remaining d's to ∂'s we find that:
$$0\,\partial\lambda^* - F_L\,\partial L^* - F_K\,\partial K^* = 0$$
$$-F_L\,\partial\lambda^* - \lambda^* F_{LL}\,\partial L^* - \lambda^* F_{LK}\,\partial K^* + \partial W = 0$$
$$-F_K\,\partial\lambda^* - \lambda^* F_{LK}\,\partial L^* - \lambda^* F_{KK}\,\partial K^* = 0.$$
Writing this in matrix notation and dividing both sides by ∂W we obtain:
$$\underbrace{\begin{bmatrix} 0 & -F_L & -F_K \\ -F_L & -\lambda^* F_{LL} & -\lambda^* F_{LK} \\ -F_K & -\lambda^* F_{LK} & -\lambda^* F_{KK} \end{bmatrix}}_{=H} \begin{bmatrix} \frac{\partial\lambda^*}{\partial W} \\ \frac{\partial L^*}{\partial W} \\ \frac{\partial K^*}{\partial W} \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix}$$
so that:
$$\frac{\partial L^*}{\partial W} = \frac{\det\begin{bmatrix} 0 & 0 & -F_K \\ -F_L & -1 & -\lambda^* F_{LK} \\ -F_K & 0 & -\lambda^* F_{KK} \end{bmatrix}}{\det[H]} = \frac{(F_K)^2}{\det[H]} < 0$$
since det[H] < 0 from the second-order conditions and (F_K)² > 0.
As an exercise show that:
$$\frac{\partial K^*}{\partial R} < 0$$
and that:
$$\frac{\partial L^*}{\partial R} = \frac{\partial K^*}{\partial W}.$$
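For a quick check of both exercise results, here is a sympy sketch (an addition, not from the original). It assumes the constant-returns technology F(L, K) = √(LK), for which the conditional factor demands can be derived by hand as L* = Q√(R/W), K* = Q√(W/R) with λ* = 2√(WR); the code verifies the first-order conditions and then the two signs:

```python
import sympy as sp

Q, W, R = sp.symbols('Q W R', positive=True)

# hand-derived solution for the assumed technology F(L,K) = sqrt(L*K)
Lstar = Q*sp.sqrt(R/W)
Kstar = Q*sp.sqrt(W/R)
lamstar = 2*sp.sqrt(W*R)

# the first-order conditions are satisfied:
assert sp.simplify(Q - sp.sqrt(Lstar*Kstar)) == 0
assert sp.simplify(W - lamstar*Kstar/(2*sp.sqrt(Lstar*Kstar))) == 0
assert sp.simplify(R - lamstar*Lstar/(2*sp.sqrt(Lstar*Kstar))) == 0

print(sp.diff(Kstar, R))                                    # -Q*sqrt(W)/(2*R**(3/2)) < 0
print(sp.simplify(sp.diff(Lstar, R) - sp.diff(Kstar, W)))   # 0: the symmetry result
```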
Chapter 2

The Envelope Theorem

2.1 Unconstrained Optimization


2.1.1 Profit Maximization in the Short-Run I
Consider a firm with a production function Q = √L, so that the only factor of production is labour L. To simplify things a bit, instead of working with nominal profits given by:
$$\Pi = PQ - WL$$
we will work with real profits given by:
$$\pi = \frac{\Pi}{P} = Q - wL.$$
Real profits can be written as a function of L and the real wage w = W/P as:
$$\pi(L, w) = \sqrt{L} - wL.$$
Let us call π(L, w) the naive profit function. It is naive because it gives profits for any L, whether it be rational or irrational.
As an example of an irrational choice, consider the case where w = 1/2 and the firm hires a million workers, or L = 1,000,000. Real profits are then:
$$\pi\left(1000000, \tfrac{1}{2}\right) = \sqrt{1000000} - \tfrac{1}{2} \times 1000000 = -499{,}000.$$
Clearly this is not rational; the firm could do much better by simply shutting down its operations and setting L = 0, in which case it would receive zero profits since:
$$\pi\left(0, \tfrac{1}{2}\right) = \sqrt{0} - \tfrac{1}{2} \times 0 = 0.$$


Of course in economics we do not believe that the firm will choose irrational values for L. Instead it will choose the profit-maximizing L* which satisfies the first-order condition:
$$\frac{\partial \pi(L^*, w)}{\partial L} = \frac{1}{2\sqrt{L^*}} - w = 0.$$
This implicitly defines the labour demand curve L* = L*(w). In this case we can find L*(w) explicitly as:
$$L^*(w) = (2w)^{-2}.$$
Thus facing a real wage of w = 1/2 the clever or profit-maximizing firm would hire:
$$L^* = \left(2 \times \tfrac{1}{2}\right)^{-2} = 1$$
or one worker, in which case its profits would be:
$$\pi\left(1, \tfrac{1}{2}\right) = \sqrt{1} - \tfrac{1}{2} \times 1 = \tfrac{1}{2}.$$

Consider now replacing L in π(L, w) with L*(w) = (2w)^{-2}. Denote this by π*(w), which I will call the clever profit function. This is given by:
$$\pi^*(w) = \pi(L^*(w), w) = \sqrt{L^*(w)} - wL^*(w) = \sqrt{(2w)^{-2}} - w(2w)^{-2} = \frac{1}{2w} - \frac{1}{4w} = \frac{1}{4w}$$
and so:
$$\pi^*(w) = \frac{1}{4w}.$$
Note that since:
$$\pi^{*\prime\prime}(w) = \frac{1}{2w^3} > 0,$$
π*(w) is convex. We will see later that the profit function is always convex.
π*(w) gives profits as a function of the real wage w on the assumption that the firm hires the profit-maximizing amount of labour. For example if w = 1/2 the profit-maximizing firm will receive real profits equal to $\pi^*\left(\tfrac{1}{2}\right) = \frac{1}{4 \times \frac{1}{2}} = \tfrac{1}{2}$.
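The naive/clever distinction is easy to see numerically. A minimal Python sketch (an addition to the text) comparing the irrational and rational choices above:

```python
import numpy as np

def pi(L, w):                  # the naive profit function: sqrt(L) - w*L
    return np.sqrt(L) - w*L

w = 0.5
Lstar = (2*w)**-2              # the clever choice: L*(w) = (2w)^(-2) = 1
print(pi(1_000_000, w))        # -499000.0: the irrational choice
print(pi(Lstar, w))            # 0.5 = pi*(w) = 1/(4w): the clever choice
```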

Consider taking the derivative of the naive profit function with respect to the real wage w. We obtain:
$$\frac{\partial \pi(L, w)}{\partial w} = \frac{\partial}{\partial w}\left(\sqrt{L} - wL\right) = -L.$$
Now suppose we do the same thing with the clever profit function. We find that:
$$\frac{d\pi^*(w)}{dw} = \frac{d}{dw}\left(\frac{1}{4w}\right) = -\frac{1}{4w^2} = -L^*(w).$$
Note the parallel between the two results, where we get −L and −L*(w) respectively. This is no coincidence but follows from the envelope theorem, which states that:
$$\frac{d\pi^*(w)}{dw} = \left.\frac{\partial \pi(L, w)}{\partial w}\right|_{L = L^*(w)} = \frac{\partial \pi(L^*(w), w)}{\partial w};$$
that is, to find the derivative of the clever function π*(w) with respect to w you only need to take the derivative of the naive function π(L, w) with respect to w and then replace L by L*(w).
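This equality is mechanical enough to verify symbolically. A short sympy sketch (an addition to the text) for the √L example:

```python
import sympy as sp

w, L = sp.symbols('w L', positive=True)
pi = sp.sqrt(L) - w*L                       # naive profit function
Lstar = sp.solve(sp.diff(pi, L), L)[0]      # first-order condition: L* = 1/(4*w**2)
pistar = sp.simplify(pi.subs(L, Lstar))     # clever profit function: 1/(4*w)

lhs = sp.diff(pistar, w)                    # derivative of the clever function
rhs = sp.diff(pi, w).subs(L, Lstar)         # naive derivative, then L -> L*(w)
print(sp.simplify(lhs - rhs))               # 0: the envelope theorem
```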

2.1.2 Envelope Theorem I


Let f(y, x) be the naive version of a function to be optimized (i.e., either maximized or minimized) over y. Here x is some exogenous variable.
From the first-order conditions we have:
$$\frac{\partial f(y^*, x)}{\partial y} = 0.$$
This first-order condition implicitly defines the optimal value y* as a function of x, or:
$$y^* = y^*(x).$$
Suppose we replace y in f(y, x) with y*(x) to obtain the clever version of the function, so that:
$$f^*(x) = f(y^*(x), x).$$
To calculate the derivative of the clever function with respect to x, use the chain rule to obtain:
$$\frac{df^*(x)}{dx} = \left(\frac{\partial f(y^*(x), x)}{\partial y}\right)\frac{dy^*(x)}{dx} + \frac{\partial f(y^*(x), x)}{\partial x}.$$

However, by the first-order conditions:
$$\frac{\partial f(y^*(x), x)}{\partial y} = 0$$
and so the term in brackets is zero. We therefore have:
$$\frac{df^*(x)}{dx} = \left.\frac{\partial f(y, x)}{\partial x}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x}.$$
Thus the derivative of the clever version of the function is identical to the partial derivative of the naive version of the function with respect to x.
We therefore have:
Theorem 11 (Envelope Theorem I): If f(y, x) is the naive function and f*(x) is the clever function then:
$$\frac{df^*(x)}{dx} = \left.\frac{\partial f(y, x)}{\partial x}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x}.$$

2.1.3 Profit Maximization in the Short-Run II


Let us now consider the more general problem where real profits are given by:
$$\pi(L, w) = f(L) - wL$$
where instead of f(L) = √L we simply assume that the short-run production function satisfies f'(L) > 0 and f''(L) < 0. Profit maximization requires that L*(w) satisfies:
$$\frac{\partial \pi(L^*(w), w)}{\partial L} = f'(L^*(w)) - w = 0$$
from which, using total differentials, we can show that:
$$\frac{dL^*(w)}{dw} = \frac{1}{f''(L^*(w))} < 0$$
so that the short-run labour demand curve slopes downward.
The clever profit function is then defined as:
$$\pi^*(w) = \pi(L^*(w), w) = f(L^*(w)) - wL^*(w).$$
From the envelope theorem we then have, since:
$$\frac{\partial \pi(L, w)}{\partial w} = -L,$$

that one can obtain the negative of the demand for labour L*(w) by differentiating the clever profit function with respect to w:
$$\frac{d\pi^*(w)}{dw} = \left.\frac{\partial \pi(L, w)}{\partial w}\right|_{L = L^*(w)} = -L^*(w).$$
Furthermore the clever profit function is convex, since we have already shown that dL*(w)/dw < 0 and hence:
$$\frac{d^2\pi^*(w)}{dw^2} = -\frac{dL^*(w)}{dw} > 0.$$
dw dw

2.1.4 Profit Maximization in the Long-Run


We will now apply the envelope theorem to profit maximization in the long run. This will illustrate how the envelope theorem works when there are many endogenous and exogenous variables.
Consider a firm with production function Q = F(L, K), where Q is output, L is labour and K is the amount of capital. The firm faces three prices P, W and R.
The naive profit function is:
$$\pi(L, K, P, W, R) = P F(L, K) - WL - RK.$$
Again π(L, K, P, W, R) is naive because it gives profits for any choice of L and K, even irrational choices that do not maximize profits.
Again, in economics we do not believe firms are irrational, that they will just choose any L and K. Instead they will choose the L* and K* which maximize profits. These satisfy the first-order conditions:
$$P\frac{\partial F(L^*, K^*)}{\partial L} - W = 0, \qquad P\frac{\partial F(L^*, K^*)}{\partial K} - R = 0$$
which implicitly define:
$$L^* = L^*(P, W, R), \qquad K^* = K^*(P, W, R),$$
that is, the demand for labour L*(P, W, R) and the demand for capital K*(P, W, R).
If we replace L and K in Q = F(L, K) with L* and K* then we obtain the firm's supply curve Q*(P, W, R) given by:
$$Q^*(P, W, R) = F(L^*(P, W, R), K^*(P, W, R)).$$



If we replace L and K in π(L, K, P, W, R) with L*(P, W, R) and K*(P, W, R), then we get the clever profit function:
$$\pi^*(P, W, R) = \pi(L^*(P, W, R), K^*(P, W, R), P, W, R) = P F(L^*, K^*) - WL^* - RK^*.$$
Suppose we attempt to calculate a partial derivative of the clever profit function, say ∂π*(P, W, R)/∂W. At first sight it would appear that:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(P F(L^*, K^*) - WL^* - RK^*\right) = -L^*$$
which looks almost the same as the derivative of the naive profit function:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial W} = -L.$$
However, this line of reasoning is based on an error, since the profit-maximizing amount of labour L*(P, W, R) and capital K*(P, W, R) also depend on W. Nevertheless, this incorrect reasoning still gives the correct answer! This is, again, the envelope theorem.
is, again, the envelope theorem.
Doing the differentiation properly we obtain the correct expression:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R) + \left(P\frac{\partial F(L^*, K^*)}{\partial L} - W\right)\frac{\partial L^*(P, W, R)}{\partial W} + \left(P\frac{\partial F(L^*, K^*)}{\partial K} - R\right)\frac{\partial K^*(P, W, R)}{\partial W}.$$
However, by the first-order conditions $P\frac{\partial F(L^*, K^*)}{\partial L} - W = 0$ and $P\frac{\partial F(L^*, K^*)}{\partial K} - R = 0$. Thus the last two terms in brackets are zero and we are back to (what was actually correct, but now for the correct reason):
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
or alternatively:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = \left.\frac{\partial \pi(L, K, P, W, R)}{\partial W}\right|_{\substack{L = L^*(P, W, R) \\ K = K^*(P, W, R)}}.$$

2.1.5 Envelope Theorem II


Let f(y, x) be the naive version of a function to be optimized (either maximized or minimized) over y, where y = [y_i] is an n × 1 vector of endogenous variables and x = [x_i] is an m × 1 vector of exogenous variables.

From the first-order conditions we then have that the optimal value of y, denoted y*, will satisfy:
$$\frac{\partial f(y^*, x)}{\partial y_i} = 0 \text{ for } i = 1, 2, \ldots, n.$$
This implicitly defines the y_i*'s as functions of the exogenous variables:
$$y_i^* = y_i^*(x_1, x_2, \ldots, x_m) \text{ for } i = 1, 2, \ldots, n.$$
Suppose we replace y in f(y, x) with y*(x) to obtain the clever version of the function, f*(x). Thus:
$$f^*(x) \equiv f(y^*(x), x)$$
or more explicitly:
$$f^*(x_1, x_2, \ldots, x_m) = f(y_1^*(x), y_2^*(x), \ldots, y_n^*(x), x_1, x_2, \ldots, x_m).$$

To calculate the partial derivative ∂f*(x_1, x_2, ..., x_m)/∂x_i, use the chain rule to obtain:
$$\frac{\partial f^*(x_1, \ldots, x_m)}{\partial x_i} = \left(\frac{\partial f(y^*(x), x)}{\partial y_1}\right)\frac{\partial y_1^*(x)}{\partial x_i} + \cdots + \left(\frac{\partial f(y^*(x), x)}{\partial y_n}\right)\frac{\partial y_n^*(x)}{\partial x_i} + \frac{\partial f(y^*(x), x)}{\partial x_i}.$$
By the first-order conditions:
$$\frac{\partial f(y^*(x), x)}{\partial y_1} = \frac{\partial f(y^*(x), x)}{\partial y_2} = \cdots = \frac{\partial f(y^*(x), x)}{\partial y_n} = 0$$
so all of the terms in brackets are zero. We therefore have:
Theorem 12 (Envelope Theorem II): If f(y, x) is the naive function and f*(x) is the clever function then:
$$\frac{\partial f^*(x_1, \ldots, x_m)}{\partial x_i} = \left.\frac{\partial f(y, x)}{\partial x_i}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x_i}.$$
Thus the partial derivative of the clever version of the function is identical to the partial derivative of the naive version of the function when we replace y by the optimal y*(x).

2.1.6 A Recipe for Unconstrained Optimization


Given the original naive version of the function f(y, x) and the clever version of the function f*(x) = f(y*(x), x), suppose you wish to calculate:
$$\frac{\partial f^*(x)}{\partial x_i}.$$
Step 1: First calculate the partial derivative of the naive version of the function with respect to the i-th exogenous variable, or:
$$\frac{\partial f(y, x)}{\partial x_i}.$$
Step 2: Replace all occurrences of the endogenous variables y with the optimal values y*(x). This gives the correct answer, that is:
$$\frac{\partial f^*(x)}{\partial x_i} = \left.\frac{\partial f(y, x)}{\partial x_i}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x_i}.$$

2.1.7 The Pro…t Function


The following results, which relate the factor demands and the supply curve to derivatives of the clever profit function, are collectively known as Hotelling's lemma.

The Demand for Labour


Let us use the envelope theorem recipe to calculate ∂π*(P, W, R)/∂W.
Step 1: Calculate the derivative of the naive profit function:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(P F(L, K) - WL - RK\right) = -L.$$
Step 2: Replace all occurrences of L and K with L* and K* to obtain:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
which is the (negative of the) firm's demand curve for labour.

The Demand for Capital


To calculate ∂π*(P, W, R)/∂R:
Step 1: Calculate the partial derivative of the naive profit function with respect to R to obtain:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial R} = \frac{\partial}{\partial R}\left(P F(L, K) - WL - RK\right) = -K.$$
Step 2: Replace all occurrences of L and K with L* and K* to obtain:
$$\frac{\partial \pi^*(P, W, R)}{\partial R} = -K^*(P, W, R)$$
which is the (negative of the) firm's demand curve for capital.

The Supply Curve


To calculate ∂π*(P, W, R)/∂P:
Step 1: Calculate the derivative of the naive profit function with respect to P:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial P} = \frac{\partial}{\partial P}\left(P F(L, K) - WL - RK\right) = F(L, K).$$
Step 2: Replace all occurrences of L and K with L* and K* to obtain:
$$\frac{\partial \pi^*(P, W, R)}{\partial P} = F(L^*(P, W, R), K^*(P, W, R)) = Q^*(P, W, R)$$
since Q = F(L, K). This is then the firm's supply curve.

An Example
Suppose the firm's production function Q = F(L, K) is Cobb-Douglas, or:
$$Q = L^\alpha K^\beta$$
where α + β < 1 (the production function is homogeneous of degree α + β). Then it can be shown that the clever profit function π*(P, W, R) is given by:
$$\pi^*(P, W, R) = (1 - \alpha - \beta)\, \alpha^{\frac{\alpha}{1-\alpha-\beta}}\, \beta^{\frac{\beta}{1-\alpha-\beta}}\, P^{\frac{1}{1-\alpha-\beta}}\, W^{\frac{-\alpha}{1-\alpha-\beta}}\, R^{\frac{-\beta}{1-\alpha-\beta}}.$$

Note that we can rewrite this as:
$$\pi^*(P, W, R) = B P^c W^d R^e$$
where:
$$B = (1 - \alpha - \beta)\, \alpha^{\frac{\alpha}{1-\alpha-\beta}} \beta^{\frac{\beta}{1-\alpha-\beta}}, \quad c = \frac{1}{1-\alpha-\beta}, \quad d = \frac{-\alpha}{1-\alpha-\beta}, \quad e = \frac{-\beta}{1-\alpha-\beta},$$
which also has the Cobb-Douglas functional form.
Note that the profit function is homogeneous of degree 1 since:
$$c + d + e = \frac{1}{1-\alpha-\beta} + \frac{-\alpha}{1-\alpha-\beta} + \frac{-\beta}{1-\alpha-\beta} = 1$$
with c ≥ 0, d ≤ 0 and e ≤ 0.
Sometimes we start with the production function, derive L*(P, W, R), K*(P, W, R) and Q*(P, W, R), and then π*(P, W, R). However, we can also start with π*(P, W, R) and from there derive L*(P, W, R), K*(P, W, R) and Q*(P, W, R).
For example, suppose the clever profit function is given by:
$$\pi^*(P, W, R) = 10 P^{\frac{5}{2}} W^{-\frac{3}{4}} R^{-\frac{3}{4}}.$$
To calculate the firm's supply curve Q*(P, W, R), simply differentiate π*(P, W, R) with respect to P. That is:
$$Q^*(P, W, R) = \frac{\partial \pi^*(P, W, R)}{\partial P} = \frac{\partial}{\partial P}\left(10 P^{\frac{5}{2}} W^{-\frac{3}{4}} R^{-\frac{3}{4}}\right) = 25 P^{\frac{3}{2}} W^{-\frac{3}{4}} R^{-\frac{3}{4}}.$$

Note that Q*(P, W, R) is homogeneous of degree 0.
To calculate the firm's demand curve for labour L*(P, W, R), simply differentiate π*(P, W, R) with respect to W and multiply by −1. That is:
$$L^*(P, W, R) = -\frac{\partial \pi^*(P, W, R)}{\partial W} = -\frac{\partial}{\partial W}\left(10 P^{\frac{5}{2}} W^{-\frac{3}{4}} R^{-\frac{3}{4}}\right) = \frac{15}{2} P^{\frac{5}{2}} W^{-\frac{7}{4}} R^{-\frac{3}{4}}.$$

Note that L*(P, W, R) is homogeneous of degree 0.
To calculate the firm's demand curve for capital K*(P, W, R), simply differentiate π*(P, W, R) with respect to R and multiply by −1. That is:
$$K^*(P, W, R) = -\frac{\partial \pi^*(P, W, R)}{\partial R} = -\frac{\partial}{\partial R}\left(10 P^{\frac{5}{2}} W^{-\frac{3}{4}} R^{-\frac{3}{4}}\right) = \frac{15}{2} P^{\frac{5}{2}} W^{-\frac{3}{4}} R^{-\frac{7}{4}}.$$
Note that K*(P, W, R) is homogeneous of degree 0.
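Since all three objects are obtained by differentiating π*, this whole example mechanizes nicely. A sympy sketch (added here, not in the original) that recovers Q*, L* and K* from the profit function above and checks the degree-0 homogeneity:

```python
import sympy as sp

P, W, R, t = sp.symbols('P W R t', positive=True)
pistar = 10*P**sp.Rational(5,2) * W**sp.Rational(-3,4) * R**sp.Rational(-3,4)

Qstar = sp.diff(pistar, P)        # Hotelling's lemma: supply curve
Lstar = -sp.diff(pistar, W)       # labour demand
Kstar = -sp.diff(pistar, R)       # capital demand
print(Qstar, Lstar, Kstar)

# homogeneous of degree 0: scaling all prices by t changes nothing
scaled = Qstar.subs({P: t*P, W: t*W, R: t*R}, simultaneous=True)
print(sp.simplify(scaled - Qstar))    # 0
```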

2.1.8 Homogeneous Functions


Discussion
In economics we often work with homogeneous functions. Sometimes these are production functions, when we wish to capture the idea of increasing, decreasing or constant returns to scale. Other times economic theory requires that certain functions be homogeneous. Thus profit and cost functions are always homogeneous of degree 1, demand and supply curves are always homogeneous of degree 0, while the marginal utility of income must be homogeneous of degree −1.

Definition
Let us review the definition of a homogeneous function.
Definition 13: A function f(x_1, x_2, ..., x_n), or f(x) where x is an n × 1 vector, is said to be homogeneous of degree k if for any positive scalar λ:
$$f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n) = \lambda^k f(x_1, x_2, \ldots, x_n)$$
or:
$$f(\lambda x) = \lambda^k f(x).$$

An Example
The classic example of a homogeneous function is the Cobb-Douglas functional form:
$$f(x_1, x_2, \ldots, x_n) = A (x_1)^{\alpha_1} (x_2)^{\alpha_2} \cdots (x_n)^{\alpha_n}$$
which is homogeneous of degree:
$$k = \alpha_1 + \alpha_2 + \cdots + \alpha_n$$
since:
$$f(\lambda x_1, \ldots, \lambda x_n) = A(\lambda x_1)^{\alpha_1} \cdots (\lambda x_n)^{\alpha_n} = \lambda^{\alpha_1 + \cdots + \alpha_n} A(x_1)^{\alpha_1} \cdots (x_n)^{\alpha_n} = \lambda^{\alpha_1 + \cdots + \alpha_n} f(x_1, \ldots, x_n).$$
Thus F(L, K) = L^{1/3} K^{1/6} is homogeneous of degree:
$$k = \frac{1}{3} + \frac{1}{6} = \frac{1}{2}.$$
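A one-line symbolic check of this degree (an addition to the text, using sympy):

```python
import sympy as sp

L, K, lam = sp.symbols('L K lam', positive=True)
F = L**sp.Rational(1,3) * K**sp.Rational(1,6)
# F(lam*L, lam*K) / F(L, K) should equal lam**(1/2)
print(sp.simplify(F.subs({L: lam*L, K: lam*K}, simultaneous=True) / F))  # lam**(1/2)
```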

Some Results for Homogeneous Functions
Theorem 14 (Euler's theorem): If f(x) is homogeneous of degree k then:
$$k f(x) = \frac{\partial f(x)}{\partial x_1} x_1 + \frac{\partial f(x)}{\partial x_2} x_2 + \cdots + \frac{\partial f(x)}{\partial x_n} x_n.$$
For example if F(L, K) = L^{1/3} K^{1/6} then you can verify that:
$$\frac{1}{2} F(L, K) = \frac{\partial F(L, K)}{\partial L} L + \frac{\partial F(L, K)}{\partial K} K = \frac{1}{3} L^{\frac{1}{3}-1} K^{\frac{1}{6}} \times L + \frac{1}{6} L^{\frac{1}{3}} K^{\frac{1}{6}-1} \times K.$$
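The verification suggested above can also be done symbolically; a minimal sympy sketch (an addition to the text):

```python
import sympy as sp

L, K = sp.symbols('L K', positive=True)
F = L**sp.Rational(1,3) * K**sp.Rational(1,6)     # homogeneous of degree k = 1/2
euler = sp.diff(F, L)*L + sp.diff(F, K)*K
print(sp.simplify(euler - sp.Rational(1,2)*F))    # 0, as Euler's theorem predicts
```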
It turns out that the derivative of a homogeneous function is also homogeneous, of degree one less than the original function. We have:
Theorem 15: If f(x) is homogeneous of degree k then ∂f(x)/∂x_i is homogeneous of degree k − 1.
For example if F(L, K) = L^{1/3} K^{1/6}, which is homogeneous of degree k = 1/2, then you can verify that:
$$\frac{\partial F(L, K)}{\partial L} = \frac{1}{3} L^{-\frac{2}{3}} K^{\frac{1}{6}}$$
is homogeneous of degree k − 1 = −1/2.

2.1.9 Concavity and Convexity


Discussion: We have seen that concavity and convexity are of fundamental importance in economics. It turns out that there are certain difficulties with the notions of concavity and convexity that we have used so far.
For example, we have said that a function is (globally) convex if f''(x) > 0 for all x in the domain of f. Consider the function:
$$y = |x|$$
[plot of y = |x| for −4 ≤ x ≤ 4]
This function is clearly valley-like and hence we would like to say that it is convex. However, for this function f''(x) = 0 for x ≠ 0 and f''(x) is not defined at x = 0, so according to the calculus definition of convexity the function is not convex.
Another problem arises with functions such as:
$$f(x) = x^4$$
[plot of y = x⁴ for −4 ≤ x ≤ 4]
This is also clearly valley-like. For x ≠ 0 we have f''(x) = 12x² > 0, but f''(0) = 0, so the definition of convexity we have used fails here as well.

For these and other reasons we will now discuss a more satisfactory definition of concavity and convexity.
It turns out that it is possible to define concavity and convexity without resorting to calculus. The basic idea is that if a function is convex, the line connecting any two points on the graph of the function must lie everywhere above the function, while if it is concave the line must lie everywhere below.
This leads us to the following definitions of concavity and convexity.



The Definition of Convexity
Definition 16: A function f(x) is convex if and only if for all x_1 and x_2 in the domain of f(x):
$$f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2)$$
where λ is any scalar with 0 ≤ λ ≤ 1.
We have the following:
Theorem 17: If f(x) has first and second derivatives then f(x) is convex if and only if the Hessian of f(x) is positive semi-definite for all x.

The Definition of Strict Convexity
Definition 18: The function f(x) is strictly convex if and only if for all x_1 and x_2 in the domain of f(x):
$$f(\lambda x_1 + (1 - \lambda) x_2) < \lambda f(x_1) + (1 - \lambda) f(x_2)$$
where λ is any scalar with 0 < λ < 1 and x_1 ≠ x_2.
Note that unlike the definition of convexity we exclude λ = 0 and λ = 1 in the definition of strict convexity, and rule out identical x_1 and x_2.
We have the following results:
Theorem 19: If f(x) is strictly convex then it is convex.
A function may be convex without being strictly convex if it has linear segments. For example it can be shown that y = |x| is convex but not strictly convex because it is linear to the right and left of the origin. You can verify this yourself by looking at the graph of y = |x|.
A strictly convex function always has valley-like curvature. For example y = x² and y = x⁴ are strictly convex.
Theorem 20: If the Hessian of f(x) is everywhere positive definite, then f(x) is strictly convex.

The Definition of Concavity
Definition 21: The function f(x) is concave if and only if for all x_1 and x_2 in the domain of f(x):
$$f(\lambda x_1 + (1 - \lambda) x_2) \ge \lambda f(x_1) + (1 - \lambda) f(x_2)$$
where λ is any scalar with 0 ≤ λ ≤ 1.
Theorem 22: If f(x) has first and second derivatives then f(x) is concave if and only if the Hessian of f(x) is negative semi-definite for all x.

The Definition of Strict Concavity
Definition 23: f(x) is strictly concave if and only if for all x_1 and x_2 in the domain of f(x):
$$f(\lambda x_1 + (1 - \lambda) x_2) > \lambda f(x_1) + (1 - \lambda) f(x_2)$$
where λ is any scalar with 0 < λ < 1 and x_1 ≠ x_2.
Note that unlike the definition of concavity we exclude λ = 0 and λ = 1 in the definition of strict concavity, and rule out identical x_1 and x_2.
Theorem 24: If f(x) is strictly concave then it is concave.
A function may be concave without being strictly concave if it has linear segments. For example it can be shown that y = −|x| is concave but not strictly concave because it is linear to the right and left of the origin.
A linear function is both concave and convex but is neither strictly concave nor strictly convex.
A strictly concave function always has mountain-like curvature. For example y = −x² and y = −x⁴ are strictly concave.
Theorem 25: If the Hessian of f(x) is everywhere negative definite, then f(x) is strictly concave.

2.1.10 Concavity, Convexity and Homogeneity of f*(x)


In economics the exogenous variables, or the x's, are often prices. This means that the naive function f(y, x) is often a linear function of the x's, or prices, and so can be written as:
$$f(y, x) = \sum_{i=1}^{m} x_i \phi_i(y) = x^T \phi(y) \tag{2.1}$$
where φ(y) = [φ_i(y)] is an m × 1 vector. This linearity of the naive function has important implications for the clever function f*(x). We have the following results:
Theorem 26: If the naive function f(y, x) can be written as f(y, x) = x^T φ(y), then the clever function f*(x) is homogeneous of degree 1, so that f*(λx) = λf*(x) for all positive λ. Furthermore y*(x) is homogeneous of degree 0; that is, y*(λx) = y*(x) for all λ > 0.
Theorem 27: If the naive function f(y, x) = x^T φ(y) is being maximized, the clever function f*(x) is convex.
Theorem 28: If the naive function f(y, x) = x^T φ(y) is being minimized, the clever function f*(x) is concave.

Proof of Homogeneity
We first prove that f*(x), given by:
$$f^*(x) = x^T \phi(y^*(x)),$$
is homogeneous of degree 1 and that y*(x), the value of y which maximizes or minimizes f(y, x), is homogeneous of degree 0. (See page 48 for a review of homogeneity.)
First, note that the naive function is homogeneous of degree 1 in x since:
$$f(y, \lambda x) = (\lambda x)^T \phi(y) = \lambda\left(x^T \phi(y)\right) = \lambda f(y, x).$$
Therefore, since λ > 0, it follows that if y*(x) maximizes or minimizes f(y, x) it also maximizes or minimizes f(y, λx), and so:
$$y^*(\lambda x) = y^*(x).$$
Thus y*(x) is homogeneous of degree 0. Furthermore:
$$f^*(\lambda x) = (\lambda x)^T \phi(y^*(\lambda x)) = (\lambda x)^T \phi(y^*(x)) = \lambda x^T \phi(y^*(x)) = \lambda f^*(x)$$
where the second equality follows by the homogeneity of y*(x). This proves that f*(x) is homogeneous of degree 1.

Proof that f*(x) is convex when f(y, x) is being maximized
Suppose the naive function f(y, x) is being maximized. Let x_1 and x_2 (which are m × 1 vectors) be two values of x and let x_3 = λx_1 + (1 − λ)x_2 (where 0 ≤ λ ≤ 1) be a convex combination of x_1 and x_2. In order to prove that f*(x) is convex (see page 52 for a review of convexity) we need to show that:
$$f^*(x_3) \le \lambda f^*(x_1) + (1 - \lambda) f^*(x_2).$$
We have then that:
$$f^*(x_3) = (\lambda x_1 + (1 - \lambda) x_2)^T \phi(y^*(x_3)) = \lambda x_1^T \phi(y^*(x_3)) + (1 - \lambda) x_2^T \phi(y^*(x_3)).$$
However:
$$x_1^T \phi(y^*(x_3)) = f(y^*(x_3), x_1) \le f(y^*(x_1), x_1) = x_1^T \phi(y^*(x_1)) = f^*(x_1)$$
since y*(x_1) maximizes f(y, x_1). Similarly we have:
$$x_2^T \phi(y^*(x_3)) = f(y^*(x_3), x_2) \le f(y^*(x_2), x_2) = x_2^T \phi(y^*(x_2)) = f^*(x_2)$$
since y*(x_2) maximizes f(y, x_2). We therefore conclude that:
$$f^*(x_3) = \lambda x_1^T \phi(y^*(x_3)) + (1 - \lambda) x_2^T \phi(y^*(x_3)) \le \lambda f^*(x_1) + (1 - \lambda) f^*(x_2)$$
so that f*(x) is convex.

Proof that f*(x) is concave when f(y, x) is being minimized
Suppose the naive function f(y, x) is being minimized. Let x_1 and x_2 (which are m × 1 vectors) be two values of x and let x_3 = λx_1 + (1 − λ)x_2 (where 0 ≤ λ ≤ 1) be a convex combination of x_1 and x_2. In order to prove that f*(x) is concave we need to show that (see page 52):
$$f^*(x_3) \ge \lambda f^*(x_1) + (1 - \lambda) f^*(x_2).$$
We have then that:
$$f^*(x_3) = (\lambda x_1 + (1 - \lambda) x_2)^T \phi(y^*(x_3)) = \lambda x_1^T \phi(y^*(x_3)) + (1 - \lambda) x_2^T \phi(y^*(x_3)).$$
However we have:
$$x_1^T \phi(y^*(x_3)) = f(y^*(x_3), x_1) \ge f(y^*(x_1), x_1) = x_1^T \phi(y^*(x_1)) = f^*(x_1)$$
since y*(x_1) minimizes f(y, x_1). Similarly we have:
$$x_2^T \phi(y^*(x_3)) = f(y^*(x_3), x_2) \ge f(y^*(x_2), x_2) = x_2^T \phi(y^*(x_2)) = f^*(x_2)$$
since y*(x_2) minimizes f(y, x_2). We therefore conclude that:
$$f^*(x_3) = \lambda x_1^T \phi(y^*(x_3)) + (1 - \lambda) x_2^T \phi(y^*(x_3)) \ge \lambda f^*(x_1) + (1 - \lambda) f^*(x_2)$$
so that f*(x) is concave.

2.1.11 Properties of the Profit Function


Let us now apply the results of the previous section to the profit function. These results apply since the naive profit function is a linear function of the exogenous variables, which are the prices P, W and R:
$$\pi(L, K, P, W, R) = P F(L, K) - WL - RK = x^T \phi(y)$$
where:
$$x = \begin{bmatrix} P \\ W \\ R \end{bmatrix}, \quad y = \begin{bmatrix} L \\ K \end{bmatrix}, \quad \phi(y) = \begin{bmatrix} F(L, K) \\ -L \\ -K \end{bmatrix}.$$

From this it follows that (see page 53):
Theorem 29: π*(P, W, R) is homogeneous of degree 1.
Theorem 30: π*(P, W, R) is convex.
Theorem 31: L*(P, W, R), K*(P, W, R) and Q*(P, W, R) are homogeneous of degree 0.
The fact that L*(P, W, R), K*(P, W, R) and Q*(P, W, R) are homogeneous of degree 0 is a basic requirement of rationality: the firm is not subject to money illusion. For example if a 100% inflation doubles P, W and R, then a rational firm does not change its production or factor demands, or:
$$L^*(2P, 2W, 2R) = L^*(P, W, R)$$
$$K^*(2P, 2W, 2R) = K^*(P, W, R)$$
$$Q^*(2P, 2W, 2R) = Q^*(P, W, R).$$

We can use the convexity of π*(P, W, R) to show that the firm's factor demand curves slope downwards and the firm's supply curve slopes upwards. Since π*(P, W, R) is convex (and assuming first and second derivatives exist) we have that the Hessian given by:
$$H(P, W, R) = \begin{bmatrix} \frac{\partial^2 \pi^*}{\partial P^2} & \frac{\partial^2 \pi^*}{\partial P \partial W} & \frac{\partial^2 \pi^*}{\partial P \partial R} \\ \frac{\partial^2 \pi^*}{\partial P \partial W} & \frac{\partial^2 \pi^*}{\partial W^2} & \frac{\partial^2 \pi^*}{\partial W \partial R} \\ \frac{\partial^2 \pi^*}{\partial P \partial R} & \frac{\partial^2 \pi^*}{\partial W \partial R} & \frac{\partial^2 \pi^*}{\partial R^2} \end{bmatrix}$$
is positive semi-definite (see page 52).
Since a positive semi-definite matrix has non-negative diagonal elements, we have:
$$\frac{\partial^2 \pi^*(P, W, R)}{\partial P^2} \ge 0, \quad \frac{\partial^2 \pi^*(P, W, R)}{\partial W^2} \ge 0, \quad \frac{\partial^2 \pi^*(P, W, R)}{\partial R^2} \ge 0. \tag{2.2}$$

The Demand Curve for Labour Slopes Downwards
Let us begin by showing the firm's labour demand curve is downward sloping. Differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
with respect to W to obtain:
$$-\frac{\partial L^*(P, W, R)}{\partial W} = \frac{\partial^2 \pi^*(P, W, R)}{\partial W^2} \ge 0$$
where the inequality follows from (2.2). It therefore follows (footnote 1) that:
$$\frac{\partial L^*(P, W, R)}{\partial W} \le 0$$
so that the firm's demand curve for labour slopes downwards.

The Demand Curve for Capital Slopes Downwards
A similar argument shows that the demand for capital is downward sloping. Differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial R} = -K^*(P, W, R)$$
with respect to R to obtain:
$$-\frac{\partial K^*(P, W, R)}{\partial R} = \frac{\partial^2 \pi^*(P, W, R)}{\partial R^2} \ge 0$$
where the inequality follows from (2.2). It therefore follows that:
$$\frac{\partial K^*(P, W, R)}{\partial R} \le 0$$
so the demand curve for capital slopes downwards.

The Supply Curve Slopes Upwards
To show that the firm's supply curve slopes upwards, differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial P} = Q^*(P, W, R)$$
with respect to P to obtain:
$$\frac{\partial Q^*(P, W, R)}{\partial P} = \frac{\partial^2 \pi^*(P, W, R)}{\partial P^2} \ge 0$$
where the inequality follows from (2.2). It therefore follows that the firm's supply curve slopes upwards.
Footnote 1: Recall that if −X ≥ a then X ≤ −a. For example, from −5 ≥ −7 it follows that 5 ≤ 7. Multiplying both sides of an inequality by a negative number reverses the inequality.



2.1.12 Some Symmetry Results


There are a number of symmetry results that can also be demonstrated using these techniques. Differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
with respect to R to obtain:
$$-\frac{\partial L^*(P, W, R)}{\partial R} = \frac{\partial^2 \pi^*(P, W, R)}{\partial R \partial W}.$$
Now differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial R} = -K^*(P, W, R)$$
with respect to W to obtain:
$$-\frac{\partial K^*(P, W, R)}{\partial W} = \frac{\partial^2 \pi^*(P, W, R)}{\partial W \partial R}.$$
By Young's theorem $\frac{\partial^2 \pi^*}{\partial R \partial W} = \frac{\partial^2 \pi^*}{\partial W \partial R}$, and so we can conclude that:
$$\frac{\partial L^*(P, W, R)}{\partial R} = \frac{\partial K^*(P, W, R)}{\partial W}.$$
As an exercise use the same approach to show that:
$$\frac{\partial Q^*(P, W, R)}{\partial R} = -\frac{\partial K^*(P, W, R)}{\partial P}, \qquad \frac{\partial Q^*(P, W, R)}{\partial W} = -\frac{\partial L^*(P, W, R)}{\partial P}.$$
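The exercise results can be confirmed symbolically for the illustrative profit function π* = 10 P^{5/2} W^{−3/4} R^{−3/4} used in section 2.1.7 (this sketch is an addition to the text):

```python
import sympy as sp

P, W, R = sp.symbols('P W R', positive=True)
pistar = 10*P**sp.Rational(5,2) * W**sp.Rational(-3,4) * R**sp.Rational(-3,4)

Qstar = sp.diff(pistar, P)        # supply curve (Hotelling's lemma)
Lstar = -sp.diff(pistar, W)       # labour demand
Kstar = -sp.diff(pistar, R)       # capital demand

print(sp.simplify(sp.diff(Qstar, R) + sp.diff(Kstar, P)))   # 0: dQ*/dR = -dK*/dP
print(sp.simplify(sp.diff(Qstar, W) + sp.diff(Lstar, P)))   # 0: dQ*/dW = -dL*/dP
```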

2.2 Constrained Optimization


2.2.1 Envelope Theorem III: Constrained Optimization
The envelope theorem also works for constrained optimization problems where we are using Lagrangians.
Let f(y, x) be the naive function which is to be optimized, where y = [y_i] is an n × 1 vector (of variables over which we optimize f(y, x)) and x = [x_i] is an m × 1 vector of exogenous variables.
The naive function is being optimized subject to a constraint:
$$h(y, x) = 0.$$
This leads to the Lagrangian:
$$\mathcal{L}(\lambda, y, x) = f(y, x) + \lambda h(y, x).$$

As before we call f(y, x) the naive version of the function because it does not restrict the choice of y to be optimal or rational.
The optimal value of y, say y*, will, along with the Lagrange multiplier λ*, satisfy the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial \lambda} = h(y^*, x) = 0,$$
$$\frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_i} = \frac{\partial f(y^*, x)}{\partial y_i} + \lambda^* \frac{\partial h(y^*, x)}{\partial y_i} = 0 \text{ for } i = 1, 2, \ldots, n.$$
This implicitly defines n + 1 functions:
$$\lambda^* = \lambda^*(x), \qquad y_i^* = y_i^*(x) \text{ for } i = 1, 2, \ldots, n.$$
We can replace y with y*(x) in f(y, x) to get the clever version of the function:
$$f^*(x) = f(y^*(x), x).$$
We could also replace λ and y in the Lagrangian $\mathcal{L}(\lambda, y, x)$ with λ*(x) and y*(x) to obtain the clever Lagrangian $\mathcal{L}(\lambda^*(x), y^*(x), x)$. Since y*(x) satisfies the constraint h(y*(x), x) = 0, it follows that:
$$\mathcal{L}(\lambda^*(x), y^*(x), x) = f(y^*(x), x) + \lambda^*(x)\,h(y^*(x), x) = f(y^*(x), x) = f^*(x).$$
We thus have:
$$f^*(x) = \mathcal{L}(\lambda^*(x), y^*(x), x).$$
Suppose now we wish to calculate:
$$\frac{\partial f^*(x_1, x_2, \ldots, x_m)}{\partial x_i}.$$
Using the chain rule and differentiating both sides of:
$$f^*(x_1, x_2, \ldots, x_m) = \mathcal{L}(\lambda^*(x), y^*(x), x)$$
with respect to x_i, we find that:
$$\frac{\partial f^*(x_1, \ldots, x_m)}{\partial x_i} = \left(\frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial \lambda}\right)\frac{\partial \lambda^*(x)}{\partial x_i} + \sum_{j=1}^{n}\left(\frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_j}\right)\frac{\partial y_j^*(x)}{\partial x_i} + \frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial x_i}.$$

However, by the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial \lambda} = \frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_1} = \cdots = \frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_n} = 0$$
so that all the derivatives in brackets above are zero. We therefore have:
Theorem 32 (Envelope Theorem III): Let f(y, x) be the naive function which is optimized subject to a constraint h(y, x) = 0, with Lagrangian:
$$\mathcal{L}(\lambda, y, x) = f(y, x) + \lambda h(y, x).$$
Then:
$$\frac{\partial f^*(x_1, \ldots, x_m)}{\partial x_i} = \left.\frac{\partial \mathcal{L}(\lambda, y, x)}{\partial x_i}\right|_{\substack{\lambda = \lambda^*(x) \\ y = y^*(x)}} = \frac{\partial \mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial x_i}.$$
The envelope theorem for constrained optimization states that to calculate the derivative of the clever version of the function f*(x) with respect to x_i, first take the derivative of the Lagrangian $\mathcal{L}(\lambda, y, x)$ with respect to x_i and then evaluate this at λ*(x) and y*(x).

2.2.2 A Recipe for Constrained Optimization


Given the original naive version of the function f(y, x), the constraint h(y, x) = 0, the optimal value y*(x) and the clever version of the function f*(x) = f(y*(x), x), suppose you wish to calculate:
$$\frac{\partial f^*(x)}{\partial x_i}.$$
The Lagrangian for the constrained optimization problem is:
$$\mathcal{L}(\lambda, y, x) = f(y, x) + \lambda h(y, x).$$
The recipe proceeds as follows:
Step 1: Calculate the partial derivative of the Lagrangian $\mathcal{L}(\lambda, y, x)$ with respect to the i-th exogenous variable, or:
$$\frac{\partial \mathcal{L}(\lambda, y, x)}{\partial x_i} = \frac{\partial f(y, x)}{\partial x_i} + \lambda \frac{\partial h(y, x)}{\partial x_i}.$$
Step 2: Replace all occurrences of λ and y with λ*(x) and y*(x). This gives the desired result, that is:
$$\frac{\partial f^*(x)}{\partial x_i} = \left.\frac{\partial \mathcal{L}(\lambda, y, x)}{\partial x_i}\right|_{\substack{\lambda = \lambda^*(x) \\ y = y^*(x)}} = \frac{\partial f(y^*(x), x)}{\partial x_i} + \lambda^*(x)\frac{\partial h(y^*(x), x)}{\partial x_i}.$$

2.2.3 Concavity, Convexity and Homogeneity of f*(x)


As with unconstrained optimization, the x's in the naive function f(y, x) are often prices, so that f(y, x) is often a linear function of the x's and can be written as:
$$f(y, x) = \sum_{i=1}^{m} x_i \phi_i(y) = x^T \phi(y) \tag{2.3}$$
where φ(y) = [φ_i(y)] is an m × 1 vector.
We saw for unconstrained optimization that this linearity had important implications for the clever function f*(x). These results also follow for constrained optimization as long as the constraint does not depend on x, or:
$$h(y, x) = h(y).$$
We then have the following results:
Theorem 33: If the naive function f(y, x) can be written as f(y, x) = x^T φ(y), and the constraint does not depend on x so that h(y, x) = h(y), then the clever function f*(x) is homogeneous of degree 1, so that f*(λx) = λf*(x) for all positive λ. Furthermore y*(x) is homogeneous of degree 0; that is, y*(λx) = y*(x) for all λ > 0.
Theorem 34: If the naive function f(y, x) = x^T φ(y) is being maximized, the clever function f*(x) is convex.
Theorem 35: If the naive function f(y, x) = x^T φ(y) is being minimized, the clever function f*(x) is concave.

Proof of Homogeneity
We first prove that f*(x), given by:
$$f^*(x) = x^T \phi(y^*(x)),$$
is homogeneous of degree 1 and that y*(x), the value of y which maximizes or minimizes f(y, x), is homogeneous of degree 0 (see page 48 to review homogeneity).
First, note that the naive function is homogeneous of degree 1 in x since:
$$f(y, \lambda x) = (\lambda x)^T \phi(y) = \lambda\left(x^T \phi(y)\right) = \lambda f(y, x).$$
Therefore, since λ > 0 and since h(y*(x)) = 0 (y*(x) satisfies the constraint), it follows that if y*(x) maximizes or minimizes f(y, x) subject to h(y) = 0 it also maximizes or minimizes f(y, λx) subject to h(y) = 0, and so:
$$y^*(\lambda x) = y^*(x).$$
Thus y*(x) is homogeneous of degree 0. Furthermore:
$$f^*(\lambda x) = (\lambda x)^T \phi(y^*(\lambda x)) = (\lambda x)^T \phi(y^*(x)) = \lambda x^T \phi(y^*(x)) = \lambda f^*(x)$$
where the second equality follows by the homogeneity of y*(x). This proves that f*(x) is homogeneous of degree 1.

Proof that f*(x) is convex when f(y, x) is being maximized
Suppose the naive function f(y, x) is being maximized. Let x_1 and x_2 (which are m × 1 vectors) be two values of x and let x_3 = λx_1 + (1 − λ)x_2 (where 0 ≤ λ ≤ 1) be a convex combination of x_1 and x_2. In order to prove that f*(x) is convex we need to show that (see page 52):
$$f^*(x_3) \le \lambda f^*(x_1) + (1 - \lambda) f^*(x_2).$$
We have then that:
$$f^*(x_3) = (\lambda x_1 + (1 - \lambda) x_2)^T \phi(y^*(x_3)) = \lambda x_1^T \phi(y^*(x_3)) + (1 - \lambda) x_2^T \phi(y^*(x_3)).$$
However we have:
$$x_1^T \phi(y^*(x_3)) = f(y^*(x_3), x_1) \le f(y^*(x_1), x_1) = x_1^T \phi(y^*(x_1)) = f^*(x_1)$$
since y*(x_1) maximizes f(y, x_1) subject to h(y) = 0 and h(y*(x_3)) = 0. Similarly we have:
$$x_2^T \phi(y^*(x_3)) = f(y^*(x_3), x_2) \le f(y^*(x_2), x_2) = x_2^T \phi(y^*(x_2)) = f^*(x_2)$$
since y*(x_2) maximizes f(y, x_2) subject to h(y) = 0 and h(y*(x_3)) = 0. We therefore conclude that:
$$f^*(x_3) = \lambda x_1^T \phi(y^*(x_3)) + (1 - \lambda) x_2^T \phi(y^*(x_3)) \le \lambda f^*(x_1) + (1 - \lambda) f^*(x_2)$$
so that f*(x) is convex.



Proof that f*(x) is concave when f(y, x) is being minimized
The proof follows the same reasoning as above. You may want to do it as an exercise.

2.2.4 Cost Minimization


The naive cost function is given by:
$$C = WL + RK$$
which tells us what it costs the firm to hire L workers and K units of capital. The production function is given by:
$$Q = F(L, K)$$
which gives the technology available to the firm.
Any profit-maximizing firm must minimize the cost of producing a given level of output Q. Consider the problem of minimizing the cost of producing Q units of output.
The problem of minimizing cost subject to the constraint that Q units be produced leads to the Lagrangian:
$$\mathcal{L}(\lambda, L, K, Q, W, R) = WL + RK + \lambda(Q - F(L, K))$$
with first-order conditions given by:
$$\frac{\partial \mathcal{L}(\lambda^*, L^*, K^*, Q, W, R)}{\partial \lambda} = Q - F(L^*, K^*) = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, L^*, K^*, Q, W, R)}{\partial L} = W - \lambda^* \frac{\partial F(L^*, K^*)}{\partial L} = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, L^*, K^*, Q, W, R)}{\partial K} = R - \lambda^* \frac{\partial F(L^*, K^*)}{\partial K} = 0$$
which implicitly determine three functions:
$$\lambda^* = \lambda^*(Q, W, R), \quad L^* = L^*(Q, W, R), \quad K^* = K^*(Q, W, R).$$
L*(Q, W, R) and K*(Q, W, R) are called the conditional factor demands for labour and capital respectively. They are conditional in the sense of being conditional on Q units of output being produced (footnote 2).
Substituting L*(Q, W, R) and K*(Q, W, R) into the naive cost function C = WL + RK yields the clever cost function:
$$C^*(Q, W, R) = WL^*(Q, W, R) + RK^*(Q, W, R).$$
Footnote 2: This Q is not necessarily the profit-maximizing level of output, and so the conditional factor demands are not the same as the ordinary factor demands which, as we saw earlier with profit maximization, depend on P, W and R.

The Conditional Demand for Labour
Suppose we wish to calculate ∂C*(Q, W, R)/∂W. Following the recipe for the envelope theorem we proceed as follows:
Step 1: Differentiate the Lagrangian $\mathcal{L}(\lambda, L, K, Q, W, R)$ with respect to W to obtain:
$$\frac{\partial \mathcal{L}(\lambda, L, K, Q, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(WL + RK + \lambda(Q - F(L, K))\right) = L.$$
Step 2: Replace all occurrences of L, K and λ with L*, K* and λ*:
$$\frac{\partial C^*(Q, W, R)}{\partial W} = L^*(Q, W, R).$$
This is an example of Shephard's lemma. It states that to find the conditional demand for labour one need only differentiate the cost function C*(Q, W, R) with respect to W.

The Conditional Demand for Capital
Now suppose we wish to calculate ∂C*(Q, W, R)/∂R. Following the recipe for the envelope theorem we proceed as follows:
Step 1: Differentiate the Lagrangian $\mathcal{L}(\lambda, L, K, Q, W, R)$ with respect to R to obtain:
$$\frac{\partial \mathcal{L}(\lambda, L, K, Q, W, R)}{\partial R} = \frac{\partial}{\partial R}\left(WL + RK + \lambda(Q - F(L, K))\right) = K.$$
Step 2: Replace all occurrences of L, K and λ with L*, K* and λ*:
$$\frac{\partial C^*(Q, W, R)}{\partial R} = K^*(Q, W, R).$$
This is again Shephard's lemma. It states that we can find the conditional demand for capital by differentiating the cost function C*(Q, W, R) with respect to the rental cost of capital R.

Marginal Cost
Finally, suppose we wish to calculate marginal cost:
$$MC(Q, W, R) = \frac{\partial C^*(Q, W, R)}{\partial Q}.$$
Following the recipe for the envelope theorem we proceed as follows:
Step 1: Differentiate the Lagrangian $\mathcal{L}(\lambda, L, K, Q, W, R)$ with respect to Q to obtain:
$$\frac{\partial \mathcal{L}(\lambda, L, K, Q, W, R)}{\partial Q} = \frac{\partial}{\partial Q}\left(WL + RK + \lambda(Q - F(L, K))\right) = \lambda.$$
Step 2: Replace all occurrences of L, K and λ with L*, K* and λ*:
$$\frac{\partial C^*(Q, W, R)}{\partial Q} = \lambda^*(Q, W, R).$$
Thus the Lagrange multiplier λ* is in fact equal to marginal cost.

An Example
It is a fact that if the firm's production function Q = F(L, K) is Cobb-Douglas, or:
$$Q = L^\alpha K^\beta,$$
then the cost function C*(Q, W, R) has the Cobb-Douglas form:
$$C^*(Q, W, R) = (\alpha + \beta)\, \alpha^{-\frac{\alpha}{\alpha+\beta}}\, \beta^{-\frac{\beta}{\alpha+\beta}}\, Q^{\frac{1}{\alpha+\beta}}\, W^{\frac{\alpha}{\alpha+\beta}}\, R^{\frac{\beta}{\alpha+\beta}}$$
or:
$$C^*(Q, W, R) = B Q^c W^d R^e$$
where:
$$B = (\alpha + \beta)\, \alpha^{-\frac{\alpha}{\alpha+\beta}} \beta^{-\frac{\beta}{\alpha+\beta}}, \quad c = \frac{1}{\alpha+\beta}, \quad d = \frac{\alpha}{\alpha+\beta}, \quad e = \frac{\beta}{\alpha+\beta}.$$
Note that:
$$d + e = 1$$

so that the cost function is homogeneous of degree 1 in W and R when Q is kept fixed. This turns out to be a property that all cost functions share.
If the technology has decreasing returns to scale, or α + β < 1, then c > 1, and average cost:
$$\frac{C^*(Q, W, R)}{Q} = B Q^{c-1} W^d R^e$$
is an increasing function of Q.
If the technology has constant returns to scale, or α + β = 1, then c = 1 and average cost is independent of Q (the average cost curve is flat) since:
$$\frac{C^*(Q, W, R)}{Q} = B W^d R^e$$
does not depend on Q.
Finally, if the technology has increasing returns to scale, or α + β > 1, then c < 1 and average cost:
$$\frac{C^*(Q, W, R)}{Q} = B Q^{c-1} W^d R^e$$
is a decreasing function of Q.

An Example
Suppose the cost function is given by:
$$C^*(Q, W, R) = Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}},$$
which implies decreasing returns to scale since c = 2 > 1.
To calculate the firm's conditional demand curve for labour L*(Q, W, R), simply differentiate C*(Q, W, R) with respect to W to obtain:
$$L^*(Q, W, R) = \frac{\partial C^*(Q, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{3}{4} Q^2 W^{-\frac{1}{4}} R^{\frac{1}{4}}.$$
To calculate the firm's conditional demand curve for capital K*(Q, W, R), simply differentiate C*(Q, W, R) with respect to R:
$$K^*(Q, W, R) = \frac{\partial C^*(Q, W, R)}{\partial R} = \frac{\partial}{\partial R}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{1}{4} Q^2 W^{\frac{3}{4}} R^{-\frac{3}{4}}.$$
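Shephard's lemma makes this example fully mechanical. A sympy sketch (an addition to the text) that recovers both conditional demands and marginal cost from the cost function above, and checks the symmetry result derived in the next section:

```python
import sympy as sp

Q, W, R = sp.symbols('Q W R', positive=True)
Cstar = Q**2 * W**sp.Rational(3,4) * R**sp.Rational(1,4)

Lstar = sp.diff(Cstar, W)      # Shephard's lemma: conditional labour demand
Kstar = sp.diff(Cstar, R)      # Shephard's lemma: conditional capital demand
MC    = sp.diff(Cstar, Q)      # marginal cost = lambda*
print(Lstar, Kstar, MC)
print(sp.simplify(sp.diff(Lstar, R) - sp.diff(Kstar, W)))   # 0: symmetry
```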

2.2.5 Properties of the Cost Function


The naive cost function is a linear function of the exogenous variables W and R when Q is held fixed since:
$$C = WL + RK = x^T \phi(y)$$
where:
$$x = \begin{bmatrix} W \\ R \end{bmatrix}, \quad y = \begin{bmatrix} L \\ K \end{bmatrix}, \quad \phi(y) = y.$$
From this, and the fact that the constraint does not depend on W or R, it follows that (see page 26):
Theorem 36: C*(Q, W, R) is homogeneous of degree 1 in W and R when Q is held fixed; that is:
$$C^*(Q, \lambda W, \lambda R) = \lambda C^*(Q, W, R).$$

Theorem 37: C*(Q, W, R) is a concave function of W and R when Q is held fixed.
Theorem 38: L*(Q, W, R) and K*(Q, W, R) are homogeneous of degree 0 in W and R when Q is held fixed; that is:
$$L^*(Q, \lambda W, \lambda R) = L^*(Q, W, R)$$
$$K^*(Q, \lambda W, \lambda R) = K^*(Q, W, R).$$

The fact that L*(Q, W, R) and K*(Q, W, R) are homogeneous of degree 0 is a basic requirement of rationality: the firm is not subject to money illusion.
For example if a 100% inflation doubles W and R, then a rational firm does not change its means of producing Q units of output, or:
$$L^*(Q, 2W, 2R) = L^*(Q, W, R)$$
$$K^*(Q, 2W, 2R) = K^*(Q, W, R).$$
From the concavity of the cost function it follows that, if we treat Q as a constant, the Hessian given by:
$$H(Q, W, R) = \begin{bmatrix} \frac{\partial^2 C^*(Q,W,R)}{\partial W^2} & \frac{\partial^2 C^*(Q,W,R)}{\partial W \partial R} \\ \frac{\partial^2 C^*(Q,W,R)}{\partial W \partial R} & \frac{\partial^2 C^*(Q,W,R)}{\partial R^2} \end{bmatrix}$$
is negative semi-definite (see page 52).
This in turn implies that the diagonal elements are non-positive, or:
$$\frac{\partial^2 C^*(Q, W, R)}{\partial W^2} \le 0, \quad \frac{\partial^2 C^*(Q, W, R)}{\partial R^2} \le 0. \tag{2.4}$$

The Conditional Factor Demand Curves are Downward Sloping
We can use the concavity of C*(Q, W, R) to show that the firm's conditional factor demand curves are downward sloping.
Let us begin by showing the firm's conditional labour demand curve is downward sloping. Differentiate both sides of:
$$\frac{\partial C^*(Q, W, R)}{\partial W} = L^*(Q, W, R)$$
with respect to W to obtain:
$$\frac{\partial L^*(Q, W, R)}{\partial W} = \frac{\partial^2 C^*(Q, W, R)}{\partial W^2} \le 0$$
where the inequality follows from (2.4).
A similar argument shows that the conditional demand for capital is downward sloping. Differentiate both sides of:
$$\frac{\partial C^*(Q, W, R)}{\partial R} = K^*(Q, W, R)$$
with respect to R to obtain:
$$\frac{\partial K^*(Q, W, R)}{\partial R} = \frac{\partial^2 C^*(Q, W, R)}{\partial R^2} \le 0$$
where the inequality follows from (2.4).

A Symmetry Result
Just as with the profit function we can establish a symmetry result regarding the partial derivatives of the factor demands.
Differentiate both sides of:
$$\frac{\partial C^*(Q, W, R)}{\partial W} = L^*(Q, W, R)$$
with respect to R to obtain:
$$\frac{\partial L^*(Q, W, R)}{\partial R} = \frac{\partial^2 C^*(Q, W, R)}{\partial R \partial W}.$$
Now differentiate both sides of:
$$\frac{\partial C^*(Q, W, R)}{\partial R} = K^*(Q, W, R)$$
with respect to W to obtain:
$$\frac{\partial K^*(Q, W, R)}{\partial W} = \frac{\partial^2 C^*(Q, W, R)}{\partial W \partial R}.$$
By Young's theorem $\frac{\partial^2 C^*}{\partial R \partial W} = \frac{\partial^2 C^*}{\partial W \partial R}$, and so we conclude that:
$$\frac{\partial L^*(Q, W, R)}{\partial R} = \frac{\partial K^*(Q, W, R)}{\partial W}.$$

More Results
Using Euler's theorem (see page 49) and the fact that L*(Q, W, R) is homogeneous of degree 0 in W and R, we have:
$$L^*(Q, W, R) \times 0 = 0 = \frac{\partial L^*(Q, W, R)}{\partial W} W + \frac{\partial L^*(Q, W, R)}{\partial R} R$$
or:
$$\frac{\partial L^*(Q, W, R)}{\partial R} = -\frac{\partial L^*(Q, W, R)}{\partial W}\,\frac{W}{R}.$$
Since we have already seen that:
$$\frac{\partial L^*(Q, W, R)}{\partial W} \le 0$$
it follows that:
$$\frac{\partial L^*(Q, W, R)}{\partial R} \ge 0$$
so that an increase in the price of capital increases (or more precisely does not decrease) the conditional demand for labour.
We have shown the symmetry result that ∂L*(Q, W, R)/∂R = ∂K*(Q, W, R)/∂W, and so we have immediately that:
$$\frac{\partial K^*(Q, W, R)}{\partial W} \ge 0.$$

An Example
Note that in the example above, where:
$$C^*(Q, W, R) = Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}},$$
C*(Q, W, R) is homogeneous of degree 1 in W and R; that is:
$$C^*(Q, \lambda W, \lambda R) = Q^2 (\lambda W)^{\frac{3}{4}} (\lambda R)^{\frac{1}{4}} = \lambda^{\frac{3}{4}} \lambda^{\frac{1}{4}} Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}} = \lambda C^*(Q, W, R).$$
You can verify that:
$$L^*(Q, W, R) = \frac{\partial}{\partial W}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{3}{4} Q^2 W^{-\frac{1}{4}} R^{\frac{1}{4}}$$
is homogeneous of degree 0 and can be written as a function of W/R as:
$$L^*(Q, W, R) = \frac{3}{4} Q^2 \left(\frac{W}{R}\right)^{-\frac{1}{4}}$$
so that:
$$\bar{L}\left(Q, \frac{W}{R}\right) = \frac{3}{4} Q^2 \left(\frac{W}{R}\right)^{-\frac{1}{4}}.$$
Similarly:
$$K^*(Q, W, R) = \frac{\partial}{\partial R}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{1}{4} Q^2 W^{\frac{3}{4}} R^{-\frac{3}{4}}$$
is homogeneous of degree 0 and can be written as a function of W/R as:
$$K^*(Q, W, R) = \frac{1}{4} Q^2 \left(\frac{W}{R}\right)^{\frac{3}{4}}$$
so that:
$$\bar{K}\left(Q, \frac{W}{R}\right) = \frac{1}{4} Q^2 \left(\frac{W}{R}\right)^{\frac{3}{4}}.$$

2.2.6 Utility Maximization


Suppose a household consuming two goods Q_1 and Q_2 has a (naive) utility function:
$$U = U(Q_1, Q_2).$$
It has income Y and faces prices P_1 and P_2, so that the budget constraint is:
$$Y = P_1 Q_1 + P_2 Q_2.$$
This leads to the Lagrangian:
$$\mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y) = U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)$$
and the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, Y)}{\partial \lambda} = Y - P_1 Q_1^* - P_2 Q_2^* = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, Y)}{\partial Q_1} = \frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_1} - \lambda^* P_1 = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, Y)}{\partial Q_2} = \frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_2} - \lambda^* P_2 = 0.$$

These first-order conditions implicitly define the reduced form:
$$\lambda^* = \lambda^*(P_1, P_2, Y), \quad Q_1^* = Q_1^*(P_1, P_2, Y), \quad Q_2^* = Q_2^*(P_1, P_2, Y).$$
Here Q_1*(P_1, P_2, Y) and Q_2*(P_1, P_2, Y) are the Marshallian demand curves for the two goods. In general Marshallian demand curves are functions of prices and nominal income Y. Later we will investigate another kind of demand curve, the Hicksian demand curve, which is a function of prices and real income.
The indirect utility function (or the clever utility function) is given by:
$$U^*(Y, P_1, P_2) = U(Q_1^*(Y, P_1, P_2), Q_2^*(Y, P_1, P_2)).$$
Since Q_1*(Y, P_1, P_2) and Q_2*(Y, P_1, P_2) are homogeneous of degree 0 in Y, P_1 and P_2, it follows that U*(Y, P_1, P_2) is also homogeneous of degree 0.

The Lagrange Multiplier
The Lagrange multiplier λ*(P_1, P_2, Y) turns out to be the marginal utility of income.
To see this, define the marginal utility of income as ∂U*(P_1, P_2, Y)/∂Y and apply the envelope theorem recipe as follows:
Step 1: Take the derivative of the Lagrangian with respect to Y to get:
$$\frac{\partial \mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y)}{\partial Y} = \frac{\partial}{\partial Y}\left(U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)\right) = \lambda.$$
Step 2: Replace all occurrences of λ, Q_1 and Q_2 with the optimal values λ*, Q_1* and Q_2* to obtain:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \lambda^*(P_1, P_2, Y).$$
Thus the Lagrange multiplier turns out to be the marginal utility of income.
Exercise: Show that λ*(P_1, P_2, Y) is homogeneous of degree −1.

Calculating ∂U*(P_1, P_2, Y)/∂P_1
Suppose we now wish to calculate ∂U*(P_1, P_2, Y)/∂P_1. Using the recipe for the envelope theorem we proceed as follows:
Step 1: Differentiate the Lagrangian with respect to P_1 to get:
$$\frac{\partial \mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y)}{\partial P_1} = \frac{\partial}{\partial P_1}\left(U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)\right) = -\lambda Q_1.$$
Step 2: Replace all occurrences of λ, Q_1 and Q_2 with the optimal values λ*, Q_1* and Q_2* to obtain:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_1} = -\lambda^*(P_1, P_2, Y) \times Q_1^*(P_1, P_2, Y).$$
Calculating ∂U*(P_1, P_2, Y)/∂P_2
Similarly, suppose we wish to calculate ∂U*(P_1, P_2, Y)/∂P_2:
Step 1: Take the derivative of the Lagrangian with respect to P_2 to get:
$$\frac{\partial \mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y)}{\partial P_2} = \frac{\partial}{\partial P_2}\left(U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)\right) = -\lambda Q_2.$$
Step 2: Replace all occurrences of λ, Q_1 and Q_2 with the optimal values λ*, Q_1* and Q_2* to obtain:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_2} = -\lambda^*(P_1, P_2, Y) \times Q_2^*(P_1, P_2, Y).$$

Roy's Identity
Up until now, when we differentiated a clever function with respect to a price we always obtained either a demand or a supply curve. With utility maximization this turns out not to be the case. Nevertheless, we can use the indirect utility function to calculate the demands for Q_1 and Q_2.
Thus using:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \lambda^*(P_1, P_2, Y), \qquad \frac{\partial U^*(P_1, P_2, Y)}{\partial P_1} = -\lambda^*(P_1, P_2, Y) \times Q_1^*(P_1, P_2, Y)$$
we obtain:
$$Q_1^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_1}{\partial U^*(P_1, P_2, Y)/\partial Y}$$
which is Roy's identity.
Similarly, using:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \lambda^*(P_1, P_2, Y), \qquad \frac{\partial U^*(P_1, P_2, Y)}{\partial P_2} = -\lambda^*(P_1, P_2, Y) \times Q_2^*(P_1, P_2, Y)$$
we obtain:
$$Q_2^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_2}{\partial U^*(P_1, P_2, Y)/\partial Y}$$
which is again Roy's identity.

An Illustration of Roy's Identity

Given the utility function:

$$U(Q_1, Q_2) = \frac{1}{3}\ln(Q_1) + \frac{2}{3}\ln(Q_2)$$

we obtain the Lagrangian:

$$L(\lambda, Q_1, Q_2, P_1, P_2, Y) = \frac{1}{3}\ln(Q_1) + \frac{2}{3}\ln(Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)$$

which leads to:

$$\lambda^*(P_1, P_2, Y) = \frac{1}{Y}, \qquad Q_1^*(P_1, P_2, Y) = \frac{Y}{3P_1}, \qquad Q_2^*(P_1, P_2, Y) = \frac{2Y}{3P_2}.$$

To find the indirect (or clever) utility function, substitute $Q_1^*$ and $Q_2^*$ into $U(Q_1, Q_2)$ to obtain:

$$U^*(P_1, P_2, Y) = \frac{1}{3}\ln\left(\frac{Y}{3P_1}\right) + \frac{2}{3}\ln\left(\frac{2Y}{3P_2}\right) = \ln(Y) - \frac{1}{3}\ln(P_1) - \frac{2}{3}\ln(P_2) + \frac{1}{3}\ln\left(\frac{1}{3}\right) + \frac{2}{3}\ln\left(\frac{2}{3}\right).$$

Now suppose you were just given the indirect utility function and had to calculate the demand curves for $Q_1$ and $Q_2$ using Roy's identity. We have:

$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \frac{1}{Y}, \qquad \frac{\partial U^*(P_1, P_2, Y)}{\partial P_1} = \frac{-1}{3P_1}, \qquad \frac{\partial U^*(P_1, P_2, Y)}{\partial P_2} = \frac{-2}{3P_2}$$

so that:

$$Q_1^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_1}{\partial U^*(P_1, P_2, Y)/\partial Y} = -\frac{\frac{-1}{3P_1}}{\frac{1}{Y}} = \frac{Y}{3P_1}$$

and so we recover the original demand curve for $Q_1$ from $U^*(P_1, P_2, Y)$.

Similarly:

$$Q_2^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_2}{\partial U^*(P_1, P_2, Y)/\partial Y} = -\frac{\frac{-2}{3P_2}}{\frac{1}{Y}} = \frac{2Y}{3P_2}$$

and so we recover the original demand curve for $Q_2$ from $U^*(P_1, P_2, Y)$.
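This recovery of the demand curves can also be checked symbolically. The following is a minimal sketch in Python using the sympy library (an assumption of this example; the variable names are our own, not from the text):

```python
import sympy as sp

Y, P1, P2 = sp.symbols('Y P1 P2', positive=True)

# The indirect utility function derived above
U_star = (sp.log(Y) - sp.Rational(1, 3)*sp.log(P1) - sp.Rational(2, 3)*sp.log(P2)
          + sp.Rational(1, 3)*sp.log(sp.Rational(1, 3))
          + sp.Rational(2, 3)*sp.log(sp.Rational(2, 3)))

# Roy's identity: Q_i* = -(dU*/dP_i) / (dU*/dY)
Q1_star = sp.simplify(-sp.diff(U_star, P1) / sp.diff(U_star, Y))
Q2_star = sp.simplify(-sp.diff(U_star, P2) / sp.diff(U_star, Y))

print(Q1_star)  # Y/(3*P1)
print(Q2_star)  # 2*Y/(3*P2)
```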

2.2.7 Expenditure Minimization

A rational household will always minimize the expenditure necessary to obtain a certain level of utility, say $U_o$. Let:

$$E = P_1 Q_1 + P_2 Q_2$$

be the (naive) level of expenditure on $Q_1$ and $Q_2$. Given the utility function $U(Q_1, Q_2)$ we obtain the constraint:

$$U_o = U(Q_1, Q_2).$$

Minimizing $E$ subject to this constraint leads to the Lagrangian:

$$L(\lambda, Q_1, Q_2, P_1, P_2, U_o) = P_1 Q_1 + P_2 Q_2 + \lambda(U_o - U(Q_1, Q_2))$$

and the first-order conditions:

$$\frac{\partial L(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, U_o)}{\partial \lambda} = U_o - U(Q_1^*, Q_2^*) = 0$$
$$\frac{\partial L(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, U_o)}{\partial Q_1} = P_1 - \lambda^*\frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_1} = 0$$
$$\frac{\partial L(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, U_o)}{\partial Q_2} = P_2 - \lambda^*\frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_2} = 0.$$

These 3 implicit functions determine the explicit functions:

$$\lambda^* = \lambda^*(P_1, P_2, U_o), \qquad Q_1^* = Q_1^*(P_1, P_2, U_o), \qquad Q_2^* = Q_2^*(P_1, P_2, U_o).$$

Here $Q_1^*(P_1, P_2, U_o)$ and $Q_2^*(P_1, P_2, U_o)$ are the Hicksian demand curves. They differ from the Marshallian demand curves only in that $U_o$, which can be thought of as real income, takes the place of $Y$ (or nominal income) in the Marshallian demand curves.

Substituting $Q_1^*(P_1, P_2, U_o)$ and $Q_2^*(P_1, P_2, U_o)$ into the naive expenditure function $E = P_1 Q_1 + P_2 Q_2$, we obtain the clever expenditure function:

$$E^*(P_1, P_2, U_o) = P_1 Q_1^*(P_1, P_2, U_o) + P_2 Q_2^*(P_1, P_2, U_o).$$

Properties of the Expenditure Function

The naive expenditure function is a linear function of the exogenous variables $P_1$ and $P_2$ when $U_o$ is held fixed, since:

$$E = P_1 Q_1 + P_2 Q_2 = x^T\phi(y)$$

where:

$$x = \begin{bmatrix} P_1 \\ P_2 \end{bmatrix}, \qquad y = \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix}, \qquad \phi(y) = y.$$

From this it follows (see page 26) that:

Theorem 39 $E^*(P_1, P_2, U_o)$ is homogeneous of degree 1 in $P_1$ and $P_2$ when $U_o$ is held fixed; that is:

$$E^*(\lambda P_1, \lambda P_2, U_o) = \lambda E^*(P_1, P_2, U_o).$$

Theorem 40 $E^*(P_1, P_2, U_o)$ is a concave function of $P_1$ and $P_2$ when $U_o$ is held fixed.

Theorem 41 $Q_1^*(P_1, P_2, U_o)$ and $Q_2^*(P_1, P_2, U_o)$ are homogeneous of degree 0 in $P_1$ and $P_2$ when $U_o$ is held fixed. That is:

$$Q_1^*(\lambda P_1, \lambda P_2, U_o) = Q_1^*(P_1, P_2, U_o)$$
$$Q_2^*(\lambda P_1, \lambda P_2, U_o) = Q_2^*(P_1, P_2, U_o).$$

Since the expenditure function is a concave function of $P_1$ and $P_2$ when $U_o$ is held fixed, it follows that the Hessian:

$$H(P_1, P_2, U_o) = \begin{bmatrix} \dfrac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1^2} & \dfrac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2} \\ \dfrac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2} & \dfrac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2^2} \end{bmatrix}$$

is negative semi-definite (see page 52) and consequently it has non-positive diagonal elements, or:

$$\frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1^2} \le 0, \qquad \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2^2} \le 0.$$

The Hicksian Demand Curves are Downward Sloping

The Marshallian demand curves encompass a substitution and an income effect. Since the income effect can be either positive or negative, Marshallian demand curves can slope either upwards or downwards.

We can however use the concavity of $E^*(P_1, P_2, U_o)$ to show that the Hicksian demand curves are always downward sloping. This is because, unlike a Marshallian demand curve, Hicksian demand curves have only a substitution effect, which always leads to downward sloping demand curves.

Using the envelope theorem we can show that:

$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_1} = Q_1^*(P_1, P_2, U_o).$$

Now differentiate both sides with respect to $P_1$ to obtain:

$$\frac{\partial Q_1^*(P_1, P_2, U_o)}{\partial P_1} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1^2} \le 0$$

where the inequality follows from the concavity of the expenditure function.

Similarly from:

$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_2} = Q_2^*(P_1, P_2, U_o)$$

and differentiating both sides with respect to $P_2$, we obtain:

$$\frac{\partial Q_2^*(P_1, P_2, U_o)}{\partial P_2} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2^2} \le 0$$

where the inequality again follows from the concavity of the expenditure function. Thus we see that the Hicksian demand curves are always downward sloping.

A Symmetry Result

The Hicksian demand curves display the same symmetry as the conditional factor demand functions. Differentiate both sides of:

$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_1} = Q_1^*(P_1, P_2, U_o)$$

with respect to $P_2$ to obtain:

$$\frac{\partial Q_1^*(P_1, P_2, U_o)}{\partial P_2} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2 \partial P_1}.$$

Now differentiate both sides of:

$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_2} = Q_2^*(P_1, P_2, U_o)$$

with respect to $P_1$ to obtain:

$$\frac{\partial Q_2^*(P_1, P_2, U_o)}{\partial P_1} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2}.$$

By Young's theorem $\frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2 \partial P_1} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2}$ and so we can conclude that:

$$\frac{\partial Q_1^*(P_1, P_2, U_o)}{\partial P_2} = \frac{\partial Q_2^*(P_1, P_2, U_o)}{\partial P_1}.$$
Chapter 3

Integration and Random Variables

3.1 Introduction

The economic world is a world of uncertainty. To understand models of economic decision making under uncertainty, whether it is insuring a house against fire or picking a portfolio of risky assets, one needs to be able to manipulate variables which take on different outcomes with different probabilities. These variables are known as random variables.

The basic tool for working with a random variable $X$ is the expectations operator $E[X]$. $E[\,]$ in turn is based on either discrete summation, which involves the sigma notation $\sum_{i=1}^{n}$, or continuous summation, which involves integration, $\int_a^b (\,)\, dx$.

Therefore, before studying random variables we will first study discrete summation, $\sum_{i=1}^{n}$, and then continuous summation, $\int_a^b (\,)\, dx$.

3.2 Discrete Summation

3.2.1 Definitions

Given a set of $n$ numbers $X_1, X_2, X_3, \ldots, X_n$ it is natural to want to add them up as:

$$X_1 + X_2 + X_3 + \cdots + X_n.$$

I call this the dot, dot, dot notation. The dot, dot, dot notation is the safest way of working with sums, but also the most cumbersome.


Sums can be represented more compactly using the sigma notation as:

$$\sum_{i=1}^{n} X_i \equiv X_1 + X_2 + X_3 + \cdots + X_n.$$

Here $i$ is the summation index. Underneath the $\sum$ we write $i = 1$ to indicate that we begin the sum with $X_1$. After including $X_1$, we then increase $i$ to $2$ and include $X_2$ in the sum. This process continues until we reach $i = n$, with $X_n$ included in the sum, at which point we stop.

This notation can be modified in a number of ways. For example, to start the summation with $X_5$ and end with $X_{23}$ we would write $\sum_{i=5}^{23} X_i$ or:

$$\sum_{i=5}^{23} X_i = X_5 + X_6 + X_7 + \cdots + X_{23}.$$

The sigma notation is much more efficient than the dot, dot, dot notation, but it is also more dangerous. One strategy for avoiding these dangers when deriving results is to revert to the safe dot, dot, dot notation and then, once we think we have the answer, convert back into the summation notation. Here are some examples.
3.2.2 Example 1: $\sum_{i=1}^{n} aX_i$

Converting to the dot, dot, dot notation we have:

$$\sum_{i=1}^{n} aX_i = aX_1 + aX_2 + aX_3 + \cdots + aX_n = a\underbrace{(X_1 + X_2 + X_3 + \cdots + X_n)}_{=\sum_{i=1}^{n} X_i} = a\sum_{i=1}^{n} X_i.$$

3.2.3 Example 2: $\sum_{i=1}^{n} (aX_i + bY_i)$

Converting to the dot, dot, dot notation we have:

$$\sum_{i=1}^{n} (aX_i + bY_i) = (aX_1 + bY_1) + (aX_2 + bY_2) + \cdots + (aX_n + bY_n) = a(X_1 + \cdots + X_n) + b(Y_1 + \cdots + Y_n) = a\sum_{i=1}^{n} X_i + b\sum_{i=1}^{n} Y_i.$$
3.2.4 Example 3: $\sum_{i=1}^{n} (aX_i + b)$

This one is tricky. It is a great temptation to write $\sum_{i=1}^{n} (aX_i + b) = a\sum_{i=1}^{n} X_i + b$, which is wrong. Instead:

$$\sum_{i=1}^{n} (aX_i + b) = (aX_1 + b) + (aX_2 + b) + \cdots + (aX_n + b) = a(X_1 + X_2 + \cdots + X_n) + \underbrace{(b + b + \cdots + b)}_{n \text{ times}} = a\sum_{i=1}^{n} X_i + nb.$$

3.2.5 Example 4: $\sum_{i=1}^{n} (aX_i + b)^2$

Again converting into the dot, dot, dot notation we have:

$$\sum_{i=1}^{n} (aX_i + b)^2 = (aX_1 + b)^2 + (aX_2 + b)^2 + \cdots + (aX_n + b)^2$$
$$= \left(a^2 X_1^2 + 2abX_1 + b^2\right) + \left(a^2 X_2^2 + 2abX_2 + b^2\right) + \cdots + \left(a^2 X_n^2 + 2abX_n + b^2\right)$$
$$= a^2\left(X_1^2 + \cdots + X_n^2\right) + 2ab(X_1 + \cdots + X_n) + \underbrace{\left(b^2 + \cdots + b^2\right)}_{n \text{ times}} = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i + nb^2.$$

3.2.6 Example 5: $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$

The sample mean $\bar{X}$ is:

$$\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$

and so:

$$\sum_{i=1}^{n} (X_i - \bar{X}) = (X_1 - \bar{X}) + (X_2 - \bar{X}) + \cdots + (X_n - \bar{X})$$
$$= (X_1 + X_2 + \cdots + X_n) - n\bar{X}$$
$$= (X_1 + X_2 + \cdots + X_n) - n \cdot \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$
$$= (X_1 + X_2 + \cdots + X_n) - (X_1 + X_2 + \cdots + X_n) = 0.$$

3.2.7 Manipulating the Sigma Notation

Deriving expressions by converting into the dot, dot, dot notation is quite safe but also very inefficient. A faster but more dangerous approach is to work directly with the sigma notation $\sum_{i=1}^{n}$. There are a small number of rules that need to be mastered in order to be able to do this.
Rules for Manipulating the Summation Operator: $\sum_{i=1}^{n}$

Rule 1: $\sum_{i=1}^{n}$ is a linear operator. That is, if the exponent on the brackets is $1$, so that you have $\sum_{i=1}^{n} (X_i + Y_i)^1$ or equivalently $\sum_{i=1}^{n} (X_i + Y_i)$, then you may take $\sum_{i=1}^{n}$ inside the brackets and apply it to each term inside as:

$$\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i.$$

More generally:

$$\sum_{i=1}^{n} (X_{1i} + X_{2i} + \cdots + X_{mi}) = \sum_{i=1}^{n} X_{1i} + \sum_{i=1}^{n} X_{2i} + \cdots + \sum_{i=1}^{n} X_{mi}.$$

This only works for linear functions! If the exponent on the brackets is not $1$ then this step is not justified. For example:

$$\sum_{i=1}^{n} (X_i + Y_i)^2 \neq \left(\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i\right)^2.$$

In general if $f(x)$ is a nonlinear function then:

$$\sum_{i=1}^{n} f(X_i + Y_i) \neq f\left(\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i\right);$$

for example if $f(x) = e^x = \exp(x)$, which is not linear, then:

$$\sum_{i=1}^{n} \exp(X_i + Y_i) \neq \exp\left(\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i\right).$$
Rule 2: You may always pull $\sum_{i=1}^{n}$ past any expression that does not depend on $i$ (or whatever the summation index is), as long as something remains in front of $\sum_{i=1}^{n}$. For example:

$$\sum_{i=1}^{n} aX_i = a\sum_{i=1}^{n} X_i$$

since $a$ does not have an $i$ subscript and $X_i$ remains in front. However:

$$\sum_{i=1}^{n} a_i X_i \neq a_i\sum_{i=1}^{n} X_i$$

since $a_i$ here has an $i$ subscript, and:

$$\sum_{i=1}^{n} b \neq b\sum_{i=1}^{n}$$

since there always needs to be something in front of $\sum_{i=1}^{n}$.
Rule 3: If in front of $\sum_{i=1}^{n}$ there is nothing which has an $i$ subscript, then simply multiply this by the number of terms being summed. Thus:

$$\sum_{i=1}^{n} b = nb$$

since $n$ terms are being summed, while:

$$\sum_{i=3}^{7} b = 5b$$

since there are 5 terms being summed, that is $i = 3, 4, 5, 6, 7$.


Rule 4: As long as there is no conflict with other definitions you can always change the summation index without affecting the result. For example, if we replace $i$ in $\sum_{i=1}^{n} X_i$ with $j$ we get the same result. That is:

$$\sum_{i=1}^{n} X_i = \sum_{j=1}^{n} X_j.$$

As an exercise, redo Examples 1, 2, 3, 4 and 5 above without converting into the dot, dot, dot notation. Below are some examples of how these rules can be used to derive correct results.
3.2.8 Example 1: $\sum_{i=1}^{n} (aX_i + bY_i)^2$

Since the exponent on the brackets in $\sum_{i=1}^{n} (aX_i + bY_i)^2$ is $2$, by Rule 1 you cannot immediately pull the summation inside the brackets. However, since:

$$(aX_i + bY_i)^2 = \left(a^2 X_i^2 + 2abX_i Y_i + b^2 Y_i^2\right)^1 = \left(a^2 X_i^2 + 2abX_i Y_i + b^2 Y_i^2\right)$$

which now has an exponent of $1$, we can pull the summation sign through the expanded expression as:

$$\sum_{i=1}^{n} (aX_i + bY_i)^2 = \sum_{i=1}^{n} \left(a^2 X_i^2 + 2abX_i Y_i + b^2 Y_i^2\right) = \sum_{i=1}^{n} a^2 X_i^2 + \sum_{i=1}^{n} 2abX_i Y_i + \sum_{i=1}^{n} b^2 Y_i^2.$$

Since $a^2$, $2ab$ and $b^2$ do not depend on $i$, we can by Rule 2 pull $\sum_{i=1}^{n}$ past these constants to obtain:

$$\sum_{i=1}^{n} (aX_i + bY_i)^2 = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i Y_i + b^2\sum_{i=1}^{n} Y_i^2.$$

3.2.9 Example 2: $\sum_{i=1}^{n} (aX_i + b)^2$

This is a special case of Example 1 where $Y_i = 1$ for all $i$. However, it is useful to work it through since it provides an example of Rule 3. Thus by changing the exponent to $1$ we have:

$$\sum_{i=1}^{n} (aX_i + b)^2 = \sum_{i=1}^{n} \left(a^2 X_i^2 + 2abX_i + b^2\right) = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} b^2.$$

Since $b^2$ in $\sum_{i=1}^{n} b^2$ does not have an $i$ subscript, by Rule 3 we multiply $b^2$ by the number of terms being summed, so that:

$$\sum_{i=1}^{n} b^2 = nb^2$$

and hence:

$$\sum_{i=1}^{n} (aX_i + b)^2 = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i + nb^2.$$

3.2.10 Example 3: $\sum_{i=1}^{n} (X_i - \bar{X})^2$

This is a special case of Example 2 with $a = 1$ and $b = -\bar{X}$. However, it is again useful to do it from scratch. The sample mean $\bar{X}$ is:

$$\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j.$$

Since there is an exponent of $2$ on $(X_i - \bar{X})^2$ we cannot pull $\sum_{i=1}^{n}$ through the brackets. However, by expanding and then using Rule 1 we have:

$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} \left(X_i^2 - 2\bar{X}X_i + \bar{X}^2\right) = \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} \left(-2\bar{X}X_i\right) + \sum_{i=1}^{n} \bar{X}^2.$$

Note that in the second sum the term $-2\bar{X}$ does not have an $i$ subscript, and so by Rule 2 we can pull $\sum_{i=1}^{n}$ past it to $X_i$. In the third sum $\bar{X}^2$ does not have an $i$ subscript, so that by Rule 3 we simply multiply it by $n$. We thus have:

$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - 2\bar{X}\sum_{i=1}^{n} X_i + n\bar{X}^2.$$

We can further simplify by noting that:

$$\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j = \frac{1}{n}\sum_{i=1}^{n} X_i$$

since by Rule 4 we can replace the summation index $j$ with $i$ without affecting the result. Therefore:

$$\sum_{i=1}^{n} X_i = n\bar{X}$$

and hence:

$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2.$$

As an exercise prove that:

$$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} (X_i - \bar{X})Y_i = \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}.$$
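These identities are easy to verify numerically. Here is a minimal sketch in Python, using arbitrary made-up data (the data are not from the text):

```python
import random

n = 50
X = [random.gauss(0, 1) for _ in range(n)]
Y = [random.gauss(0, 1) for _ in range(n)]
Xbar = sum(X) / n
Ybar = sum(Y) / n

# sum(Xi - Xbar) = 0 (up to floating-point rounding)
print(abs(sum(x - Xbar for x in X)) < 1e-9)

# sum((Xi - Xbar)^2) = sum(Xi^2) - n*Xbar^2
lhs = sum((x - Xbar) ** 2 for x in X)
rhs = sum(x ** 2 for x in X) - n * Xbar ** 2
print(abs(lhs - rhs) < 1e-9)

# sum((Xi - Xbar)(Yi - Ybar)) = sum(Xi*Yi) - n*Xbar*Ybar
lhs = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y))
rhs = sum(x * y for x, y in zip(X, Y)) - n * Xbar * Ybar
print(abs(lhs - rhs) < 1e-9)
```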

3.3 Integration

3.3.1 The Fundamental Theorem of Integral Calculus

The definite integral:

$$\int_a^b f(x)\, dx$$

is often interpreted as the area under the curve $f(x)$ between $a$ and $b$. Here $x$ is the variable over which we are integrating, while $a$ is the lower limit of the integral and $b$ the upper limit.

From the fundamental theorem of integral calculus we can evaluate integrals by finding the anti-derivative of $f(x)$:

Theorem 42 Fundamental theorem of integral calculus:

$$\int_a^b f(x)\, dx = F(x)\Big|_a^b \equiv F(b) - F(a)$$

where $F(x)$ is the anti-derivative of $f(x)$; that is:

$$F'(x) = f(x).$$

3.3.2 Example 1: $\int_1^{10} x^2\, dx$

If $f(x) = x^2$ then $F(x) = \frac{x^3}{3}$; that is, $F'(x) = \frac{d}{dx}\frac{x^3}{3} = x^2$. Therefore:

$$\int_1^{10} x^2\, dx = \frac{x^3}{3}\Big|_1^{10} = \frac{10^3}{3} - \frac{1^3}{3} = 333.$$
3.3.3 Example 2: $\int_0^5 e^{-x}\, dx$

If $f(x) = e^{-x}$ then $F(x) = -e^{-x}$ and:

$$\int_0^5 e^{-x}\, dx = -e^{-x}\Big|_0^5 = 1 - e^{-5}.$$

In general we have that:

$$\int_0^n e^{-x}\, dx = -e^{-x}\Big|_0^n = 1 - e^{-n}.$$

Note that $\lim_{n\to\infty} e^{-n} = 0$. We can then let $n = \infty$ and obtain:

$$\int_0^\infty e^{-x}\, dx = -e^{-x}\Big|_0^\infty = 1 - e^{-\infty} = 1.$$

3.3.4 Integration by Parts

One of the more important tricks for evaluating integrals is integration by parts. This is the integral calculus analogue of the product rule in differential calculus.

Suppose we have an integral:

$$\int_a^b f(x)\, dx$$

and we can write $f(x)$ as:

$$f(x) = u(x) \times v'(x)$$

where we know the anti-derivative of $v'(x)$, which is $v(x)$. Then:

Theorem 43 Integration by Parts: If $f(x) = u(x) \times v'(x)$ then:

$$\int_a^b f(x)\, dx = \int_a^b u(x) \times v'(x)\, dx = u(x)v(x)\Big|_a^b - \int_a^b u'(x)v(x)\, dx.$$

3.3.5 Example 1: $\int_0^4 xe^{-x}\, dx$

Suppose we wish to evaluate the integral:

$$\int_0^4 xe^{-x}\, dx.$$

Here it is not obvious what the anti-derivative of $xe^{-x}$ is. However, the anti-derivative of $e^{-x}$ is just $-e^{-x}$, so set:

$$u(x) = x, \qquad v'(x) = e^{-x}$$
$$u'(x) = 1, \qquad v(x) = -e^{-x}.$$

Using integration by parts we then have:

$$\int_0^4 xe^{-x}\, dx = x \times \left(-e^{-x}\right)\Big|_0^4 - \int_0^4 (1)\left(-e^{-x}\right) dx = -4e^{-4} + \int_0^4 e^{-x}\, dx = -4e^{-4} - e^{-4} + 1 = 1 - 5e^{-4} = 0.9084218.$$
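The result can be checked numerically. A minimal sketch using scipy's quad routine (scipy availability is an assumption of this example, not something the text relies on):

```python
import math
from scipy.integrate import quad

# Numerically integrate x*e^{-x} over [0, 4] and compare with 1 - 5e^{-4}.
value, abs_error = quad(lambda x: x * math.exp(-x), 0, 4)
print(value)                 # approximately 0.9084218
print(1 - 5 * math.exp(-4))  # the closed-form answer from integration by parts
```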

3.3.6 Example 2: $\Gamma(n)$

Consider the gamma function defined by:

$$\Gamma(n) = \int_0^\infty x^{n-1} e^{-x}\, dx.$$

This turns out to be a very important function in econometrics. For $n = 1$ we have:

$$\Gamma(1) = \int_0^\infty e^{-x}\, dx = 1.$$

There turns out to be a close connection between the gamma function and factorials like $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$. This is based on the following result:

Theorem 44 The Recursive Property of the Gamma function:

$$\Gamma(n) = (n-1)\Gamma(n-1).$$

The proof uses integration by parts with:

$$u(x) = x^{n-1}, \qquad v'(x) = e^{-x}$$
$$u'(x) = (n-1)x^{n-2}, \qquad v(x) = -e^{-x}.$$

Then:

$$\Gamma(n) = \int_0^\infty x^{n-1} e^{-x}\, dx = -x^{n-1}e^{-x}\Big|_0^\infty + (n-1)\int_0^\infty x^{n-2} e^{-x}\, dx.$$

However it can be shown that:

$$-x^{n-1}e^{-x}\Big|_0^\infty = 0$$

and by definition:

$$\int_0^\infty x^{n-2} e^{-x}\, dx \equiv \Gamma(n-1)$$

so we have:

$$\Gamma(n) = (n-1)\Gamma(n-1).$$

Starting with $\Gamma(1) = 1 = 0!$ and using this property of $\Gamma(n)$ we have:

$$\Gamma(1) = 1 = 0!$$
$$\Gamma(2) = (2-1) \times \Gamma(1) = 1 = 1!$$
$$\Gamma(3) = (3-1) \times \Gamma(2) = 2 = 2!$$
$$\Gamma(4) = (4-1) \times \Gamma(3) = 6 = 3!$$
$$\Gamma(5) = (5-1) \times \Gamma(4) = 24 = 4!$$

and in general, as long as $n$ is an integer:

$$\Gamma(n) = (n-1)!$$

We can also use the gamma function to define non-integer factorials. For example it can be shown that:

$$\Gamma\left(\frac{1}{2}\right) = \left(-\frac{1}{2}\right)! = \sqrt{\pi}$$
$$\Gamma\left(\frac{3}{2}\right) = \frac{1}{2}\Gamma\left(\frac{1}{2}\right) = \left(\frac{1}{2}\right)! = \frac{\sqrt{\pi}}{2}$$
$$\Gamma\left(\frac{5}{2}\right) = \frac{3}{2}\Gamma\left(\frac{3}{2}\right) = \left(\frac{3}{2}\right)! = \frac{3\sqrt{\pi}}{4}$$

etc.
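Python's standard library exposes the gamma function directly, so these values are easy to confirm; a minimal sketch:

```python
import math

# Check the recursive/factorial property: Gamma(n) == (n-1)!
for n in range(1, 6):
    print(n, math.gamma(n), math.factorial(n - 1))

# Non-integer arguments: Gamma(1/2) = sqrt(pi), Gamma(3/2) = sqrt(pi)/2, ...
print(math.gamma(0.5), math.sqrt(math.pi))
print(math.gamma(1.5), math.sqrt(math.pi) / 2)
print(math.gamma(2.5), 3 * math.sqrt(math.pi) / 4)
```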
3.3.7 Integration as Summation

Integration and summation are very closely related concepts. In fact it is often useful to think of $\int_a^b f(x)\, dx$ as the summation of $f(x)\, dx$ as $x$ goes from $a$ to $b$, just as $\sum_{i=1}^{n} f(X_i)$ is a summation of $f(X_i)$ as $i$ goes from $1$ to $n$. The symbol $\sum$ is sigma, the Greek S, where S here stands for sum. The symbol $\int$ also looks like a (somewhat squashed) S.

An integral is the limiting form of a sum. If you let:

$$dx_n = \frac{(b-a)}{n}$$

then:

$$\int_a^b f(x)\, dx = \lim_{n\to\infty} \sum_{i=0}^{n} f(a + dx_n i)\, dx_n = \lim_{n\to\infty} \frac{(b-a)}{n}\sum_{i=0}^{n} f\left(a + \frac{i(b-a)}{n}\right).$$

Consequently if $n$ is large we can approximate an integral by a sum as:

$$\int_a^b f(x)\, dx \approx \frac{(b-a)}{n}\sum_{i=0}^{n} f\left(a + \frac{i(b-a)}{n}\right).$$

As $n \to \infty$ think of $dx_n$ approaching an infinitely small number $dx$ that is not quite zero. The integral $\int_a^b f(x)\, dx$ then can be thought of as summing an infinite number of infinitely small $f(x)\, dx$'s as $x$ goes from $a$ to $b$.

An Example

We showed that:

$$\int_1^{10} x^2\, dx = 333.$$

Here $a = 1$ and $b = 10$. If we set $n = 9$ in the approximation we have:

$$dx_n = \frac{(b-a)}{n} = \frac{10-1}{9} = 1$$

and so we obtain the approximation:

$$\int_1^{10} x^2\, dx \approx dx_n\sum_{i=0}^{9} (1 + dx_n i)^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 + 7^2 + 8^2 + 9^2 + 10^2 = 385.$$

The error, the difference between $333$, the exact value of $\int_1^{10} x^2\, dx$, and the approximate value of $385$, can be reduced by increasing $n$.

If we let $n = 18$ in the approximation we have:

$$dx_n = \frac{(b-a)}{n} = \frac{10-1}{18} = \frac{1}{2}$$

so that:

$$\int_1^{10} x^2\, dx \approx \frac{1}{2}\sum_{i=0}^{18}\left(1 + \frac{1}{2}i\right)^2 = \frac{1}{2}\left(1^2 + 1.5^2 + 2^2 + 2.5^2 + \cdots + 9.5^2 + 10^2\right) = 358.625.$$

Using a computer we can make $n$ much larger. If $n = 1000$, so that:

$$dx_n = \frac{10-1}{1000} = \frac{9}{1000}$$

we have:

$$\int_1^{10} x^2\, dx \approx \frac{9}{1000}\sum_{i=0}^{1000}\left(1 + \frac{9}{1000}i\right)^2 = \frac{9}{1000}\left(1^2 + 1.009^2 + 1.018^2 + \cdots + 9.991^2 + 10^2\right) = 333.4546.$$

Finally if $n = 10000$, so that:

$$dx_n = \frac{10-1}{10000} = \frac{9}{10000}$$

we have:

$$\int_1^{10} x^2\, dx \approx \frac{9}{10000}\sum_{i=0}^{10000}\left(1 + \frac{9}{10000}i\right)^2 = 333.0455.$$

Each time $n$ is made larger the approximation gets better.
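This approximation scheme is easy to program. A minimal sketch in Python (the function name is our own):

```python
def riemann_approx(f, a, b, n):
    """Approximate the integral of f over [a, b] by the sum
    (b - a)/n * sum of f(a + i*(b - a)/n) for i = 0, ..., n."""
    dx = (b - a) / n
    return dx * sum(f(a + i * dx) for i in range(n + 1))

# Reproduce the approximations above for the integral of x^2 over [1, 10]:
for n in (9, 18, 1000, 10000):
    print(n, riemann_approx(lambda x: x ** 2, 1, 10, n))
# 9 -> 385.0, 18 -> 358.625, 1000 -> 333.45..., 10000 -> 333.04...
```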


3.3.8 $\int_a^b dx$ as a Linear Operator

Since $\int_a^b dx$ is a limiting form of $\sum_{i=1}^{n}$, and since $\sum_{i=1}^{n}$ is a linear operator, it is not surprising that $\int_a^b dx$ is also a linear operator. We can therefore manipulate expressions involving $\int_a^b dx$ using rules very similar to those for $\sum_{i=1}^{n}$.

Rules for Manipulating the Integration Operator: $\int_a^b dx$

Rule 1: $\int_a^b dx$ is a linear operator. That is, if the exponent on the brackets is $1$, so that you have $\int_a^b (f(x) + g(x))^1\, dx$ or equivalently $\int_a^b (f(x) + g(x))\, dx$, then you may take $\int_a^b dx$ inside the brackets and apply it to each term inside as:

$$\int_a^b (f(x) + g(x))\, dx = \int_a^b f(x)\, dx + \int_a^b g(x)\, dx.$$

If the exponent on the brackets is not $1$ then this step is not justified. For example:

$$\int_a^b (f(x) + g(x))^2\, dx \neq \left(\int_a^b f(x)\, dx + \int_a^b g(x)\, dx\right)^2.$$

Rule 2: You may pull $\int_a^b$ past any expression that does not depend on $x$ (or whatever the integration variable is), but do not go past the $dx$ term. Thus:

$$\int_a^b cf(x)\, dx = c\int_a^b f(x)\, dx$$

since $c$ does not depend on $x$.

Rule 3: If when moving over $\int_a^b$ you reach $dx$, then set $\int_a^b dx = b - a$. Thus for example:

$$\int_3^{10} dx = 10 - 3 = 7.$$

Rule 4: As long as it does not conflict with other notation, you can change the integration variable without affecting the expression. Thus:

$$\int_a^b f(x)\, dx = \int_a^b f(y)\, dy.$$

Here are some examples:


3.3.9 Example 1: $\int_a^b (cf(x) + dg(x))^2\, dx$

Before we can take $\int_a^b$ through the brackets we need to expand the square to get an exponent of $1$. Thus:

$$\int_a^b (cf(x) + dg(x))^2\, dx = \int_a^b \left(c^2 f(x)^2 + 2cd f(x)g(x) + d^2 g(x)^2\right) dx.$$

Using Rule 1 and pulling $\int_a^b dx$ through the brackets and applying it to each term we obtain:

$$\int_a^b (cf(x) + dg(x))^2\, dx = \int_a^b c^2 f(x)^2\, dx + \int_a^b 2cd f(x)g(x)\, dx + \int_a^b d^2 g(x)^2\, dx.$$

By Rule 2 we can take $\int_a^b dx$ past the terms not depending on $x$, that is $c^2$, $2cd$ and $d^2$, to get:

$$\int_a^b (cf(x) + dg(x))^2\, dx = c^2\int_a^b f(x)^2\, dx + 2cd\int_a^b f(x)g(x)\, dx + d^2\int_a^b g(x)^2\, dx.$$

3.3.10 Example 2: $\int_a^b (f(x) + d)^2\, dx$

Again squaring the brackets to get an exponent of $1$ we obtain:

$$\int_a^b (f(x) + d)^2\, dx = \int_a^b \left(f(x)^2 + 2df(x) + d^2\right) dx.$$

Now that the brackets have an exponent of $1$ we can pull $\int_a^b dx$ through the brackets to get:

$$\int_a^b (f(x) + d)^2\, dx = \int_a^b f(x)^2\, dx + 2d\int_a^b f(x)\, dx + d^2\int_a^b dx.$$

By Rule 3 the term $\int_a^b dx$ becomes $b - a$ and so we have:

$$\int_a^b (f(x) + d)^2\, dx = \int_a^b f(x)^2\, dx + 2d\int_a^b f(x)\, dx + d^2(b - a).$$
a a a

3.4 Random Variables

3.4.1 Introduction

Up until now all variables we have considered are nonrandom. A random variable is a new kind of variable, one that, roughly speaking, takes on different values with different probabilities.

There are two basic kinds of random variables: 1) discrete random variables, where we can make a numbered list of the outcomes, and 2) continuous random variables, where the outcomes are all possible values along the real line.

3.4.2 Discrete Random Variables

A discrete random variable $X$ has $n$ possible outcomes $x_1, x_2, \ldots, x_n$, each associated with a probability $p_1, p_2, \ldots, p_n$. Note the convention of using upper case letters to denote the random variable, here $X$, and lower case letters to denote possible outcomes, here $x_1, x_2, \ldots, x_n$.

Definition 45 A discrete random variable is a list of outcomes and associated probabilities given as:

$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ x_1 & p_1 \\ x_2 & p_2 \\ \vdots & \vdots \\ x_n & p_n \end{bmatrix}$$

and where the probabilities satisfy:

1. $0 \le p_i \le 1$ for $i = 1, 2, \ldots, n$
2. $\sum_{i=1}^{n} p_i = p_1 + p_2 + \cdots + p_n = 1$.

For example if $X$ is the outcome of rolling a fair die, the outcomes are $1, 2, 3, 4, 5$ and $6$, each with probability $\frac{1}{6}$, so that:

$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1 & 1/6 \\ 2 & 1/6 \\ 3 & 1/6 \\ 4 & 1/6 \\ 5 & 1/6 \\ 6 & 1/6 \end{bmatrix}$$

Given any function $f(x)$ and random variable $X$ you can always create a new random variable $f(X)$ by applying $f(x)$ to each of the outcomes. For example if $X$ is the die roll above and $f(x) = x^2$ then we simply square all the outcomes to obtain:

$$X^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1^2 & 1/6 \\ 2^2 & 1/6 \\ 3^2 & 1/6 \\ 4^2 & 1/6 \\ 5^2 & 1/6 \\ 6^2 & 1/6 \end{bmatrix}$$

Note that it is the outcomes and not the probabilities that get squared!

Degenerate Random Variables

Definition 46 If a random variable has only one outcome that it takes on with probability $1$ then we say that $X$ is a degenerate random variable.

For example if $X = 3$ with probability $1$, or:

$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 3 & 1 \end{bmatrix}$$

then $X$ is a degenerate random variable.

3.4.3 The Bernoulli Distribution

Definition 47 A random variable $X$ which takes on two values, $1$ and $0$, with probabilities $p$ and $1-p$ where $0 \le p \le 1$, is said to have a Bernoulli distribution, or:

$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1 & p \\ 0 & 1-p \end{bmatrix}$$

If $X$ has a Bernoulli distribution we write $X \sim B(p)$.

For example suppose that 30 percent of the population supports the Gamma Party. If I take one person at random and ask him if he supports the Gamma Party, and let $X = 1$ if he says yes and $X = 0$ if he says no, then $X$ has a Bernoulli distribution with probability $p = 0.3$.

3.4.4 The Binomial Distribution

Definition 48 Suppose $X_i \sim B(p)$ are $n$ independent Bernoullis and define $Y$ as their sum:

$$Y = X_1 + X_2 + X_3 + \cdots + X_n.$$

Then $Y$ is said to have a binomial distribution, denoted as $Y \sim B(p, n)$.

Continuing the Gamma Party example above, if I ask $n = 2000$ people taken randomly from a population, then $Y$ is the number of people in my sample who say they support the Gamma Party, or $Y \sim B(0.3, 2000)$.

If $Y \sim B(p, n)$ then there are $n + 1$ possible outcomes $k = 0, 1, 2, \ldots, n$ and it can be shown that:

$$\Pr[Y = k] = \frac{n!}{k!(n-k)!}\, p^k (1-p)^{n-k}, \qquad k = 0, 1, 2, \ldots, n$$

or:

$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & \frac{n!}{0!(n-0)!} p^0 (1-p)^{n-0} \\ 1 & \frac{n!}{1!(n-1)!} p^1 (1-p)^{n-1} \\ 2 & \frac{n!}{2!(n-2)!} p^2 (1-p)^{n-2} \\ \vdots & \vdots \\ n & \frac{n!}{n!(n-n)!} p^n (1-p)^{n-n} \end{bmatrix}$$

For example given a one-third probability of success, or $p = \frac{1}{3}$, and four trials, or $n = 4$, so that $Y \sim B\left(\frac{1}{3}, 4\right)$, we have:

$$\Pr[Y = 0] = \frac{4!}{0!(4-0)!}\left(\frac{1}{3}\right)^0\left(1 - \frac{1}{3}\right)^{4-0} = \frac{16}{81}$$
$$\Pr[Y = 1] = \frac{4!}{1!(4-1)!}\left(\frac{1}{3}\right)^1\left(1 - \frac{1}{3}\right)^{4-1} = \frac{32}{81}$$
$$\Pr[Y = 2] = \frac{4!}{2!(4-2)!}\left(\frac{1}{3}\right)^2\left(1 - \frac{1}{3}\right)^{4-2} = \frac{24}{81}$$
$$\Pr[Y = 3] = \frac{4!}{3!(4-3)!}\left(\frac{1}{3}\right)^3\left(1 - \frac{1}{3}\right)^{4-3} = \frac{8}{81}$$
$$\Pr[Y = 4] = \frac{4!}{4!(4-4)!}\left(\frac{1}{3}\right)^4\left(1 - \frac{1}{3}\right)^{4-4} = \frac{1}{81}$$

or:

$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & 16/81 \\ 1 & 32/81 \\ 2 & 24/81 \\ 3 & 8/81 \\ 4 & 1/81 \end{bmatrix}$$

Verify that the probabilities sum to one.
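These probabilities can be computed exactly. A minimal sketch in Python using exact rational arithmetic:

```python
from math import comb
from fractions import Fraction

# Binomial probabilities for Y ~ B(1/3, 4), kept as exact fractions.
p, n = Fraction(1, 3), 4
probs = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
print(probs)       # [16/81, 32/81, 24/81, 8/81, 1/81]
print(sum(probs))  # 1
```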

3.4.5 The Poisson Distribution

The Poisson distribution is an example of a discrete random variable with an infinite number of possible outcomes.

Definition 49 We say that $Y$ has a Poisson distribution with mean parameter $\mu > 0$, or $Y \sim P(\mu)$, if:

$$\Pr[Y = k] = e^{-\mu}\frac{\mu^k}{k!}, \qquad k = 0, 1, 2, \ldots, \infty.$$

Alternatively we can represent a Poisson random variable as:

$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & e^{-\mu}\mu^0/0! \\ 1 & e^{-\mu}\mu^1/1! \\ 2 & e^{-\mu}\mu^2/2! \\ 3 & e^{-\mu}\mu^3/3! \\ \vdots & \vdots \end{bmatrix}$$

You can verify that the probabilities sum to one from the Taylor series $e^\mu = \sum_{k=0}^{\infty} \frac{\mu^k}{k!}$, so that:

$$\sum_{k=0}^{\infty} \Pr[Y = k] = \sum_{k=0}^{\infty} e^{-\mu}\frac{\mu^k}{k!} = e^{-\mu}\sum_{k=0}^{\infty} \frac{\mu^k}{k!} = e^{-\mu}e^{\mu} = 1.$$

We have the following relationship between the Poisson and the binomial distribution, sometimes referred to as the Law of Small Numbers:

Theorem 50 If $Y \sim B(p, n)$ has a binomial distribution and $n \to \infty$, $p \to 0$ but $np \to \mu > 0$, then $Y$ has an approximate Poisson distribution, or $Y \sim P(\mu)$ as $n \to \infty$.

For example, consider an intersection where $n = 1000$ cars pass through each year. The probability of any one car having an accident is very small, say $p = \frac{1}{250}$, and let us assume it is independent of other cars. If $Y$ is then the total number of accidents in the intersection, then:

$$Y \sim B\left(\frac{1}{250}, 1000\right).$$

Therefore:

$$\Pr[Y = 0] = \frac{1000!}{0! \times 1000!}\left(\frac{1}{250}\right)^0\left(1 - \frac{1}{250}\right)^{1000} = 0.01817$$
$$\Pr[Y = 1] = \frac{1000!}{1! \times 999!}\left(\frac{1}{250}\right)^1\left(1 - \frac{1}{250}\right)^{999} = 0.07296$$
$$\Pr[Y = 2] = \frac{1000!}{2! \times 998!}\left(\frac{1}{250}\right)^2\left(1 - \frac{1}{250}\right)^{998} = 0.14638$$
$$\Pr[Y = 3] = \frac{1000!}{3! \times 997!}\left(\frac{1}{250}\right)^3\left(1 - \frac{1}{250}\right)^{997} = 0.19556$$
$$\Pr[Y = 4] = \frac{1000!}{4! \times 996!}\left(\frac{1}{250}\right)^4\left(1 - \frac{1}{250}\right)^{996} = 0.19576$$
$$\Pr[Y = 5] = \frac{1000!}{5! \times 995!}\left(\frac{1}{250}\right)^5\left(1 - \frac{1}{250}\right)^{995} = 0.15661$$
$$\vdots$$

If instead we work with the Poisson distribution with $\mu = \frac{1}{250} \times 1000 = 4$, or:

$$Y \sim P(4),$$
we obtain a very good approximation. Thus:

$$\Pr[Y = 0] \approx e^{-4}\frac{4^0}{0!} = 0.01832$$
$$\Pr[Y = 1] \approx e^{-4}\frac{4^1}{1!} = 0.07326$$
$$\Pr[Y = 2] \approx e^{-4}\frac{4^2}{2!} = 0.14653$$
$$\Pr[Y = 3] \approx e^{-4}\frac{4^3}{3!} = 0.19537$$
$$\Pr[Y = 4] \approx e^{-4}\frac{4^4}{4!} = 0.19537$$
$$\Pr[Y = 5] \approx e^{-4}\frac{4^5}{5!} = 0.15629$$
$$\vdots$$

and so the probability of 3 accidents occurring in one year is $0.19556$ using the exact binomial distribution and $0.19537$ using the Poisson approximation.
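The comparison is easy to reproduce. A minimal sketch in Python:

```python
import math

n, p = 1000, 1 / 250
mu = n * p  # = 4

# Exact binomial probabilities versus the Poisson approximation.
for k in range(6):
    binom = math.comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = math.exp(-mu) * mu**k / math.factorial(k)
    print(k, round(binom, 5), round(poisson, 5))
```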

3.4.6 Continuous Random Variables

A continuous random variable $X$ is one where the possible outcomes may fall anywhere along the real line: $-\infty < x < \infty$. It turns out that the infinity of the real line is greater than the infinity of the number of integers, which means that continuous random variables have to be handled differently than discrete random variables.¹

Definition 51 A continuous random variable $X$ is one where the possible outcomes may fall anywhere along the real line, or $-\infty < x < \infty$. Associated with each outcome $x$ is a probability density $p(x)$ which has the following properties:

1. $p(x) \ge 0$ for all $-\infty < x < \infty$
2. $\int_{-\infty}^{\infty} p(x)\, dx = 1$
3. $\Pr[a \le X \le b] = \int_a^b p(x)\, dx$.

Thus the area under the density between $a$ and $b$ gives the probability that $X$ will fall in that interval.

¹ For example, it turns out that the probability that $X$ takes on any particular value $x$ must be zero!

3.4.7 The Uniform Distribution

Definition 52 We say that the random variable $X$ has a uniform distribution with lower bound $a$ and upper bound $b$, or $X \sim U(a, b)$, if:

$$p(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{for } x < a \text{ or } x > b. \end{cases}$$

Note that this satisfies the requirements for a density. In particular $p(x) \ge 0$ and $\int_{-\infty}^{\infty} p(x)\, dx = 1$ since:

$$\int_{-\infty}^{\infty} p(x)\, dx = \int_a^b \frac{1}{b-a}\, dx = \frac{1}{b-a}\int_a^b dx = \frac{1}{b-a} \times (b-a) = 1.$$

For example, imagine picking a number between $0$ and $10$, where each number has an equal likelihood of being picked, and calling this $X$. Then $X$ has a uniform distribution with lower bound $0$ and upper bound $10$, or:

$$X \sim U(0, 10).$$

Given that $X \sim U(0, 10)$ we have, for $0 \le x \le 10$:

$$p(x) = \frac{1}{10 - 0} = \frac{1}{10}$$

and the probability of picking a number between $5$ and $7$ would be:

$$\Pr[5 \le X \le 7] = \int_5^7 p(x)\, dx = \int_5^7 \frac{1}{10}\, dx = \frac{1}{10}\int_5^7 dx = \frac{1}{10}(7 - 5) = \frac{1}{5}.$$

It is possible for $p(x) > 1$. For example, with the uniform distribution when $a = 0$ and $b = \frac{1}{2}$ we have:

$$p(x) = \frac{1}{\frac{1}{2} - 0} = 2 > 1$$

for $0 < x < \frac{1}{2}$. Remember $p(x)$ is not a probability, and so there is nothing paradoxical about this.
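The probability $\Pr[5 \le X \le 7] = \frac{1}{5}$ can also be estimated by simulation. A minimal sketch in Python:

```python
import random

# Monte Carlo check that Pr[5 <= X <= 7] = 1/5 for X ~ U(0, 10).
trials = 100_000
hits = sum(1 for _ in range(trials) if 5 <= random.uniform(0, 10) <= 7)
print(hits / trials)  # approximately 0.2
```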
3.4.8 The Standard Normal Distribution

Definition 53 We say that the continuous random variable $Z$ has a standard normal distribution, or $Z \sim N[0, 1]$, if it has the density:

$$p(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \qquad -\infty < z < \infty.$$

A little reflection will show that $p(z)$ has a global maximum at $z = 0$ and that $p(-z) = p(z)$, so that it is symmetric around $0$. This is confirmed by the plot of the density:

[Figure: The Standard Normal Density]

3.4.9 The Normal Distribution

Definition 54 If a random variable $X$ can be written as $X = \mu + \sigma Z$, where $\mu$ and $\sigma$ are nonrandom constants and $Z \sim N[0, 1]$, then we say that $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, or $X \sim N[\mu, \sigma^2]$.

It can be shown that the normal distribution has density:

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty.$$

For example if $\mu = 65$ and $\sigma = 15$ we have:

$$p(x) = \frac{1}{\sqrt{2\pi \cdot 15^2}}\, e^{-\frac{(x-65)^2}{2 \times 15^2}}$$

which is plotted below:

[Figure: Normal Density with $\mu = 65$ and $\sigma = 15$]

3.4.10 The Chi-Squared Distribution

Definition 55 If $Y = Z_1^2 + Z_2^2 + \cdots + Z_r^2$, where the $Z_i$'s are independent standard normals, then $Y$ is said to have a chi-squared distribution with $r$ degrees of freedom, or $Y \sim \chi^2_r$.

The chi-squared distribution is very important in econometrics. For example $s^2$, the sample variance, turns out to have a chi-squared distribution.

It turns out that if $Y \sim \chi^2_r$ then $Y$ has density:

$$p(y) = \frac{y^{\frac{r}{2}-1}\, e^{-y/2}}{2^{r/2}\,\Gamma\left(\frac{r}{2}\right)}, \qquad 0 \le y < \infty.$$

For example if $r = 3$:

$$p(y) = \frac{y^{\frac{3}{2}-1}\, e^{-y/2}}{2^{3/2}\,\Gamma\left(\frac{3}{2}\right)} = \frac{1}{\sqrt{2\pi}}\,\sqrt{y}\, e^{-\frac{1}{2}y}$$

which is plotted below:

[Figure: Chi-squared density with $r = 3$]

3.4.11 The Student's t Distribution

Definition 56 If $Z$ has a standard normal distribution and $Y$ has a chi-squared distribution with $r$ degrees of freedom, and if $Z$ and $Y$ are independent, then:

$$X = \frac{Z}{\sqrt{\frac{Y}{r}}}$$

is said to have a Student's t distribution with $r$ degrees of freedom, which is denoted as $X \sim T_r$.

Again, this is an important distribution in econometrics. The t distribution has the same bell shape as the standard normal distribution. In fact as $r \to \infty$ it turns out that the t distribution approaches a standard normal.

It can be shown that if $X \sim T_r$ then $X$ has density:

$$p(x) = \frac{\Gamma\left(\frac{r+1}{2}\right)}{\sqrt{r}\,\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{r}{2}\right)}\left(1 + \frac{x^2}{r}\right)^{-\frac{r+1}{2}}, \qquad -\infty < x < \infty.$$

For example if $r = 7$:

$$p(x) = \frac{\Gamma(4)}{\sqrt{7}\,\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{7}{2}\right)}\left(1 + \frac{x^2}{7}\right)^{-4} = \frac{16}{5\pi\sqrt{7}}\,\frac{1}{\left(1 + \frac{x^2}{7}\right)^4}$$

which is plotted below:

[Figure: Student's t distribution with $r = 7$]

3.4.12 The F Distribution

Definition 57 If $Y_1 \sim \chi^2_r$ has a chi-squared distribution with $r$ degrees of freedom and $Y_2 \sim \chi^2_s$ has a chi-squared distribution with $s$ degrees of freedom, and if $Y_1$ and $Y_2$ are independent, then the ratio:

$$Y = \frac{Y_1/r}{Y_2/s}$$

has an F distribution with $r$ degrees of freedom in the numerator and $s$ degrees of freedom in the denominator. This is denoted as $Y \sim F_{r,s}$.

Again this is an important distribution in econometrics. Often, for example, hypothesis tests lead to test statistics with an F distribution.

The density for an F distribution can be written as:

$$p(y) = \frac{\Gamma\left(\frac{r+s}{2}\right)}{\Gamma\left(\frac{r}{2}\right)\Gamma\left(\frac{s}{2}\right)}\left(\frac{r}{s}\right)^{r/2}\frac{y^{r/2-1}}{\left(1 + \frac{ry}{s}\right)^{\frac{r+s}{2}}}, \qquad 0 \le y < \infty.$$

For example if $r = 3$ and $s = 7$ then the density is:

$$p(y) = \frac{\Gamma(5)}{\Gamma\left(\frac{3}{2}\right)\Gamma\left(\frac{7}{2}\right)}\left(\frac{3}{7}\right)^{3/2}\frac{y^{3/2-1}}{\left(1 + \frac{3y}{7}\right)^5}$$

which is plotted below:

[Figure: F distribution with $r = 3$ and $s = 7$]

3.5 Expected Values

Suppose the random variable $X$ is, say, the growth of GNP next year and that:

$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1\% & 1/4 \\ 2\% & 1/2 \\ 3\% & 1/4 \end{bmatrix}$$

Your job is to forecast what GNP will be. To do this you must pick some number, which will be your forecast, as your best guess of what GNP will be. One way of doing this is to use the expected value of $X$, written as $E[X]$ or $\mu$ or $\mu_X$. In the above example $\mu = \mu_X = E[X] = 2\%$.

This concept turns out to be fundamental for the study of random variables. We will first discuss it for discrete random variables and then for continuous random variables.

3.5.1 $E[X]$ for Discrete Random Variables

Definition 58 The expectation of a discrete random variable $X$ is:

$$E[X] = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n = \sum_{i=1}^{n} x_i p_i.$$

The recipe for calculating $E[X]$ given the outcomes $x_1, x_2, \ldots, x_n$ and probabilities $p_1, p_2, \ldots, p_n$ is as follows:

Recipe for Calculating $E[X]$

1. Take each outcome $x_i$
2. Weight it by its probability: $x_i \times p_i$
3. And sum to get: $E[X] = \sum_{i=1}^{n} x_i p_i$.

This may be represented as:

$$\begin{bmatrix} \text{Outcomes} & \text{Probabilities} & \text{Product} \\ x_1 & p_1 & x_1 \times p_1 \\ x_2 & p_2 & x_2 \times p_2 \\ x_3 & p_3 & x_3 \times p_3 \\ \vdots & \vdots & \vdots \\ x_n & p_n & x_n \times p_n \\ & & E[X] \equiv \sum_{i=1}^{n} x_i \times p_i \end{bmatrix}$$

Note that $E[X]$, unlike $X$, is not random.


More generally we can take the expectation of a function of a random variable. Given the function $f(x)$ we have a new random variable $Y = f(X)$ with outcomes $f(x_1), f(x_2), \ldots, f(x_n)$ and associated probabilities $p_1, p_2, \ldots, p_n$, in which case:

Definition 59 Given a discrete random variable $X$ then $E[f(X)]$ is defined as:

$$E[f(X)] = f(x_1)p_1 + f(x_2)p_2 + \cdots + f(x_n)p_n = \sum_{i=1}^{n} f(x_i)p_i.$$

This may be represented as:

$$\begin{bmatrix} \text{Outcomes} & \text{Probabilities} & \text{Product} \\ f(x_1) & p_1 & f(x_1) \times p_1 \\ f(x_2) & p_2 & f(x_2) \times p_2 \\ f(x_3) & p_3 & f(x_3) \times p_3 \\ \vdots & \vdots & \vdots \\ f(x_n) & p_n & f(x_n) \times p_n \\ & & E[f(X)] \equiv \sum_{i=1}^{n} f(x_i) \times p_i \end{bmatrix}$$

One of the things you must be careful about is not confusing $E[f(X)]$ with $f(E[X])$. In general the two are different, except for linear functions. This is because, like $\sum$ and $\int$, it turns out that $E[\,]$ is a linear operator.

We will illustrate this difference by comparing $E[X]^2$ with $E[X^2]$ for each of the random variables considered below. In general we will see that:

$$E[X^2] \ge E[X]^2$$

with equality only when $X$ is a degenerate random variable; that is, when it takes on only one value with probability $1$.
3.5.2 Example 1: Die Roll

Let the random variable $X$ be the outcome of rolling a fair die, so that $X$ is defined as:

$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1 & 1/6 \\ 2 & 1/6 \\ 3 & 1/6 \\ 4 & 1/6 \\ 5 & 1/6 \\ 6 & 1/6 \end{bmatrix}$$

Then by weighting each outcome by its probability we obtain:

$$E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = \frac{7}{2} = 3.5.$$

If $f(x) = x^2$ then we have:

$$X^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1^2 & 1/6 \\ 2^2 & 1/6 \\ 3^2 & 1/6 \\ 4^2 & 1/6 \\ 5^2 & 1/6 \\ 6^2 & 1/6 \end{bmatrix}$$

so that weighting each outcome by its probability we obtain:

$$E[X^2] = 1^2 \times \frac{1}{6} + 2^2 \times \frac{1}{6} + 3^2 \times \frac{1}{6} + 4^2 \times \frac{1}{6} + 5^2 \times \frac{1}{6} + 6^2 \times \frac{1}{6} = \frac{91}{6} = 15.167.$$

We thus have:

$$15.167 = \frac{91}{6} = E[X^2] > E[X]^2 = \left(\frac{7}{2}\right)^2 = 12.25.$$
3.5.3 Example 2: Binomial Distribution

Given the probabilities calculated for the binomial distribution where $Y \sim B\left(\frac{1}{3}, 4\right)$ we have:

$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & 16/81 \\ 1 & 32/81 \\ 2 & 24/81 \\ 3 & 8/81 \\ 4 & 1/81 \end{bmatrix}$$

and so weighting each outcome by its probability we have:

$$E[Y] = 0 \times \frac{16}{81} + 1 \times \frac{32}{81} + 2 \times \frac{24}{81} + 3 \times \frac{8}{81} + 4 \times \frac{1}{81} = \frac{4}{3}.$$

If $f(x) = x^2$ we have:

$$Y^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0^2 & 16/81 \\ 1^2 & 32/81 \\ 2^2 & 24/81 \\ 3^2 & 8/81 \\ 4^2 & 1/81 \end{bmatrix}$$

and so weighting outcomes by probabilities we have:

$$E[Y^2] = 0^2 \times \frac{16}{81} + 1^2 \times \frac{32}{81} + 2^2 \times \frac{24}{81} + 3^2 \times \frac{8}{81} + 4^2 \times \frac{1}{81} = \frac{8}{3}.$$

Again we have that:

$$2.6667 = \frac{8}{3} = E[Y^2] > E[Y]^2 = \left(\frac{4}{3}\right)^2 = 1.7778.$$

3.5.4 $E[X]$ for Continuous Random Variables

As with discrete random variables we can define the expectation of a continuous random variable. The basic idea is exactly the same: we take each outcome $x$, weight it by its probability $p(x)\, dx$, and sum using $\int_{-\infty}^{\infty}$ instead of $\sum_{i=1}^{n}$. We thus have:

Definition 60 If $X$ is a continuous random variable then:

$$E[X] = \int_{-\infty}^{\infty} x\, p(x)\, dx.$$

Definition 61 Given a continuous random variable $X$ then $E[f(X)]$ is defined as:

$$E[f(X)] = \int_{-\infty}^{\infty} f(x)\, p(x)\, dx.$$

3.5.5 Example 1: The Uniform Distribution

Theorem 62 If $X \sim U(a, b)$ has a uniform distribution then:

$$E[X] = \frac{b+a}{2}.$$

Proof. Since $p(x) = \frac{1}{b-a}$ for $a < x < b$ and $0$ otherwise, we have:

$$E[X] = \int_{-\infty}^{\infty} x\, p(x)\, dx = \int_a^b x\,\frac{1}{b-a}\, dx = \frac{1}{b-a}\int_a^b x\, dx = \frac{1}{b-a}\,\frac{x^2}{2}\Big|_a^b = \frac{1}{2}\,\frac{b^2 - a^2}{b-a} = \frac{1}{2}\,\frac{(b-a)(b+a)}{b-a} = \frac{b+a}{2}.$$

Thus $E[X] = \frac{b+a}{2}$ is the midpoint between $a$ and $b$. For example if $X \sim U(0, 10)$, then:

$$E[X] = \frac{0 + 10}{2} = 5.$$

Theorem 63 If $X \sim U(a, b)$ has a uniform distribution then:

$$E[X^2] = \frac{b^2 + ab + a^2}{3}.$$

Proof. To calculate $E[X^2]$ we have:

$$E[X^2] = \int_{-\infty}^{\infty} x^2\, p(x)\, dx = \int_a^b x^2\,\frac{1}{b-a}\, dx = \frac{1}{b-a}\,\frac{x^3}{3}\Big|_a^b = \frac{1}{3}\,\frac{b^3 - a^3}{b-a} = \frac{1}{3}\,\frac{(b-a)\left(b^2 + ab + a^2\right)}{b-a} = \frac{b^2 + ab + a^2}{3}.$$

Thus if $X \sim U(0, 10)$ we have from the above formulae that:

$$E[X^2] = \frac{10^2 + 10 \times 0 + 0^2}{3} = \frac{100}{3} = 33.3.$$

Note that:

$$33.3 = E[X^2] > E[X]^2 = (5)^2 = 25.$$

3.5.6 The Standard Normal Distribution

If $Z \sim N[0, 1]$ then it can be shown that:

$$E[Z] = \int_{-\infty}^{\infty} z\,\frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = 0.$$

This follows from the symmetry of $p(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$ around $0$. In fact it can be shown that:

Theorem 64 If $Z \sim N[0, 1]$ then:

$$E[Z^k] = 0 \quad \text{for } k = 1, 3, 5, \ldots$$

For $k = 2$ we have:

Theorem 65 If $Z \sim N[0, 1]$ then:

$$E[Z^2] = 1.$$

Proof. We have:

$$E[Z^2] = \int_{-\infty}^{\infty} z^2\,\frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z\left(z e^{-\frac{z^2}{2}}\right) dz.$$

Using integration by parts where $u(z) = z$, $u'(z) = 1$, $v'(z) = z e^{-\frac{z^2}{2}}$ and $v(z) = -e^{-\frac{z^2}{2}}$, we have:

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z\left(z e^{-\frac{z^2}{2}}\right) dz = -\frac{1}{\sqrt{2\pi}}\, z e^{-\frac{z^2}{2}}\Big|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = 1$$

since:

$$\lim_{z\to\pm\infty} -z e^{-\frac{z^2}{2}} = 0$$

and:

$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = \int_{-\infty}^{\infty} p(z)\, dz = 1$$

since $p(z)$ is a density.

We thus have for $Z \sim N[0, 1]$ that:

$$1 = E[Z^2] > E[Z]^2 = 0^2 = 0.$$

It is possible to show the following using integration by parts:

Theorem 66 If $Z \sim N[0, 1]$ then:

$$E[Z^4] = 3.$$

The number $3$ is the kurtosis of the standard normal distribution and is used in econometrics to tell if data are well described by the normal distribution.

Proof. To show that:

$$\int_{-\infty}^{\infty} z^4\,\frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = 3$$

use integration by parts where $u(z) = z^3$, $u'(z) = 3z^2$, $v'(z) = z e^{-\frac{z^2}{2}}$ and $v(z) = -e^{-\frac{z^2}{2}}$. From this you obtain:

$$\int_{-\infty}^{\infty} z^4\,\frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz = \underbrace{-\frac{1}{\sqrt{2\pi}}\, z^3 e^{-\frac{z^2}{2}}\Big|_{z=-\infty}^{z=\infty}}_{=0} + 3\underbrace{\int_{-\infty}^{\infty} z^2\,\frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}\, dz}_{=1} = 3$$

since $e^{-\frac{z^2}{2}}$ dominates $z^3$ for large $z$, and by our previous proof that $E[Z^2] = 1$.

More generally it can be shown, using a proof by induction for even powers $E[Z^{2k}]$, that:

Theorem 67 If $Z \sim N[0, 1]$ then:

$$E[Z^{2k}] = \frac{2^k\,\Gamma\left(k + \frac{1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)} \quad \text{for } k = 0, 1, 2, 3, \ldots$$

Example 68 Using this result we find for $k = 4$ that:

$$E[Z^8] = \frac{2^4\,\Gamma\left(4 + \frac{1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)} = 105.$$
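These moments can be checked by numerical integration. A minimal sketch, assuming scipy is available (the function name is our own):

```python
import math
from scipy.integrate import quad

# Even moments of the standard normal, computed numerically.
def moment(k):
    integrand = lambda z: z**k * math.exp(-z**2 / 2) / math.sqrt(2 * math.pi)
    value, _ = quad(integrand, -math.inf, math.inf)
    return value

print(moment(2))  # approximately 1
print(moment(4))  # approximately 3
print(moment(8))  # approximately 105
```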

3.5.7 $E[\,]$ as a Linear Operator

Discussion

Since the expectations operator $E[X]$ is based on $\sum$ for discrete random variables and $\int$ for continuous random variables, and since both $\sum$ and $\int$ are linear operators, it turns out that $E[\,]$ inherits these properties and is a linear operator as well.

This means that if $f(x)$ is a linear function then:

$$E[f(X)] = f(E[X]) \quad \text{for } f(x) \text{ a linear function.}$$

For example, given the linear function $f(x) = ax + b$ with $X$ a continuous random variable, we have:

$$E[f(X)] = \int_{-\infty}^{\infty} f(x)\, p(x)\, dx = \int_{-\infty}^{\infty} (ax + b)\, p(x)\, dx = a\underbrace{\int_{-\infty}^{\infty} x\, p(x)\, dx}_{=E[X]} + b\underbrace{\int_{-\infty}^{\infty} p(x)\, dx}_{=1} = aE[X] + b = f(E[X])$$

while if $X$ is discrete:

$$E[f(X)] = \sum_{i=1}^{n} f(x_i)\, p_i = \sum_{i=1}^{n} (ax_i + b)\, p_i = a\underbrace{\sum_{i=1}^{n} x_i p_i}_{=E[X]} + b\underbrace{\sum_{i=1}^{n} p_i}_{=1} = aE[X] + b = f(E[X]).$$

However if $f(x)$ is nonlinear then we cannot take $E[\,]$ inside the brackets. This is illustrated above by the nonlinear function $f(x) = x^2$, where we showed for the distributions above that:

$$E[X^2] > E[X]^2.$$

Up until now we have derived results by expressing $E[f(X)]$ either as an integral $\int_{-\infty}^{\infty} f(x)\, p(x)\, dx$ or as a summation $\sum_{i=1}^{n} f(x_i)\, p_i$. For many derivations it turns out that this is not necessary: one can derive results using only the fact that $E[\,]$ is a linear operator. To do this you need to know the rules given below.

3.5.8 The Rules for $E[\,]$

Rules for Manipulating the Expectations Operator: $E[\,]$

Rule 1: $E[\,]$ is a linear operator. That is, if $X$ and $Y$ are two random variables and the exponent on the brackets is $1$, so that you have $E\left[(X + Y)^1\right]$ or equivalently $E[(X + Y)]$, then you may take $E[\,]$ inside the brackets and apply it to each term inside as:

$$E[(X + Y)] = E[X] + E[Y].$$

More generally if $X_1, X_2, \ldots, X_m$ are $m$ random variables then:

$$E[(X_1 + X_2 + \cdots + X_m)] = E[X_1] + E[X_2] + \cdots + E[X_m].$$

This only works for linear functions! If the exponent on the brackets is not $1$ then this step is not justified. For example:

$$E\left[(X + Y)^2\right] \neq (E[X] + E[Y])^2.$$

In general if $f(x)$ is a nonlinear function then:

$$E[f(X + Y)] \neq f(E[X] + E[Y]);$$

for example if $f(x) = e^x$, which is not linear, then:

$$E\left[e^{X+Y}\right] \neq e^{E[X] + E[Y]}.$$

Rule 2: You may always pull $E[\,]$ past any expression that is non-random. For example if $a$ is a non-random constant (e.g. $a = 3$) then:

$$E[aX] = aE[X]$$

but if $Y$ is another random variable:

$$E[XY] \neq Y E[X]$$

and in general:

$$E[XY] \neq E[X]E[Y].$$

Rule 3: The expectation of a non-random constant is just that constant. That is, if $a$ is a non-random constant then:

$$E[a] = a.$$

Thus $E[7] = 7$.

Here are some examples:
3.5.9 Example 1: $E\left[(3X + 4Y)^2\right]$

To pull $E[\,]$ through the brackets we must expand to get an exponent of $1$. Thus:

$$E\left[(3X + 4Y)^2\right] = E\left[\left(9X^2 + 24XY + 16Y^2\right)\right].$$

Pulling $E[\,]$ through the brackets we obtain:

$$E\left[(3X + 4Y)^2\right] = E\left[9X^2\right] + E[24XY] + E\left[16Y^2\right].$$

Since $9$, $24$ and $16$ are nonrandom, we can pull $E[\,]$ past them to obtain:

$$E\left[(3X + 4Y)^2\right] = 9E\left[X^2\right] + 24E[XY] + 16E\left[Y^2\right].$$

3.5.10 Example 2: $E\left[(3X + 4)^2\right]$

To pull $E[\,]$ through the brackets we must expand to get an exponent of $1$. Thus:

$$E\left[(3X + 4)^2\right] = E\left[\left(9X^2 + 24X + 16\right)\right].$$

Pulling $E[\,]$ through the brackets we obtain:

$$E\left[(3X + 4)^2\right] = E\left[9X^2\right] + E[24X] + E[16].$$

Since $9$ and $24$ are nonrandom, we can pull $E[\,]$ past them and use the fact that $E[16] = 16$ to obtain:

$$E\left[(3X + 4)^2\right] = 9E\left[X^2\right] + 24E[X] + 16.$$

3.5.11 Example 3: $X \sim N[\mu, \sigma^2]$

If $X \sim N[\mu, \sigma^2]$ then we can write $X$ as $X = \mu + \sigma Z$ where $Z \sim N[0, 1]$. It follows that:

$$E[X] = E[(\mu + \sigma Z)] = E[\mu] + \sigma\underbrace{E[Z]}_{=0} = \mu$$

since $\mu$ is nonrandom and we have already shown that $E[Z] = 0$. Thus if $X \sim N[6, 25]$ then $E[X] = 6$.

Also, since we have already shown that $E[Z^2] = 1$, it follows that:

$$E[X^2] = E\left[(\mu + \sigma Z)^2\right] = E\left[\left(\mu^2 + 2\mu\sigma Z + \sigma^2 Z^2\right)\right] = \mu^2 + 2\mu\sigma\underbrace{E[Z]}_{=0} + \sigma^2\underbrace{E[Z^2]}_{=1} = \mu^2 + \sigma^2.$$

Thus if $X \sim N[6, 25]$ then $E[X^2] = 6^2 + 25 = 61$. We thus have:

Theorem 69 If $X \sim N[\mu, \sigma^2]$ then:

$$\mu^2 + \sigma^2 = E[X^2] > E[X]^2 = \mu^2.$$
3.6 Variances

3.6.1 Definition

While the population mean $\mu = E[X]$ is used to measure the central tendency of the random variable $X$, the variance $Var[X] \equiv \sigma^2 \equiv \sigma_X^2$ is used to measure the dispersion or volatility of $X$. For example if $X$ is the return on some stock, then $E[X]$ is used to measure the expected return, while $Var[X]$ provides a measure of the riskiness of that particular stock.

Definition 70 The variance of a random variable is defined as:

$$Var[X] \equiv E\left[(X - E[X])^2\right] \equiv E\left[(X - \mu)^2\right];$$

that is, if $f(x) = (x - \mu)^2$ then $Var[X] = E[f(X)]$.

Definition 71 The standard deviation of $X$ is the positive square root of the variance and is denoted as:

$$\sigma \equiv \sigma_X \equiv \sqrt{Var[X]}.$$

3.6.2 Some Results for Variances

Theorem 72 $Var[X] \ge 0$.

Theorem 73 $Var[X] = 0$ if and only if $X$ is a degenerate random variable. That is, if $Var[X] = 0$ then $X$ is equal to a nonrandom constant with probability $1$.

Thus for example $Var[7] = 0$, and $Var[X] = 0$ implies that $\Pr[X = c] = 1$ for some $c$.

Theorem 74 If $c$ is a nonrandom constant and $X$ is a random variable then:²

$$Var[X + c] = Var[X].$$

Thus for example $Var[X + 99999999] = Var[X]$.

² This follows since $E[X + c] = E[X] + c$ and so:

$$Var[X + c] \equiv E\left[(X + c - E[X + c])^2\right] = E\left[(X + c - c - E[X])^2\right] = E\left[(X - E[X])^2\right] \equiv Var[X].$$
Theorem 75 $Var[X] = E\left[X^2\right] - E[X]^2 = E\left[X^2\right] - \mu^2$.

Proof. Using the rules for $E[\,]$, and expanding the brackets to have an exponent of $1$ in the definition of the variance, we find that:

$$Var[X] = E\left[(X - \mu)^2\right] = E\left[\left(X^2 - 2\mu X + \mu^2\right)\right] = E\left[X^2\right] + E[-2\mu X] + E\left[\mu^2\right] = E\left[X^2\right] - 2\mu E[X] + \mu^2.$$

By definition:

$$\mu = E[X]$$

so that:

$$Var[X] = E\left[X^2\right] - 2E[X]E[X] + E[X]^2 = E\left[X^2\right] - E[X]^2.$$

Alternatively we can always replace $E[X]$ with $\mu$ to obtain:

$$Var[X] = E\left[X^2\right] - \mu^2.$$

Theorem 76 $E\left[X^2\right] = E[X]^2 + Var[X]$.

Theorem 77 $E\left[X^2\right] \ge E[X]^2$ with equality only when $X$ is a degenerate random variable.³

Proof. This follows from the fact that $Var[X] \ge 0$ and hence:

$$E\left[X^2\right] = E[X]^2 + Var[X] \ge E[X]^2$$

with equality only when $Var[X] = 0$, that is when $X$ is a degenerate random variable.

3.6.3 Example 1: Die Roll

Suppose $X$ is the result of a fair die, so that $X$ can take on the outcomes $1, 2, 3, 4, 5$ and $6$, each with probability $\frac{1}{6}$. We saw above that:

$$E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + \cdots + 6 \times \frac{1}{6} = 3.5.$$

³ This follows from Theorem 76 since $Var[X] \ge 0$ and hence:

$$E\left[X^2\right] = E[X]^2 + Var[X] \ge E[X]^2$$

with equality only when $X$ is a degenerate random variable.
To calculate $Var[X]$ from the definition we need to calculate $E[f(X)]$ where $f(x) = (x - 3.5)^2$. We thus have:

$$(X - 3.5)^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ (1 - 3.5)^2 & 1/6 \\ (2 - 3.5)^2 & 1/6 \\ (3 - 3.5)^2 & 1/6 \\ (4 - 3.5)^2 & 1/6 \\ (5 - 3.5)^2 & 1/6 \\ (6 - 3.5)^2 & 1/6 \end{bmatrix}$$

so that weighting outcomes with probabilities we obtain:

$$Var[X] = (1 - 3.5)^2 \times \frac{1}{6} + (2 - 3.5)^2 \times \frac{1}{6} + \cdots + (6 - 3.5)^2 \times \frac{1}{6} = 2.916667.$$

The standard deviation of $X$ is then:

$$\sigma = \sqrt{2.916667} = 1.707825.$$

Another way to calculate $Var[X]$ is to calculate $E[X^2]$ as before:

$$E[X^2] = 1^2 \times \frac{1}{6} + 2^2 \times \frac{1}{6} + \cdots + 6^2 \times \frac{1}{6} = \frac{91}{6}$$

and then to use Theorem 75, so that:

$$Var[X] = \frac{91}{6} - (3.5)^2 = 2.916667.$$
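Both routes to the variance are easy to compute exactly. A minimal sketch in Python:

```python
from fractions import Fraction

die = [1, 2, 3, 4, 5, 6]
p = [Fraction(1, 6)] * 6

mu = sum(x * q for x, q in zip(die, p))                   # E[X] = 7/2
var_def = sum((x - mu) ** 2 * q for x, q in zip(die, p))  # E[(X - mu)^2]
var_alt = sum(x**2 * q for x, q in zip(die, p)) - mu**2   # E[X^2] - E[X]^2
print(var_def, var_alt)  # both 35/12 = 2.916667
```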

3.6.4 Example 2: Uniform Distribution

Suppose $X \sim U(a, b)$ has a uniform distribution. We have:

Theorem 78 If $X \sim U(a, b)$ has a uniform distribution then:

$$Var[X] = \frac{(b-a)^2}{12}.$$

Proof. We have already seen that:

$$E[X] = \frac{a+b}{2} \quad \text{and} \quad E[X^2] = \frac{b^2 + ab + a^2}{3}.$$

Consequently we have:

$$Var[X] = E[X^2] - E[X]^2 = \frac{b^2 + ab + a^2}{3} - \left(\frac{a+b}{2}\right)^2 = \frac{(b-a)^2}{12}.$$

Thus if $X \sim U(0, 10)$ then:

$$Var[X] = \frac{(10-0)^2}{12} = \frac{25}{3}$$

so the standard deviation is:

$$\sigma = \sqrt{\frac{25}{3}} = \frac{5}{\sqrt{3}}.$$

3.6.5 Example 3: The Normal Distribution

If $Z \sim N[0, 1]$ then $Var[Z] = 1$, since we have already shown that $E[Z^2] = 1$ and $E[Z] = 0$. Thus we have:

$$Var[Z] = E[Z^2] - E[Z]^2 = 1 - 0^2 = 1.$$

From this we can show that:

Theorem 79 If $X \sim N[\mu, \sigma^2]$ then:

$$E[X] = \mu, \qquad Var[X] = \sigma^2.$$

Proof. Since $X = \mu + \sigma Z$ where $Z \sim N[0, 1]$ is a standard normal, it follows that:

$$E[X] = \mu + \sigma E[Z] = \mu.$$

To show that $Var[X] = \sigma^2$, note that we have already shown that $E[X^2] = \mu^2 + \sigma^2$ and hence:

$$Var[X] = E[X^2] - E[X]^2 = \mu^2 + \sigma^2 - \mu^2 = \sigma^2.$$

Thus if $X \sim N[6, 25]$ then $E[X] = 6$ and $Var[X] = 25$.

3.7 Covariances

3.7.1 Definitions

Definition 80 The covariance of two random variables $X$ and $Y$ is defined as:

$$Cov[X, Y] = E[(X - E[X])(Y - E[Y])].$$

Definition 81 Alternatively, if $f(x) = (x - \mu_X)$ and $g(y) = (y - \mu_Y)$ then:

$$Cov[X, Y] = E[f(X)g(Y)].$$

If $Cov[X, Y] > 0$ then there exists some positive linear relationship between $X$ and $Y$, so that when $X$ is above (below) average then $Y$ tends to be above (below) average. For example if $X$ is ice cream consumption and $Y$ is the temperature, one would expect that $Cov[X, Y] > 0$; that is, above average ice cream consumption would be associated with above average temperatures.

If $Cov[X, Y] < 0$ then there exists some negative linear relationship between $X$ and $Y$, so that when $X$ is above (below) average then $Y$ tends to be below (above) average. For example if $X$ is ice cream consumption and $Y$ is rainfall, one would expect that $Cov[X, Y] < 0$; that is, above (below) average ice cream consumption would be associated with below (above) average rainfall.

Definition 82 The correlation coefficient $\rho$ is:

$$\rho = \frac{Cov[X, Y]}{Var[X]^{\frac{1}{2}}\, Var[Y]^{\frac{1}{2}}}.$$

Since the standard deviations in the denominator are positive, it follows that $\rho > 0$ ($\rho < 0$) if and only if $Cov[X, Y] > 0$ ($Cov[X, Y] < 0$), so that the sign of $\rho$ is always the same as the sign of $Cov[X, Y]$.

For example suppose that:

$$Var[X_1] = 9, \qquad Cov[X_1, X_2] = 6, \qquad Var[X_2] = 16$$

then:

$$\rho = \frac{6}{9^{\frac{1}{2}}\, 16^{\frac{1}{2}}} = \frac{1}{2}.$$

3.7.2 Independence

Suppose we have two random variables $X$ and $Y$ and we wish to say that the two random variables are independent of each other. For discrete random variables the usual definition of independence is that if the joint probability distribution satisfies:

$$\Pr[X = x_i, Y = y_j] = \Pr[X = x_i]\Pr[Y = y_j]$$

for all $i$ and $j$, then $X$ and $Y$ are independent.

I will use another definition of independence, based on the expectation $E[\,]$, which (it turns out) is equivalent to the above definition. It is:

Definition 83 $X$ and $Y$ are independent if and only if for all functions $f(x)$ and $g(y)$:

$$E[f(X)g(Y)] = E[f(X)]E[g(Y)].$$

3.7.3 Results for Covariances

Here are some important results involving covariances:

Theorem 84 If $X$ and $Y$ are independent then $Cov[X, Y] = 0$.

Proof. This follows since independence implies that:

$$E[f(X)g(Y)] = E[f(X)]E[g(Y)]$$

for any functions $f(x)$ and $g(y)$. In particular, letting:

$$f(x) = (x - E[X]), \qquad g(y) = (y - E[Y])$$

we have:

$$E[f(X)] = (E[X] - E[X]) = 0, \qquad E[g(Y)] = (E[Y] - E[Y]) = 0$$

and hence for these two functions independence implies that:

$$Cov[X, Y] = E[f(X)g(Y)] = E[f(X)]E[g(Y)] = 0 \times 0 = 0.$$

Theorem 85 If $Cov[X, Y] = 0$ it does not follow that $X$ and $Y$ are independent.

Proof. We show this result by providing a counter-example; that is, two random variables with $0$ covariance but which are not independent. Suppose $X$ can take on the outcomes $-1, 0, 1$ and $Y$ can take on the outcomes $0, 1, 2$. The probabilities of the various combinations of outcomes are given by:

$$\begin{array}{cc|ccc} & & \multicolumn{3}{c}{Y} \\ & & 0 & 1 & 2 \\ \hline & -1 & 0 & \frac{1}{4} & 0 \\ X & 0 & \frac{1}{4} & 0 & \frac{1}{4} \\ & 1 & 0 & \frac{1}{4} & 0 \end{array}$$

so that for example $\Pr[X = -1, Y = 1] = \frac{1}{4}$ while $\Pr[X = -1, Y = 0] = 0$. You can show that $E[X] = 0$ and $E[XY] = 0$, so that $Cov[X, Y] = 0$. However, $X$ and $Y$ are not independent, since if $Y = 1$ it is impossible for $X = 0$, and so $Y$ carries information about $X$. Alternatively, $\Pr[X = 0] = \frac{1}{2}$ and $\Pr[Y = 1] = \frac{1}{2}$ but:

$$0 = \Pr[X = 0, Y = 1] \neq \Pr[X = 0] \times \Pr[Y = 1] = \frac{1}{4}$$

and so $X$ and $Y$ are not independent.
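The counter-example is small enough to verify directly. A minimal sketch in Python:

```python
from fractions import Fraction

# Joint distribution from the counter-example above: p[(x, y)].
q = Fraction(1, 4)
p = {(-1, 1): q, (0, 0): q, (0, 2): q, (1, 1): q}

E_X  = sum(x * pr for (x, y), pr in p.items())
E_Y  = sum(y * pr for (x, y), pr in p.items())
E_XY = sum(x * y * pr for (x, y), pr in p.items())
print(E_XY - E_X * E_Y)  # Cov[X, Y] = 0

# Yet X and Y are not independent:
pX0 = sum(pr for (x, y), pr in p.items() if x == 0)  # 1/2
pY1 = sum(pr for (x, y), pr in p.items() if y == 1)  # 1/2
pX0Y1 = p.get((0, 1), Fraction(0))                   # 0
print(pX0Y1, pX0 * pY1)  # 0 versus 1/4
```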

Theorem 86 If $Cov[X, Y] = 0$ and $X$ and $Y$ are normally distributed, then it does follow that $X$ and $Y$ are independent.

Theorem 87 If $c$ is a nonrandom constant then $Cov[X, c] = 0$.

Theorem 88 $Cov[X, X] = Var[X]$.

Theorem 89 $Cov[X, Y] = E[XY] - E[X]E[Y]$.

Proof. Let $\mu_X \equiv E[X]$ and $\mu_Y \equiv E[Y]$. From the definition of the covariance, and multiplying out the brackets, we have:

$$Cov[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] = E[(XY - \mu_X Y - \mu_Y X + \mu_X\mu_Y)] = E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X\mu_Y.$$

Therefore:

$$Cov[X, Y] = E[XY] - 2E[X]E[Y] + E[X]E[Y] = E[XY] - E[X]E[Y].$$

Theorem 90 $E[XY] = E[X]E[Y] + Cov[X, Y]$.

Theorem 91 $E[XY] = E[X]E[Y]$ if and only if $Cov[X, Y] = 0$.

3.8 Linear Combinations of Random Variables

3.8.1 Linear Combinations of Two Random Variables

Suppose the random variable $Y$ is a linear combination of the random variables $X_1$ and $X_2$, so that:

$$Y = aX_1 + bX_2 + c.$$

If we know $E[X_1]$ and $E[X_2]$ then we can immediately calculate $E[Y]$, since:

$$E[Y] = aE[X_1] + bE[X_2] + c.$$

For example, given two assets (say bonds and stocks) with returns $X_1$ and $X_2$, if $X_1$ has an expected return of $4\%$ or $E[X_1] = 4$, and $X_2$ has an expected return of $10\%$ or $E[X_2] = 10$, and we have a portfolio with $30\%$ in $X_1$ and $70\%$ in $X_2$, or $Y = 0.3X_1 + 0.7X_2$, then the expected return on the portfolio is:

$$E[Y] = E[0.3X_1 + 0.7X_2] = 0.3E[X_1] + 0.7E[X_2] = 0.3 \times 4 + 0.7 \times 10 = 8.2$$

so the expected return on the portfolio is $8.2\%$.

Another important question is: can we calculate $Var[Y]$ if we know $Var[X_1]$ and $Var[X_2]$? The answer is in general no; we also need to know $Cov[X_1, X_2]$.

First, if $Y = aX_1 + bX_2 + c$ we can ignore $c$, since:

$$Var[aX_1 + bX_2 + c] = Var[aX_1 + bX_2].$$

To calculate $Var[aX_1 + bX_2]$ we have the following important result:

Theorem 92 Given two random variables $X_1$, $X_2$ and two nonrandom constants $a$ and $b$:

$$Var[aX_1 + bX_2] = a^2 Var[X_1] + 2ab\,Cov[X_1, X_2] + b^2 Var[X_2].$$

From this we can show that:

Theorem 93 The matrix:

$$\begin{bmatrix} Var[X_1] & Cov[X_1, X_2] \\ Cov[X_1, X_2] & Var[X_2] \end{bmatrix}$$

is positive semi-definite.

Proof. This follows since for all $a$ and $b$:

$$Var[aX_1 + bX_2] = a^2 Var[X_1] + 2ab\,Cov[X_1, X_2] + b^2 Var[X_2] = \begin{bmatrix} a & b \end{bmatrix}\begin{bmatrix} Var[X_1] & Cov[X_1, X_2] \\ Cov[X_1, X_2] & Var[X_2] \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} \ge 0$$

since $Var[aX_1 + bX_2] \ge 0$.

We can also show that the correlation coefficient always lies between $-1$ and $1$:

Theorem 94 If $Var[X_1] \neq 0$ and $Var[X_2] \neq 0$ then $-1 \le \rho \le 1$.⁴

Proof. Using the leading principal minors and the fact that the matrix in the previous theorem is positive semi-definite we have:

$$\det\begin{bmatrix} Var[X_1] & Cov[X_1, X_2] \\ Cov[X_1, X_2] & Var[X_2] \end{bmatrix} = Var[X_1]Var[X_2] - Cov[X_1, X_2]^2 = Var[X_1]Var[X_2]\left(1 - \frac{Cov[X_1, X_2]^2}{Var[X_1]Var[X_2]}\right) = Var[X_1]Var[X_2]\left(1 - \rho^2\right) \ge 0.$$

Theorem 95 If $\rho = 1$ then $X_2 = \alpha X_1 + \beta$ where $\alpha > 0$.

Proof. From:

$$Var[aX_1 + bX_2] = a^2 Var[X_1] + 2ab\,Cov[X_1, X_2] + b^2 Var[X_2]$$

and letting $a = \frac{1}{\sqrt{Var[X_1]}}$ and $b = \frac{-1}{\sqrt{Var[X_2]}}$, we have:

$$Var\left[\frac{1}{\sqrt{Var[X_1]}}X_1 + \frac{-1}{\sqrt{Var[X_2]}}X_2\right] = 2(1 - \rho).$$

If $\rho = 1$ then the variance is zero and hence:

$$\frac{1}{\sqrt{Var[X_1]}}X_1 + \frac{-1}{\sqrt{Var[X_2]}}X_2 = c$$

where $c$ is a nonrandom constant. It then follows that:

$$X_2 = \underbrace{\frac{\sqrt{Var[X_2]}}{\sqrt{Var[X_1]}}}_{\alpha > 0}X_1 + \underbrace{\left(-c\sqrt{Var[X_2]}\right)}_{\beta}.$$

⁴ Using the leading principal minors and the fact that the matrix in Theorem 93 is positive semi-definite we have:

$$\det\begin{bmatrix} Var[X_1] & Cov[X_1, X_2] \\ Cov[X_1, X_2] & Var[X_2] \end{bmatrix} = Var[X_1]Var[X_2]\left(1 - \rho^2\right) \ge 0.$$

From this it follows that $\left(1 - \rho^2\right) \ge 0$, or $-1 \le \rho \le 1$. If $\rho \neq \pm 1$ then both leading principal minors are strictly positive and the matrix is positive definite.
CHAPTER 3. INTEGRATION AND RANDOM VARIABLES 122

Theorem 96 If ρ = −1 then X2 = αX1 + β where α < 0.

Proof. Set a = 1/√Var[X1] and b = 1/√Var[X2] and use the same idea as above to show that:

Var[aX1 + bX2] = 2(1 + ρ).

An Example
Suppose that:

Var[X1] = 9,  Cov[X1, X2] = 6,  Var[X2] = 16.

The correlation coefficient is then given by:

ρ = 6/(√9 √16) = 1/2

and the variance-covariance matrix is:

[ 9   6  ]
[ 6   16 ]

which you can verify is positive definite.
If Y = 0.3X1 + 0.7X2 then

Var[Y] = (0.3)² Var[X1] + 2(0.3)(0.7) Cov[X1, X2] + (0.7)² Var[X2]
       = (0.3)² × 9 + 2 × 0.3 × 0.7 × 6 + (0.7)² × 16
       = 11.17

and so the standard deviation is σ_Y = √11.17 = 3.34.
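As a quick numerical check of this example, the variance can be computed both from the scalar formula and from the quadratic form aᵀVa of Theorem 93. A minimal Python/NumPy sketch (assumed here only for illustration):

import numpy as np

a = np.array([0.3, 0.7])
V = np.array([[9.0, 6.0],
              [6.0, 16.0]])       # Var[X1] = 9, Var[X2] = 16, Cov = 6

var_scalar = 0.3**2 * 9 + 2 * 0.3 * 0.7 * 6 + 0.7**2 * 16
var_matrix = a @ V @ a             # the quadratic form a' V a
print(var_scalar, var_matrix)      # both 11.17
print(np.sqrt(var_matrix))         # 3.34..., the standard deviation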

3.8.2 The Capital Asset Pricing Model (CAPM)


Theory
Suppose you can invest in three assets with returns R0, R1 and R2. The first asset is risk free (e.g. short-term government bonds) so that R0 = r0 with probability 1. Assets 1 and 2 (say long-term bonds and stocks) are risky with:

E[R1] = r1,  Var[R1] = σ1²
E[R2] = r2,  Var[R2] = σ2²
Cov[R1, R2] = σ12.

Consider now a portfolio with weights on each asset ω0, ω1, and ω2 with:

ω0 + ω1 + ω2 = 1.

Let the return on the portfolio be R, which is given by:

R = ω0 r0 + ω1 R1 + ω2 R2.

Since ω0 = 1 − ω1 − ω2 we can rewrite this as:

R = (1 − ω1 − ω2) r0 + ω1 R1 + ω2 R2

or:

R = r0 + ω1 (R1 − r0) + ω2 (R2 − r0)

where R1 − r0 and R2 − r0 are the returns in excess of the risk free rate r0.
Let r = E[R] be the expected return on the portfolio, which is given by:

E[R] = r0 + ω1 E[R1 − r0] + ω2 E[R2 − r0]
     = r0 + ω1 (r1 − r0) + ω2 (r2 − r0)

or

r̃ = ω1 r̃1 + ω2 r̃2

where r̃ = r − r0 is the expected return of the portfolio net of the risk free rate r0, while r̃1 = r1 − r0 and r̃2 = r2 − r0 are the expected returns on asset 1 and asset 2 net of the risk free rate.
The risk of the portfolio is given by the variance, which is:

Var[R] = ω1² σ1² + 2 ω1 ω2 σ12 + ω2² σ2².

Suppose you are a portfolio manager. Your client wants to obtain an expected return on this portfolio of r. Your job is to find the portfolio weights ω1 and ω2 (with ω0 = 1 − ω1 − ω2) which achieve this at the minimum possible risk.
This leads to the constrained minimization problem of choosing ω1 and ω2 to minimize (dividing by two just makes the math a little simpler; minimizing half the variance is clearly equivalent to minimizing the variance itself):

(ω1² σ1² + 2 ω1 ω2 σ12 + ω2² σ2²)/2

subject to the constraint:

r̃ = ω1 r̃1 + ω2 r̃2

with the Lagrangian:

L(λ, ω1, ω2) = (ω1² σ1² + 2 ω1 ω2 σ12 + ω2² σ2²)/2 + λ(r̃ − ω1 r̃1 − ω2 r̃2).



The first order conditions are:

∂L(λ*, ω1*, ω2*)/∂λ = r̃ − ω1* r̃1 − ω2* r̃2 = 0
∂L(λ*, ω1*, ω2*)/∂ω1 = ω1* σ1² + ω2* σ12 − λ* r̃1 = 0
∂L(λ*, ω1*, ω2*)/∂ω2 = ω1* σ12 + ω2* σ2² − λ* r̃2 = 0

with second order conditions:

        [ 0     −r̃1   −r̃2 ]
H = det [ −r̃1   σ1²   σ12 ] < 0.
        [ −r̃2   σ12   σ2² ]

The first-order conditions can be re-written in matrix notation as:

[ 0     −r̃1   −r̃2 ] [ λ*  ]   [ −r̃ ]
[ −r̃1   σ1²   σ12 ] [ ω1* ] = [ 0  ]
[ −r̃2   σ12   σ2² ] [ ω2* ]   [ 0  ]

so that from Cramer's rule we have:

          [ 0     −r̃   −r̃2 ]
ω1* = det [ −r̃1   0    σ12 ] / H = r̃(σ12 r̃2 − σ2² r̃1)/H
          [ −r̃2   0    σ2² ]

and:

          [ 0     −r̃1   −r̃ ]
ω2* = det [ −r̃1   σ1²   0  ] / H = r̃(σ12 r̃1 − σ1² r̃2)/H.
          [ −r̃2   σ12   0  ]

Note that

ω1*/ω2* = (σ12 r̃2 − σ2² r̃1)/(σ12 r̃1 − σ1² r̃2)

is independent of r̃. This means that the relative proportion of the two risky assets in the minimum risk portfolio is independent of the required rate of return r̃.
3.8.3 An Example
Suppose that R0 = 1 with probability 1 and:

E[R1] = 5,  Var[R1] = 25
E[R2] = 9,  Var[R2] = 49
Cov[R1, R2] = 14.

The expected return on the portfolio then satisfies:

E[R] = r0 + ω1 E[R1 − r0] + ω2 E[R2 − r0]
     = 1 + ω1(5 − 1) + ω2(9 − 1)
     = 1 + 4ω1 + 8ω2

while the risk of the portfolio is:

Var[R] = 25ω1² + 28ω1ω2 + 49ω2².

Suppose you wish to obtain an expected return of r = 7 (or r̃ = 7 − 1 = 6) at the least possible risk. This leads to the constrained minimization problem of choosing ω1 and ω2 to minimize:

(25ω1² + 28ω1ω2 + 49ω2²)/2

subject to the constraint:

7 = 1 + 4ω1 + 8ω2

or:

6 − 4ω1 − 8ω2 = 0.
This leads to the Lagrangian:

L(λ, ω1, ω2) = (25ω1² + 28ω1ω2 + 49ω2²)/2 + λ(6 − 4ω1 − 8ω2)

and the first-order conditions:

∂L(λ*, ω1*, ω2*)/∂λ = 6 − 4ω1* − 8ω2* = 0
∂L(λ*, ω1*, ω2*)/∂ω1 = 25ω1* + 14ω2* − 4λ* = 0
∂L(λ*, ω1*, ω2*)/∂ω2 = 14ω1* + 49ω2* − 8λ* = 0

with second-order conditions being satisfied since:

        [ 0    −4   −8 ]
H = det [ −4   25   14 ] = −1488 < 0.
        [ −8   14   49 ]

The first-order conditions in matrix notation are then:

[ 0    −4   −8 ] [ λ*  ]   [ −6 ]
[ −4   25   14 ] [ ω1* ] = [ 0  ]
[ −8   14   49 ] [ ω2* ]   [ 0  ]

so that from Cramer's rule we have:

          [ 0    −6   −8 ]
ω1* = det [ −4   0    14 ] / (−1488) = 21/62 = 0.3387
          [ −8   0    49 ]

and:

          [ 0    −4   −6 ]
ω2* = det [ −4   25   0  ] / (−1488) = 36/62 = 0.5806.
          [ −8   14   0  ]

Thus:

ω0* = 1 − ω1* − ω2* = 1 − 21/62 − 36/62 = 5/62 = 0.0806.

Thus the portfolio which obtains an expected return of 7% with the least variance has about 8% of the portfolio in the risk free asset R0, about 34% in R1 and 58% in R2. The riskiness of the return is then:

Var[R] = 25ω1*² + 28ω1*ω2* + 49ω2*²
       = 25(21/62)² + 28(21/62)(18/31) + 49(18/31)²
       = 3087/124

so that the standard deviation is σ = √(3087/124) = 4.99.
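The 3×3 system of first-order conditions can also be solved directly rather than by Cramer's rule. A minimal Python/NumPy sketch (assumed here only for illustration):

import numpy as np

M = np.array([[ 0.0, -4.0, -8.0],
              [-4.0, 25.0, 14.0],
              [-8.0, 14.0, 49.0]])   # the bordered matrix from the text
rhs = np.array([-6.0, 0.0, 0.0])

lam, w1, w2 = np.linalg.solve(M, rhs)
w0 = 1 - w1 - w2
print(w1, w2, w0)   # 0.3387..., 0.5806..., 0.0806..., as computed above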

3.8.4 Sums of Uncorrelated Random Variables

Suppose that the random variable Y is a linear combination of the n uncorrelated random variables X1, X2, ..., Xn, so that

Y = a1X1 + a2X2 + a3X3 + ··· + anXn

and

Cov[Xi, Xj] = 0 for i ≠ j.

As we have seen, a sufficient condition for X1, X2, ..., Xn to be uncorrelated is that they be independent, since:

Xi, Xj independent ⟹ Cov[Xi, Xj] = 0.

We then have the following:

Theorem 97 Given that X1, X2, ..., Xn are uncorrelated, it follows that if

Y = a1X1 + a2X2 + a3X3 + ··· + anXn

then

Var[Y] = a1² Var[X1] + a2² Var[X2] + ··· + an² Var[Xn].

Thus to calculate the variance of Y we need only square the coefficient on each Xi, take the variance, and sum.

3.8.5 An Example
If X1, X2 and X3 are uncorrelated and

Var[X1] = 25,  Var[X2] = 36,  Var[X3] = 49

then if Y = 5X1 − 4X2 + 2X3:

Var[Y] = (5)² Var[X1] + (−4)² Var[X2] + (2)² Var[X3]
       = (5)² × 25 + (−4)² × 36 + (2)² × 49
       = 1397

so that the standard deviation of Y is σ = √1397 = 37.38.
One common source of confusion is a negative coefficient, for example the −4 on X2 in this example. It is probably best at the beginning to rewrite this with + signs and all numbers in brackets as:

Y = (5)X1 + (−4)X2 + (2)X3

and then to square coefficients, take variances and sum.

3.8.6 The Sample Mean

Consider n uncorrelated random variables X1, X2, ..., Xn with the following properties:

A1: E[Xi] = μ for i = 1, 2, ..., n
A2: Var[Xi] = σ² for i = 1, 2, ..., n
A3: Cov[Xi, Xj] = 0 for i ≠ j.

Note that each of the uncorrelated random variables has the same mean μ by A1 and the same variance σ² by A2.

Definition 98 An estimator θ̂ of a parameter θ is said to be unbiased if:

E[θ̂] = θ.

Consider the sample mean:

X̄ = (1/n)(X1 + X2 + ··· + Xn)
  = (1/n)X1 + (1/n)X2 + ··· + (1/n)Xn

as an estimator of the unknown parameter μ. We have the following:

Theorem 99 Given A1 the sample mean X̄ is an unbiased estimator of μ, or E[X̄] = μ.

Proof. The proof is as follows:

E[X̄] = (1/n)E[X1] + (1/n)E[X2] + ··· + (1/n)E[Xn]
     = (1/n)μ + (1/n)μ + ··· + (1/n)μ        (each E[Xi] = μ by A1)
     = n × (1/n)μ
     = μ

and so X̄ is an unbiased estimator of μ.
We can also show that:

Theorem 100 Given A1, A2, and A3:

Var[X̄] = σ²/n.

Proof. Since the Xi's are uncorrelated by A3, we obtain the variance by squaring the coefficients, here 1/n, so that:

Var[X̄] = (1/n)² Var[X1] + (1/n)² Var[X2] + ··· + (1/n)² Var[Xn].

Now since all the variances equal σ² by A2 we have:

Var[X̄] = (1/n)² σ² + (1/n)² σ² + ··· + (1/n)² σ²
       = n × (1/n)² σ²
       = σ²/n.
It therefore follows that X̄ gets, in a probabilistic sense, closer and closer to μ as the number of observations n goes to infinity. This is sometimes referred to as the Law of Large Numbers.
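A small simulation illustrates both Theorem 100 and the Law of Large Numbers. This Python sketch (illustrative only) uses made-up values μ = 5, σ = 2 and n = 100:

import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 5.0, 2.0, 100, 20_000

# 'reps' many samples of size n; each row's mean is one draw of X-bar
xbars = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
print(xbars.mean())                 # close to mu = 5 (unbiasedness)
print(xbars.var(), sigma**2 / n)    # both close to 0.04 (Theorem 100)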

3.8.7 The Mean and Variance of the Binomial Distribution
If Y ~ B(p, n) has a binomial distribution then:

Pr[Y = k] = (n!/(k!(n−k)!)) pᵏ(1−p)ⁿ⁻ᵏ.

Therefore we have:

E[Y] = Σ_{k=0}^{n} k × (n!/(k!(n−k)!)) pᵏ(1−p)ⁿ⁻ᵏ

and

Var[Y] = Σ_{k=0}^{n} (k − E[Y])² × (n!/(k!(n−k)!)) pᵏ(1−p)ⁿ⁻ᵏ.

Neither of these is easy to evaluate, but both simplify considerably after a great deal of work. Using a proof by induction you can show that:

Theorem 101 If Y ~ B(p, n) has a binomial distribution then:

E[Y] = np and Var[Y] = np(1 − p).

It is much easier to prove these results using the fact that Y can be written as the sum of n independent Bernoullis X1, X2, ..., Xn, or:

Y = X1 + X2 + X3 + ··· + Xn.

Since Xi takes on the values 1 and 0 with probabilities p and 1 − p we have:

E[Xi] = 1 × p + 0 × (1 − p) = p
E[Xi²] = 1² × p + 0² × (1 − p) = p
Var[Xi] = E[Xi²] − E[Xi]² = p − p² = p(1 − p).

Therefore E[Y] = np since:

E[Y] = E[X1] + E[X2] + E[X3] + ··· + E[Xn]
     = p + p + p + ··· + p
     = np.

Similarly it can be shown that Var[Y] = np(1 − p). Since the Xi's are independent and hence uncorrelated, we can square the coefficients, take the variances, and use the fact that each Var[Xi] = p(1 − p), so that:

Var[Y] = 1² Var[X1] + 1² Var[X2] + ··· + 1² Var[Xn]
       = np(1 − p).

Thus for example if n = 2000 and p = 0.3, so that Y ~ B(0.3, 2000), then:

E[Y] = 2000 × 0.3 = 600
Var[Y] = 2000 × 0.3 × (1 − 0.3) = 420.
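A simulation check of this example; a Python sketch in which the 100,000 replications are arbitrary:

import numpy as np

rng = np.random.default_rng(2)
n, p = 2000, 0.3
y = rng.binomial(n, p, size=100_000)   # draws of Y ~ B(0.3, 2000)
print(y.mean(), n * p)                 # both near 600
print(y.var(), n * p * (1 - p))        # both near 420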

3.8.8 The Linear Regression Model for Beginners


The classical linear regression model with a constant and one regressor satisfies the following assumptions:
A1. Yi = α + Xi β + ei for i = 1, 2, ..., n
A2. E[ei] = 0
A3. Var[ei] = σ² for i = 1, 2, ..., n
A4. Cov[ei, ej] = 0 for i ≠ j
A5. Xi is nonrandom with Σ_{j=1}^{n} (Xj − X̄)² ≠ 0.
The least squares estimator of β is then:

β̂ = Σ_{i=1}^{n} (Xi − X̄)(Yi − Ȳ) / Σ_{j=1}^{n} (Xj − X̄)²
  = Σ_{i=1}^{n} ai Yi

where:

ai = (Xi − X̄) / Σ_{j=1}^{n} (Xj − X̄)².

As an exercise in the use of the summation notation show that:

Lemma 102 If ai is defined as above then:

Σ_{i=1}^{n} ai = 0
Σ_{i=1}^{n} ai Xi = 1
Σ_{i=1}^{n} ai² = 1 / Σ_{j=1}^{n} (Xj − X̄)².

We then have that:

β̂ = Σ_{i=1}^{n} ai Yi
  = Σ_{i=1}^{n} ai (α + Xi β + ei)
  = α Σ_{i=1}^{n} ai + β Σ_{i=1}^{n} ai Xi + Σ_{i=1}^{n} ai ei
  = β + Σ_{i=1}^{n} ai ei

using Σ ai = 0 and Σ ai Xi = 1 from the lemma.

We can now show that:

Theorem 103 β̂ is an unbiased estimator of β, or E[β̂] = β.

Proof. First note that since the ai's depend on the Xi's, which are assumed to be nonrandom by A5, it follows that the ai's are nonrandom. We can therefore pull E[·] past the ai's. Now using A2 we have:

E[β̂] = E[β + Σ_{i=1}^{n} ai ei]
     = β + Σ_{i=1}^{n} ai E[ei]
     = β

since each E[ei] = 0 by A2.

We can also show that:

Theorem 104 The variance of β̂ is given by:

Var[β̂] = σ² / Σ_{j=1}^{n} (Xj − X̄)².

Proof. First, since β is a nonrandom constant, we can drop it from the variance:

Var[β̂] = Var[β + Σ_{i=1}^{n} ai ei]
       = Var[Σ_{i=1}^{n} ai ei].

By A4 the ei's are uncorrelated, so we need only square the coefficients and take the variance. Doing this, using A3 and the fact that:

Σ_{i=1}^{n} ai² = 1 / Σ_{j=1}^{n} (Xj − X̄)²

we have:

Var[β̂] = Σ_{i=1}^{n} ai² Var[ei]
       = σ² Σ_{i=1}^{n} ai²        (since each Var[ei] = σ² by A3)
       = σ² / Σ_{j=1}^{n} (Xj − X̄)².
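Theorems 103 and 104 can both be illustrated by simulation. In this Python sketch the values of α, β, σ and the fixed regressor are made up for illustration:

import numpy as np

rng = np.random.default_rng(3)
alpha, beta, sigma, n, reps = 1.0, 2.0, 3.0, 50, 20_000

x = np.linspace(0.0, 10.0, n)             # fixed (nonrandom) regressor
sxx = np.sum((x - x.mean())**2)
a = (x - x.mean()) / sxx                  # the a_i weights from the text

Y = alpha + beta * x + rng.normal(0.0, sigma, size=(reps, n))
betahats = Y @ a                          # beta-hat = sum_i a_i Y_i, per sample
print(betahats.mean(), beta)              # unbiasedness (Theorem 103)
print(betahats.var(), sigma**2 / sxx)     # the variance in Theorem 104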

3.8.9 Linear Combinations of Correlated Random Variables
Expectations Using Sigma and Vector Notation
Now let Y be a linear combination of n possibly correlated random variables X1, X2, X3, ..., Xn, so that:

Y = a1X1 + a2X2 + a3X3 + ··· + anXn

which can be written more compactly using the sigma notation as:


Y = Σ_{j=1}^{n} aj Xj

and even more compactly using vector notation as:

Y = aᵀX

where a = [a1, a2, a3, ..., an]ᵀ and X = [X1, X2, X3, ..., Xn]ᵀ are n × 1 vectors, X being an n × 1 random vector.
We then have:

Theorem 105 Given Y = a1X1 + a2X2 + ··· + anXn we have:

E[Y] = a1E[X1] + a2E[X2] + ··· + anE[Xn]

or

E[Y] = Σ_{i=1}^{n} ai E[Xi]

or

E[Y] = aᵀE[X]

where by E[X], the expected value of the random vector X, we mean:

E[X] = [E[X1], E[X2], E[X3], ..., E[Xn]]ᵀ.

Using Matrix Notation

We can generalize this setup even further. Suppose we have m linear combinations Yi for i = 1, 2, ..., m (say the returns on m different portfolios), all with different weights, so that:

Yi = ai1X1 + ai2X2 + ai3X3 + ··· + ainXn, for i = 1, 2, ..., m

or:

Yi = Σ_{j=1}^{n} aij Xj for i = 1, 2, ..., m

or in matrix notation:

Y = AX

where Y = [Y1, Y2, Y3, ..., Ym]ᵀ is m × 1, A is the m × n matrix of weights with typical element aij, and X = [X1, X2, X3, ..., Xn]ᵀ is n × 1.

Theorem 106 If Y is an m × 1 random vector given by Y = AX then:

E[Y] = AE[X]

where:

E[Y] = [E[Y1], E[Y2], E[Y3], ..., E[Ym]]ᵀ.

Variances
If we only have one linear combination, so that:

Y = Σ_{i=1}^{n} ai Xi = aᵀX

and the Xi's are correlated, then we can no longer just square the coefficients, take the variance and sum. Instead it can be shown that:

Theorem 107 If Y = Σ_{i=1}^{n} ai Xi then

Var[Y] = Σ_{i=1}^{n} Σ_{j=1}^{n} ai aj Cov[Xi, Xj].

Theorem 108 If Y = Σ_{i=1}^{n} ai Xi then:

Var[Y] = Σ_{i=1}^{n} ai² Var[Xi] + 2 Σ_{i=1}^{n} Σ_{j=i+1}^{n} ai aj Cov[Xi, Xj].

Theorem 109 If Y = Σ_{i=1}^{n} ai Xi then:

Var[Y] = Σ_{i=1}^{n} ai² Var[Xi]
       + 2a1a2 Cov[X1, X2] + 2a1a3 Cov[X1, X3] + ··· + 2a1an Cov[X1, Xn]
       + 2a2a3 Cov[X2, X3] + 2a2a4 Cov[X2, X4] + ··· + 2a2an Cov[X2, Xn]
       + 2a3a4 Cov[X3, X4] + 2a3a5 Cov[X3, X5] + ··· + 2a3an Cov[X3, Xn]
       ...
       + 2a_{n−1}an Cov[X_{n−1}, Xn].

It turns out that the best way to proceed is to generalize the notation Var[X] to allow X to be an n × 1 random vector.

Definition 110 If X is an n × 1 random vector then Var[X] is defined as:

Var[X] = E[(X − E[X])(X − E[X])ᵀ]
       = E[XXᵀ] − E[X]E[X]ᵀ

which is equal to:

         [ Var[X1]       Cov[X1, X2]   Cov[X1, X3]   ···   Cov[X1, Xn] ]
         [ Cov[X1, X2]   Var[X2]       Cov[X2, X3]   ···   Cov[X2, Xn] ]
Var[X] = [ Cov[X1, X3]   Cov[X2, X3]   Var[X3]       ···   Cov[X3, Xn] ]
         [ ...           ...           ...           ···   ...         ]
         [ Cov[X1, Xn]   Cov[X2, Xn]   Cov[X3, Xn]   ···   Var[Xn]     ]

This is often referred to as the variance-covariance matrix. Note that Var[X] is symmetric.

Theorem 111 If Y = aᵀX then:

Var[aᵀX] = aᵀ Var[X] a.

Theorem 112 Var[X] is positive semi-definite.

Proof. This follows from the previous theorem and the fact that variances are always non-negative, so that Var[aᵀX] ≥ 0 and hence:

aᵀ Var[X] a = Var[aᵀX] ≥ 0

for all a. Thus Var[X] is a positive semi-definite matrix.

We can say that Var[X] is positive definite if no linear combination of X is nonrandom. In particular we have:

Theorem 113 If there is no c ≠ 0 such that cᵀX = d with probability 1, where d is a constant, then Var[X] is positive definite.

Theorem 114 If Y = AX then:

Var[AX] = A Var[X] Aᵀ.
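These matrix rules are easy to verify on simulated data. A Python/NumPy sketch with an illustrative 3 × 1 random vector and a 2 × 3 weight matrix A (sample moments stand in for the population ones):

import numpy as np

rng = np.random.default_rng(4)
X = rng.multivariate_normal([1.0, 2.0, 3.0],
                            [[4.0, 1.0, 0.0],
                             [1.0, 9.0, 2.0],
                             [0.0, 2.0, 16.0]], size=200_000)
A = np.array([[1.0, 2.0, 3.0],
              [0.5, -1.0, 0.0]])

Y = X @ A.T                                # each row is one draw of Y = AX
print(Y.mean(axis=0), A @ X.mean(axis=0))  # E[Y] versus A E[X]
print(np.cov(Y.T))                         # sample Var[Y]; compare with:
print(A @ np.cov(X.T) @ A.T)               # A Var[X] A'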

3.8.10 The Linear Regression Model for Experts

The classical linear regression model is defined by the following assumptions:
A1. Y = Xβ + e where Y is an n × 1 random vector, X is n × p, β is a p × 1 nonrandom vector and e is an n × 1 random vector.
A2. E[e] = 0
A3. Var[e] = σ²I where I is an n × n identity matrix.
A4. X is nonrandom with rank[X] = p.

For example, the linear regression model we considered earlier with a constant and one regressor is a special case of this model where:

[ Y1 ]   [ 1  X1 ]           [ e1 ]
[ Y2 ]   [ 1  X2 ]  [ α ]    [ e2 ]
[ Y3 ] = [ 1  X3 ]  [ β ]  + [ e3 ]
[ .. ]   [ .. .. ]           [ .. ]
[ Yn ]   [ 1  Xn ]           [ en ]

with the n × 2 matrix playing the role of X and β = [α, β]ᵀ.

It turns out that the least squares estimator is:

β̂ = (XᵀX)⁻¹XᵀY.

It can then be shown that:

β̂ = (XᵀX)⁻¹XᵀY
  = (XᵀX)⁻¹Xᵀ(Xβ + e)                  (by A1)
  = (XᵀX)⁻¹XᵀXβ + (XᵀX)⁻¹Xᵀe
  = β + (XᵀX)⁻¹Xᵀe
  = β + Ae

where:

A = (XᵀX)⁻¹Xᵀ

is nonrandom by A4.
We have the following:

Theorem 115 β̂ is unbiased, or E[β̂] = β.

Proof. By A4 the matrix A is nonrandom so that:

E[β̂] = E[β + Ae]
     = β + E[Ae]
     = β + A E[e]
     = β

since E[e] = 0 by A2.
We also have:

We also have:

Theorem 116 The variance-covariance matrix of β̂ is:

Var[β̂] = σ²(XᵀX)⁻¹.

Proof. We have:

Var[β̂] = Var[β + Ae] = Var[Ae]

since β is nonrandom. Thus:

Var[β̂] = Var[Ae]
       = A Var[e] Aᵀ
       = σ²AAᵀ                          (since Var[e] = σ²I by A3).

Since A = (XᵀX)⁻¹Xᵀ and Aᵀ = X(XᵀX)⁻¹ we have:

Var[β̂] = σ²AAᵀ
       = σ²(XᵀX)⁻¹XᵀX(XᵀX)⁻¹
       = σ²(XᵀX)⁻¹.
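A minimal Python/NumPy sketch of the matrix formula β̂ = (XᵀX)⁻¹XᵀY on simulated data (the numbers are made up; the normal equations are solved directly rather than forming an explicit inverse):

import numpy as np

rng = np.random.default_rng(5)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])  # constant + one regressor
beta = np.array([1.0, 2.0])                               # [alpha, beta]
y = X @ beta + rng.normal(0.0, 3.0, n)                    # Y = X beta + e

betahat = np.linalg.solve(X.T @ X, X.T @ y)   # solves (X'X) b = X'Y
print(betahat)                                # close to [1, 2]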
Chapter 4

Dynamics

4.1 Introduction
Difference equations, and their continuous time analogues, differential equations, are often used to describe and predict how economic variables change over time; for example, in describing movements in the business cycle.
A basic understanding of trigonometry turns out to be important for difference and differential equations. Strange and wonderful as it may seem, trigonometry in turn is very closely related to complex numbers, that is, numbers which involve the square root of −1, or i ≡ √−1. We therefore begin with a review of trigonometry, go on from there to show the deep connection with complex numbers, and then proceed to difference and differential equations.

4.2 Trigonometry
4.2.1 Degrees and Radians
In everyday life we typically measure angles using degrees. A full turn, for example, is 360°, a right-angle is 90°. In mathematics we typically use radians to measure angles. With radians a full turn is 2π instead of 360°, a right-angle is π/2 instead of 90°, while a 45° angle would be π/4. In general we have:

radians = (2π/360) × degrees

where 2π/360 ≈ 0.0175.

4.2.2 The Functions cos(x) and sin(x)

The two basic trigonometric functions are cos(x) and sin(x), which are plotted below:

[Plot: y = sin(x) for x from −10 to 10]

[Plot: y = cos(x) for x from −10 to 10]

The functions sin(x) and cos(x) have the following properties:

Properties of sin(x) and cos(x)

1. |sin(x)| ≤ 1, |cos(x)| ≤ 1,
2. sin(x)² + cos(x)² = 1,
3. cos(−x) = cos(x) (i.e., cos(x) is an even function),
4. sin(−x) = −sin(x) (i.e., sin(x) is an odd function),
5. cos(x + 2π) = cos(x) (cos(x) has a period of 2π),
6. sin(x + 2π) = sin(x) (sin(x) has a period of 2π),
7. d sin(x)/dx = cos(x),
8. d cos(x)/dx = −sin(x),
9. sin(θ1 + θ2) = cos(θ1) sin(θ2) + cos(θ2) sin(θ1),
10. cos(θ1 + θ2) = cos(θ1) cos(θ2) − sin(θ1) sin(θ2),
11. cos(x) = 1 − x²/2! + x⁴/4! − x⁶/6! + ···,
12. sin(x) = x − x³/3! + x⁵/5! − x⁷/7! + ···.

The last two follow from the Taylor series around x₀ = 0 and the facts that cos(0) = 1 and sin(0) = 0.

4.2.3 The Functions tan(x) and arctan(x)

Definition 117 The function tan(x) is defined by:

tan(x) = sin(x)/cos(x) for −π/2 < x < π/2.

Using the quotient rule and the fact that sin(x)² + cos(x)² = 1, it can be shown that:

Theorem 118

d tan(x)/dx = 1/cos(x)² > 0.

We can therefore define the inverse function arctan(x), which satisfies:

tan(arctan(x)) = arctan(tan(x)) = x.

As an exercise you may want to show that:

d arctan(x)/dx = 1/(1 + x²).

4.3 Complex Numbers

4.3.1 Introduction
Consider the roots of the quadratic:

x² + 1 = 0.

This does not have a solution amongst ordinary, or more precisely real, numbers since it implies that:

x² = −1

and for both negative and positive numbers:

x2 ¸ 0:

We can however invent a new class of numbers, which turn out to be surpris-
ingly useful, called complex numbers where the solution to x2 = ¡1 is simply
de…ned to exist as a new kind of number, neither positive nor negative, but
where:
p
x = ¡1 ´ i:

Many polynomials do not have any real roots, but any polynomial will have roots if one allows for complex numbers. For example the quadratic

x² − 6x + 13 = 0

has no real roots since from the usual formula the roots would be:

(6 ± √((−6)² − 4 × 13))/2 = (6 ± √−16)/2

and the expression inside the square root is negative. Amongst complex numbers though, since √−16 = 4i, the roots are:

(6 ± 4i)/2

or the two roots are:

x1 = 3 + 2i and x2 = 3 − 2i.

In general complex numbers are written as:

a + bi

where a and b are real numbers. For example, given the complex number 3 + 2i we have a = 3 and b = 2.

Definition 119 Given a complex number a + bi we call a the real part, or

Re[a + bi] = a.

Definition 120 Given a complex number a + bi we call b the imaginary part, or

Im[a + bi] = b.

Thus given 3 + 2i we have

Re[3 + 2i] = 3
Im[3 + 2i] = 2.

If x is a real number we have Im[x] = 0, so that for example Im[7] = 0.


4.3.2 Arithmetic with Complex Numbers

Just as with real numbers we can add, subtract, multiply and divide complex numbers using the following:

(a + bi) + (c + di) = (a + c) + (b + d)i
(a + bi) − (c + di) = (a − c) + (b − d)i
(a + bi)(c + di) = (ac − bd) + (ad + bc)i
1/(a + bi) = (a − bi)/(a² + b²) = a/(a² + b²) − (b/(a² + b²))i
(a + bi)/(c + di) = (a + bi)(c − di)/((c + di)(c − di)) = (ac + bd)/(c² + d²) + ((−ad + bc)/(c² + d²))i.

Thus for example:

(3 + 4i) + (5 + 6i) = 8 + 10i
(3 + 4i) × (5 + 6i) = −9 + 38i
1/(5 + 6i) = 5/61 − (6/61)i
(3 + 4i) ÷ (5 + 6i) = 39/61 + (2/61)i.

4.3.3 Complex Conjugates

Definition 121 Given a complex number a + bi we denote its complex conjugate, written here as conj(a + bi), where:

conj(a + bi) ≡ a − bi.

For example the complex conjugate of 3 + 4i is given by:

conj(3 + 4i) = 3 − 4i.

The complex roots of any polynomial always appear as conjugate pairs; that is, we have:

Theorem 122 Given the nth degree polynomial:

f(x) = α₀xⁿ + α₁xⁿ⁻¹ + α₂xⁿ⁻² + ··· + α_{n−1}x + α_n = 0

where the coefficients α₀, α₁, α₂, ..., α_n are real numbers, if x = a + bi is a root then so too is its complex conjugate x = a − bi.

For example with the quadratic

x² − 6x + 13 = 0

the two roots x1 = 3 + 2i and x2 = 3 − 2i are conjugates of each other.
An interesting and useful property of conjugates is that:

Theorem 123 The conjugate of the product of two complex numbers is the product of the conjugates, so that:

conj((a + bi)(c + di)) = conj(a + bi) × conj(c + di).

An Example
We verify this theorem using the following example. Note that:

(7 + 3i)(4 + 2i) = 22 + 26i

and so:

conj((7 + 3i)(4 + 2i)) = conj(22 + 26i) = 22 − 26i.

However, we get the same answer if we calculate the product of the conjugates:

conj(7 + 3i) × conj(4 + 2i) = (7 − 3i)(4 − 2i) = 22 − 26i.

4.3.4 The Absolute Value of a Complex Number

For real numbers we often calculate absolute values; for example |−3| = 3. Just like real numbers, one can calculate the absolute value of a complex number a + bi.

Definition 124 The absolute value of a complex number a + bi is defined as:

|a + bi| = √((a + bi) conj(a + bi)) = √(a² + b²).

For example:

|3 + 4i| = √(3² + 4²) = 5.

One very useful way of thinking about complex numbers is as a point in a two dimensional space: for 3 + 4i the real part a = 3 lies on the horizontal axis and the imaginary part b = 4 on the vertical axis.
It then follows that the length r from the origin to the point (a, b) is the absolute value of a + bi since:

r = |a + bi| = √(a² + b²).

Thus for 3 + 4i we have r = √(3² + 4²) = 5.

4.3.5 The Argument of a Complex Number

The angle θ between the horizontal axis and (a, b) is called the argument of a + bi. We thus have:

Definition 125 The argument of a + bi, denoted arg[a + bi], is defined as:

θ = arg[a + bi] = arctan(b/a)         if a > 0, b > 0
                = π + arctan(b/a)     if a < 0
                = 2π + arctan(b/a)    if a > 0, b < 0.

Thus for 3 + 4i we have:

θ = arg(3 + 4i) = arctan(4/3) = 0.927.

To illustrate the calculation of θ where either a or b is negative, we have:

arg(−3 + 4i) = π + arctan(−4/3) = 2.214
arg(3 − 4i) = 2π + arctan(−4/3) = 5.356.

4.3.6 The Polar Form of a Complex Number

Definition 126 The polar form of a complex number is:

a + bi = r(cos(θ) + i sin(θ))

where r = |a + bi| and θ = arg[a + bi].

For example you can verify that:

3 + 4i = 5(cos(0.927) + i sin(0.927))
−3 + 4i = 5(cos(2.214) + i sin(2.214))
3 − 4i = 5(cos(5.356) + i sin(5.356)).
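These polar forms can be reproduced with Python's standard cmath module (a sketch; note that cmath.phase returns angles in (−π, π], so the text's 0 to 2π convention is applied by hand for a > 0, b < 0):

import cmath, math

for z in (3 + 4j, -3 + 4j, 3 - 4j):
    r, theta = cmath.polar(z)
    if theta < 0:
        theta += 2 * math.pi          # match the text's 0..2*pi convention
    print(z, r, round(theta, 3))      # r = 5.0 with theta 0.927, 2.214, 5.356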

4.3.7 De Moivre's Theorem

Consider calculating (a + bi)ⁿ. One way of doing this is to expand the brackets, so that for example:

(a + bi)³ = (a + bi)(a + bi)(a + bi)
          = (a² − b² + 2iab)(a + bi)
          = a³ − 3ab² + i(3a²b − b³).

This approach would involve a lot of work if n were large, and would not work if n were not an integer (for example, what is (1 + i)^(1/2), or √(1 + i)?).
A surprising result, called De Moivre's theorem, is given in the following:

Theorem 127 De Moivre's theorem:

(a + bi)ⁿ = rⁿ(cos(nθ) + i sin(nθ)).

Example 1
Consider the complex number 1 + 1i. It follows from De Moivre's theorem that r = √(1² + 1²) = √2 and θ = arctan(1/1) = π/4, and so:

1 + 1i = √2 (cos(π/4) + i sin(π/4)).

Verify this using the fact that cos(π/4) = 1/√2 and sin(π/4) = 1/√2.
Furthermore we have:

(1 + 1i)⁵ = (√2)⁵ (cos(5π/4) + i sin(5π/4))
          = −4 − 4i.

You should verify this by multiplying out

(1 + 1i) × (1 + 1i) × (1 + 1i) × (1 + 1i) × (1 + 1i)

and by using the polar form.

Example 2
Other examples using the results from above are:

(3 + 4i)ⁿ = 5ⁿ(cos(0.927n) + i sin(0.927n))
(−3 + 4i)ⁿ = 5ⁿ(cos(2.214n) + i sin(2.214n))
(3 − 4i)ⁿ = 5ⁿ(cos(5.356n) + i sin(5.356n)).

For example we have:

(3 + 4i)^(1/2) = √(3 + 4i)
              = 5^(1/2) (cos(0.927 × 1/2) + i sin(0.927 × 1/2))
              = 2 + 1i.

Thus we can calculate square roots of complex numbers! You can verify that 2 + i is in fact the square root of 3 + 4i since:

(2 + i)(2 + i) = 3 + 4i.
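A sketch of the same square-root calculation using Python's cmath module:

import cmath

r, theta = cmath.polar(3 + 4j)
root = cmath.rect(r**0.5, theta / 2)   # De Moivre's theorem with n = 1/2
print(root)                            # approximately (2+1j)
print(root * root)                     # back to (3+4j), up to rounding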

4.3.8 Euler's Formula

To see why De Moivre's theorem is true, consider taking a Taylor series of e^{θi}, which is:

e^{θi} = 1 + θi + (θi)²/2! + (θi)³/3! + (θi)⁴/4! + ···.

Since i² = −1, i³ = −i, i⁴ = 1 etc., we have:

e^{θi} = (1 − θ²/2! + θ⁴/4! − θ⁶/6! + ···) + i(θ − θ³/3! + θ⁵/5! − θ⁷/7! + ···)

where the first bracket is the Taylor series for cos(θ) and the second is the Taylor series for sin(θ). We therefore have (the really mind-boggling!!)

Theorem 128 Euler's formula:

e^{θi} = cos(θ) + i sin(θ).

To get a feel for some of the depth behind Euler's formula consider setting θ = π. Since cos(π) = −1 and sin(π) = 0 we then have:

e^{πi} = −1

or

e^{πi} + 1 = 0

which unites the five most fundamental numbers: e, π, i, 1 and 0 in one simple formula. Taking logs of both sides of e^{πi} = −1 we obtain ln(−1) = πi.
We can also calculate e^{iπ/2} = i. Taking logs of both sides we find that ln(i) = iπ/2. If we now ask what sort of number is i^i (mathematicians once thought this would be a new kind of number) we find that it turns out to be a real number! Thus i^i = e^{ln(i)i} = e^{(π/2)i×i} = e^{−π/2} ≈ 0.207880.
We have the following:

Theorem 129 The absolute value of e^{θi} is 1, or |e^{θi}| = 1, and the conjugate of e^{θi} is e^{−θi}, or conj(e^{θi}) = e^{−θi}.

Proof.

|e^{θi}| = √(cos(θ)² + sin(θ)²) = 1

conj(e^{θi}) = conj(cos(θ) + i sin(θ)) = cos(θ) − i sin(θ) = e^{−θi}.

Examples of Euler's Formula

Verify the following examples:

e^{πi} = cos(π) + i sin(π) = −1
e^{(π/4)i} = cos(π/4) + i sin(π/4) = 1/√2 + (1/√2)i
e^{i} = cos(1) + i sin(1) = 0.540 + 0.841i.

You can also verify from a previous example that:

(3 + 4i)ⁿ = 5ⁿ e^{0.927in}
(−3 + 4i)ⁿ = 5ⁿ e^{2.214in}
(3 − 4i)ⁿ = 5ⁿ e^{5.356in}.

A Proof of De Moivre's Theorem

Since

a + bi = r(cos(θ) + i sin(θ))

and by Euler's formula

e^{θi} = cos(θ) + i sin(θ)

we have:

a + bi = re^{θi}.

It therefore follows that:

(a + bi)ⁿ = (re^{θi})ⁿ
          = rⁿ e^{nθi}
          = rⁿ(cos(nθ) + i sin(nθ))

which is De Moivre's theorem.
Thus given 3 + 4i, where r = √(3² + 4²) = 5 and θ = arg(3 + 4i) = arctan(4/3) = 0.927, we have

3 + 4i = 5e^{0.927i}

so that:

(3 + 4i)³ = 5³ e^{3×0.927i}
          = −117 + 44i.

Using Euler's Formula to Represent cos(x) and sin(x)

Consider using Euler's formula as follows:

(e^{θi} + e^{−θi})/2 = (cos(θ) + i sin(θ) + cos(−θ) + i sin(−θ))/2 = cos(θ)

since cos(−θ) = cos(θ) and sin(−θ) = −sin(θ). On your own, apply the same logic and show that:

(e^{θi} − e^{−θi})/(2i) = sin(θ).

We therefore have:

cos(θ) = (e^{θi} + e^{−θi})/2
sin(θ) = (e^{θi} − e^{−θi})/(2i).

These relations can be used to easily prove many trigonometric results. For example from

e^{(θ1+θ2)i} = cos(θ1 + θ2) + i sin(θ1 + θ2)

and

e^{(θ1+θ2)i} = e^{θ1 i} e^{θ2 i} = (cos(θ1) + i sin(θ1))(cos(θ2) + i sin(θ2))
             = (cos(θ1) cos(θ2) − sin(θ1) sin(θ2)) + i(cos(θ1) sin(θ2) + cos(θ2) sin(θ1))

we can, by equating real terms with real terms and imaginary terms with imaginary terms, arrive at:

cos(θ1 + θ2) = cos(θ1) cos(θ2) − sin(θ1) sin(θ2)
sin(θ1 + θ2) = cos(θ1) sin(θ2) + cos(θ2) sin(θ1).

4.3.9 When does (a + bi)ⁿ → 0 as n → ∞?

With differential and difference equations it turns out that the important property of stability depends on what happens to:

(a + bi)ⁿ

as n increases without bound.
For real numbers we know that aⁿ → 0 if and only if |a| < 1. For example, if we take higher and higher powers of −1/2 we get:

(−1/2)ⁿ = −1/2, 1/4, −1/8, 1/16, ... for n = 1, 2, 3, 4, ...

which approaches 0 as n increases, and |−1/2| < 1. In contrast if a = 2 then we have 2ⁿ = 2, 4, 8, 16, 32, 64, ... for n = 1, 2, 3, 4, 5, 6, ..., so that lim_{n→∞} 2ⁿ = ∞ ≠ 0.
It turns out that this generalizes to complex numbers. In particular we have:

Theorem 130

lim_{n→∞} (a + bi)ⁿ = 0 if and only if |a + bi| < 1.

This follows since:

(a + bi)ⁿ = rⁿ e^{nθi}

where r = |a + bi|, and since we have already shown that |e^{θi}| = 1 for any θ, so that:

|(a + bi)ⁿ| = rⁿ |e^{nθi}| = rⁿ → 0

if and only if r = |a + bi| < 1.


For example consider 1/2 + (1/2)i. We have:

r = √((1/2)² + (1/2)²) = 0.707 < 1

so that:

lim_{n→∞} (1/2 + (1/2)i)ⁿ = 0.

To see that this is so, consider:

(1/2 + (1/2)i)¹ = 1/2 + (1/2)i
(1/2 + (1/2)i)² = (1/2)i
(1/2 + (1/2)i)³ = −1/4 + (1/4)i
(1/2 + (1/2)i)⁴ = −1/4
(1/2 + (1/2)i)⁵ = −1/8 − (1/8)i
...
(1/2 + (1/2)i)⁴⁰ = 1/1048576.

Note how both the real and imaginary components get smaller and smaller as n gets larger.
Now consider 1 + 1i. Since r = |1 + 1i| is:

r = √(1² + 1²) = 1.414 > 1

we have that:

lim_{n→∞} (1 + i)ⁿ ≠ 0.

To see that this is so, consider:

(1 + i)² = 2i
(1 + i)³ = −2 + 2i
(1 + i)⁴ = −4
(1 + i)⁵ = −4 − 4i
(1 + i)⁶ = −8i
...
(1 + i)⁴⁰ = 1048576.

Note how both the real and imaginary components get larger and larger as n gets larger.
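Both examples are quickly confirmed in Python, where complex arithmetic is built in (a sketch for illustration):

z_stable, z_unstable = 0.5 + 0.5j, 1 + 1j
print(abs(z_stable), abs(z_unstable))   # 0.707... and 1.414...
print(z_stable**40)                     # about 9.5e-07, i.e. 1/1048576
print(z_unstable**40)                   # 1048576 (+0j)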

4.4 First-Order Linear Difference Equations

4.4.1 Definition
Let Y_t be some variable at time t; for example Y_t could be GNP at time t. Time is discrete, beginning at t = 0 and progressing in units of 1, so that t = 0, 1, 2, ..., and so there is a sequence of values of Y_t, beginning at Y_0 and continuing as Y_1, Y_2, Y_3, ....

Definition 131 A first-order linear difference equation is defined by an equation of motion:

Y_t = α + φY_{t−1}

and by a starting value Y_0 which gets the process started.

4.4.2 Example
For example suppose that Y_0 = 200, α = 300, φ = 0.5 so that:

Y_t = 300 + 0.5Y_{t−1}.

It then follows that:

Y_1 = 300 + 0.5Y_0 = 300 + 0.5 × 200 = 400
Y_2 = 300 + 0.5Y_1 = 300 + 0.5 × 400 = 500
Y_3 = 300 + 0.5Y_2 = 300 + 0.5 × 500 = 550
Y_4 = 300 + 0.5Y_3 = 300 + 0.5 × 550 = 575
Y_5 = 300 + 0.5Y_4 = 300 + 0.5 × 575 = 587.5
etc.

Definition 132 The equilibrium value Y* is that value of Y_t such that if Y_t = Y* then Y_{t+k} = Y* for all k > 0.

For a first-order linear difference equation we have that if Y_{t−1} = Y* then Y_t = Y*, so that substituting Y_{t−1} = Y* and Y_t = Y* into Y_t = α + φY_{t−1} yields:

Y* = α + φY*.

Thus as long as φ ≠ 1 we can solve for Y* as:

Y* = α/(1 − φ).

In the above example we have:

Y* = 300 + 0.5Y*

so that solving gives Y* = 600.
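A few lines of Python reproduce the iterations and show the path climbing toward Y* = 600 (a sketch for illustration):

y = 200.0                      # Y_0
for t in range(1, 11):
    y = 300 + 0.5 * y          # Y_t = 300 + 0.5 Y_{t-1}
    print(t, y)                # 400, 500, 550, 575, 587.5, ... -> 600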

4.4.3 Stability
An important question is: if we start at any Y_0, does Y_t always approach Y*? If it does, then we say that the difference equation is stable.

Definition 133 A difference equation is stable if for any starting value Y_0

lim_{t→∞} Y_t = Y*.

For example, suppose you have a ball and a teacup, as illustrated in Figure 4.1. If you put a small ball anywhere inside the teacup, the ball eventually comes to rest at the bottom of the teacup. This is an example of stable dynamics, with the bottom of the teacup a stable equilibrium point. If however we turn the teacup over, then the top of the teacup is an equilibrium point, since if we put the ball exactly at this point it will remain there forever if nothing disturbs it. If however the ball starts anywhere except at the top of the cup, or if it is slightly disturbed while it is at the top of the cup, then it will move away from the equilibrium. This then is an example of unstable dynamics, with the top of the teacup an unstable equilibrium point.

[Figure 4.1: the ball-and-teacup illustration of stable and unstable equilibria]

Just as with the teacup example, it turns out that some difference equations are stable and some are unstable. The above example turns out to be stable. If instead

Y_t = 300 + 2Y_{t−1}

with say Y_0 = 200, then this is unstable. To see this note that:

Y_1 = 300 + 2Y_0 = 300 + 2 × 200 = 700
Y_2 = 300 + 2Y_1 = 300 + 2 × 700 = 1700
Y_3 = 300 + 2Y_2 = 300 + 2 × 1700 = 3700
Y_4 = 300 + 2Y_3 = 300 + 2 × 3700 = 7700
Y_5 = 300 + 2Y_4 = 300 + 2 × 7700 = 15700
etc.

which does not approach Y*, given by:

Y* = 300/(1 − 2) = −300.
The question then is under what circumstances a difference equation is stable and under what circumstances it is unstable. We can answer this question precisely for a first-order linear difference equation by defining Y_t* as the deviation from the equilibrium value Y*, or

Y_t* ≡ Y_t − Y*

so that Y_t = Y_t* + Y*. We therefore have:

Y_t* + Y* = α + φ(Y*_{t−1} + Y*)
          = α + αφ/(1 − φ) + φY*_{t−1}

where α + αφ/(1 − φ) = α/(1 − φ) = Y*. Thus the Y* on the left side cancels with the Y* on the right side and we have:

Y_t* = φY*_{t−1}

which is the same difference equation except without the constant α.
It follows then that:

Y_t* = φY*_{t−1} = φ²Y*_{t−2} = ··· = φᵗY_0*

or

Y_t* = φᵗY_0*.

Recall that Y_t* = Y_t − Y*. It therefore follows that:

Theorem 134 The solution to a first-order linear difference equation takes the form:

Y_t = Y* + φᵗ(Y_0 − Y*).

We therefore have stability if and only if:

lim_{t→∞} φᵗ = 0

and so we obtain the following:

Theorem 135 The difference equation Y_t = α + φY_{t−1} is stable if and only if:

−1 < φ < 1.

Note that stability depends only on φ and not on Y_0 or α.
Thus Y_t = 300 + 0.5Y_{t−1} is stable because −1 < φ = 1/2 < 1, while Y_t = 300 + 2Y_{t−1} is not stable because φ = 2 > 1.

4.4.4 A Macroeconomic Example

Suppose that Y_t is GNP, C_t is consumption, I_t is investment and G_t is government expenditure, so that:

Y_t = C_t + I_t + G_t.

Both investment and government expenditure are exogenous, so that:

I_t = I, G_t = G

where I and G are constants. Consumption depends on income lagged one period via the consumption function

C_t = C + cY_{t−1}, 0 < c < 1

where c is the marginal propensity to consume. It follows that:

Y_t = C_t + I_t + G_t = C + cY_{t−1} + I + G

or

Y_t = (C + I + G) + cY_{t−1}

which is a first-order difference equation with α = C + I + G and φ = c.
The equilibrium value of GNP is thus:

Y* = α/(1 − φ) = (C + I + G)/(1 − c).

Note for example that ∂Y*/∂G = 1/(1 − c), the conventional multiplier.
The economy is stable since φ = c and 0 < c < 1. In fact

Y_t = (C + I + G)/(1 − c) + cᵗ (Y_0 − (C + I + G)/(1 − c)) → (C + I + G)/(1 − c) as t → ∞.

As an example consider the economy where Y_0 = 400, c = 0.5, C = I = G = 100, in which case:

Y_t = 300 + 0.5Y_{t−1}

which has the solution:

Y_t = 600 − 200(1/2)ᵗ

with a time path plotted below:

[Plot: Y_t = 600 − 200(1/2)ᵗ rising from 400 toward Y* = 600 for t = 0 to 10]

4.4.5 The Cobweb Model

Suppose that the quantity supplied depends on the price that occurred one period ago and not the current price. For example, hog producers might decide how many hogs to bring to market based on last year's hog price and not this year's. Assume that the quantity demanded depends on the current price. Then we have:

Q_t^d = a − bP_t
Q_t^s = c + dP_{t−1}

where Q_t^d is the quantity demanded, Q_t^s is the quantity supplied and P_t is the price. In equilibrium Q_t^d = Q_t^s, so that the price P_t follows a first-order linear difference equation given by:

a − bP_t = c + dP_{t−1}

or:

P_t = (a − c)/b − (d/b)P_{t−1}.

To solve this note that the equilibrium price P* is:

P* = ((a − c)/b)/(1 + d/b) = (a − c)/(b + d)

so that:

P_t = P* + (−d/b)ᵗ (P_0 − P*).

Note that because −d/b is a negative number, the actual price will, over consecutive years, overshoot and undershoot the equilibrium price.
Stability requires that d < b, or that the slope of the supply curve be less than the slope of the demand curve.
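A small Python sketch of the cobweb dynamics, with made-up values a = 10, b = 2, c = 1, d = 1 (so P* = (a − c)/(b + d) = 3, and d < b guarantees stability):

a, b, c, d = 10.0, 2.0, 1.0, 1.0
p = 5.0                                # starting price P_0
for t in range(1, 9):
    p = (a - c) / b - (d / b) * p      # P_t = (a-c)/b - (d/b) P_{t-1}
    print(t, round(p, 4))              # 2, 3.5, 2.75, ... oscillating toward 3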

4.5 Second-Order Linear Difference Equations

Definition 136 A second-order linear difference equation has an equation of motion given by:

Y_t = α + φ1 Y_{t−1} + φ2 Y_{t−2}

with given starting values Y_0 and Y_1.

If Y_t = Y*, the equilibrium value, it follows that Y_{t+1} = Y* and Y_{t+2} = Y*, so that Y* satisfies:

Y* = α + φ1 Y* + φ2 Y*

and hence:

Theorem 137 The equilibrium value for a second-order linear difference equation is given by:

Y* = α/(1 − φ1 − φ2).

If we define Y_t* = Y_t − Y* then:

Y_t* = φ1 Y*_{t−1} + φ2 Y*_{t−2}

so that Y_t* obeys the same difference equation but without the constant α.
To solve the difference equation in this form we hazard a guess that:

Y_t* = Arᵗ

for some A and r. Substituting Arᵗ for Y_t*, Ar^{t−1} for Y*_{t−1}, and Ar^{t−2} for Y*_{t−2}, it follows that:

Arᵗ = φ1 Ar^{t−1} + φ2 Ar^{t−2}.

Cancelling the A and dividing both sides by r^{t−2} we find that r is a root of the characteristic polynomial, defined by:

Definition 138 The characteristic polynomial of the second-order difference equation Y_t = α + φ1 Y_{t−1} + φ2 Y_{t−2} is

r² − φ1 r − φ2.

Thus the two roots r1 and r2 of the characteristic polynomial are:

r1, r2 = (φ1 ± √(φ1² + 4φ2))/2.
The general solution to a second-order linear difference equation then takes the form:

Y_t* = A1 r1ᵗ + A2 r2ᵗ

or, using the fact that Y_t* = Y_t − Y*, we have:

Theorem 139 The solution to a second-order linear difference equation takes the form:

Y_t = Y* + A1 r1ᵗ + A2 r2ᵗ.

Thus stability requires that |r1| < 1 and |r2| < 1, in which case r1ᵗ → 0 and r2ᵗ → 0 as t → ∞ and:

lim_{t→∞} Y_t = Y* + A1 lim_{t→∞} r1ᵗ + A2 lim_{t→∞} r2ᵗ = Y*.

Thus we have:

Theorem 140 The linear second-order difference equation:

Y_t = α + φ1 Y_{t−1} + φ2 Y_{t−2}

is stable if and only if:

|r1| < 1 and |r2| < 1

where r1 and r2 are the roots of the characteristic polynomial.

Note that there is no difficulty in establishing stability when r1 and r2 are complex, since we know how to calculate absolute values for complex numbers. Thus:

Theorem 141 If r1 = a + bi and r2 = a − bi the difference equation is stable if and only if:

|r1| = |r2| = √(a² + b²) < 1.

To calculate A1 and A2 we use the fact that Y_0 and Y_1 are given and substitute t = 0 and t = 1 into

Y_t = Y* + A1 r1ᵗ + A2 r2ᵗ

to obtain:

Y_0 = Y* + A1 + A2
Y_1 = Y* + A1 r1 + A2 r2.

It then follows that:

[ 1    1  ] [ A1 ]   [ Y_0 − Y* ]
[ r1   r2 ] [ A2 ] = [ Y_1 − Y* ]

so that A1 and A2 are the solutions to two linear equations:

[ A1 ]   [ 1    1  ]⁻¹ [ Y_0 − Y* ]                  [ r2   −1 ] [ Y_0 − Y* ]
[ A2 ] = [ r1   r2 ]   [ Y_1 − Y* ] = (1/(r2 − r1)) [ −r1   1 ] [ Y_1 − Y* ].

4.5.1 The Behavior of Y_t with Complex Roots

Assuming that we have complex roots we can write them in polar form as:

r1 = re^{θi}, r2 = re^{−θi}.

It can be shown that A1 and A2 will also be complex and conjugates of each other. Thus if A1 = c + di and A2 = c − di are written in polar form we have:

A1 = Ae^{αi}, A2 = Ae^{−αi}

where A = |c + di| and α = arg(c + di). Then:

Y_t = Y* + A1 r1ᵗ + A2 r2ᵗ
    = Y* + Ae^{αi}(re^{θi})ᵗ + Ae^{−αi}(re^{−θi})ᵗ
    = Y* + 2Arᵗ (e^{i(θt+α)} + e^{−i(θt+α)})/2
    = Y* + 2Arᵗ cos(θt + α).

Thus with complex roots Y_t will be a damped cosine wave if r < 1, since in this case rᵗ → 0 as t → ∞ and so Y_t will converge to Y*.
If r > 1, Y_t will be an exploding cosine wave, since in this case rᵗ → ∞ as t → ∞ and so Y_t will diverge from Y*.
Thus in general complex roots result in a kind of wave behavior for Y_t.

4.5.2 Example 1: Real Roots and Stable

Consider the second order difference equation:

Y_t = 100 + 1.3Y_{t−1} − 0.4Y_{t−2}
Y_1 = 850, Y_0 = 800.

The equilibrium value Y* is then:

Y* = 100/(1 − 1.3 − (−0.4)) = 1000.

The characteristic polynomial is:

r² − 1.3r + 0.4 = 0

so that:

r1, r2 = (1.3 ± √((−1.3)² − 4 × 0.4))/2

or:

r1 = 4/5, r2 = 1/2.

Since |r1| < 1 and |r2| < 1 this difference equation is stable; that is, no matter what the starting values, Y_t will converge to the equilibrium value Y* = 1000.
The general solution is then:

Y_t = 1000 + A1 (4/5)ᵗ + A2 (1/2)ᵗ

where A1 and A2 depend on the starting values.
Since Y_0 = 800 and Y_1 = 850 we have

Y_0 = 800 = 1000 + A1 (4/5)⁰ + A2 (1/2)⁰
Y_1 = 850 = 1000 + A1 (4/5)¹ + A2 (1/2)¹

so that in matrix notation:

[ 1     1   ] [ A1 ]   [ −200 ]
[ 4/5   1/2 ] [ A2 ] = [ −150 ]

or solving for A1 and A2 we have:

[ A1 ]   [ 1     1   ]⁻¹ [ −200 ]   [ −500/3 ]
[ A2 ] = [ 4/5   1/2 ]   [ −150 ] = [ −100/3 ].

We thus have the solution:

Y_t = 1000 − (500/3)(4/5)ᵗ − (100/3)(1/2)ᵗ.

The movement of Y_t over time is plotted below:

[Plot: Y_t rising from 800 toward Y* = 1000 for t = 0 to 20]

Note how Y_t approaches the equilibrium value Y* = 1000.
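The path can be generated either by direct recursion or from the closed-form solution; a Python sketch confirming the two coincide:

ys = [800.0, 850.0]                       # Y_0, Y_1
for t in range(2, 8):
    ys.append(100 + 1.3 * ys[-1] - 0.4 * ys[-2])   # the recursion

for t, y in enumerate(ys):
    closed = 1000 - (500/3) * 0.8**t - (100/3) * 0.5**t
    print(t, round(y, 4), round(closed, 4))        # identical columns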

4.5.3 Example 2: Real Roots and Unstable

Consider the second order difference equation:

Y_t = −100 + 0.7Y_{t−1} + 0.4Y_{t−2}
Y_1 = 850, Y_0 = 800.

The equilibrium value Y* is then:

Y* = −100/(1 − 0.7 − 0.4) = 1000.

The characteristic polynomial is:

r² − 0.7r − 0.4 = 0

so that:

r1, r2 = (0.7 ± √((−0.7)² − 4 × (−0.4)))/2

or:

r1 = 1.07, r2 = −0.373.

Since r1 = 1.07 > 1 it follows that the difference equation is unstable.
The general solution then takes the form:

Y_t = 1000 + A1 (1.07)ᵗ + A2 (−0.373)ᵗ

where A1 and A2 depend on the starting values.
Since Y_0 = 800 and Y_1 = 850 we have

Y_0 = 800 = 1000 + A1 (1.07)⁰ + A2 (−0.373)⁰
Y_1 = 850 = 1000 + A1 (1.07)¹ + A2 (−0.373)¹

so that in matrix notation:

[ 1      1      ] [ A1 ]   [ −200 ]
[ 1.07   −0.373 ] [ A2 ] = [ −150 ]

or solving for A1 and A2 we have:

[ A1 ]   [ 1      1      ]⁻¹ [ −200 ]   [ −155.65 ]
[ A2 ] = [ 1.07   −0.373 ]   [ −150 ] = [ −44.352 ].

Thus the solution is:

Y_t = 1000 − 155.65 × (1.07)ᵗ − 44.352 × (−0.373)ᵗ

which is plotted below:

[Plot: Y_t diverging downward from 800 for t = 0 to 30]

4.5.4 Example 3: Complex Roots and Stable

Consider the second order difference equation:

Y_t = 100 + Y_{t−1} − (1/2)Y_{t−2}
Y_0 = 100, Y_1 = 120.

The equilibrium value Y* is then:

Y* = 100/(1 − 1 − (−1/2)) = 200.

The characteristic polynomial is:

r² − r + 1/2 = 0

so that:

r1, r2 = (1 ± √((−1)² − 4 × 1/2))/2

or:

r1 = 1/2 + (1/2)i, r2 = 1/2 − (1/2)i.

Since

|r1| = |r2| = √((1/2)² + (1/2)²) = 1/√2 < 1

the difference equation is stable; that is, no matter what the starting values, Y_t will converge to the equilibrium value Y* = 200.
The general solution is then:

Y_t = 200 + A1 (1/2 + (1/2)i)ᵗ + A2 (1/2 − (1/2)i)ᵗ

where A1 and A2 depend on the starting values.
Since Y_0 = 100 and Y_1 = 120 we have

Y_0 = 100 = 200 + A1 (1/2 + (1/2)i)⁰ + A2 (1/2 − (1/2)i)⁰
Y_1 = 120 = 200 + A1 (1/2 + (1/2)i)¹ + A2 (1/2 − (1/2)i)¹

so that in matrix notation:

[ 1              1             ] [ A1 ]   [ −100 ]
[ 1/2 + (1/2)i   1/2 − (1/2)i  ] [ A2 ] = [ −80  ]

or solving for A1 and A2 we have:

[ A1 ]   [ −50 + 30i ]
[ A2 ] = [ −50 − 30i ].

We thus have the solution:

Y_t = 200 + (−50 + 30i)(1/2 + (1/2)i)ᵗ + (−50 − 30i)(1/2 − (1/2)i)ᵗ.

Here

1/2 + (1/2)i = (1/√2) e^{(π/4)i}
1/2 − (1/2)i = (1/√2) e^{−(π/4)i}
−50 + 30i = √3400 e^{2.601173i}
−50 − 30i = √3400 e^{−2.601173i}

since:

arg(−50 + 30i) = π + arctan(30/(−50)) = 2.601173.

Therefore:

Y_t = 200 + 2√3400 (1/√2)ᵗ cos(πt/4 + 2.601173).

This is plotted below:

[Plot: Y_t oscillating between about 100 and 220 and converging to Y* = 200 for t = 0 to 20]

4.5.5 Example 4: Complex Roots and Unstable

Consider the second order difference equation:

Y_t = 100 + 2Y_{t−1} − 2Y_{t−2}
Y_0 = 100, Y_1 = 120.

The equilibrium value Y* is then:

Y* = 100/(1 − 2 − (−2)) = 100.

The characteristic polynomial is:

r² − 2r + 2 = 0

so that:

r1 = 1 + i, r2 = 1 − i.

Since

|r1| = |r2| = √(1² + 1²) = √2 > 1

the difference equation is unstable; that is, no matter what the starting values, Y_t will diverge from the equilibrium value Y* = 100.
The general solution is then:

Y_t = 100 + A1 (1 + i)ᵗ + A2 (1 − i)ᵗ

where A1 and A2 depend on the starting values.
Since Y_0 = 100 and Y_1 = 120 we have

Y_0 = 100 = 100 + A1 (1 + i)⁰ + A2 (1 − i)⁰
Y_1 = 120 = 100 + A1 (1 + i)¹ + A2 (1 − i)¹

so that in matrix notation:

[ 1       1     ] [ A1 ]   [ 0  ]
[ 1 + i   1 − i ] [ A2 ] = [ 20 ]

or solving for A1 and A2 we have:

[ A1 ]   [ 1       1     ]⁻¹ [ 0  ]   [ −10i ]
[ A2 ] = [ 1 + i   1 − i ]   [ 20 ] = [ 10i  ].

We thus have the solution:

Y_t = 100 − 10i(1 + i)ᵗ + 10i(1 − i)ᵗ.

Since:

1 + i = √2 e^{(π/4)i}
1 − i = √2 e^{−(π/4)i}
−10i = 10 e^{−(π/2)i}

we can rewrite this as:

Y_t = 100 + 20(√2)ᵗ cos(πt/4 − π/2) = 100 + 20(√2)ᵗ sin(πt/4)

which is plotted below:

[Plot: Y_t oscillating with exploding amplitude for t = 0 to 20]

4.5.6 A Macroeconomic Example

Suppose as before that Y_t is GNP, C_t is consumption, I_t is investment and G_t is government expenditure, so that:

Y_t = C_t + I_t + G_t.

Now, however, let investment be endogenous according to the accelerator theory of investment, so that

I_t = I + γ(Y_{t−1} − Y_{t−2}), γ > 0.

We keep government expenditure exogenous, so that:

G_t = G.

As before, consumption depends on income lagged one period via the consumption function

C_t = C + cY_{t−1}, 0 < c < 1

where c is the marginal propensity to consume.
It now follows that:

Y_t = C_t + I_t + G_t = C + cY_{t−1} + I + γ(Y_{t−1} − Y_{t−2}) + G

or

Y_t = (C + I + G) + (c + γ)Y_{t−1} − γY_{t−2}

which is a second-order difference equation with α = C + I + G, φ1 = c + γ and φ2 = −γ.
The equilibrium value of GNP is thus:

Y* = α/(1 − φ1 − φ2) = (C + I + G)/(1 − (c + γ) − (−γ)) = (C + I + G)/(1 − c)

so that the equilibrium value is unchanged. This is because investment is the same in equilibrium: if Y_{t−1} = Y_{t−2} = Y* then

I_t = I + γ(Y* − Y*) = I.

Hence ∂Y*/∂G = 1/(1 − c), as before.
Stability requires that the roots:

r1, r2 = ((c + γ) ± √((c + γ)² − 4γ))/2

both be less than one in absolute value.
An interesting question is whether an economic model could have complex roots. Thus consider the case where:

C_t = 100 + 0.5Y_{t−1}
I_t = 100 + 0.8(Y_{t−1} − Y_{t−2})
G_t = 100

and Y_0 = Y_1 = 150.
In this case we have the difference equation:

Y_t = 300 + 1.3Y_{t−1} − 0.8Y_{t−2}

with characteristic polynomial:

r² − 1.3r + 0.8 = 0

which has complex roots:

r1 = 0.65 + 0.61441i
r2 = 0.65 − 0.61441i.

Since |0.65 + 0.61441i| = 0.89443 < 1 this is stable. Alternatively, since the roots are complex and |φ2| = 0.8 < 1, the difference equation is stable.
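The roots and their modulus can be checked with a short Python/NumPy sketch (illustrative only):

import numpy as np

r1, r2 = np.roots([1.0, -1.3, 0.8])   # roots of r^2 - 1.3 r + 0.8
print(r1, r2)                          # 0.65 +/- 0.61441i
print(abs(r1))                         # 0.894... < 1, so the path is stable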

4.6 First-Order Differential Equations

Difference equations are used when we think of time as being discrete. Economic data is inevitably in discrete time, whether it be weekly, monthly or annual. However, we tend to experience time as being continuous, and it is often easier to work with continuous time. In this case dynamics involves not difference equations but differential equations.

4.6.1 Definitions
Definition 142 A first-order linear differential equation consists of an equation of motion of the form:

y′(t) + a1 y(t) = b

with a given starting value y(0).

The problem is to solve for y(t). Often the dependence of y(t) on t is omitted, so we write y instead of y(t) and y′ instead of y′(t). Thus a first-order linear differential equation is sometimes written instead as:

y′ + a1 y = b.

4.6.2 The Case where b = 0

A differential equation with b = 0 is referred to as a homogeneous differential equation. Homogeneous differential equations are simpler to work with.
Therefore consider:

y′(t) + a1 y(t) = 0

or:

y′(t) = −a1 y(t).

Dividing both sides by y(t) we have:

y′(t)/y(t) = −a1.

But

d ln(y(t))/dt = y′(t)/y(t)

so this is equivalent to:

d ln(y(t))/dt = −a1

which implies that:

ln(y(t)) = −a1 t + a0

for some constant a0.
It then follows that:

y(t) = Ae^{−a1 t}

where A = e^{a0}. Setting t = 0 we can solve for A as:

y(0) = A

so that:

Theorem 143 The solution to y′(t) + a1 y(t) = 0 with y(0) given is:

y(t) = y(0) e^{−a1 t}.

A necessary and sufficient condition for this differential equation to be stable is then

a1 > 0

in which case

lim_{t→∞} y(t) = 0

independent of y(0).
For example, given y′(t) + 4y(t) = 0 with y(0) = 10, we have the solution:

y(t) = 10e^{−4t}

which is stable since y(t) → 0 as t → ∞, as can be seen in the plot below:

[Plot: y(t) = 10e^{−4t} decaying from 10 toward 0 for t = 0 to 5]

Verify this on your own by calculating y(t) for, say, t = 1, 4, 10 etc.
On the other hand, given y′(t) − 3y(t) = 0 with y(0) = 5, the solution is:

y(t) = 5e^{3t}

which is unstable, as can be seen in the plot below:

[Plot: y(t) = 5e^{3t} growing from 5 to about 22 for t = 0 to 0.5]

Verify this on your own by calculating y(t) for, say, t = 1, 4, 10 etc.

4.6.3 The Case where b ≠ 0

Let us now return to the case where the constant term b is not zero. We can solve this model by transforming the original model into a form where b = 0.
Given:

y′(t) + a1 y(t) = b

if y(t) ever reaches the equilibrium value y* then it stays there forever. Thus if y(t) = y* then y′(t) = 0, so that:

y* = b/a1.

Let ỹ(t) = y(t) − y* be the deviation of y(t) from y*. We then have:

ỹ′(t) = y′(t)

so that, replacing y(t) with ỹ(t) + y* and y′(t) with ỹ′(t), it follows that:

ỹ′(t) + a1(ỹ(t) + y*) = b

or:

ỹ′(t) + a1 ỹ(t) = 0.

Thus ỹ(t) follows an identical homogeneous differential equation (but with b = 0). From our previous analysis of this special case we conclude that:

ỹ(t) = ỹ(0) e^{−a1 t}

or, setting ỹ(t) = y(t) − y* and ỹ(0) = y(0) − y*, that:

Theorem 144 The solution to y′(t) + a1 y(t) = b with y(0) given is:

y(t) = y* + (y(0) − y*) e^{−a1 t}.

We therefore have:

Theorem 145 A necessary and sufficient condition for y′(t) + a1 y(t) = b to be stable is that:

a1 > 0

in which case

lim_{t→∞} y(t) = y*

independent of y(0).

4.6.4 Example 1
Consider y′(t) + 4y(t) = 8 with y(0) = 10. Then

y* = 8/4 = 2

with solution:

y(t) = 2 + 8e^{−4t}

which is stable, as can be seen from the plot below:

[Plot: y(t) = 2 + 8e^{−4t} decaying from 10 toward y* = 2 for t = 0 to 5]

Verify this on your own by calculating y(t) as t takes on larger values.
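A Python sketch comparing the closed-form solution with a crude Euler integration of y′ = 8 − 4y (the step size 10⁻⁴ is arbitrary):

import math

dt, y, t = 1e-4, 10.0, 0.0
while t < 1.0:
    y += dt * (8 - 4 * y)            # y'(t) = b - a1 * y(t)
    t += dt
print(y, 2 + 8 * math.exp(-4.0))     # both about 2.1465 at t = 1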

4.6.5 Example 2
Consider y′(t) − 3y(t) = 9 with y(0) = 6. Then

y* = 9/(−3) = −3

with solution:

y(t) = −3 + 9e^{3t}

which is unstable, as can be seen from the plot below:

[Plot: y(t) = −3 + 9e^{3t} growing from 6 to about 35 for t = 0 to 0.5]

Verify this on your own by calculating y(t) as t takes on larger values.

4.7 Second-Order Differential Equations

Definition 146 A second-order linear differential equation is given as:

y″(t) + a1 y′(t) + a2 y(t) = b

with y′(0) and y(0) given.

If y(t) = y*, the equilibrium value, then y″(t) = y′(t) = 0, so that:

Theorem 147 The equilibrium value y* is given by:

y* = b/a2.

4.7.1 Solution
To solve this differential equation we will first attempt to eliminate b. To this end define

ỹ(t) = y(t) − y*

so that

ỹ′(t) = y′(t), ỹ″(t) = y″(t).

Replacing y(t) by ỹ(t) + y*, y′(t) with ỹ′(t) and y″(t) with ỹ″(t), we have:

ỹ″(t) + a1 ỹ′(t) + a2 ỹ(t) = 0.

We saw for the first-order differential equation that the solution took the form:

ỹ(t) = Ae^{rt}.

Suppose we assume this to be true and ask what value r will take. To answer this set ỹ(t) = Ae^{rt}, so that ỹ′(t) = rAe^{rt} and ỹ″(t) = r²Ae^{rt}. It follows that:

r²Ae^{rt} + a1 rAe^{rt} + a2 Ae^{rt} = 0

or:

r² + a1 r + a2 = 0

which implies r is a root of the characteristic polynomial, defined by:

Definition 148 The characteristic polynomial of the differential equation:

y″(t) + a1 y′(t) + a2 y(t) = b

is:

r² + a1 r + a2.

We therefore have from:

r² + a1 r + a2 = 0

that the two roots r1 and r2 are given by:

r1 = (−a1 + √(a1² − 4a2))/2
r2 = (−a1 − √(a1² − 4a2))/2.

Since there are two roots it follows that:

ỹ(t) = A1 e^{r1 t} + A2 e^{r2 t}

or, using the fact that ỹ(t) = y(t) − y*:

Theorem 149 The solution to y″(t) + a1 y′(t) + a2 y(t) = b with y′(0) and y(0) given takes the form:

y(t) = y* + A1 e^{r1 t} + A2 e^{r2 t}.

To calculate A1 and A2, substitute t = 0 into the above solution and into

y′(t) = r1 A1 e^{r1 t} + r2 A2 e^{r2 t}

to obtain the two linear equations:

y(0) = y* + A1 + A2
y′(0) = r1 A1 + r2 A2.

It therefore follows that:

[ 1    1  ] [ A1 ]   [ y(0) − y* ]
[ r1   r2 ] [ A2 ] = [ y′(0)     ]

which can be solved to obtain A1 and A2.
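A Python/NumPy sketch of this recipe, applied to Example 1 further below (y″ + 4y′ + 2y = 4 with y(0) = 0, y′(0) = 1); the ordering of the two roots is immaterial since A1 and A2 swap with them:

import numpy as np

r1, r2 = np.roots([1.0, 4.0, 2.0])          # characteristic roots of r^2 + 4r + 2
ystar = 4.0 / 2.0                           # y* = b/a2
M = np.array([[1.0, 1.0], [r1, r2]])
A1, A2 = np.linalg.solve(M, [0.0 - ystar, 1.0])  # [y(0) - y*, y'(0)]
print(r1, r2, A1, A2)                       # roots -0.586, -3.414 and the A's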

4.7.2 Stability with Real Roots

If:

a1² − 4a2 ≥ 0

then the term under the square root sign that defines r1 and r2, that is, the discriminant a1² − 4a2, is non-negative, so that both r1 and r2 will be real. It then follows that y(t) is stable if and only if both roots are negative, that is:

r1 < 0
r2 < 0.

4.7.3 Stability with Complex Roots

If

a1² − 4a2 < 0

then the term under the square root sign that defines r1 and r2 is negative, so that both r1 and r2 will be complex numbers, or more precisely complex conjugates with:

r1 = c + di
r2 = c − di

where:

c = −a1/2
d = √(4a2 − a1²)/2.

Since:

e^{r1 t} = e^{(c+di)t} = e^{ct}(cos(dt) + i sin(dt))
e^{r2 t} = e^{(c−di)t} = e^{ct}(cos(dt) − i sin(dt))

it follows that

lim_{t→∞} e^{r1 t} = lim_{t→∞} e^{r2 t} = 0

if and only if c < 0. We thus have:

Theorem 150 The differential equation

y″(t) + a1 y′(t) + a2 y(t) = b

is stable if and only if the real parts of the two roots r1 and r2 of the characteristic polynomial are negative.

Alternatively, since c = −a1/2, stability in the case of complex roots is equivalent to the condition:

a1 > 0.
4.7.4 Example 1: Stable with Real Roots

Consider:

y″(t) + 4y′(t) + 2y(t) = 4
y(0) = 0
y′(0) = 1.

Then:

y* = 4/2 = 2

and the quadratic that defines r1 and r2 is:

r² + 4r + 2 = 0

with solutions:

r1 = −2 + √2 = −0.5857864 < 0
r2 = −2 − √2 = −3.414214 < 0.

Since r1 < 0 and r2 < 0 we conclude that y(t) is stable.
To solve for A1 and A2 note that from:

y(t) = 2 + A1 e^{r1 t} + A2 e^{r2 t}

it follows that:

[ 1         1        ] [ A1 ]   [ −2 ]
[ −2 + √2   −2 − √2  ] [ A2 ] = [ 1  ]

so:

[ A1 ]   [ 1         1        ]⁻¹ [ −2 ]   [ −(3/4)√2 − 1 ]
[ A2 ] = [ −2 + √2   −2 − √2  ]   [ 1  ] = [ (3/4)√2 − 1  ].

It follows then that the general solution is:

y(t) = 2 + (−(3/4)√2 − 1) e^{(−2+√2)t} + ((3/4)√2 − 1) e^{−(2+√2)t}

which is plotted below:

[Plot: y(t) rising from 0 toward y* = 2 for t = 0 to 10]

Note how y(t) approaches y* = 2, reflecting the fact that this differential equation is stable.

4.7.5 Example 2: Unstable with Real Roots

Consider:

y″(t) − 6y′(t) − 2y(t) = −6
y(0) = 0
y′(0) = 1

so that:

y* = −6/(−2) = 3

and the quadratic that defines r1 and r2 is:

r² − 6r − 2 = 0

with solutions:

r1 = 3 + √11 = 6.32 > 0
r2 = 3 − √11 = −0.32 < 0.

Since r1 > 0 it follows (even though r2 < 0) that the differential equation is unstable. The exact solution is:

y(t) = 3 + (−3/2 + (5/11)√11) e^{(3+√11)t} − (1/22)(10 + 3√11)√11 e^{(3−√11)t}

which is plotted below:

[Plot: y(t) diverging from y* = 3 for t = 0 to 1]

4.7.6 Example 3: Stable with Complex Roots

Consider:

y″(t) + 2y′(t) + 3y(t) = 6
y(0) = 0
y′(0) = 1.

Then:

y* = 6/3 = 2

and the quadratic that defines r1 and r2 is:

r² + 2r + 3 = 0

so that:

r1 = −1 + i√2
r2 = −1 − i√2.

Since the real part of r1 and r2, here −1, is negative, it follows that the differential equation is stable.
The solution can be shown to be:

y(t) = 2 − (2 cos(√2 t) + (1/√2) sin(√2 t)) e^{−t}

which is plotted below:

[Plot: y(t) rising from 0 toward y* = 2 for t = 0 to 5]

Note that it is the factor e^{−t} which ensures stability, since e^{−t} → 0 as t → ∞.

4.7.7 Example 4: Unstable with Complex Roots


Consider:
y00 (t) ¡ 2y 0 (t) + 3y (t) = 6
y (0) = 0
y 0 (0) = 1
then:
6
y¤ = =2
3
and the quadratic that de…nes r1 and r2 is:

r2 ¡ 2r + 3 = 0
CHAPTER 4. DYNAMICS 180

so that:
p
r1 = 1+i 2
p
r2 = 1 ¡ i 2:

Since the real part of r1 and r2 equals 1 and hence is positive, the di¤erential
equation is unstable.
It can be shown that the solution is:
µ ¶
3 p p
y (t) = 2 + p sin 2t ¡ 2 cos 2t et
2
which is plotted below:

1600
1400
1200
1000
800
600
400
200

0 1 2 3 4 5 6 7
t
-200
:
Plot of y (t)

Note that it is the factor et ; which causes the instability since et ! 1 as t ! 1:

You might also like