Charles University
Economics Institute
Academy of Science of the Czech Republic
A COOKBOOK OF
MATHEMATICS
Viatcheslav VINOGRADOV
June 1999
CERGE-EI LECTURE NOTES 1
A Cookbook of
Mathematics
Viatcheslav Vinogradov
Center for Economic Research and Graduate Education
and Economics Institute of the Czech Academy of Sciences,
Prague, 1999
© CERGE-EI 1999
ISBN 80-86286-20-7
To my Teachers
He liked those literary cooks
Who skim the cream of others’ books
And ruin half an author’s graces
By plucking bon-mots from their places.
Hannah More, Florio (1786)
Introduction
This textbook is based on an extended collection of handouts I distributed to the graduate
students in economics attending my summer mathematics class at the Center for Economic
Research and Graduate Education (CERGE) at Charles University in Prague.
Two considerations motivated me to write this book. First, I wanted to write a short
textbook, which could be covered in the course of two months and which, in turn, covers
the most signiﬁcant issues of mathematical economics. I have attempted to maintain a
balance between being overly detailed and overly schematic. Therefore this text should
resemble (in the ‘ideological’ sense) a “hybrid” of Chiang’s classic textbook Fundamental
Methods of Mathematical Economics and the comprehensive reference manual by Berck
and Sydsæter (Exact references appear at the end of this section).
My second objective in writing this text was to provide my students with simple “cook
book” recipes for solving problems they might face in their studies of economics. Since the
target audience was supposed to have some mathematical background (admittance to the
program requires at least BA level mathematics), my main goal was to refresh students’
knowledge of mathematics rather than teach them math ‘from scratch’. Students were
expected to be familiar with the basics of set theory, the real-number system, the concept
of a function, polynomial, rational, exponential and logarithmic functions, inequalities
and absolute values.
Bearing in mind the applied nature of the course, I usually refrained from presenting
complete proofs of theoretical statements. Instead, I chose to allocate more time and
space to examples of problems and their solutions and economic applications. I strongly
believe that for students in economics – for whom this text is meant – the application
of mathematics in their studies takes precedence over das Glasperlenspiel of abstract
theoretical constructions.
Mathematics is an ancient science and, therefore, it is little wonder that these notes
may remind the reader of the other textbooks which have already been written and
published. To be candid, I did not intend to be entirely original, since that would be
impossible. On the contrary, I tried to beneﬁt from books already in existence and
adapted some interesting examples and worthy pieces of theory presented there. If the
reader requires further proofs or more detailed discussion, I have included a useful, but
hardly exhaustive reference guide at the end of each section.
With very few exceptions, the analysis is limited to the case of real numbers, the
theory of complex numbers being beyond the scope of these notes.
Finally, I would like to express my deep gratitude to Professor Jan Kmenta for his
valuable comments and suggestions, to Sufana Razvan for his helpful assistance, to Aurelia
Pontes for excellent editorial support, to Natalka Churikova for her advice and, last but
not least, to my students who inspired me to write this book.
All remaining mistakes and misprints are solely mine.
I wish you success in your mathematical kitchen! Bon Appetit !
Supplementary Reading (General):
• Arrow, K. and M. Intriligator, eds. Handbook of Mathematical Economics, vol. 1.
• Berck P. and K. Sydsæter. Economist’s Mathematical Manual.
• Chiang, A. Fundamental Methods of Mathematical Economics.
• Ostaszewski, I. Mathematics in Economics: Models and Methods.
• Samuelson, P. Foundations of Economic Analysis.
• Silberberg, E. The Structure of Economics: A Mathematical Analysis.
• Takayama, A. Mathematical Economics.
• Yamane, T. Mathematics for Economists: An Elementary Survey.
Basic notation used in the text:
Statements: A, B, C, . . .
True/False: all statements are either true or false.
Negation: ¬A ‘not A’
Conjunction: A ∧ B ‘A and B’
Disjunction: A ∨ B ‘A or B’
Implication: A ⇒ B ‘A implies B’
(A is suﬃcient condition for B; B is necessary condition for A.)
Equivalence: A ⇔ B ‘A if and only if B’ (A iﬀ B, for short)
(A is necessary and suﬃcient for B; B is necessary and suﬃcient for A.)
Example 1 (¬A) ∧ A ⇔ FALSE.
(¬(A ∨ B)) ⇔ ((¬A) ∧ (¬B)) (De Morgan rule).
Quantiﬁers:
Existential: ∃ ‘There exists’ or ‘There is’
Universal: ∀ ‘For all’ or ‘For every’
Uniqueness: ∃! ‘There exists a unique ...’ or ‘There is a unique...’
The colon : and the vertical line | are widely used as abbreviations for ‘such that’.
a ∈ S means ‘a is an element of (belongs to) set S’.
Example 2 (Deﬁnition of continuity)
f is continuous at x if
(∀ε > 0)(∃δ > 0) : (∀y : |y − x| < δ ⇒ |f(y) − f(x)| < ε).
Optional information which might be helpful is typeset in footnotesize font.
The symbol ⚠ is used to draw the reader’s attention to potential pitfalls.
Contents
1 Linear Algebra 1
1.1 Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Laws of Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.3 Inverses and Transposes . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.4 Determinants and a Test for Non-Singularity . . . . . . . . . . . . . 4
1.1.5 Rank of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Appendix: Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5.1 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5.2 Vector Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.3 Independence and Bases . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.4 Linear Transformations and Changes of Bases . . . . . . . . . . . . 18
2 Calculus 21
2.1 The Concept of Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Differentiation - the Case of One Variable . . . . . . . . . . . . . . . . 22
2.3 Rules of Diﬀerentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 Maxima and Minima of a Function of One Variable . . . . . . . . . . . . . 26
2.5 Integration (The Case of One Variable) . . . . . . . . . . . . . . . . . . . . 29
2.6 Functions of More than One Variable . . . . . . . . . . . . . . . . . . . . . 32
2.7 Unconstrained Optimization in the Case of More than One Variable . . . . 34
2.8 The Implicit Function Theorem . . . . . . . . . . . . . . . . . . . . . . . . 35
2.9 Concavity, Convexity, Quasiconcavity and Quasiconvexity . . . . . . . . . . 38
2.10 Appendix: Matrix Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . 40
3 Constrained Optimization 43
3.1 Optimization with Equality Constraints . . . . . . . . . . . . . . . . . . . . 43
3.2 The Case of Inequality Constraints . . . . . . . . . . . . . . . . . . . . . . 47
3.2.1 Non-Linear Programming . . . . . . . . . . . . . . . . . . . . . 47
3.2.2 Kuhn-Tucker Conditions . . . . . . . . . . . . . . . . . . . . . . 49
3.3 Appendix: Linear Programming . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3.1 The Setup of the Problem . . . . . . . . . . . . . . . . . . . . . . . 54
3.3.2 The Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.3.3 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4 Dynamics 63
4.1 Diﬀerential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.1.1 Diﬀerential Equations of the First Order . . . . . . . . . . . . . . . 63
4.1.2 Linear Differential Equations of a Higher Order with Constant Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.1.3 Systems of the First Order Linear Diﬀerential Equations . . . . . . 68
4.1.4 Simultaneous Diﬀerential Equations. Types of Equilibria . . . . . . 77
4.2 Diﬀerence Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2.1 First-Order Linear Difference Equations . . . . . . . . . . . . . . 80
4.2.2 Second-Order Linear Difference Equations . . . . . . . . . . . . 82
4.2.3 The General Case of Order n . . . . . . . . . . . . . . . . . . . . . 83
4.3 Introduction to Dynamic Optimization . . . . . . . . . . . . . . . . . . . . 85
4.3.1 The FirstOrder Conditions . . . . . . . . . . . . . . . . . . . . . . 85
4.3.2 Present-Value and Current-Value Hamiltonians . . . . . . . . . 87
4.3.3 Dynamic Problems with Inequality Constraints . . . . . . . . . . . 87
5 Exercises 93
5.1 Solved Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.2 More Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.3 Sample Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
1 Linear Algebra
1.1 Matrix Algebra
Definition 1 An m × n matrix is a rectangular array of real numbers with m rows and
n columns:
$$
A = \begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{pmatrix};
$$
m × n is called the dimension or order of A. If m = n, the matrix is square of order n.
⚠ A subscripted element of a matrix is always read as $a_{row,\,column}$.
A shorthand notation is $A = (a_{ij})$, i = 1, 2, . . . , m and j = 1, 2, . . . , n, or $A = (a_{ij})_{[m\times n]}$.
A vector is a special case of a matrix, when either m = 1 (row vectors $v = (v_1, v_2, \dots, v_n)$)
or n = 1 (column vectors):
$$
u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m \end{pmatrix}.
$$
1.1.1 Matrix Operations
• Addition and subtraction: given $A = (a_{ij})_{[m\times n]}$ and $B = (b_{ij})_{[m\times n]}$,
$$A \pm B = (a_{ij} \pm b_{ij})_{[m\times n]},$$
i.e. we simply add or subtract corresponding elements.
⚠ Note that these operations are defined only if A and B are of the same dimension.
• Scalar multiplication:
$$\lambda A = (\lambda a_{ij}), \quad \text{where } \lambda \in R,$$
i.e. each element of A is multiplied by the same scalar λ.
• Matrix multiplication: if $A = (a_{ij})_{[m\times n]}$ and $B = (b_{ij})_{[n\times k]}$, then
$$A \cdot B = C = (c_{ij})_{[m\times k]}, \quad \text{where } c_{ij} = \sum_{l=1}^{n} a_{il} b_{lj}.$$
⚠ Note the dimensions of the matrices!
Recipe 1 – How to Multiply Two Matrices:
In order to get the element $c_{ij}$ of matrix C, you need to multiply the ith row of matrix
A by the jth column of matrix B.
Example 3
$$
\begin{pmatrix} 3 & 2 & 1 \\ 4 & 5 & 6 \end{pmatrix}
\begin{pmatrix} 9 & 10 \\ 12 & 11 \\ 14 & 13 \end{pmatrix}
=
\begin{pmatrix}
3\cdot 9 + 2\cdot 12 + 1\cdot 14 & 3\cdot 10 + 2\cdot 11 + 1\cdot 13 \\
4\cdot 9 + 5\cdot 12 + 6\cdot 14 & 4\cdot 10 + 5\cdot 11 + 6\cdot 13
\end{pmatrix}.
$$
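As a quick illustration (not part of the original text), Recipe 1 can be sketched in a few lines of Python; the helper name `matmul` is my own:

```python
def matmul(A, B):
    """Multiply matrix A (m x n) by B (n x k), both given as lists of rows."""
    n = len(B)
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    k = len(B[0])
    # c_ij = sum over l of a_il * b_lj  (ith row of A times jth column of B)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(k)]
            for i in range(len(A))]

# The matrices of Example 3:
C = matmul([[3, 2, 1], [4, 5, 6]], [[9, 10], [12, 11], [14, 13]])
# C == [[65, 65], [180, 173]]
```

Carrying out the arithmetic of Example 3 by hand gives the same entries: for instance, $c_{11} = 3\cdot 9 + 2\cdot 12 + 1\cdot 14 = 65$.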
Example 4 A system of m linear equations for n unknowns,
$$
\begin{array}{ccccccc}
a_{11}x_1 & + & a_{12}x_2 & + \dots + & a_{1n}x_n & = & b_1, \\
\vdots & & & & & & \vdots \\
a_{m1}x_1 & + & a_{m2}x_2 & + \dots + & a_{mn}x_n & = & b_m,
\end{array}
$$
can be written as $Ax = b$, where $A = (a_{ij})_{[m\times n]}$, and
$$
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}.
$$
1.1.2 Laws of Matrix Operations
• Commutative law of addition: A + B = B +A.
• Associative law of addition: (A +B) +C = A + (B + C).
• Associative law of multiplication: A(BC) = (AB)C.
• Distributive law:
$A(B + C) = AB + AC$ (premultiplication by A),
$(B + C)A = BA + CA$ (postmultiplication by A).
⚠ The commutative law of multiplication is not applicable in the matrix case: in general, $AB \neq BA$!
Example 5 Let $A = \begin{pmatrix} 2 & 0 \\ 3 & 8 \end{pmatrix}$, $B = \begin{pmatrix} 7 & 2 \\ 6 & 3 \end{pmatrix}$.
Then
$$
AB = \begin{pmatrix} 14 & 4 \\ 69 & 30 \end{pmatrix}
\neq
BA = \begin{pmatrix} 20 & 16 \\ 21 & 24 \end{pmatrix}.
$$
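A quick numerical check of Example 5 (an illustrative sketch; `matmul` is an ad-hoc helper, not from the text):

```python
def matmul(A, B):
    # c_ij = sum over l of a_il * b_lj
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 0], [3, 8]]
B = [[7, 2], [6, 3]]
AB = matmul(A, B)   # [[14, 4], [69, 30]]
BA = matmul(B, A)   # [[20, 16], [21, 24]]
assert AB != BA     # matrix multiplication does not commute
```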
Example 6 Let $v = (v_1, v_2, \dots, v_n)$, $u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}$.
Then $vu = v_1 u_1 + v_2 u_2 + \dots + v_n u_n$ is a scalar, and $uv = C = (c_{ij})_{[n\times n]}$ is an $n \times n$
matrix, with $c_{ij} = u_i v_j$, $i, j = 1, \dots, n$.
We introduce the identity or unit matrix of dimension n, $I_n$, as
$$
I_n = \begin{pmatrix}
1 & 0 & \dots & 0 \\
0 & 1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & 1
\end{pmatrix}.
$$
Note that $I_n$ is always a square $[n\times n]$ matrix (further on, the subscript n will be omitted).
$I_n$ has the following properties:
a) AI = IA = A,
b) AIB = AB for all A, B.
In this sense the identity matrix corresponds to 1 in the case of scalars.
The null matrix is a matrix of any dimension for which all elements are zero:
$$
0 = \begin{pmatrix}
0 & \dots & 0 \\
\vdots & \ddots & \vdots \\
0 & \dots & 0
\end{pmatrix}.
$$
Properties of a null matrix:
a) A + 0 = A,
b) A + (−A) = 0.
⚠ Note that $AB = 0$ does not imply $A = 0$ or $B = 0$; likewise, $AB = AC$ does not imply $B = C$.
Definition 2 A diagonal matrix is a square matrix whose only nonzero elements appear
on the principal (or main) diagonal.
A triangular matrix is a square matrix which has only zero elements above or below
the principal diagonal.
1.1.3 Inverses and Transposes
Definition 3 We say that $B = (b_{ij})_{[n\times m]}$ is the transpose of $A = (a_{ij})_{[m\times n]}$ if $a_{ji} = b_{ij}$
for all i = 1, . . . , n and j = 1, . . . , m.
Usually transposes are denoted as $A'$ (or as $A^T$).
Recipe 2 – How to Find the Transpose of a Matrix:
The transpose $A'$ of A is obtained by making the columns of A into the rows of $A'$.
Example 7 $A = \begin{pmatrix} 1 & 2 & 3 \\ 0 & a & b \end{pmatrix}$, $A' = \begin{pmatrix} 1 & 0 \\ 2 & a \\ 3 & b \end{pmatrix}$.
Properties of transposition:
a) $(A')' = A$
b) $(A + B)' = A' + B'$
c) $(\alpha A)' = \alpha A'$, where α is a real number.
d) $(AB)' = B'A'$ ⚠ Note the order of the transposed matrices!
Definition 4 If $A' = A$, A is called symmetric.
If $A' = -A$, A is called antisymmetric (or skew-symmetric).
If $A'A = I$, A is called orthogonal.
If $A = A'$ and $AA = A$, A is called idempotent.
Definition 5 The inverse matrix $A^{-1}$ is defined by $A^{-1}A = AA^{-1} = I$.
⚠ Note that A as well as $A^{-1}$ are square matrices of the same dimension (this follows from
the necessity to have the preceding line defined).
Example 8 If $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ then the inverse of A is $A^{-1} = \begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix}$.
We can easily check that
$$
\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix}
=
\begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix}
\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
$$
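The check in Example 8 can be automated with exact rational arithmetic (a sketch; `matmul` is a hypothetical helper, not part of the text):

```python
from fractions import Fraction as F

A = [[F(1), F(2)], [F(3), F(4)]]
Ainv = [[F(-2), F(1)], [F(3, 2), F(-1, 2)]]   # the claimed inverse

def matmul(X, Y):
    # c_ij = sum over l of x_il * y_lj
    return [[sum(X[i][l] * Y[l][j] for l in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

I2 = [[F(1), F(0)], [F(0), F(1)]]
# both products give the identity, as Definition 5 requires
assert matmul(A, Ainv) == I2 and matmul(Ainv, A) == I2
```

Using `Fraction` avoids the rounding noise that floating-point entries such as 3/2 would otherwise introduce.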
Other important characteristics of inverse matrices and inversion:
• Not all square matrices have inverses. If a square matrix has an inverse, it is
called regular or non-singular. Otherwise it is called singular.
• If $A^{-1}$ exists, it is unique.
• $A^{-1}A = I$ is equivalent to $AA^{-1} = I$.
Properties of inversion:
a) $(A^{-1})^{-1} = A$
b) $(\alpha A)^{-1} = \frac{1}{\alpha}A^{-1}$, where α is a real number, α ≠ 0.
c) $(AB)^{-1} = B^{-1}A^{-1}$ ⚠ Note the order of the matrices!
d) $(A')^{-1} = (A^{-1})'$
1.1.4 Determinants and a Test for Non-Singularity
The formal definition of the determinant is as follows: given an n × n matrix $A = (a_{ij})$,
$$
\det(A) = \sum_{(\alpha_1,\dots,\alpha_n)} (-1)^{I(\alpha_1,\dots,\alpha_n)}\, a_{1\alpha_1} a_{2\alpha_2} \dots a_{n\alpha_n},
$$
where $(\alpha_1,\dots,\alpha_n)$ are all different permutations of $(1, 2, \dots, n)$, and $I(\alpha_1,\dots,\alpha_n)$ is the
number of inversions.
Usually we denote the determinant of A as $\det(A)$ or $|A|$.
For practical purposes, we can give an alternative recursive definition of the determinant. Given the fact that the determinant of a scalar is a scalar itself, we arrive at the
following
Definition 6 (Laplace Expansion Formula)
$$
\det(A) = \sum_{k=1}^{n} (-1)^{l+k} a_{lk} \det(M_{lk}) \quad \text{for some integer } l,\ 1 \le l \le n.
$$
Here $M_{lk}$ is the minor of element $a_{lk}$ of the matrix A, which is obtained by deleting the lth
row and the kth column of A. $(-1)^{l+k}\det(M_{lk})$ is called the cofactor of the element $a_{lk}$.
Example 9 Given the matrix
$$
A = \begin{pmatrix} 2 & 3 & 0 \\ 1 & 4 & 5 \\ 7 & 2 & 1 \end{pmatrix},
\quad \text{the minor of the element } a_{23} \text{ is } M_{23} = \begin{pmatrix} 2 & 3 \\ 7 & 2 \end{pmatrix}.
$$
Note that in the above expansion formula we expanded the determinant by elements
of the lth row. Alternatively, we can expand it by elements of the lth column. Thus the
Laplace expansion formula can be rewritten as
$$
\det(A) = \sum_{k=1}^{n} (-1)^{k+l} a_{kl} \det(M_{kl}) \quad \text{for some integer } l,\ 1 \le l \le n.
$$
Example 10 The determinant of a 2 × 2 matrix:
$$
\det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11}a_{22} - a_{12}a_{21}.
$$
Example 11 The determinant of a 3 × 3 matrix:
$$
\det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
= a_{11}\det\begin{pmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{pmatrix}
- a_{12}\det\begin{pmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{pmatrix}
+ a_{13}\det\begin{pmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix} =
$$
$$
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32}
- a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}.
$$
Properties of the determinant:
a) $\det(A \cdot B) = \det(A)\det(B)$.
b) ⚠ In general, $\det(A + B) \neq \det(A) + \det(B)$.
Recipe 3 – How to Calculate the Determinant:
We can apply the following useful rules:
1. The multiplication of any one row (or column) by a scalar k will change the value
of the determinant k-fold.
2. The interchange of any two rows (columns) will alter the sign but not the numerical
value of the determinant.
3. If a multiple of any row is added to (or subtracted from) any other row, it will not
change the value or the sign of the determinant. The same holds true for columns.
(I.e. the determinant is not affected by linear operations with rows (or columns).)
4. If two rows (or columns) are identical, the determinant will vanish.
5. The determinant of a triangular matrix is a product of its principal diagonal elements.
Using these rules, we can simplify the matrix (e.g. obtain as many zero elements as
possible) and then apply Laplace expansion.
Example 12 Let
$$
A = \begin{pmatrix}
4 & -2 & 6 & 2 \\
0 & -1 & 5 & -3 \\
2 & -1 & 8 & -2 \\
0 & 0 & 0 & 2
\end{pmatrix}.
$$
Factoring 2 out of the first row and then subtracting the new first row from the third row, we get
$$
\det A = 2 \det \begin{pmatrix}
2 & -1 & 3 & 1 \\
0 & -1 & 5 & -3 \\
2 & -1 & 8 & -2 \\
0 & 0 & 0 & 2
\end{pmatrix}
= 2 \det \begin{pmatrix}
2 & -1 & 3 & 1 \\
0 & -1 & 5 & -3 \\
0 & 0 & 5 & -3 \\
0 & 0 & 0 & 2
\end{pmatrix}
= 2 \cdot 2 \cdot (-1) \cdot 5 \cdot 2 = -40.
$$
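Definition 6 translates directly into a short recursive routine, which reproduces the result of Example 12 (an illustrative sketch; the name `det` is mine):

```python
def det(A):
    """Determinant by Laplace expansion along the first row (Definition 6)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for k in range(n):
        # minor M_1k: delete row 1 and column k (0-based here)
        minor = [row[:k] + row[k + 1:] for row in A[1:]]
        # sign (-1)^(1 + (k+1)) = (-1)^k in 0-based indexing
        total += (-1) ** k * A[0][k] * det(minor)
    return total

A = [[4, -2, 6, 2],
     [0, -1, 5, -3],
     [2, -1, 8, -2],
     [0, 0, 0, 2]]
assert det(A) == -40   # agrees with Example 12
```

For large matrices this expansion costs O(n!) operations; Recipe 3's row-reduction rules give a far cheaper route, but the recursion mirrors the definition exactly.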
If we have a block-diagonal matrix, i.e. a partitioned matrix of the form
$$
P = \begin{pmatrix} P_{11} & P_{12} \\ 0 & P_{22} \end{pmatrix},
\quad \text{where } P_{11} \text{ and } P_{22} \text{ are square matrices,}
$$
then $\det(P) = \det(P_{11})\det(P_{22})$.
If we have a partitioned matrix
$$
P = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix},
\quad \text{where } P_{11},\ P_{22} \text{ are square matrices,}
$$
then $\det(P) = \det(P_{22}) \det(P_{11} - P_{12}P_{22}^{-1}P_{21}) = \det(P_{11}) \det(P_{22} - P_{21}P_{11}^{-1}P_{12})$.
Proposition 1 (The Determinant Test for Non-Singularity)
A matrix A is non-singular ⇔ det(A) ≠ 0.
As a corollary, we get
Proposition 2 $A^{-1}$ exists ⇔ det(A) ≠ 0.
Recipe 4 – How to Find an Inverse Matrix:
There are two ways of finding inverses.
Assume that matrix A is invertible, i.e. det(A) ≠ 0.
1. Method of adjoint matrix. For the computation of an inverse matrix $A^{-1}$ we use
the following algorithm: $A^{-1} = (d_{ij})$, where
$$
d_{ij} = \frac{1}{\det(A)} (-1)^{i+j} \det(M_{ji}).
$$
⚠ Note the order of the indices in $M_{ji}$!
This method is called the “method of adjoint” because we have to compute the so-called adjoint of
matrix A, which is defined as the matrix $\mathrm{adj}\,A = C' = (|C_{ji}|)$, where $|C_{ij}|$ is the cofactor of the
element $a_{ij}$.
2. Gauss elimination method or pivotal method. An identity matrix is placed alongside
the matrix A that is to be inverted. Then, the same elementary row operations
are performed on both matrices until A has been reduced to an identity matrix. The
identity matrix upon which the elementary row operations have been performed will
then become the inverse matrix we seek.
Example 13 (method of adjoint)
$$
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\Longrightarrow
A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.
$$
Example 14 (Gauss elimination method)
Let $A = \begin{pmatrix} 2 & 3 \\ 4 & 2 \end{pmatrix}$. Then
$$
\left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 4 & 2 & 0 & 1 \end{array}\right) \sim
\left(\begin{array}{cc|cc} 1 & 3/2 & 1/2 & 0 \\ 4 & 2 & 0 & 1 \end{array}\right) \sim
\left(\begin{array}{cc|cc} 1 & 3/2 & 1/2 & 0 \\ 0 & -4 & -2 & 1 \end{array}\right) \sim
$$
$$
\sim \left(\begin{array}{cc|cc} 1 & 3/2 & 1/2 & 0 \\ 0 & 1 & 1/2 & -1/4 \end{array}\right) \sim
\left(\begin{array}{cc|cc} 1 & 0 & -1/4 & 3/8 \\ 0 & 1 & 1/2 & -1/4 \end{array}\right).
$$
Therefore, the inverse is
$$
A^{-1} = \begin{pmatrix} -1/4 & 3/8 \\ 1/2 & -1/4 \end{pmatrix}.
$$
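The pivotal method of Example 14 can be sketched mechanically, using exact rationals so that entries like 3/8 come out exactly (an assumption-laden sketch; `invert` is my own name, and no special handling of singular input is attempted beyond the pivot search failing):

```python
from fractions import Fraction as F

def invert(A):
    """Gauss-Jordan: row-reduce [A | I] until the left block is I (Recipe 4)."""
    n = len(A)
    # augment A with the identity matrix
    M = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # pivot: bring up a row with a nonzero entry in this column
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        # scale the pivot row so the pivot becomes 1
        M[col] = [x / M[col][col] for x in M[col]]
        # eliminate this column from every other row
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    # the right block is now the inverse
    return [row[n:] for row in M]

assert invert([[2, 3], [4, 2]]) == [[F(-1, 4), F(3, 8)], [F(1, 2), F(-1, 4)]]
```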
1.1.5 Rank of a Matrix
A linear combination of vectors $a_1, a_2, \dots, a_k$ is a sum
$$
q_1 a_1 + q_2 a_2 + \dots + q_k a_k,
$$
where $q_1, q_2, \dots, q_k$ are real numbers.
Definition 7 Vectors $a_1, a_2, \dots, a_k$ are linearly dependent if and only if there exist numbers $c_1, c_2, \dots, c_k$, not all zero, such that
$$
c_1 a_1 + c_2 a_2 + \dots + c_k a_k = 0.
$$
Example 15 Vectors $a_1 = (2, 4)$ and $a_2 = (3, 6)$ are linearly dependent: if, say, $c_1 = 3$
and $c_2 = -2$, then $c_1 a_1 + c_2 a_2 = (6, 12) + (-6, -12) = 0$.
Recall that if we have n linearly independent vectors $e_1, e_2, \dots, e_n$, they are said to span an n-dimensional vector space, or to constitute a basis in an n-dimensional vector space. For more details see the section "Vector Spaces".
Definition 8 The rank of a matrix A, rank(A), can be defined as
– the maximum number of linearly independent rows;
– or the maximum number of linearly independent columns;
– or the order of the largest non-zero minor of A.
Example 16
$$
\mathrm{rank} \begin{pmatrix} 1 & 3 & 2 \\ 2 & 6 & 4 \\ -5 & 7 & 1 \end{pmatrix} = 2.
$$
The first two rows are linearly dependent; therefore the maximum number of linearly
independent rows is equal to 2.
Properties of the rank:
• The column rank and the row rank of a matrix are equal.
• rank(AB) ≤ min(rank(A), rank(B)).
• rank(A) = rank(AA′) = rank(A′A).
Using the notion of rank, we can reformulate the condition for non-singularity:
Proposition 3 If A is a square matrix of order n, then
rank(A) = n ⇔ det(A) ≠ 0.
1.2 Systems of Linear Equations
Consider a system of n linear equations for n unknowns Ax = b.
Recipe 5 – How to Solve a Linear System Ax = b (general rules):
b = 0 (homogeneous case):
If det(A) ≠ 0, then the system has a unique trivial (zero) solution.
If det(A) = 0, then the system has an infinite number of solutions.
b ≠ 0 (non-homogeneous case):
If det(A) ≠ 0, then the system has a unique solution.
If det(A) = 0, then
a) rank(A) = rank(Ã) ⇒ the system has an infinite number of solutions;
b) rank(A) ≠ rank(Ã) ⇒ the system is inconsistent.
Here Ã is the so-called augmented matrix,
$$
\tilde{A} = \begin{pmatrix}
a_{11} & \dots & a_{1n} & b_1 \\
\vdots & \ddots & \vdots & \vdots \\
a_{n1} & \dots & a_{nn} & b_n
\end{pmatrix}.
$$
Recipe 6 – How to Solve the System of Linear Equations, if b ≠ 0 and det(A) ≠ 0:
1. The inverse matrix method:
Since $A^{-1}$ exists, the solution x can be found as $x = A^{-1}b$.
2. Gauss method:
We perform the same elementary row operations on matrix A and vector b until A
has been reduced to an identity matrix. The vector b upon which the elementary row
operations have been performed will then become the solution.
3. Cramer's rule:
We can consecutively find all elements $x_1, x_2, \dots, x_n$ of the vector x using the following
formula:
$$
x_j = \frac{\det(A_j)}{\det(A)}, \quad \text{where } A_j = \begin{pmatrix}
a_{11} & \dots & a_{1\,j-1} & b_1 & a_{1\,j+1} & \dots & a_{1n} \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
a_{n1} & \dots & a_{n\,j-1} & b_n & a_{n\,j+1} & \dots & a_{nn}
\end{pmatrix},
$$
i.e. $A_j$ is the matrix A with its jth column replaced by the vector b.
Example 17 Let us solve
$$
\begin{pmatrix} 2 & 3 \\ 4 & -1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 12 \\ 10 \end{pmatrix}
$$
for $x_1, x_2$ using Cramer's rule:
$$
\det(A) = 2 \cdot (-1) - 3 \cdot 4 = -14,
$$
$$
\det(A_1) = \det \begin{pmatrix} 12 & 3 \\ 10 & -1 \end{pmatrix} = 12 \cdot (-1) - 3 \cdot 10 = -42,
$$
$$
\det(A_2) = \det \begin{pmatrix} 2 & 12 \\ 4 & 10 \end{pmatrix} = 2 \cdot 10 - 12 \cdot 4 = -28,
$$
therefore
$$
x_1 = \frac{-42}{-14} = 3, \quad x_2 = \frac{-28}{-14} = 2.
$$
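For the 2 × 2 case, Cramer's rule fits in a few lines (an illustrative sketch; `det2` and `cramer_2x2` are my own names):

```python
from fractions import Fraction as F

def det2(M):
    # determinant of a 2x2 matrix (Example 10)
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def cramer_2x2(A, b):
    """Solve Ax = b for a 2x2 system by Cramer's rule (Recipe 6)."""
    d = det2(A)
    # A_j: the matrix A with its jth column replaced by b
    A1 = [[b[0], A[0][1]], [b[1], A[1][1]]]
    A2 = [[A[0][0], b[0]], [A[1][0], b[1]]]
    return F(det2(A1), d), F(det2(A2), d)

# The system of Example 17:
assert cramer_2x2([[2, 3], [4, -1]], [12, 10]) == (3, 2)
```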
Economics Application 1 (General Market Equilibrium)
Consider a market for three goods. Demand and supply for each good are given by:
$$
\begin{array}{ll}
D_1 = 5 - 2P_1 + P_2 + P_3, & S_1 = -4 + 3P_1 + 2P_2, \\
D_2 = 6 + 2P_1 - 3P_2 + P_3, & S_2 = 3 + 2P_2, \\
D_3 = 20 + P_1 + 2P_2 - 4P_3, & S_3 = 3 + P_2 + 3P_3,
\end{array}
$$
where $P_i$ is the price of good i, i = 1, 2, 3.
The equilibrium conditions are $D_i = S_i$, i = 1, 2, 3, that is,
$$
\begin{array}{rrrl}
5P_1 & +\ P_2 & -\ P_3 & = 9 \\
-2P_1 & +\ 5P_2 & -\ P_3 & = 3 \\
-P_1 & -\ P_2 & +\ 7P_3 & = 17
\end{array}
$$
This system of linear equations can be solved in at least two ways.
a) Using Cramer's rule:
$$
A = \begin{pmatrix} 5 & 1 & -1 \\ -2 & 5 & -1 \\ -1 & -1 & 7 \end{pmatrix}, \quad
A_1 = \begin{pmatrix} 9 & 1 & -1 \\ 3 & 5 & -1 \\ 17 & -1 & 7 \end{pmatrix},
$$
$$
P_1^{*} = \frac{\det(A_1)}{\det(A)} = \frac{356}{178} = 2.
$$
Similarly, $P_2^{*} = 2$ and $P_3^{*} = 3$. The vector $(P_1^{*}, P_2^{*}, P_3^{*})$ describes the general market
equilibrium.
b) Using the inverse matrix rule:
Let us denote
$$
A = \begin{pmatrix} 5 & 1 & -1 \\ -2 & 5 & -1 \\ -1 & -1 & 7 \end{pmatrix}, \quad
P = \begin{pmatrix} P_1 \\ P_2 \\ P_3 \end{pmatrix}, \quad
B = \begin{pmatrix} 9 \\ 3 \\ 17 \end{pmatrix}.
$$
The matrix form of the system is $AP = B$, which implies $P = A^{-1}B$:
$$
A^{-1} = \frac{1}{|A|} \begin{pmatrix} 34 & -6 & 4 \\ 15 & 34 & 7 \\ 7 & 4 & 27 \end{pmatrix},
$$
$$
P = \frac{1}{178} \begin{pmatrix} 34 & -6 & 4 \\ 15 & 34 & 7 \\ 7 & 4 & 27 \end{pmatrix}
\begin{pmatrix} 9 \\ 3 \\ 17 \end{pmatrix}
= \begin{pmatrix} 2 \\ 2 \\ 3 \end{pmatrix}.
$$
Again, $P_1^{*} = 2$, $P_2^{*} = 2$ and $P_3^{*} = 3$.
Economics Application 2 (Leontief Input-Output Model)
This model addresses the following planning problem: Assume that n industries produce
n goods (each industry produces only one good) and the output of each industry is
used as an input in the other n − 1 industries. In addition, each good is demanded
for ‘non-input’ consumption. What are the efficient amounts of output each of the n
industries should produce? (‘Efficient’ means that there will be no shortage and no surplus
in producing each good.)
The model is based on an input matrix
$$
A = \begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{pmatrix},
$$
where $a_{ij}$ denotes the amount of good i used to produce one unit of good j.
To simplify the model, let us set the price of each good equal to $1. Then the value of
inputs should not exceed the value of output:
$$
\sum_{i=1}^{n} a_{ij} \le 1, \quad j = 1, \dots, n.
$$
If we denote an additional (non-input) demand for good i by $b_i$, then the optimality
condition reads as follows: the demand for each input should equal the supply, that is,
$$
x_i = \sum_{j=1}^{n} a_{ij} x_j + b_i, \quad i = 1, \dots, n,
$$
or
$$
x = Ax + b, \quad \text{i.e.} \quad (I - A)x = b.
$$
The system $(I - A)x = b$ can be solved using either Cramer's rule or the inverse
matrix.
Example 18 (Numerical Illustration)
Let
$$
A = \begin{pmatrix} 0.2 & 0.2 & 0.1 \\ 0.3 & 0.3 & 0.1 \\ 0.1 & 0.2 & 0.4 \end{pmatrix}, \quad
b = \begin{pmatrix} 8 \\ 4 \\ 5 \end{pmatrix}.
$$
Thus the system $(I - A)x = b$ becomes
$$
\begin{array}{rrrl}
0.8x_1 & -\ 0.2x_2 & -\ 0.1x_3 & = 8 \\
-0.3x_1 & +\ 0.7x_2 & -\ 0.1x_3 & = 4 \\
-0.1x_1 & -\ 0.2x_2 & +\ 0.6x_3 & = 5
\end{array}
$$
Solving it for $x_1, x_2, x_3$, we find the solution $\left(\frac{4210}{269}, \frac{3950}{269}, \frac{4260}{269}\right)$.
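Example 18 is a good place to let a machine do the elimination, since the answer involves the awkward denominator 269. A sketch with exact rationals (the function name `solve` is mine; the decimal coefficients are scaled to tenths so `Fraction` stays exact):

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve Ax = b by Gauss-Jordan elimination with exact rationals."""
    n = len(A)
    # augmented matrix [A | b]
    M = [[F(x) for x in row] + [F(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

# I - A for the input matrix of Example 18, with b = (8, 4, 5)
IA = [[F(8, 10), F(-2, 10), F(-1, 10)],
      [F(-3, 10), F(7, 10), F(-1, 10)],
      [F(-1, 10), F(-2, 10), F(6, 10)]]
x = solve(IA, [8, 4, 5])
assert x == [F(4210, 269), F(3950, 269), F(4260, 269)]
```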
1.3 Quadratic Forms
Generally speaking, a form is a polynomial expression in which each term has a uniform
degree (e.g. $L = ax + by + cz$ is an example of a linear form in three variables x, y, z,
where a, b, c are arbitrary real constants).
Definition 9 A quadratic form Q in n variables $x_1, x_2, \dots, x_n$ is a polynomial expression
in which each component term has degree two (i.e. each term is a product of $x_i$ and $x_j$,
where i, j = 1, 2, . . . , n):
$$
Q = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j,
$$
where the $a_{ij}$ are real numbers. For convenience, we assume that $a_{ij} = a_{ji}$. In matrix notation,
$Q = x'Ax$, where $A = (a_{ij})$ is a symmetric matrix, and
$$
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.
$$
Example 19 A quadratic form in two variables: $Q = a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2$.
Definition 10 A quadratic form Q is said to be
positive definite (PD) if $Q = x'Ax > 0$ for all $x \neq 0$;
negative definite (ND) if $Q = x'Ax < 0$ for all $x \neq 0$;
positive semidefinite (PSD) if $Q = x'Ax \ge 0$ for all x;
negative semidefinite (NSD) if $Q = x'Ax \le 0$ for all x;
otherwise Q is called indefinite (ID).
Example 20 $Q = x^2 + y^2$ is PD, $Q = (x + y)^2$ is PSD, $Q = x^2 - y^2$ is ID.
Leading principal minors $D_k$, k = 1, 2, . . . , n, of a matrix $A = (a_{ij})_{[n\times n]}$ are defined as
$$
D_k = \det \begin{pmatrix}
a_{11} & \dots & a_{1k} \\
\vdots & \ddots & \vdots \\
a_{k1} & \dots & a_{kk}
\end{pmatrix}.
$$
Proposition 4
1. A quadratic form Q is PD ⇔ $D_k > 0$ for all k = 1, 2, . . . , n.
2. A quadratic form Q is ND ⇔ $(-1)^k D_k > 0$ for all k = 1, 2, . . . , n.
⚠ Note that if we replace > by ≥ in the above statement, it does NOT give us the criterion
for the semidefinite case!
Proposition 5 A quadratic form Q is PSD (NSD) ⇔ all the principal minors of A are ≥ 0 (≤ 0).
By definition, the principal minor
$$
A \begin{pmatrix} i_1 & \dots & i_p \\ i_1 & \dots & i_p \end{pmatrix}
= \det \begin{pmatrix}
a_{i_1 i_1} & \dots & a_{i_1 i_p} \\
\vdots & \ddots & \vdots \\
a_{i_p i_1} & \dots & a_{i_p i_p}
\end{pmatrix},
\quad \text{where } 1 \le i_1 < i_2 < \dots < i_p \le n,\ p \le n.
$$
Example 21 Consider the quadratic form $Q(x, y, z) = 3x^2 + 3y^2 + 5z^2 - 2xy$. The
corresponding matrix has the form
$$
A = \begin{pmatrix} 3 & -1 & 0 \\ -1 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}.
$$
The leading principal minors of A are
$$
D_1 = 3 > 0, \quad
D_2 = \det \begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix} = 8 > 0, \quad
D_3 = \det \begin{pmatrix} 3 & -1 & 0 \\ -1 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix} = 40 > 0;
$$
therefore, the quadratic form is positive definite.
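The leading-minors test of Proposition 4 is easy to mechanize once a determinant routine is in hand (a sketch; `det` and `leading_minors` are my own names):

```python
def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** k * M[0][k] *
               det([row[:k] + row[k + 1:] for row in M[1:]])
               for k in range(len(M)))

def leading_minors(A):
    """D_1, ..., D_n: determinants of the upper-left k x k blocks."""
    n = len(A)
    return [det([row[:k] for row in A[:k]]) for k in range(1, n + 1)]

# The matrix of Example 21:
A = [[3, -1, 0], [-1, 3, 0], [0, 0, 5]]
D = leading_minors(A)          # [3, 8, 40]
assert all(d > 0 for d in D)   # all D_k > 0, so the form is positive definite
```

Note that, per the warning above, this test with strict inequalities decides definiteness only; semidefiniteness requires checking all principal minors, not just the leading ones.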
1.4 Eigenvalues and Eigenvectors
Definition 11 Any number λ such that the equation
$$
Ax = \lambda x \qquad (1)
$$
has a nonzero vector-solution x is called an eigenvalue (or a characteristic root) of
equation (1).
Definition 12 Any nonzero vector x satisfying (1) is called an eigenvector (or characteristic vector) of A for the eigenvalue λ.
Recipe 7 – How to calculate eigenvalues:
Ax − λx = 0 ⇒ (A − λI)x = 0. Since x is nonzero, the determinant of (A − λI)
should vanish. Therefore all eigenvalues can be calculated as roots of the equation (which
is often called the characteristic equation or the characteristic polynomial of A)
det(A −λI) = 0.
Example 22 Let us consider the quadratic form from Example 21:
$$
\det(A - \lambda I) = \det \begin{pmatrix} 3-\lambda & -1 & 0 \\ -1 & 3-\lambda & 0 \\ 0 & 0 & 5-\lambda \end{pmatrix}
= (5-\lambda)(\lambda^2 - 6\lambda + 8) = (5-\lambda)(\lambda-2)(\lambda-4) = 0;
$$
therefore the eigenvalues are λ = 2, λ = 4 and λ = 5.
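The roots found in Example 22 can be cross-checked numerically (a sketch assuming NumPy is available; `numpy.linalg.eigvalsh` returns the eigenvalues of a symmetric matrix in ascending order):

```python
import numpy as np

# The symmetric matrix of Examples 21-22
A = np.array([[3.0, -1.0, 0.0],
              [-1.0, 3.0, 0.0],
              [0.0, 0.0, 5.0]])
lam = np.linalg.eigvalsh(A)   # eigenvalues, ascending
assert np.allclose(lam, [2.0, 4.0, 5.0])
```

Since all eigenvalues are positive, Proposition 6 below confirms again that the form is positive definite.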
Proposition 6 (Characteristic Root Test for Sign Definiteness)
A quadratic form Q is
positive definite ⇔ eigenvalues $\lambda_i > 0$ for all i = 1, 2, . . . , n;
negative definite ⇔ $\lambda_i < 0$ for all i = 1, 2, . . . , n;
positive semidefinite ⇔ $\lambda_i \ge 0$ for all i = 1, 2, . . . , n;
negative semidefinite ⇔ $\lambda_i \le 0$ for all i = 1, 2, . . . , n.
A form is indefinite if at least one positive and one negative eigenvalue exist.
Definition 13 Matrix A is diagonalizable ⇔ $P^{-1}AP = D$ for some non-singular matrix P
and diagonal matrix D.
Proposition 7 (The Spectral Theorem for Symmetric Matrices)
If A is a symmetric matrix of order n and $\lambda_1, \dots, \lambda_n$ are its eigenvalues, there exists
an orthogonal matrix U such that
$$
U^{-1}AU = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}.
$$
Usually, U is the normalized matrix formed by the eigenvectors. It has the property
$U'U = I$ (i.e. U is an orthogonal matrix; $U' = U^{-1}$). "Normalized" means that for any
column u of the matrix U, $u'u = 1$.
⚠ It is essential that A be symmetric!
Example 23 Diagonalize the matrix
$$
A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}.
$$
First, we need to find the eigenvalues:
$$
\det \begin{pmatrix} 1-\lambda & 2 \\ 2 & 4-\lambda \end{pmatrix}
= (1-\lambda)(4-\lambda) - 4 = \lambda^2 - 5\lambda = \lambda(\lambda - 5),
$$
i.e. λ = 0 and λ = 5.
For λ = 0 we solve
$$
\begin{pmatrix} 1-0 & 2 \\ 2 & 4-0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\quad \text{or} \quad
\begin{array}{r}
x_1 + 2x_2 = 0, \\
2x_1 + 4x_2 = 0.
\end{array}
$$
The second equation is redundant, and the eigenvector corresponding to λ = 0 is $v_1 = C_1 (2, -1)'$, where $C_1$ is an arbitrary real constant.
For λ = 5 we solve
$$
\begin{pmatrix} 1-5 & 2 \\ 2 & 4-5 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\quad \text{or} \quad
\begin{array}{r}
-4x_1 + 2x_2 = 0, \\
2x_1 - x_2 = 0.
\end{array}
$$
Thus the general expression for the second eigenvector is $v_2 = C_2 (1, 2)'$.
Let us normalize the eigenvectors, i.e. let us pick the constants C such that $v_1'v_1 = 1$ and
$v_2'v_2 = 1$. After normalization we get $v_1 = (2/\sqrt{5}, -1/\sqrt{5})'$, $v_2 = (1/\sqrt{5}, 2/\sqrt{5})'$. Thus
the diagonalization matrix U is
$$
U = \begin{pmatrix} \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \\ -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}.
$$
You can easily check that
$$
U^{-1}AU = \begin{pmatrix} 0 & 0 \\ 0 & 5 \end{pmatrix}.
$$
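The final check of Example 23 can be reproduced numerically (a sketch assuming NumPy is available; since U is orthogonal, $U^{-1} = U'$, so $U'AU$ should come out diagonal):

```python
import numpy as np

s = 5 ** 0.5
# normalized eigenvectors of Example 23 as the columns of U
U = np.array([[2 / s, 1 / s],
              [-1 / s, 2 / s]])
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
D = U.T @ A @ U   # U orthogonal, so U.T plays the role of U^{-1}
assert np.allclose(D, np.diag([0.0, 5.0]))
```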
Some useful results:
• $\det(A) = \lambda_1 \cdot \ldots \cdot \lambda_n$.
• If $\lambda_1, \dots, \lambda_n$ are the eigenvalues of A, then $1/\lambda_1, \dots, 1/\lambda_n$ are the eigenvalues of $A^{-1}$.
• If $\lambda_1, \dots, \lambda_n$ are the eigenvalues of A, then $f(\lambda_1), \dots, f(\lambda_n)$ are the eigenvalues of $f(A)$,
where $f(\cdot)$ is a polynomial.
• The rank of a symmetric matrix is the number of nonzero eigenvalues it contains.
• The rank of any matrix A is equal to the number of nonzero eigenvalues of $A'A$.
• If we define the trace of a square matrix of order n as the sum of the n elements on
its principal diagonal, $\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}$, then $\mathrm{tr}(A) = \lambda_1 + \dots + \lambda_n$.
Properties of the trace:
a) if A and B are of the same order, tr(A + B) = tr(A) + tr(B);
b) if λ is a scalar, tr(λA) = λ tr(A);
c) tr(AB) = tr(BA), whenever AB is square;
d) tr(A′) = tr(A);
e) $\mathrm{tr}(A'A) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2$.
1.5 Appendix: Vector Spaces
1.5.1 Basic Concepts
Definition 14 A (real) vector space is a nonempty set V of objects together with an
additive operation $+ : V \times V \to V$, $+(u, v) = u + v$, and a scalar multiplicative operation
$\cdot : R \times V \to V$, $\cdot(a, u) = au$, which satisfies the following axioms for any $u, v, w \in V$ and
any $a, b \in R$ (R is the set of all real numbers):
A1) (u + v) + w = u + (v + w)
A2) u + v = v + u
A3) 0 + u = u
A4) u + (−u) = 0
S1) a(u + v) = au + av
S2) (a + b)u = au + bu
S3) a(bu) = (ab)u
S4) 1u = u.
Definition 15 The objects of a vector space V are called vectors; the operations + and ·
are called vector addition and scalar multiplication, respectively. The element $0 \in V$ is
the zero vector, and −v is the additive inverse of v.
Example 24 (The n-Dimensional Vector Space $R^n$)
Define $R^n = \{(u_1, u_2, \dots, u_n)' \mid u_i \in R,\ i = 1, \dots, n\}$ (the apostrophe denotes the transpose). Consider $u, v \in R^n$, $u = (u_1, u_2, \dots, u_n)'$, $v = (v_1, v_2, \dots, v_n)'$, and $a \in R$.
Define the additive operation and the scalar multiplication as follows:
$$
u + v = (u_1 + v_1, \dots, u_n + v_n)', \quad au = (au_1, \dots, au_n)'.
$$
It is not difficult to verify that $R^n$ together with these operations is a vector space.
Definition 16 Let V be a vector space. An inner product or scalar product in V is a
function $s : V \times V \to R$, $s(u, v) = u \cdot v$, which satisfies the following properties:
$u \cdot v = v \cdot u$,
$u \cdot (v + w) = u \cdot v + u \cdot w$,
$a(u \cdot v) = (au) \cdot v = u \cdot (av)$,
$u \cdot u \ge 0$, and $u \cdot u = 0$ iff $u = 0$.
Example 25 Let $u, v \in R^n$, $u = (u_1, u_2, \dots, u_n)'$, $v = (v_1, v_2, \dots, v_n)'$. Define
$u \cdot v = u_1 v_1 + \dots + u_n v_n$. Then this rule is an inner product in $R^n$.
Definition 17 Let V be a vector space and · : V × V → R an inner product in V. The
norm or magnitude is a function ‖·‖ : V → R defined as ‖v‖ = √(v · v).
Proposition 8 If V is a vector space, then for any u, v ∈ V and a ∈ R:
i) ‖au‖ = |a| ‖u‖;
ii) (Triangle inequality) ‖u + v‖ ≤ ‖u‖ + ‖v‖;
iii) (Schwarz inequality) |u · v| ≤ ‖u‖ ‖v‖.
Example 26 If u ∈ R^n, u = (u_1, u_2, ..., u_n)′, the norm of u can be introduced as
‖u‖ = √(u · u) = √(u_1² + ... + u_n²).
The triangle inequality and Schwarz's inequality in R^n become:
ii) √((u_1 + v_1)² + ... + (u_n + v_n)²) ≤ √(Σ_{i=1}^n u_i²) + √(Σ_{i=1}^n v_i²)
(Minkowski's inequality for sums);
iii) (Σ_{i=1}^n u_i v_i)² ≤ (Σ_{i=1}^n u_i²)(Σ_{i=1}^n v_i²)
(Cauchy-Schwarz inequality for sums).
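Both inequalities can be illustrated numerically. Below is a minimal pure-Python sketch; the sample vectors and the helper names `dot` and `norm` are our own choices:

```python
# Numerical illustration of the triangle (Minkowski) and Cauchy-Schwarz
# inequalities in R^n for a pair of arbitrary sample vectors.
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

u = [1.0, -2.0, 3.0]
v = [4.0, 0.0, -1.0]

# Triangle inequality: ||u + v|| <= ||u|| + ||v||
w = [ui + vi for ui, vi in zip(u, v)]
assert norm(w) <= norm(u) + norm(v)

# Cauchy-Schwarz: (sum u_i v_i)^2 <= (sum u_i^2)(sum v_i^2)
assert dot(u, v) ** 2 <= dot(u, u) * dot(v, v)
```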
Definition 18
a) The nonzero vectors u and v are parallel if there exists a ∈ R such that u = av.
b) The vectors u and v are orthogonal or perpendicular if their scalar product is zero,
that is, if u · v = 0.
c) The angle between vectors u and v is arccos(u · v / (‖u‖ ‖v‖)).
1.5.2 Vector Subspaces
Definition 19 A nonempty subset S of a vector space V is a subspace of V if for any
u, v ∈ S and a ∈ R
u + v ∈ S and au ∈ S.
Example 27 V is a subspace of itself. {0} is also a subspace of V. These two are called
the trivial subspaces.
Example 28 L = {(x, y)′ | y = mx + n}, where m, n ∈ R and n = 0 (i.e. a straight line
through the origin), is a subspace of R².
Definition 20 Let u_1, u_2, ..., u_k be vectors in a vector space V. The set S of all linear
combinations of these vectors,
S = {a_1 u_1 + a_2 u_2 + ... + a_k u_k | a_i ∈ R, i = 1, ..., k},
is called the subspace generated or spanned by the vectors u_1, u_2, ..., u_k and is denoted
sp(u_1, u_2, ..., u_k).
Proposition 9 S is a subspace of V .
Example 29 Let u_1 = (2, −1, 1)′, u_2 = (3, 4, 0)′. Then the subspace of R³ generated by
u_1 and u_2 is
sp(u_1, u_2) = {a u_1 + b u_2 | a, b ∈ R} = {(2a + 3b, −a + 4b, a)′ | a, b ∈ R}.
1.5.3 Independence and Bases
Definition 21 A set {u_1, u_2, ..., u_k} of vectors in a vector space V is linearly dependent if
there exist real numbers a_1, a_2, ..., a_k, not all zero, such that a_1 u_1 + a_2 u_2 + ... + a_k u_k = 0.
In other words, a set of vectors in a vector space is linearly dependent if and only if
one vector can be written as a linear combination of the others. (!)
Example 30 The vectors u_1 = (2, −1, 1)′, u_2 = (1, 3, 4)′, u_3 = (0, −7, −7)′ are linearly
dependent since u_3 = u_1 − 2u_2.
Definition 22 A set {u_1, u_2, ..., u_k} of vectors in a vector space V is linearly independent
if a_1 u_1 + a_2 u_2 + ... + a_k u_k = 0 implies a_1 = a_2 = ... = a_k = 0 (that is, they are not
linearly dependent).
In other words, the definition says that a set of vectors in a vector space is linearly
independent if and only if none of the vectors can be written as a linear combination of
the others.
Proposition 10 Let {u_1, u_2, ..., u_n} be n vectors in R^n. The following conditions are
equivalent:
i) The vectors are linearly independent.
ii) The matrix having these vectors as columns is nonsingular.
iii) The vectors generate R^n.
Example 31 The vectors u_1 = (1, 2, −2)′, u_2 = (2, 3, 1)′, u_3 = (−2, 0, 1)′ in R³ are
linearly independent since
|  1  2  −2 |
|  2  3   0 |  = −17 ≠ 0.
| −2  1   1 |
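The determinant in Example 31 can be checked by direct cofactor expansion. A minimal pure-Python sketch (the helper `det3` is our own):

```python
# Checking the linear independence of u_1, u_2, u_3 from Example 31 by
# computing the 3x3 determinant via cofactor expansion along the first row.

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# columns are u_1 = (1, 2, -2)', u_2 = (2, 3, 1)', u_3 = (-2, 0, 1)'
M = [[1, 2, -2],
     [2, 3, 0],
     [-2, 1, 1]]

d = det3(M)
assert d == -17   # nonzero, so the vectors are linearly independent
```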
Definition 23 A set {u_1, u_2, ..., u_k} of vectors in V is a basis for V if it, first, generates
V (that is, V = sp(u_1, u_2, ..., u_k)) and, second, is linearly independent.
Any set of n linearly independent vectors in R^n forms a basis for R^n. (!)
Example 32 The vectors from the preceding example, u_1 = (1, 2, −2)′, u_2 = (2, 3, 1)′,
u_3 = (−2, 0, 1)′, form a basis for R³.
Example 33 Consider the following vectors in R^n: e_i = (0, ..., 0, 1, 0, ..., 0)′, where the 1
is in the i-th position, i = 1, ..., n. The set E_n = {e_1, ..., e_n} forms a basis for R^n, which
is called the standard basis.
Definition 24 Let V be a vector space and B = {u_1, u_2, ..., u_k} a basis for V. Since
B generates V, for any u ∈ V there exist real numbers x_1, x_2, ..., x_k such that
u = x_1 u_1 + ... + x_k u_k. The column vector x = (x_1, x_2, ..., x_k)′ is called the vector of
coordinates of u with respect to B.
Example 34 Consider the vector space R^n with the standard basis E_n. For any
u = (u_1, ..., u_n)′ we can represent u as u = u_1 e_1 + ... + u_n e_n; therefore, (u_1, ..., u_n)′
is the vector of coordinates of u with respect to E_n.
Example 35 Consider the vector space R². Let us find the coordinate vector of (−1, 2)′
with respect to the basis B = {(1, 1)′, (2, −3)′} (i.e. find (−1, 2)′_B). We have to solve for a, b
such that (−1, 2)′ = a(1, 1)′ + b(2, −3)′. Solving the system
a + 2b = −1
a − 3b = 2
we find a = 1/5, b = −3/5. Thus, (−1, 2)′_B = (1/5, −3/5)′.
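The same 2×2 coordinate system can be solved mechanically, e.g. by Cramer's rule. A small sketch reproducing Example 35 (the helper `coords_2d` is our own):

```python
# Re-deriving the coordinates of (-1, 2)' in the basis B = {(1,1)', (2,-3)'}
# from Example 35 by solving a*b1 + b*b2 = u with Cramer's rule.

def coords_2d(b1, b2, u):
    # the basis vectors b1, b2 are the columns of the coefficient matrix
    det = b1[0] * b2[1] - b2[0] * b1[1]
    a = (u[0] * b2[1] - b2[0] * u[1]) / det
    b = (b1[0] * u[1] - u[0] * b1[1]) / det
    return a, b

a, b = coords_2d((1, 1), (2, -3), (-1, 2))
assert abs(a - 1/5) < 1e-12 and abs(b + 3/5) < 1e-12
```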
Definition 25 The dimension of a vector space V, dim(V), is the number of elements in
any basis for V.
Example 36 The dimension of the vector space R^n with the standard basis E_n is
dim(R^n) = n.
1.5.4 Linear Transformations and Changes of Bases
Definition 26 Let U, V be two vector spaces. A linear transformation of U into V is a
mapping T : U → V such that for any u, v ∈ U and any a, b ∈ R,
T(au + bv) = aT(u) + bT(v).
Example 37 Let A be an m × n real matrix. The mapping T : R^n → R^m defined by
T(u) = Au is a linear transformation.
Example 38 (Rotation of the plane)
The function T_R : R² → R² that rotates the plane counterclockwise through a positive
angle α is a linear transformation.
To check this, first note that any two-dimensional vector u ∈ R² can be expressed in
polar coordinates as u = (r cos θ, r sin θ)′, where r = ‖u‖ and θ is the angle that u makes
with the x-axis of the system of coordinates.
The mapping T_R is thus defined by
T_R(u) = (r cos(θ + α), r sin(θ + α))′.
Therefore,
T_R(u) = (r(cos θ cos α − sin θ sin α), r(sin θ cos α + cos θ sin α))′,
or, alternatively,
T_R(u) = [ cos α  −sin α ] [ r cos θ ]
         [ sin α   cos α ] [ r sin θ ]  = Au,
where
A = [ cos α  −sin α ]
    [ sin α   cos α ]   (the rotation matrix).
From Example 37 it follows that T_R is a linear transformation.
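The rotation matrix and the linearity property are easy to test numerically. A minimal sketch (the function `rotate` and the sample vectors are our own choices):

```python
# Rotation of the plane from Example 38: rotating (1, 0)' counterclockwise
# by 90 degrees should give (0, 1)', and T_R must be linear.
import math

def rotate(u, alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    # matrix-vector product with the rotation matrix A
    return (c * u[0] - s * u[1], s * u[0] + c * u[1])

x, y = rotate((1.0, 0.0), math.pi / 2)
assert abs(x) < 1e-12 and abs(y - 1.0) < 1e-12

# Linearity check: T(au + bv) = aT(u) + bT(v) for sample vectors
a, b = 2.0, -3.0
u, v = (1.0, 2.0), (-1.0, 0.5)
lhs = rotate((a * u[0] + b * v[0], a * u[1] + b * v[1]), 0.7)
Tu, Tv = rotate(u, 0.7), rotate(v, 0.7)
rhs = (a * Tu[0] + b * Tv[0], a * Tu[1] + b * Tv[1])
assert abs(lhs[0] - rhs[0]) < 1e-12 and abs(lhs[1] - rhs[1]) < 1e-12
```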
Proposition 11 Let U and V be two vector spaces, B = (b_1, ..., b_n) a basis for U and
C = (c_1, ..., c_m) a basis for V.
• Any linear transformation T can be represented by an m × n matrix A_T whose i-th
column is the coordinate vector of T(b_i) relative to C.
• If x = (x_1, ..., x_n)′ is the coordinate vector of u ∈ U relative to B and y = (y_1, ..., y_m)′
is the coordinate vector of T(u) relative to C, then T defines the following transformation
of coordinates:
y = A_T x for any u ∈ U.
Definition 27 The matrix A_T is called the matrix representation of T relative to the bases
B, C.
Any linear transformation is uniquely determined by its transformation of coordinates. (!)
Example 39 Consider the linear transformation T : R³ → R², T((x, y, z)′) = (x − 2y, x + z)′,
and the bases B = {(1, 1, 1)′, (1, 1, 0)′, (1, 0, 0)′} for R³ and C = {(1, 1)′, (1, 0)′} for R².
How can we find the matrix representation of T relative to the bases B, C?
We have:
T((1, 1, 1)′) = (−1, 2)′,  T((1, 1, 0)′) = (−1, 1)′,  T((1, 0, 0)′) = (1, 1)′.
The columns of A_T are formed by the coordinate vectors of T((1, 1, 1)′), T((1, 1, 0)′),
T((1, 0, 0)′) relative to C. Applying the procedure developed in Example 35 we find
A_T = [  2   1  1 ]
      [ −3  −2  0 ].
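The matrix A_T of Example 39 can be verified directly: for any u ∈ R³, the C-coordinates of T(u) must equal A_T times the B-coordinates of u. A minimal sketch (all helper code is our own illustration, not from the text):

```python
# Verifying the matrix representation A_T from Example 39.

def T(x, y, z):                      # the transformation of Example 39
    return (x - 2 * y, x + z)

def coords_C(w):                     # coordinates of w = a(1,1)' + b(1,0)'
    a = w[1]                         # second component pins down a
    b = w[0] - a
    return (a, b)

A_T = [[2, 1, 1], [-3, -2, 0]]

# B-coordinates (1, 1, 1) correspond to u = (1,1,1)' + (1,1,0)' + (1,0,0)'
xB = (1, 1, 1)
u = (3, 2, 1)
y = coords_C(T(*u))
y_from_matrix = (sum(A_T[0][i] * xB[i] for i in range(3)),
                 sum(A_T[1][i] * xB[i] for i in range(3)))
assert y == y_from_matrix            # both give (4, -5)
```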
Definition 28 (Changes of Bases)
Let V be a vector space of dimension n, B and C two bases for V, and I : V → V
the identity transformation (I(v) = v for all v ∈ V). The change-of-basis matrix D
relative to B, C is the matrix representation of I relative to B, C.
Example 40 For u ∈ V, let x = (x_1, ..., x_n)′ be the coordinate vector of u relative to B
and y = (y_1, ..., y_n)′ the coordinate vector of u relative to C. If D is the change-of-basis
matrix relative to B, C, then y = Dx. The change-of-basis matrix relative to C, B is D⁻¹.
Example 41 Given the following bases for R²: B = {(1, 1)′, (1, 0)′} and C = {(0, 1)′, (1, 1)′},
find the change-of-basis matrix D relative to B, C.
The columns of D are the coordinate vectors of (1, 1)′, (1, 0)′ relative to C. Following
Example 35 we find
D = [ 0  −1 ]
    [ 1   1 ].
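The matrix D of Example 41 can be checked numerically: if x holds the B-coordinates of a vector u, then Dx must hold its C-coordinates. A minimal sketch (helper names are our own):

```python
# Checking the change-of-basis matrix D of Example 41.

B = [(1, 1), (1, 0)]                 # basis B
D = [[0, -1], [1, 1]]                # change-of-basis matrix found above

def coords_C(w):                     # coordinates of w = a(0,1)' + b(1,1)'
    b = w[0]                         # first component pins down b
    a = w[1] - b
    return (a, b)

x = (2, -1)                          # arbitrary B-coordinates
u = (2 * B[0][0] - 1 * B[1][0],      # u = 2 b_1 - b_2 = (1, 2)
     2 * B[0][1] - 1 * B[1][1])
y = (D[0][0] * x[0] + D[0][1] * x[1],
     D[1][0] * x[0] + D[1][1] * x[1])
assert coords_C(u) == y              # both give (1, 1)
```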
Proposition 12 Let T : V → V be a linear transformation, and let B, C be two bases for V.
If A_1 is the matrix representation of T in the basis B, A_2 is the matrix representation of
T in the basis C, and D is the change-of-basis matrix relative to C, B, then A_2 = D⁻¹ A_1 D.
Further Reading:
• Bellman, R. Introduction to Matrix Analysis.
• Fraleigh, J.B. and R.A. Beauregard. Linear Algebra.
• Gantmacher, F.R. The Theory of Matrices.
• Lang, S. Linear Algebra.
2 Calculus
2.1 The Concept of Limit
Definition 29 The function f(x) has a limit A (or tends to A as a limit) as x approaches
a if for each given number ε > 0, no matter how small, there exists a positive number δ
(that depends on ε) such that |f(x) − A| < ε whenever 0 < |x − a| < δ.
The standard notation is lim_{x→a} f(x) = A or f(x) → A as x → a.
Definition 30 The function f(x) has a left-side (or right-side) limit A as x approaches
a from the left (or right),
lim_{x→a−} f(x) = A   (lim_{x→a+} f(x) = A),
if for each given number ε > 0 there exists a positive number δ such that |f(x) − A| < ε
whenever a − δ < x < a (a < x < a + δ).
Recipe 8 – How to Calculate Limits:
We can apply the following basic rules for limits:
if lim_{x→x_0} f(x) = A and lim_{x→x_0} g(x) = B, then
1. if C is constant, then lim_{x→x_0} C = C.
2. lim_{x→x_0} (f(x) + g(x)) = lim_{x→x_0} f(x) + lim_{x→x_0} g(x) = A + B.
3. lim_{x→x_0} f(x)g(x) = lim_{x→x_0} f(x) · lim_{x→x_0} g(x) = A · B.
4. lim_{x→x_0} f(x)/g(x) = (lim_{x→x_0} f(x)) / (lim_{x→x_0} g(x)) = A/B, provided B ≠ 0.
5. lim_{x→x_0} (f(x))^n = (lim_{x→x_0} f(x))^n = A^n.
6. lim_{x→x_0} f(x)^{g(x)} = e^{lim_{x→x_0} g(x) ln f(x)} = e^{B ln A}, provided A > 0.
Some important results:
lim_{x→0} (sin x)/x = 1,   lim_{x→∞} (1 + 1/x)^x = e = 2.718281828459...,   lim_{x→0} ln(1 + x)/x = 1,
lim_{x→0} ((1 + x)^p − 1)/x = p,   lim_{x→0} (e^x − 1)/x = 1,   lim_{x→0} (a^x − 1)/x = ln a, a > 0.
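These limits can be illustrated (not proved) by evaluating the expressions at a small x. A quick numerical sanity check, with tolerances chosen by us:

```python
# Evaluating the standard limits above near x = 0 (and large n).
import math

x = 1e-6
assert abs(math.sin(x) / x - 1) < 1e-9
assert abs(math.log(1 + x) / x - 1) < 1e-5
assert abs((math.exp(x) - 1) / x - 1) < 1e-5
assert abs((2 ** x - 1) / x - math.log(2)) < 1e-5   # a = 2 case of (a^x - 1)/x -> ln a

n = 1e8
assert abs((1 + 1 / n) ** n - math.e) < 1e-6        # (1 + 1/x)^x -> e
```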
Example 42
a) lim_{x→1} (x³ − 1)/(x − 1) = lim_{x→1} (x − 1)(x² + x + 1)/(x − 1) = lim_{x→1} (x² + x + 1) = 1 + 1 + 1 = 3.
b) lim_{x→0} (sin² x)/x = lim_{x→0} ((sin x)/x · sin x) = lim_{x→0} (sin x)/x · lim_{x→0} sin x = 1 · 0 = 0.
c) lim_{x→∞} x² ln(√(x² + 1)/x) = lim_{x→∞} (x²/2) ln((x² + 1)/x²) = lim_{x→∞} (x²/2) ln(1 + 1/x²) =
= (1/2) lim_{x→∞} ln(1 + 1/x²)/(1/x²) = 1/2.
d) lim_{x→0+} x^{1/(1+ln x)} = e^{lim_{x→0+} (1/(1+ln x)) ln x} = e^{lim_{x→0+} 1/(1/ln x + 1)} = e¹ = e.
Another powerful tool for evaluating limits is L'Hôpital's rule.
Proposition 13 (L'Hôpital's Rule)
Suppose f(x) and g(x) are differentiable[1] in an interval (a, b) around x_0, except possibly
at x_0, and suppose that f(x) and g(x) both approach 0 as x approaches x_0. If g′(x) ≠ 0
for all x ≠ x_0 in (a, b) and lim_{x→x_0} f′(x)/g′(x) = A, then
lim_{x→x_0} f(x)/g(x) = lim_{x→x_0} f′(x)/g′(x) = A.
The same rule applies if f(x) → ±∞, g(x) → ±∞; x_0 can be either finite or infinite.
Note that L'Hôpital's rule can be applied only to expressions of the form 0/0 or ±∞/±∞. (!)
Example 43
a) lim_{x→0} (x − sin x)/x³ = lim_{x→0} (1 − cos x)/(3x²) = lim_{x→0} (sin x)/(6x) = (1/6) lim_{x→0} (sin x)/x = 1/6.
b) lim_{x→0} (x − sin 2x)/x³ = lim_{x→0} (1 − 2 cos 2x)/(3x²) = (lim_{x→0} (1 − 2 cos 2x)) / (lim_{x→0} 3x²) =
= −1 / (lim_{x→0} 3x²) = −∞.
2.2 Differentiation – the Case of One Variable
Definition 31 A function f(x) is continuous at x = a if lim_{x→a} f(x) = f(a).
If f(x) and g(x) are continuous at a, then:
• f(x) ± g(x) and f(x)g(x) are continuous at a;
• if g(a) ≠ 0, then f(x)/g(x) is continuous at a;
• if g(x) is continuous at a and f(x) is continuous at g(a), then f(g(x)) is continuous
at a.
[1] For the definition of differentiability, see the next section.
In general, any function built from continuous functions by additions, subtractions, mul-
tiplications, divisions and compositions is continuous where defined.
If lim_{x→a+} f(x) = c_1 ≠ c_2 = lim_{x→a−} f(x), with |c_1|, |c_2| < ∞, the function f(x) is said to
have a jump discontinuity at a. If lim_{x→a} f(x) = ±∞, we call this type of discontinuity an
infinite discontinuity.
Suppose there is a functional relationship between x and y, y = f(x). One of the
natural questions one may ask is: How does y change if x changes? We can answer
this question using the notion of the difference quotient. Denoting the change in x as
Δx = x − x_0, the difference quotient is defined as
Δy/Δx = (f(x_0 + Δx) − f(x_0))/Δx.
Taking the limit of the above expression, we arrive at the following definition.
Definition 32 The derivative of f(x) at x_0 is
f′(x_0) = lim_{Δx→0} (f(x_0 + Δx) − f(x_0))/Δx.
If the limit exists, f is called differentiable at x_0.
An alternative notation for the derivative often found in textbooks is
f′(x) = y′ = df/dx = df(x)/dx.
The first and second derivatives with respect to time are usually denoted by dots (˙ and
¨, respectively), i.e. if z = z(t) then ż = dz/dt, z̈ = d²z/dt².
The set of all continuously differentiable functions in the domain D (i.e. the argument
of a function may take any value from D) is denoted by C^(1)(D).
Geometrically speaking, the derivative represents the slope of the tangent line to f at
x. Using the derivative, the equation of the tangent line to f(x) at x_0 can be written as
y = f(x_0) + f′(x_0)(x − x_0).
Note that the continuity of f(x) is a necessary but NOT sufficient condition for its
differentiability! (!)
Example 44 For instance, f(x) = |x − 2| is continuous at x = 2 but not differentiable
at x = 2.
The geometric interpretation of the derivative gives us the following formula (usually called
Newton's approximation method), which allows us to find an approximate root of f(x) = 0.
Let us define a sequence
x_{n+1} = x_n − f(x_n)/f′(x_n).
If the initial value x_0 is chosen such that x_0 is reasonably close to an actual root, this sequence
will converge to that root.
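Newton's method is easy to sketch in code. Below is a minimal implementation applied to f(x) = x² − 2, whose positive root is √2; the test function, starting point and step count are our own choices:

```python
# Newton's approximation method: x_{n+1} = x_n - f(x_n)/f'(x_n).
import math

def newton(f, fprime, x0, steps=20):
    x = x0
    for _ in range(steps):
        x = x - f(x) / fprime(x)     # one Newton step
    return x

root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
assert abs(root - math.sqrt(2)) < 1e-12
```

Starting reasonably close to the root (here x_0 = 1), the iteration converges quadratically, so far fewer than 20 steps are actually needed.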
The derivatives of higher order (2, 3, ..., n) can be defined in the same manner:
f″(x_0) = (d/dx) f′(x_0) = lim_{Δx→0} (f′(x_0 + Δx) − f′(x_0))/Δx,  ...,  f^{(n)}(x_0) = (d/dx) f^{(n−1)}(x_0),
provided that these limits exist. If so, f(x) is called n times continuously differentiable,
f ∈ C^{(n)}. The symbol d^n/dx^n denotes the operator of taking the n-th derivative of a
function with respect to x.
2.3 Rules of Differentiation
Suppose we have two differentiable functions of the same variable x, say f(x) and g(x).
Then
• (f(x) ± g(x))′ = f′(x) ± g′(x);
• (f(x)g(x))′ = f′(x)g(x) + f(x)g′(x) (product rule);
• (f(x)/g(x))′ = (f′(x)g(x) − f(x)g′(x))/g²(x) (quotient rule);
• if f = f(y) and y = g(x), then df/dx = (df/dy)(dy/dx) = f′(y)g′(x) (chain rule or
composite function rule).
This rule can be easily extended to the case of more than two functions involved.
Example 45 (Application of the Chain Rule)
Let f(x) = √(x² + 1). We can decompose f as f = √y, where y = x² + 1. Therefore,
df/dx = (df/dy)(dy/dx) = (1/(2√y)) · 2x = x/√y = x/√(x² + 1).
• if x = u(t) and y = v(t) (i.e. x and y are parametrically defined), then
dy/dx = (dy/dt)/(dx/dt) = v̇(t)/u̇(t)   (recall that ˙ means d/dt),
d²y/dx² = (d/dx)(v̇(t)/u̇(t)) = ((d/dt)(v̇(t)/u̇(t)))/(dx/dt) = (v̈(t)u̇(t) − v̇(t)ü(t))/(u̇(t))³.
Example 46 If x = a(t − sin t), y = a(1 − cos t), where a is a parameter, then
dy/dx = sin t/(1 − cos t),
d²y/dx² = (cos t(1 − cos t) − sin² t)/(a(1 − cos t)³) = (cos t − 1)/(a(1 − cos t)³) = −1/(a(1 − cos t)²).
Some special rules:
f(x) = constant ⇒ f′(x) = 0
f(x) = x^a (a is constant) ⇒ f′(x) = ax^{a−1}
f(x) = e^x ⇒ f′(x) = e^x
f(x) = a^x (a > 0) ⇒ f′(x) = a^x ln a
f(x) = ln x ⇒ f′(x) = 1/x
f(x) = log_a x (a > 0, a ≠ 1) ⇒ f′(x) = (1/x) log_a e = 1/(x ln a)
f(x) = sin x ⇒ f′(x) = cos x
f(x) = cos x ⇒ f′(x) = −sin x
f(x) = tg x ⇒ f′(x) = 1/cos² x
f(x) = ctg x ⇒ f′(x) = −1/sin² x
f(x) = arcsin x ⇒ f′(x) = 1/√(1 − x²)
f(x) = arccos x ⇒ f′(x) = −1/√(1 − x²)
f(x) = arctg x ⇒ f′(x) = 1/(1 + x²)
f(x) = arcctg x ⇒ f′(x) = −1/(1 + x²)
More hints:
• if y = ln f(x), then y′ = f′(x)/f(x).
• if y = e^{f(x)}, then y′ = f′(x)e^{f(x)}.
• if y = f(x)^{g(x)}, then
y′ = (f(x)^{g(x)})′ = (e^{g(x) ln f(x)})′   (by the chain rule)
   = e^{g(x) ln f(x)} (g(x) ln f(x))′   (by the product rule)
   = e^{g(x) ln f(x)} (g′(x) ln f(x) + g(x)f′(x)/f(x)) = f(x)^{g(x)} (g′(x) ln f(x) + g(x)f′(x)/f(x)).
• if a function f is a one-to-one mapping,[2] it has an inverse f⁻¹. Thus, if y = f(x) and
x = f⁻¹(y), then dx/dy = 1/f′(x) = 1/f′(f⁻¹(y)) or, in other words, dx/dy = 1/(dy/dx).
Example 47 Given y = ln x, its inverse is x = e^y. Therefore dx/dy = 1/(1/x) = x = e^y.
• if a function f is a product (or quotient) of a number of other functions, logarithmic
differentiation might be helpful when taking the derivative of f. If
y = f(x) = g_1(x)g_2(x)···g_k(x) / (h_1(x)h_2(x)···h_l(x)),
then after taking the (natural) logarithm of both sides and rearranging the terms we get
dy/dx = y(x) (g′_1(x)/g_1(x) + ... + g′_k(x)/g_k(x) − h′_1(x)/h_1(x) − ... − h′_l(x)/h_l(x)).
Definition 33 If y = f(x) and dx is any number, then the differential of y is defined as
dy = f′(x)dx.
The rules of differentials are similar to those of derivatives:
• If k is constant, then dk = 0;
• d(u ± v) = du ± dv;
• d(uv) = v du + u dv;
• d(u/v) = (v du − u dv)/v².
Differentials of higher order can be found in the same way as derivatives.
[2] In the one-dimensional case the set of all one-to-one mappings coincides with the set of strictly
monotonic functions.
2.4 Maxima and Minima of a Function of One Variable
Definition 34 If f(x) is continuous in a neighborhood U of a point x_0, it is said to
have a local or relative maximum (minimum) at x_0 if for all x ∈ U, x ≠ x_0,
f(x) < f(x_0) (f(x) > f(x_0)).
Proposition 14 (Weierstrass's Theorem)
If f is continuous on a closed and bounded subset S of R¹ (or, in the general case, of
R^n), then f reaches its maximum and minimum in S, i.e. there exist points m, M ∈ S
such that f(m) ≤ f(x) ≤ f(M) for all x ∈ S.
Proposition 15 (Fermat's Theorem, or the Necessary Condition for an Extremum)
If f(x) is differentiable at x_0 and has a local extremum (minimum or maximum) at x_0,
then f′(x_0) = 0.
Note that if the first derivative vanishes at some point, this does not imply that at this
point f possesses an extremum. We can only state that f has a stationary point. (!)
Some useful results:
Proposition 16 (Rolle's Theorem)
If f is continuous in [a, b], differentiable in (a, b) and f(a) = f(b), then there exists at
least one point c ∈ (a, b) such that f′(c) = 0.
Proposition 17 (Lagrange's Theorem, or the Mean Value Theorem)
If f is continuous in [a, b] and differentiable in (a, b), then there exists at least one
point c ∈ (a, b) such that f′(c) = (f(b) − f(a))/(b − a).
The geometric meaning of Lagrange's theorem: the slope of the tangent line to f(x) at
the point c coincides with the slope of the straight line connecting the two points (a, f(a))
and (b, f(b)).
If we denote a = x_0, b = x_0 + Δx, then we may rewrite Lagrange's theorem as
f(x_0 + Δx) − f(x_0) = Δx f′(x_0 + θΔx), where θ ∈ (0, 1).
This statement can be generalized in the following way:
Proposition 18 (Cauchy's Theorem, or the Generalized Mean Value Theorem)
If f and g are continuous in [a, b] and differentiable in (a, b), then there exists at least
one point c ∈ (a, b) such that (f(b) − f(a))g′(c) = (g(b) − g(a))f′(c).
Recall the meaning of the first and the second derivatives of a function f. The sign
of the first derivative tells us whether the value of the function increases (f′ > 0) or
decreases (f′ < 0), whereas the sign of the second derivative tells us whether the slope of
the function increases (f″ > 0) or decreases (f″ < 0). This gives us an insight into how
to verify whether at a stationary point we have a maximum or a minimum.
Assume that f(x) is differentiable in a neighborhood of x_0 and has a stationary
point at x_0, i.e. f′(x_0) = 0.
Proposition 19 (The First-Derivative Test for a Local Extremum)
If at a stationary point x_0 the first derivative of a function f
a) changes its sign from positive to negative, then f(x_0) is a local maximum;
b) changes its sign from negative to positive, then f(x_0) is a local minimum;
c) does not change its sign, then f(x_0) is neither a local maximum nor a local minimum.
Proposition 20 (The Second-Derivative Test for a Local Extremum)
A stationary point x_0 of f(x) will be a local maximum if f″(x_0) < 0 and a local
minimum if f″(x_0) > 0.
However, it may happen that f″(x_0) = 0, in which case the second-derivative test is not
applicable. To compensate for this, we can extend the latter result and apply the
following general test:
Proposition 21 (The nth-Derivative Test)
If f′(x_0) = f″(x_0) = ... = f^{(n−1)}(x_0) = 0, f^{(n)}(x_0) ≠ 0 and f^{(n)} is continuous at x_0,
then at the point x_0 f(x) has
a) an inflection point if n is odd;
b) a local maximum if n is even and f^{(n)}(x_0) < 0;
c) a local minimum if n is even and f^{(n)}(x_0) > 0.
To prove this statement we can use the Taylor series: if f is sufficiently many times
continuously differentiable at x_0, it can be expanded around the point x_0, i.e. the function
can be transformed into a polynomial form in which the coefficients of the various terms
are expressed in terms of the values of the derivatives of f, all evaluated at the point of
expansion x_0. More precisely,
f(x) = Σ_{i=0}^∞ (f^{(i)}(x_0)/i!)(x − x_0)^i,   where k! = 1 · 2 · ... · k, 0! = 1, f^{(0)}(x_0) = f(x_0),
or
f(x) = Σ_{i=0}^n (f^{(i)}(x_0)/i!)(x − x_0)^i + R_n,
where
R_n = f^{(n+1)}(x_0 + θ(x − x_0)) (x − x_0)^{n+1}/(n + 1)!,   0 < θ < 1,
is the remainder in the Lagrange form.
The Maclaurin series is the special case of the Taylor series when we set x_0 = 0.
Example 48 Expand e^x around x = 0.
Since (d^n/dx^n) e^x = e^x for all n and e⁰ = 1,
e^x = 1 + x/1! + x²/2! + x³/3! + ... = Σ_{i=0}^∞ x^i/i!.
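The partial sums of this Maclaurin series illustrate how the polynomial approximation improves as more terms are added. A minimal sketch (function name and term counts are our own choices):

```python
# Partial sums of the Maclaurin series of e^x from Example 48.
import math

def exp_series(x, n_terms):
    term, total = 1.0, 1.0           # the i = 0 term is x^0/0! = 1
    for i in range(1, n_terms):
        term *= x / i                # builds x^i/i! incrementally
        total += term
    return total

x = 1.0
assert abs(exp_series(x, 5) - math.exp(x)) < 1e-2     # few terms: rough
assert abs(exp_series(x, 15) - math.exp(x)) < 1e-11   # more terms: accurate
```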
The expansion of a function into a Taylor series is useful as an approximation device. For
instance, we can derive the following approximation:
f(x + dx) ≈ f(x) + f′(x)dx, when dx is small.
Note, however, that in general polynomial approximation is not the most efficient one. (!)
Recipe 9 – How to Find (Global) Maximum or Minimum Points:
If f(x) has a maximum or minimum in a subset S (of R¹ or, in general, R^n), then
we need to check the following points for maximum/minimum values of f:
– interior points of S that are stationary;
– points at which f is not differentiable;
– extrema of f at the boundary of S.
Note that the first-derivative test gives us only relative (or local) extrema. It may
happen (due to Weierstrass's theorem) that f reaches its global maximum at a border
point. (!)
Example 49 Find the maximum of f(x) = x³ − 3x in [−2, 3].
The first-derivative test gives two stationary points: f′(x) = 3x² − 3 = 0 at x = 1 and
x = −1. The second-derivative test guarantees that x = −1 is a local maximum. However,
f(−1) = 2 < f(3) = 18. Therefore the global maximum of f in [−2, 3] is reached at the
border point x = 3.
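Recipe 9 applied to Example 49 amounts to comparing f at the stationary points and at the boundary of the interval. A minimal sketch:

```python
# Example 49 via Recipe 9: compare f at the stationary points and the
# boundary of [-2, 3] to locate the global maximum.

def f(x):
    return x ** 3 - 3 * x

candidates = [-1.0, 1.0]             # stationary points: 3x^2 - 3 = 0
candidates += [-2.0, 3.0]            # boundary points of the interval

best = max(candidates, key=f)
assert best == 3.0 and f(best) == 18.0   # global maximum at the border
```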
Recipe 10 – How to Sketch the Graph of a Function:
Given f, you should perform the following steps:
1. Check at which points the function is continuous, differentiable, etc.
2. Find its asymptotes, i.e. vertical asymptotes (finite x's at which f(x) = ±∞) and
non-vertical asymptotes. By definition, y = ax + b is a non-vertical asymptote for
the curve y = f(x) if lim_{x→∞} (f(x) − (ax + b)) = 0.
How to find an asymptote for the curve y = f(x) as x → ∞:
– Examine lim_{x→∞} f(x)/x; if the limit does not exist, there is no asymptote as
x → ∞.
– If lim_{x→∞} f(x)/x = a, examine lim_{x→∞} (f(x) − ax); if the limit does not exist,
there is no asymptote as x → ∞.
– If lim_{x→∞} (f(x) − ax) = b, the straight line y = ax + b is an asymptote for the
curve y = f(x) as x → ∞.
3. Find all stationary points and check whether they are local minima, local maxima
or neither.
4. Find all points of inflection, i.e. points given by f″(x) = 0 at which the second
derivative changes its sign.
Note that a zero first-derivative value is not required for an inflection point. (!)
5. Find the intervals in which f is increasing (decreasing).
6. Find the intervals in which f is convex (concave).
Definition 35 f is called convex (concave) at x_0 if f″(x_0) ≥ 0 (f″(x_0) ≤ 0), and
strictly convex (concave) if the inequalities are strict.
Example 50 Sketch the graph of the function y = (2x − 1)/(x − 1)².
a) y(x) is defined for x ∈ (−∞, 1) ∪ (1, +∞).
b) y(x) is continuous in (−∞, 1) ∪ (1, +∞).
c) y(x) → 0 as x → ±∞; therefore y = 0 is the horizontal asymptote.
d) y(x) → ∞ as x → 1; therefore x = 1 is the vertical asymptote.
e) y′(x) = −2x/(x − 1)³. y′(x) is not defined at x = 1; it is positive if x ∈ (0, 1) (the
function is increasing) and negative if x ∈ (−∞, 0) ∪ (1, +∞) (the function is decreasing).
x = 0 is the unique extremum (a minimum), y(0) = −1.
f) y″(x) = 2(2x + 1)/(x − 1)⁴. It is not defined at x = 1, positive for x ∈ (−1/2, 1) ∪ (1, +∞)
(the function is convex) and negative for x ∈ (−∞, −1/2) (the function is concave).
x = −1/2 is the unique point of inflection, y(−1/2) = −8/9.
g) Intersections with the axes are given by x = 0 ⇒ y = −1, and y = 0 ⇒ x = 1/2.
Finally, the sketch of the graph shows a curve with vertical asymptote x = 1, horizontal
asymptote y = 0, a minimum at (0, −1) and an inflection point at (−1/2, −8/9).
2.5 Integration (The Case of One Variable)
Definition 36 Let f(x) be a continuous function. The indefinite integral of f (denoted
by ∫f(x)dx) is defined as
∫f(x)dx = F(x) + C,
where F(x) is such that F′(x) = f(x), and C is an arbitrary constant.
Rules of integration:
• ∫(af(x) + bg(x))dx = a∫f(x)dx + b∫g(x)dx, where a, b are constants (linearity of the
integral);
• ∫f′(x)g(x)dx = f(x)g(x) − ∫f(x)g′(x)dx (integration by parts);
• ∫f(u(t))(du/dt)dt = ∫f(u)du (integration by substitution).
Some special rules of integration:
∫(f′(x)/f(x))dx = ln|f(x)| + C,   ∫f′(x)e^{f(x)}dx = e^{f(x)} + C,
∫(1/x)dx = ln|x| + C,   ∫x^a dx = x^{a+1}/(a + 1) + C (a ≠ −1),
∫e^x dx = e^x + C,   ∫a^x dx = a^x/ln a + C (a > 0).
See any calculus reference book for tables of integrals of basic functions.
Example 51
a) ∫((x² + 2x + 1)/x)dx = ∫x dx + ∫2 dx + ∫(1/x)dx = x²/2 + 2x + ln|x| + C.
b) ∫xe^{−x²}dx  (substitution z = x², dz = 2x dx)  = (1/2)∫e^{−z}dz = −e^{−x²}/2 + C.
c) ∫xe^x dx  (by parts, with f(x) = e^x, g(x) = x in the formula above)  = xe^x − ∫e^x dx = xe^x − e^x + C.
Definition 37 (The Newton-Leibniz Formula)
The definite integral of a continuous function f is
∫_a^b f(x)dx = F(x)|_a^b = F(b) − F(a),   (2)
where F(x) is such that F′(x) = f(x) for all x ∈ [a, b].
The indefinite integral is a function. The definite integral is a number! (!)
We understand the definite integral in the Riemann sense:
Given a partition a = x_0 < x_1 < ... < x_n = b and numbers ζ_i ∈ [x_i, x_{i+1}], i = 0, 1, ..., n − 1,
∫_a^b f(x)dx = lim_{max_i (x_{i+1} − x_i) → 0} Σ_{i=0}^{n−1} f(ζ_i)(x_{i+1} − x_i).
Since every definite integral has a definite value, this value may be interpreted geo-
metrically as a particular area under a given curve defined by y = f(x).
Note that if a curve lies below the x-axis, this area will be negative. (!)
Properties of definite integrals: For arbitrary real numbers a, b, c, α, β:
• ∫_a^b (αf(x) + βg(x))dx = α∫_a^b f(x)dx + β∫_a^b g(x)dx
• ∫_a^b f(x)dx = −∫_b^a f(x)dx
• ∫_a^a f(x)dx = 0
• ∫_a^b f(x)dx = ∫_a^c f(x)dx + ∫_c^b f(x)dx
• |∫_a^b f(x)dx| ≤ ∫_a^b |f(x)|dx
• ∫_a^b f(g(x))g′(x)dx = ∫_{g(a)}^{g(b)} f(u)du, u = g(x) (change of variable)
• ∫_a^b f(x)g′(x)dx = f(x)g(x)|_a^b − ∫_a^b f′(x)g(x)dx (integration by parts)
Some more useful results: If λ is a real parameter,
• (d/dλ)∫_{a(λ)}^{b(λ)} f(x)dx = f(b(λ))b′(λ) − f(a(λ))a′(λ).
In particular,
(d/dx)∫_a^x f(t)dt = f(x).
If ∫_a^b (d/dλ)f(x, λ)dx exists, then
• (d/dλ)∫_a^b f(x, λ)dx = ∫_a^b f_λ(x, λ)dx.
Example 52 We may apply this formula when we need to evaluate a definite integral
which cannot be integrated in elementary functions. For instance, let us find
I(λ) = ∫_0^1 xe^{−λx}dx.
If we introduce another integral
J(λ) = ∫_0^1 e^{−λx}dx = −(e^{−λx}/λ)|_0^1 = −(e^{−λ} − 1)/λ,
then
I(λ) = −∫_0^1 (d/dλ)e^{−λx}dx = −dJ(λ)/dλ = (1 − e^{−λ}(1 + λ))/λ².
We can combine the last two formulas to get:
• Leibniz's formula:
(d/dλ)∫_{a(λ)}^{b(λ)} f(x, λ)dx = f(b(λ), λ)b′(λ) − f(a(λ), λ)a′(λ) + ∫_{a(λ)}^{b(λ)} f_λ(x, λ)dx.
Proposition 22 (Mean Value Theorem)
If f(x) is continuous in [a, b], then there exists at least one point c ∈ (a, b) such that
∫_a^b f(x)dx = f(c)(b − a).
Numerical methods for approximating definite integrals:
a) The trapezoid formula: Denoting y_i = f(a + i(b − a)/n),
∫_a^b f(x)dx ≈ ((b − a)/(2n)) (y_0 + 2y_1 + 2y_2 + ... + 2y_{n−1} + y_n),
where n is an arbitrary positive integer.
b) Simpson's formula: Denoting y_i = f(a + i(b − a)/(2n)),
∫_a^b f(x)dx ≈ ((b − a)/(6n)) (y_0 + 4Σ_{i=1}^n y_{2i−1} + 2Σ_{i=1}^{n−1} y_{2i} + y_{2n}).
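Both formulas translate directly into code. A minimal sketch, tested on ∫_0^1 x² dx = 1/3 (the function names and the choice of n are ours):

```python
# The trapezoid and Simpson formulas above, implemented verbatim.

def trapezoid(f, a, b, n):
    h = (b - a) / n
    y = [f(a + i * h) for i in range(n + 1)]
    return (b - a) / (2 * n) * (y[0] + 2 * sum(y[1:n]) + y[n])

def simpson(f, a, b, n):
    h = (b - a) / (2 * n)            # 2n subintervals
    y = [f(a + i * h) for i in range(2 * n + 1)]
    odd = sum(y[2 * i - 1] for i in range(1, n + 1))
    even = sum(y[2 * i] for i in range(1, n))
    return (b - a) / (6 * n) * (y[0] + 4 * odd + 2 * even + y[2 * n])

f = lambda x: x ** 2
assert abs(trapezoid(f, 0.0, 1.0, 100) - 1 / 3) < 1e-4
assert abs(simpson(f, 0.0, 1.0, 10) - 1 / 3) < 1e-12
```

Simpson's rule is exact (up to rounding) for polynomials up to degree three, which is why it beats the trapezoid rule here with far fewer points.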
Improper integrals are integrals of one of the following forms:
• ∫_a^∞ f(x)dx = lim_{b→∞} ∫_a^b f(x)dx   or   ∫_{−∞}^b f(x)dx = lim_{a→−∞} ∫_a^b f(x)dx.
• ∫_a^b f(x)dx = lim_{ε→0} ∫_{a+ε}^b f(x)dx, where lim_{x→a} f(x) = ∞.
• ∫_a^b f(x)dx = lim_{ε→0} ∫_a^{b−ε} f(x)dx, where lim_{x→b} f(x) = ∞.
If these limits exist, the improper integral is said to be convergent; otherwise it is
called divergent.
Example 53
∫_0^1 (1/x^p)dx = lim_{ε→0+} ∫_ε^1 (1/x^p)dx = { lim_{ε→0+} (ln 1 − ln ε),                 if p = 1,
                                               { lim_{ε→0+} (1/(1 − p) − ε^{1−p}/(1 − p)),   if p ≠ 1.
Therefore the improper integral converges if p < 1 and diverges if p ≥ 1.
Economics Application 3 (The Optimal Subsidy and the Compensated Demand Curve)
Consider the consumer's dual problem (recall that the dual to utility maximization is
expenditure minimization: we minimize the objective function M = p_x x + p_y y subject to
the utility constraint U = U(x, y), where M is the income level, and x, y and p_x, p_y are the
quantities and prices of the two goods, respectively).
The solution to the expenditure-minimization problem under the utility constraint is
the budget line with income M*, M*(p_x, p_y) = p_x x* + p_y y*.
If we hold utility and p_y constant, then by Shephard's lemma
dM*(p_x)/dp_x = x*_c(p_x),
where x*_c(p_x) is the income-compensated (or simply compensated) demand curve for good x.
By rearranging terms, dM*(p_x) = x*_c(p_x)dp_x, where dM* is the optimal subsidy for
an infinitesimal change in price, and we can "add up" a series of infinitesimal changes by
taking the integral over those infinitesimal changes. Thus, if the price increases from p_x¹ to p_x²,
the optimal subsidy S* required to maintain a given level of utility is the integral over all
the infinitesimal income changes as the price changes:
S* = ∫_{p_x¹}^{p_x²} x*_c(p_x)dp_x.
For example, if p_x¹ = 1, p_x² = 4, x*_c(p_x) = 1/√p_x, then
S* = ∫_1^4 (1/√p_x)dp_x = 2√p_x|_1^4 = 2(2 − 1) = 2.
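The closed-form subsidy S* = 2 can be cross-checked with a simple numerical quadrature. A minimal sketch, using a midpoint sum of our own construction:

```python
# Numerical check of S* = integral from 1 to 4 of p^(-1/2) dp = 2,
# approximated by a midpoint Riemann sum.
import math

def midpoint_integral(f, a, b, n=100000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

S = midpoint_integral(lambda p: 1 / math.sqrt(p), 1.0, 4.0)
assert abs(S - 2.0) < 1e-6
```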
2.6 Functions of More than One Variable
Let us consider a function y = f(x_1, x_2, ..., x_n), where the variables x_i, i = 1, 2, ..., n,
are all independent of one another, i.e. each varies without affecting the others. The partial
derivative of y with respect to x_i is defined as
∂y/∂x_i = lim_{Δx_i→0} Δy/Δx_i = lim_{Δx_i→0} (f(x_1, ..., x_i + Δx_i, ..., x_n) − f(x_1, ..., x_i, ..., x_n))/Δx_i
for all i = 1, 2, ..., n. If these limits exist, the function is differentiable with respect to all
its arguments. Partial derivatives are often denoted with subscripts, e.g. ∂f/∂x_i = f_{x_i} (= f′_{x_i}).
Example 54 If y = 4x_1³ + x_1x_2 + ln x_2, then
∂y/∂x_1 = 12x_1² + x_2,   ∂y/∂x_2 = x_1 + 1/x_2.
Definition 38 The total differential of a function f is defined as
dy = (∂f/∂x_1)dx_1 + (∂f/∂x_2)dx_2 + ... + (∂f/∂x_n)dx_n.
The total differential can also be used as an approximation device.
For instance, if y = f(x_1, x_2), then by definition y − y_0 = Δf(x_1⁰, x_2⁰) ≈ df(x_1⁰, x_2⁰).
Therefore
f(x_1, x_2) ≈ f(x_1⁰, x_2⁰) + df(x_1⁰, x_2⁰) = f(x_1⁰, x_2⁰) + (∂f(x_1⁰, x_2⁰)/∂x_1)(x_1 − x_1⁰) + (∂f(x_1⁰, x_2⁰)/∂x_2)(x_2 − x_2⁰).
If y = f(x_1, ..., x_n), x_i = x_i(t_1, ..., t_m), i = 1, 2, ..., n, then for all j = 1, 2, ..., m
∂y/∂t_j = Σ_{i=1}^n (∂f(x_1, ..., x_n)/∂x_i)(∂x_i/∂t_j).
This rule is called the chain rule.
Example 55 If z = f(x, y), x = u(t), y = v(t), then
dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt)
(a special case of the chain rule).
Example 56 If z = f(x, y, t), where x = u(t), y = v(t), then
dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) + ∂f/∂t
(dz/dt is sometimes called the total derivative).
The second-order partial derivatives of y = f(x_1, ..., x_n) are defined as
∂²y/∂x_j∂x_i = f_{x_i x_j}(x_1, ..., x_n) = (∂/∂x_j) f_{x_i}(x_1, ..., x_n),   i, j = 1, 2, ..., n.
Proposition 23 (Schwarz's Theorem or Young's Theorem)
If at least one of the two cross-partials is continuous, then
∂²f/∂x_j∂x_i = ∂²f/∂x_i∂x_j,   i, j = 1, 2, ..., n.
Similarly to the case of one variable, we can expand a function of several variables into a
Taylor series in polynomial form, using higher-order partial derivatives.
Deﬁnition 39 The second order total diﬀerential of a function y = f(x
1
, x
2
, . . . , x
n
) is
deﬁned as
d
2
y = d(dy) =
n
¸
i=1
n
¸
j=1
∂
2
f
∂x
i
∂x
j
dx
i
dx
j
, i, j = 1, 2, . . . , n.
Example 57 Let z = f(x, y) and let f(x, y) satisfy the conditions of Schwarz’s theorem (i.e.
f_{xy} = f_{yx}). Then d²z = f_{xx} dx² + 2f_{xy} dx dy + f_{yy} dy².
2.7 Unconstrained Optimization in the Case of More than One
Variable
Let z = f(x_1, . . . , x_n).
Deﬁnition 40 z has a stationary point at x* = (x_1*, . . . , x_n*) if dz = 0 at x*, i.e. all f_{x_i}
should vanish at x*.
The conditions
f_{x_i}(x*) = 0  for all i = 1, 2, . . . , n
are called the ﬁrst order necessary conditions (F.O.N.C.).
The second-order necessary condition (S.O.N.C.) requires sign semideﬁniteness of the
second-order total diﬀerential.
Deﬁnition 41 z = f(x_1, . . . , x_n) has a local maximum (minimum) at x* = (x_1*, . . . , x_n*)
if f(x*) − f(x) ≥ 0 (≤ 0) for all x in some neighborhood of x*.
The second-order suﬃcient condition (S.O.C.):
x* is a maximum (minimum) if d²z at x* is negative deﬁnite (positive deﬁnite).
Recipe 11 – How to Check the Sign Deﬁniteness of d²z:
Let us deﬁne the symmetric Hessian matrix H of second partial derivatives as

H = [ ∂²f/∂x_1∂x_1  . . .  ∂²f/∂x_1∂x_n ]
    [      . . .              . . .     ]
    [ ∂²f/∂x_n∂x_1  . . .  ∂²f/∂x_n∂x_n ]

Thus, the sign deﬁniteness of d²z is equivalent to the sign deﬁniteness of the quadratic
form H evaluated at x*. We can apply either the principal minors test or the eigenvalue
test discussed in the previous chapter.
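The principal minors test can be sketched in a few lines of Python; `det` and `classify` are our own illustrative helpers, not from the text.

```python
def det(M):
    # determinant by Laplace expansion along the first row (fine for small matrices)
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(n))

def classify(H):
    # leading principal minors all > 0          -> positive definite;
    # minors alternating in sign, starting < 0  -> negative definite
    minors = [det([row[:k] for row in H[:k]]) for k in range(1, len(H) + 1)]
    if all(m > 0 for m in minors):
        return "positive definite"
    if all((m < 0 if k % 2 == 0 else m > 0) for k, m in enumerate(minors)):
        return "negative definite"
    return "indefinite or semidefinite"

assert classify([[2, 1], [1, 2]]) == "positive definite"
assert classify([[-2, 1], [1, -2]]) == "negative definite"
```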
Example 58 Consider z = f(x, y). If the ﬁrst order necessary conditions are met at x*,
then x* is a (local)
minimum if f_{xx}(x*) > 0 and f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² > 0,
maximum if f_{xx}(x*) < 0 and f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² > 0.
If f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² < 0, we identify the stationary point as a saddle point.
Note the suﬃciency of this test. For instance, even if f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² = 0,
we still may have an extremum at x*. (!)
Example 59 (Classiﬁcation of Stationary Points of a C^(2) Function³ of n Variables)
If x* = (x_1*, . . . , x_n*) is a stationary point of f(x_1, . . . , x_n) and if |H_k(x*)| is the
following determinant evaluated at x*,
|H_k| = det [ ∂²f/∂x_1∂x_1  . . .  ∂²f/∂x_1∂x_k ]
            [      . . .              . . .     ]
            [ ∂²f/∂x_k∂x_1  . . .  ∂²f/∂x_k∂x_k ] ,  k = 1, 2, . . . , n,
then
³ Recall that C^(2) means ‘twice continuously diﬀerentiable’.
a) if (−1)^k |H_k(x*)| > 0, k = 1, 2, . . . , n, then x* is a local maximum;
b) if |H_k(x*)| > 0, k = 1, 2, . . . , n, then x* is a local minimum;
c) if |H_n(x*)| ≠ 0 and neither of the two conditions above is satisﬁed, then x* is a
saddle point.
Example 60 Let us ﬁnd the extrema of the function z(x, y) = x³ − 8y³ + 6xy + 1.
F.O.N.C.: z_x = 3(x² + 2y) = 0, z_y = 6(−4y² + x) = 0.
Solving the F.O.N.C. for x, y we obtain two stationary points:
x = y = 0 and x = 1, y = −1/2.
S.O.C.: The Hessian matrices evaluated at the stationary points are
H|_(0,0) = [ 6x 6 ; 6 −48y ]|_(0,0) = [ 0 6 ; 6 0 ],   H|_(1,−1/2) = [ 6 6 ; 6 24 ],
therefore (0, 0) is a saddle point and (1, −1/2) is a local minimum.
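The hand computation in Example 60 can be replayed numerically (a sketch, not part of the original text):

```python
# Stationary points of z = x^3 - 8*y^3 + 6*x*y + 1 and their classification.
def z_x(x, y): return 3 * x**2 + 6 * y      # = 3(x^2 + 2y)
def z_y(x, y): return -24 * y**2 + 6 * x    # = 6(-4y^2 + x)

points = [(0.0, 0.0), (1.0, -0.5)]
for (x, y) in points:
    # the F.O.N.C. hold at both candidate points
    assert abs(z_x(x, y)) < 1e-12 and abs(z_y(x, y)) < 1e-12

def hessian(x, y):
    return [[6 * x, 6], [6, -48 * y]]

# (0,0): det = -36 < 0 -> saddle; (1,-1/2): z_xx = 6 > 0, det = 108 > 0 -> local min
for (x, y), kind in zip(points, ["saddle", "min"]):
    (a, b), (_, d) = hessian(x, y)
    det2 = a * d - b * b
    if kind == "saddle":
        assert det2 < 0
    else:
        assert a > 0 and det2 > 0
```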
2.8 The Implicit Function Theorem
Proposition 24 (The Two-Dimensional Case)
If F(x, y) ∈ C^(k) in a set D (i.e. F is k times continuously diﬀerentiable at any
point in D), (x_0, y_0) is an interior point of D, F(x_0, y_0) = c (c is constant) and
F_y(x_0, y_0) ≠ 0, then the equation F(x, y) = c deﬁnes y as a C^(k) function of x in some
neighborhood of (x_0, y_0), i.e. y = ψ(x), and
dy/dx = − F_x(x, y) / F_y(x, y).
Example 61 Consider the function
f(x) = (1 − e^(−ax)) / (1 + e^(ax)),  where a is a parameter, a > 1.
What happens to the extremum value of x if a increases by a small number?
To answer this question, ﬁrst write down the F.O.N.C. for a stationary point:
f_x = a(e^(−ax) − e^(ax) + 2) / (1 + e^(ax))² = 0.
If this expression vanishes at, say, x = x*, we can apply the implicit function theorem to
the F.O.N.C. and obtain
dx/da = − [∂(e^(−ax) − e^(ax) + 2)/∂a] / [∂(e^(−ax) − e^(ax) + 2)/∂x] = − x/a < 0
in some neighborhood of (x*, a). The negativity of the derivative is due to the fact that x*
is positive; therefore, every x in a small neighborhood of x* should be positive as well.
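In this example the stationary point even has a closed form: substituting u = e^(ax) into e^(−ax) − e^(ax) + 2 = 0 gives u² − 2u − 1 = 0, so u = 1 + √2 and x*(a) = ln(1 + √2)/a (our derivation, not stated in the text). This lets us check dx/da = −x/a directly:

```python
import math

def xstar(a):
    # e^{-ax} - e^{ax} + 2 = 0  <=>  u^2 - 2u - 1 = 0 with u = e^{ax},
    # hence u = 1 + sqrt(2) and x*(a) = ln(1 + sqrt(2)) / a  (our closed form)
    return math.log(1 + math.sqrt(2)) / a

a = 2.0
x = xstar(a)
# the F.O.N.C. numerator indeed vanishes at x*
assert abs(math.exp(-a * x) - math.exp(a * x) + 2) < 1e-12

# finite-difference dx*/da matches the implicit-function-theorem value -x*/a
h = 1e-6
dx_da = (xstar(a + h) - xstar(a - h)) / (2 * h)
assert abs(dx_da - (-x / a)) < 1e-6
assert dx_da < 0
```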
Let us formulate the general case of the implicit function theorem.
Consider the system of m equations with n exogenous variables, x_1, . . . , x_n, and m
endogenous variables, y_1, . . . , y_m:

f_1(x_1, . . . , x_n, y_1, . . . , y_m) = 0
f_2(x_1, . . . , x_n, y_1, . . . , y_m) = 0
. . . . . . . . .
f_m(x_1, . . . , x_n, y_1, . . . , y_m) = 0          (3)
Our objective is to ﬁnd ∂y_j/∂x_i, i = 1, . . . , n, j = 1, . . . , m.
If we take the total diﬀerentials of all f_j, we get
(∂f_1/∂y_1) dy_1 + . . . + (∂f_1/∂y_m) dy_m = −[(∂f_1/∂x_1) dx_1 + . . . + (∂f_1/∂x_n) dx_n],
. . .
(∂f_m/∂y_1) dy_1 + . . . + (∂f_m/∂y_m) dy_m = −[(∂f_m/∂x_1) dx_1 + . . . + (∂f_m/∂x_n) dx_n].
Now allow only x_i to vary (i.e. dx_i ≠ 0, dx_l = 0, l = 1, 2, . . . , i−1, i+1, . . . , n) and divide
each remaining term in the system above by dx_i. Thus we arrive at the following linear
system with respect to ∂y_1/∂x_i, . . . , ∂y_m/∂x_i (note that, in fact, dy_1/dx_i, . . . , dy_m/dx_i should be
interpreted as partial derivatives with respect to x_i since we allow only x_i to vary, holding
constant all other x_l):
(∂f_1/∂y_1)(∂y_1/∂x_i) + . . . + (∂f_1/∂y_m)(∂y_m/∂x_i) = −∂f_1/∂x_i,
. . .
(∂f_m/∂y_1)(∂y_1/∂x_i) + . . . + (∂f_m/∂y_m)(∂y_m/∂x_i) = −∂f_m/∂x_i,
which can be solved for ∂y_1/∂x_i, . . . , ∂y_m/∂x_i by, say, Cramer’s rule or by the inverse matrix
method.
Let us deﬁne the Jacobian matrix J of f_1, . . . , f_m with respect to y_1, . . . , y_m as

J = [ ∂f_1/∂y_1  . . .  ∂f_1/∂y_m ]
    [     . . .            . . .  ]
    [ ∂f_m/∂y_1  . . .  ∂f_m/∂y_m ]

Given the deﬁnition of the Jacobian, we are in a position to formulate the general result:
Proposition 25 (The General Implicit Function Theorem)
Suppose f_1, . . . , f_m are C^(k) functions in a set D ⊂ R^(n+m).
Let (x*, y*) = (x_1*, . . . , x_n*, y_1*, . . . , y_m*) be a solution to (3) in the interior of D.
Suppose also that det(J) does not vanish at (x*, y*). Then (3) deﬁnes y_1, . . . , y_m as
C^(k) functions of x_1, . . . , x_n in some neighborhood of (x*, y*), and in that neighborhood, for
j = 1, 2, . . . , n,

[ ∂y_1/∂x_j ]          [ ∂f_1/∂x_j ]
[   . . .   ] = −J⁻¹   [   . . .   ]
[ ∂y_m/∂x_j ]          [ ∂f_m/∂x_j ]
Example 62 Given the system of two equations x² + y² = z²/2 and x + y + z = 2, ﬁnd
x = x(z), y = y(z) in the neighborhood of the point (1, −1, 2).
In this setup, f_1(x, y, z) = x² + y² − z²/2, f_2(x, y, z) = x + y + z − 2, f_1, f_2 ∈ C^∞(R³),
f_1(1, −1, 2) = f_2(1, −1, 2) = 0, and at (1, −1, 2)

det(J) = det [ ∂f_1/∂x  ∂f_1/∂y ; ∂f_2/∂x  ∂f_2/∂y ] = det [ 2x 2y ; 1 1 ]|_(1,−1,2) = det [ 2 −2 ; 1 1 ] = 4 ≠ 0.

Therefore the conditions of the implicit function theorem are satisﬁed and

[ dx/dz ; dy/dz ] = − (1/(2x − 2y)) [ 1 −2y ; −1 2x ] [ −z ; 1 ]
=⇒  dx/dz = (z + 2y)/(2x − 2y),   dy/dz = −(z + 2x)/(2x − 2y).
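The derivatives found in Example 62 can be cross-checked by solving the system numerically for nearby values of z; the Newton solver below is an illustrative sketch, not part of the text.

```python
def solve_xy(z, x0, y0):
    # Newton's method on f1 = x^2 + y^2 - z^2/2, f2 = x + y + z - 2
    x, y = x0, y0
    for _ in range(50):
        f1 = x * x + y * y - z * z / 2
        f2 = x + y + z - 2
        # Jacobian [[2x, 2y], [1, 1]]; its inverse applied to (f1, f2)
        d = 2 * x - 2 * y
        dx = (f1 - 2 * y * f2) / d
        dy = (-f1 + 2 * x * f2) / d
        x, y = x - dx, y - dy
    return x, y

h = 1e-5
x_p, y_p = solve_xy(2 + h, 1.0, -1.0)
x_m, y_m = solve_xy(2 - h, 1.0, -1.0)
dx_dz = (x_p - x_m) / (2 * h)
dy_dz = (y_p - y_m) / (2 * h)

# formulas from the text, evaluated at (x, y, z) = (1, -1, 2):
# dx/dz = (z + 2y)/(2x - 2y) = 0,  dy/dz = -(z + 2x)/(2x - 2y) = -1
assert abs(dx_dz - 0.0) < 1e-4
assert abs(dy_dz - (-1.0)) < 1e-4
```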
The Jacobian matrix can also be applied to test whether functional (linear or nonlinear) dependence
exists among a set of n functions in n variables. More precisely, if we have n functions of n variables,
g_i = g_i(x_1, . . . , x_n), i = 1, 2, . . . , n, then the determinant of the Jacobian matrix of g_1, . . . , g_n with
respect to x_1, . . . , x_n will be identically zero for all values of x_1, . . . , x_n if and only if the n functions
g_1, . . . , g_n are functionally (linearly or nonlinearly) dependent.
Economics Application 4 (The Proﬁt-Maximizing Firm)
Consider a ﬁrm with the proﬁt function F(l, k) = pf(l, k) − wl − rk, where f is
the ﬁrm’s (neoclassical) production function, p is the price of output, l, k are the amounts
of labor and capital employed by the ﬁrm (in units of output), w is the real wage and r
is the real rental price of capital. The ﬁrm takes p, w and r as given. Assume that the
Hessian matrix of f is negative deﬁnite.
a) Show that if the wage increases by a small amount, then the ﬁrm decides to employ
less labor.
b) Show that if the wage increases by a small amount and the ﬁrm is constrained to
maintain the same amount of capital k_0, then the ﬁrm will reduce l by less than it does in
part a).
Solution.
a) The ﬁrm’s objective is to choose l and k so that it maximizes its proﬁts. The
ﬁrst-order conditions for maximization are:
p ∂f/∂l (l, k) − w = 0
p ∂f/∂k (l, k) − r = 0
or,
F_1(l, k; w) = 0
F_2(l, k; r) = 0
The Jacobian of this system of equations is:

|J| = det [ ∂F_1/∂l  ∂F_1/∂k ; ∂F_2/∂l  ∂F_2/∂k ] = p² det [ f_11(l, k)  f_12(l, k) ; f_12(l, k)  f_22(l, k) ] = p² |H| > 0,
where f_ij is the second-order partial derivative of f with respect to the jth and the ith
arguments and |H| is the Hessian of f. The fact that H is negative deﬁnite means that
f_11 < 0 and |H| > 0 (which also implies f_22 < 0).
Since |J| ≠ 0, the implicit-function theorem can be applied; thus, l and k are implicit
functions of w. The ﬁrst-order partial derivative of l with respect to w is:

∂l/∂w = − det [ −1  p f_12(l, k) ; 0  p f_22(l, k) ] / |J| = f_22(l, k) / (p|H|) < 0.
Thus, the optimal l falls in response to a small increase in w.
b) If the ﬁrm’s amount of capital is ﬁxed at k_0, the proﬁt function is F(l) = pf(l, k_0) −
wl − rk_0 and the ﬁrst-order condition is
p ∂f/∂l (l, k_0) − w = 0.
Since f_11 ≠ 0, the above condition deﬁnes l as an implicit function of w. Taking the
derivative of the F.O.C. with respect to w, we have:
p f_11(l, k_0) ∂l/∂w = 1  or  ∂l/∂w = 1 / (p f_11(l, k_0)).
This change in l is smaller in absolute value than the change in l in part a) since
f_22(l, k_0) / (p|H|) < 1 / (p f_11(l, k_0)).
(Note that both ratios are negative.)
For another economic application see the section ‘Constrained Optimization’.
2.9 Concavity, Convexity, Quasiconcavity and Quasiconvexity
Deﬁnition 42 A function z = f(x_1, . . . , x_n) is concave if and only if for any pair of
distinct points u = (u_1, . . . , u_n) and v = (v_1, . . . , v_n) in the domain of f and for any
θ ∈ (0, 1)
θf(u) + (1 − θ)f(v) ≤ f(θu + (1 − θ)v),
and convex if and only if
θf(u) + (1 − θ)f(v) ≥ f(θu + (1 − θ)v).
If f is diﬀerentiable, an alternative deﬁnition of concavity (or convexity) is
f(v) ≤ f(u) + Σ_{i=1}^{n} f_{x_i}(u)(v_i − u_i)   (or f(v) ≥ f(u) + Σ_{i=1}^{n} f_{x_i}(u)(v_i − u_i)).
If we change the weak inequalities ≤ and ≥ to strict inequalities, we get the deﬁnitions
of strict concavity and strict convexity, respectively.
Some useful results:
• The sum of concave (or convex) functions is a concave (or convex) function. If,
additionally, at least one of these functions is strictly concave (or convex), the sum
is also strictly concave (or convex).
• If f(x) is a (strictly) concave function then −f(x) is a (strictly) convex function,
and vice versa.
Recipe 12 – How to Check Concavity (or Convexity):
A twice continuously diﬀerentiable function z = f(x_1, . . . , x_n) is concave (or convex)
if and only if d²z is everywhere negative (or positive) semideﬁnite.
z = f(x_1, . . . , x_n) is strictly concave (or strictly convex) if (but not only if) d²z is
everywhere negative (or positive) deﬁnite.
Again, any test for the sign deﬁniteness of a symmetric quadratic form is applicable.
Example 63 z(x, y) = x² + xy + y² is strictly convex.
The conditions for concavity and convexity are necessary and suﬃcient, while those
for strict concavity (strict convexity) are only suﬃcient. (!)
Deﬁnition 43 A set S in Rⁿ is convex if for any x, y ∈ S and any θ ∈ [0, 1] the linear
combination θx + (1 − θ)y ∈ S.
Deﬁnition 44 A function z = f(x_1, . . . , x_n) is quasiconcave if and only if for any pair
of distinct points u = (u_1, . . . , u_n) and v = (v_1, . . . , v_n) in the convex domain of f and for
any θ ∈ (0, 1)
f(v) ≥ f(u) =⇒ f(θu + (1 − θ)v) ≥ f(u),
and quasiconvex if and only if
f(v) ≥ f(u) =⇒ f(θu + (1 − θ)v) ≤ f(v).
If f is diﬀerentiable, an alternative deﬁnition of quasiconcavity (quasiconvexity) is
f(v) ≥ f(u) =⇒ Σ_{i=1}^{n} f_{x_i}(u)(v_i − u_i) ≥ 0   (f(v) ≥ f(u) =⇒ Σ_{i=1}^{n} f_{x_i}(v)(v_i − u_i) ≥ 0).
A change of weak inequalities ≤ and ≥ on the right to strict inequalities gives us the
deﬁnitions of strict quasiconcavity and strict quasiconvexity, respectively.
Example 64 The function z(x, y) = xy, x, y ≥ 0 is quasiconcave.
Note that any convex (concave) function is quasiconvex (quasiconcave), but not vice
versa. (!)
2.10 Appendix: Matrix Derivatives
Matrix derivatives play an important role in economic analysis, especially in econometrics.
If A is an [n × n] nonsingular matrix, the derivative of its determinant with respect to
A is given by
∂(det(A))/∂A = (C_ij),
where (C_ij) is the matrix of cofactors of A.
Example 65 (Matrix Derivatives in Econometrics: How to Find the Least Squares
Estimator in a Multiple Regression Model)
Consider the multiple regression model:
y = Xβ + ε,
where the [n × 1] vector y is the dependent variable, X is an [n × k] matrix of k explanatory
variables with rank(X) = k, β is a [k × 1] vector of coeﬃcients which are to be estimated,
and ε is an [n × 1] vector of disturbances. We assume that the matrices of observations
X and y are given.
Our goal is to ﬁnd an estimator b for β using the least squares method. The least
squares estimator of β is a vector b which minimizes the expression
E(b) = (y − Xb)′(y − Xb) = y′y − y′Xb − b′X′y + b′X′Xb.
The ﬁrst-order condition for an extremum is:
dE(b)/db = 0.
To obtain the ﬁrst order conditions we use the following formulas:
• d(a′b)/db = a,
• d(b′a)/db = a,
• d(Mb)/db = M′,
• d(b′Mb)/db = (M + M′)b,
where a, b are [k × 1] vectors and M is a [k × k] matrix.
Using these formulas we obtain:
dE(b)/db = −2X′y + 2X′Xb
and the ﬁrst-order condition implies:
X′Xb = X′y
or
b = (X′X)⁻¹X′y.
On the other hand, by the third derivation rule above, we have:
d²E(b)/db² = (2X′X)′ = 2X′X.
To check whether the solution b is indeed a minimum, we need to prove the positive
deﬁniteness of the matrix X′X.
First, notice that X′X is a symmetric matrix. The symmetry is obvious:
(X′X)′ = X′X.
To prove positive deﬁniteness, we take an arbitrary [k × 1] vector z, z ≠ 0, and check the
following quadratic form:
z′(X′X)z = (Xz)′Xz.
The assumptions rank(X) = k and z ≠ 0 imply Xz ≠ 0. It follows that (Xz)′Xz > 0 for
any z ≠ 0 or, equivalently, z′X′Xz > 0 for any z ≠ 0, which means that (by deﬁnition)
X′X is positive deﬁnite.
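The formula b = (X′X)⁻¹X′y can be tried on simulated data; the snippet below is an illustration with made-up data (using numpy), and it also confirms that X′X is positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))          # full column rank with probability 1
beta = np.array([1.0, -2.0, 0.5])    # "true" coefficients, chosen for the sketch
y = X @ beta + 0.01 * rng.normal(size=n)

# least squares estimator from the normal equations: b = (X'X)^{-1} X'y
b = np.linalg.solve(X.T @ X, X.T @ y)

# same answer as the library least-squares solver
b_lib, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(b, b_lib)
# X'X is positive definite: all eigenvalues strictly positive
assert np.all(np.linalg.eigvalsh(X.T @ X) > 0)
```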
Further Reading:
• Bartle, R.G. The Elements of Real Analysis.
• Edwards, C.H. and D.E. Penney. Calculus and Analytic Geometry.
• Greene, W.H. Econometric Analysis.
• Rudin, W. Principles of Mathematical Analysis.
3 Constrained Optimization
3.1 Optimization with Equality Constraints
Consider the problem⁴:
extremize f(x_1, . . . , x_n)                                  (4)
subject to g_j(x_1, . . . , x_n) = b_j,  j = 1, 2, . . . , m < n.
f is called the objective function, g_1, g_2, . . . , g_m are the constraint functions, and b_1, b_2, . . . , b_m
are the constraint constants. The diﬀerence n − m is the number of degrees of freedom of
the problem.
Note that n is strictly greater than m. (!)
If it is possible to explicitly express (from the constraint functions) m of the independent
variables as functions of the other n − m independent variables, we can eliminate m
variables from the objective function; thus the initial problem is reduced to an unconstrained
optimization problem with respect to n − m variables. However, in many cases
it is not technically feasible to explicitly express one variable as a function of the others.
Instead of the substitution and elimination method, we may resort to the easy-to-use
and well-deﬁned method of Lagrange multipliers.
Let f and g_1, . . . , g_m be C^(1) functions and let the Jacobian J = (∂g_j/∂x_i), i = 1, 2, . . . , n,
j = 1, 2, . . . , m, have full rank, i.e. rank(J) = m. We introduce the Lagrangian function as
L(x_1, . . . , x_n, λ_1, . . . , λ_m) = f(x_1, . . . , x_n) + Σ_{j=1}^{m} λ_j (b_j − g_j(x_1, . . . , x_n)),
where λ_1, λ_2, . . . , λ_m are constants (Lagrange multipliers).
Recipe 13 – What are the Necessary Conditions for the Solution of (4)?
Equate all partials of L with respect to x_1, . . . , x_n, λ_1, . . . , λ_m to zero:

∂L(x_1, . . . , x_n, λ_1, . . . , λ_m)/∂x_i = ∂f(x_1, . . . , x_n)/∂x_i − Σ_{j=1}^{m} λ_j ∂g_j(x_1, . . . , x_n)/∂x_i = 0,
i = 1, 2, . . . , n,

∂L(x_1, . . . , x_n, λ_1, . . . , λ_m)/∂λ_j = b_j − g_j(x_1, . . . , x_n) = 0,  j = 1, 2, . . . , m.

Solve these equations for x_1, . . . , x_n and λ_1, . . . , λ_m.
In the end we will get a set of stationary points of the Lagrangian. If x* = (x_1*, . . . , x_n*)
is a solution of (4), it should be a stationary point of L.
It is important that rank(J) = m, and that the functions are continuously diﬀerentiable. (!)
Example 66 This example shows that the Lagrange-multiplier method does not always
provide the solution to an optimization problem. Consider the problem:
min x² + y²  subject to (x − 1)³ − y² = 0.
⁴ “Extremize” means to ﬁnd either the minimum or the maximum of the objective function f.
Notice that the solution to this problem is (1, 0). The restriction (x − 1)³ = y² implies
x ≥ 1. If x = 1, then y = 0 and x² + y² = 1. If x > 1, then x² + y² > 1. Thus,
the minimum of x² + y² is attained at (1, 0).
The Lagrangian function is
L = x² + y² − λ[(x − 1)³ − y²].
The ﬁrst-order conditions are
∂L/∂x = 0,  ∂L/∂y = 0,  ∂L/∂λ = 0,
or
2x − 3λ(x − 1)² = 0
2y + 2λy = 0
(x − 1)³ − y² = 0.
Clearly, the ﬁrst equation is not satisﬁed for x = 1. Thus, the Lagrange-multiplier
method fails to detect the solution (1, 0).
If we need to check whether a stationary point is a maximum or minimum, the following
local suﬃcient conditions can be applied:
Proposition 26 Let us introduce a bordered Hessian determinant |H̄_r| as

            [ 0 . . . 0                   ∂g_1/∂x_1 . . . ∂g_1/∂x_r          ]
            [ . . .                       . . .                              ]
            [ 0 . . . 0                   ∂g_m/∂x_1 . . . ∂g_m/∂x_r          ]
|H̄_r| = det [ ∂g_1/∂x_1 . . . ∂g_m/∂x_1   ∂²L/∂x_1∂x_1 . . . ∂²L/∂x_1∂x_r    ]
            [ . . .                       . . .                              ]
            [ ∂g_1/∂x_r . . . ∂g_m/∂x_r   ∂²L/∂x_r∂x_1 . . . ∂²L/∂x_r∂x_r    ] ,  r = 1, 2, . . . , n.
Let f and g_1, . . . , g_m be C^(2) functions and let x* satisfy the necessary conditions for the
problem (4). Let |H̄_r(x*)| be the bordered Hessian determinant evaluated at x*. Then
• if (−1)^m |H̄_r(x*)| > 0, r = m + 1, . . . , n, then x* is a local minimum point for the
problem (4);
• if (−1)^r |H̄_r(x*)| > 0, r = m + 1, . . . , n, then x* is a local maximum point for the
problem (4).
Example 67 (Local Second-Order Conditions for the Case with One Constraint)
Suppose that x* = (x_1*, . . . , x_n*) satisﬁes the necessary conditions for the problem
max (min) f(x_1, . . . , x_n)  subject to g(x_1, . . . , x_n) = b,
i.e. all partial derivatives of the Lagrangian are zero at x*. Deﬁne
            [ 0          ∂g/∂x_1 . . . ∂g/∂x_r              ]
            [ ∂g/∂x_1    ∂²L/∂x_1∂x_1 . . . ∂²L/∂x_1∂x_r    ]
|H̄_r| = det [ . . .      . . .                              ]
            [ ∂g/∂x_r    ∂²L/∂x_r∂x_1 . . . ∂²L/∂x_r∂x_r    ] ,  r = 1, 2, . . . , n.
Let |H̄_r(x*)| be |H̄_r| evaluated at x*. Then
• x* is a local minimum point of f subject to the given constraint if |H̄_r(x*)| < 0 for
all r = 2, . . . , n;
• x* is a local maximum point of f subject to the given constraint if (−1)^r |H̄_r(x*)| > 0
for all r = 2, . . . , n.
Economics Application 5 (Utility Maximization)
In order to illustrate how the Lagrange-multiplier method can be applied, let us consider
the following optimization problem:
The preferences of a consumer over two goods x and y are given by the utility function
U(x, y) = (x + 1)(y + 1) = xy + x + y + 1.
The prices of goods x and y are 1 and 2, respectively, and the consumer’s income is 30.
What bundle of goods will the consumer choose?
The consumer’s budget constraint is x + 2y ≤ 30 which, together with the conditions
x ≥ 0 and y ≥ 0, determines his budget set (the set of all aﬀordable bundles of goods).
He will choose the bundle from his budget set that maximizes his utility. Since U(x, y) is
an increasing function in both x and y (over the domain x ≥ 0 and y ≥ 0), the budget
constraint should be satisﬁed with equality. The consumer’s optimization problem can be
stated as follows:
max U(x, y)
subject to x + 2y = 30.
The Lagrangian function is
L = U(x, y) − λ(x + 2y − 30)
  = xy + x + y + 1 − λ(x + 2y − 30).
The ﬁrst-order necessary conditions are
∂L/∂x = 0,  ∂L/∂y = 0,  ∂L/∂λ = 0,
or
y + 1 − λ = 0
x + 1 − 2λ = 0
x + 2y − 30 = 0.
From the ﬁrst equation, we get λ = y + 1; substituting y + 1 for λ in the second equation,
we obtain x − 2y − 1 = 0. This condition and the third condition above lead to x = 31/2,
y = 29/4.
To check whether this solution really maximizes the objective function, let us apply the
second order suﬃcient conditions.
The second-order conditions involve the bordered Hessian:
det H̄ = det [ 0 1 2 ; 1 0 1 ; 2 1 0 ] = 4 > 0.
Thus, (x = 31/2, y = 29/4) is indeed a maximum and represents the bundle demanded by the
consumer.
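The solution of this utility problem can be re-derived mechanically; the sketch below mirrors the elimination used in the text and re-checks the bordered-Hessian determinant by hand.

```python
# Solve the F.O.N.C. of the utility problem by simple elimination:
#   y + 1 - lam = 0,  x + 1 - 2*lam = 0,  x + 2*y - 30 = 0
# lam = y + 1  =>  x = 2*y + 1  =>  (2*y + 1) + 2*y = 30  =>  y = 29/4
y = 29 / 4
x = 2 * y + 1
lam = y + 1

assert abs(x - 31 / 2) < 1e-12
assert abs(x + 2 * y - 30) < 1e-12          # budget holds with equality
assert abs((y + 1) - lam) < 1e-12 and abs((x + 1) - 2 * lam) < 1e-12

# bordered-Hessian check: det [[0,1,2],[1,0,1],[2,1,0]] = 4 > 0  (maximum)
H = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
det3 = (H[0][0] * (H[1][1] * H[2][2] - H[1][2] * H[2][1])
        - H[0][1] * (H[1][0] * H[2][2] - H[1][2] * H[2][0])
        + H[0][2] * (H[1][0] * H[2][1] - H[1][1] * H[2][0]))
assert det3 == 4
```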
Economics Application 6 (Applications of the Implicit-Function Theorem)
Consider the problem of maximizing the function f(x, y) = ax + y subject to the
constraint x² + ay² = 1, where x > 0, y > 0 and a is a positive parameter. Given that this
problem has a solution, ﬁnd how the optimal values of x and y change if a increases by a
very small amount.
Solution:
Diﬀerentiation of the Lagrangian function L = ax + y − λ(x² + ay² − 1) with respect
to x, y, λ gives the ﬁrst-order conditions:
a − 2λx = 0
1 − 2aλy = 0
x² + ay² − 1 = 0
or,
F_1(x, λ; a) = 0
F_2(y, λ; a) = 0
F_3(x, y; a) = 0                                  (5)
The Jacobian of this system of three equations is:

J = [ ∂F_1/∂x  ∂F_1/∂y  ∂F_1/∂λ ]   [ −2λ  0    −2x  ]
    [ ∂F_2/∂x  ∂F_2/∂y  ∂F_2/∂λ ] = [ 0    −2aλ −2ay ]
    [ ∂F_3/∂x  ∂F_3/∂y  ∂F_3/∂λ ]   [ 2x   2ay  0    ]

|J| = −8aλ(x² + ay²) = −8aλ.
From the ﬁrst F.O.C. we can see that λ > 0, thus |J| < 0 at the optimal point. Therefore,
in a neighborhood of the optimal solution, given by the system (5), the conditions of the
implicit-function theorem are met, and in this neighborhood x and y can be expressed as
functions of a.
The ﬁrst-order partial derivatives of x and y with respect to a are evaluated as:

∂x/∂a = − det [ ∂F_1/∂a  ∂F_1/∂y  ∂F_1/∂λ ; ∂F_2/∂a  ∂F_2/∂y  ∂F_2/∂λ ; ∂F_3/∂a  ∂F_3/∂y  ∂F_3/∂λ ] / |J|
      = − det [ 1  0  −2x ; −2λy  −2aλ  −2ay ; y²  2ay  0 ] / |J|
      = 4ay²(λx + a) / (8aλ) = 3xy²/2 > 0,

∂y/∂a = − det [ ∂F_1/∂x  ∂F_1/∂a  ∂F_1/∂λ ; ∂F_2/∂x  ∂F_2/∂a  ∂F_2/∂λ ; ∂F_3/∂x  ∂F_3/∂a  ∂F_3/∂λ ] / |J|
      = − det [ −2λ  1  −2x ; 0  −2λy  −2ay ; 2x  y²  0 ] / |J|
      = − [−4xy(λx + a) − 4λy(x² + ay²)] / (−8aλ) = − (3xy/(4λ) + y/(2a)) < 0.
In conclusion, a small increase in a will lead to a rise in the optimal x and a fall in
the optimal y.
Proposition 27 Suppose that f and g_1, . . . , g_m are deﬁned on an open convex set S ⊂ Rⁿ.
Let x* be a stationary point of the Lagrangian. Then,
• if the Lagrangian is concave, x* maximizes (4);
• if the Lagrangian is convex, x* minimizes (4).
3.2 The Case of Inequality Constraints
Classical methods of optimization (the method of Lagrange multipliers) deal with
optimization problems with equality constraints in the form of g(x_1, . . . , x_n) = c. Nonclassical
optimization, also known as mathematical programming, tackles problems with inequality
constraints like g(x_1, . . . , x_n) ≤ c.
Mathematical programming includes linear programming and nonlinear programming.
In linear programming, the objective function and all inequality constraints are linear.
When either the objective function or an inequality constraint is nonlinear, we face a
problem of nonlinear programming.
In the following, we restrict our attention to nonlinear programming. A problem of
linear programming – also called a linear program – is discussed in the Appendix to this
chapter.
3.2.1 Non-Linear Programming
The non-linear programming problem is that of choosing nonnegative values of certain
variables so as to maximize or minimize a given (nonlinear) function subject to a given
set of (nonlinear) inequality constraints.
The nonlinear programming maximum problem is

max f(x_1, . . . , x_n)                                  (6)
subject to g_i(x_1, . . . , x_n) ≤ b_i,  i = 1, 2, . . . , m,
x_1 ≥ 0, . . . , x_n ≥ 0.
Similarly, the minimization problem is
min f(x_1, . . . , x_n)
subject to g_i(x_1, . . . , x_n) ≥ b_i,  i = 1, 2, . . . , m,
x_1 ≥ 0, . . . , x_n ≥ 0.
First, note that there are no restrictions on the relative size of m and n, unlike the case
of equality constraints. Second, note that the direction of the inequalities (≤ or ≥) in the
constraints is only a convention, because the inequality g_i ≤ b_i can easily be converted to
the ≥ inequality by multiplying it by −1, yielding −g_i ≥ −b_i. Third, note that an equality
constraint, say g_k = b_k, can be replaced by the two inequality constraints g_k ≤ b_k and
−g_k ≤ −b_k. (!)
Deﬁnition 45 A constraint g_j ≤ b_j is called binding (or active) at x⁰ = (x_1⁰, . . . , x_n⁰) if
g_j(x⁰) = b_j.
Example 68 The following is an example of a nonlinear program:
Max π = x_1(10 − x_1) + x_2(20 − x_2)
subject to 5x_1 + 3x_2 ≤ 40
x_1 ≤ 5
x_2 ≤ 10
x_1 ≥ 0, x_2 ≥ 0.
Example 69 (How to Solve a Nonlinear Program Graphically)
An intuitive way to understand a low-dimensional nonlinear program is to represent
the feasible set and the objective function graphically.
For instance, we may try to represent graphically the optimization problem from
Example 68, which can be rewritten as
Max π = 125 − (x_1 − 5)² − (x_2 − 10)²
subject to 5x_1 + 3x_2 ≤ 40
x_1 ≤ 5
x_2 ≤ 10
x_1 ≥ 0, x_2 ≥ 0;
or
Min C = (x_1 − 5)² + (x_2 − 10)²
subject to 5x_1 + 3x_2 ≤ 40
x_1 ≤ 5
x_2 ≤ 10
x_1 ≥ 0, x_2 ≥ 0.
Region OABCD below shows the feasible set of the nonlinear program.
[Figure: the feasible set OABCD in the (x_1, x_2) plane, bounded by the axes and the
lines 5x_1 + 3x_2 = 40, x_1 = 5 and x_2 = 10.]
The value of the objective function can be interpreted as the square of the distance
from the point (5, 10) to the point (x_1, x_2). Thus, the nonlinear program requires us to
ﬁnd the point in the feasible region which minimizes the distance to the point (5, 10).
As can be seen in the ﬁgure above, the optimal point (x_1*, x_2*) lies on the line 5x_1 + 3x_2 =
40 and it is also the tangency point with a circle centered at (5, 10). These two conditions
are suﬃcient to determine the optimal solution.
Consider (x_1 − 5)² + (x_2 − 10)² − C = 0. From the implicit-function theorem, we have
dx_2/dx_1 = − (x_1 − 5)/(x_2 − 10).
Thus, the tangent to the circle centered at (5, 10) which has the slope −5/3 is given by the
equation
−5/3 = − (x_1 − 5)/(x_2 − 10)
or
3x_1 − 5x_2 + 35 = 0.
Solving the system of equations
5x_1 + 3x_2 = 40
3x_1 − 5x_2 = −35
we ﬁnd the optimal solution (x_1*, x_2*) = (95/34, 295/34).
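The same optimum can be recomputed by Cramer's rule and sanity-checked against a few other feasible points; the comparison points below are arbitrary choices for this sketch.

```python
# Solve 5*x1 + 3*x2 = 40, 3*x1 - 5*x2 = -35 by Cramer's rule
d = 5 * (-5) - 3 * 3                 # = -34
x1 = (40 * (-5) - 3 * (-35)) / d     # = 95/34
x2 = (5 * (-35) - 40 * 3) / d        # = 295/34

assert abs(x1 - 95 / 34) < 1e-12
assert abs(x2 - 295 / 34) < 1e-12
# the point is feasible and lies on the binding constraint
assert 0 <= x1 <= 5 and 0 <= x2 <= 10
assert abs(5 * x1 + 3 * x2 - 40) < 1e-12

# profit there beats a handful of other feasible points, consistent with optimality
def profit(a, b):
    return a * (10 - a) + b * (20 - b)

p_star = profit(x1, x2)
for (a, b) in [(2.5, 9.0), (5.0, 5.0), (0.0, 10.0), (2.0, 10.0)]:
    assert 5 * a + 3 * b <= 40 and a <= 5 and b <= 10
    assert profit(a, b) <= p_star + 1e-9
```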
3.2.2 Kuhn-Tucker Conditions
For the purpose of ruling out certain irregularities on the boundary of the feasible set, a
restriction on the constraint functions is imposed. This restriction is called the constraint
qualiﬁcation.
Deﬁnition 46 Let x̄ = (x̄_1, . . . , x̄_n) be a point on the boundary of the feasible region of a
nonlinear program.
i) A vector dx = (dx_1, . . . , dx_n) is a test vector of x̄ if it satisﬁes the conditions
• x̄_j = 0 =⇒ dx_j ≥ 0;
• g_i(x̄) = r_i =⇒ dg_i(x̄) ≤ (≥) 0 for a maximization (minimization) program.
ii) A qualifying arc for a test vector of x̄ is an arc with the point of origin x̄, contained
entirely in the feasible region and tangent to the test vector.
iii) The constraint qualiﬁcation is a condition which requires that, for any boundary
point x̄ of the feasible region and for any test vector of x̄, there exists a qualifying
arc for that test vector.
The constraint qualiﬁcation condition is often imposed in the following way:
the gradients at x* of those constraints g_j which are binding at x* are linearly independent.
Assume that the constraint qualiﬁcation condition is satisﬁed and the functions in
(6) are diﬀerentiable. If x* is the optimal solution to (6), it should satisfy the following
conditions (the Kuhn-Tucker necessary maximum conditions, or KT conditions):

∂L/∂x_i ≤ 0,  x_i ≥ 0  and  x_i (∂L/∂x_i) = 0,  i = 1, . . . , n,
∂L/∂λ_j ≥ 0,  λ_j ≥ 0  and  λ_j (∂L/∂λ_j) = 0,  j = 1, . . . , m,

where
L(x_1, . . . , x_n, λ_1, . . . , λ_m) = f(x_1, . . . , x_n) + Σ_{j=1}^{m} λ_j (b_j − g_j(x_1, . . . , x_n))
is the Lagrangian function of a nonlinear program.
The minimization version of the Kuhn-Tucker necessary conditions is

∂L/∂x_i ≥ 0,  x_i ≥ 0  and  x_i (∂L/∂x_i) = 0,  i = 1, . . . , n,
∂L/∂λ_j ≤ 0,  λ_j ≥ 0  and  λ_j (∂L/∂λ_j) = 0,  j = 1, . . . , m.
Note that, in general, the KT conditions are neither necessary nor suﬃcient for a
local optimum. However, if certain assumptions are satisﬁed, the KT conditions become
necessary and even suﬃcient. (!)
Example 70 The Lagrangian function of the nonlinear program in Example 68 is:
L = x_1(10 − x_1) + x_2(20 − x_2) − λ_1(5x_1 + 3x_2 − 40) − λ_2(x_1 − 5) − λ_3(x_2 − 10).
The KT conditions are:
∂L/∂x_1 = 10 − 2x_1 − 5λ_1 − λ_2 ≤ 0
∂L/∂x_2 = 20 − 2x_2 − 3λ_1 − λ_3 ≤ 0
∂L/∂λ_1 = −(5x_1 + 3x_2 − 40) ≥ 0
∂L/∂λ_2 = −(x_1 − 5) ≥ 0
∂L/∂λ_3 = −(x_2 − 10) ≥ 0
x_1 ≥ 0, x_2 ≥ 0
λ_1 ≥ 0, λ_2 ≥ 0, λ_3 ≥ 0
x_1 (∂L/∂x_1) = 0,  x_2 (∂L/∂x_2) = 0
λ_i (∂L/∂λ_i) = 0,  i = 1, 2, 3.
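The multipliers at the optimum of Example 68 are not stated in the text; assuming constraints 2 and 3 are slack at (95/34, 295/34) (which they are), complementary slackness pins them down, and the KT conditions can be verified exactly with rational arithmetic:

```python
from fractions import Fraction as F

x1, x2 = F(95, 34), F(295, 34)
# constraints 2 and 3 are slack at the optimum, so lam2 = lam3 = 0 by
# complementary slackness; constraint 1 binds, and lam1 solves
# 10 - 2*x1 - 5*lam1 = 0  (our computation, not stated in the text)
lam1, lam2, lam3 = F(15, 17), F(0), F(0)

dL_dx1 = 10 - 2 * x1 - 5 * lam1 - lam2
dL_dx2 = 20 - 2 * x2 - 3 * lam1 - lam3
assert dL_dx1 == 0 and dL_dx2 == 0      # <= 0, with equality since x1, x2 > 0
assert 5 * x1 + 3 * x2 == 40            # binding constraint, lam1 > 0
assert x1 < 5 and x2 < 10               # slack constraints, lam2 = lam3 = 0
assert lam1 > 0
```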
Proposition 28 (The Necessity Theorem)
The KT conditions are necessary conditions for a local optimum if the constraint
qualiﬁcation is satisﬁed.
Note the following: (!)
• The failure of the constraint qualiﬁcation signals certain irregularities at the
boundary kinks of the feasible set. Only if the optimal solution occurs at such a kink may
the KT conditions not be satisﬁed.
• If all constraints are linear, the constraint qualiﬁcation is always satisﬁed.
Example 71 The constraint qualiﬁcation for the nonlinear program in Example 68 is
satisﬁed since all constraints are linear. Therefore, the optimal solution (95/34, 295/34) must
satisfy the KT conditions in Example 70.
Example 72 The following example illustrates a case where the Kuhn-Tucker conditions
are not satisﬁed at the solution of an optimization problem. Consider the problem:
max y
subject to x + (y − 1)³ ≤ 0,
x ≥ 0, y ≥ 0.
The solution to this problem is (0, 1). (If y > 1, then the restriction x + (y − 1)³ ≤ 0
implies x < 0.)
The Lagrangian function is
L = y + λ[−x − (y − 1)³].
One of the Kuhn-Tucker conditions requires
∂L/∂y ≤ 0,
or
1 − 3λ(y − 1)² ≤ 0.
As can be observed, this condition is not satisﬁed at the point (0, 1).
Proposition 29 (Kuhn-Tucker Suﬃciency Theorem – 1) If
a) f is diﬀerentiable and concave in the nonnegative orthant;
b) each constraint function is diﬀerentiable and convex in the nonnegative orthant;
c) the point x* satisﬁes the Kuhn-Tucker necessary maximum conditions,
then x* gives a global maximum of f.
To adapt this theorem for minimization problems, we need to interchange the two words “concave”
and “convex” in a) and b) and use the Kuhn-Tucker necessary minimum condition in c).
Proposition 30 (The Suﬃciency Theorem – 2)
The KT conditions are suﬃcient conditions for x* to be a local optimum of a
maximization (minimization) program if the following assumptions are satisﬁed:
• the objective function f is diﬀerentiable and quasiconcave (or quasiconvex);
• each constraint g_i is diﬀerentiable and quasiconvex (or quasiconcave);
• any one of the following is satisﬁed:
– there exists j such that ∂f(x*)/∂x_j < 0 (> 0);
– there exists j such that ∂f(x*)/∂x_j > 0 (< 0) and x_j can take a positive value without
violating the constraints;
– f is concave (or convex).
The problem of ﬁnding the nonnegative vector (x*, λ*), x* = (x_1*, . . . , x_n*), λ* =
(λ_1*, . . . , λ_m*), which satisﬁes the Kuhn-Tucker necessary conditions and for which
L(x, λ*) ≤ L(x*, λ*) ≤ L(x*, λ)  for all x = (x_1, . . . , x_n) ≥ 0, λ = (λ_1, . . . , λ_m) ≥ 0,
is known as the saddle point problem.
Proposition 31 If (x*, λ*) solves the saddle point problem, then (x*, λ*) solves the
problem (6).
Economics Application 7 (Economic Interpretation of a Nonlinear Program and Kuhn-Tucker Conditions)
A nonlinear program can be interpreted much like a linear program. A maximization
program in the general form, for example, is the production problem facing a ﬁrm which
has to produce n goods such that it maximizes its revenue subject to m resource (factor)
constraints.
The variables have the following economic interpretations:
• x_j is the amount produced of the jth product;
• r_i is the amount of the ith resource available;
• f is the proﬁt (revenue) function;
• g_i is a function which shows how the ith resource is used in producing the n goods.
The optimal solution to the maximization program indicates the optimal quantities of each
good the ﬁrm should produce.
In order to interpret the Kuhn-Tucker conditions, we ﬁrst have to note the meanings
of the following variables:
• f_j = ∂f/∂x_j is the marginal proﬁt (revenue) of product j;
• λ_i is the shadow price of resource i;
• g_j^i = ∂g_i/∂x_j is the amount of resource i used in producing a marginal unit of product j;
• λ_i g_j^i is the imputed cost of resource i incurred in the production of a marginal unit
of product j.
The KT condition ∂L/∂x_j ≤ 0 can be written as f_j ≤ Σ_{i=1}^m λ_i g^i_j, and it says that the marginal profit of the jth product cannot exceed the aggregate marginal imputed cost of the jth product.
The condition x_j ∂L/∂x_j = 0 implies that, in order to produce good j (x_j > 0), the marginal profit of good j must be equal to the aggregate marginal imputed cost (∂L/∂x_j = 0). The same condition shows that good j is not produced (x_j = 0) if there is an excess imputation (∂L/∂x_j < 0).
The KT condition ∂L/∂λ_i ≥ 0 is simply a restatement of constraint i, which states that the total amount of resource i used in producing all n goods should not exceed the total amount available, r_i.
The condition λ_i ∂L/∂λ_i = 0 indicates that if a resource is not fully used in the optimal solution (∂L/∂λ_i > 0), then its shadow price will be 0 (λ_i = 0). On the other hand, a fully used resource (∂L/∂λ_i = 0) has a strictly positive price (λ_i > 0).
Example 73 Let us find an economic interpretation for the maximization program given in Example 68:

Max R = x_1(10 − x_1) + x_2(20 − x_2)
subject to 5x_1 + 3x_2 ≤ 40
x_1 ≤ 5
x_2 ≤ 10
x_1 ≥ 0, x_2 ≥ 0.
A firm has to produce two goods using three kinds of resources available in the amounts 40, 5, 10, respectively. The first resource is used in the production of both goods: five units are necessary to produce one unit of good 1, and three units to produce one unit of good 2. The second resource is used only in producing good 1, and the third resource is used only in producing good 2.
The sale prices of the two goods are given by the linear inverse demand equations p_1 = 10 − x_1 and p_2 = 20 − x_2. The problem the firm faces is how much to produce of each good in order to maximize revenue R = x_1 p_1 + x_2 p_2. The solution (2, 10) gives the optimal amounts the firm should produce.
Economics Application 8 (The Sales-Maximizing Firm)
Suppose that a firm's objective is to maximize its sales revenue subject to the constraint that the profit is not less than a certain value. Let Q denote the amount of the good supplied to the market. If the revenue function is R(Q) = 20Q − Q^2, the cost function is C(Q) = Q^2 + 6Q + 2 and the minimum profit is 10, find the sales-maximizing quantity.
The firm's problem is

max R(Q)
subject to R(Q) − C(Q) ≥ 10 and Q ≥ 0

or

max R(Q)
subject to 2Q^2 − 14Q + 2 ≤ −10 and Q ≥ 0.

The Lagrangian function is

Z = 20Q − Q^2 − λ(2Q^2 − 14Q + 12)

and the Kuhn-Tucker conditions are

∂Z/∂Q ≤ 0, Q ≥ 0, Q ∂Z/∂Q = 0, (7)
∂Z/∂λ ≥ 0, λ ≥ 0, λ ∂Z/∂λ = 0. (8)

Explicitly, the first inequalities in each row above are (after dividing both by 2)

−Q + 10 − λ(2Q − 7) ≤ 0, (9)
−Q^2 + 7Q − 6 ≥ 0. (10)

The second inequality in (7) and inequality (10) imply that Q > 0 (if Q were 0, (10) would not be satisfied). From the third condition in (7) it follows that ∂Z/∂Q = 0.
If λ were 0, the condition ∂Z/∂Q = 0 would lead to Q = 10, which is not consistent with (10); thus, we must have λ > 0 and ∂Z/∂λ = 0.
The quadratic equation −Q^2 + 7Q − 6 = 0 has the roots Q_1 = 1 and Q_2 = 6. Q_1 implies λ = −9/5, which contradicts λ > 0. The solution Q_2 = 6 leads to a positive value for λ. Thus, the sales-maximizing quantity is Q = 6.
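The root-and-multiplier argument above mechanizes directly. As a minimal sketch (the names are ours): find the roots of −Q² + 7Q − 6 = 0 with NumPy, then recover λ from the equality version of (9), λ = (10 − Q)/(2Q − 7), and keep the root with a positive multiplier.

```python
import numpy as np

# Roots of -Q^2 + 7Q - 6 = 0 (the binding profit constraint)
roots = np.roots([-1, 7, -6])

# Multiplier implied by the binding condition (9): -Q + 10 - lam*(2Q - 7) = 0
def lam(Q):
    return (10 - Q) / (2 * Q - 7)

candidates = sorted(roots)                       # [1.0, 6.0]
lams = [lam(Q) for Q in candidates]              # [-1.8, 0.8]
Q_star = [Q for Q, l in zip(candidates, lams) if l > 0][0]
```

Only Q = 6 yields λ > 0, confirming the sales-maximizing quantity.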
3.3 Appendix: Linear Programming
3.3.1 The Setup of the Problem
There are two general types of linear programs:
• the maximization program in n variables subject to m + n constraints:

Maximize π = Σ_{j=1}^n c_j x_j
subject to Σ_{j=1}^n a_ij x_j ≤ r_i (i = 1, 2, . . . , m)
and x_j ≥ 0 (j = 1, 2, . . . , n).

• the minimization program in n variables subject to m + n constraints:

Minimize C = Σ_{j=1}^n c_j x_j
subject to Σ_{j=1}^n a_ij x_j ≥ r_i (i = 1, 2, . . . , m)
and x_j ≥ 0 (j = 1, 2, . . . , n).

Note that without loss of generality all constraints in a linear program may be written as either ≤ inequalities or ≥ inequalities, because any ≤ inequality can be transformed into a ≥ inequality by multiplying both sides by −1.
A more concise way to express a linear program is by using matrix notation:
• the maximization program in n variables subject to m + n constraints:

Maximize π = c′x
subject to Ax ≤ r
and x ≥ 0.

• the minimization program in n variables subject to m + n constraints:

Minimize C = c′x
subject to Ax ≥ r
and x ≥ 0,

where c = (c_1, . . . , c_n)′, x = (x_1, . . . , x_n)′, r = (r_1, . . . , r_m)′, and

A =
[ a_11  a_12  . . .  a_1n ]
[ a_21  a_22  . . .  a_2n ]
[  .     .           .   ]
[ a_m1  a_m2  . . .  a_mn ]
Example 74 The following is an example of a linear program:

Max π = 10x_1 + 30x_2
subject to 5x_1 + 3x_2 ≤ 40
x_1 ≤ 5
x_2 ≤ 10
x_1 ≥ 0, x_2 ≥ 0.
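Small programs of this kind can also be solved with an off-the-shelf LP solver. The sketch below (assuming SciPy is available) feeds Example 74 to scipy.optimize.linprog; since linprog minimizes, the objective is negated.

```python
from scipy.optimize import linprog

# Max pi = 10*x1 + 30*x2  ->  Min -10*x1 - 30*x2
c = [-10, -30]
A_ub = [[5, 3],   # 5x1 + 3x2 <= 40
        [1, 0],   # x1 <= 5
        [0, 1]]   # x2 <= 10
b_ub = [40, 5, 10]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
x_opt, pi_opt = res.x, -res.fun
```

The solver returns x_opt = (2, 10) with π = 320, the same optimum the simplex method reaches by hand in Example 76.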
Proposition 32 (The Globality Theorem)
If the feasible set F of an optimization problem is a closed convex set, and if the
objective function is a continuous concave (convex) function over the feasible set, then
i) any local maximum (minimum) will also be a global maximum (minimum) and
ii) the points in F at which the objective function is optimized will constitute a convex
set.
Note that any linear program satisfies the assumptions of the globality theorem. Indeed, the objective function is linear; thus, it is both a concave and a convex function. On the other hand, each inequality restriction defines a closed half-space, which is also a convex set. Since the intersection of a finite number of closed convex sets is also a closed convex set, it follows that the feasible set F is closed and convex.
The theorem says that for a linear program there is no diﬀerence between a local
optimum and a global optimum, and if a pair of optimal solutions to a linear program
exists, then any convex combination of the two must also be an optimal solution.
Deﬁnition 47 An extreme point of a linear program is a point in its feasible set which
cannot be derived from a convex combination of any other two points in the feasible set.
Example 75 For the problem in Example 74 the extreme points are (0, 0), (0, 10), (2, 10), (5, 5), (5, 0). They are just the corners of the feasible region represented in the (x_1, x_2) coordinate plane.
Proposition 33 The optimal solution of a linear program is an extreme point.
3.3.2 The Simplex Method
The next question is to develop an algorithm which will enable us to solve a linear program.
Such an algorithm is called the simplex method.
The simplex method is a systematic procedure for ﬁnding the optimal solution of a
linear program. Loosely speaking, using this method we move from one extreme point of
the feasible region to another until the optimal point is attained.
The following two examples illustrate how the simplex method can be applied:
Example 76 Consider again the maximization problem in Example 74.
Step 1. Transform each inequality constraint into an equality constraint by inserting
a dummy variable (slack variable) in each constraint.
After applying this step, the problem becomes:
Max π = 10x_1 + 30x_2
subject to 5x_1 + 3x_2 + d_1 = 40
x_1 + d_2 = 5
x_2 + d_3 = 10
x_1 ≥ 0, x_2 ≥ 0, d_1 ≥ 0, d_2 ≥ 0, d_3 ≥ 0.
The extreme points of the new feasible set are called basic feasible solutions (BFS). For each extreme point of the original feasible set there is a BFS. For example, the BFS corresponding to the extreme point (x_1, x_2) = (0, 0) is (x_1, x_2, d_1, d_2, d_3) = (0, 0, 40, 5, 10).
Step 2. Find a BFS to be able to start the algorithm.
A BFS is easy to find when (0, 0) is in the feasible set, as it is in our example. Here, the BFS is (0, 0, 40, 5, 10). In general, finding a BFS requires the introduction of additional artificial variables. Example 77 illustrates this situation.
Step 3. Set up a simplex tableau like the following one:

Row  π   x_1  x_2  d_1  d_2  d_3  Constant
0    1   −10  −30  0    0    0    0
1    0   5    3    1    0    0    40
2    0   1    0    0    1    0    5
3    0   0    1    0    0    1    10

Row 0 contains the coefficients of the objective function, written as π − 10x_1 − 30x_2 = 0. Rows 1, 2, 3 contain the coefficients of the constraints as they appear in the matrix of coefficients.
In row 0 we can see which variables have nonzero values in the current BFS: they correspond to the zeros in row 0. In this case, the nonzero variables are d_1, d_2, d_3. These nonzero variables always correspond to a basis in R^m, where m is the number of constraints (see the general form of a linear program). The current basis is formed by the column coefficient vectors which appear in the tableau in rows 1, 2, 3 under d_1, d_2, d_3 (m = 3).
The rightmost column shows the nonzero values of the current BFS (in rows 1, 2, 3) and the value of the objective function for the current BFS (in row 0). That is, d_1 = 40, d_2 = 5, d_3 = 10, π = 0.
Step 4. Choose the pivot.
The simplex method ﬁnds the optimal solution by moving from one BFS to another
BFS until the optimum is reached. The new BFS is chosen such that the value of the
objective function is improved (that is, it is larger for a maximization problem and lower
for a minimization problem).
In order to switch to another BFS, we need to replace a variable currently in the basis by a variable which is not in the basis. This amounts to choosing a pivot element, which will indicate both the entering variable and the exiting variable.
The pivot element is selected in the following way. The entering variable (and, correspondingly, the pivot column) is the variable having the lowest negative value in row 0 (the largest positive value for a minimization problem). Once we have determined the pivot column, the pivot row is found in this way:
• pick the strictly positive elements in the pivot column, except for the element in row 0;
• select the corresponding elements in the constant column and divide them by the strictly positive elements in the pivot column;
• the pivot row corresponds to the minimum of these ratios.
In our example, the pivot column is the column of x_2 (since −30 < −10) and the pivot row is row 3 (since min{40/3, 10/1} = 10/1); thus, the pivot element is the 1 which lies at the intersection of the column of x_2 and row 3.
Step 5. Make the appropriate transformations to reduce the pivot column to a unit vector (that is, a vector with 1 in the place of the pivot element and 0 in the rest).
Usually, first we have to divide the pivot row by the pivot element to obtain 1 in the place of the pivot element. However, this is not necessary in our case since we already have 1 in that position.
Then the new version of the pivot row is used to get zeros in the other places of the pivot column. In our example, we multiply the pivot row by 30 and add it to row 0; then we multiply the pivot row by −3 and add it to row 1.
Hence we obtain a new simplex tableau:

Row  π   x_1  x_2  d_1  d_2  d_3  Constant
0    1   −10  0    0    0    30   300
1    0   5    0    1    0    −3   10
2    0   1    0    0    1    0    5
3    0   0    1    0    0    1    10
Step 6. Check whether row 0 still contains negative values (positive values for a minimization problem). If it does, return to step 4. If it does not, the optimal solution has been found.
In our example, we still have a negative value in row 0 (−10). According to step 4, the column of x_1 is the pivot column (x_1 enters the basis). Since min{10/5, 5/1} = 10/5, it follows that row 1 is the pivot row. Thus, the 5 in row 1 is the new pivot element. Applying step 5, we obtain a third simplex tableau:
Row  π   x_1  x_2  d_1   d_2  d_3   Constant
0    1   0    0    2     0    24    320
1    0   1    0    1/5   0    −3/5  2
2    0   0    0    −1/5  1    3/5   3
3    0   0    1    0     0    1     10

There is no negative element left in row 0, and the current basis consists of x_1, x_2, d_2. Thus, the optimal solution can be read from the rightmost column: x_1 = 2, x_2 = 10, d_2 = 3, π = 320. The correspondence between the optimal values and the variables in the current basis is made using the 1's in the column vectors of x_1, x_2, d_2.
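Steps 3–6 mechanize readily. The sketch below (our own minimal implementation, not from the text) runs the pivoting rules on the tableau of Example 76, storing rows 0–3 with the π column dropped since it never changes.

```python
import numpy as np

# Columns: x1, x2, d1, d2, d3, constant (the pi column is omitted)
T = np.array([[-10.0, -30.0, 0, 0, 0, 0],    # row 0: pi - 10*x1 - 30*x2 = 0
              [  5.0,   3.0, 1, 0, 0, 40],   # 5x1 + 3x2 + d1 = 40
              [  1.0,   0.0, 0, 1, 0, 5],    # x1 + d2 = 5
              [  0.0,   1.0, 0, 0, 1, 10]])  # x2 + d3 = 10

def pivot_once(T):
    """One simplex iteration; returns False once row 0 has no negative entry."""
    col = int(np.argmin(T[0, :-1]))
    if T[0, col] >= 0:
        return False
    # Ratio test over the strictly positive entries of the pivot column
    ratios = [T[i, -1] / T[i, col] if T[i, col] > 0 else np.inf
              for i in range(1, T.shape[0])]
    row = 1 + int(np.argmin(ratios))
    T[row] /= T[row, col]                    # make the pivot element 1
    for i in range(T.shape[0]):
        if i != row:
            T[i] -= T[i, col] * T[row]       # zero out the rest of the column
    return True

while pivot_once(T):
    pass

pi_opt = T[0, -1]                            # value of the objective function
x1 = T[int(np.argmax(T[1:, 0])) + 1, -1]     # read x1 from its unit column
x2 = T[int(np.argmax(T[1:, 1])) + 1, -1]
```

Two pivots reproduce the tableaux above and stop at π = 320, x_1 = 2, x_2 = 10.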
Example 77 Consider the minimization problem:

Min C = x_1 + x_2
subject to x_1 − x_2 ≥ 0
−x_1 − x_2 ≥ −6
x_2 ≥ 1
x_1 ≥ 0, x_2 ≥ 0.
After applying step 1 of the simplex algorithm, we obtain:

Min C = x_1 + x_2
subject to x_1 − x_2 − d_1 = 0
−x_1 − x_2 − d_2 = −6
x_2 − d_3 = 1
x_1 ≥ 0, x_2 ≥ 0, d_1 ≥ 0, d_2 ≥ 0, d_3 ≥ 0.
In this example, (0, 0) is not in the feasible set. One way of finding a starting BFS is to transform the problem by adding 3 more artificial variables f_1, f_2, f_3 to the linear program as follows:

Min C = x_1 + x_2 + M(f_1 + f_2 + f_3)
subject to x_1 − x_2 − d_1 + f_1 = 0
x_1 + x_2 + d_2 + f_2 = 6
x_2 − d_3 + f_3 = 1
x_1 ≥ 0, x_2 ≥ 0, d_1 ≥ 0, d_2 ≥ 0, d_3 ≥ 0,
f_1 ≥ 0, f_2 ≥ 0, f_3 ≥ 0.

Before adding the artificial variables, the second constraint is multiplied by −1 to obtain a positive number on the right-hand side.
As long as M is high enough, the optimal solution will contain f_1 = 0, f_2 = 0, f_3 = 0, and thus the optimum is not affected by including the artificial variables in the linear program. In our case, we choose M = 100. (In maximization problems M is chosen very low, i.e. as a large negative penalty.)
Now, the starting BFS is (x_1, x_2, d_1, d_2, d_3, f_1, f_2, f_3) = (0, 0, 0, 0, 0, 0, 6, 1) and the first simplex tableau is:

Row  C   x_1  x_2  d_1   d_2  d_3   f_1   f_2   f_3   Constant
0    1   −1   −1   0     0    0     −100  −100  −100  0
1    0   1    −1   −1    0    0     1     0     0     0
2    0   1    1    0     1    0     0     1     0     6
3    0   0    1    0     0    −1    0     0     1     1

In order to obtain unit vectors in the columns of f_1, f_2 and f_3, we add 100(row 1 + row 2 + row 3) to row 0:

Row  C   x_1  x_2  d_1   d_2  d_3   f_1  f_2  f_3  Constant
0    1   199  99   −100  100  −100  0    0    0    700
1    0   1    −1   −1    0    0     1    0    0    0
2    0   1    1    0     1    0     0    1    0    6
3    0   0    1    0     0    −1    0    0    1    1
Proceeding with the other steps of the simplex algorithm will finally lead to the optimal solution x_1 = 1, x_2 = 1, C = 2.
Sometimes the starting BFS (or the starting base) can be found in a simpler way than
by using the general method that was just illustrated. For example, our linear program
can be written as:

Min C = x_1 + x_2
subject to −x_1 + x_2 + d_1 = 0
x_1 + x_2 + d_2 = 6
x_2 − d_3 = 1
x_1 ≥ 0, x_2 ≥ 0, d_1 ≥ 0, d_2 ≥ 0, d_3 ≥ 0.
Taking x_1 = 0, x_2 = 0 we get d_1 = 0, d_2 = 6, d_3 = −1. Only d_3 contradicts the constraint d_3 ≥ 0. Thus, we need to add only one artificial variable f to the linear program, and the variables in the starting basis will be d_1, d_2, f.
The complete sequence of simplex tableaux leading to the optimal solution is given below.

Row  C   x_1  x_2  d_1  d_2  d_3   f     Constant
0    1   −1   −1   0    0    0     −100  0
1    0   −1   1    1    0    0     0     0
2    0   1    1    0    1    0     0     6
3    0   0    1    0    0    −1    1     1

0    1   −1   99   0    0    −100  0     100
1    0   −1   1*   1    0    0     0     0
2    0   1    1    0    1    0     0     6
3    0   0    1    0    0    −1    1     1

0    1   98   0    −99  0    −100  0     100
1    0   −1   1    1    0    0     0     0
2    0   2    0    −1   1    0     0     6
3    0   1*   0    −1   0    −1    1     1

0    1   0    0    −1   0    −2    −98   2
1    0   0    1    0    0    −1    1     1
2    0   0    0    1    1    2     −2    4
3    0   1    0    −1   0    −1    1     1
(The asterisks mark the pivot elements.)
As can be seen from the last tableau, the optimum is x_1 = 1, x_2 = 1, d_2 = 4, C = 2.
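As a cross-check on the big-M computation, the original program of Example 77 can be handed to a modern solver directly, with no artificial variables; the sketch below (assuming SciPy) flips the ≥ constraints to ≤ for scipy.optimize.linprog.

```python
from scipy.optimize import linprog

# Min C = x1 + x2  s.t.  x1 - x2 >= 0,  -x1 - x2 >= -6,  x2 >= 1,  x >= 0
c = [1, 1]
A_ub = [[-1,  1],   # -(x1 - x2) <= 0
        [ 1,  1],   #   x1 + x2  <= 6
        [ 0, -1]]   #       -x2  <= -1
b_ub = [0, 6, -1]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
x_opt, C_opt = res.x, res.fun
```

The solver reproduces x_1 = 1, x_2 = 1, C = 2.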
Economics Application 9 (Economic Interpretation of a Maximization Program)
The maximization program in the general form is the production problem facing a firm which has to produce n goods such that it maximizes its revenue subject to m resource (factor) constraints.
The variables have the following economic interpretations:
• x_j is the amount produced of the jth product;
• r_i is the amount of the ith resource available;
• a_ij is the amount of the ith resource used in producing a unit of the jth product;
• c_j is the price of a unit of product j.
The optimal solution to the maximization program indicates the optimal quantities of
each good the ﬁrm should produce.
Example 78 Let us find an economic interpretation of the maximization program given in Example 74:

Max π = 10x_1 + 30x_2
subject to 5x_1 + 3x_2 ≤ 40
x_1 ≤ 5
x_2 ≤ 10
x_1 ≥ 0, x_2 ≥ 0.

A firm has to produce two goods using three kinds of resources available in the amounts 40, 5, 10, respectively. The first resource is used in the production of both goods: five units are necessary to produce one unit of good 1, and three units to produce one unit of good 2.
The second resource is used only in producing good 1 and the third resource is used only
in producing good 2.
The question is, how much of each good should the ﬁrm produce in order to maximize
its revenue if the sale prices of goods are 10 and 30, respectively? The solution (2, 10)
gives the optimal amounts the ﬁrm should produce.
Economics Application 10 (Economic Interpretation of a Minimization Program)
The minimization program (see the general form) is also connected to the firm's production problem. This time the firm wants to produce a fixed combination of m goods using n resources (factors). The problem lies in choosing the amount of each resource such that the total cost of all resources used in the production of the fixed combination of goods is minimized.
The variables have the following economic interpretations:
• x_j is the amount of the jth resource used in the production process;
• r_i is the amount of good i the firm decides to produce;
• a_ij is the marginal productivity of resource j in producing good i;
• c_j is the price of a unit of resource j.
The optimal solution to the minimization program indicates the optimal quantities of
each factor the ﬁrm should employ.
Example 79 Consider again the minimization program in Example 77:

Min C = x_1 + x_2
subject to x_1 − x_2 ≥ 0
x_1 + x_2 ≤ 6
x_2 ≥ 1
x_1 ≥ 0, x_2 ≥ 0.
What economic interpretation can be assigned to this linear program?
A firm has to produce two goods: at most six units of the first good and at least one unit of the second good. Two factors are used in the production process and the price of both
factors is one. The marginal productivity of both factors in producing good 1 is one and
the marginal productivity of the factors in producing good 2 are zero and one, respectively
(good 2 uses only the second factor). The production process also requires that the amount
of the second factor not exceed the amount of the ﬁrst factor.
The question is how much of each factor the ﬁrm should employ in order to minimize
the total factor cost. The solution (1, 1) gives the optimal factor amounts.
3.3.3 Duality
As a matter of fact, the maximization and the minimization programs are closely related.
This relationship is called duality.
Definition 48 The dual (program) of the maximization program

Maximize π = c′x
subject to Ax ≤ r
and x ≥ 0

is the minimization program

Minimize π* = r′y
subject to A′y ≥ c
and y ≥ 0.

Definition 49 The dual (program) of the minimization program

Minimize C = c′x
subject to Ax ≥ r
and x ≥ 0

is the maximization program

Maximize C* = r′y
subject to A′y ≤ c
and y ≥ 0.
Example 80 The dual of the maximization program in Example 74 is

Min π* = 40y_1 + 5y_2 + 10y_3
subject to 5y_1 + y_2 ≥ 10
3y_1 + y_3 ≥ 30
y_1 ≥ 0, y_2 ≥ 0, y_3 ≥ 0.
Example 81 The dual of the minimization program in Example 77 is

Max C* = −6y_2 + y_3
subject to y_1 − y_2 ≤ 1
−y_1 − y_2 + y_3 ≤ 1
y_1 ≥ 0, y_2 ≥ 0, y_3 ≥ 0.
The following duality theorems clarify the relationship between the dual and the primal.
Proposition 34 The optimal values of the primal and the dual objective functions are
always identical, if the optimal solutions exist.
Proposition 35
1. If a certain choice variable in a primal (dual) program is optimally nonzero then the
corresponding dummy variable in the dual (primal) must be optimally zero.
2. If a certain dummy variable in a primal (dual) program is optimally nonzero then
the corresponding choice variable in the dual (primal) must be optimally zero.
The last simplex tableau of a linear program offers not only the optimal solution to that program but also the optimal solution to the dual program.
Recipe 14 – How to solve the dual program:
The optimal solution to the dual program can be read from row 0 of the last simplex
tableau of the primal in the following way:
1. the absolute values of the elements in the columns corresponding to the primal
dummy variables are the values of the dual choice variables.
2. the absolute values of the elements in the columns corresponding to the primal choice
variables are the values of the dual dummy variables.
Example 82 From the last simplex tableau in Example 74 we can infer the optimal solution to the problem in Example 80. It is y_1 = 2, y_2 = 0, y_3 = 24, π* = 320.
Example 83 The last simplex tableau in Example 77 implies that the optimal solution to the problem in Example 81 is y_1 = 1, y_2 = 0, y_3 = 2, C* = 2.
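Proposition 34 and Recipe 14 can be verified numerically for Example 80: solving the dual directly should reproduce both the primal optimum 320 and the multipliers read off the last primal tableau. A sketch assuming SciPy, with the ≥ constraints negated for linprog:

```python
from scipy.optimize import linprog

# Dual of Example 74: Min 40*y1 + 5*y2 + 10*y3
# s.t. 5*y1 + y2 >= 10,  3*y1 + y3 >= 30,  y >= 0
c = [40, 5, 10]
A_ub = [[-5, -1,  0],
        [-3,  0, -1]]
b_ub = [-10, -30]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
y_opt, dual_value = res.x, res.fun
```

Here dual_value equals 320, the primal optimum, and y_opt = (2, 0, 24) matches Example 82.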
Further reading:
• Intriligator, M. Mathematical Optimization and Economic Theory.
• Luenberger, D.G. Introduction to Linear and Nonlinear Programming.
4 Dynamics
4.1 Diﬀerential Equations
Definition 51 An equation F(x, y, y′, . . . , y^(n)) = 0 which assumes a functional relationship between the independent variable x and the dependent variable y and includes x, y and derivatives of y with respect to x is called an ordinary differential equation of order n.
The order of a differential equation is determined by the order of the highest derivative in the equation.
A function y = y(x) is said to be a solution of this differential equation, in some open interval I, if F(x, y(x), y′(x), . . . , y^(n)(x)) = 0 for all x ∈ I.
4.1.1 Diﬀerential Equations of the First Order
Let us consider first order differential equations, y′ = f(x, y).
The solution y(x) is said to solve Cauchy's problem (or the initial value problem) with the initial values (x_0, y_0) if y(x_0) = y_0.
Proposition 36 (The Existence Theorem)
If f is continuous in an open domain D, then for any given pair (x_0, y_0) ∈ D there exists a solution y(x) of y′ = f(x, y) such that y(x_0) = y_0.
Proposition 37 (The Uniqueness Theorem)
If f and ∂f/∂y are continuous in an open domain D, then given any (x_0, y_0) ∈ D there exists a unique solution y(x) of y′ = f(x, y) such that y(x_0) = y_0.
Definition 52 A differential equation of the form y′ = f(y) is said to be autonomous (i.e. y′ is determined by y alone).
Note that the differential equation gives us the slope of the solution curves at all points of the region D. Thus, in particular, the solution curves cross the curve f(x, y) = k (k is a constant) with slope k. This curve is called the isocline of slope k. If we draw the set of isoclines obtained by taking different real values of k, we can schematically sketch the family of solution curves in the (x, y)-plane.
Recipe 15 – How to Solve a Diﬀerential Equation of the First Order:
There is no general method of solving diﬀerential equations. However, in some cases
the solution can be easily obtained:
• The Variables Separable Case.
If y′ = f(x)g(y), then dy/g(y) = f(x)dx. By integrating both sides of the latter equation, we obtain a solution.
Example 84 Solve Cauchy's problem (x^2 + 1)y′ + 2xy^2 = 0, given y(0) = 1.
If we rearrange the terms, this equation becomes equivalent to

dy/y^2 = −2x dx/(x^2 + 1).

The integration of both sides yields

1/y = ln(x^2 + 1) + C,

or

y(x) = 1/(ln(x^2 + 1) + C).

Since we must satisfy the condition y(0) = 1, we can evaluate the constant C: C = 1.
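The closed-form answer can be checked against a numerical integration of the same Cauchy problem (a sketch assuming SciPy is available): write the equation as y′ = −2xy²/(x² + 1), integrate from the initial condition, and compare at x = 1 with y = 1/(ln(x² + 1) + 1).

```python
import numpy as np
from scipy.integrate import solve_ivp

# y' = -2*x*y^2 / (x^2 + 1), y(0) = 1
sol = solve_ivp(lambda x, y: -2 * x * y**2 / (x**2 + 1),
                (0, 1), [1.0], rtol=1e-10, atol=1e-12)

y_numeric = sol.y[0, -1]
y_exact = 1 / (np.log(2) + 1)   # closed-form solution evaluated at x = 1
```

The two values agree to the integrator's tolerance.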
• Differential Equations with Homogeneous Coefficients.

Definition 53 A function f(x, y) is called homogeneous of degree n if for any λ

f(λx, λy) = λ^n f(x, y).

If we have the differential equation M(x, y)dx + N(x, y)dy = 0 and M(x, y), N(x, y) are homogeneous of the same degree, then the change of variable y = tx reduces this equation to the variables separable case.

Example 85 Solve (y^2 − 2xy)dx + x^2 dy = 0.
The coefficients are homogeneous of degree 2. Note that y ≡ 0 is a special solution. If y ≠ 0, let us change the variable: y = tx, dy = t dx + x dt. Thus

x^2(t^2 − 2t)dx + x^2(t dx + x dt) = 0.

Dividing by x^2 (since y ≠ 0, x ≠ 0 as well) and rearranging the terms, we get a variables separable equation

dx/x = −dt/(t^2 − t).

Taking the integrals of both sides,

∫ dx/x = ln |x|,
∫ −dt/(t^2 − t) = ∫ (1/t − 1/(t − 1)) dt = ln |t| − ln |t − 1| + C,

therefore

x = Ct/(t − 1), or, in terms of y, x = Cy/(y − x).

We can single out y as a function of x: y = x^2/(x − C).
• Exact Differential Equations.
If we have the differential equation M(x, y)dx + N(x, y)dy = 0 and ∂M(x, y)/∂y ≡ ∂N(x, y)/∂x, then all the solutions of this equation are given by F(x, y) = C (C is a constant), where F(x, y) is such that

∂F/∂x = M(x, y), ∂F/∂y = N(x, y).
Example 86 Solve (y/x)dx + (y^3 + ln x)dy = 0.
Since ∂/∂y (y/x) = 1/x = ∂/∂x (y^3 + ln x), we have checked that we are dealing with an exact differential equation. Therefore, we need to find F(x, y) such that

F_x = y/x =⇒ F(x, y) = ∫ (y/x) dx = y ln x + φ(y),

and

F_y = y^3 + ln x.

To find φ(y), note that

F_y = y^3 + ln x = ln x + φ′(y) =⇒ φ(y) = ∫ y^3 dy = (1/4)y^4 + C.

Therefore, all solutions are given by the implicit function 4y ln x + y^4 = C.
• Linear Differential Equations of the First Order.
If we need to solve y′ + p(x)y = q(x), then all the solutions are given by the formula

y(x) = e^{−∫p(x)dx} (C + ∫ q(x) e^{∫p(x)dx} dx).
Example 87 Solve xy′ − 2y = 2x^4.
In other words, we need to solve y′ − (2/x)y = 2x^3. Therefore,

y(x) = e^{∫(2/x)dx} (C + ∫ 2x^3 e^{−∫(2/x)dx} dx) = e^{ln x^2} (C + ∫ 2x^3 e^{ln(1/x^2)} dx) = x^2 (C + x^2).
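One can always confirm such a formula by direct substitution. A SymPy sketch for Example 87 (the symbol names are ours):

```python
import sympy as sp

x, C = sp.symbols("x C", positive=True)
y = x**2 * (C + x**2)                  # candidate general solution

# Substitute into x*y' - 2*y - 2*x^4; the residual should vanish
residual = sp.simplify(x * sp.diff(y, x) - 2 * y - 2 * x**4)
```

The residual simplifies to 0, so y = x²(C + x²) solves xy′ − 2y = 2x⁴ for every C.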
4.1.2 Linear Differential Equations of a Higher Order with Constant Coefficients
These are equations of the form

y^(n) + a_1 y^(n−1) + . . . + a_{n−1} y′ + a_n y = f(x).

If f(x) ≡ 0 the equation is called homogeneous; otherwise it is called non-homogeneous.
Recipe 16 – How to Find the General Solution y_g(x) of the Homogeneous Equation:
The general solution is a sum of basic solutions y_1, . . . , y_n:

y_g(x) = C_1 y_1(x) + . . . + C_n y_n(x), where C_1, . . . , C_n are arbitrary constants.

These arbitrary constants, however, can be defined in a unique way once we set up the initial value (Cauchy) problem: find y(x) such that

y(x) = y^0_0, y′(x) = y^0_1, . . . , y^(n−1)(x) = y^0_{n−1} at x = x_0,

where x_0, y^0_0, y^0_1, . . . , y^0_{n−1} are given initial values (real numbers).
To find all basic solutions, we proceed as follows:
1. Solve the characteristic equation

λ^n + a_1 λ^{n−1} + . . . + a_{n−1} λ + a_n = 0

for λ. The roots of this equation are λ_1, . . . , λ_n. Some of these roots may be complex numbers. We may also have repeated roots.
2. If λ_i is a simple real root, the basic solution corresponding to this root is y_i(x) = e^{λ_i x}.
3. If λ_i is a repeated real root of degree k, it generates k basic solutions

y_{i1}(x) = e^{λ_i x}, y_{i2}(x) = x e^{λ_i x}, . . . , y_{ik}(x) = x^{k−1} e^{λ_i x}.
4. If λ_j is a simple complex root, λ_j = α_j + iβ_j, i = √−1, then the complex conjugate of λ_j, say λ_{j+1} = α_j − iβ_j, is also a root of the characteristic equation. Therefore the pair λ_j, λ_{j+1} gives rise to two basic solutions:

y_{j1} = e^{α_j x} cos β_j x, y_{j2} = e^{α_j x} sin β_j x.
5. If λ_j is a repeated complex root of degree l, λ_j = α_j + iβ_j, then the complex conjugate of λ_j, say λ_{j+1} = α_j − iβ_j, is also a repeated root of degree l. These 2l roots generate 2l basic solutions

y_{j1} = e^{α_j x} cos β_j x, y_{j2} = x e^{α_j x} cos β_j x, . . . , y_{jl} = x^{l−1} e^{α_j x} cos β_j x,
y_{j,l+1} = e^{α_j x} sin β_j x, y_{j,l+2} = x e^{α_j x} sin β_j x, . . . , y_{j,2l} = x^{l−1} e^{α_j x} sin β_j x.
Recipe 17 – How to Solve the Non-Homogeneous Equation:
The general solution of the non-homogeneous equation is y_nh(x) = y_g(x) + y_p(x), where y_g(x) is the general solution of the homogeneous equation and y_p(x) is a particular solution of the non-homogeneous equation, i.e. any function which solves the non-homogeneous equation.
Recipe 18 – How to Find a Particular Solution of the Non-Homogeneous Equation:
1. If f(x) = P_k(x)e^{bx}, where P_k(x) is a polynomial of degree k, then a particular solution is

y_p(x) = x^s Q_k(x) e^{bx},

where Q_k(x) is a polynomial of the same degree k. If b is not a root of the characteristic equation, s = 0; if b is a root of the characteristic equation of multiplicity m, then s = m.

2. If f(x) = P_k(x)e^{px} cos qx + Q_k(x)e^{px} sin qx, where P_k(x), Q_k(x) are polynomials of degree k, then a particular solution can be found in the form

y_p(x) = x^s R_k(x) e^{px} cos qx + x^s T_k(x) e^{px} sin qx,

where R_k(x), T_k(x) are polynomials of degree k. If p + iq is not a root of the characteristic equation, s = 0; if p + iq is a root of the characteristic equation of multiplicity m, then s = m.
3. The general method for finding a particular solution of a non-homogeneous equation is called the variation of parameters (also known as the variation of constants). Suppose we have the general solution of the homogeneous equation

y_g = C_1 y_1(x) + . . . + C_n y_n(x),

where the y_i(x) are basic solutions. Treating the constants C_1, . . . , C_n as functions of x, say u_1(x), . . . , u_n(x), we can express a particular solution of the non-homogeneous equation as

y_p(x) = u_1(x) y_1(x) + . . . + u_n(x) y_n(x),

where the derivatives u′_1(x), . . . , u′_n(x) solve the system

u′_1(x) y_1(x) + . . . + u′_n(x) y_n(x) = 0,
u′_1(x) y′_1(x) + . . . + u′_n(x) y′_n(x) = 0,
. . .
u′_1(x) y_1^(n−2)(x) + . . . + u′_n(x) y_n^(n−2)(x) = 0,
u′_1(x) y_1^(n−1)(x) + . . . + u′_n(x) y_n^(n−1)(x) = f(x).
4. If f(x) = f_1(x) + f_2(x) + . . . + f_r(x) and y_{p1}(x), . . . , y_{pr}(x) are particular solutions corresponding to f_1(x), . . . , f_r(x) respectively, then

y_p(x) = y_{p1}(x) + . . . + y_{pr}(x).
Example 88 Solve y″ − 5y′ + 6y = x^2 + e^x − 5.
The characteristic roots are λ_1 = 2, λ_2 = 3; therefore the general solution of the homogeneous equation is

y_g(x) = C_1 e^{2x} + C_2 e^{3x}.

We search for a particular solution in the form

y_p(x) = ax^2 + bx + c + de^x.

To find the coefficients a, b, c, d, let us substitute the particular solution into the initial equation:

2a + de^x − 5(2ax + b + de^x) + 6(ax^2 + bx + c + de^x) = x^2 − 5 + e^x.

Equating term by term the coefficients on the left-hand side to those on the right-hand side, we obtain

6a = 1, −5 · 2a + 6b = 0, 2a − 5b + 6c = −5, d − 5d + 6d = 1.

Thus a = 1/6, b = 5/18, c = −71/108, d = 1/2.
Finally, the general solution of the non-homogeneous equation is

y_nh(x) = C_1 e^{2x} + C_2 e^{3x} + x^2/6 + 5x/18 − 71/108 + e^x/2.
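Again the result can be verified by direct substitution (a SymPy sketch; the symbol names are ours):

```python
import sympy as sp

x, C1, C2 = sp.symbols("x C1 C2")
y = (C1 * sp.exp(2 * x) + C2 * sp.exp(3 * x)
     + x**2 / 6 + sp.Rational(5, 18) * x - sp.Rational(71, 108)
     + sp.exp(x) / 2)

# Substitute into y'' - 5y' + 6y - (x^2 + e^x - 5); the residual should vanish
residual = sp.simplify(sp.diff(y, x, 2) - 5 * sp.diff(y, x) + 6 * y
                       - (x**2 + sp.exp(x) - 5))
```

The residual simplifies to 0 for arbitrary C_1, C_2, confirming both the homogeneous and the particular parts.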
Example 89 Solve y″ − 2y′ + y = e^x/x.
The general solution of the homogeneous equation is y_g(x) = C_1 x e^x + C_2 e^x (the characteristic equation has the repeated root λ = 1 of degree 2). We look for a particular solution of the non-homogeneous equation in the form

y_p(x) = u_1(x) x e^x + u_2(x) e^x.

We have

u′_1(x) x e^x + u′_2(x) e^x = 0,
u′_1(x)(e^x + x e^x) + u′_2(x) e^x = e^x/x.

Therefore u′_1(x) = 1/x, u′_2(x) = −1. Integrating, we find that u_1(x) = ln |x|, u_2(x) = −x. Thus y_p(x) = x e^x ln |x| − x e^x and, absorbing the term −x e^x into C_1 x e^x,

y_nh(x) = y_g(x) + y_p(x) = x e^x ln |x| + C_1 x e^x + C_2 e^x.
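A quick SymPy check that y_p = x eˣ ln x satisfies the equation (restricting to x > 0 so that ln |x| = ln x):

```python
import sympy as sp

x = sp.symbols("x", positive=True)
y_p = x * sp.exp(x) * sp.log(x)        # particular solution from Example 89

# Substitute into y'' - 2y' + y - e^x/x; the residual should vanish
residual = sp.simplify(sp.diff(y_p, x, 2) - 2 * sp.diff(y_p, x) + y_p
                       - sp.exp(x) / x)
```

The residual simplifies to 0, confirming the variation-of-parameters computation.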
4.1.3 Systems of First Order Linear Differential Equations
The general form of a system of first order differential equations is:

ẋ(t) = A(t)x(t) + b(t), x(0) = x_0,

where t is the independent variable ("time"), x(t) = (x_1(t), . . . , x_n(t))′ is a vector of dependent variables, A(t) = (a_ij(t))_[n×n] is a real n × n matrix with variable coefficients, and b(t) = (b_1(t), . . . , b_n(t))′ is a variable n-vector.
In what follows, we concentrate on the case when A and b do not depend on t, which is the case of a system of differential equations with constant coefficients:

ẋ(t) = Ax(t) + b, x(0) = x_0. (11)

We also assume that A is nonsingular.
The solution of system (11) can be determined in two steps:
• First, we consider the homogeneous system (corresponding to b = 0):

ẋ(t) = Ax(t), x(0) = x_0. (12)

The solution x_c(t) of this system is called the complementary solution.
• Second, we find a particular solution x_p of the system (11), which is called the particular integral. The constant vector x_p is simply the solution to Ax_p = −b, i.e. x_p = −A^{−1}b.⁵
Given x_c(t) and x_p, the general solution of the system (11) is simply:

x(t) = x_c(t) + x_p.

We can find the solution to the homogeneous system (12) in two different ways.
First, we can eliminate n − 1 unknowns and reduce the system to one linear differential equation of order n.
Example 90 Let

ẋ = 2x + y,
ẏ = 3x + 4y.

Taking the derivative of the first equation and eliminating y and ẏ (ẏ = 3x + 4y = 3x + 4ẋ − 4·2x), we arrive at the second-order homogeneous linear differential equation

ẍ − 6ẋ + 5x = 0,

which has the general solution x(t) = C_1 e^t + C_2 e^{5t}. Since y(t) = ẋ − 2x, we find that y(t) = −C_1 e^t + 3C_2 e^{5t}.
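The closed-form solution of Example 90 can be cross-checked numerically. The sketch below (plain Python; the constants C1 = 1 and C2 = 0.5 are arbitrary choices of ours for the test) integrates the system with a classical fourth-order Runge-Kutta scheme and compares the result with the formulas above.

```python
import math

# Constants of the general solution (illustrative assumption).
C1, C2 = 1.0, 0.5

def x_exact(t):
    return C1 * math.exp(t) + C2 * math.exp(5 * t)

def y_exact(t):
    return -C1 * math.exp(t) + 3 * C2 * math.exp(5 * t)

def rhs(x, y):
    # The system x' = 2x + y, y' = 3x + 4y from Example 90.
    return 2 * x + y, 3 * x + 4 * y

def rk4(x, y, t_end, steps=4000):
    # Classical 4th-order Runge-Kutta integration from t = 0 to t_end.
    h = t_end / steps
    for _ in range(steps):
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + h / 2 * k1x, y + h / 2 * k1y)
        k3x, k3y = rhs(x + h / 2 * k2x, y + h / 2 * k2y)
        k4x, k4y = rhs(x + h * k3x, y + h * k3y)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
    return x, y

# Integrate from the exact initial values and compare at t = 1.
x_num, y_num = rk4(x_exact(0.0), y_exact(0.0), 1.0)
```

The numeric trajectory should agree with the closed form to high relative accuracy.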
The second way is to write the solution to (12) as

x(t) = e^{At} x_0,

where (by definition)

e^{At} = I + At + (A²t²)/2! + (A³t³)/3! + . . .

⁵ In a sense, x_p can be viewed as an affine shift, which restores the origin to a unique equilibrium of (11).

Unfortunately, this formula is of little practical use. In order to find a feasible formula for e^{At}, we distinguish three main cases.
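Although the power series is impractical for hand computation, it is straightforward to evaluate numerically. A minimal sketch (our own helper names), truncating the series after 30 terms for a 2×2 matrix and sanity-checking it on a diagonal matrix, where e^{At} = diag(e^t, e^{2t}):

```python
import math

def mat_mult(A, B):
    # 2x2 matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp_series(A, t, terms=30):
    # e^{At} = I + At + (At)^2/2! + ... , truncated after `terms` terms.
    At = [[A[i][j] * t for j in range(2)] for i in range(2)]
    result = [[1.0, 0.0], [0.0, 1.0]]   # start with the identity I
    power = [[1.0, 0.0], [0.0, 1.0]]    # running power (At)^n
    for n in range(1, terms):
        power = mat_mult(power, At)
        result = [[result[i][j] + power[i][j] / math.factorial(n)
                   for j in range(2)] for i in range(2)]
    return result

# For A = diag(1, 2) and t = 1 we expect diag(e, e^2).
E = mat_exp_series([[1.0, 0.0], [0.0, 2.0]], 1.0)
```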
Case 1: A has real and distinct eigenvalues
The fact that the eigenvalues are real and distinct implies that any corresponding
eigenvectors are linearly independent. Consequently, A is diagonalizable, i.e.
A = PΛP^{−1},

where P = [v_1, v_2, . . . , v_n] is a matrix composed of the eigenvectors of A and Λ is a diagonal matrix whose diagonal elements are the eigenvalues of A. Therefore, e^{At} = P e^{Λt} P^{−1}.
Thus, the solution of the system (12) can be written as:

x(t) = P e^{Λt} P^{−1} x_0 = P e^{Λt} c = c_1 v_1 e^{λ_1 t} + . . . + c_n v_n e^{λ_n t},

where c = (c_1, c_2, . . . , c_n)′ is a vector of constants determined from the initial conditions (c = P^{−1} x_0).
Case 2: A has real and repeated eigenvalues
First, consider the simpler case when A has only one eigenvalue λ which is repeated m times (m is called the algebraic multiplicity of λ). Generally, in this situation the maximum number of linearly independent eigenvectors corresponding to λ is less than m, meaning that we cannot construct the matrix P of linearly independent eigenvectors and, therefore, A is not diagonalizable.⁶
The solution in this case has the form:

x(t) = Σ_{i=1}^{m} c_i h_i(t),

where the h_i(t) are quasi-polynomials and the c_i are constants determined by the initial conditions. For example, if m = 3, we have:

h_1(t) = e^{λt} v_1,
h_2(t) = e^{λt} (t v_1 + v_2),
h_3(t) = e^{λt} ((t²/2) v_1 + t v_2 + v_3),

where v_1, v_2, v_3 are determined by the conditions:

(A − λI) v_i = v_{i−1},   v_0 = 0.
If A happens to have several eigenvalues which are repeated, the solution of (12) is obtained by finding the solution corresponding to each eigenvalue, and then adding up the individual solutions.

⁶ If A is a real and symmetric matrix, then A is always diagonalizable and the formula given for Case 1 can be applied.
Case 3: A has complex eigenvalues.
Since A is a real matrix, the complex eigenvalues always appear in pairs, i.e. if A has
the eigenvalue α + βi, then it also must have the eigenvalue α −βi.
Now consider the simpler case when A has only one pair of complex eigenvalues, λ_1 = α + βi and λ_2 = α − βi. Let v_1 and v_2 denote the eigenvectors corresponding to λ_1 and λ_2; then v_2 = v̄_1, where v̄_1 is the conjugate of v_1. The solution can be expressed as:
x(t) = e^{At} x_0 = P e^{Λt} P^{−1} x_0 = P e^{Λt} c
= c_1 v_1 e^{(α+βi)t} + c_2 v_2 e^{(α−βi)t}
= c_1 v_1 e^{αt} (cos βt + i sin βt) + c_2 v_2 e^{αt} (cos βt − i sin βt)
= (c_1 v_1 + c_2 v_2) e^{αt} cos βt + i(c_1 v_1 − c_2 v_2) e^{αt} sin βt
= h_1 e^{αt} cos βt + h_2 e^{αt} sin βt,

where h_1 = c_1 v_1 + c_2 v_2 and h_2 = i(c_1 v_1 − c_2 v_2) are real vectors.
Note that if A has more pairs of complex eigenvalues, then the solution of (12) is
obtained by ﬁnding the solution corresponding to each eigenvalue, and then adding up
the individual solutions. The same remark holds true when we encounter any combination
of the three cases discussed above.
Example 91 Consider

A =
[  5   2 ]
[ −4  −1 ]

Solving the characteristic equation det(A − λI) = λ² − 4λ + 3 = 0, we find the eigenvalues of A: λ_1 = 1, λ_2 = 3.

From the condition (A − λ_1 I)v_1 = 0 we find an eigenvector corresponding to λ_1: v_1 = (−1, 2)′.

Similarly, the condition (A − λ_2 I)v_2 = 0 gives an eigenvector corresponding to λ_2: v_2 = (−1, 1)′.

Therefore, the general solution of the system ẋ(t) = Ax(t) is

x(t) = c_1 v_1 e^{λ_1 t} + c_2 v_2 e^{λ_2 t},

or

x(t) = (x_1(t), x_2(t))′ = (−c_1 e^t − c_2 e^{3t},  2c_1 e^t + c_2 e^{3t})′.
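A quick numerical check of Example 91 (a sketch; the helper functions are ours): the eigenvalues of a 2×2 matrix follow from λ² − tr(A)λ + det(A) = 0, and the eigenvectors found above should satisfy Av = λv.

```python
import math

A = [[5.0, 2.0], [-4.0, -1.0]]

# Eigenvalues from the characteristic equation lambda^2 - tr*lambda + det = 0.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(tr * tr - 4 * det)
lam1, lam2 = (tr - disc) / 2, (tr + disc) / 2

def apply(M, v):
    # Matrix-vector product for a 2x2 matrix.
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

v1 = [-1.0, 2.0]   # eigenvector for lambda = 1 found in the example
v2 = [-1.0, 1.0]   # eigenvector for lambda = 3
```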
Example 92 Consider

A =
[  3   1 ]
[ −1   1 ]

det(A − λI) = λ² − 4λ + 4 = 0 gives the eigenvalue λ = 2, repeated twice.

From the condition (A − λI)v_1 = 0 we find v_1 = (−1, 1)′, and from (A − λI)v_2 = v_1 we obtain v_2 = (1, −2)′.

The solution of the system ẋ(t) = Ax(t) is

x(t) = c_1 h_1(t) + c_2 h_2(t) = c_1 v_1 e^{λt} + c_2 (t v_1 + v_2) e^{λt},

or

x(t) = (x_1(t), x_2(t))′ = (−c_1 + c_2 − c_2 t,  c_1 − 2c_2 + c_2 t)′ e^{2t}.
Example 93 Consider

A =
[  3   2 ]
[ −1   1 ]

The equation det(A − λI) = λ² − 4λ + 5 = 0 leads to the complex eigenvalues λ_1 = 2 + i, λ_2 = 2 − i.

For λ_1 we set up the equation (A − λ_1 I)v_1 = 0 and obtain:

v_1 = (1 + i, −1)′,   v_2 = v̄_1 = (1 − i, −1)′.

The solution of the system ẋ(t) = Ax(t) is

x(t) = c_1 v_1 e^{λ_1 t} + c_2 v_2 e^{λ_2 t}
= c_1 v_1 e^{(2+i)t} + c_2 v_2 e^{(2−i)t}
= c_1 v_1 e^{2t} (cos t + i sin t) + c_2 v_2 e^{2t} (cos t − i sin t).

Performing the computations, we finally obtain:

x(t) = (c_1 + c_2) e^{2t} (cos t − sin t, −cos t)′ + i(c_1 − c_2) e^{2t} (cos t + sin t, −sin t)′.
Now let the vector b be different from 0. To specify the solution to the system (11), we need a particular solution x_p. This is found from the condition ẋ = 0; thus, x_p = −A^{−1} b. Therefore, the general solution of the nonhomogeneous system is

x(t) = x_c(t) + x_p = P e^{Λt} P^{−1} c − A^{−1} b,

where the vector of constants c is determined by the initial condition x(0) = x_0 as c = x_0 + A^{−1} b.
Example 94 Consider

A =
[  5   2 ]
[ −4  −1 ]

b = (1, 2)′ and x_0 = (−1, 2)′.

In Example 91 we obtained

x_c(t) = (−c_1 e^t − c_2 e^{3t},  2c_1 e^t + c_2 e^{3t})′.

In addition, we have

c = (c_1, c_2)′ = x_0 + A^{−1} b = (−1, 2)′ + (−5/3, 14/3)′ = (−8/3, 20/3)′

and

−x_p = A^{−1} b = (−5/3, 14/3)′,   since   A^{−1} =
[ −1/3  −2/3 ]
[  4/3   5/3 ]

The solution of the system ẋ(t) = Ax(t) + b is

x(t) = x_c(t) + x_p = ((8/3) e^t − (20/3) e^{3t} + 5/3,  −(16/3) e^t + (20/3) e^{3t} − 14/3)′.
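The arithmetic of Example 94 can be verified exactly with Python's `fractions` module (a sketch; the variable names are ours):

```python
from fractions import Fraction as F

# Data of Example 94, stored as exact rationals.
A = [[F(5), F(2)], [F(-4), F(-1)]]
b = [F(1), F(2)]
x0 = [F(-1), F(2)]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]           # det A = 3
A_inv = [[ A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det,  A[0][0] / det]]

A_inv_b = [A_inv[0][0] * b[0] + A_inv[0][1] * b[1],
           A_inv[1][0] * b[0] + A_inv[1][1] * b[1]]
c = [x0[0] + A_inv_b[0], x0[1] + A_inv_b[1]]          # c = x0 + A^{-1} b
x_p = [-A_inv_b[0], -A_inv_b[1]]                      # x_p = -A^{-1} b
```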
Economics Application 11 (General Equilibrium)

Consider a market for n goods x = (x_1, x_2, . . . , x_n)′. Let p = (p_1, p_2, . . . , p_n)′ denote the vector of prices and D_i(p), S_i(p) the demand and supply for good i, i = 1, . . . , n. The general equilibrium is obtained for a price vector p* which satisfies the conditions D_i(p*) = S_i(p*) for all i = 1, . . . , n.

The dynamics of price adjustment toward the general equilibrium can be described by a system of first-order differential equations of the form:

ṗ = W[D(p) − S(p)] = WAp,   p(0) = p_0,

where W = diag(w_i) is a diagonal matrix with the speeds of adjustment w_i, i = 1, . . . , n, on the diagonal, D(p) = [D_1(p), . . . , D_n(p)]′ and S(p) = [S_1(p), . . . , S_n(p)]′ are linear functions of p, and A is a constant n × n matrix.

We can solve this system by applying the methods discussed above. For instance, if WA = PΛP^{−1}, then the solution is

p(t) = P e^{Λt} P^{−1} p_0.
Economics Application 12 (The Dynamic IS-LM Model)

A simplified dynamic IS-LM model can be specified by the following system of differential equations:

Ẏ = w_1 (I − S),
ṙ = w_2 [L(Y, r) − M],

with initial conditions

Y(0) = Y_0,   r(0) = r_0,

where:

• Y is national income;
• r is the interest rate;
• I = I_0 − αr is the investment function (α is a positive constant);
• S = s(Y − T) + (T − G) is national savings (T, G and s represent taxes, government purchases and the savings rate);
• L(Y, r) = a_1 Y − a_2 r is money demand (a_1 and a_2 are positive constants);
• M is money supply;
• w_1, w_2 are speeds of adjustment.
For simplicity, consider w_1 = 1 and w_2 = 1. The original system can be rewritten as:

(Ẏ, ṙ)′ = A (Y, r)′ + b,   where   A =
[ −s    −α  ]
[ a_1  −a_2 ]
and b = (I_0 − (1 − s)T + G,  −M)′.

Assume that A is diagonalizable, i.e. A = PΛP^{−1}, and has distinct eigenvalues λ_1, λ_2 with corresponding eigenvectors v_1, v_2. (λ_1, λ_2 are the solutions of the equation r² + (s + a_2)r + s a_2 + α a_1 = 0.)

The solution becomes:

(Y(t), r(t))′ = P e^{Λt} P^{−1} (x_0 + A^{−1} b) − A^{−1} b,

where x_0 = (Y_0, r_0)′.
We have

A^{−1} = (1/(α a_1 + s a_2)) ·
[ −a_2    α ]
[ −a_1   −s ]

and

A^{−1} b = (1/(α a_1 + s a_2)) · (−a_2 [I_0 − (1 − s)T + G] − αM,  −a_1 [I_0 − (1 − s)T + G] + sM)′.
Assume that P^{−1} has rows ṽ_1 and ṽ_2. Then we have:

P e^{Λt} P^{−1} = [v_1, v_2] · diag(e^{λ_1 t}, e^{λ_2 t}) ·
[ ṽ_1 ]
[ ṽ_2 ] = v_1 ṽ_1 e^{λ_1 t} + v_2 ṽ_2 e^{λ_2 t}.
Finally, the solution is:

(Y(t), r(t))′ = [v_1 ṽ_1 e^{λ_1 t} + v_2 ṽ_2 e^{λ_2 t}] · (Y_0 + (−a_2 S − αM)/(α a_1 + s a_2),  r_0 + (−a_1 S + sM)/(α a_1 + s a_2))′ − ((−a_2 S − αM)/(α a_1 + s a_2),  (−a_1 S + sM)/(α a_1 + s a_2))′,

where S = I_0 − (1 − s)T + G.
Economics Application 13 (The Solow Model)

The Solow model is the basic model of economic growth, and it relies on the following key differential equation, which explains the dynamics of capital per unit of effective labor:

k̇(t) = s f(k(t)) − (n + g + δ) k(t),   k(0) = k_0,

where

• k denotes capital per unit of effective labor;
• s is the savings rate;
• n is the rate of population growth;
• g is the rate of technological progress;
• δ is the depreciation rate.

Let us consider two types of production function.

Case 1. The production function is linear: f(k(t)) = a k(t), where a is a positive constant. The equation for k̇ becomes:

k̇(t) = (sa − n − g − δ) k(t).

This linear differential equation has the solution

k(t) = k(0) e^{(sa−n−g−δ)t} = k_0 e^{(sa−n−g−δ)t}.
Case 2. The production function takes the Cobb-Douglas form: f(k(t)) = [k(t)]^α, where α ∈ [0, 1].

In this case, the equation for k̇ represents a Bernoulli nonlinear differential equation:

k̇(t) + (n + g + δ) k(t) = s [k(t)]^α,   k(0) = k_0.

Dividing both sides of the latter equation by [k(t)]^α and denoting m(t) = [k(t)]^{1−α}, m(0) = (k_0)^{1−α}, we obtain:

(1/(1 − α)) ṁ(t) + (n + g + δ) m(t) = s,

or

ṁ(t) + (1 − α)(n + g + δ) m(t) = s(1 − α).

The solution to this first-order linear differential equation is given by:

m(t) = e^{−∫(1−α)(n+g+δ)dt} [A + ∫ s(1 − α) e^{∫(1−α)(n+g+δ)dt} dt]
     = e^{−(1−α)(n+g+δ)t} [A + s(1 − α) ∫ e^{(1−α)(n+g+δ)t} dt]
     = A e^{−(1−α)(n+g+δ)t} + s/(n + g + δ),

where A is determined from the initial condition m(0) = (k_0)^{1−α}. It follows that A = (k_0)^{1−α} − s/(n + g + δ) and

m(t) = [(k_0)^{1−α} − s/(n + g + δ)] e^{−(1−α)(n+g+δ)t} + s/(n + g + δ),

k(t) = {[(k_0)^{1−α} − s/(n + g + δ)] e^{−(1−α)(n+g+δ)t} + s/(n + g + δ)}^{1/(1−α)}.
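The Cobb-Douglas closed form can be sanity-checked numerically: k(t) should satisfy the Bernoulli equation k̇ = s k^α − (n + g + δ)k at every t, and converge to the steady state (s/(n + g + δ))^{1/(1−α)}. The parameter values below are illustrative assumptions of ours, not taken from the text.

```python
import math

# Illustrative parameter values (assumption).
s, n, g, delta, alpha, k0 = 0.3, 0.01, 0.02, 0.05, 0.5, 1.0

def k(t):
    # Closed-form solution from the text (Case 2).
    beta = n + g + delta
    m = (k0 ** (1 - alpha) - s / beta) * math.exp(-(1 - alpha) * beta * t) + s / beta
    return m ** (1 / (1 - alpha))

def residual(t, h=1e-6):
    # k'(t) - [s k^alpha - (n+g+delta) k] should be approximately zero.
    kdot = (k(t + h) - k(t - h)) / (2 * h)   # central difference
    return kdot - (s * k(t) ** alpha - (n + g + delta) * k(t))
```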
Economics Application 14 (The Ramsey-Cass-Koopmans Model)

This model takes capital per unit of effective labor k(t) and consumption per unit of effective labor c(t) as endogenous variables. Their behavior is described by the following system of differential equations:

k̇(t) = f(k(t)) − c(t) − (n + g) k(t),
ċ(t) = [(f′(k(t)) − ρ − θg)/θ] c(t),

with the initial conditions

k(0) = k_0,   c(0) = c_0,

where, in addition to the variables already specified in the previous application,

• ρ stands for the discount rate;
• θ is the coefficient of relative risk aversion.
Assume again that f(k(t)) = a k(t), where a > 0 is a constant. The initial system reduces to a system of linear differential equations:

k̇(t) = (a − n − g) k(t) − c(t),
ċ(t) = [(a − ρ − θg)/θ] c(t).

The second equation implies that:

c(t) = c_0 e^{bt},   where b = (a − ρ − θg)/θ.

Substituting for c(t) in the first equation, we obtain:

k̇(t) = (a − n − g) k(t) − c_0 e^{bt}.

The solution of this linear differential equation takes the form:

k(t) = k_c(t) + k_p(t),

where k_c and k_p are the complementary solution and the particular integral, respectively. The homogeneous equation k̇(t) = (a − n − g) k(t) gives the complementary solution:

k_c(t) = C e^{(a−n−g)t},

where C is an arbitrary constant. The particular integral should be of the form k_p(t) = m e^{bt}; the value of m can be determined by substituting k_p into the nonhomogeneous equation for k̇:

m b e^{bt} = (a − n − g) m e^{bt} − c_0 e^{bt}.

It follows that

m = c_0/(a − n − g − b),   k_p(t) = [c_0/(a − n − g − b)] e^{bt},

and, using the initial condition k(0) = k_0 to pin down C,

k(t) = [k_0 − c_0/(a − n − g − b)] e^{(a−n−g)t} + [c_0/(a − n − g − b)] e^{bt}.
4.1.4 Simultaneous Differential Equations. Types of Equilibria

Let x = x(t), y = y(t), where t is the independent variable, and consider a two-dimensional autonomous system of simultaneous differential equations:

dx/dt = f(x, y),
dy/dt = g(x, y).
The point (x*, y*) is called an equilibrium of this system if f(x*, y*) = g(x*, y*) = 0.

Let J be the Jacobian matrix

J =
[ ∂f/∂x   ∂f/∂y ]
[ ∂g/∂x   ∂g/∂y ]

evaluated at (x*, y*), and let λ_1, λ_2 be the eigenvalues of this Jacobian.
Then the equilibrium is

1. a stable (unstable) node if λ_1, λ_2 are real, distinct and both negative (positive);

2. a saddle if the eigenvalues are real and of different signs, i.e. λ_1 λ_2 < 0;

3. a stable (unstable) focus if λ_1, λ_2 are complex and Re(λ_1) < 0 (Re(λ_1) > 0);

4. a center or a focus if λ_1, λ_2 are complex and Re(λ_1) = 0;

5. a stable (unstable) improper node if λ_1, λ_2 are real, λ_1 = λ_2 < 0 (λ_1 = λ_2 > 0) and the Jacobian is not a diagonal matrix;

6. a stable (unstable) star node if λ_1, λ_2 are real, λ_1 = λ_2 < 0 (λ_1 = λ_2 > 0) and the Jacobian is a diagonal matrix.
The figure below illustrates this classification:

[Figure: phase portraits of the equilibrium types: 1) node, 2) saddle, 3) focus, 4) center, 5) improper node, 6) star node.]
Example 95 Find the equilibria of the system and classify them:

ẋ = 2xy − 4y − 8,
ẏ = 4y² − x².

Solving the system

2xy − 4y − 8 = 0,
4y² − x² = 0

for x, y, we find two equilibria, (−2, −1) and (4, 2).

At (−2, −1) the Jacobian

J =
[ 2y   2x − 4 ]
[ −2x     8y  ]
evaluated at (−2, −1) is

J =
[ −2  −8 ]
[  4  −8 ]

The characteristic equation for J is λ² − tr(J)λ + det(J) = 0.

(tr(J))² − 4 det(J) < 0  ⟹  λ_{1,2} are complex.

tr(J) < 0  ⟹  λ_1 + λ_2 < 0  ⟹  Re λ_1 < 0  ⟹  the equilibrium (−2, −1) is a stable focus.
At (4, 2)

J =
[  4   4 ]
[ −8  16 ]

(tr(J))² − 4 det(J) > 0  ⟹  λ_{1,2} are distinct and real.

tr(J) = λ_1 + λ_2 > 0, det(J) = λ_1 λ_2 > 0  ⟹  λ_1 > 0, λ_2 > 0  ⟹  the equilibrium (4, 2) is an unstable node.
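The trace/determinant reasoning of Example 95 can be packaged as a small classifier. This is a sketch with our own function name; borderline degenerate cases (zero eigenvalues) and the repeated-eigenvalue subtypes (improper vs. star node) are deliberately not distinguished here.

```python
def classify(J):
    # Classify the equilibrium of a 2x2 Jacobian by trace and determinant.
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = tr * tr - 4 * det          # discriminant of the characteristic eq.
    if det < 0:
        return "saddle"               # real eigenvalues of opposite signs
    if disc < 0:                      # complex eigenvalues
        if tr < 0:
            return "stable focus"
        if tr > 0:
            return "unstable focus"
        return "center or focus"      # Re(lambda) = 0
    return "stable node" if tr < 0 else "unstable node"

# The two Jacobians computed in Example 95:
eq1 = classify([[-2.0, -8.0], [4.0, -8.0]])   # at (-2, -1)
eq2 = classify([[4.0, 4.0], [-8.0, 16.0]])    # at (4, 2)
```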
4.2 Difference Equations

Let y denote a real-valued function defined over the set of natural numbers. Hereafter y_k stands for y(k) – the value of y at k, where k = 0, 1, 2, . . ..
Definition 54

• The first difference of y evaluated at k is Δy_k = y_{k+1} − y_k.

• The second difference of y evaluated at k is Δ²y_k = Δ(Δy_k) = y_{k+2} − 2y_{k+1} + y_k.

• In general, the nth difference of y evaluated at k is defined as Δⁿy_k = Δ(Δ^{n−1}y_k), for any integer n, n > 1.
Definition 55 A difference equation in the unknown y is an equation relating y and any of its differences Δy, Δ²y, . . . for each value of k, k = 0, 1, . . ..

Solving a difference equation means finding all functions y which satisfy the relation specified by the equation.
Example 96 (Examples of Difference Equations)

• 2Δy_k − y_k = 1 (or, equivalently, 2Δy − y = 1)

• Δ²y_k + 5y_k Δy_k + (y_k)² = 2 (or Δ²y + 5yΔy + (y)² = 2)
Note that any difference equation can be written as

f(y_{k+n}, y_{k+n−1}, . . . , y_{k+1}, y_k) = g(k),   n ∈ N.

For example, the above two equations may also be expressed as follows:

• 2y_{k+1} − 3y_k = 1

• y_{k+2} − 2y_{k+1} + y_k + 5y_k y_{k+1} − 4(y_k)² = 2
Definition 56 A difference equation is linear if it takes the form

f_0(k) y_{k+n} + f_1(k) y_{k+n−1} + . . . + f_{n−1}(k) y_{k+1} + f_n(k) y_k = g(k),

where f_0, f_1, . . . , f_n, g are each functions of k (but not of some y_{k+i}, i ∈ N) defined for all k = 0, 1, 2, . . ..
Definition 57 A linear difference equation is of order n if both f_0(k) and f_n(k) are different from zero for each k = 0, 1, 2, . . ..
From now on, we focus on the problem of solving linear difference equations of order n with constant coefficients. The general form of such an equation is:

f_0 y_{k+n} + f_1 y_{k+n−1} + . . . + f_{n−1} y_{k+1} + f_n y_k = g_k,

where, this time, f_0, f_1, . . . , f_n are real constants and f_0, f_n are different from zero. Dividing this equation by f_0 and denoting a_i = f_i/f_0 for i = 0, . . . , n and r_k = g_k/f_0, the general form of a linear difference equation of order n with constant coefficients becomes:

y_{k+n} + a_1 y_{k+n−1} + . . . + a_{n−1} y_{k+1} + a_n y_k = r_k,   (13)

where a_n is nonzero.
Recipe 19 – How to Solve a Difference Equation: General Procedure:

The general procedure for solving a linear difference equation of order n with constant coefficients involves three steps:

Step 1. Consider the homogeneous equation

y_{k+n} + a_1 y_{k+n−1} + . . . + a_{n−1} y_{k+1} + a_n y_k = 0,

and find its solution Y. (First we will describe and illustrate the solving procedure for equations of order 1 and 2. Following that, the theory for the case of order n is obtained as a natural extension of the theory for second-order equations.)

Step 2. Find a particular solution y* of the complete equation (13). The most useful technique for finding particular solutions is the method of undetermined coefficients, which will be illustrated in the examples that follow.

Step 3. The general solution of equation (13) is:

y_k = Y + y*.
4.2.1 First-Order Linear Difference Equations

The general form of a first-order linear difference equation is:

y_{k+1} + a y_k = r_k.

The corresponding homogeneous equation is:

y_{k+1} + a y_k = 0,

which has the solution Y = C(−a)^k, where C is an arbitrary constant.
Recipe 20 – How to Solve a First-Order Nonhomogeneous Linear Difference Equation:

First, consider the case when r_k is constant over time, i.e. r_k ≡ r for all k. Much as in the case of linear differential equations, the general solution y_k of the nonhomogeneous difference equation is the sum of the general solution of the homogeneous equation and a particular solution of the nonhomogeneous equation. Therefore

y_k = A(−a)^k + r/(1 + a),   if a ≠ −1,
y_k = A(−a)^k + rk = A + rk,   if a = −1,

where A is an arbitrary constant.

If we have the initial condition y_k = y_0 when k = 0, the general solution of the nonhomogeneous equation takes the form

y_k = [y_0 − r/(1 + a)] (−a)^k + r/(1 + a),   if a ≠ −1,
y_k = y_0 + rk,   if a = −1.

If r now depends on k, r = r_k, then the solution takes the form

y_k = (−a)^k y_0 + Σ_{i=0}^{k−1} (−a)^{k−1−i} r_i,   k = 1, 2, . . .

A particular solution to a nonhomogeneous difference equation can be found using the method of undetermined coefficients.
Example 97 y_{k+1} − 3y_k = 2.

The solution of the homogeneous equation y_{k+1} − 3y_k = 0 is Y = C·3^k. The right-hand term suggests looking for a constant function as a particular solution. Assuming y* = A and substituting into the initial equation, we obtain A = −1; hence y* = −1 and the general solution is y_k = Y + y* = C·3^k − 1.
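Example 97 is easy to confirm by direct iteration. A minimal sketch (the initial value y_0 = 2 is an arbitrary assumption of ours, which fixes C = y_0 + 1):

```python
# Iterate y_{k+1} = 3 y_k + 2 and compare with the closed form C*3^k - 1.
y0 = 2
C = y0 + 1          # from y_0 = C*3^0 - 1

y = y0
values = [y]
for _ in range(10):
    y = 3 * y + 2   # y_{k+1} - 3 y_k = 2  =>  y_{k+1} = 3 y_k + 2
    values.append(y)

closed = [C * 3 ** k - 1 for k in range(11)]
```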
Example 98 y_{k+1} − 3y_k = k² + k + 2.

The homogeneous equation is the same as in the preceding example, but the right-hand term suggests finding a particular solution of the form:

y* = Ak² + Bk + D.

Substituting y* into the nonhomogeneous equation, we have:

A(k + 1)² + B(k + 1) + D − 3Ak² − 3Bk − 3D = k² + k + 2, or

−2Ak² + 2(A − B)k + A + B − 2D = k² + k + 2.

In order to satisfy this equality for each k, we must have:

−2A = 1,
2(A − B) = 1,
A + B − 2D = 2.

It follows that A = −1/2, B = −1, D = −7/4 and y* = −(1/2)k² − k − 7/4. The general solution is y_k = Y + y* = C·3^k − (1/2)k² − k − 7/4.
Example 99 y_{k+1} − 3y_k = 4e^k.

This time we try a particular solution of the form y* = Ae^k. After substitution, we find A = 4/(e − 3). The general solution is y_k = Y + y* = C·3^k + 4e^k/(e − 3).
Two remarks:

• The method of undetermined coefficients illustrated by these examples applies to the general case of order n as well. The trial solutions corresponding to some simple functions r_k are given in the following table:

  Right-hand term r_k      Trial solution y*
  a^k                      A a^k
  k^n                      A_0 + A_1 k + A_2 k² + . . . + A_n k^n
  sin ak or cos ak         A sin ak + B cos ak

• If the function r is a combination of simple functions for which we know the trial solutions, then a trial solution for r can be obtained by combining the trial solutions for the simple functions. For example, if r_k = 5^k k³, then the trial solution would be

  y* = 5^k (A_0 + A_1 k + A_2 k² + A_3 k³).
4.2.2 Second-Order Linear Difference Equations

The general form is:

y_{k+2} + a_1 y_{k+1} + a_2 y_k = r_k.   (14)

The solution to the corresponding homogeneous equation

y_{k+2} + a_1 y_{k+1} + a_2 y_k = 0   (15)

depends on the solutions to the quadratic equation

m² + a_1 m + a_2 = 0,

which is called the auxiliary (or characteristic) equation of equation (14). Let m_1, m_2 denote the roots of the characteristic equation. Then both m_1 and m_2 will be nonzero, because a_2 is nonzero for (14) to be of second order.
Case 1. m_1 and m_2 are real and distinct.

The solution to (15) is Y = C_1 m_1^k + C_2 m_2^k, where C_1, C_2 are arbitrary constants.

Case 2. m_1 and m_2 are real and equal.

The solution to (15) is Y = C_1 m_1^k + C_2 k m_1^k = (C_1 + C_2 k) m_1^k.

Case 3. m_1 and m_2 are complex with the polar forms r(cos θ ± i sin θ), r > 0, θ ∈ (−π, π].

The solution to (15) is Y = C_1 r^k cos(kθ + C_2).
Any complex number a + bi admits a representation in the polar (or trigonometric) form r(cos θ ± i sin θ). The transformation is given by

• r = √(a² + b²);

• θ is the unique angle (in radians) such that θ ∈ (−π, π] and

cos θ = a/√(a² + b²),   sin θ = b/√(a² + b²).
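This conversion is essentially what `math.atan2` computes. A minimal sketch (our own helper name), illustrated on the complex number −1/2 + (√3/2)i, whose polar angle is 2π/3:

```python
import math

def to_polar(a, b):
    # Polar form of a + bi: r(cos theta + i sin theta).
    r = math.hypot(a, b)          # r = sqrt(a^2 + b^2)
    theta = math.atan2(b, a)      # angle with cos theta = a/r, sin theta = b/r
    return r, theta

r, theta = to_polar(-0.5, math.sqrt(3) / 2)
```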
Example 100 y_{k+2} − 5y_{k+1} + 6y_k = 0.

The characteristic equation is m² − 5m + 6 = 0 and has the roots m_1 = 2, m_2 = 3. The difference equation has the solution y_k = C_1 2^k + C_2 3^k.
Example 101 y_{k+2} − 8y_{k+1} + 16y_k = 2^k + 3.

The characteristic equation m² − 8m + 16 = 0 has the roots m_1 = m_2 = 4; therefore, the solution to the corresponding homogeneous equation is Y = (C_1 + C_2 k) 4^k.

To find a particular solution, we consider the trial solution y* = A·2^k + B. By substituting y* into the nonhomogeneous equation and performing the computations we obtain A = 1/4, B = 1/3. Thus, the solution to the nonhomogeneous equation is y_k = (C_1 + C_2 k) 4^k + (1/4)·2^k + 1/3.
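Example 101 can be checked exactly by iterating the recurrence with rational arithmetic. A sketch (the initial values y_0 = 1, y_1 = 2 are illustrative assumptions of ours; they pin down C_1 and C_2):

```python
from fractions import Fraction as F

y0, y1 = F(1), F(2)

# Fit the constants of the closed form y_k = (C1 + C2 k) 4^k + (1/4) 2^k + 1/3.
C1 = y0 - F(1, 4) - F(1, 3)                    # from k = 0
C2 = (y1 - F(1, 2) - F(1, 3)) / 4 - C1         # from k = 1

def closed(k):
    return (C1 + C2 * k) * F(4) ** k + F(1, 4) * F(2) ** k + F(1, 3)

seq = [y0, y1]
for k in range(20):
    # y_{k+2} = 8 y_{k+1} - 16 y_k + 2^k + 3
    seq.append(8 * seq[k + 1] - 16 * seq[k] + 2 ** k + 3)
```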
Example 102 y_{k+2} + y_{k+1} + y_k = 0.

The equation m² + m + 1 = 0 gives the roots m_1 = (−1 − i√3)/2, m_2 = (−1 + i√3)/2 or, written in polar form, m_{1,2} = cos(2π/3) ± i sin(2π/3). Therefore, the given difference equation has the solution y_k = C_1 cos((2π/3)k + C_2).
4.2.3 The General Case of Order n

In general, any difference equation of order n can be written as:

y_{k+n} + a_1 y_{k+n−1} + . . . + a_{n−1} y_{k+1} + a_n y_k = r_k.

The corresponding characteristic equation becomes

m^n + a_1 m^{n−1} + . . . + a_{n−1} m + a_n = 0

and has exactly n roots, denoted m_1, . . . , m_n.
The general solution to the homogeneous equation is a sum of terms which are produced by the roots m_1, . . . , m_n in the following way:

• a real simple root m generates the term C_1 m^k;

• a real root m repeated p times generates the term (C_1 + C_2 k + C_3 k² + . . . + C_p k^{p−1}) m^k;

• each pair of simple complex conjugate roots r(cos θ ± i sin θ) generates the term C_1 r^k cos(kθ + C_2);

• each pair of complex conjugate roots r(cos θ ± i sin θ) repeated p times generates the term

r^k [C_{1,1} cos(kθ + C_{1,2}) + C_{2,1} k cos(kθ + C_{2,2}) + . . . + C_{p,1} k^{p−1} cos(kθ + C_{p,2})].

The sum of terms will contain n arbitrary constants.
A particular solution y* to the nonhomogeneous equation again can be obtained by means of the method of undetermined coefficients (as was illustrated for the cases of first-order and second-order difference equations).
Example 103 y_{k+4} − 5y_{k+3} + 9y_{k+2} − 7y_{k+1} + 2y_k = 0.

The characteristic equation

m⁴ − 5m³ + 9m² − 7m + 2 = 0

has the root m_1 = 1, which is repeated 3 times, and the root m_2 = 2. The solution is

y_k = (C_1 + C_2 k + C_3 k²) + C_4 2^k.
Economics Application 15 (Simple and Compound Interest)

Let S_0 denote an initial sum of money. There are two basic methods for computing the interest earned in a period, for example, one year.

S_0 earns simple interest at rate r if each period the interest equals a fraction r of S_0. If S_k denotes the sum accumulated after k periods, S_k is computed in the following way:

S_{k+1} = S_k + r S_0.

This is a first-order difference equation which has the solution S_k = S_0 (1 + kr).

S_0 earns compound interest at rate r if each period the interest equals a fraction r of the sum accumulated at the beginning of that period. In this case, the computation formula is:

S_{k+1} = S_k + r S_k,   or   S_{k+1} = S_k (1 + r).

The solution to this homogeneous first-order difference equation is S_k = (1 + r)^k S_0.
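Both schemes are one-line iterations. The sketch below (our own variable names; S_0 = 100, r = 5% and 10 periods are illustrative assumptions) iterates the two difference equations and compares them with the closed-form solutions:

```python
# Simple vs. compound interest by direct iteration.
S0, r, periods = 100.0, 0.05, 10

simple, compound = S0, S0
for k in range(1, periods + 1):
    simple = simple + r * S0          # S_{k+1} = S_k + r S_0
    compound = compound * (1 + r)     # S_{k+1} = S_k (1 + r)

simple_closed = S0 * (1 + periods * r)       # S_k = S_0 (1 + k r)
compound_closed = S0 * (1 + r) ** periods    # S_k = (1 + r)^k S_0
```

As expected, compounding dominates simple interest after the first period.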
Economics Application 16 (A Dynamic Model of Economic Growth)

Let Y_t, C_t, I_t denote national income, consumption, and investment in period t, respectively. A simple dynamic model of economic growth is described by the equations:

Y_t = C_t + I_t,
C_t = c + mY_t,
Y_{t+1} = Y_t + rI_t,

where

• c is a positive constant;
• m is the marginal propensity to consume, 0 < m < 1;
• r is the growth factor.

From the above system of three equations, we can express the dynamics of national income in the form of a first-order difference equation:

Y_{t+1} − Y_t = rI_t = r(Y_t − C_t) = rY_t − r(c + mY_t),

or

Y_{t+1} − [1 + r(1 − m)] Y_t = −rc.
The solution to this equation is

Y_t = (Y_0 − c/(1 − m)) [1 + r(1 − m)]^t + c/(1 − m).

The behavior of investment can be described in the following way:

I_{t+1} − I_t = (Y_{t+1} − C_{t+1}) − (Y_t − C_t)
             = (Y_{t+1} − Y_t) − (C_{t+1} − C_t)
             = rI_t − m(Y_{t+1} − Y_t)
             = rI_t − mrI_t,

or

I_{t+1} = [1 + r(1 − m)] I_t.

This homogeneous difference equation has the solution I_t = [1 + r(1 − m)]^t I_0.
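The national-income solution can be verified exactly by iterating the recurrence with rational arithmetic. A sketch (parameter values c = 10, m = 4/5, r = 1/10, Y_0 = 100 are illustrative assumptions of ours):

```python
from fractions import Fraction as F

c, m, r, Y0 = F(10), F(4, 5), F(1, 10), F(100)

rho = 1 + r * (1 - m)          # growth factor 1 + r(1 - m)
Y = Y0
for _ in range(15):
    Y = rho * Y - r * c        # Y_{t+1} = [1 + r(1-m)] Y_t - r c

# Closed form Y_t = (Y_0 - c/(1-m)) * rho^t + c/(1-m) at t = 15.
Y_closed = (Y0 - c / (1 - m)) * rho ** 15 + c / (1 - m)
```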
4.3 Introduction to Dynamic Optimization

4.3.1 The First-Order Conditions

A typical dynamic optimization problem takes the following form:

max_{c(t)} V = ∫_0^T f[k(t), c(t), t] dt

subject to

k̇(t) = g[k(t), c(t), t],
k(0) = k_0 > 0,

where

• [0, T] is the horizon over which the problem is considered (T can be finite or infinite);

• V is the value of the objective function as seen from the initial moment t_0 = 0;

• c(t) is the control variable, and the objective function V is maximized with respect to this variable;

• k(t) is the state variable, and the first constraint (called the equation of motion) describes the evolution of this state variable over time (k̇ denotes dk/dt).

Similarly to static optimization, in order to solve the optimization program we first need to find the first-order conditions:
Recipe 21 – How to Derive the First-Order Conditions:

Step 1. Construct the Hamiltonian function

H = f(k, c, t) + λ(t) g(k, c, t),

where λ(t) is a Lagrange multiplier.

Step 2. Take the derivative of the Hamiltonian with respect to the control variable and set it to 0:

∂H/∂c = ∂f/∂c + λ ∂g/∂c = 0.

Step 3. Take the derivative of the Hamiltonian with respect to the state variable and set it equal to the negative of the derivative of the Lagrange multiplier with respect to time:

∂H/∂k = ∂f/∂k + λ ∂g/∂k = −λ̇.

Step 4. Transversality condition:

Case 1: Finite horizons.

λ(T) k(T) = 0.

Case 2: Infinite horizons with f of the form f[k(t), c(t), t] = e^{−ρt} u[k(t), c(t)]:

lim_{t→∞} [λ(t) k(t)] = 0.

Case 3: Infinite horizons with f in a form different from that specified in Case 2:

lim_{t→∞} [H(t)] = 0.
Recipe 22 – The F.O.C. with More than One State and/or Control Variable:
It may happen that the dynamic optimization problem contains more than one control
variable and more than one state variable. In that case we need an equation of motion
for each state variable. To write the ﬁrstorder conditions, the algorithm speciﬁed above
should be modiﬁed in the following way:
Step 1a. The Hamiltonian includes the righthand side of each equation of motion times the
corresponding multiplier.
Step 2a. Applies for each control variable.
Step 3a. Applies for each state variable.
4.3.2 Present-Value and Current-Value Hamiltonians

In economic problems, the objective function is usually of the form

f[k(t), c(t), t] = e^{−ρt} u[k(t), c(t)],

where ρ is a constant discount rate and e^{−ρt} is a discount factor. The Hamiltonian

H = e^{−ρt} u(k, c) + λ(t) g(k, c, t)

is called the present-value Hamiltonian (since it represents a value at the time t_0 = 0).

Sometimes it is more convenient to work with the current-value Hamiltonian, which represents a value at the time t:

Ĥ = H e^{ρt} = u(k, c) + µ(t) g(k, c, t),

where µ(t) = λ(t) e^{ρt}. The first-order conditions written in terms of the current-value Hamiltonian appear to be a slight modification of the present-value type:

• ∂Ĥ/∂c = 0;

• ∂Ĥ/∂k = ρµ − µ̇;

• µ(T) k(T) = 0 (the transversality condition in the case of finite horizons).
4.3.3 Dynamic Problems with Inequality Constraints

Let us add an inequality constraint to the dynamic optimization problem considered above:

max_{c(t)} V = ∫_0^T f[k(t), c(t), t] dt

subject to

k̇(t) = g[k(t), c(t), t],
h(t, k, c) ≥ 0,
k(0) = k_0 > 0.

The following algorithm allows us to write the first-order conditions:

Recipe 23 – The F.O.C. with Inequality Constraints:

Step 1. Construct the Hamiltonian

H = f(k, c, t) + µ(t) g(k, c, t) + w(t) h(t, k, c),

where µ(t) and w(t) are Lagrange multipliers.

Step 2. ∂H/∂c = 0.

Step 3. ∂H/∂k = −µ̇.

Step 4. w(t) ≥ 0, w(t) h(t, k, c) = 0.

Step 5. Transversality condition if T is finite:

µ(T) k(T) = 0.
Economics Application 17 (The Ramsey Model)

This example illustrates how the tools of dynamic optimization can be applied to solve a model of economic growth, namely the Ramsey model (for more details on the economic illustrations presented in this section see, for instance, R. J. Barro and X. Sala-i-Martin, Economic Growth, McGraw-Hill, Inc., 1995).

The model assumes that the economy consists of households and firms, but here we further assume, for simplicity, that households carry out production directly. Each household chooses, at each moment t, the level of consumption that maximizes the present value of lifetime utility

U = ∫_0^∞ u[A(t)c(t)] e^{nt} e^{−ρt} dt

subject to the dynamic budget constraint

k̇(t) = f[k(t)] − c(t) − (x + n + δ) k(t)

and the initial level of capital k(0) = k_0.
Throughout the example we use the following notation:

• A(t) is the level of technology;

• c denotes consumption per unit of effective labor (effective labor is defined as the product of the labor force L(t) and the level of technology A(t));

• k denotes the amount of capital per unit of effective labor;

• f(·) is the production function written in intensive form (f(k) = F(K/AL, 1));

• n is the rate of population growth;

• ρ is the rate at which households discount future utility;

• x is the growth rate of technological progress;

• δ is the depreciation rate.

The expression A(t)c(t) in the utility function represents consumption per person, and the term e^{nt} captures the effect of the growth in family size at rate n. The dynamic constraint says that the change in the stock of capital (expressed in units of effective labor) is given by the difference between the level of output (in units of effective labor) and the sum of consumption (per unit of effective labor), depreciation, and the additional effect (x + n)k(t) from expressing all variables in terms of effective labor.
As can be observed, households face a dynamic optimization problem with c as a control variable and k as a state variable. The present-value Hamiltonian is

H = u(A(t)c(t)) e^{nt} e^{−ρt} + µ(t)[f(k(t)) − c(t) − (x + n + δ) k(t)]

and the first-order conditions state that

∂H/∂c = [∂u(Ac)/∂c] e^{nt} e^{−ρt} = µ,

∂H/∂k = µ[f′(k) − (x + n + δ)] = −µ̇.

In addition, the transversality condition requires:

lim_{t→∞} [µ(t) k(t)] = 0.
Eliminating µ from the first-order conditions and assuming that the utility function takes the functional form u(Ac) = (Ac)^{1−θ}/(1 − θ) leads us to an expression⁷ for the growth rate of c:

ċ/c = (1/θ) [f′(k) − δ − ρ − θx].

This equation, together with the dynamic budget constraint, forms a system of differential equations which completely describes the dynamics of the Ramsey model. The boundary conditions are given by the initial condition k_0 and the transversality condition.
Economics Application 18 (A Model of Economic Growth with Physical and Human Capital)

This model illustrates the case when the optimization problem has two dynamic constraints and two control variables.
⁷ To reconstruct these calculations, note first that

∂u(Ac)/∂c = A^{1−θ} c^{−θ}

and

(d/dt)[∂u(Ac)/∂c] = [(1 − θ)(Ȧ/A) − θ(ċ/c)] ∂u(Ac)/∂c,

or, assuming that Ȧ/A = x,

(d/dt)[∂u(Ac)/∂c] = [(1 − θ)x − θ(ċ/c)] ∂u(Ac)/∂c.

Therefore,

µ̇ = [∂u(Ac)/∂c] e^{nt} e^{−ρt} [n − ρ + (1 − θ)x − θ(ċ/c)].

Eliminating µ from the second equation in the F.O.C. we get

θ(ċ/c) − (1 − θ)x − n + ρ = f′(k) − x − n − δ,

which after rearranging terms gives the expression for the growth rate of consumption c.
As was the case in the previous example, we assume that households perform production directly. The model also assumes that physical capital K and human capital H enter the production function Y in a Cobb-Douglas manner:

Y = A K^α H^{1−α},

where A denotes a constant level of technology and 0 ≤ α ≤ 1. Output is divided among consumption C, investment in physical capital I_K and investment in human capital I_H:

Y = C + I_K + I_H.

The two capital stocks change according to the dynamic equations:

K̇ = I_K − δK,
Ḣ = I_H − δH,

where δ denotes the depreciation rate.
The households’ problem is to choose C, I_K, I_H such that they maximize the present value of lifetime utility

U = ∫_0^∞ u(C) e^{−ρt} dt

subject to the above four constraints. We again assume that u(C) = C^{1−θ}/(1 − θ). By eliminating Y and I_K, the households’ problem becomes:

Maximize U subject to

K̇ = A K^α H^{1−α} − C − δK − I_H,
Ḣ = I_H − δH,

and it contains two control variables (C, I_H) and two state variables (K, H).
The Hamiltonian associated with this dynamic problem is:

    H = u(C)e^{−ρt} + μ_1(A K^α H^{1−α} − C − δK − I_H) + μ_2(I_H − δH)

and the first-order conditions require

    ∂H/∂C = u'(C)e^{−ρt} − μ_1 = 0;                 (16)
    ∂H/∂I_H = −μ_1 + μ_2 = 0;                       (17)
    ∂H/∂K = μ_1(αA K^{α−1} H^{1−α} − δ) = −μ̇_1;     (18)
    ∂H/∂H = μ_1(1 − α)A K^α H^{−α} − μ_2 δ = −μ̇_2.  (19)

Eliminating μ_1 from (16) and (18), we obtain a result for the growth rate of consumption:

    Ċ/C = (1/θ)[αA(K/H)^{−(1−α)} − δ − ρ].
Since μ_1 = μ_2 (from (17)), conditions (18) and (19) imply

    αA(K/H)^{−(1−α)} − δ = (1 − α)A(K/H)^α − δ,

which enables us to find the ratio:

    K/H = α/(1 − α).

After substituting this ratio into the result for Ċ/C, it turns out that consumption grows at a constant rate given by:

    Ċ/C = (1/θ)[A α^α (1 − α)^{1−α} − δ − ρ].

The substitution of K/H into the production function implies that Y is a linear function of K:

    Y = A K ((1 − α)/α)^{1−α}.

(Therefore, such a model is also called an AK model.)

The formulas for Y and the K/H ratio suggest that K, H and Y grow at a constant rate. Moreover, it can be shown that this rate is the same as the growth rate of C found above.
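These claims are easy to verify numerically; in the sketch below (plain Python, with arbitrary illustrative parameter values) the ratio K/H = α/(1−α) equalizes the two net marginal returns, and the resulting growth rate matches the closed form above.

```python
# Check that K/H = alpha/(1-alpha) equalizes the two no-arbitrage returns
# alpha*A*(K/H)^(alpha-1) - delta  and  (1-alpha)*A*(K/H)^alpha - delta,
# and that the implied growth rate matches
# (1/theta)*(A*alpha^alpha*(1-alpha)^(1-alpha) - delta - rho).
# Parameter values are illustrative.
A, alpha, delta, rho, theta = 1.0, 0.3, 0.05, 0.02, 2.0

ratio = alpha / (1 - alpha)                     # K/H from equating returns
r_K = alpha * A * ratio ** (alpha - 1) - delta  # net return on physical capital
r_H = (1 - alpha) * A * ratio ** alpha - delta  # net return on human capital
growth = (r_K - rho) / theta
growth_closed = (A * alpha ** alpha * (1 - alpha) ** (1 - alpha) - delta - rho) / theta

assert abs(r_K - r_H) < 1e-12
assert abs(growth - growth_closed) < 1e-12
```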
Economics Application 19 (Investment-Selling Decisions)

Finally, let us exemplify the case of a dynamic problem with inequality constraints. Assume that a firm produces, at each moment t, a good which can either be sold or reinvested to expand the productive capacity y(t) of the firm. The firm's problem is to choose at each moment the proportion u(t) of the output y(t) to be reinvested so as to maximize total sales over the period [0, T].

Expressed in mathematical terms, the firm's problem is:

    max ∫_0^T [1 − u(t)] y(t) dt
    subject to ẏ(t) = u(t)y(t),
               y(0) = y_0 > 0,
               0 ≤ u(t) ≤ 1,

where u is the control and y is the state variable.
The Hamiltonian takes the form:

    H = (1 − u)y + λuy + w_1(1 − u) + w_2 u.

The optimal solution satisfies the conditions:

    H_u = (λ − 1)y + w_2 − w_1 = 0,
    λ̇ = u − 1 − uλ,
    w_1 ≥ 0,  w_1(1 − u) = 0,
    w_2 ≥ 0,  w_2 u = 0,
    λ(T) = 0.
These conditions imply that either

    λ > 1, u = 1 and λ̇ = −λ     (20)

or

    λ < 1, u = 0 and λ̇ = −1;    (21)

thus, λ(t) is a decreasing function over [0, T]. Since λ(t) is also a continuous function, there exists t* such that (21) holds for any t in [t*, T]. Hence

    u(t) = 0,  λ(t) = T − t,  y(t) = y(t*),

for any t in [t*, T].

t* must satisfy λ(t*) = 1, which implies that t* = T − 1 if T ≥ 1. If T ≤ 1, we have t* = 0.

If T > 1, then for any t in [0, T − 1] (20) must hold. Using the fact that y is continuous over [0, T] (and, therefore, continuous at t* = T − 1), we obtain

    u(t) = 1,  λ(t) = e^{T−t−1},  y(t) = y_0 e^t,

for any t in [0, T − 1].

In conclusion, if T > 1 then it is optimal for the firm to invest all the output until t = T − 1 and to sell all the output after that date. If T < 1 then it is optimal to sell all the output.
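For a concrete feel for this bang-bang rule, the sketch below (plain Python, with illustrative values y_0 = 1 and T = 3) compares total sales under the optimal policy (reinvest everything until t = T − 1, then sell everything, yielding y_0·e^{T−1}) with the naive sell-everything policy (yielding y_0·T).

```python
# Compare total sales under the bang-bang policy with the naive
# "sell everything from the start" policy, for T > 1.
# y0 and T are illustrative values.
import math

y0, T = 1.0, 3.0
bang_bang_sales = y0 * math.exp(T - 1)  # sell at rate y0*e^(T-1) over [T-1, T]
sell_all_sales = y0 * T                 # sell at rate y0 over [0, T]

assert bang_bang_sales > sell_all_sales
```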
Further Reading:
• Arrowsmith, D.K. and C.M. Place. Dynamical Systems.
• Barro, R.J. and X. Sala-i-Martin. Economic Growth.
• Chiang, A.C. Elements of Dynamic Optimization.
• Goldberg, S. Introduction to Difference Equations.
• Hartman, Ph. Ordinary Differential Equations.
• Kamien, M.I. and N.L. Schwartz. Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management.
• Lorenz, H.-W. Nonlinear Dynamical Economics and Chaotic Motion.
• Pontryagin, L.S. Ordinary Differential Equations.
• Seierstad, A. and K. Sydsæter. Optimal Control Theory with Economic Applications.
5 Exercises
5.1 Solved Problems
1. Evaluate the Vandermonde determinant

    det [ 1  x_1  x_1^2 ]
        [ 1  x_2  x_2^2 ]
        [ 1  x_3  x_3^2 ].
2. Find the rank of the matrix A,

    A = [  0   2   2 ]
        [  1  −3  −1 ]
        [ −2   0  −4 ]
        [  4   6  14 ].
3. Solve the system Ax = b, given

    A = [ 1   2   3 ]        [ 5 ]
        [ 2  −1  −1 ],   b = [ 1 ]
        [ 1   3   4 ]        [ 6 ].
4. Find the unknown matrix X from the equation

    [ 1  2 ] X = [ 4  −6 ]
    [ 2  5 ]     [ 2   1 ].
5. Find all solutions of the system of linear equations
2x + y −z = 3,
3x + 3y + 2z = 7,
7x + 5y = 13.
6. a) Given v = (x, y, z)', find the quadratic form v'Av if

    A = [ 3   4  6 ]
        [ 4  −2  0 ]
        [ 6   0  1 ].

b) Given the quadratic form x^2 + 2y^2 + 3z^2 + 4xy − 6yz + 8xz, find the matrix A of v'Av.
7. Find the eigenvalues and corresponding eigenvectors of the following matrix:

    [ 1  0  1 ]
    [ 0  1  1 ]
    [ 1  1  2 ].

Is this matrix positive definite?
8. Show that the matrix

    [ a  d  e ]
    [ d  b  f ]
    [ e  f  c ]

is positive definite if and only if

    a > 0,  ab − d^2 > 0,  abc + 2edf > af^2 + be^2 + cd^2.
9. Find the following limits:

    a) lim_{x→−2} (x^2 + x − 2)/(x^2 + 2x)
    b) lim_{x→1} (x − 1)/ln x
    c) lim_{x→+∞} x^n e^{−x}
    d) lim_{x→0} x^x.
10. Show that the parabola y = x^2/(2e) is tangent to the curve y = ln x and find the tangent point.
11. Find y'(x) if

    a) y = ln √(e^{4x}/(e^{4x} + 1))
    b) y = ln(sin x + √(1 + sin^2 x))
    c) y = x^{1/x}.
12. Show that the function y(x) = x(ln x − 1) is a solution to the differential equation y''(x)·x·ln x = y'(x).
13. Given f(x) = 1/(1 − x^2), show that

    f^(n)(0) = n!  if n is even,
    f^(n)(0) = 0   if n is odd.
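This claim can be checked mechanically from the power series 1/(1 − x^2) = Σ x^{2k}: the recursion below recovers the Taylor coefficients c_n (so that f^(n)(0) = n!·c_n) from the identity (1 − x^2)·Σ c_n x^n = 1.

```python
# Taylor coefficients c_n of 1/(1 - x^2) satisfy c_0 = 1, c_1 = 0, c_n = c_{n-2}
# (from (1 - x^2) * sum c_n x^n = 1), and f^(n)(0) = n! * c_n.
import math

N = 12
c = [0] * (N + 1)
c[0] = 1
for n in range(2, N + 1):
    c[n] = c[n - 2]

derivs = [math.factorial(n) * c[n] for n in range(N + 1)]
for n in range(N + 1):
    expected = math.factorial(n) if n % 2 == 0 else 0
    assert derivs[n] == expected
```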
14. Find d^2y/dx^2, given

    a) x = t^2,  y = t + t^3.
    b) x = e^{2t},  y = e^{3t}.
15. Can we apply Rolle's Theorem to the function f(x) = 1 − ∛(x^2), where x ∈ [−1, 1]? Explain your answer and show it graphically.
16. Given y(x) = x^2 |x|, find y'(x) and draw the graph of the function.

17. Find the approximate value of ⁵√33.
18. Find ∂z/∂x and ∂z/∂y, given

    a) z = x^3 + 3x^2 y − y^3
    b) z = xy/(x − y)
    c) z = x e^{−xy}.
19. Show that the function z(x, y) = x e^{−y/x} is a solution to the differential equation

    x ∂^2z/(∂x∂y) + 2(∂z/∂x + ∂z/∂y) = y ∂^2z/∂y^2.
20. Find the extrema of the following functions:

    a) z = y√x − y^2 − x + 6y
    b) z = e^{x/2}(x + y^2).
21. Evaluate the following indefinite integrals:

    a) ∫ (x^2 + 1)^2/x^3 dx
    b) ∫ a^x (1 + a^{−x}/√(x^3)) dx
    c) ∫ e^{x^3} x^2 dx
    d) ∫ x√(x^2 + 1) dx
    e) ∫ x ln(x − 1) dx
22. Evaluate the following definite integrals:

    a) ∫_1^2 (x^2 + 1/x^4) dx
    b) ∫_4^9 dx/(√x − 1)   (Hint: substitution x = t^2)
23. Find the area bounded by the curves:

    a) y = x^3, y = 8, x = 0
    b) 4y = x^2, y^2 = 4x

24. Check whether the improper integral

    ∫_0^{+∞} dx/(x − 3)^2

converges or diverges.
25. Find extrema of the functions

    a) z = 3x + 6y − x^2 − xy − y^2
    b) z = 3x^2 − 2x√y + y − 8x + 8

26. Find the maximum and minimum values of the function f(x, y) = x^3 + y^3 − 3xy in the domain {0 ≤ x ≤ 2, −1 ≤ y ≤ 2}. (Hint: Don't forget to test the function on the boundary of the domain.)
27. Find dy/dx, given

    a) x^2 + y^2 − 4x + 6y = 0
    b) x e^{2y} − y e^{2x} = 0

28. Given the equation 2 sin(x + 2y − 3z) = x + 2y − 3z, show that z_x + z_y = 1.
29. Check that the Cobb-Douglas production function z = x^a y^b for 0 < a, b < 1, a + b ≤ 1 is concave for x, y > 0.

30. Show that z = (x^{1/2} + y^{1/2})^2 is a concave function.

31. For which values of x, y is the function f(x, y) = x^3 + xy + y^2 convex?
32. Find dx/dz and dy/dz given

    a) x + y + z = 0,  x^2 + y^2 + z^2 = 1
    b) x^2 + y^2 − 2z = 1,  x + xy + y + z = 1

33. Find the conditional extrema of the following functions and classify them:

    a) u = x + y + z^2, s.t. z − x = 1, y − xz = 1
    b) u = x + y, s.t. 1/x^2 + 1/y^2 = 1/a^2
    c) u = (x + y)z, s.t. 1/x^2 + 1/y^2 + 2/z^2 = 4
    d) u = xyz, s.t. x^2 + y^2 + z^2 = 3
    e) u = xyz, s.t. x^2 + y^2 + z^2 = 1, x + y + z = 0
    f) u = x − 2y + z, s.t. x + y^2 − z^2 = 1.
34. A consumer is known to have a Cobb-Douglas utility of the form u(x, y) = x^α y^{1−α}, where the parameter α is unknown. However, it is known that when faced with the utility maximization problem

    max u(x, y) subject to x + y = 3,

the consumer chooses x = 1, y = 2. Find the value of α.
35. Solve the following differential equations:

    a) x(1 + y^2) + y(1 + x^2)y' = 0
    b) y' = xy(y + 2)
    c) y' + y = 2x + 1
    d) y^2 + x^2 y' = xyy'
    e) 2x^3 y' = y(2x^2 − y^2)
    f) x^2 y' + xy + 1 = 0
    g) y = x(y' − x cos x)
    h) xy' − 2y = 2x^4
    i) 2y'' − 5y' + 2y = 0
    j) y'' − 2y' = 0
    k) y''' − 3y' + 2y = 0
    l) y'' − 2y' + y = 6xe^x
    m) y'' + 2y' + y = 3e^{−x}√(x + 1)
36. Solve the following systems of differential equations:

    a) ẋ = 5x + 2y,  ẏ = −4x − y
    b) ẋ = 2y,  ẏ = 2x
    c) ẋ = 2x + y,  ẏ = x + 2y
Answers:

1. (x_2 − x_1)(x_3 − x_1)(x_3 − x_2).

2. 2.
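The identity in answer 1 is easy to confirm numerically; the sketch below (plain Python, hand-rolled cofactor expansion) compares the 3×3 Vandermonde determinant with the product formula at a few sample points.

```python
# Check the 3x3 Vandermonde identity det = (x2-x1)*(x3-x1)*(x3-x2)
# by cofactor expansion along the first row, for a few sample points.
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

for x1, x2, x3 in [(1, 2, 3), (0, -1, 5), (2.5, 4.0, -3.0)]:
    V = [[1, x1, x1 ** 2], [1, x2, x2 ** 2], [1, x3, x3 ** 2]]
    product = (x2 - x1) * (x3 - x1) * (x3 - x2)
    assert abs(det3(V) - product) < 1e-9
```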
3. x = (1, −1, 2)'.

4. X = [ 16  −32 ]
       [ −6   13 ].
5. The third row of the matrix is a linear combination of the first two; therefore we have to solve a system of two equations in three variables. Taking one of the variables, say z, as a parameter, we find the solution to be

    x = (2 + 5z)/3,  y = (5 − 7z)/3.
6. a) 3x^2 − 2y^2 + z^2 + 8xy + 12xz.

b) A = [ 1   2   4 ]
       [ 2   2  −3 ]
       [ 4  −3   3 ].

7. The eigenvalues are λ = 0, 1, 3.
The corresponding eigenvectors are const·(1, 1, −1)', const·(1, −1, 0)', const·(1, 1, 2)'.
The matrix is nonnegative definite.
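A quick way to confirm the eigenvalues in answer 7 is to check that det(A − λI) vanishes at λ = 0, 1, 3; the sketch below does this with a hand-rolled 3×3 determinant.

```python
# Verify that lambda = 0, 1, 3 are eigenvalues of the matrix by checking
# det(A - lambda*I) = 0, using a hand-rolled 3x3 determinant.
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

A = [[1, 0, 1], [0, 1, 1], [1, 1, 2]]
for lam in (0, 1, 3):
    M = [[A[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    assert det3(M) == 0
```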
9. a) 3/2  b) 1  c) 0  d) 1.

10. x_0 = √e, y_0 = 1/2.

11. a) 2/(e^{4x} + 1)  b) cos x/√(1 + sin^2 x)  c) x^{1/x}(1 − ln x)/x^2.

14. a) (3t^2 − 1)/(4t^3)  b) 3/(4e^t).
17. 2.0125.

18. a) 3x(x + 2y), 3(x^2 − y^2)  b) −y^2/(x − y)^2, x^2/(x − y)^2  c) e^{−xy}(1 − xy), −x^2 e^{−xy}.
20. a) z_max = 12 at x = y = 4  b) z_min = −2/e at x = −2, y = 0.

21. a) x^2/2 + 2 ln|x| − 1/(2x^2) + C  b) a^x/ln a − 2/√x + C  c) e^{x^3}/3 + C  d) (x^2 + 1)^{3/2}/3 + C
e) x^2 ln|x − 1|/2 − (x^2/2 + x + ln|x − 1|)/2 + C

22. a) 21/8  b) 2(1 + ln 2)

23. a) 12  b) 16/3

24. Diverges.

25. a) z_max = 9 at x = 0, y = 3  b) z_min = 0 at x = 2, y = 4

26. f_min = −1 at x = y = 1, f_max = 13 at x = 2, y = −1.

27. a) (2 − x)/(y + 3)  b) (2y e^{2x} − e^{2y})/(2x e^{2y} − e^{2x})
31. For any y and x ≥ 1/12.
32. a) x' = (y − z)/(x − y),  y' = (z − x)/(x − y).
b) y' = −x' = 1/(y − x).

33. a) minimum at x = −1, y = 1, z = 0
b) minimum at x = y = −√2·a, maximum at x = y = √2·a
c) maximum at x = y = 1, z = 1; minimum at x = y = 1, z = −1
d) minimum at the points (−1, 1, 1), (1, −1, 1), (1, 1, −1), (−1, −1, −1),
maximum at the points (1, 1, 1), (−1, −1, 1), (−1, 1, −1), (1, −1, −1)
e) minimum at the points (a, a, −2a), (a, −2a, a), (−2a, a, a),
maximum at the points (−a, −a, 2a), (−a, 2a, −a), (2a, −a, −a),
where a = 1/√6
f) no extrema
34. α = 1/3

35. a) (1 + x^2)(1 + y^2) = C
b) y = 2Ce^{x^2}/(1 − Ce^{x^2})
c) y = 2x − 1 + Ce^{−x}
d) y = Ce^{y/x}
e) x = ±y√(ln Cx)
f) xy = C − ln|x|
g) y = x(C + sin x)
h) y = Cx^2 + x^4
i) y = C_1 e^{2x} + C_2 e^{x/2}
j) y = C_1 + C_2 e^{2x}
k) y = e^x(C_1 + C_2 x) + C_3 e^{−2x}
l) y = (C_1 + C_2 x)e^x + x^3 e^x
m) y = e^{−x}((4/5)(x + 1)^{5/2} + C_1 + C_2 x)
36. a) x = C_1 e^t + C_2 e^{3t},  y = −2C_1 e^t − C_2 e^{3t}
b) x = C_1 e^{2t} + C_2 e^{−2t},  y = C_1 e^{2t} − C_2 e^{−2t}
c) x = C_1 e^t + C_2 e^{3t},  y = −C_1 e^t + C_2 e^{3t}
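Solutions like 36 a) are easy to verify by differentiating numerically; the sketch below (plain Python, with arbitrary constants C_1, C_2) checks at a sample point that x = C_1 e^t + C_2 e^{3t}, y = −2C_1 e^t − C_2 e^{3t} satisfies ẋ = 5x + 2y and ẏ = −4x − y.

```python
# Verify that x = C1*e^t + C2*e^(3t), y = -2*C1*e^t - C2*e^(3t)
# solves x' = 5x + 2y, y' = -4x - y, via a central finite difference in t.
import math

C1, C2, t, h = 0.7, -1.3, 0.4, 1e-6  # arbitrary constants and test point

def x(s): return C1 * math.exp(s) + C2 * math.exp(3 * s)
def y(s): return -2 * C1 * math.exp(s) - C2 * math.exp(3 * s)

xdot = (x(t + h) - x(t - h)) / (2 * h)
ydot = (y(t + h) - y(t - h)) / (2 * h)

assert abs(xdot - (5 * x(t) + 2 * y(t))) < 1e-5
assert abs(ydot - (-4 * x(t) - y(t))) < 1e-5
```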
5.2 More Problems

1. Find the maximum and minimum values of the function

    f(x, y, z) = (x^2 + y^2 + z^2) e^{−(x^2 + y^2 + z^2)}.

(Hint: introduce the new variable u = x^2 + y^2 + z^2; note that u ≥ 0.)
2. a) In the method of least squares of regression theory, the curve y = a + bx is fit to the data (x_i, y_i), i = 1, 2, ..., n by minimizing the sum of squared errors

    S(a, b) = Σ_{i=1}^n (y_i − (a + bx_i))^2

by choice of two parameters, a (the intercept) and b (the slope).
Determine the necessary condition for minimizing S(a, b) by choice of a and b.
Solve for a and b and show that the sufficient conditions are met.
b) The continuous analog of the discrete case of the method of least squares can be set up as follows:
Given a continuously differentiable function f(x), find a, b which solve the problem

    min_{a,b} ∫_0^1 (f(x) − (a + bx))^2 dx.

Again, find the optimal a, b and show that the sufficient conditions are met.
c) Compare your results.
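For part a), the familiar closed form is b = Σ(x_i − x̄)(y_i − ȳ)/Σ(x_i − x̄)^2 and a = ȳ − b·x̄; the sketch below (plain Python, with a small made-up data set) checks that this choice satisfies the two first-order (normal) equations.

```python
# Closed-form least-squares fit for y = a + b*x, checked against the
# normal equations sum(e_i) = 0 and sum(e_i * x_i) = 0.
# The data set is a small made-up example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
assert abs(sum(residuals)) < 1e-9                             # first normal equation
assert abs(sum(r * x for r, x in zip(residuals, xs))) < 1e-9  # second normal equation
```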
3. This result is known in economic analysis as the Le Chatelier Principle. But rise to the challenge and figure out the answer yourself!
Consider the problem

    max_{x_1, x_2} F(x_1, x_2) = f(x_1, x_2) − w_1 x_1 − w_2 x_2,

where the Hessian matrix of the second-order partial derivatives ∂^2f/∂x_i∂x_j, i, j = 1, 2, is assumed negative definite and w_1, w_2 are given positive parameters.
a) What are the first-order conditions for a maximum?
b) Show that x_1 can be solved as a function of w_1, w_2 and that

    ∂x_1/∂w_1 < 0.

c) Suppose that to the problem is added the linear constraint x_2 = b, where b is a given nonzero parameter. Find the new equilibrium and show that:

    ∂x_1/∂w_1 (without added constraint) ≤ ∂x_1/∂w_1 (with added constraint) < 0.
4. Consider the following minimization problem:

    min (x − u)^2 + (y − v)^2   s.t. xy = 1, u + 2v = 1.

a) Find the solution.
b) How can you interpret this problem geometrically? Illustrate your results graphically.
c) What is the solution to this minimization problem if the second constraint is replaced with the constraint u + 2v = 3? In this case, do you really need to use the Lagrange multiplier method?
(Hint: Given two points (x_1, y_1) and (x_2, y_2) in the Euclidean space, what is the meaning of the expression √((x_1 − x_2)^2 + (y_1 − y_2)^2)?)
5. Solve the integral equation

    ∫_0^x (x − t)y(t) dt = ∫_0^x y(t) dt + 2x.

(Hint: take the derivatives of both parts with respect to x and solve the differential equation.)
6. a) Show that Chebyshev's differential equation

    (1 − x^2)y'' − xy' + n^2 y = 0
(where n is constant and |x| < 1) can be reduced to the linear differential equation with constant coefficients

    d^2y/dt^2 + n^2 y = 0

by substituting x = cos t.
b) Prove that the linear differential equation of the second order

    a(x)y''(x) + b(x)y'(x) + c(x)y(x) = 0,

where a(x), b(x), c(x) are continuous functions and a(x) ≠ 0, can be reduced to the equation

    d/dx [p(x) dy/dx] + q(x)y(x) = 0.

(Hint: in other words, you need to prove the existence of a function μ(x) such that p(x) = μ(x)a(x), q(x) = μ(x)c(x) and d/dx(μ(x)a(x)) = μ(x)b(x).)
7. Find the equilibria of the system of differential equations

    ẋ = sin x,  ẏ = sin y,

and classify them. Draw the phase portrait of the system.
5.3 Sample Tests

Sample Test – I

1. Easy problems:
a) Check whether the function f(x, y) = 4x^2 − 2xy + 3y^2, defined for all (x, y) ∈ R^2, is concave, convex, strictly concave, strictly convex or neither. At which point(s) does the function reach its maximum (minimum)?
b) Evaluate the indefinite integral

    ∫ x^3 e^{−x^2} dx.

c) You are given the matrix A and the real number x, x ∈ R:

    A = [ 0  0  2  4 ]
        [ 1  3  0  0 ]
        [ 0  0  1  3 ]
        [ 3  x  0  0 ].

For which values of x is the matrix nonsingular?
d) The function f(x, y) = √(xy) is to be maximized under the constraint x + y = 5. Find the optimal x and y.

2. Using the inverse matrix, solve the system of linear equations

    x + 2y + 3z = 1,
    3x + y + 2z = 2,
    2x + 3y + z = 3.
3. Find the eigenvalues and the corresponding eigenvectors of the following matrix:

    [ 2   2 ]
    [ 2  −1 ].

4. Sketch the graph of the function y(x) = (2 − x)/(x^2 + x − 2). Be as specific as you can.

5. Given the function

    f(x) = (1 − e^{−ax})/(1 + e^{ax}),  a > 1,

a) Write down the F.O.C. of an extremum point.
b) Determine whether an extremum point is a maximum or a minimum.
c) Find dx/da and determine the sign of dx/da.
6. The function f(x, y) = ax + y is to be maximized under the constraints

    x^2 + ay^2 ≤ 1,  x ≥ 0,  y ≥ 0,

where a is a positive real parameter.
a) Use the Lagrange-multiplier method to write down the F.O.C. for the maximum.
b) Taking as given that this problem has a solution, find how the optimal values of x and y change if a increases by a very small number.

7. a) Solve the differential equation y''(x) + 2y'(x) − 3y(x) = 0.
b) Solve the differential equation y''(x) + 2y'(x) − 3y(x) = x − 1.

8. Solve the difference equation x_t − 5x_{t−1} = 3.
9. Consider the optimal control problem

    max_{u(t)} I(u(t)) = ∫_0^{+∞} (x^2(t) − u^2(t)) e^{−rt} dt,

subject to

    dx/dt = au(t) − bx(t),

where a, b and r are positive parameters.
Set up the Hamiltonian function associated with this problem and write down the first-order conditions.
Sample Test – II

1. Easy problems:
a) For what values of p and q is the function

    f(x, y) = 1/x^p + 1/y^q,

defined in the region x > 0, y > 0,
(i) convex,
(ii) concave?
b) Find the area between the curves y = x^2 and y = √x.
c) Apply the characteristic root test to check whether the quadratic form Q = −2x^2 − 2y^2 − z^2 + 4xy is positive (negative) definite.
d) A consumer is known to have a Cobb-Douglas utility of the form u(x, y) = x^α y^{1−α}, where the parameter α is unknown. However, it is known that when faced with the utility maximization problem

    max u(x, y) subject to x + y = 3,

the consumer chooses x = 1, y = 2. Find the value of α.
2. Find the unknown matrix X from the equation

    X · [ 1   1  −1 ]
        [ 2   1   0 ]
        [ 1  −1   1 ]

    =   [ 1  −1   3 ]
        [ 4   3   2 ]
        [ 1  −2   5 ].
3. Find local extrema of the function

    z = (x^2 + y^2) e^{−(x^2 + y^2)}

and classify them.

4. A curve in the (x, y) plane is given parametrically as x = a(t − sin t), y = a(1 − cos t), where a is a real parameter and t ∈ [0, 2π].
Draw the graph of this curve and find the equation of the tangent line at t = π/2.
5. In the method of least squares of regression theory the curve y = a + bx is fit to the data (x_i, y_i), i = 1, 2, ..., n by minimizing the sum of squared errors

    S(a, b) = Σ_{i=1}^n (y_i − (a + bx_i))^2

by the choice of two parameters, a (the intercept) and b (the slope).
Determine the necessary condition for minimizing S(a, b) by the choice of a and b.
Solve for a and b and show that the sufficient conditions are met.
6. Use the Lagrange multiplier method to write the first-order conditions for the maximum of the function

    f(x, y) = ln x + ln y − x − y,  subject to x + cy = 2,

where c is a real parameter, c > 1.
If c is such that these conditions describe the maximum, show what happens with x and y if c increases by a very small number.

7. a) Solve the differential equation y''(x) − 2y'(x) + y(x) = 0.
b) Solve the differential equation y''(x) − 2y'(x) + y(x) = xe^x.

8. Solve the difference equation x_{t+1} − x_t = b^t, t = 0, 1, 2, ....
9. Write down the Kuhn-Tucker conditions for the following problem:

    max_{x_1, x_2} 3x_1 x_2 − x_2^3,

subject to 2x_1 + 5x_2 ≥ 20, x_1 − 2x_2 = 5, x_1 ≥ 0, x_2 ≥ 0.
Sample Test – III

1. Answer the following questions true, false or it depends.
Provide a complete explanation for all your answers.
a) If A is an n × m matrix and B is an m × n matrix, then trace(AB) = trace(BA).
b) If x* is a local maximum of f(x), then f''(x*) < 0.
c) If A is an n × n matrix and rank(A) < n, then the system of linear equations Ax = b has no solutions.
d) If f(x) = −f(−x) for all x, then for any constant a

    ∫_{−a}^{a} f(x) dx = 0.

e) If f(x) and g(x) are continuously differentiable at x = a, then

    lim_{x→a} f(x)/g(x) = lim_{x→a} f'(x)/g'(x).

2. Solve the system of linear equations

    x + y + cz = 1
    x + cy + z = 1
    cx + y + z = 1

where c is a real parameter. Consider all possible cases!
3. Given the matrix

    A = [ 2  1  1 ]
        [ 0  3  α ]
        [ 2  0  1 ],

a) find all values of α for which this matrix has three distinct real eigenvalues.
b) let α = 1. If I tell you that one eigenvalue is λ_1 = 2, find the two other eigenvalues of A and the eigenvector corresponding to λ_1.
4. You are given the function f(x) = xe^{1−x} + (1 − x)e^x.
a) Find lim_{x→−∞} f(x) and lim_{x→+∞} f(x).
b) Write down the first-order necessary condition for a maximum.
c) Using this function, prove that there exists a real number c in the interval (0, 1) such that 1 − 2c = ln c − ln(1 − c).
5. Find the area bounded by the curves y^2 = (4 − x)^3, x = 0, y = 0.

6. Let z = f(x, y) be a homogeneous function of degree n, i.e. f(tx, ty) = t^n f(x, y) for any t. Prove that

    x ∂z/∂x + y ∂z/∂y = nz.
7. Find local extrema of the function

    u(x, y, z) = 2x^2 − xy + 2xz − y + y^3 + z^2

and classify them.

8. Show that the implicit function z = z(x, y), defined as x − mz = φ(y − nz) (where m, n are parameters and φ is an arbitrary differentiable function), solves the differential equation

    m ∂z/∂x + n ∂z/∂y = 1.
Sample Test – IV

1. Answer the following questions true, false or it depends.
Provide a complete explanation for all your answers.
a) If matrices A, B and C are such that AB = AC, then B = C.
b) det(A + B) = det A + det B.
c) A concave function f(x) defined on the closed interval [a, b] reaches its maximum at an interior point of (a, b).
d) If f(x) is a continuous function on [a, b], then the condition

    ∫_a^b f^2(x) dx = 0

implies f(x) ≡ 0 on [a, b].
e) If a matrix A is triangular, then the elements on its principal diagonal are the eigenvalues of A.

2. Easy problems.
a) Find the maxima and minima of the function x^2 + y^2, subject to x^2 + 4y^2 = 1.
b) (back to microeconomics...)
A demand for some good X is described by the linear demand function D = 6 − 4x. The industry supply of this good equals S = √x + 1. Find the value of the consumer and producer surpluses in the equilibrium.
(Hint: The surplus is defined as the area in the positive quadrant, bounded by the demand and supply curves.)
c) Let f_1(x_1, ..., x_n), ..., f_m(x_1, ..., x_n) be convex functions. Prove that the function

    F(x_1, ..., x_n) = Σ_{i=1}^m α_i f_i(x_1, ..., x_n)

is convex if α_i ≥ 0, i = 1, 2, ..., m.
3. Find the absolute minimum and maximum of the function

    z(x, y) = x^2 + y^2 − 2x − 4y + 1

in the domain defined as x ≥ 0, y ≥ 0, 5x + 3y ≤ 15 (a triangle with vertices at (0,0), (3,0) and (0,5)).
(Hint: Find local extrema, then check the function on the boundary and compare.)

4. Consider the following optimization problem:

    extremize f(x, y) = (1/2)x^2 + e^y
    subject to x + y = a,

where a is a real parameter.
Write down the first-order necessary conditions.
Check whether the extremum point is a minimum or a maximum.
What happens with the optimal values of x and y if a increases by a very small number?
(Hint: Don't waste your time trying to solve the first-order conditions.)
5. Solve the differential equation y''(x) − 6y'(x) + 9y = xe^{ax}, where a is a real parameter. Consider all possible cases!

6. a) Solve the system of linear differential equations ẋ = Ax, where

    A = [ 2  −1   1 ]        [ x_1(t) ]
        [ 1   2  −1 ],   x = [ x_2(t) ]
        [ 1  −1   2 ]        [ x_3(t) ].

b) Solve the system of differential equations ẋ = Ax + b, where A is as in part a) and b = (1, 2, 3)'.
c) What is the solution to the problem in part b) if you know that x_1(0) = x_2(0) = x_3(0) = 0?
7. Solve the difference equation 3y_{k+2} + 2y_{k+1} − y_k = 2^k + k.

8. Find the equilibria of the system of nonlinear differential equations and classify them:

    ẋ = ln(1 − y + y^2),
    ẏ = 3 − x^2 + 8y.
106
A CookBook of MAT HEMAT ICS
Viatcheslav Vinogradov
Center for Economic Research and Graduate Education and Economics Institute of the Czech Academy of Sciences, Prague, 1999
CERGEEI 1999 ISBN 8086286207 .
To my Teachers .
My second objective in writing this text was to provide my students with simple “cookbook” recipes for solving problems they might face in their studies of economics. to Aurelia Pontes for excellent editorial support. but hardly exhaustive reference guide at the end of each section. Students were expected to be familiar with the basics of set theory. covers the most signiﬁcant issues of mathematical economics. I have included a useful. All remaining mistakes and misprints are solely mine. the analysis is limited to the case of real numbers. Therefore this text should resemble (in the ‘ideological’ sense) a “hybrid” of Chiang’s classic textbook Fundamental Methods of Mathematical Economics and the comprehensive reference manual by Berck and Sydsæter (Exact references appear at the end of this section). rational. Hannah More. If the reader requires further proofs or more detailed discussion. therefore. I wanted to write a short textbook. polynomial. the theory of complex numbers being beyond the scope of these notes. I did not intend to be entirely original. I strongly believe that for students in economics – for whom this text is meant – the application of mathematics in their studies takes precedence over das Glasperlenspiel of abstract theoretical constructions. since that would be impossible. the realnumber system. to my students who inspired me to write this book. Two considerations motivated me to write this book. the concept of a function. I have attempted to maintain a balance between being overly detailed and overly schematic.He liked those literary cooks Who skim the cream of others’ books And ruin half an author’s graces By plucking bonmots from their places. I tried to beneﬁt from books already in existence and adapted some interesting examples and worthy pieces of theory presented there. in turn. Bearing in mind the applied nature of the course. i . To be candid. On the contrary. 
Florio (1786) Introduction This textbook is based on an extended collection of handouts I distributed to the graduate students in economics attending my summer mathematics class at the Center for Economic Research and Graduate Education (CERGE) at Charles University in Prague. Instead. last but not least. to Sufana Razvan for his helpful assistance. which could be covered in the course of two months and which. it is little wonder that these notes may remind the reader of the other textbooks which have already been written and published. First. I usually refrained from presenting complete proofs of theoretical statements. inequalities and absolute values. my main goal was to refresh students’ knowledge of mathematics rather than teach them math ‘from scratch’. exponential and logarithmic functions. Mathematics is an ancient science and. I chose to allocate more time and space to examples of problems and their solutions and economic applications. Since the target audience was supposed to have some mathematical background (admittance to the program requires at least BA level mathematics). With very few exceptions. Finally. I would like to express my deep gratitude to Professor Jan Kmenta for his valuable comments and suggestions. to Natalka Churikova for her advice and.
Mathematics for Economists: An Elementary Survey. Sydsæter. Fundamental Methods of Mathematical Economics. T.Intriligator. Economist’s Mathematical Manual. vol. Mathematical Economics. Handbook of Mathematical Economics. A. The Structure of Economics: A Mathematical Analysis. • Ostaszewski. I. • Yamane. • Samuelson. • Takayama. ii . and M. • Chiang. E. • Berck P. A.I wish you success in your mathematical kitchen! Bon Appetit ! Supplementary Reading (General): • Arrow. P. K. and K. Foundations of Economic Analysis. Mathematics in Economics: Models and Methods. 1. eds. • Silberberg.
) Example 1 (¬A) ∧ A ⇔ FALSE.) Equivalence: A⇔B ‘A if and only if B’ (A iﬀ B. Negation: ¬A ‘not A’ A∧B A∨B A⇒B ‘A and B’ ‘A or B’ ‘A implies B’ Conjunction: Disjunction: Implication: (A is suﬃcient condition for B.. for short) (A is necessary and suﬃcient for B. (¬(A ∨ B)) ⇔ ((¬A) ∧ (¬B)) (De Morgan rule).Basic notation used in the text: Statements: A. B. B is necessary and suﬃcient for A. .’ and the vertical line  are widely used as abbreviations for ‘such that’ a ∈ S means ‘a is an element of (belongs to) set S’ Example 2 (Deﬁnition of continuity) f is continuous at x if ((∀ > 0)(∃δ > 0) : (∀y ∈ y − x < δ ⇒ f (y) − f (x) < ) Optional information which might be helpful is typeset in footnotesize font. . The symbol ! is used to draw the reader’s attention to potential pitfalls. B is necessary condition for A. iii .. Quantiﬁers: Existential: Universal: Uniqueness: The colon : ∀ ∃! ∃ ‘There exists’ or ‘There is’ ‘For all’ or ‘For every’ ‘There exists a unique . ..’ or ‘There is a unique.. C. True/False: all statements are either true or false.
. . . . . .3 Rules of Diﬀerentiation . . . . 3. . .4 Simultaneous Diﬀerential Equations. . . . . .3 Systems of the First Order Linear Diﬀerential Equations . . . . . . . . . 3. .3 Duality . . . . . . . . 3. . . . . . .1. . . . . . . . . . . . . . . . . 1.4 Linear Transformations and Changes of Bases . . .1. . . . . . . . 4. . . . . . . . 3. . . .1 The Setup of the Problem . . . . . . . . . . . . . . . . . . . . .4 Eigenvalues and Eigenvectors . . 1. . . . . . . . . . . .2 Diﬀerentiation . . . . . . . . . . . . . . . . . . . 4 Dynamics 4. . . . . . . . . . 4.Contents 1 Linear Algebra 1. . . . . . . . . . . . . . . . . . . . . . . . . .1. . . . . . . . . . . . . . Quasiconcavity and Quasiconvexity . . . . .2 The Simplex Method . . .1. .3 Appendix: Linear Programming . 3. . . . . . . . . . . . . . . . . .1 Optimization with Equality Constraints . . . . . . . . . . . . . . . . . .10 Appendix: Matrix Derivatives . . . . . . . . . . . . . . 1. 2. . . . . . . . . . .5 Rank of a Matrix . . . 2. . . . . . . . . . . . . .4 Determinants and a Test for NonSingularity . . . 2. . . . . . . . . . . .3 Independence and Bases . . . . . . . . . . . . . . . . . . . .9 Concavity. . . . . .3. . 2. . .the Case of One Variable . . . . . . . iv 63 .2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1. . . . . 65 68 77 79 . . .3. . . . 1. . . . . . . .1 Matrix Algebra . . . 2. . . . . . . . 1 1 1 2 3 4 7 8 11 13 15 15 17 17 18 21 21 22 24 26 29 32 34 35 38 40 43 43 47 47 49 54 54 55 61 2 Calculus 2. . . . .2 Systems of Linear Equations . . . 3. . . . . . . . . . . . . . . . . . . . . . . 2. . . . . . . . . . . . . . . . . . . . . . . . .2 Vector Subspaces . . . . . . Types of Equilibria . . . . . . . . . . . . . . . . . . . . . . . . . . . 1. . . . . . . . . . . . . . . . . .1. . . . . . . . . . 1. . . . . . . . . . . . . . . . . . . . . .2 The Case of Inequality Constraints . . . . . . . . . . 1. . . . . 1. . . . . . . . . . . . . . 
.2 Linear Diﬀerential Equations of a Higher Order with Constant Coeﬃcients . . . . . . . . . . . . . . . . . . . . 1. . . . . . . .2.5. . . . . . . . . . . . . . . . . . . 63 . . . . . .1 Basic Concepts . . . 4. . . . .4 Maxima and Minima of a Function of One Variable .1 NonLinear Programming . . . . . . . . . . . . . . . . . . . . . . .1.5 Appendix: Vector Spaces . . . 3 Constrained Optimization 3. .3. . . . . . . . . .3 Quadratic Forms . . . .5 Integration (The Case of One Variable) . . .1 Diﬀerential Equations . 1. . 2. . . . . . . . . . . . . . .1 Matrix Operations . . . 1. . . . . . . . 3. . . . . . . . . . . . . . . . . . . . . . 1. . . . . . . . . . .1. . . . . . . .8 The Implicit Function Theorem . . . . 4. . . .5. . . . . . . . . . . . . .7 Unconstrained Optimization in the Case of More than One Variable 2. . .6 Functions of More than One Variable . . . . . . . . .5. . . . . . . . . .3 Inverses and Transposes . . . . . . . . . . . . .2 Diﬀerence Equations . . .1. 4. . . . . . . . .2 Laws of Matrix Operations . . . . . . .5. . . . . 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1. . . . . . . . . . . . . . . . . . .2 KuhnTucker Conditions . 63 . . . . . . .1 Diﬀerential Equations of the First Order . . . .1 The Concept of Limit . . . . . . . . . . . . . . . . . Convexity. . . .
. .4. . . . . . . . . . . . . . . . . . .3 The General Case of Order n . . . . . 80 82 83 85 85 87 87 5 Exercises 93 5. 4. . . . . . . 4. . . . . . . .2 PresentValue and CurrentValue Hamiltonians 4. . . . . . . . . . . . . . . . .2 More Problems . . . . 93 5. . . . . . . . . . . . .2. . . . . .3 Dynamic Problems with Inequality Constraints . . . . . . . . .2. . . . . . . . . . . . 4. . . . . . . . . . . . .3 Sample Tests . . 101 v . . . . . . . . . . . .3. . . . . . . . . . . . . . . . . . .2. . .3. . . . . . . . . .1 Firstorder Linear Diﬀerence Equations . . . . . 99 5. . . . .1 The FirstOrder Conditions .1 Solved Problems . . . . . . . . . . . . 4. . . . . . . . .2 SecondOrder Linear Diﬀerence Equations . . . . . . .3. . . . . . . . . . .3 Introduction to Dynamic Optimization . 4. . . . . . . . . . . . . . . . . .
m and j = 1.1 Matrix Operations • Addition and subtraction: given A = (aij )[m×n] and B = (bij )[m×n] A ± B = (aij ± bij )[m×n] . um . .column ! A shorthand notation is A = (aij ). vn )) or n = 1 – column vectors u= u1 u2 . . v2 . the matrix is the square of order n.1 Linear Algebra Matrix Algebra Deﬁnition 1 An m × n matrix is a rectangular array of real numbers with m rows and n columns. each element of A is multiplied by the same scalar λ. 1 . A= a11 a21 . . . amn . . . . ! Recipe 1 – How to Multiply Two Matrices: In order to get the element cij of matrix C you need to multiply the ith row of matrix A by the jth column of matrix B. Note that these operations are deﬁned only if A and B are of the same dimension. . i = 1. . . Note the dimensions of matrices! where cij = l=1 ail blj . 1. . . . when either m = 1 (row vectors v = (v1 . • Matrix multiplication: if A = (aij )[m×n] and B = (aij )[n×k] then n A · B = C = (cij )[m×k] .1 1. • Scalar multiplication: λA = (λaij ). we simply add or subtract corresponding elements. a12 a22 . . . where λ ∈ R. a1n .e. or A = (aij )[m×n] . . ! i. m × n is called the dimension or order of A. A subscribed element of a matrix is always read as arow. 2. . n. .1. . . . am1 am2 . A vector is a special case of a matrix. . . 2.e. If m = n. i. . . . a2n . .
Example 3
( 3 2 1 ; 4 5 6 ) · ( 9 10 ; 12 11 ; 14 13 ) = ( 3·9+2·12+1·14  3·10+2·11+1·13 ; 4·9+5·12+6·14  4·10+5·11+6·13 ).

Example 4 A system of m linear equations for n unknowns,
a11 x1 + a12 x2 + ... + a1n xn = b1
...
am1 x1 + am2 x2 + ... + amn xn = bm,
can be written as Ax = b, where A = (aij)[m×n], x = (x1, ..., xn)' and b = (b1, ..., bm)'.

Laws of Matrix Operations
• Commutative law of addition: A + B = B + A.
• Associative law of addition: (A + B) + C = A + (B + C).
• Associative law of multiplication: A(BC) = (AB)C.
• Distributive law: A(B + C) = AB + AC (premultiplication by A); (B + C)A = BA + CA (postmultiplication by A).

The commutative law of multiplication is not applicable in the matrix case: in general, AB ≠ BA!!!

Example 5 Let
A = ( 2 0 ; 3 8 ), B = ( 7 2 ; 6 3 ).
Then
AB = ( 14 4 ; 69 30 ) ≠ BA = ( 20 16 ; 21 24 ).

Example 6 Let v = (v1, v2, ..., vn) be a row vector and u = (u1, u2, ..., un)' a column vector. Then vu = v1u1 + v2u2 + ... + vnun is a scalar, and uv = C = (cij)[n×n], with cij = ui vj, is an n by n matrix.

1.1.2 The Identity and Null Matrices

We introduce the identity or unit matrix of dimension n,
In = ( 1 0 ... 0 ; 0 1 ... 0 ; ... ; 0 0 ... 1 ).
Note that In is always a square [n×n] matrix (further on, the subscript n will be omitted). In has the following properties:
a) AI = IA = A; b) AIB = AB for all A, B. In this sense the identity matrix corresponds to 1 in the case of scalars.

The null matrix is a matrix of any dimension for which all elements are zero:
0 = ( 0 ... 0 ; ... ; 0 ... 0 ).
Properties of a null matrix: a) A + 0 = A; b) A + (−A) = 0.

Note that AB = 0 does not imply A = 0 or B = 0, and AB = AC does not imply B = C.

Definition 2 A diagonal matrix is a square matrix whose only nonzero elements appear on the principal (or main) diagonal. A triangular matrix is a square matrix which has only zero elements above or below the principal diagonal.

1.1.3 Inverses and Transposes

Definition 3 We say that B = (bij)[n×m] is the transpose of A = (aij)[m×n] if aji = bij for all i = 1, ..., n and j = 1, ..., m. Usually transposes are denoted as A' (or as A^T).

Recipe 2 – How to Find the Transpose of a Matrix: The transpose A' of A is obtained by making the columns of A into the rows of A'.

Example 7
A = ( 1 2 3 ; 0 a b ),  A' = ( 1 0 ; 2 a ; 3 b ).

Properties of transposition:
a) (A')' = A
b) (A + B)' = A' + B'
c) (αA)' = αA', where α is a real number
d) (AB)' = B'A'. Note the order of the transposed matrices!

Definition 4 If A' = A, A is called symmetric. If A' = −A, A is called antisymmetric (or skew-symmetric). If A' = A and AA = A, A is called idempotent. If A'A = I, A is called orthogonal.
Definition 5 The inverse matrix A⁻¹ is defined by A⁻¹A = AA⁻¹ = I. Note that A as well as A⁻¹ are square matrices of the same dimension (it follows from the necessity to have both products defined).

Example 8 If
A = ( 1 2 ; 3 4 )
then the inverse of A is
A⁻¹ = ( −2 1 ; 3/2 −1/2 ).
We can easily check that
( 1 2 ; 3 4 )·( −2 1 ; 3/2 −1/2 ) = ( −2 1 ; 3/2 −1/2 )·( 1 2 ; 3 4 ) = ( 1 0 ; 0 1 ).

Properties of inversion:
a) (A⁻¹)⁻¹ = A
b) (αA)⁻¹ = (1/α)A⁻¹, where α ≠ 0 is a real number
c) (AB)⁻¹ = B⁻¹A⁻¹. Note the order of the matrices!
d) (A')⁻¹ = (A⁻¹)'

Other important characteristics of inverse matrices and inversion:
• Not all square matrices have inverses. If a square matrix has an inverse, it is called regular or nonsingular; otherwise it is called a singular matrix.
• If A⁻¹ exists, it is unique.
• A⁻¹A = I is equivalent to AA⁻¹ = I.

1.1.4 Determinants and a Test for Non-Singularity

Usually we denote the determinant of A as det(A) or |A|. The formal definition of the determinant is as follows: given an n × n matrix A = (aij),

det(A) = Σ_{(α1,...,αn)} (−1)^{I(α1,...,αn)} a1α1 · a2α2 · ... · anαn,

where (α1, ..., αn) runs over all different permutations of (1, 2, ..., n), and I(α1, ..., αn) is the number of inversions.

For practical purposes, given the fact that the determinant of a scalar is the scalar itself, we can give an alternative recursive definition of the determinant. Here M_lk is the minor of the element a_lk of the matrix A, which is obtained by deleting the lth row and the kth column of A; (−1)^{l+k} det(M_lk) is called the cofactor of the element a_lk. We arrive at the following

Definition 6 (Laplace Expansion Formula)
det(A) = Σ_{k=1}^{n} (−1)^{l+k} a_lk · det(M_lk) for some integer l, 1 ≤ l ≤ n.
Example 9 Given the matrix
A = ( 2 3 0 ; 1 4 5 ; 7 2 1 ),
the minor of the element a23 is M23 = ( 2 3 ; 7 2 ).

Note that in the above expansion formula we expanded the determinant by elements of the lth row. Alternatively, we can expand it by elements of the lth column; thus the Laplace expansion formula can be rewritten as
det(A) = Σ_{k=1}^{n} (−1)^{k+l} a_kl · det(M_kl) for some integer l, 1 ≤ l ≤ n.

Properties of the determinant:
a) det(A · B) = det(A) · det(B).
b) In general, det(A + B) ≠ det(A) + det(B).

Recipe 3 – How to Calculate the Determinant: We can apply the following useful rules:
1. The interchange of any two rows (columns) will alter the sign but not the numerical value of the determinant.
2. The multiplication of any one row (or column) by a scalar k will change the value of the determinant k-fold.
3. If a multiple of any row is added to (or subtracted from) any other row, it will not change the value or the sign of the determinant. The same holds true for columns. (I.e. the determinant is not affected by linear operations with rows (or columns).)
4. If two rows (or columns) are identical, the determinant will vanish.
5. The determinant of a triangular matrix is a product of its principal diagonal elements.

Example 10 The determinant of a 2 × 2 matrix:
det( a11 a12 ; a21 a22 ) = a11 a22 − a12 a21.

Example 11 The determinant of a 3 × 3 matrix:
det( a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ) =
= a11·det( a22 a23 ; a32 a33 ) − a12·det( a21 a23 ; a31 a33 ) + a13·det( a21 a22 ; a31 a32 ) =
= a11a22a33 + a12a23a31 + a13a21a32 − a13a22a31 − a12a21a33 − a11a23a32.
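The Laplace expansion of Definition 6 translates directly into a recursive procedure. A sketch in Python, expanding along the first row (the function name det is mine, not from the text):

```python
def det(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for k in range(n):
        # Minor M_1k: delete row 1 and column k
        minor = [row[:k] + row[k + 1:] for row in A[1:]]
        total += (-1) ** k * A[0][k] * det(minor)
    return total

d2 = det([[1, 2], [3, 4]])                  # 2x2 case of Example 10
d3 = det([[3, -1, 0], [-1, 3, 0], [0, 0, 5]])  # a 3x3 case
```

For large matrices this recursion is exponential; the row-reduction rules of Recipe 3 are far cheaper in practice.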
Using these rules, we can simplify the matrix (e.g. obtain as many zero elements as possible) and then apply the Laplace expansion.

Example 12 Let
A = ( 4 −2 6 2 ; 0 −1 5 −3 ; 2 −1 8 −2 ; 0 0 0 2 ).
We get
det A = 2 · det( 2 −1 3 1 ; 0 −1 5 −3 ; 2 −1 8 −2 ; 0 0 0 2 )   (the first row divided by 2)
= 2 · det( 2 −1 3 1 ; 0 −1 5 −3 ; 0 0 5 −3 ; 0 0 0 2 )   (subtracting the first row from the third row)
= 2 · 2 · (−1) · 5 · 2 = −40.

If we have a block-diagonal matrix, i.e. a partitioned matrix of the form
P = ( P11 P12 ; 0 P22 ),
where P11 and P22 are square matrices, then det(P) = det(P11) · det(P22). If we have a partitioned matrix
P = ( P11 P12 ; P21 P22 ),
where P11, P22 are square matrices, then
det(P) = det(P22) · det(P11 − P12 P22⁻¹ P21) = det(P11) · det(P22 − P21 P11⁻¹ P12).

Proposition 1 (The Determinant Test for Non-Singularity) A matrix A is nonsingular ⇔ det(A) ≠ 0.

As a corollary, we get

Proposition 2 A⁻¹ exists ⇔ det(A) ≠ 0.

Recipe 4 – How to Find an Inverse Matrix: There are two ways of finding inverses.

1. Method of adjoint matrix. Assume that matrix A is invertible, i.e. det(A) ≠ 0. For the computation of the inverse matrix A⁻¹ we use the following algorithm:
A⁻¹ = (dij), where dij = (1/det(A)) · (−1)^{i+j} det(Mji).
Note the order of the indices in Mji! This method is called the "method of adjoint" because we have to compute the so-called adjoint of matrix A, which is defined as the matrix adj A = C' = (Cji), where Cij is the cofactor of the element aij.
2. Gauss elimination method (or pivotal method). An identity matrix is placed alongside the matrix A that is to be inverted. Then the same elementary row operations are performed on both matrices until A has been reduced to an identity matrix. The identity matrix upon which the elementary row operations have been performed will then become the inverse matrix we seek.

Example 13 (method of adjoint)
A = ( a b ; c d )  ⟹  A⁻¹ = (1/(ad − bc)) · ( d −b ; −c a ).

Example 14 (Gauss elimination method) Let A = ( 2 3 ; 4 2 ). Then
( 2 3 | 1 0 ; 4 2 | 0 1 ) ~ ( 1 3/2 | 1/2 0 ; 4 2 | 0 1 ) ~ ( 1 3/2 | 1/2 0 ; 0 −4 | −2 1 ) ~
~ ( 1 3/2 | 1/2 0 ; 0 1 | 1/2 −1/4 ) ~ ( 1 0 | −1/4 3/8 ; 0 1 | 1/2 −1/4 ).
Therefore, the inverse is
A⁻¹ = ( −1/4 3/8 ; 1/2 −1/4 ).

Rank of a Matrix

A linear combination of vectors a1, a2, ..., ak is a sum q1a1 + q2a2 + ... + qkak, where q1, q2, ..., qk are real numbers.

Definition 7 Vectors a1, a2, ..., ak are linearly dependent if and only if there exist numbers c1, c2, ..., ck, not all zero, such that c1a1 + c2a2 + ... + ckak = 0.

Example 15 Vectors a1 = (2, 4)' and a2 = (3, 6)' are linearly dependent: if, say, c1 = 3 and c2 = −2 then c1a1 + c2a2 = (6, 12)' + (−6, −12)' = 0.

Recall that if we have n linearly independent vectors e1, e2, ..., en, they are said to span an n-dimensional vector space or to constitute a basis in an n-dimensional vector space. For more details see the section "Vector Spaces".

Definition 8 The rank of a matrix A, rank(A), can be defined as
– the maximum number of linearly independent rows,
– or the maximum number of linearly independent columns,
– or the order of the largest nonzero minor of A.
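The Gauss elimination method of Recipe 4 can be sketched with exact rational arithmetic, so that Example 14's fractions come out exactly (the helper name inverse is mine, not from the text):

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan: reduce [A | I] until A becomes I; the right block is A^-1."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(1 if i == j else 0) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)  # find a nonzero pivot
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]                # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:                       # eliminate the column
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

Ainv = inverse([[2, 3], [4, 2]])  # the matrix of Example 14
```

The row operations mirror the chain of tableaux in Example 14 step by step.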
Example 16
rank( 1 3 2 ; 2 6 4 ; −5 7 1 ) = 2.
The second row is twice the first, so the rows are linearly dependent, while the first and third rows are linearly independent; therefore the maximum number of linearly independent rows is equal to 2.

Properties of the rank:
• The column rank and the row rank of a matrix are equal.
• rank(AB) ≤ min(rank(A), rank(B)).
• rank(A) = rank(AA') = rank(A'A).

Using the notion of rank, we can reformulate the condition for non-singularity:

Proposition 3 If A is a square matrix of order n, then rank(A) = n ⇔ det(A) ≠ 0.

1.2 Systems of Linear Equations

Consider a system of n linear equations for n unknowns, Ax = b.

Recipe 5 – How to Solve a Linear System Ax = b (general rules):

b = 0 (homogeneous case): If det(A) ≠ 0 then the system has a unique trivial (zero) solution. If det(A) = 0 then the system has an infinite number of solutions.

b ≠ 0 (non-homogeneous case): If det(A) ≠ 0 then the system has a unique solution. If det(A) = 0 then
a) rank(A) = rank(Ã) ⇒ the system has an infinite number of solutions;
b) rank(A) ≠ rank(Ã) ⇒ the system is inconsistent.
Here Ã is the so-called augmented matrix,
Ã = ( a11 ... a1n b1 ; ... ; an1 ... ann bn ).

Recipe 6 – How to Solve the System of Linear Equations, if b ≠ 0 and det(A) ≠ 0:

1. The inverse matrix method: Since A⁻¹ exists, the solution x can be found as x = A⁻¹b.

2. Gauss method: We perform the same elementary row operations on matrix A and vector b until A has been reduced to an identity matrix. The vector b upon which the elementary row operations have been performed will then become the solution.
3. Cramer's rule: We can consecutively find all elements x1, x2, ..., xn of vector x using the following formula:
xj = det(Aj)/det(A),
where
Aj = ( a11 ... a1,j−1 b1 a1,j+1 ... a1n ; ... ; an1 ... an,j−1 bn an,j+1 ... ann ),
i.e. Aj is obtained from A by replacing the jth column with the vector b.

Example 17 Let us solve
( 2 3 ; 4 −1 )·( x1 ; x2 ) = ( 12 ; 10 )
for x1, x2 using Cramer's rule:
det(A) = 2·(−1) − 3·4 = −14,
det(A1) = det( 12 3 ; 10 −1 ) = 12·(−1) − 3·10 = −42,
det(A2) = det( 2 12 ; 4 10 ) = 2·10 − 12·4 = −28,
therefore x1 = −42/−14 = 3, x2 = −28/−14 = 2.

Economics Application 1 (General Market Equilibrium)
Consider a market for three goods. Demand and supply for each good are given by:
D1 = 5 − 2P1 + P2 + P3,   S1 = −4 + 3P1 + 2P2,
D2 = 6 + 2P1 − 3P2 + P3,  S2 = 3 + 2P2,
D3 = 20 + P1 + 2P2 − 4P3, S3 = 3 + P2 + 3P3,
where Pi is the price of good i, i = 1, 2, 3.

The equilibrium conditions are Di = Si, i = 1, 2, 3, that is
5P1 + P2 − P3 = 9
−2P1 + 5P2 − P3 = 3
−P1 − P2 + 7P3 = 17

This system of linear equations can be solved in at least two ways.
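Cramer's rule is easy to implement once a determinant routine is available. A minimal sketch reproducing Example 17 (both function names are mine, not from the text):

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k]
               * det([row[:k] + row[k + 1:] for row in A[1:]])
               for k in range(len(A)))

def cramer(A, b):
    """Solve Ax = b by Cramer's rule: x_j = det(A_j) / det(A)."""
    d = det(A)
    xs = []
    for j in range(len(A)):
        # A_j: replace the jth column of A with b
        Aj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]
        xs.append(det(Aj) / d)
    return xs

# Example 17: 2x1 + 3x2 = 12, 4x1 - x2 = 10
x = cramer([[2, 3], [4, -1]], [12, 10])
```

The two determinants −42 and −28 divided by det(A) = −14 give the solution (3, 2).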
a) Using Cramer's rule:
A = ( 5 1 −1 ; −2 5 −1 ; −1 −1 7 ),  A1 = ( 9 1 −1 ; 3 5 −1 ; 17 −1 7 ),
P1* = |A1|/|A| = 356/178 = 2.
Similarly, P2* = 2 and P3* = 3.

b) Using the inverse matrix rule: Let us denote
A = ( 5 1 −1 ; −2 5 −1 ; −1 −1 7 ),  P = ( P1 ; P2 ; P3 ),  B = ( 9 ; 3 ; 17 ).
The matrix form of the system is AP = B, which implies P = A⁻¹B. Here
A⁻¹ = (1/178) · ( 34 −6 4 ; 15 34 7 ; 7 4 27 ),
P = (1/178) · ( 34 −6 4 ; 15 34 7 ; 7 4 27 ) · ( 9 ; 3 ; 17 ) = ( 2 ; 2 ; 3 ).
Again, P1* = 2, P2* = 2 and P3* = 3. The vector of (P1*, P2*, P3*) describes the general market equilibrium.

Economics Application 2 (Leontief Input-Output Model)
This model addresses the following planning problem: Assume that n industries produce n goods (each industry produces only one good) and the output good of each industry is used as an input in the other n − 1 industries. What are the efficient amounts of output each of the n industries should produce? ('Efficient' means that there will be no shortage and no surplus in producing each good.)

The model is based on an input matrix
A = ( a11 a12 ... a1n ; a21 a22 ... a2n ; ... ; an1 an2 ... ann ),
where aij denotes the amount of good i used to produce one unit of good j.

To simplify the model, let us set the price of each good equal to $1. Then the value of inputs should not exceed the value of output:
Σ_{i=1}^{n} aij ≤ 1, j = 1, ..., n.

In addition, each good is demanded for 'non-input' consumption.
If we denote the additional (non-input) demand for good i by bi, then the optimality condition reads as follows: the demand for each input should equal the supply, that is
xi = Σ_{j=1}^{n} aij xj + bi, i = 1, ..., n,
or x = Ax + b, or (I − A)x = b, where x = (x1, x2, ..., xn)'.

The system (I − A)x = b can be solved using either Cramer's rule or the inverse matrix.

Example 18 (Numerical Illustration) Let
A = ( 0.2 0.2 0.1 ; 0.3 0.3 0.1 ; 0.1 0.2 0.4 ),  b = ( 8 ; 4 ; 5 ).
Thus the system (I − A)x = b becomes
0.8x1 − 0.2x2 − 0.1x3 = 8
−0.3x1 + 0.7x2 − 0.1x3 = 4
−0.1x1 − 0.2x2 + 0.6x3 = 5
Solving it for x1, x2, x3 we find the solution (4210/269, 3950/269, 4260/269).

1.3 Quadratic Forms

Generally speaking, a form is a polynomial expression in which each term has a uniform degree (e.g., L = ax + by + cz, where a, b, c are arbitrary real constants, is an example of a linear form in three variables x, y, z).

Definition 9 A quadratic form Q in n variables x1, x2, ..., xn is a polynomial expression in which each component term has a degree of two (i.e., each term is a product of xi and xj, where i, j = 1, 2, ..., n):
Q = Σ_{i=1}^{n} Σ_{j=1}^{n} aij xi xj,
where aij are real numbers. For convenience, we assume that aij = aji. In matrix notation, Q = x'Ax, where A = (aij) is a symmetric matrix and x = (x1, x2, ..., xn)'.
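Example 18 amounts to solving (I − A)x = b. A sketch with exact rational arithmetic, which reproduces the fractions above (the helper name solve is mine, not from the text):

```python
from fractions import Fraction

def solve(M, b):
    """Gaussian elimination with exact rational arithmetic for a square system."""
    n = len(M)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * c for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]

# (I - A) for the input matrix of Example 18, entered as exact tenths:
I_minus_A = [[Fraction(8, 10), Fraction(-2, 10), Fraction(-1, 10)],
             [Fraction(-3, 10), Fraction(7, 10), Fraction(-1, 10)],
             [Fraction(-1, 10), Fraction(-2, 10), Fraction(6, 10)]]
x = solve(I_minus_A, [8, 4, 5])
```

Using Fraction instead of floats keeps the answer as exact rationals with denominator 269.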
Example 19 A quadratic form in two variables: Q = a11x1² + 2a12x1x2 + a22x2².

Definition 10 A quadratic form Q = x'Ax is said to be
positive definite (PD) if Q > 0 for all x ≠ 0,
negative definite (ND) if Q < 0 for all x ≠ 0,
positive semidefinite (PSD) if Q ≥ 0 for all x,
negative semidefinite (NSD) if Q ≤ 0 for all x;
otherwise Q is called indefinite (ID).

Example 20 Q = x² + y² is PD, Q = (x + y)² is PSD, Q = x² − y² is ID.

Leading principal minors Dk, k = 1, 2, ..., n, of a matrix A = (aij)[n×n] are defined as
Dk = det( a11 ... a1k ; ... ; ak1 ... akk ).

By definition, a principal minor is
det( ai1i1 ... ai1ip ; ... ; aipi1 ... aipip ), where 1 ≤ i1 < i2 < ... < ip ≤ n, p ≤ n.

Proposition 4
1. A quadratic form Q is PD ⇔ Dk > 0 for all k = 1, 2, ..., n.
2. A quadratic form Q is ND ⇔ (−1)^k Dk > 0 for all k = 1, 2, ..., n.
Note that if we replace > by ≥ in the above statement, it does NOT give us the criteria for the semidefinite case!

Proposition 5 A quadratic form Q is PSD (NSD) ⇔ all the principal minors of A are ≥ (≤) 0.

Example 21 Consider the quadratic form Q(x, y, z) = 3x² + 3y² + 5z² − 2xy. The corresponding matrix has the form
A = ( 3 −1 0 ; −1 3 0 ; 0 0 5 ).
Leading principal minors of A are
D1 = 3 > 0,  D2 = det( 3 −1 ; −1 3 ) = 8 > 0,  D3 = det A = 40 > 0;
therefore, the quadratic form is positive definite.
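Proposition 4 suggests a direct mechanical test for positive definiteness. A minimal sketch applied to the matrix of Example 21 (the function names are mine, not from the text):

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k]
               * det([r[:k] + r[k + 1:] for r in A[1:]])
               for k in range(len(A)))

def leading_principal_minors(A):
    """D_k = det of the top-left k x k submatrix, k = 1, ..., n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def is_positive_definite(A):
    # Proposition 4, part 1: all leading principal minors strictly positive
    return all(d > 0 for d in leading_principal_minors(A))

A = [[3, -1, 0], [-1, 3, 0], [0, 0, 5]]  # the matrix of Example 21
minors = leading_principal_minors(A)
```

Note that, as the text warns, this test with ≥ in place of > would not establish semidefiniteness; Proposition 5 needs all principal minors, not only the leading ones.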
1.4 Eigenvalues and Eigenvectors

Definition 11 Any number λ such that the equation
Ax = λx     (1)
has a nonzero vector-solution x is called an eigenvalue (or a characteristic root) of the equation (1).

Definition 12 Any nonzero vector x satisfying (1) is called an eigenvector (or characteristic vector) of A for the eigenvalue λ.

Recipe 7 – How to Calculate Eigenvalues:
Ax − λx = 0 ⇒ (A − λI)x = 0. Since x is nonzero, the determinant of (A − λI) should vanish. Therefore all eigenvalues can be calculated as roots of the equation (which is often called the characteristic equation or the characteristic polynomial of A)
det(A − λI) = 0.

Example 22 Let us consider the quadratic form from Example 21:
0 = det(A − λI) = det( 3−λ −1 0 ; −1 3−λ 0 ; 0 0 5−λ ) = (5 − λ)(λ² − 6λ + 8) = (5 − λ)(λ − 2)(λ − 4) = 0,
therefore the eigenvalues are λ = 2, λ = 4 and λ = 5.

Proposition 6 (Characteristic Root Test for Sign Definiteness.) A quadratic form Q is
positive definite ⇔ eigenvalues λi > 0 for all i = 1, 2, ..., n;
negative definite ⇔ λi < 0 for all i = 1, 2, ..., n;
positive semidefinite ⇔ λi ≥ 0 for all i = 1, 2, ..., n;
negative semidefinite ⇔ λi ≤ 0 for all i = 1, 2, ..., n.
A form is indefinite if at least one positive and one negative eigenvalue exist.

Definition 13 Matrix A is diagonalizable ⇔ P⁻¹AP = D for a nonsingular matrix P and a diagonal matrix D.

Proposition 7 (The Spectral Theorem for Symmetric Matrices) If A is a symmetric matrix of order n and λ1, ..., λn are its eigenvalues, there exists an orthogonal matrix U such that
U⁻¹AU = ( λ1 ... 0 ; ... ; 0 ... λn ).
Usually, U is the normalized matrix formed by the eigenvectors. "Normalized" means that for any column u of the matrix U, u'u = 1. U has the property U'U = I (i.e. U is an orthogonal matrix, U' = U⁻¹). It is essential that A be symmetric!

Example 23 Diagonalize the matrix
A = ( 1 2 ; 2 4 ).

First, we need to find the eigenvalues:
det( 1−λ 2 ; 2 4−λ ) = (1 − λ)(4 − λ) − 4 = λ² − 5λ = λ(λ − 5),
i.e. λ = 0 and λ = 5.

For λ = 0 we solve
( 1−0 2 ; 2 4−0 )·( x1 ; x2 ) = ( 0 ; 0 ),
or x1 + 2x2 = 0, 2x1 + 4x2 = 0. The second equation is redundant, and the eigenvector corresponding to λ = 0 is v1 = C1·(2, −1)', where C1 is an arbitrary real constant.

For λ = 5 we solve
( 1−5 2 ; 2 4−5 )·( x1 ; x2 ) = ( 0 ; 0 ),
or −4x1 + 2x2 = 0, 2x1 − x2 = 0. Thus the general expression for the second eigenvector is v2 = C2·(1, 2)'.

Let us normalize the eigenvectors, i.e. pick the constants Ci such that v1'v1 = 1 and v2'v2 = 1. After normalization we get v1 = (2/√5, −1/√5)', v2 = (1/√5, 2/√5)'.

Thus the diagonalization matrix U is
U = ( 2/√5 1/√5 ; −1/√5 2/√5 ).
You can easily check that
U⁻¹AU = ( 0 0 ; 0 5 ).
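The diagonalization in Example 23 can be verified numerically: since U is orthogonal, U⁻¹ = U', so U'AU should be the diagonal matrix of eigenvalues. A sketch (the helper name mat_mul is mine, not from the text):

```python
import math

A = [[1, 2], [2, 4]]
s = 1 / math.sqrt(5)
# Normalized eigenvectors (2, -1)'/sqrt(5) and (1, 2)'/sqrt(5) as columns:
U = [[2 * s, 1 * s], [-1 * s, 2 * s]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

Ut = [[U[j][i] for j in range(2)] for i in range(2)]  # U' = U^-1 for orthogonal U
D = mat_mul(mat_mul(Ut, A), U)                        # should be diag(0, 5)
```

Up to floating-point rounding, D recovers diag(0, 5) as claimed in the example.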
Some useful results:
• det(A) = λ1 · ... · λn.
• If λ1, ..., λn are eigenvalues of A then f(λ1), ..., f(λn) are eigenvalues of f(A), where f(·) is a polynomial.
• If λ1, ..., λn are eigenvalues of A then 1/λ1, ..., 1/λn are eigenvalues of A⁻¹.
• If we define the trace of a square matrix of order n as the sum of the n elements on its principal diagonal, tr(A) = Σ_{i=1}^{n} aii, then tr(A) = λ1 + ... + λn.
• The rank of a symmetric matrix is the number of nonzero eigenvalues it contains.
• The rank of any matrix A is equal to the number of nonzero eigenvalues of A'A.

Properties of the trace:
a) if A and B are of the same order, tr(A + B) = tr(A) + tr(B);
b) if λ is a scalar, tr(λA) = λ tr(A);
c) tr(AB) = tr(BA), whenever AB is square;
d) tr(A') = tr(A);
e) tr(A'A) = Σ_{i=1}^{n} Σ_{j=1}^{n} aij².

1.5 Appendix: Vector Spaces

1.5.1 Basic Concepts

Definition 14 A (real) vector space is a nonempty set V of objects together with an additive operation +: V × V → V, +(u, v) = u + v, and a scalar multiplicative operation ·: R × V → V, ·(a, u) = au, which satisfies the following axioms for any u, v, w ∈ V and any a, b ∈ R (R is the set of all real numbers):
A1) (u+v)+w = u+(v+w)
A2) u+v = v+u
A3) 0+u = u
A4) u+(−u) = 0
S1) a(u+v) = au+av
S2) (a+b)u = au+bu
S3) a(bu) = (ab)u
S4) 1u = u.
The element 0 ∈ V is the zero vector and −v is the additive inverse of v.

Definition 15 The objects of a vector space V are called vectors; the operations + and · are called vector addition and scalar multiplication, respectively.

Example 24 (The n-Dimensional Vector Space Rn) Define Rn = {(u1, u2, ..., un)' : ui ∈ R, i = 1, ..., n} (the apostrophe denotes the transpose).
Define the additive operation and the scalar multiplication as follows: for u = (u1, u2, ..., un)', v = (v1, v2, ..., vn)' and a ∈ R,
u + v = (u1 + v1, ..., un + vn)',  au = (au1, ..., aun)'.
It is not difficult to verify that Rn together with these operations is a vector space.

Definition 16 Let V be a vector space. An inner product or scalar product in V is a function s: V × V → R, s(u, v) = u·v, which satisfies the following properties:
u·v = v·u;  u·(v + w) = u·v + u·w;  a(u·v) = (au)·v = u·(av);  u·u ≥ 0 and u·u = 0 iff u = 0.

Example 25 Let u, v ∈ Rn. Define u·v = u1v1 + ... + unvn. Then this rule is an inner product in Rn.

Definition 17 Let V be a vector space and ·: V × V → R an inner product in V. The norm or magnitude is a function ||·||: V → R defined as ||v|| = √(v·v).

Example 26 If u ∈ Rn, the norm of u can be introduced as ||u|| = √(u·u) = √(u1² + ... + un²).

Proposition 8 If V is a vector space with an inner product, then for any u, v ∈ V and a ∈ R
i) ||au|| = |a| ||u||;
ii) (Triangle inequality) ||u + v|| ≤ ||u|| + ||v||;
iii) (Schwarz inequality) |u·v| ≤ ||u|| ||v||.

The triangle inequality and Schwarz's inequality in Rn become:
ii) √(Σ_{i=1}^{n}(ui + vi)²) ≤ √(Σ_{i=1}^{n} ui²) + √(Σ_{i=1}^{n} vi²) (Minkowski's inequality for sums);
iii) (Σ_{i=1}^{n} ui vi)² ≤ (Σ_{i=1}^{n} ui²)(Σ_{i=1}^{n} vi²) (Cauchy-Schwarz inequality for sums).

Definition 18
a) The nonzero vectors u and v are parallel if there exists a ∈ R such that u = av.
b) The vectors u and v are orthogonal or perpendicular if their scalar product is zero, that is, if u·v = 0.
c) The angle between the vectors u and v is arccos( u·v / (||u|| ||v||) ).
1.5.2 Vector Subspaces

Definition 19 A nonempty subset S of a vector space V is a subspace of V if for any u, v ∈ S and a ∈ R,
u + v ∈ S and au ∈ S.

Example 27 V is a subspace of itself; {0} is also a subspace of V. Subspaces other than these two are called proper subspaces.

Example 28 L = {(x, y)' : y = mx}, where m ∈ R, is a subspace of R2.

Definition 20 Let u1, u2, ..., uk be vectors in a vector space V. The set S of all linear combinations of these vectors,
S = {a1u1 + a2u2 + ... + akuk : ai ∈ R, i = 1, ..., k},
is called the subspace generated or spanned by the vectors u1, u2, ..., uk and denoted as sp(u1, u2, ..., uk).

Proposition 9 S is a subspace of V.

Example 29 Let u1 = (2, −1, 1)' and u2 = (3, 4, 0)'. Then the subspace of R3 generated by u1 and u2 is
sp(u1, u2) = {au1 + bu2 : a, b ∈ R} = {(2a + 3b, −a + 4b, a)' : a, b ∈ R}.

1.5.3 Independence and Bases

Definition 21 A set {u1, u2, ..., uk} of vectors in a vector space V is linearly dependent if there exist real numbers a1, a2, ..., ak, not all zero, such that a1u1 + a2u2 + ... + akuk = 0. In other words, a set of vectors in a vector space is linearly dependent if and only if one vector can be written as a linear combination of the others.

Definition 22 A set {u1, u2, ..., uk} of vectors in a vector space V is linearly independent if a1u1 + a2u2 + ... + akuk = 0 implies a1 = a2 = ... = ak = 0 (that is, they are not linearly dependent). In other words, the definition says that a set of vectors in a vector space is linearly independent if and only if none of the vectors can be written as a linear combination of the others.

Example 30 The vectors u1 = (2, 1, 1)', u2 = (1, −1, 4)', u3 = (0, 3, −7)' are linearly dependent since u3 = u1 − 2u2.

Proposition 10 Let {u1, u2, ..., un} be n vectors in Rn. The following conditions are equivalent:
i) The vectors are independent.
ii) The matrix having these vectors as columns is nonsingular.
iii) The vectors generate Rn.
Example 31 The vectors u1 = (1, 2, −2)', u2 = (2, −2, 3)', u3 = (−2, 3, 0)' in R3 are linearly independent, since the determinant of the matrix having these vectors as columns is nonzero.

Definition 23 A set {u1, u2, ..., uk} of vectors in V is a basis for V if it, first, is linearly independent and, second, generates V (that is, V = sp(u1, u2, ..., uk)).

Example 32 The vectors from the preceding example, u1, u2 and u3, form a basis for R3.

Example 33 Consider the following vectors in Rn: ei = (0, ..., 0, 1, 0, ..., 0)', where 1 is in the ith position, i = 1, 2, ..., n. The set En = {e1, ..., en} forms a basis for Rn, which is called the standard basis.

Definition 24 Let V be a vector space and B = {u1, u2, ..., uk} a basis for V. Since B generates V, for any u ∈ V there exist real numbers x1, x2, ..., xk such that u = x1u1 + ... + xkuk. The column vector x = (x1, x2, ..., xk)' is called the vector of coordinates of u with respect to B.

Example 34 Consider the vector space Rn with the standard basis En. For any u = (u1, ..., un)' we can represent u as u = u1e1 + ... + unen; (u1, ..., un)' is the vector of coordinates of u with respect to En.

Example 35 Consider the vector space R2. Let us find the coordinate vector of (−1, 2)' with respect to the basis B = {(1, 1)', (2, −3)'} (i.e., (−1, 2)'_B). We have to solve for a, b such that (−1, 2)' = a(1, 1)' + b(2, −3)'. Solving the system
a + 2b = −1
a − 3b = 2
we find a = 1/5, b = −3/5. Thus, (−1, 2)'_B = (1/5, −3/5)'.

Definition 25 The dimension of a vector space V, dim(V), is the number of elements in any basis for V.

Example 36 The dimension of the vector space Rn with the standard basis En is dim(Rn) = n.

1.5.4 Linear Transformations and Changes of Bases

Definition 26 Let U, V be two vector spaces. A linear transformation of U into V is a mapping T: U → V such that for any u, v ∈ U and any a, b ∈ R,
T(au + bv) = aT(u) + bT(v).

Example 37 Let A be an m × n real matrix. The mapping T: Rn → Rm defined by T(u) = Au is a linear transformation.
Any set of n linearly independent vectors in Rn forms a basis for Rn.
Example 38 (Rotation of the plane) The function TR: R2 → R2 that rotates the plane counterclockwise through a positive angle α is a linear transformation. To check this, first note that any two-dimensional vector u ∈ R2 can be expressed in polar coordinates as u = (r cos θ, r sin θ)', where r = ||u|| and θ is the angle that u makes with the x-axis of the system of coordinates. The mapping TR is thus defined by
TR(u) = (r cos(θ + α), r sin(θ + α))'.
Therefore,
TR(u) = ( r(cos θ cos α − sin θ sin α) ; r(sin θ cos α + cos θ sin α) ),
or, alternatively,
TR(u) = ( cos α −sin α ; sin α cos α )·( r cos θ ; r sin θ ) = Au,
where
A = ( cos α −sin α ; sin α cos α )
(the rotation matrix). From Example 37 it follows that TR is a linear transformation.

Any linear transformation is uniquely determined by a transformation of coordinates.

Proposition 11 Let U and V be two vector spaces, B = (b1, ..., bn) a basis for U and C = (c1, ..., cm) a basis for V.
• Any linear transformation T can be represented by an m × n matrix AT whose ith column is the coordinate vector of T(bi) relative to C.
• If x = (x1, ..., xn)' is the coordinate vector of u ∈ U relative to B and y = (y1, ..., ym)' is the coordinate vector of T(u) relative to C, then T defines the following transformation of coordinates: y = AT x for any u ∈ U.

Definition 27 The matrix AT is called the matrix representation of T relative to bases B, C.

Example 39 Consider the linear transformation T: R3 → R2, T((x, y, z)') = (x − 2y, x + z)', and bases B = {(1, 1, 1)', (1, 1, 0)', (1, 0, 0)'} for R3 and C = {(1, 1)', (1, 0)'} for R2. How can we find the matrix representation of T relative to bases B, C? We have:
T((1, 1, 1)') = (−1, 2)',  T((1, 1, 0)') = (−1, 1)',  T((1, 0, 0)') = (1, 1)'.
The columns of AT are formed by the coordinate vectors of T((1, 1, 1)'), T((1, 1, 0)'), T((1, 0, 0)') relative to C. Applying the procedure developed in Example 35, we find
AT = ( 2 1 1 ; −3 −2 0 ).
Definition 28 (Changes of Bases) Let V be a vector space of dimension n, let B, C be two bases for V, and let I: V → V be the identity transformation (I(v) = v for all v ∈ V). The change-of-basis matrix D relative to B, C is the matrix representation of I relative to B, C.

Example 40 For u ∈ V, let x = (x1, ..., xn)' be the coordinate vector of u relative to B and y = (y1, ..., yn)' the coordinate vector of u relative to C. If D is the change-of-basis matrix relative to B, C, then y = Dx. The change-of-basis matrix relative to C, B is D⁻¹.

Example 41 Given the following bases for R2: B = {(1, 1)', (1, 0)'} and C = {(0, 1)', (1, 1)'}, find the change-of-basis matrix D relative to B, C. The columns of D are the coordinate vectors of (1, 1)', (1, 0)' relative to C. Following Example 35 we find
D = ( 0 −1 ; 1 1 ).

Proposition 12 Let T: V → V be a linear transformation and B, C two bases for V. If A1 is the matrix representation of T in the basis B, A2 is the matrix representation of T in the basis C, and D is the change-of-basis matrix relative to C, B, then A2 = D⁻¹A1D.

Further Reading:
• Bellman, R. Introduction to Matrix Analysis.
• Fraleigh, J.B. and R.A. Beauregard. Linear Algebra.
• Gantmacher, F.R. The Theory of Matrices.
• Lang, S. Linear Algebra.
2 Calculus

2.1 The Concept of Limit

Definition 29 The function f(x) has a limit A (or tends to A as a limit) as x approaches a if for each given number ε > 0, no matter how small, there exists a positive number δ (that depends on ε) such that |f(x) − A| < ε whenever 0 < |x − a| < δ. The standard notation is lim_{x→a} f(x) = A or f(x) → A as x → a.

Definition 30 The function f(x) has a left-side (or right-side) limit A as x approaches a from the left (or right), lim_{x→a−} f(x) = A (lim_{x→a+} f(x) = A), if for each given number ε > 0 there exists a positive number δ such that |f(x) − A| < ε whenever a − δ < x < a (a < x < a + δ).

Recipe 8 – How to Calculate Limits: We can apply the following basic rules: if lim_{x→x0} f(x) = A and lim_{x→x0} g(x) = B then
1. lim_{x→x0} (f(x) + g(x)) = lim_{x→x0} f(x) + lim_{x→x0} g(x) = A + B;
2. lim_{x→x0} f(x)g(x) = lim_{x→x0} f(x) · lim_{x→x0} g(x) = A · B;
3. lim_{x→x0} f(x)/g(x) = lim_{x→x0} f(x) / lim_{x→x0} g(x) = A/B, provided B ≠ 0;
4. lim_{x→x0} (f(x))^n = (lim_{x→x0} f(x))^n = A^n;
5. lim_{x→x0} f(x)^{g(x)} = e^{lim_{x→x0} g(x) ln f(x)} = e^{B ln A};
6. if C is constant, then lim_{x→x0} C = C.

Some important results:
lim_{x→0} (sin x)/x = 1;
lim_{x→∞} (1 + 1/x)^x = e = 2.718281828459...;
lim_{x→0} (e^x − 1)/x = 1;
lim_{x→0} (a^x − 1)/x = ln a, a > 0;
lim_{x→0} ln(1 + x)/x = 1;
lim_{x→0} ((1 + x)^p − 1)/x = p.

Example 42
a) lim_{x→1} (x³ − 1)/(x − 1) = lim_{x→1} (x − 1)(x² + x + 1)/(x − 1) = lim_{x→1} (x² + x + 1) = 1 + 1 + 1 = 3.
b) lim_{x→0} (sin²x)/x = lim_{x→0} ((sin x)/x · sin x) = lim_{x→0} (sin x)/x · lim_{x→0} sin x = 1 · 0 = 0.
c) lim_{x→∞} x² ln √((x² + 1)/x²) = (1/2) lim_{x→∞} x² ln((x² + 1)/x²) = (1/2) lim_{x→∞} x² ln(1 + 1/x²)
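The standard limits above are easy to probe numerically by evaluating the expression at points that approach the limit point. A small sketch (the helper name approach is mine, not from the text):

```python
import math

def approach(f, steps=6):
    """Evaluate f at x = 0.1, 0.01, ..., approaching 0 from the right."""
    return [f(10.0 ** -k) for k in range(1, steps + 1)]

# lim_{x->0} sin(x)/x = 1 and lim_{x->0} (e^x - 1)/x = 1:
sinc_vals = approach(lambda x: math.sin(x) / x)
exp_vals = approach(lambda x: (math.exp(x) - 1) / x)
```

Such numerical probing only suggests the value of a limit; the algebraic rules of Recipe 8 are what actually establish it.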
= (1/2) lim_{x→∞} ln(1 + 1/x²)/(1/x²) = 1/2.
d) lim_{x→0+} x^{1/(1+ln x)} = e^{lim_{x→0+} (ln x)/(1+ln x)} = e^{lim_{x→0+} 1/(1/ln x + 1)} = e¹ = e.

Another powerful tool for evaluating limits is L'Hôpital's rule.

Proposition 13 (L'Hôpital's Rule) Suppose f(x) and g(x) are differentiable¹ in an interval (a, b) around x0, except possibly at x0, and suppose that f(x) and g(x) both approach 0 when x approaches x0. If g'(x) ≠ 0 for all x ≠ x0 in (a, b) and lim_{x→x0} f'(x)/g'(x) = A, then
lim_{x→x0} f(x)/g(x) = lim_{x→x0} f'(x)/g'(x) = A.
The same rule applies if f(x) → ±∞, g(x) → ±∞; x0 can be either finite or infinite. Note that L'Hôpital's rule can be applied only if we have expressions of the form 0/0 or ±∞/±∞.

Example 43
a) lim_{x→0} (x − sin x)/x³ = lim_{x→0} (1 − cos x)/(3x²) = lim_{x→0} (sin x)/(6x) = (1/6) lim_{x→0} (sin x)/x = 1/6.
b) lim_{x→0+} (x − sin 2x)/(3x²) = lim_{x→0+} (1 − 2 cos 2x)/(6x) = −∞, since the numerator tends to −1 while the denominator tends to 0 from above.

¹For the definition of differentiability see the next section.

2.2 Differentiation – the Case of One Variable

Definition 31 A function f(x) is continuous at x = a if lim_{x→a} f(x) = f(a).

If f(x) and g(x) are continuous at a then:
• f(x) ± g(x) and f(x)g(x) are continuous at a;
• if g(a) ≠ 0 then f(x)/g(x) is continuous at a;
• if g(x) is continuous at a and f(x) is continuous at g(a) then f(g(x)) is continuous at a.
If lim_{x→a+} f(x) = c1 ≠ c2 = lim_{x→a−} f(x), |c1|, |c2| < ∞, the function f(x) is said to have a jump discontinuity at a. If lim_{x→a} f(x) = ±∞, we call this type of discontinuity an infinite discontinuity.

In general, any function built from continuous functions by additions, subtractions, multiplications, divisions and compositions is continuous where defined.

Suppose there is a functional relationship between x and y, y = f(x). One of the natural questions one may ask is: How does y change if x changes? We can answer this question using the notion of the difference quotient. Denoting the change in x as Δx = x − x0, the difference quotient is defined as
Δy/Δx = (f(x0 + Δx) − f(x0))/Δx.
Taking the limit of the above expression, we arrive at the following definition.

Definition 32 The derivative of f(x) at x0 is
f'(x0) = lim_{Δx→0} (f(x0 + Δx) − f(x0))/Δx,
provided that this limit exists. If the limit exists, f is called differentiable at x0.

An alternative notation for the derivative often found in textbooks is
f'(x) = y' = df/dx = df(x)/dx.

The derivatives of higher order (2, ..., n) can be defined in the same manner:
f^(n)(x0) = d/dx f^(n−1)(x0).
The symbol d^n/dx^n denotes an operator of taking the nth derivative of a function with respect to x. The first and second derivatives with respect to time are usually denoted by dots, i.e. if z = z(t) then ż = dz/dt and z̈ = d²z/dt².

Geometrically speaking, the derivative represents the slope of the tangent line to f at x. Using the derivative, the equation of the tangent line to f(x) at x0 can be written as
y = f(x0) + f'(x0)(x − x0).

The geometric interpretation of the derivative gives us the following formula (usually called Newton's approximation method), which allows us to find an approximate root of f(x) = 0. Let us define a sequence
x_{n+1} = x_n − f(x_n)/f'(x_n).
If the initial value x0 is chosen such that x0 is reasonably close to an actual root, this sequence will converge to that root.
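Newton's approximation method from this section can be sketched in a few lines of Python (the function names and the stopping tolerance are mine, not from the text):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step becomes tiny."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Approximate sqrt(2) as the positive root of f(x) = x^2 - 2,
# starting reasonably close to the root as the text requires:
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
```

A poorly chosen starting value can make the iteration diverge or find a different root, which is why the text insists that x0 be reasonably close to an actual root.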
˙ ¨ dt dt The set of all continuously diﬀerentiable functions in the domain D (i. Note that the continuity of f (x) is a necessary but NOT suﬃcient condition for its diﬀerentiability! Example 44 For instance. One of the natural questions one may ask is: How does y change if x changes? We can answer this question using the notion of the diﬀerence quotient. i..3. Suppose there is a functional relationship between x and y.In general. If limx→a f (x) = ±∞.e. z = 2 . f (xn ) ! If the initial value x0 is chosen such that x0 is reasonably close to an actual root.
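Newton's approximation method translates directly into code; a minimal Python sketch (the test function, starting point and tolerance below are illustrative choices, not part of the original text):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's approximation method: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Approximate root of f(x) = x^2 - 2 starting reasonably close, at x0 = 1.5
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.5)
print(root)  # converges to sqrt(2)
```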
2.3 Rules of Differentiation

Suppose we have two differentiable functions of the same variable x, say f(x) and g(x). Then
• (f(x) ± g(x))' = f'(x) ± g'(x);
• (f(x)g(x))' = f'(x)g(x) + f(x)g'(x) (product rule);
• (f(x)/g(x))' = (f'(x)g(x) − f(x)g'(x))/g²(x) (quotient rule);
• if f = f(y) and y = g(x) then

df/dx = (df/dy)(dy/dx) = f'(y)g'(x)

(chain rule or composite function rule). This rule can be easily extended to the case of more than two functions involved.

Example 45 (Application of the Chain Rule) Let f(x) = √(x² + 1). We can decompose f as f = √y, where y = x² + 1. Then

df/dx = (df/dy)(dy/dx) = (1/(2√y))·2x = x/√y = x/√(x² + 1).

• if x = u(t) and y = v(t) (i.e. x and y are parametrically defined) then

dy/dx = (dy/dt)/(dx/dt) = v̇(t)/u̇(t) (recall that ˙ means d/dt),

d²y/dx² = (d/dx)(v̇(t)/u̇(t)) = [(d/dt)(v̇(t)/u̇(t))] / (dx/dt) = (v̈(t)u̇(t) − v̇(t)ü(t))/(u̇(t))³.

Example 46 If x = a(t − sin t), y = a(1 − cos t), where a is a parameter, then

dy/dx = sin t/(1 − cos t),
d²y/dx² = (cos t(1 − cos t) − sin²t)/(a(1 − cos t)³) = (cos t − 1)/(a(1 − cos t)³) = −1/(a(1 − cos t)²).

Some special rules:
f(x) = constant ⇒ f'(x) = 0
f(x) = x^a (a is constant) ⇒ f'(x) = ax^{a−1}
f(x) = e^x ⇒ f'(x) = e^x
f(x) = a^x (a > 0) ⇒ f'(x) = a^x ln a
f(x) = ln x ⇒ f'(x) = 1/x
f(x) = log_a x (a > 0, a ≠ 1) ⇒ f'(x) = (1/x) log_a e = 1/(x ln a)
f(x) = sin x ⇒ f'(x) = cos x
f(x) = cos x ⇒ f'(x) = −sin x
f(x) = tg x ⇒ f'(x) = 1/cos²x
f(x) = ctg x ⇒ f'(x) = −1/sin²x
f(x) = arcsin x ⇒ f'(x) = 1/√(1 − x²)
f(x) = arccos x ⇒ f'(x) = −1/√(1 − x²)
f(x) = arctg x ⇒ f'(x) = 1/(1 + x²)
f(x) = arcctg x ⇒ f'(x) = −1/(1 + x²)
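The parametric rule of Example 46 can be checked numerically: approximate u̇ and v̇ by central differences and compare dy/dx with the closed form sin t/(1 − cos t). A Python sketch (the values of a and t0 are arbitrary illustrative choices, not from the original text):

```python
import math

a, t0 = 1.0, 1.3   # illustrative parameter value and evaluation point
h = 1e-6

x = lambda t: a * (t - math.sin(t))
y = lambda t: a * (1 - math.cos(t))

# dy/dx = (dy/dt)/(dx/dt), with the time derivatives taken numerically
xdot = (x(t0 + h) - x(t0 - h)) / (2 * h)
ydot = (y(t0 + h) - y(t0 - h)) / (2 * h)
slope = ydot / xdot

closed_form = math.sin(t0) / (1 - math.cos(t0))
print(slope, closed_form)  # the two values agree
```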
More hints:
• if y = ln f(x) then y' = f'(x)/f(x);
• if y = e^{f(x)} then y' = f'(x)e^{f(x)};
• if y = f(x)^{g(x)} then

y' = (f(x)^{g(x)})' = (e^{g(x) ln f(x)})' = e^{g(x) ln f(x)}·(g(x) ln f(x))'  (by chain rule)
   = f(x)^{g(x)}·(g'(x) ln f(x) + g(x)f'(x)/f(x))  (by product rule);

• if a function f is a one-to-one mapping (or single-valued mapping)², it has the inverse f⁻¹. Thus, if y = f(x) and x = f⁻¹(y) then

dx/dy = 1/(dy/dx), in other words, (f⁻¹(y))' = 1/f'(f⁻¹(y)).

Example 47 Given y = ln x, its inverse is x = e^y. Therefore

dx/dy = 1/(dy/dx) = 1/(1/x) = x = e^y.

• if a function f is a product (or quotient) of a number of other functions, the logarithmic function might be helpful while taking the derivative of f. If

y = f(x) = g1(x)g2(x)···gk(x) / (h1(x)h2(x)···hl(x)),

then after taking the (natural) logarithms of the both sides and rearranging the terms we get

dy/dx = y(x)·(g1'(x)/g1(x) + ... + gk'(x)/gk(x) − h1'(x)/h1(x) − ... − hl'(x)/hl(x)).

Definition 33 If y = f(x) and dx is any number then the differential of y is defined as dy = f'(x)dx.

The rules of differentials are similar to those of derivatives:
• If k is constant then dk = 0;
• d(u ± v) = du ± dv;
• d(uv) = v·du + u·dv;
• d(u/v) = (v·du − u·dv)/v².

Differentials of higher order can be found in the same way as derivatives.

² In the one-dimensional case the set of all one-to-one mappings coincides with the set of strictly monotonic functions.
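Logarithmic differentiation is easy to sanity-check numerically. In the Python sketch below the product x²·sin x/(1 + x) is an illustrative choice (not from the original text); its derivative via the formula above is compared with a central-difference approximation:

```python
import math

y = lambda x: x**2 * math.sin(x) / (1 + x)

x0 = 0.7  # illustrative evaluation point
# dy/dx = y * (g1'/g1 + g2'/g2 - h1'/h1) with g1 = x^2, g2 = sin x, h1 = 1 + x
log_deriv = y(x0) * (2 / x0 + math.cos(x0) / math.sin(x0) - 1 / (1 + x0))

h = 1e-6
numeric = (y(x0 + h) - y(x0 - h)) / (2 * h)
print(log_deriv, numeric)  # the two values agree
```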
2.4 Maxima and Minima of a Function of One Variable

Definition 34 If f(x) is continuous in a neighborhood U of a point x0, it is said to have a local or relative maximum (minimum) at x0 if for all x ∈ U, x ≠ x0, f(x) < f(x0) (f(x) > f(x0)).

Some useful results:

Proposition 14 (Weierstrass's Theorem) If f is continuous on a closed and bounded subset S of R¹ (or, in the general case, of Rⁿ), then f reaches its maximum and minimum in S, i.e. there exist points m, M ∈ S such that f(m) ≤ f(x) ≤ f(M) for all x ∈ S.

Proposition 15 (Fermat's Theorem or the Necessary Condition for Extremum) If f(x) is differentiable at x0 and has a local extremum (minimum or maximum) at x0 then f'(x0) = 0.

Note that if the first derivative vanishes at some point, it does not imply that at this point f possesses an extremum. We can only state that f has a stationary point.

Assume that f(x) is differentiable in a neighborhood of x0 and it has a stationary point at x0, i.e. f'(x0) = 0. Recall the meaning of the first and the second derivatives of a function f. The sign of the first derivative tells us whether the value of the function increases (f' > 0) or decreases (f' < 0), whereas the sign of the second derivative tells us whether the slope of the function increases (f'' > 0) or decreases (f'' < 0). This gives us an insight into how to verify that at a stationary point we have a maximum or minimum.

Proposition 16 (Rolle's Theorem) If f is continuous in [a, b], differentiable in (a, b) and f(a) = f(b), then there exists at least one point c ∈ (a, b) such that f'(c) = 0.

Proposition 17 (Lagrange's Theorem or the Mean Value Theorem) If f is continuous in [a, b] and differentiable in (a, b), then there exists at least one point c ∈ (a, b) such that

f'(c) = (f(b) − f(a))/(b − a).

If we denote a = x0, b = x0 + Δx, then we may rewrite Lagrange's theorem as

f(x0 + Δx) − f(x0) = Δx·f'(x0 + θΔx), where θ ∈ (0, 1).

The geometric meaning of Lagrange's theorem: the slope of the tangent line to f(x) at the point c coincides with the slope of the straight line connecting the two points (a, f(a)) and (b, f(b)). [Figure omitted: the chord from (a, f(a)) to (b, f(b)) and the parallel tangent line at c.]

This statement can be generalized in the following way:

Proposition 18 (Cauchy's Theorem or the Generalized Mean Value Theorem) If f and g are continuous in [a, b] and differentiable in (a, b), then there exists at least one point c ∈ (a, b) such that

(f(b) − f(a))g'(c) = (g(b) − g(a))f'(c).
Proposition 19 (The First-Derivative Test for Local Extremum) If at a stationary point x0 the first derivative of a function f
a) changes its sign from positive to negative,
b) changes its sign from negative to positive,
c) does not change its sign,
then the value of the function at x0, f(x0), will be
a) a local maximum,
b) a local minimum,
c) neither a local maximum nor a local minimum.

Proposition 20 (Second-Derivative Test for Local Extremum) A stationary point x0 of f(x) will be a local maximum if f''(x0) < 0 and a local minimum if f''(x0) > 0.

However, it may happen that f''(x0) = 0; therefore the second-derivative test is not applicable. To compensate for this, we can extend the latter result and apply the following general test:

Proposition 21 (nth-Derivative Test) If f'(x0) = f''(x0) = ... = f^{(n−1)}(x0) = 0, f^{(n)}(x0) ≠ 0 and f^{(n)} is continuous at x0, then at point x0 f(x) has
a) an inflection point if n is odd;
b) a local maximum if n is even and f^{(n)}(x0) < 0;
c) a local minimum if n is even and f^{(n)}(x0) > 0.

To prove this statement we can use the Taylor series: If f is continuously differentiable enough at x0, it can be expanded around a point x0, i.e. this function can be transformed into a polynomial form in which the coefficients of the various terms are expressed in terms of the values of the derivatives of f, all evaluated at the point of expansion x0:

f(x) = Σ_{i=0}^{∞} (f^{(i)}(x0)/i!)(x − x0)^i

or

f(x) = Σ_{i=0}^{n} (f^{(i)}(x0)/i!)(x − x0)^i + Rn,

where

Rn = (f^{(n+1)}(x0 + θ(x − x0))/(n + 1)!)(x − x0)^{n+1}, 0 < θ < 1,

is the remainder in the Lagrange form. Here k! = 1·2·...·k, 0! = 1, and f^{(0)}(x0) = f(x0).

The Maclaurin series is the special case of the Taylor series when we set x0 = 0.

Example 48 Expand e^x around x = 0. Since (dⁿ/dxⁿ)e^x = e^x for all n and e⁰ = 1,

e^x = 1 + x/1! + x²/2! + x³/3! + ... = Σ_{i=0}^{∞} x^i/i!.
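The partial sums of the Maclaurin series of e^x converge quickly; a short Python sketch (the expansion point x = 1 and the truncation orders are illustrative choices, not from the original text):

```python
import math

def maclaurin_exp(x, n):
    """Partial sum sum_{i=0}^{n} x^i / i! of the Maclaurin series of e^x."""
    return sum(x**i / math.factorial(i) for i in range(n + 1))

x = 1.0
approximations = [maclaurin_exp(x, n) for n in (2, 5, 10)]
print(approximations, math.e)  # the partial sums approach e
```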
The expansion of a function into Taylor series is useful as an approximation device. For instance, when dx is small, we can derive the following approximation:

f(x + dx) ≈ f(x) + f'(x)dx.

Note, however, that in general polynomial approximation is not the most efficient one.

Note that the first-derivative test gives us only relative (or local) extrema. It may happen (due to Weierstrass's theorem) that f reaches its global maximum at a border point.

Recipe 9 – How to Find (Global) Maximum or Minimum Points: If f(x) has a maximum or minimum in a subset S (of R¹ or, in general, of Rⁿ), then we need to check the following points for maximum/minimum values of f:
– interior points of S that are stationary, i.e. points given by f'(x) = 0;
– points at which f is not differentiable;
– extrema of f at the boundary of S.

Example 49 Find the maximum of f(x) = x³ − 3x in [−2, 3]. The first-derivative test gives two stationary points: f'(x) = 3x² − 3 = 0 at x = 1 and x = −1. The second-derivative test guarantees that x = −1 is a local maximum. However, f(−1) = 2 < f(3) = 18. Therefore the global maximum of f in [−2, 3] is reached at the border point x = 3.

Recipe 10 – How to Sketch the Graph of a Function: Given f, you should perform the following actions:
1. Check at which points our function is continuous, differentiable, etc.
2. Find its asymptotes: vertical asymptotes (finite x's at which f(x) = ∞) and non-vertical asymptotes. By definition, y = ax + b is a non-vertical asymptote for the curve y = f(x) if lim_{x→∞}(f(x) − (ax + b)) = 0.

How to find an asymptote for the curve y = f(x) as x → ∞:
– Examine lim_{x→∞} f(x)/x; if the limit does not exist, there is no asymptote as x → ∞.
– If lim_{x→∞} f(x)/x = a, examine lim_{x→∞} (f(x) − ax); if the limit does not exist, there is no asymptote as x → ∞.
– If lim_{x→∞} (f(x) − ax) = b, the straight line y = ax + b is an asymptote for the curve y = f(x) as x → ∞.

3. Find all stationary points and check whether they are local minima, local maxima or neither.
4. Find intervals in which f is increasing (decreasing).
5. Find all points of inflection, i.e. points at which the second derivative changes its sign. Note that a zero first derivative value is not required for an inflection point.
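Recipe 9 applied to Example 49 amounts to comparing f at the candidate points (the stationary points and the border points); a minimal Python sketch:

```python
f = lambda x: x**3 - 3*x

# candidates from Recipe 9: stationary points where f'(x) = 3x^2 - 3 = 0,
# plus the border points of the interval [-2, 3]
candidates = [-1.0, 1.0, -2.0, 3.0]
best = max(candidates, key=f)
print(best, f(best))  # the border point x = 3 attains the global maximum 18
```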
6. Find intervals in which f is convex (concave).

Definition 35 f is called convex (concave) at x0 if f''(x0) ≥ 0 (f''(x0) ≤ 0), and strictly convex (concave) if the inequalities are strict.

Example 50 Sketch the graph of the function y = (2x − 1)/(x − 1)².
a) y(x) is defined for x ∈ (−∞, 1) ∪ (1, +∞); it is not defined at x = 1.
b) y(x) is continuous in (−∞, 1) ∪ (1, +∞).
c) y(x) → 0 if x → ±∞, therefore y = 0 is the horizontal asymptote.
d) y(x) → ∞ if x → 1, therefore x = 1 is the vertical asymptote.
e) y'(x) = −2x/(x − 1)³; y'(x) is not defined at x = 1; it is positive if x ∈ (0, 1) (the function is increasing) and negative if x ∈ (−∞, 0) ∪ (1, +∞) (the function is decreasing); y' = 0 ⇒ x = 0, so x = 0 is the unique extremum (minimum), y(0) = −1.
f) y''(x) = 2(2x + 1)/(x − 1)⁴; it is positive at x ∈ (−1/2, 1) ∪ (1, +∞) (the function is convex) and negative at x ∈ (−∞, −1/2) (the function is concave); x = −1/2 is the unique point of inflection, y(−1/2) = −8/9.
g) Intersections with the axes are given by x = 0 ⇒ y(x) = −1 and y = 0 ⇒ x = 1/2.

Finally, the sketch of the graph looks like this: [Figure omitted: the graph of y = (2x − 1)/(x − 1)² with the marked points x = −1/2, 0, 1/2 and the asymptotes x = 1 and y = 0.]

2.5 Integration (The Case of One Variable)

Definition 36 Let f(x) be a continuous function. The indefinite integral of f (denoted by ∫f(x)dx) is defined as

∫f(x)dx = F(x) + C,

where F(x) is such that F'(x) = f(x), and C is an arbitrary constant.

Rules of integration:
• ∫(af(x) + bg(x))dx = a∫f(x)dx + b∫g(x)dx, a, b are constants (linearity of the integral);
• ∫f'(x)g(x)dx = f(x)g(x) − ∫f(x)g'(x)dx (integration by parts);
• ∫f(u(t))(du/dt)dt = ∫f(u)du (integration by substitution).

Some special rules:
∫(f'(x)/f(x))dx = ln f(x) + C,  ∫(1/x)dx = ln x + C,
∫f'(x)e^{f(x)}dx = e^{f(x)} + C,  ∫e^x dx = e^x + C,
∫x^a dx = x^{a+1}/(a + 1) + C, a ≠ −1,
∫a^x dx = a^x/ln a + C, a > 0.

See any calculus reference-book for tables of integrals of basic functions.

Example 51
a) ∫((x² + 2x + 1)/x)dx = ∫x dx + ∫2 dx + ∫(1/x)dx = x²/2 + 2x + ln x + C.
b) ∫xe^{−x²}dx = −(1/2)∫(−2x)e^{−x²}dx = (substitution z = x²) = −(1/2)∫e^{−z}dz = −e^{−x²}/2 + C.
c) ∫xe^x dx = (by parts, f(x) = x, g'(x) = e^x) = xe^x − ∫e^x dx = xe^x − e^x + C.

The indefinite integral is a function. The definite integral is a number! We understand the definite integral in the Riemann sense: Given a partition a = x0 < x1 < ... < xn = b and numbers ζi ∈ [xi, xi+1], i = 0, 1, ..., n − 1,

∫_a^b f(x)dx = lim_{max_i(x_{i+1}−x_i)→0} Σ_{i=0}^{n−1} f(ζi)(x_{i+1} − x_i).

Since every definite integral has a definite value, this value may be interpreted geometrically to be a particular area under a given curve defined by y = f(x). Note that if a curve lies below the x axis, this area will be negative.

Definition 37 (the Newton-Leibniz formula) The definite integral of a continuous function f is

∫_a^b f(x)dx = F(x)|_a^b = F(b) − F(a),    (2)

for F(x) such that F'(x) = f(x) for all x ∈ [a, b].

Properties of definite integrals: For any arbitrary real numbers a, b, c, α, β:
• ∫_a^b (αf(x) + βg(x))dx = α∫_a^b f(x)dx + β∫_a^b g(x)dx;
• ∫_a^b f(x)dx = −∫_b^a f(x)dx;
• ∫_a^a f(x)dx = 0;
• ∫_a^b f(x)dx = ∫_a^c f(x)dx + ∫_c^b f(x)dx;
• |∫_a^b f(x)dx| ≤ ∫_a^b |f(x)|dx;
• ∫_a^b f(g(x))g'(x)dx = ∫_{g(a)}^{g(b)} f(u)du, u = g(x) (change of variable);
• ∫_a^b f(x)g'(x)dx = f(x)g(x)|_a^b − ∫_a^b f'(x)g(x)dx.

Some more useful results: If λ is a real parameter and ∫_a^b (d/dλ)f(x, λ)dx exists, then
• (d/dλ)∫_a^b f(x, λ)dx = ∫_a^b f'_λ(x, λ)dx;
• (d/dλ)∫_{a(λ)}^{b(λ)} f(x)dx = f(b(λ))b'(λ) − f(a(λ))a'(λ). In particular, (d/dx)∫_a^x f(t)dt = f(x).

We can combine the last two formulas to get:
• Leibniz's formula:

(d/dλ)∫_{a(λ)}^{b(λ)} f(x, λ)dx = f(b(λ), λ)b'(λ) − f(a(λ), λ)a'(λ) + ∫_{a(λ)}^{b(λ)} f'_λ(x, λ)dx.

Example 52 We may apply this formula when we need to evaluate a definite integral which can not be integrated in elementary functions. For instance, let us find

I(λ) = ∫_0^1 xe^{−λx}dx.

If we introduce another integral

J(λ) = ∫_0^1 e^{−λx}dx = −(e^{−λx}/λ)|_0^1 = −(e^{−λ} − 1)/λ,

then

I(λ) = −∫_0^1 (d/dλ)e^{−λx}dx = −dJ(λ)/dλ = (1 − e^{−λ}(1 + λ))/λ².

Proposition 22 (Mean Value Theorem) If f(x) is continuous in [a, b], then there exists at least one point c ∈ (a, b) such that

∫_a^b f(x)dx = f(c)(b − a).

Numerical methods of approximating definite integrals:
a) The trapezoid formula: Denoting yi = f(a + i(b − a)/n),

∫_a^b f(x)dx ≈ ((b − a)/(2n))(y0 + 2y1 + 2y2 + ... + 2y_{n−1} + yn).

b) Simpson's formula: Denoting yi = f(a + i(b − a)/(2n)), where n is an arbitrary integer number,

∫_a^b f(x)dx ≈ ((b − a)/(6n))(y0 + 4Σ_{i=1}^{n} y_{2i−1} + 2Σ_{i=1}^{n−1} y_{2i} + y_{2n}).

Improper integrals are integrals of one of the following forms:
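Both quadrature formulas are straightforward to implement. The Python sketch below follows the notation above; the integrand x² on [0, 1] (exact value 1/3) and the grid sizes are illustrative choices:

```python
def trapezoid(f, a, b, n):
    """Trapezoid formula with nodes y_i = f(a + i(b-a)/n)."""
    h = (b - a) / n
    y = [f(a + i * h) for i in range(n + 1)]
    return (b - a) / (2 * n) * (y[0] + 2 * sum(y[1:-1]) + y[-1])

def simpson(f, a, b, n):
    """Simpson's formula with 2n subintervals, y_i = f(a + i(b-a)/(2n))."""
    h = (b - a) / (2 * n)
    y = [f(a + i * h) for i in range(2 * n + 1)]
    odd = sum(y[2 * i - 1] for i in range(1, n + 1))
    even = sum(y[2 * i] for i in range(1, n))
    return (b - a) / (6 * n) * (y[0] + 4 * odd + 2 * even + y[2 * n])

t = trapezoid(lambda x: x * x, 0.0, 1.0, 100)
s = simpson(lambda x: x * x, 0.0, 1.0, 50)
print(t, s)  # both are close to 1/3; Simpson is exact for quadratics
```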
• ∫_a^∞ f(x)dx = lim_{b→∞} ∫_a^b f(x)dx, or ∫_{−∞}^b f(x)dx = lim_{a→−∞} ∫_a^b f(x)dx;
• ∫_a^b f(x)dx = lim_{ε→0+} ∫_{a+ε}^b f(x)dx, where lim_{x→a} f(x) = ∞;
• ∫_a^b f(x)dx = lim_{ε→0+} ∫_a^{b−ε} f(x)dx, where lim_{x→b} f(x) = ∞.

If these limits exist, the improper integral is said to be convergent; otherwise it is called divergent.

Example 53

∫_0^1 (1/x^p)dx = lim_{ε→0+} ∫_ε^1 x^{−p}dx = { lim_{ε→0+} (1/(1 − p) − ε^{1−p}/(1 − p)), p ≠ 1;  lim_{ε→0+} (ln 1 − ln ε), p = 1. }

Therefore the improper integral converges if p < 1 and diverges if p ≥ 1.

Economics Application 3 (The Optimal Subsidy and the Compensated Demand Curve)

Consider the consumer's dual problem (recall that the dual to utility maximization is expenditure minimization): we minimize the objective function M = px·x + py·y, subject to the utility constraint U = U(x, y), where M is the income level, and x, y and px, py are quantities and prices of the two goods, respectively. The solution to the expenditure-minimization problem under the utility constraint is the budget line with income M* = M*(px, py) = px·x* + py·y*.

If we hold utility and py constant, then, due to Hotelling's lemma, dM*(px)/dpx = x*_c(px), where x*_c(px) is the income-compensated (or simply compensated) demand curve for good x. By rearranging terms, dM*(px) = x*_c(px)dpx, where dM* is the optimal subsidy for an infinitesimal change in price. Thus, if the price increases from p¹x to p²x, we can "add up" a series of infinitesimal changes by taking the integral over those infinitesimal changes: the optimal subsidy S* required to maintain a given level of utility would be

S* = ∫_{p¹x}^{p²x} x*_c(px)dpx.

For example, if x*_c(px) = 1/√px, p¹x = 1, p²x = 4, then

S* = ∫_1^4 (1/√px)dpx = 2√px|_1^4 = 2·(2 − 1) = 2.

2.6 Functions of More than One Variable

Let us consider a function y = f(x1, x2, ..., xn), where the variables xi, i = 1, 2, ..., n are all independent of one another, i.e. each varies without affecting the others. Suppose our function is differentiable with respect to all arguments.
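The optimal-subsidy integral of Economics Application 3 can also be evaluated numerically with the trapezoid rule introduced earlier; a Python sketch (the grid size is an illustrative choice):

```python
def trapezoid(f, a, b, n=10000):
    """Trapezoid rule with n equal subintervals."""
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + i * h) for i in range(1, n)) + f(b) / 2)

# compensated demand x*_c(p) = 1/sqrt(p); subsidy for a price rise from 1 to 4
subsidy = trapezoid(lambda p: p ** -0.5, 1.0, 4.0)
print(subsidy)  # close to the closed-form answer S* = 2
```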
The partial derivative of y with respect to xi is defined as

∂y/∂xi = lim_{Δxi→0} Δy/Δxi = lim_{Δxi→0} (f(x1, ..., xi + Δxi, ..., xn) − f(x1, ..., xi, ..., xn))/Δxi

for all i = 1, 2, ..., n. Partial derivatives are often denoted with subscripts, ∂f/∂xi = f_{xi} (= f'_{xi}).
Example 54 If y = 4x1³ + x1x2 + ln x2 then ∂y/∂x1 = 12x1² + x2, ∂y/∂x2 = x1 + 1/x2.

Second order partial derivatives of y = f(x1, ..., xn) are defined as

∂²y/(∂xj∂xi) = (∂/∂xj)(∂y/∂xi) = f_{xi xj}(x1, ..., xn), i, j = 1, 2, ..., n.

Proposition 23 (Schwarz's Theorem or Young's Theorem) If at least one of the two cross-partials is continuous, then

∂²f/(∂xj∂xi) = ∂²f/(∂xi∂xj), i, j = 1, 2, ..., n.

Definition 38 The total differential of a function f is defined as

dy = (∂f/∂x1)dx1 + (∂f/∂x2)dx2 + ... + (∂f/∂xn)dxn.

The total differential can be also used as an approximation device. If y = f(x1, x2), then by definition y − y0 = Δf(x1⁰, x2⁰) ≈ df(x1⁰, x2⁰). Therefore

f(x1, x2) ≈ f(x1⁰, x2⁰) + (∂f(x1⁰, x2⁰)/∂x1)(x1 − x1⁰) + (∂f(x1⁰, x2⁰)/∂x2)(x2 − x2⁰).

Similar to the case of one variable, using partial derivatives of higher order, we can expand a function of several variables in a polynomial form of Taylor series.

If y = f(x1, ..., xn), xi = xi(t1, ..., tm), i = 1, 2, ..., n, then for all j = 1, 2, ..., m

∂y/∂tj = Σ_{i=1}^{n} (∂f(x1, ..., xn)/∂xi)(∂xi/∂tj).

This rule is called the chain rule.

Example 55 If z = f(x, y), x = u(t), y = v(t) then

dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt)  (a special case of the chain rule).

Example 56 If z = f(x, y, t), x = u(t), y = v(t) then

dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) + ∂f/∂t

(dz/dt is sometimes called the total derivative).

Definition 39 The second order total differential of a function y = f(x1, ..., xn) is defined as

d²y = d(dy) = Σ_{i=1}^{n} Σ_{j=1}^{n} (∂²f/(∂xi∂xj)) dxi dxj.

Example 57 Let z = f(x, y) and let f(x, y) satisfy the conditions of Schwarz's theorem (i.e. f_{xy} = f_{yx}). Then

d²z = f_{xx}dx² + 2f_{xy}dxdy + f_{yy}dy².
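The total differential as an approximation device, illustrated on the function of Example 54 (the evaluation point and the increments below are illustrative choices, not from the original text):

```python
import math

f = lambda x1, x2: 4 * x1**3 + x1 * x2 + math.log(x2)  # Example 54's function

x1, x2 = 1.0, 2.0
dx1, dx2 = 0.01, -0.02
f1 = 12 * x1**2 + x2   # partial derivative with respect to x1
f2 = x1 + 1 / x2       # partial derivative with respect to x2

approx = f(x1, x2) + f1 * dx1 + f2 * dx2   # first-order (total differential) estimate
exact = f(x1 + dx1, x2 + dx2)
print(abs(approx - exact))  # discrepancy is second-order small
```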
2.7 Unconstrained Optimization in the Case of More than One Variable

Let z = f(x1, ..., xn).

Definition 40 z has a stationary point at x* = (x1*, ..., xn*) if dz = 0 at x*, i.e. all f_{xi} should vanish at x*.

Definition 41 z = f(x1, ..., xn) has a local maximum (minimum) at x* = (x1*, ..., xn*) if f(x*) − f(x) ≥ 0 (≤ 0) for all x in some neighborhood of x*.

The conditions f_{xi}(x*) = 0 for all i = 1, 2, ..., n are called the first order necessary conditions (F.O.C.). The second-order necessary condition (S.O.N.C.) requires sign semidefiniteness of the second order total differential. The second-order sufficient condition (S.O.S.C.): x* is a maximum (minimum) if d²z at x* is negative definite (positive definite). Note the sufficiency of this test: if d²z at x* is not sign definite, we still may have an extremum at x*.

Recipe 11 – How to Check the Sign Definiteness of d²z: Let us define the symmetric Hessian matrix H of second partial derivatives as

H = ( ∂²f/∂x1∂x1  ...  ∂²f/∂x1∂xn )
    ( ...          ...  ...         )
    ( ∂²f/∂xn∂x1  ...  ∂²f/∂xn∂xn ).

Thus, the sign definiteness of d²z is equivalent to the sign definiteness of the quadratic form H evaluated at x*. We can apply either the principal minors test or the eigenvalue test discussed in the previous chapter.

Example 58 Consider z = f(x, y). If the first order necessary conditions are met at x*, then x* is a (local)

minimum if f_{xx}(x*) > 0 and f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² > 0;
maximum if f_{xx}(x*) < 0 and f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² > 0.

If f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² < 0, we identify a stationary point as a saddle point. Note that we still may have an extremum at x* even if f_{xx}(x*)f_{yy}(x*) − (f_{xy}(x*))² = 0.

Example 59 (Classification of Stationary Points of a C^(2) Function³ of n Variables) If x* = (x1*, ..., xn*) is a stationary point of f(x1, ..., xn) and |Hk(x*)| is the following determinant evaluated at x*,

|Hk| = det ( ∂²f/∂x1∂x1  ...  ∂²f/∂x1∂xk )
           ( ...          ...  ...         )
           ( ∂²f/∂xk∂x1  ...  ∂²f/∂xk∂xk ),

then
a) if (−1)^k |Hk(x*)| > 0, k = 1, 2, ..., n, then x* is a local maximum;
b) if |Hk(x*)| > 0, k = 1, 2, ..., n, then x* is a local minimum;
c) if |Hn(x*)| ≠ 0 and neither of the two conditions above is satisfied, then x* is a saddle point.

³ Recall that C^(2) means ‘twice continuously differentiable’.

Example 60 Let us find the extrema of the function z(x, y) = x³ − 8y³ + 6xy + 1.

F.O.C.: z_x = 3(x² + 2y) = 0, z_y = 6(−4y² + x) = 0. Solving the F.O.C. for x, y we obtain two stationary points: x = y = 0 and x = 1, y = −1/2.

S.O.C.: The Hessian matrices evaluated at the stationary points are

H(0,0) = ( 0  6 ),    H(1,−1/2) = ( 6   6 )
          ( 6  0 )                 ( 6  24 ).

Since |H2(0, 0)| = −36 < 0, (0, 0) is a saddle point. Since |H1(1, −1/2)| = 6 > 0 and |H2(1, −1/2)| = 6·24 − 36 = 108 > 0, (1, −1/2) is a local minimum.

2.8 The Implicit Function Theorem

Proposition 24 (The Two-Dimensional Case) If F(x, y) ∈ C^(k) in a set D (i.e. F is k times continuously differentiable at any point in D), (x0, y0) is an interior point of D, F(x0, y0) = c (c is constant) and F_y(x0, y0) ≠ 0, then the equation F(x, y) = c defines y as a C^(k) function of x in some neighborhood of (x0, y0), say y = ψ(x), and

dy/dx = −F_x(x, y)/F_y(x, y).

Example 61 Let us consider the function

f(x, a) = (1 − e^{−ax})/(1 + e^{ax}), where a is a parameter, a > 1.

What happens to the extremum value of x if a increases by a small number? To answer this question, first write down the F.O.C. for a stationary point:

f_x = a(e^{−ax} − e^{ax} + 2)/(1 + e^{ax})² = 0.

If this expression vanishes at, say, x = x*, we can apply the implicit function theorem to the F.O.C. and obtain

dx/da = −(∂(e^{−ax} − e^{ax} + 2)/∂a)/(∂(e^{−ax} − e^{ax} + 2)/∂x) = −x/a < 0

in some neighborhood of (x*, a). The negativity of the derivative is due to the fact that x* is positive and, therefore, every x in a small neighborhood of x* should be positive as well.
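The second-order conditions of Example 60 can be coded as a small classifier based on the leading principal minors (a sketch for the two-variable case only):

```python
def classify(x, y):
    """Principal-minors test for z = x^3 - 8y^3 + 6xy + 1 (Example 60)."""
    zxx, zyy, zxy = 6 * x, -48 * y, 6.0
    h1 = zxx                       # first leading principal minor
    h2 = zxx * zyy - zxy**2        # second leading principal minor (det H)
    if h2 < 0:
        return "saddle"
    if h1 > 0 and h2 > 0:
        return "local minimum"
    if h1 < 0 and h2 > 0:
        return "local maximum"
    return "test inconclusive"

print(classify(0.0, 0.0), classify(1.0, -0.5))
```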
Let us formulate the general case of the implicit function theorem. Consider the system of m equations with n exogenous variables x1, ..., xn and m endogenous variables y1, ..., ym:

f¹(x1, ..., xn, y1, ..., ym) = 0
f²(x1, ..., xn, y1, ..., ym) = 0
...
f^m(x1, ..., xn, y1, ..., ym) = 0    (3)

Our objective is to find ∂yj/∂xi, j = 1, 2, ..., m, i = 1, 2, ..., n.

If we take the total differentials of all fʲ, we get

(∂f¹/∂y1)dy1 + ... + (∂f¹/∂ym)dym = −((∂f¹/∂x1)dx1 + ... + (∂f¹/∂xn)dxn)
...
(∂f^m/∂y1)dy1 + ... + (∂f^m/∂ym)dym = −((∂f^m/∂x1)dx1 + ... + (∂f^m/∂xn)dxn).

Now allow only xi to vary (i.e. dxl = 0 for l = 1, ..., i−1, i+1, ..., n) and divide each remaining term in the system above by dxi. Thus we arrive at the following linear system with respect to ∂y1/∂xi, ..., ∂ym/∂xi (note that in fact dy1/dxi, ..., dym/dxi should be interpreted as partial derivatives with respect to xi since we allow only xi to vary):

(∂f¹/∂y1)(∂y1/∂xi) + ... + (∂f¹/∂ym)(∂ym/∂xi) = −∂f¹/∂xi
...
(∂f^m/∂y1)(∂y1/∂xi) + ... + (∂f^m/∂ym)(∂ym/∂xi) = −∂f^m/∂xi,

which can be solved for ∂y1/∂xi, ..., ∂ym/∂xi by, say, Cramer's rule or by the inverse matrix method.

Let us define the Jacobian matrix of f¹, ..., f^m with respect to y1, ..., ym as

J = ( ∂f¹/∂y1   ...  ∂f¹/∂ym  )
    ( ...        ...  ...       )
    ( ∂f^m/∂y1  ...  ∂f^m/∂ym ).

Given the definition of the Jacobian, we are in a position to formulate the general result:

Proposition 25 (The General Implicit Function Theorem) Suppose f¹, ..., f^m are C^(k) functions in a set D ⊂ R^{n+m}. Let (x*, y*) = (x1*, ..., xn*, y1*, ..., ym*) be a solution to (3) in the interior of D. Suppose also that det(J) does not vanish at (x*, y*). Then (3) defines y1, ..., ym as C^(k) functions of x1, ..., xn in some neighborhood of (x*, y*), and in that neighborhood, for j = 1, 2, ..., n,

( ∂y1/∂xj )          ( ∂f¹/∂xj  )
( ...      ) = −J⁻¹ · ( ...       )
( ∂ym/∂xj )          ( ∂f^m/∂xj ).

Example 62 Given the system of two equations x² + y² = z²/2 and x + y + z = 2, find x = x'(z), y = y'(z) in the neighborhood of the point (1, −1, 2).

Solution. Set f¹(x, y, z) = x² + y² − z²/2 and f²(x, y, z) = x + y + z − 2. Then f¹, f² ∈ C^∞(R³), f¹(1, −1, 2) = f²(1, −1, 2) = 0, and at (1, −1, 2)

det(J) = det ( ∂f¹/∂x  ∂f¹/∂y ) = det ( 2x  2y ) = det ( 2  −2 ) = 4 ≠ 0.
             ( ∂f²/∂x  ∂f²/∂y )       ( 1   1  )       ( 1   1 )

Therefore the conditions of the implicit function theorem are satisfied and

( dx/dz )     ( 2x  2y )⁻¹ ( −z )        dx/dz = (z + 2y)/(2x − 2y),
( dy/dz ) = − ( 1    1 )   (  1 )   ⇒   dy/dz = −(z + 2x)/(2x − 2y).

The Jacobian matrix can be also applied to test whether functional (linear or nonlinear) dependence exists among a set of n functions in n variables. More precisely, if we have n functions of n variables gi = gi(x1, ..., xn), then the determinant of the Jacobian matrix of g1, ..., gn with respect to x1, ..., xn will be identically zero for all values of x1, ..., xn if and only if the n functions g1, ..., gn are functionally (linearly or nonlinearly) dependent.

Economics Application 4 (The Profit-Maximizing Firm)

Consider that a firm has the profit function F(l, k) = pf(l, k) − wl − rk, where f is the firm's (neoclassical) production function, l, k are the amounts of labor and capital employed by the firm (in units of output), p is the price of output, w is the real wage and r is the real rental price of capital. The firm takes p, w and r as given. Assume that the Hessian matrix of f is negative definite.
a) Show that if the wage increases by a small amount, then the firm decides to employ less labor.
b) Show that if the wage increases by a small amount and the firm is constrained to maintain the same amount of capital k0, then the firm will reduce l by less than it does in part a).

Solution.
a) The firm's objective is to choose l and k so that it maximizes its profits. The first-order conditions for maximization are:

F¹(l, k, w) = p ∂f(l, k)/∂l − w = 0,
F²(l, k, r) = p ∂f(l, k)/∂k − r = 0.

The Jacobian of this system of equations is:

J = ( ∂F¹/∂l  ∂F¹/∂k ) = ( pf11(l, k)  pf12(l, k) ),   det(J) = p²(f11(l, k)f22(l, k) − f12²(l, k)) = p²H > 0,
    ( ∂F²/∂l  ∂F²/∂k )   ( pf12(l, k)  pf22(l, k) )

where fij is the second-order partial derivative of f with respect to the jth and the ith arguments and H is the determinant of the Hessian of f.
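The derivatives found in Example 62 can be confirmed by solving the system explicitly near (1, −1, 2) and differencing. In the Python sketch below, the quadratic-formula branch selection is an assumption made so as to stay on the solution branch passing through (1, −1, 2):

```python
import math

def branch(z):
    """Solve x^2 + y^2 = z^2/2, x + y + z = 2 on the branch through (1, -1, 2)."""
    s = 2 - z                      # x + y
    p = (s * s - z * z / 2) / 2    # x * y, since x^2 + y^2 = (x+y)^2 - 2xy
    d = math.sqrt(s * s - 4 * p)
    x = (s + d) / 2                # root with x > y, matching x = 1, y = -1 at z = 2
    return x, s - x

h = 1e-6
(x1, y1), (x0, y0) = branch(2 + h), branch(2 - h)
dx_dz = (x1 - x0) / (2 * h)
dy_dz = (y1 - y0) / (2 * h)
print(dx_dz, dy_dz)  # close to 0 and -1, as the formulas of Example 62 predict
```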
Since det(J) ≠ 0, the implicit-function theorem can be applied: l and k are implicit functions of w. The fact that H is negative definite means that f11 < 0 and H > 0 (which also implies f22 < 0). The first-order partial derivative of l with respect to w is:

∂l/∂w = −(1/det(J))·det ( −1  pf12(l, k) ) = pf22(l, k)/det(J) = f22(l, k)/(pH) < 0.
                         (  0  pf22(l, k) )

Thus, the optimal l falls in response to a small increase in w.

b) If the firm's amount of capital is fixed at k0, the profit function is F(l) = pf(l, k0) − wl − rk0 and the first-order condition is

p ∂f(l, k0)/∂l − w = 0.

Since f11 ≠ 0, the above condition defines l as an implicit function of w. Taking the derivative of the F.O.C. with respect to w, we have:

pf11(l, k0)(∂l/∂w) = 1, or ∂l/∂w = 1/(pf11(l, k0)) < 0.

This change in l is smaller in absolute value than the change in l in part a) since

f22(l, k0)/(pH) < 1/(pf11(l, k0)).

(Note that both ratios are negative.) For another economic application see the section ‘Constrained Optimization’.

2.9 Concavity, Convexity, Quasiconcavity and Quasiconvexity

Definition 42 A function z = f(x1, ..., xn) is concave if and only if for any pair of distinct points u = (u1, ..., un) and v = (v1, ..., vn) in the domain of f and for any θ ∈ (0, 1)

θf(u) + (1 − θ)f(v) ≤ f(θu + (1 − θ)v),

and convex if and only if

θf(u) + (1 − θ)f(v) ≥ f(θu + (1 − θ)v).

If we change weak inequalities ≤ and ≥ to strict inequalities, we will get the definition of strict concavity and strict convexity, respectively.

If f is differentiable, an alternative definition of concavity (or convexity) is

f(v) ≤ f(u) + Σ_{i=1}^{n} f_{xi}(u)(vi − ui)  (or f(v) ≥ f(u) + Σ_{i=1}^{n} f_{xi}(u)(vi − ui)).
Some useful results:
• The sum of concave (or convex) functions is a concave (or convex) function. If, additionally, at least one of these functions is strictly concave (or convex), the sum is also strictly concave (or convex).
• If f(x) is a (strictly) concave function then −f(x) is a (strictly) convex function and vice versa.

Example 63 z(x, y) = x² + xy + y² is strictly convex.

Recipe 12 – How to Check Concavity (or Convexity): A twice continuously differentiable function z = f(x1, ..., xn) is concave (or convex) if and only if d²z is everywhere negative (or positive) semidefinite; z = f(x1, ..., xn) is strictly concave (or strictly convex) if (but not only if) d²z is everywhere negative (or positive) definite. Again, any test for the sign definiteness of a symmetric quadratic form is applicable. The conditions for concavity and convexity are necessary and sufficient, while those for strict concavity (strict convexity) are only sufficient.

Definition 43 A set S in Rⁿ is convex if for any x, y ∈ S and any θ ∈ [0, 1] the linear combination θx + (1 − θ)y ∈ S.

Definition 44 A function z = f(x1, ..., xn) is quasiconcave if and only if for any pair of distinct points u = (u1, ..., un) and v = (v1, ..., vn) in the convex domain of f and for any θ ∈ (0, 1)

f(v) ≥ f(u) =⇒ f(θu + (1 − θ)v) ≥ f(u),

and quasiconvex if and only if

f(v) ≥ f(u) =⇒ f(θu + (1 − θ)v) ≤ f(v).

A change of weak inequalities ≤ and ≥ on the right to strict inequalities gives us the definitions of strict quasiconcavity and strict quasiconvexity. If f is differentiable, an alternative definition of quasiconcavity (quasiconvexity) is

f(v) ≥ f(u) =⇒ Σ_{i=1}^{n} f_{xi}(u)(vi − ui) ≥ 0  (f(v) ≥ f(u) =⇒ Σ_{i=1}^{n} f_{xi}(v)(vi − ui) ≥ 0).

Note that any convex (concave) function is quasiconvex (quasiconcave), but not vice versa.

Example 64 The function z(x, y) = xy, x, y ≥ 0 is quasiconcave.
2.10 Appendix: Matrix Derivatives

Matrix derivatives play an important role in economic analysis, especially in econometrics. If A is an [n × n] nonsingular matrix, the derivative of its determinant with respect to A is given by

∂(det(A))/∂A = (Cij),

where (Cij) is the matrix of cofactors of A.

Example 65 (Matrix Derivatives in Econometrics: How to Find the Least Squares Estimator in a Multiple Regression Model) Consider the multiple regression model:

y = Xβ + ε,

where the n × 1 vector y is the dependent variable, X is an n × k matrix of k explanatory variables with rank(X) = k, β is a k × 1 vector of coefficients which are to be estimated and ε is an n × 1 vector of disturbances. We assume that the matrices of observations X and y are given. Our goal is to find an estimator b for β using the least squares method. The least squares estimator of β is a vector b which minimizes the expression

E(b) = (y − Xb)'(y − Xb) = y'y − y'Xb − b'X'y + b'X'Xb.

The first-order condition for extremum is:

dE(b)/db = 0.

To obtain the first order conditions we use the following formulas:
• d(a'b)/db = a;
• d(b'a)/db = a;
• d(Mb)/db = M';
• d(b'Mb)/db = (M + M')b,

where a, b are k × 1 vectors and M is a k × k matrix.

Using these formulas we obtain:

dE(b)/db = −2X'y + 2X'Xb,

and the first-order condition implies:

X'Xb = X'y
or, equivalently,

b = (X'X)⁻¹X'y.

On the other hand, we have:

d²E(b)/(db db') = (2X'X)' = 2X'X.

To check whether the solution b is indeed a minimum, we need to prove the positive definiteness of the matrix X'X. First, notice that X'X is a symmetric matrix; the symmetry is obvious: (X'X)' = X'X. To prove positive definiteness, we take an arbitrary k × 1 vector z, z ≠ 0, and check the following quadratic form:

z'(X'X)z = (Xz)'Xz.

The assumptions rank(X) = k and z ≠ 0 imply Xz ≠ 0. It follows that (Xz)'Xz > 0 for any z ≠ 0 or, equivalently, z'X'Xz > 0 for any z ≠ 0, which means that (by definition) X'X is positive definite.

Further Reading:
• Bartle, R. The Elements of Real Analysis.
• Edwards, C.H. and D.E. Penny. Calculus and Analytical Geometry.
• Greene, W. Econometric Analysis.
• Rudin, W. Principles of Mathematical Analysis.
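For a concrete instance of b = (X'X)⁻¹X'y, the Python sketch below fits a line through noiseless data generated with β = (2, 3), solving the normal equations by inverting the 2 × 2 matrix X'X directly (the data are an illustrative choice, not from the original text):

```python
# Solve the normal equations X'Xb = X'y for a two-column X = [1, x]
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 + 3.0 * x for x in xs]  # data generated with beta = (2, 3), no noise

n = len(xs)
sx = sum(xs); sxx = sum(x * x for x in xs)
sy = sum(ys); sxy = sum(x * y for x, y in zip(xs, ys))

# X'X = [[n, sx], [sx, sxx]], X'y = [sy, sxy]; invert the 2x2 matrix
det = n * sxx - sx * sx
b0 = (sxx * sy - sx * sxy) / det
b1 = (n * sxy - sx * sy) / det
print(b0, b1)  # recovers beta = (2, 3)
```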
3 Constrained Optimization

3.1 Optimization with Equality Constraints

Consider the problem ("extremize" means to find either the minimum or the maximum of the objective function f):

extremize f(x1,...,xn)
subject to g^j(x1,...,xn) = bj, j = 1, 2,...,m, m < n.     (4)

f is called the objective function; g^1, g^2,...,g^m are the constraint functions; b1, b2,...,bm are the constraint constants. Note that n is strictly greater than m; the difference n − m is the number of degrees of freedom of the problem. Let f and g^1,...,g^m be C(1) functions and let the Jacobian J = (∂g^j/∂xi), j = 1,...,m, i = 1,...,n, have full rank, i.e. rank(J) = m.

If it is possible to explicitly express (from the constraint functions) m independent variables as functions of the other n − m independent variables, we can eliminate m variables in the objective function; thus the initial problem is reduced to an unconstrained optimization problem with respect to n − m variables. However, in many cases it is not technically feasible to explicitly express one variable as a function of the others. Instead of the substitution and elimination method, we may then resort to the easy-to-use and well-defined method of Lagrange multipliers.

We introduce the Lagrangian function as

L(x1,...,xn, λ1,...,λm) = f(x1,...,xn) + Σ_{j=1}^m λj (bj − g^j(x1,...,xn)),

where λ1,...,λm are constants (Lagrange multipliers) and the functions are continuously differentiable.

Recipe 13 – What are the Necessary Conditions for the Solution of (4)?

Equate all partials of L with respect to x1,...,xn and λ1,...,λm to zero:

∂L/∂xi = ∂f(x1,...,xn)/∂xi − Σ_{j=1}^m λj ∂g^j(x1,...,xn)/∂xi = 0, i = 1, 2,...,n,
∂L/∂λj = bj − g^j(x1,...,xn) = 0, j = 1, 2,...,m.

Solve these equations for x1,...,xn, λ1,...,λm. In the end we will get a set of stationary points of the Lagrangian: if x* = (x1*,...,xn*) is a solution of (4), it should be a stationary point of L.

Example 66 This example shows that the Lagrange-multiplier method does not always provide the solution to an optimization problem. Consider the problem:

min x² + y² subject to (x − 1)³ − y² = 0.
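The failure claimed in Example 66 can be previewed numerically: on the constraint set, y² = (x − 1)³ forces x ≥ 1, so the objective reduces to a function of x alone. A brute-force sketch (the grid resolution is an arbitrary choice; the stationarity condition 2x − 3λ(x − 1)² = 0 used below comes from the Lagrangian worked out in the text that follows):

```python
import numpy as np

# On the constraint set, y^2 = (x - 1)^3 requires x >= 1; substituting gives
# f(x) = x^2 + (x - 1)^3, to be minimized over x >= 1.
x = np.linspace(1.0, 2.0, 100001)
f = x**2 + (x - 1.0) ** 3
x_min = x[np.argmin(f)]
print(x_min)                                 # 1.0, i.e. the optimum (1, 0)

# The stationarity condition in x, 2x - 3*lam*(x - 1)^2 = 0, fails at x = 1:
# its left-hand side equals 2 there, whatever the multiplier lam is.
for lam in (-10.0, 0.0, 10.0):               # arbitrary sample multipliers
    print(2 * x_min - 3 * lam * (x_min - 1.0) ** 2)   # always 2.0
```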
The Lagrangian function is

L = x² + y² − λ[(x − 1)³ − y²].

The first-order conditions are ∂L/∂x = 0, ∂L/∂y = 0, ∂L/∂λ = 0, or

2x − 3λ(x − 1)² = 0,
2y + 2λy = 0,
(x − 1)³ − y² = 0.

Notice that the solution to this problem is (1, 0): the restriction (x − 1)³ = y² implies x ≥ 1; if x = 1, then y = 0 and x² + y² = 1; if x > 1, then x² + y² > 1. Thus, the minimum of x² + y² is attained at (1, 0). Clearly, however, the first equation is not satisfied for x = 1, y = 0, i.e. not all partial derivatives of the Lagrangian are zero at this point. Thus, the Lagrange-multiplier method fails to detect the solution (1, 0).

If we need to check whether a stationary point is a maximum or minimum, the following local sufficient conditions can be applied:

Proposition 26 Let f and g^1,...,g^m be C(2) functions and let x* satisfy the necessary conditions for the problem (4). Let us introduce the bordered Hessian determinant

|H̄r| = det [ 0_{m×m}, G_r ; G_r', L_r ],

where 0_{m×m} is the m × m zero matrix, G_r = (∂g^j/∂x_s) is the m × r matrix of first-order constraint partials (j = 1,...,m; s = 1,...,r), G_r' is its transpose, and L_r = (∂²L/∂x_s∂x_t), s, t = 1,...,r, is the r × r block of second-order partials of the Lagrangian. Let |H̄r(x*)| be |H̄r| evaluated at x*. Then

• if (−1)^m |H̄r(x*)| > 0, r = m + 1,...,n, then x* is a local minimum point for the problem (4);
• if (−1)^r |H̄r(x*)| > 0, r = m + 1,...,n, then x* is a local maximum point for the problem (4).

Example 67 (Local Second-Order Conditions for the Case with One Constraint) Suppose that x* = (x1*,...,xn*) satisfies the necessary conditions for the problem

max(min) f(x1,...,xn) subject to g(x1,...,xn) = b.

Define |H̄r| as the determinant of the (r + 1) × (r + 1) matrix whose first row and first column are (0, ∂g/∂x1,...,∂g/∂xr) and whose lower-right r × r block is (∂²L/∂x_s∂x_t), s, t = 1,...,r. Let |H̄r(x*)| be the bordered Hessian determinant evaluated at x*. Then
• x* is a local minimum point of f subject to the given constraint if |H̄r(x*)| < 0 for all r = 2,...,n;
• x* is a local maximum point of f subject to the given constraint if (−1)^r |H̄r(x*)| > 0 for all r = 2,...,n.

Economics Application 5 (Utility Maximization) In order to illustrate how the Lagrange-multiplier method can be applied, let us consider the following optimization problem. The preferences of a consumer over two goods x and y are given by the utility function U(x, y) = (x + 1)(y + 1) = xy + x + y + 1. The prices of goods x and y are 1 and 2, respectively, and the consumer's income is 30. What bundle of goods will the consumer choose?

The consumer's budget constraint is x + 2y ≤ 30 which, together with the conditions x ≥ 0 and y ≥ 0, determines his budget set (the set of all affordable bundles of goods). He will choose the bundle from his budget set that maximizes his utility. Since U(x, y) is an increasing function in both x and y (over the domain x ≥ 0 and y ≥ 0), the budget constraint should be satisfied with equality. The consumer's optimization problem can be stated as follows:

max U(x, y) subject to x + 2y = 30.

The Lagrangian function is

L = U(x, y) − λ(x + 2y − 30) = xy + x + y + 1 − λ(x + 2y − 30).

The first-order necessary conditions are ∂L/∂x = 0, ∂L/∂y = 0, ∂L/∂λ = 0, or

y + 1 − λ = 0,
x + 1 − 2λ = 0,
x + 2y − 30 = 0.

From the first equation, we get λ = y + 1. Then, substituting y + 1 for λ in the second equation, we obtain x − 2y − 1 = 0. This condition and the third condition above lead to x = 31/2, y = 29/4.

To check whether this solution really maximizes the objective function, let us apply the second-order sufficient conditions. The second-order conditions involve the bordered Hessian:

det H̄ = det [ 0, 1, 2 ; 1, 0, 1 ; 2, 1, 0 ] = 4 > 0.

Thus, (x = 31/2, y = 29/4) is indeed a maximum and represents the bundle demanded by the consumer.
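The bundle found above can be cross-checked with a numerical optimizer. A sketch using SciPy's SLSQP method (the starting point and the use of SciPy here are assumptions of this illustration, not part of the text):

```python
import numpy as np
from scipy.optimize import minimize

# Utility U(x, y) = (x + 1)(y + 1), budget x + 2y = 30 (from the application)
utility = lambda v: (v[0] + 1.0) * (v[1] + 1.0)

result = minimize(
    lambda v: -utility(v),                      # maximize U by minimizing -U
    x0=[1.0, 1.0],                              # arbitrary starting point
    constraints=[{"type": "eq", "fun": lambda v: v[0] + 2.0 * v[1] - 30.0}],
    bounds=[(0.0, None), (0.0, None)],
    method="SLSQP",
)

print(result.x)          # close to (31/2, 29/4) = (15.5, 7.25)
```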
Proposition 27 Suppose that f and g^1,...,g^m are defined on an open convex set S ⊂ R^n. Let x* be a stationary point of the Lagrangian. Then

• if the Lagrangian is concave, x* maximizes (4);
• if the Lagrangian is convex, x* minimizes (4).

Economics Application 6 (Applications of the Implicit-Function Theorem) Consider the problem of maximizing the function f(x, y) = ax + y subject to the constraint x² + ay² = 1, where x > 0, y > 0 and a is a positive parameter. Given that this problem has a solution, find how the optimal values of x and y change if a increases by a very small amount.

Solution: Differentiation of the Lagrangian function L = ax + y − λ(x² + ay² − 1) with respect to x, y, λ gives the first-order conditions

a − 2λx = 0,
1 − 2aλy = 0,
x² + ay² − 1 = 0,

or, more compactly,

F¹(x, λ, a) = 0,  F²(y, λ, a) = 0,  F³(x, y, a) = 0.     (5)

The Jacobian of this system of three equations with respect to x, y, λ is (rows separated by semicolons)

|J| = det [ −2λ, 0, −2x ; 0, −2aλ, −2ay ; 2x, 2ay, 0 ] = −8aλ(x² + ay²) = −8aλ.

From the first F.O.C. we can see that λ > 0; thus |J| < 0 at the optimal point. Therefore, the conditions of the implicit-function theorem are met, and in a neighborhood of the optimal solution, given by the system (5), x and y can be expressed as functions of a. The first-order partial derivatives of x and y with respect to a are evaluated as:

∂x/∂a = −(1/|J|) det [ 1, 0, −2x ; −2λy, −2aλ, −2ay ; y², 2ay, 0 ] = 4ay²(λx + a)/(8aλ) = 3xy²/2 > 0,

∂y/∂a = −(1/|J|) det [ −2λ, 1, −2x ; 0, −2λy, −2ay ; 2x, y², 0 ] = [−4xy(λx + a) − 4λy(x² + ay²)]/(8aλ) = −3x²y/(2a) − y/(2a) < 0.

(The final simplifications use λ = a/(2x), from the first F.O.C., and x² + ay² = 1.) In conclusion, a small increase in a will lead to a rise in the optimal x and a fall in the optimal y.
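The signs just obtained can be verified numerically. From the first-order conditions, λ = a/(2x) and y = x/a², so the constraint gives the closed form x(a) = (1 + a^(−3))^(−1/2). The sketch below compares central finite differences with the implicit-function-theorem formulas at a = 1 (an arbitrary evaluation point):

```python
import numpy as np

# Closed forms implied by the first-order conditions of the application:
#   lambda = a/(2x),  y = x/a^2,  x^2 + a*y^2 = 1  =>  x = (1 + a**-3)**-0.5
x = lambda a: (1.0 + a**-3) ** -0.5
y = lambda a: x(a) / a**2

a, h = 1.0, 1e-6                              # evaluation point and step size
dx_num = (x(a + h) - x(a - h)) / (2 * h)      # central finite differences
dy_num = (y(a + h) - y(a - h)) / (2 * h)

# Implicit-function-theorem formulas from the text
dx_ift = 1.5 * x(a) * y(a) ** 2               # 3*x*y^2/2 > 0
dy_ift = -(3 * x(a) ** 2 + 1) * y(a) / (2 * a)   # -3*x^2*y/(2a) - y/(2a) < 0

print(dx_num, dx_ift)    # both approximately 0.5303
print(dy_num, dy_ift)    # both approximately -0.8839
```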
3.2 The Case of Inequality Constraints
Classical methods of optimization (the method of Lagrange multipliers) deal with optimization problems with equality constraints in the form of g(x1 , . . . , xn ) = c. Nonclassical optimization, also known as mathematical programming, tackles problems with inequality constraints like g(x1 , . . . , xn ) ≤ c. Mathematical programming includes linear programming and nonlinear programming. In linear programming, the objective function and all inequality constraints are linear. When either the objective function or an inequality constraint is nonlinear, we face a problem of nonlinear programming. In the following, we restrict our attention to nonlinear programming. A problem of linear programming – also called a linear program – is discussed in the Appendix to this chapter. 3.2.1 NonLinear Programming
The nonlinear programming problem is that of choosing nonnegative values of certain variables so as to maximize or minimize a given (nonlinear) function subject to a given set of (nonlinear) inequality constraints. The nonlinear programming maximum problem is

max f(x1,...,xn)
subject to g^i(x1,...,xn) ≤ bi, i = 1, 2,...,m,     (6)
x1 ≥ 0,...,xn ≥ 0.

Similarly, the minimization problem is

min f(x1,...,xn)
subject to g^i(x1,...,xn) ≥ bi, i = 1, 2,...,m,
x1 ≥ 0,...,xn ≥ 0.

First, note that there are no restrictions on the relative size of m and n, unlike the case of equality constraints. Second, note that the direction of the inequalities (≤ or ≥) in the constraints is only a convention, because the inequality g^i ≤ bi can easily be converted to a ≥ inequality by multiplying it by −1, yielding −g^i ≥ −bi. Third, note that an equality constraint, say g^k = bk, can be replaced by the two inequality constraints g^k ≤ bk and −g^k ≤ −bk.

Definition 45 A constraint g^j ≤ bj is called binding (or active) at x0 = (x1^0,...,xn^0) if g^j(x0) = bj.

Example 68 The following is an example of a nonlinear program:

Max π = x1(10 − x1) + x2(20 − x2)
subject to 5x1 + 3x2 ≤ 40
x1 ≤ 5
x2 ≤ 10
x1 ≥ 0, x2 ≥ 0.
Example 69 (How to Solve a Nonlinear Program Graphically) An intuitive way to understand a low-dimensional nonlinear program is to represent the feasible set and the objective function graphically. For instance, we may try to represent graphically the optimization problem from Example 68, which can be rewritten as

Max π = 125 − (x1 − 5)² − (x2 − 10)²
subject to 5x1 + 3x2 ≤ 40
x1 ≤ 5
x2 ≤ 10
x1 ≥ 0, x2 ≥ 0;

or

Min C = (x1 − 5)² + (x2 − 10)²
subject to 5x1 + 3x2 ≤ 40
x1 ≤ 5
x2 ≤ 10
x1 ≥ 0, x2 ≥ 0.

The region OABCD in the figure below shows the feasible set of the nonlinear program.

[Figure: the feasible set OABCD in the (x1, x2) plane, bounded by the axes and by the lines x2 = 10 (segment AB), 5x1 + 3x2 = 40 (segment BC) and x1 = 5 (segment CD), with a circle centered at (5, 10) touching the boundary.]
The value of the objective function C can be interpreted as the square of the distance from the point (5, 10) to the point (x1, x2). Thus, the nonlinear program requires us to find the point in the feasible region which minimizes the distance to the point (5, 10). As can be seen in the figure above, the optimal point (x1*, x2*) lies on the line 5x1 + 3x2 = 40 and is also the tangency point with a circle centered at (5, 10). These two conditions are sufficient to determine the optimal solution.

Consider (x1 − 5)² + (x2 − 10)² − C = 0. From the implicit-function theorem, we have

dx2/dx1 = −(x1 − 5)/(x2 − 10).
Thus, the point at which a circle centered at (5, 10) is tangent to a line of slope −5/3 is given by the equation

−5/3 = −(x1 − 5)/(x2 − 10), or 3x1 − 5x2 + 35 = 0.
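The tangency condition, combined with the binding constraint 5x1 + 3x2 = 40, pins down the optimum as the solution of a 2 × 2 linear system; a quick numerical check with NumPy:

```python
import numpy as np

# Binding constraint and tangency line from the graphical argument:
#   5*x1 + 3*x2 = 40
#   3*x1 - 5*x2 = -35
A = np.array([[5.0, 3.0], [3.0, -5.0]])
b = np.array([40.0, -35.0])
x1, x2 = np.linalg.solve(A, b)

print(x1, x2)       # 95/34 (about 2.794) and 295/34 (about 8.676)

# The point also satisfies the remaining constraints of Example 68
print(0 <= x1 <= 5 and 0 <= x2 <= 10)
```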
Solving the system of equations

5x1 + 3x2 = 40
3x1 − 5x2 = −35,

we find the optimal solution (x1*, x2*) = (95/34, 295/34).

3.2.2 Kuhn-Tucker Conditions
For the purpose of ruling out certain irregularities on the boundary of the feasible set, a restriction on the constraint functions is imposed. This restriction is called the constraint qualification.

Definition 46 Let x̄ = (x̄1,...,x̄n) be a point on the boundary of the feasible region of a nonlinear program.

i) A vector dx = (dx1,...,dxn) is a test vector of x̄ if it satisfies the conditions

• x̄j = 0 ⟹ dxj ≥ 0;
• g^i(x̄) = ri ⟹ dg^i(x̄) ≤ (≥) 0 for a maximization (minimization) program.

ii) A qualifying arc for a test vector of x̄ is an arc with the point of origin x̄, contained entirely in the feasible region and tangent to the test vector.

iii) The constraint qualification is a condition which requires that, for any boundary point x̄ of the feasible region and for any test vector of x̄, there exist a qualifying arc for that test vector.
The constraint qualiﬁcation condition is often imposed in the following way: the gradients at x∗ of those g j constraints which are binding at x∗ are linearly independent.
Assume that the constraint qualification condition is satisfied and the functions in (6) are differentiable. If x* is the optimal solution to (6), it should satisfy the following conditions (the Kuhn-Tucker necessary maximum conditions, or KT conditions):

∂L/∂xi ≤ 0, xi ≥ 0 and xi (∂L/∂xi) = 0, i = 1,...,n,
∂L/∂λj ≥ 0, λj ≥ 0 and λj (∂L/∂λj) = 0, j = 1,...,m,

where

L(x1,...,xn, λ1,...,λm) = f(x1,...,xn) + Σ_{j=1}^m λj (bj − g^j(x1,...,xn))

is the Lagrangian function of the nonlinear program.

The minimization version of the Kuhn-Tucker necessary conditions is

∂L/∂xi ≥ 0, xi ≥ 0 and xi (∂L/∂xi) = 0, i = 1,...,n,
∂L/∂λj ≤ 0, λj ≥ 0 and λj (∂L/∂λj) = 0, j = 1,...,m.

Note that, in general, the KT conditions are neither necessary nor sufficient for a local optimum. However, if certain assumptions are satisfied, the KT conditions become necessary and even sufficient.

Proposition 28 (The Necessity Theorem) The KT conditions are necessary conditions for a local optimum if the constraint qualification is satisfied.

Note that:

• The failure of the constraint qualification signals certain irregularities at the boundary kinks of the feasible set. Only if the optimal solution occurs at such a kink may the KT conditions fail to hold.
• If all constraints are linear, the constraint qualification is always satisfied.

Example 70 The Lagrangian function of the nonlinear program in Example 68 is:

L = x1(10 − x1) + x2(20 − x2) − λ1(5x1 + 3x2 − 40) − λ2(x1 − 5) − λ3(x2 − 10).

The KT conditions are:

∂L/∂x1 = 10 − 2x1 − 5λ1 − λ2 ≤ 0,
∂L/∂x2 = 20 − 2x2 − 3λ1 − λ3 ≤ 0,
∂L/∂λ1 = −(5x1 + 3x2 − 40) ≥ 0,
∂L/∂λ2 = −(x1 − 5) ≥ 0,
∂L/∂λ3 = −(x2 − 10) ≥ 0,
x1 ≥ 0, x2 ≥ 0, λ1 ≥ 0, λ2 ≥ 0, λ3 ≥ 0,
x1 (∂L/∂x1) = 0, x2 (∂L/∂x2) = 0, λi (∂L/∂λi) = 0, i = 1, 2, 3.

Example 71 The constraint qualification for the nonlinear program in Example 68 is satisfied, since all constraints are linear. Therefore, the optimal solution (95/34, 295/34) must satisfy the KT conditions in Example 70.

Example 72 The following example illustrates a case where the Kuhn-Tucker conditions are not satisfied at the solution of an optimization problem. Consider the problem:

max y subject to x + (y − 1)³ ≤ 0, x ≥ 0, y ≥ 0.

The solution to this problem is (0, 1). (If y > 1, then the restriction x + (y − 1)³ ≤ 0 implies x < 0.) The Lagrangian function is

L = y + λ[−x − (y − 1)³].

One of the Kuhn-Tucker conditions requires

∂L/∂y ≤ 0, or 1 − 3λ(y − 1)² ≤ 0.

As can be observed, this condition is not satisfied at the point (0, 1).

Proposition 29 (Kuhn-Tucker Sufficiency Theorem 1) If

a) f is differentiable and concave in the nonnegative orthant,
b) each constraint function is differentiable and convex in the nonnegative orthant,
c) the point x* satisfies the Kuhn-Tucker necessary maximum conditions,

then x* gives a global maximum of f. To adapt this theorem to minimization problems, we need to interchange the two words "concave" and "convex" in a) and b) and use the Kuhn-Tucker necessary minimum conditions in c).

Proposition 30 (Sufficiency Theorem 2) The KT conditions are sufficient conditions for x* to be a local optimum of a maximization (minimization) program if the following assumptions are satisfied:

• the objective function f is differentiable and quasiconcave (quasiconvex);
• each constraint g^i is differentiable and quasiconvex (quasiconcave);
• any one of the following holds:
– there exists j such that ∂f(x*)/∂xj < 0 (> 0);
– there exists j such that ∂f(x*)/∂xj > 0 (< 0) and xj can take a positive value without violating the constraints;
– f is concave (convex).

The problem of finding a nonnegative vector (x*, λ*), with x* = (x1*,...,xn*) and λ* = (λ1*,...,λm*), which satisfies the Kuhn-Tucker necessary conditions and for which

L(x, λ*) ≤ L(x*, λ*) ≤ L(x*, λ) for all x = (x1,...,xn) ≥ 0 and all λ = (λ1,...,λm) ≥ 0,

is known as the saddle point problem.

Proposition 31 If (x*, λ*) solves the saddle point problem, then (x*, λ*) solves the problem (6).
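Example 71's claim can be verified directly: at (95/34, 295/34) only the first constraint of Example 68 binds, and the implied multipliers satisfy every KT condition. A numerical sketch (the multiplier values below are computed from the binding pattern, not quoted from the text):

```python
x1, x2 = 95.0 / 34.0, 295.0 / 34.0       # candidate optimum from Example 69

# x1 < 5 and x2 < 10, so lam2 = lam3 = 0 by complementary slackness;
# x1 > 0 forces stationarity 10 - 2*x1 - 5*lam1 = 0, giving lam1 = 15/17.
lam1 = (10.0 - 2.0 * x1) / 5.0
lam2 = lam3 = 0.0

dLdx1 = 10.0 - 2.0 * x1 - 5.0 * lam1 - lam2
dLdx2 = 20.0 - 2.0 * x2 - 3.0 * lam1 - lam3

print(lam1)                                    # 15/17, nonnegative as required
print(abs(dLdx1) < 1e-9, abs(dLdx2) < 1e-9)    # stationarity holds in x1 and x2
print(abs(5 * x1 + 3 * x2 - 40) < 1e-9)        # the first constraint binds
```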
Economics Application 7 (Economic Interpretation of a Nonlinear Program and Kuhn-Tucker Conditions) A nonlinear program can be interpreted much like a linear program. A maximization program in the general form is, for example, the production problem facing a firm which has to produce n goods such that it maximizes its revenue subject to m resource (factor) constraints. The variables have the following economic interpretations:

• xj is the amount produced of the jth product;
• f is the profit (revenue) function;
• ri is the amount of the ith resource available;
• λi is the shadow price of resource i;
• g^i is a function which shows how the ith resource is used in producing the n goods.

The optimal solution to the maximization program indicates the optimal quantities of each good the firm should produce.

In order to interpret the Kuhn-Tucker conditions, we first have to note the meanings of the following variables:

• fj = ∂f/∂xj is the marginal profit (revenue) of product j;
• gj^i = ∂g^i/∂xj is the amount of resource i used in producing a marginal unit of product j;
• λi gj^i is the imputed cost of resource i incurred in the production of a marginal unit of product j.

The KT condition ∂L/∂xj ≤ 0 can be written as fj ≤ Σ_{i=1}^m λi gj^i, and it says that the marginal profit of the jth product cannot exceed the aggregate marginal imputed cost of the jth product.

The condition xj (∂L/∂xj) = 0 implies that, in order for good j to be produced (xj > 0), the marginal profit of good j must equal the aggregate marginal imputed cost (∂L/∂xj = 0). The same condition shows that good j is not produced (xj = 0) if there is an excess imputation (∂L/∂xj < 0).

The KT condition ∂L/∂λi ≥ 0 is simply a restatement of constraint i, which states that the total amount of resource i used in producing all the n goods should not exceed the total amount available, ri.

The condition λi (∂L/∂λi) = 0 indicates that if a resource is not fully used in the optimal solution (∂L/∂λi > 0), then its shadow price will be 0 (λi = 0). On the other hand, a fully used resource (∂L/∂λi = 0) may have a strictly positive shadow price (λi > 0).

Example 73 Let us find an economic interpretation for the maximization program given in Example 68:

Max R = x1(10 − x1) + x2(20 − x2)
subject to 5x1 + 3x2 ≤ 40
x1 ≤ 5
x2 ≤ 10
x1 ≥ 0, x2 ≥ 0.
A firm has to produce two goods using three kinds of resources, available in the amounts 40, 5, and 10, respectively. The first resource is used in the production of both goods: five units are necessary to produce one unit of good 1, and three units to produce one unit of good 2. The second resource is used only in producing good 1 and the third resource only in producing good 2. The sale prices of the two goods are given by the linear inverse demand equations p1 = 10 − x1 and p2 = 20 − x2. The problem the firm faces is how much to produce of each good in order to maximize revenue R = x1 p1 + x2 p2. The solution (95/34, 295/34) found in Example 69 gives the optimal amounts the firm should produce.

Economics Application 8 (The Sales-Maximizing Firm) Suppose that a firm's objective is to maximize its sales revenue subject to the constraint that profit is not less than a certain value. Let Q denote the amount of the good supplied on the market. If the revenue function is R(Q) = 20Q − Q², the cost function is C(Q) = Q² + 6Q + 2 and the minimum profit is 10, find the sales-maximizing quantity.

The firm's problem is

max R(Q) subject to R(Q) − C(Q) ≥ 10 and Q ≥ 0,

or

max R(Q) subject to 2Q² − 14Q + 2 ≤ −10 and Q ≥ 0.

The Lagrangian function is

Z = 20Q − Q² − λ(2Q² − 14Q + 12)

and the Kuhn-Tucker conditions are

∂Z/∂Q ≤ 0, Q ≥ 0 and Q (∂Z/∂Q) = 0,     (7)
∂Z/∂λ ≥ 0, λ ≥ 0 and λ (∂Z/∂λ) = 0.     (8)

Explicitly, the first inequalities in each row above are (after dividing through by 2)

−Q + 10 − λ(2Q − 7) ≤ 0,     (9)
−Q² + 7Q − 6 ≥ 0.     (10)

The second inequality in (7) and inequality (10) imply that Q > 0 (if Q were 0, (10) would not be satisfied). From the third condition in (7) it then follows that ∂Z/∂Q = 0. If λ were 0, the condition ∂Z/∂Q = 0 would lead to Q = 10, which is not consistent with (10); thus, we must have λ > 0 and ∂Z/∂λ = 0.

The quadratic equation −Q² + 7Q − 6 = 0 has the roots Q1 = 1 and Q2 = 6. Q1 implies λ = −9/5, which contradicts λ > 0. The solution Q2 = 6 leads to a positive value for λ. Thus, the sales-maximizing quantity is Q = 6.
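The case analysis above amounts to checking the two roots of −Q² + 7Q − 6 = 0 against the sign of the implied multiplier; a short numerical sketch:

```python
import numpy as np

# Candidate quantities from the binding profit constraint -Q^2 + 7Q - 6 = 0
roots = np.sort(np.roots([-1.0, 7.0, -6.0]))
print(roots)                              # approximately [1, 6]

# Stationarity -Q + 10 - lam*(2Q - 7) = 0  =>  lam = (10 - Q)/(2Q - 7)
lam = (10.0 - roots) / (2.0 * roots - 7.0)
print(lam)                                # approximately [-9/5, 4/5]

# Only the root with lam > 0 satisfies the KT conditions
Q_opt = roots[lam > 0][0]
print(Q_opt)                              # 6, the sales-maximizing quantity
```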
3.3 Appendix: Linear Programming

3.3.1 The Setup of the Problem
There are two general types of linear programs:

• the maximization program in n variables subject to m + n constraints:

Maximize π = Σ_{j=1}^n cj xj
subject to Σ_{j=1}^n aij xj ≤ ri (i = 1, 2,...,m)
and xj ≥ 0 (j = 1, 2,...,n);

• the minimization program in n variables subject to m + n constraints:

Minimize C = Σ_{j=1}^n cj xj
subject to Σ_{j=1}^n aij xj ≥ ri (i = 1, 2,...,m)
and xj ≥ 0 (j = 1, 2,...,n).

Note that without loss of generality all constraints in a linear program may be written as either ≤ inequalities or ≥ inequalities, because any ≤ inequality can be transformed into a ≥ inequality by multiplying both sides by −1.

A more concise way to express a linear program is by using matrix notation:

• the maximization program: Maximize π = c'x subject to Ax ≤ r and x ≥ 0;
• the minimization program: Minimize C = c'x subject to Ax ≥ r and x ≥ 0,

where c = (c1,...,cn)', x = (x1,...,xn)', r = (r1,...,rm)', and A is the m × n matrix

A = [ a11, a12, ..., a1n ; a21, a22, ..., a2n ; ... ; am1, am2, ..., amn ].

Example 74 The following is an example of a linear program:

Max π = 10x1 + 30x2
subject to 5x1 + 3x2 ≤ 40
x1 ≤ 5
x2 ≤ 10
x1 ≥ 0, x2 ≥ 0.
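Before turning to the solution theory, the program in Example 74 can be handed to an off-the-shelf solver as a sanity check; a sketch using scipy.optimize.linprog (the use of SciPy and of the HiGHS method are assumptions of this illustration):

```python
import numpy as np
from scipy.optimize import linprog

# Example 74: Max pi = 10*x1 + 30*x2 becomes min -10*x1 - 30*x2 for linprog
result = linprog(
    c=[-10.0, -30.0],
    A_ub=[[5.0, 3.0]],                  # 5*x1 + 3*x2 <= 40
    b_ub=[40.0],
    bounds=[(0.0, 5.0), (0.0, 10.0)],   # x1 <= 5 and x2 <= 10 as variable bounds
    method="highs",
)

print(result.x)       # the optimal plan, (2, 10)
print(-result.fun)    # the optimal value of pi, 320
```

Minimization programs are handled the same way once their ≥ constraints are rewritten as ≤ constraints (multiply by −1).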
Proposition 32 (The Globality Theorem) If the feasible set F of an optimization problem is a closed convex set, and if the objective function is a continuous concave (convex) function over the feasible set, then

i) any local maximum (minimum) will also be a global maximum (minimum), and
ii) the points in F at which the objective function is optimized will constitute a convex set.

Note that any linear program satisfies the assumptions of the globality theorem. Indeed, the objective function is linear; thus, it is both a concave and a convex function. On the other hand, each inequality restriction defines a closed half-space, which is also a convex set. Since the intersection of a finite number of closed convex sets is also a closed convex set, it follows that the feasible set F is closed and convex. The theorem says that for a linear program there is no difference between a local optimum and a global optimum, and that if a pair of optimal solutions to a linear program exists, then any convex combination of the two must also be an optimal solution.

Definition 47 An extreme point of a linear program is a point in its feasible set which cannot be derived from a convex combination of any other two points in the feasible set.

Example 75 For the problem in Example 74 the extreme points are (0, 0), (0, 10), (2, 10), (5, 5), (5, 0). They are just the corners of the feasible region represented in the x1Ox2 system of coordinates.

Proposition 33 The optimal solution of a linear program is an extreme point.

3.3.2 The Simplex Method
The next question is to develop an algorithm which will enable us to solve a linear program. Such an algorithm is called the simplex method. The simplex method is a systematic procedure for ﬁnding the optimal solution of a linear program. Loosely speaking, using this method we move from one extreme point of the feasible region to another until the optimal point is attained. The following two examples illustrate how the simplex method can be applied: Example 76 Consider again the maximization problem in Example 74. Step 1. Transform each inequality constraint into an equality constraint by inserting a dummy variable (slack variable) in each constraint. After applying this step, the problem becomes: Max π = 10x1 + 30x2 subject to 5x1 + 3x2 + d1 = 40 x1 + d2 = 5 x2 + d3 = 10 x1 ≥ 0, x2 ≥ 0, d1 ≥ 0, d2 ≥ 0, d3 ≥ 0.
The extreme points of the new feasible set are called basic feasible solutions (BFS). For each extreme point of the original feasible set there is a BFS. For example, the BFS corresponding to the extreme point (x1, x2) = (0, 0) is (x1, x2, d1, d2, d3) = (0, 0, 40, 5, 10).

Step 2. Set up a simplex tableau like the following one:

Row | π |  x1 |  x2 | d1 | d2 | d3 | Constant
 0  | 1 | −10 | −30 |  0 |  0 |  0 |        0
 1  | 0 |   5 |   3 |  1 |  0 |  0 |       40
 2  | 0 |   1 |   0 |  0 |  1 |  0 |        5
 3  | 0 |   0 |   1 |  0 |  0 |  1 |       10

Row 0 contains the coefficients of the objective function; rows 1, 2, 3 contain the coefficients of the constraints as they appear in the matrix of coefficients. The rightmost column shows the nonzero values of the current BFS (in rows 1, 2, 3) and the value of the objective function for the current BFS (in row 0).

Step 3. Find a BFS to be able to start the algorithm. A BFS is easy to find when (0,...,0) is in the feasible set, as it is in our example: take x1 = 0, x2 = 0, d1 = 40, d2 = 5, d3 = 10, π = 0. The nonzero variables are d1, d2, d3 (in rows 1, 2, 3 under d1, d2, d3, correspondingly). These nonzero variables always correspond to a basis in R^m, where m is the number of constraints (see the general form of a linear program); the current basis is formed by the column coefficient vectors which appear in the tableau in rows 1, 2, 3. They correspond to the zeros in row 0. In general, finding a BFS may require the introduction of additional artificial variables; Example 77 illustrates this situation.

The simplex method finds the optimal solution by moving from one BFS to another BFS until the optimum is reached. In order to switch to another BFS, we need to replace a variable currently in the basis by a variable which is not in the basis. The new BFS is chosen such that the value of the objective function is improved (that is, larger for a maximization problem and lower for a minimization problem).

Step 4. Choose the pivot. This amounts to choosing a pivot element which will indicate both the entering variable and the exiting variable. The entering variable (and, with it, the pivot column) is the variable having the lowest negative value in row 0 (the largest positive value for a minimization problem). Once we have determined the pivot column, the pivot row is found in this way:

• pick the strictly positive elements in the pivot column, except for the element in row 0;
• select the corresponding elements in the constant column and divide them by the strictly positive elements in the pivot column;
• the pivot row corresponds to the minimum of these ratios.

In our example, the pivot column is the column of x2 (since −30 < −10) and the pivot row is row 3 (since min{40/3, 10/1} = 10/1).
Thus, the pivot element is the 1 which lies at the intersection of the column of x2 and row 3.

Step 5. Make the appropriate transformations to reduce the pivot column to a unit vector (that is, a vector with 1 in the place of the pivot element and 0 in the rest). Usually, we first have to divide the pivot row by the pivot element to obtain 1 in the place of the pivot element; however, this is not necessary in our case since we already have 1 in that position. Then the new version of the pivot row is used to get zeros in the other places of the pivot column. In our example, we multiply the pivot row by 30 and add it to row 0, and then we multiply the pivot row by −3 and add it to row 1. Hence we obtain a new simplex tableau:

Row | π |  x1 | x2 | d1 | d2 | d3 | Constant
 0  | 1 | −10 |  0 |  0 |  0 | 30 |      300
 1  | 0 |   5 |  0 |  1 |  0 | −3 |       10
 2  | 0 |   1 |  0 |  0 |  1 |  0 |        5
 3  | 0 |   0 |  1 |  0 |  0 |  1 |       10

Step 6. Check whether row 0 still contains negative values (positive values for a minimization problem). If it does not, the optimal solution has been found; if it does, return to step 4. In our example, we still have a negative value in row 0 (−10); thus, the column of x1 becomes the pivot column (x1 enters the basis). Since min{10/5, 5/1} = 10/5, row 1 is the pivot row and the 5 in row 1 is the new pivot element. Applying step 5 (this time the pivot row must first be divided by 5), we obtain a third simplex tableau:

Row | π | x1 | x2 |   d1 | d2 |   d3 | Constant
 0  | 1 |  0 |  0 |    2 |  0 |   24 |      320
 1  | 0 |  1 |  0 |  1/5 |  0 | −3/5 |        2
 2  | 0 |  0 |  0 | −1/5 |  1 |  3/5 |        3
 3  | 0 |  0 |  1 |    0 |  0 |    1 |       10

There is no negative element left in row 0, so the optimum has been reached. The current basis consists of x1, x2, d2; thus, the optimal solution can be read from the rightmost column: x1 = 2, x2 = 10, d2 = 3, π = 320. The correspondence between the optimal values and the variables in the current basis is made using the 1's in the column vectors of x1, x2, d2.

Example 77 Consider the minimization problem:

Min C = x1 + x2
subject to x1 − x2 ≥ 0
−x1 − x2 ≥ −6
x2 ≥ 1
x1 ≥ 0, x2 ≥ 0.
After applying step 1 of the simplex algorithm, we obtain:

Min C = x1 + x2
subject to x1 − x2 − d1 = 0
−x1 − x2 − d2 = −6
x2 − d3 = 1
x1 ≥ 0, x2 ≥ 0, d1 ≥ 0, d2 ≥ 0, d3 ≥ 0.

This time (0, 0, 0, 0, 0) is not in the feasible set, so a starting BFS cannot be read off directly. One way of finding a starting BFS is to transform the problem by adding 3 more artificial variables f1, f2, f3 to the linear program as follows:

Min C = x1 + x2 + M(f1 + f2 + f3)
subject to x1 − x2 − d1 + f1 = 0
x1 + x2 + d2 + f2 = 6
x2 − d3 + f3 = 1
x1 ≥ 0, x2 ≥ 0, d1 ≥ 0, d2 ≥ 0, d3 ≥ 0, f1 ≥ 0, f2 ≥ 0, f3 ≥ 0.

(Before adding the artificial variables, the second constraint is multiplied by −1 to obtain a positive number on the right-hand side.) As long as M is high enough, the optimal solution will contain f1 = 0, f2 = 0, f3 = 0, and thus the optimum is not affected by including the artificial variables in the linear program. (In maximization problems, M is chosen very low.) In our case we choose M = 100.

In this example, the starting BFS is (x1, x2, d1, d2, d3, f1, f2, f3) = (0, 0, 0, 0, 0, 0, 6, 1) and the first simplex tableau is:

Row | C | x1 | x2 | d1 | d2 | d3 |   f1 |   f2 |   f3 | Constant
 0  | 1 | −1 | −1 |  0 |  0 |  0 | −100 | −100 | −100 |        0
 1  | 0 |  1 | −1 | −1 |  0 |  0 |    1 |    0 |    0 |        0
 2  | 0 |  1 |  1 |  0 |  1 |  0 |    0 |    1 |    0 |        6
 3  | 0 |  0 |  1 |  0 |  0 | −1 |    0 |    0 |    1 |        1

In order to obtain unit vectors in the columns of f1, f2 and f3, we add 100(row 1 + row 2 + row 3) to row 0:

Row | C |  x1 | x2 |   d1 |  d2 |   d3 | f1 | f2 | f3 | Constant
 0  | 1 | 199 | 99 | −100 | 100 | −100 |  0 |  0 |  0 |      700
 1  | 0 |   1 | −1 |   −1 |   0 |    0 |  1 |  0 |  0 |        0
 2  | 0 |   1 |  1 |    0 |   1 |    0 |  0 |  1 |  0 |        6
 3  | 0 |   0 |  1 |    0 |   0 |   −1 |  0 |  0 |  1 |        1

Proceeding with the other steps of the simplex algorithm will finally lead to the optimal solution x1 = 1, x2 = 1, C = 2.

Sometimes the starting BFS (or the starting base) can be found in a simpler way than by using the general method that was just illustrated.
Example 78 Let us find an economic interpretation of the maximization program given in Example 74:

Max π = 10x1 + 30x2
subject to
5x1 + 3x2 ≤ 40
x1 ≤ 5
x2 ≤ 10
x1 ≥ 0, x2 ≥ 0.

A firm has to produce two goods using three kinds of resources, available in the amounts 40, 5, 10 respectively. The first resource is used in the production of both goods: five units are necessary to produce one unit of good 1, and three units to produce 1 unit of good 2. The second resource is used only in producing good 1 and the third resource is used only in producing good 2. The question is, how much of each good should the firm produce in order to maximize its revenue if the sale prices of the goods are 10 and 30, respectively? The solution (2, 10) gives the optimal amounts the firm should produce.

Economics Application 10 (Economic Interpretation of a Minimization Program)
The minimization program (see the general form) is also connected to the firm's production problem. This time the firm wants to produce a fixed combination of m goods using n resources (factors). The problem lies in choosing the amount of each resource such that the total cost of all resources used in the production of the fixed combination of goods is minimized. The variables have the following economic interpretations:
• xj is the amount of the jth resource used in the production process;
• cj is the price of a unit of resource j;
• ri is the amount of the good i the firm decides to produce;
• aij is the marginal productivity of resource j in producing good i.
The optimal solution to the minimization program indicates the optimal quantities of each factor the firm should employ.

Example 79 Consider again the minimization program in Example 77:

Min C = x1 + x2
subject to
x1 − x2 ≥ 0
x1 + x2 ≤ 6
x2 ≥ 1
x1 ≥ 0, x2 ≥ 0.

What economic interpretation can be assigned to this linear program? A firm has to produce two goods. Two factors are used in the production process and the price of both factors is one. The marginal productivity of both factors in producing good 1 is one, and the marginal productivities of the factors in producing good 2 are zero and one, respectively (good 2 uses only the second factor). The firm has to produce at most six units of the first good and at least one unit of the second good; the production process also requires that the amount of the second factor not exceed the amount of the first factor. The question is how much of each factor the firm should employ in order to minimize the total factor cost. The solution (1, 1) gives the optimal factor amounts.
3.3.3 Duality

As a matter of fact, the maximization and the minimization programs are closely related. This relationship is called duality.

Definition 48 The dual (program) of the maximization program
Maximize π = c'x subject to Ax ≤ r and x ≥ 0
is the minimization program
Minimize π* = r'y subject to A'y ≥ c and y ≥ 0.

Definition 49 The dual (program) of the minimization program
Minimize C = c'x subject to Ax ≥ r and x ≥ 0
is the maximization program
Maximize C* = r'y subject to A'y ≤ c and y ≥ 0.

Definition 50 The original program from which the dual is derived is called the primal (program).

Example 80 The dual of the maximization program in Example 74 is
Min C* = 40y1 + 5y2 + 10y3
subject to
5y1 + y2 ≥ 10
3y1 + y3 ≥ 30
y1 ≥ 0, y2 ≥ 0, y3 ≥ 0.

Example 81 The dual of the minimization program in Example 77 is
Max π* = −6y2 + y3
subject to
y1 − y2 ≤ 1
−y1 − y2 + y3 ≤ 1
y1 ≥ 0, y2 ≥ 0, y3 ≥ 0.

The following duality theorems clarify the relationship between the dual and the primal.

Proposition 34 The optimal values of the primal and the dual objective functions are always identical, if the optimal solutions exist.
Proposition 35
1. If a certain choice variable in a primal (dual) program is optimally nonzero, then the corresponding dummy variable in the dual (primal) must be optimally zero.
2. If a certain dummy variable in a primal (dual) program is optimally nonzero, then the corresponding choice variable in the dual (primal) must be optimally zero.

The last simplex tableau of a linear program offers not only the optimal solution to that program but also the optimal solution to the dual program.

Recipe 14 – How to Solve the Dual Program:
The optimal solution to the dual program can be read from row 0 of the last simplex tableau of the primal in the following way:
1. the absolute values of the elements in the columns corresponding to the primal dummy variables are the values of the dual choice variables;
2. the absolute values of the elements in the columns corresponding to the primal choice variables are the values of the dual dummy variables.

Example 82 From the last simplex tableau in Example 74 we can infer the optimal solution to the problem in Example 80. It is y1 = 2, y2 = 0, y3 = 24, C* = 320.

Example 83 The last simplex tableau in Example 77 implies that the optimal solution to the problem in Example 81 is y1 = 1, y2 = 0, y3 = 2, π* = 2 = C.

Further reading:
• Intriligator, M. Mathematical Optimization and Economic Theory.
• Luenberger, D.G. Introduction to Linear and Nonlinear Programming.
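Proposition 34 can be verified numerically on the two primal-dual pairs above; a minimal check (plain Python, using the optima stated in Examples 74, 77, 82 and 83):

```python
# Numerical check of Proposition 34 (equal optimal values) on both pairs.
# Primal of Example 74: max pi = 10*x1 + 30*x2, optimum (x1, x2) = (2, 10).
# Its dual (Example 80): min C* = 40*y1 + 5*y2 + 10*y3, optimum (2, 0, 24).
pi_primal = 10 * 2 + 30 * 10          # 320
C_dual = 40 * 2 + 5 * 0 + 10 * 24     # 320
assert pi_primal == C_dual == 320

# Primal of Example 77: min C = x1 + x2, optimum (x1, x2) = (1, 1).
# Its dual (Example 81): max pi* = -6*y2 + y3, optimum (y1, y2, y3) = (1, 0, 2).
C_primal = 1 + 1                      # 2
pi_dual = -6 * 0 + 2                  # 2
assert C_primal == pi_dual == 2
print("strong duality holds on both examples")
```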
4 Dynamics

4.1 Differential Equations

Definition 51 An equation
F(x, y, y', ..., y^(n)) = 0,
which assumes a functional relationship between the independent variable x and the dependent variable y and includes x, y and derivatives of y with respect to x, is called an ordinary differential equation of order n. The order of a differential equation is determined by the order of the highest derivative in the equation. A function y = y(x) is said to be a solution of this differential equation if F(x, y(x), y'(x), ..., y^(n)(x)) = 0 for all x in some open interval I. The solution y(x) is said to solve Cauchy's problem (or the initial value problem) with the initial values (x0, y0) if y(x0) = y0.

Definition 52 A differential equation of the form y' = f(y) is said to be autonomous (i.e. y' is determined by y alone).

4.1.1 Differential Equations of the First Order

Let us consider the first order differential equations y' = f(x, y).

Proposition 36 (The Existence Theorem) If f is continuous in an open domain D, then for any given pair (x0, y0) in D there exists a solution y(x) of y' = f(x, y) such that y(x0) = y0.

Proposition 37 (The Uniqueness Theorem) If f and ∂f/∂y are continuous in an open domain D, then for any given pair (x0, y0) in D there exists a unique solution y(x) of y' = f(x, y) such that y(x0) = y0.

Note that the differential equation gives us the slope of the solution curves at all points of the region D. In particular, the solution curves cross the curve f(x, y) = k (k is a constant) with slope k. This curve is called the isocline of slope k. If we draw the set of isoclines obtained by taking different real values of k, we can schematically sketch the family of solution curves in the (x, y)-plane.

Recipe 15 – How to Solve a Differential Equation of the First Order:
There is no general method of solving differential equations. However, in some cases the solution can be easily obtained:

• The Variables Separable Case.
If y' = f(x)g(y), then ∫ dy/g(y) = ∫ f(x)dx. By integrating both parts of the latter equation, we will get a solution.

Example 84 Solve Cauchy's problem (x² + 1)y' + 2xy² = 0, given y(0) = 1.
If we rearrange the terms, this equation becomes equivalent to
dy/y² = −2x dx/(x² + 1).
The integration of both parts yields
−1/y = −ln(x² + 1) − C,
or
y(x) = 1/(ln(x² + 1) + C).
Since we should satisfy the condition y(0) = 1, we can evaluate the constant C: C = 1. Thus
y(x) = 1/(ln(x² + 1) + 1).
Note that y ≡ 0 is a special solution.

• Differential Equations with Homogeneous Coefficients.

Definition 53 A function f(x, y) is called homogeneous of degree n if for any λ
f(λx, λy) = λ^n f(x, y).

If we have the differential equation
M(x, y)dx + N(x, y)dy = 0
and M(x, y), N(x, y) are homogeneous of the same degree, then the change of variable y = tx reduces this equation to the variables separable case.

Example 85 Solve (y² − 2xy)dx + x²dy = 0.
The coefficients are homogeneous of degree 2. Therefore, let us change the variable: y = tx, dy = t dx + x dt. Thus
x²(t² − 2t)dx + x²(t dx + x dt) = 0.
Dividing by x² (the case x = 0, and with it y = 0, is checked separately) and rearranging the terms, we get a variables separable equation
dx/x = −dt/(t² − t).
Taking the integrals of both parts,
∫ dx/x = ln x,   −∫ dt/(t² − t) = ∫ (1/t − 1/(t − 1))dt = ln t − ln(t − 1) + C,
therefore
x = Ct/(t − 1), where t = y/x, or x = Cy/(y − x).
We can single out y as a function of x:
y = x²/(x − C).
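Closed-form answers like the one in Example 84 are easy to cross-check numerically. A sketch (plain Python, not from the text; a standard 4th-order Runge-Kutta scheme with an arbitrary step count):

```python
import math

# Cross-check of Example 84: integrate y' = -2*x*y**2 / (x**2 + 1) with
# y(0) = 1 numerically and compare with the closed form
# y(x) = 1 / (ln(x**2 + 1) + 1) derived above.
def f(x, y):
    return -2 * x * y * y / (x * x + 1)

def rk4(x0, y0, x1, n=1000):
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

for x in (0.5, 1.0, 2.0):
    exact = 1 / (math.log(x * x + 1) + 1)
    assert abs(rk4(0.0, 1.0, x) - exact) < 1e-8
print("RK4 agrees with the closed-form solution")
```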
• Exact Differential Equations.
If we have the differential equation M(x, y)dx + N(x, y)dy = 0 and ∂M/∂y ≡ ∂N/∂x, then all the solutions of this equation are given by F(x, y) = C, where F(x, y) is such that
∂F/∂x = M(x, y),  ∂F/∂y = N(x, y).

Example 86 Solve (y/x)dx + (y³ + ln x)dy = 0.
Since
∂(y/x)/∂y = 1/x = ∂(y³ + ln x)/∂x,
we have checked that we are dealing with an exact differential equation. Therefore, we need to find F(x, y) such that Fx = y/x and Fy = y³ + ln x. Thus
F(x, y) = ∫ (y/x)dx = y ln x + φ(y).
To find φ(y), note that
Fy = y³ + ln x = ln x + φ'(y)  ⟹  φ(y) = ∫ y³dy = (1/4)y⁴ + C.
Therefore, all solutions are given by the implicit function
4y ln x + y⁴ = C.

• Linear Differential Equations of the First Order.
If we need to solve y' + p(x)y = q(x), then all the solutions are given by the formula
y(x) = e^(−∫p(x)dx) [C + ∫ q(x)e^(∫p(x)dx)dx],
where C is a constant.

Example 87 Solve xy' − 2y = 2x⁴.
In other words, we need to solve y' − (2/x)y = 2x³. Therefore,
y(x) = e^(∫(2/x)dx) [C + ∫ 2x³e^(−∫(2/x)dx)dx] = e^(ln x²) [C + ∫ 2x³e^(−ln x²)dx] = x²(C + x²).

4.1.2 Linear Differential Equations of a Higher Order with Constant Coefficients

These are the equations of the form
y^(n) + a1·y^(n−1) + ... + a(n−1)·y' + an·y = f(x).
If f(x) ≡ 0 the equation is called homogeneous, otherwise it is called non-homogeneous.
Recipe 16 – How to Find the General Solution yg(x) of the Homogeneous Equation:
The general solution is a sum of basic solutions y1, ..., yn:
yg(x) = C1y1(x) + ... + Cnyn(x),
where C1, ..., Cn are arbitrary constants. These arbitrary constants can be defined in a unique way once we set up the initial value Cauchy problem: find y(x) such that y(x) = y00, y'(x) = y01, ..., y^(n−1)(x) = y0,n−1 at x = x0, where x0, y00, y01, ..., y0,n−1 are given initial values (real numbers).
To find all basic solutions, we proceed as follows:
1. Solve the characteristic equation
λ^n + a1·λ^(n−1) + ... + a(n−1)·λ + an = 0
for λ. The roots of this equation are λ1, ..., λn. Some of these roots may be complex numbers; we may also have repeated roots.
2. If λi is a simple real root, the basic solution corresponding to this root is yi(x) = e^(λi·x).
3. If λi is a repeated real root of degree k, it generates k basic solutions
yi1(x) = e^(λi·x), yi2(x) = x·e^(λi·x), ..., yik(x) = x^(k−1)·e^(λi·x).
4. If λj = αj + iβj is a simple complex root (i = √−1), then the complex conjugate of λj, say λj+1 = αj − iβj, is also a root of the characteristic equation. Therefore the pair λj, λj+1 gives rise to two basic solutions:
yj1 = e^(αj·x) cos βj·x,  yj2 = e^(αj·x) sin βj·x.
5. If λj = αj + iβj is a repeated complex root of degree l, then the complex conjugate of λj, say λj+1 = αj − iβj, is also a repeated root of degree l. These 2l roots generate 2l basic solutions:
yj1 = e^(αj·x) cos βj·x, yj2 = x·e^(αj·x) cos βj·x, ..., yjl = x^(l−1)·e^(αj·x) cos βj·x,
yj,l+1 = e^(αj·x) sin βj·x, yj,l+2 = x·e^(αj·x) sin βj·x, ..., yj,2l = x^(l−1)·e^(αj·x) sin βj·x.

Recipe 17 – How to Solve the Non-Homogeneous Equation:
The general solution of the non-homogeneous equation is
ynh(x) = yg(x) + yp(x),
where yg(x) is the general solution of the homogeneous equation and yp(x) is a particular solution of the non-homogeneous equation, i.e. any function which solves the non-homogeneous equation.
Recipe 18 – How to Find a Particular Solution of the Non-Homogeneous Equation:
1. If f(x) = Pk(x)e^(bx), where Pk(x) is a polynomial of degree k, then a particular solution can be found in the form
yp(x) = x^s·Qk(x)e^(bx),
where Qk(x) is a polynomial of the same degree k. If b is not a root of the characteristic equation, s = 0; if b is a root of the characteristic polynomial of degree m, then s = m.
2. If f(x) = Pk(x)e^(px) cos qx + Qk(x)e^(px) sin qx, where Pk(x), Qk(x) are polynomials of degree k, then a particular solution can be found in the form
yp(x) = x^s·Rk(x)e^(px) cos qx + x^s·Tk(x)e^(px) sin qx,
where Rk(x), Tk(x) are polynomials of degree k. If p + iq is not a root of the characteristic equation, s = 0; if p + iq is a root of the characteristic polynomial of degree m, then s = m.
3. If f(x) = f1(x) + f2(x) + ... + fr(x) and yp1(x), ..., ypr(x) are particular solutions corresponding to f1(x), ..., fr(x) respectively, then
yp(x) = yp1(x) + ... + ypr(x).
4. The general method for finding a particular solution of a non-homogeneous equation is called the variation of parameters. Suppose we have the general solution of the homogeneous equation yg = C1y1(x) + ... + Cnyn(x), where yi(x) are basic solutions. Treating the constants C1, ..., Cn as functions u1(x), ..., un(x) of x, we can express a particular solution of the non-homogeneous equation as
yp(x) = u1(x)y1(x) + ... + un(x)yn(x),
where u1(x), ..., un(x) are solutions of the system
u1'(x)y1(x) + ... + un'(x)yn(x) = 0,
u1'(x)y1'(x) + ... + un'(x)yn'(x) = 0,
...
u1'(x)y1^(n−2)(x) + ... + un'(x)yn^(n−2)(x) = 0,
u1'(x)y1^(n−1)(x) + ... + un'(x)yn^(n−1)(x) = f(x).
Example 88 Solve y'' − 5y' + 6y = t² − 5 + e^t.
The characteristic roots are λ1 = 2, λ2 = 3, therefore the general solution of the homogeneous equation is
y(t) = C1e^(2t) + C2e^(3t).
We search for a particular solution in the form yp(t) = at² + bt + c + de^t. To find the coefficients a, b, c, d, let us substitute the particular solution into the initial equation:
2a + de^t − 5(2at + b + de^t) + 6(at² + bt + c + de^t) = t² − 5 + e^t.
Equating term by term the coefficients on the left-hand side to those on the right-hand side, we obtain
6a = 1,  −5·2a + 6·b = 0,  2a − 5b + 6c = −5,  d − 5d + 6d = 1.
Thus a = 1/6, b = 10/36, c = −71/108, d = 1/2. Finally, the general solution of the non-homogeneous equation is
ynh(t) = C1e^(2t) + C2e^(3t) + t²/6 + 5t/18 − 71/108 + e^t/2.

Example 89 Solve y'' − 2y' + y = e^x/x.
The general solution of the homogeneous equation is yg(x) = C1·xe^x + C2·e^x (the characteristic equation has the repeated root λ = 1 of degree 2). We look for a particular solution of the non-homogeneous equation in the form
yp(x) = u1(x)xe^x + u2(x)e^x.
We have
u1'(x)xe^x + u2'(x)e^x = 0,
u1'(x)(e^x + xe^x) + u2'(x)e^x = e^x/x.
Therefore u1'(x) = 1/x, u2'(x) = −1. Integrating, we find that u1(x) = ln x, u2(x) = −x. Thus
yp(x) = x ln x·e^x − xe^x,
and, absorbing the term −xe^x into C1·xe^x,
ynh(x) = yg(x) + yp(x) = x ln x·e^x + C1·xe^x + C2·e^x.

4.1.3 Systems of First Order Linear Differential Equations

The general form of a system of first order differential equations is:
ẋ(t) = A(t)x(t) + b(t),  x(0) = x0,
where t is the independent variable ("time"), x(t) = (x1(t), ..., xn(t))' is a vector of dependent variables, A(t) = (aij(t)) is a real n × n matrix with variable coefficients, and b(t) = (b1(t), ..., bn(t))' is a variable n-vector.
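As a quick check of Example 88 above, the particular solution can be substituted back into the equation; a sketch (plain Python, derivatives taken analytically):

```python
import math

# Check of Example 88: yp(t) = t**2/6 + 5*t/18 - 71/108 + exp(t)/2 should
# satisfy y'' - 5*y' + 6*y = t**2 - 5 + exp(t).
def yp(t):
    return t * t / 6 + 5 * t / 18 - 71 / 108 + math.exp(t) / 2

def yp1(t):  # first derivative of yp
    return t / 3 + 5 / 18 + math.exp(t) / 2

def yp2(t):  # second derivative of yp
    return 1 / 3 + math.exp(t) / 2

for t in (-1.0, 0.0, 0.5, 2.0):
    lhs = yp2(t) - 5 * yp1(t) + 6 * yp(t)
    rhs = t * t - 5 + math.exp(t)
    assert abs(lhs - rhs) < 1e-9
print("yp solves the non-homogeneous equation")
```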
In what follows, we concentrate on the case when A and b do not depend on t, which is the case in constant coefficients differential equations systems:
ẋ(t) = Ax(t) + b,  x(0) = x0.  (11)
We also assume that A is nonsingular. The solution of system (11) can be determined in two steps:
• First, we consider the homogeneous system (corresponding to b = 0):
ẋ(t) = Ax(t),  x(0) = x0.  (12)
The solution xc(t) of this system is called the complementary solution.
• Second, we find a particular solution xp of the system (11), which is called the particular integral. The constant vector xp is simply the solution to Axp = −b, i.e. xp = −A⁻¹b. (In a sense, xp can be viewed as an affine shift, which restores the origin to a unique equilibrium of (11).)
Given xc(t) and xp, the general solution of the system (11) is simply:
x(t) = xc(t) + xp.

We can find the solution to the homogeneous system (12) in two different ways. First, we can eliminate n − 1 unknowns and reduce the system to one linear differential equation of degree n.

Example 90 Let ẋ = 2x + y, ẏ = 3x + 4y. Taking the derivative of the first equation and eliminating y and ẏ (ẏ = 3x + 4y = 3x + 4ẋ − 4·2x), we arrive at the second order homogeneous linear differential equation
ẍ − 6ẋ + 5x = 0,
which has the general solution x(t) = C1e^t + C2e^(5t). Since y(t) = ẋ − 2x, we find that
y(t) = −C1e^t + 3C2e^(5t).

The second way is to write the solution to (12) as x(t) = e^(At)x0, where (by definition)
e^(At) = I + At + A²t²/2! + ···
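The elimination result of Example 90 can be verified by direct substitution; a sketch (plain Python, arbitrary values for the constants C1, C2):

```python
import math

# Check of Example 90: with x(t) = C1*e^t + C2*e^(5t) and
# y(t) = x'(t) - 2*x(t) = -C1*e^t + 3*C2*e^(5t), both equations of the
# system x' = 2x + y, y' = 3x + 4y should hold.
C1, C2 = 1.3, -0.7

def x(t):  return C1 * math.exp(t) + C2 * math.exp(5 * t)
def dx(t): return C1 * math.exp(t) + 5 * C2 * math.exp(5 * t)
def y(t):  return -C1 * math.exp(t) + 3 * C2 * math.exp(5 * t)
def dy(t): return -C1 * math.exp(t) + 15 * C2 * math.exp(5 * t)

for t in (-0.5, 0.0, 0.4, 1.0):
    assert abs(dx(t) - (2 * x(t) + y(t))) < 1e-8
    assert abs(dy(t) - (3 * x(t) + 4 * y(t))) < 1e-8
print("the reduced-order solution solves the original system")
```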
Unfortunately, this formula is of little practical use. In order to find a feasible formula for e^(At), we distinguish three main cases.

Case 1: A has real and distinct eigenvalues.
The fact that the eigenvalues are real and distinct implies that the corresponding eigenvectors are linearly independent; therefore, A is diagonalizable:
A = PΛP⁻¹,
where P = [v1, v2, ..., vn] is a matrix composed of the eigenvectors of A and Λ is a diagonal matrix whose diagonal elements are the eigenvalues of A. Consequently, e^(At) = Pe^(Λt)P⁻¹, and the solution of the system (12) can be written as:
x(t) = Pe^(Λt)P⁻¹x0 = Pe^(Λt)c = c1v1e^(λ1·t) + ... + cnvn·e^(λn·t),
where c = (c1, c2, ..., cn)' is a vector of constants determined from the initial conditions (c = P⁻¹x0).

Case 2: A has real and repeated eigenvalues.
First, consider the simpler case when A has only one eigenvalue λ which is repeated m times (m is called the algebraic multiplicity of λ). If A is a real and symmetric matrix, then A is always diagonalizable and the formula given for case 1 can be applied. Generally, however, in this situation the maximum number of independent eigenvectors corresponding to λ is less than m, meaning that we cannot construct the matrix P of linearly independent eigenvectors and, therefore, A is not diagonalizable. The solution in this case has the form
x(t) = Σ_{i=1}^{m} ci·hi(t),
where the hi(t) are quasipolynomials and the ci are constants determined by the initial conditions. For example, if m = 3, we have:
h1(t) = e^(λt)v1,
h2(t) = e^(λt)(tv1 + v2),
h3(t) = e^(λt)((t²/2)v1 + tv2 + v3),
where v1, v2, v3 are determined by the conditions (A − λI)vi = v_(i−1), v0 = 0.
If A happens to have several eigenvalues which are repeated, the solution of (12) is obtained by finding the solution corresponding to each eigenvalue, and then adding up the individual solutions.
Case 3: A has complex eigenvalues.
Since A is a real matrix, the complex eigenvalues always appear in pairs: if A has the eigenvalue α + βi, then it also must have the eigenvalue α − βi. Now consider the simpler case when A has only one pair of complex eigenvalues, λ1 = α + βi and λ2 = α − βi. Let v1 and v2 denote the eigenvectors corresponding to λ1 and λ2; then v2 = v̄1, where v̄1 is the conjugate of v1. The solution can be expressed as:
x(t) = e^(At)x0
     = Pe^(Λt)P⁻¹x0
     = Pe^(Λt)c
     = c1v1e^((α+βi)t) + c2v2e^((α−βi)t)
     = c1v1e^(αt)(cos βt + i sin βt) + c2v2e^(αt)(cos βt − i sin βt)
     = (c1v1 + c2v2)e^(αt) cos βt + i(c1v1 − c2v2)e^(αt) sin βt
     = h1e^(αt) cos βt + h2e^(αt) sin βt,
where h1 = c1v1 + c2v2 and h2 = i(c1v1 − c2v2) are real vectors.
Note that if A has more pairs of complex eigenvalues, then the solution of (12) is obtained by finding the solution corresponding to each eigenvalue, and then adding up the individual solutions. The same remark holds true when we encounter any combination of the three cases discussed above.

Example 91 Consider
A = [5 2; −4 −1].
Solving the characteristic equation det(A − λI) = λ² − 4λ + 3 = 0, we find the eigenvalues of A: λ1 = 1, λ2 = 3. From the condition (A − λ1I)v1 = 0 we find an eigenvector corresponding to λ1:
v1 = (−1, 2)'.
Similarly, the condition (A − λ2I)v2 = 0 gives an eigenvector corresponding to λ2:
v2 = (−1, 1)'.
Therefore, the general solution of the system ẋ(t) = Ax(t) is
x(t) = c1v1e^(λ1·t) + c2v2e^(λ2·t),
or
(x1(t), x2(t))' = (−c1e^t − c2e^(3t), 2c1e^t + c2e^(3t))'.

Example 92 Consider
A = [3 1; −1 1].
Here det(A − λI) = λ² − 4λ + 4 = 0 gives the eigenvalue λ = 2 repeated twice. From the condition (A − λI)v1 = 0 we find v1 = (−1, 1)', and from (A − λI)v2 = v1 we obtain v2 = (1, −2)'. The solution of the system ẋ(t) = Ax(t) is thus
x(t) = c1h1(t) + c2h2(t) = c1v1e^(λt) + c2(tv1 + v2)e^(λt),
or
(x1(t), x2(t))' = (−c1 + c2 − c2t, c1 − 2c2 + c2t)'·e^(2t).

Example 93 Consider
A = [3 2; −1 1].
The equation det(A − λI) = λ² − 4λ + 5 = 0 leads to the complex eigenvalues λ1 = 2 + i, λ2 = 2 − i. For λ1 we set up the equation (A − λ1I)v1 = 0 and obtain:
v1 = (1 + i, −1)',  v2 = v̄1 = (1 − i, −1)'.
The solution of the system ẋ(t) = Ax(t) is
x(t) = c1v1e^(λ1·t) + c2v2e^(λ2·t) = c1v1e^((2+i)t) + c2v2e^((2−i)t) = c1v1e^(2t)(cos t + i sin t) + c2v2e^(2t)(cos t − i sin t).
Performing the computations, we finally obtain:
x(t) = (c1 + c2)e^(2t)·(cos t − sin t, −cos t)' + i(c1 − c2)e^(2t)·(cos t + sin t, −sin t)'.

Now let the vector b be different from 0. To specify the solution to the system (11), we need a particular solution xp. This is found from the condition ẋ = 0; therefore, xp = −A⁻¹b. Thus, the general solution of the non-homogeneous system is
x(t) = xc(t) + xp = Pe^(Λt)P⁻¹c − A⁻¹b,
where the vector of constants c is determined by the initial condition x(0) = x0 as c = x0 + A⁻¹b.
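The repeated-root case is the one most often got wrong by hand, so a direct substitution check is worthwhile; a sketch (plain Python, arbitrary constants, analytic derivatives via the product rule):

```python
import math

# Check for A = [[3, 1], [-1, 1]] (a single eigenvalue 2 of multiplicity 2):
# the proposed solution x1(t) = (-c1 + c2 - c2*t)*e^(2t),
# x2(t) = (c1 - 2*c2 + c2*t)*e^(2t) should satisfy x' = A x.
c1, c2 = 0.8, -1.1

def x1(t):  return (-c1 + c2 - c2 * t) * math.exp(2 * t)
def x2(t):  return (c1 - 2 * c2 + c2 * t) * math.exp(2 * t)
def dx1(t): return -c2 * math.exp(2 * t) + 2 * x1(t)
def dx2(t): return c2 * math.exp(2 * t) + 2 * x2(t)

for t in (-1.0, 0.0, 0.7, 1.5):
    assert abs(dx1(t) - (3 * x1(t) + x2(t))) < 1e-8
    assert abs(dx2(t) - (-x1(t) + x2(t))) < 1e-8
print("the repeated-root solution solves x' = Ax")
```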
. Economics Application 12 (The Dynamic ISLM Model) A simpliﬁed dynamic ISLM model can be speciﬁed by the following system of diﬀerential equations: ˙ Y = w1 (I − S) r = w2 [L(Y.Example 94 Consider A= 5 2 −4 −1 . Si (p) demand and supply for good i. . we have c= and −xp = A−1 b = −1 3 4 3 −2 3 5 3 c1 c2 = x0 + A−1 b = −1 2 + −5 3 14 3 = −8 3 20 3 1 2 = −5 3 14 3 . .b = 1 2 and x0 = −1 2 . . . x2 . ˙ p(0) = p0 where W = diag(wi ) is a diagonal matrix having the speeds of adjustment wi . . . Let p = (p1 . . . where: 73 . . Economics Application 11 (General Equilibrium) Consider a market for n goods x = (x1 . . . . then the solution is p(t) = P eΛt P −1 p0 . . In addition. . . . if W A = P ΛP −1 . . . . xn ) . i = 1. . . In Example 91 we obtained xc (t) = −c1 et − c2 e3t 2c1 et + c2 e3t . i = 1. The dynamics of price adjustment for the general equilibrium can be described by a system of ﬁrst order diﬀerential equations of the form: p = W [D(p) − S(p)] = W Ap. For instance. n. . . The solution of the system x(t) = Ax(t) + b is ˙ x(t) = xc (t) + xp = 8 t e 3 −16 t e 3 − + 20 3t e 3 20 3t e 3 + + 5 3 −14 3 . . The general equilibrium is obtained for a price vector p∗ which satisﬁes the conditions: Di (p∗ ) = Si (p∗ ) for all i = 1. pn ) denote the vector of prices and Di (p). r(0) = r0 . n. Sn (p)] are linear functions of p and A is a constant n × n matrix. ˙ with initial conditions Y (0) = Y0 . . p2 . Dn (p)] on the diagonal and S(p) = [S1 (p). . n. r) − M ]. D(p) = [D1 (p). . We can solve this system by applying the methods discussed above.
• Y is national income;
• r is the interest rate;
• I = I0 − αr is the investment function (α is a positive constant);
• S = s(Y − T) + (T − G) is national savings (T, G and s represent taxes, government purchases and the savings rate);
• L(Y, r) = a1Y − a2r is money demand (a1 and a2 are positive constants);
• M is money supply;
• w1, w2 are speeds of adjustment.
For simplicity, consider w1 = 1 and w2 = 1. The original system can then be rewritten as:
(Ẏ, ṙ)' = [−s −α; a1 −a2](Y, r)' + (I0 − (1 − s)T + G, −M)'.
Assume that A = [−s −α; a1 −a2] is diagonalizable, i.e. A = PΛP⁻¹, and has distinct eigenvalues λ1, λ2 with corresponding eigenvectors v1, v2. (λ1, λ2 are the solutions of the equation λ² + (s + a2)λ + sa2 + αa1 = 0.) The solution becomes:
(Y(t), r(t))' = Pe^(Λt)P⁻¹(x0 + A⁻¹b) − A⁻¹b,
where x0 = (Y0, r0)' and b = (I0 − (1 − s)T + G, −M)'. We have
A⁻¹ = 1/(αa1 + sa2)·[−a2 α; −a1 −s]
and
A⁻¹b = 1/(αa1 + sa2)·(−a2[I0 − (1 − s)T + G] − αM, −a1[I0 − (1 − s)T + G] + sM)'.
Assume that P⁻¹ = [ṽ1; ṽ2], written row by row. Then
Pe^(Λt)P⁻¹ = [v1, v2]·diag(e^(λ1·t), e^(λ2·t))·[ṽ1; ṽ2] = v1ṽ1e^(λ1·t) + v2ṽ2e^(λ2·t).
Finally, denoting S̄ = I0 − (1 − s)T + G (not to be confused with savings S), the solution is:
(Y(t), r(t))' = [v1ṽ1e^(λ1·t) + v2ṽ2e^(λ2·t)]·[(Y0, r0)' + ((−a2S̄ − αM)/(αa1 + sa2), (−a1S̄ + sM)/(αa1 + sa2))'] − ((−a2S̄ − αM)/(αa1 + sa2), (−a1S̄ + sM)/(αa1 + sa2))'.
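The adjustment dynamics can also be explored numerically. The sketch below (plain Python; all parameter values are illustrative choices, not taken from the text) integrates the system with w1 = w2 = 1 by Euler steps and confirms that the trajectory approaches the stationary point where Ẏ = ṙ = 0:

```python
# Illustrative simulation of the dynamic IS-LM system:
# Y' = -s*Y - alpha*r + I0 - (1 - s)*T + G,  r' = a1*Y - a2*r - M.
s, alpha, a1, a2 = 0.3, 1.0, 0.5, 0.8
I0, T, G, M = 2.0, 1.0, 1.0, 1.0
b1 = I0 - (1 - s) * T + G        # constant term of the Y equation

# Stationary point: solve -s*Y - alpha*r + b1 = 0 and a1*Y - a2*r - M = 0.
det = s * a2 + alpha * a1
Y_star = (a2 * b1 + alpha * M) / det
r_star = (a1 * b1 - s * M) / det

Y, r = 0.0, 0.0                  # arbitrary initial condition
dt = 0.001
for _ in range(40000):           # integrate forward by Euler steps
    dY = -s * Y - alpha * r + b1
    dr = a1 * Y - a2 * r - M
    Y, r = Y + dt * dY, r + dt * dr

assert abs(Y - Y_star) < 1e-3 and abs(r - r_star) < 1e-3
print(round(Y_star, 4), round(r_star, 4))
```

Convergence is expected here because tr(A) = −s − a2 < 0 and det(A) = sa2 + αa1 > 0, so both eigenvalues have negative real parts.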
Economics Application 13 (The Solow Model)
The Solow model is the basic model of economic growth, and it relies on the following key differential equation, which explains the dynamics of capital per unit of effective labor:
k̇(t) = sf(k(t)) − (n + g + δ)k(t),
where
• k denotes capital per effective labor;
• s is the savings rate;
• n is the rate of population growth;
• g is the rate of technological progress;
• δ is the depreciation rate.
Let us consider two types of production function.

Case 1. The production function is linear: f(k(t)) = ak(t), where a is a positive constant. The equation for k becomes:
k̇(t) = (sa − n − g − δ)k(t),  k(0) = k0.
This linear differential equation has the solution
k(t) = k(0)e^((sa−n−g−δ)t) = k0e^((sa−n−g−δ)t).

Case 2. The production function takes the Cobb-Douglas form: f(k(t)) = [k(t)]^α, where α ∈ [0, 1]. In this case, the equation for k represents a Bernoulli nonlinear differential equation:
k̇(t) + (n + g + δ)k(t) = s[k(t)]^α,  k(0) = k0.
Dividing both sides of the latter equation by [k(t)]^α and denoting m(t) = [k(t)]^(1−α), we obtain:
(1/(1 − α))ṁ(t) + (n + g + δ)m(t) = s,  m(0) = (k0)^(1−α),
or
ṁ(t) + (1 − α)(n + g + δ)m(t) = s(1 − α).
The solution to this first order linear differential equation is given by:
m(t) = e^(−∫(1−α)(n+g+δ)dt)·[A + ∫ s(1 − α)e^(∫(1−α)(n+g+δ)dt)dt]
     = e^(−(1−α)(n+g+δ)t)·[A + s(1 − α)∫ e^((1−α)(n+g+δ)t)dt]
     = Ae^(−(1−α)(n+g+δ)t) + s/(n + g + δ),
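The Bernoulli substitution can be checked numerically. The sketch below (plain Python; the parameter values are illustrative, not from the text) integrates the Cobb-Douglas case by Euler steps and compares the result with the closed form implied by m(t) = k(t)^(1−α), with the constant A fixed by the initial condition m(0) = k0^(1−α):

```python
import math

# Numerical check of the Solow model, Cobb-Douglas case:
# k' = s*k**alpha - (n + g + d)*k, against the closed form derived above.
s, alpha, n, g, d = 0.25, 0.4, 0.01, 0.02, 0.05
k0, t_end, steps = 1.0, 30.0, 300000

def k_closed(t):
    lam = (1 - alpha) * (n + g + d)
    A = k0 ** (1 - alpha) - s / (n + g + d)
    m = A * math.exp(-lam * t) + s / (n + g + d)
    return m ** (1 / (1 - alpha))

k, dt = k0, t_end / steps
for _ in range(steps):
    k += dt * (s * k ** alpha - (n + g + d) * k)

assert abs(k - k_closed(t_end)) < 1e-3
print(round(k, 4))
```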
where A is determined from the initial condition m(0) = (k0)^(1−α). It follows that A = (k0)^(1−α) − s/(n + g + δ) and
m(t) = [(k0)^(1−α) − s/(n + g + δ)]e^(−(1−α)(n+g+δ)t) + s/(n + g + δ),
k(t) = {[(k0)^(1−α) − s/(n + g + δ)]e^(−(1−α)(n+g+δ)t) + s/(n + g + δ)}^(1/(1−α)).

Economics Application 14 (The Ramsey-Cass-Koopmans Model)
This model takes capital per effective labor k(t) and consumption per effective labor c(t) as endogenous variables. Their behavior is described by the following system of differential equations:
k̇(t) = f(k(t)) − c(t) − (n + g)k(t),
ċ(t) = c(t)·[f'(k(t)) − ρ − θg]/θ,
with the initial conditions k(0) = k0, c(0) = c0, where, in addition to the variables already specified in the previous application,
• ρ stands for the discount rate;
• θ is the coefficient of relative risk aversion.
Assume again that f(k(t)) = ak(t), where a > 0 is a constant. The initial system reduces to a system of linear differential equations:
k̇(t) = (a − n − g)k(t) − c(t),
ċ(t) = c(t)·(a − ρ − θg)/θ.
The second equation implies that:
c(t) = c0e^(bt), where b = (a − ρ − θg)/θ.
Substituting for c(t) in the first equation, we obtain:
k̇(t) = (a − n − g)k(t) − c0e^(bt).
The solution of this linear differential equation takes the form:
k(t) = kc(t) + kp(t),
where kc and kp are the complementary solution and the particular integral, respectively. The homogeneous equation k̇(t) = (a − n − g)k(t) gives the complementary solution:
kc(t) = Ce^((a−n−g)t).
The particular integral should be of the form kp(t) = me^(bt); the value of m can be determined by substituting kp into the non-homogeneous equation for k:
mbe^(bt) = (a − n − g)me^(bt) − c0e^(bt).
It follows that
m = c0/(a − n − g − b),
so
kp(t) = c0e^(bt)/(a − n − g − b)
and
k(t) = [k0 − c0/(a − n − g − b)]e^((a−n−g)t) + c0e^(bt)/(a − n − g − b),
where the constant in the complementary solution has been chosen so that k(0) = k0.

4.1.4 Simultaneous Differential Equations. Types of Equilibria

Let x = x(t), y = y(t), where t is the independent variable, and consider a two-dimensional autonomous system of simultaneous differential equations
dx/dt = f(x, y),  dy/dt = g(x, y).
The point (x*, y*) is called an equilibrium of this system if f(x*, y*) = g(x*, y*) = 0. Let J be the Jacobian matrix
J = [∂f/∂x ∂f/∂y; ∂g/∂x ∂g/∂y]
evaluated at (x*, y*), and let λ1, λ2 be the eigenvalues of this Jacobian. Then the equilibrium is
1. a stable (unstable) node if λ1, λ2 are real, distinct and both negative (positive);
2. a saddle if the eigenvalues are real and of different signs, i.e. λ1λ2 < 0;
3. a stable (unstable) focus if λ1, λ2 are complex and Re(λ1) < 0 (Re(λ1) > 0);
4. a center or a focus if λ1, λ2 are complex and Re(λ1) = 0;
5. a stable (unstable) improper node if λ1, λ2 are real, λ1 = λ2 < 0 (λ1 = λ2 > 0) and the Jacobian is not a diagonal matrix;
6. a stable (unstable) star node if λ1 = λ2 < 0 (λ1 = λ2 > 0) and the Jacobian is a diagonal matrix.
The figure below illustrates this classification:
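The classification above translates directly into code. A sketch (plain Python; degenerate boundary cases such as zero eigenvalues are deliberately not handled):

```python
import cmath

# Classify a two-dimensional equilibrium from the eigenvalues of the
# Jacobian J, following the six cases listed above.
def classify(J):
    (a, b), (c, d) = J
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    l1 = (tr + cmath.sqrt(disc)) / 2
    l2 = (tr - cmath.sqrt(disc)) / 2
    if abs(l1.imag) > 1e-12:                      # complex pair
        if abs(l1.real) < 1e-12:
            return "center or focus"
        return "stable focus" if l1.real < 0 else "unstable focus"
    r1, r2 = l1.real, l2.real
    if r1 * r2 < 0:
        return "saddle"
    if abs(r1 - r2) > 1e-12:                      # real and distinct
        return ("stable" if r1 < 0 else "unstable") + " node"
    kind = "star node" if (b == 0 and c == 0) else "improper node"
    return ("stable " if r1 < 0 else "unstable ") + kind

print(classify([[-2, -8], [4, -8]]))   # complex eigenvalues, Re < 0
print(classify([[4, 4], [-8, 16]]))    # real, distinct, both positive
```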
[Figure: phase portraits of the six types of equilibria: 1) Node, 2) Saddle, 3) Focus, 4) Center, 5) Improper node, 6) Star node.]

Example 95 Find the equilibria of the system and classify them:
ẋ = 2xy − 4y − 8,
ẏ = 4y² − x².
Solving the system 2xy − 4y − 8 = 0, 4y² − x² = 0 for x, y, we find two equilibria: (−2, −1) and (4, 2). At (−2, −1) the Jacobian is
J = [2y 2x − 4; −2x 8y] = [−2 −8; 4 −8].
The characteristic equation for J is λ² − (tr(J))λ + det(J) = 0. Here
(tr(J))² − 4det(J) < 0 ⟹ λ1, λ2 are complex, and
tr(J) < 0 ⟹ λ1 + λ2 < 0 ⟹ Re(λ1) < 0 ⟹ the equilibrium (−2, −1) is a stable focus.
At (4, 2),
J = [4 4; −8 16],
(tr(J))² − 4det(J) > 0 ⟹ λ1, λ2 are distinct and real;
det(J) = λ1λ2 > 0 and tr(J) = λ1 + λ2 > 0 ⟹ λ1 > 0, λ2 > 0 ⟹ the equilibrium (4, 2) is an unstable node.

4.2 Difference Equations

Let y denote a real-valued function defined over the set of natural numbers. Hereafter yk stands for y(k), the value of y at k.

Definition 54
• The first difference of y evaluated at k is ∆yk = yk+1 − yk.
• The second difference of y evaluated at k is ∆²yk = ∆(∆yk) = yk+2 − 2yk+1 + yk.
• In general, for any integer n, n > 1, the nth difference of y evaluated at k is defined as ∆^n·yk = ∆(∆^(n−1)·yk).

Definition 55 A difference equation in the unknown y is an equation relating y and any of its differences ∆y, ∆²y, ...

Example 96 (Examples of Difference Equations)
• 2∆yk − yk = 1 (or, equivalently, 2∆y − y = 1)
• ∆²yk + 5yk∆yk + (yk)² = 2 (or ∆²y + 5y∆y + (y)² = 2)

Note that any difference equation can be written as
f(yk+n, yk+n−1, ..., yk+1, yk) = g(k),
where k = 0, 1, 2, ... For example, the above two equations may also be expressed as follows:
• 2yk+1 − 3yk = 1
• yk+2 − 2yk+1 + yk + 5ykyk+1 − 4(yk)² = 2

Solving a difference equation means finding all functions y which satisfy, for each value of k, the relation specified by the equation.

Definition 56 A difference equation is linear if it takes the form
f0(k)yk+n + f1(k)yk+n−1 + ... + fn−1(k)yk+1 + fn(k)yk = g(k),
where f0, f1, ..., fn, g are each functions of k (but not of some yk+i, i in N) defined for all k = 0, 1, 2, ...
Definition 57 A linear difference equation is of order n if both f0(k) and fn(k) are different from zero for each k = 0, 1, 2, ...

From now on, we focus on the problem of solving linear difference equations of order n with constant coefficients. The general form of such an equation is:
f0yk+n + f1yk+n−1 + ... + fn−1yk+1 + fnyk = gk,
where, this time, f0, f1, ..., fn are real constants and f0, fn are different from zero. Dividing this equation by f0 and denoting ai = fi/f0 for i = 0, 1, ..., n and rk = gk/f0, the general form of a linear difference equation of order n with constant coefficients becomes:
yk+n + a1yk+n−1 + ... + an−1yk+1 + anyk = rk,  (13)
where an is nonzero.

Recipe 19 – How to Solve a Difference Equation: General Procedure:
The general procedure for solving a linear difference equation of order n with constant coefficients involves three steps:
Step 1. Consider the homogeneous equation
yk+n + a1yk+n−1 + ... + an−1yk+1 + anyk = 0,
and find its solution Y.
Step 2. Find a particular solution y* of the complete equation (13).
Step 3. The general solution of equation (13) is: yk = Y + y*.

(First we will describe and illustrate the solving procedure for equations of order 1 and 2. Following that, the theory for the case of order n is obtained as a natural extension of the theory for second-order equations.)

4.2.1 First-Order Linear Difference Equations

The general form of a first-order linear difference equation is:
yk+1 + ayk = rk.
The corresponding homogeneous equation is:
yk+1 + ayk = 0,
and has the solution Y = C(−a)^k, where C is an arbitrary constant. The most useful technique for finding particular solutions is the method of undetermined coefficients, which will be illustrated in the examples that follow.
Much as in the case of linear differential equations, the general solution of the nonhomogeneous difference equation is the sum of the general solution of the homogeneous equation and a particular solution of the nonhomogeneous equation. A particular solution to the nonhomogeneous difference equation can be found using the method of undetermined coefficients.

Recipe 20 – How to Solve a First-Order Non-Homogeneous Linear Difference Equation:
First, consider the case when r_k is constant over time, i.e. r_k ≡ r for all k. Then

y_k = A·(−a)^k + r/(1+a),   a ≠ −1,
y_k = A·(−a)^k + rk = A + rk,   a = −1,

where A is an arbitrary constant. If we have the initial condition y_k = y_0 when k = 0, then the general solution of the nonhomogeneous equation takes the form

y_k = (y_0 − r/(1+a))·(−a)^k + r/(1+a),   a ≠ −1,
y_k = y_0 + rk,   a = −1.

If r now depends on k, then the solution takes the form

y_k = (−a)^k y_0 + Σ_{i=0}^{k−1} (−a)^{k−1−i} r_i.

Example 97 y_{k+1} − 3y_k = 2.
The solution of the homogeneous equation y_{k+1} − 3y_k = 0 is Y = C·3^k. The right-hand term suggests looking for a constant function as a particular solution. Assuming y* = A and substituting into the initial equation, we obtain A = −1; hence y* = −1 and the general solution is y_k = Y + y* = C·3^k − 1.

Example 98 y_{k+1} − 3y_k = k² + k + 2.
The homogeneous equation is the same as in the preceding example. But the right-hand term suggests finding a particular solution of the form y* = Ak² + Bk + D. Substituting y* into the nonhomogeneous equation, we have

A(k+1)² + B(k+1) + D − 3Ak² − 3Bk − 3D = k² + k + 2,

or

−2Ak² + 2(A − B)k + A + B − 2D = k² + k + 2.

In order to satisfy this equality for each k, we must have

−2A = 1,   2(A − B) = 1,   A + B − 2D = 2.

It follows that A = −1/2, B = −1, D = −7/4 and y* = −(1/2)k² − k − 7/4. The general solution is y_k = Y + y* = C·3^k − (1/2)k² − k − 7/4.
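The closed forms in Recipe 20 and Examples 97 and 98 can be cross-checked by simply iterating the recurrence. A small sketch (illustrative code with arbitrarily chosen initial values, not part of the original text):

```python
# Iterate y_{k+1} + a*y_k = r_k and compare with the closed-form solutions.

def iterate(a, r, y0, steps):
    """r may be a constant or a function of k."""
    y = [y0]
    for k in range(steps):
        rk = r(k) if callable(r) else r
        y.append(rk - a * y[-1])
    return y

# Example 97: y_{k+1} - 3 y_k = 2, solution y_k = C*3^k - 1 with C = y0 + 1.
y0 = 5.0
path = iterate(a=-3, r=2, y0=y0, steps=10)
closed = [(y0 + 1) * 3 ** k - 1 for k in range(11)]
print(max(abs(p - c) for p, c in zip(path, closed)) < 1e-9)  # True

# Example 98: y_{k+1} - 3 y_k = k^2 + k + 2,
# particular solution y* = -(1/2)k^2 - k - 7/4, so C = y0 + 7/4.
y0 = 1.0
path = iterate(a=-3, r=lambda k: k ** 2 + k + 2, y0=y0, steps=10)
closed = [(y0 + 7 / 4) * 3 ** k - 0.5 * k ** 2 - k - 7 / 4 for k in range(11)]
print(max(abs(p - c) for p, c in zip(path, closed)) < 1e-9)  # True
```

The agreement of the iterated path with the closed form also confirms the coefficients A = −1/2, B = −1, D = −7/4 of Example 98.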
Example 99 y_{k+1} − 3y_k = 4e^k.
The homogeneous equation is the same as in the preceding example. This time we try a particular solution of the form y* = Ae^k. After substitution, we find A = 4/(e − 3). The general solution is y_k = Y + y* = C·3^k + 4e^k/(e − 3).

Two remarks:
• The method of undetermined coefficients illustrated by these examples applies to the general case of order n as well. The trial solutions corresponding to some simple functions r_k are given in the following table:

Right-hand term r_k        Trial solution y*
a^k                        A·a^k
k^n                        A_0 + A_1 k + A_2 k² + ... + A_n k^n
sin ak or cos ak           A sin ak + B cos ak

• If the function r is a combination of simple functions for which we know the trial solutions, then a trial solution for r can be obtained by combining the trial solutions for the simple functions. For example, if r_k = 5^k k³, then the trial solution would be y* = 5^k(A_0 + A_1 k + A_2 k² + A_3 k³).

4.2.2 Second-Order Linear Difference Equations

The general form is

y_{k+2} + a_1 y_{k+1} + a_2 y_k = r_k.          (14)

The solution to the corresponding homogeneous equation

y_{k+2} + a_1 y_{k+1} + a_2 y_k = 0          (15)

depends on the solutions to the quadratic equation

m² + a_1 m + a_2 = 0,

which is called the auxiliary (or characteristic) equation of equation (14). Note that both roots m_1 and m_2 will be nonzero, because a_2 must be nonzero for (14) to be of second order. Let m_1, m_2 denote the roots of the characteristic equation.

Case 1. m_1 and m_2 are real and distinct. The solution to (15) is Y = C_1 m_1^k + C_2 m_2^k, where C_1, C_2 are arbitrary constants.

Case 2. m_1 and m_2 are real and equal (m_1 = m_2 = m). The solution to (15) is Y = C_1 m^k + C_2 k m^k = (C_1 + C_2 k)m^k.

Case 3. m_1 and m_2 are complex with the polar forms r(cos θ ± i sin θ), r > 0, θ ∈ (−π, π]. The solution to (15) is Y = C_1 r^k cos(kθ + C_2).

Recall that any complex number a + bi admits a representation in the polar (or trigonometric) form r(cos θ ± i sin θ). The transformation is given by:
• r = √(a² + b²);
• θ is the unique angle (in radians) such that θ ∈ (−π, π] and

cos θ = a/√(a² + b²),   sin θ = b/√(a² + b²).
Example 100 y_{k+2} − 5y_{k+1} + 6y_k = 0.
The characteristic equation is m² − 5m + 6 = 0 and has the roots m_1 = 2, m_2 = 3. The difference equation has the solution y_k = C_1 2^k + C_2 3^k.

Example 101 y_{k+2} − 8y_{k+1} + 16y_k = 2^k + 3.
The characteristic equation m² − 8m + 16 = 0 has the roots m_1 = m_2 = 4; therefore, the solution to the corresponding homogeneous equation is Y = (C_1 + C_2 k)4^k. To find a particular solution, we consider the trial solution y* = A·2^k + B. By substituting y* into the nonhomogeneous equation and performing the computations we obtain A = 1/4, B = 1/3. Thus, the solution to the nonhomogeneous equation is

y_k = (C_1 + C_2 k)4^k + (1/4)2^k + 1/3.

Example 102 y_{k+2} + y_{k+1} + y_k = 0.
The equation m² + m + 1 = 0 gives the roots m_1 = (−1 − i√3)/2, m_2 = (−1 + i√3)/2 or, written in the polar form, m_{1,2} = cos(2π/3) ± i sin(2π/3). Therefore, the given difference equation has the solution

y_k = C_1 cos((2π/3)k + C_2).

4.2.3 The General Case of Order n

In general, any difference equation of order n can be written as

y_{k+n} + a_1 y_{k+n−1} + ... + a_{n−1} y_{k+1} + a_n y_k = r_k.

The corresponding characteristic equation becomes

m^n + a_1 m^{n−1} + ... + a_{n−1} m + a_n = 0

and has exactly n roots, denoted m_1, ..., m_n. The general solution to the homogeneous equation is a sum of terms which are produced by the roots m_1, ..., m_n in the following way:
• a real simple root m generates the term C_1 m^k;
• a real root m repeated p times generates the term (C_1 + C_2 k + C_3 k² + ... + C_p k^{p−1})m^k;
• each pair of simple complex conjugate roots r(cos θ ± i sin θ) generates the term C_1 r^k cos(kθ + C_2);
• each pair of complex conjugate roots r(cos θ ± i sin θ) repeated p times generates the term

r^k [C_{1,1} cos(kθ + C_{1,2}) + C_{2,1} k cos(kθ + C_{2,2}) + ... + C_{p,1} k^{p−1} cos(kθ + C_{p,2})].

The sum of terms will contain n arbitrary constants. A particular solution y* to the nonhomogeneous equation can again be obtained by means of the method of undetermined coefficients (as was illustrated for the cases of first-order and second-order difference equations).
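Such solutions are straightforward to verify numerically. The following sketch (illustrative code, not part of the original text) checks that the general solution of Example 101, y_k = (C_1 + C_2 k)4^k + (1/4)2^k + 1/3, satisfies the recurrence for arbitrary constants:

```python
# Verify Example 101: y_{k+2} - 8 y_{k+1} + 16 y_k = 2^k + 3
# against y_k = (C1 + C2*k)*4^k + (1/4)*2^k + 1/3.

def solution(C1, C2, k):
    return (C1 + C2 * k) * 4 ** k + 0.25 * 2 ** k + 1 / 3

C1, C2 = 1.3, -0.7  # arbitrary constants
for k in range(8):
    lhs = (solution(C1, C2, k + 2)
           - 8 * solution(C1, C2, k + 1)
           + 16 * solution(C1, C2, k))
    assert abs(lhs - (2 ** k + 3)) < 1e-9
print("Example 101 solution verified")
```

The homogeneous part (C_1 + C_2 k)4^k drops out of the left-hand side, while the trial terms reproduce 2^k + 3, exactly as the method of undetermined coefficients predicts.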
Example 103 y_{k+4} − 5y_{k+3} + 9y_{k+2} − 7y_{k+1} + 2y_k = 0.
The characteristic equation m⁴ − 5m³ + 9m² − 7m + 2 = 0 has the root m_1 = 1, which is repeated 3 times, and the root m_2 = 2. The solution is

y_k = (C_1 + C_2 k + C_3 k²) + C_4 2^k.

Economics Application 15 (Simple and compound interest)
There are two basic methods for computing the interest earned in a period, for example, one year. Let S_0 denote an initial sum of money, and let S_k denote the sum accumulated after k periods.

S_0 earns simple interest at rate r if each period the interest equals a fraction r of S_0. In this case, S_k is computed in the following way:

S_{k+1} = S_k + rS_0.

This is a first-order difference equation which has the solution S_k = S_0(1 + kr).

S_0 earns compound interest at rate r if each period the interest equals a fraction r of the sum accumulated at the beginning of that period. In this case, the computation formula is

S_{k+1} = S_k + rS_k,   or   S_{k+1} = S_k(1 + r).

The solution to this homogeneous first-order difference equation is S_k = (1 + r)^k S_0.

Economics Application 16 (A dynamic model of economic growth)
Let Y_t, C_t, I_t denote national income, consumption, and investment in period t. A simple dynamic model of economic growth is described by the equations:

Y_t = C_t + I_t
C_t = c + mY_t
Y_{t+1} = Y_t + rI_t

where
• c is a positive constant;
• m is the marginal propensity to consume, 0 < m < 1;
• r is the growth factor.

From the above system of three equations, we can express the dynamics of national income in the form of a first-order difference equation:

Y_{t+1} − Y_t = rI_t = r(Y_t − C_t) = rY_t − r(c + mY_t),

or

Y_{t+1} − [1 + r(1 − m)]Y_t = −rc.
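The two interest formulas can be sketched as follows (illustrative code; the principal of 100, the 5% rate and the 10 periods are arbitrary choices, not from the text):

```python
# Simple vs. compound interest as solutions of the corresponding
# first-order difference equations.

def simple_interest(S0, r, k):
    """S_{k+1} = S_k + r*S0  =>  S_k = S0 * (1 + k*r)."""
    return S0 * (1 + k * r)

def compound_interest(S0, r, k):
    """S_{k+1} = S_k * (1 + r)  =>  S_k = S0 * (1 + r)**k."""
    return S0 * (1 + r) ** k

# After 10 periods at 5%, starting from 100:
print(simple_interest(100, 0.05, 10))              # 150.0
print(round(compound_interest(100, 0.05, 10), 2))  # 162.89
```

As expected, compounding earns more because each period's interest is itself earning interest in later periods.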
The solution to this equation is

Y_t = (Y_0 − c/(1 − m))[1 + r(1 − m)]^t + c/(1 − m).

The behavior of investment can be described in the following way:

I_{t+1} − I_t = (Y_{t+1} − C_{t+1}) − (Y_t − C_t)
            = (Y_{t+1} − Y_t) − (C_{t+1} − C_t)
            = rI_t − m(Y_{t+1} − Y_t)
            = rI_t − mrI_t,

or I_{t+1} = [1 + r(1 − m)]I_t. This homogeneous difference equation has the solution

I_t = [1 + r(1 − m)]^t I_0.

4.3 Introduction to Dynamic Optimization

4.3.1 The First-Order Conditions

A typical dynamic optimization problem takes the following form:

max_{c(t)} V = ∫_0^T f[k(t), c(t), t] dt
subject to k̇(t) = g[k(t), c(t), t],  k(0) = k_0 > 0,

where
• [0, T] is the horizon over which the problem is considered (T can be finite or infinite);
• V is the value of the objective function as seen from the initial moment t_0 = 0;
• k(t) is the state variable, and the first constraint (called the equation of motion) describes the evolution of this state variable over time (k̇ denotes dk/dt);
• c(t) is the control variable, and the objective function V is maximized with respect to this variable.

As in static optimization, in order to solve the optimization program we first need to find the first-order conditions:
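A quick numerical cross-check of the national-income solution (the parameter values below are illustrative, not from the text): iterating Y_{t+1} = [1 + r(1 − m)]Y_t − rc reproduces the closed form with steady state c/(1 − m).

```python
# Iterate the national-income difference equation and compare with
# Y_t = (Y_0 - c/(1-m)) * [1 + r(1-m)]^t + c/(1-m).

def simulate(Y0, c, m, r, T):
    Y = [Y0]
    for _ in range(T):
        Y.append((1 + r * (1 - m)) * Y[-1] - r * c)
    return Y

Y0, c, m, r, T = 120.0, 20.0, 0.8, 0.1, 25
path = simulate(Y0, c, m, r, T)
base = 1 + r * (1 - m)
Ystar = c / (1 - m)  # steady state: here 20 / 0.2 = 100
closed = [(Y0 - Ystar) * base ** t + Ystar for t in range(T + 1)]
print(max(abs(p - q) for p, q in zip(path, closed)) < 1e-9)  # True
```

Since the base 1 + r(1 − m) exceeds 1, income diverges from the steady state c/(1 − m) whenever Y_0 differs from it.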
Recipe 21 – How to Derive the First-Order Conditions:

Step 1. Construct the Hamiltonian function

H = f(k, c, t) + λ(t)g(k, c, t),

where λ(t) is a Lagrange multiplier.

Step 2. Take the derivative of the Hamiltonian with respect to the control variable and set it to 0:

∂H/∂c = ∂f/∂c + λ ∂g/∂c = 0.

Step 3. Take the derivative of the Hamiltonian with respect to the state variable and set it equal to the negative of the derivative of the Lagrange multiplier with respect to time:

∂H/∂k = ∂f/∂k + λ ∂g/∂k = −λ̇.

Step 4. Transversality condition:

Case 1: Finite horizons. λ(T)k(T) = 0.

Case 2: Infinite horizons with f of the form f[k(t), c(t), t] = e^{−ρt} u[k(t), c(t)]:

lim_{t→∞} [λ(t)k(t)] = 0.

Case 3: Infinite horizons with f in a form different from that specified in Case 2:

lim_{t→∞} [H(t)] = 0.

Recipe 22 – The F.O.C. with More than One State and/or Control Variable:
It may happen that the dynamic optimization problem contains more than one control variable and more than one state variable. In that case we need an equation of motion for each state variable. To write the first-order conditions, the algorithm specified above should be modified in the following way:

Step 1a. The Hamiltonian includes the right-hand side of each equation of motion times the corresponding multiplier.

Step 2a. Applies for each control variable.

Step 3a. Applies for each state variable.
4.3.2 Present-Value and Current-Value Hamiltonians

In economic problems, the objective function is usually of the form

f[k(t), c(t), t] = e^{−ρt} u[k(t), c(t)],

where ρ is a constant discount rate and e^{−ρt} is a discount factor. The Hamiltonian

H = e^{−ρt} u(k, c) + λ(t)g(k, c, t)

is called the present-value Hamiltonian (since it represents a value at the time t_0 = 0). Sometimes it is more convenient to work with the current-value Hamiltonian, which represents a value at time t:

Ĥ = He^{ρt} = u(k, c) + µ(t)g(k, c, t),   where µ(t) = λ(t)e^{ρt}.

The first-order conditions written in terms of the current-value Hamiltonian appear to be a slight modification of the present-value type:
• ∂Ĥ/∂c = 0;
• ∂Ĥ/∂k = ρµ − µ̇;
• µ(T)k(T) = 0 (the transversality condition in the case of finite horizons).

4.3.3 Dynamic Problems with Inequality Constraints

Let us add an inequality constraint to the dynamic optimization problem considered above:

max_{c(t)} V = ∫_0^T f[k(t), c(t), t] dt
subject to k̇(t) = g[k(t), c(t), t],  k(0) = k_0 > 0,
h(t, k, c) ≥ 0.

The following algorithm allows us to write the first-order conditions:

Recipe 23 – The F.O.C. with Inequality Constraints:

Step 1. Construct the Hamiltonian

H = f(k, c, t) + µ(t)g(k, c, t) + w(t)h(t, k, c),

where µ(t) and w(t) are Lagrange multipliers.

Step 2. ∂H/∂c = 0.

Step 3. ∂H/∂k = −µ̇.

Step 4. w(t) ≥ 0,   w(t)h(t, k, c) = 0.
Step 5. Transversality condition if T is finite: µ(T)k(T) = 0.

Economics Application 17 (The Ramsey Model)
This example illustrates how the tools of dynamic optimization can be applied to solve a model of economic growth, namely the Ramsey model (for more details on the economic illustrations presented in this section see, for instance, R.J. Barro and X. Sala-i-Martin, Economic Growth, McGraw-Hill, Inc., 1995). The model assumes that the economy consists of households and firms, but here we further assume, for simplicity, that households carry out production directly. Each household chooses, at each moment t, the level of consumption that maximizes the present value of lifetime utility

U = ∫_0^∞ u[A(t)c(t)] e^{nt} e^{−ρt} dt

subject to the dynamic budget constraint

k̇(t) = f[k(t)] − c(t) − (x + n + δ)k(t)

and the initial level of capital k(0) = k_0.

Throughout the example we use the following notation:
• A(t) is the level of technology;
• c denotes consumption per unit of effective labor (effective labor is defined as the product of the labor force L(t) and the level of technology A(t));
• k denotes the amount of capital per unit of effective labor;
• f(·) is the production function written in intensive form (f(k) = F(K/AL, 1));
• n is the rate of population growth;
• x is the growth rate of technological progress;
• δ is the depreciation rate;
• ρ is the rate at which households discount future utility.

The expression A(t)c(t) in the utility function represents consumption per person, and the term e^{nt} captures the effect of the growth in the family size at rate n. The dynamic constraint says that the change in the stock of capital (expressed in units of effective labor) is given by the difference between the level of output (in units of effective labor) and the sum of consumption (per unit of effective labor), depreciation, and the additional effect (x + n)k(t) from expressing all variables in terms of effective labor.

As can be observed, households face a dynamic optimization problem with c as a control variable and k as a state variable. The present-value Hamiltonian is

H = u(A(t)c(t)) e^{nt} e^{−ρt} + µ(t)[f(k(t)) − c(t) − (x + n + δ)k(t)]
and the first-order conditions state that

∂H/∂c = 0  ⟹  (∂u(Ac)/∂c) e^{nt} e^{−ρt} = µ,
∂H/∂k = µ[f′(k) − (x + n + δ)] = −µ̇.

In addition, the transversality condition requires

lim_{t→∞} [µ(t)k(t)] = 0.

Differentiating the first condition with respect to time gives

µ̇ = (∂u(Ac)/∂c) e^{nt} e^{−ρt} · [n − ρ + (1 − θ)x − θ ċ/c].

Eliminating µ̇ from the second equation in the F.O.C., we get

θ ċ/c − (1 − θ)x − n + ρ = f′(k) − x − n − δ,

which after rearranging terms gives an expression for the growth rate of consumption c. Assuming that the utility function takes the functional form u(Ac) = (Ac)^{1−θ}/(1 − θ), this leads us to an expression⁷ for the growth rate of c:

ċ/c = (1/θ)[f′(k) − δ − ρ − θx].

This equation, together with the dynamic budget constraint, forms a system of differential equations which completely describes the dynamics of the Ramsey model. The boundary conditions are given by the initial condition k_0 and the transversality condition.

⁷ To reconstruct these calculations, note first that ∂u(Ac)/∂c = A^{1−θ} c^{−θ} and

d/dt (∂u(Ac)/∂c) = (∂u(Ac)/∂c) · [(1 − θ)Ȧ/A − θ ċ/c],

assuming that Ȧ/A = x.

Economics Application 18 (A Model of Economic Growth with Physical and Human Capital)
This model illustrates the case when the optimization problem has two dynamic constraints and two control variables.
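As a small numerical illustration: if we assume the Cobb-Douglas intensive form f(k) = k^α (a functional form chosen here only for illustration, not specified in the text, and with arbitrary parameter values), the condition ċ = 0 in the Euler equation above requires f′(k*) = δ + ρ + θx, which can be solved for the steady-state capital stock k*:

```python
# Steady state of the Ramsey Euler equation c_dot/c = (1/theta)*(f'(k) - delta - rho - theta*x)
# under the assumed form f(k) = k**alpha, so f'(k) = alpha * k**(alpha - 1).

alpha, delta, rho, theta, x = 0.3, 0.05, 0.02, 2.0, 0.01

# f'(k*) = delta + rho + theta*x  =>  k* = (alpha / (delta + rho + theta*x))**(1/(1-alpha))
target = delta + rho + theta * x
k_star = (alpha / target) ** (1 / (1 - alpha))
f_prime = alpha * k_star ** (alpha - 1)
print(abs(f_prime - target) < 1e-9)  # True: consumption growth is zero at k*
```

For k below k* we have f′(k) > δ + ρ + θx, so consumption grows; above k* it falls, which is the usual saddle-path logic of the model.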
As was the case in the previous example, we assume that households perform production directly. Output is divided among consumption C, investment in physical capital I_K, and investment in human capital I_H:

Y = C + I_K + I_H.

The model also assumes that physical capital K and human capital H enter the production function Y in a Cobb-Douglas manner:

Y = AK^α H^{1−α},

where A denotes a constant level of technology and 0 ≤ α ≤ 1. The two capital stocks change according to the dynamic equations:

K̇ = I_K − δK,
Ḣ = I_H − δH,

where δ denotes the depreciation rate. The households' problem is to choose C, I_K, I_H such that they maximize the present value of lifetime utility

U = ∫_0^∞ u(C)e^{−ρt} dt

subject to the above four constraints. We again assume that u(C) = C^{1−θ}/(1 − θ).

The problem contains two control variables (C, I_H) and two state variables (K, H). By eliminating Y and I_K, the households' problem becomes:

Maximize U
subject to K̇ = AK^α H^{1−α} − C − δK − I_H,
Ḣ = I_H − δH.

The Hamiltonian associated with this dynamic problem is:

H = u(C) + µ_1(AK^α H^{1−α} − C − δK − I_H) + µ_2(I_H − δH)

and the first-order conditions require

∂H/∂C = u′(C)e^{−ρt} − µ_1 = 0,          (16)
∂H/∂I_H = −µ_1 + µ_2 = 0,          (17)
∂H/∂K = µ_1(αAK^{α−1}H^{1−α} − δ) = −µ̇_1,          (18)
∂H/∂H = µ_1(1 − α)AK^α H^{−α} − µ_2 δ = −µ̇_2.          (19)

Eliminating µ_1 from (16) and (18) we obtain a result for the growth rate of consumption:

Ċ/C = (1/θ)[αA(K/H)^{−(1−α)} − δ − ρ].
Since µ_1 = µ_2 (from (17)), conditions (18) and (19) imply

αA(K/H)^{−(1−α)} − δ = (1 − α)A(K/H)^α − δ,

which enables us to find the ratio:

K/H = α/(1 − α).

After substituting this ratio in the result for Ċ/C, it turns out that consumption grows at a constant rate given by:

Ċ/C = (1/θ)[Aα^α (1 − α)^{1−α} − δ − ρ].

The substitution of K/H into the production function implies that Y is a linear function of K:

Y = AK((1 − α)/α)^{1−α}.

(Therefore, such a model is also called an AK model.) The formulas for Y and the K/H ratio suggest that K, H and Y grow at a constant rate. Moreover, it can be shown that this rate is the same as the growth rate of C found above.

Economics Application 19 (Investment-Selling Decisions)
Finally, let us exemplify the case of a dynamic problem with inequality constraints. Assume that a firm produces, at each moment t, a good which can either be sold or reinvested to expand the productive capacity y(t) of the firm. The firm's problem is to choose at each moment the proportion u(t) of the output y(t) which should be reinvested such that it maximizes total sales over the period [0, T]. Expressed in mathematical terms, the firm's problem is:

max ∫_0^T [1 − u(t)]y(t) dt
subject to ẏ(t) = u(t)y(t),
y(0) = y_0 > 0,
0 ≤ u(t) ≤ 1,

where u is the control and y is the state variable. The Hamiltonian takes the form:

H = (1 − u)y + λuy + w_1(1 − u) + w_2 u.

The optimal solution satisfies the conditions:

H_u = (λ − 1)y + w_2 − w_1 = 0,
λ̇ = u − 1 − uλ,   λ(T) = 0,
w_1 ≥ 0,   w_1(1 − u) = 0,
w_2 ≥ 0,   w_2 u = 0.
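The substitution of K/H = α/(1 − α) into the growth-rate expression can be sanity-checked numerically (the values of A and α below are illustrative, not from the text):

```python
# Check that alpha*A*(K/H)**-(1-alpha) evaluated at K/H = alpha/(1-alpha)
# equals the constant A * alpha**alpha * (1-alpha)**(1-alpha).

A, alpha = 1.5, 0.4
ratio = alpha / (1 - alpha)
lhs = alpha * A * ratio ** -(1 - alpha)
rhs = A * alpha ** alpha * (1 - alpha) ** (1 - alpha)
print(abs(lhs - rhs) < 1e-12)  # True
```

This is just the algebraic identity α·(α/(1 − α))^{−(1−α)} = α^α(1 − α)^{1−α}, which is why the marginal products of the two capital stocks are equalized at that ratio.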
These conditions imply that either λ > 1, in which case

u = 1 and λ̇ = −λ,          (20)

or λ < 1, in which case

u = 0 and λ̇ = −1.          (21)

Thus, λ(t) is a decreasing function over [0, T]. Since λ(T) = 0 and λ(t) is also a continuous function, there exists t* such that (21) holds for any t in [t*, T]. Hence u(t) = 0, λ(t) = T − t, and y(t) = y(t*) for any t in [t*, T]. The switching time t* must satisfy λ(t*) = 1, which implies that t* = T − 1 if T ≥ 1. If T ≤ 1, we have t* = 0.

If T > 1, then for any t in [0, T − 1], (20) must hold; we obtain u(t) = 1, λ(t) = e^{T−t−1}, and y(t) = y_0 e^t. (Here we use the fact that y is continuous over [0, T] and, therefore, continuous at t* = T − 1.)

In conclusion, if T > 1 then it is optimal for the firm to invest all the output until t = T − 1 and to sell all the output after that date. If T < 1 then it is optimal to sell all the output.

Further Reading:
• Arrowsmith, D. and C.M. Place, Dynamical Systems.
• Barro, R.J. and X. Sala-i-Martin, Economic Growth.
• Chiang, A.C., Elements of Dynamic Optimization.
• Goldberg, S., Introduction to Difference Equations.
• Hartman, Ph., Ordinary Differential Equations.
• Kamien, M.I. and N.L. Schwartz, Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management.
• Lorenz, H.W., Nonlinear Dynamical Economics and Chaotic Motion.
• Pontryagin, L.S., Ordinary Differential Equations.
• Seierstad, A. and K. Sydsæter, Optimal Control Theory with Economic Applications.
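The bang-bang character of this solution can be illustrated by discretizing the problem (an Euler approximation written purely for illustration; T = 3 and y_0 = 1 are arbitrary choices, not from the text):

```python
# Total sales under y' = u*y, sales accruing at rate (1 - u)*y,
# for the switching policy "invest until T-1, then sell" versus
# the two corner policies.

def total_sales(policy, T, y0, n=100_000):
    """Euler discretization of the investment-selling problem."""
    dt = T / n
    y, sales = y0, 0.0
    for i in range(n):
        u = policy(i * dt)
        sales += (1 - u) * y * dt
        y += u * y * dt
    return sales

T, y0 = 3.0, 1.0
switching = total_sales(lambda t: 1.0 if t < T - 1 else 0.0, T, y0)
sell_always = total_sales(lambda t: 0.0, T, y0)      # sales ~ T = 3
invest_always = total_sales(lambda t: 1.0, T, y0)    # sales = 0
print(switching > sell_always and switching > invest_always)  # True
```

The switching policy lets capacity grow to roughly y_0 e^{T−1} before selling, so its total sales approach e² ≈ 7.39 here, against 3 for always selling and 0 for always investing.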
6 4. x a) given v = y z 3 4 6 ﬁnd the quadratic form v Av if A = 4 −2 0 . 2 1 x3 x2 3 2.5 5. 7. 3. 3x + 3y + 2z = 7. 6. Find the eigenvalues and corresponding eigenvectors of the following matrix: 1 0 1 0 1 1 . 5. 6 0 1 b) Given the quadratic form x2 + 2y 2 + 3z 2 + 4xy − 6yz + 8xz ﬁnd matrix A of v Av. 1 1 2 Is this matrix positive deﬁnite? 93 . 1.1 Exercises Solved Problems 1 x1 x2 1 det 1 x2 x2 . Find all solutions of the system of linear equations 2x + y − z = 3. Find the rank of the matrix A. 7x + 5y = 13. Evaluate the Vandermonde determinant A= 0 2 2 1 −3 −1 −2 0 −4 4 6 14 . 1 3 4 5 b = 1 . given 1 2 3 A = 2 −1 −1 . Find the unknown matrix X from the equation 1 2 2 5 X= 4 −6 2 1 . Solve the system Ax = b.
8. Show that the matrix

( a  d  e
  d  b  f
  e  f  c )

is positive definite if and only if a > 0, ab − d² > 0, and abc + 2def > af² + be² + cd².

9. Find the following limits:
a) lim_{x→−2} (x² + x − 2)/(x² + 2x)
b) lim_{x→1} (x − 1)/ln x
c) lim_{x→+∞} xⁿ·e^{−x}
d) lim_{x→0} x^x.

10. Show that the parabola y = x²/(2e) is tangent to the curve y = ln x and find the tangent point.

11. Find y′(x) if
a) y = ln (e^{4x}/(e^{4x} + 1))
b) y = ln(sin x + √(1 + sin²x))
c) y = x^{1/x}.

12. Find d²y/dx², given
a) x = t², y = t + t³
b) x = e^{2t}, y = e^{3t}.

13. Given f(x) = 1/(1 − x²), show that

f^{(n)}(0) = n! if n is even, and f^{(n)}(0) = 0 if n is odd.

14. Show that the function y(x) = x(ln x − 1) is a solution to the differential equation y″(x)·x·ln x = y′(x).

15.
Can we apply Rolle’s Theorem to the function f (x) = 1− x2 . 1]? Explain your answer and show it graphically. Evaluate the following indeﬁnite integrals: a) (x2 + 1)2 dx x3 √ ax (1 + a−x / x3 )dx ex x2 dx 3 b) c) d) x x2 + 1dx e) x ln(x − 1)dx 22. 18.√ 3 15. Find ∂z ∂z and . Find the extrema of the following functions: √ a) z = y x − y 2 − x + 6y b) z = ex/2 (x + y 2 ). Evaluate the following deﬁnite integrals: 2 a) 1 (x2 + 1/x4 )dx dx x−1 9 b) 4 √ (Hint: substitution x = t2 ) 95 . √ 17. ﬁnd y (x) and draw the graph of the function. where x ∈ [−1. Find the approximate value of 5 33. ∂y 2 20. 19. ) = xe−y/x is a solution to the diﬀerential equation x ∂2z ∂z ∂z +2 + ∂x∂y ∂x ∂y =y ∂2z . given ∂x ∂y a) z = x3 + 3x2 y − y 3 b) z = c) xy x−y z = xe−xy . 21. Show that the function z(x. y. Given y(x) = x2 x. 16.
Show that z = (x1/2 + y 1/2 )2 is a concave function. y is the function f (x. x2 + y 2 + z 2 = 1. x2 + y 2 + z 2 = 3 s. y 2 = 4x 24. s. 1/x2 + 1/y 2 + 2/z 2 = 4 u = (x + y)z. 31. x + y 2 − z 2 = 1. y > 0.t. z − x = 1. (Hint: Don’t forget to test the function on the boundary of the domain) 27. 25. x + y + z = 0 s. e) f) u = xyz. x = 0 b) 4y = x2 .23. Check whether the improper integral +∞ 0 dx (x − 3)2 converges or diverges. y = 8. b < 1. c) s. a + b ≤ 1 is concave for x. Find dy/dx. d) u = xyz. Check that the CobbDouglas production function z = xa y b for 0 < a.t. Given the equation 2 sin(x + 2y − 3z) = x + 2y − 3z. y − xz = 1 s.t. b) u = x + y.t. x + xy + y + z = 1 33. 1/x2 + 1/y 2 = 1/a2 s. Find the maximum and minimum values of the function f (x. x2 + y 2 + z 2 = 1 b) x2 + y 2 − 2z = 1. y) = x3 + xy + y 2 convex? 32. y) = x3 + y 3 − 3xy in the domain {0 ≤ x ≤ 2. 30. Find extrema of the functions a) z = 3x + 6y − x2 − xy − y 2 √ b) z = 3x2 − 2x y + y − 8x + 8 26. given a) x2 + y 2 − 4x + 6y = b) xe2y − ye2x = 0 28. 96 . u = x − 2y + z. 29. Find the conditional extrema of the following functions and classify them: a) u = x + y + z 2 .t.t. −1 ≤ y ≤ 2}. show that zx + zy = 1. Find the area bounded by the curves: a) y = x3 . For which values of x. Find dx/dz and dy/dz given a) x + y + z = 0.
97 . Solve the following diﬀerential equations: a) x(1 + y 2 ) + y(1 + x2 )y = 0 b) y = xy(y + 2) c) y + y = 2x + 1 d) y 2 + x2 y = xyy e) 2x3 y = y(2x2 − y 2 ) f) x2 y + xy + 1 = 0 g) y = x(y − x cos x) h) xy − 2y = 2x4 i) 2y − 5y + 2y = 0 j) y − 2y = 0 k) y − 3y + 2y = 0 l) y − 2y + y = 6xex m) √ y + 2y + y = 3e−x x + 1 36. it is known that when faced with the utility maximization problem max u(x. 35. ˙ Answers: 1. y) subject to x + y = 3. 2.34. y) = xα y 1−α . 2. Find the value of α. ˙ b) x = 2y. A consumer is known to have a CobbDouglas utility of the form u(x. (x2 − x1 )(x3 − x1 )(x3 − x2 ). However. where the parameter α is unknown. y = 2. ˙ c) y = −4x − y ˙ y = 2x ˙ y = x + 2y ˙ x = 2x + y. Solve the following systems of diﬀerential equations: a) x = 5x + 2y. the consumer chooses x = 1.
x2 /(x − y)2 c) e−xy(1 − xy). 27. 98 . const · (1. −x2 e−xy. as a parameter. y = 0. 1. 1. Eigenvalues are λ = 0. x2 b) zmin = −2/e at x = −2. a) 3x(x + 2y). 9.3. a) 2/(e4x + 1) 14. b) 1 2 4 2 −3 . The matrix is nonnegative deﬁnite. 4. y0 = 1/2. 11. say z. Taking one of the variables. 2 X= 16 −32 −6 13 . fmin = −1 at x = y = 1. 2) . The third line in the matrix is a linear combination of the ﬁrst two. 1 x = −1 . Corresponding eigenvectors are const · (1. 3 y= 5 − 7z . const · (1. 2. a) (t2 − 1)/4t3 17. a) zmax = 12 at x = y = 4 b) −y 2 /(x − y)2 . y = 4 26. a) x2 /2+2 ln x−1/2x2 +C b) ax / ln a−2/ x+C c) ex /3+C e) x2 ln x − 1/2 − (x2 /2 + x + ln x − 1)/2 + C b) 2(1 + ln 2) b) 16/3 d) (x2 + 1)3 /3+C 22. x0 = e. A= 2 4 −3 3 7. Diverges. y = 3 b) zmin = 0 at x = 2. √ 3 21. c) x1/x 1−ln x . fmax = 13 at x = 2. −1. For any y and x ≥ 1/12. 5. d) 1. −1.0125 18. y = −1. 3(x2 − y 2 ) 20. a) 2−x y+3 b) 2ye2x −e2y 2xe2y −e2x 31. 0) . a) 10/8 23. a) 12 24. 3. a) 3/2 b) 1 c) 0 √ 10. a) zmax = 9 at x = 0. b) cos x/ 1 + sin2 x b) 3/4et . therefore we have to solve the system of two equations with three variables. we ﬁnd the solution to be x= 2 + 5z . −1) . a) 3x2 − 2y 2 + z 2 + 8xy + 12xz. 3 6. 25.
−1). −a). . a) x = C1 et + C2 e3t . 1). (a. . 1. 1. 2. b) = i=1 (yi − (a + bxi ))2 by choice of two parameters a (the intercept) and b (the slope). . a). (1. where a = 1/ 6 f) no extrema 34. −1). (2a. −1. −2a). (−a. i = 1. y = −C1 et + C2 e3t x2 5. . a). 99 . −1) e) minimum at points (a. (1. 1. z = 1. −a. 1). y. Find maximum and minimum values of the function f (x. (1. 1). minimum at x = y = 1. Solve for a and b and show that the suﬃcient conditions are met. (−1. a. y = C1 e2t − C2 e−2t c) x = C1 et + C2 e3t . −1). −1. (−1. a. b) y = −x = 1/(y − x). (−2a.32. z = 0 = √ b) minimum at x = y = − 2a. note that u ≥ 0. (−1. −2a. maximum at √ points (−a.2 More Problems 1. yi ). y = −2C1 et − C2 e3t b) x = C1 e2t + C2 e−2t . a) (1 + x2 )(1 + y 2 ) = C b) y = 2Ce x2 1−Ce c) y = 2x − 1 + Ce−x y/x d) y = Ce√ e) x = ±y ln Cx f) xy = C − ln x g) y = x(C + sin x) h) y = Cx2 + x4 i) y = C1 e2x + C2 ex/2 j) y = C1 + C2 e2x k) y = ex (C1 + C2 x) + C3 e−2x l) y = (C1 + C2 x)ex + x3 ex m) y = e−x ( 4 (x + 1)5/2 + C1 + C2 x) 5 36. a) In the method of least squares of regression theory the curve y = a + bx is ﬁt to the data (xi . z = −1 d) minimum at points (−1. 1. b) by choice of a and b. y √ 1. a) x = (y − z)/(x − y). −1. Determine the necessary condition for minimizing S(a. maximum at points (1. 2a). 33. a) minimum at x = −1. −a). −1. 2a. α = 1/3 35. y = (z − x)/(x − y). 1).) 2. n by minimizing the sum of squared errors: n S(a. z) = x2 + y 2 + z 2 e−(x 2 +y 2 +z 2 ) . −a. maximum at x = y = 2a c) maximum at x = y = 1. (Hint: introduce the new variable u = x2 + y 2 + z 2 .
b) How can you interpret this problem geometrically? Illustrate your results graphically. Solve the integral equation x 0 x (x − t)y(t)dt = y(t)dt + 2x. w2 and ∂w1 < 0. b and show that the suﬃcient conditions are met. what is the meaning of the expression (x1 − x2 )2 + (y1 − y2 )2 ?) 5. u + 2v = 1. x2 ) − w1 x1 − w2 x2 . 0 (Hint: take the derivatives of both parts with respect to x and solve the diﬀerential equation). a) Show that Chebyshev’s diﬀerential equation (1 − x2 )y − xy + n2 y = 0 100 . ﬁnd a. 2. 1 c) Suppose that to the problem is added the linear constraint x2 = b. c) Compare your results. But rise to the challenge and ﬁgure out the answer yourself! Consider the problem max F (x1 . j = 1. Consider the following minimization problem: min(x − u)2 + (y − v)2 s. a) What are the ﬁrst order conditions for a maximum? ∂x b) Show that x1 can be solved as a function of w1 . ﬁnd the optimal a. b which solve the problem 1 min a. do you really need to use the Lagrangemultiplier method? (Hint: Given two points (x1 . y1 ) and (x2 .b) The continuous analog of the discrete case of the method of least squares can be set up as follows: Given a continuously diﬀerentiable function f ((x). where b is a given nonzero parameter.x2 ∂ f where the Hessian matrix of the second order partial derivatives ∂xi ∂xj . a) Find the solution. is assumed negative deﬁnite and w1 . ∂w1 with added constraint 4. 3. 6. This result is known in economic analysis as Le Chatelier Principle. Find the new equilibrium and show that: 2 ∂x1 ≤ ∂w1 without added constraint ∂x1 < 0. c) What is the solution to this minimization problem if the second constraint is replaced with the constraint u + 2v = 3? In this case. x1 . Again.t. x2 ) = f (x1 . xy = 1. y2 ) in the Euclidean space. i. w2 are given positive parameters.b 0 (f (x) − (a + bx))2 dx.
(Hint: in other words. you need to prove the existence of a function µ(x) such that d p(x) = µ(x)a(x). 3x + y + 2z = 2. can be reduced to the equation d dy p(x) dx dx + q(x)y(x) = 0. convex. strictly concave. Find the optimal x and y. c(x) are continuous functions and a(x) = 0.(where n is constant and x < 1) can be reduced to the linear diﬀerential equation with constant coeﬃcients d2 y + n2 y = 0 dt2 by substituting x = cos t. 2x + 3y + z = 3. For which values of x is the matrix nonsingular? √ d) The function f (x. At which point(s) does the function reach its maximum (minimum)? b) Evaluate the indeﬁnite integral x3 e−x dx. 2. strictly convex or neither. Draw the phase portrait of the system. y) = xy is to be maximized under the constraint x + y = 5. x ∈ R: A= 0 1 0 3 0 3 0 x 2 0 1 0 4 0 3 0 . y) = 4x2 −2xy+3y 2 . Find the equilibria of the system of diﬀerential equations x = sin x. 5. Using the inverse matrix. is concave. Easy problems: a) Check whether the function f (x. 101 . q(x) = µ(x)c(x) and dx (µ(x)a(x)) = µ(x)b(x).3 Sample Tests Sample Test – I 1. ˙ ˙ and classify them. deﬁned for all (x.) 7. 2 c) You are given the matrix A and the real number x. y = sin y. solve the system of linear equations x + 2y + 3z = 1. b(x). y) ∈ R2 . b) Prove that the linear diﬀerential equation of the second order a(x)y (x) + b(x)y (x) + c(x)y(x) = 0 where a(x).
where a is a positive real parameter. y) = 1 1 + q p x y deﬁned in the region x > 0. Solve the diﬀerence equation xt − 5xt−1 = 3. y > 0 a) convex.O. ﬁnd how the optimal values of x and y change if a increases by a very small number. Set up the Hamiltonian function associated with this problem and write down the ﬁrstorder conditions. subject to dx = au(t) − bx(t).a>1 1 + eax a) Write down the F. b and r are positive parameters. Sample Test – II 1. Find the eigenvalues and the corresponding eigenvectors of the following matrix: 2 2 2 −1 . x ≥ 0. of an extremum point. x2 + x − 2 4. 7. 2−x . Sketch the graph of the function y(x) = 5. 8. for the maximum. b) Taking as given that this problem has a solution.3. a) Use the Lagrangemultiplier method to write down the F. Be as speciﬁc as you can.O. b) concave? 102 . b) Solve the diﬀerential equation y (x) + 2y (x) − 3y(x) = x − 1. a) Solve the diﬀerential equation y (x) + 2y (x) − 3y(x) = 0. Easy problems: a) For what values of p and q is the function f (x. Given the function 1 − e−ax . b) Determine whether an extremum point is maximum or minimum. Consider the optimal control problem +∞ max I(u(t)) = u(t) 0 (x2 (t) − u2 (t))e−rt dt. y ≥ 0.C. y) = ax + y is to be maximized under the constraints x2 + ay 2 ≤ 1. dt where a. da da 6.C. The function f (x. dx dx c) Find and determine the sign of . 9.
If c is such that these conditions describe the maximum. n by minimizing the sum of squared errors n S(a. 1. 2. . c > 1. 5. it is known that when faced with the utility maximization problem max u(x. . 6. Find the value of α. Draw the graph of this curve and ﬁnd the equation of the tangent line at t = π/2. 7. Use the Lagrange multiplier method to write the ﬁrst order conditions for the maximum of the function f (x.b) Find the area between the curves y = x2 and y = √ x. Determine the necessary condition for minimizing S(a. where the parameter α is unknown. d) A consumer is known to have a CobbDouglas utility of the form u(x. x1 − 2x2 = 5. show what happens with x and y if c increases by a very small number. 2π]. t = 0. . 2. x1 ≥ 0. 9. x2 ≥ 0. t ∈ [0. where c is a real parameter. y = a(1 − cos t). Solve the diﬀerence equation xt+1 − xt = bt . Solve for a and b and show that the suﬃcient conditions are met. Find the unknown matrix X from the equation 1 1 −1 1 −1 3 1 0 = 4 3 2 . c) Apply the characteristic root test to check whether the quadratic form Q = −2x2 − 2y 2 − z 2 + 4xy is positive (negative) deﬁnite. 2 x1 . b) Solve the diﬀerential equation y (x) − 2y (x) + y(x) = xex . b) = i=1 (yi − (a + bxi ))2 by the choice of two parameters. . the consumer chooses x = 1. 103 . subject to x + cy = 2. 4. i = 1.x2 subject to 2x1 + 5x2 ≥ 20. Find local extrema of the function z = (x2 + y 2 )e−(x 2 +y 2 ) and classify them. a (the intercept) and b (the slope). 8. yi ). y) = xα y 1−α . In the method of least squares of regression theory the curve y = a + bx is ﬁt to the data (xi . a) Solve the diﬀerential equation y (x) − 2y (x) + y(x) = 0. a is a real parameter. . y) subject to x + y = 3. y) = ln x + ln y − x − y. A curve in the (x − y) plane is given parametrically as x = a(t − sin t). . However. X 2 1 −1 1 1 −2 5 3. 2. b) by the choice of a and b. . . Write down the KuhnTucker conditions for the following problem: max 3x1 x2 − x3 . y = 2.
Sample Test – III

1. You are given the function $f(x) = xe^{1-x} + (1 - x)e^x$.
a) Find $\lim_{x \to -\infty} f(x)$ and $\lim_{x \to +\infty} f(x)$.
b) Write down the first-order necessary condition for a maximum.
c) Using this function, prove that there exists a real number $c$ in the interval $(0, 1)$ such that $1 - 2c = \ln c - \ln(1 - c)$.

2. Find the area bounded by the curves $y^2 = (4 - x)^3$, $x = 0$, $y = 0$.

3. Solve the system of linear equations
$$x + y + cz = 1$$
$$x + cy + z = 1$$
$$cx + y + z = 1$$
where $c$ is a real parameter. Consider all possible cases!

4. Given the matrix
$$A = \begin{pmatrix} 2 & 1 & 1 \\ 0 & 3 & \alpha \\ 2 & 0 & 1 \end{pmatrix},$$
a) find all values of $\alpha$ for which this matrix has three distinct real eigenvalues;
b) let $\alpha = 1$. If I tell you that one eigenvalue is $\lambda_1 = 2$, find the two other eigenvalues of $A$ and the eigenvector corresponding to $\lambda_1$.

5. Let $z = f(x, y)$ be a homogeneous function of degree $n$, i.e. $f(tx, ty) = t^n f(x, y)$ for any $t$. Prove that
$$x \frac{\partial z}{\partial x} + y \frac{\partial z}{\partial y} = nz.$$

6. Find local extrema of the function $u(x, y, z) = 2x^2 - xy + 2xz - y + y^3 + z^2$ and classify them.

7. Answer the following questions true, false or it depends. Provide a complete explanation for all your answers.
a) If $A$ is an $n \times m$ matrix and $B$ is an $m \times n$ matrix, then $\mathrm{trace}(AB) = \mathrm{trace}(BA)$.
b) If $x^*$ is a local maximum of $f(x)$, then $f''(x^*) < 0$.
c) If $A$ is an $n \times n$ matrix and $\mathrm{rank}(A) < n$, then the system of linear equations $Ax = b$ has no solutions.
d) If $f(x) = -f(-x)$ for all $x$, then for any constant $a$, $\int_{-a}^{a} f(x)\,dx = 0$.
e) If $f(x)$ and $g(x)$ are continuously differentiable at $x = a$, then $\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$.
8. Show that the implicit function $z = z(x, y)$, defined as $x - mz = \varphi(y - nz)$ (where $m$, $n$ are parameters and $\varphi$ is an arbitrary differentiable function), solves the differential equation
$$m \frac{\partial z}{\partial x} + n \frac{\partial z}{\partial y} = 1.$$

Sample Test – IV

1. Easy problems.
a) Find the maxima and minima of the function $x^2 + y^2$ subject to $x^2 + 4y^2 = 1$.
b) (Back to microeconomics.) Demand for some good X is described by the linear demand function $D = 6 - 4x$. The industry supply of this good equals $S = \sqrt{x + 1}$. Find the value of the consumer and producer surpluses in the equilibrium. (Hint: The surplus is defined as the area in the positive quadrant bounded by the demand and supply curves.)
c) Let $f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n)$ be convex functions. Prove that the function
$$F(x_1, \dots, x_n) = \sum_{i=1}^{m} \alpha_i f_i(x_1, \dots, x_n)$$
is convex if $\alpha_i \ge 0$, $i = 1, \dots, m$.

2. Find the absolute minimum and maximum of the function $z(x, y) = x^2 + y^2 - 2x - 4y + 1$ in the domain defined by $x \ge 0$, $y \ge 0$, $5x + 3y \le 15$ (a triangle with vertices at $(0,0)$, $(3,0)$ and $(0,5)$). (Hint: Find local extrema, then check the function on the boundary and compare.)

3. Consider the following optimization problem: extremize $f(x, y) = \frac{1}{2}x^2 + e^y$ subject to $x + y = a$, where $a$ is a real parameter. Write down the first-order necessary conditions and check whether the extremum point is a minimum or a maximum. What happens with the optimal values of $x$ and $y$ if $a$ increases by a very small number? (Hint: Don't waste your time trying to solve the first-order conditions.)

4. Answer the following questions true, false or it depends. Provide a complete explanation for all your answers.
a) If matrices $A$, $B$ and $C$ are such that $AB = AC$, then $B = C$.
b) $\det(A + B) = \det A + \det B$.
c) A concave function $f(x)$ defined in the closed interval $[a, b]$ reaches its maximum at an interior point of $(a, b)$.
d) If $f(x)$ is a continuous function in $[a, b]$, then the condition $\int_a^b f^2(x)\,dx = 0$ implies $f(x) \equiv 0$ in $[a, b]$.
e) If a matrix $A$ is triangular, then the elements on its principal diagonal are the eigenvalues of $A$.
5. Solve the differential equation $y''(x) - 6y'(x) + 9y(x) = xe^{ax}$, where $a$ is a real parameter. Consider all possible cases!

6. Find the equilibria of the following system of nonlinear differential equations and classify them:
$$\dot{x} = \ln(1 - y + y^2), \qquad \dot{y} = 3 - \sqrt{x^2 + 8y}.$$

7. Solve the difference equation $3y_{k+2} + 2y_{k+1} - y_k = 2^k + k$.

8. a) Solve the system of linear differential equations $\dot{x} = Ax$, where
$$A = \begin{pmatrix} 2 & -1 & 1 \\ 1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}, \qquad x = \begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix}.$$
b) Solve the system of differential equations $\dot{x} = Ax + b$, where $A$ is as in part a) and $b = (1, 2, 3)^T$.
c) What is the solution to the problem in part b) if you know that $x_1(0) = x_2(0) = x_3(0) = 0$?
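Two of the problems above admit an easy numerical verification. The sketch below is an illustrative aside, not part of the original test (NumPy assumed; the coefficient matrix of problem 8 is taken as printed above). It brute-forces the extrema of problem 2 on a grid over the triangle, and confirms that the matrix in problem 8 has eigenvalues 1, 2 and 3, so the general solution of $\dot{x} = Ax$ combines $e^t$, $e^{2t}$ and $e^{3t}$ terms.

```python
# Numerical checks for two problems from Sample Test IV.
import numpy as np

# Problem 2: brute-force z(x, y) = x^2 + y^2 - 2x - 4y + 1 over a fine
# grid on the triangle x >= 0, y >= 0, 5x + 3y <= 15.
x = np.linspace(0.0, 3.0, 301)
y = np.linspace(0.0, 5.0, 501)
X, Y = np.meshgrid(x, y)
inside = 5 * X + 3 * Y <= 15 + 1e-9
Z = X**2 + Y**2 - 2 * X - 4 * Y + 1
zmin, zmax = Z[inside].min(), Z[inside].max()
print(zmin, zmax)               # approximately -4 (at (1, 2)) and 6 (at (0, 5))
assert abs(zmin + 4.0) < 1e-3   # interior critical point (1, 2)
assert abs(zmax - 6.0) < 1e-9   # z is convex, so the maximum sits at a vertex

# Problem 8: the eigenvalues of A determine the general solution of x' = Ax.
A = np.array([[2.0, -1.0, 1.0],
              [1.0, 2.0, -1.0],
              [1.0, -1.0, 2.0]])
w = np.sort(np.linalg.eig(A)[0].real)
print(w)                        # approximately [1. 2. 3.]
assert np.allclose(w, [1.0, 2.0, 3.0])

# Part b): a constant particular solution is x_p = -A^(-1) b.
b = np.array([1.0, 2.0, 3.0])
x_p = -np.linalg.solve(A, b)
assert np.allclose(A @ x_p + b, 0.0)
```

Only the check for problem 2 depends on the grid spacing; tighten it if a sharper bound is wanted.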