Contents
Introduction 7
I Elements of linear algebra 11
1 Linear spaces 13
1.1 The Euclidean space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 Euclidean n-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3 Quadratic forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2 Linear Algebra 27
2.1 Systems of linear equations. Gauss-Jordan elimination method . . . . . . 27
2.2 Linear programming problems (LPP) . . . . . . . . . . . . . . . . . . . 43
II Calculus 111
3 One variable calculus 113
3.1 Diﬀerential calculus of one variable . . . . . . . . . . . . . . . . . . . . 113
3.1.1 Limits and continuity . . . . . . . . . . . . . . . . . . . . . . . 114
3.1.2 Rates of change and derivatives . . . . . . . . . . . . . . . . . . 127
3.1.3 Linear approximation and diﬀerentials . . . . . . . . . . . . . . 139
3.1.4 Extreme values of a real valued function . . . . . . . . . . . . . 141
3.1.5 Applications to economics . . . . . . . . . . . . . . . . . . . . . 150
3.2 Integral calculus of one variable . . . . . . . . . . . . . . . . . . . . . . 157
3.2.1 Antiderivatives and techniques of integration . . . . . . . . . . 157
3.2.2 The deﬁnite integral . . . . . . . . . . . . . . . . . . . . . . . . 164
3.3 Improper integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.3.1 Improper integrals . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.3.2 Euler’s integrals . . . . . . . . . . . . . . . . . . . . . . . . . . 176
4 Diﬀerential calculus of several variables 187
4.1 Real functions of several variables. Limits and continuity . . . . . . . 187
4.1.1 Real functions of several variables . . . . . . . . . . . . . . . . 187
4.1.2 Limits. Continuity . . . . . . . . . . . . . . . . . . . . . . . . . 192
4.2 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
4.3 Higher order partial derivatives . . . . . . . . . . . . . . . . . . . . . . 212
4.4 Diﬀerentiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
4.4.1 Diﬀerentiability. The total diﬀerential . . . . . . . . . . . . . . 218
4.4.2 Higher order diﬀerentials . . . . . . . . . . . . . . . . . . . . . 228
4.4.3 Taylor formula in R^n . . . . . . . . . . . . . . . . . . . . . . 231
4.5 Extrema of functions of several variables . . . . . . . . . . . . . . . 235
4.6 Constrained extrema . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
4.7 Applications to economics . . . . . . . . . . . . . . . . . . . . . . . . . 261
4.7.1 The method of least squares . . . . . . . . . . . . . . . . . . . . 261
4.7.2 Inventory control. The economic order quantity model . . . . . . . . . 265
III Probabilities 269
A short history of probabilities . . . . . . . . . . . . . . . . . . . . . . . . . 271
5 Counting techniques. Tree diagrams 273
5.1 The addition rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
5.2 Tree diagrams and the multiplication principle . . . . . . . . . . . . . 277
5.3 Permutations and combinations . . . . . . . . . . . . . . . . . . . . . . 281
6 Basic probability concepts 289
6.1 Sample space. Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
6.2 Conditional probability . . . . . . . . . . . . . . . . . . . . . . . . . . 302
6.3 The total probability formula. Bayes’ formula . . . . . . . . . . . . . . 306
6.4 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
6.5 Classical probabilistic models. Urn models . . . . . . . . . . . . . . . . 313
7 Random variables 327
7.1 Discrete random variables . . . . . . . . . . . . . . . . . . . . . . . . . 328
7.2 The distribution function of a random variable . . . . . . . . . . . . . 332
7.3 Continuous random variables . . . . . . . . . . . . . . . . . . . . . . . 335
7.4 Numerical characteristics of random variables . . . . . . . . . . . . . . 336
7.5 Special random variables . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Appendix A 380
Appendix B 384
Bibliography 389
Introduction
Why Maths?
Because Mathematics is the universal language of the sciences. When we speak mathematics, all barriers, linguistic or cultural, are pushed away.
Why Maths in Economics?
Mathematics plays an important role in Economics. This role has been significant for over a century and has gained real momentum during the last decades. Immanuel Kant (1724-1804) said: "A science contains as much science as it contains Mathematics".
One of the first economists who wanted to make economics more scientific by applying mathematical rigor to it was Alfred Marshall (1842-1924, English economist). He did not want an excess of mathematics to make economic texts harder to understand. Accordingly, Marshall put the mathematical content in the footnotes and appendices of his economics books. In 1906 he wrote:
”I had a growing feeling in the later years of my work at the subject that a good
mathematical theorem dealing with economic hypotheses was very unlikely to be a
good economics: and I went more and more on the rules
(1) Use mathematics as a short hand language, rather than an engine of inquiry;
(2) Keep to them till you have done;
(3) Translate into English;
(4) Then illustrate by examples that are important in real life;
(5) Burn the mathematics;
(6) If you can’t succeed in (4), burn (3).
That last I did often.”
The use of mathematics in economics offers several advantages:
- the language used is more precise and concise;
- it allows us to treat the general case;
- we have at our disposal a great number of mathematical results.
On his blog, Greg Mankiw (Professor of Economics at Harvard University) wrote the following answer to the question:
"Why aspiring economists need Math?"
"A student who wants to pursue a career in policy-related economics is advised to go to the best graduate school he or she can get into. The best graduate schools will
expect to see a lot of math on your undergraduate transcript, so you need to take it.
But will you use a lot of diﬀerential equations and real analysis once you land that
dream job in a policy organization? No, you won’t.
That raises the question: why do we academics want students who have taken a lot of math? There are several reasons:
1. Every economist needs to have a solid foundation in the basics of economic theory and econometrics, even if you are not going to be either a theorist or an econometrician. You cannot get this solid foundation without understanding the language of mathematics that these fields use.
2. Occasionally, you will need math in your job. In particular, even as a policy
economist, you need to be able to read the academic literature to ﬁgure out what
research ideas have policy relevance. That literature uses a lot of math, so you will
need to be equipped with mathematical tools to read it intelligently.
3. Math is good training for the mind. It makes you a more rigorous thinker.
4. Your math courses are one long IQ test. We use math courses to ﬁgure out who
is really smart.
5. Economics graduate programs are more oriented to training students for academic research than for policy jobs. Although many econ PhDs go on to policy work,
all of us teaching in graduate programs are, by deﬁnition, academics. Some academics
take a few years oﬀ to experience the policy world, as I did not long ago, but many
academics have no idea what that world is like. When we enter the classroom, we
teach what we know. (I am not claiming this is optimal, just reality.) So math plays
a larger role in graduate classes than it does in many jobs that PhD economists hold.
Is it possible that admissions committees for econ PhD programs are excessively fond of mathematics on student transcripts? Perhaps. That is something I might
argue with my colleagues about if I were ever put on the admissions committee. But
a student cannot change that. The fact is, if you are thinking about a PhD program
in economics, you are advised to take math courses until it hurts."
At present, the Maths teachers' mission is to make the best advertisement for Maths, to make students aware of the importance of Maths, and to "dress up" Maths classes in vivid colours.
The purpose of this book (covering three parts (Linear Algebra, Calculus and Probabilities) divided into seven chapters) is to give students of economics the possibility to acquire the basic knowledge in Maths which they will have to work with in the future, in order to be able to use it with complex economic models belonging to the real world. Our book attempts to develop the student's intuition concerning the ways of working with mathematical techniques.
This book aims to be a tool for using Maths in order to understand the structure of "economics". The text offers an introduction to the close relationship between Maths and Economics.
Taking into account the applicative content of this book, we will not always present complete proofs of all theoretical statements, but we attach importance to examples and economic applications. As Mathematics is a very old science, we cannot possibly be entirely original, but the structure and concepts have been thought out over several years of work together with students in economics. So that the content could come closer to the needs of economists, we have introduced several examples from the economic field.
This book is especially meant for first-year students in "Economic Sciences and Business Administration". We also address all those who need to refresh their knowledge of the mathematics used in economics, or who simply wish to keep their professional knowledge up to date.
Part I
Elements of linear algebra
Chapter 1
Linear spaces
1.1 The Euclidean space
One of the main uses of mathematics in economic theory is to construct the appropriate geometric and analytic generalizations of the two- and three-dimensional geometric models which are the mainstay of undergraduate economics courses. In this section we will study how to generalize the notions of points, lines, planes, distances and angles to n-dimensional Euclidean spaces.
The set of real numbers, denoted by R, plays a dominant role in mathematics. The
geometric representation of R is a straight line. A point, called the origin, is chosen
to represent 0 and another point, usually to the right of 0, to represent 1. There is
a natural correspondence between the points on the line and real numbers, i.e. each
point will represent a unique real number and each real number will be represented
by a unique point. For this reason we refer to R as the real line and use the words
point and number interchangeably.
[Figure: The real line R, with the origin O representing 0 and a point P representing 1]
We assume that the reader is familiar with the Cartesian plane. Each point P represents an ordered pair of real numbers (a, b) ∈ R^2 and each element of R^2 can be represented by a point in the Cartesian plane. The vertical line through the point P meets the horizontal axis (the x-axis) at a, which is called the abscissa of P. The horizontal line through the point P meets the vertical axis (the y-axis) at b, which is called the ordinate of P. For this reason we refer to

R^2 = {(a, b) | a ∈ R, b ∈ R}

as the plane and use the words plane and ordered pair of real numbers interchangeably.
[Figure: The Cartesian plane R^2, showing a point P with abscissa a and ordinate b]
If P(x_1, y_1) and Q(x_2, y_2) are two points in R^2 then the distance between them can be determined by using the Pythagorean theorem in a right triangle. We shall denote the distance between P and Q by d(P, Q).
[Figure: The line PQ, with the right triangle whose legs have lengths x_2 − x_1 and y_2 − y_1]
d(P, Q) = √((x_2 − x_1)² + (y_2 − y_1)²).
It is well known that two different points determine exactly one line.
Next, we will present different forms of the equation of a line.
The inclination θ of a line l is the angle that l makes with the horizontal axis: θ is the smallest positive angle measured counterclockwise from the positive end of the x-axis to the line l. The range of θ is given by 0 ≤ θ < 180°.
The slope m of the line is defined as the tangent of its angle θ of inclination:
m = tan θ.
In particular, the slope of the line passing through two points P(x_1, y_1) and Q(x_2, y_2) is given by:
m = tan θ = (y_2 − y_1) / (x_2 − x_1).
Remark. a) The slope of a horizontal line, i.e. when y_2 = y_1, is zero.
b) The slope of a vertical line, i.e. when x_2 = x_1, is not defined.
c) Two distinct lines l_1 and l_2 are parallel if and only if their slopes m_1 and m_2 are equal, i.e. m_1 = m_2.
d) Two distinct lines l_1 and l_2 are perpendicular if and only if the slope of one is the negative reciprocal of the other:
m_1 = −1/m_2, or m_1 m_2 = −1.
Linear equations
Every line l in the Cartesian plane R^2 can be represented by a linear equation of the form
(l) ax + by + c = 0,
where a and b are not both zero, i.e. each point on l is a solution of (l) and each solution of (l) is a point on l.
Horizontal and vertical lines
The equation of a horizontal line, i.e. a line parallel to the x-axis, is of the form y = k, where k is the ordinate of the point at which the line intersects the y-axis. In particular, the equation of the x-axis is y = 0.
The equation of a vertical line, i.e. a line parallel to the y-axis, is of the form x = k, where k is the abscissa of the point at which the line intersects the x-axis. In particular, the equation of the y-axis is x = 0.
Point-slope form
A line is completely determined if we know its direction (its slope) and a point on the line.
The equation of the line having slope m and passing through the point (x_1, y_1) is
(l) y − y_1 = m(x − x_1).
Two-point form
Let P(x_1, y_1) and Q(x_2, y_2) be two different points in the Cartesian plane. The equation of the line which passes through these two points is:
(l) (y − y_1) / (y_2 − y_1) = (x − x_1) / (x_2 − x_1).
The previous equation was obtained by replacing m = (y_2 − y_1)/(x_2 − x_1) in the point-slope form of the equation.
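The point-slope and two-point forms can be turned into a small computation (a Python sketch with names of our choosing; it assumes x_1 ≠ x_2, i.e. a non-vertical line):

```python
def line_through(p, q):
    """Slope-intercept data (m, b), with y = m*x + b, for the line through P and Q."""
    x1, y1 = p
    x2, y2 = q
    m = (y2 - y1) / (x2 - x1)  # slope of the line through two points
    b = y1 - m * x1            # from the point-slope form y - y1 = m(x - x1)
    return m, b

m, b = line_through((1, 1), (3, 5))
print(m, b)  # 2.0 -1.0, i.e. the line y = 2x - 1
```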
1.2 Euclidean n-space
We can interpret the ordered pairs of R^2 not only as locations but also as displacements. We represent these displacements as vectors in R^2. The displacement (a, b) means: move a units to the right and b units up from the current location. The tail of the arrow marks the initial location; the head marks the location after the displacement is made.
To develop a geometric intuition for vector addition we can think in the following way. If u = (a, b) and v = (c, d) are two vectors in R^2, then u + v will represent a displacement of a + c units to the right and b + d units up.
[Figure: Addition of two vectors u and v by the parallelogram rule]
We can use the parallelogram, as in the above figure, to draw u + v, keeping the tails of u and v at the same point.
It is generally not possible to multiply two vectors in a way that nicely generalizes the multiplication of real numbers. For instance, coordinatewise multiplication does not satisfy the basic properties of the multiplication of real numbers. On the other hand, geometrically, scalar multiplication of a vector v by a nonnegative (negative) scalar corresponds to stretching or shrinking v without (with) changing its direction.
[Figure: Multiplication of a vector by a real scalar: u = (a, b) and 2u = (2a, 2b)]
We will now generalize the previous discussion to the general case.
Let n ≥ 1 be an integer. By definition, R^n is the set of ordered n-tuples x = (x_1, x_2, ..., x_n) of real numbers, i.e.

R^n = {x = (x_1, x_2, ..., x_n) | x_1, x_2, ..., x_n ∈ R}.

The elements of R^n are called vectors and the numbers x_i, i = 1, ..., n, are called the coordinates of x (x_i is the i-th coordinate of x).
The two fundamental operations (which generalize the addition of two vectors and the multiplication of a vector by a scalar) are:
1. addition of two vectors: if x, y ∈ R^n then
x + y := (x_1 + y_1, x_2 + y_2, ..., x_n + y_n) ∈ R^n;
2. scalar multiplication: if x ∈ R^n and α ∈ R then
αx := (αx_1, αx_2, ..., αx_n) ∈ R^n.
Next, we will present an axiomatic concept based on the simplest properties of the previous operations.
Definition. A real vector space is a set V ≠ ∅ with an operation + : V × V → V called vector addition and an operation · : R × V → V called scalar multiplication with the following properties:
(i) (x + y) + z = x + (y + z), ∀ x, y, z ∈ V
(ii) there is a vector θ ∈ V (called the null vector) that is an identity element for addition:
x + θ = θ + x = x, ∀ x ∈ V
(iii) for any x ∈ V there is −x ∈ V such that
x + (−x) = (−x) + x = θ
(iv) x + y = y + x, ∀ x, y ∈ V
(v) 1 · x = x, ∀ x ∈ V
(vi) ∀ x, y ∈ V, ∀ α, β ∈ R:
(a) (αβ)x = α(βx)
(b) (α + β)x = αx + βx
(c) α(x + y) = αx + αy.
Remark. (V, +) is a commutative group.
Example. R^n is a vector space.
The proof of the previous example follows immediately by using the definitions of the operations in R^n.
We next show that any finite dimensional vector space is "like" R^n.
Let V be a vector space.
A linear subspace of V is a subset W ⊂ V that is itself a vector space, with vector addition and scalar multiplication defined by restriction of the given operations on V.
Remark. If W ⊆ V, then W is a linear subspace if and only if the following two conditions are fulfilled:
a) W ≠ ∅
b) ∀ α ∈ R, ∀ u, v ∈ W: u + αv ∈ W.
If α_1, α_2, ..., α_n ∈ R and v_1, v_2, ..., v_n ∈ V, then the sum α_1 v_1 + ... + α_n v_n ∈ V is called a linear combination of the vectors v_1, v_2, ..., v_n.
The span of a set S ⊂ V is the set of all linear combinations α_1 v_1 + ... + α_n v_n, where v_1, ..., v_n ∈ S and α_1, α_2, ..., α_n ∈ R:

span S = {α_1 v_1 + ... + α_n v_n | n ∈ N*, α_1, ..., α_n ∈ R, v_1, ..., v_n ∈ S}.
A set of vectors {v_1, ..., v_n} is called linearly independent if the vector equation
α_1 v_1 + α_2 v_2 + ... + α_n v_n = θ
has only the trivial solution α_1 = α_2 = ... = α_n = 0.
A set of vectors {v_1, ..., v_n} is called linearly dependent if the vector equation α_1 v_1 + ... + α_n v_n = θ has a nontrivial solution, that is, there are α_1, α_2, ..., α_n ∈ R, not all zero, such that α_1 v_1 + ... + α_n v_n = θ.
In other words, a set of vectors in a vector space is linearly dependent if and only if one vector can be written as a linear combination of the others.
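In R^n, linear independence of a finite set of vectors can be tested by row reduction: the vectors are independent exactly when the matrix having them as rows has full rank. A self-contained Python sketch (the helper names are ours):

```python
def rank(rows, eps=1e-12):
    """Rank of a matrix (given as a list of rows) via Gaussian elimination."""
    a = [row[:] for row in rows]
    r = 0
    for c in range(len(a[0])):
        piv = next((i for i in range(r, len(a)) if abs(a[i][c]) > eps), None)
        if piv is None:
            continue  # no pivot in this column
        a[r], a[piv] = a[piv], a[r]
        for i in range(len(a)):
            if i != r and abs(a[i][c]) > eps:
                f = a[i][c] / a[r][c]
                a[i] = [u - f * v for u, v in zip(a[i], a[r])]
        r += 1
    return r

def independent(vectors):
    """True iff the given vectors (as rows) are linearly independent."""
    return rank(vectors) == len(vectors)

print(independent([[1, 0, 0], [0, 1, 0], [1, 1, 0]]))  # False: v3 = v1 + v2
print(independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # True
```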
A basis for V is a subset B ⊂ V that spans V (span B = V) and is minimal with respect to this property, in the sense that no proper subset of B spans V.
A basis is linearly independent. Conversely, a linearly independent set that spans V is a basis for V.
Remark. If V has a finite basis, then all bases have the same number of elements. In this case we say that V is finite dimensional, and the common number of elements of the bases of V is called the dimension of V.
If B = {v_1, ..., v_n} is a basis of V then each vector v ∈ V can be written as a linear combination
v = α_1 v_1 + α_2 v_2 + ... + α_n v_n
in exactly one way.
Example. a) The set B = {e_1, ..., e_n}, where
e_i = (0, ..., 0, 1, 0, ..., 0)
(1 in the i-th coordinate), i = 1, ..., n, is a basis of the vector space R^n, called the canonical basis.
b) dim R^n = n.
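The unique representation in the canonical basis is easy to verify directly: the coordinates of a vector are exactly the coefficients of the combination x_1 e_1 + ... + x_n e_n. A Python sketch (function names are ours):

```python
def canonical_basis(n):
    """The vectors e_1, ..., e_n: e_i has 1 in the i-th coordinate, 0 elsewhere."""
    return [[1 if j == i else 0 for j in range(n)] for i in range(n)]

def combine(coeffs, vectors):
    """The linear combination alpha_1 v_1 + ... + alpha_n v_n."""
    dim = len(vectors[0])
    return [sum(a * v[j] for a, v in zip(coeffs, vectors)) for j in range(dim)]

x = [7, -2, 5]
print(combine(x, canonical_basis(3)))  # [7, -2, 5]: the coordinates come back
```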
Normed spaces
The concept of a norm is an abstract generalization of the length of a vector.
Definition. Let V be a vector space. A function ‖·‖ : V → R, x ↦ ‖x‖, is called a norm if it satisfies the following conditions:
N1) ‖x‖ ≥ 0, ∀ x ∈ V, and ‖x‖ = 0 ⇔ x = θ
N2) ‖αx‖ = |α| ‖x‖, ∀ x ∈ V, ∀ α ∈ R
N3) ‖x + y‖ ≤ ‖x‖ + ‖y‖, ∀ x, y ∈ V.
A vector space V with a norm ‖·‖ is called a normed space and is denoted by (V, ‖·‖).
Inner product spaces
Definition. Let V be a vector space. A mapping ⟨·,·⟩ : V × V → R is called an inner product on V if the following conditions are satisfied:
IP1) ⟨x, x⟩ > 0, ∀ x ∈ V \ {θ}, and ⟨x, x⟩ = 0 ⇔ x = θ (positive definiteness)
IP2) ⟨x, y⟩ = ⟨y, x⟩, ∀ x, y ∈ V (symmetry)
IP3) ⟨αx + βy, z⟩ = α⟨x, z⟩ + β⟨y, z⟩, ∀ x, y, z ∈ V, ∀ α, β ∈ R (bilinearity).
A vector space with an inner product is called an inner product space.
Example. The canonical inner product on R^n is defined in the following way:
⟨·,·⟩ : R^n × R^n → R,
⟨x, y⟩ = x_1 y_1 + x_2 y_2 + ... + x_n y_n, ∀ x, y ∈ R^n.
The proof that the previous function satisfies all the properties of the previous definition is left to the reader (easy computations based on the properties of real numbers).
Example. The function ‖·‖ : R^n → R,
x ↦ ‖x‖ = √⟨x, x⟩,
is a norm on R^n, which is called the Euclidean norm.
Proof. Properties N1) and N2) are easy consequences of the properties of the inner product.
Property N3) follows from the Cauchy-Buniakovski-Schwarz inequality:
|⟨x, y⟩| ≤ ‖x‖ ‖y‖, ∀ x, y ∈ R^n.
We consider first the following obvious inequalities:

0 ≤ ‖x + y‖² = ⟨x + y, x + y⟩ = ‖x‖² + ‖y‖² + 2⟨x, y⟩
0 ≤ ‖x − y‖² = ⟨x − y, x − y⟩ = ‖x‖² + ‖y‖² − 2⟨x, y⟩

wherefrom we get
−(‖x‖² + ‖y‖²) ≤ 2⟨x, y⟩ ≤ ‖x‖² + ‖y‖²,
hence
2|⟨x, y⟩| ≤ ‖x‖² + ‖y‖².
We can assume that x ≠ θ and y ≠ θ (if x = θ or y = θ the Cauchy-Buniakovski-Schwarz inequality is trivially true).
If we replace x by x/‖x‖ and y by y/‖y‖ in the previous inequality we obtain
2 |⟨x/‖x‖, y/‖y‖⟩| ≤ ‖x/‖x‖‖² + ‖y/‖y‖‖² = 2,
wherefrom we have:
|⟨x, y⟩| ≤ ‖x‖ ‖y‖,
as desired.
We are now able to prove N3):
‖x + y‖² = ‖x‖² + 2⟨x, y⟩ + ‖y‖² ≤ ‖x‖² + 2‖x‖ ‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²,
where the inequality is the Cauchy-Buniakovski-Schwarz inequality.
Example (other norms on R^n).
1) ‖·‖_1 : R^n → R, ‖x‖_1 = |x_1| + ... + |x_n|, ∀ x ∈ R^n
2) ‖·‖_∞ : R^n → R, ‖x‖_∞ = max{|x_1|, ..., |x_n|}.
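The three norms on R^n seen so far differ only in how they aggregate the coordinates; for x = (3, −4), for instance, they give 5, 7 and 4 respectively. A Python sketch (our function names):

```python
import math

def norm2(x):
    """Euclidean norm: sqrt(<x, x>)."""
    return math.sqrt(sum(xi * xi for xi in x))

def norm1(x):
    """||x||_1 = |x_1| + ... + |x_n|."""
    return sum(abs(xi) for xi in x)

def norm_inf(x):
    """||x||_inf = max{|x_1|, ..., |x_n|}."""
    return max(abs(xi) for xi in x)

x = [3, -4]
print(norm2(x), norm1(x), norm_inf(x))  # 5.0 7 4
```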
Metric spaces
Definition. If X ≠ ∅, a function d : X × X → R is called a distance on X if the following conditions are satisfied:
D1) d(x, y) ≥ 0, ∀ x, y ∈ X, and d(x, y) = 0 ⇔ x = y
D2) d(x, y) = d(y, x), ∀ x, y ∈ X (symmetry)
D3) d(x, z) ≤ d(x, y) + d(y, z), ∀ x, y, z ∈ X (triangle inequality).
A metric space is a pair (X, d) in which X is a nonempty set and d is a distance on X.
Remark. Each normed space (V, ‖·‖) is a metric space, since the function
d : V × V → R, d(x, y) = ‖x − y‖, ∀ x, y ∈ V,
satisfies the properties D1), D2), D3).
Example. In R^n the Euclidean distance is:
d(x, y) = √((x_1 − y_1)² + ... + (x_n − y_n)²), ∀ x, y ∈ R^n.
1.3 Quadratic forms
In this section we present the natural generalizations of linear and quadratic functions to several variables.
Linear operators
Deﬁnition. Let V and W be two real vector spaces, such that dimV = n and
dimW = m.
A linear operator from V to W is a function T that preserves the vector space
structure, i.e.
T(αx +βy) = αT(x) +βT(y), ∀ x, y ∈ V, ∀ α, β ∈ R.
A linear operator T : V →R is called a linear form.
Remark 1. Let T : R^n → R be a linear operator. Then there exists a vector a = (a_1, ..., a_n)^t ∈ R^n such that T(x) = a^t x for all x ∈ R^n.
Proof. Let B = {e_1, ..., e_n} be the canonical basis of R^n and let a_i = T(e_i) ∈ R, i = 1, ..., n. Then, for any vector x ∈ R^n,
x = x_1 e_1 + x_2 e_2 + ... + x_n e_n,
so
T(x) = T(x_1 e_1 + ... + x_n e_n) = x_1 T(e_1) + ... + x_n T(e_n) = x_1 a_1 + ... + x_n a_n = a^t x.
The previous remark implies that every linear form on R^n can be associated with a unique vector a ∈ R^n (or with a unique 1 × n matrix) so that T(x) = a^t x.
The same correspondence between linear operators and matrices is valid for linear operators from R^n to R^m.
Remark 2. Let T : R^n → R^m be a linear operator. Then there exists an m × n matrix A such that
T(x) = Ax, ∀ x ∈ R^n.
Proof. The idea is the same as that of the previous remark. Let B = {e_1, ..., e_n} be the canonical basis of R^n.
For each j = 1, ..., n, T(e_j) ∈ R^m, hence
T(e_j) = (a_{1j}, a_{2j}, ..., a_{mj})^t.
Let A be the m × n matrix whose j-th column is the column vector T(e_j). For any x = x_1 e_1 + ... + x_n e_n ∈ R^n we have

T(x) = T(x_1 e_1 + ... + x_n e_n) = x_1 T(e_1) + ... + x_n T(e_n)
= x_1 (a_{11}, a_{21}, ..., a_{m1})^t + x_2 (a_{12}, a_{22}, ..., a_{m2})^t + ... + x_n (a_{1n}, a_{2n}, ..., a_{mn})^t
= (a_{ij})_{i=1,...,m; j=1,...,n} (x_1, ..., x_n)^t = Ax.
So, we can say that matrices are representations of linear operators.
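The construction in the proof (the j-th column of A is T(e_j)) can be carried out mechanically for any concrete linear operator. A Python sketch with a hypothetical operator T(x_1, x_2) = (x_1 + x_2, 2x_1), chosen only for illustration:

```python
def matrix_of(T, n):
    """m x n matrix of the linear operator T : R^n -> R^m; column j is T(e_j)."""
    cols = [T([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

T = lambda x: [x[0] + x[1], 2 * x[0]]  # a hypothetical linear operator on R^2
A = matrix_of(T, 2)
print(A)                 # [[1, 1], [2, 0]]
print(apply(A, [3, 4]))  # [7, 6], which equals T([3, 4])
```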
Quadratic forms
In mathematics, a quadratic form is a homogeneous polynomial of degree two in a number of variables.
Examples.
Q(x) = ax²
Q(x, y) = ax² + bxy + cy²
Q(x, y, z) = ax² + by² + cz² + dxy + exz + fyz
are quadratic forms in one, two and three variables, respectively.
Quadratic forms are associated to bilinear forms.
Definition. a) Let V and W be two real vector spaces such that dim V = n and dim W = m. The mapping (·|·) : V × W → R is called bilinear if it is linear with respect to each of its two variables, i.e.:
(αx + βy|z) = α(x|z) + β(y|z), ∀ α, β ∈ R, ∀ x, y ∈ V, ∀ z ∈ W
(x|αy + βz) = α(x|y) + β(x|z), ∀ α, β ∈ R, ∀ x ∈ V, ∀ y, z ∈ W.
b) The bilinear form (·|·) : V × V → R is called symmetric if
(x|y) = (y|x), ∀ x, y ∈ V.
c) Let V be a real vector space of dimension n and let (·|·) : V × V → R be a symmetric bilinear form. The mapping
Q : V → R, x ↦ Q(x) = (x|x), ∀ x ∈ V,
is called a quadratic form on V.
Next, we determine the analytical expression of a quadratic form.
If B = {v_1, ..., v_n} is a basis of V then x can be uniquely expressed as
x = x_1 v_1 + x_2 v_2 + ... + x_n v_n.
Hence:

Q(x) = (x|x) = (x_1 v_1 + x_2 v_2 + ... + x_n v_n | x)
= x_1 (v_1|x) + x_2 (v_2|x) + ... + x_n (v_n|x)
= x_1 (v_1 | x_1 v_1 + ... + x_n v_n) + ... + x_n (v_n | x_1 v_1 + ... + x_n v_n)
= x_1 [x_1 (v_1|v_1) + ... + x_n (v_1|v_n)] + ... + x_n [x_1 (v_n|v_1) + ... + x_n (v_n|v_n)]
= x_1 Σ_{j=1}^n x_j (v_1|v_j) + ... + x_n Σ_{j=1}^n x_j (v_n|v_j)
= Σ_{i=1}^n (x_i Σ_{j=1}^n x_j (v_i|v_j)) = Σ_{i=1}^n Σ_{j=1}^n (v_i|v_j) x_i x_j.

In conclusion:
Q(x) = (x|x) = Σ_{i=1}^n Σ_{j=1}^n (v_i|v_j) x_i x_j.
If for each i, j = 1, ..., n we denote a_{ij} = (v_i|v_j), then a_{ij} = a_{ji} (since (·|·) is symmetric) and

Q(x) = Σ_{i=1}^n Σ_{j=1}^n a_{ij} x_i x_j
= a_{11} x_1² + 2a_{12} x_1 x_2 + ... + 2a_{1n} x_1 x_n + a_{22} x_2² + ... + 2a_{2n} x_2 x_n + ... + a_{nn} x_n²,

which is the analytical expression of the quadratic form Q.
Just as a linear function has a matrix representation, a quadratic form has a matrix representation, too.
Remark. The quadratic form Q : R^n → R,
Q(x) = Σ_{i=1}^n Σ_{j=1}^n a_{ij} x_i x_j (a_{ij} = a_{ji}),
can be written as
Q(x) = (x_1, ..., x_n) A (x_1, ..., x_n)^t = x^t Ax,
where A = (a_{ij}) is the (symmetric) matrix of the coefficients of the quadratic form Q.
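Evaluating Q(x) = x^t Ax amounts to the double sum over the entries of A. A Python sketch (the symmetric matrix below, for Q(x, y) = x² + 4xy + y², is our own example):

```python
def quadratic_form(A, x):
    """Q(x) = x^t A x = sum_i sum_j a_ij * x_i * x_j, for A symmetric."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

A = [[1, 2], [2, 1]]  # matrix of Q(x, y) = x^2 + 4xy + y^2
print(quadratic_form(A, [1, 1]))   # 6
print(quadratic_form(A, [1, -1]))  # -2, so this Q is indefinite
```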
Definiteness of quadratic forms
Definition. Let Q : R^n → R be a quadratic form. Then Q is
(a) positive definite if Q(x) > 0 for all x ∈ R^n \ {θ}
(b) positive semidefinite if Q(x) ≥ 0 for all x ∈ R^n
(c) negative definite if Q(x) < 0 for all x ∈ R^n \ {θ}
(d) negative semidefinite if Q(x) ≤ 0 for all x ∈ R^n
(e) indefinite if there are x, x' ∈ R^n \ {θ} such that Q(x) < 0 and Q(x') > 0.
Next, we will describe a simple test for the definiteness of a quadratic form. To present the test we need some definitions related to the coefficient matrix of Q.
Definition. a) Let A be an n × n matrix. An m × m submatrix of A formed by deleting n − m columns and the same n − m rows from A is called an m-th order principal submatrix of A. The determinant of an m × m principal submatrix is called an m-th order principal minor of A.
For an n × n matrix there are C_n^m (combinations of n taken m) m-th order principal minors of A.
b) Let A be an n × n matrix. The m-th order principal submatrix of A obtained by deleting the last n − m rows and the last n − m columns from A is called the m-th order leading principal submatrix of A; its determinant is the m-th order leading principal minor of A.
We will denote the m-th order leading principal submatrix by A_m and the corresponding leading principal minor by det A_m.
The following remark provides an algorithm which uses the leading principal minors to determine the definiteness of a quadratic form Q whose coefficient matrix is A.
Remark 3. Let Q : R^n → R be a quadratic form whose coefficient matrix is A. Then:
(a) Q is positive definite if and only if all its n leading principal minors are strictly positive, i.e.
det A_1 > 0, det A_2 > 0, ..., det A_n = det A > 0
(b) Q is negative definite if and only if its n leading principal minors alternate in sign as follows:
det A_1 < 0, det A_2 > 0, ..., (−1)^n det A_n = (−1)^n det A > 0
(c) Q is positive semidefinite if and only if every principal minor of A is nonnegative
(d) Q is negative semidefinite if and only if every principal minor of odd order is nonpositive and every principal minor of even order is nonnegative
(e) If there is an even number m ∈ {1, ..., n} such that det A_m < 0, or if there are two odd numbers m_1 and m_2 such that det A_{m_1} < 0 and det A_{m_2} > 0, then Q is indefinite.
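Parts (a) and (b) of the remark can be checked mechanically by computing the leading principal minors det A_1, ..., det A_n. A Python sketch (our names; only the two definite cases are decided, since the semidefinite cases need all principal minors, not just the leading ones):

```python
def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def definiteness(A):
    """Classify a symmetric matrix by its leading principal minors det A_1..det A_n."""
    minors = [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]
    if all(m > 0 for m in minors):
        return "positive definite"
    if all((m < 0 if k % 2 == 1 else m > 0) for k, m in enumerate(minors, 1)):
        return "negative definite"
    return "inconclusive from leading minors alone"

print(definiteness([[2, 1], [1, 2]]))    # positive definite (minors 2, 3)
print(definiteness([[-2, 1], [1, -2]]))  # negative definite (minors -2, 3)
```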
Proof. We will prove part (a) (the proofs are similar for parts (b), (c) and (d)) by using induction on the size of A (the coefficient matrix of Q).
Suppose that all the leading principal minors are strictly positive. We have to show that Q is positive definite.
If n = 1 then the result is trivial.
If n = 2 then det A_1 = a_{11} > 0, det A_2 = det A = a_{11} a_{22} − a_{12}² > 0 and hence (writing matrices row by row, rows separated by semicolons):

Q(x) = (x_1, x_2) [a_{11}, a_{12}; a_{21}, a_{22}] (x_1, x_2)^t = a_{11} x_1² + 2a_{12} x_1 x_2 + a_{22} x_2²
= a_{11} (x_1 + (a_{12}/a_{11}) x_2)² + ((a_{11} a_{22} − a_{12}²)/a_{11}) x_2²
= det A_1 (x_1 + (a_{12}/a_{11}) x_2)² + (det A_2 / det A_1) x_2² > 0, ∀ (x_1, x_2) ≠ (0, 0).
We suppose that the theorem is true for symmetric matrices of order k and prove it for symmetric matrices of order k + 1.
Let A be a symmetric matrix of order k + 1. We have to prove that if det A_j > 0 for j = 1, ..., k + 1, then x^t Ax > 0 for all x ≠ θ. In block form (blocks written row by row, rows separated by semicolons), the matrix A can be written as
A = [A_k, a; a^t, a_{k+1,k+1}], where a = (a_{1,k+1}, a_{2,k+1}, ..., a_{k,k+1})^t.
If d = a_{k+1,k+1} − a^t A_k^{-1} a, then we have

[I_k, 0; (A_k^{-1} a)^t, 1] [A_k, 0; 0, d] [I_k, A_k^{-1} a; 0, 1]
= [I_k, 0; (A_k^{-1} a)^t, 1] [A_k, a; 0, d]
= [A_k, a; a^t, (A_k^{-1} a)^t a + d]
= [A_k, a; a^t, a_{k+1,k+1}] = A.
The previous equality can be written as A = C^t BC, where (blocks written row by row, rows separated by semicolons)
C = [I_k, A_k^{-1} a; 0, 1] and B = [A_k, 0; 0, d].
Since det C = det C^t = 1 and det B = d · det A_k, we get
det A = det B = d · det A_k.
Since det A > 0 and det A_k > 0, it follows that d > 0. Let x ∈ R^{k+1} \ {θ}. Every x ∈ R^{k+1} can be written as x = (x̄, x_{k+1})^t, where x̄ ∈ R^k. Then

x^t Ax = x^t C^t BCx = (Cx)^t B(Cx) = y^t By = (ȳ^t, y_{k+1}) [A_k, 0; 0, d] (ȳ, y_{k+1})^t = ȳ^t A_k ȳ + d y_{k+1}².

In the previous equality we denoted the vector Cx by y = (ȳ, y_{k+1})^t, which is not the null vector since C is invertible and x ≠ θ.
By using the inductive hypothesis and the fact that d > 0 we get that x^t Ax > 0, hence Q is positive definite.
To prove the converse (Q positive definite implies that det A_j > 0, j = 1, ..., n) we will use induction once more.
If n = 1 then the result is trivial.
If n = 2, then
Q(x) = det A_1 (x_1 + (a_{12}/a_{11}) x_2)² + (det A_2 / det A_1) x_2².
In the previous equality we used the fact that a_{11} ≠ 0, since if a_{11} = 0 then Q(1, 0) = 0 and Q cannot be positive definite.
It is obvious that if Q(x) > 0, ∀ x ≠ θ, then det A_1 > 0 and det A_2 > 0.
Assume that the result is true for any quadratic form whose coefficient matrix has order k, and let A be the (k + 1) × (k + 1) coefficient matrix of a positive definite quadratic form.
Let x̄ ∈ R^k \ {θ}. If x = (x̄, 0)^t ∈ R^{k+1}, then
0 < x^t Ax = (x̄^t, 0) A (x̄, 0)^t = x̄^t A_k x̄.
By the inductive hypothesis we obtain that det A_1 > 0, det A_2 > 0, ..., det A_k > 0. It remains only to prove that det A = det A_{k+1} is positive.
We write the matrix A as in the first part of the proof. Hence
A = C^t BC and det A = d · det A_k.
We now have to prove that d > 0. Indeed, for x = (0, 0, ..., 0, 1)^t, since Q is positive definite we have that

d = (0, 0, ..., 1) [A_k, 0; 0, d] (0, ..., 0, 1)^t = x^t Bx = x^t (C^{-1})^t A C^{-1} x = (C^{-1} x)^t A (C^{-1} x) > 0.

Since det A_k > 0 and d > 0, it follows that det A = det A_{k+1} > 0, as desired.
Chapter 2
Linear Algebra
2.1 Systems of linear equations.
Gauss-Jordan elimination method
A finite set of linear equations in the variables x_1, x_2, ..., x_n ∈ R is called a system of linear equations or a linear system. The general form of a linear system of m equations and n unknowns is the following:

a_{11} x_1 + ... + a_{1n} x_n = b_1
. . .
a_{m1} x_1 + ... + a_{mn} x_n = b_m

where a_{ij} ∈ R, b_i ∈ R, ∀ i = 1, ..., m, j = 1, ..., n.
A solution of the system is a list (s_1, ..., s_n) of numbers which makes each equation a true statement when the values s_1, ..., s_n are substituted for x_1, ..., x_n, respectively. The set of all possible solutions is called the solution set or the general solution of the linear system. Two linear systems are called equivalent if they have the same solution set; that is, each solution of the first system is a solution of the second system, and each solution of the second system is a solution of the first system.
A system of linear equations has either
1. no solution, or
2. exactly one solution, or
3. inﬁnitely many solutions.
We say that a linear system is consistent if it has either one solution or inﬁnitely
many solutions; a system is inconsistent if it has no solution.
For a linear system we consider
• the matrix of the system (the matrix of the coefficients of the unknowns)
$$A = (a_{ij})_{\substack{i=\overline{1,m} \\ j=\overline{1,n}}} = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix} = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_m \end{pmatrix},$$
where $R_i = (a_{i1}, a_{i2}, \dots, a_{in})$ is the $i^{\text{th}}$ row.
• the augmented matrix (the coefficient matrix with an added column containing the constants from the right sides of the equations)
$$\overline{A} = \begin{pmatrix} a_{11} & \dots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \dots & a_{mn} & b_m \end{pmatrix}$$
• the column of the constants
$$b = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}$$
• the column of the unknowns
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
By using the above matrix notations the system can be written in the following
form:
Ax = b.
Concerning the solution set of a linear system we have the following result:
Remark. 1) If rank $A <$ rank $\overline{A}$ then there is no solution of the considered linear system. The system is inconsistent.
2) If rank $A$ = rank $\overline{A} = n$ (where $n$ is the number of the unknowns) then the system has exactly one solution. The system is consistent.
3) If rank $A$ = rank $\overline{A} < n$ then the system has infinitely many solutions.
This chapter describes an algorithm, i.e. a systematic procedure, for solving linear systems. This algorithm is called the Gauss-Jordan elimination method and its basic strategy is to replace one system with an equivalent one that is easier to solve.
If the equivalent system contains a degenerate linear equation of the following form
$$0\cdot x_1 + 0\cdot x_2 + \dots + 0\cdot x_n = b_i$$
then
i) If $b_i \neq 0$, then the system is inconsistent.
ii) If $b_i = 0$, then the degenerate equation may be deleted from the system without changing the solution set.
The method is named after the German mathematicians Carl Friedrich Gauss (1777–1855) and Wilhelm Jordan (1842–1899), but it already appears in an important Chinese mathematical text written around 150 BCE.
The rectangle rule for row operations
The purpose of this paragraph is to transform a matrix which has a nonzero
column into an equivalent one that contains one element equal to 1 and all the other
elements equal to 0 (we say that such a column is in proper form).
This can be done by using the elementary row operations which are:
1) Scaling. Multiply all entries in a row by a nonzero constant:
$$\lambda R_i \to R_i, \quad \lambda \neq 0$$
2) Replacement. Replace one row by the sum of itself and a multiple of another row:
$$R_i + \lambda R_k \to R_i$$
3) Interchange. Interchange two rows:
$$R_i \leftrightarrow R_j$$
Remark. If we apply the elementary row operations to the augmented matrix of a linear system we obtain a new matrix which is the augmented matrix of a linear system equivalent to the given one. This remark is true since it is well known that the solution of a system remains unchanged if we multiply one equation by a nonzero constant, if we add a multiple of one equation to another, or if we interchange two equations of a system (the rows of an augmented matrix correspond to the equations in the associated system).
Let $A$ be the matrix
$$A = \begin{pmatrix} \dots & \dots & \dots & \dots & \dots \\ \dots & a_{ij} & \dots & a_{il} & \dots \\ \vdots & \vdots & & \vdots & \vdots \\ \dots & a_{kj} & \dots & a_{kl} & \dots \\ \dots & \dots & \dots & \dots & \dots \end{pmatrix}$$
Suppose that $a_{ij} \neq 0$.
We want to determine the elementary row operations which transform the element $a_{ij}$ into 1 ($a_{ij} \to 1$) and all the other elements of the $j^{\text{th}}$ column into 0 ($a_{kj} \to 0$, $\forall\, k \neq i$).
We consider the following row operations:
$$\frac{1}{a_{ij}}\,R_i \to R_i, \qquad a_{ij} \to \frac{a_{ij}}{a_{ij}} = 1$$
and
$$R_k - \frac{a_{kj}}{a_{ij}}\,R_i \to R_k, \qquad a_{kj} \to a_{kj} - a_{ij}\,\frac{a_{kj}}{a_{ij}} = 0, \quad \forall\, k = \overline{1,m},\ k \neq i.$$
The effects of the previous elementary row operations on the other elements of the matrix are:
$$a_{il} \to \frac{a_{il}}{a_{ij}}, \quad \forall\, l = \overline{1,n},\ l \neq j$$
$$a_{kl} \to a_{kl} - a_{il}\,\frac{a_{kj}}{a_{ij}} = \frac{a_{kl}\,a_{ij} - a_{il}\,a_{kj}}{a_{ij}}, \quad \forall\, k \neq i,\ \forall\, l \neq j.$$
The element $a_{ij} \neq 0$ is called the pivot.
So, in order to transform the element $a_{kl}$ (by using $a_{ij}$ as a pivot) we locate the rectangle which contains the element $a_{kl}$ and the pivot $a_{ij}$ as opposite corners. Then, from the product of the elements situated in the opposite corners of the rectangle's diagonal which contains the pivot we subtract the product of the elements situated in the corners of the other diagonal, and the result is divided by the pivot (the rectangle's rule).
Remark. 1) The rows which contain 0 in the pivot column remain unchanged.
Indeed, if $a_{kj} = 0$ then
$$a_{kl} \to \frac{a_{kl}\,a_{ij} - a_{il}\cdot 0}{a_{ij}} = a_{kl}.$$
2) The columns which contain 0 in the pivot row remain unchanged.
Indeed, if $a_{il} = 0$ then
$$a_{kl} \to \frac{a_{kl}\,a_{ij} - 0\cdot a_{kj}}{a_{ij}} = a_{kl}.$$
So, in order to transform a matrix which has a nonzero column into an equivalent
one that contains one element equal to 1 and all the other elements equal to 0 we
have to follow the next steps:
Rectangle’s algorithm
Step 1. Choose and circle (from the considered column) a nonzero element which is
called the pivot.
Step 2. Divide the pivot row by the pivot.
Step 3. Set the elements of the pivot column (except the pivot) equal to 0.
Step 4. The rows which contain a 0 on the pivot column remain unchanged.
The columns which contain a 0 on the pivot row remain unchanged.
Step 5. Compute all the other elements of the matrix by using the rectangle’s rule.
Example. Choosing $a_{22} = 3$ as pivot,
$$A = \begin{pmatrix} 2 & 0 & 1 & -1 \\ 1 & 3 & 2 & 0 \\ -1 & 1 & 0 & 2 \end{pmatrix} \sim \begin{pmatrix} 2 & 0 & 1 & -1 \\ 1/3 & 1 & 2/3 & 0 \\ -4/3 & 0 & -2/3 & 2 \end{pmatrix}$$
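The rectangle rule can be sketched in a few lines of Python (the function name `pivot` and the use of exact `Fraction` arithmetic are our choices, not the book's):

```python
from fractions import Fraction

def pivot(matrix, i, j):
    """Bring column j into proper form using matrix[i][j] as pivot
    (steps 2-5 of the rectangle's algorithm)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    p = m[i][j]
    assert p != 0, "the pivot must be nonzero"
    m[i] = [x / p for x in m[i]]          # divide the pivot row by the pivot
    for k in range(len(m)):
        if k != i:
            f = m[k][j]                   # row update, equivalent to the rectangle rule
            m[k] = [m[k][l] - f * m[i][l] for l in range(len(m[k]))]
    return m

A = [[2, 0, 1, -1],
     [1, 3, 2, 0],
     [-1, 1, 0, 2]]
B = pivot(A, 1, 1)   # a_22 = 3 as pivot, as in the example above
```

The first row stays unchanged (it has a 0 in the pivot column), and the last row becomes $(-4/3,\ 0,\ -2/3,\ 2)$, matching the example.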
Remark. The rectangle rule can be used to determine the inverse of a given invertible matrix $A$. This is done by writing at the right side of the given matrix the identity matrix $I$ which has the same number of rows and columns as $A$, and then applying the rectangle rule to the matrix obtained. By choosing successively the elements situated on the main diagonal of $A$ as pivots, we will finally obtain the identity matrix $I$ in the place of the given matrix $A$. The matrix situated at the right side of the identity matrix (in the final table) is the inverse of the matrix $A$.
We will illustrate the previous procedure by an example.
Example. Determine the inverse of the matrix $A$ given by
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix}.$$
We observe that the matrix $A$ is invertible since its determinant is $-18 \neq 0$.
  A        |  I_3
  1  2  3  |  1  0  0
  2  3  1  |  0  1  0
  3  1  2  |  0  0  1
  1  2  3  |  1  0  0
  0 -1 -5  | -2  1  0
  0 -5 -7  | -3  0  1
  1  0 -7  | -3  2  0
  0  1  5  |  2 -1  0
  0  0 18  |  7 -5  1
  1  0  0  | -5/18  1/18  7/18
  0  1  0  |  1/18  7/18 -5/18
  0  0  1  |  7/18 -5/18  1/18
Hence
$$A^{-1} = \begin{pmatrix} -5/18 & 1/18 & 7/18 \\ 1/18 & 7/18 & -5/18 \\ 7/18 & -5/18 & 1/18 \end{pmatrix}.$$
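The inversion procedure above can be sketched as follows (a simplified version that assumes the diagonal pivots are nonzero, which holds for this $A$; exact `Fraction` arithmetic keeps the entries rational):

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan inversion: apply the rectangle rule to [A | I],
    choosing the pivots on the main diagonal."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):
        p = M[i][i]                       # assumed nonzero here
        M[i] = [x / p for x in M[i]]
        for k in range(n):
            if k != i:
                f = M[k][i]
                M[k] = [M[k][l] - f * M[i][l] for l in range(2 * n)]
    return [row[n:] for row in M]         # the right half is A^(-1)

Ainv = inverse([[1, 2, 3], [2, 3, 1], [3, 1, 2]])
```

The result reproduces the inverse computed in the example above, with first row $(-5/18,\ 1/18,\ 7/18)$.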
The Gauss-Jordan elimination method
This method is an elimination procedure which transforms the initial system into
an equivalent one whose solution can be obtained directly.
Gauss-Jordan elimination algorithm
Step 1. Associate to the given system the following table. The table contains the
augmented matrix with the constant column written at the left side of the matrix A.
  b   | x_1   x_2   ...  x_n
  b_1 | a_11  a_12  ...  a_1n
  b_2 | a_21  a_22  ...  a_2n
  ⋮   | ⋮     ⋮          ⋮
  b_m | a_m1  a_m2  ...  a_mn
Step 2. Choose and circle $a_{ij} \neq 0$ (the pivot). The pivot has to be chosen from the coefficient matrix $A$, not from the constant column.
Use $a_{ij}$ as a pivot to eliminate the unknown $x_j$ from all the equations except the $i^{\text{th}}$ equation (by applying the rectangle's algorithm).
Step 3. Examine each new row obtained (or, equivalently, each new equation) R.
a) If R corresponds to the equation
$$0\cdot x_1 + \dots + 0\cdot x_n = 0$$
then delete R from the table.
b) If R corresponds to the equation
$$0\cdot x_1 + 0\cdot x_2 + \dots + 0\cdot x_n = b_i,$$
with $b_i \neq 0$, then exit the algorithm. The system is inconsistent.
Step 4. Repeat steps 2 and 3 with the subsystem formed by all the equations
from which a pivot hasn’t been chosen yet.
Step 5. Continue the above process until we choose a pivot from each row or a
degenerate equation is obtained at the step 3b.
In the case of consistency (the system is consistent if we choose a pivot from each
row) write the general solution. The solution set can be speciﬁed as follows
– the variables whose columns are in proper form are called leading variables. If all the variables are leading variables then the system has a unique solution, which can be obtained directly from the column $b$;
– the variables whose columns are not in proper form may assume any values and they are called secondary variables. If there is at least one secondary variable then the system has infinitely many solutions. In this case express the leading variables in terms of the secondary variables.
Example. Solve the following linear systems.
a)
$$\begin{cases} x + 2y - 3z + 4t = 2 \\ 2x + 5y - 2z + t = 1 \\ 5x + 12y - 7z + 6t = 7 \end{cases}$$
Solution
  b  | x   y    z    t
  2  | 1   2   -3    4
  1  | 2   5   -2    1
  7  | 5  12   -7    6
  2  | 1   2   -3    4
 -3  | 0   1    4   -7
 -3  | 0   2    8  -14
  8  | 1   0  -11   18
 -3  | 0   1    4   -7
  3  | 0   0    0    0
The system is inconsistent since we obtain the following equation:
$$3 = 0\cdot x + 0\cdot y + 0\cdot z + 0\cdot t.$$
b)
$$\begin{cases} x - 2y + z = 7 \\ 2x - y + 4z = 17 \\ 3x - 2y + 2z = 14 \end{cases}$$
Solution
  b   | x   y   z
  7   | 1  -2   1
 17   | 2  -1   4
 14   | 3  -2   2
  7   | 1  -2   1
  3   | 0   3   2
 -7   | 0   4  -1
  0   | 1   2   0
-11   | 0  11   0
  7   | 0  -4   1
  2   | 1   0   0
 -1   | 0   1   0
  3   | 0   0   1
The system is consistent since we have chosen a pivot from each row. The leading variables are $x, y, z$ and the system has the unique solution
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix}.$$
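For a square system with a unique solution, the elimination can be sketched in Python (the function name and the way a nonzero pivot is picked in each column are our choices):

```python
from fractions import Fraction

def gauss_jordan_solve(A, b):
    """Solve Ax = b when the solution is unique, by Gauss-Jordan
    elimination on the augmented matrix."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(A, b)]
    for j in range(n):
        i = next(k for k in range(j, n) if M[k][j] != 0)  # a nonzero pivot
        M[i], M[j] = M[j], M[i]
        M[j] = [x / M[j][j] for x in M[j]]                # normalize pivot row
        for k in range(n):
            if k != j:
                f = M[k][j]
                M[k] = [M[k][l] - f * M[j][l] for l in range(n + 1)]
    return [M[i][n] for i in range(n)]

# system b) above
x = gauss_jordan_solve([[1, -2, 1], [2, -1, 4], [3, -2, 2]], [7, 17, 14])
```

The computed solution is $(2, -1, 3)$, as in the table above.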
c)
$$\begin{cases} x + 2y - 3z - 2s + 4t = 1 \\ 2x + 5y - 8z - s + 6t = 4 \\ x + 4y - 4z + 5s + 2t = 8 \end{cases}$$
Solution
  b  | x   y    z   s    t
  1  | 1   2   -3  -2    4
  4  | 2   5   -8  -1    6
  8  | 1   4   -4   5    2
  1  | 1   2   -3  -2    4
  2  | 0   1   -2   3   -2
  7  | 0   2   -1   7   -2
 -3  | 1   0    1  -8    8
  2  | 0   1   -2   3   -2
  3  | 0   0    3   1    2
 21  | 1   0   25   0   24
 -7  | 0   1  -11   0   -8
  3  | 0   0    3   1    2
The system is consistent since we have chosen a pivot from each row. The leading variables are $x, y, s$; the secondary variables are $z, t$, and in consequence the system has infinitely many solutions.
The general solution can be expressed as follows.
From the final table we write down the following system, equivalent to the given one:
$$\begin{cases} x + 25z + 24t = 21 \\ y - 11z - 8t = -7 \\ 3z + s + 2t = 3 \end{cases}$$
from which we can easily express the leading variables in terms of the secondary variables:
$$\begin{cases} x = 21 - 25z - 24t \\ y = -7 + 11z + 8t \\ s = 3 - 3z - 2t \end{cases}$$
with $z, t \in \mathbb{R}$.
The general solution is:
$$\begin{pmatrix} x \\ y \\ z \\ s \\ t \end{pmatrix} = \begin{pmatrix} 21 - 25z - 24t \\ -7 + 11z + 8t \\ z \\ 3 - 3z - 2t \\ t \end{pmatrix},$$
where $z, t \in \mathbb{R}$.
Leontief Production Model
The Leontief production model is a model for the economics of a whole country or
region. In this model there are n industries producing n diﬀerent products such that
consumption equals production. We remark that a part of production is consumed
internally by industries and the rest is to satisfy the outside demand.
The problem is to determine the levels of the outputs of the industries if the
external demand is given and the prices are ﬁxed. We will measure the levels of the
outputs in terms of their economic values. Over some ﬁxed period of time, let
$x_i$ = the monetary value of the total output of the $i^{\text{th}}$ industry;
$d_i$ = the monetary value of the output of the $i^{\text{th}}$ industry needed to satisfy the external demand;
$c_{ij}$ = the monetary value of the output of the $i^{\text{th}}$ industry needed by the $j^{\text{th}}$ industry to produce one monetary unit of its own output.
We deﬁne the production vector
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},$$
the demand vector
$$d = \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{pmatrix}$$
and the consumption matrix
$$C = \begin{pmatrix} c_{11} & c_{12} & \dots & c_{1n} \\ c_{21} & c_{22} & \dots & c_{2n} \\ \dots & \dots & \dots & \dots \\ c_{n1} & c_{n2} & \dots & c_{nn} \end{pmatrix}.$$
It is obvious that $x_j, d_j, c_{ij} \geq 0$ for each $j = \overline{1,n}$, $i = \overline{1,n}$.
The quantity $c_{i1}x_1 + c_{i2}x_2 + \dots + c_{in}x_n$ is the value of the output of the $i^{\text{th}}$ industry needed by all $n$ industries. We are led to the following equation
$$x = Cx + d,$$
which is called the Leontief input-output model, or production model.
Writing $x$ as $I_nx$ and using matrix algebra, we can rewrite the previous equation as
$$I_nx - Cx = d$$
$$(I_n - C)x = d.$$
The above system can be solved by using the Gauss-Jordan elimination method.
If the matrix $I_n - C$ is invertible, then we obtain
$$x = (I_n - C)^{-1}d.$$
Example. As a simple example, suppose the economy consists of three sectors (manufacturing, agriculture and services) whose consumption matrix is given by
$$C = \begin{pmatrix} 0.5 & 0.2 & 0.1 \\ 0.4 & 0.3 & 0.1 \\ 0.2 & 0.1 & 0.3 \end{pmatrix}.$$
Suppose the external demand is 50 units for manufacturing, 30 units for agriculture
and 20 units for services. Find the production level that will satisfy this demand.
Solution 1 (by using the Gauss-Jordan elimination method)
The production equation is the following
$$(I_3 - C)x = d,$$
which gives us the following system to be solved:
$$\begin{cases} 0.5x_1 - 0.2x_2 - 0.1x_3 = 50 \\ -0.4x_1 + 0.7x_2 - 0.1x_3 = 30 \\ -0.2x_1 - 0.1x_2 + 0.7x_3 = 20 \end{cases}$$
      b    | x_1   x_2   x_3
      50   |  0.5  -0.2  -0.1
      30   | -0.4   0.7  -0.1
      20   | -0.2  -0.1   0.7
(the rows were multiplied by −10, 10 and 10, respectively, to clear the decimals)
    -500   | -5     2     1
     300   | -4     7    -1
     200   | -2    -1     7
    -500   | -5     2     1
    -200   | -9     9     0
    3700   | 33   -15     0
 -4100/9   | -3     0     1
  -200/9   | -1     1     0
 10100/3   | 18     0     0
   950/9   |  0     0     1
 4450/27   |  0     1     0
 5050/27   |  1     0     0

$$x_1 = \frac{5050}{27} \approx 187, \quad x_2 = \frac{4450}{27} \approx 165, \quad x_3 = \frac{950}{9} \approx 106$$
Solution 2 (by determining the inverse of the matrix $I - C$)
We know that the production level is determined by
$$x = (I_3 - C)^{-1}d.$$
We first determine the matrix $(I_3 - C)^{-1}$.
  I_3 − C            |  I_3
   0.5  -0.2  -0.1   |  1     0     0
  -0.4   0.7  -0.1   |  0     1     0
  -0.2  -0.1   0.7   |  0     0     1
   1   -2/5  -1/5    |  2     0     0
   0   27/50 -9/50   |  4/5   1     0
   0   -9/50 33/50   |  2/5   0     1
   1    0   -1/3     | 70/27 20/27  0
   0    1   -1/3     | 40/27 50/27  0
   0    0    3/5     |  2/3   1/3   1
   1    0    0       | 80/27 25/27  5/9
   0    1    0       | 50/27 55/27  5/9
   0    0    1       | 10/9   5/9   5/3
Hence,
$$(I - C)^{-1} = \begin{pmatrix} 80/27 & 25/27 & 5/9 \\ 50/27 & 55/27 & 5/9 \\ 10/9 & 5/9 & 5/3 \end{pmatrix}$$
and in consequence
$$x = (I - C)^{-1}d = \begin{pmatrix} 5050/27 \\ 4450/27 \\ 950/9 \end{pmatrix},$$
as we expected.
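The production equation of this example can also be solved directly in a few lines of Python (a sketch with exact rational arithmetic; the elimination assumes a nonzero pivot can be found below the diagonal, which is the case here):

```python
from fractions import Fraction as F

C = [[F(5, 10), F(2, 10), F(1, 10)],
     [F(4, 10), F(3, 10), F(1, 10)],
     [F(2, 10), F(1, 10), F(3, 10)]]
d = [F(50), F(30), F(20)]

# solve (I - C) x = d by Gauss-Jordan elimination
n = 3
M = [[F(int(i == j)) - C[i][j] for j in range(n)] + [d[i]] for i in range(n)]
for j in range(n):
    i = next(k for k in range(j, n) if M[k][j] != 0)
    M[i], M[j] = M[j], M[i]
    M[j] = [v / M[j][j] for v in M[j]]
    for k in range(n):
        if k != j:
            f = M[k][j]
            M[k] = [M[k][l] - f * M[j][l] for l in range(n + 1)]
x = [M[i][n] for i in range(n)]
```

The result is exactly $(5050/27,\ 4450/27,\ 950/9)$, agreeing with both solutions above.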
The theorem below shows that in most practical cases, I − C is invertible and
the production vector x is economically feasible in the sense that the entries in x are
nonnegative.
Theorem. Let $C$ be the consumption matrix for an economy and let $d$ be the vector of external demand. If $C$ and $d$ have nonnegative entries and if each row sum or each column sum of $C$ is less than 1, then $(I - C)^{-1}$ exists and the production vector
$$x = (I - C)^{-1}d$$
has nonnegative entries and is the unique solution of the production equation $x = Cx + d$.
Remark (The economic interpretation of the entries in $(I - C)^{-1}$). The $(i, j)^{\text{th}}$ entry of the matrix $(I - C)^{-1}$ is the increased amount the $i^{\text{th}}$ sector has to produce in order to satisfy an increase of 1 unit in the external demand for sector $j$.
Proof. Let $d$ be the vector in $\mathbb{R}^n$ with 1 in the $j^{\text{th}}$ entry and zeros elsewhere. The corresponding production vector $x$ is the $j^{\text{th}}$ column of $(I - C)^{-1}$. This shows that the $(i, j)^{\text{th}}$ entry of $(I - C)^{-1}$ gives the production of the $i^{\text{th}}$ sector needed to satisfy 1 unit of external demand for sector $j$. Now, the conclusion is true since if $x_1$ and $x_2$ are production vectors which satisfy respectively the external demands $d_1$ and $d_2$, then $x_1 - x_2$ is the production vector which satisfies the external demand $d_1 - d_2$.
Basic feasible solutions
We consider a linear system in general form
$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n = b_1 \\ \dots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n = b_m. \end{cases}$$
We suppose that the above system is consistent with an infinite number of solutions (that means that rank $A$ = rank $\overline{A} < n$). Also, we suppose that rank $A = m$ (if rank $A < m$ then some equations of the system are linear combinations of the others, and eliminating them does not change the general solution).
Since rank $A$ = rank $\overline{A} = m < n$, the system will have $m$ leading variables and $n - m$ secondary variables. A leading variable is also called a basic variable and a secondary variable is called a nonbasic variable.
Deﬁnitions
A feasible solution (FS) of a linear system is a solution for which all the components are nonnegative.
A basic solution (BS) of a linear system is a solution for which all the nonbasic
variables are zero.
If one or more basic variables in a BS are zero then the solution is a degenerate
BS.
A basic feasible solution (BFS) is a feasible solution which is also a basic one.
If a BFS is degenerate, it is called a degenerate BFS.
Example. Determine all the basic solutions and all the basic feasible solutions of the following system:
$$\begin{cases} 2x_1 + 3x_2 - x_3 = 9 \\ -x_1 + x_2 - x_3 = -2 \end{cases}$$
Solution. Since
$$\overline{A} = \begin{pmatrix} 2 & 3 & -1 & 9 \\ -1 & 1 & -1 & -2 \end{pmatrix},$$
we have rank $A$ = rank $\overline{A} = 2$, so the system is consistent with an infinite number of solutions. Actually, we have 2 basic variables and one nonbasic variable.
The 2 basic variables can be:
a) $x_1, x_2$ ($x_3$ is a nonbasic variable)
Since $x_3$ is nonbasic, $x_3 = 0$ and the system becomes
$$\begin{cases} 2x_1 + 3x_2 = 9 \\ -x_1 + x_2 = -2 \end{cases}$$
The solution of the previous system is $x_1 = 3$ and $x_2 = 1$.
In this case we obtain the BS $(3,\ 1,\ 0)^t$, which is also a BFS.
b) $x_1, x_3$ ($x_2$ is a nonbasic variable)
In this case we obtain the BS $(11/3,\ 0,\ -5/3)^t$, which is not a BFS.
c) $x_2, x_3$ ($x_1$ is a nonbasic variable)
In this case we obtain the BS $(0,\ 11/2,\ 15/2)^t$, which is a BFS.
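The case analysis above can be sketched as a short enumeration (Cramer's rule on each pair of columns; the variable names and the `solutions` dictionary are our choices):

```python
from fractions import Fraction as F
from itertools import combinations

A = [[F(2), F(3), F(-1)],
     [F(-1), F(1), F(-1)]]
b = [F(9), F(-2)]
n = 3

solutions = {}
for i, j in combinations(range(n), 2):       # choose the two basic variables
    det = A[0][i] * A[1][j] - A[0][j] * A[1][i]
    if det == 0:
        continue                             # columns i, j cannot both be basic
    x = [F(0)] * n                           # the nonbasic variable is set to 0
    x[i] = (b[0] * A[1][j] - b[1] * A[0][j]) / det   # Cramer's rule
    x[j] = (A[0][i] * b[1] - A[1][i] * b[0]) / det
    solutions[(i, j)] = (x, all(v >= 0 for v in x))  # (BS, is it a BFS?)
```

`solutions[(0, 1)]` gives the BFS $(3, 1, 0)$; `solutions[(0, 2)]` gives the basic solution $(11/3, 0, -5/3)$, which is not feasible.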
Remark. For a consistent system with an infinite number of solutions whose rank is $m < n$ ($n$ is the number of unknowns) there are at most $C_n^m$ basic solutions.
Our purpose is to determine the basic feasible solutions of a linear system. We will use the Gauss-Jordan elimination method.
Since rank $A = m$, we have $m$ basic variables and $n - m$ nonbasic variables. Since a basic variable is a variable from whose column we have chosen a pivot, we have chosen $m$ pivots from $m$ different columns and $m$ different rows. In consequence, we choose a pivot from each row.
By renumbering the unknowns if necessary, we can suppose that we have chosen pivots from the first $m$ columns. So, we can suppose that the basic variables are $x_1, \dots, x_m$ and the nonbasic variables are $x_{m+1}, \dots, x_n$.
The computations can be arranged in the following table.
  b   | x_1   ...  x_m   x_{m+1}     ...  x_n
  b_1 | a_11  ...  a_1m  a_{1,m+1}   ...  a_1n
  ⋮   |
  b_m | a_m1  ...  a_mm  a_{m,m+1}   ...  a_mn
  ... | ...
  β_1 | 1     ...  0     α_{1,m+1}   ...  α_{1n}
  ⋮   |
  β_m | 0     ...  1     α_{m,m+1}   ...  α_{mn}
The general solution is:
$$\begin{cases} x_1 = \beta_1 - (\alpha_{1,m+1}x_{m+1} + \dots + \alpha_{1n}x_n) \\ x_2 = \beta_2 - (\alpha_{2,m+1}x_{m+1} + \dots + \alpha_{2n}x_n) \\ \quad\vdots \\ x_m = \beta_m - (\alpha_{m,m+1}x_{m+1} + \dots + \alpha_{mn}x_n) \\ x_{m+1}, \dots, x_n \in \mathbb{R} \end{cases}$$
In order to get a basic solution we let $x_{m+1} = \dots = x_n = 0$, so $x_1 = \beta_1$, $x_2 = \beta_2$, ..., $x_m = \beta_m$.
The basic solution
$$x = (\beta_1, \beta_2, \dots, \beta_m, 0, \dots, 0)^t$$
is in the final table. This basic solution is also a basic feasible solution if in the final table the column of the constants contains only nonnegative elements.
Next, we will determine rules for choosing the pivot such that if in the initial
table the column of the constants is nonnegative then so it will be in the ﬁnal table.
Actually, we are interested in preserving the property of the constant column to
contain only nonnegative elements at each intermediate table which occurs when we
solve the system.
We may assume that in the initial table the constant column is nonnegative (if there is an equation whose right-hand side constant is negative, then we can multiply it by $-1$).
We are interested in choosing a pivot from the $j^{\text{th}}$ column such that the constant column in the next table will remain nonnegative.
If we choose $a_{ij} \neq 0$ as a pivot then $b_i$ will transform into $\dfrac{b_i}{a_{ij}}$, which has to be nonnegative, too.
Since $b_i \geq 0$ and $\dfrac{b_i}{a_{ij}} \geq 0$, the pivot $a_{ij}$ has to be positive.
If $k \neq i$, then the element $b_k$ will transform by the rectangle's rule into
$$b_k \to \frac{b_k\,a_{ij} - b_i\,a_{kj}}{a_{ij}} \geq 0.$$
Since $a_{ij} > 0$, then $b_k\,a_{ij} - b_i\,a_{kj} \geq 0$, $k = \overline{1,m}$, $k \neq i$.
For $k = i$ the previous inequality becomes $b_i\,a_{ij} - b_i\,a_{ij} = 0 \geq 0$; so the pivot has to satisfy the following condition:
$$b_i\,a_{kj} \leq b_k\,a_{ij}, \quad k = \overline{1,m}. \qquad (*)$$
Let $J_1 = \{k = \overline{1,m} \mid a_{kj} > 0\}$ and $J_2 = \{k = \overline{1,m} \mid a_{kj} \leq 0\}$.
If $k \in J_2$ then $(*)$ is satisfied since $b_i\,a_{kj} \leq 0 \leq b_k\,a_{ij}$.
If $k \in J_1$ then $(*)$ is equivalent to the following condition:
$$\frac{b_i}{a_{ij}} \leq \frac{b_k}{a_{kj}}, \quad \forall\, k \in J_1.$$
So, $(*)$ is satisfied if
$$\frac{b_i}{a_{ij}} = \min\left\{ \frac{b_k}{a_{kj}} \;\Big|\; k \in J_1 \right\},$$
where $J_1 = \{k = \overline{1,m} \mid a_{kj} > 0\}$.
The previous condition is called the ratio test.
Conclusion. In order to keep the nonnegativity property of the constants column we obtain the following rule for choosing a pivot in the $j^{\text{th}}$ column.
1) The pivot has to be positive: $a_{ij} > 0$.
2) If $J_1 = \emptyset$ (the $j^{\text{th}}$ column has no positive element) then none of the elements of the $j^{\text{th}}$ column can become a pivot. In this case $x_j$ can't be a basic variable.
If $J_1 \neq \emptyset$ then the pivot will be the positive element of the $j^{\text{th}}$ column for which the ratio test is satisfied.
The computation table contains an extra column, situated at the right-hand side of the usual table, for the ratio test.
Remark. If the ratio test is satisfied for more than one element, then the pivot will be the element which provides the minimum row with respect to the lexicographical order.
Let $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ and $b = (b_1, \dots, b_n) \in \mathbb{R}^n$. We say that $a < b$ (in lexicographical order) if
$$\begin{cases} a_1 < b_1, & \text{or} \\ a_1 = b_1,\ a_2 < b_2, & \text{or} \\ a_1 = b_1,\ a_2 = b_2,\ a_3 < b_3, & \text{or} \\ \quad\dots \\ a_1 = b_1,\ \dots,\ a_{n-1} = b_{n-1},\ a_n < b_n. \end{cases}$$
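The ratio test itself reduces to a single minimization. A sketch (the lexicographical tie-break is omitted here — `min` simply returns the first row attaining the minimum):

```python
from fractions import Fraction as F

def choose_pivot_row(b, col):
    """Ratio test for a pivot in the given column: among the rows k
    with col[k] > 0, pick one minimizing b[k] / col[k].
    Returns None if the column has no positive entry."""
    J1 = [k for k in range(len(col)) if col[k] > 0]
    if not J1:
        return None                     # x_j cannot become a basic variable
    return min(J1, key=lambda k: b[k] / col[k])

# the x1 column of example a) below: constants (9, 2), entries (2, 1)
row = choose_pivot_row([F(9), F(2)], [F(2), F(1)])
```

Here min{9/2, 2/1} = 2 is attained in the second row, so the pivot is chosen there.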
Examples. Determine a basic feasible solution for the following systems:
a)
$$\begin{cases} 2x_1 + 3x_2 - x_3 = 9 \\ -x_1 + x_2 - x_3 = -2 \quad |\cdot(-1) \end{cases}$$
First, we multiply the second equation by $-1$, in order to obtain a positive constant in the right-hand side:
$$\begin{cases} 2x_1 + 3x_2 - x_3 = 9 \\ x_1 - x_2 + x_3 = 2 \end{cases}$$

         b   | x_1  x_2  x_3  | ratio test
         9   |  2    3   -1   |
         2   |  1   -1    1   | min{9/2, 2/1} = 2
         5   |  0    5   -3   | min{5/5} = 1
→ x_1:   2   |  1   -1    1   |
→ x_2:   1   |  0    1   -3/5 |
  x_1:   3   |  1    0    2/5 | BFS: x = (3, 1, 0)^t

Choosing the pivots from the columns of $x_2$ and $x_3$ instead leads to the final table
  x_2: 11/2  | 3/2   1    0   |
  x_3: 15/2  | 5/2   0    1   | BFS: x = (0, 11/2, 15/2)^t
b)
$$\begin{cases} 2x_1 - x_2 - 3x_3 + x_4 = 5 \\ x_1 - 2x_2 + x_3 + 2x_4 = 10 \end{cases}$$

        b | x_1  x_2  x_3  x_4 | ratio test
        5 |  2   -1   -3    1  | min{5/1, 10/2} = 5
       10 |  1   -2    1    2  |
The ratio test is satisfied by both rows, so we compare the rows divided by their entries in the $x_4$ column: $(5, 2, -1, -3, 1)$ and $(5, 1/2, -1, 1/2, 1)$. The second is smaller in lexicographical order, so the pivot is chosen in the second row.
        0 | 3/2   0  -7/2   0  | min{0/(3/2), 5/(1/2)} = 0
→ x_4:  5 | 1/2  -1   1/2   1  |
  x_1:  0 |  1    0  -7/3   0  |
  x_4:  5 |  0   -1   5/3   1  | BFS: x = (0, 0, 0, 5)^t, a degenerate BFS
c)
$$\begin{cases} x + 2y - 3z - 2s + 4t = 1 \\ 2x + 5y - 8z - s + 6t = 4 \\ x + 4y - 7z + 5s + 2t = 8 \end{cases}$$

        b   | x    y     z    s    t  | ratio test
        1   | 1    2    -3   -2    4  | min{1/1, 4/2, 8/1} = 1
        4   | 2    5    -8   -1    6  |
        8   | 1    4    -7    5    2  |
→ x:    1   | 1    2    -3   -2    4  |
        2   | 0    1    -2    3   -2  | min{2/3, 7/7} = 2/3 (column of s)
        7   | 0    2    -4    7   -2  |
  x:   7/3  | 1   8/3  -13/3  0   8/3 |
→ s:   2/3  | 0   1/3   -2/3  1  -2/3 |
       7/3  | 0  -1/3    2/3  0   8/3 | min{(7/3)/(2/3)} = 7/2 (column of z)
  x:  35/2  | 1   1/2     0   0   20  |
  s:    3   | 0    0      0   1    2  |
→ z:   7/2  | 0  -1/2     1   0    4  | BFS: x = (35/2, 0, 7/2, 3, 0)^t
2.2 Linear programming problems (LPP)
Example. The diet problem
We want to determine the most economical diet which satisfies the basic minimum nutritional requirements for good health.
We know:
– there are available $n$ different kinds of food: $F_1, \dots, F_n$;
– food $F_j$ sells at a price $c_j$ per unit, $j = \overline{1,n}$;
– there are $m$ basic nutritional ingredients $N_1, \dots, N_m$;
– for a balanced diet each individual must receive at least $b_i$ units of the ingredient $N_i$ per day, $i = \overline{1,m}$;
– each unit of food $F_j$ contains $a_{ij}$ units of the $i^{\text{th}}$ ingredient.
We want:
– the amount $x_j$ of food $F_j$ ($j = \overline{1,n}$) such that the total cost of the diet is as small as possible.
The mathematical model is the following:
– The total cost:
$$f(x_1, x_2, \dots, x_n) = x_1c_1 + x_2c_2 + \dots + x_nc_n \to \text{minimize}$$
– The quantity of the $i^{\text{th}}$ ingredient received by a person is:
$$\underbrace{a_{i1}x_1}_{\text{from } F_1} + \underbrace{a_{i2}x_2}_{\text{from } F_2} + \dots + \underbrace{a_{in}x_n}_{\text{from } F_n} \geq b_i, \quad i = \overline{1,m}.$$
We have to solve the following problem (an example of a linear programming problem):
$$f = \sum_{j=1}^{n} c_jx_j \to \text{minimize (the objective function)}$$
subject to the constraints:
$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n \geq b_1 \\ \dots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n \geq b_m \\ x_j \geq 0,\ j = \overline{1,n} \end{cases}$$
The main characteristic of a LPP is that all the involved functions (the objective function and those which express the constraints) must be linear.
Deﬁnition (General form of a linear programming problem)
A LPP is an optimization (minimization or maximization) problem of the following
form:
Find the optimum (minimum or maximum) of the following function
$$f = \sum_{j=1}^{n} c_jx_j$$
subject to the constraints:
$$\begin{cases} \sum_{j=1}^{n} a_{ij}x_j \leq b_i, & i = \overline{1,p} \\[4pt] \sum_{j=1}^{n} a_{ij}x_j \geq b_i, & i = \overline{p+1,q} \\[4pt] \sum_{j=1}^{n} a_{ij}x_j = b_i, & i = \overline{q+1,m} \\[4pt] x_j \geq 0, & j = \overline{1,n} \end{cases}$$
where $c_j, b_i, a_{ij}$, $j = \overline{1,n}$, $i = \overline{1,m}$, are known real numbers, and $x_j$, $j = \overline{1,n}$, are real numbers to be determined.
Depending on the particular values of $p$ and $q$ we may have inequality constraints of one type or the other, and equality restrictions as well.
Definition (Standard form of a LPP)
Optimize
$$f = \sum_{j=1}^{n} c_jx_j$$
subject to the constraints:
$$\begin{cases} a_{11}x_1 + \dots + a_{1n}x_n = b_1 \\ \dots \\ a_{m1}x_1 + \dots + a_{mn}x_n = b_m \\ x_j \geq 0,\ j = \overline{1,n} \end{cases}$$
where $c_j, b_i, a_{ij}$, $j = \overline{1,n}$, $i = \overline{1,m}$, are known real numbers; $x_j$, $j = \overline{1,n}$, are real numbers to be determined.
We can assume that $b_i \geq 0$, $i = \overline{1,m}$ (otherwise we multiply the equality by $-1$).
Remark. Any LPP can be converted to the standard form by using slack or surplus variables.
a) If $a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n \leq b_i$, then we add to the left side a new variable $y_i \geq 0$ in order to transform the inequality into an equality. We obtain:
$$a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n + y_i = b_i.$$
In this case $y_i$ is called a slack variable.
b) If $a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n \geq b_i$, then we subtract from the left side of the inequality a new variable $y_i \geq 0$ in order to transform the inequality into an equality. We obtain
$$a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n - y_i = b_i.$$
In this case $y_i$ is called a surplus variable.
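The conversion can be sketched as a small helper (the function name, the `senses` encoding, and the example constraints are our illustrative choices):

```python
def to_standard_form(A, b, senses):
    """Append a slack (+1) or surplus (-1) column for every inequality;
    senses[i] is one of '<=', '>=', '='."""
    extra = [i for i, s in enumerate(senses) if s != '=']
    rows = []
    for i, row in enumerate(A):
        cols = [0] * len(extra)
        if senses[i] != '=':
            cols[extra.index(i)] = 1 if senses[i] == '<=' else -1
        rows.append(list(row) + cols)
    return rows, list(b)

# hypothetical constraints: 2x1 + x2 <= 8 and x1 + 3x2 >= 6
A_std, b_std = to_standard_form([[2, 1], [1, 3]], [8, 6], ['<=', '>='])
```

The first constraint gains a slack column (+1), the second a surplus column (−1), yielding the rows [2, 1, 1, 0] and [1, 3, 0, −1].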
Deﬁnition. Any solution of the constraints for which the optimum of the objective
function is obtained is called an optimal solution.
The graphical method for solving a LPP
When a LPP involves only 2 variables it can be solved by graphical procedures.
The graphical approach is extremely helpful in understanding the kinds of phenomena
which can occur in solving linear programming problems. We consider the case n = 2.
The feasible region is the set of points with coordinates $(x, y)$ that satisfy all the constraints. Each constraint (inequality) represents a half-plane on one side of the line whose equation is the corresponding equality. So, the set of feasible solutions is the intersection of these half-planes.
Example 1. Determine the maximum of the function $f(x, y) = 3x + 2y$ subject to
$$\begin{cases} -x + 2y \leq 4 \\ 3x - y \leq 3 \\ x \geq 0,\ y \geq 0 \end{cases}$$
[Figure: the feasible region with corner points (0, 0), (1, 0), (2, 3) and (0, 2), bounded by the lines 3x − y = 3 and −x + 2y = 4, together with the level lines 3x + 2y = 0 and 3x + 2y = 10.]
To solve this problem graphically we ﬁrst shade the region in the graph in which
all the feasible solutions must lie and then shift the position of the objective function
line
f = 3x + 2y.
The objective function is linear so it has level curves that are straight lines of
equation
$$3x + 2y = c, \quad c \text{ constant}.$$
The question is how big can c become so that the line of equation 3x + 2y = c
meets the above polygon somewhere (for a maximum problem).
The objective is to maximize the level curve $3x + 2y = c$. If we fix for the beginning the value of $c$ to be 0, we see that the level curve can be represented as a line of slope $-\frac{3}{2}$ that passes through the origin. Translating this objective line (i.e. moving it without changing its slope) is equivalent to choosing a different value for $c$. When the value of $c$ increases, the corresponding line moves to the right, and hence we are interested in determining the greatest value for $c$ such that the corresponding level curve touches the set of feasible solutions.
Graphically it is not hard to realize that the optimum value is attained at the vertex $(2, 3)$ and the value of the maximum is $f(2, 3) = 12$.
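The corner point method for Example 1 can be carried out in a few lines of pure Python (a sketch: we enumerate the intersections of pairs of boundary lines and keep the feasible ones; exact `Fraction` arithmetic avoids rounding issues):

```python
from fractions import Fraction as F
from itertools import combinations

# constraints a*x + b*y <= c, including x >= 0 and y >= 0
cons = [(F(-1), F(2), F(4)),    # -x + 2y <= 4
        (F(3), F(-1), F(3)),    # 3x -  y <= 3
        (F(-1), F(0), F(0)),    # -x <= 0, i.e. x >= 0
        (F(0), F(-1), F(0))]    # -y <= 0, i.e. y >= 0

def feasible(x, y):
    return all(a * x + b * y <= c for a, b, c in cons)

vertices = []
for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:
        continue                # parallel boundary lines
    x = (c1 * b2 - c2 * b1) / det   # Cramer's rule for the intersection
    y = (a1 * c2 - a2 * c1) / det
    if feasible(x, y):
        vertices.append((x, y))

best = max(vertices, key=lambda v: 3 * v[0] + 2 * v[1])
```

`best` is the vertex $(2, 3)$, with objective value 12.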
Remark (The corner point method for solving a LPP)
The following cases can arise for a maximization problem.
1) If the constraints are such that there is no feasible region, then there is no
solution for the LPP.
2) If the objective function line can be moved indeﬁnitely in a direction that
increases f and still intersects the feasible region, then f approaches ∞.
3) If the objective function line can be moved only a ﬁnite amount by increasing
the value of f (while still intersecting the feasible region) then the last point touched
by the objective function, if it is unique, will give the unique optimal solution. If it
is not unique, then any point on the segment of the boundary last touched gives an
optimal solution. In this case, if $x'$ and $x''$ are the endpoints of the segment, then the general solution is
$$(1 - t)x' + tx'', \quad t \in [0, 1].$$
Example 2.
$$f(x, y) = 6x - 2y \to \text{maximize}$$
subject to
$$\begin{cases} -x + 2y \leq 4 \\ 3x - y \leq 3 \\ x \geq 0,\ y \geq 0 \end{cases}$$
Solution. In this case the level curve 6x−2y = c is parallel to the line 3x−y = 3,
hence the optimal solutions are situated on the segment whose end points are (1,0)
and (2,3).
[Figure: the feasible region with corner points (0, 0), (1, 0), (2, 3) and (0, 2), together with the level lines 6x − 2y = 0 and 6x − 2y = 6, parallel to the edge 3x − y = 3.]
Hence, the general solution is
(1 −t)(1, 0) +t(2, 3) = (1 +t, 3t), t ∈ [0, 1].
Example 3.
$$f(x, y) = 5x + 4y \to \text{maximize}$$
subject to
$$\begin{cases} x + y \leq 2 \\ -2x - 2y \leq -9 \iff x + y \geq \dfrac{9}{2} \\ x \geq 0,\ y \geq 0 \end{cases}$$
Solution. In this case the set of feasible solutions is empty and hence the LPP
has no solution.
[Figure: the parallel lines x + y = 2 (through (2, 0) and (0, 2)) and x + y = 9/2; the corresponding half-planes have empty intersection in the first quadrant.]
Example 4.
$$f(x, y) = x - 4y \to \text{maximize}$$
subject to
$$\begin{cases} 2x - y \geq 1 \\ x + 2y \geq 2 \\ x \geq 0,\ y \geq 0 \end{cases}$$
Solution. In this case we have an unbounded feasible set. The objective function
can become as large as we want so the LPP is unbounded.
[Figure: the unbounded feasible region in the first quadrant determined by the lines 2x − y = 1 and x + 2y = 2.]
Remark. A similar discussion can be made regarding a minimum LPP.
The Simplex algorithm
The Simplex algorithm was developed by George B. Dantzig; it was first used for military purposes and, after the Second World War, in the business world. In the 1970s the Simplex algorithm was used to optimize production, profits and costs, and in game theory. George B. Dantzig is considered to be one of the three founders of linear programming, together with John von Neumann and Leonid Kantorovich.
We will analyze only minimum problems; the results concerning maximum problems will only be stated.
Consider a LPP in standard form:
$$f = c_1x_1 + \dots + c_nx_n \to \text{minimize}$$
subject to
$$\begin{cases} a_{11}x_1 + \dots + a_{1n}x_n = b_1 \\ \quad\vdots \\ a_{m1}x_1 + \dots + a_{mn}x_n = b_m \\ x_1 \geq 0, \dots, x_n \geq 0 \end{cases}$$
As we have already discussed, we can assume that rank $A = m$ and $b_j \geq 0$, $j = \overline{1,m}$.
Theorem (The fundamental theorem of linear programming)
Consider a LPP in standard form.
a) If there is no optimal solution the problem is either infeasible or unbounded.
b) If there is an optimal solution then there is an optimal basic feasible solution.
Remark. The previous theorem assures us that it is suﬃcient to consider only
BFS in our search for optimal solutions.
The idea of the simplex method is to start from one basic feasible solution of the
constraints set and transform it into another one in order to decrease the value of the
objective function until a minimum is reached.
We need a criterion to decide when the objective function cannot be decreased any further; in that case we have found an optimal solution and no more iterative steps are needed.
Let $x$ be an arbitrary BFS associated to the first part of the next table (the table looks this way possibly after renumbering the equations and the unknowns).
            b    | x_1          x_2  ...  x_m | x_{m+1}    ...  x_n
← x_1       β_1  | 1            0    ...  0   | α_{1,m+1}  ...  α_{1n}
  x_2       β_2  | 0            1    ...  0   | α_{2,m+1}  ...  α_{2n}
  ⋮         ⋮    |                            |
  x_m       β_m  | 0            0    ...  1   | α_{m,m+1}  ...  α_{mn}
→ x_{m+1}   β'_1 | 1/α_{1,m+1}  0    ...  0   | 1          ...  α_{1n}/α_{1,m+1}
  x_2       β'_2 | ·            1    ...  0   | 0          ...
  ⋮         ⋮    |                            |
  x_m       β'_m | ·            0    ...  1   | 0          ...
The basic variables are: $x_1, x_2, \dots, x_m$.
The nonbasic variables are: $x_{m+1}, \dots, x_n$.
The first part of the previous table gives us the following BFS:
$$x = (\beta_1, \beta_2, \dots, \beta_m, 0, \dots, 0)^t.$$
For each $j = \overline{1,n}$, we define
$$f_j = \sum_{i=1}^{m} c_i\,\alpha_{ij},$$
where $c_i$, $i = \overline{1,m}$, are the coefficients of the objective function.
For instance, we have
$$f_1 = c_1 \cdot 1 + c_2 \cdot 0 + \dots + c_m \cdot 0 = c_1$$
$$\vdots$$
$$f_m = c_1 \cdot 0 + c_2 \cdot 0 + \dots + c_m \cdot 1 = c_m$$
$$f_{m+1} = c_1\alpha_{1,m+1} + c_2\alpha_{2,m+1} + \dots + c_m\alpha_{m,m+1}$$
$$\vdots$$
$$f_n = c_1\alpha_{1n} + c_2\alpha_{2n} + \dots + c_m\alpha_{mn}$$
We are now able to present the main result.
Theorem (The optimality criterion for a minimum LPP). If for a basic feasible solution we have c_j − f_j ≥ 0, ∀ j = 1,n, then the solution is optimal.
Proof. Let x be a BFS for which c_j − f_j ≥ 0, ∀ j = 1,n.
We want to prove that any new BFS x̄, obtained by choosing a new pivot from the remaining columns (from m+1 to n), isn't better than x, that is, f(x̄) ≥ f(x).
Suppose we choose α_{1,m+1} as a pivot from the (m+1)-th column, so α_{1,m+1} satisfies the following conditions:

α_{1,m+1} > 0
β_1/α_{1,m+1} = min_k { β_k/α_{k,m+1} | α_{k,m+1} > 0 }
the lexicographical order is respected
By choosing the new pivot, x_1 leaves the basis and becomes a nonbasic variable, and x_{m+1} enters the basis and becomes a basic variable.
We now determine the new BFS:

β_1 → β′_1 = β_1/α_{1,m+1}
β_2 → β′_2 = β_2 − (β_1/α_{1,m+1}) α_{2,m+1}
β_3 → β′_3 = β_3 − (β_1/α_{1,m+1}) α_{3,m+1}
⋮
β_m → β′_m = β_m − (β_1/α_{1,m+1}) α_{m,m+1}

The basic variables for x̄ are x_2 = β′_2, …, x_m = β′_m and x_{m+1} = β′_1.
The nonbasic variables for x̄ are x_1 = x_{m+2} = ⋯ = x_n = 0.
Hence,

x̄ = (0, β′_2, …, β′_m, β′_1, 0, …, 0)^t.
It remains for us to compute and compare f(x) and f(x̄).
f(x) = c_1 β_1 + ⋯ + c_m β_m + c_{m+1}·0 + ⋯ + c_n·0
     = c_1 β_1 + ⋯ + c_m β_m

f(x̄) = c_1·0 + c_2 (β_2 − (β_1/α_{1,m+1}) α_{2,m+1}) + ⋯
      + c_m (β_m − (β_1/α_{1,m+1}) α_{m,m+1}) + c_{m+1} β_1/α_{1,m+1} + c_{m+2}·0 + ⋯ + c_n·0
      = c_2 β_2 + ⋯ + c_m β_m + (β_1/α_{1,m+1}) [ c_{m+1} − (c_2 α_{2,m+1} + ⋯ + c_m α_{m,m+1}) ],

where c_2 β_2 + ⋯ + c_m β_m = f(x) − c_1 β_1 and c_2 α_{2,m+1} + ⋯ + c_m α_{m,m+1} = f_{m+1} − c_1 α_{1,m+1}, so

f(x̄) = f(x) − c_1 β_1 + (β_1/α_{1,m+1}) (c_{m+1} − f_{m+1} + c_1 α_{1,m+1})
      = f(x) − c_1 β_1 + (β_1/α_{1,m+1}) (c_{m+1} − f_{m+1}) + c_1 β_1.

So,

f(x̄) = f(x) + (β_1/α_{1,m+1}) (c_{m+1} − f_{m+1}) ≥ f(x),

since β_1/α_{1,m+1} ≥ 0 and c_{m+1} − f_{m+1} ≥ 0.
In conclusion, the basic feasible solution x cannot be improved (so it is optimal).
This completes the proof.
From the previous theorem we get the following obvious corollary.
Corollary. If there exists l ∈ {1, …, n} with c_l − f_l < 0 for a basic feasible solution, the value of the objective function can be decreased by choosing x_l as a basic variable.
The following two theorems characterize situations when either an optimal solution
does not exist or when an existing optimal solution is not uniquely determined.
Theorem. If the inequality c_l − f_l < 0 holds for a nonbasic variable x_l and x_l cannot become a basic variable (the entire column of x_l is nonpositive, so we cannot choose a pivot from its column), then the LPP does not have an optimal solution.
In this case, the objective function value is unbounded from below, and we can stop our computation.
Theorem. If there is l ∈ {1, …, n} such that c_l − f_l = 0 for an optimal solution and x_l is a nonbasic variable which can become a basic variable (there is at least one positive element in its column), then there exists another optimal basic feasible solution (obtained by choosing x_l as a basic variable).
Indeed, if the assumptions of the previous theorem are satisfied, we can perform a further pivoting step with x_l as entering variable, and there is at least one basic variable which can be chosen as leaving variable. However, due to c_l − f_l = 0, the objective function value does not change.
Regarding a minimum LPP we have the following conclusions (for a BFS denoted by x):
1) If c_j − f_j ≥ 0 for each j ∈ {1, …, n}, then x is an optimal solution and f_min = f(x).
2) If there is j ∈ {1, …, n} such that c_j − f_j < 0 and J_1 = {k = 1,m : α_{kj} > 0} = ∅, then the LPP is unbounded from below; f_min = −∞.
3) If there is j ∈ {1, …, n} such that c_j − f_j < 0, then x is not an optimal solution. In this case we obtain a better solution x̄ (f(x̄) < f(x)) by choosing a pivot from x_j's column.
4) If c_j − f_j ≥ 0 for each j ∈ {1, …, n} and there is l ∈ {1, …, n} such that c_l − f_l = 0 and x_l is a nonbasic variable, then the solution x̄, obtained by choosing a pivot from x_l's column, is optimal too.
Regarding a maximum LPP we have the following conclusions (for a BFS denoted by x):
1) If c_j − f_j ≤ 0 for each j ∈ {1, …, n}, then x is an optimal solution and f_max = f(x).
2) If there is j ∈ {1, …, n} such that c_j − f_j > 0 and J_1 = {k = 1,m : α_{kj} > 0} = ∅, then the LPP is unbounded from above; f_max = +∞.
3) If there is j ∈ {1, …, n} such that c_j − f_j > 0, then x is not an optimal solution. In this case we obtain a better solution x̄ (f(x̄) > f(x)) by choosing a pivot from x_j's column.
4) If c_j − f_j ≤ 0 for each j ∈ {1, …, n} and there is l ∈ {1, …, n} such that c_l − f_l = 0 and x_l is a nonbasic variable, then the solution x̄, obtained by choosing a pivot from x_l's column, is optimal too.
Based on the results above, we can summarize the simplex algorithm as follows.
Assume that we have some current basic feasible solution. The corresponding simplex tableau is given below.
   c           c_1      …  c_j      …  c_l      …  c_n       ratio
c_B    B   b   x_1      …  x_j      …  x_l      …  x_n       test
 ⋮
c_i   x_i  b_i α_{i1}   …  α_{ij}   …  α_{il}   …  α_{in}
 ⋮
c_k   x_k  b_k α_{k1}   …  α_{kj}   …  α_{kl}   …  α_{kn}
 ⋮
      f_j f(x) f_1      …  f_j      …  f_l      …  f_n
 c_j − f_j  −  c_1−f_1  …  c_j−f_j  …  c_l−f_l  …  c_n−f_n
In the first row and column of the previous table we write the coefficients of the corresponding variables in the objective function. The values

f_j = ∑_{k=1}^m c_k α_{kj}, j = 1,n,

are obtained by adding the corresponding products between the elements of column c_B and column x_j.
The simplex algorithm
1st step. Determine a BFS.
2nd step. Check the optimality of the current BFS.
3rd step. If the LPP is unbounded, exit the algorithm.
If the current BFS is optimal and unique, exit the algorithm.
If the current BFS is optimal and not unique, determine another optimal solution.
If the current BFS is not optimal, improve it.
4th step. Repeat steps 2 and 3 until all the optimal solutions are obtained.
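The steps above can be sketched as a short program. The following is a minimal, exact-arithmetic sketch (function name and structure are ours, not from the text); it assumes a standard-form problem together with an initial BFS whose columns of A form an identity matrix, applies the optimality criterion c_j − f_j ≥ 0, and performs the ratio test. It finds only one optimal solution and does not enumerate alternative optima.

```python
from fractions import Fraction as F

def simplex_min(c, A, b, basis):
    """Tableau simplex for: minimize c·x subject to A x = b, x >= 0.
    `basis` lists the indices of an initial BFS whose columns of A form
    an identity matrix.  Returns (f_min, x), or None when the problem
    is unbounded from below."""
    m, n = len(A), len(c)
    T = [[F(v) for v in row] for row in A]
    b = [F(v) for v in b]
    c = [F(v) for v in c]
    while True:
        cB = [c[j] for j in basis]
        # step 2: reduced costs c_j - f_j, with f_j = sum_i c_B[i] * alpha_ij
        red = [c[j] - sum(cB[i] * T[i][j] for i in range(m)) for j in range(n)]
        if all(r >= 0 for r in red):                 # optimality criterion
            x = [F(0)] * n
            for i, j in enumerate(basis):
                x[j] = b[i]
            return sum(c[j] * x[j] for j in range(n)), x
        l = min(range(n), key=lambda j: red[j])      # entering variable
        rows = [i for i in range(m) if T[i][l] > 0]
        if not rows:                                 # whole column nonpositive
            return None                              # f_min = -infinity
        r = min(rows, key=lambda i: b[i] / T[i][l])  # ratio test -> leaving row
        p = T[r][l]                                  # pivot element
        T[r] = [v / p for v in T[r]]
        b[r] /= p
        for i in range(m):
            if i != r and T[i][l] != 0:
                q = T[i][l]
                T[i] = [T[i][j] - q * T[r][j] for j in range(n)]
                b[i] -= q * b[r]
        basis[r] = l

# minimize -x1 - 2x2 with slack variables x3, x4, x5 as the initial basis
# (this is the standard-form problem solved as Example 3 in the duality section)
fmin, x = simplex_min([-1, -2, 0, 0, 0],
                      [[-2, 1, 1, 0, 0],
                       [-1, 2, 0, 1, 0],
                       [ 1, 0, 0, 0, 1]],
                      [2, 7, 3], [2, 3, 4])
print(fmin, x)   # f_min = -13, x = (3, 5, 3, 0, 0)
```

On an unbounded problem (for instance the one treated in Example 2 below) the function returns None as soon as some column with c_j − f_j < 0 contains no positive pivot candidate.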
Examples
1) A firm intends to manufacture three types of products P_1, P_2 and P_3 so that the total production cost does not exceed 32000 EUR. There are 400 working hours available and 30 units of raw materials may be used. Additionally, the data presented in the table below are given.

Product                              P_1    P_2    P_3
Selling price (EUR/piece)           1600   3000   5200
Production cost (EUR/piece)         1000   2000   4000
Required raw material (per piece)      3      2      2
Working time (hours per piece)        20     10     20
The objective is to determine the quantities of each product so that the profit is maximized. Let x_i be the number of produced pieces of P_i, i ∈ {1, 2, 3}. We can formulate the above problem as an LPP as follows.
The objective function is obtained by subtracting the production cost from the selling price and dividing the resulting profit by 100 for each product:

f(x_1, x_2, x_3) = (1/100)(1600x_1 + 3000x_2 + 5200x_3 − 1000x_1 − 2000x_2 − 4000x_3)
                 = 6x_1 + 10x_2 + 12x_3 → maximize
The constraint on the production cost can be divided by 1000:

1000x_1 + 2000x_2 + 4000x_3 ≤ 32000  | : 1000

and we obtain

x_1 + 2x_2 + 4x_3 ≤ 32.

The constraint on the working time can be divided by 10:

20x_1 + 10x_2 + 20x_3 ≤ 400  | : 10

and we obtain

2x_1 + x_2 + 2x_3 ≤ 40.

The constraint on raw materials is the following:

3x_1 + 2x_2 + 2x_3 ≤ 30.
So, we get the following LPP, written in general form:

f = 6x_1 + 10x_2 + 12x_3 → max

subject to

x_1 + 2x_2 + 4x_3 ≤ 32
3x_1 + 2x_2 + 2x_3 ≤ 30
2x_1 + x_2 + 3x_3 ≤ 40
x_1, x_2, x_3 ≥ 0
Introducing now in the i-th constraint, i = 1,3, the slack variable x_{3+i} ≥ 0, we obtain the standard form and the following table.

f = 6x_1 + 10x_2 + 12x_3 → max

subject to

x_1 + 2x_2 + 4x_3 + x_4 = 32
3x_1 + 2x_2 + 2x_3 + x_5 = 30
2x_1 + x_2 + 3x_3 + x_6 = 40
x_1, …, x_6 ≥ 0
   c            6    10    12     0     0    0
c_B    B    b  x_1   x_2   x_3   x_4   x_5  x_6   ratio test
 0  ← x_4  32    1     2     4     1     0    0   min{32/4, 30/2, 40/3} = 8
 0    x_5  30    3     2     2     0     1    0
 0    x_6  40    2     1     3     0     0    1
      f_j   0    0     0     0     0     0    0
 c_j − f_j  −    6    10    12     0     0    0
12  → x_3   8   1/4   1/2    1    1/4    0    0   min{8/(1/2), 14/1} = 14
 0  ← x_5  14   5/2    1     0   −1/2    1    0
 0    x_6  16   5/4  −1/2    0   −3/4    0    1
      f_j  96    3     6    12     3     0    0
 c_j − f_j  −    3     4     0    −3     0    0
12    x_3   1   −1     0     1    1/2  −1/2   0
10    x_2  14   5/2    1     0   −1/2    1    0
 0    x_6  23   5/2    0     0    −1    1/2   1
      f_j 152   13    10    12     1     4    0
 c_j − f_j  −   −7     0     0    −1    −4    0
Since all coefficients c_j − f_j are now nonpositive, we get the following optimal solution from the last table:

x_1 = 0, x_2 = 14, x_3 = 1, x_4 = 0, x_5 = 0, x_6 = 23.

Since there is no nonbasic variable for which c_j − f_j = 0, the solution is unique. This means that the optimal solution is to produce no piece of product P_1, 14 pieces of product P_2 and one piece of product P_3. Taking into account that the coefficients of the objective function were divided by 100, we get a total profit of 15200 EUR.
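The optimum just read off the tableau can be double-checked directly against the constraints and the objective (a plain arithmetic check added for illustration, not part of the original text):

```python
x1, x2, x3 = 0, 14, 1            # optimal production plan found above
assert x1 + 2*x2 + 4*x3 <= 32    # production cost (in 1000 EUR)
assert 3*x1 + 2*x2 + 2*x3 <= 30  # raw material
assert 2*x1 + x2 + 3*x3 <= 40    # working time
f = 6*x1 + 10*x2 + 12*x3
print(f, 100 * f)                # 152, i.e. a profit of 15200 EUR
```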
2) We consider the following LPP:

f = −2x_1 − 2x_2 → min

subject to

x_1 − x_2 ≥ −1
−x_1 + 2x_2 ≤ 4
x_1, x_2 ≥ 0.
Solution. First, we transform the given problem into the standard form, i.e. we multiply the first constraint by −1 and introduce the slack variables x_3 and x_4. We obtain:

f = −2x_1 − 2x_2 → min

subject to

−x_1 + x_2 + x_3 = 1
−x_1 + 2x_2 + x_4 = 4
x_1, x_2, x_3, x_4 ≥ 0
   c           −2    −2     0     0
c_B    B    b  x_1   x_2   x_3   x_4   ratio test
 0  ← x_3   1   −1     1     1     0   min{1/1, 4/2} = 1
 0    x_4   4   −1     2     0     1
      f_j   0    0     0     0     0
 c_j − f_j  −   −2    −2     0     0
−2  → x_2   1   −1     1     1     0
 0  ← x_4   2    1     0    −2     1
      f_j  −2    2    −2    −2     0
 c_j − f_j  −   −4     0     2     0
−2    x_2   3    0     1    −1     1
−2  → x_1   2    1     0    −2     1
      f_j −10   −2    −2     6    −4
 c_j − f_j  −    0     0    −6     4
Since there is only one negative coefficient of a nonbasic variable in the objective row, variable x_3 should be chosen as entering variable. However, there are only negative elements in the column belonging to x_3. This means that we cannot perform a further pivoting step, and so there does not exist an optimal solution of the minimization problem considered, i.e. the objective function value is unbounded from below (f_min = −∞).
3) We consider the following LPP:

f = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 → min

subject to

2x_1 + x_2 + x_3 ≥ 4000
x_2 + 2x_4 + x_5 ≥ 5000
x_3 + 2x_5 + 3x_6 ≥ 3000
x_1, x_2, x_3, x_4, x_5, x_6 ≥ 0
Solution. To get the standard form, we notice that in each constraint there is one variable that occurs only in this constraint. Therefore, we divide the first constraint by the coefficient 2 of variable x_1, the second constraint by 2 and the third constraint by 3. Then, we introduce a surplus variable in each of the constraints and obtain the standard form:
f = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 → min

x_1 + (1/2)x_2 + (1/2)x_3 − x_7 = 2000
(1/2)x_2 + x_4 + (1/2)x_5 − x_8 = 2500
(1/3)x_3 + (2/3)x_5 + x_6 − x_9 = 1000
x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9 ≥ 0
   c             1    1     1    1    1     1    0    0     0
c_B    B    b   x_1  x_2   x_3  x_4  x_5   x_6  x_7  x_8   x_9   ratio test
 1    x_1 2000   1   1/2   1/2   0    0     0   −1    0     0
 1    x_4 2500   0   1/2    0    1   1/2    0    0   −1     0    min{2500/(1/2), 1000/(2/3)}
 1  ← x_6 1000   0    0    1/3   0   2/3    1    0    0    −1    = min{5000, 1500} = 1500
      f_j 5500   1    1    5/6   1   7/6    1   −1   −1    −1
 c_j − f_j   −   0    0    1/6   0  −1/6    0    1    1     1
 1    x_1 2000   1   1/2   1/2   0    0     0   −1    0     0    min{2000/(1/2), 1750/(1/2)}
 1  ← x_4 1750   0   1/2  −1/4   1    0   −3/4   0   −1    3/4   = 3500
 1  → x_5 1500   0    0    1/2   0    1    3/2   0    0   −3/2
      f_j 5250   1    1    3/4   1    1    3/4  −1   −1   −3/4
 c_j − f_j   −   0    0    1/4   0    0    1/4   1    1    3/4
Now all the coefficients in the objective row are nonnegative, and from the latter tableau we obtain the following optimal solution:

x_1 = 2000, x_2 = x_3 = 0, x_4 = 1750, x_5 = 1500, x_6 = 0,

with the optimal objective function value f_min = 5250.
Notice that the optimal solution is not uniquely determined. In the last tableau, there is one coefficient in the objective row equal to zero (this coefficient corresponds to the nonbasic variable x_2). Taking x_2 as entering variable, the ratio test determines x_4 as the leaving variable, and we get:
 1    x_1  250   1    0    3/4  −1    0    3/4  −1    1   −3/4
 1  → x_2 3500   0    1   −1/2   2    0   −3/2   0   −2    3/2
 1    x_5 1500   0    0    1/2   0    1    3/2   0    0   −3/2
      f_j 5250   1    1    3/4   1    1    3/4  −1   −1   −3/4
 c_j − f_j   −   0    0    1/4   0    0    1/4   1    1    3/4
So, we obtain the following basic feasible solution:

x_1 = 250, x_2 = 3500, x_3 = x_4 = 0, x_5 = 1500, x_6 = 0,

with the same objective function value f_min = 5250.
The general solution is:

x(t) = (1 − t)(2000, 0, 0, 1750, 1500, 0)^t + t(250, 3500, 0, 0, 1500, 0)^t
     = (2000 − 1750t, 3500t, 0, 1750 − 1750t, 1500, 0)^t,  t ∈ [0, 1].
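Since both endpoints are optimal BFS, every point of the segment x(t) should be feasible with the same objective value 5250. A quick exact check with fractions (added for illustration, not part of the text):

```python
from fractions import Fraction as F

x_a = (2000, 0, 0, 1750, 1500, 0)
x_b = (250, 3500, 0, 0, 1500, 0)
for k in range(11):                        # t = 0, 1/10, ..., 1
    t = F(k, 10)
    x = [(1 - t) * a + t * b for a, b in zip(x_a, x_b)]
    assert 2*x[0] + x[1] + x[2] >= 4000    # first constraint
    assert x[1] + 2*x[3] + x[4] >= 5000    # second constraint
    assert x[2] + 2*x[4] + 3*x[5] >= 3000  # third constraint
    assert sum(x) == 5250                  # objective value is constant
print("segment of optimal solutions verified")
```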
Matrix form of the simplex method
We now derive the formulas in matrix-vector form for the linear programming problem. In vector notation the standard problem becomes:

f(x) = c^t x → minimize

subject to

Ax = b, x ≥ 0.

Here x is an n-dimensional column vector, c^t is an n-dimensional row vector, named the cost vector (the symbol c^t means the transpose of the vector c), A is an m×n matrix and b is an m-dimensional column vector. The vector inequality x ≥ 0 means that each component of x is nonnegative.
Let x be a basic feasible solution with the variables ordered so that

x = (x_B, 0)^t, x_B ∈ R^m, 0 ∈ R^{n−m}, x_B ≥ 0,

where x_B is the vector of basic variables and 0 is the vector of nonbasic variables. In the same way, after the same permutations of columns, the matrix A can be decomposed as

A = [B N].

B is the submatrix consisting of the m columns of A corresponding to the basic variables. These columns are linearly independent and hence the columns of B form a basis for R^m. The matrix B is invertible.
The equation Ax = b is equivalent to

[B N] (x_B, 0)^t = b,  Bx_B = b,  x_B = B^{−1}b.

The cost of such a vector is

f(x) = [c_B^t c_N^t] (x_B, 0)^t = c_B^t x_B = c_B^t B^{−1} b.
The basic step of the simplex method consists in moving to another basic feasible solution such that the cost is lowered.
The change from x = (x_B, 0)^t to x̄ = (x̄_B, x̄_N)^t must satisfy the following conditions:
1) Ax̄ = b,
2) f(x̄) < f(x),
3) x̄ ≥ 0.
The first condition is equivalent to

x̄_B = x_B − B^{−1} N x̄_N.
Indeed, the equality

[B N] (x̄_B, x̄_N)^t = b

implies

B x̄_B + N x̄_N = b,

and hence

x̄_B = B^{−1}(b − N x̄_N) = x_B − B^{−1} N x̄_N.
The new cost f(x̄) will be

f(x̄) = c^t x̄ = [c_B^t c_N^t] (x̄_B, x̄_N)^t = [c_B^t c_N^t] (x_B − B^{−1}N x̄_N, x̄_N)^t
      = c_B^t (x_B − B^{−1} N x̄_N) + c_N^t x̄_N
      = (c_N^t − c_B^t B^{−1} N) x̄_N + c_B^t x_B
      = (c_N^t − c_B^t B^{−1} N) x̄_N + f(x).
The sign of (c_N^t − c_B^t B^{−1} N) x̄_N will show if it is possible to decrease the cost by moving to the new vector x̄.
The vector c_N^t − c_B^t B^{−1} N is called the vector of reduced costs or the relative cost vector (for nonbasic variables). If all the components of the vector of reduced costs are nonnegative, we are not able to lower the cost anymore and x is optimal.
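These formulas can be illustrated numerically. The sketch below (the `solve` helper is ours, not from the text) computes y^t = c_B^t B^{−1} by solving B^t y = c_B exactly, then forms the reduced costs c_N^t − c_B^t B^{−1} N column by column; the data are those of Example 3 in the duality section, at its optimal basis {x_2, x_1, x_3}, so the result can be compared with the tableau there.

```python
from fractions import Fraction as F

def solve(M, v):
    """Solve the square system M z = v exactly by Gauss-Jordan elimination."""
    n = len(M)
    A = [[F(M[i][j]) for j in range(n)] + [F(v[i])] for i in range(n)]
    for k in range(n):
        p = next(i for i in range(k, n) if A[i][k] != 0)   # pivot row
        A[k], A[p] = A[p], A[k]
        A[k] = [a / A[k][k] for a in A[k]]                 # normalize pivot row
        for i in range(n):
            if i != k and A[i][k] != 0:                    # eliminate column k
                A[i] = [a - A[i][k] * bk for a, bk in zip(A[i], A[k])]
    return [A[i][n] for i in range(n)]

# columns of A in standard form; x1, x2, x3 are basic at the optimum
cols = {1: (-2, -1, 1), 2: (1, 2, 0), 3: (1, 0, 0), 4: (0, 1, 0), 5: (0, 0, 1)}
cost = {1: -1, 2: -2, 3: 0, 4: 0, 5: 0}
basic = [2, 1, 3]                                 # basis in tableau order

# row k of B^t is the column of A belonging to the k-th basic variable
y = solve([list(cols[j]) for j in basic], [cost[j] for j in basic])
reduced = {j: cost[j] - sum(yi * aij for yi, aij in zip(y, cols[j]))
           for j in (4, 5)}                       # c_N^t - c_B^t B^{-1} N
print(y, reduced)                                 # [0, -1, -2] {4: 1, 5: 2}
```

Both reduced costs are nonnegative, so the basis is optimal; the vector y computed here is in fact the optimal dual solution, anticipating the duality theory below.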
Duality theory
Linear programming is based on the theory of duality. To each primal linear programming problem we can assign a dual linear programming problem. This section formulates and discusses the relationships between the primal and dual problems, which are important for optimality conditions, and offers a meaningful economic interpretation of the optimization model.
Motivation
We begin with an example.
Example 1.

f = x_1 + 3x_2 → minimize

subject to

x_1 + 3x_2 ≥ 4
4x_1 − x_2 ≥ 1
x_2 ≥ 3
x_1, x_2 ≥ 0
First we observe that every feasible solution provides an upper bound of the optimal objective function value f_min. For example, the solution (x_1, x_2) = (1, 3) tells us f_min ≤ 1 + 3·3 = 10. But how close is this bound to the optimal value? To answer, we need to give lower bounds.
By multiplying the third constraint by 13 and adding it to the sum of the first two constraints we get

x_1 + 3x_2 + 4x_1 − x_2 + 13x_2 ≥ 4 + 1 + 39,

which is equivalent to

5x_1 + 15x_2 ≥ 44.

Hence

44/5 ≤ f_min ≤ 10.
To get a better lower bound, we apply the same lower bounding technique again, but we replace the numbers used before with variables. So, we multiply the three constraints by nonnegative numbers y_1, y_2 and y_3.
Hence

y_1(x_1 + 3x_2) + y_2(4x_1 − x_2) + y_3 x_2 ≥ 4y_1 + y_2 + 3y_3

and

x_1(y_1 + 4y_2) + x_2(3y_1 − y_2 + y_3) ≥ 4y_1 + y_2 + 3y_3.
If we stipulate that each of the coefficients of the x_i is at most as large as the corresponding coefficient in the objective function,

y_1 + 4y_2 ≤ 1
3y_1 − y_2 + y_3 ≤ 3,

then

f = x_1 + 3x_2 ≥ 4y_1 + y_2 + 3y_3.
We now have a lower bound, 4y_1 + y_2 + 3y_3, which we should maximize in our effort to obtain the best possible lower bound.
Therefore, we are led to the following optimization problem:

g = 4y_1 + y_2 + 3y_3 → maximize

subject to

y_1 + 4y_2 ≤ 1
3y_1 − y_2 + y_3 ≤ 3
y_1, y_2, y_3 ≥ 0
This problem is called the dual linear programming problem associated to the
given linear programming problem. Next, we will deﬁne the dual linear programming
problem in general.
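The bounding argument above can be checked numerically: the multipliers (1, 1, 13), scaled by 1/5 so that the stipulated coefficient conditions hold, give a feasible point of the maximization problem, and its value 44/5 is indeed a lower bound for f at the feasible point (1, 3). This check is our illustration, not part of the original text.

```python
from fractions import Fraction as F

x = (1, 3)                         # feasible for the original problem
y = (F(1, 5), F(1, 5), F(13, 5))   # multipliers (1, 1, 13) scaled by 1/5
A = [(1, 3), (4, -1), (0, 1)]      # rows of the constraint matrix
b = (4, 1, 3)
c = (1, 3)

# primal feasibility: A x >= b, x >= 0
assert all(sum(a * xj for a, xj in zip(row, x)) >= bi for row, bi in zip(A, b))
assert all(xj >= 0 for xj in x)
# dual feasibility: A^t y <= c, y >= 0
assert all(sum(A[i][j] * y[i] for i in range(3)) <= c[j] for j in range(2))
assert all(yi >= 0 for yi in y)

f = sum(ci * xi for ci, xi in zip(c, x))   # objective value at x
g = sum(bi * yi for bi, yi in zip(b, y))   # lower bound produced by y
assert g <= f                              # every such g bounds f_min from below
print(g, f)                                # 44/5 and 10
```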
The dual problem. Symmetric form
Given a LPP in the form

(P) f = ∑_{j=1}^n c_j x_j → minimize

subject to

∑_{j=1}^n a_{ij} x_j ≥ b_i, i = 1, 2, …, m
x_j ≥ 0, j = 1, 2, …, n,

the associated dual linear programming problem is given by

(D) g = ∑_{i=1}^m b_i y_i → maximize

subject to

∑_{i=1}^m a_{ij} y_i ≤ c_j, j = 1, 2, …, n
y_i ≥ 0, i = 1, 2, …, m.
Since we started with the LPP (P), it is called the primal problem.
If we use the matrix notation in the form

[ A  b ]
[ f  ∗ ]

we have
i) the minimization problem (P)

[ a_11 … a_1n  b_1 ]
[  ⋮        ⋮   ⋮  ]
[ a_m1 … a_mn  b_m ]
[ c_1  …  c_n   ∗  ]

ii) the maximization problem (D)

[ a_11 … a_m1  c_1 ]
[  ⋮        ⋮   ⋮  ]
[ a_1n … a_mn  c_n ]
[ b_1  …  b_m   ∗  ]
Example 2. a) Find the dual of the given linear programming problem:

f = 6x_1 + 5x_2 + 7x_3 → minimize

subject to

3x_1 + x_2 + 2x_3 ≥ 3
2x_1 + 2x_2 − x_3 ≥ 5
x_1 + 2x_2 + x_3 ≥ 2
x_1, x_2, x_3 ≥ 0
Solution.

(D) g = 3y_1 + 5y_2 + 2y_3 → maximize

subject to

3y_1 + 2y_2 + y_3 ≤ 6
y_1 + 2y_2 + 2y_3 ≤ 5
2y_1 − y_2 + y_3 ≤ 7
y_1, y_2, y_3 ≥ 0
b) The dual of the diet problem
The diet problem is the problem faced by a dietician who must select a combination of foods to meet certain nutritional requirements at minimum cost. This problem has the form (see the first example in section 2.2)

f = ∑_{j=1}^n c_j x_j → minimize

subject to

∑_{j=1}^n a_{ij} x_j ≥ b_i, i = 1,m
x_j ≥ 0, j = 1,n.
The dual problem is

g = ∑_{i=1}^m b_i y_i → maximize

subject to

∑_{i=1}^m a_{ij} y_i ≤ c_j, j = 1,n
y_i ≥ 0, i = 1,m.
We describe an interpretation of the dual problem. Imagine a pharmaceutical company that produces the nutrients considered important by the dietician. The problem is to determine the positive unit prices y_1, y_2, …, y_m for the nutrients in order to maximize the revenue ∑_{i=1}^m b_i y_i, while at the same time being competitive with real food. To be competitive with real food, the cost of the nutrient content of a unit of food j, as priced by the pharmaceutical company, must be at most c_j (∑_{i=1}^m a_{ij} y_i ≤ c_j).
Remark 1. The dual of the dual symmetric problem is the primal problem.
Proof. We must first write the dual problem in the form (P). To change a maximization into a minimization, we note that

max ∑_{i=1}^m b_i y_i = −min ∑_{i=1}^m (−b_i) y_i.
To change the direction of the inequalities, we simply multiply by −1.
The resulting equivalent representation of the dual problem in the form (P) then is:

minimize ∑_{i=1}^m (−b_i) y_i

subject to

∑_{i=1}^m (−a_{ij}) y_i ≥ −c_j, j = 1, …, n
y_i ≥ 0, i = 1, 2, …, m.
Now we take its dual:

maximize ∑_{j=1}^n (−c_j) x_j, i.e. minimize ∑_{j=1}^n c_j x_j,

subject to

∑_{j=1}^n (−a_{ij}) x_j ≤ −b_i, i = 1, 2, …, m
x_j ≥ 0, j = 1, 2, …, n,
which is clearly equivalent to the primal problem (P).
It is always possible to obtain the dual of a LPP consisting of a mixture of equations, inequalities (in either direction), nonnegative variables and variables unrestricted in sign by changing the system to an equivalent system of form (P). However, an easier way is to apply certain rules, presented below.
Primal                               Dual
Minimize primal objective            Maximize dual objective
Objective coefficients               Right-hand side (RHS) of dual
RHS of primal                        Objective coefficients
Coefficient matrix                   Transposed coefficient matrix

Primal relation                      Dual variable
i-th inequality: ≥                   y_i ≥ 0
i-th inequality: ≤                   y_i ≤ 0
i-th equation: =                     y_i unrestricted in sign

Primal variable                      Dual relation
x_j ≥ 0                              j-th inequality: ≤
x_j ≤ 0                              j-th inequality: ≥
x_j unrestricted in sign             j-th equation: =
The dual of a standard form
Applying the correspondence rules of the previous table, the dual of the standard form can be easily obtained.
Thus, the primal problem for the standard linear problem is

(P′) f = c_1 x_1 + ⋯ + c_n x_n → minimize

subject to

a_11 x_1 + a_12 x_2 + ⋯ + a_1n x_n = b_1
…
a_m1 x_1 + a_m2 x_2 + ⋯ + a_mn x_n = b_m
x_1, x_2, …, x_n ≥ 0,
and the dual problem for the standard LPP is

(D′) g = b_1 y_1 + b_2 y_2 + ⋯ + b_m y_m → maximize

subject to

a_11 y_1 + a_21 y_2 + ⋯ + a_m1 y_m ≤ c_1
…
a_1n y_1 + a_2n y_2 + ⋯ + a_mn y_m ≤ c_n
y_1, …, y_m unrestricted in sign.
The matrix form of the previous problem is the following:
If the primal problem is

f(x) = c^t x → minimize

subject to

Ax = b, x ≥ 0,

then its dual is

g(y) = b^t y → maximize

subject to

A^t y ≤ c.

The dual variables y are unrestricted.
Remark 2. The dual of a dual of a primal LPP in standard form is itself the
primal LPP in standard form.
Duality theorems
A duality theorem is a statement about the range of possible values for the primal problem versus the range of possible values for the dual problem. There are two major results relating the primal and dual problems. The first, called "weak" duality, states that primal objective values provide bounds for dual objective values, and vice versa. The second, called "strong" duality, states that the optimal values of the primal and dual problems are equal, provided that they exist. Since every linear programming problem can be converted to standard form, in the theoretical results below we work with primal linear programs in standard form.
Theorem (The weak duality theorem). Let x be a BFS for the primal problem in standard form, and let y be a BFS for the dual problem. Then f(x) ≥ g(y).
Proof. The constraints of the dual show that A^t y ≤ c. By transposing the previous inequality we get y^t A ≤ c^t. Since x ≥ 0,

f(x) = c^t x ≥ y^t Ax = y^t b = g(y).
There are several simple consequences of the weak duality theorem.
Corollary 1. If the primal is unbounded, then the dual is infeasible. If the dual is unbounded, then the primal is infeasible.
Corollary 2. If x is a feasible solution to the primal problem, y is a feasible solution to the dual, and f(x) = g(y), then x and y are optimal for their respective problems.
The previous result shows that it is possible to check whether the vectors x and y are optimal without solving the primal and dual problems.
The previous theorem is called the weak duality theorem because it only says that the dual problem bounds the primal problem; it does not say that the bound is tight. The latter is expressed by the strong duality theorem.
Theorem (The strong duality theorem). Consider a pair of primal and dual linear programming problems. If one of the problems has an optimal solution, then so does the other, and the optimal values are equal.
Proof. We assume that
– the primal problem is in standard form;
– the primal problem has an optimal basic feasible solution x.
By reordering the variables we can write x in terms of basic and nonbasic variables,

x = (x_B, 0)^t,

and correspondingly we have

A = [B N], c = (c_B, c_N)^t and x_B = B^{−1}b.

Since x is optimal, c_N^t − c_B^t B^{−1} N ≥ 0.
Let y = (B^{−1})^t c_B.
We will show that y is a feasible solution and f(x) = g(y). Then Corollary 2 will show that y is optimal for the dual.
First we check the feasibility:

y^t A = (B^{−t} c_B)^t A = c_B^t B^{−1} A = c_B^t B^{−1} [B N] = [c_B^t  c_B^t B^{−1} N] ≤ [c_B^t  c_N^t] = c^t.
Taking the transpose of the previous inequality we get A^t y ≤ c, and hence y satisfies the dual constraints. Moreover,

f(x) = c^t x = c_B^t x_B = c_B^t B^{−1} b,
g(y) = b^t y = (b^t y)^t = y^t b = c_B^t B^{−1} b.

So, y is feasible for the dual and f(x) = g(y). Hence, by Corollary 2, y is optimal for the dual.
The previous proof provides the optimal dual solution. If

x = (x_B, x_N)^t, A = [B N] and c = (c_B, c_N)^t,

then the optimal values of the dual variables are given by

y = B^{−t} c_B.
Remark 3. If the given linear programming problem has a complete set of slack variables, then the reduced costs of the slack variables are given by

c_N^t − c_B^t B^{−1} N = 0^t − c_B^t B^{−1} I = −(B^{−t} c_B)^t = −y^t,

because the objective coefficients (c_N^t) of the slack variables are zero, and their constraint coefficients (N) are given by the identity matrix I. In this case the values of the optimal dual variables are the opposites of the reduced costs of the slack variables.
More precisely:
To obtain the optimal values of the nonsurplus variables of the dual problem, negate the entries in the c_j − f_j row under the slack columns (of the primal problem). The slack column corresponding to the first constraint of the primal problem yields the first variable of the dual, and so on. To obtain the optimal surplus values for the dual, negate the entries in the c_j − f_j row under the nonslack columns. The column corresponding to the first nonslack variable yields the surplus variable associated to the first constraint of the dual problem, and so on.
Example 3.

f = −x_1 − 2x_2 → minimize

subject to

−2x_1 + x_2 ≤ 2
−x_1 + 2x_2 ≤ 7
x_1 ≤ 3
x_1, x_2 ≥ 0
The standard form of the previous problem is:

f = −x_1 − 2x_2 → minimize

subject to

−2x_1 + x_2 + x_3 = 2
−x_1 + 2x_2 + x_4 = 7
x_1 + x_5 = 3
x_1, x_2, x_3, x_4, x_5 ≥ 0
   c           −1    −2     0     0     0
c_B    B    b  x_1   x_2   x_3   x_4   x_5   ratio test
 0  ← x_3   2   −2     1     1     0     0   min{2/1, 7/2} = 2
 0    x_4   7   −1     2     0     1     0
 0    x_5   3    1     0     0     0     1
      f_j   0    0     0     0     0     0
 c_j − f_j  −   −1    −2     0     0     0
−2    x_2   2   −2     1     1     0     0
 0  ← x_4   3    3     0    −2     1     0   min{3/3, 3/1} = 1
 0    x_5   3    1     0     0     0     1
      f_j  −4    4    −2    −2     0     0
 c_j − f_j  −   −5     0     2     0     0
−2    x_2   4    0     1   −1/3   2/3    0
−1    x_1   1    1     0   −2/3   1/3    0
 0  ← x_5   2    0     0    2/3  −1/3    1
      f_j  −9   −1    −2    4/3  −5/3    0
 c_j − f_j  −    0     0   −4/3   5/3    0
−2    x_2   5    0     1     0    1/2   1/2
−1    x_1   3    1     0     0     0     1
 0  → x_3   3    0     0     1   −1/2   3/2
      f_j −13   −1    −2     0    −1    −2
 c_j − f_j  −    0     0     0     1     2
Hence f_min = −13 and x^t_min = (3, 5, 3, 0, 0).
The dual problem is

g = 2y_1 + 7y_2 + 3y_3 → maximize

subject to

−2y_1 − y_2 + y_3 ≤ −1
y_1 + 2y_2 ≤ −2
y_1, y_2, y_3 ≤ 0

and we obtain

g_max = f_min = −13, y^t_max = (0, −1, −2).
Remark 4. If the given linear programming problem has a complete set of surplus variables, then the reduced costs of the surplus variables are given by

c_N^t − c_B^t B^{−1} N = 0^t − c_B^t B^{−1} (−I) = (B^{−t} c_B)^t = y^t,

because the objective coefficients (c_N^t) of the surplus variables are zero, and their constraint coefficients (N) are given by −I. In this case the values of the optimal dual variables are the same as the reduced costs of the surplus variables.
More precisely:
The optimal values of the nonslack variables of the dual problem are the entries in the c_j − f_j row under the surplus columns of the primal problem. The surplus column corresponding to the first constraint of the primal problem yields the first variable of the dual, and so on.
The optimal values of the slack variables of the dual problem are the entries in the c_j − f_j row under the nonsurplus columns of the primal problem. The column corresponding to the first nonsurplus variable yields the slack variable associated to the first constraint of the dual problem, and so on.
Example 4. We consider a very simple diet problem in which the nutrients are starch, protein and vitamins. The foods are two types of grains, with the data given below.

                 Nutrient units/kg of grain    Minimum daily requirement
Nutrient           type 1       type 2         of nutrient (units)
Starch                2            1                   2
Protein               1            2                   2
Vitamins              2            2                   3
Cost (RON/kg)         5            4
Determine the most economical diet which satisﬁes the basic minimum nutritional
requirements.
Solution. Let x_j be the amount in kg of grain j included in the daily diet, j = 1, 2; the vector x = (x_1, x_2)^t is the diet. Each nutrient leads to a constraint. For example, the amount of vitamins contained in the diet is 2x_1 + 2x_2, which must be ≥ 3.
The problem to be solved is the following:

f(x) = 5x_1 + 4x_2 → minimize

subject to

2x_1 + x_2 ≥ 2
x_1 + 2x_2 ≥ 2
2x_1 + 2x_2 ≥ 3
x_1 ≥ 0, x_2 ≥ 0

The simplex table associated to the standard LPP corresponding to (P) is:
   c            5     4     0     0     0
c_B    B    b  x_1   x_2   x_3   x_4   x_5   ratio test
            2    2     1    −1     0     0   min{2/2, 2/1, 3/2} = 1
            2    1     2     0    −1     0
            3    2     2     0     0    −1
 5  → x_1   1    1    1/2  −1/2    0     0
            1    0    3/2   1/2   −1     0   min{1/(1/2), 1/(3/2), 1/1} = 2/3
            1    0     1     1     0    −1
 5    x_1  2/3   1     0   −2/3   1/3    0
 4  → x_2  2/3   0     1    1/3  −2/3    0   min{(2/3)/(1/3), (1/3)/(2/3)} = 1/2
           1/3   0     0    2/3   2/3   −1
 5    x_1   1    1     0     0     1    −1
 4    x_2  1/2   0     1     0    −1    1/2  min{1/1, (1/2)/1} = 1/2
 0  ← x_3  1/2   0     0     1     1   −3/2
      f_j   7    5     4     0     1    −3
 c_j − f_j  −    0     0     0    −1     3
 5    x_1  1/2   1     0    −1     0    1/2
 4    x_2   1    0     1     1     0    −1
 0  → x_4  1/2   0     0     1     1   −3/2
      f_j 13/2   5     4    −1     0   −3/2
 c_j − f_j  −    0     0     1     0    3/2

f_min = 13/2, x^t = (1/2, 1, 0, 1/2, 0)
The dual problem is

g = 2y_1 + 2y_2 + 3y_3 → maximize

subject to

2y_1 + y_2 + 2y_3 ≤ 5
y_1 + 2y_2 + 2y_3 ≤ 4
y_1, y_2, y_3 ≥ 0
By using Remark 4 we obtain g_max = 13/2 and y^t = (1, 0, 3/2).
Complementary slackness
Theorem (Complementary Slackness). Consider a pair of primal and dual linear programming problems with the primal problem in standard form.
a) If x is optimal for the primal and y is optimal for the dual, then

x^t (c − A^t y) = 0, i.e. ∑_{j=1}^n x_j (c_j − (A^t y)_j) = ∑_{j=1}^n x_j (c_j − ∑_{i=1}^m a_{ij} y_i) = 0.

b) If x is feasible for the primal, y is feasible for the dual, and

x^t (c − A^t y) = 0,

then x and y are optimal for their respective problems.
Proof. If x and y are feasible, then

f(x) = c^t x ≥ (A^t y)^t x = y^t Ax = y^t b = (b^t y)^t = g(y).

If x and y are optimal, then f(x) = g(y), so that

c^t x = y^t Ax,

from which we easily get

x^t c = x^t A^t y and x^t (c − A^t y) = 0.

Conversely, if x^t (c − A^t y) = 0, then c^t x = y^t Ax = y^t b, so f(x) = g(y), and Corollary 2 shows that x and y are optimal.
Example. We look again at the pair of LPPs given by Example 3:

f = −x_1 − 2x_2 → minimize

subject to

−2x_1 + x_2 + x_3 = 2
−x_1 + 2x_2 + x_4 = 7
x_1 + x_5 = 3
x_1, …, x_5 ≥ 0
The optimal solutions are

x = (3, 5, 3, 0, 0)^t and y = (0, −1, −2)^t.

The dual constraints are

−2y_1 − y_2 + y_3 ≤ −1
y_1 + 2y_2 ≤ −2
y_1, y_2, y_3 ≤ 0
The complementary slackness theorem says

∑_{j=1}^5 x_j (c_j − ∑_{i=1}^3 a_{ij} y_i)
= 3[−1 − (−2·0 − 1·(−1) + 1·(−2))] + 5[−2 − (1·0 + 2·(−1) + 0·(−2))]
+ 3[0 − (1·0 + 0·(−1) + 0·(−2))] + 0[0 − (0·0 + 1·(−1) + 0·(−2))] + 0[0 − (0·0 + 0·(−1) + 1·(−2))]
= 3·0 + 5·0 + 3·0 + 0·1 + 0·2 = 0.
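The sum above can be recomputed mechanically (an illustrative check we add, not part of the original text):

```python
x = (3, 5, 3, 0, 0)          # optimal primal solution
y = (0, -1, -2)              # optimal dual solution
c = (-1, -2, 0, 0, 0)
A = ((-2, 1, 1, 0, 0),       # constraint matrix of the standard form
     (-1, 2, 0, 1, 0),
     ( 1, 0, 0, 0, 1))
terms = [x[j] * (c[j] - sum(A[i][j] * y[i] for i in range(3)))
         for j in range(5)]  # x_j * (c - A^t y)_j
print(terms, sum(terms))     # [0, 0, 0, 0, 0] 0
```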
The complementary slackness theorem leads us to the following results.
Remark. Consider a pair of primal and dual linear programming problems. Let x be optimal for the primal and y optimal for the dual.
1) If x_j > 0, then

c_j − ∑_{i=1}^m a_{ij} y_i = 0.

In other words, if x_j is a basic variable, then its reduced cost (or dual slack variable) is zero. Conversely, if a dual slack variable (reduced cost) is nonzero, then the associated primal variable is nonbasic and hence zero.
2) It is possible to have both x_j = 0 and c_j − ∑_{i=1}^m a_{ij} y_i = 0, for example, when the problem is degenerate and one of the basic variables is zero.
3) For a symmetric pair of primal and dual linear programming problems

(P) f = c^t x → minimize, subject to Ax ≥ b, x ≥ 0

and

(D) g = b^t y → maximize, subject to A^t y ≤ c, y ≥ 0,

the complementary slackness conditions are

x^t (c − A^t y) = 0 and y^t (Ax − b) = 0.
The complementary slackness conditions have an economic interpretation. Thinking in terms of the diet problem (see the first example in section 2.2), which is the primal part of a symmetric pair of dual problems, suppose that the optimal diet supplies more than b_j units of the j-th nutrient. This means that the dietician will not pay anything for small quantities of that nutrient, since its availability would not reduce the cost of the optimal diet. This implies y_j = 0, which follows from the condition y^t (Ax − b) = 0 in point 3) of the previous Remark.
Marginal values. Shadow prices
Consider the LPP in standard form

f(x) = c^t x → minimize

subject to

Ax = b, x ≥ 0,

where A is an m×n matrix and rank A = m.
The marginal value (or shadow price) of a constraint i is defined to be the rate of change of the objective function as a result of a change in the value of b_i, the right-hand side of constraint i.
Suppose we keep all the other data in the problem fixed at their current values, except b_i. Then, as b_i varies, the optimum objective value in the problem is a function of b_i, which we denote by F(b_i) (= c^t x). The marginal value of b_i in the problem is F′(b_i).
By using the limit definition of the derivative,
\[ F'(b_i) = \lim_{h \to 0} \frac{F(b_i + h) - F(b_i)}{h} \approx F(b_i + 1) - F(b_i). \]
By this approximation we can say that a shadow price is the amount by which the optimal value of the objective function would change if the right-hand side of a constraint were increased by one unit.
If the given LPP has a nondegenerate optimal basic feasible solution, then the dual problem has a unique optimal solution $y$ and $c^t x = b^t y$. The previous equality can be used to show that the marginal value associated with $b_i$ is $y_i$. Since the solution $x$ is nondegenerate, small changes in any $b_i$ will not change the optimal dual solution. Under nondegeneracy, the change in the value of $f$ for small changes in $b_i$ is obtained by partially differentiating $F(b_i) = b^t y$ with respect to $b_i$, as follows:
\[ F'(b_i) = \Big( \sum_{i=1}^{m} b_i y_i \Big)'_{b_i} = y_i. \]
Example. [14] A mining company owns two different mines that produce a given kind of ore. The mines are located in different parts of the country and have different production capacities. After crushing, the ore is graded into three classes: high-grade, medium-grade and low-grade ores. There is some demand for each grade of ore. The mining company has contracted to provide a smelting plant with 12 tons of high-grade, 8 tons of medium-grade, and 24 tons of low-grade ore. It costs the company $200 per day to run the first mine and $160 per day to run the second. However, in a day's operation the first mine produces 6 tons of high-grade, 2 tons of medium-grade and 4 tons of low-grade ore, while the second mine produces daily 2 tons of high-grade, 2 tons of medium-grade, and 12 tons of low-grade ore. How many days should each mine be operated in order to fulfill the company's orders most economically?
Solution. First, we summarize the problem in the following table:

               High-grade   Medium-grade   Low-grade   Cost
               ore          ore            ore
Mine 1          6            2              4           200
Mine 2          2            2             12           160
Requirements   12            8             24
73
x
1
 the number of days that mine 1 operates
x
2
 the number of days that mine 2 operates
f(x) = 200x
1
+ 160x
2
→ minimize
subject to
_
¸
¸
_
¸
¸
_
6x
1
+ 2x
2
≥ 12
2x
1
+ 2x
2
≥ 8
4x
1
+ 12x
2
≥ 24
x
1
≥ 0, x
2
≥ 0
The standard form of the previous problem is
\[ f(x) = 200x_1 + 160x_2 \to \text{minimize} \]
subject to
\[ \begin{cases} 6x_1 + 2x_2 - x_3 = 12 \\ 2x_1 + 2x_2 - x_4 = 8 \\ 4x_1 + 12x_2 - x_5 = 24 \\ x_1, x_2, x_3, x_4, x_5 \ge 0 \end{cases} \]
c                      200    160     0      0      0
c_B    B       b       x_1    x_2    x_3    x_4    x_5     ratio test
              12        6      2     -1      0      0
               8        2      2      0     -1      0      min{12/6, 8/2, 24/4} = 2
              24        4     12      0      0     -1
200 →  x_1     2        1     1/3   -1/6     0      0      min{2/(1/3), 4/(4/3), 16/(32/3)}
               4        0     4/3    1/3    -1      0        = min{6, 3, 3/2} = 3/2
              16        0    32/3    2/3     0     -1
200    x_1    3/2       1      0   -3/16     0     1/32    min{(3/2)/(1/32), 2/(1/8)}
               2        0      0     1/4    -1     1/8       = min{48, 16} = 16
160 →  x_2    3/2       0      1    1/16     0    -3/32
200    x_1     1        1      0    -1/4    1/4     0
  0 →  x_5    16        0      0      2     -8      1
160    x_2     3        0      1     1/4   -3/4     0
       f_j   680      200    160    -10    -70      0
 c_j - f_j     -        0      0     10     70      0
In conclusion $f_{\min} = 680$, $x_{\min} = (1, 3, 0, 0, 16)$.
The minimum operating cost is $680 and it is achieved by operating the first mine one day and the second mine three days.
If the mines are operated as indicated, then the combined production will be $6 + 2 \cdot 3 = 12$ tons of high-grade ore, $2 + 2 \cdot 3 = 8$ tons of medium-grade ore and $4 + 12 \cdot 3 = 40$ tons of low-grade ore.
We can observe that the low-grade ore is overproduced (by 16 tons).
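The optimum just found can be double-checked numerically. The sketch below (an illustration, not the book's method) enumerates the vertices of the two-variable feasible region and picks the cheapest feasible one:

```python
from itertools import combinations
import numpy as np

# Brute-force check of the mining LP (fine for two variables):
# minimize 200 x1 + 160 x2 over the feasible region by enumerating
# the vertices, i.e. the intersections of pairs of constraint lines.
c = np.array([200.0, 160.0])
A = np.array([[6, 2], [2, 2], [4, 12], [1, 0], [0, 1]], dtype=float)  # >= constraints
b = np.array([12, 8, 24, 0, 0], dtype=float)                          # last two rows: x >= 0

best = None
for i, j in combinations(range(5), 2):
    M = A[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue                       # parallel constraints, no vertex
    x = np.linalg.solve(M, b[[i, j]])
    if np.all(A @ x >= b - 1e-9):      # keep only feasible vertices
        if best is None or c @ x < c @ best:
            best = x

print(best, c @ best)   # [1. 3.] 680.0
```

The low-grade constraint evaluates to $4 \cdot 1 + 12 \cdot 3 = 40 \ge 24$ at the optimum, reproducing the 16-ton surplus noted above.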
The dual problem is
\[ g = 12y_1 + 8y_2 + 24y_3 \to \text{maximize} \]
subject to
\[ \begin{cases} 6y_1 + 2y_2 + 4y_3 \le 200 \\ 2y_1 + 2y_2 + 12y_3 \le 160 \\ y_1, y_2, y_3 \ge 0 \end{cases} \]
From the simplex table and Remark 4 we get that
\[ y_1 = 10,\quad y_2 = 70,\quad y_3 = 0,\quad y_4 = 0,\quad y_5 = 0,\quad g_{\max} = 680. \]
The first step in interpreting the solution to the dual problem is that of determining the dimensions of the variables involved. We will determine the dimensions of the variables of both the primal and dual problems by following the next two rules:
a) the dimension of $x_j$ is the dimension of $b_i$ divided by the dimension of $a_{ij}$, for any $i$;
b) the dimension of $y_i$ is the dimension of $c_j$ divided by the dimension of $a_{ij}$, for any $j$.
In our example we already know that the dimensions of $x_1$ and $x_2$ are days.
The dimension of $y_1$ = (dimension of $c_1$)/(dimension of $a_{11}$) = (\$/day)/(tons Hg/day) = \$/ton Hg.
In the same way, the dimension of $y_2$ = \$/ton Mg and the dimension of $y_3$ = \$/ton Lg.
The next step is to look at the optimal dual solution and give its interpretation. We know that $y_1 = 10$ has dimension \$/ton of high-grade ore, which sounds like the imputed cost of producing an additional ton of high-grade ore, and we shall show that this is the case. Suppose we increase the requirement for high-grade ore production from 12 to 16 tons.
The new problem is
\[ f = 200x_1 + 160x_2 \to \text{minimize} \]
subject to
\[ \begin{cases} 6x_1 + 2x_2 \ge 16 \\ 2x_1 + 2x_2 \ge 8 \\ 4x_1 + 12x_2 \ge 24 \\ x_1 \ge 0,\ x_2 \ge 0 \end{cases} \]
c                      200    160     0      0      0
c_B    B       b       x_1    x_2    x_3    x_4    x_5     ratio test
              16        6      2     -1      0      0
               8        2      2      0     -1      0      min{16/2, 8/2, 24/12} = 2
              24        4     12      0      0     -1
              12      16/3     0     -1      0     1/6
               4       4/3     0      0     -1     1/6     min{12/(1/6), 4/(1/6)} = 24
160 →  x_2     2       1/3     1      0      0    -1/12
               8        4      0     -1      1      0
  0 →  x_5    24        8      0      0     -6      1      min{8/4, 24/8, 4/1} = 2
160    x_2     4        1      1      0    -1/2     0
200 →  x_1     2        1      0    -1/4    1/4     0
  0    x_5     8        0      0      2     -8      1
160    x_2     2        0      1     1/4   -3/4     0
       f_j   720      200    160    -10    -70      0
 c_j - f_j     -        0      0     10     70      0
The optimal solution to the new problem is $x_1 = 2$, $x_2 = 2$.
Notice that the cost of production has increased from 680 to 720, that is, by $4y_1 = 4 \cdot 10 = 40$. Hence $y_1 = 10$ is the cost per ton of the additional high-grade ore. The fact that $y_3 = 0$ (which has dimension \$/ton of low-grade ore) says that the low-grade ore is free, in the sense that producing an additional ton has zero cost (because there is already an overproduction of 16 tons, so the additional ton costs nothing to produce since it already exists).
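The shadow-price interpretation can be checked by re-solving the problem for several right-hand sides of the high-grade constraint: each extra ton of required high-grade ore should add $y_1 = 10$ dollars to the optimal cost. A small sketch, using brute-force vertex enumeration purely for illustration:

```python
from itertools import combinations
import numpy as np

# Re-solve the mining LP with the high-grade requirement set to b1;
# the optimal cost should grow linearly with slope y1 = 10 near b1 = 12.
def optimal_cost(b1):
    c = np.array([200.0, 160.0])
    A = np.array([[6, 2], [2, 2], [4, 12], [1, 0], [0, 1]], dtype=float)
    b = np.array([b1, 8, 24, 0, 0], dtype=float)
    best = None
    for i, j in combinations(range(5), 2):
        M = A[[i, j]]
        if abs(np.linalg.det(M)) < 1e-12:
            continue
        x = np.linalg.solve(M, b[[i, j]])
        if np.all(A @ x >= b - 1e-9) and (best is None or c @ x < c @ best):
            best = x
    return float(c @ best)

print(optimal_cost(12), optimal_cost(13), optimal_cost(16))   # 680.0 690.0 720.0
```

Note the increments of exactly 10 per ton, in agreement with $F'(b_1) = y_1$.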
Remark 1. The interpretation of a pair of symmetric dual linear programming problems.
a) For either problem the matrix $A$ will be called the matrix of technological coefficients.
b) If the original problem is minimizing, we interpret $x$ as the activity vector; $b$ is interpreted as the requirements vector, whose components give the minimum amounts of each good that must be produced. The vector $c$ is the cost vector, whose entries give the unit costs of each of the activities. The vector $y$ (the solution of the dual problem) is the imputed-cost vector, whose components give the imputed costs of producing additional amounts of each of the required goods (provided the changes in requirements are sufficiently small that the dual solution remains optimal).
c) If the original problem is maximizing, we interpret $x$ as the activity vector. Then the vector $b$ is interpreted as the capacity-constraint vector, whose components give the amounts of resources that can be demanded by a given activity vector. The vector $c$ is the profit vector, whose entries give the unit profits for each component of the activity vector $x$. The vector $y$ is the imputed-value vector, whose entries give the imputed values of each of the resources that enter into the productive process (provided the changes in resources are sufficiently small that the dual solution remains optimal).
Remark 2. Consider the linear programming problem
\[ f = c^t x \to \text{minimize, subject to } \begin{cases} Ax = b \\ x \ge 0 \end{cases} \]
Assume that the optimal basis is $B$ with corresponding solution $(x_B, 0)$, where $x_B = B^{-1} b$. A solution to the corresponding dual problem is $y = (B^{-1})^t c_B$.
Assuming nondegeneracy, small changes in the vector $b$ will not cause the optimal basis to change. Thus for $b + \Delta b$ the optimal solution is $x = (x_B + \Delta x_B, 0)$, where $\Delta x_B = B^{-1} \Delta b$. Thus the corresponding increment in the cost function is
\[ \Delta f = \Delta g = c_B^t \Delta x_B = y^t \Delta b. \]
This equation shows that $y$ gives the sensitivity of the optimal cost with respect to small changes in the vector $b$. If a new problem is solved with $b$ changed to $b + \Delta b$, the change in the optimal value of the objective function will be $\lambda^t \Delta b$, where $\lambda_j$ is the marginal price of the component $b_j$, since if $b_j$ is changed to $b_j + \Delta b_j$ the value of the optimal solution changes by $\lambda_j \Delta b_j$.
Game theory
Game theory is a mathematical approach to problems of strategy such as one can find in operations research or economics. This theory is frequently and naturally used in everyday life.
Game theory is used to analyse situations where, for two or more individuals (or institutions), the outcome of an action by one of them depends not only on the action taken by that individual but also on the actions taken by the others. The strategies of individuals will depend on expectations about what the others are doing. These games are called games of strategy and the participants are called players. The players of such a game need to take into account the possible actions of the others when they make decisions.
Strategic thinking characterizes many human interactions. Here are some examples:
a) Two firms with large market shares in a particular industry making decisions with respect to price and output.
b) The decision of a firm to enter a new market where there is a risk that the existing firms will try to fight entry.
c) A criminal deciding whether or not to confess to a crime that he has committed with an accomplice who is also questioned by the police.
d) Family members arguing over the division of work within the household.
We shall focus on the simplest type of game, called the finite two-person zero-sum game, or matrix game for short.
Matrix games
A matrix game is a two-person game defined as follows. Each of two persons selects (independently) an action from a finite set of choices and both reveal their choice to each other. If we denote the first player's choice by $i$ $(i = \overline{1, m})$ and the second player's choice by $j$ $(j = \overline{1, l})$, then the rules of the game stipulate that the first player's payoff is $a_{ij}$. We shall refer to the first player as the row player (R) and the second player as the column player (C).
The matrix of possible payments $A = [a_{ij}]_{1 \le i \le m,\, 1 \le j \le l}$ is known to both players before the game begins.
More explicitly: if $a_{ij} > 0$, C pays R an amount of $a_{ij}$; if $a_{ij} < 0$, R pays C an amount of $|a_{ij}|$; if $a_{ij} = 0$, no money is won or lost.
The main properties of a matrix game are:
• there are two players (two-person game)
• each player has finitely many choices of play, each makes one choice, and the combination of the two choices determines a payoff (finite game)
• what one player wins, the other loses (zero-sum game).
We present now some examples:
Example 1. Paper-Scissors-Rock game
This is a two-person game in which each player declares either Paper, Scissors or Rock. If both players declare the same object, then the payoff is 0. Paper loses to Scissors since scissors can cut a piece of paper. Scissors loses to Rock since a rock can dull scissors, and finally Rock loses to Paper since a piece of paper can cover up a rock. The payoff (to the winner) is 1 in these cases.
The payoff matrix is:

                         C player
                  Paper  Scissors  Rock
         Paper    [  0     -1       1  ]
R player Scissors [  1      0      -1  ]
         Rock     [ -1      1       0  ]
Example 2. Morra game
The two players simultaneously show either one or two fingers, and at the same time each player announces a number. If the number announced by one of the players is the same as the total number of fingers shown by both players, then he wins that number from the opponent (if both players guess right then the payment is zero).
Each player has four possible strategies. If he shows one finger then he may guess two or three (guessing four in this case can never win, so that strategy is eliminated). If he shows two fingers then he may guess three or four.
If we denote by $R_{ij}$ and $C_{ij}$ the strategy of showing $i$ fingers and guessing the number $j$, then the following payoff matrix is associated to the Morra game:

        C_12  C_13  C_23  C_24
R_12  [   0     2    -3     0  ]
R_13  [  -2     0     0     3  ]
R_23  [   3     0     0    -4  ]
R_24  [   0    -3     4     0  ]
Example 3. Two stores, R and C, are planning to locate in one of two towns. Town 1 has 70 percent of the population while town 2 has 30 percent. If both stores locate in the same town they will split the total business of both towns equally, but if they locate in different towns each will get the business of its own town. Where should each store locate?
The payoff matrix is:

                 Store C locates in
                    1    2
Store R    1     [ 50   70 ]
locates in 2     [ 30   50 ]

The entries of the payoff matrix represent the percentages of business of store R (or the percentage losses of business by C).
It is easy to see that store R should prefer to locate in town 1, because by this choice R can assure himself of 20 percent more business than in town 2. Similarly, store C also prefers to locate in town 1, because he will lose 20 percent less business in town 1 than in town 2.
Hence the best strategies are for each store to locate in town 1.
By a strategy for R in a matrix game $A$ we mean a decision by R to play the various rows with a given probability distribution, i.e. to play the first row with probability $p_1$, to play the second row with probability $p_2$, and so on.
This strategy for R is represented by the probability vector
\[ p = (p_1, \dots, p_m), \qquad \sum_{k=1}^{m} p_k = 1, \]
where $p_i$, $i = \overline{1, m}$, represents the probability that player R chooses row $i$.
In the same way, by a strategy for C we mean a decision by C to play the various columns with a given probability distribution, i.e. to play the first column with probability $q_1$, to play the second column with probability $q_2$, and so on.
This strategy for C is represented by the probability vector
\[ q = (q_1, \dots, q_l), \qquad \sum_{j=1}^{l} q_j = 1, \]
where $q_j$, $j = \overline{1, l}$, represents the probability that player C chooses column $j$.
A strategy which contains a 1 as a component (and in consequence 0 everywhere else) is called a pure strategy; otherwise it is called a mixed strategy.
In the case of a pure strategy the player R (respectively the player C) decides to
play always a given row (respectively a given column).
When R plays row $i$ with probability $p_i$ $(i = \overline{1, m})$ and C plays column $j$ with probability $q_j$ $(j = \overline{1, l})$, the payoff $a_{ij}$ will be realized with probability $p_i q_j$.
Hence the expected winnings of R are:
\begin{align*}
E(p, q) &= p_1 q_1 a_{11} + p_1 q_2 a_{12} + \dots + p_1 q_l a_{1l} + \dots + p_m q_1 a_{m1} + p_m q_2 a_{m2} + \dots + p_m q_l a_{ml} \\
&= \sum_{i=1}^{m} p_i \sum_{j=1}^{l} q_j a_{ij}
= (p_1, \dots, p_m)
\begin{pmatrix} a_{11} & \dots & a_{1l} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{ml} \end{pmatrix}
\begin{pmatrix} q_1 \\ \vdots \\ q_l \end{pmatrix}
= p A q^t \\
&= \sum_{j=1}^{l} q_j \sum_{i=1}^{m} p_i a_{ij}
= (q_1, \dots, q_l)
\begin{pmatrix} a_{11} & \dots & a_{m1} \\ \vdots & & \vdots \\ a_{1l} & \dots & a_{ml} \end{pmatrix}
\begin{pmatrix} p_1 \\ \vdots \\ p_m \end{pmatrix}
= q A^t p^t.
\end{align*}
In conclusion:
\[ E(p, q) = p A q^t = q A^t p^t. \]
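The identity $E(p, q) = pAq^t = qA^tp^t$ is easy to check numerically; here it is evaluated on the Paper-Scissors-Rock matrix of Example 1 with two sample mixed strategies (illustrative values, not taken from the text):

```python
import numpy as np

# Expected payoff E(p, q) = p A q^t for Paper-Scissors-Rock,
# with sample mixed strategies p and q.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]])
p = np.array([1/2, 1/4, 1/4])
q = np.array([1/3, 1/3, 1/3])

E = p @ A @ q
print(E)                            # 0.0: the uniform q neutralizes any p here
print(np.isclose(E, q @ A.T @ p))   # True: the identity E = q A^t p^t
```

That $E = 0$ for the uniform $q$ against every $p$ already hints that the uniform strategy is optimal for this game, which is confirmed later by the simplex solution.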
The player R tries to choose a row $i$ $(i = \overline{1, m})$ such that the expected value of his winnings is maximal no matter what column the player C chooses. The player C tries to choose a column $j$ $(j = \overline{1, l})$ such that the expected value of his losses is minimal no matter what row the player R chooses.
We say that the game with payoff matrix $A$ has the value $v$, and we call $p_0$ and $q_0$ optimal strategies, if
\[ E(p_0, q) \ge v, \text{ for every strategy } q \text{ for C} \quad (*) \]
\[ E(p, q_0) \le v, \text{ for every strategy } p \text{ for R} \quad (**) \]
Remark 1.
a) If $p_0$ is a given strategy for player R in a matrix game $A$, then the following two conditions are equivalent:
(i) $E(p_0, q) \ge v$, for every strategy $q$ for C;
(ii) $p_0 A \ge (v, v, \dots, v)$.
b) If $q_0$ is a given strategy for player C in a matrix game $A$, then the following two conditions are equivalent:
(i) $E(p, q_0) \le v$, for every strategy $p$ for R;
(ii) $q_0 A^t \le (v, v, \dots, v)$.
Proof. a) Assume that (i) holds and that $p_0 A = (a_1, a_2, \dots, a_l)$. Choosing the pure strategy $q = (1, 0, \dots, 0)$ we have
\[ E(p_0, q) = p_0 A q^t = (a_1, a_2, \dots, a_l)(1, 0, \dots, 0)^t = a_1 \ge v. \]
Similarly $a_2 \ge v, \dots, a_l \ge v$. In other words,
\[ p_0 A \ge (v, v, \dots, v). \]
On the other hand, assume that (ii) holds. Then, for any strategy $q$ for C,
\[ E(p_0, q) = p_0 A q^t \ge (v, v, \dots, v)(q_1, q_2, \dots, q_l)^t = \sum_{j=1}^{l} v q_j = v \cdot 1 = v. \]
b) Assume that (i) holds and that $q_0 A^t = (b_1, b_2, \dots, b_m)$. Choosing the pure strategy $p = (1, 0, \dots, 0)$ we have
\[ E(p, q_0) = q_0 A^t p^t = (b_1, b_2, \dots, b_m)(1, 0, \dots, 0)^t = b_1 \le v. \]
Similarly, $b_2 \le v, \dots, b_m \le v$. In other words,
\[ q_0 A^t \le (v, v, \dots, v). \]
On the other hand, assume that (ii) holds. Then, for any strategy $p$ for R,
\[ E(p, q_0) = q_0 A^t p^t \le (v, v, \dots, v)(p_1, p_2, \dots, p_m)^t = \sum_{i=1}^{m} v p_i = v \cdot 1 = v. \]
As we can observe from the previous proof:
– the inequality $p_0 A \ge (v, v, \dots, v)$ can be written as $E(p_0, q) \ge v$ for every strategy $q$ for C;
– the inequality $q_0 A^t \le (v, v, \dots, v)$ can be written as $E(p, q_0) \le v$ for every strategy $p$ for R.
In view of the previous remark we say that the game with payoff matrix $A$ has the value $v$, and $p_0$, $q_0$ are optimal strategies, if
\[ p_0 A \ge (v, v, \dots, v) \quad (*') \]
\[ q_0 A^t \le (v, v, \dots, v) \quad (**') \]
We conclude this subsection by proving three results that characterize the value and optimal strategies of a game.
Remark 2. If $A$ is a matrix game that has a value and optimal strategies, then the value of the game is unique.
Proof. Suppose that $v$ and $w$ are two values for the matrix game $A$. If $p_0$ and $q_0$ are optimal strategy vectors associated with the value $v$, then
(i) $p_0 A \ge (v, v, \dots, v)$
(ii) $q_0 A^t \le (v, v, \dots, v)$.
If $p_1$ and $q_1$ are optimal strategy vectors associated with the value $w$, then
(iii) $p_1 A \ge (w, w, \dots, w)$
(iv) $q_1 A^t \le (w, w, \dots, w)$.
If we multiply (i) on the right by $(q_1)^t$ we get
\[ p_0 A (q_1)^t \ge \sum_{j=1}^{l} v q_j^1 = v. \]
In the same way, multiplying (iv) on the right by $(p_0)^t$ gives
\[ p_0 A (q_1)^t = (p_0 A (q_1)^t)^t = q_1 A^t (p_0)^t \le \sum_{i=1}^{m} w p_i^0 = w. \]
The two inequalities obtained before show that $w \ge v$.
Similarly, if we multiply (iii) on the right by $(q_0)^t$ and (ii) on the right by $(p_1)^t$, we obtain $p_1 A (q_0)^t \ge w$ and $p_1 A (q_0)^t \le v$, which together imply $v \ge w$.
In consequence $w = v$, which completes the proof.
Remark 3. If $A$ is a matrix game with value $v$ and optimal strategies $p_0$ and $q_0$, then $v = p_0 A (q_0)^t$.
Proof. The following inequalities are true:
\[ p_0 A \ge (v, v, \dots, v) \quad \text{and} \quad q_0 A^t \le (v, v, \dots, v). \]
Multiplying the first of these inequalities on the right by $(q_0)^t$, we get
\[ p_0 A (q_0)^t \ge v. \]
Similarly, multiplying the second inequality on the right by $(p_0)^t$, we obtain
\[ p_0 A (q_0)^t = (p_0 A (q_0)^t)^t = q_0 A^t (p_0)^t \le v. \]
These two inequalities together imply that
\[ v = p_0 A (q_0)^t. \]
The previous two remarks allow us to interpret the value of a game as an expected value in the following way: if the matrix game is played repeatedly and if each time the player R chooses the $p_0$ strategy and the player C chooses the $q_0$ strategy, then the value of the matrix game $A$ is the expected value of the game for R.
Remark 4. If $A$ is a matrix game with value $v$ and optimal strategies $p_0$ and $q_0$, then $v$ is the largest expectation that R can assure for himself and also the smallest expectation that C can assure for himself.
Proof. Let $p$ be any strategy vector of R; multiplying the inequality $q_0 A^t \le (v, \dots, v)$ on the right by $p^t$, we get
\[ p A (q_0)^t = (q_0 A^t p^t)^t = q_0 A^t p^t \le v. \]
So, if C plays optimally, the most that R can obtain for himself is $v$. On the other hand, since $v = p_0 A (q_0)^t$, R can obtain for himself an expectation of $v$.
The proof of the other statement of the remark is similar.
The previous remark tells us that the value of a game is the "best" that a player can obtain for himself (by using the optimal strategies).
Strictly determined games. Saddle point
A matrix game is strictly determined if the matrix has an entry which is a
minimum in its row and a maximum in its column; such an entry is called a ”saddle
point”.
Remark 5. Let $v$ be a saddle point of a strictly determined game. Then an optimum strategy for R is to play the row containing $v$, an optimum strategy for C is to play the column containing $v$, and $v$ is the value of the game.
Proof. Suppose $v = a_{ij}$, and take $p_0 = (0, \dots, 1, \dots, 0)$ (with 1 as the $i$th coordinate) and $q_0 = (0, \dots, 1, \dots, 0)$ (with 1 as the $j$th coordinate).
We now show that $p_0$, $q_0$ and $v$ satisfy the required properties to be optimum strategies and the value of the game. Indeed,
\[ p_0 A = (0, \dots, 1, \dots, 0) A = (a_{i1}, \dots, a_{ij}, \dots, a_{il}) \ge (a_{ij}, \dots, a_{ij}) = (v, \dots, v), \]
since $a_{ij}$ is a minimum in its row, and
\[ q_0 A^t = (0, \dots, 1, \dots, 0) A^t = (a_{1j}, \dots, a_{ij}, \dots, a_{mj}) \le (a_{ij}, \dots, a_{ij}) = (v, \dots, v), \]
since $a_{ij}$ is a maximum in its column.
Thus in a strictly determined game a pure strategy for each player is an optimum
strategy:
for player R: to choose a row that contains a saddle value
for player C: to choose a column that contains a saddle value.
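The defining property of a saddle point translates directly into code. A small sketch (illustrative only; the store-location matrix of Example 4 below serves as a check):

```python
import numpy as np

# A saddle point is an entry that is the minimum of its row and
# the maximum of its column.
def saddle_points(A):
    A = np.asarray(A)
    return [(i, j) for i in range(A.shape[0]) for j in range(A.shape[1])
            if A[i, j] == A[i].min() and A[i, j] == A[:, j].max()]

A = [[50, 50, 80],
     [50, 50, 80],
     [20, 20, 50]]
print(saddle_points(A))   # [(0, 0), (0, 1), (1, 0), (1, 1)] -- saddle value 50
```

An empty list means the game is nonstrictly determined and mixed strategies are needed.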
Example 4. We consider a generalization of Example 3 in which the stores R and C are trying to locate in one of the three towns in the figure below.

[Figure: Town 1, Town 2 and Town 3, with population shares 50, 30 and 20, at mutual distances 30 km, 18 km and 24 km.]

If both stores locate in the same town they split all business equally, but if they locate in different towns then all the business in the town that doesn't have a store will go to the closer of the two stores.
The payoff matrix for this game is the following:
                 Store C locates in
                    1    2    3
Store R    1     [ 50   50   80 ]
locates in 2     [ 50   50   80 ]
           3     [ 20   20   50 ]

If we circle the minimum entry in each row and put a square around the maximum entry in each column, each of the four 50 entries in the 2 × 2 submatrix in the upper left-hand corner is both circled and boxed, and so is a saddle value of the matrix. Hence the game is strictly determined, and optimal strategies are:
• for store R: locate in town 1 or locate in town 2, represented by the vectors $(1, 0, 0)$ and $(0, 1, 0)$ respectively; combining the previous two strategies we get the mixed strategies "locate in town 1 with probability $p$ and locate in town 2 with probability $1 - p$", represented by the vector
\[ p(1, 0, 0) + (1 - p)(0, 1, 0) = (p, 1 - p, 0), \quad 0 < p < 1; \]
• for store C: locate in town 1, locate in town 2, and "locate in town 1 with probability $q$ and locate in town 2 with probability $1 - q$", represented by the vectors $(1, 0, 0)$, $(0, 1, 0)$ and $(q, 1 - q, 0)$.
2 × 2 matrix games
Consider the matrix game
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}. \]
If $A$ is strictly determined, then the solution is obtained as above. Thus we need only consider the case in which $A$ is nonstrictly determined.
Criterion. The 2 × 2 matrix game is nonstrictly determined if and only if each of the entries on one of the diagonals is greater than each of the entries on the other diagonal, i.e. one of the following situations is fulfilled:
(i) $a_{11}, a_{22} > a_{12}$ and $a_{11}, a_{22} > a_{21}$, or
(ii) $a_{12}, a_{21} > a_{11}$ and $a_{12}, a_{21} > a_{22}$.
Proof. If either of the conditions (i) or (ii) holds, it is easy to check that no entry of the matrix is simultaneously the minimum of the row and the maximum of the column in which it occurs; hence the game is not strictly determined.
In order to prove the other part of the criterion, we observe first that if two of the entries in the same row or the same column of $A$ are equal, then the game is strictly determined; hence the entries in the same row (or column) are different.
Suppose now that $a_{11} > a_{12}$; then $a_{22} > a_{12}$, or else $a_{12}$ is a row minimum and a column maximum; then also $a_{22} > a_{21}$, or else $a_{22}$ is a row minimum and a column maximum; then also $a_{11} > a_{21}$, or else $a_{21}$ is a row minimum and a column maximum. This is case (i).
In a similar manner the assumption $a_{11} < a_{12}$ leads to case (ii). This completes the proof of the criterion.
In order to determine the optimal strategies for a 2 × 2 nonstrictly determined game we have the following result:
Theorem 1. If the 2 × 2 matrix game $A$ is nonstrictly determined, then $p_0 = (p_1^0, p_2^0)$ is an optimal strategy for player R, $q_0 = (q_1^0, q_2^0)$ is an optimal strategy for player C, and $v$ is the value of the game, where
\[ p_1^0 = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}}, \qquad p_2^0 = \frac{a_{11} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}}, \]
\[ q_1^0 = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}}, \qquad q_2^0 = \frac{a_{11} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \]
and
\[ v = \frac{a_{11} a_{22} - a_{12} a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} = \frac{\det A}{a_{11} + a_{22} - a_{12} - a_{21}}. \]
Proof. We have to check that the values above satisfy the conditions
\[ p_0 A \ge (v, v), \qquad q_0 A^t \le (v, v), \]
which are equivalent to:
\[ p_1^0 a_{11} + p_2^0 a_{21} \ge v, \qquad p_1^0 a_{12} + p_2^0 a_{22} \ge v \]
and
\[ q_1^0 a_{11} + q_2^0 a_{12} \le v, \qquad q_1^0 a_{21} + q_2^0 a_{22} \le v. \]
It is easy to verify that the previous formulas are true, and in consequence the proof is complete.
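Theorem 1 is a closed-form recipe, so it is easy to implement; the sketch below uses exact rational arithmetic and is checked against the simplified Morra game of Example 5 below:

```python
from fractions import Fraction

# Closed-form solution of Theorem 1 for a nonstrictly determined
# 2x2 game, in exact rational arithmetic.
def solve_2x2(a11, a12, a21, a22):
    d = Fraction(a11 + a22 - a12 - a21)
    p0 = ((a22 - a21) / d, (a11 - a12) / d)
    q0 = ((a22 - a12) / d, (a11 - a21) / d)
    v = (a11 * a22 - a12 * a21) / d
    return p0, q0, v

# Simplified Morra (Example 5 below): expect p0 = q0 = (7/12, 5/12), v = -1/12.
p0, q0, v = solve_2x2(2, -3, -3, 4)
print(p0, q0, v)
```

The criterion above guarantees that the denominator $a_{11} + a_{22} - a_{12} - a_{21}$ is nonzero for a nonstrictly determined game, so the division is safe.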
Example 5. (a simplified version of the Morra game)
Each of the two players R and C simultaneously shows one or two fingers. If the sum of the fingers shown is even, R wins the sum from C; if the sum is odd, R loses the sum to C. The matrix is the following:

             C shows
              1    2
R     1    [  2   -3 ]
shows 2    [ -3    4 ]

It is easy to see that the game is nonstrictly determined, so we apply the formulas presented in Theorem 1:
\[ v = \frac{2 \cdot 4 - (-3)(-3)}{2 + 4 + 3 + 3} = -\frac{1}{12}. \]
Thus the game is in favor of player C. Optimum strategies $p_0$ for R and $q_0$ for C are as follows:
\[ p_0 = \left( \frac{7}{12}, \frac{5}{12} \right), \qquad q_0 = \left( \frac{7}{12}, \frac{5}{12} \right). \]
Remark 6. If a matrix game contains a row (column) whose elements are smaller than or equal to (greater than or equal to) the corresponding elements of another row (column), then the smaller row (greater column) is called a recessive row (column). Clearly, player R (player C) would never play a recessive row (column), which is why the recessive row (column) can be omitted from the game.
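Recessive rows and columns can be removed mechanically. A sketch of such an iterative reduction (illustrative only):

```python
import numpy as np

# Repeatedly delete recessive rows/columns: a row R would never play is
# entrywise <= some other row; a column C would never play is entrywise
# >= some other column.
def reduce_game(A):
    A = np.asarray(A)
    changed = True
    while changed:
        changed = False
        m, n = A.shape
        for i in range(m):
            if any(k != i and np.all(A[i] <= A[k]) for k in range(m)):
                A = np.delete(A, i, axis=0)
                changed = True
                break
        else:
            for j in range(n):
                if any(k != j and np.all(A[:, j] >= A[:, k]) for k in range(n)):
                    A = np.delete(A, j, axis=1)
                    changed = True
                    break
    return A

A = [[-4, -3, 1], [2, -1, 2], [-2, 3, 4]]
print(reduce_game(A))   # [[ 2 -1]
                        #  [-2  3]]
```

This reproduces the two reduction steps of Example 6 below in one call.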
Example 6. Consider the matrix game
\[ A = \begin{pmatrix} -4 & -3 & 1 \\ 2 & -1 & 2 \\ -2 & 3 & 4 \end{pmatrix}. \]
Note that $(-4, -3, 1) \le (2, -1, 2)$, i.e. the first row is recessive and can be omitted from the game, which may be reduced to
\[ \begin{pmatrix} 2 & -1 & 2 \\ -2 & 3 & 4 \end{pmatrix}. \]
Now observe that the third column is recessive, since each of its entries is greater than or equal to the corresponding entry in the second column. Thus the game may be reduced to the 2 × 2 game
\[ A^* = \begin{pmatrix} 2 & -1 \\ -2 & 3 \end{pmatrix}. \]
The solution to the game $A^*$ can be found by using the formulas in Theorem 1 and is
\[ v = \frac{4}{8} = \frac{1}{2}; \qquad p_0^* = \left( \frac{5}{8}, \frac{3}{8} \right); \qquad q_0^* = \left( \frac{4}{8}, \frac{4}{8} \right) = \left( \frac{1}{2}, \frac{1}{2} \right). \]
Thus the solution to the original game $A$ is
\[ v = \frac{1}{2}, \qquad p_0 = \left( 0, \frac{5}{8}, \frac{3}{8} \right) \qquad \text{and} \qquad q_0 = \left( \frac{1}{2}, \frac{1}{2}, 0 \right). \]
2 × l and m × 2 matrix games
In the case in which one of the players has just two strategies we can solve the game geometrically.
Example 6. Consider the game whose matrix is
\[ A = \begin{pmatrix} 1 & 0 & -1 & 0 \\ -3 & -2 & 1 & 2 \end{pmatrix}. \]
Since the fourth column is recessive, we can omit it from the game, which can be reduced to
\[ A^* = \begin{pmatrix} 1 & 0 & -1 \\ -3 & -2 & 1 \end{pmatrix}. \]
The player R plays an arbitrary strategy
\[ p = (p_1, p_2) = (1 - p_2, p_2). \]
If the player C chooses column 1, then the expected payment $y$ is
\[ y = 1 \cdot p_1 - 3p_2 = 1 - p_2 - 3p_2 = 1 - 4p_2. \]
If the player C chooses column 2, then
\[ y = 0 \cdot p_1 - 2p_2 = -2p_2. \]
If the player C chooses column 3, then
\[ y = -p_1 + p_2 = -(1 - p_2) + p_2 = -1 + 2p_2. \]
Notice that each of these expectations expresses $y$ as a linear function of $p_2$. Hence the graph of each expectation is a straight line. Since we have the restriction $0 \le p_2 \le 1$, we are interested only in the segments for which $p_2$ satisfies this restriction.
87
¸
Maximum
1
0
−1
−1/2
−2
−3
Column 3
y = −1 + 2p
2
1
p
2
axis
Column 2
y = −2p
2
Column 1
y = 1 −4p
2
`
y axis
The player C will minimize his own expectation (his losses) by choosing the lowest
of the three lines presented in the above ﬁgure. Now R is the maximizing player, so he
will try to get the maximum of this function. This maximum occurs at the intersection
of the lines corresponding to the columns 2 and 3 (−1 + 2p
2
= −2p
2
) when p
2
=
1
4
,
p
0
=
_
3
4
,
1
4
_
and the value of the game is v = −2
1
4
= −
1
2
.
We can find an optimal strategy for player C by considering the 2 × 2 subgame of $A$ consisting of the second and third columns:
\[ \begin{pmatrix} 0 & -1 \\ -2 & 1 \end{pmatrix}. \]
Applying the formulas from Theorem 1 we obtain the strategy $q_0 = \left( \frac{1}{2}, \frac{1}{2} \right)$. We can extend $q_0$ to an optimal strategy for player C in $A$ by adding two zero entries, thus:
\[ q_0 = \left( 0, \frac{1}{2}, \frac{1}{2}, 0 \right). \]
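The graphical argument can be verified numerically by sampling $p_2$, taking the minimum of the three lines at each point (C's best reply), and maximizing that lower envelope (an illustrative sketch):

```python
import numpy as np

# Sample p2, build the three column expectations derived above, and
# locate the maximum of their lower envelope.
p2 = np.linspace(0, 1, 100001)
lines = np.vstack([1 - 4 * p2,     # column 1
                   -2 * p2,        # column 2
                   -1 + 2 * p2])   # column 3
envelope = lines.min(axis=0)
k = envelope.argmax()
print(p2[k], envelope[k])   # ~0.25 and ~-0.5, as found geometrically
```

The same sampling idea works for any 2 × l or m × 2 game, with max and min exchanged for the row player.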
A method similar to that of the previous example works to solve games in which the column player has just two strategies and the row player has more than two.
Example 7. Consider the game whose matrix is
\[ A = \begin{pmatrix} 6 & -1 \\ 0 & 4 \\ 4 & 3 \end{pmatrix}. \]
The player C plays an arbitrary strategy
\[ q = (q_1, q_2) = (1 - q_2, q_2). \]
If the player R chooses row $i$ $(i = \overline{1, 3})$ then the expectation of player R is:
If R chooses row 1: $y = 6q_1 + (-1)q_2 = 6(1 - q_2) - q_2 = 6 - 7q_2$.
If R chooses row 2: $y = 0 \cdot q_1 + 4q_2 = 4q_2$.
If R chooses row 3: $y = 4q_1 + 3q_2 = 4(1 - q_2) + 3q_2 = 4 - q_2$.
[Figure: the three lines $y = 6 - 7q_2$ (row 1), $y = 4q_2$ (row 2) and $y = 4 - q_2$ (row 3) plotted over $0 \le q_2 \le 1$.]

The player R will maximize his own expectation (his winnings) by choosing the greatest of the three lines in the figure above. C is the minimizing player, so he will try to get the minimum of this upper envelope. This minimum occurs at the intersection of the lines corresponding to rows 2 and 3 ($4q_2 = 4 - q_2$), when $q_2 = \frac{4}{5}$, so
\[ q_0 = \left( \frac{1}{5}, \frac{4}{5} \right) \]
and the value of the game is $v = 4 - \frac{4}{5} = \frac{16}{5}$.
We can find an optimal strategy for player R by considering the 2 × 2 subgame of $A$ consisting of the second and third rows:
\[ \begin{pmatrix} 0 & 4 \\ 4 & 3 \end{pmatrix}. \]
Applying the formulas from Theorem 1 we obtain the strategy $p_0 = \left( \frac{1}{5}, \frac{4}{5} \right)$. We can extend $p_0$ to an optimal strategy for player R in $A$ by adding one zero entry, thus:
\[ p_0 = \left( 0, \frac{1}{5}, \frac{4}{5} \right). \]
Next, we will prove von Neumann's theorem (which says that every zero-sum, two-person game has a value in mixed strategies). Von Neumann's original proof uses the Brouwer fixed-point theorem. The proof presented below uses an idea of George Dantzig and linear programming theory. Dantzig's proof is better than von Neumann's because it is elementary and because it shows how to construct a best strategy.
Theorem 2. (John von Neumann's theorem)
Let $A$ be any real matrix. Then the zero-sum, two-person game with payoff matrix $A$ has a value $v$ which satisfies $p_0 A \ge (v, v, \dots, v)$ and $q_0 A^t \le (v, v, \dots, v)$ for some optimal strategies $p_0$ and $q_0$.
Proof. With no loss of generality, assume that all $a_{ij}$ are positive. Otherwise, if $\alpha$ is chosen so that $a_{ij} + \alpha$ is positive for all $i$ and $j$, then the optimality conditions $p_0 A \ge (v, v, \dots, v)$ and $q_0 A^t \le (v, v, \dots, v)$ may be replaced by
\[ \sum_{i=1}^{m} p_i (a_{ij} + \alpha) = \sum_{i=1}^{m} p_i a_{ij} + \alpha \ge v + \alpha, \quad j = \overline{1, l} \]
and
\[ \sum_{j=1}^{l} (a_{ij} + \alpha) q_j = \sum_{j=1}^{l} a_{ij} q_j + \alpha \le v + \alpha, \quad i = \overline{1, m}. \]
Assuming all $a_{ij} > 0$, we will construct a number $v > 0$ satisfying the optimality conditions. First we observe that the optimality conditions can be written as
\[ \sum_{i=1}^{m} \frac{p_i}{v} a_{ij} \ge 1, \quad j = \overline{1, l}, \qquad \sum_{j=1}^{l} \frac{q_j}{v} a_{ij} \le 1, \quad i = \overline{1, m}. \]
If we define the unknowns $x_j = \dfrac{q_j}{v}$ $(j = \overline{1, l})$ and $y_i = \dfrac{p_i}{v}$ $(i = \overline{1, m})$, then the previous inequalities become:
\[ \sum_{j=1}^{l} a_{ij} x_j \le 1, \quad i = \overline{1, m}, \qquad \sum_{i=1}^{m} a_{ij} y_i \ge 1, \quad j = \overline{1, l}, \]
with $x \ge 0$, $y \ge 0$ and
\[ \sum_{j=1}^{l} x_j = \sum_{j=1}^{l} \frac{q_j}{v} = \frac{1}{v}, \qquad \sum_{i=1}^{m} y_i = \sum_{i=1}^{m} \frac{p_i}{v} = \frac{1}{v}. \]
The required vectors $x \ge 0$ and $y \ge 0$ must solve the following linear programming problems:
\[ (P)\quad f = \sum_{j=1}^{l} x_j \to \text{maximize} \]
(since $\sum_{j=1}^{l} x_j = \frac{1}{v}$ and $v$ is to be minimized, the column player being a minimizing player), subject to
\[ \begin{cases} \displaystyle\sum_{j=1}^{l} a_{ij} x_j \le 1, \quad i = \overline{1, m} \\ x_1, \dots, x_l \ge 0 \end{cases} \]
\[ (D)\quad g = \sum_{i=1}^{m} y_i \to \text{minimize} \]
(since $\sum_{i=1}^{m} y_i = \frac{1}{v}$ and $v$ is to be maximized, the row player being a maximizing player), subject to
\[ \begin{cases} \displaystyle\sum_{i=1}^{m} a_{ij} y_i \ge 1, \quad j = \overline{1, l} \\ y_1, \dots, y_m \ge 0 \end{cases} \]
The previous two linear programming problems are a symmetric pair of dual problems. These problems have optimal solutions because they both have feasible solutions (since all $a_{ij}$ are positive, a vector $y$ is feasible if all its components are large enough; the vector $x = 0$ is feasible for the primal). By the duality theorem, these linear programming problems have optimal solutions $x$, $y$ and the same optimal value, which we denote by $\frac{1}{v}$.
We can easily see that $v$, $p$ and $q$ satisfy von Neumann's theorem if $p = vy$ and $q = vx$.
The simplex method for solving matrix games
Suppose we are given a matrix game. We assume that A is not strictly determined (strictly determined games can be solved, as we discussed before, without using the simplex method) and does not contain any recessive rows or columns. According to the previous proof we can obtain the solution to A as follows:
1st step. Add a sufficiently large number k to every entry of A to form the matrix game

A* = (a_ij + k),  i = 1, m, j = 1, l,

which has only positive entries. The purpose is to guarantee that the value of the new matrix game is positive.
2nd step. Solve the following LPP by the simplex method

f(x) = x_1 + x_2 + · · · + x_l → maximize
subject to
  a*_11 x_1 + · · · + a*_1l x_l ≤ 1
  a*_21 x_1 + · · · + a*_2l x_l ≤ 1
  . . .
  a*_m1 x_1 + · · · + a*_ml x_l ≤ 1
  x_1, x_2, . . . , x_l ≥ 0

where a*_ij = a_ij + k are the entries of A*.
Let x^0 be the optimum solution to the maximum problem, y^0 the optimum solution to the dual minimum problem (which can be found in the row of reduced costs of the terminal table) and v* = f(x^0) (the optimal value of the LPP).
Let p^0 = (1/v*) y^0, q^0 = (1/v*) x^0 and v = 1/v* − k.
Then p^0 is an optimum strategy for player R in game A, q^0 is an optimum strategy for player C and v is the value of the game.
We remark that the games A and A* have the same optimum strategies for their respective players and that their values differ by the added constant k.
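As a quick sanity check of this procedure, here is a sketch in Python that delegates the two dual linear programs to `scipy.optimize.linprog` instead of hand simplex tables. The function name, the automatic choice of k, and the use of scipy are our own choices, not part of the text.

```python
import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(A, k=None):
    """Return (p, q, v): optimal row/column strategies and the game value."""
    A = np.asarray(A, dtype=float)
    if k is None:
        k = 1.0 - A.min()          # shift so every entry of A* is positive
    As = A + k
    # (P): maximize sum(x) subject to As @ x <= 1, x >= 0.
    # linprog minimizes, so we negate the objective.
    res = linprog(c=-np.ones(As.shape[1]),
                  A_ub=As, b_ub=np.ones(As.shape[0]),
                  bounds=[(0, None)] * As.shape[1], method="highs")
    v_star = -res.fun              # optimal value 1/v of the shifted game
    q = res.x / v_star             # column player's optimal mixed strategy
    # The dual solution y is read from the marginals of the <= constraints.
    y = -res.ineqlin.marginals
    p = y / v_star                 # row player's optimal mixed strategy
    v = 1.0 / v_star - k           # value of the original game
    return p, q, v
```

For the Paper-Scissors-Rock matrix this returns the uniform strategies (1/3, 1/3, 1/3) for both players and value 0, matching the hand computation in Example 8 below.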
Example 8. Solve the Paper-Scissors-Rock game (Example 1) by using the simplex method.
Solution. Add k = 2 to each entry to form the matrix game

  ( 2  1  3 )
  ( 3  2  1 )
  ( 1  3  2 ).
We have to solve the following LPP

f(x) = x_1 + x_2 + x_3 → maximize
subject to
  2x_1 + x_2 + 3x_3 ≤ 1
  3x_1 + 2x_2 + x_3 ≤ 1
  x_1 + 3x_2 + 2x_3 ≤ 1
  x_1, x_2, x_3 ≥ 0
 c               1      1      1      0      0      0
C_B   B    b    x_1    x_2    x_3    x_4    x_5    x_6    ratio test

 0   x_4   1     2      1      3      1      0      0    min{1/2, 1/3, 1}
 0   x_5   1     3      2      1      0      1      0      = 1/3
 0   x_6   1     1      3      2      0      0      1
     f_j   0     0      0      0      0      0      0
  c_j − f_j      1      1      1      0      0      0

 0   x_4  1/3    0    −1/3    7/3     1    −2/3     0    min{(1/3)/(7/3), (1/3)/(1/3), (2/3)/(5/3)}
 1   x_1  1/3    1     2/3    1/3     0     1/3     0      = 1/7
 0   x_6  2/3    0     7/3    5/3     0    −1/3     1
     f_j  1/3    1     2/3    1/3     0     1/3     0
  c_j − f_j      0     1/3    2/3     0    −1/3     0

 1   x_3  1/7    0    −1/7     1     3/7   −2/7     0    min{(2/7)/(5/7), (3/7)/(18/7)}
 1   x_1  2/7    1     5/7     0    −1/7    3/7     0      = 1/6
 0   x_6  3/7    0    18/7     0    −5/7    1/7     1
     f_j  3/7    1     4/7     1     2/7    1/7     0
  c_j − f_j      0     3/7     0    −2/7   −1/7     0

 1   x_3  1/6    0      0      1    7/18  −5/18   1/18
 1   x_1  1/6    1      0      0    1/18   7/18  −5/18
 1   x_2  1/6    0      1      0   −5/18   1/18   7/18
     f_j  1/2    1      1      1     1/6    1/6    1/6
  c_j − f_j      0      0      0    −1/6   −1/6   −1/6
The optimal solutions of the primal and dual problem are

x^0 = (1/6, 1/6, 1/6),   y^0 = (1/6, 1/6, 1/6)

and the optimal value is v* = 1/2.
Then for the original game

p^0 = (1/v*) y^0 = 2 (1/6, 1/6, 1/6) = (1/3, 1/3, 1/3)
q^0 = (1/v*) x^0 = 2 (1/6, 1/6, 1/6) = (1/3, 1/3, 1/3)
v = 1/v* − 2 = 0.

Observe that the game is fair (since its value is 0).
The transportation problem
The balanced transportation problem
An important component of economic life is the shipping of goods from where they are produced to the markets. The aim is to ship these goods at minimum cost. This problem was one of the first problems modeled and solved by using linear programming.
We analyse the single commodity transportation problem. The data of this problem consist of the amount available at each source, the requirement at each demand center and the cost of transporting the commodity per unit from each source to each market.
We consider the transportation problem with the following data:
m = number of sources where material is available
n = number of demand centers where material is required
a_i = units of material available at source i, a_i > 0, i = 1, m
b_j = units of material required at demand center j, b_j > 0, j = 1, n
c_ij = unit shipping cost (m.u./unit) from source i to demand center j, i = 1, m, j = 1, n.
The transportation problem with this data is said to satisfy the balance condition if

∑_{i=1}^m a_i = ∑_{j=1}^n b_j.

A transportation problem which satisfies the previous condition is called a balanced transportation problem.
We want to determine the quantities to be shipped such that all the constraints are satisfied (all the supplies from the sources are shipped and all the demands are satisfied) and the total cost of transportation is minimum.
If we denote by x_ij (i = 1, m, j = 1, n) the quantity to be shipped from source i to demand center j, we get the following mathematical model (as a linear programming problem)

f(x) = ∑_{i=1}^m ∑_{j=1}^n c_ij x_ij → minimize
subject to
  ∑_{j=1}^n x_ij = a_i,  i = 1, m
  ∑_{i=1}^m x_ij = b_j,  j = 1, n
  x_ij ≥ 0,  i = 1, m, j = 1, n                    (1)
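Model (1) can also be handed directly to a general LP solver. The sketch below builds the equality constraints row by row and solves with `scipy.optimize.linprog` (scipy is an assumption of ours, and the small data set is the iron-ore example that appears later in this section):

```python
import numpy as np
from scipy.optimize import linprog

def solve_transportation(cost, supply, demand):
    """Solve the balanced transportation problem; return (plan, total_cost)."""
    cost = np.asarray(cost, dtype=float)
    m, n = cost.shape
    assert abs(sum(supply) - sum(demand)) < 1e-9, "balance condition violated"
    # Equality constraints: row sums = supply, column sums = demand.
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0          # sum_j x_ij = a_i
    for j in range(n):
        A_eq[m + j, j::n] = 1.0                   # sum_i x_ij = b_j
    b_eq = np.concatenate([supply, demand])
    res = linprog(c=cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (m * n), method="highs")
    return res.x.reshape(m, n), res.fun

plan, total = solve_transportation([[11, 8, 2], [7, 5, 4]],
                                   [800, 300], [400, 500, 200])
```

Note that the redundant constraint discussed next does not bother the solver: the m + n equations are consistent, so the LP is still well posed.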
If we add the set of first m constraints (those corresponding to the sources) and, separately, the set of last n constraints (those corresponding to the demand centers), we see that

∑_{i=1}^m a_i = ∑_{i=1}^m ∑_{j=1}^n x_ij = ∑_{j=1}^n ∑_{i=1}^m x_ij = ∑_{j=1}^n b_j.

The previous equality shows us that the balance condition is a necessary condition for the feasibility of the transportation problem. That is why we assumed that the data of the problem satisfy the balance condition.
The previous equality also shows that there is a redundant constraint among the constraints (1). One of the equality constraints in (1) can be deleted from the system without affecting the set of feasible solutions.
After deleting a redundant constraint, the matrix of the previous system of equations has order (m + n − 1) × mn and its rank is m + n − 1.
So, every basic vector for the balanced transportation problem of order m × n consists of m + n − 1 basic variables.
The transportation problem can be represented in a two-dimensional array in which row i corresponds to source i, column j corresponds to demand center j, and (i, j) is the cell in row i and column j.
In the cell (i, j), we record the value x_ij (the amount to be shipped) in the lower right-hand corner of the cell and the unit shipping cost in the upper left-hand corner of the cell.
On the right-hand side of the array we record the availabilities at the sources and at the bottom of the array we record the requirements at the demand centers.
The objective function is the sum of the variables in the array multiplied by the unit cost in the corresponding cell.

Array representation of the transportation problem (each cell shows c_ij | x_ij):

                        Demand center
              1      . . .      j      . . .      n        Supply
        1   c_11 | x_11       c_1j | x_1j      c_1n | x_1n   a_1
Source  i   c_i1 | x_i1       c_ij | x_ij      c_in | x_in   a_i
        m   c_m1 | x_m1       c_mj | x_mj      c_mn | x_mn   a_m
Demand        b_1               b_j              b_n
We end this subsection with an example.
Example 1. We consider a small transportation problem where the commodity is iron ore, the sources are mine 1 and mine 2 that produce the ore, and the markets are three steel plants. Let c_ij = cost (RON/ton) to ship ore from mine i to plant j, i = 1, 2, j = 1, 2, 3. The data is given below.

                           Steel plant            Available at
                     1          2          3      mine (tons) daily
Mine 1          11 | x_11   8 | x_12   2 | x_13        800
     2           7 | x_21   5 | x_22   4 | x_23        300
Demand at
plant (tons)       400        500        200
daily
plant (tons) daily
Determine the amounts to be shipped such that the transportation cost is minimum.
Let x_ij be the amount of ore (in tons) shipped from mine i to plant j.
At mine 1 there are 800 tons of ore available. The amount of ore shipped out of this mine, x_11 + x_12 + x_13, has to be smaller than the amount available, leading to the constraint x_11 + x_12 + x_13 ≤ 800. Similarly, considering the ore at steel plant 1, at least 400 tons are required there, leading to the constraint x_11 + x_21 ≥ 400.
The total amount of ore available is 800 + 300 = 1100, and the total amount required is 400 + 500 + 200 = 1100. This implies that all the ore at each mine will be shipped out, and the requirement at each plant will be met exactly. In consequence all constraints will be equalities.
The dual problem. The optimality criterion
We associate the dual variable u_i to the constraint corresponding to source i, i = 1, m, and the dual variable v_j to the constraint corresponding to demand center j, j = 1, n.
The dual problem is:

g(u, v) = ∑_{i=1}^m a_i u_i + ∑_{j=1}^n b_j v_j → maximize
subject to
  u_i + v_j ≤ c_ij,  i = 1, m, j = 1, n
  u_i, v_j unrestricted in sign, i = 1, m, j = 1, n                (2)

From the complementary slackness theorem (from the previous section) we know that if x = (x_ij), i = 1, m, j = 1, n, is a basic feasible solution for the primal problem, (u, v) = ((u_i)_{i=1,m}, (v_j)_{j=1,n}) is feasible for the dual and

x_ij (c_ij − u_i − v_j) = 0, for all i and j,

then x and (u, v) are optimal for their respective problems.
In conclusion, if x = (x_ij) is a basic feasible solution for the transportation problem, then the dual basic solution associated with it can be computed by solving the following system of equations:

u_i + v_j = c_ij for each (i, j) corresponding to a basic variable x_ij.

The previous system has m + n unknowns and m + n − 1 equations (since each BFS x has m + n − 1 basic variables).
Deleting one constraint in (1) has the effect of setting the corresponding dual variable to 0.
The system which gives us the dual basic solution is

  u_i + v_j = c_ij for each (i, j) corresponding to a basic variable x_ij
  u_1 = 0 (we can choose any dual variable to be 0).

The optimality criterion is:
Optimality criterion. c_ij ≥ u_i + v_j for all nonbasic (i, j).
Indeed, if the optimality criterion is satisfied, then (u, v) is feasible for the dual problem. Since x is feasible for the primal problem, according to the complementary slackness theorem, x and (u, v) are optimal for their respective problems.
The transportation algorithm
By using the special structure of the transportation problem we will present a version of the simplex algorithm that can be carried out without canonical tables.
Step 1. Determine an initial basic feasible solution.
We present two methods of obtaining an initial basic feasible solution: the northwest corner rule and the minimal cost rule.
The northwest corner rule
Begin in the upper left-hand corner (or northwest corner) of the transportation array and set x_11 as large as possible (there are two limitations for setting x_11: b_1, which is the demand at market 1, and a_1, which is the supply at source 1). So, x_11 = min{a_1, b_1}. Setting x_11 = min{a_1, b_1}, the remaining supply at source 1 will be a_1 − x_11 and the remaining demand at market 1 will be b_1 − x_11.
If x_11 = a_1 then x_12 = · · · = x_1n = 0 and hence x_12, . . . , x_1n will be nonbasic variables (instead of 0 in their cells we will put a point).
If x_11 = b_1 then x_21 = · · · = x_m1 = 0 and hence x_21, . . . , x_m1 will be nonbasic variables (instead of 0 in their cells we will put a point).
Continue this procedure from the upper left-hand corner of the remaining table.
The northwest corner rule does not use the shipping costs. It easily provides an initial BFS, but the total shipping cost may be very high.
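The rule above is mechanical enough to state in a few lines of Python (a sketch; the function name is our own):

```python
def northwest_corner(supply, demand):
    """Northwest corner rule for a balanced problem: return {(i, j): amount}."""
    supply, demand = list(supply), list(demand)
    i = j = 0
    alloc = {}
    while i < len(supply) and j < len(demand):
        x = min(supply[i], demand[j])
        alloc[(i, j)] = x
        supply[i] -= x
        demand[j] -= x
        if supply[i] == 0 and i < len(supply) - 1:
            i += 1            # row exhausted (ties also move down): go down
        else:
            j += 1            # column exhausted: go right
    return alloc
```

On the iron-ore data this produces the allocations x_11 = 400, x_12 = 400, x_22 = 100, x_23 = 200 with total cost 8900 RON, as computed by hand in the example below.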
Example. To make this description more concrete, we now illustrate the general procedure on the iron ore shipping problem (Example 1).

         11 | 400    8 | 400    2 | •       800  400  0
          7 | •      5 | 100    4 | 200     300
            400        500        200
              0        100          0

In the first step, at the northwest corner, a maximum of 400 tons can be allocated, since that is all that is required by plant 1. This leaves 800 − 400 = 400 tons available at the first mine and 0 tons required at plant 1. The demand in column 1 is fully satisfied and we cannot ship any more to plant 1, so x_21 = 0 and we put a • in cell (2,1).
Next, we move to the second cell in the top row (the northwest corner of the remaining table), where a maximum of 400 tons can be allocated, since that is all that remains available at mine 1. This leaves 400 − 400 = 0 tons available at the first mine and 500 − 400 = 100 tons required by plant 2. At this moment the first row's availability is exhausted (x_13 = 0) and we move down to the second row, where x_22 = 100 and x_23 = 200. We have 3 + 2 − 1 = 4 basic variables, which are x_11, x_12, x_22 and x_23. The transportation cost of this BFS is

11 · 400 + 8 · 400 + 2 · 0 + 7 · 0 + 5 · 100 + 4 · 200 = 8900 RON.
The minimal cost rule
This method uses the shipping costs in order to provide an initial BFS that has a lower cost.
First we determine the cell (i, j) with the smallest shipping cost, then assign x_ij the largest possible value, which is x_ij = min{a_i, b_j}.
The supply at source i will be reduced to a_i − x_ij and the demand at demand center j will be reduced to b_j − x_ij.
If x_ij = a_i then x_ik = 0 for each k = 1, n, k ≠ j.
If x_ij = b_j then x_lj = 0 for each l = 1, m, l ≠ i.
After that, we will choose the cell with the minimum shipping cost from the remaining array and repeat the procedure.
If the minimum cost is attained by more than one cell, then we will begin with the cell for which the corresponding variable is largest.
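The same rule as a Python sketch (names are our own; the tie-break follows the text's recommendation of preferring the larger possible shipment):

```python
def minimal_cost_rule(cost, supply, demand):
    """Minimal cost rule for a balanced problem: return {(i, j): amount}."""
    supply, demand = list(supply), list(demand)
    m, n = len(supply), len(demand)
    alloc = {}
    while any(s > 0 for s in supply) and any(d > 0 for d in demand):
        # Cheapest open cell; cost ties broken by the larger possible shipment.
        i, j = min(((i, j) for i in range(m) for j in range(n)
                    if supply[i] > 0 and demand[j] > 0),
                   key=lambda ij: (cost[ij[0]][ij[1]],
                                   -min(supply[ij[0]], demand[ij[1]])))
        x = min(supply[i], demand[j])
        alloc[(i, j)] = x
        supply[i] -= x
        demand[j] -= x
    return alloc
```

On the iron-ore data this yields x_13 = 200, x_22 = 300, x_12 = 200, x_11 = 400 and the total cost 7900 RON of the example below.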
Example. We illustrate the procedure described above on the iron ore shipping problem (Example 1).

         11 | 400    8 | 200    2 | 200     800  600  400  0
          7 | •      5 | 300    4 | •       300    0
            400        500        200
                       200          0
                         0

The smallest shipping cost is c_13 = 2, so x_13 = min{200, 800} = 200. The demand in column 3 is fully satisfied, and we cannot ship any more to plant 3. So, x_23 = 0 (and we put a point in cell (2,3)). We change the amount still to be shipped from mine 1 to 800 − 200 = 600 and the amount still required at plant 3 to 200 − 200 = 0.
The smallest cost among the remaining cells is c_22 = 5 and

x_22 = min{500, 300} = 300.

We change the remaining requirement at plant 2 to 500 − 300 = 200, the remaining availability at mine 2 to 300 − 300 = 0 and put a point in cell (2,1) since x_21 = 0.
The remaining cells are (1,1) and (1,2), and

x_12 = min{200, 600} = 200;   x_11 = min{400, 400} = 400.

We have 4 = 3 + 2 − 1 basic variables, which are x_11, x_12, x_13 and x_22.
The transportation cost of this BFS is

11 · 400 + 8 · 200 + 2 · 200 + 7 · 0 + 5 · 300 + 4 · 0 = 7900 RON.
Step 2. Check the optimality of the current BFS.
Denote by u_i (i = 1, m) and v_j (j = 1, n) the dual variables corresponding to the sources and the demand centers, respectively.
Solve the system

  u_i + v_j = c_ij for each (i, j) corresponding to a basic variable x_ij
  u_1 = 0.

For each (i, j) compute u_i + v_j and write down the obtained value in the upper right-hand corner of the corresponding cell.
Check the optimality criterion: c_ij ≥ u_i + v_j for all nonbasic x_ij.
If the current BFS is optimal then stop. Write down the optimal solution and compute the minimal value of f.
Remark. The above system can be solved very easily. Since we know u_1 = 0, we can get the values of v_j for the columns j of the basic cells in row 1. Knowing these values of v_j, from the equations corresponding to basic cells in those columns we can get the values of u_i for the rows i of those basic cells. We continue in the same way until all the u_i and v_j are computed.
Since any dual variable can be chosen to be 0, we will set to zero the dual variable on whose row (or column) we have the maximum number of basic cells.
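The propagation described in the remark is easy to automate. A sketch in Python (our own names): sweep over the basic cells, filling in whichever of u_i, v_j is still unknown, until nothing changes.

```python
def dual_variables(cost, basic_cells, fixed_row=0):
    """Solve u_i + v_j = c_ij over the basic cells, with u[fixed_row] = 0."""
    u = {fixed_row: 0.0}
    v = {}
    changed = True
    while changed:                      # keep sweeping until nothing new
        changed = False
        for (i, j) in basic_cells:
            if i in u and j not in v:
                v[j] = cost[i][j] - u[i]
                changed = True
            elif j in v and i not in u:
                u[i] = cost[i][j] - v[j]
                changed = True
    return u, v
```

For the iron-ore BFS with basic cells (1,1), (1,2), (1,3), (2,2) this reproduces u = (0, −3) and v = (11, 8, 2), the values derived by hand in the example that follows.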
Example. Consider the BFS obtained by the minimal cost rule in the iron ore shipping problem. In each cell we write the cost c_ij, then u_i + v_j in parentheses, then the allocation (• marks a nonbasic cell).

              v_1 = 11        v_2 = 8        v_3 = 2
u_1 = 0      11 (11) 400     8 (8) 200     2 (2) 200
u_2 = −3      7 (8)  •       5 (5) 300     4 (−1) •

To compute the dual basic solution, we start with u_1 = 0 (on the first row we have 3 basic cells). Since (1,1) is a basic cell, we have u_1 + v_1 = c_11 = 11, so v_1 = 11.
Since (1,2) is a basic cell, we have u_1 + v_2 = c_12 = 8, so v_2 = 8.
Since (1,3) is a basic cell, we have u_1 + v_3 = c_13 = 2, so v_3 = 2; and the processing of row 1 is done.
Since (2,2) is a basic cell, we have u_2 + v_2 = c_22 = 5, and since v_2 = 8 we get u_2 = 5 − 8 = −3.
The dual solution is written in the above array.
We check the optimality conditions and see that c_21 = 7 < 8 = u_2 + v_1.
Since c_21 < u_2 + v_1, the optimality criterion is not satisfied.
Step 3. Improve the current basic feasible solution by replacing exactly one basic variable with a nonbasic variable for which the optimality criterion is violated.
Choose as entering cell the nonbasic variable x_ij with the most negative reduced cost, c_ij − u_i − v_j.
Since every BFS of the m × n transportation problem has exactly m + n − 1 basic variables, when an entering variable is brought into the BFS, some present basic variable must be dropped from the BFS: this variable is called the dropping basic variable and the corresponding cell is called the dropping basic cell.
We determine now the dropping basic variable and the new BFS.
The value of the entering variable is changed from 0 (its present value as a nonbasic variable) to a value denoted by α (which will be determined later) and all the other nonbasic variables remain unchanged. If we denote by (p, q) the entering cell, then the value of x_pq changes from 0 to α. We have to subtract α from one of the basic variables in row p (so that a_p units are still shipped out of source p) and from one of the basic variables in column q (so that b_q units are still shipped to demand center q). We continue these adjustments, adding α to another basic variable, then subtracting α from a basic variable, until all the adjustments cancel each other in every row and column. All the cells whose values are modified by α or by −α belong to a loop which has the following properties:
i) all the cells in the loop other than the entering cell are present basic cells
ii) every row and column of the array either has no cells in the loop, or has exactly two cells, one with a +α adjustment and the other with a −α adjustment.
The cells in the loop with a +α adjustment are called recipient cells. The cells with a −α adjustment are called donor cells.
The new solution is x(α) = (x_ij(α)) with

x_ij(α) = x_ij          if the cell (i, j) is not in the loop
x_ij(α) = x_ij + α      if (i, j) is a recipient cell in the loop
x_ij(α) = x_ij − α      if (i, j) is a donor cell in the loop.

We have that

f(x(α)) = f(x) + α (c_pq − u_p − v_q).

(The proof of the previous equality is beyond the scope of this text.)
Since f(x(α)) = f(x) + α (c_pq − u_p − v_q) and c_pq − u_p − v_q < 0, in order to decrease the objective value as much as possible we should give α the maximum value it can have, which is

α = min{x_ij : (i, j) a donor cell in the loop}.

A donor cell for which α = x_ij will become the dropping cell (and become a nonbasic cell). If there is more than one donor cell for which α = x_ij, then the dropping cell will be the cell for which there are no loops with basic cells.
We can summarize the previous discussion as follows:
- choose as entering variable x_ij with the most negative reduced cost (the reduced cost is c_ij − u_i − v_j)
- starting from this cell, consider a loop all of whose other corners are situated in basic cells. On each row or column there are either 2 cells of the loop or none.
- mark (in the lower left-hand corner of the corresponding cell) by a + the odd cells of this loop (the first, the third, . . .). These are the recipient cells of the loop.
- mark by a − the even cells of this loop (the second, the fourth, . . .). These are the donor cells of the loop.
- determine the minimal value of the variables situated in the donor cells. Denote it by α.
- add α to the recipient variables (situated in the cells marked by +).
- subtract α from the donor variables (situated in the cells marked by −).
- one of the donor variables equal to α will leave the basis. Go to Step 2.
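The loop search itself can be sketched as a small depth-first search that alternates row moves and column moves through the basic cells (the function name and the list-based representation are our own choices):

```python
def find_loop(basic_cells, entering):
    """Return the loop [entering, c1, c2, ...]; even positions get +alpha."""
    cells = set(basic_cells)

    def search(path):
        last, row_move = path[-1], len(path) % 2 == 1
        # The loop closes with a column move back to the entering cell.
        if len(path) >= 4 and len(path) % 2 == 0 and last[1] == entering[1]:
            return path
        for cell in cells - set(path):
            same_line = cell[0] == last[0] if row_move else cell[1] == last[1]
            if same_line:
                found = search(path + [cell])
                if found:
                    return found
        return None

    return search([entering])
```

For the iron-ore BFS with basic cells (1,1), (1,2), (1,3), (2,2) and entering cell (2,1), this returns the loop (2,1), (2,2), (1,2), (1,1), matching the example that follows.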
Example. Improve the BFS obtained by the minimal cost rule in the iron ore shipping problem (cells show c_ij, then u_i + v_j in parentheses, then the loop mark and the allocation).

              v_1 = 11         v_2 = 8         v_3 = 2
u_1 = 0      11 (11) −400     8 (8) +200     2 (2) 200
u_2 = −3      7 (8)  +•       5 (5) −300     4 (−1) •

The nonbasic cell (2,1), with c_21 = 7 < 8 = u_2 + v_1, is the only eligible cell to enter the basis. The loop consists of the following cells: (2,1), (2,2), (1,2), (1,1).
The odd cells (recipient cells) are (2,1), (1,2) and they are marked by +.
The even cells (donor cells) are (2,2), (1,1) and they are marked by −.
Since α = min{400, 300} = 300 = x_22, the dropping variable is x_22.
The new BFS is given in the array below.

              v_1 = 11        v_2 = 8        v_3 = 2
u_1 = 0      11 (11) 100     8 (8) 500     2 (2) 200
u_2 = −4      7 (7) 300      5 (4) •       4 (−2) •

We compute the dual basic variables and since

c_22 = 5 > 4 = u_2 + v_2;   c_23 = 4 > −2 = u_2 + v_3

the optimality criterion is satisfied.
Hence, we get an optimal solution

x = ( 100  500  200
      300    0    0 )

with a minimum cost

f(x) = 11 · 100 + 8 · 500 + 2 · 200 + 7 · 300 + 5 · 0 + 4 · 0 = 7600 RON.
Remark. a) Degenerate solutions
Each BFS which occurs in this iterative process must have m + n − 1 basic variables.
If we have fewer than m + n − 1 positive variables, we must put 0's instead of points (in the cells of some nonbasic variables) in order to obtain m + n − 1 basic variables. We will transform points into 0's in those cells with minimal costs for which there are no loops with the basic cells.
b) Multiple solutions
If all the optimality conditions are fulfilled and there is a nonbasic cell for which c_ij = u_i + v_j, then by choosing a loop starting with this cell we will obtain another optimal solution.
If x^1 and x^2 are optimal solutions, then for each t ∈ [0, 1], (1 − t)x^1 + tx^2 is optimal, too.
Next, we present an example of finding the loop which improves a given basic feasible solution of a balanced transportation problem. The loop will be a little more complicated in this case.
In the next array each cell shows c_ij, the value u_i + v_j in parentheses, the loop mark and the allocation (• marks a nonbasic cell).

              v_1 = 5        v_2 = 3        v_3 = 4        v_4 = −9       v_5 = 4
u_1 = 0       6 (5) •        3 (3) −18     4 (4) +20      5 (−9) •       5 (4) •
u_2 = 5      10 (10) −27     5 (8) •       7 (9) •       11 (−4) •       9 (9) +35
u_3 = 3       8 (8) +0       5 (6) •       7 (7) −27     11 (−6) •       9 (7) •
u_4 = 21     13 (26) •       4 (24) +•    16 (25) •      12 (12) 25     25 (25) −65

The previous BFS is degenerate, since among its 8 basic components just 7 are positive.
Computing the dual variables, we observe that the optimality conditions are violated in the following cells: (2,2), (2,3), (3,2), (4,1), (4,2) and (4,3).
The entering variable is x_42, since cell (4,2) provides the most negative reduced cost. We put a + in the lower left-hand part of this cell. To satisfy the equality constraints of the problem we need to subtract the same value as was added in cell (4,2) from one of the basic cells in row 4, that is, from cell (4,4) or (4,5).
If we choose the cell (4,4), since this is the only basic cell in column four, we cannot make the next correction in another basic cell of this column. In conclusion, the adjustment must be made in the basic cell (4,5). Continuing in the same way, we obtain the entire loop (marked by + and − in the above array).
Example 2. Solve the following transportation problem.

                  Demand center
               1       2       3
         1     1       2       3      15
Source   2     4       4      10      19
         3     6       5      15      11
               7       8      30

We check first the balance condition: 7 + 8 + 30 = 15 + 19 + 11.
Step 1. We determine an initial BFS by using the minimal cost rule:

          1 | 7     2 | 8     3 | •      15  8  0
          4 | •     4 | •    10 | 19     19  0
          6 | •     5 | 0    15 | 11     11  0
              7         8        30
              0         0        11
                                  0

Since c_11 = 1 = min{c_ij : i = 1, 3, j = 1, 3} we have

x_11 = min{7, 15} = 7.

The minimum cost of the remaining array is now c_12 = 2, so x_12 = min{8, 8} = 8.
Since we obtain just 4 positive variables, and since we must have 3 + 3 − 1 = 5 basic variables, we must transform a • into a basic variable with value 0. We can transform any • into a 0, since none of the present nonbasic variables has a loop with the basic variables. We consider x_32 = 0 to be a basic variable.
Step 2. We check the optimality of the previous BFS.

              v_1 = 1        v_2 = 2        v_3 = 12
u_1 = 0       1 (1) 7        2 (2) −8      3 (12) +•
u_2 = −2      4 (−1) •       4 (0) •      10 (10) 19
u_3 = 3       6 (4) •        5 (5) +0     15 (15) −11

u_1 = 0, u_1 + v_1 = 1 ⇒ v_1 = 1
u_1 + v_2 = 2 ⇒ v_2 = 2
v_2 = 2, u_3 + v_2 = 5 ⇒ u_3 = 3
u_3 = 3, u_3 + v_3 = 15 ⇒ v_3 = 12
v_3 = 12, u_2 + v_3 = 10 ⇒ u_2 = −2

Since c_13 = 3 < 12 = u_1 + v_3, the BFS is not optimal.
Step 3. We improve the BFS, starting from the (1,3) cell.
The loop consists of the following cells: (1,3), (3,3), (3,2) and (1,2).
The odd (recipient) cells are: (1,3) and (3,2).
The even (donor) cells are: (1,2) and (3,3).
Since α = min{8, 11} = 8 = x_12, the dropping variable is x_12.
The new BFS is given in the array below.

              v_1 = 1        v_2 = −7       v_3 = 3
u_1 = 0       1 (1) −7       2 (−7) •      3 (3) +8
u_2 = 7       4 (8) •        4 (0) •      10 (10) 19
u_3 = 12      6 (13) +•      5 (5) 8      15 (15) −3

We compute the dual variables and since c_21 = 4 < 8 = u_2 + v_1 and c_31 = 6 < 13 = u_3 + v_1, the BFS is not optimal. The entering variable is x_31 (since it provides the most negative reduced cost). The loop is (3,1), (3,3), (1,3), (1,1) with α = min{7, 3} = 3 = x_33, so x_33 is the dropping variable.
The new array is:

              v_1 = 1        v_2 = 0        v_3 = 3
u_1 = 0       1 (1) −4       2 (0) •       3 (3) +11
u_2 = 7       4 (8) +•       4 (7) •      10 (10) −19
u_3 = 5       6 (6) 3        5 (5) 8      15 (8) •

The BFS obtained above is not optimal, since the optimality condition is violated in the cells (2,1) and (2,2). The entering variable is x_21. Repeating the procedure presented before, we obtain the following new BFS.
              v_1 = −3       v_2 = −4       v_3 = 3
u_1 = 0       1 (−3) •       2 (−4) •      3 (3) 15
u_2 = 7       4 (4) 4        4 (3) •      10 (10) 15
u_3 = 9       6 (6) 3        5 (5) 8      15 (12) •

We compute the dual variables and since c_11 > u_1 + v_1, c_12 > u_1 + v_2, c_22 > u_2 + v_2 and c_33 > u_3 + v_3, the optimality criterion is satisfied.
Hence, we get an optimal solution

x = ( 0  0  15
      4  0  15
      3  8   0 )

with a minimum cost

f(x) = 3 · 15 + 4 · 4 + 10 · 15 + 6 · 3 + 5 · 8 = 269.

According to part b) of the previous remark, this optimal solution is unique, since there is no nonbasic variable whose reduced cost is zero.
Example 3. Solve the following balanced transportation problem.

                  Demand center
               1       2       3
         1     9       6       8       6
Source   2    10       5      12      11
         3    11      13      20       4
               3       4      14

Solution. We determine an initial basic feasible solution by using the least cost rule. We show this BFS in the following array; the dual variables and the loop marks are also entered in the array (cells show c_ij, then u_i + v_j in parentheses, then the allocation).

              v_1 = 10        v_2 = 5        v_3 = 12
u_1 = −4      9 (6) •         6 (1) •        8 (8) 6        6  0
u_2 = 0      10 (10) −3       5 (5) 4       12 (12) +4     11  7  4  0
u_3 = 8      11 (18) +•      13 (13) •      20 (20) −4      4  0
                 3               4             14
                 0               0              8
                                                0

The optimality criterion is violated, since c_31 = 11 < 18 = u_3 + v_1. Hence (3,1) is the entering cell, and the corresponding loop is already marked on the array. α = min{3, 4} = 3 and (2,1) is the dropping cell. The next BFS is presented in the following array.
              v_1 = 3        v_2 = 5        v_3 = 12
u_1 = −4      9 (−1) •       6 (1) •        8 (8) 6
u_2 = 0      10 (3) •        5 (5) −4      12 (12) +7
u_3 = 8      11 (11) 3      13 (13) +•     20 (20) −1

Now the optimality criterion is satisfied, so the present BFS is an optimal solution

x = ( 0  0  6
      0  4  7
      3  0  1 ).

The minimum cost is

f_min = 8 · 6 + 5 · 4 + 12 · 7 + 11 · 3 + 20 · 1 = 205.
Since in the last array there is a nonbasic cell, namely (3,2), for which c_32 = u_3 + v_2, by choosing a loop starting with this cell we obtain another optimal solution, as below:

              v_1 = 3        v_2 = 5        v_3 = 12
u_1 = −4      9 (−1) •       6 (1) •        8 (8) 6
u_2 = 0      10 (3) •        5 (5) 3       12 (12) 8
u_3 = 8      11 (11) 3      13 (13) 1      20 (20) •

The new optimal solution is:

x′ = ( 0  0  6
       0  3  8
       3  1  0 )

with the same minimal cost f_min = f(x′) = 205.
The general solution (see part b) of the previous remark) is:

x(t) = (1 − t)x + tx′ = ( 0     0       6
                          0   4 − t   7 + t
                          3     t     1 − t ),   t ∈ [0, 1].
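A quick numeric check, in Python, that every convex combination of the two optimal plans keeps the cost at 205 (data copied from Example 3; the helper names are our own):

```python
cost = [[9, 6, 8], [10, 5, 12], [11, 13, 20]]
x  = [[0, 0, 6], [0, 4, 7], [3, 0, 1]]   # first optimal solution
xp = [[0, 0, 6], [0, 3, 8], [3, 1, 0]]   # second optimal solution

def plan(t):
    # x(t) = (1 - t) x + t x'
    return [[(1 - t) * x[i][j] + t * xp[i][j] for j in range(3)]
            for i in range(3)]

def total_cost(p):
    return sum(cost[i][j] * p[i][j] for i in range(3) for j in range(3))
```

Since the objective is linear and both endpoints cost 205, total_cost(plan(t)) is 205 for every t in [0, 1].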
Marginal values in the balanced transportation problem
In this subsection we analyse how changes in the availabilities and requirements a_i and b_j affect the transportation cost in a balanced transportation problem.
The marginal value is the rate of change in the optimum objective value per unit change in the availabilities and requirements a_i and b_j.
Since the balance condition is necessary for feasibility, we cannot change only one quantity among a_1, . . . , a_m; b_1, . . . , b_n.
We will consider the following types of changes:
i) increased demand at demand center j and the same balancing increase in availability at source i
ii) increased availability at source p and decreased availability at source i by the same amount (this moves supply from source i to source p)
iii) increased demand at demand center q and decreased demand at demand center j by the same amount.
In all the cases presented above, all the other data of the balanced transportation problem remain the same.
Let x and (u, v) be optimal solutions of the primal and dual problems. Assume that x is nondegenerate. According to Remark 6 of the previous section, the marginal values in the three cases presented before are:
i) u_i + v_j
ii) u_p − u_i
iii) v_q − v_j.
Example. Consider the balanced transportation problem presented in the previous example. In the next array we consider the first optimal solution x.

              v_1 = 3        v_2 = 5        v_3 = 12
u_1 = −4      9 (−1) •       6 (1) •        8 (8) 6
u_2 = 0      10 (3) •        5 (5) 4       12 (12) 7
u_3 = 8      11 (11) 3      13 (13) •      20 (20) 1

The optimum transportation cost in this problem is 205.
According to the above discussion, if b_2 increases from its current value by 2 units and a_2 changes by the same amount (to keep the problem balanced), then the optimum objective value will change by 2(u_2 + v_2) = 2 · 5 = 10, taking the value 215.
We remark that if demand b_2 increases, the best place to create additional supplies to satisfy that additional demand is source 1 (it is the source with the smallest u_i). By shifting supply (2 units in our case) from source 2 to source 1 the company can save 8 monetary units, since the rate of change in the optimum transportation cost per unit shift is u_1 − u_2 = −4 − 0 = −4.
Remark. The previous discussion is valid for sufficiently small changes in b_j and a_i which do not affect the basis of the transportation problem. In this case both the initial and the modified transportation problems have the same basic variables, which determine the same dual variables in both problems, as needed in Remark 6 of the previous section.
Unbalanced transportation problems
So far we assumed that the total supply at all the sources is equal to the total demand at all the demand centers, that is,

∑_{i=1}^m a_i = ∑_{j=1}^n b_j,

so the system is in balance.
In many applications it may be impossible (or unprofitable) to ship all that is required, or the total supply either exceeds or is less than the total demand. Such problems are called unbalanced and can be solved by the transportation algorithm as below.
(a) Supply exceeds demand (overproduction)
In this case (∑_{i=1}^m a_i > ∑_{j=1}^n b_j), after all the demand is met, an amount of ∑_{i=1}^m a_i − ∑_{j=1}^n b_j will be left unused at the sources.
To solve this problem, we introduce a new column, n + 1, in the array. For each i, i = 1, m, the cell (i, n + 1) corresponds to the material left unused at source i. The cost coefficients of all the cells in this new column are equal to zero and

b_{n+1} = ∑_{i=1}^m a_i − ∑_{j=1}^n b_j.

In the optimum solution of this modified problem, the basic values in the cells of column n + 1 represent the unused material at the sources.
(b) Demand exceeds supply (underproduction)
In this case (∑_{i=1}^m a_i < ∑_{j=1}^n b_j) there is a shortage of ∑_{j=1}^n b_j − ∑_{i=1}^m a_i and we cannot meet all the demand with the existing supply.
To solve this problem, we introduce a new row, m + 1, in the array. Since we have to find how to distribute the existing supply to meet as much of the demand as possible, we introduce a dummy source with availability a_{m+1} = ∑_{j=1}^n b_j − ∑_{i=1}^m a_i. The cost coefficients of all the cells in this new row are equal to zero.
In the optimum solution of this modified problem, the basic values in the cells of row m + 1 represent unfulfilled demand at the demand centers.
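Both cases reduce to padding the data with a zero-cost dummy column or dummy row, which can be sketched as follows (the helper name is ours):

```python
def balance(cost, supply, demand):
    """Pad an unbalanced problem with a zero-cost dummy column or row."""
    cost = [row[:] for row in cost]
    supply, demand = list(supply), list(demand)
    gap = sum(supply) - sum(demand)
    if gap > 0:                       # supply exceeds demand: dummy column
        for row in cost:
            row.append(0)
        demand.append(gap)
    elif gap < 0:                     # demand exceeds supply: dummy row
        cost.append([0] * len(demand))
        supply.append(-gap)
    return cost, supply, demand
```

The padded problem is balanced and can be handed to the transportation algorithm unchanged; dummy-column allocations read off the unused supplies, dummy-row allocations the unfulfilled demands.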
Part II
Calculus
Chapter 3
One variable calculus
This part of the text provides an overview of one-variable calculus. One can cover this material either by taking the optional mathematics course or by reading it on one's own as a review of the calculus already taken in high school. The examples contained in this part should make the process relatively simple.
A central goal of economic theory is to express and analyse relationships between economic variables, which are described mathematically by functions.
A numerical function f : A → B, A, B ⊆ R,

A ∋ x → y = f(x) ∈ B,

is a rule which assigns one and only one value y = f(x) in B to each element x in A.
The variable x is called the independent variable or, in economic applications, the exogenous variable; y is called the dependent variable or, in economic applications, the endogenous variable.
The development of calculus by Isaac Newton (1642-1727) and Gottfried Wilhelm von Leibniz (1646-1716) resulted from studying certain mathematical problems such as:
• finding the tangent line to a curve at a given point
• finding the extreme values of certain functions
• finding the areas of planar regions bounded by arbitrary curves.
The study of the first two classes of problems mentioned before led to the creation of differential calculus. The study of the third class of problems led to the creation of integral calculus.
3.1 Diﬀerential calculus of one variable
The most important information in which we are interested concerns how a change
in one variable aﬀects the other. In the case when the relationships are expressed as
linear functions, the effect of a change of one variable on the other is expressed by
the "slope" of the function, but in the general case the effect of this change is
expressed by the derivative. In this section we will review some facts related to the
derivative of a one-variable function, focusing on its role in quantifying relationships
between variables. The derivative of a function is defined by using the notion of limit.
3.1.1 Limits and continuity
We will begin with the intuitive approach to the notion of a limit.
Let f : R \ {3} → R, f(x) = (9 − x²)/(3 − x).
Even if f(3) is not defined, f(x) can be calculated for any value of x near 3. Simple
computations show that if x approaches 3 from either the left or the right, then the
values f(x) approach 6. We say 6 is the limit of f as x approaches 3 and write

lim_{x→3} (9 − x²)/(3 − x) = 6   or   f(x) → 6 as x → 3.
Intuitively, the notion of f(x) approaching a number l as x approaches a number a
is described in the following way.
Let f : D → R, D ⊆ R, a ∈ D′ (see appendix...). If f(x) can be made arbitrarily
close to a number l by taking x sufficiently close to a (but different from a)
from both the left and the right side of a, then lim_{x→a} f(x) = l.
We shall use the notation x ր a (or x → a−) to denote that x approaches a from
the left and x ց a (or x → a+) to denote that x approaches a from the right.
If the limits lim_{xցa} f(x) and lim_{xրa} f(x) have a common value l, we say that
lim_{x→a} f(x) exists and write lim_{x→a} f(x) = l.
Intuitively, the notion of an infinite limit, lim_{x→a} f(x) = ∞, is as follows.
If f can be made arbitrarily large by taking x sufficiently close to a number a (but
different from a) from both the left and the right side of a, then lim_{x→a} f(x) = ∞.
The infinite limit lim_{x→a} f(x) = −∞ can be described in a similar manner.
The intuitive definitions are too vague to be of any use in proving theorems. A
proof of the existence of a limit can never be based on the previous intuitive approach.
To give a rigorous demonstration of the existence of a limit, or to prove results
concerning limits, we must now present the precise definition of a limit. This rigorous
definition (the ε–δ definition) is due to Augustin-Louis Cauchy. He used ε because of
the correspondence between epsilon and the French word "erreur", and δ because delta
corresponds to "différence".
Definition 1. (ε–δ definition of a limit)
Let f : A → R, a ∈ A′ ∩ R.
lim_{x→a} f(x) = l ∈ R means: for every ε > 0 there exists a δ > 0 such that |f(x) − l| < ε
whenever 0 < |x − a| < δ and x ∈ A.
To try to understand the meaning behind this abstract definition, see the diagram
below.
[Diagram: the interval (l − ε, l + ε) on the y-axis and the interval (a − δ, a + δ) on the x-axis, with the graph of f passing through the corresponding window.]
We first take an ε > 0 and represent on the y-axis the interval (l − ε, l + ε) around l.
We then determine an interval (a − δ, a + δ) around a so that for all x-values (excluding
a) inside the determined interval the corresponding values f(x) lie inside (l − ε, l + ε).
In general, the value of δ will depend on the value of ε. That is, we will always
begin with ε > 0 and then determine an appropriate corresponding value for δ > 0.
There are many values of δ which work. Once a value for δ is found, all smaller values
of δ also work.
In the next examples we will use the precise definition.
We will begin the proofs by letting ε > 0 be given.
Then we take the inequality |f(x) − l| < ε and from it we try to
determine an appropriate value for δ (which depends on ε) such that |x − a| < δ will
guarantee |f(x) − l| < ε.
Example 1. Prove that lim_{x→3} (4x − 2) = 10.
Solution. Given ε > 0, we want to determine a δ > 0 so that |x − 3| < δ will
guarantee |f(x) − 10| = |(4x − 2) − 10| < ε. It is natural to try to determine a
connection between |(4x − 2) − 10| and |x − 3|. We have

|f(x) − 10| = |(4x − 2) − 10| = |4x − 12| = 4|x − 3|.

So |f(x) − 10| < ε ⇔ 4|x − 3| < ε ⇔ |x − 3| < ε/4.
The choice of δ is now clear. If we let δ = ε/4 we have

|(4x − 2) − 10| < ε ⇔ |x − 3| < δ.

Putting all these together, we can write down the following proof.
Given any ε > 0, let δ = ε/4 > 0. Then for all x in the domain of the function f, if
0 < |x − 3| < δ, we have

|f(x) − 10| = |(4x − 2) − 10| = |4x − 12| = |4(x − 3)| = 4|x − 3| < 4δ = ε.

This completes the proof of lim_{x→3} (4x − 2) = 10.
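The proof above can be sanity-checked numerically. The following Python sketch (ours, not part of the original text) samples points inside (3 − δ, 3 + δ) with δ = ε/4 and verifies that |f(x) − 10| stays below ε:

```python
# Numeric sanity check of the epsilon-delta proof above (illustrative only):
# for several tolerances eps, the choice delta = eps/4 keeps |f(x) - 10| < eps
# for every sampled x with 0 < |x - 3| < delta.

def f(x):
    return 4 * x - 2

def delta_works(eps, delta, a=3.0, limit=10.0, samples=1000):
    """Check |f(x) - limit| < eps on a grid of x with 0 < |x - a| < delta."""
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)   # 0 < h < delta
        for x in (a - h, a + h):
            if abs(f(x) - limit) >= eps:
                return False
    return True

for eps in (1.0, 0.1, 0.001):
    assert delta_works(eps, eps / 4)
print("delta = eps/4 passed all sampled checks")
```

Such a check is not a proof, of course, but it illustrates how δ must shrink with ε, and that an over-large δ (say δ = 2 with ε = 1) fails.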
Example 2. Prove that lim_{x→4} 3/(x + 5) = 1/3.
Solution. We will start with an analysis of the problem. Observe that

|f(x) − 1/3| = |3/(x + 5) − 1/3| = |(4 − x)/(3(x + 5))| = |x − 4|/(3|x + 5|).

So,

|3/(x + 5) − 1/3| < ε ⇔ |x − 4|/(3|x + 5|) < ε ⇔ |x − 4| < 3ε|x + 5|.

We cannot just take δ = 3ε|x + 5|, since δ should not depend on x. What we need
here is a constant k so that, for x close enough to 4, 1/(3|x + 5|) ≤ k, which is
equivalent to |x + 5| ≥ a constant.
If |x − 4| < δ, then |x + 5| = |x − 4 + 9| ≥ 9 − |x − 4| > 9 − δ. If we take δ ≤ 1,
then the previous inequality becomes |x + 5| > 9 − δ ≥ 8.
In consequence we will have

1/(3|x + 5|) ≤ 1/24 and |3/(x + 5) − 1/3| = |x − 4|/(3|x + 5|) ≤ |x − 4|/24 < ε.

We observe that δ must satisfy two conditions: δ ≤ 1 and δ/24 ≤ ε.
We can now write down the following proof:
Let ε > 0 and δ = min{1, 24ε}. Then δ > 0, and if x is in the domain of the
function f and 0 < |x − 4| < δ, we have

|x + 5| = |x − 4 + 9| ≥ 9 − |x − 4| > 9 − δ ≥ 9 − 1 = 8

and in consequence we get

|3/(x + 5) − 1/3| = |(4 − x)/(3(x + 5))| = |x − 4|/(3|x + 5|) ≤ |x − 4|/24 < δ/24 ≤ 24ε/24 = ε.

This completes the proof of lim_{x→4} 3/(x + 5) = 1/3.
We will now present the ε–δ definitions for limits that involve infinity.
Definition 2. a) Let f : A → R and let a ∈ A′ ∩ R.
lim_{x→a} f(x) = ∞ means: for each ε > 0 there exists a δ > 0 such that f(x) > ε
whenever 0 < |x − a| < δ and x ∈ A.
lim_{x→a} f(x) = −∞ means: for each ε > 0 there exists a δ > 0 such that f(x) < −ε
whenever 0 < |x − a| < δ and x ∈ A.
b) Let f : A → R be such that there is an a ∈ R with (a, ∞) ⊂ A.
lim_{x→∞} f(x) = l ∈ R if for each ε > 0 there is a δ > 0 such that |f(x) − l| < ε
whenever x > δ and x ∈ A.
Similarly we can define lim_{x→−∞} f(x) = l ∈ R.
lim_{x→∞} f(x) = ∞ if for each ε > 0 there is a δ > 0 such that f(x) > ε whenever
x > δ and x ∈ A.
Similarly we can define lim_{x→∞} f(x) = −∞ and lim_{x→−∞} f(x) = ∞.
The following properties of limits, which we list without proof, enable us to
evaluate limits of functions algebraically.
Property 1. If f(x) = c, where c is a constant, then lim_{x→a} f(x) = c.
Property 1 states that the limit of the constant function f(x) = c at any point x = a
is equal to the value of the constant function, which is c.
Property 2. If f(x) = x, then lim_{x→a} f(x) = lim_{x→a} x = a.
Property 3. If lim_{x→a} f(x) = l and n ∈ R is such that (f(x))ⁿ is well defined, then

lim_{x→a} [f(x)]ⁿ = (lim_{x→a} f(x))ⁿ = lⁿ.

Property 3 states that the limit of the n-th power of a function is equal to the n-th
power of the limit of the function. For this result to be true we must assume that x
is chosen so that the n-th power of f is well defined for x close to a. For instance, if
n = 1/2 then f(x) cannot be negative.
Property 4. If lim_{x→a} f(x) = l ∈ R and k is a constant, then

lim_{x→a} (kf)(x) = k (lim_{x→a} f(x)) = kl.

Property 4 states that the limit of a constant times a function is equal to the
constant times the limit of the function.
Property 5. If lim_{x→a} f(x) = l ∈ R and lim_{x→a} g(x) = m ∈ R, then

lim_{x→a} (f ± g)(x) = lim_{x→a} f(x) ± lim_{x→a} g(x).

So, the limit of the sum or difference of two functions is equal to the sum or
difference of their limits. This result is easily extended to the case involving the sum
and/or difference of any finite number of functions.
Property 6. If lim_{x→a} f(x) = l ∈ R and lim_{x→a} g(x) = m ∈ R, then

lim_{x→a} (fg)(x) = lim_{x→a} f(x) · lim_{x→a} g(x).

The previous property can also easily be extended to the case involving the product
of any finite number of functions.
Property 7. If lim_{x→a} f(x) = l, lim_{x→a} g(x) = m and m ≠ 0, then

lim_{x→a} (f/g)(x) = (lim_{x→a} f(x)) / (lim_{x→a} g(x)) = l/m.
The previous rules tell us that the limit operations interact with all the basic
algebraic operations in a natural way.
Example 3. Compute the following limit:

lim_{x→2} (2x² + 1)(3x − 1)/(x + 4).

Solution.

lim_{x→2} (2x² + 1)(3x − 1)/(x + 4)
  =(P7) [lim_{x→2} (2x² + 1)(3x − 1)] / [lim_{x→2} (x + 4)]
  =(P6) [lim_{x→2} (2x² + 1) · lim_{x→2} (3x − 1)] / [lim_{x→2} (x + 4)]
  =(P5) [(lim_{x→2} 2x² + lim_{x→2} 1)(lim_{x→2} 3x − lim_{x→2} 1)] / [lim_{x→2} x + lim_{x→2} 4]
  =(P1,P4) [(2 lim_{x→2} x² + 1)(3 lim_{x→2} x − 1)] / [lim_{x→2} x + 4]
  =(P3,P2) [2 (lim_{x→2} x)² + 1](3·2 − 1) / (2 + 4)
  = (2·4 + 1)·5 / 6 = 45/6 = 15/2.
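Since the function in Example 3 is built from continuous pieces, its values near x = 2 should cluster around the limit 15/2 = 7.5. A quick numeric check (Python, ours) confirms this:

```python
# Quick numeric check of Example 3: evaluate the function at points
# approaching 2 and compare with the computed limit 15/2 = 7.5.

def g(x):
    return (2 * x**2 + 1) * (3 * x - 1) / (x + 4)

limit = 15 / 2
for h in (0.1, 0.01, 0.001):
    # the error shrinks roughly linearly with h near x = 2
    assert abs(g(2 + h) - limit) < 15 * h
    assert abs(g(2 - h) - limit) < 15 * h
print("g(x) -> 15/2 as x -> 2")
```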
In certain situations, the attempt to apply Property 7 leads to the expression "0/0"
(that is, both the numerator and the denominator have limit 0 at x = a). In this case
we say that the quotient f(x)/g(x) has the indeterminate form "0/0" at x = a.
To solve the problem, we have to replace the given function with another one that
takes the same values as the original function except at x = a, and evaluate the
limit of the latter. The next examples illustrate this process.
Example 4. Find lim_{x→2} (x² + 6x − 16)/(x² − 5x + 6).
Solution. The limit of the denominator is

lim_{x→2} (x² − 5x + 6) = 2² − 5·2 + 6 = 0.

So we cannot use Property 7. We check whether it is possible to simplify the given
function by factorizing the denominator and the numerator. The fact that the
denominator has limit zero suggests that 2 is a root of the denominator, and so x − 2
is a factor of the denominator. In the same way we can conclude that x − 2 is also a
factor of the numerator. Thus,

lim_{x→2} (x² + 6x − 16)/(x² − 5x + 6) = lim_{x→2} (x − 2)(x + 8)/((x − 2)(x − 3))
  = lim_{x→2} (x + 8)/(x − 3) = (2 + 8)/(2 − 3) = −10.
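As a numeric cross-check (Python, ours): after cancelling the factor x − 2 the quotient equals (x + 8)/(x − 3), which at x = 2 gives 10/(−1) = −10, and sampled values near 2 agree:

```python
# Numeric check of Example 4: (x^2 + 6x - 16)/(x^2 - 5x + 6) -> -10 as x -> 2.
# The function is undefined at x = 2 itself, so we only sample nearby points.

def q(x):
    return (x**2 + 6 * x - 16) / (x**2 - 5 * x + 6)

for h in (1e-3, 1e-5, 1e-7):
    assert abs(q(2 + h) - (-10)) < 0.1
    assert abs(q(2 - h) - (-10)) < 0.1
print("q(x) -> -10 as x -> 2")
```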
Example 5. Find lim_{x→4} (√(3x + 4) − 4)/(x − 4).
Solution. The limit is again of the form "0/0". The trouble this time is that it
might not be very clear how we can find the hidden factor x − 4 in the numerator.
The technique to be applied in this case is to rationalize by using the idea
of "the conjugate expression".
In our example, when √(3x + 4) − 4 is multiplied by √(3x + 4) + 4 we get

(√(3x + 4))² − 4² = 3x + 4 − 16 = 3(x − 4).

The square root disappears and a polynomial is obtained. Here are the details:

lim_{x→4} (√(3x + 4) − 4)/(x − 4) = lim_{x→4} (√(3x + 4) − 4)(√(3x + 4) + 4)/((x − 4)(√(3x + 4) + 4))
  = lim_{x→4} 3(x − 4)/((x − 4)(√(3x + 4) + 4)) = lim_{x→4} 3/(√(3x + 4) + 4)
  = 3/(√(3·4 + 4) + 4) = 3/8.
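The value 3/8 can likewise be confirmed by sampling near x = 4 (Python, ours):

```python
# Numeric check of Example 5: (sqrt(3x + 4) - 4)/(x - 4) -> 3/8 as x -> 4.
import math

def r(x):
    return (math.sqrt(3 * x + 4) - 4) / (x - 4)

for h in (1e-3, 1e-6):
    assert abs(r(4 + h) - 3 / 8) < 1e-3
    assert abs(r(4 - h) - 3 / 8) < 1e-3
print("r(x) -> 3/8 as x -> 4")
```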
Example 6. Compute lim_{x→∞} (3x² + 8x − 4)/(2x² + 4x − 5), if it exists.
Solution. Since the limits of both the numerator and the denominator are ∞,
Property 7 is not applicable. We will try to put the function into a form in which we
can find the required limit. By taking out as common factor x² (the highest power of
x appearing in the denominator) from both the numerator and the denominator, we
obtain

lim_{x→∞} (3x² + 8x − 4)/(2x² + 4x − 5) = lim_{x→∞} x²(3 + 8/x − 4/x²) / (x²(2 + 4/x − 5/x²))
  = lim_{x→∞} (3 + 8/x − 4/x²)/(2 + 4/x − 5/x²)
  = (lim_{x→∞} 3 + 8 lim_{x→∞} 1/x − 4 lim_{x→∞} 1/x²)/(lim_{x→∞} 2 + 4 lim_{x→∞} 1/x − 5 lim_{x→∞} 1/x²)
  = (3 + 8·0 − 4·0)/(2 + 4·0 − 5·0) = 3/2.

Observe that we have used lim_{x→∞} 1/x = lim_{x→∞} 1/x² = 0 in evaluating the second
and third terms of both the numerator and the denominator.
The previous remark can be generalized as follows.
Property 8. lim_{x→∞} 1/xⁿ = 0 for all n > 0 and lim_{x→−∞} 1/xⁿ = 0 for all n > 0,
provided that 1/xⁿ is defined.
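Property 8 and Example 6 together explain why only the leading coefficients survive at infinity; a small Python illustration (ours):

```python
# Numeric illustration of Property 8 and Example 6: since 1/x^n -> 0 as
# x -> infinity, the rational function below approaches the ratio 3/2 of
# the leading coefficients.

def s(x):
    return (3 * x**2 + 8 * x - 4) / (2 * x**2 + 4 * x - 5)

for x in (1e3, 1e6):
    for n in (1, 2, 3):
        assert 1 / x**n < 1e-2          # Property 8: 1/x^n -> 0
    assert abs(s(x) - 1.5) < 10 / x     # the error decays roughly like 1/x
print("s(x) -> 3/2 as x -> infinity")
```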
Some limits are best calculated by first finding the left- and right-hand limits.
Definition 3. (left-hand limit). Let f : A → R and let a ∈ A′ ∩ R.
lim_{xրa} f(x) = l ∈ R if for every number ε > 0 there is a number δ > 0 such that if
a − δ < x < a and x ∈ A then |f(x) − l| < ε.
lim_{xրa} f(x) = ∞ if for every number ε > 0 there is a number δ > 0 such that if
a − δ < x < a and x ∈ A then f(x) > ε.
Similarly, we can define lim_{xրa} f(x) = −∞.
Definition 4. (right-hand limit). Let f : A → R and let a ∈ A′ ∩ R.
lim_{xցa} f(x) = l ∈ R if for every number ε > 0 there is a number δ > 0 such that if
a < x < a + δ and x ∈ A then |f(x) − l| < ε.
lim_{xցa} f(x) = ∞ if for every number ε > 0 there is a number δ > 0 such that if
a < x < a + δ and x ∈ A then f(x) > ε.
Similarly, we can define lim_{xցa} f(x) = −∞.
Notice that Definition 3 is the same as Definition 1 except that x is restricted to
lie in the left half (a − δ, a) of the interval (a − δ, a + δ). In Definition 4, x is restricted
to lie in the right half (a, a + δ) of the interval (a − δ, a + δ).
The following theorem says that a limit exists if and only if both of the one-sided
limits exist and are equal.
Theorem 1. Let f : A → R and let a ∈ A′ ∩ R.
Then lim_{x→a} f(x) = l if and only if lim_{xցa} f(x) = lim_{xրa} f(x) = l.
Example 7. If

f(x) = { √(x − 6), if x > 6;  2x − 12, if x < 6, }

determine whether lim_{x→6} f(x) exists.
Solution. Since f(x) = √(x − 6) for x > 6, we have

lim_{xց6} f(x) = lim_{xց6} √(x − 6) = √(6 − 6) = 0.

Since f(x) = 2x − 12 for x < 6, we have

lim_{xր6} f(x) = lim_{xր6} (2x − 12) = 2·6 − 12 = 0.

The right- and left-hand limits are equal. Thus the limit exists and lim_{x→6} f(x) = 0.
By using limits we can define the notion of an asymptote of a real-valued function.
A linear asymptote is essentially a straight line which the graph of the function
approaches more and more closely without ever coinciding with it.
A function may have multiple asymptotes, of different kinds or of the same kind. One
such function, with a horizontal, a vertical and an oblique asymptote, is graphed below.
[Graph of f : R* → R with f(x) = 1/x + x for x > 0 and f(x) = 1/x for x < 0: it has the vertical asymptote x = 0, the horizontal asymptote y = 0 (as x → −∞) and the oblique asymptote y = x (as x → ∞).]
Definition 5. (asymptote)
a) horizontal asymptote
The line y = l ∈ R is called a horizontal asymptote of the curve y = f(x) if

lim_{x→∞} f(x) = l or lim_{x→−∞} f(x) = l.

b) vertical asymptote
The line x = a ∈ R is called a vertical asymptote of the curve y = f(x) if at least
one of the following statements is true:

lim_{x→a} f(x) = ∞ (or −∞); lim_{xրa} f(x) = ∞ (or −∞); lim_{xցa} f(x) = ∞ (or −∞).

c) oblique (or slant) asymptote
The line y = mx + n, m ≠ 0, is called an oblique (or slant) asymptote if

lim_{x→∞} (f(x) − (mx + n)) = 0 or lim_{x→−∞} (f(x) − (mx + n)) = 0.

In this case

m = lim_{x→∞} f(x)/x and n = lim_{x→∞} (f(x) − mx)

or

m = lim_{x→−∞} f(x)/x and n = lim_{x→−∞} (f(x) − mx).
In particular, a function y = f(x) can have at most two horizontal or two oblique
asymptotes (or one of each).
Example 8. Find the asymptotes of the graph of the function defined by

f(x) = √(2x² + 1)/(3x − 5).

Solution. First we determine the domain of f:

A = {x ∈ R | 2x² + 1 ≥ 0, 3x − 5 ≠ 0} = R \ {5/3}.

lim_{x→∞} √(2x² + 1)/(3x − 5) = lim_{x→∞} √(x²(2 + 1/x²)) / (x(3 − 5/x))
  = lim_{x→∞} x√(2 + 1/x²) / (x(3 − 5/x)) = lim_{x→∞} √(2 + 1/x²)/(3 − 5/x) = √2/3.

Therefore the line y = √2/3 is a horizontal asymptote of the graph of f.
In computing the limit as x → −∞, we must remember that for x < 0 we have
√(x²) = |x| = −x. So, when we take out x² as a common factor we have

√(2x² + 1) = √(x²(2 + 1/x²)) = |x| √(2 + 1/x²) = −x √(2 + 1/x²).
Therefore

lim_{x→−∞} √(2x² + 1)/(3x − 5) = lim_{x→−∞} −x√(2 + 1/x²) / (x(3 − 5/x))
  = − lim_{x→−∞} √(2 + 1/x²)/(3 − 5/x) = −√2/3.

Thus the line y = −√2/3 is also a horizontal asymptote. A vertical asymptote is
likely to occur where the denominator 3x − 5 is 0, that is at x = 5/3:

lim_{xց5/3} √(2x² + 1)/(3x − 5) = √(2(5/3)² + 1)/(+0) = ∞.

If x is close to 5/3 but x < 5/3, then 3x − 5 < 0 and so f(x) is large negative. Thus

lim_{xր5/3} √(2x² + 1)/(3x − 5) = −∞.

Since we already have two horizontal asymptotes, there are no oblique asymptotes
of f.
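All three asymptotes of this example can be seen numerically; a Python sketch (ours) evaluates f at large |x| and just to either side of 5/3:

```python
# Numeric check of Example 8: horizontal asymptotes y = +-sqrt(2)/3 and a
# vertical asymptote at x = 5/3 for f(x) = sqrt(2x^2 + 1)/(3x - 5).
import math

def f(x):
    return math.sqrt(2 * x**2 + 1) / (3 * x - 5)

c = math.sqrt(2) / 3
assert abs(f(1e8) - c) < 1e-6        # y =  sqrt(2)/3 as x -> +infinity
assert abs(f(-1e8) + c) < 1e-6       # y = -sqrt(2)/3 as x -> -infinity
assert f(5 / 3 + 1e-9) > 1e6         # f blows up to +infinity right of 5/3
assert f(5 / 3 - 1e-9) < -1e6        # f blows up to -infinity left of 5/3
print("asymptotes of Example 8 confirmed numerically")
```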
Example 9. Determine the horizontal and oblique asymptotes of the function
defined by

f(x) = √(x² + x) − x.

Solution.

D = {x ∈ R | x² + x ≥ 0} = {x ∈ R | x(x + 1) ≥ 0} = (−∞, −1] ∪ [0, ∞).

We compute first

lim_{x→∞} f(x) = lim_{x→∞} (√(x² + x) − x).

Both √(x² + x) and x are large when x is large, so it is very difficult to see what
happens to their difference. We will use algebra to rewrite the function. We first
multiply and divide by the conjugate radical:

lim_{x→∞} (√(x² + x) − x) = lim_{x→∞} (√(x² + x) − x) · (√(x² + x) + x)/(√(x² + x) + x)
  = lim_{x→∞} ((x² + x) − x²)/(√(x² + x) + x) = lim_{x→∞} x/(√(x² + x) + x)
  = lim_{x→∞} x/(x(√(1 + 1/x) + 1)) = lim_{x→∞} 1/(√(1 + 1/x) + 1) = 1/2.

So, y = 1/2 is a horizontal asymptote of f (as x → ∞).
Since

lim_{x→−∞} (√(x² + x) − x) = ∞ + ∞ = ∞,

there is no horizontal asymptote at −∞.
It remains to look for an oblique asymptote at −∞:

m = lim_{x→−∞} f(x)/x = lim_{x→−∞} (√(x² + x) − x)/x.

If in the previous limit we make the substitution y = −x, then y → ∞ and the
limit becomes

m = lim_{y→∞} (√((−y)² + (−y)) − (−y))/(−y) = − lim_{y→∞} (√(y² − y) + y)/y
  = − lim_{y→∞} y(√(1 − 1/y) + 1)/y = − lim_{y→∞} (√(1 − 1/y) + 1) = −2.

n = lim_{x→−∞} (f(x) − mx) = lim_{x→−∞} (f(x) + 2x) = lim_{x→−∞} (√(x² + x) + x)
  = lim_{y→∞} (√(y² − y) − y) = lim_{y→∞} (y² − y − y²)/(√(y² − y) + y)
  = lim_{y→∞} −y/(√(y² − y) + y) = lim_{y→∞} −y/(y(√(1 − 1/y) + 1))
  = lim_{y→∞} −1/(√(1 − 1/y) + 1) = −1/2.

In conclusion, the line y = −2x − 1/2 is a slant asymptote at −∞.
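Both asymptotes of this example can be verified by checking that the gap between f and each asymptote shrinks to 0 (Python, ours):

```python
# Numeric check of Example 9: f(x) = sqrt(x^2 + x) - x has the horizontal
# asymptote y = 1/2 at +infinity and the slant asymptote y = -2x - 1/2
# at -infinity.
import math

def f(x):
    return math.sqrt(x**2 + x) - x

assert abs(f(1e8) - 0.5) < 1e-6               # f -> 1/2 as x -> +infinity
for x in (-1e4, -1e8):
    assert abs(f(x) - (-2 * x - 0.5)) < 1e-3  # gap to slant asymptote -> 0
print("asymptotes of sqrt(x^2 + x) - x confirmed")
```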
Next, we want to look at another useful technique for finding limits. We start with
an example.
Example 10. Find lim_{x→0} x² cos(1/x).
Solution. We know from Property 6 that the limit of a product is the product of
the limits, provided that the limits of the factors exist. If we tried to apply this
result, we would say that the limit of x² cos(1/x) is the limit of x² times the limit of
cos(1/x). The problem is that the limit of cos(1/x) does not exist, so the limit property
for products cannot be applied. Indeed,

cos(1/(1/(2nπ))) = cos 2nπ = 1

for any nonzero natural number n, and

cos(1/(1/(π/2 + nπ))) = cos(π/2 + nπ) = 0

for any natural number n. Since the values of cos(1/x) do not approach a fixed number as x
approaches 0, lim_{x→0} cos(1/x) does not exist. Let us notice that even though cos(1/x)
oscillates, it oscillates between fixed bounds, namely −1 and 1. So, as long as x ≠ 0 we have

−1 ≤ cos(1/x) ≤ 1.

We multiply the previous inequality by x² and get

−x² ≤ x² cos(1/x) ≤ x², for all x ≠ 0.

Notice that x² is always positive, so when we multiply the inequality by it we
do not need to turn the inequality signs around.
As x → 0, both −x² → 0 and x² → 0. Being squeezed between two functions that
approach 0, the function x² cos(1/x) is forced to go to 0 too. That is,

lim_{x→0} x² cos(1/x) = 0.
The way we solved the above example suggests a general result.
Theorem 2. (Squeeze theorem). Suppose that f(x) ≤ g(x) ≤ h(x) for all x close
to a, except possibly at x = a. If lim_{x→a} f(x) = lim_{x→a} h(x) = l, then lim_{x→a} g(x) = l.
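The squeeze in Example 10 is easy to observe numerically; a short Python check (ours) verifies the two-sided bound near 0:

```python
# Numeric illustration of the squeeze theorem applied to x^2 cos(1/x):
# the function stays trapped between -x^2 and x^2, both of which vanish
# as x -> 0.
import math

def g(x):
    return x**2 * math.cos(1 / x)

for x in (0.1, -0.01, 1e-4):
    assert -x**2 <= g(x) <= x**2   # the squeeze bound
    assert abs(g(x)) <= x**2       # so |g(x)| -> 0 with x^2
print("x^2 cos(1/x) squeezed to 0")
```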
If we compute the limit of a polynomial f at a given point a, then the limit is simply
f(a). For instance,

lim_{x→2} (2x² + 4x + 1) = 2·2² + 4·2 + 1 = 17.

Functions with this property are called continuous at a. We will see that the
mathematical definition of continuity corresponds closely to the meaning of the
word continuity in everyday language.
Definition 6. Let f : A → R and a ∈ A.
a) If a ∈ A′ we say that f is continuous at a if and only if

lim_{x→a} f(x) = f(a).

If a ∈ A \ A′ (a is an isolated point of A), then f is continuous at a.
b) If a ∈ A′ we say that f is continuous from the right at a if

lim_{xցa} f(x) = f(a).

c) If a ∈ A′ we say that f is continuous from the left at a if

lim_{xրa} f(x) = f(a).
The previous deﬁnition says that f is continuous at an accumulation point a if
f(x) approaches f(a) as x approaches a. A continuous function f has the property
that a small change in x produces only a small change in f(x).
Geometrically, the graph of a continuous function at each point of a given interval
can be drawn without removing the pen from the paper.
We say that f is discontinuous at a, or f has a discontinuity at a, if f is not
continuous at a.
Let f : I →R where I is an interval on the real axis. The function f is continuous
on I if it is continuous at each point in the interval. If f is deﬁned only on one side
of an end point of the interval, we understand continuous at the end point to mean
continuous from the right or continuous from the left.
Instead of using Definition 6 to verify the continuity of a function, it is often
convenient to use the next theorem, which shows how to build up complicated continuous
functions from simple ones.
Theorem 3. a) If f and g are continuous at a and c is a constant, then the
following functions are also continuous at a:

f ± g, cf, fg and f/g if g(a) ≠ 0.

b) If g is continuous at a and f is continuous at g(a), then the composite function
f ◦ g (given by (f ◦ g)(x) = f(g(x))) is continuous at a.
c) The following types of functions are continuous at every point in their domains:
polynomials, rational functions, root functions, trigonometric functions, inverse
trigonometric functions, exponential functions and logarithmic functions.
Intuitively, part b) of the previous theorem is reasonable because if x is close
to a, then g(x) is close to g(a), and since f is continuous at g(a), f(g(x)) is close
to f(g(a)).
Example 11. Where is the following function continuous?

f : A → R, f(x) = 1/(√(x² + 16) − 5)

Solution. The function f is the composition of four continuous functions,

f = f₁ ◦ f₂ ◦ f₃ ◦ f₄,

where f₁(x) = 1/x, f₂(x) = x − 5, f₃(x) = √x and f₄(x) = x² + 16.
We know that each of these functions is continuous on its domain (by Theorem 3,
part c)), and so by Theorem 3 (part b)) f is continuous on its domain.
The domain A of f is

A = {x ∈ R | x² + 16 ≥ 0, √(x² + 16) ≠ 5} = {x ∈ R | x ≠ ±3} = (−∞, −3) ∪ (−3, 3) ∪ (3, ∞).
An important property of continuous functions is expressed by the following theorem.
Theorem 4. (The intermediate value theorem). Let f : [a, b] →R be a continuous
function on the closed interval [a, b] and let m be any number between f(a) and f(b).
Then there exists a number c in (a, b) such that f(c) = m.
The intermediate value theorem states that a continuous function takes on every
intermediate value between the function values f(a) and f(b).
Example 12. Show that there is a root of the equation

4x³ − 6x² + 3x − 2 = 0

between 1 and 2.
Solution. Let f : [1, 2] → R, f(x) = 4x³ − 6x² + 3x − 2. We are looking for a
number c between 1 and 2 such that f(c) = 0. Therefore we take a = 1, b = 2 and
m = 0 in the previous theorem.
We have

f(1) = 4·1³ − 6·1² + 3·1 − 2 = −1 < 0,
f(2) = 4·2³ − 6·2² + 3·2 − 2 = 12 > 0.

So m = 0 is a number between f(1) and f(2). Since f is continuous (as a polynomial
function), the intermediate value theorem says there is a number c between 1
and 2 such that f(c) = 0.
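The intermediate value theorem also justifies the bisection method for actually locating such a root: repeatedly halve an interval on which f changes sign. A minimal Python sketch of this idea (ours, not part of the text):

```python
# Bisection on [1, 2] for f(x) = 4x^3 - 6x^2 + 3x - 2, whose sign change
# (f(1) < 0 < f(2)) guarantees a root by the intermediate value theorem.

def f(x):
    return 4 * x**3 - 6 * x**2 + 3 * x - 2

def bisect(f, a, b, tol=1e-10):
    """Assumes f is continuous with f(a) and f(b) of opposite signs."""
    fa = f(a)
    while b - a > tol:
        c = (a + b) / 2
        fc = f(c)
        if fc == 0:
            return c
        if fa * fc < 0:       # the sign change lies in [a, c]
            b = c
        else:                 # the sign change lies in [c, b]
            a, fa = c, fc
    return (a + b) / 2

root = bisect(f, 1.0, 2.0)
assert 1 < root < 2 and abs(f(root)) < 1e-8
print(f"root near {root:.6f}")
```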
The intermediate value theorem plays an important role in the way computers draw
the graphs of continuous functions: a computer calculates a finite number of points
on the graph and turns on the pixels that contain these calculated points.
We end this subsection by presenting an important property of continuous functions
which will be used in the next sections.
Theorem 5. (The extreme value theorem). Let f : [a, b] → R be a continuous
function on [a, b]. Then there exist c, d ∈ [a, b] such that f(c) ≥ f(x) and f(d) ≤ f(x)
for all x ∈ [a, b].
The extreme value theorem says that a continuous function on a closed interval
has a maximum value and a minimum value, but it does not tell us how to find these
extreme values.
3.1.2 Rates of change and derivatives
Starting from the slope of a straight line, we first introduce the notion of
the slope of an arbitrary curve.
If a line passes through the points (x₀, y₀) and (x₁, y₁), then its slope is defined by
(see [??])

m = (y₁ − y₀)/(x₁ − x₀).

The numerator, y₁ − y₀, is the change in y which occurs when x changes from x₀
to x₁. Mathematicians often use the symbol ∆ to denote a change. Thus we write
y₁ − y₀ = ∆y and x₁ − x₀ = ∆x. Using this notation we have

m = (y₁ − y₀)/(x₁ − x₀) = ∆y/∆x.
The quantity ∆y/∆x tells us how fast y is changing with respect to x. It represents
the rate of change of y with respect to x. For example, if ∆y/∆x = 2, then y is increasing
twice as fast as x, while if ∆y/∆x = −2, then y is decreasing by two units as x increases
by one unit.
For example, if the straight line is the graph of the profits of a company, then the
slope represents the change in profits, which may be increasing or decreasing, rapidly
or slowly, depending on the sign and the size of the slope. The notion of rate of change
is fundamental in economics and includes topics such as changes in profit, the inflation
rate or the elasticity of demand.
However, practical situations in economics rarely generate straight-line graphs. In
consequence we have to extend the notion of slope to general curves. Even though the
slope of a straight line is a single number, we cannot expect a single number to represent
the steepness of a curve, which changes from point to point.
Suppose that the curve can be represented by the equation y = f(x) and the point
P by the coordinates (x₀, y₀). We will define the slope of the curve at P to be equal to
the slope of the tangent line to the curve at P. The word tangent is derived from the
Latin word "tangens", which means "touching". Thus a tangent to a curve is a line
that touches the curve. In other words, a tangent line should have the same direction
as the curve at the point of contact. We will assume initially that for our curve the
tangent line exists (it does not always exist).
Choose a second point on the curve reasonably close to P, say Q(x, y). The line
through the two points P and Q is called the secant line, and its slope is easily found:

m_sec = (y − y₀)/(x − x₀).
We can define the slope m of the tangent line as the limit of the slope of the secant
line as x approaches x₀:

m = lim_{x→x₀} (y − y₀)/(x − x₀) = lim_{x→x₀} (f(x) − f(x₀))/(x − x₀),

provided this limit exists.
The expression (f(x) − f(x₀))/(x − x₀) measures the average rate of change of y = f(x)
with respect to x over the interval [x₀, x] and provides an approximation to the
rate of change of the function f at x₀. The approximation becomes better and better
as the intervals become shorter and shorter. This leads to the following definition of
the rate of change of f at x₀:

lim_{x→x₀} (f(x) − f(x₀))/(x − x₀) (provided that the limit exists). (1)
[Diagram: the secant line through P(x₀, y₀) and Q(x, y) on the graph of f, together with the tangent line at P.]
The rate of change of a function f at x₀ is often called the instantaneous rate of
change of f at x₀. Thus, the previous limit measures both the slope of the tangent
line to the graph of f at the point P(x₀, y₀) and the (instantaneous) rate of change
of the function f at x₀.
Example 1. What is the slope of the parabola y = x² at the point P(2, 4)?
Solution.

m_tan = lim_{x→2} (f(x) − f(2))/(x − 2) = lim_{x→2} (x² − 2²)/(x − 2)
  = lim_{x→2} (x − 2)(x + 2)/(x − 2) = lim_{x→2} (x + 2) = 4.

[Diagram: the parabola y = x² with its tangent line at P(2, 4).]
The equation of the tangent line can be obtained by using the point–slope formula

y − y₀ = m(x − x₀)

(see [??]) with x₀ = 2, y₀ = f(2) = 4 and m = 4.
In consequence the equation will be y − 4 = 4(x − 2), or y = 4x − 4.
Example 2. Find the slope at the origin of the curve with equation y = |x|.
Solution. We can rewrite the expression of the given curve with a split formula:

y = { x, if x ≥ 0;  −x, if x < 0. }

[Diagram: the graph of y = |x| with P(0, 0), and the points Q(x, x) on the right branch and Q(x, −x) on the left branch.]

First let Q approach P from the right, so that the coordinates of Q are (x, x). The
slope of any such secant line is

(y − y₀)/(x − x₀) = (x − 0)/(x − 0) = 1,

so the right-hand limit of these slopes is also 1. However, when Q approaches P from
the left, its coordinates are (x, −x), which produces a left-hand limit of −1. Since
the right- and left-hand limits are not equal, the slope at the origin is not defined.
In fact, any function whose graph has a "sharp corner" fails to admit a tangent line
at that corner.
Since a limit of the form (1) occurs whenever we calculate a rate of change in
science and engineering, it is given a special name and notation.
Definition 1. Let f : A → R, a ∈ A′.
The derivative of the function f at the point a, denoted by f′(a), is

f′(a) = lim_{x→a} (f(x) − f(a))/(x − a) (2)

if the limit exists and takes a finite value.
If we write x = a + h, then h = x − a, and h approaches 0 if and only if x approaches
a. Therefore, an equivalent way of stating the definition of the derivative is

f′(a) = lim_{h→0} (f(a + h) − f(a))/h. (3)
We say that the function f is differentiable at a if f admits a finite derivative at a.
We say that the function f is differentiable on the set A if f is differentiable at
each point of A. In this case we can define the derivative function

f′ : A → R, x → f′(x) = lim_{h→0} (f(x + h) − f(x))/h.
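Definition (3) also suggests a practical way to approximate derivatives: evaluate the difference quotient for a small h. A Python sketch (ours) applied to the quadratic of Example 3 below:

```python
# A numeric rendering of definition (3): the difference quotient
# (f(x + h) - f(x))/h approaches f'(x) as h shrinks.

def diff_quotient(f, x, h):
    return (f(x + h) - f(x)) / h

f = lambda x: x**2 - 4 * x + 2      # Example 3 below computes f'(a) = 2a - 4
for a in (-1.0, 0.0, 2.5):
    exact = 2 * a - 4
    assert abs(diff_quotient(f, a, 1e-6) - exact) < 1e-4
print("difference quotients match f'(a) = 2a - 4")
```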
Mathematicians from the seventeenth to the nineteenth century believed that a
continuous function usually possessed a derivative. In 1872 the German mathematician
Karl Weierstrass destroyed this tenet by publishing an example of a function
that is continuous at every real number but nowhere differentiable.
The converse implication, however, does hold.
Theorem 1. Let f : A → R and a ∈ A′. If f is differentiable at a, then f is
continuous at a.
Example 3. Find the derivative of the function f : R → R, f(x) = x² − 4x + 2
at the point a ∈ R.
Solution. From (3) we have

f′(a) = lim_{h→0} (f(a + h) − f(a))/h = lim_{h→0} ([(a + h)² − 4(a + h) + 2] − [a² − 4a + 2])/h
  = lim_{h→0} (a² + 2ah + h² − 4a − 4h + 2 − a² + 4a − 2)/h
  = lim_{h→0} (2ah + h² − 4h)/h = lim_{h→0} (2a + h − 4) = 2a − 4.
Example 4. Suppose C(x) = 8000 + 200x − 0.2x² (0 ≤ x ≤ 400) is the total cost
that a company incurs in producing x units of a certain commodity.
a) What is the actual cost incurred in manufacturing the 251st unit?
b) Find the rate of change of the total cost with respect to x when x = 250.
c) Compare the results obtained in parts a) and b).
Solution. a) The actual cost of producing the 251st unit is the difference between
the total cost incurred in producing the first 251 units and the total cost of producing
the first 250 units. Thus, the actual cost is given by

C(251) − C(250) = 99.80.

b) The rate of change of the total cost function C with respect to x is given by
the derivative of C at the point 250:

C′(250) = lim_{h→0} (C(250 + h) − C(250))/h
  = lim_{h→0} ([8000 + 200(250 + h) − 0.2(250 + h)²] − [8000 + 200·250 − 0.2·250²])/h
  = lim_{h→0} (200h − 0.2h² − 0.2·500h)/h = lim_{h→0} (200 − 0.2h − 0.2·500) = 200 − 100 = 100.

c) From the solution of part a) we know that the actual cost of producing the
251st unit of the commodity is 99.80. This is very closely approximated by the answer
to part b), which is 100.
To explain this, we observe that

C′(250) = lim_{h→0} (C(250 + h) − C(250))/h ≈ (C(250 + h) − C(250))/h

for h sufficiently small.
Taking h = 1 (which is small compared to 250) we have

C′(250) ≈ C(251) − C(250).
The cost of producing an additional unit of a certain commodity is called the
marginal cost. If C(x) is the total cost function for producing x units of a certain
commodity, then the marginal cost of producing one additional unit is C(x + 1) − C(x).
This quantity can be approximated, as in the previous example, by the rate of change:

C′(x) ≈ C(x + 1) − C(x).

For this reason, economists have defined the marginal cost function to be the
derivative of the corresponding total cost function. Thus the word "marginal" is
synonymous with "derivative of".
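The marginal-cost approximation of Example 4 can be sketched directly in Python (ours; the hand-computed derivative C′(x) = 200 − 0.4x follows from the rules below):

```python
# Marginal cost for the total cost function of Example 4 (same units as
# the text): C'(x) approximates the cost C(x + 1) - C(x) of one more unit.

def C(x):
    return 8000 + 200 * x - 0.2 * x**2

def C_prime(x):
    return 200 - 0.4 * x              # derivative computed by hand

actual = C(251) - C(250)              # cost of the 251st unit: 99.80
assert abs(actual - 99.80) < 1e-6
assert abs(C_prime(250) - 100) < 1e-9
assert abs(C_prime(250) - actual) < 0.5   # marginal cost ~ next-unit cost
print("C'(250) = 100 approximates C(251) - C(250) = 99.80")
```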
When we try to apply the definition of the derivative directly, we encounter
difficulties of an algebraic nature. The next example shows how much work is needed to
compute the derivative of even a relatively simple function.
Example 5. Find f′(x) for f : R \ {2} → R, f(x) = (2x + 1)/(x − 2).
Solution.

f′(x) = lim_{h→0} (f(x + h) − f(x))/h = lim_{h→0} ((2x + 2h + 1)/(x + h − 2) − (2x + 1)/(x − 2))/h
  = lim_{h→0} ((2x + 2h + 1)(x − 2) − (x + h − 2)(2x + 1))/((x + h − 2)(x − 2)h)
  = lim_{h→0} −5h/((x + h − 2)(x − 2)h) = lim_{h→0} −5/((x + h − 2)(x − 2)) = −5/(x − 2)².
So, the technical aspects of differentiation can be involved. We need other techniques
for computing derivatives which avoid using the formal definition. For this, we
consider the various ways in which two functions may be combined to form a new
function. The techniques for handling such combinations are generally known as the
rules of differentiation. The most important of them are the following.
Rule 1. Constant multiple rule
The derivative of cf (where c is a constant) is cf′: (cf)′ = cf′.
Rule 2. Sum rule
The derivative of f + g is f′ + g′: (f + g)′ = f′ + g′.
Rule 3. Difference rule
The derivative of f − g is f′ − g′: (f − g)′ = f′ − g′.
Rule 4. Product rule
The derivative of fg is f′g + fg′: (fg)′ = f′g + fg′.
Rule 5. Quotient rule
The derivative of f/g is (f′g − fg′)/g²: (f/g)′ = (f′g − fg′)/g².
Rule 6. Chain rule
The derivative of the composite function f ◦ g is (f′ ◦ g)·g′:
(f ◦ g)′ = (f′ ◦ g)·g′, so that (f ◦ g)′(x) = f′(g(x))·g′(x).
The general rules presented above allow us to compute derivatives of complicated functions constructed from basic ones.
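As a quick numerical sanity check of these rules, one can compare each one against a symmetric difference quotient; a minimal sketch, with arbitrarily chosen sample functions:

```python
import math

def num_deriv(f, x, h=1e-6):
    # symmetric difference quotient, approximates f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

f = math.sin           # f'(x) = cos x
g = lambda x: x ** 2   # g'(x) = 2x
x = 1.3

# Product rule: (fg)' = f'g + fg'
assert abs(num_deriv(lambda t: f(t) * g(t), x)
           - (math.cos(x) * g(x) + f(x) * 2 * x)) < 1e-5

# Chain rule: (f o g)' = (f' o g) * g'
assert abs(num_deriv(lambda t: f(g(t)), x)
           - math.cos(g(x)) * 2 * x) < 1e-5
```

Both assertions pass: the numerical derivative of the combined function agrees with the value predicted by the rule.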
Example 6. Differentiate f : R → R, f(x) = (x³ − 2x² + 4)(8x² + 5x).
Solution. From the product rule we have
f′(x) = (x³ − 2x² + 4)′(8x² + 5x) + (x³ − 2x² + 4)(8x² + 5x)′
= (3x² − 4x)(8x² + 5x) + (x³ − 2x² + 4)(16x + 5)
Example 7. Differentiate f : D → R, f(x) = (3x² − 1)/(2x³ + 5x² + 7).
Solution. From the quotient rule we have
f′(x) = [(3x² − 1)′(2x³ + 5x² + 7) − (3x² − 1)(2x³ + 5x² + 7)′] / (2x³ + 5x² + 7)²
= [6x(2x³ + 5x² + 7) − (3x² − 1)(6x² + 10x)] / (2x³ + 5x² + 7)²
= (−6x⁴ + 6x² + 52x) / (2x³ + 5x² + 7)²
Example 8. Find the derivative of h,
h : R → R, h(x) = (x³ + 6x² − 5x + 2)⁷.
Solution. We will split h into its constituent parts:
h(x) = [g(x)]⁷, where g(x) = x³ + 6x² − 5x + 2
= (f ∘ g)(x), where f(x) = x⁷.
By the chain rule, we get
h′(x) = f′(g(x)) · g′(x) = 7g⁶(x) · g′(x)
= 7(x³ + 6x² − 5x + 2)⁶(3x² + 12x − 5)
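The result of Example 8 can be sanity-checked against a difference quotient (a quick numerical sketch):

```python
# Checking h'(x) = 7(x^3 + 6x^2 - 5x + 2)^6 (3x^2 + 12x - 5) numerically.

def h(x):
    return (x ** 3 + 6 * x ** 2 - 5 * x + 2) ** 7

def h_prime(x):
    g = x ** 3 + 6 * x ** 2 - 5 * x + 2
    return 7 * g ** 6 * (3 * x ** 2 + 12 * x - 5)

x, step = 0.7, 1e-6
numeric = (h(x + step) - h(x - step)) / (2 * step)
assert abs(numeric - h_prime(x)) / abs(h_prime(x)) < 1e-5
```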
Example 9. Find the derivative of h,
h : R → R, h(x) = √(6x⁶ + x² + 4).
Solution. We first split h into its constituent functions:
h(x) = √(g(x)), where g(x) = 6x⁶ + x² + 4
= (f ∘ g)(x), where f(x) = √x.
By the chain rule,
h′(x) = f′(g(x)) · g′(x) = g′(x) / (2√(g(x)))
= (36x⁵ + 2x) / (2√(6x⁶ + x² + 4))
Example 10. Find the derivative of h(x) = √(x² + sin² x).
Solution.
h′(x) = (x² + sin² x)′ / (2√(x² + sin² x))
= (2x + 2 sin x (sin x)′) / (2√(x² + sin² x))
= (x + sin x cos x) / √(x² + sin² x)
Example 11. Differentiate f,
f : R* → R, f(x) = e^(1/x³).
Solution.
f′(x) = e^(1/x³) · (1/x³)′ = e^(1/x³) · (x⁻³)′ = e^(1/x³) · (−3x⁻⁴) = −3e^(1/x³)/x⁴
Example 12. Differentiate f,
f : R → R, f(x) = ln(e^(4x) + e^(−4x)).
Solution.
f′(x) = (ln(e^(4x) + e^(−4x)))′ = (e^(4x) + e^(−4x))′ / (e^(4x) + e^(−4x))
= [e^(4x)(4x)′ + e^(−4x)(−4x)′] / (e^(4x) + e^(−4x))
= (4e^(4x) − 4e^(−4x)) / (e^(4x) + e^(−4x))
Example 13. Let f : R → R, f(x) = (x² − 4)³.
Compute f′(2) in two different ways.
Solution. One way of computing f′(2) is to use the definition of the derivative at a given point:
f′(2) = lim_{x→2} (f(x) − f(2))/(x − 2) = lim_{x→2} ((x² − 4)³ − 0)/(x − 2)
= lim_{x→2} (x − 2)³(x + 2)³/(x − 2) = lim_{x→2} (x − 2)²(x + 2)³ = 0 · 4³ = 0
The second way is to determine first the derivative function f′ and then evaluate it at x = 2:
f′(x) = [(x² − 4)³]′ = 3(x² − 4)²(x² − 4)′ = 3(x² − 4)² · 2x = 6x(x² − 4)²
So,
f′(2) = 6 · 2 · (2² − 4)² = 0,
as we expected.
If f is a differentiable function, then its derivative f′ is also a function, so f′ may have a derivative of its own, denoted by (f′)′ = f′′. This new function f′′ (if it exists) is called the second derivative of f because it is the derivative of the derivative of f.
Example 14. If f : R → R, f(x) = x sin x, find f′′.
Solution. Using the product rule, we have
f′(x) = x′ sin x + x(sin x)′ = sin x + x cos x
To find f′′ we differentiate f′:
f′′(x) = (sin x + x cos x)′ = (sin x)′ + (x cos x)′
= cos x + x′ cos x + x(cos x)′ = cos x + cos x − x sin x
= 2 cos x − x sin x.
The third derivative f′′′ is the derivative of the second derivative: f′′′ = (f′′)′. The process can be continued.
The fourth derivative is usually denoted by f⁽⁴⁾.
In general, the nth derivative of f is denoted by f⁽ⁿ⁾ and is obtained from f by differentiating n times.
We will end this subsection by presenting l'Hospital's rule for computing limits. L'Hospital's rule gives a method for calculating limits of fractions in which the numerator and the denominator both go to zero or both go to infinity.
These forms of a limit are said to be indeterminate, since we cannot say in advance what the limit will be.
In an indeterminate form "0/0" we do not know "how fast" each part goes to 0. If the numerator goes to zero "faster" than the denominator, we can expect the limit to be zero. But if the denominator goes to zero "faster" than the numerator, the fraction will become large. Finally, if the numerator and the denominator go to zero equally fast, then the limit will be a nonzero real number. In any case, the limit cannot be determined just by looking at the form 0/0.
Theorem 1. (l'Hospital's rule for 0/0)
Let f and g be functions and a ∈ R. If
(a) f and g are differentiable in some interval (a − h, a + h) with h > 0,
(b) lim_{x→a} f(x) = 0 = lim_{x→a} g(x) and
(c) lim_{x→a} f′(x)/g′(x) exists (allowing the limits +∞ and −∞),
then the limit lim_{x→a} f(x)/g(x) exists and
lim_{x→a} f(x)/g(x) = lim_{x→a} f′(x)/g′(x).
The rule works also for limits of the indeterminate form ∞/∞.
Theorem 2. (l'Hospital's rule for ∞/∞). If f and g are functions that satisfy (a) and (c) above together with
(b') lim_{x→a} |f(x)| = ∞ = lim_{x→a} |g(x)|,
then
lim_{x→a} f(x)/g(x) = lim_{x→a} f′(x)/g′(x).
Either form of l'Hospital's rule works for limits at infinity, lim_{x→∞} f(x)/g(x) or lim_{x→−∞} f(x)/g(x), as well as for one-sided limits. In each case, condition (a) has to be adapted correspondingly.
For example,
lim_{x→∞} f(x)/g(x) = lim_{x→∞} f′(x)/g′(x)
if
(a) f and g are differentiable on some interval (b, ∞),
(b) lim_{x→∞} f(x) = 0 = lim_{x→∞} g(x) or lim_{x→∞} |f(x)| = ∞ = lim_{x→∞} |g(x)| and
(c) lim_{x→∞} f′(x)/g′(x) exists.
The following limits are all of the form 0/0 or ∞/∞, but their answers are all different.
Example 15. Find lim_{x→0} (x − sin x)/x³.
Solution.
lim_{x→0} (x − sin x)/x³  [0/0]  = lim_{x→0} (x − sin x)′/(x³)′ = lim_{x→0} (1 − cos x)/(3x²)
if the last limit exists. But the last limit is also of the indeterminate form 0/0, so we can try l'Hospital's rule one more time:
lim_{x→0} (1 − cos x)/(3x²) = lim_{x→0} (1 − cos x)′/(3x²)′ = lim_{x→0} sin x/(6x)  [0/0]  = lim_{x→0} cos x/6 = 1/6
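Numerically, the convergence of (x − sin x)/x³ to 1/6 can be observed by evaluating the quotient at shrinking values of x (a small sketch):

```python
import math

# The quotient (x - sin x)/x^3 approaches 1/6 = 0.1666... as x shrinks.
def quotient(x):
    return (x - math.sin(x)) / x ** 3

for x in (0.1, 0.01, 0.001):
    print(x, quotient(x))
```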
Example 16.
lim_{x→0} sin x/x³  [0/0]  = lim_{x→0} (sin x)′/(x³)′ = lim_{x→0} cos x/(3x²) = 1/(+0) = +∞
Example 17.
lim_{x→∞} e^x/(x² + 4x + 2)  [∞/∞]  = lim_{x→∞} (e^x)′/(x² + 4x + 2)′ = lim_{x→∞} e^x/(2x + 4)  [∞/∞]
= lim_{x→∞} (e^x)′/(2x + 4)′ = lim_{x→∞} e^x/2 = ∞.
There are other indeterminate forms, as we can see in the following examples.
a) lim_{x↘0} √x ln x is of the form [0 · ∞]
b) lim_{x↘0} x^(sin x) is of the form [0⁰]
c) lim_{x→∞} (e^x − x)^(1/x²) is of the form [∞⁰]
d) lim_{x→∞} ((x² + 1)/x²)^(2x) is of the form [1^∞]
e) lim_{x→0} (1/x − sin x/x²) is of the form [∞ − ∞].
To find the limits of these indeterminate forms, we can rewrite the functions as quotients, as we can see below.
Example 18. Find lim_{x↘0} √x ln x.
Solution. The limit is of the indeterminate form 0 · ∞. We rewrite it as a quotient as follows:
lim_{x↘0} √x ln x = lim_{x↘0} ln x / x^(−1/2).
Now, the limit is of the indeterminate form ∞/∞. We use l'Hospital's rule:
lim_{x↘0} √x ln x = lim_{x↘0} ln x / x^(−1/2) = lim_{x↘0} (ln x)′ / (x^(−1/2))′
= lim_{x↘0} (1/x) / (−(1/2)x^(−3/2)) = −2 lim_{x↘0} x^(1/2) = 0
Example 19. Find lim_{x→0} (1/x − sin x/x²).
Solution. The limit is of the indeterminate form "∞ − ∞". We first rewrite 1/x − sin x/x² in the form of a quotient:
1/x − sin x/x² = (x − sin x)/x².
Now, the limit
lim_{x→0} (1/x − sin x/x²) = lim_{x→0} (x − sin x)/x²  [0/0]  = lim_{x→0} (x − sin x)′/(x²)′
= lim_{x→0} (1 − cos x)/(2x)  [0/0]  = lim_{x→0} (1 − cos x)′/(2x)′ = lim_{x→0} sin x/2 = 0.
The forms 0⁰, 1^∞ and ∞⁰ are indeterminate powers.
We will use the exponential and logarithmic functions to convert them into an indeterminate product.
Example 20. Find lim_{x↘0} x^(sin x).
Solution. By using the equality a = e^(ln a) we can write
x^(sin x) = e^(ln x^(sin x)) = e^(sin x ln x)
Hence,
lim_{x↘0} x^(sin x) = lim_{x↘0} e^(sin x ln x) = e^(lim_{x↘0} sin x ln x).
The last limit is of the indeterminate form 0 · ∞. So, we rewrite it as an indeterminate quotient and use l'Hospital's rule:
lim_{x↘0} sin x ln x  [0·∞]  = lim_{x↘0} ln x / (1/sin x)  [∞/∞]  = lim_{x↘0} (1/x) / (−cos x/sin² x)
= lim_{x↘0} −(sin x/x) · tg x = −lim_{x↘0} (sin x/x) · lim_{x↘0} tg x = −1 · 0 = 0.
In the previous equality we used lim_{x→0} sin x/x = 1. Indeed,
lim_{x→0} sin x/x  [0/0]  = lim_{x→0} cos x/1 = 1.
Finally, we obtain that lim_{x↘0} x^(sin x) = e⁰ = 1.
Example 21. Find lim_{x→∞} ((x² + 1)/x²)^(2x).
Solution.
lim_{x→∞} ((x² + 1)/x²)^(2x) = lim_{x→∞} e^(ln ((x²+1)/x²)^(2x)) = lim_{x→∞} e^(2x ln ((x²+1)/x²)).
The last limit is of the indeterminate form [∞ · 0].
lim_{x→∞} 2x ln((x² + 1)/x²)  [∞·0]  = lim_{x→∞} ln((x² + 1)/x²) / (1/(2x))  [0/0]
= lim_{x→∞} (2x/(x² + 1) − 2x/x²) / (−1/(2x²))
= lim_{x→∞} [(2x³ − 2x³ − 2x)/((x² + 1)x²)] / (−1/(2x²))
= lim_{x→∞} 4x/(x² + 1)  [∞/∞]  = lim_{x→∞} 4/(2x) = 0.
In consequence we have
lim_{x→∞} ((x² + 1)/x²)^(2x) = e⁰ = 1.
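The (rather slow) convergence in Example 21 can be observed numerically (a small sketch):

```python
# ((x^2 + 1)/x^2)^(2x) tends to 1 as x grows; for moderate x the value is
# still noticeably above 1, illustrating why the 1^inf form is delicate.

def f(x):
    return ((x ** 2 + 1) / x ** 2) ** (2 * x)

for x in (10.0, 100.0, 1000.0):
    print(x, f(x))
```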
We end this section by mentioning that l'Hospital's rule was first published in 1696 in the Marquis de l'Hospital's calculus textbook "Analyse des infiniment petits". The rule was actually discovered in 1694 by the Swiss mathematician Johann Bernoulli. This was possible because the Marquis de l'Hospital had bought the rights to Bernoulli's mathematical discoveries.
3.1.3 Linear approximation and differentials
A curve lies very close to its tangent line near the point of tangency. This observation is the basis for a method of finding approximate values of functions. We use the tangent line at (a, f(a)) as an approximation to the curve y = f(x) when x is near a.
The equation of the tangent line at the point (a, f(a)) is
y = f(a) + f′(a)(x − a)
and the corresponding approximation is
f(x) ≈ f(a) + f′(a)(x − a)   (1)
The relation (1) is called the linear approximation of f at a.
Example 1. Find the linearization of the function f : [−3, ∞) → R,
f(x) = √(x + 3) at a = 1 and use it to approximate the numbers √3.98 and √4.02.
Solution. The derivative of f is
f′ : (−3, ∞) → R, f′(x) = 1/(2√(x + 3)),
and so we have f(1) = 2 and f′(1) = 1/4.
The approximation formula will be
f(x) ≈ f(1) + f′(1)(x − 1) (when x is near 1)
√(x + 3) ≈ 2 + (1/4)(x − 1)
√(x + 3) ≈ 7/4 + x/4.
In particular we have
√3.98 = √(0.98 + 3) ≈ 7/4 + 0.98/4 = 1.995
√4.02 = √(1.02 + 3) ≈ 7/4 + 1.02/4 = 8.02/4 = 2.005
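The two approximations in Example 1 can be checked against the true square roots (a small sketch):

```python
import math

# Linear approximation sqrt(x + 3) ~ 7/4 + x/4 near x = 1 (Example 1).
def approx(x):
    return 7 / 4 + x / 4

print(approx(0.98), math.sqrt(3.98))   # 1.995 vs the true value
print(approx(1.02), math.sqrt(4.02))   # 2.005 vs the true value
```

The approximation error is of the order 10⁻⁵, since x = 0.98 and x = 1.02 are very close to a = 1.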
The linear approximation is illustrated in the figure below: the tangent line y = 7/4 + x/4 touches the curve y = √(x + 3) at the point (1, 2).
We can see that the tangent line approximation is a good approximation to the given function when x is near 1.
These approximation ideas are formulated in the terminology of differentials. If y = f(x), where f is a differentiable function at a, then the differential of f at the point a is the following function:
df_(a) : R → R, defined by df_(a)(h) = f′(a)h   (2)
Sometimes in the previous relation we use dx instead of h and dy instead of df_(a)(h), so we have
dy = f′(a)dx.
The geometric meaning of differentials is shown in the figure below: P(a, f(a)) and Q(a + ∆x, f(a + ∆x)) are points on the graph of f, and dx = ∆x. The corresponding change in y is ∆y = f(a + ∆x) − f(a). The slope of the tangent line PR is the derivative f′(a).
df_(a)(dx) = dy represents the change in the linearization, whereas ∆y represents the change in the function. The approximation ∆y ≈ dy becomes better as ∆x = dx becomes smaller. For complicated functions it may be impossible to compute ∆y exactly. In such cases the approximation by differentials is useful:
f(a + dx) ≈ f(a) + f′(a)dx.   (3)
3.1.4 Extreme values of a real valued function
Some of the most important applications of differential calculus are optimization problems. These problems can be reduced to finding the maximum or minimum values of a function.
Deﬁnition 1. Let f : A →R and a ∈ A.
The function f has a global maximum at a if f(a) ≥ f(x) for all x in A. The
number f(a) is called the maximum value of f on A.
The function f has a global minimum at a if f(a) ≤ f(x) for all x in A. The
number f(a) is called the minimum value of f on A.
The maximum and minimum values of f are called the extreme values of f.
Deﬁnition 2. Let f : A →R and a ∈ A.
The function f has a local maximum at a if f(a) ≥ f(x) where x is near a. (This
means that f(a) ≥ f(x) for all x in some open interval containing a).
The function f has a local minimum at a if f(a) ≤ f(x) where x is near a.
Example 1. Determine the extreme values of the function
f : R → R, f(x) = sin x.
Solution. Since −1 ≤ sin x ≤ 1 for all x ∈ R and
sin(π/2 + 2nπ) = 1
for any integer n, the function f takes its (local and global) maximum value of 1 infinitely many times.
In the same way, −1 is its minimum value (local and global). This value is taken infinitely many times too, since
sin(3π/2 + 2nπ) = −1 for all n ∈ Z.
Example 2. Determine the extreme values of the function
f : R → R, f(x) = x³.
Solution.
From the graph of the function f we see that this function has neither an absolute
maximum value nor an absolute minimum value.
We have seen that some functions have extreme values, whereas others do not. The
extreme value theorem (Theorem 5, subsection 3.1.1) says that a continuous function
on a closed interval has a maximum value and a minimum value, but it doesn’t tell us
how to ﬁnd these extreme values. In the next ﬁgure we sketch the graph of a function
f with a local maximum at c and a local minimum at d.
It seems that at the maximum and minimum points the tangent lines are parallel to the x-axis and in consequence each has slope 0. Since the slope of the tangent line is the derivative, we may believe that f′(c) = f′(d) = 0.
The following theorem shows us that this remark is always true for differentiable functions.
Theorem 1. (Fermat's theorem). Let f : I → R, I ⊆ R, I an open interval and a ∈ I. If f is differentiable at a and f has a local maximum or minimum at a, then f′(a) = 0.
Example 2 shows us that we can't expect to locate extreme values simply by setting f′(x) = 0 and solving for x. Indeed, if f(x) = x³, then f′(x) = 3x² and f′(0) = 0. But f has no maximum or minimum at 0, as we already mentioned in discussing Example 2.
Example 3. Let f : R → R, f(x) = |x|. The function f has a minimum at 0, but this value can't be found by solving the equation f′(x) = 0, since f is not differentiable at x = 0. Indeed,
lim_{x→0} (f(x) − f(0))/x = lim_{x→0} |x|/x
does not exist, since
lim_{x↘0} |x|/x = lim_{x↘0} x/x = 1 and lim_{x↗0} |x|/x = lim_{x↗0} (−x)/x = −1.
In conclusion, we observe that the converse of Fermat's theorem is false.
In fact, Fermat's theorem says that we have to seek the local extreme points among the solutions of the equation f′(x) = 0 or among the points at which f is not differentiable.
These points (solutions of f′(x) = 0 or points at which f is not differentiable) are called critical points.
Example 4. Find the critical points of the function
f : R → R, f(x) = ∛x · (3 − x).
Solution. We rewrite first the function f as
f(x) = 3x^(1/3) − x^(4/3)
and so
f′(x) = x^(−2/3) − (4/3)x^(1/3) = x^(−2/3)(1 − (4/3)x) = (1 − (4/3)x)/x^(2/3), for all x ≠ 0.
Therefore, f′(x) = 0 if 1 − (4/3)x = 0, that is, x = 3/4. Also, f is not differentiable at x = 0.
Thus the critical points are x = 0 and x = 3/4.
Remark 1. To find the absolute maximum or minimum of a continuous function f on a closed interval [a, b], we find the critical points of f in (a, b) and compute the values of the function f at the critical points and at the endpoints of the interval. The largest of these values is the absolute maximum value and the smallest of these values is the absolute minimum value.
Example 5. Find the absolute maximum and minimum values of the function
f : [−2, 1] → R, f(x) = x³ + 2x² − 1.
Solution. f′(x) = 3x² + 4x = (3x + 4)x.
The function f is differentiable on (−2, 1), so the critical points are the solutions of f′(x) = 0:
f′(x) = 0 ⇔ x(3x + 4) = 0 ⇔ x = 0 and x = −4/3.
The values of f at the critical points are
f(0) = −1 and f(−4/3) = 5/27.
The values of f at the endpoints of the interval are
f(−2) = −1 and f(1) = 2.
Comparing these values we see that the absolute maximum value is f(1) = 2 and the absolute minimum value is f(0) = f(−2) = −1.
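The closed-interval procedure of Remark 1, applied to Example 5, can be sketched as:

```python
# Compare f at the critical points and at the endpoints of [-2, 1]
# for f(x) = x^3 + 2x^2 - 1 (Example 5).

def f(x):
    return x ** 3 + 2 * x ** 2 - 1

candidates = [-2, -4 / 3, 0, 1]   # endpoints and the roots of f'(x) = x(3x + 4)
values = {x: f(x) for x in candidates}

maximum = max(values.values())
minimum = min(values.values())
print(values)
print("max =", maximum, "min =", minimum)
```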
As we have already shown, the derivative function f′ is very useful in studying the properties of the given function f.
Next, we will present two important facts which summarize this connection.
Theorem 2. (Rolle's theorem). Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b) such that f(a) = f(b). Then there exists a number c ∈ (a, b) such that f′(c) = 0.
As a first application of Rolle's theorem we present the second order Taylor formula for a function whose second derivative is continuous on an interval.
Remark 2. (Taylor's formula with the remainder in Lagrange's form)
Let f : I → R where I is an open interval and let a ∈ I. Suppose that the second derivative f′′ is continuous on I.
For each x ∈ I there exists a number c between a and x such that
f(x) = f(a) + f′(a)(x − a) + (1/2)f′′(c)(x − a)².
Proof. The previous equality is true for x = a. So, let x ∈ I, x ≠ a. We define the function g : I → R given by
g(t) = f(t) − f(a) − f′(a)(t − a) − α(t − a)²
where α is chosen so that g(x) = 0.
We easily obtain that
α = [f(x) − f(a) − f′(a)(x − a)] / (x − a)².
We also have that g(a) = g′(a) = 0.
We apply Rolle's theorem to the function g on the interval [min(a, x), max(a, x)] to find c₁ between a and x so that g′(c₁) = 0.
We then apply Rolle's theorem to the function g′ on the interval [min(a, c₁), max(a, c₁)] to find c between a and c₁ (hence c lies between a and x) so that g′′(c) = 0.
On the other hand, the second derivative of g is
g′′(t) = f′′(t) − 2α,
and since g′′(c) = 0 we easily get that α = (1/2)f′′(c).
By putting t = x and α = (1/2)f′′(c) in the expression of g we get
0 = f(x) − f(a) − f′(a)(x − a) − (1/2)f′′(c)(x − a)²,
which completes the proof.
The main use of Rolle's theorem is in proving the following important theorem, which was first stated by the French mathematician Joseph-Louis Lagrange.
Theorem 3. (The mean value theorem, Lagrange's theorem). Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b). Then there is a number c ∈ (a, b) such that
f′(c) = (f(b) − f(a)) / (b − a)   (1)
or, equivalently,
f(b) − f(a) = f′(c)(b − a).
By interpreting the mean value theorem geometrically, we can see that it is reasonable. Indeed, if A(a, f(a)) and B(b, f(b)) are points on the graph of f, then the slope of the secant line AB is
m_AB = (f(b) − f(a)) / (b − a)
which is the same as the right side of equality (1). Since f′(c) is the slope of the tangent line at the point (c, f(c)), the mean value theorem says that there is at least one point P(c, f(c)) on the graph where the tangent line is parallel to the secant line AB.
The mean value theorem helps us to obtain information about a function from
information about its derivative.
Example 6. Let f : R → R be a differentiable function. Suppose that f(2) = 2 and f′(x) ≤ 2 for all values of x. How large can f(4) be?
Solution. We can apply the mean value theorem on the interval [2, 4]. There exists a number c ∈ (2, 4) such that
f(4) − f(2) = f′(c)(4 − 2)
so
f(4) = f(2) + f′(c) · 2 = 2 + f′(c) · 2.
We are given that f′(x) ≤ 2 for all x, so in particular we know that f′(c) ≤ 2. So,
f(4) = 2 + f′(c) · 2 ≤ 2 + 2 · 2 = 6.
The largest possible value for f(4) is 6.
The mean value theorem is useful in establishing the following basic properties of differentiable functions.
Theorem 4. Let f : (a, b) → R be a differentiable function. If f′(x) = 0 for all x ∈ (a, b), then f is constant on (a, b).
Theorem 5. Let f, g : (a, b) → R be two differentiable functions. If f′(x) = g′(x) for all x ∈ (a, b), then f − g is constant on (a, b); that is, there is a constant c such that f = g + c.
Definition 3. Let f : A → R, A ⊆ R.
a) We say that f is increasing on A if for all x₁, x₂ ∈ A with x₁ < x₂ we have f(x₁) < f(x₂).
b) We say that f is decreasing on A if for all x₁, x₂ ∈ A with x₁ < x₂ we have f(x₁) > f(x₂).
Theorem 6. Let f : (a, b) → R be a differentiable function.
a) If f′(x) > 0 for all x ∈ (a, b), then f is increasing on (a, b).
b) If f′(x) < 0 for all x ∈ (a, b), then f is decreasing on (a, b).
Example 7. Find the intervals where the function
f : R → R, f(x) = 3x⁴ − 24x² + 2
is increasing and where it is decreasing.
Solution. f′(x) = 12x³ − 48x = 12x(x² − 4) = 12x(x − 2)(x + 2).
We have to solve the inequalities f′(x) > 0 and f′(x) < 0. This depends on the sign of the three factors of f′(x), namely 12x, x − 2 and x + 2.
The critical points of f are x = 0, x = 2 and x = −2.
We can arrange the signs of f′(x) in the following table.

x      | (−∞, −2) | −2    | (−2, 0) | 0    | (0, 2) | 2    | (2, +∞)
12x    |    −     |  −    |    −    | 0    |   +    | +    |    +
x − 2  |    −     |  −    |    −    | −    |   −    | 0    |    +
x + 2  |    −     |  0    |    +    | +    |   +    | +    |    +
f′(x)  |    −     |  0    |    +    | 0    |   −    | 0    |    +
f(x)   |    ↘     | f(−2) |    ↗    | f(0) |   ↘    | f(2) |    ↗

f is decreasing on (−∞, −2) and on (0, 2).
f is increasing on (−2, 0) and on (2, ∞).
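The sign table can be reproduced by sampling f′ once in each interval between the critical points (a small sketch):

```python
# Sign pattern of f'(x) = 12x(x - 2)(x + 2) from Example 7, sampled once
# on each interval between the critical points -2, 0, 2.

def f_prime(x):
    return 12 * x * (x - 2) * (x + 2)

samples = {"(-inf,-2)": -3, "(-2,0)": -1, "(0,2)": 1, "(2,inf)": 3}
signs = {name: ("+" if f_prime(x) > 0 else "-") for name, x in samples.items()}
print(signs)
```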
Recall that when a function has a relative extremum, it must occur at a critical value. We will now combine the ideas mentioned before to obtain two tests for determining when a critical value is a relative extremum point of a given function.
Theorem 7. (First derivative test for relative extrema) Let f : D → R and let [a, b] ⊂ D.
Suppose that f is continuous on [a, b] and differentiable on (a, b), except possibly at the critical value c.
(a) If f′(x) > 0 for a < x < c and f′(x) < 0 for c < x < b, then c is a relative maximum point of f.
(b) If f′(x) < 0 for a < x < c and f′(x) > 0 for c < x < b, then c is a relative minimum point of f.
(c) If f′(x) has the same algebraic sign on a < x < c and on c < x < b, then c is not an extremum point of f.
Example 8. Determine the extreme points of the function
f : R → R, f(x) = 3x⁴ − 24x² + 2.
Solution. By using the previous theorem and the results obtained in Example 7, we obtain that −2 and 2 are relative minimum points and 0 is a relative maximum point.
Another geometric property of a graph of a given function is its ”concavity”.
Visually, concavity is easy to recognize. If a graph is ”smiling” at you, it is concave
up (or convex); if it is ”frowning” at you, it is concave down (or simply concave).
(figures: a concave up, "smiling" curve and a concave down, "frowning" curve)
A mathematical characterization of concavity involves the second derivative of the given function.
Theorem 8. (Test for concavity) Let f : I → R, where I ⊆ R is an open interval.
(a) If f′′(x) > 0 for all x ∈ I, then f is concave up on I.
(b) If f′′(x) < 0 for all x ∈ I, then f is concave down on I.
Example 9. Find the intervals where the graph of
f : R → R, f(x) = 2x³ − 6x²
is concave up and where it is concave down.
Solution. We have
f′(x) = 6x² − 12x
and
f′′(x) = 12x − 12 = 12(x − 1).
The sign of the second derivative is given in the following table.

x      | (−∞, 1)        | 1    | (1, +∞)
f′′(x) |    −           | 0    |    +
f(x)   | ⌢ concave down | f(1) | ⌣ concave up

Thus the graph is concave up on (1, ∞) and concave down on (−∞, 1).
Another application of the second derivative is the following test for maximum and minimum values. It is a consequence of the concavity test.
Theorem 9. (The second derivative test) Let f : A → R and c ∈ A. Suppose f′′ is continuous near c (that is, f′′ is continuous on an interval (c − h, c + h)).
(a) If f′(c) = 0 and f′′(c) > 0, then f has a local minimum at c.
(b) If f′(c) = 0 and f′′(c) < 0, then f has a local maximum at c.
(c) If f′(c) = 0 and f′′(c) = 0, then the test is inconclusive.
Part (a) is true because f′′(x) > 0 near c, and so f is concave up near c. This means that the graph of f lies above its horizontal tangent at c (since f′(c) = 0), and so f has a local minimum at c.
Part (b) is true because f′′(x) < 0 near c, and so f is concave down near c. This means that the graph of f lies below its horizontal tangent at c (since f′(c) = 0), and so f has a local maximum at c.
Example 10. Use the second derivative test to find the extrema of the following function:
f : R → R, f(x) = 3x⁴ − 8x³ + 6x².
Solution. We have to evaluate the second derivative at the critical points. First, we determine the critical points of f.
f′(x) = 12x³ − 24x² + 12x = 12x(x² − 2x + 1) = 12x(x − 1)².
In consequence f′(x) = 0 for x = 0 and x = 1.
Now, find the second derivative, and test its sign at x = 0 and x = 1:
f′′(x) = 36x² − 48x + 12 = 12(3x² − 4x + 1).
Since f′′(0) = 12 > 0, the function f has a relative minimum point at x = 0.
Since f′′(1) = 0, the test fails, so we have to use the first derivative test. The sign of the first derivative is given in the table below.

x        | (−∞, 0) | 0    | (0, 1) | 1    | (1, +∞)
12x      |    −    | 0    |   +    | +    |    +
(x − 1)² |    +    | +    |   +    | 0    |    +
f′(x)    |    −    | 0    |   +    | 0    |    +
f(x)     |    ↘    | f(0) |   ↗    | f(1) |    ↗

Part (c) of the first derivative test shows that 1 is not a relative extreme point.
3.1.5 Applications to economics
In subsection 3.1.2 we introduced the idea of marginal cost. Recall that if C is the cost function and C(x) is the cost of producing x units of a certain product, then the marginal cost is the rate of change of C with respect to x. In fact the marginal cost function is the derivative C′ of the cost function. We also consider the average cost function
c(x) = C(x)/x
representing the cost per unit if x units are produced. We want to find what happens at a minimum point of the average cost function.
Theorem 1. a) If a is a minimum point for c, then C′(a) = c(a).
b) If the marginal cost is less than the average cost, then the average cost function decreases.
c) If the marginal cost is greater than the average cost, then the average cost increases.
Proof. a) If a is a minimum point of the function c, then c′(a) = 0 (as a consequence of Fermat's theorem).
By applying the quotient rule we have
c′(x) = (C(x)/x)′ = (C′(x) · x − C(x))/x² = x(C′(x) − C(x)/x)/x² = (C′(x) − c(x))/x.
Since c′(a) = 0, then C′(a) − c(a) = 0 and so C′(a) = c(a).
b) If the marginal cost is less than the average cost, then
c′(x) = (C′(x) − c(x))/x < 0
and c is a decreasing function (by Theorem 6, subsection 3.1.4).
c) If the marginal cost is greater than the average cost, then
c′(x) = (C′(x) − c(x))/x > 0
and c is an increasing function (by Theorem 6, subsection 3.1.4). This completes the proof.
Part a) of the previous theorem says that if the average cost is minimal, then the marginal cost equals the average cost.
We have the following explanation for parts b) and c). The marginal cost is (approximately) the cost of producing one additional unit of the considered product (see Example 4, subsection 3.1.2).
If the additional unit costs less than the average cost, this less expensive unit will cause the average cost per unit to decrease.
If the additional unit costs more than the average cost, this more expensive unit will cause the average cost per unit to increase.
This principle is plausible: if the marginal cost is smaller than the average cost, then more should be produced in order to lower the average cost; similarly, if the marginal cost is greater than the average cost, then less should be produced in order to lower the average cost.
We also consider the revenue function R, where R(x) represents the income from the sale of x units of the product. The derivative R′ is called the marginal revenue function.
If x units are sold, the price function p will be defined by
p(x) = R(x)/x.
The function P = R − C is naturally called the profit function and the derivative P′ is called the marginal profit function. Note that
P′(x) = R′(x) − C′(x) = 0 if R′(x) = C′(x).
We therefore conclude:
Theorem 2. If the profit is maximum, then the marginal revenue is equal to the marginal cost.
Remark 1. It is often appropriate to represent a total cost function by a polynomial (usually of degree three)
C(x) = a + bx + cx² + dx³
where a represents the overhead cost (rent, heat, maintenance) and the other terms represent the cost of raw materials, labor and so on. The cost of raw materials may be proportional to x, but labor costs might depend partially on higher powers of x.
Example 1. A publisher of a calculus textbook works with a cost function
C(x) = 50000 + 20x − (1/10⁴)x² + (1/(3 · 10⁸))x³
and a price function
p(x) = 120 − (1/10⁴)x,
both in dollars. Determine the maximum of the profit function.
Solution. Clearly we have
C′(x) = 20 − (1/(5 · 10³))x + (1/10⁸)x²
and
C′′(x) = −1/(5 · 10³) + (1/(5 · 10⁷))x
so that
C′′(x) = 0 for x = 10⁴.
The marginal cost increases after 10000 copies. On the other hand we have
R(x) = xp(x) = 120x − (1/10⁴)x²
and
R′(x) = 120 − (1/(5 · 10³))x.
Maximum profit occurs when P′(x) = 0 and P′′(x) < 0, so that R′(x) = C′(x):
120 − (1/(5 · 10³))x = 20 − (1/(5 · 10³))x + (1/10⁸)x²
with the solution x = 10⁵. If we want to use the second derivative test to establish the nature of the critical point x = 10⁵, we have to evaluate P′′(10⁵):
P′′(10⁵) = R′′(10⁵) − C′′(10⁵) = −1/(5 · 10³) + 1/(5 · 10³) − (1/(5 · 10⁷)) · 10⁵ < 0.
This means that maximum profit occurs when exactly 100,000 copies are produced and sold. The income is then
R(10⁵) = 11 · 10⁶
at p(10⁵) = 110 dollars per copy. The cost is
C(10⁵) = (1315/3) · 10⁴ dollars.
The maximum of the profit function will be:
P(10⁵) = R(10⁵) − C(10⁵) = 11 · 10⁶ − (1315/3) · 10⁴ = (1985/3) · 10⁴ dollars.
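The computations of Example 1 can be verified numerically (a small sketch; the function names are our own):

```python
# Profit maximization for C(x) = 50000 + 20x - x^2/10^4 + x^3/(3*10^8)
# and p(x) = 120 - x/10^4, as in Example 1.

def C(x):
    return 50000 + 20 * x - x ** 2 / 1e4 + x ** 3 / 3e8

def p(x):
    return 120 - x / 1e4

def P(x):                # profit = revenue - cost
    return x * p(x) - C(x)

x_star = 1e5             # critical point found from R'(x) = C'(x)
print(P(x_star))         # maximum profit, (1985/3) * 10^4 dollars
```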
Finally, we will use the marginal concepts introduced before to derive an important criterion used by economists to analyze the demand function: the price elasticity of demand.
In mathematics, the elasticity of a differentiable function f at a point x is defined as
E(x) = xf′(x)/f(x)
which can be rewritten in the following two different forms:
E(x) = (f′(x)/f(x)) / (1/x) = (ln f(x))′/(ln x)′   (1)
or
E(x) = (x/f(x)) f′(x) = (x/f(x)) lim_{∆x→0} (f(x + ∆x) − f(x))/∆x
= lim_{∆x→0} [(f(x + ∆x) − f(x))/f(x)] / [∆x/x]
≈ [((f(x + ∆x) − f(x))/f(x)) · 100] / [(∆x/x) · 100]
= (percentage change in f) / (percentage change in x)   (2)
So
E(x) ≈ (percentage change in f) / (percentage change in x)   (3)
If we use the notations y = f(x) or y = y(x), then the x point elasticity of y is denoted by
E_x y = xf′(x)/f(x).
Remark 2. If y = y(x), then the y point elasticity of x is
E_y x = 1/(E_x y).
Proof. If y = f(x), then x = f⁻¹(y), and by using the definition of elasticity
E_y x = y(f⁻¹)′(y)/f⁻¹(y) = y · (1/f′(f⁻¹(y))) / x = f(x) · (1/f′(x)) / x = f(x)/(xf′(x)) = 1/(E_x y)
Next, we shall present the following economic example.
The demand for a product is usually related to its price. In most cases, the demand decreases when the price increases. The sensitivity of demand to changes in price varies from one product to another.
For some products, small percentage changes in price have little effect on demand. For other products, small percentage changes in price have considerable effect on demand. We want to measure the sensitivity of demand to changes in price.
Definition 1. If p represents the price per unit of a certain product and Q represents the demand function (that is, Q(p) is the quantity of the product demanded at price p), then the price elasticity of demand is (see (2))
E_p Q = pQ′(p)/Q(p) = lim_{∆p→0} [((Q(p + ∆p) − Q(p))/Q(p)) · 100] / [(∆p/p) · 100]
≈ (percentage change in quantity demanded) / (percentage change in price)   (4)
We observe that if the percentage change in price is one, then
E_p Q ≈ percentage change in demand due to a 1% increase in price.   (5)
Remark 3. a) The price elasticity of demand is usually negative, because the demand decreases when the price increases.
b) If E_p Q < −1, the demand is said to be elastic with respect to price. In this case the percentage decrease in demand is greater than the percentage increase in price that caused it.
c) If −1 < E_p Q, the demand is said to be inelastic with respect to price. In this case the percentage decrease in demand is less than the percentage increase in price that caused it.
d) If E_p Q = −1, the demand is said to be of unit elasticity with respect to price.
Theorem 3. (Elasticity and the total revenue)
Let R, R(p) = pQ(p), be the total revenue function.
a) If E_p Q < −1, then R is a decreasing function. In this case, when the price is raised the total revenue decreases.
b) If E_p Q > −1, then R is an increasing function. In this case, when the price is raised the total revenue increases.
154
Proof. By the product rule of differentiation we have
$$R'(p) = (pQ(p))' = pQ'(p) + Q(p) = Q(p)\left(\frac{pQ'(p)}{Q(p)} + 1\right) = Q(p)\,(E_{p}Q + 1).$$
For part a) we have $E_{p}Q < -1$ and so $E_{p}Q + 1 < 0$. Since $R'(p) = Q(p)(E_{p}Q + 1)$ and $E_{p}Q + 1 < 0$, we obtain that $R'(p) < 0$. So R is a decreasing function.
For part b) we have $E_{p}Q > -1$ and so $E_{p}Q + 1 > 0$. Since $R'(p) = Q(p)(E_{p}Q + 1)$ and $E_{p}Q + 1 > 0$, we obtain that $R'(p) > 0$. So R is an increasing function in this case.
Example 2. Suppose the relationship between the unit price p in dollars and the quantity demanded, x, is given by the equation
$$p = -0.02x + 400 \qquad (0 \le x \le 20000).$$
Compute the price elasticity of demand and interpret the results.
Solution. Solving the given demand equation for x in terms of p we find
$$x = Q(p) = -50p + 20000,$$
from which we see that $Q'(p) = -50$. Therefore
$$E_{p}Q = \frac{pQ'(p)}{Q(p)} = \frac{-50p}{-50(p-400)} = \frac{p}{p-400} \qquad (0 \le p < 400).$$
Next, we solve the equation $E_{p}Q = -1$, that is,
$$\frac{p}{p-400} = -1,$$
giving p = 200.
We also see that $E_{p}Q < -1$ when p > 200 (elastic demand) and $E_{p}Q > -1$ when p < 200 (inelastic demand).
So, when the unit price is between 0 and 200, an increase in the unit price will
increase the revenue; when the unit price is between 200 and 400, an increase in the
unit price will cause a decrease in revenue.
In consequence the revenue is maximized when the unit price is set at 200.
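The conclusions of Example 2 can be checked numerically. The sketch below (the function names are ours, not the text's) evaluates the elasticity $E_{p}Q = p/(p-400)$ and scans the revenue $R(p) = pQ(p)$ over whole-dollar prices to confirm the peak at p = 200.

```python
# Numeric check of Example 2 (illustrative sketch, not from the text):
# demand Q(p) = -50p + 20000, elasticity E(p) = p / (p - 400),
# revenue R(p) = p * Q(p), expected to peak at p = 200.

def Q(p):
    return -50 * p + 20000

def elasticity(p):
    return p / (p - 400)

def revenue(p):
    return p * Q(p)

# Unit elasticity at p = 200
assert abs(elasticity(200) - (-1)) < 1e-12

# Scan whole-dollar prices and locate the revenue maximum
best_p = max(range(0, 400), key=revenue)
assert best_p == 200
```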
Example 3. Let Q be the demand function defined by
$$Q(p) = 10\sqrt{\frac{50-p}{p}}, \qquad 0 < p \le 50.$$
a) Determine the elasticity of demand when the price is p = 10. If the price
increases by 6% determine the approximate change in demand.
b) Determine where the demand is elastic, inelastic and of unitary elasticity with
respect to price.
c) Determine the price function as a function of demand.
d) Find the maximum of the total revenue function.
Solution. a)
$$E_{p}Q = \frac{pQ'(p)}{Q(p)} = \frac{p\left(\sqrt{\dfrac{50-p}{p}}\right)'}{\sqrt{\dfrac{50-p}{p}}} = \frac{p\cdot\dfrac{-p-(50-p)}{p^2}}{2\cdot\dfrac{50-p}{p}} = \frac{25}{p-50},$$
then
$$E_{10}Q = -\frac{5}{8}.$$
On the other hand, from (4) we know that
$$E_{p}Q \approx \frac{\text{percentage change in demand}}{\text{percentage change in price}}.$$
If we take the percentage change in price to be 6, then
$$E_{10}Q = -\frac{5}{8} \approx \frac{\text{percentage change in demand}}{6},$$
from which the percentage change in demand is approximately $-\frac{15}{4}$. This means that the demand decreases by $\frac{15}{4}\%$.
b) First we solve the equation $\frac{25}{p-50} = -1$. This gives us the solution p = 25.
For determining the elasticity intervals we have to solve the inequalities
$$E_{p}Q < -1 \quad\text{and}\quad E_{p}Q > -1.$$
We easily obtain that $E_{p}Q < -1$ when $p \in (25, 50)$ (elastic demand) and $E_{p}Q > -1$ when $p \in (0, 25)$ (inelastic demand).
c) In order to determine the price function as a function of demand we solve the equation
$$Q(p) = 10\sqrt{\frac{50-p}{p}}$$
for p. Squaring gives
$$Q^2 = 100\cdot\frac{50-p}{p}.$$
Thus $Q^2 p = 5000 - 100p$ and
$$p = p(Q) = \frac{5000}{Q^2 + 100}.$$
d) The critical points of the total revenue function are given by the equation
$$R'(Q) = 0,$$
where
$$R(Q) = Q\,p(Q) = \frac{5000Q}{Q^2 + 100}.$$
Now
$$R'(Q) = 5000\,\frac{100 - Q^2}{(Q^2 + 100)^2} = 0$$
implies that Q = 10.
By using the first derivative test for determining the extreme values we get that Q = 10 is a maximum point for the total revenue function.
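The numbers in Example 3 can be spot-checked with exact rational arithmetic; the sketch below (our own naming, not from the text) verifies $E_{10}Q = -5/8$, the unit-elasticity price p = 25, and the revenue peak at Q = 10.

```python
from fractions import Fraction

# Numeric check of Example 3 (illustrative sketch): elasticity
# E(p) = 25/(p - 50) and total revenue R(Q) = 5000*Q/(Q**2 + 100).

def elasticity(p):
    return Fraction(25, p - 50)

def revenue(Q):
    return Fraction(5000 * Q, Q**2 + 100)

# E(10) = -5/8 as computed in part a)
assert elasticity(10) == Fraction(-5, 8)

# Unit elasticity at p = 25 as in part b)
assert elasticity(25) == -1

# The revenue peaks at Q = 10 as in part d): compare with neighbours
assert revenue(10) > revenue(9) and revenue(10) > revenue(11)
```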
3.2 Integral calculus of one variable
3.2.1 Antiderivatives and techniques of integration
In the previous chapter we were concerned only with the basic problem: given a function f, find its derivative f′. In this chapter we are interested in precisely the opposite process, that is, given a function f, find a function whose derivative is f.
This process is called antidiﬀerentiation. Antidiﬀerentiation and diﬀerentiation are
inverse operations in the sense that one undoes what the other does.
Definition 1. A function F is an antiderivative of the function f if $F' = f$.
Example 1. An antiderivative of $f : \mathbb{R}\to\mathbb{R}$, $f(x) = 2x$, is $F : \mathbb{R}\to\mathbb{R}$, $F(x) = x^2$, since $F'(x) = 2x = f(x)$.
There is always more than one antiderivative of a function. For instance, in the previous example, $F_1 : \mathbb{R}\to\mathbb{R}$, $F_1(x) = x^2 - 1$ and $F_2 : \mathbb{R}\to\mathbb{R}$, $F_2(x) = x^2 + 10$ are also antiderivatives of f. If F is an antiderivative of a function f then so is G, $G(x) = F(x) + c$, for any constant c.
Theorem 1. Let f : I → R, I ⊆ R an interval and let F : I → R be an
antiderivative of f. Then any other antiderivative G of f must be of the form G(x) =
F(x) +c, where c is a constant.
The proof of the previous result is based on Theorem 5, subsection 3.1.4.
The indefinite integral of a function f represents the entire family of antiderivatives of the given function. We will use the following notation for the indefinite integral:
$$\int f(x)\,dx.$$
The indefinite integral is a family of functions. The function f is called the integrand.
If F is an antiderivative of a given function f (defined on an open interval), then the indefinite integral of f will be
$$\int f(x)\,dx = F(x) + C.$$
Extensive techniques for the calculation of antiderivatives have been developed.
We will discuss now some basic techniques of integration.
Integration by substitution
This technique is based on the chain rule of diﬀerentiation. We have to mention
ﬁrst that the integration by substitution does not always work and there is no simple
routine that could help us to ﬁnd a suitable substitution even in the cases where the
method works.
Theorem 2. If F is an antiderivative of f, then
$$\int f(g(x))g'(x)\,dx = F(g(x)) + C. \qquad (1)$$
Proof. By the chain rule,
$$(F(g(x)) + C)' = F'(g(x))g'(x) = f(g(x))g'(x).$$
Hence, from the definition of an antiderivative we have that
$$\int f(g(x))g'(x)\,dx = F(g(x)) + C.$$
Example 2. Evaluate $\int (x^2+3)^4\,2x\,dx$.
Solution. Let $g(x) = x^2 + 3$. Then $g'(x) = 2x$. If we define the function f by $f(u) = u^4$, then the integrand of the indefinite integral we are considering has the form
$$(x^2+3)^4\,2x = [g(x)]^4 g'(x) = f(g(x))g'(x).$$
From the equality
$$\int f(g(x))g'(x)\,dx = F(g(x)) + C,$$
we conclude that the required antiderivative can be found if we know an antiderivative of the function f. But if $f(u) = u^4$, then $F(u) = \frac{1}{5}u^5$. Thus
$$\int (x^2+3)^4\,2x\,dx = \int f(g(x))g'(x)\,dx = F(g(x)) + C = \frac{1}{5}(x^2+3)^5 + C.$$
On a practical level it is helpful to rewrite the integral in a more recognizable form by using the substitutions u = g(x) and du = g′(x)dx. Then the rules of integration are used to complete the solution of the problem. This formal procedure is justified since it leads to the correct solution of the problem.
If we write $u = g(x)$ and $du = g'(x)\,dx$, then the integral $\int f(g(x))g'(x)\,dx$ which is to be evaluated becomes $\int f(u)\,du$, which is equal to $F(u) + C$ since F is an antiderivative of f. So we have $\int f(u)\,du = F(u) + C$, which is the same as (1), as mentioned before.
Example 3. Rework the previous example using the relationships u = g(x) and du = g′(x)dx.
We want to evaluate the indefinite integral
$$I = \int (x^2+3)^4 (2x)\,dx.$$
Let $u = x^2 + 3$ so that $du = 2x\,dx$. Making this substitution into the expression for I we get
$$I = \int u^4\,du = \frac{1}{5}u^5 + C = \frac{1}{5}(x^2+3)^5 + C,$$
which agrees with the result of the previous example.
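A convenient sanity check for any antiderivative is to differentiate it numerically and compare against the integrand. The sketch below (our own naming, assuming a central finite difference is accurate enough here) does this for the result $F(x) = \frac{1}{5}(x^2+3)^5$.

```python
# Check that F(x) = (x**2 + 3)**5 / 5 differentiates back to the
# integrand (x**2 + 3)**4 * 2x, using a central finite difference.

def F(x):
    return (x**2 + 3)**5 / 5

def integrand(x):
    return (x**2 + 3)**4 * 2 * x

def numeric_derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [-1.5, 0.0, 0.7, 2.0]:
    expected = integrand(x)
    got = numeric_derivative(F, x)
    # Allow a relative tolerance for floating-point error
    assert abs(got - expected) <= 1e-4 * (1 + abs(expected))
```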
Example 4. Evaluate $\int \frac{1}{x\ln x}\,dx$.
Solution. Note first that the derivative of the function $\ln x$ is equal to $\frac{1}{x}$, so it is convenient to make the substitution $u = \ln x$. Then $du = \frac{1}{x}\,dx$ and
$$\int \frac{1}{x\ln x}\,dx = \int \frac{1}{u}\,du = \ln|u| + C = \ln|\ln x| + C.$$
Example 5. Evaluate $\int \sin^3 x\cos^3 x\,dx$.
Solution. Since the derivative of the function $\sin x$ is equal to $\cos x$, it is convenient to make the substitution $u = \sin x$. Then $du = \cos x\,dx$, and
$$\int \sin^3 x\cos^3 x\,dx = \int \sin^3 x\cos^2 x\cos x\,dx = \int u^3(1-u^2)\,du = \int (u^3 - u^5)\,du = \frac{u^4}{4} - \frac{u^6}{6} + C = \frac{\sin^4 x}{4} - \frac{\sin^6 x}{6} + C.$$
Alternatively, note that the derivative of the function $\cos x$ is equal to $-\sin x$, so it is convenient to make the substitution $v = \cos x$. Then $dv = -\sin x\,dx$, and
$$\int \sin^3 x\cos^3 x\,dx = \int (-\sin^2 x)\cos^3 x\,(-\sin x)\,dx = \int [-(1-v^2)]v^3\,dv = \int (v^5 - v^3)\,dv = \frac{v^6}{6} - \frac{v^4}{4} + C = \frac{\cos^6 x}{6} - \frac{\cos^4 x}{4} + C.$$
It can be checked that
$$\frac{\sin^4 x}{4} - \frac{\sin^6 x}{6} = \frac{\cos^6 x}{6} - \frac{\cos^4 x}{4} + \frac{1}{12},$$
so both of the previous results are correct: they differ only by a constant.
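The constant-difference identity above can be spot-checked numerically (a sketch with our own names):

```python
import math

# Numeric spot-check of the identity
# sin^4(x)/4 - sin^6(x)/6 = cos^6(x)/6 - cos^4(x)/4 + 1/12,
# which shows the two antiderivatives differ only by a constant.

def lhs(x):
    return math.sin(x)**4 / 4 - math.sin(x)**6 / 6

def rhs(x):
    return math.cos(x)**6 / 6 - math.cos(x)**4 / 4 + 1 / 12

for k in range(100):
    x = -5 + 0.1 * k
    assert abs(lhs(x) - rhs(x)) < 1e-12
```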
Example 6. Evaluate $\int x\sqrt{x+1}\,dx$.
Solution. If we make the substitution $u = \sqrt{x+1}$, then $x = u^2 - 1$ and $dx = 2u\,du$.
$$\int x\sqrt{x+1}\,dx = \int (u^2-1)u\cdot 2u\,du = 2\int u^4\,du - 2\int u^2\,du = \frac{2}{5}u^5 - \frac{2}{3}u^3 + C = \frac{2}{5}(x+1)^{5/2} - \frac{2}{3}(x+1)^{3/2} + C.$$
Note that in this example the variable x is written as a function of the new variable u. The substitution $x = g(u)$ has to be invertible ($u = g^{-1}(x)$) to enable us to return from the new variable u to the original variable x at the end of the process.
Integration by parts
Recall the product rule for differentiation, that is,
$$(fg)'(x) = f'(x)g(x) + f(x)g'(x).$$
Integrating with respect to the variable x, we obtain
$$\int (fg)'(x)\,dx = \int f'(x)g(x)\,dx + \int f(x)g'(x)\,dx.$$
Since fg is an antiderivative of (fg)′, the previous equality can be rewritten as
$$\int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx. \qquad (2)$$
The relationship (2) is called the formula for integration by parts for indefinite integrals. It is very useful if the indefinite integral $\int f'(x)g(x)\,dx$ is much easier to calculate than the indefinite integral $\int f(x)g'(x)\,dx$.
Example 7. Compute $\int xe^x\,dx$.
Solution. Writing $f(x) = x$ and $g'(x) = e^x$, we have $f'(x) = 1$ and $g(x) = e^x$. It follows that
$$\int xe^x\,dx = xe^x - \int e^x\,dx = xe^x - e^x + C.$$
Example 8. Evaluate $\int \ln x\,dx$.
Solution. Writing $f(x) = \ln x$ and $g'(x) = 1$, we have $f'(x) = \frac{1}{x}$ and $g(x) = x$, so
$$\int \ln x\,dx = x\ln x - \int x\cdot\frac{1}{x}\,dx = x\ln x - x + C.$$
Example 9. Evaluate $\int e^x\sin x\,dx$.
Solution.
$$\int e^x\sin x\,dx = \int (e^x)'\sin x\,dx = e^x\sin x - \int e^x\cos x\,dx. \qquad (3)$$
We now need to study the indefinite integral
$$\int e^x\cos x\,dx = \int (e^x)'\cos x\,dx = e^x\cos x - \int e^x(-\sin x)\,dx = e^x\cos x + \int e^x\sin x\,dx. \qquad (4)$$
It looks like we are back to the same problem. However, if we combine (3) and (4) then we obtain
$$\int e^x\sin x\,dx = e^x\sin x - e^x\cos x - \int e^x\sin x\,dx,$$
so that
$$\int e^x\sin x\,dx = \frac{1}{2}e^x(\sin x - \cos x) + C.$$
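The result of Example 9 can be verified by numerical differentiation (a sketch with our own names): the derivative of the computed antiderivative should reproduce the integrand.

```python
import math

# Check Example 9: the derivative of F(x) = e**x (sin x - cos x) / 2
# should equal e**x sin x (central finite difference, illustrative).

def F(x):
    return math.exp(x) * (math.sin(x) - math.cos(x)) / 2

def integrand(x):
    return math.exp(x) * math.sin(x)

def numeric_derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [-2.0, -0.5, 0.0, 1.0, 2.5]:
    assert abs(numeric_derivative(F, x) - integrand(x)) < 1e-6
```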
Completing squares
In this section we shall consider techniques to solve integrals involving square roots of the form $\sqrt{ax^2 + bx + c}$, where $a \neq 0$. Our task is to show that such integrals can be reduced to integrals discussed before.
Note that
$$ax^2 + bx + c = a\left(x^2 + \frac{b}{a}x + \frac{c}{a}\right) = a\left(x^2 + \frac{b}{a}x + \left(\frac{b}{2a}\right)^2\right) + c - \frac{b^2}{4a} = a\left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a}.$$
We will now use the following substitution:
$$u = x + \frac{b}{2a} \quad\text{and}\quad du = dx.$$
Example 10. Evaluate the integral $\int \frac{1}{\sqrt{3-2x-x^2}}\,dx$.
Solution. We have
$$3 - 2x - x^2 = -(x^2 + 2x - 3) = -(x^2 + 2x + 1) + 4 = 4 - (x+1)^2.$$
We use the substitutions u = x + 1 and du = dx:
$$\int \frac{1}{\sqrt{3-2x-x^2}}\,dx = \int \frac{1}{\sqrt{4-u^2}}\,du = \arcsin\frac{u}{2} + C = \arcsin\frac{x+1}{2} + C.$$
Partial fractions
In this section we shall consider indefinite integrals of the form $\int \frac{p(x)}{q(x)}\,dx$, where p and q are polynomials in x.
If the degree of p is not smaller than the degree of q, then we can always find polynomials c and r such that
$$\frac{p(x)}{q(x)} = c(x) + \frac{r(x)}{q(x)},$$
where $r \equiv 0$ or r has a smaller degree than the degree of q. We can therefore restrict our attention to the case when the polynomial p is of lower degree than q.
The first step is to factorize the polynomial q into a product of irreducible factors. It is a fundamental result in algebra that a polynomial with real coefficients can be factorized into a product of irreducible linear factors and quadratic factors with real coefficients.
Suppose that a linear factor (ax + b) occurs n times in the factorization of q. Then we write down a decomposition
$$\frac{A_1}{ax+b} + \frac{A_2}{(ax+b)^2} + \cdots + \frac{A_n}{(ax+b)^n},$$
where the constants $A_1, A_2, \ldots, A_n$ will be determined later.
Suppose that a quadratic factor $(ax^2 + bx + c)$ occurs n times in the factorization of q. Then we write down a decomposition
$$\frac{A_1 x + B_1}{ax^2+bx+c} + \frac{A_2 x + B_2}{(ax^2+bx+c)^2} + \cdots + \frac{A_n x + B_n}{(ax^2+bx+c)^n},$$
where the constants $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ will be determined later.
We proceed to add all the decompositions and equate their sum to $\frac{p(x)}{q(x)}$, and then calculate all the constants by equating the coefficients.
Example 11. Consider the indefinite integral
$$\int \frac{x^2+x-3}{x^3-2x^2-x+2}\,dx.$$
Solution. We factorize first the denominator of the integrand:
$$x^3 - 2x^2 - x + 2 = x^3 - x - 2(x^2-1) = x(x^2-1) - 2(x^2-1) = (x-2)(x^2-1) = (x-2)(x-1)(x+1).$$
So we consider partial fractions of the form
$$\frac{x^2+x-3}{x^3-2x^2-x+2} = \frac{a}{x-2} + \frac{b}{x-1} + \frac{c}{x+1} = \frac{a(x-1)(x+1) + b(x-2)(x+1) + c(x-2)(x-1)}{(x-2)(x-1)(x+1)}.$$
It follows that
$$x^2+x-3 = a(x^2-1) + b(x^2-x-2) + c(x^2-3x+2) = x^2(a+b+c) + x(-b-3c) - a - 2b + 2c.$$
We equate coefficients and solve for a, b, c:
$$\begin{cases} a+b+c = 1 \\ -b-3c = 1 \\ -a-2b+2c = -3 \end{cases}$$
from which we get $a = 1$, $b = \frac{1}{2}$, $c = -\frac{1}{2}$. Hence
$$\int \frac{x^2+x-3}{x^3-2x^2-x+2}\,dx = \int \frac{1}{x-2}\,dx - \frac{1}{2}\int \frac{1}{x+1}\,dx + \frac{1}{2}\int \frac{1}{x-1}\,dx = \ln|x-2| - \frac{1}{2}\ln|x+1| + \frac{1}{2}\ln|x-1| + C.$$
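The partial fraction decomposition of Example 11 can be verified with exact rational arithmetic (an illustrative sketch; the names are ours): the decomposition must agree with the original rational function at every point away from the poles.

```python
from fractions import Fraction

# Check Example 11: the decomposition
# (x^2 + x - 3)/((x-2)(x-1)(x+1)) = 1/(x-2) + (1/2)/(x-1) - (1/2)/(x+1)
# should hold identically; we test it at several rational points.

a, b, c = Fraction(1), Fraction(1, 2), Fraction(-1, 2)

def original(x):
    return (x**2 + x - 3) / (x**3 - 2 * x**2 - x + 2)

def decomposition(x):
    return a / (x - 2) + b / (x - 1) + c / (x + 1)

for x in [Fraction(3), Fraction(5), Fraction(1, 3), Fraction(-7, 2)]:
    assert original(x) == decomposition(x)
```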
3.2.2 The deﬁnite integral
In order to define the concept of a definite integral we will define first the Riemann sums (which are named after the famous German mathematician Georg Friedrich Bernhard Riemann (1826–1866)).
This is a five-step process.
1) Let f be defined on a closed interval [a, b].
2) Partition the interval [a, b] into n subintervals $[x_{k-1}, x_k]$ of length $x_k - x_{k-1}$. Let P denote the partition
$$a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b.$$
3) Let $\|P\|$ be the length of the longest subinterval. The number $\|P\|$ is called the norm of the partition P.
4) Choose a number $x_k^* \in (x_{k-1}, x_k)$ in each subinterval, for $k = 1, \ldots, n$.
5) Form the sum
$$\sum_{k=1}^{n} f(x_k^*)(x_k - x_{k-1}). \qquad (1)$$
Sums such as (1) for the various partitions of [a, b] are known as Riemann sums.
Definition 1. Let f be a function defined on the closed interval [a, b]. Then the definite integral of f from a to b, denoted $\int_a^b f(x)\,dx$, is defined to be
$$\int_a^b f(x)\,dx = \lim_{\|P\|\to 0}\sum_{k=1}^{n} f(x_k^*)(x_k - x_{k-1}), \qquad (2)$$
provided that the previous limit exists and has a finite value. If the limit in (2) exists and is finite, the function f is said to be integrable on [a, b].
The numbers a and b in the previous definition are called the lower and upper limits of integration, respectively. The integral symbol $\int$, first used by Leibniz, is an elongated S for the word "sum".
We have the following important result, which gives us an important class of integrable functions.
Theorem 1. If f is continuous on [a, b], then f is integrable on [a, b].
The precise characterization of the integrable functions is given by the following theorem.
Theorem 2. Let f : [a, b] → R. The function f is integrable on [a, b] if and only if f is bounded on [a, b] and f is continuous almost everywhere on [a, b].
In consequence any integrable function is a bounded one.
The next theorem gives some of the basic properties of the definite integral.
Theorem 3. Let f and g be integrable functions on [a, b]. Then we have:
a) $\int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx$, where k is any constant;
b) $\int_a^b [f(x) \pm g(x)]\,dx = \int_a^b f(x)\,dx \pm \int_a^b g(x)\,dx$;
c) $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$, where c is any number in [a, b];
d) $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$;
e) $\int_a^a f(x)\,dx = 0$.
The most helpful result in computing definite integrals is the following.
Theorem 4. (Leibniz–Newton's theorem) Let f : [a, b] → R. Suppose that f is integrable on [a, b] and that there exists an antiderivative F of f. Then
$$\int_a^b f(x)\,dx = F(b) - F(a). \qquad (3)$$
The difference F(b) − F(a) is usually written $F(x)\Big|_a^b$.
Example 1. Evaluate $\int_{-2}^{2} (3x^2 - x + 1)\,dx$.
Solution.
$$\int_{-2}^{2} (3x^2-x+1)\,dx = 3\int_{-2}^{2} x^2\,dx - \int_{-2}^{2} x\,dx + \int_{-2}^{2} dx = 3\cdot\frac{x^3}{3}\Big|_{-2}^{2} - \frac{x^2}{2}\Big|_{-2}^{2} + x\Big|_{-2}^{2} = 2^3 - (-2)^3 - \frac{1}{2}\left[2^2 - (-2)^2\right] + [2 - (-2)] = 20.$$
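Definition 1 describes the definite integral as a limit of Riemann sums, so a midpoint Riemann sum with a fine partition should approach the value 20 computed in Example 1. A minimal sketch (our naming):

```python
# Midpoint Riemann sum for f(x) = 3x^2 - x + 1 on [-2, 2];
# by Example 1 the definite integral equals 20.

def f(x):
    return 3 * x**2 - x + 1

def midpoint_riemann_sum(f, a, b, n):
    h = (b - a) / n  # uniform partition of norm h
    return sum(f(a + (k + 0.5) * h) * h for k in range(n))

approx = midpoint_riemann_sum(f, -2.0, 2.0, 100000)
assert abs(approx - 20.0) < 1e-6
```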
The Leibniz–Newton theorem allows us to use all the techniques of integration presented in subsection 3.2.1.
Example 2. Evaluate $\int_0^3 x\sqrt{x+1}\,dx$.
We will use the technique of integration by substitution.
Remark 1. If we do not use Theorem 4 in evaluating such an integral and we make the substitution in the definite integral, we have to change the limits of integration to correspond to the values of u for x = a and x = b.
Solution. To calculate the previous integral we can use the substitution $u = \sqrt{x+1}$, from which we have $x = u^2 - 1$ and $dx = 2u\,du$.
Note that if x = 0, then u = 1, and if x = 3, then u = 2. It follows that
$$\int_0^3 x\sqrt{x+1}\,dx = \int_1^2 (u^2-1)u\cdot 2u\,du = \int_1^2 (2u^4 - 2u^2)\,du = \frac{2}{5}u^5\Big|_1^2 - \frac{2}{3}u^3\Big|_1^2 = \frac{2}{5}(32-1) - \frac{2}{3}(8-1) = \frac{62}{5} - \frac{14}{3} = \frac{116}{15}.$$
Example 3. Evaluate the integral $\int_0^{\pi/2} x\cos x\,dx$.
We will use the method of integration by parts in order to compute the previous integral.
Remark 2. For definite integrals over an interval [a, b] we have the following formula for integration by parts:
$$\int_a^b f'(x)g(x)\,dx = f(x)g(x)\Big|_a^b - \int_a^b f(x)g'(x)\,dx. \qquad (4)$$
Solution.
$$\int_0^{\pi/2} x\cos x\,dx = \int_0^{\pi/2} x(\sin x)'\,dx = x\sin x\Big|_0^{\pi/2} - \int_0^{\pi/2}\sin x\,dx = \frac{\pi}{2} + \cos x\Big|_0^{\pi/2} = \frac{\pi}{2} - 1.$$
One of the most important applications of the definite integral is the calculation of areas bounded by arbitrary curves.
Theorem 5. Let f be a continuous nonnegative function whose domain contains the interval [a, b]. Then the area of the region bounded above by the graph of f, below by the x-axis, and on the left and right by the vertical lines x = a and x = b, respectively, is given by the definite integral $\int_a^b f(x)\,dx$.
Example 4. Find the area of the region situated under the curve $y = x^2 + 1$ from x = −1 to x = 2.
[Figure: the region under the parabola $y = f(x) = x^2 + 1$ between the vertical lines x = −1 and x = 2.]
Solution. The region under consideration is shown in the figure. Using Theorem 5 we have
$$\int_{-1}^{2} (x^2+1)\,dx = \frac{x^3}{3}\Big|_{-1}^{2} + x\Big|_{-1}^{2} = \frac{1}{3}\left[2^3 - (-1)^3\right] + 2 - (-1) = 6.$$
3.3 Improper integrals
3.3.1 Improper integrals
In defining a definite integral (or a Riemann integral) $\int_a^b f(x)\,dx$ it was understood that:
1° the limits of integration were finite numbers;
2° the function f was bounded on the interval [a, b].
Now we will extend the concept of a definite (proper) integral to the case where the length of the interval is infinite and also to the case where f is unbounded. The resulting integral is said to be an improper integral.
So, in conclusion, "improper" means that some part of $\int_a^b f(x)\,dx$ becomes infinite. It might be a or b or the function f.
First we will consider integrals of functions that are defined on unbounded intervals.
To motivate the definition of an improper integral of a function f over an infinite interval, consider the problem of finding the area of the region under the curve $y = f(x) = \frac{1}{x^2}$, above the x-axis, and to the right of the line x = 1 (as shown in the figure below).
[Figure: the curve $y = \frac{1}{x^2}$ for x ≥ 1, with the region between x = 1 and x = t shaded.]
The area that lies to the left of the line x = t (shaded in the figure) is
$$A(t) = \int_1^t \frac{1}{x^2}\,dx = -\frac{1}{x}\Big|_1^t = 1 - \frac{1}{t}.$$
Note that A(t) < 1 no matter how large t is chosen. We also observe that
$$\lim_{t\to\infty} A(t) = \lim_{t\to\infty}\left(1 - \frac{1}{t}\right) = 1.$$
The area of the shaded region approaches 1 as $t\to\infty$, so we can say that the area of the infinite region is equal to 1 and we write
$$\int_1^{\infty}\frac{1}{x^2}\,dx = \lim_{t\to\infty}\int_1^t \frac{1}{x^2}\,dx = \lim_{t\to\infty}\left(1 - \frac{1}{t}\right) = 1.$$
Using this example we define the integral of f over an infinite interval as the limit of integrals over finite intervals.
Definition 1. (Improper integrals on unbounded intervals)
1) Let f : [a, ∞) → R. If $\int_a^t f(x)\,dx$ exists for each $t \ge a$, then
$$\int_a^{\infty} f(x)\,dx = \lim_{t\to\infty}\int_a^t f(x)\,dx.$$
The integral $\int_a^{\infty} f(x)\,dx$ is called an improper integral on an interval unbounded on the right. This integral is said to be convergent if the limit exists and has a finite value, and it is said to be divergent if the limit does not exist or has an infinite value.
2) Let f : (−∞, b] → R. If $\int_t^b f(x)\,dx$ exists for each $t \le b$, then
$$\int_{-\infty}^{b} f(x)\,dx = \lim_{t\to-\infty}\int_t^b f(x)\,dx.$$
The previous integral $\int_{-\infty}^{b} f(x)\,dx$ is called an improper integral on an interval unbounded on the left. The definition of convergence or divergence is similar to the previous case.
3) Let f : R → R and a ∈ R. We define
$$\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^{a} f(x)\,dx + \int_{a}^{+\infty} f(x)\,dx.$$
The improper integral $\int_{-\infty}^{+\infty} f(x)\,dx$ is said to be convergent if both $\int_{-\infty}^{a} f(x)\,dx$ and $\int_{a}^{+\infty} f(x)\,dx$ are convergent. The previous improper integral is divergent if at least one of the improper integrals $\int_{-\infty}^{a} f(x)\,dx$, $\int_{a}^{\infty} f(x)\,dx$ is divergent.
This type of improper integral is easy to identify. It is sufficient to look at the limits of integration. If either the lower limit of integration, the upper limit of integration, or both of them are not finite, it will be an improper integral on an unbounded interval.
Example 1. Evaluate $\int_0^{\infty} e^{-x}\,dx$.
Solution.
$$\int_0^{\infty} e^{-x}\,dx = \lim_{t\to\infty}\int_0^t e^{-x}\,dx = \lim_{t\to\infty}\left(-e^{-x}\Big|_0^t\right) = \lim_{t\to\infty}(-e^{-t} + 1) = 1.$$
We can abbreviate this calculation by writing (instead of writing the limit):
$$\int_0^{\infty} e^{-x}\,dx = -e^{-x}\Big|_0^{\infty} = -0 + 1 = 1.$$
Example 2. Evaluate $\int_{-\infty}^{0} xe^x\,dx$.
Solution. By using the definition of an improper integral we have
$$\int_{-\infty}^{0} xe^x\,dx = \lim_{t\to-\infty}\int_t^0 xe^x\,dx.$$
We integrate by parts with $f(x) = x$ and $g'(x) = e^x$, so that $f'(x) = 1$ and $g(x) = e^x$:
$$\int_t^0 xe^x\,dx = xe^x\Big|_t^0 - \int_t^0 e^x\,dx = -te^t - e^x\Big|_t^0 = -te^t - 1 + e^t.$$
We know that $\lim_{t\to-\infty} e^t = 0$, and by using l'Hospital's rule (Theorem 2, subsection 3.1.2) we get
$$\lim_{t\to-\infty} te^t = \lim_{t\to-\infty}\frac{t}{e^{-t}} = \lim_{y\to\infty}\frac{-y}{e^y}\;\left[\frac{\infty}{\infty}\right] = \lim_{y\to\infty}\frac{(-y)'}{(e^y)'} = \lim_{y\to\infty}\frac{-1}{e^y} = 0.$$
Another way of determining the previous limit is by using the fact that the exponential function goes to infinity faster than any polynomial. So,
$$\lim_{y\to\infty}\frac{P(y)}{e^y} = 0,$$
and in particular
$$\lim_{y\to\infty}\frac{y}{e^y} = 0.$$
Therefore
$$\int_{-\infty}^{0} xe^x\,dx = \lim_{t\to-\infty}(-te^t - 1 + e^t) = -0 - 1 + 0 = -1.$$
Example 3. For what values of α is the integral $\int_1^{\infty}\frac{1}{x^{\alpha}}\,dx$ convergent?
Solution. For α = 1 we have
$$\int_1^{\infty}\frac{1}{x}\,dx = \lim_{t\to\infty}\int_1^t\frac{1}{x}\,dx = \lim_{t\to\infty}\ln|x|\Big|_1^t = \lim_{t\to\infty}(\ln t - \ln 1) = \infty.$$
The limit is not finite, and so the improper integral $\int_1^{\infty}\frac{1}{x}\,dx$ is divergent.
For α ≠ 1 we have
$$\int_1^{\infty}\frac{1}{x^{\alpha}}\,dx = \lim_{t\to\infty}\int_1^t x^{-\alpha}\,dx = \lim_{t\to\infty}\frac{x^{-\alpha+1}}{-\alpha+1}\Big|_1^t = \frac{1}{1-\alpha}\lim_{t\to\infty}\left(\frac{1}{t^{\alpha-1}} - 1\right).$$
If α > 1 then α − 1 > 0 and
$$\lim_{t\to\infty}\frac{1}{t^{\alpha-1}} = 0.$$
Therefore
$$\int_1^{\infty}\frac{1}{x^{\alpha}}\,dx = \frac{1}{\alpha-1}$$
and the integral is convergent.
If α < 1 then 1 − α > 0,
$$\lim_{t\to\infty}\frac{1}{t^{\alpha-1}} = \lim_{t\to\infty}t^{1-\alpha} = \infty,$$
and the integral is divergent.
We can summarize the previous results in the following remark (for future reference).
Remark 1. The improper integral $\int_1^{\infty}\frac{1}{x^{\alpha}}\,dx$ is convergent if α > 1 and divergent if α ≤ 1.
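Remark 1 can be illustrated numerically by evaluating the partial integrals $\int_1^T x^{-\alpha}\,dx$ in closed form for growing T (a sketch; the closed forms follow the computation of Example 3, and T plays the role of $t\to\infty$):

```python
import math

# Partial integrals of 1/x**alpha from 1 to T stay bounded for
# alpha > 1 and grow without bound for alpha <= 1.

def partial_integral(alpha, T):
    if alpha == 1:
        return math.log(T)
    return (T**(1 - alpha) - 1) / (1 - alpha)

# alpha = 2: converges to 1/(alpha - 1) = 1
assert abs(partial_integral(2, 1e8) - 1.0) < 1e-6
# alpha = 1 and alpha = 1/2: the partial integrals keep growing
assert partial_integral(1, 1e8) > 18
assert partial_integral(0.5, 1e8) > 1e4
```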
Example 4. Evaluate $\int_{-\infty}^{0}\cos x\,dx$ if possible.
Solution.
$$\int_{-\infty}^{0}\cos x\,dx = \lim_{t\to-\infty}\int_t^0\cos x\,dx = \lim_{t\to-\infty}\left(\sin x\Big|_t^0\right) = \lim_{t\to-\infty}(-\sin t) = -\lim_{t\to-\infty}\sin t.$$
Since $\lim_{t\to-\infty}\sin t$ does not exist (as in Example 10, subsection 3.1.1), the improper integral is divergent.
We will now analyse the integrals of unbounded functions.
Definition 2. (Improper integrals of unbounded functions)
1) Let f : [a, b) → R be a continuous function on [a, b) with $\lim_{x\nearrow b} f(x) = \infty$ (or −∞). We define the improper integral of the unbounded function f as
$$\int_a^b f(x)\,dx = \lim_{t\nearrow b}\int_a^t f(x)\,dx.$$
This integral is said to be convergent if the limit exists and has a finite value, and it is said to be divergent if the limit does not exist or has an infinite value. The point b is called a critical point or a "bad" point.
2) Let f : (a, b] → R be a continuous function on (a, b] with $\lim_{x\searrow a} f(x) = \infty$ (or −∞). Then
$$\int_a^b f(x)\,dx = \lim_{t\searrow a}\int_t^b f(x)\,dx.$$
The definition of convergence or divergence is similar to the previous case.
3) Let f : [a, c) ∪ (c, b] → R be a continuous function on [a, c) ∪ (c, b] with $\lim_{x\nearrow c} f(x) = \pm\infty$ or $\lim_{x\searrow c} f(x) = \pm\infty$. We define
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$
The improper integral $\int_a^b f(x)\,dx$ is said to be convergent if both $\int_a^c f(x)\,dx$ and $\int_c^b f(x)\,dx$ are convergent. The previous improper integral is divergent if at least one of the improper integrals $\int_a^c f(x)\,dx$, $\int_c^b f(x)\,dx$ is divergent.
The integrals of unbounded functions are more difficult to identify. It is necessary to look at the interval of integration and determine whether the integrand is continuous or not on that interval. Things to look for are fractions for which the denominator becomes zero in the interval of integration.
Example 5. Evaluate $\int_0^3\frac{1}{x-2}\,dx$ if possible.
Solution. Observe that the line x = 2 is a vertical asymptote of the integrand. We have to use part 3) of Definition 2 with c = 2:
$$\int_0^3\frac{1}{x-2}\,dx = \int_0^2\frac{1}{x-2}\,dx + \int_2^3\frac{1}{x-2}\,dx,$$
where
$$\int_0^2\frac{1}{x-2}\,dx = \lim_{t\nearrow 2}\int_0^t\frac{1}{x-2}\,dx = \lim_{t\nearrow 2}\ln|x-2|\Big|_0^t = \lim_{t\nearrow 2}\ln(2-x)\Big|_0^t = \lim_{t\nearrow 2}[\ln(2-t) - \ln 2] = -\infty.$$
Thus $\int_0^2\frac{1}{x-2}\,dx$ is divergent. This implies that $\int_0^3\frac{1}{x-2}\,dx$ is divergent. We do not need to evaluate $\int_2^3\frac{1}{x-2}\,dx$.
If we had not observed the asymptote x = 2 in the previous example and had confused the integral with a proper integral, then we might have made the following erroneous calculation:
$$\int_0^3\frac{1}{x-2}\,dx = \ln|x-2|\Big|_0^3 = \ln 1 - \ln 2 = -\ln 2.$$
This is wrong because the integral is improper and must be calculated in terms of limits.
Example 6. Evaluate $\int_0^4\frac{dx}{\sqrt{x}}$ if possible.
Solution. Observe that $\lim_{x\searrow 0}\frac{1}{\sqrt{x}} = +\infty$. We must use part 2) of Definition 2 with a = 0:
$$\int_0^4\frac{dx}{\sqrt{x}} = \lim_{t\searrow 0}\int_t^4 x^{-\frac{1}{2}}\,dx = \lim_{t\searrow 0}\frac{x^{-\frac{1}{2}+1}}{-\frac{1}{2}+1}\Big|_t^4 = \lim_{t\searrow 0}2\sqrt{x}\Big|_t^4 = \lim_{t\searrow 0}(4 - 2\sqrt{t}) = 4.$$
Hence, the integral converges and $\int_0^4\frac{dx}{\sqrt{x}} = 4$.
Example 7. Evaluate $\int_0^e\ln x\,dx$ if possible.
Solution. Since $\lim_{x\searrow 0}\ln x = -\infty$, the critical point is a = 0. Using integration by parts we get
$$\int_0^e\ln x\,dx = \lim_{t\searrow 0}\int_t^e\ln x\,dx = \lim_{t\searrow 0}\left(x\ln x\Big|_t^e - \int_t^e x\cdot\frac{1}{x}\,dx\right) = \lim_{t\searrow 0}(x\ln x - x)\Big|_t^e = e\ln e - e - \lim_{t\searrow 0}(t\ln t - t) = -\lim_{t\searrow 0}t\ln t,$$
and by l'Hospital's rule
$$\lim_{t\searrow 0}t\ln t = \lim_{t\searrow 0}\frac{\ln t}{\frac{1}{t}}\;\left[\frac{\infty}{\infty}\right] = \lim_{t\searrow 0}\frac{\frac{1}{t}}{-\frac{1}{t^2}} = \lim_{t\searrow 0}(-t) = 0.$$
In conclusion, the integral is convergent and $\int_0^e\ln x\,dx = 0$.
Example 8. For what values of α is the integral $\int_a^b\frac{1}{(x-a)^{\alpha}}\,dx$ convergent?
Solution. For α = 1 we have
$$\int_a^b\frac{1}{x-a}\,dx = \lim_{t\searrow a}\int_t^b\frac{1}{x-a}\,dx = \lim_{t\searrow a}\ln|x-a|\Big|_t^b = \lim_{t\searrow a}(\ln(b-a) - \ln(t-a)) = \infty.$$
For α ≠ 1 we have
$$\int_a^b\frac{1}{(x-a)^{\alpha}}\,dx = \lim_{t\searrow a}\int_t^b (x-a)^{-\alpha}\,dx = \lim_{t\searrow a}\frac{(x-a)^{-\alpha+1}}{-\alpha+1}\Big|_t^b = \lim_{t\searrow a}\frac{1}{1-\alpha}\left[(b-a)^{1-\alpha} - (t-a)^{1-\alpha}\right].$$
If α > 1 then α − 1 > 0 and
$$\lim_{t\searrow a}(t-a)^{1-\alpha} = \lim_{t\searrow a}\frac{1}{(t-a)^{\alpha-1}} = \infty,$$
so the integral is divergent.
If α < 1 then 1 − α > 0 and
$$\lim_{t\searrow a}(t-a)^{1-\alpha} = 0,$$
so the integral is convergent.
We can summarize the previous results in the following remark (for future reference).
Remark 2. a) $\int_a^b\frac{1}{(x-a)^{\alpha}}\,dx$ is convergent if α < 1 and divergent for α ≥ 1.
b) $\int_a^b\frac{1}{(b-x)^{\alpha}}\,dx$ is convergent if α < 1 and divergent for α ≥ 1.
The proof of part b) in Remark 2 is similar to that presented in solving part a), so it will be omitted.
Sometimes an improper integral is too difficult to evaluate. In these cases we can compare the integral with known integrals. The theorem below shows us how to do this.
Theorem 1. (Comparison theorem)
Let f, g : [a, ∞) → R be two continuous functions with f(x) ≥ g(x) ≥ 0 for x ≥ a.
a) If $\int_a^{\infty} f(x)\,dx$ is convergent then $\int_a^{\infty} g(x)\,dx$ is convergent.
b) If $\int_a^{\infty} g(x)\,dx$ is divergent then $\int_a^{\infty} f(x)\,dx$ is divergent.
If we use the previous theorem and Remark 1 we obtain the following criterion for convergence-divergence.
Theorem 2. (Criterion for convergence-divergence)
a) If there is α > 1 such that $\lim_{x\to\infty} x^{\alpha}|f(x)| = c < \infty$, then the improper integral $\int_a^{\infty} f(x)\,dx$ is convergent.
b) If there is 0 < α ≤ 1 such that $\lim_{x\to\infty} x^{\alpha}|f(x)| = c > 0$, then the improper integral $\int_a^{\infty} f(x)\,dx$ is divergent.
Similar results are valid for the improper integral $\int_{-\infty}^{b} f(x)\,dx$.
Concerning the improper integrals of unbounded functions we have the following results.
Theorem 3. (Comparison theorem)
Let f, g : [a, b) → R be two continuous functions such that
$$\lim_{x\nearrow b} f(x) = \lim_{x\nearrow b} g(x) = \infty$$
and
$$f(x) \ge g(x) \ge 0 \quad\text{for } x \in [a, b).$$
a) If $\int_a^b f(x)\,dx$ is convergent then $\int_a^b g(x)\,dx$ is convergent.
b) If $\int_a^b g(x)\,dx$ is divergent then $\int_a^b f(x)\,dx$ is divergent.
Similar results are valid for the improper integral $\int_a^b f(x)\,dx$ where a is the critical point.
If we use Theorem 3 and Remark 2 we obtain the following criteria for convergence-divergence.
Theorem 4. (Criterion for convergence-divergence)
Let f : [a, b) → R be a continuous function such that $\lim_{x\nearrow b}|f(x)| = \infty$.
a) If there is α ∈ (0, 1) such that
$$\lim_{x\nearrow b}(b-x)^{\alpha}|f(x)| = c < \infty,$$
then the improper integral $\int_a^b f(x)\,dx$ is convergent.
b) If there is α ≥ 1 such that
$$\lim_{x\nearrow b}(b-x)^{\alpha}|f(x)| = c > 0,$$
then the improper integral $\int_a^b f(x)\,dx$ is divergent.
Theorem 5. (Criterion for convergence-divergence)
Let f : (a, b] → R be a continuous function such that $\lim_{x\searrow a}|f(x)| = \infty$.
a) If there is α ∈ (0, 1) such that
$$\lim_{x\searrow a}(x-a)^{\alpha}|f(x)| = c < \infty,$$
then the improper integral $\int_a^b f(x)\,dx$ is convergent.
b) If there is α ≥ 1 such that
$$\lim_{x\searrow a}(x-a)^{\alpha}|f(x)| = c > 0,$$
then the improper integral is divergent.
Example 9. Show that $\int_0^{\infty} e^{-x^2}\,dx$ is convergent.
Solution. We can't evaluate the integral directly because we are not able to compute the antiderivative of $e^{-x^2}$. We write
$$\int_0^{\infty} e^{-x^2}\,dx = \int_0^1 e^{-x^2}\,dx + \int_1^{\infty} e^{-x^2}\,dx.$$
The first integral on the right-hand side is just a proper integral. In the second integral we use the fact that for x ≥ 1 we have $x^2 \ge x$, so $e^{-x^2} \le e^{-x}$. The integral of $e^{-x}$ is easy to evaluate:
$$\int_1^{\infty} e^{-x}\,dx = \lim_{t\to\infty}\int_1^t e^{-x}\,dx = \lim_{t\to\infty}(e^{-1} - e^{-t}) = \frac{1}{e}.$$
Thus, taking $f(x) = e^{-x}$ and $g(x) = e^{-x^2}$ in the Comparison theorem (Theorem 1), we see that $\int_1^{\infty} e^{-x^2}\,dx$ is convergent, and so is $\int_0^{\infty} e^{-x^2}\,dx$.
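The comparison argument only proves convergence. A midpoint quadrature over a long finite interval (a sketch with our own names) gives the numerical value, which matches the classical Gaussian value $\sqrt{\pi}/2$; that value is stated here as an aside and is not derived in the text.

```python
import math

# The tail beyond x = 10 is below e**(-100), so integrating over
# [0, 10] approximates the improper integral very accurately.

def f(x):
    return math.exp(-x * x)

def midpoint(f, a, b, n):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) * h for k in range(n))

approx = midpoint(f, 0.0, 10.0, 200000)
# Classical value of the Gaussian integral on [0, infinity)
assert abs(approx - math.sqrt(math.pi) / 2) < 1e-8
```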
3.3.2 Euler's integrals
Euler's integrals are special functions (defined by using improper integrals) that are used in probability and in the computation of certain integrals.
Beta function
The integral $\int_0^1 x^{p-1}(1-x)^{q-1}\,dx$ is called Euler's first integral.
This integral can be an improper integral of an unbounded function, where the potential critical points are 0 and 1.
If p < 1 then 0 is a critical point since
$$\lim_{x\searrow 0} x^{p-1}(1-x)^{q-1} = \infty.$$
If q < 1 then 1 is a critical point since
$$\lim_{x\nearrow 1} x^{p-1}(1-x)^{q-1} = \infty.$$
If p ≥ 1 and q ≥ 1 then Euler's first integral is a definite (proper) integral.
Concerning the convergence of Euler's first integral we have the following result.
Theorem 1.
a) If p > 0 and q > 0 then Euler's first integral is convergent.
b) If p ≤ 0 or q ≤ 0 then Euler's first integral is divergent.
Proof. We split first the integral as
$$\int_0^1 x^{p-1}(1-x)^{q-1}\,dx = \int_0^{1/2} x^{p-1}(1-x)^{q-1}\,dx + \int_{1/2}^1 x^{p-1}(1-x)^{q-1}\,dx$$
and we study the convergence of both improper integrals on the right-hand side of the previous equality.
We use Theorem 5 of section 3.3.1 to study the convergence of the first improper integral mentioned before:
$$\lim_{x\searrow 0} x^{\alpha}\,x^{p-1}(1-x)^{q-1} = \lim_{x\searrow 0} x^{\alpha+p-1} = \begin{cases}\infty, & \text{if } \alpha+p-1 < 0\\ 1, & \text{if } \alpha+p-1 = 0\\ 0, & \text{if } \alpha+p-1 > 0\end{cases}$$
The previous limit is finite if α + p − 1 ≥ 0 and is positive if α + p − 1 ≤ 0.
The improper integral $\int_0^{1/2} x^{p-1}(1-x)^{q-1}\,dx$ is convergent if there is α ∈ (0, 1) such that the previous limit is finite. We are looking for α ∈ (0, 1) such that α + p − 1 ≥ 0. So we need to have 1 − p ≤ α < 1, which is possible if p > 0. Therefore for p > 0 the improper integral $\int_0^{1/2} x^{p-1}(1-x)^{q-1}\,dx$ is convergent.
The improper integral $\int_0^{1/2} x^{p-1}(1-x)^{q-1}\,dx$ is divergent if there is α ≥ 1 such that the previous limit is positive. We are looking for α ≥ 1 such that α + p − 1 ≤ 0. So we need to have 1 ≤ α ≤ 1 − p, which is possible if p ≤ 0. Therefore for p ≤ 0 the improper integral $\int_0^{1/2} x^{p-1}(1-x)^{q-1}\,dx$ is divergent.
Similar arguments, based on Theorem 4, give us the following results: for q > 0 the improper integral $\int_{1/2}^1 x^{p-1}(1-x)^{q-1}\,dx$ is convergent, and for q ≤ 0 the improper integral $\int_{1/2}^1 x^{p-1}(1-x)^{q-1}\,dx$ is divergent, as desired.
Since for p > 0 and q > 0 Euler's first integral is convergent, we can define the following function, which is called the Beta function:
$$B : (0, \infty)\times(0, \infty) \to \mathbb{R}, \qquad B(p, q) = \int_0^1 x^{p-1}(1-x)^{q-1}\,dx. \qquad (1)$$
Theorem 2. (Properties of the Beta function)
B1) $B(p, 1) = \frac{1}{p}$ for each p > 0; in particular B(1, 1) = 1.
B2) $B\left(\frac{1}{2}, \frac{1}{2}\right) = \pi$.
B3) $B(p, q) = B(q, p)$, for each p > 0 and q > 0.
B4) $B(p, q) = \frac{p-1}{p+q-1}B(p-1, q)$, for each p > 1 and q > 0.
B5) $B(p, q) = \frac{q-1}{p+q-1}B(p, q-1)$, for each p > 0 and q > 1.
B6) $B(m, n) = \frac{(m-1)!\,(n-1)!}{(m+n-1)!}$, for each $m, n \in \mathbb{N}^*$.
B7) $B(p, 1-p) = \frac{\pi}{\sin p\pi}$, for each 0 < p < 1.
Proofs. (for statements B1) to B6))
B1) $B(p,1)=\displaystyle\int_0^1 x^{p-1}(1-x)^{1-1}\,dx=\int_0^1 x^{p-1}\,dx=\left.\frac{x^p}{p}\right|_0^1=\frac1p$.
If we let $p=1$ in the previous equality we get $B(1,1)=1$.
B2)
$$B\!\left(\frac12,\frac12\right)=\int_0^1 x^{-1/2}(1-x)^{-1/2}\,dx=\int_0^1\frac{dx}{\sqrt{x-x^2}}=\int_0^1\frac{dx}{\sqrt{\frac14-\left(x-\frac12\right)^2}}$$
$$=\left.\arcsin\frac{x-\frac12}{\frac12}\right|_0^1=\arcsin 1-\arcsin(-1)=2\arcsin 1=2\cdot\frac{\pi}{2}=\pi.$$
B3) $B(p,q)=\displaystyle\int_0^1 x^{p-1}(1-x)^{q-1}\,dx$.
Let $t=1-x$, so that $x=1-t$ and $dx=-dt$. When $x=1$, $t=0$, and when $x=0$, $t=1$. Making the indicated substitution we find:
$$B(p,q)=\int_1^0(1-t)^{p-1}t^{q-1}(-dt)=-\int_1^0(1-t)^{p-1}t^{q-1}\,dt=\int_0^1(1-t)^{p-1}t^{q-1}\,dt=B(q,p).$$
B4) Let $p>1$ and $q>0$. Using integration by parts we obtain:
$$B(p,q)=\int_0^1 x^{p-1}\left(-\frac{(1-x)^q}{q}\right)'dx=-\frac1q\,x^{p-1}(1-x)^q\Big|_0^1+\frac1q\int_0^1(p-1)x^{p-2}(1-x)^q\,dx$$
$$=\frac{p-1}{q}\int_0^1 x^{p-2}(1-x)^{q-1}(1-x)\,dx=\frac{p-1}{q}\int_0^1 x^{p-2}(1-x)^{q-1}\,dx-\frac{p-1}{q}\int_0^1 x^{p-1}(1-x)^{q-1}\,dx$$
$$=\frac{p-1}{q}B(p-1,q)-\frac{p-1}{q}B(p,q).$$
From the previous equality we have
$$B(p,q)\left(1+\frac{p-1}{q}\right)=\frac{p-1}{q}B(p-1,q),$$
from which we finally obtain
$$B(p,q)=\frac{p-1}{p+q-1}B(p-1,q),$$
as desired.
B5) Let $q>1$ and $p>0$. Using successively properties B3) and B4) we obtain:
$$B(p,q)\overset{B3)}{=}B(q,p)\overset{B4)}{=}\frac{q-1}{p+q-1}B(q-1,p)\overset{B3)}{=}\frac{q-1}{p+q-1}B(p,q-1).$$
B6) The desired equality can be obtained by applying successively properties B4), B5) and B1) as follows:
$$B(m,n)\overset{B4)}{=}\frac{m-1}{m+n-1}B(m-1,n)\overset{B4)}{=}\frac{m-1}{m+n-1}\cdot\frac{m-2}{m+n-2}B(m-2,n)\overset{B4)}{=}\dots$$
$$\overset{B4)}{=}\frac{m-1}{m+n-1}\cdot\frac{m-2}{m+n-2}\cdots\frac{1}{n+1}B(1,n)\overset{B5)}{=}\frac{m-1}{m+n-1}\cdots\frac{1}{n+1}\cdot\frac{n-1}{n}B(1,n-1)$$
$$\overset{B5)}{=}\frac{m-1}{m+n-1}\cdots\frac{1}{n+1}\cdot\frac{n-1}{n}\cdot\frac{n-2}{n-1}B(1,n-2)\overset{B5)}{=}\dots$$
$$\overset{B5)}{=}\frac{(m-1)!}{(m+n-1)\cdots(n+1)}\cdot\frac{n-1}{n}\cdot\frac{n-2}{n-1}\cdots\frac12\,B(1,1)\overset{B1)}{=}\frac{(m-1)!\,(n-1)!}{(m+n-1)!}.$$
B7) The proof of this statement is beyond the scope of this text.
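As a quick numerical sanity check, the properties above can be spot-checked in Python. The sketch below is ours, not part of the text; it assumes the standard library's `math.gamma` and computes $B$ through the Gamma representation $B(p,q)=\Gamma(p)\Gamma(q)/\Gamma(p+q)$ (property Γ4) of Theorem 4 below).

```python
import math

# B(p, q) via the Gamma representation B = Γ(p)Γ(q)/Γ(p+q)
# (property Γ4 of Theorem 4 below); math.gamma is the standard-library Γ.
def beta(p, q):
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

p, q = 3.7, 2.2
# B3) symmetry
assert abs(beta(p, q) - beta(q, p)) < 1e-12
# B4) B(p, q) = (p-1)/(p+q-1) * B(p-1, q)
assert abs(beta(p, q) - (p - 1) / (p + q - 1) * beta(p - 1, q)) < 1e-12
# B5) B(p, q) = (q-1)/(p+q-1) * B(p, q-1)
assert abs(beta(p, q) - (q - 1) / (p + q - 1) * beta(p, q - 1)) < 1e-12
# B7) B(p, 1-p) = π/sin(pπ) for 0 < p < 1
assert abs(beta(0.25, 0.75) - math.pi / math.sin(0.25 * math.pi)) < 1e-12
```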
Example 1. By using the properties of the Beta function compute the following values:
a) $B(11,9)$;  b) $B\!\left(\dfrac52,\dfrac12\right)$;  c) $B\!\left(\dfrac74,\dfrac14\right)$.
Solution. a)
$$B(11,9)\overset{B6)}{=}\frac{(11-1)!\,(9-1)!}{(11+9-1)!}=\frac{10!\,8!}{19!}$$
b)
$$B\!\left(\frac52,\frac12\right)\overset{B4)}{=}\frac{\frac52-1}{\frac52+\frac12-1}B\!\left(\frac52-1,\frac12\right)=\frac34\,B\!\left(\frac32,\frac12\right)\overset{B4)}{=}\frac34\cdot\frac{\frac32-1}{\frac32+\frac12-1}B\!\left(\frac32-1,\frac12\right)=\frac38\,B\!\left(\frac12,\frac12\right)\overset{B2)}{=}\frac38\pi$$
c)
$$B\!\left(\frac74,\frac14\right)\overset{B4)}{=}\frac{\frac74-1}{\frac74+\frac14-1}B\!\left(\frac74-1,\frac14\right)=\frac34\,B\!\left(\frac34,\frac14\right)\overset{B3)}{=}\frac34\,B\!\left(\frac14,\frac34\right)\overset{B7)}{=}\frac34\cdot\frac{\pi}{\sin\frac{\pi}{4}}=\frac34\cdot\frac{\pi}{\frac{\sqrt2}{2}}=\frac{3\pi}{2\sqrt2}$$
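The three values can be spot-checked numerically; the small Python sketch below is ours and assumes `math.gamma` together with the Gamma representation $B(p,q)=\Gamma(p)\Gamma(q)/\Gamma(p+q)$ (property Γ4) of Theorem 4 below).

```python
import math

def beta(p, q):  # B(p, q) = Γ(p)Γ(q)/Γ(p+q), property Γ4 of Theorem 4
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

# a) B(11, 9) = 10! 8! / 19!
assert abs(beta(11, 9) - math.factorial(10) * math.factorial(8) / math.factorial(19)) < 1e-12
# b) B(5/2, 1/2) = 3π/8
assert abs(beta(2.5, 0.5) - 3 * math.pi / 8) < 1e-12
# c) B(7/4, 1/4) = 3π/(2√2)
assert abs(beta(1.75, 0.25) - 3 * math.pi / (2 * math.sqrt(2))) < 1e-12
```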
Example 2. By using the properties of the Beta function compute the following integrals:
a) $\displaystyle\int_0^1 x^{10}(1-x)^8\,dx$;  b) $\displaystyle\int_0^1 x\sqrt{\frac{x}{1-x}}\,dx$;  c) $\displaystyle\int_0^1\sqrt[4]{\left(\frac{1-x}{x}\right)^3}\,dx$.
Solution. a)
$$\int_0^1 x^{10}(1-x)^8\,dx=\int_0^1 x^{11-1}(1-x)^{9-1}\,dx=B(11,9)=\frac{10!\,8!}{19!}$$
(see Example 1, part a)).
b)
$$\int_0^1 x\sqrt{\frac{x}{1-x}}\,dx=\int_0^1 x\cdot\frac{x^{1/2}}{(1-x)^{1/2}}\,dx=\int_0^1 x^{3/2}(1-x)^{-1/2}\,dx=\int_0^1 x^{\frac52-1}(1-x)^{\frac12-1}\,dx=B\!\left(\frac52,\frac12\right)=\frac38\pi$$
(see Example 1, part b)).
c)
$$\int_0^1\sqrt[4]{\left(\frac{1-x}{x}\right)^3}\,dx=\int_0^1 x^{-3/4}(1-x)^{3/4}\,dx=\int_0^1 x^{\frac14-1}(1-x)^{\frac74-1}\,dx=B\!\left(\frac14,\frac74\right)\overset{B3)}{=}B\!\left(\frac74,\frac14\right)=\frac{3\pi}{2\sqrt2}$$
(see Example 1, part c)).
Example 3. By using the properties of the Beta function compute the following integrals:
a) $\displaystyle\int_0^{\pi/2}\sin^4 x\cos^2 x\,dx$;  b) $\displaystyle\int_0^{\infty}\frac{\sqrt[3]{x}}{1+x^2}\,dx$.
Solution. a) We make the change of variable $\sin^2 x=t$. Since $\sin x=\sqrt t$ we get $x=\arcsin\sqrt t$ and hence
$$dx=\frac{1}{\sqrt{1-t}}\cdot\frac{1}{2\sqrt t}\,dt.$$
We have to change the limits of integration: if $x=0$ then $t=0$, and if $x=\frac{\pi}{2}$ then $t=1$.
$$\int_0^{\pi/2}\sin^4 x\cos^2 x\,dx=\int_0^1 t^2(1-t)\,\frac{1}{\sqrt{1-t}\cdot 2\sqrt t}\,dt=\frac12\int_0^1 t^{3/2}(1-t)^{1/2}\,dt=\frac12\int_0^1 t^{\frac52-1}(1-t)^{\frac32-1}\,dt$$
$$=\frac12\,B\!\left(\frac52,\frac32\right)\overset{B5)}{=}\frac12\cdot\frac{\frac32-1}{\frac52+\frac32-1}B\!\left(\frac52,\frac32-1\right)=\frac{1}{12}\,B\!\left(\frac52,\frac12\right)=\frac{1}{12}\cdot\frac38\pi=\frac{\pi}{32}$$
(see Example 1, part b)).
b) We make the change of variable
$$x^2=\frac{t}{1-t}.$$
Since $x=\sqrt{\dfrac{t}{1-t}}$ we get
$$dx=\frac{1}{2\sqrt{\dfrac{t}{1-t}}}\left(\frac{t}{1-t}\right)'dt,$$
that is,
$$dx=\frac12\sqrt{\frac{1-t}{t}}\cdot\frac{1}{(1-t)^2}\,dt.$$
We have to change the limits of integration: if $x=0$ then $t=0$, and if $x=\infty$ then $t=1$.
$$\int_0^{\infty}\frac{\sqrt[3]{x}}{1+x^2}\,dx=\int_0^1\frac{\sqrt[6]{\dfrac{t}{1-t}}}{1+\dfrac{t}{1-t}}\cdot\frac12\sqrt{\frac{1-t}{t}}\cdot\frac{1}{(1-t)^2}\,dt=\frac12\int_0^1\sqrt[6]{\frac{t}{1-t}\cdot\frac{(1-t)^3}{t^3}\cdot\frac{1}{(1-t)^6}}\;dt$$
$$=\frac12\int_0^1\sqrt[6]{\frac{1}{t^2(1-t)^4}}\,dt=\frac12\int_0^1 t^{-1/3}(1-t)^{-2/3}\,dt=\frac12\int_0^1 t^{\frac23-1}(1-t)^{\frac13-1}\,dt$$
$$=\frac12\,B\!\left(\frac23,\frac13\right)\overset{B3)}{=}\frac12\,B\!\left(\frac13,\frac23\right)\overset{B7)}{=}\frac12\cdot\frac{\pi}{\sin\frac{\pi}{3}}=\frac12\cdot\frac{\pi}{\frac{\sqrt3}{2}}=\frac{\pi}{\sqrt3}$$
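Both results can be confirmed numerically. The sketch below is ours: a plain midpoint rule for part a), and the Gamma representation $B(p,q)=\Gamma(p)\Gamma(q)/\Gamma(p+q)$ (property Γ4) of Theorem 4 below) for part b).

```python
import math

# a) midpoint-rule check of ∫_0^{π/2} sin^4(x) cos^2(x) dx = π/32
N = 100000
h = (math.pi / 2) / N
approx = sum(math.sin((k + 0.5) * h) ** 4 * math.cos((k + 0.5) * h) ** 2
             for k in range(N)) * h
assert abs(approx - math.pi / 32) < 1e-8

# b) ∫_0^∞ x^{1/3}/(1+x^2) dx = (1/2) B(2/3, 1/3) = π/√3, checked through Γ
half_beta = 0.5 * math.gamma(2 / 3) * math.gamma(1 / 3) / math.gamma(1.0)
assert abs(half_beta - math.pi / math.sqrt(3)) < 1e-12
```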
Gamma function
The integral $\displaystyle\int_0^{\infty}x^{p-1}e^{-x}\,dx$ is called Euler's second integral.
This integral is improper on an unbounded interval. If $p<1$ then 0 is also a critical point for this integral.
Concerning the convergence of Euler's second integral we have the following result:
Theorem 3. a) If $p>0$ then Euler's second integral is convergent.
b) If $p\le 0$ then Euler's second integral is divergent.
Proof. We first split the integral as
$$\int_0^{\infty}x^{p-1}e^{-x}\,dx=\int_0^1 x^{p-1}e^{-x}\,dx+\int_1^{\infty}x^{p-1}e^{-x}\,dx$$
and we study the convergence of both improper integrals on the right-hand side of the previous equality.
Similar arguments (based on Theorem 5, section 3.3.1) to those used in the proof of Theorem 1 give us the following results: for $p>0$ the improper integral $\int_0^1 x^{p-1}e^{-x}\,dx$ is convergent and for $p\le 0$ the improper integral $\int_0^1 x^{p-1}e^{-x}\,dx$ is divergent.
We use Theorem 2, section 3.3.1, to study the convergence of the improper integral $\int_1^{\infty}x^{p-1}e^{-x}\,dx$.
We have
$$\lim_{x\to\infty}x^{\alpha}\left|x^{p-1}e^{-x}\right|=\lim_{x\to\infty}\frac{x^{\alpha+p-1}}{e^{x}}=0<\infty$$
for each $\alpha$, in particular for $\alpha>1$ (the previous limit is 0 since the exponential function goes to infinity faster than any power function). Hence the considered integral is convergent, as desired.
Since for $p>0$ Euler's second integral is convergent, we can define the following function, which is called the Gamma function:
$$\Gamma:(0,\infty)\to\mathbb{R},\qquad \Gamma(p)=\int_0^{\infty}x^{p-1}e^{-x}\,dx \qquad (2)$$
The Gamma function is also known as the generalized factorial function. We will present next the basic properties of the Gamma function and its relation with $n!$.
Theorem 4. (Properties of the Gamma function)
Γ1) $\Gamma(1)=1$.
Γ2) $\Gamma(p)=(p-1)\Gamma(p-1)$, for each $p>1$.
Γ3) $\Gamma(n)=(n-1)!$, for each $n\in\mathbb{N}^{*}$.
Γ4) $B(p,q)=\dfrac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}$, for each $p>0$ and $q>0$.
Γ5) $\Gamma\!\left(\dfrac12\right)=\sqrt{\pi}$.
Proof. Γ1)
$$\Gamma(1)=\int_0^{\infty}x^{1-1}e^{-x}\,dx=\int_0^{\infty}e^{-x}\,dx=-e^{-x}\Big|_0^{\infty}=-0+1=1$$
Γ2) Integration by parts gives us:
$$\Gamma(p)=\int_0^{\infty}x^{p-1}e^{-x}\,dx=\lim_{t\to\infty}\int_0^t x^{p-1}\left(-e^{-x}\right)'dx$$
$$=\lim_{t\to\infty}\left[-e^{-x}x^{p-1}\Big|_0^t+(p-1)\int_0^t x^{p-2}e^{-x}\,dx\right]$$
$$=-\lim_{t\to\infty}\frac{t^{p-1}}{e^{t}}+(p-1)\lim_{t\to\infty}\int_0^t x^{p-2}e^{-x}\,dx=0+(p-1)\Gamma(p-1)=(p-1)\Gamma(p-1)$$
Γ3) The desired equality can be obtained by applying successively properties Γ2) and Γ1) as follows:
$$\Gamma(n)\overset{\Gamma2)}{=}(n-1)\Gamma(n-1)\overset{\Gamma2)}{=}(n-1)(n-2)\Gamma(n-2)\overset{\Gamma2)}{=}\dots\overset{\Gamma2)}{=}(n-1)(n-2)\cdots1\cdot\Gamma(1)\overset{\Gamma1)}{=}(n-1)\cdots1=(n-1)!$$
Γ4) There will be no proof of this item.
Γ5) We take $p=q=\frac12$ in the Euler relation Γ4) to obtain
$$B\!\left(\frac12,\frac12\right)=\frac{\left[\Gamma\!\left(\frac12\right)\right]^2}{\Gamma(1)},$$
whence
$$\left[\Gamma\!\left(\frac12\right)\right]^2=\pi.$$
Since $\Gamma\!\left(\frac12\right)>0$, from the last equality we get $\Gamma\!\left(\frac12\right)=\sqrt{\pi}$. This completes the proof.
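Properties Γ1)-Γ3) and Γ5) can be spot-checked against Python's standard-library implementation of Γ, `math.gamma`; the sketch below is ours.

```python
import math

# math.gamma implements Γ; spot-check properties Γ1)-Γ3) and Γ5)
assert math.gamma(1) == 1.0                                   # Γ1)
p = 4.3
assert abs(math.gamma(p) - (p - 1) * math.gamma(p - 1)) < 1e-12  # Γ2)
for n in range(1, 10):
    assert abs(math.gamma(n) - math.factorial(n - 1)) < 1e-6     # Γ3)
assert abs(math.gamma(0.5) - math.sqrt(math.pi)) < 1e-12         # Γ5)
```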
Example 4. Compute the following integrals:
a) $\displaystyle\int_0^1\sqrt{x^5-x^6}\,dx$;  b) $\displaystyle\int_0^{\infty}x^2 e^{-x/5}\,dx$;  c) $\displaystyle\int_2^{\infty}x e^{2-x}\,dx$;  d) $\displaystyle\int_0^{\infty}x e^{2-x}\,dx$.
Solution. a)
$$\int_0^1\sqrt{x^5-x^6}\,dx=\int_0^1\sqrt{x^5(1-x)}\,dx=\int_0^1 x^{5/2}(1-x)^{1/2}\,dx=\int_0^1 x^{\frac72-1}(1-x)^{\frac32-1}\,dx=B\!\left(\frac72,\frac32\right)=\frac{\Gamma\!\left(\frac72\right)\Gamma\!\left(\frac32\right)}{\Gamma(5)}$$
It remains for us to compute $\Gamma(5)$, $\Gamma\!\left(\frac32\right)$ and $\Gamma\!\left(\frac72\right)$.
$$\Gamma(5)\overset{\Gamma3)}{=}(5-1)!=4!=24$$
$$\Gamma\!\left(\frac32\right)\overset{\Gamma2)}{=}\left(\frac32-1\right)\Gamma\!\left(\frac32-1\right)=\frac12\,\Gamma\!\left(\frac12\right)\overset{\Gamma5)}{=}\frac{\sqrt\pi}{2}$$
$$\Gamma\!\left(\frac72\right)\overset{\Gamma2)}{=}\left(\frac72-1\right)\Gamma\!\left(\frac72-1\right)=\frac52\,\Gamma\!\left(\frac52\right)\overset{\Gamma2)}{=}\frac52\left(\frac52-1\right)\Gamma\!\left(\frac52-1\right)=\frac52\cdot\frac32\,\Gamma\!\left(\frac32\right)=\frac{15}{4}\cdot\frac{\sqrt\pi}{2}=\frac{15\sqrt\pi}{8}$$
In consequence,
$$\int_0^1\sqrt{x^5-x^6}\,dx=\frac{\Gamma\!\left(\frac72\right)\Gamma\!\left(\frac32\right)}{\Gamma(5)}=\frac{\frac{15\sqrt\pi}{8}\cdot\frac{\sqrt\pi}{2}}{24}=\frac{5\pi}{128}$$
b) In order to use the Gamma function we make the following substitutions:
$$\frac{x}{5}=t;\qquad x=5t;\qquad dx=5\,dt.$$
If $x=0$ then $t=0$ and if $x=\infty$ then $t=\infty$.
$$\int_0^{\infty}x^2 e^{-x/5}\,dx=\int_0^{\infty}(5t)^2 e^{-t}\cdot5\,dt=125\int_0^{\infty}t^2 e^{-t}\,dt=125\int_0^{\infty}t^{3-1}e^{-t}\,dt=125\,\Gamma(3)=125\cdot2!=250$$
c) In order to use the Gamma function we make the following substitutions:
$$t=x-2;\qquad x=t+2;\qquad dx=dt.$$
If $x=2$ then $t=0$ and if $x=\infty$ then $t=\infty$.
$$\int_2^{\infty}x e^{2-x}\,dx=\int_0^{\infty}(t+2)e^{-t}\,dt=\int_0^{\infty}te^{-t}\,dt+2\int_0^{\infty}e^{-t}\,dt$$
$$=\int_0^{\infty}t^{2-1}e^{-t}\,dt+2\int_0^{\infty}t^{1-1}e^{-t}\,dt=\Gamma(2)+2\,\Gamma(1)=(2-1)!+2\cdot1=3$$
d)
$$\int_0^{\infty}xe^{2-x}\,dx=\int_0^{\infty}e^2\,xe^{-x}\,dx=e^2\int_0^{\infty}xe^{-x}\,dx=e^2\int_0^{\infty}x^{2-1}e^{-x}\,dx=e^2\,\Gamma(2)=e^2(2-1)!=e^2.$$
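A numerical cross-check of the four answers, assuming Python's `math.gamma` (our sketch, not part of the text): part b) is integrated directly with a midpoint rule on a truncated interval, the other parts reuse the Gamma values computed above.

```python
import math

# b) ∫_0^∞ x^2 e^{-x/5} dx = 250; truncate at x = 200 (tail ~ e^{-40}),
# then apply the midpoint rule.
N, b = 200000, 200.0
h = b / N
approx = sum(((k + 0.5) * h) ** 2 * math.exp(-(k + 0.5) * h / 5)
             for k in range(N)) * h
assert abs(approx - 250) < 1e-3

# a), c), d) through the Gamma values from the worked solution:
assert abs(math.gamma(3.5) * math.gamma(1.5) / math.gamma(5) - 5 * math.pi / 128) < 1e-12  # a)
assert abs(math.gamma(2) + 2 * math.gamma(1) - 3) < 1e-12                                  # c)
assert abs(math.e ** 2 * math.gamma(2) - math.e ** 2) < 1e-12                              # d)
```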
Euler-Poisson integral
The integral $\displaystyle\int_0^{\infty}e^{-x^2}\,dx$ is called the Euler-Poisson integral.
As we saw in example 9, subsection 3.3.1 the previous integral is convergent.
Next, by using the substitution
$$t=x^2,\qquad x=\sqrt t,\qquad dx=\frac{1}{2\sqrt t}\,dt,$$
we will evaluate the Euler-Poisson integral. We observe that under the previous change of variable the limits of integration remain the same.
$$\int_0^{\infty}e^{-x^2}\,dx=\frac12\int_0^{\infty}\frac{1}{\sqrt t}e^{-t}\,dt=\frac12\int_0^{\infty}t^{-1/2}e^{-t}\,dt=\frac12\int_0^{\infty}t^{\frac12-1}e^{-t}\,dt=\frac12\,\Gamma\!\left(\frac12\right)=\frac{\sqrt\pi}{2}$$
In conclusion,
$$\int_0^{\infty}e^{-x^2}\,dx=\frac{\sqrt\pi}{2}.$$
Theorem 5. (Properties of the Euler-Poisson integral)
a) $\displaystyle\int_0^{\infty}e^{-x^2}\,dx=\frac{\sqrt\pi}{2}$;
b) $\displaystyle\int_{-\infty}^{\infty}e^{-x^2}\,dx=\sqrt\pi$;
c) $\displaystyle\int_{-\infty}^{\infty}e^{-x^2/2}\,dx=\sqrt{2\pi}$.
Proof. a) The equality was already proved.
b)
$$\int_{-\infty}^{\infty}e^{-x^2}\,dx=\int_{-\infty}^{0}e^{-x^2}\,dx+\int_0^{\infty}e^{-x^2}\,dx.$$
We compute separately the first integral on the right-hand side of the equality by making the substitutions $t=-x$, $x=-t$ and $dx=-dt$. If $x=-\infty$ then $t=\infty$ and if $x=0$ then $t=0$. Hence
$$\int_{-\infty}^{0}e^{-x^2}\,dx=-\int_{\infty}^{0}e^{-t^2}\,dt=\int_0^{\infty}e^{-t^2}\,dt=\int_0^{\infty}e^{-x^2}\,dx=\frac{\sqrt\pi}{2}.$$
Finally, we get
$$\int_{-\infty}^{\infty}e^{-x^2}\,dx=\frac{\sqrt\pi}{2}+\frac{\sqrt\pi}{2}=\sqrt\pi.$$
c) By making the change of variable
$$\frac{x}{\sqrt2}=t;\qquad x=t\sqrt2;\qquad dx=\sqrt2\,dt,$$
we obtain
$$\int_{-\infty}^{\infty}e^{-x^2/2}\,dx=\int_{-\infty}^{\infty}e^{-(\sqrt2\,t)^2/2}\,\sqrt2\,dt=\sqrt2\int_{-\infty}^{\infty}e^{-t^2}\,dt=\sqrt2\cdot\sqrt\pi=\sqrt{2\pi}.$$
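A direct numerical check of part a) is possible with a few lines of Python (our sketch): truncate the unbounded interval at x = 10, where the tail is smaller than e^{-100}, and apply the midpoint rule.

```python
import math

# Midpoint-rule check of ∫_0^∞ e^{-x^2} dx = √π/2, truncated at x = 10.
N, b = 100000, 10.0
h = b / N
approx = sum(math.exp(-((k + 0.5) * h) ** 2) for k in range(N)) * h
assert abs(approx - math.sqrt(math.pi) / 2) < 1e-6
```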
Chapter 4
Differential calculus of several variables

4.1 Real functions of several variables. Limits and continuity

4.1.1 Real functions of several variables
In many practical situations, the value of one quantity may depend on the values of two or more others. For example, the output of a factory depends on the amount
butter may depend on the price of butter and on the price of margarine. Relationships
of this type can be represented mathematically by functions having more than one
variable.
We shall restrict ﬁrst our attention to functions of two variables.
Definition 1. A real function f of two variables is a rule that assigns to each ordered pair of real numbers $(x,y)$ in a set $A$ a unique real number denoted by $f(x,y)$:
$$f:A\to\mathbb{R},\qquad A\ni(x,y)\mapsto f(x,y)\in\mathbb{R}.$$
The set $A$ is the domain of $f$ (usually the largest set for which the rule of $f$ makes sense), the set $\mathbb{R}$ in which $f$ takes its values is called the target space, and its range is the set of values that $f$ takes on, that is, $\{f(x,y)\mid(x,y)\in A\}$.
We often write z = f(x, y) to make explicit the value taken on by f at the general
point (x, y). The variables x and y are independent variables and z is the dependent
variable.
A function of two variables is a function whose domain is a subset of $\mathbb{R}^2$ and whose range is a subset of $\mathbb{R}$. One way of visualizing such a function is by using an arrow diagram, where the domain is a subset of $\mathbb{R}^2$.
[Figure: arrow diagram mapping the points (x, y) and (a, b) of the domain to the values f(x, y) and f(a, b) on the real line.]
If a function f is given by a formula and no domain is speciﬁed, then the domain
of f is the largest set for which the rule of f makes sense.
Example 1. For the following function find the domain and evaluate $f(3,2)$:
$$f(x,y)=\frac{\sqrt{x+y+1}}{x-1}.$$
Solution. The given expression is well-defined if the denominator is not 0 and the quantity under the square root sign is nonnegative. In conclusion, the domain of $f$ is
$$A=\{(x,y)\mid x+y+1\ge0,\ x\ne1\}.$$
The inequality $x+y+1\ge0$, or $y\ge-x-1$, represents the points that lie on or above the line $y=-x-1$, while the condition $x\ne1$ means that the points on the line $x=1$ must be excluded from the domain.
[Figure: the domain A, bounded below by the line y = −x − 1, with the vertical line x = 1 removed.]
$$f(3,2)=\frac{\sqrt{3+2+1}}{3-1}=\frac{\sqrt6}{2}$$
Example 2. In 1928 Charles Cobb and Paul Douglas published a study in which they modeled the growth of the American economy during the period 1899-1922. They considered a simplified view of the economy in which production output is determined by the amount of labor involved and the amount of capital invested.
The function used to model production was
$$Q(L,K)=bL^{\alpha}K^{1-\alpha} \qquad (1)$$
where $Q$ is the total production (the monetary value of all goods produced in a year), $L$ is the amount of labor (the total number of hours worked in a year) and $K$ is the amount of capital invested.
Cobb and Douglas used the method of least squares (see 4.7.1) to fit the data published by the government to the function
$$Q(L,K)=1.01\,L^{0.75}K^{0.25}.$$
The production function (1) has been used in many settings, from individual firms to global economic questions. Its domain is $\{(L,K)\mid L\ge0,\ K\ge0\}$ because $L$ and $K$ represent labor and capital and are never negative.
Example 3. Suppose that at a certain factory, output is given by the Cobb-Douglas production function
$$Q(K,L)=60\,K^{1/3}L^{2/3},$$
where $K$ is the capital investment measured in units of 1000 Euros and $L$ is the size of the labor force measured in worker-hours.
a) Compute the output if the capital investment is 512,000 Euros and 1000 worker-hours are used.
b) Show that the output in part a) will double if both the capital investment and the size of the labor force are doubled.
Solution. a) Evaluate $Q(K,L)$ with $K=512$ and $L=1000$:
$$Q(512,1000)=60\cdot512^{1/3}\cdot1000^{2/3}=60\cdot8\cdot100=48000 \text{ (units)}$$
b)
$$Q(2\cdot512,\,2\cdot1000)=60\,(2\cdot512)^{1/3}(2\cdot1000)^{2/3}=60\cdot2^{1/3}\,512^{1/3}\cdot2^{2/3}\,1000^{2/3}$$
$$=2^{1/3+2/3}\cdot60\cdot512^{1/3}\,1000^{2/3}=2\,Q(512,1000)=2\cdot48000=96000.$$
Using a calculation similar to the one in part b) of the previous example, it can be shown that if both capital and labor are multiplied by some positive number $m$, then output will also be multiplied by $m$. In economics, production functions with this property are said to have constant returns to scale. Indeed,
$$Q(mK,mL)=b(mK)^{\alpha}(mL)^{1-\alpha}=m^{\alpha}m^{1-\alpha}\,bK^{\alpha}L^{1-\alpha}=mQ(K,L).$$
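The computation in Example 3 and the constant-returns property can be replayed in a few lines of Python (a sketch; the function name `q` is our choice, not from the text):

```python
def q(k, labor):
    """Cobb-Douglas output Q(K, L) = 60 K^(1/3) L^(2/3) from Example 3."""
    return 60 * k ** (1 / 3) * labor ** (2 / 3)

base = q(512, 1000)
assert abs(base - 48000) < 1e-6          # part a)
# constant returns to scale: Q(mK, mL) = m Q(K, L)
for m in (2, 3, 10):
    assert abs(q(m * 512, m * 1000) - m * base) < 1e-6
```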
Another way of visualizing the behaviour of a function of two variables is to consider its graph.
Definition 2. If $f$ is a real function of two variables with domain $A$, then the graph of $f$ is the set
$$G_f=\{(x,y,z)\in\mathbb{R}^3\mid(x,y)\in A,\ z=f(x,y)\}. \qquad (2)$$
The graph of a function $f$ of two variables is a surface $S\subset\mathbb{R}^3$ with equation $z=f(x,y)$.
Example 4. Sketch the graph of the function
$$f(x,y)=6-3x-2y,\qquad A=\mathbb{R}^2.$$
Solution. The graph has the equation
$$z=6-3x-2y \quad\text{or}\quad 3x+2y+z=6,$$
which represents a plane. We determine first the intercepts. Putting $y=z=0$ in the equation, we get $x=2$ as the $x$-intercept. Similarly, the $y$-intercept is 3 and the $z$-intercept is 6.
[Figure: the plane through the intercepts (2,0,0), (0,3,0) and (0,0,6).]
The graph of a function of two variables is, in general, difficult to sketch, and we shall not develop a systematic procedure for sketching the graphs of such functions. However, computer programs are available for graphing functions of two variables. Fortunately, there is another way to visualize a function from $\mathbb{R}^2$ to $\mathbb{R}$: the study of the level curves of $f$.
Suppose $f$ is a function of two variables $x$ and $y$. If $c$ is some value in the range of the function $f$, then the equation $f(x,y)=c$ describes a curve lying on the plane $z=c$, called the trace of the graph of $f$ in the plane $z=c$. If this trace is projected onto the $xy$-plane, the resulting curve in the $xy$-plane is called a level curve. Actually, for any constant $c$, the points $(x,y)$ for which $f(x,y)=c$ form a curve in the $xy$-plane that is called a level curve of $f$.
Example 5. If $f(x,y)=x^2-y$, $f:\mathbb{R}^2\to\mathbb{R}$, sketch the level curves $f(x,y)=4$ and $f(x,y)=9$.
Solution. The level curve $f(x,y)=9$ consists of all points $(x,y)$ in the $xy$-plane for which
$$x^2-y=9 \quad\text{or}\quad y=x^2-9.$$
The latter equality represents a quadratic function whose graph is sketched below.
[Figure: the parabolas y = x² − 4 and y = x² − 9, with x-intercepts ±2 and ±3 and y-intercepts −4 and −9.]
Definition 3. Let $f:A\subseteq\mathbb{R}^n\to\mathbb{R}$. $f$ is called a real function of $n$ variables if and only if to each $x\in A$ there corresponds one and only one element $f(x)\in\mathbb{R}$:
$$A\ni x=(x_1,x_2,\dots,x_n)\mapsto f(x)\in\mathbb{R}.$$
The set $A$ is called the domain of $f$, $\mathbb{R}$ is called the target space, and the set $\{f(x)\mid x\in A\}$ is called the range of $f$, or the image of $f$.
4.1.2 Limits. Continuity
Global limit
In order to make things clear from the beginning, we shall discuss first the case $n=2$.
Intuitively, $f$ has the limit $l$ at a given point $(a,b)$ if the values $f(x,y)$ approach $l$ when $(x,y)$ approaches $(a,b)$. This limit can be written as
$$\lim_{(x,y)\to(a,b)}f(x,y)=l \quad\text{or}\quad \lim_{\substack{x\to a\\ y\to b}}f(x,y)=l.$$
In the last limit $x\to a$ and $y\to b$ at the same time and independently.
In the previous discussion we mentioned the word "approach", and we want to measure somehow this "notion of approaching" a given point.
In one variable, $x\to a$ means that we can approach $a$ only from the left side and from the right side. In two or more variables the situation is more complicated: there are infinitely many ways of approaching a point $(a,b)$. For instance, we can approach along vertical or horizontal lines, along every straight line which passes through $(a,b)$, or along every curve (as we can see in the figure below).
[Figure: several straight-line and curved paths in the plane approaching the point (a, b).]
We will use the notion of distance between two points to measure how close one point is to another.
We can say that $\lim_{(x,y)\to(a,b)}f(x,y)=l$ if the distance between $f(x,y)$ and $l$ can be made arbitrarily small by making the distance from $(x,y)$ to $(a,b)$ sufficiently small.
We are prepared now to present the rigorous deﬁnition of a limit of a function at
a given point.
Definition 1. (the global limit)
Let $f:A\to\mathbb{R}$, $A\subseteq\mathbb{R}^2$ and $(a,b)\in A'$.
We say that the limit of $f$ as $(x,y)$ approaches $(a,b)$ is $l$, and we write
$$\lim_{(x,y)\to(a,b)}f(x,y)=l,$$
if and only if $\forall\,\varepsilon>0$, $\exists\,\delta>0$ such that if $(x,y)\in A$ and
$$0<\sqrt{(x-a)^2+(y-b)^2}<\delta,$$
then $|f(x,y)-l|<\varepsilon$.
Other notations for the limit in Definition 1 are
$$f(x,y)\to l \text{ as } (x,y)\to(a,b) \quad\text{and}\quad \lim_{\substack{x\to a\\ y\to b}}f(x,y)=l$$
(here $x\to a$, $y\to b$ independently and at the same time).
[Figure: points (x, y) within distance δ of (a, b) are mapped by f into the interval (l − ε, l + ε).]
From the previous discussion we obtain the following remark: if we can find two different paths of approach along which $f$ has different limits, then the global limit does not exist.
Remark 1. If $f(x,y)\to l_1$ as $(x,y)\to(a,b)$ along a path $C_1$, and $f(x,y)\to l_2$ as $(x,y)\to(a,b)$ along a path $C_2$, where $l_1\ne l_2$, then $\lim_{(x,y)\to(a,b)}f(x,y)$ does not exist.
Example 1. Show that $\lim_{(x,y)\to(0,0)}\dfrac{x^2-3y^2}{x^2+y^2}$ does not exist.
Solution. First we approach $(0,0)$ along the $x$-axis. Then $y=0$ and $f(x,0)=\dfrac{x^2}{x^2}=1$ for all $x\ne0$, so $f(x,y)\to1$ as $(x,y)\to(0,0)$ along the $x$-axis.
We now approach along the $y$-axis by putting $x=0$. Then
$$f(0,y)=-\frac{3y^2}{y^2}=-3 \text{ for all } y\ne0,$$
so $f(x,y)\to-3$ as $(x,y)\to(0,0)$ along the $y$-axis.
Since $f$ has two different limits along two different paths, the global limit does not exist.
Example 2. Study the existence of the following limit:
$$\lim_{(x,y)\to(0,0)}\frac{xy}{x^2+y^2}.$$
Solution. It is obvious that $A=\mathbb{R}^2\setminus\{(0,0)\}$, so $(0,0)\in A'$.
If $(x,y)\to(0,0)$ along the $x$-axis then $y=0$ and
$$\lim_{\substack{x\to0\\ y=0}}f(x,y)=\lim_{x\to0}f(x,0)=\lim_{x\to0}\frac{x\cdot0}{x^2+0}=0.$$
In the same way,
$$\lim_{\substack{x=0\\ y\to0}}f(x,y)=0.$$
Even though we have obtained identical limits along the axes, that does not ensure that the given limit is 0. We can go to $(0,0)$ along another line, let's say $y=mx$ (the equation of a non-vertical line which passes through $(0,0)$):
$$\lim_{\substack{x\to0\\ y=mx}}f(x,y)=\lim_{\substack{x\to0\\ y=mx}}\frac{xy}{x^2+y^2}=\lim_{x\to0}\frac{x\cdot mx}{x^2+m^2x^2}=\lim_{x\to0}\frac{x^2m}{x^2(1+m^2)}=\frac{m}{1+m^2}.$$
Since we have obtained different limits along different paths, the global limit does not exist.
Example 3. Does the limit $\lim_{(x,y)\to(0,0)}\dfrac{x^2y}{x^4+y^2}$ exist?
Solution. If $(x,y)\to(0,0)$ along any non-vertical line which passes through the origin, $y=mx$, then
$$\lim_{\substack{x\to0\\ y=mx}}f(x,y)=\lim_{x\to0}\frac{x^2\cdot mx}{x^4+m^2x^2}=\lim_{x\to0}\frac{xm}{x^2+m^2}=0.$$
So $f(x,y)\to0$ as $(x,y)\to(0,0)$ along $y=mx$. Even though $f$ has the same limiting value along all non-vertical lines, that does not show that the given limit is 0.
Indeed, if we now let $(x,y)\to(0,0)$ along the parabola $y=mx^2$, we have
$$\lim_{\substack{x\to0\\ y=mx^2}}\frac{x^2y}{x^4+y^2}=\lim_{x\to0}\frac{x^2\cdot mx^2}{x^4+m^2x^4}=\lim_{x\to0}\frac{mx^4}{x^4(1+m^2)}=\frac{m}{1+m^2}.$$
Since different paths lead to different limiting values, the given limit does not exist.
Example 4. Show that $\lim_{(x,y)\to(0,0)}\dfrac{x^3y}{x^6+y^2}$ does not exist.
Solution. If we let $y=mx$ or $y=mx^2$ we obtain that
$$\lim_{(x,y)\to(0,0)}f(x,y)=0.$$
The limit does not exist, since on $y=x^3$ we have
$$\lim_{\substack{x\to0\\ y=x^3}}\frac{x^3y}{x^6+y^2}=\lim_{x\to0}\frac{x^6}{x^6+x^6}=\frac12.$$
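The path argument can also be seen numerically. The sketch below (ours) follows two paths from Example 4 toward the origin: along the cubic y = x³ the values stay near 1/2, while along the line y = 2x they shrink to 0.

```python
def f(x, y):
    return x**3 * y / (x**6 + y**2)

for x in (0.1, 0.01, 0.001):
    assert abs(f(x, x**3) - 0.5) < 1e-9   # path y = x^3: values near 1/2
assert abs(f(0.001, 2 * 0.001)) < 1e-5    # path y = 2x: already tiny near 0
```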
Concerning limits that do exist, their computation can be greatly simplified by the use of the properties of limits. As in the case of one variable we have that the limit of a sum is the sum of the limits, the limit of a product is the product of the limits, etc. (see subsection 3.1.1). The squeeze theorem also holds (Theorem 2, subsection 3.1.1).
Example 5. Compute
$$\lim_{(x,y)\to(0,0)}\frac{xy}{\sqrt{xy+4}-2}\ \left[\frac00\right]=\lim_{\substack{x\to0\\ y\to0}}\frac{xy\left(\sqrt{xy+4}+2\right)}{xy+4-4}=\lim_{\substack{x\to0\\ y\to0}}\frac{xy\left(\sqrt{xy+4}+2\right)}{xy}=\lim_{\substack{x\to0\\ y\to0}}\left(\sqrt{xy+4}+2\right)=4.$$
Example 6. Compute $\lim_{\substack{x\to0\\ y\to0}}\dfrac{3x^2y}{x^2+y^2}$ if it exists.
Solution. Since $x^2\le x^2+y^2$ and $-3|y|\le3y\le3|y|$, then
$$-3|y|\le\frac{3x^2y}{x^2+y^2}\le3|y|.$$
We know that $3|y|\to0$ as $(x,y)\to(0,0)$, and by using the Squeeze Theorem we obtain that
$$\lim_{\substack{x\to0\\ y\to0}}\frac{3x^2y}{x^2+y^2}=0.$$
Another way of computing the considered limit is to use the following result.
Remark 2. If $f$ is a bounded function and
$$\lim_{(x,y)\to(a,b)}g(x,y)=0,$$
then
$$\lim_{(x,y)\to(a,b)}[f(x,y)g(x,y)]=0.$$
In our example
$$f(x,y)=\frac{x^2}{x^2+y^2}\quad\text{and}\quad g(x,y)=3y.$$
It is obvious that
$$f(x,y)=\frac{x^2}{x^2+y^2}\le1$$
and $\lim_{\substack{x\to0\\ y\to0}}3y=0$, so the product limit is 0.
We can now present the definition of the limit in the general case ($n$ variables).
Definition 2. Let $f:A\to\mathbb{R}$, $A\subseteq\mathbb{R}^n$ and $a\in A'$.
We say that the limit of $f$ as $x$ approaches $a$ is $l$, and we write
$$\lim_{x\to a}f(x)=l,$$
iff $\forall\,\varepsilon>0$, $\exists\,\delta>0$ such that if $x\in A$ and $0<\|x-a\|<\delta$ then $|f(x)-l|<\varepsilon$ (iff is an abbreviation for "if and only if").
Other notations for the limit in the previous definition are $f(x)\to l$ as $x\to a$ and $\lim_{\substack{x_1\to a_1\\ \dots\\ x_n\to a_n}}f(x)=l$ (here $x_1\to a_1,\dots,x_n\to a_n$ independently and at the same time).
The following theorem will permit us to compute limits.
Theorem 1. (basic limit theorem) Let $f,g,h:A\subseteq\mathbb{R}^n\to\mathbb{R}$ and $a\in A'$. Suppose $\lim_{x\to a}f(x)=l$ and $\lim_{x\to a}g(x)=m$. Then:
a) $\lim_{x\to a}[f(x)+g(x)]=l+m$;
b) $\lim_{x\to a}[cf(x)]=cl$, where $c$ is a real constant;
c) $\lim_{x\to a}f(x)g(x)=lm$;
d) $\lim_{x\to a}\dfrac{f(x)}{g(x)}=\dfrac{l}{m}$, provided $m\ne0$;
e) moreover, if $l=m$, i.e. $\lim_{x\to a}f(x)=\lim_{x\to a}g(x)$, and if $f(x)\le h(x)\le g(x)$ for $x$ near $a$, then $\lim_{x\to a}h(x)=l$ (the Squeeze Theorem for functions of several variables).
In part e) of the previous theorem, "$x$ near $a$" means $x$ in a ball centered at $a$.
Iterated limits for functions of two variables
Definition 3. Let $f:A\to\mathbb{R}$, $A\subseteq\mathbb{R}^2$, $(a,b)\in A'$. The iterated limits $l_{12}$ and $l_{21}$ are defined as
$$l_{12}=\lim_{x\to a}\left(\lim_{y\to b}f(x,y)\right)$$
and
$$l_{21}=\lim_{y\to b}\left(\lim_{x\to a}f(x,y)\right),$$
provided that the previous limits exist.
In the case of iterated limits, $x\to a$ and $y\to b$ independently but not at the same time, as you can see in the figure below.
[Figure: for l₁₂ the point (x, y) moves first to (x, b) and then to (a, b); for l₂₁ it moves first to (a, y) and then to (a, b).]
Remark 3. (The connection between the iterated limits and the global limit)
a) If the iterated limits $l_{12}$, $l_{21}$ exist and $l_{12}\ne l_{21}$, then the global limit does not exist.
b) The existence and equality of the iterated limits does not ensure that the global limit exists.
Next, we will illustrate the statements of the previous results by two examples.
Example 7. Study the existence of the iterated limits and of the global limit in the following cases:
a) $f:\mathbb{R}^2\to\mathbb{R}$,
$$f(x,y)=\begin{cases}\dfrac{x^2-3y^2}{x^2+y^2}, & (x,y)\ne(0,0)\\[4pt] 0, & (x,y)=(0,0)\end{cases}$$
b) $f:\mathbb{R}^2\to\mathbb{R}$,
$$f(x,y)=\begin{cases}\dfrac{xy}{x^2+y^2}, & (x,y)\ne(0,0)\\[4pt] 0, & (x,y)=(0,0)\end{cases}$$
Solution. a)
$$l_{12}=\lim_{x\to0}\left(\lim_{y\to0}\frac{x^2-3y^2}{x^2+y^2}\right)=\lim_{x\to0}\frac{x^2}{x^2}=\lim_{x\to0}1=1$$
$$l_{21}=\lim_{y\to0}\left(\lim_{x\to0}\frac{x^2-3y^2}{x^2+y^2}\right)=\lim_{y\to0}\frac{-3y^2}{y^2}=\lim_{y\to0}(-3)=-3$$
Hence $l_{12}\ne l_{21}$. The global limit does not exist (see Example 1).
b) As we have already observed, the global limit does not exist (see Example 2).
$$l_{12}=\lim_{x\to0}\left(\lim_{y\to0}\frac{xy}{x^2+y^2}\right)=\lim_{x\to0}\frac{0}{x^2}=\lim_{x\to0}0=0$$
$$l_{21}=\lim_{y\to0}\left(\lim_{x\to0}\frac{xy}{x^2+y^2}\right)=\lim_{y\to0}\frac{0}{y^2}=\lim_{y\to0}0=0$$
So, in conclusion, even though $l_{12}=l_{21}$, the global limit does not exist.
Directional limits
Definition 4. (the limit in the direction of a given vector)
Let $f:A\to\mathbb{R}$, $A\subseteq\mathbb{R}^n$, $a\in A'$ and let $h\in\mathbb{R}^n\setminus\{\theta\}$. The limit in the direction of the vector $h$ is defined by
$$l_h=\lim_{t\to0}f(a+th).$$
In the case $n=2$ we can present geometrically the way of approaching $a$ in this type of limit.
[Figure: the point a + th approaching a along the straight line through a with direction h.]
As you can see, $a+th$ goes to $a$ along the straight line which passes through $a$ and has the same direction as the vector $h$.
Example 8. Compute $l_h$ in the case when
$$f:A\to\mathbb{R},\quad f(x,y)=\frac{x^2+y}{2x+y^2},\quad a=(1,0),\quad h=(h_1,h_2)\ne\theta.$$
Solution. $A=\{(x,y)\in\mathbb{R}^2\mid y^2\ne-2x\}$.
The domain $A$ of $f$ is the set of points in $\mathbb{R}^2$ which do not lie on the parabola whose equation is $y^2=-2x$. Hence $a=(1,0)\in A'$.
$$a+th=(1,0)+t(h_1,h_2)=(1,0)+(th_1,th_2)=(1+th_1,\,th_2)$$
$$l_h=\lim_{t\to0}f(a+th)=\lim_{t\to0}f(1+th_1,\,th_2)=\lim_{t\to0}\frac{(1+th_1)^2+th_2}{2(1+th_1)+t^2h_2^2}=\frac12$$
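The directional limit can be observed numerically; the sketch below (ours) shrinks t for one arbitrarily chosen direction h = (3, −2) and watches the values approach l_h = 1/2.

```python
def f(x, y):
    return (x**2 + y) / (2 * x + y**2)

h1, h2 = 3.0, -2.0  # an arbitrary direction vector h ≠ θ
prev = abs(f(1 + 0.1 * h1, 0.1 * h2) - 0.5)
for t in (0.01, 0.001, 0.0001):
    err = abs(f(1 + t * h1, t * h2) - 0.5)
    assert err < prev  # the values keep approaching l_h = 1/2
    prev = err
```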
Remark 4. (The connection between the global limit and the directional limits)
1) If the global limit exists and is equal to $l$, then every directional limit exists and is equal to $l$.
2) If there are two different vectors $h\ne h'$ ($h,h'\ne\theta$) such that $l_h\ne l_{h'}$, then the global limit does not exist.
3) If every directional limit exists and is equal to the same real number, it is still not certain that the global limit exists.
Continuity
Definition 5. Let $f:A\to\mathbb{R}$, $A\subseteq\mathbb{R}^n$ and $a\in A$. If $a\in A'$ we say that $f$ is continuous at $a$ if
$$\lim_{x\to a}f(x)=f(a).$$
If $a$ is an isolated point, then $f$ is continuous at $a$.
A function $f$ is said to be continuous on the set $A$ if $f$ is continuous at every point of $A$.
The intuitive meaning of continuity is that if the point $x$ (in $\mathbb{R}^n$) changes by a small amount, then the value of $f(x)$ changes by a small amount. This means that a surface that is the graph of a continuous function (of two variables) has no holes or breaks.
Using the properties of limits, it can easily be seen that sums, differences, products, quotients and compositions of continuous functions are continuous on their domains.
Theorem 2. (Basic continuity theorem) Let $f,g:A\subseteq\mathbb{R}^n\to\mathbb{R}$ and let $a\in A$. Suppose $f$ and $g$ are continuous at $a$. Then $f+g$, $fg$ and $cf$ ($c\in\mathbb{R}$, constant) are continuous at $a$, as is $\dfrac{f}{g}$ provided $g(a)\ne0$.
Theorem 3. (Continuity composition theorem) Let $f:A\to\mathbb{R}$, $A\subseteq\mathbb{R}^n$, $h:B\to\mathbb{R}$ such that $f(A)\subseteq B\subseteq\mathbb{R}$, and let $a\in A$. Suppose $f$ is continuous at $a$ and that $h$ is continuous at $f(a)$. Then the function $h\circ f$,
$$h\circ f:A\to\mathbb{R},\qquad (h\circ f)(x)=h(f(x)),$$
is continuous at $a$.
Example 9. Let $f:\mathbb{R}^2\to\mathbb{R}$ be defined by
$$f(x,y)=\begin{cases}\dfrac{3x^2y}{x^2+y^2}, & (x,y)\ne(0,0)\\[4pt] 0, & (x,y)=(0,0)\end{cases}$$
Study the continuity of $f$.
Solution. We know that $f$ is continuous for $(x,y)\ne(0,0)$ since it is equal to a rational function there.
Also, from Example 6 we have
$$\lim_{(x,y)\to(0,0)}f(x,y)=0=f(0,0).$$
Therefore $f$ is continuous at $(0,0)$, and so it is continuous on $\mathbb{R}^2$.
Example 10. Study the continuity of the function $f:\mathbb{R}^2\to\mathbb{R}$,
$$f(x,y)=\begin{cases}\dfrac{1-\cos(x^3+y^3)}{x^2+y^2}, & \text{if }(x,y)\ne(0,0)\\[4pt] \alpha, & \text{if }(x,y)=(0,0)\end{cases}$$
where $\alpha$ is a real parameter.
Solution. We know that $f$ is continuous for $(x,y)\ne(0,0)$ since it is equal to a composition of elementary functions there.
By using the formula
$$1-\cos2z=2\sin^2 z$$
and the fact that
$$\lim_{z\to0}\frac{\sin z}{z}\ \left[\frac00\right]=\lim_{z\to0}\frac{(\sin z)'}{z'}=\lim_{z\to0}\cos z=1,$$
we have
$$\lim_{(x,y)\to(0,0)}f(x,y)=\lim_{(x,y)\to(0,0)}\frac{1-\cos(x^3+y^3)}{x^2+y^2}=\lim_{(x,y)\to(0,0)}\frac{2\sin^2\left(\dfrac{x^3+y^3}{2}\right)}{x^2+y^2}$$
$$=2\lim_{(x,y)\to(0,0)}\frac{\sin^2\left(\dfrac{x^3+y^3}{2}\right)}{\left(\dfrac{x^3+y^3}{2}\right)^2}\cdot\frac{\left(\dfrac{x^3+y^3}{2}\right)^2}{x^2+y^2}=\frac12\left[\lim_{(x,y)\to(0,0)}\frac{\sin\dfrac{x^3+y^3}{2}}{\dfrac{x^3+y^3}{2}}\right]^2\lim_{(x,y)\to(0,0)}\frac{x^6+y^6+2x^3y^3}{x^2+y^2}$$
$$=\frac12\lim_{(x,y)\to(0,0)}\left[\frac{(x^2+y^2)(x^4-x^2y^2+y^4)}{x^2+y^2}+\frac{2x^3y^3}{x^2+y^2}\right]=\frac12\left[\lim_{(x,y)\to(0,0)}(x^4-x^2y^2+y^4)+2\lim_{(x,y)\to(0,0)}\frac{x^3y^3}{x^2+y^2}\right]$$
$$=\frac12\left[0+2\lim_{(x,y)\to(0,0)}\frac{x^3y^3}{x^2+y^2}\right]=\lim_{(x,y)\to(0,0)}\frac{x^2}{x^2+y^2}\cdot xy^3=0.$$
The last limit is 0 by using Remark 2, since
$$0\le\frac{x^2}{x^2+y^2}\le1\quad\text{and}\quad\lim_{(x,y)\to(0,0)}xy^3=0.$$
In conclusion, if $\alpha=0$ then $f$ is continuous on $\mathbb{R}^2$, and if $\alpha\ne0$ then $f$ is continuous on $\mathbb{R}^2\setminus\{(0,0)\}$.
We end this section by presenting (without proof) a result concerning continuous functions defined on compact sets. We will need this result later.
Theorem 4. (Weierstrass theorem; extreme value theorem) Suppose $f:K\to\mathbb{R}$ is continuous and $K$ is a closed and bounded subset of $\mathbb{R}^n$ (hence a compact subset). Then $f$ has a maximum value and a minimum value on $K$.
4.2 Partial derivatives
An important goal in economic analysis is to understand how a change in one economic variable affects another.
We have seen (subsection 3.1.2) that one-variable calculus helps us to understand such changes in the case of functions of one variable.
Since we are interested in the variation brought about by the change in one variable, we will change one variable at a time, keeping all the other variables constant; the corresponding derivative is called the partial derivative of $f$ with respect to the considered variable.
Remark 1. From now on all the domains of definition of the considered functions will be open and connected sets and will be denoted by $D$. Recall that an open connected set is a domain (see Appendix A). Hence any point of $D$ is an accumulation point of $D$.
For a better understanding of these notions we will start with the case of functions of two variables.
Definition 1. (Partial differentiability at a point; two-variable case) Let $f:D\to\mathbb{R}$, $D\subseteq\mathbb{R}^2$ and let $(a,b)\in D$.
We say that $f$ is partially differentiable with respect to the variable $x$ at the point $(a,b)$ if the following limit exists and has a finite value:
$$\frac{\partial f}{\partial x}(a,b)=f'_x(a,b)=\lim_{x\to a}\frac{f(x,b)-f(a,b)}{x-a}=\lim_{h\to0}\frac{f(a+h,b)-f(a,b)}{h} \qquad (1)$$
In the same way we can define
$$\frac{\partial f}{\partial y}(a,b)=f'_y(a,b)=\lim_{y\to b}\frac{f(a,y)-f(a,b)}{y-b}=\lim_{h\to0}\frac{f(a,b+h)-f(a,b)}{h} \qquad (2)$$
Example 1. Compute the partial derivatives of $f$ at the point $(2,1)$, where $f:\mathbb{R}^2\to\mathbb{R}$, $f(x,y)=x^2+y^2+xy$.
Solution.
$$\frac{\partial f}{\partial x}(2,1)=\lim_{x\to2}\frac{f(x,1)-f(2,1)}{x-2}=\lim_{x\to2}\frac{x^2+1+x-(2^2+1^2+2\cdot1)}{x-2}=\lim_{x\to2}\frac{x^2+x-6}{x-2}=\lim_{x\to2}\frac{(x-2)(x+3)}{x-2}=\lim_{x\to2}(x+3)=5$$
$$\frac{\partial f}{\partial y}(2,1)=\lim_{y\to1}\frac{f(2,y)-f(2,1)}{y-1}=\lim_{y\to1}\frac{4+y^2+2y-7}{y-1}=\lim_{y\to1}\frac{y^2+2y-3}{y-1}=\lim_{y\to1}\frac{(y-1)(y+3)}{y-1}=\lim_{y\to1}(y+3)=4$$
Definition 2. (Partial differentiability on a set; two-variable case) Let $f:D\to\mathbb{R}$, $D\subseteq\mathbb{R}^2$. We say that the function $f$ is partially differentiable with respect to $x$ on the domain $D$ if $f$ is partially differentiable with respect to $x$ at every point $(x,y)$ in $D$.
In this case the following limit exists and has a finite value at each point $(x,y)$ in $D$:
$$\frac{\partial f}{\partial x}(x,y)=f'_x(x,y)=\lim_{h\to0}\frac{f(x+h,y)-f(x,y)}{h},\quad\text{for each }(x,y)\in D.$$
Similarly, we can define partial differentiability of $f$ with respect to $y$ on a set:
$$\frac{\partial f}{\partial y}(x,y)=f'_y(x,y)=\lim_{h\to0}\frac{f(x,y+h)-f(x,y)}{h},\quad\text{for each }(x,y)\in D.$$
In this case we can define the partial derivative functions of $f$ with respect to $x$, respectively to $y$:
$$\frac{\partial f}{\partial x}=f'_x:D\to\mathbb{R},\qquad (x,y)\mapsto\frac{\partial f}{\partial x}(x,y) \qquad (3)$$
respectively
$$\frac{\partial f}{\partial y}=f'_y:D\to\mathbb{R},\qquad (x,y)\mapsto\frac{\partial f}{\partial y}(x,y). \qquad (4)$$
Actually, to find the partial derivative function with respect to $x$ we think of $y$ as a constant and differentiate in the usual way with respect to $x$. This gives another method of computing the partial derivatives at a given point: first we compute the partial derivative functions, and then we evaluate them at the considered point.
Example 2. Another way of computing the partial derivatives in Example 1 is the following:
$$\frac{\partial f}{\partial x}:\mathbb{R}^2\to\mathbb{R},\qquad \frac{\partial f}{\partial x}(x,y)=(x^2+y^2+xy)'_x=(x^2)'_x+(y^2)'_x+(xy)'_x=2x+0+y=2x+y.$$
So
$$\frac{\partial f}{\partial x}(2,1)=2\cdot2+1=5.$$
$$\frac{\partial f}{\partial y}(x,y)=(x^2+y^2+xy)'_y=(x^2)'_y+(y^2)'_y+(xy)'_y=2y+x$$
and
$$\frac{\partial f}{\partial y}(2,1)=2\cdot1+2=4,$$
as we expected.
As the above computations show, in order to avoid mistakes we have to state with respect to which variable each partial derivative function is computed.
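Both methods can be cross-checked numerically. The following sketch (an illustration, not part of the text) approximates the partial derivatives of $f(x, y) = x^2 + y^2 + xy$ at (2, 1) from Examples 1 and 2 by central difference quotients:

```python
# Central-difference approximation of the partial derivatives of
# f(x, y) = x^2 + y^2 + xy at the point (2, 1).
def f(x, y):
    return x**2 + y**2 + x*y

def partial_x(f, a, b, h=1e-6):
    # (f(a+h, b) - f(a-h, b)) / (2h) approximates df/dx(a, b)
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

def partial_y(f, a, b, h=1e-6):
    # (f(a, b+h) - f(a, b-h)) / (2h) approximates df/dy(a, b)
    return (f(a, b + h) - f(a, b - h)) / (2 * h)

print(round(partial_x(f, 2, 1), 6))  # close to 5 = 2*2 + 1
print(round(partial_y(f, 2, 1), 6))  # close to 4 = 2*1 + 2
```

The difference quotients agree with the exact values 5 and 4 up to rounding error.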
Example 3. Compute the partial derivatives of each of the following functions:
a) $f : D \to \mathbb{R}$, $f(x, y) = \dfrac{xy}{x^2 + y^2}$
b) $g : D \to \mathbb{R}$, $g(s, t) = (s^2 - st + t^2)^5$
c) $f : D \to \mathbb{R}$, $f(x, y) = \sqrt[5]{xy}$.
Solution. a) $D = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \neq 0\} = \mathbb{R}^2 \setminus \{(0, 0)\}$.
To compute $\frac{\partial f}{\partial x}$, think of the variable y as a constant and apply the quotient rule (see subsection 3.1.2):
\[
\frac{\partial f}{\partial x}(x, y) = \left( \frac{xy}{x^2 + y^2} \right)'_x = \frac{(xy)'_x (x^2 + y^2) - xy (x^2 + y^2)'_x}{(x^2 + y^2)^2} = \frac{y(x^2 + y^2) - xy \cdot 2x}{(x^2 + y^2)^2} = \frac{y(y^2 - x^2)}{(x^2 + y^2)^2}
\]
To compute $\frac{\partial f}{\partial y}$, think of the variable x as a constant and apply the quotient rule:
\[
\frac{\partial f}{\partial y}(x, y) = \left( \frac{xy}{x^2 + y^2} \right)'_y = \frac{(xy)'_y (x^2 + y^2) - xy (x^2 + y^2)'_y}{(x^2 + y^2)^2} = \frac{x(x^2 + y^2) - xy \cdot 2y}{(x^2 + y^2)^2} = \frac{x(x^2 - y^2)}{(x^2 + y^2)^2}
\]
b) $D = \mathbb{R}^2$.
To compute $\frac{\partial g}{\partial s}$, we treat the variable t as if it were a constant:
\[
\frac{\partial g}{\partial s}(s, t) = \left[ (s^2 - st + t^2)^5 \right]'_s = 5(s^2 - st + t^2)^4 (s^2 - st + t^2)'_s = 5(s^2 - st + t^2)^4 (2s - t)
\]
In the same way we obtain
\[
\frac{\partial g}{\partial t}(s, t) = \left[ (s^2 - st + t^2)^5 \right]'_t = 5(s^2 - st + t^2)^4 (s^2 - st + t^2)'_t = 5(s^2 - st + t^2)^4 (2t - s)
\]
c) $D = \mathbb{R}^2$.
\[
\frac{\partial f}{\partial x}(x, y) = (\sqrt[5]{xy})'_x = \sqrt[5]{y}\,(\sqrt[5]{x})'_x = \sqrt[5]{y}\,(x^{1/5})' = \sqrt[5]{y} \cdot \frac{1}{5} x^{1/5 - 1} = \sqrt[5]{y} \cdot \frac{1}{5 \sqrt[5]{x^4}} = \frac{1}{5} \sqrt[5]{\frac{y}{x^4}}
\]
The previous equality is true for $x \neq 0$.
If $x = 0$, in order to obtain the partial derivative with respect to x at points of the form (0, y) we cannot simply differentiate and substitute (x, y) = (0, y); instead we use Definition 1:
\[
\frac{\partial f}{\partial x}(0, y) = \lim_{x \to 0} \frac{f(x, y) - f(0, y)}{x - 0} = \lim_{x \to 0} \frac{\sqrt[5]{xy}}{x} = \lim_{x \to 0} \sqrt[5]{\frac{y}{x^4}} = \begin{cases} \infty, & \text{if } y > 0 \\ -\infty, & \text{if } y < 0 \end{cases}
\]
for $x = 0$ and $y \neq 0$, so the partial derivative does not exist at such points.
If $x = 0$, $y = 0$ we have
\[
\frac{\partial f}{\partial x}(0, 0) = \lim_{x \to 0} \frac{f(x, 0) - f(0, 0)}{x - 0} = \lim_{x \to 0} \frac{0 - 0}{x} = \lim_{x \to 0} 0 = 0
\]
\[
\frac{\partial f}{\partial y}(x, y) = (\sqrt[5]{xy})'_y = \sqrt[5]{x}\,(\sqrt[5]{y})'_y = \sqrt[5]{x} \cdot \frac{1}{5 \sqrt[5]{y^4}} = \frac{1}{5} \sqrt[5]{\frac{x}{y^4}},
\]
which is valid for $y \neq 0$ and $x \neq 0$.
For $y = 0$ and $x \neq 0$:
\[
\frac{\partial f}{\partial y}(x, 0) = \lim_{y \to 0} \frac{f(x, y) - f(x, 0)}{y - 0} = \lim_{y \to 0} \frac{\sqrt[5]{xy}}{y} = \lim_{y \to 0} \sqrt[5]{\frac{x}{y^4}} = \begin{cases} \infty, & \text{if } x > 0 \\ -\infty, & \text{if } x < 0 \end{cases}
\]
For $y = 0$ and $x = 0$:
\[
\frac{\partial f}{\partial y}(0, 0) = \lim_{y \to 0} \frac{f(0, y) - f(0, 0)}{y - 0} = \lim_{y \to 0} \frac{0 - 0}{y} = 0
\]
In conclusion,
\[
\frac{\partial f}{\partial x}(x, y) = \begin{cases} 0, & (x, y) = (0, 0) \\[4pt] \dfrac{1}{5} \sqrt[5]{\dfrac{y}{x^4}}, & x \neq 0 \\[4pt] \text{does not exist}, & x = 0,\ y \neq 0 \end{cases}
\]
and
\[
\frac{\partial f}{\partial y}(x, y) = \begin{cases} 0, & (x, y) = (0, 0) \\[4pt] \dfrac{1}{5} \sqrt[5]{\dfrac{x}{y^4}}, & y \neq 0 \\[4pt] \text{does not exist}, & y = 0,\ x \neq 0. \end{cases}
\]
Example 4. Economic interpretation of the partial derivatives.
Let Q = f(K, L) be a production function, where Q represents the output, K the capital input and L the labor input.
If the firm is using $K_0$ units of capital and $L_0$ units of labor to produce $Q_0$ units of output, then the partial derivative
\[
\frac{\partial f}{\partial K}(K_0, L_0) = \lim_{K \to K_0} \frac{f(K, L_0) - f(K_0, L_0)}{K - K_0} = \lim_{\Delta K \to 0} \frac{f(K_0 + \Delta K, L_0) - f(K_0, L_0)}{\Delta K} = \lim_{\Delta K \to 0} \frac{\Delta Q}{\Delta K} \approx \frac{\Delta Q}{\Delta K}
\]
is the rate at which output changes with respect to capital, keeping L fixed at $L_0$.
If capital increases by ΔK, then the output will increase by
\[
\Delta Q \approx \frac{\partial f}{\partial K}(K_0, L_0) \cdot \Delta K.
\]
If $\Delta K = 1$, then $\Delta Q \approx \frac{\partial f}{\partial K}(K_0, L_0)$, so $\frac{\partial f}{\partial K}(K_0, L_0)$ represents approximately the change in output due to a one-unit increase in capital (keeping L fixed).
$\frac{\partial f}{\partial K}(K_0, L_0)$ is called the marginal product of capital.
In the same way we can deduce that $\frac{\partial f}{\partial L}(K_0, L_0)$ is the rate at which output changes with respect to labor when capital is held fixed at $K_0$.
$\frac{\partial f}{\partial L}(K_0, L_0)$ represents approximately the change in output due to a one-unit increase in labor (keeping the capital fixed at $K_0$).
$\frac{\partial f}{\partial L}(K_0, L_0)$ is called the marginal product of labor.
As a particular case of the previous analysis we will consider the Cobb-Douglas production function
\[
Q : [0, \infty) \times [0, \infty) \to \mathbb{R}, \quad Q(K, L) = 4K^{3/4}L^{1/4}.
\]
If $K = 10000$ and $L = 625$, the output is
\[
Q = 4 \cdot (10^4)^{3/4} \cdot (5^4)^{1/4} = 20000.
\]
The partial derivatives at (10000, 625) are
\[
\frac{\partial Q}{\partial K} = (4K^{3/4}L^{1/4})'_K = 4L^{1/4} \cdot \frac{3}{4} K^{3/4 - 1} = 3 \left( \frac{L}{K} \right)^{1/4} = 3 \sqrt[4]{\frac{L}{K}}
\]
\[
\frac{\partial Q}{\partial K}(10000, 625) = 3 \sqrt[4]{\frac{625}{10000}} = 3 \cdot \frac{5}{10} = \frac{3}{2}
\]
\[
\frac{\partial Q}{\partial L} = 4 \cdot \frac{1}{4} K^{3/4} L^{-3/4} = \left( \frac{K}{L} \right)^{3/4}
\]
\[
\frac{\partial Q}{\partial L}(10000, 625) = \left( \frac{10000}{625} \right)^{3/4} = \left( \frac{10}{5} \right)^3 = 2^3 = 8
\]
By the previous discussion, if $L_0 = 625$ and K is increased by ΔK, Q will increase by approximately $\frac{3}{2} \Delta K$.
If $\Delta K = 10$, then
\[
Q(10010, 625) \approx 20000 + \frac{3}{2} \cdot 10 = 20015.
\]
Evaluating,
\[
Q(10010, 625) = 4 \cdot (10010)^{3/4} \cdot 625^{1/4} = 20014.99,
\]
so the previous estimation is a very good one.
In the same way, by the previous discussion, if $K_0 = 10000$ and L is decreased by ΔL, Q will decrease by approximately $8 \Delta L$.
That means that a 10-unit decrease in labor should induce an $8 \cdot 10 = 80$-unit decrease in output:
\[
Q(10000, 615) \approx 20000 - 80 = 19920.
\]
Evaluating,
\[
Q(10000, 615) = 4 \cdot (10000)^{3/4} \cdot 615^{1/4} \approx 19919.5,
\]
so the previous estimation is a good one.
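The two marginal-product estimates above can be verified numerically. The sketch below (an illustration, not part of the text) evaluates the Cobb-Douglas function of Example 4 and compares the linear estimates with the exact outputs:

```python
# Numerical check of the marginal-product approximations for the
# Cobb-Douglas function Q(K, L) = 4 K^(3/4) L^(1/4).
def Q(K, L):
    return 4 * K**0.75 * L**0.25

base = Q(10000, 625)              # 20000.0
dQdK = 3 * (625 / 10000)**0.25    # marginal product of capital = 3/2
dQdL = (10000 / 625)**0.75        # marginal product of labor   = 8

est_K = base + dQdK * 10          # linear estimate for K -> 10010
est_L = base - dQdL * 10          # linear estimate for L -> 615

print(est_K, Q(10010, 625))       # 20015.0 vs about 20014.99
print(est_L, Q(10000, 615))       # 19920.0 vs about 19919.5
```

The linear estimates differ from the exact values by well under one unit of output, as the text claims.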
Example 5. In Example 1, subsection 4.1.1, we described the work of Cobb and Douglas in modelling the total production of an economic system. Here we use partial derivatives to show how their model can be obtained from certain assumptions about the economy.
If Q = Q(K, L), then the partial derivatives $\frac{\partial Q}{\partial K}$ and $\frac{\partial Q}{\partial L}$ are the rates at which production changes with respect to the amount of capital and labor.
The assumptions made by Cobb and Douglas are the following:
(a) If either capital or labor vanishes, then so will production.
(b) The marginal product of capital is proportional to the amount of production per unit of capital.
(c) The marginal product of labor is proportional to the amount of production per unit of labor.
The second assumption says that
\[
\frac{\partial Q}{\partial K} = \alpha \frac{Q}{K}
\]
for some constant α, which is equivalent to
\[
\frac{1}{Q} \frac{\partial Q}{\partial K} = \alpha \frac{1}{K}.
\]
If we keep L constant, integrating this equality with respect to K gives
\[
\ln Q = \alpha \ln K + \ln C_1,
\]
where $C_1$ is a function which depends on L. From the previous relation we get
\[
Q(K, L) = C_1(L) K^{\alpha} \tag{5}
\]
Similarly, from the third assumption we get
\[
Q(K, L) = C_2(K) L^{\beta} \tag{6}
\]
Combining (5) and (6) we have
\[
Q(K, L) = b K^{\alpha} L^{\beta},
\]
where b is a constant independent of both K and L.
If capital and labor are both increased by a factor m, then
\[
Q(mK, mL) = b(mK)^{\alpha}(mL)^{\beta} = b m^{\alpha + \beta} K^{\alpha} L^{\beta} = m^{\alpha + \beta} Q(K, L).
\]
If $\alpha + \beta = 1$, then $Q(mK, mL) = mQ(K, L)$, which means that the production is also increased by a factor m. That is why Cobb and Douglas assumed that $\alpha + \beta = 1$ and therefore
\[
Q(K, L) = b K^{\alpha} L^{1 - \alpha}.
\]
Remark 2. The existence of the partial derivatives of a function f at a given point does not imply the continuity of f at that point, as we can see in the next example.
Example 6. Let $f : \mathbb{R}^2 \to \mathbb{R}$,
\[
f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x, y) \neq (0, 0) \\[4pt] 0, & (x, y) = (0, 0) \end{cases}
\]
Show that f is not continuous at (0, 0) even though f admits partial derivatives at (0, 0).
Solution. Example 2 from subsection 4.1.2 shows that
\[
\lim_{(x,y) \to (0,0)} f(x, y)
\]
does not exist, and in consequence f is not continuous at the point (0, 0).
We prove next that the partial derivatives of f exist at (0, 0):
\[
\frac{\partial f}{\partial x}(0, 0) = \lim_{x \to 0} \frac{f(x, 0) - f(0, 0)}{x} = \lim_{x \to 0} \frac{0 - 0}{x} = \lim_{x \to 0} 0 = 0
\]
\[
\frac{\partial f}{\partial y}(0, 0) = \lim_{y \to 0} \frac{f(0, y) - f(0, 0)}{y} = \lim_{y \to 0} \frac{0}{y} = \lim_{y \to 0} 0 = 0.
\]
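The phenomenon in Example 6 can be seen numerically: along the line y = x the function is constantly 1/2, while along the x-axis it is 0, yet both difference quotients at the origin vanish. A small illustration (not part of the text):

```python
# f from Example 6: partial derivatives exist at (0, 0), yet f has no
# limit there -- its value along the line y = m*x depends on the slope m.
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x**2 + y**2)

print(f(1e-9, 1e-9))   # 0.5 along the line y = x, arbitrarily close to (0,0)
print(f(1e-9, 0.0))    # 0.0 along the x-axis

# Difference quotients at the origin are identically 0 in both directions.
h = 1e-9
print((f(h, 0.0) - f(0.0, 0.0)) / h)   # 0.0
print((f(0.0, h) - f(0.0, 0.0)) / h)   # 0.0
```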
Next, we present the definition of partial derivatives in the general case of a function of n variables.
Definition 3. Let f : D → R, D ⊆ $\mathbb{R}^n$, $a = (a_1, \ldots, a_n) \in D$ and let $i = \overline{1, n}$.
The partial derivative of the function f with respect to $x_i$ at a is the following limit (if the limit exists and has a finite value):
\[
\frac{\partial f}{\partial x_i}(a) = f'_{x_i}(a) = \lim_{x_i \to a_i} \frac{f(a_1, \ldots, a_{i-1}, x_i, a_{i+1}, \ldots, a_n) - f(a_1, \ldots, a_n)}{x_i - a_i} \tag{7}
\]
If the partial derivative of f with respect to $x_i$ exists at all points a in D, we can define the partial derivative function with respect to $x_i$ in the following way:
\[
\frac{\partial f}{\partial x_i} : D \to \mathbb{R}, \quad x \mapsto \frac{\partial f}{\partial x_i}(x) \tag{8}
\]
Definition 4. A function f : D → R is continuously differentiable (or $C^1$) on D if the partial derivative functions $\frac{\partial f}{\partial x_i}$ exist for all $i = \overline{1, n}$ and are continuous on D.
Example 7. Compute the partial derivative functions of each of the following functions:
a) $f : \mathbb{R}^3 \to \mathbb{R}$, $f(x, y, z) = x^2 + \sin yz$
b) $f : D \to \mathbb{R}$, $D = \{x \in \mathbb{R}^n \mid x_1 > 0, x_2 > 0, \ldots, x_n > 0\}$,
\[
f(x_1, \ldots, x_n) = x_1^2 + x_2^2 + \cdots + x_{n-1}^2 + \frac{x_1}{x_n}.
\]
Solution. a)
\[
\frac{\partial f}{\partial x} : \mathbb{R}^3 \to \mathbb{R}, \quad \frac{\partial f}{\partial x}(x, y, z) = (x^2 + \sin yz)'_x = 2x + 0 = 2x
\]
\[
\frac{\partial f}{\partial y} : \mathbb{R}^3 \to \mathbb{R}, \quad \frac{\partial f}{\partial y}(x, y, z) = (x^2 + \sin yz)'_y = 0 + z \cos yz = z \cos yz
\]
\[
\frac{\partial f}{\partial z} : \mathbb{R}^3 \to \mathbb{R}, \quad \frac{\partial f}{\partial z}(x, y, z) = (x^2 + \sin yz)'_z = y \cos yz
\]
b)
\[
\frac{\partial f}{\partial x_1} : D \to \mathbb{R}, \quad \frac{\partial f}{\partial x_1}(x) = \left( x_1^2 + \cdots + x_{n-1}^2 + \frac{x_1}{x_n} \right)'_{x_1} = 2x_1 + \frac{1}{x_n}
\]
\[
\frac{\partial f}{\partial x_2} : D \to \mathbb{R}, \quad \frac{\partial f}{\partial x_2}(x) = \left( x_1^2 + \cdots + x_{n-1}^2 + \frac{x_1}{x_n} \right)'_{x_2} = 2x_2
\]
\[
\vdots
\]
\[
\frac{\partial f}{\partial x_{n-1}} : D \to \mathbb{R}, \quad \frac{\partial f}{\partial x_{n-1}}(x) = \left( x_1^2 + \cdots + x_{n-1}^2 + \frac{x_1}{x_n} \right)'_{x_{n-1}} = 2x_{n-1}
\]
\[
\frac{\partial f}{\partial x_n} : D \to \mathbb{R}, \quad \frac{\partial f}{\partial x_n}(x) = \left( x_1^2 + \cdots + x_{n-1}^2 + \frac{x_1}{x_n} \right)'_{x_n} = -\frac{x_1}{x_n^2}
\]
We end this section by presenting a geometric interpretation of partial derivatives.
Suppose f is a function of two variables x and y. If c is some value in the range of f, then the equation f(x, y) = c describes a curve lying in the plane z = c, called the trace of the graph of f in the plane z = c.
If this trace is projected onto the xy-plane, the resulting curve in the xy-plane is called a level curve.
In fact, for any constant c, the points (x, y) for which f(x, y) = c form a curve in the xy-plane that is called a level curve of f.
The slope of the line tangent to the level curve f(x, y) = c at a particular point is given by the derivative y′(x). This derivative is the rate of change of y with respect to x on the level curve, and hence is approximately the amount by which the y-coordinate of a point on the level curve changes when the x-coordinate is increased by 1.
For example, if f represents output and x and y represent the levels of skilled and unskilled labor, respectively, the slope y′(x) of the tangent to the level curve f(x, y) = c approximates the amount by which the manufacturer should change the level of unskilled labor y to compensate for a 1-unit increase in the level of skilled labor x so that output remains unchanged.
[Figure: the level curve f(x, y) = c in the xy-plane, with the tangent line of slope y′(x) at the point with abscissa x; the actual change in y on the level curve between x and x + 1 is marked.]
One way to compute y′(x) is to solve the equation f(x, y) = c for y in terms of x and then differentiate the resulting expression with respect to x. Sometimes it is difficult or even impossible to solve the equation f(x, y) = c explicitly for y. In such cases we can differentiate the equality f(x, y) = c with respect to x, considering y to depend on x:
\[
\frac{\partial f}{\partial x} \cdot 1 + \frac{\partial f}{\partial y} \cdot y'(x) = 0,
\]
from which we get the formula
\[
y'(x) = -\frac{f'_x(x, y)}{f'_y(x, y)} \tag{9}
\]
Since
\[
y'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \approx \frac{\Delta y}{\Delta x},
\]
then
\[
\Delta y \approx y'(x) \Delta x = -\frac{f'_x}{f'_y} \Delta x.
\]
In conclusion, the change in y needed to compensate a small change Δx in x so that the value of f(x, y) remains unchanged is
\[
\Delta y \approx -\frac{f'_x}{f'_y} \Delta x.
\]
Example 8. Indifference curves.
Let U(x, y) be a utility function which measures the total satisfaction (or utility) the consumer obtains from having x units of the first commodity and y units of the second. An indifference curve is a level curve U(x, y) = C of the utility function.
An indifference curve gives all the combinations of x and y that lead to the same level of consumer satisfaction.
We next present a typical example involving the slope of an indifference curve.
Suppose $U(x, y) = x^{3/2} y$. The consumer currently owns x = 16 units of the first commodity and y = 20 units of the second.
Use calculus to estimate how many units of the second commodity could substitute for 1 unit of the first commodity without changing the total utility.
Solution. The level of utility is
\[
U(16, 20) = 1280.
\]
The corresponding indifference curve is sketched in the figure below. Since $1280 = x^{3/2} y$, we have $y = 1280 x^{-3/2}$.
[Figure: the indifference curve U(x, y) = 1280.]
We try to estimate the change Δy required to compensate a change of Δx = −1 so that the utility remains at its current level, 1280. The approximation formula
\[
\Delta y \approx -\frac{U'_x}{U'_y} \Delta x = \frac{U'_x}{U'_y} = \frac{\frac{3}{2} x^{1/2} y}{x^{3/2}} = \frac{3}{2} \cdot \frac{y}{x}
\]
with x = 16 and y = 20 gives
\[
\Delta y = \frac{3}{2} \cdot \frac{20}{16} = \frac{15}{8} = 1.875 \text{ units.}
\]
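A quick numerical check (an illustration, not part of the text) confirms that trading 1 unit of the first commodity for 15/8 units of the second keeps the utility approximately at 1280; the small residual is the error of the linear approximation:

```python
# Check for Example 8: U(x, y) = x^(3/2) * y at the bundle (16, 20),
# after substituting 15/8 units of the second commodity for 1 unit
# of the first.
def U(x, y):
    return x**1.5 * y

base = U(16, 20)                 # 1280.0
adjusted = U(15, 20 + 15 / 8)    # utility after the substitution

print(base)                      # 1280.0
print(round(adjusted, 1))        # close to 1280 (within about 1%)
```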
4.3 Higher order partial derivatives
The partial derivative $\frac{\partial f}{\partial x_i}$ of a function f, i ∈ {1, ..., n}, is itself a function of n variables, so we can continue by analyzing the existence of partial derivatives of these partial derivatives, obtaining the second-order partial derivatives.
Let f : D → R, D ⊆ $\mathbb{R}^n$, a ∈ D and $i = \overline{1, n}$. We assume that the partial derivative function $\frac{\partial f}{\partial x_i}$ exists.
Definition 1. If the function $\frac{\partial f}{\partial x_i}$ is partially differentiable at the point a with respect to $x_j$ ($j = \overline{1, n}$), we say that f admits a second-order partial derivative at the point a with respect to $x_i$ and $x_j$.
We have the following notations:
\[
\frac{\partial}{\partial x_j} \left( \frac{\partial f}{\partial x_i} \right)(a) = \frac{\partial^2 f}{\partial x_i \partial x_j}(a)
\]
or
\[
(f'_{x_i})'_{x_j}(a) = f''_{x_i x_j}(a).
\]
If i = j, then the second-order partial derivative is written in the following way:
\[
\frac{\partial}{\partial x_i} \left( \frac{\partial f}{\partial x_i} \right)(a) = \frac{\partial^2 f}{\partial x_i^2}(a) = f''_{x_i^2}(a)
\]
If i ≠ j, then
\[
\frac{\partial^2 f}{\partial x_i \partial x_j}(a) = f''_{x_i x_j}(a)
\]
is called a mixed partial derivative or cross partial derivative.
Continuing in the same way we can obtain higher order partial derivatives.
Example 1. Let $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = e^x y$. Compute $f'_x$, $f''_{x^2}$, $f'''_{x^2 y}$.
Solution.
\[
f'_x = (e^x y)'_x = y (e^x)'_x = y e^x
\]
\[
f''_{x^2} = (f'_x)'_x = (y e^x)'_x = y e^x
\]
\[
f'''_{x^2 y} = (f''_{x^2})'_y = ((f'_x)'_x)'_y = e^x
\]
Remark 1. A function of n variables can admit
• n first-order partial derivatives: $\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}$
• $n^2$ second-order partial derivatives (since for each partial derivative we have n second-order partial derivatives)
• $n^3$ third-order partial derivatives
...
• $n^k$ partial derivatives of order k.
It is natural to arrange the $n^2$ second-order partial derivatives of f at a given point a into an $n \times n$ matrix whose (i, j)-th entry is $\frac{\partial^2 f}{\partial x_i \partial x_j}(a)$. This matrix is called the Hessian or the Hessian matrix of f at a.
\[
H(a) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2}(a) & \dfrac{\partial^2 f}{\partial x_1 \partial x_2}(a) & \ldots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(a) \\[8pt]
\dfrac{\partial^2 f}{\partial x_2 \partial x_1}(a) & \dfrac{\partial^2 f}{\partial x_2^2}(a) & \ldots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n}(a) \\[8pt]
\vdots & \vdots & \ddots & \vdots \\[4pt]
\dfrac{\partial^2 f}{\partial x_n \partial x_1}(a) & \dfrac{\partial^2 f}{\partial x_n \partial x_2}(a) & \ldots & \dfrac{\partial^2 f}{\partial x_n^2}(a)
\end{pmatrix} \tag{1}
\]
If we denote
\[
a_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}(a),
\]
then a shorter form of H(a) is the following:
\[
H(a) = \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \ldots & a_{nn}
\end{pmatrix}
\]
Definition 2. a) If all the second partial derivatives of f exist and are themselves continuous functions, we say that f is twice continuously differentiable, or $C^2$.
b) A function f is $C^3$ (or 3 times continuously differentiable) if all of its $n^3$ third-order partial derivatives exist and are continuous.
In the same way we can define $C^k$ functions (for which all $n^k$ partial derivatives of order k exist and are continuous).
Example 2. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined by
\[
f(x, y) = x^3 y^2 - 4y^2 x.
\]
Compute all the third partial derivative functions of f.
Solution. It is easier to arrange the calculation in the following tree diagram.
$f(x, y) = x^3 y^2 - 4y^2 x$
• $f'_x = 3x^2 y^2 - 4y^2$
  • $f''_{x^2} = 6xy^2$
    • $f'''_{x^3} = 6y^2$
    • $f'''_{x^2 y} = 12xy$
  • $f''_{xy} = 6x^2 y - 8y$
    • $f'''_{xyx} = 12xy$
    • $f'''_{xy^2} = 6x^2 - 8$
• $f'_y = 2x^3 y - 8xy$
  • $f''_{yx} = 6x^2 y - 8y$
    • $f'''_{yx^2} = 12xy$
    • $f'''_{yxy} = 6x^2 - 8$
  • $f''_{y^2} = 2x^3 - 8x$
    • $f'''_{y^2 x} = 6x^2 - 8$
    • $f'''_{y^3} = 0$
From the previous example we observe that some of the mixed partial derivatives are equal, and the order of differentiation seems to be of no importance (at least in this case). This is not an accident: the equality holds for almost all functions which arise in practical applications. More precisely, we have the following result, which gives us information about the equality of the mixed second-order partial derivatives.
Theorem 1. (Schwarz's theorem, general case) Let f : D → R, D ⊆ $\mathbb{R}^n$, a ∈ D, $i, j = \overline{1, n}$, with i ≠ j.
If both $f''_{x_i x_j}$ and $f''_{x_j x_i}$ exist at all points in a ball centered at a and they are continuous at a, then
\[
f''_{x_i x_j}(a) = f''_{x_j x_i}(a).
\]
For the sake of simplicity of notation we will present the proof just for the case of two variables; the reasoning for the general case remains the same. In the case of two variables we have the following statement of Schwarz's theorem.
Theorem 2. Let f : D → R, D ⊆ R², (a, b) ∈ D. If both $f''_{xy}$ and $f''_{yx}$ exist at all points in a disc centered at (a, b) and they are continuous at (a, b), then
\[
f''_{xy}(a, b) = f''_{yx}(a, b).
\]
Proof. Let r be the radius of the disc centered at (a, b) and let u, v be real numbers such that $|u|, |v| < \frac{r}{\sqrt{2}}$.
[Figure: the rectangle with corners (a, b), (a + u, b), (a + u, b + v) and (a, b + v).]
In this case the rectangle whose corners are the points (a, b), (a + u, b), (a + u, b + v) and (a, b + v) lies inside the disc centered at (a, b) with radius r.
Applying the mean value theorem (Theorem 3, subsection 3.1.4) to the function g : [a, a + u] → R,
\[
g(x) = f(x, b + v) - f(x, b),
\]
we find that there exists $a_0$ (which depends on u and v) between a and a + u such that
\[
g(a + u) - g(a) = g'(a_0) u,
\]
which translates to
\[
f(a + u, b + v) - f(a + u, b) - f(a, b + v) + f(a, b) = \left[ \frac{\partial f}{\partial x}(a_0, b + v) - \frac{\partial f}{\partial x}(a_0, b) \right] u
\]
Applying Lagrange's theorem to the function
\[
h : [b, b + v] \to \mathbb{R}, \quad h(y) = \frac{\partial f}{\partial x}(a_0, y),
\]
there is some $b_0$ (which depends on u and v) between b and b + v such that
\[
h(b + v) - h(b) = h'(b_0) v.
\]
Hence,
\[
f(a + u, b + v) - f(a + u, b) - f(a, b + v) + f(a, b) = \frac{\partial^2 f}{\partial x \partial y}(a_0, b_0) uv
\]
Interchanging x and y in the argument above, we can find a point $(a_1, b_1)$ such that
\[
f(a + u, b + v) - f(a + u, b) - f(a, b + v) + f(a, b) = \frac{\partial^2 f}{\partial y \partial x}(a_1, b_1) uv
\]
Thus
\[
\frac{\partial^2 f}{\partial x \partial y}(a_0, b_0) = \frac{\partial^2 f}{\partial y \partial x}(a_1, b_1)
\]
Letting u, v → 0 we obtain that $(a_0, b_0) \to (a, b)$ and $(a_1, b_1) \to (a, b)$, so the claim follows from the continuity of the partial derivatives at (a, b).
Fortunately, just about every function we will meet in applications is $C^2$, and therefore its mixed partial derivatives are equal.
It is important to note that Schwarz's theorem implies that the Hessian matrix (1) is symmetric.
Remark 2. If f : D → R, D ⊆ $\mathbb{R}^n$, is a $C^2$ function on D and a ∈ D, then the Hessian matrix H(a) is symmetric.
Remark 3. The previous result (Schwarz's theorem) is true for mixed higher-order derivatives too (with the assumption of continuity of the partial derivatives). So, under these conditions, we can reverse the order of any two successive differentiations.
For example, if we take an $x_1, x_2, x_4$ derivative of order 3, then the order of differentiation does not matter for a $C^3$ function and we have
\[
f'''_{x_1 x_2 x_4} = f'''_{x_1 x_4 x_2} = f'''_{x_2 x_1 x_4} = f'''_{x_2 x_4 x_1} = f'''_{x_4 x_1 x_2} = f'''_{x_4 x_2 x_1}
\]
Example 3. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by
\[
f(x, y) = \begin{cases} 0, & \text{if } (x, y) = (0, 0) \\[4pt] \dfrac{x^3 y - x y^3}{x^2 + y^2}, & \text{otherwise.} \end{cases}
\]
Compute the mixed partial derivatives at (0, 0). Does this result contradict the conclusion of Schwarz's theorem?
Solution. For each (x, y) ≠ (0, 0), by applying the quotient rule of differentiation we easily get that
\[
\frac{\partial f}{\partial x}(x, y) = \frac{x^4 y - y^5 + 4x^2 y^3}{(x^2 + y^2)^2}
\]
and
\[
\frac{\partial f}{\partial y}(x, y) = \frac{x^5 - 4x^3 y^2 - x y^4}{(x^2 + y^2)^2}
\]
We will compute the mixed partial derivatives at (0, 0) by using the limit definition:
\[
\frac{\partial^2 f}{\partial y \partial x}(0, 0) = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right)(0, 0) = \lim_{y \to 0} \frac{\dfrac{\partial f}{\partial x}(0, y) - \dfrac{\partial f}{\partial x}(0, 0)}{y - 0} \tag{2}
\]
By the previous computation we have that
\[
\frac{\partial f}{\partial x}(0, y) = \frac{-y^5}{y^4} = -y.
\]
On the other hand, by using once more the limit definition, we obtain
\[
\frac{\partial f}{\partial x}(0, 0) = \lim_{x \to 0} \frac{f(x, 0) - f(0, 0)}{x - 0} = \lim_{x \to 0} \frac{0}{x} = 0.
\]
Substituting these two partial derivatives in (2) we get
\[
\frac{\partial^2 f}{\partial y \partial x}(0, 0) = \lim_{y \to 0} \frac{-y - 0}{y} = -1,
\]
so the final result is
\[
\frac{\partial^2 f}{\partial y \partial x}(0, 0) = -1.
\]
In the same way we obtain
\[
\frac{\partial^2 f}{\partial x \partial y}(0, 0) = \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial y} \right)(0, 0) = \lim_{x \to 0} \frac{\dfrac{\partial f}{\partial y}(x, 0) - \dfrac{\partial f}{\partial y}(0, 0)}{x} = \lim_{x \to 0} \frac{x - 0}{x} = 1,
\]
so
\[
\frac{\partial^2 f}{\partial x \partial y}(0, 0) = 1.
\]
In conclusion we have
\[
\frac{\partial^2 f}{\partial x \partial y}(0, 0) \neq \frac{\partial^2 f}{\partial y \partial x}(0, 0)
\]
This result does not contradict Schwarz's theorem since, as we will see below, the mixed partial derivative function is not continuous at (0, 0), so the hypotheses of Schwarz's theorem do not hold.
For each (x, y) ≠ (0, 0) we have
\[
\frac{\partial^2 f}{\partial y \partial x}(x, y) = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right)(x, y) = \left( \frac{x^4 y - y^5 + 4x^2 y^3}{(x^2 + y^2)^2} \right)'_y = \frac{x^6 - y^6 - 9x^2 y^4 + 9x^4 y^2}{(x^2 + y^2)^3}
\]
It remains to evaluate the limit of the previous function at (0, 0). We compute first the limit along the line y = mx, m ∈ R:
\[
\lim_{\substack{x \to 0 \\ y = mx}} \frac{\partial^2 f}{\partial y \partial x}(x, y) = \lim_{x \to 0} \frac{x^6 (1 - m^6 - 9m^4 + 9m^2)}{x^6 (1 + m^2)^3} = \frac{1 - m^6 - 9m^4 + 9m^2}{(1 + m^2)^3}
\]
Since the limit depends on m, we conclude that $\lim_{(x,y) \to (0,0)} \frac{\partial^2 f}{\partial y \partial x}(x, y)$ does not exist.
In consequence, the function $\frac{\partial^2 f}{\partial y \partial x}$ is not continuous at (0, 0).
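The unequal mixed partials of Example 3 can also be seen numerically. The sketch below (an illustration, not part of the text) nests two central difference quotients; a small inner step and a larger outer step are used so the inner quotient is accurate at the outer sample points:

```python
# Numerical version of Example 3: the two mixed second-order partials of
# f(x, y) = (x^3*y - x*y^3)/(x^2 + y^2), f(0, 0) = 0, differ at the origin.
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x**3 * y - x * y**3) / (x**2 + y**2)

d = 1e-7   # inner step (first derivative)
h = 1e-3   # outer step (second derivative); h >> d for accuracy

def fx(x, y):  # difference quotient for df/dx
    return (f(x + d, y) - f(x - d, y)) / (2 * d)

def fy(x, y):  # difference quotient for df/dy
    return (f(x, y + d) - f(x, y - d)) / (2 * d)

d2f_yx = (fx(0.0, h) - fx(0.0, -h)) / (2 * h)   # d/dy of df/dx at (0, 0)
d2f_xy = (fy(h, 0.0) - fy(-h, 0.0)) / (2 * h)   # d/dx of df/dy at (0, 0)

print(round(d2f_yx), round(d2f_xy))   # -1 1
```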
4.4 Differentiability
4.4.1 Differentiability. The total differential
Because of the geometric representation, we will start the discussion in this section with the case of functions of two variables.
Suppose we are interested in the behaviour of a function f in the neighborhood of a given point (a, b).
We know from the calculus of one variable (see subsection 3.1.3) that:
• if y = b and u is a sufficiently small real number, then
\[
f(a + u, b) - f(a, b) \approx \frac{\partial f}{\partial x}(a, b) \cdot u;
\]
• if x = a and v is a sufficiently small real number, then
\[
f(a, b + v) - f(a, b) \approx \frac{\partial f}{\partial y}(a, b) \cdot v.
\]
It is natural to ask what happens if a and b change simultaneously into a + u and b + v. The expected effect is the sum of the effects of the one-variable changes:
\[
f(a + u, b + v) - f(a, b) \approx \frac{\partial f}{\partial x}(a, b) u + \frac{\partial f}{\partial y}(a, b) v \tag{1}
\]
For a geometric interpretation it is more convenient to write the previous formula in the form
\[
f(a + u, b + v) \approx f(a, b) + \frac{\partial f}{\partial x}(a, b) u + \frac{\partial f}{\partial y}(a, b) v \tag{2}
\]
The right-hand side of (2) is exactly the equation of the tangent plane to the graph of f at the point ((a, b), f(a, b)), and therefore (2) is an analytical expression of the fact that the tangent plane is a good approximation to the graph (as we can see in the figure below).
We now justify the discussion above rigorously.
Definition 1.
a) Differentiability at a given point (two-variables case)
Let f : D → R, D ⊆ R², (a, b) ∈ D.
We say that f is differentiable at (a, b) if there exist
\[
\alpha_1, \alpha_2 \in \mathbb{R}, \quad \omega : D \to \mathbb{R}, \quad \lim_{(x,y) \to (a,b)} \omega(x, y) = \omega(a, b) = 0
\]
such that
\[
f(x, y) - f(a, b) = \alpha_1 (x - a) + \alpha_2 (y - b) + \omega(x, y) \sqrt{(x - a)^2 + (y - b)^2}, \quad \text{for all } (x, y) \in D \tag{3}
\]
b) Differentiability on a set
We say that f is differentiable on D if f is differentiable at each point in D.
Remark 1. Let f : D → R, D ⊆ R², (a, b) ∈ D.
a) If f is differentiable at (a, b), then f is continuous at (a, b).
b) If f is differentiable at (a, b), then f is partially differentiable at (a, b) with respect to x and y, and
\[
\frac{\partial f}{\partial x}(a, b) = \alpha_1 \quad \text{and} \quad \frac{\partial f}{\partial y}(a, b) = \alpha_2.
\]
c) The converse statements of part a) and part b) are not true.
Proof. a) If we let (x, y) → (a, b) in (3) we get
\[
\lim_{\substack{x \to a \\ y \to b}} [f(x, y) - f(a, b)] = \lim_{\substack{x \to a \\ y \to b}} \left[ \alpha_1 (x - a) + \alpha_2 (y - b) + \omega(x, y) \sqrt{(x - a)^2 + (y - b)^2} \right] = 0.
\]
In consequence,
\[
\lim_{(x,y) \to (a,b)} f(x, y) = f(a, b),
\]
so f is continuous at (a, b).
b) We have to evaluate the limits
\[
\lim_{x \to a} \frac{f(x, b) - f(a, b)}{x - a} \quad \text{and} \quad \lim_{y \to b} \frac{f(a, y) - f(a, b)}{y - b}.
\]
If we take y = b in (3), divide the obtained relation by x − a and then take the limit on both sides of the equality, we obtain
\[
\frac{\partial f}{\partial x}(a, b) = \lim_{x \to a} \frac{f(x, b) - f(a, b)}{x - a} = \lim_{x \to a} \left[ \alpha_1 + \omega(x, b) \frac{|x - a|}{x - a} \right] = \alpha_1 + \lim_{x \to a} \omega(x, b) \frac{|x - a|}{x - a} = \alpha_1 + 0 = \alpha_1.
\]
The last limit is 0, being the limit of a product between a function with limit 0 and a bounded function.
The fact that $\frac{\partial f}{\partial y}(a, b) = \alpha_2$ can be proved in a similar manner.
c) We will present here a function which is continuous at (0, 0) and admits partial derivatives at (0, 0), yet is not differentiable at (0, 0).
Let
\[
f : \mathbb{R}^2 \to \mathbb{R}, \quad f(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}}, & (x, y) \neq (0, 0) \\[4pt] 0, & (x, y) = (0, 0) \end{cases}
\]
We study first the continuity of f at (0, 0) by taking the following limit:
\[
\lim_{\substack{x \to 0 \\ y \to 0}} \frac{xy}{\sqrt{x^2 + y^2}} = \lim_{\substack{x \to 0 \\ y \to 0}} \frac{y}{\sqrt{x^2 + y^2}} \cdot x = 0 = f(0, 0)
\]
The previous limit is 0 since
\[
-1 \le \frac{y}{\sqrt{x^2 + y^2}} \le 1 \quad \text{and} \quad \lim_{\substack{x \to 0 \\ y \to 0}} x = 0.
\]
By using the limit definition we study the existence of the partial derivatives at (0, 0):
\[
\frac{\partial f}{\partial x}(0, 0) = \lim_{x \to 0} \frac{f(x, 0) - f(0, 0)}{x} = \lim_{x \to 0} \frac{0}{x} = \lim_{x \to 0} 0 = 0
\]
\[
\frac{\partial f}{\partial y}(0, 0) = \lim_{y \to 0} \frac{f(0, y) - f(0, 0)}{y} = \lim_{y \to 0} \frac{0}{y} = \lim_{y \to 0} 0 = 0
\]
Assume, by contradiction, that f is differentiable at (0, 0). Then there exists
\[
\omega : \mathbb{R}^2 \to \mathbb{R}, \quad \lim_{\substack{x \to 0 \\ y \to 0}} \omega(x, y) = \omega(0, 0) = 0,
\]
such that
\[
f(x, y) - f(0, 0) = \frac{\partial f}{\partial x}(0, 0)(x - 0) + \frac{\partial f}{\partial y}(0, 0)(y - 0) + \omega(x, y) \sqrt{x^2 + y^2}, \quad (x, y) \in \mathbb{R}^2.
\]
Since
\[
\frac{\partial f}{\partial x}(0, 0) = \frac{\partial f}{\partial y}(0, 0) = 0,
\]
the previous equality reduces to
\[
f(x, y) = \omega(x, y) \sqrt{x^2 + y^2},
\]
from which we obtain
\[
\omega(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x, y) \neq (0, 0) \\[4pt] 0, & (x, y) = (0, 0) \end{cases}
\]
But $\lim_{(x,y) \to (0,0)} \omega(x, y)$ does not exist (see Example 2, subsection 4.1.2), which contradicts the assumption, so f is not differentiable, as desired.
Remark 2. Let f : D → R, (a, b) ∈ D. If f is differentiable at (a, b), then there exists a function ω : D → R with
\[
\lim_{\substack{x \to a \\ y \to b}} \omega(x, y) = \omega(a, b) = 0,
\]
such that
\[
f(x, y) = f(a, b) + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b) + \omega(x, y) \sqrt{(x - a)^2 + (y - b)^2} \tag{4}
\]
for all (x, y) ∈ D.
If we let x − a = u and y − b = v, with u and v sufficiently small, then the quantity $\omega(x, y) \sqrt{(x - a)^2 + (y - b)^2}$ is small enough to justify the geometric approximation
\[
f(x, y) \approx f(a, b) + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b),
\]
which was discussed at the beginning of this section.
Since the definition of differentiability is quite difficult to check, we will present, without proof (which is beyond the scope of this text), a sufficient condition for differentiability.
Theorem 1. If f has continuous partial derivatives on D (f is $C^1$), then f is differentiable on D.
Definition 2. (the differential or the total differential, two-variables case)
Let f : D → R, (a, b) ∈ D. If f is differentiable at (a, b), the differential (the total differential) of f at (a, b) is defined by
\[
df_{(a,b)} : \mathbb{R}^2 \to \mathbb{R}, \quad df_{(a,b)}(h_1, h_2) = \frac{\partial f}{\partial x}(a, b) h_1 + \frac{\partial f}{\partial y}(a, b) h_2 \tag{5}
\]
For $h = (h_1, h_2) \in \mathbb{R}^2$, the number $df_{(a,b)}(h) = df_{(a,b)}(h_1, h_2)$ is called the differential of f at (a, b) of increment $h = (h_1, h_2) \in \mathbb{R}^2$.
By using this new notion, the relation (4) becomes
\[
f(x, y) = f(a, b) + df_{(a,b)}(x - a, y - b) + \omega(x, y) \sqrt{(x - a)^2 + (y - b)^2} \tag{6}
\]
with the corresponding approximation
\[
f(x, y) \approx f(a, b) + df_{(a,b)}(x - a, y - b) \tag{7}
\]
Example 1. Compute the total differential of the function
\[
f : \mathbb{R}^2 \to \mathbb{R}, \quad f(x, y) = x^2 + y^2 + xy
\]
at the point (2, 1).
Solution. The function f is a $C^1$ function on $\mathbb{R}^2$, so, according to Theorem 1, f is differentiable on $\mathbb{R}^2$; in consequence the total differential of f exists at any point in $\mathbb{R}^2$:
\[
df_{(2,1)} : \mathbb{R}^2 \to \mathbb{R}, \quad df_{(2,1)}(h_1, h_2) = \frac{\partial f}{\partial x}(2, 1) h_1 + \frac{\partial f}{\partial y}(2, 1) h_2 = (2 \cdot 2 + 1) h_1 + (2 \cdot 1 + 2) h_2 = 5h_1 + 4h_2
\]
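The approximation (7) can be tested numerically with this differential. The sketch below (an illustration, not part of the text) compares the exact value of f near (2, 1) with the tangent-plane estimate $f(2, 1) + df_{(2,1)}(h_1, h_2)$:

```python
# Check of approximation (7) for Example 1: near (2, 1),
# f(x, y) = x^2 + y^2 + xy is well approximated by
# f(2, 1) + df_(2,1)(x - 2, y - 1), where df_(2,1)(h1, h2) = 5*h1 + 4*h2.
def f(x, y):
    return x**2 + y**2 + x*y

def df(h1, h2):
    return 5 * h1 + 4 * h2

exact = f(2.01, 1.02)
approx = f(2, 1) + df(0.01, 0.02)

print(exact, approx)   # the two values agree to about 3 decimal places
```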
We will now present the definitions in the general case.
Definition 3. (differentiability, general case)
Let f : D → R, D ⊆ $\mathbb{R}^n$, D a domain. We say that f is differentiable at a if there exist $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{R}$ and $\omega : D \to \mathbb{R}$ with
\[
\lim_{x \to a} \omega(x) = \omega(a) = 0,
\]
such that
\[
f(x) - f(a) = \alpha_1 (x_1 - a_1) + \cdots + \alpha_n (x_n - a_n) + \omega(x) \|x - a\|, \quad x \in D. \tag{8}
\]
As in the two-dimensional case, if f is differentiable at a then f is continuous at a and f admits partial derivatives at a with
\[
\frac{\partial f}{\partial x_i}(a) = \alpha_i, \quad i = \overline{1, n}.
\]
If f is $C^1$ on D, then f is differentiable at each point in D.
Definition 4. (the differential or the total differential, general case)
Let f : D → R, D ⊆ $\mathbb{R}^n$ and a ∈ D. If f is differentiable at a, then the differential (the total differential) of f at a is defined by
\[
df_{(a)} : \mathbb{R}^n \to \mathbb{R}, \quad df_{(a)}(h) = \frac{\partial f}{\partial x_1}(a) h_1 + \frac{\partial f}{\partial x_2}(a) h_2 + \cdots + \frac{\partial f}{\partial x_n}(a) h_n \tag{9}
\]
Sometimes, instead of $h_i$, $i = \overline{1, n}$, we use the notation $dx_i$, which is considered to be a small increment along the $x_i$ axis.
If $dx = (dx_1, dx_2, \ldots, dx_n)$, then
\[
df_{(a)}(dx) = \frac{\partial f}{\partial x_1}(a)\, dx_1 + \frac{\partial f}{\partial x_2}(a)\, dx_2 + \cdots + \frac{\partial f}{\partial x_n}(a)\, dx_n.
\]
By using (9), the relation (8) can be written in the following form:
\[
f(x) = f(a) + df_{(a)}(x - a) + \omega(x) \|x - a\|, \tag{10}
\]
for all x ∈ D.
Since $\omega(x) \|x - a\| \to 0$ as $x \to a$, for $\|x - a\|$ small enough we have the following approximation of (10):
\[
f(x) \approx f(a) + df_{(a)}(x - a) \tag{11}
\]
The chain rule
Recall that the chain rule for functions of a single variable gives the rule for differentiating a composite function. If f and g are differentiable functions such that f ∘ g makes sense, then
\[
(f \circ g)'(t) = f'(g(t)) \cdot g'(t).
\]
For functions of more than one variable, the chain rule has several versions, each of them giving a rule for differentiating a composite function.
The first version (Theorem 2) treats the case where z = f(x, y) and each of the variables x and y is a function of a variable t.
This means that f is indirectly a function of t,
\[
z(t) = f(x(t), y(t)),
\]
and the chain rule gives a formula for differentiating z as a function of t, assuming f, x and y are differentiable.
Theorem 2. Suppose that z = f(x, y) is a differentiable function of x and y, where x = x(t) and y = y(t) are both differentiable functions of t. Then z is a differentiable function of t and
\[
z'(t) = \frac{\partial f}{\partial x}(x(t), y(t))\, x'(t) + \frac{\partial f}{\partial y}(x(t), y(t))\, y'(t) \tag{12}
\]
For short, we can write the previous formula in the following way:
\[
z'(t) = \frac{\partial f}{\partial x} x'(t) + \frac{\partial f}{\partial y} y'(t) \tag{13}
\]
Proof. From Definition 1 and Remark 2 we have
\[
\Delta z = \frac{\partial f}{\partial x} \Delta x + \frac{\partial f}{\partial y} \Delta y + \omega \sqrt{(\Delta x)^2 + (\Delta y)^2},
\]
where ω → 0 as (Δx, Δy) → (0, 0).
Dividing both sides of this equation by Δt, we have
\[
\frac{\Delta z}{\Delta t} = \frac{\partial f}{\partial x} \frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y} \frac{\Delta y}{\Delta t} + \omega \sqrt{\left( \frac{\Delta x}{\Delta t} \right)^2 + \left( \frac{\Delta y}{\Delta t} \right)^2}
\]
If we now let Δt → 0, then
\[
\Delta x = x(t + \Delta t) - x(t) \to 0
\]
because x is differentiable and therefore continuous. In the same way Δy → 0. This implies that ω → 0, so
\[
z'(t) = \lim_{\Delta t \to 0} \frac{\Delta z}{\Delta t} = \frac{\partial f}{\partial x} \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y} \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} + \lim_{\Delta t \to 0} \omega \sqrt{\left( \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} \right)^2 + \left( \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} \right)^2}
\]
\[
= \frac{\partial f}{\partial x} x'(t) + \frac{\partial f}{\partial y} y'(t) + 0 \cdot \sqrt{(x'(t))^2 + (y'(t))^2} = \frac{\partial f}{\partial x} x'(t) + \frac{\partial f}{\partial y} y'(t)
\]
Since we often write $\frac{\partial z}{\partial x}$ in place of $\frac{\partial f}{\partial x}$, we can rewrite the chain rule in the form
\[
z'(t) = \frac{\partial z}{\partial x} x'(t) + \frac{\partial z}{\partial y} y'(t). \tag{14}
\]
If z depends on more than two variables, then
\[
z'(t) = \frac{\partial z}{\partial x_1} x_1'(t) + \frac{\partial z}{\partial x_2} x_2'(t) + \cdots + \frac{\partial z}{\partial x_n} x_n'(t) = \sum_{i=1}^{n} \frac{\partial z}{\partial x_i} x_i'(t) \tag{15}
\]
As you may have observed, in order not to make the notation too complicated, we worked in a more formal way in formulating and proving the previous theorem.
Example 2. Use the chain rule to find w′(t) if $w = \ln \sqrt{x^2 + y^2 + z^2}$, $x = \sin t$, $y = \cos t$, $z = \operatorname{tg} t$, at $t = \frac{\pi}{4}$.
Solution.
\[
w'(t) = \frac{\partial w}{\partial x} x'(t) + \frac{\partial w}{\partial y} y'(t) + \frac{\partial w}{\partial z} z'(t)
\]
\[
= \left( \ln \sqrt{x^2 + y^2 + z^2} \right)'_x \sin' t + \left( \ln \sqrt{x^2 + y^2 + z^2} \right)'_y \cos' t + \left( \ln \sqrt{x^2 + y^2 + z^2} \right)'_z \operatorname{tg}' t
\]
\[
= \frac{\left( \sqrt{x^2 + y^2 + z^2} \right)'_x}{\sqrt{x^2 + y^2 + z^2}} \cos t - \frac{\left( \sqrt{x^2 + y^2 + z^2} \right)'_y}{\sqrt{x^2 + y^2 + z^2}} \sin t + \frac{\left( \sqrt{x^2 + y^2 + z^2} \right)'_z}{\sqrt{x^2 + y^2 + z^2}} \cdot \frac{1}{\cos^2 t}
\]
\[
w'(t) = \frac{x}{x^2 + y^2 + z^2} \cos t - \frac{y}{x^2 + y^2 + z^2} \sin t + \frac{z}{x^2 + y^2 + z^2} \cdot \frac{1}{\cos^2 t}
\]
It is not necessary to substitute the expressions for x, y and z in terms of t. We observe that when $t = \frac{\pi}{4}$ we have
\[
x = \sin \frac{\pi}{4} = \frac{\sqrt{2}}{2}, \quad y = \cos \frac{\pi}{4} = \frac{\sqrt{2}}{2}, \quad z = \operatorname{tg} \frac{\pi}{4} = 1.
\]
Therefore, since $x^2 + y^2 + z^2 = \frac{1}{2} + \frac{1}{2} + 1 = 2$,
\[
w'\left( \frac{\pi}{4} \right) = \frac{\sqrt{2}/2}{2} \cdot \frac{\sqrt{2}}{2} - \frac{\sqrt{2}/2}{2} \cdot \frac{\sqrt{2}}{2} + \frac{1}{2} \cdot \frac{1}{\left( \sqrt{2}/2 \right)^2} = \frac{1}{4} - \frac{1}{4} + 1 = 1.
\]
Hence
\[
w'\left( \frac{\pi}{4} \right) = 1.
\]
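The value $w'(\pi/4) = 1$ can be confirmed by differentiating the composite function numerically (an illustration, not part of the text):

```python
# Numerical check for Example 2: w(t) = ln sqrt(sin^2 t + cos^2 t + tan^2 t)
# should satisfy w'(pi/4) = 1.
import math

def w(t):
    x, y, z = math.sin(t), math.cos(t), math.tan(t)
    return math.log(math.sqrt(x**2 + y**2 + z**2))

t0 = math.pi / 4
h = 1e-6
deriv = (w(t0 + h) - w(t0 - h)) / (2 * h)   # central difference quotient
print(round(deriv, 6))                       # 1.0
```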
We now consider the situation where z = f(x, y) but each of x and y is a function of two variables s and t: x = x(s, t), y = y(s, t). Then z is indirectly a function of s and t, and we try to find $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$.
Theorem 3. Suppose that z = f(x, y) is a differentiable function of x and y, where x = x(s, t) and y = y(s, t) are differentiable functions of s and t. Then
\[
\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial s} \quad \text{and} \quad \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial t} \tag{16}
\]
Proof. Recall that in computing $\frac{\partial z}{\partial s}$ we hold t fixed and compute the derivative of z with respect to s. Therefore we can apply Theorem 2 to obtain
\[
\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial s}.
\]
A similar argument holds for $\frac{\partial z}{\partial t}$, and the proof is complete.
Example 3. Use the chain rule to find $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$ if $z = e^{x+2y}$, $x = \frac{s}{t}$ and $y = \frac{t}{s}$.
Solution.
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} = \left(e^{x+2y}\right)'_x\left(\frac{s}{t}\right)'_s + \left(e^{x+2y}\right)'_y\left(\frac{t}{s}\right)'_s$$
$$= e^{x+2y}\cdot\frac{1}{t} + e^{x+2y}\cdot 2\cdot\left(-\frac{t}{s^2}\right) = e^{\frac{s}{t}+\frac{2t}{s}}\left(\frac{1}{t} - \frac{2t}{s^2}\right) = \frac{s^2-2t^2}{ts^2}\,e^{\frac{s^2+2t^2}{st}}$$
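The closed form for $\frac{\partial z}{\partial s}$ can be cross-checked numerically at a sample point, say (s, t) = (2, 1). The sketch below (Python; illustrative) compares it with a centered finite difference in s:

```python
import math

def z(s, t):
    # z = e^{x+2y} with x = s/t, y = t/s
    return math.exp(s/t + 2*t/s)

def dz_ds(s, t):
    # closed form obtained by the chain rule in Example 3
    return (s*s - 2*t*t) / (t * s*s) * math.exp((s*s + 2*t*t) / (s*t))

s0, t0, h = 2.0, 1.0, 1e-6
numeric = (z(s0 + h, t0) - z(s0 - h, t0)) / (2*h)
print(dz_ds(s0, t0), numeric)  # the two values should agree
```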
This case of the chain rule contains three types of variables: s and t are independent variables, x and y are called intermediate variables, and z is the dependent variable. A tree diagram (see figure below) can help us remember the previous form of the chain rule. We draw branches from the dependent variable z to the intermediate variables x and y. Then we draw branches from x and y to the independent variables s and t. On each branch we write the corresponding partial derivative.
[Tree diagram: z branches to x and y (labelled $\frac{\partial z}{\partial x}$, $\frac{\partial z}{\partial y}$); x branches to s and t (labelled $\frac{\partial x}{\partial s}$, $\frac{\partial x}{\partial t}$); y branches to s and t (labelled $\frac{\partial y}{\partial s}$, $\frac{\partial y}{\partial t}$).]
To find $\frac{\partial z}{\partial s}$ we take the product of the partial derivatives along each path from z to s and then add these products:
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}$$
We consider now the general case in which the dependent variable z is a function of n intermediate variables $x_1, \ldots, x_n$, each of which is a function of m independent variables $s_1, \ldots, s_m$.
Theorem 4. (general case) Suppose that z is a differentiable function of the n variables $x_1, \ldots, x_n$ and each $x_j$ is a differentiable function of the m independent variables $s_1, \ldots, s_m$. Then z is a function of $s_1, \ldots, s_m$ and
$$\frac{\partial z}{\partial s_j} = \frac{\partial z}{\partial x_1}\frac{\partial x_1}{\partial s_j} + \cdots + \frac{\partial z}{\partial x_n}\frac{\partial x_n}{\partial s_j} = \sum_{i=1}^{n}\frac{\partial z}{\partial x_i}\frac{\partial x_i}{\partial s_j} \qquad (17)$$
for each $j = 1, \ldots, m$.
The proof is similar to the previous case.
Example 4. Use the chain rule to find the indicated partial derivatives: $\frac{\partial z}{\partial u}$, $\frac{\partial z}{\partial v}$, $\frac{\partial z}{\partial w}$ when u = 2, v = 1, w = 0, for $z = x^2 + xy^3$, $x = uv^2 + w^3$, $y = u + ve^w$.
Solution.
[Tree diagram: z branches to x and y (labelled $\frac{\partial z}{\partial x}$, $\frac{\partial z}{\partial y}$); x branches to u, v and w (labelled $\frac{\partial x}{\partial u}$, $\frac{\partial x}{\partial v}$, $\frac{\partial x}{\partial w}$); y branches to u, v and w (labelled $\frac{\partial y}{\partial u}$, $\frac{\partial y}{\partial v}$, $\frac{\partial y}{\partial w}$).]
$$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u} = (x^2+xy^3)'_x\,(uv^2+w^3)'_u + (x^2+xy^3)'_y\,(u+ve^w)'_u$$
$$= (2x+y^3)v^2 + 3xy^2\cdot 1$$
When u = 2, v = 1 and w = 0, then
$$x = 2\cdot 1^2 + 0^3 = 2 \quad\text{and}\quad y = 2 + 1\cdot e^0 = 3.$$
Therefore
$$\frac{\partial z}{\partial u}(2, 1, 0) = (2\cdot 2 + 3^3)\cdot 1^2 + 3\cdot 2\cdot 3^2\cdot 1 = 31 + 54 = 85.$$
The other two partial derivatives can be calculated in a similar way.
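A numerical cross-check of $\frac{\partial z}{\partial u}$ at (2, 1, 0) is straightforward; the sketch below (Python; illustrative, not part of the original text) compares the chain-rule formula with a centered finite difference:

```python
import math

def z_of(u, v, w):
    # z = x^2 + x y^3 with x = u v^2 + w^3, y = u + v e^w
    x = u*v**2 + w**3
    y = u + v*math.exp(w)
    return x**2 + x*y**3

def dz_du(u, v, w):
    # chain rule: dz/du = (2x + y^3) v^2 + 3 x y^2 * 1
    x = u*v**2 + w**3
    y = u + v*math.exp(w)
    return (2*x + y**3)*v**2 + 3*x*y**2

h = 1e-6
numeric = (z_of(2 + h, 1, 0) - z_of(2 - h, 1, 0)) / (2*h)
print(dz_du(2, 1, 0), numeric)  # the two values should agree
```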
4.4.2 Higher order differentials

Definition 1. Let $f : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^n$, $a \in D$. We say that f is differentiable of kth order at the point a if all the partial derivatives of order k − 1 exist at all points near a and they are differentiable at a.

Remark 1. If f has continuous partial derivatives of order k on D, then f is differentiable of order k on D.

Definition 2. Let $f : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^n$, be a $C^k$ function on D. The kth order differential of the function f at a of increment $h = (h_1, \ldots, h_n) \in \mathbb{R}^n$ is defined in the following "formal" way:
$$d^k f_{(a)}(h) = \left(\frac{\partial}{\partial x_1}h_1 + \cdots + \frac{\partial}{\partial x_n}h_n\right)^{(k)} f(a) \qquad (1)$$
In the previous definition $\frac{\partial}{\partial x_i}$, $i = 1, \ldots, n$, is the partial differentiation operator with respect to $x_i$ and (k) is a "formal" power. When we raise to the formal power (k) we apply the binomial theorem, where the "formal" kth power of $\frac{\partial}{\partial x_i}$ actually means the kth order partial derivative $\frac{\partial^k f}{\partial x_i^k}(a)$.
To make things clear we will present some particular cases of the previous formula.

Particular case 1. n = 2. In this case the kth order differential of the function f at (a, b) of increment $h = (h_1, h_2) \in \mathbb{R}^2$ is defined by
$$d^k f_{(a,b)}(h_1, h_2) = \left(\frac{\partial}{\partial x}h_1 + \frac{\partial}{\partial y}h_2\right)^{(k)} f(a, b).$$
a) k = 2
$$d^2 f_{(a,b)}(h_1, h_2) = \left(\frac{\partial}{\partial x}h_1 + \frac{\partial}{\partial y}h_2\right)^{(2)} f(a, b) = \left(\frac{\partial^2}{\partial x^2}h_1^2 + 2\frac{\partial^2}{\partial x\partial y}h_1 h_2 + \frac{\partial^2}{\partial y^2}h_2^2\right) f(a, b)$$
$$= \frac{\partial^2 f}{\partial x^2}(a, b)h_1^2 + 2\frac{\partial^2 f}{\partial x\partial y}(a, b)h_1 h_2 + \frac{\partial^2 f}{\partial y^2}(a, b)h_2^2.$$
b) k = 3
$$d^3 f_{(a,b)}(h_1, h_2) = \left(\frac{\partial}{\partial x}h_1 + \frac{\partial}{\partial y}h_2\right)^{(3)} f(a, b) = \left(\frac{\partial^3}{\partial x^3}h_1^3 + 3\frac{\partial^3}{\partial x^2\partial y}h_1^2 h_2 + 3\frac{\partial^3}{\partial x\partial y^2}h_1 h_2^2 + \frac{\partial^3}{\partial y^3}h_2^3\right) f(a, b)$$
$$= \frac{\partial^3 f}{\partial x^3}(a, b)h_1^3 + 3\frac{\partial^3 f}{\partial x^2\partial y}(a, b)h_1^2 h_2 + 3\frac{\partial^3 f}{\partial x\partial y^2}(a, b)h_1 h_2^2 + \frac{\partial^3 f}{\partial y^3}(a, b)h_2^3$$
Particular case 2. n = 3. In this case the kth order differential of the function f at (a, b, c) of increment $h = (h_1, h_2, h_3) \in \mathbb{R}^3$ is defined by
$$d^k f_{(a,b,c)}(h_1, h_2, h_3) = \left(\frac{\partial}{\partial x}h_1 + \frac{\partial}{\partial y}h_2 + \frac{\partial}{\partial z}h_3\right)^{(k)} f(a, b, c).$$
If k = 2 we have
$$d^2 f_{(a,b,c)}(h_1, h_2, h_3) = \left(\frac{\partial}{\partial x}h_1 + \frac{\partial}{\partial y}h_2 + \frac{\partial}{\partial z}h_3\right)^{(2)} f(a, b, c)$$
$$= \frac{\partial^2 f}{\partial x^2}(a, b, c)h_1^2 + \frac{\partial^2 f}{\partial y^2}(a, b, c)h_2^2 + \frac{\partial^2 f}{\partial z^2}(a, b, c)h_3^2$$
$$+ 2\frac{\partial^2 f}{\partial x\partial y}(a, b, c)h_1 h_2 + 2\frac{\partial^2 f}{\partial y\partial z}(a, b, c)h_2 h_3 + 2\frac{\partial^2 f}{\partial z\partial x}(a, b, c)h_3 h_1$$
Remark 2. Let $f : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^n$, be a $C^2$ function and let $a \in D$. Then we have
$$d^2 f_{(a)}(h) = h\, H(a)\, h^t, \quad\text{for each } h \in \mathbb{R}^n,$$
where by $h^t$ we understand the column vector $h^t = (h_1, h_2, \ldots, h_n)^t$, the transpose of the row vector $h = (h_1, \ldots, h_n)$.
Proof. Since f is a $C^2$ function, the Hessian matrix H(a) is symmetric (see Remark 2, Section 4.3), and
$$h\cdot H(a)\cdot h^t = (h_1, \ldots, h_n)\begin{pmatrix}\frac{\partial^2 f}{\partial x_1^2}(a) & \frac{\partial^2 f}{\partial x_1\partial x_2}(a) & \ldots & \frac{\partial^2 f}{\partial x_1\partial x_n}(a)\\ \frac{\partial^2 f}{\partial x_2\partial x_1}(a) & \frac{\partial^2 f}{\partial x_2^2}(a) & \ldots & \frac{\partial^2 f}{\partial x_2\partial x_n}(a)\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial^2 f}{\partial x_n\partial x_1}(a) & \frac{\partial^2 f}{\partial x_n\partial x_2}(a) & \ldots & \frac{\partial^2 f}{\partial x_n^2}(a)\end{pmatrix}\begin{pmatrix}h_1\\ h_2\\ \vdots\\ h_n\end{pmatrix}$$
$$= \left(\sum_{j=1}^{n}\frac{\partial^2 f}{\partial x_1\partial x_j}(a)h_j,\; \ldots,\; \sum_{j=1}^{n}\frac{\partial^2 f}{\partial x_n\partial x_j}(a)h_j\right)\begin{pmatrix}h_1\\ \vdots\\ h_n\end{pmatrix} = \sum_{i,j=1}^{n}\frac{\partial^2 f}{\partial x_i\partial x_j}(a)h_i h_j$$
$$= \sum_{i=1}^{n}\frac{\partial^2 f}{\partial x_i^2}(a)h_i^2 + 2\sum_{1\le i<j\le n}\frac{\partial^2 f}{\partial x_i\partial x_j}(a)h_i h_j = d^2 f_{(a)}(h).$$
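Remark 2 can be illustrated numerically. In the sketch below (Python; illustrative, with a sample function chosen by us, f(x, y) = x³y + y², whose Hessian at (1, 2) is computed by hand), the matrix product h·H(a)·hᵗ is compared with the expanded quadratic form from the last line of the proof:

```python
# Hessian of f(x, y) = x^3 y + y^2 at a = (1, 2):
# f_xx = 6xy = 12, f_xy = 3x^2 = 3, f_yy = 2
H = [[12.0, 3.0],
     [3.0, 2.0]]

def d2f(h):
    # d^2 f_(a)(h) = h . H(a) . h^t  (Remark 2)
    return sum(H[i][j] * h[i] * h[j] for i in range(2) for j in range(2))

def d2f_expanded(h):
    # pure-square terms plus twice the mixed term (last line of the proof)
    return H[0][0]*h[0]**2 + H[1][1]*h[1]**2 + 2*H[0][1]*h[0]*h[1]

for h in [(1.0, 0.0), (0.5, -2.0), (3.0, 1.0)]:
    assert abs(d2f(h) - d2f_expanded(h)) < 1e-12
print("h.H(a).h^t matches the expanded quadratic form")
```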
Remark 3. Let $f : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^n$, be a $C^2$ function and let $a \in D$. Then $Q : \mathbb{R}^n \to \mathbb{R}$ defined by
$$Q(h) = d^2 f_{(a)}(h)$$
is a quadratic form.

Proof. The proof is an easy consequence of the previous remark and of the definition of a quadratic form (see 1.3).
4.4.3 Taylor formula in $\mathbb{R}^n$

We learned in subsection 4.4.1 that the approximation by differential for a $C^1$ function $f : D \to \mathbb{R}$, $a \in D$, is
$$f(x) \approx f(a) + df_{(a)}(x - a) \qquad (1)$$
The previous estimation is valid for $\|x - a\|$ sufficiently small.

The difference between the left-hand side of (1) and the right-hand side is $\omega(x)\|x - a\|$ (see (10), subsection 4.1.2), where $\lim_{x\to a}\omega(x) = 0$. If we denote this difference by $R_1(x)$ (R from the remainder), then the equality (10), subsection 4.4.1, becomes
$$f(x) = f(a) + df_{(a)}(x - a) + R_1(x), \quad x \in D \qquad (2)$$
where
$$\lim_{x\to a}\frac{R_1(x)}{\|x - a\|} = 0.$$
(2) is called the Taylor approximation of order one for a $C^1$ function of several variables.
The next theorem shows that if f has continuous second-order partial derivatives, the error term is equal to a quadratic form (see 1.3) plus a term of smaller order than $\|x - a\|^2$.

Theorem 1. (Second order Taylor formula) Let $f : D \to \mathbb{R}$ be a $C^2$ function, $D \subseteq \mathbb{R}^n$, and let $a \in D$. Then there exists a function $R_2 : D \to \mathbb{R}$ such that
$$f(x) = f(a) + df_{(a)}(x - a) + \frac{1}{2!}d^2 f_{(a)}(x - a) + R_2(x), \quad x \in D \qquad (3)$$
where
$$\lim_{x\to a}\frac{R_2(x)}{\|x - a\|^2} = 0.$$
Proof. Keep x fixed and define $g : [0, 1] \to \mathbb{R}$ by
$$g(u) = f(a + u(x - a)).$$
Then
$$f(x) - f(a) = g(1) - g(0).$$
We will prove the theorem by applying the one-variable second order Taylor formula to g at the point 0, evaluated at u = 1. We obtain
$$g(1) = g(0) + g'(0) + \frac{1}{2}g''(c), \quad\text{where } 0 < c < 1.$$
Here we have used Lagrange's form of the remainder (see Remark 2, subsection 3.1.4).

We compute the derivatives of g by the chain rule:
$$g'(u) = \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}(a + u(x - a))(x_i - a_i) = df_{(a+u(x-a))}(x - a)$$
In particular,
$$g'(0) = df_{(a)}(x - a).$$
Using the chain rule once more we find
$$g''(u) = \sum_{i,j=1}^{n}\frac{\partial^2 f}{\partial x_i\partial x_j}(a + u(x - a))(x_i - a_i)(x_j - a_j) = d^2 f_{(a+u(x-a))}(x - a).$$
Hence
$$g''(c) = d^2 f_{(a+c(x-a))}(x - a).$$
To prove (3) we define
$$R_2(x) = \frac{1}{2}d^2 f_{(a+c(x-a))}(x - a) - \frac{1}{2}d^2 f_{(a)}(x - a).$$
By using the previous equality we have
$$f(x) - f(a) = df_{(a)}(x - a) + \frac{1}{2}d^2 f_{(a+c(x-a))}(x - a) = df_{(a)}(x - a) + \frac{1}{2}d^2 f_{(a)}(x - a) + R_2(x)$$
To complete the proof we need to show that
$$\frac{R_2(x)}{\|x - a\|^2} \to 0 \quad\text{as } x \to a.$$
We have:
$$\frac{|R_2(x)|}{\|x - a\|^2} = \frac{1}{\|x - a\|^2}\cdot\frac{1}{2}\left|\sum_{i,j=1}^{n}\left(\frac{\partial^2 f}{\partial x_i\partial x_j}(a + c(x - a)) - \frac{\partial^2 f}{\partial x_i\partial x_j}(a)\right)(x_i - a_i)(x_j - a_j)\right|$$
$$\le \frac{1}{2}\sum_{i,j=1}^{n}\left|\frac{\partial^2 f}{\partial x_i\partial x_j}(a + c(x - a)) - \frac{\partial^2 f}{\partial x_i\partial x_j}(a)\right|,$$
where we used $|x_i - a_i|\,|x_j - a_j| \le \|x - a\|^2$.
Since each second-order partial derivative is continuous at a, we have
$$\frac{\partial^2 f}{\partial x_i\partial x_j}(a + c(x - a)) \to \frac{\partial^2 f}{\partial x_i\partial x_j}(a), \quad\text{as } x \to a,$$
so
$$\frac{R_2(x)}{\|x - a\|^2} \to 0, \quad\text{as } x \to a.$$
This completes the proof.

The corresponding second-order Taylor approximation is
$$f(x) \approx f(a) + df_{(a)}(x - a) + \frac{1}{2}d^2 f_{(a)}(x - a)$$
The previous estimation is valid for $\|x - a\|$ sufficiently small.

We present without proof the next theorem, which summarizes the approximation of a $C^k$ function on $\mathbb{R}^n$ by a Taylor polynomial of order k.

Theorem 2. Let k be a positive integer. Let $f : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^n$, be a $C^k$ function on D and let $a \in D$. Then there exists a function $R_k : D \to \mathbb{R}$ such that for all $x \in D$
$$f(x) = f(a) + df_{(a)}(x - a) + \frac{1}{2!}d^2 f_{(a)}(x - a) + \cdots + \frac{1}{k!}d^k f_{(a)}(x - a) + R_k(x) \qquad (4)$$
where
$$\frac{R_k(x)}{\|x - a\|^k} \to 0, \quad\text{as } x \to a.$$
Relation (4) is called Taylor's formula. In the previous theorem the polynomial $T_k : D \to \mathbb{R}$,
$$T_k(x) = f(a) + \frac{1}{1!}df_{(a)}(x - a) + \frac{1}{2!}d^2 f_{(a)}(x - a) + \cdots + \frac{1}{k!}d^k f_{(a)}(x - a), \qquad (5)$$
is called the Taylor polynomial of degree k, and $R_k = f - T_k$ is called the remainder of order k.

The corresponding kth order Taylor approximation is
$$f(x) \approx f(a) + \frac{1}{1!}df_{(a)}(x - a) + \frac{1}{2!}d^2 f_{(a)}(x - a) + \cdots + \frac{1}{k!}d^k f_{(a)}(x - a), \qquad (6)$$
valid for $\|x - a\|$ sufficiently small.
Example 1. Compute the first and second order Taylor approximations of the Cobb-Douglas function
$$f : (0, \infty)\times(0, \infty) \to \mathbb{R}, \quad f(x, y) = x^{\frac{1}{4}}y^{\frac{3}{4}}$$
at the point (1, 1).
Solution. We compute the first and second order partial derivatives of f:
$$\frac{\partial f}{\partial x} = \frac{1}{4}x^{-\frac{3}{4}}y^{\frac{3}{4}}, \quad \frac{\partial f}{\partial y} = \frac{3}{4}x^{\frac{1}{4}}y^{-\frac{1}{4}}$$
$$\frac{\partial^2 f}{\partial x^2} = -\frac{3}{16}x^{-\frac{7}{4}}y^{\frac{3}{4}}, \quad \frac{\partial^2 f}{\partial x\partial y} = \frac{3}{16}x^{-\frac{3}{4}}y^{-\frac{1}{4}}, \quad \frac{\partial^2 f}{\partial y^2} = -\frac{3}{16}x^{\frac{1}{4}}y^{-\frac{5}{4}}$$
Evaluating these partial derivatives at the point (1, 1) we obtain:
$$\frac{\partial f}{\partial x}(1, 1) = \frac{1}{4}, \quad \frac{\partial f}{\partial y}(1, 1) = \frac{3}{4},$$
$$\frac{\partial^2 f}{\partial x^2}(1, 1) = -\frac{3}{16}, \quad \frac{\partial^2 f}{\partial x\partial y}(1, 1) = \frac{3}{16}, \quad \frac{\partial^2 f}{\partial y^2}(1, 1) = -\frac{3}{16}$$
Substituting into
$$df_{(1,1)}(x - 1, y - 1) = \frac{\partial f}{\partial x}(1, 1)(x - 1) + \frac{\partial f}{\partial y}(1, 1)(y - 1)$$
the values of the first partial derivatives obtained before, we get:
$$df_{(1,1)}(x - 1, y - 1) = \frac{1}{4}(x - 1) + \frac{3}{4}(y - 1)$$
Substituting into
$$d^2 f_{(1,1)}(x - 1, y - 1) = \frac{\partial^2 f}{\partial x^2}(1, 1)(x - 1)^2 + 2\frac{\partial^2 f}{\partial x\partial y}(1, 1)(x - 1)(y - 1) + \frac{\partial^2 f}{\partial y^2}(1, 1)(y - 1)^2$$
the values of the second partial derivatives, we get:
$$d^2 f_{(1,1)}(x - 1, y - 1) = -\frac{3}{16}(x - 1)^2 + \frac{3}{8}(x - 1)(y - 1) - \frac{3}{16}(y - 1)^2.$$
Finally we obtain:
$$f(x, y) \approx 1 + \frac{1}{4}(x - 1) + \frac{3}{4}(y - 1)$$
– the first order approximation;
$$f(x, y) \approx 1 + \frac{1}{4}(x - 1) + \frac{3}{4}(y - 1) - \frac{3}{32}(x - 1)^2 + \frac{3}{16}(x - 1)(y - 1) - \frac{3}{32}(y - 1)^2$$
– the second order approximation.
If we use the Taylor approximation of order one to approximate
$$f(1.1,\, 0.9) = (1.1)^{\frac{1}{4}}(0.9)^{\frac{3}{4}},$$
we obtain
$$(1.1)^{\frac{1}{4}}(0.9)^{\frac{3}{4}} \approx 1 + \frac{1}{4}(1.1 - 1) + \frac{3}{4}(0.9 - 1) = 1 + \frac{1}{4}\cdot 0.1 - \frac{3}{4}\cdot 0.1 = 0.95.$$
If we use the Taylor approximation of order two we get
$$(1.1)^{\frac{1}{4}}(0.9)^{\frac{3}{4}} \approx 1 + \frac{1}{4}\cdot 0.1 + \frac{3}{4}\cdot(-0.1) - \frac{3}{32}(0.1)^2 + \frac{3}{16}\cdot 0.1\cdot(-0.1) - \frac{3}{32}(-0.1)^2 = 0.94625.$$
It can be easily seen that the second order Taylor approximation gives a better approximation of the true value
$$(1.1)^{\frac{1}{4}}(0.9)^{\frac{3}{4}} = 0.94630\ldots.$$
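The two Taylor approximations of the Cobb-Douglas value are easy to reproduce; the sketch below (Python; illustrative, not part of the original text) evaluates both polynomials at (1.1, 0.9) and compares them with the exact value:

```python
def t1(x, y):
    # first order Taylor approximation of x^{1/4} y^{3/4} at (1, 1)
    return 1 + 0.25*(x - 1) + 0.75*(y - 1)

def t2(x, y):
    # second order Taylor approximation at (1, 1)
    return t1(x, y) - 3/32*(x - 1)**2 + 3/16*(x - 1)*(y - 1) - 3/32*(y - 1)**2

exact = 1.1**0.25 * 0.9**0.75
print(t1(1.1, 0.9), t2(1.1, 0.9), exact)
# the second order approximation is closer to the exact value
```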
4.5 Extrema of functions of several variables

Since optimization plays a major role in economic theory, this section can be considered the core of this part of the book.

At this moment we have a good understanding of the conditions under which a is a local extreme point of a $C^2$ function $f : I \to \mathbb{R}$, where I is an open interval, $I \subseteq \mathbb{R}$. These conditions can be stated as follows.
1°. Necessary conditions
If a is a local minimum point of f, then $f'(a) = 0$ and $f''(a) \ge 0$.
If a is a local maximum point of f, then $f'(a) = 0$ and $f''(a) \le 0$.

2°. Sufficient conditions
If $f'(a) = 0$ and $f''(a) < 0$, then a is a local maximum point of f.
If $f'(a) = 0$ and $f''(a) > 0$, then a is a local minimum point of f.
Our purpose is to develop the generalizations of the previous results to the case of functions of more than one variable. We will see that the main results for functions of several variables are analogous to the one-dimensional results.
Definition 1. Let $f : D \to \mathbb{R}$ be a real-valued function of n variables, D a subset of $\mathbb{R}^n$, and let $a \in D$.
a) The point a is called a global or absolute maximum point of f if $f(a) \ge f(x)$ for all $x \in D$. In this case, f(a) is the global maximum value of f.
b) The point a is called a local (or relative) maximum point of f if there is a ball B(a, r) such that $f(a) \ge f(x)$ for all $x \in B(a, r) \cap D$. In this case, f(a) is the local maximum value of f.
Reversing the inequalities in the above two definitions we obtain the definitions of a global minimum point and of a local minimum point.
First order conditions

The results presented here are obtained by using only the first order partial derivatives of a given function. In the case of one variable, the first order condition for a point a to be a local maximum or minimum point of a $C^1$ function f is that $f'(a) = 0$; that is, a has to be a critical point of f. The generalization of the critical point notion to several variables is the following:

Definition 2. Let $f : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^n$, and let $a \in D$. The point a is called a stationary (or critical) point of f if f admits partial derivatives at a and
$$\frac{\partial f}{\partial x_1}(a) = \cdots = \frac{\partial f}{\partial x_n}(a) = 0.$$
The next theorem is very useful in locating local extrema of f.

Theorem 1. Let $f : D \to \mathbb{R}$ be a differentiable function on $D \subseteq \mathbb{R}^n$. If a is a local maximum (or minimum) point of f, then
$$\frac{\partial f}{\partial x_1}(a) = \frac{\partial f}{\partial x_2}(a) = \cdots = \frac{\partial f}{\partial x_n}(a) = 0.$$
Hence a is a stationary point of f.

Proof. We will work in the case when a is a local minimum point (the same proof works for the maximum case). Let B = B(a, r) be a ball centered at a, $B \subset D$, with the property that $f(a) \le f(x)$ for all $x \in B$. Since a is a minimum point of f on B, along each segment which passes through a (and lies in B) f takes its minimum value at a. In consequence, for each $i = 1, \ldots, n$, $a_i$ is a minimum point of the following function of one variable:
$$g_i : (a_i - r, a_i + r) \to \mathbb{R}, \quad g_i(x_i) = f(a_1, \ldots, a_{i-1}, x_i, a_{i+1}, \ldots, a_n).$$
If we now apply Fermat's theorem (Theorem 1, subsection 3.1.4) to this function, we conclude that
$$g_i'(a_i) = \frac{\partial f}{\partial x_i}(a) = 0, \quad i = 1, \ldots, n.$$
The previous theorem says that in order to determine the local extreme points of a differentiable function we must look among the stationary points.

Example 1. Determine the stationary points of the function defined by
$$f : \mathbb{R}^2 \to \mathbb{R}, \quad f(x, y) = x^3 - y^3 + 9xy.$$
Solution. To find the stationary points of f, we compute the first order partial derivatives and equate them to zero:
$$\frac{\partial f}{\partial x}(x, y) = 3x^2 + 9y = 0, \qquad \frac{\partial f}{\partial y}(x, y) = -3y^2 + 9x = 0$$
From the first equation we get $y = -\frac{1}{3}x^2$, which can be substituted into the second one to get
$$-\frac{1}{3}x^4 + 9x = 0.$$
The solutions of the previous equation are x = 0 and x = 3. Substituting these values into $y = -\frac{1}{3}x^2$, we obtain that the stationary points of the function f are (0, 0) and (3, −3). At this moment we are not able to decide the nature of these two stationary points. To determine their nature we need to use a condition on the second order differential of f, as we did for functions of one variable.
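The two stationary points of Example 1 can be verified by checking that the gradient vanishes there; the sketch below (Python; illustrative, not part of the original text) does exactly that:

```python
def grad(x, y):
    # gradient of f(x, y) = x^3 - y^3 + 9xy
    return (3*x**2 + 9*y, -3*y**2 + 9*x)

for p in [(0.0, 0.0), (3.0, -3.0)]:
    gx, gy = grad(*p)
    assert abs(gx) < 1e-12 and abs(gy) < 1e-12
print("(0, 0) and (3, -3) are stationary points")
```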
Second order conditions

Sufficient conditions

Theorem 2. Let $f : D \to \mathbb{R}$ be a $C^2$ function, $D \subseteq \mathbb{R}^n$, and suppose that $a \in D$ is a stationary point of f.
a) If $d^2 f_{(a)}(h) > 0$ for each $h \in \mathbb{R}^n \setminus \{\theta\}$, then a is a local minimum point of f.
b) If $d^2 f_{(a)}(h) < 0$ for each $h \in \mathbb{R}^n \setminus \{\theta\}$, then a is a local maximum point of f.
c) If there are $v, w \in \mathbb{R}^n \setminus \{\theta\}$ such that $d^2 f_{(a)}(v) > 0$ and $d^2 f_{(a)}(w) < 0$, then a is neither a local maximum point nor a local minimum point of f.

Definition 3. A stationary point of f for which the assumptions of part c) hold is called a saddle point.

Proof. Since the proofs of parts a) and b) are quite similar, we will prove part a) and leave the proof of part b) as an exercise.
a) We assume that a is a stationary point of the $C^2$ function f and that $d^2 f_{(a)}(h) > 0$ for each $h \in \mathbb{R}^n \setminus \{\theta\}$. Write the second order Taylor formula at the stationary point a:
$$f(x) = f(a) + df_{(a)}(x - a) + \frac{1}{2}d^2 f_{(a)}(x - a) + R_2(x) \qquad (1)$$
where
$$\frac{R_2(x)}{\|x - a\|^2} \to 0, \quad\text{as } x \to a.$$
Since a is a stationary point of f, (1) becomes
$$f(x) - f(a) = \frac{1}{2}d^2 f_{(a)}(x - a) + R_2(x).$$
We divide the previous equality by $\|x - a\|^2$ and get
$$\frac{f(x) - f(a)}{\|x - a\|^2} = \frac{1}{2}d^2 f_{(a)}\!\left(\frac{x - a}{\|x - a\|}\right) + \frac{R_2(x)}{\|x - a\|^2} \qquad (2)$$
As a polynomial of degree two, the quadratic form $Q(h) = d^2 f_{(a)}(h)$ is a continuous function on $\mathbb{R}^n$. Let $\alpha = \min\{Q(v) \mid \|v\| = 1\}$. Since the unit sphere $\{v \mid \|v\| = 1\}$ is compact, by applying Weierstrass' theorem (Theorem 4, subsection 4.1) to the restriction of Q to the unit sphere we conclude that there exists a point w on the unit sphere such that $Q(w) = \alpha$. Since Q is positive definite and $w \ne \theta$ ($\|w\| = 1$), we have $\alpha = Q(w) > 0$, and it follows that
$$\frac{1}{2}\alpha \le \frac{1}{2}Q\!\left(\frac{x - a}{\|x - a\|}\right) = \frac{1}{2}d^2 f_{(a)}\!\left(\frac{x - a}{\|x - a\|}\right), \quad\text{for all } x \ne a \qquad (3)$$
Since $\frac{R_2(x)}{\|x - a\|^2} \to 0$ as $x \to a$, there exists an r > 0 such that
$$-\frac{\alpha}{4} < \frac{R_2(x)}{\|x - a\|^2} < \frac{\alpha}{4} \quad\text{for all } x \text{ with } 0 < \|x - a\| < r \qquad (4)$$
Combining (3) and (4) we find that for all x with $0 < \|x - a\| < r$ we have
$$\frac{1}{2}d^2 f_{(a)}\!\left(\frac{x - a}{\|x - a\|}\right) + \frac{R_2(x)}{\|x - a\|^2} > \frac{1}{2}\alpha - \frac{1}{4}\alpha = \frac{1}{4}\alpha > 0.$$
The right-hand side of the equality (2) is therefore positive for $0 < \|x - a\| < r$, hence so is the left-hand side:
$$\frac{f(x) - f(a)}{\|x - a\|^2} > 0 \quad\text{for } 0 < \|x - a\| < r, \text{ i.e. for } x \in B(a, r),\; x \ne a.$$
Thus a is a local minimum point of f.
c) We will show that the conditions
$$df_{(a)} \equiv 0 \quad\text{and}\quad d^2 f_{(a)}(v) > 0$$
imply that a cannot be a local maximum point of f. In the same way, the conditions
$$df_{(a)} \equiv 0 \quad\text{and}\quad d^2 f_{(a)}(w) < 0$$
imply that a cannot be a local minimum point of f.

We consider the function $t \mapsto g(t) = f(a + tv)$ and use the chain rule to compute its first and second derivatives:
$$g'(t) = \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}(a + tv)v_i = df_{(a+tv)}(v),$$
so
$$g'(0) = df_{(a)}(v) = 0.$$
Taking the second derivative we get
$$g''(t) = \sum_{i,j=1}^{n}\frac{\partial^2 f}{\partial x_i\partial x_j}(a + tv)v_i v_j = d^2 f_{(a+tv)}(v).$$
Since $g''(0) = d^2 f_{(a)}(v) > 0$ and $g''$ is a continuous function, there is $\varepsilon > 0$ such that $g''(t) > 0$ for all $t \in (-\varepsilon, \varepsilon)$, and in conclusion $g'$ is an increasing function on $(-\varepsilon, \varepsilon)$. Taking into account the fact that $g'(0) = 0$, we obtain $g'(t) > 0$ for each $t \in (0, \varepsilon)$. This implies that g is an increasing function on $(0, \varepsilon)$. In particular, a cannot be a local maximum point of f. A similar argument shows that $df_{(a)} \equiv 0$ and $d^2 f_{(a)}(w) < 0$ imply that a cannot be a local minimum point either. This completes the proof of Theorem 2.
By using the characterization of positive definite and negative definite quadratic forms, Theorem 2 can be restated as the following theorem.

Theorem 3. Let $f : D \to \mathbb{R}$ be a $C^2$ function, $D \subseteq \mathbb{R}^n$, and suppose that a is a stationary point of f. Denote by $a_{ij} = \frac{\partial^2 f}{\partial x_i\partial x_j}(a)$ the entries of the Hessian matrix H(a).
a) If the n leading principal minors of H(a) are all positive,
$$a_{11} > 0, \quad \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} > 0, \quad \ldots, \quad \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \ldots & \ldots & \ldots & \ldots\\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} > 0,$$
then a is a local minimum point of f.
b) If the n leading principal minors of H(a) alternate in sign,
$$a_{11} < 0, \quad \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} > 0, \quad \ldots, \quad (-1)^n\begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \ldots & \ldots & \ldots & \ldots\\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} > 0,$$
then a is a local maximum point of f.
c) If some nonzero leading principal minors of H(a) do not satisfy the sign conditions in the hypotheses of parts a) and b), then a is a saddle point of f (it is neither a local maximum nor a local minimum point of f).
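The test of Theorem 3 is mechanical enough to code. The sketch below (Python; an illustrative implementation written by us, not part of the original text, and deliberately returning "inconclusive" whenever a leading minor vanishes) classifies a stationary point from its Hessian:

```python
def det(M):
    # Laplace expansion along the first row; fine for the small matrices used here
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1)**j * M[0][j] * det(minor)
    return total

def classify(H):
    """Classify a stationary point from its Hessian H via the leading
    principal minors (Theorem 3): 'min', 'max', 'saddle', or
    'inconclusive' if some leading minor is (numerically) zero."""
    n = len(H)
    minors = [det([row[:k] for row in H[:k]]) for k in range(1, n + 1)]
    if any(abs(m) < 1e-12 for m in minors):
        return "inconclusive"
    if all(m > 0 for m in minors):
        return "min"
    if all((m > 0) == (k % 2 == 0) for k, m in enumerate(minors, start=1)):
        # signs alternate starting with a negative minor: (-1)^k * minor_k > 0
        return "max"
    return "saddle"

print(classify([[18, 9], [9, 18]]))                # Hessian of Example 2 at (3, -3)
print(classify([[4, 2, 2], [2, 2, 0], [2, 0, 6]])) # Hessian of Example 6
```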
Next we will present a particular case of the previous results which concerns functions of two variables.

Theorem 4. Let $f : D \to \mathbb{R}$, $D \subseteq \mathbb{R}^2$, be a $C^2$ function, and suppose that (a, b) is a stationary point of f. Let
$$A = \frac{\partial^2 f}{\partial x^2}(a, b), \quad B = \frac{\partial^2 f}{\partial x\partial y}(a, b), \quad C = \frac{\partial^2 f}{\partial y^2}(a, b) \quad\text{and}\quad D = B^2 - AC.$$
a) If D < 0 and A > 0, then (a, b) is a local minimum point of f.
b) If D < 0 and A < 0, then (a, b) is a local maximum point of f.
c) If D > 0, then (a, b) is a saddle point.
d) If D = 0, no conclusion can be drawn concerning a relative extremum; the test is inconclusive and some other technique must be used to solve the problem.
Proof. Since
$$a_{11} = A \quad\text{and}\quad D = -\begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix},$$
Theorem 4 is an immediate consequence of Theorem 3.
Example 2. In Example 1 we computed that the stationary points of
$$f : \mathbb{R}^2 \to \mathbb{R}, \quad f(x, y) = x^3 - y^3 + 9xy$$
are (0, 0) and (3, −3). By differentiating the first partial derivatives, we obtain that the Hessian of f at (x, y) is
$$H(x, y) = \begin{pmatrix} f''_{x^2}(x, y) & f''_{xy}(x, y)\\ f''_{yx}(x, y) & f''_{y^2}(x, y) \end{pmatrix} = \begin{pmatrix} 6x & 9\\ 9 & -6y \end{pmatrix}$$
The first order leading principal minor is $\Delta_1(x, y) = 6x$ and the second order leading principal minor is $\Delta_2(x, y) = -36xy - 81$.

At (0, 0), these two minors are 0 and −81, respectively. Since the second order leading principal minor is negative, (0, 0) is a saddle point of f – neither a maximum point nor a minimum point (see Theorem 3). At (3, −3) these two minors are 18 and 243, which are positive numbers, and in consequence (3, −3) is a local minimum point of f (see Theorem 3).
Another way of solving the problem is by using Theorem 4. If we analyze the nature of the stationary point (0, 0), we observe that
$$A = \frac{\partial^2 f}{\partial x^2}(0, 0) = 0, \quad B = \frac{\partial^2 f}{\partial x\partial y}(0, 0) = 9, \quad C = \frac{\partial^2 f}{\partial y^2}(0, 0) = 0, \quad D = 81 > 0,$$
hence (0, 0) is a saddle point (see part c) of Theorem 4). At (3, −3) we have
$$A = \frac{\partial^2 f}{\partial x^2}(3, -3) = 18, \quad B = \frac{\partial^2 f}{\partial x\partial y}(3, -3) = 9, \quad C = \frac{\partial^2 f}{\partial y^2}(3, -3) = 18, \quad D = 9^2 - 18\cdot 18 = -243 < 0,$$
hence (3, −3) is a local minimum point of f. We have to mention that (3, −3) is not a global minimum point, because $f(0, n) = -n^3$, which goes to $-\infty$ as $n \to \infty$.
Example 3. A monopolist producing a single output has two types of customers. If it produces a units for customers of type 1, then these customers are willing to pay 100 − 10a euros per unit. If it produces b units for customers of type 2, then these customers are willing to pay a price of 200 − 20b euros per unit. The monopolist's cost of producing c units of output is 180 + 40c euros. In order to maximize profits, how much should the monopolist produce for each market?

Solution. The profit function is
$$f(a, b) = a(100 - 10a) + b(200 - 20b) - [180 + 40(a + b)]$$
The stationary points are the solutions of the following system:
$$\begin{cases}\dfrac{\partial f}{\partial a} = 100 - 20a - 40 = 60 - 20a = 0\\[2mm] \dfrac{\partial f}{\partial b} = 200 - 40b - 40 = 160 - 40b = 0\end{cases} \;\Leftrightarrow\; (a, b) = (3, 4)$$
It remains to check the second order conditions. Since
$$f''_{a^2}(a, b) = -20, \quad f''_{b^2}(a, b) = -40 \quad\text{and}\quad f''_{ab}(a, b) = f''_{ba}(a, b) = 0,$$
then $A = -20 < 0$ and $D = B^2 - AC = -800 < 0$. Therefore the point (3, 4) is a local maximum point of f.
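The conclusion of Example 3 is easy to confirm by brute force; the sketch below (Python; illustrative, not part of the original text) evaluates the profit on a grid around (3, 4) and checks that no grid point beats it:

```python
def profit(a, b):
    # revenue from each customer type minus total cost (Example 3)
    return a*(100 - 10*a) + b*(200 - 20*b) - (180 + 40*(a + b))

best = profit(3, 4)
# the stationary point (3, 4) beats every point on a surrounding grid
for i in range(-20, 21):
    for j in range(-20, 21):
        a, b = 3 + i*0.1, 4 + j*0.1
        assert profit(a, b) <= best + 1e-9
print("maximum profit:", best)  # 230 euros at (a, b) = (3, 4)
```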
Example 4. A firm uses two inputs to produce a single product. If its production function is
$$Q(x, y) = x^{\frac{1}{4}}y^{\frac{1}{4}}$$
and if it sells its output for one euro a unit and buys each input for $\frac{1}{16}$ euros a unit, find its maximum profit.

Solution. The profit function is
$$f(x, y) = x^{\frac{1}{4}}y^{\frac{1}{4}} - \frac{1}{16}(x + y), \quad x > 0,\; y > 0.$$
The stationary points are the solutions of the following system:
$$\begin{cases}\dfrac{\partial f}{\partial x}(x, y) = 0\\[2mm] \dfrac{\partial f}{\partial y}(x, y) = 0\end{cases} \;\Leftrightarrow\; \begin{cases}\dfrac{1}{4}x^{-\frac{3}{4}}y^{\frac{1}{4}} - \dfrac{1}{16} = 0\\[2mm] \dfrac{1}{4}x^{\frac{1}{4}}y^{-\frac{3}{4}} - \dfrac{1}{16} = 0\end{cases} \;\Leftrightarrow\; \begin{cases}\dfrac{y}{x^3} = \left(\dfrac{1}{4}\right)^4\\[2mm] \dfrac{x}{y^3} = \left(\dfrac{1}{4}\right)^4\end{cases}$$
Multiplying the two equations gives $\dfrac{1}{x^2 y^2} = \left(\dfrac{1}{4}\right)^8$, i.e. $\dfrac{1}{xy} = \left(\dfrac{1}{4}\right)^4$. Multiplying this by $\dfrac{y}{x^3} = \left(\dfrac{1}{4}\right)^4$ gives $\left(\dfrac{1}{x}\right)^4 = \left(\dfrac{1}{4}\right)^8$, hence
$$x = 16, \quad y = 16.$$
It remains to check the second order conditions:
$$\frac{\partial^2 f}{\partial x^2}(x, y) = -\frac{3}{16}x^{-\frac{7}{4}}y^{\frac{1}{4}}; \quad A = -\frac{3}{16}\cdot 2^{-6} < 0$$
$$\frac{\partial^2 f}{\partial x\partial y}(x, y) = \frac{1}{16}x^{-\frac{3}{4}}y^{-\frac{3}{4}}; \quad B = \frac{1}{16}\cdot 2^{-3}\cdot 2^{-3} = \frac{1}{16}\cdot 2^{-6}$$
$$\frac{\partial^2 f}{\partial y^2}(x, y) = -\frac{3}{16}x^{\frac{1}{4}}y^{-\frac{7}{4}}; \quad C = -\frac{3}{16}\cdot 2^{-6}.$$
In consequence
$$D = B^2 - AC = \frac{1}{16^2}\cdot 2^{-12} - \frac{9}{16^2}\cdot 2^{-12} < 0 \quad\text{and}\quad A < 0,$$
hence (16, 16) is a local maximum point of f, and the maximum profit is
$$f(16, 16) = 16^{\frac{1}{4}}\cdot 16^{\frac{1}{4}} - \frac{1}{16}(16 + 16) = 4 - 2 = 2 \text{ euros}.$$
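The profit at the stationary point of Example 4 can be checked against other candidate input bundles; the sketch below (Python; illustrative, not part of the original text) does a spot check:

```python
def profit(x, y):
    # f(x, y) = x^{1/4} y^{1/4} - (x + y)/16  (Example 4)
    return x**0.25 * y**0.25 - (x + y)/16

best = profit(16, 16)
print(best)  # 4 - 2 = 2 euros
for (x, y) in [(8, 16), (16, 8), (15.5, 16.5), (20, 20), (1, 1)]:
    assert profit(x, y) < best
```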
Example 5. A farmer wishes to build a rectangular storage bin, without a top, with a volume of 500 cubic meters, using the least amount of material possible. Determine the dimensions of such a storage bin.

Solution. If we let x and y be the dimensions of the base of the bin and z be the height, all measured in meters, then the farmer wishes to minimize the surface area of the bin, given by
$$S = xy + 2xz + 2yz \qquad (3)$$
subject to the constraint on the volume, namely,
$$500 = xyz.$$
Solving for z in the latter expression and substituting into (3), we have
$$S = S(x, y) = xy + 2x\cdot\frac{500}{xy} + 2y\cdot\frac{500}{xy} = xy + \frac{1000}{y} + \frac{1000}{x}$$
This is the function we need to minimize on the unbounded set
$$D = \{(x, y) \mid x > 0,\; y > 0\}.$$
Now
$$\frac{\partial S}{\partial x} = y - \frac{1000}{x^2} \quad\text{and}\quad \frac{\partial S}{\partial y} = x - \frac{1000}{y^2},$$
so to find the stationary points of S we need to solve
$$\begin{cases} y - \dfrac{1000}{x^2} = 0\\[2mm] x - \dfrac{1000}{y^2} = 0\end{cases}$$
Solving for y in the first equation and then substituting into the second equation, we get
$$x - \frac{x^4}{1000} = 0 \;\Leftrightarrow\; x\left(1 - \frac{x^3}{1000}\right) = 0.$$
The solutions of the latter are x = 0 and x = 10. Since the first of these does not give a point in D, we have x = 10 and
$$y = \frac{1000}{10^2} = 10.$$
Thus the only stationary point is (10, 10). Now
$$\frac{\partial^2 S}{\partial x^2} = \frac{2000}{x^3}; \quad A = \frac{\partial^2 S}{\partial x^2}(10, 10) = \frac{2000}{1000} = 2 > 0$$
$$\frac{\partial^2 S}{\partial x\partial y} = 1; \quad B = \frac{\partial^2 S}{\partial x\partial y}(10, 10) = 1$$
$$\frac{\partial^2 S}{\partial y^2} = \frac{2000}{y^3}; \quad C = \frac{\partial^2 S}{\partial y^2}(10, 10) = \frac{2000}{1000} = 2 > 0$$
Hence $D = B^2 - AC = 1 - 4 = -3 < 0$ and A > 0. This shows that S has a local minimum of
$$S(10, 10) = 10\cdot 10 + \frac{1000}{10} + \frac{1000}{10} = 300 \quad\text{at } (x, y) = (10, 10).$$
Finally, when x = 10 and y = 10, we have
$$z = \frac{500}{10\cdot 10} = 5,$$
so the farmer should build the bin with a base of 10 meters by 10 meters and a height of 5 meters.
The problem of showing that the point (10, 10) actually gives the global minimum value of S will be discussed later.
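The reduced surface-area function of Example 5 can be spot-checked numerically; the sketch below (Python; illustrative, not part of the original text) confirms that (10, 10) beats nearby candidate bases:

```python
def surface(x, y):
    # S(x, y) = xy + 1000/y + 1000/x, the open-top bin with volume 500 (z = 500/(xy))
    return x*y + 1000/y + 1000/x

print(surface(10, 10))  # 300
for (x, y) in [(9, 10), (10, 9), (11, 11), (5, 20), (20, 5)]:
    assert surface(x, y) > surface(10, 10)
```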
Example 6. Determine the local extreme values of the function
$$f : \mathbb{R}^3 \to \mathbb{R}, \quad f(x, y, z) = 2x^2 + y^2 + 2xy + 3z^2 + 2xz + 4z + 3.$$
Solution. The stationary points are the solutions of the following system of linear equations:
$$\begin{cases}\dfrac{\partial f}{\partial x}(x, y, z) = 0\\[2mm] \dfrac{\partial f}{\partial y}(x, y, z) = 0\\[2mm] \dfrac{\partial f}{\partial z}(x, y, z) = 0\end{cases} \;\Leftrightarrow\; \begin{cases}4x + 2y + 2z = 0\\ 2x + 2y = 0\\ 2x + 6z + 4 = 0\end{cases}$$
The solution of the previous system is (x, y, z) = (1, −1, −1). In order to establish the nature of the stationary point (1, −1, −1) we apply the second order conditions. We first evaluate the Hessian matrix at (1, −1, −1):
$$H(x, y, z) = \begin{pmatrix} 4 & 2 & 2\\ 2 & 2 & 0\\ 2 & 0 & 6\end{pmatrix} \quad\text{and hence}\quad H(1, -1, -1) = \begin{pmatrix} 4 & 2 & 2\\ 2 & 2 & 0\\ 2 & 0 & 6\end{pmatrix}.$$
The leading principal minors are
$$\Delta_1 = 4, \quad \Delta_2 = \begin{vmatrix} 4 & 2\\ 2 & 2\end{vmatrix} = 4 \quad\text{and}\quad \Delta_3 = \begin{vmatrix} 4 & 2 & 2\\ 2 & 2 & 0\\ 2 & 0 & 6\end{vmatrix} = 48 + 0 + 0 - 8 - 0 - 24 = 16.$$
By applying part a) of Theorem 3 we get that (1, −1, −1) is a local minimum point of f.
Necessary conditions

We will now prove the second order necessary conditions for optimization.

Theorem 5. Let $f : D \to \mathbb{R}$ be a $C^2$ function, $D \subseteq \mathbb{R}^n$. Suppose that a is a local minimum point (respectively a local maximum point) of f. Then $df_{(a)} \equiv 0$ and $d^2 f_{(a)}(h) \ge 0$ for each h in $\mathbb{R}^n$ (respectively $df_{(a)} \equiv 0$ and $d^2 f_{(a)}(h) \le 0$ for each h in $\mathbb{R}^n$).

Proof. From Theorem 1 we know that a is a stationary point of f and hence $df_{(a)} \equiv 0$. By an argument similar to the one in the proof of Theorem 2, if a is a stationary point and $d^2 f_{(a)}(v) > 0$ for some vector v, then a cannot be a local maximum point of f. So if a stationary point is a local maximum point of f, there is no vector v such that $d^2 f_{(a)}(v) > 0$; in consequence $d^2 f_{(a)}(h) \le 0$ for all $h \in \mathbb{R}^n$. In the same way, if a is a local minimum point of f, then $df_{(a)} \equiv 0$ and $d^2 f_{(a)}(h) \ge 0$ for each $h \in \mathbb{R}^n$.
Global maxima and minima

Definition 4. Let $D \subseteq \mathbb{R}^n$. D is said to be convex if, for all x and y in D and every t in the interval [0, 1], the point (1 − t)x + ty is in D. In other words, every point on the line segment connecting x and y is in D.

Theorem 6. a) Let f be a $C^2$ function on an open convex subset D of $\mathbb{R}^n$ for which $d^2 f_{(x)}(h) \ge 0$ for all $x \in D$ and $h \in \mathbb{R}^n$. If a is a stationary point of f, then a is a global minimum point of f.
b) Let f be a $C^2$ function on an open convex subset D of $\mathbb{R}^n$ for which $d^2 f_{(x)}(h) \le 0$ for all $x \in D$ and $h \in \mathbb{R}^n$. If a is a stationary point of f, that is $df_{(a)} \equiv 0$, then a is a global maximum point of f.

The proof of the previous theorem involves ideas that are beyond the scope of this text and will be omitted.
Example 7. In Example 3 we found that the point (3, 4) is a local maximum point of f. By applying the previous theorem we obtain that (3, 4) is in fact a global maximum point. Indeed,
$$D = \{(a, b) \in \mathbb{R}^2 \mid a > 0,\; b > 0\}$$
is a convex set (see Definition 4). On the other hand, the Hessian matrix is
$$H(a, b) = \begin{pmatrix} -20 & 0\\ 0 & -40\end{pmatrix}$$
at each point of D, hence
$$d^2 f_{(a,b)}(h_1, h_2) = -20h_1^2 - 40h_2^2 \le 0$$
for each $(h_1, h_2) \in \mathbb{R}^2$. Since all the hypotheses of Theorem 6, part b) are fulfilled, it follows that the stationary point (3, 4) is a global maximum point.
Example 8. Prove that the stationary point in Example 6 is a global (absolute) minimum point of f.

Solution. $D = \mathbb{R}^3$ is a convex set. The Hessian matrix at an arbitrary point (x, y, z) in $\mathbb{R}^3$ is
$$H(x, y, z) = \begin{pmatrix} 4 & 2 & 2\\ 2 & 2 & 0\\ 2 & 0 & 6\end{pmatrix},$$
hence
$$d^2 f_{(x,y,z)}(h_1, h_2, h_3) = 4h_1^2 + 4h_1 h_2 + 4h_1 h_3 + 2h_2^2 + 6h_3^2$$
$$= (2h_1^2 + 4h_1 h_2 + 2h_2^2) + (2h_1^2 + 4h_1 h_3 + 2h_3^2) + 4h_3^2 = 2(h_1 + h_2)^2 + 2(h_1 + h_3)^2 + 4h_3^2 \ge 0,$$
for all $(x, y, z) \in \mathbb{R}^3$ and $(h_1, h_2, h_3) \in \mathbb{R}^3$. Since all the hypotheses of Theorem 6, part a) are fulfilled, it follows that the stationary point (1, −1, −1) is a global minimum point.
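The sum-of-squares decomposition used in Example 8 can itself be verified symbolically or, as below, numerically (Python; illustrative, not part of the original text):

```python
H = [[4, 2, 2], [2, 2, 0], [2, 0, 6]]  # Hessian from Examples 6 and 8

def quad(h):
    # d^2 f (h) = h . H . h^t
    return sum(H[i][j] * h[i] * h[j] for i in range(3) for j in range(3))

def sum_of_squares(h):
    # the decomposition 2(h1+h2)^2 + 2(h1+h3)^2 + 4 h3^2 from Example 8
    h1, h2, h3 = h
    return 2*(h1 + h2)**2 + 2*(h1 + h3)**2 + 4*h3**2

for h in [(1, 0, 0), (1, -1, 2), (-3, 0.5, 1), (0, 0, 0)]:
    assert abs(quad(h) - sum_of_squares(h)) < 1e-12
    assert quad(h) >= 0
print("d^2 f is a sum of squares, hence nonnegative everywhere")
```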
Remark 1. The global minimum and maximum values of a continuous function on a closed and bounded (hence compact) set D can be obtained in the following way:
• find the values of f at the stationary points of f in the interior of D;
• find the extreme values of f on the boundary of D;
• the largest of the values of f from the two previous steps is the global maximum value, and the smallest of these values is the global minimum value.
Example 9. Prove that the stationary point (10, 10) in Example 5 is an absolute minimum point of S.

Solution. Let $D_1$ be the closed rectangle
$$D_1 = \{(x, y) \mid 1 \le x \le 400,\; 1 \le y \le 400\}.$$
Now, if $0 < x \le 1$, then $\frac{1000}{x} \ge 1000$ and so
$$S = xy + \frac{1000}{y} + \frac{1000}{x} \ge 1000 > 300.$$
Similarly, if $0 < y \le 1$, then S > 300. Moreover, if $x \ge 400$ and $y \ge 1$, then $xy \ge 400$, and so S > 300. Similarly, if $y \ge 400$ and $x \ge 1$, then S > 300. Hence S > 300 for all (x, y) outside of $D_1$ and for all (x, y) on the boundary of $D_1$. Since S is continuous on the compact set $D_1$, it attains its global minimum on $D_1$; by Remark 1 this minimum is attained either on the boundary, where S > 300, or at a stationary point in the interior, that is at (10, 10), where S = 300. Hence the minimum of S on $D_1$ is 300, and since S > 300 outside $D_1$, S(10, 10) = 300 is the global minimum of S on D.
4.6 Constrained extrema
In this section we discuss a powerful method for determining the relative extrema
of a function whose independent variables satisfy one or more constraints. This method
is called the Lagrange multipliers method. Consider the problem:
_
_
_
optimize f(x) = f(x
1
, x
2
, . . . , x
n
)
subject to
g
j
(x
1
, x
2
, . . . , x
n
) = c
j
, j = 1, . . . , m < n
(1)
f is called the objective function, g
1
, . . . , g
m
are the constraint functions, c
1
, . . . , c
m
are the constraint constants.
If it is possible to express m of the independent variables as functions of the
other n − m independent variables, we can eliminate m variables from the
objective function (as in example 5, section 4.5); thus the initial problem is
reduced to an unconstrained optimization problem in n − m variables. However,
in many cases it is not technically possible to express one variable as a function
of the others. In this case, instead of the substitution and elimination method, we
use the method of Lagrange multipliers.
In comparison with using the constraints to express m independent variables in
terms of the others, the Lagrangean technique involves more variables and more
equations. The advantage of the Lagrangean method is its universality.
Constrained optimization has a prominent place in economic theory due to the
importance of maximizing utility subject to a budget constraint.
In economic theory it is also important that the Lagrange multipliers express
how the extreme value of the problem changes as the constraint is modified.
We begin with the simplest constrained maximization problem, that is maximizing
a function f(x, y) of two variables subject to a single equality constraint g(x, y) = c.
[Figure: the constraint curve g(x, y) = c together with several level curves
f(x, y) = k in the xy-plane.]
The previous ﬁgure shows this curve together with several level curves of f. To
maximize f(x, y) subject to g(x, y) = c is to ﬁnd the largest value of k such that
the level curve f(x, y) = k intersects g(x, y) = c. It appears from the ﬁgure above
that this happens when these curves just touch each other, that is, when they have
a common tangent line (otherwise, the value of k could be increased further). In this
case the slope of the constraint curve g(x, y) = c is equal to the slope of a level curve
f(x, y) = k. According to formula (9) of section 4.2, the slope of the constraint
curve is −g'_x/g'_y and the slope of the level curve is −f'_x/f'_y.
Hence the condition that the slopes be equal can be expressed by the equation
    −f'_x/f'_y = −g'_x/g'_y   or, equivalently,   f'_x/g'_x = f'_y/g'_y.
If we let λ denote this common ratio, we have
    λ = f'_x/g'_x   and   λ = f'_y/g'_y,
from which we get the following equations (the Lagrange equations):
    f'_x = λ g'_x   and   f'_y = λ g'_y.
The third equation g(x, y) = c is simply a statement of the fact that the point in
question actually lies on the constraint set.
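The tangency condition above can be illustrated numerically. The following sketch uses a toy problem of our own (not one from the text): maximize f(x, y) = xy subject to x + y = 10. It scans the constraint line for the maximum and then checks that the two ratios f'_x/g'_x and f'_y/g'_y agree there; their common value is the multiplier λ.

```python
# Toy problem (our own example): maximize f(x, y) = x*y subject to
# g(x, y) = x + y = 10.  The maximum is at x = y = 5.
def f_x(x, y): return y          # f'_x
def f_y(x, y): return x          # f'_y
g_x = g_y = 1.0                  # g'_x and g'_y for g(x, y) = x + y

# Scan the constraint line x + y = 10 for the maximum of f.
xs = [i / 100 for i in range(1001)]              # x in [0, 10]
x = max(xs, key=lambda x: x * (10 - x))
y = 10 - x

# At the optimum the ratios agree; their common value is the multiplier.
lam1 = f_x(x, y) / g_x
lam2 = f_y(x, y) / g_y
```

Here the maximum lands at x = y = 5 and both ratios equal 5, the value of λ for this toy problem.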
There is a formal way of obtaining the previous equations:
Form the Lagrange function
L(x, y, λ) = f(x, y) −λ[g(x, y) −c].
Find the critical points of the Lagrangean. The result of this process is the
following system:
    ∂f/∂x (x, y) = λ ∂g/∂x (x, y)
    ∂f/∂y (x, y) = λ ∂g/∂y (x, y)
    g(x, y) = c
Among the solutions of this system we can ﬁnd the extreme points of f.
A minimization problem can be analyzed by using the same arguments.
The statement of the necessary conditions for optimizing a function of n variables
subject to m equality constraints is the following.
Theorem 1. (Necessary conditions)
Let f, g_1, . . . , g_m : D → R be C^1 functions of n variables (m < n). Consider the
problem of maximizing (or minimizing) f on the constraint set
    C_g = {x | g_1(x) = c_1, . . . , g_m(x) = c_m}.
Suppose that a is a local maximum or minimum point of f on C_g (so a ∈ C_g).
Suppose further that the rank of the Jacobian matrix
    Dg(a) = [ ∂g_1/∂x_1 (a)  . . .  ∂g_1/∂x_n (a) ]
            [      . . .      . . .      . . .    ]
            [ ∂g_m/∂x_1 (a)  . . .  ∂g_m/∂x_n (a) ]
is m.                                                                        (2)
Then there exist λ_1, λ_2, . . . , λ_m such that
    (a_1, . . . , a_n, λ_1, . . . , λ_m) = (a, λ)
is a stationary point of the Lagrangean function
    L(x, λ) = f(x) − λ_1[g_1(x) − c_1] − λ_2[g_2(x) − c_2] − · · · − λ_m[g_m(x) − c_m]   (3)
In other words,
    ∂L/∂x_1 (a, λ) = 0, . . . , ∂L/∂x_n (a, λ) = 0,
    ∂L/∂λ_1 (a, λ) = −[g_1(a) − c_1] = 0, . . . , ∂L/∂λ_m (a, λ) = −[g_m(a) − c_m] = 0.
The proof of this theorem involves ideas that are beyond the scope of this text
and will be omitted.
Example 1. A consumer has 1200 m.u. (monetary units) to spend on two
commodities, the first of which costs 40 m.u. per unit and the second 60 m.u. per
unit. Suppose that the utility derived by the consumer from x units of the first
commodity and y units of the second commodity is given by the Cobb-Douglas
utility function
    U(x, y) = 20 x^{0,6} y^{0,4}.
How many units of each commodity should the consumer buy to maximize utility?
Solution. The total cost of buying x units of the first commodity and y units of
the second is 40x + 60y. Since the consumer has only 1200 m.u. to spend, the goal
is to maximize the utility U(x, y) subject to the budgetary constraint
40x + 60y = 1200.
The Lagrangean function is
    L(x, y, λ) = 20 x^{0,6} y^{0,4} − λ(40x + 60y − 1200).
The three Lagrange equations are:
    ∂L/∂x = 12 (y/x)^{0,4} − 40λ = 0
    ∂L/∂y = 8 (x/y)^{0,6} − 60λ = 0
    ∂L/∂λ = −(40x + 60y − 1200) = 0
from which we easily get that
    (y/x)^{0,4} = 10λ/3,   (x/y)^{0,6} = 15λ/2   and   2x + 3y = 60.
From the first two equalities we get
    (y/x)^{0,4} · (y/x)^{0,6} = (10λ/3) · (2/(15λ))  ⇒  y/x = 4/9  ⇒  y = (4/9)x.
Substituting this into the third equation we get
    2x + 3 · (4/9)x = 60,
from which it follows that x = 18 and y = 8.
So the only candidate for a solution to our problem is x = 18, y = 8 and
    λ = (3/10)(y/x)^{0,4} = (3/10)(8/18)^{0,4} = (3/10)(2/3)^{0,8}.
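The candidate above is easy to verify mechanically. The sketch below (our own check, not part of the text) substitutes y = (4/9)x into the budget and confirms that the three Lagrange equations hold at (18, 8).

```python
# Solve the Lagrange system of Example 1: from the first two equations
# y/x = 4/9, and the budget 40x + 60y = 1200, i.e. 2x + 3y = 60.
x = 60 / (2 + 3 * (4 / 9))       # substitute y = (4/9)x into 2x + 3y = 60
y = (4 / 9) * x
lam = 0.3 * (y / x) ** 0.4       # the multiplier from the first equation

# Check the three Lagrange equations directly (all should be ~0).
eq1 = 12 * (y / x) ** 0.4 - 40 * lam
eq2 = 8 * (x / y) ** 0.6 - 60 * lam
eq3 = 40 * x + 60 * y - 1200
```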
Next we will describe a second order condition that distinguishes maximum
points from minimum points.
Intuitively, the second order condition for a constrained maximization (or
minimization) problem should involve the negative definiteness of some Hessian
matrix, but should only be concerned with directions along the constraint set.
Theorem 2. (Suﬃcient conditions).
Let f, g_1, . . . , g_m : D → R, D ⊆ R^n, be C^2 functions of n variables (m < n).
Consider the problem of maximizing or minimizing f on the constraint set
    C_g = {x = (x_1, . . . , x_n) | g_1(x) = c_1, . . . , g_m(x) = c_m}.
Form the Lagrangean
    L(x, λ) = f(x) − λ_1[g_1(x) − c_1] − · · · − λ_m[g_m(x) − c_m]
and suppose that:
a) a ∈ C_g;
b) there exists λ = (λ_1, . . . , λ_m) ∈ R^m such that
    ∂L/∂x_1 (a, λ) = · · · = ∂L/∂x_n (a, λ) = 0;
c) the Hessian of L with respect to x at (a, λ) is negative definite (positive
definite) on the set {v | Dg(a)v = 0}, that is, for each v ≠ θ with Dg(a)v = 0 we
have d²L_{(a,λ)}(v) < 0 (respectively d²L_{(a,λ)}(v) > 0).
Then a is a strict local constrained maximum (minimum) point of f on C_g.
Proof. We want to show that a is a local maximum point of f on the constraint
set C_g.
Assume the opposite; this means that there exists a sequence (x_j)_{j≥1} ⊆ R^n
such that x_j → a as j → ∞, with x_j ≠ a, x_j ∈ C_g and f(x_j) ≥ f(a) for all j.
Construct a new sequence by using these x_j's:
    v_j = (x_j − a)/‖x_j − a‖.
It is obvious that ‖v_j‖ = 1.
(v_j)_{j≥1} is a sequence contained in the unit sphere of R^n.
Since the unit sphere in R^n is a compact set, the sequence (v_j)_{j≥1} has a
convergent subsequence, which will be denoted by (v_k)_{k≥1}, and its limit by v.
Since each g_i is C^1, i = 1, m, we write down its Taylor polynomial of order one
about a, evaluated at each x_k:
    g_i(x_k) − g_i(a) = dg_i(a)(x_k − a) + R_1^i(x_k),
where R_1^i(x_k)/‖x_k − a‖ → 0 as x_k → a. Since g_i(x_k) = g_i(a) = c_i, dividing
by ‖x_k − a‖ gives
    0 = (c_i − c_i)/‖x_k − a‖ = dg_i(a)((x_k − a)/‖x_k − a‖) + R_1^i(x_k)/‖x_k − a‖.
If we let k → ∞ in the previous equality we get
    dg_i(a)(v) = 0,   i = 1, m.
Now write down the second order Taylor polynomial of the Lagrangean, as a
function of x, about a:
    L(x_k) = L(a) + dL_a(x_k − a) + (1/2) d²L_a(x_k − a) + R_2(x_k),         (4)
where R_2(x_k)/‖x_k − a‖² → 0 as x_k → a.
By hypothesis, dL_a ≡ 0; also, since x_k ∈ C_g,
    L(x_k) = f(x_k) − Σ_i λ_i (g_i(x_k) − c_i) = f(x_k).
In the same way L(a) = f(a).
Using these results, rewrite (4) as
    0 ≤ (f(x_k) − f(a))/‖x_k − a‖² = (1/2) d²L_a(v_k) + R_2(x_k)/‖x_k − a‖².
Let x_k → a in the previous relation. Hence d²L_a(v) ≥ 0, which contradicts
hypothesis c) (note that ‖v‖ = 1, so v ≠ θ, and dg_i(a)(v) = 0 for all i). This
completes the proof.
By combining the previous two theorems we obtain the following five-step
algorithm for determining the constrained extreme points of a given function.
Lagrange’s method of multipliers
Let f, g_1, . . . , g_m : D → R, D ⊆ R^n (m < n), be C^2 functions and let
c_1, . . . , c_m ∈ R. Consider the problem of maximizing (or minimizing) f on the
constraint set
    C_g = {x | g_1(x) = c_1, . . . , g_m(x) = c_m}.
Suppose that the rank of the matrix
    Dg(x) = [ ∂g_1/∂x_1 (x)  . . .  ∂g_1/∂x_n (x) ]
            [      . . .      . . .      . . .    ]
            [ ∂g_m/∂x_1 (x)  . . .  ∂g_m/∂x_n (x) ]
is m for each x ∈ C_g.
Step 1. Assign to each constraint g_j(x) = c_j one Lagrange multiplier λ_j ∈ R,
j = 1, . . . , m.
Write down the Lagrange function (Lagrangean)
    L(x, λ) = f(x) − λ_1[g_1(x) − c_1] − · · · − λ_m[g_m(x) − c_m]
            = f(x) − Σ_{j=1}^{m} λ_j [g_j(x) − c_j]                          (5)
Step 2. Find the stationary points (a, λ) of the function L with respect to the
variables x and λ. These are the solutions of the following system:
    ∂L/∂x_i (x, λ) = 0,   i = 1, n
    ∂L/∂λ_j (x, λ) = −[g_j(x) − c_j] = 0,   j = 1, m
Step 3. For each stationary point (a, λ) consider the function L of n variables
obtained by fixing λ:
    L(x) = f(x) − λ_1[g_1(x) − c_1] − · · · − λ_m[g_m(x) − c_m]
         = f(x) − Σ_{j=1}^{m} λ_j [g_j(x) − c_j].
Step 4. Consider and solve the following system, whose rank is m (see (2)):
    dg_1(a)(v) = 0
    . . .
    dg_m(a)(v) = 0
Step 5. Evaluate d²L_a at each solution v of the previous system.
a) If d²L_a(v) > 0 for each v ≠ θ obtained at step 4, then a is a constrained
minimum point.
b) If d²L_a(v) < 0 for each v ≠ θ obtained at step 4, then a is a constrained
maximum point.
c) If d²L_a(v) takes both positive and negative values on the solution set
obtained at step 4, then a is not a constrained extreme point.
Example 2. Determine the nature of the stationary point in Example 1.
Solution. The rank of the matrix
    (g'_x(x, y), g'_y(x, y)) = (40, 60)
is always 1. The Lagrangean function in this example is
    L(x, y, λ) = 20 x^{0,6} y^{0,4} − λ(40x + 60y − 1200)
and the critical point is (18, 8, (3/10)(2/3)^{0,8}). It remains for us to check
steps 3, 4 and 5 of the previous algorithm.
Step 3. L(x, y) = 20 x^{0,6} y^{0,4} − (3/10)(2/3)^{0,8} (40x + 60y − 1200).
Step 4. We have to solve the following equation (because we have just one
constraint):
    ∂g/∂x (18, 8) v_1 + ∂g/∂y (18, 8) v_2 = 0.
Since ∂g/∂x (18, 8) = 40 and ∂g/∂y (18, 8) = 60, the equation to be solved is
    40 v_1 + 60 v_2 = 0,
from which we have v_2 = −(2/3) v_1.
Step 5.
    d²L_{(18,8)}(v_1, v_2) = d²L_{(18,8)}(v_1, −(2/3)v_1)
    = ∂²L/∂x² (18, 8) v_1² + 2 ∂²L/∂x∂y (18, 8) v_1 (−(2/3)v_1)
      + ∂²L/∂y² (18, 8) (−(2/3)v_1)²
    = [ ∂²L/∂x² (18, 8) − (4/3) ∂²L/∂x∂y (18, 8) + (4/9) ∂²L/∂y² (18, 8) ] v_1².
We have
    ∂²L/∂x² (x, y) = [12 (y/x)^{0,4} − 40λ]'_x = −4,8 y^{0,4} x^{−1,4}
                   = −4,8 (1/x)(y/x)^{0,4},
so
    ∂²L/∂x² (18, 8) = −(4,8/18)(2/3)^{0,8};
    ∂²L/∂x∂y (x, y) = [12 (y/x)^{0,4} − 40λ]'_y = 4,8 x^{−0,4} y^{−0,6}
                    = 4,8 (1/y)(y/x)^{0,4},
so
    ∂²L/∂x∂y (18, 8) = 4,8 (1/8)(8/18)^{0,4} = (4,8/8)(2/3)^{0,8};
    ∂²L/∂y² (x, y) = [8 (x/y)^{0,6} − 60λ]'_y = −4,8 y^{−1,6} x^{0,6}
                   = −4,8 (1/y)(x/y)^{0,6},
so
    ∂²L/∂y² (18, 8) = −(4,8/8)(18/8)^{0,6} = −(4,8/8)(3/2)^{1,2}.
We finally obtain that
    d²L_{(18,8)}(v_1, −(2/3)v_1)
    = [ −(4,8/18)(2/3)^{0,8} − (4,8/6)(2/3)^{0,8} − (4,8/18)(3/2)^{1,2} ] v_1² < 0
for all v_1 ≠ 0, and in consequence (18, 8) is a local constrained maximum point
of f.
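The sign of d²L along the constraint directions can also be checked numerically. The sketch below (our own cross-check, not part of the text) approximates the second directional derivative of L at (18, 8) along v = (3, −2), a direction satisfying 40v_1 + 60v_2 = 0, by a central finite difference; a negative value confirms the maximum.

```python
# Numeric second-order check for Example 2 (a finite-difference sketch).
lam = 0.3 * (4 / 9) ** 0.4                   # multiplier from Example 1
def L(x, y):
    return 20 * x ** 0.6 * y ** 0.4 - lam * (40 * x + 60 * y - 1200)

v = (3.0, -2.0)                              # tangent: 40*3 + 60*(-2) = 0
h = 1e-4
# Central difference for the second derivative of t -> L((18,8) + t*v).
d2 = (L(18 + h * v[0], 8 + h * v[1]) - 2 * L(18, 8)
      + L(18 - h * v[0], 8 - h * v[1])) / h ** 2
```

The value is negative (about −10,8 for this particular direction), in agreement with the second order condition.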
Recall from section 4.2 (example 8) that the level curves of a utility function are
indifference curves. The optimal indifference curve U(x, y) = C, where
C = U(18, 8), and the budgetary constraint 40x + 60y = 1200 are sketched in the
figure below.
[Figure: the budget line 40x + 60y = 1200, with intercepts 30 on the x-axis and
20 on the y-axis, tangent to the indifference curve at (18, 8).]
Example 3. Rework Example 5 from section 4.5 by using the Lagrange
multipliers method.
Solution. Let x, y and z be the length, width and height, respectively, of the
bin, in meters. We wish to minimize
    S : (0, ∞) × (0, ∞) × (0, ∞) → R,   S(x, y, z) = xy + 2yz + 2zx
subject to the volume constraint, namely
    V : (0, ∞) × (0, ∞) × (0, ∞) → R,   V (x, y, z) = xyz = 500.
Using the method of Lagrange multipliers, we have to follow the five steps of the
algorithm.
We first check that the rank of the matrix
    (V'_x(x, y, z), V'_y(x, y, z), V'_z(x, y, z)) = (yz, xz, xy)
is one at each point (x, y, z) which satisfies xyz = 500. The rank drops below 1
only when yz = xz = xy = 0, that is, when at least two of x, y, z are zero, and no
such point satisfies the constraint. In conclusion the rank of the matrix is one at
all points of the constraint set.
Steps 1 and 2.
    L(x, y, z, λ) = S(x, y, z) − λ[V (x, y, z) − 500]
                  = xy + 2yz + 2zx − λ(xyz − 500).
We have to solve the following system:
    ∂L/∂x = y + 2z − λyz = 0
    ∂L/∂y = x + 2z − λxz = 0
    ∂L/∂z = 2y + 2x − λxy = 0
    ∂L/∂λ = 500 − xyz = 0
There are no general rules for solving nonlinear systems of equations; sometimes
some ingenuity is required. Usually we eliminate λ from the equations and try to
solve the remaining system:
    λ = (y + 2z)/(yz) = (x + 2z)/(xz) = (2y + 2x)/(xy).
From the previous equality we obtain that
    1/z + 2/y = 1/z + 2/x = 2/x + 2/y.
The first equality shows that x = y, and the last equality assures us that y = 2z,
so x = y = 2z.
If we substitute these values into the constraint equality we get 4z³ = 500, from
which we obtain z = 5, x = y = 10 and λ = (10 + 10)/50 = 2/5. In consequence
(10, 10, 5, 2/5) is the unique critical point of L.
Step 3. L(x, y, z) = xy + 2yz + 2zx − (2/5)(xyz − 500).
Step 4. In order to solve the equation
    ∂V/∂x (10, 10, 5) v_1 + ∂V/∂y (10, 10, 5) v_2 + ∂V/∂z (10, 10, 5) v_3 = 0
we compute first
    ∂V/∂x = yz,   ∂V/∂y = xz   and   ∂V/∂z = xy,
hence
    ∂V/∂x (10, 10, 5) = 50,   ∂V/∂y (10, 10, 5) = 50   and   ∂V/∂z (10, 10, 5) = 100.
The equation to be solved is
    50 v_1 + 50 v_2 + 100 v_3 = 0,
from which we have v_3 = −(1/2)(v_1 + v_2).
Step 5. The pure second order partials of L vanish identically, so
    d²L_{(10,10,5)}(v_1, v_2, v_3) = d²L_{(10,10,5)}(v_1, v_2, −(1/2)(v_1 + v_2))
    = 2 ∂²L/∂x∂y (10, 10, 5) v_1 v_2 + 2 ∂²L/∂x∂z (10, 10, 5) v_1 (−(1/2)(v_1 + v_2))
      + 2 ∂²L/∂y∂z (10, 10, 5) v_2 (−(1/2)(v_1 + v_2))
    = 2 (1 − (2/5)·5) v_1 v_2 − (2 − (2/5)·10) v_1 (v_1 + v_2)
      − (2 − (2/5)·10) v_2 (v_1 + v_2)
    = −2 v_1 v_2 + 2 v_1 (v_1 + v_2) + 2 v_2 (v_1 + v_2)
    = 2 v_1² + 2 v_1 v_2 + 2 v_2² = v_1² + v_2² + (v_1 + v_2)² > 0,
which implies that (10, 10, 5) is a local constrained minimum point of S.
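As a sanity check (our own sketch, not part of the text): at (10, 10, 5) the surface is S = 300, the volume constraint holds, and nearby points on the constraint surface give a strictly larger S.

```python
# Numeric check for Example 3: (10, 10, 5) is a constrained minimum of S.
def S(x, y, z):
    return x * y + 2 * y * z + 2 * z * x

S0 = S(10, 10, 5)                    # surface area at the critical point
is_strict_min = True
for dx in (-0.1, 0.0, 0.1):
    for dy in (-0.1, 0.0, 0.1):
        if dx == 0.0 and dy == 0.0:
            continue
        x, y = 10 + dx, 10 + dy
        z = 500 / (x * y)            # stay on the constraint xyz = 500
        if S(x, y, z) <= S0:
            is_strict_min = False
```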
Example 4. Consider the problem of optimizing the function
    f : R³ → R,   f(x, y, z) = x + 2y + 3z
on the constraint set defined by
    g(x, y, z) = x² + y² = 1   and   h(x, y, z) = x + z = 1.
Solution. First, compute the Jacobian matrix of the constraint functions:
    D(g, h)(x, y, z) = [ 2x  2y  0 ]
                       [ 1   0   1 ].
Its rank is less than 2 if and only if x = y = 0. Since no point of the form (0, 0, z)
satisfies the first constraint, the rank of the Jacobian matrix is two at all points
of the constraint set.
Next, form the Lagrangean
    L(x, y, z, λ, µ) = x + 2y + 3z − λ(x² + y² − 1) − µ(x + z − 1)
and set its first partial derivatives equal to 0:
    ∂L/∂x = 1 − 2λx − µ = 0
    ∂L/∂y = 2 − 2λy = 0
    ∂L/∂z = 3 − µ = 0
    ∂L/∂λ = −(x² + y² − 1) = 0
    ∂L/∂µ = −(x + z − 1) = 0
Solve the second and third equations for λ and µ and plug these into the first
equation to obtain µ = 3, λ = 1/y and 1 − 2x/y − 3 = 0, which implies x = −y.
Then solve the fourth equation for x and the last equation for z:
    2x² = 1  ⇒  x = 1/√2,  y = −1/√2,  z = 1 − 1/√2,  λ = −√2,  µ = 3
         or  x = −1/√2,  y = 1/√2,  z = 1 + 1/√2,  λ = √2,  µ = 3.
For each of the two critical points it remains for us to follow steps 3 to 5. We
will analyse just the first case; the second is similar and is left to the reader.
Step 3. L(x, y, z) = x + 2y + 3z + √2 (x² + y² − 1) − 3(x + z − 1).
Step 4. We have to solve the following system (where we denote
a = (1/√2, −1/√2, 1 − 1/√2)):
    ∂g/∂x (a) v_1 + ∂g/∂y (a) v_2 + ∂g/∂z (a) v_3 = 0
    ∂h/∂x (a) v_1 + ∂h/∂y (a) v_2 + ∂h/∂z (a) v_3 = 0
Easy computations lead us to
    √2 v_1 − √2 v_2 = 0
    v_1 + v_3 = 0,
hence v_2 = v_1 and v_3 = −v_1.
Step 5. The only nonzero second order partials of L are ∂²L/∂x² = ∂²L/∂y² = 2√2,
so
    d²L_a(v_1, v_2, v_3) = d²L_a(v_1, v_1, −v_1) = 2√2 v_1² + 2√2 v_1² = 4√2 v_1² > 0,
so a = (1/√2, −1/√2, 1 − 1/√2) is a local constrained minimum point of f.
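A parametric cross-check (our own sketch, not from the text): the constraint set can be parametrized as (x, y, z) = (cos t, sin t, 1 − cos t), which satisfies both constraints, and a grid search over t locates the minimum of f at (1/√2, −1/√2, 1 − 1/√2) with value 3 − 2√2.

```python
import math

# Cross-check of Example 4: on the constraint circle
# (x, y, z) = (cos t, sin t, 1 - cos t) we have x^2 + y^2 = 1 and
# x + z = 1, and f becomes a function of t alone.
def f(t):
    return math.cos(t) + 2 * math.sin(t) + 3 * (1 - math.cos(t))

ts = [2 * math.pi * k / 100000 for k in range(100000)]
t_min = min(ts, key=f)               # grid search for the minimum
x, y, z = math.cos(t_min), math.sin(t_min), 1 - math.cos(t_min)
```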
The significance of the Lagrange multipliers
It is possible to solve a constrained optimization problem by the method of
Lagrange multipliers without obtaining numerical values for the Lagrange
multipliers. However, the multipliers play an important role in economic analysis,
since they measure the sensitivity of the optimal value of the objective function
to changes in the right-hand sides of the constraints.
We analyse first the simplest problem: two variables and one equality constraint.
Let f, g : D → R, D ⊆ R². Consider
    optimize f(x, y)
    subject to g(x, y) = c                                                   (6)
Consider c as a parameter, c ∈ R, which may vary.
For any fixed value of c we denote by (a(c), b(c)) the solution of the previous
problem, by λ(c) the multiplier which corresponds to this solution, and by
f(a(c), b(c)) the corresponding optimal value of the objective function.
We will show that λ(c) measures the rate of change of the optimal value of f with
respect to the parameter c.
Theorem 3. Let f, g : D → R be C^1 functions of two variables. Let (a(c), b(c)) be
the solution of problem (6), where the corresponding multiplier is denoted by λ(c).
Suppose that a, b and λ are C^1 functions of c and that
    ∂g/∂x (a(c), b(c)) ≠ 0   or   ∂g/∂y (a(c), b(c)) ≠ 0.
Then
    λ(c) = d/dc f(a(c), b(c)).
The derivative in the previous equality is taken with respect to c since f(a(c), b(c))
can be seen as a function of one variable which is c.
Proof. From Theorem 1 we know that at an extreme point (a(c), b(c)) we have
    ∂f/∂x (a(c), b(c)) = λ(c) ∂g/∂x (a(c), b(c)),
    ∂f/∂y (a(c), b(c)) = λ(c) ∂g/∂y (a(c), b(c))   and
    g(a(c), b(c)) = c.
If we differentiate the last equality with respect to c we get
    ∂g/∂x (a(c), b(c)) a′(c) + ∂g/∂y (a(c), b(c)) b′(c) = 1.
By the chain rule of partial derivatives:
    d/dc f(a(c), b(c)) = ∂f/∂x (a(c), b(c)) a′(c) + ∂f/∂y (a(c), b(c)) b′(c)
    = λ(c) ∂g/∂x (a(c), b(c)) a′(c) + λ(c) ∂g/∂y (a(c), b(c)) b′(c)
    = λ(c) [ ∂g/∂x (a(c), b(c)) a′(c) + ∂g/∂y (a(c), b(c)) b′(c) ]
    = λ(c) · 1 = λ(c).
This completes the proof.
Remark 1. Under the assumptions of Theorem 3 we have
    λ ≈ the change in the optimal value of f due to a 1-unit change in c.     (7)
Proof. We know, from Theorem 3, that
    λ(c) = d/dc f(a(c), b(c)) = lim_{h→0} [f(a(c + h), b(c + h)) − f(a(c), b(c))]/h
         ≈ ∆f/∆c.
In consequence
    ∆f ≈ λ ∆c.                                                               (8)
If we take ∆c = 1 in (8), we get ∆f ≈ λ, as desired.
Example 5. Suppose the consumer in Example 1 has 1201 m.u. instead of 1200
m.u. to spend on the two commodities. Estimate how the additional 1 m.u. will
affect the maximum utility.
Solution. From Example 1 we know that
    λ = (3/10)(y/x)^{0,4}.
Since the maximum value M of utility when 1200 m.u. was available occurred at
x = 18 and y = 8, substituting these values into the formula for λ we get
    λ = 0,3 (8/18)^{0,4} ≈ 0,22,
which is (see Remark 1) approximately the increase ∆M in maximum utility
resulting from the 1 m.u. increase in available funds.
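The estimate can be confirmed by recomputing the maximum utility for both budgets. The sketch below (our own check, not part of the text) uses the standard Cobb-Douglas closed form for the maximizer, x = 0,6B/40 and y = 0,4B/60 for a budget B, which reproduces x = 18, y = 8 at B = 1200.

```python
# Compare the actual utility gain from one extra m.u. with λ ≈ 0.22.
def max_utility(B):
    # Cobb-Douglas spending shares: 60% of B on good 1, 40% on good 2.
    x, y = 0.6 * B / 40, 0.4 * B / 60
    return 20 * x ** 0.6 * y ** 0.4

gain = max_utility(1201) - max_utility(1200)
lam = 0.3 * (8 / 18) ** 0.4
```

Here the optimal utility is linear in B, so the gain matches λ ≈ 0,22 essentially exactly.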
The statement of the natural generalization of Theorem 3 to several variables
and several equality constraints is the following.
Theorem 4. Let f, g_1, . . . , g_m : D → R, D ⊆ R^n, be C^1 functions, m < n. Let
(a_1(c), a_2(c), . . . , a_n(c)), c = (c_1, . . . , c_m), be the solution of problem (1).
Suppose that a_1, . . . , a_n, λ_1, . . . , λ_m are differentiable functions of the
parameters (c_1, . . . , c_m) and that condition (2) holds.
Then, for each j = 1, m we have
    λ_j(c) = ∂f/∂c_j (a_1(c_1, . . . , c_m), . . . , a_n(c_1, . . . , c_m)).
In the previous equality λ_j describes (approximately) the influence of c_j (the
j-th component of the constraint constants) on the change of the optimal value
of the problem.
4.7 Applications to economics
4.7.1 The method of least squares
Scientists studying the data from observations or experiments are often
interested in determining a function that fits the data reasonably well. Suppose
that we are studying a relationship between two variables, so that each
observation can be represented by a point (x, y) in the plane.
The method of least squares determines a function which approximates the set
of given points, in the sense that its graph is closest to those points.
Suppose we have n points P_1(x_1, y_1), . . . , P_n(x_n, y_n) which describe a
relationship between the two variables x and y. Usually these data are presented
in a table of the following form:
    x | x_1  x_2  . . .  x_n
    y | y_1  y_2  . . .  y_n
The first step is to determine what type of function to look for. This can be done
by theoretical analysis of the practical situation or by inspection of the graph of
the n points P_1, . . . , P_n. The second step is to determine the particular function
whose graph is closest to the given set of points.
[Figure: the points P_1(x_1, y_1), P_2(x_2, y_2), . . . , P_n(x_n, y_n) and the graph of a
fitted function f; the vertical deviation at x_2 runs between (x_2, y_2) and
(x_2, f(x_2)).]
It is obvious that the error at x_i is y_i − f(x_i), i = 1, n. The question is how to
combine these errors in order to define a total error which reflects how close the
graph is to the given points.
The choice Σ_{i=1}^{n} (y_i − f(x_i)) is not convenient, because this sum can be 0
while its terms have large values and opposite signs.
The sum Σ_{i=1}^{n} |y_i − f(x_i)| reflects better how close the graph of f is to the
points, but this choice is not convenient either, since the modulus function is not
everywhere differentiable and we cannot use the sufficient conditions for local
extrema. The sum of the squares of the vertical distances from the given points
to the graph of f,
    Σ_{i=1}^{n} (y_i − f(x_i))²,
is the convenient choice for the total error.
We have to solve the following problem: determine the function f such that the
sum Σ_{i=1}^{n} (y_i − f(x_i))² takes its minimum value.
From now on we restrict the discussion to the case when f is a polynomial.
Problem. Determine a polynomial f of degree at most m,
    f(x) = a_0 + a_1 x + · · · + a_m x^m   (a_0, a_1, . . . , a_m = ?),
such that the function
    F(a_0, a_1, . . . , a_m) = Σ_{i=1}^{n} [y_i − (a_0 + a_1 x_i + · · · + a_m x_i^m)]²
takes its minimum value.
The unknowns are the coefficients of the polynomial. We have m + 1 unknowns:
a_0, a_1, . . . , a_m.
We will apply Theorem 2, section 4.5.
First we determine the stationary points of F by solving the following system:
    F'_{a_0} = 2 Σ_{i=1}^{n} [y_i − (a_0 + a_1 x_i + · · · + a_m x_i^m)] (−1) = 0
    F'_{a_1} = 2 Σ_{i=1}^{n} [y_i − (a_0 + a_1 x_i + · · · + a_m x_i^m)] (−x_i) = 0
    . . .
    F'_{a_m} = 2 Σ_{i=1}^{n} [y_i − (a_0 + a_1 x_i + · · · + a_m x_i^m)] (−x_i^m) = 0
The previous system can be written in the following way (all sums are taken for
i = 1, . . . , n):
    Σ y_i        = a_0 Σ 1      + a_1 Σ x_i       + · · · + a_m Σ x_i^m
    Σ y_i x_i    = a_0 Σ x_i    + a_1 Σ x_i²      + · · · + a_m Σ x_i^{m+1}
    . . .
    Σ y_i x_i^m  = a_0 Σ x_i^m  + a_1 Σ x_i^{m+1} + · · · + a_m Σ x_i^{2m}
By denoting Σ_{i=1}^{n} x_i^k = s_k, k = 0, 2m, and Σ_{i=1}^{n} y_i x_i^l = t_l,
l = 0, m (here we make the convention x_i^0 = 1, i = 1, n), the system to be solved
becomes:
    s_0 a_0 + s_1 a_1 + · · · + s_m a_m = t_0
    s_1 a_0 + s_2 a_1 + · · · + s_{m+1} a_m = t_1
    . . . . . .
    s_m a_0 + s_{m+1} a_1 + · · · + s_{2m} a_m = t_m                          (1)
The previous system is called the normal system.
It can be shown that the previous system has a unique solution which is a minimum
point for F (we will not prove these statements).
In conclusion the solution of the normal system will give us the coeﬃcients for the
desired polynomial.
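The normal-system computation can be sketched in code (a minimal pure-Python illustration, not part of the text): build s_0, . . . , s_{2m} and t_0, . . . , t_m from the data, assemble system (1), and solve it by Gauss-Jordan elimination.

```python
# Polynomial least squares via the normal system (pure Python sketch).
def poly_least_squares(xs, ys, m):
    s = [sum(x ** k for x in xs) for k in range(2 * m + 1)]
    t = [sum(y * x ** l for x, y in zip(xs, ys)) for l in range(m + 1)]
    # Augmented matrix of the normal system: row i is s_i ... s_{i+m} | t_i.
    A = [[s[i + j] for j in range(m + 1)] + [t[i]] for i in range(m + 1)]
    n = m + 1
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]          # partial pivoting
        for r in range(n):
            if r != col:
                q = A[r][col] / A[col][col]
                A[r] = [a - q * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]
```

With m = 1 and the data of the next example this returns a_0 = 6,1 and a_1 = 3,05.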
The coefficients s_0, s_1, . . . , s_{2m}; t_0, t_1, . . . , t_m can be determined by
arranging the calculation in the following table:
    x_i^0   x_i^1   x_i^2   . . .  x_i^{2m} | y_i   y_i x_i   . . .  y_i x_i^m
    1       x_1     x_1²    . . .  x_1^{2m} | y_1   y_1 x_1   . . .  y_1 x_1^m
    . . .
    1       x_n     x_n²    . . .  x_n^{2m} | y_n   y_n x_n   . . .  y_n x_n^m
    s_0     s_1     s_2     . . .  s_{2m}   | t_0   t_1       . . .  t_m      (2)
In the case m = 1 the desired function is a polynomial of degree one,
f(x) = a_0 + a_1 x, whose graph is called the least squares line.
Example 1. On election day, the polls open at 8:00 A.M. Every 2 hours after
that, an election official determines what percentage of the registered voters have
already cast their ballots. The data through 6:00 P.M. are shown below.
    time       | 10:00  12:00  2:00  4:00  6:00
    percentage | 12     19     24    30    37
Find the equation of the least-squares line (let x denote the number of hours
after 8:00 A.M.). Use the least-squares line to predict what percentage of the
registered voters will have cast their ballots by the time the polls close at
8:00 P.M.
Solution. Let x denote the number of hours after 8:00 A.M. and y the
percentage. Arrange the calculations as follows:
    x | 2   4   6   8   10
    y | 12  19  24  30  37

    x_i^0     x_i       x_i²       | y_i        y_i x_i
    1         2         4          | 12         24
    1         4         16         | 19         76
    1         6         36         | 24         144
    1         8         64         | 30         240
    1         10        100        | 37         370
    s_0 = 5   s_1 = 30  s_2 = 220  | t_0 = 122  t_1 = 854
The normal system is
    5 a_0 + 30 a_1 = 122
    30 a_0 + 220 a_1 = 854.
By Cramer's rule, the solution of the previous system is given by
    a_0 = (122 · 220 − 30 · 854)/(5 · 220 − 30 · 30) = 1220/200 = 6,1
    a_1 = (5 · 854 − 30 · 122)/(5 · 220 − 30 · 30) = 610/200 = 3,05
So the least squares line is
    f(x) = 6,1 + 3,05 x.
To predict the percentage at 8:00 P.M., substitute 12 (the number of hours after
8:00 A.M.) into the equation of the least-squares line. This gives
    y = 6,1 + 3,05 · 12 = 42,7,
which suggests that the percentage at 8:00 P.M. might be about 42,7.
4.7.2 Inventory control. The economic order quantity model
Inventory is the set of items (goods or materials) that are held by an
organization for later use.
For every type of item held in inventory we want to establish how much should
be ordered each time and when the reordering should occur.
The objective is to minimize the variable inventory costs, which are ordering
costs and holding costs.
Ordering costs are the expenses of processing an order (these costs are
independent of the order quantity).
Holding costs are rent, heat, salaries, etc.
The economic order quantity model is the simplest and the oldest of the
inventory models. It uses unrealistic assumptions, but it gives a reasonable first
approximation of the given situation.
The assumptions in this model are the following:
- we study a single product with a constant demand;
- no shortages (stockouts) are allowed;
- the order quantity is constant;
- the time between the orders is constant;
- the lead time is 0 (the lead time is the time between the ordering moment and
the receipt of the goods; so goods arrive the same day they are ordered).
The elements of this model are:
a) τ - the entire period of time (known)
b) D - the demand over the entire period (known)
c) Q - the order quantity (unknown)
d) T - the time interval between the orders (unknown)
e) C_h - holding cost per item per day (known)
f) n - the number of orders (unknown)
g) C_0 - ordering cost, which is fixed for each order (known).
[Figure: the inventory level over time - a sawtooth pattern falling from Q to 0
on each interval of length T, with reorders at times T, 2T, . . . , nT.]
We have to determine the costs of each cycle and then multiply them by n in
order to obtain the total variable costs. The cost per cycle consists of the
ordering cost C_0 and the holding cost C_h (Q/2) T, so the total cost function
will be:
    C_τ = n [ C_0 + C_h (Q/2) T ].
We have to solve the following constrained problem:
    C_τ(n, Q, T) = n C_0 + n C_h (Q/2) T → min
    subject to the constraints τ = nT and D = nQ.
We will solve the previous constrained optimization problem by using the
elimination method. Since n = D/Q and T = τ/n, the total cost function depends
just on Q, and it remains for us to find the minimum value of the following one
variable function:
    C(Q) = (D/Q) C_0 + C_h (Q/2) τ.
The critical points are obtained by solving the equation
    C′(Q) = −D C_0/Q² + τ C_h/2 = 0,
from which we get
    Q² = 2 D C_0/(τ C_h)   and   Q = √(2 D C_0/(τ C_h)).
Since C′′(Q) = 2 D C_0/Q³ > 0, the optimal order quantity is
    Q* = √(2 D C_0/(τ C_h)).
Easy computations now give the values of all the unknowns mentioned before.
The minimum cost is
    C* = (D/Q*) C_0 + C_h (Q*/2) τ = √(2 D C_0 τ C_h).
The optimal number of orders is
    n* = D/Q* = √(D τ C_h/(2 C_0))
and the optimal time between two orders is
    T* = τ/n* = √(2 τ C_0/(D C_h)).
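The formulas above can be put into code with made-up numbers (a sketch; the data are our own, not from the text): say D = 500 items are demanded over τ = 100 days, with C_0 = 10 and C_h = 0,01. Then Q* = 100, n* = 5, T* = 20 and C* = 100.

```python
import math

# EOQ formulas with illustrative (made-up) numbers.
D, tau, C0, Ch = 500, 100, 10, 0.01

Q_star = math.sqrt(2 * D * C0 / (tau * Ch))          # optimal order size
n_star = D / Q_star                                  # optimal number of orders
T_star = tau / n_star                                # days between orders
C_star = (D / Q_star) * C0 + Ch * (Q_star / 2) * tau # minimum total cost
```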
Part III
Probabilities
A short history of probabilities
It is said that the theory of probabilities as a branch of mathematics appeared
in the middle of the 17th century in France. Antoine Gombaud, Chevalier de Méré
(a French nobleman), proposed his gambling problems to Blaise Pascal (1623-1662),
who started a mathematical correspondence with Pierre de Fermat (1601-1665).
The gambling problems were:
- how many throws of two dice are required so that the chance of at least one
double six exceeds one half;
- how to share the wagered money between two gamblers if the game is
interrupted before it ends.
The legend says that de Méré's gambling problems marked the beginning of the
theory of probabilities. Actually, the legend is not entirely true, since years before
Pascal and Fermat problems of a probabilistic nature had been analysed by some
mathematicians. It would be more realistic to say that Pascal and Fermat
initiated the fundamental principles of probability theory as we know them now
(the theory started as an empirical science).
There are at least two distinct roots of probability theory. The first one is the
processing of statistical data for determining mortality tables and insurance rates
(the Babylonians had forms of maritime insurance, the Romans had annuities,
and elements of empirical probability were applied to the census of the
population in China). The second one is gambling, which appeared in the early
stages of human history in many places of the world. The predecessor of the dice
was the astragalus (a heel bone of an animal; the bones were used both for
religious ceremonies and for gambling).
It took more than 2000 years of dice games, card games, etc. before someone
developed the basic probabilistic rules. There are at least two reasons for this
late appearance of probabilistic abstractions: Greek philosophy (its
anti-empiricism was against the quantification of random events) and early
Christian theology (every event was supposed to be a direct manifestation of
God's intervention, and in consequence every probabilist could be considered a
heretic).
The first reasoned considerations which put rudimentary probabilistic bases
under the games of chance were presented in the manuscript "The book on games
of chance", written around 1550 by Gerolamo Cardano and found after his death
in 1576 (the manuscript was printed only in 1663). G. Cardano was a physician
addicted to gambling. It is said that he sold all his wife's possessions just to get
table stakes. It can be said that the "classical definition" of probability came out
of his obsession with gambling.
The next paper on probability, "On a discovery concerning dice", is due to
Galileo Galilei (presumably written between 1613 and 1623).
The Pascal-Fermat exchange of letters (1654) remained unpublished until 1657.
Even though the correspondence solved a set of isolated problems in probability,
we cannot say that the obtained results laid the basis of a new theory.
But the strong influence on many mathematicians, focused initially on gambling
and then on other branches of mathematics and the sciences, led to the idea that
the history of probability begins with the correspondence between Pascal and
Fermat.
One of those who heard about the correspondence was a Dutch mathematician,
Christiaan Huygens (1629-1695). In 1657, after a visit to Paris where he met
neither Pascal (at that time Pascal had abandoned mathematics for religion) nor
Fermat, he published the first book on probability, "On reasoning in games of
dice", in which he solved the same problems that had already been solved by
Fermat and Pascal, proposed and solved some new problems, and introduced the
concept of mathematical expectation as "the value of the chance". Huygens' book
remained a standard introduction to the subject for about half a century.
During the same period, important advances were made in the collection of
demographic data and the development of the science known today as statistics.
John Graunt (1620-1674) made a semi-mathematical study of mortality and
insurance. His work was extended by Sir William Petty (1623-1687) and by
Edmund Halley (1656-1742), who developed mortality tables and is considered to
have initiated the science of life statistics.
Because of the games of chance, probability theory became popular and the
subject developed rapidly during the 18th century.
The major contributors during this period were the following.
Jacob Bernoulli (1654-1705), whose most important result was the Law of large
numbers.
Abraham de Moivre (1667-1754) derived the theory of permutations and
combinations from the principles of probability and founded the theory of
annuities. In 1733 he discovered the equation of the normal curve. The normal
curve is known as the "Gaussian curve" or "Gauss-Laplace curve" in honor of the
Marquis de Laplace (1749-1827) and Karl Friedrich Gauss (1777-1855), who
independently rediscovered the equation. Gauss obtained it from a study of
errors in repeated measurements of the same quantity. Laplace made great
contributions to the application of probability to astronomy and introduced the
use of partial differential equations into the study of probability.
Between 1835 and 1870 the Belgian scientist Lambert A.J. Quetelet (1796-1874)
showed that biological and anthropological measurements follow the normal
curve, and applied statistical methods in biology, education and sociology.
Siméon Denis Poisson (1781-1840) published in 1837 "Research on the
probability of judgements in criminal and civil matters", where the Poisson
distribution first appears.
Probability theory has been developed since the 17th century and now has applications in many fields such as actuarial mathematics, statistical mechanics, genetics, law, medicine and meteorology.
The major difficulty in developing a rigorous theory of probability was to give a definition of probability rigorous enough to be used in mathematics but at the same time applicable to the real world. It took three centuries until an acceptable definition was obtained.
Andrey Nikolaevich Kolmogorov (1903-1987) presented an axiomatic definition of probability; this work is the basis for the modern theory of probability.
Chapter 5
Counting techniques.
Tree diagrams
In this section we present some techniques for determining, without direct enumeration, the number of possible outcomes of a particular experiment or the number of elements in a particular set. Such counting problems are called combinatorial problems, because we count the number of ways in which different possible outcomes can be combined.
As the fundamental rules of all combinatorics we consider the addition rule and the multiplication rule. While these rules are very easy to state, they are useful in many varied and complicated situations.
5.1 The addition rule
The number of elements in a given set A is called the cardinality of A, and is denoted by card A or |A|.
If A is the empty set then its cardinality is 0.
If A is an infinite set, then its cardinality is ∞.
We will analyze only ﬁnite sets in this section.
Cardinality of unions
Let A and B be two finite sets. Then

card (A ∪ B) = card A + card B − card (A ∩ B).

The previous equality is obvious: if we add the cardinality of A to the cardinality of B, we have counted the cardinality of the intersection A ∩ B twice. Hence we have to subtract the cardinality of A ∩ B once from card A + card B.
As a corollary, if the sets A and B are mutually exclusive, then the cardinality of the union is the sum of the cardinalities:

card (A ∪ B) = card A + card B, if A ∩ B = ∅.
The generalization of the previous result to an arbitrary union of n finite sets is called the inclusion-exclusion principle.

The inclusion-exclusion principle. Let A_1, A_2, . . . , A_n be n finite sets. Then:

card (A_1 ∪ A_2 ∪ · · · ∪ A_n) = Σ_{i=1}^{n} card A_i − Σ_{1≤i<j≤n} card (A_i ∩ A_j)
    + Σ_{1≤i<j<k≤n} card (A_i ∩ A_j ∩ A_k) − · · · + (−1)^{n+1} card (A_1 ∩ A_2 ∩ · · · ∩ A_n).

If the sets A_1, . . . , A_n are mutually exclusive then:

card (A_1 ∪ A_2 ∪ · · · ∪ A_n) = Σ_{i=1}^{n} card A_i, if A_i ∩ A_j = ∅ for all i ≠ j.
Example. In an association of 95 people, 72 play backgammon, 44 play chess and 30 do not play either of these two games. How many of them play both backgammon and chess?

Solution. We will use the following notation:
A - the set of all members of the association
B - the set of backgammon players
C - the set of chess players.
We are given the following data:

card A = 95, card B = 72, card C = 44

and

card (A \ (B ∪ C)) = 30,

from which we easily get:

card (B ∪ C) = 95 − 30 = 65.

By applying the previous formula we obtain:

card (B ∩ C) = card B + card C − card (B ∪ C) = 72 + 44 − 65 = 51.

Hence, the number of people who play both backgammon and chess is 51.
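The arithmetic above can be checked mechanically; the following short Python sketch (the variable names are ours, not the book's) applies the same two-set inclusion-exclusion formula:

```python
# Counts taken from the example: 95 members, 72 backgammon players,
# 44 chess players, 30 who play neither game.
total, backgammon, chess, neither = 95, 72, 44, 30

# card(B ∪ C): members who play at least one game
union = total - neither

# Inclusion-exclusion: card(B ∩ C) = card B + card C - card(B ∪ C)
both = backgammon + chess - union

print(both)  # 51
```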
Example. 120 people take part in a conference. Each participant can speak at least one of the languages French, Spanish or German. We also know that:
• 10 people speak all three languages
• 4 people speak French and Spanish, but not German
• 8 people speak only Spanish
• 100 people speak French
• 32 people speak Spanish
• 53 people speak German.
Determine:
- the number of people who speak Spanish and German but not French
- the number of people who speak French and German but not Spanish
- the number of people who speak only German
- the number of people who speak only French.
Solution. We consider a Venn-Euler diagram in which each of the three discs corresponds to the group speaking one of the languages (French, German or Spanish). The seven regions of the diagram are labeled A (all three languages), B (French and German only), C (French and Spanish only), D (Spanish and German only), F (French only), G (German only) and S (Spanish only).
In order to simplify the notation we will denote by small corresponding letters the cardinalities of the involved sets (for instance a = card A).
By applying the addition rule with mutually exclusive sets, our problem reduces to solving the following system of linear equations:

a + b + c + d + f + g + s = 120
a = 10
c = 4
s = 8
f + c + a + b = 100
s + c + a + d = 32
g + a + b + d = 53

Substituting a = 10, c = 4, s = 8 into the remaining equations gives:

b + d + f + g = 120 − 10 − 4 − 8 = 98
f + b = 100 − 14 = 86
d = 32 − 22 = 10
g + b = 53 − 10 − 10 = 33

With d = 10 the first of these equations becomes b + f + g = 88, hence:

g = 88 − 86 = 2
b = 33 − 2 = 31
f = 88 − 33 = 55
In conclusion: a = 10, b = 31, c = 4, d = 10, f = 55, g = 2, s = 8.
Hence:
- there are d = 10 people who speak Spanish and German, but not French
- there are b = 31 people who speak French and German, but not Spanish
- there are g = 2 people who speak only German
- there are f = 55 people who speak only French.
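The solution can be verified by substituting the region counts back into every condition of the problem; a small Python sketch (the lower-case variable names mirror the diagram labels used above):

```python
# Region counts found above: a = all three, b = French+German only,
# c = French+Spanish only, d = Spanish+German only,
# f = French only, g = German only, s = Spanish only.
a, b, c, d, f, g, s = 10, 31, 4, 10, 55, 2, 8

assert a + b + c + d + f + g + s == 120   # all 120 participants
assert f + c + a + b == 100               # French speakers
assert s + c + a + d == 32                # Spanish speakers
assert g + a + b + d == 53                # German speakers
assert a == 10 and c == 4 and s == 8      # the given region counts
print("all constraints satisfied")
```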
The addition rule can be formulated in general terms involving objects, operations
or symbols but the main idea is the same.
The addition rule
If there are m possible outcomes for an event (or ways to do something) and n possible outcomes for another event (or ways to do another thing), and the two events cannot both occur (or the two things cannot both be done), then there are m + n possible ways for one of the two events to occur (or for one of the two things to be done).
Formally, the sum of the sizes of two disjoint sets is equal to the size of their union.
Example. A square with side length 3 is divided by parallel lines into 9 equal squares. What is the total number of squares obtained by this procedure?

Solution. We divide the squares into three sets S_1, S_2, S_3 such that the set S_i contains all squares of side length i (i = 1, 2, 3).
It is obvious that card S_1 = 9, card S_2 = 4 and card S_3 = 1, hence the total number of squares is

card (S_1 ∪ S_2 ∪ S_3) = card S_1 + card S_2 + card S_3 = 9 + 4 + 1 = 14.
5.2 Tree diagrams and the multiplication principle
Consider an experiment that takes place in several steps, such that the number of outcomes at the n-th step is independent of the outcomes of the previous steps. The number of outcomes may be different at different steps. We have to count the number of ways in which the entire experiment can occur.
The best way to analyze such multi-step problems is by drawing a tree diagram. We list the possible outcomes of the first step, and then draw lines to represent the possible outcomes that can occur in the second step, and so on.
To clarify the above description we will present the following example:
Example. A café has the following menu:
a) two choices for appetizers: soup or juice;
b) three for the main course: lamb chops, fish or vegetable dish;
c) two for dessert: ice cream or cake.
How many possible choices do you have for your complete menu?

Solution. The complete menu is chosen in three independent steps: two choices at the first course, three at the second and two at the third.
From the following tree diagram we see that the total number of choices is the product of the number of choices at each stage.
The tree has two first-level branches (soup, juice); each of them splits into three branches (lamb, fish, vegetable), and each of these splits into two (ice cream, cake), giving 12 leaves in total.
We have 2 · 3 · 2 = 12 possible menus.
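The tree can be enumerated directly with a Cartesian product; a minimal sketch (the dish names are taken from the example):

```python
from itertools import product

appetizers = ["soup", "juice"]
mains = ["lamb", "fish", "vegetable"]
desserts = ["ice cream", "cake"]

# Each complete menu is one root-to-leaf path through the tree diagram
menus = list(product(appetizers, mains, desserts))

print(len(menus))  # 12, i.e. 2 * 3 * 2
```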
A tree diagram is a device used to enumerate all the possible outcomes of a multi-step experiment where the outcomes at each step are independent of those at the previous steps and the outcomes at each step can occur in a finite number of ways.
From the previous example we can observe that the tree is constructed from left to right, and the number of branches at each point corresponds to the number of possible outcomes of the next step of the experiment.
More rigorously, we can introduce a tree as follows.
A directed graph is a set of points, called vertices, together with a set of directed line segments, called edges, between some pairs of distinct vertices. A path from a vertex u to a vertex v in a directed graph G is a finite sequence (v_0, v_1, . . . , v_n) of vertices of G (n ≥ 1) such that v_0 = u, v_n = v, and (v_{i−1}, v_i) is an edge in G for i = 1, 2, . . . , n.
A directed graph T is a tree if it has a distinguished vertex r, called the root, such that r has no edges going into it and for every other vertex v of T there is a unique path from r to v.
We can easily generalize the result obtained in the previous example to multi-step experiments.

The multiplication rule
If an experiment is performed in m steps, and there are n_1 choices in the first step, and for each of those choices there are n_2 choices in the second step, and so on, with n_m choices in the last step for each of the previous choices, then the number of all possible outcomes is given by the product n_1 n_2 n_3 . . . n_m.
Example. On a grid of sporting lotto, we have to choose one of the three boxes 1, x or 2 for each of the 9 matches (1 is put when the host team wins the match, x when the match ends in a draw and 2 when the host team loses the match).
How many different choices do we have?

Solution. There are 9 steps in our experiment (since there are 9 matches on the grid), hence m = 9.
Since n_1 = n_2 = · · · = n_9 = 3 (at each step we have to choose one of the three possible boxes: 1, x or 2), the number of all choices is

3 · 3 · · · 3 (9 times) = 3^9 = 19683.
Example. How many functions can we define on a set A, with card A = m, given that the target space is B, with card B = n?

Solution. There are m steps in our experiment (since there are m points in the domain of the function).
Since n_1 = n_2 = · · · = n_m = n (at each step we have to choose one of the n elements of the set B), the number of all functions is

n · n · · · n (m times) = n^m.
Example. In a race with 20 horses, in how many different ways can the first three places be filled?

Solution. There are 3 steps in our experiment (first, second and third place), hence m = 3. There are 20 horses that can come first (n_1 = 20). Whichever horse comes first, there are 19 horses left that can come second (n_2 = 19). Whichever horses come first and second, there are 18 horses left that can come third (n_3 = 18). So there are n_1 n_2 n_3 = 20 · 19 · 18 = 6840 ways in which the first three positions can be filled.
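The count can be cross-checked by enumerating the ordered triples directly; a sketch using Python's standard library:

```python
from itertools import permutations
from math import perm

horses = range(20)

# Every ordered triple (first, second, third place) is a 3-permutation
podiums = list(permutations(horses, 3))

assert len(podiums) == 20 * 19 * 18 == perm(20, 3)
print(len(podiums))  # 6840
```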
Example. Ruxi and Ana are to play a tennis match. The ﬁrst person who wins
three sets wins the match. Draw a tree diagram which shows the possible outcomes
of the match.
Solution. Writing R for a set won by Ruxi and A for a set won by Ana, the leaves of the tree diagram are the following sequences of set winners:

(R, R, R), (R, R, A, R), (R, R, A, A, R), (R, R, A, A, A),
(R, A, R, R), (R, A, R, A, R), (R, A, R, A, A), (R, A, A, R, R), (R, A, A, R, A), (R, A, A, A),
(A, R, R, R), (A, R, R, A, R), (A, R, R, A, A), (A, R, A, R, R), (A, R, A, R, A), (A, R, A, A),
(A, A, R, R, R), (A, A, R, R, A), (A, A, R, A), (A, A, A).
Notice that the total number of possible outcomes is 20. In this example the branches have different lengths, which makes the counting more difficult than in the previous examples.
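A tree with branches of different lengths is naturally generated by recursion; a sketch that plays sets one at a time and stops as soon as either player has won three:

```python
# Enumerate all possible best-of-five matches between R (Ruxi) and A (Ana)
def matches(history=()):
    if history.count("R") == 3 or history.count("A") == 3:
        return [history]              # the match is decided: one leaf
    outcomes = []
    for winner in ("R", "A"):         # two branches at every undecided node
        outcomes += matches(history + (winner,))
    return outcomes

results = matches()
print(len(results))  # 20 outcomes, as in the tree diagram
```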
Example. How many natural numbers are there under 1000 whose digits are all even?

Solution. There are 4 single-digit even numbers.
There are 4 · 5 numbers with two even digits, since the first digit can take 4 values (2, 4, 6, 8) and the second digit can take 5 values (0, 2, 4, 6, 8).
There are 4 · 5^2 numbers with three even digits.
By applying the addition rule we get the solution, which is

4 + 4 · 5 + 4 · 5^2 = 4 · 31 = 124.
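Brute-force enumeration confirms the count; a minimal sketch:

```python
# Count natural numbers below 1000 all of whose digits are even
count = sum(1 for n in range(1, 1000)
            if all(int(digit) % 2 == 0 for digit in str(n)))

print(count)  # 124 = 4 + 4*5 + 4*5**2
```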
5.3 Permutations and combinations
Some counting problems appear so frequently in applications that they have special names and symbols. In this subsection we will discuss such problems.
Permutations
Example. How many different ordered arrangements of the letters a, b, c are possible?

Solution. We can enumerate all the possibilities, which are: abc, acb, bac, bca, cab, cba. Hence there are 6 ordered arrangements.
This result can also be obtained from the multiplication rule, since the first letter can be any of the three, the second letter can be any of the two remaining letters and the third letter is the remaining one.
Thus there are 3 · 2 · 1 = 6 ordered arrangements of the given letters.
Factorial notation
The product of the positive integers from 1 to n inclusive occurs frequently in mathematics and is denoted by the special symbol n! (read "n factorial"):

n! = 1 · 2 · · · n.

The expression 0! is defined to be 1 in order to simplify certain formulas.
• Any arrangement of a set of n objects in a given order is called a permutation of the objects (taken all at a time).
• Any arrangement of any k ≤ n of these objects in a given order is called a k-permutation, a permutation of the n objects taken k at a time, or an arrangement of the n objects taken k at a time.

Notations. The number of permutations of n objects taken k at a time is denoted by P(n, k) or A_n^k. The number of permutations of n objects taken n at a time is denoted by P(n, n) or P_n.
We are usually interested in the number of such permutations without listing them.

Theorem. Given n distinct objects, the number of distinct permutations of the n objects taken k (k ≤ n) at a time is

P(n, k) = A_n^k = n(n − 1) . . . (n − k + 1) = n!/(n − k)!,    P_n = n!.

Proof. In the first place we can put any of the n objects (n = n − 1 + 1); in the second place we can put any of the remaining n − 1 objects (n − 1 = n − 2 + 1); and so on. In the k-th place (the last one) we can put any of the remaining n − k + 1 objects.
By applying the multiplication rule,

A_n^k = P(n, k) = n(n − 1)(n − 2) . . . (n − k + 1)
      = [n(n − 1)(n − 2) . . . (n − k + 1)(n − k) . . . 2 · 1] / [(n − k) . . . 2 · 1]
      = n!/(n − k)!.

If we take k = n in the previous formula we get P_n = P(n, n) = n!/0! = n!, as desired.
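The two expressions for P(n, k) can be compared numerically; a sketch using the standard library (math.perm computes exactly this falling product):

```python
from math import factorial, perm

def A(n, k):
    """Number of k-permutations of n objects: n! / (n - k)!"""
    return factorial(n) // factorial(n - k)

# The factorial quotient, the falling product and math.perm all agree
assert A(6, 4) == 6 * 5 * 4 * 3 == perm(6, 4) == 360
assert A(5, 5) == factorial(5)  # P_n = n!
print(A(6, 4))  # 360
```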
Example. Ioana has 11 books that she is going to put on her bookshelf. Of these, 5 are law books, 3 are literature books, 2 are history books and 1 is a language book. Ioana wants to arrange her books so that all the books on the same subject are together on the shelf. How many different arrangements are possible?

Solution. For each possible ordering of the subjects, there are 5! · 3! · 2! · 1! possible arrangements. Since there are 4! possible orderings of the subjects, the desired result is 4! · 5! · 3! · 2! · 1!.
Example. In how many ways can 8 persons arrange themselves
a) in a row of 8 chairs?
b) around a circular table?

Solution. a) The eight persons can arrange themselves in P_8 = 8! ways.
b) One person can sit at any place at the circular table. The other 7 persons can then arrange themselves in P_7 = 7! ways around the table.
This is an example of a circular permutation:
n objects can be arranged in a circle in (n − 1)! ways.
Example. a) In how many ways can 3 women aged 20, 4 women aged 45 and 6 women aged 75 be seated in a row so that those of the same age sit together?
b) Solve the same problem if they sit at a round table.

Solution. a) The 3 groups of women can be arranged in a row in 3! ways. In each case, the 3 women aged 20 can be seated in 3! ways, the 4 women aged 45 in 4! ways and the 6 women aged 75 in 6! ways. Thus, there are 3! · 3! · 4! · 6! arrangements.
b) The 3 groups of women can be arranged in a circle in 2! ways (see the previous example with circular permutations). Thus, in this case there are 2! · 3! · 4! · 6! arrangements.
Example. Find the number of "four-letter words" using only the letters a, b, c, d, e, f. Don't use a letter twice!

Solution.

A_6^4 = 6!/(6 − 4)! = 6!/2! = 3 · 4 · 5 · 6 = 360.
Permutations with indistinguishable objects
We will now determine the number of permutations of a set of n objects when
certain objects are indistinguishable from each other.
First of all we will present an example.
Example. How many different letter arrangements can be formed using the letters of MUMMY?

Solution. First, note that there are not 5! permutations, since the M's are not distinguishable from each other.
If the three M's are distinguished, there are 5! permutations of the letters M_1 U M_2 M_3 Y. Observe that the following 3! = 6 permutations

M_1 M_2 M_3 U Y,  M_1 M_3 M_2 U Y,  M_2 M_1 M_3 U Y,
M_2 M_3 M_1 U Y,  M_3 M_1 M_2 U Y,  M_3 M_2 M_1 U Y

produce the same word when the subscripts are removed. This is true for each of the other possible positions in which the M's appear.
In conclusion there are

5!/3! = 4 · 5 = 20

different letter arrangements that can be obtained using the letters of the word MUMMY.
Theorem. If there are n objects, with n_1 indistinguishable objects of a first type, n_2 indistinguishable objects of a second type, . . . and n_k indistinguishable objects of a k-th type, where n_1 + n_2 + · · · + n_k = n, then there are

n!/(n_1! n_2! . . . n_k!)

linear arrangements of the given n objects.

Proof. We begin by assuming that the objects of the same type are distinct and list all n! arrangements of these n objects. We split these arrangements into groups such that two elements of the same group differ only by the fact that objects of the same type are interchanged. Each group can be represented by one fixed arrangement with repetitions. Since the objects of the first type can be interchanged in n_1! ways, the objects of the second type in n_2! ways, . . . and the objects of the k-th type in n_k! ways, each group contains exactly n_1! n_2! . . . n_k! arrangements. In conclusion, by applying the addition rule over the groups, the desired number of arrangements is

n!/(n_1! n_2! . . . n_k!).
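The MUMMY example checks the formula against direct enumeration; a Python sketch:

```python
from itertools import permutations
from math import factorial

word = "MUMMY"  # n = 5 letters, the letter M appears n_1 = 3 times

# Formula from the theorem: n! / (n_1! n_2! ... n_k!)
by_formula = factorial(5) // factorial(3)

# Direct check: count the distinct words among all 5! orderings
by_enumeration = len(set(permutations(word)))

assert by_formula == by_enumeration == 20
print(by_formula)  # 20
```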
Example. A university applicant has to pass four entrance exams, scoring 2, 3 or 4 points on each exam. In order to be accepted the applicant must get a total of at least 13 points. How many possible exam results are there such that the applicant is accepted?

Solution. In order to be accepted the applicant can obtain a total of 13, 14, 15 or 16 points on the 4 exams.
16 points can be achieved in 1 way (four points on each exam).
15 points can be achieved in 4!/(3! 1!) = 4 ways (four points on any three exams out of four and 3 points on the other exam).
14 points can be achieved in 4!/(3! 1!) + 4!/(2! 2!) = 4 + 6 = 10 ways (four points on any 3 exams and 2 points on the other exam, or four points on any two exams and 3 points on the other two exams).
13 points can be achieved in 4!/(2! 1! 1!) + 4!/(3! 1!) = 12 + 4 = 16 ways (four points on 2 exams, 3 points on one exam and 2 points on the other exam, or 3 points on any 3 exams and four points on the other exam).
By applying the addition rule we get 1 + 4 + 10 + 16 = 31 possible exam results for which the applicant is accepted.
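The case analysis can be confirmed by brute force over all 3^4 = 81 possible score tuples; a sketch:

```python
from itertools import product

# Each exam result is a 4-tuple of scores, each score being 2, 3 or 4
accepted = [scores for scores in product((2, 3, 4), repeat=4)
            if sum(scores) >= 13]

print(len(accepted))  # 31 = 1 + 4 + 10 + 16
```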
Permutations with repetitions
Let A = {a_1, . . . , a_n} be a set with n elements.
A k-permutation with repetitions of elements of n types is a k-tuple whose components are in the set A.

Theorem. The number of all k-permutations with repetitions of elements of n types is n^k.

Proof. Each component of the k-tuple can take n values, and by applying the multiplication rule the desired number is

n · n · · · n (k times) = n^k.
Combinations
Example. Ten points lie in a plane in such a way that no three of them lie on the same straight line. How many lines do these points determine?

Solution. Since each line is uniquely determined by a pair of points through which it passes, the number of all lines is equal to the number of all unordered pairs of points that can be chosen from the given set of 10. There are A_10^2 pairs of 2 points when the order in which the points are selected is relevant. However, since every pair is counted twice, the total number of lines is equal to

A_10^2 / 2 = 10!/(8! · 2) = (9 · 10)/2 = 45.
As the previous example shows, sometimes we are interested in determining the number of different groups of k objects that could be selected from a total of n objects.

Definition. Let us consider a set with n elements.
A combination of these n elements taken k at a time is any selection of k of the n elements where the order does not count. Such a selection is called a k-combination. The number of all possible unordered selections of k different elements out of n different ones is denoted by (n over k) (read "n choose k") or by C_n^k (read "combinations of n taken k at a time").
Theorem. If 0 ≤ k ≤ n then

C_n^k = A_n^k / k! = n!/(k!(n − k)!).

Proof. We have C_n^k ways of choosing k elements out of n without regard to order. In each case we have k elements, which can be ordered in k! ways. By applying the multiplication rule, the number of k-permutations is C_n^k · k!. On the other hand, this number is A_n^k. Hence

C_n^k · k! = A_n^k,

from which we get the desired formula.
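The relation between C_n^k, A_n^k and the factorials can be checked with the standard library; a sketch using the values from the ten-points example:

```python
from math import comb, factorial, perm

n, k = 10, 2

# C(n, k) = A(n, k) / k! = n! / (k! (n - k)!)
assert comb(n, k) == perm(n, k) // factorial(k)
assert comb(n, k) == factorial(n) // (factorial(k) * factorial(n - k))
print(comb(n, k))  # 45, the number of lines through 10 points
```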
Remark. The quantity C_n^k is also called the binomial coefficient since it occurs as a coefficient in the binomial expansion

(a + b)^n = C_n^0 a^n + C_n^1 a^{n−1} b + · · · + C_n^k a^{n−k} b^k + · · · + C_n^n b^n.
Properties of the binomial coefficients
1°. C_n^k = C_n^{n−k}
2°. Pascal's identity: C_{n+1}^k = C_n^k + C_n^{k−1}
3°. Sum of the binomial coefficients: C_n^0 + C_n^1 + C_n^2 + · · · + C_n^n = 2^n
4°. Vandermonde's identity: C_{n+m}^k = C_n^0 C_m^k + C_n^1 C_m^{k−1} + C_n^2 C_m^{k−2} + · · · + C_n^k C_m^0
Proofs. 1°. C_n^{n−k} = n!/[(n − k)!(n − (n − k))!] = n!/[(n − k)! k!] = C_n^k.

2°. Expanding the right-hand side of the equality we obtain

C_n^k + C_n^{k−1} = n!/[k!(n − k)!] + n!/[(k − 1)!(n − k + 1)!]
                 = n!/[(k − 1)!(n − k)!] · [1/k + 1/(n − k + 1)]
                 = n!/[(k − 1)!(n − k)!] · (n − k + 1 + k)/[k(n − k + 1)]
                 = (n + 1)!/[k!(n − k + 1)!] = C_{n+1}^k.

3°. Substituting a = b = 1 in the binomial expansion

(a + b)^n = C_n^0 a^n + C_n^1 a^{n−1} b + · · · + C_n^n b^n,

we obtain 2^n = C_n^0 + C_n^1 + C_n^2 + · · · + C_n^n, as desired.

4°. By identifying the coefficients of x^k on both sides of the identity

(1 + x)^{m+n} = (1 + x)^m (1 + x)^n,

that is, of

C_{m+n}^0 + C_{m+n}^1 x + · · · + C_{m+n}^k x^k + · · · + C_{m+n}^{m+n} x^{m+n}
  = (C_n^0 + C_n^1 x + C_n^2 x^2 + · · · + C_n^n x^n)(C_m^0 + C_m^1 x + C_m^2 x^2 + · · · + C_m^m x^m),

we get that

C_{m+n}^k = C_n^0 C_m^k + C_n^1 C_m^{k−1} + · · · + C_n^k C_m^0.
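All four properties can be verified numerically for sample values; a sketch (the choice n = 7, m = 5, k = 4 is ours):

```python
from math import comb

n, m, k = 7, 5, 4

# 1. Symmetry
assert comb(n, k) == comb(n, n - k)
# 2. Pascal's identity
assert comb(n + 1, k) == comb(n, k) + comb(n, k - 1)
# 3. Sum of a row of Pascal's triangle
assert sum(comb(n, j) for j in range(n + 1)) == 2 ** n
# 4. Vandermonde's identity
assert comb(n + m, k) == sum(comb(n, j) * comb(m, k - j) for j in range(k + 1))
print("all four identities hold for n=7, m=5, k=4")
```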
Example. A farmer buys 3 cows, 2 pigs and 4 hens from a man who has 6 cows, 5 pigs and 8 hens. Find the number of choices that the farmer has.

Solution. The farmer can choose the cows in C_6^3 ways, the pigs in C_5^2 ways and the hens in C_8^4 ways. By the multiplication rule the number of choices is C_6^3 · C_5^2 · C_8^4.
Example. From a group which consists of 7 boys and 4 girls we want to choose a six-member volleyball team that has at least 2 girls. In how many ways can the volleyball team be selected?

Solution. We divide all possible choices into three groups V_2, V_3, V_4 such that each team in V_i, i = 2, 3, 4, contains exactly i girls. These i girls can be chosen in C_4^i ways and the remaining 6 − i team members are chosen in C_7^{6−i} ways. Hence,

card V_i = C_4^i · C_7^{6−i}

and the number of all choices is

card V_2 + card V_3 + card V_4 = C_4^2 C_7^4 + C_4^3 C_7^3 + C_4^4 C_7^2 = 6 · 35 + 4 · 35 + 1 · 21 = 371.
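The formula can be checked against a direct enumeration of all six-member teams; a sketch (the labels "g1", "b1", etc. are ours):

```python
from itertools import combinations
from math import comb

# By formula: at least 2 of the 4 girls on a six-member team
by_formula = sum(comb(4, i) * comb(7, 6 - i) for i in (2, 3, 4))

# By enumeration: choose 6 of the 11 people, keep teams with >= 2 girls
people = ["g1", "g2", "g3", "g4"] + [f"b{i}" for i in range(1, 8)]
by_enumeration = sum(1 for team in combinations(people, 6)
                     if sum(p.startswith("g") for p in team) >= 2)

assert by_formula == by_enumeration == 371
print(by_formula)  # 371
```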
Combinations with repetitions
Let A = {a_1, . . . , a_n} be a set with n elements.
A k-combination with repetitions of elements of n types is an unordered group of k elements which consists of k_i copies of a_i, i = 1, . . . , n, where k_1 + k_2 + · · · + k_n = k.
For example {a, a, b, b, b} is a 5-combination with repetitions of the elements of the set {a, b, c, d}.

Theorem. The number of all k-combinations with repetitions of elements of n types is equal to C_{k+n−1}^k.
Proof. To each k-combination with repetitions of elements of n types we associate a sequence of zeros and ones as follows.
First we write k_1 ones, then one zero, then k_2 ones, then one zero, and so on, up to k_n ones. If k_i = 0 for some i, there will be no ones in the corresponding position. Thus we obtain an ordered m-tuple of 0's and 1's where

m = k_1 + 1 + k_2 + 1 + · · · + k_{n−1} + 1 + k_n = k_1 + k_2 + · · · + k_n + n − 1 = k + n − 1.

There is a one-to-one correspondence between the k-combinations with repetitions of elements of n types and the (k + n − 1)-tuples which contain k ones and n − 1 zeros. The number of the latter is C_{k+n−1}^k, as needed.
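The correspondence in the proof can be tested numerically: the standard library enumerates k-combinations with repetitions directly, and their count should equal C_{k+n−1}^k. A sketch with n = 4 types and k = 5:

```python
from itertools import combinations_with_replacement
from math import comb

n, k = 4, 5  # elements of n = 4 types, unordered groups of size k = 5

# Direct enumeration of all k-combinations with repetitions
groups = list(combinations_with_replacement("abcd", k))

# Formula from the theorem: C(k + n - 1, k)
assert len(groups) == comb(k + n - 1, k) == 56
print(len(groups))  # 56
```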
Example. A domino is a rectangle divided into two squares, with each square numbered one of 0, 1, . . . , 6, repetitions allowed. How many dominoes are there?

Solution. In this case n = 7 (there are seven possibilities, 0, 1, 2, . . . , 6, for each square) and k = 2 (each of the two squares of a domino is to be numbered).
Hence, there are C_{2+7−1}^2 = C_8^2 = (8 · 7)/2 = 28 dominoes.
Example. On their way home, seven students stop at a restaurant, where each of them has one of the following: a cheeseburger, a hot dog, a taco or a fish sandwich. How many different orders are possible (from the point of view of the restaurant)?

Solution. In this case n = 4 (there are four types of food available) and k = 7 (each of the 7 students chooses one food). Hence, the number of possible orders is

C_{7+4−1}^7 = C_{10}^7 = 10!/(7! 3!) = (8 · 9 · 10)/6 = 120.
Remark. The problem of combinations with repetitions allowed, with given n and k, is equivalent to the following problem: how many solutions are there to the equation x_1 + x_2 + · · · + x_n = k such that each x_i is a nonnegative integer? Indeed, x_i represents the number of elements of type i selected, i = 1, . . . , n.

Example. How many solutions does the equation

x_1 + x_2 + x_3 + x_4 = 7

have such that x_i, i = 1, . . . , 4, are nonnegative integers?

Solution. Here n = 4 and k = 7, so the answer is

C_{7+4−1}^7 = C_{10}^7 = 10!/(7! 3!) = 120.
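The equivalence can be verified by listing the solutions explicitly; a brute-force sketch:

```python
from itertools import product
from math import comb

# Count nonnegative integer solutions of x1 + x2 + x3 + x4 = 7
# (each xi is at most 7, so range(8) suffices)
solutions = [x for x in product(range(8), repeat=4) if sum(x) == 7]

assert len(solutions) == comb(7 + 4 - 1, 7) == 120
print(len(solutions))  # 120
```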
Chapter 6
Basic probability concepts
6.1 Sample space. Events
A random experiment is a process or an action whose outcomes are not known in advance with certainty. Classic random experiments include flipping a coin, rolling a dice, selecting a ball from an urn and drawing a card from a deck.
Each repetition of an experiment is a trial, which has an observable outcome. If we assume in the coin experiment that the coin cannot rest on its edge, the two possible outcomes for a trial are the occurrence of a head and the occurrence of a tail.
The set which contains all the possible outcomes of an experiment is called the sample space (denoted by S).
For the coin example the sample space S is defined as

S = {head, tail} = {H, T}.
Some other examples:
Example 1. If the experiment consists of flipping two coins, then the sample space consists of the following 4 elements:

S = {(H, H), (H, T), (T, H), (T, T)}.

The outcome will be (H, H) if both coins come up heads; it will be (H, T) if the first coin comes up heads and the second comes up tails, etc.

Example 2. If the experiment consists of rolling a dice, then the sample space is S = {1, 2, 3, 4, 5, 6}, where the outcome i means that i points appeared on the dice, i = 1, . . . , 6.

Example 3. If the experiment consists of rolling a dice until a six is obtained, then we obtain an infinite sample space S:

S = {6, 16, 26, 36, 46, 56, 116, 126, . . . }.
Example 4. If the experiment consists of rolling two dice, then the sample space consists of the following 36 elements:

S = { (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
      (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
      (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
      (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
      (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
      (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6) },

where the outcome (i, j) is said to occur if i appears on the first dice and j on the second dice.
Each element of S is called an elementary event. Any subset E of the sample space (or any collection of elementary events) is known as an event. Events are denoted by capital letters. Some examples of events are the following.

Example 5. In Example 2 the event that an odd number appears on the dice is A = {1, 3, 5} and the event that the outcome is at most 4 is B = {1, 2, 3, 4}.

Example 6. In Example 4, if

E = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)},

then E is the event that the sum of the dice equals 7.
We say that an event E ⊂ S is realized if the outcome of the experiment is an element of the set E.
The impossible event is the event that never happens. This is the event containing no outcomes and is denoted by ∅.
The certain event is the event that happens in every trial. This is the event that contains all possible outcomes, which is S.
Remark. Since the events are sets of elementary events (or subsets of the sample
space) we may combine them according to the usual set operations.
Operations with events
Consider an experiment whose sample space is S.
Let A and B be two events of the sample space S.
• The union of the events A and B is the event denoted by A ∪ B which consists of all elementary events that are in A, in B, or in both of them. That is, the event A ∪ B will occur if either A or B occurs.
For instance, in Example 5, if A = {1, 3, 5} and B = {1, 2, 3, 4} then

A ∪ B = {1, 2, 3, 4, 5}.
• The intersection of the events A and B is the event denoted by A ∩ B which consists of all elementary events that are both in A and in B. That is, the event A ∩ B will occur only if both A and B occur.
For instance, in Example 5, if A = {1, 3, 5} and B = {1, 2, 3, 4} then

A ∩ B = {1, 3}.

Two events A and B are said to be mutually exclusive events or disjoint events if they cannot be realized at the same time, that is A ∩ B = ∅.
We can also define unions and intersections of more than two events in a similar manner.
• The contrary event of a given event A is the event denoted by Ā which consists of all elementary events in the sample space S that are not in A. The contrary event Ā will occur if and only if the event A does not occur.
For instance, in Example 5, if A = {1, 3, 5} then Ā = {2, 4, 6}.
• The difference of two events A and B is the event denoted by A \ B (= A ∩ B̄) which consists of all the elementary events that are in A but not in B. The event A \ B will occur if and only if the event A occurs and the event B does not occur.
• The inclusion A ⊂ B holds if every elementary event in A is also in B; in that case the occurrence of A implies the occurrence of B.
Properties of events operations
Operations with events satisfy various identities which are listed in the table below.
Event spaces (σ-fields)
This section contains a technical approach which can be omitted at a first reading.
We have already observed that events are subsets of S. The question which arises as a consequence of the previous remark is the following: which subsets of S can be considered to be events?
It is obvious that if A and B are events, then A ∪ B, Ā and A ∩ B should also be events. This is too vague; to be rigorous, we say that a subset A of S can be an event if it belongs to a set T ⊆ P(S) (where P(S) denotes the power set of S) which satisfies the following properties.

Definition. (Event space or σ-field)
Let S be the sample space of a given experiment.
The set T ⊆ P(S) is called an event space or a σ-field if the following three conditions are fulfilled:
(i) S ∈ T
(ii) if A ∈ T then Ā ∈ T
(iii) if A_j ∈ T, j ∈ N, then ⋃_{j=1}^∞ A_j ∈ T.
Idempotent laws:    A ∪ A = A;  A ∩ A = A
Associative laws:   (A ∪ B) ∪ C = A ∪ (B ∪ C);  (A ∩ B) ∩ C = A ∩ (B ∩ C)
Commutative laws:   A ∪ B = B ∪ A;  A ∩ B = B ∩ A
Distributive laws:  A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C);  A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Identity laws:      A ∪ ∅ = A;  A ∩ S = A;  A ∪ S = S;  A ∩ ∅ = ∅
Involution law:     the contrary of Ā is A
Complement laws:    A ∪ Ā = S;  A ∩ Ā = ∅;  the contrary of S is ∅, and the contrary of ∅ is S
De Morgan's laws:   the contrary of A ∪ B is Ā ∩ B̄;  the contrary of A ∩ B is Ā ∪ B̄
Remark. Let T be an event space (on the sample space S).
If A, B ∈ T then A ∪ B, A ∩ B, A \ B and A∆B ∈ T.

Proof. First, we observe that ∅ ∈ T. This is true since ∅ = S̄ ∈ T (according to the second rule of the previous definition).
If we now take, in the third rule of the definition of an event space, A_1 = A, A_2 = B and A_j = ∅ for j ≥ 3, then

A ∪ B = ⋃_{j=1}^∞ A_j ∈ T.

As a consequence of De Morgan's laws, A ∩ B is the contrary of the event Ā ∪ B̄; since Ā ∪ B̄ ∈ T (by the second and third rules of the previous definition), its contrary A ∩ B also belongs to T.
Now it is obvious that A \ B = A ∩ B̄ ∈ T.
Similarly, A∆B = (A \ B) ∪ (B \ A) ∈ T.
Example. In the experiment of rolling a dice we define the following events:
A, the event that the outcome is even (A = {2, 4, 6});
B, the event that the outcome is odd (B = {1, 3, 5}).
In this case T_1 = {∅, A, B, S} is an event space, while T_2 = {∅, A, S} is not an event space since Ā = B ∉ T_2.
The smallest event space which can be defined on a sample space S is T = {∅, S} and the largest event space is T(S).
Usually, if S is ﬁnite then T = T(S). If S is inﬁnite, then T(S) is too big to be
useful, and a smaller collection of subsets is required.
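For finite collections of events, the three conditions in the definition can be verified mechanically. A minimal Python sketch (the function name is my own; for a finite collection, closure under countable unions reduces to closure under pairwise unions):

```python
from itertools import product

def is_event_space(collection, S):
    """Check the sigma-field conditions for a finite collection of subsets of S.

    For a finite collection, closure under countable unions reduces to
    closure under pairwise unions."""
    events = [frozenset(e) for e in collection]
    S = frozenset(S)
    if S not in events:                        # (i) S must belong to T
        return False
    for A in events:
        if S - A not in events:                # (ii) closed under complements
            return False
    for A, B in product(events, repeat=2):
        if A | B not in events:                # (iii) closed under unions
            return False
    return True

S = {1, 2, 3, 4, 5, 6}
A, B = {2, 4, 6}, {1, 3, 5}
T1 = [set(), A, B, S]
T2 = [set(), A, S]
print(is_event_space(T1, S))  # True
print(is_event_space(T2, S))  # False: the complement of A is missing
```

This reproduces the dice example above: T_1 passes all three checks, while T_2 fails the complement check.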
The classical deﬁnition of probability
The classical definition was given by the French mathematician Pierre-Simon Laplace in his book "Théorie analytique des probabilités" in the following form:
"The probability of an event is the ratio of the number of cases favorable to it, to the number of all cases possible, when nothing leads us to expect that any of these cases should occur more than any other."
Deﬁnition. (The classical deﬁnition of probability)
We consider an experiment which has a finite number of equally likely outcomes, S = {s_1, ..., s_n}.
The function P : T(S) → [0, 1], defined as
T(S) ∋ A ↦ P(A) = (number of cases favorable to the occurrence of A) / (number of all possible outcomes, n),
is a probability on T(S).
Example. In the experiment of rolling a pair of unbiased dice compute the probability of the event that
a) the sum of the dice is 11;
b) the sum of the dice is at least 11.
Solution. The number of all possible cases is 36 = 6 · 6 (see Example 2).
Let A be the event that the sum of the dice is 11 and B the event that the sum of the dice is at least 11. Then
A = {(5, 6), (6, 5)}, B = {(5, 6), (6, 5), (6, 6)},
P(A) = 2/36 = 1/18 and P(B) = 3/36 = 1/12.
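The classical definition lends itself to direct enumeration. A small sketch (helper names are illustrative) that recomputes the two probabilities above by counting favorable cases:

```python
from itertools import product
from fractions import Fraction

# Sample space for a pair of dice: 36 equally likely ordered pairs.
S = list(product(range(1, 7), repeat=2))

def prob(event):
    # Classical definition: favorable cases divided by all possible cases.
    return Fraction(sum(1 for s in S if event(s)), len(S))

p_sum_11 = prob(lambda s: s[0] + s[1] == 11)
p_sum_at_least_11 = prob(lambda s: s[0] + s[1] >= 11)
print(p_sum_11)           # 1/18
print(p_sum_at_least_11)  # 1/12
```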
Example. In the experiment of drawing a card from a deck with 52 cards compute
the probability of drawing a king or a spade.
Solution. Out of the 52 cards, there are 13 spades and 4 kings. The number of favorable cases is 13 + 4 − 1 = 16 (in order not to count the king of spades twice). The desired probability is therefore 16/52 = 4/13.
Classical probability suffers from a serious limitation, since its definition considers all outcomes to be equiprobable. This is adequate for drawing cards, rolling dice or extracting balls from an urn, but it cannot help us in experiments whose outcomes have unequal probabilities.
The axiomatic definition of probability
The axiomatic approach builds up probability theory from a small number of axioms.
Deﬁnition. Let S be the sample space of a given experiment and let T ⊂ T(S)
be an event space on S.
The function P : T → R which satisfies the following properties:
i) P(A) ≥ 0, ∀ A ∈ T;
ii) P(S) = 1;
iii) P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i) for each sequence (A_i)_{i≥1} ⊆ T of mutually exclusive events (A_i ∩ A_j = ∅, ∀ i ≠ j)
is called a probability function on T.
The previous definition was introduced by the Russian mathematician Andrey Kolmogorov in 1933.
Deﬁnition. (Probability space)
Let S be a sample space of a given experiment, T ⊆ T(S) be an event space
on S and P be a probability function on T. Then the triple (S, T, P) is called a
probability space.
Elementary properties of a probability function
From the previous definition with its three axioms we can deduce many of the properties that one would expect a probability function to have.
Let (S, T, P) be a probability space.
P1) P(∅) = 0.
Proof. Since S = S ∪ ∅ ∪ ∅ ∪ ... and the sequence {S, ∅, ∅, ...} consists of mutually exclusive events, by applying the third axiom of the probability function we get
P(S) = P(S) + Σ_{i=1}^∞ P(∅),
which is equivalent to Σ_{i=1}^∞ P(∅) = 0.
The last equality cannot be true unless P(∅) = 0.
Of course, the fact that the impossible event has probability 0 is natural.
P2) For each pair A, B ∈ T of mutually exclusive events (A ∩ B = ∅) we have
P(A ∪ B) = P(A) + P(B).
Proof. Since A ∪ B = A ∪ B ∪ ∅ ∪ ∅ ∪ ... and P(∅) = 0, we get
P(A ∪ B) = P(A) + P(B) + Σ_{i=1}^∞ P(∅) = P(A) + P(B),
as desired.
P3) ∀ A ∈ T, P(Ā) = 1 − P(A).
Proof. For each A ∈ T we have A ∪ Ā = S and A ∩ Ā = ∅. By applying the previous property and the second axiom we get
1 = P(S) = P(A ∪ Ā) = P(A) + P(Ā),
which implies that P(Ā) = 1 − P(A).
P4) For each A, B ∈ T with A ⊂ B we have:
a) P(B \ A) = P(B) − P(A);
b) P(A) ≤ P(B).
Proof. Since A ⊂ B we get B = A ∪ (B \ A) and A ∩ (B \ A) = ∅. From (P2) we have
P(B) = P(A) + P(B \ A),
and since P(B \ A) ≥ 0 we obtain the inequality P(B) ≥ P(A).
P5) ∀ A ∈ T, 0 ≤ P(A) ≤ 1.
Proof. Clearly, ∅ ⊆ A ⊆ S for every event A. Then the ﬁrst axiom and (P4) give
0 = P(∅) ≤ P(A) ≤ P(S) = 1.
P6) The addition rules of probabilities
i) The case of two events:
∀ A, B ∈ T : P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
ii) The case of three events:
∀ A, B, C ∈ T : P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).
Proof. i) It is clear (by means of a Venn diagram, for example) that
A ∪ B = A ∪ [B \ (A ∩ B)].
Then, by using (P2) and (P4), we get
P(A ∪ B) = P(A) + P(B \ (A ∩ B)) = P(A) + P(B) − P(A ∩ B).
ii) We apply part (i) repeatedly:
P(A ∪ B ∪ C) = P((A ∪ B) ∪ C)
= P(A ∪ B) + P(C) − P((A ∪ B) ∩ C)
= P(A) + P(B) − P(A ∩ B) + P(C) − P((A ∩ C) ∪ (B ∩ C))
= P(A) + P(B) − P(A ∩ B) + P(C) − [P(A ∩ C) + P(B ∩ C) − P((A ∩ C) ∩ (B ∩ C))]
= P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C),
since (A ∩ C) ∩ (B ∩ C) = A ∩ B ∩ C.
We can generalize the addition rules to the case of more than three events.
Theorem. (Poincaré's formula) The probability of the union of any n events A_1, A_2, ..., A_n is given by:
P(∪_{i=1}^n A_i) = Σ_{i=1}^n P(A_i) − Σ_{1≤i<j≤n} P(A_i ∩ A_j) + Σ_{1≤i<j<k≤n} P(A_i ∩ A_j ∩ A_k) − ... + (−1)^{n+1} P(A_1 ∩ A_2 ∩ ... ∩ A_n).
Even though the proof of the theorem (which is by induction) will not be presented, the form of the right-hand side above is clear. First, we sum the probabilities of the individual events, then subtract the probabilities of the intersections of the events taken two at a time (in ascending order of indices), then add the probabilities of the intersections taken three at a time, and continue like this until we add or subtract (depending on n) the probability of the intersection of all n events.
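Poincaré's formula can be checked numerically by comparing the alternating sum against a direct count of the union. A sketch (function names are my own):

```python
from itertools import combinations
from fractions import Fraction

def poincare(events, prob):
    """Inclusion-exclusion: P(A_1 u ... u A_n) from the probabilities of the
    intersections, with alternating signs."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = (-1) ** (k + 1)
        for group in combinations(events, k):
            total += sign * prob(frozenset.intersection(*group))
    return total

# Check on one fair dice: A = even outcomes, C = prime outcomes.
S = frozenset(range(1, 7))
uniform = lambda E: Fraction(len(E), len(S))
A = frozenset({2, 4, 6})
C = frozenset({2, 3, 5})
print(poincare([A, C], uniform))  # 5/6
print(uniform(A | C))             # 5/6, agreeing with a direct count
```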
P7) i) Boole's inequalities
∀ A, B ∈ T : P(A ∪ B) ≤ P(A) + P(B);
ii) ∀ (A_i)_{i≥1} ⊆ T : P(∪_{i=1}^∞ A_i) ≤ Σ_{i=1}^∞ P(A_i).
Proof. i) From the previous property we have:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) ≤ P(A) + P(B).
ii) First, we observe that the union ∪_{i=1}^∞ A_i can be written as a union of mutually exclusive events in the following way:
∪_{i=1}^∞ A_i = A_1 ∪ (A_2 \ A_1) ∪ (A_3 \ (A_1 ∪ A_2)) ∪ ...
So,
P(∪_{i=1}^∞ A_i) = P(A_1) + P(A_2 \ A_1) + P(A_3 \ (A_1 ∪ A_2)) + ...
≤ P(A_1) + P(A_2) + P(A_3) + ... = Σ_{i=1}^∞ P(A_i),
where the inequality follows from (P4).
P8) Bonferroni's inequality
∀ A_1, ..., A_n ∈ T : P(∩_{i=1}^n A_i) ≥ Σ_{i=1}^n P(A_i) − n + 1.
Proof. The proof is by induction.
The first case is n = 1 and reads P(A_1) ≥ P(A_1).
The case n = 2: P(A_1 ∩ A_2) ≥ P(A_1) + P(A_2) − 1.
To prove this we use the addition rule and the fact that P(A_1 ∪ A_2) ≤ 1:
P(A_1 ∩ A_2) = P(A_1) + P(A_2) − P(A_1 ∪ A_2) ≥ P(A_1) + P(A_2) − 1.
The inductive step remains. We assume that the proposition is true for k and show that it follows for k + 1 (we use the case n = 2):
P(A_1 ∩ A_2 ∩ ... ∩ A_{k+1}) = P((A_1 ∩ ... ∩ A_k) ∩ A_{k+1})
≥ P(A_1 ∩ A_2 ∩ ... ∩ A_k) + P(A_{k+1}) − 1
≥ P(A_1) + ... + P(A_k) − k + 1 + P(A_{k+1}) − 1
= Σ_{i=1}^{k+1} P(A_i) − k,
which is what we had to prove.
Next, some examples are presented to illustrate some of the above properties.
Example. (i) Let A, B ∈ T such that P(A) = 0.5, P(B) = 0.4 and P(A ∪ B) = 0.6. Calculate P(A ∩ B).
(ii) If P(A) = 0.5, P(B) = 0.4, P(A \ B) = 0.4 and B ⊂ C, calculate P(A ∪ B̄ ∪ C̄).
Solution. (i) From P(A ∪ B) = P(A) + P(B) − P(A ∩ B) we obtain
P(A ∩ B) = P(A) + P(B) − P(A ∪ B) = 0.5 + 0.4 − 0.6 = 0.3.
(ii) The inclusion B ⊂ C implies C̄ ⊂ B̄ and hence
A ∪ B̄ ∪ C̄ = A ∪ B̄.
In consequence,
P(A ∪ B̄ ∪ C̄) = P(A ∪ B̄) = P(A) + P(B̄) − P(A ∩ B̄)
= P(A) + 1 − P(B) − P(A \ B) = 0.5 + 1 − 0.4 − 0.4 = 0.7.
Example. Consider a biased dice such that the probability of occurrence of a face is directly proportional to the number of points on that face.
Consider the following events:
A = {the occurrence of an even number}
B = {the occurrence of an odd number}
C = {the occurrence of a prime number}
a) Compute the probability of occurrence of each face of the dice.
b) Compute P(A), P(B) and P(C).
c) Compute P(A ∪ C), P(B ∩ C), P(A ∩ B̄).
Solution. a) If we denote P({1}) = p then P({i}) = i · p, i ∈ {2, 3, 4, 5, 6}. On the other hand, since S = {1, ..., 6}, we have that
1 = P(S) = Σ_{i=1}^6 P({i}) = Σ_{i=1}^6 i · p = p(1 + 2 + ... + 6) = 21p.
In conclusion,
P({i}) = i · p = i/21, i = 1, ..., 6.
b) P(A) = P({2, 4, 6}) = P({2}) + P({4}) + P({6}) = 2/21 + 4/21 + 6/21 = 12/21 = 4/7;
P(B) = P({1, 3, 5}) = P({1}) + P({3}) + P({5}) = (1 + 3 + 5)/21 = 9/21 = 3/7;
P(C) = P({2, 3, 5}) = P({2}) + P({3}) + P({5}) = 10/21.
c) Since A ∪ C = {2, 3, 4, 5, 6} is the complement of {1}, we have
P(A ∪ C) = 1 − P({1}) = 1 − 1/21 = 20/21;
P(B ∩ C) = P({3, 5}) = P({3}) + P({5}) = 3/21 + 5/21 = 8/21;
P(A ∩ B̄) = P(A \ B) = P(A) = 4/7.
Example. Consider the experiment of throwing a dice and tossing a coin at the same time.
Determine the following events and compute their probabilities:
a) A: the occurrence of an even number and a head;
b) B: the occurrence of a prime number;
c) C: the occurrence of an odd number and a tail;
d) D: A or B occurs;
e) E: B and C occur;
f) Which events among A, B and C are mutually exclusive?
Solution. The sample space is:
S = {1H, 2H, 3H, 4H, 5H, 6H, 1T, 2T, 3T, 4T, 5T, 6T}.
a) A = {2H, 4H, 6H}, so P(A) = card A / card S = 3/12 = 1/4.
b) B = {2H, 2T, 3H, 3T, 5H, 5T}, so P(B) = card B / card S = 6/12 = 1/2.
c) C = {1T, 3T, 5T}, so P(C) = card C / card S = 3/12 = 1/4.
d) D = A ∪ B = {2H, 4H, 6H, 2T, 3H, 3T, 5H, 5T}, so P(A ∪ B) = 8/12 = 2/3; or, by the addition rule, since A ∩ B = {2H} and P(A ∩ B) = 1/12,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/4 + 1/2 − 1/12 = (3 + 6 − 1)/12 = 8/12 = 2/3.
e) E = B ∩ C = {3T, 5T}, so P(E) = card E / card S = 2/12 = 1/6.
f) A ∩ B = {2H} ≠ ∅, A ∩ C = ∅, B ∩ C = {3T, 5T} ≠ ∅.
Hence, only the events A and C are mutually exclusive.
Example. Let (S, T, P) be a probability space and A, B ∈ T such that
P(A) = 3/8, P(B) = 1/2, P(A ∩ B) = 1/4.
Compute the following probabilities:
a) P(A ∪ B);
b) P(Ā) and P(B̄);
c) P(Ā ∩ B̄);
d) P(Ā ∪ B̄);
e) P(A ∩ B̄);
f) P(B ∩ Ā).
Solution. a) P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 3/8 + 1/2 − 1/4 = (3 + 4 − 2)/8 = 5/8.
b) P(Ā) = 1 − P(A) = 1 − 3/8 = 5/8;
P(B̄) = 1 − P(B) = 1 − 1/2 = 1/2.
c) P(Ā ∩ B̄) = P((A ∪ B)‾) = 1 − P(A ∪ B) = 1 − 5/8 = 3/8.
d) P(Ā ∪ B̄) = P((A ∩ B)‾) = 1 − P(A ∩ B) = 1 − 1/4 = 3/4.
e) P(A ∩ B̄) = P(A \ B) = P(A \ (A ∩ B)) = P(A) − P(A ∩ B) = 3/8 − 1/4 = 1/8.
f) P(B ∩ Ā) = P(B \ A) = P(B \ (A ∩ B)) = P(B) − P(A ∩ B) = 1/2 − 1/4 = 1/4.
Example. The Chevalier de Méré was a mid-seventeenth-century nobleman and gambler who tried to make money gambling with dice. De Méré made money by betting that he could obtain at least one 6 in four rolls of one dice. When people would no longer bet on this game with him, he created a new game: he began to bet that he would get a double 6 in twenty-four rolls of two dice, but he began losing money on it. He asked his friend Blaise Pascal to analyze the game. Pascal did so and asked Pierre de Fermat to work with him. It can be said that the formal study of probability was launched by two mathematicians and a gambler.
We will calculate and compare the probabilities of the following events:
A: we obtain at least one 6 in 4 rolls of one dice;
B: we obtain at least one double 6 in 24 rolls of two dice;
C: we obtain at least one double 6 in 25 rolls of two dice.
For this problem it is easier to determine the probabilities of the contrary events Ā, B̄, C̄.
The event Ā means no 6 is obtained in 4 rolls of one dice. The experiment has 6 · 6 · 6 · 6 = 6^4 possible outcomes. Since in each roll of one dice we have 5 possibilities to obtain no 6, in 4 rolls of the dice we have 5^4 possibilities to get no 6.
Therefore, P(Ā) = 5^4/6^4, which implies
P(A) = 1 − P(Ā) = 1 − 5^4/6^4 ≈ 0.52.
The event B̄ means no double 6 is obtained in 24 rolls of two dice. The number of possible outcomes is 36^24 and the number of favorable outcomes is 35^24. Therefore,
P(B) = 1 − P(B̄) = 1 − 35^24/36^24 ≈ 0.49.
The event C has the probability
P(C) = 1 − (35/36)^25 ≈ 0.505.
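The three probabilities can be reproduced in a few lines, again working through the contrary events; the rounded values match the approximations above:

```python
# De Mere's bets, computed via the contrary events as above.
p_A = 1 - (5 / 6) ** 4      # at least one 6 in 4 rolls of one dice
p_B = 1 - (35 / 36) ** 24   # at least one double 6 in 24 rolls of two dice
p_C = 1 - (35 / 36) ** 25   # at least one double 6 in 25 rolls of two dice
print(round(p_A, 3))  # 0.518: the first bet is favorable
print(round(p_B, 3))  # 0.491: the second bet loses money on average
print(round(p_C, 3))  # 0.506: one extra roll tips it back above 1/2
```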
Example. The birthday problem. If n people are present in a room, what is the probability that no two of them celebrate their birthday on the same day of the year? How large need n be so that this probability is less than 1/2?
Solution. As each person can celebrate his or her birthday on any of 365 days, there is a total of 365^n possible outcomes. (We ignore the possibility of someone being born on February 29.) Assuming that each outcome is equally likely, we see that the number of favorable cases is
365 · (365 − 1) · (365 − 2) · ... · (365 − (n − 1)),
since the first person can celebrate his or her birthday on any day, the second person can celebrate on any day except the first person's birthday, the third person on any day except the first and second persons' birthdays, and so on.
The desired probability is
p_n = (365 · 364 · ... · (365 − n + 1)) / 365^n.
The values of this probability for different values of n can be found in the next table.
n     1    2     5     10    20    23    30    50
p_n   1   0.99  0.97  0.88  0.58  0.49  0.29  0.03
When n ≥ 23, the desired probability is less than 1/2. That is, if there are 23 or more people in a room, then the probability that at least two of them have the same birthday is greater than 1/2.
When there are 50 persons in the room, the probability that at least two have the same birthday is approximately 0.97.
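The values in the table can be recomputed directly from the formula for p_n. A minimal sketch (the function name is my own):

```python
def p_no_shared_birthday(n):
    """p_n = 365 * 364 * ... * (365 - n + 1) / 365**n, computed factor by factor."""
    p = 1.0
    for i in range(n):
        p *= (365 - i) / 365
    return p

for n in (10, 23, 50):
    print(n, round(p_no_shared_birthday(n), 2))   # 0.88, 0.49, 0.03
# 23 is the smallest group size for which p_n drops below 1/2:
print(min(n for n in range(1, 366) if p_no_shared_birthday(n) < 0.5))  # 23
```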
6.2 Conditional probability
In this section we introduce the concept of conditional probability. Conditional probabilities are used to compute probabilities when some partial information concerning the result of the experiment is available.
Conditional probabilities
In the experiment of tossing a pair of unbiased dice, suppose we observe that the first dice is a 2. We want to determine the probability that the sum of the two dice equals 7, given that we already know that the first dice is a 2.
Since the possible outcomes for our experiment are (2,1), (2,2), (2,3), (2,4), (2,5) and (2,6), and the favorable outcome is (2,5), the desired probability is 1/6.
If we consider the events
A: the first dice is 2, and
B: the sum of the dice is 7,
then
A = {(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)},
B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
and A ∩ B = {(2, 5)}.
In the above example we are told that the event A has occurred and we are asked to evaluate the probability of B on the basis of this fact. What is important here is the event A, and given that A has occurred, the event B occurs only if the event {(2, 5)} = A ∩ B occurs. The required probability is then
1/6 = (1/36)/(6/36) = P(A ∩ B)/P(A).
If we denote this quantity by P(B|A) (the probability of B given that A has occurred, or simply the probability of B given A), then
P(B|A) = P(A ∩ B)/P(A).
This example justifies the following definition of conditional probability.
Definition. (Conditional probability) Let (S, T, P) be a probability space and let A, B ∈ T such that P(A) > 0.
The conditional probability of the event B, given that A has occurred, is denoted by P(B|A) (or P_A(B)) and is defined by
P(B|A) = P_A(B) = P(A ∩ B)/P(A).
Replacing A by the sample space S, we obtain the probability of B:
P(B|S) = P(B ∩ S)/P(S) = P(B)/1 = P(B).
Hence, conditional probability is a generalization of the concept of probability, where S is restricted to an event A.
We will see now that the conditional probability is, indeed, a probability.
Proposition. If (S, T, P) is a probability space and A ∈ T is such that P(A) > 0, then P_A : T → R, P_A(B) = P(B|A), is a probability function, too.
Proof. We verify the three axioms of a probability function.
i) P_A(B) = P(B|A) = P(B ∩ A)/P(A) ≥ 0, ∀ B ∈ T.
ii) P_A(S) = P(S|A) = P(S ∩ A)/P(A) = P(A)/P(A) = 1.
iii) Let (B_i)_{i≥1} ⊆ T be a sequence of mutually exclusive events. Then
P_A(∪_{i=1}^∞ B_i) = P(∪_{i=1}^∞ B_i | A) = P((∪_{i=1}^∞ B_i) ∩ A)/P(A)
= P(∪_{i=1}^∞ (B_i ∩ A))/P(A) = (Σ_{i=1}^∞ P(B_i ∩ A))/P(A)
= Σ_{i=1}^∞ P(B_i ∩ A)/P(A) = Σ_{i=1}^∞ P(B_i|A) = Σ_{i=1}^∞ P_A(B_i),
which completes the proof.
From the definition of conditional probability we can derive a simple but very useful result: the so-called multiplication rules.
Theorem. (The multiplication rules) Let (S, T, P) be a probability space.
i) If A, B ∈ T are such that P(A) > 0, then
P(A ∩ B) = P(A) · P(B|A).
ii) If A, B, C ∈ T are such that P(A ∩ B) > 0, then
P(A ∩ B ∩ C) = P(A) · P(B|A) · P(C|A ∩ B).
iii) If A_1, ..., A_n ∈ T are such that P(A_1 ∩ A_2 ∩ ... ∩ A_{n−1}) > 0, then
P(A_1 ∩ A_2 ∩ ... ∩ A_n) = P(A_1) · P(A_2|A_1) · P(A_3|A_1 ∩ A_2) · ... · P(A_n|A_1 ∩ A_2 ∩ ... ∩ A_{n−1}).
Proof. i) P(A) · P(B|A) = P(A) · P(B ∩ A)/P(A) = P(B ∩ A) = P(A ∩ B), as desired.
ii) P(A) · P(B|A) · P(C|A ∩ B) = P(A) · (P(A ∩ B)/P(A)) · (P(A ∩ B ∩ C)/P(A ∩ B)) = P(A ∩ B ∩ C).
Observe that if P(A ∩ B) > 0 then P(A) ≥ P(A ∩ B) > 0, so the previous fractions are well defined.
iii) By induction; left as an exercise.
The importance of the previous theorem lies in the fact that we can calculate the probability of the intersection of n events step by step, by means of conditional probabilities, which is easier.
Next, we present a simple example which illustrates the point.
Example. An urn contains 12 identical balls, of which 6 are red, 4 are green and 2 are yellow. Four balls are extracted from the urn without replacement. Determine the probability that the first ball is green, the second is red, the third is yellow and the last is red.
Solution. If we denote by G_1, R_2, Y_3 and R_4 the events that the first ball is green, the second is red, the third is yellow and the fourth is red, then the desired probability is
P(G_1 ∩ R_2 ∩ Y_3 ∩ R_4) = P(G_1) · P(R_2|G_1) · P(Y_3|G_1 ∩ R_2) · P(R_4|G_1 ∩ R_2 ∩ Y_3)
= (4/12) · (6/11) · (2/10) · (5/9) = 2/99.
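The step-by-step use of the multiplication rule can be sketched in code: the urn is updated after each draw, and the conditional probabilities are multiplied along the way (the function name is my own).

```python
from fractions import Fraction

def chain_probability(counts, draws):
    """Multiply conditional probabilities step by step: after each draw the
    urn has lost the ball just taken (no replacement)."""
    counts = dict(counts)
    p = Fraction(1)
    for colour in draws:
        total = sum(counts.values())
        p *= Fraction(counts[colour], total)  # P(this colour | earlier draws)
        counts[colour] -= 1
    return p

# 6 red, 4 green, 2 yellow; draw green, red, yellow, red in that order.
p = chain_probability({"red": 6, "green": 4, "yellow": 2},
                      ["green", "red", "yellow", "red"])
print(p)  # 2/99
```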
Probability trees
An effective and simple method of applying the probability rules is the probability tree, in which:
• the events are represented by lines (branches)
• the probability along a path is the product of the probabilities on the branches
which form the path
• the sum of the probabilities at the end of the branches which start from the
same point is 1 because all possible events are listed.
We will present first a theoretical example.
Example. Let (S, T, P) be a probability space and let A_1, A_2, A_3, E ∈ T be such that
A_1 ∪ A_2 ∪ A_3 = S, A_i ∩ A_j = ∅, i ≠ j.
We can draw the following probability tree: from the root there are three branches A_1, A_2, A_3, carrying the probabilities P(A_1), P(A_2), P(A_3); each branch A_i then splits into two branches E and Ē, carrying the conditional probabilities P(E|A_i) and P(Ē|A_i). The six paths of the tree correspond to the events E ∩ A_i and Ē ∩ A_i, with
P(E ∩ A_i) = P(A_i) · P(E|A_i) and P(Ē ∩ A_i) = P(A_i) · P(Ē|A_i), i = 1, 2, 3.
Example. A graduate statistics course has 7 male and 3 female students. The professor wants to select two students at random for a research project. By using a probability tree, determine the probabilities of all possible outcomes.
The first branches carry P(F first) = 3/10 and P(M first) = 7/10; the second set of branches carry the conditional probabilities P(F second|F first) = 2/9, P(M second|F first) = 7/9, P(F second|M first) = 3/9 and P(M second|M first) = 6/9. Multiplying along the paths gives:
P(F ∩ F) = (3/10) · (2/9) = 2/30
P(F ∩ M) = (3/10) · (7/9) = 7/30
P(M ∩ F) = (7/10) · (3/9) = 7/30
P(M ∩ M) = (7/10) · (6/9) = 14/30
Note that the first two branches represent the two possibilities, female and male student, on the first choice. The second set of branches represents the two possibilities on the second choice. The probabilities that a female or a male student is chosen first are 3/10 and 7/10, respectively. The probabilities on the second set of branches are conditional probabilities based on the choice of the first student selected.
6.3 The total probability formula. Bayes' formula
We will first analyze the following example.
Example. Suppose a disease is present in 0.1% of a population. A diagnostic test is available but imperfect. The test shows 5% false positives and 1% false negatives. That is, for a patient not having the disease, the test shows positive with probability 0.05 and negative with probability 0.95. For a patient having the disease, the test shows negative with probability 0.01 and positive with probability 0.99.
A person is randomly chosen.
(i) Determine the probabilities of the following configurations: diseased and positive test, diseased and negative test, not diseased and positive test, not diseased and negative test.
(ii) Determine the probability that a person will test positive and the probability that a person will test negative.
(iii) If the chosen person tests positive, what is the probability that he/she is diseased? If the chosen person tests negative, what is the probability that he/she is diseased?
Solution. Let
D: the event that the person is diseased;
T: the event that the test is positive.
We are given the following data:
P(D) = 0.001, P(D̄) = 0.999, P(T|D̄) = 0.05,
P(T̄|D̄) = 0.95, P(T|D) = 0.99, P(T̄|D) = 0.01.
We can represent this information by a probability tree: the first branches are D and D̄ with P(D) = 0.001 and P(D̄) = 0.999; each then splits into T and T̄ with the conditional probabilities above.
(i) Multiplying along the paths:
P(D ∩ T) = 0.99 · 0.001 = 0.00099
P(D ∩ T̄) = 0.01 · 0.001 = 0.00001
P(D̄ ∩ T) = 0.05 · 0.999 = 0.04995
P(D̄ ∩ T̄) = 0.95 · 0.999 = 0.94905
(ii) P(T) = P(D ∩ T) + P(D̄ ∩ T) = 0.00099 + 0.04995 = 0.05094;
P(T̄) = P(D ∩ T̄) + P(D̄ ∩ T̄) = 0.00001 + 0.94905 = 0.94906.
(iii) P(D|T) = P(D ∩ T)/P(T) = 0.00099/0.05094 = 99/5094 ≈ 0.019;
P(D|T̄) = P(D ∩ T̄)/P(T̄) = 0.00001/0.94906 ≈ 0.00001.
Thus only about 1.9% of those persons whose test results are positive actually have the disease. The result is surprising: the proportion is low even though the test is quite good. We will present a second argument which is less rigorous but more illuminating.
Since 0.1% of the population actually has the disease, 1 person out of every 1000 tested will have it (on average). The test confirms that a diseased person has the disease with probability 0.99. Thus out of every 1000 persons tested, the test will confirm that 0.99 persons have the disease. On the other hand, out of the 999 healthy people, the test will state (incorrectly) that 999 · 0.05 = 49.95 have the disease. Hence, for every diseased person that the test correctly states is ill, there are about 50 healthy persons that the test incorrectly states are ill.
Thus, the proportion of correct positives is equal to:
correct positives / (correct positives + incorrect positives) = 0.99/(0.99 + 49.95) = 99/5094 ≈ 0.019.
The fact that the probability P(D|T) is much less than 1 reflects the fact that the test is imperfect. If the test were perfect, i.e.
P(T|D) = P(T̄|D̄) = 1,
then
P(D|T) = P(D ∩ T)/(P(D ∩ T) + P(D̄ ∩ T))
= P(D) · P(T|D) / (P(D) · P(T|D) + P(D̄) · P(T|D̄))
= P(D) · 1 / (P(D) · 1 + P(D̄) · 0) = 1.
By the same reasoning we can see that the probability that a person testing positive has the disease depends on the proportion of people in the population who are actually ill. Let us suppose that the incidence rate of the disease is r. Replacing the proportion 0.001 by r and 0.999 by 1 − r in the above calculation, we obtain
P(D|T) = 0.99r / (0.99r + 0.05(1 − r)) = 99r/(99r + 5 − 5r) = 99r/(5 + 94r).
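The dependence of P(D|T) on the incidence rate r can be explored directly. A sketch (the parameter names are my own; sensitivity stands for P(T|D) and false_positive for P(T|D̄)):

```python
def p_disease_given_positive(r, sensitivity=0.99, false_positive=0.05):
    """Bayes' rule for the test: P(D|T) as a function of the incidence rate r."""
    p_positive = r * sensitivity + (1 - r) * false_positive
    return r * sensitivity / p_positive

print(round(p_disease_given_positive(0.001), 3))  # 0.019, as computed above
print(round(p_disease_given_positive(0.5), 3))    # 0.952: reliable at high incidence
```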
A graph of this function is shown in the figure below.
[Figure: the graph of P(D|T) = 99r/(5 + 94r) for r between 0 and 1; the curve rises quickly from 0 and approaches 1 as r grows.]
We see that the test is reliable only when the incidence rate of the disease is large. Since most diseases have a small incidence rate, the false positive and false negative rates of such tests are very important numbers.
Definition. (Partition of the sample space)
Let (S, T, P) be a probability space. The events {A_1, A_2, ..., A_n} ⊆ T form a partition of S if:
i) A_1 ∪ A_2 ∪ ... ∪ A_n = S;
ii) A_i ∩ A_j = ∅ for i ≠ j;
iii) P(A_i) > 0, i = 1, ..., n.
It is obvious that any event B ∈ T can be expressed in terms of a partition of S:
B = B ∩ S = B ∩ (∪_{i=1}^n A_i) = ∪_{i=1}^n (B ∩ A_i) = ∪_{i=1}^n (A_i ∩ B).
Furthermore, since the events A_i ∩ B are mutually exclusive,
P(B) = P(∪_{i=1}^n (A_i ∩ B)) = Σ_{i=1}^n P(A_i ∩ B) = Σ_{i=1}^n P(A_i) · P(B|A_i).
Thus, we have the following result.
Theorem. (The total probability formula) Let (S, T, P) be a probability space and let {A_1, A_2, ..., A_n} be a partition of S.
Then, for any event B ∈ T we have
P(B) = Σ_{i=1}^n P(A_i) · P(B|A_i).
So, if we know the probabilities P(A_i), i = 1, ..., n, of the partitioning events and the conditional probabilities of B given A_i, then by using the previous formula we can obtain the probability P(B).
The computation in the last example is a particular case of the following general result.
Theorem. (Bayes' formula) Let (S, T, P) be a probability space, let {A_1, A_2, ..., A_n} be a partition of S and let B ∈ T with P(B) > 0.
Then, for all j = 1, ..., n we have
P(A_j|B) = P(B|A_j) · P(A_j) / Σ_{i=1}^n P(B|A_i) · P(A_i).
Proof. We write
P(A_j|B) = P(A_j ∩ B)/P(B) = P(B|A_j) · P(A_j) / Σ_{i=1}^n P(B|A_i) · P(A_i),
according to the definition of conditional probability and to the total probability formula.
The previous formula was first stated by the English clergyman Thomas Bayes, who died in 1761 but whose now famous formula was not published until 1763.
• The probabilities P(A_1), ..., P(A_n) are called the prior probabilities, in the sense that they do not take into account any information about B.
• The probabilities P(A_j|B), j = 1, ..., n, are called the posterior probabilities, in the sense that they are re-evaluations of the respective priors P(A_j) after the event B has occurred.
Example. Two identical urns have the following compositions:
– the first urn contains 10 black and 30 white balls;
– the second urn contains 20 black and 20 white balls.
An urn is selected at random and a ball is taken from it. If the ball is white, what is the probability that it comes from the first urn?
Solution. We consider the following events:
U_1: the event of choosing the first urn;
U_2: the event of choosing the second urn;
W: the event that the extracted ball is white.
We compute the probabilities:
P(U_1) = P(U_2) = 1/2
(since the urns are identical, we have the same chance to select either urn),
P(W|U_1) = 30/40 = 3/4, P(W|U_2) = 20/40 = 1/2.
In order to compute the desired probability P(U_1|W), we apply Bayes' formula:
P(U_1|W) = P(W|U_1) · P(U_1) / (P(U_1) · P(W|U_1) + P(U_2) · P(W|U_2))
= (3/4 · 1/2) / (1/2 · 3/4 + 1/2 · 1/2) = (3/4)/(3/4 + 2/4) = 3/5 = 0.6.
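Bayes' formula translates directly into code. A sketch (the function name is my own) that recomputes the urn example with exact fractions:

```python
from fractions import Fraction

def posterior(priors, likelihoods, j):
    """Bayes' formula: P(A_j|B) = P(B|A_j)P(A_j) / sum_i P(B|A_i)P(A_i)."""
    denom = sum(l * p for l, p in zip(likelihoods, priors))
    return likelihoods[j] * priors[j] / denom

priors = [Fraction(1, 2), Fraction(1, 2)]           # identical urns
likelihoods = [Fraction(30, 40), Fraction(20, 40)]  # P(white | urn i)
print(posterior(priors, likelihoods, 0))  # 3/5
print(posterior(priors, likelihoods, 1))  # 2/5
```

Note that the two posteriors sum to 1, as they must for a partition.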
Example. (Bac Polynésie 2007) In a vacation village, three training courses are proposed to both children and adults. They take place at the same time and their topics are magic, drama and digital photography. 150 persons (90 adults and 60 children) are registered at one of these training courses:
– the course of magic was chosen by one half of the children and by 20% of the adults;
– the course of digital photography was chosen by 27 adults and by 10% of the children.
1. Fill in the following table.
           Magic   Drama   Digital photography   Total
Adults                                             90
Children                                           60
Total                                             150
We choose at random a person registered at one of the training courses. We consider the following events:
A: the chosen person is an adult;
M: the chosen person is registered at the magic course;
D: the chosen person is registered at the drama course;
P: the chosen person is registered at the digital photography course.
2. a) What is the probability that the chosen person is a child?
b) What is the probability that the chosen person takes the digital photography course, given that he/she is an adult?
c) What is the probability that the chosen person is an adult who is registered at the drama course?
3. Compute the probability that the chosen person is registered at the magic course.
4. Compute the probability that the chosen person is a child, given that he/she is registered at the magic course.
5. We choose at random three persons out of the group of 150. What is the probability that exactly one of them takes the magic course?
Solution. 1)
           Magic   Drama   Digital photography   Total
Adults       18      45            27              90
Children     30      24             6              60
Total        48      69            33             150
2) a) P(Ā) = card Ā / card S = 60/150 = 2/5.
b) P(P|A) = P(P ∩ A)/P(A) = (27/150)/(90/150) = 27/90 = 3/10.
c) P(A ∩ D) = card (A ∩ D) / card S = 45/150 = 3/10.
3) P(M) = card M / card S = 48/150 = 8/25.
4) P(Ā|M) = P(Ā ∩ M)/P(M) = (30/150)/(48/150) = 30/48 = 5/8.
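Question 5 is stated above but not solved in the text. Under the natural assumption that the three persons are chosen without replacement and every triple is equally likely, one standard way to count is hypergeometric (this is my own sketch, not the textbook's solution):

```python
from math import comb
from fractions import Fraction

# 48 of the 150 people take the magic course; choose 3 people, all
# triples equally likely; exactly one of the three takes magic.
p = Fraction(comb(48, 1) * comb(102, 2), comb(150, 3))
print(p)          # exact value
print(float(p))   # about 0.448
```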
6.4 Independence
Since we already know what conditional probability means, we can define the notion of independent events. Intuitively, for two events A and B to be independent, the probability of A should not change when the event B occurs, and the probability of B should not change when the event A occurs. A first approach to the definition of independent events is: P(A) = P(A|B). There are two reasons for which this definition is not satisfactory: it is not symmetric in A and B, and P(A|B) cannot be defined when P(B) = 0. We therefore define independence as follows.
Definition. Let (S, T, P) be a probability space. Two events A, B ∈ T are said to be independent if
P(A ∩ B) = P(A) · P(B).
The events A_1, A_2, ..., A_n ∈ T are said to be independent if
P(∩_{j∈J} A_j) = ∏_{j∈J} P(A_j)
for all index sets J ⊆ {1, 2, ..., n}.
Example. We choose a random card from a deck of 52 cards. Let A be the event that the card is a queen and B the event that it is a spade. Then
P(A) = 4/52 = 1/13 and P(B) = 13/52 = 1/4.
The event A ∩ B is the event that we draw the queen of spades, the probability of which is just 1/52. We see that P(A) · P(B) = P(A ∩ B), and hence the events A and B are independent.
The next example explains why, in the definition of independence for more than two events, we need to require
P(∩_{j∈J} A_j) = ∏_{j∈J} P(A_j)
for all index sets J, and not only for sets of size 2.
Example. In the experiment of rolling 2 fair dice, let A be the event that the first dice is even, B the event that the second dice is even and C the event that the sum of the dice is even.
Show that the events A, B and C are pairwise independent, but A, B, C are not independent.
Solution.
P(A) = card A / card S = 18/36 = 1/2, P(B) = card B / card S = 18/36 = 1/2,
A ∩ B = {(2, 2), (2, 4), (2, 6), (4, 2), (4, 4), (4, 6), (6, 2), (6, 4), (6, 6)},
P(A ∩ B) = card (A ∩ B) / card S = 9/36 = 1/4.
Hence A and B are independent, since
P(A ∩ B) = 1/4 = (1/2) · (1/2) = P(A) · P(B).
Also P(C) = 1/2 and A ∩ B = A ∩ C (if the first dice and the sum are both even, so is the second dice), so P(A ∩ C) = 1/4; hence A and C are independent. In the same way we can show that B and C are independent.
Since A ∩ B ∩ C = A ∩ B = A ∩ C = B ∩ C, we have
P(A ∩ B ∩ C) = 1/4 ≠ 1/8 = P(A) · P(B) · P(C),
from which we deduce that the events A, B and C are not independent.
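Pairwise versus mutual independence can be verified by brute-force enumeration of the 36 outcomes. A sketch (function names are my own):

```python
from itertools import product
from fractions import Fraction

S = list(product(range(1, 7), repeat=2))  # two fair dice

def prob(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

def independent(*events):
    joint = prob(lambda s: all(e(s) for e in events))
    prod = Fraction(1)
    for e in events:
        prod *= prob(e)
    return joint == prod

A = lambda s: s[0] % 2 == 0            # first dice even
B = lambda s: s[1] % 2 == 0            # second dice even
C = lambda s: (s[0] + s[1]) % 2 == 0   # sum even

print(independent(A, B), independent(A, C), independent(B, C))  # True True True
print(independent(A, B, C))  # False: P = 1/4 but the product is 1/8
```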
Remark. 1) If P(B) ≠ 0, then A and B are independent if and only if
P(A|B) = P(A).
2) A, B are independent events
⇔ Ā, B are independent events
⇔ A, B̄ are independent events
⇔ Ā, B̄ are independent events.
Proof. 1) Let A, B ∈ T be such that P(B) ≠ 0. Then
A, B are independent ⇔ P(A ∩ B) = P(A) · P(B) ⇔ P(A) = P(A ∩ B)/P(B) ⇔ P(A) = P(A|B).
2) We shall prove just the first equivalence, since the other equivalences can be justified similarly.
Since B = B ∩ S = B ∩ (A ∪ Ā) = (B ∩ A) ∪ (B ∩ Ā), and the events B ∩ A and B ∩ Ā are mutually exclusive, we obtain
P(B) = P(B ∩ A) + P(B ∩ Ā).
Suppose now that A and B are independent, that is, P(A ∩ B) = P(A) · P(B). Then
P(Ā ∩ B) = P(B) − P(B ∩ A) = P(B) − P(B) · P(A) = P(B)(1 − P(A)) = P(B) · P(Ā).
We see that P(Ā ∩ B) = P(Ā) · P(B), and hence the events Ā and B are independent.
Conversely, suppose that Ā and B are independent, that is, P(Ā ∩ B) = P(Ā) · P(B). Then
P(A ∩ B) = P(B) − P(Ā ∩ B) = P(B) − P(Ā) · P(B) = P(B)(1 − P(Ā)) = P(B) · P(A).
We see that P(A ∩ B) = P(A) · P(B), and hence the events A and B are independent.
6.5 Classical probabilistic models. Urn models
In this section we consider random experiments which frequently appear in practical applications, and we calculate the probabilities of their outcomes. The mathematical models that will be used to describe these experiments are urn models (urns containing colored balls of the same weight).
Urn models with replacement

Urn model with two states with replacement

Consider an urn U which contains white and black balls. Let p be the probability of getting a white ball and q the probability of getting a black ball. Since the events of extracting a white ball and of extracting a black ball are contrary events, we have $p + q = 1$.

A trial consists in taking a ball, recording its colour and putting it back into the urn. Consequently, the probability of taking a ball of a specified colour at the first trial is the same as the probability of taking a ball of the same colour at the second trial, and so on. The trials in this experiment are independent.

We want to determine the probability of getting k white balls in n repeated trials ($k \le n$).
We denote by $X_n^k$ the desired event. We have to compute $P(X_n^k)$.
The k white balls can be obtained at any k trials out of the n considered trials.
Denote by $W_j$ the event of getting a white ball at the j-th trial, $j = \overline{1,n}$. The desired event can be written as
$$X_n^k = \bigcup \left(W_{i_1} \cap W_{i_2} \cap \dots \cap W_{i_k} \cap \overline{W}_{i_{k+1}} \cap \dots \cap \overline{W}_{i_n}\right)$$
where $\{i_1, i_2, \dots, i_n\} = \{1, 2, \dots, n\}$.
The previous union contains $C_n^k$ terms, since we can obtain k white balls at any k trials out of the n trials. The trials being independent,
$$P(X_n^k) = \sum P(W_{i_1}) \cdot P(W_{i_2}) \cdots P(W_{i_k}) \cdot P(\overline{W}_{i_{k+1}}) \cdots P(\overline{W}_{i_n}) = \sum \underbrace{p \cdot p \cdots p}_{k \text{ times}} \cdot \underbrace{q \cdot q \cdots q}_{n-k \text{ times}} = C_n^k p^k q^{n-k}.$$
Hence,
$$P(X_n^k) = C_n^k p^k q^{n-k}.$$
Remark 1. The term $C_n^k p^k q^{n-k}$ can be obtained as the general term in the binomial theorem
$$(p + q)^n = \sum_{k=0}^{n} C_n^k p^k q^{n-k}.$$
This is why the previous model is also called the binomial model.
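The binomial formula above can be checked numerically. The following is a minimal Python sketch (the function name is ours, not from the text); by the binomial theorem the probabilities over $k = 0, \dots, n$ must sum to 1.

```python
from math import comb

def binomial_pmf(n: int, k: int, p: float) -> float:
    """P(X_n^k) = C(n, k) * p^k * q^(n-k), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)

# The probabilities over k = 0..n must sum to 1, by the binomial theorem.
n, p = 10, 0.3
total = sum(binomial_pmf(n, k, p) for k in range(n + 1))
```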
Remark 2. Pascal urn model
Consider an urn U which contains white and black balls. Let p be the probability of getting a white ball and q the probability of getting a black ball ($p + q = 1$).
A trial consists in taking a ball from the urn, recording its colour and putting it back into the urn.
We want to determine the probability of getting k ($k \le n$) white balls in n successive trials, with a white ball obtained at the n-th (last) trial. This event can also be described as: get $k - 1$ white balls in the first $n - 1$ trials and a white ball at the n-th trial.
Denote by $Y_n^k$ the desired event. It can be written as
$$Y_n^k = X_{n-1}^{k-1} \cap W_n$$
where $X_{n-1}^{k-1}$ represents the event of obtaining $k - 1$ white balls in $n - 1$ successive trials and $W_n$ is the event of getting a white ball at the n-th trial.
Hence,
$$P(Y_n^k) = P(X_{n-1}^{k-1} \cap W_n) = P(X_{n-1}^{k-1}) \cdot P(W_n) = C_{n-1}^{k-1} p^{k-1} q^{(n-1)-(k-1)} \cdot p = C_{n-1}^{k-1} p^k q^{n-k}.$$
Remark 3. Geometric model
The geometric model is the particular case of the Pascal model in which we take $k = 1$: in n successive trials we get one white ball, which is obtained at the n-th trial.
Hence,
$$P(Y_n^1) = C_{n-1}^{0} \, p \, q^{n-1} = p q^{n-1}.$$
The name "geometric model" comes from the fact that the term $pq^{n-1}$ is the n-th term of a geometric progression whose first term is p and whose ratio is q.
Another way of obtaining $P(Y_n^1)$ is the following. Since $Y_n^1 = \overline{W}_1 \cap \overline{W}_2 \cap \dots \cap \overline{W}_{n-1} \cap W_n$, we obtain
$$P(Y_n^1) = P(\overline{W}_1) \cdot P(\overline{W}_2) \cdots P(\overline{W}_{n-1}) \cdot P(W_n) = \underbrace{q \cdot q \cdots q}_{n-1 \text{ times}} \cdot \, p = q^{n-1} p = p q^{n-1},$$
as we expected.
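As with the binomial model, the geometric probabilities can be verified numerically. A minimal Python sketch (our own function, not from the text): the probabilities $p, pq, pq^2, \dots$ form a geometric progression summing to 1.

```python
def geometric_pmf(n: int, p: float) -> float:
    """P(Y_n^1) = p * q^(n-1): the first white ball appears at trial n."""
    q = 1.0 - p
    return p * q ** (n - 1)

# With p = 2/3: the terms p, pq, pq^2, ... sum (in the limit) to 1.
partial = sum(geometric_pmf(n, 2 / 3) for n in range(1, 60))
```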
Urn model with more than two states with replacement

Consider an urn U which contains balls of s colours $c_1, c_2, \dots, c_s$. Let $p_i$ be the probability of getting a ball of colour $c_i$, $i = \overline{1,s}$. Since the events of extracting a ball of colour $c_i$, $i = \overline{1,s}$, form a partition of the sample space, we have $p_1 + p_2 + \dots + p_s = 1$.
A trial consists in taking a ball, recording its colour and putting it back into the urn.
We want to determine the probability of getting $k_i$ balls of colour $c_i$, $i = \overline{1,s}$, in n repeated trials, where
$$k_1 + k_2 + \dots + k_s = n.$$
First, we count in how many different ways the desired event can be obtained. There are $C_n^{k_1}$ possible choices for the balls of colour $c_1$; for each choice of the balls of the first colour there are $C_{n-k_1}^{k_2}$ possible choices for the balls of colour $c_2$; for each choice of the balls of the first two colours there are $C_{n-k_1-k_2}^{k_3}$ possible choices for the third group; and so on. In consequence, there are
$$C_n^{k_1} C_{n-k_1}^{k_2} C_{n-k_1-k_2}^{k_3} \cdots C_{n-k_1-k_2-\dots-k_{s-1}}^{k_s} = \frac{n!}{k_1!(n-k_1)!} \cdot \frac{(n-k_1)!}{k_2!(n-k_1-k_2)!} \cdot \frac{(n-k_1-k_2)!}{k_3!(n-k_1-k_2-k_3)!} \cdots \frac{(n-k_1-k_2-\dots-k_{s-1})!}{k_s!(n-k_1-\dots-k_s)!} = \frac{n!}{k_1! \, k_2! \cdots k_s!}$$
possible ways in which the desired event can be obtained.
We denote by $X_n^{k_1,k_2,\dots,k_s}$ the desired event. We have to compute $P(X_n^{k_1,k_2,\dots,k_s})$.
Denote by $X_{c_i}^{k_i}$ the event of extracting $k_i$ balls of colour $c_i$ in the n trials, $i = \overline{1,s}$. Then
$$X_n^{k_1,\dots,k_s} = \bigcup \left(X_{c_1}^{k_1} \cap X_{c_2}^{k_2} \cap \dots \cap X_{c_s}^{k_s}\right).$$
The previous union contains $\dfrac{n!}{k_1! \, k_2! \cdots k_s!}$ terms. Hence
$$P(X_n^{k_1,\dots,k_s}) = \sum P(X_{c_1}^{k_1} \cap X_{c_2}^{k_2} \cap \dots \cap X_{c_s}^{k_s}) = \sum p_1^{k_1} p_2^{k_2} \cdots p_s^{k_s} = \frac{n!}{k_1! \, k_2! \cdots k_s!} \, p_1^{k_1} \cdots p_s^{k_s}.$$
Remark 4. The term $\dfrac{n!}{k_1! \, k_2! \cdots k_s!} \, p_1^{k_1} \cdots p_s^{k_s}$ can be obtained as the general term in the multinomial theorem
$$(p_1 + \dots + p_s)^n = \sum_{\substack{(k_1,\dots,k_s) \\ k_1 + k_2 + \dots + k_s = n}} \frac{n!}{k_1! \, k_2! \cdots k_s!} \, p_1^{k_1} p_2^{k_2} \cdots p_s^{k_s}$$
(the above sum is over all nonnegative integer-valued vectors $(k_1, \dots, k_s)$ such that $k_1 + k_2 + \dots + k_s = n$).
This is why the previous model is also called the multinomial model.
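The multinomial probabilities can be checked in the same way as the binomial ones. A minimal Python sketch (our own helper, not from the text): summing over all vectors $(k_1, k_2, k_3)$ with $k_1 + k_2 + k_3 = n$ must give $(p_1 + p_2 + p_3)^n = 1$.

```python
from math import factorial
from itertools import product

def multinomial_pmf(ks, ps):
    """n!/(k_1!...k_s!) * p_1^k_1 * ... * p_s^k_s for counts ks, probs ps."""
    n = sum(ks)
    coef = factorial(n)
    for k in ks:
        coef //= factorial(k)
    out = float(coef)
    for k, p in zip(ks, ps):
        out *= p ** k
    return out

# Sum over all (k_1, k_2, k_3) with k_1 + k_2 + k_3 = n equals 1.
n, ps = 5, (0.2, 0.3, 0.5)
total = sum(multinomial_pmf(ks, ps)
            for ks in product(range(n + 1), repeat=3) if sum(ks) == n)
```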
Poisson urn model

Suppose we have n urns $U_1, \dots, U_n$, each of them containing white and black balls in different proportions. Let $p_i$ be the probability of getting a white ball from the i-th urn and $q_i$ the probability of getting a black ball from the same urn, $i = \overline{1,n}$. Since the previous events are contrary events we have $p_i + q_i = 1$, $i = \overline{1,n}$.
Our experiment consists in taking one ball from each urn (so we get exactly n balls). We want to find the probability of getting k white balls among the selected n balls, $k \le n$. We denote by $X_k$ the desired event.
Since the desired k white balls can be obtained from any k urns, $X_k$ can be written as
$$X_k = \bigcup \left(W_{i_1} \cap W_{i_2} \cap \dots \cap W_{i_k} \cap \overline{W}_{i_{k+1}} \cap \dots \cap \overline{W}_{i_n}\right)$$
where $W_i$ denotes the event of getting a white ball from the i-th urn and $\{i_1, \dots, i_n\}$ is any permutation of the set $\{1, 2, \dots, n\}$.
Hence
$$P(X_k) = \sum p_{i_1} p_{i_2} \cdots p_{i_k} q_{i_{k+1}} \cdots q_{i_n},$$
where the sum is taken over all the permutations $(i_1, \dots, i_n)$ of the set $(1, 2, \dots, n)$.
Remark 5. The previous value can also be obtained as the coefficient of $t^k$ in the following polynomial:
$$(p_1 t + q_1)(p_2 t + q_2) \cdots (p_n t + q_n).$$
Actually, we have the following more general identity:
$$\sum_{k=0}^{n} P(X_k) t^k = \prod_{i=1}^{n} (p_i t + q_i).$$
Remark 6. The Poisson urn model is a generalization of the binomial model. Indeed, if in the Poisson urn model we consider n urns with the same composition, then extracting one ball from each urn is the same as extracting n balls with replacement from one urn.
In this case we have
$$\sum_{k=0}^{n} P(X_k) t^k = \prod_{i=1}^{n} (pt + q) = (pt + q)^n,$$
wherefrom we obtain
$$P(X_k) = C_n^k p^k q^{n-k},$$
as we expected.
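Remark 5 suggests a direct way to compute the Poisson urn probabilities: multiply out the polynomials $(p_i t + q_i)$ and read off the coefficients. A minimal Python sketch (our own function names), which also confirms the reduction to the binomial model when all urns have the same composition:

```python
def poisson_urn_pmf(ps):
    """Coefficients of prod_i (p_i t + q_i): coeff[k] = P(X_k)."""
    coeff = [1.0]
    for p in ps:
        q = 1.0 - p
        new = [0.0] * (len(coeff) + 1)
        for k, c in enumerate(coeff):
            new[k] += c * q       # black ball from this urn: k unchanged
            new[k + 1] += c * p   # white ball from this urn: k increases
        coeff = new
    return coeff

# Equal urns reduce to the binomial model: coeff[k] = C(n,k) p^k q^(n-k).
coeff = poisson_urn_pmf([0.4] * 6)
```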
Urn model without replacement

Urn model with two states without replacement

Let U be an urn which contains a white balls and b black balls. A trial consists in taking a ball from the urn, recording its colour and not replacing the ball into the urn. We want to find the probability of getting k white balls and $l = n - k$ black balls in n successive trials. The numbers k and l must satisfy the conditions
$$n = k + l, \quad k \le a, \quad l \le b.$$
We observe that, since the extracted ball is not replaced, our experiment (which consists of n successive trials) can equivalently be performed by taking n balls at a time.
By using the previous remark and the classical definition of probability, we get that the probability of the desired event $X_{a,b}^{k,l}$ is
$$P(X_{a,b}^{k,l}) = \frac{\text{no. of favorable outcomes}}{\text{no. of possible outcomes}} = \frac{C_a^k \, C_b^l}{C_{a+b}^n}.$$
Indeed, the number of possibilities of taking k white balls is $C_a^k$, and for each choice of k white balls there are $C_b^l$ possibilities of taking l black balls, so the number of favorable cases is $C_a^k \, C_b^l$, as mentioned before.
In conclusion:
$$P(X_{a,b}^{k,l}) = \frac{C_a^k \, C_b^l}{C_{a+b}^n}, \quad n = k + l, \ k \le a, \ l \le b.$$
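The formula above (the hypergeometric probabilities) can be checked numerically. A minimal Python sketch (the function name is ours); the probabilities over all feasible k must sum to 1, by the Vandermonde identity.

```python
from math import comb

def hypergeom_pmf(a: int, b: int, k: int, l: int) -> float:
    """P = C(a,k) * C(b,l) / C(a+b, k+l): k white from a, l black from b."""
    return comb(a, k) * comb(b, l) / comb(a + b, k + l)

# Drawing n = 3 balls from an urn with a = 4 white and b = 6 black balls:
total = sum(hypergeom_pmf(4, 6, k, 3 - k) for k in range(0, 4))
```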
The previous model can be easily generalized to more than two states as follows.
Urn model with more than two states without replacement

Let U be an urn which contains $a_i$ balls of colour $c_i$, $i = \overline{1,s}$. The experiment is similar to the previous one: a trial consists in taking a ball from the urn, recording its colour and not replacing it into the urn. We want to find the probability of getting $k_i$ balls of colour $c_i$, $i = \overline{1,s}$, in n successive trials. The numbers $k_i$, $i = \overline{1,s}$, must satisfy the conditions $k_1 + k_2 + \dots + k_s = n$ and $k_i \le a_i$ for each $i = \overline{1,s}$.
The probability of the desired event $X_{a_1,a_2,\dots,a_s}^{k_1,k_2,\dots,k_s}$ can be obtained by the same reasoning as in the previous example:
$$P(X_{a_1,a_2,\dots,a_s}^{k_1,k_2,\dots,k_s}) = \frac{C_{a_1}^{k_1} C_{a_2}^{k_2} \cdots C_{a_s}^{k_s}}{C_{a_1+a_2+\dots+a_s}^{n}}, \quad k_1 + k_2 + \dots + k_s = n \ \text{ and } \ k_i \le a_i, \ i = \overline{1,s}.$$
Examples

Example 1. Suppose that the probability that an item produced by a certain production line is defective is 0,1. Find the probability that a sample of 10 items contains at most 1 defective item.
Solution. We use the binomial model (the urn model with two states with replacement) since the probability of obtaining a defective item at each trial is the same.
Since we are interested in at most 1 defective item, we have to compute the probability of the event $X_{10}^0 \cup X_{10}^1$.
The desired probability is
$$P(X_{10}^0 \cup X_{10}^1) = P(X_{10}^0) + P(X_{10}^1) = C_{10}^0 (0,1)^0 (0,9)^{10} + C_{10}^1 (0,1)(0,9)^9 = 0,9^{10} + 10 \cdot 0,1 \cdot 0,9^9 = 1,9 \cdot 0,9^9 \approx 0,736.$$
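A quick numerical check of Example 1 (a Python sketch, not part of the text):

```python
from math import comb

p, n = 0.1, 10
q = 1 - p
# P(at most one defective) = P(X_10^0) + P(X_10^1)
prob = comb(n, 0) * q**n + comb(n, 1) * p * q**(n - 1)
# This equals 1.9 * 0.9^9, approximately 0.736.
```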
Example 2. An urn contains 10 white and 5 black balls. Balls are randomly selected, one at a time, until a white ball is obtained. If we assume that each selected ball is replaced before the next ball is drawn, what is the probability that:
(a) exactly 3 extractions are needed;
(b) at least 3 extractions are needed?
Solution. We use the geometric model, with
$$p = \frac{10}{15} = \frac{2}{3} \quad \text{and} \quad q = \frac{5}{15} = \frac{1}{3}.$$
a) $P(Y_3^1) = p \cdot q^2 = \dfrac{2}{3} \left(\dfrac{1}{3}\right)^2 = \dfrac{2}{27}$.
b) The desired event is
$$Y = Y_3^1 \cup Y_4^1 \cup \dots = \bigcup_{n=3}^{\infty} Y_n^1.$$
It is easier to compute the probability of the contrary event, which is: at most two extractions are needed,
$$\overline{Y} = Y_1^1 \cup Y_2^1.$$
$$P(Y) = 1 - P(\overline{Y}) = 1 - P(Y_1^1 \cup Y_2^1) = 1 - (P(Y_1^1) + P(Y_2^1)) = 1 - (p + pq) = 1 - p - pq = q - pq = q^2 = \frac{1}{9}.$$
Example 3. The probabilities that three men hit a target are respectively $\dfrac{1}{6}$, $\dfrac{1}{4}$ and $\dfrac{1}{3}$. Each shoots once at the target.
a) Find the probability that exactly one of them hits the target.
b) Find the probability that at most two of them hit the target.
c) If only one hit the target, what is the probability that it was the first man?
Solution. We use the Poisson model. With the notations introduced before we have:
$$p_1 = \frac{1}{6}, \ q_1 = \frac{5}{6}, \quad p_2 = \frac{1}{4}, \ q_2 = \frac{3}{4}, \quad p_3 = \frac{1}{3}, \ q_3 = \frac{2}{3}.$$
a) $P(X_1)$ is the coefficient of $t^1$ in the polynomial
$$(p_1 t + q_1)(p_2 t + q_2)(p_3 t + q_3) = \left(\frac{1}{6} t + \frac{5}{6}\right)\left(\frac{1}{4} t + \frac{3}{4}\right)\left(\frac{1}{3} t + \frac{2}{3}\right),$$
hence
$$P(X_1) = \frac{1}{6} \cdot \frac{3}{4} \cdot \frac{2}{3} + \frac{5}{6} \cdot \frac{1}{4} \cdot \frac{2}{3} + \frac{5}{6} \cdot \frac{3}{4} \cdot \frac{1}{3} = \frac{31}{72}.$$
We denote by $M_i$ the event that the target was hit by the i-th man, $i = \overline{1,3}$. The previous probability can also be computed directly as follows:
$$X_1 = (M_1 \cap \overline{M}_2 \cap \overline{M}_3) \cup (\overline{M}_1 \cap M_2 \cap \overline{M}_3) \cup (\overline{M}_1 \cap \overline{M}_2 \cap M_3)$$
$$P(X_1) = P(M_1) P(\overline{M}_2) P(\overline{M}_3) + P(\overline{M}_1) P(M_2) P(\overline{M}_3) + P(\overline{M}_1) P(\overline{M}_2) P(M_3) = \frac{1}{6} \cdot \frac{3}{4} \cdot \frac{2}{3} + \frac{5}{6} \cdot \frac{1}{4} \cdot \frac{2}{3} + \frac{5}{6} \cdot \frac{3}{4} \cdot \frac{1}{3} = \frac{31}{72}.$$
b) $B = X_0 \cup X_1 \cup X_2$. We compute the probability of the contrary event $\overline{B} = X_3$:
$$P(B) = 1 - P(\overline{B}) = 1 - P(X_3) = 1 - \frac{1}{6} \cdot \frac{1}{4} \cdot \frac{1}{3} = \frac{71}{72}.$$
c)
$$P(M_1 | X_1) = \frac{P(M_1 \cap X_1)}{P(X_1)} = \frac{P(M_1 \cap \overline{M}_2 \cap \overline{M}_3)}{P(X_1)} = \frac{P(M_1) P(\overline{M}_2) P(\overline{M}_3)}{P(X_1)} = \frac{\frac{1}{6} \cdot \frac{3}{4} \cdot \frac{2}{3}}{\frac{31}{72}} = \frac{6}{31}.$$
Example 4. In a lottery, among 100 tickets, 25 are winning tickets. A person buys 4 tickets from this lottery. Find the probability that at least one ticket is a winning one.
Solution. We use the urn model with two states without replacement. The desired event is
$$A = X_4^1 \cup X_4^2 \cup X_4^3 \cup X_4^4.$$
We compute the probability of the contrary event $\overline{A} = X_4^0$:
$$P(A) = 1 - P(\overline{A}) = 1 - P(X_4^0) = 1 - \frac{C_{25}^0 \, C_{75}^4}{C_{100}^4} = 1 - \frac{C_{75}^4}{C_{100}^4}.$$
Example 5. In the last 30 years, the probability that a newborn is a girl is 0,52. A family has 5 babies born in the last 30 years. What is the probability that:
a) the fifth born child is the second boy of the family;
b) the first boy is the 4th newborn;
c) the last baby born in the family is a boy?
Solution. We use the Pascal model with
$$p = 1 - 0,52 = 0,48, \quad q = 0,52.$$
a) $P(Y_5^2) = C_4^1 \, p^2 q^3 = 4 \cdot 0,48^2 \cdot 0,52^3$
b) $P(Y_4^1) = C_3^0 \, p \, q^3 = 0,48 \cdot 0,52^3$
c) $p = 0,48$.
Example 6. A die is rolled fourteen times.
a) What is the probability of obtaining exactly one 6?
b) What is the probability of obtaining face 4 four times, face 6 twice and face 3 six times?
Solution. a) We use the urn model with 2 states (face 6 and the other faces) with replacement. In this case we have $n = 14$, $k = 1$, $p = \dfrac{1}{6}$ and $q = \dfrac{5}{6}$.
$$P(X_{14}^{1,13}) = C_{14}^1 \left(\frac{1}{6}\right)^1 \left(\frac{5}{6}\right)^{13} = 14 \cdot \frac{1}{6} \left(\frac{5}{6}\right)^{13} = \frac{14 \cdot 5^{13}}{6^{14}}.$$
b) We use the urn model with 4 states (face 4, face 6, face 3 and the other faces) with replacement. In this case we have
$$n = 14, \ k_1 = 4, \ k_2 = 2, \ k_3 = 6, \ k_4 = n - k_1 - k_2 - k_3 = 14 - 4 - 2 - 6 = 2, \quad p_1 = p_2 = p_3 = \frac{1}{6}, \ p_4 = \frac{1}{2}$$
$$P(X_{14}^{4,2,6,2}) = \frac{14!}{4! \, 2! \, 6! \, 2!} \left(\frac{1}{6}\right)^4 \left(\frac{1}{6}\right)^2 \left(\frac{1}{6}\right)^6 \left(\frac{1}{2}\right)^2.$$
Miscellaneous examples

Example 1. Among the 20 students of a group, 6 speak English, 5 speak French and 2 speak German.
If we choose one student at random, what is the probability that he/she knows a foreign language (English, French or German)?
Solution. Let E, F, G be the considered events. Then
$$P(E) = \frac{6}{20}, \quad P(F) = \frac{5}{20}, \quad P(G) = \frac{2}{20}.$$
If the desired event is denoted X, then $X = E \cup F \cup G$. Since the pairs of events E and F, E and G, F and G are not mutually exclusive, we use the addition rule for computing the probability of the union $E \cup F \cup G$:
$$P(X) = P(E \cup F \cup G) = P(E) + P(F) + P(G) - P(E \cap F) - P(E \cap G) - P(F \cap G) + P(E \cap F \cap G).$$
The events E, F and G are independent, hence:
$$P(X) = P(E) + P(F) + P(G) - P(E)P(F) - P(E)P(G) - P(F)P(G) + P(E)P(F)P(G)$$
$$= \frac{6 + 5 + 2}{20} - \frac{6 \cdot 5 + 6 \cdot 2 + 5 \cdot 2}{20 \cdot 20} + \frac{6 \cdot 5 \cdot 2}{20 \cdot 20 \cdot 20} = \frac{13}{20} - \frac{52}{400} + \frac{3}{400} = \frac{211}{400}.$$
Example 2. We are given 3 urns which contain white and black balls as follows: $U_1(a, b)$, $U_2(c, d)$ and $U_3(e, f)$. A ball is drawn from the third urn. If the selected ball is white, it is placed in the first urn and the second ball is drawn from the urn $U_1$. If the first ball is black, it is placed in the second urn, wherefrom the second ball is drawn.
Determine the probability of the following events:
a) the second ball is white;
b) the first ball is white given that the second ball is black.
Solution. We use the following notations:
$W_i$ — the event that the i-th ball is white, $i = 1, 2$;
$B_i$ — the event that the i-th ball is black, $i = 1, 2$.
a) We apply the total probability formula, where the partition is $\{W_1, B_1\}$:
$$P(W_2) = P(W_1) \cdot P(W_2|W_1) + P(B_1) \cdot P(W_2|B_1) = \frac{e}{e+f} \cdot \frac{a+1}{a+b+1} + \frac{f}{e+f} \cdot \frac{c}{c+d+1}.$$
In the same way we get:
$$P(B_2) = P(W_1) \cdot P(B_2|W_1) + P(B_1) \cdot P(B_2|B_1) = \frac{e}{e+f} \cdot \frac{b}{a+b+1} + \frac{f}{e+f} \cdot \frac{d+1}{c+d+1}.$$
b) According to Bayes' rule we have
$$P(W_1|B_2) = \frac{P(W_1) \cdot P(B_2|W_1)}{P(B_2)} = \frac{\dfrac{e}{e+f} \cdot \dfrac{b}{a+b+1}}{\dfrac{e}{e+f} \cdot \dfrac{b}{a+b+1} + \dfrac{f}{e+f} \cdot \dfrac{d+1}{c+d+1}}.$$
Example 3. An urn contains a white balls ($a \ge 3$) and b black balls. If we extract 3 balls without replacement, what is the probability that all three balls are white?
Solution. Denote by $W_i$ the event that the i-th ball is white, $i = \overline{1,3}$, and by X the desired event. Then
$$X = W_1 \cap W_2 \cap W_3.$$
By applying the multiplication rule for probabilities we obtain:
$$P(X) = P(W_1 \cap W_2 \cap W_3) = P(W_1) \cdot P(W_2|W_1) \cdot P(W_3|W_1 \cap W_2) = \frac{a}{a+b} \cdot \frac{a-1}{a+b-1} \cdot \frac{a-2}{a+b-2}.$$
Example 4. Three machines A, B and C produce respectively 40%, 30% and 30% of the total number of items of a factory. The percentages of defective output of these machines are 2%, 4% and 5%. Suppose an item is selected at random and is found to be defective. Find the probability that the item was produced by machine A.
Solution. We are given
$$P(A) = 0,4, \quad P(B) = 0,3, \quad P(C) = 0,3$$
$$P(D|A) = 0,02, \quad P(D|B) = 0,04, \quad P(D|C) = 0,05.$$
[Tree diagram: first-level branches A, B, C with probabilities 0,4, 0,3, 0,3; from each machine a second-level branch to D or $\overline{D}$, with $P(D|A) = 0,02$, $P(D|B) = 0,04$, $P(D|C) = 0,05$, producing the events $A \cap D$, $A \cap \overline{D}$, $B \cap D$, $B \cap \overline{D}$, $C \cap D$, $C \cap \overline{D}$.]
By using Bayes' formula and the total probability formula we get:
$$P(A|D) = \frac{P(A) \cdot P(D|A)}{P(D)} = \frac{P(A) \cdot P(D|A)}{P(A) P(D|A) + P(B) P(D|B) + P(C) P(D|C)} = \frac{0,4 \cdot 0,02}{0,4 \cdot 0,02 + 0,3 \cdot 0,04 + 0,3 \cdot 0,05} \approx 0,229.$$
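The total probability formula and Bayes' formula used in Example 4 translate directly into code. A minimal Python sketch (dictionary names are ours):

```python
# Priors and defect rates from Example 4 (machines A, B, C).
priors = {"A": 0.4, "B": 0.3, "C": 0.3}
defect = {"A": 0.02, "B": 0.04, "C": 0.05}

# Total probability formula: P(D) = sum over machines M of P(M) * P(D|M).
p_d = sum(priors[m] * defect[m] for m in priors)

# Bayes' formula: P(A|D) = P(A) * P(D|A) / P(D), approximately 0.229.
p_a_given_d = priors["A"] * defect["A"] / p_d
```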
Example 5. A student takes a multiple-choice exam. Suppose that for each question he either knows the answer or gambles and chooses an option at random (each question has exactly 4 choices, one of which is the correct answer). To pass, students need to answer at least 60% of the questions correctly. The student has "studied for a minimal pass", that is, with probability 0,6 he knows the answer to a question. Given that he answers a question correctly, what is the probability that he actually knows the answer?
Solution. Let C and K denote, respectively, the events that the student answers the question correctly and that he knows the answer. Now
$$P(K|C) = \frac{P(K) \cdot P(C|K)}{P(C)} = \frac{P(K) \cdot P(C|K)}{P(K) P(C|K) + P(\overline{K}) P(C|\overline{K})} = \frac{0,6 \cdot 1}{0,6 \cdot 1 + 0,4 \cdot 0,25} = \frac{0,6}{0,6 + 0,1} = \frac{6}{7} \approx 0,857.$$
Example 6. Suppose A and B are events with $0 < P(A) < 1$ and $0 < P(B) < 1$.
a) If A and B are independent, can they be mutually exclusive?
b) If A and B are mutually exclusive, can they be independent?
c) If $A \subset B$, can A and B be independent?
Solution. a) No. Since A and B are independent,
$$P(A \cap B) = P(A) \cdot P(B) \neq 0,$$
wherefrom $A \cap B \neq \emptyset$.
b) No. Since A and B are mutually exclusive,
$$P(A \cap B) = 0 \neq P(A) \cdot P(B).$$
c) No. If $A \subset B$, then
$$P(A \cap B) = P(A) \neq P(A) \cdot P(B).$$
Example 7. Suppose that each of three men at a party throws his hat into the center of the room. The hats are first mixed up and then each man randomly selects a hat. What is the probability that none of the three men selects his own hat?
Solution. Let $H_i$, $i = \overline{1,3}$, be the event that the i-th man selects his own hat. We have to compute the probability of the event $\overline{H}_1 \cap \overline{H}_2 \cap \overline{H}_3$. In order to do that we compute the probability of the contrary event:
$$\overline{\overline{H}_1 \cap \overline{H}_2 \cap \overline{H}_3} = \overline{\overline{H}_1} \cup \overline{\overline{H}_2} \cup \overline{\overline{H}_3} = H_1 \cup H_2 \cup H_3.$$
Hence
$$P(\overline{H}_1 \cap \overline{H}_2 \cap \overline{H}_3) = 1 - P(H_1 \cup H_2 \cup H_3).$$
To calculate $P(H_1 \cup H_2 \cup H_3)$ we apply the addition rule:
$$P(H_1 \cup H_2 \cup H_3) = P(H_1) + P(H_2) + P(H_3) - P(H_1 \cap H_2) - P(H_1 \cap H_3) - P(H_2 \cap H_3) + P(H_1 \cap H_2 \cap H_3).$$
It remains to compute $P(H_i \cap H_j)$, $i \neq j$, and $P(H_1 \cap H_2 \cap H_3)$.
For each $i, j \in \{1, 2, 3\}$, $i \neq j$, we have:
$$P(H_i \cap H_j) = P(H_i) \cdot P(H_j | H_i) = \frac{1}{3} \cdot \frac{1}{2} = \frac{1}{6}$$
$$P(H_1 \cap H_2 \cap H_3) = P(H_1) \cdot P(H_2|H_1) \cdot P(H_3 | H_1 \cap H_2) = \frac{1}{3} \cdot \frac{1}{2} \cdot 1 = \frac{1}{6}.$$
Now we have:
$$P(H_1 \cup H_2 \cup H_3) = \frac{1}{3} + \frac{1}{3} + \frac{1}{3} - \frac{1}{6} - \frac{1}{6} - \frac{1}{6} + \frac{1}{6} = \frac{2}{3}.$$
Hence, the probability that none of the men selects his own hat is
$$1 - \frac{2}{3} = \frac{1}{3}.$$
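The answer 1/3 can be confirmed by brute force, since with three hats there are only $3! = 6$ equally likely assignments. A Python sketch (not from the text):

```python
from itertools import permutations

# Count the assignments in which no man receives his own hat
# (the derangements of {0, 1, 2}).
n = 3
derangements = sum(
    all(hat != man for man, hat in enumerate(perm))
    for perm in permutations(range(n))
)
prob = derangements / 6  # 3! = 6 equally likely assignments
```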
Example 8. In the poker game (dealing 5 cards from a well-shuffled deck of 52 cards) find the following probabilities:
a) the hand is all spades;
b) the hand is a flush;
c) the hand is a full house.
Solution. To calculate the probability associated with any particular hand, we first need to calculate how many hands can be dealt. Since the order of cards is irrelevant we use combinations:
$$C_{52}^5 = \frac{52!}{5! \, 47!} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 2\,598\,960.$$
a) We first determine in how many ways we can select 5 spades: $C_{13}^5 = 1287$. Thus, the probability of a hand of spades is
$$\frac{C_{13}^5}{C_{52}^5} = \frac{1287}{2598960} \approx 0,0005.$$
b) A flush is a hand of five cards all in the same suit. The probability of a flush is
$$\frac{4 \cdot C_{13}^5}{C_{52}^5} = \frac{4 \cdot 1287}{2598960} \approx 0,002.$$
c) A full house consists of three of a kind and one pair. There are thirteen values that the three of a kind may have and then twelve possible values for the pair. This gives $13 \, C_4^3 \cdot 12 \, C_4^2$ favorable hands, and in consequence the probability of a full house is equal to:
$$\frac{13 \, C_4^3 \cdot 12 \, C_4^2}{C_{52}^5} = \frac{13 \cdot 4 \cdot 12 \cdot 6}{2598960} \approx 0,0014.$$
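The counts in Example 8 are easy to reproduce with a combinations function. A minimal Python sketch (variable names are ours):

```python
from math import comb

hands = comb(52, 5)                       # 2,598,960 possible hands
p_spades = comb(13, 5) / hands            # all five cards are spades
p_flush = 4 * comb(13, 5) / hands         # five cards in one of four suits
p_full = 13 * comb(4, 3) * 12 * comb(4, 2) / hands  # three of a kind + pair
```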
Example 9 (Probability as a continuous set function).
A sequence of events $\{A_n\}_{n \ge 1}$ is said to be an increasing sequence if $A_1 \subset A_2 \subset \dots \subset A_n \subset A_{n+1} \subset \dots$, and it is said to be a decreasing sequence if $A_1 \supset A_2 \supset \dots \supset A_n \supset A_{n+1} \supset \dots$. If $\{A_n\}_{n \ge 1}$ is an increasing sequence of events, then we define the new event $\lim_{n \to \infty} A_n$ by $\lim_{n \to \infty} A_n = \bigcup_{n \ge 1} A_n$. Similarly, if $\{A_n\}_{n \ge 1}$ is a decreasing sequence of events, then $\lim_{n \to \infty} A_n$ is defined by $\lim_{n \to \infty} A_n = \bigcap_{n \ge 1} A_n$.
Prove that if $\{A_n\}_{n \ge 1}$ is either an increasing or a decreasing sequence of events, then
$$\lim_{n \to \infty} P(A_n) = P\left(\lim_{n \to \infty} A_n\right).$$
Solution. Suppose first that $\{A_n\}_{n \ge 1}$ is an increasing sequence and define the events $B_n$, $n \ge 1$, by
$$B_1 = A_1, \quad B_n = A_n \setminus A_{n-1}, \ n > 1.$$
It is easy to verify that the events $\{B_n\}_{n \ge 1}$ are mutually exclusive and satisfy
$$\bigcup_{i=1}^{\infty} A_i = \bigcup_{i=1}^{\infty} B_i \quad \text{and} \quad \bigcup_{i=1}^{n} A_i = \bigcup_{i=1}^{n} B_i, \ n \ge 1.$$
Hence
$$P\left(\lim_{n \to \infty} A_n\right) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = P\left(\bigcup_{i=1}^{\infty} B_i\right) = \sum_{i=1}^{\infty} P(B_i) = \lim_{n \to \infty} \sum_{i=1}^{n} P(B_i) = \lim_{n \to \infty} P\left(\bigcup_{i=1}^{n} B_i\right) = \lim_{n \to \infty} P\left(\bigcup_{i=1}^{n} A_i\right) = \lim_{n \to \infty} P(A_n),$$
which proves the result when $\{A_n\}_{n \ge 1}$ is increasing.
If $\{A_n\}_{n \ge 1}$ is a decreasing sequence, then $\{\overline{A}_n\}_{n \ge 1}$ is an increasing sequence, hence
$$P\left(\bigcup_{i=1}^{\infty} \overline{A}_i\right) = \lim_{n \to \infty} P(\overline{A}_n).$$
Since $\bigcup_{n=1}^{\infty} \overline{A}_n = \overline{\bigcap_{n=1}^{\infty} A_n}$, the previous equality becomes
$$1 - P\left(\bigcap_{i=1}^{\infty} A_i\right) = \lim_{n \to \infty} \left(1 - P(A_n)\right) = 1 - \lim_{n \to \infty} P(A_n),$$
which proves the result.
Chapter 7

Random variables

In general, in performing an experiment we are often interested not in the outcomes themselves, but rather in some function of them. For example, suppose one plays a game where the payoff is a function of the number of dots on two dice: one receives 2 euros if the total number of dots equals 2 or 3; one receives 4 euros if the total number of dots equals 4, 5, 6 or 7; and one has to pay 8 euros otherwise. The payoff is a function of the total number of dots on the dice. In order to compute the probability that the payoff equals some number, we compute the probability that the total number of dots corresponds to the selected number. This leads to the notion of random variable.

Definition. Let $(S, T, P)$ be a probability space.
A random variable is a (measurable) function from the probability space to the real numbers:
$$X : S \to \mathbb{R} \ \text{ such that for each } x \in \mathbb{R}, \ \{s : X(s) < x\} \in T.$$
Random variables are denoted by capital letters, such as X, Y, Z, U, V and W.

Example. In the particular example described before we obtain the following random variable:
$X : S \to \{2, 4, -8\}$
$X(s) = 2$, for each $s \in A = \{(1,1), (1,2), (2,1)\}$
$X(s) = 4$, for each $s \in B = \{(2,2), (1,3), (3,1), (1,4), (2,3), (3,2), (4,1), (1,5), (2,4), (3,3), (4,2), (5,1), (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)\}$
$X(s) = -8$ otherwise, i.e. for $s \in C = S \setminus (A \cup B)$.
Since in this experiment T is the family of all subsets of S, the condition "$\forall x \in \mathbb{R}, \{s \in S \mid X(s) < x\} \in T$" is fulfilled.
For short, we shall use the notation $\{X < x\}$ for the event $\{s \in S \mid X(s) < x\}$.
Remark. (Events defined by a random variable)
Let $(S, T, P)$ be a probability space and let $X : S \to \mathbb{R}$ be a random variable. If $x \in \mathbb{R}$, then:
a) $\{X \le x\} = \{s \in S \mid X(s) \le x\} \in T$
b) $\{X = x\} = \{s \in S \mid X(s) = x\} \in T$
c) $\{X \ge x\}, \{X > x\} \in T$
d) $\{X \le x\} \cup \{X > x\} = S$; $\{X \le x\} \cap \{X > x\} = \emptyset$;
$\{X \ge x\} \cup \{X < x\} = S$; $\{X \ge x\} \cap \{X < x\} = \emptyset$.
Proof. We use the definition of the $\sigma$-field T and the properties of event operations.
a) $\{X \le x\} = \bigcap_{n > 0} \left\{X < x + \dfrac{1}{n}\right\} \in T$
b) $\{X = x\} = \{X \le x\} \setminus \{X < x\} \in T$
c) $\{X \ge x\} = \overline{\{X < x\}} \in T$; $\{X > x\} = \overline{\{X \le x\}} \in T$
d) Obvious.
7.1 Discrete random variables

Definition. Let $(S, T, P)$ be a probability space and let $X : S \to \mathbb{R}$ be a random variable.
X is said to be a discrete random variable (d.r.v.) if $X(S) = M \subseteq \mathbb{R}$ is finite or countable.
A set M is countable if there is a one-to-one correspondence between M and $\mathbb{N}$ (the set of natural numbers).
In consequence, a random variable $X : S \to M \subseteq \mathbb{R}$ is a discrete random variable if
$$M = \{x_i \mid i \in I \subseteq \mathbb{N}\}.$$
Definition. (Probability mass function (p.m.f.))
Let $X : S \to M$, $M = \{x_i \mid i \in I \subseteq \mathbb{N}\}$, be a d.r.v.
The function $f = f_X : M \to \mathbb{R}$ defined by
$$f(x_i) = P(\{X = x_i\}), \ i \in I,$$
is called the probability mass function of the d.r.v. X.
Example 1. The random variable described before is a d.r.v. since $M = \{2, 4, -8\}$. Its p.m.f. is $f : \{2, 4, -8\} \to \mathbb{R}$ defined by
$$f(2) = P(A) = \frac{3}{36} = \frac{1}{12}, \quad f(4) = P(B) = \frac{18}{36} = \frac{1}{2}, \quad f(-8) = P(C) = \frac{15}{36} = \frac{5}{12}.$$
Theorem. (Properties of the p.m.f.)
Let $X : S \to M = \{x_i \mid i \in I \subseteq \mathbb{N}\}$ be a d.r.v. and let $f_X = f$ be its probability mass function. Then
1) $f(x_i) \ge 0$, $\forall i \in I$;
2) $\sum_{i \in I} f(x_i) = 1$.
Proof. We use the properties of the probability function.
1) $f(x_i) = P(X = x_i) \ge 0$.
2) We remark first that the events $\{X = x_i\}_{i \in I}$ form a partition of the sample space S. By using this fact, we get that
$$\sum_{i \in I} f(x_i) = \sum_{i \in I} P(X = x_i) = P\left(\bigcup_{i \in I} \{X = x_i\}\right) = P(S) = 1,$$
as desired.
Notation. If $X : S \to M = \{x_i \mid i \in I \subseteq \mathbb{N}\}$ is a d.r.v. with the p.m.f. $f_X$, we use the notation
$$p_i = f(x_i), \ i \in I.$$
With these notations the properties of the p.m.f. can be written as:
1) $p_i \ge 0$, $i \in I$;
2) $\sum_{i \in I} p_i = 1$.
Definition. (The distribution of a d.r.v.)
The distribution of a d.r.v. X is a table of the following form:
$$X : \begin{pmatrix} x_i \\ f_X(x_i) \end{pmatrix}_{i \in I} \quad \text{or} \quad X : \begin{pmatrix} x_i \\ p_i \end{pmatrix}_{i \in I} \quad \text{with} \quad \begin{cases} p_i \ge 0 \\ \sum_{i \in I} p_i = 1. \end{cases}$$
The values taken by X are written on the first row of the table and the probabilities
$$p_i = P(X = x_i), \ i \in I,$$
are written on the second row of the table.
Operations with discrete random variables

Let $(S, T, P)$ be a probability space and let X, Y be two discrete random variables defined on S:
$$X : S \to M_1 = \{x_i \mid i \in I \subseteq \mathbb{N}\}, \quad X : \begin{pmatrix} x_i \\ p_i \end{pmatrix}_{i \in I}, \ p_i \ge 0, \ \sum_{i \in I} p_i = 1$$
$$Y : S \to M_2 = \{y_j \mid j \in J \subseteq \mathbb{N}\}, \quad Y : \begin{pmatrix} y_j \\ q_j \end{pmatrix}_{j \in J}, \ q_j \ge 0, \ \sum_{j \in J} q_j = 1.$$
Definition. The discrete random variables X and Y are called independent if for each $i \in I$ and $j \in J$ the events $\{X = x_i\}$, $\{Y = y_j\}$ are independent. In this case
$$P(\{X = x_i\} \cap \{Y = y_j\}) = P(\{X = x_i\}) \cdot P(\{Y = y_j\}) = p_i q_j.$$
• The sum of two discrete random variables
$$X + Y : S \to M = \{x_i + y_j \mid i \in I, j \in J\}$$
$$f_{X+Y}(x_i + y_j) = P(\{X = x_i\} \cap \{Y = y_j\}) \overset{X, Y \text{ indep.}}{=} P(\{X = x_i\}) \cdot P(\{Y = y_j\}) = p_i q_j.$$
The distribution of the sum $X + Y$ is
$$X + Y : \begin{pmatrix} x_i + y_j \\ f_{X+Y}(x_i + y_j) \end{pmatrix}_{i \in I, j \in J}$$
If X and Y are independent, then
$$X + Y : \begin{pmatrix} x_i + y_j \\ p_i q_j \end{pmatrix}_{i \in I, j \in J}$$
Example 2. Let X, Y be two independent discrete random variables defined by:
$$X : \begin{pmatrix} -1 & 1 \\ 0,3 & 0,7 \end{pmatrix} \quad \text{and} \quad Y : \begin{pmatrix} 0 & 2 \\ 0,6 & 0,4 \end{pmatrix}$$
Compute $X + Y$.
Solution. Since X and Y are independent, we have:
$$X + Y : \begin{pmatrix} -1+0 & -1+2 & 1+0 & 1+2 \\ 0,3 \cdot 0,6 & 0,3 \cdot 0,4 & 0,7 \cdot 0,6 & 0,7 \cdot 0,4 \end{pmatrix} = \begin{pmatrix} -1 & 1 & 1 & 3 \\ 0,18 & 0,12 & 0,42 & 0,28 \end{pmatrix}.$$
As we can see, the value 1 is taken twice, with probabilities 0,12 and 0,42. Then
$$P(X + Y = 1) = 0,12 + 0,42 = 0,56.$$
Finally,
$$X + Y : \begin{pmatrix} -1 & 1 & 3 \\ 0,18 & 0,56 & 0,28 \end{pmatrix}.$$
• The product of two discrete random variables
$$X \cdot Y : S \to M = \{x_i y_j \mid i \in I, j \in J\}$$
$$f_{XY}(x_i y_j) = P(\{X = x_i\} \cap \{Y = y_j\}) \overset{X, Y \text{ indep.}}{=} P(X = x_i) \cdot P(Y = y_j) = p_i q_j.$$
The distribution of the d.r.v. XY is:
$$XY : \begin{pmatrix} x_i y_j \\ f_{XY}(x_i y_j) \end{pmatrix}_{i \in I, j \in J}$$
If X and Y are independent, then
$$XY : \begin{pmatrix} x_i y_j \\ p_i q_j \end{pmatrix}_{i \in I, j \in J}$$
• The sum (product) of a d.r.v. with a constant
If $c \in \mathbb{R}$, then
$$c + X : \begin{pmatrix} c + x_i \\ p_i \end{pmatrix}_{i \in I} \qquad c \cdot X : \begin{pmatrix} c x_i \\ p_i \end{pmatrix}_{i \in I}$$
• The inverse of a discrete random variable
If $x_i \neq 0$, $\forall i \in I$, then
$$X^{-1} : \begin{pmatrix} \frac{1}{x_i} \\ p_i \end{pmatrix}_{i \in I}$$
Example. If X and Y are the discrete random variables defined in Example 2, then
$$X^{-1} : \begin{pmatrix} -1 & 1 \\ 0,3 & 0,7 \end{pmatrix}$$
and $Y^{-1}$ does not exist (Y takes the value 0).
• The power of a discrete random variable
Let $k \in \mathbb{R}$ and $X : \begin{pmatrix} x_i \\ p_i \end{pmatrix}_{i \in I}$.
If $x_i^k$ is well defined for each $i \in I$, then
$$X^k : \begin{pmatrix} x_i^k \\ p_i \end{pmatrix}_{i \in I}$$
Example. Let X, Y be the independent d.r.v. defined by
$$X : \begin{pmatrix} -1 & 0 & 1 & 2 \\ 0,3 & 0,4 & 0,2 & p \end{pmatrix} \quad \text{and} \quad Y : \begin{pmatrix} 1 & 2 & 3 \\ 0,2 & 0,4 & q \end{pmatrix}$$
Compute: $X^2$; $2X - Y$; $X^2 - X$.
Solution. First, observe that
$$p = 1 - 0,3 - 0,4 - 0,2 = 0,1 \quad \text{and} \quad q = 1 - 0,2 - 0,4 = 0,4.$$
$$X^2 : \begin{pmatrix} (-1)^2 & 0^2 & 1^2 & 2^2 \\ 0,3 & 0,4 & 0,2 & 0,1 \end{pmatrix}, \quad \text{hence} \quad X^2 : \begin{pmatrix} 0 & 1 & 4 \\ 0,4 & 0,3 + 0,2 & 0,1 \end{pmatrix}.$$
$$2X - Y : \ 2 \begin{pmatrix} -1 & 0 & 1 & 2 \\ 0,3 & 0,4 & 0,2 & 0,1 \end{pmatrix} - \begin{pmatrix} 1 & 2 & 3 \\ 0,2 & 0,4 & 0,4 \end{pmatrix}$$
$$2X - Y : \begin{pmatrix} -2-1 & -2-2 & -2-3 & 0-1 & 0-2 & 0-3 & 2-1 & 2-2 & 2-3 & 4-1 & 4-2 & 4-3 \\ 0,3 \cdot 0,2 & 0,3 \cdot 0,4 & 0,3 \cdot 0,4 & 0,4 \cdot 0,2 & 0,4 \cdot 0,4 & 0,4 \cdot 0,4 & 0,2 \cdot 0,2 & 0,2 \cdot 0,4 & 0,2 \cdot 0,4 & 0,1 \cdot 0,2 & 0,1 \cdot 0,4 & 0,1 \cdot 0,4 \end{pmatrix}$$
$$2X - Y : \begin{pmatrix} -5 & -4 & -3 & -2 & -1 & 0 & 1 & 2 & 3 \\ 0,12 & 0,12 & 0,22 & 0,16 & 0,16 & 0,08 & 0,08 & 0,04 & 0,02 \end{pmatrix}$$
The computation rules introduced before cannot be used to determine $X^2 - X$, since the variables $X^2$ and X are not independent. Computing value by value instead:
$$X^2 - X : \begin{pmatrix} (-1)^2 - (-1) & 0^2 - 0 & 1^2 - 1 & 2^2 - 2 \\ 0,3 & 0,4 & 0,2 & 0,1 \end{pmatrix}$$
$$X^2 - X : \begin{pmatrix} 0 & 2 \\ 0,4 + 0,2 & 0,3 + 0,1 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 0,6 & 0,4 \end{pmatrix}.$$
7.2 The distribution function of a random variable

Definition. The distribution function F (or the cumulative distribution function) of the random variable X is the function $F : \mathbb{R} \to [0, 1]$ defined by
$$F(x) = P(\{X < x\}) = P(X < x).$$
If we want to emphasize the role of X we denote F by $F_X$.

Example. Compute the cumulative distribution function of the following discrete random variable:
$$X : \begin{pmatrix} 0 & 1 & 2 \\ \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{pmatrix}.$$
Solution. If $x \le 0$, $F(x) = P(X < x) = P(\emptyset) = 0$.
If $x \in (0, 1]$, $F(x) = P(X < x) = P(\{X = 0\}) = \dfrac{1}{4}$.
If $x \in (1, 2]$, $F(x) = P(X < x) = P(\{X = 0\} \cup \{X = 1\}) = \dfrac{1}{4} + \dfrac{1}{2} = \dfrac{3}{4}$.
If $x > 2$, $F(x) = P(X < x) = P(\{X = 0\} \cup \{X = 1\} \cup \{X = 2\}) = P(S) = 1$.
Hence,
$$F(x) = \begin{cases} 0, & x \le 0 \\ \frac{1}{4}, & 0 < x \le 1 \\ \frac{3}{4}, & 1 < x \le 2 \\ 1, & x > 2. \end{cases}$$
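The step function above is easy to express in code. A minimal Python sketch (our own function, not from the text); note the strict inequality: with this book's convention $F(x) = P(X < x)$, so F is left-continuous and $F(0) = 0$ while $F(0{,}5) = 1/4$.

```python
def cdf(x: float) -> float:
    """F(x) = P(X < x) for X taking 0, 1, 2 with probabilities 1/4, 1/2, 1/4.

    The strict inequality v < x makes F left-continuous, matching the
    convention F(x) = P(X < x) used in this text."""
    return sum(p for v, p in [(0, 0.25), (1, 0.5), (2, 0.25)] if v < x)
```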
Theorem. (Properties of the cumulative distribution function) The cumulative distribution function $F = F_X$ of a random variable X has the following properties.
1) If $x_1, x_2 \in \mathbb{R}$, $x_1 < x_2$, then
$$P(x_1 \le X < x_2) = F(x_2) - F(x_1)$$
$$P(x_1 < X < x_2) = F(x_2) - F(x_1) - P(X = x_1)$$
$$P(x_1 \le X \le x_2) = F(x_2) - F(x_1) + P(X = x_2)$$
$$P(x_1 < X \le x_2) = F(x_2) - F(x_1) + P(X = x_2) - P(X = x_1)$$
2) F is a monotone increasing function.
3) F is continuous from the left, i.e.
$$\lim_{y \nearrow x} F(y) = F(x - 0) = F(x), \ \forall x \in \mathbb{R}.$$
4) $\lim_{x \to -\infty} F(x) = 0$, $\lim_{x \to \infty} F(x) = 1$.
5) If $x \in \mathbb{R}$, then
$$P(X \le x) = F(x + 0) = \lim_{y \searrow x} F(y).$$
6) If $x \in \mathbb{R}$, then
$$P(X = x) = \lim_{y \searrow x} F(y) - F(x) = F(x + 0) - F(x).$$
7) The set of all points of discontinuity of F is at most countable.
7) The set of all points of discontinuity of F is at most countable.
Proof. 1) Let x
1
, x
2
∈ R, x
1
< x
2
.
The events ¦X < x
1
¦ and ¦x
1
≤ X < x
2
¦ are mutually exclusive and satisfy the
equality:
¦X < x
1
¦ ∪ ¦x
1
≤ X < x
2
¦ = ¦X < x
2
¦
333
F(x
2
) = P(X < x
2
) = P(¦X < x
1
¦ ∪ ¦x
1
≤ X < x
2
¦)
= P(X < x
1
) +P(x
1
≤ X < x
2
)
= F(x
1
) +P(x
1
≤ X < x
2
),
hence
P(x
1
≤ X < x
2
) = F(x
2
) −F(x
1
).
Similarly, starting from the equality
¦X < x
1
¦ ∪ ¦x
1
< X < x
2
¦ ∪ ¦X = x
1
¦ = ¦X < x
2
¦,
we obtain
P(X < x
1
) +P(x
1
< X < x
2
) +P(X = x
1
) = P(X < x
2
),
hence
P(x
1
< X < x
2
) = F(x
2
) −F(x
1
) −P(X = x
1
).
The remaining two equalities can be obtained in the same way.
2) Property 2 follows because for x
1
< x
2
the event ¦X < x
1
¦ is contained in the
event ¦X < x
2
¦ and cannot have a larger probability, hence F(x
1
) ≤ F(x
2
).
3) Let x ∈ R and let (x
n
)
n∈N
an arbitrary increasing sequence such that lim
n→∞
x
n
=
x. If x
n
increase to x, then the events ¦X < x
n
¦ are increasing events whose union is
the event ¦X < x¦.
_
n≥1
¦X < x
n
¦ = ¦X < x¦.
Hence, by the continuity property of probabilities (see example 9, page 326):
lim
n→∞
F(x
n
) = lim
n→∞
P(X < x
n
) = P
_
_
_
n≥1
¦X < x
n
¦
_
_
= P(X < x) = F(x).
Hence, F(x −0) = F(x).
4) If (x
n
)
n∈N
increases to ∞, then the events ¦X < x
n
¦, n ≥ 1, are increasing
events whose union is ¦X < ∞¦ = S.
Hence, lim
n→∞
F(x
n
) = lim
n→∞
P(X < x
n
) = P(X < ∞) = 1 which proves the second
part of the fourth property.
The proof of the ﬁrst part of this property is similar and is left as an exercise.
5), 6) F(x + 0) = lim_{y↘x} F(y) = lim_{n→∞} F(x + 1/n) = lim_{n→∞} P(X < x + 1/n)
= P(⋂_{n≥1} {X < x + 1/n}) = P(X ≤ x) = P(X < x) + P(X = x) = F(x) + P(X = x).
7) The proof of this property is beyond the scope of this text and it will be omitted.
Remark. If x ∈ R and F is continuous at x then P(X = x) = 0.
Proof. Since F is continuous at x, we have F(x + 0) = F(x). By applying property 6 of the previous theorem we get
P(X = x) = F(x + 0) − F(x) = F(x) − F(x) = 0.
7.3 Continuous random variables
In the previous sections we considered discrete random variables, that is, random
variables whose set of possible values is at most countable. However, there also exist
random variables whose set of possible values is uncountable.
Definition. Let (S, T, P) be a probability space and let X : S → R be a random variable whose cumulative distribution function is F.
If there exists a function f : R → R such that
F(x) = ∫_{−∞}^{x} f(t) dt
for each x ∈ R, then
a) X is said to be a continuous random variable;
b) f is called the probability density function of X.
The distribution of a c.r.v. X whose density is f is defined as:
X : ( x ; f(x) )_{x∈R}.
Theorem. (Properties of the density function of a continuous random
variable) Let X be a continuous random variable and let f : R → R be its density
function. Then the following properties hold:
1) f(x) ≥ 0, ∀ x ∈ R;
2) ∫_{−∞}^{∞} f(x) dx = 1.
Remark. If X is a continuous random variable having the distribution function
F then all the properties of a distribution function hold. Also we have:
a) F is continuous on R, hence for each x ∈ R we have
P(X = x) = 0.
b) If x_1, x_2 ∈ R, x_1 < x_2, then
P(x_1 < X < x_2) = P(x_1 ≤ X < x_2) = P(x_1 ≤ X ≤ x_2) = P(x_1 < X ≤ x_2) = ∫_{x_1}^{x_2} f(t) dt.
c) F is differentiable on R \ I, where I is a set which is at most countable, and
F′(x) = f(x), ∀ x ∈ R \ I.
7.4 Numerical characteristics of random variables
Expected value
One of the most important concepts in probability theory is that of the expectation
of a random variable.
If X is a discrete random variable defined as
X : ( x_i ; p_i )_{i∈I⊆N}, p_i ≥ 0, i ∈ I, Σ_{i∈I} p_i = 1,
such that Σ_{i∈I} |x_i| p_i < ∞, then the expectation or the expected value of X, denoted by E(X), is:
E(X) = Σ_{i∈I} x_i p_i.
This is also known as the mean, or average or ﬁrst moment of X.
Hence, the expected value of X is a weighted average of the possible values that
X can take on, each value being weighted by the probability that X assumes it.
The expected value can be seen as a guide to the location of X and is often called
a location parameter.
If X is a continuous random variable with density f such that
∫_{−∞}^{∞} |x| f(x) dx < ∞,
then X has an expected value, which is given by
E(X) = ∫_{−∞}^{∞} x f(x) dx.
Remark. The condition Σ_{i=1}^{∞} |x_i| p_i < ∞ is needed because, if it is violated, the sum Σ_{i=1}^{∞} x_i p_i may take different values, depending on the order of summation.
Example. Suppose an insurance company pays the amount of 500 Euro for lost
luggage on an airplane trip. It is known that the company pays this amount in 1 out
of 100 policies it sells. What premium should the company charge?
Solution. Let X be the r.v. defined as X = 0 if no loss occurs, and X = −500 for lost luggage. Then the distribution of X is:
X : ( 0 −500 ; 0.99 0.01 ).
Then the expected loss to the insurance company is
E(X) = 0 · 0.99 − 500 · 0.01 = −5.
Thus, the company must charge 5 Euro (it will also add an amount for administrative expenses and a profit).
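As a quick cross-check of this computation, here is a minimal Python sketch (the helper name `expected_value` is mine, not from the text), using exact rational arithmetic:

```python
from fractions import Fraction

# Expected value of a discrete random variable given as (value, probability) pairs.
def expected_value(dist):
    assert sum(p for _, p in dist) == 1  # the probabilities must sum to 1
    return sum(x * p for x, p in dist)

# The insurance example: X = 0 with probability 0.99, X = -500 with probability 0.01.
insurance = [(0, Fraction(99, 100)), (-500, Fraction(1, 100))]
print(expected_value(insurance))  # -5
```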
From the deﬁnition of the expectation and familiar properties of summations or
integrals, it follows that:
Theorem. Properties of the expected value.
1) If X is a constant random variable, i.e. X : ( a ; 1 ), then E(X) = a.
2) Let X be a r.v. and let a ∈ R. Then
E(aX) = aE(X).
3) Let X and Y be two r.v. and let a, b ∈ R. Then
E(X +Y ) = E(X) +E(Y )
E(a +X) = a +E(X)
E(aX +b) = aE(X) +b.
4) If X and Y are two independent r.v. then:
E(XY ) = E(X)E(Y ).
Variance
The following example illustrates that the expected value, as a measure of location
of the distribution, may show us very little about the entire distribution.
Let X and Y be two d.r.v. whose distributions are defined as follows:
X : ( −1 1 3 ; 5/8 2/8 1/8 );  Y : ( −100 100 300 ; 5/8 2/8 1/8 ).
It is easy to see that E(X) = E(Y ) = 0.
The values of X spread over an interval of length 4, while the values of Y spread over an interval 100 times longer, yet the two distributions have the same center of location. Hence an additional measure, associated with the spread of the distribution, is needed. This new measure is the variance of a r.v.
Deﬁnition. If X is a random variable with mean E(X) = m, then the variance
of X, denoted by V (X) is deﬁned by
V(X) = E((X − m)^2).
If X is a d.r.v., i.e. X : ( x_i ; p_i )_{i∈I}, then
V(X) = Σ_{i∈I} (x_i − m)^2 p_i.
If X is a c.r.v., X : ( x ; f(x) )_{x∈R}, then
V(X) = ∫_{−∞}^{∞} (x − m)^2 f(x) dx.
For the random variables X and Y mentioned before we have
V(X) = (−1)^2 · 5/8 + 1^2 · 2/8 + 3^2 · 1/8 = 2
and
V(Y) = (−100)^2 · 5/8 + 100^2 · 2/8 + 300^2 · 1/8 = 20000.
Thus, the variance shows the diﬀerence in size of the range of the distributions of
the r.v.’s X and Y .
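The two variances can be checked directly from the definition; a short Python sketch (the helper names `E` and `V` are mine):

```python
from fractions import Fraction

def E(dist):
    return sum(x * p for x, p in dist)

def V(dist):
    # variance straight from the definition: V(X) = E((X - m)^2)
    m = E(dist)
    return sum((x - m) ** 2 * p for x, p in dist)

eighth = Fraction(1, 8)
X = [(-1, 5 * eighth), (1, 2 * eighth), (3, 1 * eighth)]
Y = [(-100, 5 * eighth), (100, 2 * eighth), (300, 1 * eighth)]
print(E(X), E(Y), V(X), V(Y))  # 0 0 2 20000
```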
Theorem. (Properties of the variance)
1) If X is a r.v. then V (X) ≥ 0.
2) If X is a r.v. then
V(X) = E(X^2) − (E(X))^2.
3) If X is a constant random variable, i.e. X : ( a ; 1 ), then
V(X) = 0.
4) If X is a r.v. and a, b ∈ R then
V(aX + b) = a^2 V(X).
5) If X and Y are two independent random variables and a, b ∈ R then
V(aX + bY) = a^2 V(X) + b^2 V(Y).
For a = 1 and b = −1 we have
V(X − Y) = V(X) + V(Y).
Proof. We will prove the second property, since the rest can be easily obtained from the definition of the variance and familiar properties of summations and integrals.
2) V(X) = E((X − E(X))^2) = E((X − m)^2) = E(X^2 − 2mX + m^2)
= E(X^2) − 2mE(X) + m^2 = E(X^2) − m^2 = E(X^2) − (E(X))^2.
In words, the variance of X is equal to the expected value of X^2 minus the square of its expected value. In practice, this is the easier way to compute V(X).
Standard deviation
The square root of the variance V (X) is called the standard deviation of X :
σ(X) = √V(X).
Unlike the variance, the standard deviation is measured in the same units as X
and E(X) and serves as a measure of deviation of X from E(X).
Moments and central moments
Deﬁnition. (Moments)
Let X be a r.v. and let k ∈ N.
The moment of order k of X is the number
ν_k = E(X^k).
If X is a d.r.v., i.e. X : ( x_i ; p_i )_{i∈I}, then
ν_k = E(X^k) = Σ_{i∈I} x_i^k p_i
(if the previous sum exists).
If X is a c.r.v., i.e. X : ( x ; f(x) )_{x∈R}, then
ν_k = E(X^k) = ∫_{−∞}^{∞} x^k f(x) dx
(if the previous integral converges).
The moments of order k generalize the expected value because
ν_1 = E(X).
Deﬁnition. (Central moments)
Let X be a random variable with mean E(X) = m and let k ∈ N.
The central moment of order k of X is the number
μ_k = E((X − m)^k).
If X is a d.r.v., i.e. X : ( x_i ; p_i )_{i∈I}, then
μ_k = E((X − m)^k) = Σ_{i∈I} (x_i − m)^k p_i
(if the previous sum exists).
If X is a c.r.v., i.e. X : ( x ; f(x) )_{x∈R}, then
μ_k = E((X − m)^k) = ∫_{−∞}^{∞} (x − m)^k f(x) dx
(if the previous integral converges).
The central moments of order k generalize the variance because
μ_2 = E((X − m)^2) = V(X).
We have the following relationship between moments and central moments.
Theorem. Let X be a random variable and let k ∈ N. Then
μ_k = Σ_{i=0}^{k} (−1)^i C_k^i ν_{k−i} (ν_1)^i, where ν_0 = 1.
Proof. By using the binomial theorem, the properties of the expected value and the fact that m = E(X) = ν_1, we have:
μ_k = E((X − m)^k) = E( Σ_{i=0}^{k} C_k^i X^{k−i} (−m)^i )
= E( Σ_{i=0}^{k} (−1)^i C_k^i X^{k−i} ν_1^i )
= Σ_{i=0}^{k} (−1)^i C_k^i E(X^{k−i}) ν_1^i = Σ_{i=0}^{k} (−1)^i C_k^i ν_{k−i} ν_1^i,
as desired.
Particular cases:
μ_1 = 0
μ_2 = ν_2 − ν_1^2
μ_3 = ν_3 − 3ν_2ν_1 + 2ν_1^3
μ_4 = ν_4 − 4ν_3ν_1 + 6ν_2ν_1^2 − 3ν_1^4.
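These identities can be verified numerically; the following Python sketch (the test distribution and helper names are mine, chosen only for illustration) checks the general theorem for k = 1, …, 4:

```python
from fractions import Fraction
from math import comb

# A small test distribution, chosen only for illustration.
dist = [(-1, Fraction(1, 4)), (2, Fraction(1, 2)), (5, Fraction(1, 4))]

def nu(k):
    # moment of order k: nu_k = E(X^k); nu(0) = 1 plays the role of nu_0
    return sum(x ** k * p for x, p in dist)

def mu(k):
    # central moment of order k, directly from the definition
    m = nu(1)
    return sum((x - m) ** k * p for x, p in dist)

def mu_from_nu(k):
    # the theorem: mu_k = sum_{i=0}^{k} (-1)^i C(k, i) nu_{k-i} nu_1^i
    return sum((-1) ** i * comb(k, i) * nu(k - i) * nu(1) ** i
               for i in range(k + 1))

for k in range(1, 5):
    assert mu(k) == mu_from_nu(k)
```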
Examples
Example 1. We consider the following gambling game. A player bets on one of the numbers 1 through 6. Three dice are rolled, and if the number bet by the player appears i times (i = 1, 2, 3), then the player wins i units; if the number bet by the player does not appear on any of the dice, then the player loses 1 unit. Is the game fair to the player?
Solution. By assuming that the dice are fair and independent of each other we can use the urn model with replacement and 2 states (p = 1/6, q = 5/6) and 3 repeated trials.
Let X be the random variable which represents the player's winnings in the game.
P(X = −1) = C_3^0 (1/6)^0 (5/6)^3 = 125/216
P(X = 1) = C_3^1 (1/6)(5/6)^2 = 75/216
P(X = 2) = C_3^2 (1/6)^2 (5/6) = 15/216
P(X = 3) = C_3^3 (1/6)^3 (5/6)^0 = 1/216.
Hence the distribution of X is
X : ( −1 1 2 3 ; 125/216 75/216 15/216 1/216 ).
In order to determine whether or not this is a fair game for the player we compute E(X):
E(X) = (−125 + 75 + 2 · 15 + 3 · 1)/216 = −17/216.
The game is not fair, since in the long run the player will lose 17 monetary units for every 216 games he plays.
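The distribution and the expected value above can be reproduced exactly; a Python sketch (variable names are mine):

```python
from fractions import Fraction
from math import comb

p, q = Fraction(1, 6), Fraction(5, 6)
# the number of times the bet number appears in 3 rolls is binomial with n = 3
prob = {i: comb(3, i) * p ** i * q ** (3 - i) for i in range(4)}
# winnings: -1 unit if the number never appears, i units if it appears i times
win = {0: -1, 1: 1, 2: 2, 3: 3}
EX = sum(win[i] * prob[i] for i in range(4))
print(EX)  # -17/216
```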
Example 2. Let X be the discrete random variable whose distribution is
X : ( 1 3 5 7 9 11 ; 0.05 0.1 0.15 0.2 0.3 0.2 ).
Compute its distribution function and sketch its graph.
Solution. The distribution function F : R → [0, 1] is defined as F(x) = P(X < x):
F(x) =
 0,                    x ≤ 1
 0.05,                 1 < x ≤ 3
 0.05 + 0.1 = 0.15,    3 < x ≤ 5
 0.15 + 0.15 = 0.3,    5 < x ≤ 7
 0.3 + 0.2 = 0.5,      7 < x ≤ 9
 0.5 + 0.3 = 0.8,      9 < x ≤ 11
 0.8 + 0.2 = 1,        x > 11
[Graph: the step function F, with jumps at x = 1, 3, 5, 7, 9, 11 and values 0.05, 0.15, 0.3, 0.5, 0.8, 1.]
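The step function of Example 2 can be evaluated programmatically; a minimal Python sketch (left-continuous, since F(x) = P(X < x); names are mine):

```python
xs = [1, 3, 5, 7, 9, 11]
ps = [0.05, 0.1, 0.15, 0.2, 0.3, 0.2]

def F(x):
    # F(x) = P(X < x): sum the probabilities of all values strictly below x
    return sum(p for v, p in zip(xs, ps) if v < x)

assert F(1) == 0                    # nothing lies strictly below 1
assert abs(F(6) - 0.3) < 1e-9       # 0.05 + 0.1 + 0.15
assert abs(F(10) - 0.8) < 1e-9
assert abs(F(12) - 1.0) < 1e-9
```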
Example 3. Let X : ( −1 0 2 ; 0.2 a b ).
1) Determine a, b ∈ R such that E(X) = 0.8.
2) Compute V(X).
3) Compute the moment and the central moment of order 3 of X.
Solution. 1) From the properties of the probability mass function and the definition of the expected value we have to determine a, b ≥ 0 such that:
0.2 + a + b = 1
−1 · 0.2 + 0 · a + 2 · b = 0.8
⇔ 2b = 1, a = 0.8 − b ⇔ b = 0.5, a = 0.3.
Hence
X : ( −1 0 2 ; 0.2 0.3 0.5 ).
2) V(X) = E(X^2) − (E(X))^2 = (−1)^2 · 0.2 + 0^2 · 0.3 + 2^2 · 0.5 − 0.8^2 = 2.2 − 0.64 = 1.56.
3) ν_3 = E(X^3) = (−1)^3 · 0.2 + 0^3 · 0.3 + 2^3 · 0.5 = −0.2 + 4 = 3.8
μ_3 = E((X − 0.8)^3) = (−1 − 0.8)^3 · 0.2 + (0 − 0.8)^3 · 0.3 + (2 − 0.8)^3 · 0.5
= −1.1664 − 0.1536 + 0.864 = −0.456.
By using the relationship between central moments and moments we have a second method for computing μ_3:
μ_3 = ν_3 − 3ν_2ν_1 + 2ν_1^3 = 3.8 − 3 · 2.2 · 0.8 + 2 · 0.8^3 = 3.8 − 5.28 + 1.024 = −0.456,
as we expected.
Example 4. Let f : R → R be defined by
f(x) =
 −x/18 + k, if −1 < x ≤ 2
 0,          otherwise
a) Determine k ∈ R such that f is the probability density function of a continuous random variable X.
b) Determine the distribution function F.
c) Determine E(X).
Solution. a) f is a probability density function if the following two conditions are satisfied:
• f(x) ≥ 0, ∀ x ∈ R
• ∫_{−∞}^{∞} f(x) dx = 1
The first condition implies that k ≥ x/18 for each x ∈ (−1, 2], from which we easily obtain
k ≥ 2/18 = 1/9.
From the second condition we obtain:
1 = ∫_{−∞}^{∞} f(x) dx = ∫_{−∞}^{−1} f(x) dx + ∫_{−1}^{2} f(x) dx + ∫_{2}^{∞} f(x) dx
= ∫_{−1}^{2} (−x/18 + k) dx = −(1/18) · x^2/2 |_{−1}^{2} + kx |_{−1}^{2} = −4/36 + 1/36 + 3k.
It remains for us to solve the equation
3k = 1 + 1/12, hence k = 13/36 (observe that k ≥ 1/9).
b) The distribution function is F : R → [0, 1],
F(x) = ∫_{−∞}^{x} f(t) dt.
– for x ≤ −1, since f(x) = 0, we have F(x) = 0
– for −1 < x ≤ 2,
F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{−1} f(t) dt + ∫_{−1}^{x} f(t) dt = ∫_{−1}^{x} (−t/18 + 13/36) dt
= (−t^2/36 + 13t/36) |_{−1}^{x} = −x^2/36 + 13x/36 + 1/36 + 13/36 = −x^2/36 + 13x/36 + 14/36.
– for x > 2,
F(x) = ∫_{−∞}^{2} f(t) dt + ∫_{2}^{x} f(t) dt = F(2) + ∫_{2}^{x} 0 dt = F(2) = 1.
Hence
F(x) =
 0,                          x ≤ −1
 −x^2/36 + 13x/36 + 14/36,   −1 < x ≤ 2
 1,                          x > 2
c) E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_{−1}^{2} x(−x/18 + 13/36) dx
= (−x^3/54 + 13x^2/72) |_{−1}^{2} = −8/54 − 1/54 + 52/72 − 13/72 = −1/6 + 13/24 = 9/24 = 3/8.
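The value of k and the expected value can be cross-checked with exact antiderivatives; a Python sketch (helper names are mine):

```python
from fractions import Fraction

k = Fraction(13, 36)

def antideriv_f(x):
    # antiderivative of f(x) = -x/18 + k on (-1, 2]
    return -x * x * Fraction(1, 36) + k * x

def antideriv_xf(x):
    # antiderivative of x * f(x) = -x^2/18 + k x
    return -x ** 3 * Fraction(1, 54) + k * x ** 2 / 2

assert antideriv_f(2) - antideriv_f(-1) == 1                  # f integrates to 1
assert antideriv_xf(2) - antideriv_xf(-1) == Fraction(3, 8)   # E(X) = 3/8
```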
Example 5. Let f : R → R be a function defined as
f(x) =
 αe^{−x/5}, x ≥ 0
 0,         x < 0
a) Determine α ∈ R such that f is the probability density function of a continuous random variable X.
b) Compute E(X), V(X) and ν_{15}.
Solution. a) Since f is a probability density function,
f(x) ≥ 0, ∀ x ∈ R and ∫_{−∞}^{∞} f(x) dx = 1.
From the inequality f(x) ≥ 0, ∀ x ∈ R, we obtain α ≥ 0.
From the second condition we obtain
1 = ∫_{−∞}^{∞} f(x) dx = ∫_{0}^{∞} αe^{−x/5} dx = α ∫_{0}^{∞} e^{−y} · 5 dy = 5αΓ(1) = 5α.
Hence α = 1/5 ≥ 0.
b) E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_{0}^{∞} x · (1/5) e^{−x/5} dx.
By using the following change of variable:
x/5 = y, x = 5y, dx = 5 dy,
we have:
E(X) = ∫_{0}^{∞} 5y · (1/5) e^{−y} · 5 dy = 5 ∫_{0}^{∞} y e^{−y} dy = 5Γ(2) = 5
V(X) = E(X^2) − [E(X)]^2.
By using the same change of variable we obtain
E(X^2) = ∫_{0}^{∞} x^2 · (1/5) e^{−x/5} dx = ∫_{0}^{∞} 25y^2 · (1/5) e^{−y} · 5 dy = 25Γ(3) = 25 · 2! = 50,
hence V(X) = 50 − 25 = 25.
Using once more the Gamma function we obtain
ν_{15} = ∫_{0}^{∞} x^{15} · (1/5) e^{−x/5} dx = ∫_{0}^{∞} 5^{15} y^{15} · (1/5) e^{−y} · 5 dy = 5^{15} Γ(16) = 5^{15} · 15!
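Since the substitution above gives ν_k = 5^k Γ(k + 1) = 5^k · k!, all three answers can be checked as follows (a sketch; `moment` is my helper name, and the Riemann sum is only a crude numerical cross-check):

```python
from math import factorial, isclose, exp

def moment(k):
    # For f(x) = (1/5) e^{-x/5} on [0, inf), substituting x = 5y gives
    # nu_k = 5^k * Gamma(k + 1) = 5^k * k!
    return 5 ** k * factorial(k)

assert moment(1) == 5                      # E(X)
assert moment(2) - moment(1) ** 2 == 25    # V(X) = E(X^2) - E(X)^2
assert moment(15) == 5 ** 15 * factorial(15)

# crude Riemann-sum cross-check of E(X) over [0, 200]
h = 0.001
E = sum((i * h) * (1 / 5) * exp(-(i * h) / 5) * h for i in range(200000))
assert isclose(E, 5.0, rel_tol=1e-3)
```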
7.5 Special random variables
Certain types of random variables occur over and over again in applications. In
this section we will study a few of them.
Discrete random variables
The Bernoulli and binomial random variables
Suppose that we perform an experiment whose outcome can be classiﬁed as either
a ”success” (with probability p, 0 < p < 1) or a ”failure” (with probability 1 −p).
If we let X = 1 when the outcome is a success and X = 0 when it is a failure, then the distribution of X is
X : ( 1 0 ; p 1−p ).
.
A random variable X is said to be a Bernoulli random variable (after the Swiss mathematician James Bernoulli) if its distribution is:
X : ( 1 0 ; p 1−p ), 0 < p < 1.
The expected value is E(X) = 1 · p + 0 · (1−p) = p.
The variance is
V(X) = E(X^2) − (E(X))^2 = 1^2 · p + 0^2 · (1−p) − p^2 = p(1−p) = pq,
where by q we denoted the probability of a "failure": q = 1 − p.
Suppose that n independent trials, each of which is a Bernoulli experiment, are performed. If X represents the number of successes that occur in the n trials, then X is said to be a binomial random variable with parameters (n, p).
Notation X ∼ B(n, p).
The probability mass function of a binomial random variable with parameters n and p is given by
P(X = k) = C_n^k p^k (1−p)^{n−k} = C_n^k p^k q^{n−k}, k = 0, 1, …, n
(the reasoning is similar to that used in the urn model with two states and with replacement).
Definition. The binomial random variable with parameters n and p is the random variable whose distribution is
X : ( k ; C_n^k p^k q^{n−k} )_{k=0,1,…,n}, p + q = 1, p ∈ (0, 1).
To check that P(X = k) = C_n^k p^k q^{n−k}, k = 0, 1, …, n, is a probability mass function, we note that:
P(X = k) = C_n^k p^k q^{n−k} > 0
and
Σ_{k=0}^{n} P(X = k) = Σ_{k=0}^{n} C_n^k p^k q^{n−k} = (p + q)^n = 1.
Remark. If X is a binomial random variable with parameters n and p, X ∼ B(n, p), then
E(X) = np and V(X) = npq.
Proof. Since a binomial random variable X with parameters n and p represents the number of successes in n independent trials (with the success probability p), X can be represented as
X = Σ_{i=1}^{n} X_i, where X_i = 1 if the i-th trial is a success and X_i = 0 otherwise.
Because the X_i, i = 1, …, n, are n independent Bernoulli r.v.'s, we have
E(X) = E(Σ_{i=1}^{n} X_i) = Σ_{i=1}^{n} E(X_i) = np
V(X) = V(Σ_{i=1}^{n} X_i) = Σ_{i=1}^{n} V(X_i) = npq.
Remark. If X_1 ∼ B(n_1, p) and X_2 ∼ B(n_2, p) are independent, then X_1 + X_2 is a binomial random variable with parameters (n_1 + n_2, p); i.e. X_1 + X_2 ∼ B(n_1 + n_2, p).
Example. Suppose that an airplane engine will fail with probability 1 − p, independently from engine to engine; suppose that the airplane will make a successful flight if at least half of its engines are operative. For what values of p is a four-engine airplane preferable to a two-engine airplane?
Solution. As the number of functioning engines is a binomial random variable with parameters (n, p), it follows that the probability for a four-engine airplane to make a successful flight is
P(X_1 = 2) + P(X_1 = 3) + P(X_1 = 4) = C_4^2 p^2 (1−p)^2 + C_4^3 p^3 (1−p) + C_4^4 p^4 (1−p)^0
= 6p^2(1−p)^2 + 4p^3(1−p) + p^4,
whereas the corresponding probability for a two-engine airplane is
P(X_2 = 1) + P(X_2 = 2) = C_2^1 p(1−p) + C_2^2 p^2 = 2p(1−p) + p^2.
Hence, the four-engine plane is better if
6p^2(1−p)^2 + 4p^3(1−p) + p^4 ≥ 2p(1−p) + p^2,
which is equivalent (by dividing the inequality by p) to
6p(1−p)^2 + 4p^2(1−p) + p^3 ≥ 2 − p ⇔ (p−1)^2(3p−2) ≥ 0,
which is equivalent to p ≥ 2/3.
In conclusion, the four-engine plane is better when the success probability is greater than 2/3, whereas the two-engine plane is better if the success probability is smaller than 2/3.
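The crossover at p = 2/3 can be checked numerically; a Python sketch (the helper names `p4` and `p2` are mine):

```python
from math import comb

def p4(p):
    # four-engine plane succeeds when at least 2 of 4 engines work
    return sum(comb(4, k) * p ** k * (1 - p) ** (4 - k) for k in (2, 3, 4))

def p2(p):
    # two-engine plane succeeds when at least 1 of 2 engines works
    return sum(comb(2, k) * p ** k * (1 - p) ** (2 - k) for k in (1, 2))

assert p4(0.9) > p2(0.9)                   # above p = 2/3 the four-engine plane wins
assert p4(0.5) < p2(0.5)                   # below p = 2/3 the two-engine plane wins
assert abs(p4(2 / 3) - p2(2 / 3)) < 1e-12  # the two tie exactly at p = 2/3
```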
The geometric random variable
The geometric random variable is closely related to the binomial random variable.
Consider a sequence of independent Bernoulli experiments where p is the probability of "success" and q = 1 − p the probability of "failure".
We saw that the random variable which represents the number of successes (in n
successive trials) is binomial.
The geometric random variable represents the waiting time until the ﬁrst
success occurs.
If the first success occurs at the k-th trial, then we must have k − 1 failures before the first success. The Bernoulli trials are independent, hence the probability of the desired event is (1−p)^{k−1} p = q^{k−1} p, k ≥ 1 (see also the geometric urn model).
Definition. The geometric random variable with parameter p is the random variable whose distribution is
X : ( k ; q^{k−1} p )_{k≥1}, p ∈ (0, 1), p + q = 1.
To check that P(X = k) = q^{k−1} p, k ≥ 1, is a probability mass function, we note that:
P(X = k) = q^{k−1} p > 0
and
Σ_{k=1}^{∞} P(X = k) = Σ_{k=1}^{∞} q^{k−1} p = p Σ_{k=1}^{∞} q^{k−1} = p lim_{n→∞} (1 + q + ⋯ + q^n)
= p lim_{n→∞} (1 − q^{n+1})/(1 − q) = p · 1/(1 − q) = p/p = 1.
Remark. If X is a geometric random variable with parameter p ∈ (0, 1), then
E(X) = 1/p and V(X) = q/p^2.
Proof.
E(X) = Σ_{k=1}^{∞} k q^{k−1} p = p Σ_{k=1}^{∞} k q^{k−1} = p ( Σ_{k=1}^{∞} q^{k−1} )′ = p ( 1/(1 − q) )′ = p · 1/(1 − q)^2 = p/p^2 = 1/p,
as desired.
To determine V(X) we first compute E(X^2).
E(X^2) = Σ_{k=1}^{∞} k^2 q^{k−1} p = p Σ_{k=1}^{∞} k^2 q^{k−1} = p Σ_{k=1}^{∞} (k^2 − k + k) q^{k−1}
= p Σ_{k=2}^{∞} k(k−1) q^{k−1} + p Σ_{k=1}^{∞} k q^{k−1}
= pq Σ_{k=2}^{∞} k(k−1) q^{k−2} + p Σ_{k=1}^{∞} k q^{k−1}
= pq ( Σ_{k=1}^{∞} q^k )″ + 1/p = pq ( 1/(1 − q) )″ + 1/p = pq ( 1/(1 − q)^2 )′ + 1/p
= pq · 2/(1 − q)^3 + 1/p = 2q/p^2 + 1/p = (q + q + p)/p^2 = (1 + q)/p^2.
Hence
V(X) = (1 + q)/p^2 − 1/p^2 = q/p^2.
We can also remark that the standard deviation is:
σ(X) = √q / p.
The occurrence of a geometric series explains the use of the word ”geometric” in
describing the probability distribution.
As an application of the previous remark we present the following:
– if we toss a fair coin, then the expected waiting time for the first head to occur is 1/p = 1/(1/2) = 2 tosses;
– if we roll a fair die, then the expected waiting time for the first six to occur is 1/p = 1/(1/6) = 6 rolls.
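These expected waiting times can be checked by truncating the defining series; a Python sketch (the helper name is mine):

```python
def geometric_mean(p, cutoff=10_000):
    # E(X) by truncating sum_{k>=1} k q^{k-1} p; the neglected tail is negligible
    q = 1 - p
    return sum(k * q ** (k - 1) * p for k in range(1, cutoff))

assert abs(geometric_mean(1 / 2) - 2) < 1e-9   # fair coin: 2 tosses on average
assert abs(geometric_mean(1 / 6) - 6) < 1e-9   # fair die: 6 rolls on average
```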
The negative binomial (Pascal) random variable
The binomial distribution finds the probability of exactly k successes in n independent trials.
The geometric distribution finds the number of independent trials until the first success occurs.
We can generalize these two results and find the number of independent trials required for k successes.
Suppose that independent trials, each having probability of success p, 0 < p < 1, are performed until a total of k successes is obtained.
Let X be the random variable which represents the number of trials required; then
P(X = n) = C_{n−1}^{k−1} p^k (1−p)^{n−k}, n = k, k + 1, …
The previous equality holds because, in order that k successes occur in the first n trials, there must be k − 1 successes in the first n − 1 trials and the n-th trial must be a success. The mentioned probability was computed in the Pascal urn model.
Definition. The negative binomial (or Pascal) random variable with parameters k and p is the random variable whose distribution is:
X : ( n ; C_{n−1}^{k−1} p^k q^{n−k} )_{n≥k}.
To check that P(X = n) = C_{n−1}^{k−1} p^k q^{n−k}, n ≥ k, is a probability mass function, we note that
P(X = n) > 0
and
Σ_{n=k}^{∞} C_{n−1}^{k−1} p^k q^{n−k} = p^k (C_{k−1}^{k−1} + C_k^{k−1} q + C_{k+1}^{k−1} q^2 + …) = p^k · 1/(1 − q)^k = p^k/p^k = 1.
In establishing the previous equality we used the following Taylor expansion:
1/(1 − q)^k = 1 + kq + (k(k+1)/2) q^2 + (k(k+1)(k+2)/3!) q^3 + …, |q| < 1.
The geometric random variable is a negative binomial random variable with k = 1.
Remark. If X is a negative binomial random variable with parameters k and p ∈ (0, 1), then
E(X) = k/p and V(X) = kq/p^2.
Proof.
E(X) = Σ_{n=k}^{∞} n C_{n−1}^{k−1} p^k q^{n−k} = (k/p) Σ_{n=k}^{∞} C_n^k p^{k+1} q^{n−k}, since n C_{n−1}^{k−1} = k C_n^k
= (k/p) Σ_{m=k+1}^{∞} C_{m−1}^{(k+1)−1} p^{k+1} q^{m−(k+1)}, by setting m = n + 1
= (k/p) · 1 = k/p,
since the numbers C_{m−1}^{(k+1)−1} p^{k+1} q^{m−(k+1)}, m ≥ k + 1,
represent the probability mass function of a negative binomial random variable with parameters (k + 1, p).
To determine V(X) we first compute E(X^2).
E(X^2) = Σ_{n=k}^{∞} n^2 C_{n−1}^{k−1} p^k q^{n−k} = (k/p) Σ_{n=k}^{∞} n C_n^k p^{k+1} q^{n−k}, since n C_{n−1}^{k−1} = k C_n^k
= (k/p) Σ_{m=k+1}^{∞} (m − 1) C_{m−1}^{k} p^{k+1} q^{m−(k+1)}, by setting m = n + 1
= (k/p) Σ_{m=k+1}^{∞} m C_{m−1}^{(k+1)−1} p^{k+1} q^{m−(k+1)} − (k/p) Σ_{m=k+1}^{∞} C_{m−1}^{(k+1)−1} p^{k+1} q^{m−(k+1)}
= (k/p) · (k + 1)/p − k/p = (k/p) ( (k + 1)/p − 1 ).
Therefore,
V(X) = (k/p) ( (k + 1)/p − 1 ) − (k/p)^2 = (k^2 + k − kp − k^2)/p^2 = k(1 − p)/p^2 = kq/p^2,
as desired.
Example. Find the expected value and the variance of the number of times one must roll a die until the face 6 occurs 4 times.
Solution. The experiment can be described by a negative binomial random variable with parameters k = 4 and p = 1/6.
Hence
E(X) = k/p = 4/(1/6) = 24
and
V(X) = (4 · 5/6)/(1/6)^2 = 120.
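Both values can be cross-checked numerically by truncating the pmf sums (a sketch; the cutoff is chosen so the neglected tail is astronomically small):

```python
from math import comb

k, p = 4, 1 / 6
q = 1 - p

def pmf(n):
    # P(X = n) = C(n-1, k-1) p^k q^{n-k}, n >= k
    return comb(n - 1, k - 1) * p ** k * q ** (n - k)

N = 2000  # truncation cutoff
mean = sum(n * pmf(n) for n in range(k, N))
var = sum(n * n * pmf(n) for n in range(k, N)) - mean ** 2

assert abs(mean - 24) < 1e-6   # E(X) = k/p
assert abs(var - 120) < 1e-3   # V(X) = kq/p^2
```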
The hypergeometric random variable
The hypergeometric random variable is obtained while sampling without replacement.
Suppose that a sample of size n is to be chosen randomly (without replacement) from an urn containing a white balls and b black balls. If we let X denote the number of white balls selected, then
P(X = k) = C_a^k C_b^{n−k} / C_{a+b}^n, k = 0, 1, …, n
(see also the urn model with two states without replacement).
Definition. The hypergeometric random variable with parameters n, a, b (n ≤ a + b, max(0, n − b) ≤ k ≤ min(n, a)) is the random variable whose distribution is
X : ( k ; C_a^k C_b^{n−k} / C_{a+b}^n )_{k=0,1,…,n}.
To check that P(X = k) = C_a^k C_b^{n−k} / C_{a+b}^n, k = 0, 1, …, n, is a probability mass function we note that
P(X = k) > 0
and
Σ_{k=0}^{n} P(X = k) = Σ_{k=0}^{n} C_a^k C_b^{n−k} / C_{a+b}^n = (1/C_{a+b}^n) Σ_{k=0}^{n} C_a^k C_b^{n−k} = C_{a+b}^n / C_{a+b}^n = 1.
In establishing the previous equality we used Vandermonde's identity:
Σ_{k=0}^{n} C_a^k C_b^{n−k} = C_{a+b}^n.
Remark 1. If X is a hypergeometric random variable with parameters n, a, b then
E(X) = n · a/(a+b), V(X) = n · (a/(a+b)) · (b/(a+b)) · (a+b−n)/(a+b−1).
Proof.
E(X) = Σ_{k=1}^{n} k C_a^k C_b^{n−k} / C_{a+b}^n = (1/C_{a+b}^n) Σ_{k=1}^{n} (k C_a^k) C_b^{n−k}
= (1/C_{a+b}^n) Σ_{k=1}^{n} (a C_{a−1}^{k−1}) C_b^{n−k}, since k C_a^k = a C_{a−1}^{k−1}
= (a/C_{a+b}^n) Σ_{l=0}^{n−1} C_{a−1}^{l} C_b^{n−1−l}, by setting k = l + 1
= (a/C_{a+b}^n) C_{a+b−1}^{n−1}, by using Vandermonde's identity
= (a/C_{a+b}^n) · (n/(a+b)) C_{a+b}^n = n · a/(a+b), since C_{a+b−1}^{n−1} = (n/(a+b)) C_{a+b}^n.
Hence E(X) = n · a/(a+b).
To determine V(X) we first compute E(X^2).
E(X^2) = Σ_{k=0}^{n} k^2 C_a^k C_b^{n−k} / C_{a+b}^n = (1/C_{a+b}^n) Σ_{k=1}^{n} [k(k−1) + k] C_a^k C_b^{n−k}
= (1/C_{a+b}^n) Σ_{k=2}^{n} (k(k−1) C_a^k) C_b^{n−k} + E(X)
= (1/C_{a+b}^n) Σ_{k=2}^{n} a(a−1) C_{a−2}^{k−2} C_b^{n−k} + E(X),
since k(k−1) C_a^k = (k−1) a C_{a−1}^{k−1} = a(a−1) C_{a−2}^{k−2}
= (a(a−1)/C_{a+b}^n) Σ_{l=0}^{n−2} C_{a−2}^{l} C_b^{n−2−l} + E(X), by setting k = l + 2
= (a(a−1)/C_{a+b}^n) C_{a+b−2}^{n−2} + E(X), by using Vandermonde's identity
= a(a−1)n(n−1)/((a+b)(a+b−1)) + n · a/(a+b).
Hence
V(X) = E(X^2) − (E(X))^2 = n(n−1) · a(a−1)/((a+b)(a+b−1)) + n · a/(a+b) − n^2 a^2/(a+b)^2
= n · (a/(a+b)) · (b/(a+b)) · (a+b−n)/(a+b−1),
as desired.
Remark 2. Let X be a hypergeometric random variable with parameters n, a and b. If we denote by p (respectively q) the probability of extracting a white ball (respectively a black ball) at the beginning of the experiment, then the expected value and the variance of the r.v. X can be written as
E(X) = np, V(X) = npq · (a+b−n)/(a+b−1),
where p = a/(a+b) and q = b/(a+b).
Remark 3. (Approximation by the binomial distribution)
If n balls are randomly chosen without replacement from a set of a + b balls, of which a are white balls, then the r.v. which represents the number of white balls extracted is hypergeometric. If a and b are large in relation to n, it makes almost no difference whether the selection is made with or without replacement: in this case the probability of taking a white ball at each additional selection will be approximately equal to p = a/(a+b).
We may expect that the probability mass function of X can be approximated by the p.m.f. of a binomial r.v. with parameters n and p. We will now verify the previous statement.
P(X = k) = C_a^k C_b^{n−k} / C_{a+b}^n = [a!/(k!(a−k)!)] · [b!/((n−k)!(b−n+k)!)] · [n!(a+b−n)!/(a+b)!]
= [n!/(k!(n−k)!)] · [a!(a+b−n)!/((a+b)!(a−k)!)] · [b!/(b−n+k)!]
= C_n^k · [a(a−1)⋯(a−k+1) · b(b−1)⋯(b−n+k+1)] / [(a+b)(a+b−1)⋯(a+b−n+1)]
= C_n^k · (a/(a+b)) · ((a−1)/(a+b−1)) ⋯ ((a−k+1)/(a+b−k+1)) · (b/(a+b−k)) ⋯ ((b−n+k+1)/(a+b−n+1))
≈ C_n^k p^k q^{n−k}.
In practice, the hypergeometric law can be replaced by the binomial distribution if the inequality 10n < a + b holds.
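The rule of thumb can be illustrated by comparing the two pmfs for a large urn; a Python sketch (parameter values and helper names are mine, chosen only for illustration):

```python
from math import comb

def hyper_pmf(k, n, a, b):
    # hypergeometric: C(a, k) C(b, n-k) / C(a+b, n)
    return comb(a, k) * comb(b, n - k) / comb(a + b, n)

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# a large urn relative to the sample: 10n < a + b comfortably holds
a, b, n = 50_000, 50_000, 10
p = a / (a + b)
for k in range(n + 1):
    assert abs(hyper_pmf(k, n, a, b) - binom_pmf(k, n, p)) < 1e-4
```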
The Poisson random variable
Definition. The Poisson random variable with parameter λ, λ > 0, is a random variable whose distribution is:
X : ( k ; e^{−λ} λ^k / k! )_{k=0,1,…}
To check that P(X = k) = e^{−λ} λ^k / k!, k ≥ 0, is a probability mass function we note that P(X = k) > 0 and
Σ_{k=0}^{∞} P(X = k) = Σ_{k=0}^{∞} e^{−λ} λ^k / k! = e^{−λ} Σ_{k=0}^{∞} λ^k / k! = e^{−λ} e^{λ} = 1.
In establishing the previous equality we used the following Taylor expansion:
e^{λ} = 1 + λ/1! + λ^2/2! + …, λ ∈ R.
The Poisson probability distribution was introduced by S.D. Poisson in the book
entitled ”Recherches sur la probabilit´e des jugements en mati`ere criminelle et en
mati`ere civile”.
Remark 1. If X is a Poisson random variable with parameter λ, λ > 0, then
E(X) = λ and V (X) = λ.
Proof.
E(X) = Σ_{k=0}^{∞} k e^{−λ} λ^k / k! = e^{−λ} Σ_{k=1}^{∞} k λ^k / k! = e^{−λ} λ Σ_{k=1}^{∞} λ^{k−1}/(k−1)! = e^{−λ} λ e^{λ} = λ
V(X) = E(X^2) − [E(X)]^2 = Σ_{k=0}^{∞} k^2 e^{−λ} λ^k / k! − λ^2 = e^{−λ} Σ_{k=1}^{∞} [k(k−1) + k] λ^k / k! − λ^2
= e^{−λ} λ^2 Σ_{k=2}^{∞} λ^{k−2}/(k−2)! + λ e^{−λ} Σ_{k=1}^{∞} λ^{k−1}/(k−1)! − λ^2
= e^{−λ} λ^2 e^{λ} + λ e^{−λ} e^{λ} − λ^2 = λ
Remark 2. (The Poisson distribution as the limit of the binomial)
The Poisson random variable may be used as an approximation for a binomial
random variable with parameters (n, p) when n is large (compared to k) and p is
small enough such that np is of moderate size.
Suppose that X is a binomial random variable with parameters (n, p) and let
λ = np. Then
C_n^k p^k (1−p)^{n−k} = [n!/(k!(n−k)!)] p^k (1−p)^{n−k} = [n!/(k!(n−k)!)] (λ/n)^k (1 − λ/n)^{n−k}
= [n(n−1)⋯(n−k+1)/n^k] · (λ^k/k!) · (1 − λ/n)^n / (1 − λ/n)^k
If n is large and p is small, then
(1 − λ/n)^n = [ (1 − λ/n)^{−n/λ} ]^{−λ} ≈ e^{−λ},
n(n−1)⋯(n−k+1)/n^k ≈ 1 (since k is much smaller than n), and
(1 − λ/n)^k ≈ 1.
Hence, for n large (compared to k) and p small,
C_n^k p^k (1−p)^{n−k} ≈ e^{−λ} λ^k / k!
In consequence, if n independent trials (whose outcomes are "success" with probability p and "failure" with probability 1 − p) are performed then, for n large and p small such that np is moderate in size, the number of successes occurring is approximately a Poisson random variable with parameter λ = np.
Some examples of random variables that usually follow the Poisson probability
law are:
1. The number of misprints on a page (or on a group of pages) of a book.
2. The number of customers entering a bank on a given day.
3. The number of people in a community living to 90 years of age.
4. The number of particles emitted by a radioactive source within a certain period
of time.
5. The number of accidents in 1 day on a particular stretch of a highway.
Example. (Misprints on a page)
Suppose a page of a book contains n = 1000 characters, each of which is misprinted with a probability p = 10^{−4}. Compute the probabilities of having:
a) no misprint on the page
b) at least one misprint on the page
both by the binomial formula and by the Poisson formula.
Solution. Let X be the r.v. which represents the number of misprints on the page.
a) – by the binomial formula:
P(X = 0) = C_{1000}^{0} (10^{−4})^0 (1 − 10^{−4})^{1000} ≈ 0.904833
– by the Poisson formula with λ = np = 1000 · 10^{−4} = 0.1:
P(X = 0) = 0.1^0 e^{−0.1} / 0! ≈ 0.904837
b) – by the binomial formula:
P(X ≥ 1) = 1 − P(X = 0) ≈ 1 − 0.904833 = 0.095167
– by the Poisson formula:
P(X ≥ 1) = 1 − P(X = 0) ≈ 1 − 0.904837 = 0.095163.
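The two approximations can be reproduced as follows (a Python sketch; variable names are mine):

```python
from math import comb, exp

n, p = 1000, 1e-4
lam = n * p  # 0.1

binom_p0 = comb(n, 0) * (1 - p) ** n     # P(X = 0), binomial
poisson_p0 = exp(-lam)                   # P(X = 0), Poisson approximation

assert abs(binom_p0 - 0.904833) < 1e-6
assert abs(poisson_p0 - 0.904837) < 1e-6
assert abs((1 - binom_p0) - 0.095167) < 1e-6   # at least one misprint
```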
Remark 3. (The sum of independent Poisson variables is Poisson)
If X_1 and X_2 are independent Poisson random variables with parameters λ_1 and λ_2, respectively, then X_1 + X_2 is Poisson with parameter λ_1 + λ_2.
Proof.
P(X_1 + X_2 = n) = Σ_{k=0}^{n} P(X_1 = k, X_2 = n − k)
= Σ_{k=0}^{n} P(X_1 = k) P(X_2 = n − k) = Σ_{k=0}^{n} e^{−λ_1} (λ_1^k / k!) e^{−λ_2} (λ_2^{n−k} / (n−k)!)
= e^{−(λ_1+λ_2)} (1/n!) Σ_{k=0}^{n} (n!/(k!(n−k)!)) λ_1^k λ_2^{n−k}
= e^{−(λ_1+λ_2)} (1/n!) Σ_{k=0}^{n} C_n^k λ_1^k λ_2^{n−k}
= e^{−(λ_1+λ_2)} (1/n!) (λ_1 + λ_2)^n = e^{−(λ_1+λ_2)} (λ_1 + λ_2)^n / n!, n = 0, 1, …
Sometimes we are interested not just in one Poisson random variable, but in a family of random variables.
For example, in the previous example we may be interested in the probabilities of the numbers of misprints on several pages.
A family of random variables X(t) depending on a parameter t is called a stochastic
or random process.
The parameter t is time in most applications.
Next we will present a particular stochastic process called the Poisson process.
Definition. (Poisson process)
A family of random variables (X(t))_{t>0} is called a Poisson process with rate λ, λ > 0, if the r.v. X(t) (the number of occurrences of some type in any interval of length t) has a Poisson distribution with parameter λt for any t > 0:
P(X(t) = k) = (λt)^k e^{−λt} / k!, k = 0, 1, …
and for each 0 < t_1 < ⋯ < t_n the random variables X(t_1), X(t_2) − X(t_1), …, X(t_n) − X(t_{n−1}) are independent (i.e. the numbers of occurrences in non-overlapping time intervals are independent of each other).
Example. (Misprints on several pages)
Suppose the pages of a book contain misprinted characters, independently of each other, at a rate of λ = 0.1 misprints per page. Suppose that the number X(t) of misprints on any t pages forms a Poisson process. Find the probabilities of having
a) no misprints on the first 3 pages
b) at least two misprints on the first two pages.
Solution. a) Since in this case t = 3 and λt = 0.3, then
P(X(3) = 0) = 0.3^0 e^{−0.3} / 0! = e^{−0.3} ≈ 0.74
b) In this case t = 2, λt = 0.2. Hence
P(X(2) ≥ 2) = 1 − [P(X(2) = 0) + P(X(2) = 1)] = 1 − 0.2^0 e^{−0.2}/0! − 0.2^1 e^{−0.2}/1! = 1 − 1.2e^{−0.2} ≈ 0.017
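Both probabilities follow from the Poisson pmf with parameter λt; a Python sketch (the helper name is mine):

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    # P(X(t) = k) with mu = lambda * t
    return mu ** k * exp(-mu) / factorial(k)

p_a = poisson_pmf(0, 0.3)                            # no misprints on 3 pages
p_b = 1 - poisson_pmf(0, 0.2) - poisson_pmf(1, 0.2)  # >= 2 misprints on 2 pages

assert abs(p_a - exp(-0.3)) < 1e-12
assert abs(p_b - (1 - 1.2 * exp(-0.2))) < 1e-12
```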
The next remark, which is an immediate consequence of the deﬁnition of random
processes and Remark 3, says that the number of occurrences in an interval depends
only on the length of the interval. This property is called stationarity.
Remark 4. For any s, t > 0, X(s + t) − X(s) has the same distribution as X(t).
Continuous random variables
The uniform random variable
Definition. A random variable X is said to be uniformly distributed over the interval [a, b] if its distribution is
X : ( x ; f(x) )_{x∈R},
where the probability density function is
f(x) =
 1/(b − a), a ≤ x ≤ b
 0,         otherwise
Notation: X ∼ U(a, b)
Remark 1. The function f veriﬁes the two conditions of a probability density
function.
Proof. 1) f(x) ≥ 0, ∀ x ∈ R (since a < b)
2) ∫_{−∞}^{∞} f(x) dx = ∫_{a}^{b} 1/(b − a) dx = x/(b − a) |_{a}^{b} = (b − a)/(b − a) = 1
Remark 2. (The distribution function)
If X ∼ U(a, b), then
F(x) =
 0,                x ≤ a
 (x − a)/(b − a),  a < x ≤ b
 1,                x > b.
Proof. F : R → [0, 1], F(x) = P(X < x) = ∫_{−∞}^{x} f(t) dt
• if x ≤ a then
F(x) = ∫_{−∞}^{x} 0 dt = 0
• if a < x ≤ b then
F(x) = ∫_{−∞}^{a} 0 dt + ∫_{a}^{x} 1/(b − a) dt = 0 + t/(b − a) |_{a}^{x} = (x − a)/(b − a)
• if b < x then
F(x) = ∫_{−∞}^{a} 0 dt + ∫_{a}^{b} 1/(b − a) dt + ∫_{b}^{x} 0 dt = (b − a)/(b − a) = 1,
as desired.
Remark 3. If X ∼ U(a, b), then
E(X) = (a + b)/2, V(X) = (b − a)^2/12.
Proof.
E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_{a}^{b} x · 1/(b − a) dx = (1/(b − a)) · x^2/2 |_{a}^{b} = (b^2 − a^2)/(2(b − a)) = (a + b)/2
To determine V(X) we first compute E(X^2).
E(X^2) = ∫_{−∞}^{∞} x^2 f(x) dx = ∫_{a}^{b} x^2 · 1/(b − a) dx = (1/(b − a)) · x^3/3 |_{a}^{b} = (b^3 − a^3)/(3(b − a)) = (a^2 + ab + b^2)/3.
Hence, the variance is
V(X) = (a^2 + ab + b^2)/3 − ((a + b)/2)^2 = (a^2 − 2ab + b^2)/12 = (b − a)^2/12
Notation. If a = 0 and b = 1 we obtain the standard uniform random variable.
Example. Buses arrive at a specified station at 10-minute intervals starting at 6 A.M. That is, they arrive at 6, 6:10, 6:20 and so on. If a passenger arrives at the station at a time that is uniformly distributed between 6 and 6:20, find the probability that he waits:
a) less than 5 minutes for a bus
b) more than 7 minutes for a bus.
Solution. Let X denote the number of minutes past 6 that the passenger arrives
at the station.
a) The passenger will have to wait less than 5 minutes if (and only if) he arrives
between 6:05 and 6:10 or between 6:15 and 6:20.
Hence, the desired probability is
P(5 < X < 10) +P(15 < X < 20) =
_
10
5
1
20
dx +
_
20
15
1
20
dx =
10
20
=
1
2
b) The passenger will wait more than 7 minutes if he arrives between 6 and 6:03
or between 6:10 and 6:13, so the desired probability is

P(0 < X < 3) + P(10 < X < 13) = 3/20 + 3/20 = 6/20 = 3/10.
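The waiting-time argument above can be checked by a quick Monte Carlo experiment; the sketch below (plain Python, with the passenger's arrival drawn from U(0, 20) and buses assumed at every multiple of 10 minutes) estimates both probabilities.

```python
import random

random.seed(0)
N = 100_000
less5 = more7 = 0
for _ in range(N):
    x = random.uniform(0, 20)      # minutes past 6:00 at which the passenger arrives
    wait = (10 - x % 10) % 10      # next bus leaves at the next multiple of 10 minutes
    less5 += wait < 5
    more7 += wait > 7
print(less5 / N, more7 / N)        # should be close to 1/2 and 3/10
```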
The exponential random variable
Deﬁnition. A random variable X is said to be an exponential random variable (or
exponentially distributed) with parameter λ, λ > 0 if its probability density function
is
f(x) = { λe^{-λx},  x > 0
       { 0,         x ≤ 0.

Remark 1. The function f verifies the two conditions of a probability density
function.

Proof. 1) f(x) ≥ 0, ∀ x ∈ R (since λ > 0).

2) ∫_{-∞}^{∞} f(x)dx = ∫_{-∞}^0 0 dx + ∫_0^{∞} λe^{-λx} dx = λ · e^{-λx}/(-λ) |_0^{∞}
   = λ (0 - 1/(-λ)) = 1.
Remark 2. The distribution function
If X is an exponential random variable then

F(x) = { 0,            x ≤ 0
       { 1 - e^{-λx},  x > 0.

Proof. F : R → [0, 1], F(x) = P(X < x) = ∫_{-∞}^x f(t)dt.

• if x ≤ 0, F(x) = ∫_{-∞}^x 0 dt = 0;

• if x > 0,

F(x) = ∫_{-∞}^0 0 dt + ∫_0^x λe^{-λt} dt = -∫_0^x (e^{-λt})′ dt = -e^{-λx} + 1 = 1 - e^{-λx},
as desired.
Remark 3. If X is exponentially distributed then

E(X) = 1/λ   and   V(X) = 1/λ^2.

Proof.

E(X) = ∫_{-∞}^{∞} xf(x)dx = ∫_0^{∞} x λe^{-λx} dx.

By using the change of variable y = λx we obtain

E(X) = ∫_0^{∞} (y/λ) e^{-y} dy = (1/λ) ∫_0^{∞} y^{2-1} e^{-y} dy = (1/λ) Γ(2) = 1/λ.
To determine V(X) we compute first E(X^2) (in computing the corresponding
integral, we will use the same change of variable as before):

E(X^2) = ∫_0^{∞} x^2 λe^{-λx} dx = ∫_0^{∞} (y^2/λ^2) e^{-y} dy
       = (1/λ^2) ∫_0^{∞} y^{3-1} e^{-y} dy = (1/λ^2) Γ(3) = 2/λ^2.

Hence, the variance is

V(X) = E(X^2) - (E(X))^2 = 2/λ^2 - 1/λ^2 = 1/λ^2.
Remark 4. The memoryless property
If X is an exponential random variable then:

P(X > s + t | X > t) = P(X > s), for all s, t ≥ 0.

Proof.

P(X > s + t | X > t) = P(X > s + t, X > t)/P(X > t) = P(X > s + t)/P(X > t)
  = (1 - P(X ≤ s + t))/(1 - P(X ≤ t)) = (1 - P(X < s + t))/(1 - P(X < t))
  = (1 - F(s + t))/(1 - F(t)) = e^{-λ(s+t)}/e^{-λt} = e^{-λs}
  = 1 - F(s) = 1 - P(X < s) = P(X ≥ s) = P(X > s).

To understand why the previous equality is called the memoryless property, consider
that X represents the length of time that an item functions before failing. The
equality says that the probability that an item functioning at age t will continue
to function for at least an additional time s is the same as the probability that a
new item functions for at least a period of time s. In other words, a functioning
item is "as good as new". It can be shown that the exponential random variables are
the only continuous random variables that are memoryless.
There is a very important relationship between the exponential random variables
and the Poisson process.
The next remark shows that in a Poisson process the "waiting time" for an occurrence
and "the time between any two consecutive occurrences" (the interarrival time)
have the same exponential distribution with parameter λ.
Remark 5. Let (X(t))_{t≥0} be a Poisson process with rate λ, λ > 0.

a) If s ≥ 0 and if T_1 is the random variable which represents the length of time
till the first occurrence after s, then T_1 is an exponential random variable with
parameter λ.

b) Suppose we have an occurrence at time s ≥ 0. Let T_2 be the random variable
which represents the time between this occurrence and the next one. Then T_2 is an
exponential random variable with parameter λ.

Proof. a) Let t > 0. We have to compute P(T_1 < t). We have

P(T_1 < t) = P(T_1 ≤ t) = 1 - P(T_1 > t).

Clearly, for any t > 0, the waiting time T_1 is greater than t if and only if there is
no occurrence in the time interval (s, s + t]. Thus:

P(T_1 < t) = 1 - P(T_1 > t) = 1 - P(X(s + t) - X(s) = 0)
           = 1 - P(X(t) = 0) = 1 - ((λt)^0/0!) e^{-λt} = 1 - e^{-λt}.

The previous equality together with P(T_1 < t) = 0 for t ≤ 0 shows that T_1 has
the distribution function of an exponential random variable with parameter λ.

b) We have that P(T_2 < t) = 0 for t ≤ 0.
Let t > 0. We have to compute P(T_2 < t).
Instead of assuming that we have an occurrence at time s, we assume that we have
an occurrence in the time interval [s - ∆s, s] and let ∆s → 0. Then

P(T_2 < t) = P(T_2 ≤ t) = 1 - P(T_2 > t)
           = 1 - lim_{∆s→0} P(X(s + t) - X(s) = 0 | X(s) - X(s - ∆s) = 1)
           = 1 - P(X(s + t) - X(s) = 0) = 1 - P(X(t) = 0) = 1 - e^{-λt}.

Thus T_2 has the distribution function of an exponential random variable.
Remark 6. If in a Poisson process the rate is λ, that is, the mean number of
occurrences per unit time is λ, then the mean interoccurrence time is 1/λ.
Example. A checkout counter at a supermarket completes the service of a customer
in a time which is exponentially distributed, with a service rate of 15 per hour. A
customer arrives at the checkout counter. Find the following probabilities:
a) the service is completed in more than 5 minutes;
b) the customer has to wait more than 8 minutes knowing that he already waited
3 minutes;
c) the service is completed in a time between 5 and 8 minutes.

Solution. We have first to convert the service rate so that the time period is
1 minute: the service rate is λ = 0,25/minute.

a) P(X > 5) = 1 - P(X ≤ 5) = 1 - P(X < 5) = 1 - 1 + e^{-0,25·5} = e^{-1,25}.

b) P(X > 8 | X > 3) = P(X > 8, X > 3)/P(X > 3) = P(X > 8)/P(X > 3)
   = (1 - 1 + e^{-0,25·8})/(1 - 1 + e^{-0,25·3}) = e^{-0,25·5} = e^{-1,25},

as we expected according to the "memoryless property" of the exponential random
variable.

c) P(5 < X < 8) = ∫_5^8 f(x)dx = ∫_5^8 λe^{-λx} dx = -e^{-λx} |_5^8
   = e^{-0,25·5} - e^{-0,25·8} = e^{-1,25} - e^{-2}.
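The three probabilities can be evaluated numerically straight from F(x) = 1 - e^{-λx}; a minimal sketch:

```python
import math

lam = 0.25  # service rate per minute (15 per hour)
p_a = math.exp(-lam * 5)                        # P(X > 5) = e^{-1.25}
p_b = math.exp(-lam * 8) / math.exp(-lam * 3)   # P(X > 8 | X > 3)
p_c = math.exp(-lam * 5) - math.exp(-lam * 8)   # P(5 < X < 8)
print(round(p_a, 4), round(p_b, 4), round(p_c, 4))  # 0.2865 0.2865 0.1512
```

Note that p_a and p_b coincide, which is exactly the memoryless property.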
The Erlang random variable
The Erlang distribution is a generalization of the exponential distribution. While
the exponential random variable describes the time between two consecutive events,
the Erlang random variable describes the time interval between an event and the n-th
following event.

Definition. A random variable X is said to be an Erlang random variable with
parameters λ and n (λ > 0, n ∈ N*) if it has the following distribution

X : ( x ; f(x, n, λ) )_{x∈R}

where

f(x, n, λ) = { λ^n x^{n-1} e^{-λx}/(n - 1)!,  n = 1, 2, 3, . . . , x ≥ 0
             { 0,                             x < 0.
Remark 1. The function f(·, n, λ) verifies the two conditions of a probability
density function.

Proof. 1) f(x, n, λ) ≥ 0, ∀ x ∈ R (since λ > 0).

2) In order to compute the integral ∫_{-∞}^{∞} f(x, n, λ)dx we will use the Gamma
function. If we let t = λx, then

∫_{-∞}^{∞} f(x, n, λ)dx = λ^n/(n - 1)! ∫_0^{∞} x^{n-1} e^{-λx} dx
  = λ^n/(n - 1)! ∫_0^{∞} (t/λ)^{n-1} e^{-t} (1/λ) dt
  = (λ^n/(n - 1)!) (1/λ^n) ∫_0^{∞} t^{n-1} e^{-t} dt
  = (1/(n - 1)!) Γ(n) = (1/(n - 1)!) (n - 1)! = 1,

as we needed.
Remark 2. If X is an Erlang random variable, then

E(X) = n/λ   and   V(X) = n/λ^2.

Proof. In the next computations we will make the same change of variable as in
the proof of the previous remark.

E(X) = ∫_{-∞}^{∞} xf(x, n, λ)dx = λ^n/(n - 1)! ∫_0^{∞} x · x^{n-1} e^{-λx} dx
     = λ^n/(n - 1)! ∫_0^{∞} (t^n/λ^n) e^{-t} (1/λ) dt = (1/(n - 1)!) (1/λ) Γ(n + 1)
     = (1/(n - 1)!) (1/λ) n! = n/λ.

Note that this is n times the expected value of the exponential distribution (with
parameter λ).

Similarly,

E(X^2) = (1/(n - 1)!) (1/λ^2) Γ(n + 2) = n(n + 1)/λ^2.

Therefore, the variance of X is:

V(X) = n(n + 1)/λ^2 - n^2/λ^2 = n/λ^2.
Remark 3. The distribution function of an Erlang random variable
If X is an Erlang random variable with parameters n and λ then

F(x) = { 1 - Σ_{k=0}^{n-1} (λx)^k e^{-λx}/k!,  x ≥ 0
       { 0,                                    x < 0.
Proof. Let x > 0. Substituting u = λt and integrating by parts repeatedly,

F(x) = ∫_{-∞}^x f(t, n, λ)dt = λ^n/(n - 1)! ∫_0^x t^{n-1} e^{-λt} dt
     = 1/(n - 1)! ∫_0^x (λt)^{n-1} e^{-λt} λ dt = 1/(n - 1)! ∫_0^{λx} u^{n-1} e^{-u} du
     = 1/(n - 1)! ∫_0^{λx} u^{n-1} (-e^{-u})′ du
     = 1/(n - 1)! [-(λx)^{n-1} e^{-λx} + (n - 1) ∫_0^{λx} u^{n-2} e^{-u} du]
     = 1/(n - 2)! ∫_0^{λx} u^{n-2} e^{-u} du - (λx)^{n-1} e^{-λx}/(n - 1)!
     = 1/(n - 3)! ∫_0^{λx} u^{n-3} e^{-u} du - (λx)^{n-2} e^{-λx}/(n - 2)! - (λx)^{n-1} e^{-λx}/(n - 1)!
     = · · ·
     = ∫_0^{λx} u e^{-u} du - Σ_{k=2}^{n-1} (λx)^k e^{-λx}/k!
     = -u e^{-u} |_0^{λx} + ∫_0^{λx} e^{-u} du - Σ_{k=2}^{n-1} (λx)^k e^{-λx}/k!
     = 1 - Σ_{k=0}^{n-1} (λx)^k e^{-λx}/k!,

as desired.
Example. The lengths of phone calls at a certain phone booth are exponentially
distributed with a mean of 4 minutes. I arrived at the booth while Ana was using the
phone, and I was told that she had already spent 2 minutes on the call before I arrived.
a) What is the average time I will wait until she ends her call?
b) What is the probability that Ana's call will last between 3 and 6 minutes after
my arrival?
c) Assume that I am the first in line at the booth to use the phone after Ana, and
by the time she finished her call more than 4 people were waiting to use the phone.
What is the probability that the time from the moment I start using the phone until
the fourth person behind me starts his/her call is greater than 15 minutes?

Solution. Let X be the random variable which represents the lengths of calls at
the phone booth. Then

f_X(x) = { λe^{-λx},  x ≥ 0
         { 0,         x < 0,

with λ = 1/4.
a) Due to the memoryless property of the exponential random variable, the average
time I wait until Ana's call ends is 4 minutes.

b) Due to the memoryless property of the exponential random variable, the probability
that Ana's call lasts between 3 and 6 minutes after my arrival is the probability
that an arbitrary call lasts between 3 and 6 minutes, which is

P(3 < X < 6) = ∫_3^6 λe^{-λx} dx = (-e^{-λx}) |_3^6 = e^{-3λ} - e^{-6λ}
             = e^{-3/4} - e^{-6/4} ≈ 0,2492.

c) Let Y be the random variable that represents the time from the moment I start
my phone call until the fourth person starts his/her call. Then Y is an Erlang random
variable with parameters n = 4 and λ = 1/4. Then,

P(Y > 15) = 1 - P(Y ≤ 15) = 1 - P(Y < 15) = 1 - F_Y(15)
          = 1 - 1 + Σ_{k=0}^{3} ((λ·15)^k/k!) e^{-λ·15}
          = e^{-15/4} [1 + 15·(1/4) + (1/2!)(15·(1/4))^2 + (1/3!)(15·(1/4))^3] ≈ 0,4838.
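The tail probability in part c) comes directly from the Erlang distribution function derived above; a small helper (the name erlang_sf is our own) makes it reusable:

```python
import math

def erlang_sf(x, n, lam):
    # P(Y > x) = sum_{k=0}^{n-1} (lam*x)^k e^{-lam*x} / k!  for Y ~ Erlang(n, lam)
    return math.exp(-lam * x) * sum((lam * x) ** k / math.factorial(k) for k in range(n))

print(round(erlang_sf(15, 4, 0.25), 4))  # 0.4838
```

For n = 1 this reduces to the exponential tail e^{-λx}, as it should.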
The normal random variable
Deﬁnition. A random variable X is said to be a normal random variable with
parameters m and σ (m ∈ R, σ > 0) if it has the following distribution:
X : ( x ; f(x; m, σ) )_{x∈R}

where

f(x; m, σ) = (1/(√(2π) σ)) e^{-(x-m)^2/(2σ^2)}.

Notation: X ∼ N(m, σ^2).
The normal r.v. is also called the Laplace-Gauss random variable.
Remark 1. If X ∼ N(m, σ^2) then its density function is a bell-shaped curve (or
Gauss curve) that is symmetric with respect to the line x = m (see figure below).
[Figure: the Gauss bell curve, symmetric about x = m, with marks at m - 2σ, m - σ,
m, m + σ, m + 2σ and maximum value 1/(√(2π) σ) ≈ 0,399/σ.]
The normal distribution was introduced by the French mathematician Abraham
de Moivre in 1733 and was used by him to approximate probabilities associated with
binomial random variables when n (the binomial parameter) is large. This result was
later extended by Gauss and Laplace.
Remark 2. The function f(·, m, σ) verifies the two conditions of a probability
density function.

Proof. 1) f(x, m, σ) > 0, ∀ x ∈ R (since σ > 0).

2) In order to compute the integral

∫_{-∞}^{∞} f(x, m, σ)dx = (1/(√(2π) σ)) ∫_{-∞}^{∞} e^{-(x-m)^2/(2σ^2)} dx

we make the change of variable (x - m)/(√2 σ) = u, with x = m + √2 σu and
dx = √2 σ du. Hence:

∫_{-∞}^{∞} f(x, m, σ)dx = (1/(√(2π) σ)) ∫_{-∞}^{∞} e^{-u^2} √2 σ du
  = (2/√π) ∫_0^{∞} e^{-u^2} du = (2/√π)(√π/2) = 1,

as desired.

In establishing the previous equality we used the value of the Euler-Poisson integral:

∫_0^{∞} e^{-u^2} du = √π/2.
Remark 3. If X ∼ N(m, σ^2), m ∈ R, σ > 0 then

E(X) = m   and   V(X) = σ^2.

Proof.

E(X) = (1/(√(2π) σ)) ∫_{-∞}^{∞} x e^{-(x-m)^2/(2σ^2)} dx
     = (1/(√(2π) σ)) ∫_{-∞}^{∞} (x - m) e^{-(x-m)^2/(2σ^2)} dx
       + (m/(√(2π) σ)) ∫_{-∞}^{∞} e^{-(x-m)^2/(2σ^2)} dx.

Letting y = x - m in the first integral yields

E(X) = (1/(√(2π) σ)) ∫_{-∞}^{∞} y e^{-y^2/(2σ^2)} dy + m ∫_{-∞}^{∞} f(x, m, σ)dx,

where f(·, m, σ) is the normal density. By symmetry, the first integral must be zero, so

E(X) = m ∫_{-∞}^{∞} f(x, m, σ)dx = m · 1 = m.

Since E(X) = m, we have (with the substitution u = (x - m)/σ, dx = σ du)

V(X) = E((X - m)^2) = (1/(√(2π) σ)) ∫_{-∞}^{∞} (x - m)^2 e^{-(x-m)^2/(2σ^2)} dx
     = (1/(√(2π) σ)) ∫_{-∞}^{∞} σ^2 u^2 e^{-u^2/2} σ du
     = (σ^2/√(2π)) ∫_{-∞}^{∞} u (-e^{-u^2/2})′ du
     = (σ^2/√(2π)) [-u e^{-u^2/2} |_{-∞}^{∞} + ∫_{-∞}^{∞} e^{-u^2/2} du]
     = (σ^2/√(2π)) · 2 ∫_0^{∞} e^{-u^2/2} du = (2σ^2/√(2π)) √(π/2) = σ^2.
Remark 4. a) If X ∼ N(m, σ^2), m ∈ R, σ > 0 then

Y = aX + b ∼ N(am + b, a^2 σ^2)   (a ≠ 0, b ∈ R).

b) If X ∼ N(m, σ^2), m ∈ R, σ > 0 then

Z = (X - m)/σ ∼ N(0, 1).

The random variable Z ∼ N(0, 1) is called a standard normal random variable.

Proof. We can suppose that a > 0 (the proof for a < 0 is quite similar).
Let F_Y be the cumulative distribution function of the r.v. Y:

F_Y(y) = P(aX + b < y) = P(X < (y - b)/a) = F_X((y - b)/a).

The probability density function of Y is obtained by differentiating the previous
equality:

f_Y(y) = (1/a) f_X((y - b)/a) = (1/(√(2π) σa)) e^{-((y-b)/a - m)^2/(2σ^2)}
       = (1/(√(2π) σa)) e^{-(y - (am+b))^2/(2σ^2 a^2)}.

Hence, Y is normal with mean am + b and variance a^2 σ^2.

b) Take a = 1/σ and b = -m/σ in part a).
Remark 5. (The distribution function of a normal random variable)
Let φ : R → R,

φ(z) = (1/√(2π)) ∫_0^z e^{-y^2/2} dy,

be the Laplace function (whose values can be found in tables).
If X ∼ N(m, σ^2), m ∈ R, σ > 0 then the distribution function of the r.v. X is
given by

F(x) = 1/2 + φ((x - m)/σ).

Also, we have

i) P(a < X < b) = φ((b - m)/σ) - φ((a - m)/σ)

ii) P(|X - m| < r) = 2φ(r/σ)

with the following particular cases:

P(|X - m| < σ) = 2φ(1) = 0,6826
P(|X - m| < 2σ) = 2φ(2) = 0,9544
P(|X - m| < 3σ) = 2φ(3) = 0,9972.
Proof. We shall list first some properties of the Laplace function:

a) φ(0) = 0
b) lim_{z→∞} φ(z) = 1/2
c) lim_{z→-∞} φ(z) = -1/2
d) φ(-z) = -φ(z), ∀ z ∈ R.

Let F : R → [0, 1] be the distribution function of the r.v. X ∼ N(m, σ^2). Then

F(x) = P(X < x) = ∫_{-∞}^x f(t; m, σ)dt = (1/(√(2π) σ)) ∫_{-∞}^x e^{-(t-m)^2/(2σ^2)} dt.

By making the change of variable (t - m)/σ = y, dt = σ dy, we get

F(x) = (1/(√(2π) σ)) ∫_{-∞}^{(x-m)/σ} e^{-y^2/2} σ dy = (1/√(2π)) ∫_{-∞}^{(x-m)/σ} e^{-y^2/2} dy
     = (1/√(2π)) ∫_{-∞}^0 e^{-y^2/2} dy + (1/√(2π)) ∫_0^{(x-m)/σ} e^{-y^2/2} dy
     = 1/2 + φ((x - m)/σ).

i) P(a < X < b) = F(b) - F(a) = φ((b - m)/σ) - φ((a - m)/σ).

ii) P(|X - m| < r) = P(-r < X - m < r) = P(m - r < X < m + r)
  = φ((m + r - m)/σ) - φ((m - r - m)/σ) = φ(r/σ) - φ(-r/σ)
  = φ(r/σ) + φ(r/σ) = 2φ(r/σ).

The particular cases are obtained from the previous equality by taking r = σ,
r = 2σ and r = 3σ.
Example. An expert in a paternity suit testifies that the length of pregnancy
is approximately normally distributed with parameters m = 270 and σ = 10. The
defendant is able to prove that he wasn't in the country for a period that began 290
days before the birth of the child and ended 240 days before the birth. What is the
probability that the mother could have had a very long or a very short pregnancy, as
mentioned before?

Solution. Let X denote the length of pregnancy in days. If he is the father, the
probability that the birth could occur within the indicated period is

P(X < 240 or X > 290) = P(X < 240) + P(X > 290) = F(240) + 1 - F(290)
  = 1/2 + φ((240 - 270)/10) + 1 - 1/2 - φ((290 - 270)/10)
  = 1 + φ(-3) - φ(2) = 1 - φ(3) - φ(2) ≈ 0,0241.
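Instead of tables, the Laplace function can be computed with the error function, since φ(z) = erf(z/√2)/2; a sketch of the computation above:

```python
import math

def phi(z):
    # Laplace function: (1/sqrt(2*pi)) * integral from 0 to z of e^{-y^2/2} dy
    return 0.5 * math.erf(z / math.sqrt(2))

p = 1 + phi(-3) - phi(2)   # P(X < 240) + P(X > 290), with m = 270, sigma = 10
print(round(p, 4))  # 0.0241
```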
Next, we will present the De Moivre-Laplace limit theorem, which states that when
n is large a binomial random variable with parameters n and p will have approximately
the same distribution as a normal random variable with the same mean and variance
(as the binomial).
De Moivre (1733) proved this result for the particular case p = 1/2 and Laplace
(1812) extended it to general p.
Theorem. (De Moivre-Laplace limit theorem)
If X_n ∼ B(n, p) then, for any a, b ∈ R, a < b, we have

lim_{n→∞} P(a ≤ (X_n - np)/√(np(1 - p)) < b) = φ(b) - φ(a).
Remark 6. (Normal approximation with continuity correction)
If X_n ∼ B(n, p) then for any integers i, j, 0 ≤ i ≤ j ≤ n,

P(i ≤ X_n ≤ j) ≈ φ((j + 1/2 - np)/√(np(1 - p))) - φ((i - 1/2 - np)/√(np(1 - p))).

In general, we make the following adjustments:

P(X_n ≤ j) ≈ 1/2 + φ((j + 1/2 - np)/√(np(1 - p)))

P(X_n < j) ≈ 1/2 + φ((j - 1/2 - np)/√(np(1 - p))).

Hence, we have two possible approximations to binomial probabilities:
- if n is large and p small, such that np is moderate in size, we can use the Poisson
approximation;
- if np(1 - p) is large (usually when np ≥ 5 and n(1 - p) ≥ 5) we can use the normal
approximation.
Example. Each item produced by a manufacturer is, independently, of good quality
with probability 0,95. Approximate the probability of the event that at most 40
of the next 1000 items are of bad quality.

Solution. Let X be the random variable which represents the number of items of bad
quality among the next 1000 produced. Clearly X ∼ B(1000; 0,05). The expected value
of the binomial is np = 1000 · 0,05 = 50 and the variance is

np(1 - p) = 1000 · 0,05 · 0,95 = 47,5.

Hence,

P(X ≤ 40) = P(X ≤ 40 + 1/2) ≈ P((X - 50)/√47,5 ≤ (40 + 1/2 - 50)/√47,5)
  = 1/2 + φ(-1,38) = 1/2 - φ(1,38) ≈ 0,5 - 0,4162 = 0,0838.
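The quality of the normal approximation can be judged against the exact binomial sum, which is feasible to compute directly here; a sketch:

```python
import math

n, p = 1000, 0.05
mu, var = n * p, n * p * (1 - p)

# exact binomial tail P(X <= 40)
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(41))

# normal approximation with continuity correction
z = (40.5 - mu) / math.sqrt(var)
approx = 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(exact, 4), round(approx, 4))
```

The two values agree to about two decimal places, which is typical when np and n(1 - p) are both large.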
The Gamma random variable
The Gamma distribution is a generalization of the Erlang distribution in which the
parameter n is replaced by a real number a > 0, which need not be an integer. This
distribution has a lot of applications in statistics.

Definition. A random variable X is said to be a gamma random variable with
two parameters a > 0 (the shape parameter) and b > 0 (the scale parameter) if it
has the following distribution:

X : ( x ; f(x; a, b) )_{x∈R}

where

f(x; a, b) = { (1/(Γ(a) b^a)) x^{a-1} e^{-x/b},  x > 0
             { 0,                                x ≤ 0.

Notation: X ∼ Γ(a, b).
Remark 1. The function f(·, a, b) verifies the two conditions of a probability
density function.

Proof. 1) f(x, a, b) ≥ 0, ∀ x ∈ R (since a, b > 0).

2) In order to compute the integral ∫_{-∞}^{∞} f(x, a, b)dx we make the change
of variable x/b = y, with x = by and dx = b dy. Hence

∫_{-∞}^{∞} f(x, a, b)dx = (1/(Γ(a) b^a)) ∫_0^{∞} b^{a-1} y^{a-1} e^{-y} b dy
  = (1/Γ(a)) ∫_0^{∞} y^{a-1} e^{-y} dy = (1/Γ(a)) Γ(a) = 1,

as desired.
Remark 2. If X ∼ Γ(a, b) then E(X) = ab and V(X) = ab^2.

Proof. In the next computations we will make the same change of variable as in
the proof of the previous remark.

E(X) = ∫_{-∞}^{∞} xf(x, a, b)dx = (1/(Γ(a) b^a)) ∫_0^{∞} x · x^{a-1} e^{-x/b} dx
     = (1/(Γ(a) b^a)) ∫_0^{∞} b^a y^a e^{-y} b dy = (b^{a+1}/(Γ(a) b^a)) Γ(a + 1)
     = b a Γ(a)/Γ(a) = ab.

Similarly,

E(X^2) = ∫_{-∞}^{∞} x^2 f(x, a, b)dx = (1/(Γ(a) b^a)) ∫_0^{∞} x^{a+1} e^{-x/b} dx
       = (1/(Γ(a) b^a)) ∫_0^{∞} b^{a+1} y^{a+1} e^{-y} b dy
       = (b^2/Γ(a)) Γ(a + 2) = b^2 (a + 1) a Γ(a)/Γ(a) = a(a + 1) b^2.

Therefore, the variance of X is:

V(X) = E(X^2) - (E(X))^2 = a(a + 1) b^2 - a^2 b^2 = ab^2.
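The formulas E(X) = ab and V(X) = ab² are easy to check empirically with Python's random.gammavariate, whose second argument is the scale parameter b, matching the parametrization used here:

```python
import random

random.seed(5)
a, b = 3.0, 2.0                 # shape and scale
N = 200_000
xs = [random.gammavariate(a, b) for _ in range(N)]
mean = sum(xs) / N
var = sum((x - mean) ** 2 for x in xs) / N
print(mean, var)                # should be close to a*b = 6 and a*b^2 = 12
```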
Remark 3. (The distribution function of a Gamma random variable)
If X is a gamma random variable with parameters a and b, then

F(x) = { 0,                                      x ≤ 0
       { (1/Γ(a)) ∫_0^{x/b} u^{a-1} e^{-u} du,   x > 0.

Proof. For each x > 0 we have (with the substitution u = t/b)

F(x) = P(X < x) = ∫_{-∞}^x f(t, a, b)dt = (1/(Γ(a) b^a)) ∫_0^x t^{a-1} e^{-t/b} dt
     = (1/(Γ(a) b^a)) ∫_0^{x/b} b^{a-1} u^{a-1} e^{-u} b du
     = (1/Γ(a)) ∫_0^{x/b} u^{a-1} e^{-u} du.
Gamma random variables, with values of the parameter a that are not just integers or
half-integers, are often used to model continuous random variables with an approximately
known distribution on (0, ∞).
An important case in which we obtain a gamma random variable is described in
the following remark:
Remark 4. (Square of a normal random variable)
If X ∼ N(0, σ^2) then Y = X^2 ∼ Γ(1/2, 2σ^2).

Proof. The distribution function F_Y of Y is given by

F_Y(y) = P(X^2 ≤ y) = { P(-√y ≤ X ≤ √y),  if y > 0
                      { 0,                if y ≤ 0

       = { F_X(√y) - F_X(-√y),  if y > 0
         { 0,                   if y ≤ 0.

Hence, for y > 0 we have

f_Y(y) = F_Y′(y) = (1/(2√y)) (f_X(√y) + f_X(-√y))
       = (1/(2√y)) (2/(√(2π) σ)) e^{-y/(2σ^2)} = (1/(√(2π) σ)) y^{-1/2} e^{-y/(2σ^2)}
       = (1/(Γ(1/2) (2σ^2)^{1/2})) y^{1/2 - 1} e^{-y/(2σ^2)}.

In conclusion this density is gamma with shape a = 1/2 and scale b = 2σ^2 (in the
notation of the definition above, b is the scale parameter, so the corresponding rate
is 1/b = 1/(2σ^2)).
Remark 5. (Sum of independent gamma variables)
If X_1 ∼ Γ(a_1, b) and X_2 ∼ Γ(a_2, b) are independent then

X_1 + X_2 ∼ Γ(a_1 + a_2, b).

Proof. See Appendix B.
The Beta random variable
Similarly, continuous random variables with unknown distribution on [0, 1] are
often modeled by Beta random variables.
Deﬁnition. A random variable X is said to be a Beta random variable with
parameters a and b (a > 0, b > 0) if it has the following distribution:
X : ( x ; f(x; a, b) )_{x∈R}

where

f(x; a, b) = { (1/B(a, b)) x^{a-1} (1 - x)^{b-1},  x ∈ [0, 1]
             { 0,                                  otherwise.
Remark 1. The function f(, a, b) veriﬁes the two conditions of a probability
density function.
Proof. 1) f(x, a, b) ≥ 0, ∀ x ∈ R (since a, b > 0).

2) ∫_{-∞}^{∞} f(x, a, b)dx = (1/B(a, b)) ∫_0^1 x^{a-1} (1 - x)^{b-1} dx
   = (1/B(a, b)) B(a, b) = 1.
Remark 2. If X is a Beta r.v. then

E(X) = a/(a + b),   V(X) = ab/((a + b)^2 (a + b + 1)).

Proof.

E(X) = ∫_R xf(x, a, b)dx = (1/B(a, b)) ∫_0^1 x^a (1 - x)^{b-1} dx
     = (1/B(a, b)) B(a + 1, b) = (1/B(a, b)) · ((a + 1 - 1)/(a + 1 + b - 1)) B(a, b)
     = a/(a + b).

E(X^2) = (1/B(a, b)) ∫_0^1 x^{a+1} (1 - x)^{b-1} dx = (1/B(a, b)) B(a + 2, b)
       = (1/B(a, b)) · ((a + 1)/(a + b + 1)) B(a + 1, b)
       = (1/B(a, b)) · ((a + 1)/(a + b + 1)) · (a/(a + b)) B(a, b)
       = ((a + 1)/(a + b + 1)) · (a/(a + b)).

Hence,

V(X) = E(X^2) - (E(X))^2 = ((a + 1)/(a + b + 1)) · (a/(a + b)) - a^2/(a + b)^2
     = (a/(a + b)) [(a + 1)/(a + b + 1) - a/(a + b)]
     = ab/((a + b)^2 (a + b + 1)).
Remark 3. (The distribution function of a Beta random variable)
If X is a Beta r.v. then

F(x) = { 0,                                           x < 0
       { (1/B(a, b)) ∫_0^x t^{a-1} (1 - t)^{b-1} dt,  x ∈ [0, 1]
       { 1,                                           x > 1.
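A quick empirical check of the Beta moments, using random.betavariate:

```python
import random

random.seed(4)
a, b = 2.0, 5.0
N = 200_000
xs = [random.betavariate(a, b) for _ in range(N)]
mean = sum(xs) / N
var = sum((x - mean) ** 2 for x in xs) / N
# theory: a/(a+b) = 2/7 ≈ 0.2857 and ab/((a+b)^2 (a+b+1)) ≈ 0.0255
print(mean, var)
```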
The Chi-square random variable
The Chi-square distribution is related to both the exponential and the normal
distributions: if in the Gamma distribution we take a = n/2 and b = 2σ^2, we obtain
the Chi-square distribution.

Definition. A random variable X is said to be a Chi-square r.v. (or χ^2 random
variable) with n degrees of freedom (n ∈ N*) and parameter σ (σ > 0) if it has the
following distribution

X : ( x ; f(x; n, σ) )_{x∈R}

where

f(x; n, σ) = { (1/((2σ^2)^{n/2} Γ(n/2))) x^{n/2 - 1} e^{-x/(2σ^2)},  x > 0
             { 0,                                                    x ≤ 0.
By using the properties of the Gamma distribution we easily get that:

Remark 1. The function f(·, n, σ) verifies the two conditions of a probability
density function.

Remark 2. If X is a χ^2 random variable then

E(X) = nσ^2   and   V(X) = 2nσ^4.

Remark 3. (The distribution function of a χ^2 random variable)
If X is a χ^2 random variable then

F(x) = { 0,                                                 x ≤ 0
       { (1/Γ(n/2)) ∫_0^{x/(2σ^2)} u^{n/2 - 1} e^{-u} du,   x > 0.
Remark 4. If X_1, X_2, . . . , X_n are independent normal variables,
X_i ∼ N(0, σ^2), i = 1, n, then the random variable X defined by

X = X_1^2 + · · · + X_n^2

is a Chi-square r.v. with n degrees of freedom and parameter σ.

Proof. Since X_i, i = 1, n, is normal, X_i^2 ∼ Γ(1/2, 2σ^2) (see Remark 4 from
the subsection The Gamma random variable) and

X = X_1^2 + X_2^2 + · · · + X_n^2 ∼ Γ(n/2, 2σ^2)

(see Remark 5 from the subsection The Gamma random variable), as we needed.

The Chi-square distribution is used in certain statistical inference problems.
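Remark 4 can be illustrated by simulation: summing the squares of n independent N(0, σ²) draws should give a sample mean near nσ² and a sample variance near 2nσ⁴, as stated in Remark 2. A sketch:

```python
import random

random.seed(1)
n, sigma, N = 5, 2.0, 100_000
# each sample is X_1^2 + ... + X_n^2 with X_i ~ N(0, sigma^2)
samples = [sum(random.gauss(0, sigma) ** 2 for _ in range(n)) for _ in range(N)]
mean = sum(samples) / N
var = sum((s - mean) ** 2 for s in samples) / N
print(mean, var)   # should be close to n*sigma^2 = 20 and 2*n*sigma^4 = 160
```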
The lognormal random variable
Deﬁnition. A random variable X is said to be a lognormal random variable with
parameters m and σ (m > 0, σ > 0) if it has the following distribution:
X : ( x ; f(x; m, σ) )_{x∈R}

where

f(x; m, σ) = { (1/(√(2π) σx)) e^{-(1/(2σ^2)) ln^2(x/m)},  x > 0
             { 0,                                         x ≤ 0.
Remark 1. The function f(·, m, σ) verifies the two conditions of a probability
density function.

Proof. By using the change of variable y = (1/σ) ln(x/m), with x = m e^{σy} and
dx = mσ e^{σy} dy, we obtain

∫_{-∞}^{∞} f(x; m, σ)dx = (1/(√(2π) σ)) ∫_0^{∞} (1/x) e^{-(1/(2σ^2)) ln^2(x/m)} dx
  = (1/(√(2π) σ)) ∫_{-∞}^{∞} (1/m) e^{-σy} e^{-y^2/2} mσ e^{σy} dy
  = (1/√(2π)) ∫_{-∞}^{∞} e^{-y^2/2} dy = √(2π)/√(2π) = 1.
Remark 2. If X is a lognormal random variable then

E(X) = m e^{σ^2/2}   and   V(X) = m^2 e^{σ^2} (e^{σ^2} - 1).

Proof. By using the same change of variable as in the previous remark we obtain:

E(X) = ∫_{-∞}^{∞} xf(x; m, σ)dx = (1/(√(2π) σ)) ∫_0^{∞} e^{-(1/(2σ^2)) ln^2(x/m)} dx
     = (1/(√(2π) σ)) ∫_{-∞}^{∞} e^{-y^2/2} mσ e^{σy} dy
     = (m/√(2π)) ∫_{-∞}^{∞} e^{-y^2/2 + σy - σ^2/2 + σ^2/2} dy
     = (m e^{σ^2/2}/√(2π)) ∫_{-∞}^{∞} e^{-(y-σ)^2/2} dy = (m/√(2π)) e^{σ^2/2} √(2π)
     = m e^{σ^2/2}.

Similarly,

E(X^2) = (1/(√(2π) σ)) ∫_{-∞}^{∞} m e^{σy} e^{-y^2/2} mσ e^{σy} dy
       = (m^2/√(2π)) ∫_{-∞}^{∞} e^{-y^2/2 + 2σy - 2σ^2 + 2σ^2} dy
       = (m^2 e^{2σ^2}/√(2π)) ∫_{-∞}^{∞} e^{-(y-2σ)^2/2} dy = (m^2/√(2π)) e^{2σ^2} √(2π)
       = m^2 e^{2σ^2}.

In conclusion:

V(X) = E(X^2) - (E(X))^2 = m^2 e^{2σ^2} - (m e^{σ^2/2})^2 = m^2 e^{2σ^2} - m^2 e^{σ^2}
     = m^2 e^{σ^2} (e^{σ^2} - 1).
Remark 3. If Y is the random variable defined by Y = ln X, where X is a
lognormal random variable with parameters m > 0 and σ > 0, then

E(Y) = E(ln X) = ln m   and   V(Y) = V(ln X) = σ^2.
Proof. We will use the same change of variable as in the previous remarks.

E(Y) = E(ln X) = ∫_{-∞}^{∞} ln x · f(x, m, σ)dx
     = (1/(√(2π) σ)) ∫_0^{∞} ln x · (1/x) e^{-(1/(2σ^2)) ln^2(x/m)} dx
     = (1/(√(2π) σ)) ∫_{-∞}^{∞} (yσ + ln m) (1/m) e^{-σy} e^{-y^2/2} mσ e^{σy} dy
     = (σ/√(2π)) ∫_{-∞}^{∞} y e^{-y^2/2} dy + (ln m/√(2π)) ∫_{-∞}^{∞} e^{-y^2/2} dy
     = 0 + (ln m/√(2π)) √(2π) = ln m.

Similarly (the cross term vanishes by symmetry),

E(Y^2) = E(ln^2 X) = (1/√(2π)) ∫_{-∞}^{∞} (yσ + ln m)^2 e^{-y^2/2} dy
       = (σ^2/√(2π)) ∫_{-∞}^{∞} y^2 e^{-y^2/2} dy + (ln^2 m/√(2π)) ∫_{-∞}^{∞} e^{-y^2/2} dy
       = (σ^2/√(2π)) ∫_{-∞}^{∞} y (-e^{-y^2/2})′ dy + ln^2 m
       = (σ^2/√(2π)) [-y e^{-y^2/2} |_{-∞}^{∞} + ∫_{-∞}^{∞} e^{-y^2/2} dy] + ln^2 m
       = σ^2 + ln^2 m.

In conclusion:

V(Y) = V(ln X) = σ^2 + ln^2 m - ln^2 m = σ^2,

as desired.
Remark 4. If Y ∼ N(0, 1), m > 0 and σ > 0, then the random variable X defined
by X = m e^{σY} is a lognormal random variable with parameters m and σ.

Proof. Let Y ∼ N(0, 1), m > 0 and σ > 0.
We have to determine the probability density function of the random variable X.
First, we will determine the distribution function of the random variable X.
If x ≤ 0 then

F_X(x) = P(X < x) = P(m e^{σY} < x) = 0.

For x > 0 we get

F_X(x) = P(X < x) = P(m e^{σY} < x) = P(e^{σY} < x/m) = P(σY < ln(x/m))
       = P(Y < (1/σ) ln(x/m)) = F_Y((1/σ) ln(x/m)).

By differentiating the distribution function of X we get the probability density
function f_X. Hence, for x ≤ 0 we have f_X(x) = 0, and for x > 0

f_X(x) = F_X′(x) = f_Y((1/σ) ln(x/m)) · 1/(σx)
       = (1/(xσ√(2π))) e^{-(1/(2σ^2)) ln^2(x/m)},

which is the probability density function of a lognormal random variable with
parameters m and σ.
Remark 5. If X is a lognormal random variable with parameters m > 0 and
σ > 0, then the random variable Y defined by

Y = (1/σ) ln(X/m)

is a standard normal random variable (Y ∼ N(0, 1)).

Proof. Let m > 0, σ > 0 and X a lognormal r.v. with parameters m and σ.
We have to determine the probability density function of the random variable Y.
We determine first the distribution function of the random variable Y:

F_Y(y) = P(Y < y) = P((1/σ) ln(X/m) < y) = P(ln(X/m) < σy)
       = P(X/m < e^{σy}) = P(X < m e^{σy}) = F_X(m e^{σy}).

By differentiating the distribution function of Y we get the probability density
function f_Y:

f_Y(y) = F_Y′(y) = F_X′(m e^{σy}) · mσ e^{σy} = f_X(m e^{σy}) · mσ e^{σy}
       = mσ e^{σy} · (1/(σm e^{σy} √(2π))) e^{-(1/(2σ^2)) ln^2(e^{σy})}
       = (1/√(2π)) e^{-(1/(2σ^2)) σ^2 y^2} = (1/√(2π)) e^{-y^2/2},

which is the probability density function of a standard normal random variable.
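Remarks 2 and 4 combine into an easy empirical check: drawing Z ∼ N(0, 1) and setting X = m e^{σZ} should give a sample mean near m e^{σ²/2}. A sketch:

```python
import math
import random

random.seed(2)
m, sigma, N = 2.0, 0.5, 200_000
# X = m * e^{sigma * Z} is lognormal with parameters m and sigma
xs = [m * math.exp(sigma * random.gauss(0, 1)) for _ in range(N)]
mean = sum(xs) / N
print(mean, m * math.exp(sigma ** 2 / 2))   # the two values should be close
```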
Appendix A
Notions of topology in R^n
A metric space is an ordered pair (X, d), where X is a nonempty set and d is
a metric, that is, a function

d : X × X → R

such that

D1) d(x, y) ≥ 0, ∀ x, y ∈ X (non-negativity), and d(x, y) = 0 if and only if x = y;
D2) d(x, y) = d(y, x), ∀ x, y ∈ X (symmetry);
D3) d(x, z) ≤ d(x, y) + d(y, z), ∀ x, y, z ∈ X (triangle inequality).

The notion of a metric is a generalization of the Euclidean distance in R^n, defined as

d(x, y) = √((x_1 - y_1)^2 + · · · + (x_n - y_n)^2),
∀ x = (x_1, . . . , x_n) ∈ R^n, ∀ y = (y_1, . . . , y_n) ∈ R^n.
Remark. If d is the Euclidean distance defined before, then (R^n, d) is a metric
space, called the Euclidean metric space R^n.

A metric space also induces topological properties, like open and closed sets, which
lead to the study of more abstract topological spaces.

Definition. (Topological space)
A topological space is a pair (X, T), where X is a nonempty set and T ⊂ P(X)
(the power set of X), satisfying the following axioms:

T1) ∅, X ∈ T;
T2) the union of any collection of sets in T is also in T;
T3) the intersection of any finite collection of sets in T is also in T.

The collection T is called a topology on X. The sets in T are called open sets
and their complements in X are called closed sets.
Every metric space is a topological space.
Below, we will present the manner in which the Euclidean metric on R^n induces
a topological structure on R^n.
Definitions
1. Open ball in R^n
Let a = (a_1, . . . , a_n) ∈ R^n and let r > 0.
The open ball of radius r and center a is the set

B(a, r) = {x ∈ R^n | d(a, x) < r}.

Some particular cases:
R: B(a, r) = (a - r, a + r).
If n = 1, the open ball of radius r and center a is the interval of center a and
radius r. Indeed, if n = 1 then x ∈ B(a, r) iff |x - a| < r, which gives us:
-r < x - a < r, or a - r < x < a + r, as desired.
R^2: B(a, r) is the disc centered at a with radius r.
R^3: B(a, r) is the interior of a sphere centered at a with radius r.
2. Vicinity or neighbourhood of a ∈ R^n
Let a ∈ R^n. A set V ⊆ R^n is called a vicinity (or neighbourhood) of the point
a iff there is r > 0 such that B(a, r) ⊂ V.
By V(a) we denote the set of all vicinities of the point a, i.e.

V(a) = {V ⊂ R^n | ∃ r > 0 : B(a, r) ⊂ V}.
3. Open set in R^n
A set G ⊆ R^n is called an open set iff for every x ∈ G, G is a vicinity of x, i.e.

∀ x ∈ G, ∃ r > 0 : B(x, r) ⊂ G.

Remark. If we denote by T = {G ⊂ R^n | G open set} then (R^n, T) is a topological
space. A topological space which can arise in this way from a metric space is
called a metrizable space.
4. Closed set in R^n
A set F ⊂ R^n is called a closed set in R^n iff R^n \ F is an open set.
5. Domain in R^n
A set D ⊂ R^n is called a domain if it is an open set and a connected one (in one
piece).
6. Accumulation point (limit point)
The point a is an accumulation point (or limit point) of the set A ⊂ R^n iff
every open ball centered at a contains at least one element of A other than a.
In the previous definition there is no condition on the membership of a in A.
Hence, if A ⊆ R^n then a ∈ R^n is an accumulation point of the set A iff ∀ r > 0,

B(a, r) ∩ A \ {a} ≠ ∅.

We will denote by A′ the set of accumulation points of the given set A. Hence, if
A ⊆ R^n,

A′ = {a ∈ R^n | ∀ r > 0 : B(a, r) ∩ A \ {a} ≠ ∅}.
7. Isolated point
Let A ⊂ R^n and let a ∈ A. A point a ∈ A is an isolated point of A iff there is
an open ball centered at a which contains only the point a from A. In other words, a
point a from A is an isolated point of A if it isn't an accumulation point of the
given set.
Hence, if A ⊂ R^n and a ∈ A, a is an isolated point of the set A iff ∃ r > 0 :

B(a, r) ∩ A = {a}.
8. Boundary
Let A ⊂ R^n. The boundary of A consists of all points a ∈ R^n for which every
open ball of positive radius centered at a intersects both A and R^n \ A.
We denote the boundary of A by fr A.
Hence, if A ⊂ R^n then

fr A = {a ∈ R^n | ∀ r > 0 : B(a, r) ∩ A ≠ ∅ and B(a, r) ∩ (R^n \ A) ≠ ∅}.
9. Bounded set
The set A ⊂ R^n is a bounded set if there is M > 0 such that A ⊂ B(0, M).
Hence A ⊂ R^n is bounded iff ∃ M > 0 such that ∀ a ∈ A, d(0, a) < M.
Example. Let A ⊆ R^2,

A = {(x, y) ∈ R^2 | x^2 + y^2 ≤ 9, x > 0, y > 0} ∪ {(3, 3)}.

[Figure: the quarter disc of radius 3 in the first quadrant, together with the
separate point (3, 3); the point (1, 1) lies inside the quarter disc.]

(3, 3) is an isolated point;
(1, 1) is an accumulation point which belongs to A;
(0, 1) is a boundary point and an accumulation point which doesn't belong to A;

A′ = {(x, y) ∈ R^2 | x^2 + y^2 ≤ 9, x ≥ 0, y ≥ 0}

fr A = {(x, y) ∈ R^2 | x^2 + y^2 = 9, x ≥ 0, y ≥ 0}
       ∪ {(x, 0) | 0 ≤ x ≤ 3} ∪ {(0, y) | 0 ≤ y ≤ 3} ∪ {(3, 3)}.
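The classification above can be probed numerically; the sketch below (in_A is our own illustrative membership test for this particular set) checks that a small circle around (3, 3) contains no other point of A, so (3, 3) is isolated, while points arbitrarily close to (1, 1) do belong to A, so (1, 1) is an accumulation point.

```python
import math

def in_A(x, y):
    # membership in A = {x^2 + y^2 <= 9, x > 0, y > 0} union {(3, 3)}
    return (x * x + y * y <= 9 and x > 0 and y > 0) or (x, y) == (3.0, 3.0)

# 100 points on the circle of radius 0.5 around (3, 3): none belongs to A
ring = [(3 + 0.5 * math.cos(2 * math.pi * t / 100),
         3 + 0.5 * math.sin(2 * math.pi * t / 100)) for t in range(100)]
print(any(in_A(x, y) for x, y in ring))                      # False

# points ever closer to (1, 1) all belong to A
print(all(in_A(1 + eps, 1) for eps in (0.1, 0.01, 0.001)))   # True
```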
Appendix B
Functions of random variables
In many cases we are given the probability distribution of a random variable and
we are asked for the distribution of some function of it. For example, suppose that
we know the distribution of X and try to ﬁnd the distribution of g(X). In order to
do that we have to express the event that g(X) < y in terms of X being in some set
(which depends on the function g). We start with some examples:
Example. Linear functions of random variables
Let X be a random variable and consider a new random variable Y = aX + b,
with a ≠ 0 and b ∈ R.

Case 1. If X is a discrete random variable, then

f_Y(y) = P(Y = y) = P(aX + b = y) = P(X = (y - b)/a) = f_X((y - b)/a).
Case 2. If X is continuous, we determine first the distribution function of Y:

F_Y(y) = P(Y < y) = P(aX + b < y) = P(aX < y − b).

- for a > 0 we have

F_Y(y) = P(X < (y − b)/a) = F_X((y − b)/a).

Since f_X = F′_X, we obtain that F_Y is differentiable and

f_Y(y) = F′_Y(y) = (1/a) f_X((y − b)/a) = (1/|a|) f_X((y − b)/a).

- for a < 0 we have

F_Y(y) = P(X > (y − b)/a) = 1 − P(X ≤ (y − b)/a) = 1 − P(X < (y − b)/a) = 1 − F_X((y − b)/a)

and

f_Y(y) = F′_Y(y) = −(1/a) f_X((y − b)/a) = (1/|a|) f_X((y − b)/a).

In conclusion, we obtain that

f_Y(y) = (1/|a|) f_X((y − b)/a).
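The conclusion can be checked numerically. The sketch below (helper names are our own, not from the text) verifies that for X ∼ N(0, 1) and Y = −2X + 3 the formula reproduces the N(3, 4) density:

```python
import math

def normal_pdf(x, m=0.0, s=1.0):
    """Density of N(m, s^2)."""
    return math.exp(-((x - m) ** 2) / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def linear_transform_pdf(f_X, a, b):
    """p.d.f. of Y = aX + b obtained from f_X via f_Y(y) = f_X((y - b)/a) / |a|."""
    return lambda y: f_X((y - b) / a) / abs(a)

f_Y = linear_transform_pdf(normal_pdf, a=-2.0, b=3.0)   # Y = -2X + 3 ~ N(3, 4)
for y in (-1.0, 0.0, 3.0, 5.5):
    assert abs(f_Y(y) - normal_pdf(y, m=3.0, s=2.0)) < 1e-12
```

Note that the same `linear_transform_pdf` works for any density f_X, which is exactly the point of the general formula.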
Example. Let X be a continuous random variable whose probability density function is f_X. Determine the p.d.f. of Y = X².

Solution. If y ≤ 0 then

F_Y(y) = P(Y < y) = P(X² < y) = 0.

If y > 0 then

F_Y(y) = P(Y < y) = P(X² < y) = P(−√y < X < √y) = F_X(√y) − F_X(−√y).

By differentiating the previous equality we get:

f_Y(y) = (1/(2√y)) f_X(√y) + (1/(2√y)) f_X(−√y).

In conclusion,

f_Y(y) = { 0,                                 y ≤ 0
         { (1/(2√y)) [f_X(√y) + f_X(−√y)],   y > 0.
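A numerical check of this formula, for X standard normal (helper names are our own): integrating the derived density f_Y over (0, y] should reproduce F_X(√y) − F_X(−√y).

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def f_Y(y):
    """Density of Y = X^2 from the formula above."""
    if y <= 0:
        return 0.0
    r = math.sqrt(y)
    return (phi(r) + phi(-r)) / (2 * r)

def F_Y_numeric(y, n=2000):
    """Midpoint rule for the integral of f_Y over (0, y]; the substitution
    s = t^2 removes the integrable singularity at 0."""
    r = math.sqrt(y)
    h = r / n
    return sum(f_Y(((k + 0.5) * h) ** 2) * 2 * (k + 0.5) * h for k in range(n)) * h

for y in (0.5, 1.0, 2.0):
    assert abs(F_Y_numeric(y) - (Phi(math.sqrt(y)) - Phi(-math.sqrt(y)))) < 1e-6
```

The singularity of f_Y at 0 is integrable, which is why a change of variable is needed before a naive quadrature rule becomes accurate.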
The previous examples illustrate two different situations: the case in which Y is an invertible function of X and the case in which it is not. We will generalize only the first situation, in which Y is an invertible function of X. For the non-invertible case we will not present a general result, since it is more complicated and beyond the scope of this text.
Theorem. (Discrete case) Let X be a discrete random variable and let Y = g(X), where g is a one-to-one function on R. Then Y is a discrete random variable whose p.m.f. is

f_Y(y) = { f_X(g⁻¹(y)),   if there is x ∈ Range X with y = g(x)
         { 0,             otherwise.
Theorem. (Continuous case) Let X be a continuous random variable and let Y = g(X), where g is a one-to-one differentiable function on R. Then Y is a continuous random variable whose p.d.f. is

f_Y(y) = { f_X(g⁻¹(y)) |(g⁻¹)′(y)| = f_X(x)/|g′(x)|,   if there is x ∈ Range X with y = g(x)
         { 0,                                           otherwise.
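An illustrative sketch of the continuous theorem (our own example, not from the text): take the one-to-one map g(x) = eˣ with X standard normal, so g⁻¹(y) = ln y and (g⁻¹)′(y) = 1/y. The theorem's density should match the numerical derivative of the distribution function computed directly.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def f_Y(y):
    """Density of Y = e^X by the theorem: f_X(g^{-1}(y)) |(g^{-1})'(y)|."""
    return phi(math.log(y)) / y if y > 0 else 0.0

def F_Y(y):
    """Distribution function of Y directly: P(e^X < y) = Phi(ln y)."""
    return Phi(math.log(y)) if y > 0 else 0.0

h = 1e-6
for y in (0.5, 1.0, 2.0, 4.0):
    deriv = (F_Y(y + h) - F_Y(y - h)) / (2 * h)   # central difference
    assert abs(deriv - f_Y(y)) < 1e-6
```

The resulting Y is the well-known lognormal random variable, so the theorem immediately gives its density.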
The proofs of these theorems are similar to the ideas presented in the ﬁrst example
of this subsection.
Expectation of a function of one random variable
The expectation of Y = g(X) is given by:

E(Y) = E[g(X)] = { Σᵢ g(xᵢ) pᵢ                    (discrete case)
                 { ∫_(−∞)^(∞) g(x) f_X(x) dx      (continuous case)
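The two formulas can be sketched in code as follows (function names are illustrative; the continuous integral is approximated by a midpoint rule on an interval [a, b] that carries essentially all the probability mass):

```python
def expectation_discrete(values, probs, g=lambda x: x):
    """E[g(X)] = sum of g(x_i) p_i."""
    return sum(g(x) * p for x, p in zip(values, probs))

def expectation_continuous(f_X, g, a, b, n=100_000):
    """E[g(X)] = integral of g(x) f_X(x) dx over [a, b] (midpoint rule)."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) * f_X(a + (k + 0.5) * h) for k in range(n)) * h

# A fair die: E[X] = 3.5 and E[X^2] = 91/6.
die, p = [1, 2, 3, 4, 5, 6], [1 / 6] * 6
assert abs(expectation_discrete(die, p) - 3.5) < 1e-12
assert abs(expectation_discrete(die, p, g=lambda x: x * x) - 91 / 6) < 1e-12

# X uniform on (0, 1), so f_X = 1 there: E[X^2] = 1/3.
assert abs(expectation_continuous(lambda x: 1.0, lambda x: x * x, 0.0, 1.0) - 1 / 3) < 1e-6
```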
Sums of independent continuous random variables
Let X and Y be independent continuous random variables whose probability density functions are f_X and f_Y. We wish to know the p.d.f. of X + Y.

Definition. (Convolution)
Let X and Y be two continuous random variables with density functions f_X and f_Y. Then the convolution f_X ∗ f_Y of f_X and f_Y is the function defined by

(f_X ∗ f_Y)(z) = ∫_(−∞)^(∞) f_X(z − y) f_Y(y) dy = ∫_(−∞)^(∞) f_Y(z − x) f_X(x) dx.

Theorem. Let X and Y be two independent random variables with density functions f_X and f_Y. Then the sum Z = X + Y is a random variable with density function f_Z, where f_Z = f_X ∗ f_Y.
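The convolution integral can be approximated numerically. The following sketch (helper names are our own) recovers the well-known triangular density of the sum of two independent uniform variables on (0, 1):

```python
def uniform_pdf(x):
    """Density of the uniform distribution on (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def convolve(f, g, z, lo=-1.0, hi=2.0, n=150_000):
    """(f * g)(z) = integral of f(z - y) g(y) dy, midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(z - (lo + (k + 0.5) * h)) * g(lo + (k + 0.5) * h)
               for k in range(n)) * h

# Sum of two independent U(0, 1) variables: f_Z(z) = z on (0, 1], 2 - z on (1, 2).
assert abs(convolve(uniform_pdf, uniform_pdf, 0.5) - 0.5) < 1e-3
assert abs(convolve(uniform_pdf, uniform_pdf, 1.0) - 1.0) < 1e-3
assert abs(convolve(uniform_pdf, uniform_pdf, 1.5) - 0.5) < 1e-3
```

The interval [lo, hi] here is enough because both densities vanish outside (0, 1); for other distributions it must be widened accordingly.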
The proof of this theorem is beyond the scope of this book and will be omitted.
To get a better understanding of this important result, we will present some examples.
Example. (Sum of two independent Gamma random variables)
If X ∼ Γ(a₁, b) and Y ∼ Γ(a₂, b) are independent, then

Z = X + Y ∼ Γ(a₁ + a₂, b).

Solution. We know that:

f_X(x) = { (1/(Γ(a₁) b^a₁)) x^(a₁−1) e^(−x/b),   x > 0
         { 0,                                     x ≤ 0

and

f_Y(y) = { (1/(Γ(a₂) b^a₂)) y^(a₂−1) e^(−y/b),   y > 0
         { 0,                                     y ≤ 0,

and so, if z > 0,

f_Z(z) = ∫_(−∞)^(∞) f_X(z − y) f_Y(y) dy
       = (1/(Γ(a₁)Γ(a₂) b^(a₁+a₂))) ∫₀^z (z − y)^(a₁−1) e^(−(z−y)/b) y^(a₂−1) e^(−y/b) dy
       = (e^(−z/b) z^(a₁−1) z^(a₂−1) / (Γ(a₁)Γ(a₂) b^(a₁+a₂))) ∫₀^z (1 − y/z)^(a₁−1) (y/z)^(a₂−1) dy.

If we make the change of variable y/z = x, with dy = z dx, we obtain, for z > 0:

f_Z(z) = (e^(−z/b) z^(a₁+a₂−1) / (Γ(a₁)Γ(a₂) b^(a₁+a₂))) ∫₀¹ (1 − x)^(a₁−1) x^(a₂−1) dx
       = (e^(−z/b) z^(a₁+a₂−1) / (Γ(a₁)Γ(a₂) b^(a₁+a₂))) B(a₁, a₂)
       = (e^(−z/b) z^(a₁+a₂−1) / (Γ(a₁)Γ(a₂) b^(a₁+a₂))) · (Γ(a₁)Γ(a₂) / Γ(a₁ + a₂))
       = (1/(Γ(a₁ + a₂) b^(a₁+a₂))) z^(a₁+a₂−1) e^(−z/b).

If z ≤ 0 then f_Y(y) = 0 for y ≤ 0 and f_X(z − y) = 0 for y > 0, hence f_Z(z) = 0. This completes the proof.
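A minimal Monte Carlo sketch of this result. We rely on Python's standard `random.gammavariate(a, b)`, whose shape/scale convention (mean a·b, variance a·b²) matches Γ(a, b) as used here:

```python
import random

random.seed(0)
a1, a2, b, n = 2.0, 3.0, 1.5, 200_000
# Sample X + Y with X ~ Gamma(a1, b), Y ~ Gamma(a2, b), independent.
z = [random.gammavariate(a1, b) + random.gammavariate(a2, b) for _ in range(n)]

mean = sum(z) / n
var = sum((v - mean) ** 2 for v in z) / n

# Gamma(a1 + a2, b) has mean (a1 + a2) b = 7.5 and variance (a1 + a2) b^2 = 11.25.
assert abs(mean - (a1 + a2) * b) < 0.1
assert abs(var - (a1 + a2) * b * b) < 0.5
```

Matching the first two moments does not prove the distributional identity, of course; it is only a sanity check on the computation above.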
Example. (Sum of two independent normal random variables)
If X ∼ N(m₁, σ₁²) and Y ∼ N(m₂, σ₂²) are independent then

Z = X + Y ∼ N(m₁ + m₂, σ₁² + σ₂²).

Solution. If X ∼ N(m₁, σ₁²) then U = (X − m₁)/σ₁ ∼ N(0, 1) (see Remark 4 from the subsection "The normal random variable"). Similarly, V = (Y − m₂)/σ₂ ∼ N(0, 1).

First, we will prove that if U, V are two independent standard normal variables and α, β > 0 are such that α² + β² = 1, then

W = αU + βV ∼ N(0, 1).

If U ∼ N(0, 1), then by applying once more the remark mentioned before we have that αU ∼ N(0, α²). Similarly, βV ∼ N(0, β²), so

f_W(z) = ∫_(−∞)^(∞) f_αU(z − y) f_βV(y) dy
       = (1/(2παβ)) ∫_(−∞)^(∞) e^(−(z−y)²/(2α²)) e^(−y²/(2β²)) dy
       = (1/(2παβ)) ∫_(−∞)^(∞) e^(−[(α²+β²)y² − 2β²zy + β²z²]/(2α²β²)) dy
       = (1/(2παβ)) ∫_(−∞)^(∞) e^(−[y² − 2β²zy + β²z²]/(2α²β²)) dy          (using α² + β² = 1)
       = (1/(2παβ)) ∫_(−∞)^(∞) e^(−[y² − 2β²zy + β⁴z² − β⁴z² + β²z²]/(2α²β²)) dy
       = (1/(2παβ)) ∫_(−∞)^(∞) e^(−(y − β²z)²/(2α²β²)) e^(−α²β²z²/(2α²β²)) dy.

If we make the change of variable x = (y − β²z)/(αβ), with dy = αβ dx, then

f_W(z) = (1/(2π)) e^(−z²/2) ∫_(−∞)^(∞) e^(−x²/2) dx = (1/(2π)) e^(−z²/2) √(2π) = (1/√(2π)) e^(−z²/2).

In conclusion: W ∼ N(0, 1).

If we take α = σ₁/√(σ₁² + σ₂²) and β = σ₂/√(σ₁² + σ₂²), then

W = αU + βV = ((X − m₁)/σ₁) · σ₁/√(σ₁² + σ₂²) + ((Y − m₂)/σ₂) · σ₂/√(σ₁² + σ₂²)
             = (X + Y − (m₁ + m₂))/√(σ₁² + σ₂²) ∼ N(0, 1).

Applying again Remark 4 (mentioned before), part a), with a = √(σ₁² + σ₂²) and b = m₁ + m₂, we get that

aW + b = X + Y ∼ N(m₁ + m₂, σ₁² + σ₂²),

as desired.
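The conclusion can be illustrated by a small Monte Carlo sketch (sampling with `random.gauss`; the tolerances are loose statistical bounds, not exact values):

```python
import random

random.seed(1)
m1, s1, m2, s2, n = 1.0, 2.0, -3.0, 1.5, 200_000
# Sample X + Y with X ~ N(m1, s1^2), Y ~ N(m2, s2^2), independent.
z = [random.gauss(m1, s1) + random.gauss(m2, s2) for _ in range(n)]

mean = sum(z) / n
var = sum((v - mean) ** 2 for v in z) / n

assert abs(mean - (m1 + m2)) < 0.05            # m1 + m2 = -2.0
assert abs(var - (s1 ** 2 + s2 ** 2)) < 0.15   # s1^2 + s2^2 = 6.25
```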
Bibliography
[1] Apostol, T.M., Calculus, vol. 2, Multivariable Calculus and Linear Algebra, with Applications to Differential Equations and Probability, Second edition, John Wiley, New York, 1969.
[2] Anton, H., Rorres, C., Elementary Linear Algebra, 9th edition, John Wiley, New York, 2005.
[3] Binmore, K., Game Theory: A Very Short Introduction, Oxford University Press, 2007.
[4] Blaga, P., Mureşan, A.S., Matematici aplicate în economie, vol. 1, Ed. Transilvania Press, Cluj-Napoca, 1996.
[5] Brickman, L., Mathematical Introduction to Linear Programming and Game Theory, Springer-Verlag, New York, 1989.
[6] Carmichael, F., A Guide to Game Theory, Prentice Hall, Pearson Education Limited, 2005.
[7] Cobzaş, S., Analiză matematică (Calcul diferenţial), Presa Universitară Clujeană, Cluj-Napoca, 1997.
[8] Dantzig, G.B., Thapa, M.N., Linear Programming, Springer-Verlag, New York, 1997.
[9] Dowling, E.T., Mathématiques pour l'économiste, McGraw-Hill, Paris, 1990.
[10] Duca, D.I., Multicriteria Optimisation in Complex Space, Casa Cărţii de Ştiinţă, Cluj-Napoca, 2005.
[11] Eiselt, H.A., Sandblom, C.-L., Linear Programming and its Applications, Springer, Berlin Heidelberg New York, 1965.
[12] Filip, D.A., Curt, P., Méthodes quantitatives en économie, Ed. Mediamira, Cluj-Napoca, 2008.
[13] Grinstead, C.M., Introduction to Probability, http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/amsbook.mac.pdf
[14] Kemeny, J.G., Snell, J.L., Thompson, G.L., Introduction to Finite Mathematics, 3rd edition, Prentice-Hall, 1974.
[15] Kinney, J., A Probability and Statistics Companion, John Wiley, New York, 2009.
[16] Klimov, G., Probability Theory and Mathematical Statistics, Mir Publishers, Moscow, 1996.
[17] Kolman, B., Beck, R.E., Elementary Linear Programming with Applications, Elsevier Science, 1995.
[18] Krishnan, V., Probability and Random Processes, John Wiley, New York, 2006.
[19] Lay, D.C., Linear Algebra and its Applications, Addison-Wesley Publishing Company, 2003.
[20] Le Gall, J.F., Intégration, Probabilités et Processus Aléatoires, 2006.
[21] Lipschutz, S., Schaum's Outline of Theory and Problems of Finite Mathematics, Schaum's Outline Series, McGraw-Hill Book Company, 1966.
[22] Lisei, H., Probability Theory, Casa Cărţii de Ştiinţă, Cluj-Napoca, 2004.
[23] Luderer, B., Nollau, V., Vetters, K., Mathematical Formulas for Economists, 3rd edition, Springer-Verlag, Berlin Heidelberg, 2007.
[24] Luenberger, D.G., Ye, Y., Linear and Nonlinear Programming, 3rd edition, Springer-Verlag, 2008.
[25] Meester, R., A Natural Introduction to Probability Theory, 2nd edition, Birkhäuser Verlag AG, 2008.
[26] Mihoc, I., Calculul probabilităţilor şi statistică matematică, lito Univ. Babeş-Bolyai, Cluj-Napoca, 1998.
[27] Mihoc, I., Mihoc, M., Matematici aplicate în economie. Analiză matematică, vol. II, Ed. Presa Universitară Clujeană, 1999.
[28] Mureşan, A.S., Matematici pentru economişti, vol. 1, 2, lito Univ. Babeş-Bolyai, Cluj-Napoca, 1991.
[29] Mureşan, A.S., Matematici aplicate în finanţe, bănci şi burse, Ed. Risoprint, Cluj-Napoca, 2000.
[30] Mureşan, A.S., Lung, R.I., Matematici aplicate în economie (Cercetări operaţionale), Ed. Mediamira, Cluj-Napoca, 2005.
[31] Mureşan, A.S., şi colectiv, Elemente de algebră liniară şi analiză matematică pentru economişti, Ed. Todesco, Cluj-Napoca, 2003.
[32] Mureşan, A.S., Noncooperative Games, Ed. Mediamira, Cluj-Napoca, 2003.
[33] Mureşan, A.S., şi colectiv, Elemente de teoria probabilităţilor şi statistică matematică pentru economişti, Ed. Todesco, Cluj-Napoca, 2004.
[34] Mureşan, A.S., şi colectiv, Analiză matematică şi teoria probabilităţilor aplicate în economie, Ed. Todesco, Cluj-Napoca, 2006.
[35] Mureşan, A.S., Blaga, P., Matematici aplicate în economie, vol. 2, Ed. Transilvania Press, Cluj-Napoca, 1996.
[36] Mureşan, A.S., Mihoc, M., Filip, D., Curt, P., Râp, I., Radu, Roşca, A., Păcurar, M., Petru, P., Mihalca, G., Analiză matematică, teoria probabilităţilor şi algebră liniară aplicate în economie, ediţia a doua, Ed. Mediamira, 2008.
[37] Murty, K.G., Linear Programming, John Wiley, 1983.
[38] Nicolescu, M., Analiză matematică, vol. 1, E.D.P., Bucureşti, 1997.
[39] Pedregal, P., Introduction to Optimization, Springer-Verlag, New York, 2004.
[40] Piatecki, C., Le dilemme du prisonnier et autres dilemmes sociaux, 2006.
[41] Piatecki, C., Théorie des Choix en Incertain, 2007.
[42] Popescu, O., şi alţii, Matematici aplicate în economie, E.D.P., Bucureşti, 1998.
[43] Purcaru, I., Matematici generale şi elemente de optimizare: teorie şi aplicaţii, Ed. Economica, Bucureşti, 1997.
[44] Purcaru, I., Matematici generale şi elemente de optimizare: teorie şi aplicaţii, ediţia II, Ed. Economica, Bucureşti, 2004.
[45] Roussas, G., Introduction to Probability and Statistical Inference, Elsevier Science, 2003.
[46] Ross, S., A First Course in Probability, 5th edition, Prentice-Hall Inc., 1998.
[47] Ross, S., Introduction to Probability and Statistics for Engineers and Scientists, third edition, Elsevier Academic Press, 2004.
[48] Ross, S., Introduction to Probability Models, Sixth edition, Academic Press, 1997.
[49] Simon, C.P., Blume, L., Mathematics for Economists, W.W. Norton & Company Inc., New York, 1994.
[50] Stănăşilă, O., Analiză matematică, E.D.P., Bucureşti, 1991.
[51] Stewart, J., Analyse. Concepts et contextes, vol. 1, Fonctions d'une variable, De Boeck Université, Paris, Bruxelles, 2001.
[52] Stewart, J., Analyse. Concepts et contextes, vol. 2, Fonctions de plusieurs variables, De Boeck Université, Paris, Bruxelles, 2001.
[53] Stirzaker, D., Elementary Probability, 2nd edition, Cambridge University Press, 2003.
[54] Sydsæter, K., Strøm, A., Berck, P., Economists' Mathematical Manual, fourth edition, Springer-Verlag, 2005.
[55] http://www.masterfinance.proba.jussieu.fr, 2004-2005
[56] http://fr.wikipedia.org/wiki/Accueil
[57] http://www.bibmath.net, BibM@th, la bibliothèque des Mathématiques
[58] http://www.cmath.fr
[59] http://www.netprof.fr
[60] http://aleph0.clarku.edu/~djoyce
[61] http://homeomath.imingo.net
[62] http://www.lesmathematiques.net
4.4 Differentiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
4.4.1 Differentiability. The total differential . . . . . . . . . . . . . . 218
4.4.2 Higher order differentials . . . . . . . . . . . . . . . . . . . . . 228
4.4.3 Taylor formula in Rn . . . . . . . . . . . . . . . . . . . . . . . . 231
4.5 Extrema of function of several variables . . . . . . . . . . . . . . . . . 235
4.6 Constrained extrema . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
4.7 Applications to economics . . . . . . . . . . . . . . . . . . . . . . . . . 261
4.7.1 The method of least squares . . . . . . . . . . . . . . . . . . . . 261
4.7.2 Inventory control. The economic order quantity model . . . . . 265

III Probabilities 269

A short history of probabilities . . . . . . . . . . . . . . . . . . . . . . . . . 271
5 Counting techniques. Tree diagrams 273
5.1 The addition rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
5.2 Tree diagrams and the multiplication principle . . . . . . . . . . . . . 277
5.3 Permutations and combinations . . . . . . . . . . . . . . . . . . . . . . 281
6 Basic probability concepts 289
6.1 Sample space. Events . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
6.2 Conditional probability . . . . . . . . . . . . . . . . . . . . . . . . . . 302
6.3 The total probability formula. Bayes' formula . . . . . . . . . . . . . . 306
6.4 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
6.5 Classical probabilistic models. Urn models . . . . . . . . . . . . . . . . 313
7 Random variables 327
7.1 Discrete random variables . . . . . . . . . . . . . . . . . . . . . . . . . 328
7.2 The distribution function of a random variable . . . . . . . . . . . . . 332
7.3 Continuous random variables . . . . . . . . . . . . . . . . . . . . . . . 335
7.4 Numerical characteristics of random variables . . . . . . . . . . . . . . 336
7.5 Special random variables . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Appendix A 380
Appendix B 384
Bibliography 389
Introduction
Why Maths?
Because Mathematics is the universal language of the sciences. When we speak mathematics, all barriers, linguistic or cultural, are pushed away.
Why Maths in Economics?
Mathematics plays an important role in Economics. This role has been significant for more than a century and has gained real momentum during the last decades. Immanuel Kant (1724-1804) said: "A science contains as much science as it contains Mathematics".

One of the first economists who wanted to make economics more scientific by applying mathematical rigour to it was Alfred Marshall (1842-1924, English economist). He did not want an excess of mathematics to make economic texts harder to understand. Accordingly, Marshall put the mathematical content in the footnotes and appendices of his economics books. In 1906 he wrote: "I had a growing feeling in the later years of my work at the subject that a good mathematical theorem dealing with economic hypotheses was very unlikely to be good economics: and I went more and more on the rules: (1) Use mathematics as a shorthand language, rather than as an engine of inquiry. (2) Keep to them till you have done. (3) Translate into English. (4) Then illustrate by examples that are important in real life. (5) Burn the mathematics. (6) If you can't succeed in (4), burn (3). This last I did often."

The use of mathematics in economics offers several advantages:
- the language used is more precise and concise;
- it allows us to treat the general case;
- we have at our disposal a great number of mathematical results.

On his blog, Greg Mankiw (Professor of Economics at Harvard University) wrote the following answer to the question "Why do aspiring economists need Math?":

"A student who wants to pursue a career in policy-related economics is advised to
go to the best graduate school he or she can get into. The best graduate schools will expect to see a lot of math on your undergraduate transcript, so you need to take it. But will you use a lot of differential equations and real analysis once you land that dream job in a policy organization? No, you won't.

That raises the question: Why do we academics want students who have taken a lot of math? There are several reasons:

1. Every economist needs to have a solid foundation in the basics of economic theory and econometrics, even if you are not going to be either a theorist or an econometrician. You cannot get this solid foundation without understanding the language of mathematics that these fields use.

2. Occasionally, you will need math in your job. In particular, even as a policy economist, you need to be able to read the academic literature to figure out what research ideas have policy relevance. That literature uses a lot of math, so you will need to be equipped with mathematical tools to read it intelligently.

3. Math is good training for the mind. It makes you a more rigorous thinker.

4. Your math courses are one long IQ test. We use math courses to figure out who is really smart.

5. Economics graduate programs are more oriented to training students for academic research than for policy jobs. Although many econ PhDs go on to policy work, all of us teaching in graduate programs are, by definition, academics. Some academics take a few years off to experience the policy world, as I did not long ago, but many academics have no idea what that world is like. When we enter the classroom, we teach what we know. (I am not claiming this is optimal, just reality.) So math plays a larger role in graduate classes than it does in many jobs that PhD economists hold.

Is it possible that admissions committees for econ PhD programs are excessively fond of mathematics on student transcripts? Perhaps.
That is something I might argue about with my colleagues if I were ever put on the admissions committee. But a student cannot change that. The fact is, if you are thinking about a PhD program in economics, you are advised to take math courses until it hurts."

At present the Maths teachers' mission is to make the best advertisement for Maths, to make students aware of the importance of Maths, and to "dress up" the Maths classes in vivid colours.

The purpose of this book (covering three parts (Linear Algebra, Calculus and Probabilities) divided into seven chapters) is to give students of economics the possibility to acquire the basic knowledge of Maths which they will have to work with in the future, in order to be able to use it within the complex economic models belonging to the real world. Our book attempts to develop the student's intuition concerning the ways of working with mathematical techniques. This book wants to be a device for using Maths in order to understand the structure of "economics". The text offers an introduction to the intimate relationship between Maths and Economics. Taking into account the applicative content of this book, we will not always present the complete proofs of all theoretical statements, but we attach importance to examples and economic applications. As Mathematics is a very old science, we can't possibly be
entirely original, but the structure and concepts have been thought out after several years of work together with students in economics. So that the content could come closer to the needs of economists, we have introduced several examples from the economic field. This book is especially meant for first-year students in "Economic Sciences and Business Administration". We also address all those who need to refresh the mathematical knowledge that is used in economics, or who simply wish to keep their professional skills up to date.
Part I
Elements of linear algebra
Chapter 1

Linear spaces

1.1 The Euclidean space

One of the main uses of mathematics in economic theory is to construct the appropriate geometric and analytic generalizations of the two- or three-dimensional geometric models which are the mainstay of undergraduate economic courses. In this paragraph we will study how to generalize the notions of points, lines, planes, distances and angles to n-dimensional Euclidean spaces.

The set of real numbers, denoted by R, plays a dominant role in mathematics. The geometric representation of R is a straight line. There is a natural correspondence between the points on the line and the real numbers: a point, called the origin, is chosen to represent 0, and another point, usually to the right of 0, to represent 1. In this way each point represents a unique real number and each real number is represented by a unique point. For this reason we refer to R as the real line and use the words point and number interchangeably.

[Figure: the real line R, with the origin O at 0 and the point P at 1.]

We assume that the reader is familiar with the Cartesian plane. Each point P represents an ordered pair of real numbers (a, b) ∈ R², and each element of R² can be represented by a point in the Cartesian plane. The vertical line through the point P meets the horizontal axis (x axis) at a, which is called the abscissa of P. The horizontal line through the point P meets the vertical axis (y axis) at b, which is called the ordinate of P. For this reason we refer to R² = {(a, b) | a ∈ R, b ∈ R} as the plane and use the words plane and ordered pair of real numbers interchangeably.
[Figure: the Cartesian plane R², with a point P of abscissa a and ordinate b.]

If P(x₁, y₁) and Q(x₂, y₂) are two points in R², then the distance between them can be determined by using the Pythagorean theorem in a right triangle. We shall denote the distance between P and Q by d(P, Q):

d(P, Q) = √((x₂ − x₁)² + (y₂ − y₁)²).

[Figure: the line PQ, with the right triangle of legs x₂ − x₁ and y₂ − y₁ and the angle of inclination θ.]

It is well known that two different points determine exactly one line. The inclination θ of a line l is the angle that l makes with the horizontal axis: θ is the smallest positive angle measured counterclockwise from the positive end of the x axis to the line l. The range of θ is given by 0 ≤ θ < 180°. The slope m of the line is defined as the tangent of its angle θ of inclination:

m = tg θ.

Next, we will present different forms of the equation of a line.
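The distance and slope formulas translate directly into code; a minimal sketch (helper names are our own):

```python
import math

def distance(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def slope(p, q):
    """Slope of the line through P and Q; undefined for a vertical line."""
    if q[0] == p[0]:
        raise ValueError("vertical line: the slope is not defined")
    return (q[1] - p[1]) / (q[0] - p[0])

# A 3-4-5 right triangle: d(P, Q) = 5 and m = 4/3.
P, Q = (0.0, 0.0), (3.0, 4.0)
assert distance(P, Q) == 5.0
assert slope(P, Q) == 4.0 / 3.0
```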
i. i. b) The slope of a vertical line. The equation of the line having slope m and passing through the point (x1 . a line parallel to the y axis. y2 − y1 x2 − x1 15 .e. i. the slope of the line passing through two points P (x1 .In particular. where a and b are not both zero. Pointslope form A line is completely determined if we know its direction (its slope) and a point on the line. a) The slope of an horizontal line. y2 ) is given by: y2 − y1 . Two points form Let P (x1 . y2 ) two diﬀerent points in the Cartesian plane. In particular. d) Two distinct lines l1 and l2 are perpendicular if and only if the slope of one is the negative reciprocal of the other: m1 = − 1 m2 or m1 m2 = −1. i. c) Two distinct lines l1 and l2 are parallel if and only if their slopes. is of the form y = k where k is the ordinate of the point at which the line intersects the y axis. m1 = m2 . The equation of the line which passes through the previous two points is: (l) x − x1 y − y1 = . the equation of the x axis is y = 0. In particular. y1 ) and Q(x2 . the equation of the y axis is x = 0. y1 ) is (l) y − y1 = m(x − x1 ). when x2 = x1 is not deﬁned. each point on l is a solution of (l) and each solution of (l) is a point on l. Horizontal and vertical lines The equation of a horizontal line i. m1 respectively m2 . m = tg θ = x2 − x1 Remark. is zero. a line parallel to the x axis.e.e.e.e. y1 ) and Q(x2 . are equal. when y2 = y1 . Linear equations Every line l in the Cartesian plane R2 can be represented by a linear equation of the form (l) ax + by + c = 0. The equation of a vertical line i. is of the form x = k where k is the abscissa of the point at which the line intersects the x axis.e.
We represent these displacements as vectors in R2 . coordinatewise multiplication does not satisfy the basic properties of the multiplication of real numbers. 16 . geometrically.The previous equation was obtained by replacing m = form of the equation. y 6 u v u+ v 3 v 1 u O Addition of two vectors x We can use the parallelogram as in the above ﬁgure to draw u + v keeping the tails of u and v at the same point.2 Euclidean nspace We can interpret the order pairs of R2 not only as locations but also as displacements. b) and v = (c. d) are two vectors in R2 . then u + v will represent a displacement of a + c units to the right and b + d units up. On the other hand. If u = (a. The displacement (a. scalar multiplication of a vector v be a nonnegative (negative) scalar corresponds to stretching or shrinking v without (with) changing its direction. the head marks the location after the displacement is made. b) means: move a units to the right and b units up from the current location. The tail of the arrow marks the initial location. To develop a geometric intuition for vector addition we can think in the following way. It is generally not possible to multiply two vectors in a nice way to generalize the multiplication of real numbers. For instance. y2 − y1 in the pointslope x2 − x1 1.
x2 + y2 . . addition of two vectors: if x. xn ) of real numbers i. . αxn ) ∈ Rn . we will present an axiomatic concept based on the simplest properties of the previous operations. . . z ∈ V (ii) there is a vector θ ∈ V (called the null vector) that is an identity element for addition: x + θ = θ + x = θ. Next. x2 . i = 1. A real vector space is a set V = ∅ with an operation + : V ×V → V called vector addition and an operation · : R × V → V called scalar multiplication with the following properties: (i) (x + y) + z = x + (y + z). . ∀ x. x2 . . By deﬁnition Rn is the set of ordered ntuples x = (x1 . xn )  x1 . . . . αx2 . . . Deﬁnition. ∀ x ∈ V (iii) for any x ∈ V there is −x ∈ V such that x + (−x) = (−x) + x = θ 17 . . y. x2 . Rn = {x = (x1 . . y ∈ Rn then x + y := (x1 + y1 . n. .y 6 2b 2u b u a * * O 2a x Vector multiplication by a real scalar We will generalize the previous discussion to the general case. The two fundamental operations (which generalize the addition of two vectors and the multiplication of a vector by a scalar) are: 1. . . . xn + yn ) ∈ Rn 2. .e. For an integer n ≥ 1. . . The elements of Rn are called vectors and the numbers xi . xn ∈ R}. are called the coordinates of x (xi is the ith coordinate of x). scalar multiplication: if x ∈ Rn and α ∈ R then αx = (αx1 .
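The two fundamental operations can be sketched as follows (plain tuples stand in for vectors of Rⁿ; the helper names are illustrative):

```python
def vec_add(x, y):
    """Coordinatewise addition in R^n."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scalar_mult(alpha, x):
    """Multiplication of a vector of R^n by a real scalar."""
    return tuple(alpha * xi for xi in x)

u, v = (1.0, 2.0, -1.0), (3.0, 0.0, 4.0)
assert vec_add(u, v) == (4.0, 2.0, 3.0)
assert scalar_mult(2.0, u) == (2.0, 4.0, -2.0)
assert vec_add(u, v) == vec_add(v, u)   # commutativity of addition
```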
(iv) x + y = y + x, ∀ x, y ∈ V
(v) 1 · x = x, ∀ x ∈ V
(vi) ∀ x, y ∈ V, ∀ α, β ∈ R:
(a) (αβ)x = α(βx)
(b) (α + β)x = αx + βx
(c) α(x + y) = αx + αy.

Remark. Conditions (i)-(iv) say that (V, +) is a commutative group.

Example. Rⁿ is a vector space. The proof of the previous example follows immediately by using the definitions of the operations in Rⁿ.

Let V be a vector space. A linear subspace of V is a subset W ⊂ V that is itself a vector space with vector addition and scalar multiplication defined by restriction of the given operations on V.

Remark. If W ⊆ V then W is a linear subspace if and only if the following two conditions are fulfilled:
a) W ≠ ∅
b) ∀ α ∈ R, ∀ u, v ∈ W, u + αv ∈ W.

If α₁, ..., αₙ ∈ R and v₁, ..., vₙ ∈ V then the sum α₁v₁ + · · · + αₙvₙ ∈ V is called a linear combination of the vectors v₁, ..., vₙ. The span of a set S ⊂ V is the set of all linear combinations of vectors of S:

span S = {α₁v₁ + · · · + αₙvₙ | n ∈ N*, α₁, ..., αₙ ∈ R, v₁, ..., vₙ ∈ S}.

A set of vectors {v₁, ..., vₙ} is called linearly independent if the vector equation α₁v₁ + α₂v₂ + · · · + αₙvₙ = θ has only the trivial solution α₁ = α₂ = · · · = αₙ = 0. A set of vectors {v₁, ..., vₙ} is called linearly dependent if this vector equation has a nontrivial solution, that is, if there are α₁, ..., αₙ ∈ R, not all zero, such that α₁v₁ + · · · + αₙvₙ = θ.

Remark. A set of vectors in a vector space is linearly dependent if and only if one vector can be written as a linear combination of the others.

A basis for V is a subset B ⊂ V that spans V (span B = V) and is minimal for this property, in the sense that there are no proper subsets of B that span V. A basis is linearly independent; conversely, a linearly independent set that spans V is a basis for V.

Remark. If V has a finite basis, then all bases have the same number of elements. In this case we say that V is finite dimensional and the common number of elements of the bases of V is called the dimension of V. We next show that any finite dimensional vector space is "like" Rⁿ.

Remark. If B = {v₁, ..., vₙ} is a basis of V then each point v ∈ V can be written as a linear combination

v = α₁v₁ + α₂v₂ + · · · + αₙvₙ

in exactly one way.

Example. a) The set B = {e₁, ..., eₙ}, where eᵢ = (0, ..., 0, 1, 0, ..., 0) (1 is the i-th coordinate), is a basis of the vector space Rⁿ.
b) dim Rⁿ = n.

Normed spaces

The concept of a norm is an abstract generalization of the length of a vector.

Definition. Let V be a vector space. A function ‖·‖ : V → R, x → ‖x‖, is called a norm if it satisfies the following conditions:
N1) ‖x‖ ≥ 0, ∀ x ∈ V, and ‖x‖ = 0 ⇔ x = θ
N2) ‖αx‖ = |α| · ‖x‖, ∀ x ∈ V, ∀ α ∈ R
N3) ‖x + y‖ ≤ ‖x‖ + ‖y‖, ∀ x, y ∈ V.

A vector space V with a norm ‖·‖ is called a normed space and it is denoted by (V, ‖·‖).

Inner product spaces

Definition. Let V be a vector space. A mapping ⟨·,·⟩ : V × V → R is called an inner product in V if the following conditions are satisfied:
IP1) ⟨x, x⟩ > 0, ∀ x ∈ V \ {θ} (positive definiteness)
IP2) ⟨x, y⟩ = ⟨y, x⟩, ∀ x, y ∈ V (symmetry)
IP3) ⟨αx + βy, z⟩ = α⟨x, z⟩ + β⟨y, z⟩, ∀ x, y, z ∈ V, ∀ α, β ∈ R (bilinearity).

A vector space with an inner product is called an inner product space.

Example. The canonical inner product in Rⁿ is defined in the following way:

⟨·,·⟩ : Rⁿ × Rⁿ → R, ⟨x, y⟩ = x₁y₁ + x₂y₂ + · · · + xₙyₙ, ∀ x, y ∈ Rⁿ.

The proof that the previous function satisfies all the properties of the previous definition is left to the reader (easy computations based on the properties of real numbers).

Example. The function ‖·‖ : Rⁿ → R,

‖x‖ = √⟨x, x⟩ = √(x₁² + · · · + xₙ²),

is a norm on Rⁿ which is called the Euclidean norm.

Proof. The properties N1) and N2) are easy consequences of the properties of the inner product. The property N3) follows from the Cauchy-Buniakovski-Schwarz inequality:

|⟨x, y⟩| ≤ ‖x‖ · ‖y‖, ∀ x, y ∈ Rⁿ.

We consider first the following obvious inequalities:

0 ≤ ‖x + y‖² = ⟨x + y, x + y⟩ = ‖x‖² + 2⟨x, y⟩ + ‖y‖²
0 ≤ ‖x − y‖² = ⟨x − y, x − y⟩ = ‖x‖² − 2⟨x, y⟩ + ‖y‖²,

wherefrom we get

−(‖x‖² + ‖y‖²) ≤ 2⟨x, y⟩ ≤ ‖x‖² + ‖y‖²,

hence

2|⟨x, y⟩| ≤ ‖x‖² + ‖y‖².

We can assume that x ≠ θ and y ≠ θ (if x = θ or y = θ the Cauchy-Buniakovski-Schwarz inequality is true). If we replace x by x/‖x‖ and y by y/‖y‖ in the previous inequality, we obtain

2 |⟨x, y⟩| / (‖x‖ · ‖y‖) ≤ ‖x‖²/‖x‖² + ‖y‖²/‖y‖² = 2,

wherefrom we have |⟨x, y⟩| ≤ ‖x‖ · ‖y‖, as desired.

We are now able to prove N3):

‖x + y‖² = ‖x‖² + 2⟨x, y⟩ + ‖y‖² ≤ ‖x‖² + 2‖x‖ · ‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²,

where the inequality follows from the CBS inequality.

Example (other norms on Rⁿ).
1) ‖·‖₁ : Rⁿ → R, ‖x‖₁ = |x₁| + · · · + |xₙ|, ∀ x ∈ Rⁿ
2) ‖·‖∞ : Rⁿ → R, ‖x‖∞ = max{|x₁|, ..., |xₙ|}, ∀ x ∈ Rⁿ.

Metric spaces

Definition. If X ≠ ∅, a function d : X × X → R is called a distance on X if the following conditions are satisfied:
D1) d(x, y) ≥ 0, ∀ x, y ∈ X, and d(x, y) = 0 ⇔ x = y
D2) d(x, y) = d(y, x), ∀ x, y ∈ X (symmetry)
D3) d(x, z) ≤ d(x, y) + d(y, z), ∀ x, y, z ∈ X (triangle inequality).

A metric space is a pair (X, d) in which X is a nonempty set and d is a distance on X.

Remark. Each normed space (V, ‖·‖) is a metric space, since the function d : V × V → R, d(x, y) = ‖x − y‖, ∀ x, y ∈ V, satisfies the properties D1), D2), D3).

Example. In Rⁿ the Euclidean distance is:

d(x, y) = √((x₁ − y₁)² + · · · + (xₙ − yₙ)²).
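The canonical inner product, the three norms above, and the two key inequalities (Cauchy-Buniakovski-Schwarz and the triangle inequality) can be checked on sample vectors; a small sketch with illustrative helper names:

```python
import math

def inner(x, y):
    """Canonical inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm2(x):
    """Euclidean norm: sqrt of <x, x>."""
    return math.sqrt(inner(x, x))

def norm1(x):
    """|x_1| + ... + |x_n|."""
    return sum(abs(xi) for xi in x)

def norm_inf(x):
    """max |x_i|."""
    return max(abs(xi) for xi in x)

x, y = (1.0, -2.0, 3.0), (4.0, 0.5, -1.0)
s = tuple(a + b for a, b in zip(x, y))
assert abs(inner(x, y)) <= norm2(x) * norm2(y)   # CBS inequality
for n in (norm1, norm2, norm_inf):               # N3 for each norm
    assert n(s) <= n(x) + n(y) + 1e-12
```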
. . Let T : Rn → Rm be a linear operator.1. A linear operator T : V → R is called a linear form.. Remark 1. A linear operator from V to W is a function T that preserves the vector space structure. Remark 2. .. Linear operators Deﬁnition. i. i = 1. x = x1 e1 + x2 e2 + · · · + xn en T (x) = T (x1 e1 + · · · + xn en ) = x1 T (e1 ) + · · · + xn T (en ) = x1 a1 + · · · + xn an = at x.3 Quadratic forms In this section we present the natural generalizations of linear and quadratic functions to several variables. Then. ∀ x ∈ Rn . Let B = {e1 . there exists a vector a1 . ∀ x. . en } be the canonical basis of Rn . Proof. y ∈ V. Proof. for any vector x ∈ Rn . hence a1j a T (ej ) = 2j . Then. ∀ α. Let ai = T (ei ) ∈ R. Then there exists an m × n matrix A such that T (x) = Ax. . Let V and W be two real vector spaces. n. T (ej ) ∈ Rm . .e. . Let B = {e1 . T (αx + βy) = αT (x) + βT (y). Let T : Rn → R be a linear operator. amj 21 . n. The idea is the same as that of the previous remark. For each j = 1. such that dim V = n and dim W = m. The same correspondence between linear operators and matrices is valid for linear operators from Rn to Rm . The previous remark implies that every linear form on Rn can be associated with a unique vector a ∈ Rn (or with a unique 1 × n matrix) so that T (x) = at x. ∈ Rn . . an such that T (x) = at x for all x ∈ V . . a = . β ∈ R. en } be the canonical base of Rn .
Let $A$ be the $m \times n$ matrix whose $j$th column is the column vector $T(e_j)$:
\[ A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix}. \]
For any $x = x_1 e_1 + \cdots + x_n e_n \in \mathbb{R}^n$ we have
\[ T(x) = T(x_1 e_1 + \cdots + x_n e_n) = x_1 T(e_1) + \cdots + x_n T(e_n) = x_1 \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ \vdots \\ a_{m2} \end{pmatrix} + \cdots + x_n \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix} = Ax. \]
So, we can say that matrices are representations of linear operators.

Quadratic forms

In mathematics, a quadratic form is a homogeneous polynomial of degree two in a number of variables.

Examples.
\[ Q(x) = ax^2, \qquad Q(x,y) = ax^2 + bxy + cy^2, \qquad Q(x,y,z) = ax^2 + by^2 + cz^2 + dxy + exz + fyz \]
are quadratic forms in one, two and three variables, respectively.

Quadratic forms are associated to bilinear forms.

Definition.
a) Let $V$ and $W$ be two real vector spaces such that $\dim V = n$ and $\dim W = m$. The application $(\cdot \mid \cdot) : V \times W \to \mathbb{R}$ is called bilinear if it is linear with respect to each of its two variables, i.e.:
\[ (\alpha x + \beta y \mid z) = \alpha (x \mid z) + \beta (y \mid z), \quad \forall\, x, y \in V,\ \forall\, z \in W,\ \forall\, \alpha, \beta \in \mathbb{R}, \]
\[ (x \mid \alpha y + \beta z) = \alpha (x \mid y) + \beta (x \mid z), \quad \forall\, x \in V,\ \forall\, y, z \in W,\ \forall\, \alpha, \beta \in \mathbb{R}. \]
b) The bilinear form $(\cdot \mid \cdot) : V \times V \to \mathbb{R}$ is called symmetric if $(x \mid y) = (y \mid x)$, $\forall\, x, y \in V$.
c) Let $V$ be a real vector space whose dimension is $n$ and let $(\cdot \mid \cdot) : V \times V \to \mathbb{R}$ be a symmetric bilinear form. The application
\[ Q : V \to \mathbb{R}, \qquad x \mapsto Q(x) = (x \mid x), \quad \forall\, x \in V, \]
is called a quadratic form on $V$.
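These definitions can be checked on a concrete symmetric bilinear form $(x \mid y) = x^t A y$; the following sketch uses an arbitrary symmetric matrix of our choosing, not an example from the text:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # symmetric matrix, so (x|y) = x^t A y is symmetric

def bf(x, y):
    """Bilinear form (x|y) = x^t A y."""
    return float(x @ A @ y)

def Q(x):
    """Associated quadratic form Q(x) = (x|x)."""
    return bf(x, x)

x, y, z = np.array([1.0, -2.0]), np.array([0.5, 4.0]), np.array([3.0, 1.0])
a, b = 2.0, -3.0
assert np.isclose(bf(a * x + b * y, z), a * bf(x, z) + b * bf(y, z))  # linear in 1st slot
assert np.isclose(bf(x, y), bf(y, x))                                 # symmetry
assert np.isclose(Q(3.0 * x), 9.0 * Q(x))   # Q is homogeneous of degree two
print(Q(x))  # 10.0
```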
Next, we determine the analytical expression of a quadratic form. If $B = \{v_1, \ldots, v_n\}$ is a basis of $V$ then $x$ can be uniquely expressed as $x = x_1 v_1 + x_2 v_2 + \cdots + x_n v_n$. Hence:
\[ Q(x) = (x \mid x) = (x_1 v_1 + x_2 v_2 + \cdots + x_n v_n \mid x) = x_1 (v_1 \mid x) + x_2 (v_2 \mid x) + \cdots + x_n (v_n \mid x) \]
\[ = x_1 (v_1 \mid x_1 v_1 + \cdots + x_n v_n) + \cdots + x_n (v_n \mid x_1 v_1 + \cdots + x_n v_n) \]
\[ = x_1 [x_1 (v_1 \mid v_1) + \cdots + x_n (v_1 \mid v_n)] + \cdots + x_n [x_1 (v_n \mid v_1) + \cdots + x_n (v_n \mid v_n)] \]
\[ = x_1 \sum_{j=1}^n x_j (v_1 \mid v_j) + \cdots + x_n \sum_{j=1}^n x_j (v_n \mid v_j) = \sum_{i=1}^n \sum_{j=1}^n (v_i \mid v_j) x_i x_j. \]
In conclusion:
\[ Q(x) = (x \mid x) = \sum_{i=1}^n \sum_{j=1}^n (v_i \mid v_j) x_i x_j. \]
If for each $i, j = 1, \ldots, n$ we denote $a_{ij} = (v_i \mid v_j)$, then $a_{ij} = a_{ji}$ (since $(\cdot \mid \cdot)$ is symmetric) and
\[ Q(x) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + \cdots + 2 a_{1n} x_1 x_n + a_{22} x_2^2 + \cdots + 2 a_{2n} x_2 x_n + \cdots + a_{nn} x_n^2, \]
which is the analytical expression of the quadratic form $Q$.

Remark. Just as a linear function has a matrix representation, a quadratic form has a matrix representation, too. The quadratic form $Q : \mathbb{R}^n \to \mathbb{R}$,
\[ Q(x) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j \quad (a_{ij} = a_{ji}), \]
can be written as
\[ Q(x) = (x_1, \ldots, x_n) \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = x^t A x, \]
where $A$ is the (symmetric) matrix of the coefficients of the quadratic form $Q$.

Definiteness of quadratic forms

Definition. If $Q : \mathbb{R}^n \to \mathbb{R}$ is a quadratic form, then $Q$ is
(a) positive definite if $Q(x) > 0$ for all $x \in \mathbb{R}^n \setminus \{\theta\}$;
(b) positive semidefinite if $Q(x) \ge 0$ for all $x \in \mathbb{R}^n$;
(c) negative definite if $Q(x) < 0$ for all $x \in \mathbb{R}^n \setminus \{\theta\}$;
(d) negative semidefinite if $Q(x) \le 0$ for all $x \in \mathbb{R}^n$;
(e) indefinite if there are $x, x' \in \mathbb{R}^n \setminus \{\theta\}$ such that $Q(x) < 0$ and $Q(x') > 0$.

Next, we will describe a simple test for the definiteness of a quadratic form. To present the test we need some definitions related to the coefficient matrix of $Q$.

Definition.
a) Let $A$ be an $n \times n$ matrix. An $m \times m$ submatrix of $A$ formed by deleting $n - m$ columns and the same $n - m$ rows from $A$ is called an $m$th order principal submatrix of $A$. The determinant of an $m \times m$ principal submatrix is called an $m$th order principal minor of $A$. For an $n \times n$ matrix there are $C_n^m$ $m$th order principal minors of $A$.
b) Let $A$ be an $n \times n$ matrix. The $m$th order principal submatrix of $A$ obtained by deleting the last $n - m$ rows and the last $n - m$ columns from $A$ is called the $m$th order leading principal submatrix of $A$, and its determinant is called the $m$th order leading principal minor of $A$. We will denote the $m$th order leading principal submatrix by $A_m$ and the corresponding leading principal minor by $\det A_m$.

The following remark provides an algorithm which uses the principal minors to determine the definiteness of a quadratic form $Q$ whose coefficient matrix is $A$.

Remark 3. Let $Q : \mathbb{R}^n \to \mathbb{R}$ be a quadratic form whose coefficient matrix is $A$. Then:
(a) $Q$ is positive definite if and only if all its $n$ leading principal minors are strictly positive, i.e.
\[ \det A_1 > 0,\ \det A_2 > 0,\ \ldots,\ \det A_n = \det A > 0; \]
(b) $Q$ is negative definite if and only if its $n$ leading principal minors alternate in sign as follows:
\[ \det A_1 < 0,\ \det A_2 > 0,\ \ldots,\ (-1)^n \det A_n = (-1)^n \det A > 0; \]
(c) $Q$ is positive semidefinite if and only if every principal minor of $A$ is nonnegative;
(d) $Q$ is negative semidefinite if and only if every principal minor of odd order is nonpositive and every principal minor of even order is nonnegative;
(e) if there is an even number $m$ ($m \in \{1, \ldots, n\}$) such that $\det A_m < 0$, or if there are two odd numbers $m_1$ and $m_2$ such that $\det A_{m_1} < 0$ and $\det A_{m_2} > 0$, then $Q$ is indefinite.

Proof. We will prove part (a) (the proofs are similar for parts (b), (c) and (d)) by using induction on the size of $A$ (the coefficient matrix of $Q$).

Suppose that all the leading principal minors are strictly positive. We have to show that $Q$ is positive definite.

If $n = 1$ then the result is trivial.

If $n = 2$ then $\det A_1 = a_{11} > 0$, $\det A_2 = \det A = a_{11} a_{22} - a_{12}^2 > 0$, and hence:
\[ Q(x) = (x_1, x_2) \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2 = a_{11} \left( x_1 + \frac{a_{12} x_2}{a_{11}} \right)^2 + \frac{a_{11} a_{22} - a_{12}^2}{a_{11}}\, x_2^2 \]
\[ = \det A_1 \left( x_1 + \frac{a_{12} x_2}{a_{11}} \right)^2 + \frac{\det A_2}{\det A_1}\, x_2^2 > 0, \quad \forall\, (x_1, x_2) \neq (0, 0). \]

We suppose that the theorem is true for symmetric matrices of order $k$ and prove it for symmetric matrices of order $k+1$. Let $A$ be a symmetric matrix of order $k+1$. The matrix $A$ can be written as
\[ A = \begin{pmatrix} A_k & a \\ a^t & a_{k+1\,k+1} \end{pmatrix}, \quad \text{where } a = \begin{pmatrix} a_{1\,k+1} \\ \vdots \\ a_{k\,k+1} \end{pmatrix}. \]
If $d = a_{k+1\,k+1} - a^t A_k^{-1} a$ then we have
\[ \begin{pmatrix} I_k & 0 \\ (A_k^{-1} a)^t & 1 \end{pmatrix} \begin{pmatrix} A_k & 0 \\ 0 & d \end{pmatrix} \begin{pmatrix} I_k & A_k^{-1} a \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} A_k & a \\ a^t & a_{k+1\,k+1} \end{pmatrix} = A. \]
The previous equality can be written as $A = C^t B C$, where
\[ C = \begin{pmatrix} I_k & A_k^{-1} a \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} A_k & 0 \\ 0 & d \end{pmatrix}. \]
Since $\det C = \det C^t = 1$ and $\det B = d \cdot \det A_k$, then $\det A = \det B = d \cdot \det A_k$.
Since $\det A > 0$ and $\det A_k > 0$, then $d > 0$.

Every $x \in \mathbb{R}^{k+1}$ can be written as $x = \begin{pmatrix} \bar{x} \\ x_{k+1} \end{pmatrix}$, where $\bar{x} \in \mathbb{R}^k$. Then
\[ x^t A x = x^t C^t B C x = (Cx)^t B (Cx) = y^t B y = (\bar{y}^t, y_{k+1}) \begin{pmatrix} A_k & 0 \\ 0 & d \end{pmatrix} \begin{pmatrix} \bar{y} \\ y_{k+1} \end{pmatrix} = \bar{y}^t A_k \bar{y} + d\, y_{k+1}^2. \]
In the previous equality we denoted the vector $Cx$ by $y = \begin{pmatrix} \bar{y} \\ y_{k+1} \end{pmatrix}$, which is not the null vector since $C$ is invertible and $x \neq \theta$. By using the inductive hypothesis and the fact that $d > 0$ we get that $x^t A x > 0$, $\forall\, x \neq \theta$, hence $Q$ is positive definite.

To prove the converse ($Q$ positive definite implies that $\det A_j > 0$, $j = 1, \ldots, n$) we will use induction once more.

If $n = 1$ then the result is trivial.

If $n = 2$, then
\[ Q(x) = \det A_1 \left( x_1 + \frac{a_{12} x_2}{a_{11}} \right)^2 + \frac{\det A_2}{\det A_1}\, x_2^2. \]
In the previous equality we used the fact that $a_{11} \neq 0$: if $a_{11} = 0$ then $Q(1, 0) = 0$ and $Q$ cannot be positive definite. It is obvious that if $Q(x) > 0$, $\forall\, x \neq \theta$, then $\det A_1 > 0$ and $\det A_2 > 0$.

Assume that the result is true for any quadratic form whose coefficient matrix has order $k$ and let $A$ be the $(k+1) \times (k+1)$ coefficient matrix of a positive definite quadratic form. Let $\bar{x} \in \mathbb{R}^k \setminus \{\theta\}$. If $x = \begin{pmatrix} \bar{x} \\ 0 \end{pmatrix} \in \mathbb{R}^{k+1}$ then, since $Q$ is positive definite,
\[ 0 < x^t A x = (\bar{x}^t, 0)\, A \begin{pmatrix} \bar{x} \\ 0 \end{pmatrix} = \bar{x}^t A_k \bar{x}. \]
By the inductive hypothesis we obtain that $\det A_1 > 0, \ldots, \det A_k > 0$.

It remains only to prove that $\det A = \det A_{k+1}$ is positive. We write the matrix $A$ as in the first part of the proof; hence $A = C^t B C$ and $\det A = \det A_k \cdot d$. We have to prove now that $d > 0$. Indeed, since $B = (C^{-1})^t A\, C^{-1}$, for $x = (0, \ldots, 0, 1)^t \in \mathbb{R}^{k+1}$ we get
\[ d = x^t B x = x^t (C^{-1})^t A\, C^{-1} x = (C^{-1} x)^t A (C^{-1} x) > 0, \]
because $C^{-1} x \neq \theta$ and $Q$ is positive definite. Since $\det A_k > 0$ and $d > 0$, then $\det A = \det A_{k+1} > 0$, as desired.
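The leading-principal-minor test of Remark 3 translates directly into code; a small sketch (the function names are ours), covering only the two definite cases:

```python
import numpy as np

def leading_principal_minors(A):
    """Return det A_1, ..., det A_n for a square matrix A."""
    n = A.shape[0]
    return [float(np.linalg.det(A[:m, :m])) for m in range(1, n + 1)]

def classify(A):
    """Leading-principal-minor test (definite cases of Remark 3 only)."""
    minors = leading_principal_minors(A)
    if all(m > 0 for m in minors):
        return "positive definite"
    # signs must alternate, starting with det A_1 < 0
    if all((m < 0) if k % 2 == 0 else (m > 0) for k, m in enumerate(minors)):
        return "negative definite"
    return "not definite (semidefinite or indefinite)"

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # Q(x) = 2x1^2 + 2x1x2 + 2x2^2
print(classify(A))    # positive definite (det A_1 = 2, det A_2 = 3)
print(classify(-A))   # negative definite (det A_1 = -2, det A_2 = 3)
```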
Chapter 2

Linear Algebra

2.1 Systems of linear equations. Gauss–Jordan elimination method

A finite set of linear equations in the variables $x_1, x_2, \ldots, x_n$ is called a system of linear equations or a linear system. The general form of a linear system of $m$ equations and $n$ unknowns is the following:
\[ \begin{cases} a_{11} x_1 + \cdots + a_{1n} x_n = b_1 \\ \quad \vdots \\ a_{m1} x_1 + \cdots + a_{mn} x_n = b_m \end{cases} \]
where $a_{ij} \in \mathbb{R}$, $b_i \in \mathbb{R}$, $\forall\, i = 1, \ldots, m$, $j = 1, \ldots, n$.

A solution of the system is a list $(s_1, \ldots, s_n)$ of numbers which makes each equation a true statement when the values $s_1, \ldots, s_n$ are substituted for $x_1, \ldots, x_n$, respectively. The set of all possible solutions is called the solution set or the general solution of the linear system.

A system of linear equations has either
1. no solution, or
2. exactly one solution, or
3. infinitely many solutions.

We say that a linear system is consistent if it has either one solution or infinitely many solutions; a system is inconsistent if it has no solution.

Two linear systems are called equivalent if they have the same solution set. That is, each solution of the first system is a solution of the second system, and each solution of the second system is a solution of the first system.

For a linear system we consider
• the matrix of the system (the matrix of the coefficients of the unknowns)
\[ A = (a_{ij})_{\substack{i = 1, \ldots, m \\ j = 1, \ldots, n}} = \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \ldots & a_{mn} \end{pmatrix} \]
• the augmented matrix (the coefficient matrix with an added column containing the constants from the right-hand sides of the equations)
\[ \bar{A} = \begin{pmatrix} a_{11} & \ldots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \ldots & a_{mn} & b_m \end{pmatrix} = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_m \end{pmatrix}, \]
where $R_i$ denotes the $i$th row, built from $(a_{i1}, a_{i2}, \ldots, a_{in})$ and $b_i$;
• the column of the constants $b = (b_1, \ldots, b_m)^t$;
• the column of the unknowns $x = (x_1, \ldots, x_n)^t$.

By using the above matrix notations the system can be written in the following form:
\[ Ax = b. \]

Concerning the solution set of a linear system we have the following result:

Remark.
1) If $\operatorname{rank} A < \operatorname{rank} \bar{A}$ then there is no solution of the considered linear system. The system is inconsistent.
2) If $\operatorname{rank} A = \operatorname{rank} \bar{A} = n$ (where $n$ is the number of the unknowns) then the system has exactly one solution.
3) If $\operatorname{rank} A = \operatorname{rank} \bar{A} < n$ then the system has infinitely many solutions. In the cases 2) and 3) the system is consistent.

If an equivalent system contains a degenerate linear equation of the form
\[ 0 \cdot x_1 + 0 \cdot x_2 + \cdots + 0 \cdot x_n = b_i, \]
then:
i) if $b_i = 0$, the degenerate equation may be deleted from the system without changing the solution set;
ii) if $b_i \neq 0$, the system is inconsistent.

This chapter describes an algorithm, a systematic procedure for solving linear systems. The algorithm is called the Gauss–Jordan elimination method and its basic strategy is to replace one system with an equivalent one that is easier to solve. The method is named after the German mathematicians Carl Friedrich Gauss (1777–1855) and Wilhelm Jordan (1842–1899), but it already appears in an important Chinese mathematical text written approximately in 150 BCE.
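The rank criterion in the remark above is easy to apply numerically; a sketch using NumPy (the function name is ours):

```python
import numpy as np

def classify_system(A, b):
    """Classify Ax = b by comparing rank A with the rank of the augmented matrix."""
    rank_a = np.linalg.matrix_rank(A)
    rank_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_a < rank_aug:
        return "inconsistent"
    if rank_a == A.shape[1]:
        return "exactly one solution"
    return "infinitely many solutions"

A = np.array([[1.0, -2.0, 1.0],
              [2.0, -1.0, 4.0],
              [3.0, -2.0, 2.0]])
b = np.array([7.0, 17.0, 14.0])
print(classify_system(A, b))  # exactly one solution
```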
The rectangle rule for row operations

The purpose of this paragraph is to transform a matrix which has a nonzero column into an equivalent one that contains one element equal to 1 and all the other elements equal to 0 (we say that such a column is in proper form). This can be done by using the elementary row operations, which are:
1) Scaling. Multiply all entries in a row by a nonzero constant: $\lambda R_i \to R_i$, $\lambda \neq 0$.
2) Replacement. Replace one row by the sum of itself and a multiple of another row: $R_i + \lambda R_k \to R_i$.
3) Interchange. Interchange two rows: $R_i \leftrightarrow R_j$.

Remark. If we apply the elementary row operations to an augmented matrix of a linear system we obtain a new matrix which is the augmented matrix of a linear system equivalent to the given one. This remark is true since it is well known that the solution of a system remains unchanged if we multiply one equation by a nonzero constant, if we add a multiple of one equation to another, or if we interchange two equations of the system (the rows of an augmented matrix correspond to the equations in the associated system).

Let $A = (a_{ij})$ be an $m \times n$ matrix and suppose that $a_{ij} \neq 0$. We want to determine the elementary row operations which transform the element $a_{ij}$ into 1 ($a_{ij} \to 1$) and all the other elements of the $j$th column into 0 ($a_{kj} \to 0$, $\forall\, k \neq i$). We consider the following row operations:
\[ \frac{1}{a_{ij}} R_i \to R_i \quad \text{and} \quad R_k - \frac{a_{kj}}{a_{ij}} R_i \to R_k, \quad \forall\, k = 1, \ldots, m,\ k \neq i. \]
Then
\[ a_{ij} \to \frac{a_{ij}}{a_{ij}} = 1, \qquad a_{kj} \to a_{kj} - \frac{a_{kj}}{a_{ij}} \cdot a_{ij} = 0, \quad k \neq i. \]
The effects of the previous elementary row operations on the other elements of the matrix are:
\[ a_{il} \to \frac{a_{il}}{a_{ij}}, \quad \forall\, l = 1, \ldots, n,\ l \neq j, \]
\[ a_{kl} \to a_{kl} - a_{il} \cdot \frac{a_{kj}}{a_{ij}} = \frac{a_{kl}\, a_{ij} - a_{il}\, a_{kj}}{a_{ij}}, \quad \forall\, k \neq i,\ \forall\, l \neq j. \]
The element $a_{ij} \neq 0$ is called the pivot. So, in order to transform the element $a_{kl}$ (by using $a_{ij}$ as a pivot) we locate the rectangle which contains the element $a_{kl}$ and the pivot $a_{ij}$ as opposite corners. Then, from the product of the elements situated in the corners of the rectangle's diagonal which contains the pivot we subtract the product of the elements situated in the corners of the other diagonal, and the result is divided by the pivot (the rectangle's rule).

Remark.
1) The rows which contain a 0 in the pivot column remain unchanged. Indeed, if $a_{kj} = 0$ then
\[ a_{kl} \to \frac{a_{kl} \cdot a_{ij} - a_{il} \cdot 0}{a_{ij}} = a_{kl}. \]
2) The columns which contain a 0 in the pivot row remain unchanged. Indeed, if $a_{il} = 0$ then
\[ a_{kl} \to \frac{a_{kl} \cdot a_{ij} - 0 \cdot a_{kj}}{a_{ij}} = a_{kl}. \]

So, in order to transform a matrix which has a nonzero column into an equivalent one that contains one element equal to 1 and all the other elements equal to 0 we have to follow the next steps:

Rectangle's algorithm
Step 1. Choose and circle (from the considered column) a nonzero element, which is called the pivot.
Step 2. Divide the pivot row by the pivot.
Step 3. Set the elements of the pivot column (except the pivot) equal to 0.
Step 4. The rows which contain a 0 in the pivot column and the columns which contain a 0 in the pivot row remain unchanged.
Step 5. Compute all the other elements of the matrix by using the rectangle's rule.

Example (pivoting on $a_{22} = 3$):
\[ A = \begin{pmatrix} 2 & 1 & -1 \\ 0 & 3 & 1 \\ 1 & -1 & 2 \end{pmatrix} \sim \begin{pmatrix} 2 & 0 & -4/3 \\ 0 & 1 & 1/3 \\ 1 & 0 & 7/3 \end{pmatrix}. \]

Remark. The rectangle rule can be used to determine the inverse of a given invertible matrix $A$. This can be done by writing at the right side of the given matrix $A$ the identity matrix $I$, which has the same number of rows and columns as the matrix $A$,
and then applying the rectangle rule to the matrix obtained. By choosing successively the elements situated on the main diagonal of the matrix $A$ as pivots we will finally obtain the identity matrix $I$ in the place of the given matrix $A$. The matrix situated at the right side of the identity matrix (in the final table) is the inverse of the matrix $A$.

Example. Determine the inverse of the matrix $A$ given by
\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix}. \]
We observe that the matrix $A$ is invertible since its determinant is $-18 \neq 0$. Pivoting successively on the diagonal entries in the table $[\,A \mid I_3\,]$ we obtain:
\[ \begin{pmatrix} 1 & 2 & 3 & \mid & 1 & 0 & 0 \\ 2 & 3 & 1 & \mid & 0 & 1 & 0 \\ 3 & 1 & 2 & \mid & 0 & 0 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 2 & 3 & \mid & 1 & 0 & 0 \\ 0 & -1 & -5 & \mid & -2 & 1 & 0 \\ 0 & -5 & -7 & \mid & -3 & 0 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -7 & \mid & -3 & 2 & 0 \\ 0 & 1 & 5 & \mid & 2 & -1 & 0 \\ 0 & 0 & 18 & \mid & 7 & -5 & 1 \end{pmatrix} \]
\[ \sim \begin{pmatrix} 1 & 0 & 0 & \mid & -\frac{5}{18} & \frac{1}{18} & \frac{7}{18} \\ 0 & 1 & 0 & \mid & \frac{1}{18} & \frac{7}{18} & -\frac{5}{18} \\ 0 & 0 & 1 & \mid & \frac{7}{18} & -\frac{5}{18} & \frac{1}{18} \end{pmatrix}. \]
Hence
\[ A^{-1} = \frac{1}{18} \begin{pmatrix} -5 & 1 & 7 \\ 1 & 7 & -5 \\ 7 & -5 & 1 \end{pmatrix}. \]
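The inversion procedure can be verified with a short program; a sketch in exact arithmetic (the function names are ours), pivoting on the diagonal of the table $[\,A \mid I\,]$ exactly as above:

```python
from fractions import Fraction

def pivot_step(M, i, j):
    """One pivot step on row i, column j: divide the pivot row by the pivot,
    then apply the rectangle's rule to every other row."""
    p = M[i][j]
    M[i] = [v / p for v in M[i]]
    for k in range(len(M)):
        if k != i:
            f = M[k][j]
            M[k] = [a - f * b for a, b in zip(M[k], M[i])]

def inverse(A):
    """Invert A by pivoting on the diagonal of the table [A | I]."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(1 if i == k else 0) for k in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):        # diagonal pivots, assumed nonzero for this sketch
        pivot_step(M, i, i)
    return [row[n:] for row in M]

inv = inverse([[1, 2, 3], [2, 3, 1], [3, 1, 2]])
print(inv[0])  # [Fraction(-5, 18), Fraction(1, 18), Fraction(7, 18)]
```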
The Gauss–Jordan elimination method

This method is an elimination procedure which transforms the initial system into an equivalent one whose solution can be obtained directly.

Gauss–Jordan elimination algorithm

Step 1. Associate to the given system the following table, which contains the augmented matrix with the constant column written at the left side of the matrix $A$:

   b  | x_1   x_2   ...  x_n
  b_1 | a_11  a_12  ...  a_1n
  b_2 | a_21  a_22  ...  a_2n
  ... | ...
  b_m | a_m1  a_m2  ...  a_mn

Step 2. Choose and circle a pivot $a_{ij} \neq 0$. The pivot has to be chosen from the coefficient matrix $A$, not from the constant column.
Step 3. Use $a_{ij}$ as a pivot to eliminate the unknown $x_j$ from all the equations except the $i$th equation (by applying the rectangle's algorithm). Examine each new row $R$ obtained (or, equivalently, each new equation):
a) If $R$ corresponds to an equation of the form $0 \cdot x_1 + \cdots + 0 \cdot x_n = 0$, then delete $R$ from the table.
b) If $R$ corresponds to an equation of the form $0 \cdot x_1 + 0 \cdot x_2 + \cdots + 0 \cdot x_n = b_i$ with $b_i \neq 0$, then exit the algorithm. The system is inconsistent.
Step 4. Repeat steps 2 and 3 with the subsystem formed by all the equations from which a pivot hasn't been chosen yet. Continue the above process until we choose a pivot from each row or until a degenerate equation is obtained at step 3b.
Step 5. In the case of consistency (the system is consistent if we choose a pivot from each row) write the general solution:
- the variables whose columns are in proper form are called leading variables;
- the variables whose columns are not in proper form may assume any values; they are called secondary variables.
If all the variables are leading variables then the system has a unique solution, which can be obtained directly from the column $b$. If there is at least one secondary variable then the system has infinitely many solutions; in this case express the leading variables in terms of the secondary variables.

Example. Solve the following linear systems.

a)
\[ \begin{cases} x + 2y - 3z + 4t = 2 \\ 2x + 5y - 2z + t = 1 \\ 5x + 12y - 7z + 6t = 7 \end{cases} \]

Solution

   b | x    y    z    t
   2 | 1    2   -3    4
   1 | 2    5   -2    1
   7 | 5   12   -7    6
  ---
   2 | 1    2   -3    4
  -3 | 0    1    4   -7
  -3 | 0    2    8  -14
  ---
   8 | 1    0  -11   18
  -3 | 0    1    4   -7
   3 | 0    0    0    0

The system is inconsistent since we obtain the following degenerate equation:
\[ 0 \cdot x + 0 \cdot y + 0 \cdot z + 0 \cdot t = 3. \]
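Running the elimination for system a) in code reproduces the degenerate row; a short NumPy sketch:

```python
import numpy as np

# augmented matrix [A | b] of system a)
M = np.array([[1.0, 2.0, -3.0, 4.0, 2.0],
              [2.0, 5.0, -2.0, 1.0, 1.0],
              [5.0, 12.0, -7.0, 6.0, 7.0]])

for i in (0, 1):                 # pivots a_11 and a_22
    M[i] = M[i] / M[i, i]
    for k in range(3):
        if k != i:
            M[k] = M[k] - M[k, i] * M[i]

print(M[2])  # [0. 0. 0. 0. 3.] -> degenerate equation 0 = 3, inconsistent
```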
b)
\[ \begin{cases} x - 2y + z = 7 \\ 2x - y + 4z = 17 \\ 3x - 2y + 2z = 14 \end{cases} \]

Solution

    b | x    y    z
    7 | 1   -2    1
   17 | 2   -1    4
   14 | 3   -2    2
  ---
    7 | 1   -2    1
    3 | 0    3    2
   -7 | 0    4   -1
  ---
    0 | 1    2    0
  -11 | 0   11    0
    7 | 0   -4    1
  ---
    2 | 1    0    0
   -1 | 0    1    0
    3 | 0    0    1

The system is consistent since we have chosen a pivot from each row. The leading variables are $x, y, z$ and the system has a unique solution, which is
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix}. \]

c)
\[ \begin{cases} x + 2y - 3z - 2s + 4t = 1 \\ 2x + 5y - 8z - s + 6t = 4 \\ x + 4y - 4z + 5s + 2t = 8 \end{cases} \]

Solution

   b | x    y    z    s    t
   1 | 1    2   -3   -2    4
   4 | 2    5   -8   -1    6
   8 | 1    4   -4    5    2
  ---
   1 | 1    2   -3   -2    4
   2 | 0    1   -2    3   -2
   7 | 0    2   -1    7   -2
  ---
  -3 | 1    0    1   -8    8
   2 | 0    1   -2    3   -2
   3 | 0    0    3    1    2
  ---
  21 | 1    0   25    0   24
  -7 | 0    1  -11    0   -8
   3 | 0    0    3    1    2
The system is consistent since we have chosen a pivot from each row. From the final table we write down the following system, equivalent to the given one:
\[ \begin{cases} x + 25z + 24t = 21 \\ y - 11z - 8t = -7 \\ 3z + s + 2t = 3 \end{cases} \]
The leading variables are $x, y, s$; the secondary variables are $z$ and $t$, and in consequence the system has infinitely many solutions. From the above we can easily express the leading variables in terms of the secondary variables:
\[ \begin{cases} x = 21 - 25z - 24t \\ y = -7 + 11z + 8t \\ s = 3 - 3z - 2t \end{cases} \qquad z, t \in \mathbb{R}. \]
The general solution can be expressed as follows:
\[ \begin{pmatrix} x \\ y \\ z \\ s \\ t \end{pmatrix} = \begin{pmatrix} 21 - 25z - 24t \\ -7 + 11z + 8t \\ z \\ 3 - 3z - 2t \\ t \end{pmatrix}, \qquad z, t \in \mathbb{R}. \]

Leontief Production Model

The Leontief production model is a model for the economics of a whole country or region. In this model there are $n$ industries producing $n$ different products such that consumption equals production. We remark that a part of the production is consumed internally by the industries and the rest satisfies the outside demand. The problem is to determine the levels of the outputs of the industries if the external demand is given and the prices are fixed. We will measure the levels of the outputs in terms of their economic values.

Over some fixed period of time, let
$x_i$ = the monetary value of the total output of the $i$th industry;
$d_i$ = the monetary value of the output of the $i$th industry needed to satisfy the external demand;
$c_{ij}$ = the monetary value of the output of the $i$th industry needed by the $j$th industry to produce one unit of monetary value of its own output.

We define the production vector
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \]
the demand vector and the consumption matrix
\[ d = \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{pmatrix}, \qquad C = \begin{pmatrix} c_{11} & c_{12} & \ldots & c_{1n} \\ c_{21} & c_{22} & \ldots & c_{2n} \\ \vdots & & & \vdots \\ c_{n1} & c_{n2} & \ldots & c_{nn} \end{pmatrix}. \]
It is obvious that $x_j, d_j, c_{ij} \ge 0$ for each $i, j = 1, \ldots, n$. The quantity $c_{i1} x_1 + c_{i2} x_2 + \cdots + c_{in} x_n$ is the value of the output of the $i$th industry needed by all $n$ industries. We are led to the following equation
\[ x = Cx + d, \]
which is called the Leontief input–output model, or production model. Writing $x$ as $I_n x$ and using matrix algebra, we can rewrite the previous equation as
\[ I_n x - Cx = d \iff (I_n - C)x = d. \]
The above system can be solved by using the Gauss–Jordan elimination method. If the matrix $I_n - C$ is invertible, then we obtain
\[ x = (I_n - C)^{-1} d. \]

Example. As a simple example, suppose the economy consists of three sectors, manufacturing, agriculture and services, whose consumption matrix is given by
\[ C = \begin{pmatrix} 0.5 & 0.2 & 0.1 \\ 0.4 & 0.3 & 0.1 \\ 0.2 & 0.1 & 0.3 \end{pmatrix}. \]
Suppose the external demand is 50 units for manufacturing, 30 units for agriculture and 20 units for services. Find the production level that will satisfy this demand.

Solution 1 (by using the Gauss–Jordan elimination method). The production equation is $(I_3 - C)x = d$, which gives us the following system to be solved:
\[ \begin{cases} 0.5 x_1 - 0.2 x_2 - 0.1 x_3 = 50 \\ -0.4 x_1 + 0.7 x_2 - 0.1 x_3 = 30 \\ -0.2 x_1 - 0.1 x_2 + 0.7 x_3 = 20 \end{cases} \]
Applying the Gauss–Jordan elimination method (choosing a pivot from each row) we obtain the unique solution
\[ x_1 = \frac{5050}{27} \approx 187, \qquad x_2 = \frac{4450}{27} \approx 165, \qquad x_3 = \frac{950}{9} \approx 106. \]
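Solution 1 can be cross-checked numerically; a sketch using NumPy, with the consumption matrix and demand from the example:

```python
import numpy as np

C = np.array([[0.5, 0.2, 0.1],
              [0.4, 0.3, 0.1],
              [0.2, 0.1, 0.3]])
d = np.array([50.0, 30.0, 20.0])

x = np.linalg.solve(np.eye(3) - C, d)   # production equation (I - C) x = d
print(np.round(x))  # [187. 165. 106.]
```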
Solution 2 (by determining the inverse of the matrix $I_3 - C$). We know that the production level is determined by $x = (I_3 - C)^{-1} d$. We first determine the matrix $(I_3 - C)^{-1}$ (for instance by the rectangle rule, writing $I_3$ at the right side of $I_3 - C$):
\[ (I_3 - C)^{-1} = \begin{pmatrix} \frac{80}{27} & \frac{25}{27} & \frac{5}{9} \\[2pt] \frac{50}{27} & \frac{55}{27} & \frac{5}{9} \\[2pt] \frac{10}{9} & \frac{5}{9} & \frac{5}{3} \end{pmatrix} \]
and in consequence
\[ x = (I_3 - C)^{-1} d = \begin{pmatrix} \frac{5050}{27} \\[2pt] \frac{4450}{27} \\[2pt] \frac{950}{9} \end{pmatrix}, \]
as we expected.

The economic interpretation of the entries in $(I - C)^{-1}$

Remark. The $(i,j)$th entry of the matrix $(I - C)^{-1}$ is the increased amount the $i$th sector has to produce in order to satisfy an increase of 1 unit in the external demand for sector $j$.

Proof. Let $d$ be the vector in $\mathbb{R}^n$ with 1 in the $j$th entry and zeros elsewhere. The corresponding production vector $x$ is the $j$th column of $(I - C)^{-1}$. This shows that the $(i,j)$th entry of $(I - C)^{-1}$ gives the production of the $i$th sector needed to satisfy 1 unit of external demand for sector $j$. Also, the conclusion holds for an increase of the demand, since if $x^1$ and $x^2$ are production vectors which satisfy the external demands $d^1$ and $d^2$ respectively, then $x^1 - x^2$ is the production vector which satisfies the external demand $d^1 - d^2$.

The theorem below shows that in most practical cases $I - C$ is invertible and the production vector $x$ is economically feasible, in the sense that the entries in $x$ are nonnegative.

Theorem. Let $C$ be the consumption matrix for an economy and let $d$ be the vector of external demand. If $C$ and $d$ have nonnegative entries and if each row sum or each column sum of $C$ is less than 1, then $(I - C)^{-1}$ exists, and the production vector $x = (I - C)^{-1} d$ has nonnegative entries and is the unique solution of the production equation $x = Cx + d$.

Basic feasible solutions

We consider a linear system in general form:
\[ \begin{cases} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \\ \quad \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m \end{cases} \]
We suppose that the above system is a consistent one with an infinite number of solutions (that means that $\operatorname{rank} A = \operatorname{rank} \bar{A} < n$). We also suppose that $\operatorname{rank} A = m$ (in the case that $\operatorname{rank} A < m$ there are some equations of the system which are linear combinations of the others, and if we eliminate these equations we don't change the general solution).
Since $\operatorname{rank} A = \operatorname{rank} \bar{A} = m < n$, the system has $m$ leading variables and $n - m$ secondary variables. A leading variable is also called a basic variable and a secondary variable is called a nonbasic variable.

Definitions
A feasible solution (FS) of a linear system is a solution for which all the components are nonnegative.
A basic solution (BS) of a linear system is a solution for which all the nonbasic variables are zero. If one or more basic variables in a BS are zero then the solution is a degenerate BS.
A basic feasible solution (BFS) is a feasible solution which is also a basic one. If a BFS is degenerate, it is called a degenerate BFS.

Example. Determine all the basic solutions and all the basic feasible solutions of the following system:
\[ \begin{cases} 2x_1 + 3x_2 - x_3 = 9 \\ -x_1 + x_2 - x_3 = -2 \end{cases} \]
Solution. Since
\[ A = \begin{pmatrix} 2 & 3 & -1 \\ -1 & 1 & -1 \end{pmatrix}, \qquad \bar{A} = \begin{pmatrix} 2 & 3 & -1 & 9 \\ -1 & 1 & -1 & -2 \end{pmatrix}, \]
then $\operatorname{rank} A = \operatorname{rank} \bar{A} = 2$ and the system is consistent with an infinite number of solutions. Actually, we have 2 basic variables and one nonbasic variable. The 2 basic variables can be:

a) $x_1, x_2$ ($x_3$ is a nonbasic variable). Since $x_3$ is nonbasic, $x_3 = 0$ and the system becomes
\[ \begin{cases} 2x_1 + 3x_2 = 9 \\ -x_1 + x_2 = -2 \end{cases} \]
The solution of the previous system is $x_1 = 3$ and $x_2 = 1$. In this case we obtain the BS $(3, 1, 0)^t$, which is also a BFS.

b) $x_1, x_3$ ($x_2$ is a nonbasic variable). In this case we obtain the BS $\left( \frac{11}{3},\, 0,\, -\frac{5}{3} \right)^t$, which is not a BFS.

c) $x_2, x_3$ ($x_1$ is a nonbasic variable). In this case we obtain the BS $\left( 0,\, \frac{11}{2},\, \frac{15}{2} \right)^t$, which is a BFS.
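All basic solutions can be enumerated by choosing $m$ of the $n$ columns as a basis; a sketch (the names are ours) reproducing the three cases above:

```python
from itertools import combinations
import numpy as np

def basic_solutions(A, b):
    """Enumerate the basic solutions of Ax = b (rank A = m assumed).
    Returns pairs (solution, is_feasible)."""
    m, n = A.shape
    out = []
    for cols in combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < 1e-12:
            continue                      # the chosen columns are not a basis
        x = np.zeros(n)
        x[list(cols)] = np.linalg.solve(B, b)
        out.append((x, bool(np.all(x >= -1e-12))))
    return out

A = np.array([[2.0, 3.0, -1.0], [-1.0, 1.0, -1.0]])
b = np.array([9.0, -2.0])
for x, feasible in basic_solutions(A, b):
    print(np.round(x, 4), "BFS" if feasible else "BS only")
```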
Remark. Since a basic variable is a variable from whose column we have chosen a pivot, for a consistent system having an infinite number of solutions whose rank is $m < n$ ($n$ is the number of unknowns) there are at most $C_n^m$ basic solutions.

Our purpose is to determine the basic feasible solutions of a linear system. We will use the Gauss–Jordan elimination method. Since $\operatorname{rank} A = m$, we have $m$ basic variables and $n - m$ nonbasic variables; that means that we have chosen $m$ pivots from $m$ different columns and $m$ different rows. Eventually, by renumbering the unknowns, we can suppose that we have chosen pivots from the first $m$ columns, so that the basic variables are $x_1, \ldots, x_m$ and the nonbasic variables are $x_{m+1}, \ldots, x_n$. The computations can be arranged in the following table:

    b  | x_1  ...  x_m | x_{m+1}    ...  x_n
  b_1  | a_11 ...  a_1m | a_{1,m+1}  ...  a_1n     (initial table)
  ...  |
  b_m  | a_m1 ...  a_mm | a_{m,m+1}  ...  a_mn
  ---
  β_1  | 1    ...  0   | α_{1,m+1}  ...  α_1n     (final table)
  ...  |
  β_m  | 0    ...  1   | α_{m,m+1}  ...  α_mn

The general solution is:
\[ \begin{cases} x_1 = \beta_1 - (\alpha_{1\,m+1} x_{m+1} + \cdots + \alpha_{1n} x_n) \\ x_2 = \beta_2 - (\alpha_{2\,m+1} x_{m+1} + \cdots + \alpha_{2n} x_n) \\ \quad \vdots \\ x_m = \beta_m - (\alpha_{m\,m+1} x_{m+1} + \cdots + \alpha_{mn} x_n) \end{cases} \qquad x_{m+1}, \ldots, x_n \in \mathbb{R}. \]
In order to get a basic solution we let $x_{m+1} = \cdots = x_n = 0$, so $x_1 = \beta_1, \ldots, x_m = \beta_m$. The basic solution
\[ x = (\beta_1, \beta_2, \ldots, \beta_m, 0, \ldots, 0)^t \]
can be read directly from the final table. This basic solution is also a basic feasible solution if in the final table the column of the constants contains only nonnegative elements.
Next, we will determine rules for choosing the pivot such that, if in the initial table the column of the constants is nonnegative, then so it will be in the final table. We may assume that in the initial table the constant column is nonnegative (if there is an equation whose right-hand side constant is negative, then we can multiply it by $-1$). Actually, we are interested in preserving the property of the constant column to contain only nonnegative elements in each intermediate table which occurs while we solve the system. So, suppose $b_k \ge 0$, $k = 1, \ldots, m$, and let us choose a pivot from the $j$th column such that the constant column in the next table remains nonnegative.

1) The pivot has to be positive. Indeed, if we choose $a_{ij} \neq 0$ as a pivot then $b_i$ transforms into $\dfrac{b_i}{a_{ij}}$, which has to be nonnegative; since $b_i \ge 0$, the pivot $a_{ij}$ has to be positive. In consequence, if $J_1 = \{k = 1, \ldots, m \mid a_{kj} > 0\} = \emptyset$ (in the $j$th column there is no positive element), then none of the elements of the $j$th column can become a pivot. In this case $x_j$ can't be a basic variable.

2) If we choose $a_{ij} > 0$ as a pivot, then an element $b_k$, $k \neq i$, transforms by the rectangle's rule into
\[ b_k \to \frac{b_k a_{ij} - b_i a_{kj}}{a_{ij}} \ge 0. \]
Since $a_{ij} > 0$, the pivot has to satisfy the following condition:
\[ b_i a_{kj} \le b_k a_{ij}, \quad \forall\, k = 1, \ldots, m. \quad (*) \]
For $k = i$ the previous inequality becomes $b_i a_{ij} - b_i a_{ij} = 0 \ge 0$. Let $J_1 = \{k = 1, \ldots, m \mid a_{kj} > 0\}$ and $J_2 = \{k = 1, \ldots, m \mid a_{kj} \le 0\}$. If $k \in J_2$ then $(*)$ is satisfied, since $b_i a_{kj} \le 0 \le b_k a_{ij}$. If $k \in J_1$ then $(*)$ is equivalent to the condition
\[ \frac{b_i}{a_{ij}} \le \frac{b_k}{a_{kj}}, \quad \forall\, k \in J_1, \]
so $(*)$ is satisfied if
\[ \frac{b_i}{a_{ij}} = \min \left\{ \frac{b_k}{a_{kj}} \ \middle|\ k \in J_1 \right\}. \]
The previous condition is called the ratio test.

Conclusion. In order to keep the nonnegativity property of the constant column we obtain the following rule for choosing a pivot on the $j$th column: the pivot has to be a positive element of the column which satisfies the ratio test; if $J_1 = \emptyset$, no element of the $j$th column can become a pivot.
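The pivot-selection rule (positive entry plus ratio test) fits in a few lines of code; a sketch with a hypothetical helper name of ours:

```python
def choose_pivot_row(column, b):
    """Return the row index passing the ratio test for the given pivot column,
    or None if the column has no positive entry (J1 empty)."""
    J1 = [k for k, a in enumerate(column) if a > 0]
    if not J1:
        return None          # x_j cannot become a basic variable
    return min(J1, key=lambda k: b[k] / column[k])

# a sample pivot column and constant column of a 2-equation system
print(choose_pivot_row([2.0, 1.0], [9.0, 2.0]))   # 1  (ratio 2/1 < 9/2)
print(choose_pivot_row([-1.0, 0.0], [9.0, 2.0]))  # None
```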
If $J_1 \neq \emptyset$ then the pivot will be the positive element situated in the $j$th column for which the ratio test is satisfied.

Let $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ and $b = (b_1, \ldots, b_n) \in \mathbb{R}^n$. We write $a < b$ if
\[ a_1 < b_1; \quad \text{or } a_1 = b_1 \text{ and } a_2 < b_2; \quad \text{or } a_1 = b_1,\ a_2 = b_2 \text{ and } a_3 < b_3; \quad \ldots; \quad \text{or } a_1 = b_1, \ldots, a_{n-1} = b_{n-1} \text{ and } a_n < b_n \]
(the lexicographic order).

Examples. Determine a basic feasible solution for the following systems:

a)
\[ \begin{cases} 2x_1 + 3x_2 - x_3 = 9 \\ -x_1 + x_2 - x_3 = -2 \end{cases} \]
First, we multiply the second equation by $-1$:
\[ \begin{cases} 2x_1 + 3x_2 - x_3 = 9 \\ x_1 - x_2 + x_3 = 2 \end{cases} \]
The computation table contains an extra column, situated at the right-hand side of the usual table, for the ratio test.

  b | x_1   x_2   x_3  | ratio test
  9 |  2     3    -1   | 9/2
  2 |  1    -1     1   | 2/1 = 2  <- min, pivot on x_1
 ---
  5 |  0     5    -3   | 5/5 = 1  <- min, pivot on x_2
  2 |  1    -1     1   | -
 ---
  1 |  0     1   -3/5  |
  3 |  1     0    2/5  |

The basic variables are $x_1 = 3$ and $x_2 = 1$ (with $x_3 = 0$), so we obtain the BFS $x = (3, 1, 0)^t$. Continuing with one more pivot step on the column of $x_3$ (the ratio test selects the entry $2/5$, so $x_3$ enters the basis and $x_1$ leaves it) we obtain

  11/2 | 3/2   1    0  |
  15/2 | 5/2   0    1  |

and hence the BFS $x = \left( 0,\, \frac{11}{2},\, \frac{15}{2} \right)^t$. If the ratio test is satisfied for more than one element then the pivot will be the element which provides