Contents

Introduction 7
I Elements of linear algebra 11
1 Linear spaces 13
1.1 The Euclidean space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 Euclidean n-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3 Quadratic forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2 Linear Algebra 27
2.1 Systems of linear equations. Gauss-Jordan
elimination method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2 Linear programming problems (LPP) . . . . . . . . . . . . . . . . . . . 43
II Calculus 111
3 One variable calculus 113
3.1 Differential calculus of one variable . . . . . . . . . . . . . . . . . . . . 113
3.1.1 Limits and continuity . . . . . . . . . . . . . . . . . . . . . . . 114
3.1.2 Rates of change and derivatives . . . . . . . . . . . . . . . . . . 127
3.1.3 Linear approximation and differentials . . . . . . . . . . . . . . 139
3.1.4 Extreme values of a real valued function . . . . . . . . . . . . . 141
3.1.5 Applications to economics . . . . . . . . . . . . . . . . . . . . . 150
3.2 Integral calculus of one variable . . . . . . . . . . . . . . . . . . . . . . 157
3.2.1 Antiderivatives and techniques of integration . . . . . . . . . . 157
3.2.2 The definite integral . . . . . . . . . . . . . . . . . . . . . . . . 164
3.3 Improper integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.3.1 Improper integrals . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.3.2 Euler’s integrals . . . . . . . . . . . . . . . . . . . . . . . . . . 176
4 Differential calculus of several variables 187
5
4.1 Real functions of several variables. Limits and
continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
4.1.1 Real functions of several variables . . . . . . . . . . . . . . . . 187
4.1.2 Limits. Continuity . . . . . . . . . . . . . . . . . . . . . . . . . 192
4.2 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
4.3 Higher order partial derivatives . . . . . . . . . . . . . . . . . . . . . . 212
4.4 Differentiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
4.4.1 Differentiability. The total differential . . . . . . . . . . . . . . 218
4.4.2 Higher order differentials . . . . . . . . . . . . . . . . . . . . . 228
4.4.3 Taylor formula in R
n
. . . . . . . . . . . . . . . . . . . . . . . . 231
4.5 Extrema of function of several variables . . . . . . . . . . . . . . . . . 235
4.6 Constrained extrema . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
4.7 Applications to economics . . . . . . . . . . . . . . . . . . . . . . . . . 261
4.7.1 The method of least squares . . . . . . . . . . . . . . . . . . . . 261
4.7.2 Inventory control. The economic order
quantity model . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
III Probabilities 269
A short history of probabilities . . . . . . . . . . . . . . . . . . . . . . . . . 271
5 Counting techniques. Tree diagrams 273
5.1 The addition rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
5.2 Tree diagrams and the multiplication principle . . . . . . . . . . . . . 277
5.3 Permutations and combinations . . . . . . . . . . . . . . . . . . . . . . 281
6 Basic probability concepts 289
6.1 Sample space. Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
6.2 Conditional probability . . . . . . . . . . . . . . . . . . . . . . . . . . 302
6.3 The total probability formula. Bayes’ formula . . . . . . . . . . . . . . 306
6.4 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
6.5 Classical probabilistic models. Urn models . . . . . . . . . . . . . . . . 313
7 Random variables 327
7.1 Discrete random variables . . . . . . . . . . . . . . . . . . . . . . . . . 328
7.2 The distribution function of a random variable . . . . . . . . . . . . . 332
7.3 Continuous random variables . . . . . . . . . . . . . . . . . . . . . . . 335
7.4 Numerical characteristics of random variables . . . . . . . . . . . . . . 336
7.5 Special random variables . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Appendix A 380
Appendix B 384
Bibliography 389
6
Introduction
Why Maths?
Because Mathematics is the universal language of sciences. When we speak math-
ematics, all the barriers - linguistic or cultural ones - are pushed away.
Why Maths in Economis?
Mathematics plays an important role in Economics. This role has been rather
significant for the late century and knew a real impulse during the last decades.
Emmanuel Kant (1724-1804) said: ”A science contains as much science as it con-
tains Mathematics”.
One of the first economists who wanted to make economics more scientific by
applying the mathematical rigor to it was Alfred Marshall (1842-1924, English
economist). He did not want that an overside of mathematics to make the economic
texts harder to understand. Accordingly, Marshall put the mathematical content in
the footnotes and appendices of his economics books. In 1906 we wrote:
”I had a growing feeling in the later years of my work at the subject that a good
mathematical theorem dealing with economic hypotheses was very unlikely to be a
good economics: and I went more and more on the rules
(1) Use mathematics as a short hand language, rather than an engine of inquiry;
(2) Keep to them till you have done;
(3) Translate into English;
(4) Then illustrate by examples that are important in real life;
(5) Burn the mathematics;
(6) If you can’t succeed in (4), burn (3).
That last I did often.”
The use of mathematics in economics provides us some advantages:
- the language that is used is more precise and concise
- it allows us to treat the general case
- we have at our disposal a great number of mathematical results.
On his blog, Greg Mankiw (Professor of Economics at Harvard University) wrote
the following answers to the next question:
”Why aspiring economists need Math?”
”A student who wants to pursue a career in policy-related economics is advised to
7
go to the best graduate school he or she can get into. The best graduate schools will
expect to see a lot of math on your undergraduate transcript, so you need to take it.
But will you use a lot of differential equations and real analysis once you land that
dream job in a policy organization? No, you won’t.
That raises the question: Why do we academics want students that have taken a
lot of math? There are several reasons:
1. Every economist needs to have a solid foundation in the basics of economic
theory and econometrics, even if you are not going to be either a theorist or an econo-
metrician. You cannot get this solid foundation without understanding the language
of mathematics that these fields use.
2. Occasionally, you will need math in your job. In particular, even as a policy
economist, you need to be able to read the academic literature to figure out what
research ideas have policy relevance. That literature uses a lot of math, so you will
need to be equipped with mathematical tools to read it intelligently.
3. Math is good training for the mind. It makes you a more rigorous thinker.
4. Your math courses are one long IQ test. We use math courses to figure out who
is really smart.
5. Economics graduate programs are more oriented to training students for aca-
demic research than for policy jobs. Although many econ PhD go on to policy work,
all of us teaching in graduate programs are, by definition, academics. Some academics
take a few years off to experience the policy world, as I did not long ago, but many
academics have no idea what that world is like. When we enter the classroom, we
teach what we know. (I am not claiming this is optimal, just reality.) So math plays
a larger role in graduate classes than it does in many jobs that PhD economists hold.
It is possible that admissions committees for econ PhD programs are excessively
fond of mathematics on student transcripts? Perhaps. That is some thing I might
argue with my colleagues about if I were ever put on the admissions committee. But
a student cannot change that. The fact is, if you are thinking about a PhD program
in economics, you are advised to take math course until it hurts.”
At present the Maths teachers’ mission is to do the best advertisement for Maths,
to make the students the importance of Maths, to ”dress up” the Maths classes in
vivid colours.
The purpose of this book (covering three parts (Linear Algebra, Calculus and
Probabilities) divided in seven chapters) is to give to the students at economics the
possibility to acquire the basic knowledge in Maths which they will have to work with
in the future, in order to be able to use them with complex economic models belonging
to the real world. Our book attempts to develop the student’s intuition concerning
the ways of working with mathematical techniques.
This book wants to be a device for using Maths in order to understand the struc-
ture of ”economics”.
The text offers an introduction into the most intimate relationship between Maths
and Economics.
Taking into account the applicative content of this book, we will not always present
the complete proofs of all theoretical statements but attached importance to examples
and economic applications. As Mathematics is a very old science, we can’t possibly be
8
entirely original, but the structure and concepts have been thought out after several
years of work together with the students in economics. For the content could get closer
to the necessities of economists, we introduced several examples from the economic
field.
This book is especially meant to the first year’s students in ”Economic Sciences
and Business Administration”. We also turn to all those who should need to refresh
their knowledge in maths which are to be used in economics or just for the sake of
their professional update.
9
10
Part I
Elements of linear algebra
11
Chapter 1
Linear spaces
1.1 The Euclidean space
One of the main uses of mathematics in economic theory is to construct the ap-
propriate geometric and analytic generalizations of the two or three-dimensional ge-
ometric models which are the main stay of undergraduate economic courses. In this
paragraph we will study how to generalize notions of points, lines, planes, distances
and angles to n-dimensional Euclidean spaces.
The set of real numbers, denoted by R, plays a dominant role in mathematics. The
geometric representation of R is a straight line. A point, called the origin, is chosen
to represent 0 and another point, usually to the right of 0, to represent 1. There is
a natural correspondence between the points on the line and real numbers, i.e. each
point will represent a unique real number and each real number will be represented
by a unique point. For this reason we refer to R as the real line and use the words
point and number interchangeably.
0 1
O P
−∞ ∞
The real line R
We assume that the reader is familiar with the Cartesian plane. Each point P
represents an ordered pair of real numbers (a, b) ∈ R
2
and each element of R
2
can
be represented by a point in the Cartesian plane. The vertical line through the point
P meets the horizontal axis (x axis) at a which is called the abscissa of P. The
horizontal line through the point P meets the vertical axis (y axis) at b which is
called the ordinate of P. For this reason we refer to
R
2
= ¦(a, b) [ a ∈ R, b ∈ R¦
as the plane and use the words plane and ordered pair of real numbers interchangeably.
13
¸
`

O
P
b
a
x axis
y axis
The cartesian plane R
2
If P(x
1
, y
1
) and Q(x
2
, y
2
) are two points in R
2
then the distance between them
can be determined by using the pythagorean theorem in a right triangle. We shall
denote the distance between P and Q by d(P, Q).
`
¸
y
2
y
1
x
2
x
1
P
Q
θ
_
¸
_
¸
_
. ¸¸ .
x
2
−x
1
y
2
−y
1
The line PQ
d(P, Q) =
_
(x
2
−x
1
)
2
+ (y
2
−y
1
)
2
.
It is well know that two different points determine exactly one line.
Next, we will present different forms of the equation of a line.
The inclination θ of a line l is the angle that l makes with the horizontal axis. θ
is the smallest positive angle measured counterclockwise from the positive end of the
x axis to the line l. The range of θ is given by 0 ≤ θ < 180

.
The slope m of the line is defined as the tangent of its angle θ of inclination:
m = tg θ.
14
In particular, the slope of the line passing through two points P(x
1
, y
1
) and
Q(x
2
, y
2
) is given by:
m = tg θ =
y
2
−y
1
x
2
−x
1
.
Remark. a) The slope of an horizontal line, i.e. when y
2
= y
1
, is zero.
b) The slope of a vertical line, i.e. when x
2
= x
1
is not defined.
c) Two distinct lines l
1
and l
2
are parallel if and only if their slopes, m
1
respectively
m
2
, are equal, i.e. m
1
= m
2
.
d) Two distinct lines l
1
and l
2
are perpendicular if and only if the slope of one is
the negative reciprocal of the other:
m
1
= −
1
m
2
or m
1
m
2
= −1.
Linear equations
Every line l in the Cartesian plane R
2
can be represented by a linear equation of
the form
(l) ax +by +c = 0,
where a and b are not both zero, i.e. each point on l is a solution of (l) and each
solution of (l) is a point on l.
Horizontal and vertical lines
The equation of a horizontal line i.e. a line parallel to the x axis, is of the form
y = k where k is the ordinate of the point at which the line intersects the y axis.
In particular, the equation of the x axis is y = 0.
The equation of a vertical line i.e. a line parallel to the y axis, is of the form
x = k where k is the abscissa of the point at which the line intersects the x axis.
In particular, the equation of the y axis is x = 0.
Point-slope form
A line is completely determined if we know its direction (its slope) and a point on
the line.
The equation of the line having slope m and passing through the point (x
1
, y
1
) is
(l) y −y
1
= m(x −x
1
).
Two points form
Let P(x
1
, y
1
) and Q(x
2
, y
2
) two different points in the Cartesian plane. The equa-
tion of the line which passes through the previous two points is:
(l)
y −y
1
y
2
−y
1
=
x −x
1
x
2
−x
1
.
15
The previous equation was obtained by replacing m =
y
2
−y
1
x
2
−x
1
in the point-slope
form of the equation.
1.2 Euclidean n-space
We can interpret the order pairs of R
2
not only as locations but also as displace-
ments. We represent these displacements as vectors in R
2
. The displacement (a, b)
means: move a units to the right and b units up from the current location. The tail of
the arrow marks the initial location; the head marks the location after the displace-
ment is made.
To develop a geometric intuition for vector addition we can think in the following
way. If u = (a, b) and v = (c, d) are two vectors in R
2
, then u + v will represent a
displacement of a +c units to the right and b +d units up.
`
¸

`
y
x
O
u
u
v
v
u
+
v
Addition of two vectors
We can use the parallelogram as in the above figure to draw u + v keeping the
tails of u and v at the same point.
It is generally not possible to multiply two vectors in a nice way to generalize the
multiplication of real numbers. For instance, coordinatewise multiplication does not
satisfy the basic properties of the multiplication of real numbers. On the other hand,
geometrically, scalar multiplication of a vector v be a nonnegative (negative) scalar
corresponds to stretching or shrinking v without (with) changing its direction.
16
`
¸

y
x
2a
a
b
2b
O
u
2u
Vector multiplication by a real scalar
We will generalize the previous discussion to the general case.
For an integer n ≥ 1. By definition R
n
is the set of ordered n-tuples x =
(x
1
, x
2
, . . . , x
n
) of real numbers i.e.
R
n
= ¦x = (x
1
, x
2
, . . . , x
n
) [ x
1
, x
2
, . . . , x
n
∈ R¦.
The elements of R
n
are called vectors and the numbers x
i
, i = 1, n, are called the
coordinates of x (x
i
is the i
th
coordinate of x).
The two fundamental operations (which generalize the addition of two vectors and
the multiplication of a vector by a scalar) are:
1. addition of two vectors: if x, y ∈ R
n
then
x +y := (x
1
+y
1
, x
2
+y
2
, . . . , x
n
+y
n
) ∈ R
n
2. scalar multiplication: if x ∈ R
n
and α ∈ R then
αx = (αx
1
, αx
2
, . . . , αx
n
) ∈ R
n
.
Next, we will present an axiomatic concept based on the simplest properties of
the previous operations.
Definition. A real vector space is a set V ,= ∅ with an operation + : V V → V
called vector addition and an operation : R V → V called scalar multiplication
with the following properties:
(i) (x +y) +z = x + (y +z), ∀ x, y, z ∈ V
(ii) there is a vector θ ∈ V (called the null vector) that is an identity element for
addition:
x +θ = θ +x = θ, ∀ x ∈ V
(iii) for any x ∈ V there is −x ∈ V such that
x + (−x) = (−x) +x = θ
17
(iv) x +y = y +x, ∀ x, y ∈ V
(v) 1 x = x, ∀ x ∈ V
(vi) ∀ x, y ∈ V, ∀ α, β ∈ R
(a) (αβ)x = α(βx)
(b) (α +β)x = αx +βx
(c) α(x +y) = αx +βy.
Remark. (V, +) is a commutative group.
Example. R
n
is a vector space.
The proof of the previous example follows immediately by using the definitions of
the operations in R
n
.
We next show that any finite dimensional vector space is ”like” R
n
.
Let V be a vector space.
A linear subspace of V is a subset W ⊂ V that is itself a vector space with
vector addition and scalar multiplication defined by restriction of given operations on
V .
Remark. If W ⊆ V then W is a linear subspace if and only if the following two
conditions are fulfilled:
a) W ,= ∅
b) ∀ α ∈ R, ∀ u, v ∈ W, u +αv ∈ W.
If α
1
, α
2
, . . . , α
n
∈ R and v
1
, v
2
, . . . , v
n
∈ V then the sum α
1
v
1
+ + a
n
v
n
∈ V
is called a linear combination of the vectors v
1
, v
2
, . . . , v
n
.
The span of a set S ⊂ V is the set of all linear combinations α
1
v
1
+ + α
n
v
n
where v
1
, . . . , v
n
∈ V and α
1
, α
2
, . . . , α
n
∈ R.
span S = ¦α
1
v
1
+ +α
n
v
n
[ n ∈ N

, α
1
, . . . , α
n
∈ R, v
1
, . . . , v
n
∈ V ¦.
A set of vectors ¦v
1
, . . . , v
n
¦ is called linearly independent if the vector equation
α
1
v
1

2
v
2
+ +α
n
v
n
= θ
has the only trivial solution α
1
= α
2
= = α
n
= 0.
A set of vectors ¦v
1
, . . . , v
n
¦ is called linearly dependent if the vector equation
α
1
v
1
+ +α
n
v
n
= θ has a nontrivial solution, that is, there are α
1
, α
2
, . . . , α
n
∈ R
not all zero, such that α
1
v
1
+ +α
n
v
n
= θ.
In other words, a set of vectors in a vector space is linearly dependent if and only
if one vector can be written as a linear combination of the others.
A basis for V is a subset B ⊂ V that spans V (spanB = V ) and is minimal for
this property in the sense that there are no proper subsets of B that span V .
A basis is linearly independent. Conversely, a linearly independent set that spans
V is a basis for V .
Remark. If V has a finite basis, then all basis have the same number of elements.
In this case we say that V is finite dimensional and the common number of elements
of the basis of V is called the dimension of V .
If B = ¦v
1
, . . . , v
n
¦ is a basis of v then each point v ∈ V can be written as a linear
combination
v = α
1
v
1

2
v
2
+ +α
n
v
n
18
in exactly one way.
Example. a) The set B = ¦e
1
, . . . , e
n
¦ where
e
i
= (0, . . . , 0, 1, 0, . . . , 0)
(1 is the i
th
coordinate), i = 1, n, is a base of the vector space R
n
.
b) dimR
n
= n.
Normed spaces
The concept of a norm is an abstract generalization of the length of a vector.
Definition. Let V be a vector space. A function | | : V →R, x → |x| is called
a norm if it satisfies the following conditions:
N1) |x| ≥ 0, ∀ x ∈ V
|x| = 0 ⇔ x = 0
N2) |αx| = [α[ |x|, ∀ x ∈ V, ∀ α ∈ R
N3) |x +y| ≤ |x| +|y|, ∀ x, y ∈ V .
A vector space V with a norm | | is called a normed space and it is denoted by
(V, | |).
Inner product spaces
Definition. Let V be a vector space. A mapping ¸, ) : V V → R is called an
inner product in V if the following conditions are satisfied:
IP1) ¸x, x) > 0, ∀ x ∈ V ¸ ¦θ¦;
¸x, x) = 0 ⇔ x = θ (positive definiteness)
IP2) ¸x, y) = ¸y, x), ∀ x, y ∈ V
IP3) ¸αx +βy, z) = α¸x, z) +β¸y, z), ∀ x, y, z ∈ V, ∀ α, β ∈ R (bilinearity).
A vector space with an inner product is called an inner product space.
Example. The canonical inner product in R
n
is defined in the following way:
¸, ) : R
n
R
n
→R
¸x, y) = x
1
y
1
+x
2
y
2
+ +x
n
y
n
, ∀ x, y ∈ R
n
.
The proof that the previous functions satisfies all the properties of the previous
definition is left to the reader (easy computations based on the properties of real
numbers).
Example. The function | | : R
n
→R
x → |x| =
_
¸x, x)
is a norm on R
n
which is called the Euclidean norm.
Proof. The properties (N1) and (N2) are easy consequences of the properties of
the inner product.
The property (N3) follows from the Cauchy-Buniakovski-Schwarz inequality:
[¸x, y)[ ≤ |x| |y|, ∀ x, y ∈ R
n
.
19
We consider first the following obvious inequalities:
0 ≤ |x +y|
2
= ¸x +y, x +y) = |x|
2
+|y|
2
+ 2¸x, y)
0 ≤ |x −y|
2
= ¸x −y, x −y) = |x|
2
+|y|
2
−2¸x, y)
wherefrom we get
−(|x|
2
+|y|
2
) ≤ 2¸x, y) ≤ |x|
2
+|y|
2
hence
2[¸x, y)[ ≤ |x|
2
+|y|
2
.
We can assume that x ,= θ and y ,= θ (if x = θ or y = θ the Cauchy-Buniakovski-
Schwarz is true).
If we replace x by
x
|x|
and y by
y
|y|
in the previous inequality we obtain
2
¸
¸
¸
¸
_
x
|x|
,
y
|y|

¸
¸
¸

_
_
_
_
x
|x|
_
_
_
_
2
+
_
_
_
_
y
|y|
_
_
_
_
2
= 2
wherefrom we have:
[¸x, y)[ ≤ |x| |y|,
as desired.
We are now able to prove N3)
|x +y|
2
= |x|
2
+ 2¸x, y) +|y|
2
CBS
≤ |x|
2
+ 2|x| |y| +|y|
2
= (|x| +|y|)
2
.
Example (other norms on R
n
).
1) | |
1
: R
n
→R, |x|
1
= [x
1
[ + +[x
n
[, ∀ x ∈ R
n
2) | |

: R
n
→R, |x|

= max¦[x
1
[, . . . , [x
n
[¦.
Metric spaces
Definition. If X ,= ∅, d : X X → R is called distance on X if the following
conditions are satisfied:
D1) d(x, y) ≥ 0, ∀ x, y ∈ X
d(x, y) = 0 ⇔ x = y
D2) d(x, y) = d(y, x), ∀ x, y (symmetry)
D3) d(x, y) ≤ d(x, y) +d(y, z), ∀ x, y, z (triangle inequality).
A metric space is a pair (X, d) in which X is a nonempty set and d is a distance
on X.
Remark. Each normed space (V, | |) is a metric space since the function
d : V V →R
d(x, y) = |x −y|, ∀ x, y ∈ V
satisfies the properties D1), D2), D3).
Example. In R
n
the Euclidean distance is:
d(x, y) =
_
(x
1
−y
1
)
2
+ + (x
n
−y
n
)
2
, ∀ x, y ∈ R
n
.
20
1.3 Quadratic forms
In this section we present the natural generalizations of linear and quadratic func-
tions to several variables.
Linear operators
Definition. Let V and W be two real vector spaces, such that dimV = n and
dimW = m.
A linear operator from V to W is a function T that preserves the vector space
structure, i.e.
T(αx +βy) = αT(x) +βT(y), ∀ x, y ∈ V, ∀ α, β ∈ R.
A linear operator T : V →R is called a linear form.
Remark 1. Let T : R
n
→R be a linear operator. Then, there exists a vector
a =
_
_
_
a
1
.
.
.
a
n
_
_
_ ∈ R
n
such that T(x) = a
t
x for all x ∈ V .
Proof. Let B = ¦e
1
, . . . , e
n
¦ be the canonical basis of R
n
.
Let a
i
= T(e
i
) ∈ R, i = 1, n. Then, for any vector x ∈ R
n
,
x = x
1
e
1
+x
2
e
2
+ +x
n
e
n
T(x) = T(x
1
e
1
+ +x
n
e
n
) = x
1
T(e
1
) + +x
n
T(e
n
)
= x
1
a
1
+ +x
n
a
n
= a
t
x.
The previous remark implies that every linear form on R
n
can be associated with
a unique vector a ∈ R
n
(or with a unique 1 n matrix) so that T(x) = a
t
x.
The same correspondence between linear operators and matrices is valid for linear
operators from R
n
to R
m
.
Remark 2. Let T : R
n
→ R
m
be a linear operator. Then there exists an m n
matrix A such that
T(x) = Ax, ∀ x ∈ R
n
.
Proof. The idea is the same as that of the previous remark. Let B = ¦e
1
, . . . , e
n
¦
be the canonical base of R
n
.
For each j = 1, n, T(e
j
) ∈ R
m
, hence
T(e
j
) =
_
_
_
_
a
1j
a
2j
. . .
a
mj
_
_
_
_
.
21
Let A be the mn matrix whose j
th
column is the column vector T(e
j
). For any
x = x
1
e
1
+ +x
n
e
n
∈ R
n
we have
T(x) = T(x
1
e
1
+ +x
n
e
n
) = x
1
T(e
1
) + +x
n
T(e
n
)
= x
1
_
_
_
_
a
11
a
21
. . .
a
m1
_
_
_
_
+x
2
_
_
_
_
a
12
a
22
. . .
a
m2
_
_
_
_
+ +x
n
_
_
_
_
a
1n
a
2n
. . .
a
mn
_
_
_
_
=
_
_
_
_
a
11
a
12
. . . a
1n
a
21
a
22
. . . a
2n
. . . . . . . . . . . .
a
m1
a
m2
. . . a
mn
_
_
_
_
_
_
x
1
. . .
x
n
_
_
= Ax.
So, we can say that matrices are representations of linear operators.
Quadratic forms
In mathematics, a quadratic form is a homogeneous polynomial of degree two in
a number of variables.
Examples.
Q(x) = ax
2
Q(x, y) = ax
2
+bxy +cy
2
Q(x, y, z) = ax
2
+by
2
+cz
2
+dxy +exz +fyz
are quadratic forms in one, two or three variables.
Quadratic forms are associated to bilinear forms.
Definition. a) Let V and W be two real vector spaces such that dimV = n and
dimW = m. The application ([) : V W → R is called bilinear if it is linear with
respect to its two variables, i.e.:
(αx +βy[z) = α(x[z) +β(y[z), ∀ α, β ∈ R, ∀ x, y ∈ V, ∀ z ∈ W
(x[αy +βz) = α(x[y) +β(x[z), ∀ α, β ∈ R, ∀ x ∈ V, ∀ y, z ∈ W.
b) The bilinear form ([) : V V →R is called symmetric if
(x[y) = (y[x), ∀ x, y ∈ V.
c) Let V be a real vector space whose dimension is n and let ([) : V V →R be
a symmetric bilinear form. The application:
Q : V →R
x → Q(x) = (x[x), ∀ x ∈ V
is called a quadratic form on V .
22
Next, we determine the analytical expression of a quadratic form.
If B = ¦v
1
, . . . , v
n
¦ is a base of V then x can be uniquely expressed as
x = x
1
v
1
+x
2
v
2
+ +x
n
v
n
.
Hence:
Q(x) = (x[x) = (x
1
v
1
+x
2
v
2
+ +x
n
v
n
[x)
= x
1
(v
1
[x) +x
2
(v
2
[x) + +x
n
(v
n
[x)
= x
1
(v
1
[x
1
v
1
+ +x
n
v
n
)
+ +
+x
n
(v
n
[x
1
v
1
+ +x
n
v
n
)
= x
1
[x
1
(v
1
[v
1
) + +x
n
(v
1
[v
n
)]
+ +
+x
n
[x
1
(v
n
[v
1
) + +x
n
(v
n
[v
n
)]
= x
1
n

j=1
x
j
(v
1
[v
j
) + +x
n
n

j=1
x
j
(v
n
[v
j
)
=
n

i=1
_
_
x
i
n

j=1
x
j
(v
i
[v
j
)
_
_
=
n

i=1
n

j=1
(v
i
[v
j
)x
i
x
j
.
In conclusion:
Q(x) = (x[x) =
n

i=1
n

j=1
(v
i
[v
j
)x
i
x
j
.
If for each i, j = 1, n we denote a
ij
= (v
i
[v
j
) then a
ij
= a
ji
(since ([) is symmetric)
and
Q(x) =
n

i=1
n

j=1
a
ij
x
i
x
j
= a
11
x
2
1
+2a
12
x
1
x
2
+ + 2a
1n
x
1
x
n
+a
22
x
2
2
+ + 2a
2n
x
2
x
n
+ +a
nn
x
2
n
which is the analytical expression of the quadratic form Q.
Just as a linear function has a matrix representation, a quadratic form has a matrix
representation, too.
Remark. The quadratic form Q : R
n
→R,
Q(x) =
n

i=1
n

j=1
a
ij
x
i
x
j
(a
ij
= a
ji
)
23
can be written as
Q(x) = (x
1
, . . . , x
n
)
_
_
_
_
a
11
. . . a
1n
a
21
. . . a
2n
. . . . . . . . .
a
n1
. . . a
nn
_
_
_
_
_
_
x
1
. . .
x
n
_
_
= x
t
Ax,
where A is the matrix (symmetric) of the coefficients of the quadratic form Q.
Definitess of quadratic forms
Definition. If Q : R
n
→R be a quadratic form, then Q is
(a) positive definite if Q(x) > 0 for all x ∈ R
n
¸ ¦θ¦
(b) positive semidefinite if Q(x) ≥ 0 for all x ∈ R
n
(c) negative definite if Q(x) < 0 for all x ∈ R
n
¸ ¦θ¦
(d) negative semidefinite if Q(x) ≤ 0 for all x ∈ R
n
(e) indefinite if there are x, x

∈ R
n
¸ ¦θ¦ such that Q(x) < 0 and Q(x

) > 0.
Next, we will describe a simple test for the definitess of a quadratic form. To
present the test we need some definitions related to the coefficient matrix of Q.
Definition. a) Let A be a nn matrix. A mm submatrix of A formed by deleting
n − m columns and the same n − m rows from A is called a m
th
order principal
submatrix of A. The determinant of a m m principal submatrix is called a m
th
order principal minor of A.
For a n n matrix there are C
m
n
m
th
order principal minors of A.
b) Let A be a n n matrix. The m
th
order principal submatrix of A obtained by
deleting the last n − m rows and the last n − m columns from A is called the m
th
order leading principal minor of A.
We will denote the m
th
order leading principal submatrix by A
m
and the corre-
sponding leading principal minor by det A
m
.
The following remark provides an algorithm which uses the leading principal mi-
nors to determine the definitess of a quadratic form Q whose coefficient matrix is
A.
Remark 3. Let Q : R
n
→ R be a quadratic form whose coefficient matrix is A.
Then
(a) Q is positive definite if and only if all its n leading principal minors are strictly
positive i.e.
det A
1
> 0, det A
2
> 0, . . . , det A
n
= det A > 0
(b) Q is negative definite if and only if all its n leading principal minors alternate
in sign as follows
det A
1
< 0, det A
2
> 0, . . . , (−1)
n
det A
n
= (−1)
n
det A > 0.
(c) Q is positive semidefinite if and only if every principal minor of A is nonnega-
tive.
24
d) Q is negative semidefinite if and only if every principal minor of odd order is
not positive and every principal minor of even order is nonnegative.
e) If there is an even number m (m ∈ ¦1, . . . , n¦) such that det A
m
< 0 or if there
are two odd numbers m
1
and m
2
such that det A
m
1
< 0 and det A
m
2
> 0 then Q is
indefinite.
Proof. We will prove the part (a) (the proofs are similar for parts (b), (c) and
(d)) by using induction on the size of A (the coefficient matrix of Q).
Suppose that all the leading minors are strictly positive. We have to show that Q
is positive definite.
If n = 1 then the result is trivial.
If n = 2 then det A
1
= a
11
> 0, det A
2
= det A = a
11
a
22
−a
2
12
> 0 and hence:
Q(x) = (x
1
, x
2
)
_
a
11
a
12
a
21
a
22
__
x
1
x
2
_
= a
11
x
2
1
+ 2a
12
x
1
x
2
+a
22
x
2
2
= a
11
_
x
1
+
a
12
a
11
x
2
_
2
+
a
11
a
22
−a
2
12
a
11
x
2
2
= det A
1
_
x
1
+
a
12
a
11
x
2
_
2
+
det A
2
det A
1
x
2
2
> 0, ∀ (x
1
, x
2
) ,= (0, 0).
We suppose that the theorem is true for symmetric matrices of order k and prove
it for symmetric matrices of order k + 1.
Let A be symmetric matrix of order k + 1. We have to prove that if det A
j
> 0,
j = 1, k + 1 then x
t
Ax > 0, ∀ x ,= θ. The matrix A can be written as
A =
_
A
k
a
a
t
a
k+1 k+1
_
where a =
_
_
_
_
a
1 k+1
a
2 k+1
. . .
a
k k+1
_
_
_
_
.
If d = a
k+1 k+1
−a
t
A
−1
k
a then we have
_
I
k
0
(A
−1
k
a)
t
1
__
A
k
0
0 d
__
I
k
A
−1
k
a
0 1
_
=
_
I
k
0
(A
−1
k
a)
t
1
__
A
k
a
0 d
_
=
_
A
k
a
a
t
(A
−1
k
a)
t
a +d
_
=
_
A
k
a
a
t
a
k+1 k+1
_
= A.
The previous equality can be written as A = C
t
BC, where
C =
_
I
k
A
−1
k
a
0 1
_
and B =
_
A
k
0
0 d
_
.
Since det C = det C
t
= 1 and det B = d det A
k
then
det A = det B = d det A
k
.
25
Since det A > 0 and det A
k
> 0 then d > 0. Let x ∈ R
k+1
¸ ¦θ¦. Every x ∈ R
k+1
can be written as x =
_
x
x
k+1
_
where x ∈ R
k
.
Then
x
t
Ax = x
t
C
t
BCx = (Cx)
t
B(Cx) = y
t
By
=
_
y
t
y
n+1
_
_
A
k
0
0 d
__
y
y
k+1
_
= y
t
A
k
y +dy
2
k+1
.
In the previous equality we denoted the vector Cx by y =
_
y
y
k+1
_
which is not
the null vector since C is invertible and x ,= θ.
By using the inductive hypothesis and the fact that d > 0 we get that x
t
Ax > 0,
hence Q is positive definite.
To prove the converse (Q positive definite implies that det [A
j
[ > 0, j = 1, n) we
will use the induction once more.
If n = 1 then the result is trivial.
If n = 2, then
Q(x) = det A
1
_
x
1
+
a
12
a
11
x
2
_
2
+
det A
2
det A
1
x
2
2
.
In the previous equality we used the fact that a
11
,= 0 since if a
11
= 0 then
Q(1, 0) = 0 and Q cannot be positive definite.
It is obvious that if Q(x) > 0, ∀ x ,= θ, then det A
1
> 0 and det A
2
> 0.
Assume that the result is true for any quadratic form whose coefficient matrix has
order k and let A be the (k + 1) (k + 1) coefficient matrix of a positive definite
quadratic form.
Let x ∈ R
k
¸ ¦θ¦. If x =
_
x
0
_
∈ R
k+1
then
0 < x
t
Ax =
_
x
t
0
_
A
_
x
0
_
= x
t
A
k
x.
By the inductive hypothesis we obtain that det A
1
> 0, det A
2
> 0, and det A
k
> 0.
It remains for us only to prove that det A = det A
k+1
is positive.
We write the matrix A as in the first part of the proof. Hence
A = C
t
BC and det A = det A
k
d.
We have to prove now that d > 0. Indeed, since Q is positive definite we have that
d = (0, 0, . . . , 1)
_
A
k
0
0 d
_
_
_
_
0
.
.
.
1
_
_
_ = x
t
Bx = x
t
(C
−1
)
t
AC
−1
x
= (C
−1
x)
t
A(C
−1
x) > 0.
Since det A
k
> 0 and d > 0 then det A = det A
k+1
> 0 as desired.
26
Chapter 2
Linear Algebra
2.1 Systems of linear equations.
Gauss-Jordan elimination method
A finite set of linear equations in the variables x
1
, x
2
, . . . , x
n
∈ R is called a
system of linear equations or a linear system. The general form of a linear system of
m equations and n unknowns is the following:
_
_
_
a
11
x
1
+ +a
1n
x
n
= b
1
. . .
a
m1
x
1
+ +a
mn
x
n
= b
m
where a
ij
∈ R, b
i
∈ R, ∀ i = 1, m, j = 1, n.
A solution of the system is a list (s
1
, . . . , s
n
) of numbers which makes each equation
a true statement when the values s
1
, . . . , s
n
are substituted for x
1
, . . . , x
n
, respectively.
The set of all possible solutions is called the solution set or the general solution of the
linear system. Two linear systems are called equivalent if they have the same solution
set. That is, each solution of the first system is a solution of the second system, and
each solution of the second system is a solution of the first system.
A system of linear equation has either
1. no solution, or
2. exactly one solution, or
3. infinitely many solutions.
We say that a linear system is consistent if it has either one solution or infinitely
many solutions; a system is inconsistent if it has no solution.
For a linear system we consider
• the matrix of the system (the matrix of the coefficients of the unknowns)
A = (a
ij
)
i=1,m
j=1,n
=
_
_
_
a
11
. . . a
1n
.
.
.
.
.
.
a
m1
. . . a
mn
_
_
_;
27
A =
_
_
_
_
_
R
1
R
2
.
.
.
R
m
_
_
_
_
_
,
where R
i
= (a
i1
, a
i2
, . . . , a
in
) is the i
th
row.
• the augmented matrix (the coefficient matrix with an added column containing
the constants from the right sides of the equations)
A =
_
_
_
a
11
. . . a
1n
b
1
.
.
.
.
.
.
.
.
.
a
m1
. . . a
mn
b
m
_
_
_
• the column of the constants
b =
_
_
_
b
1
.
.
.
b
n
_
_
_
• the column of the unknowns
x =
_
_
_
x
1
.
.
.
x
n
_
_
_.
By using the above matrix notations the system can be written in the following
form:
Ax = b.
Concerning the solutions set of a linear system we have the following result:
Remark. 1) If rank A < rank A then there is no solution of the considered linear
system. The system is inconsistent.
2) If rank A = rank A = n (where n is the number of the unknowns) then the
system has exactly one solution. The system is consistent.
3) If rank A = rank A < n then the system has infinitely many solutions.
This chapter describes an algorithm or a systematic procedure for solving linear
systems. This algorithm is called Gauss-Jordan elimination method and its basic
strategy is to replace one system with an equivalent one that is easier to solve.
If the equivalent system contains a degenerate linear equation of the following
form
0 x
1
+ 0 x
2
+ + 0 x
n
= b
i
then
i) If b
i
,= 0, then the system is inconsistent.
ii) If b
i
= 0, then the degenerate equation may be deleted from the system without
changing the solution set.
The method is named after German mathematicians Carl Friederich Gauss (1777-
1855) and Wilhelm Jordan (1842-1899) but it appears in an important Chinese math-
ematical text which was written approximately at 150 BCE.
28
The rectangle rule for row operations
The purpose of this paragraph is to transform a matrix which has a nonzero
column into an equivalent one that contains one element equal to 1 and all the other
elements equal to 0 (we say that such a column is in proper form).
This can be done by using the elementary row operations which are:
1) Scaling. Multiply all entries in a row by a nonzero constant.
λR
i
→ R
i
, λ ,= 0
2) Replacement. Replace one row by the sum of itself and a multiple of another
row.
R
i
+λR
k
→ R
i
3) Interchange. Interchange two rows.
R
i
↔ R
j
Remark. If we apply the elementary row operations to an augmented matrix of a
linear system we obtain a new matrix which is the augmented matrix of an equivalent
linear system to the given one. This remark is true since it is well known that the
solution of a system remains unchanged if we multiply one equation by a nonzero
constant or if we add a multiple of one equation to another or if we interchange two
equations of a system (the rows of an augmented matrix correspond to the equations
in the associated system).
Let A be the matrix
A =
_
_
_
_
_
_
_
. . . . . . . . . . . . . . .
. . . a
ij
. . . a
il
. . .
.
.
.
.
.
.
. . . a
kj
. . . a
kl
. . .
. . . . . . . . . . . . . . .
_
_
_
_
_
_
_
Suppose that a
ij
,= 0.
We want to determine the elementary row operations which transform the element
a
ij
into 1 (a
ij
→ 1) and all the other elements of the j
th
column into 0 (a
kj
→ 0,
∀ k ,= i).
We consider the following row operations
R
i

1
a
ij
→ R
i
, a
ij

a
ij
a
ij
= 1
and
R
k
−R
i

a
kj
a
ij
→ R
k
, a
kj
→ a
kj
−a
ij

a
kj
a
ij
= 0, ∀ k = 1, m, k ,= i.
The effects of the previous elementary row operations on the other elements of the
matrix are:
a
il

a
il
a
ij
, ∀ l = 1, n, l ,= j
29
a
kl
→ a
kl
−a
il

a
kj
a
ij
=
a
kl
a
ij
−a
il
a
kj
a
ij
, ∀ k ,= i, ∀ l ,= j.
The element a
ij
,= 0 is called the pivot.
So, in order to transform the element a
kl
(by using a
ij
as a pivot) we locate the
rectangle which contains the element and the pivot a
ij
as opposite corners. Then,
from the product of the elements situated in the opposite corners of the previous
rectangle’s diagonal which contains the pivot we subtract the product of the elements
situated in the corners of the other diagonal and the result is divided by the pivot
(rectangle’s rule).
Remark. 1) The rows which contain 0 on the pivot column remain unchanged.
Indeed, if a
kj
= 0 then
a
kl

a
kl
a
ij
−a
il
0
a
ij
= a
kl
.
2) The columns which contain 0 on the pivot row remain unchanged.
Indeed, if a
il
= 0 then
a
kl

a
kl
a
ij
−0 a
kj
a
ij
= a
kl
.
So, in order to transform a matrix which has a nonzero column into an equivalent
one that contains one element equal to 1 and all the other elements equal to 0 we
have to follow the next steps:
Rectangle’s algorithm
Step 1. Choose and circle (from the considered column) a nonzero element which is
called the pivot.
Step 2. Divide the pivot row by the pivot.
Step 3. Set the elements of the pivot column (except the pivot) equal to 0.
Step 4. The rows which contain a 0 on the pivot column remain unchanged.
The columns which contain a 0 on the pivot row remain unchanged.
Step 5. Compute all the other elements of the matrix by using the rectangle’s rule.
Example.
A =
_
_
2 0 1 −1
1 3 2 0
−1 1 0 2
_
_

_
_
2 0 1 −1
1/3 1 2/3 0
−4/3 0 −2/3 2
_
_
Remark. The rectangle rule can be used to determine the inverse of a given
invertible matrix A. This can be done by writing at the right side of the given matrix
the unitary matrix I which has the same number of rows and columns as the matrix A
30
and then applying the rectangle rule to the obtained matrix. By choosing successively
the elements situated on the main diagonal of matrix A as pivots we will finally obtain
the unitary matrix I (situated below the given matrix A). The matrix situated at the
right side of the unitary matrix (in the final table) is the inverse of the matrix A.
We will illustrate the previous procedure by an example.
Example. Determine the inverse of the matrix A given by
A =
_
_
1 2 3
2 3 1
3 1 2
_
_
.
We observe that the matrix A is invertible since its determinant is −18 ,= 0.
A I
3
1 2 3 1 0 0
2 3 1 0 1 0
3 1 2 0 0 1
1 2 3 1 0 0
0 −1 −5 −2 1 0
0 −5 −7 −3 0 1
1 0 −7 −3 2 0
0 1 5 2 −1 0
0 0 18 7 −5 1
1 0 0 −
5
13
1
18
7
18
0 1 0
1
18
7
18

5
18
0 0 1
7
18

5
18
1
18
Hence
A
−1
=
_
_

5
13
1
18
7
18
1
18
7
18

5
18
7
18

5
18
1
18
_
_
.
The Gauss-Jordan elimination method
This method is an elimination procedure which transforms the initial system into
an equivalent one whose solution can be obtained directly.
Gauss-Jordan elimination algorithm
Step 1. Associate to the given system the following table. The table contains the
augmented matrix with the constant column written at the left side of the matrix A.
b x
1
x
2
. . . x
n
b
1
a
11
a
12
. . . a
1n
b
2
a
21
a
22
. . . a
2n
.
.
.
.
.
.
.
.
.
.
.
.
b
m
a
m1
a
m2
. . . a
mn
31
Step 2. Choose and circle a
ij
,= 0 (the pivot). The pivot has to be chosen from
the coefficient matrix A not from the constant column.
Use a
ij
as a pivot to eliminate the unknown x
j
from all the equations except the
i
th
equation (by applying rectangle’s algorithm).
Step 3. Examine each new row obtained (or, equivalently, each new equation) R.
a) If R corresponds to the following equation
0 x
1
+ + 0 x
n
= 0
then delete R from the table.
b) If R corresponds to the following equation
0 x
1
+ 0 x
2
+ + 0 x
n
= b
i
,
with b
i
,= 0 then exit the algorithm. The system is inconsistent.
Step 4. Repeat steps 2 and 3 with the subsystem formed by all the equations
from which a pivot hasn’t been chosen yet.
Step 5. Continue the above process until we choose a pivot from each row or a
degenerate equation is obtained at the step 3b.
In the case of consistency (the system is consistent if we choose a pivot from each
row) write the general solution. The solution set can be specified as follows
- the variables whose columns are in proper form are called leading variables. If all
the variables are leading variables then the system has a unique solution which can
be obtained directly from the column b
- the variables whose columns are not in proper form may assume any values and
they are called secondary variables. If there is at least one secondary variable then
the system has infinitely many solutions. In this case express the leading variables in
terms of secondary variables.
Example. Solve the following linear systems.
a)
_
_
_
x + 2y −3z + 4t = 2
2x + 5y −2z +t = 1
5x + 12y −7z + 6t = 7
Solution
b x y z t
2 1 2 −3 4
1 2 5 −2 1
7 5 12 −7 6
2 1 2 −3 4
−3 0 1 4 −7
−3 0 2 8 −14
8 1 0 −11 18
−3 0 1 4 −7
3 0 0 0 0
The system is inconsistent since we obtain the following equation:
3 = 0 x + 0 y + 0 z + 0 t.
32
b)
_
_
_
x −2y +z = 7
2x −y + 4z = 17
3x −2y + 2z = 14
Solution
b x y z
7 1 −2 1
17 2 −1 4
14 3 −2 2
7 1 −2 1
3 0 3 2
−7 0 4 −1
0 1 2 0
−11 0 11 0
7 0 −4 1
2 1 0 0
−1 0 1 0
3 0 0 1
The system is consistent since we have chosen a pivot from each row. The leading
variables are x, y, z and the system has a unique solution which is
_
_
x
y
z
_
_
=
_
_
2
−1
3
_
_
.
c)
_
_
_
x + 2y −3z −2s + 4t = 1
2x + 5y −8z −s + 6t = 4
x + 4y −4z + 5s + 2t = 8
Solution
b x y z s t
1 1 2 −3 −2 4
4 2 5 −8 −1 6
8 1 4 −4 5 2
1 1 2 −3 −2 4
2 0 1 −2 3 −2
7 0 2 −1 7 −2
−3 1 0 1 −8 8
2 0 1 −2 3 −2
3 0 0 3 1 2
21 1 0 25 0 24
−7 0 1 −11 0 −8
3 0 0 3 1 2
33
The system is consistent since we have chosen a pivot from each row. The leading
variables are x, y, s; the secondary variables are z, t and in consequence the system is
consistent and has infinitely many solutions.
The general solution can be expressed as follows.
From the final table we write down the following equivalent system with the given
one:
_
_
_
x + 25z + 24t = 21
y −11z −8t = −7
3z +s + 2t = 3
where from we easily can express the leading variables in terms of secondary variables
_
_
_
x = 21 −25z −24t
y = −7 + 11z + 8t
s = 3 −3z −2t
with z, t ∈ R.
The general solution is:
_
_
_
_
_
_
x
y
z
s
t
_
_
_
_
_
_
=
_
_
_
_
_
_
21 −25z −24t
−7 + 11z + 8t
z
3 −3z −2t
t
_
_
_
_
_
_
,
where z, t ∈ R.
Leontief Production Model
The Leontief production model is a model for the economics of a whole country or
region. In this model there are n industries producing n different products such that
consumption equals production. We remark that a part of production is consumed
internally by industries and the rest is to satisfy the outside demand.
The problem is to determine the levels of the outputs of the industries if the
external demand is given and the prices are fixed. We will measure the levels of the
outputs in terms of their economic values. Over some fixed period of time, let
x
i
= monetary value of the total output of the i
th
industry
d
i
= monetary value of the output of the i
th
industry needed to satisfy the external
demand
c
ij
= monetary value of the output of the i
th
industry needed by the j
th
industry to
produce one unit of monetary of its own output.
We define the production vector
x =
_
_
_
_
_
x
1
x
2
.
.
.
x
n
_
_
_
_
_
,
34
the demand vector
d =
_
_
_
_
_
d
1
d
2
.
.
.
d
n
_
_
_
_
_
and the consumption matrix
C =
_
_
_
_
c
11
c
12
. . . c
1n
c
21
c
22
. . . c
2n
. . . . . . . . . . . .
c
n1
c
n2
. . . c
nn
_
_
_
_
.
It is obvious that x
j
, d
j
, c
ij
≥ 0 for each j = 1, n, i = 1, n.
The quantity c
i1
x
1
+c
12
x
2
+ +c
in
x
n
is the value of the output of the i
th
industry
needed by all n industries. We are led to the following equation
x = Cx +d
which is called the Leontief input-output model, or production model.
Writing x as I
n
x and using matrix algebra, we can rewrite the previous equation
as
I
n
x −Cx = d
(I
n
−C)x = d.
The above system can be solved by using the Gauss-Jordan elimination method.
If the matrix I
n
−C is invertible, then we obtain
x = (I
n
−C)
−1
d.
Example. As a simple example, suppose the economy consists of three sectors -
manufacturing, agriculture and services whose consumption matrix is given by
C =
_
_
0, 5 0, 2 0, 1
0, 4 0, 3 0, 1
0, 2 0, 1 0, 3
_
_
.
Suppose the external demand is 50 units for manufacturing, 30 units for agriculture
and 20 units for services. Find the production level that will satisfy this demand.
Solution 1 (by using the Gauss-Jordan elimination method)
The production equation is the following
(I
3
−C)x = d
which gives us the following system to be solved:
_
_
_
0, 5x
1
−0, 2x
2
−0, 1x
3
= 50
−0, 4x
1
+ 0, 7x
2
−0, 1x
3
= 30
−0, 2x
1
−0, 1x
2
+ 0, 7x
3
= 20
35
b x
1
x
2
x
3
50 0, 5 −0, 2 −0, 1
30 −0, 4 0, 7 −0, 1
20 −0, 2 −0, 1 0, 7
−500 −5 2 1
300 −4 7 −1
200 −2 −1 7
−500 −5 2 1
−200 −9 9 0
3700 33 −15 0

4100
9
−3 0 1

200
9
−1 1 0
10100
3
18 0 0
950
9
0 0 1
4450
27
0 1 0
5050
27
1 0 0
x
1
=
5050
27
≈ 187
x
2
=
4450
27
≈ 165
x
3
=
950
9
≈ 106
Solution 2 (by determining the inverse of the matrix I −C)
We know that the production level is determined by
x = (I
3
−C)
−1
D.
We first determine the matrix (I
3
−C)
−1
.
I
3
−C I
3
0, 5 −0, 2 −0, 1 1 0 0
−0, 4 0, 7 −0, 1 0 1 0
−0, 2 −0, 1 0, 7 0 0 1
1 −
2
5

1
5
2 0 0
0
27
50

9
50
4
5
1 0
0 −
9
50
33
50
2
5
0 1
1 0 −
1
3
70
27
20
27
0
0 1 −
1
3
40
27
50
27
0
0 0
3
5
2
3
1
3
1
1 0 0
80
27
25
27
5
9
0 1 0
50
27
55
27
5
9
0 0 1
10
9
5
9
5
3
36
Hence,
(I −C)
−1
=
_
_
80
27
25
27
5
9
50
27
55
27
5
9
10
9
5
9
5
3
_
_
and in consequence
x = (I −C)
−1
d =
_
_
5050
27
4450
27
950
9
_
_
as we expected.
The theorem below shows that in most practical cases, I − C is invertible and
the production vector x is economically feasible in the sense that the entries in x are
nonnegative.
Theorem. Let C be the consumption matrix for an economy and let d be the
vector of external demand. If C and d have nonnegative entries and if each row sum
or each column sum of C is less than 1, then (I − C)
−1
exists and the production
vector
x = (I −C)
−1
d
has nonnegative entries and is the unique solution of the production equation
x = Cx +d.
Remark. The economic interpretation of entries in (I −C)
−1
The (i, j)
th
entry of the matrix (I −C)
−1
is the increased amount of the i
th
sector
which is to be produced in order to satisfy an increase of 1 unit in the external demand
for sector j.
Proof. Let d be the vector in R
n
with 1 in the j
th
entry and zeros elsewhere. The
corresponding production vector x is the j
th
column of (I − C)
−1
. This shows that
the (i, j)
th
entry of (I −C)
−1
gives the production of the i
th
sector to satisfy 1 unit
in the external demand for sector j. Now, the conclusion is true since if x
1
and x
2
are
production vectors which satisfy respectively the external demands d
1
and d
2
then
x
1
−x
2
is the production vector which satisfies the external demand d
1
−d
2
.
Basic feasible solutions
We consider a linear system in general form
_
_
_
a
11
x
1
+a
12
x
2
+ +a
1n
x
n
= b
1
. . .
a
m1
x
1
+a
m2
x
2
+ +a
mn
x
n
= b
m
.
We suppose that the above system is a consistent one with an infinite number of
solutions (that means that rank A = rank A < n). Also, we suppose that rank A = m
(in the case that rank A < m then there are some equations of the system which are
linear combinations of the others, and if we eliminate these equations we don’t change
the general solution).
37
Since rank A = rank A = m < n the system will have m leading variables and
n − m secondary variables. A leading variable is also called a basic variable and a
secondary variable is called a nonbasic variable.
Definitions
A feasible solution (FS) of a linear system is a solution for which all the com-
ponents are nonnegative.
A basic solution (BS) of a linear system is a solution for which all the nonbasic
variables are zero.
If one or more basic variables in a BS are zero then the solution is a degenerate
BS.
A basic feasible solution (BFS) is a feasible solution which is also a basic one.
If a BFS is degenerate, it is called a degenerate BFS.
Example. Determine all the basic solutions and all the basic feasible solutions of
the following system:
_
2x
1
+ 3x
2
−x
3
= 9
−x
1
+x
2
−x
3
= −2
Solution. Since
A =
_
2 3 −1 9
−1 1 −1 −2
_
,
then rank A = rank A = 2 then the system is consistent with an infinite number of
solutions. Actually, we have 2 basic variables and one nonbasic variable.
The 2 basic variables can be:
a) x
1
, x
2
(x
3
is a nonbasic variable)
Since x
3
is nonbasic then x
3
= 0 and the system becomes
_
2x
1
+ 3x
2
= 9
−x
1
+x
2
= −2
The solution of the previous system is x
1
= 3 and x
2
= 1.
In this case we obtain the BS
_
_
3
1
0
_
_
which is also a BFS.
b) x
1
, x
3
(x
2
is a nonbasic variable)
In this case we obtain the BS
_
_
11
3
0

5
3
_
_
which is not a BFS.
c) x
2
, x
3
(x
1
is a nonbasic variable)
In this case we obtain the BS
_
_
0
11
2
15
2
_
_
which is a BFS.
38
Remark. For a consistent system having an infinite number of solutions whose
rank is m < n (n is the number of unknowns) there are at most C
m
n
basic solutions.
Our purpose is to determine the basic feasible solutions of a linear system. We will
use the Gauss-Jordan elimination method.
Since the rank A = m then we have m basic variables and n−m nonbasic variables.
Since a basic variable is a variable from whose column we have chosen a pivot, that
means that we have chosen m pivots from m different columns and m different rows.
In consequence, we choose a pivot from each row.
Eventually by renumbering the unknowns we can suppose that we have chosen
pivots from the first m columns. So, we can suppose that the basic variables are
x
1
, . . . , x
m
and the nonbasic variables are x
m+1
, . . . , x
n
.
The computations can be arranged in the following table.
b x
1
. . . x
m
x
m+1
. . . x
n
b
1
a
11
. . . a
1m
a
1m+1
. . . a
1n
.
.
.
b
m
a
m1
. . . a
mm
a
mm+1
. . . a
mn
. . . . . .
β
1
1 . . . 0 α
1m+1
. . . α
1n
.
.
.
β
m
0 . . . 1 α
mm+1
. . . α
mn
The general solution is:
_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
_
x
1
= β
1
−(α
1m+1
x
m+1
+ +α
1n
x
n
)
x
2
= β
2
−(α
2m+1
x
m+1
+ +α
2n
x
n
)
.
.
.
x
m
= β
m
−(α
mm+1
x
m+1
+ +α
mn
x
n
)
x
m+1
, . . . , x
n
∈ R
In order to get a basic solution we let
x
m+1
= = x
n
= 0, so x
1
= β
1
, x
2
= β
2
, . . . , x
m
= β
m
.
The basic solution
x =
_
_
_
_
_
_
_
_
_
_
_
_
β
1
β
2
.
.
.
β
m
0
.
.
.
0
_
_
_
_
_
_
_
_
_
_
_
_
is in the final table. This basic solution is also a basic feasible solution if in the final
table the column of the constants contains only nonnegative elements.
39
Next, we will determine rules for choosing the pivot such that if in the initial
table the column of the constants is nonnegative then so it will be in the final table.
Actually, we are interested in preserving the property of the constant column to
contain only nonnegative elements at each intermediate table which occurs when we
solve the system.
We may assume that in the initial table the constant column is nonnegative (if
there is an equation whose righthand side constant is negative, then we can multiply
it by −1).
We are interested in choosing a pivot from the j
th
column such that the constant
column in the next table will remain nonnegative.
If we choose a
ij
,= 0 as a pivot then b
i
will transform into
b
i
a
ij
, which has to be
nonnegative, too.
Since b
i
≥ 0 and
b
i
a
ij
≥ 0, then a
ij
(the pivot) has to be positive.
If k ,= i, then the element b
k
will transform by using the rectangle’s rule into
b
k

b
k
a
ij
−b
i
a
kj
a
ij
≥ 0.
Since a
ij
> 0 then b
k
a
ij
−b
i
a
kj
≥ 0, k = 1, m, k ,= i.
For k = i the previous inequality becomes
b
i
a
ij
−b
i
a
ij
= 0 ≥ 0;
so the pivot has to satisfy the following condition
b
i
a
kj
≤ b
k
a
ij
, k = 1, m (∗)
Let J
1
= ¦k = 1, m [ a
kj
> 0¦ and J
2
= ¦k = 1, m [ a
kj
≤ 0¦.
If k ∈ J
2
then (∗) is satisfied since b
i
a
kj
≤ 0 ≤ b
k
a
ij
.
If k ∈ J
1
then (∗) is equivalent with the following condition
b
i
a
ij

b
k
a
kj
, ∀ k ∈ J
1
.
So, (∗) is satisfied if
b
i
a
ij
= min
_
b
k
a
kj
, k ∈ J
1
_
where J
1
= ¦k = 1, m [ a
kj
> 0¦.
The previous condition is called the ratio test.
Conclusion. In order to keep the nonnegativity property of the constants column
we obtain the following rule for choosing a pivot on the j
th
column.
1) The pivot has to be positive; a
ij
> 0.
2) If J
1
= ∅ (on the j
th
column there is no positive element) then none of the
elements of the j
th
column can become a pivot. In this case x
j
can’t be a basic
variable.
40
If J
1
,= ∅ then the pivot will be the positive element situated on j
th
column for
which the ratio test is satisfied.
The computation table contains an extra column situated at the right hand side
of the usual table, for the ratio test.
Remark. If the ratio test is satisfied for more than one element then the pivot will
be the element which provides us the minimum row by using the lexicographical
order.
Let a = (a
1
, . . . , a
n
) ∈ R
n
and b = (b
1
, . . . , b
n
) ∈ R
n
. We say that a < b (in
lexicographical order) if
_
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
_
a
1
< b
1
or
a
1
= b
1
, a
2
< b
2
or
a
1
= b
1
, a
2
= b
2
, a
3
< b
3
or
a
1
= b
1
, . . . , a
n−1
= b
n−1
, a
n
< b
n
Examples. Determine a basic feasible solution for the following systems:
a)
_
2x
1
+ 3x
2
−x
3
= 9
−x
1
+x
2
−x
3
= −2 [ −1
First, we multiply the second equation by −1, in order to obtain a positive constant
in the right hand side.
_
2x
1
+ 3x
2
−x
3
= 9
x
1
−x
2
+x
3
= 2
b x
1
x
2
x
3
ratio test
9 2 3 −1
2 1 −1 1 min
_
9
2
,
2
1
_
= 2
5 0 5 −3
→ x
1
2 1 −1 1
→ x
2
1 0 1 −
3
5
← x
1
3 1 0
2
5
BFS : x =
_
_
3
1
0
_
_
x
2
11
2
3
2
1 0
→ x
3
15
2
5
2
0 1 BFS : x =
_
_
0
11
2
15
2
_
_
41
b)
_
2x
1
−x
2
−3x
3
+x
4
= 5
x
1
−2x
2
+x
3
+ 2x
4
= 10
b x
1
x
2
x
3
x
4
ratio test
5 2 −1 −3 1 min
_
5
1
,
10
2
_
= 5
10 1 −2 1 2
5 2 −1 −3 1
| ∨
5
1
2
−1
1
2
1
0
3
2
0 −
7
2
0
→ x
4
5
1
2
−1
1
2
1 min
_
0
3
2
,
5
1
2
_
= 0
x
1
0 1 0 −
7
3
0
x
4
5 0 −1
5
3
1 BFS :
_
_
_
_
0
0
0
5
_
_
_
_
a degenerate BFS
c)
_
_
_
x + 2y −3z −2s + 4t = 1
2x + 5y −8z −s + 6t = 4
x + 4y −7z + 5s + 2t = 8
b x y z s t ratio test
1 1 2 −3 −2 4
4 2 5 −8 −1 6 min
_
1
1
,
4
2
,
8
1
_
= 1
8 1 4 −7 5 2
→ x 1 1 2 −3 −2 4
2 0 1 −2 3 −2 min
_
2
3
,
7
7
_
=
2
3
7 0 2 −4 7 −2
x
7
3
1
8
3

13
3
0
8
3
→ s
2
3
0
1
3

2
3
1 −
2
3
7
3
0 −
1
3
2
3
0
8
3
x
35
2
1
1
2
0 0 20
s 3 0 0 0 1 2 BFS :
_
_
_
_
_
_
35
2
0
7
2
3
0
_
_
_
_
_
_
→ z
7
2
0 −
1
2
1 0 4
42
2.2 Linear programming problems (LPP)
Example. The diet problem
We want to determine the most economical diet which satisfies the basic minimum
nutritional requirements for a good health.
We know:
- there are available n different kinds of food: F
1
, . . . , F
n
- food F
j
sells at a price c
j
per unit; j = 1, n
- there are m basic nutritional ingredients N
1
, . . . , N
m
- for a balanced diet each individual must receive at least b
i
units of N
i
th
ingredient
per day, i = 1, m
- each unit of food F
j
contains a
ij
units of the i
th
ingredient.
We want:
- the amount x
j
of food F
j
(j = 1, n) such that the total cost of the diet is as
small as possible.
The mathematical model is the following:
- The total cost:
f(x
1
, x
2
, . . . , x
n
) = x
1
c
1
+x
2
c
2
+ +x
n
c
n
→ minimize
- The quantity of the i
th
ingredient received by a person is:
a
i1
x
1
. ¸¸ .
from F
1
+ a
i2
x
2
. ¸¸ .
from F
2
+ + a
in
x
n
. ¸¸ .
from F
n
≥ b
i
, i = 1, m
We have to solve the following problem (example of a linear programming problem):
f =
n

j=1
c
j
x
j
→ minimize (the objective function)
subject to the constraints:
_
¸
¸
_
¸
¸
_
a
11
x
1
+a
12
x
2
+ +a
1n
x
n
≥ b
1
. . .
a
m1
x
1
+a
m2
x
2
+ +a
mn
x
n
≥ b
m
x
j
≥ 0, j = 1, n
The main characteristic of a LPP is that all the involved functions: the objective
function and those which express the constraints must be linear.
Definition (General form of a linear programming problem)
A LPP is an optimization (minimization or maximization) problem of the following
form:
Find the optimum (minimum or maximum) of the following function
f =
n

j=1
c
j
x
j
subject to the constraints:
43
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
n

j=1
a
ij
x
j
≤ b
i
, i = 1, p
n

j=1
a
ij
x
j
≥ b
i
, i = p + 1, q
n

j=1
a
ij
x
j
= b
i
, i = q + 1, m
x
j
≥ 0, j = 1, n
where c
j
, b
i
, a
ij
, j = 1, n, i = 1, m are known real numbers, and x
j
, j = 1, n are real
numbers to be determined.
Depending on particular values of p and q we may have inequality constraints of
one type or the other and equality restrictions as well.
Definition (Standard form of a LPP)
Optimize
f =
n

j=1
c
j
x
j
subject to the constraints:
_
¸
¸
_
¸
¸
_
a
11
x
1
+ +a
1n
x
n
= b
1
. . .
a
m1
x
1
+ +a
mn
x
n
= b
m
x
j
≥ 0, j = 1, n
where c
j
, b
i
, a
ij
, j = 1, n, i = 1, m are known real numbers; x
j
, j = 1, n are real
numbers to be determined.
We can assume that b
i
≥ 0, i = 1, m (otherwise we multiply the equality by −1).
Remark. Any LPP can be converted to the standard form by using the slack or
surplus variables.
a) If a
i1
x
1
+a
i2
x
2
+ +a
in
x
n
≤ b
i
, then we add to the left side a new variable
y
i
≥ 0 in order to transform the inequality into an equality. We obtain:
a
i1
x
1
+a
i2
x
2
+ +a
in
x
n
+y
i
= b
i
.
In this case y
i
is called a slack variable.
b) If a
i1
x
1
+ a
i2
x
2
+ + a
in
x
n
≥ b
i
then we subtract to the left side of the
inequality a new variable y
i
≥ 0 in order to transform the inequality into an equality.
We obtain
a
i1
x
1
+a
i2
x
2
+ +a
in
x
n
−y
i
= b
i
.
In this case y
i
is called a surplus variable.
Definition. Any solution of the constraints for which the optimum of the objective
function is obtained is called an optimal solution.
44
The graphical method for solving a LPP
When a LPP involves only 2 variables it can be solved by graphical procedures.
The graphical approach is extremely helpful in understanding the kinds of phenomena
which can occur in solving linear programming problems. We consider the case n = 2.
The feasible region is the set of points with coordinates (x, y) that satisfy all the
constraints. Each constraint (inequality) represents a half-plane at one side of the line
whose equation is the correspondent equality. So, the set of feasible solution is the
intersection of these half planes.
Example 1. Determine the maximum of the function
f(x, y) = 3x + 2y subject to
_
_
_
−2x + 2y ≤ 4
3x −y ≤ 3
x ≥ 0, y ≥ 0
`
¸
(0, 2)
(0, 0) (1, 0)
(2, 3)
3
x

y
=
3

x
+
2
y
=
4
3
x
+
2
y
=
1
0
3
x
+
2
y
=
0
To solve this problem graphically we first shade the region in the graph in which
all the feasible solutions must lie and then shift the position of the objective function
line
f = 3x + 2y.
The objective function is linear so it has level curves that are straight lines of
equation
3x + 2y = c, c const.
45
The question is how big can c become so that the line of equation 3x + 2y = c
meets the above polygon somewhere (for a maximum problem).
The objective is to maximize the level curve 3x+2y = c. If we fix for the beginning
the value of c to be 0 we see that the level curve can be represented as a line of slope −
3
2
that passes through the origin. Translating this objective line (i.e. moving it without
changing its slope) is equivalent to choose a different value for c. When the value of
c increases, the correspondent line moves to the right and hence we are interested in
determining the greatest value for c such that the correspondent level curve touches
the set of feasible solutions.
Graphically it is not hard to realize that the optimum value is realized at the
vertex (2,3) and the value of the maximum is 10.
Remark (The corner point method for solving a LPP)
The following cases can arise for a maximization problem.
1) If the constraints are such that there is no feasible region, then there is no
solution for the LPP.
2) If the objective function line can be moved indefinitely in a direction that
increases f and still intersects the feasible region, then f approaches ∞.
3) If the objective function line can be moved only a finite amount by increasing
the value of f (while still intersecting the feasible region) then the last point touched
by the objective function, if it is unique, will give the unique optimal solution. If it
is not unique, then any point on the segment of the boundary last touched gives an
optimal solution. In this case if x and x are the coordinates of the endpoints of the
segment, then the general solution is the following
(1 −t)x +tx, t ∈ [0, 1].
Example 2.
f(x, y) = 6x −2y → maximize
subject to
_
_
_
−x + 2y ≤ 4
3x −y ≤ 3
x ≥ 0, y ≥ 0
Solution. In this case the level curve 6x−2y = c is parallel to the line 3x−y = 3,
hence the optimal solutions are situated on the segment whose end points are (1,0)
and (2,3).
46
`
¸
y
x
(2,3)
(0,2)
(1,0) (0, 0)
3
x

y
=
3

x
+
2
y
=
4
6
x

2
y
=
0
6
x

2
y
=
6
Hence, the general solution is
(1 −t)(1, 0) +t(2, 3) = (1 +t, 3t), t ∈ [0, 1].
Example 3.
f(x, y) = 5x + 4y → maximize
subject to
_
_
_
x +y ≤ 2
−2x −2y ≤ −9
x ≥ 0, y ≥ 0
⇔ x +y ≥
9
2
Solution. In this case the set of feasible solutions is empty and hence the LPP
has no solution.
47
`
¸
(0,2)
y
(0,
9
2
)
(
9
2
, 0)
x
(2,0)
x
+
y
=
2
x
+
y
=
9
2
Example 4.
f(x, y) = x −4y → maximize
subject to
_
_
_
2x −y ≥ 1
x + 2y ≥ 2
x ≥ 0, y ≥ 0
Solution. In this case we have an unbounded feasible set. The objective function
can become as large as we want so the LPP is unbounded.
48
`
¸
(0,1)
y
(0, −1)
(
1
2
, 0) (0,2)
x
x
+
2
y
=
2
Remark. A similar discussion can be made regarding a minimum LPP.
The Simplex algorithm
The Simplex algorithm was developed by George B. Dantzig and was used first
for military reasons and after the second world war in the business world. In the years
’70 the Simplex algorithm was used to optimize the production, the benefits, the
costs and in the game theory. George B. Dantzig is considered to be one of the three
founders of linear programming, among John von Neumann and Leonid Kantorovich.
We will analyze only minimum problems, the results concerning maximum prob-
lems will be only stated.
Consider a LPP in standard form
f = c
1
x
1
+ +c
n
x
n
→ minimize
subject to
_
¸
¸
¸
_
¸
¸
¸
_
a
11
x
1
+ +a
1n
x
n
= b
1
.
.
.
a
m1
x
1
+ +a
mn
x
n
= b
m
x
1
≥ 0, . . . , x
n
≥ 0
As we have already discussed, we can assume that
rank A = m and b
j
≥ 0, j = 1, m.
49
Theorem (The fundamental theorem of linear programming)
Let a LPP in standard form.
a) If there is no optimal solution the problem is either infeasible or unbounded.
b) If there is an optimal solution then there is an optimal basic feasible solution.
Remark. The previous theorem assures us that it is sufficient to consider only
BFS in our search for optimal solutions.
The idea of the simplex method is to start from one basic feasible solution of the
constraints set and transform it into another one in order to decrease the value of the
objective function until a minimum is reached.
We need a criterion to decide when the objective function cannot be decreased
anymore, case in which we found an optimal solution and no more iterative steps are
needed.
Let x be an arbitrary BFS associated to the first part of the next table (the table
looks this way, eventually by renumbering the equations and the unknowns).
b x
1
x
2
. . . x
m
x
m+1
. . . x
n
← x
1
β
1
1 0 . . . 0 α
1m+1
. . . α
1n
x
2
β
2
0 1 . . . 0 α
2m+1
. . . α
2n
.
.
.
.
.
.
x
m
β
m
0 0 . . . 1 α
mm+1
. . . α
mn
→ x
m+1
β

1
1/α
1m+1
0 . . . 0 1 . . . α
1n

1n+1
x
2
β

2
0 1 . . . 0 0
.
.
.
.
.
.
.
.
.
x
m
β

m
.
.
. 0 . . . 1 0 . . .
.
.
.
The basic variables are: x
1
, x
2
, . . . , x
m
.
The nonbasic variables are: x
m+1
, . . . , x
m
.
The first part of the previous table gives us the following BFS
x =
_
_
_
_
_
_
_
_
_
_
β
1
β
2
. . .
β
m
0
. . .
0
_
_
_
_
_
_
_
_
_
_
For each j = 1, n, we define
f
j
=
m

i=1
c
i
α
ij
,
where c
i
, i = 1, m are the coefficients of the objective function.
50
For instance, we have
f
1
= c
1
1 +c
2
0 + +c
m
0 = c
1
.
.
.
f
m
= c
1
0 +c
2
0 + +c
m
1 = c
m
f
m+1
= c
1
α
1m+1
+c
2
α
2m+1
+ +c
m
α
mm+1
.
.
.
f
n
= c
1
α
1n
+c
2
α
2n
+ +c
m
α
mn
We are now able to present the main result.
Theorem (The optimality criterion for a minimum LPP) If for a basic
feasible solution we have c
j
−f
j
≥ 0, ∀ j = 1, n, then the solution is optimal.
Proof. Let x be a BFS for which c
j
−f
j
≥ 0, ∀ j = 1, n.
We want to prove that any new BFS (x) obtained by choosing a new pivot from
the remaining columns (from m+ 1 to n) isn’t better then x, that is f(x) ≥ f(x).
Suppose we choose α
1m+1
as a pivot from the m+1
th
column, so α
1m+1
satisfies
the following conditions
_
¸
¸
_
¸
¸
_
α
1m+1
> 0
β
1
α
1m+1
= min
k
_
β
k
α
km+1
[ α
km+1
> 0
_
the lexicographical order is respected
By choosing the new pivot, x
1
is leaving the base and becomes a nonbasic variable
and x
m+1
is entering the base and becomes a basic variable.
We determine now, the new BFS
β
1
→ β

1
=
β
1
α
1m+1
β
2
→ β

2
= β
2

β
1
α
1m+1
α
2m+1
β
3
→ β

3
= β
3

β
1
α
1m+1
α
3m+1
.
.
.
β
m
→ β

m
= β
m

β
1
α
1m+1
α
mm+1
The basic variables for x are x
2
= β

2
, . . . , x
m
= β

m
and x
m+1
= β

1
.
The nonbasic variables for x are x
1
= x
m+2
= = x
n
= 0.
51
Hence,
x =
_
_
_
_
_
_
_
_
_
_
0
β

2
. . .
β

m
β

1
0
0
_
_
_
_
_
_
_
_
_
_
.
It remains for us to compute and compare f(x) and f(x).
f(x) = c
1
β
1
+ +c
m
β
m
+c
m+1
0 + +c
n
0
= c
1
β
1
+ +c
m
β
m
f(x) = c
1
0 +c
2

_
β
2

β
1
α
1m+1
α
2m+1
_
+ +
+c
m

_
β
m

β
1
α
1m+1
α
mm+1
_
+c
m+1

β
1
α
1m+1
+c
m+2
0 + +c
n
0
= c
2
β
2
+ +c
m
β
m
. ¸¸ .
=f(x)−c
1
β
1
+
β
1
α
1m+1
[c
m+1
−(c
2
α
2m+1
+ +c
m
α
mm+1
. ¸¸ .
=f
m+1
−c
1
·α
1m+1
)]
= f(x) −c
1
β
1
+
β
1
α
1m+1
(c
m+1
−f
m+1
+c
1
α
1m+1
)
= f(x) −c
1
β
1
+
β
1
α
1m+1
(c
m+1
−f
m+1
) +c
1
β
1
So,
f(x) = f(x) +
β
1
α
1m+1
. ¸¸ .
≥0
(c
m+1
−f
m+1
. ¸¸ .
≥0
) ≥ f(x).
In conclusion, the basic feasible solution x cannot be improved (so it is optimal).
This completes the proof.
From the previous theorem we get the following obvious corollary.
Corollary. If there exists l ∈ ¦1, n¦ with c
l
−f
l
< 0 for a basic feasible solution,
the value of the objective function can be decreased by choosing x
l
as a basic variable.
The following two theorems characterize situations when either an optimal solution
does not exist or when an existing optimal solution is not uniquely determined.
52
Theorem. If inequality c
l
−f
l
< 0 holds for a nonbasic variable x
l
and x
l
cannot
become a basic variable (the entire column of x
l
is nonpositive so we cannot choose a
pivot from its column) then the LPP does not have an optimal solution.
In the latter case, the objective function value is unbounded from below, and we
can stop our computation.
Theorem. If there is l ∈ ¦1, . . . , n¦ such that c
l
−f
l
= 0 for an optimal solution
and x
l
is a nonbasic variable which can become a basic variable (there is at least
one positive element on its column), then there exists another optimal basic feasible
solution (by choosing x
l
as a basic variable).
Indeed, if the assumptions of the previous theorem are satisfied, we can perform
a further pivoting step with x
l
as entering variable and there is at least one basic
variable which can be chosen as leaving variable. However, due to c
l
− f
l
= 0, the
objective function value does not change.
In what concerns a minimum LPP we have the following conclusions (regarding a
BFS denoted by x):
1) If c
j
− f
j
≥ 0, for each j ∈ ¦1, . . . , n¦ then x is an optimal solution and
f
min
= f(x).
2) If there is j ∈ ¦1, . . . , n¦ such that c
j
−f
j
< 0 and J
1
= ¦k = 1, m, α
kj
> 0¦ = ∅
then the LPP is unbounded from below; f
min
= −∞.
3) If there is j ∈ ¦1, . . . , n¦ such that c
j
−f
j
< 0 then x is not an optimal solution.
In this case we obtain a better solution x (f(x) < f(x)) by choosing a pivot from x
j
’s
column.
4) If c
j
− f
j
≥ 0 for each j ∈ ¦1, . . . , n¦ and there is l ∈ ¦1, . . . , n¦ such that
c
l
−f
l
= 0 and x
l
is a nonbasic variable then the solution x, obtained by choosing a
pivot from x
l
column, is optimal too.
In what concerns a maximum LPP we have the following conclusions (regarding
a BFS denoted by x).
1) If c
j
−f
j
≤ 0 for each j ∈ ¦1, . . . , n¦ then x is an optimal solution and f
max
=
f(x).
2) If there is j ∈ ¦1, . . . , n¦ such that c
j
−f
j
> 0 and J
1
= ¦k = 1, m, α
kj
> 0¦ = ∅
then the LPP is unbounded from above; f
max
= +∞.
3) If there is j ∈ ¦1, . . . , n¦ such that c
j
−f
j
> 0 then x is not an optimal solution.
In this case we obtain a better solution x (f(x) > f(x)) by choosing a pivot from x
j
’s
column.
4) If c
j
− f
j
≤ 0 for each j ∈ ¦1, . . . , n¦ and there is l ∈ ¦1, . . . , n¦ such that
c
l
−f
l
= 0 and x
l
is a nonbasic variable then the solution x, obtained by choosing a
pivot from x
l
column, is optimal too.
Based on the results above, we can summarize the simplex algorithm as follows.
Assume that we have some current basic feasible solutions. The correspondent
simplex tableau is given bellow.
53
c c
1
. . . c
j
. . . c
l
. . . c
n
ratio
c
B
B b x
1
. . . x
j
. . . x
l
. . . x
n
test
. . .
c
i
x
i
b
i
α
i1
. . . α
ij
. . . α
il
. . . α
in
. . .
c
k
x
k
b
k
α
k1
. . . α
kj
. . . α
kl
. . . α
kn
. . .
f
j
f(x) f
1
. . . f
j
. . . f
l
. . . f
n
c
j
− f
j
− c
1
− f
1
. . . c
j
− f
j
. . . c
l
− f
l
. . . c
n
− f
n
In the first row and column of the previous table we write the coefficients of the
corresponding variables in the objective function.
f
j
=
m

k=1
c
k
α
kj
, j = 1, n
are obtained by adding the corresponding products between the elements of column
c
B
and column x
j
.
The simplex algorithm
1
st
step. Determine a BFS.
2
nd
step. Check the optimality of the current BFS.
3
rd
step. If the LPP is unbouded exit the algorithm.
If the current BFS is optimal and unique exit the algorithm.
If the current BFS is optimal and not unique determine another optimal solu-
tion.
If the current BFS is not optimal improve it.
4
th
step. Repeat steps 2 and 3 till obtaining all the optimal solutions.
Examples
1) A firm intends to manufacture three types of products P
1
, P
2
and P
3
so that
the total production cost does not exceed 32000 EUR. There are 400 working hours
possible and 30 units of raw materials may be used. Additionally, the data presented
in the table below are given.
Product P
1
P
2
P
3
Selling price (EUR/piece) 1600 3000 5200
Production cost (EUR/piece) 1000 2000 4000
Required raw material (per piece) 3 2 2
Working time (hours per piece) 20 10 20
54
The objective is to determine the quantities of each product so that the profit
is maximized. Let x
i
be the number of produced pieces of P
i
, i ∈ ¦1, 2, 3¦. We can
formulate the above problem as an LPP as follows:
The objective function is obtained by subtracting the production cost from the
selling price and dividing the resulting profit by 100 for each product
f(x
1
, x
2
, x
3
) =
1
100
(1600x
1
+ 3000x
2
+ 5200x
3
−1000x
1
−2000x
2
−4000x
3
)
= 6x
1
+ 10x
2
+ 12x
3
→ maximize
The constraint on the production cost can be divided by 1000
1000x
1
+ 2000x
2
+ 4000x
3
≤ 32000 [: 1000
and we obtain
x
1
+ 2x
2
+ 4x
3
≤ 32.
The constraint on the working time can be divided by 10
20x
1
+ 10x
2
+ 20x
3
≤ 400 [: 10
and we obtain
2x
1
+x
2
+ 2x
3
≤ 40.
The constraint on raw materials is the following:
3x
1
+ 2x
2
+ 2x
3
≤ 30.
So, we get the following LPP problem, written in general form:
f = 6x
1
+ 10x
2
+ 12x
3
→ max
subject to
_
¸
¸
_
¸
¸
_
x
1
+ 2x
2
+ 4x
3
≤ 32
3x
1
+ 2x
2
+ 2x
3
≤ 30
2x
1
+x
2
+ 3x
3
≤ 40
x
1
, x
2
, x
3
≥ 0
Introducing now in the i
th
constraint, i = 1, 3, the slack variable x
3+i
≥ 0, we
obtain the standard form and the following table.
f = 6x
1
+ 10x
2
+ 12x
3
→ max
subject to
_
¸
¸
_
¸
¸
_
x
1
+ 2x
2
+ 4x
3
+x
4
= 32
3x
1
+ 2x
2
+ 2x
3
+x
5
= 30
2x
1
+x
2
+ 3x
3
+x
6
= 40
x
1
, . . . , x
6
≥ 0
55
c 6 10 12 0 0 0
c
B
B b x
1
x
2
x
3
x
4
x
5
x
6
ratio test
0 ← x
4
32 1 2 4 1 0 0 min
_
32
4
,
30
2
,
40
3
_
= 8
0 x
5
30 3 2 2 0 1 0
0 x
6
40 2 1 3 0 0 1
f
j
0 0 0 0 0 0 0
c
j
−f
j
− 6 10 12 0 0 0
12 → x
3
8
1
4
1
2
1
1
4
0 0 min
_
8
1
2
,
14
1
_
= 14
0 ← x
5
14
5
2
1 0 −
1
2
1 0
0 x
6
16
5
4

1
2
0 −
3
4
0 1
f
j
96 3 6 12 3 0 0
c
j
−f
j
− 3 4 0 −3 0 0
12 x
3
1 −1 0 1
1
2

1
2
0
10 x
2
14
5
4
1 0 −
1
2
1 0
0 x
6
23
5
2
0 0 −1
1
2
1
f
j
152 13 10 12 1 4 0
c
j
−f
j
−7 0 0 −1 −4 0
Since now all coefficients c
j
− f
j
are nonpositive, we get the following optimal
solution from the latter table:
x
1
= 0, x
2
= 14, x
3
= 1, x
4
= 0, x
5
= 0, x
6
= 23.
Since there is no nonbasic variable for which c
j
−f
j
= 0, we have a unique solution.
This means that the optimal solution is to produce no piece of product P
1
, 14 pieces
of product P
2
and one piece of product P
3
. Taking into account that the coefficients
of the objective function were divided by 100, we get a total profit of 15200 EUR.
2) We consider the following LPP
f = −2x
1
−2x
2
→ min
subject to
_
_
_
x
1
−x
2
≥ −1
−x
1
+ 2x
2
≤ 4
x
1
, x
2
≥ 0.
Solution. First, we transform the given problem into the standard form, i.e. we
multiply the first constraint by −1 and introduce the slack variables x
3
and x
4
. We
obtain:
f = −2x
1
−2x
2
→ min
subject to
_
_
_
−x
1
+x
2
+x
3
= 1
−x
1
+ 2x
2
+x
4
= 4
x
1
, x
2
, x
3
, x
4
≥ 0
56
c −2 −2 0 0
c
B
B b x
1
x
2
x
3
x
4
ratio test
0 ← x
3
1 −1 1 1 0 min
_
1
1
,
4
2
_
= 1
0 x
4
4 −1 2 0 1
f
j
0 0 0 0 0
c
j
−f
j
− −2 −2 0 0
−2 → x
2
1 −1 1 1 0
0 ← x
4
2 1 0 −2 1
f
j
−2 2 −2 −2 0
c
j
−f
j
−4 0 2 0
−2 x
2
3 0 1 −1 1
−2 → x
1
2 1 0 −2 1
f
j
−10 −2 −2 6 −4
c
j
−f
j
0 0 −6 4
Since there is only one negative coefficient of a nonbasic variable in the objective
row, variable x
3
should be chosen as entering variable. However, there are only neg-
ative elements in the column belonging to x
3
. This means that we cannot perform a
further pivoting step, and so there does not exist an optimal solution of the minimiza-
tion problem considered, i.e. the objective function value is unbounded from below
(f
min
= −∞).
3) We consider the following LPP
f = x
1
+x
2
+x
3
+x
4
+x
5
+x
6
→ min
subject to
_
¸
¸
_
¸
¸
_
2x
1
+x
2
+x
3
≥ 4000
x
2
+ 2x
4
+x
5
≥ 5000
x
3
+ 2x
5
+ 3x
6
≥ 3000
x
1
, x
2
, x
3
, x
4
, x
5
, x
6
≥ 0
Solution. To get the standard form, we notice that in each constraint there is one
variable that occurs only in this constraint. Therefore, we divide the first constraint
by the coefficient 2 of variable x
1
, the second constraint by 2 and the third constraint
by 3. Then, we introduce a surplus variable in each of the constraints and obtain the
standard form
_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
f = x
1
+x
2
+x
3
+x
4
+x
5
+x
6
→ min
x
1
+
1
2
x
2
+
1
2
x
3
−x
7
= 2000
1
2
x
2
+x
4
+
1
2
x
5
−x
8
= 2500
1
3
x
3
+
2
3
x
5
+x
6
−x
9
= 1000
x
1
, x
2
, x
3
, x
4
, x
5
, x
6
, x
7
, x
8
, x
9
≥ 0
57
c 1 1 1 1 1 1 0 0 0
c
B
B b x
1
x
2
x
3
x
4
x
5
x
6
x
7
x
8
x
9
ratio test
1 x
1
2000 1
1
2
1
2
0 0 0 −1 0 0 min
_
2500
1
2
,
1000
2
3
_
1 x
4
2500 0
1
2
0 1
1
2
0 0 −1 0 = min{5000, 1500}
1 ←x
6
1000 0 0
1
3
0
2
3
1 0 0 −1 = 1500
f
j
5500 1 1
5
6
1
7
6
1 −1 −1 −1
c
j
−f
j
− 0 0
1
6
0 −
1
6
0 1 1 1
1 x
1
2000 1
1
2
1
2
0 0 0 −1 0 0 min
_
2000
1
2
,
1750
1
2
_
1 ←x
4
1750 0
1
2

1
4
1 0 −
3
4
0 −1
3
4
= 3500
1 →x
5
1500 0 0
1
2
0 1
3
2
0 0 −
3
2
f
j
5250 1 1
3
4
1 1
3
4
−1 −1 −
3
4
c
j
−f
j
− 0 0
1
4
0 0
1
4
1 1
1
4
Now all the coefficients in the objective row are nonnegative and from the latter
tableau we obtain the following optimal solution
x
1
= 2000, x
2
= x
3
= 0, x
4
= 1750, x
5
= 1500, x
6
= 0
with the optimal objective function f
min
= 5250.
Notice that the optimal solution is not uniquely determined. In the last tableau,
there is one coefficient in the objective row equal to zero (this coefficient corresponds
to the nonbasic variable x
2
). Taking x
2
as entering variable, the ratio test determines
x
4
as the leaving variable, and we get
1 x
1
250 1 0
3
4
−1 −
3
4
−1 0 −
3
4
1 → x
2
3500 0 1 −
1
2
2 0 −
3
2
0 −2
3
2
1 x
5
1500 0 0
1
2
0 1
3
2
0 0 −
3
2
f
j
5250 1 1
3
4
1 1
3
4
−1 −2 −
3
4
c
j
−f
j
− 0 0
1
4
0 0
1
4
1 2
3
4
So, we obtain the following basic feasible solution
x
1
= 250, x
2
= 3500, x
3
= x
4
= 0, x
5
= 1500, x
6
= 0
with the same objective function value f
min
= 5250.
The general solution is:
x(t) = (1 −t)
_
_
_
_
_
_
_
_
2000
0
0
1750
1500
0
_
_
_
_
_
_
_
_
+t
_
_
_
_
_
_
_
_
250
3500
0
0
1500
0
_
_
_
_
_
_
_
_
=
_
_
_
_
_
_
2000 −1750t
3500t
1750 −1750t
1500
0
_
_
_
_
_
_
, t ∈ [0, 1].
58
Matrix form of the simplex method
We derive now the formulas in matrix-vector form for the linear programming
problem. In vector notation the standard problem becomes:
f(x) = c
t
x → minimize
subject to
_
Ax = b
x ≥ 0
Here x is an n-dimensional column vector, c
t
is an n-dimensional row vector,
named the cost vector (the symbol c
t
means the transpose of the vector c), A is an
mn matrix and b is an m-dimensional column vector. The vector inequality x ≥ 0
means that each component of x is nonnegative.
Let x be a basic feasible solution with the variables ordered so that
x =
_
x
B
0
_
, x
B
∈ R
m
, 0 ∈ R
n−m
, x
B
≥ 0
where x
B
is the vector of basic variables and 0 is the vector of nonbasic variables. In the
same way, the matrix A after the same permutations of column, can be decomposed
as
A =
_
B N
_
.
B is the submatrix of the matrix consisting of the m columns of A corresponding
to the basic variables. These columns are linearly independent and hence the columns
of B form a basis for R
m
. The matrix B is invertible.
The equation Ax = b is equivalent to
_
B N
_
_
x
B
0
_
= b, Bx
B
= b, x
B
= B
−1
b.
The cost of such a vector is
f(x) =
_
c
t
B
c
t
N
_
_
x
B
0
_
= c
t
B
x
B
= c
t
B
B
−1
b.
The basic step of the simplex method consists in moving to another basic feasible
solution such that the cost has been lowered.
The changing from x =
_
x
B
0
_
to x =
_
x
B
x
N
_
must satisfy the following
conditions:
1) Ax = b,
2) f(x) < f(x)
3) x ≥ 0.
The first condition is equivalent to
x
B
= x
B
−B
−1
Nx
N
.
59
Indeed, the equality
_
B N
_
_
x
B
x
N
_
= b
implies
Bx
B
+Nx
N
= b
and hence
x
B
= B
−1
(b −Nx
N
) = x
B
−B
−1
Nx
N
.
The new cost f(x) will be
f(x) = c
t
x =
_
c
t
B
c
t
N
_
_
x
B
x
N
_
=
_
c
t
B
c
t
N
_
_
x
B
−B
−1
Nx
N
x
N
_
= c
t
B
(x
B
−B
−1
Nx
N
) +c
t
N
x
N
= (c
t
N
−c
t
B
B
−1
N)x
N
+c
t
B
x
B
= (c
t
N
−c
t
B
B
−1
N)x
N
+f(x).
The sign of (c
t
N
−c
t
B
B
−1
N)x
N
will show if it is possible to decrease the cost by
moving to the new vector x.
The vector c
t
N
−c
t
B
B
−1
N is called the vector of reduced costs or the relative cost
vector (for nonbasic variables). If all the components of the vector of reduced costs
are nonnegative, we are not able to lower the costs anymore and x is optimal.
Duality theory
Linear programming is based on the theory of duality. To each primal linear
programming problem we can assign a dual linear programming problem. This sec-
tion formulates and discusses the relationships between the primal and dual problems
which are important for optimality conditions and offers a meaningful economic in-
terpretation of the optimization model.
Motivation
We begin with an example.
Example 1.
f = x
1
+ 3x
2
→ minimize
subject to
_
¸
¸
_
¸
¸
_
x
1
+ 3x
2
≥ 4
4x
1
−x
2
≥ 1
x
2
≥ 3
x
1
, x
2
≥ 0
First we observe that every feasible solution provides an upper bound of the opti-
mal objective function value f
min
. For example, the solution (x
1
, x
2
) = (1, 3) tells us
60
f
min
≤ 1 + 3 3 = 10. But how close is this bound to the optimal value? To answer,
we need to give lower bounds.
By multiplying the third constraint by 13 and adding that to the sum of the first
two constraints we get
x
1
+ 3x
2
+ 4x
1
−x
2
+ 13x
2
≥ 4 + 1 + 39
which is equivalent to
5x
1
+ 15x
2
≥ 44.
Hence
44
5
≤ f
min
≤ 10.
To get a better lower bound, we apply again the same lower bounding technique,
but we replace the numbers used before with variables. So, we multiply the three
constraints by nonnegative numbers y
1
, y
2
and y
3
.
Hence
y
1
(x
1
+ 3x
2
) +y
2
(4x
1
−x
2
) +y
3
x
2
≥ 4y
1
+y
2
+ 3y
3
and
x
1
(y
1
+ 4y
2
) +x
2
(3y
1
−y
2
+y
3
) ≥ 4y
1
+y
2
+ 3y
3
.
If we stipulate that each of the coefficients of the x
i
is at most as large as the
corresponding coefficient in the coefficient function,
_
y
1
+ 4y
2
≤ 1
3y
1
−y
2
+y
3
≤ 3
then
f = x
1
+ 3x
2
≥ 4y
1
+y
2
+ 3y
3
.
We now have a lower bound 4y
1
+y
2
+3y
3
, which we should maximize in our effort
to obtain the best possible lower bound.
Therefore, we are led to the following optimization problem
g = 4y
1
+y
2
+ 3y
3
→ maximize
subject to
_
_
_
y
1
+ 4y
2
≤ 1
3y
1
−y
2
+y
3
≤ 3
y
1
, y
2
, y
3
≥ 0
This problem is called the dual linear programming problem associated to the
given linear programming problem. Next, we will define the dual linear programming
problem in general.
61
The dual problem. Symmetric form
Given a LPP in the form
(P) f =
n

j=1
c
j
x
j
→ minimize
subject to
_
¸
_
¸
_
n

j=1
a
ij
x
j
≥ b
i
, i = 1, 2, . . . , m
x
j
≥ 0, j = 1, 2, . . . , n
the associated dual linear programming problem is given by
(D) g =
m

i=1
b
i
y
i
→ maximize
subject to
_
¸
_
¸
_
m

i=1
a
ij
y
i
≤ c
j
, j = 1, 2, . . . , n
y
i
≥ 0, i = 1, 2, . . . , m
Since we started with the LPP (P) it is called the primal problem.
If we use the matrix notation in the form
_
A b
f ∗
_
we have
i) the minimization problem (P)
_
_
_
_
_
a
11
. . . a
1n
b
1
.
.
.
.
.
.
.
.
.
a
m1
. . . a
mn
b
m
c
1
. . . c
n

_
_
_
_
_
ii) the maximization problem (D).
_
_
_
_
_
a
11
. . . a
m1
c
1
.
.
.
.
.
.
.
.
.
a
1n
. . . a
mn
c
n
b
1
. . . b
m

_
_
_
_
_
Example 2. a) Find the dual of the given linear programming problem
f = 6x
1
+ 5x
2
+ 7x
3
→ minimize
subject to
62
_
¸
¸
_
¸
¸
_
3x
1
+x
2
+ 2x
3
≥ 3
2x
1
+ 2x
2
−x
3
≥ 5
x
1
+ 2x
2
+x
3
≥ 2
x
1
, x
2
, x
3
≥ 0
Solution
(D) g = 3y
1
+ 5y
2
+ 2y
3
→ maximize
subject to
_
¸
¸
_
¸
¸
_
3y
1
+ 2y
2
+y
3
≤ 6
y
1
+ 2y
2
+ 2y
3
≤ 5
2y
1
−y
2
+y
3
≤ 7
y
1
, y
2
, y
3
≥ 0
b) The dual of the diet problem
The diet problem was the problem faced by a dieticien to select a combination of
foods to meet certain nutritional requirements at minimum cost. This problem has
the form (see the first example in section 2.2)
f =
n

j=1
c
j
x
j
→ minimize
subject to
_
¸
_
¸
_
n

j=1
a
ij
x
j
≥ b
i
, i = 1, m
x
j
≥ 0, j = 1, n
The dual problem is
g =
m

i=1
b
i
y
i
→ maximize
subject to
_
¸
_
¸
_
m

i=1
a
ij
y
i
≤ c
j
, j = 1, n
y
i
≥ 0, i = 1, m
We describe an interpretation of the dual problem. Imagine a pharmaceutical com-
pany that produces the nutrients considered important by the dieticien. The problem
is to determine the positive unit prices y
1
, y
2
, . . . , y
m
for the nutrients, in order to
maximize the revenue
_
m

i=1
b
i
y
i
_
while at the same time being competitive to the
real food. To be competitive with real food, the cost of a unit of food i made by the
pharmaceutical company must be at most c
i
_
n

i=1
a
ij
y
i
≤ c
i
_
.
63
Remark 1. The dual of the dual symmetric problem is the primal problem.
Proof. We must first write the dual problem in the form (P). To change a maxi-
mization into a minimization, we note that:
max
m

i=1
b
i
y
i
= −min
_
m

i=1
(−b
i
)y
i
_
.
To change the direction of the inequalities, we simply multiply by −1.
The resulting equivalent representation of the dual problem in standard form then
is
- minimize
m

i=1
(−b
i
)y
i
subject to
_
¸
_
¸
_
n

i=1
(−a
ij
)y
i
≥ −c
j
, j = 1, . . . , n
y
i
≥ 0, i = 1, 2, . . . , m
Now we take its dual:
- maximize
n

j=1
(−c
j
)x
j
= minimize
n

j=1
c
j
x
j
subject to
_
¸
_
¸
_
n

j=1
(−a
ij
)x
j
≤ −b
i
, i = 1, 2, . . . , m
x
j
≥ 0, j = 1, 2, . . . , n
which is clearly equivalent to the primal problem (P).
It is always possible to obtain the dual of a LPP consisting a mixture of equations,
inequalities (in either direction), nonnegative variables or variables unrestricted in sign
by changing the system to an equivalent system (P).
However, an easier way is to apply certain rules, presented below.
Primal Dual
Minimize primal objective Maximize dual objective
Objective coefficients Right hand side (RHS) of dual
RHS of primal Objective coefficients
Coefficient matrix Transpose coefficient matrix
Primal relation Dual variable
i
th
inequality: ≥ y
i
≥ 0
i
th
inequality: ≤ y
i
≤ 0
i
th
equation: = y
i
unrestricted in sign
Primal variable Dual relation:
x
j
≥ 0 j
th
inequality: ≤
x
j
≤ 0 j
th
inequality: ≥
x
j
unrestricted in sign j
th
equation: =
64
The dual of a standard form
Applying the correspondence rules of the previous table, the dual of the standard
form can be easily obtained.
Thus, the primal problem for the standard linear problem is
(P

) f = c
1
x
1
+ +c
n
x
n
→ minimize
subject to
_
¸
¸
_
¸
¸
_
a
11
x
1
+a
12
x
2
+ +a
1n
x
n
= b
1
. . .
a
m1
x
1
+a
m2
x
2
+ +a
mn
x
n
= b
m
x
1
, x
2
, . . . , x
n
≥ 0
and the dual problem for the standard LPP is
(D

) g = b
1
y
1
+b
2
y
2
+ +b
m
y
m
→ maximize
subject to
_
¸
¸
_
¸
¸
_
a
11
y
1
+a
21
y
2
+ +a
m1
y
m
≤ c
1
. . .
a
1n
y
1
+a
2n
y
2
+ +a
mn
y
m
≤ c
n
y
1
, . . . , y
m
unrestricted in sign
The matrix form of the previous problem is the following:
If the primal problem is
f(x) = c
t
x → minimize
subject to
_
Ax = b
x ≥ 0
then its dual is
g(y) = b
t
y
subject to
A
t
y ≤ c.
The dual variables y are unrestricted.
Remark 2. The dual of a dual of a primal LPP in standard form is itself the
primal LPP in standard form.
Duality theorems
A duality theorem is a statement of the range of possible values for the primal
problem versus the range of possible values for the dual problem. There are two major
results relating the primal and dual problems. The first, called ”weak” duality states
65
that primal objective values provide bounds for dual objective values, and viceversa.
The second, called ”strong” duality, states that the optimal values of the primal and
dual problems are equal, provided that they exist. Since every linear programming
problem can be converted to standard form, in the theoretical results below we work
with primal linear programs in standard form.
Theorem (The weak duality theorem). Let x be a BFS for the primal problem
in standard form, and let y be a BFS for the dual problem. Then f(x) ≥ g(y).
Proof. The constraints for the dual show that A
t
y ≤ c. By transposing the pre-
vious inequality we get y
t
A ≤ c
t
. Since x ≥ 0,
f(x) = c
t
x ≥ y
t
Ax = y
t
b = g(y).
There are several simple consequences of the weak duality theorem.
Corollary 1. If the primal is unbounded then the dual is infeasible. If the dual is
unbounded, then the primal is infeasible.
Corollary 2. If x is a feasible solution to the primal problem, y is a feasible
solution to the dual, and f(x) = g(y), then x and y are optimal for their respective
problems.
The previous result shows us that it is possible to check if the vectors x and y are
optimal without solving the primal and dual problems.
The previous theorem is called the weak duality theorem because it expresses only
the guarding of the primal problem by the dual problem, but it doesn’t say that the
guarding is perfect. The latter is expressed by the strong duality theorem.
Theorem (The strong duality theorem). Let a pair of primal and dual linear
programming problems. If one of the problems has an optimal solution then so does
the other, and the optimal values are equal.
Proof. We assume that
- the primal problem is in standard form
- the primal problem has an optimal basic feasible solution x.
By reordering the variables we can write x in terms of basic and nonbasic variables
x =
_
x
B
0
_
and correspondingly we have
A =
_
B N
_
, c =
_
c
B
c
N
_
and x
B
= B
−1
b.
Since x is optimal, then c
t
N
−c
t
B
B
−1
N ≥ 0.
Let y = (B
−1
)
t
c
B
.
We will show that y is a feasible solution and f(x) = g(y). Then Corollary 2 will
show that y is optimal for the dual.
First we check the feasibility
y
t
A = (B
−t
c
B
)
t
A = c
t
B
B
−1
A = c
t
B
B
−1
_
B N
_
=
_
c
t
B
c
t
B
B
−1
N
_

_
c
t
B
c
t
N
_
= c
t
.
66
Taking the transpose of the previous inequality we get A
t
y ≤ c and hence y satisfies
the dual constraints.
f(x) = c
t
x = c
t
B
x
B
= c
t
B
B
−1
b
g(y) = b
t
y = (b
t
y)
t
= y
t
b = c
t
B
B
−1
b.
So, y is feasible for the dual and f(x) = g(y). Hence by Corollary 2, y is optimal
for the dual.
The previous proof provides the optimal dual solution.
If
x =
_
x
B
x
N
_
, A =
_
B N
_
and c =
_
c
B
c
N
_
,
then the optimal values of the dual variables are given by
y = B
−t
c
B
.
Remark 3. If the given linear programming problem has a complete set of slack
variables, then the reduced costs for the slack variables are given by
c
t
N
−c
t
B
B
−1
N = 0
t
−c
t
B
B
−1
I = −(B
−t
c
B
)
t
= −y
t
because the objective coefficients (c
t
N
) for the slack variables are zero, and their con-
straint coefficients (N) are given by the identity matrix I. In this case the values of
the optimal dual variables are the opposites of the reduced costs of the slack variables.
More precisely:
To obtain the optimal values of the nonsurplus variables of the dual problem
negate the entries in the c
j
−f
j
row under the slack columns (of the primal problem).
The slack column corresponding to the first constraint of the primal problem yields
the first variable of the dual, and so on. To obtain the optimal surplus values for the
dual negate the entries in the c
j
− f
j
row under the nonslack columns. The column
corresponding to the first nonslack variable yields the surplus variables associated to
the first constraint of the dual problem, and so on.
Example 3.
f = −x
1
−2x
2
→ minimize
subject to
_
¸
¸
_
¸
¸
_
−2x
1
+x
2
≤ 2
−x
1
+ 2x
2
≤ 7
x
1
≤ 3
x
1
, x
2
≥ 0
The standard form of the previous problem is
_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
f = −x
1
−2x
2
→ minimize
−2x
1
+x
2
+x
3
= 2
−x
1
+ 2x
2
+x
4
= 7
x
1
+x
5
= 3
x
1
, x
2
, x
3
, x
4
, x
5
≥ 0
67
c −1 −2 0 0 0
c
B
B b x
1
x
2
x
3
x
4
x
5
ratio test
0 ← x
3
2 −2 1 1 0 0
0 x
4
7 −1 2 0 1 0 min
_
2
1
,
7
2
_
= 2
0 x
5
3 1 0 0 0 1
f
j
0 0 0 0 0 0
c
j
−f
j
− −1 −2 0 0 0
−2 x
2
2 −2 1 1 0 0
0 ← x
4
3 3 0 −2 1 0 min
_
3
3
,
3
1
_
= 1
0 x
5
3 1 0 0 0 1
f
j
−4 4 −2 −2 0 0
c
j
−f
j
− −5 0 2 0 0
−2 x
2
4 0 1 −
1
3
2
3
0
−1 x
1
1 1 0 −
2
3
1
3
0
0 ← x
5
2 0 0
2
3

1
3
1
f
j
−9 −1 −2
4
3

5
3
0
c
j
−f
j
− 0 0 −
4
3
5
3
0
−2 x
2
5 0 1 0
1
2
1
2
−1 x
1
3 1 0 0 0 1
0 → x
3
3 0 0 1 −
1
2
3
2
f
j
−13 −1 −2 0 −1 −2
c
j
−f
j
− 0 0 0 1 2
Hence f
min
= −13 and x
t
min
= (3, 5, 3, 0, 0).
The dual problem is
g = −2y
1
+ 7y
2
+ 3y
3
→ maximize
subject to
_
_
_
= 2y
1
−y
2
+y
3
≤ −1
y
1
+ 2y
2
≤ −2
y
1
, y
2
, y
3
≤ 0
g
max
= f
min
= −13, y
t
max
= (0, −1, −2).
Remark 4. If the given linear programming problem has a complete set of surplus
variables, then the reduced costs for the surplus variables are given by
c
t
N
−c
t
B
B
−1
N = 0
t
−c
t
B
B
−1
(−I) = (B
−t
c
B
)
t
= y
t
because the objective coefficients (c
t
N
) for the surplus variables are zero, and their
constraint coefficients (N) are given by −I. In this case the values of the optimal dual
variables are the same as the reduced costs of the surplus variables.
68
More precisely:
The optimal values of the nonslack variables of the dual problem are the entries in
the c
j
−f
j
row under the surplus columns of the primal problem. The surplus column
corresponding to the first constraint of the primal problem yields the first variable of
the dual, and so on.
The optimal values of the slack variables of the dual problem are the entries in
the c
j
− f
j
row under the nonsurplus columns of the primal problem. The column
corresponding to the first nonsurplus variable yields the slack variable associated to
the first constraint of the dual problem, and so on.
Example 4. We consider a very simple diet problem in which the nutrients are
starch, protein and vitamins. The foods are two types of grains with data given below.
Nutrient units/kg Minimum daily
of grain type requirement of nutrient
Nutrient 1 2 in units
Starch 2 1 2
Protein 1 2 2
Vitamins 2 2 3
Cost (RON/kg) of food 5 4
Determine the most economical diet which satisfies the basic minimum nutritional
requirements.
Solution. x
j
is the amount in kg of grain j included in the daily diet, j = 1, 2,
and the vector x = (x
1
, x
2
)
t
is the diet. Each of nutrients lead to a constraint. For
example the amount of vitamins contained in the diet is 2x
1
+ 2x
2
which must be
≥ 3.
The problem to be solved is the following
f(x) = 5x
1
+ 4x
2
→ minimize
subject to
_
¸
¸
_
¸
¸
_
2x
1
+x
2
≥ 2
x
1
+ 2x
2
≥ 2
2x
1
+ 2x
2
≥ 3
x
1
≥ 0, x
2
≥ 0
The simplex table associated to the standard LPP corresponding to (P) is:
69
c 5 4 0 0 0
c
B
B b x
1
x
2
x
3
x
4
x
5
ratio test
2 2 1 −1 0 0
2 1 2 0 −1 0 min
_
2
2
,
2
1
,
3
2
_
= 1
3 2 2 0 0 −1
5 → x
1
1 1
1
2

1
2
0 0
1 0
3
2
1
2
−1 0 min
_
1
1
2
,
1
3
2
,
1
1
_
=
2
3
1 0 1 1 0 −1
5 x
1
2
3
1 0 −
2
3
1
3
0
4 → x
2
2
3
0 1
1
3

2
3
0 min
_
2
3
1
3
,
1
3
2
3
_
=
1
2
1
3
0 0
2
3
2
3
−1
5 x
1
1 1 0 0 1 −1
4 x
2
1
2
0 1 0 −1
1
2
min
_
1
1
,
1
2
1
_
=
1
2
0 ← x
3
1/2 0 0 1 1 −
3
2
f
j
7 5 4 0 1 −3
c
j
−f
j
− 0 0 0 −1 3
5 x
4
1/2 1 0 −1 0 1/2
4 x
2
1 0 1 1 0 −1
0 → x
4
1/2 0 0 1 1 −
3
2
f
j
13
2
5 4 −1 0 −
3
2
c
j
−f
j
− 0 0 1 0
3
2
f
min
=
13
2
, x
t
=
_
1
2
, 1, 0,
1
2
, 0
_
The dual problem is
g = 2y
1
+ 2y
2
+ 3y
3
→ maximize
subject to
_
_
_
2y
1
+y
2
+ 2y
3
≤ 5
y
1
+ 2y
2
+ 2y
3
≤ 4
y
1
, y
2
, y
3
≥ 0
By using Remark 4 we obtain that g
max
=
13
2
and y
t
=
_
1, 0,
3
2
_
.
Complementary slackness
Theorem (Complementary Slackness). Consider a pair of primal and dual
linear programming problems with the primal problem in standard form.
a) If x is optimal for the primal and y is optimal for the dual then
x
t
(c −A
t
y) = 0 i.e.
70
n

j=1
x
j
(c
j
−(A
t
y)
j
) =
n

j=1
x
j
_
c
j

m

i=1
a
ij
y
i
_
= 0.
b) If x is feasible for the primal, y is feasible for the dual, and
x
t
(c −A
t
y) = 0
then x and y are optimal for their respective problems.
Proof. If x and y are feasible, then
f(x) = c
t
x ≥ (A
t
y)
t
x = y
t
Ax = y
t
b = (b
t
y)
t
= g(y).
If x and y are optimal then f(x) = g(y) so that
c
t
x = y
t
Ax,
where from we easily get
x
t
c = x
t
A
t
y and x
t
(c −A
t
y) = 0.
If x and y are optimal, then f(x) = g(y) and Corollary 2 shows that x and y are
optimal.
Example. We look again at the pair of LPP given by Example 3
f = −x
1
−2x
2
→ minimize
subject to
_
¸
¸
_
¸
¸
_
−2x
1
+x
2
+x
3
= 2
−x
1
+ 2x
2
+x
4
= 7
x
1
+x
5
= 3
x
1
, . . . , x
5
≥ 0
The optimal solutions are
x = (3, 5, 3, 0, 0)
t
and y = (0, −1, −2)
t
.
The dual constraints are
_
_
_
−2y
1
−y
2
+y
3
≤ −1
y
1
+ 2y
2
≤ −2
y
1
, y
2
, y
3
≤ 0
The complementary slackness theorem says
5

j=1
x
j
_
c
j

3

i=1
a
ij
y
i
_
= 3[−1 −(−2 0 −1(−1) + (−2))]
+5[−2 −(0 + 2(−1))] + 3[0 −(1 0 + 0(−1) + 0(−2))]
71
+0[0 −(0 0 + 1(−1) + 0(−2))] + 0[0 −(0 0 + 0(−1) + 1(−2))]
= 3 0 + 5 0 + 3 0 + 0 1 + 0 2 = 0.
The complementary slackness theorem conducted us to the following results.
Remark. Consider a pair of primal and dual linear programming problems. Let
x be optimal for the primal and y optimal for the dual.
1) If x
j
> 0 then
c
j

m

i=1
a
ij
y
i
= 0.
In other words if x
j
is a basic variable then its reduced cost (or dual slack vari-
able) is zero. Conversely, if a dual slack variable (reduced cost) is nonzero, then the
associated primal variable is nonbasic and hence zero.
2) It is possible to have both x
j
= 0 and c
j

m
i=1
a
ij
y
i
= 0, for example, when
the problem is degenerate and one of the basic variables is zero.
3) For a symmetric pair of primal and dual linear programming problems
(P) f = c
t
x → minimize
subject to
_
Ax ≥ b
x ≥ 0
and
(D) g = b
t
y → maximize
subject to
_
A
t
y ≤ c
y ≥ 0
the complementary slackness conditions are
x
t
(c −A
t
y) = 0 and y
t
(Ax −b) = 0.
The complementary slackness conditions have an economic interpretation:
Thinking in terms of the diet problem (see the first example, section 2.2) which
is a primal part of a symmetric pair of dual problems, suppose that the optimal diet
supplies more than b
j
units of the j
th
nutrient. This means that the dietician will not
pay anything for small quantities of that nutrient, since availability of it would not
reduce the cost of the optimal diet. This implies y
j
= 0 which is (3) of Remark 5.
Marginal values. Shadow prices
Consider the LPP in standard form
f(x) = c
t
x → minimize
subject to
72
_
Ax = b
x ≥ 0
where A is a mn matrix and rank A = m.
The marginal value (or shadow price) of a constraint i is defined to be the rate
of the change in the objective function as a result of change in the values of b
i
, the
right-hand side of constraint i.
Suppose we keep all the other data in the problem fixed at their current value,
except b
i
. Then as b
i
varies, the optimum objective value in the problem is a function
of b
i
which we denote by F(b
i
)(= c
t
x). Then the marginal value of b
i
in the problem
is F

(b
i
).
By using the limit definition of the derivative
F

(b
i
) = lim
h→0
F(b
i
+h) −F(b
i
)
h
≈ F(b
i
+ 1) −F(b
i
).
By using the previous approximation we can say that a shadow price is the amount
the optimal value of the objective function would change if the right-hand side of a
constraint is increased by one unit.
If the given LPP has a nondegenerate optimum basic feasible solution, then the
dual problem has a unique optimal solution y and c
t
x = b
t
y. The previous equality
can be used to show that the marginal value associated with b
i
is y
i
. Since the solution
x is nondegenerate, small changes in any b
i
will not change the optimal dual solution.
Under nondegeneracy the change in value of f for small changes in b
i
is obtained
by partially differentiating F, F(b
i
) = b
t
y with respect to b
i
, as follows
F

(b
i
) =
_
m

i=1
b
i
y
i
_

b
i
= y
i
.
Example. [14] A mining company owns two different mines that produce a given
kind of ore. The mines are located in different parts of the country and have different
production capacities. After crushing, the ore is graded into three classes: high-grade,
medium-grade and low-grade ores. There is some demand for each grade of ore. The
mining company has contracted to provide a smelting plant with 12 tons of high-
grade, 8 tons of medium-grade, and 24 tons of low-grade ore. It costs the company $
200 per day to run the first mine and $ 160 per day to run the second. However, in a
day’s operation the first mine produces 6 tons of high grade, 2 tons of medium-grade
and 4 tons of low-grade ore, while the second mine produces daily 2 tons of high
grade, 2 tons of medium-grade, and 12 tons of low-grade ore. How many days should
each mine be operated in order to fulfill the company’s orders most economically?
Solution. First, we summarize the problem in the following table:
High-grade Medium-grade Low-grade Cost
ore ore ore
Mine 1 6 2 4 200
Mine 2 2 2 12 160
Requirements 12 8 24
73
x
1
- the number of days that mine 1 operates
x
2
- the number of days that mine 2 operates
f(x) = 200x
1
+ 160x
2
→ minimize
subject to
_
¸
¸
_
¸
¸
_
6x
1
+ 2x
2
≥ 12
2x
1
+ 2x
2
≥ 8
4x
1
+ 12x
2
≥ 24
x
1
≥ 0, x
2
≥ 0
The standard form of the previous problem is
f(x) = 200x
1
+ 160x
2
subject to
_
¸
¸
_
¸
¸
_
6x
1
+ 2x
2
−x
3
= 12
2x
1
+ 2x
2
−x
4
= 8
4x
1
+ 12x
2
−x
5
= 24
x
1
, x
2
, x
3
, x
4
, x
5
≥ 0
c 200 160 0 0 0
c
B
B b x
1
x
2
x
3
x
4
x
5
ratio test
12 6 2 −1 0 0
8 2 2 0 −1 0 min
_
12
6
,
8
2
,
24
4
_
= 2
24 4 12 0 0 −1
200 → x
1
2 1
1
3

1
6
0 0 min
_
2
1
3
,
4
4
3
,
16
32
3
_
4 0
4
3
1
3
−1 0 = min
_
6, 3,
3
2
_
=
3
2
16 0
32
3
2
3
0 −1
200 x
1
3
2
1 0 −
3
16
0
1
32
min
_
3
2
1
32
,
2
1
8
_
2 0 0
1
4
−1
1
8
= min¦48, 8¦
160 → x
2
3
2
0 1
1
16
0 −
3
32
200 x
1
1 1 0 −
1
4
1
4
0
0 → x
5
16 0 0 2 −8 1
160 x
2
3 0 1
1
4

3
4
0
f
j
680 200 160 −10 −70 0
c
j
−f
j
− 0 0 10 70 0
In conclusion f
min
= 680, x
min
= (1, 3, 0, 0, 16).
The minimum operating cost is $ 680 and it is achieved by operating the first mine
one day and the second mine three days.
If the mines are operated as indicated, then the combined production will be
6 + 2 3 = 12 tons of high-grade ore, 2 + 2 3 = 8 tons of medium-grade ore and
4 + 12 3 = 40 tons of high-grade ore.
74
We can observe that the low-grade ore is overproduced (with 16 tons).
The dual problem is
g = 12y
1
+ 8y
2
+ 24y
3
→ maximize
subject to
_
6y
1
+ 2y
2
+ 4y
3
≤ 200
2y
1
+ 2y
2
+ 12y
3
≤ 160
From the simplex table and Remark 4 we get that
y
1
= 10, y
2
= 70, y
3
= 0, y
4
= 0, y
5
= 0, g
max
= 680.
The first step in interpreting the solution to the dual problem is that of determining
the dimensions of the variables involved. We will determine the dimensions of the
variables of both the primal and dual problems by following the next two rules:
a) the dimension of x
j
is the ratio of the dimension of b divided by the dimension
of a
ij
for any i
b) the dimension of y
i
is the ratio of the dimension of c
j
divided by the dimension
of a
ij
for any j.
In our example we already know that the dimensions of x
1
and x
2
are days.
dimension of y
1
=
The dimension of c
1
dimension of a
11
=
$/day
tons-Hg/day
=
$
tons-Hg
In the same way
The dimension of y
2
=
$
tons-Mg
and the dimension of y
3
=
$
tons-Lg
.
The next step is to look at the optimal dual solution and give its interpretation.
We know that y
1
= 10 has dimension $/ton of high-grade ore, which sounds like the
imputed cost of producing an additional ton of high grade ore and we shall show that
this is the case. Suppose we increase the requirements for high-grade ore production
from 12 to 16 tons.
The new problem is
f = 200x
1
+ 160x
2
→ minimize
subject to
_
¸
¸
_
¸
¸
_
6x
1
+ 2x
2
≥ 16
2x
1
+ 2x
2
≥ 8
4x
1
+ 12x
2
≥ 24
x
1
≥ 0, x
2
≥ 0
75
c 200 160 0 0 0
c
B
B b x
1
x
2
x
3
x
4
x
5
ratio test
16 6 2 −1 0 0
8 2 2 0 −1 0 min
_
16
2
,
8
2
,
24
2
_
= 2
24 4 12 0 0 −1
12
16
3
0 −1 0
1
6
4
4
3
0 0 −1
1
6
min
_
8
1
0
,
4
1
0
_
= 24
160 → x
2
2
1
3
1 0 0 −
1
12
8 4 0 −1 1 0
0 → x
5
24 8 0 0 −6 1 min
_
8
4
,
24
8
,
42
1
_
= 2
160 x
2
4 1 1 0 −
1
2
0
200 x
1
2 1 0 −
1
4
1
4
0
0 x
5
8 0 0 2 −4 1
160 x
2
2 0 1
1
4

3
4
0
f
j
720 200 160 −10 −70 0
c
j
−f
j
0 0 10 70 0
The optimal solution to the new problem x
1
= 2, x
2
= 2.
Notice that the cost of production has increased from 680 to 720 which is 4y
1
=
4 10 = 40. Hence y
1
= 10 is the cost/ton of each of the additional high-grade ore. The
fact that y
3
= 0 (which has dimension $/ton of low-grade ore) says that the low-grade
ore is free in the sense that producing an additional ton has zero cost (because there
is already an overproduction of 16 tons, so the additional ton will cost zero to produce
since it already exists).
Remark 1. The interpretation of a pair of symmetric dual linear programming.
a) For either problem the matrix A will be called the matrix of technological
coefficients.
b) If the original problem is minimizing, we interpret x as the activity vector. b is
interpreted as the requirements vector, whose components give the minimum amounts
of each good that must be produced. The vector c is the cost vector, whose entries
give the unit costs of each of the activities. The vector y (the solution of the dual
problem) is the imputed-cost vector, whose components give the imputed costs of
producing additional amounts of each of the required goods (provided the changes in
requirements are sufficiently small that the dual solution remains optimal).
c) If the original problem is maximizing we interpret x as the activity vector.
Then the vector b is interpreted as the capacity-constraint vector, whose components
give the amounts of resources that can be demanded by a given activity vector. The
vector c is the profit vector, whose entries give the unit profits for each component
of the activity vector x. The vector y is the imputed-value vector, whose entries give
the imputed values of each of the resources that enter into the productive process
(provided the changes in resources are sufficiently small that the dual solution remains
optimal).
76
Remark 2. Suppose in the linear programming problem
f = c
t
x → minimize
subject to
_
Ax = b
x ≥ 0
Assume that the optimal basis is B with corresponding solution (x
B
, 0) where
x
B
= B
−1
b. A solution to the corresponding dual problem is y = B
t
c
B
.
Assuming nondegeneracy, small changes in the vector b will not cause the optimal
basis to change. Thus for b + ∆b the optimal solution is x = (x
B
+ ∆x
B
, 0) where
∆x
B
= B
−1
∆b. Thus the corresponding increment in the cost function is
∆f = ∆g = c
t
B
∆x
B
= y
t
∆b.
This equation shows that y give the sensitivity of the optimal cost with respect to
small changes in the vector b. If a new problem is solved with b changed to b + ∆b,
the change in optimal value of the objective function will be λ
t
∆b, where λ
j
is the
marginal price of the component b
j
, since if b
j
is changed to b
j
+∆b
j
the value of the
optimal solution changes by λ
j
∆b
j
.
Game theory
Game theory is a mathematical approach to the problems of strategy such as one
can find in the theory of operational research or economy. This theory is frequently
and naturally used in every day life.
Game theory is used to analyse situations where for two or more individuals (or
institutions) the outcome of an action by one of them depends not only on the action
taken by that individual but also on the actions taken by the others. The strategies
of individuals will be dependent on expectations about the others are doing. These
games are called games of strategy and the participants are called players. The players
of such a game need to take into account the possible actions of the others when they
make decisions.
Strategic thinking characterizes many human interactions. Here are some exam-
ples:
a) Two firms with large market shares in a particular industry makings decisions
with respect to price and output.
b) The decision of a firm to enter a new market where there is a risk that the
existing firms will try to fight entry.
c) A criminal deciding whether to confess or not to a crime that he has committed
with an accomplice who is also questioned by the police.
d) Family members arguing over the division of work within the household.
We shall focus on the simplest type of game, called the finite two-person zero-sum
game, or matrix game for short.
77
Matrix games
A matrix game is a two-person game defined as follows. Each of two persons selects
(independently) an action from a finite set of choices and both reveal to each other
their choice. If we denote the first player’s choice by i (i = 1, m) and the second
player’s choice by j (j = 1, l) then the rules of the game stipulate that the first
player’s payoff is a
ij
. We shall refer to the first player as the row player (R) and the
second player as the column player (C).
The matrix of possible payments A = [a
ij
]
1≤i≤m
1≤j≤l
is known to both of players before
the game begins.
More explicitly: If a
ij
> 0, C pays to R an amount of a
ij
; if a
ij
< 0, R pays C an
amount of [a
ij
[; if a
ij
= 0, no money is won or lost.
The main properties of a matrix game are:
• there are two players (two-person game)
• each player has finitely many choices of play, each makes one choice, and the
combination of the two choices determines a payoff (finite game)
• what one player wins, the other loses (zero-sum game).
We present now some examples:
Example 1. Paper-Scissors-Rock game
This is a two-person game in which each player declares either Paper, Scissors or
Rock. If both players declare the same object, then the payoff is 0. Paper loses to
Scissors since scissors can cut a piece of paper. Scissors loses to Rock since a rock
can dull scissors and finally Rock loses to Paper since a piece of paper can cover up
a rock. The payoff is 1 in these cases.
The payoff matrix is:
C player
Paper Scissors Rock
Paper
R player Scissors
Rock
_
_
0 −1 1
1 0 −1
−1 1 0
_
_
Example 2. Morra game
The two players simultaneously show either one or two fingers, and at the same
time, each player announce a number.
If the number announced by one of the players is the same as the number of fingers
showed by both players, then he wins that number from the opponent (if both players
guess right then the payment is zero).
Each player has four possible strategies. If he shows one finger then he may guess
two or three (he will never guess four in this case, but this strategy never wins so he
will eliminate it). If he shows two fingers then he may guess three or four.
If we denote by R
ij
and C
ij
the strategy of showing i fingers and guessing the
number j then the following payoff matrix will be associated to Morra game:
78
C
12
C
13
C
23
C
24
R
12
R
13
R
23
R
24
_
_
_
_
0 2 −3 0
−2 0 0 3
3 0 0 −4
0 −3 4 0
_
_
_
_
Example 3. Two stores R and C, are planning to locate in one of two towns.
Town 1 has 70 percent of the population while town 2 has 30 percent. If both stores
locate in the same town they will split the total business of both towns equality, but
if locate in different towns each will get the business of that town. Where should each
store locate?
The payoff matrix is:
Store C locates in
1 2
Store R 1
locates in 2
_
50 70
30 50
_
The entries of the payoff matrix represent the percentages of business of store R
(or of the percentage loses of business by C).
It is easy to see that store R should prefer to locate in town 1 because by this
choice R can assure himself of 20 percent more business in town 1 than in town 2.
Similarly, store C also prefers to locate in town 1 because he will lose 20 percent less
business in town 1 than in 2.
Hence the best strategies are for each store to locate in town 1.
By a strategy for R in a matrix game, A, we mean a decision by R to play
the various rows with a given probability distribution, i.e. to play the first row with
probability p
1
, to play the second row with probability p
2
, and so on.
This strategy for R is represented by the probability vector
p = (p
1
, . . . , p
m
),
m

k=1
p
k
= 1,
where p
i
, i = 1, m, represents the probability of choosing (by player R) the row i.
In the same way, by a strategy for C we mean a decision by C to play the
various columns with a given probability distribution, i.e. to play the first column
with probability q
1
, to play the second column with probability q
2
, and so on.
This strategy for C is represented by the probability vector
q = (q
1
, . . . , q
l
),
l

j=1
q
j
= 1,
where q
j
, j = 1, l, represents the probability of choosing (by player C) the column j.
A strategy which contains a 1 as component (and in consequence 0 everywhere
else) is called a pure strategy, otherwise it is called a mixed strategy.
79
In the case of a pure strategy the player R (respectively the player C) decides to
play always a given row (respectively a given column).
When R plays row i with probability p
i
(i = 1, m) and C plays column j with
probability q
j
(j = 1, l) then the payoff a
ij
will be realized with probability p
i
q
j
.
Hence the expected winning of R is:
E(p, q) = p
1
q
1
a
11
+p
1
q
2
a
12
+ +p
1
q
l
a
1l
+ +
+p
m
q
1
a
m1
+p
m
q
2
a
m2
+ +p
m
q
l
a
ml
=
m

i=1
p
i
l

j=1
q
j
a
ij
= (p
1
, . . . , p
m
)
_
_
_
a
11
. . . a
1l
.
.
.
.
.
.
a
m1
. . . a
ml
_
_
_
_
_
_
q
1
.
.
.
q
l
_
_
_
= pAq
t
=
l

j=1
q
j
m

i=1
p
i
a
ij
= (q
1
, . . . , q
l
)
_
_
_
a
11
. . . a
m1
.
.
.
.
.
.
a
1l
. . . a
ml
_
_
_
_
_
_
p
1
.
.
.
p
m
_
_
_
= qA
t
p
t
In conclusion:
E(p, q) = pAq
t
= qA
t
p
t
.
The player R tries to choose a row i (i = 1, m) such that the expected value of his
winnings is maximal no matter what column the player C chooses.
The player C tries to choose a column j (j = 1, l) such that the expected value of
his loses is minimal no matter what row the player R chooses.
We say that the game with payoff matrix A has the value v, and we call p
0
and
q
0
optimal strategies, if
E(p
0
, q) ≥ v, for every strategy q for C (∗)
E(p, q
0
) ≤ v, for every strategy p for R (∗∗)
Remark 1.
a) If p
0
is a given strategy for player R in a matrix game A then the following two
conditions are equivalent:
(i) E(p
0
, q) ≥ v, for every strategy q for C
(ii) p
0
A ≥ (v, v, . . . , v).
b) If q
0
is a given strategy for player C in a matrix game A then the following two
conditions are equivalent:
(i) E(p, q
0
) ≤ v, for every strategy p for C
(ii) q
0
A
t
≤ (v, v, . . . , v).
Proof. a) Assume that (i) holds and that p
0
A = (a
1
, a
2
, . . . , a
l
). Choosing the
pure strategy q = (1, 0, . . . , 0) we have
E(p
0
, q) = p
0
Aq
t
= (a
1
, a
2
, . . . , a
l
)(1, 0, . . . , 0)
t
= a
1
≥ v.
Similarly a
2
≥ v, . . . , a
l
≥ v. In other words
p
0
A ≥ (v, v, . . . , v).
80
On the other hand, assume that (ii) holds. Then, for any strategy q for C,
E(p
0
, q) = p
0
Aq
t
≥ (v, v, . . . , v)(q
1
, q
2
, . . . , q
l
)
t
=
l

j=1
vq
l
= v 1 = v.
b) Assume that (i) holds and that q
0
A
t
= (b
1
, b
2
, . . . , b
m
). Choosing the pure
strategy p = (1, 0, . . . , 0) we have
E(p, q
0
) = q
0
A
t
p
t
= (b
1
, b
2
, . . . , b
m
)(1, 0, . . . , 0)
t
= b
1
≤ v.
Similarly, b
2
≤ v, . . . , b
m
≤ v. In other words
q
0
A
t
≤ (v, v, . . . , v).
On the other hand, assume that (ii) holds. Then, for any strategy p for R,
E(p, q
0
) = q
0
A
t
p
t
≤ (v, v, . . . , v)(p
1
, p
2
, . . . , p
m
)
t
=
m

i=1
vp
i
= v 1 = v.
As we can observe from the previous proof:
- the inequality p
0
A ≥ (v, v, . . . , v) can be written as E(p
0
, q) ≥ v for every
strategy q for C
- the inequality q
0
A
t
≤ (v, v, . . . , v) can be written as E(p, q
0
) ≤ v for every pure
strategy p for R.
In view of the previous remark we say that the game with payoff matrix A has
the value v, and p
0
, q
0
optimal strategies if
p
0
A ≥ (v, v, . . . , v) (∗

)
q
0
A
t
≤ (v, v, . . . , v) (∗∗

)
We conclude this subsection by proving three results that characterize the value
and optimal strategies of a game.
Remark 2. If A is a matrix game that has a value and optimal strategies then
the value of the game is unique.
Proof. Suppose that v and w are two values for the matrix game A. If p
0
and q
0
are optimal strategy vectors associated with the value v then
(i) p
0
A ≥ (v, v, . . . , v)
(ii) q
0
A
t
≤ (v, v, . . . , v).
If p
1
and q
1
are optimal strategy vectors associated with the value w then
(iii) p
1
A ≥ (w, w, . . . , w)
(iv) q
1
A
t
≤ (w, w, . . . , w).
If we multiply (i) on the right by (q
1
)
t
we get
p
0
A(q
1
)
t

l

j=1
vq
1
j
= v.
81
In the same way, multiplying (iv) on the right by (p
0
)
t
gives
p
0
A(q
1
)
t
= (p
0
A(q
1
)
t
)
t
= q
1
A
t
(p
0
)
t

m

j=1
wp
1
i
= w.
The two inequalities obtained before show that w ≥ v.
Similarly, if we multiply (ii) on the right by (p
1
)
t
and (ii) on the right by (q
0
)
t
we
obtain v ≥ p
1
A(q
0
)
t
and p
1
A(q
0
)
t
≥ w, which together imply v ≥ w.
In consequence w = v, which completes the proof.
Remark 3. If A is a matrix game with value v and optimal strategies p
0
and q
0
,
then v = p
0
A(q
0
)
t
.
Proof. The following inequalities are true:
p
0
A ≥ (v, v, . . . , v) and q
0
A
t
≤ (v, v, . . . , v).
Multiplying the first of these inequalities on the right by (q
0
)
t
, we get
p
0
A(q
0
)
t
≥ v.
Similarly, multiplying the second inequality on the right by (p
0
)
t
, we obtain
p
0
A(q
0
)
t
= (p
0
A(q
0
)
t
)
t
= q
0
A
t
(p
0
)
t
≤ v.
These two inequalities together imply that
v = p
0
A(q
0
)
t
.
The previous two remarks allow us to interpret the value of a game as an expected
value in the following way: If the matrix game is played repeatedly and if each time
the player R chooses the p
0
strategy and the player C chooses the q
0
strategy, then
the value of the matrix game A is the expected value of the game for R.
Remark 4. If A is a matrix game with value v and optimal strategies p
0
and q
0
,
then v is the largest expectation that R can assure for himself and w is the smallest
expectation that C can assure for himself.
Proof. Let p any strategy vector of R; then multiplying the inequality q
0
A
t

(v, . . . , v) on the right by (p
0
)
t
, we get
pA(q
0
)
t
= (q
0
A
t
p
t
)
t
= q
0
A
t
p
t
≤ v.
So, if C plays optimally, the most that R can obtain for himself is v.
On the other hand, since v = p
0
A(q
0
)
t
, R can obtain for himself an expectation
of v.
The proof of the other statement of the remark is similar.
The previous remark tells us that the value of a game is the ”best” that a player
can obtain for himself (by using the optimal strategies).
82
Strictly determined games. Saddle point
A matrix game is strictly determined if the matrix has an entry which is a
minimum in its row and a maximum in its column; such an entry is called a ”saddle
point”.
Remark 5. Let v a saddle point of a strictly determined game. Then an optimum
strategy for R is to play the row containing v, an optimum strategy for C is to play
the column containing v, and v is the value of the game.
Proof. Suppose v = a
ij
, so p
0
= (0, 0, . . . , 1, . . . , 0) (1 is the value of the i
th
coordinate of p
0
) and q
0
= (0, . . . , 1, . . . , 0) (1 is the value of the j
th
coordinate of q
0
).
We now show that p
0
, q
0
and v satisfy the required properties to be optimum
strategies and the value of the game. Indeed,
p
0
A = (0, . . . , 1, . . . , 0) A = (a
i1
, . . . , a
ij
, . . . , a
il
)
≥ (a
ij
, . . . , a
ij
) = (v, . . . , v)
q
0
A
t
= (0, . . . , 1, . . . , 0)A
t
= (a
1j
, . . . , a
ij
, . . . , a
mj
)
≤ (a
ij
, . . . , a
ij
) = (v, . . . , v).
Thus in a strictly determined game a pure strategy for each player is an optimum
strategy:
for player R: to choose a row that contains a saddle value
for player C: to choose a column that contains a saddle value.
Example 4. We consider a generalization of Example 3 in which the stores R and
C are trying to locate in one of the three towns in figure below.
50
30 20
30 km
18 km
24 km
Town 1
Town 2
Town 3
If both stores locate in the same town they split all business equally, but if they
locate in different towns then all the business in the town that doesn’t have a store
will go to the closer of the two stores.
The payoff matrix for this game is the following:
83
Store C locates in
1 2 3
Store R 1
locates in 2
3
_
_
50 50 80
50 50 80
20 20 50
_
_
If we circle the minimum entry in each row and put a square around the maximum
entry in each column we obtain
1 2 3
1
2
3
_
_
_
_
_
_
50 50 80
50 50 80
20 20 50
_
_
_
_
_
_
Each of the four 50 entries in the 22 matrix in the upper left-hand corner is both
circled and boxed and so is a saddle value of the matrix. Hence the game is strictly
determined, and optimal strategies are:
• for store R: locate in town 1 or locate in town 2, represented by the vectors (1, 0, 0)
and (0, 1, 0) respectively.
: combining the previous two strategies we get the following mixed strategy
: locate in town 1 with probability p and locate in town 2 with probability 1−p
represented by the vector
p(1, 0, 0) + (1 −p)(0, 1, 0) = (p, 1 −p, 0), 0 < p < 1
• for store C: locate in town 1, locate in town 2 and ”locate in town 1 with probability
q and locate in town 2 with probability 1−q” represented by the vectors: (1, 0, 0),
(0, 1, 0) and (q, 1 −q, 0).
2 2 matrix games
Consider the matrix game
A =
_
a
11
a
12
a
21
a
22
_
.
If A is strictly determined, then the solution is presented above. Thus we need
only to consider the case in which A is non-strictly determined.
Criterion. The 2 2 matrix game is non-strictly determined if and only if each
of the entries on one of the diagonals is greater than each of the entries on the other
diagonal i.e. one of the following situations are fulfilled:
(i) a
11
, a
22
> a
12
and a
11
, a
22
> a
21
or
(ii) a
12
, a
21
> a
11
and a
12
, a
21
> a
22
.
84
Proof. If either of the conditions (i) or (ii) holds, it is easy to check that no entry
of the matrix is simultaneously the minimum of the row and the maximum of the
column in which it occurs hence the game is not strictly determined.
In order to prove the other part of the criterion we observe first that if two of the
entries in the same row or the same column of A are equal then the game is strictly
determined: hence the entries in the same row (or column) are different.
Suppose now that a
11
> a
12
; then a
22
> a
12
or else a
12
is a row minimum and a
column maximum; then also a
22
> a
21
or else a
22
is a row minimum and a column
maximum; then also a
11
> a
21
or else a
21
is a row minimum and a column maximum.
In a similar manner the assumption a
11
< a
12
leads to case (ii). This completes
the proof of the theorem.
In order to determine the optimal strategies for a 2 2 non-strictly determinate
game we have the following result:
Theorem 1. If the 2 2 matrix game A is non-strictly determined then p
0
=
(p
0
1
, p
0
2
) is an optimal strategy for player R, q
0
= (q
0
1
, q
0
2
) is an optimal strategy for
player C and v is the value of the game, where
p
0
1
=
a
22
−a
21
a
11
+a
22
−a
12
−a
21
, p
0
2
=
a
11
−a
12
a
11
+a
22
−a
12
−a
21
q
0
1
=
a
22
−a
12
a
11
+a
22
−a
12
−a
21
, q
0
2
=
a
11
−a
21
a
11
+a
22
−a
12
−a
21
and
v =
a
11
a
22
−a
12
a
21
a
11
+a
22
−a
12
−a
21
=
det A
a
11
+a
22
−a
12
−a
21
.
Proof. We have to see that the values above satisfy the following conditions:
p
0
A ≥ (v, v)
q
0
A
t
≤ (v, v)
which are equivalent to:
p
0
1
a
11
+p
0
2
a
21
≥ v
p
0
1
a
12
+p
0
2
a
21
≥ v
q
0
1
a
11
+q
0
2
a
12
≤ v
and
q
0
1
a
21
+q
0
2
a
22
≤ v.
It is easy to verify that the previous formulas are true, and in consequence the
proof is complete.
Example 5. (a simplified version of Morra game)
Each of the two players R and C simultaneously shows one or two fingers. If the
sum of the fingers shown is even, R wins the sum from C; if the sum is odd, R loses
the sum to C. The matrix is the following:
85
C shows
1 2
R 1
shows 2
_
2 −3
−3 4
_
It is easy to see that the game is nonstrictly determined so we apply the formulas
presented in Theorem 1
v =
2 4 −(−3)(−3)
2 + 4 + 3 + 3
= −
1
12
.
Thus the game is in favor of player C. Optimum strategies p
0
for R and q
0
for C
are as follows:
p
0
=
_
7
12
,
5
12
_
, q
0
=
_
7
12
,
5
12
_
.
Remark 6. If a matrix game contains a row (column) whose elements are smaller
or equal (greater or equal) then the elements of other row (column) then the smaller
(greater) row (column) is called a recessive row (column). Clearly, player R (player
C) would never play the recessive row (column), that’s why the recessive row (column)
can be omitted from the game.
Example 6. Consider the matrix game
A =
_
_
−4 −3 1
2 −1 2
−2 3 4
_
_
.
Note that (−4, −3, 1) ≤ (2, −1, 2) i.e. the first row is recessive and can be omitted
from the game and the game may be reduced to
_
2 −1 2
−2 3 4
_
.
Now observe that the third column is recessive since each entry is greater or equal
to the corresponding entry in the second column. Thus the game may be reduced to
the 2 2 game
A

=
_
2 −1
−2 3
_
.
The solution to the game A

can be found by using the formulas in Theorem 1
and is
v =
4
8
=
1
2
; p

0
=
_
5
8
,
3
8
_
; q

0
=
_
4
8
,
4
8
_
.
Thus the solution to the original game A is
v =
1
2
, p
0
=
_
0,
5
8
,
3
8
_
and q
0
=
_
1
2
,
1
2
, 0
_
.
86
2 l and m2 matrix games
In the case in which one of the players has just 2 strategies we can solve the game
geometrically.
Example 6. Consider the game whose matrix is:
A =
_
1 0 −1 0
−3 −2 1 2
_
.
Since the fourth column is recessive, we can omit it from the game which can be
reduced to
A

=
_
1 0 −1
−3 −2 −1
_
.
The player R plays an arbitrary strategy
p = (p
1
, p
2
) = (1 −p
2
, p
2
).
If the player C chooses column 1, then the expected payment y is:
y = 1 p
1
−3p
2
= 1 −p
2
−3p
2
= 1 −4p
2
.
If the player C chooses column 2 then
y = 0 p
1
−2p
2
= −2p
2
.
If the player C chooses column 3 then
y = −p
1
+p
2
= −(1 −p
2
) +p
2
= −1 + 2p
2
.
Notice that each of these expectations expresses y as a linear function of p
2
. Hence
the graph of these expectations will be straight line in each case. Since we gave the
restriction 0 ≤ p
2
≤ 1, we are interested only in segment for which p
2
satisfies the
restrictions.
87
¸
Maximum
1
0
−1
−1/2
−2
−3
Column 3
y = −1 + 2p
2
1
p
2
axis
Column 2
y = −2p
2
Column 1
y = 1 −4p
2
`
y axis
The player C will minimize his own expectation (his losses) by choosing the lowest
of the three lines presented in the above figure. Now R is the maximizing player, so he
will try to get the maximum of this function. This maximum occurs at the intersection
of the lines corresponding to the columns 2 and 3 (−1 + 2p
2
= −2p
2
) when p
2
=
1
4
,
p
0
=
_
3
4
,
1
4
_
and the value of the game is v = −2
1
4
= −
1
2
.
We can find an optimal strategy for player C by considering the 2 2 subgame of
A consisting of the second and third columns:
_
0 −1
−2 1
_
.
Applying the formulas from Theorem 1 we obtain the strategy q
0
=
_
1
2
,
1
2
_
. We
can extend q
0
to an optimal strategy for player C in A by adding two zero entries
thus:
q
0
=
_
0 1/2 1/2 0
_
.
A similar method to that presented in the previous example works to solve games
in which the column player has just two strategies and the row player has more than
2.
Example 7. Consider the game whose matrix is
A =
_
_
6 −1
0 4
4 3
_
_
.
88
The player C plays an arbitrary strategy
q = (q
1
, q
2
) = (1 −q
2
, q
2
).
If the player R chooses row i (i = 1, 3) then the expectation that player R has is:
If R chooses row 1: y = 6q
1
+ (−1)q
2
= 6(1 −q
2
) −q
2
= 6 −7q
2
.
If R chooses row 2: y = 0 q
1
+ 4q
2
= 4q
2
.
If R chooses row 3: y = 4q
1
+ 3q
2
= 4(1 −q
2
) + 3q
2
= 4 −q
2
.
¸
6
4
−1
Row 1
y = 6 −7q
2
Row 3
y = 4 −q
2
Row 2
y = 4q
2
q
2
axis
`
y axis
The player R will maximize his own expectation (his winnings) by choosing the
greatest of the three lines presented in the above figure. C is the minimizing player,
so he will try to get the minimum of this function. This minimum occurs at the
intersection of the lines corresponding to the rows 2 and 3 (4q
2
= 4 − q
2
) when
q
2
=
4
5
, q
0
=
_
1
5
,
4
5
_
and the value of the game is v = 4 −
4
5
=
16
5
.
We can find an optimal strategy for player R by considering the 2 2 subgame of
A consisting of the second and third rows:
_
0 4
4 3
_
.
Applying the formulas from Theorem 1 we obtain the strategy p
0
=
_
1
5
,
4
5
_
. We
can extend p
0
to an optimal strategy for player R in A by adding one zero entry thus:
p
0
=
_
0,
1
5
,
4
5
_
.
89
Next, we will prove the Von Neumann’s theorem (which says that every zero-sum,
two-person game has a value in mixed strategies). The proof of Von Neumann uses
the Brower fixed-point theorem. The proof that will be presented below uses an idea
of George Dantzig and linear programming theory. Dantzig’s proof is better that von
Neumann’s because it is elementary and because it shows how to construct a best
strategy.
Theorem 2. (John von Neumann’s theorem)
Let A be any real matrix. Then the zero-sum, two-person game with payoff matrix
A has a value v which satisfies p
0
A ≥ (v, v, . . . , v) and q
0
A
t
≤ (v, v, . . . , v) for some
optimal strategies p and q.
Proof. With no loss of generality, assume that all a
ij
are positive. Otherwise, if
a
ij
+ α is positive for all i and j then the optimality conditions p
0
A ≥ (v, v, . . . , v)
and q
0
A
t
≤ (v, v, . . . , v) may be replaced by
m

i=1
p
i
(a
ij
+α) =
m

i=1
p
i
a
ij
+α ≥ v +α, j = 1, l
and
l

j=1
(a
ij
+α)q
j
=
l

j=1
a
ij
q
j
+α ≤ v +α.
Assuming all a
ij
> 0, we will construct a number v > 0 satisfying the optimality
conditions. First we observe that the optimality conditions can be written as
m

i=1
p
i
v
a
ij
≥ 1, j = 1, l
l

j=1
q
j
v
a
ij
≤ 1, i = 1, m.
If we define the unknowns x
j
=
q
j
v
(j = 1, l), y
i
=
p
i
v
(i = 1, m) then the previous
inequalities become:
l

j=1
a
ij
x
j
≥ 1, i = 1, m
m

i=1
a
ij
y
i
≤ 1, j = 1, l
with x ≥ 0, y ≥ 0 and
l

j=1
x
j
=
l

j=1
q
j
v
=
1
v
,
m

i=1
y
i
=
m

i=1
p
i
v
=
1
v
.
90
The required vectors x ≥ 0 and y ≥ 0 must solve the following linear programming
problems:
(P) f =
l

j=1
x
j
→ maximize (since
l

j=1
x
j
=
1
v
and v → minimized
according to the fact that the column player is a minimizing player)
subject to
_
¸
_
¸
_
l

j=1
a
ij
x
j
≥ 1, i = 1, m
x
1
, . . . , x
l
≥ 0
(D) g =
m

i=1
y
i
→ minimize (since
m

i=1
y
i
=
1
v
and v → maximized
according to the fact that the row player is a maximizing player)
subject to
_
¸
_
¸
_
m

i=1
a
ij
y
i
≤ 1
y
i
≥ 0, i = 1, l
The previous two linear programming problems are a symmetric pair of dual prob-
lems. These problems have optimal solutions because they both have feasible solutions
(since all a
ij
are positive a vector y is feasible if all its components are large; the vec-
tor x = 0 is feasible for the primal). By duality theorem, these linear programming
problems have optimal solutions x, y and the same optimal value which is denoted
by
1
v
.
We easily can see that v, p and q satisfy the von Neumann’s theorem if p = vy
and q = vx.
The simplex method for solving matrix games
Suppose we are given a matrix game. We assume that A is non-strictly determined
(strictly determined games can be solved as we discussed before without using the
simplex method) and does not contain any recessive rows or columns. According to
the previous proof we can obtain the solution to A as follows:
1
st
step. Add a sufficiently large number k to every entry of A to form the following
matrix game which has only positive entries
A

=
_
_
a
11
a
12
a
1l
a
21
a
22
a
2l
a
m1
a
m2
a
ml
_
_
The purpose is to guarantee that the value of the new matrix game is positive.
91
2
nd
step. Solve the following LPP by the simplex method
f(x) = x
1
+x
2
+ +x
l
→ maximize
subject to
_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
a
11
x
1
+ +a
1l
x
l
≥ 1
a
21
x
1
+ +a
2l
x
l
≥ 1
. . .
a
m1
x
1
+ +a
ml
x
l
≥ 1
x
1
, x
2
, . . . , x
l
≥ 0
Let x
0
be the optimum solution to the maximum problem, y
0
the optimum solution
to the dual minimum problem (which can be found in the row of reduced costs of the
terminal table) and v

= f(x
0
) (the optimal solution of LPP).
Let p
0
=
1
v

y
0
, q
0
=
1
v

x
0
and v =
1
v

−k.
Then p
0
is an optimum strategy for player R in game A, q
0
is an optimum strategy
for player C and v is the value of the game.
We can remark that the games A and A

have the same optimum strategies for
their respective players and that their values differ by the added constant k.
Example 8. Solve the Paper-Scissors-Rock game (example 1) by using the simplex
method.
Solution. Add k = 2 to each entry to form the matrix game
_
_
2 1 3
3 2 1
1 3 2
_
_
.
We have to solve the following LPP
f(x) = x
1
+x
2
+x
3
→ maximize
subject to
_
¸
¸
_
¸
¸
_
2x
1
+x
2
+ 3x
3
≤ 1
3x
1
+ 2x
2
+x
3
≤ 1
x
1
+ 3x
2
+ 2x
3
≤ 1
x
1
, x
2
, x
3
≥ 0
92
c 1 1 1 0 0 0
C
B
B b x
1
x
2
x
3
x
4
x
5
x
6
ratio test
0 x
4
1 2 1 3 1 0 0 min
_
1
2
,
1
3
, 1
_
0 x
5
1 3 2 1 0 1 0 =
1
3
0 x
6
1 1 3 2 0 0 1
f
j
0 0 0 0 0 0 0
c
j
−f
j
− 1 1 1 0 0 0
0 x
4
1
3
0 −
1
3
7
3
1 −
2
3
0 min
_
1
3
7
3
,
1
3
1
3
,
2
3
5
3
_
1 x
1
1
3
1
2
3
1
3
0
1
3
0 =
1
7
0 x
6
2
3
0
7
3
5
3
0 −
1
3
1
f
j
1
3
1
2
3
1
3
0
1
3
0
c
j
−f
j
− 0
1
3
2
3
0 0 0
1 x
3
1
7
0 −
1
7
1
3
7

2
7
0 min
_
2
7
5
7
,
3
7
18
7
_
1 x
1
2
7
1
5
7
0 −
1
7
3
7
0 =
1
6
0 x
6
3
7
0
18
7
0 −
5
7
1
7
1
f
j
3
7
1
4
7
1
2
7
1
7
0
c
j
−f
j
0
3
7
0 −
2
7

1
7
0
1 x
3
1
6
0 0 1
7
18

5
18
1
18
1 x
1
1
6
1 0 0
1
18
7
18

5
18
1 x
2
1
6
0 1 0 −
5
18
1
18
7
18
f
j
1
2
1 1 1
1
6
1
6
1
6
c
j
−f
j
− 0 0 0 −
1
6

1
6

1
6
The optimal solutions of the primal and dual problem are
x
0
=
_
1
6
,
1
6
,
1
6
_
, y
0
=
_
1
6
,
1
6
,
1
6
_
and the optimal value is v

=
1
2
.
Then for the original game
p
0
=
1
v

y
0
= 2
_
1
6
,
1
6
,
1
6
_
=
_
1
3
,
1
3
,
1
3
_
q
0
=
1
v

x
0
= 2
_
1
6
,
1
6
,
1
6
_
=
_
1
3
,
1
3
,
1
3
_
v =
1
v

−2 = 0.
Observe that the game is fair (since its value is 0).
93
The transportation problem
The balanced transportation problem
An important component of economic life is the shipping of goods from where
they are produced, to markets. The aim is to ship these goods at minimum cost. This
problem was one of the first problems that was modeled and solved by using the linear
programming.
We analyse the single commodity transportation problem. The data of this prob-
lem consists of the amount available at each source, the requirement at each demand
center and the cost of transporting the commodity per unit from each source to each
market.
We consider the transportation problem with the following data
m = number of sources where material is available
n = number of demand centers where material is required
a
i
= units of material available at source i, a
i
> 0, i = 1, m
b
j
= units of material required at demand center j, b
j
> 0, j = 1, n
c
ij
= unit shipping cost (m.u./unit) from source i to demand center j, i = 1, m,
j = 1, n.
The transportation problem with this data is said to satisfy the
balance condition if it satisfies
m

i=1
a
i
=
n

j=1
b
j
A transportation problem which satisfies the previous condition is called a bal-
anced transportation problem.
We want to determine the quantities to be shipped such that all the requirements
to be satisfied (all the supplies from the sources to be shipped and all the demands
to be satisfied) and the total cost of transportation to be minimum.
If we denote by x
ij
(i = 1, m, j = 1, n) the quantity to be shipped from the
source i to demand center j we get the following mathematical model (as a linear
programming problem)
f(x) =
m

i=1
n

j=1
c
ij
x
ij
→ minimize
subject to
_
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
_
n

j=1
x
ij
= a
i
, i = 1, m
m

i=1
x
ij
= b
j
, j = 1, n
x
ij
≥ 0, u = 1, m, j = 1, n
(1)
If we add the set of first m constraints (those corresponding to the sources) and
separately, the set of last m constraints (those corresponding to the demand centers),
94
we see that
m

i=1
a
i
=
m

i=1
n

j=1
x
ij
=
n

j=1
m

i=1
x
ij
=
n

j=1
b
j
.
The previous equality shows us that the balance condition is a necessary condition
for the feasibility of the transportation problem.
That’s why we assumed that the data of the problem satisfy the balance condition.
The previous equality shows that there is a redundant constraint among the con-
straints (1). One of the equality constraints in (1) can be deleted from the system
without affecting the set of feasible solutions.
The matrix of the previous system of equations has the order (m + n − 1) mn
and its rank is m+n −1.
So, every basic vector for the balanced transportation problem of order m n
consists of (m+n −1) basic variables.
The transportation problem can be represented in a two dimensional array in
which row i corresponds to source i; column j corresponds to demand center j; and
(i, j) is the cell in row i and column j.
In the cell (i, j), we record the value x
ij
(the amount to be shipped) in the lower
right-hand corner of the cell and the unit shipping cost in the upper left-hand corner
of the cell.
On the right-hand side of the array we record the availabilities at the sources and
at the bottom of the array we record the requirements at the demand centers.
The objective function is the sum of the variables in the array multiplied by the
unit cost in the corresponding cell.
Array representation of the transportation problem
Demand center
1 j n Supply
1
c
11
x
11
c
1n
x
1n
a
i
Source i
c
i1
x
i1
c
ij
x
ij
c
in
x
in
a
i
m
c
m1
x
m1
c
mn
x
mn
a
n
Demand b
1
b
j
b
n
We end this subsection with an example.
Example 1. We consider a small transportation problem where the commodity
is iron ore, the source are mine 1 and mine 2 that produce the ore, and the markets
are three steel plants. Let c
ij
= cost (RON/ton) to ship ore from mine i to plant j,
i = 1, 2, j = 1, 2, 3. The data is given below.
95
Steel plant Available at
1 2 3 mine (tons) daily
Mine 1
11
x
11
8
x
12
2
x
13
800
2
7
x
21
5
x
22
4
x
23
300
Demand at 400 500 200
plant (tons) daily
Determine the amounts to be shipped such that the transportation cost to be
minimum.
Let x
ij
be the amount of ore (in tons) shipped from mine i to plant j.
At mine 1 there are 800 tons of ore available. The amount of ore shipped out of
this mine, x
11
+x
12
+x
13
has to be smaller than the amount available, leading to the
constraint x
11
+ x
12
+ x
13
≤ 800. Similarly, considering ore at steel plant 1, at least
400 of it is required there, leading to the constraint x
11
+x
21
≥ 400.
The total amount of ore available is 800 + 300 = 1100; and the total amount
required is 400 + 500 + 800 = 1100. This imply that all the ore at each mine will be
shipped out, and the requirement at each plant will be met exactly. In consequence
all constraints will be equalities.
The dual problem. The optimality criterion
We associate the dual variable u
i
to the constraint corresponding to source i,
i = 1, m, and the dual variable v
j
to the constraint corresponding to demand center
j, j = 1, n.
The dual problem is:
g(u, v) =
m

i=1
a
i
u
i
+
n

j=1
b
j
v
j
→ maximize
subject to
_
u
i
+v
j
≤ c
ij
, i = 1, m, j = 1, n
u
i
, v
j
, i = 1, m, j = 1, n unrestricted in sign
(2)
From the complementarity slackness theorem (from previous section) we know
that if x = (x
ij
)
i=1,m
j=1,n
is a basic feasible solution for the primal problem, (u, v) =
((u
i
)
i=1,m
, (v
j
)
j=1,n
) is feasible for the dual and
x
ij
(c
ij
−u
i
−v
j
) = 0, for all i and j
then x and (u, v) are optimal for their problems.
96
In conclusion, if x = (x
ij
)
i=1,m
j=1,n
is a basic feasible solution for the transportation
problem then the dual basic solution associated with it can be computed by solving
the following system of equations:
u
i
+v
j
= c
ij
for each (i, j) corresponding to a basic variable x
ij
.
The previous system has m + n unknowns and m + n − 1 equations (since each
BFS x has m+n −1 basic variables).
Deleting one constraint in (1) has the effect of setting 0 the correspondent dual
variable.
The system which gives us the dual basic solution is
_
u
i
+v
j
= c
ij
for each (i, j) corresponding to a basic variable x
ij
u
1
= 0 (we can choose any dual variable to be 0)
The optimality criterion is
Optimality criterion. c
ij
≥ u
i
+v
j
for all nonbasic (i, j).
Indeed, if the optimal criterion is satisfied, then (u, v) is feasible for the dual
problem. Since x is feasible for the primal problem, according to the complementarity
slackness theorem, x and (u, v) are optimal to respective problems.
The transportation algorithm
By using the special structure of the transportation problem we will present a
version of the simplex algorithm that can be solved without canonical tables.
Step 1. Determine an initial basic feasible solution.
We present two methods of obtaining an initial basic feasible solution: the north-
west corner rule and the minimal cost rule.
The northwest corner rule
Begin in the upper left-hand corner (or northwest corner) of the transportation
array and set x
11
as large as possible (there are limitations for setting x
11
: b
1
which
is the demand at the market 1 and a
1
which is the supply at source 1). So, x
11
=
min¦a
1
, b
1
¦. Setting x
11
= min¦a
1
, b
1
¦, the supply at the source 1 will be a
1
− x
11
and the demand at the market 1 will be b
1
−x
11
.
If x
11
= a
1
then x
12
= = x
1n
= 0 and hence x
12
, . . . , x
1n
will be nonbasic
variables (instead of 0 in their cells we will put a point).
If x
11
= b
1
then x
21
= = x
n1
= 0 and hence x
21
, . . . , x
n1
will be nonbasic
variables (instead of 0 in their cells we will put a point).
Continue this procedure from the upper left-hand corner of the remaining table.
The northwest corner does not utilize shipping costs. It provides us easily an initial
BFS but the total shipping cost may be very high.
Example. To make this description more concrete, we now illustrate the general
procedure on the iron ore shipping problem (example 1).
97
11
400
8
400
2

800 400 0
7

5
100
4
200
300
400 500 200
0 100
In the first step, at the northwest corner, a maximum of 400 tons can be allocated,
since that is all that was required by plant 1. This left 800 −400 = 400 tons available
at the first mine and 0 tons required at plant 1. The demand in column 1 is fully
satisfied and we cannot ship anymore to plant 1 so x
21
= 0 and we put a • in cell
(2,1).
Next, we move to the second cell in the top row (the northwest corner of the
remaining table) where a maximum of 400 tons can be allocated, since that is all that
was available at that moment at mine 2. This left 400 − 400 = 0 tons available at
the first mine and 500 − 400 = 100 tons required by plant 2. At this moment the
first row’s availability is met (x
13
= 0) and we move down to the second row where
x
22
= 100 and x
23
= 200. We have 3 + 2 − 1 = 4 basic variables which are x
11
, x
12
,
x
22
and x
23
. The transportation cost in this BFS is
11 400 + 8 400 + 2 0 + 7 0 + 5 100 + 4 200 = 8900 RON.
The minimal cost rule
This method uses shipping costs in order to provide an initial BFS that has a
lower cost.
First we determine the variable with the smallest shipping cost, then assign x
ij
the largest possible value, which is x
ij
= min¦a
i
, b
j
¦.
The supply at source i will be reduced to a
i
− x
ij
and the demand at demand
center j will be reduced to b
j
−x
ij
.
If x
ij
= a
i
then x
ik
= 0 for each k = 1, n, k ,= j.
If x
ij
= b
j
then x
lj
= 0 for each k = 1, m, l ,= i.
After that, we will choose the cell with the minimum shipping cost from the
remaining array and we will repeat the procedure.
If the minimum cost is realized for more than one cell then we will choose to begin
with that cell for which the correspondent variable is maximum.
Example. We illustrate the procedure described above on the iron shipping prob-
lem (example 1).
98
11
400
8
200
2
200
800 600 400 0
7

5
300
4

300 0
400 500 200
200 0
0
The smallest coefficient cost is c
13
= 2, so x
13
= min¦200, 800¦ = 200. The demand
in column 3 is fully satisfied, and we cannot ship anymore to plant 3. So, x
23
= 0
(and we put a point in the cell (2,3)). We change the amount still to be shipped from
mine 1 to 800 −200 = 600 and the amount to be shipped at plant 3 to 200 −200 = 0.
The smallest cost among the remaining cells is c
22
= 5 and
x
22
= min¦500, 300¦ = 300.
We change the remaining requirement at plant 2 to 500−300 = 200, the remaining
availability at mine one to 300 −300 = 0 and put a point in cell (1,3) since x
13
= 0.
The remaining cells are (1,1) and (1,2) and
x
12
= min¦200, 600¦ = 200; x
11
= min¦400, 400¦ = 400.
We have 4 = 3 + 2 −1 basic variables which are: x
11
, x
12
, x
13
and x
22
.
The transportation cost in this BFS is
11 400 + 8 200 + 2 200 + 7 0 + 5 300 + 4 0 = 7900 RON.
Step 2. Check the optimality of the current BFS.
Denote by u
i
(i = 1, m) and v
j
(j = 1, n) the dual variables correspondent to
sources and respectively demand centers.
Solve the system
_
u
i
+v
j
= c
ij
for each (i, j) corresponding to a basic variable x
ij
u
1
= 0
For each (i, j) compute u
i
+ v
j
and wrote down the obtained value in the upper
right-hand corner in the correspondent cell.
Check the optimality criterion: c
ij
≥ u
i
+v
j
for all nonbasic x
ij
.
If the current BFS is optimal then stop. Write down the optimal solution and
compute the minimal value of f.
Remark. The above system can be solved very easily. Since we know u
1
= 0, we
can get the values of v
j
for columns j of the basic cells. Knowing the values of v
j
,
from equations corresponding to basic cells in these columns, we can get the values
of u
i
for rows i of these basic cells. We continue the method with the rows of u
i
in
the same way, until all the u
i
and v
j
are computed.
99
Since any dual variable can be 0, we will choose that dual variable on whose row
(or column) we have the maximum number of basic cells to be zero.
Example. Consider the BFS obtained by the minimal cost rule in the iron ore
shipping problem.
v
1
= 11 v
2
= 8 v
3
= 2
u
1
= 0
11 11
400
8 8
200
2 2
200
u
2
= −3
7 8
0
5 5
300
4 −1

To compute the dual basic solution, we start with u
1
= 0 (on the first row we have
3 basic cells). Since (1,1) is a basic cell, we have u
1
+v
1
= c
11
= 11, so v
1
= 11.
Since (1,2) is a basic cell, we have u
1
+v
2
= c
12
= 8, so v
2
= 8.
Since (1,3) is a basic cell, we have u
1
+v
3
= c
13
= 2, so v
3
= 2; and the processing
of row 1 is done.
Since (2,2) is a basic cell, we have u
2
+v
2
= c
22
= 5, and since v
2
= 8 we get that
u
2
= 5 −8 = −3.
The dual solution is written in the above array.
We check the optimality conditions and see that c
21
= 7 < 8 = u
2
+v
1
.
Since c
21
< u
2
+v
1
, the optimality criterion is not satisfied.
Step 3. Improve the current basic feasible solution by replacing exactly one basic
variable with a nonbasic variable for which the optimality criterion is violated.
Choose as entering cell the nonbasic variable x
ij
with the most negative reduced
cost, c
ij
−u
i
−v
j
.
Since every BFS for the mn transportation problem has exactly m+n−1 basic
variables, when an entering variable is brought into the BFS, some present basic
variable should be dropped from the BFS: the variable is called the dropping basic
variable and the correspondent cell is called the dropping basic cell.
We determine now the dropping basic variable and the new BFS.
The value of the entering variable is changed from 0 (its present value as a nonbasic
variable) to a value denoted by α (which will be determined later) and all the other
nonbasic variables remain unchanged. If we denote by (p, q) the entering cell then the
value of x
pq
changes from 0 to α. We have to subtract α to one of the basic variables
in row p (such that a
p
units to be shipped out of source p) and to subtract α to one of
the basic variables in column q (such that b
q
units to be shipped to demand center q).
We continue these adjustments adding α to another basic variable, then subtracting
α from a basic variable; until all the adjustments cancel each other in every row and
column. All the cells which have the values modified by α or by −α belong to a loop
which has the following properties:
i) all the cells in the loop other than the entering cell are in present basic cells
ii) every row and column of the array either has no cells in the loop; or has exactly
two cells, one with +α adjustment and the other with a −α adjustment.
The cells in the loop with +α adjustment are called recipient cells. The cells with
−α adjustment are called donor cells.
100
The new solution is x(α) = (x
ij
(α))
i,j
x
ij
(α) =
_
_
_
x
ij
if the cell (i, j) is not in the loop
x
ij
+α if (i, j) is a recipient cell in the loop
x
ij
−α if (i, j) is a donor cell in the loop.
We have that
f(x(α)) = f(x) +α(c
pq
−u
p
−v
q
).
(The proof of the previous equality is beyond the scope of this text).
Since f(x(α)) = f(x) +α(c
pq
−u
p
−v
q
) and c
pq
−u
p
−v
q
< 0, in order to decrease
the objective value as much as possible we should give α the maximum value it can
have which is
α = min¦x
ij
, (i, j) a donor cell in the loop¦.
A donor cell for which α = x
ij
will become the dropping cell (and become a
nonbasic cell). If there are more than one donor cell for which α = x
ij
, then the
dropping cell will be that cell for which there are no loops with basic cells.
We can summarize the previous discussion as follows:
- choose as entering variable x
ij
with the most negative reduced cost (the reduced
cost is c
ij
−u
i
−v
j
)
- starting from this cell consider a loop whose all the other corners except the
starting cell, are situated in basic cells. On each row or column there are either 2 cells
of the loop or none.
- mark (in the lower left-hand corner of the correspondent cell) by a + the odd
cells of this loop (the first, the third,...). These are the recipient cells of the loop.
- mark by a − the even cells of this loop (the second, the fourth,...). These are the
donor cells of the loop.
- determine the minimal value of the variables situated in the donor cells. Denote
it by α.
- add α to the recipient variables (situated in the cells marked by +).
- subtract α to the donor variables (situated in the cells marked by −)
- one of the donor variables equal to α will leave the base. Go to step 2.
Example. Improve the BFS obtained by the minimal cost rule in the iron ore
shipping problem.
v
1
= 11 v
2
= 8 v
3
= 2
u
1
= 0
11 11
− 400
8 8
+ 200
2
200
u
2
= −3
7 8
+ •
5 5
− 300
4 −1

The nonbasic cell (2,1) with c
12
= 7 < 8 = u
1
+v
2
is the only eligible cell to enter
the base. The loop consists of the following cells: (2,1), (2,2), (1,2), (1,1).
The odd cells (recipient cells) are (2,1), (1,2) and they are marked by +.
The even cells (donor cells) are (2,2), (1,1) and they are marked by −.
101
Since α = min¦400, 300¦ = 300 = x
22
then the dropping variable is x
22
.
The new BFS is given in the array below.
v
1
= 11 v
2
= 8 v
3
= 2
u
1
= 0
11 11
100
8 8
500
2 2
200
u
2
= −4
7 7
300
5 4

4 −2

We compute the dual basic variables and since
c
22
= 5 > 4 = u
2
+v
2
; c
23
= 4 > −2 = u
2
+v
3
the optimality criterion is satisfied.
Hence, we get an optimal solution
x =
_
100 500 200
300 0 0
_
with a minimum cost
f(x) = 11 100 + 8 500 + 2 200 + 7 300 + 5 0 + 4 0 = 7600 RON.
Remark. a) Degenerate solution
Each BFS which occurs in this iterative process must have m+n−1 basic variables.
If we have less than m+n−1 positive variables we must put 0’s instead of points
(in the cells of nonbasic variables) in order to obtain m + n − 1 basic variables. We
will transform the points into 0’s in that cells with minimal costs for which there are
no loops with the basic cells.
b) Multiple solution
If all the optimality conditions are fulfilled and if there is a nonbasic cell for which
c
ij
= u
i
+ v
j
then by choosing a loop starting with this cell we will obtain another
optimal solution.
If x
1
and x
2
are optimal solutions then for each t ∈ [0, 1], (1−t)x
1
+tx
2
is optimal,
too.
Next, we present an example of finding the loop which improves a given basic
feasible solution for a balanced transportation problem.
The loop will be a little more complicated in this case.
102
v
1
= 5 v
2
= 3 v
3
= 4 v
4
= −9 v
5
= 4
u
1
= 0
6 5

3 3
− 18
4 4
+ 20
5 −9

5 4

u
2
= 5
10 10
− 27
5 8

7 9

11 −4

9 9
+ 35
u
3
= 3
8 8
+ 0
5 6

7 7
− 27
11 −6

9 7

u
4
= 21
13 26

4 24
+ •
16 25

12 12
25
25 25
− 65
The previous BFS is degenerate since among 8 components just 7 are positive.
Computing the dual variables we observe that the optimality conditions are vio-
lated in the following cells: (2,2), (2,3), (3,2), (4,1), (4,2) and (4,3).
The entering variable is (4,2) since it provides the most negative reduced cost.
We put a + in the lower left-hand side part of this cell. To satisfy the equality
constraints in the problem we need to subtract the same value as was added in cell
(3,2) from one of the basic cells in row 4, that is in cells (4,4) or (4,5).
If we choose the cell (4,4), since this is the only basic cell in column four, we
cannot make the next correction in another basic cell in this column. In conclusion
the adjustment must be made in the basic cell (4,5). Continuing in the same way we
obtain the entire loop (presented in the above array).
Example 2. Solve the following transportation problem.
Demand center
1 2 3
1
1 2 3
15
Source 2
4 4 10
19
3
6 5 15
11
7 8 30
We check first the balance condition: 7 + 8 + 30 = 15 + 19 + 11.
Step 1. We determine an initial BFS by using the minimal cost rule:
103
1
7
2
8
3

15 8 0
4

4

10
19
19 0
6

5
0
15
11
11 0
7 8 30
0 0 11
0
Since c
11
= 1 = min¦c
ij
, i = 1, 3, j = 1, 3¦ we have
x
11
= min¦7, 15¦ = 7.
The minimum cost of the remaining array is now c
12
= 2 so x
12
= min¦48, 8¦ = 8.
Since we obtain just 4 positive variables and since we must have 3+3−1 = 5 basic
variables we must transform a • into a basic variable with value 0. We can transform
any • in 0 since neither of present nonbasic variables have loops with basic variables.
We consider x
23
= 0 to be a basic variable.
Step 2. We check the optimality of the previous BFS.
v
1
= 1 v
2
= 2 v
3
= 12
u
1
= 0
1 1
7
2 2
− 8
3 12
+ •
u
2
= −2
4 −1

4 0

10 10
19
u
3
= 3
6 4

5 5
+ 0
15 15
− 11
u
1
= 0, u
1
+v
1
= 1 ⇒ v
1
= 1
u
1
+v
2
= 2 ⇒ v
2
= 2
v
2
= 2, u
3
+v
2
= 5 ⇒ u
3
= 3
u
3
= 3, u
3
+v
3
= 15 ⇒ v
3
= 12
v
3
= 12, u
2
+v −3 = 10 ⇒ u
2
= −2
Since c
13
= 3 < 12 = u
1
+v
3
the BFS is not optimal.
Step 3. We improve the BFS, starting from the (1,3) cell.
The loop consists in the following cells: (1,3), (3,3), (3,2) and (1,2).
The odd (recipient) cells are: (1,3) and (3,2).
The even (donor) cells are: (1,2) and (3,3).
104
Since α = min¦8, 11¦ = 8 = x
12
then the dropping variable is x
12
.
The new BFS is given in the array below.
v
1
= 1 v
2
= 7 v
3
= 3
u
1
= 0
1 1
− 7
2 −7

3 3
+ 8
u
2
= 7
4 8

4 0

10 10
19
u
3
= 12
6 13
+ •
5 5
8
15 15
− 3
We compute the dual variables and since c
21
= 4 < 8 = u
2
+ u
1
and c
31
=
6 < 13 = u
3
+ u
1
the BFS is not oprimal. The entering variable is x
31
(since it
provides the most negative reduced cost). The loop is (3,1), (3,3), (1,3), (1,1) with
α = min¦7, 3¦ = 3 = x
33
which will be the dropping variable.
The new array is:
v
1
= 1 v
2
= 0 v
3
= 3
u
1
= 0
1 1
− 4
2 0

3 3
+ 11
u
2
= 7
4 8
+ •
4 7

10 10
− 19
u
3
= 5
6 6
3
5 5
8
15 8

The new BFS obtained before is not optimal since the optimality condition is
violated in the cells: (2,1) and (2,2). The entering variable is x
21
. Repeating the
procedure presented before we obtain the following new BFS.
v
1
= −3 v
2
= −4 v
3
= 3
u
1
= 0
1 −3

2 −4

3 3
15
u
2
= 7
4 4
4
4 3

10 10
15
u
3
= 9
6 6
3
5 5
8
15 12

We compute the dual variables and since c
11
> u
1
+v
1
, c
12
> u
1
+v
2
, c
22
> u
2
+v
2
,
c
33
> u
3
+v
3
the optimality criterion is satisfied.
105
Hence, we get an optimal solution
x =
_
_
0 0 15
4 0 15
3 8 0
_
_
with a minimum cost
f(x) = 3 15 + 4 4 + 10 15 + 6 3 + 5 8 = 269.
According to the previous remark, part b, the previous solution is unique since
there are no nonbasic variables for which the reduced costs to be zero.
Example 3. Solve the following balanced transportation problem.
Demand center
1 2 3
1
9 6 8
6
Source 2
10 5 12
11
3
11 13 20
4
3 4 14
Solution. We determine an initial basic feasible solution by using the least cost
rule. We show this BFS in the following array. The dual variables are also entered in
the array.
v
1
= 10 v
2
= 5 v
3
= 12
u
1
= −4
9 6

6 1

8 8
6
6 0
u
2
= 0
10 10
− 3
5 5
4
12 12
+ 4
11 7 4 0
u
3
= 8
11 18
+ •
13 13

20 29
4
4 0
3 4 14
0 0 8
0
The optimality criterion is violated since c
31
= 11 < 18 = u
3
+ v
1
. Hence (3,1)
is the entering cell, and the corresponding loop is already entered on the array. α =
min¦3, 4¦ = 3 and (2,1) is the dropping cell. The next BFS is presented in the following
array.
106
v
1
= 3 v
2
= 5 v
3
= 12
u
1
= −4
9 −1

6 1

8 8
6
u
2
= 0
10 3

5 5
− 4
12 12
+ 7
u
3
= 8
11 11
3
13 13
+ •
20 20
− 1
Now the optimality criterion is satisfied, so the present BFS is an optimal solution.
x =
_
_
0 0 6
0 4 7
3 0 1
_
_
.
The minimum cost is
f
min
= 8 6 + 5 4 + 12 7 + 11 3 + 20 1 = 205.
Since in the last array there is a nonbasic cell, namely (3,2), for which c
32
= u
3
+v
2
,
then by choosing a loop starting with this cell we will obtain another optimal solution,
as bellow:
v
1
= 3 v
2
= 5 v
3
= 12
u
1
= −4
9 −1

6 1

8 8
6
u
2
= 0
10 3

5 5
3
12 12
8
u
3
= 8
11 11
3
13 13
1
20 20

The new optimal solution is:
x

=
_
_
0 0 6
0 3 8
3 1 0
_
_
with the same minimal cost f
min
= f(x

) = 205.
The general solution (see the previous remark, part b) is:
x(t) = (1 −t)x +tx

=
_
_
0 0 6
0 4 −t 7 −t
3 0 1 −t
_
_
, t ∈ [0, 1].
107
Marginal values in the balanced transportation
problem
In this subsection we analyse how changes in availabilities and requirements a
i
and b
i
, affect the transportation cost in a balanced transportation problem.
The marginal value is the rate of change in the optimum objective value, per unit
change in the availabilities and requirements, a
i
and b
j
.
Since the balance condition is necessary for feasilibity, we cannot change only one
quantity among a
1
, . . . , a
m
; b
1
, . . . , b
n
.
We will consider the following types of changes:
i) increased demand at demand center j and the same balancing increase in avail-
ability at source i
ii) increased availability at source p and decreased availability at source i by the
same amount (this moves the supply from source i to source p)
iii) increased demand at demand center q and decreased demand at demand center
j by the same amount.
In all the cases presented above, all the other data in the balanced transportation
problem remain the same.
Let x and (u, v) be optimal solutions for primal and dual problems. Assume then x
is nondegenerate. According to Remark 6 of the previous section the marginal values
in the three cases presented before are:
i) v
j
+u
i
ii) u
p
−u
i
iii) v
q
−v
j
.
Example. Consider the balanced transportation problem presented in the previ-
ous example. In the next array we consider the first optimal solution x.
v
1
= 3 v
2
= 5 v
3
= 12
u
1
= −4
9 −1

6 1

8 8
6
u
2
= 0
10 3

5 5
4
12 12
7
u
3
= 8
11 11
3
13 13

20 20
1
The optimum transportation cost in this problem is 205.
According to the above discussion, if b
2
increases from its current value by 2 units
and a
2
changes by the same amount (to keep the problem balanced) then the optimum
objective value will change by 2(u
2
+v
2
) = 2 5 = 10 taking the value 215.
We remark the fact that if demand b
2
increases, the best place to create additional
supplies to satisfy that additional demand, is source 1 (it is the source with the small-
est u
i
). By shifting supply (2 units in our case) from source 2 to source 1 the company
can save 8 monetary units since the rate of change in the optimum transportation
cost per unit shift is u
1
−u
2
= −4 −0 = −4.
108
Remark. The previous discussion is true for sufficiently small changes in b
j
and
a
i
which don’t affect the basis of the transportation problem. In this case, both, initial
and modified transportation have the same basic variables which determine the same
dual variables in both problems, as needed in Remark 6 from previous section.
Unbalanced transportation problem
So far we assumed that the total supply at all the sources is equal to the total
demand at all the demand centers. This implies that
m

i=1
a
i
=
n

j=1
b
j
so the system is in balance.
In many applications it may be impossible (or unprofitable) to ship all that is
required or the total supply either exceeds or is less than the total demand. Such
problems are called unbalanced and can be solved by transportation algorithm as
below.
(a) Supply exceeds demand (overproduction)
In this case
_
_
m

i
a
i
>
n

j
b
j
_
_
after all the demand is met an amount of
m

i
a
i

n

j
b
j
will be left unused at the sources.
To solve this problem, we introduce a new (n + 1) column in the array. For each
i, i = 1, m, the cell (i, n +1) corresponds to the material left unused at source i. The
cost coefficients for all the cells in this new column are equal to zero and
b
n+1
=
m

i=1
a
i

n

j=1
b
j
.
In the optimum solution of this modified problem, basic values in the cells of n+1
column represent unused material at the sources.
(b) Demand exceeds supply (underproduction)
In this case
_
_

i
a
i
<

j
b
j
_
_
there is a shortage of

j
b
j

i
a
i
and we cannot
meet all the demand with the existing supply.
To solve this problem, we introduce a new (m + 1) row in the array. Since we
have to find how to distribute the existing supply to meet as much as the demand as
possible, we introduce a dummy source with availability a
n+1
=

j
b
j

i
a
i
. The
cost coefficients for all the cells in this new row are equal to zero.
In the optimum solution of this modified problem, basic values in the cells of
(m+ 1) row represent unfulfilled demand at demand centers.
109
110
Part II
Calculus
111
Chapter 3
One variable calculus
This part of the text concerns overviews of one-variable calculus. One can cover
this material either by taking the facultative mathematics course or read it on their
own as a review of the calculus they have already taken in high school. The examples
contained in this part should make the process relatively simple.
A central goal of economic theory is to express and analyse relationships between
economic variables which are described mathematically by functions.
A numerical function f : A → B, A, B ⊆ R,
A ∋ x → y = f(x) ∈ B
is a rule which assigns one and only one value y = f(x) in B to each element x in A.
The variable x is called the independent variable, or, in economic applications the
exogenous variable. y is called the dependent variable, or in economic applications
the endogeneous variable.
The development of calculus by Isaac Newton (1642-1727) and Gottfried Wilhelm
von Leibniz (1646-1716) resulted from studying certain mathematical problems such
as:
• finding the tangent line to a curve at a given point
• finding the extreme values of certain functions
• finding the areas of planar regions bounded by arbitrary curves.
The study of the first classes of problems mentioned before led to the creation of
differential calculus. The study of the third class of problems led to the creation of
integral calculus.
3.1 Differential calculus of one variable
The most important information in which we are interested concerns how a change
in one variable affects the other. In the case when the relationships are expressed as
113
linear functions the effect of a change of one variable on the other is expressed by
the ”slope” of the function, but in the general case the effect of this change can be
expressed by the derivative. In this section we will review some facts related to the
derivative of a one-variable function focusing on its role in quantifying relationships
between variables. The derivative of a function is defined by using the notion of limit.
3.1.1 Limits and continuity
We will begin with the intuitive approach to the notion of a limit.
Let f : R ¸ ¦3¦ →R, f(x) =
9 −x
2
3 −x
.
Even if f(3) is not defined, f(x) can be calculated for any value of x near 3. Simple
computations show if x approaches 3 either the left or right then the values f(x) are
approaching 6. We say 6 is the limit of f as x approaches 3 and write
lim
x→3
9 −x
2
3 −x
= 6 or f(x) → 6 as x → 3.
Intuitively, the notion of f(x) approaching a number l as x approaches to a number
a is defined in the following way.
Let f : D → R, D ⊆ R, a ∈ D

(see appendix...). If f(x) can be made arbitrarily
close to a number l by taking x sufficiently close to a number a (but different from a)
from both the left and right side of a, then lim
x→a
f(x) = l.
We shall use the notation x ր a (or x → a

) to denote that x approaches a from
the left and x ց a (or x → a
+
) to denote that x approaches a from right.
If the limits lim
xցa
f(x) and lim
xրa
f(x) have a common value l, we say that lim
x→a
f(x)
exists and write lim
x→a
f(x) = l.
Intuitively the notion of an infinite limit, lim
x→a
f(x) = ∞ is as follows.
If f can be made arbitrarily large by taking x sufficiently close to a number a (but
different from a) from both the left and right side of a then lim
x→a
f(x) = ∞.
The infinite limit lim
x→a
f(x) = −∞ can be described in a similar manner.
The intuitive definitions are too vague to be of any use in proving theorems. A
proof of the existence of a limit can never be based on previous intuitive approach.
To give a rigorous demonstration of the existence of a limit or to prove results
concerning limits we must present now the precise definition of a limit. This rigorous
definition (ε − δ definition) is due to Augustin-Louis Cauchy. He used ε because of
the correspondence between epsilon and the french word ”erreur” and δ because delta
correspond to ”diff´erence”.
Definition 1. (ε −δ definition of a limit)
Let f : A →R, a ∈ A

∩ R.
lim
x→a
f(x) = l ∈ R means for every ε > 0 there exists a δ > 0 such that [f(x)−l[ < ε
whenever 0 < [x −a[ < δ and x ∈ A.
To try to understand the meaning behind this abstract definition, see the diagram
below.
114
`
¸
y
x a −δ a a + δ
l + ε
l
l −ε
We first take an ε > 0 and represent on the y-axis the interval (l −ε, l +ε) around l.
We then determine an interval (a−δ, a+δ) around a so that for all x-values (excluding
a) inside the determined interval the corresponding values f(x) lie inside (l −ε, l +ε).
In general, the value of δ will depend on the value of ε. That is, we will always
begin with ε > 0 and then determine an appropriate corresponding value for δ > 0.
There are many values of δ which work. Once a value for δ is found, all smaller values
of δ also work.
In the next examples we will use the precise definition.
We will begin the proofs by letting ε > 0 be given.
Then we take the expression [f(x) − l[ < ε and from this inequality we try to
determine an appropriate value for δ (which depends on ε) such that [x −a[ < δ will
guarantee [f(x) −l[ < ε.
Example 1. Prove that lim
x→3
(4x −2) = 10.
Solution. Given ε > 0, we want to determine a δ > 0 so that [x − 3[ < δ will
guarantee [f(x) − 10[ = [(4x − 2) − 10[ < ε. It is natural to try to determine a
connection between [(4x −2) −10[ and [x −3[. We have
[f(x) −10[ = [(4x −2) −10[ = [4x −12[ = 4[x −3[.
So [f(x) −10[ < ε ⇔ 4[x −3[ < ε ⇔ [x −3[ <
ε
4
.
The choice of δ is now clear. If we let δ =
ε
4
we have
[(4x −2) −10[ < ε ⇔ [x −3[ < δ.
Putting all these together, we can write down the following proof.
Given any ε > 0, let δ =
ε
4
> 0. Then for all x in the domain of the function f, if
0 < [x −3[ < δ, we have
[f(x) −10[ = [(4x −2) −10[ = [4x −12[ = [4(x −3)[ = 4[x −3[ < 4δ = ε.
This completes the proof of lim
x→3
(4x −2) = 10.
115
Example 2. Prove that lim
x→4
3
x + 5
=
1
3
.
Solution. We will start with the analysis of the problem.
Observe that
¸
¸
¸
¸
f(x) −
1
3
¸
¸
¸
¸
=
¸
¸
¸
¸
3
x + 5

1
3
¸
¸
¸
¸
=
¸
¸
¸
¸
4 −x
3(x + 5)
¸
¸
¸
¸
=
[x −4[
3[x + 5[
.
So,
¸
¸
¸
¸
3
x + 5

1
3
¸
¸
¸
¸
< ε ⇔
[x −4[
3[x + 5[
< ε ⇔ [x −4[ < 3ε[x + 5[.
We cannot just take δ = 3ε[x + 5[ since δ should not depend on x. We see that
we need here is a constant k so that for x close enough to 4,
1
3[x + 5[
≤ k which is
equivalent to [x + 5[ ≥ a constant.
If [x − 4[ < δ, then [x + 5[ = [x − 4 + 9[ ≥ 9 − [x − 4[ > 9 − δ. If we take δ ≤ 1
then the previous inequality becomes [x + 5[ > 9 −δ ≥ 8.
In consequence we will have:
1
3[x + 5[

1
24
and
¸
¸
¸
¸
3
x + 5

1
3
¸
¸
¸
¸
=
[x −4[
3[x + 5[

[x −4[
24
< ε.
We observe that δ must satisfy two conditions: δ ≤ 1 and
δ
24
≤ ε.
We can now write down the following proof:
Let ε > 0 and δ = min¦1, 24ε¦. Then δ > 0 and if x is in the domain of the
function f and if 0 < [x −4[ < δ we have
[x + 5[ = [x −4 + 9[ ≥ 9 −[x −4[ > 9 −δ ≥ 9 −1 = 8
and in consequence we get
¸
¸
¸
¸
3
x + 5

1
3
¸
¸
¸
¸
=
¸
¸
¸
¸
4 −x
3(x + 5)
¸
¸
¸
¸
=
[x −4[
3[x + 5[

[x −4[
24
<
δ
24

24ε
24
= ε.
This completes the proof of
lim
x→4
3
x + 5
=
1
3
.
We will present now the ε −δ definition for the limits that involve infinity.
Definition 2. a) Let f : A →R and let a ∈ A

∩ R.
lim
x→a
f(x) = ∞ means for each ε > 0 there exists a δ > 0 such that f(x) > ε
whenever 0 < [x −a[ < δ and x ∈ A.
lim
x→a
f(x) = −∞ means for each ε > 0 there exist a δ > 0 such that f(x) < −ε
whenever 0 < [x −a[ < δ and x ∈ A.
b) Let f : A →R such that there is a a ∈ R such that (a, ∞) ⊂ A.
116
lim
x→∞
f(x) = l ∈ R if for each ε > 0 there is a δ > 0 such that [f(x) − l[ < ε
whenever x > δ and x ∈ A.
Similarly we can define lim
x→−∞
f(x) = l ∈ R.
lim
x→∞
f(x) = ∞ if for each ε > 0 there is a δ > 0 such that f(x) > ε whenever
x > δ and x ∈ A.
Similarly we can define lim
x→∞
f(x) = −∞, lim
x→−∞
= ∞.
The following properties of limits, which we list without proof, enable us to eval-
uate limits of functions algebraically.
Property 1. If f(x) = c, where c is a constant, then lim
x→a
f(x) = c.
Property 1 states the limit of the constant function f(x) = c at any point x = a
is equal to the value of the constant function which is c.
Property 2. If f(x) = x, then lim
x→a
f(x) = lim
x→a
x = a.
Property 3. If lim
x→a
f(x) = l and n ∈ R such that (f(x))
n
is well defined then
lim
x→a
[f(x)]
n
=
_
lim
x→a
f(x)
_
n
= l
n
.
Property states that the limit of the n-th power of a function is equal to the nth
power of the limit of the function. For this result to be true we must assume that x
is chosen so that the nth power of f is well defined for x close to a. For instance, if
n =
1
2
then f(x) cannot be negative.
Property 4. If lim
x→a
f(x) = l ∈ R and k is a constant, then
lim
x→a
(kf)(x) = k
_
lim
x→a
f(x)
_
= kl.
Property 4 states that the limit of a constant times a function is equal to the
constant times the limit of the function.
Property 5. If lim
x→a
f(x) = l ∈ R and lim
x→a
g(x) = m ∈ R, then
lim
x→a
(f ±g)(x) = lim
x→a
f(x) ± lim
x→a
g(x).
So, the limit of the sum or difference of two functions is equal to the sum or
difference of their limits. This result is easily extended to the case involving the sum
and (or) difference of any finite number of functions.
Property 6. If lim
x→a
f(x) = l ∈ R and lim
x→a
g(x) ∈ R then
lim
x→a
(fg)(x) = lim
x→a
f(x) lim
x→a
g(x).
The previous property can be also easily extended to the case involving the product
of any finite number of functions.
Property 7. If lim
x→a
f(x) = l, lim
x→a
g(x) = m and m ,= 0, then
lim
x→a
_
f
g
_
(x) =
lim
x→a
f(x)
lim
x→a
g(x)
=
l
m
.
117
The previous rules tell us that the limit operations interact with all the basic
algebraic operations in a natural way.
Example 3. Compute the following limit
lim
x→2
(2x
2
+ 1)(3x −1)
x + 4
.
Solution.
lim
x→2
(2x
2
+ 1)(3x −1)
x + 4
P7
=
lim
x→2
(2x
2
+ 1)(3x −1)
lim
x→2
(x + 4)
P6
=
=
lim
x→2
(2x
2
+ 1) lim
x→2
(3x −1)
lim
x→2
(x + 4)
P5
=
_
lim
x→2
(2x
2
) + lim1
_ _
lim
x→2
3x −limn
_
lim
x→2
x + lim
x→2
4
P1,P4
=
_
2 lim
x→2
x
2
+ 1
__
3 lim
x→2
x −1
_
lim
x→2
x + 4
P3,P2
=
_
2
_
lim
x→2
x
_
2
+ 1
_
(3 2 −1)
2 + 4
=
(2 4 + 1) 5
6
=
9 5
6
=
15
2
.
In certain situations, the attempt to apply Property 7 leads to the expression ”
0
0

(that is, both numerator and the denominator have limit 0 at x = a). In this case we
say that the quotient
f(x)
g(x)
has the indeterminate form ”
0
0
” at x = a.
To solve the problem, we have to replace the given function with another one that
takes the same values at the original function except at x = a and to evaluate the
limit of the latter. The next examples illustrate this process.
Example 4. Find lim
x→2
x
2
+ 6x −16
x
2
−5x + 6
.
Solution. The limit of the denominator is
lim
x→2
(x
2
−5x + 6) = 2
2
−5 2 + 6 = 0.
So, we cannot use the Property 7. We see if it is possible to simplify the given function.
We can try to factorize the denominator and the numerator too. The fact that the
denominator has limit zero suggest that 2 is a root of the denominator and so x −2
is a factor of the denominator. In the same way we can conclude that x −2 is also a
factor of the numerator. Thus,
lim
x→2
x
2
+ 6x −16
x
2
−5x + 6
= lim
x→2
(x −2)(x + 8)
(x −2)(x −3)
= lim
x→2
x + 8
x −3
=
2 + 10
2 −3
= −12.
Example 5. Find lim
x→4

3x + 4 −4
x −4
.
118
Solution. The limit is again of the form ”
0
0
”. The trouble this time is that it
might not be very clear how we can find the hidden factor x − 4 in the numerator.
The technique which is to be applied in this case is to rationalize by using the idea
of ”the conjugate expression”.
In our example, when

3x + 4 −4 is multiplied by

3x + 4 + 4 we will get
(

3x + 4)
2
−4
2
= 2x + 4 −16 = 3(x −4).
The square root disappears and a polynomial is obtained. Here are the details
lim
x→4

3x + 4 −4
x −4
= lim
x→4
(

3x + 4 −3)(

3x + 4 + 4)
(x −4)(

3x + 4 + 4)
= lim
x→4
3(x −4)
(x −4)(

3x + 4 + 4)
= lim
x→4
3

3x + 4 + 4
=
3

3 4 + 4 + 4
=
3
8
.
Example 6. Compute lim
x→∞
3x
2
+ 8x −4
2x
2
+ 4x −5
if it exists.
Solution. Since the limits of both the numerator and the denominator are ∞ the
Property 7 is not applicable. We will try to put the function into a form in which we
can find the required limit. By taking out as common factor x
2
(the highest power of
x appearing in the denominator) from both the numerator and the denominator, we
obtain
lim
x→∞
3x
2
+ 8x −4
2x
2
+ 4x −5
= lim
x→∞
x
2
_
3 +
8
x

4
x
2
_
x
2
_
2 +
4
x

5
x
2
_ = lim
x→∞
3 +
8
x

4
x
2
2 +
4
x

5
x
2
=
lim
x→∞
3 + 8 lim
x→∞
1
x
−4 lim
x→∞
1
x
2
lim
x→∞
2 + 4 lim
x→∞
1
x
−5 lim
x→∞
1
x
2
=
3 + 8 0 −4 0
2 + 4 0 −5 0
=
3
2
.
Observe that we have use that lim
x→∞
1
x
= lim
x→∞
1
x
2
= 0 in evaluating the second
and third terms of both the numerator and the denominator.
The previous remark can be generalized as follows.
Property 8. lim
x→∞
1
x
n
= 0 for all n > 0 and lim
x→−∞
1
x
n
= 0 for all n > 0, provided
that
1
x
n
is defined.
Some limits are best calculated by first finding the left and right-hand limits.
Definition 3. (left-hand limit). Let f : A →R and let a ∈ A

∩ R.
lim
xրa
f(x) = l ∈ R if for every number ε > 0 there is a number δ > 0 such that if
a −δ < x < a and x ∈ A then [f(x) −l[ < ε.
119
lim
xրa
f(x) = ∞ if for every number ε > 0 there is a number δ > 0 such that if
a −δ < x < a and x ∈ A then f(x) > ε.
Similarly, we can define lim
xրa
f(x) = −∞.
Definition 4. (right-hand limit). Let f : A →R and let a ∈ A

∩ R.
lim
xցa
f(x) = l ∈ R if for every number ε > 0 there is a number δ > 0 such that if
a < x < a +δ and x ∈ A then [f(x) −l[ < ε.
lim
xցa
f(x) = ∞ for every number ε > 0 there is a number δ > 0 such that if
a < x < a +δ and x ∈ A then f(x) > ε.
Similarly, we can define lim
xցa
f(x) = −∞.
Notice that Definition 3 is the same as Definition 1 except that x is restricted to
be in the left half (a−δ, a) of the interval (a−δ, a+δ). In Definition 4, x is restricted
to lie in the right half (a, a +δ) of the interval (a −δ, a +δ).
The following theorem says that a limit exists if and only both of the one-sided
limits exist and are equal.
Theorem 1. Let f : A →R and let a ∈ A

∩ R.
Then lim
x→a
f(x) = l if and only if lim
xցa
f(x) = lim
xրa
f(x) = l.
Example 7. If
f(x) =
_ √
x −6, if x > 6
2x −12, if x < 6,
determine whether lim
x→4
f(x) exists.
Solution. Since f(x) =

x −6 for x > 6, we have
lim
xց6
f(x) = lim
xց6

x −6 =

6 −6 = 0.
Since f(x) = 2x −12 for x < 6, we have
lim
xր6
f(x) = lim
xր6
(2x −12) = 2 6 −12 = 0.
The right and left hand limits are equal. Thus the limit exists and lim
x→6
f(x) = 0.
By using the limits we can define the notion of asymptote of a real valued function.
A linear asymptote is essentially a straight line to which the graph of the function
becomes closer and closer but does not become identical.
A function may have multiple asymptotes, of different or of the same kind. One
such function with a horizontal, vertical and oblique asymptote is graphed below.
120
`
¸
x
y = f(x)
f : R

→R
f(x) =
_
¸
_
¸
_
1
x
+x, x > 0
1
x
, x < 0
Definition 5. (asymptote)
a) horizontal asymptote
The line y = l ∈ R is called a horizontal asymptote of the curve y = f(x) if
lim
x→∞
f(x) = l or lim
x→−∞
f(x) = l.
b) vertical asymptote
The line x = a ∈ R is called a vertical asymptote of the curve y = f(x) if at least
one of the following statements is true:
lim
x→a
f(x) = ∞ (or −∞); lim
xրa
f(x) = ∞ (or −∞);
lim
xցa
f(x) = ∞ (or −∞).
c) oblique (or slant) asymptote
The line y = mx +n, m ,≡ 0, is called an oblique (or slant) asymptote if
lim
x→∞
(f(x) −(mx +n)) = 0 or lim
x→−∞
(f(x) −(mx +n)) = 0.
In this case
m = lim
x→∞
f(x)
x
and n = lim
x→∞
(f(x) −mx)
or
m = lim
x→−∞
f(x)
x
and n = lim
x→−∞
(f(x) −mx).
121
In particular a function y = f(x) can have at most 2 horizontal or 2 oblique
asymptotes (or one of each).
Example 8. Find the asymptotes of the graph of the function defined by
f(x) =

2x
2
+ 1
3x −5
.
Solution. First we determine the domain of f.
A = ¦x ∈ R [ 2x
2
+ 1 ≥ 0, 3x −5 ,= 0¦ = R ¸
_
5
3
_
lim
x→∞

2x
2
+ 1
3x −5
= lim
x→∞
¸
x
2
_
2 +
1
x
2
_
x
_
3 −
5
x
_ = lim
x→∞
x
_
2 +
1
x
2
x
_
3 −
5
x
_
= lim
x→∞
_
2 +
1
x
2
3 −
5
x
=

2
3
.
Therefore the line y =

2
3
is a horizontal asymptote of the graph of f.
In computing the limit as x → −∞, we must remember that for x < 0, we have

x
2
= [x[ = −x. So, when we take out as common factor x
2
we have
_
2x
2
+ 1 =
¸
x
2
_
2 +
1
x
2
_
= [x[
_
2 +
1
x
2
= −x
_
2 +
1
x
2
Therefore
lim
x→−∞

2x
2
+ 1
3x −5
= lim
x→−∞
−x
_
2 +
1
x
2
x
_
3 −
5
x
_ = − lim
x→−∞
_
2 +
1
x
2
3 −
5
x
= −

2
3
Thus the line y = −

2
3
is also a horizontal asymptote. A vertical asymptote is
likely to occur when the denominator 3x −5 is 0, that is when x =
5
3
.
lim

5
3

2x
2
+ 1
3x −5
=
¸
2
_
5
3
_
2
+ 1
+0
= ∞
122
If x is close to
5
2
but x <
5
3
, then 3x −5 < 0 and so f(x) is large negative. Thus
lim

5
3

2x
2
+ 1
3x −5
= −∞.
Since we already have two horizontal asymptotes there are no oblique asymptotes
of f.
Example 9. Determine the horizontal and oblique asymptotes of the function
defined by
f(x) =
_
x
2
+x −x.
Solution.
D = ¦x ∈ R [ x
2
+x ≥ 0¦
= ¦x ∈ R [ x(x + 1) ≥ 0¦
= (−∞, −1] ∪ [0, ∞)
We compute first
lim
x→∞
f(x) = lim
x→∞
(
_
x
2
+x −x)
Both

x
2
+x and x are large when x is large, so it is very difficult to see what
happens to their difference. We will use algebra to rewrite the function. We first
multiply both the numerator and the denominator by the conjugate radical.
lim
x→∞
(
_
x
2
+x −x) = lim
x→∞
(
_
x
2
+x −x)

x
2
+x +x

x
2
+x +x
= lim
x→∞
(x
2
+x) −x
2

x
2
+x +x
= lim
x→∞
x

x
2
+x +x
= lim
x→∞
x
¸
x
2
_
1 +
1
x
_
+x
= lim
x→∞
x
x
_
_
1 +
1
x
+ 1
_
= lim
x→∞
1
_
1 +
1
x
+ 1
=
1
2
So, y =
1
2
is an horizontal asymptote of f.
Since
lim
x→−∞
(
_
x
2
+x −x) = ∞+∞ = ∞
then there is no horizontal asymptote at −∞.
123
It remains to look for oblique asymptote at −∞.
m = lim
x→−∞
f(x)
x
= lim
x→−∞

x
2
+x −x
x
If in the previous limit we make the substitution y = −x then y → ∞ and the
limit becomes
m = lim
y→∞
_
(−y)
2
+ (−y) −(−y)
−y
= − lim
y→∞
_
y
2
−y +y
y
= − lim
y→∞
y
__
1 −
1
y
+ 1
_
y
= − lim
y→∞
_
1 −
1
y
+ 1
1
= −2
n = lim
x→−∞
(f(x) −mx) = lim
x→−∞
(f(x) + 2x) = lim
x→−∞
(
_
x
2
+x +x)
= lim
y→∞
(
_
y
2
−y −y) = lim
y→∞
y
2
−y −y
2
_
y
2
−y +y
= lim
y→∞
−y
_
y
2
−y +y
= lim
y→∞
−y
y
__
1 −
1
y
+ 1
_ = lim
y→∞
−1
_
1 −
1
y
+ 1
= −
1
2
In conclusion the line y = −2x −
1
2
is a slant asymptote to −∞.
Next, we want to look at another useful technique of finding limits. We start with
an example.
Example 10. Find lim
x→0
x
2
cos
1
x
.
Solution. We know from Property 6 that the limit of a product is the product of
the limits. That assumes that the limits of the factors exist. If we try to apply this
result, we would say that the limit of x
2
cos
1
x
is the limit of x
2
times the limit of
cos
1
x
. The problem is that the limit of cos
1
x
does not exist and the limit property
for product cannot be applied. Indeed,
cos
1
1
2nπ
= cos 2nπ = 1
for any nonzero natural number and
cos
1
1
π
2
+nπ
= 0
124
for any natural number. Since the values of cos
1
x
do not approach a fixed number as x
approaches 0, lim
x→0
cos
1
x
does not exist. Let’s notice that even though cos
1
x
oscillates,
it oscillates between fixed bounds, namely −1 and 1. So, as long as x ,= 0 we have
−1 ≤ cos
1
x
≤ 1.
We multiply the previous inequality by x
2
and we get
−x
2
≤ x
2
cos
1
x
≤ x
2
, for all x ,= 0.
Notice that x
2
is always positive, and so when we multiply it to the inequality, we
do not need to turn the inequality signs around.
As x → 0, both −x → 0 and x
2
→ 0. Being squeezed between two functions that
approach 0, the function x
2
cos
1
x
is forced to go to zero, too. So we can conclude that
it also has limit 0. That is
lim
x→0
x
2
cos
1
x
= 0.
The way we solved the above example suggests that we can write down a general
result.
Theorem 2. (Squeeze theorem). Suppose that f(x) ≤ g(x) ≤ h(x) for all x close
to a, except possible for x = a. If lim
x→a
f(x) = lim
x→a
h(x) = l then lim
x→a
g(x) = l.
If we compute the limit of a polynomial f at a given point a then the limit will
be f(a). For instance
lim
x→2
(2x
2
+ 4x + 1) = 2 2
2
+ 4 2 + 1.
Functions with this property are called continuous at a. We will see that the
mathematical definition of continuity corresponds closely with the meaning of the
word continuity in everyday language.
Definition 6. Let f : A →R and a ∈ A.
a) If a ∈ A

we say that f is continuous at a if and only if
lim
x→a
f(x) = f(a).
If a ∈ A¸ A

(a is an isolated point of A) then f is continuous at a.
b) If a ∈ A

we say that f is continuous from the right at a if
lim
xցa
f(x) = f(a).
c) If a ∈ A

we say that f is continuous from the left at a if
lim
xրa
f(x) = f(a).
125
The previous definition says that f is continuous at an accumulation point a if
f(x) approaches f(a) as x approaches a. A continuous function f has the property
that a small change in x produces only a small change in f(x).
Geometrically, the graph of a continuous function at each point of a given interval
can be drawn without removing the pen from the paper.
We say that f is discontinuous at a, or f has a discontinuity at a, if f is not
continuous at a.
Let f : I →R where I is an interval on the real axis. The function f is continuous
on I if it is continuous at each point in the interval. If f is defined only on one side
of an end point of the interval, we understand continuous at the end point to mean
continuous from the right or continuous from the left.
Instead of using Definition 6 to verify the continuity of a function, it is often con-
venient to use the next theorem, which shows how to build up complicated continuous
functions from simple ones.
Theorem 3. a) If f and g are continuous at a and c is a constant, then the
following functions are also continuous at a:
f ±g, cf, fg and
f
g
if g(a) ,= 0.
b) If g is continuous at a and f is continuous at g(a) then the composite function
f ◦ g (given by (f ◦ g)(x) = f(g(x))) is continuous at a.
c) The following types of functions are continuous at every point in their do-
mains: polynomials, rational functions, root functions, trigonometric functions, in-
verse trigonometric functions, exponential functions and logarithmic functions.
Intuitively, the part b) of the previous theorem is reasonable because if x is close
to a, then g(x) is close to g(a) and since f is continuous at g(a) then f(g(x)) is close
to f(g(a)).
Example 11. Where is the following function continuous?
f : A →R, f(x) =
1

x
2
+ 16 −5
Solution. The function f is the composition of four continuous functions
f = f
1
◦ f
2
◦ f
3
◦ f
4
where f
1
(x) =
1
x
, f
2
(x) = x −5, f
3
(x) =

x and f
4
(x) = x
2
+ 16.
We know that each of these functions is continuous on its domain (by Theorem 3,
part c)) and so by Theorem 3 (part a)), f is continuous on its domain.
The domain A of f is:
A = ¦x ∈ R [ x
2
+ 16 ≥ 0,
_
x
2
+ 16 ,= 5¦
= ¦x [ x = ±3¦ = (−∞, −3) ∪ (−3, 3) ∪ (3, ∞).
An important property of continuous functions is expressed by the following theorem.
126
Theorem 4. (The intermediate value theorem). Let f : [a, b] →R be a continuous
function on the closed interval [a, b] and let m be any number between f(a) and f(b).
Then there exists a number c in (a, b) such that f(c) = m.
The intermediate value theorem states that a continuous function takes on every
intermediate value between the function values f(a) and f(b).
Example 12. Show that there is a root of the equation
4x
3
−6x
2
+ 3x −2 = 0
between 1 and 2.
Solution. Let f : [1, 2] → R, f(x) = 4x
3
− 6x
2
+ 3x − 2. We are looking for a
number c between 1 and 2 such that f(c) = 0. Therefore we take a = 1, b = 2 and
m = 0 in the previous theorem.
We have
f(1) = 4 1
3
−6 1
2
+ 3 1 −2 = −1 < 0
f(2) = 4 2
3
−6 2
2
+ 3 2 −2 = 12 > 0
So, m = 0 is a number between f(1) and f(2). Since f is continuous (as a poly-
nomial function), the intermediate value theorem says there is a number c between 1
and 2 such that f(c) = 0.
The intermediate value theorem plays an important role in the way the computers
are drawing the graphs of continuous functions. A computer calculates a finite number
of points on the graph and turns on the pixels that contain these calculated points.
We end this subsection by presenting an important property of continuous which
will be used in the next sections.
Theorem 5. (The extreme value theorem). Let f : [a, b] → R be a continuous
function on [a, b]. Then there exist c, d ∈ [a, b] such that f(c) ≥ f(x) and f(d) ≤ f(x)
for all x ∈ [a, b].
The extreme value theorem says that a continuous function on a closed interval
has a maximum value and a minimum value, but it doesn’t tell us how to find these
extreme values.
3.1.2 Rates of change and derivatives
Starting from the slope of a straight line we try to introduce first the notion of
the slope of an arbitrary curve.
If a line passes through the points (x
0
, y
0
) and (x
1
, y
1
) then its slope is defined by
(see [??])
m =
y
1
−y
0
x
1
−x
0
.
The numerator, y
1
− y
0
is the change in y which occurs when x changes from x
0
to x
1
. Mathematicians often use the symbol ∆ to denote the change. Thus we write
y
1
−y
0
= ∆y and x
1
−x
0
= ∆x. Using this notation we have
m =
y
1
−y
0
x
1
−x
0
=
∆y
∆x
.
127
The quantity
∆y
∆x
tells us how fast y is changing with respect to x. It represents
the rate of change of y with respect to x. For example, if
∆y
∆x
= 2, then y is increasing
twice as fast as x, while if
∆y
∆x
= −2, then y is decreasing by two units as x increases
by one unit.
For example, if the straight line is the graph of the profits of a company, then the
slope represents the change in profits, which may be increasing or decreasing, rapidly
or slowly, depending on the sign and the size of the slope. The notion of rate of change
is fundamental in economics and includes topics as changes in profit, inflation rate or
elasticity of demand.
However, practical situations in economics rarely generate straight line graphs. In
consequence we have to extend the notion of slope to general curves. Even if the slope
of a straight line is a single number we cannot expect a single number to represent
the steepness of a curve, which changes from point to point.
Suppose that the curve can be represented by the equation y = f(x) and the point
P by the coordinates (x
0
, y
0
). We will define the slope of the curve at P to be equal to
the slope of the tangent line to the curve at P. The word tangent is derived from the
Latin word ”tangens”, which means ”touching”. Thus a tangent to a curve is a line
that touches the curve. In other words, a tangent line should have the same direction
as the curve at the point of contact. We will assume initially that for our curve the
tangent line exists (it doesn’t always exist).
Choose a second point on the curve reasonably close to P, say Q(x, y). The line
through the two points P and Q is called the secant line whose slope is easily to be
found:
m
sec
=
y −y
0
x −x
0
.
We can define the slope m of the tangent line as the limit of the slope of the secant
line as x approaches x
0
.
m = lim
x→x
0
y −y
0
x −x
0
= lim
x→x
0
f(x) −f(x
0
)
x −x
0
,
provided this limit exists.
The expression
f(x) −f(x
0
)
x −x
0
measures the average rate of change of y = f(x)
with respect to x over the interval [x
0
, x] and provides us an approximation to the
rate of change of the function f at x
0
. The approximation becomes better and better
as the intervals become shorter and shorter. This leads to the following definition of
the rate of change of f at x
0
:
lim
x→x
0
f(x) −f(x
0
)
x −x
0
(provided that the limit exists) (1)
128
`
¸
x
0 x
P(x
0
, y
0
)
Q(x, y)
s
e
c
a
n
t
l
i
n
e
ta
n
g
en
t
lin
e
The rate of change of a function f at x
0
is often called the instantaneous rate of
change of f at x
0
. Thus, the previous limit measures both the slope of the tangent
line to the graph of f at the point P(x
0
, y
0
) and the (instantaneous) rate of change
of the function f at x
0
.
Example 1. What is the slope of the parabola y = x
2
at the point P(2, 4)?
Solution.
m
tan
= lim
x→2
f(x) −f(2)
x −2
= lim
x→2
x
2
−2
2
x −2
= lim
x→2
(x −2)(x + 2)
x −2
= lim
x→2
(x + 2) = 4
¸
`
P(2, 4)
2
4
x
y
O
129
The equation of the tangent line can be obtained by using the point-slope formula
y −y
0
= m(x −x
0
)
(see [??]) with x
0
= 2, y
0
= f(2) = 4 and m = 4.
In consequence the equation will be y −4 = 4(x −2) or y = 4x −4.
Example 2. Find the slope at the origin of the curve with equation y = [x[.
Solution. We can rewrite the expression of the given curve with a split formula:
y =
_
x, if x ≥ 0
−x, if x < 0
¸
`
P(0, 0)
Q(x, −x) Q(x, x)
y
=

x y
=
x
y
First let Q approach P from the right, so that the coordinates of Q are (x, x). The
slope of any secant line is
y −y
0
x −x
0
=
x −0
x −0
= 1
so the right-hand limit of these slopes is also 1. However, when Q approaches P from
the left, its coordinates are (x, −x) which will produce a left hand limit of −1. Since
the right- and the left-hand limits are not equal, the slope at the origin is not defined.
Actually any function which has a ”sharp corner” fails to admit a tangent line at that
corner.
Since the limit of the form (1) occurs whenever we calculate a rate of change in
science and engineering it is given a special name and notation.
Definition 1. Let f : A →R, a ∈ A

.
The derivative of the function f at the point a, denoted by f

(a) is
f

(a) = lim
x→a
f(x) −f(a)
x −a
(2)
if the limit exists and takes a finite value.
If we write x = a+h, then h = x−a and h approaches 0 if and only if x approaches
a. Therefore, an equivalent way of stating the definition of the derivative is
f

(a) = lim
h→0
f(a +h) −f(a)
h
(3)
130
We say that the function f is differentiable at a, if f admits a finite derivative at
a.
We say that the function f is differentiable on the set A if f is differentiable at
each point of A.
In this case we can define the derivative function
f

: A →R by x → f

(x) = lim
h→0
f(x +h) −f(x)
h
.
Mathematicians, from the seventeenth to the nineteenth century, believed that a
continuous function usually possessed a derivative. In 1872 the German mathemati-
cian Karl Weierstrass destroyed this tenet by publishing an example of a function
that was continuous at every real number but nowhere differentiable.
Actually, the opposite of the previous assumption is true.
Theorem 1. Let f : A → R and a ∈ A

. If f is differentiable at a then f is
continuous at a.
Example 3. Find the derivative of the function f : R → R, f(x) = x
2
− 4x + 2
at the point a ∈ R.
Solution. From (3) we have
f

(a) = lim
h→0
f(a +h) −f(a)
h
= lim
h→0
[(a +h)
2
−4(a +h) + 2] −[a
2
−4a + 2]
h
= lim
h→a
a
2
+ 2ah +h
2
−4a −4h + 2 −a
2
+ 4a −2
h
= lim
h→0
2ah +h
2
−4h
h
= lim
h→0
(2a +h −4)
= 2a −4.
Example 4. Suppose C(x) = 8000 +200x −0, 2x
2
(0 ≤ x ≤ 400) is the total cost
that a company incurs in producing x units of a certain commodity.
a) What is the actual cost incurred for manufacturing the 251
st
unit?
b) Find the rate of change of the total cost with respect to x when x = 250.
c) Compare the results obtained in parts a) and b).
Solution. a) The actual cost in producing the 251
st
unit is the difference between
the total cost incurred in producing the first 251 units and the total cost of producing
the first 250 units. Thus, the actual cost is given by
C(251) −C(250) = 99, 80.
b) The rate of change of the total cost function C with respect to x is given by
the derivative of C at the point 250.
C

(250) = lim
h→0
C(250 +h) −C(250)
h
131
= lim
h→0
[8000 + 200(250 + h) − 0, 2(250 + h)
2
] − [8000 + 200 · 250 − 0, 2 · 250
2
]
h
= lim
h→0
200h −0, 2h
2
−0, 2 500h
h
= lim
h→0
(200 −0, 2h −0, 2 500) = 200 −100 = 100
c) From the solution of part (a) we know that the actual cost for producing the
251st unit of commodity is 99,80. This answer is very approximated by the answer to
part (b) which is 100.
To explain that, we observe that
C

(250) = lim
h→0
C(250 +h) −C(250)
h

C(250 +h) −C(250)
h
for h sufficiently small.
Taking h = 1 (which is small enough compared to 250) we have
C

(250) ≈ C(251) −C(250).
The cost of producing an additional unit of a certain commodity is called the
marginal cost. If C(x) is the total cost function in producing x units of a certain
commodity then the marginal cost in producing one additional unit is C(x+1)−C(x).
This quantity can be approximated as in the previous example by the rate of change
C

(x) ≈ C(x + 1) −C(x).
For this reason, economists have defined the marginal cost function to be the
derivative of the corresponding total cost function. Thus the word ”marginal” is syn-
onymous with ”derivative of”.
When we try to apply the definition of the derivative we encounter difficulties of
an algebraic nature. In the next example it can be seen how much work is needed to
compute the derivative of a relatively simple function.
Example 5. Find f

(x) for f(x) =
2x + 1
x −2
, f : R ¸ ¦2¦ →R.
Solution.
f

(x) = lim
h→0
f(x +h) −f(x)
h
= lim
h→0
2x + 2h + 1
x +h −2

2x + 1
x −2
h
= lim
h→0
(2x + 2h + 1)(x −2) −(x +h −2)(2x + 1)
(x +h −2)(x −2)h
= lim
h→0
−5h
(x +h −2)(x −2)h
= lim
h→0
−5
(x +h −2)(x −2)
= −
5
(x −2)
2
132
So, the technical aspects of differentiation are complex. We need other techniques
for computing derivatives, which avoid using the formal definition. For this, we will
consider the various ways in which two functions may be combined to form a new
function. The technique for handling such combinations are generally known as rules
of differentiation. The most important of them are the following.
Rule 1. Constant multiple rule
The derivative of cf (where c is a constant) is cf

; (cf)

= cf

.
Rule 2. Sum rule
The derivative of f +g is f

+g

; (f +g)

= f

+g

.
Rule 3. Difference rule
The derivative of f −g is f

−g

; (f −g)

= f

−g

.
Rule 4. Product rule
The derivative of fg is f

g +fg

; (fg)

= f

g +fg

.
Rule 5. Quotient rule
The derivative of
f
g
is
f

g −fg

g
2
;
_
f
g
_

=
f

g −fg

g
2
.
Rule 6. Chain rule
The derivative of the composite function f ◦ g is (f

◦ g) g

.
(f ◦ g)

= (f

◦ g) g

so that (f ◦ g)

(x) = f

(g(x)) g

(x).
The general rules presented before, allow us to compute derivatives of complicated
functions which are constructed from the basic ones.
Example 6. Differentiate f : R →R, f(x) = (x
3
−2x
2
+ 4)(8x
2
+ 5x).
Solution. From the product rule we have
f

(x) = (x
3
−2x
2
+ 4)

(8x
2
+ 5x) + (x
2
−2x
2
+ 4)(8x
2
+ 5x)

= (3x
2
−4x)(8x
2
+ 5x) + (x
2
−2x
2
+ 4)(16x + 5)
Example 7. Differentiate f : D →R, f(x) =
3x
2
−1
2x
3
+ 5x
2
+ 7
.
Solution. From the quotient rule we have
f

(x) =
(3x
2
−1)

(2x
3
+ 5x
2
+ 7) −(3x
2
−1)(2x
3
+ 5x
2
+ 7)

(2x
3
+ 5x
2
+ 7)
2
=
6x(2x
3
+ 5x
2
+ 7) −(3x
2
−1)(6x
2
+ 10x)
(2x
3
+ 5x
2
+ 7)
2
=
−6x
4
+ 6x
2
+ 52x
(2x
3
+ 5x
2
+ 7)
2
Example 8. Find the derivative of h,
h : R →R, h(x) = (x
3
+ 6x
2
−5x + 2)
7
.
Solution. We will split h into its constituent parts
h(x) = [g(x)]
7
, where g(x) = x
3
+ 6x
2
−5x + 2
= (f ◦ g)(x), where f(x) = x
6
By the chain rule, we get
h

(x) = f

(g(x)) g

(x) = 7g
6
(x) g

(x)
= 7(x
3
+ 6x
2
−5x + 2)
6
(3x
2
+ 12x −5)
133
Example 9. Find the derivative of h,
h : R →R, h(x) =
_
6x
6
+x
2
+ 4.
Solution. We first split h into its constituent functions
h(x) =
_
g(x), where g(x) = 6x
6
+x
2
+ 4
= (f ◦ g)(x), where f(x) =

x
By the chain rule
h

(x) = f

(g(x)) g

(x) =
1
2
_
g(x)
g

(x)
=
36x
5
+ 2x
2

6x
6
+x
2
+ 4
Example 10. Find the derivative of
h(x) =
_
x
2
+ sin
2
x.
Solution.
h

(x) =
1
2
_
x
2
+ sin
2
x
(x
2
+ sin
2
x)

=
1
2
_
x
2
+ sin
2
x
(2x + 2 sin x(sin x)

)
=
x + sin xcos x
_
x
2
+ sin
2
x
Example 11. Differentiate f,
f : R

→R, f(x) = e
1
x
3
.
Solution.
f

(x) = e
1
x
3
_
1
x
3
_

= e
1
x
3
(x
−3
)

= e
1
x
3
(−3x
−4
) = −3e
1
x
3

1
x
4
Example 12. Differentiate f,
f : R →R, f(x) = ln(e
4x
+e
−4x
).
Solution.
f

(x) = (ln(e
4x
+e
−4x
))

=
1
e
4x
+e
−4x
(e
4x
+e
−4x
)

=
1
e
4x
+e
−4x
[e
4x
(4x)

+e
−4x
(−4x)

] =
4e
4x
−4e
−4x
e
4x
+e
−4x
134
Example 13. Let f : R →R, f(x) = (x
2
−4)
3
.
Compute f

(2) in two different ways.
Solution. One way of computing f

(2) is to use the definition of the derivative at
a given point.
f

(2) = lim
x→2
f(x) −f(2)
x −2
= lim
x→2
(x
2
−4)
3
−0
x −2
= lim
x→2
(x −2)
3
(x + 2)
3
x −2
= lim
x→2
(x −2)
2
(x + 2)
3
= 0 4
3
= 0
The second way is to determine first the derivative function f

and then evaluate
it at x = 2
f

(x) = [(x
2
−4)
3
]

= 3(x
2
−4)
2
(x
2
−4)

= 3(x
2
−4)
2
2x = 6x(x
2
−4)
2
So,
f

(2) = 6 2 (2
2
−4)
2
= 0,
as we expected.
If f is a differentiable function, then its derivative f

is also a function, so f

may
have a derivative of its own, denoted by (f

)

= f
′′
. This new function f
′′
(if it exists)
is called the second derivative of f because it is the derivative of the derivative of f.
Example 14. If f : R →R, f(x) = xsin x, find f
′′
.
Solution. Using the product rule, we have
f

(x) = x

sin x +x(sin x)

= sin x +xcos x
To find f
′′
we differentiate f

f
′′
(x) = (sin x +xcos x)

= (sin x)

+ (xcos x)

= cos x +x

cos x +x(cos x)

= cos x + cos x −xsin x
= 2 cos x −xsin x.
The third derivative f
′′′
is the derivative of the second derivative: f
′′′
= (f
′′
)

.
The process can be continued.
The fourth derivative is usually denoted by f
(4)
.
In general, the nth derivative of f is denoted by f
(n)
and is obtained from f by
differentiating n times.
We will end this subsection by presenting the l’Hospital’s rule for computing limits.
The l’Hospital rule talks about a method to calculate limits of fractions where the
denominator and the numerator both go to zero or both go to infinity.
135
These forms of a limit are said to be indeterminate, since we cannot say what the
limit will be.
In an indeterminate form ”
0
0
” we do not know ”how fast” each is going to 0. If
the numerator goes to zero ”faster” then the denominator, we can expect that the
limit is zero. But if the denominator goes to zero ”faster” then the numerator of the
fraction will be a large number. Finally, if the numerator and the denominator are
going to zero equally fast, then the limit will be a nonzero real number. In any case,
the limit cannot be determined just looking at the form
0
0
.
Theorem 1. (l’Hˆospital’s rule for
0
0
)
Let f and g be functions and a ∈ R. If
(a) f and g are differentiable in some interval (a −h, a +h) with h > 0,
(b) lim
x→a
f(x) = 0 = lim
x→a
g(x) and
(c) lim
x→a
f

(x)
g

(x)
(allowing the limits +∞ and −∞)
then the limit lim
x→a
f(x)
g(x)
exists and
lim
x→a
f(x)
g(x)
= lim
x→a
f

(x)
g

(x)
.
The rule works also for limits of the indeterminate form
_


_
.
Theorem 2. (l’Hospital’s rule for
_


_
).
If f and g are functions that satisfy (a) and (c) above together with
(b’) lim
x→a
[f(x)[ = ∞ = lim
x→a
[g(x)[
then
lim
x→a
f(x)
g(x)
= lim
x→a
f

(x)
g

(x)
.
Either form of l’Hospital’s rule works for limits at infinity, lim
x→∞
f(x)
g(x)
or
lim
x→−∞
f(x)
g(x)
, as long as for one sided limits. In each case, condition (a) has to be
adapted correspondingly.
For example,
lim
x→∞
f(x)
g(x)
= lim
x→∞
f

(x)
g

(x)
if
(a) f and g are differentiable on some interval (b, ∞),
(b) lim
x→∞
f(x) = 0 = lim
x→∞
g(x) or lim
x→∞
[f(x)[ = ∞ = lim
x→∞
[g(x)[ and
(c) lim
x→∞
f

(x)
g

(x)
exists.
136
The following limits are all of the form
_
0
0
_
or
_


_
, but their answers are all
different.
Example 15. Find lim
x→0
x −sin x
x
3
.
Solution.
lim
x→0
x −sin x
x
3
[
0
0
]
= lim
x→0
(x −sin x)

(x
3
)

= lim
x→0
1 −cos x
3x
2
if the last limit exists. But the last limit is also of the indeterminate form
_
0
0
_
and
so we can try the l’Hospital’s rule one more time:
lim
x→0
1 −cos x
3x
2
= lim
x→0
(1 −cos x)

(3x
2
)

= lim
x→0
sin x
6x
[
0
0
]
= lim
x→0
cos x
6
=
1
6
Example 16.
lim
x→0
sin x
x
3
[
0
0
]
= lim
x→0
(sin x)

(x
3
)

= lim
x→0
cos x
3x
2
=
1
+0
= +∞
Example 17.
lim
x→∞
e
x
x
2
+ 4x + 2
[


]
= lim
x→∞
(e
x
)

(x
2
+ 4x + 2)

= lim
x→∞
e
x
2x + 4
[


]
= lim
x→∞
(e
x
)

(2x + 4)

= lim
x→∞
e
x
2
= ∞.
There are other indeterminate forms as we can see in the following examples.
a) lim
xց0

xln x is of the form [0 ∞]
b) lim
xց0
x
sin x
is of the form [0
0
]
c) lim
x→∞
(e
x
−x)
1
x
2
is of the form [∞
0
]
d) lim
x→∞
_
x
2
+ 1
x
2
_
2x
is of the form [1

]
e) lim
x→0
_
1
x

sin x
x
2
_
is of the form [∞−∞].
To find the limits of these indeterminate form, we can rewrite the functions as
quotients, as we can see below.
Example 18. Find lim
xց0

xln x.
Solution. The limit is of the indeterminate form 0 ∞. We rewrite it as a quotient
as follows:
lim
xց0

xln x = lim
xց0
ln x
x

1
2
.
137
Now, the limit is of the indeterminate form
_


_
. We use the l’Hospital’s rule.
lim
xց0

xln x = lim
xց0
ln x
x

1
2
= lim
xց0
(ln x)

(x

1
2
)

= lim
xց0
1
x

1
2
x

3
2
= −2 lim
xց0
x
1
2
= 0
Example 19. Find lim
x→0
_
1
x

sin x
x
2
_
.
Solution. The limit is of the indeterminate form ”∞ − ∞”. We first rewrite
1
x

sin x
x
2
in the form of a quotient
1
x

sin x
x
2
=
x −sin x
x
2
.
Now, the limit
lim
x→0
_
1
x

sin x
x
2
_
= lim
x→0
x −sin x
x
2
[
0
0
]
= lim
x→0
(x −sin x)

(x
2
)

= lim
x→0
1 −cos x
2x
[
0
0
]
= lim
x→0
(1 −cos x)

(2x)

= lim
x→0
sin x
2
= 0.
The form 0
0
, 1

, ∞
0
and 0

are indeterminate powers.
We will use the exponential and logarithmic functions to convert them into an
indeterminate product.
Example 20. Find lim
xց0
x
sin x
.
Solution. By using the equality a = e
ln a
we can write
x
sin x
= e
ln x
sin x
= e
sin x ln x
Hence,
lim
xց0
x
sin x
= lim
xց0
e
sin x ln x
= e
lim
xց0
sin x ln x
.
The last limit is of the indeterminate form 0 ∞. So, we rewrite it as an indeter-
minate quotient and use the l’Hospital rule
lim
xց0
sin xln x
[0·∞]
= lim
xց0
ln x
1
sin x
[


]
= lim
xց0
1
x

cos x
sin
2
x
= lim
xց0
−sin x
x
tg x = − lim
xց0
sin x
x
lim
xց0
tg x = −1 0 = 0.
138
In the previous equality we use lim
x→0
sin x
x
= 1. Indeed
lim
x→0
sin x
x
[
0
0
]
= lim
x→0
cos x
1
= 1.
Finally, we obtain that lim
xց0
x
sin x
= e
0
= 1.
Example 21. Find lim
x→∞
_
x
2
+ 1
x
2
_
2x
.
Solution.
lim
x→∞
_
x
2
+ 1
x
2
_
2x
= lim
x→∞
e
ln
_
x
2
+1
x
2
_
2x
= lim
x→∞
e
2x ln
_
x
2
+1
x
2
_
.
The last limit is of the indeterminate form [0 ∞].
lim
x→∞
2xln
x
2
+ 1
x
2
[∞·0]
= lim
x→∞
ln
x
2
+ 1
x
2
1
2x
[
0
0
]
= lim
x→∞
2x
x
2
+ 1

2x
x
2

1
2x
2
= lim
x→∞
2x
3
−2x
3
−2x
(x
2
+ 1)x
2

1
2x
2
= lim
x→∞
4x
x
2
+ 1
[


]
= lim
x→∞
4
2x
= 0.
In consequence we have
lim
x→∞
_
x
2
+ 1
x
2
_
2x
= e
0
= 1.
We end this section by mentioning the fact that the l’Hospital’s rule was first pub-
lished in 1696 in the Marquis de l’Hospital calculus textbook ”Analyse des infiniment
petits”. The rule actually was discovered in 1694 by the Swiss mathematician Johann
Bernoulli. That fact was possible because the Marquis de l’Hospital bought the rights
to Bernoulli’s mathematical discoveries.
3.1.3 Linear approximation and differentials
A curve lies very close to its tangent line near the point of tangency. This observa-
tion is the basis for a method of finding approximate values of functions. We use the
tangent line at (a, f(a)) as an approximation to the curve y = f(x) when x is near a.
The equation of the tangent line at the point (a, f(a)) is
y = f(a) +f

(a)(x −a)
139
and the correspondent approximation will be
f(x) ≈ f(a) +f

(a)(x −a) (1)
The relation (1) is called the linear approximation of f at a.
Example 1. Find the linearization of the function f : [−3, ∞) →R,
f(x) =

x + 3 at a = 1 and use it to approximate the numbers

3, 98 and

4, 02.
Solution. The derivative of f is f

,
f

: (−3, ∞) →R, f

(x) =
1
2

x + 3
and so we have f(1) = 2 and f

(1) =
1
4
.
The approximation formula will be
f(x) ≈ f(1) +f

(1) (x −1)
(when x is near 1)

x + 3 ≈ 2 +
1
4
(x −1)

x + 3 ≈
7
4
+
x
4
.
In particular we have
_
3, 98 =
_
0, 98 + 3 ≈
7
4
+
0, 98
4
= 1, 995
_
4, 02 =
_
1, 02 + 3 ≈
7
4
+
1, 02
4
=
8, 02
4
= 2, 005
The linear approximation is illustrated in the figure below.
`
¸
y =
7
4
+
x
4
(1, 2)
y =

x + 3
140
We can see that the tangent line approximation is a good approximation to the
given function when x is near 1.
These approximation ideas are formulated in terminology of differentials. If y =
f(x), where f is a differentiable function at a then the differential of f at the point a
is the following function:
df
(a)
: R →R, defined by df
(a)
(h) = f

(a)h (2)
Sometimes in the previous relation we use dx instead of h and dy instead of
df
(a)
(h), so we have
dy = f

(a)dx.
The geometric meaning of differentials is shown in the figure below. Let P(a, f(a))
and Q(a + ∆x, f(a + ∆x)) be points on the graph of f and let dx = ∆x. The corre-
spondent change in y is ∆y = f(a + ∆x) −f(a). The slope of the tangent line PR is
the derivative f

(a).
`
¸
`
·
P
a a + ∆x
dx = ∆x
∆y
_
dy
R
Q
df
(a)
(dx) = dy represents the change in linearization whereas ∆y represents the change
in function. The approximation ∆y ≈ dy becomes better as ∆x = dx becomes smaller.
For complicated functions it may be impossible to compute ∆y exactly. In such cases
the approximation by differentials is useful
f(a +dx) ≈ f(a) +f

(a)dx. (3)
3.1.4 Extreme values of a real valued function
Some of the most important applications of differential are optimization problems.
These problems can be reduced to finding the maximum or minimum values of a
function.
141
Definition 1. Let f : A →R and a ∈ A.
The function f has a global maximum at a if f(a) ≥ f(x) for all x in A. The
number f(a) is called the maximum value of f on A.
The function f has a global minimum at a if f(a) ≤ f(x) for all x in A. The
number f(a) is called the minimum value of f on A.
The maximum and minimum values of f are called the extreme values of f.
Definition 2. Let f : A →R and a ∈ A.
The function f has a local maximum at a if f(a) ≥ f(x) where x is near a. (This
means that f(a) ≥ f(x) for all x in some open interval containing a).
The function f has a local minimum at a if f(a) ≤ f(x) where x is near a.
Example 1. Determine the extreme values of the function
f : R →R, f(x) = sin x.
Solution. Since −1 ≤ sin x ≤ 1 for all x ∈ R and
sin
_
π
2
+ 2nπ
_
= 1
for any integer n then the function f takes its (local and global) maximum value of 1
infinitely many times.
In the same way −1 is its minimum value (local and global). This value is taken
infinitely many times too, since
sin
_

2
+ 2nπ
_
= −1 for all n ∈ Z.
Example 2. Determine the extreme values of the function
f : R →R, f(x) = x
3
.
Solution.
`
¸
y
x
142
From the graph of the function f we see that this function has neither an absolute
maximum value nor an absolute minimum value.
We have seen that some functions have extreme values, whereas others do not. The
extreme value theorem (Theorem 5, subsection 3.1.1) says that a continuous function
on a closed interval has a maximum value and a minimum value, but it doesn’t tell us
how to find these extreme values. In the next figure we sketch the graph of a function
f with a local maximum at c and a local minimum at d.
`
¸
(c, f(c))
(d, f(d))
d
c
It seems that at the maximum and minimum points the tangent lines are parallel
to the x-axis and in consequence each has slope 0. Since the slope of the tangent line
is the derivative we may believe that f

(c) = f

(d) = 0.
The following theorem shows us that this remark is always true for differentiable
functions.
Theorem 1. (Fermat’s theorem). Let f : I → R, I ⊆ R, I an open interval and
a ∈ I. If f is differentiable at a and f has a local maximum or minimum at a then
f

(a) = 0.
The example 2 shows us that we can’t expect to locate extreme values by setting
f

(x) = 0 and solving for x. Indeed if f(x) = x
3
, then f(x) = x
3
, so f

(x) = 3x
2
et f

(0) = 0. But f has no maximum or minimum at 0, as we already mention in
discussing example 2.
Example 3. Let f : R → R, f(x) = [x[. The graph of f is showed below. The
function f has a minimum at 0, but this value can’t be found by solving the equation
f

(x) = 0 since f is not differentiable at x = 0. Indeed
lim
x→0
f(x) −f(0)
x
= lim
x→0
[x[
x
does not exist since
lim
xց0
[x[
x
= lim
xց0
x
x
= 1 and lim
xր0
[x[
x
= lim
xր0
−x
x
= −1.
143
¸
`
O
x
y
f(x) = [x[
In conclusion we can observe that the converse of Fermat’s theorem is false.
In fact, the Fermat’s theorem say that we have to seek the local extreme points
among the solutions of the equation f

(x) = 0 or among the points for which f is not
differentiable.
These points (solutions for f

(x) = 0 or points for which f is not differentiable)
are called critical points.
Example 4. Find the critical points of the function
f : R →R, f(x) =
3

x(3 −x).
Solution. We rewrite first the function f as
f(x) = 3x
1
3
−x
4
3
and so
f

(x) = x

2
3

4
3
x
1
3
= x

2
3
_
1 −
4x
3
_
=
1 −
4
3
x
x
2
3
, for all x ,= 0.
Therefore, f

(x) = 0 if 1 −
4
3
x = 0 that is x =
3
4
. f is not differentiable at x = 0.
Thus the critical points are x = 0 and x =
3
4
.
Remark 1. To find an absolute maximum or minimum of a continuous function
f on a closed interval we find the critical points of f in (a, b) and compute the values
of function f at the critical points and at the endpoints of the interval. The largest of
the previous values is the absolute maximum value and the smallest of these values
is the absolute minimum value.
Example 5. Find the absolute maximum and minimum values of the function
f : [−2, 1] →R, f(x) = x
3
+ 2x
2
−1.
Solution. f

(x) = 3x
2
+ 4x = (3x + 4)x.
144
The function f is differentiable on (−2, 1) so the critical points are the solutions
of f

(x) = 0.
f

(x) = 0 ⇔ x(3x + 4) = 0 ⇔ x = 0 et x = −
4
3
.
The values of f at critical points are
f(0) = −1 and f
_

4
3
_
=
5
27
.
The values of f at the endpoints of the interval are
f(−2) = −1 and f(1) = 2.
Comparing these values we see that the absolute maximum value is f(1) = 2 and
the absolute minimum value is f(0) = f(−2) = −1.
As we have already shown the derivative function f

is very useful in studying the
properties of the given function f.
Next, we will present two important facts which summarize this connection.
Theorem 2. (Rolle’s theorem). Let f : [a, b] → R be continuous on [a, b] and
differentiable on (a, b) such that f(a) = f(b). Then there exists a number c ∈ (a, b)
such that f

(c) = 0.
As a first application of Rolle’s theorem present the second order Taylor’s formula
for a function whose second derivative is continuous on an interval.
Remark 2. (Taylor’s formula with the remainder in the Lagrange’s form)
Let f : I →R where I is an open interval and let a ∈ I. Suppose that the second
derivative f
′′
is continuous on I.
For each x ∈ I there exists a number c between a and x such that
f(x) = f(a) +f

(a)(x −a) +
1
2
f
′′
(c)(x −a)
2
.
Proof. The previous equality is true for x = a. So, let x ∈ I, x ,= a. We define
the function g : I →R given by
g(t) = f(t) −f(a) −f

(a)(t −a) −α(t −a)
2
where α is chosen so that g(x) = 0.
We easily obtain that
α =
1
(x −a)
2
[f(x) −f(a) −f

(a)(x −a)].
We also have that g(a) = g

(a) = 0.
We apply Rolle’s theorem to the function g defined on the interval
[min(a, x), max(a, x)] to find c
1
between a and x so that g

(c
1
) = 0.
We apply Rolle’s theorem to the function g

defined on the interval
[min(a, c
1
), max(a, c
1
)] to find c between a and c (hence c lies between a and x)
so that g
′′
(c) = 0.
145
On the other hand, the second derivative of g
g
′′
(t) = f
′′
(t) −2α,
where we easily get that α =
1
2
f
′′
(c).
By putting t = x and α =
1
2
f
′′
(c) in the expression of g we get
0 = f(x) −f(a) −f

(a)(x −a) −
1
2
f
′′
(c)(x −a)
2
,
which completes the proof.
The main use of Rolle’s theorem is proving the following important theorem, which
was first stated by the french mathematician, Joseph-Louis Lagrange.
Theorem 3. (The mean value theorem, Lagrange’s theorem). Let f : [a, b] → R
be continuous on [a, b] and differentiable on (a, b). Then there is a number c ∈ (a, b)
such that
f

(c) =
f(b) −f(a)
b −a
(1)
or, equivalently
f(b) −f(a) = f

(c)(b −a).
By interpreting geometrically the mean value theorem, we can see that it is rea-
sonable. Indeed, if A(a, f(a)) and B(b, f(b)) are points on the graph of f (see figures
below) then the slope of the secant line AB is
m
AB
=
f(b) −f(a)
b −a
which is the same as the right side of equality (1). Since f

(c) is the slope of the
tangent line at the point (c, f(c)) the mean value theorem says that there is at least
one point P(c, f(c)) on the graph where the tangent line is parallel to the secant line
AB.
`
¸
a
b
A
B
P(c, f(c))
146
`
¸
P
1
P
2
B
A
a
b
The mean value theorem helps us to obtain information about a function from
information about its derivative.
Example 6. Let f : R → R be a differentiable function. Suppose that f(2) = 2
and f

(x) ≤ 2 for all value of x. How large f(4) can be?
Solution. We can apply the mean value theorem on the interval [2, 4]. There exists
a number c such that
f(4) −f(2) = f

(c)(4 −2)
so
f(4) = f(2) +f

(c) 2 = 2 +f

(c) 2.
We are given that f

(x) ≤ 2 for all x, so in particular we know that f

(c) ≤ 2. So,
f(4) = 2 +f

(2) 2 ≤ 2 + 2 2 = 6.
The largest possible value for f(2) is 6.
The mean value theorem is useful in establishing the following basic properties of
differentiable functions.
Theorem 4. Let f : (a, b) → R be a differentiable function. If f

(x) = 0 for all
x ∈ (a, b) then f is constant on (a, b).
Theorem 5. Let f, g : (a, b) →R be two differentiable functions. If f

(x) = g

(x)
for all x ∈ (a, b), then f − g is constant on (a, b); there is a constant c such that
f = g +c.
Definition 3. Let f : A →R, A ⊆ R.
a) We say that f is increasing on A if for all x
1
, x
2
∈ A with x
1
< x
2
we have
f(x
1
) < f(x
2
).
b) We say that f is decreasing on A if for all x
1
, x
2
∈ A with x
1
< x
2
we have
f(x
1
) > f(x
2
).
Theorem 6. Let f : (a, b) →R be a differentiable function.
a) If f

(x) > 0 for all x ∈ (a, b) then f is increasing on (a, b).
b) If f

(x) < 0 for all x ∈ (a, b) then f is decreasing on (a, b).
Example 7. Find the intervals where the function
f : R →R, f(x) = 3x
4
−24x
2
+ 2
147
is increasing and where it is decreasing.
Solution. f

(x) = 12x
3
−48x = 12x(x
2
−4) = 12x(x −2)(x + 2)
We have to solve the following inequations:
f

(x) > 0 and f

(x) < 0.
This depends on the sign of the three factors of f

(x), namely, 12x, x − 2 and
x + 2.
The critical points of f are x = 0, x = 2 and x = −2.
We can arrange the signs of f

(x) in the following table.
x −∞ −2 0 2 +∞
12x − − − − 0 + + + +
x −2 − − − − − − 0 + +
x + 2 − − 0 + + + + + +
f

(x) − − 0 + 0 − 0 + +
f(x) ց ց f(−2) ր f(0) ց f(2) ր ր
f is decreasing on (−∞, −2) and on (0, 2).
f is increasing on (−2, 0) and on (2, ∞).
Recall that when a function has a relative extremum it must occur at a criti-
cal value. We will now combine the ideas mentioned before to obtain two tests for
determining when a critical value is a relative extremum point of a given function.
Theorem 7. (First derivative test for relative extrema) Let f : D → R and let
[a, b] ⊂ D.
Suppose that f is continuous on [a, b] and differentiable on (a, b) except possibly
at the critical value c.
(a) If f

(x) > 0 for a < x < c and f

(x) < 0 for c < x < b, then c is a relative
maximum point of f.
(b) If f

(x) < 0 for a < x < c and f

(x) > 0 for c < x < b, then c is a relative
minimum point of f.
(c) If f

(x) has the same algebraic sign on a < x < c and c < x < b then c is not
an extremum point of f.
Example 8. Determine the extreme points of the function
f : R →R, f(x) = 3x
4
−24x
2
+ 2.
Solution. By using the previous theorem and the results obtained in Example 7
we obtain that −2 and 2 are relative minimum points and 0 is a relative maximum
point.
Another geometric property of a graph of a given function is its ”concavity”.
Visually, concavity is easy to recognize. If a graph is ”smiling” at you, it is concave
up (or convex); if it is ”frowning” at you, it is concave down (or simply concave).
148
`
¸
x
y
concave up: ”smiling:
`
¸
x
y
concave down: ”frowning”
A mathematical characterization of concavity involves the second derivative of the
given function.
Theorem 8. (Test for concavity) Let f : I → R, I ⊆ R, where I is an open
interval.
(a) If f
′′
(x) > 0 for all x ∈ I then f is concave up on I.
(b) If f
′′
(x) < 0 for all x ∈ I then f is concave down on I.
Example 9. Find the intervals where the graph of
f : R →R, f(x) = 2x
3
−6x
2
is concave up and where it is concave down.
Solution. We have
f

(x) = 6x
2
−12x
and
f
′′
(x) = 12x −12 = 12(x −1).
The sign of the second derivative is given in the following table
x −∞ 1 +∞
f
′′
(x) − − 0 + +
f(x) ⌢ f(1) ⌣
concave down concave up
Thus the graph is concave up on (1, ∞) and concave down on (−∞, 1).
Another application of the second derivative is the following test for maximum
and minimum values. It is a consequence of the concavity test.
Theorem 9. (The second derivative test) Let f : A → R and c ∈ A. Suppose f
′′
is continuous near c (that is f
′′
is continuous on an interval (c −h, c +h)).
(a) If f

(c) = 0 and f
′′
(c) > 0 then f has a local minimum at c.
(b) If f

(c) = 0 and f
′′
(c) < 0 then f has a local maximum at c.
(c) If f

(c) = 0 and f
′′
(c) = 0 then the test is inconclusive.
149
Part (a) is true because f
′′
(x) > 0 near c and so f is concave up near c. This
means that the graph of f lies above its horizontal tangent at c (since f

(c) = 0) and
so f has a local minimum at c.
Part (b) is true because f
′′
(x) < 0 near c and so f is concave down near c. This
means that the graph of f lies below its horizontal tangent at c (since f

(c) = 0) and
so f has a local maximum at c.
Example 10. Use the second derivative test to find the extrema of the following
function:
f : R →R, f(x) = 3x
4
−8x
3
+ 6x
2
.
Solution. We have to evaluate the second derivative at the critical points. First,
we determine the critical points of f.
f

(x) = 12x
3
−24x
2
+ 12x = 12x(x
2
−2x + 1) = 12x(x −1)
2
.
In consequence f

(x) = 0 for x = 0 and x = 1.
Now, find the second derivative, and test the sign at x = 0 and x = 1.
f
′′
(x) = 36x
2
−48x + 12 = 12(3x
2
−4x + 1).
Since f
′′
(0) = 12 > 0 then the function f has a relative minimum point at x = 0
Since f
′′
(1) = 0 the test fails so we have to use the first derivative test. The sign
of the first derivative is given in the table below.
x −∞ 0 1 +∞
12x − − 0 + + + +
(x −1)
2
+ + + + 0 + +
f

(x) − − 0 + 0 + +
f(x) ց ց f(0) ր f(1) ր ր
The part (c) of the first derivative test shows that 1 is not a relative extreme point.
3.1.5 Applications to economics
In subsection 3.1.2 we introduced the idea of marginal cost. Recall that if C is the
cost function and C(x) is the cost of producing x units of a certain product then the
marginal cost is the rate of change of C with respect to x. In fact the marginal cost
function is the derivative C

of the cost function. We also consider the average cost
function
c(x) =
C(x)
x
representing the cost per unit if x units are produced. We want to find what happens
at a minimum point of the average cost function.
Theorem 1. a) If a is a minimum point for c then C

(a) = c(a).
b) If the marginal cost is less then the average cost function decreases.
c) If the marginal cost is greater then the average cost then the average cost in-
creases.
150
Proof. a) If a is a minimum point of function c then c

(a) = 0 (as a consequence
of Fermat’s theorem).
By applying the quotient rule we have
c

(x) =
_
C(x)
x
_

=
C

(x) x −C(x)
x
2
=
x
_
C

(x) −
C(x)
x
_
x
2
=
C

(x) −c(x)
x
.
Since c

(a) = 0 then C

(a) −c(a) = 0 and so C

(a) = c(a).
b) If the marginal cost is less then the average cost then
c

(x) =
C

(x) −c(x)
x
< 0
and c is a decreasing function (by Theorem 6, subsection 3.1.4).
c) If the marginal cost is greater then the average cost then
c

(x) =
C

(x) −c(x)
x
> 0
and c is an increasing function (by Theorem 6, subsection 3.1.4). This completes the
proof.
The part a) of the previous theorem says if the average cost is minimum then the
marginal cost equals to the average cost.
We have the following explanation for parts b) and c) of the previous theorem.
The marginal cost is (approximatively) the cost of producing one additional unit
of the considered product (see Example 4, subsection 3.1.2).
If the addition unit cost is less then the average cost this less expensive unit will
determine the average cost per unit to decrease.
If the additional unit cost is greater then the average cost this more expensive
unit will determine the average cost per unit to increase.
This principle is plausible because if the marginal cost is smaller then the average
cost then it should be produced more in order to lower the average cost. Similarly, if
the marginal cost is greater then the average cost, then it would be produced less in
order to lower the average cost.
We also consider the revenue function R, R(x) representing the income from the
sale of x units of the product. The derivative R

is called the marginal revenue function.
If x units are sold the price function p will be defined by
p(x) =
R(x)
x
.
The function P = R−C is naturally called the profit function and the derivative
P

is called the marginal profit function. Note that
P

(x) = R

(x) −C

(x) = 0 if R

(x) = C

(x).
151
We therefore conclude that.
Theorem 2. If the profit is maximum, then the marginal revenue is equal to the
marginal cost.
Remark 1. It is often appropriate to represent a total cost function by a polyno-
mial (usually of degree three)
C(x) = a +bx +cx
2
+dx
3
where a represents the overhead cost (rent, heat, maintenance) and the other terms
represent the cost of raw materials, labor and so on. The cost raw materials may be
proportional to x but labor costs might depend partially on higher powers of x.
Example 1. A publisher of a calculus text book works with a cost function
C(x) = 50000 + 20x −
1
10
4
x
2
+
1
3 10
8
x
3
and a price function
p(x) = 120 −
1
10
4
x,
both in dollars. Determine the maximum of the profit function.
Solution. Clearly we have
C

(x) = 20 −
1
5 10
3
x +
1
10
8
x
2
and
C
′′
(x) = −
1
5 10
3
+
1
5 10
7
x
so that
C
′′
(x) = 0 pour x = 10
4
.
The marginal cost increases after 10000 copies. On the other hand we have
R(x) = xp(x) = 120x −
1
10
4
x
2
and
R

(x) = 120 −
1
5 10
3
x
Maximum profit occurs when P

(x) = 0 and P
′′
(x) < 0, so that R

(x) = C

(x)
120 −
1
5 10
3
x = 20 −
1
5 10
3
x +
1
10
8
x
2
with the solution x = 10
5
. If we want to use the second derivative test to establish
the nature of the critical point x = 10
5
we have to evaluate P
′′
(10
5
).
P
′′
(10
5
) = R
′′
(10
5
) −C
′′
(10
5
) = −
1
5 10
3
+
1
5 10
3

1
5 10
7
10
5
< 0.
152
This means that maximum profit occurs when exactly 100.000 copies are produced
and sold. The income is then
R(10
5
) = 11 10
6
at p(10
5
) = 110 dollars per copy. The cost is
C(10
5
) =
1315
3
10
4
dollars.
The maximum of the profit function will be:
P(10
5
) = R(10
5
) −C(10
5
) = 11 10
6

1315
3
10
4
=
1985
3
10
4
dollars.
Finally, we will use the marginal concepts introduced before to derive an important
criterion used by economists to analyze the demand function. This concept is the price
elasticity of demand.
In mathematics, elasticity of a differentiable function f at a point x is defined as
E(x) =
xf

(x)
f(x)
which can be rewritten in the following two different forms
E(x) =
f

(x)
f(x)
1
x
=
(ln f(x))

(ln x)

(1)
or
E(x) =
x
f(x)
f

(x) =
x
f(x)
lim
∆x→0
f(x + ∆x) −f(x)
∆x
= lim
∆x→0
f(x + ∆x) −f(x)
f(x)
∆x
x

f(x + ∆x) −f(x)
f(x)
100
∆x
x
100
=
percentage change in f
percentage change in x
(2)
So
E(x) ≈
percentage change in f
percentage change in x
(3)
If we use the notations y = f(x) or y = y(x), then the x point elasticity of y is
denoted by
E
x
y
=
xf

(x)
f(x)
.
Remark 2. If y = y(x), then the y point elasticity of x is
E
y
x
=
1
E
x
y
.
153
Proof. If y = f(x), then x = f
−1
(y) and by using the definition of elasticity
E
y
x
=
y(f
−1
)

(y)
f
−1
(y)
=
y
1
f

(f
−1
(y))
x
=
f(x)
1
f

(x)
x
=
f(x)
xf

(x)
=
1
E
x
y
Next, we shall present the following economic example.
The demand for a product is usually related to its price. In most cases, the demand
decreases when the price increases. The sensitivity of demand to changes in price varias
from one product to another.
For some products small percentage changes in price have little effect on demand.
For other products small percentage changes in price have considerable effect on
demand. We want to measure the sensitivity of demand to changes in price.
Definition 1. If p represents the price per unit of a certain product and Q rep-
resents the demand function (in fact Q(p) is the number of the considered product)
then the price elasticity of demand is (see (2))
E
p
Q
=
pQ

(p)
Q(p)
= lim
∆p→0
Q(p + ∆p) −Q(p)
Q(p)
100
∆p
p
100

percentage change in quantity demanded
percentage change in price
(4)
We observe that if the percentage change in price is one then
E
p
Q
≈ percentage change in demand due to 1% increase in price. (5)
Remark 3. a) The price elasticity of demand is usually negative because the
demand decreases when the price increases.
b) If E
p
Q
< −1 the demand is said to be elastic with respect to price.
In this case the percentage decrease in demand is greater than the percentage
increase in price that caused it.
c) If −1 < E
p
Q
the demand is said to be inelastic with respect to price.
In this case the percentage decrease in demand is less then the percentage increase
in price that caused it.
d) If E
p
Q
= −1 the demand is said to be of unit elasticity with respect to price.
Theorem 3. (Elasticity and the total revenue)
Let R, R(p) = pQ(p) be the total revenue function.
a) If E
p
Q
< −1 then R is a decreasing function.
In this case, when the price is raised the total revenue decreases.
b) E
p
Q
> −1 then R is an increasing function.
In this case, when the price is raised the total revenue increases.
154
Proof. By the product rule of differentiation we have
R

(p) = (pQ(p))

= pQ

(p) +Q(p) = Q(p)
_
pQ

(p)
Q(p)
+ 1
_
= Q(p)(E
p
Q
+ 1)
For the part a) we have E
p
Q
< −1 and so E
p
Q
+1 < 0. Since R

(p) = Q(p)(E
p
Q
+1)
and E
p
Q
+ 1 < 0 we obtain that R

(p) < 0. So, R is a decreasing function.
For the part b) we have E
p
Q
> −1 and so E
p
Q
+1 > 0. Since R

(p) = Q(p)(E
p
Q
+1)
and E
p
Q
+1 > 0 we obtain that R

(p) > 0. So, R is an increasing function in this case.
Example 2. Suppose the relationship between the unit price p in dollars and the
quantity demanded, x, is given by the equation
p = −0, 02x + 400 (0 ≤ x ≤ 20000).
Compute the price elasticity of demand and interpret the results.
Solution. Solving the given demand equation for x in terms of p we find
x = Q(p) = −50p + 20000
from which we see that Q

(p) = −50. Therefore
E
p
Q
=
pQ

(p)
Q(p)
=
−50p
−50(p −400)
=
p
p −400
(0 ≤ p < 400).
Next, we will solve the equation
E
p
Q
= −1,
that is
p
p −400
= −1
giving p = 200.
We also see that E
p
Q
< −1 when p > 200 (elastic demand) and E
p
Q
> −1 when
p < 200 (inelastic demand).
So, when the unit price is between 0 and 200, an increase in the unit price will
increase the revenue; when the unit price is between 200 and 400, an increase in the
unit price will cause a decrease in revenue.
In consequence the revenue is maximized when the unit price is set at 200.
Example 3. Let Q be the demand function defined by
Q(p) = 10
_
50 −p
p
; 0 < p ≤ 50.
a) Determine the elasticity of demand when the price is p = 10. If the price
increases by 6% determine the approximate change in demand.
b) Determine where the demand is elastic, inelastic and of unitary elasticity with
respect to price.
155
c) Determine the price function as a function of demand.
d) Find the maximum of the total revenue function.
Solution. a)
E
p
Q
=
pQ

(p)
Q(p)
=
p
_
50 −p
p
_

2
_
50 −p
p
_
50 −p
p
=
p
−p −(50 −p)
p
2
2
50 −p
p
=
25
p −50
,
then
E
10
Q
= −
5
8
On the other hand, from (4) we know that
E
p
Q

percentage change in demand
percentage change in price
.
If we take the percentage change in price to be 6 then
E
10
Q
= −
5
8

percentage change in demand
6
,
where from the percentage change in demand is approximately −
15
4
. This means that
the demand decreases with
15
4
%.
b) First we solve the equation
25
p −50
= −1. This gives us the solution p = 25.
For determining the elasticity intervals we have to solve the inequations
E
p
Q
< −1 and E
p
Q
> −1.
We easily obtain that E
p
Q
< −1 when p ∈ (25, 50) (elastic demand) and E
p
Q
< −1
when p ∈ (0, 25) (inelastic demand).
c) In order to determine the price function as a function of demand we solve the
equation
Q(p) = 10
_
50 −p
p
for p.
Q
2
= 100
50 −p
p
.
156
Thus Q
2
p = 5000 −100p and
p = p(Q) =
5000
Q
2
+ 100
.
d) The critical points of the total revenue function are given by the following
equation
R

(Q) = 0,
where
R(Q) = Qp(Q) =
5000Q
Q
2
+ 100
.
R

(Q) = 5000
100 −Q
2
(Q
2
+ 100)
2
= 0 implies that Q = 10.
By using the first derivative test for determining the extreme values we get that
Q = 10 is a maximum point for the total revenue function.
3.2 Integral calculus of one variable
3.2.1 Antiderivatives and techniques of integration
In the previous chapter we were concerned only with the basic problem: given a
function f find its derivative f

. In this chapter we are interested in precisely the
opposite process, that is, given a function f, find a function whose derivative is f.
This process is called antidifferentiation. Antidifferentiation and differentiation are
inverse operations in the sense that one undoes what the other does.
Definition 1. A function F is an antiderivative of the function f if F

= f.
Example 1. An antiderivative of f : R →R, f(x) = 2x is F : R →R, F(x) = x
2
since F

(x) = 2x = f(x).
There is always more than one antiderivative of a function. For instance, in the
previous example, F
1
: R → R, F
1
(x) = x
2
− 1 and F
2
: R → R, F
2
(x) = x
2
+ 10
are also antiderivatives of f. If F is an antiderivative of a function f then so is G,
G(x) = F(x) +c, for any constant c.
Theorem 1. Let f : I → R, I ⊆ R an interval and let F : I → R be an
antiderivative of f. Then any other antiderivative G of f must be of the form G(x) =
F(x) +c, where c is a constant.
The proof of the previous result is based on Theorem 5, subsection 3.1.4.
The indefinite integral of a function f represents the entire family of antiderivatives
of the given function. We will use the following notation for the indefinite integral
_
f(x)dx.
The indefinite integral is a family of functions. The function f is called the inte-
grand.
157
If F is an antiderivative of a given function f (defined on an open interval) then
the indefinite integral of f will be
_
f(x)dx = F(x) +(.
Extensive techniques for the calculation of antiderivatives have been developed.
We will discuss now some basic techniques of integration.
Integration by substitution
This technique is based on the chain rule of differentiation. We have to mention
first that the integration by substitution does not always work and there is no simple
routine that could help us to find a suitable substitution even in the cases where the
method works.
Theorem 2. If F is an antiderivative of f, then
_
f(g(x))g

(x)dx = F(g(x)) +( (1)
Proof. By the chain rule,
(F(g(x)) +c)

= F

(g(x))g

(x) = f(g(x))g

(x).
Hence, from the definition of an antiderivative we have that
_
f(g(x))g

(x)dx = F(g(x)) +(.
Example 2. Evaluate
_
(x
2
+ 3)
4
2xdx.
Solution. Let g(x) = x
2
+ 3. Then g

(x) = 2x. If we define the function f by
f(u) = u
4
then the integrand of the indefinite integral we are considering has the
form
(x
2
+ 3)
4
2x = [g(x)]
4
g

(x) = f(g(x))g

(x).
From the equality
_
f(g(x))g

(x)dx = F(g(x)) +(,
we conclude that the required antiderivative can be found if we know the antiderivative
of the function f. But if f(u) = u
4
, then F(u) =
1
5
u
5
. Thus
_
(x
2
+ 3)
4
2xdx =
_
f(g(x))g

(x)dx
= F(g(x)) +( =
1
5
(x
2
+ 3)
5
+(.
158
On a practical level it is helpful to rewrite the integral in a more recognizable form
by using the substitutions u = g(x) and du = g

(x)dx. Then the rules of integration
are used to complete the solution of the problem. This formal procedure is justified
since it leads to the correct solution of the problem.
If we write u = g(x) and du = g

(x)dx, then the integral
_
f(g(x))g

(x)dx
which is to be evaluated becomes
_
f(u)du which is equal to F(u) +( since F is an
antiderivative of f.
So we have
_
f(u)du = F(u) +( which is the same with (1), as mentioned before.
Example 3. Rework the previous example using the relationships
u = g(x) and du = g

(x)dx.
We want to evaluate the indefinite integral
I =
_
(x
2
+ 3)
4
(2x)dx.
Let u = x
2
+ 3 so that du = 2xdx.
Making this substitution into the expression for I we get
I =
_
u
4
du =
1
5
u
5
+( =
1
5
(x
2
+ 3)
5
+(
which agrees with the results of previous example.
Example 4. Evaluate
_
1
xln x
dx.
Solution. Note first that the derivative of the function lnx is equal to
1
x
, so it is
convenient to make the substitution u = ln x. Then du =
1
x
dx and
_
1
xln x
dx =
_
1
u
du = ln [u[ +( = ln [ ln x[ +(.
Example 5. Evaluate
_
sin
3
xcos
3
xdx.
Solution. Since the derivative of the function sinx is equal to cos x, it is convenient
to make the substitution u = sin x. Then du = cos xdx, and
_
sin
3
xcos
3
xdx =
_
sin
3
xcos
2
xcos xdx
=
_
u
3
(1 −u
2
)du =
_
(u
3
−u
5
)du =
u
4
4

u
6
6
+( =
sin
4
x
4

sin
6
x
6
+(.
159
Alternatively, note that the derivative of the function cos x is equal to −sin x, so
it is convenient to make the substitution v = cos x. Then dv = −sin xdx, and
_
sin
3
xcos
3
xdx =
_
(−sin
2
x) cos
3
x(−sin x)dx
=
_
[−(1 −v
2
)]v
3
dv =
_
(v
5
−v
3
)dv =
v
6
6

v
4
4
+( =
cos
6
x
6

cos
4
x
4
+(
It can be checked that
sin
4
x
4

sin
6
x
6
=
cos
6
x
6

cos
4
x
4
+
1
12
so both of the previous results are true.
Example 6. Evaluate
_
x

x + 1dx.
Solution. If we make the substitution u =

x + 1, then x = u
2
−1 and dx = 2udu.
_
x

x + 1dx =
_
(u
2
−1)u 2udu = 2
_
u
4
du −2
_
u
2
du
=
2
5
u
5

2
3
u
3
+( =
2
5
(x + 1)
5
2

2
3
(x + 1)
3
2
+(.
Note that in this example the variable x is written as a function of the new variable
u. The substitution x = g(u) has to be invertible (u = g
−1
(x)) to enable us to return
from the new variable u to the original variable x at the end of the process.
Integration by parts
Recall the product rule for differentiation, that is
(fg)

(x) = f

(x)g(x) +f(x)g

(x).
Integrating with respect to variable x, we obtain
_
(fg)

(x)dx =
_
f

(x)g(x)dx +
_
f(x)g

(x)dx.
Since fg is an antiderivative of (fg)

the previous equality can be rewritten as
_
f(x)g

(x)dx = f(x)g(x) −
_
f

(x)g(x)dx (2)
The relationship (2) is called the formula for integration by parts for indefinite
integrals. It is very useful if the indefinite integral
_
f

(x)g(x)dx is much easier to
calculate than the indefinite integral
_
f(x)g

(x)dx.
160
Example. Compute
_
xe
x
dx.
Solution. Writing f(x) = x and g

(x) = e
x
, we have f

(x) = 1 and g(x) = e
x
. It
follows that
_
xe
x
dx = xe
x

_
e
x
dx = xe
x
−e
x
+(
Example 8. Evaluate
_
ln xdx.
Solution. Writing f(x) = ln x and g

(x) = 1, we have f

(x) =
1
x
and g(x) = x so
_
ln xdx = xln x −
_
x
1
x
dx = xln x −x +(.
Example 9. Evaluate
_
e
x
sin xdx.
Solution.
_
e
x
sin xdx =
_
(e
x
)

sin xdx
= e
x
sin x −
_
e
x
cos xdx (3)
We now need to study the indefinite integral
_
e
x
cos xdx =
_
(e
x
)

cos xdx
= e
x
cos x −
_
e
x
(−sin x)dx = e
x
cos x +
_
e
x
sin xdx (4)
It looks like we are back to the same problem.
However, if we combine (3) and (4) then we obtain
_
e
x
sin xdx = e
x
sin x −e
x
cos x −
_
e
x
sin xdx
so that
_
e
x
sin xdx =
1
2
e
x
(sin x −cos x) +(.
Completing squares
In this section, we shall consider thechniques to solve integrals involving square
roots of the form

a
2
x +bx +c, where a ,= 0. Our task is to show that such integrals
can be reduced to integrals discussed before.
161
Note that
ax
2
+bx +c = a
_
x
2
+
b
a
x +
c
a
_
= a
_
x
2
+
b
a
x +
_
b
2a
_
2
_
+c −
b
2
4a
= a
_
x +
b
2a
_
2

b
2
−4ac
4a
We will use now the following substitution
u = x +
b
2a
and du = dx.
Example 10. Evaluate the integral
_
1

3 −2x −x
2
dx.
Solution. We have
3 −2x −x
2
= −(x
2
+ 2x −3) = −(x
2
+ 2x + 1) + 4 = 4 −(x + 1)
2
.
We use the substitutions u = x + 1 and du = dx
_
1

3 −2x −x
2
dx =
_
1

4 −u
2
dx =
1
2
arcsin
u
2
+(
=
1
2
arcsin
u
2
+( =
1
2
arcsin
x + 1
2
+(
Partial fractions
In this section we shall consider indefinite integrals of the form
_
p(x)
q(x)
dx where
p and q are polynomials in x.
If the degree of p is not smaller than the degree of q, then we can always find
polynomials c and r such that
p(x)
q(x)
= c(x) +
r(x)
q(x)
where r ≡ 0 or r has a smaller degree than the degree of q.
We can therefore restrict our attention to the case when the polynomial p is of
lower degree than q.
The first step is to factorize the polynomial q into a product of irreducible factors.
It is a fundamental result in algebra that a polynomial with real coefficients can be
factorized into a product of irreducible linear factors and quadratic factors with real
coefficients.
Suppose that a linear factor (ax+b) occurs n times in the factorization of q. Then
we write down a decomposition:
A
1
ax +b
+
A
2
(ax +b)
2
+ +
An
(ax +b)
n
162
where the constants A
1
, A
2
, . . . , A
n
will be determinated later.
Suppose that a quadratic factor (ax
2
+bx +c) occurs n times in the factorization
of q. Then we write down a decomposition
A
1
x +B
1
ax
2
+bx +c
+
A
2
x +B
2
(ax
2
+bx +c)
2
+ +
A
n
x +B
n
(ax
2
+bx +c)
n
where the constants A
1
, . . . , A
n
and B
1
, . . . , B
n
will be determined later.
We proceed to add all the decompositions and equate their sum to
p(x)
q(x)
and then
calculate all the constants by equating the coefficients.
Example 11. Consider the indefinite integral
_
x
2
+x −3
x
3
−2x
2
−x + 2
dx.
Solution. We factorize first the denominator of the integrand
x
3
−2x
2
−x + 2 = x
3
−x −2(x
2
−1) = x(x
2
−1) −2(x
2
−1)
= (x −2)(x
2
−1) = (x −2)(x −1)(x + 1)
So we consider partial fractions of the form
x
2
+x −3
x
3
−2x
2
−x + 2
=
a
x −2
+
b
x −1
+
c
x + 1
=
a(x −1)(x + 1) +b(x −2)(x + 1) +c(x −2)(x −1)
(x −2)(x −1)(x + 1)
It follows that
x
2
+x −3 = a(x
2
−1) +b(x
2
−x −2) +c(x
2
−3x + 2)
= x
2
(a +b +c) +x(−b −3c) −a −2b + 2c
We equate coefficients and solve for a, b, c.
_
_
_
a +b +c = 1
−b −3c = 1
−a −2b + 2c = −3
where from we get a = 1, b =
1
2
, c = −
1
2
.
Hence
_
x
2
+x −3
x
3
−2x
2
−x + 2
dx =
_
1
x −2
dx −
1
2
_
1
x + 1
dx +
1
2
_
1
x −1
dx
= ln [x −2[ −
1
2
ln [x + 1[ +
1
2
ln [x −1[ +(
163
3.2.2 The definite integral
In order to define the concept of a definite integral we will define first the Riemann
sums (which are named after the famous german mathematician, Georg Friedrich
Bernhard Riemann (1826-1866)).
This is a 5 steps process.
1) Let f be defined on a closed interval [a, b].
2) Partition the interval [a, b] into n subintervals [x
k−1
, x
k
] of length x
k
− x
k−1
.
Let P denote the partition
a = x
0
< x
1
< < x
n−1
< x
n
= b.
3) Let |P| be the length of the longest subinterval. The number |P| is called the
norm of the partition P.
4) Choose a number x

k
∈ (x
k−1
, x
k
) in each subinterval k = 1, n.
5) Form the sum
n

k=1
f(x

k
)(x
k
−x
k−1
). (1)
Sums as (1) for the various partitions of [a, b] are known as Riemann sums.
Definition 1. Let f be a function defined on the closed interval [a, b]. Then the
definite integral of f from a to b, denoted
_
b
a
f(x)dx, is defined to be:
_
b
a
f(x)dx = lim
P→0
n

k=1
f(x

k
)(x
k
−x
k−1
) (2)
provided that the previous limit exists and has a finite value.
If the limit in (2) exists and is finite, the function f is said to be integrable on
[a, b].
The numbers a and b in the previous definition are called the lower and upper
limits of integration, respectively. The integral symbol
_
, first used by Leibniz, is an
elongated S for the word ”sum”.
We have the following important result, which gives us an important class of
integrable functions.
Theorem 1. If f is continuous on [a, b] then f is integrable on the integrable on
the interval.
The precise characterization of the integrable functions is given by the following
theorem.
Theorem 2. Let f : [a, b] →R.
The function f is integrable on [a, b] if and only if f is bounded on [a, b] and f is
continuous almost everywhere on [a, b].
In consequence any integrable function is a bounded one.
The next theorem gives some of the basic properties of the definite integral.
Theorem 3. Let f and g be integrable functions on [a, b]. Then we have:
164
a)
_
b
a
kf(x)dx = k
_
b
a
f(x)dx, where k is any constant;
b)
_
b
a
[f(x) ±g(x)]dx =
_
b
a
f(x)dx ±
_
b
a
g(x)dx;
c)
_
b
a
f(x)dx =
_
c
a
f(x)dx +
_
b
c
f(x)dx,
where c is any number in [a, b];
d)
_
b
a
f(x)dx = −
_
a
b
f(x)dx;
e)
_
a
a
f(x)dx = 0.
The most helpful result in computing definite integrals is the following:
Theorem 4. (Leibniz-Newton’s theorem) Let f : [a, b] → R. Suppose that f is
integrable on [a, b] and that there exists an antiderivative F of f. Then
_
b
a
f(x)dx = F(b) −F(a). (3)
The difference F(b) −F(a) is usually written F(x)
¸
¸
¸
b
a
.
Example 1. Evaluate
_
2
−2
(3x
2
−x + 1)dx.
Solution.
_
2
−2
(3x
2
−x + 1)dx = 3
_
2
−2
x
2
dx −
_
2
−2
xdx +
_
2
−2
dx
= 3
x
3
3
¸
¸
¸
2
−2

x
2
2
¸
¸
¸
2
−2
dx +x
¸
¸
¸
2
−2
= 2
3
−(−2)
3

1
2
[2
2
−(−2)
2
] + [2 −(−2)] = 20.
The Leibniz-Newton theorem allows us to use all the techniques of integration
presented in subsection 3.2.1.
Example 2. Evaluate
_
3
0
x

x + 1dx.
We will use the technique of integration by substitution.
Remark 1. If we don’t use the Theorem 4 in evaluating such an integral and
we make the substitution in the definite integral we have to change the limits of
integration to correspond to the values of u for x = a and x = b.
Solution. To calculate the previous integral we can use the substitution
u =

x + 1,
where from we have x = u
2
−1 and dx = 2udu.
165
Note that if x = 0, then u = 1 and if x = 3, then u = 2.
It follows that
_
3
0
x

x + 1dx =
_
2
1
(u
2
−1)u2udu
=
_
2
1
(2u
4
−2u
2
)du =
2
5
u
5
¸
¸
¸
2
1

2
3
u
3
¸
¸
¸
2
1
=
2
5
(32 −1) −
2
3
(8 −1) =
62
5

14
3
=
116
15
Example 3. Evaluate the integral
_
π/2
0
xcos xdx.
We will use the method of integration by parts in order to compute the previous
integral.
Remark 2. For definite integrals over an interval [a, b] we have the following
formula for integrating by parts
_
b
a
f

(x)g(x)dx = f(x)g(x)
¸
¸
¸
b
a

_
b
a
f(x)g

(x)dx (4)
Solution.
_ π
2
0
xcos xdx =
_ π
2
0
x(sin x)

dx
= xsin x
¸
¸
¸
π
2
0

_ π
2
0
sin xdx =
π
2
+ cos x
¸
¸
¸
π
2
0
=
π
2
−1.
One of the most important applications of the definite integral is the calculation
of areas bounded by arbitrary curves.
Theorem 5. Let f be a continuous nonnegative function with the domain con-
taining the interval [a, b]. Then the area of the region bounded above by the graph of
f, below by the x-axis, and on the left and right by the vertical lines x = a and x = b,
respectively, is given by the definite integral
_
b
a
f(x)dx.
Example 4. Find the area of the region situated under the curve y = x
2
+1 from
x = −1 to x = 2.
166
`
¸
x = −1
x = 2
y
x
−1 2
y
=
f
(
x
)
=
x
2
+
1
O
Solution. The region under consideration is shown in this figure. Using Theorem
5 we have
_
2
−1
(x
2
+ 1)dx =
x
3
3
¸
¸
¸
2
−1
+x
¸
¸
¸
2
−1
=
1
3
[2
3
−(−1)
3
] + 2 −(−1) = 6.
3.3 Improper integrals
3.3.1 Improper integrals
In defining a definite integral (or a Riemann integral)
_
b
a
f(x)dx it was understood
that:
1

the limits of integration were finite numbers
2

the function f was bounded on the interval [a, b].
Now we will extend the concept of a definite (proper) integral to the case where
length of the interval is infinite and also to the case when f is unbounded.
The resulting integral is said to be an improper integral.
So, in conclusion, ”improper” means that some part of
_
b
a
f(x)dx becomes infinite.
It might be a or b or the function f.
First we will consider integrals of functions that are defined on unbounded inter-
vals.
To motivate the definition of an improper integral of a function f over an infinite
interval, consider the problem of finding the area of the region under the curve y =
f(x) =
1
x
2
, above the x-axis, and to the right of the line x = 1 (as shown in figure
below).
167
`
¸
t 1
y =
1
x
2
,
The area that lies to the left of the line x = t (shaded in figure below) is
A(t) =
_
t
1
1
x
2
dx = −
1
x
¸
¸
¸
t
1
= 1 −
1
t
.
Note that A(t) < 1 no matter how large t is chosen.
We also observe that
lim
t→∞
A(t) = lim
t→∞
_
1 −
1
t
_
= 1.
The area of the shaded region approaches 1 at t → ∞, so we can say that the area
of the infinite region is equal to 1 and we write
_

1
1
x
2
dx = lim
t→∞
_
t
1
1
x
2
dx = lim
t→∞
_
1 −
1
t
_
= 1.
Using this example we define the integral of f over an infinite interval as the limit
of integrals over finite intervals.
Definition 1. (Improper integrals on unbounded intervals)
1) Let f : [a, ∞) →R. If
_
t
a
f(x)dx exists for each t ≥ 0 then
_

a
f(x)dx = lim
t→∞
_
t
a
f(x)dx.
The integral
_

a
f(x)dx is called an improper integral on an unbounded interval
on the right. This integral is said to be convergent if the limit exists and has a finite
value and it is said to be divergent if the limit does not exist or it has an infinite
value.
2) Let f : (−∞, b] →R. If
_
b
t
f(x)dx exists for each t ≤ b, then
_
b
−∞
f(x)dx = lim
t→−∞
_
b
t
f(x)dx.
168
The previous integral
_
b
−∞
f(x)dx is called an improper integral on an unbounded
interval on the left. The definition of convergence or divergence is similar with the
previous case.
3) Let f : R →R and a ∈ R.
_
+∞
−∞
f(x)dx =
_
a
−∞
f(x) +
_
+∞
a
f(x)dx.
The improper integral
_
+∞
−∞
f(x)dx is said to be convergent if both of
_
a
−∞
f(x)dx
and
_
+∞
a
f(x)dx are convergent.
The previous improper integral is divergent if at least one of the improper integrals
_
a
−∞
f(x)dx,
_

a
f(x)dx is divergent.
This type of improper integrals are easy to identify. It is sufficient to look at
the limits of integration. If either the lower limit of integration, the upper limit of
integration or both of them are not finite, it will be an improper integral on an
unbounded interval.
Example 1. Evaluate
_

0
e
−x
dx.
Solution.
_

0
e
−x
dx = lim
t→∞
_
t
0
e
−x
dx = lim
t→∞
_
−e
−x
¸
¸
¸
t
0
_
= lim
t→∞
(−e
−t
+ 1) = 1
We can abbreviate this calculation by writing (instead of writing the limit):
_

0
e
−x
dx = −e
−x
¸
¸
¸

0
= −0 + 1.
Example 2. Evaluate
_
0
−∞
xe
x
dx.
Solution. By using the definition of an improper integral we have
_
0
−∞
xe
x
dx = lim
t→−∞
_
0
t
xe
x
dx.
We integrate by parts with f(x) = x and g

(x) = e
x
, so that f

(x) = 1 and
g(x) = e
x
.
_
0
t
xe
x
dx = xe
x
¸
¸
¸
0
t

_
0
t
e
x
dx
= −te
t
−e
x
¸
¸
¸
0
t
= −te
t
−1 +e
t
.
169
We know that lim
t→−∞
e
t
= 0 and by using the l’Hospital’s rule (theorem 2, subsec-
tion 3.1.2) we get
lim
t→−∞
te
t
= lim
t→−∞
t
e
−t
= lim
y→∞
−y
e
y
[


]
= lim
y→∞
(−y)

(e
y
)

= lim
y→∞
−1
e
y
= 0
Another way of determining the previous limit is by using the fact the exponential
function goes faster to infinity as any polynomial. So,
lim
y→∞
P(y)
e
y
= 0
and in particular
lim
y→∞
y
e
y
= 0.
Therefore
_
0
−∞
xe
x
dx = lim
t→−∞
(−te
t
−1 +e
−t
) = −0 −1 + 0 = −1.
Example 3. For what values of α is the integral
_

1
1
x
α
dx convergent?
Solution. For α = 1 we have
_

1
1
x
dx = lim
t→∞
_
t
1
1
x
dx = lim
t→∞
ln [x[
¸
¸
¸
t
1
= lim
t→∞
(ln t −ln 1) = ∞.
The limit is not finite and so the improper integral
_

1
1
x
dx is divergent.
For α ,= 1 we have
_

1
1
x
α
dx = lim
t→∞
_
t
1
x
−α
dx = lim
t→∞
x
−α+1
−α + 1
¸
¸
¸
t
1
=
1
1 −α
lim
t→∞
_
1
t
α−1
−1
_
If α > 1 then α −1 > 0 and
lim
t→∞
1
t
α−1
=
1

= 0.
Therefore
_

1
1
x
α
dx =
1
α −1
170
and the integral is convergent.
If α < 1 then 1 −α > 0,
lim
t→∞
1
t
α−1
= lim
t→∞
t
1−α
= ∞
and the integral is divergent.
We can summarize the previous results in the following remark (for future refer-
ence).
Remark 1. The improper integral
_

1
1
x
α
dx is convergent if α > 1 and divergent
if α ≤ 1.
Example 4. Evaluate
_
0
−∞
cos xdx if possible.
Solution.
_
0
−∞
cos xdx = lim
t→−∞
_
0
t
cos xdx
= lim
t→−∞
_
sin x
¸
¸
¸
0
t
_
= lim
t→−∞
(−sin t) = − lim
t→−∞
sin t.
Since lim
t→−∞
sin t does not exist (as in example 10, subsection 3.1.1).
We will analyse now the integrals of unbounded functions.
Definition 2. (improper integrals of unbounded functions)
1) Let f : [a, b) → R be a continuous function on [a, b) with lim
xրb
f(x) = ∞ (or
−∞).
We define the improper integral of the unbounded function f as
_
b
a
f(x)dx = lim
tրb
_
t
a
f(x)dx.
This integral is said to be convergent if the limit exists and has a finite value and
it is said to be divergent if the limit does not exist or it has an infinite value. The
point b is called a critical point or a bad point.
2) Let f : (a, b] → R be a continuous function on (a, b] with lim
xցa
f(x) = ∞ (or
−∞). Then
_
b
a
f(x)dx = lim
tցa
_
b
t
f(x)dx.
The definition of convergence or divergence is similar with the previous case.
3) Let f : [a, c)∪(c, b] →R be a continuous function on [a, c)∪(c, b] with lim
xրc
f(x) =
∞ (−∞) or lim
xցc
f(x) = ∞ (−∞).
We define
_
b
a
f(x)dx =
_
c
a
f(x)dx +
_
b
c
f(x)dx.
171
The improper integral
_
b
a
f(x)dx is said to be convergent if both of
_
c
a
f(x)dx
and
_
b
c
f(x)dx are convergent.
The previous improper integral is divergent if at least one of the improper integrals
_
c
a
f(x)dx,
_
b
c
f(x)dx is divergent.
The integrals of unbounded functions are more difficult to identify. It is necessary
to look at the interval of integration and determine if the integrand is continuous or
not in that interval. Things to look are fractions for which the denominator becomes
zero in the interval of integration.
Example 5. Evaluate
_
3
0
1
x −2
dx if possible.
Solution. Observe that the line x = 2 is a vertical asymptote of the integrand.
We have to use part 3) of the Definition 2 with c = 2:
_
3
0
1
x −2
dx =
_
2
0
1
x −2
dx +
_
3
2
1
x −2
dx
where
_
2
0
1
x −2
dx = lim
tր2
_
t
0
1
x −2
dx = lim
tր2
ln [x −2[
¸
¸
¸
t
0
= lim
tր2
ln(2 −x)
¸
¸
¸
t
0
= lim
tր2
[ln(2 −t) −ln 2]
= lim
tր2
ln(2 −t) −ln 2 = −∞.
Thus
_
2
0
1
x −2
dx is divergent. This implies that
_
3
0
1
x −2
dx is divergent. We do
not need to evaluate
_
3
2
1
x −2
dx.
If we had not observed the asymptote x = 2 in the previous example and we
confused the integral with a proper integral, then we might have made the following
erroneous calculation.
_
3
0
1
x −2
dx = ln [x −2[
¸
¸
¸
3
0
= ln 1 −ln 2 = −ln 2.
This is wrong because the integral is improper and must be calculated in terms of
limits.
Example 6. Evaluate
_
4
0
dx

x
if possible.
Solution. Observe that lim
xց0
1

x
= +∞.
172
We must use the part 1) of the Definition 2 with a = 0.
_
4
0
dx

x
= lim
tց0
_
4
t
x

1
2
dx = lim
tց0
x

1
2
+1

1
2
+ 1
¸
¸
¸
4
t
= lim
tց0
2

x
¸
¸
¸
4
t
= lim
tց0
(4 −2

t) = 4.
Hence, the integral converges and
_
4
0
dx

x
= 4.
Example 7. Evaluate
_
e
0
ln xdx if possible.
Solution. Since lim
xց0
ln x = −∞, then the critical point is a = 0.
Using integration by parts we get
_
e
0
ln xdx = lim
tց0
_
e
t
xln xdx = lim
tց0
_
xln x
¸
¸
¸
e
t

_
e
t
x
1
x
dx
_
= lim
tց0
(xln x −x)
¸
¸
¸
e
t
= e ln e −e − lim
tց0
(t ln t −t)
= −lim
tց0
t ln t = −lim
tց0
ln t
1
t
[


]
= −lim
tց0
1
t

1
t
2
= lim
tց0
t = 0
In conclusion, the integral is convergent and
_
e
0
ln xdx = 0.
Example 8. For what values of α is the integral
_
b
a
1
(x −a)
α
convergent?
Solution. For α = 1 we have
_
b
a
1
x −a
dx = lim
tցa
_
b
t
1
x −a
dx = lim
tցa
ln [x −a[
¸
¸
¸
b
t
= lim
tցa
(ln(b −a) −ln(t −a)) = ∞.
For α ,= 1 we have
_
b
a
1
(x −a)
α
dx = lim
tցa
_
b
t
(x −a)
−α
dx = lim
tցa
(x −a)
−α+1
−α + 1
¸
¸
¸
b
t
= lim
tցa
1
1 −α
[(b −a)
1−α
−(t −a)
1−α
]
If α > 1 then α −1 > 0 and
lim
tցa
(t −a)
1−α
= lim
tցa
1
(t −a)
α−1
= ∞.
173
If α < 1 then 1 −α > 0 and
lim
tցa
(t −a)
1−α
= 0.
We can summarize the previous results in the following remark (for future refer-
ence).
Remark 2. a)
_
b
a
1
(x −a)
α
dx is convergent if α < 1 and divergent for α ≥ 1.
b)
_
b
a
1
(b −x)
α
dx is convergent if α < 1 and divergent for α ≥ 1.
The proof of the part b) in Remark 2 is similar with that presented in solving the
part a), so it will be omitted.
Sometimes an improper integral is too difficult to be evaluated. In these cases we
can compare the integrals with known integrals. The theorem below shows us how to
do this.
Theorem 1. (Comparison theorem)
Let f, g : [0, ∞) →R be two continuous functions with f(x) ≥ g(x) ≥ 0 for x ≥ a.
a) If
_

a
f(x)dx is convergent then
_

a
g(x)dx is convergent.
b) If
_

a
g(x)dx is divergent then
_

a
f(x)dx is divergent.
If we use the previous theorem and Remark 1 we obtain the following criterion for
convergence-divergence.
Theorem 2. (Criterion for convergence-divergence).
a) If there is α > 1 such that lim
x→∞
x
α
[f(x)[ = c < ∞ then the improper integral
_

a
f(x)dx is a convergent one.
b) If there is 0 < α ≤ 1 such that lim
x→∞
x
α
[f(x)[ = c > 0, then the improper integral
_

a
f(x)dx is a divergent one.
Similar results are valid for the improper integral
_
b
−∞
f(x)dx.
In what concerns the improper integrals of unbounded functions we have the
following results.
Theorem 3. (Comparison theorem)
Let f, g : [a, b) →R be two continuous functions such that
lim
xրb
f(x) = lim
xրb
g(x) = ∞
and
f(x) ≥ g(x) ≥ 0 for x ∈ [a, b).
174
a) If
_
b
a
f(x)dx is convergent then
_
b
a
g(x)dx is convergent.
b) If
_
b
a
g(x)dx is divergent then
_
b
a
f(x)dx is divergent.
Similar results are valid for the improper integral
_
b
a
f(x)dx
where a is a critical point.
If we use Theorem 3 and Remark 2 we obtain the following criterions for
convergence-divergence.
Theorem 4. (Criterion for convergence-divergence)
Let f, g : [a, b) →R be two continuous functions such that
lim
xրb
[f(x)[ = lim
xրb
[g(x)[ = ∞.
a) If there is α ∈ (0, 1) such that
lim
xրb
(b −x)
α
[f(x)[ = c < ∞,
then the improper integral
_
b
a
f(x)dx is a convergent one.
b) If there is α ≥ 1 such that
lim
xրb
(b −x)
α
[f(x)[ = c > 0,
then the improper integral
_
b
a
f(x)dx is a divergent one.
Theorem 5. (Criterion for convergence-divergence)
Let f, g : (a, b] →R be two continuous functions such that
lim
xցa
[f(x)[ = lim
xցa
[g(x)[ = ∞.
a) If there is α ∈ (0, 1) such that
lim
xցa
(x −a)
α
[f(x)[ = c < ∞,
then the improper integral
_
b
a
f(x)dx is a convergent one.
b) If there is α ≥ 1 such that
lim
xցa
(x −a)
α
[f(x)[ = c > 0,
then the improper integral is a divergent one.
175
Example 9. Show that
_

0
e
−x
2
dx is convergent.
Solution. We can’t evaluate the integral directly because we are not able to
compute the antiderivative of e
−x
2
. We write
_

0
e
−x
2
dx =
_
1
0
e
−x
2
dx +
_

1
e
−x
2
dx.
The first integral on the right hand side is just a proper integral. In the second
integral we use the fact that for x ≥ 1 we have x
2
≥ x, so e
−x
2
≤ e
−x
.
The integral of e
−x
is easy to evaluate:
_

1
e
−x
dx = lim
t→∞
_
t
1
e
−x
dx = lim
t→∞
(e
−1
−e
−t
) =
1
e
.
Thus, taking f(x) = e
−x
and g(x) = e
−x
2
in the Comparison theorem (Theorem
1), we see that
_

1
e
−x
2
dx is convergent and so will be
_

0
e
−x
2
dx.
3.3.2 Euler’s integrals
Euler’s integrals are special functions (defined by using improper integrals) that
are used in probabilities and in the computation of certain integrals.
Beta function
The integral
_
1
0
x
p−1
(1 −x)
q−1
dx is called Euler’s first integral.
This integral can be an improper integral of an unbounded function where the
potential critical points are 0 and 1.
If p < 1 then 0 is a critical point since
lim
xց0
x
p−1
(1 −x)
q−1
= ∞.
If q < 1 then 1 is a critical point since
lim
xր1
x
p−1
(1 −x)
q−1
= ∞.
If p ≥ 1 and q ≥ 1 then the Euler’s first integral is a definite (proper) integral.
In what concerns the convergence of the Euler’s first integral we have the following
result.
Theorem 1.
a) If p > 0 and q > 0 then the Euler’s first integral is convergent.
b) If p ≤ 0 or q ≤ 0 then the Euler’s first integral is divergent.
Proof. We split first the integral as
_
1
0
x
p−1
(1 −x)
q−1
dx =
_
1/2
0
x
p−1
(1 −x)
q−1
dx +
_
1
1/2
x
p−1
(1 −x)
q−1
dx
176
and we study the convergence of both improper integrals in the right-hand side of
previous equality.
We use Theorem 5 section 3.3.1 to study the convergence of the first improper
integrals mentioned before
lim
xց0
x
α
x
p−1
(1 −x)
q−1
= lim
xց0
x
α+p−1
=
_
_
_
∞, if α +p −1 < 0
1, if α +p −1 = 0
0, if α +p −1 > 0
The previous limit is finite if α +p −1 ≥ 0 and is positive if α +p −1 ≤ 0.
The improper integral
_
1/2
0
x
p−1
(1−x)
q−1
is convergent if there is α ∈ (0, 1) such
that the previous limit is finite. We are looking for α ∈ (0, 1) such that α+p −1 ≥ 0.
So, we need to have 1 −p ≤ α < 1 which is possible if p > 0. Therefore for p > 0
the improper integral
_
1/2
0
x
p−1
(1 −x)
q−1
dx is convergent.
The improper integral
_
1/2
0
x
p−1
(1−x)
q−1
is divergent if there is α ≥ 1 such that
the previous limit is positive.
We are looking for α ≥ 1 such that α +p −1 ≤ 0.
So, we need to have 1 ≤ α ≤ 1 −p which is possible if p ≤ 0. Therefore for p ≤ 0
the improper integral
_
1/2
0
x
p−1
(1 −x)
q−1
dx is divergent.
Similar arguments, based on Theorem 4, give us the following results: for q > 0
the improper integral
_
1
1/2
x
p−1
(1−x)
q−1
dx is convergent and for q ≤ 0 the improper
integral
_
1
1/2
x
p−1
(1 −x)
q−1
dx is divergent, as desired.
Since for p > 0 and q > 0 the first Euler’s integral is convergent we can define the
following function which is called Beta function:
B : (0, ∞) (0, ∞) →R
B(p, q) =
_
1
0
x
p−1
(1 −x)
q−1
dx
(1)
Theorem 2. (Properties of Beta function)
B1) B(p, 1) =
1
p
, B(1, 1) = 1, for each p > 0.
B2) B
_
1
2
,
1
2
_
= π.
B3) B(p, q) = B(q, p), for each p > 0 and q > 0.
B4) B(p, q) =
p −1
p +q −1
B(p −1, q), for each p > 1 and q > 0.
B5) B(p, q) =
q −1
p +q −1
B(p, q −1), for each p > 0 and q > 1.
B6) B(m, n) =
(m−1)!(n −1)!
(m+n −1)!
, for each m, n ∈ N

.
177
B7) B(p, 1 −p) =
π
sin pπ
, for each 0 < p < 1.
Proofs. (for statements from 1 to 6)
B1) B(p, 1) =
_
1
0
x
p−1
(1 −x)
1−1
dx =
_
1
0
x
p−1
dx =
x
p
p
¸
¸
¸
1
0
=
1
p
If we let p = 1 in the previous equality we get B(1, 1) = 1.
B2) B
_
1
2
,
1
2
_
=
_
1
0
x

1
2
(1 −x)

1
2
dx =
_
1
0
1

x −x
2
dx
=
_
1
0
1
¸
1
4

_
x −
1
2
_
2
dx = arcsin
x −
1
2
1
2
¸
¸
¸
1
0
= arcsin 1 −arcsin(−1) = 2 arcsin 1 = 2
π
2
= π.
B3) B(p, q) =
_
1
0
x
p−1
(1 −x)
q−1
dx
Let t = 1 −x so that x = 1 −t and dx = −dt. When x = 1, t = 0 and when x = 0,
t = 1. Making the indicated substitution we find:
B(p, q) =
_
0
1
(1 −t)
p−1
t
q−1
(−dt) = −
_
0
1
(1 −t)
p−1
t
q−1
dt
=
_
1
0
(1 −t)
p−1
t
q−1
dt = B(q, p)
B4) Let p > 1 and q > 0. By using the integration by parts we obtain:
B(p, q) =
_
1
0
x
p−1
_

(1 −x)
q
q
_

dx
= −
1
q
x
p−1
(1 −x)
q
¸
¸
¸
1
0
+
1
q
_
1
0
(p −1)x
p−2
(1 −x)
q
dx
=
p −1
q
_
1
0
x
p−2
(1 −x)
q−1
(1 −x)dx
=
p −1
q
_
1
0
x
p−2
(1 −x)
q−1
dx −
p −1
q
_
1
0
x
p−1
(1 −x)
q−1
dx
=
p −1
q
B(p −1, q) −
p −1
q
B(p, q).
From the previous equality we have:
B(p, q)
_
1 +
p −1
q
_
=
p −1
q
B(p −1, q)
where from we finally obtain
B(p, q) =
p −1
p +q −1
B(p −1, q)
178
as desired.
B5) Let q > 1 and p > 0. By using successively properties B3 and B4 we obtain:
B(p, q)
B3)
= B(q, p)
B4)
=
q −1
p +q −1
B(q −1, p)
B3)
=
q −1
p +q −1
B(p, q −1)
B6) The desired equality can be obtained by applying successively properties B4,
B5 and B1 as follows:
B(m, n)
B4)
=
m−1
m+n −1
B(m−1, n)
B4)
=
m−1
m+n −1

m−2
m+n −2
B(m−2, n)
B4)
= . . .
B4)
=
m−1
m+n −1

m−2
m+n −2
. . .
1
n + 1
B(1, n)
B5)
=
m−1
m+n −1
. . .
1
n + 1

n −1
1 +n −1
B(1, n −1)
B5)
=
m−1
m+n −1
. . .
1
n + 1

n −1
n

n −2
n −1
B(1, n −2)
B5)
= . . .
B5)
=
(m−1)!
(m+n −1) . . . (n + 1)

n −1
n

n −2
n −1
. . .
1
2
B(1, 1)
B1)
=
(m−1)!(n −1)!
(m+n −1)!
.
B7) The proof of this statement is beyond the scope of this text.
Example 1. By using the properties of Beta function compute the following
values:
a) B(11, 9); b) B
_
5
2
,
1
2
_
; c) B
_
7
4
,
1
4
_
.
Solution. a) B(11, 9)
B6)
=
(11 −1)!(9 −1)!
(11 + 9 −1)!
=
10!8!
19!
b)
B
_
5
2
,
1
2
_
B4)
=
5
2
−1
5
2
+
1
2
−1
B
_
5
2
−1,
1
2
_
=
3
4
B
_
3
2
,
1
2
_
B4)
=
3
4

3
2
−1
3
2
+
1
2
−1
B
_
3
2
−1,
1
2
_
=
3
8
B
_
1
2
,
1
2
_
B2)
=
3
8
π
179
c)
B
_
7
4
,
1
4
_
B4)
=
7
4
−1
7
4
+
1
4
−1
B
_
7
4
−1,
1
4
_
=
3
4
B
_
3
4
,
1
4
_
B3)
=
3
4
B
_
1
4
,
3
4
_
B7)
=
3
4

π
sin
π
4
=
3
4

π

2
2
=
3
2

2
π
Example 2. By using the properties of Beta function compute the following
integrals:
a)
_
1
0
x
10
(1 −x)
8
dx;
b)
_
1
0
x
_
x
1 −x
dx;
c)
_
1
0
4
¸
_
1 −x
x
_
3
dx.
Solution. a)
_
1
0
x
10
(1 −x)
8
dx =
_
1
0
x
11−1
(1 −x)
9−1
dx = B(11, 9) =
10!8!
9!
(see example 1, part a))
b)
_
1
0
x
_
x
1 −x
dx =
_
1
0
x
x
1
2
(1 −x)
1
2
dx
=
_
1
0
x
3
2
(1 −x)

1
2
dx =
_
1
0
x
5
2
−1
(1 −x)
1
2
−1
dx = B
_
5
2
,
1
2
_
=
3
8
π
(see example 2, part b))
c)
_
1
0
4
¸
_
1 −x
x
_
3
dx =
_
1
0
x

3
4
(1 −x)
3
4
dx
=
_
1
0
x
1
4
−1
(1 −x)
7
4
−1
dx = B
_
1
4
,
7
4
_
B3)
= B
_
7
4
,
1
4
_
=

2

2
(see example 2, part c)).
Example 3. By using the properties of Beta function compute the following
integrals:
a)
_ π
2
0
sin
4
xcos
2
xdx; b)
_

0
3

x
1 +x
2
dx.
Solution. a) We make the following change of variable
sin
2
x = t.
180
Since sin x =

t we get
x = arcsin

t
and hence
dx =
1

1 −t

1
2

t
dt.
We have to change the limits of integration. If x = 0 then t = 0 and if x =
π
2
then
t = 1.
_ π
2
0
sin
4
xcos
2
xdx =
_
1
0
t
2
(1 −t)
1

1 −t 2

t
dt
=
1
2
_
1
0
t
3
2
(1 −t)
1
2
dt =
1
2
_
1
0
t
5
2
−1
(1 −t)
3
2
−1
dt
=
1
2
B
_
5
2
,
3
2
_
B5)
=
1
2

3
2
−1
5
2
+
3
2
−1
B
_
5
2
,
3
2
−1
_
=
1
12
B
_
5
2
,
1
2
_
=
1
12

3
8
π =
π
32
(see example 1, part b)).
b) We make the following change of variable
x
2
=
t
1 −t
.
Since x =
_
t
1 −t
we get
dx =
1
2
_
t
1 −t
_
t
1 −t
_

dt
that is
dx =
1
2
_
1 −t
t

1
(1 −t)
2
dt
We have to change the limits of integration. If x = 0 then t = 0 and if x = ∞ then
t = 1.
_

0
3

x
1 +x
2
dx =
_
1
0
6
_
t
1 −t
1 +
t
1 −t

1
2
_
1 −t
t

1
(1 −t)
2
dt
=
1
2
_
1
0
6
¸
t
1 −t

(1 −t)
3
t
3

1
(1 −t)
6
dt
181
=
1
2
_
1
0
6
¸
1
t
2
(1 −t)
4
dt =
1
2
_
1
0
t

1
3
(1 −t)

2
3
dt
=
1
2
_
1
0
t
2
3
−1
(1 −t)
1
3
−1
dt =
1
2
B
_
2
3
,
1
3
_
B2)
=
1
2
B
_
1
3
,
2
3
_
B7)
=
1
2

π
sin
π
3
=
1
2

π

3
2
=
π

3
Gamma function
The integral
_

0
x
p−1
e
−x
dx is called Euler’s second integral.
This integral is an improper one on an unbounded interval. If p < 0 then 0 is a
critical point for this integral.
In what concerns the convergence of the Euler’s second integral we have the fol-
lowing result:
Theorem 3. a) If p > 0 then the Euler’s second integral is convergent.
b) If p ≤ 0 then the Euler’s second integral is divergent.
Proof. We split first the integral as
_

0
x
p−1
e
−x
dx =
_
1
0
x
p−1
e
−x
dx +
_

1
x
p−1
e
−x
dx
and we study the convergence of both improper integrals in the right-hand side of
previous equality.
Similar arguments (based on Theorem 5 section 3.3.1) to those used in the proof of
Theorem 1 give us the following results: for p > 0 the improper integral
_
1
0
x
p−1
e
−x
dx
is convergent and for p ≤ 0 the improper integral
_
1
0
x
p−1
e
−x
dx is divergent.
We use Theorem 2, section 3.3.1, to study the convergence of the improper integral
_

1
x
p−1
e
−x
dx.
We have
lim
x→∞
x
α
[x
p−1
e
−x
[ = lim
x→∞
x
α+p−1
e
x
= 0 < ∞
for each α, in particular for α > 1 (the previous limit is 0 since the exponential function
goes faster to infinity than any power function). Hence the considered integral is a
convergent one, as desired.
Since for p > 0 the second Euler’s integral is convergent we can define the following
function which is called Gamma function.
Γ : (0, ∞) →R
Γ(p) =
_

0
x
p−1
e
−x
dx
(2)
182
Gamma function is also known as generalized factorial function. We will present
next the basic properties of the gamma function and its relation with n!.
Theorem 4. (Properties of Gamma function)
Γ1) Γ(1) = 1.
Γ2) Γ(p) = (p −1)Γ(p −1), for each p > 1.
Γ3) Γ(n) = (n −1)!, for each n ∈ N

.
Γ4) B(p, q) =
Γ(p)Γ(q)
Γ(p +q)
, for each p > 0 and q > 0.
Γ5) Γ
_
1
2
_
=

π.
Proof. Γ1)
Γ(1) =
_

0
x
1−1
e
−x
dx =
_

0
e
−x
dx = −e
−x
¸
¸
¸

0
= −0 + 1 = 1
Γ2) Integration by parts gives us:
Γ(p) =
_

0
x
p−1
e
−x
dx = lim
t→∞
_
t
0
x
p−1
(−e
−x
)

dx
= lim
t→∞
_
−e
−x
x
p−1
¸
¸
¸
t
0
+ (p −1)
_
t
0
x
p−2
e
−x
dx
_
= − lim
t→∞
t
p−1
e
t
+ (p −1) lim
t→∞
_
t
0
x
p−2
e
−x
dx
= 0 + (p −1)Γ(p −1) = (p −1)Γ(p −1)
Γ3) The desired equality can be obtained by applying successively property Γ2)
and Γ1) as follows:
Γ(n)
Γ2)
= (n −1)Γ(n −1)
Γ2)
= (n −1)(n −2)Γ(n −2) = =
Γ2)
= (n −1)(n −2) . . . 1Γ(1)
Γ1)
= (n −1) . . . 1 = (n −1)!
Γ4) There will be no proof of this item.
Γ5) We take p = q =
1
2
in Euler relation Γ4) to obtain
B
_
1
2
,
1
2
_
=
_
Γ
_
1
2
__
2
Γ(1)
,
where from
_
Γ
_
1
2
__
2
= π.
Since Γ
_
1
2
_
> 0 from the last equality we get Γ
_
1
2
_
=

π. This completes the
proof.
183
Example 4. Compute the following integrals:
a)
_
1
0
_
x
5
−x
6
dx;
b)
_

0
x
2
e

x
5
dx;
c)
_

2
xe
2−x
dx;
d)
_

0
xe
2−x
dx.
Solution. a)
_
1
0
_
x
5
−x
6
dx =
_
1
0
_
x
5
(1 −x)dx =
_
1
0
x
5
2
(1 −x)
1
2
dx
=
_
1
0
x
7
2
−1
(1 −x)
3
2
−1
dx = B
_
7
2
,
3
2
_
=
Γ
_
7
2
_
Γ
_
3
2
_
Γ(5)
It remains for us to compute Γ(5), Γ
_
3
2
_
and Γ
_
7
2
_
.
Γ(5)
Γ3)
= (5 −1)! = 4! = 24
Γ
_
3
2
_
Γ2)
=
_
3
2
−1
_
Γ
_
3
2
−1
_
=
1
2
Γ
_
1
2
_
Γ5)
=
1
2

π
Γ
_
7
2
_
Γ2)
=
_
7
2
−1
_
Γ
_
7
2
−1
_
=
5
2
Γ
_
5
2
_
Γ2)
=
5
2
_
5
2
−1
_
Γ
_
5
2
−1
_
=
5
2

3
2
Γ
_
3
2
_
=
15
4


π
2
=
15

π
8
In consequence
_
1
0
_
x
5
−x
6
dx =
Γ
_
7
2
_
Γ
_
3
2
_
Γ(5)
=
15

π
8


π
2
24
=

128
b) In order to use Gamma function we make the following substitutions
x
5
= t; x = 5t; dx = 5dt
If x = 0 then t = 0 and if x = ∞ then t = ∞.
_

0
x
2
e

x
5
dx =
_

0
(5t)
2
e
−t
5dt = 125
_

0
t
2
e
−t
dt
184
= 125
_

0
t
3−1
e
−t
dt = 125Γ(3) = 125 2! = 250
c) In order to use Gamma function we make the following substitutions
t = x −2; x = t + 2; dx = dt
If x = 2 then t = 0 and if x = ∞ then t = ∞
_

2
xe
2−x
dx =
_

0
(t + 2)e
−t
dt =
_

0
te
−t
dt + 2
_

0
e
−t
dt
=
_

0
t
2−1
e
−t
dt + 2
_

0
t
1−1
e
−t
dt = Γ(2) + 2Γ(1) = (2 −1)! + 2 1 = 3
d)
_

0
xe
2−x
dx =
_

0
e
2
xe
−x
dx = e
2
_

0
xe
−x
dx
= e
2
_

0
x
2−1
e
−x
dx = e
2
Γ(2) = e
2
(2 −1)! = e
2
.
Euler-Poisson integral
The integral
_

0
e
−x
2
dx is called Euler-Poisson integral.
As we saw in example 9, subsection 3.3.1 the previous integral is convergent.
Next, by using the substitution
t = x
2
, x =

t, dx =
1
2

t
dt
we will evaluate the Euler-Poisson integral. We observe that by the previous change
of variable the limits of integration remain the same.
_

0
e
−x
2
dx =
1
2
_

0
1

t
e
−t
dt =
1
2
_

0
t

1
2
e
−t
dt
=
1
2
_

0
t
1
2
−1
e
−t
dt =
1
2
Γ
_
1
2
_
=
1
2

π
In conclusion
_

0
e
−x
2
dx =

π
2
Theorem 5. (Properties of Euler-Poisson integral)
a)
_

0
e
−x
2
dx =

π
2
;
b)
_

−∞
e
−x
2
dx =

π;
185
c)
_

−∞
e

x
2
2
dx =

2π.
Proof. a) The equality was already proved.
b)
_

−∞
e
−x
2
dx =
_
0
−∞
e
−x
2
dx +
_

0
e
−x
2
dx.
We compute separately the first integral of the right-side of equality by making
the following substitutions:
t = −x, x = −t and dx = −dt.
If x = −∞ then t = ∞ and if x = 0 then t = 0.
Hence
_
0
−∞
e
−x
2
dx = −
_
0

e
−t
2
dt =
_

0
e
−t
2
dt =
_

0
e
−x
2
dx =

π
2
Finally, we get
_

−∞
e
−x
2
dx =

π
2
+

π
2
=

π.
c) By making the following change of variable
x

2
= t; x = t

2; dx =

2dt
we obtain
_

−∞
e

x
2
2
dx =
_

−∞
e

(

2t)
2
2


2dt =

2
_

−∞
e
−t
2
dt =

2

π =

2π.
186
Chapter 4
Differential calculus of several
variables
4.1 Real functions of several variables.
Limits and continuity
4.1.1 Real functions of several variables
In many practical situations, the value of one quantity may depend on the values
on two or more others. For example, the output of a factory depends on the amount
of capital invested in the plant and on the size of the labor force. The demand for
butter may depend on the price of butter and on the price of margarine. Relationships
of this type can be represented mathematically by functions having more than one
variable.
We shall restrict first our attention to functions of two variables.
Definition 1. A real function f of two variables is a rule that assigns to each
ordered pair of real numbers (x, y) in a set A a unique real number denoted by
f(x, y).
f : A →R, A ∋ (x, y) → f(x, y) ∈ R.
The set A is the domain of f (usually is the largest set for which the rule of f
makes sense), the set R in which f takes its values is called the target space and its
range is the set of values that f takes on, that is, ¦f(x, y) [ (x, y) ∈ A¦.
We often write z = f(x, y) to make explicit the value taken on by f at the general
point (x, y). The variables x and y are independent variables and z is the dependent
variable.
A function of two variables is a function whose domain is a subset of R
2
and whose
range is a subset of R. One way of visualizing such a function is by using an arrow
diagram, where the domain is a subset of R
2
.
187
`
¸ ¸
f(x, y) 0 f(a, b)

(x, y)
(a, b)
y
x
x
If a function f is given by a formula and no domain is specified, then the domain
of f is the largest set for which the rule of f makes sense.
Example 1. For the following function find the domain and evaluate f(3, 2),
f(x, y) =

x +y + 1
x −1
.
Solution. The given expression is well-defined if the denominator is not 0 and the
quantity under the square root sign is nonnegative. In conclusion the domain of f is
A = ¦(x, y) [ x +y + 1 ≥ 0, x ,= 1¦.
The inequality x + y + 1 ≥ 0, or y ≥ −x −1, represents the points that lie on or
above the line y = −x − 1 while the condition x ,= 1 means that the points on the
line x = 1 must be excluded from the domain.
`
¸
x = 1
−1
−1
y
x
O
f(3, 2) =

3 + 2 + 1
3 −1
=

6
2
188
Example 2. In 1928 Charles Cobb and Paul Douglas published a study in which
they modeled the growth of the american economy during the period 1899-1922. They
considered a simplified view of the economy in which production output is determined
by the amount of labor involved and the amount of capital invested.
The function used to model production was
Q(L, K) = bL
α
K
1−α
(1)
where Q is the total production (the monetary value of all goods produced ina year),
L is the amount of labor (the total number of hours worked in a year) and K is the
amount of capital invested.
Cobb and Douglas used the method of least squares (see 4.7.1) to fit the data
published by the government to the function
Q(L, K) = 1, 01 L
0,75
K
0,25
.
The production function (1) has been used in many settings, from individual firms
to global economic functions. Its domain is ¦(L, K) [ L ≥ 0, K ≥ 0¦ because L and
K represent labor and capital and they are never negative.
Example 3. Suppose that at a certain factory, output is given by the Cobb-
Douglas production function
Q(K, L) = 60 K
1/3
L
2/3
,
where K is the capital investment measured in unit of 1000 Euros and L the size of
the labor force measured in worker-hours.
a) compute the output if the capital investment is 512.000 Euros and 1000 of
worker-hours are used.
b) show that the output in part (a) will be double if both the capital investment
and the size of the labor force are doubled.
Solution. a) Evaluate Q(K, L) with K = 512 and L = 1000.
Q(512, 1000) = 60 (512)
1/3
(1000)
2/3
= 60 8 100 = 48000 (units)
b) Q(2 512, 2 1000) = 60 (2 512)
1/3
(2 1000)
2/3
= 60 2
1/3
512
1/3
2
2/3
1000
2/3
= 60 2
1/3+2/3
512
1/3
1000
2/3
= 2Q(512, 1000) = 2 48000 = 96000.
Using a calculation similar to the one in part (b) of the previous example, it can
be shown that if both capital and labor are multiplied by some positive number m,
then output will also be multiplied by m. In economics, production functions with
this property are said to have constant return to scale.
Indeed,
Q(mK, mL) = b(mK)
α
(mL)
1−α
= m
α
m
1−α
b K
α
L
1−α
= mQ(K, L)
189
Another way of visualizing the behaviour of a function of two variables is to
consider its graph.
Definition 2. If f is a real function of two variables with domain A, then the
graph of f is the set:
G
f
= ¦(x, y, z) ∈ R
3
[ (x, y) ∈ A, z = f(x, y)¦ (2)
The graph of a function f of two variables is a surface S ⊂ R
3
with equation
z = f(x, y).
Example 4. Sketch the graph of the function
f(x, y) = 6 −3x −2y, A = R
2
.
Solution. The graph has the equation
z = 6 −3x −2y or 3x + 2y +z = 6
which represents a plane. We determine first the intercepts.
Putting y = z = 0 in the equation, we get x = 2 as the x-intercept. Similarly
y-intercept in 3 and z-intercept is 6.
`
¸
x
(0,0,6)
(0,3,0)
z
y
(2,0,0)
.
The graph of a function of two variables is, in general, difficult to be sketched,
and we shall not develop a systematic procedure for sketching the graphs of such
functions.
However, computer programs are available for graphing functions of two variables.
Fortunately, there is another way to visualize a function from R
2
to R - the study
of level curves of f.
190
Suppose f is a function of two variables x and y. If c is some value in the range
of the function f, then the equation f(x, y) = c describes a curve lying on the plane
z = c called the trace of the graph of f in the plane z = c.
If this trace is projected onto the xy-plane, the resulting curve in the xy-plane is
called a level curve.
Actually, for any constant c, the points (x, y) for which f(x, y) = c form a curve
in the xy plane that is said a level curve of f.
Example 5. If f(x, y) = x
2
− y, f : R
2
→ R, sketch the level curves f(x, y) = 4
and f(x, y) = 9.
Solution. The level curve f(x, y) = 9 consists of all points (x, y) in xy plane for
which
x
2
−y = 9 or y = x
2
−9.
The latter equality represent a quadratic function whose graph is sketched below.
`
¸
f
=
4
f
=
9
2 −2
3 −3
−4
−9
x
y
Definition 3. Let f : A ⊆ R
n
→ R, f : A → R is called a real function of n
variables if and only if for each x ∈ A there corresponds one and only one element
f(x) ∈ R
A ∋ x = (x
1
, x
2
, . . . , x
n
) → f(x) ∈ R.
The set A is called the domain of f, R is called the target space and the set
¦f(x) [ x ∈ A¦ is called the range of f or the image of f.
191
4.1.2 Limits. Continuity
Global limit
In order to make the things clear from the beginning we shall discuss first the case
n = 2.
Intuitively, f has the limit l at a given point (a, b) if the values f(x, y) are ap-
proaching l when (x, y) approaches (a, b). This limit can be written as
lim
(x,y)→(a,b)
f(x, y) = l or lim
x→
y→b
f(x, y).
In the last limit x → a and y → b in the same time and independently.
In the previous discussion we mention the word ”approach” and we want to mea-
sure somehow the ”notion of approaching” to a given point.
In one variable x → a means that we can approach a only from the left side
and from the right side. In two or more variables the situation is more complicated.
There is an infinite number of ways of approaching a point (a, b). For instance, we
can approach along vertical or horizontal lines; along every straight line which passes
through (a, b) or along every curve (as we can see in figure below).
`
¸
(a, b)
a
b
y
x
We will use the notion of distance between two points to measure how close is a
point another.
We can say that lim
(x,y)→(a,b)
f(x, y) = l if the distance between f(x, y) and l can be
made arbitrarily small by making the distance from (x, y) to (a, b) sufficiently small.
We are prepared now to present the rigorous definition of a limit of a function at
a given point.
Definition 1. (the global limit)
Let f : A →R, A ⊆ R
2
and (a, b) ∈ A

.
192
We say that the limit of f as (x, y) approaches (a, b) is l and we write
lim
(x,y)→(a,b)
f(x, y) = l
if and only if ∀ ε > 0, ∃ δ > 0 such that if (x, y) ∈ A and
0 <
_
(x −a)
2
+ (y −b)
2
< δ,
then [f(x, y) −l[ < ε.
Other notations for the limit in the Definition 1 are
f(x, y) → l as (x, y) → (a, b)
and
lim
x→a
y→b
f(x, y) = l
(here x → a, y → b independently and in the same time).
`
¸ ¸ ¸

(x, y)
(a, b)
x
y
f

l l − ε
δ
l + ε
From the previous discussion we obtain the following remark.
If we can find two different paths of approach along which f has different limits
then the global limit does not exist.
Remark 1. If f(x, y) → l
1
when (x, y) → (a, b) along a path C
1
and f(x, y) → l
2
as (x, y) → (a, b) along a path C
2
where l
1
,= l
2
then lim
(x,y)→(a,b)
f(x, y) does not exist.
Example 1. Show that lim
(x,y)→(0,0)
x
2
−3y
2
x
2
+y
2
does not exist.
Solution. First we will approach (0,0) along the real axis. Then y = 0 and
f(x, 0) =
x
2
x
2
= 1 for all x ,= 0, so f(x, y) → 1 as (x, y) → (0, 0) along x-axis.
We now approach along the y-axis by putting x = 0. Then
f(0, y) = −
3y
2
y
2
= −3 for all y ,= 0
193
so
f(x, y) → −3 as (x, y) → (0, 0)
along y-axis.
Since f has two different limits along two different paths, the global limit does not
exist.
Example 2. Study the existence of the following limit
lim
(x,y)→(0,0)
xy
x
2
+y
2
.
Solution. It is obvious that
A = R
2
¸ ¦(0, 0)¦, so (0, 0) ∈ A

.
If (x, y) → (0, 0) along the x axis then y = 0 and
lim
x→0
y=0
f(x, y) = lim
x→0
f(x, 0) = lim
x→0
x 0
x
2
+ 0
= 0.
In the same way
lim
x=0
y→0
f(x, y) = 0.
Even if we have obtained identical limits along the axes, that does not assure that
the given limit is 0. We can go to (0,0) along another line, let’s say y = mx (that is
the equation of a nonvertical line which passes through (0,0)).
lim
x→0
y=mx
f(x, y) = lim
x→0
y=mx
xy
x
2
+y
2
= lim
x→0
xmx
x
2
+m
2
x
2
= lim
x→0
x
2
m
x
2
(1 +m
2
)
=
m
1 +m
2
.
Since we have obtained different limits along different paths the global limit does
not exist.
Example 3. Does the limit lim
(x,y)→(0,0)
x
2
y
x
4
+y
2
exist?
Solution. If (x, y) → (0, 0) along any nonvertical line which passes through the
origin, y = mx then
lim
x→0
y=mx
f(x, y) = lim
x→0
x
2
mx
x
4
+m
2
x
2
= lim
x→0
xm
x
2
+m
2
= 0
So f(x, y) → 0 as (x, y) → (0, 0) along y = mx.
Even if f has the same limiting value along nonvertical lines, that does not show
that the given limit is 0.
194
Indeed, if we now let (x, y) → (0, 0) along the parabola y = mx
2
, we have
lim
x→0
y=mx
2
x
2
y
x
4
+y
2
= lim
x→0
x
2
mx
2
x
4
+m
2
x
4
= lim
x→0
mx
4
x
4
(1 +x
2
)
=
m
1 +m
2
Since different paths lead to different limiting values, the given limit does not
exist.
Example 4. Show that lim
(x,y)→(0,0)
x
3
y
x
6
+y
2
does not exist.
Solution. If we let y = mx or y = mx
2
we obtain that
lim
(x,y)→(0,0)
f(x, y) = 0.
The limit does not exist since on y = x
3
we have
lim
x→0
y=x
3
x
3
y
x
6
+y
2
= lim
x→0
x
6
x
6
+x
6
=
1
2
.
In what concerns the limits that do exist, their computation can be greatly sim-
plified by the use of properties of limits. As in the case of one variable we have that:
the limit of a sum is the sum of the limits, the limit of a product of the limits, etc
(see subsection 3.1.1). The squeeze theorem also holds (Theorem 2, subsection 3.1.1).
Example 5. Compute
lim
(x,y)→(0,0)
xy

xy + 4 −2
[
0
0
]
= lim
x→0
y→0
xy(

xy + 4 + 2)
xy + 4 −4
= lim
x→0
y→0
xy(

xy + 4 + 2)
xy
= lim
x→0
y→0
(
_
xy + 4 + 2) = 4.
Example 6. Compute lim
x→0
y→0
3x
2
y
x
2
+y
2
if it exists.
Solution. Since x
2
≤ x
2
+y
2
and −3[y[ ≤ 3y ≤ 3[y[ then
−3[y[ ≤
3x
2
y
x
2
+y
2
≤ 3[y[.
We know that 3[y[ → 0 as (x, y) → (0, 0) and by using the Squeeze Theorem we
obtain that
lim
x→0
y→0
3x
2
y
x
2
+y
2
= 0.
Another way of computing the considered limit is to use the following result.
195
Remark 2. If f is a bounded function and
lim
(x,y)→(a,b)
g(x, y) = 0,
then
lim
(x,y)→(a,b)
[f(x, y)g(x, y)] = 0.
In our example
f(x, y) =
x
2
x
2
+y
2
and g(x, y) = 3y.
It is obvious that
f(x, y) =
x
2
x
2
+y
2
≤ 1
and lim
x→0
y→0
3y = 0, so the product limit will be 0.
We can now present the definition of the limit in general case (n variable case).
Definition 2. Let f : A →R, A ⊆ R
n
and a ∈ A

.
We say that the limit of f as x approaches a is l and we write
lim
x→a
f(x) = l
iff ∀ ε > 0, ∃ δ > 0 such that x ∈ A and 0 < |x −a| < δ then [f(x) −l[ < ε (iff is an
abbreviation for if and only if).
Other notations for the limit in the previous definition are f(x) → l as x → a and
lim
x
1
→a
1
...
x
n
→a
n
f(x) = l (here x
1
→ a
1
, . . . , x
n
→ a
n
independently and in the same time).
The following theorem will permit us to compute limits.
Theorem 1. (basic limit theorem). Let f, g, h : A ⊆ R
n
→R and a ∈ A

. Suppose
lim
x→a
f(x) = l and lim
x→a
g(x) = m. Then:
a) lim
x→a
[f(x) +g(x)] = l +m
b) lim
x→a
[cf(x)] = cl, where c is a real constant
c) lim
x→a
f(x)g(x) = lm
d) lim
x→a
f(x)
g(x)
=
l
m
provided m ,= 0
e) Moreover if l = m, lim
x→a
f(x) = lim
x→a
g(x) and if f(x) ≤ h(x) ≤ g(x), for x near
a, then lim
x→a
h(x) = l. (The Squeeze theorem for functions of several variables).
In the previous theorem, part e), x near a means x in a ball centered at a.
Iterated limits for function of two variables
Definition 3. Let f : A →R, A ⊆ R
2
, (a, b) ∈ A

. The iterated limits l
12
and l
21
are defined as
l
12
= lim
x→a
_
lim
y→b
f(x, y)
_
196
and
l
21
= lim
y→b
_
lim
x→a
f(x, y)
_
provided that the previous limits exist.
In the case of iterated limits x → a and y → b independently but not in the same
time as you can see in the figure below.
`
¸
(a, y) (x, y)
(a, b) (x, b)
x a
b
y
l
12
l
21
·
¡
D
y
x
Remark 3. (The connection between the iterated limits and the global limit).
a) If there exist the iterated limits l
12
, l
21
and l
12
,= l
21
then the global limit does
not exist.
b) The existence and equality of the iterated limits does not assure that the global
limit exists.
Next, we will illustrate by 2 examples the statements of the previous results.
Example 7. Study the existence of the iterated limits and of the global limit in
the next cases:
a) f : R
2
→R
f(x, y) =
_
_
_
x
2
−3y
2
x
2
+y
2
, (x, y) ,= (0, 0)
0, (x, y) = (0, 0)
b) f : R
2
→R
f(x, y) =
_
xy
x
2
+y
2
, (x, y) ,= (0, 0)
0, (x, y) = (0, 0)
Solution. a) l
12
= lim
x→0
_
lim
y→0
x
2
−3y
2
x
2
+y
2
_
= lim
x→0
x
2
x
2
= lim
x→0
1 = 1
l
21
= lim
y→0
_
lim
x→0
x
2
−3y
2
x
2
+y
2
_
= lim
y→0
−3y
2
y
2
= lim
y→0
(−3) = −3
Hence l
12
,= l
21
.
197
The global limit does not exist (see example 1).
b) As we have already observed the global limit does not exist (see example 2).
l
12
= lim
x→0
_
lim
y→0
xy
x
2
+y
2
_
= lim
x→0
0
x
2
= lim
x→0
0 = 0
l
21
= lim
y→0
_
lim
x→0
xy
x
2
+y
2
_
= lim
y→0
0
y
2
= lim
y→0
0 = 0
So, in conclusion, even if l
12
= l
21
the global limit does not exist.
Directional limits
Definition 4. (the limit in the direction of a given vector)
Let f : A →R, A ⊆ R
n
, a ∈ A

and let h ∈ R
n
¸ ¦θ¦. The limit in the direction of
the vector h is defined by
l
h
= lim
t→0
f(a +th)
In the case n = 2 we can geometrically present the way of approaching a in this
type of limit.
`
¸
`

,
,
y
x
h
a +th
a
th
As you can see a + th goes to a on the straight line which passes through a and
has the same direction with the vector h.
Example 8. Compute l
h
in the case when
f : A →R, f(x, y) =
x
2
+y
2x +y
2
, a = (1, 0),
h = (h
1
, h
2
) ,= θ
Solution. A = ¦(x, y) ∈ R
2
[ y
2
,= −2x¦
The domain A of f is the set of points in R
2
which not lie on the parabola whose
equation is y
2
= 2x. Hence a = (1, 0) ∈ A

.
198
¸
`
(1, 0)
a +th = (1, 0) +t(h
1
, h
2
) = (1, 0) + (th
1
, th
2
) = (1 +th
1
, th
2
)
l
h
= lim
t→0
f(a +th) = lim
t→0
f(1 +th
1
, th
2
)
= lim
t→0
(1 +th
1
)
2
+th
2
2(1 +th
1
) +t
2
h
2
2
=
1
2
Remark 4. (The connection between the global limit and the directional limit)
1) If the global limit exists and is equal to l then any directional limit exists and
is equal to l.
2) If there are two different vectors h ,= h

(h, h

,= θ) such that l
h
1
,= l
h
2
, then
the global limit does not exist.
3) If any directional limit exists and is equal to the same real number then it is
not sure that the global limit exists.
Continuity
Definition 5. Let f : A → R, A ⊆ R
n
and a ∈ A. If a ∈ A

we say that f is
continuous at a if
lim
x→a
f(x) = f(a).
If a is an isolated point then f is continuous at a.
A function f is said to be continuous on the set A if f is continuous at any point
which belongs to A.
The intuitive meaning of continuity is that if the point x (in R
n
) changes by a
small amount, then the value of f(x) changes by a small amount.
This means that a surface that is the graph of a continuous function (of two
variables) has no holes or breaks.
Using the properties of limits, it can be easily seen that sums, differences, products,
quotients and compositions of continuous functions are continuous on their domain.
199
Theorem 2. (Basic continuity theorem). Let f, g : A ⊆ R
n
→ R and let a ∈ A.
Suppose f and g are continuous at a. Then f + g, fg and cf (c ∈ R, constant) are
continuous at a as is
f
g
provided g(a) ,= 0.
Theorem 3. (Continuity composition theorem) Let f : A →R, A ⊆ R
n
, h : B →
R such that f(A) ⊆ B ⊆ R and let a ∈ A. Suppose f is continuous at a and that h is
continuous at f(a). Then the function h ◦ f,
h ◦ f : A →R, (h ◦ f)(x) = h(f(x))
is continuous at a.
Example 9. Let f : R
2
→R defined by
f(x, y) =
_
_
_
3x
2
y
x
2
+y
2
, (x, y) ,= (0, 0)
0, (x, y) = (0, 0)
Study the continuity of f.
Solution. We know that f is continuous for (x, y) ,= (0, 0) since it is equal to a
rational function there.
Also from Example 6, we have
lim
(x,y)→(0,0)
f(x, y) = 0 = f(0, 0).
Therefore f is continuous at (0,0) and so it is continuous on R
2
.
Example 10. Study the continuity of the function f : R
2
→R,
f(x, y) =
_
_
_
1 −cos(x
3
+y
3
)
x
2
+y
2
, if (x, y) ,= (0, 0)
α, if (x, y) = (0, 0)
where α is a real parameter.
Solution. We know that f is continuous for (x, y) ,= (0, 0) since it is equal to a
composition of elementary functions there.
By using the formula
1 −cos 2z = 2 sin
2
z
and the fact that
lim
z→0
sin z
z
0
0
= lim
z→0
(sin z)

z

= lim
z→0
cos z = 1,
we have
lim
(x,y)→(0,0)
f(x, y) = lim
(x,y)→(0,0)
1 −cos(x
3
+y
3
)
x
2
+y
2
= lim
(x,y)→(0,0)
2 sin
2
_
x
3
+y
3
2
_
x
2
+y
2
= 2 lim
(x,y)→(0,0)
sin
2
_
x
3
+y
3
2
_
_
x
3
+y
3
2
_
2

_
x
3
+y
3
2
_
2
x
2
+y
2
200
=
1
2
_
¸
_ lim
(x,y)→(0,0)
sin
x
3
+y
3
2
x
3
+y
3
2
_
¸
_
2
lim
(x,y)→(0,0)
x
6
+y
6
+ 2x
3
y
3
x
2
+y
2
=
1
2
lim
(x,y)→(0,0)
_
(x
2
+y
2
)(x
4
−x
2
y
2
+y
4
)
x
2
+y
2
+
2x
3
y
3
x
2
+y
2
_
=
1
2
_
lim
(x,y)→(0,0)
(x
4
−x
2
y
2
+y
4
) + 2 lim
(x,y)→(0,0)
x
3
y
3
x
2
+y
2
_
=
1
2
lim
(x,y)→(0,0)
x
3
y
3
x
2
+y
2
=
1
2
lim
(x,y)→(0,0)
x
2
x
2
+y
2
xy
3
=
1
2
0 = 0.
The last limit is 0 by using Remark 2 since
0 ≤
x
2
x
2
+y
2
≤ 1 and lim
(x,y)→(0,0)
xy
3
= 0.
In conclusion we get that if α = 0 then f is continuous on R
2
and if α ,= 0 then f
is continuous on R
2
¸ ¦(0, 0)¦.
We end this section by presenting (without proof) a result concerning continuous
functions defined on compact sets. We need this result later.
Theorem 4. (Weierstrass theorem. Extreme value theorem). Suppose f : K →R
is continuous and K is a closed and bounded subset of R
n
(hence a compact subset).
Then f has a maximum value and a minimum value on K.
4.2 Partial derivatives
An important goal in economic analysis is to understand how a change in one
economic variable affects another.
We have seen (see subsection 3.1.2) that one-variable calculus helps us to under-
stand such kind of changes in the case of functions of one variable.
Since we are interested in the variation brought by the change in one variable,
we will change one variable at a time, keeping all the other variables constant and
the correspondent derivative is called the partial derivative of f with respect to the
considered variable.
Remark 1. From now on all the domains of definition of considered functions will
be open and connected sets and will be denoted by D. Recall that an open connected
set is a domain (see appendix A). Hence any point in D will be an accumulation point
for D.
For a better understanding of these notions we will start with the case of functions
of two variables.
Definition 1. (Partial differentiability at a point; two variables case) Let f : D →
R, D ⊆ R
2
and let (a, b) ∈ D.
201
We say that f is partial differentiable with respect to the variable x at the point
(a, b) if the following limit exists and has a finite value
∂f
∂x
(a, b) = f

x
(a, b) = lim
x→a
f(x, b) −f(a, b)
x −a
= lim
h→0
f(a +h, b) −f(a, b)
h
(1)
In the same way we can define
∂f
∂y
(a, b) = f

y
(a, b) = lim
y→b
f(a, y) −f(a, b)
y −b
= lim
h→0
f(a, b +h) −f(a, b)
h
(2)
Example 1. Compute the partial derivatives of f at the point (2,1) where
f : R
2
→R, f(x, y) = x
2
+y
2
+xy.
Solution.
∂f
∂x
(2, 1) = lim
x→2
f(x, 1) −f(2, 1)
x −2
= lim
x→2
x
2
+ 1 +x −(2
2
+ 1
2
+ 2 1)
x −2
= lim
x→2
x
2
+x −6
x −2
= lim
x→2
(x −2)(x + 3)
x −2
= lim
x→2
(x + 3) = 5
∂f
∂y
(2, 1) = lim
y→1
f(2, y) −f(2, 1)
y −1
= lim
y→1
4 +y
2
+ 2y −7
y −1
= lim
y→1
y
2
+ 2y −3
y −1
= lim
y→1
(y −1)(y + 3)
y −1
= lim
y→1
(y + 3) = 4
Definition 2. (Partial differentiability on a set, two variables case) Let f : D →R,
D ⊆ R
2
. We say that the function f is partial differentiable with respect to x on the
domain D, if f is partial differentiable with respect to x at any point (x, y) in D.
In this case the following limit exists and has a finite value at each point (x, y) in
D.
∂f
∂y
(x, y) = f

x
(x, y) = lim
h→0
f(x +h, y) −f(x, y)
h
, for each (x, y) ∈ D.
Similarly, we can define partial differentiability of f with respect to y on a set.
∂f
∂y
(x, y) = f

y
(x, y) = lim
h→0
f(x, y +h) −f(x, y)
h
for each (x, y) ∈ D.
In this case we can define the partial derivative functions of f with respect to x,
respectively to y
∂f
∂x
= f

x
: D →R (3)
202
(x, y) →
∂f
∂x
(x, y)
respectively
∂f
∂y
= f

y
: D →R (4)
(x, y) →
∂f
∂y
(x, y).
Actually, to find the partial derivative function with respect to x we think of y as
a constant and we differentiate in the usual way with respect to x. This gives another
method of computing the partial derivatives at a given point. First we compute the
partial derivative functions and then we evaluate them at the considered point.
Example 2. Another way of computing the partial derivatives in Example 1 is
the following:
∂f
∂x
: R
2
→R
∂f
∂x
(x, y) = (x
2
+y
2
+xy)

x
= (x
2
)

x
+ (y
2
)

x
+ (xy)

x
= 2x + 0 +y = 2x +y
So,
∂f
∂x
(2, 1) = 2 2 + 1 = 5
∂f
∂y
(x, y) = (x
2
+y
2
+xy)

y
= (x
2
)

y
+ (y
2
)

y
+ (xy)

y
= 2y +x
and
∂f
∂y
(2, 1) = 2 1 + 2 = 4,
as we expected.
As we can see in the above computations in order not to make mistakes, we have
to mention with respect to whom we compute the partial derivative functions.
Example 3. Compute the partial derivatives of each of the following functions:
a) f : D →R, f(x, y) =
xy
x
2
+y
2
b) g : D →R, g(s, t) = (s
2
−st +t
2
)
5
c) f : D →R, f(x, y) =
5

xy.
Solution. a) D = ¦(x, y) ∈ R
2
[ x
2
+y
2
,= 0¦ = R
2
−¦(0, 0)¦
To compute
∂f
∂x
, think of the variable y as a constant and apply the quotient rule
(see subsection 3.1.2)
∂f
∂x
(x, y) =
_
xy
x
2
+y
2
_

x
=
(xy)

x
(x
2
+y
2
) −xy(x
2
+y
2
)

x
(x
2
+y
2
)
2
=
y(x
2
+y
2
) −xy 2x
(x
2
+y
2
)
2
=
y(y
2
−x
2
)
(x
2
+y
2
)
2
203
To compute
∂f
∂y
, think of the variable x as a constant and apply the quotient rule.
∂f
∂y
(x, y) =
_
xy
x
2
+y
2
_

y
=
(xy)

y
(x
2
+y
2
) −xy(x
2
+y
2
)

y
(x
2
+y
2
)
2
=
x(x
2
+y
2
) −xy 2y
(x
2
+y
2
)
2
=
x(x
2
−y
2
)
(x
2
+y
2
)
2
b) D = R
2
To compute
∂g
∂s
, we treat the variable t as if it is a constant.
∂g
∂s
(s, t) = [(s
2
−st +t
2
)
5
]

s
= 5(s
2
−st +t
2
)
4
(s
2
−st +t
2
)

s
= 5(s
2
−st +t
2
)
4
(2s −t)
In the same way we can obtain that
∂g
∂t
(s, t) = [(s
2
−st +t
2
)
5
]

t
= 5(s
2
−st +t
2
)
4
(s
2
−st +t
2
)

t
= 5(s
2
−st +t
2
)
4
(2t −s)
c) D = R
2
∂f
∂x
(x, y) = (
5

xy)

x
=
5

y (
5

x)

x
=
5

y (x
1
5
)

=
5

y
1
5
x
1
5
−1
=
5

y
1
5

1
5

x
4
=
1
5

5
_
y
x
4
The previous equality is true for x ,= 0.
So, if x = 0, in order to obtain the partial derivative with respect to x at the
points of the form (0, y), we cannot simply differentiate and substitute (x, y) = (0, y).
In fact, we can use the definition 1.
∂f
∂x
(0, y) = lim
x→0
f(x, y) −f(0, y)
x −0
= lim
x→0
5

xy
x
= lim
x→0
5
_
y
x
4
=
_
∞, if y > 0
−∞, ify < 0
, for x = 0 and y ,= 0.
If x = 0, y = 0 we have
∂f
∂x
(0, 0) = lim
x→0
f(x, 0) −f(0, 0)
x −0
= lim
x→0
0 −0
x
= lim
x→0
0 = 0
204
∂f
∂y
(x, y) = (
5

xy)

y
=
5

x (
5

y)

y
=
5

x
1
5
5
_
y
4
=
1
5

5
_
x
y
4
which is valid for y ,= 0 and x ,= 0.
For y = 0 and x ,= 0
∂f
∂y
(x, 0) = lim
y→0
f(x, y) −f(x, 0)
y −0
= lim
y→0
5

xy
y
= lim
y→0
5
_
x
y
4
=
_
∞, if x > 0
−∞, if x < 0
For y = 0 and x = 0
∂f
∂y
(0, 0) = lim
y→0
f(0, y) −f(0, 0)
y −0
= lim
y→0
0 −0
y
= 0
In conclusion
∂f
∂x
(x, y) =
_
¸
¸
_
¸
¸
_
0, (x, y) = (0, 0)
1
5
5
_
y
x
4
, (x, y) ,= (0, y)
does not exist, (x, y) = (0, y) where y ,= 0
and
∂f
∂y
(x, y) =
_
¸
¸
_
¸
¸
_
0, (x, y) = (0, 0)
1
5
5
_
x
y
4
, (x, y) ,= (x, 0)
does not exist, (x, y) = (x, 0) where x ,= 0.
Example 4. Economic interpretation of the partial derivatives
Let Q = f(K, L) be a production function where Q represents the output, K
represents the capital input and L represents the labor input.
If the firm is using K
0
units of capital and L
0
units of labor to produce Q
0
units
of output then the partial derivative:
∂f
∂K
(K
0
, L
0
) = lim
K→K
0
f(K, L
0
) −f(K
0
, L
0
)
K −K
0
= lim
∆K→0
f(K
0
+ ∆K, L
0
) −f(K
0
, L
0
)
∆K
= lim
∆K→0
∆Q
∆K

∆Q
∆K
is the rate at which output changes with respect to capital, keeping L fixed at L
0
.
If capital increases by ∆K, then the output will increase by
∆Q ≈
∂f
∂K
(K
0
, L
0
) ∆K.
205
If ∆K = 1 then ∆Q ≈
∂f
∂K
(K
0
, L
0
) so
∂f
∂K
(K
0
, L
0
) represents approximately the
change in output due to a one unit increase in capital (keeping L fixed).
∂f
∂K
(K
0
, L
0
) is called the marginal product of capital.
In the same way we can deduce that
∂f
∂L
(K
0
, L
0
) is the rate at which output
changes with respect to labor in the case in which capital is held fixed at K
0
.
∂f
∂L
(K
0
, L
0
) represents approximately the change in output due to a one unit
increase in labor (keeping the capital fixed at K
0
).
∂f
∂L
(K
0
, L
0
) is called the marginal product of labor.
As a particular case of the previous analysis we will consider the Cobb-Douglas
production function
Q : [0, ∞) [0, ∞) →R
Q(K, L) = 4K
3/4
L
1/4
If K = 10000 and L = 625 the output is
Q = 4 (10
4
)
3/4
(5
4
)
1/4
= 20000.
The partial derivatives at (10000, 625) are
∂Q
∂K
= (4K
3
4
L
1
4
)

K
= 4L
1
4

3
4
K
3
4
−1
= 3
_
L
K
_1
4
= 3
4
_
L
K
∂Q
∂K
(10000, 625) = 3
4
_
625
10000
= 3
5
10
=
3
2
∂Q
∂K
(10000, 625) =
3
2
∂Q
∂L
= 4
1
4
K
3
4
L

3
4
=
_
K
L
_3
4
∂Q
∂L
(10000, 625) =
_
10000
625
_3
4
=
_
10
5
_
3
= 2
3
= 8
∂Q
∂L
(10000, 625) = 8
By the previous discussion if L
0
= 625 and K is increased by ∆K, Q will increase
by approximately
3
2
∆K.
If ∆K = 10 then
Q(10000, 625) ≈ 20000 +
3
2
10 = 20015.
Evaluating
Q(10010, 625) = 4(10010)
3
4
625
1
4
= 20014, 99,
206
so the previous estimation is a very good one.
In the same way, by using the previous discussion if Q
0
= 10000 and L is decreased
by ∆L, L will decrease by approximately 8 ∆L.
That means that a 10 units decrease in labor should induce a 8 10 decrease in
output.
Q(10000, 615) ≈ 20000 −80 = 19920
Evaluating
Q(10000, 615) = 4 (10000)
3
4
615
1
4
≈ 19919, 5
so the previous estimation is a good one.
Example 5. In example 1, subsection 4.1.1 we described the work of Cobb and
Douglas in modelling the total production of an economic system.Here we use the
partial derivatives to show how their model can be obtained from certain assumptions
on economy.
If Q = Q(K, L) then the partial derivatives
∂Q
∂K
,
∂Q
∂L
are the rates of change at
which production modifies with respect to amount of capital and labor.
The assumptions made by Cobb and Douglas are the following.
(a) If either capital or labor vanishes, then so will production.
(b) The marginal product of capital is proportional to the amount of production
per unit of capital.
(c) The marginal product of labor is proportional to the amount of production
per unit of labor.
The second assumption says that
∂Q
∂K
= α
Q
K
for some constant α which is equivalent to
∂Q
∂K
Q
= α
1
K
.
If we keep L constant, by integrating this equality with respect to K we get
ln Q = αln K + ln C
1
,
where C
1
is a function which depend on L. From the previous relation we get
Q(K, L) = C
1
(L) K
α
(5)
Similarly, from the third assumption we get
Q(K, L) = C
2
(K) L
β
(6)
Combining (5) and (6) we have
Q(K, L) = bK
α
L
β
207
where b is a constant that is independent of both K and L.
If capital are labor are both increased by a factor m, then
Q(mK, mL) = b(mK)
α
(mL)
β
= bm
α+β
K
α
L
β
= m
α+β
Q(K, L)
If α +β = 1, then Q(mK, mL) = mQ(K, L), which means that the production is
also increased by a factor m. That is why Cobb and Douglas assumed that α+β = 1
and therefore
Q(K, L) = bK
α
L
1−α
.
Remark 2. The existence of partial derivatives of a function f at a given point
does not imply the continuity of f at that point, as we can see in the next example.
Example 6. Let f : R
2
→R,
f(x, y) =
_
xy
x
2
+y
2
, (x, y) ,= (0, 0)
0, (x, y) = (0, 0)
Show that f is not continuous at (0,0) even that the function f admits partial
derivatives at (0,0).
Solution. Example 2 from subsection 4.1.2 shows that
lim
(x,y)→(0,0)
f(x, y)
does not exist and in consequence f is not continuous at the point (0,0).
We will prove next that the partial derivatives of f exist at (0,0).
∂f
∂x
(0, 0) = lim
x→0
f(x, 0) −f(0, 0)
x
= lim
x→0
0 −0
x
= lim
x→0
0
x
= lim
x→0
0 = 0
∂f
∂y
(0, 0) = lim
y→0
f(0, y) −f(0, 0)
y
= lim
y→0
0
y
= lim
y→0
0 = 0.
Next, we will present the definition of partial derivatives in the general case of a
function of n variables.
Definition 3. Let f : D →R, D ⊆ R
n
, a = (a
1
, . . . , a
n
) ∈ D and let i = 1, n.
The partial derivative of function f with respect to x
i
at a is the following limit
(if the limit exists and has a finite value):
∂f
∂x
i
(a) = f

x
i
(a) =
208
lim
x→x
i
f(a
1
, . . . , a
i−1
, x
i
, a
i+1
, . . . , a
n
) −f(a
1
, . . . , a
n
)
x
i
−a
i
(7)
If the partial derivative of function f with respect to x
i
at a exists for all points a
in D we can define the partial derivative function with respect to x, in the following
way:
∂f
∂x
i
: D →R, x →
∂f
∂x
i
(x) (8)
Definition 4. A function f : D → R is continuously differentiable (or C
1
) on D
if the partial derivative functions
∂f
∂x
i
exist for all i = 1, n and are continuous on D.
Example 7. Compute the partial derivative functions of each of the following
functions:
a) f : R
3
→R, f(x, y, z) = x
2
+ sin yz
b) f : D →R, D = ¦x ∈ R
n
[ x
1
> 0, x
2
> 0, . . . , x
n
> 0¦
f(x
1
, . . . , x
n
) = x
2
1
+x
2
2
+ +x
2
n−1
+
x
n
x
1
.
Solution. a)
∂f
∂x
: R
3
→R,
∂f
∂x
(x, y, z) = (x
2
+ sin yz)

x
= 2x + 0 = 2x
∂f
∂y
: R
3
→R,
∂f
∂y
(x, y, z) = (x
2
+ sin yz)

y
= 0 +z cos yz = z cos yz
∂f
∂y
: R
3
→R,
∂f
∂z
(x, y, z) = (x
2
+ sin yz)

z
= y cos yz
b)
∂f
∂x
1
: D →R,
∂f
∂x
1
(x) =
_
x
2
1
+ +x
2
n−1
+
x
1
x
n
_

x
1
= 2x
1
+
1
x
n
∂f
∂x
2
: D →R,
∂f
∂x
2
(x) =
_
x
2
1
+ +x
2
n−1
+
x
1
x
n
_

x
2
= 2x
2
. . . . . . . . .
∂f
∂x
n−1
: D →R,
∂f
∂x
n−1
(x) =
_
x
2
1
+ +x
2
n−1
+
x
1
x
n
_

x
n−1
= 2x
n−1
∂f
∂x
n
: D →R,
∂f
∂x
n
(x) =
_
x
2
1
+ +x
2
n−1
+
x
1
x
n
_

x
n
= −
x
1
x
2
n
We end this section by presenting a geometric interpretation of partial derivatives.
Suppose f is a function of two variables x and y. If c is some value in the range
of the function f, then the equation f(x, y) = c describes a curve lying on the plane
z = c called the trace of the graph of f in the plane z = c.
209
If this trace is projected onto the xy-plane, the resulting curve in the xy-plane is
called a level curve.
Actually, for any constant c, the points (x, y) for which f(x, y) = c form a curve
in the xy plane that is said a level curve of f.
The slope of the line that is tangent to the level curve f(x, y) = c at a particular
point is given by the derivative y

(x). This derivative is the rate of change of y with
respect to x on the level curve and hence is approximately the amount by which the
y coordinate of a point on the level curve changes when the x coordinate is increased
by 1.
For example, if f represent output and x and y represent the levels of skilled
and unskilled labor, respectively, the slope y

(x) of the tangent to the level curve
f(x, y) = c is an approximation to the amount by which the manufacturer should
change the level of unskilled labor y to compensate for a 1-unit increase in the level
of skilled labor x so that output will remain unchanged.
`
¸
·
`
_
y

(x)
f(x, y) = c
y
x x x + 1
actual change in y
on level curve
One way to compute y

(x) is to solve the equation f(x, y) = c in terms of x, and
then differentiate the resulting expression with respect to x. Sometimes is difficult or
even impossible to solve the equation f(x, y) = c explicitly for y. In such cases, we can
differentiate the equality f(x, y) = c with respect to x, by considering y to depend on
x:
∂f
∂x
1 +
∂f
∂y
y

(x) = 0
wherefrom we get the formula
y

(x) = −
f

x
(x, y)
f

y
(x, y)
(9)
Since
y

(x) = lim
∆x→0
∆y
∆x

∆y
∆x
,
210
then
∆y ≈ y

(x) ∆x = −
f

x
f

y
∆x.
In conclusion, the change in y needed to compensate a small change ∆x in x so
that the value of the function f(x, y) will remain unchanged is
∆y ≈ −
f

x
f

y
∆x.
Example 8. Indifference curves
Let U(x, y) be a utility function which measures the total satisfaction (or utility)
the consumer obtains from having x units of the first commodity and y units of the
second. An indifference curve is a level curve U(x, y) = C of the utility function.
An indifference curve gives all the combination of x and y that lead to the same
level of consumer satisfaction.
We next present a typical example involving the slope of an indifference curve.
Suppose U(x, y) = x
3
2
y. The consumer currently owns x = 16 units of the first
commodity and y = 20 units of the second.
Use calculus to estimate how many units of these commodity could substitute for
1 unit of the first commodity without changing the total utility.
Solution. The level of utility is
U(16, 20) = 1280.
The corresponding indifference curve is sketched in figure below. Since 1280 = x
3
2
y,
then y = 1280x

3
2
.
`
¸
We try to estimate the change ∆y required to compensate a change of ∆x = −1
so that the utility will remain at its current level which is 1280. The approximation
formula
∆y ≈ −
U

x
U

y
∆x =
U

x
U

y
=
3
2
x
1
2
y
x
3
2
=
3
2

y
x
211
with x = 16 and y = 20 gives
∆y =
3
2

20
16
=
15
8
= 1, 875 units.
4.3 Higher order partial derivatives
The partial derivative
∂f
∂x
i
of a function f, i ∈ ¦1, . . . , n¦, is itself a function of
n variables so we can continue analyzing the existence of partial derivatives of these
partial derivatives obtaining the second order partial derivatives.
Let f : D →R, D ⊆ R
n
, a ∈ D and i = 1, n. We assume that the partial derivative
function
∂f
∂x
i
exists.
Definition 1. If the function
∂f
∂x
i
is partial differentiable at the point a with
respect to x
j
(j = 1, n) we say that f admits second-order partial derivative at the
point a with respect to x
i
and x
j
.
We have the following notations:

∂x
j
_
∂f
∂x
i
_
(a) =

2
f
∂x
i
∂x
j
(a)
or
(f

x
i
)

x
j
(a) = f
′′
x
i
x
j
(a).
If i = j, then the second order partial derivative is written in the following way

∂x
i
_
∂f
∂x
i
_
(a) =

2
f
∂x
2
i
(a) = f
′′
x
2
i
(a)
If i ,= j, then

2
f
∂x
i
∂x
j
(a) = f
′′
x
i
x
j
(a)
is called the mixed partial derivative or cross partial derivative.
Continuing in the same way we can obtain higher order partial derivatives.
Example 1. Let f : R
2
→R, f(x, y) = e
x
y. Compute f

x
, f
′′
x
2
, f
′′′
x
2
y
.
Solution.
f

x
= (e
x
y)

x
= y(e
x
)

x
= ye
x
f
′′
x
2 = (f

x
)

x
= (ye
x
)

x
= ye
x
f
′′′
x
2
y
= (f
′′
x
2)

y
= ((f

x
)

x
)

y
= e
x
Remark 1. A function of n variables can admit
• n first order partial derivatives:
∂f
∂x
1
,
∂f
∂x
2
, . . . ,
∂f
∂x
n
212
• n
2
second order partial derivatives (since for each partial derivative we have n
second order partial derivatives)
• n
3
third order partial derivatives
. . . . . . . . .
• n
k
k
th
order partial derivatives.
It is natural to arrange the n
2
partial derivatives of f at a given point a into a
n n matrix whose (i, j)th entry is

2
f
∂x
i
∂x
j
(a). This matrix is called the Hessian or
the Hessian matrix of f at a.
H(a) =
_
_
_
_
_
_
_
_
_
_
_
_

2
f
∂x
2
1
(a)

2
f
∂x
1
∂x
2
(a) . . .

2
f
∂x
1
∂x
n
(a)

2
f
∂x
2
∂x
1
(a)

2
f
∂x
2
2
(a) . . .

2
f
∂x
2
∂x
n
(a)
. . . . . . . . . . . .

2
f
∂x
n
∂x
1
(a)

2
f
∂x
n
∂x
2
(a) . . .

2
f
∂x
2
n
(a)
_
_
_
_
_
_
_
_
_
_
_
_
(1)
If we denote by
a
ij
=

2
f
∂x
i
∂x
j
(a),
then a shorter written form of H(a) is the following:
H(a) =
_
_
_
_
a
11
a
12
. . . a
1n
a
21
a
22
. . . a
2n
. . . . . . . . . . . .
a
n1
a
n2
. . . a
nn
_
_
_
_
Definition 2. a) If all the second partial derivatives of f exist and are themselves
continuous functions we say that f is twice continuously differentiable or C
2
.
b) A function f is C
3
(or 3 times continuously differentiable) if all of its n
3
third
order partial derivatives exist and are continuous.
In the same way we can define C
k
functions (for which all n
k
k
th
order partial
derivatives exist and are continuous).
Example 2. Let f : R
2
→R be the function defined by
f(x, y) = x
3
y
2
−4y
2
x.
Compute all the third partial derivative functions of f.
Solution. It is easier for us to arrange the calculation in the following tree dia-
gram.
213
f(x, y) = x
3
y
2
− 4y
2
x
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
f

x
= 3x
2
y
2
− 4y
2
_
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
_
f
′′
x
2 = 6xy
2
_
_
_
f
′′′
x
3 = 6y
2
f
′′′
x
2
y
= 12xy
f
′′
xy
= 6x
2
y − 8y
_
_
_
f
′′′
xyx
= 12xy
f
′′′
xy
2 = 6x
2
− 8
f

y
= 2x
3
y − 8yx
_
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
_
f
′′
yx
= 6x
2
y − 8y
_
_
_
f
′′′
yx
2 = 12xy
f
′′′
yxy
= 6x
2
− 8
f
′′
y
2 = 2x
3
− 8x
_
_
_
f
′′′
y
2
x
= 6x
2
− 8
f
′′′
y
3 = 0
From the previous example we observe that some of the mixed partial derivatives
are equal and the order of differentiation seems to be of no importance (at least in this
case). This is not an accident, the equality is true for almost all functions which arise
in practical applications. More precisely we have the following result. This theorem
gives us informations about the equality of the mixed second order partial derivatives.
Theorem 1. (Schwarz theorem, general case). Let f : D → R, D ⊆ R
n
, a ∈ D,
i, j = 1, n, with i ,= j.
If both f
′′
x
i
x
j
and f
′′
x
j
x
i
exist for all points in a ball centered at a and they are
continuous at a then
f
′′
x
i
x
j
(a) = f
′′
x
j
x
i
(a).
For the sake of simplicity of the notations we will present the proof just for the
case of two variables.
The reasoning for the general case remains the same. In the case of two variables
we have the following statement of Schwarz theorem.
Theorem 2. Let f : D →R, D ⊆ R
2
, (a, b) ∈ D. If both f
′′
xy
and f
′′
yx
exist for all
points in a disc centered at (a, b) and they are continuous at (a, b) then
f
′′
xy
(a, b) = f
′′
yx
(a, b).
Proof. Let r be the radius of the disc centered at (a, b) and let u, v be real number
such that [u[, [v[ <
r

2
.
214
(a, b)
(a, b +v) (a +u, b +v)
(a +u, b)
In this case the rectangle whose corners are the points (a, b), (a+u, b), (a+u, b+v)
and (a, b +v) is situated inside the disc centered at (a, b) with radius r.
Applying the mean value theorem (Theorem 3, subsection 3.1.4) to the function
g : [a, a +u] →R,
g(x) = f(x, b +v) −f(x, b)
we find there exist a
0
(which depends on u and v) between a and a +u such that
g(a +u) −g(a) = g

(a
0
)u
which translates to
f(a +u, b +v) −f(a +u, b) −f(a, b +v) +f(a, b)
=
_
∂f
∂x
(a
0
, b +v) −
∂f
∂x
(a
0
, b)
_
u
Applying Lagrange’s theorem to the function
h : [b, b +v] →R, h(y) =
∂f
∂x
(a
0
, y),
there is some b
0
(which depends on u and v) between b and b +v such that
h(b +v) −h(b) = h

(b
0
)v
Hence,
f(a +u, b +v) −f(a +u, b) −f(a, b +v) +f(a, b) =

2
f
∂x∂y
(a
0
, b
0
)uv
Interchanging x and y in the argument above, we can find a point (a
1
, b
1
) such
that:
f(a +u, b +v) −f(a +u, b) −f(u, b +v) +f(a +b) =

2
f
∂y∂x
(a
1
, b
1
)uv
215
Thus

2
f
∂x∂y
(a
0
, b
0
) =

2
f
∂y∂x
(a
1
, b
1
)
Letting u, v → 0 we obtain that (a
0
, b
0
) → (a, b) and (a
1
, b
1
) → (a, b) so the claim
follows from the continuity of the partial derivatives at (a, b).
Fortunately, just about every function we will meet in applications will be C
2
and
therefore its mixed partial derivatives will be equal.
It is important to note that Schwarz’ theorem implies that the Hessian matrix (1)
is a symmetric one.
Remark 2. If f : D → R, D ⊆ R
n
is a C
2
function on D and a ∈ D then the
Hessian matrix H(a) is a symmetric one.
Remark 3. The previous result (Schwarz theorem) is true for mixed higher order
derivatives too (with the assumption of continuity of partial derivatives). So, in these
conditions we can reverse the order of any two successive differentiations.
For example, if we take an x
1
, x
2
, x
4
derivative of order 3 then the order of
differentiation does not matter for a C
3
function and we have
f
′′′
x
1
x
2
x
4
= f
′′′
x
1
x
4
x
2
= f
′′′
x
2
x
1
x
4
= f
′′′
x
2
x
4
x
1
= f
′′′
x
4
x
1
x
2
= f
′′′
x
4
x
2
x
1
Example 3. Let f : R
2
→R defined by
f(x, y) =
_
_
_
0, if (x, y) = (0, 0)
x
3
y −xy
3
x
2
+y
2
, otherwise.
Compute the mixed partial derivatives at (0,0). Does this result contradict the
conclusion of Schwarz theorem?
Solution. For each (x, y) ,= (0, 0), by applying the quotient rule of differentiation
we easily get that:
∂f
∂x
(x, y) =
x
4
y −y
5
+ 4x
2
y
3
(x
2
+y
2
)
2
and
∂f
∂y
(x, y) =
x
5
−4x
3
y
2
−xy
4
(x
2
+y
2
)
2
We will compute the mixed partial derivatives at (0,0) by using the limit of defi-
nition

2
f
∂y∂x
(0, 0) =

∂y
_
∂f
∂x
_
(0, 0) = lim
y→0
∂f
∂x
(0, y) −
∂f
∂x
(0, 0)
y −0
(2)
By the previous computation we have that
∂f
∂x
(0, y) =
−y
5
y
4
= −y
216
On the other hand, by using once more the limit definition we obtain
∂f
∂x
(0, 0) = lim
x→0
f(x, 0) −f(0, 0)
x −0
= lim
x→0
0
y
2
−0
x
= lim
x→0
0
x
= 0.
Substituting these two partial derivatives in (2) we get

2
f
∂y∂x
(0, 0) = lim
y→0
−y −0
y
= −1
so the final result will be

2
f
∂y∂x
(0, 0) = −1.
In the same way we obtain

2
f
∂x∂y
(0, 0) =

∂x
_
∂f
∂y
_
(0, 0) = lim
x→0
∂f
∂y
(x, 0) −
∂f
∂y
(0, 0)
x
= lim
x→0
x −0
x
= 1
so

2
f
∂x∂y
(0, 0) = 1
In conclusion we have

2
f
∂x∂y
(0, 0) ,=

2
f
∂y∂x
(0, 0)
This result does not contradict Schwarz’s theorem, since, as we can latter see, the
partial derivative function is not continuous at (0,0) so the hypotheses of Schwarz’s
theorem do not hold.
For each (x, y) ,= (0, 0) we have

2
f
∂y∂x
(x, y) =

∂y
_
∂f
∂x
_
(x, y) =
_
x
4
y −y
5
+ 4x
2
y
3
(x
2
+y
2
)
2
_

y
=
x
6
−y
6
−9x
2
y
4
+ 9x
4
y
2
(x
2
+y
2
)
3
It remains for us to evaluate the limit of the previous function at (0,0). We will
compute first the limit along the line y = mx, m ∈ R.
lim
x→0
y=mx

2
f
∂y∂x
(x, y) = lim
x→0
y=mx
x
6
−y
6
−9x
2
y
4
+ 9x
4
y
2
(x
2
+y
2
)
3
= lim
x→0
x
6
(1 −m
6
−9m
4
+ 9m
2
)
x
6
(1 +m
2
)
3
=
1 −m
6
−9m
4
+ 9m
2
(1 +m
2
)
3
217
Since the limit depends on m we conclude that lim
(x,y)→(0,0)

2
f
∂y∂x
(x, y) does not
exist.
In consequence the function

2
f
∂x∂y
is not continuous in (0,0).
4.4 Differentiability
4.4.1 Differentiability. The total differential
Because of the geometric representation we will start the discussion in this section
with the case of functions of two variables.
Suppose we are interested in the behaviour of a function f in the neighborhood of
a given point (a, b).
We know from the calculus of one variable that (see subsection 3.1.3)
• if y = b and u is a real number sufficiently small we have that
f(a +u, b) −f(a, b) ≈
∂f
∂x
(a, b) u
• if x = a and v is a real number sufficiently small than
f(a, b +v) −f(a, b) ≈
∂f
∂y
(a, b) v.
It is natural to ask ourselves what happens if both a, b changes into a + u and
b +v. The expected effect is the sum of the effects of the one variables changes.
f(a +u, b +v) −f(a, b) ≈
∂f
∂x
(a, b)u +
∂f
∂y
(a, b)v (1)
For a geometric interpretation it is more convenient to write the previous formula
in the form
f(a +u, b +v) ≈ f(a, b) +
∂f
∂x
(a, b)u +
∂f
∂y
(a, b)v (2)
The right-hand side of (2) is exactly the equation of the tangent plane to the graph
of f at the point ((a, b), f(a, b)) and therefore (2) is an analytical expression of the
fact that the tangent plane is a good approximation to the graph (as we can see in
the figure below).
218
We have now to justify rigorously the discussion above.
Definition 1.
a) Differentiability at a given point (two variables case)
Let f : D →R, D ⊆ R
2
, (a, b) ∈ D.
We say that f is differentiable at (a, b) if there exist
α
1
, α
2
∈ R, ω : D →R, lim
(x,y)→(a,b)
ω(x, y) = ω(a, b) = 0
such that
f(x, y) −f(a, b) = α
1
(x −a) +α
2
(y −b)
+ω(x, y)
_
(x −a)
2
+ (y −b)
2
, for all (x, y) ∈ D (3)
b) Differentiability on a set
We say that f is differentiable on D if f is differentiable at each point in D.
Remark 1. Let f : D →R, D ⊆ R
2
, (a, b) ∈ D.
a) If f is differentiable at (a, b) then f is continuous at (a, b).
b) If f is differentiable at (a, b) then f is partial differentiable at (a, b) with respect
to x and y and
∂f
∂x
(a, b) = α
1
and
∂f
∂y
(a, b) = α
2
.
c) The converse statements of part a) and part b) are not true.
Proof. a) If we let (x, y) → (a, b) in (3) we get that
lim
x→a
y→b
[f(x, y) −f(a, b)] = lim
x→a
y→b

1
(x −a) +α
2
(y −b)
+ω(x, y)
_
(x −a)
2
+ (y −b)
2
] = 0.
In consequence
lim
(x,y)→(a,b)
f(x, y) = f(a, b),
so f is continuous at (a, b).
219
b) We have to evaluate the limits
lim
x→a
f(x, b) −f(a, b)
x −a
and lim
y→b
f(a, y) −f(a, b)
y −b
.
If we take y = b in (3), divide the obtained relation by x − a and take then the
limit in both parts of equality we obtain
∂f
∂x
(a, b) = lim
x→a
f(x, b) −f(a, b)
x −a
= lim
x→a
_
α
1
+ω(x, b)
[x −a[
x −a
_
= α
1
+ lim
x→a
ω(x, b)
[x −a[
x −a
= α
1
+ 0 = α
1
.
The last limit is 0 as a limit of a product between a function with limit equal to
0 and a bounded function.
The fact that
∂f
∂y
(a, b) = α
2
can be proved in a similar manner.
c) We will present here a function which is continuous at (0,0), admits partial
derivatives at (0,0) and still is not differentiable at (0,0).
Let
f : R
2
→R, f(x, y) =
_
_
_
xy
_
x
2
+y
2
, (x, y) ,= (0, 0)
0, (x, y) = (0, 0)
We study first the continuity of f at (0,0) by taking the following limit:
lim
x→0
y→0
xy
_
x
2
+y
2
= lim
x→0
y→0
y
_
x
2
+y
2
x = 0 = f(0, 0)
The previous limit is 0 since
−1 ≤
y
_
x
2
+y
2
≤ 1 and lim
x→0
y→0
x = 0.
By using the limit definition we study the existence of partial derivatives at (0,0).
∂f
∂x
(0, 0) = lim
x→0
f(x, 0) −f(0, 0)
x
= lim
x→0
0
x
= lim
x→0
0 = 0
∂f
∂y
(0, 0) = lim
y→0
f(0, y) −f(0, 0)
y
= lim
y→0
0
y
= lim
y→0
0 = 0
Assume, by contradiction, that f is differentiable at (0,0). Then there exists
ω : R
2
→R, lim
x→0
y→0
ω(x, y) = ω(0, 0) = 0,
220
such that
f(x, y) −f(0, 0) =
∂f
∂x
(0, 0)(x −0) +
∂f
∂y
(0, 0)(y −0)
+ω(x, y)
_
x
2
+y
2
, (x, y) ∈ R
2
.
Since
∂f
∂x
(0, 0) =
∂f
∂y
(0, 0) = 0,
the previous equality reduces to
f(x, y) = ω(x, y)
_
x
2
+y
2
,
wherefrom we obtain:
ω(x, y) =
_
_
_
xy
_
x
2
+y
2
, (x, y) ,= (0, 0)
0, (x, y) = (0, 0)
But lim
(x,y)→(0,0)
ω(x, y) does not exist (see Example 2, subsection 4.1.2) which con-
tradicts the assumption, so f is not differentiable as desired.
Remark 2. Let f : D → R, (a, b) ∈ R. If f is differentiable at (a, b) then there
exists a function ω : D →R, with
lim
x→a
y→b
ω(x, y) = ω(a, b) = 0,
such that
f(x, y) = f(a, b) +
∂f
∂x
(a, b)(x −a) +
∂f
∂y
(a, b)(y −b)
+ω(x, y)
_
(x −a)
2
+ (y −b)
2
(4)
for all (x, y) ∈ D.
If we let x −a = u, y −b = v and u and v are sufficiently small then the quantity
ω(x, y)
_
(x −a)
2
+ (y −b)
2
is small enough to justify the geometric approximation
f(x, y) ≈ f(a, b) +
∂f
∂x
(a, b)(x −a) +
∂f
∂y
(a, b)(y −b)
which was discussed at the beginning of this section.
Since the definition of differentiability is quite difficult to be checked we will
present, without proof (which is beyond the scope of this text) the sufficient con-
ditions for differentiability.
Theorem 1. If f has continuous partial derivatives on D (f is C
1
) then f is
differentiable on D.
Definition 2. (the differential or the total differential, 2 variables case)
221
Lett f : D →R, (a, b) ∈ D. If f is differentiable at (a, b) the differential (the total
differential) of f at (a, b) is defined by
df
(a,b)
: R
2
→R
df
(a,b)
(h
1
, h
2
) =
∂f
∂x
(a, b)h
1
+
∂f
∂y
(a, b)h
2
(5)
For h = (h
1
, h
2
) ∈ R
2
the number
df
(a,b)
(h) = df
(a,b)
(h
1
, h
2
)
is called the differential of f at (a, b) of increment h = (h
1
, h
2
) ∈ R
2
.
By using this new notion the relation (5) becomes:
f(x, y) = f(a, b) +df
(a,b)
(x −a, y −b)
+ω(x, y)
_
(x −a)
2
+ (y −b)
2
(6)
with the correspondent approximation
f(x, y) ≈ f(a, b) +df
(a,b)
(x −a, y −b) (7)
Example 1. Compute the total differential of the function
f : R
2
→R, f(x, y) = x
2
+y
2
+xy
at the point (2,1).
Solution. The function f is a C
1
function on R
2
so, according to Theorem 1, f is
differentiable on R
2
and in consequence the total differential of f exists at any point
in R
2
.
df
(2,1)
: R
2
→R
df
(2,1)
(h
1
, h
2
) =
∂f
∂x
(2, 1)h
1
+
∂f
∂y
(2, 1)h
2
= (2 2 + 1)h
1
+ (2 1 + 2)h
2
= 5h
1
+ 4h
2
We will present now the definitions in general case.
Definition 3. (differentiability, general case)
Let f : D → R, D ⊆ R
n
, D domain. We say that f is differentiable at a if there
exist α
1
, α
2
, . . . , α
n
∈ R, ω : D →R,
lim
x→a
ω(x) = ω(a) = 0,
such that
f(x) −f(a) = α
1
(x
1
−a
1
) + +α
n
(x
n
−a
n
)
222
+ω(x)|x −a|, x ∈ D. (8)
As in the two dimensional case if f is differentiable at a then f is continuous at a
and f admits partial derivatives at a with
∂f
∂x
i
(a) = α
i
, i = 1, n.
If f is C
1
on D then f is differentiable at each point in D.
Definition 4. (the differential or the total differential, general case)
Let f : D →R, D ⊆ R
n
and a ∈ D. If f is differentiable at a then the differential
(the total differential) of f at a is defined by
df
(a)
: R
n
→R
df
(a)
(h) =
∂f
∂x
1
(a)h
1
+
∂f
∂x
2
(a)h
2
+ +
∂f
∂x
n
(a)h
n
(9)
Sometimes instead of h
i
, i = 1, n, we can use the notation dx
i
which is considered
to be a small increment along x
i
axis.
If dx = (dx
1
, dx
2
, . . . , dx
n
) then
df
(a)
(dx) =
∂f
∂x
1
(a)dx
1
+
∂f
∂x
2
(a)dx
2
+ +
∂f
∂x
n
(a)dx
n
.
By using (9) the relation (8) can be written in the following form
f(x) = f(a) +df
(a)
(x −a) +ω(x)|x −a|, (10)
for all x ∈ D.
Since ω(x)|x − a| → 0 as x → a for |x −a| small enough we have the following
approximation of (10)
f(x) ≈ f(a) +df
(a)
(x −a) (11)
The chain rule
Recall that the chain rule for functions of a single variable gives the rule for
differentiating a composite function. If f and g are differentiable functions such that
f ◦ g makes sense then
(f ◦ g)

(t) = f

(g(t)) g

(t).
For functions of more than one variable, the Chain rule has several versions, each
of them giving a rule for differentiating a composite function.
The first version (Theorem 2) analyse the case where z = f(x, y) and each of the
variables x and y is a function of a variable t.
This means that f is indirectly a function of t,
z(t) = f(x(t), y(t)),
223
and the Chain rule gives a formula for differentiating z as a function of t. Assume f,
x and y are differentiable functions.
Theorem 2. Suppose that z = f(x, y) is a differentiable function of x and y, where
x = x(t) and y = y(t) are both differentiable functions of t. Then z is differentiable
function of t and
z

(t) =
∂f
∂x
(x(t), y(t))x

(t) +
∂f
∂y
(x(t), y(t))y

(t) (12)
For short, we can write down the previous formula in the following way:
z

(t) =
∂f
∂x
x

(t) +
∂f
∂y
y

(t) (13)
Proof. From Definition 1 and Remark 2 we have:
∆z =
∂f
∂x
∆x +
∂f
∂y
∆y +ω
_
(∆x)
2
+ (∆y)
2
where ω → 0 as (∆x, ∆y) → (0, 0).
Dividing both sides of this equation by ∆t, we have
∆z
∆t
=
∂f
∂x

∆x
∆t
+
∂f
∂y

∆y
∆t

¸
_
∆x
∆t
_
2
+
_
∆y
∆t
_
2
If we now let ∆t → 0, then
∆x = x(t + ∆t) −x(t) → 0
because x is differentiable and therefore continuous. In the same way ∆y → 0. This
will imply that ω → 0 so,
z

(t) = lim
∆t→0
∆z
∆t
=
∂f
∂x
lim
∆t→0
∆x
∆t
+
∂f
∂y
lim
∆t→0
∆y
∆y
+ lim
∆t→0
ω
¸
_
lim
∆t→0
∆x
∆t
_
2
+
_
lim
∆t→0
∆y
∆t
_
2
=
∂f
∂x
x

(t) +
∂f
∂y
y

(t) + 0
_
(x

(t))
2
+ (y

(t))
2
=
∂f
∂x
x

(t) +
∂f
∂y
y

(t)
Since we often write
∂z
∂x
in place of
∂f
∂x
we can rewrite the chain rule in the form
z

(t) =
∂z
∂x
x

(t) +
∂z
∂y
y

(t). (14)
224
If z depends on more than two variables then
z

(t) =
∂z
∂x
1
x

1
(t) +
∂z
∂x
2
x

2
(t) + +
∂z
∂x
n
x

n
(t)
=
n

i=1
∂z
∂x
i
x

i
(t) (15)
As you have already observed, in order not to make the notations too complicated
we worked in a more formal way in formulating and proving the previous theorem.
Example 2. Use the chain rule to find w

(t) if w = ln
_
x
2
+y
2
+z
2
, x = sin t,
y = cos t, z = tg t et t =
π
4
.
Solution.
w

(t) =
∂w
∂x
x

(t) +
∂w
∂y
y

(t) +
∂w
∂z
z

(t)
= (ln
_
x
2
+y
2
+z
2
)

x
sin

t + (ln
_
x
2
+y
2
+z
2
)

y
cos

t
+(ln
_
x
2
+y
2
+z
2
)

z
tg

t
=
(
_
x
2
+y
2
+z
2
)

x
_
x
2
+y
2
+z
2
cos t −
(
_
x
2
+y
2
+z
2
)

y
_
x
2
+y
2
+z
2
sin t
+
(
_
x
2
+y
2
+z
2
)

z
_
x
2
+y
2
+z
2

1
cos
2
t
w

(t) =
x
x
2
+y
2
+z
2
cos t −
y
x
2
+y
2
+z
2
sin t +
z
x
2
+y
2
+z
2

1
cos
2
t
It’s not necessary to substitute the expressions for x, y and z in terms of t. We
observe that when t =
π
4
we have
x = sin
π
4
=

2
2
, y = cos
π
4
=

2
2
, z = tg
π
4
= 1.
Therefore
w

_
π
4
_
=

2
2
_

2
2
_
2
+
_

2
2
_
2
+ 1


2
2


2
2
_

2
2
_
2
+
_

2
2
_
2
+ 1


2
2
+
1
_

2
2
_
2
+
_

2
2
_
2
+ 1

1
_

2
2
_
2
= 1
Hence
w

_
π
4
_
= 1.
225
We now consider the situation where z = f(x, y) but each of x and y is a function
of two variables s and t: x = x(s, t), y = y(s, t). Then z is indirectly a function of s
and t. We try to find
∂z
∂s
and
∂z
∂t
.
Theorem 3. Suppose that z = f(x, y) is a differentiable function of x and y,
where x = x(s, t) and y = y(s, t) are differentiable functions of s and t. Then
∂z
∂s
=
∂z
∂x

∂x
∂s
+
∂z
∂y

∂y
∂s
and
∂z
∂t
=
∂z
∂x

∂x
∂t
+
∂z
∂y

∂y
∂t
(16)
Proof. Recall that in computing
∂z
∂s
we hold t fixed and compute the derivative
of z with respect to x. Therefore we can apply Theorem 2 to obtain
∂z
∂s
=
∂z
∂x

∂x
∂s
+
∂z
∂y

∂y
∂s
.
A similar argument holds for
∂z
∂t
and the proof is complete.
Example 3. Use the chain rule to find
∂z
∂s
and
∂z
∂t
if z = e
x+2y
, x =
s
t
and y =
t
s
.
Solution.
∂z
∂s
=
∂z
∂x

∂x
∂s
+
∂z
∂y

∂y
∂s
= (e
x+2y
)

x
_
s
t
_

s
+ (e
x+2y
)

y
_
t
s
_

s
= e
x+2y

1
t
+e
x+2y
2
_

t
s
2
_
= e
s
t
+
2t
s
_
1
t

2t
s
2
_
=
s
2
−2t
2
ts
2
e
s
2
+2t
2
st
This case of the chain rule contains three types of variables: s and t are independent
variables, x and y are called intermediate variables and z is the dependent variable.
A tree diagram (see figure below) could help us to remember the previous form of
the chain rule. We draw branches from the dependent variable z to the intermediate
variables x and y. Then we draw branches from x and y to the independent variables
x and t. On each branch we write the corresponding partial derivative.
226
z
x
y
∂z
∂x
∂z
∂y
∂x
∂s
∂x
∂t
∂y
∂s
∂y
∂t
s s t t
To find
∂z
∂s
we find the product of the partial derivatives along each path from z
to s and then add these products:
∂z
∂s
=
∂z
∂x

∂x
∂s
+
∂z
∂y

∂y
∂s
We consider now the general case in which the dependent variable z is a function
of n intermediate variables x
1
, . . . , x
n
, each of which is a function of m independent
variables s
1
, . . . , s
m
.
Theorem 4. (general case) Suppose that z is a differentiable function of the n
variables x
1
, . . . , x
n
and each x
j
is a differentiable function of the m independent
variables s
1
, . . . , s
m
. Then z is a function of s
1
, . . . , s
m
and
∂z
∂s
j
=
∂z
∂x
1

∂x
1
∂s
j
+ +
∂z
∂x
n

∂x
n
∂s
j
=
n

i=1
∂z
∂x
i

∂x
i
∂s
j
(17)
for each j = 1, m.
The proof is similar to previous case.
Example 4. Use the chain rule to find the indicated partial derivatives:
∂z
∂u
,
∂z
∂v
,
∂z
∂w
when u = 2, v = 1, w = 0 for z = x
2
+xy
3
, x = uv
2
+w
3
, y = u +ve
w
.
227
Solution.
z
x y
∂z
∂x
∂z
∂y
∂x
∂u
∂x
∂w
∂y
∂u
∂y
∂w
u u w w v
∂x
∂v
v
∂y
∂v
∂z
∂u
=
∂z
∂x

∂x
∂u
+
∂z
∂y

∂y
∂u
= (x
2
+xy
3
)

x
(uv
2
+w
3
)

u
+ (x
2
+xy
3
)

y
(u +ve
w
)

u
= (2x +y
3
)v
2
+ 3xy
2
1
When u = 2, v = 1 and w = 0, then
x = 2 1
2
+ 0 = 2 and y = 2 + 1 e
0
= 2
Therefore
∂z
∂u
(2, 1, 0) = (2 2 + 2) 1 + 3 2 2
2
1 = 6 + 24 = 30.
The other two partial derivatives can be calculated in a similar way.
4.4.2 Higher order differentials
Definition 1. Let f : D →R, D ⊆ R
n
, a ∈ D.
We say that f is differentiable of k-th order at the point a if all the partial deriva-
tives of order k −1 exist for all points near a and they are differentiable at a.
Remark 1. If f has continuous partial derivatives of order k on D then f is
differentiable of order k on D.
Definition 2. Let f : D → R, D ⊆ R
n
be a C
k
function on D. The k-th order
differential of the function f at a of increment h = (h
1
, . . . , h
n
) ∈ R
n
is defined in the
following ”formal” way:
d
k
f
(a)
(h) =
_

∂x
1
h
1
+ +

∂x
n
h
n
_
(k)
f(a) (1)
228
In the previous definition

∂x
i
, i = 1, n, is the partial differentiation operator with
respect to x
i
and (k) is a ”formal” power. When we rise to the formal power (k) we
apply the binomial theorem where the ”formal” k
th
power of

∂x
i
means actually the
k-th order partial derivative

k
f
∂x
k
i
(a).
In order to make the things clear we will present some particular cases of the
previous formula.
Particular case 1. n = 2.
In this case the k-th order differential of the function f at (a, b) of increment
h = (h
1
, h
2
) ∈ R
2
is defined by
d
k
f
(a,b)
(h
1
, h
2
) =
_

∂x
h
1
+

∂y
h
2
_
(k)
f(a, b).
a) k = 2
d
2
f
(a,b)
(h
1
, h
2
) =
_

∂x
h
1
+

∂y
h
2
_
(2)
f(a, b)
=
_

2
∂x
2
h
2
1
+ 2

2
∂x∂y
h
1
h
2
+

2
∂y
2
h
2
2
_
f(a, b)
=

2
f
∂x
2
(a, b)h
2
1
+ 2

2
f
∂x∂y
(a, b)h
1
h
2
+

2
f
∂y
2
(a, b)h
2
2
.
b) k = 3
d
3
f
(a,b)
(h
1
, h
2
) =
_

∂x
h
1
+

∂y
h
2
_
(3)
f(a, b)
=
_

3
∂x
3
h
3
1
+ 3

3
∂x
2
∂y
h
2
1
h
2
+ 3

3
∂x∂y
2
h
1
h
2
2
+

3
∂y
3
h
3
2
_
f(a, b)
=

3
f
∂x
3
(a, b)h
3
1
+ 3

3
f
∂x
2
∂y
(a, b)h
2
1
h
2
+ 3

3
f
∂x∂y
2
(a, b)h
1
h
2
2
+

3
f
∂y
3
(a, b)h
3
2
Particular case 2. n = 3.
In this case the k-th order differential of the function f at (a, b, c) of increment
h = (h
1
, h
2
, h
3
) ∈ R
3
is defined by
d
k
f
(a,b,c)
(h
1
, h
2
, h
3
) =
_

∂x
h
1
+

∂y
h
2
+

∂z
h
3
_
(k)
f(a, b, c).
If k = 2 we have
d
2
f
(a,b,c)
(h
1
, h
2
, h
3
) =
_

∂x
h
1
+

∂y
h
2
+

∂z
h
3
_
(2)
f(a, b, c)
229
=
_

2
∂x
2
h
2
1
+

2
∂y
2
h
2
2
+

2
∂z
2
h
2
3
+ 2

2
∂x∂y
h
1
h
2
+2

2
∂y∂z
h
2
h
3
+ 2

2
∂z∂x
h
3
h
1
_
f(a, b, c)
=

2
f
∂x
2
(a, b, c)h
2
1
+

2
f
∂y
2
(a, b, c)h
2
2
+

2
f
∂z
2
(a, b, c)h
2
3
+2

2
f
∂x∂y
(a, b, c)h
1
h
2
+ 2

2
f
∂y∂z
(a, b, c)h
2
h
3
+ 2

2
f
∂z∂x
(a, b, c)h
3
h
1
Remark 2. Let f : D → R, D ⊆ R
n
, be a C
2
function and let a ∈ D. Then we
have
d
2
f
(a)
(h) = h H(a) h
t
, for each h ∈ R
n
where by h
t
we understand the column vector
h
t
=
_
_
_
_
_
h
1
h
2
.
.
.
h
n
_
_
_
_
_
(which is the transpose of the row vector h = (h
1
, . . . , h
n
)).
Proof. Since f is a C
2
function then the Hessian matrix H(a) is a symmetric
matrix (see Remark 2, Section 4.3)
h · H(a) · h
t
= (h
1
, . . . , h
n
)
_
_
_
_
_
_
_
_
_
_
_
_

2
f
∂x
2
1
(a)

2
f
∂x
1
∂x
2
(a) . . .

2
f
∂x
1
∂x
n
(a)

2
f
∂x
2
∂x
1
(a)

2
f
∂x
2
2
(a) . . .

2
f
∂x
2
∂x
n
(a)
. . . . . . . . . . . .

2
f
∂x
n
∂x
1
(a)

2
f
∂x
n
∂x
2
(a) . . .

2
f
∂x
2
n
(a)
_
_
_
_
_
_
_
_
_
_
_
_
_
_
_
_
h
1
h
2
. . .
h
n
_
_
_
_
=
_
_
n

j=1

2
f
∂x
1
∂x
j
(a)h
j
,
n

j=1

2
f
∂x
2
∂x
j
(a)h
j
, . . . ,
n

j=1

2
f
∂x
n
∂x
j
(a)h
j
_
_
_
_
_
_
h
1
h
2
. . .
h
n
_
_
_
_
=
n

i,j=1

2
f
∂x
i
∂x
j
(a)h
i
h
j
=
n

i=1

2
f
∂x
2
i
(a)h
2
i
+ 2

1≤i<j≤n

2
f
∂x
i
∂x
j
(a)h
i
h
j
= d
2
f
(a)
(h).
Remark 3. Let f : D →R, D ⊆ R
n
be a C
2
function and let a ∈ D. Then
Q : R
n
→R defined by
Q(h) = d
2
f
(a)
(h)
230
is a quadratic form.
Proof. The proof is an easy consequence of the previous remark and of the defi-
nition of a quadratic form (see 1.3).
4.4.3 Taylor formula in R
n
We learned in subsection 4.4.1 that the approximation by differential for a C
1
function f, f : D →R, a ∈ D, is
f(x) ≈ f(a) +df
a
(x −a) (1)
The previous estimation is valid for |x −a| sufficiently small.
The difference between the left-hand side of (1) and the right-hand side is ω(x)|x−
a| (see (10), subsection 4.1.2) where lim
x→a
ω(x) = 0.
If we denote the difference mentioned above by R
1
(x) (R from the remaider) then
the equality (10), subsection 4.4.1, becomes
f(x) = f(a) +df
(a)
(x −a) +R
1
(x), x ∈ D (2)
where
lim
x→a
R
1
(x)
|x −a|
= 0.
(2) is called the Taylor approximation of order one for a C
1
function of several
variables.
The next theorem shows that if f has continuous second-order partial derivatives,
the error term is equal to a quadratic form (see 1.3) plus a term of smaller order than
|x −a|
2
.
Theorem 1. (Second order Taylor formula)
Let f : D → R be a C
2
function, D ⊆ R
n
and let a ∈ D. Then, there exists a
function R
2
: D →R such that
f(x) = f(a) +df
(a)
(x −a) +
1
2!
d
2
f
(a)
(x −a) +R
2
(x), x ∈ D (3)
where
lim
x→a
R
2
(x)
|x −a|
2
= 0.
Proof. Keep x fixed and define g : [0, 1] →R by the relationship
g(u) = f(a +u(x −a)).
Then
f(x) −f(a) = g(1) −g(0).
We will prove the theorem by applying the second order Taylor’s formula to g for
x = 1 and a = 0. We obtain
g(1) = g(0) +g

(0) +
1
2
g
′′
(c), where 0 < c < 1.
231
Here we have used Lagrange’s form of the remainder (see Remark 2, subsection
3.1.4).
We will compute the derivatives of g by the chain rule
g

(u) =
n

i=1
∂f
∂x
i
(a +u(x −a))(x
i
−a
i
)
= df
(a+u(x−a))
(x −a)
In particular
g

(0) = df
(a)
(x −a).
Using the chain rule once more we find
g
′′
(u) =
n

i,j=1

2
f
∂x
i
∂x
j
(a +u(x −a))(x
i
−a
i
)(x
j
−a
j
)
= d
2
f
(a+u(x−a))
(x −a).
Hence
g
′′
(c) = d
2
f
(a+c(x−a))
(x −a).
To prove (3) we define
R
2
(x) =
1
2
d
2
f
(a+c(x−a))
(x −a) −
1
2
d
2
f
(a)
(x −a).
By using the previous equality we have
f(x) −f(a) = df
(a)
(x −a) +
1
2
d
2
f
(a+c(x−a))
(x −a)
= df
(a)
(x −a) +
1
2
d
2
f
(a)
(x −a) +R
2
(x)
To complete the proof we need to show that
R
2
(x)
|x −a|
2
→ 0 as x → a.
We have:
[R
2
(x)[
|x −a|
2
=
1
|x −a|
2

1
2
¸
¸
¸
¸
¸
¸
n

i,j=1
_

2
f
∂x
i
∂x
j
(a +c(x −a))


2
f
∂x
i
∂x
j
(a)
_
(x
i
−a)(x
j
−a)
¸
¸
¸
¸

1
2
n

i,j=1
¸
¸
¸
¸

2
f
∂x
i
∂x
j
(a +c(x −a)) −

2
f
∂x
i
∂x
j
(a)
¸
¸
¸
¸
232
Since each second-order partial derivative is continuous at a, we have

2
f
∂x
i
∂x
j
(a +c(x −a)) →

2
f
∂x
i
∂x
j
(a), as x → a
so
R
2
(x)
|x −a|
2
→ 0, as x → a.
This completes the proof.
The correspondent second-order Taylor approximation is
f(x) ≈ f(a) +df
(a)
(x −a) +
1
2
d
2
f
(a)
(x −a)
The previous estimation is valid for |x −a| sufficiently small.
We will present without proof the next theorem which summarize the approxima-
tion of a C
k
function on R
n
by a Taylor polynomial of order k.
Theorem 2. Let k be a positive integer. Let f : D →R, D ⊆ R
n
, be a C
k
function
on D and let a ∈ D. Then, there exists a function R
k
: D → R, such that for all
x ∈ D
f(x) = f(a) +df
(a)
(x −a) +
1
2
d
2
f
(a)
(x −a) + +
+
1
k!
d
k
f
(a)
(x −a) +R
k
(x) (4)
where
R
k
(x)
|x −a|
k
→ 0, as x → a.
Relation (4) is called Taylor’s formula.
In the previous theorem the polynomial T
k
: D →R
T
k
(x) = f(a) +
1
1!
df
(a)
(x −a) +
1
2!
d
2
f
(a)
(x −a)+
+ +
1
k!
d
k
f
(a)
(x −a) (5)
is called the Taylor polynomial of degree k and R
k
= f −T
k
is called the remainder
of oder k.
The correspondent k-order Taylor approximation is
f(x) ≈ f(a) +
1
1!
df
(a)
(x −a) +
1
2!
d
2
f
(a)
(x −a)+
+ +
1
k!
d
k
f
(a)
(x −a), (6)
valid for |x −a| sufficiently small.
Example 1. Compute the first and second order Taylor approximation of the
Cobb-Douglas function
f : (0, ∞) (0, ∞) →R
f(x, y) = x
1
4
y
3
4
233
at the point (1, 1).
Solution. We compute the first and second order partial derivative functions of
f:
∂f
∂x
=
1
4
x

3
4
y
3
4
,
∂f
∂y
=
3
4
x
1
4
y

1
4

2
f
∂x
2
= −
3
16
x

7
4
y
3
4
,

2
f
∂x∂y
=
3
16
x

3
4
y

1
4
,

2
f
∂y
2
= −
3
16
x
1
4
y

5
4
Evaluating these partial derivatives at the point (1,1) we obtain:
∂f
∂x
(1, 1) =
1
4
,
∂f
∂y
(1, 1) =
3
4
,

2
f
∂x
2
(1, 1) = −
3
16
,

2
f
∂x∂y
(1, 1) =
3
16
,

2
f
∂y
2
(1, 1) = −
3
16
Substituting in
df
(1,1)
(x −1, y −1) =
∂f
∂x
(1, 1)(x −1) +
∂f
∂y
(1, 1)(y −1)
the values of the first partial derivatives obtained before we get:
df
(1,1)
(x −1, y −1) =
1
4
(x −1) +
3
4
(y −1)
Substituting in
d
2
f
(1,1)
(x −1, y −1) =

2
f
∂x
2
(1, 1)(x −1)
2
+ 2

2
f
∂x∂y
(1, 1)(x −1)(y −1)
+

2
f
∂y
2
(1, 1)(y −1)
2
the values of the second partial derivatives we get:
d
2
f
(1,1)
(x −1, y −1) = −
3
16
(x −1)
2
+
3
8
(x −1)(y −1) −
3
16
(y −1)
2
.
Finally we obtain:
f(x, y) ≈ 1 +
1
4
(x −1) +
3
4
(y −1) – the first order approximation
f(x, y) ≈ 1 +
1
4
(x −1) +
3
4
(y −1) −
3
32
(x −1)
2
+
3
16
(x −1)(y −1) −
3
32
(y −1)
2
– the second order approximation.
234
If we use the Taylor approximation of order one to approximate
f(1, 1; 0, 9) = (1, 1)
1
4
(0, 9)
3
4
,
we obtain that
(1, 1)
1
4
(0, 9)
3
4
≈ 1 +
1
4
(1, 1 −1) +
3
4
(0, 9 −1)
= 1 +
1
4
0, 1 −
3
4
0, 1 = 0, 9
If we use the Taylor approximation of order two we get that
(1, 1)
1
4
(0, 9)
3
4
≈ 1 +
1
4
0, 1 +
3
4
(−0, 1) −
3
32
(0, 1)
2
+
3
16
0, 1 (−0, 1) −
3
32
(−0, 1)
2
= 0, 94625
It can be easily seen that the approximation obtain by using the second order
Taylor approximation is a better approximation of
(1, 1)
1
4
(0, 9)
3
4
= 0, 943026 . . . .
4.5 Extrema of function of several variables
Since optimization plays a major role in economic theory this section can be
considered the core of this part of the book.
At this moment we have a good understanding of conditions under which a is a
local extreme point of a C
2
function f : I → R, where I is an open interval, I ⊆ R.
These conditions can be stated as follows.
1

. Necessary conditions
If a is a local minimum point of f then
f

(a) = 0 and f
′′
(a) ≥ 0.
If a is a local maximum point of f then
f

(a) = 0 and f
′′
(a) ≤ 0.
2

. Sufficient conditions
If f

(a) = 0 and f
′′
(a) < 0, then a is a local maximum point of f.
If f

(a) = 0 and f
′′
(a) > 0, then a is a local minimum point of f.
Our purpose is to develop the generalizations of the previous results to the case
of functions of more then one variable.
We will see that the main results for functions of several variables are analogous
to the one-dimensional results.
Definition 1. Let f : D →R be a real-valued function of n variables, D a subset
of R
n
and let a ∈ D.
235
a) The point a is called a global or absolute maximum point of f if f(a) ≥ f(x)
for all x ∈ D. In this case, f(a) is the global maximum value of f.
b) The point a is called a local (or relative) maximum point of f if there is a ball
B(a, r) such that f(a) ≥ f(x) for all x ∈ B(a, r) ∩ D. In this case, f(a) is the local
maximum value of f.
Reversing the inequalities in the above two definitions we obtain the definitions of
a global minimum point and of a local minimum point.
First order conditions
The results presented here are obtained by using only the first order partial deriva-
tives of a given function.
In the case of one variable, the first order condition for a point a be to a local
maximum or minimum point of a C
1
function f is that f

(a) = 0. In this case a has
to be a critical point of f.
The generalization to several variables of the critical point notion is the following:
Definition 2. Let f : D → R, D ⊆ R
n
and let a ∈ D. The point a is called a
stationary (or critical) point of f if f admits partial derivatives at a and
∂f
∂x
1
(a) = =
∂f
∂x
n
(a) = 0.
The next theorem is very useful in locating local extrema of f.
Theorem 1. Let f : D →R be a differentiable function on D ⊆ R
n
. If a is a local
maximum (or minimum) point of f then
∂f
∂x
1
(a) =
∂f
∂x
2
(a) = =
∂f
∂x
n
(a) = 0.
Hence, a is a stationary point of f.
Proof. We will work in the case when a is a local minimum point (the same proof
works for the maximum case).
Let B = B(a, r) be a ball centered at a, B ⊂ D with the property that f(a) ≤ f(x)
for all x ∈ B.
Since a is a minimum point of f on B then along each segment which passes
through a (that lies in B) f takes its minimum value at a.
In consequence, for each i = 1, n, a
i
is the minimum value of the following function
of one variable
g
i
: (a
i
−r, a
i
+r) →R
g
i
(x
i
) = f(a
1
, . . . , a
i−1
, x
i
, a
i+1
, . . . , a
n
).
If we apply now Fermat’s theorem (Theorem 1, subsection 3.1.4) to the above
function we conclude that
g
i
(a
i
) =
∂f
∂x
i
(a
i
) = 0, i = 1, n.
236
The previous theorem says that in order to determine the local extreme points for
a differentiable function we must seek among the stationary points.
Example 1. Determine the stationary points of the function defined by
f : R
2
→R
f(x, y) = x
3
−y
3
+ 9xy.
Solution. To find the stationary points of f, we compute the first order partial
derivatives and equate them to zero.
∂f
∂x
(x, y) = 3x
2
+ 9y = 0
∂f
∂y
(x, y) = −3y
2
+ 9x = 0
From the first equation we get
y = −
1
3
x
2
which can be substituted in the second one to get

1
3
x
4
+ 9x = 0.
The solution of the previous equation are x = 0 and x = 3.
Substituting these values into y = −
1
3
x
2
we obtain that the stationary points of
the function f are (0,0) and (3, −3). At this moment we are not able to decide the
nature of each of the previous two stationary points. To determine the nature of the
stationary points we need to use a condition on the second order differential of f, as
we did for functions of one variable.
Second order conditions
Sufficient conditions
Theorem 2. Let f : D → R be a C
2
function. Suppose that D ⊆ R
n
, a ∈ D is a
stationary point for f.
a) If d
2
f
(a)
(h) > 0 for each h ∈ R
n
¸ ¦θ¦, then a is a local minimum point for f.
b) If d
2
f
(a)
(h) < 0 for each h ∈ R
n
¸ ¦θ¦, then a is a local maximum point of f.
c) If there are v, w ∈ R
n
¸ ¦θ¦ such that d
2
f
(a)
(v) > 0 and d
2
f
(a)
(w) < 0, then a
is neither a local maximum point nor a local minimum of f.
Definition 3. A stationary point of f for which the assumptions of part c) hold
is called a saddle point.
Proof. Since the proofs for part a) and b) are quite similar, we will prove part a)
and leave the proof of part b) as an exercise.
237
a) We assume that a is a stationary point of the C
2
function f and that d
2
f
(a)
(h) >
0, for each h ∈ R
n
¸ ¦θ¦. Write the second order Taylor’s formula at the critical point
a:
f(x) = f(a) +df
(a)
(x −a) +
1
2
d
2
f
(a)
(x −a) +R
2
(x) (1)
where
R
2
(x)
|x −a|
2
→ 0, as x → a.
Since a is a stationary point of f then (1) becomes
f(x) −f(a) =
1
2
d
2
f
(a)
(x −a) +R
2
(x).
We divide the previous equality by |x −a|
2
and we get
f(x) −f(a)
|x −a|
2
=
1
2
d
2
f
(a)
_
x −a
|x −a|
_
+
R
2
(x)
|x −a|
2
(2)
As a polynomial of degree two, the quadratic form
Q(h) = d
2
f
(a)
(h)
is a continuous function on R
n
.
Let α = min¦Q(v) [ |v| = 1¦.
Since the unit sphere ¦v [ |v| = 1¦ is compact, by applying Weierstrass’ theorem
(Theorem 4, subsection 4.1) to the restriction of function Q on the unit sphere we
conclude that there exists a point w on the unit sphere such that Q(w) = α. Since Q
is positive definite and w ,= 0 (|w| = 1) then α = Q(w) > 0 and it follows that
1
2
α ≤
1
2
Q
_
x −a
|x −a|
_
=
1
2
d
2
f
(a)
_
x −a
|x −a|
_
, for all x ,= a (3)
Since
R
2
(x)
|x −a|
2
→ 0 as x → a, there exist an r > 0 such that

α
4
<
R
2
(a)
|x −a|
2
<
α
4
for all x, 0 < |x −a| < r (4)
Combining (3) and (4) we find that for all x with 0 < |x −a| < r we have
1
2
d
2
f
(a)
_
x −a
|x −a|
_
+
R
2
(a)
|x −a|
2
>
1
2
α −
1
4
α =
1
4
α > 0.
The second part of the equality (2) is positive for 0 < |x − a| < r, therefore, so
is the first part of the above mentioned equality:
f(x) −f(a)
|x −a|
2
> 0 for 0 < |x −a| < r or x ∈ B(a, r).
238
c) We will show that the conditions
df
(a)
≡ 0 and d
2
f
(a)
(v) > 0
imply that a cannot be a local maximum point of f.
In the same way the conditions
df
(a)
≡ 0 and d
2
f
(a)
(w) < 0
imply that a cannot be a local minimum point of f.
We consider the function
t → g(t) = f(a +tv)
and use the chain rule to compute the first and second derivatives of function g defined
above
g

(t) =
n

i=1
∂f
∂x
i
(a +tv)v
i
= df
(a+tv)
(v)
so
g

(0) = df
(a)
(v) = 0.
Taking the second derivative we get
g
′′
(t) =
n

i,j=1

2
f
∂x
i
∂x
j
(a +tv)v
i
v
j
= d
2
f
(a+tv)
(v)
Since g
′′
(0) = d
2
f
(a)
(v) > 0 and g
′′
is a continuous function then there is ε > 0
such that g
′′
(t) > 0 for all t ∈ (−ε, ε) and in conclusion g

is an increasing function on
(−ε, ε). By taking in account the fact g

(0) = 0 we obtain g

(t) > 0 for each t ∈ (0, ε).
This implies that g is an increasing function on (0, ε).
In particular a cannot be a local maximum point for f.
A similar argument shows that
df
(a)
≡ 0 and d
2
f
(a)
(w) < 0
imply that a cannot be a local minimum point either.
This completes the proof of Theorem 2.
By using the characterization of positive definite and negative definite quadratic
forms, Theorem 2 can be restated as the following theorem.
Theorem 3. Let f : D → R be a C
2
function, D ⊆ R
n
. Suppose that a is a
stationary point of f.
a) If the n leading principal minors of H(a) are all positive
[a
11
[ > 0,
¸
¸
¸
¸
a
11
a
12
a
21
a
22
¸
¸
¸
¸
> 0, . . . ,
¸
¸
¸
¸
¸
¸
¸
¸
a
11
a
12
. . . a
1n
a
21
a
22
. . . a
2n
. . . . . . . . . . . .
a
n1
a
n2
. . . a
nn
¸
¸
¸
¸
¸
¸
¸
¸
> 0,
239
then a is a local minimum point of f.
b) If the n leading principal minors of H(a) alternate is sign
[a
11
[ < 0,
¸
¸
¸
¸
a
11
a
12
a
21
a
22
¸
¸
¸
¸
> 0, . . . , (−1)
n
¸
¸
¸
¸
¸
¸
¸
¸
a
11
a
12
. . . a
1n
a
21
a
22
. . . a
2n
. . . . . . . . . . . .
a
n1
a
n2
. . . a
nn
¸
¸
¸
¸
¸
¸
¸
¸
> 0,
then a is a local maximum point of f.
c) If some nonzero leading principal minors of H(a) dont’s satisfy the sign condi-
tions in the hypotheses of parts a) and b) then a is a saddle point of f (it is neither
a local maximum nor a local minimum point of f).
Next, we will present a particular case of the previous results which concerns
functions of two variables.
Theorem 4. Let f : D → R, D ⊆ R
2
, be a C
2
function. Suppose that (a, b) is a
stationary point of f.
Let
A =

2
f
∂x
2
(a, b), B =

2
f
∂x∂y
(a, b),
C =

2
f
∂y
2
(a, b) and D = B
2
−AC.
a) If D < 0 and A > 0 then (a, b) is a local minimum point of f.
b) If D < 0 and A < 0 then (a, b) is a local maximum point of f.
c) If D > 0 then (a, b) is a saddle point.
d) If D = 0 no conclusion can be drawn concerning a relative extremum, the test
inconclusive and some other technique must be used to solve the problem.
Proof. Since
a
11
= A and D = −
¸
¸
¸
¸
a
11
a
12
a
21
a
22
¸
¸
¸
¸
theorem 4 is a immediate consequence of theorem 3.
Example 2. In example 1 we computed the stationary points of
f : R
2
→R, f(x, y) = x
3
−y
3
+ 9xy
are (0, 0) and (3, −3). By differentiating the first partial derivatives, we obtain that
the Hessian of f at (x, y) is
H(x, y) =
_
f
′′
x
2
(x, y) f
′′
xy
(x, y)
f
′′
yx
(x, y) = f
′′
xy
(x, y) f
′′
y
2
(x, y)
_
=
_
6x 9
9 −6y
_
The first order principal minor is

1
(x, y) = 6x
240
and the second order principal minor is

2
(x, y) = −36xy −81.
At (0,0), these two minors are 0 and −81, respectively.
Since the second order leading principal minor is negative, (0,0) is a saddle point
of f – neither a maximum point nor a minimum point (see Theorem 3).
At (3, −3) these two minors are 18 and 243 which are positive numbers and in
consequence (3, −3) is a local point of f (see Theorem 3).
Another way of solving the problem is by using Theorem 4. If analyze the nature
of stationary point we observe that
A =

2
f
∂x
2
(0, 0) = 0, B =

2
f
∂x∂y
(0, 0) = 9,
C =

2
f
∂y
2
(0, 0) = 0, D = 81,
hence (0,0) is a saddle point (see part c) of Theorem 4).
At (3, −3) we have
A =

2
f
∂x
2
(3, −3) = 18, B =

2
f
∂x∂y
(3, −3) = 9,
C =

2
f
∂y
2
(3, −3) = −18, D = −18
2
−81 < 0,
hence (3, −3) is a local minimum point of f.
We have to mention that (3, −3) is not a global minimum point, because f(0, n) =
−n
3
which goes to −∞ as n → ∞.
Example 3. A monopolist producing a single output has two types of customers.
If it produces a units for customers of type 1, then these customers are willing to pay
100 − 10a euros per unit. If it produces b units for customers of type 2, then these
customers are willing to pay a price of 200 − 20b euros per unit. The monopolist’s
cost of producing c units of output is 180 + 40c euros. In order to maximize profits,
how much should the monopolist produce for each market?
Solution. The profit function is the following
f(a, b) = a(100 −10a) +b(200 −2b) −[180 + 40(a +b)]
The stationary points are the solutions of the following system:
_
¸
¸
¸
_
¸
¸
¸
_
∂f
∂a
= 100 −20a −40 = 60 −20a = 0
∂f
∂b
= 200 −40b −40 = 160 −40b = 0
_
60 −20a = 0
160 −40b = 0
⇔ (a, b) = (3, 4)
241
It remains to check the second order conditions. Since
f
′′
a
2(a, b) = −20, f
′′
b
2(a, b) = −40
and
f
′′
ab
(a, b) = f
′′
ba
(a, b) = 0,
then A = −20 < 0, D = −800. Therefore the point (3,4) is a local maximum point of
f.
Example 4. A firm uses two inputs to produce a single product. If its production
function is
Q(x, y) = x
1
4
y
1
4
and if it sells its output for a euro a unit and buys each input for
1
16
euros a unit,
find its maximum profit.
Solution. The profit function is the following
f(x, y) = x
1
4
y
1
4

1
16
(x +y), x > 0, y > 0.
The stationary points are the solutions of the following system
_
¸
¸
¸
_
¸
¸
¸
_
∂f
∂x
(x, y) = 0
∂f
∂y
(x, y) = 0

_
¸
¸
_
¸
¸
_
1
4
x

3
4
y
1
4

1
16
= 0
1
4
x
1
4
y

3
4

1
16
= 0

_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
y
x
3
=
_
1
4
_
4
x
y
3
=
_
1
4
_
4

_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
1
x
2
y
2
=
_
1
4
_
8
y
x
3
=
_
1
4
_
4

_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
1
xy
=
_
1
4
_
4
y
x
3
=
_
1
4
_
4

_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
_
1
x
_
4
=
_
1
4
_
8
y
x
3
=
_
1
4
_
4

_
x = 16
y = 16
It remains to check the second order conditions:

2
f(x, y)
∂x
2
= −
3
16
x

7
4
y
1
4
; A = −
3
16
2
−6
< 0

2
f(x, y)
∂x∂y
=
1
16
x

3
4
y

3
4
; B =
1
16
2
−3
2
−3
=
1
16
2
−6

2
f(x, y)
∂y
2
= −
3
16
x
1
4
y

7
4
and C = −
3
16
2
−6
.
242
In consequence
D =
1
16
2
−12

9
16
2
2
−12
< 0 and A < 0,
hence (16, 16) is a local maximum point for f.
Example 5. A farmer wishes to build a rectangular storage bin, without a top,
with a volume of 500 cubic meters using the least amount of material possible. De-
termine the dimensions of such a storage bin.
Solution. If we let x and y be the dimensions of the base of the bin and z be the
height, all measured in meters, then the farmer wishes to minimize the surface area
of the bin, given by
S = xy + 2xz + 2yz (3)
subject to the constraint of the volume, namely,
500 = xyz
Solving for z in the latter expression and substituting into (3) we have
S = S(x, y) = xy + 2x
500
xy
+ 2y
500
xy
= xy +
1000
y
+
1000
x
This is the function we need to minimize on the unbounded set
D = ¦(x, y) [ x > 0, y > 0¦.
Now
∂S
∂x
= y −
1000
x
2
and
∂S
∂y
= x −
1000
y
2
so to find the stationary points of S we need to solve
_
¸
¸
¸
_
¸
¸
¸
_
y −
1000
x
2
= 0
x −
1000
y
2
= 0
Solving for y in the first equation and then substituting into the second equation
we get
x −
x
4
1000
= 0 ⇔ x
_
1 −
x
3
1000
_
= 0.
243
The solutions of the latter are x = 0 and x = 10. Since the first of these will not
give us a point in D, we have x = 10 and
y =
1000
10
2
= 10.
Thus the only stationary point is (10,10).
Now

2
S
∂x
2
=
2000
x
3
; A =

2
S
∂x
2
(10, 10) =
2000
1000
= 2 > 0

2
S
∂x∂y
= 1; B =

2
S
∂x∂y
(10, 10) = 1

2
S
∂x∂y
=
2000
y
3
; C =

2
S
∂y
2
(10, 10) =
2000
1000
= 2 > 0
Hence D
2
= B
2
−AC = 1 −4 = −3 < 0 and A > 0.
This shows that S has a local minimum of
S(10, 10) = 10 10 +
1000
10
+
1000
10
= 300 at (x, y) = (10, 10).
Finally, when x = 10 and y = 10, we have
z =
500
10 10
= 5,
so the farmer should build the bin to have a base of 10 meters by 10 meters and a
height of 5 meters.
The problem of showing that the point (10,10) is actually the global minimum
value of S will be discussed later.
Example 6. Determine the local extreme values of the following function
f : R
3
→R, f(x, y, z) = 2x
2
+y
2
+ 2xy + 3z
2
+ 2xz + 4z + 3.
Solution. The stationary points are the solutions of the following system of linear
equations:
_
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
_
∂f
∂x
(x, y, z) = 0
∂f
∂y
(x, y, z) = 0
∂f
∂z
(x, y, z) = 0

_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
4x + 2y + 2z = 0
2x + 2y = 0
2x + 6z + 4 = 0
The solution of the previous system is (x, y, z) = (1, −1, −1). In order to establish
the nature of the stationary point (1, −1, −1) we have to apply the second order
conditions. We evaluate first the Hessian matrix at (1, −1, −1).
H(x, y, z) =
_
_
4 2 2
2 2 0
2 0 6
_
_
and hence H(1, −1, −1) =
_
_
4 2 2
2 2 0
2 0 6
_
_
.
244
The principal minors are

1
= 4, ∆
2
=
¸
¸
¸
¸
4 2
2 2
¸
¸
¸
¸
= 4 and

3
=
¸
¸
¸
¸
¸
¸
4 2 2
2 2 0
2 0 6
¸
¸
¸
¸
¸
¸
= 48 + 0 + 0 −8 −0 −24 = 16.
By applying part a) of Theorem 3 we get that (1, −1, −1) is a local minimum point
of f.
Necessary conditions
We will now prove the second order necessary conditions for optimization.
Theorem 5. Let f : D → R be a C
2
function, D ⊆ R
n
. Suppose that a is a
local minimum point (respectively a local maximum point) of f. Then df
(a)
≡ 0 and
d
2
f
(a)
(h) ≥ 0, for each h in R
n
(respectively df
(a)
≡ 0 and d
2
f
(a)
(h) ≤ 0 for each h
in R
n
).
Proof. From Theorem 1 we know that a is a stationary point of f and hence
df
(a)
≡ 0.
By using a similar argument as in the proof of Theorem 2 if a is a stationary point
and d
2
f
(a)
(v) > 0 for some vector v then a cannot be a local maximum point of f.
So, if a stationary point is a local maximum point of f there is no vector such that
d
2
f
(a)
(v) > 0. In consequence we have d
2
f
(a)
(h) ≤ 0 for all h ∈ R
n
.
In the same way if a is a local minimum point of f then df
(a)
≡ 0, d
2
f
(a)
(h) ≥ 0
for each h ∈ R
n
.
Global maxima and minima
Definition 4. Let D ⊆ R
n
. D is said to be convex if, for all x and y in D and
every t in the interval [0, 1], the point (1 −t)x +ty is in D.
In other words, every point on the line segment connecting x and y is in D.
Theorem 6. a) Let f be a C
2
function on an open convex subset D of R
n
for
which d
2
f
(x)
(h) ≥ 0 for all x ∈ D and h ∈ R
n
.
If a is a stationary point of f then a is a global minimum point of f.
b) Let f be a C
2
function on an open convex subset D of R
n
for which d
2
f
(x)
(h) ≤ 0
for all x ∈ D and h ∈ R
n
. If a is a stationary point of f, that is d
2
f
(a)
≡ 0, then a is
a global maximum point of f.
The proof of the previous theorem involves ideas that are beyond the scope of this
text and will be omitted.
Example 7. In example 3 we get that the point (3,4) is a local maximum point
of f. By applying the previous theorem we obtain that (3,4) is a global maximum
point. Indeed,
D = ¦(a, b) ∈ R
2
[ a > 0, b > 0¦
245
is a convex set (see the Definition 4). On the other hand, the Hessian matrix is
H(a, b) =
_
−20 0
0 −40
_
for each point in D, hence
d
2
f
(a,b)
(h
1
, h
2
) = −20h
2
1
−40h
2
2
≤ 0
for each (h
1
, h
2
) ∈ R
2
.
Since all the hypotheses of theorem 4, part b) are fulfilled it results that the
stationary point (3,4) is a global maximum point.
Example 8. Prove that the stationary point in example 6 is a global (absolute)
minimum point of f.
Solution. D = R
3
is a convex set.
The Hessian matrix is an arbitrary point (x, y, z) in R
3
is
H(x, y, z) =
_
_
4 2 2
2 2 0
2 0 6
_
_
,
hence
d
2
f
(x,y,z)
(h
1
, h
2
, h
3
) = 4h
2
1
+ 4h
1
h
2
+ 4h
1
h
3
+ 2h
2
2
+ 6h
2
3
= 2h
2
1
+ 4h
1
h
2
+ 2h
2
2
+ 2h
2
1
+ 4h
1
h
3
+ 2h
2
3
+ 4h
2
3
= 2(h
1
+h
2
)
2
+ 2(h
1
+h
3
)
2
+ 4h
2
3
≥ 0,
for all (x, y, z) ∈ R
3
and (h
1
, h
2
, h
3
) ∈ R
3
.
Since all the hypotheses of Theorem 4 part a) are fulfilled it results that the
stationary point (1, −1, −1) is a global minimum point.
Remark 1. The global minimum and maximum values of a continuous function
on a closed and bounded (hence compact) set D can be obtained in the following way:
• find the values of f at the stationary points of f in the interior of D
• find the extreme values of f on the boundary of D
• the largest of the values of f from steps 1 and 2 is the global maximum value,
the smallest of these values is the global minimum value.
Example 9. Prove that the stationary point (10,10) in example 5 is an absolute
maximum point of S.
246
Solution. let D
1
be the closed rectangle:
D
1
= ¦(x, y) [ 1 ≤ x ≤ 400, 1 ≤ y ≤ 400¦.
Now, if 0 < x ≤ 1, then
1000
x
≥ 1000 and so
S = xy +
1000
y
+
1000
x
≥ 1000 > 300.
Similarly, if 0 < y ≤ 1, then S > 300. Moreover, if x ≥ 400 and y ≥ 1, then
xy ≥ 400, and so S > 300. Similarly, if y ≥ 400 and x ≥ 1, then S > 300. Hence
S > 300 for all (x, y) outside of D
1
and for all (x, y) on the boundary of D
1
.
From the previous observations, the global minimum on D must in fact be the
global minimum of S on all D
1
which is 300.
4.6 Constrained extrema
In this section we discuss a powerful method for determining the relative extrema
of a function whose independent variables satisfy one or more constraints. This method
is called the Lagrange multipliers method. Consider the problem:
_
_
_
optimize f(x) = f(x
1
, x
2
, . . . , x
n
)
subject to
g
j
(x
1
, x
2
, . . . , x
n
) = c
j
, j = 1, . . . , m < n
(1)
f is called the objective function, g
1
, . . . , g
m
are the constraint functions, c
1
, . . . , c
m
are the constraint constants.
If it is possible to express m independent variables as functions of the other n−m
independent variables, we can eliminate m variables in the objective function (as in
example 5, section 4.5) thus the initial problem will be reduced to the unconstrained
optimization problem with respect to n − m variables. However, in many cases it is
not technically possible to express one variable as a function of the others.
In this case, instead of the substitution and elimination method, we will use the
method of Lagrange multipliers.
In comparison with using the constraint to express m independent variables in
terms of the others, the Lagrangean technique involves more variables and more equa-
tions. The advantage of the Lagrangean method is its universality.
Constrained optimization has a proeminent place in economic theory due to the
importance of maximization of utility subject to a budget constraint.
In economic theory it is important that the Lagrange multipliers express how the
extreme value of the problem would change as the constraint is modified.
We begin with the simplest constrained maximization problem, that is maximizing
a function f(x, y) of two variables subject to a single equality constraint g(x, y) = c.
247
`
¸
g
(
x
,
y
)
=
c
f(x, y) = k
y
x
The previous figure shows this curve together with several level curves of f. To
maximize f(x, y) subject to g(x, y) = c is to find the largest value of k such that
the level curve f(x, y) = k intersects g(x, y) = c. It appears from the figure above
that this happens when these curves just touch each other, that is, when they have
a common tangent line (otherwise, the value of k could be increased further). In this
case the slope of the constraint curve g(x, y) = c is equal to the slope of a level curve
f(x, y) = k. According to the formula (9), section 4.2 the slope of the constraint curve
is = −
g

x
g

y
and the slope of the level curve is −
f

x
f

y
.
Hence the condition that the slopes be equal can be expressed by the equation

f

x
f

y
= −
g

x
g

y
or, equivalently,
f

x
g

x
=
f

y
g

y
.
If we let λ denote this common ratio, we have
λ =
f

x
g

x
and λ =
f

y
g

y
from which we get the following equations (Lagrange equations)
f

x
= λg

x
and f

y
= λg

y
.
The third equation g(x, y) = c is simply a statement of the fact that the point in
question actually lies on the constraint set.
There is a formal way of obtaining the previous equations:
248
Form the Lagrange function
L(x, y, λ) = f(x, y) −λ[g(x, y) −c].
Find the critical points of the Lagrangean. The result of this process is the following
system:
_
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
_
∂f
∂x
(x, y) = λ
∂g
∂x
(x, y)
∂f
∂y
(x, y) = λ
∂g
∂y
(x, y)
g(x, y) = c
Among the solutions of this system we can find the extreme points of f.
A minimization problem can be analyzed by using the same arguments.
The statement of necessary conditions for optimizing a function of b variables
constrained by m equality constraints is the following.
Theorem 1. (Necessary conditions)
Let f, g
1
, . . . , g
m
: D → R be C
1
functions of n variables (m < n). Consider the
problem of maximizing (or minimizing) f on the constraint set
C
g
= ¦x [ g
1
(x) = c
1
, . . . , g
m
(x) = c
m
¦.
Suppose that a is a local maximum or minimum point of f on C
g
(that is a ∈ C
g
).
Suppose further that the rank of the Jacobian matrix
Dg(a) =
_
_
_
_
_
∂g
1
∂x
1
(a) . . .
∂g
1
∂x
n
(a)
. . . . . . . . .
∂g
m
∂x
1
(a) . . .
∂g
m
∂x
n
(a)
_
_
_
_
_
, is m. (2)
Then there exist λ
1
, λ
2
, . . . , λ
n
such that
(a
1
, . . . , a
n
, λ
1
, . . . , λ
m
) = (a, λ)
is a stationary point of the Lagrangean function:
L(x, λ) = f(x) −λ
1
[g
1
(x) −c
1
] −λ
2
[g
2
(x) −c
2
] − −
−λ
m
[g
m
(x) −c
m
] (3)
In other words
∂L
∂x
1
(a, λ) = 0, . . . ,
∂L
∂x
n
(a, λ) = 0,
∂L
∂λ
1
(a, λ) = g
1
(x) −c
1
= 0, . . . ,
∂L
∂λ
m
(a, λ) = g
m
(x) −c
m
= 0.
249
The proof of this theorem involves ideas that are beyond the scope of this text all
will be omitted.
Example 1. A consumer has 1200 m.u. (monetary units) to spend on two com-
modities, the first of which costs 40 m.u. per unit and the second 60 m.u. per unit.
Suppose that the utility derived by the consumer from x units of the first commodity
and y units of the second commodity is given by the Cobb-Douglas utility function
U(x, y) = 20x
0,6
y
0,4
.
How many units of each commodity should the consumer buy to maximize utility?
Solution. The total cost of buying x units of the first commodity and y units of
the second is 40x+60y. Since the consumer has only 1200 m.u. to spend, the goal is to
maximize utility U(x, y) subject to the budgetary constraint that 40x + 60y = 1200.
The Lagrangean function is
L(x, y, λ) = 20x
0,6
y
0,4
−λ(40x + 60y −1200).
The three Lagrange equations are:
_
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
_
∂L
∂x
= 12
_
y
x
_
0,4
−40λ = 0
∂L
∂y
= 8
_
x
y
_
0,6
−60λ = 0
∂L
∂λ
= 40x + 60y −1200 = 0
wherefrom we easily get that
_
y
x
_
0,4
=
10λ
3
;
_
x
y
_
0,6
=
15λ
2
and 2x + 3y = 60.
From the first two equalities we get
_
y
x
_
0,4

_
y
x
_
0,6
=
10λ
3

2
15λ

y
x
=
4
9
⇒ y =
4
9
x
Substituting this into the third equation we get
2x + 3
4
9
x = 60
from which it follows that x = 18 and y = 8.
So the only candidate for a solution to our problem is x = 18 and y = 8;
λ =
3
10

_
y
x
_
0,4
=
3
10
_
8
18
_
0,4
=
3
10

_
2
3
_
0,8
.
Next we will describe a second order condition that distinguished maximum points
from minimum points.
250
Intuitively, the second order condition for a constrained maximization (or mini-
mization) problem should involve the negative definiteness of some Hessian matrix,
but should only be concerned with directions along the constraint set.
Theorem 2. (Sufficient conditions).
Let f, g
1
, . . . , g
m
: D → R, D ⊆ R
n
, be C
2
functions of n variables (m < n).
Consider the problem of maximizing or minimizing f on the constraint set
C
g
= ¦x = (x
1
, . . . , x
n
) [ g
1
(x) = c
1
, . . . , g
m
(x) = c
m
¦.
Form the Lagrangean
L(x, λ) = f(x) −λ
1
[g
1
(x) −c
1
] − −λ
m
[g
m
(x) −c
m
]
and suppose that:
a) a ∈ C
g
b) There exists λ = (λ
1
, . . . , λ
m
) ∈ R
m
, such that
∂L
∂x
1
(a, λ) = =
∂L
∂x
n
(a, λ) = 0
c) The Hessian of L with respect to x at (a, λ) is negative definite (positive definite)
on the set ¦v [ Dg(a)v = 0¦ that is for each v ,= θ with Dg(a)v = 0 we have
d
2
L
(a,λ)
(v) < 0 (respectively d
2
L
(a,λ)
(v) > 0).
Then a is a strict local constrained maximum (minimum) point on C
g
.
Proof. We want to show that a is a local maximum point of f on the constraint
set C
g
.
We assume the opposite, that means that there exists a sequence (x
j
)
j≥1
⊆R
n
such
that x
j

k→∞
a with x
j
,= a for all j and x
j
∈ C
g
and f(x
j
) > f(a) for all j. Construct
a new sequence by using these x
j
’s.
v
j
=
x
j
−a
|x
j
−a|
.
It is obvious that |v
j
| = 1.
(v
j
)
j≥1
is a sequence contained by the unit sphere (in R
n
).
Since the unit sphere in R
n
is a compact set, the sequence (v
j
)
j≥1
has a convergent
subsequence which will be denoted by (v
k
)
k≥1
and its limit by v.
Since g
i
is C
1
, for each i = 1, m, we write down its first Taylor polynomial of order
one about a evaluating it at each x
k
.
g
i
(x
k
) −g
i
(a) = dg
i
(a)
(x
k
−a) +R
i
1
(x
k
)
where
R
i
1
(x
k
)
|x
k
−a|
→ 0 as x
k
→ a
c
i
−c
i
|x
k
−a|
= dg
i
(a)
_
x
k
−a
|x
k
−a|
_
+
R
i
(x
k
)
|x
k
−a|
.
251
If we let k → ∞ in the previous equality we get that
0 = dg
i
(x)
(a), i = 1, m.
Now write down the second order Taylor polynomial of the Lagrangean as a func-
tion of x about a:
L(x
k
) = L(a) +dL
a
(x
k
−a) +
1
2
d
2
L
a
(x
k
−a) +R
2
(x
k
) (4)
where
R
2
(x
k
)
|x
k
−a|
2
→ 0 as x
k
→ a.
By hypothesis, dL
a
≡ 0; also
L(x
k
) = f(x
k
) −

i
λ
i
(g
i
(x
k
) −c
i
)
= f(x
k
)
In the same way L(a) ≡ f(a).
Using these results, rewrite (4) as
0 ≤
f(x
k
) −f(a)
|x
k
−a|
2
= d
2
L
a
(v
k
) +
R
2
(x
k
)
|x
k
−a|
2
.
Let x
k
→ a in the previous relation. Hence d
2
L
a
(v) ≥ 0 which is a contradiction
with the hypotheses. This completes the proof.
By combining the previous two theorems we obtain the following 5 steps algorithm
in order to determine the constrained extreme points of a given function.
Lagrange’s method of multipliers
Let f, g
1
, . . . , g
m
: D → R, D ⊆ R
n
(m < n) be C
2
functions and let c
1
, . . . , c
m

R. Consider the problem of maximizing (or minimizing) f on the constraint set
C
g
= ¦x [ g
1
(x) = c
1
, . . . , g
m
(x) = c
m
¦.
Suppose that the rank of the matrix
Dg(x) =
_
_
_
_
_
∂g
1
∂x
1
(x) . . .
∂g
1
∂x
n
(x)
. . . . . . . . .
∂g
m
∂x
1
(x) . . .
∂g
m
∂x
n
(x)
_
_
_
_
_
is m for each x ∈ C
g
.
Step 1. Assign to any constraint g
j
(x) = c
j
one Lagrange multiplier, λ
j
∈ R,
j = 1, . . . , m.
252
Write down the Lagrange function (Lagrangean)
L(x, λ) = f(x) −λ
1
[g
1
(x) −c
1
] − −λ
m
[g
m
(x) −c
m
]
= f(x) −
m

j=1
λ
j
[g
j
(x) −c
j
] (5)
Step 2. Find the stationary points (a, λ) of the function L with respect to variables
x and λ. These are solutions of the following system
_
¸
¸
¸
_
¸
¸
¸
_
∂L
∂x
i
(x, λ) = 0, i = 1, n
∂L
∂λ
j
(x, λ) = g
j
(x) −c
j
= 0, j = 1, m
Step 3. For each stationary point (a, λ) consider the function L of n variables:
L(x) = f(x) −λ
1
[g
1
(x) −c
1
] − −λ
m
[g
m
(x) −c
m
]
= f(x) −
m

j=1
λ
j
[g
j
(x) −c
j
]
Step 4. Consider and solve the following system whose rank is m (see (1))
_
_
_
dg
1(a)
(v) = 0
. . .
dg
m(a)
(v) = 0
Step 5. Evaluate d
2
L
(a)
at each solution v of the previous system.
a) If d
2
L
(a)
(v) > 0 for each v ,= θ obtained at step 4, then a is a constrained
minimum point.
b) If d
2
L
(a)
(v) < 0 for each v ,= θ obtained at step 4, then a is a constrained
maximum point.
c) If d
2
L
(a)
(v) takes both positive and negative values on the solution set obtained
at step 4 then a is not a constrained extreme point.
Example 2. Determine the nature of the stationary point in Example 1.
Solution. The rank of the matrix
(g

x
(x, y), g

y
(x, y)) = (40, 60)
is always 1. The Lagrangean function in this example is
L(x, y, λ) = 20x
0,6
y
0,4
−λ(40x + 60y −1200)
and the critical point is
_
18, 8,
3
10
_
2
3
_
0,8
_
. It remains for us to check steps 3, 4 and
5 from the previous algorithm.
253
Step 3. L(x, y) = 20x
0,6
y
0,4

3
10
_
2
3
_
0,8
(40x + 60y −1200)
Step 4. We have to solve the following equation (because we have just one con-
straint)
∂g
∂x
(18, 8)v
1
+
∂g
∂y
(18, 8)v
2
= 0.
Since
∂g
∂x
(18, 8) = 40 and
∂g
∂y
(18, 8) = 60, then the equation to be solved is:
40v
1
+ 60v
2
= 0
wherefrom we have v
2
= −
2
3
v
1
.
Step 5. d
2
L
(18,8)
(v
1
, v
2
) = d
2
L
(18,8)
_
v
1
, −
2
3
v
1
_
=

2
L
∂x
2
(18, 8)v
2
1
+ 2

2
L
∂x∂y
(18, 8)v
1
_

2
3
v
1
_
+

2
L
∂y
2
(18, 8)
_

2
3
v
1
_
2
=
_

2
L
∂x
2
(18, 8) −
4
3


2
L
∂x∂y
(18, 8) +
4
9


2
L
∂y
2
(18, 8)
_
v
2
1
We have

2
L
∂x
2
(x, y) =
_
12
_
y
x
_
0,4
−40λ
_

x
= −4, 8y
0,4
x
−1,4
= −4, 8
1
x
_
y
x
_
0,4
so

2
L
∂x
2
(18, 8) = −
4, 8
18
_
2
3
_
0,8
;

2
L
∂x∂y
(x, y) =
_
12
_
y
x
_
0,4
−40λ
_

y
= 4, 8x
−0,4
y
−0,6
= 4, 8
1
y
_
y
x
_
0,4
so

2
L
∂x∂y
(18, 8) = 4, 8
1
8
_
8
18
_
0,4
=
4, 8
8
_
2
3
_
0,8

2
L
∂y
2
(x, y) =
_
8
_
x
y
_
0,6
−60λ
_

y
= −4, 8y
−1,6
x
0,6
= −4, 8
1
y
_
x
y
_
0,6
so

2
L
∂y
2
(18, 8) = −
4, 8
8
_
2
3
_
1,2
254
We finally obtain that
d
2
L
(18,8)
_
v
1
, −
2
3
v
2
_
=
_

4, 8
18
_
2
3
_
0,8

4, 8
6
_
2
3
_
0,8

4, 8
18
_
2
3
_
1,2
_
v
2
1
< 0
for all v
1
,= 0 and in consequence we have that (18,8) is a local maximum constraint
point f.
Recall from section 4.2 (example 8) that the level curves of a utility function are
the optimal indifference curve U(x, y) = C, where C = U(18, 8) and the budgetary
constraint is 40x + 60y = 1200 is sketched in figure below.
`
¸
20
30
-
(18,8)
budget line
Example 3. Rework Example 5 from section 4.5 by using Lagrange’s multipliers
method.
Solution. Let x, y and z be the length, width and height, respectively, of the bin
in meters. We wish to minimize
S : (0, ∞) (0, ∞) (0, ∞) →R, S(x, y, z) = xy + 2yz + 2zx
subject to the constraint of the volume, namely
V : (0, ∞) (0, ∞) (0, ∞) →R, V (x, y, z) = xyz = 500.
Using the method of Lagrange multipliers, we have to follows the five steps of the
Lagrange’s algorithm.
We have first to check that the rank of the matrix
(V

x
(x, y, z), V

y
(x, y, z), V

z
(x, y, z)) = (yz, xz, xy)
is one for each point (x, y, z) which satisfies xyz = 500. The matrix rank is then 1 if
and only if x = y = z = 0, but the point (0,0,0) does not respect the constraint. In
conclusion the rank of the matrix is one for all the points of the constraint set.
255
Steps 1 and 2.
L(x, y, z) = S(x, y, z) −λ[V (x, y, z) −500]
L(x, y, z) = xy + 2yz + 2zx −λ(xyz −500)
We have to solve the following system
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
∂L
∂x
(x, y, z) = y + 2z −λyz = 0
∂L
∂y
(x, y, z) = x + 2z −λxz = 0
∂L
∂z
(x, y, z) = 2y + 2x −λxy = 0
∂L
∂λ
(x, y, z) = 500 −xyz = 0
There are no general rules for solving nonlinear systems of equations. Sometimes
some ingenuity is required. Usually we eliminate λ from the equations and try to solve
the remaining system.
λ =
y + 2z
yz
=
x + 2z
xz
=
2y + 2x
xy
From the previous equality we obtain that
1
z
+
2
y
=
1
z
+
2
x
=
2
x
+
2
y
The first equality shows us that x = y and the last equality assure us that y = 2z,
so x = y = 2z.
If we substitute this values in the constraint equality we get 4z
3
= 500 wherefrom
we obtain that z = 5, x = y = 10 and λ =
10 + 10
50
=
2
5
. In consequence
_
10, 10, 5,
2
5
_
is the unique critical point of L.
Step 3. L(x, y, z) = xy + 2xz + 2yz −
2
5
(xyz −500)
Step 4. In order to solve the equation
∂g
∂x
(10, 10, 5)v
1
+
∂g
∂y
(10, 10, 5)v
2
+
∂g
∂z
(10, 10, 5)v
3
= 0
we compute first
∂g
∂x
= yz,
∂g
∂y
= xz and
∂g
∂z
= xy,
hence
∂g
∂x
(10, 10, 5) = 50,
∂g
∂y
(10, 10, 5) = 50 and
∂g
∂z
(10, 10, 5) = 100
256
The equation to be solved is
50v
1
+ 50v
2
+ 100v
3
= 0
wherefrom we have
v
3
= −
1
2
(v
1
+v
2
).
Step 5. d
2
L
(10,10,5)
(v
1
, v
2
, v
3
) = d
2
L
(10,10,5)
_
v
1
, v
2
, −
1
2
(v
1
+v
2
)
_
= 2

2
L
∂x∂y
(10, 10, 5)v
1
v
2
+ 2

2
L
∂x∂z
(10, 10, 5)v
1
_

1
2
(v
1
+v
2
)
_
+2

2
L
∂y∂z
(10, 10, 5)v
2
_

1
2
(v
1
+v
2
)
_
= 2
_
1 −
2
5
5
_
v
1
v
2

_
2 −
2
5
10
_
v
1
(v
1
+v
2
) −
_
2 −
2
5
10
_
v
2
(v
1
+v
2
)
= −2v
1
v
2
+ 2v
1
(v
1
+v
2
) + 2v
2
(v
1
+v
2
)
= 2v
2
1
+ 2v
1
v
2
+ 2v
2
2
= v
2
1
+v
2
2
+ (v
1
+v
2
)
2
> 0
implies that (10,10,5) is a local minimum constraint point for f.
Example 4. Consider the problem of optimization of f
f : R
3
→R, f(x, y, z) = x + 2y + 3z
on the constraint set defined by
g(x, y, z) = x
2
+y
2
= 1 and h(x, y, z) = x +z = 1.
Solution. First, compute the Jacobian matrix of the constraint function
Dg(x, y, z) =
_
2x 2y 0
1 0 1
_
Its rank is less than 2 if and only if x = y = 0. Since any point of the form (0, 0, z)
does not respect the first constraint, the rank of the Jacobian matrix is two for all
the points of the constraint set.
Next, form the Lagrangean
L(x, y, z, λ, µ) = x + 2y + 3z −λ(x
2
+y
2
−1) −µ(x +z −1)
257
and set its first partial derivatives equal to 0
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
∂L
∂x
= 1 −2λx −µ = 0
∂L
∂y
= 2 −2λy = 0
∂L
∂z
= 3 −µ = 0
∂L
∂λ
= −(x
2
+y
2
−1) = 0
∂L
∂µ
= −(x +z −1) = 0
Solve the second and third equations for λ and µ and plug these into the first
equation to obtain: µ = 3, λ =
1
y
and 1 − 2
x
y
− 3 = 0 which implies that x = −y.
Then, solve the fourth equation for x and the last equation for z.
2x
2
= 1 ⇒ x =
1

2
, y = −
1

2
, z = 1 −
1

2
, λ = −

2, µ = 3
⇒ x = −
1

2
, y =
1

2
, z = 1 +
1

2
, λ =

2, µ = 3
For each of two previous critical points it remains for us to follows steps 3 to 5.
We will analyse just the first case, the second is similar and it is left to the reader.
Step 3. L(x, y, z) = x + 2y + 3z +

2(x
2
+y
2
−1) −3(x +z −1)
Step 4. We have to solve the following system (we denoted by a =
_
1

2
, −
1

2
, 1 −
1

2
_
)
_
¸
¸
¸
_
¸
¸
¸
_
∂g
∂x
(a)v
1
+
∂g
∂y
(a)v
2
+
∂g
∂z
(a)v
3
= 0
∂h
∂x
(a)v
1
+
∂h
∂y
(a)v
2
+
∂h
∂z
(a)v
3
= 0
Easy computations lead us to
_ √
2v
1


2v
2
= 0
v
1
+v
3
= 0
, hence
_
v
2
= v
1
v
3
= −v
1
Step 5. d
2
L
_
1

2
,−
1

2
,1−
1

2
_
(v
1
, v
2
, v
3
)
= d
2
L
_
1

2
,−
1

2
,1−
1

2
_
(v
1
, v
1
, −v
1
)
258
= 2

2v
2
1
+ 2

2v
2
1
= 4

2v
2
1
> 0
so
_
1

2
, −
1

2
, 1 −
1

2
_
is a local constraint maximum point for f.
The significance of the Lagrange multipliers
It is possible to solve a constrained optimization problem by the method of La-
grange multipliers without obtaining numerical values for the Lagrange multipliers.
However, the multipliers play an important role in economic analysis since they
measure the sensitivity of the optimal value of the objective function to changes in
the right-hand sides of the constraints.
We analyse first the simplest problem – two variables and one equality constraint.
Let f, g : D →R, D ⊆ R
2
optimize f(x, y)
subject to g(x, y) = c
(6)
Consider c as a parameter, c ∈ R, which may vary.
For any fixed value of c we denote by (a(c), b(c)) the solution of the previous
problem by λ(c) the multiplier which corresponds to this solution and by f(a(c), b(c))
the corresponding optimal value of the objective function.
We will show that λ(c) measures the rate of change of the optimal value of f with
respect to the parameter c.
Theorem 3. Let f, g : D →R be C
1
functions of two variables. Let (a(c), b(c)) be
the solution of the problem (6) where the corresponding multiplier is denoted by λ(c).
Suppose that a, b and λ are C
1
functions of c and
∂g
∂x
(a(c), b(c)) ,= 0 or
∂g
∂y
(a(c), b(c)) ,= 0.
Then
λ(c) =
d
dc
f(a(c), b(c)).
The derivative in the previous equality is taken with respect to c since f(a(c), b(c))
can be seen as a function of one variable which is c.
Proof. From Theorem 1 we know that at an extreme point (a(c), b(c)) we have
∂f
∂x
(a(c), b(c)) = λ(c)
∂g
∂x
(a(c), b(c)),
∂f
∂y
(a(c), b(c)) = λ(c)
∂g
∂y
(a(c), b(c)) and
g(a(c), b(c)) = c.
259
If we differentiate the latter equality (with respect to c) we get
∂g
∂x
(a(c), b(c))a

(c) +
∂g
∂y
(a(c), b(c))b

(c) = 1.
By the chain rule of partial derivatives:
d
dc
(f(a(c), b(c)) =
∂f
∂x
(a(c), b(c))a

(c) +
∂f
∂y
(a(c), b(c))b

(c)
= λ(c)
∂g
∂x
(a(c), b(c))a

(c) +λ(c)
∂g
∂y
(a(c), b(c))b

(c)
= λ(c)
_
∂g
∂x
(a(c), b(c))a

(c) +
∂g
∂y
(a(c), b(c))b

(c)
_
= λ(c) 1
= λ(c).
This completes the proof.
Remark 1. Under the assumptions of Theorem 3 we have
λ ≈ change in the optimal value of f due to 1-unit change in c. (7)
Proof. We know, from Theorem 3, that
λ(c) =
d
dc
(f(a(c), b(c))
= lim
h→0
f(a(c +h), b(c +h)) −f(a(c), b(c))
h

∆f
∆c
.
In consequence
∆f ≈ λ ∆c (8)
If we take ∆c = 1 in (8), we get ∆f ≈ λ, as desired.
Example 5. Suppose the consumer in Example 1 has 1201 m.u. instead of 1200
m.u. to spend on the two commodities. Estimate how the additional 1 m.u. will affect
the maximum utility.
Solution. From Example 1 we know that
λ =
3
10
_
y
x
_
0,4
.
Since the maximum value M of utility when 1200 m.u. was available occured when
x = 18 and y = 8, substitute these values into the formula for λ we get
λ = 0, 3
_
8
18
_
0,4
≈ 0, 22
260
which is (see Remark 1) approximately the increase ∆M in maximum utility resulting
from the 1 m.u. increase in available funds.
The statement of the natural generalization of Theorem 3 to several variables and
several equality constraints is the following.
Theorem 4. Let f, g
1
, . . . , g
m
: D → R, D ⊆ R
n
, be C
1
functions, m < n. Let
(a
1
(c), a
2
(c), . . . , a
n
(c)), c = (c
1
, . . . , c
m
) be the solution of the problem (1). Suppose
that a
1
, . . . , a
n
, λ
1
, . . . , λ
m
are differentiable functions of the parameters (c
1
, . . . , c
m
)
and that condition (2) holds.
Then, for each j = 1, m we have
λ
j
(c) =
∂f
∂c
j
(a
1
(c
1
, . . . , c
m
), . . . , a
n
(c
1
, . . . , c
m
)).
In the previous equality λ
j
describes (approximately) the influence of the c
j
(j
th
component of the constraint constants) on the change of the optimal value of the
problem.
4.7 Applications to economics
4.7.1 The method of least squares
Scientists studying the data from some observations or experiments are often in-
terested in determining a function that fits the data reasonable well. Suppose that
we are studying a relationship between two variables, so that each observation can be
represented by a point (x, y) in the plane.
The method of least squares is used for determining a function which approximates
the set of given points so that its graph is closest to that points.
Suppose we have n point P
1
(x
1
, y
1
), . . . , P
n
(x
n
, y
n
) which describe a relationship
between the two variables x and y. Usually these data are presented in a table of the
following form:
x x
1
x
2
. . . x
n
y y
1
y
2
. . . y
n
.
The first step is to determine what type of function to look for. This can be done
by theoretical analysis of the practical situation or by inspection of the graph of the
n points P
1
, . . . , P
n
. The second step is to determine the particular function whose
graph is closest to the given set of points.
261
¸
`
P
1
(x
1
, y
1
)
P
2
(x
2
, y
2
)
P
n
(x
n
, y
n
)
x
1
x
2
(x
2
, f(x
2
))
x
n
. . .
It is obvious that the error at x
i
is y
i
−f(x
i
), i = 1, n. The question is how can we
combine these errors in order to define the total error which has to reflect how close
is the graph to the given points.
The choice
n

i=1
(y
i
− f(x
i
)) is not convenient because this sum can be 0 and still
the terms of this sum can have great values and opposite signs.
The sum
n

i=1
[y
i
− f(x
i
)[ reflects better how close is the graph of f to the points
but this choice is not convenient, too, since the modulus function is not everywhere
differentiable and we can’t use the sufficient conditions for local extrema. The sum
of the squares of the vertical distances from the given points to the graph of f,
n

i=1
(y
i
−f(x
i
))
2
, is the convenient choice for the total error.
We have to solve the following problem:
Determine the function f such that the sum
n

i=1
(y
i
−f(x
i
))
2
takes the minimum
value.
From now on we restrict the discussion to the case when f is a polynomial.
Problem. Determine a polynomial f of degree at most m,
f(x) = a
0
+a
1
x + +a
m
x
m
(a
0
, a
1
, . . . , a
m
=?) such that the function
F(a
0
, a
1
, . . . , a
m
) =
n

i=1
[y
i
−(a
0
+a
1
x
i
+ +a
m
x
m
i
)]
2
takes the minimum value.
262
The unknowns are the coefficients of the polynomial. We have m + 1 unknowns:
a
0
, a
1
, . . . , a
m
.
We will apply Theorem 2, section 4.5.
First we determine the stationary points of F by solving the following system:
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
F

a
0
= 2
n

i=1
[y
i
−(a
0
+a
1
x
i
+ +a
m
x
m
i
)](−1) = 0
F

a
1
= 2
n

i=1
[y
i
−(a
0
+a
1
x
i
+ +a
m
x
m
i
)](−x
i
) = 0
. . .
F

a
m
= 2
n

i=1
[y
i
−(a
0
+a
1
x
i
+ +a
m
x
m
i
)](−x
m
i
) = 0
The previous system can be written in the following way:
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
¸
_
n

i=1
y
i
= a
0
n

i=1
1 +a
1
n

i=1
x
i
+ +a
m
n

i=1
x
m
i
n

i=1
y
i
x
i
= a
0
n

i=1
x
i
+a
1
n

i=1
x
2
i
+ +a
m
n

i=1
x
m+1
i
. . .
n

i=1
y
i
x
m
i
= a
0
n

i=1
x
m
i
+a
1
n

i=1
x
m+1
i
+ +a
m
n

i=1
x
2m
i
By denoting
n

i=1
x
k
i
= s
k
, k = 0, 2m,
n

i=1
y
i
x
l
i
= t
l
, l = 0, m (here we make the
convention x
0
i
= 1, i = 1, m), the system to be solved becomes:
_
¸
¸
_
¸
¸
_
s
0
a
0
+s
1
a
1
+ +s
m
a
m
= t
0
s
1
a
0
+s
2
a
1
+ +s
m+1
a
m
= t
1
. . . . . .
s
m
a
0
+s
m+1
a
1
+ +s
2m
a
m
= t
m
(1)
The previous system is called the normal system.
It can be shown that the previous system has a unique solution which is a minimum
point for F (we will not prove these statements).
In conclusion the solution of the normal system will give us the coefficients for the
desired polynomial.
The coefficients s
0
, s
1
, . . . , s
2m
; t
0
, t
1
, . . . , t
m
can be determined by arranging the
calculation in the following table:
263
x
0
i
x
1
i
x
2
i
. . . x
2m
i
y
i
y
i
x
i
. . . y
i
x
m
i
1 x
1
x
2
1
. . . x
2m
1
y
1
y
1
x
1
. . . y
1
x
m
n
. . . . . . . . . . . . . . . . . . . . . . . . . . .
1 x
n
x
2
n
. . . x
2m
n
y
n
y
n
x
n
. . . y
m
x
m
n

s
0
s
1
s
2
. . . s
2m
t
0
t
1
. . . t
m
(2)
In the case m = 1 the desired function is a polynomial of degree one f(x) = a
0
+a
1
x
whose graph is called the least squares line.
Example 1. On election day, the polls open at 8:00 A.M. Every 2 hours after that,
an election official determines what percentage of the registered voters have already
express their ballots. The data through 6:00 P.M. are shown below.
time 10:00 12:00 2:00 4:00 6:00
percentage 12 19 24 30 37
Find the equation of the least-squares line (let x denote the number of hours after
8:00 A.M.). Use the least-squares line to predict what percentage of the registered
voters will have express their ballots by the time the polls close at 8:00 P.M.
Solution. Let x denote the number of hours after 8:00 A.M. and y the percentage.
Arrange the calculations as follows:
x 2 4 6 8 10
y 12 19 24 30 37
x
0
i
x
i
x
2
i
y
i
y
i
x
i
1 2 4 12 24
1 4 16 19 76
1 6 36 24 144
1 8 64 30 240
1 10 100 37 370

s
0
= 5 s
1
= 30 s
2
= 220 t
0
= 122 t
1
= 854
The normal system is
_
5a
0
+ 30a
1
= 122
30a
0
+ 220a
1
= 854
The solution of the previous system is given by
a
0
=
¸
¸
¸
¸
122 30
854 220
¸
¸
¸
¸
¸
¸
¸
¸
5 30
30 220
¸
¸
¸
¸
=
122 220 −30 854
5 220 −30 30
=
122
20
= 6, 1
a
1
=
¸
¸
¸
¸
5 122
30 854
¸
¸
¸
¸
¸
¸
¸
¸
5 30
30 220
¸
¸
¸
¸
=
5 854 −30 122
200
=
610
200
= 3, 05
264
So the least squares line is
f(x) = 6, 1 + 3, 05x.
To predict the percentage at 8:00 P.M., substitute 12 (the number of hours after
8:00 A.M.) into the equation of the least-squares line. This gives
y = 6, 1 + 12 3, 05 = 42, 7
which suggest that the percentage at 8:00 P.M. might be 42,7.
4.7.2 Inventory control. The economic order
quantity model
Inventory is the set of items (goods or materials) that are held by an organization
for later use.
For every type of item held in inventory we want to establish how much should be
ordered each time and when the reordering occur.
The objective is to minimize variable inventory costs which are: ordering costs and
holding costs.
Ordering costs are expenses of processing an order (these costs are independent
of the order quantity).
Holding costs are rent, heat, salaries etc.
The economic order quantity model is the simplest and the oldest of the inventory
models. It uses unrealistic assumptions but it gives a reasonable first approximation
to the given situation.
The assumptions in this model are the following:
- we study a single product with a constant demand
- no shortages (stockouts) are allowed
- the order is constant
- the time between the orders is constant
- lead time is 0 (the lead time is the time between the ordering moment and the
receipt of the goods; so, goods arrive at the same day they are ordered).
The elements of this model are:
a) τ - the entire period of time (known)
b) D - the demand of the entire period (known)
c) Q - the order quantity (unknown)
d) T - the time interval between the orders (unknown)
e) C
h
- holding cost / per item / per day (known)
f) n - number of orders (unknown)
g) C
0
- ordering cost, which is fixed for each period (known).
265
¸
O
`
T 2T nT
time
inventory
control
Q
We have to determine the costs per each cycle and then to multiply them by n in
order to obtain the total variable costs. The cost per cycle consist in ordering costs
(C
0
) and holding costs C
h

Q
2
T so the total cost function will be:
C
τ
= n
_
C
0
+C
h

Q
2
T
_
.
We have to solve the following constraint problem
_
¸
¸
¸
_
¸
¸
¸
_
C
τ
(n, Q, T) = nC
0
+nC
h

Q
2
T → min
subject to the following constraints :
τ = nT
D = nQ
We will solve the previous constraint optimization problem by using the elimina-
tion method. Since n =
D
Q
and τ = nT the total cost function will depend just on Q
and it remains for us to find the minimum value of the following one variable function
C(Q) =
D
Q
C
0
+C
h

Q
2
τ
The critical points are obtained by solving the next equation
C

(Q) = −
DC
0
Q
2
+
τC
h
2
= 0,
wherefrom we get
Q
2
=
2DC
0
τC
h
and Q =
_
2DC
0
τC
h
.
Since C
′′
(Q) =
2DC
0
Q
3
> 0, we have that the optimal order quantity is
Q

=
_
2DC
0
τC
h
.
Easy computations will give us the values of all unknowns mentioned before.
266
The minimum cost
C

=
D
Q

C
0
+C
h

Q

2
τ =
_
2DC
0
τC
h
.
The minimum number of order is
n

=
_
DτC
h
2C
h
and the optimal time between two orders is
T

=
_
2τC
0
DC
h
.
267
268
Part III
Probabilities
269
A short history of probabilities
It is said that the theory of probabilities as a branch of mathematics appeared in
the middle of 17
th
century in France. Antoine Gombaud, Chevalier de M´er´e (a French
noblement) proposed his gambling problems to Blaise Pascal (1623-1662) who started
a mathematical correspondence with Pierre de Fermat (1601-1665). The gambling
problems were:
- how many throws of two dice are required such that the number of double six
appearevents to be more than a half of total throws
- how to share the wagered money between two gamblers if the game is interrupted
before it ends.
The legend say that the de M´er´e’s gambling problems made the beginning of the
theory of probabilities. Actually, the legend is not entirely true since years before
Pascal and Fermat, problems of a probabilistic nature have been analysed by some
mathematicians. It would be more realistic to say that Pascal and Fermat initiated
the fundamental principles of probability theory as we know them now (the theory
started as an empirical science).
There are at least two distinct roots of the probability theory. The first one is
the processing of statistical data for determining mortality tables and insurances
rates (the Babylonians had forms of maritime insurances, the romans had annuities,
elements of empirical probability were applied for census of population in China).
The second one is gambling which appeared in the early stages of human history in
many places of the world. The predecessor of the dice was the astragalus (a heel bone
of an animal, the bones were used both for religious ceremonies and for gambling).
It took more than 2000 years of dice games, card games, etc. before someone
developed the basic probabilitic rules. There are at least two reasons for this late
appearance of probabilistic abstractions: Greek philosophy (the antiempiricism was
against the quantification of the random events) and early Christian theology (ev-
ery event was supposed to be a direct manifestation of God’s intervention and in
consequence every probabilist could be considered an heretic).
The first reasoned considerations which put rudimentary probabilistic bases to the
games of chance were presented in the manuscript ”The book on games of chance”
written around 1550 by Gerolamo Cardano and found after his death in 1576 (the
manuscript was printed only in 1663). G. Cardano was a phisicist addicted to gam-
bling. It is said that he sold all his wife’s possessions just to get table stakes. It can
be said that the ”classical definition” of probability came out of his obsession for
gambling.
The next paper on probability, ”On a discovery concerning dice” is due to Galileo
Galilei (presumable written between 1613 and 1623).
The Pascal-Fermat exchange of letters (1654) remained unpublished until 1657.
Even that in the correspondence there were solved a set of isolated problems in prob-
ability we cannot say that the obtained results put the basis of a new theory.
But the strong influence on many mathematicians, focused initially on gambling
and then in other branches of mathematics and sciences, lead to idea that the history
of probability begins with the correspondence between Pascal and Fermat.
271
One of those who heard about the correspondence was a Dutch mathematician,
Christian Huygens (1629-1695). In 1657, after a visit to Paris where he did not meat
Pascal (at that time Pascal abandoned math for religion) nor Fermat, he published
the first book on probability, ”On reasoning in games of dice” in which he solved the
same problems that have been already solved by Fermat and Pascal, he proposed and
solved some new problems and he introduced the concept of mathematical expectation
as ”the value of the chance”. Huygens’ book remained a standard introduction to the
subject for about a half a century.
During the same period, important advances were made in collection of demo-
graphic data and the development of the science known today as statistics. John
Graunt (1620-1674) made a semimathematical study of mortality and insurances. His
work was extended by Sir William Petty (1623-1687) and by Edmund Halley (1656-
1742) who developed mortality tables and is considered to initiate the science of life
statistics.
Because of the games of chance, probability theory became popular and the subject
developed rapidly during the 18
th
century.
The major contributors during this period were:
Jacob Bernoulli (1654-1705) whose most important result was the Law of large
numbers.
Abraham de Moivre (1667-1754) derived the theory of permutations and combina-
tions from the principles of probability and founded the theory of annuities. In 1733 he
discovered the equation of the normal curve. The normal curve is know as the ”Gaus-
sian curve” or ”Gauss-Laplace curve” in honor of Marquis de Laplace (1749-1827)
and Karl Friedrich Gauss (1777-1855) who independently rediscovered the equation.
Gauss obtained it from a study of errors in repeated measurements of the same quan-
tity. Laplace made great contributions to the application of probability to astronomy
and introduced the use of partial differential equations into the study of probability.
Between 1835 and 1870 the Belgian scientist Lambert A.J. Quetelet (1796-1874)
showed that biological and anthropological measurements follows the normal curve
and applied statistical methods in biology, education and sociology.
Sim´eon Denis Poisson (1781-1840) publishes in 1837 ”Research on the probabil-
ity of the judgements out of criminal matter and civil matter” where the Poisson
distribution first appears.
Probability theory has been developed since the 17
th
century and now has many
applications in many fields such as: actuarial mathematics, statistical mechanics, ge-
netics, law, medicine, meteorology, etc.
The major difficulty in developing the rigorous theory of probabilities was to give
a definition of probability that is rigorous enough to be used in mathematics but in
the same time to be applicable in real world.
It took three century until an acceptable definition was obtained.
Andrey Nikolaevich Kolmogorov (1903-1987) presented an axiomatic definition for
probability, this work is the basis for the modern theory of probabilities.
272
Chapter 5
Counting techniques.
Tree diagrams
In this section we present some techniques for determining without direct enumer-
ation the number of possible outcomes of a particular experiment or the number of
elements in a particular set. Such counting problems are called combinatorial prob-
lems, because we count the number of ways in which different possible outcomes can
be combined.
As the fundamental rules of all combinatorics we consider the addition rule and
the multiplication rule. While these rules are very easy to state, they are useful in
many various and complicated situations.
5.1 The addition rule
The number of elements in a given set is called the cardinality of set A, and is
denoted by cardA or [A[.
If it is an empty set then the cardinality is 0.
If it is an infinite set, then the cardinality is ∞.
We will analyze only finite sets in this section.
Cardinality of unions
Let A and B be two finite sets. Then
card (A∪ B) = card A+ card B −card (A∩ B).
The previous equality is obvious since if we add the cardinality of A to the cardi-
nality of B we have added the cardinality of the intersection A ∩ B twice. Hence we
have to subtract once the cardinality of A∩ B from cardA+ card B.
As a corollary, if the sets A and B are mutually exclusive then the cardinality of
273
the union is the sum of cardinalities
card (A∪ B) = card A+ card B, if A∩ B = ∅.
The generalization of the previous result to an arbitrary union of n finite sets is
called the inclusion-exclusion principle.
The inclusion-exclusion principle. Let A
1
, A
2
, . . . , A
n
be n finite sets. Then:
card
_
n
_
i=1
A
i
_
=
n

i=1
card A
i

1≤i<j≤n
card (A
i
∩ A
j
)
+

1≤i<j<k≤n
card (A
i
∩ A
j
∩ A
k
) − + (−1)
n+1
card
_
n

i=1
A
i
_
.
If the sets ¦A
i
¦
i=1,n
are mutually exclusive then:
card
_
n
_
i=1
A
i
_
=
n

i=1
card A
i
, if A
i
∩ A
j
= ∅, ∀ i ,= j.
Example. In an association gathering 95 people, 72 play backgammon, 44 play
chess and 30 do not play any of these two games. How many of them play at the same
time backgammon and chess?
Solution. We will use the following notations:
A - the set of whole members of the association
B - the set of backgammon players
C - the set of chess players.
We are given the following data:
card A = 95, card B = 72, card C = 44
and
card (A¸ (B ∪ C)) = 30
wherefrom we can easily get:
card (B ∪ C) = 95 −30 = 65.
By applying the previous formula we obtain:
card (B ∩ C) = card B + card C −card (B ∪ C)
= 72 + 44 −65
= 51.
274
Hence, the number of people that play at the same time backgammon and chess
is 51.
Example. 120 people take part at a conference. Among participants each can
speak at least one language: French, Spanish or German. We also know that:
• 10 people speak the three languages
• 4 people speak French, Spanish, but not German
• 8 people speak only Spanish
• 100 people speak French
• 32 people speak Spanish
• 53 people speak German.
Determine:
- the number of people who speak Spanish and German but not French
- the number of people who speak French and German but not Spanish
- the number of people who speak only German
- the number of people who speak only French.
Solution. We consider the following Venn-Euler diagram.
F
B
C
A
D
G
S
French
German
Spanish
Each disc among the 3 drawn corresponds to a group who speak a language
(French, German or Spanish).
In order to simplify the notations we will denote by small correspondent letters
the cardinality of the involved sets (for instance a = card A).
By applying the addition rule with mutually exclusive sets our problem reduces
to solving the following system of linear equations:
275
_
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
_
a +b +c +d +f +g +s = 120
a = 10
c = 4
s = 8
f +c +a +b = 100
s +c +a +d = 32
g +a +b +d = 53
_
¸
¸
¸
¸
_
¸
¸
¸
¸
_
a = 10, c = 4, s = 8
b +d +f +g = 120 −10 −4 −8 = 98
f +b = 100 −14 = 86
d = 32 −22 = 10
g +b = 53 −10 −10 = 33
_
¸
¸
_
¸
¸
_
a = 10, c = 4, s = 8, d = 10
f +b = 86
g +b = 33
b +f +g = 88
_
¸
¸
_
¸
¸
_
a = 10, c = 4, s = 8, d = 10
g = 88 −86 = 2
b = 33 −2 = 31
f = 88 −33 = 55
In conclusion: a = 10, b = 31, c = 4, d = 10, f = 55, g = 2, s = 8.
Hence:
- there are d = 10 people who speak Spanish and German, but not French
- there are b = 31 people who speak French and German but not Spanish
- there are g = 2 people who speak only German
- there are f = 55 people who speak only French.
The addition rule can be formulated in general terms involving objects, operations
or symbols but the main idea is the same.
The addition rule
If there are m possible outcomes for an event (or ways to do something) and n
possible outcomes for another event (or ways to do another thing) and the two events
cannot both occur (or the two things can’t both be done) then either of the two events
can occur (or total possible ways to do one of the things) in m+n ways.
Formally, the sum of the sizes of two disjoint sets is equal to the size of their union.
Example. A square with side length 3 is divided by parallel lines into 9 equal
squares. What is the total number of squares obtained by this procedure?
Solution. We divide the squares into three sets S
1
, S
2
, S
3
such that the set S
i
contains all squares of side length i (i = 1, 2, 3).
276
It is obvious that: cardS
1
= 1, card S
2
= 4 and cardS
3
= 9, hence the total
number of squares is
card (S
1
∪ S
2
∪ S
3
) = card S
1
+ card S
2
+ card S
3
= 1 + 4 + 9 = 14.
5.2 Tree diagrams and the multiplication principle
Consider an experiment that takes place in several steps such that the number
of outcomes at the n
th
step is independent of the outcomes of the previous steps.
Suppose that the number of outcomes at each step may be different for different
steps. We have to count the number of ways that the entire experiment can occur.
The best way to analyze such multistep problems is by drawing a tree diagram.
We list the possible outcomes of the first step, and then draw lines to present the
possible outcomes that can occur in the second step and so on.
To clarify the above description we will present the following example:
Example. A caf´e has the following menu:
a) two choices for appetizers: soup or juice;
b) three for the main course; lamb chops, fish or vegetable dish;
c) two for dessert: ice cream or cake.
How many possible choices do you have for your complete menu?
Solution. The complete menu is choosen in three independent steps: two choices
at the first course, three at the second and two at the third.
From the following tree diagram we see that the total number of choices is the
product of the number of choices at each stage.
277
soup
juice
lamb
fish
vegetable
lamb
fish
vegetable
ice cream
ice cream
ice cream
ice cream
ice cream
ice cream
cake
cake
cake
cake
cake
cake
We have 2 3 2 = 12 possible menus.
A tree diagram is a device used to enumerate all the possible outcomes of a mul-
tistep experiment where the outcomes at each step are independent of those at the
previous steps and the outcomes at each step can occur in a finite number of steps.
From the previous example we can observe that the tree is constructed from left
to right, and the number of branches at each point corresponds to the number of
possible outcomes of the next step of the experiment.
More rigorously, we can introduce a tree as:
A directed graph is a set of points, called vertices, together with a set of directed
line segments, called edges, between some pairs of distinct vertices. A path from a
vertex u to a vertex v in a directed graph G is a finite sequence (v
0
, v
1
, . . . , v
n
) of
vertices of G (n ≥ 1), v
0
= u, v
n
= v, and (v
i−1
, v
i
) is an edge in G for i = 1, 2, . . . , n.
A directed graph T is a tree if it has a distinguished vertex r, called the root,
such that r has no edges going into it and such that for every other vertex v of T
there is a unique path from r to v.
278
We can easily generalize the result obtained in the previous example to multistep
experiments.
The multiplication rule
If an experiment is performed in m steps, and there are n
1
choices in the first
step, and for each of those choices there are n
2
choices in the second step, and so on,
with n
m
choices in the last step for each of the previous choices, then the number of
all possible outcomes is given by the product n
1
n
2
n
3
. . . n
m
.
Example. On a grid of sporting lotto, we have to choose one of the three boxes 1,
x or 2 for each of the 9 matches (1 is put when the host team gains the match, x when
the match finishes at the same score and 2 when the host team loses the match).
How many different choices do we have?
Solution. There are 9 steps in our experiment (since there are 9 matches on the
grid) hence m = 9.
Since n
1
= n
2
= = n
9
= 3 (at each step we have to choose one of the three
possible boxes: 1, x or 2) then the number of all choices is
3 3 . . . 3
. ¸¸ .
9 times
= 3
9
= 19683.
Example. How many functions can we define on a set A, card A = m, given that
the target space is B, card B = n.
Solution. There are m steps in our experiment (since there are m points in the
domain of the function).
Since n
1
= n
2
= = n
m
= n (at each step we have to choose one of the n
elements of the set B) then the number of all functions is
n n. . . n
. ¸¸ .
m times
= n
m
.
Example. In a race with 20 horses, in how many different ways can the first three
places be filled?
Solution. There are 3 steps in our experiment (first, second and third place)
hence m = 3. There are 20 horses that can come first (n
1
= 20). Whichever horse
comes first, there are 19 horses left that can come second (n
2
= 19). Whichever
horses come first and second, there are 18 horses left that can come third. So there
are n
1
n
2
n
3
= 20 19 18 = 6840 ways in which the first three positions can be filled.
Example. Ruxi and Ana are to play a tennis match. The first person who wins
three sets wins the match. Draw a tree diagram which shows the possible outcomes
of the match.
279
Solution.
R
R R
R
R
R
R
R
R
R
R
R
R
R
R
R
R
A
A
A
A
A
A
A
A
A
A
A
R
R
R
A
A
A
A
A
(R, R, R)
(R, R, A, R)
(R, R, A, A, R)
(R, R, A, A, A)
(R, A, R, R)
(R, A, R, A, R)
(R, A, R, A, A)
(R, A, A, R, R)
(R, A, A, R, A)
(R, A, A, A)
(A, R, R, R)
(A, R, R, A, R)
(A, R, R, A, A)
(A, R, A, R, R)
(A, R, A, R, A)
(A, R, A, A)
(A, A, R, R, R)
(A, A, R, R, A)
(A, A, R, A)
(A, A, A)
Notice that the total number of choices is 20. In this example, the branches have
different lengths, and this makes the counting more difficult than in the previous
examples.
Example. How many natural numbers are there under 1000 whose digits are
even?
Solution. There are 4 single-digit even numbers.
There are 4 5 numbers with two even digits, since the first digit can take 4 values
(2,4,6,8) and the second digit can take 5 values (0,2,4,6,8).
There are 4 5
2
numbers with three even digits.
By applying the addition rule we get the solution, which is
4 + 4 5 + 4 5 = 4 31 = 124.
280
5.3 Permutations and combinations
Some counting problems appear so frequently in application that we have special
names and symbols. In this subsection we will discuss such problems.
Permutations
Example. How many different ordered arrangements of the letters a, b, c are pos-
sible?
Solution. We can enumerate all the possibilities which are: abc, acb, bac, bca, cab,
cba. Hence there are 6 ordered arrangements.
This result can also be obtained from the multiplication rule, since the first letter
can be any of three, the second letter can be any of the two remaining letters and the
third letter is the remaining one.
Thus there are 3 2 1 = 6 ordered arrangements of the given letters.
Factorial notation
The product of the positive numbers from 1 to n inclusive occurs frequently in
mathematics and is denoted by the special symbol n! (read n factorial).
n! = 1 2 . . . n.
The expression 0! is defined to be 1 to make simpler certain formulas.
• Any arrangement of a set of n objects in a given order is called a permutation
of the objects (taken all a time).
• Any arrangement of any k ≤ n of these objects in a given order is called an
k-permutation, a permutation of the n objects taken k at time or an ar-
rangement of the n objects taken k at a time.
Notations. The number of permutations of n objects taken k at a time is denoted
by
P(n, k) or A
k
n
.
The number of permutations of n objects taken n at a time is denoted by
P(n, n) or P
n
.
We usually are interested in the number of such permutations without listing them.
Theorem. Given n distinct objects, the number of distinct permutations of the n
objects taken k (k ≤ n) at a time is
P(n, k) = A
k
n
= n(n −1) . . . (n −k + 1) =
n!
(n −k)!
P
n
= n!
Proof. In the first place we can put n objects (which can be written as n−1+1);
in the second place we can put n −1 (n −2 + 1) objects; and so on. In the k
th
place
(the last one) we can put n −k + 1 objects.
281
By applying the multiplication rule
A
k
n
= P(n, k) = n(n −1)(n −2) . . . (n −k + 1)
=
n(n −1)(n −2) . . . (n −k + 1)(n −k) . . . 2 1
(n −k) . . . 1
=
n!
(n −k)!
If we take k = n in the previous formula we get
P
n
= P(n, n) =
n!
0!
= n!, as desired.
Example. Ioana has 11 books that she is going to put on her bookshelf. Of these,
5 are law books, 3 are literature books, 2 are history books and 1 is a language book.
Ioana wants to arrange her books so that all the books with the same subject are
together on the shelf. How many different arrangements are possible?
Solution. For each possible ordering of the subjects, there are 5! 3! 2! 2! possible
arrangements. Since there are 4! possible orderings of the subjects, the desired result
is 4! 5! 3! 2! 1!.
Example. In how many ways can 8 persons arrange themselves
a) in a row of 8 chains?
b) around a circular table?
Solution. a) The eight persons can arrange themselves in P
8
= 8! ways.
b) One person can sit at any place in the circular table. The other 7 persons can
then arrange themselves in P
7
= 7! ways around the table.
This is an example of circular permutation.
n objects can be arranged in a circle in (n −1)! ways.
Example. a) In how many ways 3 women aged 20, 4 women aged 45 and 6 women
aged 75 be seated in a row so that those of the same age sit together?
b) Solve the same problem is they sit at a round table.
Solution. a) The 3 groups of women can be arranged in a row in 3! ways. In each
case, the 3 women aged 20 can be seated in 3! ways, the 4 women aged 45 in 4! ways
and the 6 women aged 75 in 6! ways. Thus, there are 3! 3! 4! 6! arrangements.
b) The 3 groups of women can be arranged in a circle in 2! ways (see the previ-
ous example with circular permutations). Thus, in this case there are 2! 3! 4! 6!
arrangements.
Example. Find the number of ”four-letter words” using only the letters
a, b, c, d, e, f. Don’t use a letter twice!
Solution.
A
4
6
=
6!
(6 −4)!
=
6!
2!
= 3 4 5 6 = 360.
Permutations with indistinguishable objects
We will now determine the number of permutations of a set of n objects when
certain objects are indistinguishable from each other.
282
First of all we will present an example.
Example. How many different letter arrangements can be formed using the letters
MUMMY ?
Solution. First, note that there are not 5! permutations, since the M’s are not
distinguishable from each other.
If the three M’s are distinguished there are 5! permutations of the letters
M
1
UM
2
M
3
Y . Observe that the following 3! = 6 permutations
M
1
M
2
M
3
UY, M
1
M
3
M
2
UY, M
2
M
1
M
3
UY,
M
2
M
3
M
1
UY, M
3
M
1
M
2
UY, M
3
M
2
M
1
UY
produce the same word when the subscripts are removed.
This is true for each of the other possible positions in which the M’s appear.
In conclusion there are
5!
3!
= 4 5 = 20
different letter words that can be obtained using the letters from the word MUMMY .
Theorem. If there are n objects with n
1
indistinguishable objects of a first type, n
2
indistinguishable objects of a second type,... and n
k
indistinguishable objects of an k
th
type, where n
1
+n
2
+ +n
k
= n, then there are
n!
n
1
!n
2
! . . . n
k
!
linear arrangements
of the given n objects.
Proof. We begin with the assumption that the objects of the same type are
distinct and let all n! arrangements of these n objects. We split these arrangements
into groups such that two elements of the same group differ only by the fact that the
objects of the same type are interchanged. Each group can be represented by one fixed
arrangement with repetitions. Since the objects of the first type can be interchanged
in n
1
! ways, the objects of the second type can be interchanged in n
2
! ways,. . . and
the objects of the k
th
type can be interchanged in n
k
!, each group contains exactly
n
1
!n
2
! . . . n
k
! ways arrangements. In conclusion, by applying the addition rule we get
that the desired number of arrangements is
n!
n
1
!n
2
! . . . n
k
!
.
Example. An university applicant has to pass four entrance exams, which means
getting 2, 3 or 4 point for each exam. In order to be accepted the applicant must get
a total of at least 13 points. How many possible exam results are there (in order that
the applicant to be accepted).
Solution. In order to be accepted the applicant can obtain a total of 13, 14, 15
or 16 points at the 4 exams.
16 points can be achieved in 1 one way (four points at each exam).
15 points can be achieved in
4!
3! 1!
= 4 ways (four points at any three exams out
of four and 3 points at the other exam).
283
14 points can be achieved in
4!
3! 1!
+
4!
2! 2!
= 4 +6 = 10 ways (four points at any
3 exams and 2 points at the other exam or four points at any two exams and 3 points
at the other two exams).
13 points can be achieved in
4!
2! 1! 1!
+
4!
3! 1!
= 12 + 4 = 16 (four points at 2
exams, 3 points at one exam and 2 points at the other exam or 3 points at any 3
exams and four points at the other exam).
By applying the addition rule we get 1 + 4 + 10 + 16 = 31 possible exam results
in order that the applicant to be accepted.
Permutations with repetitions
Let A = ¦a
1
, . . . , a
n
¦ be a set with n elements.
A k-permutation with repetitions of elements of n types is an k-tuple whose
components are in the set A.
Theorem. The number of all k-permutations with repetitions of elements of n
types is n
k
.
Proof. Each component of the k-tuple can take n values and by applying the
multiplication rule the desired number is
n n. . . n
. ¸¸ .
k times
= n
k
.
Combinations
Example. Ten points lie in a plane in such way that no three of them lie on the
same straight line. How many lines do these point determine?
Solution. Since each line is uniquely determined by a pair of points through which
it passes, the number of all lines is equal to the number of all unordered pairs of points
that can be chosen from the given set of 10. There are A
2
10
pairs of 2 points when the
order in which the points are selected is relevant. However, since every pair is counted
twice, the total number of lines is equal to
A
2
10
2
=
10!
8! 2
=
9 10
2
= 45.
As the previous example shows, sometimes we are interested in determining the
number of different groups of k objects that could be selected from a total of n objects.
Definition. Let us consider a set with n elements.
A combination of these n elements taken k at a time is any selection of k of the n
elements where the order does not count. Such a selection is called an k-combination.
The number of all possible unordered selections of k different elements out of n dif-
ferent ones is denoted by
_
n
k
_
(read ”n choose k”) or by C
k
n
(read ”combinations of n
taken k at a time).
Theorem. If 0 ≤ k ≤ n then
C
k
n
=
A
k
n
k!
=
n!
k!(n −k)!
.
284
Proof. We have C
k
n
ways of choosing k elements out of n without regarding order.
In each case we have k elements which can be ordered in k! ways. By applying the
multiplication rule, the number of k permutations is C
k
n
k!. On the other hand, this
number is A
k
n
. Hence
C
k
n
k! = A
k
n
,
wherefrom we get the desired formula.
Remark. The quantity C
k
n
is also called the binomial coefficient since it occurs
as the coefficient of the binomial expansion given by
(a +b)
n
= C
0
n
a
n
+C
1
n
a
n−1
b + +C
k
n
a
n−k
b
k
+ +C
n
n
b
n
.
Properties of the binomial coefficients
1

. C
k
n
= C
n−k
n
2

. Pascal’s identity:
C
k
n+1
= C
k
n
+C
k−1
n
3

. Sum of the binomial coefficients
C
0
n
+C
1
n
+C
2
n
+ +C
n
n
= 2
n
4

. Vandermonde’s identity
C
k
n+m
= C
0
n
C
k
m
+C
1
n
C
k−1
m
+C
2
n
C
k−2
m
+ +C
k
n
C
0
m
Proofs. 1

. C
n−k
n
=
n!
(n −k)!(n −(n −k))!
=
n!
(n −k)!k!
= C
k
n
2

. Expanding the righthand side of the equality we obtain
C
k
n
+C
k−1
n
=
n!
k!(n −k)!
+
n!
(k −1)!(n −k + 1)!
=
n!
(k −1)!(n −k)!
_
1
k
+
1
n −k + 1
_
=
n!
(k −1)!(n −k)!

n −k + 1 +k
k(n −k + 1)
=
(n + 1)!
k!(n −k + 1)!
= C
k
n+1
3

. Substituting in the binomial expansion
(a +b)
n
= C
0
n
a
n
+C
1
n
a
n−1
b + +C
n
n
b
n
a = b = 1, we obtain 2
n
= C
0
n
+C
1
n
+C
2
n
+ +C
n
n
, as desired.
4

. By identifying the coefficients of x
k
of both sides of the following identity:
(1 +x)
m+n
= (1 +x)
m
(1 +x)
n
,
C
0
m+n
+C
1
m+n
x + +C
k
m+n
x
k
+ +C
m+n
x
m+n
285
= (C
0
n
+C
1
n
x +C
2
n
x
2
+ +C
n
n
x
n
)(C
0
m
+C
1
m
x +C
2
m
x
2
+ +C
m
m
x
m
)
we get that:
C
k
m+n
= C
0
n
C
k
m
+C
1
m
C
k−1
m
+ +C
k
n
C
0
m
.
Example. A farmer buys 3 cows, 2 pigs and 4 hens from a man who has 6 cows,
5 pigs and 8 hens. Find the number of choices that the farmer has.
Solution. The farmer can choose the cows in C
3
6
ways, the pigs in C
2
5
ways and
the hens in C
4
8
ways. By the multiplication rule the number of choices is C
3
6
C
2
5
C
4
8
.
Example. From a group which consists of 7 boys and 4 girls we want to choose
a six-member volleyball team that has at least 2 girls. In how many ways the volley
ball team can be selected?
Solution. We divide all possible choices into three groups V
2
, V
3
, V
4
such that in
each team in V
i
, i = 2, 3, 4 there are exactly i girls. These i girls can be chosen in C
i
4
ways and the remaining 6 −i team members are chosen in C
6−i
7
ways. Hence,
card V
i
= C
i
4
C
6−i
7
and the number of all choices is
card V
2
+ card V
3
+ card V
4
= C
2
4
C
4
7
+C
3
4
C
3
7
+C
4
4
C
2
7
= 6 35 + 4 35 + 1 21 = 371.
Combinations with repetitions
Let A = ¦a
1
, . . . , a
n
¦ be a set with n elements.
A k-combination with repetitions of elements of n types is an unordered
group of k elements which consists of k
i
copies of a
i
, i = 1, n and k
1
+k
2
+ +k
n
= k.
For example ¦a, a, b, b, b¦ is a 5-combination with repetitions of the elements of
the set ¦a, b, c, d¦.
Theorem. The number of all k-combinations with repetitions of elements of n
types, is equal to C
k
k+n−1
.
Proof. To each k-element with repetitions of elements of n types we associate a
sequence of zeros and ones as follows.
First we write k
1
ones, then one zero, then k
2
ones, then 1 zero etc. up to k
n
ones.
If k
i
= 0 for some i, there will be no one in the correspondent position. Thus we
obtained an ordered m-tuple of 0’s and 1’s where
m = k
1
+ 1 +k
2
+ 1 + +k
n−1
+ 1 +k
n
= k
1
+k
2
+ +k
n
+n −1
= k +n −1.
There is a one to one correspondence between the k-elements with repetitions of
elements of n types and k +n −1-tuples which contains k ones and n −1 zeros. The
number of these is C
k
k+n−1
, as we needed.
286
Example. A domino is a rectangle divided into two squares with each square
numbered one of 0, 1, . . . , 6, repetitions allowed. How many dominoes are there?
Solution. In this case n = 7 (there are seven possibilities, 0, 1, 2, . . . , 6 for each
square) and k = 2 (each of two squares of a domino is to be numbered).
Hence, there are C
2
2+7−1
= C
2
8
=
8 7
2
= 28 dominoes.
Example. On their way home, seven students stop at a restaurant, where each of
them has one of the following: a cheeseburger, a hot dog, a taco or a fish sandwich.
How many different orders are possible (from the point of view of the restaurant)?
Proof. In this case n = 4 (there are four types of food available) and k = 7 (each
of 7 students choose a food). Hence, the number of possible orders is
C
7
7+4−1
= C
7
10
=
10!
7! 3!
=
8 9 10
6
= 120.
Remark. The problem of combinations with repetitions allowed with given n and
k is equivalent to the following problem. How many solutions are there to the equation
x
1
+x
2
+ +x
n
= k such that each x
i
is a nonnegative integer?
Proof. x
i
represents the number of elements of type i selected,
i = 1, n.
Example. How many solutions does the equation
x
1
+x
2
+x
3
+x
4
= 7
have such that x
i
, i = 1, 4, are nonnegative integers.
Solution. Here n = 4 and k = 7, so the answer is
C
7
7+4−1
= C
7
10
=
10!
7! 3!
= 120.
287
288
Chapter 6
Basic probability concepts
6.1 Sample space. Events
A random experiment is a process or an action whose outcomes are not known
in advance with certainty. Classic random experiments include flipping a coin, rolling
a dice, selecting a ball from an urn and drawing a card from a deck.
Each repetition of an experiment is a trial which has an observable outcome. If we
assume in the coin experiment that the coin cannot rest on its edge, the two possible
outcomes for a trial are the occurrence of a head or the occurrence of a tail.
The set which contains all the possible outcomes for an experiment is called the
sample space (denoted by S).
For the coin example the sample space S is defined as
S = ¦head, tail¦ = ¦H, T¦.
Some other examples:
Example 1. If the experiment consists of flipping two coins, then the sample
space consists of the following 4 elements:
S = ¦(H, H), (H, T), (T, H), (T, T)¦.
The outcome will be (H, H) if both coins come up heads; it will be (H, T) if the
first coin comes up heads and the second comes up tails, etc.
Example 2. If the experiment consists of rolling a dice, then the sample space is
S = ¦1, 2, 3, 4, 5, 6, ¦ where the outcome i means that i points appeared on the dice,
i = 1, 6.
Example 3. If the experiment consists of rolling a dice until a six is obtained
then we obtain an infinite sample space S
S = ¦6, 16, 26, 36, 46, 56, 116, 126, . . . ¦.
289
Example 4. If the experiment consists of rolling two dice, then the sample space
consists of the following 36 elements
S =
_
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
_
(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6)
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6)
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)
_
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
_
where the outcome (i, j) is said to occur if i appears on the first dice and j on the
second dice.
Each element of S is called elementary event. Any subset E of the sample space
(or any collection of elementary events) is known as an event. The events are denoted
by capital letters. Some examples of events are the following.
Example 5. In example 2 the event that an odd number appears on the dice is
A = ¦1, 3, 5¦ and the event that the outcome is at most 4 is B = ¦1, 2, 3, 4¦.
Example 6. In example 4 if
E = ¦(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)¦,
then E is the event that the sum of the dice equals 7.
We say that an event E ⊂ S is realized if the outcome of the experiment is an
element of the set E.
The impossible event is the event that never happened. This is the event con-
taining no outcomes and is denoted by ∅.
The certain event is the event that happens in each trial. This is the event that
contains all possible outcomes which is S.
Remark. Since the events are sets of elementary events (or subsets of the sample
space) we may combine them according to the usual set operations.
Operations with events
Consider an experiment whose sample space is S.
Let A and B be two events of the sample space S.
• The union of the events A and B is the event denoted by A∪B which consists
of all elementary events that are either in A or in B in both of them. That is, the
event A∪ B will occur if either A or B occurs.
For instance, in example 5 if A = ¦1, 3, 5¦ and B = ¦1, 2, 3, 4¦ then
A∪ B = ¦1, 2, 3, 4, 5¦.
290
• The intersection of the events A and B is the event denoted by A ∩ B which
consists of all elementary events that are both in A and B. That is, the event A∩ B
will occur only if both A and B occur.
For instance, in example 5, if A = ¦1, 3, 5¦ and B = ¦1, 2, 3, 4¦ then
A∩ B = ¦1, 3¦.
Two events A and B are said to be mutually exclusive events or disjoint
events if they cannot be realized at the same time, that is A∩ B = ∅.
We can also defined unions and intersections of more than two events in a similar
manner.
• The contrary event of a given event A is the event denoted by A which consists
of all elementary events in the sample space S that are not in A.
The contrary event, A, will occur if and only if the event A does not occur.
For instance, in example 5, if A = ¦1, 3, 5¦ then A = ¦2, 4, 6¦.
• The difference of two events A and B is the event denoted by A¸B(= A∩B)
which consists of all the elementary events that are in A but not in B. The event A¸B
will occur if and only if the event A will occur and the event B will not occur.
• The inclusion A ⊂ B is the event which is realized if every elementary event in
A is also in B.
Properties of events operations
Operations with events satisfy various identities which are listed in the table below.
Event spaces (σ-fields)
This section contains a technical approach which can be omitted at the first read-
ing.
We have already observed that events are subsets of S. The question which arises
as a consequence of the previous remark is the following: which subsets of S can be
considered to be events?
It is obvious that if A and B are events, then A ∪ B, A, A ∩ B are also events.
This is too vague; to be rigorous, we say that a subset A of S can be an event if it
belongs to a set T ⊆ T(S) which satisfies the following properties.
Definition. (Event space or σ-field)
Let S be the sample space of a given experiment.
The set T ⊆ T(S) is called an event space or a σ-field if the following three
conditions are fulfilled:
(i) S ∈ T
(ii) if A ∈ T then A ∈ T
(iii) if A
j
∈ T, j ∈ N, then

_
j=1
A
j
∈ T.
291
Idempotent laws A∪ A = A A∩ A = A
Associative laws (A∪ B) ∪ C = A∪ (B ∪ C) (A∩ B) ∩ C = A∩ (B ∩ C)
Commutative laws A∪ B = B ∪ A A∩ B = B ∩ A
Distributive laws A∪ (B ∩ C) = (A∪ B) ∩ (A∩ C) A∩ (B ∪ C) = (A∩ B) ∪ (A∩ C)
Identity laws A∪ ∅ = A A∩ S = A
A∪ S = S A∩ ∅ = ∅
Involution laws (A) = A
Complement laws A∪ A = S A∩ A = ∅
S = ∅ ∅ = S
De Morgan’s laws A∪ B = A∩ B A∩ B = A∪ B
2
9
2
Remark. Let T be an event space (on the sample space S).
If A, B ∈ T then A∪ B, A∩ B, A¸ B and A∆B ∈ T.
Proof. First, we observe the fact that ∅ ∈ T. This is true since ∅ = S ∈ T
(according to the second rule of the previous definition).
If we take now in the third rule of the definition of an event space: A
1
= A, A
2
= B
and A
j
= ∅, j ≥ 3 then
A∪ B =

_
j=1
A
j
∈ T.
As a consequence of De Morgan’s laws we have that A∩ B = A ∪ B which leads
us to the following identity
A∩ B = (A∪ B) ∈ T
(according to the second and third rule of the previous definition.
Now, it is obvious that A¸ B = A∩ B ∈ T.
Similarly, A∆B = (A¸ B) ∪ (B ¸ A) ∈ T.
Example. In the experiment of rolling a dice we define the following events:
A the event that the outcome is even (A = ¦2, 4, 6¦)
B the event that the outcome is odd (B = ¦1, 3, 5¦)
In this case T
1
= ¦∅, A, B, S¦ is an event space and T
2
= ¦∅, A, S¦ is not an event
space since A = B ,∈ T
2
.
The smallest event space which can be defined on a sample space S is T = ¦∅, S¦
and the largest event space is T(S).
Usually, if S is finite then T = T(S). If S is infinite, then T(S) is too big to be
useful, and a smaller collection of subsets is required.
The classical definition of probability
The classical definition was given by the French mathematician Pierre Simon
Laplace in his book ”Th´eorie analytique des probabilit´es” in the following form:
”The probability of an event is the ratio of the number of cases favorable to it, to
the number of all cases possible when nothing leads us to expect that any of these
cases should occur more than any other.”
Definition. (The classical definition of probability)
We consider an experiment which has a finite number of equally likely outcomes,
S = ¦s
1
, . . . , s
n
¦.
The function P : T(S) → [0, 1], defined as
T(S) ∋ A → P(A) =
number of favorable cases for the occurrence of A
number of all possible outcomes(n)
is a probability on T(Ω).
Example. In the experiment of rolling a pair of unbiassed dice compute the
probability of the event that
a) the sum of the dice is 11
293
b) the sum of the dice is at least 11.
Solution. The number of all possible cases is 36 = 6 6 (see Example 2).
Let A be the event that the sum of the dice is 11 and B the event that the sum
of the dice is at least 11. Then
A = ¦(5, 6), (6, 5)¦, B = ¦(5, 6), (6, 5), (6, 6)¦,
P(A) =
2
36
=
1
18
and P(B) =
3
36
=
1
12
.
Example. In the experiment of drawing a card from a deck with 52 cards compute
the probability of drawing a king or a spade.
Solution. Out of the 52 cards, there are 13 spades and 4 kings. The number of
favorable cases is 13 + 4 − 1 = 16 (in order not to count the king of spades twice).
The desired probability will be
16
52
=
4
13
.
Classical probability suffers from a serious limitation since its definition consider
all outcomes to be equiprobable. This can be useful for drawing cards, rolling dice or
extracting balls from an urn but it cannot help us in the experiments with outcomes
with unequal probabilities.
The axiomatic definition of probability
The axiomatic approach build up probability theory from a number of axioms.
Definition. Let S be the sample space of a given experiment and let T ⊂ T(S)
be an event space on S.
The function P : T →R which satisfies the following properties:
i) P(A) ≥ 0, ∀ A ∈ T
ii) P(S) = 1
iii) P
_

_
i=1
A
i
_
=

i=1
P(A
i
), for each sequence (A
i
)
i≥1
⊆ T of mutually exclusive
events (A
i
∩ A
j
= ∅, ∀ i ,= j)
is called a probability function on T.
The previous definition was introduced by the Russian mathematician Kolmogorov
in 1933.
Definition. (Probability space)
Let S be a sample space of a given experiment, T ⊆ T(S) be an event space
on S and P be a probability function on T. Then the triple (S, T, P) is called a
probability space.
Elementary properties of a probability function
From the previous definition with its three axioms we can deduce a lot of properties
that one would expect a probability function to have.
Let (S, T, P) be a probability space.
P1) P(∅) = 0.
294
Proof. Since S = S ∪ ∅ ∪ ∅ ∪ . . . and the sequence ¦S, ∅, ∅, . . . , ∅, . . . ¦ consists of
mutually exclusive events, by applying the third axiom of the probability function we
get
P(S) = P(S) +

i=1
P(∅)
which is equivalent to

i=1
P(∅) = 0.
The last equality cannot be true unless P(∅) = 0.
Of course, the fact that the impossible event has probability 0 is natural.
P2) For each pair A, B ∈ T of mutually exclusive events (A∩ B = ∅) we have
P(A∪ B) = P(A) +P(B).
Proof. Since A∪ B = A∪ B ∪ ∅ ∪ . . . and P(∅) = 0 we get
P(A∪ B) = P(A) +P(B) +

i=1
P(∅) = P(A) +P(B),
as desired.
P3) ∀ A ∈ T, P(A) = 1 −P(A).
Proof. For each A ∈ T we have A ∪ A = S and A ∩ A = ∅. By applying the
previous property and the second axiom we get
1 = P(S) = P(A∪ A) = P(A) +P(A)
which implies that
P(A) = 1 −P(A).
P4) For each A, B ∈ T with A ⊂ B we have:
a) P(B ¸ A) = P(B) −P(A);
b) P(A) ≤ P(B).
Proof. Since A ⊂ B we get that B = A∪(B¸ A) and A∩(B¸ A) = ∅. From (P2)
we have that
P(B) = P(A) +P(B ¸ A)
and since P(B ¸ A) ≥ 0 we obtain the inequality P(B) ≥ P(A).
P5) ∀ A ∈ T, 0 ≤ P(A) ≤ 1.
Proof. Clearly, ∅ ⊆ A ⊆ S for every event A. Then the first axiom and (P4) give
0 = P(∅) ≤ P(A) ≤ P(S) = 1.
P6) The addition rules of probabilities
i) The case of two events
∀ A, B ∈ T : P(A∪ B) = P(A) +P(B) −P(A∩ B)
295
ii) the case of three events
∀ A, B, C ∈ T : P(A∪ B ∪ C) = P(A) +P(B) +P(C) −P(A∩ B)
−P(A∩ C) −P(B ∩ C) +P(A∩ B ∩ C)
Proof. i) It is clear that (by means of a Venn diagram for example)
A∪ B = A∪ [B ¸ (A∩ B)]
Then, by using (P2) and (P4), we get
P(A∪ B)
(P2)
= P(A) +P(B ¸ (A∩ B))
(P4)
= P(A) +P(B) −P(A∩ B)
ii) We apply part (i) to obtain:
P(A∪ B ∪ C) = P((A∪ B) ∪ C)
(i)
= P(A∪ B) +P(C) −P((A∪ B) ∩ C)
(i)
= P(A) +P(B) −P(A∩ B) +P(C)
−P((A∩ C) ∪ (B ∩ C))
(i)
= P(A) +P(B) −P(A∩ B) +P(C)
−[P(A∩ C) +P(B ∩ C) −P((A∩ C) ∩ (B ∩ C))]
= P(A) +P(B) +P(C) −P(A∩ B) −P(A∩ C)
−P(B ∩ C) +P(A∩ B ∩ C)
We can generalize the addition rules to the case of more then three events.
Theorem. (The Poincar´e’s formula) The probability of the union of any n
events A
1
, A
2
, . . . , A
n
is given by:
P
_
n
_
i=1
A
i
_
=
n

i=1
P(A
i
) −

1≤i<j≤n
P(A
i
∩ A
j
) +

1≤i<j<k≤n
P(A
i
∩ A
j
∩ A
k
)
− + (−1)
n+1
P(A
1
∩ A
2
∩ ∩ A
n
).
Even if the proof of the theorem (which is by induction) will not be presented, the
form of the right-side above is clear. First, we have to sum the probabilities of the
individual events, then subtract the probabilities of the intersections of the events,
taken two at a time (in the ascending order of indices), then add the probabilities of
the intersections of the events, taken three at a time as before, and continue like this
until you add or subtract (depending on n) the probability of the intersection of all
n events.
P7) i) The Boole’s inequalities
∀ A, B ∈ T : P(A∩ B) ≤ P(A) +P(B).
296
ii) ∀ (A
i
)
i≥1
⊆ T; P
_

_
i=1
A
i
_

i=1
P(A
i
).
Proof. i) From the previous property we have:
P(A∪ B) = P(A) +P(B) −P(A∪ B) ≤ P(A) +P(B)
ii) First, we observe that the union

_
i=1
A
i
can be written as a union of mutually
exclusive events in the following way

_
i=1
A
i
= A
1
∪ (A
2
¸ A
1
) ∪ (A
3
¸ (A
1
∪ A
2
)) ∪ . . .
So,
P
_

_
i=1
A
i
_
= P(A
1
) +P(A
2
¸ A
1
) +P(A
3
¸ (A
1
∪ A
2
)) +. . .
(P4)
≤ P(A
1
) +P(A
2
) +P(A
3
) +. . .
=

i=1
P(A
i
)
P8) The Bonferroni’s inequality
∀ A
1
, . . . , A
n
∈ T : P
_
n

i=1
A
i
_

n

i=1
P(A
i
) −n + 1.
Proof. The proof is by induction.
The first case is n = 1 and is P(A
1
) ≥ P(A
1
).
The case n = 2: P(A
1
∩ A
2
) ≥ P(A
1
) +P(A
2
) −1.
To prove this we use the addition rule and the fact that P(A
1
∪ A
2
) ≤ 1.
P(A
1
∩ A
2
) = P(A
1
) +P(A
2
) −P(A
1
∪ A
2
) ≥ P(A
1
) +P(A
2
) −1.
The inductive step remains. We assume that the proposition is true for k and we
show that it necessarily follows for the case k + 1 (we use the case n = 2).
P(A
1
∩ A
2
∩ ∩ A
k+1
) = P((A
1
∩ ∩ A
k
) ∩ A
k+1
)
≥ P(A
1
∩ A
2
∩ ∩ A
k
) +P(A
k+1
) −1
≥ P(A
1
) + +P(A
k
) −k + 1 +P(A
k+1
) −1
=
k+1

i=1
P(A
i
) −k
297
which is what we have to prove.
Next, some examples are presented to illustrate some of the above properties.
Example. (i) Let A, B ∈ T such that P(A) = 0, 5, P(B) = 0, 4 and P(A∩ B) =
0, 6. Calculate P(A∩ B).
(ii) If P(A) = 0, 5, P(B) = 0, 4, P(A¸B) = 0, 4 and B ⊂ C, calculate P(A∪B∪C).
Solution. (i) From P(A∪ B) = P(A) +P(B) −P(A∩ B) we obtain
P(A∩ B) = P(A) +P(B) −P(A∪ B) = 0, 5 + 0, 4 −0, 6 = 0, 3.
(ii) The inclusion B ⊂ C implies C ⊂ B and hence
A∪ B ∪ C = A∪ B.
In consequence
P(A∪ B ∪ C) = P(A∪ B) = P(A) +P(B) −P(A∩ B)
= P(A) + 1 −P(B) −P(A¸ B) = 0, 5 + 1 −0, 4 −0, 4 = 0, 7.
Example. Consider a biased dice such that the probability of occurrence of a face
is directly proportional to the number of the points on the considered face.
Consider the following events:
A = ¦the occurrence of an even number¦
B = ¦the occurrence of an odd number¦
C = ¦the occurrence of a prime number¦
a) Compute the probabilities of occurrence of each face of the dice.
b) Compute P(A), P(B) and P(C).
c) P(A∪ C), P(B ∩ C), P(A∩ B).
Solution. a) If we denote by P(¦1¦) = p then P(¦i¦) = i p,
i ∈ ¦2, 3, 4, 5, 6¦. On the other hand since S = ¦1, . . . , 6¦ we have that
1 = P(S) =
6

i=1
P(¦i¦) =
6

i=1
i p = p(1 + 2 + + 6) = p 21.
ˆ
In conclusion
P(¦i¦) = i p =
i
21
, i = 1, 6.
b) P(A) = P(¦2, 4, 6¦) = P(¦2¦) +P(¦4¦) +P(¦6¦)
=
2
21
+
4
21
+
6
21
=
12
21
=
4
7
P(B) = P(¦1, 3, 5¦) = P(¦1¦) +P(¦3¦) +P(¦5¦) =
1 + 3 + 5
21
=
9
21
=
3
7
P(C) = P(¦2, 3, 5¦) = P(¦2¦) +P(¦3¦) +P(¦5¦) =
10
21
298
c) Since A∪ C = ¦2, 3, 4, 5, 6¦ = ¦1¦, then
P(A∪ C) = P(¦1¦) = 1 −P(¦1¦) = 1 −
1
21
=
20
21
P(B ∩ C) = P(¦3, 5¦) = P(¦3¦) +P(¦5¦) =
3
21
+
5
21
=
8
21
P(A∩ B) = P(A¸ B) = P(A) =
4
7
.
Example. Consider the experiment of throwing in the same time a dice and a
coin.
Determine the following events and compute their probabilities:
a) A: the occurrence of an even number and the head;
b) B: the occurrence of a prime number;
c) C: the occurrence of an odd number and the tail;
d) D: A or B is realized;
e) E: B and C is realized;
f) Which events among A, B and C are mutually exclusive?
Solution. The sample space is:
S = ¦1H, 2H, 3H, 4H, 5H, 6H, 1T, 2T, 3T, 4T, 5T, 6T¦.
a) A = ¦2H, 4H, 6H¦
P(A) =
card A
card S
=
3
12
=
1
4
b) B = ¦2H, 2T, 3H, 3T, 5H, 5T¦
P(B) =
card B
card S
=
6
12
=
1
2
c) C = ¦1T, 3T, 5T¦
P(C) =
card C
card S
=
3
12
=
1
4
d) D = A∪ B = ¦2H, 4H, 6H, 2T, 3H, 3T, 5H, 5T¦
P(A∪ B) =
8
12
=
2
3
or
P(A∪ B) = P(A) +P(B) −P(A∩ B)
A∩ B = ¦2H¦, P(A∩ B) =
1
12
P(A∪ B) =
1
4
+
1
2

1
12
=
3 + 6 −1
12
=
8
12
=
2
3
e) E = B ∩ C = ¦3T, 5T¦
P(E) =
card E
card S
=
2
12
=
1
6
299
f) A∩ B = ¦2H¦ ,= ∅
A∩ C = ∅
B ∩ C = ¦3T, 5T¦
Hence, the events A and C are mutually exclusive.
Example. Let (S, T, P) a probability space and A, B ∈ T such that
P(A) =
3
8
, P(B) =
1
2
, P(A∩ B) =
1
4
.
Compute the following probabilities:
a) P(A∪ B);
b) P(A) and P(B);
c) P(A∩ B);
d) P(A∪ B);
e) P(A∩ B);
f) P(B ∩ A).
Solution. a) P(A∪ B) = P(A) +P(B) −P(A∩ B)
=
3
8
+
1
2

1
4
=
3 + 4 −2
8
=
5
8
b) P(A) = 1 −P(A) = 1 −
3
8
=
5
8
P(B) = 1 −P(B) = 1 −
1
2
=
1
2
c) P(A∩ B) = P(A∪ B) = 1 −P(A∪ B) = 1 −
5
8
=
3
8
d) P(A∪ B) = P(A∩ B) = 1 −P(A∩ B) = 1 −
1
4
=
3
4
e) P(A∩ B) = P(A¸ B) = P(A¸ (A∩ B))
= P(A) −P(A∩ B) =
3
8

1
4
=
1
8
f) P(B ∩ A) = P(B ¸ A) = P(B ¸ (A∩ B))
= P(B) −P(A∩ B) =
1
2

1
4
=
1
4
.
Example. Chevalier de Mere was a mid-seventeenth century nobleman and gam-
bler who tried to make money gambling with dice. De Mere made money by betting
that he could obtain at least one 6 on four rolls of one dice. When people did not
bet on this game with de Mere, he created a new game. He began to bet he would
get a double 6 on twenty-four rolls of two dice but he began losing money on it. He
asked his friend Blaise Pascal to analyze this game. Pascal analyzed this game and
asked Pierre de Fermat to work with him. It can be said that the formal study of
probability was lauched by two mathematicians and a gambler.
300
We will calculate and compare the probabilities of the following events:
A: we obtain at least one six in 4 rolls of a dice
B: we obtain at least one double 6 in 24 rolls of two dice
C: we obtain at least one double 6 in 25 rolls of two dice.
For this problem it is easier to determine the probabilities of the contrary events
A, B, C.
The event A means no six is obtained in 4 rolls of a dice.
The experiment has 6 6 6 6 = 6
4
possible outcomes. Since in each rolling of
one dice we have 5 possibilities to obtain no six, then in 4 rolls of the dice we have 5
4
possibilities to get no six.
Therefore, P(A) =
5
4
6
4
which implies
P(A) = 1 −P(A) = 1 −
5
4
6
4
≈ 0, 52.
The event B means no double 6 is obtained in 24 rolls of two dice. The number
of possible outcomes is 36
24
. The number of favorable outcomes is 35
24
. Therefore,
P(B) = 1 −P(B) = 1 −
35
4
36
4
≈ 0, 49.
The event C has the probability
P(C) = 1 −
_
35
36
_
25
≈ 0, 505.
Example. The birthday problem. If n people are present in a room, what is
the probability that no two of them celebrate their birthday on the same day of this
year? How large need n be so that this probability is less then
1
2
?
Solution. As each person can celebrate his or her birthday on any of 365 day, there
is a total o(365)
n
possible outcomes. (We are ignoring the possibility of someone’s was
born on February 29). Assuming that each outcome is equally likely, we see that the
number of favorable cases is
365 (365 −1) (365 −2) . . . (365 −(n −1))
since the first person can celebrate his or her birthday on any day, the second person
can celebrate on any day except the first person’s birthday, the third person can
celebrate on any day except the first and second person’s birthdays and so on.
The desired probability is
p
n
=
365 364 . . . (365 −n + 1)
365
n
.
The values for the previous probability for different values of n can be found in
the next table.
301
n 1 2 5 10 20 23 30 50
p
n
1 0, 99 0, 97 0, 88 0, 58 0, 49 0, 29 0, 03
When n ≥ 23, the desired probability is less than
1
2
. That is, if there are 23 or
more people in a room, then the probability that at least two of them have the same
birthday is greater than
1
2
.
When there are 50 persons in the room, the probability that at least two have the
same birthday is approximately 0,97.
6.2 Conditional probability
In this section we introduce the concept of conditional probability. Conditional
probabilities are used to compute the probabilities when some partial information
concerning the result of the experiment is available.
Conditional probabilities
In the experiment of tossing a pair of unbiased dice suppose that we observe that
the first dice is a 2. We want to determine the probability that the sum of the two
dice equals 7, given that we already know that the first dice is a 2.
Since the possible outcomes for our experiment are (2,1), (2,2), (2,3), (2,4), (2,5)
and (2,6) and the favorable outcome is (2,5), the desired probability is
1
6
.
If we consider the events
A: the first dice is 2 and
B: the sum of the dice is 7 then
A = ¦(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)¦,
B = ¦(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)¦
and
A∩ B = ¦(2, 5)¦.
In the above example we are told that the event A has occurred and we are asked
to evaluate the probability of B on the basis of this fact. What is important here
is the event A, and given that A has occurred, the event B occurs only if the event
¦(2, 5)¦ = A∩ B occurs. The required probability is then
1
6
=
1
36
6
36
=
P(A∩ B)
P(A)
.
If we denote by P(B[A) (the probability of B) given that A has occurred or just,
given B) then
P(B[A) =
P(A∩ B)
P(A)
.
302
This example justifies the following definition of conditional probability.
Definition. (Conditional probability) Let (S, T, P) be a probability space and
let A, B ∈ T such that P(A) > 0.
The conditional probability of the event B, given that A (has occurred) is denoted
by P(B[A) (or P
A
(B)) and is defined by
P(B[A) = P
A
(B) =
P(A∩ B)
P(A)
.
Replacing A by the sample space S we obtain the probability of B,
P(B[S) =
P(B ∩ S)
P(S)
=
P(B)
1
= P(B).
Hence, the conditional probability is a generalization of the concept of probability
where S is restricted to an event A.
We will see now that the conditional probability, is, indeed, a probability.
Proposition. If (S, T, P) is a probability space and A ∈ T such that P(A) > 0
then P
A
: T →R, P
A
(B) = P(B[A) is a probability function, too.
Proof. We verify the three axioms of a probability function.
i) P
A
(B) = P(B[A) =
P(B ∩ A)
P(A)
≥ 0, ∀ B ∈ T.
ii) P
A
(S) = P(S[A) =
P(S ∩ A)
P(A)
=
P(A)
P(A)
= 1.
iii) Let (B
i
)
i≥1
⊆ T be a sequence of mutually exclusive events. Then
P
A
_

_
i=1
B
i
_
= P
_

_
i=1
B
i
[A
_
=
P
__

_
i=1
B
i
_
∩ A
_
P(A)
=
P
_

_
i=1
(B
i
∩ A)
_
P(A)
=

i=1
P(B
i
∩ A)
P(A)
=

i=1
P(B
i
∩ A)
P(A)
=

i=1
P(B
i
[A) =

i=1
P
A
(B
i
),
which completes the proof.
From the definition of conditional probability we can derive a simple but very
useful result: the so-called multiplicative theorem.
Theorem. (The multiplication rules) Let (S, T, P) be a probability space.
i) If A, B ∈ T such that P(A) > 0, then
P(A∩ B) = P(A) P(B[A).
ii) If A, B, C ∈ T such that P(A∩ B) > 0, then
P(A∩ B ∩ C) = P(A) P(B[A) P(C[A∩ B).
303
iii) If A
1
, . . . , A
n
∈ T such that P(A
1
∩ A
2
∩ ∩ A
n−1
) > 0 then
P(A
1
∩ A
2
∩ ∩ A
n
) = P(A
1
) P(A
2
[A
1
) P(A
3
[A
1
∩ A
2
) . . .
P(A
n
[A
1
∩ A
2
∩ ∩ A
n−1
).
Proof. i) P(A) P(B[A) = P(A)
P(B ∩ A)
P(A)
= P(B∩A) = P(A∩B), as desired.
ii) P(A) P(B[A) P(C[A∩ B)
= P(A)
P(A∩ B)
P(A)

P(A∩ B ∩ C)
P(A∩ B)
= P(A∩ B ∩ C).
Observe that if P(A∩B) > 0 then P(A) ≥ P(A∩B) > 0, so the previous fractions
are correctly defined.
iii) By induction and is left as an exercise.
The importance of the previous theorem is given by the fact that we can calculate
the probability of the intersection of n events, step by step, by means of conditional
probabilities which is easier.
Next, we present a simple example which illustrates the point.
Example. An urn contains 12 identical balls of which 6 are red, 4 are green and
2 are yellow. Four balls are extracted from the urn without replacement. Determine
the probability that the first ball is green, the second is red, the third is yellow and
the last is red.
Solution. If we denote by G
1
, R
2
, Y
3
and R
4
the events that the first ball is green,
the second is red, the third is yellow and the fourth is red then the desired probability
is
P(G
1
∩ R
2
∩ Y
3
∩ R
4
) = P(G
1
) P(R
2
[G
1
) P(Y
3
[G
1
∩ R
2
) P(R
4
[G
1
∩ R
2
∩ Y
3
)
=
4
12

6
11

2
10

5
9
=
2
99
.
Probability trees
An effective and simpler method of applying the probability rules is the probability
tree, in which:
• the events are represented by lines (branches)
• the probability along a path is the product of the probabilities on the branches
which form the path
• the sum of the probabilities at the end of the branches which start from the
same point is 1 because all possible events are listed.
We will present first a theoretical example.
Example. Let (S, T, P) be a probability space. Let A
1
, A
2
, A
3
, E ∈ T such that
A
1
∪ A
2
∪ A
3
= S, A
i
∩ A
j
= ∅, i ,= j.
304
We can obtain the following probability tree.
E ∩ A
1
P(E ∩ A
1
) = P(A
1
) · P(E|A
1
)
E ∩ A
1
P(E ∩ A
1
) = P(A
1
) · P(E|A
1
)
E ∩ A
2
P(E ∩ A
2
) = P(A
2
) · P(E|A
2
)
E ∩ A
2
P(E ∩ A
2
) = P(A
2
) · P(E|A
2
)
E ∩ A
3
P(E ∩ A
3
) = P(A
3
) · P(E|A
3
)
E ∩ A
3
P(E ∩ A
3
) = P(A
3
) · P(E|A
3
)
A
1
A
2
A
3
P(E|A
1
)
P(E|A
1
)
P(E|A
2
)
P(E|A
2
)
P(E|A
3
)
P(E|A
3
)
P(A
1
)
P(A
2
)
P(A
3
)
Example. A graduate statistics course has 7 male and 3 female students. The
professor wants to select two students at random for a research project. By using a
probability tree determine the probabilities of all the possible outcomes.
F
F ∩ F P(F ∩ F) =
3
10
·
2
9
=
2
30
M ∩ F P(M ∩ F) =
7
10
·
3
9
=
7
30
M
F ∩ M P(F ∩ M) =
3
10
·
7
9
=
7
30
M ∩ M P(M ∩ M) =
7
10
·
6
9
=
14
30
P(F) =
3
10
P(M) =
7
10
P(F|F) =
2
9
P(M|F) =
7
9
P(F|M) =
3
9
P(M|M) =
6
9
305
Remark that the first two branches represent the two possibilities female and male
students, on the first choice. The second set of branches represents the two possibilities
on the second choice. The probabilities of female and male student chosen first are
3
10
and
7
10
respectively. The probabilities for the second set of branches are conditional
probabilities based on the choice of the first student selected.
6.3 The total probability formula.
Bayes’ formula
We will first analyze the following example:
Example. Suppose a disease is present in 0,1% of a population. A diagnostic test
is available but imperfect. The test shows 5% false positives and 1% false negatives.
That is, for a patient not having the disease, the test shows positive with probability
0,05 and negative with probability 0,95. For a patient having the disease, the test
shows negative with probability 0,01 and positive with probability 0,99.
A person is randomly chosen.
(i) Determine the probabilities of the following configurations: diseased and posi-
tive test, diseased and negative test, not diseased and positive test, not diseased and
negative test.
(ii) Determine the probability that a person will test positive and the probability
that a person will test negative.
(iii) If the chosen person tests positive, what is the probability that he/she is
diseased? If the chosen person tests negative, what is the probability that he/she is
diseased?
Solution. Let
D: the event that the person is diseased
T: the test is positive.
We are given the following data:
P(D) = 0, 001, P(D) = 0, 999, P(T[D) = 0, 05,
P(T[D) = 0, 95, P(T[D) = 0, 99, P(T[D) = 0, 01.
We can represent these informations by using a probability tree:
(i)
306
D
D ∩ T P(D ∩ T) = 0, 99 · 0, 001
D ∩ T P(D ∩ T) = 0, 01 · 0, 001 = 0, 00001
D ∩ T P(D ∩ T) = 0, 05 · 0, 999 = 0, 04995
D ∩ T P(D ∩ T) = 0, 95 · 0, 999 = 0, 94905
D
P(D) = 0, 001
P(D) = 0, 999
P(T|D) = 0, 99
P(T|D) = 0, 01
P((T|D) = 0, 05
P(T|D) = 0, 95
(ii) P(T) = P(D ∩ T) +P(D ∩ T) = 0, 00099 + 0, 04995 = 0, 05094
P(T) = P(D ∩ T) +P(D ∩ T) = 0, 00001 + 0, 094905 = 0, 094906
(iii) P(D[T) =
P(D ∩ T)
P(T)
=
0, 00099
0, 05094
=
99
5094
≈ 0, 0198
P(D[T) =
P(D ∩ T)
P(T)
=
1
94905
Thus only 1,98% of those persons whose test results are positive actually have the
disease. The result is surprising, since the proportion is low given that the test is quite
good. We will present a second argument which is less rigorous but is more relevant.
Since 0,1% of the population actually has the disease, it follows that 1 person
out of every 1000 tested will have it (on average). The test confirms that a diseased
person has the disease with probability 0,99. Thus out of every 1000 person tested,
the test will confirm that 0,99 persons have the disease. On the other hand, out of
999 healthy people, the test will state (incorrectly) that 999 0, 05 = 49, 95 have the
disease. Hence, for every diseased persons that the test correctly states are ill, there
are 49,95 healthy persons that the test states are ill (incorrectly).
Thus, the proportion of correct positives is equal to:
correct positives
correct positives + incorrect positives
=
0, 99
0, 99 + 49, 95
=
99
5094
≈ 0, 019.
The fact that the probability P(D[T) is less then 1 reflects the fact that the test
is imperfect. If the test would be perfect
P(T[D) = P(T[D) = 1
then
P(D[T) =
P(D ∩ T)
P(D ∩ T) +P(D ∩ T)
307
=
P(D) P(T[D)
P(D) P(T[D) +P(D) P(T[D)
=
P(D) 1
P(D) 1 +P(D) 0
= 1.
By using the same reasoning we can observe that a person testing positive has the
disease depends on the proportion of people in population who are actually ill. Let
us suppose that the incidence rate of disease is r. Replacing the proportion 0,001 by
r and 0,999 by 1 −r in the above calculation, we obtain that
P(D[T) =
r 0, 99
r 0, 99 + (1 −r) 0, 05
=
99r
99r + 5 −5r
=
99r
5 + 94r
A graph of this function is shown in the figure below.
`
¸
1
0,8
0,6
0,4
0,2
0,2 0,4 0,6 0,8 1
r
P(D[T)
We see that the test is relevant when the incidence rate for the disease is large.
Since most diseases have small incidence rate then the false positive rate and false
negative rate for this tests are very important numbers.
Definition. (Partition of the sample space)
Let (S, T, P) be a probability space. The events ¦A
1
, A
2
, . . . , A
n
¦ ⊆ T form a
partition of S if:
i) A
1
∪ A
2
∪ ∪ A
n
= S
ii) A
i
∩ A
j
= ∅, i ,= j
iii) P(A
i
) > 0, i = 1, n.
It is obvious that any event B ∈ T can be expressed in terms of a partition of S:
B = B ∩ S = B ∩
_
n
_
i=1
A
i
_
=
n
_
i=1
(B ∩ A
i
) =
n
_
i=1
(A
i
∩ B).
Furthermore,
P(B) = P
_
n
_
i=1
(A
i
∩ B)
_
=
n

i=1
P(A
i
∩ B) =
n

i=1
P(A
i
) P(B[A
i
).
308
Thus, we have the following result.
Theorem. (The total probability formula) Let (S, T, P) be a probability space
and let ¦A
1
, A
2
, . . . , A
n
¦ be a partition of S.
Then, for any event B ∈ T we have
P(B) =
n

i=1
P(A
i
) P(B[A
i
).
So, if we know the probabilities of the partitioning events P(A
i
), i = 1, n and the
conditional probabilities of B, given A
i
, then by using the previous formula we can
obtain the probability P(B).
The computation in the last example is a particular case of the following general
result.
Theorem. (Bayes’ formula) Let (S, T, P) be a probability space,
¦A
1
, A
2
, . . . , A
n
¦ be a partition of S and let B ∈ T with P(B) > 0.
Then, for all j = 1, n we have
P(A
j
[B) =
P(B[A
j
) P(A
j
)
n

i=1
P(B[A
i
) P(A
i
)
.
Proof. We write
P(A
j
[B) =
P(A
j
∩ B)
P(B)
=
P(B[A
j
) P(A
j
)
n

i=1
P(B[A
i
) P(A
i
)
according to the definition of conditional probability and to the total probability
formula.
The previous formula was first stated by the English clergyman Thomas Bayes,
who died in 1761 but whose now famous formula was not published until 1763.
• The probabilities P(A
1
), . . . , P(A
n
) are called the prior probabilities in the
sense that they do not take into account any information about B.
• P(A
j
[B), j = 1, n are called the posterior probability in the sense that they
are reevaluations of the respective prior P(A
j
) after the event has occurred.
Example. Two identical urns have the following compositions:
- the first urn contains 10 black and 30 white balls
- the second urn contains 20 black and 20 white balls.
An urn is selected at random and a ball is taken from the urn. If the ball is white,
what is the probability that it comes from the first urn?
Solution. We consider the following events:
U
1
: the event of choosing the first urn
U
2
: the event of choosing the second urn
W: the event that the extracted ball is white.
We compute the probabilities:
P(U
1
) = P(U
2
) =
1
2
309
(since the urns are identical, we have the same chance to select any urn)
P(W[U
1
) =
30
40
=
3
4
P(W[U
2
) =
20
40
=
1
2
In order to compute the desired probability, P(U
1
[W) we apply the Bayes law:
P(U
1
[W) =
P(W[U
1
) P(U
1
)
P(U
1
) P(W[U
1
) +P(U
2
) P(W[U
2
)
=
3
4

1
2
1
2

3
4
+
1
2

1
2
=
3
4
3
4
+
2
4
=
3
5
= 0, 6.
Example. (Bac Polynesie 2007) In a vacantion village, three training courses are
proposed both to the children and to the adults. They take place in the same time
and their topics are magic, drama and digital photography. 150 persons (90 adults
and 60 children) are registered at one of this training courses.
- the course of magic was chosen by one half of the children and 20% of the adults
- the course of digital photography was chosen by 27 adults and 10% of the children.
1. Fill in the following table.
Digital
Magic Drama photography Total
Adults 90
Children 60
Total 150
We choose at random a person registered at one of the training courses. We con-
sider the following events.
A: the chosen person is an adult
M: the chosen person is registered at the course of magic
D: the chosen person is registered at the drama course
P: the chosen person is registered at the digital photography course.
2. a) What is the probability that the chosen person is a child?
b) What is the probability that the chosen person takes the course of digital
photography given that he/she is an adult?
c) What is the probability that the chosen person is an adult which is registered
at drama course?
3. Compute the probability that the chosen person is registered at the course of
magic.
4. Compute the probability that the chosen person is a child given that he/she is
registered at the course of magic.
5. We choose at random three persons out of the group of 150 persons. Which is
the probability that only one person takes the course of magic?
310
Solution. 1)
Digital
Magic Drama photography Total
Adults 18 45 27 90
Children 30 24 6 60
Total 48 69 33 150
2) a) P(A) =
card A
card S
=
60
150
=
2
5
b) P(P[A) =
P(P ∩ A)
P(A)
=
27
150
90
150
=
27
90
=
3
10
c) P(A∩ D) =
card (A∩ D)
card S
=
45
150
=
3
10
3) P(M) =
card M
card S
=
48
150
=
9
25
4) P(A[M) =
P(A∩ M)
P(M)
=
30
150
9
25
=
30
48
=
5
8
.
6.4 Independence
Since we already know what conditional probability means we can define the notion
of independent events. Intuitively, in order that two events A and B to be independent,
the probability of A should not change when the event B occurs and the probability
of B should not change when the event A occurs. A first approach to the definition of
independent events is: P(A) = P(A[B). There are two reasons for which the previous
definition is not satisfactory: the definition is not symmetric in A and B and P(A[B)
cannot be defined when P(B) = 0. We define independence as follows.
Definition. Let (S, T, P) be a probability space. Two events A and B ∈ T are
said to be independent if
P(A∩ B) = P(A) P(B).
The events A
1
, A
2
, . . . , A
n
∈ T are said to be independent if
P
_
_

j∈J
A
j
_
_
=

j∈J
P(A
j
),
for all index sets J ⊆ ¦1, 2, . . . , n¦.
311
Example. We choose a random card from a deck of 52 cards. Let A be the event
that the card is a queen, and B be the event that it is a spade. Then
P(A) =
4
52
=
1
13
and P(B) =
13
52
=
1
4
.
The event A∩B is the event that we draw a spade queen, the probability of which
is just
1
52
. We see that P(A) P(B) = P(A ∩ B) and hence the events A and B are
independent.
The next example explains why in the definition of independence for more than
two events, we need to require
P
_
_

j∈J
A
j
_
_
=

j∈J
P(A
j
)
for all index sets J, and not only for such sets of size 2.
Example. In the experiment of rolling 2 fair dice, let A be the event that the first
dice is even, B the event that the second dice is even and C the event that the sum
of the dice is even.
Show that events A, B and C are pairwise independent but A, B, C are not inde-
pendent.
Solution.
P(A) =
card A
card S
=
18
36
=
1
2
P(B) =
card B
card S
=
18
36
=
1
2
A∩ B = ¦(2, 2), (2, 4), (2, 6), (4, 2), (4, 4), (4, 6), (6, 2), (6, 4), (6, 6)¦
P(A∩ B) =
card (A∩ B)
card S
=
9
36
=
1
4
Hence A and B are independent since
P(A∩ B) =
1
4
=
1
2

1
2
= P(A) P(B)
P(C) =
1
2
, A∩ B = A∩ C, P(A∩ C) =
1
4
hence A and C are independent.
In the same way we can show that B and C are independent.
Since A∩ B ∩ C = A∩ B = A∩ C = B ∩ C then
P(A∩ B ∩ C) =
1
4
,=
1
8
= P(A) P(B) P(C)
wherefrom we deduce the fact that the events A, B and C are not independent.
Remark. 1) If P(B) ,= 0, A and B are independent if and only if
P(A[B) = P(A).
312
2) A, B are independent events
⇔ A, B are independent events
⇔ A, B are independent events
⇔ A, B are independent events.
Proof. 1) Let A, B ∈ T such that P(B) ,= 0.
A, B are independent events
⇔ P(A∩ B) = P(A) P(B) ⇔
P(A) =
P(A∩ B)
P(B)
⇔ P(A) = P(A[B).
2) We shall prove just the first equivalence since the other equivalences can be
justified similarly.
Since B = B ∩ S = B ∩ (A ∪ A) = (B ∩ A) ∪ (B ∩ A) and the events B ∩ A and
B ∩ A are mutually exclusive we obtain
P(B) = P(B ∩ A) +P(B ∩ A).
Suppose now that A and B are independent events, that is
P(A∩ B) = P(A) P(B).
P(A∩ B) = P(B) −P(B ∩ A) = P(B) −P(B) P(A)
= P(B)(1 −P(A)) = P(B) P(A).
We see that P(A∩B) = P(A) P(B) and hence the events A and B are indepen-
dent.
Suppose now that A and B are independent events, that is
P(A∩ B) = P(A) P(B)
P(A∩ B) = P(B) −P(A∩ B) = P(B) −P(A) P(B)
= P(B)(1 −P(A)) = P(B) P(A) = P(B) P(A).
We see that P(A∩B) = P(A) P(B) and hence the events A and B are indepen-
dent.
6.5 Classical probabilistic models.
Urn models
In this section we will consider random experiments which frequently appear in
practical applications and we will calculate the probabilities of their outcomes. The
mathematical models that will be used to describe the considered experiments are the
urn models (which contain colored balls of the same weight).
313
Urn models with replacement
Urn model with two states with replacement
Consider an urn U which contains white and black balls. Let p be the probability
of getting a white ball and q the probability of getting a black ball. Since the events
of extracting a white ball, respectively a black ball are contrary events we have that
p +q = 1.
A trial consists in taking a ball, recording its colour and putting it back into the
urn. In consequence the probability of taking a ball of a specified colour at the first
trial is the same with the probability of taking a ball of the same colour at the second
trial, and so on. The trials in this experiment are independent.
We want to determine the probability that in n repeated trials to get k white balls
(k ≤ n).
We denote by X
k
n
the desired event. We have to compute P(X
k
n
).
The desired white balls can be obtained at any k trials from the considered n
trials.
Denote by W
j
the event of getting a white ball at the j
th
trial, j = 1, n. The
desired event can be written as
X
k
n
=
_
(W
i
1
∩ W
i
1
∩ ∩ W
i
k
∩ W
i
k+1
∩ ∩ W
i
n
)
where ¦i
1
, i
2
, . . . , i
n
¦ = ¦1, 2, . . . , n¦.
The previous union contains C
k
n
terms since we can obtain k white balls at any k
trials from n trials.
P(X
k
n
) = P
_
_
(W
i
1
∩ W
i
1
∩ ∩ W
i
k
∩ W
i
k+1
∩ ∩ W
i
n
)
_
=

P(W
i
1
) P(W
i
2
) . . . P(W
i
k
) P(W
i
k+1
) . . . P(W
i
n
)
=

p p . . . p
. ¸¸ .
k times
q q . . . q
. ¸¸ .
n−k times
=

p
k
q
n−k
= C
k
n
p
k
q
n−k
Hence,
P(X
k
n
) = C
k
n
p
k
q
n−k
.
Remark 1. The term C
k
n
p
k
q
n−k
can be obtained as the general term in the
binomial theorem
(p +q)
n
=
n

k=0
C
k
n
p
k
q
n−k
.
This is why the previous model is also called the binomial model.
Remark 2. Pascal urn model
Consider an urn U which contains white and black balls. Let p the probability of
getting a white ball and q the probability of getting a black ball (p +q = 1).
A trial consists in taking a ball from the urn, recording its colour and putting it
back into the urn.
314
We want to determine the probability that in n successive trials to get k (k ≤ n)
white balls and at the n
th
trial (which is the last trial) to get a white ball. The
previous event can also be described as: to get k −1 white balls in the first n−1 trials
and a white ball at the n
th
trial.
Denote by Y
k
n
the desired event. The needed event can be written as:
Y
k
n
= X
k−1
n−1
∩ W
n
where X
k−1
n−1
represents the event of obtaining: k − 1 white balls in n − 1 successive
trials and W
n
is the event of getting a white ball at the n
th
trial.
Hence,
P(Y
k
n
) = P(X
k−1
n−1
∩ W
n
) = P(X
k−1
n−1
) P(W
n
)
= C
k−1
n−1
p
k−1
q
(n−1)−(k−1)
p = C
k−1
n−1
p
k
q
n−k
.
Remark 3. Geometric model
The geometric model is a particular case of the Pascal model in which we take
k = 1, that is in n successive trials we get one white ball which is obtained at the n
th
trial.
Hence,
P(Y
1
n
) = C
1−1
n−1
p q
n−1
= pq
n−1
P(Y
1
n
) = pq
n−1
.
The name ”Geometric model” comes from the fact that the term pq
n−1
is the n
th
term of a geometric progression whose first term in p and its ratio is q.
Another way of obtaining Y
1
n
is the following:
Since Y
1
n
= W
1
∩ W
2
∩ ∩ W
n−1
∩ W
n
, we obtain
P(Y
1
n
) = P(W
1
) P(W
2
) . . . P(W
n−1
) P(W
n
)
= q q . . . q
. ¸¸ .
n−1 times
p
= q
n−1
p = pq
n−1
,
as we expected.
Urn model with more than two states with replacement
Consider an urn U which contains balls of s colours: c
1
, c
2
, . . . , and c
s
. Let p
i
be the
probability of getting a ball of colour c
i
, i = 1, s. Since the events of extracting a ball of
colour c
i
, i = 1, s form a partition of the sample space we have that p
1
+p
2
+ +p
s
= 1.
A trial consists in taking a ball, recording its colour and putting it back into the
urn.
We want to determine the probability that in n repeated trials to get k
i
balls of
colour c
i
, i = 1, s,
k
1
+k
2
+ +k
s
= n.
315
First, we will count in how many different ways the desired event can be obtained.
There are C
k
1
n
possible choices for the balls of colour c
1
; for each choice of the balls
of the first colour there are C
k
2
n−k
1
possible choices for the balls of colour c
2
; for each
choice of the balls of the first two colours there are C
k
3
n−k
1
−k
2
possible choices for the
third group; and so on. In consequence, there are
C
k
1
n
C
k
2
n−k
1
C
k
3
n−k
1
−k
2
. . . C
k
s
n−k
1
−k
2
−···−k
s−1
=
n!
k
1
!(n −k
1
)!

(n −k
1
)!
(n −k
1
−k
2
)! k
2
!

(n −k
1
−k
2
)!
(n −k
1
−k
2
−k
3
)!k
3
. . .
. . .
(n −k
1
−k
2
− −k
s−1
)
(n −k
1
−. . . k
s
)! k
s
!
=
n!
k
1
!k
2
! . . . k
s
!
possible ways in which the desired event can be obtained.
We denote by X
k
1
,k
2
,...,k
s
n
the desired event. We have to compute P(X
k
1
,k
2
,...,k
s
n
).
Denote by X
k
i
c
i
the probability of extracting k
i
balls of colour c
i
from n trials,
i = 1, s.
X
k
1
,...,k
s
n
=
_
(X
k
1
c
1
∩ X
k
2
c
2
∩ ∩ X
k
s
c
s
).
The previous union contains
n!
k
1
!k
2
! . . . k
s
!
terms.
P(X
k
1
,...,k
s
n
) =

P(X
k
1
c
1
∩ X
k
2
c
2
∩ ∩ X
k
s
c
s
)
=

p
k
1
1
p
k
2
2
. . . p
k
s
s
=
n!
k
1
!k
2
! . . . k
s
!
p
k
1
1
. . . p
k
s
s
Remark 4. The term
n!
k
1
!k
2
! . . . k
s
!
p
k
1
1
. . . p
k
s
s
can be obtained as the general term
in the multinomial theorem
(p
1
+ +p
s
)
n
=

(k
1
,...,k
s
)
k
1
+k
2
+···+k
s
=n
n!
k
1
!k
2
! . . . k
s
!
p
k
1
1
p
k
2
2
. . . p
k
s
s
(the above sum is over all nonnegative integer-valued vectors (k
1
, . . . , k
s
) such that
k
1
+k
2
+ +k
s
= n.
This is why the previous model is also called the multinomial model.
Poisson urn model
Suppose we have n urns, (U
1
, . . . , U
n
), each of them containing white and black
balls in different proportions. Let p
i
be the probability of getting a white ball from
the i
th
urn and q
i
the probability of getting a black ball from the same urn, i = 1, n.
Since the previous events are contrary events we have that p
i
+q
i
= 1, i = 1, n.
Our experiment consists in taking one ball from each urn (so we will get exactly
n balls). We want to find the probability of getting k white balls from the selected n
balls, k ≤ n. We denote by X
k
the desired event.
316
Since the desired k white balls can be obtained from any k urns then X
k
can be
written as:
X
k
=
_
(W
i
1
∩ W
i−2
∩ ∩ W
i
k
∩ W
i
k+1
∩ ∩ W
i
n
)
where by W
i
we denote the event of getting a white ball from the i
th
urn and
¦i
1
, . . . , i
n
¦ is any permutation of the set ¦1, 2, . . . , n¦.
Hence
P(X
k
) =

p
i
1
p
i
2
. . . p
i
k
q
i
k+1
. . . q
i
n
,
where the sum is made over all the permutations (i
1
, . . . , i
n
) of the set (1, 2, . . . , n).
Remark 5. The previous value can also be obtained as the coefficient of t
k
of the
following polynomial:
(p
1
t +q
1
)(p
2
t +q
2
) . . . (p
n
t +q
n
).
Actually, we have the following more general identity:
n

k=0
P(X
k
)t
k
=
n

i=1
(p
i
t +q
i
).
Remark 6. The Poisson urn model is a generalization of the binomial model.
Indeed, if in the Poisson urn model we consider n urns with same composition, then
extracting one ball from each urn is the the same to extract repeatedly n balls from
one urn.
In this case we have
n

k=0
P(X
k
)t
k
=
n

i=1
(pt +q) = (pt +q)
n
,
wherefrom we obtain
P(X
k
) = C
k
n
p
k
q
n−k
as we expected.
Urn model without replacement
Urn model with two states without replacement
Let U be an urn which contains a white balls and b black balls. A trial consists
in taking a ball from the urn, recording its colour and not replacing the ball into the
urn. We want to find the probability that in n successive trials to get k white balls
and l = n −k black balls. The numbers l and k must satisfy the following conditions
n = l +k, l ≤ a, l ≤ b.
We observe that since the extracted ball in one trial is not replaced our experiment
(which consists in n successive trials) can be performed by taking n balls at a time.
317
By using the previous remark and the classical definition of the probabilities we get
that the probability of the desired event X
k,l
a,b
is
P(X
k,l
a,b
) =
no. of favorable outcomes
no. of possible outcomes
=
C
k
a
C
l
b
C
n
a+b
.
Indeed, since the number of possibilities of taking k white balls is C
k
a
and for
each choice of k white balls there are C
l
b
possibilities for taking l black balls then the
number of favorable cases is C
k
a
C
l
b
as we mentioned before.
In conclusion:
P(X
k,l
a,b
) =
C
k
a
C
l
b
C
n
a+b
, n = k +l, l ≤ a, l ≤ b.
The previous model can be easily generalized to more then two states as follows.
Urn model with more than two states without replacement
Let U be an urn which contains a
i
balls of colour c
i
, i = 1, s. The experiment
described is similar to the previous one. A trial consists in taking a ball from the
urn, recording its colour and not replacing back into the urn. We want to find the
probability that in n successive trials to get k
i
balls of colour c
i
, i = 1, s. The numbers
k
i
, i = 1, s must satisfy the following conditions: k
1
+ k
2
+ + k
s
= s and k
i
≤ a
i
for each i = 1, s.
The probability of the desired event, X
k
1
,k
2
,...,k
s
a
1
,a
2
,...,a
s
can be obtained by the same
reasoning as in the previous example.
P(X
k
1
,k
2
,...,k
s
a
1
,a
2
,...,a
s
) =
C
k
1
a
1
C
k
2
a
2
. . . C
k
s
a
s
C
n
a
1
+a
2
+···+a
s
, k
1
+k
2
+ +k
s
= n
and k
i
≤ a
i
, i = 1, s.
Examples
Example 1. Suppose that the probability that an item produced by a certain
production line will be defective is 0,1. Find the probability that a sample of 10 items
will contain at most 1 defective item.
Solution. We shall use the binomial model (the urn model with 2 states with
replacement) since the probability of obtaining a defective item at each trial is the
same.
Since we are interested in at most 1 defective item, we have to compute the
probability of the following event X
0
10
∪ X
1
10
.
The desired probability is
P(X
0
10
∪ X
1
10
) = P(X
0
10
) +P(X
1
10
)
= C
0
10
(0, 1)
0
(0, 9)
10
+C
1
10
(0, 1) (0, 9)
9
= 0, 9
10
+ 10 0, 1 0, 9
9
= 0, 9
9
.
318
Example 2. An urn contains 10 white and 5 black balls. Balls are randomly
selected, one at a time, until a white ball is obtained. If we assume that each selected
ball is replaced before the next ball is drawn, what is the probability that:
(a) exactly 3 extractions are needed
(b) at least 3 extractions are needed.
Solution. We shall use the geometric model, with
p =
10
15
=
2
3
and q =
5
15
=
1
3
a) P(Y
1
3
) = p q
2
=
2
3

_
1
3
_
2
=
2
27
b) The desired event is
Y = Y
1
3
∪ Y
1
4
∪ =

_
n=3
Y
1
n
.
It is easier to compute the probability of the contrary event which is: at most two
extractions are needed.
Y = Y
1
1
∪ Y
1
2
P(Y ) = 1 −P(Y ) = 1 −P(Y
1
1
∪ Y
1
2
) = 1 −(P(Y
1
1
) +P(Y
1
2
))
= 1 −(p +pq) = 1 −p −pq = q −pq = q
2
=
1
9
.
Example 3. The probabilities that three men hit a target are respectively
1
6
,
1
4
and
1
3
. Each shoots once at the target.
a) Find the probability that exactly one of them hits the target.
b) Find the probability that at most two of them hit the target.
c) If only one hit the target, what is the probability that is was the first man.
Solution. We will use the Poisson model. By using the notations introduced before
we have:
p
1
=
1
6
, q
1
=
5
6
, p
2
=
1
4
, q
2
=
3
4
, p
3
=
1
3
, q
3
=
2
3
.
a) P(X
1
) is the coefficient of t
1
of the following polynomial
(p
1
t +q
1
)(p
2
t +q
2
)(p
3
t +q
3
) =
_
1
6
t +
5
6
__
1
4
t +
3
4
__
1
3
t +
2
3
_
,
hence
P(X
1
) =
1
6

3
4

2
3
+
5
6

1
4

2
3
+
5
6

3
4

1
3
=
31
72
.
We denote by M
i
the event that the target was hit by the i
th
men, i = 1, 3. The
previous probability can be directly computed as follows.
X
1
= (M
1
∩ M
2
∩ M
3
) ∪ (M
1
∩ M
2
∩ M
3
) ∪ (M
1
∩ M
2
∩ M
3
)
319
P(X
1
) = P(M
1
) P(M
2
) P(M
3
) +P(M
1
) P(M
2
) P(M
3
)
+P(M
1
) P(M
2
) P(M
3
)
P(X
1
) =
1
6

3
4

2
3
+
5
6

1
4

2
3
+
5
6

3
4

1
3
=
31
72
b) B = X
0
∪ X
1
∪ X
2
We will compute the probability of the contrary event: B = X
3
.
P(B) = 1 −P(B) = 1 −P(X
3
) = 1 −
1
6

1
4

1
3
=
71
72
c) P(M
1
[X
1
) =
P(M
1
∩ X
1
)
P(X
1
)
=
P(M
1
∩ M
2
∩ M
3
)
P(X
1
)
P(M
1
[X
1
) =
P(M
1
) P(M
2
) P(M
3
)
P(X
1
)
=
1
6

3
4

2
3
31
72
=
6
31
Example 4. At a lottery, among 100 tickets, 25 are winning tickets. A person
buys 4 tickets from this lottery. Find the probability that at least one ticket is a
winning one?
Solution. We use the urn model with two states without replacement. The desired
event is
A = X
1
4
∪ X
2
4
∪ X
3
4
∪ X
4
4
.
We compute the probability of the contrary event A = X
0
4
.
P(A) = 1 −P(A) = 1 −P(X
0
4
) = 1 −
C
0
25
C
4
75
C
4
100
= 1 −
C
4
75
C
4
100
.
Example 5. In the last 30 years, the probability that a newborn is a girl is 0,52.
A family has 5 babies born in the last 30 years. What is the probability that:
a) the fifth born child to be the second boy of the family;
b) the first boy to be the 4
th
newborn;
c) the last baby born in the family to be a boy.
Solution. We use the Pascal model with
p = 1 −0, 52 = 0, 48, q = 0, 52.
a) P(Y
2
5
) = C
1
4
p
2
q
3
= 4 0, 42
2
0, 52
3
b) P(Y
1
4
) = C
0
3
p q
3
= 0, 42 0, 52
3
c) p = 0, 48.
Example 6. A dice is rolled fourteen times.
a) What is the probability that we obtain exactly one 6?
b) What is the probability that we obtain 4 times 4, 2 times 6 and 6 times 3?
Solution. a) We use the urn model with 2 states (face 6 and other faces) with
replacement.
320
In this case we have n = 14, k = 1, p =
1
6
and q =
5
6
.
P(X
1,13
14
) = C
1
14

_
1
6
_
1

_
5
6
_
13
= 14
1
6

_
5
6
_
13
=
14 5
13
6
14
.
b) We use the urn model with 4 states (face 4, face 6, face 3 and the other faces)
with replacement. In this case we have
n = 14, k
1
= 4, k
2
= 2, k
3
= 6,
k
4
= n −k
1
−k
2
−k
3
= 14 −4 −2 −6 = 2, p
1
= p
2
= p
3
=
1
6
, p
4
=
1
2
P(X
4,2,6,2
14
) =
14!
4! 2! 6! 2!
_
1
6
_
4
_
1
6
_
2
_
1
6
_
3
_
1
2
_
2
.
Miscellaneous examples
Example 1. Among the 20 students of a group, 6 speak English, 5 speak French
and 2 speak German.
If we choose randomly one student which is the probability that he/she knows a
foreign language (English, French or German)?
Solution. Let E, F, G be the considered events. Then
P(E) =
6
20
, P(F) =
5
20
, P(A) =
2
20
.
If the desired event is denoted X, then X = E ∪ F ∪ G. Since the pairs of events
E and F, E and G, F and G are not mutually exclusive we will use the addition rule
for computing the probability of the union E ∪ F ∪ G.
P(X) = P(E ∪ F ∪ G) = P(E) +P(F) +P(G) −P(E ∩ G)
−P(F ∩ G) −P(E ∩ F) +P(E ∩ F ∩ G).
The events E, F and G are independent, hence:
P(X) = P(E) +P(F) +P(G) −P(E) P(G) −P(F) P(G)
−P(E) P(F) +P(E) P(F) P(G)
=
6 + 5 + 2
20

6 5 + 6 2 + 5 2
20 20
+
6 5 2
20 20 20
=
13
20

52
400
+
3
400
=
211
400
.
Example 2. We are given 3 urns which contain white balls and black balls as
follows: U
1
(a, b), U
2
(c, d) and U
3
(e, f). A ball is drawn from the third urn. If the
selected ball is white it is replaced in the first urn and the second ball is drawn from
the urn U
1
. If the first ball is black it is replaced in the second urn wherefrom the
second ball is drawn.
Determine the probability of the following events:
321
a) the second ball is white;
b) the first ball is white given that the second ball is black.
Solution. We use the following notations:
W
i
- the event that the i
th
ball is white, i = 1, 2
B
i
- the event that the i
th
ball is black, i = 1, 2.
a) We apply the total probability formula where the partition is ¦W
1
, B
1
¦.
P(W
2
) = P(W
1
) P(W
2
[W
1
) +P(B
1
) P(W
2
[B
1
)
=
e
e +f

a + 1
a +b + 1
+
f
e +f

c
c +d + 1
.
In the same say we get:
P(B
2
) = P(W
1
) P(B
2
[W
1
) +P(B
1
) P(B
2
[B
1
)
=
e
e +f

b
a +b + 1
+
f
e +f

d + 1
c +d + 1
.
b) According to Bayes’ rule we have
P(W
1
[B
2
) =
P(W
1
) P(B
2
[W
1
)
P(B
2
)
=
e
e +f

b
a +b + 1
e
e +f

b
a +b + 1
+
f
e +f

d + 1
c +d + 1
.
Example 3. An urn contains a white balls (a ≥ 3) and b black balls. If we extract
without replacement 3 balls, what is the probability that all three balls to be white?
Solution. If we denote by W
i
the event that the i
th
ball is white, i = 1, 3, and by
X the desired event, then
X = W
1
∩ W
2
∩ W
3
.
By applying the multiplication rule for computing the probabilities we obtain:
P(X) = P(W
1
∩ W
2
∩ W
3
) = P(W
1
) P(W
2
[W
1
) P(W
3
[W
1
∩ W
2
)
=
a
a +b

a −1
a +b −1

a −2
a +b −2
.
Example 4. Three machines A, B and C produce respectively 40%, 30% and 30%
of the total number of items of a factory. The percentages of defective output of these
machines are 2%, 4% and 5%. Suppose an item is selected at random and is found to
be defective. Find the probability that the item was produces by machine A.
Solution. We are given
P(A) = 0, 4, P(B) = 0, 3, P(C) = 0, 3
P(D[A) = 0, 02, P(D[B) = 0, 04, P(D[C) = 0, 05.
322
A ∩ D
B ∩ D
C ∩ D
A ∩ D
B ∩ D
C ∩ D
P(D|A) = 0, 02
P(D|B) = 0, 04
P(D|C) = 0, 05
A
B
C
0,4
0,3
0,3
By using the Bayes’ formula and the total probability formula we get:
P(A[D) =
P(A) P(D[A)
P(D)
=
P(A) P(D[A)
P(A) P(D[A) +P(B) P(D[B) +P(C) P(D[C)
=
0, 4 0, 02
0, 4 0, 02 + 0, 3 0, 04 + 0, 3 0, 05
≈ 0, 229.
Example 5. A student takes a multiple-choice exam. Suppose for each question
he either knows the answer or gambles and chooses an option at random (at each
question there are exactly 4 choices one of which is the correct answer). To pass,
students need to answer at least 60% of the questions correctly. The student has
”studied for a minimal pass”, that is, with probability 0,6 he knows the answer to a
question. Given that he answers a question correctly, what is the probability that he
actually knows the answer?
Solution. Let C and K denote, respectively, the events that the student answers
323
the question correctly and the event that he knows the answer. Now
P(K[C) =
P(K) P(C[K)
P(C)
=
P(K) P(C[K)
P(K) P(C[K) +P(K) P(C[K)
=
0, 6 1
0, 6 1 + 0, 4 0, 25
=
0, 6
0, 6 + 0, 1
=
6
7
≈ 0, 857
Example 6. Suppose A and B are events with 0 < P(A) < 1 and 0 < P(B) < 1.
a) If A and B are independent, can they be mutually exclusive?
b) If A and B are mutually exclusive, can they be independent?
c) If A ⊂ B, can A and B be independent?
Solution. a) No.
Since A and B are independent, then
P(A∩ B) = P(A) P(B) ,= 0
wherefrom A∩ B ,= ∅.
b) No.
Since A and B are mutually exclusive, then
P(A∩ B) = 0 ,= P(A) P(B).
c) No.
If A ⊂ B, then
P(A∩ B) = P(A) ,= P(A) P(B).
Example 7. Suppose that each of three men at a part throws his hat into the
center of the room. The hats are first mixed up and then each man randomly selects
a hat. What is the probability that none of the three men selects his own hat?
Solution. Let H
i
, i = 1, 3, be the event that the i
th
man selects his own hat. We
have to compute the probability of the event H
1
∩ H
2
∩ H
3
. In order to do that we
will compute the probability of the contrary event:
H
1
∩ H
2
∩ H
3
= H
1
∪ H
2
∪ H
3
= H
1
∪ H
2
∪ H
3
.
Hence
P(H
1
∩ H
2
∩ H
3
) = 1 −P(H
1
∪ H
2
∪ H
3
).
To calculate P(H
1
∪ H
2
∪ H
3
) we will apply the addition rule
P(H
1
∪ H
2
∪ H
3
) = P(H
1
) +P(H
2
) +P(H
3
) −P(H
1
∩ H
2
) −P(H
1
∩ H
3
)
−P(H
2
∩ H
3
) +P(H
1
∩ H
2
∩ H
3
).
324
It remains to compute P(H
1
∩ H
j
), i ,= j and P(H
1
∩ H
2
∩ H
3
).
For each i, j ∈ ¦1, 2, 3¦, i ,= j we have:
P(H
i
∩ H
j
) = P(H
i
) P(H
j
[H
i
) =
1
3

1
2
=
1
6
P(H
1
∩ H
2
∩ H
3
) = P(H
1
) P(H
2
[H
1
) P(H
3
[H
1
∩ H
2
) =
1
3

1
2
1 =
1
6
.
Now, we have that:
P(H
1
∪ H
2
∪ H
3
) =
1
3
+
1
3
+
1
3

1
6

1
6

1
6
+
1
6
=
2
3
.
Hence, the probability that none of the men selects his own hat is
1 −
2
3
=
1
3
.
Example 8. In the poker game (dealing 5 cards from a well-shuffled deck of 52
cards) find the following probabilities:
a) the hand is all spades?
b) the hand us a flush?
c) the hand is a full house?
Solution. To calculate the probability associated with any particular hand, we
first need to calculate how many hands can be dealt. Since the order of cards is
irrelevant we should use combinatorics. This is expressed as:
C
5
52
=
52!
5! 47!
=
52 51 50 49 48
5 4 3 2 1
= 2.598.960
a) We determine first in how many ways we can select 5 spades. That is C
5
13
= 1287.
Thus, the probability of a hand of spades is
C
5
13
C
5
52
=
1287
2598960
≈ 0.0005
b) A flush is a hand of five cards all in the same suit. The probability of a flush is
4 C
5
13
C
5
52
=
4 1287
2598960
≈ 0.002
c) A full house consists of three of a kind and one pair. There are thirteen numbers
that the three of a kind may have and then twelve possible numbers possible for the
pair. This is expressed as: 13C
3
4
12C
2
4
and in consequence the probability of a full
house is equal to:
13C
3
4
12C
2
4
C
5
52
=
13 4 12 6
2958960
= 0, 0012.
325
Example 9 (Probability as a continuous set function)
A sequence of events ¦A
n
¦
n≥1
is said to be an increasing sequence if A
1

A
2
⊂ ⊂ A
n
⊂ A
n+1
⊂ . . . and it is said to be a decreasing sequence if
A
1
⊃ A
2
⊃ ⊃ A
n
⊃ A
n+1
⊃ . . . . If ¦A
n
¦
n≥1
is an increasing sequence of events,
then we define the new event lim
n→∞
A
n
, by lim
n→∞
A
n
=

_
n≥1
A
n
. Similarly, if ¦A
n
¦
n≥1
is
a decreasing sequence of events then lim
n→∞
A
n
is defined by lim
n→∞
A
n
=

n≥1
A
n
.
Prove that if ¦A
n
¦
n≥1
is either an increasing or a decreasing sequence of events,
then
lim
n→∞
P(A
n
) = P
_
lim
n→∞
A
n
_
.
Solution. Suppose, first that ¦A
n
¦
n≥1
is an increasing sequence and defined the
events B
n
, n ≥ 1 by
B
1
= A
1
, B
n
= A
n
¸ A
n−1
, n > 1.
It is easy to verify that the events ¦B
n
¦
n≥1
are mutually exclusive events such
that

_
i=1
A
i
=

_
i=1
B
i
and
n
_
i=1
A
i
=
n
_
i=1
B
i
, n ≥ 1.
Hence
P
_
lim
n→∞
A
n
_
= P
_

_
i=1
A
i
_
= P
_

_
i=1
B
i
_
=

i=1
P(B
i
)
= lim
n→∞
n

i=1
P(B
i
) = lim
n→∞
P
_
n
_
i=1
B
i
_
= lim
n→∞
P
_
n
_
i=1
A
i
_
= lim
n→∞
P(A
n
)
which proves the result when ¦A
n
¦
n≥1
is increasing.
If ¦A
n
¦
n≥1
is a decreasing sequence, then ¦A
n
¦
n≥1
is an increasing sequence,
hence
P
_

_
i=1
A
n
_
= lim
n→∞
P(A
n
).
Since

_
n=1
A
n
=

n=1
A
n
, the previous equality becomes
1 −P
_

i=1
A
i
_
= lim
n→∞
(1 −P(A
n
)) = 1 − lim
n→∞
P(A
n
)
which proves the result.
326
Chapter 7
Random variables
In general, in performing an experiment we are not interested in its outcomes,
but rather in some function of them. For example, suppose one plays a game where
the payoff is a function of the number of dots on two dice: suppose one receives 2
euros if the total number of dots equals 2 or 3; that one received 4 euros if the total
number of dots equals 4, 5, 6 or 7, and that one has to pay 8 euros otherwise. Our
payoff is a function of the total number of dots on the dice. In order to compute the
probability that the payoff equals some number we compute the probability that the
total number of dots correspond to the number selected. This leads to the notion of
random variables.
Definition. Let (S, T, P) be a probability space.
A random variable is a (measurable) function from the probability space to the
real numbers:
X : S →R such that for each
x ∈ R, ¦s : X(s) < x¦ ∈ T.
Random variables are denoted by capital letters, such as X, Y, Z, U, V and W.
Example. In the particular example described before we obtain the following
random variable:
X : S → ¦2, 4, −8¦
X(s) = 2, for each s ∈ A = ¦(1, 1), (1, 2), (2, 1)¦
X(s) = 4, for each s ∈ B = ¦(2, 2), (1, 3), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1)
(1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)¦
X(s) = −8 otherwise, s ∈ C = S ¸ (A∪ B)
Since in this experiment we have T = T(S) the condition ”∀ x ∈ R, ¦s ∈ S [
X(s) < x¦ ∈ T” is fulfilled.
For short, we shall use the notation ¦X < x¦ for the event
¦s ∈ S [ X(s) < x¦.
327
Remark. (Events defined by a random variable)
Let (S, T, P) be a probability space and let X : S → R be a random variable. If
x ∈ R, then:
a) ¦X ≤ x¦ = ¦s ∈ S [ X(s) ≤ x¦ ∈ T
b) ¦X = x¦ = ¦s ∈ S [ X(s) = x¦ ∈ T
c) ¦X ≥ x¦, ¦X > x¦ ∈ T
d) ¦X ≤ x¦ ∪ ¦X > x¦ = S; ¦X ≤ x¦ ∩ ¦X > x¦ = ∅
¦X ≥ x¦ ∪ ¦X < x¦ = S; ¦X ≥ x¦ ∩ ¦X < x¦ = ∅.
Proof. We shall use the definition of the σ-field T and the properties of event
operations.
a) ¦X ≤ x¦ =

n>0
_
X < x +
1
n
_
∈ T
b) ¦X = x¦ = ¦X ≤ x¦ ¸ ¦X < x¦ ∈ T
c) ¦X ≥ x¦ = ¦X < x¦ ∈ T
¦X > x¦ = ¦X ≤ x¦ ∈ T
d) Obvious.
7.1 Discrete random variables
Definition. Let (S, T, P) be a probability space and let X : S →R be a random
variable.
X is said to be a discrete random variable (d.r.v.) if X(S) = M ⊆ R is finite or
countable.
A set M is countable if there is a one-to-one correspondence between M and N
(the set of natural number).
In consequence a random variable X : S → M ⊆ R is a discrete random variable
if
M = ¦x
i
[ i ∈ I ⊆ N¦.
Definition. (Probability mass function) (p.m.f.)
Let X : S → M, M = ¦x
i
[ i ∈ I ⊆ N¦ be a d.r.v.
The function f = f
X
: M →R defined by
f(x
i
) = P(¦X = x
i
¦), i ∈ I
is called the probability mass function of the d.r.v. X.
Example 1. The random variable described before is a d.r.v. since M = ¦2, 4, −8¦.
Its p.m.f. is f : ¦2, 4, −8¦ →R defined by
f(2) = P(A) =
3
36
=
1
12
f(4) = P(B) =
18
36
=
1
2
f(−8) = p(C) =
15
36
=
5
12
328
Theorem. (Properties of the p.m.f.)
Let X : S → M = ¦x
i
[ i ∈ I ⊆ N¦ be a d.r.v. and let f
X
= f be its probability
mass function. Then
1) f(x
i
) ≥ 0, ∀ i ∈ I
2)

i∈I
f(x
i
) = 1.
Proof. We will use the properties of the probability function.
1) f(x
i
) = P(X = x
i
) ≥ 0.
2) We remark first that the events ¦X = x
i
¦
i∈I
form a partition of the sample
space S. By using this fact, we get that

i∈I
f(x
i
) =

i∈I
P(X = x
i
) = P
_
_
i∈I
¦X = x
i
¦
_
= P(S) = 1,
as desired.
Notation. If X : S → M = ¦x
i
[ i ∈ I ⊆ N¦ is a d.r.v. with the p.m.f. f
X
we will
use the following notations:
p
i
= f(x
i
), i ∈ I.
By using these notations the properties of the p.m.f. can be written as:
1) p
i
≥ 0, i ∈ N
2)

i∈I
p
i
= 1.
Definition. The distribution of a d.r.v.
The distribution of a d.r.v. X is a table of the following form:
X :
_
x
i
f
X
(x
i
)
_
i∈I
or X :
_
x
i
p
i
_
i∈I
with
_
¸
_
¸
_
p
i
≥ 0

i∈I
p
i
= 1.
The values taken by X are written on the first row of the previous table and the
probabilities
p
i
= P(X = x
i
), i ∈ I.
are written on the second row of the table.
Operations with discrete random variables
Let (S, T, P) bee a probability space and let X, Y two discrete random variables
defined on S.
X : S → M
1
= ¦x
i
[ i ∈ I ⊆ N¦ d.r.v.
X :
_
x
i
p
i
_
i∈I
, p
i
≥ 0,

i∈I
p
i
= 1
Y : S → M
2
= ¦y
j
[ j ∈ J ⊆ N¦ d.r.v.
329
Y :
_
y
j
q
j
_
, q
j
≥ 0,

j∈J
q
j
= 1.
Definition. The discrete random variables X and Y are called independent if
for each i ∈ I and j ∈ J the following events ¦X = x
i
¦, ¦Y = y
j
¦ are independent
events. In this case
P(¦X = x
i
¦ ∩ ¦Y = y
j
¦) = P(¦X = x
i
¦) P(¦Y = y
j
¦) = p
i
q
j
.
• The sum of two discrete random variables
X +Y : S → M = ¦x
i
+y
j
[ i ∈ I, j ∈ J¦
f
X+Y
(x
i
+y
j
) = P(¦X = x
i
¦ ∩ ¦Y = y
j
¦)
X,Y indep
=
= P(¦X = x
i
¦) P(¦Y = y
j
¦) = p
i
q
j
.
The distribution of the sum X +Y is
X +Y :
_
x
i
+y
j
f
X+Y
(x
i
+y
j
)
_
i∈I,j∈J
If X and Y are independent then
X +Y :
_
x
i
+y
j
p
i
q
j
_
i∈I,j∈J
Example 2. Let X, Y be two independent discrete random variables defined by:
X :
_
−1 1
0, 3 0, 7
_
and Y :
_
0 2
0, 6 0, 4
_
Compute X +Y .
Solution. Since X and Y are independent, we have:
X +Y :
_
−1 + 0 −1 + 2 1 + 0 1 + 2
0, 3 0, 6 0, 3 0, 4 0, 7 0, 6 0, 7 0, 4
_
X +Y :
_
−1 1 1 3
0, 18 0, 12 0, 42 0, 28
_
.
As we can see, the value 1 is taken twice, with probabilities 0,12 and 0,42. Then
P(X +Y = 1) = 0, 12 + 0, 42 = 0, 56.
Finally,
X +Y :
_
−1 1 3
0, 18 0, 56 0, 28
_
.
330
• The product of 2 discrete random variables
X Y : S → M = ¦x
i
y
j
[ i ∈ I, j ∈ J¦
f
XY
(x
i
y
j
) = P(¦X = x
i
¦ ∩ ¦Y = y
j
¦)
X,Y indep
=
= P(X = x
i
) P(Y = y
j
) = p
i
q
j
.
The distribution of the d.r.v. XY is:
XY :
_
x
i
y
j
f
XY
(x
i
y
j
)
_
i∈I,j∈J
If X and Y are independent, then
XY :
_
x
i
y
j
p
i
y
j
_
i∈I,j∈J
• The sum (product) of a d.r.v. with a constant
If c ∈ R, then
c +X :
_
c +x
i
p
i
_
i∈I
c X :
_
cx
i
p
i
_
i∈I
• The inverse of a discrete random variable
If x
i
,= 0, ∀ i ∈ I, then
X
−1
:
_
1
x
i
p
i
_
i∈I
Example. If X and Y are the discrete random variables defined in example 2,
then
X
−1
:
_
−1 1
0, 3 0, 7
_
and ∄ Y
−1
.
• The power of a discrete random variable
Let k ∈ R and X :
_
x
i
p
i
_
i∈I
.
If x
k
i
is well defined for each i ∈ I then
X
k
:
_
x
k
i
p
i
_
i∈I
331
Example. Let X, Y be the independent d.r.v. defined by
X :
_
−1 0 1 2
0, 3 0, 4 0, 2 p
_
and Y :
_
1 2 3
0, 2 0, 4 q
_
Compute: X
2
; 2X −Y ; X
2
−X.
Solution. First, observe that
p = 1 −0, 3 −0, 4 −0, 2 = 0, 1 and q = 1 −0, 2 −0, 4 = 0, 4
X
2
:
_
(−1)
2
0
2
1
2
2
2
0, 3 0, 4 0, 2 0, 1
_
X
2
:
_
0 1 4
0, 4 0, 3 + 0, 2 0, 1
_
2X −Y : 2
_
−1 0 1 2
0, 3 0, 4 0, 2 0, 1
_

_
1 2 3
0, 2 0, 4 0, 4
_
2X −Y :
_
−2 −1 −2 −2 −2 −3 0 −1 0 −2 0 −3
0, 3 0, 2 0, 3 0, 4 0, 3 0, 4 0, 4 0, 2 0, 4 0, 4 0, 4 0, 4
2 −1 2 −2 2 −3 4 −1 4 −2 4 −3
0, 2 0, 2 0, 2 0, 4 0, 2 0, 4 0, 1 0, 2 0, 1 0, 4 0, 1 0, 4
_
2X −Y :
_
−5 −4 −3 −2 −1 0 1 2 3
0, 12 0, 12 0, 22 0, 16 0, 16 0, 08 0, 08 0, 04 0, 02
_
The computation rules introduced before cannot be used in order to determine
X
2
−X since the variables X
2
and X are not independent
X
2
−X :
_
(−1)
2
+ 1 0
2
−0 1
2
−1 2
2
−2
0, 3 0, 4 0, 2 0, 1
_
X
2
−X :
_
0 2
0, 4 + 0, 2 0, 3 + 0, 1
_
X
2
−X :
_
0 2
0, 6 0, 4
_
.
7.2 The distribution function of a random variable
Definition. The distribution function F (or the cumulative distribution
function) of the random variable X is the function
F : R → [0, 1] defined by
F(x) = P(¦X < x¦) = P(X < x).
If we want to mention the role of X we denote F by F
X
.
332
Example. Compute the cumulative distribution function of the following discrete
random variable
X :
_
0 1 2
1
4
1
2
1
4
_
.
Solution. If x ≤ 0, F(x) = P(X < x) = P(∅) = 0.
If x ∈ (0, 1], F(x) = P(X < x) = P(¦X = 0¦) =
1
4
.
If x ∈ (1, 2], F(x) = P(X < x) = P(¦X = 0¦ ∪ ¦X = 1¦) =
1
4
+
1
2
=
3
4
.
If x > 2, F(x) = P(X < x) = P(¦X = 0¦ ∪ ¦X = 1¦ ∪ ¦X = 2¦) = P(S) = 1.
Hence,
F(x) =
_
¸
¸
_
¸
¸
_
0, x ≤ 0
1
4
, 0 < x ≤ 1
3
4
, 1 < x ≤ 2
1, x > 2.
Theorem. (Properties of the cumulative distribution function) The cu-
mulative distribution function F = F
X
of a random variable X has the following
properties.
1) If x
1
, x
2
∈ R, x
1
< x
2
, then
P(x
1
≤ X < x
2
) = F(x
2
) −F(x
1
)
P(x
1
< X < x
2
) = F(x
2
) −F(x
1
) −P(X = x
1
)
P(x
1
≤ X ≤ x
2
) = F(x
2
) −F(x
1
) +P(X = x
2
)
P(x
1
< X ≤ x
2
) = F(x
2
) −F(x
1
) +P(X = x
2
) −P(X = x
1
)
2) F is a monotone increasing function.
3) F is continuous from the left i.e.
lim
yրx
F(y) = F(x −0) = F(x), ∀ x ∈ R.
4) lim
x→−∞
F(x) = 0, lim
x→∞
F(x) = 1.
5) If x ∈ R, then
P(X ≤ x) = F(x + 0) = lim
yցx
F(y).
6) If x ∈ R, then
P(X = x) = lim
yցx
F(y) −F(x) = F(x + 0) −F(x).
7) The set of all points of discontinuity of F is at most countable.
Proof. 1) Let x
1
, x
2
∈ R, x
1
< x
2
.
The events ¦X < x
1
¦ and ¦x
1
≤ X < x
2
¦ are mutually exclusive and satisfy the
equality:
¦X < x
1
¦ ∪ ¦x
1
≤ X < x
2
¦ = ¦X < x
2
¦
333
F(x
2
) = P(X < x
2
) = P(¦X < x
1
¦ ∪ ¦x
1
≤ X < x
2
¦)
= P(X < x
1
) +P(x
1
≤ X < x
2
)
= F(x
1
) +P(x
1
≤ X < x
2
),
hence
P(x
1
≤ X < x
2
) = F(x
2
) −F(x
1
).
Similarly, starting from the equality
¦X < x
1
¦ ∪ ¦x
1
< X < x
2
¦ ∪ ¦X = x
1
¦ = ¦X < x
2
¦,
we obtain
P(X < x
1
) +P(x
1
< X < x
2
) +P(X = x
1
) = P(X < x
2
),
hence
P(x
1
< X < x
2
) = F(x
2
) −F(x
1
) −P(X = x
1
).
The remaining two equalities can be obtained in the same way.
2) Property 2 follows because for x
1
< x
2
the event ¦X < x
1
¦ is contained in the
event ¦X < x
2
¦ and cannot have a larger probability, hence F(x
1
) ≤ F(x
2
).
3) Let x ∈ R and let (x
n
)
n∈N
an arbitrary increasing sequence such that lim
n→∞
x
n
=
x. If x
n
increase to x, then the events ¦X < x
n
¦ are increasing events whose union is
the event ¦X < x¦.
_
n≥1
¦X < x
n
¦ = ¦X < x¦.
Hence, by the continuity property of probabilities (see example 9, page 326):
lim
n→∞
F(x
n
) = lim
n→∞
P(X < x
n
) = P
_
_
_
n≥1
¦X < x
n
¦
_
_
= P(X < x) = F(x).
Hence, F(x −0) = F(x).
4) If (x
n
)
n∈N
increases to ∞, then the events ¦X < x
n
¦, n ≥ 1, are increasing
events whose union is ¦X < ∞¦ = S.
Hence, lim
n→∞
F(x
n
) = lim
n→∞
P(X < x
n
) = P(X < ∞) = 1 which proves the second
part of the fourth property.
The proof of the first part of this property is similar and is left as an exercise.
5) 6) F(x + 0) = lim
yցx
F(y) = lim
n→∞
F
_
x +
1
n
_
= lim
n→∞
P
_
X < x +
1
n
_
= P
_

_
X < x +
1
n
__
= P(X ≤ x) = P(X < x) +P(X = x) = F(x) +P(X = x).
7) The proof of this property is beyond the scope of this text and it will be omitted.
334
Remark. If x ∈ R and F is continuous at x then P(X = x) = 0.
Proof. Since F is continuous at x then F(x+0) = F(x). By applying the property
6 of the previous theorem we get
P(X = x) = F(x + 0) −F(x) = F(x) −F(x) = 0.
7.3 Continuous random variables
In the previous sections we considered discrete random variables, that is, random
variables whose set of possible values is at most countable. However, there also exist
random variables whose set of possible values is uncountable.
Definition. Let (X, T, P) a probability space and let X : S → R be a random
variable whose cumulative distribution function is F.
If there exists a function f : R →R such that
F(x) =
_
x
−∞
f(t)dt,
for each x ∈ R, then
a) X is said to be a continuous random variable
b) f is called the probability density function of X.
The distribution of a c.r.v. X whose density is f is defined as:
X :
_
x
f(x)
_
x∈R
Theorem. (Properties of the density function of a continuous random
variable) Let X be a continuous random variable and let f : R → R be its density
function. Then the following properties hold:
1) f(x) ≥ 0, ∀ x ∈ R
2)
_

−∞
f(x)dx = 1.
Remark. If X is a continuous random variable having the distribution function
F then all the properties of a distribution function hold. Also we have:
a) F is continuous on R, hence for each x ∈ R we have
P(X = x) = 0.
b) If x
1
, x
2
∈ R, x
1
< x
2
P(x
1
< X < x
2
) = P(x
1
≤ X < x
2
) = P(x
1
≤ X ≤ x
2
)
= P(x
1
< X ≤ x
2
) =
_
x
2
x
1
f(t)dt.
c) F is differentiable on R ¸ I, where I is a set which is at most countable and
F

(x) = f(x), ∀ x ∈ R ¸ I.
335
7.4 Numerical characteristics of random variables
Expected value
One of the most important concepts in probability theory is that of the expectation
of a random variable.
If X is a discrete random variable defined as
X :
_
x
i
p
i
_
i∈I⊆N
, p
i
≥ 0, i ∈ I,

i∈I
p
i
= 1,
such that

i∈I
[x
i
[p
i
< ∞, then the expectation or the expected value of X, denoted
by E(X), is:
E(X) =

i∈I
x
i
p
i
.
This is also known as the mean, or average or first moment of X.
Hence, the expected value of X is a weighted average of the possible values that
X can take on, each value being weighted by the probability that X assumes it.
The expected value can be seen as a guide to the location of X and is often called
a location parameter.
If X is a continuous random variable has the density f such that
_

−∞
[x[f(x)dx < ∞,
then X has an expected value, which is given by
E(X) =
_

−∞
xf(x)dx.
Remark. The condition

i=1
[x
i
[p
i
< ∞ is needed because, if it is violated, it is
known that

i=1
x
i
p
i
may be take different values, depending on the order of summa-
tion.
Example. Suppose an insurance company pays the amount of 500 Euro for lost
luggage on an airplane trip. It is known that the company pays this amount in 1 out
of 100 policies it sells. What premium should the company charge?
Solution. Let X be the r.v. defined as X = 0 if no loss occur, and X = −500 for
lost luggage. Then the distribution of X is:
X :
_
0 −500
0, 99 0, 01
_
.
Then the expected loss to the insurance company is
E(X) = 0 0, 99 −500 0, 01 = −5.
336
Thus, the company must charge 5 Euro (it will also add an amount for adminis-
trative expenses and a profit).
From the definition of the expectation and familiar properties of summations or
integrals, it follows that:
Theorem. Properties of the expected value.
1) If X is a constant random variable, i.e.
X :
_
a
1
_
, then E(X) = a.
2) Let X be a r.v. and let a ∈ R. Then
E(aX) = aE(X).
3) Let X and Y be two r.v. and let a, b ∈ R. Then
E(X +Y ) = E(X) +E(Y )
E(a +X) = a +E(X)
E(aX +b) = aE(X) +b.
4) If X and Y are two independent r.v. then:
E(XY ) = E(X)E(Y ).
Variance
The following example illustrates that the expected value, as a measure of location
of the distribution, may show us very little about the entire distribution.
Let X and Y be two d.r.v. whose distributions are defined as follows:
X :
_
−1 1 3
5
8
2
8
1
8
_
; Y :
_
−100 100 300
5
8
2
8
1
8
_
.
It is easy to see that E(X) = E(Y ) = 0.
The distribution of X is over an interval of length 4, the distribution of Y is over
an interval of length 100 times larger and they have the same center of location.
Hence, an additional measure is needed to be associated with the spread of location.
This new measure is the variance of a r.v.
Definition. If X is a random variable with mean E(X) = m, then the variance
of X, denoted by V (X) is defined by
V (X) = E((X −m)
2
).
If X is a d.r.v., i.e. X :
_
x
i
p
i
_
i∈I
then
V (X) =

i∈I
(x
i
−m)
2
p
i
.
337
If X is a c.r.v. X :
_
x
f(x)
_
x∈R
then
V (X) =
_

−∞
(x −m)
2
f(x)dx.
For the random variables X and Y mentioned before we have
V (X) = (−1)
2

5
8
+ 1
2

2
8
+ 3
2

1
8
= 2
and
V (Y ) = (−100)
2

5
8
+ 100
2

2
8
+ 300
2

1
8
= 20000.
Thus, the variance shows the difference in size of the range of the distributions of
the r.v.’s X and Y .
Theorem. (Properties of the variance)
1) If X is a r.v. then V (X) ≥ 0.
2) If X is a r.v. then
V (X) = E(X
2
) −(E(X))
2
.
3) If X is a constant random variable, i.e. X :
_
a
1
_
, then
V (X) = 0.
4) If X is a r.v. and a, b ∈ R then
V (aX +b) = a
2
V (X).
5) If X and Y are two independent random variables and a, b ∈ R then
V (aX +bY ) = a
2
V (X) +b
2
V (Y ).
For a = 1 and b = −1 we have
V (X −Y ) = V (X) +V (Y ).
Proof. We will prove the second property since the rest of them can be eas-
ily obtain from the variance’s definition and familiar properties of summations and
integrals.
2) V (X) = E((X −E(X))
2
) = E((X −m)
2
) = E(X
2
−2mX +m
2
)
= E(X
2
) −2mE(X) +m
2
= E(X
2
) −m
2
= E(X
2
) −(E(X))
2
.
In words, the variance of X is equal to the expected value of X
2
minus the square
of its expected value. This is, in practice, the easier way to compute V (X).
338
Standard deviation
The square root of the variance V (X) is called the standard deviation of X :
σ(X) =
_
V (X).
Unlike the variance, the standard deviation is measured in the same units as X
and E(X) and serves as a measure of deviation of X from E(X).
Moments and central moments
Definition. (Moments)
Let X be a r.v. and let k ∈ N.
The moment of order k of X is the number
ν
k
= E(X
k
).
If X is a d.r.v., i.e., X :
_
x
i
p
i
_
i∈I
then
ν
k
= E(X
k
) =

i∈I
x
k
i
p
i
(if the previous sum exists).
If X is a c.r.v., i.e., X :
_
x
f(x)
_
x∈R
then
ν
k
= E(X
k
) =
_

−∞
x
k
f(x)dx
(if the previous integral converges).
The moments of order k generalize the expected value because
ν
1
= E(X).
Definition. (Central moments)
Let X be a random variable with mean E(X) = m and let k ∈ N.
The central moment of order k of X is the number
µ
k
= E((X −m)
k
).
If X is a d.r.v., i.e., X :
_
x
i
p
i
_
i∈I
then
µ
k
= E((X −m)
k
) =

i∈I
(x
i
−m)
k
p
i
(if the previous sum exists).
339
If X is a c.r.v., i.e, X :
_
x
f(x)
_
x∈R
then
µ
k
= E((X −m)
k
) =
_

−∞
(x −m)
k
f(x)dx
(if the previous integral converges).
The central moments of order k generalize the variance because
µ
2
= E((X −m)
2
) = V (X).
We have the following relationship moments and central moments.
Theorem. Let X be a random variables and let k ∈ N. Then
µ
k
=
k

i=0
(−1)
i
C
i
k
ν
k−i

1
)
i
, where ν
0
= 1.
Proof. By using the binomial theorem, the properties of the expected value and
the fact that m = E(X) = ν
1
we have:
µ
k
= E((X −m)
k
) = E
_
k

i=0
C
i
k
X
k−i
(−m)
i
_
= E
_
k

i=0
(−1)
k
C
i
k
X
k−i
ν
i
1
_
=
k

i=0
(−1)
i
C
i
k
E(X
k−i
) ν
i
1
=
k

i=0
(−1)
i
C
i
k
ν
k−i
ν
i
1
,
as desired.
Particular cases:
µ
1
= 0
µ
2
= ν
2
−ν
2
1
µ
3
= ν
3
−3ν
2
ν
1
+ 2ν
3
1
µ
4
= ν
4
−4ν
3
ν
1
+ 6ν
2
ν
2
1
−3ν
4
1
.
Examples
Example 1. We consider the following gambling game. A player bets on one of
the numbers 1 through 6. Three dice are rolled, and if the number bet by the player
appears i times, i = 1, 3, then the player wins i units; if the number bet by the player
340
does not appear on any of the dice, then the player loses 1 unit. Is the game fair to
the player?
Solution. By assuming that the dice are fair and independent of each other we
can use the urn model with replacement and 2 states (p =
1
6
, q =
5
6
) and 3 repeated
trials.
Let X be the random variables which represents the player’s winning in the game.
P(X = −1) = C
0
3

_
1
6
_
0

_
5
6
_
3
=
125
216
P(X = 1) = C
1
3

_
1
6
__
5
6
_
2
=
75
216
P(X = 2) = C
2
3

_
1
6
_
2

_
5
6
_
=
15
216
P(X = 3) = C
3
3

_
1
6
_
3

_
5
6
_
0
=
1
216
.
Hence the distribution of X is
X :
_
−1 1 2 3
125
216
75
216
15
216
1
216
_
In order to determine whether or not this is a fair game for the player we compute
E(X).
E(X) =
−125 + 75 + 2 15 + 3 1
216
= −
17
216
.
The game is not fair, since in the long run, the player will lose 17 monetary units
at every 216 games he plays.
Example 2. Let X be the discrete random variable whose distribution is
X :
_
1 3 5 7 9 11
0, 05 0, 1 0, 15 0, 2 0, 3 0, 2
_
Compute its distribution function and draw its graph representation.
Solution. The distribution function F : R → [0, 1] is defined as
F(x) = P(X < x)
F(x) =
_
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
_
0, x ≤ 1
0, 05, 1 < x ≤ 3
0, 05 + 0, 1 = 0, 15, 3 < x ≤ 5
0, 15 + 0, 15 = 0, 3, 5 < x ≤ 7
0, 3 + 0, 2 = 0, 5, 7 < x ≤ 9
0, 5 + 0, 3 = 0, 8, 9 < x ≤ 11
0, 8 + 0, 2 = 1, x > 11
341
¸
`
1
0,8
0,5
0,3
0,15
0,05
O 1 3 5 7 9 11
F(x)
x
Example 3. Let X :
_
−1 0 2
0, 2 a b
_
.
1) Determine a, b ∈ R such that E(X) = 0, 8.
2) Compute V (X).
3) Compute the moment and the central moment of order 3 of X.
Solution. 1) From the properties of the probability mass function and the defini-
tion of the expected value we have to determine a, b ≥ 0 such that:
_
0, 2 +a +b = 1
−1 0, 2 + 0 a + 2 b = 0, 8

_
2b = 1
a = 0, 8 −b

_
b = 0, 5
a = 0, 3
Hence
X :
_
−1 0 2
0, 2 0, 3 0, 5
_
2) V (X) = E(X
2
) −(E(X))
2
= (−1)
2
0, 2 + 0
2
0, 3 + 2
2
0, 5 −0, 8
2
= 2, 2 −0, 64 = 1, 56
4) ν
3
= E(X
3
) = (−1)
3
0, 2 + 0
3
0, 3 + 2
3
0, 5 = −0, 2 + 4 = 3, 8
µ
3
= E((X −0, 8)
3
) = (−1 −0, 8)
3
0, 2 + (0 −0, 8)
3
0, 3 + (2 −0, 8)
3
0, 5
= −1, 1664 −0, 1536 + 0, 864 = −0, 456
By using the relationship between central moments and moments we have a second
342
method for computing µ
3
µ
3
= ν
3
−3ν
2
ν
1
+ 2ν
3
1
= 3, 8 −3 2, 2 0, 8 + 2 0, 8
3
= 3, 8 −5, 28 + 1, 024 = −0, 456
as we expected.
Example 4. Let f : R →R defined by
f(x) =
_

1
18
x +k, if −1 < x ≤ 2
0, otherwise
a) Determine k ∈ R such that f is a density probability function of a continuous
random variable X.
b) Determine the distribution function F.
c) Determine E(X).
Solution. a) f is a density probability function if the following two conditions are
satisfied:
• f(x) ≥ 0, x ∈ R

_

−∞
f(x)dx = 1
The first condition implies that k ≥
x
18
for each x ∈ (−1, 2] wherefrom we easily
obtain
k ≥
2
18
=
1
9
.
From the second condition we obtain:
1 =
_

−∞
f(x)dx =
_
−1
−∞
f(x)dx +
_
2
−1
f(x)dx +
_

2
f(x)dx
=
_
2
−1
_

1
18
x +k
_
dx = −
1
18

x
2
2
¸
¸
¸
2
−1
+kx
¸
¸
¸
2
−1
= −
4
36
+
1
36
+ 3k.
It remains for us to solve the equation
3k = 1 +
1
12
, k =
13
36
_
observe that k ≥
1
9
_
.
The distribution function is F : R → [0, 1]
F(x) =
_
x
−∞
f(t)dt.
- for x ≤ −1, since f(x) = 0 then F(x) = 0
343
- for −1 < x ≤ 2,
F(x) =
_
x
−∞
f(t)dt =
_
−1
−∞
f(t)dt +
_
x
−1
f(t)dt =
_
x
−1
_

t
18
+
13
36
_
dt
=
_

t
2
36
+
13t
36
_
¸
¸
¸
x
−1
= −
x
2
36
+
13x
36
+
1
36
+
13
36
= −
x
2
36
+
13x
36
+
14
36
.
- for x > 2
F(x) =
_
2
−∞
f(t)dt +
_
x
2
f(t) = F(2) +
_
x
2
0dt = F(2) = 1.
Hence
F(x) =
_
¸
¸
¸
_
¸
¸
¸
_
0, x ≤ −1

x
2
36
+
13x
36
+
14
36
, −1 < x ≤ 2
1, x > 2
b) E(X) =
_

−∞
xf(x)dx =
_
2
−1
x
_

x
18
+
13
36
_
dx
=
_

x
3
54
+
13x
2
72
_
¸
¸
¸
2
−1
= −
8
54

1
54
+
52
72

13
72
= −
1
6
+
13
24
=
9
24
=
3
8
.
Example 5. Let f : R →R be a function defined as
f(x) =
_
αe

x
5
, x ≥ 0
0, x < 0
a) Determine α ∈ R such that f is a density probability function of a continuous
random variable X.
b) Compute E(X), V (X) and ν
15
.
Solution. a) Since f is a probability density function then
f(x) ≥ 0, ∀ x ∈ R and
_

−∞
f(x)dx = 1.
From the inequality f(x) ≥ 0, ∀ x ∈ R we obtain α ≥ 0.
From the second condition we obtain
1 =
_

−∞
f(x)dx =
_

0
αe

x
5
dx = α
_

0
e
−y
5dy = 5αΓ(1) = 5α.
344
Hence α =
1
5
≥ 0.
b) E(X) =
_

−∞
xf(x)dx =
_

0
x
1
5
e

x
5
dx.
By using the following change of variable:
x
5
= y, x = 5y, dx = 5dy
we have:
E(X) =
_

0
5y
1
5
e
−y
5dy = 5
_

0
ye
−y
dy = 5Γ(2) = 5
V (X) = E(X
2
) −[E(X)]
2
.
By using the same change of variable we obtain
E(X
2
) =
_

0
x
2

1
5
e

x
5
dx =
_

0
25y
2

1
5
e
−y
5dy
= 25Γ(3) = 25 2! = 50,
hence V (X) = 50 −25 = 25.
Using once more the Gamma function we obtain
ν
15
=
_

0
x
15

1
5
e

x
5
dx =
_

0
5
15
y
15

1
5
e
−y
5dy
= 5
15
Γ(16) = 5
15
15!
7.5 Special random variables
Certain types of random variables occur over and over again in applications. In
this section we will study a few of them.
Discrete random variables
The Bernoulli and binomial random variables
Suppose that we perform an experiment whose outcome can be classified as either
a ”success” (with probability p, 0 < p < 1) or a ”failure” (with probability 1 −p).
If we let X = 1 when the outcome is a success and X = 0 when it is a failure then
the distribution of X is
X :
_
1 0
p 1 −p
_
.
A random variable X is said to be a Bernoulli random variable (after the Swiss
mathematician James Bernoulli) if its distribution is:
X :
_
1 0
p 1 −p
_
, 0 < p < 1.
345
The expected value is E(X) = 1 p + 0 (1 −p) = p.
The variance is
V (X) = E(X
2
) −(E(X))
2
= 1
2
p + 0
2
(1 −p) −p
2
= p(1 −p) = pq,
where by q we denoted the probability of a ”failure”; q = 1 −p.
Suppose that n independent trials, each of which is a Bernoulli experiment, are
performed. If X represents the number of successes that occur in the n trials, then X
is said to be a binomial random variables with parameters (n, p).
Notation X ∼ B(n, p).
The probability mass function of a binomial random variable with parameters n
and p is given by
P(X = k) = C
k
n
p
k
(1 −p)
n−k
= C
k
n
p
k
q
n−k
, k = 0, n
(the reasoning is similar to that used in the urn model with two states and with
replacement).
Definition. The binomial random variable with parameters n and p is the
random variable whose distribution is
X :
_
k
C
k
n
p
k
q
n−k
_
k=0,n
, p +q = 1, p ∈ (0, 1).
To check that P(X = k) = C
k
n
p
k
q
n−k
, k = 1, n is a probability mass function, we
note that:
P(X = k) = C
k
n
p
k
q
n−k
> 0
and
n

k=1
P(X = k) =
n

k=1
C
k
n
p
k
q
n−k
= (p +q)
n
= 1.
Remark. If X is a binomial random variable with parameters n and p, X ∼
B(n, p), then
E(X) = np and V (X) = npq.
Proof. Since a binomial random variable X, with parameters n and p, represents
the number of successes in n independent trials (with the success probability p), then
X can be represented as
X =
n

i=1
X
i
where X
i
=
_
1, if the i
th
trial is a success
0, otherwise
Because X
i
, i = 1, n, are n independent Bernoulli r.v. we have that
E(X) = E
_
n

i=1
X
i
_
=
n

i=1
E(X
i
) = np
346
V (X) = V
_
n

i=1
X
i
_
=
n

i=1
V (X
i
) = npq.
Remark. If X
1
∼ B(n
1
, p) and X
2
∼ B(n
2
, p) are independent then X
1
+X
2
is a
binomial random variable with parameters (n
1
+n
2
, p); i.e. X
1
+X
2
∼ B(n
1
+n
2
, p).
Example. Suppose that an airplane engine will fail with probability 1 − p inde-
pendently from engine to engine; suppose that the airplane will make a successful
flight if at least a half of its engines are operative. For what value of p a four-engine
airplane is preferable to a two-engine airplane?
Solution. As the number of functioning engines is a binomial random variable
with parameters (n, p) it follows that the probability for a four engine airplane to
make a successful flight is
P(X
1
= 2) +P(X
1
= 3) +P(X
1
= 4)
= C
2
4
p
2
(1 −p)
2
+C
3
4
p
3
(1 −p) +C
4
4
p
4
(1 −p)
0
= 6p
2
(1 −p)
2
+ 3p
3
(1 −p) +p
4
whereas the corresponding probability for a two engine airplane is
P(X
1
= 1) +P(X
2
= 2) = C
1
2
p(1 −p) +C
2
2
p
2
= 2p(1 −p) +p
2
.
Hence, the four engine plane is better if
6p
2
(1 −p)
2
+ 3p
3
(1 −p) +p
4
≥ 2p(1 −p) +p
2
which is equivalent to (by dividing the inequality with p)
6p(1 −p)
2
+ 3p
2
(1 −p) +p
3
≥ 2 −p ⇔ (p −1)
2
(3p −2) ≥ 0
which is equivalent to
p ≥
2
3
.
In conclusion, the four-engine plane is better when the success probability is
greater than
2
3
, whereas the two-engine plane is better if the success probability
is smaller than
2
3
.
The geometric random variable
The geometric random variable is closely related to the binomial random variable.
Consider a sequence of independent Bernoulli experiments where p is the proba-
bility of ”success” and q = 1 −p the probability of ”failure”.
We saw that the random variable which represents the number of successes (in n
successive trials) is binomial.
The geometric random variable represents the waiting time until the first
success occurs.
347
If the first success occurs at the k
th
trial, then we must have k −1 failures before
the first success. The Bernoulli trials are independent, hence the probability of the
desired event is (1 −p)
k−1
p = q
k−1
p, k ≥ 1 (see also the geometric urn model).
Definition. The geometric random variable with parameter p is the random vari-
able whose distribution is
X :
_
k
q
k−1
p
_
k≥1
p ∈ (0, 1), p +q = 1.
To check that P(X = k) = q
k−1
p, k ≥ 1 is a probability mass function, we note
that:
P(X = k) = q
k−1
p > 0
and

k=1
P(X = k) =

k=1
q
k−1
p = p

k=1
q
k−1
= p lim
n→∞
(1 +q + +q
n
)
= p lim
n→∞
1 −q
n−1
1 −q
= p
1
1 −q
=
p
p
= 1.
Remark. If X is a geometric random variable with parameter p ∈ (0, 1) then
E(X) =
1
p
and V (X) =
q
p
2
.
Proof.
E(X) =

k=1
kq
k−1
p = p

k=1
kq
k−1
= p
_

k=1
q
k−1
_

= p
_
1
1 −q
_

= p
1
(1 −q)
2
=
p
p
2
=
1
p
,
as desired.
348
To determine V (X) we compute first E(X
2
).
E(X
2
) =

k=1
k
2
q
k−1
p = p

k=1
k
2
q
k−1
= p

k=1
(k
2
−k +k)q
k−1
= p

k=2
k(k −1)q
k−1
+p

k=1
kq
k−1
= pq

k=2
k(k −1)q
k−2
+p

k=1
kq
k−1
= pq
_

k=1
q
k
_
′′
+
1
p
= pq
_
1
1 −q
_
′′
+
1
p
= pq
_
1
(1 −q)
2
_

+
1
p
= pq
2
(1 −q)
3
+
1
p
=
2q
p
2
+
1
p
=
q +q +p
p
2
=
1 +q
p
2
.
Hence
V (X) =
1 +q
p
2

1
p
2
=
q
p
2
.
We can also remark that the standard deviation is:
σ(X) =

q
p
.
The occurrence of a geometric series explains the use of the word ”geometric” in
describing the probability distribution.
As an application of the previous remark we present the following:
- if we toss a fair coin, then the expected waiting time for the first head to occur
is
1
p
=
1
1
2
= 2 tosses
- if we roll a fair dice, then the expected waiting time for the six to occur is
1
p
=
1
1
6
= 6 rolls.
The negative binomial (Pascal) random variable
The binomial distribution finds the probability of exactly k successes in n inde-
pendent trials.
The geometric distribution finds the number of independent trials until the first
success occurs.
349
We can generalize these two results and find the number of independent trials
required for k successes.
Suppose that independent trials, each having the probability of success p, 0 < p <
1 are performed until a total of k successes is obtained.
Let X be the random variable which represent the number of trials required, then
P(X = n) = C
k−1
n−1
p
k
(1 −p)
n−k
, n = k, k + 1, . . . ,
The previous equality holds because, in order that k successes to occur in the first
n trials, there must be k −1 successes in the first n −1 trials and the n
th
trial must
be a success. The mentioned probability was computed in the Pascal urn model.
Definition. The negative binomial (or Pascal) random variable with parameters
k and p is the random variable whose distribution is:
X :
_
n
C
k−1
n−1
p
k
q
n−k
_
n≥k
To check that P(X = n) = C
k−1
n−1
p
k
q
n−k
, n ≥ k is a probability mass function, we
note that
P(X = n) > 0
and

n=k
C
k−1
n−1
p
k
q
n−k
= p
k
(C
k−1
k−1
+C
k−1
k
q +C
k−1
k+1
q
2
+. . . )= p
k
1
(1 −q)
k
=
p
k
p
k
= 1.
In establishing the previous equality we used the following Taylor expansion:
1
(1 −q)
k
= 1 +kq +
k(k + 1)
2
q
2
+
k(k + 1)(k + 2)
3!
q
3
+. . . , [q[ < 1.
The geometric random variable is a negative binomial random variable with k = 1.
Remark. If X is negative binomial random variable with parameters k and p ∈
(0, 1) then
E(X) =
k
p
and V (X) =
kq
p
2
.
Proof.
E(X) =

n=k
nC
k−1
n−1
p
k
q
n−k
=
k
p

n=k
C
k
n
p
k+1
q
n−k
, since nC
k−1
n−1
= kC
k
n
=
k
p

m=k+1
C
k+1−1
m−1
p
k+1
q
m−(k+1)
, by setting m = n + 1
=
k
p
1 =
k
p
since the numbers C
k+1−1
m−1
p
k+1
q
m−(k+1)
, m ≥ k + 1
350
represent the probability mass function of a negative binomial random variables with
parameters (k + 1, p).
To determine V (X) we compute first E(X
2
).
E(X
2
) =

n=k
n
2
C
k−1
n−1
p
k
q
n−k
=
k
p

n=k
nC
k
n
p
k+1
q
n−k
, since nC
k−1
n−1
= kC
k
n
=
k
p

m=k+1
(m−1)C
k
m−1
p
k+1
q
m−(k+1)
by setting m = n + 1
=
k
p

m=k+1
mC
k+1−1
m−1
p
k+1
q
n−(k+1)

k
p

m=k+1
C
k+1−1
m−1
p
k+1
q
m−(k+1)
=
k
p

k + 1
p

k
p
=
k
p
_
k + 1
p
−1
_
.
Therefore,
V (X) =
k
p
_
k + 1
p
−1
_

_
k
p
_
2
=
k
2
+k −kp −k
2
p
2
=
k(1 −p)
p
2
=
kq
p
2
,
as desired.
Example. Find the expected value and the variance of the number of times one
must roll a dice until the face 6 occurs 4 times.
Solution. The experiment can be described by a negative binomial random vari-
able with parameters k = 4 and p =
1
6
.
Hence
E(X) =
k
p
=
4
1
6
= 24
and
V (X) =
4
5
6
_
1
6
_
2
= 120.
The hypergeometric random variable
The hypergeometric random variable is obtained while sampling without replace-
ment.
Suppose that a sample of size n is to be chosen randomly (without replacement)
from an urn containing a white balls and b black balls. If we let X denote the number
351
of white balls selected, then
P(X = k) =
C
k
a
C
n−k
b
C
k
a+b
, k = 0, 1, . . . , n
(see also the urn model with two states without replacement).
Definition. The hypergeometric random variables with parameters n, a, b (n ≤
a +b, max(a, n −b) ≤ k ≤ min(n, a)) is the random variable whose distribution is
X :
_
k
C
k
a
C
n−k
b
C
n
a+b
_
k=0,n
To check that P(X = k) =
C
k
a
C
n−k
b
C
n
a+b
, k = 0, n is a probability mass function we
note that
P(X = k) > 0
and
n

k=0
P(X = k) =
n

k=0
C
k
a
C
n−k
b
C
n
a+b
=
1
C
n
a+b
n

k=0
C
k
a
C
n−k
b
=
C
n
a+b
C
n
a+b
= 1.
In establishing the previous equality we used the Vandermonde’s identity:
n

k=0
C
k
a
C
n−k
b
= C
n
a+b
.
Remark 1. If X is a hypergeometric random variable with parameters n, a, b then
E(X) = n
a
a +b
, V (X) = n
a
a +b

b
a +b

a +b −n
a +b −1
.
Proof.
E(X) =
n

k=1
k
C
k
a
C
n−k
b
C
n
a+b
=
1
C
n
a+b
n

k=0
(kC
k
a
)C
n−k
b
=
1
C
n
a+b
n

k=1
(aC
k−1
a−1
)C
n−k
b
, since kC
b
a
= aC
k−1
a−1
=
a
C
n
a+b
n−1

l=0
C
l
a−1
C
n−1−l
b
, by setting k = l + 1
=
a
C
n
a+b
C
n−1
a+b−1
, by using Vandermonde’s identity
=
a
C
n
a+b

n
a +b
C
n
a+b
= n
a
a +b
, since C
n−1
a+b−1
=
n
a +b
C
n
a+b
.
352
Hence E(X) = n
a
a +b
.
To determine V (X) we compute first E(X
2
).
E(X
2
) =
n

k=0
k
2

C
k
a
C
n−k
b
C
n
a+b
=
1
C
n
a+b
n

k=1
[k(k −1) +k]C
k
a
C
n−k
b
=
1
C
n
a+b
n

k=2
(k(k −1)C
k
a
)C
n−k
b
+E(X)
=
1
C
n
a+b
n

k=2
a(a −1)C
k−2
a−2
C
n−k
b
+E(X),
since k(k −1)C
k
a
= (k −1)aC
k−1
a−1
= a(a −1)C
k−2
a−2
=
a(a −1)
C
n
a+b
n−2

k=0
C
l
a−2
C
n−2−l
b
+E(X), by setting k = l + 2
=
a(a −1)
C
n
a+b
C
n−2
a+b−2
+E(X), by using the Vandermonde’s identity
=
a(a −1)n(n −1)
(a +b)(a +b −1)
+n
a
a +b
V (X) = E(X
2
) −(E(X))
2
= n(n −1)
a(a −1)
(a +b)(a +b −1)
+n
a
a +b
−n
2
a
2
(a +b)
2
= = n
a
a +b

b
a +b

a +b −n
a +b −1
,
as desired.
Remark 2. Let X be a hypergeometric random variable with parameters n, a
and b. If we denote by p (respectively q) the probability of extracting a white ball
(respectively a black ball) at the beginning of the experiment then the expected value
and the variance of the r.v. X can be written as
E(X) = np, V (X) = npq
a +b −n
a +n −1
,
where p =
a
a +b
and q =
b
a +b
.
Remark 3. (Approximation to binomial distribution)
If n balls are randomly chosen without replacement from a set of a + b balls, of
which a are white balls, then the r.v. which represents the number of white balls
extracted is hypergeometric. If a and b are large in relation to n then it seems that
there is no difference whether the selection is made with or without replacement.
In this case, when a and b are large, the probability of taking a white ball at each
additional selection will be approximately equal to p =
a
a +b
.
353
We may expect, that the probability mass function of X can be approximate by
the p.m.f. of a binomial r.v. with parameters n and p.
We will verify now the previous statement.
P(X = k) =
C
k
a
C
n−k
b
C
n
a+b
=
a!
k!(a −k)!

b!
(n −k)!(b −n −k)!

n!(a +b −n)!
(a +b)!
=
n!
k!(n −k)!

a!(a +b −n)!
(a +b)!(a −k)!

b!
(b −n −k)!
= C
k
n
a(a −1) . . . (a −k + 1)
(a +b) . . . (a +b −k + 1)

b . . . (b −n −k + 1)
(a +b −k)(a +b −n + 1)
= C
k
n
a
a +b

a −1
a +b −1
. . .
a −k + 1
a +b −k + 1

b
a +b −k

b −n −k + 1
a +b −n + 1
≈ C
k
n
p
k
q
n−k
.
In practice, the hypergeometric law can be replaced by the binomial distribution
if the following inequality holds 10n < a +b.
The Poisson random variable
Definition. The Poisson random variable with parameter λ, λ > 0 is a random
variable whose distribution is:
X :
_
k
e
−λ λ
k
k!
_
k=0,1,...
To check that P(X = k) = e
−λ

λ
k
k!
, k ≥ 0 is a probability mass function we note
that P(X = k) > 0 and

k=0
P(X = k) =

k=0
e
−λ

λ
k
k!
= e
−λ

k=0
λ
k
k!
= e
−λ
e
λ
= 1.
In establishing the previous equality we used the following Taylor expansion:
e
λ
= 1 +
λ
1!
+
λ
2
2!
+. . . , λ ∈ R.
The Poisson probability distribution was introduced by S.D. Poisson in the book
entitled ”Recherches sur la probabilit´e des jugements en mati`ere criminelle et en
mati`ere civile”.
Remark 1. If X is a Poisson random variable with parameter λ, λ > 0, then
E(X) = λ and V (X) = λ.
354
Proof.
E(X) =

k=0
ke
−λ
λ
k
k!
= e
−λ

k=1
k
λ
k
k!
= e
−λ
λ

k=1
λ
k−1
(k −1)!
= e
−λ
λ e
λ
= λ
V (X) = E(X
2
) −[E(X)]
2
=

k=0
k
2
e
−λ
λ
k
k!
−λ
2
= e
−λ

k=1
[k(k −1) +k]
λ
k
k!
−λ
2
= e
−λ
λ
2

k=2
λ
k−2
(k −2)!
+λe
−λ

k=1
λ
k−1
(k −1)!
−λ
2
= e
−λ
λ
2
e
λ
+λe
−λ
e
λ
−λ
2
= λ
Remark 2. (The Poisson distribution as the limit of the binomial)
The Poisson random variable may be used as an approximation for a binomial
random variable with parameters (n, p) when n is large (compared to k) and p is
small enough such that np is of moderate size.
Suppose that X is a binomial random variable with parameters (n, p) and let
λ = np. Then
C
k
n
p
k
(1 −p)
n−k
=
n!
k!(n −k)!
p
k
(1 −p)
n−k
=
n!
k!(n −k)!
_
λ
n
_
k
_
1 −
λ
n
_
n−k
=
n(n −1) . . . (n −k + 1)
n
k

λ
k
k!

_
1 −
λ
n
_
n
_
1 −
λ
n
_
k
If n is large and p is small, then
_
1 −
λ
n
_
n
=
_
_
1 −
λ
n
_

n
λ
_
−λ
≈ e
−λ
n(n −1) . . . (n −k + 1)
n
k
≈ 1
355
(since k is much smaller)
_
1 −
λ
n
_
k
≈ 1
Hence, for n large (compared to k) and p small,
C
k
n
p
k
(1 −p)
n−k
≈ e
−λ

λ
k
k!
In consequence, if n independent trials (whose outcomes are ”success” with prob-
ability p and ”failure” with probability 1 −p) then, for n large and p small such that
np is moderate in size, the number of successes occurring is approximately a Poisson
variable with parameter λ = np.
Some examples of random variables that usually follow the Poisson probability
law are:
1. The number of misprints on a page (or on a group of pages) of a book.
2. The number of customers entering a bank on a given day.
3. The number of people in a community living to 90 years of age.
4. The number of particles emitted by a radioactive source within a certain period
of time.
5. The number of accidents in 1 day on a particular stretch of a highway.
Example. (Misprints on a page)
Suppose a page of a book contains n = 1000 characters, each of which is misprinted
with a probability p = 10
−4
. Compute the probabilities of having:
a) no misprint on the page
b) at least one misprint on the page
both by binomial formula and by Poisson formula.
Solution. Let X be the r.v. which represents the number of misprints on the page:
a) - by the binomial formula:
C
0
1000
(10
−4
)
0
(1 −10
−4
)
1000
≈ 0.904833
- by the Poisson formula with
λ = np = 1000 10
−4
= 0, 1 :
0, 1e
−0,1
0!
≈ 0, 904837
b) - by the binomial formula
P(X ≥ 1) = 1 −P(X = 0) ≈ 1 −0.904833 = 0, 095167
- by the Poisson formula
P(X ≥ 1) = 1 −P(X = 0) ≈ 1 −0, 904837 = 0, 095163.
Remark 3. (The sum of independent Poisson variables is Poisson)
If X
1
and X
2
are independent Poisson random variables with parameters λ
1
and
λ
2
, respectively, then X
1
+X
2
is Poisson with parameter λ
1

2
.
356
Proof.
P(X
1
+X
2
= n) =
n

i=0
P(X
1
= k, X
2
= n −k)
=
n

i=0
P(X
1
= k) P(X
2
= n −k) =
n

k=0
e
−λ
1

λ
k
1
k!
e
−λ
2

λ
n−k
2
(n −k)!
= e
−(λ
1

2
)

1
n!
n

k=0
n!
k!(n −k)!
λ
k
1
λ
n−k
2
= e
−(λ
1

2
)

1
n!
n

k=0
C
k
n
λ
k
1
λ
n−k
2
= e
−(λ
1

2
)

1
n!

1

2
)
n
= e
−(λ
1

2
)

1

2
)
n
n!
, n = 0, 1, . . .
Sometimes, we are interested not just in one Poisson random variable, but in a
family of random variables.
For example, in the previous example we may be interested for the probabilities
of the number of misprints on several pages.
A family of random variables X(t) depending on a parameter t is called a stochastic
or random process.
The parameter t is time in most applications.
Next we will present a particular stochastic process called the Poisson process.
Definition. (Poisson process)
A family of random variable (X(t))
t>0
is called a Poisson process with rate λ,
λ > 0, if the r.v. X(t) (the number of occurrences of some type in any interval of
length t) has a Poisson distribution with parameter λt for any t > 0:
P(X(t) = k) =
(λt)
k
e
−λt
k!
, k = 0, 1, . . .
and for each 0 < t
1
< < t
n
the random variables X(t
1
); X(t
2
)−X(t
1
); . . . , X(t
n
)−
X(t
n−1
) are independent (i.e. the numbers of occurrences in non overlapping time
intervals are independent of each other).
Example. (Misprints on several pages)
Suppose the pages of a book contain misprinted characters, independent of each
other, with a rate of λ = 0, 1 misprints per page. Suppose that the number X(t) of
misprints on any t pages form a Poisson process. Find the probabilities of having
a) no misprints on the first 3 pages
b) at least two misprints on the first two pages.
Solution. a) Since in this case t = 3 and λt = 0, 3, then
P(X(3) = 0) =
0, 3
0
e
−0,3
0!
= e
−0,3
≈ 0, 74
b) In this case t = 2, λt = 0, 2. Hence
P(X(2) ≥ 2) = 1 −[P(X(2) = 0) +P(X(2) = 1)]
357
= 1 −
0, 2
0
e
−0,2
0!

0, 2
1
e
−0,2
1!
≡ 1 −1, 2e
−0,2
≈ 0, 017
The next remark, which is an immediate consequence of the definition of random
processes and Remark 3, says that the number of occurrences in an interval depends
only on the length of the interval. This property is called stationarity.
Remark 4. For any s, t > 0
X(s +t) −X(s) = X(t).
Continuous random variables
The uniform random variable
Definition. A random variable X is said to be uniformly distributed over the
interval [a, b] if its distribution is
X :
_
x
f(x)
_
x∈R
,
where the probability density function is
f(x) =
_
1
b −a
, a ≤ x ≤ b
0, otherwise
Notation: X ∼ U(a, b)
Remark 1. The function f verifies the two conditions of a probability density
function.
Proof. 1) f(x) ≥ 0, ∀ x ∈ R (since a < b)
2)
_

−∞
f(x)dx =
_
b
a
1
b −a
dx =
x
b −a
¸
¸
¸
b
a
=
b −a
b −a
= 1
Remark 2. The distribution function
If X ∼ U(a, b), then
F(x) =
_
¸
_
¸
_
0, x ≤ a
x −a
b −a
, a < x ≤ b
1, x > b.
Proof. F : R → [0, 1], F(x) = P(X < x) =
_
x
−∞
f(t)dt
• if x ≤ a then
F(x) =
_
x
−∞
0dt = 0
• if a < x ≤ b then
F(x) =
_
a
−∞
0dt +
_
x
a
1
b −a
dt = 0 +
t
b −a
¸
¸
¸
x
a
=
x −a
b −a
358
• if b < x then
F(x) =
_
a
−∞
0dt +
_
b
a
1
b −a
dt +
_
x
b
0dt =
b −a
b −a
= 1,
as desired.
Remark 3. If X ∼ U(a, b), then
E(X) =
a +b
2
, V (X) =
(b −a)
2
12
.
Proof.
E(X) =
_


xf(x)dx =
_
b
a
x
1
b −a
dx =
1
b −a

x
2
2
¸
¸
¸
b
a
=
b
2
−a
2
2(b −a)
=
b +a
2
To determine V (X) we compute first E(X
2
).
E(X
2
) =
_

−∞
x
2
f(x)dx =
_
b
a
x
2
1
b −a
dx
=
1
b −a

x
3
3
¸
¸
¸
b
a
=
b
3
−a
3
3(b −a)
=
a
2
+ab +b
2
3
.
Hence, the variance is
V (X) =
a
2
+ab +b
2
3

_
a +b
2
_
2
=
a
2
−2ab +b
2
12
=
(b −a)
2
12
Notation. If a = 0 and b = 1 we obtain the standard uniform random variable.
Example. Buses arrive at a specified station at 10 minute intervals starting a
6 A.M. That is, they arrive at 6, 6:10, 6:20 and so on. If a passenger arrives at the
station at a time that is uniformly distributed between 6 and 6:20, find the probability
that he waits:
a) less than 5 minutes for a bus
b) more than 7 minutes for a bus.
Solution. Let X denote the number of minutes past 6 that the passenger arrives
at the station.
a) The passenger will have to wait less than 5 minutes if (and only if) he arrives
between 6:05 and 6:10 or between 6:15 and 6:20.
Hence, the desired probability is
P(5 < X < 10) +P(15 < X < 20) =
_
10
5
1
20
dx +
_
20
15
1
20
dx =
10
20
=
1
2
b) The passenger will wait more than 7 minutes if he arrives between 6 and 6:03
or between 6:10 and 6:13, so the desired probability:
P(0 < X< 3) +P(10 < X< 13) =
3
20
+
3
20
=
6
20
=
3
10
.
359
The exponential random variable
Definition. A random variable X is said to be an exponential random variable (or
exponentially distributed) with parameter λ, λ > 0 if its probability density function
is
f(x) =
_
λe
−λx
, x > 0
0, x ≤ 0
Remark 1. The function f verifies the two conditions of a probability density
function.
Proof. 1) f(x) ≥ 0, ∀ x ∈ R (since λ > 0)
2)
_

−∞
f(x)dx =
_
0
−∞
0 dx +
_

0
λe
−λx
dx = λ
e
−λx
−λ
¸
¸
¸

0
= λ
_
0 −
1
−λ
_
= 1.
Remark 2. The distribution function
If X is an exponential distribution then
F(X) =
_
0, x ≤ 0
1 −e
−λx
, x > 0
Proof. F : R → [0, 1],
F(x) = P(X < x) =
_
x
−∞
f(t)dt
- if x ≤ 0,
F(x) =
_
x
−∞
0dt = 0
- if x > 0,
F(x) =
_
0
−∞
0dt +
_
x
0
λe
−λt
dt
= −
_
x
0
(e
−λt
)

dt = −e
−λx
+ 1 = 1 −e
−λx
,
as desired.
Remark 3. If X is exponential distributed then
E(X) =
1
λ
and V (X) =
1
λ
2
.
Proof.
E(X) =
_

−∞
xf(x)dx =
_

0
xe
−λx
λdx
By using the following change of variable: y = λx we obtain
E(X) =
_

0
y
λ
e
−y
dy =
1
λ
_

0
y
2−1
e
−λ
dy =
1
λ
Γ(2) =
1
λ
360
To determine V (X) we compute first E(X
2
) (in computing the corresponding
integral, we will use the same change of variable as before).
E(X
2
) =
_

0
x
2
e
−λx
λdx =
_

0
y
2
λ
2
e
−y
dy
=
1
λ
2
_

0
y
3−1
e
−y
dy =
1
λ
2
Γ(3) =
2
λ
2
Hence, the variance is
V (X) = E(X
2
) −(E(X))
2
=
2
λ
2

1
λ
2
=
1
λ
2
Remark 4. The memoryless property
If X is an exponential random variable then:
P(X > s +t[X > t) = P(X > s), for all s, t ≥ 0.
Proof.
P(X > s +t[X > t) =
P(X > s +t, X > t)
P(X > t)
=
P(X > s +t)
P(X > t)
=
1 −P(X ≤ s +t)
1 −P(X ≤ t)
=
1 −P(X < s +t)
1 −P(X < t)
=
1 −F(s +t)
1 −F(t)
=
e
−λ(s+t)
e
−λt
= e
−λs
= 1 −F(s) = 1 −P(X < s)
= P(X ≥ s) = P(X > s).
To understand why the previous equality is called the memoryless property, con-
sider that X represents the length of time that an item functions before failing. The
previous equality says that the probability that an item functioning at age t will con-
tinue to function for at least an additional time s is the same as of a new item to
function at least a period of time equal to s.
We can say the mentioned equality says that an functional item is ”as good as
new”. It can be shown that the exponential random variables are the only continuous
random variables that are memoryless.
Then is a very important relationship between the exponential random variables
and the Poisson process.
The next remark shows that in a Poisson process, the ”waiting time” for an occur-
rence and ”the time between two any consecutive occurrences (the interarrival time)
have the same exponential distribution with parameter λ.
Remark 5. Let (X(t))
t≥0
be a Poisson process with rate λ, λ > 0.
a) If s ≥ 0 and if T
1
is the random variable which represents the length of time till
the first occurrence after s then T
1
is an exponential random variable with parameter
λ.
361
b) Suppose we have an occurrence at time s ≥ 0. Let T
2
be the random variable
which represents the time between this occurrence and the next one. Then T
2
is an
exponential random variable with parameter λ.
Proof. a) Let t > 0. We have to compute P(T
1
< t).
P(T
1
< t) = P(T
1
≤ t) = 1 −P(T
1
> t).
Clearly, for any t > 0, the waiting time T
1
is greater than t, if and only if there is
no occurrence in the time interval (s, s +t]. Thus:
P(T
1
< t) = 1 −P(T
1
> t) = 1 −P(X(s +t) −X(s) = 0)
= 1 −P(X(t) = 0) = 1 −
(λt)
0
0!
e
−λt
= 1 −e
−λt
The previous equality together with P(T
1
< t) = 0 for t ≤ 0, shows that T
1
has
the distribution function of an exponential random variable with parameter λ.
b) We have that P(T
2
< t) = 0 for t ≤ 0.
Let t > 0. We have to compute P(T
2
< t).
Instead of assuming that we have an occurrence at time s, we assume that we have
an occurrence in the time interval [s −∆s, s] and let ∆s → 0. Then
P(T
2
< t) = P(T
2
≤ t) = 1 −P(T
2
> t)
= 1 − lim
∆s→0
P(X(s +t) −X(s) = 0[X(s) −X(s −∆s) = 1)
= 1 −P(X(s +t) −X(s) = 0) = 1 −P(X(t) = 0) =1 −e
−λt
.
Thus T
2
has the distribution function of an exponential random variable.
Remark 6. If in a Poisson process the rate is λ, that is the mean number of
occurrences per unit time is λ, then the mean interoccurrence time is
1
λ
.
Example. A checkout counter at a supermarket completes the process according
to an exponential random variable with a service rate of 15/hour. A customer arrives
at the checkout counter. Find the following probabilities.
a) the service is completed in more than 5 minutes;
b) the customer has to wait more than 8 minutes knowing that he already waited
3 minutes;
c) the service is completed in a time between 5 and 8 minutes.
Solution. a) We have first to convert the service rate so that the time period is
1 minute.
The service rate is λ = 0, 25/minute.
a) P(X > 5) = 1 −P(X ≤ 5) = 1 −P(X < 5)
= 1 −1 +e
−0,25·5
= e
−1,25
.
b) P(X > 8[X > 3) =
P(X > 8, X > 3)
P(X > 3)
=
P(X > 8)
P(X > 3)
=
1 −1 +e
−0,25·8
1 −1 +e
−0,25·3
= e
−0,25·5
= e
−1,25
,
362
as we expected according to the ”memoryless property” of the exponential random
variable.
c) P(5 < X < 8) =
_
8
5
f(x)dx =
_
8
5
λe
−λx
dx = −e
−λx
¸
¸
¸
8
5
= e
−0,25·5
−e
−0,25·8
= e
−1,25
−e
−2
.
The Erlang random variable
The Erlang distribution is a generalization of the exponential distribution. While
the exponential random variable describes the time between two consecutive events,
the Erlang random variable describes the time interval between any event and the k
th
following event.
Definition. A random variable X is said to be an Erlang random variable with
parameters λ and n (λ > 0, ∈ N

) if it has the following distribution
X :
_
x
f(x, n, λ)
_
x∈R
where
f(x, n, λ) =
_
_
_
λ
n
x
n−1
e
−λx
(n −1)!
, n = 1, 2, 3, . . . , x ≥ 0
0, x < 0
Remark 1. The function f(, n, λ) verifies the two conditions of a probability
density function.
Proof. 1) f(x, n, λ) ≥ 0, ∀ x ∈ R (since λ > 0).
2) In order to compute the integral
_

−∞
f(x, n, λ)dx we will use the Gamma
function. If we let t = λx, then
_

−∞
f(x, n, λ)dx =
λ
n
(n −1)!
_

0
x
n−1
e
−λx
dx
=
λ
n
(n −1)!
_

0
_
t
λ
_
n−1
e
−t
1
λ
dt
=
λ
n
(n −1)!

1
λ
n
_

0
t
n−1
e
−t
dt
=
1
(n −1)!
Γ(n) =
1
(n −1)!
(n −1)! = 1,
as we needed.
Remark 2. If X is an Erlang random variable, then
E(X) =
n
λ
and V (X) =
n
λ
2
.
363
Proof. In the next computations we will make the same change of variable as in
the proof of the previous remark.
E(X) =
_

−∞
xf(x, n, λ)dx =
λ
n
(n −1)!
_

0
x x
n−1
e
−λx
dx
=
λ
n
(n −1)!
_

0
t
n
λ
n
e
−t

1
λ
dt =
1
(n −1)!

1
λ
Γ(n + 1)
=
1
(n −1)!

1
λ
n! =
n
λ
.
Note that this is n times the expected value of the exponential distribution (with
parameter λ).
Similarly,
E(X
2
) =
1
(n −1)!

1
λ
2
Γ(n + 2) =
n(n + 1)
λ
2
.
Therefore, the variance of X is:
V (X) =
n(n + 1)
λ
2

n
2
λ
2
=
n
λ
2
.
Remark 3. The distribution function of an Erlang random variable
If X is an Erlang random variable with parameters n and λ then
F(x) =
_
¸
¸
_
¸
¸
_
1 −
n−1

k=0
(λx)
k
e
−λx
k!
, x ≥ 0
0, x < 0
364
Proof. Let x > 0.
F(x) =
_
x
−∞
f(x, n, λ)dx =
λ
n
(n −1)!
_
x
0
t
n−1
e
−λt
dt
=
1
(n −1)!
_
x
0
(λt)
n−1
e
−λt
λdt =
1
(n −1)!
_
λx
0
u
n−1
e
−u
du
=
1
(n −1)!
_
λx
0
u
n−1
(−e
−u
)

du
=
1
(n −1)!
_
−(λx)
n−1
e
−λx
+ (n −1)
_
λx
0
u
n−2
e
−u
du
_
=
1
(n −2)!
_
λx
0
u
n−2
e
−u
du −
(λx)
n−1
e
−λx
(n −1)!
=
1
(n −3)!
_
λx
0
u
n−3
e
−u
du −
(λx)
n−2
e
−λx
(n −2)!

(λx)
n−1
e
−λx
(n −1)!
= =
=
_
λx
0
ue
−u
du −
n

k=2
(λx)
k
e
−λx
k!
= −ue
−u
¸
¸
¸
λx
0
+
_
λx
0
(−e
−u
)

du −
n

k=2
(λx)
k
e
−λx
k!
= 1 −
n

k=0
(λx)
k
e
−λx
k!
,
as desired.
Example. The lengths of phone calls at a certain phone booth are exponentially
distributed with a mean of 4 minutes. I arrived at the booth while Ana was using the
phone, and I was told that she already spent 2 minutes on the call before I arrived.
a) What is the average time I will wait until she ends her call?
b) What is the probability that Ana’s call will last between 3 and 6 minutes after
my arrival.
c) Assume that I am the first in line at the booth to use the phone after Ana, and
by the time she finished her call more than 4 people were waiting to use the phone.
What is the probability that the time between I start using the phone and the time
the fourth person behind me starts his/her call is greater than 15 minutes?
Solution. Let X be the random variable which represents the lengths of calls at
the phone booth. Then
f
X
(x) =
_
λe
−λx
, x ≥ 0
0, x < 0
λ =
1
4
a) Due to the memoryless property of the exponential random variable, the average
time I wait until Ana’s call ends is 4 minutes.
b) Due to the memoryless property of the exponential random variable, the proba-
bility that Ana’s call lasts between 3 and 6 minutes after my arrival is the probability
365
that an arbitrary call lasts between 3 and 6 minutes, which is
P(3 < X < 6) =
_
6
3
λe
−λx
dx = (−e
−λx
)
¸
¸
¸
6
3
= e
−3λ
−e
−6λ
= e

3
4
−e

6
4
≈ 0.2492
c) Let Y the random variable that represents the time between I start my phone
call until the fourth person starts his/her call. Then Y is an Erlang random variable
with parameters n = 4 and λ =
1
4
. Then,
P(Y > 15) = 1 −P(Y ≤ 15) = 1 −P(Y < 15)
= 1 −F
Y
(15) = 1 −1 +
3

k=0
(λ 15)
k
k!
e
−λ·15
= e

15
4
_
1 + 15
1
4
+
1
2!
_
15
1
4
_
2
+
1
3!
_
15
1
4
_
3
_
≈ 0, 4838
The normal random variable
Definition. A random variable X is said to be a normal random variable with
parameters m and σ (m ∈ R, σ > 0) if it has the following distribution:
X :
_
x
f(x; m, σ)
_
x∈R
where
f(x, m, σ) =
1

2πσ
e

(x−m)
2

2
.
Notation: X ∼ N(m, σ
2
).
The normal r.v. is also called the Laplace-Gauss random variable.
Remark 1. If X ∼ N(m, σ
2
) then its density function is a bell-shaped curve (or
Gauss curve) that is symmetric with respect to the line x = m (see figure below).
366
¸
`
m− 2σ m− σ m m + σ m + 2σ
1

2πσ

399
σ
The normal distribution was introduced by the French mathematician Abraham
de Moivre in 1733 and was used by him to approximate probabilities associated with
binomial random variables when n (the binomial parameter) is large. This result was
later extended by Gauss and Laplace.
Remark 2. The function f(, m, σ) verifies the two conditions of a probability
density function.
Proof. 1) f(x, m, σ) > 0, ∀ x ∈ R (since σ > 0)
2) In order to compute the integral
_

−∞
f(x, m, σ)dx =
1

2πσ
_

−∞
e

(x−m)
2

2
dx
we make the following change of variable:
x −m


= u
with x = m+

2σu and dx =

2σdu.
Hence:
_

−∞
f(x, m, σ)dx =
1

2πσ
_

−∞
e
−u
2

2σdu
=
2

π
_

0
e
−u
2
du =
2

π


π
2
= 1,
as desired.
In establishing the previous equality we used the value of the Euler-Poisson inte-
gral:
_

0
e
−u
2
du =

π
2
.
367
Remark 3. If X ∼ N(m, σ
2
), m ∈ R, σ > 0 then
E(X) = m and V (X) = σ
2
.
Proof.
E(X) =
1

2πσ
_

−∞
xe
(x−m)
2

2
dx
=
1

2πσ
_

−∞
(x −m)e
(x−m)
2

2
dx +
m

2πσ
_

−∞
e

(x−m)
2

2

Letting y = x −m in the first integral yields
E(X) =
1

2πσ
_

−∞
ye

y
2

2
dy +m
_

−∞
f(x, m, σ)dσ
where f(, m, σ) is the normal density. By symmetry, the first integral must be zero,
so
E(X) = m
_

−∞
f(x, m, σ)dσ = m 1 = m.
Since E(X) = m, we have that
V (X) = E((X −m)
2
) =
1

2πσ
_

−∞
(x −m)
2
e

(x−m)
2

2
dx
=
1

2πσ
_

−∞
σ
2
u
2
e

u
2
2
σdu
=
σ
2


_

−∞
u
_
−e

u
2
2
_

du
=
σ
2


_
−ue

u
2
2
¸
¸
¸

−∞
+
_

−∞
e

u
2
2
du
_
=
σ
2


2
_

0
e

u
2
2
du =

2


_
π
2
= σ
2
.
Remark 4. a) If X ∼ N(m, σ
2
), m ∈ R, σ > 0 then
Y = aX +b ∼ N(am+b, a
2
σ
2
) (a ,= 0, b ∈ R).
b) If X ∼ N(m, σ
2
), m ∈ R, σ > 0 then
Z =
X −m
σ
∼ N(0, 1)
The random variable Z ∼ N(0, 1) is called a standard normal random variable.
Proof. We can suppose that a > 0 (the proof for a < 0 is quite similar).
Let F
Y
be the cumulative distribution function of the r.v. Y
F
Y
(y) = P(aX +b < x) = P
_
X <
x −b
a
_
= F
X
_
x −b
a
_
.
368
The density probability function of Y is obtained by differentiating the previous
equality
f
Y
(y) =
1
a
f
X
_
x −b
a
_
=
1

2πσa
e

(
x−b
a
−m
)
2

2
=
1

2πσa
e

(x−(am+b))
2

2
a
2
Hence, Y is normal with mean am+b and variance a
2
σ
2
.
b) Take a =
1
σ
and b = −
m
σ
in the part a).
Remark 5. (The distribution function of a normal random variable)
Let φ : R →R
φ(z) =
1


_
z
0
e

y
2
2
dy
be the Laplace function (whose values can be found in tables).
If X ∼ N(m, σ
2
), m ∈ R, σ > 0 then the distribution function of the r.v. X is
given by
F(x) =
1
2

_
x −m
σ
_
.
Also, we have
i) P(a < X < b) = φ
_
b −m
σ
_
−φ
_
a −m
σ
_
ii) P([X −m[ < r) = 2φ
_
r
σ
_
with the following particular cases:
P([X −m[ < σ) = 2φ(1) = 0, 6826
P([X −m[ < 2σ) = 2φ(2) = 0, 9544
P([X −m[ < 3σ) = 2φ(3) = 0, 9972
Proof. We shall list first some properties of the Laplace function:
a) φ(0) = 0
b) lim
z→∞
φ(z) =
1
2
c) lim
z→−∞
φ(z) = −
1
2
d) φ(−z) = −φ(z), ∀ z ∈ R.
Let F : R → [0, 1] be the distribution function of the r.v. X ∼ N(m, σ
2
). Then
F(x) = P(X < x) =
_
x
−∞
f(t; m, σ)dt
=
1

2πσ
_
x
−∞
e

(t−m)
2

2
dt
369
By making the following change of variable:
t −m
σ
= y, dt = σdy
then
F(x) =
1

2πσ
_ x−m
σ
−∞
e

y
2
2
σdy =
1


_ x−m
σ
−∞
e

y
2
2
dy
=
1


_
0
−∞
e

y
2
2
dy +
1


_ x−m
σ
0
e

y
2
2
dy =
1
2

_
x −m
σ
_
i) P(a < X < b) = F(b) −F(a) = φ
_
b −m
σ
_
−φ
_
a −m
σ
_
ii) P([X −m[ < r) = P(−r < X −m < r) = P(m−r < X < m+r)
= φ
_
m+r −m
σ
_
−φ
_
m−r −m
σ
_
= φ
_
r
σ
_
−φ
_

r
σ
_
= φ
_
r
σ
_

_
r
σ
_
= 2φ
_
r
σ
_
.
The particular cases are obtained from the previous equality by taking r = σ,
r = 2σ and r = 3σ.
Example. An expert in a paternity suit testifies that the length of pregnancy
is approximately normally distributed with parameters m = 270 and σ = 10. The
defendent is able to prove that he wasn’t in the country for a period that began 290
days before the birth of the child and ended 240 days before the birth. What is the
probability that the mother could have had a very long or a very short pregnancy as
mentioned before.
Solution. Let X denote the length of pregnancy in days. If he is the father, the
probability that the birth could occur within the indicated period is
P(X < 240 or X > 290) = P(X < 240) +P(X > 290)
= F(240) + 1 −F(290) =
1
2

_
240 −270
10
_
+ 1 −
1
2
−φ
_
290 −270
10
_
= 1 +φ(−3) −φ(2) = 1 −φ(3) −φ(2) ≈ 0, 0241
Next, we will present the De Moivre-Laplace limit theorem which states that when
n is large a binomial random variable with parameters n and p will have approximately
the same distribution as a normal random variable with the same mean and variance
(as the binomial).
De Moivre (1733) proved this result for the particular case p =
1
2
and Laplace
(1812) generalized it to general p.
370
Theorem. (De Moivre-Laplace limit theorem)
If X
n
∼ B(n, p) then, for any a, b ∈ R, a < b we have
lim
n→∞
P
_
a ≤
X
n
−np
_
np(1 −p)
< b
_
= φ(b) −φ(a).
Remark 6. (Normal approximation with continuity correction)
If X
n
∼ B(n, p) then for any integers i, j, 0 ≤ i ≤ j ≤ n then
P(i ≤ X
n
≤ j) ≈ φ
_
_
_
j +
1
2
−np
_
np(1 −p)
_
_
_−φ
_
_
_
i −
1
2
−np
_
np(1 −p)
_
_
_.
In general, we make the following adjustments
P(X
n
≤ j) ≈
1
2

_
_
_
j +
1
2
−np
_
np(1 −p)
_
_
_
P(X
n
< j) ≈
1
2

_
_
_
j −
1
2
−np
_
np(1 −p)
_
_
_
Hence, we have two possible approximation to binomial probabilities:
- if n is large and p small such that np is moderate in size we can use the Poisson
approximation
- if np(1 − p) is large (usually for np and n(1 − p) ≥ 5) we can use the normal
approximation.
Example. Each item produced by a manufacturer is, independently, of good qual-
ity with probability 0,95. Approximate the probability of the event that at most 40
of the next 1000 items are of bad quality.
Solution. Let X be the random variable which represents the items of bad quality
among the next 1000 produced. It is obvious X ∼ B(1000; 0, 05). The expected value
of the binomial is np = 1000 0, 05 = 50 and the variance is
np(1 −p) = 1000 0, 05 0, 95 = 47, 5
P(X ≤ 40) = P
_
X ≤ 40 +
1
2
_
≈ P
_
_
_
X −50

47, 5

40 +
1
2
−50

47, 5
_
_
_
=
1
2
+φ(−1, 42) =
1
2
−φ(1, 42) = 0, 5 −0, 422 = 0, 078
371
The Gamma random variable
A gamma distribution is a generalization of the Erland distribution where n = a.
which may not be an integer. This distribution has a lot of applications in statistics.
Definition. A random variable X is said to be a gamma random variable with
two parameters a > 0 (shape parameter) and b > 0 (called the scale parameter) if it
has the following distribution:
X :
_
x
f(x; a, b)
_
x∈R
where
f(x; a, b) =
_
_
_
1
Γ(a)b
a
x
a−1
e

x
b
, x > 0
0, x ≤ 0.
Notation: X ∼ Γ(a, b).
Remark 1. The function f(, a, b) verifies the two conditions of a probability
density function.
Proof. 1) f(a, x, b) ≥ 0, ∀ x ∈ R (since a, b > 0).
2) In order to compute the integral
_

−∞
f(x, a, b)dx we make the following change
of variable
x
b
= y with x = by and dx = bdy. Hence
_

−∞
f(x, a, b)dx =
1
Γ(a)b
a
_

0
b
a−1
y
a−1
e
−y
bdy
=
1
Γ(a)
_

0
y
a−1
e
−y
dy =
1
Γ(a)
Γ(a) = 1,
as desired.
Remark 2. If X ∼ Γ(a, b) then E(X) = ab and V (X) = ab
2
.
Proof. In the next computations we will make the same change of variable as in
the proof of the previous remark.
E(X) =
_

−∞
xf(x, a, b)dx =
1
Γ(a)b
a
_

0
xx
a−1
e

x
b
dx
=
1
Γ(a)b
a
_

0
b
a
y
a
e
−y
bdy
=
b
a+1
Γ(a)b
a
Γ(a + 1) =
baΓ(a)
Γ(a)
= ab.
372
Similarly,
E(X
2
) =
_

−∞
x
2
f(x, a, b)dx =
1
Γ(a)b
a
_

0
x
a+1
e

x
b
dx
=
1
Γ(a)b
a
_

0
b
a+1
y
a+1
e
−y
bdy
=
b
2
Γ(a)
Γ(a + 2) =
b
2
(a + 1)aΓ(a)
Γ(a)
= a(a + 1)b
2
.
Therefore, the variance of X is:
V (X) = E(X
2
) −(E(X))
2
= a(a + 1)b
2
−a
2
b
2
= ab
2
.
Remark 3. (The distribution function of a Gamma random variable)
If X is a gamma random variable with parameters a and b, then
F(x) =
_
_
_
0, x ≤ 0
1
Γ(a)
_ x
b
0
u
a−1
e
−u
du.
Proof. For each x > 0 we have
F(x) = P(X < x) =
_
x
−∞
f(t, a, b)dt =
1
Γ(a)b
a
_
x
0
t
a−1
e

t
b
dt
=
1
Γ(a)b
a
_ x
b
0
b
a−1
u
a−1
e
−u
bdu =
1
Γ(a)
_ x
b
0
u
a−1
e
−u
du.
Gamma random variables, with values of the parameter a not just integers or half-
integers are often use to model continuous random variables with an approximately
known distribution on (0, ∞).
An important case in which we obtain a gamma random variable is described in
the following remark:
Remark 4. (Square of a normal random variable)
If X ∼ N(0, σ
2
) then Y = X
2
∼ Γ
_
1
2
,
1

2
_
.
Proof. The distribution function F
Y
of Y is given by
F
Y
(y) = P(X
2
≤ y) =
_
P(−

y ≤ X ≤

y), if y > 0
0, if y ≤ 0
=
_
F
X
(

y) −F
X
(−

y), if y > 0
0, if y ≤ 0
373
Hence, for y > 0 we have
f
Y
(y) = F

Y
(y) =
1
2

y
(f
X
(

y) +f
X
(−

y))
=
1
2

y

2

2πσ
e

y

2
=
1

2πσ
y

1
2
e

y

2
=
1
Γ
_
1
2
_
(2σ
2
)
1
2
y
1
2
−1
e

y

2
In conclusion this density is gamma with a =
1
2
and λ =
1

2
.
Remark 5. Sum of independent gamma variables
If X
1
∼ Γ(a
1
, b) and X
2
∼ Γ(a
2
, b) are independent then
X
1
+X
2
∼ Γ(a
1
+a
2
, b).
Proof. Appendix B.
The Beta random variable
Similarly, continuous random variables with unknown distribution on [0, 1] are
often modeled by Beta random variables.
Definition. A random variable X is said to be a Beta random variable with
parameters a and b (a > 0, b > 0) if it has the following distribution:
X :
_
x
f(x; a, b)
_
x∈R
where
f(x; a, b) =
_
_
_
1
B(a, b)
x
a−1
(1 −x)
b−1
, x ∈ [0, 1]
0, otherwise
Remark 1. The function f(, a, b) verifies the two conditions of a probability
density function.
Proof. 1) f(x, a, b) ≥ 0, ∀ x ∈ R (since a, b > 0).
2)
_

−∞
f(x, a, b)dx =
1
B(a, b)
_
1
0
x
a−1
(1 −x)
b−1
dx =
1
B(a, b)
B(a, b) = 1.
Remark 2. If X is a Beta r.v. then
E(X) =
a
a +b
, V (X) =
ab
(a +b)
2
(a +b + 1)
.
374
Proof.
E(X) =
_
R
xf(x, a, b)dx =
1
B(a, b)
_
1
0
x
a
(1 −x)
b−1
dx
=
1
B(a, b)
B(a + 1, b) =
1
B(a, b)

a + 1 −1
a + 1 +b −1
B(a, b) =
a
a +b
E(X
2
) =
1
B(a, b)
_
1
0
x
a+1
(1 −x)
b−1
dx =
1
B(a, b)
B(a + 2, b)
=
1
B(a, b)

a + 1
a +b + 1
B(a + 1, b)
=
1
B(a, b)

a + 1
a +b + 1

a
a +b
B(a, b)
=
a + 1
a +b + 1

a
a +b
.
Hence,
V (X) = E(X
2
) −(E(X))
2
=
a + 1
a +b + 1

a
a +b

a
2
a +b
=
a
a +b
_
a + 1
a +b + 1

a
a +b
_
=
ab
(a +b)
2
(a +b + 1)
.
Remark 3. (The distribution function of a Beta random
variable)
If X is a Beta r.v. then
F(x) =
_
¸
¸
_
¸
¸
_
0, x < 0
1
B(a, b)
_
x
0
t
a−1
(1 −t)
b−1
dt, x ∈ [0, 1]
1, x > 1
The Chi-square random variable
The Chi-square distribution straddle the exponential and the normal distribution.
If in the Gamma distribution we take a =
n
2
and b = 2σ
2
we obtain the Chi-square
distribution.
Definition. A random variable X is said to be a Chi-square r.v. or χ
2
random
variable with n degrees of freedom (n ∈ N

) and parameter σ (σ > 0) if it has the
following distribution
X :
_
x
f(x; n, σ)
_
x∈R
where
f(x; n, σ) =
_
¸
_
¸
_
1
(2σ
2
)
n
2
Γ
_
n
2
_x
n
2
−1
e

x

2
, x > 0
0, x ≤ 0
375
By using the properties of the Gamma distribution we easily get that:
Remark 1. The function f(, n, σ) verifies the two conditions of the probability
density function.
Remark 2. If X is a χ
2
random variable then
E(X) = nσ
2
and V (X) = 2nσ
4
.
Remark 3. (The distribution function of a χ
2
random
variable)
If X is a χ
2
random variable then
F(x) =
_
¸
¸
_
¸
¸
_
0, x ≤ 0
1
Γ
_
n
2
_
_ x

2
0
u
n
2
−1
e
−u
du, x > 0
Remark 4. If X
1
, X
2
, . . . , X
n
are independent normal variables,
X
i
∼ N(0, σ
2
), i = 1, n then the random variable X defined by
X = X
2
1
+ +X
2
n
is a Chi-squared r.v. with n degrees of freedom and parameter σ.
Proof. Since X
i
, i = 1, n, is normal then X
2
i
∼ Γ
_
1
2
,
1

2
_
(see Remark 4 from
the subsection The Gamma random variable) and
X = X
2
1
+X
2
2
+ +X
2
n
∼ Γ
_
n
2
,
1

2
_
(see Remark 5 from the subsection The Gamma random variable), as we needed.
The Chi-Square distribution is used in certain statistical inference problems.
The lognormal random variable
Definition. A random variable X is said to be a lognormal random variable with
parameters m and σ (m > 0, σ > 0) if it has the following distribution:
X :
_
x
f(x; m, σ)
_
x∈R
where
f(x; m, σ) =
_
_
_
1

2πσx
e

1

2
ln
2 x
m
, x > 0
0, x ≤ 0
Remark 1. The function f(, m, σ) verifies the two conditions of a probability
density function.
376
Proof. By using the following change of variable y =
1
σ
ln
x
m
with x = me
σy
and
dx = mσe
σy
dy we obtain
_

−∞
f(x; , m, σ)dx =
1

2πσ
_

0
1
x
e

1

2
ln
2 x
m
dx
=
1

2πσ
_

−∞
1
m
e
−σy
e

1
2
y
2
mσe
σy
dy
=
1


_

−∞
e

y
2
2
=




= 1.
Remark 2. If X is a lognormal random variable then
E(X) = me
σ
2
2
and V (X) = m
2
e
σ
2
(e
σ
2
−1).
Proof. By using the same change of variable as in the previous remark we obtain:
E(X) =
_

−∞
xf(x; m, σ)dx =
1

2πσ
_

0
e

1

2
ln
2 x
m
dx
=
1

2πσ
_

0
e

y
2
2
mσe
σy
dy =
m


_

0
e

y
2
2
+σy−
σ
2
2
+
σ
2
2
dy
=
me
σ
2
2


_

0
e

(y−σ)
2
2
dy =
m


e
σ
2
2

2π = me
σ
2
2
.
Similarly,
E(X
2
) =
1

2πσ
_

−∞
me
σy
e

y
2
2
mσe
σy
dy
=
m
2


_

−∞
e

y
2
2
+2σy−2σ
2
+2σ
2
dy
=
m
2
e

2


_

−∞
e

(y−2σ)
2
2
dy =
m
2


e

2


= m
2
e

2
In conclusion:
V (X) = E(X
2
) −(E(X))
2
= m
2
e

2

_
me
σ
2
2
_
2
= m
2
e

2
−m
2
e
σ
2
= m
2
e
σ
2
(e
σ
2
−1).
Remark 3. If Y is the random variable defined by Y = ln X where X is a
lognormal random variable with parameters m > 0 and σ > 0 then
E(Y ) = E(ln X) = ln m and V (X) = V (ln Y ) = σ.
377
Proof. We will use the same change of variable as in the previous remarks
E(Y ) = E(ln X) =
_

−∞
ln x f(x, m, σ)dx
=
1

2πσ
_

0
ln x
1
x
e

1

2
ln
2 x
m
dx
=
1

2πσ
_

−∞
(yσ + ln m)
1
m
e
−σy
e

1
2
y
2
mσe
σy
dy
=
σ


_

−∞
ye

y
2
2
dy +
ln m


_

−∞
e

1
2
y
2
dy
= 0 +
ln m



2π = ln m.
Similarly
E(Y
2
) = E(ln
2
X) =
1


_

−∞
(yσ + ln m)
2
e

1
2
y
2
dy
=
σ
2


_

−∞
y
2
e

y
2
2
dy +
ln
2
m


_

−∞
e

1
2
y
2
dy
=
σ
2


_

−∞
y
_
−e
−y
2
2
_

dy +
ln
2
m




=
σ
2


_
−ye

y
2
2
¸
¸
¸

−∞
+
_

−∞
e

y
2
2
dy
_
+ ln
2
m
= σ
2
+ ln
2
m.
In conclusion:
V (Y ) = V (ln X) = σ
2
+ ln
2
m−ln
2
m = σ
2
,
as desired.
Remark 4. If Y ∼ N(0, 1), m > 0 and σ > 0 then the random variable X defined
by X = me
σY
is a lognormal random variable with parameters m and σ.
Proof. Let Y ∼ N(0, 1), m > 0 and σ > 0.
We have to determine the probability density function of the random variable X.
First,we will determine the distribution function of the random variable X.
If x ≤ 0 then
F
X
(x) = P(X < x) = P(me
σY
< x) = 0.
For x > 0 we get
F
X
(x) = P(X < x) = P(me
σY
< x) = P
_
e
σY
<
x
m
_
= P
_
σy < ln
x
m
_
= P
_
Y <
1
σ
ln
x
m
_
= F
Y
_
1
σ
ln
x
m
_
.
378
By differentiating the distribution function of X we get the probability density
function f
X
.
Hence, for x ≤ 0 we have f
X
(x) = 0 and for x > 0
f
X
(x) = F

X
(x) = f
Y
_
1
σ
ln
x
m
_
1
σx
=
1


1


e

1

ln
2 x
m
,
which is the probability density function of a lognormal random variable with param-
eters m and σ.
Remark 5. If X is a lognormal random variable with parameters m > 0 and
σ > 0 then the random variable Y defined by
Y =
1
σ
ln
X
m
is a standard normal random variable (Y ∼ N(0, 1)).
Proof. Let m > 0, σ > 0 and X a lognormal r.v. with parameters m and σ.
We have to determine the probability density function of the random variable Y .
We determine first the distribution of the random variable Y
F
Y
(y) = P(Y < y) = P
_
1
σ
ln
X
m
< y
_
= P
_
ln
x
m
< σy
_
= P
_
X
m
< e
σy
_
= P(X < me
σy
) = F
X
(me
σy
).
By differentiating the distribution function of Y we get the probability density
function f
Y
f
Y
(y) = F

Y
(y) = F

X
(me
σy
)me
σy
σ = f
X
(me
σy
)mσe
σy
= mσe
σy
1
σme
σy


e

1

2
ln
2
e
σy
=
1


e

1

2
·σ
2
y
2
=
1


e

y
2
2
which is the probability density function of a standard random variable.
379
Appendix A
Notions of topology in R
n
A metric space is an ordered pair (X, d) where X is an non-empty set and d is
a metric (that is a function):
d : X X →R
such that
D1) d(x, y) ≥ 0, ∀ x, y ∈ X (non-negativity)
d(x, y) = 0 if and only if x = y
D2) d(x, y) = d(y, x), ∀ x, y ∈ X (symmetry)
D3) d(x, z) ≤ d(x, y) +d(y, z), ∀ x, y, z ∈ X (triangle inequality).
The notion of a metric is a generalization of the Euclidean distance in R
n
defined
as
d(x, y) =
_
(x
1
−y
1
)
2
+ + (x
n
−y
n
)
2
, ∀ x = (x
1
, . . . , x
n
) ∈ R
n
,
∀ y = (y
1
, . . . , y
n
) ∈ R
n
.
Remark. If d is the Euclidean distance defined before, then (R
n
, d) is a metric
space called the Euclidean metric space R
n
.
A metric space also induces topological properties like open and closed sets which
leads to the study of more abstract topological spaces.
Definition. (Topological space)
A topological space is a pair (X, T ) where X is an nonempty set and T ⊂ T(X),
satisfying the following axioms:
T1) ∅, X ∈ T
T2) The union of any collection of sets in T is also in T .
T3) The intersection of any finite collection of sets in T is also in T .
The collection T is called a topology on X. The sets in T are called open sets
and their complements in X are called closed sets.
Every metric space is a topological space.
Below, we will present the manner in which the Euclidean metric on R
n
induces
a topological structure on R
n
.
Definitions
1. Open ball in R
n
Let a = (a
1
, . . . , a
n
) ∈ R
n
and let r > 0.
The open ball of radius r and center a is the set
B(a, r) = ¦x ∈ R
n
[ d(a, x) < r¦.
Some particular cases:
R : B(a, r) = (a −r, a +r).
380
If n = 1, the open ball of radius r and center a is the interval of center a and
radius r. Indeed, if n = 1 then x ∈ B(a, r) iff [x −a[ < r which gives us:
−r < x −a < r or a −r < x < a +r, as desired.
R
2
: B(a, r) is the disc centered at a and radius r.
R
3
: B(a, r) is the interior of a sphere centered at a and radius r.
2. Vicinity or neighbourhood of a ∈ R
n
Let a ∈ R
n
. A set V ⊆ R
n
is called a vicinity (or neighbourhood) of the point
a iff there is r > 0 such that B(a, r) ⊂ V .
By 1(a) we denote the set of all vicinities of the point a, i.e.
1(a) = ¦V ⊂ R
n
[ ∃ r > 0 : B(a, r) ⊂ V ¦.
3. Open set in R
n
A set G ⊆ R
n
is called an open set iff for every x ∈ G, G is a vicinity of x i.e.
∀ x ∈ G, ∃ r > 0: B(x, r) ⊂ G.
Remark. If we denote by T = ¦G ⊂ R
n
[ G - open set¦ then (R
n
, T ) is a topo-
logical space. A topological space which can arise in this way from a metric space is
called a metrizable space.
4. Closed set in R
n
A set F ⊂ R
n
is called a closed set in R
n
iff R
n
¸ F is an open set.
5. Domain in R
n
A set D ⊂ R
n
is called a domain if it is an open set and a connected one (in one
piece).
6. Accumulation point (limit point)
The point a is an accumulation point (or limit point) of the set A ⊂ R
n
iff
every open ball centered at a contains at least one element of A, other than a.
In the previous definition there is no condition on the membership of a in A.
Hence, if A ⊆ R
n
then a ∈ R
n
is an accumulation point of the set A iff ∀ r > 0,
B(a, r) ∩ A¸ ¦a¦ ,= ∅.
We will denote by A

the set of accumulation points of the given set A. Hence, if
A ⊆ R
n
A

= ¦a ∈ R
n
[ ∀ r > 0 : B(a, r) ∩ A¸ ¦a¦ ,= ∅¦.
381
7. Isolated point
Let A ⊂ R
n
and let a ∈ A. A point a ∈ A is an isolated point for A iff there is
an open ball centered at a which contains only the point a from A. In other words, a
point a from A is an isolated point for A if it isn’t an accumulation point of the given
set.
Hence, if A ⊂ R
n
and a ∈ A, a is an isolated point of the set A iff ∃ r > 0:
B(a, r) ∩ A = ¦a¦.
8. Boundary
Let A ⊂ R
n
. The boundary of A consists of all points a ∈ R
n
for which there is
an open ball of positive radius which intersects both A and R
n
¸ A.
We denote the boundary of A by frA.
Hence, if A ⊂ R
n
then
frA = ¦a ∈ R
n
[ ∃ r > 0 : B(a, r) ∩ A ,= ∅, B(a, r) ∩ (R
n
¸ A) ,= ∅¦.
9. Bounded set
The set A ⊂ R
n
is a bounded set if there is M > 0 such that A ⊂ B(0, M).
Hence A ⊂ R
n
is bounded iff ∃ M > 0 such that ∀ a ∈ A, d(0, a) < M.
Example. Let A ⊆ R
2
,
A = ¦(x, y) ∈ R
2
[ x
2
+y
2
≤ 9, x > 0, y > 0¦ ∪ ¦(3, 3)¦.
`
¸
y
x
(3,3)
1
2
3
1 2 3
(1,1)
O
(3,3) - an isolated point
(1,1) - an accumulation point which belongs to A
382
(0, 1)
_
a boundary point
an accumulation point which doesn’t belong to A
A

= ¦(x, y) ∈ R
2
[ x
2
+y
2
≤ 9, x ≥ 0, y ≥ 0¦
frA = ¦(x, y) ∈ R
2
[ x
2
+y
2
= 9, x ≥ 0, y ≥ 0¦∪
∪ ¦(x, 0) [ 0 ≤ x ≤ 3¦ ∪ ¦(0, y) [ 0 ≤ y ≤ 3¦.
383
Appendix B
Functions of random variables
In many cases we are given the probability distribution of a random variable and
we are asked for the distribution of some function of it. For example, suppose that
we know the distribution of X and try to find the distribution of g(X). In order to
do that we have to express the event that g(X) < y in terms of X being in some set
(which depends on the function g). We start with some examples:
Example. Linear functions of random variables
Let X be a random variable and consider a new random variable Y = aX + b,
with a ,= 0 and b ∈ R.
Case 1. If X is a discrete random variable, then
f
Y
(y) = P(Y = y) = P(aX +b = y) = P
_
X =
y −b
a
_
= f
X
_
y −a
a
_
.
Case 2. If X is continuous, we determine first the distribution function of Y .
F
Y
(y) = P(Y < y) = P(aX +b < y) = P(aX < y −b).
- for a > 0 we have
F
Y
(y) = P
_
X <
y −b
a
_
= F
X
_
y −b
a
_
Since f
X
= F

X
, we obtain that F
Y
is differentiable and
f
Y
(y) = F

Y
(y) =
1
a
f
X
_
y −b
a
_
=
1
[a[
f
X
_
y −b
a
_
- for a < 0 we have
F
Y
(y) = P
_
X >
y −b
a
_
= 1 −P
_
X ≤
y −b
a
_
= 1 −P
_
X <
y −b
a
_
= 1 −F
X
_
y −b
a
_
and
f
Y
(y) = F

Y
(y) = −
1
a
f
X
_
y −b
a
_
=
1
[a[
f
X
_
y −b
a
_
.
In conclusion, we obtain that
f
Y
(y) =
1
[a[
f
X
_
y −b
a
_
.
Example. Let X be a continuous random variable whose probability density
function is f
X
. Determine the p.d.f. of Y = X
2
.
384
Solution. If y ≤ 0 then
F
Y
(y) = P(Y < y) = P(X
2
< y) = 0.
If y > 0 then
F
Y
(y) = P(Y < y) = P(X
2
< y) = P(−

y < X <

y)
= F
X
(

y) −F
X
(−

y).
By differentiating the previous equality we get:
f
Y
(y) =
1
2

y
f
X
(

y) +
1
2

y
f
X
(−

y).
In conclusion,
f
Y
(y) =
_
¸
_
¸
_
0, y ≤ 0
1
2

y
[f
X
(

y) +f
X
(−

y)].
The previous examples illustrate two different situations: the case in which Y is
an invertible function of X and the case in which it is not.
We will generalize the previous situations just in the case in which Y is an invertible
function of X.
In the other case we will not present the general case which is more complicated
and is beyond the scope of this text.
Theorem. (Discrete case) Let X be a discrete random variable and let Y =
g(X), where g is one-to-one function on R.
Then Y is a discrete random variable whose p.m.f. is
f
Y
(y) =
_
f
X
(g
−1
(y)), if there is x ∈ RangeX with x = g(y)
0, otherwise
Theorem. (Continuous case) Let X be a continuous random variable and let
Y = g(X) where g is a one-to-one differentiable function on R.
Then Y is a continuous random variable whose p.d.f. is
f
Y
(y) =
_
¸
¸
_
¸
¸
_
f
X
(g
−1
(y))[(g
−1
)

(y)[ =
f
X
(x)
[g

(x)[
, if there is x ∈ RangeX
with x = g(y)
0, otherwise
The proofs of these theorems are similar to the ideas presented in the first example
of this subsection.
385
Expectation of a function of one random variable
The expectation of Y = g(X) is given by:
E(Y ) = E[g(X)] =
_
¸
¸
¸
¸
_
¸
¸
¸
¸
_

i
g(x
i
)p
i
(discrete case)
_

−∞
g(x)f
X
(x)dx (continuous case)
Sums of independent continuous random variables
Let X and Y be independent continuous random variables whose probability den-
sity functions are f
X
and f
Y
. We wish to know the p.d.f. of X +Y .
Definition. (Convolution)
Let X and Y be two continuous random variables with density functions f
X
and
f
Y
. Then the convolution f ∗ g of f and g is the function defined by
(f ∗ g)(z) =
_

−∞
f
X
(z −y)f
Y
(y)dy
=
_

−∞
f
Y
(z −x)f
X
(x)dx.
Theorem. Let X and Y be two independent random variables with density func-
tions f
X
and f
Y
. Then the sum Z = X+Y is a random variable with density function
f
Z
, where f
Z
= f ∗ g.
The proof of this theorem is beyond the scope of this book and will be omitted.
To get a better understanding of this important result, we will present some ex-
amples.
Example. (Sum of two independent Gamma random variables)
If X ∼ Γ(a
1
, b) and Y ∼ Γ(a
2
, b) are independent, then
Z = X +Y ∼ Γ(a
1
+a
2
, b).
Solution. We know that:
f
X
(x) =
_
_
_
1
Γ(a
1
)b
a
1
x
a
1
−1
e

x
b
, x > 0
0, x ≤ 0
and
f
Y
(y) =
_
_
_
1
Γ(a
2
)b
a
2
y
a
2
−1
e

y
b
, y > 0
0, y ≤ 0
386
and so, if z > 0
f
Z
(z) =
_

−∞
f
X
(z −y)f
Y
(y)dy
=
1
Γ(a
1
)Γ(a
2
)b
a
1
+a
2
_
z
0
(z −y)
a
1
−1
e

z−y
b
y
a
2
−1
e

y
b
dy
=
e

z
b
z
a
1
−1
z
a
2
−1
Γ(a
1
)Γ(a
2
)b
a
1
+a
2
_
z
0
_
1 −
y
z
_
a
1
−1
_
y
z
_
a
2
−1
dy.
If we make the following change of variable:
y
z
= x with dy = zdx, we obtain:
If z > 0 then
f
Z
(z) =
e

z
b
z
a
1
+a
2
−2
Γ(a
1
)Γ(a
2
)b
a
1
+a
2
_
1
0
(1 −x)
a
1
−1
x
a
2
−1
dx
=
e

z
b
z
a
1
+a
2
−1
Γ(a
1
)Γ(a
2
)b
a
1
+a
2
B(a
1
, a
2
)
=
e

z
b
z
a
1
+a
2
−1
Γ(a
1
)Γ(a
2
)b
a
1
+a
2

Γ(a
1
)Γ(a
2
)
Γ(a
1
+a
2
)
=
1
Γ(a
1
+a
2
)b
a
1
+a
2
e

z
b
z
a
1
+a
2
−1
If z ≤ 0 then
f
X
(z −y) = 0 for y > 0 and
f
Y
(y) = 0 for y ≤ 0,
hence f
Z
(z) = 0. This completes the proof.
Example. (Sum of two independent normal random variables)
If X ∼ N(m
1
, σ
2
1
) and Y ∼ N(m
2
, σ
2
2
) are independent then
Z = X +Y ∼ N(m
1
+m
2
, σ
2
1

2
2
).
Solution. If X ∼ N(m
1
, σ
2
1
) then U =
X −m
1
σ
1
∼ N(0, 1)
(see Remark 4 from the subsection ”The normal random variable”).
Similarly, V =
Y −m
2
σ
2
∼ N(0, 1).
First, we will prove that if U, V are two independent standard normal variables
and α, β > 0 such that α
2

2
= 1, then
W = αU +βV ∼ N(0, 1).
If U ∼ N(0, 1) then by applying once more the remark mentioned before we have
that αU ∼ N(0, α
2
).
387
Similarly, βV ∼ N(0, β
2
), so
f
W
(z) =
_

−∞
f
αU
(z −y)f
βV
(y)dy
=
1
2παβ
_

−∞
e

(z−y)
2

2
e

y
2

2
dy
=
1
2παβ
_

−∞
e

y
2
α
2

2
(z−y)
2

2
β
2
dy
α
2
+b
2
=1
====
=
1
2παβ
_

−∞
e

y
2

2
z
2
−2β
2
zy

2
β
2
dy
=
1
2παβ
_

−∞
e

y
2
−2β
2
zy+β
4
z
2
−β
4
z
2

2
z
2

2
β
2
dy
=
1
2παβ
_

−∞
e

(y−β
2
z)
2

2
β
2
e
−α
2
b
2
z
2

2
β
2
dy.
If we make the following change of variable: x =
y −β
2
z
αβ
with
dy = αβdx, then
f
W
(z) =
1

e

z
2
2
_

−∞
e

x
2
2
dx =
1

e

z
2
2

2π =
1


e

z
2
2
In conclusion: W ∼ N(0, 1).
If we take α =
σ
1
_
σ
2
1

2
2
and β =
σ
2
_
σ
2
1

2
2
, then
W = αU +βV =
X −m
1
σ
1

σ
1
_
σ
2
1

2
2
+
Y −m
2
σ
2

σ
2
_
σ
2
1

2
2
=
X +Y −(m
1
+m
2
)
_
σ
2
1

2
2
∼ N(0, 1).
We apply again the Remark 4 (mentioned before), part a), with
a =
_
σ
2
1

2
2
and b = m
1
+m
2
we get that
aW +b = X +Y ∼ N(m
1
+m
2
, σ
2
1

2
2
),
as desired.
388
Bibliography
[1] Apostol, T.M., Calculus, vol. 2, Multivariable Calculus and Linear Algebra, with
Applications to Differential Equations and Probability, Second edition, John Wi-
ley, New York, 1969.
[2] Anton, H., Rorres, C., Elementary Linear Algebra, 9
th
edition, John Wiley, New
York, 2005.
[3] Binmore, K., Game Theory, A very Short Introduction, Oxford University Press,
2007.
[4] Blaga, P., Mure¸san, A.S., Matematici aplicate ˆın economie, vol. 1, Ed. Transil-
vania Press, Cluj-Napoca, 1996.
[5] Brickman, L., Mathematical Introduction to Linear Programming and Game The-
ory, Springer-Verlag, New-York, 1989.
[6] Carmichael, F., A Guide to Game Theory, Prentice Hall, Pearson Education
Limited, 2005.
[7] Cobza¸s, S., Analiz˘ a matematic˘ a (Calcul diferent ¸ial), Presa Universitar˘ a Clujean˘a,
Cluj-Napoca, 1997.
[8] Dantzig, G.B., Thapa, M.N., Linear Programming, Springer-Verlag, New York,
1997.
[9] Dowling, E.T., Math´ematiques pour l’´economistes, Mc Graw-Hill, Paris, 1990.
[10] Duca, D.I., Multicriteria optimisation in complex space, Casa C˘art ¸ii de S¸tiint ¸˘a,
Cluj-Napoca, 2005.
[11] Eiselt, H.A., Sandblom, C.L., Linear Programming and its Applications, Springer
Berlin Heidelberg New York, 1965.
[12] Filip, D.A., Curt, P., M´ethodes quantitatives en ´economie, Ed. Mediamira, Cluj-
Napoca, 2008.
[13] Grinstead, Cl.M., Introduction to Probability,
http://www.darmouth.edu/∼chance.reaching aids/books articles/ probabil-
ity book/amsbook.mac.pdf
389
[14] Kemeny, J.G., Snell, J.L., Thompson, G.L., Introduction to Finite Mathematics,
3
rd
edition, Prentice-Hall, 1974.
[15] Kinney, J., A probability and Statistic Companion, John Wiley, New York, 2009.
[16] Klimov, G., Probability Theory and Mathematical Statistics, Mir Publishers,
Moscow, 1996.
[17] Kolman, B., Beck, R.E., Elementary Linear Programming with Applications, El-
sevier Science, 1995.
[18] Krishnan, V., Probability and Random Processes, John Wiley, New York, 2006.
[19] Lay, D.C., Linear Algebra and its Applications, Addison-Wesley Publishing Com-
pany, 2003.
[20] Le Gall, J.-F., Int´egration, Probabilit´es et Processus Al´eatoires, 2006.
[21] Lipschitz, S., Schaum’s outline of Theory and Problems of Finite Mathematics,
Schaum’s outline series, Mc Graw-Hill Book Company, 1966.
[22] Lisei, H., Probability Theory, Casa C˘art ¸ii de S¸tiint ¸˘a, Cluj-Napoca, 2004.
[23] Luderer, B., Nollan, V., Vetfers, K., Mathematical Formulas for Economists, 3rd
edition, Springer-Verlag Berlin Heidelberg, 2007.
[24] Luenberger, D.G., Ye, Y., Linear and Nonlinear Programming, 3
rd
edition,
Springer-Verlag, 2008.
[25] Meester, R., A Natural Introduction to Probability Theory, 2nd edition,
Birkh¨auser Verlag AG, 2008.
[26] Mihoc, I., Calculul probabilit˘ at ¸ilor ¸si statistic˘ a matematic˘ a, lito Univ. Babe¸s-
Bolyai, Cluj-Napoca, 1998.
[27] Mihoc, I., Mihoc, M., Matematici aplicate ˆın economie. Analiz˘ a matematic˘ a, vol.
II, Ed. Presa Universitar˘ a Clujean˘a, 1999.
[28] Mure¸san, A.S., Matematici pentru economi¸sti, vol. 1, 2, lito Univ. Babe¸s-Bolyai,
Cluj-Napoca, 1991.
[29] Mure¸san, A.S., Matematici aplicate ˆın finant ¸e, b˘ anci ¸si burse, Ed. Risoprint,
Cluj-Napoca, 2000.
[30] Mure¸san, A.S., Lung, R.I., Matematici aplicate ˆın economie (Cercet˘ ari
operat ¸ionale), Ed. Mediamira, Cluj-Napoca, 2005.
[31] Mure¸san, A.S., ¸si colectiv, Elemente de algebr˘ a liniar˘ a ¸si analiz˘ a matematic˘ a
pentru economi¸sti, Ed. Todesco, Cluj-Napoca, 2003.
[32] Mure¸san, A.S., Non-cooperative games, Ed. Mediamira, Cluj-Napoca, 2003.
390
[33] Mure¸san, A.S., ¸si colectiv, Elemente de teoria probabilit˘ at ¸ilor ¸si statistic˘ a matem-
atic˘ a pentru economi¸sti, Ed. Todesco, Cluj-Napoca, 2004.
[34] Mure¸san, A.S., ¸si colectiv, Analiz˘ a matematic˘ a ¸si Teoria probabilit˘ at ¸ilor aplicate
ˆın economie, Ed. Todesco, Cluj-Napoca, 2006.
[35] Mure¸san, A.S., Blaga, P., Matematici aplicate ˆın economie, vol. 2, Ed. Transil-
vania Press, Cluj-Napoca, 1996.
[36] Mure¸san, A.S., Mihoc, M., Filip, D., Curt, P., Rˆap, I., Radu, Ro¸sca, A., P˘ acurar,
M., Petru, P., Mihalca, G., Analiz˘ a matematic˘ a, teoria probabilit˘ at ¸ilor ¸si algebr˘ a
liniar˘ a aplicate ˆın economie, edit ¸ia a doua, Ed. Mediamira, 2008.
[37] Murty, K.G., Linear Programming, John Wiley, 1983.
[38] Nicolescu, M., Analiz˘ a matematic˘ a, vol. 1, E.D.P., Bucure¸sti, 1997.
[39] Pedregal, P., Introduction to Optimization, Springer-Verlag New York, 2004.
[40] Piatecki, C., Le dilemme du prisonnier et autres dilemmes sociaux, 2006.
[41] Piatecki, C., Th´eorie des Choix en Incertain, 2007.
[42] Popescu, O., ¸si alt ¸ii, Matematici aplicate ˆın economie, E.D.P., Bucure¸sti, 1998.
[43] Purcaru, I., Matematici generale ¸si elemente de optimizare: teorie ¸si aplicat ¸ii,
Ed. Economica, Bucure¸sti, 1997.
[44] Purcaru, I., Matematici generale ¸si elemente de optimizare: teorie ¸si aplicat ¸ii,
edit ¸ia II, Ed. Economica, Bucure¸sti, 2004.
[45] Roussases, G., Introduction to Probability and Statistical Inference, Elsevier Sci-
ence, 2003.
[46] Sheldon, R., A First Course in Probability, 5th edition, Prentice-Hall Inc., 1998.
[47] Sheldon, R., Introduction to Probability and Statistics for Engineers and Scien-
tists, third edition, Elsevier Academic Press, 2004.
[48] Sheldon, R., Introduction to Probability Models, Sixth edition, Academic Press,
1997.
[49] Simon, C.P., Blume, L., Mathematics for Economists, W.W. Norton Company
Inc., New York, 1994.
[50] St˘an˘a¸sil˘ a, O., Analiz˘ a matematic˘ a, E.D.P., Bucure¸sti, 1991.
[51] Stewart, J., Analyse. Concepts et contextes, Vol. 1, Fonctions d’une variable, De
Boeck Universit´e, Paris, Bruxelles, 2001.
[52] Stewart, J., Analyse. Concepts et contextes, vol. 2, Fonctions de plusieurs vari-
ables, De Boeck Universit´e, Paris, Bruxelles, 2001.
391
[53] Stirzaker, D., Elementary Probability, 2
nd
edition, Cambridge University Press,
2003.
[54] Sydsater, K., Strom, A., Berck, P., Economists’ Mathematical Manual, fourth
edition, Springer-Verlag, 2005.
[55] http://www.masterfinance.proba.jussieu.fr, 2004-2005
[56] http://fr.wikipedia.org/wiki/Accueil
[57] http://www.bibmath.net, BibM@th, la bibliot`eque des Math´ematiques
[58] http://www.cmath.fr
[59] http://www.netprof.fr
[60] http://aleph0.clarku.edu/∼djoyce
[61] http://homeomath.imingo.net
[62] http://www.les-mathematiques.net
392

4.1

4.2 4.3 4.4

4.5 4.6 4.7

Real functions of several variables. Limits and continuity . . . . . . . . . . . . . . . . . . . . 4.1.1 Real functions of several variables . . 4.1.2 Limits. Continuity . . . . . . . . . . . Partial derivatives . . . . . . . . . . . . . . . Higher order partial derivatives . . . . . . . . Differentiability . . . . . . . . . . . . . . . . . 4.4.1 Differentiability. The total differential 4.4.2 Higher order differentials . . . . . . . 4.4.3 Taylor formula in Rn . . . . . . . . . . Extrema of function of several variables . . . Constrained extrema . . . . . . . . . . . . . . Applications to economics . . . . . . . . . . . 4.7.1 The method of least squares . . . . . . 4.7.2 Inventory control. The economic order quantity model . . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

187 187 192 201 212 218 218 228 231 235 247 261 261

. . . . . . . . . . . . . . 265

III

Probabilities

269

A short history of probabilities . . . . . . . . . . . . . . . . . . . . . . . . . 271 5 Counting techniques. Tree diagrams 273 5.1 The addition rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273 5.2 Tree diagrams and the multiplication principle . . . . . . . . . . . . . 277 5.3 Permutations and combinations . . . . . . . . . . . . . . . . . . . . . . 281 6 Basic probability concepts 6.1 Sample space. Events . . . . . 6.2 Conditional probability . . . 6.3 The total probability formula. 6.4 Independence . . . . . . . . . 6.5 Classical probabilistic models. . . . . . . . . . . . . . . . . . . Bayes’ formula . . . . . . . . . Urn models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 289 302 306 311 313 327 328 332 335 336 345 380 384 389

7 Random variables 7.1 Discrete random variables . . . . . . . . . . . . 7.2 The distribution function of a random variable 7.3 Continuous random variables . . . . . . . . . . 7.4 Numerical characteristics of random variables . 7.5 Special random variables . . . . . . . . . . . . . Appendix A Appendix B Bibliography

6

Introduction
Why Maths?
Because Mathematics is the universal language of sciences. When we speak mathematics, all the barriers - linguistic or cultural ones - are pushed away.

Why Maths in Economis?
Mathematics plays an important role in Economics. This role has been rather significant for the late century and knew a real impulse during the last decades. Emmanuel Kant (1724-1804) said: ”A science contains as much science as it contains Mathematics”. One of the first economists who wanted to make economics more scientific by applying the mathematical rigor to it was Alfred Marshall (1842-1924, English economist). He did not want that an overside of mathematics to make the economic texts harder to understand. Accordingly, Marshall put the mathematical content in the footnotes and appendices of his economics books. In 1906 we wrote: ”I had a growing feeling in the later years of my work at the subject that a good mathematical theorem dealing with economic hypotheses was very unlikely to be a good economics: and I went more and more on the rules (1) Use mathematics as a short hand language, rather than an engine of inquiry; (2) Keep to them till you have done; (3) Translate into English; (4) Then illustrate by examples that are important in real life; (5) Burn the mathematics; (6) If you can’t succeed in (4), burn (3). That last I did often.” The use of mathematics in economics provides us some advantages: - the language that is used is more precise and concise - it allows us to treat the general case - we have at our disposal a great number of mathematical results. On his blog, Greg Mankiw (Professor of Economics at Harvard University) wrote the following answers to the next question: ”Why aspiring economists need Math?” ”A student who wants to pursue a career in policy-related economics is advised to 7

go to the best graduate school he or she can get into. The best graduate schools will expect to see a lot of math on your undergraduate transcript, so you need to take it. But will you use a lot of differential equations and real analysis once you land that dream job in a policy organization? No, you won’t. That raises the question: Why do we academics want students that have taken a lot of math? There are several reasons: 1. Every economist needs to have a solid foundation in the basics of economic theory and econometrics, even if you are not going to be either a theorist or an econometrician. You cannot get this solid foundation without understanding the language of mathematics that these fields use. 2. Occasionally, you will need math in your job. In particular, even as a policy economist, you need to be able to read the academic literature to figure out what research ideas have policy relevance. That literature uses a lot of math, so you will need to be equipped with mathematical tools to read it intelligently. 3. Math is good training for the mind. It makes you a more rigorous thinker. 4. Your math courses are one long IQ test. We use math courses to figure out who is really smart. 5. Economics graduate programs are more oriented to training students for academic research than for policy jobs. Although many econ PhD go on to policy work, all of us teaching in graduate programs are, by definition, academics. Some academics take a few years off to experience the policy world, as I did not long ago, but many academics have no idea what that world is like. When we enter the classroom, we teach what we know. (I am not claiming this is optimal, just reality.) So math plays a larger role in graduate classes than it does in many jobs that PhD economists hold. It is possible that admissions committees for econ PhD programs are excessively fond of mathematics on student transcripts? Perhaps. That is some thing I might argue with my colleagues about if I were ever put on the admissions committee. But a student cannot change that. The fact is, if you are thinking about a PhD program in economics, you are advised to take math course until it hurts.” At present the Maths teachers’ mission is to do the best advertisement for Maths, to make the students the importance of Maths, to ”dress up” the Maths classes in vivid colours. The purpose of this book (covering three parts (Linear Algebra, Calculus and Probabilities) divided in seven chapters) is to give to the students at economics the possibility to acquire the basic knowledge in Maths which they will have to work with in the future, in order to be able to use them with complex economic models belonging to the real world. Our book attempts to develop the student’s intuition concerning the ways of working with mathematical techniques. This book wants to be a device for using Maths in order to understand the structure of ”economics”. The text offers an introduction into the most intimate relationship between Maths and Economics. Taking into account the applicative content of this book, we will not always present the complete proofs of all theoretical statements but attached importance to examples and economic applications. As Mathematics is a very old science, we can’t possibly be 8

entirely original, but the structure and concepts have been thought out after several years of work together with the students in economics. For the content could get closer to the necessities of economists, we introduced several examples from the economic field. This book is especially meant to the first year’s students in ”Economic Sciences and Business Administration”. We also turn to all those who should need to refresh their knowledge in maths which are to be used in economics or just for the sake of their professional update.

9

10

Part I

Elements of linear algebra

11

.

to represent 1. lines. distances and angles to n-dimensional Euclidean spaces. b) ∈ R2 and each element of R2 can be represented by a point in the Cartesian plane. usually to the right of 0.e. b ∈ R} as the plane and use the words plane and ordered pair of real numbers interchangeably. The set of real numbers. For this reason we refer to R2 = {(a. A point. The vertical line through the point P meets the horizontal axis (x axis) at a which is called the abscissa of P . Each point P represents an ordered pair of real numbers (a. For this reason we refer to R as the real line and use the words point and number interchangeably. The geometric representation of R is a straight line. In this paragraph we will study how to generalize notions of points.Chapter 1 Linear spaces 1. planes. is chosen to represent 0 and another point. There is a natural correspondence between the points on the line and real numbers. b) | a ∈ R. The horizontal line through the point P meets the vertical axis (y axis) at b which is called the ordinate of P . called the origin. denoted by R.1 The Euclidean space One of the main uses of mathematics in economic theory is to construct the appropriate geometric and analytic generalizations of the two or three-dimensional geometric models which are the main stay of undergraduate economic courses. plays a dominant role in mathematics. i. O −∞ 0 P 1 ∞ The real line R We assume that the reader is familiar with the Cartesian plane. 13 . each point will represent a unique real number and each real number will be represented by a unique point.

The slope m of the line is defined as the tangent of its angle θ of inclination: m = tg θ. we will present different forms of the equation of a line. y1 ) and Q(x2 . The inclination θ of a line l is the angle that l makes with the horizontal axis. It is well know that two different points determine exactly one line. y2 ) are two points in R2 then the distance between them can be determined by using the pythagorean theorem in a right triangle. 6 y2 Q      x2 y2 − y1 y1 P θ x2 − x1 - x1 The line P Q d(P. We shall denote the distance between P and Q by d(P. Q) = (x2 − x1 )2 + (y2 − y1 )2 . Q). The range of θ is given by 0 ≤ θ < 180◦ . θ is the smallest positive angle measured counterclockwise from the positive end of the x axis to the line l. Next. 14 .y axis 6 b > P O a The cartesian plane R2 x axis If P (x1 .

Linear equations Every line l in the Cartesian plane R2 can be represented by a linear equation of the form (l) ax + by + c = 0. The equation of a vertical line i. i. a line parallel to the y axis. Point-slope form A line is completely determined if we know its direction (its slope) and a point on the line. when y2 = y1 . the slope of the line passing through two points P (x1 . The equation of the line having slope m and passing through the point (x1 . are equal. Horizontal and vertical lines The equation of a horizontal line i. In particular. is zero. y1 ) is (l) y − y1 = m(x − x1 ). In particular. when x2 = x1 is not defined.e. d) Two distinct lines l1 and l2 are perpendicular if and only if the slope of one is the negative reciprocal of the other: m1 = − 1 m2 or m1 m2 = −1. a) The slope of an horizontal line. y2 ) two different points in the Cartesian plane.e.e. is of the form x = k where k is the abscissa of the point at which the line intersects the x axis. y1 ) and Q(x2 . The equation of the line which passes through the previous two points is: (l) x − x1 y − y1 = . a line parallel to the x axis. y1 ) and Q(x2 . the equation of the x axis is y = 0. y2 − y1 x2 − x1 15 .e. Two points form Let P (x1 . is of the form y = k where k is the ordinate of the point at which the line intersects the y axis. the equation of the y axis is x = 0. m = tg θ = x2 − x1 Remark. i. b) The slope of a vertical line.e. i. m1 respectively m2 . where a and b are not both zero. each point on l is a solution of (l) and each solution of (l) is a point on l. c) Two distinct lines l1 and l2 are parallel if and only if their slopes. m1 = m2 . i. y2 ) is given by: y2 − y1 .e.In particular.

b) and v = (c. d) are two vectors in R2 . y 6 u v  u+ v 3 v 1 u O Addition of two vectors x We can use the parallelogram as in the above figure to draw u + v keeping the tails of u and v at the same point. We represent these displacements as vectors in R2 . then u + v will represent a displacement of a + c units to the right and b + d units up. scalar multiplication of a vector v be a nonnegative (negative) scalar corresponds to stretching or shrinking v without (with) changing its direction. On the other hand. To develop a geometric intuition for vector addition we can think in the following way. The tail of the arrow marks the initial location. b) means: move a units to the right and b units up from the current location. the head marks the location after the displacement is made.2 Euclidean n-space We can interpret the order pairs of R2 not only as locations but also as displacements. The displacement (a.The previous equation was obtained by replacing m = form of the equation. coordinatewise multiplication does not satisfy the basic properties of the multiplication of real numbers. 16 . If u = (a. geometrically. For instance. It is generally not possible to multiply two vectors in a nice way to generalize the multiplication of real numbers. y2 − y1 in the point-slope x2 − x1 1.

e. For an integer n ≥ 1. . y. . . addition of two vectors: if x. z ∈ V (ii) there is a vector θ ∈ V (called the null vector) that is an identity element for addition: x + θ = θ + x = θ. . A real vector space is a set V = ∅ with an operation + : V ×V → V called vector addition and an operation · : R × V → V called scalar multiplication with the following properties: (i) (x + y) + z = x + (y + z). . xn + yn ) ∈ Rn 2. . ∀ x ∈ V (iii) for any x ∈ V there is −x ∈ V such that x + (−x) = (−x) + x = θ 17 . Definition. y ∈ Rn then x + y := (x1 + y1 . xn ) | x1 . The two fundamental operations (which generalize the addition of two vectors and the multiplication of a vector by a scalar) are: 1. xn ) of real numbers i. . . . Rn = {x = (x1 . . x2 .y 6 2b 2u b u a * * O 2a x Vector multiplication by a real scalar We will generalize the previous discussion to the general case. . . ∀ x. . . . . n. The elements of Rn are called vectors and the numbers xi . . αx2 . By definition Rn is the set of ordered n-tuples x = (x1 . x2 . scalar multiplication: if x ∈ Rn and α ∈ R then αx = (αx1 . . . i = 1. we will present an axiomatic concept based on the simplest properties of the previous operations. x2 + y2 . are called the coordinates of x (xi is the ith coordinate of x). Next. x2 . . xn ∈ R}. αxn ) ∈ Rn .

. α2 . vn } is a basis of v then each point v ∈ V can be written as a linear combination v = α1 v1 + α2 v2 + · · · + αn vn 18 . . ∀ x. v1 . Let V be a vector space. . . that is. . . u + αv ∈ W . vn . a set of vectors in a vector space is linearly dependent if and only if one vector can be written as a linear combination of the others. (V. span S = {α1 v1 + · · · + αn vn | n ∈ N∗ . . Conversely. v2 . The proof of the previous example follows immediately by using the definitions of the operations in Rn . αn ∈ R. If α1 . . such that α1 v1 + · · · + αn vn = θ. vn } is called linearly independent if the vector equation α1 v1 + α2 v2 + · · · + αn vn = θ has the only trivial solution α1 = α2 = · · · = αn = 0. . y ∈ V (v) 1 · x = x. . vn ∈ V }. . ∀ x ∈ V (vi) ∀ x. ∀ u. Remark.(iv) x + y = y + x. In this case we say that V is finite dimensional and the common number of elements of the basis of V is called the dimension of V . β ∈ R (a) (αβ)x = α(βx) (b) (α + β)x = αx + βx (c) α(x + y) = αx + βy. . . . . vn ∈ V and α1 . Remark. . . . . . . there are α1 . vn ∈ V then the sum α1 v1 + · · · + an vn ∈ V is called a linear combination of the vectors v1 . Rn is a vector space. αn ∈ R not all zero. Remark. vn } is called linearly dependent if the vector equation α1 v1 + · · · + αn vn = θ has a nontrivial solution. A linear subspace of V is a subset W ⊂ V that is itself a vector space with vector addition and scalar multiplication defined by restriction of given operations on V. If W ⊆ V then W is a linear subspace if and only if the following two conditions are fulfilled: a) W = ∅ b) ∀ α ∈ R. . . . A basis is linearly independent. . v ∈ W. The span of a set S ⊂ V is the set of all linear combinations α1 v1 + · · · + αn vn where v1 . A set of vectors {v1 . . αn ∈ R and v1 . . We next show that any finite dimensional vector space is ”like” Rn . In other words. . . . y ∈ V. . α1 . . . . ∀ α. . v2 . a linearly independent set that spans V is a basis for V . A basis for V is a subset B ⊂ V that spans V (spanB = V ) and is minimal for this property in the sense that there are no proper subsets of B that span V . α2 . then all basis have the same number of elements. +) is a commutative group. α2 . . Example. . . . . . . . If B = {v1 . A set of vectors {v1 . If V has a finite basis. . αn ∈ R.

n. ∀ x. . Definition. . ∀ x ∈ V \ {θ}. y ∈ V IP3) αx + βy. x. i = 1. Let V be a vector space. 0. ∀ x. · : V × V → R is called an inner product in V if the following conditions are satisfied: IP1) x. ∀ x ∈ V x =0 ⇔ x=0 N2) αx = |α| · x . ∀ x ∈ V.in exactly one way. z . A vector space with an inner product is called an inner product space. z + β y. Normed spaces The concept of a norm is an abstract generalization of the length of a vector. y ∈ Rn . · : Rn × Rn → R x. y | ≤ x · y . . 0. y ∈ V . b) dim Rn = n. ∀ x. x = 0 ⇔ x = θ (positive definiteness) IP2) x. A mapping ·. Let V be a vector space. . Proof. x → x is called a norm if it satisfies the following conditions: N1) x ≥ 0. 1. The property (N3) follows from the Cauchy-Buniakovski-Schwarz inequality: | x. The canonical inner product in Rn is defined in the following way: ·. . y = y. x is a norm on Rn which is called the Euclidean norm. β ∈ R (bilinearity). y = x1 y1 + x2 y2 + · · · + xn yn . Inner product spaces Definition. . ∀ α. A vector space V with a norm · is called a normed space and it is denoted by (V. a) The set B = {e1 . z ∈ V. Example. 0) (1 is the ith coordinate). . 19 . A function · : V → R. . . ∀ α ∈ R N3) x + y ≤ x + y . Example. The properties (N1) and (N2) are easy consequences of the properties of the inner product. z = α x. is a base of the vector space Rn . . x > 0. y ∈ Rn . en } where ei = (0. y. The proof that the previous functions satisfies all the properties of the previous definition is left to the reader (easy computations based on the properties of real numbers). Example. x . . · ). ∀ x. . ∀ x. The function · : Rn → R x→ x = x.

We are now able to prove N3) x+y 2 y x . y | ≤ x 2 + y + y 2. . . z). . . x y ≤ x x 2 + y y 2 =2 | x. d) in which X is a nonempty set and d is a distance on X. ∀ x. y) = 0 ⇔ x = y D2) d(x. y ∈ X d(x. We can assume that x = θ and y = θ (if x = θ or y = θ the Cauchy-BuniakovskiSchwarz is true). y − 2 x. A metric space is a pair (X. y (symmetry) D3) d(x. y) ≤ d(x. y) = 20 (x1 − y1 )2 + · · · + (xn − yn )2 . y 2 −( x + y 2 ) ≤ 2 x. |xn |}. x y If we replace x by and y by in the previous inequality we obtain x y 2 wherefrom we have: as desired. ∀ x. y + y 2 CBS = ( x + y )2 . y ∈ Rn . Metric spaces Definition. · ) is a metric space since the function d:V ×V →R d(x. D3). y) = d(y. x − y = x 2 2 2 + y + y 2 2 2 + 2 x. d : X × X → R is called distance on X if the following conditions are satisfied: D1) d(x. x). ≤ x 2 +2 x · y + y 2 Example (other norms on Rn ). y ≤ x 2| x. If X = ∅. = x 2 + 2 x. y) ≥ 0. ∀ x. y) = x − y . y. Remark. D2). In Rn the Euclidean distance is: d(x. y ∈ V satisfies the properties D1).We consider first the following obvious inequalities: 0≤ x+y wherefrom we get hence 0≤ x−y 2 2 = x + y. x ∞ = max{|x1 |. y | ≤ x · y . Each normed space (V. ∀ x. 1) · 1 : Rn → R. Example. ∀ x. x 1 = |x1 | + · · · + |xn |. z (triangle inequality). x + y = x = x − y. . ∀ x ∈ Rn 2) · ∞ : Rn → R. y) + d(y.

 . x = x1 e1 + x2 e2 + · · · + xn en T (x) = T (x1 e1 + · · · + xn en ) = x1 T (e1 ) + · · · + xn T (en ) = x1 a1 + · · · + xn an = at x. Remark 2. Proof. . .3 Quadratic forms In this section we present the natural generalizations of linear and quadratic functions to several variables. n. such that dim V = n and dim W = m. hence   a1j  a  T (ej ) =  2j  . i = 1.  a =  .. Remark 1. there exists a vector   a1  . y ∈ V. ∀ α. The previous remark implies that every linear form on Rn can be associated with a unique vector a ∈ Rn (or with a unique 1 × n matrix) so that T (x) = at x.. en } be the canonical base of Rn . for any vector x ∈ Rn . ∀ x. an such that T (x) = at x for all x ∈ V . . Then there exists an m × n matrix A such that T (x) = Ax. A linear operator T : V → R is called a linear form. The same correspondence between linear operators and matrices is valid for linear operators from Rn to Rm . Then. . . .1. . The idea is the same as that of the previous remark. . Let V and W be two real vector spaces. en } be the canonical basis of Rn . Proof. β ∈ R. ∀ x ∈ Rn .e. T (αx + βy) = αT (x) + βT (y). For each j = 1. n. Let T : Rn → R be a linear operator.  ∈ Rn .  amj 21 . Let T : Rn → Rm be a linear operator. A linear operator from V to W is a function T that preserves the vector space structure. T (ej ) ∈ Rm . Let ai = T (ei ) ∈ R. Let B = {e1 . Then. Let B = {e1 . i. Linear operators Definition.

Quadratic forms are associated to bilinear forms.  = Ax. .  xn am1 am2 . . ∀ x.   . ∀ y. .. ..Let A be the m × n matrix whose j th column is the column vector T (ej ).. . . . Examples. Quadratic forms In mathematics. z) = ax2 + by 2 + cz 2 + dxy + exz + f yz are quadratic forms in one. ∀ α. two or three variables.... . ∀ x ∈ V is called a quadratic form on V .  amn am2 am1     a11 a12 .. a2n    .. a1n x1  a21 a22 . Definition. ∀ x. y ∈ V.. The application (·|·) : V × W → R is called bilinear if it is linear with respect to its two variables. The application: Q:V →R x → Q(x) = (x|x). z ∈ W. y. we can say that matrices are representations of linear operators. ∀ x ∈ V. c) Let V be a real vector space whose dimension is n and let (·|·) : V × V → R be a symmetric bilinear form. i. y) = ax2 + bxy + cy 2 Q(x. amn So. ... β ∈ R.. For any x = x1 e1 + · · · + xn en ∈ Rn we have T (x) = T (x1 e1 + · · · + xn en ) = x1 T (e1 ) + · · · + xn T (en )       a1n a12 a11   a   a   a = x1  21  + x2  22  + · · · + xn  2n   .. β ∈ R. ∀ z ∈ W b) The bilinear form (·|·) : V × V → R is called symmetric if (x|y) = (y|x).   . a) Let V and W be two real vector spaces such that dim V = n and dim W = m. . Q(x) = ax2 Q(x. ∀ α. =  .e. a quadratic form is a homogeneous polynomial of degree two in a number of variables..: (αx + βy|z) = α(x|z) + β(y|z). y ∈ V. 22 (x|αy + βz) = α(x|y) + β(x|z). . .

n n Q(x) = i=1 j=1 aij xi xj (aij = aji ) 23 . j = 1. The quadratic form Q : Rn → R. . Hence: Q(x) = (x|x) = (x1 v1 + x2 v2 + · · · + xn vn |x) = x1 (v1 |x) + x2 (v2 |x) + · · · + xn (vn |x) = x1 (v1 |x1 v1 + · · · + xn vn ) + ···+ + xn (vn |x1 v1 + · · · + xn vn ) + xn [x1 (vn |v1 ) + · · · + xn (vn |vn )] n n = x1 [x1 (v1 |v1 ) + · · · + xn (v1 |vn )] + ···+ = x1 j=1 n xj (v1 |vj ) + · · · + xn n j=1 j=1 xj (vn |vj ) (vi |vj )xi xj . Remark. = i=1  xi xj (vi |vj ) = n n  n n i=1 j=1 In conclusion: Q(x) = (x|x) = i=1 j=1 (vi |vj )xi xj . . vn } is a base of V then x can be uniquely expressed as x = x1 v1 + x2 v2 + · · · + xn vn .Next. we determine the analytical expression of a quadratic form. n we denote aij = (vi |vj ) then aij = aji (since (·|·) is symmetric) and n n Q(x) = = i=1 j=1 a11 x2 1 aij xi xj +2a12 x1 x2 +a22 x2 2 + · · · + 2a1n x1 xn + · · · + 2a2n x2 xn 2 + · · · + ann xn which is the analytical expression of the quadratic form Q. Just as a linear function has a matrix representation. too. . a quadratic form has a matrix representation. . If B = {v1 . If for each i.

.. . . (−1)n det An = (−1)n det A > 0.. 24 .. We will denote the mth order leading principal submatrix by Am and the corresponding leading principal minor by det Am . . The determinant of a m × m principal submatrix is called a mth order principal minor of A. a) Let A be a n×n matrix. . det A2 > 0. Let Q : Rn → R be a quadratic form whose coefficient matrix is A. The mth order principal submatrix of A obtained by deleting the last n − m rows and the last n − m columns from A is called the mth order leading principal minor of A. . If Q : Rn → R be a quadratic form. b) Let A be a n × n matrix..e. . x′ ∈ Rn \ {θ} such that Q(x) < 0 and Q(x′ ) > 0. . .  ...    a1n x1  a2n   . Remark 3. The following remark provides an algorithm which uses the leading principal minors to determine the definitess of a quadratic form Q whose coefficient matrix is A. then Q is (a) positive definite if Q(x) > 0 for all x ∈ Rn \ {θ} (b) positive semidefinite if Q(x) ≥ 0 for all x ∈ Rn (c) negative definite if Q(x) < 0 for all x ∈ Rn \ {θ} (d) negative semidefinite if Q(x) ≤ 0 for all x ∈ Rn (e) indefinite if there are x. det A1 > 0.. Then (a) Q is positive definite if and only if all its n leading principal minors are strictly positive i. To present the test we need some definitions related to the coefficient matrix of Q... ..  .can be written as a11  a21  Q(x) = (x1 . . (c) Q is positive semidefinite if and only if every principal minor of A is nonnegative. an1 = xt Ax. Next. det An = det A > 0 (b) Q is negative definite if and only if all its n leading principal minors alternate in sign as follows det A1 < 0. Definition. det A2 > 0. .... we will describe a simple test for the definitess of a quadratic form. . . m For a n × n matrix there are Cn mth order principal minors of A.  xn ann where A is the matrix (symmetric) of the coefficients of the quadratic form Q. Definitess of quadratic forms Definition. . xn )  .. A m×m submatrix of A formed by deleting n − m columns and the same n − m rows from A is called a mth order principal submatrix of A.

. Ak 0 a d a (A−1 a)t a + d k The previous equality can be written as A = C t BC. Proof. 25 .d) Q is negative semidefinite if and only if every principal minor of odd order is not positive and every principal minor of even order is nonnegative. . The matrix A can be written as   a1 k+1   a Ak a where a =  2 k+1  . (c) and (d)) by using induction on the size of A (the coefficient matrix of Q). .  at ak+1 k+1 ak k+1 If d = ak+1 k+1 − at A−1 a then we have k Ak 0 Ak at 0 d Ik 0 Ik (A−1 a)t k = 0 1 A−1 a k 1 = = Ak at Ik (A−1 a)t k a ak+1 k+1 0 1 = A. We have to show that Q is positive definite. ∀ (x1 . Since det C = det C t = 1 and det B = d · det Ak then det A = det B = d · det Ak . We have to prove that if det Aj > 0. x2 ) = (0. We will prove the part (a) (the proofs are similar for parts (b). 0). e) If there is an even number m (m ∈ {1. ∀ x = θ. det A1 2 We suppose that the theorem is true for symmetric matrices of order k and prove it for symmetric matrices of order k + 1. If n = 1 then the result is trivial. A=  .. det A2 = det A = a11 · a22 − a12 > 0 and hence: Q(x) = (x1 . . Let A be symmetric matrix of order k + 1. j = 1. k + 1 then xt Ax > 0. 2 If n = 2 then det A1 = a11 > 0. where C= Ik 0 A−1 a k 1 and B = Ak 0 0 d . n}) such that det Am < 0 or if there are two odd numbers m1 and m2 such that det Am1 < 0 and det Am2 > 0 then Q is indefinite. x2 ) a11 a21 a12 a22 2 x1 x2 2 a11 a22 − a12 2 x2 a11 2 = a11 x1 + 2a12 x1 x2 + a22 x2 2 = a11 x1 + a12 x2 a11 + 2 = det A1 x1 + a12 x2 a11 + det A2 2 x > 0.. Suppose that all the leading minors are strictly positive.

By the inductive hypothesis we obtain that det A1 > 0. It is obvious that if Q(x) > 0. It remains for us only to prove that det A = det Ak+1 is positive. Let x ∈ Rk+1 \ {θ}. can be written as x = xk+1 Then xt Ax = xt C t BCx = (Cx)t B(Cx) = y t By = yt yn+1 Ak 0 0 d y yk+1 2 = y t Ak y + dyk+1 .  = x Bx = x (C ) AC x . If n = 1 then the result is trivial. 26 . x Let x ∈ Rk \ {θ}. j = 1. . 1)  . det A1 2 In the previous equality we used the fact that a11 = 0 since if a11 = 0 then Q(1. To prove the converse (Q positive definite implies that det |Aj | > 0. If x = ∈ Rk+1 then 0 0 < xt Ax = xt 0 A x 0 = xt Ak x. In the previous equality we denoted the vector Cx by y = y yk+1 which is not the null vector since C is invertible and x = θ. 0) = 0 and Q cannot be positive definite. . since Q is positive definite we have that   0 Ak 0  . If n = 2. then det A1 > 0 and det A2 > 0. 0 d 1 = (C −1 x)t A(C −1 x) > 0. ∀ x = θ. . Every x ∈ Rk+1 x where x ∈ Rk . then Q(x) = det A1 x1 + a12 x2 a11 2 + det A2 2 x . n) we will use the induction once more. Assume that the result is true for any quadratic form whose coefficient matrix has order k and let A be the (k + 1) × (k + 1) coefficient matrix of a positive definite quadratic form. 0. We write the matrix A as in the first part of the proof.Since det A > 0 and det Ak > 0 then d > 0. Since det Ak > 0 and d > 0 then det A = det Ak+1 > 0 as desired. We have to prove now that d > 0. Hence A = C t BC and det A = det Ak · d. Indeed. and det Ak > 0. By using the inductive hypothesis and the fact that d > 0 we get that xt Ax > 0.  t t −1 t −1 d = (0. det A2 > 0. hence Q is positive definite. .

respectively. The set of all possible solutions is called the solution set or the general solution of the linear system. sn ) of numbers which makes each equation a true statement when the values s1 . . . j=1. Two linear systems are called equivalent if they have the same solution set. . ∀ i = 1. . . . The general form of a linear system of m equations and n unknowns is the following:   a11 x1 + · · · + a1n xn = b1 . xn .Chapter 2 Linear Algebra 2. . . x2 .n A finite set of linear equations in the variables x1 . We say that a linear system is consistent if it has either one solution or infinitely many solutions. a system is inconsistent if it has no solution.. .  am1 x1 + · · · + amn xn = bm am1 . . n. . A solution of the system is a list (s1 .. Gauss-Jordan elimination method where aij ∈ R. a1n  . . sn are substituted for x1 . . .m =  . . j = 1.1 Systems of linear equations.. and each solution of the second system is a solution of the first system. .. exactly one solution. infinitely many solutions. m. . xn ∈ R is called a system of linear equations or a linear system. or 3. no solution. For a linear system we consider • the matrix of the system (the matrix of the coefficients of the unknowns)   a11 . That is. bi ∈ R. . or 2. . A system of linear equation has either 1.  A = (aij )i=1. . each solution of the first system is a solution of the second system. . . amn 27 . .

3) If rank A = rank A < n then the system has infinitely many solutions. .where Ri = (ai1 . ai2 . ain ) is the i row. .. . .  . . 28 . .  .  • the column of the unknowns By using the above matrix notations the system can be written in the following form: Ax = b. . This algorithm is called Gauss-Jordan elimination method and its basic strategy is to replace one system with an equivalent one that is easier to solve. . then the system is inconsistent. . am1 . .. If the equivalent system contains a degenerate linear equation of the following form 0 · x1 + 0 · x2 + · · · + 0 · xn = bi then i) If bi = 0. . This chapter describes an algorithm or a systematic procedure for solving linear systems. a1n b1  . amn bm • the column of the constants  b1  .  b= . The system is inconsistent.  A= . xn    A=  th  R1 R2 . . 1) If rank A < rank A then there is no solution of the considered linear system. Concerning the solutions set of a linear system we have the following result: Remark. Rm    . . The method is named after German mathematicians Carl Friederich Gauss (17771855) and Wilhelm Jordan (1842-1899) but it appears in an important Chinese mathematical text which was written approximately at 150 BCE. The system is consistent. . . . bn   x1  . • the augmented matrix (the coefficient matrix with an added column containing the constants from the right sides of the equations)   a11 . then the degenerate equation may be deleted from the system without changing the solution set. ii) If bi = 0.  x =  . 2) If rank A = rank A = n (where n is the number of the unknowns) then the system has exactly one solution.

.. aij akj → akj − aij · akj = 0. ∀ k = i). Interchange two rows. . .. . ail . A=  . akj .  . If we apply the elementary row operations to an augmented matrix of a linear system we obtain a new matrix which is the augmented matrix of an equivalent linear system to the given one. This can be done by using the elementary row operations which are: 1) Scaling. ∀ k = 1. Suppose that aij = 0. k = i.. λRi → Ri ... .. . n. . . m.The rectangle rule for row operations The purpose of this paragraph is to transform a matrix which has a nonzero column into an equivalent one that contains one element equal to 1 and all the other elements equal to 0 (we say that such a column is in proper form). Let A be the matrix   . aij . . . .    ... Ri ↔ Rj Remark. .. Ri + λRk → Ri 3) Interchange. λ=0 2) Replacement. We want to determine the elementary row operations which transform the element aij into 1 (aij → 1) and all the other elements of the j th column into 0 (akj → 0. .. Replace one row by the sum of itself and a multiple of another row.. . aij aij → aij =1 aij The effects of the previous elementary row operations on the other elements of the matrix are: ail ail → .  .. . . ∀ l = 1. ... . .      .. . aij 1 → Ri . . akl . .. l = j aij 29 .. We consider the following row operations Ri · and Rk − Ri · akj → Rk .. ... This remark is true since it is well known that the solution of a system remains unchanged if we multiply one equation by a nonzero constant or if we add a multiple of one equation to another or if we interchange two equations of a system (the rows of an augmented matrix correspond to the equations in the associated system). . Multiply all entries in a row by a nonzero constant.

Then.akl → akl − ail · akj akl aij − ail akj = . aij 2) The columns which contain 0 on the pivot row remain unchanged. Divide the pivot row by the pivot. Remark. Step 2. from the product of the elements situated in the opposite corners of the previous rectangle’s diagonal which contains the pivot we subtract the product of the elements situated in the corners of the other diagonal and the result is divided by the pivot (rectangle’s rule). in order to transform a matrix which has a nonzero column into an equivalent one that contains one element equal to 1 and all the other elements equal to 0 we have to follow the next steps: Rectangle’s algorithm Step 1. The columns which contain a 0 on the pivot row remain unchanged. 1) The rows which contain 0 on the pivot column remain unchanged. 2 A= 1 −1  0 3 1   1 −1 2  ∼  1/3 2 0 −4/3 0 2 0 1 0 1 2/3 −2/3  −1 0  2 Remark. Set the elements of the pivot column (except the pivot) equal to 0. ∀ l = j. The rectangle rule can be used to determine the inverse of a given invertible matrix A. Step 5. ∀ k = i. The rows which contain a 0 on the pivot column remain unchanged. This can be done by writing at the right side of the given matrix the unitary matrix I which has the same number of rows and columns as the matrix A 30 . Indeed. if ail = 0 then akl → akl · aij − 0 · akj = akl . aij aij The element aij = 0 is called the pivot. Step 4. Example. in order to transform the element akl (by using aij as a pivot) we locate the rectangle which contains the element and the pivot aij as opposite corners. aij So. Compute all the other elements of the matrix by using the rectangle’s rule. So. Choose and circle (from the considered column) a nonzero element which is called the pivot. Indeed. if akj = 0 then akl → akl · aij − ail · 0 = akl . Step 3.

.. 1 2 3 1 0 0 1 0 0 1 0 0 Hence A 2 3 1 2 −1 −5 0 1 0 0 1 0  3 1 2 3 −5 −7 −7 5 18 0 0 1 5 − 13 1 18 7 18 1 0 0 1 −2 −3 −3 2 7 5 − 13 1 18 7 18 1 18 7 18 5 − 18 7 18 5 − 18 1 18 I3 0 1 0 0 1 0 2 −1 −5 0 0 1 0 0 1 0 0 1 7 18 5 − 18 1 18 A−1 =  1 18 7 18 5 − 18  . xn a1n a2n . . The matrix situated at the right side of the unitary matrix (in the final table) is the inverse of the matrix A.. . am1 x2 a12 a22 . 3 1 2 We observe that the matrix A is invertible since its determinant is −18 = 0. .. bm x1 a11 a21 . . We will illustrate the previous procedure by an example. amn 31 . b b1 b2 .. ... Example... am2 . . The table contains the augmented matrix with the constant column written at the left side of the matrix A. . The Gauss-Jordan elimination method This method is an elimination procedure which transforms the initial system into an equivalent one whose solution can be obtained directly. Determine the inverse of the matrix A given by   1 2 3 A =  2 3 1 . Gauss-Jordan elimination algorithm Step 1. . Associate to the given system the following table.and then applying the rectangle rule to the obtained matrix. By choosing successively the elements situated on the main diagonal of matrix A as pivots we will finally obtain the unitary matrix I (situated below the given matrix A). . .

the variables whose columns are not in proper form may assume any values and they are called secondary variables. a) If R corresponds to the following equation 0 · x1 + · · · + 0 · xn = 0 then delete R from the table.the variables whose columns are in proper form are called leading variables. The system is inconsistent. equivalently. In the case of consistency (the system is consistent if we choose a pivot from each row) write the general solution. Continue the above process until we choose a pivot from each row or a degenerate equation is obtained at the step 3b. Choose and circle aij = 0 (the pivot). Step 4. Solve the following linear systems. 32 . each new equation) R. If there is at least one secondary variable then the system has infinitely many solutions. The solution set can be specified as follows . Repeat steps 2 and 3 with the subsystem formed by all the equations from which a pivot hasn’t been chosen yet. with bi = 0 then exit the algorithm. Step 3. Step 5. The pivot has to be chosen from the coefficient matrix A not from the constant column. Use aij as a pivot to eliminate the unknown xj from all the equations except the ith equation (by applying rectangle’s algorithm).Step 2. If all the variables are leading variables then the system has a unique solution which can be obtained directly from the column b . In this case express the leading variables in terms of secondary variables. Examine each new row obtained (or. b) If R corresponds to the following equation 0 · x1 + 0 · x2 + · · · + 0 · xn = bi . Example.   x + 2y − 3z + 4t = 2 2x + 5y − 2z + t = 1 a)  5x + 12y − 7z + 6t = 7 Solution b 2 1 7 2 −3 −3 8 −3 3 x 1 2 5 1 0 0 1 0 0 y 2 5 12 2 1 2 0 1 0 z −3 −2 −7 −3 4 8 −11 4 0 t 4 1 6 4 −7 −14 18 −7 0 The system is inconsistent since we obtain the following equation: 3 = 0 · x + 0 · y + 0 · z + 0 · t.

The leading variables are x. y. z 3   x + 2y − 3z − 2s + 4t = 1 2x + 5y − 8z − s + 6t = 4 c)  x + 4y − 4z + 5s + 2t = 8 Solution b 1 4 8 1 2 7 −3 2 3 21 −7 3 x 1 2 1 1 0 0 1 0 0 1 0 0 y 2 5 4 2 1 2 0 1 0 0 1 0 z −3 −8 −4 −3 −2 −1 1 −2 3 25 −11 3 s −2 −1 5 −2 3 7 −8 3 1 0 0 1 t 4 6 2 4 −2 −2 8 −2 2 24 −8 2 33 . z and the system has a unique solution which is     x 2  y  =  −1  .  x − 2y + z = 7 2x − y + 4z = 17 b)  3x − 2y + 2z = 14 Solution b 7 17 14 7 3 −7 0 −11 7 2 −1 3 x 1 2 3 1 0 0 1 0 0 1 0 0 y −2 −1 −2 −2 3 4 2 11 −4 0 1 0 z 1 4 2 1 2 −1 0 0 1 0 0 1 The system is consistent since we have chosen a pivot from each row.

t and in consequence the system is consistent and has infinitely many solutions.with z.     =       . We define the production vector   x1  x2    x =  . In this model there are n industries producing n different products such that consumption equals production. . t ∈ R. The general solution can be expressed as follows. The general solution is:       where from we easily can express the leading variables in terms of secondary variables   x = 21 − 25z − 24t y = −7 + 11z + 8t  s = 3 − 3z − 2t x y z s t   21 − 25z − 24t −7 + 11z + 8t z 3 − 3z − 2t t  The system is consistent since we have chosen a pivot from each row. From the final table we write down the following equivalent system with the given one:   x + 25z + 24t = 21 y − 11z − 8t = −7  3z + s + 2t = 3 where z. The leading variables are x. The problem is to determine the levels of the outputs of the industries if the external demand is given and the prices are fixed. the secondary variables are z. xn 34 . We will measure the levels of the outputs in terms of their economic values. t ∈ R. s. . Over some fixed period of time.   Leontief Production Model The Leontief production model is a model for the economics of a whole country or region. We remark that a part of production is consumed internally by industries and the rest is to satisfy the outside demand. let xi = monetary value of the total output of the ith industry di = monetary value of the output of the ith industry needed to satisfy the external demand cij = monetary value of the output of the ith industry needed by the j th industry to produce one unit of monetary of its own output.   . y.

Writing x as In x and using matrix algebra. 3 c11  c21 C=  ..the demand vector and the consumption matrix    d=  c12 c22 .. 2x2 − 0.. . cn2  d1 d2 . 1 0.. i = 1. The above system can be solved by using the Gauss-Jordan elimination method.. 30 units for agriculture and 20 units for services. . 4x1 + 0. . dn       c1n c2n  . then we obtain x = (In − C)−1 d.. 7x3 = 20 35 . 1x2 + 0.. n. Suppose the external demand is 50 units for manufacturing. 2x1 − 0. 2 0. 3 0. Find the production level that will satisfy this demand. As a simple example. dj . We are led to the following equation x = Cx + d which is called the Leontief input-output model. cij ≥ 0 for each j = 1. Solution 1 (by using the Gauss-Jordan elimination method) The production equation is the following (I3 − C)x = d which gives us the following system to be solved:   0. 1x3 = 30  −0.. 2 0. agriculture and services whose consumption matrix is given by   0... n. 1x3 = 50 −0.. cn1 . or production model. 4 0.  cnn It is obvious that xj . . 5x1 − 0. If the matrix In − C is invertible. Example. 0. 5 0. 7x2 − 0.. we can rewrite the previous equation as In x − Cx = d (In − C)x = d. 1  . . . The quantity ci1 x1 +c12 x2 +· · ·+cin xn is the value of the output of the ith industry needed by all n industries. suppose the economy consists of three sectors manufacturing.. 1 C =  0..

5 −0. 1 −2 5 27 50 9 − 50 −0. 0. 1 −0. We first determine the matrix (I3 − C)−1 . 1 −0. 1 2 7 −1 2 9 −15 0 1 0 0 1 0 x3 −0.b 50 30 20 −500 300 200 −500 −200 3700 − 4100 9 − 200 9 10100 3 950 9 4450 27 5050 27 x1 0. 1 0. 2 −5 −4 −2 −5 −9 33 −3 −1 18 0 0 1 x2 −0. 2 0. 2 1 0 0 1 0 0 1 0 0 36 I3 − C −0. 1 0. 7 −0. 4 −0. 7 −1 5 9 − 50 33 50 −1 3 −1 3 3 5 1 0 0 2 4 5 2 5 70 27 40 27 2 3 80 27 50 27 10 9 I3 0 1 0 0 1 0 20 27 50 27 1 3 25 27 55 27 5 9 0 0 1 0 0 1 0 0 1 5 9 5 9 5 3 0 1 0 0 1 0 0 0 1 . 4 −0. 2 0. 7 1 −1 7 1 0 0 1 0 0 1 0 0 5050 ≈ 187 27 4450 x2 = ≈ 165 27 950 ≈ 106 x3 = 9 Solution 2 (by determining the inverse of the matrix I − C) We know that the production level is determined by x1 = x = (I3 − C)−1 D. 5 −0. 7 −0.

Let d be the vector in Rn with 1 in the j th entry and zeros elsewhere. The corresponding production vector x is the j th column of (I − C)−1 . If C and d have nonnegative entries and if each row sum or each column sum of C is less than 1. Let C be the consumption matrix for an economy and let d be the vector of external demand. then (I − C)−1 exists and the production vector x = (I − C)−1 d has nonnegative entries and is the unique solution of the production equation x = Cx + d. and if we eliminate these equations we don’t change the general solution).. The economic interpretation of entries in (I − C)−1 The (i. Remark. Proof.Hence.. This shows that the (i. Theorem. the conclusion is true since if x1 and x2 are production vectors which satisfy respectively the external demands d1 and d2 then x1 − x2 is the production vector which satisfies the external demand d1 − d2 . 37 . The theorem below shows that in most practical cases. (I − C) and in consequence −1 =  80 27 50 27 10 9 25 27 55 27 5 9 5 9 5 9 5 3     as we expected. j)th entry of the matrix (I − C)−1 is the increased amount of the ith sector which is to be produced in order to satisfy an increase of 1 unit in the external demand for sector j. j)th entry of (I − C)−1 gives the production of the ith sector to satisfy 1 unit in the external demand for sector j. Now. We suppose that the above system is a consistent one with an infinite number of solutions (that means that rank A = rank A < n).  am1 x1 + am2 x2 + · · · + amn xn = bm . x = (I − C)−1 d =   5050 27 4450 27 950 9 Basic feasible solutions We consider a linear system in general form   a11 x1 + a12 x2 + · · · + a1n xn = b1 . Also. we suppose that rank A = m (in the case that rank A < m then there are some equations of the system which are linear combinations of the others. I − C is invertible and the production vector x is economically feasible in the sense that the entries in x are nonnegative.

  3 In this case we obtain the BS  1  which is also a BFS. If one or more basic variables in a BS are zero then the solution is a degenerate BS. Determine all the basic solutions and all the basic feasible solutions of the following system: 2x1 + 3x2 − x3 = 9 −x1 + x2 − x3 = −2 Solution. The 2 basic variables can be: a) x1 . Since A= 2 3 −1 1 −1 9 −1 −2 . 0 3 b) x1 . A leading variable is also called a basic variable and a secondary variable is called a nonbasic variable. x3 (x1 is a nonbasic variable)   0 In this case we obtain the BS  11  which is a BFS. Example. Definitions A feasible solution (FS) of a linear system is a solution for which all the components are nonnegative.Since rank A = rank A = m < n the system will have m leading variables and n − m secondary variables. 2 15 2 38 . If a BFS is degenerate. we have 2 basic variables and one nonbasic variable. −5 3 c) x2 . x3 (x2 is a nonbasic variable)  11  In this case we obtain the BS  0  which is not a BFS. it is called a degenerate BFS. A basic solution (BS) of a linear system is a solution for which all the nonbasic variables are zero. A basic feasible solution (BFS) is a feasible solution which is also a basic one. Actually. then rank A = rank A = 2 then the system is consistent with an infinite number of solutions. x2 (x3 is a nonbasic variable) Since x3 is nonbasic then x3 = 0 and the system becomes 2x1 + 3x2 = 9 −x1 + x2 = −2 The solution of the previous system is x1 = 3 and x2 = 1.

Remark.. .. .. We will use the Gauss-Jordan elimination method. . .. .. Since a basic variable is a variable from whose column we have chosen a pivot.. So. .. xn ∈ R In order to get a basic solution we let xm+1 = · · · = xn = 0. Since the rank A = m then we have m basic variables and n−m nonbasic variables. The computations can be arranged in the following table.. . This basic solution is also a basic feasible solution if in the final table the column of the constants contains only nonnegative elements. . Our purpose is to determine the basic feasible solutions of a linear system. ..  ... . . .. . . . . Eventually by renumbering the unknowns we can suppose that we have chosen pivots from the first m columns. we can suppose that the basic variables are x1 . . xm = βm .. In consequence. .. so x1 = β1 ... 39      x =  βm   0   . β1 . . b b1 .. bm .. xm a1m amm 0 1 xm+1 a1m+1 amm+1 α1m+1 αmm+1 . ..  . For a consistent system having an infinite number of solutions whose m rank is m < n (n is the number of unknowns) there are at most Cn basic solutions. that means that we have chosen m pivots from m different columns and m different rows. . we choose a pivot from each row. The basic solution  β1 β2 . . . .. 1 0 . 0 . xn .. x2 = β2 . . xn a1n amn α1n αmn The general solution is:   x1 = β1 − (α1m+1 xm+1 + · · · + α1n xn )    x2 = β2 − (α2m+1 xm+1 + · · · + α2n xn )   . .. βm x1 a11 am1 .. . .   xm = βm − (αmm+1 xm+1 + · · · + αmn xn )    xm+1 . . xm and the nonbasic variables are xm+1 . .             is in the final table. . ..

so the pivot has to satisfy the following condition bi akj ≤ bk aij . aij Since aij > 0 then bk aij − bi akj ≥ 0. m | akj ≤ 0}. aij akj So. If k ∈ J1 then (∗) is equivalent with the following condition bk bi ≤ . In order to keep the nonnegativity property of the constants column we obtain the following rule for choosing a pivot on the j th column. too. 1) The pivot has to be positive. we will determine rules for choosing the pivot such that if in the initial table the column of the constants is nonnegative then so it will be in the final table. aij > 0. aij If k = i. 2) If J1 = ∅ (on the j th column there is no positive element) then none of the elements of the j th column can become a pivot. m (∗) Let J1 = {k = 1. then aij (the pivot) has to be positive. m | akj > 0}. k = i. Actually. which has to be If we choose aij = 0 as a pivot then bi will transform into aij nonnegative. k = 1. The previous condition is called the ratio test. then we can multiply it by −1). Conclusion.Next. we are interested in preserving the property of the constant column to contain only nonnegative elements at each intermediate table which occurs when we solve the system. m. We are interested in choosing a pivot from the j th column such that the constant column in the next table will remain nonnegative. For k = i the previous inequality becomes bi aij − bi aij = 0 ≥ 0. then the element bk will transform by using the rectangle’s rule into bk → bk aij − bi akj ≥ 0. k ∈ J1 akj where J1 = {k = 1. bi Since bi ≥ 0 and ≥ 0. m | akj > 0} and J2 = {k = 1. k = 1. bi . If k ∈ J2 then (∗) is satisfied since bi akj ≤ 0 ≤ bk aij . In this case xj can’t be a basic variable. ∀ k ∈ J1 . (∗) is satisfied if bi = min aij bk . We may assume that in the initial table the constant column is nonnegative (if there is an equation whose righthand side constant is negative. 40 .

. . . . a2 < b2  or   a1 = b1 . Let a = (a1 . . we multiply the second equation by −1. If the ratio test is satisfied for more than one element then the pivot will be the element which provides us the minimum row by using the lexicographical order. a3 < b3     or    a1 = b1 . . . . an−1 = bn−1 . an < bn Examples. in order to obtain a positive constant in the right hand side.If J1 = ∅ then the pivot will be the positive element situated on j th column for which the ratio test is satisfied. 2x1 + 3x2 − x3 = 9 x1 − x2 + x3 = 2 b 9 2 5 2 1 3 11 2 15 2 → x1 → x2 ← x1 x2 → x3 x1 2 1 0 1 0 1 3 2 5 2 x2 x3 3 −1 −1 1 5 −3 −1 1 1 −3 5 0 1 0 2 5 ratio test min 9 2 2. 1 =2 0 1  3 BF S : x =  1  0 BF S : x =   0 11 2 15 2    41 . for the ratio test. Determine a basic feasible solution for the following systems: a) 2x1 + 3x2 − x3 = 9 −x1 + x2 − x3 = −2 | −1 First. a2 = b2 . . an ) ∈ Rn and b = (b1 . . We say that a < b (in lexicographical order) if   a1 < b1    or     a1 = b1 . . Remark. . bn ) ∈ Rn . The computation table contains an extra column situated at the right hand side of the usual table.

1 =1 min 2 7 3. 2. 10 = 5 2 5 2 −1 −3 1 ∨ 5 1 −1 1 1 2 2 min 0 3 2 .b) 2x1 − x2 − 3x3 + x4 = 5 x1 − 2x2 + x3 + 2x4 = 10 b 5 10 0 → x4 x1 x4 5 0 5 x1 2 1 3 2 1 2 x2 −1 −2 0 −1 0 −1 x3 −3 1 7 −2 1 2 −7 3 5 3 x4 1 2 0 1 0 1 ratio test 5 min 1 . 7 = 2 3 0 0 20 s 3 7 2 0 0 −1 2 0 1 2 →z 0 1 0 4  0    BF S :  7   2   3  0  35 2  42 . 5 1 2 =0 1 0  0  0  BF S :    0  5 a degenerate BF S    x + 2y − 3z − 2s + 4t = 1 2x + 5y − 8z − s + 6t = 4 c)  x + 4y − 7z + 5s + 2t = 8 b 1 4 8 1 2 7 x 1 2 1 1 0 0 1 0 0 1 y 2 5 4 2 1 2 →x x →s x 7 3 2 3 7 3 35 2 8 3 1 3 −1 3 1 2 z −3 −8 −7 −3 −2 −4 − 13 3 −2 3 2 3 s t −2 4 −1 6 5 2 −2 4 3 −2 7 −2 8 0 3 2 1 −3 0 8 3 ratio test min 1 4 8 1.

x2 . m . f rom F1 f rom F2 f rom Fn i = 1.  am1 x1 + am2 x2 + · · · + amn xn ≥ bm   xj ≥ 0. . . . The mathematical model is the following: . Nm .the amount xj of food Fj (j = 1. j = 1. n) such that the total cost of the diet is as small as possible.2.The total cost: f (x1 . . .2 Linear programming problems (LPP) Example. . .. i = 1. xn ) = x1 c1 + x2 c2 + · · · + xn cn → minimize . . Fn . j = 1.each unit of food Fj contains aij units of the ith ingredient. The diet problem We want to determine the most economical diet which satisfies the basic minimum nutritional requirements for a good health.food Fj sells at a price cj per unit. Definition (General form of a linear programming problem) A LPP is an optimization (minimization or maximization) problem of the following form: Find the optimum (minimum or maximum) of the following function n subject to the constraints:   a11 x1 + a12 x2 + · · · + a1n xn ≥ b1   .there are available n different kinds of food: F1 . m We have to solve the following problem (example of a linear programming problem): n f= j=1 cj xj → minimize (the objective function) The main characteristic of a LPP is that all the involved functions: the objective function and those which express the constraints must be linear. . .. n f= j=1 cj xj subject to the constraints: 43 . .there are m basic nutritional ingredients N1 . We know: .The quantity of the ith ingredient received by a person is: ai1 x1 + ai2 x2 + · · · + ain xn ≥ bi . . n . We want: .for a balanced diet each individual must receive at least bi units of Ni th ingredient per day.

Definition (Standard form of a LPP) Optimize n f= j=1 cj xj subject to the constraints:   a11 x1 + · · · + a1n xn = b1   . bi . m xj ≥ 0. We obtain ai1 x1 + ai2 x2 + · · · + ain xn − yi = bi . q aij xj = bi . In this case yi is called a slack variable.  am1 x1 + · · · + amn xn = bm   xj ≥ 0. j = 1. xj . i = 1. then we add to the left side a new variable yi ≥ 0 in order to transform the inequality into an equality. j = 1. j = 1. m are known real numbers. n where cj . n. Remark. n. i = 1. Depending on particular values of p and q we may have inequality constraints of one type or the other and equality restrictions as well. and xj . i = q + 1. Definition. 44 . bi . In this case yi is called a surplus variable. n are real numbers to be determined. a) If ai1 x1 + ai2 x2 + · · · + ain xn ≤ bi . aij . Any LPP can be converted to the standard form by using the slack or surplus variables.                         n j=1 n j=1 n j=1 aij xj ≤ bi . m (otherwise we multiply the equality by −1). b) If ai1 x1 + ai2 x2 + · · · + ain xn ≥ bi then we subtract to the left side of the inequality a new variable yi ≥ 0 in order to transform the inequality into an equality. n where cj . We obtain: ai1 x1 + ai2 x2 + · · · + ain xn + yi = bi . i = 1. j = 1. p aij xj ≥ bi . m are known real numbers.. Any solution of the constraints for which the optimum of the objective function is obtained is called an optimal solution. j = 1. n are real numbers to be determined. i = 1. j = 1.. i = p + 1. aij . We can assume that bi ≥ 0.

So. 0) (1. The graphical approach is extremely helpful in understanding the kinds of phenomena which can occur in solving linear programming problems. Example 1. 3) (0. the set of feasible solution is the intersection of these half planes. Determine the maximum of the function   −2x + 2y ≤ 4 3x − y ≤ 3 f (x. c const. 0) To solve this prob