
University of Zimbabwe

MTE 101 ENGINEERING MATHEMATICS 1


Lecture hours: 60 hours

Author: Mr T.V. Mupedza
Department: Mathematics and Computational Sciences
Office: Mathematics building, Office No. 216, Old wing
Contact: 0779791216
Email: tvmupedza@science.uz.ac.zw

October 16, 2021


Course Outline

0.1 Purpose

This course is designed to cover Calculus of single variables and part of Calculus of several variables
for undergraduate engineering students with some background of A-Level calculus. The concentra-
tion is on motivating results and concepts geometrically rather than on providing rigorous proofs.
Concepts are defined carefully and results stated precisely, but illustrated by way of vivid, concrete
examples. We seek to introduce and develop concepts (including complex numbers and the algebra
of polynomials) necessary for a first course in algebra. Our goal is the elementary theory of matrices
and determinants, and their applications to solving systems of linear equations. The course is a
pre-requisite for MT201 (Engineering mathematics 2). It has 20 lectures per week which have a
duration of one hour each.
Monday to Friday 0800 − 1200 hrs.

0.2 AIM

The aim of this course is to introduce gently the rigour of mathematical analysis and provide a
good background for applied mathematics.

0.3 Course Content

0.3.1 Number systems

1. Natural, Rational and Irrational Numbers, [1]


2. Principle of Mathematical Induction, [2]
3. The Real Number System, Inequalities, Solution Sets and Geometrical Representation, [1]
4. Absolute Value, Neighborhoods and Intervals. [1]

0.3.2 Sequences and series

1. Definitions and Notation, Limits of Sequences and their properties, [2]

2. Monotone Sequences, [1]

3. Convergence or Divergence of infinite Series, [1]

4. Tests for Convergence or Divergence of infinite Series (Direct Comparison Test, Limit Com-
parison test, Alternating Series Test, Absolute convergence, N th Root test and Ratio Test)
[2]

0.3.3 Functions, limits of functions of a single variable

1. Definitions and Notation, Types of Functions and their Inverses, [2].

2. Definition of a limit of a function and its application, [1]

3. Left and Right hand Limits, [1]

4. L’ Hospital’s Rule, [1]

5. Continuous Functions, [2]

6. Curve sketching, [1]

0.3.4 Differentiation

1. The concept of a derivative, [1]

2. Theorems on Differentiation. [2]

3. Applications of the derivative. The Mean Value Theorem and Rolle’s Theorem, [2]

4. Leibniz Theorem with Applications, [2]

0.3.5 Integration

1. Indefinite integral and definite integral, [2]

2. Techniques of integration, [2]

3. Reduction Formulas, [2]

4. Improper integrals, [2]

0.3.6 Functions of two or more variables

1. Limits and continuity, [2]

2. Partial differentiation, [2]

3. The chain rule, [2]

4. The extended chain rule, [1]

5. Maxima, Minima and Saddle points, [2]

0.3.7 Integration of functions of several variables

1. Double integration, [1]

2. Changing the order of integration and changing variables of integration, [3]

3. Triple integrals, [1]

0.3.8 Basic Concepts of Set Theory

1. Importance of set theory, Notation, Some interesting sets of numbers, Well-defined sets, Spec-
ification of Sets, [1]

2. The empty set (null set), Identity and cardinality, Russell’s Paradox, Inclusion, Axiom of
Extensionality, [2]

3. Power set, Operations on Sets, Difference and Complement, Venn diagrams, [2]

4. Set Theoretic Equalities, The Algebra of sets, Set Products. [2]

0.3.9 Introduction to Probability

1. Introduction to probability, random experiments, sample spaces, events, mutually exclusive
events, axiomatic definition of probability, relative frequency, [2]

2. Computation of probabilities of finite sample spaces, cardinality of a set, probabilities based on
symmetry, methods of enumerating sample points, conditional probability, total probability,
independent events, Bayes’ Law. [3]

0.3.10 Complex Numbers and Polynomials

1. Introduction, Operations, Rules of Complex arithmetic, [2]

2. Modulus, Complex conjugate, Division, Polar representation of complex numbers, De Moivre’s
theorem and its application, [4]

3. Applications of complex numbers, The fundamental theorem of algebra. [4]

0.3.11 Matrices and Determinants

1. Matrix addition and multiplication, properties, Transpose of a matrix, square matrices, diag-
onal and trace, Powers of matrices, [2]

2. Some special types of square matrices, Determinants, Laplace expansion of the determinant,
Inverse of matrices, [5]

3. Application of matrices, Elementary row operations, Inverses using row operations, Solving
systems of linear equations. [5]

0.4 Methods/Strategies to be used


1. lecture method,

2. group discussion,

3. seminars,

4. tutorials.

0.5 Student Assessment

Students will write three one-hour tests, one at the end of each week.
The average of the tests constitutes the coursework mark, which contributes 50% of the final mark.
A 3-hour final examination will be written in the 4th week of the first semester.
The examination contributes the remaining 50% of the final mark.
The examination paper has two sections, Section A and Section B. Candidates may attempt
ALL questions in Section A and at most TWO questions in Section B. Section A carries 40 marks
and each question in Section B carries 30 marks.

0.6 Selected Resources (References)

Recommended reading

• S Lang, Calculus of Several Variables (Springer Science+Business Media New York).

• P D Lax, M S Terrell, Calculus with Applications (Springer Science+Business Media New York).

• M R Spiegel, Advanced Calculus (Schaum’s Outline Series).

• J R Kirkwood, An Introduction To Analysis (PWS Publishing Company).

• A Jeffrey, Linear Algebra and Differential Equations.

• Howard Anton, Elementary Linear Algebra, 7th edition (Wiley, 1994).

• M R Spiegel, Complex Variables (Schaum’s Outline Series).

Additional reading

Any first year university Calculus text book.

Abstract

Calculus is one of the milestones of Western thought. Building on ideas of Archimedes, Fermat,
Newton, Leibniz, Cauchy, and many others, the calculus is arguably the cornerstone of modern
science. Any well-educated person should at least be acquainted with the ideas of calculus, and a
scientifically literate person must know calculus solidly. Calculus has two main aspects: differential
calculus and integral calculus. Differential calculus concerns itself with rates of change. Various
types of change, both mathematical and physical, are described by a mathematical quantity called
the derivative. Integral calculus is concerned with a generalized type of addition, or amalgamation,
of quantities. Many kinds of summation, both mathematical and physical, are described by a
mathematical quantity called the integral.

Calculus is one of the most important parts of mathematics. It is fundamental to all of modern
science. How could one part of mathematics be of such central importance? It is because calculus
gives us the tools to study rates of change and motion. All analytical subjects, from biology to
physics to chemistry to engineering to mathematics, involve studying quantities that are growing
or shrinking or moving, in other words, they are changing. Astronomers study the motions of the
planets, chemists study the interaction of substances, physicists study the interactions of physical
objects. All of these involve change and motion.¹ ²

¹ To Archimedes, Pierre de Fermat, Isaac Newton, and Gottfried Wilhelm von Leibniz, the fathers of calculus.
² “The true sign of intelligence is not knowledge but imagination.” (Albert Einstein)

Chapter 1

The Basics

1.1 Number Systems

Mathematics has its own language with numbers as the alphabet. The language is given structure
with the aid of connective symbols, rules of operation, and a rigorous mode of thought (logic). The
number systems that we use in calculus are the natural numbers, the integers, the rational numbers,
and the real numbers. Let us describe each of these :

1. The natural numbers are the system of positive counting numbers 1, 2, 3 . . . . We denote the
set of all natural numbers by N.

N = {1, 2, 3, 4, 5, 6, 7, 8, . . . }.

2. The integers are the positive and negative whole numbers and zero, . . . , −3, −2, −1, 0, 1, 2, 3, . . . .
We denote the set of all integers by Z.

Z = {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . }.

3. The rational numbers are quotients of integers or fractions, such as $\frac{2}{3}$, $-\frac{5}{4}$. Any number
of the form $\frac{p}{q}$, with p, q ∈ Z and q ≠ 0, is a rational number. We denote the set of all rational
numbers by Q.
\[ \mathbb{Q} = \left\{ \frac{p}{q} \;\middle|\; p, q \in \mathbb{Z},\ q \neq 0 \right\}. \]
4. The real numbers are the set of all decimals, both terminating and non-terminating. We
denote the set of all real numbers by R. A decimal number of the form x = 3.16792 is actually
a rational number, for it represents
\[ x = 3.16792 = \frac{316792}{100000}. \]
A decimal number of the form

m = 4.27519191919 . . . ,

with a group of digits that repeats itself interminably, is also a rational number. To see this,
notice that
\[ 100 \cdot m = 427.519191919\ldots \]
and therefore we may subtract m from 100m:
\[
\begin{aligned}
100m &= 427.519191919\ldots \\
m &= 4.27519191919\ldots
\end{aligned}
\]
Subtracting, we see that
\[ 99m = 423.244 \quad\text{or}\quad m = \frac{423244}{99000}. \]
So, as we asserted, m is a rational number or quotient of integers. To indicate recurring
decimals we sometimes place dots over the repeating cycle of digits, e.g., $m = 4.275\dot{1}\dot{9}$,
$\frac{19}{6} = 3.1\dot{6}$.
Another kind of decimal number is one which has a non-terminating decimal expansion that
does not keep repeating. An example is π = 3.14159265 . . . . Such a number is irrational, that
is, it cannot be expressed as the quotient of two integers.
In summary : There are three types of real numbers : (i) terminating decimals, (ii) non-
terminating decimals that repeat, (iii) non-terminating decimals that do not repeat. Types
(i) and (ii) are rational numbers. Type (iii) are irrational numbers.
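
The subtraction trick used for m = 4.27519191919 . . . can be sketched in code. The helper below is a hypothetical illustration (the name `repeating_to_fraction` and its argument layout are assumptions, not part of the course material); it relies only on Python's standard `fractions` module.

```python
from fractions import Fraction


def repeating_to_fraction(integer_part, non_repeating, repeating):
    """Return integer_part.non_repeating(repeating)(repeating)... exactly.

    integer_part is a non-negative int; the two digit blocks are strings,
    and `repeating` must be non-empty. Example: 4.275191919... is
    (4, "275", "19").
    """
    k = len(non_repeating)   # number of digits before the cycle starts
    r = len(repeating)       # length of the repeating cycle
    # Terminating part: integer_part.non_repeating
    head = Fraction(int(str(integer_part) + non_repeating), 10 ** k)
    # Repeating part: the cycle over 99...9 (r nines), shifted k places right
    tail = Fraction(int(repeating), (10 ** r - 1) * 10 ** k)
    return head + tail


m = repeating_to_fraction(4, "275", "19")
print(m == Fraction(423244, 99000))  # True
```

`Fraction` reduces to lowest terms automatically, so the printed value of `m` may look different from 423244/99000 while being equal to it.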
The geometric representation of real numbers as points on a line is called the real axis. Between
any two rational numbers on the line there are infinitely many rational numbers. This leads
us to call the set of rational numbers an everywhere dense set.
Real numbers are characterised by three fundamental properties :

(a) The algebraic properties formalise the rules of calculation (addition, subtraction,
multiplication, division). Example: $2(3 + 5) = 2 \cdot 3 + 2 \cdot 5 = 6 + 10 = 16$.
(b) The order properties govern inequalities. Example: $-\frac{3}{4} < \frac{1}{3}$.
(c) Completeness implies that there are “no gaps” on the real line.
Algebraic properties of the reals for addition (a, b, c ∈ R) are :
(A1) a + (b + c) = (a + b) + c. associativity
(A2) a + b = b + a. commutativity
(A3) There is a 0 such that a + 0 = a. identity
(A4) There is an x such that a + x = 0. inverse
Why these rules? They define an algebraic structure (commutative group). Now define anal-
ogous algebraic properties for multiplication :

(M1) a(bc) = (ab)c. associativity
(M2) ab = ba. commutativity
(M3) There is a 1 such that a · 1 = a. identity
(M4) There is an x such that ax = 1 for a ≠ 0. inverse

Finally, connect multiplication and addition :

(D) a(b + c) = ab + ac. distributivity

These 9 rules define an algebraic structure called a field.


Order properties of the reals are :

(O1) for any a, b ∈ R, a ≤ b or b ≤ a. totality
(O2) if a ≤ b and b ≤ a, then a = b. antisymmetry
(O3) if a ≤ b and b ≤ c, then a ≤ c. transitivity
(O4) if a ≤ b, then a + c ≤ b + c. order under addition
(O5) if a ≤ b and c ≥ 0, then ac ≤ bc. order under multiplication

Some useful rules for calculations with inequalities are : If a, b, c are real numbers, then :

(a) if a < b and c < 0 ⇒ bc < ac.
(b) if a < b ⇒ −b < −a.
(c) if a > 0 ⇒ $\frac{1}{a} > 0$.
(d) if a and b are both positive or both negative, then a < b ⇒ $\frac{1}{b} < \frac{1}{a}$.
The completeness property can be understood by the following construction of the real
numbers : Start with the counting numbers 1, 2, 3, . . . .

• N = {1, 2, 3, 4, . . . } natural numbers ⇒ Can we solve a + x = b for x?
• Z = {. . . , −2, −1, 0, 1, 2, . . . } integers ⇒ Can we solve ax = b for x?
• Q = {p/q | p, q ∈ Z, q ≠ 0} rational numbers ⇒ Can we solve $x^2 = 2$ for x?
• R real numbers, for example: the positive solution to the equation $x^2 = 2$ is $\sqrt{2}$. This
is an irrational number whose decimal representation is not eventually repeating.

⇒ N ⊂ Z ⊂ Q ⊂ R
In summary, the real numbers R are complete in the sense that they correspond to all points on
the real line, i.e., there are no “holes” or “gaps”, whereas the rationals have “holes” (namely
the irrationals).
You Try It : What type of real number is 3.41287548754875 . . . ? Can you express this
number in more compact form?

1.2 Intervals

Definition 1.2.1. A subset of the real line is called an interval if it contains at least two numbers
and all the real numbers between any of its elements.

Examples :

1. x > −2 defines an infinite interval. Geometrically, it corresponds to a ray on the real line.

2. 3 ≤ x ≤ 6 defines a finite interval. Geometrically, it corresponds to a line segment on the real
line.

Finite Intervals. Let a and b be two points such that a < b. By the open interval (a, b) we mean
the set of all points between a and b, that is, the set of all x such that a < x < b. By the closed
interval [a, b] we mean the set of all points between a and b or equal to a or b, that is, the set of all
x such that a ≤ x ≤ b. The points a and b are called the endpoints of the intervals (a, b) and [a, b].

By a half-open interval we mean an open interval (a, b) together with one of its endpoints. There
are two such intervals : [a, b) is the set of all x such that a ≤ x < b and (a, b] is the set of all x such
that a < x ≤ b.

Infinite Intervals. Let a be any number. The set of all points x such that a < x is denoted by
(a, ∞), the set of all points x such that a ≤ x is denoted by [a, ∞). Similarly, (−∞, b) denotes the
set of all points x such that x < b and (−∞, b] denotes the set of all x such that x ≤ b.

1.3 Solving Inequalities

Solving an inequality means finding all x ∈ R that satisfy it, usually an interval or a union of
intervals. The set of all solutions is called the solution set of the inequality.
Examples:

1.

2x − 1 < x + 3
2x < x + 4
x < 4.

2. For what values of x is x + 3(2 − x) ≥ 4 − x?

x + 3(2 − x) ≥ 4 − x when
x + 6 − 3x ≥ 4−x
6 − 2x ≥ 4−x
2 ≥ x ⇒ x ≤ 2.

3. For what values of x is (x − 4)(x + 3) < 0?


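A product of two factors is negative exactly when the factors have opposite signs, so there are two cases.
Case 1: (x − 4) > 0 and (x + 3) < 0, =⇒ x > 4 and x < −3.
Impossible since x cannot be both greater than 4 and less than −3.
Case 2: (x − 4) < 0 and (x + 3) > 0, =⇒ x < 4 and x > −3 =⇒ −3 < x < 4.
Hence the solution set is the open interval (−3, 4).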
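
The case analysis for (x − 4)(x + 3) < 0 can be cross-checked numerically. The snippet below is only a sanity check on a grid of sample points, not a proof; the function names are illustrative assumptions.

```python
def satisfies(x):
    """True when x satisfies the inequality (x - 4)(x + 3) < 0."""
    return (x - 4) * (x + 3) < 0


def in_claimed_set(x):
    """True when x lies in the interval -3 < x < 4 found by the case analysis."""
    return -3 < x < 4


# Sample points from -10 to 10 in steps of 0.1, including both endpoints -3 and 4.
samples = [k / 10 for k in range(-100, 101)]
mismatches = [x for x in samples if satisfies(x) != in_claimed_set(x)]
print(mismatches)  # an empty list means the two descriptions agree on every sample
```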

You Try It: Solve the inequality $\frac{2}{x - 1} < \frac{3}{2x + 1}$.

1.4 The Absolute Value

It is a quantity that gives the magnitude or size of a real number. The absolute value or modulus
of a real number x, denoted by |x|, is given by

\[
|x| = \begin{cases} x, & \text{if } x \ge 0, \\ -x, & \text{if } x < 0. \end{cases}
\]

Geometrically, |x| is the distance between x and 0. For example, | − 6| = 6, |5| = 5, |0| = 0.

1.4.1 Properties of the Absolute Value

1. The absolute value of a real number x is non-negative, that is, |x| ≥ 0.

2. The absolute value of a real number x is zero if and only if x = 0, that is, |x| = 0 ⇐⇒ x = 0.

3. In general, if x and y are any two numbers, then

(a) −|x| ≤ x ≤ |x|.
(b) | − x| = |x| and |x − y| = |y − x|.
(c) |x| = |y| implies x = ±y.

(d) $|xy| = |x| \cdot |y|$ and $\left|\frac{x}{y}\right| = \frac{|x|}{|y|}$ if y ≠ 0.
(e) |x + y| ≤ |x| + |y|. (Triangle inequality)
4. If a is any positive number, then
(a) |x| = a if and only if x = ±a.
(b) |x| < a if and only if −a < x < a.
(c) |x| > a if and only if x > a or x < −a.
(d) |x| ≤ a if and only if −a ≤ x ≤ a.
(e) |x| ≥ a if and only if x ≥ a or x ≤ −a.

Example: Show that for all real numbers x, | − x| = |x|.


Solution: If x ∈ R, then either x > 0, x = 0 or x < 0. If x > 0, then −x < 0. Thus,
| − x| = −(−x) = x = |x|, that is, | − x| = |x|.
If x = 0, then | − x| = | − 0| = |0| = 0, that is, | − x| = |x|.
If x < 0, then −x > 0. Now |x| = −x = | − x| since −x > 0.
Therefore in all cases | − x| = |x|.

Solving an Equation with Absolute Values: Solve the equation |2x − 3| = 7.

Solution: By property 4(a), |2x − 3| = 7 if and only if 2x − 3 = ±7, so there are two possibilities:

2x − 3 = 7, so 2x = 10 and x = 5;
2x − 3 = −7, so 2x = −4 and x = −2.

The solutions of |2x − 3| = 7 are x = 5 and x = −2.



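Solving Inequalities Involving Absolute Values: Solve the inequality $\left|5 - \frac{2}{x}\right| < 1$.

Solution: We have
\[
\begin{aligned}
\left|5 - \frac{2}{x}\right| < 1 &\iff -1 < 5 - \frac{2}{x} < 1 \\
&\iff -6 < -\frac{2}{x} < -4 \\
&\iff 3 > \frac{1}{x} > 2 \\
&\iff \frac{1}{3} < x < \frac{1}{2}.
\end{aligned}
\]
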
Solve the inequalities and show the solution set on the real line. (a) |2x − 3| ≤ 1 (b) |2x − 3| ≥ 1.

Solution: (a)

|2x − 3| ≤ 1 ⇐⇒ −1 ≤ 2x − 3 ≤ 1
⇐⇒ 2 ≤ 2x ≤ 4
⇐⇒ 1 ≤ x ≤ 2.

The solution set is the closed interval [1, 2].

(b)

|2x − 3| ≥ 1 ⇐⇒ 2x − 3 ≥ 1 or 2x − 3 ≤ −1
⇐⇒ x ≥ 2 or x ≤ 1.

The solution set is (−∞, 1] ∪ [2, ∞).

You Try It: Solve the inequality 4|x| < 7x − 6.

1.5 The Principle of Mathematical Induction

Mathematical induction is an important property of the positive integers (natural numbers). It is
used to prove statements involving all positive integers when it is known, for example, that the
statements are valid for n = 1, 2, 3, and it is suspected or conjectured that they hold for all
positive integers.

1.5.1 Steps

1. Prove the statement for n = 1 or some other positive integer. (Initial Step)

2. Assume the statement true for n = k, where k ∈ Z+ . (Inductive Hypothesis)

3. From the assumption in 2 prove the statement must be true for n = k + 1.

4. Since the statement is true for n = 1 (from 1) it must (from 3) be true for n = 1 + 1 = 2 and
from this for n = 2 + 1 = 3, and so on, so must be true for all positive integers. (Conclusion)

Example: For any positive integer n,
\[ 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}. \]
Solution:

1. Prove for n = 1: $1 = \frac{1(1 + 1)}{2} = \frac{2}{2} = 1$, which is clearly true.

2. Assume that the statement holds for n = k, that is,
\[ 1 + 2 + \cdots + k = \frac{k(k + 1)}{2}. \]

3. Prove for n = k + 1. So
\[
\begin{aligned}
1 + 2 + \cdots + k + (k + 1) &= \frac{k(k + 1)}{2} + (k + 1) \quad \text{(by inductive hypothesis)} \\
&= \frac{k(k + 1) + 2(k + 1)}{2} \\
&= \frac{k^2 + 3k + 2}{2} \\
&= \frac{(k + 1)(k + 2)}{2},
\end{aligned}
\]
so the statement holds for n = k + 1.

4. Hence by induction, $1 + 2 + \cdots + n = \frac{n(n + 1)}{2}$ is true for any positive integer n.

Example: Prove that for any natural number n,
\[ 1 + 3 + 5 + \cdots + (2n - 1) = n^2. \]

Solution:

1. Prove for n = 1: $1 = 1^2 = 1$, so it is true.

2. Assume that the statement holds for n = k, that is,
\[ 1 + 3 + 5 + \cdots + (2k - 1) = k^2. \]

3. Prove for n = k + 1. We have
\[
\begin{aligned}
1 + 3 + 5 + \cdots + (2k - 1) + (2(k + 1) - 1) &= k^2 + 2k + 1 \quad \text{(by inductive hypothesis)} \\
&= (k + 1)^2.
\end{aligned}
\]
So it is true for n = k + 1.

4. Hence by induction $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ is true for all natural numbers n.

Example: Prove that $3^n > 2^n$ for all natural numbers n.

Solution:

1. Prove for n = 1: $3^1 = 3 > 2^1 = 2$, which is true.

2. Assume the statement holds for n = k, that is, $3^k > 2^k$.

3. Prove for n = k + 1.
\[
\begin{aligned}
3^{k+1} &= 3^k \cdot 3 \\
&> 2^k \cdot 3 \quad \text{(by inductive hypothesis)} \\
&> 2^k \cdot 2 \quad \text{(since } 3 > 2\text{)} \\
&= 2^{k+1},
\end{aligned}
\]
so the statement holds for n = k + 1.

4. Hence, by induction $3^n > 2^n$ for all natural numbers n.

Example: Prove that for any integer n ≥ 1, $2^{2n} - 1$ is divisible by 3.

Solution:

1. Prove for n = 1: $2^2 - 1 = 3$, which is divisible by 3, hence it is true.

2. Assume that the statement holds for n = k, that is, for k ≥ 1, $2^{2k} - 1$ is divisible by 3, i.e.,
$2^{2k} - 1 = 3l$ for some l ∈ Z.

3. Prove for n = k + 1. Since $2^{2k} = 3l + 1$ by the inductive hypothesis,
\[
\begin{aligned}
2^{2(k+1)} - 1 &= 4 \cdot 2^{2k} - 1 \\
&= 4(3l + 1) - 1 \\
&= 12l + 4 - 1 \\
&= 12l + 3 \\
&= 3(4l + 1),
\end{aligned}
\]
which is divisible by 3.

4. Hence, by induction $2^{2n} - 1$ is divisible by 3 for all n ≥ 1.
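
Before attempting an induction proof it can help to test a conjecture on many small cases. The sketch below checks the four statements proved in this section for n = 1, . . . , limit. It is a sanity check only (a finite test never replaces the inductive step), and the function name `check_up_to` is an illustrative assumption.

```python
def check_up_to(limit):
    """Check the four worked induction examples for every n from 1 to limit."""
    for n in range(1, limit + 1):
        # 1 + 2 + ... + n = n(n + 1)/2
        assert sum(range(1, n + 1)) == n * (n + 1) // 2
        # 1 + 3 + 5 + ... + (2n - 1) = n^2
        assert sum(2 * i - 1 for i in range(1, n + 1)) == n ** 2
        # 3^n > 2^n
        assert 3 ** n > 2 ** n
        # 2^(2n) - 1 is divisible by 3
        assert (2 ** (2 * n) - 1) % 3 == 0
    return True


print(check_up_to(200))  # True: no counterexample among the first 200 cases
```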

1.6 Tutorial 1
1. Express the following recurring decimals in the form p/q where p and q are integers.
(i) $2.173\dot{7}\dot{3}$ (ii) $0.\dot{3}\dot{2}\dot{4}$.

2. Express $0.mnmnmnmn\ldots = 0.\dot{m}\dot{n}$, where m and n are distinct digits, in the form p/q
where p and q are integers.

3. State, giving a reason, whether each of the following numbers is rational or irrational.
(i) 0.20200200020 . . . (ii) 537.137137137 . . . .

4. Does the decimal 0.1234567891011121314151617181920 . . . , whose digits are the natural numbers
strung end-to-end, represent a rational or an irrational number? Give a reason for your answer.

5. Show that if 0 < a < b then $a^2 < b^2$. If $a^2 < b^2$, is it necessarily true that a < b? Give an
example to illustrate your answer.

6. If a ≥ 0 and b ≥ 0, prove that $\frac{1}{2}(a + b) \ge \sqrt{ab}$.

7. Solve the following inequalities.
(i) $x^2 + x - 2 > 0$ (ii) $\frac{2x + 3}{x - 5} > 3$ (iii) $2|x| > 3x - 10$ (iv) $|x + 1| \ge 3$

8. Prove that |ab| = |a||b| for all a, b ∈ R.

9. Show that |a + b| ≤ |a| + |b| for all a, b ∈ R.

10. Prove that $|x|^2 = x^2$ for any real number x.

11. If x and y are real numbers, prove that |x| − |y| ≤ |x − y|.

12. Prove the following by induction.

(a) $n! > 2^n$ for all n ≥ 4.
(c) $n^2 \le 2^n$ for all n ≥ 5.
(d) The sum of terms in a geometric series is $\sum_{i=0}^{n} r^i = \frac{r^{n+1} - 1}{r - 1}$, if r ≠ 0, r ≠ 1, n ∈ N.
(g) $3^n > n^2$ for n > 2.
(h) $7^{2n} - 48n - 1$ is divisible by 2304.
(k) |sin nx| ≤ n|sin x| for all x ∈ R, n ∈ Z+.
(l) $1^3 + 2^3 + 3^3 + \cdots + n^3 = \frac{n^2 (n + 1)^2}{4}$.

Chapter 2

Sequences

Definition 2.0.1. A sequence is a set of numbers $u_1, u_2, u_3, \ldots$ in a definite order of arrangement
and formed according to a definite rule.

Each number in the sequence is called a term and $u_n$ is called the nth term. The sequence
$u_1, u_2, u_3, \ldots$ is written briefly as $\{u_n\}$, e.g., $\{u_n\}$ with $u_n = 2n$, where $u_1 = 2$, $u_2 = 4$, $u_3 = 6$ and so
on. The sequence is called finite or infinite according as there are or are not a finite number of
terms.

Recursion Formula or Recurrence Relations


So far we have seen that a sequence $\{U_n\}$ may be defined by giving a formula for $U_n$ in terms of
n. For example
\[ U_n = \frac{2n^2 - 5n + 4}{\sqrt{n^2 + 1}}. \]
We can also define sequences by giving a relation or formula that connects successive terms of a
sequence and specifying the value or values of the first term or the first and second terms etc. The
formula or relation linking the terms is called a recursion formula or recurrence relation.

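Example:
Find the values of the first four terms of the sequence defined by
\[ u_{n+1} = \frac{2}{u_n}, \quad u_0 = 1, \quad n \in \mathbb{N}. \]

Solution: The first term $u_0 = 1$ is given, and
\[
\begin{aligned}
u_1 &= u_{0+1} = \frac{2}{u_0} = \frac{2}{1} = 2, \\
u_2 &= u_{1+1} = \frac{2}{u_1} = \frac{2}{2} = 1, \\
u_3 &= u_{2+1} = \frac{2}{u_2} = \frac{2}{1} = 2.
\end{aligned}
\]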
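
A recurrence relation translates directly into a loop that builds each term from the previous one. The sketch below reproduces the worked example $u_{n+1} = 2/u_n$, $u_0 = 1$; the function name `recurrence_terms` is an illustrative assumption, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def recurrence_terms(u0, count):
    """Return the first `count` terms of u_{n+1} = 2 / u_n starting from u_0 = u0."""
    terms = [Fraction(u0)]
    while len(terms) < count:
        terms.append(Fraction(2) / terms[-1])  # apply the recursion formula once
    return terms


print([str(t) for t in recurrence_terms(1, 4)])  # ['1', '2', '1', '2']
```

The same loop pattern works for recurrences that use two earlier terms; one would keep the last two values instead of the last one.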

You Try It: Define recursively
$a_0 = a_1 = 1$, and $a_n = a_{n-1} + 2a_{n-2}$, n ≥ 2.
Find $a_6$ recursively.

2.1 Limits of Sequences

Let us consider the sequence $u_n = \frac{1}{n}$. The sequence has the terms $1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots$. We see that the
terms of the sequence tend to or approach 0.
Definition 2.1.1. A number L is called the limit of an infinite sequence $a_1, a_2, a_3, \ldots$ or $\{a_n\}$, if
for any positive number ε, we can find a positive number N depending on ε such that $|a_n - L| < \varepsilon$
for all integers n > N. We write $\lim_{n\to\infty} a_n = L$.

If {an } is a convergent sequence, it means that the terms an can be made arbitrarily close to L for
n sufficiently large.

Example: If $u_n = 3 + \frac{1}{n} = \frac{3n + 1}{n}$, the sequence is $4, \frac{7}{2}, \frac{10}{3}, \ldots$ and we can show that
$\lim_{n\to\infty} u_n = 3$.

If the limit of a sequence exists, the sequence is called convergent, otherwise, it is called divergent.

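Example: Prove that $\lim_{n\to\infty} \frac{1}{n} = 0$.

Proof: Let ε > 0. We must find N(ε) such that $\left|\frac{1}{n} - 0\right| = \frac{1}{n} < \varepsilon$ for all n > N. Since $\frac{1}{n} < \varepsilon$
exactly when $n > \frac{1}{\varepsilon}$, take N to be the smallest integer greater than $\frac{1}{\varepsilon}$. Then n > N implies
$\frac{1}{n} < \varepsilon$, and hence $\lim_{n\to\infty} \frac{1}{n} = 0$.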

You Try It: Prove that $\lim_{n\to\infty} \frac{1}{n^p} = 0$ if p ∈ N.

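Example: Use the definition of a limit to prove that $\lim_{n\to\infty} \frac{2n - 1}{3n + 2} = \frac{2}{3}$.

Proof: Let ε > 0. We must find N(ε) such that $\left|\frac{2n - 1}{3n + 2} - \frac{2}{3}\right| < \varepsilon$ for all n > N. Now
\[
\left|\frac{2n - 1}{3n + 2} - \frac{2}{3}\right|
= \left|\frac{3(2n - 1) - 2(3n + 2)}{3(3n + 2)}\right|
= \left|\frac{6n - 3 - 6n - 4}{3(3n + 2)}\right|
= \left|\frac{-7}{3(3n + 2)}\right|
= \frac{7}{3(3n + 2)},
\]
and
\[
\frac{7}{3(3n + 2)} < \varepsilon \iff 9n + 6 > \frac{7}{\varepsilon} \iff n > \frac{7 - 6\varepsilon}{9\varepsilon}.
\]
Taking N to be a positive integer greater than $\frac{7 - 6\varepsilon}{9\varepsilon}$, we have
$\left|\frac{2n - 1}{3n + 2} - \frac{2}{3}\right| < \varepsilon$ for all n > N, i.e., $\lim_{n\to\infty} \frac{2n - 1}{3n + 2} = \frac{2}{3}$.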
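
The ε–N argument for $\frac{2n-1}{3n+2} \to \frac{2}{3}$ can be made concrete by computing a suitable N for a given ε and checking the inequality for some n > N. The sketch below is illustrative only; the function names `n_threshold` and `error` are assumptions, and the finite check is a sanity test, not a proof.

```python
import math
from fractions import Fraction


def n_threshold(eps):
    """Return a positive integer N greater than (7 - 6*eps) / (9*eps)."""
    eps = Fraction(eps)
    bound = (7 - 6 * eps) / (9 * eps)
    return max(1, math.floor(bound) + 1)


def error(n):
    """Exact distance between the n-th term (2n - 1)/(3n + 2) and the limit 2/3."""
    return abs(Fraction(2 * n - 1, 3 * n + 2) - Fraction(2, 3))


eps = Fraction(1, 100)
N = n_threshold(eps)
print(N)                                                   # 78
print(all(error(n) < eps for n in range(N + 1, N + 500)))  # True
```

Smaller values of ε give larger thresholds N, which is exactly the dependence N(ε) in the definition.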

2.2 Theorems on Limits

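If $\lim_{n\to\infty} a_n = A$ and $\lim_{n\to\infty} b_n = B$, then

1. $\lim_{n\to\infty} (a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n = A + B$.

2. $\lim_{n\to\infty} (a_n - b_n) = \lim_{n\to\infty} a_n - \lim_{n\to\infty} b_n = A - B$.

3. $\lim_{n\to\infty} (a_n \cdot b_n) = \left(\lim_{n\to\infty} a_n\right)\left(\lim_{n\to\infty} b_n\right) = AB$.

4. $\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{\lim_{n\to\infty} a_n}{\lim_{n\to\infty} b_n} = \frac{A}{B}$ if $\lim_{n\to\infty} b_n = B \neq 0$.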

5. The limit of a convergent sequence {un } of real numbers is unique.

Proof: We must show that if $\lim_{n\to\infty} u_n = l_1$ and $\lim_{n\to\infty} u_n = l_2$, then $l_1 = l_2$. By hypothesis, given any
ε > 0, we can find N such that $|u_n - l_1| < \frac{\varepsilon}{2}$ when n > N and $|u_n - l_2| < \frac{\varepsilon}{2}$ when n > N. Then
\[
|l_1 - l_2| = |l_1 - u_n + u_n - l_2| \le |l_1 - u_n| + |u_n - l_2| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\]
i.e., $|l_1 - l_2|$ is less than any positive ε (however small) and so must be zero, i.e., $l_1 - l_2 = 0 \implies l_1 = l_2$.

Example: If $\lim_{n\to\infty} a_n = A$ and $\lim_{n\to\infty} b_n = B$, prove that $\lim_{n\to\infty} (a_n + b_n) = A + B$.

Proof: We must show that for any ε > 0, we can find N > 0, such that $|(a_n + b_n) - (A + B)| < \varepsilon$
for all n > N. We have
\[
|(a_n + b_n) - (A + B)| = |(a_n - A) + (b_n - B)| \le |a_n - A| + |b_n - B|.
\]
By hypothesis, given ε > 0 we can find $N_1$ and $N_2$ such that $|a_n - A| < \frac{\varepsilon}{2}$ for all $n > N_1$ and
$|b_n - B| < \frac{\varepsilon}{2}$ for all $n > N_2$. Then
\[
|(a_n + b_n) - (A + B)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\]
for all n > N where $N = \max(N_1, N_2)$. Hence $\lim_{n\to\infty} (a_n + b_n) = A + B$.

2.3 Sequences Tending to Infinity

n tends to infinity, n → ∞ (n grows or increases beyond any limit ). Infinity is not a number and
the sequences that tend to infinity are not convergent.

We write $\lim_{n\to\infty} a_n = \infty$, if for each positive number M, we can find a positive number N (depending
on M) such that $a_n > M$ for all n > N.

Similarly, we write $\lim_{n\to\infty} a_n = -\infty$, if for each positive number M, we can find a positive number N
such that $a_n < -M$ for all n > N.

Example: Prove that (a) $\lim_{n\to\infty} 3^{2n-1} = \infty$ (b) $\lim_{n\to\infty} (1 - 2n) = -\infty$.

Proof: (a) We must show that for each positive number M we can find a positive number N such
that $a_n > M$ for all n > N. Now $3^{2n-1} > M$ when $(2n - 1)\ln 3 > \ln M$, i.e.,
$n > \frac{1}{2}\left(\frac{\ln M}{\ln 3} + 1\right)$. Taking N to be the smallest positive integer greater than
$\frac{1}{2}\left(\frac{\ln M}{\ln 3} + 1\right)$, we have $3^{2n-1} > M$ for all n > N, so $\lim_{n\to\infty} 3^{2n-1} = \infty$.

(b) We must show that for each positive number M we can find a positive number N such that
$a_n < -M$ for all n > N. Now $1 - 2n < -M$ when $2n - 1 > M$, or $n > \frac{1}{2}(M + 1)$. Taking N to be the
smallest integer greater than $\frac{1}{2}(M + 1)$, we have $\lim_{n\to\infty} (1 - 2n) = -\infty$.

2.4 Bounded and Monotonic Sequences

A sequence that tends to a limit l is said to be convergent and the sequence converges to l. A
sequence may tend to +∞ or −∞, and is said to be divergent and it diverges to +∞ or −∞.

If $u_n \le M$ for n = 1, 2, 3, . . . , where M is a constant, we say that the sequence $\{u_n\}$ is bounded
above and M is called an upper bound. The smallest upper bound is called the least upper bound
(l.u.b).

If $u_n \ge m$, the sequence is bounded below and m is called a lower bound. The largest lower bound
is called the greatest lower bound (g.l.b).

If $m \le u_n \le M$, the sequence is called bounded, indicated by $|u_n| \le P$. (Every convergent sequence
is bounded, but the converse is not necessarily true.)

If $u_{n+1} \ge u_n$, the sequence is called monotonic increasing and if $u_{n+1} > u_n$ it is called strictly
increasing. If $u_{n+1} \le u_n$, the sequence is called monotonic decreasing, while if $u_{n+1} < u_n$ it is
strictly decreasing.

Examples: 1. The sequence 1, 1.1, 1.11, 1.111, . . . is bounded and monotonic increasing.
2. The sequence 1, −1, 1, −1, 1, . . . is bounded but not monotonic increasing or decreasing.

Definition 2.4.1. A null sequence is a sequence that converges to 0, e.g., $u_n = \frac{1}{n - 10}$, n ≥ 11.

If {un } does not tend to a limit or +∞ or −∞, we say that {un } oscillates (or is an oscillating
sequence). It can oscillate finitely (bounded) or infinitely (unbounded).

Examples: $u_n = (-1)^n$ oscillates finitely; $u_n = (-1)^n n$ oscillates infinitely.

2.5 Limits of Combination of Sequences

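We want to be able to evaluate limits, for example, of the form
$\lim_{n\to\infty} \left(2 - \frac{1}{n} + \frac{3}{n^2}\right)$ or $\lim_{n\to\infty} \frac{5 - 2n^2}{4 + 3n + 2n^2}$.

Example:
\[
\lim_{n\to\infty} \left(2 - \frac{1}{n} + \frac{3}{n^2}\right)
= \lim_{n\to\infty} 2 - \lim_{n\to\infty} \frac{1}{n} + 3 \lim_{n\to\infty} \frac{1}{n^2} = 2 - 0 + 0 = 2.
\]

Example:
\[
\lim_{n\to\infty} \frac{3n^2 - 5n}{5n^2 + 2n - 6}
= \lim_{n\to\infty} \frac{3 - \frac{5}{n}}{5 + \frac{2}{n} - \frac{6}{n^2}}
= \frac{3 - 0}{5 + 0 - 0} = \frac{3}{5}.
\]

Example:
\[
\lim_{n\to\infty} \left(\sqrt{n + 1} - \sqrt{n}\right)
= \lim_{n\to\infty} \left(\sqrt{n + 1} - \sqrt{n}\right) \cdot \frac{\sqrt{n + 1} + \sqrt{n}}{\sqrt{n + 1} + \sqrt{n}}
= \lim_{n\to\infty} \frac{1}{\sqrt{n + 1} + \sqrt{n}} = 0.
\]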

2.6 Squeeze Theorem

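If $\lim_{n\to\infty} a_n = l = \lim_{n\to\infty} b_n$ and there exists an N such that $a_n \le c_n \le b_n$ for all n > N, then
$\lim_{n\to\infty} c_n = l$.

Example: Find $\lim_{n\to\infty} \frac{\cos n}{n}$.

Solution: We know that −1 ≤ cos n ≤ 1
\[
\implies -\frac{1}{n} \le \frac{\cos n}{n} \le \frac{1}{n}
\implies -\lim_{n\to\infty} \frac{1}{n} \le \lim_{n\to\infty} \frac{\cos n}{n} \le \lim_{n\to\infty} \frac{1}{n}
\implies 0 \le \lim_{n\to\infty} \frac{\cos n}{n} \le 0
\]
\[
\implies \lim_{n\to\infty} \frac{\cos n}{n} = 0.
\]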
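
The squeeze in this example can be observed numerically: cos n / n stays between −1/n and 1/n, and both bounds shrink to 0. The snippet is a rough illustration using floating-point arithmetic; the function name `squeeze_check` is an assumption made for this sketch.

```python
import math


def squeeze_check(n):
    """True when -1/n <= cos(n)/n <= 1/n holds for this n."""
    lower, middle, upper = -1 / n, math.cos(n) / n, 1 / n
    return lower <= middle <= upper


# Print the squeezed term and the check for a few increasingly large n.
for n in (1, 10, 100, 1000, 10000):
    print(n, math.cos(n) / n, squeeze_check(n))
```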

Chapter 3

Infinite Series

One important application of infinite sequences is in representing infinite summations. If $\{a_n\}$ is
an infinite sequence, then
\[ \sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots \]
is called an infinite series (or simply a series). The numbers $a_1, a_2, a_3, \ldots$ are called the terms of
the series. To find the sum of an infinite series, consider the following sequence of partial sums.
\[
\begin{aligned}
S_1 &= a_1 \\
S_2 &= a_1 + a_2 \\
S_3 &= a_1 + a_2 + a_3 \\
&\;\;\vdots \\
S_n &= a_1 + a_2 + a_3 + \cdots + a_n.
\end{aligned}
\]

If this sequence of partial sums converges, then the series is said to converge and has the sum
indicated in the following definition.

3.1 Definition of Convergent and Divergent Series

For the infinite series $\sum a_n$, the nth partial sum is given by
\[ S_n = a_1 + a_2 + a_3 + \cdots + a_n. \]
If the sequence of partial sums $\{S_n\}$ converges to S, then the series $\sum a_n$ converges. The limit S
is called the sum of the series. If $\{S_n\}$ diverges, then the series diverges.

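Example 3.1.1. The series $\sum_{n=1}^{\infty} \frac{1}{2^n} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$ has the following partial sums.
\[
\begin{aligned}
S_1 &= \frac{1}{2} \\
S_2 &= \frac{1}{2} + \frac{1}{4} = \frac{3}{4} \\
S_3 &= \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = \frac{7}{8} \\
&\;\;\vdots \\
S_n &= \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n} = \frac{2^n - 1}{2^n}.
\end{aligned}
\]
Because $\lim_{n\to\infty} \frac{2^n - 1}{2^n} = 1$, it follows that the series converges and its sum is 1.

Example 3.1.2. The nth partial sum of the series
\[
\sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n + 1}\right)
= \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots
\]
is given by $S_n = 1 - \frac{1}{n + 1}$. Because the limit of $S_n$ is 1, the series converges and its sum is 1.

Example 3.1.3. The series $\sum_{n=1}^{\infty} 1 = 1 + 1 + 1 + \cdots$ diverges, because $S_n = n$ and the sequence of
partial sums diverges.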
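
The partial sums in Examples 3.1.1 and 3.1.2 can be generated exactly and compared with the closed forms given there. This is a small sketch using Python's `fractions` module; the helper name `partial_sums` is an illustrative assumption.

```python
from fractions import Fraction


def partial_sums(term, count):
    """Return [S_1, ..., S_count], where S_n = term(1) + ... + term(n)."""
    total = Fraction(0)
    sums = []
    for n in range(1, count + 1):
        total += term(n)
        sums.append(total)
    return sums


# Example 3.1.1: a_n = 1/2^n, with S_n = (2^n - 1)/2^n
geometric = partial_sums(lambda n: Fraction(1, 2 ** n), 10)
# Example 3.1.2: a_n = 1/n - 1/(n + 1), with S_n = 1 - 1/(n + 1)
telescoping = partial_sums(lambda n: Fraction(1, n) - Fraction(1, n + 1), 10)

print([str(s) for s in geometric[:3]])    # ['1/2', '3/4', '7/8']
print([str(s) for s in telescoping[:3]])  # ['1/2', '2/3', '3/4']
```

Both lists creep towards 1, in line with the sums found in the two examples.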

The series in Example (3.1.2) is a telescoping series. That is, it is of the form
\[ (b_1 - b_2) + (b_2 - b_3) + (b_3 - b_4) + (b_4 - b_5) + \cdots \]
Note that $b_2$ is canceled by the second term, $b_3$ is canceled by the third term and so on. Because the
nth partial sum of the series is $S_n = b_1 - b_{n+1}$, it follows that a telescoping series converges
if and only if $b_n$ approaches a finite number as n → ∞. Moreover, if the series converges, then its
sum is
\[ S = b_1 - \lim_{n\to\infty} b_{n+1}. \]

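Example 3.1.4. Find the sum of the series $\sum_{n=1}^{\infty} \frac{2}{4n^2 - 1}$.

Solution: Using partial fractions, we can write
\[
a_n = \frac{2}{4n^2 - 1} = \frac{2}{(2n - 1)(2n + 1)} = \frac{1}{2n - 1} - \frac{1}{2n + 1}.
\]
From the telescoping form, we can see that the nth partial sum is
\[
S_n = \left(\frac{1}{1} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{5}\right) + \cdots + \left(\frac{1}{2n - 1} - \frac{1}{2n + 1}\right) = 1 - \frac{1}{2n + 1}.
\]
Thus, the series converges and its sum is 1. That is,
\[
\sum_{n=1}^{\infty} \frac{2}{4n^2 - 1} = \lim_{n\to\infty} S_n = \lim_{n\to\infty} \left(1 - \frac{1}{2n + 1}\right) = 1.
\]
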
3.2 Geometric Series

A geometric series with ratio r is given by
\[ \sum_{n=0}^{\infty} a r^n = a + ar + ar^2 + \cdots + ar^n + \cdots, \quad a \neq 0. \]

Theorem 3.2.1. A geometric series with ratio r diverges if |r| ≥ 1. If 0 < |r| < 1, then the series
converges to the sum
\[ \sum_{n=0}^{\infty} a r^n = \frac{a}{1 - r}, \quad 0 < |r| < 1. \]

Example 3.2.1. The geometric series
\[
\sum_{n=0}^{\infty} \frac{3}{2^n} = \sum_{n=0}^{\infty} 3\left(\frac{1}{2}\right)^n = 3(1) + 3\left(\frac{1}{2}\right) + 3\left(\frac{1}{2}\right)^2 + \cdots
\]
has a ratio of $r = \frac{1}{2}$ with a = 3. Because 0 < |r| < 1, the series converges and its sum is
\[ S = \frac{a}{1 - r} = \frac{3}{1 - \frac{1}{2}} = 6. \]
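
Theorem 3.2.1 can be illustrated by adding up finitely many terms and comparing with a/(1 − r). The sketch below uses exact fractions; the function names are assumptions made for illustration.

```python
from fractions import Fraction


def geometric_partial_sum(a, r, n_terms):
    """Sum a * r**n for n = 0, ..., n_terms - 1 by direct addition."""
    return sum(a * r ** n for n in range(n_terms))


def geometric_limit(a, r):
    """Return a / (1 - r), the sum of the series when |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("the geometric series diverges for |r| >= 1")
    return a / (1 - r)


a, r = Fraction(3), Fraction(1, 2)
print(geometric_limit(a, r))                                 # 6
print(geometric_limit(a, r) - geometric_partial_sum(a, r, 20))  # 3/524288
```

The gap between the limit and the partial sum halves with every extra term, which is the convergence the theorem describes for r = 1/2.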

3.2.1 Properties of Infinite Series

If $\sum a_n = A$ and $\sum b_n = B$ and c is a real number, then the following series converge to the
indicated sums. (i) $\sum c\,a_n = cA$ (ii) $\sum (a_n \pm b_n) = \sum a_n \pm \sum b_n = A \pm B$.

Limit of n-th Term of a Convergent Series

If the series $\sum a_n$ converges, then the sequence $\{a_n\}$ converges to 0.

n-th Term Test for Divergence

If the sequence $\{a_n\}$ does not converge to 0, then the series $\sum a_n$ diverges.

3.3 Test for Convergence or Divergence of Series

In this and the following section, we will study several convergence tests that apply to series with
positive terms.

19
3.3.1 The Integral Test


If f is positive, continuous, and decreasing for $x \ge 1$ and $a_n = f(n)$, then
\[ \sum_{n=1}^{\infty} a_n \quad \text{and} \quad \int_1^{\infty} f(x)\,dx \]
either both converge or both diverge.

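Example 3.3.1. Apply the integral test to the series $\sum_{n=1}^{\infty} \frac{n}{n^2+1}$.

Solution: Because $f(x) = \frac{x}{x^2+1}$ satisfies the conditions for the integral test (check this), we can
integrate to obtain
\[
\begin{aligned}
\int_1^{\infty} \frac{x}{x^2+1}\,dx &= \frac{1}{2}\int_1^{\infty} \frac{2x}{x^2+1}\,dx = \frac{1}{2}\lim_{b\to\infty}\int_1^{b} \frac{2x}{x^2+1}\,dx \\
&= \frac{1}{2}\lim_{b\to\infty}\Big[\ln(x^2+1)\Big]_1^b \\
&= \frac{1}{2}\lim_{b\to\infty}\left[\ln(b^2+1) - \ln 2\right] \\
&= \infty.
\end{aligned}
\]

Thus, the series diverges.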



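Example 3.3.2. Apply the integral test to the series $\sum_{n=1}^{\infty} \frac{1}{n^2+1}$.

Solution: Because $f(x) = \frac{1}{x^2+1}$ satisfies the conditions for the integral test, we can integrate to
obtain
\[
\begin{aligned}
\int_1^{\infty} \frac{dx}{x^2+1} &= \lim_{b\to\infty}\int_1^{b} \frac{dx}{x^2+1} = \lim_{b\to\infty}\Big[\tan^{-1}x\Big]_1^b \\
&= \lim_{b\to\infty}\left(\tan^{-1}b - \tan^{-1}1\right) \\
&= \frac{\pi}{2} - \frac{\pi}{4} = \frac{\pi}{4}.
\end{aligned}
\]
Thus, the series converges.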

3.3.2 p− Series and Harmonic Series

A series of the form
\[ \sum_{n=1}^{\infty} \frac{1}{n^p} = \frac{1}{1^p} + \frac{1}{2^p} + \frac{1}{3^p} + \cdots \]
is a p-series, where p is a positive constant. For p = 1, the series
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots \]
is the harmonic series.

Theorem 3.3.1. The p-series
\[ \sum_{n=1}^{\infty} \frac{1}{n^p} = \frac{1}{1^p} + \frac{1}{2^p} + \frac{1}{3^p} + \cdots \]
(i) converges if $p > 1$ and (ii) diverges if $0 < p \le 1$.

Example 3.3.3. From the Theorem it follows that the harmonic series
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots \]
diverges.
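The contrast between p = 1 and p = 2 can be seen numerically. The sketch below is an illustration only (the helper name `p_series_partial_sum` is an assumption, and numerical evidence is not a proof): the partial sums for p = 1 keep growing, roughly like $\ln n$, while those for p = 2 settle near $\pi^2/6$, the known value of $\sum 1/n^2$.

```python
import math

def p_series_partial_sum(p, n):
    """Partial sum 1/1**p + 1/2**p + ... + 1/n**p."""
    return sum(1.0 / k**p for k in range(1, n + 1))

for n in (10, 1_000, 100_000):
    harmonic = p_series_partial_sum(1, n)   # p = 1: grows without bound
    squares = p_series_partial_sum(2, n)    # p = 2: converges
    print(n, harmonic, squares, math.pi**2 / 6)
```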

3.4 Comparisons of Series

3.4.1 Direct Comparison Test

This is a test for positive-term series. It allows you to compare a series having complicated terms
with a simpler series whose convergence or divergence is known.

Direct Comparison Test

Theorem 3.4.1. Let $0 \le a_n \le b_n$ for all n.

1. If $\sum_{n=1}^{\infty} b_n$ converges, then $\sum_{n=1}^{\infty} a_n$ converges.

2. If $\sum_{n=1}^{\infty} a_n$ diverges, then $\sum_{n=1}^{\infty} b_n$ diverges.


Example 3.4.1. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{1}{2+3^n}$.

Solution: This series resembles $\sum_{n=1}^{\infty} \frac{1}{3^n}$ (convergent geometric series). Term-by-term
comparison yields
\[ a_n = \frac{1}{2+3^n} < \frac{1}{3^n} = b_n, \qquad n \ge 1. \]
Thus, by the Direct Comparison Test, the series converges.

Example 3.4.2. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{1}{2+\sqrt{n}}$.

Solution: The series resembles $\sum_{n=1}^{\infty} \frac{1}{n^{1/2}}$ (divergent p-series). Term-by-term comparison
yields
\[ \frac{1}{2+\sqrt{n}} \le \frac{1}{\sqrt{n}}, \qquad n \ge 1, \]
which does not meet the requirements for divergence. Still expecting the series to diverge, we can
compare the given series with $\sum_{n=1}^{\infty} \frac{1}{n}$ (divergent harmonic series). In this case, term-by-term
comparison yields
\[ a_n = \frac{1}{n} \le \frac{1}{2+\sqrt{n}} = b_n, \qquad n \ge 4, \]
and, by the Direct Comparison Test, the given series diverges.

3.4.2 Limit Comparison Test or Quotient Test

Often a given series closely resembles a p−series or a geometric series, yet we cannot establish the
term-by-term comparison necessary to apply the Direct Comparison Test. We can apply a second
comparison test, called the Limit Comparison Test.

Limit Comparison Test

 
Suppose that $a_n > 0$ and $b_n > 0$ and $\lim_{n\to\infty} \frac{a_n}{b_n} = L$, where L is finite and positive. Then the two
series $\sum a_n$ and $\sum b_n$ either both converge or both diverge. If $L = 0$ and $\sum b_n$ converges, then
$\sum a_n$ converges. If $L = \infty$ and $\sum b_n$ diverges, then $\sum a_n$ diverges.

Example 3.4.3. Show that the following general harmonic series diverges.
\[ \sum_{n=1}^{\infty} \frac{1}{an+b}, \qquad a > 0,\ b > 0. \]

Solution: By comparison with $\sum_{n=1}^{\infty} \frac{1}{n}$ we have
\[ \lim_{n\to\infty} \frac{\frac{1}{an+b}}{\frac{1}{n}} = \lim_{n\to\infty} \frac{n}{an+b} = \frac{1}{a}. \]
Because this limit is finite and greater than 0, we can conclude from the Limit Comparison Test that the
given series diverges.
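The limit used in the Limit Comparison Test can be explored numerically before it is computed by hand. In the sketch below the values a = 3 and b = 7 are hypothetical sample constants chosen only for illustration, and `limit_comparison_ratio` is an assumed helper name; the printed ratios should approach $\frac{1}{a}$.

```python
def limit_comparison_ratio(a_n, b_n, n):
    """Ratio a_n(n) / b_n(n) that appears in the Limit Comparison Test."""
    return a_n(n) / b_n(n)

# Hypothetical sample constants for the series sum 1 / (a*n + b).
a, b = 3.0, 7.0
given = lambda k: 1.0 / (a * k + b)     # terms of the given series
harmonic = lambda k: 1.0 / k            # terms of the comparison series

for n in (10, 1_000, 1_000_000):
    print(n, limit_comparison_ratio(given, harmonic, n), 1.0 / a)
```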

The limit Comparison Test works well for comparing a messy algebraic series with a p−series. In
choosing an appropriate p−series, we must choose one with an nth term of the same magnitude as
the nth term of the given series.

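Given series                                      Comparison series                                                            Conclusion

$\sum_{n=1}^{\infty} \frac{1}{3n^2-4n+5}$         $\sum_{n=1}^{\infty} \frac{1}{n^2}$                                          Both series converge.
$\sum_{n=1}^{\infty} \frac{1}{\sqrt{3n-2}}$       $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$                                     Both series diverge.
$\sum_{n=1}^{\infty} \frac{n^2-10}{4n^5+n^3}$     $\sum_{n=1}^{\infty} \frac{n^2}{n^5} = \sum_{n=1}^{\infty} \frac{1}{n^3}$    Both series converge.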

3.5 Alternating Series

So far, most series we have dealt with have had positive terms. In this section, we will study series
that contain both positive and negative terms. The simplest such series is an alternating series,
whose terms alternate in sign. For example, the geometric series
\[ \sum_{n=0}^{\infty} \left(-\frac{1}{2}\right)^n = \sum_{n=0}^{\infty} (-1)^n \frac{1}{2^n} = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + \frac{1}{16} - \cdots \]
is an alternating geometric series with $r = -\frac{1}{2}$. Alternating series occur in two ways: either the odd
terms are negative or the even terms are negative.

Alternating Series Test


Let $a_n > 0$. The alternating series $\sum_{n=1}^{\infty} (-1)^n a_n$ and $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ converge if the following two
conditions are met.

1. $a_{n+1} \le a_n$ for all n.

2. $\lim_{n\to\infty} a_n = 0$.

Example 3.5.1. Determine the convergence or divergence of $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n}$.

Solution: Because $\frac{1}{n+1} \le \frac{1}{n}$ for all n and the limit (as $n \to \infty$) of $\frac{1}{n}$ is 0, we can apply the
Alternating Series Test to conclude that the series converges. (This series is called the alternating
harmonic series.)

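Example 3.5.2. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{n}{(-2)^{n-1}}$.

Solution: To apply the Alternating Series Test, note that, for $n \ge 1$,
\[
\begin{aligned}
\frac{1}{2} &\le \frac{n}{n+1} \\
\frac{2^{n-1}}{2^n} &\le \frac{n}{n+1} \\
(n+1)2^{n-1} &\le n2^n \\
\frac{n+1}{2^n} &\le \frac{n}{2^{n-1}}.
\end{aligned}
\]
Hence, $a_{n+1} = \frac{n+1}{2^n} \le \frac{n}{2^{n-1}} = a_n$ for all n. Furthermore, by L'Hopital's rule,
\[ \lim_{x\to\infty} \frac{x}{2^{x-1}} = \lim_{x\to\infty} \frac{1}{2^{x-1}(\ln 2)} = 0 \implies \lim_{n\to\infty} \frac{n}{2^{n-1}} = 0. \]

Therefore, by the Alternating Series Test, the given series converges.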
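A short numerical experiment makes the convergence in Example 3.5.2 visible. This is a sketch under stated assumptions: the function name `alternating_partial_sum` is hypothetical, and the comparison value $\frac{4}{9}$ is not derived in these notes; it comes from the standard identity $\sum_{n\ge1} n x^{n-1} = \frac{1}{(1-x)^2}$ for $|x| < 1$, evaluated at $x = -\frac{1}{2}$.

```python
def alternating_partial_sum(n_terms):
    """Partial sum of sum_{n>=1} n / (-2)**(n - 1)."""
    return sum(n / (-2.0) ** (n - 1) for n in range(1, n_terms + 1))

for n_terms in (1, 2, 5, 10, 40):
    # The partial sums oscillate around the limit and settle quickly.
    print(n_terms, alternating_partial_sum(n_terms), 4 / 9)
```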

Cases for which the Alternating Series Test Fails

Example 3.5.3. The alternating series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}(n+1)}{n} = \frac{2}{1} - \frac{3}{2} + \frac{4}{3} - \frac{5}{4} + \frac{6}{5} - \cdots \]
passes the first condition in the Alternating Series Test because $a_{n+1} \le a_n$ for all n. We cannot apply
the Alternating Series Test, because the series does not pass the second condition: here
$a_n = \frac{n+1}{n} \to 1 \neq 0$.

The alternating series
\[ \frac{2}{1} - \frac{1}{1} + \frac{2}{2} - \frac{1}{2} + \frac{2}{3} - \frac{1}{3} + \frac{1}{2} - \frac{1}{4} + \cdots \]
passes the second condition because $a_n$ approaches 0 as $n \to \infty$. We cannot apply the Alternating
Series Test, however, because the series does not pass the first condition.

3.6 Absolute and Conditional Convergence

Occasionally, a series may have both positive and negative terms and not be an alternating series.
For example, the series
\[ \sum_{n=1}^{\infty} \frac{\sin n}{n^2} = \frac{\sin 1}{1} + \frac{\sin 2}{4} + \frac{\sin 3}{9} + \cdots \]
has both positive and negative terms, yet it is not an alternating series. One way to obtain some
information about the convergence of this series is to investigate the convergence of the series
$\sum_{n=1}^{\infty} \left|\frac{\sin n}{n^2}\right|$. By direct comparison, we have $|\sin n| \le 1$ for all n, so
\[ \left|\frac{\sin n}{n^2}\right| \le \frac{1}{n^2}, \qquad n \ge 1. \]
Thus, by the Direct Comparison Test, the series $\sum_{n=1}^{\infty} \left|\frac{\sin n}{n^2}\right|$ converges. But the question still is “Does
the original series converge?”
Theorem 3.6.1 (Absolute Convergence). If the series $\sum |a_n|$ converges, then the series $\sum a_n$
also converges.

The converse of the Theorem is not true. For example, the alternating harmonic series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
converges by the Alternating Series Test. Yet the harmonic series diverges. This type of convergence
is called conditional.

Definition of Absolute and Conditional Convergence


1. $\sum a_n$ is absolutely convergent if $\sum |a_n|$ converges.

2. $\sum a_n$ is conditionally convergent if $\sum a_n$ converges but $\sum |a_n|$ diverges.

3. An absolutely convergent series converges.

Example 3.6.1. Determine whether the following series are convergent or divergent. Classify any
convergent series as absolutely or conditionally convergent.
(a) $\sum_{n=1}^{\infty} \frac{(-1)^{n(n+1)/2}}{3^n} = -\frac{1}{3} - \frac{1}{9} + \frac{1}{27} + \frac{1}{81} - \cdots$.

Solution: This is not an alternating series. However, because
\[ \sum_{n=1}^{\infty} \left|\frac{(-1)^{n(n+1)/2}}{3^n}\right| = \sum_{n=1}^{\infty} \frac{1}{3^n} \]
is a convergent geometric series, the given series is absolutely convergent, hence convergent.


(b) $\sum_{n=1}^{\infty} \frac{(-1)^n}{\ln(n+1)} = -\frac{1}{\ln 2} + \frac{1}{\ln 3} - \frac{1}{\ln 4} + \frac{1}{\ln 5} - \cdots$.

Solution: In this case, the Alternating Series Test indicates that the given series converges. However,
the series
\[ \sum_{n=1}^{\infty} \left|\frac{(-1)^n}{\ln(n+1)}\right| = \frac{1}{\ln 2} + \frac{1}{\ln 3} + \frac{1}{\ln 4} + \cdots \]
diverges by direct comparison with terms of the harmonic series. Therefore, the given series is
conditionally convergent.


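(c) $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}} = -\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} - \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{4}} - \cdots$.

Solution: The given series converges by the Alternating Series Test. Moreover, because the
p-series
\[ \sum_{n=1}^{\infty} \left|\frac{(-1)^n}{\sqrt{n}}\right| = \frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{4}} + \cdots \]
diverges, the given series is conditionally convergent.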

3.7 The Ratio and Root Tests

3.7.1 The Ratio Test

This is a test for absolute convergence.

Ratio Test

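1. $\sum a_n$ converges absolutely if $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| < 1$.

2. $\sum a_n$ diverges if $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| > 1$ or $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \infty$.

3. The Ratio Test is inconclusive if $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = 1$.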

Although the Ratio Test is not a cure for all ills related to tests for convergence, it is particularly
useful for series that converge rapidly. Series involving factorials or exponentials are frequently of
this type.

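Example 3.7.1. Determine the convergence or divergence of $\sum_{n=0}^{\infty} \frac{2^n}{n!}$.

Solution: Because $a_n = \frac{2^n}{n!}$, we can write the following
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| &= \lim_{n\to\infty} \left[\frac{2^{n+1}}{(n+1)!} \div \frac{2^n}{n!}\right] \\
&= \lim_{n\to\infty} \left[\frac{2^{n+1}}{(n+1)!} \cdot \frac{n!}{2^n}\right] \\
&= \lim_{n\to\infty} \frac{2}{n+1} \\
&= 0.
\end{aligned}
\]
Therefore, the series converges.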
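Example 3.7.2. Determine whether the following series converge or diverge.

(a) $\sum_{n=0}^{\infty} \frac{n^2 2^{n+1}}{3^n}$ (b) $\sum_{n=1}^{\infty} \frac{n^n}{n!}$.

Solution:

(a) This series converges because the limit of $\left|\frac{a_{n+1}}{a_n}\right|$ is less than 1.
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| &= \lim_{n\to\infty} \left[(n+1)^2 \left(\frac{2^{n+2}}{3^{n+1}}\right)\left(\frac{3^n}{n^2 2^{n+1}}\right)\right] \\
&= \lim_{n\to\infty} \frac{2(n+1)^2}{3n^2} \\
&= \frac{2}{3} < 1.
\end{aligned}
\]

(b) This series diverges because the limit of $\left|\frac{a_{n+1}}{a_n}\right|$ is greater than 1.
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| &= \lim_{n\to\infty} \left[\frac{(n+1)^{n+1}}{(n+1)!}\left(\frac{n!}{n^n}\right)\right] \\
&= \lim_{n\to\infty} \left[\frac{(n+1)^{n+1}}{(n+1)}\left(\frac{1}{n^n}\right)\right] \\
&= \lim_{n\to\infty} \frac{(n+1)^n}{n^n} = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n \\
&= e > 1.
\end{aligned}
\]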
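The ratio in part (b) can be tabulated to see it approach e. The sketch below is illustrative only; the names `ratio_simplified` and `ratio_from_terms` are assumptions, and the direct computation is restricted to moderate n because the individual terms $n^n/n!$ grow very quickly.

```python
import math

def ratio_simplified(n):
    """|a_{n+1} / a_n| for a_n = n**n / n!, after simplification: (1 + 1/n)**n."""
    return (1.0 + 1.0 / n) ** n

def ratio_from_terms(n):
    """The same ratio computed directly from the two terms (moderate n only)."""
    a_n = n**n / math.factorial(n)
    a_next = (n + 1) ** (n + 1) / math.factorial(n + 1)
    return a_next / a_n

for n in (1, 5, 20, 100):
    print(n, ratio_from_terms(n), ratio_simplified(n), math.e)
```

Both columns agree and increase towards e, which is greater than 1, consistent with the divergence found above.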

3.7.2 The Root Test

This test of convergence or divergence of series works especially well for series involving nth powers.

Root Test

Let $\sum a_n$ be a series with non-zero terms.

1. $\sum a_n$ converges absolutely if $\lim_{n\to\infty} \sqrt[n]{|a_n|} < 1$.

2. $\sum a_n$ diverges if $\lim_{n\to\infty} \sqrt[n]{|a_n|} > 1$ or $\lim_{n\to\infty} \sqrt[n]{|a_n|} = \infty$.

3. The Root Test is inconclusive if $\lim_{n\to\infty} \sqrt[n]{|a_n|} = 1$.


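Example 3.7.3. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{e^{2n}}{n^n}$.

Solution: We can apply the Root Test as follows
\[
\begin{aligned}
\lim_{n\to\infty} \sqrt[n]{|a_n|} &= \lim_{n\to\infty} \sqrt[n]{\frac{e^{2n}}{n^n}} \\
&= \lim_{n\to\infty} \frac{e^{2n/n}}{n^{n/n}} \\
&= \lim_{n\to\infty} \frac{e^2}{n} \\
&= 0 < 1.
\end{aligned}
\]

Because this limit is less than 1, we can conclude that the series converges absolutely.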

3.8 Power Series

3.8.1 Definition of Power Series

If x is a variable, then an infinite series of the form
\[ \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n + \cdots \]
is called a power series. More generally, a series of the form
\[ \sum_{n=0}^{\infty} a_n (x-c)^n = a_0 + a_1 (x-c) + a_2 (x-c)^2 + \cdots + a_n (x-c)^n + \cdots \]
is called a power series centered at c, where c is a constant.

Example 3.8.1. (a) The following power series is centered at 0.
\[ \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots \]

(b) The following power series is centered at −1.
\[ \sum_{n=0}^{\infty} (-1)^n (x+1)^n = 1 - (x+1) + (x+1)^2 - (x+1)^3 + \cdots \]

(c) The following power series is centered at 1.
\[ \sum_{n=1}^{\infty} \frac{1}{n}(x-1)^n = (x-1) + \frac{1}{2}(x-1)^2 + \frac{1}{3}(x-1)^3 + \cdots \]

3.8.2 Radius and Interval of Convergence

A power series in x can be viewed as a function of x,
\[ f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n, \]
where the domain of f is the set of all x for which the power series converges.

Convergence of a Power Series

For a power series centered at c, precisely one of the following is true.

1. The series converges only at c.

2. There exists a real number R > 0 such that the series converges absolutely for |x − c| < R,
and diverges for |x − c| > R.

3. The series converges absolutely for all x.

The number R is the radius of convergence of the power series. If the series converges only at
c, then the radius of convergence is R = 0, and if the series converges for all x, then the radius
of convergence is $R = \infty$. The set of all values of x for which the power series converges is the
interval of convergence of the power series.

Example 3.8.2. Find the radius of convergence of $\sum_{n=0}^{\infty} n!x^n$.

Solution: For x = 0, we obtain
\[ f(0) = \sum_{n=0}^{\infty} n!0^n = 1 + 0 + 0 + \cdots = 1. \]

For any fixed value of x such that $|x| > 0$, let $u_n = n!x^n$. Then
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{u_{n+1}}{u_n}\right| &= \lim_{n\to\infty} \left|\frac{(n+1)!x^{n+1}}{n!x^n}\right| \\
&= |x| \lim_{n\to\infty} (n+1) \\
&= \infty.
\end{aligned}
\]

Therefore, by the Ratio Test, the series diverges for $|x| > 0$, and converges only at its center, 0.
Hence, the radius of convergence is R = 0.

Example 3.8.3. Find the radius of convergence of $\sum_{n=0}^{\infty} 3(x-2)^n$.

Solution: For $x \neq 2$, let $u_n = 3(x-2)^n$. Then
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{u_{n+1}}{u_n}\right| &= \lim_{n\to\infty} \left|\frac{3(x-2)^{n+1}}{3(x-2)^n}\right| \\
&= \lim_{n\to\infty} |x-2| \\
&= |x-2|.
\end{aligned}
\]

By the Ratio Test, the series converges if $|x-2| < 1$ and diverges if $|x-2| > 1$. Therefore, the
radius of convergence of the series is R = 1.

Finding the Interval of Convergence


Example 3.8.4. Find the interval of convergence of $\sum_{n=1}^{\infty} \frac{x^n}{n}$.

Solution: Letting $u_n = \frac{x^n}{n}$ produces
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{u_{n+1}}{u_n}\right| &= \lim_{n\to\infty} \left|\frac{x^{n+1}/(n+1)}{x^n/n}\right| \\
&= \lim_{n\to\infty} \left|\frac{nx}{n+1}\right| \\
&= |x|.
\end{aligned}
\]
Therefore, by the Ratio Test, the radius of convergence is R = 1. Moreover, because the series is
centered at 0, it converges in the interval (−1, 1). This interval, however, is not necessarily the
interval of convergence. To determine this, we must test for convergence at each endpoint. When
x = 1, we obtain the divergent harmonic series
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots \]

When x = −1, we obtain the convergent alternating harmonic series
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n} = -1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \cdots \]

Therefore, the interval of convergence for the series is [−1, 1).



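Example 3.8.5. Find the interval of convergence of $\sum_{n=0}^{\infty} \frac{(-1)^n (x+1)^n}{2^n}$.

Solution: Letting $u_n = \frac{(-1)^n (x+1)^n}{2^n}$ produces
\[
\begin{aligned}
\lim_{n\to\infty} \left|\frac{u_{n+1}}{u_n}\right| &= \lim_{n\to\infty} \left|\frac{(-1)^{n+1}(x+1)^{n+1}/2^{n+1}}{(-1)^n (x+1)^n/2^n}\right| \\
&= \lim_{n\to\infty} \left|\frac{2^n (x+1)}{2^{n+1}}\right| \\
&= \left|\frac{x+1}{2}\right|.
\end{aligned}
\]

By the Ratio Test, the series converges if $\left|\frac{x+1}{2}\right| < 1$ or $|x+1| < 2$. Hence, the radius of convergence
is R = 2. Because the series is centered at x = −1, it will converge in the interval (−3, 1).
Furthermore, at the endpoints we have
\[ \sum_{n=0}^{\infty} \frac{(-1)^n (-2)^n}{2^n} = \sum_{n=0}^{\infty} \frac{2^n}{2^n} = \sum_{n=0}^{\infty} 1 \qquad \text{(diverges when } x = -3\text{)} \]
and
\[ \sum_{n=0}^{\infty} \frac{(-1)^n (2)^n}{2^n} = \sum_{n=0}^{\infty} (-1)^n \qquad \text{(diverges when } x = 1\text{)}, \]
both of which diverge. Thus, the interval of convergence is (−3, 1).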

Example 3.8.6. Find the interval of convergence of $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$.

Solution: Letting $u_n = \frac{x^n}{n^2}$ produces
\[ \lim_{n\to\infty} \left|\frac{u_{n+1}}{u_n}\right| = \lim_{n\to\infty} \left|\frac{x^{n+1}/(n+1)^2}{x^n/n^2}\right| = \lim_{n\to\infty} \left|\frac{n^2 x}{(n+1)^2}\right| = |x|. \]

Thus, the radius of convergence is R = 1. Because the series is centered at x = 0, it converges in
the interval (−1, 1). When x = 1, we obtain the convergent p-series
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \qquad \text{(converges when } x = 1\text{)} \]

When x = −1, we obtain the convergent alternating series
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} = -1 + \frac{1}{2^2} - \frac{1}{3^2} + \frac{1}{4^2} - \cdots \qquad \text{(converges when } x = -1\text{)} \]

Therefore, the interval of convergence for the given series is [−1, 1].

Example 3.8.7. Find the interval of convergence of $\sum_{n=1}^{\infty} nx^n$.

Solution: The series is a power series with $a_n = n$ and c = 0. Let $u_n = nx^n$, so $u_{n+1} = (n+1)x^{n+1}$.
Then
\[ \left|\frac{u_{n+1}}{u_n}\right| = \frac{(n+1)|x|^{n+1}}{n|x|^n} = |x|\,\frac{n+1}{n} \to |x| \quad \text{as } n \to \infty. \]

The limit is less than one whenever $|x| < 1$. The Ratio Test then shows that $\sum_{n=1}^{\infty} u_n$ is convergent for
$|x| < 1$, and the series diverges for $|x| > 1$. This means the radius of convergence is R = 1. We
know that the series is convergent for −1 < x < 1. We need to check convergence at the endpoints
of this interval. When x = 1, we have $\sum_{n=1}^{\infty} n$. Because the terms of this series do not approach zero
as $n \to \infty$, the series must diverge. Similarly, the series is divergent when x = −1. The interval of
convergence is (−1, 1).

Example 3.8.8. Find the interval of convergence of $\sum_{n=1}^{\infty} \frac{(-1)^n (x-2)^n}{n4^n}$.

Solution: Let $u_n = \frac{|x-2|^n}{n4^n}$, so $u_{n+1} = \frac{|x-2|^{n+1}}{(n+1)4^{n+1}}$. Then
\[ \frac{u_{n+1}}{u_n} = \frac{|x-2|^{n+1}}{(n+1)4^{n+1}} \cdot \frac{n4^n}{|x-2|^n} = \frac{n|x-2|}{(n+1)4} \to \frac{|x-2|}{4} \quad \text{as } n \to \infty. \]

The Ratio Test gives convergence for $\frac{|x-2|}{4} < 1$ and divergence for $\frac{|x-2|}{4} > 1$. Solving the first
inequality, we have
\[ |x-2| < 4 \implies -4 < x-2 < 4 \implies -2 < x < 6. \]
When x = −2, the series is
\[ \sum_{n=1}^{\infty} \frac{(-1)^n (-2-2)^n}{n4^n} = \sum_{n=1}^{\infty} \frac{1}{n}, \]
which is a divergent p-series. When x = 6, we have
\[ \sum_{n=1}^{\infty} \frac{(-1)^n (6-2)^n}{n4^n} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n}. \]

The Alternating Series Test shows that the series is convergent. The interval of convergence is
$-2 < x \le 6$.

Example 3.8.9. Find the interval of convergence of the series $\sum_{n=1}^{\infty} \frac{x^n}{n3^n}$.

Solution: With $u_n = \frac{x^n}{n3^n}$, we find that
\[ \lim_{n\to\infty} \left|\frac{u_{n+1}}{u_n}\right| = \lim_{n\to\infty} \left|\frac{x^{n+1}/\left((n+1)3^{n+1}\right)}{x^n/\left(n3^n\right)}\right| = \lim_{n\to\infty} \frac{n|x|}{3(n+1)} = \frac{|x|}{3}. \]

Now $\frac{|x|}{3} < 1$ provided $|x| < 3$, so the Ratio Test implies that the given series converges absolutely if
$|x| < 3$ and diverges if $|x| > 3$. When x = 3, we have the divergent harmonic series $\sum \frac{1}{n}$ and when
x = −3, we have the convergent alternating series $\sum \frac{(-1)^n}{n}$. Thus the interval of convergence of
the given power series is [−3, 3).

Example 3.8.10. Find the interval of convergence of $\sum_{n=0}^{\infty} \frac{2^n x^n}{n!}$.

Solution: With $u_n = \frac{2^n x^n}{n!}$, we find that
\[ \lim_{n\to\infty} \left|\frac{2^{n+1}x^{n+1}/(n+1)!}{2^n x^n/n!}\right| = \lim_{n\to\infty} \frac{2|x|}{n+1} = 0 \]
for all x. Hence the Ratio Test implies that the power series converges for all x, and its interval of
convergence is $(-\infty, \infty)$.
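The Ratio Test computations in Examples 3.8.4, 3.8.7 and 3.8.9 all reduce to the limit of $|a_n / a_{n+1}|$ for the coefficients $a_n$ of $\sum a_n x^n$. The sketch below estimates that limit numerically. It assumes the limit exists, the helper name `radius_estimate` is hypothetical, and exact fractions are used to avoid overflow in $3^n$. It says nothing about the endpoints, which must still be tested separately as in the examples.

```python
from fractions import Fraction

def radius_estimate(coeff, n):
    """Estimate R by |a_n / a_{n+1}| for the power series sum a_n x**n."""
    return abs(coeff(n) / coeff(n + 1))

examples = {
    "sum x^n / n": lambda n: Fraction(1, n),
    "sum n x^n": lambda n: Fraction(n),
    "sum x^n / (n 3^n)": lambda n: Fraction(1, n * 3**n),
}
for name, coeff in examples.items():
    estimates = [float(radius_estimate(coeff, n)) for n in (10, 100, 1000)]
    print(name, estimates)
```

The estimates tend to 1, 1 and 3 respectively, matching the radii found above.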

3.9 Tutorial 2
1. Write the first five terms of each of the following sequences.
(i) $\left\{\frac{2n-1}{3n+2}\right\}$ (ii) $\left\{\frac{1-(-1)^n}{n^3}\right\}$ (iii) $\left\{\frac{(-1)^{n-1}}{2\cdot4\cdot6\cdots2n}\right\}$ (iv) $\left\{\frac{(-1)^{n-1}x^{2n-1}}{(2n-1)!}\right\}$

2. Determine the general term of each sequence.
(i) $\frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \frac{5}{6}, \ldots$
(iii) $\frac{1}{5^3}, \frac{3}{5^5}, \frac{5}{5^7}, \frac{9}{5^{11}}, \ldots$

3. (i) Recursively define $a_0 = 0$, $a_1 = 1$, $a_2 = 2$ and $a_n = a_{n-1} - a_{n-2} + a_{n-3}$ for $n \ge 3$. List the
first five terms.
(ii) Recursively define $s_0 = 1$, $s_1 = -3$ and $s_n = 6s_{n-1} - 9s_{n-2}$ for $n \ge 2$, find $s_5$.

4. Using the definition of a limit, show that each of the following sequences cannot have the limit
shown:
(i) $u_n = \frac{2n-1}{3n+4}$, $\frac{1}{2}$ (ii) $u_n = \frac{n+1}{7n-4}$, $\frac{1}{6}$ (iii) $u_n = \frac{n}{n^2+1}$, 1.

5. Use the definition of a limit to verify each of the following limits.
(i) $\lim_{n\to\infty} \frac{2n-1}{3n+2} = \frac{2}{3}$ (ii) $\lim_{n\to\infty} \frac{4-2n}{3n+2} = -\frac{2}{3}$ (iii) $\lim_{n\to\infty} \frac{\sin n}{n} = 0$

6. Find $\lim_{n\to\infty} \frac{a^n - b^n}{a^n + b^n}$, where $a > 0$ and $b > 0$ for the three cases:
(i) a > b (ii) a < b (iii) a = b.

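7. Use the properties of limits to evaluate each of the following limits.
(i) $\lim_{n\to\infty} \frac{4-2n-3n^2}{2n^2+n}$ (ii) $\lim_{n\to\infty} \frac{\sqrt{3n^2-5n+4}}{2n-7}$ (iii) $\lim_{n\to\infty} \left(\sqrt{n^2+n} - n\right)$
(iv) $\lim_{n\to\infty} \left(\frac{n(n+2)}{n+1} - \frac{n^3}{n^2+1}\right)$ (v) $\lim_{n\to\infty} \left(\sqrt{n+1} - n\right)$ (vi) $\lim_{n\to\infty} \left(\frac{2n-3}{3n+7}\right)^4$
(vii) $\lim_{n\to\infty} \left(\sqrt{4n^2+n+5} - 2n\right)$ (viii) $\lim_{n\to\infty} \sqrt[3]{\frac{(3-\sqrt{n})(\sqrt{n}+2)}{8n-4}}$.

8. Show that if $a_n \to l$ as $n \to \infty$, then $a_{n+1} = \frac{2}{3}a_n + \frac{2}{3a_n^2}$ converges to $\sqrt[3]{2}$.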

(Series)

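1. Find the sum of the convergent series.
(i) $\sum_{n=0}^{\infty} \left(\frac{1}{2}\right)^n$ (ii) $\sum_{n=0}^{\infty} 2\left(\frac{2}{3}\right)^n$ (iii) $\sum_{n=0}^{\infty} \left(-\frac{1}{2}\right)^n$

2. Determine whether or not the series converges and find its sum if it converges
(i) $\sum_{r=1}^{\infty} \frac{1}{(3r-1)(3r+2)}$ (ii) $\sum_{r=1}^{\infty} \frac{1}{(5r-2)(5r+3)}$ (iii) $\sum_{r=1}^{\infty} \frac{20}{(7r-3)(7r+4)}$.

3. Use the integral test to determine the convergence or divergence of the series.
(i) $\sum_{n=1}^{\infty} \frac{1}{n+1}$ (ii) $\sum_{n=1}^{\infty} ne^{-n}$ (iii) $\sum_{n=1}^{\infty} \frac{1}{4n+1}$ (iv) $\sum_{n=1}^{\infty} \frac{1}{n^3}$ (v) $\sum_{n=1}^{\infty} \frac{1}{n^{1/3}}$.

4. Use the Direct Comparison Test to determine the convergence or divergence of the series.
(i) $\sum_{n=1}^{\infty} \frac{1}{n^2+1}$ (ii) $\sum_{n=2}^{\infty} \frac{1}{n-1}$ (iii) $\sum_{n=0}^{\infty} \frac{1}{n!}$

5. Use the Limit Comparison Test to determine the convergence or divergence of the series.
(i) $\sum_{n=1}^{\infty} \frac{n}{n^2+1}$ (ii) $\sum_{n=1}^{\infty} \frac{2n^2-1}{3n^5+2n+1}$ (iii) $\sum_{n=1}^{\infty} \frac{1}{n\sqrt{n^2+1}}$ (iv) $\sum_{n=1}^{\infty} \frac{n+3}{n(n+2)}$ (v) $\sum_{n=1}^{\infty} \frac{n}{(n+1)2^{n-1}}$.

6. Use the Alternating Series Test to determine the convergence or divergence of the series.
(i) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$ (ii) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}n}{2n-1}$ (iii) $\sum_{n=2}^{\infty} \frac{(-1)^n}{\ln n}$

7. Determine whether the series converges conditionally or absolutely, or diverges.
(i) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{(n+1)^2}$ (ii) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n\sqrt{n}}$ (iii) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}(2n+3)}{n+10}$ (iv) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^{1.5}}$

8. Use the Ratio Test to test for convergence or divergence of the series.
(i) $\sum_{n=0}^{\infty} \frac{n!}{3^n}$ (ii) $\sum_{n=1}^{\infty} \frac{n^2}{2^n}$ (iii) $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}\left(\frac{3}{2}\right)^n}{n^2}$

9. Use the Root Test to test for convergence or divergence of the series.
(i) $\sum_{n=1}^{\infty} \left(\frac{n}{2n+1}\right)^n$ (ii) $\sum_{n=2}^{\infty} \frac{(-1)^n}{(\ln n)^n}$ (iii) $\sum_{n=0}^{\infty} e^{-n}$

10. Find the radius of convergence of the power series.
(i) $\sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n+1}$ (ii) $\sum_{n=0}^{\infty} \frac{(2x)^n}{n!}$ (iii) $\sum_{n=0}^{\infty} \frac{(-1)^n x^n}{2^n}$

11. Find the interval of convergence of the power series.
(i) $\sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n}$ (ii) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}(x-5)^n}{n5^n}$ (iii) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}x^{2n-1}}{2n-1}$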

Chapter 4

Functions

4.1 What is a Function?

Definition 4.1.1. A function f from a set X to a set Y is a rule that assigns to each element x
in X a unique element y in Y .

The set X is called the domain of the function f and the range is the set of all elements of Y
assigned to an element of X. The element of Y assigned to an element x of X is called the image
of x under f and is denoted by f (x). We write f : X → Y for saying f is a function from X to Y .

In this course, both X and Y are sets of real numbers. Thus, the functions are called real functions.
We usually specify a function f by giving the expression for f (x). Below are a few examples of
functions:

f (x) = 5x4 + 9, g(t) = 1 − t3 , h(s) = 9s2 + 2.

Note that in the above examples, the letters f, g, h are used to denote functions whereas the let-
ters x, t, s are used to denote the variables. A variable is an arbitrary element of a set. In the
above examples, the letters x, t, s denote the independent variables and f (x), g(t), h(s) denote
the dependent variables since their values depend on the values of x, t, s respectively.

The domain of a function f is the largest set of real numbers for which the rule makes sense.

Example: Let $f(x) = \frac{1}{x}$. We cannot compute f(0), since $\frac{1}{0}$ is not defined. Hence the domain of
$f(x) = \frac{1}{x}$ is the set of all real numbers except 0.

Function                 Domain x ∈ X                      Range y ∈ Y

$y = x^2$                $(-\infty, \infty)$               $[0, \infty)$
$y = \frac{1}{x}$        $(-\infty, 0) \cup (0, \infty)$   $(-\infty, 0) \cup (0, \infty)$
$y = \sqrt{x}$           $[0, \infty)$                     $[0, \infty)$
$y = \sqrt{1 - x^2}$     $[-1, 1]$                         $[0, 1]$

Table 4.1: Examples of functions

You Try It: Let
\[ g(x) = \frac{x}{x^2 + 4x + 3}. \]
What are the domain and range of this function?

4.2 Graphs of Functions

It is useful to draw pictures which represent functions. These pictures, or graphs, are a device
for helping us to think about functions. We graph functions in the x − y plane. The elements of
the domain of the function are thought of as points of the x−axis. The values of a function are
measured on the y-axis. The graph of f associates to each x the unique y value that the function f assigns
to x. The graph of a function f is the set of points $\{(x, y) \mid y = f(x),\ x \text{ in the domain of } f\}$ in the
Cartesian plane.

As a consequence, a function is characterized geometrically by the fact that any vertical line inter-
secting its graph does so in exactly one point.

4.3 Monotone and Bounded Functions

A real function f is increasing (strictly increasing) on an interval I if for all points x1 and x2
in I with x1 < x2 , f (x1 ) ≤ f (x2 ) (f (x1 ) < f (x2 )).

A real function f is decreasing (strictly decreasing) on an interval I if for all points x1 and x2
in I with x1 < x2 , f (x1 ) ≥ f (x2 ) (f (x1 ) > f (x2 )).

A real function f is monotone on interval I if f is either increasing or decreasing.

Example: Consider the function f (x) = (2x − 1)(x + 5). We observe that f is increasing on
the interval (−9/4, ∞) and is decreasing on the interval (−∞, −9/4).

Bounded Functions

A function f is bounded above if there is a real number M such that f (x) ≤ M for all points x
in its domain. The number M is then called an upper bound of f .

A function f is bounded below if there is a real number m such that f (x) ≥ m for all points x
in its domain. The number m is then called a lower bound of f .

A function f is bounded if f is bounded above and below, that is, there exist real numbers
M and m such that m ≤ f (x) ≤ M for all points x in its domain.

Examples: f (x) = x + 3 is bounded in −1 ≤ x ≤ 1. An upper bound is 4 (or any number greater


than 4). A lower bound is 2 (or any number less than 2).

4.4 Types of Functions

4.4.1 Elementary Functions

Polynomial Function

A polynomial function has the form $f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n$, where $a_0, a_1, \ldots, a_n$ are
constants and n is a positive integer called the degree of the polynomial, provided $a_0 \neq 0$.

Example: $x^5 + 10x^3 - 2x + 1$ is a polynomial of degree 5.

Rational Functions

A rational function is a function $f(x) = \frac{P(x)}{Q(x)}$, where P(x) and Q(x) are polynomial functions.

Example: $f(x) = \frac{x^3 + x + 5}{x^2 - 3x - 4}$ is a rational function. Since $x^2 - 3x - 4 = (x+1)(x-4) = 0$ for x = −1 and
x = 4, the domain of f is the set of all real numbers except −1 and 4.

Power Function

$f(x) = kx^n$, n a real number and k a constant.

Examples: $y = \frac{1}{x}$, $y = x^{1/2}$, $y = x^{2/3}$.

Piecewise Defined Functions

A function need not be defined by a single formula. A piecewise defined function is a function
described by using different formulas on different parts of its domain.

Examples:
\[
\text{(a)}\ f(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ x + 2, & x > 0 \end{cases}
\qquad
\text{(b)}\ f(x) = \begin{cases} -x, & x < 0 \\ x^2, & 0 \le x \le 1 \\ 1, & x > 1. \end{cases}
\]

Transcendental Functions

The following are sometimes called elementary transcendental functions.

1. Exponential function, $f(x) = a^x$, $a > 0$, $a \neq 1$.

2. Logarithmic function, $f(x) = \log_a x$, $a > 0$, $a \neq 1$.

3. Trigonometric functions (also called circular functions because of their geometric interpretation
with respect to the unit circle), e.g., $\sin x$, $\cos x$, $\tan x = \frac{\sin x}{\cos x}$, $\csc x$, $\cot x$, $\sec x$.

4. Inverse trigonometric functions, e.g., $y = \sin^{-1} x$, $y = \cos^{-1} x$.

5. Hyperbolic Functions, e.g., $\sinh x$, $\cosh x$, $\tanh x$, $\coth x$.

Even and Odd Functions

Let f(x) be a real-valued function of a real variable. Then f is even if $f(-x) = f(x)$ for all x in its
domain. (Symmetric with respect to the y-axis)

Examples: $|x|$, $x^2$, $x^4$, $\cos x$, $\cosh x$.

Let f(x) be a real-valued function of a real variable. Then f is odd if $f(-x) = -f(x)$, or equivalently
$f(x) + f(-x) = 0$, for all x in its domain. (Symmetric with respect to the origin)

Examples: $x$, $x^3$, $\sin x$, $\sinh x$.

Example: Determine whether the following function is odd or even: $f(x) = \frac{3x}{x^2+1}$.

Solution:
\[ f(-x) = \frac{3(-x)}{(-x)^2+1} = -\frac{3x}{x^2+1} = -f(x). \]
The function is odd.
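The algebraic check above can be complemented by a numerical spot check. The sketch below is an illustration under stated assumptions: `classify_parity` is a hypothetical helper, and it only tests the defining identities at a handful of sample points, so it can suggest but not prove that a function is even or odd.

```python
import math

def classify_parity(f, samples, tol=1e-12):
    """Return 'even', 'odd', or 'neither' based on the sample points only."""
    is_even = all(abs(f(-x) - f(x)) <= tol for x in samples)
    is_odd = all(abs(f(-x) + f(x)) <= tol for x in samples)
    if is_even and not is_odd:
        return "even"
    if is_odd and not is_even:
        return "odd"
    return "neither"   # also returned for the zero function, a simplification

samples = [0.5, 1.0, 2.0, 3.7]
print(classify_parity(lambda x: 3 * x / (x**2 + 1), samples))
print(classify_parity(math.cos, samples))
print(classify_parity(lambda x: x + 1, samples))
```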

4.5 Combining Functions

A function f can be combined with another function g by means of arithmetic operations to form
other functions. The sum $f + g$, difference $f - g$, product $fg$ and quotient $\frac{f}{g}$ are defined as follows.
Let f and g denote functions, then

1. Sum : $(f + g)(x) = f(x) + g(x)$.

2. Difference : $(f - g)(x) = f(x) - g(x)$.

3. Product : $(fg)(x) = f(x)g(x)$.

4. Quotient : $\left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}$, provided $g(x) \neq 0$.

Example: If $f(x) = 2x^2 - 5$ and $g(x) = 3x + 4$, find $f + g$, $f - g$, $fg$, $\frac{f}{g}$.

Solution:
\[
\begin{aligned}
(f + g)(x) &= (2x^2 - 5) + (3x + 4) = 2x^2 + 3x - 1. \\
(f - g)(x) &= (2x^2 - 5) - (3x + 4) = 2x^2 - 3x - 9. \\
(fg)(x) &= (2x^2 - 5)(3x + 4) = 6x^3 + 8x^2 - 15x - 20. \\
\left(\frac{f}{g}\right)(x) &= \frac{2x^2 - 5}{3x + 4}.
\end{aligned}
\]

4.6 Composition of Functions

Let f and g denote functions. The composition of f and g, written $f \circ g$, is the function
$(f \circ g)(x) = f(g(x))$ and the composition of g and f, written $g \circ f$, is the function $(g \circ f)(x) = g(f(x))$.

Example: If $f(x) = x^2$ and $g(x) = x^2 + 1$, find $f \circ g$ and $g \circ f$.

Solution:
\[ (f \circ g)(x) = f(g(x)) = f(x^2 + 1) = (x^2 + 1)^2 = x^4 + 2x^2 + 1 \]
and
\[ (g \circ f)(x) = g(f(x)) = g(x^2) = (x^2)^2 + 1 = x^4 + 1. \]
In general, $f \circ g \neq g \circ f$.
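Composition translates directly into code, which makes the order dependence easy to see. The following is a minimal sketch; the helper `compose` is an assumed name, not something defined in these notes.

```python
def compose(f, g):
    """Return the composition f o g, i.e. the function x -> f(g(x))."""
    return lambda x: f(g(x))

f = lambda x: x**2
g = lambda x: x**2 + 1

f_after_g = compose(f, g)   # (f o g)(x) = (x^2 + 1)^2
g_after_f = compose(g, f)   # (g o f)(x) = x^4 + 1

for x in (0, 1, 2, 3):
    print(x, f_after_g(x), g_after_f(x))
```

For x = 2 the two compositions give 25 and 17, so the order of composition matters.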

4.7 Bijection, Injection and Surjection

Classes of functions may be distinguished by the manner in which arguments and images are related
or mapped to each other.

A function $f : X \to Y$ is injective (one-to-one, 1 − 1) if every element of the range corresponds
to exactly one element in its domain X. That is,

for all $x, y \in X$, $f(x) = f(y) \implies x = y$, or equivalently,

for all $x, y \in X$, $x \neq y \implies f(x) \neq f(y)$. An injective function is an injection.

Example: Show that the functions $f(x) = 2x + 3$ and $g(x) = x^3 - 2$ are injective.

Solution: Need to show that $f(x) = f(y) \implies x = y$.
\[
\begin{aligned}
2x + 3 &= 2y + 3 \\
2x &= 2y \implies x = y.
\end{aligned}
\]

Hence $f(x) = 2x + 3$ is injective.

Need to show that $g(x) = g(y) \implies x = y$.
\[
\begin{aligned}
x^3 - 2 &= y^3 - 2 \\
x^3 &= y^3 \quad \text{(taking cube roots)} \\
x &= y.
\end{aligned}
\]
Hence $g(x) = x^3 - 2$ is injective.

A function $f : X \to Y$ is called onto if for all y in Y there is an x in X such that $f(x) = y$. All
elements in Y are used. Such functions are referred to as surjective.

Example: Show that $f(x) = 3x - 5$ is onto.

Solution: For f to be onto, the equation $f(x) = y$, i.e., $3x - 5 = y$, must have a solution x for every
real y. Solving for x gives $x = \frac{y+5}{3}$. So
\[ f\left(\frac{y+5}{3}\right) = 3\left(\frac{y+5}{3}\right) - 5 = y. \]
Therefore f is onto.
3 3

Let X and Y be sets. A function f : X → Y that is one-to-one and onto is called a bijection
or bijective function from X to Y . If f is both one-to-one and onto, then we call f a 1 − 1
correspondence.

Inverse of a Function. Suppose f is a 1 − 1 function that has domain X and range Y . Since
every element y ∈ Y corresponds with precisely one element x of X, the function f must actually
determine a reverse function g whose domain is Y and range is X, where f and g must satisfy
$f(x) = y$ and $g(y) = x$. The function g is given the formal name inverse of f and is usually written
$f^{-1}$ and read f inverse. Not all functions have inverses; those that do are called invertible functions.

Example: Find the inverse of the function $f(x) = (2x + 8)^3$.

Solution: We must solve the equation $y = (2x + 8)^3$ for x.
\[
\begin{aligned}
y &= (2x + 8)^3 \\
\sqrt[3]{y} &= 2x + 8 \\
\sqrt[3]{y} - 8 &= 2x \\
x &= \frac{\sqrt[3]{y} - 8}{2}.
\end{aligned}
\]

Hence the inverse function $f^{-1}$ is given by $f^{-1}(x) = \frac{\sqrt[3]{x} - 8}{2}$.
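One way to sanity-check an inverse is to verify that applying f and then the candidate inverse returns the starting value. The sketch below does this for the example above. The helper `real_cube_root` is an assumption introduced because raising a negative float to the power 1/3 in Python does not return the real cube root.

```python
import math

def f(x):
    return (2 * x + 8) ** 3

def real_cube_root(y):
    """Real cube root, valid for negative y as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def f_inverse(y):
    """Candidate inverse from the worked example: (cube root of y, minus 8) / 2."""
    return (real_cube_root(y) - 8) / 2

for x in (-6.0, -4.0, 0.0, 1.5):
    # Up to rounding error, f_inverse(f(x)) should reproduce x.
    print(x, f(x), f_inverse(f(x)))
```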

A 1 − 1 function f can have only one inverse, i.e., $f^{-1}$ is unique. A function $f : X \to Y$ is invertible
if and only if f is one-to-one and maps X onto Y.

4.8 Operations on Functions

Equality of Functions

Equality of functions does not mean the same as equality of two numbers (numbers have a fixed
value but values of functions vary). Each function is a relationship between x and y, the two
relationships are the same if for every value of x we get the same value of y.

Example: The functions $(x - 1)(x + 2)$ and $x^2 + x - 2$ are equal.

Example: Equal functions for positive values of x,
\[ |x| = \sqrt{x^2}. \]

Identity Function

Generally, an identity function is one which does not change the domain values at all. It is the
function $f(x) = x$, denoted by $I_X$.

Chapter 5

Limits and Continuity

The single most important idea in calculus is the idea of limit. More than 2000 years ago, the
ancient Greeks wrestled with the limit concept, and they did not succeed. It is only in the past 200
years that we have finally come up with a firm understanding of limits. The study of calculus went
through several periods of increased mathematical rigour beginning with the French mathematician
Augustin-Louis Cauchy (1789-1857) and later continued by the German mathematician, and former
high school teacher, Karl Wilhelm Weierstrass (1815-1897).

5.1 Limit of a Function

If f is a function, then we say $\lim_{x\to a} f(x) = A$ if the value of f(x) gets arbitrarily close to A as x gets
closer and closer to a. For example, $\lim_{x\to 3} x^2 = 9$, since $x^2$ gets arbitrarily close to 9 as x approaches
as close as one wishes to 3.

The definition can be stated more precisely as follows : $\lim_{x\to a} f(x) = A$ if and only if, for any
chosen positive number ε, however small, there exists a positive number δ, such that, whenever
$0 < |x - a| < \delta$, then $|f(x) - A| < \varepsilon$.

$\lim_{x\to a} f(x) = A$ means that f(x) can be made as close as desired to A by making x close enough, but
not equal, to a. How close is “close enough to a” depends on how close one wants to make f(x) to
A. It also of course depends on which function f is and on which number a is. The positive number
ε is how close one wants to make f(x) to A ; one wants the distance to be less than ε. The
positive number δ is how close one will make x to a ; if the distance from x to a is less than δ (but
not zero), then the distance from f(x) to A will be less than ε. Thus δ depends on ε. The limit
statement means that no matter how small ε is made, δ can be made small enough. The letters
ε and δ can be understood as “error” and “distance”. In these terms the error (ε) can be made as
small as desired by reducing the distance (δ).

The ε − δ definition of $\lim_{x\to a} f(x) = A$

For any chosen positive number ε, however small, there exists a positive number δ, such that,
whenever $0 < |x - a| < \delta$, then $|f(x) - A| < \varepsilon$.

Example: Show that lim (x2 + 1) = 2.


n→1

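Solution: We need to find $\delta > 0$ so that, for a given $\varepsilon > 0$, $|x^2 + 1 - 2| < \varepsilon$ whenever $0 < |x - 1| < \delta$.

Now
\[ x^2 + 1 - 2 = x^2 - 1 = (x + 1)(x - 1). \]

First require $|x - 1| < 1$, so that $-1 < x - 1 < 1 \Rightarrow 0 < x < 2 \Rightarrow 1 < x + 1 < 3$. Then $|x^2 + 1 - 2| = |x + 1||x - 1| < 3|x - 1|$, which is less than $\varepsilon$ if $|x - 1| < \frac{\varepsilon}{3}$. You now have two conditions on $x$:
\[ |x - 1| < 1 \quad\text{and}\quad |x - 1| < \frac{\varepsilon}{3}. \]
So, for a given $\varepsilon > 0$, choose $\delta = \min\{1, \frac{\varepsilon}{3}\}$. Then, whenever $0 < |x - 1| < \delta$, it is true that $|x^2 + 1 - 2| < \varepsilon$.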
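
The choice $\delta = \min\{1, \varepsilon/3\}$ can be sanity-checked numerically. The following is only an illustrative sketch (the function names and the sampling scheme are assumptions made here, not part of the notes); sampling points cannot replace the proof, but it helps catch an incorrect $\delta$.

```python
def delta_for(eps):
    """Candidate delta from the worked example: min(1, eps / 3)."""
    return min(1.0, eps / 3.0)


def check(eps, samples=10_000):
    """Sample x with 0 < |x - 1| < delta and test |(x^2 + 1) - 2| < eps."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (1 - offset, 1 + offset):
            if abs((x * x + 1) - 2) >= eps:
                return False
    return True


for eps in (1.0, 0.1, 0.001):
    print(eps, delta_for(eps), check(eps))
```

Every sampled point satisfies the inequality, in line with the bound $|x^2 - 1| < 3|x - 1|$ used in the solution.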

Example: Show that $\lim_{x\to 2} (x^2 + 3x) = 10$.

Solution: Let $\varepsilon > 0$. We must produce a $\delta > 0$ such that, whenever $0 < |x - 2| < \delta$, then $|(x^2 + 3x) - 10| < \varepsilon$. First we note that
\[ |(x^2 + 3x) - 10| = |(x - 2)^2 + 7(x - 2)| \le |x - 2|^2 + 7|x - 2|. \]
Also, if $0 < \delta \le 1$, then $\delta^2 \le \delta$. Hence, if we take $\delta$ to be the minimum of 1 and $\frac{\varepsilon}{8}$, then, whenever $0 < |x - 2| < \delta$,
\[ |(x^2 + 3x) - 10| < \delta^2 + 7\delta \le \delta + 7\delta = 8\delta \le \varepsilon. \]

You Try It: Prove that $\lim_{x\to 0} x \sin\left(\frac{1}{x}\right) = 0$.

Right and Left Limits

Considering $x$ and $a$ as points on the real axis where $a$ is fixed and $x$ is moving, then $x$ can approach $a$ from the right or from the left. We indicate these respective approaches by writing $x \to a^+$ and $x \to a^-$.

If $\lim_{x\to a^+} f(x) = A_1$ and $\lim_{x\to a^-} f(x) = A_2$, we call $A_1$ and $A_2$ respectively the right and left hand limits of $f(x)$ at $a$.

We have $\lim_{x\to a} f(x) = A$ if and only if $\lim_{x\to a^+} f(x) = \lim_{x\to a^-} f(x) = A$. The existence of the limit from the left does not imply the existence of the limit from the right, and conversely. When a function $f$ is defined on only one side of a point $a$, then $\lim_{x\to a} f(x)$ is identical to the one-sided limit, if it exists. For example, if $f(x) = \sqrt{x}$, then $f$ is only defined to the right of zero. Hence, $\lim_{x\to 0} \sqrt{x} = \lim_{x\to 0^+} \sqrt{x} = 0$. Of course, $\lim_{x\to 0^-} \sqrt{x}$ does not exist, since $\sqrt{x}$ is not defined when $x < 0$. On the other hand, consider the function $g(x) = \sqrt{\frac{1}{x}}$, which is defined only for $x > 0$. In this case, $\lim_{x\to 0^+} g(x)$ does not exist and, therefore, $\lim_{x\to 0} g(x)$ does not exist.

5.2 Theorems on Limits


1. If $f(x) = c$, a constant, then $\lim_{x\to a} f(x) = c$.

2. If $\lim_{x\to a} f(x) = A$ and $\lim_{x\to a} g(x) = B$, then

(a) $\lim_{x\to a} k f(x) = kA$, $k$ being any constant.

(b) $\lim_{x\to a} [f(x) \pm g(x)] = \lim_{x\to a} f(x) \pm \lim_{x\to a} g(x) = A \pm B$.

(c) $\lim_{x\to a} f(x)g(x) = \lim_{x\to a} f(x) \cdot \lim_{x\to a} g(x) = AB$.

(d) $\lim_{x\to a} \dfrac{f(x)}{g(x)} = \dfrac{\lim_{x\to a} f(x)}{\lim_{x\to a} g(x)} = \dfrac{A}{B}$, provided $B \ne 0$.

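Example: If $\lim_{x\to a} f(x)$ exists, prove that it must be unique.

Solution: We must show that if $\lim_{x\to a} f(x) = A_1$ and $\lim_{x\to a} f(x) = A_2$, then $A_1 = A_2$.

By hypothesis, given any $\varepsilon > 0$ we can find $\delta > 0$ (the smaller of the two values supplied by the two limit statements) such that
\[ |f(x) - A_1| < \frac{\varepsilon}{2} \quad\text{when } 0 < |x - a| < \delta, \]
\[ |f(x) - A_2| < \frac{\varepsilon}{2} \quad\text{when } 0 < |x - a| < \delta. \]
Then, for any such $x$,
\[ |A_1 - A_2| = |A_1 - f(x) + f(x) - A_2| \le |A_1 - f(x)| + |f(x) - A_2| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
i.e., $|A_1 - A_2|$ is less than any positive number $\varepsilon$ (however small) and so must be zero. Thus $A_1 = A_2$.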

Example: Given $\lim_{x\to a} f(x) = A$ and $\lim_{x\to a} g(x) = B$, prove that
\[ \lim_{x\to a} [f(x) + g(x)] = \lim_{x\to a} f(x) + \lim_{x\to a} g(x) = A + B. \]

Solution: We must show that for any $\varepsilon > 0$, we can find $\delta > 0$ such that $|(f(x) + g(x)) - (A + B)| < \varepsilon$ when $0 < |x - a| < \delta$.

By hypothesis, given $\varepsilon > 0$, we can find $\delta_1 > 0$ and $\delta_2 > 0$ such that
\[ |f(x) - A| < \frac{\varepsilon}{2} \quad\text{when } 0 < |x - a| < \delta_1, \]
\[ |g(x) - B| < \frac{\varepsilon}{2} \quad\text{when } 0 < |x - a| < \delta_2. \]
Then
\[ |(f(x) + g(x)) - (A + B)| \le |f(x) - A| + |g(x) - B| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
when $0 < |x - a| < \delta$, where $\delta$ is chosen as the smaller of $\delta_1$ and $\delta_2$.

You Try It: Given $\lim_{x\to a} f(x) = A$ and $\lim_{x\to a} g(x) = B$, prove that
\[ \lim_{x\to a} f(x)g(x) = \lim_{x\to a} f(x) \cdot \lim_{x\to a} g(x) = AB. \]

5.3 Special Limits


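1. $\lim_{x\to 0} \dfrac{\sin x}{x} = 1$, $\quad \lim_{x\to 0} \dfrac{1 - \cos x}{x} = 0$.

2. $\lim_{x\to\infty} \left(1 + \dfrac{1}{x}\right)^x = e$, $\quad \lim_{x\to 0^+} (1 + x)^{1/x} = e$.

3. $\lim_{x\to 0} \dfrac{e^x - 1}{x} = 1$, $\quad \lim_{x\to 1} \dfrac{x - 1}{\ln x} = 1$.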
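
These special limits can be made plausible by tabulating each expression near its limit point. The sketch below is only a numerical illustration (the step sizes and names are arbitrary choices made here), not a proof.

```python
import math

STEPS = (1e-1, 1e-3, 1e-5)  # distances from the limit point (illustrative)


def approach(f, point):
    """Evaluate f at point + h for each shrinking step h > 0."""
    return [f(point + h) for h in STEPS]


sin_ratio = approach(lambda x: math.sin(x) / x, 0.0)
cos_ratio = approach(lambda x: (1 - math.cos(x)) / x, 0.0)
exp_ratio = approach(lambda x: (math.exp(x) - 1) / x, 0.0)
log_ratio = approach(lambda x: (x - 1) / math.log(x), 1.0)
e_power = approach(lambda x: (1 + x) ** (1 / x), 0.0)

for name, values in [
    ("sin x / x", sin_ratio),
    ("(1 - cos x) / x", cos_ratio),
    ("(e^x - 1) / x", exp_ratio),
    ("(x - 1) / ln x", log_ratio),
    ("(1 + x)^(1/x)", e_power),
]:
    print(name, values)
```

The printed values drift towards 1, 0, 1, 1 and $e$ respectively as the step shrinks.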

5.4 Methods of Calculating $\lim_{x\to a} f(x)$

If f (a) is defined

If $x = a$ is in the domain of $f(x)$ and $a$ is not an endpoint of the domain, and $f(x)$ is defined by a single expression, then
\[ \lim_{x\to a} f(x) = f(a). \]

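Example: Find $\lim_{x\to 1} (x + 3)$.

Solution: $\lim_{x\to 1} (x + 3) = 1 + 3 = 4$.

Example: Find $\lim_{x\to 1} \dfrac{1}{x + 2}$.

Solution: $\lim_{x\to 1} \dfrac{1}{x + 2} = \dfrac{1}{1 + 2} = \dfrac{1}{3}$.

Example: Find $\lim_{x\to 8} (x^2 - 7x + 5)$.

Solution: $\lim_{x\to 8} (x^2 - 7x + 5) = 8^2 - 7(8) + 5 = 13$.

Example: Find $\lim_{x\to 2} \dfrac{x^2 - 4}{x - 2}$.

Solution: Here $f(2)$ is not defined, but for $x \ne 2$ the common factor $x - 2$ cancels:
\[ \lim_{x\to 2} \frac{x^2 - 4}{x - 2} = \lim_{x\to 2} \frac{(x + 2)(x - 2)}{x - 2} = \lim_{x\to 2} (x + 2) = 4. \]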

Functions Defined By More Than One Expression

Suppose that $f(x)$ is defined by one expression for $x < a$ and by a different expression for $x > a$.

Example: Show that $\lim_{x\to 0} \dfrac{|x|}{x}$ does not exist.

Solution: Notice that
\[ \frac{|x|}{x} = \begin{cases} \dfrac{x}{x} = 1, & \text{if } x > 0 \\ -\dfrac{x}{x} = -1, & \text{if } x < 0, \end{cases} \]
i.e., you seek a limit at $x = 0$ of a function that is defined differently on either side of $x = 0$.
\[ \lim_{x\to 0^-} \frac{|x|}{x} = \lim_{x\to 0^-} (-1) = -1. \]
\[ \lim_{x\to 0^+} \frac{|x|}{x} = \lim_{x\to 0^+} (1) = 1. \]

Since $\lim_{x\to 0^+} \dfrac{|x|}{x} \ne \lim_{x\to 0^-} \dfrac{|x|}{x}$, the limit $\lim_{x\to 0} \dfrac{|x|}{x}$ does not exist.

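Example: Find $\lim_{x\to 0} \dfrac{\sin 3x}{x}$.

Solution: Since $\dfrac{\sin 3x}{x} = 3\left(\dfrac{\sin 3x}{3x}\right)$, we have
\[ \lim_{x\to 0} \frac{\sin 3x}{x} = \lim_{x\to 0} 3\left(\frac{\sin 3x}{3x}\right) = 3 \lim_{x\to 0} \frac{\sin 3x}{3x} = 3(1) = 3. \]

Example: Find $\lim_{x\to 0} \dfrac{1 - \cos 2x}{\sin 3x}$.

Solution: Since
\[ \frac{1 - \cos 2x}{\sin 3x} = \left(\frac{1 - \cos 2x}{2x}\right) 2x \left(\frac{3x}{\sin 3x}\right) \frac{1}{3x} = \frac{2}{3} \left(\frac{1 - \cos 2x}{2x}\right) \left(\frac{3x}{\sin 3x}\right), \]
we have
\[ \lim_{x\to 0} \frac{1 - \cos 2x}{\sin 3x} = \frac{2}{3} \lim_{x\to 0} \left(\frac{1 - \cos 2x}{2x}\right) \lim_{x\to 0} \left(\frac{3x}{\sin 3x}\right) = \frac{2}{3}(0)(1) = 0. \]

Example: Find $\lim_{x\to \frac{\pi}{4}} \dfrac{\sin x}{\cos x}$.

Solution: $\lim_{x\to \frac{\pi}{4}} \dfrac{\sin x}{\cos x} = \dfrac{\sin \frac{\pi}{4}}{\cos \frac{\pi}{4}} = 1$.

You Try It: Show that $\lim_{\theta\to 0} \dfrac{1 - \cos \theta}{\theta} = 0$.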

Limits at Infinity

It sometimes happens that as $x \to a$, $f(x)$ increases or decreases without bound. We write $\lim_{x\to a} f(x) = +\infty$ or $\lim_{x\to a} f(x) = -\infty$. We say that $\lim_{x\to a} f(x) = +\infty$ if for each positive number $M$ we can find a positive number $\delta$ (depending on $M$ in general) such that $f(x) > M$ whenever $0 < |x - a| < \delta$.

Similarly, we say that $\lim_{x\to a} f(x) = -\infty$ if for each positive number $M$ we can find a positive number $\delta$ (depending on $M$ in general) such that $f(x) < -M$ whenever $0 < |x - a| < \delta$.

Note that $\lim_{x\to\infty} \dfrac{1}{x} = 0$ and $\lim_{x\to -\infty} \dfrac{1}{x} = 0$.

Limits at Infinity of a Rational Function

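A rational function is a quotient of two polynomials, $f(x) = \dfrac{p_m(x)}{q_n(x)}$, where $m$ and $n$ are the degrees of the two polynomials.

1. If $m < n$, then $\lim_{x\to\infty} f(x) = 0$.

Example: Find $\lim_{x\to\infty} \dfrac{x + 1}{x^2 + 4}$.

Solution: The degree of the numerator is one, the degree of the denominator is two. Therefore $\lim_{x\to\infty} \dfrac{x + 1}{x^2 + 4} = 0$, since, dividing numerator and denominator by $x^2$,
\[ \lim_{x\to\infty} \frac{\frac{1}{x} + \frac{1}{x^2}}{1 + \frac{4}{x^2}} = \frac{0 + 0}{1 + 0} = 0. \]

2. If $m > n$, then $\lim_{x\to\infty} f(x) = \pm\infty$. (The sign depends on the polynomials $p_m(x)$ and $q_n(x)$: if they are of the same sign as $x$ gets larger, the quotient is positive; if they are of opposite signs, the quotient is negative.)

Example: Find $\lim_{x\to\infty} \dfrac{x^3 - 2x^2 + 3x + 4}{3x + 5}$.

Solution: Dividing numerator and denominator by $x^3$,
\[ \lim_{x\to\infty} \frac{x^3 - 2x^2 + 3x + 4}{3x + 5} = \lim_{x\to\infty} \frac{1 - \frac{2}{x} + \frac{3}{x^2} + \frac{4}{x^3}}{\frac{3}{x^2} + \frac{5}{x^3}} = \infty, \]
since the numerator tends to 1 while the denominator tends to 0 through positive values.

3. If $m = n$, then $\lim_{x\to\infty} f(x) = \dfrac{a}{b}$, where $a$ is the coefficient of $x^m$ in the numerator and $b$ is the coefficient of $x^n$ in the denominator.

Example: Find $\lim_{x\to\infty} \dfrac{x^3 - 4x + 1}{3x^3 + 2x + 7}$.

Solution:
\[ \lim_{x\to\infty} \frac{x^3 - 4x + 1}{3x^3 + 2x + 7} = \lim_{x\to\infty} \frac{1 - \frac{4}{x^2} + \frac{1}{x^3}}{3 + \frac{2}{x^2} + \frac{7}{x^3}} = \frac{1 - 0 + 0}{3 + 0 + 0} = \frac{1}{3}. \]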
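
The three cases above depend only on the degrees and the leading coefficients, so they can be written as a small function. This is a sketch under stated assumptions: polynomials are given as coefficient lists with the highest power first, leading coefficients are non-zero, and the function name is hypothetical.

```python
import math


def limit_at_infinity(num, den):
    """Limit of num(x) / den(x) as x -> +infinity.

    num, den: coefficient lists, highest degree first,
    with non-zero leading entries (an assumption of this sketch).
    """
    m, n = len(num) - 1, len(den) - 1
    if m < n:
        return 0.0
    if m == n:
        return num[0] / den[0]
    # m > n: unbounded, sign given by the ratio of leading coefficients
    sign = 1 if num[0] * den[0] > 0 else -1
    return sign * math.inf


print(limit_at_infinity([1, 1], [1, 0, 4]))         # (x + 1) / (x^2 + 4)
print(limit_at_infinity([1, -2, 3, 4], [3, 5]))     # (x^3 - 2x^2 + 3x + 4) / (3x + 5)
print(limit_at_infinity([1, 0, -4, 1], [3, 0, 2, 7]))  # (x^3 - 4x + 1) / (3x^3 + 2x + 7)
```

The three calls reproduce the worked examples in this subsection.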

You Try It: What is $\lim_{x\to\infty} \dfrac{a_0 x^m + a_1 x^{m-1} + \cdots + a_m}{b_0 x^n + b_1 x^{n-1} + \cdots + b_n}$, where $a_0, b_0 \ne 0$ and $m$ and $n$ are positive integers, when (a) $m > n$ (b) $m = n$ (c) $m < n$?

You Try It: Find $\lim_{x\to 4} \dfrac{x - 4}{\sqrt{x} - 2}$.

5.5 Continuity

A function $f(x)$ is continuous at a point $x = a$ if

1. $f(a)$ is defined.

2. $\lim_{x\to a} f(x)$ exists.

3. $\lim_{x\to a} f(x) = f(a)$.

Notice that, for $f(x)$ to be continuous at $x = a$, all three conditions must be satisfied. If at least one condition fails, $f$ is said to have a discontinuity at $x = a$. For example, $f(x) = x^2 + 1$ is continuous at $x = 2$ since $\lim_{x\to 2} f(x) = 5 = f(2)$. The first condition above implies that a function can be continuous only at points of its domain. Thus, $f(x) = \sqrt{4 - x^2}$ is not continuous at $x = 3$ because $f(3)$ is imaginary, i.e., not defined.

A function $f$ is right-continuous (continuous from the right) at a point $x = a$ in its domain if $\lim_{x\to a^+} f(x) = f(a)$. It is left-continuous (continuous from the left) at $x = a$ if $\lim_{x\to a^-} f(x) = f(a)$. A function is continuous at an interior point $x = a$ of its domain if and only if it is both right-continuous and left-continuous at $x = a$.

Example: Determine whether $f(x) = x^2 + 1$ is continuous at $x = 1$.

Solution: $\lim_{x\to 1} (x^2 + 1) = 1^2 + 1 = 2 = f(1)$. Therefore $f(x) = x^2 + 1$ is continuous at $x = 1$.

Example: Determine whether $f(x) = \dfrac{|x|}{x}$ is continuous at $x = 0$.

Solution: Since $f(0)$ is not defined, $f(x)$ is not continuous at $x = 0$.

Example: Determine whether
\[ f(x) = \begin{cases} \dfrac{|x|}{x}, & \text{if } x \ne 0 \\ 0, & \text{if } x = 0, \end{cases} \]
is continuous at $x = 0$.

Solution: $f(0)$ is now defined. The limit $\lim_{x\to 0} f(x)$ must be considered in two steps:
\[ \lim_{x\to 0^-} f(x) = \lim_{x\to 0^-} (-1) = -1, \]
\[ \lim_{x\to 0^+} f(x) = \lim_{x\to 0^+} (1) = 1. \]

Since the limits are not the same, $\lim_{x\to 0} f(x)$ does not exist and $f(x)$ is not continuous at $x = 0$.

You Try It: Determine whether the function defined by
\[ f(x) = \begin{cases} x^2, & \text{if } x < 2 \\ 5, & \text{if } x = 2 \\ -x + 6, & \text{if } x > 2, \end{cases} \]
is continuous at the point $x = 2$.

A function $f(x)$ is discontinuous at $x = a$ if one or more of the conditions for continuity fails there.

Example: (a) $f(x) = \dfrac{1}{x - 2}$ is discontinuous at $x = 2$, because $f(2)$ is not defined (has a zero denominator) and because $\lim_{x\to 2} f(x)$ does not exist ($|f(x)|$ grows without bound as $x \to 2$). The function is, however, continuous everywhere except at $x = 2$, where it is said to have an infinite discontinuity.

(b) $f(x) = \dfrac{x^2 - 4}{x - 2}$ is discontinuous at $x = 2$ because $f(2)$ is not defined (both numerator and denominator are zero), even though $\lim_{x\to 2} f(x) = 4$ exists. The discontinuity here is called removable since it may be removed by redefining the function as $f(x) = \dfrac{x^2 - 4}{x - 2}$ for $x \ne 2$ and $f(2) = 4$. (Note the discontinuity in (a) cannot be removed because the limit also does not exist.)

5.6 The ε − δ Definition of Continuity

$f(x)$ is continuous at $x = a$ if, for any $\varepsilon > 0$, we can find $\delta > 0$ such that $|f(x) - f(a)| < \varepsilon$ whenever $0 < |x - a| < \delta$.

Example: Prove that $f(x) = x^2$ is continuous at $x = 2$.

Solution: We must show that, given any $\varepsilon > 0$, we can find $\delta > 0$ such that $|f(x) - f(2)| = |x^2 - 4| < \varepsilon$ when $|x - 2| < \delta$.

Choose $\delta \le 1$, so that $|x - 2| < 1$, i.e., $1 < x < 3$ ($x \ne 2$). Then $|x + 2| < 5$ and $|x^2 - 4| = |(x - 2)(x + 2)| = |x - 2||x + 2| < \delta|x + 2| < 5\delta$. Taking $\delta = \min\{1, \frac{\varepsilon}{5}\}$, we have $|x^2 - 4| < 5\delta \le \varepsilon$ whenever $|x - 2| < \delta$.

You Try It: (a) Prove that $f(x) = x$ is continuous at any point $x = x_0$.
(b) Prove that $f(x) = 2x^3 + x$ is continuous at any point $x = x_0$.

Theorems on Continuity

Theorem 1
If $f(x)$ and $g(x)$ are continuous at $x = a$, so are the functions $f(x) \pm g(x)$, $f(x)g(x)$ and $\dfrac{f(x)}{g(x)}$ if $g(a) \ne 0$.

Theorem 2
The following functions are continuous in every finite interval: (a) all polynomials (b) $\sin x$ and $\cos x$ (c) $a^x$, $a > 0$.

Theorem 3
If $y = f(x)$ is continuous at $x = a$ and $z = g(y)$ is continuous at $y = b$ and if $b = f(a)$, then the function $z = g[f(x)]$, called a function of a function or composite function, is continuous at $x = a$.

Briefly: A continuous function of a continuous function is continuous.

Theorem 4
If $f(x)$ is continuous in a closed interval, it is bounded in the interval.

5.7 Tutorial 3
1. Find the domain of each of the following functions.
(i) $\dfrac{1}{\sqrt{1 - x}}$ (ii) $\sqrt{\dfrac{x}{2 - x}}$ (iii) $\dfrac{x^3 - 8}{x^2 - 4}$
2. Determine whether or not each of the following correspondences is a function.
(i) $x^2 + y = 1$ (ii) $x^2 y^2 = 5$ (iii) $x^2 y = 4$ (iv) $\{(1, 5), (2, 5), (5, 1)\}$.
3. For each of the following pairs of functions, calculate $f \circ g$ and $g \circ f$.
(i) $f(x) = e^{x^2}$ and $g(x) = e^{-x^2}$.
4. Let the function $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = 2x - 1$. Show that $f$ is bijective and hence find $f^{-1}$.
5. Show that if $f : A \to B$ is onto and $g : B \to C$ is onto, then the product (composite) function $(g \circ f) : A \to C$ is onto.
6. Show that the function $f(x) = \ln(x + \sqrt{x^2 + 1})$ is an odd function and find the inverse function of $f(x)$.
7. Let
\[ f(x) = \begin{cases} 3x - 1, & x < 0 \\ 0, & x = 0 \\ 2x + 5, & x > 0. \end{cases} \]
Evaluate (i) $\lim_{x\to 2} f(x)$ (ii) $\lim_{x\to -3} f(x)$ (iii) $\lim_{x\to 0^+} f(x)$ (iv) $\lim_{x\to 0^-} f(x)$ (v) $\lim_{x\to 0} f(x)$.

8. If $f(x) = \dfrac{3x + |x|}{7x - 5|x|}$, evaluate (i) $\lim_{x\to\infty} f(x)$ (ii) $\lim_{x\to -\infty} f(x)$.

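9. Use the theorems on limits to evaluate each of the following limits.
(i) $\lim_{x\to 0} \dfrac{x^3 - 8}{x - 2}$ (ii) $\lim_{x\to 0} \dfrac{\sqrt{3 + x} - \sqrt{3}}{x}$ (iii) $\lim_{x\to\infty} \dfrac{8x^2 + 4}{2x^2 - x + 2}$ (iv) $\lim_{x\to 2} (4x^2 - x + 5)$
(v) $\lim_{x\to 0} \dfrac{\sqrt{x + 4} - 2}{x}$ (vi) $\lim_{x\to\infty} \dfrac{a_0 + a_1 x + \cdots + a_m x^m}{b_0 + b_1 x + \cdots + b_n x^n}$ (vii) $\lim_{x\to\infty} \dfrac{(3x - 1)(2x + 3)}{(5x - 3)(4x + 5)}$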

10. Use the definition of a limit to prove that $\lim_{x\to 3} (3x^2 - 7x + 1) = 7$.

11. Verify (i) $\lim_{x\to 2} \dfrac{x^2 - 4}{x - 2} = 4$ (ii) $\lim_{x\to 0} x^2 \cos\left(\dfrac{1}{x}\right) = 0$.
12. Give the points of discontinuity of each of the following functions.
(i) $f(x) = \dfrac{x}{(x - 2)(x - 4)}$ (ii) $f(x) = x^2 \sin\dfrac{1}{x}$, $f(0) = 0$
(iii) $f(x) = \sqrt{(x - 3)(6 - x)}$, $3 \le x \le 6$ (iv) $f(x) = \dfrac{1}{1 + 2\sin x}$.
13. Given that the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} x^2 - 4, & x \ge 2 \\ 2ax + b, & 0 < x < 2 \\ e^x, & x \le 0, \end{cases} \]
is continuous at all points in $\mathbb{R}$, find the values of $a$ and $b$.

14. Let $f : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f(x) = \begin{cases} x^2, & x \ge 2 \\ ax + b, & 1 < x < 2 \\ 2 - x, & x \le 1, \end{cases} \]
where $a$ and $b$ are constants. If $f$ is continuous on $\mathbb{R}$, find the values of $a$ and $b$.

15. Sketch the graphs of the following functions.
(i) $f(x) = 2x^2 - 5x - 3$ (ii) $f(x) = \dfrac{x + 1}{(x - 2)(3x + 5)}$ (iii) $f(x) = x^4$ (iv) $f(x) = x^9$

Chapter 6

Differentiation

Increments. The increment $\Delta x$ of a variable $x$ is the change in $x$ as it increases or decreases from one value $x = x_0$ to another value $x = x_1$ in its domain. Here, $\Delta x = x_1 - x_0$ and we may write $x_1 = x_0 + \Delta x$. If the variable $x$ is given an increment $\Delta x$ from $x = x_0$ (i.e., if $x$ changes from $x = x_0$ to $x_1 = x_0 + \Delta x$) and a function $y = f(x)$ is thereby given an increment $\Delta y = f(x_0 + \Delta x) - f(x_0)$ from $y = f(x_0)$, then the quotient
\[ \frac{\Delta y}{\Delta x} = \frac{\text{change in } y}{\text{change in } x} \]
is called the average rate of change of the function on the interval between $x = x_0$ and $x_1 = x_0 + \Delta x$.

Let $f(x)$ be defined at any point $x_0$ in $(a, b)$. The derivative of $f(x)$ at $x = x_0$ is defined as
\[ f'(x_0) = \lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]
if this limit exists. A function is called differentiable at a point $x = x_0$ if it has a derivative at that point, i.e., if $f'(x_0)$ exists. If we write $x = x_0 + h$, then $h = x - x_0$ and $h$ approaches 0 if and only if $x$ approaches $x_0$. Therefore, an equivalent way of stating the definition of the derivative is
\[ f'(x_0) = \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0}. \]

Example: If $f(x) = x^3 - x$, find a formula for $f'(x)$.

Solution:
\begin{align*}
f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{[(x + h)^3 - (x + h)] - [x^3 - x]}{h} \\
&= \lim_{h\to 0} \frac{x^3 + 3x^2 h + 3xh^2 + h^3 - x - h - x^3 + x}{h} \\
&= \lim_{h\to 0} \frac{3x^2 h + 3xh^2 + h^3 - h}{h} \\
&= \lim_{h\to 0} (3x^2 + 3xh + h^2 - 1) = 3x^2 - 1.
\end{align*}
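
The limit definition can also be explored numerically: for small $h$ the difference quotient of $f(x) = x^3 - x$ should be close to $3x^2 - 1$. The sketch below is an illustration only; the sample point $x_0 = 2$ and the step sizes are arbitrary choices made here.

```python
def f(x):
    return x**3 - x


def difference_quotient(func, x, h):
    """Average rate of change of func over [x, x + h]."""
    return (func(x + h) - func(x)) / h


x0 = 2.0
exact = 3 * x0**2 - 1  # value of the derived formula 3x^2 - 1 at x0
for h in (1e-1, 1e-3, 1e-6):
    print(h, difference_quotient(f, x0, h), exact)
```

As $h$ shrinks, the quotient approaches the value of the formula at $x_0$.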

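Example: If $f(x) = \sqrt{x}$, find the derivative of $f$.

Solution:
\begin{align*}
f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h\to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \\
&= \lim_{h\to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \cdot \frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} \\
&= \lim_{h\to 0} \frac{(x + h) - x}{h(\sqrt{x + h} + \sqrt{x})} \\
&= \lim_{h\to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} \\
&= \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}}.
\end{align*}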

The derivative at $x$ may be denoted by $f'(x)$, $y'$, $\dfrac{dy}{dx}$ or $\dfrac{d}{dx}(f(x))$. The symbol $\dfrac{d}{dx}$ is called the differentiation operator because it indicates the operation of differentiation. The process of finding derivatives of functions is called differentiation.

A function $f$ is differentiable at $x_0$ if $f'(x_0)$ exists. It is differentiable on an open interval $(a, b)$ [or $(a, \infty)$ or $(-\infty, a)$ or $(-\infty, \infty)$] if it is differentiable at every number in the interval.

Example: Where is the function $f(x) = |x|$ differentiable?

Solution: If $x > 0$, then $|x| = x$ and we can choose $h$ small enough that $x + h > 0$ and hence $|x + h| = x + h$. Therefore, for $x > 0$,
\[ f'(x) = \lim_{h\to 0} \frac{|x + h| - |x|}{h} = \lim_{h\to 0} \frac{(x + h) - x}{h} = \lim_{h\to 0} \frac{h}{h} = 1, \]
and so $f$ is differentiable for any $x > 0$.

Similarly, for $x < 0$, we have $|x| = -x$ and $h$ can be chosen small enough that $x + h < 0$ and so $|x + h| = -(x + h)$. Therefore, for $x < 0$,
\[ f'(x) = \lim_{h\to 0} \frac{|x + h| - |x|}{h} = \lim_{h\to 0} \frac{-(x + h) - (-x)}{h} = \lim_{h\to 0} \frac{-h}{h} = -1, \]
and so $f$ is differentiable for any $x < 0$.

For $x = 0$ we have to investigate
\[ f'(0) = \lim_{h\to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h\to 0} \frac{|0 + h| - |0|}{h} \quad \text{(if it exists).} \]
Let’s compute the left and right limits separately:
\[ \lim_{h\to 0^+} \frac{|0 + h| - |0|}{h} = \lim_{h\to 0^+} \frac{|h|}{h} = \lim_{h\to 0^+} 1 = 1, \]
\[ \lim_{h\to 0^-} \frac{|0 + h| - |0|}{h} = \lim_{h\to 0^-} \frac{|h|}{h} = \lim_{h\to 0^-} (-1) = -1. \]

Since these limits are different, $f'(0)$ does not exist. Thus, $f$ is differentiable at all $x$ except 0.

6.1 Differentiation Techniques (Finding Derivatives)


1. From the definition,
\[ f'(x_0) = \lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]
if this limit exists.

2. The derivative of any constant function is zero, i.e., $c' = 0$.

3. For any real number $n$,
\[ (x^n)' = nx^{n-1}. \]
When differentiating, results can be expressed in a number of ways. For example, (i) if $y = 3x^2$ then $\dfrac{dy}{dx} = 6x$, (ii) if $f(x) = 3x^2$ then $f'(x) = 6x$, (iii) the differential coefficient of $3x^2$ is $6x$.
For example, if $f(x) = \sqrt{x} = x^{1/2}$, then $f'(x) = \frac{1}{2} x^{\frac{1}{2} - 1} = \frac{1}{2} x^{-\frac{1}{2}} = \dfrac{1}{2\sqrt{x}}$.
You Try It: Using the general rule, differentiate the following with respect to $x$:
(a) $f(x) = 5x^7$ (b) $f(x) = \dfrac{4}{x^2}$.

4. For any constant $c$, $(cf(x))' = cf'(x)$. For example, $(5x^3)' = 5(x^3)' = 5(3x^2) = 15x^2$.

5. The derivative of a sum (difference) is the sum (difference) of the derivatives,
\[ (f(x) \pm g(x))' = f'(x) \pm g'(x). \]
For example,
\[ (3x^5 - 2x^2 + 1)' = (3x^5)' - (2x^2)' + 1' = 3(x^5)' - 2(x^2)' + 0 = 3(5x^4) - 2(2x) = 15x^4 - 4x. \]

6. Product Rule
\[ (f(x)g(x))' = f(x)g'(x) + g(x)f'(x). \]

7. Quotient Rule
\[ \left(\frac{f(x)}{g(x)}\right)' = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}. \]

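8. Parametric Equations. If the coordinates $(x, y)$ of a point $P$ on a curve are given as functions $x = f(u)$ and $y = g(u)$ of a third variable or parameter $u$, the equations $x = f(u)$ and $y = g(u)$ are called parametric equations of the curve. For example, $x = \frac{1}{2}t$, $y = 4 - t^2$ or $x = \cos\theta$, $y = 4\sin^2\theta$.

The First Derivative $\dfrac{dy}{dx}$ is given by
\[ \frac{dy}{dx} = \frac{dy/du}{dx/du}. \]

The Second Derivative $\dfrac{d^2y}{dx^2}$ is given by
\[ \frac{d^2y}{dx^2} = \frac{d}{du}\left(\frac{dy}{dx}\right) \frac{du}{dx}. \]

Example: Find $\dfrac{dy}{dx}$ and $\dfrac{d^2y}{dx^2}$ given $x = \theta - \sin\theta$ and $y = 1 - \cos\theta$.

Solution: Note that $\dfrac{dx}{d\theta} = 1 - \cos\theta$ and $\dfrac{dy}{d\theta} = \sin\theta$, so
\[ \frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{\sin\theta}{1 - \cos\theta}. \]
Also
\begin{align*}
\frac{d^2y}{dx^2} &= \frac{d}{d\theta}\left(\frac{\sin\theta}{1 - \cos\theta}\right) \frac{d\theta}{dx} \\
&= \frac{\cos\theta - 1}{(1 - \cos\theta)^2} \cdot \frac{1}{1 - \cos\theta} = -\frac{1}{(1 - \cos\theta)^2}.
\end{align*}

Example: Find $\dfrac{dy}{dx}$ and $\dfrac{d^2y}{dx^2}$ given $x = e^t \cos t$ and $y = e^t \sin t$.

Solution: Note that $\dfrac{dx}{dt} = e^t(\cos t - \sin t)$ and $\dfrac{dy}{dt} = e^t(\sin t + \cos t)$, so
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{\sin t + \cos t}{\cos t - \sin t}. \]
Also
\begin{align*}
\frac{d^2y}{dx^2} &= \frac{d}{dt}\left(\frac{\sin t + \cos t}{\cos t - \sin t}\right) \frac{dt}{dx} \\
&= \frac{2}{(\cos t - \sin t)^2} \cdot \frac{1}{e^t(\cos t - \sin t)} = \frac{2}{e^t(\cos t - \sin t)^3}.
\end{align*}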
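
The parametric formulas can be cross-checked with finite differences. The sketch below does this for the first example ($x = \theta - \sin\theta$, $y = 1 - \cos\theta$) at $\theta = \pi/2$, where the formulas give $\frac{dy}{dx} = 1$ and $\frac{d^2y}{dx^2} = -1$. The helper names and the step size are assumptions of this illustration.

```python
import math


def x(theta):
    return theta - math.sin(theta)


def y(theta):
    return 1 - math.cos(theta)


def d(func, t, h=1e-4):
    """Central-difference approximation to the derivative of func at t."""
    return (func(t + h) - func(t - h)) / (2 * h)


def dydx(t):
    # first derivative: (dy/dtheta) / (dx/dtheta)
    return d(y, t) / d(x, t)


def d2ydx2(t):
    # second derivative: d/dtheta (dy/dx) times dtheta/dx
    return d(dydx, t) / d(x, t)


t0 = math.pi / 2
print(dydx(t0), math.sin(t0) / (1 - math.cos(t0)))
print(d2ydx2(t0), -1 / (1 - math.cos(t0)) ** 2)
```

Each printed pair shows the numerical estimate next to the value of the closed-form expression.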

Theorem 6.1.1. If $f(x) = c$ is a constant function, then $f'(x) = 0$ for all real numbers $x$.

Proof. Observe that $f'(x) = \lim_{h\to 0} \dfrac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \dfrac{c - c}{h} = \lim_{h\to 0} 0 = 0$.
Theorem 6.1.2 (Product Rule). If $f$ and $g$ are both differentiable at $x$, then the product function $fg$ is also differentiable at $x$ and $(fg)'(x) = f(x)g'(x) + g(x)f'(x)$.

Proof.
\[ (fg)'(x) = \lim_{h\to 0} \frac{f(x + h)g(x + h) - f(x)g(x)}{h}. \]
Using the trick of adding and subtracting $f(x + h)g(x)$ in the numerator,
\begin{align*}
(fg)'(x) &= \lim_{h\to 0} \frac{f(x + h)g(x + h) - f(x + h)g(x) + f(x + h)g(x) - f(x)g(x)}{h} \\
&= \lim_{h\to 0} f(x + h) \lim_{h\to 0} \frac{g(x + h) - g(x)}{h} + g(x) \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} \\
&= f(x)g'(x) + g(x)f'(x),
\end{align*}
where $\lim_{h\to 0} f(x + h) = f(x)$ because a function that is differentiable at $x$ is continuous at $x$.

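Theorem 6.1.3. If $g$ is differentiable at $x$ and $g(x) \ne 0$, then
\[ \left(\frac{1}{g}\right)'(x) = -\frac{g'(x)}{(g(x))^2}. \]

Proof.
\begin{align*}
\lim_{h\to 0} \frac{\frac{1}{g(x + h)} - \frac{1}{g(x)}}{h} &= \lim_{h\to 0} \frac{g(x) - g(x + h)}{h\,g(x)\,g(x + h)} \\
&= \lim_{h\to 0} \frac{-(g(x + h) - g(x))}{h} \cdot \frac{1}{g(x)g(x + h)} \\
&= \lim_{h\to 0} \frac{-(g(x + h) - g(x))}{h} \lim_{h\to 0} \frac{1}{g(x)g(x + h)} \\
&= -g'(x) \frac{1}{(g(x))^2}.
\end{align*}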

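Theorem 6.1.4. If $f$ and $g$ are differentiable at $x$ and $g(x) \ne 0$, then $\dfrac{f}{g}$ is differentiable at $x$ and
\[ \left(\frac{f}{g}\right)'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}. \]

Proof. Since $\dfrac{f}{g} = f \cdot \dfrac{1}{g}$, we have
\begin{align*}
\left(\frac{f}{g}\right)'(x) &= \left(f \cdot \frac{1}{g}\right)'(x) \\
&= f'(x) \frac{1}{g(x)} + f(x) \left(\frac{1}{g}\right)'(x) \\
&= \frac{f'(x)}{g(x)} + f(x) \left(-\frac{g'(x)}{(g(x))^2}\right) \\
&= \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}.
\end{align*}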

Example: If $f(x) = \dfrac{x^2 - 1}{x^2 + 1}$, then $f'(x) = \dfrac{(x^2 + 1)(2x) - (x^2 - 1)(2x)}{(x^2 + 1)^2} = \dfrac{4x}{(x^2 + 1)^2}$.

Example: If $f(x) = \dfrac{x}{x^2 + 1}$, then $f'(x) = \dfrac{(x^2 + 1) - x(2x)}{(x^2 + 1)^2} = \dfrac{1 - x^2}{(x^2 + 1)^2}$.

Example: If $f(x) = \dfrac{1}{x}$, then $f'(x) = -\dfrac{1}{x^2}$.

Example: If $x = \cos t$ and $y = t \sin t$, find $\dfrac{dy}{dx}$.

Solution:
\[ \frac{dy}{dx} = \frac{\frac{d}{dt}(t \sin t)}{\frac{d}{dt}(\cos t)} = \frac{\sin t + t\cos t}{-\sin t}. \]

6.2 Derivatives of Trigonometric Functions

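Recall:
\[ \lim_{h\to 0} \frac{\sin h}{h} = 1 \quad\text{and}\quad \lim_{h\to 0} \frac{1 - \cos h}{h} = 0. \]
Then
\begin{align*}
(\sin x)' &= \lim_{h\to 0} \frac{\sin(x + h) - \sin x}{h} \\
&= \lim_{h\to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
&= \lim_{h\to 0} \frac{\sin x(\cos h - 1) + \cos x \sin h}{h} \\
&= \lim_{h\to 0} \left[-\sin x \left(\frac{1 - \cos h}{h}\right) + \cos x \left(\frac{\sin h}{h}\right)\right] \\
&= -\sin x(0) + \cos x(1).
\end{align*}

Hence $(\sin x)' = \cos x$.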

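Example: Find $\dfrac{dy}{dx}$ if $y = x^3 \sin x$.

Solution: $\dfrac{dy}{dx} = (x^3 \sin x)' = x^3(\sin x)' + \sin x\,(x^3)' = x^3 \cos x + 3x^2 \sin x$.

Example: Determine whether the function
\[ f(x) = \begin{cases} x^3 \sin\left(\dfrac{1}{x}\right), & \text{if } x \ne 0 \\ 0, & \text{if } x = 0, \end{cases} \]
is differentiable at $x = 0$.

Solution: Observe that
\[ \lim_{h\to 0} \frac{h^3 \sin\frac{1}{h} - 0}{h} = \lim_{h\to 0} h^2 \sin\frac{1}{h} = 0, \]
since $\left|h^2 \sin\frac{1}{h}\right| \le h^2$. Hence $f'(0) = 0$ exists and $f$ is differentiable at $x = 0$.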

Logarithmic Functions. Assume $a > 0$ and $a \ne 1$. If $a^y = x$, then define $y = \log_a x$. Let $\ln x \equiv \log_e x$ ($\ln x$ is called the natural logarithm of $x$).

Basic Properties of Logarithms

1. $\log_a 1 = 0$ (In particular, $\ln 1 = 0$).

2. $\log_a a = 1$ (In particular, $\ln e = 1$).

3. $\log_a uv = \log_a u + \log_a v$.

4. $\log_a \dfrac{u}{v} = \log_a u - \log_a v$.

5. $\log_a u^r = r \log_a u$.

Derivatives of $\ln x$ and $e^x$ are $\dfrac{d}{dx} e^x = e^x$ and $\dfrac{d}{dx} \ln x = \dfrac{1}{x}$. Also $\dfrac{d}{dx}(a^x) = a^x \ln a$, $a > 0$.
Example: Calculate the derivative $\dfrac{d}{dx}[(\sin x + x) \cdot (x^3 - \ln x)]$.

Solution: We know that $\dfrac{d}{dx}\sin x = \cos x$, $\dfrac{d}{dx}x = 1$, $\dfrac{d}{dx}x^3 = 3x^2$ and $\dfrac{d}{dx}\ln x = \dfrac{1}{x}$. Therefore, by the addition rule,
\[ \frac{d}{dx}(\sin x + x) = \frac{d}{dx}\sin x + \frac{d}{dx}x = \cos x + 1 \]
and
\[ \frac{d}{dx}(x^3 - \ln x) = \frac{d}{dx}x^3 - \frac{d}{dx}\ln x = 3x^2 - \frac{1}{x}. \]
Now we may conclude the calculation by applying the product rule:
\begin{align*}
\frac{d}{dx}[(\sin x + x) \cdot (x^3 - \ln x)] &= \frac{d}{dx}(\sin x + x) \cdot (x^3 - \ln x) + (\sin x + x) \cdot \frac{d}{dx}(x^3 - \ln x) \\
&= (\cos x + 1) \cdot (x^3 - \ln x) + (\sin x + x) \cdot \left(3x^2 - \frac{1}{x}\right) \\
&= 4x^3 - 1 + x^3 \cos x + 3x^2 \sin x - \frac{1}{x}\sin x - \ln x \cos x - \ln x.
\end{align*}

You Try It: Calculate the derivative
\[ \frac{d}{dx}\left[\sin x \cdot \left(\cos x - \frac{x}{e^x + \ln x}\right)\right]. \]

6.3 Derivative of a Composition [The Chain Rule]

We calculate the derivative of a composition by
\[ [f \circ g(x)]' = f'(g(x)) \cdot g'(x). \]

If $y = f(u)$ where $u = g(x)$, then
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = f'(u)\frac{du}{dx} = f'(g(x))g'(x). \]
Similarly, if $y = f(u)$ where $u = g(v)$ and $v = h(x)$, then
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dx}. \]

Example: Calculate the derivative $\dfrac{d}{dx}(\sin(x^3 - x^2))$.

Solution: This is the composition of functions, so we must apply the Chain Rule. It is essential to recognize what function will play the role of $f$ and what function will play the role of $g$. Notice that, if $x$ is the variable, then $x^3 - x^2$ is applied first and $\sin$ applied next. So it must be that $g(x) = x^3 - x^2$ and $f(s) = \sin s$. Notice that $\dfrac{d}{ds}f(s) = \cos s$ and $\dfrac{d}{dx}g(x) = 3x^2 - 2x$. Then
\[ \sin(x^3 - x^2) = f \circ g(x) \]
and
\begin{align*}
\frac{d}{dx}(\sin(x^3 - x^2)) &= \frac{d}{dx}(f \circ g(x)) \\
&= \left[\frac{df}{ds}(g(x))\right] \cdot \frac{d}{dx}g(x) \\
&= \cos(g(x)) \cdot (3x^2 - 2x) \\
&= [\cos(x^3 - x^2)] \cdot (3x^2 - 2x).
\end{align*}

Example: Calculate the derivative $\dfrac{d}{dx}\ln\left(\dfrac{x^2}{x - 2}\right)$.

Solution: Let $h(x) = \ln\left(\dfrac{x^2}{x - 2}\right)$. Then $h = f \circ g$, where $f(s) = \ln s$ and $g(x) = \dfrac{x^2}{x - 2}$. So $\dfrac{d}{ds}f(s) = \dfrac{1}{s}$ and $\dfrac{d}{dx}g(x) = \dfrac{(x - 2) \cdot 2x - x^2 \cdot 1}{(x - 2)^2} = \dfrac{x^2 - 4x}{(x - 2)^2}$. As a result,
\begin{align*}
\frac{d}{dx}h(x) &= \frac{d}{dx}(f \circ g) \\
&= \left[\frac{df}{ds}(g(x))\right] \cdot \frac{d}{dx}g(x) \\
&= \frac{1}{g(x)} \cdot \frac{x^2 - 4x}{(x - 2)^2} \\
&= \frac{1}{\frac{x^2}{x - 2}} \cdot \frac{x^2 - 4x}{(x - 2)^2} \\
&= \frac{x - 4}{x(x - 2)}.
\end{align*}
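
A Chain Rule result can be compared against a central-difference estimate as a quick check. The sketch below does this for the first example, $\sin(x^3 - x^2)$; the test point and step size are arbitrary choices made for illustration.

```python
import math


def composite(x):
    return math.sin(x**3 - x**2)


def chain_rule_derivative(x):
    """f'(g(x)) * g'(x) with f = sin and g(x) = x^3 - x^2."""
    return math.cos(x**3 - x**2) * (3 * x**2 - 2 * x)


def central_difference(func, x, step=1e-6):
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 1.3
print(chain_rule_derivative(x0), central_difference(composite, x0))
```

The two printed numbers agree to several decimal places, which is what one expects if the Chain Rule has been applied correctly.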

You Try It: Calculate the derivative of $\tan(e^x - x)$.

6.4 Continuity and Differentiation

What is the relationship between continuity and differentiation? It appears that functions that
have derivatives must be continuous.
Theorem 6.4.1. If a function $f$ is differentiable at a point $x$, then it is continuous at $x$.

Proof. We want to show that $f$ is continuous at $x$, i.e., $\lim_{t\to x} f(t) = f(x)$ or $\lim_{h\to 0} f(x + h) = f(x)$, where $h = t - x$. It will be sufficient to show that $\lim_{h\to 0} [f(x + h) - f(x)] = 0$.

Now,
\begin{align*}
\lim_{h\to 0} [f(x + h) - f(x)] &= \lim_{h\to 0} \left[\frac{f(x + h) - f(x)}{h} \cdot h\right] \\
&= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} \cdot \lim_{h\to 0} h \\
&= f'(x) \cdot 0 \\
&= 0,
\end{align*}
because $f'(x)$ is finite. Thus $f$ is continuous at $x$.

Converse is false: For example, the function $f(x) = |x|$ is continuous at $x = 0$, but it is not differentiable there.

6.5 Higher Order Derivatives

If $f(x)$ is differentiable in an interval, its derivative is given by $f'(x)$, $y'$ or $\dfrac{dy}{dx}$, where $y = f(x)$.

If $f'(x)$ is also differentiable in the interval, its derivative is denoted by $f''(x)$, $y''$ or $\dfrac{d}{dx}\left(\dfrac{dy}{dx}\right) = \dfrac{d^2y}{dx^2}$.

Similarly, the $n$th derivative of $f(x)$, if it exists, is denoted by $f^{(n)}$, $y^{(n)}$ or $\dfrac{d^n y}{dx^n}$, where $n$ is called the order of the derivative.

Example: Let $y = f(x) = \frac{1}{2}x^4 - 3x^2 + 1$.

Solution: Derivative $y' = f'(x) = \dfrac{d}{dx}\left(\frac{1}{2}x^4 - 3x^2 + 1\right) = 2x^3 - 6x$.

Second derivative $y'' = f''(x) = \dfrac{d^2y}{dx^2} = \dfrac{d}{dx}(2x^3 - 6x) = 6x^2 - 6$.

Third derivative $y''' = f'''(x) = \dfrac{d^3y}{dx^3} = \dfrac{d}{dx}(6x^2 - 6) = 12x$.

Fourth derivative $y^{(4)} = f^{(4)}(x) = \dfrac{d^4y}{dx^4} = \dfrac{d}{dx}(12x) = 12$.

6.6 Implicit Differentiation

Compare

1. $x^2 - y^3 = 3 \iff y = \sqrt[3]{x^2 - 3}$.

2. $x^2 + y^2 = 1 \iff y = \pm\sqrt{1 - x^2}$.

3. $x^3 + y^2 = 3xy \iff$ ????????.

Implicit Functions. A function in which the dependent variable is expressed solely in terms of the independent variable $x$, namely $y = f(x)$, is said to be an explicit function, for example, $y = \frac{1}{2}x^3 - 1$. An equation $f(x, y) = 0$, on perhaps certain restricted ranges of the variables, is said to define $y$ implicitly as a function of $x$.

Example: (a) The equation $xy + x - 2y - 1 = 0$, with $x \ne 2$, defines the function $y = \dfrac{1 - x}{x - 2}$.
(b) The equation $4x^2 + 9y^2 - 36 = 0$ defines the function $y = \frac{2}{3}\sqrt{9 - x^2}$ when $|x| \le 3$ and $y \ge 0$, and the function $y = -\frac{2}{3}\sqrt{9 - x^2}$ when $|x| \le 3$ and $y \le 0$.

The derivative y 0 may be obtained by one of the following procedures:

1. Solve, when possible, for y and differentiate with respect to x.

2. Thinking of y as a function of x, differentiate both sides of the given equation with respect
to x and solve the resulting relation for y 0 . This differentiation process is known as implicit
differentiation.

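Example: Find $\dfrac{dy}{dx}$ if $x^2 + y^2 = 4$.

Solution: We differentiate both sides of the equation:
\begin{align*}
\frac{d}{dx}x^2 + \frac{d}{dx}y^2 &= \frac{d}{dx}4 \\
2x + 2y\frac{dy}{dx} &= 0.
\end{align*}
Solving for the derivative yields
\[ \frac{dy}{dx} = -\frac{x}{y}. \]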

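Example: Find $\dfrac{d^2y}{dx^2}$ if $x^2 + y^2 = 4$.

Solution: From the above example, we already know that the first derivative is
\[ \frac{dy}{dx} = -\frac{x}{y}. \]
Hence by the Quotient Rule
\begin{align*}
\frac{d^2y}{dx^2} &= \frac{d}{dx}\left(-\frac{x}{y}\right) \\
&= -\frac{y \cdot 1 - x \cdot \frac{dy}{dx}}{y^2} \\
&= -\frac{y - x\left(-\frac{x}{y}\right)}{y^2} \quad \text{(substituting for } \tfrac{dy}{dx}\text{)} \\
&= -\frac{y^2 + x^2}{y^3}.
\end{align*}

Noting that $x^2 + y^2 = 4$ permits us to write the second derivative as
\[ \frac{d^2y}{dx^2} = -\frac{4}{y^3}. \]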
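
For the circle $x^2 + y^2 = 4$ the implicit results can be compared with the explicit upper branch $y = \sqrt{4 - x^2}$. The following sketch checks $\frac{dy}{dx} = -\frac{x}{y}$ and $\frac{d^2y}{dx^2} = -\frac{4}{y^3}$ at $x = 1$ using finite differences; the point and step size are illustrative assumptions.

```python
import math


def y_upper(x):
    """Explicit upper branch of the circle x^2 + y^2 = 4."""
    return math.sqrt(4 - x * x)


x0 = 1.0
y0 = y_upper(x0)
h = 1e-4

# finite-difference estimates on the explicit branch
first_numeric = (y_upper(x0 + h) - y_upper(x0 - h)) / (2 * h)
second_numeric = (y_upper(x0 + h) - 2 * y0 + y_upper(x0 - h)) / h**2

# formulas obtained by implicit differentiation
first_formula = -x0 / y0
second_formula = -4 / y0**3

print(first_numeric, first_formula)
print(second_numeric, second_formula)
```

Both pairs agree closely, which supports the algebra in the two examples above.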

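Example: Find $\dfrac{dy}{dx}$ if $\sin y = y \cos 2x$.

Solution:
\begin{align*}
\frac{d}{dx}\sin y &= \frac{d}{dx}(y \cos 2x) \\
\cos y \frac{dy}{dx} &= y(-\sin 2x \cdot 2) + \cos 2x \frac{dy}{dx} \\
(\cos y - \cos 2x)\frac{dy}{dx} &= -2y \sin 2x \\
\frac{dy}{dx} &= -\frac{2y \sin 2x}{\cos y - \cos 2x}.
\end{align*}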

Example: Find $y'$, given $xy + x - 2y - 1 = 0$.

Solution: We have
\[ x\frac{d}{dx}(y) + y\frac{d}{dx}(x) + \frac{d}{dx}(x) - 2\frac{d}{dx}(y) - \frac{d}{dx}(1) = \frac{d}{dx}(0) \]
or $xy' + y + 1 - 2y' = 0$, then $y' = \dfrac{1 + y}{2 - x}$.

Example: Find $y'$, given $x^2 y - xy^2 + x^2 + y^2 = 0$.

Solution:
\begin{align*}
\frac{d}{dx}(x^2 y) - \frac{d}{dx}(xy^2) + \frac{d}{dx}(x^2) + \frac{d}{dx}(y^2) &= 0 \\
x^2\frac{d}{dx}(y) + y\frac{d}{dx}(x^2) - x\frac{d}{dx}(y^2) - y^2\frac{d}{dx}(x) + \frac{d}{dx}(x^2) + \frac{d}{dx}(y^2) &= 0.
\end{align*}
Hence, $x^2 y' + 2xy - 2xyy' - y^2 + 2x + 2yy' = 0$ and $y' = \dfrac{y^2 - 2x - 2xy}{x^2 + 2y - 2xy}$.

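Example: Find $y'$ and $y''$, given $x^2 - xy + y^2 = 3$.

Solution: Differentiating both sides with respect to $x$,
\[ \frac{d}{dx}(x^2) - \frac{d}{dx}(xy) + \frac{d}{dx}(y^2) = 2x - xy' - y + 2yy' = 0. \]
So $y' = \dfrac{2x - y}{x - 2y}$. Then
\begin{align*}
y'' &= \frac{(x - 2y)\frac{d}{dx}(2x - y) - (2x - y)\frac{d}{dx}(x - 2y)}{(x - 2y)^2} = \frac{(x - 2y)(2 - y') - (2x - y)(1 - 2y')}{(x - 2y)^2} \\
&= \frac{3xy' - 3y}{(x - 2y)^2} = \frac{3x\left(\frac{2x - y}{x - 2y}\right) - 3y}{(x - 2y)^2} = \frac{6(x^2 - xy + y^2)}{(x - 2y)^3} \\
&= \frac{18}{(x - 2y)^3}.
\end{align*}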

You Try It: Find $y''$, given $x^3 - 3xy + y^3 = 1$.

6.7 Logarithmic Differentiation

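Take the natural logarithm ($\ln$) of both sides, differentiate implicitly and solve for $y'$.

Example: Compute $y'$ if $y = \dfrac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}$.

Solution:
\[ y = \frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4} \Rightarrow \ln y = \ln\left(\frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}\right). \]
\begin{align*}
\ln y &= 2\ln x + \frac{1}{3}\ln(7x - 14) - 4\ln(1 + x^2) \\
\frac{1}{y}y' &= 2\left(\frac{1}{x}\right) + \frac{1}{3}\left(\frac{(7x - 14)'}{7x - 14}\right) - 4\left(\frac{(1 + x^2)'}{1 + x^2}\right) \\
&= \frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2} \\
y' &= y\left(\frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2}\right) \\
&= \frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}\left(\frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2}\right).
\end{align*}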

6.8 Derivatives of Inverse Trigonometric Functions

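If $x = \sin y$, the inverse function is written $y = \sin^{-1} x$ or $y = \arcsin x$. The inverse trigonometric functions are multivalued functions.

Example: Find the derivative of $y = \sin^{-1} x$.

Solution: Write $\sin y = x$ and differentiate implicitly with respect to $x$. Taking the principal value $-\frac{\pi}{2} < y < \frac{\pi}{2}$, so that $\cos y > 0$,
\begin{align*}
(\sin y)' &= x' \\
\cos y \cdot y' &= 1 \\
y' &= \frac{1}{\cos y} \\
&= \frac{1}{\sqrt{1 - \sin^2 y}} \\
&= \frac{1}{\sqrt{1 - x^2}}.
\end{align*}

Some Derivatives. $\dfrac{d}{dx}(\cos^{-1} x) = -\dfrac{1}{\sqrt{1 - x^2}}$, $\dfrac{d}{dx}(\cot^{-1} x) = -\dfrac{1}{1 + x^2}$, $\dfrac{d}{dx}(\sec^{-1} x) = \dfrac{1}{x\sqrt{x^2 - 1}}$.

You Try It: Show that $\dfrac{d}{dx}(\tan^{-1} x) = \dfrac{1}{1 + x^2}$.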

Applications of the Derivative

6.9 The Mean Value Theorem

Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Then, there exists a $c$ in $(a, b)$ at which the tangent line is parallel to the secant line joining the points $(a, f(a))$ and $(b, f(b))$, i.e., at which $f'(c) = \dfrac{f(b) - f(a)}{b - a}$,

OR

If $f(x)$ is continuous in $[a, b]$ and differentiable in $(a, b)$, then there exists a point $c$ in $(a, b)$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}, \quad a < c < b. \]

The word mean in The Mean Value Theorem refers to the mean (or average) rate of change of $f$ in the interval $[a, b]$.

If $f(a) = f(b) = 0$, then the theorem says that there exists a $c$ in $(a, b)$ at which $f'(c) = 0$. The graphs suggest that there must be at least one point on the graph, that corresponds to a number $c$ in $(a, b)$, at which the tangent is horizontal. This special case of the Mean Value Theorem is called Rolle’s Theorem$^1$.


Example: Consider $f(x) = \sqrt{x - 1}$ on $[2, 5]$. $f(x)$ is continuous when $x - 1 \ge 0$, i.e., $x \ge 1$. In particular, $f(x)$ is continuous on $[2, 5]$, and $f'(x) = \dfrac{1}{2\sqrt{x - 1}}$, so $f$ is differentiable when $x > 1$. In particular, $f(x)$ is differentiable on $(2, 5)$.
\[ \frac{f(b) - f(a)}{b - a} = \frac{f(5) - f(2)}{5 - 2} = \frac{\sqrt{5 - 1} - \sqrt{2 - 1}}{3} = \frac{1}{3}. \]
The Mean Value Theorem asserts that, for some $c$ in $(2, 5)$, $f'(c) = \dfrac{1}{3}$. Let us find it.
\begin{align*}
f'(x) &= \frac{1}{3} \\
\frac{1}{2\sqrt{x - 1}} &= \frac{1}{3} \\
2\sqrt{x - 1} &= 3 \\
4(x - 1) &= 9 \\
x - 1 &= \frac{9}{4} \\
x &= \frac{13}{4}.
\end{align*}
Notice that $\dfrac{13}{4}$ is in $(2, 5)$, so we may take $c = \dfrac{13}{4}$.
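
When the equation $f'(c) = \frac{f(b) - f(a)}{b - a}$ cannot be solved by hand, the point $c$ can be located numerically. The sketch below uses bisection on the example above, where the answer $c = \frac{13}{4}$ is already known; bisection is a technique chosen here for illustration, and it relies on $f'(x) - \text{slope}$ changing sign on $[2, 5]$.

```python
import math


def f(x):
    return math.sqrt(x - 1)


def f_prime(x):
    return 1 / (2 * math.sqrt(x - 1))


a, b = 2.0, 5.0
slope = (f(b) - f(a)) / (b - a)  # mean rate of change on [a, b]


def g(x):
    """Zero of g is a point c with f'(c) equal to the secant slope."""
    return f_prime(x) - slope


lo, hi = a, b  # g(lo) > 0 and g(hi) < 0 for this example
for _ in range(60):
    mid = (lo + hi) / 2
    if g(lo) * g(mid) <= 0:
        hi = mid
    else:
        lo = mid
c = (lo + hi) / 2

print(slope, c)
```

The computed $c$ matches the value found algebraically.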

Example: Show that if $f(x) = \tan x$ on the interval $0 \le x \le k$ where $0 < k < \dfrac{\pi}{2}$, then $\tan k \ge k$.

$^1$ Michel Rolle, a French mathematician (1652-1719)

Solution: By the Mean Value Theorem
\[ \frac{\tan k - \tan 0}{k - 0} = \sec^2 c, \]
for some $c \in (0, k)$. But $\sec^2 c \ge 1$ and $\tan 0 = 0$. So
\[ \frac{\tan k}{k} \ge 1 \implies \tan k \ge k. \]

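Example: Use The Mean Value Theorem to show that $|\cos a - \cos b| \le |a - b|$.

Solution: If $a = b$ the inequality is trivial, so assume $a \ne b$. The function $\cos x$ is continuous and differentiable for all $x$. By the Mean Value Theorem, there is a $c$ between $a$ and $b$ such that
\[ (\cos x)'\big|_{x = c} = \frac{\cos a - \cos b}{a - b}, \quad\text{so}\quad \left|(\cos x)'\big|_{x = c}\right| = \left|\frac{\cos a - \cos b}{a - b}\right|, \]
but $(\cos x)' = -\sin x$ and $|(\cos x)'| \le 1$, therefore
\[ \frac{|\cos a - \cos b|}{|a - b|} \le 1 \implies |\cos a - \cos b| \le |a - b|. \]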

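Example: Prove that $\dfrac{b - a}{1 + b^2} < \tan^{-1} b - \tan^{-1} a < \dfrac{b - a}{1 + a^2}$ for $0 \le a < b$.

Solution: Let $f(x) = \tan^{-1} x$. Since $f'(x) = \dfrac{1}{1 + x^2}$, $f'(c) = \dfrac{1}{1 + c^2}$. By the Mean Value Theorem
\[ \frac{f(b) - f(a)}{b - a} = \frac{\tan^{-1} b - \tan^{-1} a}{b - a} = \frac{1}{1 + c^2}, \quad a < c < b. \]
Then, from $0 \le a < c < b$, we have
\begin{align*}
a^2 < c^2 < b^2 &\implies 1 + a^2 < 1 + c^2 < 1 + b^2 \\
\frac{1}{1 + a^2} > \frac{1}{1 + c^2} > \frac{1}{1 + b^2} &\implies \frac{1}{1 + b^2} < \frac{1}{1 + c^2} < \frac{1}{1 + a^2} \\
\frac{1}{1 + b^2} < \frac{\tan^{-1} b - \tan^{-1} a}{b - a} < \frac{1}{1 + a^2} &\implies \frac{b - a}{1 + b^2} < \tan^{-1} b - \tan^{-1} a < \frac{b - a}{1 + a^2}.
\end{align*}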

 
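Example: Use the Mean Value Theorem to prove that if $0 < a < b$, then $1 - \dfrac{a}{b} < \ln\left(\dfrac{b}{a}\right) < \dfrac{b}{a} - 1$. Hence show that $\dfrac{1}{6} < \ln 1.2 < \dfrac{1}{5}$.

Solution: Let $f(x) = \ln x$, so that $f'(x) = \dfrac{1}{x}$. By the Mean Value Theorem, there exists $c \in (a, b)$ such that
\[ f'(c) = \frac{1}{c} = \frac{\ln b - \ln a}{b - a}. \]
Then, from $a < c < b$ we have
\begin{align*}
a < c < b &\implies \frac{1}{b} < \frac{1}{c} < \frac{1}{a} \\
\frac{1}{b} < \frac{\ln b - \ln a}{b - a} < \frac{1}{a} &\implies \frac{b - a}{b} < \ln b - \ln a < \frac{b - a}{a} \\
&\implies 1 - \frac{a}{b} < \ln\left(\frac{b}{a}\right) < \frac{b}{a} - 1.
\end{align*}

Now, $\ln(1.2) = \ln\left(\dfrac{12}{10}\right) = \ln\left(\dfrac{6}{5}\right)$. Therefore $a = 5$ and $b = 6$. Substituting in $1 - \dfrac{a}{b} < \ln\left(\dfrac{b}{a}\right) < \dfrac{b}{a} - 1$, we have
\[ 1 - \frac{5}{6} < \ln\left(\frac{6}{5}\right) < \frac{6}{5} - 1 \implies \frac{1}{6} < \ln 1.2 < \frac{1}{5}. \]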
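
The bounds just derived are easy to confirm numerically. This is a small illustrative sketch (the function name is hypothetical) that evaluates both sides of the inequality for $a = 5$, $b = 6$.

```python
import math


def mvt_log_bounds(a, b):
    """Return (lower, value, upper) for 1 - a/b < ln(b/a) < b/a - 1, assuming 0 < a < b."""
    return 1 - a / b, math.log(b / a), b / a - 1


lower, value, upper = mvt_log_bounds(5, 6)
print(lower, value, upper)
```

The middle value, $\ln 1.2$, lies strictly between the two bounds $\frac{1}{6}$ and $\frac{1}{5}$.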

6.10 Some Corollaries of The Mean Value Theorem

Corollary 6.10.1. If $f'(x) = 0$ at all points of the interval $(a, b)$, then $f(x)$ must be a constant in the interval.

Proof. Let $x_1 < x_2$ be any two different points in $(a, b)$. By the Mean Value Theorem, for some $c$ with $x_1 < c < x_2$,
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) = 0. \]
Thus $f(x_1) = f(x_2)$. Since $x_1$ and $x_2$ are arbitrarily chosen, the function $f(x)$ has the same value at all points in the interval. Thus, $f(x)$ is constant.

Corollary 6.10.2. If $f'(x) > 0$ at all points of the interval $(a, b)$, then $f(x)$ is strictly increasing.

Proof. Let $x_1 < x_2$ be any two different points in $(a, b)$. By the Mean Value Theorem, for some $c$ with $x_1 < c < x_2$,
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) > 0. \]
Thus $f(x_2) > f(x_1)$ for $x_2 > x_1$ and so $f(x)$ is strictly increasing.

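6.11 Indeterminate Forms

An indeterminate form occurs when $\lim_{x \to a} \dfrac{f(x)}{g(x)}$ tends to $\dfrac{0}{0}$ or $\dfrac{\infty}{\infty}$ as $x \to a$. Think of the situation $\lim_{x \to a} \dfrac{f(x)}{g(x)} \to \dfrac{0}{0}$ where $f(x)$ and $g(x)$ are differentiable (and therefore continuous, so $f(a) = \lim_{x \to a} f(x) = 0$ and $g(a) = \lim_{x \to a} g(x) = 0$). Then
\begin{align*}
\lim_{x \to a} \frac{f(x)}{g(x)} &= \lim_{x \to a} \frac{f(x) - f(a)}{g(x) - g(a)} \\
&= \lim_{x \to a} \frac{\dfrac{f(x) - f(a)}{x - a}}{\dfrac{g(x) - g(a)}{x - a}} \quad \text{(provided the denominator is not zero)} \\
&= \frac{\lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}}{\lim_{x \to a} \dfrac{g(x) - g(a)}{x - a}} \\
&= \frac{f'(a)}{g'(a)} = \frac{\lim_{x \to a} f'(x)}{\lim_{x \to a} g'(x)} \quad \text{(provided $f'(x)$ and $g'(x)$ are also continuous).}
\end{align*}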

Example:
\[ \lim_{x \to 2} \frac{x^2 - 4}{x - 2} = \lim_{x \to 2} \frac{(x^2 - 4)'}{(x - 2)'} = \lim_{x \to 2} \frac{2x}{1} = 2 \cdot 2 = 4. \]

Theorem 6.11.1 (L'Hôpital's Rule). If $\lim \dfrac{f(x)}{g(x)}$ is of the type $\dfrac{0}{0}$ or $\dfrac{\infty}{\infty}$, then
\[ \lim \frac{f(x)}{g(x)} = \lim \frac{f'(x)}{g'(x)}, \]
provided the limit on the right exists or is infinite.

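Examples: (a) $\lim_{x \to 0} \dfrac{\sin 2x}{\sin 5x} = \lim_{x \to 0} \dfrac{2 \cos 2x}{5 \cos 5x} = \dfrac{2 \cdot 1}{5 \cdot 1} = \dfrac{2}{5}$.

(b) $\lim_{x \to \infty} \dfrac{e^{3x}}{x} = \lim_{x \to \infty} \dfrac{3e^{3x}}{1} = \infty$.

(c) $\lim_{x \to 0} \dfrac{e^x - 1}{x^3} = \lim_{x \to 0} \dfrac{e^x}{3x^2} = \infty$.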

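The form $\infty - \infty$. A given limit that is not immediately $\dfrac{0}{0}$ or $\dfrac{\infty}{\infty}$ can be converted to one of these forms by a combination of algebra and a little cleverness.

Example: Evaluate $\lim_{x \to 0} \left( \dfrac{1 + 3x}{\sin x} - \dfrac{1}{x} \right)$.

Solution: We note $\dfrac{1 + 3x}{\sin x} \to \infty$ and $\dfrac{1}{x} \to \infty$. However, after writing the difference as a single fraction, we recognize the form $\dfrac{0}{0}$.
\begin{align*}
\lim_{x \to 0} \left( \frac{1 + 3x}{\sin x} - \frac{1}{x} \right) &= \lim_{x \to 0} \frac{3x^2 + x - \sin x}{x \sin x} \\
&= \lim_{x \to 0} \frac{6x + 1 - \cos x}{x \cos x + \sin x} \\
&= \lim_{x \to 0} \frac{6 + \sin x}{-x \sin x + 2 \cos x} \\
&= \frac{6 + 0}{0 + 2} = 3.
\end{align*}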
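
As a rough numerical cross-check of the value 3, the expression can be evaluated at points close to 0. This is only a sketch: the function name `g` is chosen for illustration, and floating-point evaluation suggests the limit but does not prove it.

```python
import math

def g(x):
    """The expression (1 + 3x)/sin x - 1/x from the example, for x != 0."""
    return (1 + 3 * x) / math.sin(x) - 1 / x

# Approach 0 from both sides; the values should settle near 3.
for x in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(x, g(x))
```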

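The form $0 \cdot \infty$. By suitable manipulation, L'Hôpital's Rule can sometimes be applied to the limit form $0 \cdot \infty$.

Example: Evaluate $\lim_{x \to \infty} x \sin\left(\dfrac{1}{x}\right)$.

Solution: Write the given expression as
\[ \lim_{x \to \infty} \frac{\sin\left(\frac{1}{x}\right)}{\frac{1}{x}} \]
and recognize that we have the form $\dfrac{0}{0}$. Hence,
\begin{align*}
\lim_{x \to \infty} x \sin\left(\frac{1}{x}\right) &= \lim_{x \to \infty} \frac{-x^{-2} \cos\left(\frac{1}{x}\right)}{-x^{-2}} \\
&= \lim_{x \to \infty} \cos\left(\frac{1}{x}\right) = 1.
\end{align*}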

The forms $0^0$, $\infty^0$, $1^\infty$. Suppose $y = f(x)^{g(x)}$ tends towards $0^0$, $\infty^0$ or $1^\infty$ as $x \to a$ or $x \to \infty$. By taking the natural logarithm of $y$,
\[ \ln y = \ln f(x)^{g(x)} = g(x) \ln f(x), \]
we see that
\[ \lim_{x \to a} \ln y = \lim_{x \to a} g(x) \ln f(x) \]
is of the form $0 \cdot \infty$. If it is assumed that $\lim_{x \to a} \ln y = \ln\left(\lim_{x \to a} y\right) = L$, then $\lim_{x \to a} y = e^L$, or
\[ \lim_{x \to a} f(x)^{g(x)} = e^L. \]

Example: Evaluate $\lim_{x \to 0^+} x^{\frac{1}{\ln x}}$.

Solution: The form is $0^0$. Now, if we set $y = x^{\frac{1}{\ln x}}$, then
\[ \ln y = \frac{1}{\ln x} \ln x = 1. \]
Notice we do not need L'Hôpital's Rule in this case since
\[ \lim_{x \to 0^+} \ln y = 1. \]
Hence, $\lim_{x \to 0^+} y = e^1$ or equivalently $\lim_{x \to 0^+} x^{\frac{1}{\ln x}} = e$.

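Example: Evaluate $\lim_{x \to 0} (1 + x)^{\frac{1}{x}}$.

Solution: The limit is of the form $1^\infty$. If $y = (1 + x)^{\frac{1}{x}}$, then
\[ \ln y = \frac{1}{x} \ln(1 + x). \]
Now, $\lim_{x \to 0} \dfrac{\ln(1 + x)}{x}$ has the form $\dfrac{0}{0}$ and so
\begin{align*}
\lim_{x \to 0} \frac{\ln(1 + x)}{x} &= \lim_{x \to 0} \frac{\frac{1}{1 + x}}{1} \\
&= \lim_{x \to 0} \frac{1}{1 + x} = 1.
\end{align*}
Thus,
\[ \lim_{x \to 0} (1 + x)^{\frac{1}{x}} = e. \]

Example: Evaluate $\lim_{x \to \infty} \left(1 - \dfrac{3}{x}\right)^{2x}$.

Solution: The limit form is $1^\infty$. If $y = \left(1 - \dfrac{3}{x}\right)^{2x}$ then $\ln y = 2x \ln\left(1 - \dfrac{3}{x}\right)$. Observe that the form of $\lim_{x \to \infty} 2x \ln\left(1 - \dfrac{3}{x}\right)$ is $\infty \cdot 0$, whereas the form of $\lim_{x \to \infty} \dfrac{2 \ln\left(1 - \frac{3}{x}\right)}{\frac{1}{x}}$ is $\dfrac{0}{0}$. Therefore,
\begin{align*}
\lim_{x \to \infty} \frac{2 \ln\left(1 - \frac{3}{x}\right)}{\frac{1}{x}} &= \lim_{x \to \infty} \frac{2 \cdot \dfrac{3/x^2}{1 - \frac{3}{x}}}{-\dfrac{1}{x^2}} \\
&= \lim_{x \to \infty} \frac{-6}{1 - \frac{3}{x}} = -6.
\end{align*}
Finally, we conclude that
\[ \lim_{x \to \infty} \left(1 - \frac{3}{x}\right)^{2x} = e^{-6}. \]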
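
The value $e^{-6}$ can be compared with direct evaluation at large $x$. The sketch below is illustrative only (the function name `y` mirrors the notation of the solution); agreement in floating point supports the result but is not a proof.

```python
import math

def y(x):
    """The expression (1 - 3/x)^(2x) from the example, for x > 3."""
    return (1 - 3 / x) ** (2 * x)

target = math.exp(-6)
for x in (1e2, 1e4, 1e6):
    print(x, y(x), target)   # y(x) should move toward exp(-6) as x grows
```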

6.12 Tutorial 4

1. Let $f(x) = \dfrac{3 + x}{3 - x}$, $x \ne 3$. Evaluate $f'(2)$ from the definition.

2. Show from definition that $\dfrac{d}{dx}\left(\sqrt{x - 1}\right) = \dfrac{1}{2\sqrt{x - 1}}$.
3. Determine whether each of the following functions is differentiable at $x = 0$.
(i) $f(x) = \begin{cases} x^3 \sin\dfrac{1}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases}$ (ii) $f(x) = x|x|$ (iii) $f(x) = \begin{cases} \dfrac{x - |x|}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases}$

4. Show that f (x) = cos x is differentiable at any x ∈ R and that f 0 (x) = − sin x.
5. Find $\dfrac{dy}{dx}$ for each of the following functions
(i) $y = \sin^3(6x)$ (ii) $y = (ax + b)^m (cx + d)^n$ (iii) $y = \cos(x^2 - 3x + 1)$

6. Use logarithmic differentiation to find the derivatives of each of the following functions


(x + 1)(x + 2)
(i) y = xx , x > 0 (ii) y = xln x , x > 0 (iii) y = 3 2
(x + 1)(x2 + 2)
√ √ √ √
(iii) y = √x + 1 x + 2 4 x + 3 (iv) y(x) = y1 (x)y2 (x) · · · yn (x) (v) y = x x , x > 0
3
x4 + 6x2 (8x + 3)5 x
(vi) y = 2 (vii) y = (x2 + 1)(x+1) , x > −1 (viii) y = xx 2x
2
(2x + 7) 3
x ln x (x2 + 1)x
(ix) y = xx (x) y = xcos x (xi) y = x(x+1) (xii) y = (ln x)x (xiii) y = .
x2
7. Use implicit differentiation to find $\dfrac{dy}{dx}$.
(i) $(y - 1)^2 = 4(x + 2)$ (ii) $(x^2 + y^2)^6 = x^3 - y^3$ (iii) $y^4 - y^2 = 10x - 3$ (iv) $y = \sin xy$

8. Find $\dfrac{dy}{dx}$ if $xy^2 + 4y^3 + 3x = 0$ at $(1, -1)$.

9. Find $\dfrac{d^2 y}{dx^2}$.
(i) $4y^3 = 6x^2 + 1$ (ii) $xy^4 = 5$ (iii) $x = 2 + t$, $y = 1 + t^2$

10. If $y = \tan x$, prove that $y''' = 2(1 + y^2)(1 + 3y^2)$.

11. Given $y^2 \sin x + y = \tan^{-1} x$, find $y'$.

12. Find the first derivatives of the following functions

(i) $y = \tan^{-1}(3x^2)$.

(ii) $y = x \csc^{-1}\left(\dfrac{1}{x}\right) + \sqrt{1 - x^2}$.

13. Evaluate each of the following limits using L’hopital’s rule.
sin x sin x ln x 6x2 − 5x + 7
(i) lim (ii) lim x −x
(iii) lim x
(iv) lim
x→0 x  e −e
x→0  x→∞ ex x→∞ 4x2 − 2x
1 + 3x 1 3x 1
(ix) lim − (xiii) lim (xiv) lim+ xx (xv) lim x(e x − 1) (xvi)
 x→0 sin x x x→∞ 3x + 1 x→0 x→∞
1 1
lim − .
x→0 x sin x
14. Let $f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0 \\ 0, & x = 0. \end{cases}$ Prove that $f'(0) = 0$.

15. Explain why Rolle’s theorem is not applicable for the function f (x) = |x| on the interval
[−1, 1].

16. Verify Rolle’s Theorem for f (x) = x2 (1 − x)2 , 0 ≤ x ≤ 1.

17. Prove that if
\[ \frac{a_n}{n + 1} + \frac{a_{n-1}}{n} + \cdots + \frac{a_1}{2} + a_0 = 0, \]
then the equation $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0$ has at least one real root between 0 and 1.

18. Find the value of c in Rolle’s Theorem when f (x) = (x − a)m (x − b)n where m and n are
positive integers.

19. If f 0 (x) ≤ 0 at all points of (a, b), prove that f (x) is monotonic decreasing in (a, b). Under
what conditions is f (x) strictly decreasing in (a, b)?
20. Use the mean value theorem to show that $\sin x \le x$ and $\tan x \ge x$. Hence show that $\dfrac{\sin x}{x}$ is strictly decreasing on $\left(0, \dfrac{\pi}{2}\right)$.

21. Use the mean value theorem to prove that, if $0 < a < b$, then $1 - \dfrac{a}{b} < \ln\left(\dfrac{b}{a}\right) < \dfrac{b}{a} - 1$. Hence show that $\dfrac{1}{6} < \ln(1.2) < \dfrac{1}{5}$.
22. Use the Mean Value Theorem to prove the following inequalities

(i) ln(1 + x) < x if x > 0.

Chapter 7

Integration

7.1 Anti-derivatives

In this chapter we shall see that an equally important problem is:

Given a function $f$, find a function whose derivative is the same as $f$.

That is, for a given function $f$, we wish to find another function $F$ for which $F'(x) = f(x)$ for all $x$ on some interval.
Definition 7.1.1. A function $F$ is said to be an anti-derivative of a function $f$ if $F'(x) = f(x)$ on some interval.

Example: An anti-derivative of $f(x) = 2x$ is $F(x) = x^2$ since $F'(x) = 2x$.

There is always more than one anti-derivative of a function. For instance, in the foregoing example, $F_1(x) = x^2 - 1$ and $F_2(x) = x^2 + 10$ are also anti-derivatives of $f(x) = 2x$ since $F_1'(x) = F_2'(x) = f(x)$. Indeed, if $F$ is an anti-derivative of a function $f$, then so is $G(x) = F(x) + C$, for any constant $C$. This is a consequence of the fact that
\[ G'(x) = \frac{d}{dx}\left(F(x) + C\right) = F'(x) + 0 = F'(x) = f(x). \]
Thus, $F(x) + C$ stands for a set of functions of which each member has a derivative equal to $f(x)$.
Theorem 7.1.1. If $G'(x) = F'(x)$ for all $x$ in some interval $[a, b]$, then
\[ G(x) = F(x) + C \]
for all $x$ in the interval.

Examples: (a) The most general anti-derivative of $f(x) = 2x$ is $G(x) = x^2 + C$.

(b) The most general anti-derivative of $f(x) = 2x + 5$ is $G(x) = x^2 + 5x + C$ since $G'(x) = 2x + 5$.
7.2 Indefinite Integral

For convenience let's introduce a notation for an anti-derivative of a function. If $F'(x) = f(x)$, we shall represent the most general anti-derivative of $f$ by
\[ \int f(x)\,dx = F(x) + C. \]
The symbol $\int$ is called an integral sign, and the notation $\int f(x)\,dx$ is called the indefinite integral of $f(x)$ with respect to $x$. The function $f(x)$ is called the integrand. The process of finding an anti-derivative is called anti-differentiation or integration. The number $C$ is called a constant of integration. Just as $\dfrac{d}{dx}(\ )$ denotes differentiation with respect to $x$, the symbol $\int (\ )\,dx$ denotes integration with respect to $x$.

7.3 The Indefinite Integral of a Power

When differentiating the power xn , we multiply by the exponent n and decrease the exponent by
1. To find an anti-derivative of xn , the reverse of the differentiation rule would be : Increase the
exponent by 1 and divide by the new exponent n + 1.

If $n$ is a rational number, then for $n \ne -1$
\[ \int x^n\,dx = \frac{x^{n+1}}{n+1} + C. \]

Proof. Notice that
\[ \frac{d}{dx}\left(\frac{x^{n+1}}{n+1} + C\right) = (n+1)\frac{x^{(n+1)-1}}{n+1} + 0 = x^n. \]

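Example: Evaluate (a) $\int x^6\,dx$ (b) $\int \dfrac{1}{x^5}\,dx$.

Solution:
(a) $\int x^6\,dx = \dfrac{x^7}{7} + C$.

(b) By writing $\dfrac{1}{x^5}$ as $x^{-5}$, we have $\int x^{-5}\,dx = \dfrac{x^{-4}}{-4} + C = -\dfrac{1}{4x^4} + C$.

Example: Evaluate $\int \sqrt{x}\,dx$.

Solution: We first write $\int \sqrt{x}\,dx = \int x^{\frac{1}{2}}\,dx$ and therefore
\[ \int x^{\frac{1}{2}}\,dx = \frac{x^{\frac{3}{2}}}{\frac{3}{2}} + C = \frac{2}{3} x^{\frac{3}{2}} + C. \]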

The following property of indefinite integrals is an immediate consequence of the fact that the
derivative of a sum is the sum of derivatives.
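Theorem 7.3.1. If $F'(x) = f(x)$ and $G'(x) = g(x)$, then
\[ \int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx = F(x) \pm G(x) + C. \]

Example: Evaluate $\int \left(x^{-\frac{1}{2}} + x^4\right) dx$.

Solution: We can write
\[ \int \left(x^{-\frac{1}{2}} + x^4\right) dx = \int x^{-\frac{1}{2}}\,dx + \int x^4\,dx = \frac{x^{\frac{1}{2}}}{\frac{1}{2}} + \frac{x^5}{5} + C = 2x^{\frac{1}{2}} + \frac{x^5}{5} + C. \]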
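Theorem 7.3.2. If $F'(x) = f(x)$, then
\[ \int k f(x)\,dx = k \int f(x)\,dx \]
for any constant $k$.

The anti-derivative, or indefinite integral, of any finite sum can be obtained by integrating each term.

Example: Evaluate $\int \left(4x - 2x^{-\frac{1}{3}} + \dfrac{5}{x^2}\right) dx$.

Solution: It follows that
\begin{align*}
\int \left(4x - 2x^{-\frac{1}{3}} + \frac{5}{x^2}\right) dx &= 4\int x\,dx - 2\int x^{-\frac{1}{3}}\,dx + 5\int x^{-2}\,dx \\
&= 4 \cdot \frac{x^2}{2} - 2 \cdot \frac{x^{\frac{2}{3}}}{\frac{2}{3}} + 5 \cdot \frac{x^{-1}}{-1} + C \\
&= 2x^2 - 3x^{\frac{2}{3}} - 5x^{-1} + C.
\end{align*}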

7.4 Some Anti-Differentiation Formulas


1. $\int c f(x)\,dx = c \int f(x)\,dx$.

2. $\int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx$.

3. $\int x^n\,dx = \dfrac{1}{n+1} x^{n+1} + C$ for $n \ne -1$.

4. $\int a\,dx = ax + C$.

5. $\int \cos x\,dx = \sin x + C$.

6. $\int \sin x\,dx = -\cos x + C$.

7. $\int \sec^2 x\,dx = \tan x + C$.

8. $\int e^x\,dx = e^x + C$.

9. $\int \dfrac{1}{x}\,dx = \ln |x| + C$.

10. $\int \dfrac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1} x + C$.

11. $\int \dfrac{1}{1 + x^2}\,dx = \tan^{-1} x + C$.

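Example: Evaluate $\int \left(\dfrac{5}{x} - 2\sqrt[3]{x^2}\right) dx$.

Solution: We may write
\begin{align*}
\int \left(\frac{5}{x} - 2\sqrt[3]{x^2}\right) dx &= 5\int \frac{1}{x}\,dx - 2\int x^{\frac{2}{3}}\,dx \\
&= 5 \ln |x| - 2 \cdot \frac{1}{\frac{5}{3}} x^{\frac{5}{3}} + C \\
&= 5 \ln |x| - \frac{6}{5} x^{\frac{5}{3}} + C.
\end{align*}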
7.5 u−Substitution

The Indefinite Integral of a Power of a Function

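Theorem 7.5.1. If $F$ is an anti-derivative of $f$, then
\[ \int f(g(x))\,g'(x)\,dx = F(g(x)) + C. \]

Example: Evaluate $\int \dfrac{x}{(4x^2 + 3)^6}\,dx$.

Solution: Let us rewrite the integral as
\[ \int (4x^2 + 3)^{-6}\,x\,dx \]
and make the identifications
\[ u = 4x^2 + 3 \quad \text{and} \quad du = 8x\,dx. \]
Then
\begin{align*}
\int (4x^2 + 3)^{-6}\,x\,dx &= \frac{1}{8} \int \underbrace{(4x^2 + 3)^{-6}}_{u^{-6}}\ \underbrace{8x\,dx}_{du} \\
&= \frac{1}{8} \int u^{-6}\,du \\
&= \frac{1}{8} \cdot \frac{u^{-5}}{-5} + C \\
&= -\frac{1}{40} (4x^2 + 3)^{-5} + C.
\end{align*}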

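Example: Evaluate $\int x(x^2 + 2)^3\,dx$.

Solution: If $u = x^2 + 2$ then $du = 2x\,dx$. Thus,
\begin{align*}
\int x(x^2 + 2)^3\,dx &= \frac{1}{2} \int \underbrace{(x^2 + 2)^3}_{u^3}\ \underbrace{2x\,dx}_{du} \\
&= \frac{1}{2} \int u^3\,du \\
&= \frac{1}{2} \cdot \frac{u^4}{4} + C \\
&= \frac{1}{8} (x^2 + 2)^4 + C.
\end{align*}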

You Try It: Evaluate $\int \sqrt[3]{(7 - 2x^3)^4}\;x^2\,dx$.

Example: Evaluate $\int \sin 10x\,dx$.

Solution: If $u = 10x$, we then need $du = 10\,dx$. Accordingly, we write
\begin{align*}
\int \sin 10x\,dx &= \frac{1}{10} \int \sin 10x\,(10\,dx) \\
&= \frac{1}{10} \int \sin u\,du \\
&= \frac{1}{10} (-\cos u) + C \\
&= -\frac{1}{10} \cos 10x + C.
\end{align*}

You Try It: Evaluate $\int \sec^2(1 - 4x)\,dx$.

7.6 Area Under a Graph

As the derivative is motivated by the geometric problem of constructing a tangent to a curve, the
historical problem leading to the definition of a definite integral is the problem of finding area.
Specifically, we are interested in finding the area A of a region bounded between the x−axis, the
graph of a non-negative function y = f (x) defined on some interval [a, b] and

(i) the vertical lines x = a and x = b.

(ii) the x−intercepts of the graph.

7.6.1 The Definite Integral

The geometric problems that motivated the development of the integral calculus (determination
of lengths, areas, and volumes) arose in the ancient civilizations of Northern Africa. Where so-
lutions were found, they related to concrete problems such as the measurement of a quantity of
grain. Greek philosophers took a more abstract approach. In fact, Eudoxus (around 400 B.C.)
and Archimedes (250 B.C.) formulated ideas of integration as we know it today. Integral calculus
developed independently, and without an obvious connection to differential calculus. The calculus
became a “whole” in the last part of the seventeenth century when Isaac Barrow, Isaac Newton,
and Gottfried Wilhelm Leibniz (with help from others) discovered that the integral of a function
could be found by asking what was differentiated to obtain that function.
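Definition 7.6.1. Let $f$ be a function defined on a closed interval $[a, b]$. Then the definite integral of $f$ from $a$ to $b$, denoted by $\int_a^b f(x)\,dx$, is defined to be
\[ \int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x. \]
Here $\Delta x = \dfrac{b - a}{n}$ is the width of each of $n$ equal subintervals of $[a, b]$ and $x_i$ is a sample point in the $i$th subinterval (for example its right endpoint $a + i\,\Delta x$).

The numbers $a$ and $b$ are called the lower and upper limits of integration, respectively. If the limit exists, the function $f$ is said to be integrable on the interval.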

7.6.2 Properties of the Definite Integral

The following two results prove to be useful when working with definite integrals.
Theorem 7.6.1. If $f(a)$ exists, then
\[ \int_a^a f(x)\,dx = 0. \]

Theorem 7.6.2. If $f$ is integrable on $[a, b]$, then
\[ \int_b^a f(x)\,dx = -\int_a^b f(x)\,dx. \]

Example: By Theorem 7.6.1,
\[ \int_1^1 (x^3 + 3x)\,dx = 0. \]
1

Theorem 7.6.3. Let $f$ and $g$ be integrable functions on $[a, b]$. Then,

(i) $\int_a^b k f(x)\,dx = k \int_a^b f(x)\,dx$ where $k$ is any constant.

(ii) $\int_a^b [f(x) \pm g(x)]\,dx = \int_a^b f(x)\,dx \pm \int_a^b g(x)\,dx$.

(iii) $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$, where $c$ is any number in $[a, b]$.

The independent variable $x$ in a definite integral is called a dummy variable of integration. The value of the integral does not depend on the symbol used. In other words,
\[ \int_a^b f(x)\,dx = \int_a^b f(r)\,dr = \int_a^b f(s)\,ds = \int_a^b f(t)\,dt \]
and so on.
Theorem 7.6.4. For any constant $k$,
\[ \int_a^b k\,dx = k \int_a^b dx = k(b - a). \]

Theorem 7.6.5. Let $f$ be integrable on $[a, b]$ and $f(x) \ge 0$ for all $x$ in $[a, b]$, then
\[ \int_a^b f(x)\,dx \ge 0. \]

7.7 The Fundamental Theorem of Calculus

In this theorem we shall see that the concept of an anti-derivative of a continuous function provides
the bridge between the differential calculus and the integral calculus.
Theorem 7.7.1. Let $f$ be continuous on $[a, b]$ and let $F$ be any function for which $F'(x) = f(x)$. Then
\[ \int_a^b f(x)\,dx = F(b) - F(a). \]

The difference is usually written $F(x)\Big|_a^b$, that is,
\[ \underbrace{\int_a^b f(x)\,dx}_{\text{definite integral}} = \underbrace{\int f(x)\,dx}_{\text{indefinite integral}}\,\Bigg|_a^b = F(x)\Big|_a^b. \]

Example: Evaluate $\int_1^3 x\,dx$.

Solution: An anti-derivative of $f(x) = x$ is $F(x) = \dfrac{x^2}{2}$. Consequently,
\[ \int_1^3 x\,dx = \frac{x^2}{2}\Bigg|_1^3 = \frac{9}{2} - \frac{1}{2} = 4. \]
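
As a numerical cross-check of the value 4, the sum in Definition 7.6.1 can be approximated directly for increasing $n$. The sketch below assumes equal subintervals with right-endpoint sample points; the function name `riemann_sum` is illustrative and not part of these notes.

```python
def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

def F(x):
    """An anti-derivative of f(x) = x."""
    return x * x / 2

exact = F(3.0) - F(1.0)                      # Fundamental Theorem of Calculus
for n in (10, 100, 1000):
    print(n, riemann_sum(lambda x: x, 1.0, 3.0, n), exact)
```

The sums approach the value given by the anti-derivative as $n$ grows, which is the content of the theorem for this particular integrand.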

7.8 Techniques of Integration

7.8.1 Integration by Parts

Integration by parts is useful for integrands involving products of algebraic and exponential or logarithmic functions, such as $\int x^2 e^x\,dx$ and $\int x \ln x\,dx$. This is the inverse operation of differentiating a product. If $u$ and $v$ are functions of $x$, then
\[ \frac{d}{dx}(uv) = u \frac{dv}{dx} + v \frac{du}{dx}. \]
Integrating both sides, if $\dfrac{du}{dx}$ and $\dfrac{dv}{dx}$ are continuous, then
\begin{align*}
uv &= \int u \frac{dv}{dx}\,dx + \int v \frac{du}{dx}\,dx, \\
\int u \frac{dv}{dx}\,dx &= uv - \int v \frac{du}{dx}\,dx, \\
\int u\,dv &= uv - \int v\,du.
\end{align*}

Example: Evaluate $\int x e^x\,dx$.

Solution: Let $u = x$, $du = dx$ and $dv = e^x\,dx \Rightarrow v = \int dv = \int e^x\,dx = e^x$. Then,
\[ \int x e^x\,dx = x e^x - \int e^x\,dx = x e^x - e^x + C. \]

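Example: Evaluate $\int x^2 \ln x\,dx$.

Solution: $x^2$ is more easily integrated than $\ln x$. So choose $dv = x^2\,dx$. Then,
\[ dv = x^2\,dx \Rightarrow v = \int dv = \int x^2\,dx = \frac{x^3}{3} \]
and $u = \ln x \Rightarrow du = \dfrac{1}{x}\,dx$. Therefore,
\begin{align*}
\int x^2 \ln x\,dx &= \frac{x^3}{3} \ln x - \int \left(\frac{x^3}{3}\right)\frac{1}{x}\,dx \\
&= \frac{x^3}{3} \ln x - \frac{1}{3} \int x^2\,dx \\
&= \frac{x^3}{3} \ln x - \frac{x^3}{9} + C.
\end{align*}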

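Example: Evaluate $\int \ln x\,dx$.

Solution: Choose $dv = dx \Rightarrow v = \int dv = \int dx = x$ and $u = \ln x \Rightarrow du = \dfrac{1}{x}\,dx$, therefore
\begin{align*}
\int \ln x\,dx &= x \ln x - \int x \cdot \frac{1}{x}\,dx \\
&= x \ln x - \int dx \\
&= x \ln x - x + C.
\end{align*}

You Try It: Evaluate $\int x \tan^{-1} x\,dx$.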

7.9 Using Integration by Parts Repeatedly


Example: Evaluate $\int x^2 e^x\,dx$.

Solution: Notice that the derivative of $x^2$ becomes simpler, whereas the derivative of $e^x$ does not. So you should let $u = x^2$ and $dv = e^x\,dx$. So
\[ dv = e^x\,dx \Rightarrow v = \int dv = \int e^x\,dx = e^x, \qquad u = x^2 \Rightarrow du = 2x\,dx. \]
Integrating by parts one time we get,
\[ \int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx. \]
Apply integration by parts a second time, where
\[ dv = e^x\,dx \Rightarrow v = \int dv = \int e^x\,dx = e^x, \qquad u = 2x \Rightarrow du = 2\,dx. \]
Then,
\begin{align*}
\int x^2 e^x\,dx &= x^2 e^x - \int 2x e^x\,dx \\
&= x^2 e^x - \left(2x e^x - \int 2 e^x\,dx\right) \\
&= x^2 e^x - 2x e^x + 2 e^x + C \\
&= e^x (x^2 - 2x + 2) + C.
\end{align*}

7.10 Evaluating a Definite Integral

Example: Evaluate $\int_1^e \ln x\,dx$.

Solution:
\begin{align*}
\int_1^e \ln x\,dx &= (x \ln x - x)\Big|_1^e \\
&= (e \ln e - e) - (1 \cdot \ln 1 - 1) \\
&= (e - e) - (0 - 1) \\
&= 1.
\end{align*}

7.11 Reduction Formulas

These are formulas in which a given integral is expressed in terms of similar integrals of simpler
form.

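Example: Let $n$ be a positive integer. Use integration by parts to derive the reduction formula
\[ \int x^n e^x\,dx = x^n e^x - n \int x^{n-1} e^x\,dx. \]

Solution: Let $u = x^n$, $dv = e^x\,dx$. Then $du = n x^{n-1}\,dx$, $v = \int dv = \int e^x\,dx = e^x$. So
\begin{align*}
\int x^n e^x\,dx &= x^n e^x - \int e^x (n x^{n-1})\,dx \\
&= x^n e^x - n \int x^{n-1} e^x\,dx.
\end{align*}
No separate constant is written here because it is absorbed into the remaining indefinite integral.

To illustrate the use of the reduction formula we calculate $\int x^n e^x\,dx$ for $n = 1, 2$.
\begin{align*}
n = 1: \quad \int x e^x\,dx &= x e^x - \int e^x\,dx = x e^x - e^x + C. \\
n = 2: \quad \int x^2 e^x\,dx &= x^2 e^x - 2 \int x e^x\,dx = x^2 e^x - 2x e^x + 2 e^x + C.
\end{align*}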
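
Repeated use of the reduction formula shows that $\int x^n e^x\,dx = e^x P_n(x) + C$ for a polynomial $P_n$ satisfying $P_0(x) = 1$ and $P_n(x) = x^n - n P_{n-1}(x)$. The following Python sketch applies that recursion to lists of coefficients; the function name `exp_integral_poly` and the list representation are choices made for this illustration, not notation from the notes.

```python
def exp_integral_poly(n):
    """Coefficients [c0, c1, ..., cn] of P_n, where the integral of
    x^n e^x is e^x * P_n(x) + C and P_n(x) = x^n - n * P_{n-1}(x)."""
    coeffs = [1]                      # P_0(x) = 1, since the integral of e^x is e^x + C
    for k in range(1, n + 1):
        # Multiply P_{k-1} by -k, then append the coefficient 1 of x^k.
        coeffs = [-k * c for c in coeffs] + [1]
    return coeffs

for n in (1, 2, 3):
    print(n, exp_integral_poly(n))
```

For $n = 2$ the list is read as $2 - 2x + x^2$, the same polynomial that appears in the worked case above.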

Example: Evaluate $\int \sin^n x\,dx$.

Solution: Rewrite as $\int \sin^n x\,dx = \int \sin^{n-1} x \sin x\,dx$. Then $u = \sin^{n-1} x \Rightarrow du = (n-1) \sin^{n-2} x \cos x\,dx$ and $dv = \sin x\,dx \Rightarrow v = \int dv = \int \sin x\,dx = -\cos x$. Then,
\begin{align*}
\int \sin^n x\,dx &= -\sin^{n-1} x \cos x + (n-1) \int \sin^{n-2} x \cos x \cos x\,dx \\
&= -\sin^{n-1} x \cos x + (n-1) \int \sin^{n-2} x \cos^2 x\,dx \\
&= -\sin^{n-1} x \cos x + (n-1) \int \sin^{n-2} x (1 - \sin^2 x)\,dx \\
&= -\sin^{n-1} x \cos x + (n-1) \left( \int \sin^{n-2} x\,dx - \int \sin^n x\,dx \right), \\
\int \sin^n x\,dx + (n-1) \int \sin^n x\,dx &= -\sin^{n-1} x \cos x + (n-1) \int \sin^{n-2} x\,dx, \\
n \int \sin^n x\,dx &= -\sin^{n-1} x \cos x + (n-1) \int \sin^{n-2} x\,dx, \\
\int \sin^n x\,dx &= -\frac{\sin^{n-1} x \cos x}{n} + \frac{n-1}{n} \int \sin^{n-2} x\,dx.
\end{align*}
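
One way to see the reduction formula at work is on the definite integral over $[0, \pi/2]$. For $n \ge 2$ the term $\sin^{n-1} x \cos x$ vanishes at both endpoints, so $I_n = \frac{n-1}{n} I_{n-2}$ with $I_0 = \pi/2$ and $I_1 = 1$. The sketch below is a hedged illustration under those assumptions; the names `sin_power_integral` and `midpoint` are hypothetical, and the midpoint rule is used only as an independent numerical comparison.

```python
import math

def sin_power_integral(n):
    """Integral of sin^n x over [0, pi/2] via I_n = (n - 1)/n * I_{n-2}."""
    if n == 0:
        return math.pi / 2
    if n == 1:
        return 1.0
    return (n - 1) / n * sin_power_integral(n - 2)

def midpoint(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

for n in (2, 3, 6):
    numeric = midpoint(lambda x: math.sin(x) ** n, 0.0, math.pi / 2)
    print(n, sin_power_integral(n), numeric)
```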

7.12 Partial Fractions

This technique involves the decomposition of a rational function into the sum of two or more simpler rational functions. We will consider rational functions (quotients of polynomials) in which the numerator has a lower degree than the denominator. If this condition is not met, we first carry out long division, dividing the numerator by the denominator, until we reduce the problem to an equivalent one involving a fraction in which the numerator has a lower degree than the denominator.

Example: Since $\dfrac{x + 7}{x^2 - x - 6} = \dfrac{2}{x - 3} - \dfrac{1}{x + 2}$, we have
\begin{align*}
\int \frac{x + 7}{x^2 - x - 6}\,dx &= \int \left( \frac{2}{x - 3} - \frac{1}{x + 2} \right) dx \\
&= 2 \int \frac{1}{x - 3}\,dx - \int \frac{1}{x + 2}\,dx \\
&= 2 \ln |x - 3| - \ln |x + 2| + C.
\end{align*}

You Try It: Evaluate $\int \dfrac{2x + 1}{(x - 1)(x + 3)}\,dx$.

7.12.1 Integrating with Repeated Factors

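Example: Find $\int \dfrac{5x^2 + 20x + 6}{x^3 + 2x^2 + x}\,dx$.

Solution: Factorise the denominator as $x(x + 1)^2$. Then write the partial fraction decomposition as
\[ \frac{5x^2 + 20x + 6}{x(x + 1)^2} = \frac{A}{x} + \frac{B}{x + 1} + \frac{C}{(x + 1)^2}. \]
Substituting we get $A = 6$, $B = -1$, $C = 9$, then
\begin{align*}
\int \frac{5x^2 + 20x + 6}{x^3 + 2x^2 + x}\,dx &= \int \frac{6}{x}\,dx - \int \frac{1}{x + 1}\,dx + \int \frac{9}{(x + 1)^2}\,dx \\
&= 6 \ln |x| - \ln |x + 1| + \frac{9(x + 1)^{-1}}{-1} + C \\
&= \ln \left| \frac{x^6}{x + 1} \right| - \frac{9}{x + 1} + C.
\end{align*}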
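
The coefficients $A = 6$, $B = -1$, $C = 9$ can be checked by comparing both sides of the decomposition at a few sample points. The sketch below uses exact rational arithmetic from Python's `fractions` module; the function names `original` and `decomposed` are chosen only for this illustration.

```python
from fractions import Fraction

def original(x):
    """The integrand (5x^2 + 20x + 6) / (x^3 + 2x^2 + x)."""
    return (5 * x**2 + 20 * x + 6) / (x**3 + 2 * x**2 + x)

def decomposed(x):
    """The partial fraction form 6/x - 1/(x + 1) + 9/(x + 1)^2."""
    return 6 / x - 1 / (x + 1) + 9 / (x + 1) ** 2

# Sample points avoiding the poles x = 0 and x = -1.
samples = [Fraction(1), Fraction(2), Fraction(-3), Fraction(1, 2)]
for x in samples:
    print(x, original(x), decomposed(x))
```

Agreement at more points than the degree of the cleared numerator is enough to confirm that the two rational functions coincide.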

You Try It: Evaluate $\int \dfrac{6x - 1}{x^3 (2x - 1)}\,dx$.

7.12.2 Integrating an Improper Rational Function

Example: Find $\int \dfrac{x^5 + x - 1}{x^4 - x^3}\,dx$.

Solution: This rational function is improper: its numerator has a degree greater than that of its denominator. Carrying out long division, we have
\[ \frac{x^5 + x - 1}{x^4 - x^3} = x + 1 + \frac{x^3 + x - 1}{x^4 - x^3}. \]
Now, applying partial fraction decomposition produces
\[ \frac{x^3 + x - 1}{x^3 (x - 1)} = \frac{A}{x} + \frac{B}{x^2} + \frac{C}{x^3} + \frac{D}{x - 1}. \]
We see that $A = 0$, $B = 0$, $C = 1$ and $D = 1$. So now we can integrate,
\begin{align*}
\int \frac{x^5 + x - 1}{x^4 - x^3}\,dx &= \int \left( x + 1 + \frac{x^3 + x - 1}{x^4 - x^3} \right) dx \\
&= \int \left( x + 1 + \frac{1}{x^3} + \frac{1}{x - 1} \right) dx \\
&= \frac{x^2}{2} + x - \frac{1}{2x^2} + \ln |x - 1| + C.
\end{align*}

You Try It: Evaluate $\int \dfrac{x^3 - 2x}{x^2 + 3x + 2}\,dx$.

7.12.3 Quadratic Factors

Example: Find $\int \dfrac{dx}{x^2 - 4x + 5}$.

Solution: Here we cannot find real factors, but we can complete the square,
\[ x^2 - 4x + 5 = (x - 2)^2 + 1, \]
therefore
\[ \int \frac{dx}{x^2 - 4x + 5} = \int \frac{dx}{(x - 2)^2 + 1} = \tan^{-1}(x - 2) + C. \]

Example: Find $\int \dfrac{(x + 3)\,dx}{x^2 - 4x + 5}$.

Solution: Completing the square, $\dfrac{x + 3}{x^2 - 4x + 5} = \dfrac{x + 3}{(x - 2)^2 + 1}$. Now since $x + 3 = x - 2 + 5$, we have
\begin{align*}
\int \frac{(x + 3)\,dx}{x^2 - 4x + 5} &= \int \frac{(x - 2 + 5)\,dx}{(x - 2)^2 + 1} \\
&= \int \frac{(x - 2)\,dx}{(x - 2)^2 + 1} + \int \frac{5\,dx}{(x - 2)^2 + 1} \\
&= \frac{1}{2} \ln\left((x - 2)^2 + 1\right) + 5 \tan^{-1}(x - 2) + C.
\end{align*}

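Example: Find $\int \dfrac{(x + 1)\,dx}{x(x^2 + 1)}$.

Solution: Decompose into partial fractions,
\[ \frac{x + 1}{x(x^2 + 1)} = \frac{A}{x} + \frac{B + Cx}{x^2 + 1}. \]
Then $A = 1$, $B = 1$, $C = -1$, and our integral is
\begin{align*}
\int \frac{(x + 1)\,dx}{x(x^2 + 1)} &= \int \frac{1}{x}\,dx + \int \frac{dx}{x^2 + 1} - \int \frac{x\,dx}{x^2 + 1} \\
&= \ln |x| + \tan^{-1} x - \frac{1}{2} \ln(x^2 + 1) + C.
\end{align*}

You Try It: Evaluate $\int \dfrac{4x}{(x^2 + 1)(x^2 + 2x + 3)}\,dx$.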

7.13 Integration of Rational Functions of Sine and Cosine

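Integrals of rational expressions that involve $\sin x$ and $\cos x$ can be reduced to integrals of quotients of polynomials by means of the substitution
\[ t = \tan \frac{x}{2}. \]
It then follows that $\cos x = \dfrac{1 - t^2}{1 + t^2}$, $\sin x = \dfrac{2t}{1 + t^2}$ and $\dfrac{dx}{dt} = \dfrac{2}{1 + t^2}$.

Example: Evaluate $\int \dfrac{dx}{2 + 2 \sin x + \cos x}$.

Solution: We see that
\[ \int \frac{dx}{2 + 2 \sin x + \cos x} = \int \frac{2\,dt}{t^2 + 4t + 3}. \]
Since $t^2 + 4t + 3 = (t + 1)(t + 3)$, we use partial fractions,
\begin{align*}
\int \frac{dx}{2 + 2 \sin x + \cos x} &= \int \left( \frac{1}{t + 1} - \frac{1}{t + 3} \right) dt \\
&= \ln |t + 1| - \ln |t + 3| + C \\
&= \ln \left| \frac{t + 1}{t + 3} \right| + C \\
&= \ln \left| \frac{1 + \tan \frac{x}{2}}{3 + \tan \frac{x}{2}} \right| + C.
\end{align*}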

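Example: Evaluate $\int \dfrac{dx}{5 + 3 \cos x}$.

Solution: The integral becomes
\begin{align*}
\int \frac{dx}{5 + 3 \cos x} &= \int \frac{\dfrac{2\,dt}{1 + t^2}}{5 + 3\left(\dfrac{1 - t^2}{1 + t^2}\right)} \\
&= \int \frac{\dfrac{2\,dt}{1 + t^2}}{\dfrac{5(1 + t^2) + 3(1 - t^2)}{1 + t^2}} = \int \frac{2\,dt}{8 + 2t^2} \\
&= \int \frac{dt}{t^2 + 4} = \frac{1}{2} \tan^{-1}\left(\frac{t}{2}\right) + C \\
&= \frac{1}{2} \tan^{-1}\left(\frac{1}{2} \tan \frac{x}{2}\right) + C.
\end{align*}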
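
The anti-derivative obtained from the substitution can be compared with a direct numerical integration. The sketch below is illustrative only: it assumes an interval inside $(-\pi, \pi)$, where $\tan\frac{x}{2}$ is continuous, and it uses composite Simpson's rule (a standard numerical method that is not covered in these notes) as the independent check. All function names are hypothetical.

```python
import math

def integrand(x):
    return 1.0 / (5.0 + 3.0 * math.cos(x))

def antiderivative(x):
    # Result of the t = tan(x/2) substitution; used here only for -pi < x < pi.
    return 0.5 * math.atan(0.5 * math.tan(x / 2.0))

def simpson(f, a, b, n=200):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

a, b = 0.0, 2.0
numeric = simpson(integrand, a, b)
exact = antiderivative(b) - antiderivative(a)
print(numeric, exact)    # the two values should agree to many decimal places
```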

You Try It: Evaluate $\int \dfrac{dx}{5 + 4 \cos x}$.

7.14 Integration of Powers of Trigonometric Functions

With the aid of trigonometric identities, it is possible to evaluate integrals of the type
\[ \int \sin^m x \cos^n x\,dx. \]
We distinguish two cases.

Case I: $m$ or $n$ is an odd positive integer. Let us first assume that $m$ is an odd positive integer. By writing
\[ \sin^m x = \sin^{m-1} x \sin x, \]
where $m - 1$ is even, and using $\sin^2 x = 1 - \cos^2 x$, the integrand can be expressed as a sum of powers of $\cos x$ times $\sin x$.

Example: Evaluate $\int \sin^3 x\,dx$.

Solution:
\begin{align*}
\int \sin^3 x\,dx &= \int \sin^2 x \sin x\,dx \\
&= \int (1 - \cos^2 x) \sin x\,dx \\
&= \int \sin x\,dx + \int \cos^2 x (-\sin x)\,dx \\
&= -\cos x + \frac{1}{3} \cos^3 x + C.
\end{align*}

Example: Evaluate $\int \sin^5 x \cos^2 x\,dx$.

Solution:
\begin{align*}
\int \sin^5 x \cos^2 x\,dx &= \int \cos^2 x \sin^4 x \sin x\,dx \\
&= \int \cos^2 x (\sin^2 x)^2 \sin x\,dx \\
&= \int \cos^2 x (1 - \cos^2 x)^2 \sin x\,dx \\
&= \int \cos^2 x (1 - 2 \cos^2 x + \cos^4 x) \sin x\,dx \\
&= -\int \cos^2 x (-\sin x)\,dx + 2 \int \cos^4 x (-\sin x)\,dx - \int \cos^6 x (-\sin x)\,dx \\
&= -\frac{1}{3} \cos^3 x + \frac{2}{5} \cos^5 x - \frac{1}{7} \cos^7 x + C.
\end{align*}
If $n$ is an odd positive integer, the procedure for evaluation is the same except that we seek an integrand that is the sum of powers of $\sin x$ times $\cos x$.

Example: Evaluate $\int \sin^4 x \cos^3 x\,dx$.

Solution:
\begin{align*}
\int \sin^4 x \cos^3 x\,dx &= \int \sin^4 x \cos^2 x \cos x\,dx \\
&= \int \sin^4 x (1 - \sin^2 x) \cos x\,dx \\
&= \int \sin^4 x (\cos x)\,dx - \int \sin^6 x (\cos x)\,dx \\
&= \frac{1}{5} \sin^5 x - \frac{1}{7} \sin^7 x + C.
\end{align*}

You Try It: Evaluate $\int \sin^2 x \cos^3 x\,dx$.

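Case II: $m$ and $n$ are both even non-negative integers. When both $m$ and $n$ are even non-negative integers, the evaluation of the integral relies heavily on the identities,
\[ \sin x \cos x = \frac{1}{2} \sin 2x, \qquad \sin^2 x = \frac{1 - \cos 2x}{2}, \qquad \cos^2 x = \frac{1 + \cos 2x}{2}. \]

Example: Evaluate $\int \cos^4 x\,dx$.

Solution:
\begin{align*}
\int \cos^4 x\,dx &= \int (\cos^2 x)^2\,dx \\
&= \int \left( \frac{1 + \cos 2x}{2} \right)^2 dx \\
&= \frac{1}{4} \int (1 + 2 \cos 2x + \cos^2 2x)\,dx \\
&= \frac{1}{4} \int \left( 1 + 2 \cos 2x + \frac{1 + \cos 4x}{2} \right) dx \\
&= \frac{1}{4} \int \left( \frac{3}{2} + 2 \cos 2x + \frac{1}{2} \cos 4x \right) dx \\
&= \frac{3}{8} x + \frac{1}{4} \sin 2x + \frac{1}{32} \sin 4x + C.
\end{align*}

You Try It: Evaluate $\int \sin^2 x \cos^2 x\,dx$.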

Instead of $\sin^8 x \cos^6 x$, suppose you have $\sin 8x \cos 6x$.

Example: Find $\int_0^{2\pi} \sin 8x \cos 6x\,dx$.

More generally, find $\int_0^{2\pi} \sin px \cos qx\,dx$. Use the identity
\[ \sin px \cos qx = \frac{1}{2} \sin (p + q)x + \frac{1}{2} \sin (p - q)x. \]
Thus $\sin 8x \cos 6x = \dfrac{1}{2} \sin 14x + \dfrac{1}{2} \sin 2x$. Separated like this, the sines are easy to integrate,
\[ \int_0^{2\pi} \sin 8x \cos 6x\,dx = \left[ -\frac{1}{2} \frac{\cos 14x}{14} - \frac{\cos 2x}{4} \right]_0^{2\pi} = 0. \]
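
Both the product-to-sum identity and the value of the definite integral can be spot-checked numerically. The sketch below is an illustration only; the function names are hypothetical, and `antiderivative` simply encodes the bracketed expression above for $p = 8$, $q = 6$.

```python
import math

def product(x, p=8, q=6):
    return math.sin(p * x) * math.cos(q * x)

def sum_form(x, p=8, q=6):
    """Right-hand side of the identity sin px cos qx = (sin(p+q)x + sin(p-q)x) / 2."""
    return 0.5 * math.sin((p + q) * x) + 0.5 * math.sin((p - q) * x)

def antiderivative(x):
    """-(1/2) cos(14x)/14 - cos(2x)/4, an anti-derivative of sin 8x cos 6x."""
    return -0.5 * math.cos(14 * x) / 14 - math.cos(2 * x) / 4

points = [0.1, 0.7, 1.3, 2.9]
for x in points:
    print(x, product(x), sum_form(x))

integral = antiderivative(2 * math.pi) - antiderivative(0.0)
print(integral)    # should be zero up to rounding error
```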

With two sines or two cosines, the addition formulas give the analogous identities
\[ \sin px \sin qx = -\frac{1}{2} \cos (p + q)x + \frac{1}{2} \cos (p - q)x, \]
\[ \cos px \cos qx = \frac{1}{2} \cos (p + q)x + \frac{1}{2} \cos (p - q)x. \]

You Try It: Evaluate $\int \sin 2x \sin 4x\,dx$.

7.15 Tutorial 5
1. Evaluate the following.
(i) $\int (x + 2) \sin(x^2 + 4x - 6)\,dx$ (ii) $\int x^2 e^{\sin x^3} \cos x^3\,dx$ (iii) $\int \dfrac{\tan^{-1} x}{1 + x^2}\,dx$ (iv) $\int \dfrac{dx}{x (\ln x)^4}$ (v) $\int x^2 \sqrt{2x^3 - 4}\,dx$

2. Integrate (i) $\dfrac{1}{(x^2 + 1)(x + 1)}$ (ii) $\dfrac{x + 5}{(2x - 1)(x + 3)}$ (iii) $\dfrac{2x - 5}{(x^2 + 4)(x + 6)}$.

3. Evaluate each of the following integrals by using an appropriate substitution.
(i) $\int \dfrac{x^5\,dx}{\sqrt{x^3 + 1}}$ (ii) $\int x^7 \sqrt{x^4 + 1}\,dx$ (iii) $\int \sqrt{\dfrac{1 + x}{1 - x}}\,dx$

4. Evaluate each of the following integrals.
(i) $\int \sin^3 x\,dx$ (ii) $\int \sin x \cos x\,dx$ (iii) $\int \sin^3 x \cos^3 x\,dx$ (iv) $\int \sin 6x \cos 3x\,dx$

5. Use a suitable substitution to evaluate the following integrals.
(i) $\int \sec \theta\,d\theta$ (ii) $\int \dfrac{d\theta}{1 + \sin \theta}$ (iii) $\int \dfrac{\sin \theta}{2 + \sin \theta}\,d\theta$ (iv) $\int \dfrac{d\theta}{a + b \cos \theta}$ if $a > b > 0$.

6. Show that $\int \sec x\,dx = \ln |\sec x + \tan x| = \ln \left| \tan\left( \dfrac{1}{4}\pi + \dfrac{1}{2}x \right) \right|$.

7. Evaluate each of the following integrals.
(i) $\int x e^{3x}\,dx$ (ii) $\int e^x \sin x\,dx$ (iii) $\int \sin^{-1} x\,dx$ (iv) $\int e^{ax} \cos px\,dx$ (v) $\int x^3 \cos x\,dx$.

8. (a) Show that $\int x^m \ln x\,dx = x^{m+1} \left( \dfrac{\ln x}{m + 1} - \dfrac{1}{(m + 1)^2} \right)$, $m \ne -1$.

(b) Show that $\int \sin^n x\,dx = \dfrac{-\sin^{n-1} x \cos x}{n} + \dfrac{n - 1}{n} \int \sin^{n-2} x\,dx$, $n \ne 0$.

(d) Show that $\int \sin^m x \cos^n x\,dx = -\dfrac{\sin^{m-1} x \cos^{n+1} x}{m + n} + \dfrac{m - 1}{m + n} \int \sin^{m-2} x \cos^n x\,dx$ and so find $\int \sin^6 x \cos^5 x\,dx$.

(e) Let $I_n = \int \sec^n x\,dx$, $n = 2, 3, \ldots$. Show that $I_n = \dfrac{\sec^{n-2} x \tan x}{n - 1} + \dfrac{n - 2}{n - 1} I_{n-2}$.

(f) Let $I_n = \int (\ln x)^n\,dx$, $n = 1, 2, \ldots$. Show that $I_n = x (\ln x)^n - n I_{n-1}$.

(h) Let $I_n = \int (1 + ax^2)^n\,dx$. Show that $(2n + 1) I_n = 2n I_{n-1} + x (1 + ax^2)^n$.
Chapter 8

Functions of Several Variables

So far we have dealt only with functions of single (independent) variables. Many familiar quantities, however, are functions of two or more variables. For instance, the work done by a force ($W = FD$) and the volume of a right circular cylinder ($V = \pi r^2 h$) are both functions of two variables. The volume of a rectangular solid ($V = lwh$) is a function of three variables. The notation for a function of two or three variables is as follows
\[ z = \underbrace{f(x, y)}_{\text{2 variables}} = x^2 + xy \]
and
\[ w = \underbrace{f(x, y, z)}_{\text{3 variables}} = x + 2y - 3z. \]

8.1 Definition of a Function of Two Variables

Let D be a set of ordered pairs of real numbers. If to each ordered pair (x, y) in D there corresponds
a unique real number f (x, y), then f is called a function of x and y. The set D is the domain
of f and the corresponding set of values for f (x, y) is the range of f . For the function given by
z = f (x, y), we call x and y the independent variables and z the dependent variable.

As with functions of one variable, the most common way to describe a function of several variables
is with an equation, and unless otherwise restricted, we can assume that the domain is the set of
all points for which the equation is defined. for example, the domain of the function given by
f (x, y) = x2 + y 2
is assumed to be the entire xy−plane.
Example 8.1.1. Find the domains of the following functions.
(i) $f(x, y) = \dfrac{\sqrt{x^2 + y^2 - 9}}{x}$ (ii) $g(x, y, z) = \dfrac{x}{\sqrt{9 - x^2 - y^2 - z^2}}$.

Solution: (i) The function $f$ is defined for all points $(x, y)$ such that $x \ne 0$ and $x^2 + y^2 \ge 9$. Thus, the domain is the set of all points lying on or outside the circle $x^2 + y^2 = 9$, except those on the $y$-axis.

(ii) The function $g$ is defined for all points $(x, y, z)$ such that $x^2 + y^2 + z^2 < 9$. Consequently, the domain is the set of all points $(x, y, z)$ lying inside a sphere of radius 3 that is centred at the origin.

Functions of several variables can be combined in the same ways as functions of single variables. For
instance, we can form the sum, difference, product and quotients of two functions of two variables
as follows

1. $(f \pm g)(x, y) = f(x, y) \pm g(x, y)$ (Sum or Difference).

2. $(fg)(x, y) = f(x, y)\,g(x, y)$ (Product).

3. $\dfrac{f}{g}(x, y) = \dfrac{f(x, y)}{g(x, y)}$, $g(x, y) \ne 0$ (Quotient).

8.2 The Graph of a Function of Two Variables

As with functions of single variables, we can learn a lot about the behaviour of a function of two
variables by sketching its graph. The graph of a function f of two variables is the set of all points
(x, y, z) for which z = f (x, y) and (x, y) is in the domain of f . The graph can be interpreted as a
surface in space.

8.2.1 Level Curves

A second way to visualize a function of two variables is to use a scalar field in which the scalar
z = f (x, y) is assigned to the point (x, y). A scalar field can be characterized by level curves (or
contour lines) along which the value of f (x, y) is constant. For example, the weather map shows
level curves of equal pressure called isobars. In weather maps for which the level curves represent
points of equal temperature, the level curves are called isotherms. Another common use of level
curves is in representing electrical potential fields, in this type of map, the level curves are called
equipotential lines.

8.2.2 Level Surfaces

The concept of a level curve can be extended by one dimension to define a level surface. If f is
a function of three variables and c is a constant, then the graph of the equation f (x, y, z) = c is a
level surface of the function f .

8.3 Limits and Continuity

8.3.1 Neighborhoods in the Plane

In this section, we will study limits and continuity involving functions of two or three variables. We begin our discussion of the limit of a function of two variables by defining a two-dimensional analog to an interval on the real line. Using the formula for the distance between two points $(x, y)$ and $(x_0, y_0)$ in the plane, we can define the $\delta$-neighborhood about $(x_0, y_0)$ to be the disc centered at $(x_0, y_0)$ with radius $\delta > 0$
\[ \left\{ (x, y) : \sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta \right\}. \]
When this formula contains the less than inequality, $<$, the disc is called open, and when it contains the less than or equal to inequality, $\le$, the disc is called closed.

Figure 8.1: A closed disc

A point (x0 , y0 ) in a plane region R is an interior point of R if there exists a δ-neighborhood


about (x0 , y0 ) that lies entirely in R. If every point in R is an interior point, then R is an open
region. A point (x0 , y0 ) is a boundary point of R if every open disc centered at (x0 , y0 ) contains
points inside R and points outside R. If a region contains all its boundary points, then the region
is closed.

8.3.2 Limit of a Function of Two Variables

Let $f$ be a function of two variables defined on an open disc centered at $(x_0, y_0)$, except possibly at $(x_0, y_0)$, and let $L$ be a real number. Then
\[ \lim_{(x, y) \to (x_0, y_0)} f(x, y) = L \]
if for each $\varepsilon > 0$ there corresponds a $\delta > 0$ such that
\[ |f(x, y) - L| < \varepsilon \quad \text{whenever} \quad 0 < \sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta. \]

Definition
A function $f(x, y)$ has a limit $L$ as $(x, y)$ approaches $(x_0, y_0)$ if given any $\varepsilon > 0$ there exists $\delta > 0$ (depending on $\varepsilon$ and $(x_0, y_0)$) such that whenever $0 < (x - x_0)^2 + (y - y_0)^2 < \delta^2$, then $|f(x, y) - L| < \varepsilon$.

The definition of the limit of a function of two variables is similar to the definition of the limit
of a function of a single variable, yet there is a critical difference. For a function of two variables,
the statement (x, y) → (x0 , y0 ) means that the point (x, y) is allowed to approach (x0 , y0 ) from any
direction. If the value of
lim f (x, y)
(x,y)→(x0 ,y0 )

is not the same for all possible approaches or paths, to (x0 , y0 ), then the limit does not exist. We
usually choose convenient paths. Some of these are

1. along the $x$ or $y$ axis,

2. along straight lines,

3. along well-defined curves such as parabolas (e.g. $y = x^2$) or cubics (e.g. $y = x^3$).
Example 8.3.1. Show that
\[ \lim_{(x, y) \to (a, b)} x = a. \]

Solution: Let $f(x, y) = x$ and $L = a$. We need to show that for each $\varepsilon > 0$, there exists a $\delta$-neighborhood about $(a, b)$ such that
\[ |f(x, y) - L| = |x - a| < \varepsilon \]
whenever $(x, y) \ne (a, b)$ lies in the neighborhood. We can observe that from
\[ 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta \]
it follows that
\[ |f(x, y) - a| = |x - a| = \sqrt{(x - a)^2} \le \sqrt{(x - a)^2 + (y - b)^2} < \delta. \]
Thus, we can choose $\delta = \varepsilon$, and the limit is verified.

102
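Example 8.3.2. Prove that $\lim_{(x,y)\to(1,2)} (x^2 + 2y) = 5$.

Solution: Using the definition of limits, we must show that, given $\varepsilon > 0$, we can find a $\delta > 0$ such that $|x^2 + 2y - 5| < \varepsilon$ when $0 < |x-1| < \delta$ and $0 < |y-2| < \delta$. If $0 < |x-1| < \delta$ and $0 < |y-2| < \delta$, then
\[ 1 - \delta < x < 1 + \delta \quad \text{and} \quad 2 - \delta < y < 2 + \delta, \]
excluding $x = 1$ and $y = 2$. Taking $\delta \le 1$, so that $1 - \delta \ge 0$ and the first inequality may be squared, we get
\[ 1 - 2\delta + \delta^2 < x^2 < 1 + 2\delta + \delta^2 \quad \text{and} \quad 4 - 2\delta < 2y < 4 + 2\delta. \]
Adding,
\[ 5 - 4\delta + \delta^2 < x^2 + 2y < 5 + 4\delta + \delta^2, \]
or
\[ -4\delta + \delta^2 < x^2 + 2y - 5 < 4\delta + \delta^2. \]
Since $\delta \le 1$ gives $\delta^2 \le \delta$, it follows that
\[ -5\delta < x^2 + 2y - 5 < 5\delta, \]
i.e., $|x^2 + 2y - 5| < 5\delta$ whenever $0 < |x-1| < \delta$ and $0 < |y-2| < \delta$. Choosing $5\delta = \varepsilon$, i.e., $\delta = \varepsilon/5$, or $\delta = 1$, whichever is smaller, it follows that $|x^2 + 2y - 5| < \varepsilon$ when $0 < |x-1| < \delta$ and $0 < |y-2| < \delta$, i.e., $\lim_{(x,y)\to(1,2)} (x^2 + 2y) = 5$.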
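As an informal sanity check of the choice of δ in Example 8.3.2, the following minimal sketch samples a grid of points in the open square |x − 1| < δ, |y − 2| < δ and records the largest value of |x² + 2y − 5|. The function names `delta_for` and `worst_error` and the grid size are our own illustrative choices; a finite sample supports the proof but does not replace it.

```python
def delta_for(eps):
    """The delta chosen in Example 8.3.2: the smaller of 1 and eps/5."""
    return min(1.0, eps / 5.0)


def worst_error(eps, n=50):
    """Largest |x^2 + 2y - 5| on a grid strictly inside the square
    |x - 1| < delta, |y - 2| < delta (hypothetical helper, n is arbitrary)."""
    d = delta_for(eps)
    worst = 0.0
    for i in range(1, n):
        x = 1 - d + 2 * d * i / n
        for j in range(1, n):
            y = 2 - d + 2 * d * j / n
            worst = max(worst, abs(x * x + 2 * y - 5))
    return worst


for eps in (10.0, 1.0, 0.1, 0.001):
    print(eps, delta_for(eps), worst_error(eps))
```

In each case the printed worst error should stay below the corresponding ε, as the argument predicts.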

Example 8.3.3. Show that the following limit does not exist.
\[ \lim_{(x,y)\to(0,0)} \left( \frac{x^2 - y^2}{x^2 + y^2} \right)^2. \]

Solution: The domain of the function given by $f(x,y) = \left( \frac{x^2 - y^2}{x^2 + y^2} \right)^2$ consists of all points in the xy-plane except for the point $(0,0)$. To show that the limit as $(x,y)$ approaches $(0,0)$ does not exist, consider approaching $(0,0)$ along two different paths. Along the x-axis, every point is of the form $(x,0)$ and the limit along this approach is
\[ \lim_{(x,0)\to(0,0)} \left( \frac{x^2 - 0^2}{x^2 + 0^2} \right)^2 = \lim_{(x,0)\to(0,0)} (1)^2 = 1. \]

However, if $(x,y)$ approaches $(0,0)$ along the line $y = x$, we obtain
\[ \lim_{(x,x)\to(0,0)} \left( \frac{x^2 - x^2}{x^2 + x^2} \right)^2 = \lim_{(x,x)\to(0,0)} \left( \frac{0}{2x^2} \right)^2 = 0. \]

Hence, $f$ does not have a limit as $(x,y) \to (0,0)$.

Example 8.3.4. Show that $\lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2}$ does not exist.

Solution: Let $f(x,y) = \frac{xy}{x^2+y^2}$. The limits taken along the x-axis and the y-axis both exist and equal zero, which may lead us to suspect that $\lim_{(x,y)\to(0,0)} f(x,y)$ exists. However, we have not examined every path to $(0,0)$. We now try any line through the origin given by $y = mx$:
\[ \lim_{(x,y)\to(0,0)} f(x,y) = \lim_{(x,mx)\to(0,0)} \frac{mx^2}{x^2 + m^2x^2} = \frac{m}{1 + m^2}. \]
This limit changes as the gradient $m$ changes. For example, (i) on $y = 2x$, $\lim_{(x,y)\to(0,0)} f(x,y) = \frac{2}{5}$ and (ii) on $y = 5x$, $\lim_{(x,y)\to(0,0)} f(x,y) = \frac{5}{26}$. There is no single number $L$ that we can call the limit of $f$ as $(x,y) \to (0,0)$. So the limit does not exist.
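The path dependence in Example 8.3.4 can also be seen numerically. The sketch below, in which the helper name `value_along_line` is our own, evaluates f at points (x, mx) approaching the origin and compares the result with m/(1 + m²).

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x * x + y * y)


def value_along_line(m, x):
    """Value of f at the point (x, m*x) on the line y = m*x (illustrative helper)."""
    return f(x, m * x)


for m in (0.0, 1.0, 2.0, 5.0):
    # Approach the origin along y = m*x and compare with m / (1 + m^2).
    near_origin = value_along_line(m, 1e-6)
    print(m, near_origin, m / (1 + m * m))
```

Each line gives a different value near the origin, which is exactly why no single limit L can exist.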

8.3.3 Limit Combinations

In general, if $\lim_{(x,y)\to(x_0,y_0)} f(x,y) = L_1$ and $\lim_{(x,y)\to(x_0,y_0)} g(x,y) = L_2$, then, with every limit below taken as $(x,y) \to (x_0,y_0)$,

1. $\lim [f(x,y) \pm g(x,y)] = \lim f(x,y) \pm \lim g(x,y) = L_1 \pm L_2$.

2. $\lim [f(x,y) \cdot g(x,y)] = (\lim f(x,y))(\lim g(x,y)) = L_1 L_2$.

3. $\lim c f(x,y) = c \lim f(x,y) = cL_1$, where $c$ is any number.

4. $\lim \dfrac{f(x,y)}{g(x,y)} = \dfrac{\lim f(x,y)}{\lim g(x,y)} = \dfrac{L_1}{L_2}$, if $L_2 \neq 0$.
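Example 8.3.5. Evaluate $\lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^2)$.

Solution:
\begin{align*}
\lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^2) &= \lim_{(x,y)\to(2,1)} 2x + \lim_{(x,y)\to(2,1)} 5xy + \lim_{(x,y)\to(2,1)} (-3y^2) \\
&= \lim_{x\to 2} 2x + \Big(\lim_{x\to 2} 5x\Big)\Big(\lim_{y\to 1} y\Big) - 3 \lim_{y\to 1} y^2 \\
&= 2 \lim_{x\to 2} x + 5\Big(\lim_{x\to 2} x\Big)\Big(\lim_{y\to 1} y\Big) - 3\Big(\lim_{y\to 1} y\Big)\Big(\lim_{y\to 1} y\Big) \\
&= 2 \cdot 2 + 5 \cdot 2 \cdot 1 - 3 \cdot 1 \cdot 1 = 4 + 10 - 3 = 11.
\end{align*}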

8.4 Continuity of a Function of Two Variables

Notice that the limit of
\[ f(x,y) = \frac{5x^2 y}{x^2 + y^2} \]
as $(x,y) \to (1,2)$ can be evaluated by direct substitution. That is, the limit is $f(1,2) = 2$. In such cases the function $f$ is said to be continuous at the point $(1,2)$.

Definition of Continuity of a Function of Two Variables

A function $f$ of two variables is continuous at a point $(x_0,y_0)$ in an open region $R$ if

1. $f$ is defined at $(x_0,y_0)$,

2. $\lim_{(x,y)\to(x_0,y_0)} f(x,y)$ exists, and

3. $\lim_{(x,y)\to(x_0,y_0)} f(x,y) = f(x_0,y_0)$.

The function $f$ is continuous in the open region $R$ if it is continuous at every point in $R$.

Properties of Continuous Functions of Two Variables

If $k$ is a real number and $f$ and $g$ are continuous at $(x_0,y_0)$, then the following functions are continuous at $(x_0,y_0)$:

1. Scalar multiple $kf$.
2. Sum and difference $f \pm g$.
3. Product $fg$.
4. Quotient $\dfrac{f}{g}$, if $g(x_0,y_0) \neq 0$.

Polynomials and rational functions in two variables are continuous at any point at which they are
defined.

8.5 Partial Derivatives

8.5.1 Partial Derivatives of a Function of Two Variables

In applications of functions of several variables, the question often arises, "How will a function be affected by a change in one of its independent variables?" We can answer this by considering the independent variables one at a time. The process is called partial differentiation, and the result is referred to as the partial derivative of $f$ with respect to the chosen independent variable.¹

Definition of Partial Derivatives of a Function of Two Variables

If $z = f(x,y)$, then the first partial derivatives of $f$ with respect to $x$ and $y$ are the functions $f_x$ and $f_y$ defined by
\[ f_x(x,y) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x,y)}{\Delta x} \]
\[ f_y(x,y) = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x,y)}{\Delta y} \]
provided the limits exist. This definition indicates that if $z = f(x,y)$, then to find $f_x$ we consider $y$ constant and differentiate with respect to $x$. Similarly, to find $f_y$, we consider $x$ constant and differentiate with respect to $y$.
Example 8.5.1. Find $f_x$ and $f_y$ for $f(x,y) = 3x - x^2y^2 + 2x^3y$.

Solution: Considering $y$ to be constant and differentiating with respect to $x$ produces
\[ f_x(x,y) = 3 - 2xy^2 + 6x^2y. \]
Considering $x$ to be constant and differentiating with respect to $y$ produces
\[ f_y(x,y) = -2x^2y + 2x^3. \]
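The limit definition of the partial derivatives suggests a simple numerical check of Example 8.5.1. The sketch below approximates each partial derivative by a central difference quotient (a small variation on the one-sided quotient in the definition) and compares it with the formulas just obtained. The helper names, the step size h and the test point are illustrative assumptions.

```python
def f(x, y):
    return 3 * x - x**2 * y**2 + 2 * x**3 * y


def partial_x(g, x, y, h=1e-6):
    """Central difference in x with y held constant."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def partial_y(g, x, y, h=1e-6):
    """Central difference in y with x held constant."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


x0, y0 = 1.5, -0.5                        # arbitrary test point
fx_exact = 3 - 2 * x0 * y0**2 + 6 * x0**2 * y0
fy_exact = -2 * x0**2 * y0 + 2 * x0**3

print(partial_x(f, x0, y0), fx_exact)
print(partial_y(f, x0, y0), fy_exact)
```

The two numbers on each printed line should agree to several decimal places.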

Notation for Partial Derivatives

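For $z = f(x,y)$, the partial derivatives $f_x$ and $f_y$ are denoted by
\[ \frac{\partial}{\partial x} f(x,y) = f_x(x,y) = z_x = \frac{\partial z}{\partial x} \]
and
\[ \frac{\partial}{\partial y} f(x,y) = f_y(x,y) = z_y = \frac{\partial z}{\partial y}. \]
The first partials evaluated at the point $(a,b)$ are denoted by
\[ \left. \frac{\partial z}{\partial x} \right|_{(a,b)} = f_x(a,b) \]
and
\[ \left. \frac{\partial z}{\partial y} \right|_{(a,b)} = f_y(a,b). \]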
¹ The introduction of partial derivatives followed Newton's and Leibniz's work in calculus by several years. Between 1730 and 1760, Leonhard Euler and Jean Le Rond d'Alembert (1717-1783) separately published several papers on dynamics, in which they established much of the theory of partial derivatives.

Example 8.5.2. For $f(x,y) = xe^{x^2y}$, find $f_x$ and $f_y$ and evaluate each at the point $(1, \ln 2)$.

Solution: Because
\[ f_x(x,y) = xe^{x^2y}(2xy) + e^{x^2y} \]
the partial derivative of $f$ with respect to $x$ at $(1, \ln 2)$ is
\[ f_x(1, \ln 2) = e^{\ln 2}(2 \ln 2) + e^{\ln 2} = 4 \ln 2 + 2. \]

Because
\[ f_y(x,y) = xe^{x^2y}(x^2) = x^3 e^{x^2y} \]
the partial derivative of $f$ with respect to $y$ at $(1, \ln 2)$ is
\[ f_y(1, \ln 2) = e^{\ln 2} = 2. \]

8.5.2 Partial Derivatives of a function of Three or More Variables

The concept of a partial derivative can be extended naturally to functions of three or more variables.
For instance, if w = f (x, y, z), then there are three partial derivatives, each of which is formed by
holding two of three variables constant.
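\[ \frac{\partial w}{\partial x} = f_x(x,y,z) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y, z) - f(x,y,z)}{\Delta x} \]
\[ \frac{\partial w}{\partial y} = f_y(x,y,z) = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y, z) - f(x,y,z)}{\Delta y} \]
\[ \frac{\partial w}{\partial z} = f_z(x,y,z) = \lim_{\Delta z \to 0} \frac{f(x, y, z + \Delta z) - f(x,y,z)}{\Delta z}. \]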
Example 8.5.3. (i) To find the partial derivative of $f(x,y,z) = xy + yz^2 + xz$ with respect to $z$, consider $x$ and $y$ to be constant and obtain
\[ \frac{\partial}{\partial z} \left[ xy + yz^2 + xz \right] = 2yz + x. \]

(ii) To find the partial derivative of $f(x,y,z) = z \sin(xy^2 + 2z)$ with respect to $z$, consider $x$ and $y$ to be constant. Then, using the Product Rule, we obtain
\begin{align*}
\frac{\partial}{\partial z} \left[ z \sin(xy^2 + 2z) \right] &= (z) \frac{\partial}{\partial z} [\sin(xy^2 + 2z)] + \sin(xy^2 + 2z) \frac{\partial}{\partial z} [z] \\
&= (z)[\cos(xy^2 + 2z)](2) + \sin(xy^2 + 2z) \\
&= 2z \cos(xy^2 + 2z) + \sin(xy^2 + 2z).
\end{align*}

(iii) To find the partial derivative of $f(x,y,z,w) = \dfrac{x + y + z}{w}$ with respect to $w$, consider $x$, $y$ and $z$ to be constant and obtain
\[ \frac{\partial}{\partial w} \left[ \frac{x + y + z}{w} \right] = -\frac{x + y + z}{w^2}. \]

8.5.3 Higher-Order Partial Derivatives

It is possible to take second, third and higher partial derivatives of a function of several variables,
provided such derivatives exist. Higher-order derivatives are denoted by the order in which the
differentiation occurs. For instance, the function z = f (x, y) has the following second partial
derivatives.

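1. Differentiate twice with respect to $x$:
\[ \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial x^2} = f_{xx}. \]

2. Differentiate twice with respect to $y$:
\[ \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial y} \right) = \frac{\partial^2 f}{\partial y^2} = f_{yy}. \]

3. Differentiate first with respect to $x$ and then with respect to $y$:
\[ \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial y \partial x} = f_{xy}. \]

4. Differentiate first with respect to $y$ and then with respect to $x$:
\[ \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial y} \right) = \frac{\partial^2 f}{\partial x \partial y} = f_{yx}. \]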

The third and fourth cases are called mixed partial derivatives.

Example 8.5.4. Find the second partial derivatives of $f(x,y) = 3xy^2 - 2y + 5x^2y^2$ and determine the value of $f_{xy}(-1, 2)$.

Solution: Begin by finding the first partial derivatives with respect to $x$ and $y$.
\[ f_x(x,y) = 3y^2 + 10xy^2 \quad \text{and} \quad f_y(x,y) = 6xy - 2 + 10x^2y. \]

Then, differentiate each of these with respect to $x$ and $y$.
\[ f_{xx}(x,y) = 10y^2, \quad f_{yy}(x,y) = 6x + 10x^2, \quad f_{xy}(x,y) = 6y + 20xy \quad \text{and} \quad f_{yx}(x,y) = 6y + 20xy. \]

At $(-1, 2)$, the value of $f_{xy}$ is $f_{xy}(-1, 2) = 12 - 40 = -28$.

Notice that the two mixed partials are equal. Sufficient conditions for this occurrence are given in
the next theorem.

Theorem 8.5.1 (Equality of Mixed Partial Derivatives). If $f$ is a function of $x$ and $y$ such that $f_{xy}$ and $f_{yx}$ are continuous on an open disc $R$, then for every $(x,y)$ in $R$,
\[ f_{xy}(x,y) = f_{yx}(x,y). \]

Under these conditions, the order of differentiation of the mixed partial derivatives is irrelevant.
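To illustrate the theorem on the function of Example 8.5.4, the following sketch builds the two mixed partials numerically by nesting central differences in the two possible orders. The helpers `d_dx` and `d_dy` and the step size are our own assumptions; this is a numerical illustration at one point, not a proof.

```python
def f(x, y):
    return 3 * x * y**2 - 2 * y + 5 * x**2 * y**2


def d_dx(g, h=1e-3):
    """Return a function approximating the partial derivative of g in x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, h=1e-3):
    """Return a function approximating the partial derivative of g in y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)


f_xy = d_dy(d_dx(f))   # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))   # differentiate in y first, then in x

print(f_xy(-1.0, 2.0), f_yx(-1.0, 2.0))
```

Both printed values should be close to the exact value −28 found in Example 8.5.4.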

Example 8.5.5. Show that $f_{xz} = f_{zx}$ and $f_{xzz} = f_{zxz} = f_{zzx}$ for the function given by
\[ f(x,y,z) = ye^x + x \ln z. \]

Solution: First partials:
\[ f_x(x,y,z) = ye^x + \ln z, \quad f_z(x,y,z) = \frac{x}{z}. \]
Second partials (note that the first two are equal):
\[ f_{xz}(x,y,z) = \frac{1}{z}, \quad f_{zx}(x,y,z) = \frac{1}{z}, \quad f_{zz}(x,y,z) = -\frac{x}{z^2}. \]
Third partials (note that all three are equal):
\[ f_{xzz}(x,y,z) = -\frac{1}{z^2}, \quad f_{zxz}(x,y,z) = -\frac{1}{z^2}, \quad f_{zzx}(x,y,z) = -\frac{1}{z^2}. \]

8.6 Total Differential

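If $z = f(x,y)$ and $\Delta x$ and $\Delta y$ are increments of $x$ and $y$, then the differentials of the independent variables $x$ and $y$ are
\[ dx = \Delta x \quad \text{and} \quad dy = \Delta y \]
and the total differential of the dependent variable $z$ is
\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = f_x(x,y)\,dx + f_y(x,y)\,dy. \]
This definition can be extended to a function of three or more variables. For example, if $w = f(x,y,z,u)$, then $dx = \Delta x$, $dy = \Delta y$, $dz = \Delta z$, $du = \Delta u$, and the total differential of $w$ is
\[ dw = \frac{\partial w}{\partial x}\,dx + \frac{\partial w}{\partial y}\,dy + \frac{\partial w}{\partial z}\,dz + \frac{\partial w}{\partial u}\,du. \]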

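Example 8.6.1. (i) The total differential $dz$ for $z = 2x \sin y - 3x^2y^2$ is
\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = (2 \sin y - 6xy^2)\,dx + (2x \cos y - 6x^2y)\,dy. \]

(ii) The total differential $dw$ for $w = x^2 + y^2 + z^2$ is
\[ dw = \frac{\partial w}{\partial x}\,dx + \frac{\partial w}{\partial y}\,dy + \frac{\partial w}{\partial z}\,dz = 2x\,dx + 2y\,dy + 2z\,dz. \]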
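For small increments, the total differential dz is a good approximation to the actual change Δz. The sketch below compares the two for the function in Example 8.6.1 (i); the base point and the increments are arbitrary values chosen only for illustration.

```python
import math


def z(x, y):
    return 2 * x * math.sin(y) - 3 * x**2 * y**2


def total_differential(x, y, dx, dy):
    """dz = (2 sin y - 6xy^2) dx + (2x cos y - 6x^2 y) dy."""
    return ((2 * math.sin(y) - 6 * x * y**2) * dx
            + (2 * x * math.cos(y) - 6 * x**2 * y) * dy)


x0, y0 = 1.0, 0.5          # arbitrary base point
dx, dy = 1e-3, -2e-3       # small, arbitrary increments

actual_change = z(x0 + dx, y0 + dy) - z(x0, y0)
approx_change = total_differential(x0, y0, dx, dy)
print(actual_change, approx_change)
```

The difference between the two values is of second order in the increments, so it shrinks much faster than dx and dy themselves.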
Theorem 8.6.1. If a function of x and y is differentiable at (x0 , y0 ), then it is continuous at
(x0 , y0 ).

Note that the existence of fx and fy is not sufficient to guarantee differentiability.


Example 8.6.2. Show that $f_x(0,0)$ and $f_y(0,0)$ both exist, but that $f$ is not differentiable at $(0,0)$, where $f$ is defined as
\[ f(x,y) = \begin{cases} \dfrac{-3xy}{x^2 + y^2}, & (x,y) \neq (0,0) \\ 0, & (x,y) = (0,0). \end{cases} \]

Solution: We can show that $f$ is not differentiable at $(0,0)$ by showing that it is not continuous at this point. To see that $f$ is not continuous at $(0,0)$, look at the values of $f(x,y)$ along two different approaches to $(0,0)$. Along the line $y = x$, the limit is
\[ \lim_{(x,x)\to(0,0)} f(x,y) = \lim_{(x,x)\to(0,0)} \frac{-3x^2}{2x^2} = -\frac{3}{2} \]
whereas along $y = -x$ we have
\[ \lim_{(x,-x)\to(0,0)} f(x,y) = \lim_{(x,-x)\to(0,0)} \frac{3x^2}{2x^2} = \frac{3}{2}. \]
Thus, the limit of $f(x,y)$ as $(x,y) \to (0,0)$ does not exist, and we can conclude that $f$ is not continuous at $(0,0)$. Hence $f$ is not differentiable at $(0,0)$. On the other hand, by the definition of the partial derivatives $f_x$ and $f_y$, we have
\[ f_x(0,0) = \lim_{\Delta x \to 0} \frac{f(\Delta x, 0) - f(0,0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{0 - 0}{\Delta x} = 0 \]
and
\[ f_y(0,0) = \lim_{\Delta y \to 0} \frac{f(0, \Delta y) - f(0,0)}{\Delta y} = \lim_{\Delta y \to 0} \frac{0 - 0}{\Delta y} = 0. \]

Thus, the partial derivatives at $(0,0)$ exist.

8.7 Chain Rules for Functions of Several Variables

Theorem 8.7.1. Let $w = f(x,y)$, where $f$ is a differentiable function of $x$ and $y$. If $x = g(t)$ and $y = h(t)$, where $g$ and $h$ are differentiable functions of $t$, then $w$ is a differentiable function of $t$, and
\[ \frac{dw}{dt} = \frac{\partial w}{\partial x} \frac{dx}{dt} + \frac{\partial w}{\partial y} \frac{dy}{dt}. \]

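[Figure 8.2: Chain rule: one independent variable. Tree diagram: $w$ branches to $x$ and $y$ (branches labelled $\partial w/\partial x$ and $\partial w/\partial y$), and each of $x$ and $y$ branches to $t$ (branches labelled $dx/dt$ and $dy/dt$).]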

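Example 8.7.1. Let $w = x^2y - y^2$, where $x = \sin t$ and $y = e^t$. Find $\dfrac{dw}{dt}$ at $t = 0$.

Solution: By the Chain Rule for one independent variable, we have
\begin{align*}
\frac{dw}{dt} &= \frac{\partial w}{\partial x} \frac{dx}{dt} + \frac{\partial w}{\partial y} \frac{dy}{dt} \\
&= 2xy(\cos t) + (x^2 - 2y)e^t.
\end{align*}

When $t = 0$, we have $x = 0$ and $y = 1$, and it follows that
\[ \frac{dw}{dt} = 0 - 2 = -2. \]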
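The result of Example 8.7.1 can be cross-checked by substituting x = sin t and y = e^t into w and differentiating the resulting function of t numerically. In the sketch below, the function names and the step size h are illustrative assumptions.

```python
import math


def w_of_t(t):
    """w written directly as a function of t by substitution."""
    x, y = math.sin(t), math.exp(t)
    return x * x * y - y * y


def dw_dt_chain(t):
    """dw/dt from the Chain Rule: 2xy cos t + (x^2 - 2y) e^t."""
    x, y = math.sin(t), math.exp(t)
    return 2 * x * y * math.cos(t) + (x * x - 2 * y) * math.exp(t)


def dw_dt_numeric(t, h=1e-6):
    """Central difference approximation to dw/dt."""
    return (w_of_t(t + h) - w_of_t(t - h)) / (2 * h)


print(dw_dt_chain(0.0), dw_dt_numeric(0.0))
```

Both values should be close to −2, in agreement with the worked solution.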

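The Chain Rule can be extended to any number of variables. For example, if each $x_i$ is a differentiable function of a single variable $t$, then for $w = f(x_1, x_2, \ldots, x_n)$, we have
\[ \frac{dw}{dt} = \frac{\partial w}{\partial x_1} \frac{dx_1}{dt} + \frac{\partial w}{\partial x_2} \frac{dx_2}{dt} + \cdots + \frac{\partial w}{\partial x_n} \frac{dx_n}{dt}. \]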
Another type of composite function is one in which the intermediate variables are themselves functions of more than one variable. For example, if $w = f(x,y)$, where $x = g(s,t)$ and $y = h(s,t)$, then it follows that $w$ is a function of $s$ and $t$, and we consider the partial derivatives of $w$ with respect to $s$ and $t$. One way to find these partial derivatives is to write $w$ as a function of $s$ and $t$ explicitly by substituting the equations $x = g(s,t)$ and $y = h(s,t)$ into the equation $w = f(x,y)$, then find the partial derivatives in the usual way.
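Example 8.7.2. Find $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ for $w = 2xy$, where $x = s^2 + t^2$ and $y = \dfrac{s}{t}$.

Solution: Begin by substituting $x = s^2 + t^2$ and $y = \dfrac{s}{t}$ into the equation $w = 2xy$ to obtain
\[ w = 2xy = 2(s^2 + t^2)\left( \frac{s}{t} \right) = 2\left( \frac{s^3}{t} + st \right). \]

Then, to find $\dfrac{\partial w}{\partial s}$, hold $t$ constant and differentiate with respect to $s$.
\[ \frac{\partial w}{\partial s} = 2\left( \frac{3s^2}{t} + t \right) = \frac{6s^2 + 2t^2}{t}. \]

Similarly, to find $\dfrac{\partial w}{\partial t}$, hold $s$ constant and differentiate with respect to $t$ to obtain
\[ \frac{\partial w}{\partial t} = 2\left( -\frac{s^3}{t^2} + s \right) = 2\left( \frac{-s^3 + st^2}{t^2} \right) = \frac{2st^2 - 2s^3}{t^2}. \]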

The following theorem gives an alternative method for finding the partial derivatives without explicitly writing $w$ as a function of $s$ and $t$.

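Theorem 8.7.2 (Chain Rule: Two Independent Variables). Let $w = f(x,y)$, where $f$ is a differentiable function of $x$ and $y$. If $x = g(s,t)$ and $y = h(s,t)$ such that the first partials $\dfrac{\partial x}{\partial s}$, $\dfrac{\partial x}{\partial t}$, $\dfrac{\partial y}{\partial s}$ and $\dfrac{\partial y}{\partial t}$ all exist, then $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ exist and are given by
\[ \frac{\partial w}{\partial s} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial s} \quad \text{and} \quad \frac{\partial w}{\partial t} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial t}. \]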

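Example 8.7.3. Use the Chain Rule to find $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ for $w = 2xy$, where $x = s^2 + t^2$ and $y = \dfrac{s}{t}$.

Solution: We can hold $t$ constant and differentiate with respect to $s$ to obtain
\begin{align*}
\frac{\partial w}{\partial s} &= \frac{\partial w}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial s} \\
&= 2y(2s) + 2x\left( \frac{1}{t} \right) \\
&= 4\,\frac{s^2}{t} + \frac{2s^2 + 2t^2}{t} \\
&= \frac{6s^2 + 2t^2}{t}.
\end{align*}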

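[Figure 8.3: Chain Rule: Two independent variables. Tree diagram: $w$ branches to $x$ and $y$ (branches labelled $\partial w/\partial x$ and $\partial w/\partial y$), and each of $x$ and $y$ branches to $t$ and $s$ (branches labelled $\partial x/\partial t$, $\partial x/\partial s$, $\partial y/\partial t$ and $\partial y/\partial s$).]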

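Similarly, holding $s$ fixed gives
\begin{align*}
\frac{\partial w}{\partial t} &= \frac{\partial w}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial t} \\
&= 2y(2t) + 2x\left( \frac{-s}{t^2} \right) \\
&= 2\left( \frac{s}{t} \right)(2t) + 2(s^2 + t^2)\left( \frac{-s}{t^2} \right) \\
&= 4s - \frac{2s^3 + 2st^2}{t^2} \\
&= \frac{4st^2 - 2s^3 - 2st^2}{t^2} \\
&= \frac{2st^2 - 2s^3}{t^2}.
\end{align*}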
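Example 8.7.4. Find $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ when $s = 1$ and $t = 2\pi$ for the function given by $w = xy + yz + xz$, where $x = s \cos t$, $y = s \sin t$ and $z = t$.

Solution: By extending the theorem to three intermediate variables, we have
\begin{align*}
\frac{\partial w}{\partial s} &= \frac{\partial w}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial s} + \frac{\partial w}{\partial z} \frac{\partial z}{\partial s} \\
&= (y + z)(\cos t) + (x + z)(\sin t) + (y + x)(0) \\
&= (y + z)(\cos t) + (x + z)(\sin t).
\end{align*}
When $s = 1$ and $t = 2\pi$, we have $x = 1$, $y = 0$ and $z = 2\pi$. Therefore
\[ \frac{\partial w}{\partial s} = (0 + 2\pi)(1) + (1 + 2\pi)(0) = 2\pi. \]
Furthermore,
\begin{align*}
\frac{\partial w}{\partial t} &= \frac{\partial w}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial t} + \frac{\partial w}{\partial z} \frac{\partial z}{\partial t} \\
&= (y + z)(-s \sin t) + (x + z)(s \cos t) + (y + x)(1)
\end{align*}
and for $s = 1$ and $t = 2\pi$ it follows that
\[ \frac{\partial w}{\partial t} = (0 + 2\pi)(0) + (1 + 2\pi)(1) + (0 + 1)(1) = 2 + 2\pi. \]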
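As a check on Example 8.7.4, the sketch below writes w directly in terms of s and t and approximates the two partial derivatives by central differences. The function names and the step size are illustrative assumptions.

```python
import math


def w(s, t):
    """w = xy + yz + xz with x = s cos t, y = s sin t, z = t."""
    x, y, z = s * math.cos(t), s * math.sin(t), t
    return x * y + y * z + x * z


def dw_ds(s, t, h=1e-6):
    """Central difference in s with t held constant."""
    return (w(s + h, t) - w(s - h, t)) / (2 * h)


def dw_dt(s, t, h=1e-6):
    """Central difference in t with s held constant."""
    return (w(s, t + h) - w(s, t - h)) / (2 * h)


s0, t0 = 1.0, 2 * math.pi
print(dw_ds(s0, t0), 2 * math.pi)
print(dw_dt(s0, t0), 2 + 2 * math.pi)
```

The numerical values should agree with 2π and 2 + 2π to several decimal places.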
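Example 8.7.5. If $u = x + y$ and $v = xy$, find $\dfrac{\partial x}{\partial u}$, $\dfrac{\partial x}{\partial v}$, $\dfrac{\partial y}{\partial u}$ and $\dfrac{\partial y}{\partial v}$.

Solution: Clearly $du = dx + dy$ and $dv = y\,dx + x\,dy$, and hence $-y\,dx = x\,dy - dv$ and $x\,dx = x\,du - x\,dy$. Adding these two equations yields $(x - y)\,dx = x\,du - dv$. Also, $x\,dy = dv - y\,dx$ and $-y\,dy = y\,dx - y\,du$, hence $(x - y)\,dy = dv - y\,du$. Thus, for $x \neq y$,
\[ dx = \frac{x}{x - y}\,du - \frac{1}{x - y}\,dv \]
and
\[ dy = \frac{1}{x - y}\,dv - \frac{y}{x - y}\,du. \]
Consequently, we have
\[ \frac{\partial x}{\partial u} = \frac{x}{x - y}, \quad \frac{\partial x}{\partial v} = \frac{-1}{x - y}, \quad \frac{\partial y}{\partial u} = \frac{-y}{x - y}, \quad \frac{\partial y}{\partial v} = \frac{1}{x - y}. \]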

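Example 8.7.6. Parabolic co-ordinates $(u,v)$ are defined implicitly in terms of the Cartesian co-ordinates $(x,y)$ by the pair of equations
\[ x = \frac{u^2 - v^2}{2}, \quad y = uv. \]
Obtain expressions for $\dfrac{\partial u}{\partial y}$, $\dfrac{\partial v}{\partial x}$ and $\dfrac{\partial v}{\partial y}$ in terms of $u$ and $v$ and verify that
\[ \frac{\partial u}{\partial x} \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y} \frac{\partial v}{\partial y} = 0. \]
Given that $f(x,y) = \varphi(u,v)$, obtain expressions for $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ in terms of $\dfrac{\partial \varphi}{\partial u}$, $\dfrac{\partial \varphi}{\partial v}$, $u$ and $v$, and deduce that
\[ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 = \frac{1}{u^2 + v^2} \left[ \left( \frac{\partial \varphi}{\partial u} \right)^2 + \left( \frac{\partial \varphi}{\partial v} \right)^2 \right]. \]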

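Solution: Since $x = \dfrac{u^2 - v^2}{2}$ and $y = uv$, we have $dx = u\,du - v\,dv$ and $dy = v\,du + u\,dv$. Multiplying the first by $u$ and the second by $v$, we have
\[ u\,dx = u^2\,du - uv\,dv, \quad v\,dy = v^2\,du + uv\,dv. \]
On adding them, we obtain
\[ (u^2 + v^2)\,du = u\,dx + v\,dy. \tag{8.1} \]
Also, multiplying the first by $v$ and the second by $u$, we get
\[ v\,dx = uv\,du - v^2\,dv, \quad u\,dy = uv\,du + u^2\,dv. \]
Subtracting the first of these from the second, we obtain
\[ (u^2 + v^2)\,dv = u\,dy - v\,dx. \tag{8.2} \]
From (8.1) and (8.2), we have
\[ du = \frac{u}{u^2 + v^2}\,dx + \frac{v}{u^2 + v^2}\,dy \]
and
\[ dv = \frac{u}{u^2 + v^2}\,dy - \frac{v}{u^2 + v^2}\,dx. \]
Comparing these equations with $du = \dfrac{\partial u}{\partial x}\,dx + \dfrac{\partial u}{\partial y}\,dy$ and $dv = \dfrac{\partial v}{\partial x}\,dx + \dfrac{\partial v}{\partial y}\,dy$, we get
\[ \frac{\partial u}{\partial x} = \frac{u}{u^2 + v^2}, \quad \frac{\partial u}{\partial y} = \frac{v}{u^2 + v^2}, \quad \frac{\partial v}{\partial y} = \frac{u}{u^2 + v^2}, \quad \frac{\partial v}{\partial x} = -\frac{v}{u^2 + v^2}. \]
Now
\[ \frac{\partial u}{\partial x} \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y} \frac{\partial v}{\partial y} = \frac{-uv}{(u^2 + v^2)^2} + \frac{uv}{(u^2 + v^2)^2} = 0. \]
Given that $f(x,y) = \varphi(u,v)$, it follows that $u$ and $v$ are implicit functions of $x$ and $y$ and so
\[ \frac{\partial f}{\partial x} = \frac{\partial \varphi}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial \varphi}{\partial v} \frac{\partial v}{\partial x} = \frac{u}{u^2 + v^2} \frac{\partial \varphi}{\partial u} - \frac{v}{u^2 + v^2} \frac{\partial \varphi}{\partial v} \]
and
\[ \frac{\partial f}{\partial y} = \frac{\partial \varphi}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial \varphi}{\partial v} \frac{\partial v}{\partial y} = \frac{v}{u^2 + v^2} \frac{\partial \varphi}{\partial u} + \frac{u}{u^2 + v^2} \frac{\partial \varphi}{\partial v}. \]
Now
\[ \left( \frac{\partial f}{\partial x} \right)^2 = \frac{u^2}{(u^2 + v^2)^2} \left( \frac{\partial \varphi}{\partial u} \right)^2 - \frac{2uv}{(u^2 + v^2)^2} \frac{\partial \varphi}{\partial u} \frac{\partial \varphi}{\partial v} + \frac{v^2}{(u^2 + v^2)^2} \left( \frac{\partial \varphi}{\partial v} \right)^2 \]
\[ \left( \frac{\partial f}{\partial y} \right)^2 = \frac{v^2}{(u^2 + v^2)^2} \left( \frac{\partial \varphi}{\partial u} \right)^2 + \frac{2uv}{(u^2 + v^2)^2} \frac{\partial \varphi}{\partial u} \frac{\partial \varphi}{\partial v} + \frac{u^2}{(u^2 + v^2)^2} \left( \frac{\partial \varphi}{\partial v} \right)^2. \]
Hence, adding these two equations,
\[ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 = \frac{1}{u^2 + v^2} \left[ \left( \frac{\partial \varphi}{\partial u} \right)^2 + \left( \frac{\partial \varphi}{\partial v} \right)^2 \right]. \]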

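Example 8.7.7. Let $w = f(x,y)$, where $x$ and $y$ are given in polar coordinates by the equations $x = r \cos \theta$ and $y = r \sin \theta$. Calculate $\dfrac{\partial w}{\partial r}$, $\dfrac{\partial w}{\partial \theta}$ and $\dfrac{\partial^2 w}{\partial r^2}$ in terms of $r$ and $\theta$ and the partial derivatives of $w$ with respect to $x$ and $y$.

Solution: Here $x$ and $y$ are intermediate variables, while the independent variables are $r$ and $\theta$. First note that
\[ \frac{\partial x}{\partial r} = \cos \theta, \quad \frac{\partial y}{\partial r} = \sin \theta, \quad \frac{\partial x}{\partial \theta} = -r \sin \theta \quad \text{and} \quad \frac{\partial y}{\partial \theta} = r \cos \theta. \]
Then
\[ \frac{\partial w}{\partial r} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial r} = \frac{\partial w}{\partial x} \cos \theta + \frac{\partial w}{\partial y} \sin \theta \]
and
\[ \frac{\partial w}{\partial \theta} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial \theta} = -r \frac{\partial w}{\partial x} \sin \theta + r \frac{\partial w}{\partial y} \cos \theta. \]
Next,
\[ \frac{\partial^2 w}{\partial r^2} = \frac{\partial}{\partial r} \left( \frac{\partial w}{\partial r} \right) = \frac{\partial}{\partial r} \left( \frac{\partial w}{\partial x} \cos \theta + \frac{\partial w}{\partial y} \sin \theta \right) = \frac{\partial w_x}{\partial r} \cos \theta + \frac{\partial w_y}{\partial r} \sin \theta, \]
where $w_x = \dfrac{\partial w}{\partial x}$ and $w_y = \dfrac{\partial w}{\partial y}$. Therefore
\begin{align*}
\frac{\partial^2 w}{\partial r^2} &= \left( \frac{\partial w_x}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial w_x}{\partial y} \frac{\partial y}{\partial r} \right) \cos \theta + \left( \frac{\partial w_y}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial w_y}{\partial y} \frac{\partial y}{\partial r} \right) \sin \theta \\
&= \left( \frac{\partial^2 w}{\partial x^2} \cos \theta + \frac{\partial^2 w}{\partial y \partial x} \sin \theta \right) \cos \theta + \left( \frac{\partial^2 w}{\partial x \partial y} \cos \theta + \frac{\partial^2 w}{\partial y^2} \sin \theta \right) \sin \theta.
\end{align*}
Finally, because $w_{yx} = w_{xy}$, we get
\[ \frac{\partial^2 w}{\partial r^2} = \frac{\partial^2 w}{\partial x^2} \cos^2 \theta + 2 \frac{\partial^2 w}{\partial x \partial y} \cos \theta \sin \theta + \frac{\partial^2 w}{\partial y^2} \sin^2 \theta. \]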

8.8 Extrema of Functions of Two Variables

8.8.1 Absolute Extrema and Relative Extrema

Theorem 8.8.1 (Extreme Value Theorem). Let f be a continuous function of two variables x and
y defined on a closed bounded region R in the xy-plane.

1. There is at least one point in R where f takes on a minimum value.


2. There is at least one point in R where f takes on a maximum value.

A minimum value is also called an absolute minimum and a maximum is also called an absolute
maximum.

Definition of Relative Extrema

Let f be a function defined on a region R containing (x0 , y0 ).

1. The function $f$ has a relative minimum at $(x_0,y_0)$ if
\[ f(x,y) \ge f(x_0,y_0) \]
for all $(x,y)$ in an open disc containing $(x_0,y_0)$.

2. The function $f$ has a relative maximum at $(x_0,y_0)$ if
\[ f(x,y) \le f(x_0,y_0) \]
for all $(x,y)$ in an open disc containing $(x_0,y_0)$.

To locate relative extrema of $f$, we can investigate the points at which both first partial derivatives of $f$ are 0, that is, at which the gradient of $f$ is 0. Such points are called critical points of $f$.

Definition of Critical Point

Let $f$ be defined on an open region $R$ containing $(x_0,y_0)$. The point $(x_0,y_0)$ is a critical point of $f$ if one of the following is true.

1. $f_x(x_0,y_0) = 0$ and $f_y(x_0,y_0) = 0$.

2. $f_x(x_0,y_0)$ or $f_y(x_0,y_0)$ does not exist.

Theorem 8.8.2. If f has a relative extremum at (x0 , y0 ) on an open region R, then (x0 , y0 ) is a
critical point of f .
Example 8.8.1. Determine the relative extrema of
\[ f(x,y) = 2x^2 + y^2 + 8x - 6y + 20. \]

Solution: Begin by finding the critical points of $f$. Because $f_x(x,y) = 4x + 8$ and $f_y(x,y) = 2y - 6$ are defined for all $x$ and $y$, the only critical points are those for which both first partial derivatives are 0. To locate these points, let $f_x(x,y)$ and $f_y(x,y)$ be 0, and solve the system of equations
\[ 4x + 8 = 0 \quad \text{and} \quad 2y - 6 = 0 \]
to obtain the critical point $(-2, 3)$. By completing the square, we can conclude that for all $(x,y) \neq (-2, 3)$,
\[ f(x,y) = 2(x + 2)^2 + (y - 3)^2 + 3 > 3. \]
Therefore, a relative minimum of $f$ occurs at $(-2, 3)$. The value of the relative minimum is $f(-2, 3) = 3$.

The above example shows a relative minimum occurring at one type of critical point, the type for
which both fx (x, y) and fy (x, y) are 0. The next example concerns a relative maximum that occurs
at the other type of critical point, the type for which either fx (x, y) or fy (x, y) is undefined.
Example 8.8.2. Determine the relative extrema of $f(x,y) = 1 - (x^2 + y^2)^{1/3}$.

Solution: Because
\[ f_x(x,y) = -\frac{2x}{3(x^2 + y^2)^{2/3}} \quad \text{and} \quad f_y(x,y) = -\frac{2y}{3(x^2 + y^2)^{2/3}} \]
it follows that both partial derivatives are defined for all points in the xy-plane except for $(0,0)$. Moreover, because the partial derivatives cannot both be 0 unless both $x$ and $y$ are 0, we can conclude that $(0,0)$ is the only critical point. Note that $f(0,0) = 1$, and for all other $(x,y)$ it is clear that
\[ f(x,y) = 1 - (x^2 + y^2)^{1/3} < 1. \]

Therefore, $f$ has a relative maximum at $(0,0)$.

The Second Partials Test

Some critical points yield saddle points, which are neither relative maxima nor relative minima.

Theorem 8.8.3. Let f have continuous second partial derivatives on an open region containing a
point (a, b) for which
fx (a, b) = 0 and fy (a, b) = 0.
To test for relative extrema of f , we define the quantity

d = fxx (a, b)fyy (a, b) − [fxy (a, b)]2 .

1. If d > 0 and fxx (a, b) > 0, then f has a relative minimum at (a, b).

2. If d > 0 and fxx (a, b) < 0, then f has a relative maximum at (a, b).

3. If d < 0, then (a, b, f (a, b)) is a saddle point.

4. The test is inconclusive if d = 0.

A convenient device for remembering the formula for $d$ in the Second Partials Test is given by the $2 \times 2$ determinant
\[ d = \begin{vmatrix} f_{xx}(a,b) & f_{xy}(a,b) \\ f_{yx}(a,b) & f_{yy}(a,b) \end{vmatrix} \]
where $f_{xy}(a,b) = f_{yx}(a,b)$.

Example 8.8.3. Find the relative extrema of $f(x,y) = -x^3 + 4xy - 2y^2 + 1$.

Solution: Begin by finding the critical points of $f$. Because
\[ f_x(x,y) = -3x^2 + 4y \quad \text{and} \quad f_y(x,y) = 4x - 4y \]
are defined for all $x$ and $y$, the only critical points are those for which both first partial derivatives are 0. Solving the equations $-3x^2 + 4y = 0$ and $4x - 4y = 0$, we see from the second equation that $x = y$, and by substitution into the first equation, we obtain two solutions, $y = x = 0$ and $y = x = \frac{4}{3}$. Because
\[ f_{xx}(x,y) = -6x, \quad f_{yy}(x,y) = -4, \quad f_{xy}(x,y) = 4 \]
it follows that, for the critical point $(0,0)$,
\[ d = f_{xx}(0,0) f_{yy}(0,0) - [f_{xy}(0,0)]^2 = 0 - 16 < 0 \]
and, by the Second Partials Test, we can conclude that $(0,0)$ is a saddle point of $f$. Furthermore, for the critical point $\left( \frac{4}{3}, \frac{4}{3} \right)$,
\[ d = f_{xx}\left( \tfrac{4}{3}, \tfrac{4}{3} \right) f_{yy}\left( \tfrac{4}{3}, \tfrac{4}{3} \right) - \left[ f_{xy}\left( \tfrac{4}{3}, \tfrac{4}{3} \right) \right]^2 = -8(-4) - 16 = 16 > 0 \]
and because $f_{xx}\left( \frac{4}{3}, \frac{4}{3} \right) = -8 < 0$ we can conclude that $f$ has a relative maximum at $\left( \frac{4}{3}, \frac{4}{3} \right)$.
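The Second Partials Test is mechanical enough to express as a short routine. The sketch below, in which the function names and the returned labels are our own choices, classifies a critical point from the three second partials and applies it to the two critical points of Example 8.8.3.

```python
def classify(fxx, fyy, fxy):
    """Apply the Second Partials Test at a critical point where fx = fy = 0."""
    d = fxx * fyy - fxy**2
    if d > 0:
        # d > 0 forces fxx and fyy to be nonzero with the same sign.
        return "relative minimum" if fxx > 0 else "relative maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"


def second_partials(x, y):
    """Second partials of f(x, y) = -x^3 + 4xy - 2y^2 + 1 as (fxx, fyy, fxy)."""
    return -6 * x, -4.0, 4.0


for point in [(0.0, 0.0), (4 / 3, 4 / 3)]:
    print(point, classify(*second_partials(*point)))
```

The same `classify` routine can be reused for other examples by supplying their second partials; for Example 8.8.1, `classify(4, 2, 0)` gives a relative minimum.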

8.9 Tutorial 6
 
1. If $f(x,y) = x^3 - 2xy + 3y^2$, find (i) $f(-2, 3)$ (ii) $f\left( \dfrac{1}{x}, \dfrac{2}{y} \right)$.

2. Use the definition of a limit to show that:
(i) $\lim_{(x,y)\to(4,-1)} (3x - 2y) = 14$ (ii) $\lim_{(x,y)\to(2,1)} (xy - 3x + 4) = 0$ (iii) $\lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^2) = 11$.

3. Let
\[ f(x,y) = \begin{cases} \dfrac{xy^2}{x^2 + y^2}, & (x,y) \neq (0,0) \\ 0, & (x,y) = (0,0). \end{cases} \]
Prove that $\lim_{(x,y)\to(0,0)} f(x,y) = 0$.

4. Use rules of limits to evaluate each of the following limits.
(i) $\lim_{(x,y)\to(2,-1)} (3x^3 - 2xy + 4y^2)$ (ii) $\lim_{(x,y)\to(1,2)} \dfrac{3 - x + y}{4 + x - 2y}$ (iii) $\lim_{(x,y)\to(4,\pi)} x^2 \sin \dfrac{y}{x}$
(iv) $\lim_{(x,y)\to(0^+,1^-)} \dfrac{x + y - 1}{\sqrt{x} - \sqrt{1 - y}}$.
2
“Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will live its whole life believing
that it is stupid.”— Albert Einstein

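5. Show that the following limits do not exist:
(i) $\lim_{(x,y)\to(0,0)} \dfrac{x^2 - 3y^2}{x^2 + 2y^2}$ (ii) $\lim_{(x,y)\to(0,0)} \dfrac{xy}{x^2 + y^2}$ (iii) $\lim_{(x,y)\to(0,0)} \dfrac{x^2 y}{x^4 + y^2}$ (iv) $\lim_{(x,y)\to(0,0)} \dfrac{2x - y^2}{2x^2 + y}$.

6. Use the definition of partial derivative to find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ for $z = 3x^2 - xy + 2y^2 + 3$.

7. If $f(x,y) = \dfrac{x - y}{x + y}$, find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ from the definition.

8. Let $f(x,y) = xe^{x^2 + 2y^2}$. Find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$, and verify that $\dfrac{\partial^2 f}{\partial x \partial y} = \dfrac{\partial^2 f}{\partial y \partial x}$.

9. If $z = x^2 \tan^{-1} \dfrac{y}{x}$, find $\dfrac{\partial^2 z}{\partial x \partial y}$ at $(1, 1)$.

10. Let $f(x,y,z) = \dfrac{1}{\sqrt{x^2 + y^2 + z^2}}$. Show that $f_{xx} + f_{yy} + f_{zz} = 0$.

11. Find $\dfrac{dw}{dt}$ given that (i) $w = x^2 + y^2$, $x = e^t$, $y = e^{-t}$ (ii) $w = \ln \dfrac{y}{x}$, $x = \cos t$, $y = \sin t$ (iii) $w = \sqrt{x^2 + y^2}$, $x = \sin t$, $y = e^t$.

12. Find $\dfrac{\partial r}{\partial x}$, $\dfrac{\partial r}{\partial y}$ and $\dfrac{\partial r}{\partial z}$.
(i) $r = e^{u + v + w}$ and $u = yz$, $v = xz$, $w = xy$ (ii) $r = uvw - u^2 - v^2 - w^2$ and $u = y + z$, $v = x + z$, $w = x + y$.

13. Let $f(x,y)$ be a function of $x$ and $y$ where $x = e^{u - v} \cos(u + v)$, $y = e^{u - v} \sin(u + v)$. Show that
\[ \frac{\partial f}{\partial u} + \frac{\partial f}{\partial v} = 2x \frac{\partial f}{\partial y} - 2y \frac{\partial f}{\partial x}. \]

14. Assume that $w = f(u,v)$ where $u = x + y$ and $v = x - y$. Show that
\[ \frac{\partial w}{\partial x} \frac{\partial w}{\partial y} = \left( \frac{\partial w}{\partial u} \right)^2 - \left( \frac{\partial w}{\partial v} \right)^2. \]

15. Suppose that $w = f(x,y)$ and that $x = u + v$ and $y = u - v$. Show that
\[ \frac{\partial^2 w}{\partial x^2} - \frac{\partial^2 w}{\partial y^2} = \frac{1}{2} \left( \frac{\partial^2 w}{\partial u \partial v} + \frac{\partial^2 w}{\partial v \partial u} \right). \]

16. Let $V = f(x,y)$, where $x$ and $y$ are given in polar co-ordinates by the equations $x = r \cos \theta$ and $y = r \sin \theta$. Show that
\[ \left( \frac{\partial V}{\partial x} \right)^2 + \left( \frac{\partial V}{\partial y} \right)^2 = \left( \frac{\partial V}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial V}{\partial \theta} \right)^2. \]

17. Given $V = f(x,y)$. Show that if $x = r \cos \theta$ and $y = r \sin \theta$, then
\[ \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = \frac{\partial^2 V}{\partial r^2} + \frac{1}{r} \frac{\partial V}{\partial r} + \frac{1}{r^2} \frac{\partial^2 V}{\partial \theta^2}. \]

18. Given that $w = f(x,y)$, $x = e^u \cos v$ and $y = e^u \sin v$. Show that
\[ \left( \frac{\partial w}{\partial x} \right)^2 + \left( \frac{\partial w}{\partial y} \right)^2 = e^{-2u} \left[ \left( \frac{\partial w}{\partial u} \right)^2 + \left( \frac{\partial w}{\partial v} \right)^2 \right]. \]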
8.10 Tutorial 7
1. Find the total differential of the following functions.
(i) $z = 3x^2 y^3$ (ii) $z = \dfrac{x^2}{y}$ (iii) $z = x \cos y - y \cos x$ (iv) $w = e^x \cos y + z$
(v) $w = x^2 y z^2 \sin yz$ (vi) $z = \dfrac{1}{2}\left( e^{x^2 + y^2} - e^{-x^2 - y^2} \right)$ (vii) $z = e^x \sin y$ (viii) $w = \dfrac{x + y}{z - 2y}$.

2. Find and classify the stationary points of

(a) $f(x,y) = x^2 - 2xy + 2y^2 - 2x + 2y + 4$

(b) $f(x,y) = 4xy - y^4 - x^4$

3. (a) Let $f(x,y) = (x^2 - y^2)e^{-x^2 - y^2}$. Calculate the partial derivatives
\[ f_x, \quad f_y, \quad f_{xx}, \quad f_{yx} \quad \text{and} \quad f_{xy}. \]

(b) Hence, find and classify all the critical points of the function $f(x,y)$.

4. Suppose that $w = f(u)$, where $u = \dfrac{x^2 - y^2}{x^2 + y^2}$. Show that $x w_x + y w_y = 0$.

5. Suppose that $w = f(u) + g(v)$, where $u = x - at$ and $v = x + at$. Show that
\[ \frac{\partial^2 w}{\partial t^2} = a^2 \frac{\partial^2 w}{\partial x^2}. \]

Chapter 9

Multiple Integration

9.1 Iterated Integrals

In the previous chapter, we saw that it is meaningful to differentiate functions of several variables with respect to one variable while holding the other variables constant. We can integrate functions of several variables by a similar procedure. For example, if we have the partial derivative $f_x(x,y) = 2xy$, then by considering $y$ constant, we can integrate with respect to $x$ to obtain
\begin{align*}
f(x,y) &= \int f_x(x,y)\,dx && \text{Integrate with respect to } x \\
&= \int 2xy\,dx && y \text{ is held constant} \\
&= y \int 2x\,dx && \text{Factor out constant } y \\
&= y(x^2) + C(y) && \text{Anti-derivative of } 2x \text{ is } x^2 \\
&= x^2 y + C(y). && C(y) \text{ is a function of } y
\end{align*}

Note that the constant of integration, $C(y)$, is a function of $y$. In other words, by integrating with respect to $x$, we are able to recover $f(x,y)$ only partially. Definite integrals can be treated in the same way: by considering $y$ constant, we can apply the Fundamental Theorem of Calculus to evaluate
\[ \int_1^{2y} 2xy\,dx = x^2 y \Big]_1^{2y} = (2y)^2 y - (1)^2 y = 4y^3 - y. \]

Note that the variable of integration cannot appear in either limit of integration. For example, it makes no sense to write $\int_0^x y\,dx$.
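Example 9.1.1. Evaluate $\int_1^x (2x^2 y^{-2} + 2y)\,dy$.

Solution: Considering $x$ to be constant and integrating with respect to $y$ produces
\begin{align*}
\int_1^x (2x^2 y^{-2} + 2y)\,dy &= \left[ \frac{-2x^2}{y} + y^2 \right]_1^x \\
&= \left( \frac{-2x^2}{x} + x^2 \right) - \left( \frac{-2x^2}{1} + 1 \right) \\
&= 3x^2 - 2x - 1.
\end{align*}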

Notice that in the above example the integral defines a function of $x$ and can itself be integrated.

Example 9.1.2. Evaluate $\int_1^2 \left[ \int_1^x (2x^2 y^{-2} + 2y)\,dy \right] dx$.

Solution:

[Figure 9.1: The region of integration, $1 \le x \le 2$, $1 \le y \le x$, lying below the line $y = x$.]

Using the result from the previous example, we have
\begin{align*}
\int_1^2 \left[ \int_1^x (2x^2 y^{-2} + 2y)\,dy \right] dx &= \int_1^2 (3x^2 - 2x - 1)\,dx \\
&= \left[ x^3 - x^2 - x \right]_1^2 \\
&= 2 - (-1) \\
&= 3.
\end{align*}
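An iterated integral can be approximated numerically by nesting a one-dimensional quadrature rule: the inner rule is applied for each fixed value of the outer variable. The sketch below uses composite Simpson's rule, which is our own choice of method and not one introduced in these notes; the helper names `simpson` and `iterated` are hypothetical.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def iterated(f, a, b, lower, upper, n=200):
    """Approximate the integral over a <= x <= b, lower(x) <= y <= upper(x)
    of f(x, y) dy dx by integrating in y first, then in x."""
    def inner(x):
        return simpson(lambda y: f(x, y), lower(x), upper(x), n)
    return simpson(inner, a, b, n)


value = iterated(lambda x, y: 2 * x * x / (y * y) + 2 * y,
                 1.0, 2.0, lambda x: 1.0, lambda x: x)
print(value)
```

The printed value should be very close to 3, the result of Example 9.1.2. Note how the inner limits are passed as functions of x while the outer limits are constants.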

The integral in the above example is an iterated integral. Iterated integrals are usually written simply as
\[ \int_a^b \int_{g_1(x)}^{g_2(x)} f(x,y)\,dy\,dx \quad \text{and} \quad \int_c^d \int_{h_1(y)}^{h_2(y)} f(x,y)\,dx\,dy. \]

The inside limits of integration can be variable with respect to the outer variable of integration.
However, the outside limits of integration must be constant with respect to both variables of
integration. After performing the inside integration, we obtain a definite integral, and the second
integration produces a real number.

One order of integration will often produce a simpler integration problem than the other order. The
order of integration affects the ease of integration, but not the value of the integral.

9.1.1 Comparing Different Orders of Integration

Example 9.1.3. Sketch the region whose area is represented by the integral
\[ \int_0^2 \int_{y^2}^4 dx\,dy. \]

Then find another iterated integral using the order dydx to represent the area, and show that both
integrals yield the same value.

Solution: From the given limits of integration, we know that
\[ y^2 \le x \le 4 \qquad \text{(inner limits of integration)} \]
which means that the region $R$ is bounded on the left by the parabola $x = y^2$ and on the right by the line $x = 4$. Furthermore, because
\[ 0 \le y \le 2 \qquad \text{(outer limits of integration)} \]
we know that $R$ is bounded below by the x-axis.

[Figure 9.2: Fig 1. The region $R$ between the parabola $x = y^2$ and the line $x = 4$, with a horizontal strip of width $\Delta y$.]

The value of this integral is
\begin{align*}
\int_0^2 \int_{y^2}^4 dx\,dy &= \int_0^2 x \Big]_{y^2}^4 \,dy \\
&= \int_0^2 (4 - y^2)\,dy \\
&= \left[ 4y - \frac{y^3}{3} \right]_0^2 = \frac{16}{3}.
\end{align*}
To change the order of integration to $dy\,dx$, place a vertical rectangle in the region. From this we can see that the constant bounds $0 \le x \le 4$ serve as the outer limits of integration. By solving for $y$ in the equation $x = y^2$, we can conclude that the inner bounds are $0 \le y \le \sqrt{x}$. Therefore, the area of the region can be represented by
\[ \int_0^4 \int_0^{\sqrt{x}} dy\,dx. \]
By evaluating this integral, we can see that it has the same value as the original integral.

[Figure 9.3: Fig 2. The same region $R$ below the curve $y = \sqrt{x}$, with a vertical strip of width $\Delta x$.]

\begin{align*}
\int_0^4 \int_0^{\sqrt{x}} dy\,dx &= \int_0^4 y \Big]_0^{\sqrt{x}} \,dx \\
&= \int_0^4 \sqrt{x}\,dx \\
&= \frac{2}{3} x^{3/2} \Big]_0^4 = \frac{16}{3}.
\end{align*}
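The horizontal and vertical strips in Example 9.1.3 translate directly into two Riemann sums for the same area. The sketch below adds up strip areas using the midpoint rule; the helper name `midpoint` and the number of strips are our own assumptions.

```python
import math


def midpoint(g, a, b, n=1000):
    """Midpoint Riemann sum of g over [a, b] using n strips."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


# Horizontal strips (order dx dy): each strip at height y has length 4 - y^2.
area_dxdy = midpoint(lambda y: 4 - y * y, 0.0, 2.0)

# Vertical strips (order dy dx): each strip at position x has height sqrt(x).
area_dydx = midpoint(lambda x: math.sqrt(x), 0.0, 4.0)

print(area_dxdy, area_dydx, 16 / 3)
```

Both sums approach 16/3 as the number of strips grows; the vertical-strip sum converges a little more slowly because the square root has an unbounded derivative at x = 0.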

Example 9.1.4. Express $\int_0^4 \int_{x/2}^2 e^{y^2}\,dy\,dx$ as an iterated integral with the order of integration reversed and evaluate.

Solution: From the given limits of integration we see that, for a fixed $x$, $y$ varies from $y = \frac{x}{2}$ to $y = 2$ and $x$ varies from $x = 0$ to $x = 4$. We can also describe the region as follows: for $y$ fixed, $x$ varies from $x = 0$ to $x = 2y$ and $y$ varies from $y = 0$ to $y = 2$. The corresponding iterated integral is $\int_0^2 \int_0^{2y} e^{y^2}\,dx\,dy$. Solving, we have
\begin{align*}
\int_0^2 \int_0^{2y} e^{y^2}\,dx\,dy &= \int_0^2 x e^{y^2} \Big]_{x=0}^{x=2y} \,dy \\
&= \int_0^2 2y e^{y^2}\,dy \\
&= e^{y^2} \Big]_0^2 \\
&= e^4 - 1.
\end{align*}
Example 9.1.5. Evaluate $\int_0^2 \int_{y/2}^1 y e^{x^3}\,dx\,dy$.

Solution: We cannot integrate first with respect to $x$, as indicated, because $e^{x^3}$ has no elementary anti-derivative. So we try to evaluate the integral by first reversing the order of integration.
\begin{align*}
\int_0^2 \int_{y/2}^1 y e^{x^3}\,dx\,dy &= \int_0^1 \int_0^{2x} y e^{x^3}\,dy\,dx \\
&= \int_0^1 \left[ \frac{1}{2} y^2 e^{x^3} \right]_0^{2x} dx \\
&= \int_0^1 2x^2 e^{x^3}\,dx \\
&= \frac{2}{3} e^{x^3} \Big]_0^1 \\
&= \frac{2}{3}(e - 1).
\end{align*}

9.2 Double Integrals and Volume

9.2.1 Double Integrals and Volume of a Solid Region

Definition of Double Integral

If $f$ is defined on a closed, bounded region $R$ in the xy-plane, then the double integral of $f$ over $R$ is given by
\[ \iint_R f(x,y)\,dA = \lim_{|\Delta| \to 0} \sum_{i=1}^{n} f(x_i, y_i)\,\Delta x_i\,\Delta y_i \]
provided the limit exists. If the limit exists, then $f$ is integrable over $R$.

A double integral can be used to find the volume of a solid region that lies between the xy-plane
and the surface given by z = f (x, y).

Volume of a Solid Region

If $f$ is integrable over a plane region $R$ and $f(x,y) \ge 0$ for all $(x,y)$ in $R$, then the volume of the solid region that lies above $R$ and below the graph of $f$ is defined as
\[ V = \iint_R f(x,y)\,dA. \]

Example 9.2.1. Find the volume of the solid region bounded by the surface
\[ f(x,y) = e^{-x^2} \]
and the planes $z = 0$, $y = 0$, $y = x$ and $x = 1$.

Solution: The base $R$ of the solid in the xy-plane is bounded by the lines $y = 0$, $x = 1$ and $y = x$. The two possible orders of integration are
\[ \int_0^1 \int_0^x e^{-x^2}\,dy\,dx \quad \text{and} \quad \int_0^1 \int_y^1 e^{-x^2}\,dx\,dy. \]

By setting up the corresponding iterated integrals, we can see that the order $dx\,dy$ requires the anti-derivative $\int e^{-x^2}\,dx$, which is not an elementary function. On the other hand, the order $dy\,dx$ produces the integral
\begin{align*}
\int_0^1 \int_0^x e^{-x^2}\,dy\,dx &= \int_0^1 e^{-x^2} y \Big]_0^x \,dx \\
&= \int_0^1 x e^{-x^2}\,dx \\
&= -\frac{1}{2} e^{-x^2} \Big]_0^1 \\
&= -\frac{1}{2} \left( \frac{1}{e} - 1 \right) \\
&= \frac{e - 1}{2e} \approx 0.316.
\end{align*}

9.3 Triple Integrals

Definition of Triple Integral

If f is continuous over a bounded solid region Q, then the triple integral of f over Q is defined as
\[
\iiint_Q f(x, y, z)\, dV = \lim_{|\Delta| \to 0} \sum_{i=1}^{n} f(x_i, y_i, z_i)\, \Delta V_i
\]
provided the limit exists. The volume of the solid region Q is given by
\[
\text{Volume of } Q = \iiint_Q dV.
\]

Evaluation by Iterated Integrals

Let f be continuous on a solid region Q defined by

\[
a \le x \le b, \quad h_1(x) \le y \le h_2(x), \quad g_1(x, y) \le z \le g_2(x, y)
\]
where $h_1$, $h_2$, $g_1$ and $g_2$ are continuous functions. Then
\[
\iiint_Q f(x, y, z)\, dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x,y)}^{g_2(x,y)} f(x, y, z)\, dz\, dy\, dx.
\]

Example 9.3.1. Evaluate the triple iterated integral
\[
\int_0^2 \int_0^x \int_0^{x+y} e^x (y + 2z)\, dz\, dy\, dx.
\]

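Solution: For the first integration, hold x and y constant and integrate with respect to z.
\[
\begin{aligned}
\int_0^2 \int_0^x \int_0^{x+y} e^x (y + 2z)\, dz\, dy\, dx &= \int_0^2 \int_0^x \Big[ e^x (yz + z^2) \Big]_{z=0}^{z=x+y}\, dy\, dx \\
&= \int_0^2 \int_0^x e^x (x^2 + 3xy + 2y^2)\, dy\, dx.
\end{aligned}
\]
For the second integration, hold x constant and integrate with respect to y.
\[
\begin{aligned}
\int_0^2 \int_0^x e^x (x^2 + 3xy + 2y^2)\, dy\, dx &= \int_0^2 \left[ e^x \left( x^2 y + \frac{3xy^2}{2} + \frac{2y^3}{3} \right) \right]_{y=0}^{y=x} dx \\
&= \frac{19}{6} \int_0^2 x^3 e^x\, dx \\
&= \frac{19}{6} \Big[ e^x (x^3 - 3x^2 + 6x - 6) \Big]_0^2 \\
&= \frac{19}{6} \left( 2e^2 + 6 \right) \\
&= 19 \left( \frac{e^2}{3} + 1 \right).
\end{aligned}
\]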
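The value of the triple integral in Example 9.3.1 can be checked numerically. The sketch below nests a one-dimensional Simpson rule three times, following the limits $0 \le z \le x + y$, $0 \le y \le x$, $0 \le x \le 2$, and compares the result with $\frac{19}{6}(2e^2 + 6) = 19(e^2/3 + 1)$, the value obtained by evaluating $\frac{19}{6}\big[e^x(x^3 - 3x^2 + 6x - 6)\big]_0^2$. The helper names and the number of subintervals are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=40):
    """Composite Simpson's rule for the integral of f over [a, b] (n even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrate_over_z(x, y):
    # Innermost integral: hold x and y fixed, z runs from 0 to x + y.
    return simpson(lambda z: math.exp(x) * (y + 2 * z), 0, x + y)

def integrate_over_y(x):
    # Middle integral: hold x fixed, y runs from 0 to x.
    return simpson(lambda y: integrate_over_z(x, y), 0, x)

numeric = simpson(integrate_over_y, 0, 2)
closed_form = 19 * (math.exp(2) / 3 + 1)
print(numeric, closed_form)
```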
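Example 9.3.2. If $f(x, y, z) = xy + yz$ and T consists of those points (x, y, z) in space satisfying the inequalities $-1 \le x \le 1$, $2 \le y \le 3$ and $0 \le z \le 1$, then
\[
\begin{aligned}
\iiint_T f(x, y, z)\, dV &= \int_{-1}^{1} \int_2^3 \int_0^1 (xy + yz)\, dz\, dy\, dx \\
&= \int_{-1}^{1} \int_2^3 \Big[ xyz + \tfrac{1}{2} y z^2 \Big]_{z=0}^{1}\, dy\, dx \\
&= \int_{-1}^{1} \int_2^3 \left( xy + \tfrac{1}{2} y \right) dy\, dx \\
&= \int_{-1}^{1} \Big[ \tfrac{1}{2} x y^2 + \tfrac{1}{4} y^2 \Big]_{y=2}^{3}\, dx \\
&= \int_{-1}^{1} \left( \tfrac{5}{2} x + \tfrac{5}{4} \right) dx \\
&= \Big[ \tfrac{5}{4} x^2 + \tfrac{5}{4} x \Big]_{-1}^{1} = \frac{5}{2}.
\end{aligned}
\]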

9.4 Change of Variables : Jacobians

9.4.1 Jacobians

The Jacobian is named after the German mathematician Carl Gustav Jacob Jacobi (1804-1851). For the single integral
\[
\int_a^b f(x)\, dx
\]
you can change variables by letting $x = g(u)$, so that $dx = g'(u)\, du$, and obtain
\[
\int_a^b f(x)\, dx = \int_c^d f(g(u))\, g'(u)\, du
\]
where $a = g(c)$ and $b = g(d)$. Note that the change of variable introduces an additional factor $g'(u)$ into the integrand. This also occurs in the case of double integrals:
\[
\iint_R f(x, y)\, dA = \iint_S f(g(u, v), h(u, v)) \underbrace{\left| \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \frac{\partial x}{\partial v} \right|}_{\text{Jacobian}}\, du\, dv
\]
where the change of variables $x = g(u, v)$ and $y = h(u, v)$ introduces a factor called the Jacobian of x and y with respect to u and v.

Definition of the Jacobian

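If $x = g(u, v)$ and $y = h(u, v)$, then the Jacobian of x and y with respect to u and v, denoted by $\dfrac{\partial(x, y)}{\partial(u, v)}$, is
\[
\frac{\partial(x, y)}{\partial(u, v)} =
\begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v}
\end{vmatrix}
= \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \frac{\partial x}{\partial v}.
\]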

In cases where it is more convenient to express u and v in terms of x and y, we can first compute $\dfrac{\partial(u, v)}{\partial(x, y)}$ explicitly and then find the needed Jacobian $\dfrac{\partial(x, y)}{\partial(u, v)}$ from the formula
\[
\frac{\partial(x, y)}{\partial(u, v)} \cdot \frac{\partial(u, v)}{\partial(x, y)} = 1.
\]

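Example 9.4.1. Find the Jacobian for the change of variables defined by

\[
x = r \cos\theta \quad \text{and} \quad y = r \sin\theta.
\]

Solution: From the definition of a Jacobian, we obtain
\[
\frac{\partial(x, y)}{\partial(r, \theta)} =
\begin{vmatrix}
\dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm]
\dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta}
\end{vmatrix}
=
\begin{vmatrix}
\cos\theta & -r \sin\theta \\
\sin\theta & r \cos\theta
\end{vmatrix}
= r \cos^2\theta + r \sin^2\theta = r.
\]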
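The determinant in Example 9.4.1 can also be approximated numerically, which is handy for checking a Jacobian computed by hand. The sketch below estimates the four partial derivatives by central differences; the function names and the step size `h` are illustrative assumptions.

```python
import math

def polar_to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def jacobian_det(transform, u, v, h=1e-6):
    """Approximate d(x, y)/d(u, v) at (u, v) using central differences."""
    x_up, y_up = transform(u + h, v)
    x_um, y_um = transform(u - h, v)
    x_vp, y_vp = transform(u, v + h)
    x_vm, y_vm = transform(u, v - h)
    dx_du = (x_up - x_um) / (2 * h)
    dy_du = (y_up - y_um) / (2 * h)
    dx_dv = (x_vp - x_vm) / (2 * h)
    dy_dv = (y_vp - y_vm) / (2 * h)
    return dx_du * dy_dv - dy_du * dx_dv

# The estimate should be close to r at every sample point.
for r, theta in [(1.0, 0.3), (2.5, 1.2), (0.5, 4.0)]:
    print(r, jacobian_det(polar_to_cartesian, r, theta))
```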

The above example points out that the change of variables from rectangular to polar coordinates for a double integral can be written as
\[
\begin{aligned}
\iint_R f(x, y)\, dA &= \iint_S f(r \cos\theta, r \sin\theta)\, r\, dr\, d\theta, \quad r > 0 \\
&= \iint_S f(r \cos\theta, r \sin\theta) \left| \frac{\partial(x, y)}{\partial(r, \theta)} \right| dr\, d\theta
\end{aligned}
\]
where S is the region in the rθ-plane that corresponds to the region R in the xy-plane. In general, a change of variables is given by a one-to-one transformation T from a region S in the uv-plane to a region R in the xy-plane, given by

\[
T(u, v) = (x, y) = (g(u, v), h(u, v))
\]

where g and h have continuous first partial derivatives in the region S. Note that the point (u, v) lies in S and the point (x, y) lies in R. In most cases, we are hunting for a transformation for which the region S is simpler than the region R.

Change of Variables for Double Integrals

Theorem 9.4.1. Let R and S be regions in the xy- and uv-planes that are related by the equations $x = g(u, v)$ and $y = h(u, v)$ such that each point in R is the image of a unique point in S. If f is continuous on R, g and h have continuous partial derivatives on S, and $\dfrac{\partial(x, y)}{\partial(u, v)}$ is non-zero on S, then
\[
\iint_R f(x, y)\, dA = \iint_S f(g(u, v), h(u, v)) \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du\, dv.
\]

Example 9.4.2. Let R be the region bounded by the lines

\[
x - 2y = 0, \quad x - 2y = -4, \quad x + y = 4, \quad \text{and} \quad x + y = 1.
\]

Evaluate the double integral
\[
\iint_R 3xy\, dA.
\]

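Solution: To begin, let $u = x + y$ and $v = x - 2y$. Solving this system of equations for x and y produces $x = \frac{1}{3}(2u + v)$ and $y = \frac{1}{3}(u - v)$. The partial derivatives of x and y are
\[
\frac{\partial x}{\partial u} = \frac{2}{3}, \quad \frac{\partial x}{\partial v} = \frac{1}{3}, \quad \frac{\partial y}{\partial u} = \frac{1}{3} \quad \text{and} \quad \frac{\partial y}{\partial v} = -\frac{1}{3}
\]
which implies that the Jacobian is
\[
\frac{\partial(x, y)}{\partial(u, v)} =
\begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v}
\end{vmatrix}
=
\begin{vmatrix}
\dfrac{2}{3} & \dfrac{1}{3} \\[2mm]
\dfrac{1}{3} & -\dfrac{1}{3}
\end{vmatrix}
= -\frac{2}{9} - \frac{1}{9} = -\frac{1}{3}.
\]
Therefore, we obtain
\[
\begin{aligned}
\iint_R 3xy\, dA &= \iint_S 3 \left[ \tfrac{1}{3}(2u + v) \right] \left[ \tfrac{1}{3}(u - v) \right] \left| \frac{\partial(x, y)}{\partial(u, v)} \right| dv\, du \\
&= \int_1^4 \int_{-4}^{0} \frac{1}{9} (2u^2 - uv - v^2)\, dv\, du \\
&= \frac{1}{9} \int_1^4 \left[ 2u^2 v - \frac{uv^2}{2} - \frac{v^3}{3} \right]_{v=-4}^{0} du \\
&= \frac{1}{9} \int_1^4 \left( 8u^2 + 8u - \frac{64}{3} \right) du \\
&= \frac{1}{9} \left[ \frac{8u^3}{3} + 4u^2 - \frac{64}{3} u \right]_1^4 \\
&= \frac{164}{9}.
\end{aligned}
\]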
Example 9.4.3. Suppose R is the region in the plane bounded by the hyperbolas $xy = 1$, $xy = 3$ and $x^2 - y^2 = 1$, $x^2 - y^2 = 4$. Find $\iint_R (x^2 + y^2)\, dx\, dy$.

Solution: The hyperbolas bounding R are u-curves and v-curves if $u = xy$ and $v = x^2 - y^2$. We can most easily write $x^2 + y^2$ in terms of u and v by first noting that

\[
4u^2 + v^2 = 4x^2 y^2 + (x^2 - y^2)^2 = (x^2 + y^2)^2,
\]

so that $x^2 + y^2 = \sqrt{4u^2 + v^2}$. Now

\[
\frac{\partial(u, v)}{\partial(x, y)} =
\begin{vmatrix}
y & x \\
2x & -2y
\end{vmatrix}
= -2(x^2 + y^2).
\]

Hence we have
\[
\frac{\partial(x, y)}{\partial(u, v)} = -\frac{1}{2(x^2 + y^2)} = -\frac{1}{2\sqrt{4u^2 + v^2}}.
\]
Therefore
\[
\iint_R (x^2 + y^2)\, dx\, dy = \int_1^4 \int_1^3 \sqrt{4u^2 + v^2}\, \frac{1}{2\sqrt{4u^2 + v^2}}\, du\, dv = \int_1^4 \int_1^3 \frac{1}{2}\, du\, dv = 3.
\]

Example 9.4.4. Find the area of the region R bounded by the curves $xy = 1$, $xy = 3$ and $xy^{1.4} = 1$, $xy^{1.4} = 2$.

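Solution: Define the change of variables by $u = xy$ and $v = xy^{1.4}$. Then

\[
\frac{\partial(u, v)}{\partial(x, y)} =
\begin{vmatrix}
y & x \\
y^{1.4} & (1.4)\, x y^{0.4}
\end{vmatrix}
= (0.4)\, x y^{1.4} = (0.4)\, v.
\]

So $\dfrac{\partial(x, y)}{\partial(u, v)} = \dfrac{1}{\partial(u, v) / \partial(x, y)} = \dfrac{2.5}{v}$. Consequently,
\[
\iint_R dx\, dy = \int_1^2 \int_1^3 \frac{2.5}{v}\, du\, dv = 5 \ln 2.
\]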

9.5 Tutorial 8
1. Evaluate the integral.
(i) $\int_0^x (2x - y)\, dy$ (ii) $\int_x^{x^2} \frac{y}{x}\, dy$ (iii) $\int_{e^y}^{y} \frac{y \ln x}{x}\, dx$ (iv) $\int_0^{x^3} y e^{-y/x}\, dy$ (v) $\int_0^{\cos x} y\, dy$.

2. Evaluate the iterated integral.
(i) $\int_0^1 \int_0^2 (x + y)\, dy\, dx$ (ii) $\int_0^1 \int_0^x \sqrt{1 - x^2}\, dy\, dx$ (iii) $\int_0^2 \int_{3y^2 - 6y}^{2y - y^2} 3y\, dx\, dy$ (iv) $\int_0^{\pi/2} \int_0^{\sin\theta} \theta r\, dr\, d\theta$.

3. Sketch the region R of integration and switch the order of integration.
(i) $\int_0^4 \int_0^y f(x, y)\, dx\, dy$ (ii) $\int_0^4 \int_{\sqrt{y}}^{2} f(x, y)\, dx\, dy$ (iii) $\int_{-1}^{1} \int_{x^2}^{1} f(x, y)\, dy\, dx$.

4. Sketch the region R whose area is given by the iterated integral. Then switch the order of integration and show that both orders yield the same area.
(i) $\int_0^1 \int_0^2 dy\, dx$ (ii) $\int_0^2 \int_{x/2}^{1} dy\, dx$ (iii) $\int_0^4 \int_{\sqrt{x}}^{2} dy\, dx$ (iv) $\int_{-2}^{2} \int_0^{4 - y^2} dx\, dy$.

5. Evaluate the iterated integral. (Note: it is necessary to switch the order of integration.)
(i) $\int_0^2 \int_x^2 x \sqrt{1 + y^3}\, dy\, dx$ (ii) $\int_0^2 \int_x^2 e^{-y^2}\, dy\, dx$ (iii) $\int_0^1 \int_y^1 \sin x^2\, dx\, dy$ (iv) $\int_0^2 \int_{y^2}^{4} x \sin x\, dx\, dy$
(v) $\int_0^1 \int_y^1 \frac{dx\, dy}{1 + x^4}$ (vi) $\int_0^{\pi} \int_x^{\pi} \frac{\sin y}{y}\, dy\, dx$ (vii) $\int_0^1 \int_y^1 e^{-x^2}\, dx\, dy$.

6. Use polar coordinates to evaluate each of the following integrals.
(i) $\int_0^1 \int_0^{\sqrt{1 - x^2}} \frac{dy\, dx}{4 - x^2 - y^2}$ (ii) $\int_0^2 \int_0^{\sqrt{4 - x^2}} (x^2 + y^2)^{3/2}\, dy\, dx$ (iii) $\int_0^1 \int_0^{\sqrt{1 - y^2}} \sin(x^2 + y^2)\, dx\, dy$.

7. Evaluate the triple integral.
(i) $\int_0^3 \int_0^2 \int_0^1 (x + y + z)\, dx\, dy\, dz$ (ii) $\int_{-1}^{1} \int_{-1}^{1} \int_{-1}^{1} x^2 y^2 z^2\, dx\, dy\, dz$ (iii) $\int_1^4 \int_0^1 \int_0^x 2z e^{-x^2}\, dy\, dx\, dz$
(iv) $\int_1^4 \int_1^{e^2} \int_0^{1/(xz)} \ln z\, dy\, dz\, dx$ (v) $\int_0^{\pi/2} \int_0^{y/2} \int_0^{1/y} \sin y\, dz\, dx\, dy$ (vi) $\int_0^9 \int_0^{y/3} \int_0^{\sqrt{y^2 - 9x^2}} z\, dz\, dx\, dy$.

8. Find the Jacobian $\dfrac{\partial(x, y)}{\partial(u, v)}$ for the indicated change of variables.
(i) $x = -\frac{1}{2}(u - v)$, $y = \frac{1}{2}(u + v)$ (ii) $x = au + bv$, $y = cu + dv$ (iii) $x = u - uv$, $y = uv$
(iv) $x = u \cos\theta - v \sin\theta$, $y = u \sin\theta + v \cos\theta$ (v) $x = e^u \sin v$, $y = e^u \cos v$.
9. Evaluate $\iint_D \sqrt{x^2 + y^2}\, dx\, dy$, where D is the region bounded by $x^2 + y^2 = 4$ and $x^2 + y^2 = 9$.

10. Evaluate $\iint_D (x + y)^2\, dx\, dy$ where D is the parallelogram bounded by the lines $x + y = 0$, $x + y = 1$, $2x - y = 0$ and $2x - y = 3$.

11. Let D be the region in the first quadrant bounded by the hyperbolas $xy = 1$, $xy = 9$ and the lines $y = x$, $y = 4x$. Evaluate $\iint_D \left( \sqrt{\frac{y}{x}} + \sqrt{xy} \right) dx\, dy$.

12. Using a suitable transformation, evaluate $\iint_R x (x^2 - y^2)^{1/2}\, dx\, dy$, where R is the region bounded by $x^2 - y^2 = 1$, $x^2 - y^2 = 2$, $2y = x$ and $5y = x$.

13. Using polar co-ordinates, evaluate $\int_0^{\infty} \int_0^{\infty} e^{-(x^2 + y^2)}\, dx\, dy$. Deduce the value of $\int_0^{\infty} e^{-x^2}\, dx$.

14. By changing the order of integration or otherwise, evaluate $\int_0^a \int_0^{a - \sqrt{a^2 - x^2}} \frac{xy \log(y + a)}{(y - a)^2}\, dy\, dx$.

15. Change the order of integration and evaluate the integral $\int_0^1 \int_{\sqrt{x}}^{1} \frac{x y^2}{\sqrt{x^2 + y^2}}\, dy\, dx$.

16. Evaluate $\iint_R \sqrt{x^2 + y^2}\, dy\, dx$, where R is the region in which $x^2 + y^2 \le 4$, $x \ge 0$, $y \ge 0$.

17. By means of the substitutions $u = x + y$ and $v = y - x$, evaluate the integral $\int_0^a \int_0^a \frac{1}{\sqrt{a^2 - (y - x)^2}}\, dx\, dy$.

18. By means of the substitutions $x = \frac{1}{2}(u + v)$ and $y = \frac{1}{2}(u - v)$, evaluate the integral $\int_0^a \int_0^a \frac{dx\, dy}{\sqrt{a^2 + (y + x)^2}}$.

19. Calculate $\iint_R (x + y)^3 \cos^2(x - y)\, dx\, dy$, where R is the region bounded by the lines $x + y = \pi$, $x + y = 5\pi$, $x - y = \pi$ and $x - y = -\pi$.

20. Calculate the area of the region bounded by $xy = 4$, $xy = 8$, $xy^3 = 5$ and $xy^3 = 15$.

21. Calculate the area of the parallelogram bounded by the lines $x + y = 1$, $x + y = 2$ and $2x - 3y = 2$, $2x - 3y = 5$.

22. Evaluate $\iint_R (x^2 + y^2)\, dx\, dy$, where R is the region in the first quadrant bounded by the hyperbolas $x^2 - y^2 = 6$, $x^2 - y^2 = 1$, $2xy = 4$ and $2xy = 1$. (Hint: Use $u = x^2 - y^2$, $v = 2xy$.)

23. Evaluate (i) $\int_0^1 \int_{\sqrt{x}}^{1} e^{-x^2/y}\, dy\, dx$ (ii) $\int_0^{\sqrt{\pi}} \int_y^{\sqrt{\pi}} \sin(x^2)\, dx\, dy$.

Chapter 10

Basic Concepts of Set Theory

10.1 The Importance of Set Theory

One striking feature of humans is their inherent need, and ability, to group objects according to
specific criteria. Our prehistoric ancestors grouped tools based on their hunting needs. They eventually
evolved into strict hierarchical societies where a person belonged to one class and not another.
Many of us today like to sort our clothes at home, or group the songs on our computer into playlists.

The idea of sorting out certain objects into similar groupings, or sets, is the most fundamental
concept in modern mathematics. The theory of sets has, in fact, been the unifying framework for
all mathematics since the German mathematician Georg Cantor formulated it in the 1870’s. No
field of mathematics could be described nowadays without referring to some kind of abstract set.
A geometer, for example, may study the set of parabolic curves in three dimensions or the set of
spheres in a variety of different spaces. An algebraist may work with a set of equations or a set
of matrices. A statistician typically works with large sets of raw data. And the list goes on. You
may have also read or heard that the most important unresolved problem in mathematics at the
moment deals with the set of prime numbers (this problem in number theory is known as the Riemann
Hypothesis; the Clay Mathematics Institute will award a million dollars to whoever solves it). As it turns out,
even numbers are described by mathematicians in terms of sets!

More broadly, the concept of set membership, which lies at the heart of set theory, explains how
statements with nouns and predicates are formulated in our language – or any abstract language
like mathematics. Because of this, set theory is intimately connected to logic and serves as the
foundation for all of mathematics.

10.2 What is a Set?

A set is a collection of objects called the elements or members of the set. These objects could be
anything conceivable, including numbers, letters, colors, even sets themselves! However, none of the
objects of the set can be the set itself. We discard this possibility to avoid running into Russell’s
Paradox, a famous problem in mathematical logic unearthed by the great British logician Bertrand
Russell in 1901.

10.3 Some Interesting Sets of Numbers

Let’s look at different types of numbers that we can have in our sets.

1. Natural Numbers
The set of natural numbers is {1, 2, 3, 4, . . . } and is denoted by N.

2. Integers
The set of integers is {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . } and is denoted by Z. The symbol Z
comes from the German word Zahlen, which means numbers. The non-negative
integers {0, 1, 2, 3, 4, . . . } are often denoted by Z+ . All natural numbers are integers.

3. Rational Numbers
The set of rational numbers is denoted by Q and consists of all fractions, i.e., x ∈ Q
if x can be written in the form p/q, where p, q ∈ Z with q ≠ 0.

4. Real Numbers
The real numbers are denoted by R.

5. Complex Numbers
The complex numbers are denoted by C.

10.4 Notation
1. A, B, C, . . . for sets.

2. a, b, c, . . . or x, y, z, . . . for members.

3. b ∈ A, if b belongs to A.

4. c ∉ A, if c does not belong to A.

5. ∅ is used for the empty set. There is exactly one set, the empty set or null set, which has
no members at all.

6. A set with only one member is called a singleton or singleton set, for example, {x}.

10.5 Well-Defined Sets

A set is said to be well-defined if it is unambiguous which elements belong to the set. In other
words, if A is well-defined, then the question “Does x ∈ A?” can always be answered for any object
x.
For example, if we define C as the set of large numbers, then it is unclear which numbers should be
considered “large”. C is therefore not a well-defined set. Similarly, the set of all great Zimbabwean
footballers, or the set of all expensive restaurants in Harare, are also not well-defined.

10.6 Specification of Sets

10.6.1 Three Ways to Specify a Set

1. Listing all its members (List Notation).

2. By stating a property of its elements (Predicate Notation).

3. By defining a set of rules which generates (defines) its members (Recursive Rules).

List Notation

This is suitable for finite sets. The names of the elements of the set are listed, separated by commas,
and enclosed in braces. For example

{2, 3, 5}, {a, b, d, m}, {Zimbabwe, South Africa}.

Three Dot Abbreviation

For example, {1, 2, . . . , 100}.

Predicate Notation

For example, {x|x is a natural number and x < 8}.


Reading : the set of all x such that x is a natural number and is less than 8.
The general form is {x|P (x)} where P is some predicate (condition, property).

Recursive Rules

For example, the set E of even numbers greater than 3 can be generated by the rules

(a) 4 ∈ E (b) if x ∈ E, then x + 2 ∈ E (c) nothing else belongs to E.

The first rule is the basis of the recursion, the second one generates new elements from the elements
defined before and the third rule restricts the defined set to the elements generated by (a) and (b).
The notion of a set does not allow for multiple instances (repetitions) of the same element in the
set; for example, the listing {1, 2, 2, 3} describes the same set as {1, 2, 3}.
The collection, out of which all sets under consideration may be formed, is called the universe of
discourse or universal set, denoted by U.
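The three ways of specifying a set map naturally onto code. The sketch below is only an illustration: the search range used for the predicate and the cut-off `bound` for the recursive rules are assumptions, needed because a program can only hold finitely many elements.

```python
# List notation: write the members out explicitly.
listed = {2, 3, 5}

# Predicate notation: {x | x is a natural number and x < 8}.
# The search range 1..99 is an arbitrary finite stand-in for N.
below_eight = {x for x in range(1, 100) if x < 8}

def evens_greater_than_3(bound):
    """Generate E by the recursive rules, stopping at an assumed bound."""
    members = set()
    x = 4                # rule (a): 4 belongs to E
    while x <= bound:
        members.add(x)
        x += 2           # rule (b): if x is in E, so is x + 2
    return members       # rule (c): nothing else belongs to E

print(sorted(below_eight))
print(sorted(evens_greater_than_3(12)))

# Repetitions are ignored: both literals describe the same set.
print({1, 2, 2, 3} == {1, 2, 3})
```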

10.7 The Empty set (Null Set)

We have that the fundamental property of a set is that we can assert of each object whether or not
it is a member of the set.
Consider a set constructed by asserting of each object that it is not a member of the set. This set
has no members and is therefore called the empty set.
Definition 10.7.1. The null or empty set is the set that does not contain any elements, denoted
∅ = {} = {x | x ≠ x}.
Example 10.7.1. Each of the following sets is empty: (i) {x ∈ R | x² = −1} (ii) {x ∈ Z | x² = 2}.
Theorem 10.7.1. There is exactly one empty set.

10.8 Identity and Cardinality

Two sets are identical if and only if (iff) they have the same elements or both are empty. So A = B
iff, for every x, x ∈ A ⇔ x ∈ B.
Example 10.8.1. {0, 2, 4} = {x | x is an even non-negative integer less than 5}.

The number of elements in a set A is called the cardinality of A, denoted by |A|. The cardinality
of a finite set is a natural number. Infinite sets also have cardinalities but they are not natural
numbers. The set A is said to be countable or enumerable if there is a way to list the elements
of A.
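Set identity and cardinality can be tried out directly with Python's built-in `set` type. This is a small illustration for finite sets only; the variable names are chosen here for the example.

```python
A = {0, 2, 4}

# The same set given by a predicate: even integers x with 0 <= x < 5.
B = {x for x in range(0, 5) if x % 2 == 0}

# Two sets are identical iff they have the same elements;
# the order in which the elements are written is irrelevant.
same_elements = (A == B)
order_irrelevant = ({1, 2, 3} == {3, 1, 2})

# Cardinality of a finite set.
cardinality = len(A)

print(same_elements, order_irrelevant, cardinality)
```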

10.9 Russell’s Paradox (antinomy)

A paradox (antinomy) is an apparently true statement that seems to lead to a logical self-contradiction.
It is important to note that a given property P (x) does not necessarily determine a set, i.e., we
cannot say that to any arbitrary property P there corresponds a set whose elements satisfy the
property P .
Consider the following. There was once a barber in a town in which every man either shaved
himself or was shaved by the barber, and the barber shaved only the men who did not shave
themselves. Did the barber shave himself?
Suppose that he did shave himself. Since he shaved only the men in town who did not shave
themselves, he did not shave himself.
Suppose instead that he did not shave himself. Every man in town either shaved himself or was
shaved by the barber, so the barber shaved him, i.e., he did shave himself. Either way we have a
contradiction.
Russell observed that if S is a set, then either S ∈ S or S ∉ S, since a given object is either a
member of a given set or is not a member of that set. Consider the set of all sets that are not
members of themselves, R = {x | x is a set and x ∉ x}. Since R is an object, either R ∈ R or R ∉ R.

(i) Assume R ∈ R. Then R is a set with R ∉ R, by definition of R.
(ii) Assume R ∉ R. Then R ∈ R by definition of R, since we are assuming R is a set. But we
cannot have both R ∈ R and R ∉ R, so we reach a contradiction.

In both cases we have inferred the paradox that R ∈ R iff R ∉ R. In other words, the assumption
that R is a set has led to a contradiction, and therefore there is no such set; in particular, there is
no such thing as the set of all sets. To avoid unnecessary paradoxes, we assume the existence of the
universal set, U. All this leads to the following problems

1. There are things that are true in mathematics (based on assumptions).


2. There are things that are false.
3. There are things that are true that can never be proved.
4. There are things that are false that can never be disproved.

After this paradox was described, set theory had to be reformulated axiomatically as axiomatic
set theory.

10.10 Inclusion

Definition 10.10.1. Fix a universal set U. If A and B are sets (with all members in U), we write
A ⊆ B or B ⊇ A iff, for all x ∈ U, x ∈ A =⇒ x ∈ B. (⊆ is the set inclusion symbol.)

A set A is a subset of a set B iff every element of A is also an element of B. If A ⊆ B and A ≠ B,
we call A a proper subset of B and write A ⊂ B.
Theorem 10.10.1. If A ⊆ B and B ⊆ C then A ⊆ C.

Proof. Let x ∈ A, then since A ⊆ B, we have x ∈ B and given that B ⊆ C, we conclude that
x ∈ C, thus A ⊆ C.
Example 10.10.1. (i) {a, b} ⊆ {d, a, b, e} (ii) {a, b} ⊆ {a, b} (iii) {a, b} ⊂ {d, a, b, e}
(iv) {a, b} ⊄ {a, b}.

Note that the empty set is a subset of every set, ∅ ⊆ A, for every set A and that for any set A, we
have A ⊆ A.

10.11 Axiom of Extensionality

Theorem 10.11.1. For any two sets A and B, A = B ⇐⇒ A ⊆ B and B ⊆ A.

10.11.1 Power Sets

The set of all subsets of A is called the power set of A and is denoted by P(A). If |A| is finite,
then |P(A)| = 2^|A|.
Example 10.11.1. If A = {a, b}, then P(A) = {∅, {a}, {b}, {a, b}}.

From the above example, a ∈ A, {a} ⊆ A, {a} ∈ P(A), ∅ ⊆ A, ∅ ∉ A, ∅ ⊆ P(A), ∅ ∈ P(A).
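For a small finite set the power set can be listed by machine, which also confirms the count 2^|A|. The function `power_set` below is a sketch written for this illustration; it uses `frozenset` because Python sets can only contain hashable (immutable) members.

```python
from itertools import combinations

def power_set(elements):
    """Return the set of all subsets of a finite collection."""
    items = list(elements)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }

A = {"a", "b"}
P = power_set(A)

print(len(P) == 2 ** len(A))      # |P(A)| = 2^|A|
print(frozenset() in P)           # the empty set is a member of P(A)
print(frozenset({"a"}) in P)      # {a} is a member of P(A)
print("a" in A, {"a"} <= A)       # a is a member of A, {a} is a subset of A
```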

10.12 Operations on Sets

10.12.1 Union and Intersection

Let A and B be arbitrary sets. The union of A and B, written A ∪ B, is the set whose elements
are just the elements of A or B or both.
A ∪ B := {x|x ∈ A or x ∈ B}.
Example 10.12.1. Let K = {a, b}, L = {c, d}, M = {b, d}, then K ∪ L = {a, b, c, d},
K ∪ M = {a, b, d}, L ∪ M = {b, c, d}, (K ∪ L) ∪ M = K ∪ (L ∪ M ) = {a, b, c, d}, K ∪ K = K,
K ∪ ∅ = ∅ ∪ K = K = {a, b}.

The intersection of A and B, written A ∩ B, is the set whose elements are just the elements of
both A and B.
A ∩ B := {x|x ∈ A and x ∈ B}.
Example 10.12.2. K ∩ L = ∅, K ∩ M = {b}, L ∩ M = {d}, (K ∩ L) ∩ M = K ∩ (L ∩ M ) = ∅,
K ∩ K = K, K ∩ ∅ = ∅ ∩ K = ∅.
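The unions and intersections of K, L and M computed above can be reproduced with Python's set operators `|` (union) and `&` (intersection). This is just a check of the worked examples, not an additional method.

```python
K = {"a", "b"}
L = {"c", "d"}
M = {"b", "d"}
empty = set()

unions = {
    "K ∪ L": K | L,
    "K ∪ M": K | M,
    "L ∪ M": L | M,
}
intersections = {
    "K ∩ L": K & L,
    "K ∩ M": K & M,
    "L ∩ M": L & M,
}

for name, value in {**unions, **intersections}.items():
    print(name, "=", sorted(value))

# Associativity and the behaviour of the empty set.
print((K | L) | M == K | (L | M))
print((K & L) & M == K & (L & M))
print(K | empty == K, K & empty == empty)
```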

10.13 Properties of ∪ and ∩


1. Every element x in A ∩ B belongs to both A and B, hence x belongs to A and x belongs to
B. Thus A ∩ B is a subset of A and of B i.e.,
A ∩ B ⊆ A and A ∩ B ⊆ B.

2. An element x belongs to the union A ∪ B if x belongs to A or x belongs to B, hence every
element in A belongs to A ∪ B and every element in B belongs to A ∪ B, i.e.,

A⊆A∪B and B ⊆ A ∪ B.

Theorem 10.13.1. For any sets A and B, we have (i) A ∩ B ⊆ A ⊆ A ∪ B and


(ii) A ∩ B ⊆ B ⊆ A ∪ B.

Theorem 10.13.2. The following are equivalent, A ⊆ B, A ∩ B = A, A ∪ B = B.

Proof. Suppose A ⊆ B and let x ∈ A. Then x ∈ B, hence x ∈ A ∩ B and A ⊆ A ∩ B. Also
A ∩ B ⊆ A. Therefore A ∩ B = A. Conversely, suppose A ∩ B = A and let x ∈ A. Then x ∈ (A ∩ B), hence
x ∈ A and x ∈ B. Therefore A ⊆ B.
Suppose again that A ⊆ B. Let x ∈ (A ∪ B), then x ∈ A or x ∈ B. If x ∈ A, then x ∈ B because
A ⊆ B. In either case x ∈ B. Therefore A ∪ B ⊆ B. But B ⊆ A ∪ B. Therefore A ∪ B = B. Now
suppose A ∪ B = B and let x ∈ A. Then x ∈ (A ∪ B) = B, hence x ∈ B. Therefore A ⊆ B.

Definition 10.13.1. Two sets A and B are called disjoint sets if the intersection of A and B is
the null set i.e., A ∩ B = ∅.

10.14 Difference and Complement

Definition 10.14.1. A minus B, written A\B or A−B, which subtracts from A all elements which
are in B (also called relative complement, or the complement of B relative to A), is defined as

A − B := {x | x ∈ A and x ∉ B}.

Example 10.14.1. K − L = {a, b}, K − K = ∅, K − M = {a}, K − ∅ = K, L − M = {c},


∅ − K = ∅.

10.14.1 Symmetric Difference

Definition 10.14.2. A △ B = A ⊕ B := {x | x ∈ A or x ∈ B but not in both} or

A △ B = A ⊕ B := (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A).

The complement of a set A is the set of elements which do not belong to A, i.e., the difference of
the universal set U and A. Denote the complement of A by A′ or A^c.

A′ = {x | x ∈ U and x ∉ A} or A′ = U − A.

Example 10.14.2. Let U = N and let E = {2, 4, 6, . . . }, the set of all even natural numbers. Then
E^c = {1, 3, 5, . . . }, the set of odd natural numbers.
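Difference, symmetric difference and complement also have direct counterparts in Python (`-`, `^`, and a difference from the universe). In the sketch below the finite universe `U = {1, ..., 10}` is an assumption standing in for the natural numbers, since a complement only makes sense relative to a chosen universal set.

```python
K = {"a", "b"}
L = {"c", "d"}
M = {"b", "d"}

# Relative complement (difference).
print(sorted(K - L), sorted(K - M), sorted(L - M), sorted(K - K))

# Symmetric difference: elements in exactly one of the two sets.
sym_diff = K ^ M
print(sorted(sym_diff))
print(sym_diff == (K | M) - (K & M) == (K - M) | (M - K))

# Complement relative to an assumed finite universe.
U = set(range(1, 11))
E = {x for x in U if x % 2 == 0}
E_complement = U - E
print(sorted(E_complement))
```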

10.15 Venn Diagrams

A simple and instructive way of illustrating the relationship between sets is the use of the so-called
Venn-Euler diagrams or simply Venn diagrams.

10.16 Set Theoretic Equalities


1. Idempotent Laws (i) X ∪ X = X (ii) X ∩ X = X.

2. Commutative Laws (i) X ∪ Y = Y ∪ X (ii) X ∩ Y = Y ∩ X.

3. Associative Laws (i) (X ∪ Y ) ∪ Z = X ∪ (Y ∪ Z) (ii) (X ∩ Y ) ∩ Z = X ∩ (Y ∩ Z).

4. Distributive Laws (i) X ∪(Y ∩Z) = (X ∪Y )∩(X ∪Z) (ii) X ∩(Y ∪Z) = (X ∩Y )∪(X ∩Z).

5. Identity Laws (i) X ∪ ∅ = X (ii) X ∪ U = U (iii) X ∩ ∅ = ∅ (iv) X ∩ U = X.

6. Complement Laws (i) X ∪ X^c = U (ii) (X^c)^c = X (iii) X ∩ X^c = ∅ (iv) X − Y = X ∩ Y^c.

7. De Morgan’s Laws (i) (X ∪ Y)^c = X^c ∩ Y^c (ii) (X ∩ Y)^c = X^c ∪ Y^c.

8. Consistency Principle (i) X ⊆ Y iff X ∪ Y = Y (ii) X ⊆ Y iff X ∩ Y = X.
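A brute-force check over a small universe is a useful way to catch a mis-stated law before attempting a proof. The sketch below tests several of the equalities above for every choice of X, Y, Z among the subsets of an assumed universe U = {1, 2, 3}. Passing the check is evidence, not a proof; the proofs are the element-chasing arguments used in this chapter.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})   # small universe, assumed for illustration

def subsets(s):
    items = sorted(s)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]

def complement(x):
    return U - x

all_hold = True
for X, Y, Z in product(subsets(U), repeat=3):
    checks = [
        X | (Y & Z) == (X | Y) & (X | Z),                 # distributive (i)
        X & (Y | Z) == (X & Y) | (X & Z),                 # distributive (ii)
        complement(X | Y) == complement(X) & complement(Y),  # De Morgan (i)
        complement(X & Y) == complement(X) | complement(Y),  # De Morgan (ii)
        X - Y == X & complement(Y),                       # complement law (iv)
        (X <= Y) == ((X | Y) == Y),                       # consistency (i)
        (X <= Y) == ((X & Y) == X),                       # consistency (ii)
    ]
    all_hold = all_hold and all(checks)

print(all_hold)
```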

Example 10.16.1. Show that (A^c)^c = A.

Proof. We need to show that A ⊆ (A^c)^c and (A^c)^c ⊆ A. Let x ∈ A, then x ∉ A^c. If x ∉ A^c, then
x ∈ (A^c)^c. By definition of subsets, A ⊆ (A^c)^c.
We want to show that (A^c)^c ⊆ A. Let y ∈ (A^c)^c, then y ∉ A^c. If y ∉ A^c, then y ∈ A. We have
shown that y ∈ (A^c)^c =⇒ y ∈ A. Thus (A^c)^c ⊆ A. By equality of sets (A^c)^c = A.

Example 10.16.2. Show that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

Proof. Let D = A ∩ (B ∪ C) and E = (A ∩ B) ∪ (A ∩ C). We have to prove first that D ⊆ E. Let
x ∈ D, then x ∈ A and x ∈ (B ∪ C). Since x ∈ (B ∪ C), either x ∈ B or x ∈ C or both. In case
x ∈ B we have x ∈ A and x ∈ B, so x ∈ (A ∩ B). On the other hand, if x ∉ B, then we must have
x ∈ C, so x ∈ (A ∩ C). Taking these two cases together, x ∈ (A ∩ B) or x ∈ (A ∩ C), so x ∈ E.
Now, we prove that E ⊆ D. Let x ∈ E. Suppose first that x ∈ (A ∩ B), then x ∈ A and x ∈ B, so
x ∈ A and x ∈ (B ∪ C), so x ∈ D. On the other hand, if x ∉ (A ∩ B), then x ∈ (A ∩ C), so again
we obtain x ∈ A and x ∈ (B ∪ C), giving x ∈ D. Hence E ⊆ D. Hence both D ⊆ E and E ⊆ D
and we conclude that D = E and consequently A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

10.17 Counting Elements in Sets

If A and B are disjoint sets, then


|A ∪ B| = |A| + |B|,
otherwise
|A ∪ B| = |A| + |B| − |A ∩ B|.
Example 10.17.1. Let A = {a, b, c, d, e} and B = {d, e, f, g, h, i}, so that A∪B = {a, b, c, d, e, f, g, h, i}
and A ∩ B = {d, e}. Since |A| = 5, |B| = 6, |A ∪ B| = 9, |A ∩ B| = 2, we have
|A ∪ B| = |A| + |B| − |A ∩ B| = 5 + 6 − 2 = 9.
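The counting formula can be checked on the sets of Example 10.17.1 in a few lines. The code is only a restatement of the example; `len` plays the role of |·|.

```python
A = set("abcde")
B = set("defghi")

union_size = len(A | B)
by_formula = len(A) + len(B) - len(A & B)

print(union_size, by_formula)

# For disjoint sets the correction term vanishes.
C = {"x", "y"}
print(len(A | C) == len(A) + len(C))
```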

10.18 The Algebra of Sets

We have considered the problem of showing that two sets are the same, however this technique
becomes tedious should the expressions involved be at all complicated. We shall develop an algebra
of sets, to assist us in simplifying a given expression. The following basic laws are easily established.
Law 1 : (A^c)^c = A Law 2 : A ∪ B = B ∪ A Law 3 : A ∩ B = B ∩ A
Law 4 : A ∪ (B ∪ C) = (A ∪ B) ∪ C Law 5 : A ∩ (B ∩ C) = (A ∩ B) ∩ C
Law 6 : A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) Law 7 : A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Law 8 : (A ∪ B)^c = A^c ∩ B^c Law 9 : (A ∩ B)^c = A^c ∪ B^c Law 10 : U^c = ∅
Law 11 : ∅^c = U Law 12 : A ∪ ∅ = A Law 13 : A ∪ U = U Law 14 : A ∩ U = A
Law 15 : A ∩ ∅ = ∅ Law 16 : A ∪ A^c = U Law 17 : A ∩ A^c = ∅.
Example 10.18.1. By using the algebra of sets, show that A ∪ (B ∩ A^c) = A ∪ B.

Proof.

A ∪ (B ∩ A^c) = (A ∪ B) ∩ (A ∪ A^c) by Law 6
= (A ∪ B) ∩ U by Law 16
= A ∪ B by Law 14.

10.19 Set Products

10.19.1 Ordered Pairs

Definition 10.19.1. Let n be any natural number and let a1 , a2 , . . . , an be any objects. Then
(a1 , a2 , . . . , an ) denotes the ordered n-tuple with first term a1 , second term a2 , . . . and nth term an .

Example 10.19.1. (5, 7) denotes the ordered pair whose first term is 5 and second term 7. Note
that (5, 7, 2) is called an ordered triple, (5, 7, 2, 4) is called an ordered 4-tuple.

The fundamental statement we can make about an ordered n-tuple is that a given object is the kth
term of an ordered n-tuple.
Definition 10.19.2. Let A and B be any non-empty sets, then

A × B := {(a, b)|a ∈ A and b ∈ B}.

If A and B are both finite sets, then |A × B| = |A| · |B|. If A = B, we sometimes write A2 for
A × A.
Example 10.19.2. 1. If A = {1, 2} and B = {2, 3, 4}, then A × B = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)}
and B × A = {(2, 1), (2, 2), (3, 1), (3, 2), (4, 1), (4, 2)}.
Notice that A × B ≠ B × A, in general.

2. The Cartesian product R × R = R2 is the set of all ordered pairs of real numbers and this
represents the 2-dimensional Cartesian plane.

3. (s1 , t1 ) = (s2 , t2 ) if and only if s1 = s2 and t1 = t2 .
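Cartesian products of finite sets can be generated with `itertools.product` from the Python standard library. The sketch below reproduces the first part of Example 10.19.2; the helper name `cartesian` is chosen here for illustration.

```python
from itertools import product

def cartesian(first, second):
    """Return first x second as a set of ordered pairs (tuples)."""
    return set(product(first, second))

A = {1, 2}
B = {2, 3, 4}

AxB = cartesian(A, B)
BxA = cartesian(B, A)

print(sorted(AxB))
print(sorted(BxA))
print(len(AxB) == len(A) * len(B))   # |A x B| = |A| . |B|
print(AxB == BxA)                    # order of the factors matters

# Ordered pairs are equal only if they agree term by term.
print((1, 2) == (1, 2), (1, 2) == (2, 1))
```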

10.20 Theorems on Set Products

Let A, B, C, D be sets, then

1. A × (B ∪ C) = (A × B) ∪ (A × C).

2. A × (B ∩ C) = (A × B) ∩ (A × C).

3. (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).

4. (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).

5. (A − B) × C = (A × C) − (B × C).

6. If A and B are non-empty sets, then A × B = B × A if and only if A = B.

7. If A1 ∈ P(A) and B1 ∈ P(B), then A1 × B1 ∈ P(A × B).


Example 10.20.1. Prove that (A ∪ B) × C = (A × C) ∪ (B × C).

Proof. Consider any element (u, v) ∈ (A ∪ B) × C. By definition u ∈ (A ∪ B) and v ∈ C. Thus


u ∈ A or u ∈ B. If u ∈ A, then (u, v) ∈ (A × C) and if u ∈ B, then (u, v) ∈ (B × C). Thus
(u, v) is in A × C or in B × C and therefore (u, v) ∈ (A × C) ∪ (B × C). This proves that
(A ∪ B) × C ⊆ (A × C) ∪ (B × C).

Now consider any element (u, v) ∈ (A × C) ∪ (B × C). This implies that (u, v) ∈ (A × C) or
(u, v) ∈ (B × C). In the first case u ∈ A and v ∈ C and in the second case u ∈ B and v ∈ C. Thus
u ∈ (A∪B) and v ∈ C which implies (u, v) ∈ (A∪B)×C. Therefore (A×C)∪(B×C) ⊆ (A∪B)×C.
Hence (A ∪ B) × C = (A × C) ∪ (B × C).

10.21 Tutorial 9
1. Let A = {2, 4, 6} and B = {3, 4, 5, 6, 7, 8, 9, 10}. What are the cardinalities of the sets A ∩ B and A ∪ B?

2. List the elements of the following sets where P = {1, 2, 3, . . .}.

(a) A = {x : x ∈ P, 3 < x < 12}


(b) B = {x : x ∈ P, x is even, x < 15}
(c) C = {x : x ∈ P, 4 + x = 3}
(d) D = {x : x ∈ P, x is a multiple of 5}.

3. Consider the universal set U = {1, 2, 3, . . . , 9} and the sets: A = {1, 2, 3, 4, 5}, B = {4, 5, 6, 7},
C = {5, 6, 7, 8, 9}, D = {1, 3, 5, 7, 9}, E = {2, 4, 6, 8}, F = {1, 5, 9}. Find:

(a) A ∪ B and D ∩ F (b) B^c, D^c, U^c, ∅^c (c) A − B, B − A, D − E, F − D

(d) A ⊕ B, C ⊕ D, E ⊕ F (e) A ∩ (B ∪ E), (A ∩ D) − B, (B ∩ F ) ∪ (C ∩ E).

4. Let A = {1, 2, 3, 4, 5}, find the power set P(A) of A.

5. If A, B and C are any three sets, prove that

(a) A − B = A ∩ B′ (b) A ⊂ B implies B′ ⊂ A′

(e) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) and A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

6. Use the laws of algebra of sets to prove that

(a) (A ∩ B) ∪ (A ∩ B′) = A, (b) (A ∩ U) ∩ (∅ ∪ A′) = ∅, where U is the universal set.

7. Define A − B = A ∩ B′ and A △ B = (A − B) ∪ (B − A). Simplify each of the following:

(i) A △ ∅, (ii) A △ U, (iii) A △ A, (iv) A △ A′.

8. Let A, B, C, D be sets, prove the following

(b) A × (B ∩ C) = (A × B) ∩ (A × C),
(c) (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).

Introduction to Probability Theory

10.22 Probability

Probability theory provides a mathematical foundation to concepts such as “probability”, “information”,
“belief”, “uncertainty”, “confidence”, “randomness”, “variability”, “chance” and “risk”.
Probability theory is important to empirical scientists because it gives them a rational framework
to make inferences and test hypotheses based on uncertain empirical data. Probability theory is
also useful to engineers building systems that have to operate intelligently in an uncertain world.
For example, some of the most successful approaches in machine perception (e.g., automatic speech
recognition, computer vision) and artificial intelligence are based on probabilistic models. Moreover,
probability theory is also proving very valuable as a theoretical framework for scientists trying
to understand how the brain works. Many computational neuroscientists think of the brain as a
probabilistic computer built with unreliable components, i.e., neurons, and use probability theory
as a guiding framework to understand the principles of computation used by the brain. Consider
the following examples:

• You need to decide whether a coin is loaded (i.e., whether it tends to favor one side over the
other when tossed). You toss the coin 6 times and in all cases you get “Tails”. Would you
say that the coin is loaded?

• You are trying to figure out whether newborn babies can distinguish green from red. To do
so you present two colored cards (one green, one red) to 6 newborn babies. You make sure
that the 2 cards have equal overall luminance so that they are indistinguishable if recorded
by a black and white camera. The 6 babies are randomly divided into two groups. The first
group gets the red card on the left visual field, and the second group on the right visual field.
You find that all 6 babies look longer at the red card than at the green card. Would you say
that babies can distinguish red from green?

• A pregnancy test has a 99% sensitivity (i.e., 99 out of 100 pregnant women test positive) and 95%
specificity (i.e., 95 out of 100 non-pregnant women test negative). A woman believes she has a
10% chance of being pregnant. She takes the test and tests positive. How should she combine
her prior beliefs with the results of the test?

• You need to design a system that detects a sinusoidal tone of 1000Hz in the presence of white
noise. How should you design the system to solve this task optimally?

• How should the photoreceptors in the human retina be interconnected to maximize information
transmission to the brain?

While these tasks appear different from each other, they all share a common problem: The need to
combine different sources of uncertain information to make rational decisions. Probability theory
provides a very powerful mathematical framework to do so. We now go into the mathematical
aspects of probability theory.
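Two of the questions above can already be quantified with a few lines of arithmetic, as a preview of what the theory will justify. The sketch below makes these assumptions: coin tosses are independent, the 99% figure is read as P(positive | pregnant), the 95% figure as P(negative | not pregnant), and the prior belief of 10% is combined with the test result by Bayes' rule, which has not yet been introduced at this point in the notes.

```python
from fractions import Fraction

# Coin example: probability of 6 tails in 6 tosses if the coin is fair.
p_six_tails_if_fair = Fraction(1, 2) ** 6

# Pregnancy test example.
prior = Fraction(10, 100)                 # believed chance of pregnancy
p_pos_given_pregnant = Fraction(99, 100)  # 99 of 100 pregnant women test positive
p_neg_given_not = Fraction(95, 100)       # 95 of 100 non-pregnant women test negative

# Total probability of a positive result.
p_positive = (p_pos_given_pregnant * prior
              + (1 - p_neg_given_not) * (1 - prior))

# Bayes' rule: updated belief after a positive test.
posterior = p_pos_given_pregnant * prior / p_positive

print(p_six_tails_if_fair)
print(posterior, float(posterior))
```

Under these assumptions a fair coin gives six tails only once in 64 such experiments, and a positive test raises the woman's belief from 10% to 11/16, i.e. about 69%, rather than to 99%.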

10.23 Sample Spaces

A set S that consists of all possible outcomes of a random experiment is called a sample space,
and each outcome is called a sample point. Often there will be more than one sample space that
can describe outcomes of an experiment, but there is usually only one that will provide the most
information.

Example 10.23.1. If we toss a die, then one sample space is given by {1, 2, 3, 4, 5, 6} while another
is {even, odd}. It is clear, however, that the latter would not be adequate to determine, for example,
whether an outcome is divisible by 3.

The sample space is also called the outcome space, reference set, and universal set. It is
often useful to portray a sample space graphically. In such cases, it is desirable to use numbers in
place of letters whenever possible. If a sample space has a finite number of points, it is called a
finite sample space. If it has as many points as there are natural numbers 1, 2, 3, . . . , it is called a
countably infinite sample space. If it has as many points as there are in some interval on the x axis,
such as 0 ≤ x ≤ 1, it is called a noncountably infinite sample space. A sample space that is finite
or countably infinite is often called a discrete sample space, while one that is noncountably infinite is
called a nondiscrete sample space.

Example 10.23.2. The sample space resulting from tossing a die yields a discrete sample space.
However, picking any number, not just integers, from 1 to 10, yields a non-discrete sample space.

10.24 Events

We have defined outcomes as the elements of a sample space S. In practice, we are interested in
assigning probability values not only to outcomes but also to sets of outcomes. For example, we
may want to know the probability of getting an even number when rolling a die. In other words,
we want the probability of the set {2, 4, 6}. An event is a subset A of the sample space S, i.e., it is
a set of possible outcomes. If the outcome of an experiment is an element of A, we say that the event
A has occurred. An event consisting of a single point of S is called a simple or elementary event.
As particular events, we have S itself, which is the sure or certain event since an element of S must
occur, and the empty set ∅, which is called the impossible event because an element of ∅ cannot
occur.
By using set operations on events in S, we can obtain other events in S. For example, if A and B
are events, then

1. A ∪ B is the event “either A or B or both.” A ∪ B is called the union of A and B.

2. A ∩ B is the event “both A and B.” A ∩ B is called the intersection of A and B.

3. A′ is the event “not A.” A′ is called the complement of A.

4. A − B = A ∩ B′ is the event “A but not B.” In particular, A′ = S − A.

If the sets corresponding to events A and B are disjoint, i.e., A ∩ B = ∅, we often say that the
events are mutually exclusive. This means that they cannot both occur. We say that a collection
of events A1 , A2 , . . . , An is mutually exclusive if every pair in the collection is mutually exclusive.

Example 10.24.1. Consider an experiment of tossing a coin twice. Let A be the event “at least one
head occurs” and B the event “the second toss results in a tail.” Find the events A ∪ B, A ∩ B, A′
and A − B.

Solution: We observe that A = {HT, TH, HH} and B = {HT, TT}, and so we have

A ∪ B = {HT, TH, HH, TT} = S,
A ∩ B = {HT},
A′ = {TT},
A − B = {TH, HH}.
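
The event operations in this example map directly onto set operations. The following is a minimal sketch in Python, assuming outcomes are written as two-letter strings; the variable names are illustrative only.

```python
# Outcomes of tossing a coin twice, written as two-letter strings.
S = {"HH", "HT", "TH", "TT"}

# A: at least one head occurs; B: the second toss results in a tail.
A = {outcome for outcome in S if "H" in outcome}
B = {outcome for outcome in S if outcome[1] == "T"}

union = A | B          # A ∪ B
intersection = A & B   # A ∩ B
complement_A = S - A   # A′
difference = A - B     # A − B

print(union == S, intersection, complement_A, sorted(difference))
```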

10.25 The Concept of Probability

In any random experiment there is always uncertainty as to whether a particular event will or will
not occur. As a measure of the chance, or probability, with which we can expect the event to occur,
it is convenient to assign a number between 0 and 1. If we are sure or certain that an event will
occur, we say that its probability is 100% or 1. If we are sure that the event will not occur, we
say that its probability is zero. If, for example, the probability is 1/4, we would say that there is
a 25% chance it will occur and a 75% chance that it will not occur. Equivalently, we can say that
the odds against occurrence are 75% to 25%, or 3 to 1.

There are two important procedures by means of which we can estimate the probability of an
event.

1. CLASSICAL APPROACH: If an event can occur in h different ways out of a total of n
possible ways, all of which are equally likely, then the probability of the event is h/n.

Example 10.25.1. Suppose we want to know the probability that a head will turn up in a
single toss of a coin. Since there are two equally likely ways in which the coin can come up,
namely heads and tails (assuming it does not roll away or stand on its edge), and of these
two ways a head can arise in only one way, we reason that the required probability is 1/2. In
arriving at this, we assume that the coin is fair, i.e., not loaded in any way.

2. FREQUENCY APPROACH: If after n repetitions of an experiment, where n is very
large, an event is observed to occur in h of these, then the probability of the event is h/n.
This is also called the empirical probability of the event.

Example 10.25.2. If we toss a coin 1000 times and find that it comes up heads 532 times,
we estimate the probability of a head coming up to be 532/1000 = 0.532.
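
The frequency approach can be tried out with a simulated coin. The sketch below assumes a fair coin modelled by Python's pseudo-random generator; the function name `empirical_probability` and the fixed seed are our own choices, made only so that the run is repeatable.

```python
import random

def empirical_probability(n_tosses, seed=0):
    """Estimate P(head) for a simulated fair coin as h/n."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

for n in (10, 1000, 100000):
    print(n, empirical_probability(n))
```

For small n the estimate can be far from 1/2, which illustrates why the phrase "very large" matters in the definition.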

Both the classical and frequency approaches have serious drawbacks, the first because the words
“equally likely” are vague and the second because the “large number” involved is vague. Because
of these difficulties, mathematicians have been led to an axiomatic approach to probability.

10.26 The Axioms of Probability

Suppose we have a sample space S. If S is discrete, all subsets correspond to events and conversely;
if S is nondiscrete, only special subsets (called measurable) correspond to events. To each event A
in the class C of events, we associate a real number P(A). Then P is called a probability function,
and P(A) the probability of the event A, if the following axioms are satisfied.

Axiom 1. For every event A in the class C, P(A) ≥ 0.

Axiom 2. For the sure or certain event S in the class C, P(S) = 1.

Axiom 3. For any number of mutually exclusive events A1, A2, . . . , in the class C,
P(A1 ∪ A2 ∪ . . .) = P(A1) + P(A2) + . . . In particular, for two mutually exclusive events A1
and A2, P(A1 ∪ A2) = P(A1) + P(A2).

10.27 Some Important Theorems on Probability

From the above axioms we can now prove various theorems on probability that are important in
further work.

Theorem 10.27.1. If A1 ⊂ A2, then P(A1) ≤ P(A2) and P(A2 − A1) = P(A2) − P(A1).

Theorem 10.27.2. For every event A, 0 ≤ P(A) ≤ 1, i.e., a probability lies between 0 and 1.

Theorem 10.27.3. For ∅, the empty set, P(∅) = 0, i.e., the impossible event has probability zero.

Theorem 10.27.4. If A′ is the complement of A, then P(A′) = 1 − P(A).

Theorem 10.27.5. If A = A1 ∪ A2 ∪ A3 ∪ . . . ∪ An, where A1, A2, . . . , An are mutually exclusive
events, then
P(A) = P(A1) + P(A2) + P(A3) + . . . + P(An).
In particular, if A = S, the sample space, then
P(A1) + P(A2) + P(A3) + . . . + P(An) = 1.

Theorem 10.27.6. If A and B are any two events, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
More generally, if A1, A2, A3 are any three events, then
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1 ∩ A2) − P(A2 ∩ A3) − P(A3 ∩ A1) + P(A1 ∩ A2 ∩ A3).
Generalizations to n events can also be made.

Theorem 10.27.7. For any events A and B, P(A) = P(A ∩ B) + P(A ∩ B′).

Theorem 10.27.8. If an event A must result in the occurrence of one of the mutually exclusive
events A1 , A2 , . . . , An , then

P (A) = P (A ∩ A1 ) + P (A ∩ A2 ) + · · · + P (A ∩ An ).

10.28 Assignment of Probabilities

If a sample space S consists of a finite number of outcomes a1 , a2 , . . . , an , then by theorem 10.27.5,

P (A1 ) + P (A2 ) + . . . + P (An ) = 1

where A1 , A2 , . . . , An are elementary events given by Ai = {ai }.

It follows that we can arbitrarily choose any nonnegative numbers for the probabilities of these
simple events as long as the previous equation is satisfied. In particular, if we assume equal
probabilities for all simple events, then
\[ P(A_k) = \frac{1}{n}, \quad k = 1, 2, \ldots, n. \]
And if A is any event made up of h such simple events, we have
\[ P(A) = \frac{h}{n}. \]
This is equivalent to the classical approach to probability. We could of course use other procedures
for assigning probabilities, such as the frequency approach.

Assigning probabilities provides a mathematical model, the success of which must be tested by
experiment in much the same manner that the theories in physics or other sciences must be tested
by experiment.

Example 10.28.1. A single die is tossed once. Find the probability of a 2 or 5 turning up.

Solution: The sample space is S = {1, 2, 3, 4, 5, 6}. If we assign equal probabilities to the sample
points, i.e., if we assume that the die is fair, then
\[ P(1) = P(2) = \cdots = P(6) = \frac{1}{6}. \]
The event that either 2 or 5 turns up is indicated by 2 ∪ 5. Therefore,
\[ P(2 \cup 5) = P(2) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}. \]
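
When all simple events are equally likely, P(A) = h/n can be computed by counting. The following sketch assumes a finite sample space stored as a Python set and uses exact fractions; the helper name `classical_probability` is hypothetical.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Return h/n for an event in a finite, equally likely sample space."""
    favourable = event & sample_space  # ignore outcomes outside S
    return Fraction(len(favourable), len(sample_space))

S = {1, 2, 3, 4, 5, 6}
p_2_or_5 = classical_probability({2, 5}, S)
print(p_2_or_5)  # 1/3
```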

10.29 Conditional Probability

Let A and B be two events such that P(A) > 0. Denote by P(B|A) the probability of B given that A
has occurred. Since A is known to have occurred, it becomes the new sample space replacing the
original S. From this we are led to the definition
\[ P(B|A) \equiv \frac{P(A \cap B)}{P(A)} \tag{10.1} \]
or

P (A ∩ B) ≡ P (A)P (B|A). (10.2)

In words, this is saying that the probability that both A and B occur is equal to the probability
that A occurs times the probability that B occurs given that A has occurred. We call P (B|A)
the conditional probability of B given A, i.e., the probability that B will occur given that A has
occurred. It is easy to show that conditional probability satisfies the axioms of probability previously
discussed.

Example 10.29.1. Find the probability that a single toss of a die will result in a number less than
4 if

(a) no other information is given and

(b) it is given that the toss resulted in an odd number.

Solution:

(a) Let B denote the event {less than 4}. Since B is the union of the events 1, 2, or 3 turning up,
we see by Theorem 10.27.5 that
\[ P(B) = P(1) + P(2) + P(3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}, \]
assuming equal probabilities for the sample points.

(b) Letting A be the event {odd number}, we see that P(A) = 3/6 = 1/2. Also, P(A ∩ B) = 2/6 = 1/3.
Then
\[ P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{1/3}{1/2} = \frac{2}{3}. \]
Hence, the added knowledge that the toss results in an odd number raises the probability from
1/2 to 2/3.
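
The same calculation can be checked by counting outcomes. This is a small sketch assuming a fair die; the helper names `prob` and `conditional` are our own.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability of an event for a fair die."""
    return Fraction(len(event & S), len(S))

def conditional(b, a):
    """P(B|A) = P(A ∩ B) / P(A), defined only when P(A) > 0."""
    if prob(a) == 0:
        raise ValueError("P(A) must be positive")
    return prob(a & b) / prob(a)

B = {x for x in S if x < 4}       # number less than 4
A = {x for x in S if x % 2 == 1}  # odd number

print(prob(B), conditional(B, A))  # 1/2 2/3
```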

10.30 Theorems on Conditional Probability

Theorem 10.30.1. For any three events A1 , A2 , A3 , we have

P (A1 ∩ A2 ∩ A3 ) = P (A1 )P (A2 |A1 )P (A3 |A1 ∩ A2 ). (10.3)

In words, the probability that A1 and A2 and A3 all occur is equal to the probability that A1
occurs times the probability that A2 occurs given that A1 has occurred times the probability that
A3 occurs given that both A1 and A2 have occurred. The result is easily generalized to n events.

Theorem 10.30.2. If an event A must result in one of the mutually exclusive events A1 , A2 , . . . , An ,
then

P (A) = P (A1 )P (A|A1 ) + P (A2 )P (A|A2 ) + . . . + P (An )P (A|An ). (10.4)

10.31 Independent Events

If P (B|A) = P (B), i.e., the probability of B occurring is not affected by the occurrence or nonoc-
currence of A, then we say that A and B are independent events. This is equivalent to

P (A ∩ B) = P (A)P (B). (10.5)

Notice also that if this equation holds, then A and B are independent.

We say that three events A1, A2, A3 are independent if they are pairwise independent, i.e.,

P(Aj ∩ Ak) = P(Aj)P(Ak), j ≠ k, where j, k = 1, 2, 3, (10.6)

and

P (A1 ∩ A2 ∩ A3 ) = P (A1 )P (A2 )P (A3 ). (10.7)

Both of these properties must hold in order for the events to be independent. Independence of more
than three events is easily defined.
Note: In order to use this multiplication rule, all of your events must be independent.
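
To see why both (10.6) and (10.7) are required, consider the following illustrative example (ours, not taken from the text above): toss a fair coin twice and let A1 = "first toss is a head", A2 = "second toss is a head" and A3 = "both tosses agree". The sketch checks the two conditions by counting.

```python
from fractions import Fraction
from itertools import combinations

S = {"HH", "HT", "TH", "TT"}  # two tosses of a fair coin

def prob(event):
    """Classical probability on the four equally likely outcomes."""
    return Fraction(len(event), len(S))

A1 = {"HH", "HT"}  # first toss is a head
A2 = {"HH", "TH"}  # second toss is a head
A3 = {"HH", "TT"}  # both tosses agree

# Condition (10.6): every pair satisfies the product rule.
pairwise = all(prob(x & y) == prob(x) * prob(y)
               for x, y in combinations((A1, A2, A3), 2))
# Condition (10.7): the triple product rule.
triple = prob(A1 & A2 & A3) == prob(A1) * prob(A2) * prob(A3)

print(pairwise, triple)  # True False
```

Here the events are pairwise independent, yet P(A1 ∩ A2 ∩ A3) = 1/4 while P(A1)P(A2)P(A3) = 1/8, so the three events are not independent.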

10.32 Bayes’ Theorem or Rule

Suppose that A1 , A2 , . . . , An are mutually exclusive events whose union is the sample space S, i.e.,
one of the events must occur. Then if A is any event, we have the following important theorem:

Theorem 10.32.1. (Bayes’ Rule):
\[ P(A_k|A) = \frac{P(A_k)\,P(A|A_k)}{\sum_{j=1}^{n} P(A_j)\,P(A|A_j)}. \tag{10.8} \]

This enables us to find the probabilities of the various events A1 , A2 , . . . , An that can occur. For
this reason Bayes’ theorem is often referred to as a theorem on the probability of causes.
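
As an illustration, Bayes' rule answers the pregnancy-test question posed at the start of the chapter. The sketch below assumes the figures given there (prior 10%, 99 of 100 pregnant women test positive, 95 of 100 non-pregnant women test negative); the function name `bayes` is hypothetical.

```python
from fractions import Fraction

def bayes(priors, likelihoods):
    """Return the posteriors P(A_k | A) for a partition A_1, ..., A_n.

    priors[k] is P(A_k) and likelihoods[k] is P(A | A_k).
    """
    total = sum(p * l for p, l in zip(priors, likelihoods))  # P(A)
    return [p * l / total for p, l in zip(priors, likelihoods)]

# A_1 = pregnant, A_2 = not pregnant, A = positive test.
priors = [Fraction(10, 100), Fraction(90, 100)]
# P(+ | pregnant) = 0.99; P(+ | not pregnant) = 1 - 0.95 = 0.05.
likelihoods = [Fraction(99, 100), Fraction(5, 100)]

posterior = bayes(priors, likelihoods)
print(posterior[0], float(posterior[0]))  # 11/16 0.6875
```

Under these assumptions a positive test raises her belief from 10% to about 69%, not to 99%.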

10.33 Tutorial 10
1. There are 3 arrangements of the word DAD, namely DAD, ADD, and DDA. How many
arrangements are there of the word PROBABILITY?

2. A ball is drawn at random from a box containing 8 red balls, 17 white balls, and 9 blue balls.
Determine the probability that it is
(i) white, (ii) not blue, (iii) red or blue, (iv) neither white nor red.

3. A card is picked from a deck of 52 playing cards, without replacement, and then another one
is picked. What is the probability of picking (i) two red cards, (ii) one of each colour.

4. A die is loaded in such a way that each odd number is twice as likely to occur as each even
number. Find P (G), where G is the event that a number greater than 3 occurs on a single
roll of the die.

5. Events X and Y are such that P(X | Y) = 0.4, P(Y | X) = 0.25, P(X ∩ Y) = 0.12.

(i) Calculate the value of P(Y).
(ii) Give a reason why X and Y are not independent.
(iii) Calculate the value of P(X ∩ Y′).

6. Two dice are rolled.

A = {‘sum of two dice equals 3’}
B = {‘sum of two dice equals 7’}

(a) What is P(A | C)?
(b) What is P(B | C)?
(c) Are A and C independent? What about B and C?

7. From a batch of 100 items of which 20 are defective, exactly two items are chosen, one at a
time without replacement. Calculate the probabilities that:

(a) the first item chosen is defective,
(b) both items chosen are defective,
(c) the second item chosen is defective.

8. The punctuality of buses has been investigated by considering a number of bus journeys. In
the sample, 60% of buses had a destination of Masvingo, 20% Bulawayo and 20% Mutare.
The probabilities of a bus arriving late in Masvingo, Bulawayo or Mutare are 30%, 25% and
20% respectively. If a late bus is picked at random from the group under consideration, what
is the probability that it terminated in Masvingo?

9. Machines M and N produce 10% and 90% respectively of the production of a component
intended for the motor industry. From experience, it is known that the probability that
machine M produces a defective component is 0.01 while the probability that machine N
produces a defective component is 0.05. If a component is selected at random from a day’s
production and is found to be defective, find the probability that it was made by

(a) machine M
(b) machine N.

Complex Numbers and Polynomials

There are no secrets about the world of nature. There are secrets about the thoughts and intentions of men.

—Robert Oppenheimer

10.34 Introduction

No one person invented complex numbers, but controversies surrounding the use of these numbers
existed in the sixteenth century. In their quest to solve polynomial equations by formulas involving
radicals, early dabblers in mathematics were forced to admit that there were other kinds of numbers
besides positive integers. Equations such as x² + 2x + 2 = 0 and x³ = 6x + 4 that yielded solutions
−1 + √−1 and ∛(2 + √−4) + ∛(2 − √−4) caused particular consternation within the community of
fledgling mathematical scholars because everyone knew that there are no numbers such as √−1
and √−4, numbers whose square is negative. Such numbers exist only in one’s imagination, or
as one philosopher opined, “the imaginary, (the) bosom child of complex mysticism.” Over time
these imaginary numbers did not go away, mainly because mathematicians as a group are tenacious
and some are even practical. A famous mathematician held that even though they exist in our
imagination, nothing prevents us from employing them in calculations. Mathematicians also hate
to throw anything away. After all, a memory still lingered that negative numbers at first were
branded fictitious. The concept of number evolved over centuries; gradually the set of numbers grew
from just positive integers to include rational numbers, negative numbers, and irrational numbers.
But in the eighteenth century the number concept took a gigantic evolutionary step forward when
the German mathematician Carl Friedrich Gauss put the so-called imaginary numbers (or complex
numbers, as they were now beginning to be called) on a logical and consistent footing by treating
them as an extension of the real number system.

10.35 Complex Numbers

The set of all complex numbers is usually denoted by C. Since x² ≥ 0 for every real number x, the
equation
x² + 1 = 0
has no real solutions.

Introduce the imaginary number¹
i = √−1,
which is assumed to have the property
i² = (√−1)² = −1.

Complex numbers are usually written in the form a + bi where a and b are real numbers or can be
regarded as the ordered pair (a, b).

Ordered Pair Equivalent Notation


(3, 4) 3 + 4i
(0, 1) 0+i
(2, 0) 2 + 0i
(4, −2) 4 + (−2)i
¹ The symbol i was first used by the Swiss mathematician Leonhard Euler in 1777.

Geometrically, a complex number can be viewed either as a point or vector in the xy−plane.

Let us denote
z = a + bi.
The real number a is called the real part of z and the real number b is called the imaginary part
of z.

These numbers are denoted Re(z) and Im(z) respectively.


Example 10.35.1. Re(4 − 3i) = 4 and Im(4 − 3i) = −3.

When complex numbers are represented geometrically in the xy-coordinate system, the x-axis is
called the real axis, the y-axis, the imaginary axis, and the plane is called the complex plane.

Definition 10.35.1. Two complex numbers a + bi and c + di are defined to be equal, written
a + bi = c + di, if a = c and b = d.

If a = 0, then a + bi reduces to 0 + bi = bi. These complex numbers, which
correspond to points on the imaginary axis, are called purely imaginary numbers. For example,
z = 8i is a purely imaginary number.

10.35.1 Operations

Complex numbers can be added, subtracted, multiplied and divided.


(a + bi) + (c + di) = (a + c) + (b + d)i.
(a + bi) − (c + di) = (a − c) + (b − d)i.
k(a + bi) = (ka) + (kb)i, k ∈ R. (multiplication by a real number)
Because (−1)z + z = 0, we denote (−1)z as −z and call it the negative of z.
Example 10.35.2. If z1 = 4 − 5i, z2 = −1 + 6i, find z1 + z2 , z1 − z2 , 3z1 and −z2 .

Solution:
z1 + z2 = (4 − 5i) + (−1 + 6i) = (4 − 1) + (−5 + 6)i = 3 + i.
z1 − z2 = (4 − 5i) − (−1 + 6i) = (4 + 1) + (−5 − 6)i = 5 − 11i.
3z1 = 3(4 − 5i) = 12 − 15i.
−z2 = −1(z2 ) = (−1)(−1 + 6i) = 1 − 6i.

Multiplying two complex numbers a + bi and c + di as binomials and using i² = −1 yields
(a + bi)(c + di) = ac + bdi² + adi + bci = (ac − bd) + (ad + bc)i.

Example 10.35.3. 1. (3 + 2i)(4 + 5i) = (3 · 4 − 2 · 5) + (3 · 5 + 2 · 4)i = 2 + 23i.
2. i² = (0 + i)(0 + i) = (0 · 0 − 1 · 1) + (0 · 1 + 1 · 0)i = −1.
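
The multiplication rule can be checked against Python's built-in complex type, which writes the imaginary unit as `j`. The function `multiply` below is a hypothetical helper that applies the formula (ac − bd) + (ad + bc)i directly.

```python
def multiply(z1, z2):
    """Multiply (a + bi)(c + di) using (ac - bd) + (ad + bc)i."""
    a, b = z1.real, z1.imag
    c, d = z2.real, z2.imag
    return complex(a * c - b * d, a * d + b * c)

z1 = 3 + 2j   # Python writes the imaginary unit as j
z2 = 4 + 5j

print(multiply(z1, z2))             # (2+23j)
print(multiply(z1, z2) == z1 * z2)  # True
print(multiply(1j, 1j))             # (-1+0j)
```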

10.35.2 Rules of Complex Arithmetic

For any z, z1, z2, z3 ∈ C,

1. z1 + z2 = z2 + z1 .

2. z1 z2 = z2 z1 .

3. z1 + (z2 + z3 ) = (z1 + z2 ) + z3 .

4. z1 (z2 z3 ) = (z1 z2 )z3 .

5. z1 (z2 + z3 ) = z1 z2 + z1 z3 .

6. 0 + z = z.

7. z + (−z) = 0.

8. 1 · z = z

10.36 Modulus, Complex Conjugate and Division

10.36.1 Complex Conjugate

If z = a + bi is any complex number, then the conjugate of z, denoted by z̄, is defined as

z̄ = a − bi.

Geometrically, z̄ is the reflection of z about the real axis.

Example 10.36.1. 1. If z = 3 + 2i, then z̄ = 3 − 2i.

2. If z = −4 − 2i, then z̄ = −4 + 2i.

3. If z = 4, then z̄ = 4.

So z̄ = z if and only if z is a real number.

10.36.2 Modulus of a Complex Number

Definition 10.36.1. The modulus of a complex number z = a + bi, denoted |z|, is defined by
\[ |z| = \sqrt{a^2 + b^2}. \]
If b = 0, then z = a is a real number, and
\[ |z| = \sqrt{a^2 + 0^2} = \sqrt{a^2} = |a|. \]
So the modulus of a real number is simply its absolute value.

Example 10.36.2. Find |z| if z = 3 − 4i.

Solution: $|z| = \sqrt{3^2 + (-4)^2} = \sqrt{25} = 5$.

Theorem 10.36.1. For any complex number z,
\[ z\bar{z} = |z|^2 \quad \text{or} \quad |z| = \sqrt{z\bar{z}}. \]

Proof. If z = a + bi, then
\[ z\bar{z} = (a + bi)(a - bi) = a^2 - abi + bai - b^2 i^2 = a^2 + b^2 = |z|^2. \]

The modulus of a complex number has the additional properties
\[ |z_1 z_2| = |z_1||z_2| \quad \text{and} \quad \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}, \quad z_2 \neq 0. \]

10.36.3 Division of Complex Numbers

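For division, multiply the numerator and the denominator by the conjugate of the denominator:
\[ \frac{z_1}{z_2} = \frac{z_1 \bar{z}_2}{z_2 \bar{z}_2} = \frac{z_1 \bar{z}_2}{|z_2|^2}, \quad z_2 \neq 0. \]

Example 10.36.3. Express $\frac{3 + 4i}{1 - 2i}$ in the form a + bi.

Solution:
\[
\begin{aligned}
\frac{3 + 4i}{1 - 2i} &= \frac{(3 + 4i)(1 + 2i)}{(1 - 2i)(1 + 2i)} \\
&= \frac{3 + 6i + 4i + 8i^2}{1 + 2i - 2i - 4i^2} \\
&= \frac{-5 + 10i}{5} \\
&= -1 + 2i.
\end{aligned}
\]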
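
The division rule translates into a few lines of code. This is a sketch only; the helper `divide` is hypothetical and simply applies z1 z̄2 / |z2|², with |z2|² computed as z2 z̄2.

```python
def divide(z1, z2):
    """Compute z1 / z2 as z1 * conj(z2) / |z2|^2."""
    if z2 == 0:
        raise ZeroDivisionError("division by the complex number 0")
    modulus_squared = (z2 * z2.conjugate()).real  # equals a^2 + b^2
    return z1 * z2.conjugate() / modulus_squared

quotient = divide(3 + 4j, 1 - 2j)
print(quotient)  # (-1+2j)
```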

10.36.4 Properties of the Conjugate

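Theorem 10.36.2. For any complex numbers z, z1 and z2,

(a) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$.
(b) $\overline{z_1 - z_2} = \bar{z}_1 - \bar{z}_2$.
(c) $\overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2$.
(d) $\overline{\left(\frac{z_1}{z_2}\right)} = \frac{\bar{z}_1}{\bar{z}_2}$.
(e) $\bar{\bar{z}} = z$.

Proof. (a) Let z1 = a1 + b1 i and z2 = a2 + b2 i, then
\[
\begin{aligned}
\overline{z_1 + z_2} &= \overline{(a_1 + a_2) + (b_1 + b_2)i} \\
&= (a_1 + a_2) - (b_1 + b_2)i \\
&= (a_1 - b_1 i) + (a_2 - b_2 i) \\
&= \bar{z}_1 + \bar{z}_2.
\end{aligned}
\]

Since $|z| = (z\bar{z})^{1/2} = \sqrt{a^2 + b^2} = \left((\mathrm{Re}(z))^2 + (\mathrm{Im}(z))^2\right)^{1/2}$, then
\[ \mathrm{Re}(z) \leq |\mathrm{Re}(z)| = \sqrt{(\mathrm{Re}(z))^2} \leq \sqrt{(\mathrm{Re}(z))^2 + (\mathrm{Im}(z))^2} = |z|. \]
Similarly,
\[ \mathrm{Im}(z) \leq |\mathrm{Im}(z)| \leq |z|. \]
For any two complex numbers z1 and z2, we have that
\[ |z_1 + z_2| \leq |z_1| + |z_2|. \]
This is called the triangle inequality.

Proof.
\[
\begin{aligned}
|z_1 + z_2|^2 &= (z_1 + z_2)\overline{(z_1 + z_2)} = (z_1 + z_2)(\bar{z}_1 + \bar{z}_2) \\
&= z_1\bar{z}_1 + z_1\bar{z}_2 + \bar{z}_1 z_2 + z_2\bar{z}_2 \\
&= z_1\bar{z}_1 + 2\,\mathrm{Re}(z_1\bar{z}_2) + z_2\bar{z}_2.
\end{aligned}
\]
Using the fact that $2\,\mathrm{Re}(z_1\bar{z}_2) \leq 2|z_1\bar{z}_2| = 2|z_1||z_2|$, we get
\[ |z_1 + z_2|^2 \leq |z_1|^2 + 2|z_1||z_2| + |z_2|^2 = (|z_1| + |z_2|)^2. \]
Taking square roots, the result follows, that is
\[ |z_1 + z_2| \leq |z_1| + |z_2|. \]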

10.37 Polar Representation of Complex Numbers

If z = x + iy is a non-zero complex number, r = |z| and θ measures the angle from the positive real
axis to the vector z,

[Figure 10.1: Polar form. The point P = (r, θ), where r is the directed distance and θ is the directed angle.]

then
x = r cos θ and y = r sin θ,
so that z = x + iy can be written as

z = r cos θ + ir sin θ = r(cos θ + i sin θ).

This is called a polar form of z. The angle θ is called an argument of z and is denoted by
θ = arg z. The argument of z is not uniquely determined because we can add or subtract any
multiple of 2π from θ to produce another value of the argument.

One value of the argument in radians that satisfies −π < θ ≤ π is called the principal argument
of z and is denoted by θ = Arg z.

Example 10.37.1. Express z = 1 + √3 i in polar form using the principal argument.

Solution: The value of r is $r = |z| = \sqrt{(1)^2 + (\sqrt{3})^2} = \sqrt{4} = 2$. Since x = 1 and y = √3, it
follows that 1 = 2 cos θ and √3 = 2 sin θ. So cos θ = 1/2 and sin θ = √3/2. The only value of θ that
satisfies these relations and meets the requirement −π < θ ≤ π is θ = π/3. The polar form of z is
\[ z = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right). \]
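
The modulus and principal argument can also be obtained numerically. The sketch below uses `cmath.polar` from the Python standard library, which returns the pair (r, θ) with θ the principal argument; the result is a floating-point approximation rather than the exact value π/3.

```python
import cmath
import math

z = complex(1, math.sqrt(3))   # z = 1 + sqrt(3) i

r, theta = cmath.polar(z)      # modulus and principal argument
print(r, theta, math.pi / 3)

# Rebuild z from its polar form r(cos θ + i sin θ).
z_back = r * complex(math.cos(theta), math.sin(theta))
```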
We now show how polar forms can be used to give geometric interpretations of multiplication and
division of complex numbers.

Let z1 = r1(cos θ1 + i sin θ1) and z2 = r2(cos θ2 + i sin θ2). Multiplying, we obtain

z1 z2 = r1 r2 [(cos θ1 cos θ2 − sin θ1 sin θ2 ) + i(sin θ1 cos θ2 + cos θ1 sin θ2 )].

Recall:

cos(θ1 + θ2 ) = cos θ1 cos θ2 − sin θ1 sin θ2 .


sin(θ1 + θ2 ) = sin θ1 cos θ2 + cos θ1 sin θ2 .

We obtain
z1 z2 = r1 r2 [cos(θ1 + θ2 ) + i sin(θ1 + θ2 )]
which is a polar form of the complex number with modulus r1 r2 and argument θ1 + θ2 . Thus, we
have shown that
|z1 z2 | = |z1 ||z2 | and arg(z1 z2 ) = arg z1 + arg z2 .
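Also
\[ \frac{z_1}{z_2} = \frac{r_1}{r_2}\left[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)\right], \]
from which it follows that
\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}, \quad \text{if } z_2 \neq 0, \]
and
\[ \arg\left(\frac{z_1}{z_2}\right) = \arg z_1 - \arg z_2. \]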

10.38 De Moivre’s Formula

If n is a positive integer and z = r(cos θ + i sin θ), then
\[ z^n = z \cdot z \cdots z = r^n\big[\cos(\underbrace{\theta + \theta + \cdots + \theta}_{n \text{ terms}}) + i\sin(\underbrace{\theta + \theta + \cdots + \theta}_{n \text{ terms}})\big] \]
or
\[ z^n = r^n(\cos n\theta + i\sin n\theta). \tag{10.9} \]
In the special case r = 1, we have z = cos θ + i sin θ, so that (10.9) becomes
\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \]
which is called De Moivre’s formula.
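
As a short worked illustration of (10.9), chosen by us rather than taken from the text, consider the power (1 + i)^8. Writing 1 + i in polar form with r = √2 and θ = π/4 gives:

```latex
\[
\begin{aligned}
1 + i &= \sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right), \\
(1 + i)^8 &= (\sqrt{2})^8\left(\cos\frac{8\pi}{4} + i\sin\frac{8\pi}{4}\right)
           = 16(\cos 2\pi + i\sin 2\pi) = 16.
\end{aligned}
\]
```

The result agrees with direct multiplication, since (1 + i)² = 2i and (2i)⁴ = 16.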

10.38.1 Application of De Moivre’s Formula

De Moivre’s formula can be used to obtain roots of complex numbers.

Recall from algebra that −2 and 2 are said to be square roots of the number 4 because (−2)² = 4
and (2)² = 4. In other words, the two square roots of 4 are distinct solutions of the equation w² = 4.

If n is a positive integer and z is any complex number, then we define an nth root of z to be any
complex number w that satisfies the equation
\[ w^n = z \tag{10.10} \]
and denote an nth root of z by $z^{1/n}$.

Let w = ρ(cos α + i sin α) and z = r(cos θ + i sin θ), then
\[ \rho^n(\cos n\alpha + i\sin n\alpha) = r(\cos\theta + i\sin\theta). \]
Comparing the moduli of the two sides, we see that
\[ \rho^n = r \quad \text{or} \quad \rho = \sqrt[n]{r}, \]
where $\sqrt[n]{r}$ denotes the real positive nth root of r. In order to have cos nα = cos θ and sin nα = sin θ,
the angles nα and θ must be either equal or differ by a multiple of 2π, that is
\[ n\alpha = \theta + 2k\pi, \quad k = 0, \pm 1, \pm 2, \ldots \]
\[ \alpha = \frac{\theta}{n} + \frac{2k\pi}{n}, \quad k = 0, \pm 1, \pm 2, \ldots \]
Thus, the values of w = ρ(cos α + i sin α) that satisfy (10.10) are given by
\[ w = \sqrt[n]{r}\left[\cos\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right) + i\sin\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)\right], \quad k = 0, \pm 1, \pm 2, \ldots \]
Although there are infinitely many values of k, it can be shown that k = 0, 1, 2, . . . , n − 1 produces
distinct values of w satisfying (10.10), but all other choices of k yield duplicates of these.

Example 10.38.1. Find all the cube roots of −8.

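Solution: Since −8 lies on the negative real axis, we can use π as an argument.
Here r = |z| = |−8| = 8, so a polar form of −8 is
\[ -8 = 8(\cos\pi + i\sin\pi). \]
Here n = 3, hence
\[ (-8)^{1/3} = \sqrt[3]{8}\left[\cos\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right) + i\sin\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right)\right], \quad k = 0, 1, 2. \]
Thus, the cube roots of −8 are
\[
\begin{aligned}
k = 0:&\quad 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right) = 2\left(\frac{1}{2} + \frac{\sqrt{3}}{2}i\right) = 1 + \sqrt{3}\,i, \\
k = 1:&\quad 2(\cos\pi + i\sin\pi) = 2(-1) = -2, \\
k = 2:&\quad 2\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right) = 2\left(\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) = 1 - \sqrt{3}\,i.
\end{aligned}
\]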
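
The root formula can be turned into a short routine. The following is a minimal sketch, assuming z ≠ 0 and working in floating point; the function name `nth_roots` is our own.

```python
import cmath
import math

def nth_roots(z, n):
    """Return the n distinct nth roots of a non-zero complex number z."""
    r, theta = cmath.polar(z)          # z = r(cos θ + i sin θ)
    rho = r ** (1.0 / n)               # real positive nth root of r
    return [
        rho * complex(math.cos((theta + 2 * k * math.pi) / n),
                      math.sin((theta + 2 * k * math.pi) / n))
        for k in range(n)
    ]

roots = nth_roots(-8, 3)
for w in roots:
    print(w, w ** 3)   # each cube should be close to -8
```

Up to rounding error, the three values agree with 1 + √3 i, −2 and 1 − √3 i found above.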

10.39 Applications of Complex Numbers

10.39.1 The Quadratic Formula

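Example 10.39.1. Solve the quadratic equation z² + (1 − i)z − 3i = 0.

Solution: From the quadratic formula, we have
\[
\begin{aligned}
z &= \frac{-(1 - i) + \left[(1 - i)^2 - 4(-3i)\right]^{1/2}}{2} \\
&= \frac{1}{2}\left[-1 + i + (10i)^{1/2}\right].
\end{aligned}
\]
Here the power 1/2 stands for both complex square roots, so no ± sign is needed.
We compute $(10i)^{1/2}$ with r = 10, θ = π/2 and n = 2 for k = 0 and k = 1. The two square roots
of 10i are
\[ w_0 = \sqrt{10}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right) = \sqrt{10}\left(\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}i\right) = \sqrt{5} + \sqrt{5}\,i, \]
\[ w_1 = \sqrt{10}\left(\cos\frac{5\pi}{4} + i\sin\frac{5\pi}{4}\right) = \sqrt{10}\left(-\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}}i\right) = -\sqrt{5} - \sqrt{5}\,i. \]
Therefore the two values are $z_1 = \frac{1}{2}[-1 + i + (\sqrt{5} + \sqrt{5}\,i)]$ and $z_2 = \frac{1}{2}[-1 + i + (-\sqrt{5} - \sqrt{5}\,i)]$. These
solutions, written in the form z = a + bi, are
\[ z_1 = \frac{1}{2}(\sqrt{5} - 1) + \frac{1}{2}(\sqrt{5} + 1)i \quad \text{and} \quad z_2 = -\frac{1}{2}(\sqrt{5} + 1) - \frac{1}{2}(\sqrt{5} - 1)i. \]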
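
The two solutions can be verified numerically by substituting them back into the equation. The sketch below uses `cmath.sqrt` from the standard library for one square root of the discriminant; the function name `solve_quadratic` is hypothetical.

```python
import cmath

def solve_quadratic(a, b, c):
    """Return both roots of a z^2 + b z + c = 0 for complex a, b, c (a != 0)."""
    root = cmath.sqrt(b * b - 4 * a * c)  # one square root; the other is -root
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

a, b, c = 1, 1 - 1j, -3j
z1, z2 = solve_quadratic(a, b, c)
print(z1, z2)

# Substitute back: both residuals should be numerically zero.
residuals = [abs(a * z * z + b * z + c) for z in (z1, z2)]
print(residuals)
```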

10.39.2 Roots of Polynomials

A polynomial in x is a function of the form
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]

Example 10.39.2. x³ − 2x + 4.

A number (real or complex) a is said to be a root of the polynomial p(x) if p(a) = 0.

Example 10.39.3. x = 1 is a root of x² − 2x + 1, since 1² − 2(1) + 1 = 0.

A number a (real or complex) is a root of the polynomial p(x) if and only if (x − a) is a factor of
p(x). It may be the case that you pull more than one factor (x − a) out of the polynomial. In such
cases a is said to be a multiple root of p(x).

A root is called a simple root if it produces one factor.

10.39.3 The Fundamental Theorem of Algebra

Theorem 10.39.1 (The Fundamental Theorem of Algebra). Let p(x) be any polynomial of degree
n. Then p(x) can be factorized into a product of a constant and n factors of the form (x − a), where
a may be real or complex.

Suppose the polynomial has real coefficients. If the complex number z is a root of the polynomial,
then the complex conjugate z̄ is also a root.

Example 10.39.4. Let p(z) = z⁴ − 4z³ + 9z² − 16z + 20. Given that 2 + i is a root, express p(z)
as a product of real quadratic factors.

Solution: Given that 2 + i is a root, it follows that 2 − i must also be a root and so the quadratic
(z − (2 + i))(z − (2 − i)) = z² − 4z + 5
must be a factor. Dividing the given polynomial by this factor gives
p(z) = z⁴ − 4z³ + 9z² − 16z + 20 = (z² − 4z + 5)(z² + 4).

Example 10.39.5. Solve z³ + 3z² + 2z − 6 = 0 and express the left hand side as a product of
irreducible factors.

Solution: Since the equation is a polynomial equation of odd degree with real coefficients, there is
at least one real solution. To find that solution by trial and error, the factors of the constant term
are substituted into the polynomial. The factors of 6 are ±1, ±2, ±3, ±6.

Substituting z = 1 gives
1 + 3 + 2 − 6 = 0,
so z = 1 is a solution and (z − 1) is a factor. So
\[ \frac{z^3 + 3z^2 + 2z - 6}{z - 1} = z^2 + 4z + 6, \]
and the other solutions are z = −2 ± √2 i, and so
z³ + 3z² + 2z − 6 = (z − 1)(z² + 4z + 6)
as a product of irreducible real factors.
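
Dividing a polynomial by a linear factor (x − a) can be organised as synthetic division (Horner's scheme), a technique we bring in here for illustration. The sketch below assumes coefficients are listed from the highest power down; the function name `synthetic_division` is hypothetical.

```python
def synthetic_division(coeffs, a):
    """Divide a polynomial by (x - a) using Horner's scheme.

    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder); the remainder equals p(a).
    """
    values = [coeffs[0]]
    for c in coeffs[1:]:
        values.append(c + a * values[-1])
    return values[:-1], values[-1]

# p(z) = z^3 + 3z^2 + 2z - 6 divided by (z - 1)
quotient, remainder = synthetic_division([1, 3, 2, -6], 1)
print(quotient, remainder)  # [1, 4, 6] 0
```

A zero remainder confirms that a is a root, and the quotient coefficients [1, 4, 6] correspond to z² + 4z + 6.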

Exercise 10.39.1. Express z⁵ − 1 as a product of real linear and quadratic factors.

10.40 Tutorial 11
1. Given that z1 = 3 − 8i and z2 = −7 + i, find
(i) iz1 + 2z2 (ii) z1 + z2

2. Express each of the following complex numbers in polar form and represent each number on
an Argand diagram:
(i) −1 − i (ii) 3 − 3√3 i (iii) $-2 - \frac{2}{\sqrt{2}}i$

3. If $z = \frac{2 + i}{1 - i}$, find the real and imaginary parts of $z + \frac{1}{z}$.
(a) Let z ∈ C, and let $w = \frac{z - i}{z + i}$.
i. Evaluate w when z = 0, and when z = 1.
ii. Let z = β where β ∈ R. Show that for any such z the corresponding w
always has unit modulus.
(b) i. Express the complex number z = 24 + 7i in polar form.
ii. Find the four values of $z^{1/4}$ in exponential form, and plot them on an
Argand diagram.

4. Show that cos 6φ = 32 cos⁶ φ − 48 cos⁴ φ + 18 cos² φ − 1.

5. Consider the polynomial
p(z) = z⁴ − 3z³ + rz² + sz + t,
where r, s, and t are real constants. Given that two of the roots of p(z) are 2 and 1 + 2i,
determine the values of r, s and t.

6. Find all values of z for which z⁴ + 2√3 i + 2 = 0.

Chapter 11

Theory of Matrices

Any man who can drive safely while kissing a pretty girl is simply not giving the kiss the attention it
deserves.

—Albert Einstein

11.1 Matrices

Definition 11.1.1. A matrix over a field K (elements of K are called numbers or scalars) is a
rectangular array of scalars presented in the following form
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \cdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]
The rows of such a matrix are the m horizontal lists of scalars, that is
\[ (a_{11}, a_{12}, \cdots, a_{1n}),\ (a_{21}, a_{22}, \cdots, a_{2n}),\ \cdots,\ (a_{m1}, a_{m2}, \cdots, a_{mn}) \]
and the columns of A are the n vertical lists of scalars,
\[
\begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \\ \vdots \\ a_{m1} \end{bmatrix},\quad
\begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \\ \vdots \\ a_{m2} \end{bmatrix},\quad \cdots,\quad
\begin{bmatrix} a_{1n} \\ a_{2n} \\ a_{3n} \\ \vdots \\ a_{mn} \end{bmatrix}.
\]
The element aij , called the ij-entry or ij-element appears in row i and column j. Denote a matrix
simply by A = [aij ].

A matrix with m rows and n columns is called an m by n matrix, written m × n. The pair of
numbers m and n are called the size of the matrix.

Two matrices are equal, written A = B, if they have the same size and if corresponding elements
are equal.

Example 11.1.1. Find x, y, z, t such that
\[ \begin{bmatrix} x + y & 2z + t \\ x - y & z - t \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 1 & 5 \end{bmatrix}. \]

Solution: By definition of equality of matrices, the four corresponding entries must be equal. Thus

x + y = 3, 2z + t = 7, x − y = 1, z − t = 5.

Solving the above system of equations yields

x = 2, y = 1, z = 4, t = −1.

A matrix whose entries are all zero is called a zero matrix.

Example 11.1.2.
\[ A = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad P = \begin{bmatrix} 0 & 0 \end{bmatrix}. \]

Matrices whose entries are all real numbers are called real matrices and are said to be matrices
over R. Matrices whose entries are all complex numbers are called complex matrices and are
said to be matrices over C.

11.2 Matrix Addition and Scalar Multiplication

Let A = [aij] and B = [bij] be two matrices with the same size, say m × n matrices. The sum of
A and B, written A + B, is the matrix obtained by adding corresponding elements from A and B,
that is
\[
A + B = \begin{bmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & \cdots & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{bmatrix}.
\]
The product of a matrix A by a scalar k, written kA, is the matrix obtained by multiplying each
element of A by k, that is
\[
kA = \begin{bmatrix}
ka_{11} & ka_{12} & \cdots & ka_{1n} \\
ka_{21} & ka_{22} & \cdots & ka_{2n} \\
\vdots & \vdots & \cdots & \vdots \\
ka_{m1} & ka_{m2} & \cdots & ka_{mn}
\end{bmatrix}.
\]

Observe that A + B and kA are also m × n matrices.

We also define
−A = (−1)A and A − B = A + (−B).

The matrix −A s called the negative of matrix A and the matrix A − B is called the difference
of matrix A and B.

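Example 11.2.1. Let
\[
A = \begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 5 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 4 & 6 & 8 \\ 1 & -3 & -7 \end{bmatrix},
\]
then
\[
A + B = \begin{bmatrix} 1 + 4 & -2 + 6 & 3 + 8 \\ 0 + 1 & 4 + (-3) & 5 + (-7) \end{bmatrix}
= \begin{bmatrix} 5 & 4 & 11 \\ 1 & 1 & -2 \end{bmatrix}
\]
and
\[
3A = \begin{bmatrix} 3(1) & 3(-2) & 3(3) \\ 3(0) & 3(4) & 3(5) \end{bmatrix}
= \begin{bmatrix} 3 & -6 & 9 \\ 0 & 12 & 15 \end{bmatrix}.
\]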
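
The entrywise definitions of A + B and kA translate directly into code. The following is a minimal sketch in Python, assuming matrices are stored as lists of rows; the function names `mat_add` and `scalar_mul` are illustrative choices and do not come from the notes.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size (lists of rows)."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A + B is only defined for matrices of the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scalar_mul(k, A):
    """Multiply every entry of A by the scalar k."""
    return [[k * a for a in row] for row in A]


# The matrices of Example 11.2.1.
A = [[1, -2, 3], [0, 4, 5]]
B = [[4, 6, 8], [1, -3, -7]]

print(mat_add(A, B))                  # [[5, 4, 11], [1, 1, -2]]
print(scalar_mul(3, A))               # [[3, -6, 9], [0, 12, 15]]
print(mat_add(A, scalar_mul(-1, B)))  # A - B, computed as A + (-1)B
```

The last line uses the definition A − B = A + (−B) rather than a separate subtraction routine.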

11.2.1 Properties

Theorem 11.2.1. Consider any matrices A, B and C (with same size) and scalars k and l. Then

(i) (A + B) + C = A + (B + C).

(ii) A + 0 = 0 + A = A.

(iii) A + (−A) = (−A) + A = 0.

(iv) A + B = B + A.

(v) k(A + B) = kA + kB.

(vi) (k + l)A = kA + lA.

(vii) (kl)A = k(lA).

(viii) 1 · A = A.

Proof. (i) Suppose A = [aij ], B = [bij ] and C = [cij ].

We need to show that the corresponding ij-entries on each side of each matrix equation are equal.

The ij-entry of A + B is aij + bij , hence the ij-entry of (A + B) + C is (aij + bij ) + cij . On the other
hand, the ij-entry of B + C is bij + cij and hence the ij-entry of A + (B + C) is aij + (bij + cij ).
However for scalars in K,
(aij + bij ) + cij = aij + (bij + cij ).
Thus (A+B)+C and A+(B +C) have identical ij-entries. Therefore (A+B)+C = A+(B +C).

Proof. (v) The ij-entry of A + B is aij + bij , hence k(aij + bij ) is the ij-entry of k(A + B). On
the other hand, the ij-entry of kA and kB are kaij and kbij respectively. Thus, kaij + kbij is the
ij-entry of kA + kB. However, for scalars in K,

k(aij + bij ) = kaij + kbij .

Thus, k(A + B) and kA + kB have identical ij-entries. Therefore, k(A + B) = kA + kB.

11.3 Matrix Multiplication

The product of matrices A and B is written AB. We begin with a special case: the product AB of a
row matrix $A = [a_i]$ and a column matrix $B = [b_i]$ with the same number of elements is defined to be the
scalar obtained by multiplying corresponding entries and adding, that is
\[
AB = [a_1, a_2, \cdots, a_n]
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
= a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{k=1}^{n} a_k b_k .
\]

In this case the product AB is not defined when A and B have different numbers of elements.

Example 11.3.1.
\[
[7, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}
= 7(3) + (-4)(2) + 5(-1) = 21 - 8 - 5 = 8.
\]

We now define matrix multiplication in general.

Definition 11.3.1. Suppose $A = [a_{ik}]$ and $B = [b_{kj}]$ are matrices such that the number of columns
of A is equal to the number of rows of B, say, A is an m × p matrix and B is a p × n matrix. Then
the product AB is the m × n matrix whose ij-entry is obtained by multiplying the ith row of A by
the jth column of B, that is
\[
\begin{bmatrix}
a_{11} & \cdots & a_{1p} \\
\vdots & & \vdots \\
a_{i1} & \cdots & a_{ip} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mp}
\end{bmatrix}
\begin{bmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\
\vdots & & \vdots & & \vdots \\
b_{p1} & \cdots & b_{pj} & \cdots & b_{pn}
\end{bmatrix}
=
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & c_{ij} & \vdots \\
c_{m1} & \cdots & c_{mn}
\end{bmatrix},
\]
where
\[
c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ip} b_{pj} = \sum_{k=1}^{p} a_{ik} b_{kj} .
\]

The product AB is not defined if A is an m × p matrix and B is a q × n matrix, where p ≠ q.

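Example 11.3.2. Find AB where
\[
A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 2 & 0 & -4 \\ 5 & -2 & 6 \end{bmatrix}.
\]

Solution: Since A is 2 × 2 and B is 2 × 3, the product AB is defined and AB is a 2 × 3 matrix.
Hence
\[
AB = \begin{bmatrix} 2 + 15 & 0 - 6 & -4 + 18 \\ 4 - 5 & 0 + 2 & -8 - 6 \end{bmatrix}
= \begin{bmatrix} 17 & -6 & 14 \\ -1 & 2 & -14 \end{bmatrix}.
\]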
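
The row-by-column rule of Definition 11.3.1 can be sketched in a few lines of Python. This is only an illustration under the assumption that matrices are lists of rows; the name `mat_mul` is hypothetical. The size check mirrors the requirement that the number of columns of A equals the number of rows of B.

```python
def mat_mul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    n = len(B[0])
    # c_ij is the sum over k of a_ik * b_kj.
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(n)]
            for row in A]


# The matrices of Example 11.3.2.
A = [[1, 3], [2, -1]]
B = [[2, 0, -4], [5, -2, 6]]

print(mat_mul(A, B))  # [[17, -6, 14], [-1, 2, -14]]
```

Calling `mat_mul(B, A)` raises an error here, because B has three columns while A has only two rows, so BA is not defined.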

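Example 11.3.3. Suppose
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 5 & 6 \\ 0 & -2 \end{bmatrix}.
\]
Then
\[
AB = \begin{bmatrix} 5 + 0 & 6 - 4 \\ 15 + 0 & 18 - 8 \end{bmatrix}
= \begin{bmatrix} 5 & 2 \\ 15 & 10 \end{bmatrix}
\]
and
\[
BA = \begin{bmatrix} 5 + 18 & 10 + 24 \\ 0 - 6 & 0 - 8 \end{bmatrix}
= \begin{bmatrix} 23 & 34 \\ -6 & -8 \end{bmatrix}.
\]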

The above example shows that matrix multiplication is not commutative, that is, the products AB
and BA of matrices need not be equal. Matrix multiplication does, however, satisfy the following properties.

Theorem 11.3.1. Let A, B and C be matrices. Then, whenever the products and sums are defined,

(i) (AB)C = A(BC).

(ii) A(B + C) = AB + AC.

(iii) (B + C)A = BA + CA.

(iv) k(AB) = (kA)B = A(kB), where k is a scalar.

Proof. (i) (AB)C = A(BC).

Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{kl}]$, where j runs from 1 to m and k runs from 1 to n, and let
$AB = S = [s_{ik}]$ and $BC = T = [t_{jl}]$. Then
\[
s_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk}
\quad\text{and}\quad
t_{jl} = \sum_{k=1}^{n} b_{jk} c_{kl} .
\]

Multiplying S = AB by C, the il-entry of (AB)C is
\[
s_{i1} c_{1l} + s_{i2} c_{2l} + \cdots + s_{in} c_{nl}
= \sum_{k=1}^{n} s_{ik} c_{kl}
= \sum_{k=1}^{n} \sum_{j=1}^{m} (a_{ij} b_{jk}) c_{kl} .
\]

On the other hand, multiplying A by T = BC, the il-entry of A(BC) is
\[
a_{i1} t_{1l} + a_{i2} t_{2l} + \cdots + a_{im} t_{ml}
= \sum_{j=1}^{m} a_{ij} t_{jl}
= \sum_{j=1}^{m} \sum_{k=1}^{n} a_{ij} (b_{jk} c_{kl}) .
\]

The above sums are equal, that is, corresponding elements in (AB)C and A(BC) are equal. Thus

(AB)C = A(BC).

Exercise 11.3.1. Prove that A(B + C) = AB + AC.

11.4 Transpose of a Matrix

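Definition 11.4.1. The transpose of a matrix A, written $A^t$, is the matrix obtained by writing the
columns of A, in order, as rows.
Example 11.4.1.
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^t
= \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}
\quad\text{and}\quad
[1, -3, -5]^t = \begin{bmatrix} 1 \\ -3 \\ -5 \end{bmatrix}.
\]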

In other words, if A = [aij ] is an m × n matrix, then At = [bij ] is the n × m matrix, where bij = aji .
Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column
vector is a row vector. The basic properties of the transpose operation are
Theorem 11.4.1. Let A and B be matrices and let k be a scalar. Then, whenever the sum and
product are defined, we have

(i) $(A + B)^t = A^t + B^t$.
(ii) $(A^t)^t = A$.
(iii) $(kA)^t = kA^t$.
(iv) $(AB)^t = B^t A^t$.

Proof. (iv) $(AB)^t = B^t A^t$.

Let $A = [a_{ik}]$ and $B = [b_{kj}]$, where k runs from 1 to m. Then the ij-entry of AB is
\[
a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{im} b_{mj} .
\]

This is the ji-entry (reverse order) of $(AB)^t$. Now column j of B becomes row j of $B^t$ and row i of
A becomes column i of $A^t$. Thus, the ji-entry of $B^t A^t$ is
\[
[b_{1j}, b_{2j}, \cdots, b_{mj}] \, [a_{i1}, a_{i2}, \cdots, a_{im}]^t
= b_{1j} a_{i1} + b_{2j} a_{i2} + \cdots + b_{mj} a_{im} .
\]

Thus, $(AB)^t = B^t A^t$, since the corresponding entries are equal.
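
A numerical check is not a proof, but it can make property (iv) concrete. The sketch below, written in Python with matrices as lists of rows, compares the two sides of the identity for one pair of matrices; `transpose` and `mat_mul` are illustrative helper names, not part of the notes.

```python
def transpose(A):
    """Write the columns of A, in order, as rows."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    """Row-by-column product of A (m x p) and B (p x n)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 3], [2, -1]]            # 2 x 2
B = [[2, 0, -4], [5, -2, 6]]     # 2 x 3

lhs = transpose(mat_mul(A, B))             # (AB)^t, a 3 x 2 matrix
rhs = mat_mul(transpose(B), transpose(A))  # B^t A^t, also 3 x 2

print(lhs)         # [[17, -1], [-6, 2], [14, -14]]
print(lhs == rhs)  # True
```

Note the reversed order on the right-hand side: the product of the transposes in the original order, A^t B^t, is not even defined for these sizes.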

11.5 Square Matrices

Definition 11.5.1. A square matrix is a matrix with the same number of rows as columns.

An n × n square matrix is said to be of order n and is sometimes called an n-square matrix.

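Example 11.5.1. The following are square matrices of order 3.
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ -4 & -4 & -4 \\ 5 & 6 & 7 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 2 & -5 & 1 \\ 0 & 3 & -2 \\ 1 & 2 & -4 \end{bmatrix}.
\]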

11.6 Diagonal and Trace

Definition 11.6.1. Let A = [aij ] be an n-square matrix. The diagonal or main diagonal of A
consists of the elements with the same subscripts, that is
a11 , a22 , . . . , ann .
Definition 11.6.2. The trace of A, written tr(A), is the sum of the diagonal elements. Namely
tr(A) = a11 + a22 + · · · + ann .

11.6.1 Identity Matrix

The n-square identity or unit matrix, denoted by I, is the n-square matrix with 1’s on the diagonal
and 0’s elsewhere.

For any n-square matrix A,


AI = IA = A.
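Example 11.6.1. The following are identity matrices of order 3 and 4.
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]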

11.7 Powers of Matrices

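Let A be an n-square matrix over a field K. Powers of A are defined as follows:
\[
A^2 = AA, \quad A^3 = A^2 A, \quad \cdots, \quad A^{k+1} = A^k A, \quad\text{and}\quad A^0 = I .
\]
Example 11.7.1. Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}$. Then
\[
A^2 = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}
= \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix}
\]
and
\[
A^3 = A^2 A = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}
= \begin{bmatrix} -11 & 38 \\ 57 & -106 \end{bmatrix}.
\]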
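
The recursive definition of powers can be followed literally in code: start from the identity and multiply by A repeatedly. The sketch below is one possible Python rendering, assuming square matrices stored as lists of rows; the names `identity`, `mat_mul` and `mat_pow` are illustrative.

```python
def identity(n):
    """The n-square identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Row-by-column product of two compatible matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def mat_pow(A, k):
    """A^k for an integer k >= 0, using A^0 = I and A^(k+1) = A^k A."""
    result = identity(len(A))
    for _ in range(k):
        result = mat_mul(result, A)
    return result


A = [[1, 2], [3, -4]]
print(mat_pow(A, 2))  # [[7, -6], [-9, 22]]
print(mat_pow(A, 3))  # [[-11, 38], [57, -106]]
```

This uses k multiplications to reach the kth power, which is fine for small exponents such as those in Example 11.7.1.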

11.8 Special Types of Square Matrices

11.8.1 Diagonal Matrix

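A square matrix $D = [d_{ij}]$ is diagonal if its off-diagonal entries are all zero.

Example 11.8.1.
\[
A = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -7 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix}.
\]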

11.8.2 Triangular Matrices

A square matrix A = [aij ] is upper triangular if all entries below the main diagonal are equal to
zero.

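Example 11.8.2.
\[
A = \begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ 0 & b_{22} & b_{23} \\ 0 & 0 & b_{33} \end{bmatrix}.
\]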

A lower triangular matrix is a square matrix whose entries above the main diagonal are all zero.

Suppose A is a square matrix with real entries.

11.8.3 Symmetric Matrices

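Definition 11.8.1. A matrix A is symmetric if
\[
A^t = A .
\]
Example 11.8.3. Let
\[
A = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix},
\quad\text{then}\quad
A^t = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}.
\]
Hence $A^t = A$, thus A is symmetric.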

11.8.4 Skew-Symmetric Matrices

Definition 11.8.2. A matrix A is skew-symmetric if

At = −A.

The diagonal elements of such a matrix must be zero, since $a_{ii} = -a_{ii}$ implies $a_{ii} = 0$.

Example 11.8.4.
\[
B = \begin{bmatrix} 0 & 3 & -4 \\ -3 & 0 & 5 \\ 4 & -5 & 0 \end{bmatrix}.
\]

11.8.5 Orthogonal Matrices

Definition 11.8.3. A real matrix A is orthogonal if

At = A−1 ,

that is
AAt = At A = I.

A must necessarily be square and invertible.

11.9 Complex Matrices

Let A be a complex matrix. The conjugate of a complex matrix A, written $\bar{A}$, is the matrix
obtained from A by taking the conjugate of each entry of A.

$A^*$ is used for the conjugate transpose of A, that is
\[
A^* = (\bar{A})^t = \overline{(A^t)} .
\]

If A is real then $A^* = A^t$.

Example 11.9.1. Let
\[
A = \begin{bmatrix} 2 + 8i & 5 - 3i & 4 - 7i \\ 6i & 1 - 4i & 3 + 2i \end{bmatrix},
\quad\text{then}\quad
A^* = \begin{bmatrix} 2 - 8i & -6i \\ 5 + 3i & 1 + 4i \\ 4 + 7i & 3 - 2i \end{bmatrix}.
\]

11.9.1 Hermitian Matrices

Definition 11.9.1. A complex matrix A is said to be Hermitian if

A∗ = A.

Skew-Hermitian Matrices

Definition 11.9.2. A complex matrix A is said to be skew-Hermitian if

A∗ = −A.

11.9.2 Unitary Matrices

Definition 11.9.3. A complex matrix A is unitary if
\[
A^* A = A A^* = I, \quad\text{i.e.,}\quad A^* = A^{-1} .
\]

11.10 Inversion of Matrices

Here we are dealing with square matrices.

Proposition 11.10.1. For every n × n matrix A,

AI = IA = A.

This raises the following question: given an n × n matrix A, is it possible to find another
n × n matrix B such that AB = BA = I?

Definition 11.10.1. An n × n matrix A is said to be invertible if there exists an n × n matrix
B such that
\[
AB = BA = I .
\]
In this case, we say that B is the inverse of A and write
\[
B = A^{-1} .
\]

Proposition 11.10.2. Suppose that A is an invertible n × n matrix. Then its inverse A−1 is
unique.

Proof. Suppose that B satisfies the requirements for being the inverse of A. Then AB = I = BA.
It follows that
A−1 = A−1 I = A−1 (AB) = (A−1 A)B = IB = B.
Hence the inverse A−1 is unique.

Exercise 11.10.1. Suppose that A and B are invertible n × n matrices. Prove that

(AB)−1 = B −1 A−1 .

Exercise 11.10.2. Suppose that A is an invertible n × n matrix. Prove that

(A−1 )−1 = A.

11.11 Determinants

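Each n-square matrix $A = [a_{ij}]$ is assigned a special scalar called the determinant of A, denoted
by det A or |A| or
\[
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}.
\]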

11.11.1 Determinants of Order 1 and 2

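Determinants of order 1 and 2 are defined as
\[
|a_{11}| = a_{11}
\quad\text{and}\quad
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21} .
\]

Example 11.11.1. (a) det(27) = 27 and (b) $\begin{vmatrix} 5 & 3 \\ 4 & 6 \end{vmatrix} = 5(6) - 3(4) = 30 - 12 = 18$.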

11.12 Determinants of Order 3

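Consider an arbitrary 3 × 3 matrix $A = [a_{ij}]$. The determinant of A is defined as follows:
\[
\det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32}
- a_{13} a_{22} a_{31} - a_{12} a_{21} a_{33} - a_{11} a_{23} a_{32} .
\]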

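A procedure for evaluating the determinant of a 3 × 3 matrix is called Sarrus’ Rule. Copy the first
two columns of the matrix to its right:
\[
\begin{array}{ccc|cc}
a_{11} & a_{12} & a_{13} & a_{11} & a_{12} \\
a_{21} & a_{22} & a_{23} & a_{21} & a_{22} \\
a_{31} & a_{32} & a_{33} & a_{31} & a_{32}
\end{array}
\]
The three products along the diagonals running from upper left to lower right are taken with a +
sign, and the three products along the diagonals running from lower left to upper right are taken
with a − sign.

Example 11.12.1. Let
\[
A = \begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{bmatrix}.
\]
Find det A and det B.

For A, the Sarrus scheme is
\[
\begin{array}{ccc|cc}
2 & 1 & 1 & 2 & 1 \\
0 & 5 & -2 & 0 & 5 \\
1 & -3 & 4 & 1 & -3
\end{array}
\]
so that
\[
\det A = 2(5)(4) + 1(-2)(1) + 1(0)(-3) - (1)(0)(4) - (2)(-2)(-3) - (1)(5)(1)
= 40 - 2 + 0 - 0 - 12 - 5 = 21 .
\]

For B, the Sarrus scheme is
\[
\begin{array}{ccc|cc}
3 & 2 & 1 & 3 & 2 \\
-4 & 5 & -1 & -4 & 5 \\
2 & -3 & 4 & 2 & -3
\end{array}
\]
so that
\[
\det B = 3(5)(4) + 2(-1)(2) + 1(-4)(-3) - (2)(-4)(4) - (3)(-1)(-3) - (1)(5)(2)
= 60 - 4 + 12 + 32 - 9 - 10 = 81 .
\]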

Sarrus’ rule applies for evaluating the determinant of 3 × 3 matrices only.
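
Sarrus' rule is just the six-term formula for a 3 × 3 determinant, so it can be written as a single expression. The following Python sketch assumes the matrix is given as a list of three rows; the function name `det3_sarrus` is an illustrative choice.

```python
def det3_sarrus(M):
    """Determinant of a 3 x 3 matrix by Sarrus' rule (3 x 3 only)."""
    if len(M) != 3 or any(len(row) != 3 for row in M):
        raise ValueError("Sarrus' rule applies to 3 x 3 matrices only")
    (a, b, c), (d, e, f), (g, h, i) = M
    # Three "+" diagonals minus three "-" diagonals.
    return a * e * i + b * f * g + c * d * h - c * e * g - b * d * i - a * f * h


# The matrices of Example 11.12.1.
A = [[2, 1, 1], [0, 5, -2], [1, -3, 4]]
B = [[3, 2, 1], [-4, 5, -1], [2, -3, 4]]

print(det3_sarrus(A))  # 21
print(det3_sarrus(B))  # 81
```

The explicit size check reflects the remark above: the rule does not extend to matrices of order 4 or higher.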

11.13 Evaluation of Determinants of Any Order

11.13.1 Minors and Co-factors

Definition 11.13.1. If $A = [a_{ij}]$ is an n × n matrix, then the minor of the element $a_{ij}$, denoted
by $M_{ij}$, is defined as the determinant of the (n − 1) × (n − 1) sub-matrix which is obtained by
deleting all the entries in the ith row and the jth column.

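Example 11.13.1. Consider the matrix
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.
\]

The minor of $a_{11}$ is
\[
M_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}.
\]

The minor of $a_{12}$ is
\[
M_{12} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}.
\]

The minor of $a_{13}$ is
\[
M_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}.
\]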

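Definition 11.13.2. The co-factor of an element $a_{ij}$, denoted by $A_{ij}$, is defined as the product of
$(-1)^{i+j}$ and the minor of $a_{ij}$, that is
\[
A_{ij} = (-1)^{i+j} M_{ij} .
\]

The co-factor of an element is merely the signed minor of the element. Since the minor $M_{ij}$ is
defined here as a determinant, both $M_{ij}$ and $A_{ij}$ are scalars.

Example 11.13.2. If
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix},
\]
then the co-factor of $a_{11}$ is
\[
A_{11} = (-1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
= + \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix},
\]
and the co-factor of $a_{12}$ is
\[
A_{12} = (-1)^{1+2} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
= - \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}.
\]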

11.14 Laplace Expansion of the Determinant

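To compute the determinant of an n × n matrix we use co-factors and minors to reduce it to
determinants of lower order, which we already know how to calculate.

The determinant of a square matrix $A = [a_{ij}]$ is equal to the sum of the products obtained by
multiplying the elements of any row (column) by their respective co-factors. For expansion along
row i,
\[
|A| = a_{i1} A_{i1} + a_{i2} A_{i2} + \cdots + a_{in} A_{in} = \sum_{j=1}^{n} a_{ij} A_{ij} .
\]

This expansion can be carried out along any row or column of the matrix in question and the value
of the determinant is the same.

Example 11.14.1. Given that $A = \begin{bmatrix} 3 & -1 & 5 \\ 0 & 4 & -3 \\ 2 & 1 & 2 \end{bmatrix}$, find |A|.

Solution: Expanding along the first row gives
\[
\begin{aligned}
\det A &= 3(-1)^{1+1} \begin{vmatrix} 4 & -3 \\ 1 & 2 \end{vmatrix}
+ (-1)(-1)^{1+2} \begin{vmatrix} 0 & -3 \\ 2 & 2 \end{vmatrix}
+ 5(-1)^{1+3} \begin{vmatrix} 0 & 4 \\ 2 & 1 \end{vmatrix} \\
&= 3 \begin{vmatrix} 4 & -3 \\ 1 & 2 \end{vmatrix}
+ \begin{vmatrix} 0 & -3 \\ 2 & 2 \end{vmatrix}
+ 5 \begin{vmatrix} 0 & 4 \\ 2 & 1 \end{vmatrix} \\
&= 3(8 + 3) + (0 + 6) + 5(0 - 8) \\
&= 33 + 6 - 40 \\
&= -1 .
\end{aligned}
\]

Expanding along the second row gives
\[
\begin{aligned}
\det A &= 0(-1)^{2+1} \begin{vmatrix} -1 & 5 \\ 1 & 2 \end{vmatrix}
+ 4(-1)^{2+2} \begin{vmatrix} 3 & 5 \\ 2 & 2 \end{vmatrix}
+ (-3)(-1)^{2+3} \begin{vmatrix} 3 & -1 \\ 2 & 1 \end{vmatrix} \\
&= 0 - 16 + 15 \\
&= -1 .
\end{aligned}
\]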

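Note that expanding by a row or column that contains zeros significantly reduces the number
of cumbersome calculations that need to be done. It is sensible to evaluate the determinant by
co-factor expansion along a row or column with the greatest number of zeros.

Example 11.14.2. Given that
\[
A = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 3 & 5 & 0 & -1 \\ 0 & 3 & -2 & 5 \\ 1 & 0 & 0 & 2 \end{bmatrix},
\]
find det A.

Solution: Expanding along the first row, whose only non-zero entry is $a_{14} = 1$ with sign
$(-1)^{1+4} = -1$, and then expanding the resulting 3 × 3 determinant along its third row, gives
\[
\det A = - \begin{vmatrix} 3 & 5 & 0 \\ 0 & 3 & -2 \\ 1 & 0 & 0 \end{vmatrix}
= - \begin{vmatrix} 5 & 0 \\ 3 & -2 \end{vmatrix}
= -(-10 - 0)
= 10 .
\]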
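
The Laplace expansion is naturally recursive: a determinant of order n is expressed through determinants of order n − 1. The Python sketch below always expands along the first row and skips zero entries; it is meant as an illustration of the definition rather than an efficient method, and the names `minor_matrix` and `det` are assumptions of this sketch.

```python
def minor_matrix(A, i, j):
    """Sub-matrix of A obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by co-factor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # (-1) ** j is the sign (-1)^(1 + (j + 1)) written with 0-based j.
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j))
               for j in range(n) if A[0][j] != 0)


A3 = [[3, -1, 5], [0, 4, -3], [2, 1, 2]]                          # Example 11.14.1
A4 = [[0, 0, 0, 1], [3, 5, 0, -1], [0, 3, -2, 5], [1, 0, 0, 2]]   # Example 11.14.2

print(det(A3))  # -1
print(det(A4))  # 10
```

For the 4 × 4 matrix only one of the four first-row terms is non-zero, which is exactly the saving described in the note on choosing a row with many zeros.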

The determinant of the identity matrix is 1. The determinant of a diagonal matrix D of order
n × n is given by the product of the elements on its main diagonal. The determinant of a triangular
matrix of order n × n is given by the product of the elements on its main diagonal.

11.15 Properties
1. If A and B are n × n matrices, then
\[
|AB| = |A||B| .
\]

2. For an n × n matrix A,
\[
\det A = \det A^t .
\]

3. If A and B are n × n matrices, then
\[
|AB| = |BA| .
\]

4. If two rows (columns) of an n × n matrix A are interchanged to give a matrix B, then
\[
\det B = - \det A .
\]

5. If the elements of any one row (column) of an n × n matrix A are multiplied by the same scalar
k, then the value of the determinant of the new matrix is k times the determinant of A.

6. If the elements of any row (column) of A are all zeros, then the determinant of A is zero.

7. If an n × n matrix A is multiplied by a scalar k, then the determinant of kA is $k^n \det A$, that
is
\[
\det (kA) = k^n \det A .
\]

8. If A is an n × n matrix with any two of its rows (columns) equal, then the determinant of A
is zero.

9. If A is an n × n matrix in which one row (column) is proportional to another, then the
determinant of the matrix is zero.

11.16 Adjoint

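Definition 11.16.1. Let $A = [a_{ij}]$ be an n × n matrix and let $A_{ij}$ denote the co-factor of $a_{ij}$. The
adjoint of A, denoted by adj A, is the transpose of the matrix of co-factors of A, that is
\[
\operatorname{adj} A = [A_{ij}]^t .
\]
Example 11.16.1. Let $A = \begin{bmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{bmatrix}$. The co-factors of the nine elements of A are as follows:
\[
\begin{aligned}
A_{11} &= + \begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, &
A_{12} &= - \begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, &
A_{13} &= + \begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4, \\
A_{21} &= - \begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, &
A_{22} &= + \begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, &
A_{23} &= - \begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5, \\
A_{31} &= + \begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, &
A_{32} &= - \begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, &
A_{33} &= + \begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8 .
\end{aligned}
\]
Hence the matrix of co-factors is
\[
[A_{ij}] = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}
= \begin{bmatrix} -18 & 2 & 4 \\ -11 & 14 & 5 \\ -10 & -4 & -8 \end{bmatrix}.
\]
The transpose of the above matrix of co-factors yields the adjoint of A, that is
\[
\operatorname{adj} A = \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}.
\]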

Theorem 11.16.1. Let A be any square matrix. Then
\[
A (\operatorname{adj} A) = (\operatorname{adj} A) A = |A| I ,
\]
where I is the identity matrix. Thus, if |A| ≠ 0,
\[
A^{-1} = \frac{1}{|A|} \operatorname{adj} A .
\]

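Example 11.16.2. Let A be the matrix of Example 11.16.1. We have
\[
\det A = -40 + 6 + 0 - 16 + 4 + 0 = -46 .
\]

Thus A does have an inverse, and by Theorem 11.16.1
\[
A^{-1} = \frac{1}{|A|} \operatorname{adj} A
= -\frac{1}{46} \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}.
\]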
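
The adjoint formula for the inverse can be checked mechanically. The sketch below, in Python, builds the matrix of co-factors, transposes it, and divides by the determinant using exact fractions from the standard library. All function names are illustrative, and the co-factor expansion used for `det` is the simple recursive one, suitable only for small matrices.

```python
from fractions import Fraction


def minor_matrix(A, i, j):
    """Sub-matrix of A with row i and column j deleted (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by co-factor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j))
               for j in range(len(A)))


def adjoint(A):
    """Transpose of the matrix of co-factors of A."""
    n = len(A)
    cof = [[(-1) ** (i + j) * det(minor_matrix(A, i, j)) for j in range(n)]
           for i in range(n)]
    return [list(col) for col in zip(*cof)]


def inverse_via_adjoint(A):
    """A^(-1) = (1 / |A|) adj A, defined only when |A| is non-zero."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return [[Fraction(x, d) for x in row] for row in adjoint(A)]


A = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]
print(det(A))      # -46
print(adjoint(A))  # [[-18, -11, -10], [2, 14, -4], [4, 5, -8]]
```

Multiplying A by the result of `inverse_via_adjoint(A)` gives the identity matrix, which is a convenient way to check a hand computation.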

11.17 Properties of Inverses
1. If an n × n matrix A is invertible, then det A ≠ 0.

Definition 11.17.1. A matrix which has an inverse is said to be invertible. A matrix whose
determinant is non-zero is said to be non-singular, and a matrix whose determinant is equal to
zero is called a singular matrix.

2. If an n × n matrix A is invertible, then
\[
(A^{-1})^{-1} = A .
\]

3. If an n × n matrix A is invertible, then $A^t$ is also invertible, and
\[
(A^t)^{-1} = (A^{-1})^t .
\]

4. If A is an n × n invertible matrix, then
\[
\det A^{-1} = \frac{1}{\det A} .
\]

Chapter 12

Application of Matrices

Gravitation cannot be held responsible for people falling in love. How on earth can you explain in terms of
chemistry and physics so important a biological phenomenon as first love? Put your hand on a stove for a
minute and it seems like an hour. Sit with that special girl for an hour and it seems like a minute. That’s
relativity.

—Albert Einstein

12.1 Elementary Row Operations

Let $r_i$ denote row i of a matrix A. There are 3 elementary row operations, namely

1. $r_i \leftrightarrow r_j$, meaning interchange row i with row j.

2. $r_i \to k r_i$, k ≠ 0, meaning multiply row i by a non-zero scalar k.

3. $r_i \to k r_i + r_j$, k ≠ 0, meaning multiply row i by a non-zero scalar k and add row j.

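Example 12.1.1. Consider the matrix $A = \begin{bmatrix} 1 & 0 & 2 \\ 4 & 1 & 3 \\ 3 & 2 & 6 \end{bmatrix}$. Then
\[
r_1 \leftrightarrow r_2 \ \text{gives} \ \begin{bmatrix} 4 & 1 & 3 \\ 1 & 0 & 2 \\ 3 & 2 & 6 \end{bmatrix},
\]
\[
r_2 \to 2 r_2 \ \text{gives} \ \begin{bmatrix} 1 & 0 & 2 \\ 8 & 2 & 6 \\ 3 & 2 & 6 \end{bmatrix},
\]
\[
r_1 \to 2 r_1 + r_3 \ \text{gives} \ \begin{bmatrix} 5 & 2 & 10 \\ 4 & 1 & 3 \\ 3 & 2 & 6 \end{bmatrix}.
\]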

12.2 Inverses Using Row Operations

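We can use row operations to find the inverse of A by writing the matrix $(A \mid I_n)$ and then applying
row operations until it takes the form $(I_n \mid A^{-1})$.

Example 12.2.1. Find the inverse of $A = \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix}$.

Solution: We write
\[
\left[\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{array}\right].
\]
Then performing row operations we have
\[
r_2 \to r_2 - r_1 \qquad
\left[\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 0 & -1 & -1 & 1 \end{array}\right]
\]
\[
r_1 \to r_1 + 3 r_2 \qquad
\left[\begin{array}{cc|cc} 2 & 0 & -2 & 3 \\ 0 & -1 & -1 & 1 \end{array}\right]
\]
\[
r_1 \to \tfrac{1}{2} r_1 \qquad
\left[\begin{array}{cc|cc} 1 & 0 & -1 & \tfrac{3}{2} \\ 0 & -1 & -1 & 1 \end{array}\right]
\]
\[
r_2 \to -r_2 \qquad
\left[\begin{array}{cc|cc} 1 & 0 & -1 & \tfrac{3}{2} \\ 0 & 1 & 1 & -1 \end{array}\right].
\]

Therefore $A^{-1} = \begin{bmatrix} -1 & \tfrac{3}{2} \\ 1 & -1 \end{bmatrix}$. Checking can be done by verifying that $A^{-1} A = I$:
\[
\begin{bmatrix} -1 & \tfrac{3}{2} \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I .
\]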

12.3 Linear Equations

An equation of the kind
\[
a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b
\]
is called a linear equation in the n variables $x_1, x_2, \cdots, x_n$, where $a_1, a_2, \cdots, a_n$ and b are real
constants.

A solution of a linear equation $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$ is a sequence of n numbers $s_1, s_2, \cdots, s_n$
such that the equation is satisfied when we substitute $x_1 = s_1, x_2 = s_2, \cdots, x_n = s_n$. The set of all
solutions of the equation is called its solution set.

A finite set of linear equations in the variables $x_1, x_2, \cdots, x_n$ is called a system of linear equations
or a linear system.

A sequence of numbers $s_1, s_2, \cdots, s_n$ is called a solution of the system if $x_1 = s_1, x_2 = s_2, \cdots,
x_n = s_n$ is a solution of every equation in the system. Not all systems of linear equations have
solutions. A system of equations that has no solutions is said to be inconsistent. If there is at
least one solution, it is called consistent.

Every system of linear equations has either no solutions, exactly one solution or infinitely
many solutions.

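An arbitrary system of m linear equations in n unknowns will be written as
\[
\begin{array}{ccc}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &=& b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &=& b_2 \\
\vdots & & \vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &=& b_m
\end{array}
\]
where $x_1, x_2, \cdots, x_n$ are the unknowns. We can write the coefficients and constants as a rectangular array of numbers,
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}.
\]
This is called the augmented matrix for the system.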
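Example 12.3.1. The augmented matrix for the system of equations
\[
\begin{array}{rcl}
x_1 + x_2 + 2x_3 &=& 9 \\
2x_1 + 4x_2 - 3x_3 &=& 1 \\
3x_1 + 6x_2 - 5x_3 &=& 0
\end{array}
\]
is
\[
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 2 & 4 & -3 & 1 \\ 3 & 6 & -5 & 0 \end{bmatrix}.
\]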

Example 12.3.2. Find the solution set of

2x1 − x2 + x3 = 4
−3x1 + 2x2 − 4x3 = 1
x1 − 5x3 = 0

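Solution: The augmented matrix for the linear system is
\[
\begin{bmatrix} 2 & -1 & 1 & 4 \\ -3 & 2 & -4 & 1 \\ 1 & 0 & -5 & 0 \end{bmatrix}.
\]

Doing row operations, we have
\[
r_3 \leftrightarrow r_1 \qquad
\begin{bmatrix} 1 & 0 & -5 & 0 \\ -3 & 2 & -4 & 1 \\ 2 & -1 & 1 & 4 \end{bmatrix}
\]
\[
r_2 \to r_2 + 3 r_1, \ r_3 \to r_3 - 2 r_1 \qquad
\begin{bmatrix} 1 & 0 & -5 & 0 \\ 0 & 2 & -19 & 1 \\ 0 & -1 & 11 & 4 \end{bmatrix}
\]
\[
r_2 \leftrightarrow r_3 \qquad
\begin{bmatrix} 1 & 0 & -5 & 0 \\ 0 & -1 & 11 & 4 \\ 0 & 2 & -19 & 1 \end{bmatrix}
\]
\[
r_2 \to (-1) r_2 \qquad
\begin{bmatrix} 1 & 0 & -5 & 0 \\ 0 & 1 & -11 & -4 \\ 0 & 2 & -19 & 1 \end{bmatrix}
\]
\[
r_3 \to r_3 - 2 r_2 \qquad
\begin{bmatrix} 1 & 0 & -5 & 0 \\ 0 & 1 & -11 & -4 \\ 0 & 0 & 3 & 9 \end{bmatrix}.
\]

The corresponding system of linear equations derived from this augmented matrix is
\[
\begin{array}{rcl}
x_1 - 5x_3 &=& 0 \\
x_2 - 11x_3 &=& -4 \\
3x_3 &=& 9 .
\end{array}
\]

Now using the method of back substitution, we find the values of the unknowns as follows:
\[
\begin{array}{l}
x_3 = 3 \\
x_2 = -4 + 33 = 29 \\
x_1 = 0 + 15 = 15 .
\end{array}
\]

The solution set is
\[
(x_1, x_2, x_3) = (15, 29, 3) .
\]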
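
The two phases used above, elimination to a triangular form followed by back substitution, can be sketched in Python as follows. The sketch assumes a square system with a unique solution and uses exact fractions; the function name `solve_gauss` is an illustrative choice, and the pivots picked by the code need not match the row operations chosen by hand.

```python
from fractions import Fraction


def solve_gauss(aug):
    """Solve an n x n system with a unique solution from its augmented matrix."""
    n = len(aug)
    M = [[Fraction(x) for x in row] for row in aug]
    # Forward elimination: create zeros below each pivot.
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the system does not have a unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    # Back substitution: solve for the last unknown first.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


# Augmented matrix of Example 12.3.2.
aug = [[2, -1, 1, 4], [-3, 2, -4, 1], [1, 0, -5, 0]]
solution = solve_gauss(aug)
print([str(v) for v in solution])  # ['15', '29', '3']
```

Substituting the result back into the three original equations is a quick independent check of the answer.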

12.3.1 Applications of Linear Equations

Linear equations arise in many applications, for example, quadratic interpolation, temperature
distribution and the Global Positioning System (GPS).

12.4 Row Echelon Form

A matrix is in reduced row-echelon form if it has the following properties:

1. If a row does not consist entirely of zeros, then the first non-zero number in the row is a 1
(called a leading 1).

2. If there are any rows that consist entirely of zeros, then they are grouped together at the
bottom of the matrix.

3. In any two successive rows that do not consist entirely of zeros, the leading 1 in the lower row
occurs further to the right than the leading 1 in the higher row.

4. Each column that contains a leading 1 has zeros elsewhere.

A matrix having properties 1, 2 and 3 (but not necessarily 4) is said to be in row-echelon form.

Thus a matrix in reduced row-echelon form must have zeros above and below each leading 1.

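Example 12.4.1. These are in row-echelon form:
\[
\begin{bmatrix} 1 & -3 & 1 & 0 & -8 \\ 0 & 1 & -9 & 6 & 0 \\ 0 & 0 & 0 & 1 & 7 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 1 & 1 & -3 & 4 \\ 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]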

Example 12.4.2. These are in reduced row-echelon form


   
  0 1 0 1 0 0 0
1 0 −4 0
0 1 2 0 , 0 0 0 ,
  0 0 0 0
 .
0 0 1 0 0 0 0
0 0 0 1
0 0 0 0 0 0 0

The procedure for reducing a matrix to a reduced row-echelon form is called Gauss-Jordan Elim-
ination and the procedure which produces a row-echelon form is called the Gauss Elimination.
The Gauss Elimination method requires fewer elementary row operations than the Gauss-Jordan
method.

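Example 12.4.3. Solve the following system of linear equations
\[
\begin{array}{rcl}
-3x_2 + 4x_3 &=& -2 \\
x_1 + 5x_2 + 2x_3 &=& 9 \\
x_1 + x_2 - 6x_3 &=& -7 .
\end{array}
\]

Solution: The augmented matrix is
\[
\begin{bmatrix} 0 & -3 & 4 & -2 \\ 1 & 5 & 2 & 9 \\ 1 & 1 & -6 & -7 \end{bmatrix}.
\]
Doing row operations yields
\[
r_1 \leftrightarrow r_2, \ r_2 \leftrightarrow r_3, \ r_2 \to r_2 - r_1 \qquad
\begin{bmatrix} 1 & 5 & 2 & 9 \\ 0 & -4 & -8 & -16 \\ 0 & -3 & 4 & -2 \end{bmatrix}
\]
\[
r_2 \to -\tfrac{1}{4} r_2, \ r_3 \to r_3 + 3 r_2 \qquad
\begin{bmatrix} 1 & 5 & 2 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 10 & 10 \end{bmatrix}
\]
\[
r_3 \to \tfrac{1}{10} r_3 \qquad
\begin{bmatrix} 1 & 5 & 2 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 1 & 1 \end{bmatrix},
\]
which is now in echelon form and is equivalent to
\[
\begin{array}{rcl}
x_1 + 5x_2 + 2x_3 &=& 9 \\
x_2 + 2x_3 &=& 4 \\
x_3 &=& 1 .
\end{array}
\]
This means $x_3 = 1$, $x_2 = 4 - 2x_3 = 2$ and finally $x_1 = 9 - 5x_2 - 2x_3 = -3$. Hence
\[
(x_1, x_2, x_3) = (-3, 2, 1) .
\]
Alternatively we could continue with the elementary row operations as follows:
\[
r_1 \to r_1 - 5 r_2, \ r_2 \to r_2 - 2 r_3 \qquad
\begin{bmatrix} 1 & 0 & -8 & -11 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\]
\[
r_1 \to r_1 + 8 r_3 \qquad
\begin{bmatrix} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix},
\]
which is now in reduced row-echelon form and is equivalent to
\[
\begin{array}{rcl}
x_1 &=& -3 \\
x_2 &=& 2 \\
x_3 &=& 1 .
\end{array}
\]
Therefore
\[
(x_1, x_2, x_3) = (-3, 2, 1) .
\]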
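
Gauss-Jordan elimination can be sketched as a short routine that produces the reduced row-echelon form of any matrix. The Python code below is an illustration under the assumption that the matrix is given as a list of rows; it uses exact fractions, and the function name `rref` is a common but here hypothetical choice. Because the reduced row-echelon form is unique, the result does not depend on the particular sequence of row operations used.

```python
from fractions import Fraction


def rref(matrix):
    """Reduced row-echelon form by Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    lead = 0  # index of the row that receives the next leading 1
    for col in range(cols):
        if lead == rows:
            break
        pivot = next((r for r in range(lead, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue  # no leading 1 in this column
        M[lead], M[pivot] = M[pivot], M[lead]
        p = M[lead][col]
        M[lead] = [x / p for x in M[lead]]  # make the leading entry 1
        for r in range(rows):
            if r != lead and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[lead])]
        lead += 1
    return M


# Augmented matrix of Example 12.4.3.
aug = [[0, -3, 4, -2], [1, 5, 2, 9], [1, 1, -6, -7]]
reduced = rref(aug)
print([[int(x) for x in row] for row in reduced])
# [[1, 0, 0, -3], [0, 1, 0, 2], [0, 0, 1, 1]]
```

The last column of the reduced matrix can be read off directly as the solution (−3, 2, 1), with no back substitution needed.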

When solving a system of linear equations, three types of solution set can arise. To illustrate them,
we look at several augmented matrices that have already been reduced to echelon form.

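Case 1

\[
\begin{bmatrix} 1 & 2 & 0 & 3 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 1 & -1 \end{bmatrix}
\]
is equivalent to
\[
\begin{array}{rcl}
x_1 + 2x_2 &=& 3 \\
x_2 + x_3 &=& 4 \\
x_3 &=& -1 ,
\end{array}
\]
which implies that $x_1 = -7$, $x_2 = 5$ and $x_3 = -1$.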

Case 2

\[
\begin{bmatrix} 1 & 0 & -2 & 1 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
is equivalent to
\[
\begin{array}{rcl}
x_1 - 2x_3 &=& 1 \\
x_2 - x_3 &=& 3 ,
\end{array}
\]
which implies that
\[
\begin{array}{l}
x_2 = x_3 + 3 \\
x_1 = 2x_3 + 1 \\
x_3 = x_3 .
\end{array}
\]
In this case all the solutions are expressed in terms of $x_3$. Any arbitrary value t can be assigned to $x_3$
and the resulting values of $x_1$, $x_2$ and $x_3$ will satisfy all the equations in the system. The solution
set is therefore infinite and is written as follows:
\[
(x_1, x_2, x_3) = (2t + 1, t + 3, t) .
\]

Case 3

\[
\begin{bmatrix} 1 & -2 & 4 & 0 \\ 0 & 1 & 3 & -2 \\ 0 & 0 & 0 & -4 \end{bmatrix}.
\]
The equation represented by the last row is $0x_1 + 0x_2 + 0x_3 = -4$. Clearly, we can never find
suitable values for $x_1$, $x_2$ and $x_3$ which satisfy this equation. Therefore the system has no solution.
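
The three cases can be told apart mechanically once the augmented matrix is in echelon form. The Python sketch below assumes its input is already in row-echelon form (it does not perform the reduction itself) and the function name `classify` is illustrative. It looks for a row of the form 0 = c with c non-zero, and otherwise compares the number of non-zero rows with the number of unknowns.

```python
def classify(echelon_aug):
    """Classify a system from its augmented matrix in row-echelon form."""
    n_unknowns = len(echelon_aug[0]) - 1
    nonzero_rows = 0
    for row in echelon_aug:
        coeffs, rhs = row[:-1], row[-1]
        if all(c == 0 for c in coeffs):
            if rhs != 0:
                return "no solution"      # a row reading 0 = rhs with rhs != 0
            continue                      # a row of zeros carries no information
        nonzero_rows += 1
    if nonzero_rows == n_unknowns:
        return "unique solution"
    return "infinitely many solutions"   # at least one free unknown remains


case1 = [[1, 2, 0, 3], [0, 1, 1, 4], [0, 0, 1, -1]]
case2 = [[1, 0, -2, 1], [0, 1, -1, 3], [0, 0, 0, 0]]
case3 = [[1, -2, 4, 0], [0, 1, 3, -2], [0, 0, 0, -4]]

print(classify(case1))  # unique solution
print(classify(case2))  # infinitely many solutions
print(classify(case3))  # no solution
```

In echelon form each non-zero row contributes exactly one leading entry, which is why counting non-zero rows is enough to decide whether any unknown is left free.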

The reduced row-echelon form of a matrix is unique, but a row-echelon form is not: by
changing the sequence of elementary row operations it is possible to arrive at different row-echelon
forms.
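Example 12.4.4. Find the values of α for which the following system of equations is (a) consistent
(b) inconsistent.
\[
\begin{array}{rcl}
-3x_1 + x_2 &=& -2 \\
x_1 + 2x_2 &=& 3 \\
2x_1 + 3x_2 &=& \alpha .
\end{array}
\]

Solution: The augmented matrix is
\[
\begin{bmatrix} -3 & 1 & -2 \\ 1 & 2 & 3 \\ 2 & 3 & \alpha \end{bmatrix}.
\]
Doing elementary row operations, we have
\[
r_1 \leftrightarrow r_2 \qquad
\begin{bmatrix} 1 & 2 & 3 \\ -3 & 1 & -2 \\ 2 & 3 & \alpha \end{bmatrix}
\]
\[
r_2 \to r_2 + 3 r_1, \ r_3 \to r_3 - 2 r_1 \qquad
\begin{bmatrix} 1 & 2 & 3 \\ 0 & 7 & 7 \\ 0 & -1 & \alpha - 6 \end{bmatrix}
\]
\[
r_2 \to \tfrac{1}{7} r_2 \qquad
\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & -1 & \alpha - 6 \end{bmatrix}
\]
\[
r_3 \to r_3 + r_2 \qquad
\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & \alpha - 5 \end{bmatrix},
\]
which is now in echelon form. The last row is equivalent to $0x_1 + 0x_2 = \alpha - 5$.

(a) The system is consistent if and only if α − 5 = 0, that is, when α = 5.

(b) The system is inconsistent if α − 5 ≠ 0, that is, when α ≠ 5.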

12.5 Homogeneous System of Linear Equations

Definition 12.5.1. A homogeneous system of linear equations is a system in which all the constant
terms are zero:
\[
\begin{array}{ccc}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &=& 0 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &=& 0 \\
\vdots & & \vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &=& 0 .
\end{array}
\]

Any homogeneous system of equations will always have a solution no matter what the coefficient
matrix is like, and so can never be inconsistent.

The obvious solution is x1 = x2 = · · · = xn = 0. This solution is known as the trivial solution.

12.5.1 Solution of Homogeneous Systems

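Example 12.5.1. Find the solution set of the following homogeneous system of linear equations
\[
\begin{array}{rcl}
x_1 - 2x_2 + x_3 &=& 0 \\
2x_1 + x_2 - 3x_3 &=& 0 \\
-3x_2 + x_3 &=& 0 .
\end{array}
\]

Solution: The augmented matrix is
\[
\begin{bmatrix} 1 & -2 & 1 & 0 \\ 2 & 1 & -3 & 0 \\ 0 & -3 & 1 & 0 \end{bmatrix}.
\]
Doing the elementary row operations, we have
\[
r_2 \to r_2 - 2 r_1 \qquad
\begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 5 & -5 & 0 \\ 0 & -3 & 1 & 0 \end{bmatrix}
\]
\[
r_2 \to \tfrac{1}{5} r_2 \qquad
\begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & -3 & 1 & 0 \end{bmatrix}
\]
\[
r_3 \to r_3 + 3 r_2 \qquad
\begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & -2 & 0 \end{bmatrix},
\]
which is equivalent to
\[
\begin{array}{rcl}
x_1 - 2x_2 + x_3 &=& 0 \\
x_2 - x_3 &=& 0 \\
-2x_3 &=& 0 .
\end{array}
\]
Back substitution gives $x_3 = 0$, $x_2 = 0$ and $x_1 = 0$. Therefore the system has only one solution, the
trivial solution $(x_1, x_2, x_3) = (0, 0, 0)$.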

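In general, for a homogeneous system of m equations in n unknowns: (i) if n = m and the determinant
of the coefficient matrix is non-zero, the system has only the zero solution; (ii) if m < n, the system
has a non-zero solution.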

Theorem 12.5.1. A homogeneous system of linear equations with more unknowns than equations
has a non-zero solution.

12.6 Tutorial 11

1. Let $A = \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix}$. Solve for B the equation $AB = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.
2. Let A, B, C be square matrices of order n. Prove that
(i) A + B = B + A, (ii) A + (B + C) = (A + B) + C, (iii) (AB)C = A(BC).

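3. Find the determinant of
\[
C = \begin{bmatrix}
2 & 1 & 2 & 1 & 1 \\
1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 2 \\
1 & 1 & 2 & 2 & 1
\end{bmatrix}.
\]
4. Given that $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 & 3 \\ 2 & 0 & -4 \end{bmatrix}$, find $(AB)^t$ and $B^t A^t$ and verify that
$(AB)^t = B^t A^t$.

5. Let $E(x) = \begin{bmatrix} x & 1 \\ -1 & 0 \end{bmatrix}$. Show that

(a) E(x)E(0)E(y) = −E(x + y).

(b) the inverse of E(x) is E(0)E(−x)E(0).

6. Show that
\[
\begin{vmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{vmatrix} = (b - a)(c - a)(c - b) .
\]
7. Find the inverses of
\[
A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 2 & 0 \\ 0 & 1 & 3 \end{bmatrix}
\quad\text{and}\quad
C = \begin{bmatrix} 1 & 3 & 2 \\ 3 & 1 & 4 \\ 0 & 2 & 3 \end{bmatrix}.
\]
8. Solve for x:
\[
\begin{vmatrix} 2 - x & 4 & -2 \\ 4 & 2 - x & -2 \\ -2 & -2 & 4 - x \end{vmatrix} = 0 .
\]

9. Prove that if A is an invertible matrix, then $A^t$ is also invertible and $(A^t)^{-1} = (A^{-1})^t$.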

1 17 17
 

10. Let A = 0 1 0 . Calculate A2 , A3 , A4 and find an expression for An .


0 0 1

1. Solve the following system using the Gaussian elimination method
\[
\text{(a)} \quad
\begin{array}{rcl}
x + y + z &=& 6 \\
2x - y + z &=& 3 \\
x + z &=& 4 \\
2x + y + z &=& 7 .
\end{array}
\]
2. Solve the following system using the Gauss-Jordan elimination method
\[
\text{(a)} \quad
\begin{array}{rcl}
x + y + z &=& 5 \\
2x + 3y + 5z &=& 8 \\
4x + 5z &=& 2 .
\end{array}
\]
3. Using Gauss-Jordan elimination, find the inverse of $D = \begin{bmatrix} 3 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 3 \end{bmatrix}$.

4. Consider the system of equations
\[
\begin{array}{rcl}
x + y + 2z &=& a \\
x + z &=& b \\
2x + y + 3z &=& c .
\end{array}
\]
Show that in order for this system to be consistent, a, b, and c must satisfy c = a + b.

5. In the following linear system, determine all the values of a for which the resulting linear
system has (a) no solution (b) a unique solution and (c) infinitely many solutions.
\[
\begin{array}{rcl}
x + 3y - 2z &=& 1 \\
x + 7y - 4z &=& 7 \\
2x + 4y + (a^2 - 28)z &=& a + 4 .
\end{array}
\]

6. For what values of c does the following system of linear equations have
(i) no solution, (ii) a unique solution, (iii) infinitely many solutions?
\[
\begin{array}{rcl}
x + 2y - 3z &=& 4 \\
3x - y + 5z &=& 2 \\
4x + y + (c^2 - 14)z &=& c + 2 .
\end{array}
\]
Show that in the case of infinitely many solutions, the solution may be written as
$(\tfrac{8}{7} - \alpha, \tfrac{10}{7} + 2\alpha, \alpha)$ for any real α.

7. Find the rank of the matrix $A = \begin{bmatrix} 2 & 0 & 4 & 2 \\ 0 & 4 & 8 & 4 \\ 0 & 2 & 2 & 1 \end{bmatrix}$.