
MATH1005: Analytical and Computational Foundations

Professor J K Langley, amended by Dr S Pumplün


December 1, 2021

These notes cover the whole content of this module, but not the Python component, for which
there will be separate notes and exercises. The notes may be amended or augmented during the
academic year.

1 Sets and Mathematical Reasoning


1.1 Sets and sensibility
Definition A set A is a collection of mathematical objects, be they numbers, functions, or even
other sets. When we write
x∈A
we mean that x is a member, also known as an element, of A: thus x belongs to the collection
A.

We list some key sets for later use:

• N is the collection of natural numbers, i.e. positive integers. The members of N are
1, 2, 3, . . . (not 0).

• Z is the collection of all integers 0, ±1, ±2, . . .

• Q is the collection of fractions, also called rational numbers: these are of the form p/q, where
p and q are integers with q ≠ 0.

• R is the collection of all real numbers (see later).

• C is the collection of all complex numbers (see your Linear Mathematics module for the
definition).

• ∅ is the empty set, the set with no members at all.

Note that 2 ∈ N, while 1/2 ∈ Q, but π ∉ Q (true though hard to prove). There are a number of
ways to define further sets.

• The notation
{a, b, c, d}
just means the set with elements a, b, c, d: a finite set (that is, a set with finitely many
elements) can be defined simply by listing its elements.

• The notation
{x ∈ A : P(x)} or {x ∈ A | P(x)}
means the collection of all elements x of the set A which satisfy the condition P(x).

• For example,
{x ∈ Z | x^2 < 2}
can be written more simply as {−1, 0, 1}.

• For example, the interval
{x ∈ R | a < x < b}
is usually denoted by (a, b). The full list of different types of intervals is given in G11CAL.

• If we write {q(x) | x ∈ A} then we mean the collection of all q(x) where x varies over all
members of the set A. For example, there is a simpler way to write

{ln x | x ∈ (0, ∞)}.
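Set-builder notation has a close analogue in Python's set comprehensions. The following is a minimal sketch only: since Z is infinite, we search a finite window (the name `window` and its bounds are assumptions made for this illustration) that certainly contains every integer with x^2 < 2.

```python
# Finite window standing in for Z (an assumption: any integer with
# x^2 < 2 certainly lies between -10 and 10).
window = range(-10, 11)

# {x ∈ Z | x^2 < 2}
small_squares = {x for x in window if x * x < 2}

# {q(x) | x ∈ A} with q(x) = x^2 and A = {-2, -1, 0, 1, 2}
A = {-2, -1, 0, 1, 2}
squares = {x * x for x in A}

print(small_squares == {-1, 0, 1})
print(squares == {0, 1, 4})
```

Note that a set has no repeated elements: (−2)^2 and 2^2 contribute the single member 4.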

1.2 Subsets
Definition We say that A is a subset of B, written A ⊂ B, if every element x in A also is an
element in B. This may also be expressed by saying that A is contained in B.

Note that:

• The statement A ⊂ B only makes sense when A and B are both sets.

• A ⊂ B is not the same as A ∈ B. We have to distinguish carefully between these two
notations.

• Of the statements

(a) {2} ∈ R, (b) {2} ⊂ R, (c) 2 ∈ R, (d) 2 ⊂ R,

precisely two are true; the others make no sense, as the notation is not defined. You will find
out which of them are true in the lecture in week one.

• For our basic sets we have

∅ ⊂ N ⊂ Z ⊂ Q ⊂ R ⊂ C.

• If A ⊂ B and B ⊂ A then A and B have the same elements and so are the same set, i.e.
A = B.

1.3 Operations on sets: union and intersection
Given two sets A and B, we can form their union and intersection as follows.
Definition (i) The union A ∪ B is defined as the set of all x which belong to at least one of A
and B:
A ∪ B = {x | x ∈ A or x ∈ B}.
Thus we take all elements of A and all elements of B and put them into a combined set A ∪ B.
Note that this includes elements x which belong to both.
(ii) The intersection A ∩ B is the set of all x which belong to both of A and B:

A ∩ B = {x | x ∈ A and x ∈ B}.

Example (i) What is

{x ∈ R | x^2 > 1} ∪ {sin x | x ∈ R}?


For the answer to a similar problem, check out the lecture.
(ii) The sets
A = {x ∈ [−π, π] | sin x = 1/√2} = {π/4, 3π/4},
B = {x ∈ [−π, π] | cos x = 1/√2} = {−π/4, π/4}
give
A ∩ B = {π/4}.

Note that for any sets A, B we always have

A ∩ B = B ∩ A, A ∪ B = B ∪ A,

A ∩ B ⊂ A ⊂ A ∪ B.
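For finite sets these rules can be tried out with Python's built-in set type, where `|` is union, `&` is intersection and `<=` tests "is a subset of". This is a small sketch; the particular sets A, B, C are assumptions chosen for illustration.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

union = A | B          # A ∪ B
intersection = A & B   # A ∩ B

# Union and intersection do not depend on the order of A and B.
commutative = (A | B == B | A) and (A & B == B & A)

# A ∩ B ⊂ A ⊂ A ∪ B
chain = intersection <= A <= union

# If A ⊂ C and C ⊂ A then A = C.
C = {4, 3, 2, 1}
same_set = A <= C and C <= A and A == C

print(union, intersection, commutative, chain, same_set)
```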

1.4 Statements, implication and truth


A mathematical statement is a meaningful assertion about mathematical objects that, when
assigned a “true” or “false” value, is either true or false but not both. We now introduce
different ways to write mathematical statements concisely and then explain how we can assign
them true or false values using some basic rules from logic. So there are two steps: first write a
statement that makes sense from the formal point of view, and then check whether it is true
or false.
A formally written statement which often causes confusion is (for a real number x):

x < 0 ⇒ x ≤ 0. (1)

This says that if x < 0 then x ≤ 0. Is (1) true? Two possible but mutually contradictory
arguments run like this; one of them turns out to be wrong:

• (i) If x < 0 then x can’t be greater than 0. So certainly we have that x ≤ 0 and so (1) is
true.
• (ii) If we say x ≤ 0 then this allows for the possibility that x = 0. This can’t be true if
x < 0, therefore (1) must be false.
Indeed, statement (1) is true, but to determine why this is correct we must be systematic about
what we mean by statements like A ⇒ B, and when they are true.

If we write A ⇒ B then this means “A implies B”, which is also expressed as “if A then
B”. If A holds, then B must hold also. For example, with x a real number,
x = 1 ⇒ x^2 = x. (2)
The implication A ⇒ B is sometimes also expressed as “A only if B”, i.e. A can only hold if B holds.

If we assign a true or false value to an abstract mathematical statement to give it some meaning,
then we can formulate certain rules that are usually written as “truth tables”, explaining how a
statement becomes true or false, depending on the statements it is made up of. Here
is the truth table for “A ⇒ B” (T=true, F=false):

A   B   A ⇒ B

T   T     T
T   F     F
F   T     T
F   F     T

Note that an implication is thus false only if A is true and B is false. In particular,
any implication whose hypothesis A is false is always true.
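The truth table can also be generated mechanically. The sketch below is one possible encoding (the function name `implies` is our own choice, not a built-in): it returns False exactly in the one row where A is true and B is false.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of A ⇒ B: false only when A is true and B is false."""
    return not (a and not b)

# Rows in the order TT, TF, FT, FF, as in the table above.
rows = [(a, b, implies(a, b)) for a, b in product([True, False], repeat=2)]
for a, b, value in rows:
    print(a, b, value)
```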

The converse of A ⇒ B is defined as the statement B ⇒ A, which is the reverse implication.


When we write B ⇒ A, we say “A if B”, this means “if B then A.”

Note that B ⇒ A is the same as A ⇐ B.

It is very important that

“if A then B” is NOT the same as “A if B”.

For a real number x, our statement (2)

x = 1 ⇒ x^2 = x

is true. Its converse is

x^2 = x ⇒ x = 1,

which is false since x = 0 also satisfies x^2 = x.

A further notational convention is the definition of the terms necessary and sufficient condition.
If A ⇒ B then we say:

ˆ A is a sufficient condition for B;

ˆ B is a necessary condition for A.

Thus a necessary condition for x ∈ Q is x ∈ R.

1.5 Negation and contrapositive


Let A be a statement. The notation ¬A means “not A”.
Note that ¬(¬A) is the same as A.
For example, for sets C, D,
¬(C ⊂ D)
means that C is not a subset of D, i.e. at least one member of C is not a member of D.
Sometimes there is a more succinct notation: for example ¬(π ∈ Q) can be written as π ∉ Q.
Let
A ⇒ B (3)
be a statement. Then the statement

(¬B) ⇒ (¬A). (4)

is called the contrapositive of statement (3).


Suppose we know that for two statements A and B, we have A ⇒ B, but that B does not
hold. Then A cannot possibly hold, because if it did, B would hold, which we know is false.
Therefore, from
A⇒B
we can conclude that
(¬B) ⇒ (¬A).
If we apply this reasoning again, we see that if (4) is true then

¬(¬A) ⇒ ¬(¬B), which is the same as A ⇒ B.

Thus statements (3) and (4) are equivalent: each implies the other, and so either both hold or
neither holds. This observation can be used when proving statements as sometimes it may be
easier to prove the contrapositive of a statement instead of the actual statement (see later for
examples).
Note that the contrapositive is not the same as the converse.

1.6 Examples
Each of these statements involves positive integers n. What are their contrapositives and
converses? Which are true?

(a) 4 divides n only if 4 divides n^2;


(b) n + 1 is not prime ⇒ n is prime;
(c) If the sum of the digits of n is divisible by 3 then n is divisible by 9.

Answers:
(a) The original statement is “4 divides n ⇒ 4 divides n^2.” Its contrapositive is “4 does not
divide n^2 ⇒ 4 does not divide n”. The original and contrapositive statements are true: if 4
divides n then there is a positive integer k such that n = 4k. This implies that n^2 = 4^2 k^2, thus
4 divides n^2. Its converse is “4 divides n^2 ⇒ 4 divides n”. This is false: take n = 2; then n^2 = 4
but 4 does not divide 2.
(b) The original statement “n + 1 is not prime ⇒ n is prime” has the contrapositive “n is not
prime ⇒ n + 1 is prime”. Both are false: take n = 8; then n + 1 = 9 is not prime, and neither is n.
Its converse “n is prime ⇒ n + 1 is not prime” is also false: n = 2 is prime and so is n + 1 = 3.
(This is the only exception: every prime n ≥ 3 is odd, so n + 1 is even and greater than 2, hence not prime.)
(c) Write the number n as
n = 10^m a_m + 10^{m−1} a_{m−1} + . . . + 10 a_1 + a_0,
where each digit a_j ∈ {0, 1, . . . , 9}. Thus n written in base 10 has the form n = a_m . . . a_0. Now
let N = a_m + a_{m−1} + . . . + a_1 + a_0 be the sum of the digits of n.
The original statement “N is divisible by 3 ⇒ n is divisible by 9” is false (and thus so is its
contrapositive): let n = 3; then N = 3 but n is not divisible by 9.
The converse statement “n is divisible by 9 ⇒ N is divisible by 3” is true. Look at
n − N = (10^m − 1) a_m + . . . + (10 − 1) a_1.
This is divisible by 9, since each term 10^j − 1 is divisible by 9. If n is divisible by 9 then
n = 9s for some positive integer s, and so
N = 9s − (10^m − 1) a_m − . . . − (10 − 1) a_1
is divisible by 9, and so in particular also divisible by 3.
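Answers of this kind can be sanity-checked by a brute-force search for counterexamples. The sketch below is not a proof (it only looks at n ≤ 200, an arbitrary bound), and the helper names `is_prime`, `digit_sum` and `counterexamples` are our own. An empty list means no counterexample was found in that range; a non-empty list shows the implication is false. For instance, the search reports n = 2 for the converse of (b), since 2 and 3 are both prime.

```python
def is_prime(k: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if k < 2:
        return False
    return all(k % d != 0 for d in range(2, int(k ** 0.5) + 1))

def digit_sum(k: int) -> int:
    return sum(int(c) for c in str(k))

def counterexamples(hypothesis, conclusion, limit=200):
    """Positive integers n <= limit where the hypothesis holds but the conclusion fails."""
    return [n for n in range(1, limit + 1) if hypothesis(n) and not conclusion(n)]

# (a) 4 divides n ⇒ 4 divides n^2, and its converse
a_original = counterexamples(lambda n: n % 4 == 0, lambda n: (n * n) % 4 == 0)
a_converse = counterexamples(lambda n: (n * n) % 4 == 0, lambda n: n % 4 == 0)

# (b) n + 1 is not prime ⇒ n is prime, and its converse
b_original = counterexamples(lambda n: not is_prime(n + 1), is_prime)
b_converse = counterexamples(is_prime, lambda n: not is_prime(n + 1))

# (c) 3 divides the digit sum ⇒ 9 divides n, and its converse
c_original = counterexamples(lambda n: digit_sum(n) % 3 == 0, lambda n: n % 9 == 0)
c_converse = counterexamples(lambda n: n % 9 == 0, lambda n: digit_sum(n) % 3 == 0)

print(a_original, a_converse[:3])
print(b_original[:3], b_converse)
print(c_original[:3], c_converse)
```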

1.7 If and only if


Definition When we write A ⇔ B, we mean
A⇒B and B ⇒ A,
equivalently
A⇒B and A ⇐ B.
This is often expressed as
A if and only if B.
It may then take a moment of thought to recall that:
“A if B” is “A ⇐ B”; “A only if B” is “A ⇒ B”.

1.8 Statements involving and/or
When we write “A or B” we mean that at least one (possibly both) of A and B holds. “A and
B” means that both hold.
The negation of “A or B” is “¬A and ¬B”, that means “neither A nor B”; the negation of “A
and B” is “¬A or ¬B”, that means “not both of A and B”.
For example, for real x,
x > 1 or x < −1 ⇒ x^2 > 1.
Note that the contrapositive of this is
x^2 ≤ 1 ⇒ x ≤ 1 and x ≥ −1,
which introduces an “and”.

Here are the truth tables for “and” and “or” (T=true, F=false):

A   B   A and B

T   T      T
T   F      F
F   T      F
F   F      F

A   B   A or B

T   T     T
T   F     T
F   T     T
F   F     F

and the one for the negation of a statement A:

A   ¬A

T   F
F   T

Optional: Show that the truth tables of the two statements “A ⇒ B” and “¬A or B” are
the same. Thus both statements are “logically equivalent” and can be used interchangeably!
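Comparing truth tables is a finite check, so it can be delegated to a few lines of Python. The sketch below (the helper `equivalent` and the encoding `implies` are our own names) confirms the two negation rules for “or” and “and”, and that an implication agrees with its contrapositive but not with its converse. The same helper could be applied to the optional exercise above.

```python
from itertools import product

def equivalent(f, g) -> bool:
    """True if f and g agree on every assignment of truth values to (A, B)."""
    return all(f(a, b) == g(a, b) for a, b in product([True, False], repeat=2))

def implies(a: bool, b: bool) -> bool:
    # False only when A is true and B is false.
    return not (a and not b)

# not (A or B)  versus  (not A) and (not B)
negate_or = equivalent(lambda a, b: not (a or b), lambda a, b: (not a) and (not b))
# not (A and B)  versus  (not A) or (not B)
negate_and = equivalent(lambda a, b: not (a and b), lambda a, b: (not a) or (not b))

# A ⇒ B  versus its contrapositive and its converse
contrapositive_same = equivalent(implies, lambda a, b: implies(not b, not a))
converse_same = equivalent(implies, lambda a, b: implies(b, a))

print(negate_or, negate_and, contrapositive_same, converse_same)
```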

1.9 Back to the problem of (1)


Is the statement
A ⇒ A or B (5)

always true? Let’s look at its contrapositive, which is

¬A and ¬B ⇒ ¬A,

which is plainly always true, whatever A and B are. So (5) is always true.
If we take A to be “x < 0” and B to be “x = 0” then “A or B” becomes “x ≤ 0” and so
our statement (1) for real numbers x,

x < 0 ⇒ x ≤ 0,

is TRUE.

1.10 Quantifiers
There are two quantifier symbols used widely in mathematics:

• the symbol ∀ is usually read as “for all” (sometimes “for each”, “for every”, “to each”);

• the symbol ∃ means “there exists”.

For example,
∀x ∈ R ∃n ∈ N such that n > x. (6)
“For every real number x there exists a natural number (i.e. positive integer) n with n > x.”
“To each real number x corresponds a natural number n with n > x.”
This is a true statement. Note that it should not be interpreted as saying that the n
corresponding to a given x is unique.

These quantifiers are useful but need to be treated with some care. The statement (6) can
be written more economically as

∀ x ∈ R ∃n ∈ N (n > x).

If we then reverse the order of the quantifiers we obtain

∃n ∈ N ∀ x ∈ R (n > x).

This says that there exists a positive integer n which is greater than every real x, and is obviously
false. It is thus important to not change the order as this changes the statement.
Negating a statement involving quantifiers also requires some thought. If we wish to negate
(6) then we assert that there is some real x for which there does not exist a natural number n
with n > x i.e.
∃ x ∈ R such that ∀n ∈ N n ≤ x,
which of course is false, since (6) is true. The general rule is that negation swaps the two
quantifiers and negates the condition that follows them: the negation of “∀x ∈ S, P(x)” is
“∃x ∈ S, ¬P(x)”, and the negation of “∃x ∈ S, P(x)” is “∀x ∈ S, ¬P(x)”.
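On finite sets, ∀ and ∃ correspond to Python's `all` and `any`, which makes the effect of quantifier order easy to see. The following is only an illustration under stated assumptions: the finite ranges `xs` and `ns` stand in for the infinite sets, and the condition P(x, n), "n = x + 1", is our own choice because it separates the two orders on a finite domain.

```python
xs = range(0, 5)   # finite stand-in for the domain of x
ns = range(1, 6)   # finite stand-in for the domain of n

def P(x, n):
    return n == x + 1

# ∀x ∃n P(x, n): each x may use its own n
forall_exists = all(any(P(x, n) for n in ns) for x in xs)

# ∃n ∀x P(x, n): one single n must work for every x
exists_forall = any(all(P(x, n) for x in xs) for n in ns)

# Negation of ∀x ∃n P(x, n), written as ∃x ∀n ¬P(x, n)
negated = any(all(not P(x, n) for n in ns) for x in xs)

print(forall_exists, exists_forall, negated)
```

Here the first statement is true, the second is false, and the third has the opposite truth value to the first, as the negation rule predicts.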

2 Proof and Mathematical Induction
2.1 Why proof?
The Dictionary of Mathematics (E J Borowski and J M Borwein, publ. HarperCollins 1989)
defines “proof” as:

“a sequence of statements, each of which is either validly derived from those preceding it or
is an axiom or assumption, and the final member of which, the conclusion, is the statement of
which the truth is thereby established.”

It goes on to say:

“A direct proof proceeds linearly from premises to conclusion”;

“an indirect proof assumes the falsehood of the desired conclusion and shows that to be im-
possible”.

We offer three reasons why proof is important in mathematics:

• because of the key role played by mathematics in scientific, engineering and everyday life;

• because even in mathematics surprising things may happen (for example, if you add up real
numbers does the answer you get depend on the order in which you add them up?);

• because proof in mathematics, unlike in many other subjects, generally is possible.

In the following, we do not set out to present a comprehensive treatment of proof, more a
recipe for how to prove mathematical assertions, but we will look at instances of some of the
main types of proof or disproof.

2.2 Disproof by counter-example


Consider
2^2 − 1 = 3, 2^3 − 1 = 7, 2^5 − 1 = 31, 2^7 − 1 = 127.
Each of these is a prime number, i.e. is at least 2 and has no positive integer factors apart from
1 and itself. These might lead us to believe the statement

“for every prime number p, the number 2^p − 1 is also prime.” (7)

However,
2^11 − 1 = 2047 = 23 · 89
and this solitary example suffices to show that (7) is false.
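A counter-example like this can be found by a direct search. The sketch below tests 2^p − 1 for the primes p below 14 (the bound and the helper names `is_prime` and `smallest_factor` are assumptions made for illustration) and reports the first p for which 2^p − 1 fails to be prime.

```python
def is_prime(k: int) -> bool:
    """Trial division by every d with d*d <= k."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def smallest_factor(k: int) -> int:
    """Smallest factor of k greater than 1 (k itself if k is prime)."""
    d = 2
    while d * d <= k:
        if k % d == 0:
            return d
        d += 1
    return k

primes = [p for p in range(2, 14) if is_prime(p)]
failures = [p for p in primes if not is_prime(2 ** p - 1)]

first = failures[0]
number = 2 ** first - 1
factor = smallest_factor(number)
print(first, number, factor, number // factor)
```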

2.3 Direct proof
Consider the following proposition, in which n ≥ 0 is an integer.

Proposition Let x_0, x_1, . . . , x_n be n + 1 distinct real numbers, and let y_0, y_1, . . . , y_n be n + 1
real numbers. Then there exists a polynomial Q_n(x), of degree at most n, such that Q_n(x_j) = y_j
for j = 0, . . . , n.

Here the degree means the highest power of x which occurs in Q_n(x).

The most satisfactory way to prove an assertion like this is, if possible, to exhibit Q_n directly and
show that it satisfies the required conditions. For this problem, a solution comes in the form of
the Lagrange interpolation formula. We define Q_n(x) to be

\[
y_0 \frac{(x - x_1) \cdots (x - x_n)}{(x_0 - x_1) \cdots (x_0 - x_n)}
+ y_1 \frac{(x - x_0)(x - x_2) \cdots (x - x_n)}{(x_1 - x_0)(x_1 - x_2) \cdots (x_1 - x_n)}
+ \cdots
+ y_n \frac{(x - x_0)(x - x_1) \cdots (x - x_{n-1})}{(x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1})}.
\]

Here in the term involving y_j we multiply together all the factors (x − x_i) apart from x − x_j, and
the factors in the denominator are there to make this term equal to y_j at x_j. Since each term in
our sum has degree at most n and the term involving y_j is 0 at x_i for i ≠ j, we see that Q_n(x)
has the required properties.

Thus we have proved the existence of Qn constructively. We will see another proof of the
existence of Qn later.
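The Lagrange formula translates directly into code. The sketch below is one possible transcription, using exact rational arithmetic from Python's `fractions` module so that the check Q_n(x_j) = y_j is not blurred by rounding; the function name `lagrange` and the sample points are assumptions chosen for illustration.

```python
from fractions import Fraction

def lagrange(xs, ys):
    """Return a function evaluating the Lagrange interpolating polynomial."""
    if len(set(xs)) != len(xs):
        raise ValueError("the points x_j must be distinct")

    def Q(x):
        total = Fraction(0)
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            # Term involving y_j: product over all i != j of (x - x_i)/(x_j - x_i)
            term = Fraction(yj)
            for i, xi in enumerate(xs):
                if i != j:
                    term *= Fraction(x - xi, xj - xi)
            total += term
        return total

    return Q

xs = [0, 1, 3]
ys = [2, -1, 5]
Q = lagrange(xs, ys)
print([Q(x) == y for x, y in zip(xs, ys)])
```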

Note that the details of this proof are not important for this module: rather, this is included
as an illustration of an approach to proof.

Something to think about (optional): is Q_n unique, i.e. for given n, x_0, . . . , x_n, y_0, . . . , y_n, is
there another polynomial with the same properties?

2.4 Another direct proof


Constructive proofs are common in geometry.

Theorem Let A, B, C be three points in a plane, and assume that A, B, C are not collinear
(i.e. do not lie on a straight line). Then there exists a circle passing through A, B, C.

Sketch of the idea (see lectures). Join A to B and B to C by straight line segments.
Let D and E be the midpoints of AB and BC respectively. Construct straight lines perpendic-
ular to AB and BC at D and E respectively. Let the point where these lines intersect be O.
Think about the triangles AOB and BOC.

Again, the details of this proof are not important for this module: rather, this is included as
an illustration of an approach to proof.

2.5 A direct, but non-constructive, proof


Theorem Let a, b, c, d be positive real numbers. Then the polynomial

P(x) = x^4 + a x^3 + b x^2 + c x − d

has a root A, i.e. P(A) = 0, with 0 < A < B, where B = d^{1/4} is the positive fourth root of d.

Proof We have P(0) = −d < 0. Also, since B^4 = d,
P(B) = B^4 + aB^3 + bB^2 + cB − d = aB^3 + bB^2 + cB > 0
since a, b, c, B are positive. So the graph of y = P(x) must cut the real axis somewhere between
x = 0 and x = B, i.e. at some A ∈ (0, B), as it can be drawn without lifting the pen from the
paper (P is “continuous”).

Note that this proof is not constructive: it shows that A exists but gives no explicit formula for A.
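Although the proof gives no formula for A, the sign change between 0 and B is enough to approximate A numerically by repeated halving (bisection). The sketch below is an illustration only: the coefficients a = 1, b = 2, c = 3, d = 16 and the tolerance are assumptions, and the function `bisect` is our own helper, not a library routine.

```python
def P(x, a=1.0, b=2.0, c=3.0, d=16.0):
    return x**4 + a * x**3 + b * x**2 + c * x - d

def bisect(f, lo, hi, tol=1e-10):
    """Assumes f(lo) < 0 < f(hi) and f continuous.

    Keeps a bracket [lo, hi] with f(lo) < 0 <= f(hi) and halves it
    until it is shorter than tol.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

B = 16.0 ** 0.25          # fourth root of d
root = bisect(P, 0.0, B)
print(root, P(root))
```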

As before, the details of this proof are not important for this module: rather, this is included as
an illustration of an approach to proof.

2.6 Proof using the contrapositive


We saw earlier that the statement A ⇒ B is equivalent to its contrapositive ¬B ⇒ ¬A. Hence, in order to prove
A ⇒ B, we may instead prove ¬B ⇒ ¬A, which is sometimes easier. Here is an instance of this.

Proposition Let n be a positive integer. If 3 divides n^2 then 3 divides n.

Proof We prove the contrapositive: suppose that 3 does not divide n. Then the remainder when we
divide n by 3 must be 1 or 2. Hence

n = 3m + r where m ∈ Z and r ∈ {1, 2}.

This gives
n^2 = 9m^2 + 6mr + r^2.
Here 9m^2 and 6mr are obviously divisible by 3. But r^2 is either 1 or 4, so is not divisible by 3,
and therefore n^2 is not divisible by 3.

2.7 Proof by contradiction


This is related to the method in §2.6. If we want to show that some statement A is true, it is
often effective to show that assuming A false leads to a transparently false conclusion. This is
sometimes known as reductio ad absurdum.

Theorem (Euclid). There are infinitely many prime numbers.

Proof Suppose that there are only finitely many prime numbers, say n of them. Write them
down in increasing order as
2 = p_1 < 3 = p_2 < . . . < p_n.
Now let
M = p_1 p_2 . . . p_n + 1.
Thus we multiply together p_1, . . . , p_n and add 1. Now M > 1, so M has a prime factor, which
must be one of p_1, . . . , p_n, say p_j. This is a contradiction, because when we divide M by p_j we get
remainder 1 ≠ 0. Our assumption that there are finitely many primes has led to a contradiction,
so there must be infinitely many.

Note: (1) We used the fact that every integer M ≥ 2 has a prime factor. Why? If M is
itself prime, then we’ve finished. If not, M has a factor M_1 with 1 < M_1 < M, and any factor
of M_1 is a factor of M. Now either M_1 is prime, or M_1 has a factor M_2 with 1 < M_2 < M_1. Repeating
this finitely many times we get a prime factor of M.
(2) This proof does not show that if you multiply together the first n primes and add 1, you get
a prime. In fact,
2 · 3 · 5 · 7 · 11 · 13 + 1 = 30031 = 59 · 509.
(3) Finally, the details of this proof are again not important for this module: rather, this is
included as an illustration of an approach to proof.
(4) There is a short video clip of this proof on YouTube; the link is at the bottom of the Moodle page.
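The numbers M built in the proof are easy to compute for the first few primes. The sketch below (the helper name `smallest_prime_factor` is our own) records, for each product of the first few primes plus 1, its smallest prime factor. In each case that factor is a prime not among those multiplied together, and in the last case M itself is not prime.

```python
def smallest_prime_factor(m: int) -> int:
    """Smallest factor of m greater than 1; it is necessarily prime."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

first_primes = [2, 3, 5, 7, 11, 13]
product = 1
results = []
for p in first_primes:
    product *= p
    M = product + 1                      # product of the primes so far, plus 1
    results.append((M, smallest_prime_factor(M)))

print(results)
```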

2.8 The method of proof by mathematical induction


Suppose we have some statement P(n) which depends on the integer n. For example,
possibilities for P(n) might be:
(i) 5^n > n^5;
(ii) 5 divides n^5 − n;
(iii) n can be expressed as a product of prime numbers;
(iv) Σ_{k=1}^{n} k = 1 + 2 + . . . + n = n(n + 1)/2.
Suppose next that we can find some integer N such that:
(a) P (N ) is true;
(b) for every integer n ≥ N , if P (n) is true then so is P (n + 1).

Applying (b) with n = N we deduce that P (N + 1) is true, and applying (b) again we get
P (N + 2), and subsequently P (N + 3) etc. Indeed to deduce P (n) for any n ≥ N , starting from
P (N ), we only need to apply (b) n − N times.

Thus (a) and (b) together imply that P (n) is true for all integers n ≥ N . This is called
the principle of mathematical induction (PMI).

Here’s another way to look at the PMI, which is based on the idea of proof by contradiction.
Assume that (a) and (b) are satisfied for our P (n). Suppose it’s not the case that P (n) is true

for every integer n ≥ N . Then there must be a least integer m ≥ N for which P (m) is false.
Since we know P (N ) is true, we must have m > N , and so we can write m = n + 1, where
n ≥ N . Because m is the least integer not less than N for which our property P is false, it must
be the case that P (n) is true. But now (b) tells us that since P (n) is true so is P (m).
Thus assuming that it is not the case that P (n) holds for all integers n ≥ N has led to a
contradiction, and so P (n) must be true for all n ≥ N . When we argue like this, the integer m
which arises is sometimes known as the minimal criminal.

Proposition We have 5^n > n^5 for all integers n ≥ 6.

Proof Let P(n) be the statement that 5^n > n^5. Now

5^6 = 15625 > 7776 = 6^5,

so P(6) is true. Here we are taking N = 6, and this is called the anchor step, or induction
beginning.

Assume next that n ≥ 6 and that P(n) is true. This is called the induction hypothesis. Now

5^{n+1} = 5 · 5^n > 5n^5

by our induction hypothesis P(n). But, since n ≥ 6,

\[
(n+1)^5 = n^5 \left( \frac{n+1}{n} \right)^5 = n^5 \left( 1 + \frac{1}{n} \right)^5
\le n^5 \left( 1 + \frac{1}{6} \right)^5 = n^5 \, (2.161394033\ldots) < 5n^5 .
\]

Putting these together we get 5^{n+1} > (n + 1)^5, and so we have shown that P(n) ⇒ P(n + 1)
for n ≥ 6. This is called the induction step.

We conclude by the PMI that P (n) holds for all integers n ≥ 6.
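A quick computation shows why the anchor is N = 6. The sketch below simply evaluates both sides for n = 1, . . . , 30 (the upper bound is an arbitrary choice) and also recomputes the constant (1 + 1/6)^5 used in the induction step. It checks finitely many cases only and is no substitute for the proof.

```python
# Truth value of 5^n > n^5 for n = 1, ..., 30 (exact integer arithmetic).
holds = {n: 5 ** n > n ** 5 for n in range(1, 31)}
failing = [n for n, ok in holds.items() if not ok]

# The bound (1 + 1/6)^5 from the induction step; it must be below 5.
ratio_bound = (1 + 1 / 6) ** 5

print(failing)
print(ratio_bound)
```

Note that P(1) happens to be true, but P(2) is false, so the induction cannot be anchored at N = 1: the step from P(n) to P(n + 1) is only available for n ≥ 6.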


Remark Some people prefer to deduce P (n) from P (n − 1). This approach is essentially the
same as that given in these notes. To prove P (n) for n ≥ N you have to show that P (N ) is
true and that P (n − 1) ⇒ P (n) for n − 1 ≥ N i.e. n ≥ N + 1.

2.9 Examples of proof by mathematical induction


1. Prove by induction that 5 divides n^5 − n for every integer n ≥ 2.

Let P(n) be the statement that 5 divides n^5 − n.

Anchor step: Since
2^5 − 2 = 32 − 2 = 30 = 6 · 5
we see that P(2) is true.
Induction hypothesis: Suppose next that n ≥ 2 is an integer and that P(n) is true.
Induction step: Expanding out gives

(n + 1)^5 − (n + 1) = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 − (n + 1) = (n^5 − n) + 5n^4 + 10n^3 + 10n^2 + 5n.

Now 5 divides n^5 − n, by the induction hypothesis P(n), and 5 clearly divides 5n^4, 10n^3,
10n^2 and 5n. So 5 divides (n + 1)^5 − (n + 1), and we have deduced P(n + 1) from P(n).
Thus P(n) holds for all integers n ≥ 2 by induction.

2. Consider again the proposition, in which n ≥ 0 is an integer:

Let x_0, x_1, . . . , x_n be n + 1 distinct real numbers, and let y_0, y_1, . . . , y_n be n + 1 real
numbers. Then there exists a polynomial Q_n, of degree at most n, such that Q_n(x_j) = y_j for
j = 0, . . . , n.

We can prove this by mathematical induction. For a given integer n ≥ 0 let P(n) be the
above statement.
Anchor step: P(0) is true, because setting Q_0(x) = y_0 (constant, so clearly of degree at most
0) makes Q_0(x_0) = y_0.
Induction hypothesis: Suppose that n ≥ 0 is an integer and that P(n) is true.
Induction step: We want to deduce P(n + 1). So let x_0, . . . , x_{n+1} be distinct real numbers,
and let y_0, . . . , y_{n+1} be real numbers. By the induction hypothesis P(n) we can find a polynomial
Q_n, of degree at most n, which satisfies Q_n(x_j) = y_j for j = 0, . . . , n. Now set

\[
Q_{n+1}(x) = Q_n(x) + \bigl( y_{n+1} - Q_n(x_{n+1}) \bigr)
\frac{(x - x_0) \cdots (x - x_n)}{(x_{n+1} - x_0) \cdots (x_{n+1} - x_n)} .
\]

Then we find that Q_{n+1}(x_j) = Q_n(x_j) = y_j for j = 0, . . . , n, because the added fraction vanishes at each of x_0, . . . , x_n, and
Q_{n+1}(x_{n+1}) = Q_n(x_{n+1}) + (y_{n+1} − Q_n(x_{n+1})) = y_{n+1}
as required. Also Q_{n+1} has degree at most n + 1. Thus we have deduced P(n + 1) from P(n).

Since P (0) is true we obtain in succession P (0), P (1), P (2), . . ., and P (n) is true for every
integer n ≥ 0 by induction.

3. Let x ∈ R with x ≥ −1. Prove that
(1 + x)^n ≥ 1 + nx
for every integer n ≥ 1.

Let P(n) be the statement that (1 + x)^n ≥ 1 + nx.

Anchor step: P(1) is true, since (1 + x)^1 = 1 + x.
Induction hypothesis: Suppose that n ≥ 1 is an integer and that P(n) is true.
Induction step: Since x ≥ −1 we have 1 + x ≥ 0 and so using P(n) gives
(1 + x)^{n+1} = (1 + x) · (1 + x)^n
≥ (1 + x)(1 + nx)
= 1 + x + nx + nx^2
= 1 + (n + 1)x + nx^2
≥ 1 + (n + 1)x,

so that we have deduced P(n + 1). Hence P(n) holds for all n ≥ 1 by the PMI.

Something to think about (optional): what happens if −2 ≤ x < −1?

2.10 Strong induction


Sometimes the standard form of induction, in which we deduce P(n + 1) from P(n), is not
quite sufficient. Induction is commonly applied to problems involving recurrent
sequences such as the Fibonacci numbers, or to the representation of integers as a product of
primes (Fundamental Theorem of Arithmetic). In applications of this type, the case P(n) alone
is not enough to deduce the case P(n + 1) in the induction step; one usually needs additional
predecessors, e.g. the two preceding cases P(n) and P(n − 1), or all
preceding cases P(1), . . . , P(n). The induction principle remains valid in this modified form.
This variation of the induction method is called strong induction:
Suppose we have a statement P (n) for n ∈ N.
Anchor Step: There is some N ∈ N, such that P (N ) is true.
Induction Hypothesis: Assume that P (N ), P (N + 1), . . . , P (n) are true.
Induction Step: Using that n ≥ N and P (N ), P (N + 1), . . . , P (n) are true, we show that
P (n + 1) is true.
Then by the principle of strong induction, P (n) is true for all n ≥ N .

For application of induction to two-term recurrent sequences like the Fibonacci numbers, one
only needs two preceding cases, P (n) and P (n − 1) in the induction step, and two anchor cases
(e.g., P (1) and P (2)) to get the induction going. The logical structure of such a proof is of the
following form:
Anchor step: P(n) is true for n = 1, 2.
Induction hypothesis: Let n ≥ 2 and assume that P(n − 1) and P(n) are true.
Induction step: Using that P(n − 1) and P(n) hold, we show that
P(n + 1) holds.
By the principle of strong induction, P(n) holds for all n ∈ N.
(Note that here in the induction hypothesis we could also have said: “Assume P(1), P(2), . . . , P(n)
are true”; this is logically correct, though a bit redundant, as only the last two cases P(n − 1) and P(n) are needed.)
Proposition Every integer n ≥ 2 can be factored into prime numbers.
Proof For an integer n ≥ 2 let P(n) be the statement “n can be factored into prime
numbers”.
Anchor step: P(2) is true, since 2 is prime.
Induction hypothesis: Let n ≥ 2 and assume that P(2), P(3), . . . , P(n) are true.
Induction step: There are two cases:
(i) n + 1 is prime. Then P(n + 1) holds.
(ii) n + 1 is not prime. Then n + 1 = ab for two natural numbers a, b with 1 < a, b < n + 1, so 2 ≤ a, b ≤ n.
Since P(a) and P(b) hold by the induction
hypothesis, we know that a = p_1 · · · p_k and b = q_1 · · · q_l for primes p_1, . . . , p_k, q_1, . . . , q_l. Thus
n + 1 = p_1 · · · p_k · q_1 · · · q_l is a prime factorization of n + 1. Thus P(n + 1) holds, and
we have shown the assertion using strong induction.

The Fibonacci sequence consists of the so-called Fibonacci numbers 0, 1, 1, 2, 3, 5, 8, 13, 21,
. . . and is defined via F_0 = 0, F_1 = 1 and F_{n+2} = F_{n+1} + F_n for all n ≥ 0. Define the roots of
x^2 − x − 1 = 0 as ϕ = (1 + √5)/2 and ψ = (1 − √5)/2, so that ϕ − ψ = √5.
Proposition The n-th Fibonacci number is given by

\[
F_n = \frac{\varphi^n - \psi^n}{\varphi - \psi} = \frac{\varphi^n - \psi^n}{\sqrt{5}} .
\]

Proof Let P(n) be the statement “F_n = (ϕ^n − ψ^n)/(ϕ − ψ)”. We use strong induction together with the
formula F_{n+2} = F_{n+1} + F_n.
Anchor step: F_0 = (ϕ^0 − ψ^0)/(ϕ − ψ) = 0 and F_1 = (ϕ^1 − ψ^1)/(ϕ − ψ) = 1, so P(0) and P(1) hold.
Induction hypothesis: Assume that n ≥ 0 and that P(n) and P(n + 1) both hold.
Induction step: We have

\[
F_{n+2} = F_{n+1} + F_n
= \frac{\varphi^{n+1} - \psi^{n+1}}{\varphi - \psi} + \frac{\varphi^n - \psi^n}{\varphi - \psi}
= \frac{\varphi^n (\varphi + 1) - \psi^n (\psi + 1)}{\varphi - \psi}
= \frac{\varphi^{n+2} - \psi^{n+2}}{\varphi - \psi} .
\]

The second equality holds by the induction hypothesis. For the last equality we used that ϕ, ψ
are roots of x^2 − x − 1, thus ϕ^2 = ϕ + 1 and ψ^2 = ψ + 1. Hence we have shown that
P(n + 2) is true using that P(n + 1) and P(n) are true, and so P(n) holds for all n ≥ 0 by strong
induction.
Note that (and this is optional, but interesting!) both induction and strong induction are equiv-
alent to the well-ordering principle of the natural numbers: every non-empty subset of N has a
smallest element.
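The closed formula for F_n can be compared numerically with the recurrence. The sketch below uses floating-point arithmetic, so the formula is rounded to the nearest integer before comparison; the range n ≤ 40 is an assumption chosen to stay well within floating-point accuracy, and the function names `fib` and `closed_form` are our own.

```python
from math import sqrt

def fib(n: int) -> int:
    """F_n from the recurrence F_0 = 0, F_1 = 1, F_{n+2} = F_{n+1} + F_n."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

phi = (1 + sqrt(5)) / 2
psi = (1 - sqrt(5)) / 2

def closed_form(n: int) -> float:
    """(phi^n - psi^n) / (phi - psi), evaluated in floating point."""
    return (phi ** n - psi ** n) / (phi - psi)

agree = all(round(closed_form(n)) == fib(n) for n in range(0, 41))
print([fib(n) for n in range(9)], agree)
```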

2.11 The minimal criminal revisited (optional)


Sometimes the standard form of induction, in which we deduce P (n + 1) from P (n), is not quite
sufficient, as observed in the last section. We can use another way of arguing as well. Consider
the following problem.

Define the polynomials fn by

f_0(x) = 1, f_1(x) = x, f_n(x) = 2x f_{n−1}(x) − f_{n−2}(x) (n = 2, 3, . . .).

Let x ∈ R. Prove that f_n(−x) = (−1)^n f_n(x) for every integer n ≥ 0.

Here f_2(x) = 2x^2 − 1, f_3(x) = 4x^3 − 3x etc.

To prove the required assertion we can fix x ∈ R and let P(n) be the statement that f_n(−x) =
(−1)^n f_n(x). Then f_0(−x) = 1 = (−1)^0 · 1 = (−1)^0 f_0(x) and f_1(−x) = −x = (−1)^1 · x =
(−1)^1 f_1(x), so P(0) and P(1) are true.

However, a problem arises when we try to deduce P(n + 1) from P(n). We need

f_{n+1}(−x) = (−1)^{n+1} f_{n+1}(x)

and we know that

f_{n+1}(x) = 2x f_n(x) − f_{n−1}(x),
which involves f_{n−1} as well as f_n. So we cannot deduce P(n + 1) from P(n) alone.
To the rescue comes the idea of the minimal criminal. Suppose it is NOT the case that P (n)
holds for all integers n ≥ 0. Then there is some LEAST integer n ≥ 0 for which P (n) is not
true. This minimal criminal n must be at least 2, because we have already checked P (0) and
P (1). So P (n − 1) and P (n − 2) must be true, because n is at least 2 and is minimal. So

f_n(−x) = 2(−x) f_{n−1}(−x) − f_{n−2}(−x)
= −2x(−1)^{n−1} f_{n−1}(x) − (−1)^{n−2} f_{n−2}(x)
= (−1)^n (2x f_{n−1}(x) − f_{n−2}(x))
= (−1)^n f_n(x),

contradicting our assumption that P (n) is false.


We have obtained a contradiction from the assumption that it is not the case that P (n) holds
for all n ≥ 0. Hence we conclude that in fact P (n) IS true for all integers n ≥ 0.
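The recurrence is also a convenient way to evaluate f_n, and the identity just proved can be spot-checked on sample values. The sketch below uses integer inputs so that all arithmetic is exact; the ranges of n and x are arbitrary choices, and the function name `f` simply mirrors the notation of this section.

```python
def f(n: int, x: int) -> int:
    """f_0 = 1, f_1 = x, f_n = 2x f_{n-1} - f_{n-2}, evaluated at x."""
    prev, curr = 1, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, 2 * x * curr - prev
    return curr

# Check f_n(-x) = (-1)^n f_n(x) for n = 0, ..., 11 and x = -5, ..., 5.
parity_ok = all(
    f(n, -x) == (-1) ** n * f(n, x)
    for n in range(12)
    for x in range(-5, 6)
)
print(f(2, 3), f(3, 2), parity_ok)
```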

Something to think about (optional): what is the connection between the f_n(x) and cos nt?

3 Inequalities
3.1 Introduction
Inequalities
x<y
are generally harder to work with than equations, particularly when one tries to multiply them.
For example
a = b and c = d ⇒ ac = bd
whereas

−4 < −3, −2 < −1, but it is not true that 8 = (−4)(−2) < (−3)(−1) = 3.

To develop some rules which we can use we need to go back to first principles.

3.2 Some basic rules and definitions


For every real number x, precisely one of the following three possibilities is true:
(a) x > 0 i.e. x is positive; (b) x = 0; (c) 0 > x i.e. x is negative.

We assume these to be true and do not have to prove them; such assumptions are called axioms. For
positive real numbers we also have the following two rules (two more axioms, which we do not have to
prove either):
(P1) the sum of two positive real numbers is positive;
(P2) the product of two positive real numbers is positive.

For real numbers a, b in general, if we write b > a we mean that b − a > 0,
and a < b means the same as b > a.

We can now derive some additional rules. Let x, y, z be any real numbers.

(i) Let x < y and y < z. Then x < z.


Proof Since x < y and y < z we see that y − x and z − y are positive and therefore by P1 so
is (y − x) + (z − y) = z − x. Therefore we get x < z.

(ii) Let x < y. Then x + z < y + z.


Proof This is true because x < y gives 0 < y − x = (y + z) − (x + z) and so x + z < y + z.

(iii) Let x < y and z > 0. Then zx < zy.


Proof This follows because x < y gives y − x > 0, and so by P2 we have zy − zx = z(y − x) > 0.
Adding zx to both sides using (ii) gives zy > zx.

When we multiply by a negative quantity the rule is:


(iii*) Let x < y and z < 0. Then zx > zy (thus we have to reverse the inequality sign).
Proof To see this, note by (ii) that adding −z to both sides of z < 0 gives 0 < −z. Thus
−zx = x(−z) < y(−z) = −zy and adding zx + zy to both sides gives zy < zx.

We also have:
Let x > 0. Then 1/x > 0.
Proof Obviously 1/x ≠ 0 because otherwise we’d get 1 = x(1/x) = x · 0 = 0.
Now suppose 0 > 1/x. Adding −1/x to both sides we get −1/x > 0 and P2 gives −1 =
x(−1/x) > 0, a contradiction. Thus we must have 1/x > 0.

3.3 The relations x ≤ y, y ≥ x


The statement x ≤ y means “x < y or x = y”.
Similarly y ≥ x means “y > x or x = y”, which is the same as “x < y or x = y” and so as x ≤ y.

If x ≥ 0 we also say x is non-negative; if x ≤ 0 then x is called non-positive.

Using rule (iii) from §3.2 we see that if x ≤ y and z ≥ 0 then y − x and z are each either
positive or 0 and so is zy − zx = z(y − x), so that zx ≤ zy.
This way we get analogous statements for all the ones in the previous section with < replaced
by ≤.

3.4 Example
Solve the following inequality for real x:
\[ \frac{1}{x} + \frac{2}{1-x} > 0, \qquad x \neq 0, 1 \tag{8} \]
(i.e. determine all x ∈ R with x ≠ 0, 1 such that (8) is satisfied).
We have to exclude x = 0, 1 because if x is 0 or 1 the LHS of (8) is not defined.

Solution. We need
\[ 0 < \frac{1}{x} + \frac{2}{1-x} = \frac{1 - x + 2x}{x(1-x)} = \frac{1+x}{x(1-x)}. \]
For this to be true the numerator 1 + x and the denominator x(1 − x) must both be positive, or
both be negative.
Now if 0 < x < 1 then x(1 − x) is positive, and so is 1 + x.
For x < 0 or x > 1 the denominator x(1 − x) is negative, so we need 1 + x < 0 i.e. x < −1.

So our solution is x < −1 or 0 < x < 1 i.e. x ∈ (−∞, −1) ∪ (0, 1).
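
As a quick sanity check (not part of the original notes), the following Python sketch compares the claimed solution set with a direct evaluation of the left-hand side of (8) at rational sample points. The function names and the sampling grid are illustrative choices; a finite sample can support the answer but does not prove it.

```python
from fractions import Fraction

def lhs(x):
    """Left-hand side of (8): 1/x + 2/(1 - x), defined for x not in {0, 1}."""
    return 1 / x + 2 / (1 - x)

def in_claimed_solution(x):
    """Membership in the claimed solution set (-inf, -1) U (0, 1)."""
    return x < -1 or 0 < x < 1

# Rational sample points in [-5, 5] with step 1/10, skipping x = 0 and x = 1.
samples = [Fraction(k, 10) for k in range(-50, 51) if k not in (0, 10)]

# Points where the inequality and the claimed solution set disagree.
mismatches = [x for x in samples if (lhs(x) > 0) != in_claimed_solution(x)]
print(mismatches)
```

Exact rational arithmetic is used so that the boundary point x = −1, where the left-hand side equals 0, is classified correctly.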

3.5 The modulus and the triangle inequality


Given a real number x, we define |x| to be x (if x ≥ 0) and −x (if x < 0). Thus we always have
|x| ≥ 0.

[Figure: graph of y = |x|, with |x| = x for x > 0 and |x| = −x for x < 0.]

We can also think of |x| as the distance from x to 0. A convenient formula, for real x, is
|x| = √(x²), in which the square root sign always means the non-negative square root. We then
have, if x and y are real,
\[ |xy| = \sqrt{x^2 y^2} = \sqrt{x^2}\,\sqrt{y^2} = |x||y|. \]
Also,

\[ |x+y|^2 = (x+y)^2 = x^2 + y^2 + 2xy \le x^2 + y^2 + 2|xy| = |x|^2 + |y|^2 + 2|x||y| = (|x|+|y|)^2, \]

and so taking square roots we have |x + y| ≤ |x| + |y| for all real x, y. This is called the
triangle inequality. We also have as a consequence |x| = |(x − y) + y| ≤ |x − y| + |y|, and so
|x − y| ≥ |x| − |y| for all x, y ∈ R.
Note There is a triangle inequality for complex numbers z, w as well, which is given by

|z + w| ≤ |z| + |w|
where now |z| = √(z_0² + z_1²) if z = z_0 + i z_1 with z_0, z_1 ∈ R. However, for complex numbers the
proof is different, and obviously it is not true in general that |z| = ±z e.g. |i| = 1 ≠ ±i.

Example Determine all real numbers x with

|x| + 2 |x + 2| < 7.

Solution. The inequality is difficult to manipulate with modulus signs present. However splitting
up the range under consideration allows us to solve for x.

(a) Suppose x ≥ 0.
Then |x| = x, |x + 2| = x + 2 and the inequality reads 3x + 4 < 7 i.e. x < 1. Thus in this range
we require 0 ≤ x < 1.

(b) Suppose −2 ≤ x < 0.


In this range |x| = −x and |x + 2| = x + 2 and we need to solve −x + 2(x + 2) = x + 4 < 7,
i.e. x < 3. This is true for all x in this range.
We could alternatively note here that for −2 ≤ x < 0 we have |x| ≤ 2 and |x + 2| ≤ 2 and so
|x| + 2 |x + 2| ≤ 2 + 2(2) = 6 < 7.

(c) Suppose x < −2.


Then we need to solve −x − 2(x + 2) = −3x − 4 < 7 i.e. −3x < 11 i.e. x > −11/3 (reversing
the inequality since we multiply by −1/3 < 0).

So our full solution is −11/3 < x < 1.
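
The case analysis can be cross-checked numerically. The sketch below (an illustration, with a sampling grid chosen here rather than taken from the notes) tests the original inequality directly at rational points and compares the result with the interval −11/3 < x < 1.

```python
from fractions import Fraction

def satisfies(x):
    """Direct test of |x| + 2|x + 2| < 7, with no case splitting."""
    return abs(x) + 2 * abs(x + 2) < 7

def in_claimed_solution(x):
    """Membership in the open interval (-11/3, 1)."""
    return Fraction(-11, 3) < x < 1

# Rational sample points in [-6, 6] with step 1/12; the grid contains -11/3 and 1.
samples = [Fraction(k, 12) for k in range(-72, 73)]
mismatches = [x for x in samples if satisfies(x) != in_claimed_solution(x)]
print(mismatches)
```

The step 1/12 is chosen so that both endpoints lie on the grid; at each endpoint the left-hand side equals exactly 7, so neither endpoint belongs to the solution set.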

3.6 Quadratics
Consider the quadratic
\[ f(x) = ax^2 + bx + c, \]
where a, b, c ∈ C. If a ≠ 0 then
\[ 4af(x) = 4a^2x^2 + 4abx + 4ac = 4a^2x^2 + 4abx + b^2 + 4ac - b^2 = (2ax+b)^2 + 4ac - b^2 \]
and so, for x ∈ C,
\[ f(x) = 0 \iff 2ax + b = (b^2 - 4ac)^{1/2}. \tag{9} \]
How to find the (in general two) values of w^(1/2) when w is a complex number is discussed in the
Linear Mathematics module.
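
A minimal sketch of how (9) turns into a computation, assuming Python's standard `cmath` module is used to pick one square root w of b² − 4ac (the other being −w). The function name `roots_from_nine` is a label invented for this illustration.

```python
import cmath

def roots_from_nine(a, b, c):
    """Solve a*x**2 + b*x + c = 0 (a != 0) via 2*a*x + b = (b**2 - 4*a*c)**(1/2)."""
    if a == 0:
        raise ValueError("a must be non-zero for a genuine quadratic")
    w = cmath.sqrt(b * b - 4 * a * c)  # one square root; the other is -w
    return ((-b + w) / (2 * a), (-b - w) / (2 * a))

def f(a, b, c, x):
    """Evaluate the quadratic at x."""
    return a * x * x + b * x + c

r1, r2 = roots_from_nine(1, 2, 5)  # x**2 + 2x + 5 has discriminant -16
print(r1, r2)
```

For this example the two values of (b² − 4ac)^(1/2) are ±4i, giving the roots −1 ± 2i.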

Definition f (x) has constant sign on R if at least one of the following holds:
f (x) ≥ 0 for all real x;
f (x) ≤ 0 for all real x.

For future reference we need the following.

Lemma Let a, b, c ∈ R. Then f (x) = ax² + bx + c has constant sign on R if and only if
b² ≤ 4ac. Moreover, if a > 0 and b² ≤ 4ac then f (x) ≥ 0 for all x ∈ R, and if a < 0 and
b² ≤ 4ac then f (x) ≤ 0 for all x ∈ R.

Proof Suppose first that a = 0. Then the straight line y = bx + c has slope b and so takes
positive and negative values if and only if b ≠ 0 i.e. if and only if b² > 4ac.

Now suppose a ≠ 0. Then f (x) has constant sign on R if and only if
g(x) = 4af (x) = (2ax + b)² + 4ac − b² has constant sign on R.
But g(x) is positive for large positive x, because of the squared term involving x.
Thus f (x) has constant sign on R if and only if g(x) ≥ 0 for all real x, which is true if and only
if 4ac − b² ≥ 0 (the least value of g, attained at x = −b/(2a), is 4ac − b²).
In particular, since f (x) = g(x)/(4a), if a > 0 and b² ≤ 4ac then f (x) ≥ 0 for all x ∈ R, and if a < 0 and
b² ≤ 4ac then f (x) ≤ 0 for all x ∈ R.

3.7 The Cauchy-Schwarz inequality


We now use the results of the last section to prove the Cauchy-Schwarz inequality.
Theorem (The Cauchy-Schwarz inequality) Let n ∈ N and let a1 , . . . , an and b1 , . . . , bn be
real numbers. Then
\[ \left( \sum_{k=1}^{n} a_k b_k \right)^2 \le \left( \sum_{k=1}^{n} a_k^2 \right) \left( \sum_{k=1}^{n} b_k^2 \right). \tag{10} \]

Proof To prove (10) consider the quadratic
\[ F(x) = \sum_{k=1}^{n} (x a_k - b_k)^2 \]

for all x ∈ R. Since F (x) is a sum of squares of real numbers, we see that F (x) ≥ 0 for all real
x. But expanding out the squares gives
\[ F(x) = \sum_{k=1}^{n} (x^2 a_k^2 - 2x a_k b_k + b_k^2) = Ax^2 - 2Bx + C, \]
where
\[ A = \sum_{k=1}^{n} a_k^2, \qquad B = \sum_{k=1}^{n} a_k b_k, \qquad C = \sum_{k=1}^{n} b_k^2. \]

Since the quadratic F (x) is non-negative for every real x, it follows from our discussion of
quadratics and the Lemma of the previous section that we must have
4B² − 4AC = (2B)² − 4AC ≤ 0 and hence B² ≤ AC,
which is (10).
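
To see (10) in action, the following sketch computes AC − B² for randomly generated rational vectors and confirms it is never negative. The helper name `cauchy_schwarz_gap`, the random seed and the ranges are arbitrary choices made for this illustration; a random test is evidence, not a proof.

```python
import random
from fractions import Fraction

def cauchy_schwarz_gap(a, b):
    """Return A*C - B**2 with A = sum a_k^2, B = sum a_k b_k, C = sum b_k^2."""
    A = sum(x * x for x in a)
    B = sum(x * y for x, y in zip(a, b))
    C = sum(y * y for y in b)
    return A * C - B * B

def random_vector(n):
    """A list of n random rationals with small numerators and denominators."""
    return [Fraction(random.randint(-9, 9), random.randint(1, 9)) for _ in range(n)]

random.seed(0)  # fixed seed so that the run is reproducible
gaps = []
for _ in range(1000):
    n = random.randint(1, 6)
    gaps.append(cauchy_schwarz_gap(random_vector(n), random_vector(n)))

print(min(gaps) >= 0)
```

When one vector is a multiple of the other, for example a = (1, 2) and b = (2, 4), the gap is exactly 0; this corresponds to F(x) having a real zero.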

Example Let
\[ S = \sum_{k=1}^{99} \sqrt{k(100-k)} = \sqrt{99} + \sqrt{2 \cdot 98} + \dots + \sqrt{98 \cdot 2} + \sqrt{99}. \]
Estimate S using Cauchy-Schwarz.

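By Cauchy-Schwarz (with a_k = √k and b_k = √(100 − k)) we have
\[ S^2 \le \left( \sum_{k=1}^{99} k \right) \left( \sum_{k=1}^{99} (100-k) \right) = \left( \sum_{k=1}^{99} k \right)^2 \]
and so
\[ S \le \sum_{k=1}^{99} k = \frac{99 \cdot 100}{2} = 4950. \]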
The correct value is 3922.835639 but our estimate is at least of the right order of magnitude.
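
The sum and the bound are easy to evaluate directly. The short sketch below uses floating-point arithmetic from the standard `math` module, so the printed value of S is an approximation.

```python
import math

# S = sum over k = 1, ..., 99 of sqrt(k * (100 - k)), in floating point.
S = sum(math.sqrt(k * (100 - k)) for k in range(1, 100))

# The Cauchy-Schwarz bound: 1 + 2 + ... + 99.
bound = sum(range(1, 100))

print(S, bound, S / bound)
```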

The Cauchy-Schwarz inequality has many applications in mathematics: for example it occurs
in connection with the isoperimetric inequality (of all loops of a given length, the one enclos-
ing the largest area is a circle), and in some treatments of the famous Heisenberg uncertainty
principle from quantum theory. It will appear again in the Linear Mathematics module.

3.8 The AM/GM inequality


Definition Let n ∈ N and let a1 , . . . , an be positive real numbers. The arithmetic mean (or
average) of the aj is
\[ A = \frac{1}{n} \sum_{j=1}^{n} a_j = \frac{a_1 + \dots + a_n}{n}. \]
The geometric mean of the aj is
\[ G = (a_1 \cdots a_n)^{1/n} = \sqrt[n]{a_1 \cdots a_n} \]

i.e. the (positive) nth root of the product a1 . . . an .


Note that log G is the AM of the real numbers log a1 , . . . , log an .

If all the aj are equal, we obviously have aj = A = G.

Theorem (AM/GM inequality) We have A ≥ G, with equality only if all the aj are equal.

Proof (Optional.) If all the aj are equal there is nothing to prove. Suppose now that not
all the aj are equal. Hence we can find ar and as with ar < A < as . Replace ar and as by
br = A and bs = ar + as − A. Note that bs > 0 because as > A. Since

b r + b s = ar + as

the arithmetic mean of our numbers is unchanged, being still A. But

br bs − ar as = A(ar + as − A) − ar as = ar (A − as ) + A(as − A) = (A − ar )(as − A) > 0

and so br bs > ar as , which means that we have increased the GM.

Since we have replaced one of our numbers by A, the number of them which equal A has
increased. Note also that if ar and as are the only ones which do not equal A then ar + as must
be 2A, since A is the AM: hence br = bs = A in this case.
Thus we can repeat this process until all of our numbers are equal to A. The AM has not
changed, being always A, but the GM increases at every step. The final GM is equal to A, and
so the original GM must be less than A.
Example Find the maximum value of xyz for real numbers x, y, z subject to the conditions that
x, y, z > 0 and x + 2y + 3z = 6.
Solution: Let p = x, q = 2y, r = 3z. Then p, q, r > 0 with AM = (p + q + r)/3 = (x + 2y + 3z)/3 = 2 and
GM = (pqr)^(1/3) ≤ 2 by the AM/GM inequality, with equality if and only if p = q = r = 2. Thus
pqr ≤ 2³, i.e. 6xyz ≤ 8, so that xyz ≤ 4/3, with equality if and
only if x = 2, y = 1, z = 2/3.
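
The answer can be checked by brute force. The sketch below searches a rational grid on the plane x + 2y + 3z = 6 for the largest value of xyz; the grid size is an arbitrary choice, picked here so that the point (2, 1, 2/3) lies on the grid.

```python
from fractions import Fraction

steps = 60  # grid resolution (an illustrative choice)
best_value = Fraction(0)
best_point = None

for i in range(1, steps):
    for j in range(1, steps):
        y = Fraction(3 * i, steps)  # y ranges over (0, 3)
        z = Fraction(2 * j, steps)  # z ranges over (0, 2)
        x = 6 - 2 * y - 3 * z       # forced by the constraint x + 2y + 3z = 6
        if x > 0 and x * y * z > best_value:
            best_value = x * y * z
            best_point = (x, y, z)

print(best_value, best_point)
```

A grid search only examines finitely many points, so it illustrates the AM/GM argument rather than replacing it.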

4 Sets of Real Numbers


4.1 Introduction
For the rest of this module we will mostly be working with the real numbers R. We will not
define or construct the real numbers in any rigorous or formal way, but we can think of them in
two fundamental (and equivalent) ways:

(i) as points on a line, going to infinity in both directions;

(ii) as infinite decimal expansions.

In particular every positive real number x has a decimal expansion

x = bn bn−1 . . . b1 b0 .a1 a2 a3 . . .

in which each of the digits bj , aj is one of 0, 1, . . . , 9. This expansion is not always unique, since
1 = 0.9̇ = 0.99999999 . . ..

Note that +∞, −∞ (∞ means infinity) are NOT real numbers, and that we will mostly just
write ∞ for +∞.
We use the following notation for intervals of real numbers:
(a, b) = {x ∈ R | a < x < b},
(a, ∞) = {x ∈ R | x > a},
(−∞, b) = {x ∈ R | x < b},
[a, b] = {x ∈ R | a ≤ x ≤ b},
(−∞, b] = {x ∈ R | x ≤ b},
(a, b] = {x ∈ R | a < x ≤ b}, etc.

4.2 Upper and lower bounds
When we have a set A of real numbers, the most natural question to ask is how large is A? For
example,
\[ B = \left\{ x \in \mathbb{R} : x^2 < \tfrac{1}{4} \right\}, \qquad C = \left\{ x \in \mathbb{R} : e^x > \tfrac{1}{4} \right\}, \]
are both infinite sets (each has infinitely many members). However there is a difference between
B and C in that C has arbitrarily large positive members, since e^x > 1/4 for large positive x. In
fact C = (− ln 4, ∞), but all members of B = (−1/2, 1/2) are less than 1/2.
Let A be any subset of R. This just means that A is a collection of real numbers (every
member of A is a member of R).
Definition M ∈ R is an upper bound (or majorant) for A if x ≤ M for all x ∈ A. We also say
that A is bounded above. m ∈ R is a lower bound (or minorant) for A if m ≤ x for all x ∈ A.
We also say that A is bounded below. A is called bounded if it is bounded above and below.

In our example, the set B is bounded above (1/2 is an upper bound), but the set C is not.

4.3 Some examples and remarks


(i) Note that if M is an upper bound for A and N ≥ M then N is also an upper bound for A.
So if A has one upper bound then it has infinitely many.

(ii) The set N is bounded below, but not above. The empty set is bounded as any real number
is both an upper and a lower bound.
(iii) Let B be the interval [−√2, √2] i.e. the set of all real numbers x such that −√2 ≤ x ≤ √2.
Then B is bounded above and below, so B is bounded.
(iv) An upper bound for A does not have to be in A. Also, “upper bound” does not mean
“boundary”.

4.4 Maxima and minima of a set of real numbers


Let A be a subset of R.
Definition X ∈ R is a maximum element of A, written X = max A, if X is in A and x ≤ X
for all x in A. This is the same as saying that X is an upper bound for A which lies in A.
Y ∈ R is a minimum of A, written Y = min A, if Y is in A and is a lower bound for A.
Proposition Any subset A of R has at most one maximum and at most one minimum element.
Proof Suppose that M1 , M2 are two maximum elements of the set A. Then x ≤ M1 and
x ≤ M2 for all x ∈ A. In particular, this implies that M1 ≤ M2 since M1 ∈ A, and since
M2 ∈ A, we also get M2 ≤ M1 . Therefore we have M1 = M2 . The proof for the minimum is
analogous.

4.5 Examples and remarks
(i) Any set with a maximum element is bounded above. However, a set may be bounded above
but have no maximum element, see (iv).

(ii) A non-empty finite set of real numbers always has a maximum element and a minimum
element. The empty set, however, has no maximum or minimum element (it has no elements at
all).

(iii) Let B = {x ∈ R : x² ≤ 2}. Then B = [−√2, √2] and clearly √2 = max B and
−√2 = min B.

(iv) Now let D = {1 − 10^(−n) : n ∈ N}. Then D is bounded above by 1. However, D has
no maximum element, because increasing n increases 1 − 10^(−n). There are elements of D
arbitrarily close to 1 and so, of all upper bounds for D, the least is 1.

(v) Take the set X = {−1/n : n ∈ N}. Since xn = −1/n increases as n increases, the
least element of X is x1 = −1. For the same reason, X has no greatest element. As n increases
towards ∞, we see that xn approaches 0. Thus 0 is the least real number which is an upper
bound, but this value is not a max since it is not in the set.
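
The behaviour in example (iv) can be seen concretely. This sketch lists the first few elements of D = {1 − 10^(−n)} as exact fractions and checks that each one is beaten by the next while all stay below 1. Only finitely many elements are inspected, so this is an illustration of the argument, not a proof.

```python
from fractions import Fraction

def d(n):
    """The element 1 - 10**(-n) of D, as an exact fraction."""
    return 1 - Fraction(1, 10 ** n)

elements = [d(n) for n in range(1, 11)]

# Each element is smaller than the next, so none of them can be a maximum.
strictly_increasing = all(u < v for u, v in zip(elements, elements[1:]))

# Every element is below the upper bound 1, and the gaps to 1 shrink.
all_below_one = all(x < 1 for x in elements)
gaps = [1 - x for x in elements]

print(strictly_increasing, all_below_one, gaps[-1])
```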

4.6 Least upper bounds


Let A be a subset of R.
Definition s ∈ R is called a least upper bound for A (or a supremum of A), if the following two
conditions hold:

(a) x ≤ s for all x ∈ A (i.e., s is an upper bound for A);

(b) for every real number t with t < s, there exists some x ∈ A such that x > t (i.e., s is
the smallest possible upper bound).

Note that every set A has at most one least upper bound: Suppose that s and t are both
least upper bounds of A. Assume first that t < s. Then from the definition of s = lub(A) it follows
that there exists some x ∈ A such that x > t, meaning that t is not an upper bound, a contradiction.
Thus t ≥ s. Assume next that t > s. Then from the definition of t = lub(A) it follows
that there exists some y ∈ A such that y > s, meaning that s is not an upper bound, a contradiction.
It follows that s = t. We conclude that we can talk about “the” least upper bound of a set A,
as it is uniquely determined.

We denote the least upper bound s of a set A by lub(A) or sup(A).

Here is an equivalent definition ((b) is equivalent to (b’)):


Definition s ∈ R is called a least upper bound for A (or a supremum of A), if the following two
conditions hold:

(a) x ≤ s for all x ∈ A (i.e., s is an upper bound for A);

(b’) for every ε > 0 there exists some x ∈ A such that x > s − ε (i.e., there exists at least one
element of A within any given distance of s).

Example Let A = {−1/n | n ∈ N}. Then lub(A) = 0:

(a) Since x = −1/n < 0 for all n ∈ N, 0 is an upper bound.
(b’) For all ε > 0 we have to show that there exists x = −1/n ∈ A such that x > 0 − ε. Rough
work: −1/n > 0 − ε if and only if 1/n < ε if and only if n > 1/ε.
Thus we can now show (b’): For ε > 0 choose n0 to be an integer such that n0 > 1/ε, then by
our “rough work” we get −1/n0 > 0 − ε, which proves (b’).
Example Let A = (−√2, √2). Then lub(A) = √2:
(a) Since x < √2 for all x ∈ A, √2 is an upper bound.
(b’) For all ε > 0 we have to show that there exists x ∈ A such that x > √2 − ε. Clearly, if
√2 − ε is not an upper bound, then √2 − ε′ is not an upper bound for any ε′ > ε, either, so we
may assume that ε is small, say ε < 2. Consider x = √2 − ε/2. Then
−√2 < √2 − ε < x = √2 − ε/2 < √2,
so x ∈ A and x > √2 − ε. Thus (b’) is proven and √2 = lub(A).

Take a non-empty subset A of R such that A is bounded above. We have already seen that A
might have no maximum element, but A has at least one upper bound M . In fact A then has
infinitely many upper bounds, and it is a fundamental property of R that among all of the upper
bounds of A there is one which is the least. This is the
Least upper bound property of the real numbers:
If A is a non-empty subset of R which is bounded above, then A has a least upper bound.

4.7 Some useful facts


Proposition Let A be a non-empty set of real numbers that is bounded above and has a maximum
element. Then max(A) = lub(A).
Proof By definition, the maximum element of A is an upper bound, and no real number t <
max(A) is an upper bound, since max(A) ∈ A.
Proposition Let s = lub(A).
(i) If s ∈ A then s is the maximum element of A.
(ii) If s ∉ A then A does not have a maximum element.
Proof (i) Since s is an upper bound and s ∈ A, s must be the maximum element.
(ii) Suppose that X = max(A) exists. Since s is an upper bound, we have X ≤ s. But X ≠ s
since X ∈ A and s ∉ A. Therefore X < s which contradicts the fact that s is the least upper
bound.

Note that the rational numbers Q do not have the least upper bound property. Consider the
non-empty set S = {x ∈ Q | −√2 < x < √2} = (−√2, √2) ∩ Q. Then S is bounded above
by √2. Now √2 is the least upper bound for S in R, however √2 ∉ Q. S does not have a least
upper bound in Q.

4.8 Examples and remarks


(i) Let A be the set of negative rational numbers. A has no maximum element, but sup A = 0.

(ii) Let C be the set of all decimals x = 0.a1 a2 . . . . in which each aj is 0 or 3. Then max C = 1/3,
and this is also the supremum (i.e., lub(C) = 1/3).

(iii) Now let C ∗ be the same as C, but with each x having only finitely many aj allowed to
be 3. The number of 3s must be finite for each x, but may depend on x.
Then 1/3 is still an upper bound but is not in C ∗ . Also, C ∗ has no max, because given any
element of C ∗ we can make a larger element by adding one extra 3. Finally, if t < 1/3 then by
taking 0.3 . . . 30 with enough 3s, we can make an element of C ∗ which is greater than t. So 1/3
is the least upper bound.

(iv) Let G be the set of all numbers of form
\[ \frac{n^2}{n^2+1}, \]
where n ∈ N. Then an upper bound for G is 1, but 1 is not in G. Show that 1 = lub(G):
(a) as noted already, 1 is an upper bound for G.
(b’) For all ε > 0 we have to show that there exists x = n²/(n² + 1) ∈ G such that x > 1 − ε. Write
\[ \frac{n^2}{n^2+1} = 1 - \frac{1}{n^2+1}. \]
Rough work: n²/(n² + 1) > 1 − ε if and only if 1 − 1/(n² + 1) > 1 − ε if and only if 1/(n² + 1) < ε if and only if
n² + 1 > 1/ε. For ε ≤ 1 this holds if and only if n > √(1/ε − 1), and for ε > 1 it holds for every n ∈ N.
Thus for every ε > 0 choose an integer n0 ≥ 1 such that n0 > √(1/ε − 1) (any n0 ≥ 1 will do if ε > 1). Then by our “rough work”
n0²/(n0² + 1) > 1 − ε. So 1 = lub(G).
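
The "rough work" is effectively a recipe for producing a witness n0 from a given ε. A minimal sketch, assuming ε is supplied as an exact `Fraction` (the function names are invented for this illustration):

```python
import math
from fractions import Fraction

def g(n):
    """The element n^2 / (n^2 + 1) of G."""
    return Fraction(n * n, n * n + 1)

def witness_index(eps):
    """A positive integer n0 with n0**2 + 1 > 1/eps, so that g(n0) > 1 - eps."""
    if eps >= 1:
        return 1  # every n works in this case
    target = 1 / eps - 1  # we need n0**2 > target
    # isqrt(m) + 1 is the least integer whose square exceeds the integer m,
    # and m = floor(target) satisfies target < m + 1.
    return math.isqrt(math.floor(target)) + 1

for eps in [Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000), Fraction(3)]:
    n0 = witness_index(eps)
    print(eps, n0, g(n0) > 1 - eps)
```

The integer square root avoids floating-point rounding, so the comparison g(n0) > 1 − ε is exact.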

(v) The empty set has no least upper bound, as every real number is an upper bound for the
empty set.

4.9 Greatest lower bounds


Let A be a subset of R.
Definition g ∈ R is called a greatest lower bound for A (or an infimum of A), if the following
two conditions hold:

(a) g ≤ x for all x ∈ A (i.e., g is a lower bound for A);

(b) for every real number h with h > g, there exists some x ∈ A such that x < h (i.e., g
is the largest possible lower bound).

Note that every set A has at most one greatest lower bound: Suppose that g and h are both greatest
lower bounds of A. Assume first that h > g. Then from the definition of g = glb(A) it follows
that there exists some x ∈ A such that x < h, meaning that h is not a lower bound, a contradiction.
Thus h ≤ g. Assume next that h < g. Then from the definition of h = glb(A) it follows
that there exists some y ∈ A such that y < g, meaning that g is not a lower bound, a contradiction.
It follows that g = h. We conclude that we can talk about “the” greatest lower bound of a set
A, as it is uniquely determined.

We denote the greatest lower bound g of a set A by glb(A) or inf(A).

Here is an equivalent definition ((b) is equivalent to (b’)):


Definition g ∈ R is called a greatest lower bound for A (or an infimum of A), if the following
two conditions hold:

(a) g ≤ x for all x ∈ A (i.e., g is a lower bound for A);

(b’) for every ε > 0 there exists some x ∈ A such that x < g + ε (i.e., there exists at least one
element of A within any given distance of g).

We denote the greatest lower bound of A by g = glb(A) or by inf (A).


Greatest lower bound property of R
If A is a non-empty set of real numbers which is bounded below, then A has a unique greatest
lower bound in R.

Everything that we proved for lub(A) holds also for glb(A), by simply reversing the
inequalities in the respective proofs. In particular:
Proposition Let A be a non-empty set of real numbers that is bounded below and has a minimum
element. Then min(A) = glb(A).
Proof By definition, the minimum element of A is a lower bound, and no real number g > min(A)
is a lower bound, since min(A) ∈ A.
Proposition Let g = glb(A).
(i) If g ∈ A then g is the minimum element of A.
(ii) If g ∉ A then A does not have a minimum element.
Proof (i) Since g is a lower bound and g ∈ A, g must be the minimum element.
(ii) Suppose that X = min(A) exists. Since g is a lower bound, we have g ≤ X. But X ≠ g
since X ∈ A and g ∉ A. Therefore X > g which contradicts the fact that g is the greatest lower
bound.

Note that the rational numbers Q do not have the greatest lower bound property, either. Consider
the non-empty set S = {x ∈ Q | −√2 < x < √2} = (−√2, √2) ∩ Q. Then S is bounded
below by −√2. Now −√2 is the greatest lower bound for S in R, however −√2 ∉ Q. S does
not have a greatest lower bound in Q.
Example Let A be the set of all x > 0 such that sin x < 0. Then glb(A) = π. There is no
minimum element.
Example Let A = {(−1)^n n²/(n² + 1) | n ∈ N}. Find inf (A) and sup(A). Does A have a minimum
and a maximum?
To answer this we rewrite an element x = (−1)^n n²/(n² + 1) of A as (−1)^n n²/(n² + 1) = 1 − 1/(n² + 1) if n is even,
and (−1)^n n²/(n² + 1) = −1 + 1/(n² + 1) if n is odd. This shows us that −1 < x < 1 for all x ∈ A. We thus
suspect that inf (A) = −1 and sup(A) = 1. We prove this only for the infimum here:
(a) −1 < x for all x ∈ A, so −1 is a lower bound.
(b’) We have to show that for every ε > 0 there exists some x ∈ A such that x < −1 + ε.
Rough work: Consider n odd (for even n, the elements of A will not lie close to −1). Then
−1 + 1/(n² + 1) < −1 + ε iff 1/(n² + 1) < ε, iff n² + 1 > 1/ε.
Now we can show: for ε > 0 choose an odd integer n0 such that n0 > 1/ε, then n0² + 1 > n0 > 1/ε and
hence by our rough work, −1 + 1/(n0² + 1) < −1 + ε.
Thus inf (A) = −1.
Since inf (A) = −1 ∉ A the set A has no minimum. It can be shown analogously that sup(A) =
1. Since sup(A) = 1 ∉ A the set A has no maximum.
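
The same kind of check works for this example. The sketch below, with function names invented for the illustration and ε taken as an exact `Fraction`, confirms the bounds −1 < x < 1 on an initial stretch of A and produces an odd witness n0 for condition (b’) of the infimum.

```python
import math
from fractions import Fraction

def element(n):
    """The element (-1)^n * n^2 / (n^2 + 1) of A."""
    return (-1) ** n * Fraction(n * n, n * n + 1)

def odd_witness(eps):
    """An odd integer n0 > 1/eps, so that element(n0) < -1 + eps."""
    n0 = math.floor(1 / eps) + 1  # least integer strictly greater than 1/eps
    return n0 if n0 % 2 == 1 else n0 + 1

# Every inspected element lies strictly between -1 and 1.
bounded = all(-1 < element(n) < 1 for n in range(1, 201))

for eps in [Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000)]:
    n0 = odd_witness(eps)
    print(eps, n0, element(n0) < -1 + eps)
```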

4.10 An application of the lub


A nice application of the least upper bound involves the length L of a curve C. Suppose that C
goes from a to b (in R2 , but the same ideas work in higher dimensions). Choose finitely many
points (vertices) a = x0 , x1 , . . . , xn = b in order along C. Now join each xj to the next point
xj+1 by a straight line, to form a polygonal curve P . Then

L(P ) ≤ L.

If we add an extra vertex to P , thus forming P′, then we get

L(P ) ≤ L(P′) ≤ L,

and so in principle L(P′) is “closer” to L than L(P ). Thus one way to define the length L is as
the supremum of L(P ) over all possible P .
The picture shows an example where (as happens typically) L(P ) < L(P′) < L. For “nice”
curves there is an easier formula for the length, expressed as an integral (see later modules).
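
As an illustration of this definition, take C to be the upper half of the unit circle, whose length is known to be π. The sketch below (the choice of curve and of equally spaced vertices is an assumption made for this example) computes L(P) for polygonal curves with 2, 4, 8, . . . segments. Doubling the number of segments keeps all the old vertices and adds new ones, so each polygon refines the previous one.

```python
import math

def polygon_length(points):
    """Total length of the polygonal curve joining the points in order."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def semicircle_vertices(n):
    """n + 1 points in order along the upper unit semicircle from (1, 0) to (-1, 0)."""
    return [(math.cos(math.pi * k / n), math.sin(math.pi * k / n))
            for k in range(n + 1)]

# L(P) for polygons with 2, 4, 8, ..., 1024 segments.
lengths = [polygon_length(semicircle_vertices(2 ** m)) for m in range(1, 11)]

for m, length in enumerate(lengths, start=1):
    print(2 ** m, length, math.pi - length)
```

The printed lengths increase with each refinement and stay below π, consistent with L being the supremum of L(P).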

[Figure: a curve C with a polygonal curve P joining vertices on C. We add an extra vertex to P, to form P′. We have L(P) < L(P′) < length(C).]

4.11 Optional extra reading: an explanation of why least upper bounds exist (but not a good way to find them)
Suppose that A is a non-empty set of real numbers, and that A is bounded above. Write each
element x of A in the form
x = q + 0.d1 d2 d3 d4 . . .
where q is an integer and each dj is an integer between 0 and 9. For example, −3/8 = −1+0.625.
For uniqueness, avoid writing 9 recurring.
Because A is bounded above, there is some real M such that x ≤ M for all x in A. Thus all
of the integers q in these “decimal” expansions satisfy q ≤ x ≤ M . So let Q be the maximum of
all these integers (to find Q, just choose any x in A and look at its q, and then try all integers
between that q and M ). Throw away all elements of A for which the integer q is less than Q.
Call the set of remaining elements A1 . Then A1 is non-empty.
Now look at the first decimal place d1 of each element of A1 , and let D1 be the maximum
of these. Throw away all elements of A1 for which d1 < D1 , and call what’s left A2 . This A2 is
non-empty. Note that all elements of A2 are of form Q + 0.D1 . . ..
Keep repeating this. Thus we look at the second decimal place d2 of each element of A2 ,
and let D2 be the maximum of these. We throw away all elements of A2 for which d2 is less than
D2 , and call the non-empty set which is left over A3 . Now look at d3 for x in A3 etc.

Now let
S = Q + 0.D1 D2 D3 D4 . . .
Then S is an upper bound for A because if x = q + 0.d1 d2 d3 . . . is in A but x > S then either
q > Q, or q = Q and d1 > D1 , or q = Q and d1 = D1 and d2 > D2 etc., and none of
these is possible for x in A. Also, all of the sets AN are non-empty, and if x is in AN then
its expansion goes x = Q + 0.D1 . . . DN−1 . . ., and so S ≤ x + 10^(1−N). So there are elements of
A as close as we like to S. So no number less than S can be an upper bound for A. So S is sup A.

4.12 (Optional) Extreme values, attained and non-attained


Much of mathematics, in particular calculus, is concerned with optimization, e.g. making quan-
tities as small or as large as possible. In informal language people often talk about finding the
maximum or minimum of some quantity, subject to some condition(s). However, as we saw in
our discussion of sets it is important to bear in mind that it may not be possible to attain these
extreme values. We give two “concrete” examples where this happens.

Example I:
The points A, B, C are not in a straight line, and the distance from A to B is 5, while that from
B to C is 4. A particle (or car, or ship) moves on a smooth path, without abrupt changes of
direction, and must go from A to C via B. How far must it travel?
[Figure: we travel on a smooth curve from A to C via B; distance AB is 5 and distance BC is 4.]

Obviously our traveller must travel a distance at least 5 + 4 = 9. However, since the corner at B
has to be smoothed off, it must travel a distance greater than 9. In principle, the distance could
be arbitrarily close to 9, but it can never equal 9.

In terms of our notation, the glb of the distance travelled is 9, but 9 is not a min because
it is not attainable.

Example II:
Consider all real-valued functions f (x) which are defined and ≥ 0 for 0 ≤ x ≤ 1, with
f (0) = 0, f (1) = 1, and continuous. This means basically that the graph of f is an unbro-
ken curve with no jumps: we will be more precise about this later on. How small can the area
under the curve be?
If we take f (x) = x^n with n a positive integer, then the area is ∫_0^1 x^n dx = 1/(n + 1). Thus
the area can be as close as we like to 0.
On the other hand, the area will always be positive. In fact there must be some c, depending
on f , with 0 < c < 1 such that f (x) ≥ 1/2 for c ≤ x ≤ 1, so that the area is at least
(1 − c)/2 > 0.

[Figure: graph of y = f(x) on [0, 1]. f(x) is at least 0.5 for x in [c, 1], so the area under the curve is at least (1 − c)/2.]

Again, 0 is the glb but not a min.

There are a lot of such instances in mathematics, in which we seek to “minimize” or “maxi-
mize” some quantity (e.g. an integral) and it is important to know whether the extreme value is
attained.

5 Sequences
5.1 The basic definitions
Definition We define a real sequence (xn ) as a non-terminating list of real numbers

xN , xN +1 , xN +2 , . . . ,

where N is some integer. Here xn is some quantity depending on n, defined for all integers
n ≥ N , for some starting integer N .

We use the ( ) to distinguish the sequence (xn ) from the particular term xn .

Sometimes a sequence is thought of as a function with domain {N, N + 1, . . .}.

NOTE: a “series” is not the same thing. We will meet series (which are sums of terms) later.

Example Define sequences (xn ), (yn ), (wn ), (un ), (vn ), (tn ) by

xn = 1/n, yn = cos n, wn = 2^n, un = (−1)^n, vn = n((−1)^n + 1), tn = 1,

for n = 1, 2, 3, . . .. We also define sn by sn = 1/n if n is odd and sn = 0 if n is even. We
consider what happens to each of these sequences as n gets large.

(xn ) is the sequence 1, 1/2, 1/3, . . . .. and it is clear that the terms are decreasing and posi-
tive and get very small as n gets large.
The first four terms of (yn ) (N.B. using radians) are 0.54, −0.42, −0.99, −0.65 to 2 decimal
places. However many terms you look at, yn appears to oscillate “randomly” and in particular
yn does not seem to approach any specific value as n gets large.
The terms wn clearly increase and get very large as n gets large. un alternates between 1
and −1. The terms vn get large and positive for even n and for odd n, vn = 0. Finally, tn is
constant, and sn approaches 0.

We need a concise description of what happens as n gets larger and larger (this idea being
expressed by the phrase “as n → ∞”). Roughly speaking, a sequence CONVERGES if there is
some real number L, such that xn is very close to L whenever n is sufficiently large. Otherwise
it diverges. In our list, (xn ) and (sn ) converge to 0, while (tn ) converges to 1. The rest diverge.

Next, we seek a precise definition of convergence. Most ways of expressing this in words have
drawbacks: e.g.

“xn gets close to L as n → ∞” (how close?);

“xn gets closer and closer to L as n → ∞” (tends to suggest that |xn+1 − L| should al-
ways be less than |xn − L|, which is not the case for sn ).

To motivate a definition, we look at another example.


Example Let
\[ a_n = \frac{n^{1/2} + 1}{n^{1/2}}. \]
Writing an = 1 + n^(−1/2) it becomes clear that when n is very large (n → ∞) the term n^(−1/2)
is very small and an is close to 1. Note that |an − 1| is the distance from an to 1, and this is
small for large n. How big do we need to make n in order that |an − 1| < 10^(−1)? We just need
n^(−1/2) < 10^(−1), i.e. n > 100 i.e. n ≥ 101. Similarly |an − 1| < 10^(−2) provided n ≥ 10^4 + 1 and,
generally, |an − 1| < 10^(−m) as soon as n ≥ 10^(2m) + 1.
In fact, if we are given a positive real number ε, no matter how small (ε being a traditional
symbol for a small positive number just as n traditionally is used to denote an integer), we can
find some integer n0 such that |an − 1| < ε for all integers n ≥ n0 . To find n0 , note that we
need n > ε^(−2) in order to make |an − 1| < ε. Thus we can choose n0 to be the least integer
> ε^(−2).
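
The recipe "n0 is the least integer greater than ε^(−2)" is easy to turn into a computation. A minimal sketch, assuming ε is given as an exact `Fraction` so that no rounding is involved (the function names are illustrative):

```python
import math
from fractions import Fraction

def n0_for(eps):
    """Least integer strictly greater than eps**(-2), for rational eps > 0."""
    return math.floor(1 / eps ** 2) + 1

def close_enough(n, eps):
    """Exact test of |a_n - 1| = n**(-1/2) < eps, rewritten as n * eps**2 > 1."""
    return n * eps ** 2 > 1

for m in range(1, 5):
    eps = Fraction(1, 10 ** m)
    n0 = n0_for(eps)
    print(eps, n0, close_enough(n0, eps), close_enough(n0 - 1, eps))
```

For ε = 10^(−m) this reproduces the values 10^(2m) + 1 found above, and shows that n0 − 1 is not yet large enough.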

5.2 Definition
Definition The real sequence (xn ) converges to the real number α, if the following is true:
For every positive real number ε there exists an integer n0 , possibly depending on ε, such that
|xn − α| < ε for all integers n ≥ n0 .
Note that n0 is allowed to depend on ε and usually will, and some people write n0 (ε) to emphasize
this.

If α ∈ R other notations for convergence to α are:

xn → α as n → ∞ ; limn→∞ xn = α.

These should not lead you to think that n ever equals ∞ or that there is some x∞ in the
sequence: they are just shorthand for “(xn ) converges to α”. A non-convergent sequence is
called divergent.
Example Back to our examples we get: for xn = 1/n, to make |xn − 0| < ε we just need
n > 1/ε, and so n ≥ n0 , where n0 is the least integer greater than 1/ε, will do. For tn = 1 then
any n0 will do.

(i) Xn = 1/n². It is fairly obvious that (Xn ) in this case converges to 0. However, we check
that our definition is satisfied. If ε > 0 is given we need |Xn − 0| = 1/n² < ε, and this is true if
n > √(1/ε). So if we choose n0 to be the least integer > √(1/ε), then we have |Xn − 0| < ε for
all integers n ≥ n0 , as required.

(ii) Yn = 2 − e^(−n). We have |Yn − 2| = e^(−n). If ε > 0 then for |Yn − 2| < ε we need
precisely e^(−n) < ε and so e^n > 1/ε and so n > ln(1/ε). So we can make n0 (ε) be the smallest
integer > ln(1/ε).
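
For example (ii) the threshold n0(ε) can be computed with the standard `math` module. This is a floating-point sketch, so it should be trusted only when ln(1/ε) is not extremely close to an integer; the lower limit 1 on n0 assumes that the sequence starts at n = 1, which the notes do not state explicitly here.

```python
import math

def n0_for_Y(eps):
    """Smallest integer n0 >= 1 with n0 > ln(1/eps) (floating-point sketch)."""
    return max(math.floor(math.log(1 / eps)) + 1, 1)

def distance_to_limit(n):
    """|Y_n - 2| = e^(-n)."""
    return math.exp(-n)

eps = 0.01
n0 = n0_for_Y(eps)
print(n0, distance_to_limit(n0) < eps, distance_to_limit(n0 - 1) < eps)
```

Since e^(−n) decreases as n increases, checking the single value n = n0 is enough to cover all n ≥ n0.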

5.3 The algebra of limits


Theorem Let (xn ), (yn ) be real sequences converging to the real numbers α, β, respectively.
Then

(i) (xn + yn ) converges to α + β.

(ii) (xn yn ) converges to αβ.

(iii) (|xn |) converges to |α|.

(iv) If β ≠ 0, then (1/yn ) converges to 1/β.

Intuitively these should be fairly obvious. However, we can prove them using our definition.

Proof (optional, at most one case covered in lectures)

(i) We have to show that for every ε > 0 there exists an integer n0 (ε) such that |(xn + yn ) −
(α + β)| < ε for all integers n ≥ n0 (ε).
Since we don’t have an explicit formula for xn +yn , how will we find n0 ? Answer: from xn and yn .

So let ε > 0 be given. First write

|(xn + yn ) − (α + β)| = |xn − α + yn − β| ≤ |xn − α| + |yn − β|

by the triangle inequality. If we can make |xn − α| and |yn − β| both < ε/2 then we get
|(xn + yn ) − (α + β)| < ε. But we know that (xn ) converges to α and (yn ) to β. So there
is some integer n1 such that |xn − α| < ε/2 for all n ≥ n1 , and there is some n2 such that
|yn − β| < ε/2 for all n ≥ n2 .

Let n0 be the larger of n1 and n2 . Then for all integers n ≥ n0 we have |xn − α| < ε/2
and |yn − β| < ε/2 and so |(xn + yn ) − (α + β)| < ε as required.

(ii) Again, we have to show that for every ε > 0 there exists an n0 such that |xn yn −αβ| < ε
for all integers n ≥ n0 . We first write

|xn yn − αβ| = |(xn − α)yn + α(yn − β)| ≤ |(xn − α)yn | + |α(yn − β)|. (11)

Suppose that ε > 0 is given. Choose a number δ which is positive and satisfies

δ(|α| + |β| + δ) = ε,

the quadratic formula giving

\[ \delta = \frac{-(|\alpha| + |\beta|) + \sqrt{(|\alpha| + |\beta|)^2 + 4\varepsilon}}{2} > 0. \]
The reason for this choice will be clear in a moment.

Since (xn ) converges to α and (yn ) to β we can find positive integers n1 and n2 such that
|xn − α| < δ for all n ≥ n1 and |yn − β| < δ for all n ≥ n2 . Again, let n0 be the larger of
n1 and n2 . Then for all n ≥ n0 we have |xn − α| < δ and |yn − β| < δ, and we also observe that
|yn | = |yn − β + β| ≤ |yn − β| + |β| < δ + |β|. Substituting all these into (11) we get

|xn yn − αβ| < δ(|β| + δ) + |α|δ = ε,

for all n ≥ n0 .
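
The choice of δ in part (ii) can be checked numerically. The sketch below (the function name and sample values are our own) computes the positive root of δ(|α| + |β| + δ) = ε and confirms that the bound used in the proof then equals ε up to rounding.

```python
import math

def delta_for_product(alpha: float, beta: float, eps: float) -> float:
    """Positive root of delta*(|alpha| + |beta| + delta) = eps."""
    s = abs(alpha) + abs(beta)
    return (-s + math.sqrt(s * s + 4.0 * eps)) / 2.0

# Sample values, chosen only for illustration.
alpha, beta, eps = 3.0, -2.0, 1e-2
delta = delta_for_product(alpha, beta, eps)
# The bound used in the proof: delta*(|beta| + delta) + |alpha|*delta.
bound = delta * (abs(beta) + delta) + abs(alpha) * delta
print(delta, bound)
```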

(iii) This is easier, as ||xn | − |α|| is either |xn | − |α| or |α| − |xn | and both of these are ≤ |xn − α|.
So to make ||xn | − |α|| < ε, we just need to make |xn − α| < ε, which we know is true for all
integers n greater than or equal to some n0 .

(iv) This time we write


\[ \frac{1}{y_n} - \frac{1}{\beta} = \frac{\beta - y_n}{\beta y_n}. \qquad (12) \]
We have to show that for every ε > 0 there exists a positive integer n0 such that |1/yn −1/β| < ε
for all n ≥ n0 .

We choose $\delta$ so that
\[ \frac{\delta}{|\beta|(|\beta| - \delta)} = \varepsilon, \]
this solving to give
\[ \delta = \frac{\varepsilon|\beta|^2}{1 + \varepsilon|\beta|}, \]
which is positive but less than |β|. There is some n0 such that for all n ≥ n0 we have |yn −β| < δ
and consequently |yn | = |β − (β − yn )| ≥ |β| − |β − yn | > |β| − δ. Substituting into (12) we
have |yn − β| < δ and |βyn | > |β|(|β| − δ) and so

\[ \left| \frac{1}{y_n} - \frac{1}{\beta} \right| < \frac{\delta}{|\beta|(|\beta| - \delta)} = \varepsilon \]

for all n ≥ n0 .
Remark Note that in (iv) we need $\beta \ne 0$: let $y_n = \frac{(-1)^n}{n}$. Then $\lim_{n\to\infty} y_n = 0$, but $\frac{1}{y_n} = n$ if
$n$ is even and $\frac{1}{y_n} = -n$ if $n$ is odd, so $(\frac{1}{y_n})$ has no limit.
If (xn ) converges to α and γ ∈ R, then (γxn ) converges to γα. To see this we just take (yn ) to
be the constant sequence yn = γ for all n in (ii).
If $(x_n)$ converges to $\alpha$ and $(y_n)$ converges to $\beta \ne 0$, where $y_n \ne 0$ for all $n$, then $(x_n/y_n)$ converges
to $\alpha/\beta$. To see this apply (ii) and (iv) to $(x_n/y_n) = (x_n \cdot \frac{1}{y_n})$.

The algebra of limits does not say anything about divergent sequences. But we can use proof by
contradiction to get results, for instance:
Proposition If (xn ) converges to α and (yn ) diverges, then (xn + yn ) diverges.
Proof Suppose that (xn + yn ) converges to γ as n tends to infinity. Then yn = (xn + yn ) − xn
and, by the algebra of limits, the sequence (yn ) tends to γ − α as n tends to infinity. This
contradicts our assumption that (yn ) is divergent.
Proposition (“Limits preserve non-strict inequalities”) If an ≤ bn and (an ) tends to a and (bn )
tends to b, then a ≤ b.
Proof Assume towards a contradiction that $a > b$. For every $\varepsilon > 0$ there is $N_1$ such that
$|a_n - a| < \varepsilon$ for all $n > N_1$, and there is $N_2$ such that $|b_n - b| < \varepsilon$ for all $n > N_2$. Therefore for
every $\varepsilon > 0$ there is $N = \max(N_1, N_2)$ such that for all $n > N$ we have both $|a_n - a| < \varepsilon$
and $|b_n - b| < \varepsilon$. Now choose $\varepsilon = \frac{a-b}{2} > 0$. Then there is $N$ such that for all $n > N$, both
$|a_n - a| < \frac{a-b}{2}$ and $|b_n - b| < \frac{a-b}{2}$. The first implies that $-(a_n - a) < \frac{a-b}{2}$, so $a_n - a > \frac{-a+b}{2}$ and
therefore $a_n > \frac{a+b}{2}$. The second implies that $b_n - b < \frac{a-b}{2}$, hence $b_n < \frac{a+b}{2}$. Together we therefore
have
\[ b_n < \frac{a+b}{2} < a_n, \]
which contradicts the assumption that $a_n \le b_n$. Thus if $a_n \le b_n$ for all $n$, $(a_n)$ tends to $a$ and $(b_n)$
tends to $b$, then $a \le b$.

Note: 1/n > 0 for each n, and hence the limit a = 0 of the sequence (1/n) satisfies
a ≥ 0. Note, however, that a does not satisfy a > 0. Hence, the term "non-strict" is important
here.

5.4 Examples
(i) Let $x_n = 1 + n^{-1} + 2^{-n}$. Then $(x_n)$ converges to 1.

(ii) Let
\[ y_n = \frac{2n^3 + 1}{n^3 + 2n + 1}. \]
Dividing top and bottom by $n^3$, the largest degree occurring here, we see that $(y_n)$ converges to 2.

(iii) Let
\[ z_n = \frac{2n^2 + 1}{n^3 + 2n + 1}. \]
Dividing top and bottom by $n^3$, the largest degree occurring here, we see that $(z_n)$ tends to 0.

(iv) Let
\[ u_n = \frac{2n^3 + 1}{n^2 + 2n + 1}. \]
Dividing top and bottom by $n^2$, we see that $u_n$ gets very large and diverges.

(v) Let $p_n = (-1)^n + 2^{-n}$. Since $2^{-n} \to 0$, we see that $(p_n)$ must diverge, because otherwise
$(-1)^n = p_n - 2^{-n}$ would tend to a limit, by the algebra of limits.
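
The behaviour claimed in examples (ii) to (iv) can be seen by evaluating the three quotients for a few large n. This is a small Python sketch with function names of our own choosing; it only illustrates the limits and is not a substitute for the argument by division.

```python
def y(n: int) -> float:
    """Example (ii): equal degrees, ratio of leading coefficients is 2."""
    return (2 * n**3 + 1) / (n**3 + 2 * n + 1)

def z(n: int) -> float:
    """Example (iii): the denominator has the larger degree."""
    return (2 * n**2 + 1) / (n**3 + 2 * n + 1)

def u(n: int) -> float:
    """Example (iv): the numerator has the larger degree."""
    return (2 * n**3 + 1) / (n**2 + 2 * n + 1)

for n in (10, 1000, 100000):
    print(n, y(n), z(n), u(n))
```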

(vi) (Optional) We prove here that the sequence given by Zn = cos n diverges.
Suppose that (Zn ) converges. Then there is a real number L such that cos n is close to L for all
large positive integers n. If n is large then so is n + 1 and therefore cos(n + 1) will also be close
to L, for n large, so that
\[ \cos(n+1) - \cos n = \int_n^{n+1} -\sin t \, dt \]

will be small, say less than 1/4 in absolute value, for all large integers n.
Now sin x ≥ 1/2 for π/6 ≤ x ≤ 5π/6. Since sine has period 2π, we get sin x ≥ 1/2 for
2mπ + π/6 ≤ x ≤ 2mπ + 5π/6, for every positive integer m. Choose a large positive integer m.
Since 5π/6 − π/6 = 2π/3 > 2, we can find an integer n, which will also be large and positive,
such that 2mπ + π/6 ≤ n < n + 1 ≤ 2mπ + 5π/6. But this gives
\[ \int_n^{n+1} -\sin t \, dt \le \int_n^{n+1} -(1/2) \, dt = -1/2 < -1/4, \]

and we have a contradiction, so our assumption that (cos n) converges must be wrong.

You will find some more examples in the handwritten lecture notes, Sections 5.4 and 5.6.

5.5 The Sandwich Theorem


Theorem (The Sandwich Theorem) Suppose that (an ), (bn ), (cn ) are sequences such that for
each n we have an ≤ bn ≤ cn , and suppose further that (an ) and (cn ) converge to α ∈ R. Then
(bn ) converges to α.

Proof
Suppose we are given a positive real number ε. Then there exists some n1 such that for all
n ≥ n1 we have |an − α| < ε, so that α − ε < an < α + ε. Similarly there is some n2 such that
α − ε < cn < α + ε for all n ≥ n2 . So if n ≥ n0 = max{n1 , n2 } (the larger of n1 , n2 ) then
α − ε < an ≤ bn ≤ cn < α + ε, which gives |bn − α| < ε as required.
Example (i) $v_n = \frac{\sin n}{n}$. We have $-1/n \le v_n \le 1/n$ and both $1/n$ and $-1/n$ tend to 0. So $v_n$
tends to 0 by the Sandwich Theorem.

(ii) $x_n = 3 + \frac{(-1)^n}{n}$ tends to 3, since $3 - 1/n \le x_n \le 3 + 1/n$.

(iii) $t_n = \frac{\ln n}{n^d}$, where $d$ is a positive real number. Then $t_n$ tends to 0: we have
\[ \ln n = \int_1^n t^{-1} \, dt \le \int_1^n t^{d/2-1} \, dt = (2/d)(n^{d/2} - 1) < (2/d) n^{d/2}, \]
and so
\[ 0 \le t_n < (2/d) n^{-d/2} \to 0. \]
See Section 5.7 in the handwritten notes for a different way to do this!

(iv) $y_n = n^{1/n}$. By (iii), we see that $\ln y_n = (1/n) \ln n \to 0$ so $y_n \to 1$.

(v) Let $|a| < 1$ and let $v_n = a^n$. We claim that $v_n \to 0$. This is fairly clear generally, and
totally obvious if $a = 0$. Assume now that $0 < |a| < 1$ and let $\varepsilon > 0$. How large does $n$ need to
be, in order to make $|v_n - 0| < \varepsilon$?

We need $|a^n| = |a|^n < \varepsilon$ and so $(1/|a|)^n > 1/\varepsilon$. Hence we need $n \ln(1/|a|) > \ln(1/\varepsilon)$,
and this is true for all $n > \frac{\ln(1/\varepsilon)}{\ln(1/|a|)}$.
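
The threshold in (v) is easy to compute for concrete values. The sketch below (function name and sample values are ours) returns the least positive integer beyond ln(1/ε)/ln(1/|a|) and prints the terms on either side of it.

```python
import math

def n0_geometric(a: float, eps: float) -> int:
    """Least positive integer n0 with n0 > ln(1/eps) / ln(1/|a|), for 0 < |a| < 1."""
    if not 0.0 < abs(a) < 1.0:
        raise ValueError("this sketch assumes 0 < |a| < 1")
    return max(1, math.floor(math.log(1.0 / eps) / math.log(1.0 / abs(a))) + 1)

# Sample values, chosen only for illustration.
a, eps = -0.5, 1e-3
n0 = n0_geometric(a, eps)
print(n0, abs(a) ** n0, abs(a) ** (n0 - 1))
```

For these sample values the term at index n0 lies below ε while the previous one does not, which is what the inequality n ln(1/|a|) > ln(1/ε) predicts.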

(vi) Let $b > 1$ and let $p$ be an integer, and set $w_n = \frac{n^p}{b^n}$. Then $|w_{n+1}/w_n| \to 1/b < 1$ as
$n \to \infty$, and so $w_n \to 0$ by the Ratio Test stated below.

For more examples, see Section 5.7 of the handwritten notes.

An important result that follows from the Sandwich Theorem is the following:
Theorem (The Ratio Test) Let (xn ) be a real sequence such that each xn is non-zero, and
that |xn+1 /xn | → L < 1 as n → ∞. Then (xn ) tends to 0.
Proof Since $\lim_{n\to\infty} |x_{n+1}/x_n| = L$, it follows that for every $\varepsilon > 0$ there exists an $n_0$ such that
for all $n > n_0$ we have
\[ \big| |x_{n+1}/x_n| - L \big| < \varepsilon. \]
Choose $s > 0$ with $L < s < 1$ and apply this with $\varepsilon = s - L$. Then for large $n$, say $n \ge N$, we have
$|x_{n+1}/x_n| < L + \varepsilon = s$. In particular $|x_{N+1}| \le s|x_N|$, $|x_{N+2}| \le s|x_{N+1}| \le s^2|x_N|$,
$|x_{N+3}| \le s|x_{N+2}| \le s^3|x_N|$, etc. Thus $|x_{N+k}| \le s^k|x_N|$ for $k = 1, 2, \ldots$. Write $n = N + k$, then
$k = n - N$ and we get for large $n$:
\[ 0 \le |x_n| \le s^{n-N}|x_N| = s^n \frac{|x_N|}{s^N} \to 0 \quad \text{as } n \to \infty. \]
This is because $\lim_{n\to\infty} s^n = 0$ as $|s| < 1$. So $(|x_n|)$, and hence $(x_n)$, converges to 0 by the Sandwich Theorem.

By the Ratio Test, the sequence $(2^n/n!)$ has limit 0. For the proof of this result and more
examples, see the handwritten notes, and the Core Booklet, Section A5.
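
For the sequence $(2^n/n!)$ the ratio of consecutive terms can be computed exactly. The following Python sketch (our own helper `x`, using exact rational arithmetic) shows that the ratio is 2/(n+1), which tends to L = 0 < 1, and prints a few terms to illustrate the decay.

```python
from fractions import Fraction
from math import factorial

def x(n: int) -> Fraction:
    """Exact value of x_n = 2^n / n!."""
    return Fraction(2**n, factorial(n))

# The ratio x_{n+1}/x_n equals 2/(n+1), which tends to 0.
ratios = [x(n + 1) / x(n) for n in range(1, 8)]
print([str(r) for r in ratios])
print(float(x(10)), float(x(20)), float(x(30)))
```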

5.6 Types of divergence


Consider
\[ a_n = (-1)^n, \qquad b_n = n^2, \qquad c_n = (-1)^n n. \]

All three diverge, but (bn ) is better behaved than the others. As n gets large, bn gets large and
positive, while an and cn do not approach anything. We say that (xn ) diverges to ∞ if xn gets
large and positive as n gets large. This means that if we choose any positive real number M , we
will have xn > M for all sufficiently large n. A precise definition is
Definition Let (xn ) be a real sequence. Then (xn ) diverges to ∞, if the following is true: For
every positive real number M , there exists an integer n0 such that xn > M for all integers n ≥ n0 .

So for every given positive number M , no matter how large M might be, there are only finitely
many n such that xn ≤ M .

(xn ) diverges to −∞ if (−xn ) diverges to ∞.


Example (i) This definition does not assume or imply that we always have xn+1 > xn . For
example, the sequence

\[ x_n = n \ (n \text{ odd}), \qquad x_n = n^2 \ (n \text{ even}), \]

diverges to ∞ but xn+1 < xn when n is even.


Proof For every positive real $M$ take the least integer greater than $M$, call it $n_0$. Then for all odd
$n \ge n_0$ we have $x_n = n \ge n_0 > M$, and for all even $n \ge n_0$ we have $x_n = n^2 \ge n \ge n_0 > M$. Thus
$x_n > M$ for all $n \ge n_0$. This means $(x_n)$ diverges to $\infty$.
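
The choice of n0 in this proof, and the failure of monotonicity, can both be checked on a finite range. A minimal Python sketch follows; the names `x` and `n0_for` and the sample value of M are ours.

```python
import math

def x(n: int) -> int:
    """x_n = n for odd n, n^2 for even n."""
    return n if n % 2 == 1 else n * n

def n0_for(M: float) -> int:
    """Least integer strictly greater than M (for M > 0), as in the proof."""
    return math.floor(M) + 1

M = 1234.5
n0 = n0_for(M)
exceeds = all(x(n) > M for n in range(n0, n0 + 1000))
# Not monotone: for even n >= 2 the next (odd) term is smaller.
drops = all(x(n + 1) < x(n) for n in range(2, 200, 2))
print(n0, exceeds, drops)
```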

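(ii) $y_n = 2^n - n$ diverges to $\infty$.
Proof $y_n = 2^n - n = 2^n(1 - n/2^n)$ and $(n/2^n)$ tends to 0 as $n$ tends to $\infty$, so $(1 - n/2^n)$ tends
to 1. Since $(2^n)$ diverges to $\infty$, it follows that $(y_n)$ diverges to $\infty$ (see part (iv) of the Proposition below).

(iii) $n - 2^n$ diverges to $-\infty$, because $2^n - n$ diverges to $\infty$.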

We finish this section with some further results on convergence/divergence:


Proposition

(i) If an ≤ bn for all n ∈ N, limn→∞ an = a, limn→∞ bn = b, then a ≤ b.

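(ii) $\lim_{n\to\infty} x_n = \infty \iff \exists n_0 \in \mathbb{N}$ s.t. $x_n > 0$ for all $n \ge n_0$, and $\lim_{n\to\infty} \frac{1}{x_n} = 0$.

(iii) If $x_n > 0$ for all $n \in \mathbb{N}$, and $\lim_{n\to\infty} x_n = 0$, then $\lim_{n\to\infty} \frac{1}{x_n} = \infty$.

(iv) If $\lim_{n\to\infty} x_n = \infty$ and $\lim_{n\to\infty} y_n = \beta > 0$, then $\lim_{n\to\infty} x_n y_n = \infty$.

(v) If $\lim_{n\to\infty} |x_n| = \infty$ and $\lim_{n\to\infty} y_n = \beta$, then $\lim_{n\to\infty} \frac{y_n}{x_n} = 0$.

(vi) If $x_n > 0$ for all $n \in \mathbb{N}$, $x_n \to 0$ and $y_n \to \beta > 0$, then $\lim_{n\to\infty} \frac{y_n}{x_n} = \infty$.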

(vii) If an ≤ bn for all n ∈ N and limn→∞ an = ∞, then limn→∞ bn = ∞.

(viii) If limn→∞ bn = b, and a < b < c, then ∃n0 ∈ N s.t. a < bn < c, for all n ≥ n0 .

Proof
(i) By contradiction, suppose $a > b$, then let $\varepsilon = \frac{a-b}{2} > 0$. Then there exist $n_1, n_2 \in \mathbb{N}$ such
that $|a_n - a| < \varepsilon$ for all $n \ge n_1$, and $|b_n - b| < \varepsilon$ for all $n \ge n_2$. Let $n_0 = \max\{n_1, n_2\}$.
Then for all $n \ge n_0$, $|a_n - a| < \varepsilon \Rightarrow a_n > a - \varepsilon = a - \frac{a-b}{2} = \frac{a+b}{2}$, and $|b_n - b| < \varepsilon \Rightarrow
b_n < b + \varepsilon = b + \frac{a-b}{2} = \frac{a+b}{2}$. Thus we obtain $a_n > \frac{a+b}{2} > b_n$, a contradiction. Hence, $a \le b$.
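(ii) ($\Rightarrow$) Suppose that $x_n \to \infty$. Let $\varepsilon > 0$, then for $M = \frac{1}{\varepsilon} > 0$ $\exists n_0 \in \mathbb{N}$ such that
$x_n > M = \frac{1}{\varepsilon} > 0$ for all $n \ge n_0$. Thus, $x_n > 0$ and $0 < \frac{1}{x_n} < \varepsilon$, for $n \ge n_0$. So, $|\frac{1}{x_n} - 0| < \varepsilon$,
for $n \ge n_0$.
($\Leftarrow$) Let $M > 0$, take $\varepsilon = \frac{1}{M} > 0$. Since $x_n > 0$ for $n \ge n_0$, and $\frac{1}{x_n} \to 0$, $\exists n_1 \in \mathbb{N}$, $n_1 \ge n_0$,
such that $|\frac{1}{x_n} - 0| < \varepsilon$ and $x_n > 0$, for $n \ge n_1$. Thus, for all $n \ge n_1$, $0 < \frac{1}{x_n} < \varepsilon$, and so
$x_n > \frac{1}{\varepsilon} = M$. Hence, $x_n \to \infty$.
(iii) Let $z_n = \frac{1}{x_n}$. Then $z_n > 0$ and $\frac{1}{z_n} = x_n \to 0$. By (ii), $\lim_{n\to\infty} \frac{1}{x_n} = \lim_{n\to\infty} z_n = \infty$.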
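(iv) Let $\varepsilon = \frac{\beta}{2} > 0$. Since $y_n \to \beta$, $\exists n_1 \in \mathbb{N}$ such that for all $n \ge n_1$, $|y_n - \beta| < \varepsilon \iff
-\varepsilon + \beta < y_n < \varepsilon + \beta$. Thus $y_n > \beta - \varepsilon = \beta - \frac{\beta}{2} = \frac{\beta}{2}$, for all $n \ge n_1$. Now for any given
$M > 0$, $\frac{2M}{\beta} > 0$ is a fixed positive number, so $\exists n_2 \in \mathbb{N}$ such that $x_n > \frac{2M}{\beta}$ for all $n \ge n_2$.
Let $n_0 = \max\{n_1, n_2\}$. Then for all $n \ge n_0$ we have $y_n > \frac{\beta}{2} > 0$ and $x_n > \frac{2M}{\beta} > 0$, so that
$x_n y_n > \frac{2M}{\beta} \cdot \frac{\beta}{2} = M$, for $n \ge n_0$. Hence, $x_n y_n \to \infty$ as $n \to \infty$.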
(v) Let $z_n = \frac{1}{|x_n|}$. By (ii), $z_n \to 0$. Hence, by the Algebra of Limits, $|\frac{y_n}{x_n}| = z_n |y_n| \to 0 \cdot |\beta| = 0$
as $n \to \infty$. Thus $\frac{y_n}{x_n} \to 0$ as $n \to \infty$.
(vi) Let $z_n = \frac{1}{x_n} > 0$. By (iii), $z_n \to +\infty$. Then by (iv) we have $\frac{y_n}{x_n} = z_n y_n \to \infty$ as $n \to \infty$.
(vii) Since $a_n \to +\infty$, for any given $M > 0$ $\exists n_0 \in \mathbb{N}$ such that $a_n > M$ for all $n \ge n_0$. Since
$b_n \ge a_n$, we have $b_n \ge a_n > M$, for all $n \ge n_0$. Thus $b_n \to \infty$ as $n \to \infty$.
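(viii) Let $\varepsilon = \min\{\frac{c-b}{2}, \frac{b-a}{2}\} > 0$. Since $b_n \to b$, $\exists n_0 \in \mathbb{N}$ such that for all $n \ge n_0$, $|b_n - b| <
\varepsilon \iff b - \varepsilon < b_n < b + \varepsilon$. Thus for all $n \ge n_0$, $b_n < b + \varepsilon \le b + \frac{c-b}{2} = \frac{c+b}{2} < c$, as $b < c$, and
$b_n > b - \varepsilon \ge b - \frac{b-a}{2} = \frac{a+b}{2} > a$, as $b > a$. Hence $a < b_n < c$, for all $n \ge n_0$.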

5.7 Monotone sequences


Recall that the symbol ∀ means “for all”.
Definition Let (xn ) be a real sequence. We say (xn ) is

increasing for n ≥ N if xn+1 ≥ xn ∀n ≥ N ,

decreasing for n ≥ N if xn+1 ≤ xn ∀n ≥ N .

If (xn ) is either of the above it is called monotone for n ≥ N .

For a monotone sequence there are just three possibilities.

(i) (xn ) converges.


(ii) (xn ) diverges to ∞.
(iii) (xn ) diverges to −∞.

To convince ourselves of this we need only consider the case where (xn ) is increasing because if
(xn ) is decreasing then (−xn ) is increasing.

5.8 The Monotone Sequence Theorem


Idea: Let (xn ) be a real sequence which is increasing for n ≥ N . So the xn can only move to
the right. Either xn gets arbitrarily large and positive, or there is some real M beyond which the
sequence doesn’t get. The least such M is the limit.

To make this more precise, we have the following:


Theorem (The Monotone Sequence Theorem) Let (xn ) be a real sequence.
(a) Let (xn ) be increasing for n ≥ N . If the set

A = {xn | n ∈ N, n ≥ N }

is bounded above (i.e. if there is some M > 0 such that xn ≤ M for all n ≥ N ), then (xn )
converges to the supremum of A.
If A is not bounded above then (xn ) diverges to ∞.
(b) Let (xn ) be decreasing for n ≥ N . If the set

A = {xn | n ∈ N, n ≥ N }

is bounded below (i.e. if there is some real number M such that xn ≥ M for all n ≥ N ), then (xn )
converges to the infimum of A.
If A is not bounded below then (xn ) diverges to −∞.
Proof (a) Let (xn ) be a sequence which is increasing for n ≥ N . Suppose first that A is bounded
above, and let s be the supremum of A. Then xn ≤ s for all integers n ≥ N . Also if t < s then
some xn is greater than t. So if ε > 0 then we have xn0 > s − ε for some integer n0 , and so
xn ≥ xn0 > s − ε for all integers n ≥ n0 . Thus if ε > 0 we have some n0 such that |xn − s| < ε
for all integers n ≥ n0 . This says precisely that xn tends to s as n → ∞.

Now suppose that A is not bounded above. Let M > 0. Then we have xn0 > M for some
n0 , and xn ≥ xn0 > M for all integers n ≥ n0 . This says that (xn ) diverges to ∞.

(b) is proved analogously.

5.9 Examples
A sequence is defined by
\[ x_1 = 1, \qquad x_{n+1} = \frac{x_n}{2 + x_n} \]
for $n \ge 1$. Does $(x_n)$ converge?

We use the Monotone Sequence Theorem (MST) to answer this (see handwritten notes).
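
Before working through the argument it can help to look at the first few terms. The following Python sketch (the helper `terms` is ours, and exact fractions are used to avoid rounding) suggests that the sequence is positive and decreasing, which are the hypotheses the Monotone Sequence Theorem needs for a decreasing sequence bounded below. It is numerical evidence only, not the proof.

```python
from fractions import Fraction

def terms(count: int) -> list:
    """First `count` terms of x_1 = 1, x_{n+1} = x_n / (2 + x_n), computed exactly."""
    xs = [Fraction(1)]
    while len(xs) < count:
        xs.append(xs[-1] / (2 + xs[-1]))
    return xs

xs = terms(12)
positive = all(t > 0 for t in xs)                     # bounded below by 0
decreasing = all(b <= a for a, b in zip(xs, xs[1:]))  # x_{n+1} <= x_n
print([str(t) for t in xs[:6]], positive, decreasing)
```

Once convergence to some limit L is known, the algebra of limits applied to the recursion gives L = L/(2 + L), so L(1 + L) = 0, and since all terms are positive this leaves L = 0.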

5.10 Repeated limits
The following example shows that these need to be treated with some care. Find
\[ \lim_{n\to\infty} \left( \lim_{m\to\infty} \frac{m^{1/n} - 1}{m^{1/n} + 1} \right) \]
and
\[ \lim_{m\to\infty} \left( \lim_{n\to\infty} \frac{m^{1/n} - 1}{m^{1/n} + 1} \right). \]
Note that they are not the same (check the handwritten lecture notes or the video of that lesson for
the answer)!
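
A quick numerical experiment makes the difference between the two orders visible. In the Python sketch below (the function `f` and the sample values are ours) we first fix n and let m grow, then fix m and let n grow. Floating-point evaluation is only a guide here, not a proof.

```python
def f(m: float, n: float) -> float:
    """The expression (m^(1/n) - 1) / (m^(1/n) + 1)."""
    r = m ** (1.0 / n)
    return (r - 1.0) / (r + 1.0)

# Fixed n, m growing: m^(1/n) grows without bound, so f approaches 1.
inner_m = [f(10.0**k, 3) for k in (3, 30, 300)]
# Fixed m, n growing: m^(1/n) approaches 1, so f approaches 0.
inner_n = [f(7.0, 10.0**k) for k in (1, 3, 6)]
print(inner_m, inner_n)
```

The output suggests that the inner limit is 1 for every fixed n in the first expression and 0 for every fixed m > 0 in the second, so the two repeated limits differ.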

6 Subsequences
6.1 Definitions and first results
Concerning notation, we will now sometimes write $(x_n)_n$ instead of simply $(x_n)$ for a sequence,
to make clear that $n$ is the index that counts the terms (this is also done in many textbooks).
A useful concept related to sequences is that of a subsequence. Loosely speaking, a subsequence
of $(x_n)$ is a sequence that contains only some of the elements of $(x_n)$, all in the same
order as they appear in the "parent sequence" $(x_n)$:
Definition Let $(x_n)_n$ be a sequence and let $(n_i)$ be a strictly increasing sequence of natural
numbers (that is, $n_1 < n_2 < n_3 < \ldots$). Then every sequence of the type

\[ (x_{n_i})_i \]

is called a subsequence of $(x_n)_n$.


Remark It is easy to see by induction that $n_i \ge i$ for all $i$.
Example Take the sequence $(y_n)_n$ defined by $y_n = \frac{1}{n}$. The sequence $(y_{3n})_n = (\frac{1}{3n})_n$ is a
subsequence. Note that $(y_{3n})_n$ is the sequence $\frac{1}{3}, \frac{1}{6}, \frac{1}{9}, \frac{1}{12}, \ldots$.

Note The elements of the subsequence must come from the original sequence, so the sequence
$1, 0, \frac{1}{3}, 0, \frac{1}{5}, \ldots$ is not a subsequence of $(y_n)$. Also, the order of elements must be preserved, so
the sequence $1, \frac{1}{3}, \frac{1}{2}, \frac{1}{5}, \ldots$ is not a subsequence of $(y_n)$.

Subsequences inherit many properties from the parent sequence.


Proposition (i) If (xn ) is a bounded sequence, then any subsequence of (xn ) is also bounded.
(ii) If (xn ) is increasing (resp., decreasing), then any subsequence of (xn ) is also increasing (resp.,
decreasing).

Proof (i) Let (xn ) be a bounded sequence, that means there are real numbers m and M and an
integer N such that for all n ≥ N we have m ≤ xn ≤ M . Then for any subsequence (xni ) also

m ≤ xni ≤ M for all ni ≥ N . We know that ni ≥ i. Hence i ≥ N implies ni ≥ N . Thus for all
i ≥ N we have m ≤ xni ≤ M .
(ii) is clear.
Proposition If (xn ) is a convergent sequence, then any subsequence (xni ) is also convergent
and converges to the same limit:
\[ \lim_{n\to\infty} x_n = \lim_{i\to\infty} x_{n_i}. \]

Proof Let limn→∞ xn = α. This means that for every ε > 0 there is N ∈ N such that for all
n ≥ N,
|xn − α| < ε.
We know that ni ≥ i. Hence i ≥ N implies ni ≥ N . Thus for all i ≥ N we have

|xni − α| < ε.

This implies that $\lim_{i\to\infty} x_{n_i} = \alpha$.


The previous result is very useful for proving divergence:
Example Let $(a_n)$ be defined by $a_n = (-1)^n$. Then the sequence $(a_n)$ has two constant
subsequences, $(a_{2k-1})_k$ and $(a_{2k})_k$:

−1, −1, −1, −1, −1, ...


1, 1, 1, 1, 1, ...

The first converges to −1, but the other converges to 1. Assume towards a contradiction that
$(a_n)$ converges to some limit α. By our proposition above, all subsequences of $(a_n)$ then
converge to the same limit α. However, these two subsequences converge to different limits,
a contradiction. Hence our original sequence $(a_n)$ must diverge.

6.2 Theorem of Bolzano-Weierstrass


While it is not true that every bounded sequence is convergent, the Theorem of Bolzano-
Weierstrass tells us that we can at least find a convergent subsequence:

Theorem (Bolzano-Weierstrass)
Every bounded sequence in R has a convergent subsequence.

Proof Let (xn ) be a bounded sequence. By the definition of a bounded sequence, there exist
two real numbers a1 < b1 such that, a1 ≤ xn ≤ b1 for all n ∈ N. We will define the required
subsequence inductively. Start with n1 = 1, that is xn1 = x1 .

Let $y = \frac{a_1 + b_1}{2}$ be the midpoint of the interval $[a_1, b_1]$. Then at least one of the intervals $[a_1, y]$ and
$[y, b_1]$ has to contain infinitely many elements of the sequence $(x_n)$.

Figure 1: A bounded sequence (xn )

If it is the first interval, we define $a_2 = a_1$ and $b_2 = y$. If it is the second interval, we define
$a_2 = y$ and $b_2 = b_1$. In both cases the following properties are satisfied:
\[ a_1 \le a_2 \le b_2 \le b_1 \quad \text{and} \quad b_2 - a_2 = \frac{b_1 - a_1}{2}, \]
and the interval $[a_2, b_2]$ contains infinitely many elements of the sequence $(x_n)$.

Figure 2: Method of ”compactness” to find an infinite subsequence. 1st iteration.

Now choose $n_2 \in \mathbb{N}$ such that $n_2 > n_1$ and $x_{n_2} \in [a_2, b_2]$. We iterate this construction. Having
chosen $a_1, \ldots, a_k$ and $b_1, \ldots, b_k$ we let $y_k = \frac{a_k + b_k}{2}$ be the midpoint of the interval $[a_k, b_k]$ and
we define either $[a_{k+1}, b_{k+1}] = [a_k, y_k]$ or $[a_{k+1}, b_{k+1}] = [y_k, b_k]$, chosen such that $[a_{k+1}, b_{k+1}]$ contains
infinitely many elements of the sequence $(x_n)$. Then we choose $n_{k+1} \in \mathbb{N}$ such that $n_{k+1} > n_k$
and $x_{n_{k+1}} \in [a_{k+1}, b_{k+1}]$.

Figure 3: Method of ”compactness” to find an infinite subsequence. kth iteration.

Now that we have defined the subsequence (xnk ) we have to show that it is convergent.

We show convergence using the Sandwich Theorem. The property xnk ∈ [ak , bk ], which holds
for all k ∈ N, can be written as ak ≤ xnk ≤ bk . We need to show that (ak ) and (bk ) are both
convergent and have the same limit. Note that by construction (ak ) is increasing and (bk ) is
decreasing. They are also bounded, because

a1 ≤ a2 ≤ ... ≤ ak ≤ bk ≤ ... ≤ b2 ≤ b1 .

Thus a1 is a lower bound for both sequences and b1 is an upper bound. By the Monotone
Sequence Theorem it follows that both sequences are convergent.

Let $a = \lim_{k\to\infty} a_k$ and $b = \lim_{k\to\infty} b_k$ be their limits. Now consider the lengths of the
intervals $[a_k, b_k]$. The following identity can be shown by induction:
\[ b_k - a_k = \frac{1}{2}(b_{k-1} - a_{k-1}) = \frac{1}{2^2}(b_{k-2} - a_{k-2}) = \ldots = \frac{1}{2^{k-1}}(b_1 - a_1). \]

We then take the limit as $k \to \infty$:

\[ \lim_{k\to\infty} (b_k - a_k) = \lim_{k\to\infty} \frac{1}{2^{k-1}}(b_1 - a_1) = 0, \]

and
\[ \lim_{k\to\infty} (b_k - a_k) = \lim_{k\to\infty} b_k - \lim_{k\to\infty} a_k = b - a. \]

Therefore, b − a = 0 and so b = a. The subsequence (xnk ) is squeezed between two sequences


converging to the same limit a and so, by the Sandwich Theorem, this implies that (xnk ) is
convergent and converges to a as well.
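
The bisection construction in this proof can be imitated on a computer, with one important caveat: a program cannot test whether an interval contains infinitely many terms. In the Python sketch below we therefore work with a finite sample of indices and, as a stand-in, keep the half that holds more of the remaining sampled indices. This substitution, the function name and the choice of $x_n = \sin n$ are our own assumptions, so the sketch illustrates the mechanics of the proof rather than the proof itself.

```python
import math

def bisection_subsequence(x, sample_size: int, steps: int):
    """Finite-sample imitation of the bisection construction.

    `x` maps an index n >= 1 to x_n. "Contains infinitely many terms" cannot be
    tested by a program, so we keep the half-interval holding more of the
    remaining sampled indices (an assumption of this sketch).
    """
    candidates = list(range(1, sample_size + 1))
    a = min(x(n) for n in candidates)             # a_1: lower bound of the sample
    b = max(x(n) for n in candidates)             # b_1: upper bound of the sample
    chosen = [1]                                  # n_1 = 1
    candidates = candidates[1:]                   # only indices larger than n_1
    intervals = [(a, b)]
    for _ in range(steps):
        y = (a + b) / 2.0                         # midpoint of [a_k, b_k]
        left = [n for n in candidates if x(n) <= y]
        right = [n for n in candidates if x(n) > y]
        if len(left) >= len(right):
            b, half = y, left                     # keep [a_k, y_k]
        else:
            a, half = y, right                    # keep [y_k, b_k]
        chosen.append(half[0])                    # smallest available index n_{k+1} > n_k
        candidates = half[1:]
        intervals.append((a, b))
    return chosen, intervals

indices, intervals = bisection_subsequence(math.sin, sample_size=5000, steps=8)
print(indices)
print([round(math.sin(n), 4) for n in indices])
```

Each selected term lies in the corresponding interval, the indices increase strictly, and the interval lengths halve at every step, which is exactly the structure the Sandwich Theorem argument uses.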

6.3 Examples

(i) Does the sequence $(z_n) = (\cos(2^n))$ have a convergent subsequence?


Since $|\cos(x)| \le 1$ for all $x \in \mathbb{R}$, the sequence $(\cos(2^n))$ is bounded. It follows that
$(z_n)$ has a convergent subsequence by the Bolzano-Weierstrass theorem.

(ii) Does the sequence $(c_n) = (4\cos(n) - \sin^2(n) - 1)$ have a convergent subsequence?
Use the triangle inequality:

\[ |4\cos(n) - \sin^2(n) - 1| \le |4\cos(n)| + |\sin^2(n)| + 1 \le 4 + 1 + 1 = 6. \]

Since (cn ) is bounded, it follows that (cn ) has a convergent subsequence by the Bolzano-
Weierstrass theorem.

6.4 Cauchy Sequences

A limitation when deciding if a sequence is convergent (or not) is that you need to have an idea
of its limit before you can test if it is.

Example Consider the recursively defined sequence


\[ x_1 = 0, \quad x_2 = 1, \quad x_{n+1} = \frac{x_n + x_{n-1}}{2}, \quad n \ge 2. \]
The first elements in the sequence are $0, 1, \frac{1}{2}, \frac{3}{4}, \frac{5}{8}, \frac{11}{16}, \frac{21}{32}, \ldots$

The sequence is not monotone, because $0 \le 1 \ge \frac{1}{2} \le \frac{3}{4} \ge \frac{5}{8} \le \ldots$ Nevertheless, each following
element of the sequence is the midpoint of the interval spanned by the two previous ones. Therefore,
the sequence is bounded: $0 \le x_n \le 1$ for all $n \in \mathbb{N}$.

We expect the sequence to converge, but without knowing the limit it seems difficult to prove
it. The notion of a Cauchy sequence will provide a tool to test convergence without having to
specify the limit in advance.
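
To get a feeling for this sequence, the terms and the gaps between consecutive terms can be computed exactly. In the Python sketch below the helper name is ours. The gaps halve at every step, and the printed value gives a numerical guess at where the terms settle; neither observation is a proof of convergence.

```python
from fractions import Fraction

def averaged_terms(count: int) -> list:
    """First `count` terms of x_1 = 0, x_2 = 1, x_{n+1} = (x_n + x_{n-1}) / 2."""
    xs = [Fraction(0), Fraction(1)]
    while len(xs) < count:
        xs.append((xs[-1] + xs[-2]) / 2)
    return xs[:count]

xs = averaged_terms(30)
gaps = [abs(b - a) for a, b in zip(xs, xs[1:])]   # |x_{n+1} - x_n|
print([str(t) for t in xs[:7]])
print([str(g) for g in gaps[:6]])
print(float(xs[-1]))
```

Small gaps between consecutive terms are not enough on their own to conclude that a sequence is Cauchy. What helps here is that the gaps decay geometrically, so that sums of many consecutive gaps stay small as well.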

Definition A real sequence (xn ) is called a Cauchy sequence, if for every ε > 0 there exists N ∈ N
such that for all m, n ≥ N we have
|xn − xm | < ε.
In other words, a real sequence is a Cauchy sequence, if the terms of the sequence eventually all
become arbitrarily close to one another. Note that N depends on ε here.

For comparison, recall that a real sequence (xn ) converges to α, if for every ε > 0 there exists N ∈ N
such that for all n ≥ N we have
|xn − α| < ε.
Example Show that the sequence $(x_n) = (\frac{1}{n})$ is a Cauchy sequence.

We want to show that ∀ε > 0 there exists an N ∈ N such that ∀m, n ≥ N , we have |xn −xm | < ε.

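Let $\varepsilon > 0$ be given and choose $N$ such that $N > \frac{2}{\varepsilon}$. If $m, n \ge N$ then $n \ge N > \frac{2}{\varepsilon}$ implies
that $\frac{1}{n} \le \frac{1}{N} < \frac{\varepsilon}{2}$ and similarly, $m \ge N > \frac{2}{\varepsilon}$ implies that $\frac{1}{m} \le \frac{1}{N} < \frac{\varepsilon}{2}$. Therefore,
applying the triangle inequality we have

\[ |x_n - x_m| = \left| \frac{1}{n} - \frac{1}{m} \right| \le \left| \frac{1}{n} \right| + \left| \frac{1}{m} \right| = \frac{1}{n} + \frac{1}{m} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

Therefore $(x_n) = (\frac{1}{n})$ is a Cauchy sequence.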
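
The choice N > 2/ε can be tried out on a finite window of indices. The Python sketch below (names and the window size are ours) computes such an N and the largest distance between two terms with indices in the window.

```python
import math

def cauchy_N(eps: float) -> int:
    """An integer N with N > 2/eps, the choice made in the example."""
    return math.floor(2.0 / eps) + 1

eps = 0.01
N = cauchy_N(eps)
# Largest |x_n - x_m| over a finite window of indices m, n >= N.
window = range(N, N + 300)
worst = max(abs(1.0 / n - 1.0 / m) for n in window for m in window)
print(N, worst)
```

The observed maximum is in fact well below ε, which reflects that the estimate 1/n + 1/m in the proof is generous: the two terms are subtracted, not added.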


Example Show that the sequence $(y_n) = ((-1)^n)$ is not a Cauchy sequence.

Notice that if $n$ is even then $y_n = 1$ and $y_{n+1} = -1$. Choose $\varepsilon = 2$. Given any $N \in \mathbb{N}$, select an
even $n \ge N$ and let $m = n + 1$. Then

\[ |y_n - y_m| = |1 - (-1)| = 2 \ge \varepsilon. \]

So no $N$ satisfies the definition for this $\varepsilon$. Therefore, $(y_n) = ((-1)^n)$ is not a Cauchy sequence.

6.5 Properties of a Cauchy sequence


Let us explore some useful properties of a (real) Cauchy sequence:
Proposition Every real Cauchy sequence is bounded.
Proof Suppose (xn ) is a Cauchy sequence. Then for all ε > 0 there exists N ∈ N such that for
all n, k ≥ N , we have
|xn − xk | < ε.
So choose ε = 1. Then there exists N ∈ N such that for all n, k ≥ N , we have

|xn − xk | < 1.

In particular, for all n ≥ N we have

|xn − xN | < 1.

It follows that

(1) |xn | = |xn − xN + xN | ≤ |xn − xN | + |xN | < 1 + |xN |

for all n ≥ N , employing the triangle inequality.


For the first finitely many terms $x_1, \ldots, x_{N-1}$ we do not know this, but we can enlarge the bound as
follows: each of them trivially satisfies $|x_n| \le |x_1| + |x_2| + \ldots + |x_N|$ and hence
\[ |x_n| \le 1 + |x_1| + |x_2| + \ldots + |x_N| \]
for $n = 1, \ldots, N-1$. Since the right-hand side of (1) is at most this same quantity, we see that

\[ |x_n| \le 1 + |x_1| + |x_2| + \ldots + |x_N| \]

for all $n$. Therefore, the sequence is bounded.


Proposition Any convergent sequence is a Cauchy sequence.
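Proof Let $x_n \to \alpha$ as $n \to \infty$. Then for all $\varepsilon > 0$ there is an integer $N$ such that $|x_n - \alpha| < \frac{\varepsilon}{2}$ for
all $n \ge N$. For $m \ge N$ we then also have $|x_m - \alpha| < \frac{\varepsilon}{2}$, and so for $m, n \ge N$ we have

\[ |x_n - x_m| = |x_n - x_m - \alpha + \alpha| = |x_n - \alpha - (x_m - \alpha)| \le |x_n - \alpha| + |x_m - \alpha| \]

by the triangle inequality. Therefore we have

\[ |x_n - x_m| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
and this means $(x_n)$ is a Cauchy sequence.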

6.6 Cauchy Convergence Criterion

Theorem (Cauchy Convergence Criterion)


Let (xn ) be a sequence of real numbers. Then (xn ) is convergent if and only if (xn ) is a Cauchy
sequence.

Proof “⇒:” If (xn ) is convergent, then (xn ) is a Cauchy sequence by our Proposition above.
“⇐:”
Let (xn ) be a Cauchy sequence, then by our other Proposition above it is bounded.

Since $(x_n)$ is bounded, by the Bolzano-Weierstrass Theorem there exists a convergent subsequence,
say $(x_{n_k})_k$. Let
\[ \lim_{k\to\infty} x_{n_k} = \alpha. \]

We now show that (xn )n also converges to α:

Let $\varepsilon > 0$. Then there is $N_1$ such that for all $n, m \ge N_1$:

\[ |x_n - x_m| < \frac{\varepsilon}{2}, \]
because $(x_n)$ is a Cauchy sequence.
And there is $N_2$ such that for all $k \ge N_2$:
\[ |x_{n_k} - \alpha| < \frac{\varepsilon}{2}. \]
Now set $N = \max(N_1, N_2)$ and fix any $k \ge N$. Since $n_k \ge k$, we have $n_k \ge N \ge N_1$, and so for all $n \ge N$
\[ |x_n - \alpha| = |x_n - x_{n_k} + x_{n_k} - \alpha| \le |x_n - x_{n_k}| + |x_{n_k} - \alpha| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Thus $|x_n - \alpha| < \varepsilon$ for all $n \ge N$. Therefore, the sequence $(x_n)$ converges to $\alpha$.

Note that we have proved even more: We have shown that every convergent subsequence of a
Cauchy sequence converges to the same limit, which is the limit of the Cauchy sequence.

6.7 Example
Use the Cauchy Convergence Criterion (above Theorem) to show that $(z_n)$ with
\[ z_n = \frac{\sin 1}{2^1} + \frac{\sin 2}{2^2} + \ldots + \frac{\sin n}{2^n} \]
converges.

Solution We will show that the sequence is a Cauchy sequence, because then it is convergent.

Without loss of generality (w.l.o.g.) let $n > m$. Then we have

\[ z_m = \frac{\sin 1}{2} + \frac{\sin 2}{2^2} + \ldots + \frac{\sin m}{2^m} \]
and
\[ z_n = \frac{\sin 1}{2} + \frac{\sin 2}{2^2} + \ldots + \frac{\sin m}{2^m} + \frac{\sin(m+1)}{2^{m+1}} + \ldots + \frac{\sin n}{2^n}. \]

Thus
\[ |z_n - z_m| = \left| \frac{\sin(m+1)}{2^{m+1}} + \ldots + \frac{\sin n}{2^n} \right| \le \frac{|\sin(m+1)|}{2^{m+1}} + \ldots + \frac{|\sin n|}{2^n} \le \frac{1}{2^{m+1}} + \ldots + \frac{1}{2^n} = \frac{1}{2^m} - \frac{1}{2^n} < \frac{1}{2^m} \]
by the triangle inequality (and geometric progression). Given $\varepsilon > 0$, we now choose $N$ such that
$\frac{1}{2^N} < \varepsilon$. Then for all $n > m \ge N$,

\[ |z_n - z_m| < \frac{1}{2^m} \le \frac{1}{2^N} < \varepsilon. \]

Therefore, the sequence (zn ) is Cauchy and by the Cauchy convergence criterion the sequence is
also convergent.
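
The key estimate of the solution, namely that $|z_n - z_m|$ stays below $1/2^m$ for n > m, can be spot-checked numerically. A minimal Python sketch with our own helper `z` follows; it covers only finitely many pairs (m, n) and so illustrates the bound without proving it.

```python
import math

def z(n: int) -> float:
    """Partial sum z_n = sin(1)/2 + sin(2)/2^2 + ... + sin(n)/2^n."""
    return sum(math.sin(k) / 2.0**k for k in range(1, n + 1))

# Compare |z_n - z_m| with the bound 1/2^m from the solution, for n > m.
pairs = [(m, n) for m in range(1, 25) for n in range(m + 1, 40)]
within_bound = all(abs(z(n) - z(m)) < 1.0 / 2.0**m for m, n in pairs)
print(within_bound, z(40))
```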

6.8 Remarks
It is not enough to have each term ”close” to the next one (|xn − xn−1 | < ε), in order to show
a sequence is a Cauchy sequence. For example, the divergent sequence of partial sums of the
harmonic series (see later in the course) does satisfy this property, but not the condition for a
Cauchy sequence.
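
This remark can be illustrated with the partial sums of the harmonic series. In the Python sketch below (the helper `H` is ours, with exact fractions) consecutive partial sums differ by 1/n, which becomes small, while $H_{2n} - H_n$ is a sum of n terms each at least 1/(2n) and so never drops below 1/2.

```python
from fractions import Fraction

def H(n: int) -> Fraction:
    """Partial sum H_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))

# Consecutive terms get close: H_n - H_{n-1} = 1/n.
step = H(1000) - H(999)
# But terms far apart do not: H_{2n} - H_n >= n * 1/(2n) = 1/2 for every n.
far_gaps = [H(2 * n) - H(n) for n in (1, 10, 100, 500)]
print(step, [float(g) for g in far_gaps])
```

So with ε = 1/2 no N can satisfy the Cauchy condition, even though each term is eventually as close as we like to the next one.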

The fact that in R Cauchy sequences are the same as convergent sequences is also called the
Cauchy criterion for convergence.

The use of the Completeness Axiom to prove the Cauchy criterion for convergence is crucial. For
example, let (xn ) be a sequence of rational numbers converging to an irrational. Take

\[ (1, 1.4, 1.41, 1.414, \ldots) \to \sqrt{2}. \]

Then since (xn ) is a convergent sequence in R it is a Cauchy sequence in R and hence also a
Cauchy sequence in Q. But it has no limit in Q.

In fact one can formulate the Completeness axiom in terms of Cauchy sequences. Here are some
equivalent formulations of the axiom:

1. Every subset of R which is bounded above has a least upper bound.

2. In R every bounded monotone sequence is convergent.

3. In R every Cauchy sequence is convergent.

Note: the formulation (3) is a useful way of generalising the idea of completeness to structures
which are more general than ordered fields.
