These notes cover the whole content of this module, but not the Python component, for which
there will be separate notes and exercises. The notes may be amended or augmented during the
academic year.
N is the collection of natural numbers, i.e. positive integers. The members of N are
1, 2, 3, . . . (not 0).
Q is the collection of fractions, also called rational numbers: these are of the form p/q, where
p and q are integers with q ≠ 0.
C is the collection of all complex numbers (see your Linear Mathematics module for the
definition).
Note that 2 ∈ N, while 1/2 ∈ Q, but π ∉ Q (true though hard to prove). There are a number of
ways to define further sets.
The notation
{a, b, c, d}
just means the set with elements a, b, c, d: a finite set (that is, a set with finitely many
elements) can be defined simply by listing its elements.
The notation
{x ∈ A | P(x)} or {x ∈ A : P(x)}
means the collection of all elements x of the set A which satisfy the condition P (x).
For example,
{x ∈ Z | x^2 < 2}
can be written more simply as {−1, 0, 1}.
If we write {q(x) | x ∈ A} then we mean the collection of all q(x) where x varies over all
members of the set A. For example, there is a simpler way to write
1.2 Subsets
Definition We say that A is a subset of B, written A ⊂ B, if every element x of A is also an
element of B. This may also be expressed by saying that A is contained in B.
Note that:
the statement A ⊂ B only makes sense when A and B are both sets.
Of the statements
precisely two are true; the others make no sense, as the notation is not defined. You will find out
which of them are true in the lecture in week one.
If A ⊂ B and B ⊂ A then A and B have the same elements and so are the same set, i.e.
A = B.
1.3 Operations on sets: union and intersection
Given two sets A and B, we can form their union and intersection as follows.
Definition (i) The union A ∪ B is defined as the set of all x which belong to at least one of A
and B:
A ∪ B = {x | x ∈ A or x ∈ B}.
Thus we take all elements of A and all elements of B and put them into a combined set A ∪ B.
Note that this includes elements x which belong to both.
(ii) The intersection A ∩ B is the set of all x which belong to both of A and B:
A ∩ B = {x | x ∈ A and x ∈ B}.
A ∩ B = B ∩ A, A ∪ B = B ∪ A,
A ∩ B ⊂ A ⊂ A ∪ B.
x < 0 ⇒ x ≤ 0. (1)
This says that if x < 0 then x ≤ 0. Is (1) true? Two possible but mutually contradictory
arguments run like this; one of them turns out to be false:
(i) If x < 0 then x can’t be greater than 0. So certainly we have that x ≤ 0 and so (1) is
true.
(ii) If we say x ≤ 0 then this allows for the possibility that x = 0. This can’t be true if
x < 0, therefore (1) must be false.
Indeed, statement (1) is true, but to determine why this is correct we must be systematic about
what we mean by statements like A ⇒ B, and when they are true.
If we write A ⇒ B then this means “A implies B”, which is also expressed as “if A then
B”. If A holds, then B must hold also. For example, with x a real number,
x = 1 ⇒ x^2 = x. (2)
This is sometimes also expressed as “A only if B” i.e. A can only hold if B holds: so this means
the same as A ⇒ B.
If we assign a true or false value to an abstract mathematical statement to give it some meaning,
then we can formulate certain rules that are usually written as “truth tables”, explaining how a
statement becomes true or false, depending on the different statements it is made up of. Here
is the truth table for “A ⇒ B” (T=true, F=false):
A B A⇒B
T T T
T F F
F T T
F F T
Note that an implication thus only is false if A is true and B is false. Note that we can therefore
conclude that any implication that follows from a false statement A will always be true.
which is false since x = 0 also satisfies x^2 = x.
A further notational convention is the definition of the terms necessary and sufficient condition.
If A ⇒ B then we say:
Thus statements (3) and (4) are equivalent: each implies the other, and so either both hold or
neither holds. This observation can be used when proving statements as sometimes it may be
easier to prove the contrapositive of a statement instead of the actual statement (see later for
examples).
Note that the contrapositive is not the same as the converse.
1.6 Examples
Each of these statements involves positive integers n. What are their contrapositives and
converses? Which are true?
Answers:
(a) The original statement is “4 divides n ⇒ 4 divides n^2”. Its contrapositive is “4 does not
divide n^2 ⇒ 4 does not divide n”. The original statement and its contrapositive are true: if 4
divides n then there is a positive integer k such that n = 4k. This implies that n^2 = 4^2 k^2, thus
4 divides n^2. Its converse is “4 divides n^2 ⇒ 4 divides n”. This is false: take n = 2; then n^2 = 4,
but 4 does not divide 2.
(b) The original statement “n + 1 is not prime ⇒ n is prime” has the contrapositive “n is not
prime ⇒ n + 1 is prime”. Both are false: take n = 8; then n + 1 = 9 is not prime, and n = 8 is
not prime either.
Its converse “n is prime ⇒ n + 1 is not prime” holds for every prime n > 2: such an n is odd,
so n + 1 is even and greater than 2, hence not prime. It fails for n = 2, however, since 2 and 3
are both prime.
(c): Write the number n as
n = 10^m a_m + 10^(m−1) a_(m−1) + … + 10 a_1 + a_0,
where each digit a_j ∈ {0, 1, …, 9}. Thus n written in base 10 has the form n = a_m … a_0. Now
let N = a_m + a_(m−1) + … + a_1 + a_0 be the sum of the digits of n.
The original statement “N is divisible by 3 ⇒ n is divisible by 9” is false (and thus also its
contrapositive): let n = 3; then N = 3 but n is not divisible by 9.
The converse statement “n is divisible by 9 ⇒ N is divisible by 3” is true: Look at
n − N = (10^m − 1) a_m + … + (10 − 1) a_1.
This is divisible by 9, since each term 10^j − 1 is divisible by 9. If n is divisible by 9, what does
this tell us about N? If n is divisible by 9 then n = 9s for some positive integer s and so we get
9s − N = (10^m − 1) a_m + … + (10 − 1) a_1 and thus N = −(10^m − 1) a_m − … − (10 − 1) a_1 + 9s
is divisible by 9, and so in particular also divisible by 3.
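The argument in (c) can be checked numerically. The following is a minimal sketch, not part of the module's Python notes; the helper name digit_sum is made up for this illustration.

```python
def digit_sum(n: int) -> int:
    """Return the sum N of the base-10 digits of a positive integer n."""
    return sum(int(d) for d in str(n))

# n - N is always a multiple of 9, so 9 | n forces 9 | N (and hence 3 | N).
for n in range(1, 10_000):
    N = digit_sum(n)
    assert (n - N) % 9 == 0
    if n % 9 == 0:
        assert N % 9 == 0 and N % 3 == 0

# The counterexample from the text: for n = 3 we get N = 3, which is
# divisible by 3, while n itself is not divisible by 9.
counterexample = (digit_sum(3) % 3 == 0, 3 % 9 == 0)
print(counterexample)
```

A finite check like this proves nothing for all n, but it is a quick way to catch a wrongly stated converse or contrapositive.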
1.8 Statements involving and/or
When we write “A or B” we mean that at least one (possibly both) of A and B holds. “A and
B” means that both hold.
The negation of “A or B” is “¬A and ¬B”, that means “neither A nor B”; the negation of “A
and B” is “¬A or ¬B”, that means “not both of A and B”.
For example, for real x,
x > 1 or x < −1 ⇒ x^2 > 1.
Note that the contrapositive of this is
x^2 ≤ 1 ⇒ x ≤ 1 and x ≥ −1,
which introduces an “and”.
Here are the truth tables for “and” and “or” (T=true, F=false):
A B A and B
T T T
T F F
F T F
F F F
A B A or B
T T T
T F T
F T T
F F F
A ¬A
T F
F T
Optional: Show that the truth tables of the two statements “A ⇒ B” and “¬A or B” are
the same. Thus both statements are “logically equivalent” and can be used interchangeably!
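Since A and B can each only be true or false, claims like this can be checked by running through all four cases. The sketch below does so for the optional exercise, for the negation rules of this section, and for the equivalence of a statement with its contrapositive. The dictionary IMPLIES simply copies the truth table for A ⇒ B given earlier; its name is chosen for this illustration.

```python
from itertools import product

# Truth table of "A => B", copied from the notes: false only for (T, F).
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}

for a, b in product([True, False], repeat=2):
    # "A => B" has the same truth table as "not A or B".
    assert IMPLIES[(a, b)] == ((not a) or b)
    # Negation rules: not (A or B) = not A and not B, and vice versa.
    assert (not (a or b)) == ((not a) and (not b))
    assert (not (a and b)) == ((not a) or (not b))
    # A statement and its contrapositive agree in every case.
    assert IMPLIES[(a, b)] == IMPLIES[(not b, not a)]

print("all four cases agree")
```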
always true? Let’s look at its contrapositive, which is
¬A and ¬B ⇒ ¬A,
which is plainly always true, whatever A and B are. So (5) is always true.
If we take A to be “x < 0” and B to be “x = 0” then “A or B” becomes “x ≤ 0” and so
our statement (1) for real numbers x,
x < 0 ⇒ x ≤ 0,
is TRUE.
1.10 Quantifiers
There are two quantifier symbols used widely in mathematics:
the symbol ∀ is usually read as “for all” (sometimes “for each”, “for every”, “to each”);
For example,
∀ x ∈ R ∃n ∈ N such that n > x. (6)
“For every real number x there exists a natural number (i.e. positive integer) n with n > x.”
“To each real number x corresponds a natural number n with n > x.”
This is a true statement. Note that it should not be interpreted as saying that the n correspond-
ing to a given x is unique.
These quantifiers are useful but need to be treated with some care. The statement (6) can
be written more economically as
∀ x ∈ R ∃n ∈ N (n > x).
∃n ∈ N ∀ x ∈ R (n > x).
This says that there exists a positive integer n which is greater than every real x, and is obviously
false. It is thus important to not change the order as this changes the statement.
Negating a statement involving quantifiers also requires some thought. If we wish to negate
(6) then we assert that there is some real x for which there does not exist a natural number n
with n > x i.e.
∃ x ∈ R such that ∀n ∈ N n ≤ x,
which of course is again false. The general rule here is that if we negate a statement that contains
∃x ∈ ... then this negates to ∀x ∈ .... If we negate a statement that contains ∀x ∈ ... then this
negates to ∃x ∈ ....
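On finite sets, ∀ and ∃ correspond to Python's all and any, which makes the effect of swapping or negating quantifiers easy to try out. The sketch below is only an illustration under stated assumptions: the finite sets X and N and the predicate p ("x + n is even") are made up, because statement (6) itself involves infinite sets and cannot be tested exhaustively.

```python
X = range(6)     # hypothetical finite stand-in for the set x ranges over
N = range(1, 7)  # hypothetical finite stand-in for the set n ranges over

def p(x: int, n: int) -> bool:
    """Hypothetical predicate used for illustration: x + n is even."""
    return (x + n) % 2 == 0

# "for all x there exists n with p(x, n)": n may depend on x.
forall_exists = all(any(p(x, n) for n in N) for x in X)
# "there exists n such that for all x p(x, n)": one n must work for every x.
exists_forall = any(all(p(x, n) for x in X) for n in N)
print(forall_exists, exists_forall)

# Negating "for all x there exists n" gives "there exists x such that for all n not ...".
negated = any(all(not p(x, n) for n in N) for x in X)
assert negated == (not forall_exists)
```

Here the first statement is true and the second is false, so the order of the quantifiers matters even in this small example.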
2 Proof and Mathematical Induction
2.1 Why proof ?
The Dictionary of Mathematics (E J Borowski and J M Borwein, publ. HarperCollins 1989)
defines “proof” as:
“a sequence of statements, each of which is either validly derived from those preceding it or
is an axiom or assumption, and the final member of which, the conclusion, is the statement of
which the truth is thereby established.”
It goes on to say:
“an indirect proof assumes the falsehood of the desired conclusion and shows that to be
impossible”.
because of the key role played by mathematics in scientific, engineering and everyday life;
because even in mathematics surprising things may happen (for example, if you add up real
numbers does the answer you get depend on the order in which you add them up?);
In the following, we do not set out to present a comprehensive treatment of proof, more a
recipe for how to prove mathematical assertions, but we will look at instances of some of the
main types of proof or disproof.
However,
2^11 − 1 = 2047 = 23 · 89
and this solitary example suffices to show that (7) is false.
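A search for such a counterexample is easy to automate. The sketch below assumes that (7) is the claim "2^p − 1 is prime whenever p is prime"; that reading is an assumption suggested by the counterexample. The helper is_prime is a plain trial-division test written for this illustration.

```python
def is_prime(m: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

# Assumed reading of (7): 2**p - 1 is prime for every prime p.
for p in [2, 3, 5, 7, 11, 13]:
    print(p, 2**p - 1, is_prime(2**p - 1))

assert 2**11 - 1 == 2047 == 23 * 89
```

The first four primes give primes, which is what makes the claim tempting; p = 11 is the first failure.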
2.3 Direct proof
Consider the following proposition, in which n ≥ 0 is an integer.
Here the degree means the highest power of x which occurs in Q_n(x).
The most satisfactory way to prove an assertion like this is, if possible, to exhibit Q_n directly and
show that it satisfies the required conditions. For this problem, a solution comes in the form of
the Lagrange interpolation formula. We define Q_n(x) to be
y_0 · [(x − x_1) … (x − x_n)] / [(x_0 − x_1) … (x_0 − x_n)]
+ y_1 · [(x − x_0)(x − x_2) … (x − x_n)] / [(x_1 − x_0)(x_1 − x_2) … (x_1 − x_n)]
+ … + y_n · [(x − x_0)(x − x_1) … (x − x_(n−1))] / [(x_n − x_0)(x_n − x_1) … (x_n − x_(n−1))].
Here in the term involving y_j we multiply together all the terms (x − x_i) apart from x − x_j, and
the terms in the denominator are there to make this term equal to y_j at x_j. Since each term in
our sum has degree at most n and the term involving y_j is 0 at x_i for i ≠ j, we see that Q_n(x)
has the required properties.
Thus we have proved the existence of Q_n constructively. We will see another proof of the
existence of Q_n later.
Note that the details of this proof are not important for this module: rather, this is included
as an illustration of an approach to proof.
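Because the proof is constructive, the formula can be turned directly into a short program. The following is a minimal sketch, assuming the nodes are distinct rationals or integers; the function name lagrange and the sample data are made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def lagrange(xs, ys, x):
    """Evaluate Q_n(x) from the Lagrange formula for distinct nodes xs."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = Fraction(yj)
        for i, xi in enumerate(xs):
            if i != j:
                # factor (x - x_i) / (x_j - x_i), omitted when i = j
                term *= Fraction(x - xi, xj - xi)
        total += term
    return total

# Hypothetical sample data: three points lying on y = x^2 + 1.
xs = [0, 1, 3]
ys = [1, 2, 10]
values_at_nodes = [lagrange(xs, ys, x) for x in xs]
print(values_at_nodes, lagrange(xs, ys, 2))
```

Evaluating at the nodes returns the prescribed values, which is exactly the property Q_n(x_j) = y_j used in the proof.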
Theorem Let A, B, C be three points in a plane, and assume that A, B, C are not collinear
(i.e. do not lie on a straight line). Then there exists a circle passing through A, B, C.
Sketch of the idea (see lectures). Join A to B and B to C by straight line segments.
Let D and E be the midpoints of AB and BC respectively. Construct straight lines perpendic-
ular to AB and BC at D and E respectively. Let the point where these lines intersect be O.
Think about the triangles AOB and BOC.
Again, the details of this proof are not important for this module: rather, this is included as
an illustration of an approach to proof.
Note that this proof is not constructive since we cannot determine A explicitly.
As before, the details of this proof are not important for this module: rather, this is included as
an illustration of an approach to proof.
This gives
n^2 = 9m^2 + 6mr + r^2.
Here 9m^2 and 6mr are obviously divisible by 3. But r^2 is either 1 or 4, so is not divisible by 3,
and therefore n^2 is not divisible by 3.
Proof Suppose that there are only finitely many prime numbers, say n of them. Write them
down in increasing order as
2 = p_1 < 3 = p_2 < … < p_n.
Now let
M = p_1 p_2 … p_n + 1.
Thus we multiply together p_1, …, p_n and add 1. Now M > 1, so M has a prime factor, which
must be one of p_1, …, p_n, say p_j. This is a contradiction, because when we divide M by p_j we get
remainder 1 ≠ 0. Our assumption that there are finitely many primes has led to a contradiction,
so there must be infinitely many.
Note: (1) We used the fact that every integer M ≥ 2 has a prime factor. Why? If M is
itself prime, then we’ve finished. If not, M has a factor M_1 with 1 < M_1 < M, and any factor
of M_1 is a factor of M. Now either M_1 is prime, or M_1 has a factor M_2 with 1 < M_2 < M_1. Repeating
this finitely many times we get a prime factor of M.
(2) This proof does not show that if you multiply together the first n primes and add 1, you get
a prime. In fact,
2 · 3 · 5 · 7 · 11 · 13 + 1 = 30031 = 59 · 509
(3) Finally, the details of this proof are again not important for this module: rather, this is
included as an illustration of an approach to proof.
(4) There is a short video clip of this proof on YouTube; the link is at the bottom of the Moodle page.
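Note (2) can be confirmed by a short computation. The sketch below multiplies the first six primes, adds 1, and finds the smallest prime factor of the result by trial division; the helper smallest_prime_factor is written for this illustration.

```python
from math import prod

def smallest_prime_factor(m: int) -> int:
    """Smallest factor d > 1 of m >= 2; this d is automatically prime."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

primes = [2, 3, 5, 7, 11, 13]
M = prod(primes) + 1
q = smallest_prime_factor(M)
print(M, q, M // q)

# M leaves remainder 1 on division by each listed prime, so its
# prime factors must lie outside the list.
assert all(M % p == 1 for p in primes)
assert q not in primes
```

This matches the logic of the proof: M need not be prime, but none of its prime factors can be among p_1, …, p_n.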
Applying (b) with n = N we deduce that P (N + 1) is true, and applying (b) again we get
P (N + 2), and subsequently P (N + 3) etc. Indeed to deduce P (n) for any n ≥ N , starting from
P (N ), we only need to apply (b) n − N times.
Thus (a) and (b) together imply that P (n) is true for all integers n ≥ N . This is called
the principle of mathematical induction (PMI).
Here’s another way to look at the PMI, which is based on the idea of proof by contradiction.
Assume that (a) and (b) are satisfied for our P (n). Suppose it’s not the case that P (n) is true
for every integer n ≥ N . Then there must be a least integer m ≥ N for which P (m) is false.
Since we know P (N ) is true, we must have m > N , and so we can write m = n + 1, where
n ≥ N . Because m is the least integer not less than N for which our property P is false, it must
be the case that P (n) is true. But now (b) tells us that since P (n) is true so is P (m).
Thus assuming that it is not the case that P (n) holds for all integers n ≥ N has led to a
contradiction, and so P (n) must be true for all n ≥ N . When we argue like this, the integer m
which arises is sometimes known as the minimal criminal.
so P (6) is true. Here we are taking N = 6, and this is called the anchor step, or induction
beginning.
Assume next that n ≥ 6 and that P (n) is true. This is called the induction hypothesis. Now
Putting these together we get 5^(n+1) > (n + 1)^5, and so we have shown that P(n) ⇒ P(n + 1)
for n ≥ 6. This is called the induction step.
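It is worth tabulating small cases to see why the anchor step is taken at N = 6. The sketch below assumes, as the anchor and induction steps above suggest, that P(n) is the statement 5^n > n^5.

```python
# Assumed statement P(n): 5**n > n**5.
table = [(n, 5**n, n**5, 5**n > n**5) for n in range(1, 11)]
for row in table:
    print(row)

# P(n) fails for n = 2, 3, 4, 5 (with equality at n = 5) and holds from 6 onwards,
# at least as far as this finite check reaches.
assert all(5**n > n**5 for n in range(6, 200))
```

A table like this does not replace the induction, but it shows that no smaller anchor than N = 6 would make "P(n) for all n ≥ N" true.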
(n + 1)^5 − (n + 1) = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 − (n + 1) = n^5 − n + 5n^4 + 10n^3 + 10n^2 + 5n.
Now 5 divides n^5 − n, by the induction hypothesis P(n), and 5 clearly divides 5n^4, 10n^3,
10n^2 and 5n. So 5 divides (n + 1)^5 − (n + 1), and we have deduced P(n + 1) from P(n).
Thus P(n) holds for all integers n ≥ 2 by induction.
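Both the algebraic identity used in the induction step and the conclusion can be sanity-checked for many values of n. This is a minimal sketch and of course not a substitute for the proof.

```python
checked = 0
for n in range(0, 1000):
    lhs = (n + 1)**5 - (n + 1)
    rhs = (n**5 - n) + 5 * n**4 + 10 * n**3 + 10 * n**2 + 5 * n
    assert lhs == rhs             # the binomial expansion used in the step
    assert (n**5 - n) % 5 == 0    # the statement being proved
    checked += 1
print(checked, "values of n checked")
```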
We can prove this by mathematical induction. For a given integer n ≥ 0 let P (n) be the
above statement.
Anchor step: P(0) is true, because setting Q_0(x) = y_0 (constant, so clearly of degree at most
0) makes Q_0(x_0) = y_0.
Induction hypothesis: Suppose that n ≥ 0 is an integer and that P(n) is true.
Induction step: We want to deduce P(n + 1). So let x_0, …, x_(n+1) be distinct real numbers,
and let y_0, …, y_(n+1) be real numbers. By the induction hypothesis P(n) we can find a polynomial
Q_n, of degree at most n, which satisfies Q_n(x_j) = y_j for j = 0, …, n. Now set
Q_(n+1)(x) = Q_n(x) + (y_(n+1) − Q_n(x_(n+1))) · [(x − x_0) … (x − x_n)] / [(x_(n+1) − x_0) … (x_(n+1) − x_n)].
Then we find that Q_(n+1)(x_j) = Q_n(x_j) = y_j for j = 0, …, n, and
Q_(n+1)(x_(n+1)) = Q_n(x_(n+1)) + (y_(n+1) − Q_n(x_(n+1))) = y_(n+1)
as required. Also Q_(n+1) has degree at most n + 1. Thus we have deduced P(n + 1) from P(n).
Since P(0) is true we obtain in succession P(0), P(1), P(2), …, and P(n) is true for every
integer n ≥ 0 by induction.
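The induction step is itself a recipe: start with the constant Q_0 and add one correction term per extra node. The sketch below follows that recipe literally, representing each Q_k as a Python function; the name interpolate and the sample data are made up for illustration, and exact fractions are used throughout.

```python
from fractions import Fraction

def interpolate(xs, ys):
    """Build Q_n step by step, exactly as in the induction step."""
    def q0(x, c=Fraction(ys[0])):
        return c                      # Q_0(x) = y_0
    q = q0
    for k in range(1, len(xs)):
        nodes = xs[:k]                # x_0, ..., x_(k-1)
        denom = Fraction(1)
        for xi in nodes:
            denom *= xs[k] - xi
        coeff = (ys[k] - q(xs[k])) / denom

        def q_next(x, prev=q, nodes=nodes, coeff=coeff):
            w = Fraction(1)
            for xi in nodes:
                w *= x - xi           # vanishes at all earlier nodes
            return prev(x) + coeff * w

        q = q_next
    return q

# Hypothetical sample data: three points lying on y = x^2 + 1.
Q = interpolate([0, 1, 3], [1, 2, 10])
print([Q(x) for x in [0, 1, 3]], Q(2))
```

The default arguments prev, nodes and coeff freeze the values from the current step, mirroring how Q_(n+1) is defined in terms of the already constructed Q_n.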
so that we have deduced P (n + 1). Hence P (n) holds for all n ≥ 1 by the PMI.
For application of induction to two-term recurrent sequences like the Fibonacci numbers, one
only needs two preceding cases, P (n) and P (n − 1) in the induction step, and two anchor cases
(e.g., P (1) and P (2)) to get the induction going. The logical structure of such a proof is of the
following form:
Anchor step: P (n) is true for n = 1, 2.
Induction Hypothesis: Assume that P (n − 1) and P (n) are true.
Induction step: Let n ≥ 2 be given and using that P (n − 1) and P (n) hold, we show that
P (n + 1) holds.
By the principle of strong induction, P (n) holds for all n ∈ N.
(Note that here in the induction step we could have also said: “Assume P(1), P(2), …, P(n)
are true”; this is a bit redundant, as only the last two cases P(n − 1) and P(n) are needed,
though logically correct.)
Proposition Every integer n ≥ 2 can be factored into prime numbers.
Proof Let n ≥ 2 be an integer and let P (n) be the statement “n can be factored into prime
numbers”.
Anchor Step: P (2) is true.
Induction assumption: Let n ≥ 2 and assume that P (2), P (3), . . . , P (n) are true.
Induction step: There are two cases:
(i) n + 1 is prime. Then P (n + 1) holds.
(ii) n + 1 = ab for two natural numbers a, b with 1 < a, b < n. Since P(a) and P(b) hold by the induction
assumption, we know that a = p_1 ⋯ p_k and b = q_1 ⋯ q_l for primes p_1, …, p_k, q_1, …, q_l. Thus
n + 1 = p_1 ⋯ p_k · q_1 ⋯ q_l is a prime factorization of n + 1. Thus P(n + 1) holds, and
we have shown the assertion using strong induction.
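The two cases of the induction step translate directly into a recursive function: either the number is prime, or it splits as ab with smaller factors that are factored recursively. This is a minimal sketch (the name factor is chosen for illustration); it mirrors the proof rather than aiming for efficiency.

```python
from math import isqrt, prod

def factor(n: int) -> list:
    """Prime factors of n >= 2, following the two cases of the proof."""
    for a in range(2, isqrt(n) + 1):
        if n % a == 0:
            # case (ii): n = a * b with 1 < a, b < n; factor both parts
            return factor(a) + factor(n // a)
    # case (i): no proper divisor exists, so n is prime
    return [n]

print(factor(360), factor(97), factor(30031))
assert all(prod(factor(n)) == n for n in range(2, 2000))
```

The recursion terminates for the same reason strong induction works: each recursive call is on a strictly smaller integer that is still at least 2.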
The Fibonacci sequence consists of the so-called Fibonacci numbers 0, 1, 1, 2, 3, 5, 8, 13, 21,
... and is defined via F_0 = 0, F_1 = 1 and F_(n+2) = F_(n+1) + F_n for all n ≥ 0. Define the roots of
x^2 − x − 1 = 0 as ϕ = (1 + √5)/2 and ψ = (1 − √5)/2.
Proposition The n-th Fibonacci number is given by
F_n = (ϕ^n − ψ^n)/(ϕ − ψ) = (ϕ^n − ψ^n)/√5.
Proof Let P(n) be the statement “F_n = (ϕ^n − ψ^n)/(ϕ − ψ)”. We now use strong induction using the
formula F_(n+2) = F_(n+1) + F_n.
Anchor step: F_0 = (ϕ^0 − ψ^0)/(ϕ − ψ) = 0 and F_1 = (ϕ^1 − ψ^1)/(ϕ − ψ) = 1, so P(0) and P(1) hold.
Induction hypothesis: Assume that n ≥ 0 and that P(n) and P(n + 1) both hold.
Induction step: we have
F_(n+2) = F_(n+1) + F_n = (ϕ^(n+1) − ψ^(n+1))/(ϕ − ψ) + (ϕ^n − ψ^n)/(ϕ − ψ)
= (ϕ^n (ϕ + 1) − ψ^n (ψ + 1))/(ϕ − ψ) = (ϕ^(n+2) − ψ^(n+2))/(ϕ − ψ).
The first equality is the recurrence, and the second holds by the induction hypothesis. For the
last step we used that ϕ, ψ are roots of x^2 − x − 1, thus ϕ^2 = ϕ + 1 and ψ^2 = ψ + 1. Hence we have shown that
P(n + 2) is true using that P(n + 1) and P(n) are true, and so P(n) holds for all n by strong
induction.
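The closed formula can be compared with the recurrence numerically. The sketch below uses floating-point arithmetic, so the formula is only evaluated approximately and rounded to the nearest integer; this works for moderate n but is an illustration, not an exact computation. The function names fib and binet are chosen for this example.

```python
from math import sqrt

phi = (1 + sqrt(5)) / 2
psi = (1 - sqrt(5)) / 2

def fib(n: int) -> int:
    """F_n from the recurrence F_0 = 0, F_1 = 1, F_(n+2) = F_(n+1) + F_n."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def binet(n: int) -> float:
    """F_n from the closed formula, up to floating-point rounding."""
    return (phi**n - psi**n) / sqrt(5)

first = [fib(n) for n in range(9)]
print(first)
assert all(round(binet(n)) == fib(n) for n in range(40))
# The facts about the roots used in the induction step (approximately, in floats):
assert abs(phi**2 - (phi + 1)) < 1e-12 and abs(psi**2 - (psi + 1)) < 1e-12
```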
Note that (and this is optional, but interesting!) both induction and strong induction are
equivalent to the well-ordering principle of the natural numbers: every non-empty subset of N has a
smallest element.
To prove the required assertion we can fix x ∈ R and let P(n) be the statement that f_n(−x) =
(−1)^n f_n(x). Then f_0(−x) = 1 = (−1)^0 · 1 = (−1)^0 f_0(x) and f_1(−x) = −x = (−1)^1 · x =
(−1)^1 f_1(x), so P(0) and P(1) are true.
However, a problem arises when we try to deduce P (n + 1) from P (n). We need
Something to think about (optional): what is the connection between the f_n(x) and cos nt?
3 Inequalities
3.1 Introduction
Inequalities
x < y
are generally harder to work with than equations, particularly when one tries to multiply them.
For example
a = b and c = d ⇒ ac = bd
whereas
−4 < −3, −2 < −1, but it is not true that 8 = (−4)(−2) < (−3)(−1) = 3.
To develop some rules which we can use we need to go back to first principles.
We assume these to be true and do not have to prove them. These are called axioms. For
positive real numbers we also have the following two rules (two more axioms, we do not have to
prove these, either):
(P1) the sum of two positive real numbers is positive;
(P2) the product of two positive real numbers is positive.
We can now derive some additional rules. Let x, y, z be any real numbers.
We also have:
Let x > 0. Then 1/x > 0.
Proof Obviously 1/x ≠ 0 because otherwise we’d get 1 = x(1/x) = x · 0 = 0.
Now suppose 0 > 1/x. Adding −1/x to both sides we get −1/x > 0 and (P2) gives −1 =
x(−1/x) > 0, a contradiction. Thus we must have 1/x > 0.
Using rule (iii) from §3.2 we see that if x ≤ y and z ≥ 0 then y − x and z are each either
positive or 0 and so is zy − zx = z(y − x), so that zx ≤ zy.
This way we get analogous statements for all the ones in the previous section with < replaced
by ≤.
3.4 Example
Solve the following inequality for real x:
1/x + 2/(1 − x) > 0, x ≠ 0, 1. (8)
(i.e. determine all x ∈ R with x ≠ 0, 1 such that (8) is satisfied).
We have to exclude x = 0, 1 because if x is 0 or 1 the LHS of (8) is not defined.
Solution. We need
0 < 1/x + 2/(1 − x) = (1 − x + 2x)/(x(1 − x)) = (1 + x)/(x(1 − x)).
For this to be true the numerator 1 + x and the denominator x(1 − x) must both be positive, or
both be negative.
Now if 0 < x < 1 then x(1 − x) is positive, and so is 1 + x.
For x < 0 or x > 1 the denominator x(1 − x) is negative, so we need 1 + x < 0 i.e. x < −1.
So our solution is x < −1 or 0 < x < 1 i.e. x ∈ (−∞, −1) ∪ (0, 1).
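A solution set like this is easy to test on sample points: the inequality should hold exactly at the points of (−∞, −1) ∪ (0, 1). The sketch below checks this on a grid of rational points, using exact fractions so that the boundary point x = −1 (where the left-hand side equals 0) is handled correctly. The function names are chosen for this illustration.

```python
from fractions import Fraction

def lhs(x):
    """Left-hand side of (8); undefined at x = 0 and x = 1."""
    return 1 / x + 2 / (1 - x)

def in_solution(x):
    """Membership in the claimed solution set (-inf, -1) U (0, 1)."""
    return x < -1 or 0 < x < 1

samples = [Fraction(k, 8) for k in range(-40, 41) if k not in (0, 8)]
mismatches = [x for x in samples if (lhs(x) > 0) != in_solution(x)]
print(len(samples), "points tested,", len(mismatches), "mismatches")
```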
[Figure: graph of y = |x|]
We can also think of |x| as the distance from x to 0. A convenient formula, for real x, is
|x| = √(x^2), in which the square root sign always means the non-negative square root. We then
have, if x and y are real,
|xy| = √(x^2 y^2) = √(x^2) √(y^2) = |x||y|.
Also,
and so taking square roots we have |x + y| ≤ |x| + |y| for all real x, y. This is called the
triangle inequality. We also have as a consequence |x| = |(x − y) + y| ≤ |x − y| + |y|, and so
|x − y| ≥ |x| − |y| for all x, y ∈ R.
Note There is a triangle inequality for complex numbers z, w as well, which is given by
|z + w| ≤ |z| + |w|
where now |z| = √(z_0^2 + z_1^2) if z = z_0 + i z_1 with z_0, z_1 ∈ R. However, for complex numbers the
proof is different, and obviously it is not true in general that |z| = ±z e.g. |i| = 1 ≠ ±i.
Example Determine all real numbers x with
|x| + 2 |x + 2| < 7.
Solution. The inequality is difficult to manipulate with modulus signs present. However splitting
up the range under consideration allows us to solve for x.
(a) Suppose x ≥ 0.
Then |x| = x, |x + 2| = x + 2 and the inequality reads 3x + 4 < 7 i.e. x < 1. Thus in this range
we require 0 ≤ x < 1.
3.6 Quadratics
Consider the quadratic
f(x) = ax^2 + bx + c,
where a, b, c ∈ C. If a ≠ 0 then
4af(x) = 4a^2 x^2 + 4abx + 4ac = 4a^2 x^2 + 4abx + b^2 + 4ac − b^2 = (2ax + b)^2 + 4ac − b^2
Definition f (x) has constant sign on R if precisely one of the following holds:
f (x) ≥ 0 for all real x;
f (x) ≤ 0 for all real x.
Lemma Let a, b, c ∈ R. Then f(x) = ax^2 + bx + c has constant sign on R if and only if
b^2 ≤ 4ac. Moreover, if a > 0 and b^2 ≤ 4ac then f(x) ≥ 0 for all x ∈ R, and if a < 0 and
b^2 ≤ 4ac then f(x) ≤ 0 for all x ∈ R.
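The Lemma rests on completing the square, and both can be tried out on small integer coefficients. The sketch below is an illustration under stated assumptions: it only covers a ≠ 0 and a finite grid of sample points. When b^2 > 4ac it exhibits a sign change by comparing the vertex x = −b/(2a) with a point far away from it (the offset 100 is an arbitrary choice that is large enough for these coefficients).

```python
from fractions import Fraction
from itertools import product

def f(a, b, c, x):
    return a * x * x + b * x + c

xs = [Fraction(k, 4) for k in range(-80, 81)]
cases = 0
for a, b, c in product(range(-3, 4), repeat=3):
    if a == 0:
        continue  # the linear case a = 0 is treated separately in the proof
    disc = b * b - 4 * a * c
    # completing the square: 4a f(x) = (2ax + b)^2 + 4ac - b^2
    assert all(4 * a * f(a, b, c, x) == (2 * a * x + b) ** 2 - disc for x in xs)
    if disc <= 0:
        # constant sign: f has the sign of a (or is 0) at every sample point
        assert all(a * f(a, b, c, x) >= 0 for x in xs)
    else:
        # sign change: opposite signs at the vertex and far from it
        vertex = Fraction(-b, 2 * a)
        assert a * f(a, b, c, vertex) < 0 < a * f(a, b, c, vertex + 100)
    cases += 1
print(cases, "coefficient triples checked")
```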
Proof Suppose first that a = 0. Then the straight line y = bx + c has slope b and so takes
positive and negative values if and only if b ≠ 0 i.e. if and only if b^2 > 4ac.
for all x ∈ R. Since F(x) is a sum of squares of real numbers, we see that F(x) ≥ 0 for all real
x. But expanding out the squares gives
F(x) = Σ_{k=1}^{n} (x^2 a_k^2 − 2x a_k b_k + b_k^2) = Ax^2 − 2Bx + C,
where
A = Σ_{k=1}^{n} a_k^2,  B = Σ_{k=1}^{n} a_k b_k,  C = Σ_{k=1}^{n} b_k^2.
Since the quadratic F(x) is non-negative for every real x, it follows from our discussion of
quadratics and the Lemma of the previous section that we must have
4B^2 − 4AC = (2B)^2 − 4AC ≤ 0 and hence B^2 ≤ AC,
which is (10).
Example Let
S = Σ_{k=1}^{99} √(k(100 − k)) = √99 + √(2 · 98) + … + √(98 · 2) + √99.
Estimate S using Cauchy-Schwarz.
By Cauchy-Schwarz we have
S^2 ≤ (Σ_{k=1}^{99} k) (Σ_{k=1}^{99} (100 − k)) = (Σ_{k=1}^{99} k)^2
and so
S ≤ Σ_{k=1}^{99} k = (99 · 100)/2 = 4950.
The correct value is 3922.835639 but our estimate is at least of the right order of magnitude.
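The sum S and the Cauchy-Schwarz bound are both quick to compute, which makes it easy to compare the estimate with the true value. The following sketch uses floating-point square roots, so the printed value of S is an approximation.

```python
from math import sqrt

# S = sum of sqrt(k (100 - k)) for k = 1, ..., 99
S = sum(sqrt(k * (100 - k)) for k in range(1, 100))

# Cauchy-Schwarz with a_k = sqrt(k) and b_k = sqrt(100 - k):
# S^2 <= (sum of k) * (sum of (100 - k)) = (sum of k)^2
bound = sum(range(1, 100))

print(S, bound)
assert S**2 <= sum(range(1, 100)) * sum(100 - k for k in range(1, 100))
```

The bound overshoots by roughly a quarter, which is typical: Cauchy-Schwarz is sharp only when the two sequences are proportional, and √k and √(100 − k) are far from that.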
The Cauchy-Schwarz inequality has many applications in mathematics: for example it occurs
in connection with the isoperimetric inequality (of all loops of a given length, the one
enclosing the largest area is a circle), and in some treatments of the famous Heisenberg uncertainty
principle from quantum theory. It will appear again in the Linear Mathematics module.
Theorem (AM/GM inequality) We have A ≥ G, with equality only if all the a_j are equal.
Proof (Optional.) If all the a_j are equal there is nothing to prove. Suppose now that not
all the a_j are equal. Hence we can find a_r and a_s with a_r < A < a_s. Replace a_r and a_s by
b_r = A and b_s = a_r + a_s − A. Note that b_s > 0 because a_s > A. Since
b_r + b_s = a_r + a_s
the AM is unchanged. Moreover
b_r b_s − a_r a_s = A(a_r + a_s − A) − a_r a_s = (A − a_r)(a_s − A) > 0,
and so b_r b_s > a_r a_s, which means that we have increased the GM.
Since we have replaced one of our numbers by A, the number of them which equal A has
increased. Note also that if a_r and a_s are the only ones which do not equal A then a_r + a_s must
be 2A, since A is the AM: hence b_r = b_s = A in this case.
Thus we can repeat this process until all of our numbers are equal to A. The AM has not
changed, being always A, but the GM increases at every step. The final GM is equal to A, and
so the original GM must be less than A.
Example Find the maximum value of xyz for real numbers x, y, z subject to the conditions that
x, y, z > 0 and x + 2y + 3z = 6.
Solution: Let p = x, q = 2y, r = 3z. Then p, q, r > 0 with AM = (p + q + r)/3 = (x + 2y + 3z)/3 = 2 and
GM = (pqr)^(1/3) ≤ 2 by the AM/GM inequality, with equality if and only if p = q = r = 2. Thus
pqr ≤ 2^3 if and only if 6xyz ≤ 8, which implies that xyz ≤ 4/3 where we have equality if and
only if x = 2, y = 1, z = 2/3.
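The bound xyz ≤ 4/3 can be checked on a grid of admissible points. The sketch below parametrises the constraint by two rationals 0 < p < q < 6, setting x = p, 2y = q − p and 3z = 6 − q, so that x + 2y + 3z = 6 holds automatically; the grid spacing 1/10 is an arbitrary choice for this illustration.

```python
from fractions import Fraction

best = Fraction(0)
best_point = None
for i in range(1, 60):
    for j in range(i + 1, 60):
        p, q = Fraction(i, 10), Fraction(j, 10)
        x, y, z = p, (q - p) / 2, (6 - q) / 3
        # every grid point satisfies the constraints exactly
        assert x > 0 and y > 0 and z > 0 and x + 2 * y + 3 * z == 6
        value = x * y * z
        assert value <= Fraction(4, 3)      # the AM/GM bound
        if value > best:
            best, best_point = value, (x, y, z)

print(best, best_point)
```

The grid happens to contain the point x = 2, y = 1, z = 2/3, so the largest value found equals the bound 4/3 exactly.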
x = b_n b_(n−1) … b_1 b_0 . a_1 a_2 a_3 …
in which each of the digits b_j, a_j is one of 0, 1, …, 9. This expansion is not always unique, since
1 = 0.9̇ = 0.99999999….
Note that +∞, −∞ (∞ means infinity) are NOT real numbers, and that we will mostly just
write ∞ for +∞.
We use the following notation for intervals of real numbers:
(a, b) = {x ∈ R | a < x < b},
(a, ∞) = {x ∈ R | x > a},
(−∞, b) = {x ∈ R | x < b},
[a, b] = {x ∈ R | a ≤ x ≤ b},
(−∞, b] = {x ∈ R | x ≤ b},
(a, b] = {x ∈ R | a < x ≤ b}, etc.
4.2 Upper and lower bounds
When we have a set A of real numbers, the most natural question to ask is: how large is A? For
example,
B = {x ∈ R : x^2 < 1/4},  C = {x ∈ R : e^x > 1/4},
are both infinite sets (each has infinitely many members). However there is a difference between
B and C in that C has arbitrarily large positive members, since e^x > 1/4 for large positive x. In
fact C = (− ln 4, ∞), but all members of B = (−1/2, 1/2) are less than 1/2.
Let A be any subset of R. This just means that A is a collection of real numbers (every
member of A is a member of R).
Definition M ∈ R is an upper bound (or majorant) for A if x ≤ M for all x ∈ A. We also say
that A is bounded above. m ∈ R is a lower bound (or minorant) for A if m ≤ x for all x ∈ A.
We also say that A is bounded below. A is called bounded if it is bounded above and below.
In our example, the set B is bounded above (1/2 is an upper bound), but the set C is not.
(ii) The set N is bounded below, but not above. The empty set is bounded as any real number
is both an upper and a lower bound.
(iii) Let B be the interval [−√2, √2] i.e. the set of all real numbers x such that −√2 ≤ x ≤ √2.
Then B is bounded above and below. This set is bounded.
(iv) An upper bound for A does not have to be in A. Also, “upper bound” does not mean
“boundary”.
4.5 Examples and remarks
(i) Any set with a maximum element is bounded above. However, a set may be bounded above
but have no maximum element, see (iv).
(ii) A non-empty finite set of real numbers always has a maximum element and a minimum
element. The empty set, however, has no maximum or minimum element (it has no elements at
all).
(iii) Let B = {x ∈ R : x^2 ≤ 2}. Then B = [−√2, √2] and clearly √2 = max B and
−√2 = min B.
(iv) Now let D = {1 − 10^(−n) : n ∈ N}. Then D is bounded above by 1. However, D has
no maximum element, because increasing n increases 1 − 10^(−n). There are elements of D
arbitrarily close to 1 and so, of all upper bounds for D, the least is 1.
(v) Take the set X = {−1/n : n ∈ N}. Since x_n = −1/n increases as n increases, the
least element of X is x_1 = −1. For the same reason, X has no greatest element. As n increases
towards ∞, we see that x_n approaches 0. Thus 0 is the least real number which is an upper
bound, but this value is not a max since it is not in the set.
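Examples (iv) and (v) share a pattern: the elements increase strictly and stay below a bound that is never attained. The sketch below illustrates this for the set D of (iv) using exact fractions; the helper witness, which searches for an element of D exceeding a given t < 1, is made up for this illustration.

```python
from fractions import Fraction

def d(n: int) -> Fraction:
    """The n-th element 1 - 10^(-n) of the set D."""
    return 1 - Fraction(1, 10**n)

# The elements increase with n and stay below 1, so none of them is a maximum.
assert all(d(n) < d(n + 1) < 1 for n in range(1, 50))

def witness(t: Fraction) -> int:
    """For t < 1, return the first n with d(n) > t, showing t is not an upper bound."""
    n = 1
    while d(n) <= t:
        n += 1
    return n

n = witness(Fraction(999, 1000))
print(n, d(n))
```

Any t < 1 is beaten by some element of D, while 1 itself is an upper bound; together these say that 1 is the least upper bound even though it is not in D.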
(b) for every real number t with t < s, there exists some x ∈ A such that x > t (i.e., s is
the smallest possible upper bound).
Note that every set A has at most one least upper bound: Suppose that s and t are both
least upper bounds of A and assume that t < s; then from the definition of s = lub(A) it follows
that there exists some x ∈ A such that x > t, meaning that t is not an upper bound, a contradiction.
Thus t ≥ s. Assume next that t > s; then from the definition of t = lub(A) it follows that
there exists some y ∈ A such that y > s, meaning that s is not an upper bound, a contradiction.
It follows that s = t. We conclude that we can talk about “the” least upper bound of a set A,
as it is uniquely determined.
(a) x ≤ s for all x ∈ A (i.e., s is an upper bound for A);
(b’) for every ε > 0 there exists some x ∈ A such that x > s − ε (i.e., there exists at least one
element of A within any given distance of s).
Take a non-empty subset A of R such that A is bounded above. We have already seen that A
might have no maximum element, but A has at least one upper bound M . In fact A then has
infinitely many upper bounds, and it is a fundamental property of R that among all of the upper
bounds of A there is one which is the least. This is the
Least upper bound property of the real numbers:
If A is a non-empty subset of R which is bounded above, then A has a least upper bound.
Note that the rational numbers Q do not have the least upper bound property. Consider the
non-empty set S = {x ∈ Q | −√2 < x < √2} = (−√2, √2) ∩ Q. Then S is bounded above
by √2. Now √2 is the least upper bound for S in R, however √2 ∉ Q. S does not have a least
upper bound in Q.
(ii) Let C be the set of all decimals x = 0.a_1 a_2 … in which each a_j is 0 or 3. Then max C = 1/3,
and this is also the supremum (i.e., lub(C) = 1/3).
(iii) Now let C* be the same as C, but with each x having only finitely many a_j allowed to
be 3. The number of 3s must be finite for each x, but may depend on x.
Then 1/3 is still an upper bound but is not in C*. Also, C* has no max, because given any
element of C* we can make a larger element by adding one extra 3. Finally, if t < 1/3 then by
taking 0.3…30 with enough 3s, we can make an element of C* which is greater than t. So 1/3
is the least upper bound.
n^2/(n^2 + 1),
where n ∈ N. Then an upper bound for G is 1, but 1 is not in G. Show that 1 = lub(G):
(a) as noted already, 1 is an upper bound for G.
(b’) For all ε > 0 we have to show that there exists x = n^2/(n^2 + 1) ∈ G such that x > 1 − ε. Write
n^2/(n^2 + 1) = 1 − 1/(n^2 + 1).
Rough work: n^2/(n^2 + 1) > 1 − ε if and only if 1 − 1/(n^2 + 1) > 1 − ε if and only if
1/(n^2 + 1) < ε if and only if n^2 + 1 > 1/ε if and only if n > √(1/ε − 1). (For ε > 1 every
n ∈ N works, so we may assume ε ≤ 1 in the last step.)
Thus for every ε > 0 choose an integer n_0 such that n_0 > √(1/ε − 1). Then by our “rough work”
n_0^2/(n_0^2 + 1) > 1 − ε. So 1 = lub(G).
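The "rough work" is really an algorithm: given ε, it tells us how to produce a suitable n_0. The sketch below implements one such choice for rational ε > 0 and verifies the resulting inequality exactly. The function names g and n0_for are made up for this illustration, and the particular n_0 chosen is just one of many that work.

```python
from fractions import Fraction
from math import floor, isqrt

def g(n: int) -> Fraction:
    """The element n^2 / (n^2 + 1) of G."""
    return Fraction(n * n, n * n + 1)

def n0_for(eps: Fraction) -> int:
    """Return an integer n0 >= 1 with n0 > sqrt(1/eps - 1), for rational eps > 0."""
    if eps >= 1:
        return 1                      # every n works in this case
    r = 1 / eps - 1                   # we need n0^2 > r
    return isqrt(floor(r)) + 1        # (isqrt(m) + 1)^2 >= m + 1 > r for m = floor(r)

for eps in [Fraction(2), Fraction(3, 7), Fraction(1, 10), Fraction(1, 1000)]:
    n0 = n0_for(eps)
    assert g(n0) > 1 - eps            # condition (b') for this eps
    print(eps, n0, g(n0))
```

For ε = 1/10 this gives n_0 = 4, and indeed n = 3 would not do, since 9/10 is not strictly greater than 1 − 1/10.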
(v) The empty set has no least upper bound, as every real number is an upper bound for the
empty set.
(a) g ≤ x for all x ∈ A (i.e., g is a lower bound for A);
(b) for every real number h with h > g, there exists some x ∈ A such that x < h (i.e., g
is the largest possible lower bound).
Note that every set A has at most one greatest lower bound: Suppose that g and h are both greatest
lower bounds and assume that h > g; then from the definition of g = glb(A) it follows that
there exists some x ∈ A such that x < h, meaning that h is not a lower bound, a contradiction.
Thus h ≤ g. Assume next that h < g; then from the definition of h = glb(A) it follows that
there exists some y ∈ A such that y < g, meaning that g is not a lower bound, a contradiction.
It follows that g = h. We conclude that we can talk about “the” greatest lower bound of a set
A, as it is uniquely determined.
(b’) for every ε > 0 there exists some x ∈ A such that x < g + ε (i.e., there exists at least one
element of A within any given distance of g).
Everything that we proved for lub(A) holds also for glb(A), by simply reversing the
inequalities in the respective proofs. In particular:
Proposition Let A be a non-empty set of real numbers that is bounded below and has a minimum
element. Then min(A) = glb(A).
Proof By definition, the minimum element of A is a lower bound, and no real number g > min(A)
is a lower bound, since min(A) ∈ A.
Proposition Let g = glb(A).
(i) If g ∈ A then g is the minimum element of A.
(ii) If g ∉ A then A does not have a minimum element.
Proof (i) Since g is a lower bound and g ∈ A, g must be the minimum element.
(ii) Suppose that X = min(A) exists. Since g is a lower bound, we have g ≤ X. But X ≠ g
since X ∈ A and g ∉ A. Therefore X > g which contradicts the fact that g is the greatest lower
bound.
Note that the rational numbers Q do not have the greatest lower bound property, either. Consider
the non-empty set S = {x ∈ Q | −√2 < x < √2} = (−√2, √2) ∩ Q. Then S is bounded
below by −√2. Now −√2 is the greatest lower bound for S in R, however −√2 ∉ Q. S does
not have a greatest lower bound in Q.
Example Let A be the set of all x > 0 such that sin x < 0. Then glb(A) = π. There is no
minimum element.
Example Let $A = \{(-1)^n \frac{n^2}{n^2+1} \mid n \in \mathbb{N}\}$. Find inf(A) and sup(A). Does A have a minimum
and a maximum?
To answer this we rewrite an element $x = (-1)^n \frac{n^2}{n^2+1}$ of A as $(-1)^n \frac{n^2}{n^2+1} = 1 - \frac{1}{n^2+1}$ if n is even,
and $(-1)^n \frac{n^2}{n^2+1} = -1 + \frac{1}{n^2+1}$ if n is odd. This shows us that −1 < x < 1 for all x ∈ A. We thus
suspect that inf(A) = −1 and sup(A) = 1. We prove this only for the infimum here:
(a) −1 < x for all x ∈ A, so −1 is a lower bound.
(b’) We have to show that for every ε > 0 there exists some x ∈ A such that x < −1 + ε.
Rough work: Consider n odd (for even n, the elements of A will not lie close to −1). Then
$-1 + \frac{1}{n^2+1} < -1 + \varepsilon$ iff $\frac{1}{n^2+1} < \varepsilon$, iff $n^2 + 1 > \frac{1}{\varepsilon}$.
Now we can show: for ε > 0 choose an odd integer n0 such that $n_0 > \frac{1}{\varepsilon}$, then $n_0^2 + 1 > n_0 > \frac{1}{\varepsilon}$ and
hence by our rough work, $-1 + \frac{1}{n_0^2+1} < -1 + \varepsilon$.
Thus inf(A) = −1.
Since inf(A) = −1 ∉ A the set A has no minimum. It can be shown analogously that sup(A) =
1. Since sup(A) = 1 ∉ A the set A has no maximum.
L(P ) ≤ L.
L(P ) ≤ L(P′) ≤ L,
and so in principle L(P′) is “closer” to L than L(P ). Thus one way to define the length L is as
the supremum of L(P ) over all possible P .
The picture shows an example where (as happens typically) L(P ) < L(P′) < L. For “nice”
curves there is an easier formula for the length, expressed as an integral (see later modules).
[Figure: the curve C together with the approximating paths P and P′.]
Now let
S = Q + 0.D1 D2 D3 D4 . . .
Then S is an upper bound for A because if x = q + 0.d1 d2 d3 . . . is in A but x > S then either
q > Q, or q = Q and d1 > D1 , or q = Q and d1 = D1 and d2 > D2 etc., and none of
these is possible for x in A. Also, all of the sets AN are non-empty, and if x is in AN then
its expansion goes x = Q + 0.D1 ...DN−1 . . ., and so $S \le x + 10^{1-N}$. So there are elements of
A as close as we like to S. So no number less than S can be an upper bound for A. So S is sup A.
Example I:
The points A, B, C are not in a straight line, and the distance from A to B is 5, while that from
B to C is 4. A particle (or car, or ship) moves on a smooth path, without abrupt changes of
direction, and must go from A to C via B. How far must it travel?
[Figure: the points A, B and C, with distance AB equal to 5 and distance BC equal to 4.]
Obviously our traveller must travel a distance at least 5 + 4 = 9. However, since the corner at B
has to be smoothed off, it must travel a distance greater than 9. In principle, the distance could
be arbitrarily close to 9, but it can never equal 9.
In terms of our notation, the glb of the distance travelled is 9, but 9 is not a min because
it is not attainable.
Example II:
Consider all real-valued functions f (x) which are defined and ≥ 0 for 0 ≤ x ≤ 1, with
f (0) = 0, f (1) = 1, and continuous. This means basically that the graph of f is an unbro-
ken curve with no jumps: we will be more precise about this later on. How small can the area
under the curve be?
If we take $f(x) = x^n$ with n a positive integer, then the area is $\int_0^1 x^n\,dx = 1/(n+1)$. Thus
the area can be as close as we like to 0.
On the other hand, the area will always be positive. In fact there must be some c, depending
on f , with 0 < c < 1 such that f (x) ≥ 1/2 for c ≤ x ≤ 1, so that the area is at least
(1 − c)/2 > 0.
[Figure: graph of y = f(x) on [0, 1]. f(x) is at least 0.5 for x in [c, 1], so the area under the curve is at least (1 − c)/2.]
There are a lot of such instances in mathematics, in which we seek to “minimize” or “maxi-
mize” some quantity (e.g. an integral) and it is important to know whether the extreme value is
attained.
5 Sequences
5.1 The basic definitions
Definition We define a real sequence (xn ) as a non-terminating list of real numbers
xN , xN +1 , xN +2 , . . . ,
where N is some integer. Here xn is some quantity depending on n, defined for all integers
n ≥ N , for some starting integer N .
We use the ( ) to distinguish the sequence (xn ) from the particular term xn .
NOTE: a “series” is not the same thing. We will meet series (which are sums of terms) later.
(xn ) is the sequence 1, 1/2, 1/3, . . . , and it is clear that the terms are decreasing and
positive and get very small as n gets large.
The first four terms of (yn ) (N.B. using radians) are 0.54, −0.42, −0.99, −0.65 to 2 decimal
places. However many terms you look at, yn appears to oscillate “randomly” and in particular
yn does not seem to approach any specific value as n gets large.
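The quoted values can be reproduced with a few lines of Python. This is a minimal sketch under the assumption that yn = cos n (the defining formula is not shown in this part of the notes, but it matches the four values quoted above); the function name `first_terms` is hypothetical.

```python
import math

def first_terms(k):
    """First k terms of the assumed sequence y_n = cos(n), n in radians, to 2 d.p."""
    return [round(math.cos(n), 2) for n in range(1, k + 1)]

print(first_terms(4))   # [0.54, -0.42, -0.99, -0.65]
print(first_terms(12))  # longer prefix: the terms keep oscillating
```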
The terms wn clearly increase and get very large as n gets large. un alternates between 1
and −1. The terms vn get large and positive for even n and for odd n, vn = 0. Finally, tn is
constant, and sn approaches 0.
We need a concise description of what happens as n gets larger and larger (this idea being
expressed by the phrase “as n → ∞”). Roughly speaking, a sequence CONVERGES if there is
some real number L, such that xn is very close to L whenever n is sufficiently large. Otherwise
it diverges. In our list, (xn ) and (sn ) converge to 0, while (tn ) converges to 1. The rest diverge.
Next, we seek a precise definition of convergence. Most ways of expressing this in words have
drawbacks: e.g.
“xn gets close to L as n → ∞” (how close?);
“xn gets closer and closer to L as n → ∞” (tends to suggest that |xn+1 − L| should al-
ways be less than |xn − L|, which is not the case for sn ).
5.2 Definition
Definition The real sequence (xn ) converges to the real number α, if the following is true:
For every positive real number ε there exists an integer n0 , possibly depending on ε, such that
|xn − α| < ε for all integers n ≥ n0 .
Note that n0 is allowed to depend on ε and usually will, and some people write n0 (ε) to emphasize
this.
xn → α as n → ∞ ; limn→∞ xn = α.
These should not lead you to think that n ever equals ∞ or that there is some x∞ in the
sequence: they are just shorthand for “(xn ) converges to α”. A non-convergent sequence is
called divergent.
Example Back to our examples we get: for xn = 1/n, to make |xn − 0| < ε we just need
n > 1/ε, and so n ≥ n0 , where n0 is the least integer greater than 1/ε, will do. For tn = 1 then
any n0 will do.
(i) $X_n = 1/n^2$. It is fairly obvious that (Xn ) in this case converges to 0. However, we check
that our definition is satisfied. If ε > 0 is given we need $|X_n - 0| = 1/n^2 < \varepsilon$, and this is true if
$n > \sqrt{1/\varepsilon}$. So if we choose n0 to be the least integer $> \sqrt{1/\varepsilon}$, then we have |Xn − 0| < ε for
all integers n ≥ n0 , as required.
(ii) $Y_n = 2 - e^{-n}$. We have $|Y_n - 2| = e^{-n}$. If ε > 0 then for |Yn − 2| < ε we need
precisely $e^{-n} < \varepsilon$ and so $e^n > 1/\varepsilon$ and so n > ln(1/ε). So we can make n0 (ε) be the smallest
integer > ln(1/ε).
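The choices of n0 (ε) in these examples can be tested on a finite range of indices. The sketch below is illustrative only: the helper names are assumptions, and clamping the index to at least 1 for large ε is a convention chosen here, not something the notes prescribe.

```python
import math

def least_integer_greater_than(t):
    """Smallest integer strictly greater than the real number t."""
    return math.floor(t) + 1

def n0_reciprocal(eps):
    # x_n = 1/n: need n > 1/eps
    return least_integer_greater_than(1 / eps)

def n0_inverse_square(eps):
    # X_n = 1/n^2: need n > sqrt(1/eps)
    return least_integer_greater_than(math.sqrt(1 / eps))

def n0_exponential(eps):
    # Y_n = 2 - e^{-n}: need n > ln(1/eps); clamp so the index is at least 1
    return max(1, least_integer_greater_than(math.log(1 / eps)))

eps = 0.003
print(n0_reciprocal(eps), n0_inverse_square(eps), n0_exponential(eps))
```

A finite check like this cannot replace the proof, but it is a quick way to catch an n0 that is too small.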
Intuitively these should be fairly obvious. However, we can prove them using our definition.
(i) We have to show that for every ε > 0 there exists an integer n0 (ε) such that |(xn + yn ) −
(α + β)| < ε for all integers n ≥ n0 (ε).
Since we don’t have an explicit formula for xn + yn , how will we find n0 ? Answer: from xn and yn .
We have |(xn + yn ) − (α + β)| = |(xn − α) + (yn − β)| ≤ |xn − α| + |yn − β|
by the triangle inequality. If we can make |xn − α| and |yn − β| both < ε/2 then we get
|(xn + yn ) − (α + β)| < ε. But we know that (xn ) converges to α and (yn ) to β. So there
is some integer n1 such that |xn − α| < ε/2 for all n ≥ n1 , and there is some n2 such that
|yn − β| < ε/2 for all n ≥ n2 .
Let n0 be the larger of n1 and n2 . Then for all integers n ≥ n0 we have |xn − α| < ε/2
and |yn − β| < ε/2 and so |(xn + yn ) − (α + β)| < ε as required.
(ii) Again, we have to show that for every ε > 0 there exists an n0 such that |xn yn − αβ| < ε
for all integers n ≥ n0 . We first write
|xn yn − αβ| = |(xn − α)yn + α(yn − β)| ≤ |(xn − α)yn | + |α(yn − β)|. (11)
Suppose that ε > 0 is given. Choose a number δ which is positive and satisfies
δ(|α| + |β| + δ) = ε.
Since (xn ) converges to α and (yn ) to β we can find positive integers n1 and n2 such that
|xn − α| < δ for all n ≥ n1 and |yn − β| < δ for all n ≥ n2 . Again, let n0 be the larger of
n1 and n2 . So for all n ≥ n0 we have |xn − α| < δ and |yn − β| < δ, and we observe that
|yn | = |yn − β + β| ≤ |yn − β| + |β| < δ + |β|. Substituting all these into (11) we get
|xn yn − αβ| ≤ |xn − α||yn | + |α||yn − β| < δ(δ + |β|) + |α|δ = δ(|α| + |β| + δ) = ε
for all n ≥ n0 .
(iii) This is easier, as ||xn | − |α|| is either |xn | − |α| or |α| − |xn | and both of these are ≤ |xn − α|.
So to make ||xn | − |α|| < ε, we just need to make |xn − α| < ε, which we know is true for all
integers n greater than or equal to some n0 .
We choose δ so that
\[ \frac{\delta}{|\beta|(|\beta| - \delta)} = \varepsilon, \]
this solving to give
\[ \delta = \frac{\varepsilon|\beta|^2}{1 + \varepsilon|\beta|}, \]
which is positive but less than |β|. There is some n0 such that for all n ≥ n0 we have |yn − β| < δ
and consequently |yn | = |β − (β − yn )| ≥ |β| − |β − yn | > |β| − δ. Substituting into (12) we
have |yn − β| < δ and |βyn | > |β|(|β| − δ) and so
\[ \left| \frac{1}{y_n} - \frac{1}{\beta} \right| < \frac{\delta}{|\beta|(|\beta| - \delta)} = \varepsilon \]
for all n ≥ n0 .
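The two choices of δ used in the proofs of (ii) and (iv) can be computed explicitly. The following is a minimal sketch; the function names are assumptions, and the first function simply takes the positive root of the quadratic δ² + (|α| + |β|)δ − ε = 0.

```python
import math

def delta_for_product(alpha, beta, eps):
    """Positive solution of delta * (|alpha| + |beta| + delta) = eps, as in part (ii)."""
    s = abs(alpha) + abs(beta)
    return (-s + math.sqrt(s * s + 4 * eps)) / 2

def delta_for_reciprocal(beta, eps):
    """Solution of delta / (|beta| * (|beta| - delta)) = eps for beta != 0, as in part (iv)."""
    b = abs(beta)
    return eps * b * b / (1 + eps * b)

d = delta_for_product(2.0, -3.0, 0.01)
r = delta_for_reciprocal(-3.0, 0.01)
print(d, d * (5.0 + d))            # second value should be close to 0.01
print(r, r / (3.0 * (3.0 - r)))    # second value should be close to 0.01
```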
Remark Note that in (iv) we need β ≠ 0: Let $y_n = \frac{(-1)^n}{n}$, then $\lim_{n\to\infty} y_n = 0$, but $\frac{1}{y_n} = n$ if
n is even and $\frac{1}{y_n} = -n$ if n is odd, so $(\frac{1}{y_n})$ has no limit.
If (xn ) converges to α and γ ∈ R, then (γxn ) converges to γα. To see this we just take (yn ) to
be the constant sequence yn = γ for all n in (ii).
If (xn ) converges to α and (yn ) converges to β ≠ 0, where yn ≠ 0 for all n, then $(\frac{x_n}{y_n})$ converges
to $\frac{\alpha}{\beta}$. To see this apply (ii) and (iv) to $(\frac{x_n}{y_n}) = (x_n \cdot \frac{1}{y_n})$.
The Algebra of Limits does not say anything about divergent sequences. But we can use proof by
contradiction to get results, for instance:
Proposition If (xn ) converges to α and (yn ) diverges, then (xn + yn ) diverges.
Proof Suppose that (xn + yn ) converges to γ as n tends to infinity. Then yn = (xn + yn ) − xn
and by the Algebra of Limits, the sequence (yn ) thus tends to γ − α as n tends to infinity. This
contradicts our assumption that the sequence is divergent.
Proposition (“Limits preserve non-strict inequalities”) If an ≤ bn and (an ) tends to a and (bn )
tends to b, then a ≤ b.
Proof Assume towards a contradiction that a > b. For all ε > 0 there is N1 such that
|an − a| < ε for all n > N1 and an N2 such that |bn − b| < ε for all n > N2 . Therefore for
all ε > 0 there is N = max(N1 , N2 ) such that for all n > N we have |an − a| < ε
and |bn − b| < ε. Now choose $\varepsilon = \frac{a-b}{2} > 0$, then there is N such that for all n > N , both
$|a_n - a| < \frac{a-b}{2}$ and $|b_n - b| < \frac{a-b}{2}$. This implies that $-(a_n - a) < \frac{a-b}{2}$, so $a_n - a > \frac{-a+b}{2}$ and
therefore $a_n > \frac{a+b}{2}$. It also implies that $b_n - b < \frac{a-b}{2}$, hence $b_n < \frac{a+b}{2}$. Together we therefore
have
\[ b_n < \frac{a+b}{2} < a_n \]
which contradicts the assumption that an ≤ bn . Thus if an ≤ bn and (an ) tends to a and (bn )
tends to b, then a ≤ b.
Note: 1/n > 0 for each n, and hence the limit a = 0 of the sequence (1/n) also satisfies
a ≥ 0. Note, however, that a does not satisfy a > 0. Hence, the term “non-strict” is important
here.
5.4 Examples
(i) Let $x_n = 1 + n^{-1} + 2^{-n}$. Then (xn ) converges to 1.
(ii) Let
\[ y_n = \frac{2n^3 + 1}{n^3 + 2n + 1}. \]
Dividing top and bottom by $n^3$, the largest degree occurring here, we see that (yn ) converges to 2.
(iii) Let
\[ z_n = \frac{2n^2 + 1}{n^3 + 2n + 1}. \]
Dividing top and bottom by $n^3$, the largest degree occurring here, we see that (zn ) tends to 0.
(iv) Let
\[ u_n = \frac{2n^3 + 1}{n^2 + 2n + 1}. \]
Dividing top and bottom by $n^2$, we see that un gets very large and diverges.
(v) Let $p_n = (-1)^n + 2^{-n}$. Since $2^{-n} \to 0$, we see that (pn ) must diverge, because
otherwise $(-1)^n$ would tend to a limit, by the algebra of limits.
(vi) (Optional) We prove here that the sequence given by Zn = cos n diverges.
Suppose that (Zn ) converges. Then there is a real number L such that cos n is close to L for all
large positive integers n. If n is large then so is n + 1 and therefore cos(n + 1) will also be close
to L, for n large, so that
\[ \cos(n+1) - \cos n = \int_n^{n+1} -\sin t \, dt \]
will be small, say less than 1/4 in absolute value, for all large integers n.
Now sin x ≥ 1/2 for π/6 ≤ x ≤ 5π/6. Since sine has period 2π, we get sin x ≥ 1/2 for
2mπ + π/6 ≤ x ≤ 2mπ + 5π/6, for every positive integer m. Choose a large positive integer m.
Since 5π/6 − π/6 = 2π/3 > 2, we can find an integer n, which will also be large and positive,
such that 2mπ + π/6 ≤ n < n + 1 ≤ 2mπ + 5π/6. But this gives
\[ \int_n^{n+1} -\sin t \, dt \le \int_n^{n+1} -(1/2) \, dt = -1/2 < -1/4, \]
and we have a contradiction, so our assumption that (cos n) converges must be wrong.
You will find some more examples in the handwritten lecture notes, Sections 5.4 and 5.6.
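The behaviour claimed in examples (i) to (iv) can be observed numerically. The sketch below only evaluates the four formulas at a few values of n; it is numerical evidence, not a proof, and the function names are chosen here for illustration.

```python
def x(n):
    return 1 + 1 / n + 2.0 ** (-n)

def y(n):
    return (2 * n**3 + 1) / (n**3 + 2 * n + 1)

def z(n):
    return (2 * n**2 + 1) / (n**3 + 2 * n + 1)

def u(n):
    return (2 * n**3 + 1) / (n**2 + 2 * n + 1)

# x_n and y_n approach 1 and 2, z_n approaches 0, and u_n keeps growing
for n in (10, 100, 1000):
    print(n, x(n), y(n), z(n), u(n))
```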
Proof
Suppose we are given a positive real number ε. Then there exists some n1 such that for all
n ≥ n1 we have |an − α| < ε, so that α − ε < an < α + ε. Similarly there is some n2 such that
α − ε < cn < α + ε for all n ≥ n2 . So if n ≥ n0 = max{n1 , n2 } (the larger of n1 , n2 ) then
α − ε < an ≤ bn ≤ cn < α + ε, which gives |bn − α| < ε as required.
Example (i) $v_n = \frac{\sin n}{n}$. We have −1/n ≤ vn ≤ 1/n and both 1/n and −1/n tend to 0. So vn
tends to 0 by the Sandwich Theorem.
(ii) $x_n = 3 + \frac{(-1)^n}{n}$ tends to 3.
(iii) $t_n = \frac{\ln n}{n^d}$, where d is a positive real number. Then $\frac{\ln n}{n^d}$ tends to 0: We have
\[ \ln n = \int_1^n t^{-1} \, dt \le \int_1^n t^{d/2-1} \, dt = (2/d)(n^{d/2} - 1) < (2/d)n^{d/2}, \]
and so
\[ 0 \le t_n < (2/d)n^{-d/2} \to 0. \]
See Section 5.7 in the handwritten notes for a different way to do this!
(v) Let |a| < 1 and let $v_n = a^n$. We claim that vn → 0. This is fairly clear generally, and
totally obvious if a = 0. Assume now that 0 < |a| < 1 and let ε > 0. How large does n need to
be, in order to make |vn − 0| < ε?
We need $|a^n| = |a|^n < \varepsilon$ and so $(1/|a|)^n > 1/\varepsilon$. Hence we need n ln(1/|a|) > ln(1/ε),
and this is true for all $n > \frac{\ln(1/\varepsilon)}{\ln(1/|a|)}$.
(vi) Let b > 1 and let p be an integer, and set $w_n = \frac{n^p}{b^n}$. Then |wn+1 /wn | → 1/b < 1 as
n → ∞, and so wn → 0.
An important result that follows from the Sandwich Theorem is the following:
Theorem (The Ratio Test) Let (xn ) be a real sequence such that each xn is non-zero, and
that |xn+1 /xn | → L < 1 as n → ∞. Then (xn ) tends to 0.
Proof Since limn→∞ |xn+1 /xn | = L, it follows that for every ε > 0 there exists an n0 such that
for all n > n0 we have
||xn+1 /xn | − L| < ε.
Choose s > 0 with L < s < 1. Taking ε = s − L above, for large n, say n ≥ N , we have $|x_{n+1}/x_n| \le s$ and so
$|x_{N+1}/x_N| \le s$, that means $|x_{N+1}| \le s|x_N|$, $|x_{N+2}| \le s|x_{N+1}| \le s^2|x_N|$, $|x_{N+3}| \le s|x_{N+2}| \le s^2|x_{N+1}| \le s^3|x_N|$, etc. Thus $|x_{N+k}| \le s^k|x_N|$ for k = 1, 2, .... Write n = N + k, then
k = n − N and we get for large n: $0 \le |x_n| \le s^{n-N}|x_N| = s^n \frac{|x_N|}{s^N} \to 0$ as n → ∞. This is
because $\lim_{n\to\infty} s^n = 0$ as |s| < 1. So (xn ) converges to 0 by the Sandwich Theorem.
By the Ratio Test, the sequence $(2^n/n!)$ has limit 0. For the proof of this result and more
examples, see the handwritten notes, and the Core Booklet, Section A5.
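For the sequence 2^n/n! the ratio of consecutive terms can be computed exactly. The sketch below uses exact rational arithmetic from the standard library; the function names `term` and `ratio` are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial

def term(n):
    """x_n = 2^n / n! as an exact fraction."""
    return Fraction(2**n, factorial(n))

def ratio(n):
    """x_{n+1} / x_n, which simplifies to 2 / (n + 1)."""
    return term(n + 1) / term(n)

for n in (1, 5, 10, 20):
    print(n, ratio(n), float(term(n)))
```

The ratios 2/(n + 1) tend to L = 0 < 1, so the hypothesis of the Ratio Test holds, and the printed terms shrink accordingly.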
All three diverge, but (bn ) is better behaved than the others. As n gets large, bn gets large and
positive, while an and cn do not approach anything. We say that (xn ) diverges to ∞ if xn gets
large and positive as n gets large. This means that if we choose any positive real number M , we
will have xn > M for all sufficiently large n. A precise definition is
Definition Let (xn ) be a real sequence. Then (xn ) diverges to ∞, if the following is true: For
every positive real number M , there exists an integer n0 such that xn > M for all integers n ≥ n0 .
So for every given positive number M , no matter how large M might be, there are only finitely
many n such that xn ≤ M .
$x_n = n$ (n odd), $x_n = n^2$ (n even),
(ii) $y_n = 2^n - n$.
Proof $y_n = 2^n - n = 2^n(1 - n/2^n)$ and $(n/2^n)$ tends to 0 as n tends to ∞, so $(1 - n/2^n)$ tends
to 1, and thus (yn ) diverges to ∞.
(ii) $\lim_{n\to\infty} x_n = \infty$ ⇔ ∃n0 ∈ N s.t. xn > 0 for all n ≥ n0 , and $\lim_{n\to\infty} \frac{1}{x_n} = 0$.
(viii) If limn→∞ bn = b, and a < b < c, then ∃n0 ∈ N s.t. a < bn < c, for all n ≥ n0 .
Proof
(i) By contradiction, suppose a > b, then let $\varepsilon = \frac{a-b}{2} > 0$. Then there exist n1 , n2 ∈ N such
that |an − a| < ε, for all n ≥ n1 , and |bn − b| < ε, for all n ≥ n2 . Let n0 = max{n1 , n2 }.
Then for all n ≥ n0 , $|a_n - a| < \varepsilon \Rightarrow a_n > a - \varepsilon = a - \frac{a-b}{2} = \frac{a+b}{2}$, and $|b_n - b| < \varepsilon \Rightarrow b_n < b + \varepsilon = b + \frac{a-b}{2} = \frac{a+b}{2}$. Thus we obtain $a_n > \frac{a+b}{2} > b_n$, a contradiction. Hence, a ≤ b.
(ii) (⇒) Suppose that xn → ∞. Let ε > 0, then for $M = \frac{1}{\varepsilon} > 0$ ∃n0 ∈ N such that
$x_n > M = \frac{1}{\varepsilon} > 0$ for all n ≥ n0 . Thus, xn > 0 and $0 < \frac{1}{x_n} < \varepsilon$, for n ≥ n0 . So, $|\frac{1}{x_n} - 0| < \varepsilon$,
for n ≥ n0 .
(⇐) Let M > 0, take $\varepsilon = \frac{1}{M} > 0$. Since xn > 0, for n ≥ n0 , and $\frac{1}{x_n} \to 0$, ∃n1 ∈ N, n1 ≥ n0 ,
such that $|\frac{1}{x_n} - 0| < \varepsilon$, and xn > 0, for n ≥ n1 . Thus, for all n ≥ n1 , $0 < \frac{1}{x_n} < \varepsilon$, and so
$x_n > \frac{1}{\varepsilon} = M$. Hence, xn → ∞.
(iii) Let $z_n = \frac{1}{x_n}$. Then zn > 0 and $\frac{1}{z_n} = x_n \to 0$. By (ii), $\lim_{n\to\infty} \frac{1}{x_n} = \lim_{n\to\infty} z_n = \infty$.
(iv) Let $\varepsilon = \frac{\beta}{2} > 0$. Since yn → β, ∃n1 ∈ N, such that for all n ≥ n1 |yn − β| < ε ⇔
−ε + β < yn < ε + β. Thus $y_n > \beta - \varepsilon = \beta - \frac{\beta}{2} = \frac{\beta}{2}$, for all n ≥ n1 . Now for any given
M > 0, $\frac{2M}{\beta} > 0$ is a fixed positive number, so ∃n2 ∈ N such that $x_n > \frac{2M}{\beta}$, for all n ≥ n2 .
Let n0 = max{n1 , n2 }. Then for all n ≥ n0 we have $y_n > \frac{\beta}{2} > 0$ and $x_n > \frac{2M}{\beta} > 0$, so that
$x_n y_n > \frac{2M}{\beta} \cdot \frac{\beta}{2} = M$, for n ≥ n0 . Hence, xn yn → ∞ as n → ∞.
(v) Let $z_n = \frac{1}{|x_n|}$. By (ii), zn → 0. Hence, by the Algebra of Limits, $|\frac{y_n}{x_n}| = z_n|y_n| \to 0 \cdot |\beta| = 0$
as n → ∞. Thus $\frac{y_n}{x_n} \to 0$ as n → ∞.
(vi) Let $z_n = \frac{1}{x_n} > 0$. By (iii), zn → +∞. Then by (iv) we have $\frac{y_n}{x_n} = z_n y_n \to \infty$ as n → ∞.
(vii) Since an → +∞, for any given M > 0 ∃n0 ∈ N such that an > M for all n ≥ n0 . Since
bn ≥ an , we have bn ≥ an > M , for all n ≥ n0 . Thus bn → ∞ as n → ∞.
(viii) Let $\varepsilon = \min\{\frac{c-b}{2}, \frac{b-a}{2}\} > 0$. Since bn → b, ∃n0 ∈ N, such that for all n ≥ n0 , |bn − b| <
ε ⇔ b − ε < bn < b + ε. Thus for all n ≥ n0 , $b_n < b + \varepsilon \le b + \frac{c-b}{2} = \frac{c+b}{2} < c$, as b < c, and
$b_n > b - \varepsilon \ge b - \frac{b-a}{2} = \frac{a+b}{2} > a$, as b > a. Hence a < bn < c, for all n ≥ n0 .
To convince ourselves of this we need only consider the case where (xn ) is increasing because if
(xn ) is decreasing then (−xn ) is increasing.
A = {xn | n ∈ N, n ≥ N }
is bounded above (i.e. if there is some M > 0 such that xn ≤ M for all n ≥ N ), then (xn )
converges to the supremum of A.
If A is not bounded above then (xn ) diverges to ∞.
(b) Let (xn ) be decreasing for n ≥ N . If the set
A = {xn | n ∈ N, n ≥ N }
is bounded below (i.e. if there is some real number M such that xn ≥ M for all n ≥ N ), then (xn )
converges to the infimum of A.
If A is not bounded below then (xn ) diverges to −∞.
Proof (a) Let (xn ) be a sequence which is increasing for n ≥ N . Suppose first that A is bounded
above, and let s be the supremum of A. Then xn ≤ s for all integers n ≥ N . Also if t < s then
some xn is greater than t. So if ε > 0 then we have xn0 > s − ε for some integer n0 , and so
xn ≥ xn0 > s − ε for all integers n ≥ n0 . Thus if ε > 0 we have some n0 such that |xn − s| < ε
for all integers n ≥ n0 . This says precisely that xn tends to s as n → ∞.
Now suppose that A is not bounded above. Let M > 0. Then we have xn0 > M for some
n0 , and xn ≥ xn0 > M for all integers n ≥ n0 . This says that (xn ) diverges to ∞.
5.9 Examples
A sequence is defined by
\[ x_1 = 1, \qquad x_{n+1} = \frac{x_n}{2 + x_n} \]
for n ≥ 1. Does (xn ) converge?
We use the Monotone Sequence Theorem (MST) to answer this (see handwritten notes).
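Before applying the MST it can help to compute a few terms. The following is a minimal sketch using exact fractions; the function name `iterate` is an assumption. The output suggests that the terms are positive and decreasing, which is what the MST needs, and any limit ℓ would have to satisfy ℓ = ℓ/(2 + ℓ), so ℓ = 0 (the other root ℓ = −1 is excluded because all terms are positive).

```python
from fractions import Fraction

def iterate(k):
    """Return the first k terms of x_1 = 1, x_{n+1} = x_n / (2 + x_n)."""
    xs = [Fraction(1)]
    for _ in range(k - 1):
        xs.append(xs[-1] / (2 + xs[-1]))
    return xs

xs = iterate(8)
print([str(t) for t in xs])
# Monotonicity and positivity on the computed prefix
print(all(a > b > 0 for a, b in zip(xs, xs[1:])))
```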
5.10 Repeated limits
The following example shows that these need to be treated with some care. Find
\[ \lim_{n\to\infty} \lim_{m\to\infty} \frac{m^{1/n} - 1}{m^{1/n} + 1} \]
and
\[ \lim_{m\to\infty} \lim_{n\to\infty} \frac{m^{1/n} - 1}{m^{1/n} + 1}. \]
Note they are not the same (check the hand written lecture notes or the video of that lesson for
the answer)!
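A quick numerical experiment makes the difference visible. This is only a sketch giving numerical evidence; the function name `f` and the sample values of m and n are assumptions chosen for illustration.

```python
def f(m, n):
    """Evaluate (m^(1/n) - 1) / (m^(1/n) + 1) in floating point."""
    r = m ** (1.0 / n)
    return (r - 1) / (r + 1)

# Inner limit m -> infinity with n fixed: values close to 1
print(f(1e300, 5))
# Inner limit n -> infinity with m fixed: values close to 0
print(f(1e6, 1e9))
```

For fixed n the quantity m^(1/n) grows without bound as m → ∞, while for fixed m it tends to 1 as n → ∞, which is why the two inner limits behave so differently.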
6 Subsequences
6.1 Definitions and first results
Concerning notation, we now sometimes will write (xn )n instead of simply (xn ) for a sequence,
to make clear that n is the index that counts the terms (this is done throughout the literature in
many text books, too).
A useful concept related to sequences is that of a subsequence. Loosely speaking, a subse-
quence of (xn ) is a sequence that contains only some of the elements of (xn ), and all in the same
order as they appear in the “parent sequence” (xn ):
Definition Let (xn )n be a sequence and let (ni ) be a strictly increasing sequence of natural
numbers (that is, n1 < n2 < n3 < ...). Then every sequence of the type
(xni )i
is called a subsequence of (xn )n .
Note The elements of the subsequence must come from the original sequence, so the sequence
1, 0, 1/3, 0, 1/5, ... is not a subsequence of (yn ). Also, the order of elements must be preserved, so
the sequence 1, 1/3, 1/2, 1/5, ... is not a subsequence of (yn ).
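On finite prefixes the two requirements (terms come from the parent sequence, and order is preserved) can be checked mechanically. The sketch below assumes that yn = 1/n, which is not stated in this part of the notes but is consistent with the examples in the Note; the function name `is_subsequence` is hypothetical.

```python
from fractions import Fraction

def is_subsequence(candidate, parent):
    """True if candidate can be obtained from parent by deleting terms (order kept)."""
    it = iter(parent)
    # 'term in it' advances the iterator until it finds term, so order is enforced
    return all(term in it for term in candidate)

parent = [Fraction(1, n) for n in range(1, 11)]  # assumed y_n = 1/n, n = 1..10
F = Fraction
print(is_subsequence([F(1), F(1, 3), F(1, 5)], parent))           # odd-indexed terms
print(is_subsequence([F(1), F(0), F(1, 3)], parent))              # contains a foreign term
print(is_subsequence([F(1), F(1, 3), F(1, 2), F(1, 5)], parent))  # wrong order
```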
Proof (i) Let (xn ) be a bounded sequence, that means there are real numbers m and M and an
integer N such that for all n ≥ N we have m ≤ xn ≤ M . Then for any subsequence (xni ) also
m ≤ xni ≤ M for all ni ≥ N . We know that ni ≥ i. Hence i ≥ N implies ni ≥ N . Thus for all
i ≥ N we have m ≤ xni ≤ M .
(ii) is clear.
Proposition If (xn ) is a convergent sequence, then any subsequence (xni ) is also convergent
and converges to the same limit:
\[ \lim_{n\to\infty} x_n = \lim_{i\to\infty} x_{n_i}. \]
Proof Let limn→∞ xn = α. This means that for every ε > 0 there is N ∈ N such that for all
n ≥ N,
|xn − α| < ε.
We know that ni ≥ i. Hence i ≥ N implies ni ≥ N . Thus for all i ≥ N we have
|xni − α| < ε.
The first converges to −1, but the other converges to 1. Assume towards a contradiction that
(an ) converges to some limit α. By our proposition above, all subsequences of (an ) will
converge to the same limit α. However, since these two subsequences converge to different limits,
our original sequence (an ) must diverge.
Theorem (Bolzano-Weierstrass)
Every bounded sequence in R has a convergent subsequence.
Proof Let (xn ) be a bounded sequence. By the definition of a bounded sequence, there exist
two real numbers a1 < b1 such that, a1 ≤ xn ≤ b1 for all n ∈ N. We will define the required
subsequence inductively. Start with n1 = 1, that is xn1 = x1 .
Let $y = \frac{a_1 + b_1}{2}$ be the midpoint of the interval. Then at least one of the intervals [a1 , y] and
[y, b1 ] has to contain infinitely many elements of the sequence (xn ).
Figure 1: A bounded sequence (xn )
Now choose n2 ∈ N, such that n2 > n1 and xn2 ∈ [a2 , b2 ]. We iterate this construction. Having
chosen a1 , ..., ak and b1 , ..., bk we let $y_k = \frac{a_k + b_k}{2}$ be the midpoint of the interval [ak , bk ] and
we define either [ak+1 , bk+1 ] = [ak , yk ] (or [ak+1 , bk+1 ] = [yk , bk ]) such that [ak+1 , bk+1 ] contains
infinitely many elements of the sequence (xn ). Then we choose nk+1 ∈ N such that nk+1 > nk
and xnk+1 ∈ [ak+1 , bk+1 ].
Now that we have defined the subsequence (xnk ) we have to show that it is convergent.
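The bisection construction can be imitated on a finite sample of a bounded sequence. This is only an illustration under a loud assumption: "contains infinitely many elements" cannot be tested on a computer, so the sketch keeps the half that contains more of the remaining sample terms instead. The function name `bisection_sketch` and the sample sequence cos n are choices made here, not part of the proof.

```python
import math

def bisection_sketch(terms, a, b, steps):
    """Finite stand-in for the Bolzano-Weierstrass construction.

    terms[i] plays the role of x_{i+1}; all terms are assumed to lie in [a, b].
    Returns the chosen (0-based) indices and the nested intervals.
    """
    picks = [0]              # n_1 = 1 in the notes corresponds to index 0 here
    intervals = [(a, b)]
    for _ in range(steps):
        y = (a + b) / 2      # midpoint of the current interval
        last = picks[-1]
        later = range(last + 1, len(terms))
        left = [i for i in later if a <= terms[i] <= y]
        right = [i for i in later if y <= terms[i] <= b]
        # Heuristic replacement for "contains infinitely many terms"
        if len(left) >= len(right):
            b, chosen = y, left
        else:
            a, chosen = y, right
        if not chosen:
            break            # sample exhausted
        picks.append(chosen[0])   # next index, strictly larger than the last
        intervals.append((a, b))
    return picks, intervals

sample = [math.cos(n) for n in range(1, 5001)]
picks, intervals = bisection_sketch(sample, -1.0, 1.0, 6)
for k, (i, (lo, hi)) in enumerate(zip(picks, intervals), start=1):
    print(k, i + 1, sample[i], (lo, hi))
```

Each printed row shows a term of the subsequence trapped in an interval of half the previous length, which is the situation the convergence argument exploits.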
We show convergence using the Sandwich Theorem. The property xnk ∈ [ak , bk ], which holds
for all k ∈ N, can be written as ak ≤ xnk ≤ bk . We need to show that (ak ) and (bk ) are both
convergent and have the same limit. Note that by construction (ak ) is increasing and (bk ) is
decreasing. They are also bounded, because
a1 ≤ a2 ≤ ... ≤ ak ≤ bk ≤ ... ≤ b2 ≤ b1 .
Thus a1 is a lower bound for both sequences and b1 is an upper bound. By the Monotone Se-
quence Theorem it follows that both sequences are convergent.
Let a = limk→∞ ak and b = limk→∞ bk be their limits. Now consider the lengths of the in-
tervals [ak , bk ]. The following identity can be shown by induction,
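\[ b_k - a_k = \frac{1}{2}(b_{k-1} - a_{k-1}) = \frac{1}{2^2}(b_{k-2} - a_{k-2}) = \dots = \frac{1}{2^{k-1}}(b_1 - a_1). \]
Hence $\lim_{k\to\infty}(b_k - a_k) = 0$, and
\[ \lim_{k\to\infty}(b_k - a_k) = \lim_{k\to\infty} b_k - \lim_{k\to\infty} a_k = b - a. \]
Therefore a = b, and since ak ≤ xnk ≤ bk for all k, the Sandwich Theorem shows that (xnk ) converges to this common limit.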
6.3 Examples
(ii) Does the sequence $(c_n) = (4\cos(n) - \sin^2(n) - 1)$ have a convergent subsequence?
Use the triangle inequality:
\[ |c_n| \le 4|\cos(n)| + |\sin^2(n)| + 1 \le 4 + 1 + 1 = 6. \]
Since (cn ) is bounded, it follows that (cn ) has a convergent subsequence by the Bolzano-
Weierstrass theorem.
6.4 Cauchy Sequences
A limitation when deciding if a sequence is convergent (or not) is that you need to have an idea
of its limit before you can test if it is.
We expect the sequence to converge, but without knowing the limit it seems difficult to prove
it. The notion of a Cauchy sequence will provide a tool to test convergence without having to
specify the limit in advance.
Definition A real sequence is called a Cauchy sequence, if for every ε > 0 there exists N ∈ N
such that for all m, n ≥ N we have
|xn − xm | < ε.
In other words, a real sequence is a Cauchy sequence, if the terms of the sequence eventually all
become arbitrarily close to one another. Note that N depends on ε here.
For comparison, recall that a real sequence converges to α, if for every ε > 0 there exists N ∈ N
such that for all n ≥ N we have
|xn − α| < ε.
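Example Show that the sequence $(x_n) = (\frac{1}{n})$ is a Cauchy sequence.
We want to show that ∀ε > 0 there exists an N ∈ N such that ∀m, n ≥ N , we have |xn − xm | < ε.
Let ε > 0 be given and choose N such that $N > \frac{2}{\varepsilon}$. If m, n ≥ N then $n \ge N > \frac{2}{\varepsilon}$ implies
that $\frac{1}{n} \le \frac{1}{N} < \frac{\varepsilon}{2}$ and similarly, $m \ge N > \frac{2}{\varepsilon}$ implies that $\frac{1}{m} \le \frac{1}{N} < \frac{\varepsilon}{2}$. Therefore,
applying the triangle inequality we have
\[ |x_n - x_m| = \left| \frac{1}{n} - \frac{1}{m} \right| \le \left| \frac{1}{n} \right| + \left| \frac{1}{m} \right| \le \frac{1}{n} + \frac{1}{m} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]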
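The choice N > 2/ε can be tested on a finite window of indices. The sketch below is a sanity check rather than a proof, since only finitely many pairs (m, n) are examined; the names `cauchy_index` and `check_window` are assumptions.

```python
import math

def cauchy_index(eps):
    """An integer N with N > 2/eps, as chosen in the example."""
    return math.floor(2 / eps) + 1

def check_window(eps, span=200):
    """Check |1/n - 1/m| < eps for all m, n in [N, N + span)."""
    N = cauchy_index(eps)
    window = range(N, N + span)
    return all(abs(1 / n - 1 / m) < eps for n in window for m in window)

print(cauchy_index(0.01), check_window(0.01))
```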
Notice that if n is even then yn = 1, and so yn+1 = −1. Choose ε = 2 and select any
even n such that n ≥ N ∈ N. Then choose m = n + 1. Then
|xn − xk | < 1.
|xn − xN | < 1.
It follows that
6.6 Cauchy Convergence Criterion
Proof “⇒:” If (xn ) is convergent, then (xn ) is a Cauchy sequence by our Proposition above.
“⇐:”
Let (xn ) be a Cauchy sequence, then by our other Proposition above it is bounded.
Since (xn ) is bounded, by the Bolzano-Weierstrass Theorem there exists a convergent
subsequence, say $(x_{n_k})_k$. Let
\[ \lim_{k\to\infty} x_{n_k} = \alpha. \]
Note that we have proved even more: We have shown that every convergent subsequence of a
Cauchy sequence converges to the same limit, which is the limit of the Cauchy sequence.
6.7 Example
Use the Cauchy Convergence Criterion (above Theorem) to show that (zn ) with
\[ z_n = \frac{\sin 1}{2^1} + \frac{\sin 2}{2^2} + \dots + \frac{\sin n}{2^n} \]
converges.
Solution We will show that the sequence is a Cauchy sequence, because then it is convergent.
For n > m we have
\[ |z_n - z_m| = \left| \frac{\sin(m+1)}{2^{m+1}} + \dots + \frac{\sin n}{2^n} \right| \le \frac{|\sin(m+1)|}{2^{m+1}} + \dots + \frac{|\sin n|}{2^n} < \frac{1}{2^{m+1}} + \dots + \frac{1}{2^n} = \frac{1}{2^m} - \frac{1}{2^n} < \frac{1}{2^m} \]
by the triangle inequality (and geometric progression). Given ε > 0, now we choose N such that
$\frac{1}{2^N} < \varepsilon$. Then for all n > m ≥ N
\[ |z_n - z_m| < \frac{1}{2^m} \le \frac{1}{2^N} < \varepsilon. \]
Therefore, the sequence (zn ) is Cauchy and by the Cauchy convergence criterion the sequence is
also convergent.
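The key estimate |zn − zm | < 1/2^m for n > m can be checked in floating point for small indices. This is a minimal sketch, with the function name `z` chosen to match the notation of the example; it does not identify the limit, which the Cauchy criterion does not require.

```python
import math

def z(n):
    """Partial sum sin(1)/2 + sin(2)/2^2 + ... + sin(n)/2^n."""
    return sum(math.sin(k) / 2**k for k in range(1, n + 1))

for m in (1, 5, 10, 20):
    n = m + 25
    print(m, abs(z(n) - z(m)), 2.0 ** (-m))
```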
6.8 Remarks
It is not enough to have each term “close” to the next one (|xn − xn−1 | < ε), in order to show
a sequence is a Cauchy sequence. For example, the divergent sequence of partial sums of the
harmonic series (see later in the course) does satisfy this property, but not the condition for a
Cauchy sequence.
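The harmonic example can be made concrete with exact arithmetic. In the sketch below, H(n) denotes the partial sum 1 + 1/2 + ... + 1/n (the name `H` is an assumption): consecutive partial sums differ by 1/n, which becomes arbitrarily small, yet H(2n) − H(n) never drops below 1/2, so the Cauchy condition fails for ε = 1/2.

```python
from fractions import Fraction

def H(n):
    """n-th partial sum of the harmonic series, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in (1, 10, 100):
    step = H(n + 1) - H(n)   # equals 1/(n+1), small for large n
    gap = H(2 * n) - H(n)    # stays at least 1/2
    print(n, float(step), float(gap))
```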
The fact that in R Cauchy sequences are the same as convergent sequences is also called the
Cauchy criterion for convergence.
The use of the Completeness Axiom to prove the Cauchy criterion for convergence is crucial. For
example, let (xn ) be a sequence of rational numbers converging to an irrational. Take
\[ (1, 1.4, 1.41, 1.414, \dots) \to \sqrt{2}. \]
Then since (xn ) is a convergent sequence in R it is a Cauchy sequence in R and hence also a
Cauchy sequence in Q. But it has no limit in Q.
In fact one can formulate the Completeness axiom in terms of Cauchy sequences. Here are some
equivalent formulations of the axiom:
1. Every non-empty subset of R which is bounded above has a least upper bound.
Note: the formulation (3) is a useful way of generalising the idea of completeness to structures
which are more general than ordered fields.