
ANALYSIS

AN INTRODUCTORY COURSE

Ivan F Wilde

Mathematics Department

King's College London

iwilde@mth.kcl.ac.uk
Contents

1 Sets

2 The Real Numbers

3 Sequences

4 Series

5 Functions

6 Power Series

7 The elementary functions


Chapter 1

Sets

It is very convenient to introduce some notation and terminology from set
theory. A set is just a collection of objects which will usually be certain
mathematical objects, such as numbers, points in the plane, functions or
some such. If A denotes some given set and x denotes an object belonging
to A, then this fact is indicated by the expression

x ∈ A

to be read as "x belongs to A", or "x is a member of A", or "x is an element
of A". If x denotes some object which does not belong to the set A, then
this is indicated by the symbolism

x ∉ A

and is read as "x does not belong to A", or "x is not a member of A", or
"x is not an element of A".
To say that the sets A and B are equal is to say that they have the same
elements. In other words, to say that A = B is to say both that if x ∈ A
then also x ∈ B and if y ∈ B then also y ∈ A. We can write this as

A = B is the same as: x ∈ A ⟹ x ∈ B and y ∈ B ⟹ y ∈ A.

The verification that given sets A and B are equal is made up of two
parts. The first is the verification that every element of A is also an element
of B and the second part is the verification that every element of B is also
an element of A.
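
The two-part verification can be sketched in Python for finite collections. This is only an illustration: the function name sets_equal is our own choice and the check says nothing about infinite sets.

```python
def sets_equal(a, b):
    """Check equality of two finite collections as sets, via two inclusions."""
    a_in_b = all(x in b for x in a)  # every element of A is also an element of B
    b_in_a = all(y in a for y in b)  # every element of B is also an element of A
    return a_in_b and b_in_a


# Repetition and order do not matter, only membership does.
same = sets_equal([1, 2, 2, 3], {3, 2, 1})
different = sets_equal({1, 2}, {1, 2, 3})
```

Here the second call fails only in the second part: 3 belongs to the second collection but not to the first.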
We list a few examples of sets and also introduce some notation.


Examples 1.1.

1. The set consisting of the three integers 2, 3, 4. We write this as { 2, 3, 4 }.

2. The set of natural numbers { 1, 2, 3, 4, 5, 6, . . . } (i.e., all strictly positive
integers). This set is denoted by N. Notice that 0 ∉ N.

3. The set of all real numbers, denoted by R. For example, 8, 11, 0, 5,
12 , 13 , are elements of R.

4. The set of complex numbers is denoted by C.

5. The set of all integers (positive, negative and including zero) is denoted
by Z.

6. The set of all rational numbers (all real numbers of the form m/n for
integers m, n with n ≠ 0) is denoted by Q. For example, the real numbers
3/4, 17/9, 0, 78, 3 belong to Q, but √2 ∉ Q.

7. The set of even natural numbers { 2, 4, 6, 8, . . . }. This could also be
written as { n ∈ N : n = 2m for some m ∈ N }. (The colon ":" stands for
"such that" (or "with the property that"), so this can be read as "the
set of all n in N such that n = 2m for some m in N".)

8. The set { x ∈ R : x > 1 } is the set of all those real numbers strictly
greater than 1.

9. The set { z ∈ C : |z| = 1 } is the set of complex numbers with absolute
value equal to 1. This is the unit circle in C (the circle with centre
at the origin and with radius equal to 1).

Certain sets of real numbers, so-called intervals, are given a special
notation with the use of round and square brackets. Let a ∈ R and b ∈ R
and suppose that a < b.

{ x ∈ R : a ≤ x ≤ b } is denoted [a, b] (closed interval)

{ x ∈ R : a < x < b } is denoted (a, b) (open interval)

{ x ∈ R : a ≤ x < b } is denoted [a, b) (closed-open interval)

{ x ∈ R : a < x ≤ b } is denoted (a, b] (open-closed interval)

{ x ∈ R : x ≤ a } is denoted (−∞, a]

{ x ∈ R : x < a } is denoted (−∞, a)

{ x ∈ R : a ≤ x } is denoted [a, ∞)

{ x ∈ R : a < x } is denoted (a, ∞).


It is important to realize that all this is just notation, a useful visual
short-hand. In particular, the symbol ∞ is used in four of the cases. This in
no way is meant to imply that ∞ represents a real number; it positively,
absolutely, certainly is not.

∞ is not a real number. There is no such real number as ∞.

Given sets A and B, we say that A is a subset of B if every element of
A is also an element of B, i.e., x ∈ A ⟹ x ∈ B. If this is the case, we
write
A ⊆ B
read "A is a subset of B". By virtue of our earlier discussion of the
equality A = B, we can say that

A = B is equivalent to (if and only if) both A ⊆ B and B ⊆ A.

We have N ⊆ R, Q ⊆ R, N ⊆ Z.

Definition 1.2. Suppose that A and B are given sets. The union of A and B,
denoted by A ∪ B, is the set with elements which belong to either A or B
(or both);
A ∪ B = { x : x ∈ A or x ∈ B }
read "A union B equals . . . ". Note that the usage of the word "or"
allows both.

In non-mathematical language, the union A ∪ B is obtained by bundling
together everything in A and everything in B. Clearly, by construction,
A ⊆ A ∪ B and also B ⊆ A ∪ B.

Example 1.3. Suppose that A = { 1, 2, 3 } and B = { 3, 6, 8 }. Then we find
that A ∪ B = { 1, 2, 3, 6, 8 }.

Definition 1.4. The intersection of A and B, denoted by A ∩ B, is the set
with elements which belong to both A and B;

A ∩ B = { x : x ∈ A and x ∈ B }

read "A intersect B equals . . . ".

In non-mathematical language, the intersection A ∩ B is got by selecting
everything which belongs to both A and B. Clearly, by construction, we see
that A ∩ B ⊆ A and also A ∩ B ⊆ B.


Example 1.5. With A = { 1, 2, 3 } and B = { 3, 6, 8 }, as in the example
above, we see that A ∩ B = { 3 }.
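
For finite sets such as these, the union and intersection can be computed directly with Python's built-in set type. The variable names below are our own; the operators | and & are the standard set union and intersection.

```python
A = {1, 2, 3}
B = {3, 6, 8}

union = A | B          # elements belonging to A or B (or both)
intersection = A & B   # elements belonging to both A and B

# The inclusions noted in the text: A and B sit inside the union,
# and the intersection sits inside both A and B.
inclusions_hold = (A <= union and B <= union
                   and intersection <= A and intersection <= B)
```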

If A and B have no elements in common then their intersection A ∩ B has
no elements at all. It is convenient to provide a symbol for this situation.
We let ∅ denote the set with no elements. ∅ is called the empty set.
Then A ∩ B = ∅ if A and B have no common elements. In such a situation,
we say that A and B are disjoint.

Example 1.6. Let A and B be the intervals in R given as A = (1, 4] and
B = (4, 6). Then A ∩ B = ∅ and A ∪ B = (1, 6).

Remark 1.7. Let A and B be given sets and consider the truth, or otherwise,
of the statement "A ⊆ B". This fails to be true precisely when A possesses
an element which is not a member of B.
Now suppose that A = ∅. The statement "∅ ⊆ B" is false provided
that there is some "nuisance" element of ∅ which is not an element of B.
However, ∅ has no elements at all, so there can be no such nuisance
element. In other words, the statement "∅ ⊆ B" cannot be false and
consequently must be true; ∅ obeys ∅ ⊆ B for any set B. This might seem
a bit odd, but is just a logical consequence of the formalism.

Theorem 1.8. For sets A, B and C, we have

(1) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

(2) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

Proof. (1) We must show that lhs ⊆ rhs and that rhs ⊆ lhs. First, we shall
show that lhs ⊆ rhs. If lhs = ∅, then we are done, because ∅ is a subset of
any set. So now suppose that lhs ≠ ∅ and let x ∈ lhs = A ∪ (B ∩ C). Then
x ∈ A or x ∈ (B ∩ C) (or both).
(i) Suppose x ∈ A. Then x ∈ A ∪ B and also x ∈ A ∪ C and therefore
x ∈ (A ∪ B) ∩ (A ∪ C), that is, x ∈ rhs.
(ii) Suppose that x ∈ (B ∩ C). Then x ∈ B and x ∈ C and so x ∈ A ∪ B
and also x ∈ A ∪ C. Therefore x ∈ (A ∪ B) ∩ (A ∪ C), that is, x ∈ rhs.
So in either case (i) or (ii) (and at least one of these must be true), we find
that x ∈ rhs. Since x ∈ lhs is arbitrary, we deduce that every element of the
left hand side is also an element of the right hand side, that is, lhs ⊆ rhs.
Now we shall show that rhs ⊆ lhs. If rhs = ∅, then there is no more
to prove. So suppose that rhs ≠ ∅. Let x ∈ rhs. Then x ∈ (A ∪ B) and
x ∈ (A ∪ C).
Case (i): suppose x ∈ A. Then certainly x ∈ A ∪ (B ∩ C) and so x ∈ lhs.
Case (ii): suppose x ∉ A. Then since x ∈ (A ∪ B), it follows that x ∈ B.
Also x ∈ (A ∪ C) and so it follows that x ∈ C. Hence x ∈ B ∩ C and so
x ∈ A ∪ (B ∩ C) which tells us that x ∈ lhs.


We have seen that every element of the right hand side also belongs to the
left hand side, that is, rhs ⊆ lhs.
Combining these two parts, we have lhs ⊆ rhs and also rhs ⊆ lhs and so
it follows that lhs = rhs, as required.
(2) This is left as an exercise.
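
As a sanity check (not a substitute for the proof), both identities of Theorem 1.8 can be tested exhaustively over all subsets of a small universe. The helper names subsets and distributive_laws_hold are our own, and the universe { 1, 2, 3 } is an arbitrary choice.

```python
from itertools import combinations


def subsets(universe):
    """Yield every subset of a finite universe as a Python set."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def distributive_laws_hold(universe):
    """Test both parts of Theorem 1.8 for all triples of subsets."""
    all_subsets = list(subsets(universe))
    for A in all_subsets:
        for B in all_subsets:
            for C in all_subsets:
                # Part (1): A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
                if A | (B & C) != (A | B) & (A | C):
                    return False
                # Part (2): A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
                if A & (B | C) != (A & B) | (A & C):
                    return False
    return True


result = distributive_laws_hold({1, 2, 3})
```

The empty set is among the subsets tried, so the special cases lhs = ∅ and rhs = ∅ treated in the proof are covered as well.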

The notions of union and intersection extend to the situation with more
than just two sets. For example,

A_1 ∪ A_2 ∪ A_3 = { x : x ∈ A_1 or x ∈ A_2 or x ∈ A_3 }
= { x : x belongs to at least one of the sets A_1, A_2, A_3 }
= { x : x ∈ A_i for some i = 1 or 2 or 3 }
= { x : x ∈ A_i for some i ∈ { 1, 2, 3 } }.

More generally, for n sets A_1, A_2, . . . , A_n, we have

A_1 ∪ A_2 ∪ · · · ∪ A_n = { x : x ∈ A_i for some i ∈ { 1, 2, . . . , n } }.

This union is often denoted by ⋃_{i=1}^{n} A_i which is somewhat more concise than
the alternative A_1 ∪ A_2 ∪ · · · ∪ A_n. Let Λ denote the index set { 1, 2, . . . , n }.
This is just the set of labels for the collection of sets we are considering. Then
the above can be conveniently written as

⋃_{i ∈ Λ} A_i = { x : x ∈ A_i for some i ∈ Λ }.

This all makes sense for any non-empty index set. Indeed, suppose that we
have some collection of sets indexed (that is, labelled) by a set Λ. Suppose
the set with label α ∈ Λ is denoted by A_α. The union of all the A_α's is
defined to be

⋃_{α ∈ Λ} A_α = { x : x ∈ A_α for some α ∈ Λ }.

If Λ = N, one often writes ⋃_{i=1}^{∞} A_i for ⋃_{α ∈ N} A_α.

Examples 1.9.

1. Suppose that Λ = { 1, 2, 3, . . . , 57, 58 } and A_j = [j, j + 1] for each j ∈ Λ.
(So, for example, with j = 7, A_7 = [7, 7 + 1] = [7, 8].) Then

⋃_{j=1}^{58} A_j = [1, 59].


2. Suppose that Λ = N and A_j = [1, j + 1] for j ∈ N. Then

⋃_{j=1}^{∞} A_j = [1, ∞).

To see this, suppose that x ∈ ⋃_{j=1}^{∞} A_j. Then x is an element of at least
one of the A_j's, that is, there is some j_0, say, in N such that x ∈ A_{j_0}.
This means that x ∈ [1, j_0 + 1], that is, 1 ≤ x ≤ j_0 + 1 and so certainly
x ∈ [1, ∞). It follows that lhs ⊆ rhs.
Now suppose that x ∈ [1, ∞). Then, in particular, x ≥ 1. Let n
be any natural number satisfying n > x. Then certainly x satisfies
1 ≤ x ≤ n + 1 which means that x ∈ A_n and so x ∈ ⋃_{j=1}^{∞} A_j. Hence
rhs ⊆ lhs and the equality lhs = rhs follows.

3. Suppose that Λ is the interval (0, 1) and, for each α ∈ (0, 1), A_α is given
by A_α = { (x, y) ∈ R² : x = α }. In other words, A_α is the vertical line
x = α in the plane R². Then

⋃_{α ∈ Λ} A_α = { (x, y) ∈ R² : 0 < x < 1 }

which is the vertical strip in R² with boundary edges given by the lines
with x = 0 and x = 1, respectively. Note that these lines (boundary
edges) are not part of the union of the A_α's.

4. Let Λ be the interval [3, 5] and for each α ∈ [3, 5] let A_α = { α }. In other
words, A_α consists of just one point, the real number α. Then

⋃_{α ∈ Λ} A_α = [3, 5]

which just says that the interval [3, 5] is the union of all its points (as it
should be).
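
The definition of a union over an index set can be imitated in Python when the family is finite. The sketch below is only a discrete analogue of the first example: as an assumption made for illustration, the interval [j, j + 1] is replaced by the two-element set { j, j + 1 }, and the names indexed_union and family are our own.

```python
def indexed_union(family):
    """Union of the sets family[label] over all labels in the index set."""
    result = set()
    for label in family:
        result |= family[label]  # x is in the union if x is in some member
    return result


# Index set { 1, 2, ..., 58 }; the set with label j is { j, j + 1 }.
family = {j: {j, j + 1} for j in range(1, 59)}
total = indexed_union(family)
```

The result is the set of integers from 1 to 59, mirroring the way the intervals [j, j + 1] fit together to give [1, 59].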

A similar discussion can be made regarding intersections.

A_1 ∩ A_2 ∩ A_3 = { x : x ∈ A_1 and x ∈ A_2 and x ∈ A_3 }
= { x : x belongs to every one of the sets A_1, A_2, A_3 }
= { x : x ∈ A_i for each of i = 1, 2 and 3 }
= { x : x ∈ A_i for all i ∈ { 1, 2, 3 } }.

In general, if { A_α } is any collection of sets indexed by the (non-empty)
set Λ, then the intersection of the A_α's is

⋂_{α ∈ Λ} A_α = { x : x ∈ A_α for all α ∈ Λ }.


If Λ = { 1, 2, . . . , n }, we usually write ⋂_{i=1}^{n} A_i for ⋂_{α ∈ Λ} A_α and if Λ = N, then
we usually write ⋂_{i=1}^{∞} A_i for ⋂_{α ∈ Λ} A_α.

Examples 1.10.

1. Suppose that Λ = N and for each j ∈ N, let A_j = [0, j]. Then

⋂_{j ∈ N} A_j = [0, 1].

2. Let Λ = N and set A_j = [j, j + 1] for j ∈ N. Then

⋂_{j=1}^{∞} A_j = ∅.

3. Let Λ = N and set A_j = [j, ∞) for j ∈ N. Then

⋂_{j=1}^{∞} A_j = ∅.

To see this, note that x ∈ ⋂_{j=1}^{∞} A_j provided that x belongs to every A_j.
This means that x satisfies j ≤ x for all j ∈ N. But clearly this
fails whenever j is a natural number strictly greater than x. In other
words, there are no real numbers which satisfy this criterion.

4. Suppose that Λ = N and for each k ∈ N let A_k be the interval given by
A_k = [0, 1/k). Then, in this case,

⋂_{k=1}^{∞} A_k = { 0 }.

This follows because the only non-negative real number which is smaller
than every 1/k (where k ∈ N) is zero.

5. Let Λ = N and let A_k = [0, 1 + 1/k] for k ∈ N. Then

⋂_{k=1}^{∞} A_k = [0, 1].

Indeed, [0, 1] ⊆ A_k for every k and if x ∉ [0, 1] then x must fail to belong
to some A_k.
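
The reasoning in the fourth example can be made concrete. For a given x > 0, the sketch below finds the first k for which x fails to lie in A_k = [0, 1/k), which is why no positive number survives in the intersection. The function names in_A and first_k_excluding are our own, and exact rational arithmetic (fractions.Fraction) is used to avoid rounding issues.

```python
from fractions import Fraction
from math import ceil


def in_A(x, k):
    """Membership test for the interval A_k = [0, 1/k)."""
    return 0 <= x < Fraction(1, k)


def first_k_excluding(x):
    """For x > 0, return the smallest k in N with x not in A_k.

    x fails to belong to A_k exactly when 1/k <= x, that is, when k >= 1/x,
    so the smallest such k is the ceiling of 1/x.
    """
    return ceil(1 / Fraction(x))


x = Fraction(3, 10)
k = first_k_excluding(x)
```

For x = 3/10 this gives k = 4, since 1/4 ≤ 3/10 < 1/3. By contrast, 0 belongs to every A_k.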


Theorem 1.11. Suppose that A and B_α, for α ∈ Λ, are given sets. Then

(1) A ∪ ⋂_{α ∈ Λ} B_α = ⋂_{α ∈ Λ} (A ∪ B_α).

(2) A ∩ ⋃_{α ∈ Λ} B_α = ⋃_{α ∈ Λ} (A ∩ B_α).

Proof. (1) Suppose that x ∈ A ∪ ⋂_α B_α. If x ∈ A, then x ∈ A ∪ B_α for
all α and so x ∈ ⋂_α (A ∪ B_α). If x ∉ A, then it must be the case that
x ∈ ⋂_α B_α, in which case x ∈ B_α for all α and so x ∈ ⋂_α (A ∪ B_α). We
have shown that A ∪ ⋂_α B_α ⊆ ⋂_α (A ∪ B_α).
To establish the reverse inclusion, suppose that x ∈ ⋂_α (A ∪ B_α). Then
x ∈ A ∪ B_α for every α ∈ Λ. If x ∈ A, then certainly x ∈ A ∪ ⋂_α B_α. If
x ∉ A, then we must have that x ∈ B_α for every α, that is, x ∈ ⋂_α B_α.
But then it follows that x ∈ A ∪ ⋂_α B_α.
Hence ⋂_α (A ∪ B_α) ⊆ A ∪ ⋂_α B_α and so the equality

A ∪ ⋂_α B_α = ⋂_α (A ∪ B_α)

follows.

(2) The proof of this proceeds along similar lines to part (1).

Chapter 2

The Real Numbers

In this chapter, we will discuss the properties of R, the real number system.
It might well be appropriate to ask exactly what a real number is. It is
the job of mathematics to set out clear descriptions of the objects within its
scope, so it is not at all unreasonable to expect an answer to this. One must
start somewhere. For example, in geometry, one might take the concept
of "point" as a basic undefined object. Lines are then specified by pairs
of points: the line passing through them. Beginning with the natural
numbers, N, one can construct Z and from Z one constructs the rationals, Q.
Finally from Q it is possible to construct the real numbers R. We will not
do this here, but rather we will take a close look at the structure and special
properties of R. Of course, everybody knows that numbers can be added
and multiplied and even subtracted and it makes sense to divide one number
by another (as long as the latter, the denominator, is not zero). We can also
compare two numbers and discuss which is the larger. It is precisely these
properties (or axioms) that we wish to isolate and highlight.

Arithmetic

To each pair of real numbers a, b ∈ R, there corresponds a third, denoted
a + b. This pairing, denoted "+" and called addition, obeys

(A1) a + (b + c) = (a + b) + c, for all a, b, c ∈ R.

(A2) a + b = b + a, for all a, b ∈ R.

(A3) There is a unique element, denoted 0, in R such that a + 0 = a, for
any a ∈ R.

(A4) For any a ∈ R, there is a unique element (denoted −a) in R such that
a + (−a) = 0.

The properties (A1) to (A4) say that R is an abelian group with respect to
the binary operation "+".


Next, we consider multiplication. To each pair a, b ∈ R, there is a
third, denoted a.b, the "product" of a and b. The operation ".", called
multiplication, obeys

(A5) a.(b.c) = (a.b).c, for any a, b, c ∈ R.

(A6) a.b = b.a, for any a, b ∈ R.

(A7) There is a unique element, denoted 1, in R, with 1 ≠ 0 and such that
a.1 = a, for any a ∈ R.

(A8) For any a ∈ R with a ≠ 0, there is a unique element in R, written a⁻¹
or 1/a, such that a.a⁻¹ = 1. The element a⁻¹ is called the (multiplicative)
inverse, or reciprocal, of a.

(A9) a.(b + c) = a.b + a.c, for all a, b, c ∈ R.

Remarks 2.1.

1. 0⁻¹ is not defined. The element 0 has no reciprocal. Such an object
simply does not exist in R. 1/0 has no meaning.

2. Subtraction is given by a − b = a + (−b), for a, b ∈ R.

3. Division is defined via a ÷ b = a.(b⁻¹) ( = a.(1/b) = a/b) provided b ≠ 0. If
it should happen that b = 0, then the expression a/b has no meaning.

4. It is usual to omit the dot and write just ab for the product a.b. There
is almost never any confusion from this.

All the familiar arithmetic results are consequences of the above properties
(A1) to (A9).

Examples 2.2.

1. For any x ∈ R, x.0 = 0.

Proof. By (A3), 0 + 0 = 0 and so x.(0 + 0) = x.0. Hence, by (A9),
x.0 + x.0 = x.0. Adding −(x.0) to both sides gives

(x.0 + x.0) + (−(x.0)) = x.0 + (−(x.0))
= 0, by (A4).

Hence, by (A1), x.0 + (x.0 + (−(x.0))) = 0 and so, using property (A4)
again, we get x.0 + 0 = 0. However, by (A3), x.0 + 0 = x.0 and so by
equating these last two expressions for x.0 + 0 we obtain x.0 = 0, as
required.


2. For any x, y ∈ R, x.(−y) = −(x.y).

Proof.

(x.y) + x.(−y) = x.(y + (−y)) , by (A9),
= x.0 , by (A4),
= 0, by the previous result.

By (A4) (uniqueness), −(x.y) must be the same as x.(−y).

3. For any x ∈ R, −(−x) = x.

Proof. We have

x = x + 0, by (A3),
= x + ((−x) + (−(−x))) , by (A4),
= (x + (−x)) + (−(−x)) , by (A1),
= 0 + (−(−x)) , by (A4),
= −(−x) , by (A2) and (A3)

as required.

4. For any x, y ∈ R, x.y = (−x).(−y).

Proof. By example 2, above, α.(−β) = −(α.β) for any α, β ∈ R. If we
now choose α = −x and β = y, we get

(−x).(−y) = −((−x).y)
= −(y.(−x)) , by (A6),
= −(−(y.x)) , by example 2, above,
= −(−(x.y)) , by (A6),
= x.y by example 3, above,

and we are done.


Order properties

Here we formalize the idea of one number being greater than another. We
can order two numbers by thinking of the larger as being the "higher" in
order. More precisely, there is a relation < (read "less than") between
elements of R satisfying the following:

(A10) For any a, b ∈ R, exactly one of the following is true:

a < b, b < a or a = b (trichotomy).

The notation u > a (read "u is greater than a") means that a < u.

(A11) If a < b and b < c, then a < c.

(A12) If a < b, then a + c < b + c, for any c ∈ R.

(A13) If a < b and α > 0, then aα < bα.

Notation. We write a ≤ b to signify that either a < b is true or else a = b
is true. In view of (A10), we can say that a ≤ b means that it is false that
a > b. The notation x ≥ w is used to mean that w ≤ x and as already noted
above, x > w is used to mean w < x.
By (A10), if x ≠ 0, then either x > 0 or else x < 0. If x > 0, then x is
said to be (strictly) positive and if x < 0, we say that x is (strictly) negative.
Thus, if x is not zero, then it is either positive or else it is negative. It is
quite common to call a number x positive if it obeys x ≥ 0 or negative if it
obeys x ≤ 0. Should it be necessary to indicate that x is not zero, then one
adds the adjective "strictly".

Examples 2.3.

1. For any x ∈ R, we have x > 0 ⟺ (−x) < 0.

Proof. Using (A12), we have

0 < x ⟹ 0 + (−x) < x + (−x) (adding (−x) to both sides),
⟹ (−x) < 0 since rhs = 0, by (A4).

Conversely, again from (A12),

(−x) < 0 ⟹ (−x) + x < 0 + x (adding x to both sides),
⟹ 0 < x by (A2), (A3) and (A4)

and the result follows.


2. For any x ≠ 0, we have x² > 0.

Proof. Since x ≠ 0, we must have either x > 0 or x < 0, by (A10). If
x > 0, then by (A13) we have x² > 0 (take a = 0, b = α = x). On the
other hand, if x < 0, then −x > 0 by the example above. Hence, by
(A13) (with a = 0, b = α = (−x)), it follows that (−x)(−x) > 0.
But we know (from the arithmetic properties) that (−α)(−β) = αβ, for
any α, β ∈ R and so we have

x² = x x = (−x)(−x) > 0

as required.

The number 1 was introduced in (A7). If we set x = 1 here, then we
see that 1 = 1² > 0, i.e., 1 > 0. We have deduced that the number 1 is
positive. Nobody would doubt this, but we see explicitly that this is a
consequence of our set-up. Note that it follows from this, by (A12), that
a < a + 1, for any a ∈ R.

3. If a, b ∈ R with a ≤ b, then −b ≤ −a.

Proof. If a = b then certainly −a = −b, so we need only consider the
case when a < b.

a < b ⟹ a + (−a) < b + (−a) by (A12),
⟹ 0 < b + (−a)
⟹ −b < (−b) + b + (−a) by (A12) and (A4),
⟹ −b < −a by (A1), (A2) and (A4)

and the result follows.

From now on, we will work with real numbers and inequalities just as we
normally would and will not follow through a succession of steps invoking
the various listed properties as required as we go. Suffice it to say that we
could do so if we wished.
Next, we introduce a very important function, the modulus or absolute
value.


Definition 2.4. For any x ∈ R, the modulus (or absolute value) of x is the
number |x| defined according to the rule

|x| = x, if x ≥ 0,
|x| = −x, if x < 0.

For example, |5| = 5, |0| = 0, |−3| = 3 and |−1/2| = 1/2. Note that |x| is never
negative. We also see that |x| = max{ x, −x }.
Let f(x) = x and g(x) = −x. Then |x| = f(x) when x ≥ 0 and
|x| = g(x) when x < 0. Now, we know what the graphs of y = f(x) = x and
y = g(x) = −x look like and so we can sketch the graph of the function |x|.
It is made up of two straight lines, meeting at the origin.

Figure 2.1: The absolute value function |x| (the lines y = −x and y = x, meeting at the origin).

The basic properties of the absolute value are contained in the following
two propositions. They are used time and time again in analysis and it is
absolutely essential to be fluent in their use.
Proposition 2.5.
(i) For any a, b ∈ R, we have |ab| = |a| |b|.

(ii) For any a ∈ R and r > 0, the inequality |a| < r is equivalent to the
pair of inequalities −r < a < r.
Proof. (i) We just consider the various possibilities. If either a or b is zero,
then so is the product ab. Hence |ab| = 0 and at least one of |a| or |b| is also
zero. Therefore |ab| = 0 = |a| |b|. If both a > 0 and b > 0, then ab > 0 and
we have |ab| = ab, |a| = a and |b| = b and so |ab| = |a| |b| in this case.
Now, if a > 0 but b < 0, then ab < 0 so we have |a| = a, |b| = −b and
|ab| = −ab = |a| |b|. The case a < 0 and b > 0 is similar.
Finally, suppose that both a < 0 and b < 0. Then ab > 0 and we have
|ab| = ab, |a| = −a and |b| = −b. Hence, |ab| = ab = (−a)(−b) = |a| |b|.


(ii) Suppose that |a| < r. Then max{ a, −a } < r and so both a < r
and −a < r. In other words, a < r and −r < a which can be written as
−r < a < r.
On the other hand, if −r < a < r, then both a < r and −a < r so that
max{ a, −a } < r. That is, |a| < r, as required.

Remark 2.6. Putting b = −1 in (i), above, and using the fact that |−1| = 1,
we see that |−a| = |a|.

Proposition 2.7. For any real numbers a and b,

(i) |a + b| ≤ |a| + |b|.

(ii) |a − b| ≤ |a| + |b|.

(iii) | |a| − |b| | ≤ |a − b|.

Proof. (i) We have a + b ≤ |a| + |b| and −(a + b) = −a − b ≤ |a| + |b|. Hence

|a + b| = max{ a + b, −(a + b) } ≤ |a| + |b| .

(ii) Let c = −b and apply (i) to the real numbers a and c to get the
inequality |a + c| ≤ |a| + |c|. But then this means that |a − b| ≤ |a| + |b|.

(iii) We have
|a| = |(a − b) + b| ≤ |a − b| + |b|

by part (i) (with (a − b) replacing a). This implies that |a| − |b| ≤ |a − b|.
Swapping around a and b, we have −(|a| − |b|) = |b| − |a| ≤ |b − a| = |a − b|
and therefore

| |a| − |b| | = max{ |a| − |b| , −(|a| − |b|) } ≤ |a − b|

as required.
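
The three inequalities of Proposition 2.7 can be checked numerically on a grid of sample values. This is a finite spot check under our own choice of sample points, not a proof; the function name check_modulus_inequalities is hypothetical, and exact rationals are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product


def check_modulus_inequalities(values):
    """Test parts (i), (ii) and (iii) for every pair (a, b) from values."""
    for a, b in product(values, repeat=2):
        if abs(a + b) > abs(a) + abs(b):        # part (i)
            return False
        if abs(a - b) > abs(a) + abs(b):        # part (ii)
            return False
        if abs(abs(a) - abs(b)) > abs(a - b):   # part (iii)
            return False
    return True


# Sample points -3, -11/4, ..., 11/4, 3 (both signs and zero are included).
sample = [Fraction(n, 4) for n in range(-12, 13)]
all_hold = check_modulus_inequalities(sample)
```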

If a and b are real numbers, how far apart are they? For example, if a = 7
and b = 11 then we might say that the distance between a and b is 4. If,
on the other hand, a = −10 and b = 6, then we would say that the distance
between them is 16. In either case, we notice that the distance is given by
|a − b|. It is extremely useful to view |a − b| as the distance between the
numbers a and b. For example, to say that |a − b| is very small is to say
that a and b are close to each other.


Proposition 2.8. Let a, b ∈ R be given and suppose that for any given ε > 0,
a and b obey the inequality a < b + ε. Then a ≤ b. In particular, if x < ε
for all ε > 0, then x ≤ 0.

Proof. We know that either a ≤ b or else a > b. Suppose the latter were
true, namely, a > b. Set ε_1 = a − b. Then ε_1 > 0 and a = b + ε_1. Taking
ε = ε_1, we see that this conflicts with the hypothesis that a < b + ε for every
ε > 0 (it fails for the choice ε = ε_1). We conclude that a > b must be false
and so a ≤ b.
For the last part, simply set a = x and b = 0 to get the desired conclusion.

We have listed a number of properties obeyed by the real numbers:

(A1) . . . (A9) arithmetic

(A10) . . . (A13) order.

Is this it? Are there any more to be included? We notice that all of these
properties are satisfied by the rational numbers, Q. Are all real numbers
rational, i.e., is it true that Q = R? Or do we need to consider yet further
properties which distinguish between Q and R? Consider an apparently
unrelated question. Do all numbers have square roots? Since a² is positive
for any a ∈ R, it is clear that no negative number can have a square root in
R. (Indeed, it is the consideration of C, the complex numbers, which allows
for square roots of negative numbers.) So we ask, does every positive real
number have a square root? Does every natural number n have a square
root in R? In particular, is there such a real number as the square root
of 2? It would be nice to think that there is such a real number. In fact,
according to Pythagoras' Theorem, this should be the length of the diagonal
of a square whose sides have unit length. The following proposition tells us
that there is certainly no such rational number.

Proposition 2.9. There are no integers m, n ∈ N satisfying m² = 2n². In
particular, √2 is not a rational number.

Proof. To say that √2 is rational is to say that there are integers m and
n (with n ≠ 0) such that m/n = √2. This means that m²/n² = 2 and so
m² = 2n² for m, n ∈ Z. (By replacing m or n by −m or −n, if necessary,
we may assume that m and n in this last equality are both positive.) So the
fact that √2 ∉ Q is a consequence of the first part of the proposition.
Consider the equality
m² = 2n² (*)
To show that m² = 2n² is impossible for any m, n ∈ N, suppose the contrary,
namely that there are numbers m and n in N obeying (*). We will show
that this leads to a contradiction.


Indeed, if m² = 2n², then m² is even. The square of an odd number is
odd and so it follows that m must also be even. This means that we can
express m as m = 2k for some suitable k ∈ N. But then

(2k)² = m² = 2n²

which means that 2k² = n² and so n² is even. Arguing as above, we deduce
that n can be expressed as n = 2j for some j ∈ N. Substituting, we see that
k and j also obey (*), namely, k² = 2j². This tells us that m/2 and n/2 are
integers also obeying (*).
Repeating this whole argument with m′ = m/2 and n′ = n/2, we find
that both m′/2 and n′/2 belong to N and also satisfy (*). In other words,
m/2² and n/2² belong to N and obey (*). We can keep repeating this
argument to deduce that m/2^j and n/2^j are integers obeying (*). In particular,
m/2^j ∈ N implies that m/2^j ≥ 1 and so m ≥ 2^j. But this holds for any
j ∈ N and we can take j as large as we wish. We can take j so large that
2^j > m. This leads to a contradiction, as we wanted it to. We finally
conclude that there are no natural numbers m and n obeying (*) and as a
consequence, there is no element of Q whose square is equal to 2, that is,
√2 is not a rational number.
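
A computer search cannot replace this argument, but it can illustrate the statement. The sketch below looks for natural numbers m, n with m² = 2n² for all n up to a chosen bound and finds none. The function name find_solution and the bound 10000 are our own choices; math.isqrt is the exact integer square root from the Python standard library.

```python
from math import isqrt


def find_solution(limit):
    """Search 1 <= n <= limit for a natural number m with m*m == 2*n*n.

    Returns the first pair (m, n) found, or None if there is none.
    """
    for n in range(1, limit + 1):
        target = 2 * n * n
        m = isqrt(target)      # largest integer whose square is <= target
        if m * m == target:    # target would have to be a perfect square
            return (m, n)
    return None


outcome = find_solution(10000)
```

Only the candidate m = isqrt(2n²) needs testing for each n, since no other natural number can have square equal to 2n².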

Remark 2.10. A somewhat similar argument can be used to show that many
other numbers do not have square roots in Q. For example, √3 ∉ Q. In
fact, one can show that if n ∈ N, then either √n ∈ N, that is, n is a perfect
square, or else √n ∉ Q. For example, √16 = 4 ∈ N but √17 ∉ Q.
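
Taking the dichotomy of this remark for granted, deciding whether √n is rational reduces to testing whether n is a perfect square, which can be done exactly in integer arithmetic. The function name is_perfect_square is our own.

```python
from math import isqrt


def is_perfect_square(n):
    """Return True when the natural number n is the square of a natural number."""
    r = isqrt(n)       # exact integer square root, rounded down
    return r * r == n


# The perfect squares among 1, 2, ..., 30.
squares = [n for n in range(1, 31) if is_perfect_square(n)]
```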

Returning to the discussion of the defining properties of R, we still have
to pinpoint the extra property that R has which is not shared by Q. First
we need some terminology.

Definition 2.11. A non-empty subset S of R is said to be bounded from above
if there is some M ∈ R such that

a ≤ M

for all a ∈ S. Any such number M is called an upper bound for the set S.
Evidently, if M is an upper bound for S, then so is any number greater
than M.
We say that a non-empty subset S of R is bounded from below if there
is some m ∈ R such that
m ≤ a
for all a ∈ S. Any such number m is called a lower bound for the set S. If
m is a lower bound for S, then so is any number less than m.
If S is both bounded from above and from below, then S is said to be
bounded.


Example 2.12. Consider the set A = (−6, 4]. Then A is bounded because
any x ∈ A obeys −6 ≤ x ≤ 4. (In fact, any x ∈ A obeys the inequalities
−6 < x ≤ 4.) Any real number greater than or equal to 4 is an upper bound
for A and any real number less than or equal to −6 is a lower bound for A.
The set A has a maximal element, namely 4, but A does not have a minimal
element.
Let

B = (1, 3/2) ∪ (2, 5/2) ∪ (3, 7/2) ∪ · · · = ⋃_{k=1}^{∞} (k, k + 1/2).

Then the set B is bounded from below (the number 1 is clearly a lower
bound for B). However, B contains k + 1/4 for every k ∈ N and so B is not
bounded from above (so B is not bounded). We also see that B does not
have a minimal element.

Remark 2.13. What does it mean to say that a set S is not bounded from
above? Consider the inequality

x ≤ M. (*)

Now, given S and some particular real number M, the inequality (*) may
hold for some elements x in S but may fail for other elements of S. To say
that S is bounded from above is to say that there is some M such that (*)
holds for all elements x ∈ S. If S is not bounded from above, then it must
be the case that whatever M we try, there will always be some x in S for
which (*) fails, that is, for any given M there will be some x ∈ S such that
x > M. In particular, if we try M = 1, then there will be some element
(many, in fact) in S greater than 1. Let us pick any such element and label
it as x_1. Then we have x_1 ∈ S and x_1 > 1.
We can now try M = 2. Again, (*) must fail for at least one element
in S and it could even happen that x_1 > 2. To ensure that we get a new
element from S, let M = max{ 2, x_1 }. Then there must be at least one
element of S greater than this M. Let x_2 denote any such element. Then
we have x_2 ∈ S and x_2 > 2 and x_2 ≠ x_1.
Now setting M = max{ 3, x_1, x_2 }, we may say that there is some element
in S, which we choose to denote by x_3, such that x_3 > 3 and x_3 ≠ x_1 and
x_3 ≠ x_2. We can continue to do this and so we see that if S is not bounded
from above, then there exist elements x_1, x_2, x_3, . . . , x_n, . . . (which are all
different) such that x_n > n for each n ∈ N.
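
The selection procedure of this remark can be sketched as a short program. Everything specific here is an assumption made for illustration: we take S to be the set of perfect squares { 1, 4, 9, . . . }, which is not bounded from above, and we suppose that a function (here called element_above) is available which, for a given integer M, produces some element of S greater than M.

```python
from math import isqrt


def element_above(M):
    """Hypothetical oracle for S = {1, 4, 9, ...}: an element of S exceeding M."""
    k = isqrt(max(M, 0)) + 1   # k > sqrt(M), hence k*k > M
    return k * k


def pick_sequence(count, oracle):
    """Choose x_1, ..., x_count in S, all different, with x_n > n."""
    xs = []
    for n in range(1, count + 1):
        M = max([n] + xs)      # M = max{ n, x_1, ..., x_(n-1) } as in the remark
        xs.append(oracle(M))   # some element of S greater than this M
    return xs


chosen = pick_sequence(4, element_above)
```

Because each new element exceeds all earlier ones as well as n, the chosen elements are automatically distinct.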

The following concepts play an essential role.


Definition 2.14. Suppose that S is a non-empty subset of R which is bounded
from above. The number M is the least upper bound (lub) of S if
(i) a ≤ M for all a ∈ S (i.e., M is an upper bound for S).

(ii) If M′ is any upper bound for S, then M ≤ M′.

If S is a non-empty subset of R which is bounded from below, then the
number m is the greatest lower bound (glb) of S if
(i) m ≤ a for all a ∈ S (i.e., m is a lower bound for S).

(ii) If m′ is any lower bound for S, then m′ ≤ m.

Note that the least upper bound and the greatest lower bound of a set S
need not themselves belong to S. They may or they may not. The least
upper bound is also called the supremum (sup) and the greatest lower bound
is also called the infimum (inf). The ideas are illustrated by some examples.
Examples 2.15.
1. Let S be the following set consisting of 4 elements, S = { −3, 1, 2, 5 }.
Then clearly S is bounded from above and from below. The least upper
bound is 5 and the greatest lower bound is −3.

2. Let S be the interval S = (−6, 4]. Then lub S = 4 and glb S = −6. Note
that 4 ∈ S whereas −6 ∉ S.

3. Let S = (1, ∞). S is not bounded from above and so has no least upper
bound. S is bounded from below and we see that glb S = 1. Note that
glb S ∉ S in this case.
Remark 2.16. Suppose that M is the lub for a set S. Let ε > 0. Then
M − ε < M and since any upper bound M′ for S has to obey M ≤ M′, we
see that M − ε cannot be an upper bound for S. But this means that it
is false that a ≤ M − ε for all a ∈ S. In other words, there must be some
a ∈ S which satisfies M − ε < a. Since M is an upper bound for S, we also
have a ≤ M and so a obeys

M − ε < a ≤ M.

So no matter how small ε may be, there will always be some element a ∈ S
(possibly depending on ε and there may be many) such that M − ε < a ≤ M,
where M = lub S.

For any ε > 0, there is a ∈ S such that lub S − ε < a ≤ lub S.


Now suppose that m = glb S. Then for any ε > 0 (however small), we
note that m < m + ε and so m + ε cannot be a lower bound for S (because
all lower bounds for S must be less than or equal to m). Hence, there is
some a ∈ S such that a < m + ε, which means that

m ≤ a < m + ε.

For any ε > 0, there is a ∈ S such that glb S ≤ a < glb S + ε.

Remark 2.17. As already noted above, lub S and glb S may or may not
belong to the set S. If it should happen that lub S ∈ S, then in this case
lub S (or sup S) is the maximum element of S, denoted max S. If glb S ∈ S,
then glb S (or inf S) is the minimum element of S, denoted min S.
For example, the interval S = (2, 5] is bounded and, by inspection, we
see that sup S = 5 and inf S = 2. Since sup S = 5 ∈ S, the set S does
indeed have a maximum element, namely, 5 = sup S. However, inf S ∉ S
and so S has no minimum element.
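
For the interval S = (2, 5] of this example, the approximation property of the infimum can be made explicit: for each ε > 0 one can write down an element a of S with 2 ≤ a < 2 + ε, even though 2 itself is not in S. The sketch below is one possible choice of such an element; the function name near_infimum and the particular formula are our own.

```python
from fractions import Fraction


def near_infimum(eps):
    """Return an element a of S = (2, 5] with 2 <= a < 2 + eps, for eps > 0."""
    # Step half of eps above 2, but never more than 3/2, so that a stays in S.
    step = min(Fraction(eps), Fraction(3)) / 2
    return 2 + step


a_small = near_infimum(Fraction(1, 1000))   # very close to inf S = 2
a_large = near_infimum(10)                  # eps larger than the length of S
```

In both cases the returned element is strictly greater than 2, in keeping with the fact that S has no minimum element.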

We are now in a position to discuss the final property satisfied by R and


it is precisely this last property which distinguishes R from Q.

(A14) (The completeness property of R)


Any non-empty subset of R which is bounded from above possesses a
least upper bound.
Any non-empty subset of R which is bounded from below possesses a
greatest lower bound.

These statements might appear self-evident, but as we will see, they have
far-reaching consequences. We note here that these two statements are not
independent, in fact, each implies the other, that is, they are equivalent.

Remark 2.18. It is very convenient to think of $\mathbb{R}$ as the set of points on a line
(the real line). Indeed, this is standard procedure when sketching graphs of
functions where the coordinate axes represent the real numbers.


Imagine now the following situation.

[Figure 2.2: The real line has no gaps. An arrow marks the point of R separating the set A (to its left) from the set B (to its right).]

The set A consists of all points on the line (real numbers) to the left
of the arrow and B comprises all those points to the right. Numbers are
bigger the more they are to the right. The arrow points to the least upper
bound of A (which is also the greatest lower bound of B). The completeness
property (A14) ensures the existence of the real number in R that the arrow
supposedly points to. There are no gaps or missing points on the real
line. We can think of the integers Z or even the rationals Q as collections
of dots on a line, but it is property (A14) which allows us to visualize R as
the whole unbroken line itself.
The next result is so obvious that it seems hardly worth noting. However,
it is very important and follows from property (A14).
Theorem 2.19 (Archimedean Property). For any given $x \in \mathbb{R}$, there is some
$n \in \mathbb{N}$ such that $n > x$.
Proof. Let $x \in \mathbb{R}$ be given. We use the method of proof by contradiction,
so suppose that there is no $n \in \mathbb{N}$ obeying $n > x$. This means that $n \le x$
for all $n \in \mathbb{N}$, that is, $x$ is an upper bound for $\mathbb{N}$ in $\mathbb{R}$. By the completeness
property, (A14), $\mathbb{N}$ has a least upper bound, $\alpha$, say. Then $\alpha$ is an upper
bound for $\mathbb{N}$ so that

\[ n \le \alpha \qquad (*) \]

for all $n \in \mathbb{N}$. Since $\alpha$ is the least upper bound, $\alpha - 1$ cannot be an upper
bound for $\mathbb{N}$ and so there must be some $k \in \mathbb{N}$ such that $\alpha - 1 < k$. But
we can rewrite this as $\alpha < k + 1$, which contradicts $(*)$ since $k + 1 \in \mathbb{N}$. We
conclude that there is some $n \in \mathbb{N}$ obeying $n > x$, as claimed.

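Corollary 2.20.
(i) For any given $\varepsilon > 0$, there is some $n \in \mathbb{N}$ such that $\frac{1}{n} < \varepsilon$.

(ii) For any $\delta > 0$, $\varepsilon > 0$, there is $n \in \mathbb{N}$ such that $\frac{\delta}{n} < \varepsilon$.
Proof. (i) Let $\varepsilon > 0$ be given. By the Archimedean Property, there is some
$n \in \mathbb{N}$ such that $n > 1/\varepsilon$. But then this gives $1/n < \varepsilon$, as required.
(ii) For given $\delta > 0$ and $\varepsilon > 0$, set $\eta = \varepsilon/\delta$. By (i), there is $n \in \mathbb{N}$ such
that $1/n < \eta = \varepsilon/\delta$ and so $\delta/n < \varepsilon$.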
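
The proof of Corollary 2.20 is constructive once the Archimedean Property has supplied a natural number beyond a given bound. The following Python sketch mirrors that construction with exact rational arithmetic. The function names `natural_above` and `n_for_tolerance` are our own, and the inputs are assumed to be rationals (Python's `Fraction`) rather than arbitrary real numbers.

```python
from fractions import Fraction
import math


def natural_above(x):
    """Return a natural number n (n >= 1) with n > x, as in Theorem 2.19."""
    # floor(x) + 1 is the smallest integer strictly greater than x;
    # max(1, ...) keeps the answer inside N = {1, 2, 3, ...}.
    return max(1, math.floor(x) + 1)


def n_for_tolerance(delta, eps):
    """Return n in N with delta / n < eps, for delta > 0 and eps > 0."""
    if delta <= 0 or eps <= 0:
        raise ValueError("delta and eps must be strictly positive")
    # Corollary 2.20(ii): pick n > delta / eps, so that delta / n < eps.
    return natural_above(Fraction(delta) / Fraction(eps))


if __name__ == "__main__":
    n = n_for_tolerance(Fraction(7), Fraction(1, 1000))
    print(n, Fraction(7) / n < Fraction(1, 1000))
```

Any larger value of n would serve equally well; the corollary asserts only that some such n exists.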


The next result is no surprise either.

Theorem 2.21. For any $a \in \mathbb{R}$, there is a unique integer $n \in \mathbb{Z}$ such that

\[ n \le a < n + 1. \]

Proof. Let $S = \{ k \in \mathbb{Z} : k > a \}$. By theorem 2.19, $S$ is not empty and is
bounded below (by $a$). Hence, by the completeness property (A14), $S$ has
a greatest lower bound $\alpha$, say, in $\mathbb{R}$. We have

\[ a \le \alpha \le k \]

for all $k \in S$. (The inequality $a \le \alpha$ follows because $a$ is a lower bound and
$\alpha$ is the greatest lower bound, and the inequality $\alpha \le k$ follows because $\alpha$ is
a lower bound of $S$.) Since $\alpha$ is the greatest lower bound, $\alpha + 1$ cannot be
a lower bound of $S$ and so there is some $m \in S$ such that $m < \alpha + 1$, that
is, $m - 1 < \alpha$.

[Figure 2.3: The integer part of a. On the real line, m - 1 = n lies to the left of a and m = n + 1 lies to its right.]

Now, $\alpha$ is a lower bound for $S$ and $m - 1 < \alpha$ and so $m - 1 \notin S$. But
then, by the defining property of $S$, this means that it is false that $m - 1 > a$.
In other words, we have $m - 1 \le a$. But $m \in S$ and so $m > a$ and so $m$
satisfies $m - 1 \le a < m$. Putting $n = m - 1$, we get $n \in \mathbb{Z}$ and $n$ satisfies
the required inequalities $n \le a < n + 1$.
To show the uniqueness of such $n \in \mathbb{Z}$, suppose that also $n' \in \mathbb{Z}$ obeys
$n' \le a < n' + 1$. Suppose that $n < n'$. Then $n + 1 \le n'$ and so the inequalities
$n' \le a$ and $a < n + 1$ give

\[ n' \le a < n + 1 \le n' \]

giving $n' < n'$, which is impossible. Similarly, the assumption that $n' < n$
would lead to the impossible inequality $n < n$. We conclude that $n = n'$,
which is to say that $n$ is unique.

Remark 2.22. For $x \in \mathbb{R}$, let $n \in \mathbb{Z}$ be the unique integer obeying the
inequalities $n \le x < n + 1$. Set $r = x - n$. Then we see that $0 \le x - n = r < 1$
and so $x = n + r$ with $n \in \mathbb{Z}$ and where $0 \le r < 1$. The unique integer $n$
here is called the integer part of the real number $x$ and is denoted by $[x]$ (or
sometimes by $\lfloor x \rfloor$).
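
As a small illustration of Remark 2.22, the following Python sketch splits a rational number into its integer part and the remainder r. The helper name `integer_and_fractional_part` is our own, and we assume the input is an exact rational (`Fraction`) so that no rounding occurs.

```python
from fractions import Fraction
import math


def integer_and_fractional_part(x):
    """Split x as n + r with n an integer and 0 <= r < 1."""
    n = math.floor(x)   # the unique integer with n <= x < n + 1
    r = x - n           # the remainder, which lies in [0, 1)
    return n, r


if __name__ == "__main__":
    for x in (Fraction(22, 7), Fraction(-7, 2), Fraction(5)):
        n, r = integer_and_fractional_part(x)
        print(x, "=", n, "+", r)
```

Note that for negative x the integer part lies to the left of x: the integer part of -7/2 is -4, not -3.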


Theorem 2.23. Between any pair of real numbers a < b, there are infinitely-
many rational numbers and also infinitely-many irrational numbers.
Proof. First, we shall show that there is at least one such rational, that is,
we shall show that for any given $a < b$ in $\mathbb{R}$, there is some $q \in \mathbb{Q}$ such that
$a < q < b$. The idea of the proof is as follows. If there is an integer between
$a$ and $b$, then we are done. In any case, we note that since the integers are
spread one unit apart, there should certainly be at least one integer between
$a$ and $b$ if the distance between $a$ and $b$ is greater than 1. If the distance
between $a$ and $b$ is less than 1, then we can open up the gap between them
by multiplying both by a sufficiently large (positive) integer, $n$, say. The
gap between $na$ and $nb$ is $n(b - a)$. Clearly, if $n$ is large enough, this value is
greater than 1. Then there will be some integer $m$, say, between $na$ and $nb$,
i.e., $na < m < nb$. But then we see (since $n$ is positive) that $a < m/n < b$
and $q = m/n$ is a rational number which does the job.
We shall now write this argument out formally. Let $n \in \mathbb{N}$ be sufficiently
large that $n(b - a) > 1$ so that $na + 1 < nb$ and let $m = [na] + 1$. Since
$[na] \le na < [na] + 1$, it follows that

\[ [na] \le na < \underbrace{[na] + 1}_{m} \le na + 1 < nb \]

and so $na < m < nb$ and hence $a < m/n < b$. (Note that $n > 0$, so this last
step is valid.) Setting $q = m/n$, we have that $q \in \mathbb{Q}$ and $q$ obeys $a < q < b$,
as required.
To see that there are infinitely-many rationals between $a$ and $b$, we just
repeat the above argument but with, say, $q$ and $b$ instead of $a$ and $b$. This
tells us that there is a rational, $q_2$, say, obeying $q < q_2 < b$. Once again,
repeating this argument, there is a rational, $q_3$, say, obeying $q_2 < q_3 < b$.
Continuing in this way, we see that for any $n \in \mathbb{N}$, there are $n$ rationals,
$q, q_2, \dots, q_n$ obeying

\[ a < q < q_2 < q_3 < \dots < q_n < b. \]

Hence it follows that there are infinitely-many rationals between $a$ and $b$.
To show that there are infinitely-many irrational numbers between $a$
and $b$, we use a trick together with the observation that if $r$ is rational
and $r \neq 0$, then $r/\sqrt{2}$ is irrational.
The trick is simply to apply the first part to the numbers
$a\sqrt{2}$ and $b\sqrt{2}$ to deduce that for any $n \in \mathbb{N}$ there are rational numbers
$r_1, r_2, \dots, r_n$ obeying

\[ a\sqrt{2} < r_1 < r_2 < \dots < r_n < b\sqrt{2}, \]

none of which is zero (should $0$ occur among them, find $n + 1$ such rationals and discard it).
Now let $\mu_j = r_j/\sqrt{2}$ for $j = 1, 2, \dots, n$. Then each $\mu_j$ is irrational and we
have

\[ a < \mu_1 < \mu_2 < \dots < \mu_n < b \]

and the result follows.
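
The first part of the proof of Theorem 2.23 is an explicit recipe: choose n with n(b - a) > 1 and then take m = [na] + 1. The sketch below follows that recipe in Python. It assumes that a and b are themselves rational (given as `Fraction` values), since a program cannot hold an arbitrary real number exactly, and the names `rational_between` and `rationals_between` are our own.

```python
from fractions import Fraction
import math


def rational_between(a, b):
    """Return a rational q with a < q < b, following the proof of Theorem 2.23."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1   # n > 1/(b - a), hence n(b - a) > 1
    m = math.floor(n * a) + 1         # m = [na] + 1, so na < m <= na + 1 < nb
    return Fraction(m, n)


def rationals_between(a, b, count):
    """Return count rationals q_1 < q_2 < ... , all strictly between a and b."""
    found = []
    lower = Fraction(a)
    for _ in range(count):
        lower = rational_between(lower, b)  # repeat the argument with q in place of a
        found.append(lower)
    return found


if __name__ == "__main__":
    print(rationals_between(Fraction(1, 3), Fraction(1, 2), 4))
```

The second function reproduces the "repeat the argument" step, each time replacing the left endpoint by the rational just found.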


As a further application of the Completeness Property of $\mathbb{R}$, we shall
show that any positive real number has a positive $n$th root.

Theorem 2.24. Let $x \ge 0$ and $n \in \mathbb{N}$ be given. Then there is a unique $s \ge 0$
such that $s^n = x$. The real number $s$ is called the (positive) $n$th root of $x$
and is denoted by $x^{1/n}$.

Proof. If $x = 0$, then we can take $s = 0$, so suppose that $x > 0$.
Let $A$ be the set $A = \{ t \ge 0 : t^n < x \}$. Then $0 \in A$ and so $A$ is not empty
and, by the Archimedean Property, there is some integer $K$ with $K > x$.
But then every $t \in A$ must obey $t < K$ because otherwise we would have
$t \ge K$ and therefore $t^n \ge K^n \ge K > x$, which is not possible for any $t \in A$.
This means that $A$ is bounded from above. By the Completeness Property
of $\mathbb{R}$, $A$ has a least upper bound, $\operatorname{lub} A = s$, say. Note now that, since $x > 0$,
by the Archimedean Property there is some $m \in \mathbb{N}$ such that $m > 1/x$.
Hence $m^n \ge m > 1/x$, which implies that $1/m^n < x$. Since $(1/m^n)^n \le 1/m^n$,
it follows that $1/m^n \in A$. This means that $s \ge 1/m^n$. In particular, $s > 0$.
Now, exactly one of the statements $s^n = x$, $s^n < x$ or $s^n > x$ is true.
We claim that $s^n = x$ and to show this we shall show that the last two
statements must be false.
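Indeed, suppose that $s^n < x$. For $k \in \mathbb{N}$, let $s_k = s(1 + \frac{1}{k})$. Then
evidently $s_k > s$ and we will show that $s_k^n < x$ for suitably large $k$.
Let $d = x - s^n$. Then $d > 0$ and

\[ x - s_k^n = x - s^n + s^n - s_k^n = d - (s_k^n - s^n) = d - s^n \bigl( (1 + \tfrac{1}{k})^n - 1 \bigr). \]

Now, writing $\beta = 1 + \frac{1}{k}$ and noting that $1 < \beta \le 2$, we estimate

\begin{align*}
(1 + \tfrac{1}{k})^n - 1 &= \beta^n - 1 \\
&= (\beta - 1)(\beta^{n-1} + \beta^{n-2} + \dots + 1) \\
&\le (\beta - 1)(2^{n-1} + 2^{n-2} + \dots + 1) \\
&\le (\beta - 1)\, n\, 2^n \\
&= \tfrac{1}{k}\, n\, 2^n.
\end{align*}

Hence

\[ s_k^n - s^n = s^n \bigl( (1 + \tfrac{1}{k})^n - 1 \bigr) \le \frac{s^n\, n\, 2^n}{k}. \]

For sufficiently large $k$, the right hand side of this inequality is less than $d$
and so

\[ x - s_k^n = x - s^n + s^n - s_k^n = d - (s_k^n - s^n) > 0. \]

It follows that if $k$ is large enough, then $s_k \in A$. But $s_k > s$, which means
that $s$ cannot be the least upper bound of $A$ and we have a contradiction.
Hence it must be false that $s^n < x$.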


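Suppose now that $s^n > x$ and let $\delta = s^n - x$. For given $k \in \mathbb{N}$, let
$t_k = s(1 - \frac{1}{k})$. Writing $\gamma = 1 - \frac{1}{k}$ and noting that $0 \le \gamma < 1$, we estimate
that

\begin{align*}
1 - (1 - \tfrac{1}{k})^n &= 1 - \gamma^n \\
&= -(\gamma^n - 1) \\
&= -(\gamma - 1)(\gamma^{n-1} + \gamma^{n-2} + \dots + 1) \\
&= (1 - \gamma)(\gamma^{n-1} + \gamma^{n-2} + \dots + 1) \\
&\le (1 - \gamma)\, n \\
&= \tfrac{1}{k}\, n.
\end{align*}

It follows that

\[ s^n - t_k^n = s^n \bigl( 1 - (1 - \tfrac{1}{k})^n \bigr) \le \tfrac{1}{k}\, s^n\, n < \delta \]

for sufficiently large $k$. But then this means that

\[ t_k^n - x = t_k^n - s^n + s^n - x = \delta - (s^n - t_k^n) > 0 \]

for large $k$. However, $t_k < s$ and since $s = \operatorname{lub} A$, it follows that $t_k$ is not an
upper bound for $A$. In other words, there is some $\tau \in A$ such that $\tau > t_k$
and therefore $\tau^n - x > t_k^n - x > 0$. However, $\tau \in A$ means that $\tau^n < x$,
which is a contradiction and so it is false that $s^n > x$.
We have now shown that $s^n < x$ is false and also that $s^n > x$ is false
and so we conclude that it must be true that $s^n = x$, as required.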
We have established the existence of some $s \ge 0$ such that $s^n = x$ and
so, finally, we must prove that such an $s$ is unique. If $x = 0$, then $s = 0$
obeys $s^n = 0 = x$. No $s \neq 0$ can obey $s^n = 0$ because $s^n (1/s)^n = 1 \neq 0$, so
$s = 0$ is the only solution to $s^n = 0$.
Now let $s > 0$ and $t > 0$. If $s > t$, then $s/t > 1$ so that $(s/t)^n > 1$ and
we find that $s^n > t^n$. Interchanging the roles of $s$ and $t$, it follows that if
$s < t$, then $s^n < t^n$. We conclude that if $s^n = x = t^n$ then both $s < t$ and
$s > t$ are impossible and so $s = t$.
The proof is complete.
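
The proof above shows that the root exists as lub A but does not compute it. As a numerical companion, here is a Python sketch that traps the root by bisection. This is a swapped-in technique and not the argument of the proof. It keeps a lower value `lo` with lo^n <= x and an upper bound `hi` of A with hi^n > x, and halves the gap until it is at most a tolerance `tol`. The name `nth_root_bracket` and the use of `Fraction` inputs are our own assumptions.

```python
from fractions import Fraction


def nth_root_bracket(x, n, tol):
    """Return (lo, hi) with lo**n <= x < hi**n and hi - lo <= tol."""
    x, tol = Fraction(x), Fraction(tol)
    if x < 0 or n < 1 or tol <= 0:
        raise ValueError("need x >= 0, n >= 1 and tol > 0")
    lo = Fraction(0)                 # 0**n = 0 <= x
    hi = max(x, Fraction(1)) + 1     # hi >= 2, so hi**n >= hi > x
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** n <= x:
            lo = mid                 # mid belongs to A, or is the root itself
        else:
            hi = mid                 # mid**n > x, so mid is an upper bound for A
    return lo, hi


if __name__ == "__main__":
    lo, hi = nth_root_bracket(2, 2, Fraction(1, 1000))
    print(float(lo), float(hi))
```

The least upper bound s of A always lies in the interval [lo, hi], so either endpoint approximates the n-th root of x to within `tol`.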


Principle of induction

Suppose that, for each $n \in \mathbb{N}$, $P(n)$ is a statement about the number $n$ such
that

(i) $P(1)$ is true.

(ii) For any $k \in \mathbb{N}$, the truth of $P(k)$ implies the truth of $P(k + 1)$.

Then $P(n)$ is true for all $n \in \mathbb{N}$.

Example 2.25. For any $n \in \mathbb{N}$,

\[ 1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n + 1)(2n + 1)}{6}. \]

Proof. For $n \in \mathbb{N}$, let $P(n)$ be the statement that

\[ 1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n + 1)(2n + 1)}{6}. \]

Then $P(1)$ is the statement that

\[ 1^2 = \frac{1(1 + 1)(2 + 1)}{6} \]

which is true.
Now suppose that $k \in \mathbb{N}$ and that $P(k)$ is true. We wish to show that
$P(k + 1)$ is also true. Since we are assuming that $P(k)$ is true, we see that

\begin{align*}
1^2 + 2^2 + 3^2 + \dots + k^2 + (k + 1)^2 &= \frac{k(k + 1)(2k + 1)}{6} + (k + 1)^2, \quad \text{using the truth of } P(k), \\
&= \frac{k(k + 1)(2k + 1) + (k + 1)(6k + 6)}{6} \\
&= \frac{(k + 1)(2k^2 + k + 6k + 6)}{6} \\
&= \frac{(k + 1)(k + 2)(2k + 3)}{6}
\end{align*}

which is to say that $P(k + 1)$ is true. By the principle of induction, we
conclude that $P(n)$ is true for all $n \in \mathbb{N}$.
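
A finite computation is no substitute for the induction argument, but it is a useful sanity check on the algebra. The Python sketch below compares the sum with the closed form for the first few hundred values of n and also checks the identity used in the inductive step. The function names are our own.

```python
def sum_of_squares(n):
    """Compute 1^2 + 2^2 + ... + n^2 directly."""
    return sum(j * j for j in range(1, n + 1))


def closed_form(n):
    """The formula n(n + 1)(2n + 1)/6 of Example 2.25."""
    return n * (n + 1) * (2 * n + 1) // 6


def inductive_step_holds(k):
    """Check that closed_form(k) + (k + 1)^2 equals closed_form(k + 1)."""
    return closed_form(k) + (k + 1) ** 2 == closed_form(k + 1)


if __name__ == "__main__":
    print(all(sum_of_squares(n) == closed_form(n) for n in range(1, 301)))
    print(all(inductive_step_holds(k) for k in range(1, 301)))
```

Integer division is safe here because n(n + 1)(2n + 1) is always divisible by 6.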

We can rephrase the principle of induction in terms of sets. Let $T$ be the set
given by $T = \{ k \in \mathbb{N} : P(k) \text{ is true} \}$, so $k \in T$ if and only if $P(k)$ is true. In
particular, $P(1)$ is true if and only if $1 \in T$. Hence the principle of induction
may be rephrased as follows.
Let $T$ be a set of natural numbers such that $1 \in T$ and such that
if $T$ contains $k$ then it also contains $k + 1$. Then $T = \mathbb{N}$.


Principle of induction (2nd form)

Suppose that $Q(n)$ is a statement about the natural number $n$ such that

(i) $Q(1)$ is true.

(ii) For any $k \in \mathbb{N}$, the truth of all of $Q(1), Q(2), \dots, Q(k)$ implies the truth
of $Q(k + 1)$.

Then $Q(n)$ is true for all $n \in \mathbb{N}$.

In a nutshell:

Suppose that $Q(1)$ is true and that, for each $k \in \mathbb{N}$,

\[ Q(1),\ Q(2),\ Q(3),\ \dots,\ Q(k) \text{ all true} \implies Q(k + 1) \text{ true}. \]

Conclusion: $Q(n)$ is true for all $n \in \mathbb{N}$.

This follows from the usual form of the principle.
To see this, let $S = \{ m \in \mathbb{N} : Q(m) \text{ is true} \}$. We shall use the usual form
of induction to show that the hypotheses above imply that $S = \mathbb{N}$.
For any $n \in \mathbb{N}$, let $P(n)$ be the statement $\{ 1, 2, \dots, n \} \subseteq S$.
Now, by hypothesis, $Q(1)$ is true and so $1 \in S$. Hence $\{ 1 \} \subseteq S$, which
is to say that $P(1)$ is true.
Next, we show that the truth of $P(k)$ implies that of $P(k + 1)$. Assume
that $P(k)$ is true. This means that $\{ 1, 2, \dots, k \} \subseteq S$, that is, each of $Q(1)$,
$Q(2), \dots, Q(k)$ is true. But then by the 2nd part of the hypothesis above,
$Q(k + 1)$ is true, that is to say, $k + 1 \in S$. Hence $\{ 1, 2, \dots, k, k + 1 \} \subseteq S$.
But this just tells us that $P(k + 1)$ is true. By induction (usual form), it
follows that $P(n)$ is true for all $n \in \mathbb{N}$. This means that $\{ 1, 2, \dots, n \} \subseteq S$
for all $n$. In particular, $n \in S$ for every $n \in \mathbb{N}$, that is, $Q(n)$ is true for all
$n \in \mathbb{N}$, which is the content of the 2nd form of the principle.

Chapter 3

Sequences

A sequence of real numbers is just a listing $a_1, a_2, a_3, \dots$ of real numbers
labelled by $\mathbb{N}$, the set of natural numbers. Thus, to each $n \in \mathbb{N}$, there
corresponds a real number $a_n$. Not surprisingly, $a_n$ is called the $n$th term of
the sequence.

[Figure 3.1: The sequence $(a_n)_{n \in \mathbb{N}}$. The terms $a_1, a_2, a_3, \dots, a_k, a_{k+1}, \dots$ are labelled by $\mathbb{N}$, with $a_k$ marked as the $k$th term.]

Whilst it may seem a trivial comment, it is important to note that the
essential thing about a sequence is that it has a notion of direction: it
makes sense to talk about one term being further down the sequence than
another. For example, $a_{101}$ is further down the sequence than, say, $a_{45}$.
It is convenient to denote the above sequence by $(a_n)_{n \in \mathbb{N}}$ or even simply
by $(a_n)$. Note that there is no requirement that the terms be different. It is
quite permissible for $a_j$ to be the same as $a_n$ for different $j$ and $n$. Indeed, one
could have $a_n = \alpha$, say, for all $n$. This is just a sequence with constant terms
(all equal to $\alpha$): a somewhat trivial sequence, but a sequence nonetheless.

Remark 3.1. On a more formal level, one can think of a sequence of real
numbers as nothing but a function from $\mathbb{N}$ into $\mathbb{R}$. Indeed, we can define
such a function $f : \mathbb{N} \to \mathbb{R}$ by setting $f(n) = a_n$ for $n \in \mathbb{N}$. Conversely,
any $f : \mathbb{N} \to \mathbb{R}$ will determine a sequence of real numbers, as above, via the
assignment $a_n = f(n)$.

One might wish to consider a finite sequence such as, say, the four term
sequence $a_1, a_2, a_3, a_4$. We will use the word "sequence" to mean an infinite
sequence and simply include the adjective "finite" when this is meant.


Examples 3.2.

1. $1, 4, 9, 16, \dots$
Here the general term $a_n$ is given by the simple formula $a_n = n^2$.

2. $2, 3/2, 4/3, 5/4, 6/5, \dots$
The general term is $a_n = (n + 1)/n$.

3. $2, 0, 2, 0, 2, 0, \dots$
Here
\[ a_n = \begin{cases} 0, & \text{if $n$ is even} \\ 2, & \text{if $n$ is odd.} \end{cases} \]
This can also be expressed as $a_n = 1 - (-1)^n$.

4. Let $a_n$ be defined by the prescription $a_1 = a_2 = 1$ and $a_n = a_{n-1} + a_{n-2}$
for $n \ge 3$. The sequence $(a_n)$ is then
$1, 1, 2, 3, 5, 8, 13, \dots$
These are known as the Fibonacci numbers.
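
Since a sequence is just a function on the natural numbers, each of the four examples can be written as a small Python function of n. The sketch below does this and lists the first few terms. The function names and the helper `first_terms` are our own.

```python
from fractions import Fraction


def squares(n):
    """Example 3.2.1: the n-th term is n squared."""
    return n ** 2


def ratios(n):
    """Example 3.2.2: the n-th term is (n + 1)/n, kept as an exact fraction."""
    return Fraction(n + 1, n)


def alternating(n):
    """Example 3.2.3: 2 when n is odd and 0 when n is even."""
    return 1 - (-1) ** n


def fibonacci(n):
    """Example 3.2.4: a_1 = a_2 = 1 and each later term is the sum of the previous two."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def first_terms(sequence, count):
    """Return the terms with indices 1, 2, ..., count as a list."""
    return [sequence(n) for n in range(1, count + 1)]


if __name__ == "__main__":
    for seq in (squares, ratios, alternating, fibonacci):
        print(seq.__name__, first_terms(seq, 7))
```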

We are usually interested in the long-term behaviour of sequences,
that is, what happens as we look further and further down the sequence.
What happens to $a_n$ when $n$ gets very large? Do the terms settle down
or do they get sometimes big, sometimes small, ..., or what?
In examples 3.2.1 and 3.2.4, the terms just get huge.
In example 3.2.2, we see, for example, that $a_{99} = 100/99$, $a_{10000} =
10001/10000$, $a_{10^{20}} = (10^{20} + 1)/10^{20}$, ..., so it looks as though the terms
become close to 1.
In example 3.2.3, the terms just keep oscillating between the two values 0
and 2.
In example 3.2.2, we would like to say that the sequence approaches 1
as we go further and further down it. Indeed, for example, the difference
between $a_{10^{10}}$ and 1 is that between $(10^{10} + 1)/10^{10}$ and 1, that is, $10^{-10}$.
How can we formulate this idea of convergence of a sequence precisely?
We might picture a sequence in two ways, as follows. The first is as the graph
of the function $n \mapsto a_n$. (Notice that we do not join up the dots.)

[Figure 3.2: A sequence as a graph. The points $(n, a_n)$ are plotted against $n = 1, 2, 3, 4, \dots$ in $\mathbb{N}$.]


The second way is just to indicate the values of the sequence on the real
line.


[Figure 3.3: Plot the values of the sequence on the real line.]

Example 3.2.3 above would then be pictured either as

[Figure 3.4: Graph with values 0 or 2. The terms $a_1, a_3, a_5, \dots$ lie at height 2 and the terms $a_2, a_4, a_6, \dots$ lie at height 0.]

or as

[Figure 3.5: The values of $a_n$ are either 0 or 2. On the real line, $a_2, a_4, \dots$ sit at 0 and $a_1, a_3, \dots$ sit at 2.]


Returning to the general situation now, how should we formulate the
idea that a sequence $(a_n)$ converges to $\alpha$? According to our first pictorial
description, we would want the plotted points of the sequence (the graph)
to eventually become very close to the line $y = \alpha$.

[Figure 3.6: The graph gets close to the line $y = \alpha$.]

In terms of the second pictorial description, we would simply demand
that the values of the sequence eventually cluster around the value $x = \alpha$.

[Figure 3.7: The values of $(a_n)$ cluster around $x = \alpha$.]

If we think of the index $n$ as representing time, then we can think of $a_n$
as the value of the sequence at the time $n$. The sequence can be considered
to have some property "eventually" provided we are prepared to wait long
enough for it to become established. It is very convenient to use this word
"eventually", so we shall indicate precisely what we mean by it.
We say that a sequence eventually has some particular property if there
is some $N \in \mathbb{N}$ such that all the terms $a_n$ after $a_N$ (i.e., all $a_n$ with $n > N$)
have the property under consideration. (The number $N$ can be thought of
as some offered time after which we are guaranteed that the property under
consideration will hold and will continue to hold.)
As an example of this usage, let $(a_n)$ be the sequence given by the
prescription $a_n = 100 - n$, for $n \in \mathbb{N}$. Then $a_1 = 99$, $a_2 = 98$, ... etc. It is clear
that $a_n$ is negative whenever $n$ is greater than 100. Thus, we can say that
this sequence $(a_n)$ is eventually negative.
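
The word "eventually" refers to all n beyond some N, which no finite computation can confirm. Still, a check over a finite window can expose a wrong choice of N. The following Python sketch does this for the sequence 100 - n; the helper `holds_after` and the cut-off `horizon` are our own devices and are not part of the definition.

```python
def holds_after(a, has_property, N, horizon):
    """Check the property for a_n with N < n <= horizon (a finite check only)."""
    return all(has_property(a(n)) for n in range(N + 1, horizon + 1))


def a(n):
    """The n-th term of the sequence 100 - n."""
    return 100 - n


def is_negative(value):
    return value < 0


if __name__ == "__main__":
    print(holds_after(a, is_negative, 100, 10_000))  # every term after a_100 in the window
    print(holds_after(a, is_negative, 99, 10_000))   # fails, because a_100 = 0
```

The second call shows that N = 99 is too small: the term with index 100 equals 0, which is not negative.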
Now we can formulate the notion of convergence of a sequence. The idea
is that $(a_n)$ converges to the number $\alpha$ if eventually it is as close to $\alpha$
as desired. That is to say, given some preassigned tolerance $\varepsilon$, no matter
how small, we demand that eventually $(a_n)$ is within $\varepsilon$ of $\alpha$. In
other words, the distance between $a_n$ and $\alpha$ (as points on the real line) is
eventually smaller than $\varepsilon$.
Definition 3.3. We say that the sequence $(a_n)_{n \in \mathbb{N}}$ of real numbers converges
to the real number $\alpha$ if for any given $\varepsilon > 0$, there is some natural number
$N \in \mathbb{N}$ such that $|a_n - \alpha| < \varepsilon$ whenever $n > N$.
The number $\alpha$ is called the limit of the sequence. In such a situation, we write $a_n \to \alpha$
as $n \to \infty$ or alternatively $\lim_{n \to \infty} a_n = \alpha$.
The use of the symbol $\infty$ is just as part of a phrase and it has no meaning
in isolation. There is no real number $\infty$.
Remark 3.4. The positive number $\varepsilon$ is the assigned tolerance demanded.
Typically, the smaller $\varepsilon$ is, the larger we should expect $N$ to have to
be. For example, consider the sequence $(a_n)$ where $a_n = 1/n$. We would
expect that $a_n \to 0$ as $n \to \infty$. To see this, let $\varepsilon > 0$ be given. (We are
not able to choose this. It is given to us and its actual value is beyond our
control.) It will be true that $|a_n - 0| < \varepsilon$ provided $n > 1/\varepsilon$. So after some
contemplation, we proceed as follows. We are unable to influence the choice
of $\varepsilon$ given to us, but once it is given then we can (and must) base our tactics
on it. So let $N$ be any natural number larger than $1/\varepsilon$. If $n > N$, then
$n > N > 1/\varepsilon$ and so $1/n < \varepsilon$. That is, if $n > N$, then $|a_n - 0| = 1/n < \varepsilon$
and so, according to our definition, we have shown that $a_n \to 0$ as $n \to \infty$.
Notice that the smaller $\varepsilon$ is, the larger $N$ has to be.
Note that the statement

if $n > N$ then $|a_n - \alpha| < \varepsilon$

can also be written as

$|a_n - \alpha| < \varepsilon$ whenever $n > N$

or also as

$n > N \implies |a_n - \alpha| < \varepsilon$.

Also, we should note that the inequality $|a_n - \alpha| < \varepsilon$, telling us that the distance
between the real numbers $a_n$ and $\alpha$ is less than $\varepsilon$, can also be expressed by
the pair of inequalities

\[ -\varepsilon < a_n - \alpha < \varepsilon \]

or equivalently by the pair

\[ \alpha - \varepsilon < a_n < \alpha + \varepsilon. \]

This simply means that $a_n$ lies on the real line somewhere between the two
values $\alpha - \varepsilon$ and $\alpha + \varepsilon$. This must happen eventually if the sequence is to
be convergent (to $\alpha$).


[Figure 3.8: The value of $a_n$ lies within $\varepsilon$ of $\alpha$, that is, in the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$ on the real line.]

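Example 3.5. Let $(a_n)_{n \in \mathbb{N}}$ be the sequence with

\[ a_n = \frac{2n + 5}{n} \]

for $n \in \mathbb{N}$. Does $(a_n)$ converge? Looking at the first few terms, we find

\[ (a_n) = \bigl( 7, \tfrac{9}{2}, \tfrac{11}{3}, \tfrac{13}{4}, \tfrac{15}{5}, \tfrac{17}{6}, \tfrac{19}{7}, \dots, \tfrac{205}{100}, \dots \bigr). \]

It seems that $a_n \to 2$ as $n \to \infty$, but we must prove it.
Let $\varepsilon > 0$ be given. We have to show that eventually $(a_n)$ is within $\varepsilon$ of 2.
We have $|a_n - 2| = |(2n + 5)/n - 2| = 5/n$. Now, the inequality $5/n < \varepsilon$ is
the same as $n > 5/\varepsilon$. Let $N$ be any natural number which obeys $N > 5/\varepsilon$.
Then if $n > N$, we have

\[ n > N > \frac{5}{\varepsilon} \]

and so $5/n < \varepsilon$. This means that if $n > N$ then $|a_n - 2| < \varepsilon$ and we have
succeeded in proving that $a_n \to 2$ as $n \to \infty$.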
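
The choice of N in Example 3.5 can be tried out numerically. The sketch below picks N > 5/ε as in the proof and then checks the inequality |a_n - 2| < ε over a finite stretch of indices beyond N. This is a spot check and not a proof, and the function names and the tolerance 1/100 are our own choices.

```python
from fractions import Fraction
import math


def a(n):
    """The n-th term (2n + 5)/n as an exact fraction."""
    return Fraction(2 * n + 5, n)


def choose_N(eps):
    """Return a natural number N with N > 5/eps, as in Example 3.5."""
    return math.floor(5 / Fraction(eps)) + 1


if __name__ == "__main__":
    eps = Fraction(1, 100)
    N = choose_N(eps)
    print(N, all(abs(a(n) - 2) < eps for n in range(N + 1, N + 1001)))
```

Halving ε roughly doubles the N produced, in line with the remark that smaller tolerances call for larger N.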
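Example 3.6. Let $(a_n)_{n \in \mathbb{N}}$ be the sequence $a_n = 1/n^2$. We shall show that
$a_n \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
We wish to show that there is $N \in \mathbb{N}$ such that if $n > N$ then $|a_n - 0| < \varepsilon$,
that is, $|a_n - 0| = 1/n^2 < \varepsilon$.
Now,

\[ \frac{1}{n^2} < \varepsilon \iff n^2 > \frac{1}{\varepsilon} \iff n > \frac{1}{\sqrt{\varepsilon}} \]

so take $N \in \mathbb{N}$ to be any natural number satisfying $N > 1/\sqrt{\varepsilon}$. Then if
$n > N$, it follows that $n > 1/\sqrt{\varepsilon}$ and so $n^2 > 1/\varepsilon$, which in turn implies that
$1/n^2 < \varepsilon$ and the proof is complete.
Alternatively, we note that $1/n^2 \le 1/n$ and so if $1/n < \varepsilon$ then it follows that
$1/n^2 \le 1/n < \varepsilon$. So let $N \in \mathbb{N}$ be any natural number such that $N > 1/\varepsilon$.
Then $1/N < \varepsilon$ and so if $n > N$ we have

\[ \frac{1}{n} < \frac{1}{N} < \varepsilon \implies \frac{1}{n^2} \le \frac{1}{n} < \frac{1}{N} < \varepsilon. \]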


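Example 3.7. Let $(a_n)_{n \in \mathbb{N}}$ be the sequence $a_n = \dfrac{4}{n^3} + \dfrac{1}{\sqrt{n}}$. We shall show
that $a_n \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
We must show that

\[ |a_n - 0| = \frac{4}{n^3} + \frac{1}{\sqrt{n}} < \varepsilon \]

whenever $n$ is large enough. To see this, we note that

\[ \frac{4}{n^3} + \frac{1}{\sqrt{n}} \le \frac{4}{n} + \frac{1}{\sqrt{n}} \le \frac{4}{\sqrt{n}} + \frac{1}{\sqrt{n}} = \frac{5}{\sqrt{n}}. \]

If the right hand side is less than $\varepsilon$, then so is the left hand side. Let $N \in \mathbb{N}$
satisfy $5/\sqrt{N} < \varepsilon$, that is, $25/N < \varepsilon^2$ or $N > 25/\varepsilon^2$. Then if $n > N$, we
may say that $25/n < 25/N < \varepsilon^2$ and so

\[ \frac{4}{n^3} + \frac{1}{\sqrt{n}} \le \frac{5}{\sqrt{n}} < \frac{5}{\sqrt{N}} < \varepsilon \]

that is, $|a_n - 0| < \varepsilon$ whenever $n > N$.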
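
To see the estimate of Example 3.7 at work, the following Python sketch chooses N > 25/ε² and spot-checks that the terms beyond N are smaller than ε. The terms involve a square root, so they are computed in floating point, which is an approximation. The function names and the tolerance 1/10 are our own choices.

```python
from fractions import Fraction
import math


def a(n):
    """The n-th term 4/n^3 + 1/sqrt(n), in floating point."""
    return 4 / n ** 3 + 1 / math.sqrt(n)


def choose_N(eps):
    """Return a natural number N with N > 25/eps^2 (eps a positive rational)."""
    return math.floor(25 / Fraction(eps) ** 2) + 1


if __name__ == "__main__":
    eps = Fraction(1, 10)
    N = choose_N(eps)
    print(N, all(a(n) < float(eps) for n in range(N + 1, N + 1001)))
```

The N obtained this way is far from the smallest that works, because the bound 5/√n used in the proof is deliberately generous.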

Example 3.8. Let $|x| < 1$ and for $n \in \mathbb{N}$, let $a_n = x^n$. Does $(a_n)$ converge?
Since $|x| < 1$, $|x^n| = |x|^n$ gets smaller and smaller as $n$ increases, so we
might guess that $x^n \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
We must show that eventually $|x^n - 0| < \varepsilon$, which is the same as showing
that eventually $|x|^n < \varepsilon$. Set $d = |x|$. Then we wish to show that eventually
$d^n < \varepsilon$. Notice that $d \ge 0$ and so we no longer have to worry about whether
$x$ is positive or negative. We have transferred the problem from one about
$x$ to one about $d$.
Consider first the case $x = 0$. Then also $d = 0$ and $d^n = 0$ for all $n$. In
particular, if we go through the motions by choosing $N = 1$, then certainly
$d^n < \varepsilon$ whenever $n > N$ (because $d^n = 0$), which tells us (trivially) that
eventually $d^n < \varepsilon$ and so therefore $x^n \to 0$ as $n \to \infty$.
Now suppose that $x \neq 0$. Then $0 < |x| < 1$, so that $0 < d < 1$. Define
$t$ by $d = 1/(1 + t)$, that is, $t = (1 - d)/d$. Then $t > 0$. By the binomial
theorem, we have

\[ (1 + t)^n = 1 + nt + \binom{n}{2} t^2 + \dots + t^n > nt \]

for any $n \in \mathbb{N}$. Hence

\[ d^n = \frac{1}{(1 + t)^n} < \frac{1}{nt}. \]


We shall use this to estimate $d^n$. If the right hand side is less than $\varepsilon$, then
so is the left hand side. To carry this through, let $N$ be any natural number
obeying $N > 1/(t\varepsilon)$. Then this means that $1/(Nt) < \varepsilon$. For any $n > N$, we
therefore have the inequality $1/n < 1/N$ and (since $t > 0$) we also have

\[ d^n < \frac{1}{nt} < \frac{1}{Nt} < \varepsilon. \]

In other words, we have shown that eventually $d^n$ is less than $\varepsilon$. In terms
of $x$ and $a_n$, we have

\[ |a_n - 0| = |x|^n = \frac{1}{(1 + t)^n} < \frac{1}{nt} < \frac{1}{Nt} < \varepsilon \]

whenever $n > N$. Hence if $|x| < 1$ then $x^n \to 0$ as $n \to \infty$.
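
The argument of Example 3.8 yields an explicit N for each rational x with |x| < 1 and each rational ε > 0. The Python sketch below computes that N and spot-checks the conclusion on a short run of indices. The name `choose_N` and the sample values are our own.

```python
from fractions import Fraction
import math


def choose_N(x, eps):
    """Return N in N such that |x|^n < eps whenever n > N, for |x| < 1 and eps > 0."""
    x, eps = Fraction(x), Fraction(eps)
    if not abs(x) < 1 or eps <= 0:
        raise ValueError("need |x| < 1 and eps > 0")
    d = abs(x)
    if d == 0:
        return 1                           # every term is 0, so N = 1 will do
    t = (1 - d) / d                        # d = 1/(1 + t) with t > 0
    return math.floor(1 / (t * eps)) + 1   # N > 1/(t eps), i.e. 1/(N t) < eps


if __name__ == "__main__":
    x, eps = Fraction(-9, 10), Fraction(1, 100)
    N = choose_N(x, eps)
    print(N, all(abs(x) ** n < eps for n in range(N + 1, N + 51)))
```

The bound 1/(nt) is crude, so the N returned is usually much larger than necessary; the definition of convergence asks only that some N exists.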
Is it possible for a sequence to converge to two different limits? To
convince ourselves that this is not possible, suppose the contrary. That is,
suppose that $(a_n)$ is some sequence which has the property that it converges
both to $\alpha$ and $\beta$, say, with $\alpha \neq \beta$. Let $\varepsilon > 0$ be given. Then by definition
of convergence, $(a_n)$ is eventually within distance $\varepsilon$ of $\alpha$ and also $(a_n)$ is
eventually within distance $\varepsilon$ of $\beta$.

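[Figure 3.9: The sequence $(a_n)$ is eventually within $\varepsilon$ of both $\alpha$ and $\beta$, that is, eventually in both of the intervals $(\alpha - \varepsilon, \alpha + \varepsilon)$ and $(\beta - \varepsilon, \beta + \varepsilon)$.]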

As one can see from the figure, if $\varepsilon$ is small enough, then the two intervals
$(\alpha - \varepsilon, \alpha + \varepsilon)$ and $(\beta - \varepsilon, \beta + \varepsilon)$ will not overlap and it will not be possible
for any terms of the sequence $(a_n)$ to belong to both of these intervals
simultaneously. We can turn this into a rigorous argument as follows.
Theorem 3.9. Suppose that $(a_n)_{n \in \mathbb{N}}$ is a sequence such that $a_n \to \alpha$ and also
$a_n \to \beta$ as $n \to \infty$. Then $\alpha = \beta$, that is, a convergent sequence has a unique
limit.
Proof. Let $\varepsilon > 0$ be given.
Since we know that $a_n \to \alpha$, then we are assured that eventually $(a_n)$ is
within $\varepsilon$ of $\alpha$. Thus, there is some $N_1 \in \mathbb{N}$ such that if $n > N_1$ then the
distance between $a_n$ and $\alpha$ is less than $\varepsilon$, i.e., if $n > N_1$ then $|a_n - \alpha| < \varepsilon$.
Similarly, we know that $a_n \to \beta$ as $n \to \infty$ and so eventually $(a_n)$ is
within $\varepsilon$ of $\beta$. Thus, there is some $N_2 \in \mathbb{N}$ such that if $n > N_2$ then the
distance between $a_n$ and $\beta$ is less than $\varepsilon$, i.e., if $n > N_2$ then $|a_n - \beta| < \varepsilon$.


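So far so good. What next? To get both of these happening simultaneously,
we let $N = \max\{ N_1, N_2 \}$. Then $n > N$ means that both $n > N_1$ and also
$n > N_2$. Hence we can say that if $n > N$ then both $|a_n - \alpha| < \varepsilon$ and also
$|a_n - \beta| < \varepsilon$.
Now what? We expand out these sets of inequalities. Pick and fix any
$n > N$ (for example $n = N + 1$ would do). Then

\[ \alpha - \varepsilon < a_n < \alpha + \varepsilon \]
\[ \beta - \varepsilon < a_n < \beta + \varepsilon. \]

The left hand side of the first pair together with the right hand side of the
second pair of inequalities gives $\alpha - \varepsilon < a_n < \beta + \varepsilon$ and so

\[ \alpha - \varepsilon < \beta + \varepsilon. \]

Similarly, the left hand side of the second pair together with the right hand
side of the first pair of inequalities gives $\beta - \varepsilon < a_n < \alpha + \varepsilon$ and so

\[ \beta - \varepsilon < \alpha + \varepsilon. \]

Combining these we see that

\[ -2\varepsilon < \alpha - \beta < 2\varepsilon \]

which is to say that $|\alpha - \beta| < 2\varepsilon$. This happens for any given $\varepsilon > 0$ and
so the non-negative number $|\alpha - \beta|$ must actually be zero. But this means
that $\alpha = \beta$ and the proof is complete.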

Definition 3.10. We say that the sequence $(a_n)_{n \in \mathbb{N}}$ is bounded from above if
there is some $M \in \mathbb{R}$ such that

\[ a_n \le M \]

for all $n \in \mathbb{N}$. The sequence $(a_n)_{n \in \mathbb{N}}$ is said to be bounded from below if
there is some $m \in \mathbb{R}$ such that

\[ m \le a_n \]

for all $n \in \mathbb{N}$. If $(a_n)$ is bounded both from above and from below, then we
say that $(a_n)$ is bounded.
Examples 3.11.
1. Let $a_n = n + (-1)^n n$ for $n \in \mathbb{N}$. Then we see that $(a_n)$ is the sequence
given by $(a_n) = (0, 4, 0, 8, 0, 12, 0, \dots)$. Evidently $(a_n)$ is bounded from
below (in fact, $a_n \ge 0$) but $(a_n)$ is not bounded from above. (There is no
$M$ for which $a_n \le M$ holds for all $n$. Indeed, for any fixed $M$ whatsoever,
if $n$ is any even natural number greater than $M$, then $a_n = 2n > n > M$.)


2. Let $a_n = 1/n$, $n \in \mathbb{N}$. It is clear that $a_n$ obeys $0 \le a_n \le 2$ for all $n$
and so $(a_n)$ is bounded both from above and from below, that is, $(a_n)$ is
bounded.

Proposition 3.12. The sequence $(a_n)$ is bounded if and only if there is some
$K \ge 0$ such that $|a_n| \le K$ for all $n$.

Proof. Suppose first that $(a_n)$ is bounded. Then there are $m$ and $M$ such
that

\[ m \le a_n \le M \]

for all $n$. We do not know whether $m$ or $M$ are positive or negative. However,
we can introduce $|m|$ and $|M|$ as follows. For any $x \in \mathbb{R}$, it is true that
$-|x| \le x \le |x|$. Applying this to $m$ and $M$ in the above inequalities, we see
that

\[ -|m| \le m \le a_n \le M \le |M|. \]

Let $K = \max\{ |m|, |M| \}$. Then clearly,

\[ -K \le -|m| \le m \le a_n \le M \le |M| \le K \]

which gives the inequalities

\[ -K \le a_n \le K \]

so that $|a_n| \le K$, for all $n$, as required.
For the converse, suppose that there is $K \ge 0$ so that $|a_n| \le K$ for all $n$.
Then this can be expressed as

\[ -K \le a_n \le K \]

for all $n$ and therefore $(a_n)$ is bounded (taking $m = -K$ and $M = K$ in the
definition).

Theorem 3.13. If a sequence converges then it is bounded.

Proof. Suppose that $(a_n)$ is a convergent sequence, $a_n \to \alpha$, say, as $n \to \infty$.
Then, in particular, $(a_n)$ is eventually within distance 1, say, of $\alpha$. This
means that there is some $N \in \mathbb{N}$ such that if $n > N$ then the distance
between $a_n$ and $\alpha$ is less than 1, i.e., if $n > N$ then $|a_n - \alpha| < 1$. We can
rewrite this as

\[ -1 < a_n - \alpha < 1 \]

or

\[ \alpha - 1 < a_n < \alpha + 1 \]

whenever $n > N$. This tells us that the tail ($a_n$ for $n > N$) of the sequence is
bounded, but what about the whole sequence? This is now easy: we know
about $a_n$ when $n > N$ so we only still need to take into account the beginning
of the sequence up to the $N$th term, that is, the terms $a_1, a_2, \dots, a_N$. Let
$M = \max\{ a_1, a_2, \dots, a_N, \alpha + 1 \}$ and let $m = \min\{ a_1, a_2, \dots, a_N, \alpha - 1 \}$.
Then certainly $\alpha + 1 \le M$ and $m \le \alpha - 1$. Hence if $n > N$, then

\[ m \le a_n \le M. \]

But by construction of $m$ and $M$, we also have the inequalities $m \le a_n \le M$
for any $1 \le n \le N$. Piecing together these two parts of the argument, we
conclude that

\[ m \le a_n \le M \]

for any $n$ and we have shown that $(a_n)$ is bounded, as required.

Remark 3.14. The converse of this is false. For example, let $(a_n)$ be the
sequence with $a_n = (-1)^n$. Then $(a_n) = (-1, 1, -1, 1, -1, \dots)$, which is
bounded (for example, $-1 \le a_n \le 1$ for all $n$) but does not converge.
Definition 3.15. A sequence $(a_n)$ of real numbers is said to be
(i) increasing if $a_{n+1} \ge a_n$ for all $n$;
(ii) strictly increasing if $a_{n+1} > a_n$ for all $n$;
(iii) decreasing if $a_{n+1} \le a_n$ for all $n$;
(iv) strictly decreasing if $a_{n+1} < a_n$ for all $n$.
A sequence satisfying any of these conditions is said to be monotonic or
monotone. It is strictly monotonic if it satisfies either (ii) or (iv).
One reason for an interest in monotonic sequences is the following.
Theorem 3.16. If $(a_n)$ is an increasing sequence of real numbers and is
bounded from above, then it converges.
Proof. Suppose then that $a_n \le a_{n+1}$ and that $a_n \le M$ for all $n$. Let
$K = \operatorname{lub}\{ a_n : n \in \mathbb{N} \}$, so that $K$ is well-defined with $K \le M$. We claim
that $a_n \to K$ as $n \to \infty$.
Let $\varepsilon > 0$ be given. We must show that eventually $(a_n)$ is within distance
$\varepsilon$ of $K$. Now, $K$ is an upper bound for $\{ a_n : n \in \mathbb{N} \}$ and so $a_n \le K$ for
all $n$. It is enough then to show that $K - \varepsilon < a_n$ eventually. However, this
is true for the following reason. $K - \varepsilon < K$ and $K$ is the least upper bound
of $\{ a_n : n \in \mathbb{N} \}$ and so $K - \varepsilon$ is not an upper bound for $\{ a_n : n \in \mathbb{N} \}$. This
means that there is some $a_j$, say, with $a_j > K - \varepsilon$. But the sequence $(a_n)$
is increasing and so $a_n \ge a_j$ for all $n > j$. Hence $a_n > K - \varepsilon$ for all $n > j$.
We have shown that

\[ K - \varepsilon < a_n \le K < K + \varepsilon \]

for all $n > j$. This means that eventually $|a_n - K| < \varepsilon$ and so the proof is
complete.


Remark 3.17. Note that in the course of the proof of the above result, we
have not only shown that $(a_n)$ converges but we have actually established
what the limit is: it is the least upper bound of the set of real numbers
$\{ a_n : n \in \mathbb{N} \}$. Of course, this does not necessarily provide us with the
numerical value of the limit.
It is also worth noting that from this result and the fact that a convergent
sequence is bounded, we can say that an increasing sequence converges if and
only if it is bounded. The sequence $(a_n)$ with $a_n = n$ is clearly increasing.
It is not bounded and so we can say immediately that it does not converge
(which is no surprise, in this case).

Corollary 3.18. Any sequence which is decreasing and bounded from below
must converge.

Proof. Suppose that $(b_n)$ is a sequence which is decreasing and bounded from
below. Then $b_{n+1} \le b_n$ for all $n$ and there is some $k$ such that $b_n \ge k$ for
all $n$. Set $a_n = -b_n$ and $K = -k$. Then these inequalities become $a_n \le a_{n+1}$
and $a_n \le K$ for all $n$, that is, $(a_n)$ is increasing and is also bounded from
above. By the theorem, we deduce that $(a_n)$ converges. Denote its limit by
$\alpha$ and let $\beta = -\alpha$. We will show that $b_n \to \beta$ as $n \to \infty$ (as one might well
expect). Let $\varepsilon > 0$ be given. Then there is some $N \in \mathbb{N}$ such that if $n > N$
then

\[ |a_n - \alpha| < \varepsilon. \]

In terms of $b_n$ and $\beta$, the left hand side becomes $|-b_n + \beta|$, which is equal
to $|b_n - \beta|$, and so we have established that

\[ |b_n - \beta| < \varepsilon \]

whenever $n > N$, which completes the proof.

Example 3.19. Let $(a_n)$ be the sequence given by

\[ a_1 = 1,\quad a_2 = 1 + 1,\quad a_3 = 1 + 1 + \tfrac{1}{2!},\quad
   a_4 = 1 + 1 + \tfrac{1}{2!} + \tfrac{1}{3!},\ \dots,\quad
   a_n = 1 + 1 + \tfrac{1}{2!} + \dots + \tfrac{1}{(n-1)!},\ \dots \]

This can be written more succinctly as $a_1 = 1$ and $a_n = a_{n-1} + \frac{1}{(n-1)!}$
for $n \ge 2$. Does $(a_n)$ converge? It is clear that $a_{n+1} > a_n$ and so $(a_n)$
is increasing (in fact, strictly increasing). If we can show that it is also
bounded then we conclude that it must converge. Can we find $K$ such that
$a_n \le K$ for all $n$? We have $a_1 = 1$ and for any $n \ge 1$

\begin{align*}
a_{n+1} &= 1 + 1 + \frac{1}{2} + \frac{1}{2 \cdot 3} + \frac{1}{2 \cdot 3 \cdot 4} + \dots + \frac{1}{2 \cdot 3 \cdots n} \\
        &\le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \dots + \frac{1}{2^{n-1}} \\
        &= 1 + \frac{1 - (\frac12)^n}{1 - \frac12}, \quad\text{summing the GP,} \\
        &= 1 + 2\bigl(1 - (\tfrac12)^n\bigr) \\
        &< 1 + 2 = 3 .
\end{align*}

Hence the increasing sequence $(a_n)$ is bounded above, by 3. We conclude
that $(a_n)$ converges. Because it is increasing, we know that its limit is equal
to $\operatorname{lub}\{ a_n : n \in \mathbb{N} \} = \alpha$, say. But $a_n$ obeys $a_n \le 3$ and so 3 is an upper
bound for $\{ a_n : n \in \mathbb{N} \}$ and therefore $\operatorname{lub}\{ a_n : n \in \mathbb{N} \} \le 3$, that is, $\alpha \le 3$.
Of course, $\alpha = \operatorname{lub}\{ a_n : n \in \mathbb{N} \} \ge a_k$ for any particular $k$. Taking $k = 3$,
we get that $\alpha \ge a_3 > 2$ and so we can say that $2 < \alpha \le 3$. In fact, $\alpha$ is just
$e$ (and $e = 2.71828\dots$).
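The monotone, bounded behaviour in Example 3.19 can be checked numerically. The following is a small illustrative sketch, not part of the original notes; the function name `a` is chosen here only to mirror the notation $a_n$, and exact rational arithmetic is used so that rounding plays no part.

```python
from fractions import Fraction
from math import e, factorial


def a(n: int) -> Fraction:
    """Return a_n = 1 + 1/1! + 1/2! + ... + 1/(n-1)! as an exact fraction."""
    return Fraction(1) + sum(Fraction(1, factorial(j)) for j in range(1, n))


# Print a few terms together with their distance from e.
for n in (1, 2, 3, 5, 10, 15):
    print(n, float(a(n)), e - float(a(n)))
```

Each printed term is larger than the previous one and none exceeds 3, in line with the estimate obtained by summing the geometric progression.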
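If $a_n \to \alpha$ and $b_n \to \beta$, then we might expect it to be the case that
$a_n + b_n \to \alpha + \beta$. After all, if $(a_n)$ is eventually close to $\alpha$ and $(b_n)$ is
eventually close to $\beta$, then it seems quite reasonable to guess that $(a_n + b_n)$
is eventually close to $\alpha + \beta$. This is true, but we must take care with the
details.
Theorem 3.20. Suppose that $(a_n)$ and $(b_n)$ are sequences in $\mathbb{R}$.
(i) If $a_n \to \alpha$ as $n \to \infty$, then $\lambda a_n \to \lambda\alpha$ as $n \to \infty$, for any $\lambda \in \mathbb{R}$.

(ii) If $a_n \to \alpha$ as $n \to \infty$ and $b_n \to \beta$ as $n \to \infty$, then $a_n + b_n \to \alpha + \beta$
as $n \to \infty$ and also $a_n b_n \to \alpha\beta$ as $n \to \infty$.

(iii) If $a_n \to \alpha$ as $n \to \infty$ and if $b_n \to \beta$ as $n \to \infty$ and if $b_n \ne 0$ for
all $n$ and if $\beta \ne 0$, then $a_n / b_n \to \alpha / \beta$ as $n \to \infty$.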
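Proof. (i) Fix $\lambda \in \mathbb{R}$. Let $\varepsilon > 0$ be given. We must show that $|\lambda a_n - \lambda\alpha| < \varepsilon$
eventually. If $\lambda = 0$, then $\lambda a_n = 0$ for all $n$ and so it is clear that in this
case $\lambda a_n = 0 \to 0 = \lambda\alpha$ as $n \to \infty$.
So now suppose that $\lambda \ne 0$. Let $\varepsilon' > 0$. (We will specify $\varepsilon'$ in a moment.)
Then since we know that $a_n \to \alpha$, it follows that there is some $N \in \mathbb{N}$ such
that $n > N$ implies that
\[ |a_n - \alpha| < \varepsilon' . \]
Now,

\[ |\lambda a_n - \lambda\alpha| = |\lambda|\,|a_n - \alpha| < |\lambda|\,\varepsilon' \]

whenever $n > N$. If we choose $\varepsilon' = \varepsilon / |\lambda|$ then we see that

\[ |\lambda a_n - \lambda\alpha| < |\lambda|\,\varepsilon' = \varepsilon \]

whenever $n > N$. Hence $\lambda a_n \to \lambda\alpha$, as required.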


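(ii) Let $\varepsilon > 0$ be given. Suppose $\varepsilon' > 0$. We will specify the value of $\varepsilon'$ in
a moment. There is $N_1 \in \mathbb{N}$ such that $n > N_1$ implies that $|a_n - \alpha| < \varepsilon'$.
Also, there is $N_2 \in \mathbb{N}$ such that $n > N_2$ implies that $|b_n - \beta| < \varepsilon'$. Set
$N = \max\{ N_1, N_2 \}$. Then if $n > N$, we see that

\[ |a_n + b_n - (\alpha + \beta)| = |(a_n - \alpha) + (b_n - \beta)| \le |a_n - \alpha| + |b_n - \beta| < \varepsilon' + \varepsilon' = 2\varepsilon' . \]

Setting $\varepsilon' = \frac12 \varepsilon$, it follows that if $n > N$, then

\[ |a_n + b_n - (\alpha + \beta)| < 2\varepsilon' = \varepsilon , \]

that is, $a_n + b_n \to \alpha + \beta$ as $n \to \infty$.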
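To show that $a_n b_n \to \alpha\beta$, consider first the case $x_n \to 0$ and $y_n \to 0$.
We shall show that $x_n y_n \to 0$.
Let $\varepsilon > 0$ be given.

Then we know that there is $N_1 \in \mathbb{N}$ such that if $n > N_1$ then $|x_n| < \sqrt{\varepsilon}$.

Similarly, we know that there is $N_2 \in \mathbb{N}$ such that if $n > N_2$ then $|y_n| < \sqrt{\varepsilon}$.
Let $N = N_1 + N_2$. Then if $n > N$, it follows that

\[ |x_n y_n| < \sqrt{\varepsilon}\,\sqrt{\varepsilon} = \varepsilon , \]

that is, $x_n y_n \to 0$ as $n \to \infty$.
Now, in the general case, we simply use previous results to note that

\begin{align*}
a_n b_n &= (a_n - \alpha)(b_n - \beta) + \alpha b_n + \beta a_n - \alpha\beta \\
        &\to 0 + \alpha\beta + \alpha\beta - \alpha\beta \\
        &= \alpha\beta
\end{align*}

as required.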
(iii) Now suppose that $a_n \to \alpha$, $b_n \to \beta$ and suppose that $b_n \ne 0$ for all $n$
and that $\beta \ne 0$. Let $\gamma_n = 1/b_n$ and let $\gamma = 1/\beta$. Then $a_n / b_n = a_n \gamma_n$. To
show that $a_n / b_n \to \alpha / \beta$, we shall show that $\gamma_n \to \gamma$ as $n \to \infty$. The desired
conclusion will then follow from the second part of (ii), above.
We have

\[ |\gamma_n - \gamma| = \left| \frac{1}{b_n} - \frac{1}{\beta} \right| = \frac{|\beta - b_n|}{|b_n|\,|\beta|} . \]

For large enough $n$, the numerator is small and the denominator is close to
$|\beta|^2$, so we might hope that the whole expression is small. (Note that it is
imperative here that $\beta \ne 0$.) We shall show that $1/|b_n|$ is bounded from
above. Indeed, $|\beta| > 0$ and so, taking $\frac12 |\beta|$ as our $\varepsilon$, we can say that there
is some $N'$ such that $n > N'$ implies that
\[ |b_n - \beta| < \tfrac12 |\beta| . \]

Hence, if $n > N'$, we have

\[ |\beta| = |\beta - b_n + b_n| \le |\beta - b_n| + |b_n| < \tfrac12 |\beta| + |b_n| \]

and so $\frac12 |\beta| < |b_n|$. If we set $K = \min\{ |b_1|, |b_2|, \dots, |b_{N'}|, \frac12 |\beta| \}$, then it is
true that $K > 0$ and $|b_n| \ge K$ for all $n$. Hence $1/|b_n| \le 1/K$ for all $n$.
Let $\varepsilon > 0$ be given. Let $\varepsilon' = \varepsilon K |\beta|$. Since $b_n \to \beta$, there is $N$ such that
$n > N$ implies that
\[ |b_n - \beta| < \varepsilon' . \]
But then, for any $n > N$, we have

\[ |\gamma_n - \gamma| = \frac{|\beta - b_n|}{|b_n|\,|\beta|} < \frac{\varepsilon'}{K |\beta|} = \varepsilon \]

and the proof is complete.

Examples 3.21.

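1. Taking $a_n = 1/n$, it follows that $\lambda / n \to 0$ as $n \to \infty$ for any $\lambda \in \mathbb{R}$.

2. Suppose that $a_n \to \alpha$ as $n \to \infty$. Then it follows immediately that
$a_n - \alpha \to 0$ as $n \to \infty$. Indeed, for any given $\varepsilon > 0$, there is some
$N \in \mathbb{N}$ such that $n > N$ implies that $|a_n - \alpha| < \varepsilon$. But $|a_n - \alpha| =
|(a_n - \alpha) - 0|$, so to say that $a_n \to \alpha$ is just to say that $a_n - \alpha \to 0$ as
$n \to \infty$.

3. With $a_n = b_n$, we see that if $a_n \to \alpha$, then $a_n^2 \to \alpha^2$ as $n \to \infty$. Now
with $b_n = a_n^2$, it follows that $a_n^3 \to \alpha^3$ as $n \to \infty$. Repeating this (i.e., by
induction), we see that if $a_n \to \alpha$ as $n \to \infty$, then $a_n^k \to \alpha^k$ as $n \to \infty$
for any given $k \in \mathbb{N}$.

4. Let $a_n = \dfrac{3n^2 - 4}{2n^2 + 1}$ for $n \in \mathbb{N}$. We can rewrite $a_n$ as $a_n = \dfrac{3 - 4/n^2}{2 + 1/n^2}$.
Then we note that $4/n^2 \to 0$ and $1/n^2 \to 0$, so that $3 - 4/n^2 \to 3$ and
$2 + 1/n^2 \to 2$ as $n \to \infty$. Finally, it follows that $a_n = \dfrac{3 - 4/n^2}{2 + 1/n^2} \to 3/2$
as $n \to \infty$.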

5. Let $a_n = \dfrac{7n^3 - 5n^2 + 3n - 9}{3n^3 + 4n^2 - 8n + 2}$. The first thing we do is to divide through
by the highest power of $n$ occurring in the numerator or denominator,
i.e., in this case, by $n^3$. So, $a_n$ can be rewritten as

\[ a_n = \frac{7 - 5/n + 3/n^2 - 9/n^3}{3 + 4/n - 8/n^2 + 2/n^3} . \]

Then we see that the numerator converges to 7 and the denominator
converges to 3 as $n \to \infty$. Hence $a_n \to 7/3$ as $n \to \infty$.

6. Let $a_n = \dfrac{n^4 - 8}{n^7 + 3}$. Then we have $a_n = \dfrac{1/n^3 - 8/n^7}{1 + 3/n^7}$ and so it follows
that $a_n \to 0/1 = 0$ as $n \to \infty$.

7. Let $a_n = \dfrac{2n^5 + 4}{n^3 + 6}$. Then $a_n = \dfrac{2 + 4/n^5}{1/n^2 + 6/n^5}$. Now, the numerator
converges to 2 whilst the denominator converges to 0 as $n \to \infty$. The
above theorem about the convergence of $a_n / b_n$ says nothing about the
case when $b_n$ or $\beta = \lim b_n$ are zero. In this example, we back up and
note that, by inspection, we have

\[ a_n = \frac{2n^5 + 4}{n^3 + 6} > \frac{2n^5}{n^3 + 6} \ge \frac{2n^5}{n^3 + 6n^3} = \frac{2n^2}{7} . \]

It follows that $(a_n)$ is not bounded from above and so cannot converge.

8. Suppose that $|x| < 1$ and consider the sequence $a_n = x^n$, for $n \in \mathbb{N}$. Then
the sequence $b_n = |a_n| = |x|^n$ is monotone decreasing and is bounded
below (by 0) and so therefore it converges, to $\ell$, say: $b_n \to \ell$ as $n \to \infty$.
Hence the sequence $(b_{2n})$ also converges to $\ell$. However,

\[ b_{2n} = |a_{2n}| = |x^{2n}| = |x|^n\,|x|^n = b_n b_n \to \ell^2 \]

and so we see that $\ell = \ell^2$. Therefore either $\ell = 0$ or else $\ell = 1$. The value
$\ell = 1$ is not possible because $(b_n)$ converges to its greatest lower bound
and the value 1 is not a lower bound. Hence $\ell = 0$ and we conclude that
$|a_n| \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
Then there is some $N$ such that $n > N$ implies that

\[ \bigl| |a_n| - 0 \bigr| = |a_n| = |a_n - 0| < \varepsilon \]

which shows that $x^n = a_n \to 0$ as $n \to \infty$.
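The quotient examples above lend themselves to a quick exact check. The sketch below is only an illustration (the function names `a5` and `a7` are made up here to refer to items 5 and 7); it evaluates the two rational expressions with exact fractions, so the comparison with $7/3$ and with the lower bound $2n^2/7$ involves no rounding.

```python
from fractions import Fraction


def a5(n: int) -> Fraction:
    """Item 5: (7n^3 - 5n^2 + 3n - 9) / (3n^3 + 4n^2 - 8n + 2)."""
    return Fraction(7 * n**3 - 5 * n**2 + 3 * n - 9,
                    3 * n**3 + 4 * n**2 - 8 * n + 2)


def a7(n: int) -> Fraction:
    """Item 7: (2n^5 + 4) / (n^3 + 6)."""
    return Fraction(2 * n**5 + 4, n**3 + 6)


for n in (10, 1000, 100000):
    # Distance of a5(n) from 7/3, and a7(n) next to its lower bound 2n^2/7.
    print(n, float(abs(a5(n) - Fraction(7, 3))), float(a7(n)), 2 * n * n / 7)
```

The first printed column shrinks roughly like a constant over $n$, while the values for item 7 stay above $2n^2/7$ and so grow without bound.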

The next result is very useful.

Proposition 3.22. Suppose that $(c_n)$ is a sequence in $\mathbb{R}$ with $c_n \ge 0$ for all
$n \in \mathbb{N}$ and such that $c_n \to \alpha$ as $n \to \infty$. Then $\alpha \ge 0$. In other words,
the limit of a convergent positive sequence is positive. (Note that we are
using the term "positive" to mean "not strictly negative", so that the value zero
is allowed.)

Proof. Exactly one of $\alpha < 0$, $\alpha = 0$ or $\alpha > 0$ is true. We wish to show that
the first is impossible. To do this, suppose the contrary, that is, suppose
that $\alpha < 0$. We will obtain a contradiction from this.
Let $\varepsilon = -\alpha$. Then according to our hypothesis, $\varepsilon > 0$. We know that
$c_n \to \alpha$ as $n \to \infty$ and so we can say that there is some $N$ in $\mathbb{N}$ such that
$n > N$ implies that $|c_n - \alpha| < \varepsilon$. How can we use this? Fix any $n > N$, for
example we could take $n = N + 1$. The inequality $|c_n - \alpha| < \varepsilon$ is equivalent
to the pair of inequalities

\[ -\varepsilon < c_n - \alpha < \varepsilon . \]

Recalling that $\varepsilon = -\alpha$, we find that

\[ c_n - \alpha < \varepsilon = -\alpha . \]

This tells us that $c_n < 0$ which is false. We have obtained our contradiction
and so we can conclude that, as claimed, it is true that $\alpha \ge 0$.

It is natural to ask whether strict positivity of every $c_n$ implies that of
the limit $\alpha$, that is, if $c_n > 0$ for all $n$, can we deduce that necessarily $\alpha > 0$?
The answer is no. To show this, we just need to exhibit an explicit example.
Such an example is provided by the sequence $c_n = 1/n$. It is true here that
$c_n = 1/n > 0$ for every $n$. The sequence $(c_n)$ converges, but its limit is
$\alpha = 0$. So we have $c_n > 0$ for all $n$, $c_n \to \alpha$ as $n \to \infty$ but $\alpha = 0$.
The following theorem provides a useful technique for exhibiting con-
vergence of a sequence even under circumstances where we do not know
explicitly the values of the terms of the sequence.

Theorem 3.23 (Sandwich Principle). Suppose that $(a_n)$, $(b_n)$ and $(x_n)$ are
sequences in $\mathbb{R}$ such that

(i) $a_n \le x_n \le b_n$ for all $n \in \mathbb{N}$ and

(ii) both $a_n \to \mu$ and $b_n \to \mu$ as $n \to \infty$.

Then $(x_n)$ converges and its limit is $\mu$.

Proof. Let $\varepsilon > 0$ be given.

The inequalities $a_n \le x_n \le b_n$ can be rewritten as

\[ 0 \le \underbrace{x_n - a_n}_{y_n} \le \underbrace{b_n - a_n}_{z_n} . \]

Since both $(a_n)$ and $(b_n)$ converge to $\mu$, it follows that $z_n = b_n - a_n \to
\mu - \mu = 0$ as $n \to \infty$. Hence there is some $N$ in $\mathbb{N}$ such that $n > N$ implies
that $|z_n| < \varepsilon$. But since $y_n = x_n - a_n \ge 0$, we have $|y_n| = y_n$ and so $n > N$
implies that
\[ |y_n| = y_n \le z_n = |z_n| < \varepsilon \]
which means that $y_n \to 0$ as $n \to \infty$. To finish the proof, we observe that
$x_n = y_n + a_n \to 0 + \mu$ as $n \to \infty$ and we are done.

We illustrate this with a proof that any real number can be approximated
by rationals.
Theorem 3.24. Any real number is the limit of some sequence of rational
numbers.
Proof. Let $a$ be any given real number. For each $n \in \mathbb{N}$, we know that there
is a rational number $q_n$, say, lying between the numbers $a$ and $a + 1/n$. That
is, $q_n$ satisfies
\[ a \le q_n \le a + \tfrac{1}{n} . \]
Since $1/n \to 0$, an application of the Sandwich Principle tells us immediately
that $q_n \to a$ as $n \to \infty$, as required.

Note that a similar proof shows that any real number is the limit of
a sequence of irrational numbers (just replace the adjective rational by
irrational.) The point though is that even though one might think of the
irrational numbers as somewhat weird, they can nevertheless be approxi-
mated as closely as desired by rational numbers.
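One concrete way to pick the rationals in Theorem 3.24 is $q_n = \lceil an \rceil / n$, which satisfies $a \le q_n < a + 1/n$. This particular choice is an assumption made here for illustration; the proof only needs some rational in the interval. In the sketch below the exact value of a floating-point number stands in for the real number $a$, purely so that the inequalities can be tested exactly.

```python
from fractions import Fraction
from math import ceil, pi


def q(a: Fraction, n: int) -> Fraction:
    """Return the rational ceil(a*n)/n, which lies in [a, a + 1/n)."""
    return Fraction(ceil(a * n), n)


# The exact rational value of the float pi plays the role of "a" here.
a = Fraction(pi)
for n in (1, 10, 100, 1000, 10**6):
    print(n, q(a, n), float(q(a, n) - a))
```

The printed differences are non-negative and smaller than $1/n$, which is all that the Sandwich Principle requires.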

Subsequences
Consider the sequence $(a_n)$ given by $a_n = \sin(\frac12 n\pi)$ for $n \in \mathbb{N}$. Evidently,
$a_n = 0$ if $n$ is even and alternates between $\pm 1$ for $n$ odd. For example, the
first 5 terms are $a_1 = 1$, $a_2 = 0$, $a_3 = -1$, $a_4 = 0$, $a_5 = 1$.
Next, consider the sequence

\[ (b_n) = \bigl(1, \tfrac23, \tfrac13, \tfrac45, \tfrac15, \dots \bigr) . \]

This is given by
\[ b_n = \begin{cases} \dfrac{1}{n}, & \text{for $n$ odd} \\[1ex] \dfrac{n}{n+1}, & \text{for $n$ even.} \end{cases} \]
We notice that the odd terms approach 0 whereas the even terms approach 1.


These two examples suggest that we might well be interested in considering


certain terms of a sequence in isolation from the original sequence. This idea
is formalized in the concept of a subsequence of a sequence. Roughly speak-
ing, a subsequence of a sequence is simply any sequence obtained by leaving
out particular terms from the original sequence. For example, the even terms
a2 , a4 , a6 , . . . form a subsequence of the sequence (an ). Another subsequence
of (an ) is obtained by considering, say, every tenth term, a10 , a20 , a30 , . . . .
Definition 3.25. Let $(a_n)$ be a given sequence. A subsequence of $(a_n)_{n \in \mathbb{N}}$ is
any sequence of the form $(a_{n_1}, a_{n_2}, a_{n_3}, \dots)$ where $n_1 < n_2 < n_3 < \dots$ is
any (strictly increasing) sequence of natural numbers.
We can express this somewhat more formally as follows. A sequence
$(b_k)_{k \in \mathbb{N}}$ is a subsequence of the sequence $(a_n)_{n \in \mathbb{N}}$ if there is some mapping
$\varphi : \mathbb{N} \to \mathbb{N}$ such that $i < j$ implies that $\varphi(i) < \varphi(j)$ (i.e., $\varphi$ is strictly
increasing) and such that $b_k = a_{\varphi(k)}$ for each $k \in \mathbb{N}$. This agrees with the
above formulation if we simply set $\varphi(k) = n_k$ and put $b_k = a_{\varphi(k)} = a_{n_k}$. (It
really just amounts to a matter of notation.)
Of course, $(b_k)_{k \in \mathbb{N}}$ is a sequence in its own right and so one can consider
subsequences of $(b_k)$. Evidently, a subsequence of $(b_k)$ is also a subsequence
of $(a_n)_{n \in \mathbb{N}}$. This is intuitively clear. We get a subsequence of $(b_k)$ by leaving
out some of its terms. However, $(b_k)$ itself was obtained from $(a_n)$ by leaving
out various terms of $(a_n)$, so if we leave out both lots in one step, we get our
subsequence of $(b_k)$ directly from $(a_n)$. To see this more formally, suppose
that $(c_j)_{j \in \mathbb{N}}$ is a subsequence of $(b_k)_{k \in \mathbb{N}}$. Then there is a strictly increasing
map $\psi : \mathbb{N} \to \mathbb{N}$ such that $c_j = b_{\psi(j)}$ for all $j \in \mathbb{N}$. However, since $(b_k)$ is a
subsequence of $(a_n)$, there is a strictly increasing map $\varphi : \mathbb{N} \to \mathbb{N}$ such that
$b_k = a_{\varphi(k)}$ for all $k \in \mathbb{N}$. This means that we can write $c_j$ as

\[ c_j = b_{\psi(j)} = a_{\varphi(\psi(j))} \]

for $j \in \mathbb{N}$. Let $\gamma : \mathbb{N} \to \mathbb{N}$ be the map $\gamma(j) = \varphi(\psi(j))$. Evidently $\gamma$ is strictly
increasing and $c_j = a_{\gamma(j)}$ for $j \in \mathbb{N}$. This shows that $(c_j)_{j \in \mathbb{N}}$ is a subsequence
of $(a_n)_{n \in \mathbb{N}}$.
Remark 3.26. Let $(a_{n_j})$ be a subsequence of $(a_n)$. It is intuitively clear that,
say, the 20th term of $(a_{n_j})$ has to be at least the 20th term of $(a_n)$. In
general, the term $a_{n_j}$ is at least as far along the $(a_n)$ sequence as the $j$th or,
in other words, $n_j \ge j$.
We will verify this by induction. For $j \in \mathbb{N}$, let $P(j)$ be the statement
"$n_j \ge j$". Now, $n_j \in \mathbb{N}$ and so, in particular, $n_1 \ge 1$, which means that
$P(1)$ is true. Fix $j \in \mathbb{N}$ and suppose that $P(j)$ is true. We will show that
this implies that $P(j + 1)$ is also true. Indeed, $n_j$ is strictly increasing in $j$
and so we have

\[ n_{j+1} > n_j \ge j , \quad\text{by the induction hypothesis that $P(j)$ is true.} \]

Since all quantities under consideration are integer-valued, we deduce that
$n_{j+1} \ge j + 1$, i.e., $P(j + 1)$ is true. It follows, by induction, that $P(j)$ is true
for all $j \in \mathbb{N}$.
Proposition 3.27. Suppose that $(a_n)$ converges to $\alpha$. Then so does every
subsequence of $(a_n)$.
Proof. Let $(a_{n_k})_{k \in \mathbb{N}}$ be any subsequence of $(a_n)$ whatsoever. We wish to
show that $a_{n_k} \to \alpha$ as $k \to \infty$.
Let $\varepsilon > 0$ be given.
Now, we know that $a_n \to \alpha$ as $n \to \infty$. Therefore, we are assured that
there is some $N \in \mathbb{N}$ such that $n > N$ implies that $|a_n - \alpha| < \varepsilon$. But $(a_{n_k})$
is a subsequence of $(a_n)$ and so we know that $n_k \ge k$ for all $k \in \mathbb{N}$. It
follows that if $k > N$, then certainly $n_k > N$. Hence, $k > N$ implies that
$|a_{n_k} - \alpha| < \varepsilon$ and the proof is complete.

Remark 3.28. The proposition tells us that if (an ) converges, then so does
any subsequence, and to the same limit.
Consider the sequence $a_n = (-1)^n$. Then we see that $a_{2n} = 1$ for
all $n$, whereas $a_{2n-1} = -1$ for all $n$, so that $(a_n)$ certainly possesses two
subsequences which both converge but to different limits. Consequently, the
original sequence cannot possibly converge. (If it did, every subsequence
would have to converge to the same limit, namely, the limit of the original
sequence.)

Bolzano-Weierstrass Theorem
Before we launch into one of the most important results of real analysis, let
us make one or two observations regarding upper and lower bounds.
Suppose that $A$ and $B$ are subsets of $\mathbb{R}$ with $A \subseteq B$. If $M$ is such that
$b \le M$ for all $b \in B$, then certainly, in particular, $a \le M$ for all $a \in A$. In
other words, an upper bound for $B$ is also (a fortiori) an upper bound for
any subset $A$ of $B$. Now, $\operatorname{lub} B$ is an upper bound for $B$ and so $\operatorname{lub} B$ is
certainly an upper bound for $A$. It follows that

\[ \operatorname{lub} A \le \operatorname{lub} B . \]

It is possible for the inequality here to be strict. For example, if $A$ is the
interval $A = [1, 2]$ and $B$ is the interval $B = [0, 3]$, then $A \subseteq B$ and evidently
$\operatorname{lub} A = 2$ whereas $\operatorname{lub} B = 3$, so that $\operatorname{lub} A < \operatorname{lub} B$ in this case.
Similarly, we note that if $m$ is a lower bound for $B$, then $m$ is also a
lower bound for $A$ and so
\[ \operatorname{glb} B \le \operatorname{glb} A . \]
With the example $A = [1, 2]$ and $B = [0, 3]$, as above, we see that $\operatorname{glb} B = 0$
and $\operatorname{glb} A = 1$.

Theorem 3.29 (Bolzano-Weierstrass Theorem). Any bounded sequence of
real numbers possesses a convergent subsequence.
(In other words, if $(a_n)_{n \in \mathbb{N}}$ is a bounded sequence in $\mathbb{R}$, then there is a
strictly increasing sequence $(n_k)_{k \in \mathbb{N}}$ of natural numbers such that $(a_{n_k})_{k \in \mathbb{N}}$
converges.)

Proof. Suppose that $M$ and $m$ are upper and lower bounds for $(a_n)$,

\[ m \le a_n \le M . \tag{$*$} \]

The idea of the proof is to construct a certain bounded monotone decreasing
sequence and use the fact that this converges to its greatest lower bound
and to drag a suitable subsequence of $(a_n)$ along with this.

We construct the first element of the auxiliary monotone sequence. Let
$M_1 = \operatorname{lub}\{ a_n : n \in \mathbb{N} \}$. Then $M_1 - 1$ is not an upper bound for $\{ a_n : n \in \mathbb{N} \}$
and so there must be some $n_1$, say, in $\mathbb{N}$ such that

\[ M_1 - 1 < a_{n_1} \le M_1 . \]

(The value 1 subtracted here (from $M_1$) is not important. We could have
chosen any positive number. However, we shall repeat this process and we
require a sequence of positive numbers which converge to 0. The numbers
$1, \frac12, \frac13, \dots$ suit our purpose.) We note that $M_1$ is an upper bound for $(a_n)$
and so, in particular, it is an upper bound for the set $\{ a_n : n > n_1 \}$.
Next, we construct $M_2$ as follows. Let $M_2 = \operatorname{lub}\{ a_n : n > n_1 \}$. Then
$M_2 \le M_1$ and moreover, $M_2 - \frac12$ is not an upper bound for $\{ a_n : n > n_1 \}$
and so there is some $n_2 > n_1$ such that
\[ M_2 - \tfrac12 < a_{n_2} \le M_2 . \]

The way ahead is now clear. Let $M_3 = \operatorname{lub}\{ a_n : n > n_2 \}$. Then
$M_3 \le M_2$ and since $M_3 - \frac13$ is not an upper bound for $\{ a_n : n > n_2 \}$ there
must be some $n_3 > n_2$ such that
\[ M_3 - \tfrac13 < a_{n_3} \le M_3 . \]

Continuing in this way, we construct a sequence $(M_j)_{j \in \mathbb{N}}$ and a sequence
$(n_j)_{j \in \mathbb{N}}$ of natural numbers such that $M_{j+1} \le M_j$, $n_{j+1} > n_j$, and
\[ M_j - \tfrac{1}{j} < a_{n_j} \le M_j \]

for all $j \in \mathbb{N}$.
Now we note that $m \le a_{n_j} \le M_j$ and so $(M_j)$ is a decreasing sequence
which is bounded from below. It follows that $M_j \to \mu$ as $j \to \infty$, where
$\mu = \operatorname{glb}\{ M_j : j \in \mathbb{N} \}$. We are not interested in the value of this limit $\mu$.
All we need to know is that the sequence $(M_j)_{j \in \mathbb{N}}$ converges to something.
However, by our very construction,
\[ M_j - \tfrac{1}{j} < a_{n_j} \le M_j \]

and, since $M_j - \frac1j \to \mu$ as well, the Sandwich Principle, Theorem 3.23,
gives $a_{n_j} \to \mu$ as $j \to \infty$.
We have succeeded in exhibiting a convergent subsequence, namely the
subsequence $(a_{n_j})_{j \in \mathbb{N}}$ and the proof is complete.

Remark 3.30. Note that the theorem does not tell us anything about the
subsequence or its limit. Indeed, it cannot, because we know nothing about
our original sequence other than the fact that it is bounded. It can also
happen that there are many convergent subsequences with different limits.
It is easy to construct such examples. For example, let $(u_n)$, $(v_n)$ and $(w_n)$
be any three given convergent sequences, say, $u_n \to u$, $v_n \to v$ and $w_n \to w$.
We construct the sequence $(a_n)$ as follows:

\[ (a_1, a_2, a_3, a_4, \dots) = (u_1, v_1, w_1, u_2, v_2, w_2, u_3, \dots) . \]

In other words, the three sequences $(u_n)$, $(v_n)$ and $(w_n)$ are dovetailed to
form $(a_n)$. Explicitly, for $n \in \mathbb{N}$,

\[ a_n = \begin{cases}
u_k, & \text{if $n = 3k - 2$ for some $k \in \mathbb{N}$,} \\
v_k, & \text{if $n = 3k - 1$ for some $k \in \mathbb{N}$,} \\
w_k, & \text{if $n = 3k$ for some $k \in \mathbb{N}$.}
\end{cases} \]

Evidently, if $u$, $v$ and $w$ are different, then the sequences $(a_{3j-2})_{j \in \mathbb{N}} =
(u_j)_{j \in \mathbb{N}}$, $(a_{3j-1})_{j \in \mathbb{N}} = (v_j)_{j \in \mathbb{N}}$ and $(a_{3j})_{j \in \mathbb{N}} = (w_j)_{j \in \mathbb{N}}$ are three convergent
subsequences of $(a_n)_{n \in \mathbb{N}}$ with different limits.
Let us say that a real number is a limit point of a given sequence if it
is the limit of some convergent subsequence. Then in this terminology, the
real numbers u, v and w are limit points of the sequence (an ).
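The dovetailing construction of Remark 3.30 is easy to try out. The sketch below is illustrative only: the three sequences `u`, `v` and `w` (with limits 0, 1 and 2) are arbitrary choices made here, and `dovetail` is a hypothetical helper name.

```python
from itertools import count, islice


def dovetail(u, v, w):
    """Yield u(1), v(1), w(1), u(2), v(2), w(2), ... in that order."""
    for k in count(1):
        yield u(k)
        yield v(k)
        yield w(k)


def u(k): return 1 / k          # converges to 0
def v(k): return 1 + 1 / k      # converges to 1
def w(k): return 2 - 1 / k      # converges to 2


a = list(islice(dovetail(u, v, w), 3000))   # a[n - 1] holds the term a_n

# The subsequences (a_{3j-2}), (a_{3j-1}), (a_{3j}) recover u, v and w.
sub_u, sub_v, sub_w = a[0::3], a[1::3], a[2::3]
print(sub_u[-1], sub_v[-1], sub_w[-1])
```

The three printed values sit close to 0, 1 and 2 respectively, so this particular $(a_n)$ has three distinct limit points and does not itself converge.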

Next, we need a little more terminology.

Definition 3.31. A sequence $(a_n)_{n \in \mathbb{N}}$ is said to be a Cauchy sequence (also
known as a fundamental sequence) if it has the property that for any given
$\varepsilon > 0$ there is some $N \in \mathbb{N}$ such that both $n > N$ and $m > N$ imply that

\[ |a_n - a_m| < \varepsilon . \]

In other words, eventually the distance between any two terms of the
sequence is less than $\varepsilon$.


Proposition 3.32. Every Cauchy sequence is bounded.

Proof. Suppose that $(a_n)$ is a Cauchy sequence. Then we know that there
is some $N \in \mathbb{N}$ such that both $n > N$ and $m > N$ imply that

\[ |a_n - a_m| < 1 . \]

(The value 1 on the right hand side here is not at all critical. We could have
selected any positive real number instead, with obvious modifications to the
following reasoning.) In particular, for any $j > N$,

\[ |a_j| \le |a_j - a_{N+1}| + |a_{N+1}| < 1 + |a_{N+1}| . \]

It follows that if we let $M = 1 + \max\{ |a_i| : 1 \le i \le N + 1 \}$, then we have

\[ |a_k| \le M \]

for all $k \in \mathbb{N}$. This shows that $(a_n)$ is bounded.

We have seen that a bounded monotone sequence must converge. The


next theorem is very important as it gives us a necessary and sufficient
condition for convergence of a sequence.

Theorem 3.33. A sequence converges in R if and only if it is a Cauchy


sequence.

Proof. We must show that any Cauchy sequence has to converge and, con-
versely, that any convergent sequence is a Cauchy sequence.
So suppose that $(a_n)_{n \in \mathbb{N}}$ is a Cauchy sequence. We must show that there
is some $\alpha$ such that $a_n \to \alpha$ as $n \to \infty$. At first, this might seem impossible
because there is no way of knowing what $\alpha$ might be. However, it turns out
that we do not need to know the actual value of $\alpha$ but rather just that it
does exist. Indeed, we have seen that a Cauchy sequence is bounded and the
Bolzano-Weierstrass Theorem tells us that a bounded sequence possesses a
convergent subsequence. We shall show that this is enough to guarantee
that the sequence itself converges.
Let $\varepsilon > 0$ be given.
As noted above, we know that $(a_n)$ has some convergent subsequence, say
$a_{n_k} \to \alpha$ as $k \to \infty$. We shall show that $a_n \to \alpha$ by an $\varepsilon/2$-argument. Since
we know that $a_{n_k} \to \alpha$ as $k \to \infty$, we can say that there is $k_0 \in \mathbb{N}$ such that
$k > k_0$ implies that
\[ |a_{n_k} - \alpha| < \tfrac12 \varepsilon . \]
Since $(a_n)$ is a Cauchy sequence, there is $N_0$ such that both $n > N_0$ and
$m > N_0$ imply that
\[ |a_n - a_m| < \tfrac12 \varepsilon . \]

Let $N = \max\{ k_0, N_0 \}$. Now, if $k > N$ it follows that also $n_k > N$ (since
$n_k \ge k$) and so if $k > N$ then
\[ |a_k - \alpha| \le |a_k - a_{n_k}| + |a_{n_k} - \alpha| < \tfrac12 \varepsilon + \tfrac12 \varepsilon = \varepsilon . \]

Thus $a_k \to \alpha$ as $k \to \infty$ as required.
Next, suppose that $(a_n)$ converges. We must show that $(a_n)$ is a Cauchy
sequence.
Let $\varepsilon > 0$ be given.
We use an $\varepsilon/2$-argument. Let $\alpha$ denote $\lim_{n \to \infty} a_n$. Then there is $N \in \mathbb{N}$
such that $n > N$ implies that
\[ |a_n - \alpha| < \tfrac12 \varepsilon . \]

But then if both $n > N$ and $m > N$, we have

\[ |a_n - a_m| \le |a_n - \alpha| + |\alpha - a_m| < \tfrac12 \varepsilon + \tfrac12 \varepsilon = \varepsilon \]

which verifies that $(a_n)$ is indeed a Cauchy sequence, as claimed.


Some special sequences


Example 3.34. What happens to $c^{1/n}$ as $n \to \infty$ for given fixed $c > 0$?
To investigate this, let $c > 0$ and consider the sequence given by $(c^{1/n}) =
(c, c^{1/2}, c^{1/3}, c^{1/4}, \dots)$. Suppose first that $c > 1$. Then $c^{1/n} > 1$. For $n \in \mathbb{N}$,
let $d_n$ be given by $d_n = c^{1/n} - 1$, so that $d_n > 0$ and $c^{1/n} = 1 + d_n$. Hence,
by the binomial theorem,

\begin{align*}
c = (1 + d_n)^n &= 1 + n d_n + \binom{n}{2} d_n^2 + \dots + \binom{n}{n-1} d_n^{n-1} + d_n^n \\
                &\ge 1 + n d_n .
\end{align*}

It follows that $c - 1 \ge n d_n$ and so we have

\[ 0 < d_n \le \frac{c - 1}{n} . \]

It follows from the Sandwich Principle that $d_n \to 0$ as $n \to \infty$. Hence, for
any $c > 1$, $c^{1/n} = 1 + d_n \to 1$ as $n \to \infty$.
If $c = 1$, then evidently $c^{1/n} = 1^{1/n} = 1 \to 1$ as $n \to \infty$.
Now suppose that $0 < c < 1$. Set $\gamma = 1/c$ so that $\gamma > 1$. Then from the
above, $c^{1/n} = (1/\gamma)^{1/n} = 1/(\gamma^{1/n}) \to 1$ as $n \to \infty$.
We conclude that $c^{1/n} \to 1$ as $n \to \infty$ for any fixed $c > 0$.

\[ c^{1/n} \to 1 \ \text{ as } n \to \infty \ \text{ for any fixed } c > 0 \]

Example 3.35. What happens to $n^{1/n}$ as $n \to \infty$? There is conflicting
behaviour here. Taking the $n$th root would tend to make things smaller, but
one is taking the $n$th root of $n$ which itself gets larger. It is not immediately
clear what will happen.
Define $k_n$ by $n^{1/n} = 1 + k_n$ (so that $k_n = n^{1/n} - 1$). Then $k_n > 0$ for all
$n > 1$. We shall show that $k_n \to 0$ as $n \to \infty$. To see this, notice that for
any $n > 1$

\begin{align*}
n &= (1 + k_n)^n \\
  &= 1 + n k_n + \frac{n(n-1)}{2} k_n^2 + \dots + k_n^n \\
  &> \frac{n(n-1)}{2} k_n^2 .
\end{align*}

Hence, for $n > 1$,
\[ 0 < k_n^2 < \frac{2}{n - 1} \]

and by the Sandwich Principle, we deduce that $k_n^2 \to 0$ as $n \to \infty$. Since
$k_n > 0$, it follows that $k_n \to 0$ too (if $k_n^2 < \varepsilon^2$ then $k_n < \varepsilon$). Hence
$n^{1/n} = 1 + k_n \to 1$ as $n \to \infty$.

\[ n^{1/n} \to 1 \ \text{ as } n \to \infty \]
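A short numerical look at Example 3.35 may help. Dropping all but the quadratic term of the binomial expansion gives $n > \frac{n(n-1)}{2} k_n^2$, that is, $k_n < \sqrt{2/(n-1)}$ for $n > 1$. The sketch below (floating-point, so only indicative; the function name `k` simply mirrors $k_n$) prints $k_n$ next to this bound.

```python
from math import sqrt


def k(n: int) -> float:
    """Return k_n = n**(1/n) - 1, computed in floating point."""
    return n ** (1.0 / n) - 1.0


for n in (2, 10, 100, 10_000, 1_000_000):
    # k_n should stay strictly between 0 and sqrt(2 / (n - 1)).
    print(n, k(n), sqrt(2 / (n - 1)))
```

Both columns shrink towards 0, with $k_n$ well inside the bound; the bound is crude but is all that the Sandwich Principle needs.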

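Example 3.36. What happens to $c^n / n!$ as $n \to \infty$ for fixed $c \in \mathbb{R}$? If $c > 1$,
then $c^n$ gets large as $n$ grows but so does the denominator $n!$. There is
conflicting behaviour here, so it is not obvious what does happen.
For any $c \in \mathbb{R}$, choose an integer $k \in \mathbb{N}$ such that $k > |c|$. The $(k + m)$th
term of the sequence is
\[ \frac{c^{k+m}}{(k+m)!} = \frac{c^k}{k!} \cdot \frac{c^m}{(k+1)(k+2) \cdots (k+m)} . \]
We have
\begin{align*}
0 \le \left| \frac{c^{k+m}}{(k+m)!} \right|
  &= \frac{|c|^k}{k!} \cdot \frac{|c|^m}{(k+1)(k+2) \cdots (k+m)} \\
  &\le \frac{|c|^k}{k!} \cdot \frac{|c|^m}{k^m} \\
  &= \frac{|c|^k}{k!}\, \rho^m
\end{align*}
where $\rho = |c| / k < 1$. Now let $a_j = |c|^j / j!$ for $1 \le j \le k$ and let $a_j = \frac{|c|^k}{k!} \rho^m$
for $j = k + m$ with $m \ge 1$. Then evidently $a_j \to 0$ as $j \to \infty$ and
\[ 0 \le \left| \frac{c^j}{j!} \right| \le a_j . \]

By the Sandwich Principle, it follows that $|c^j / j!| \to 0$ and hence we also
have $c^j / j! \to 0$ as $j \to \infty$.

\[ c^j / j! \to 0 \ \text{ as } j \to \infty \ \text{ for any fixed } c \in \mathbb{R} \]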
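The competition between $c^n$ and $n!$ can be watched directly. The following sketch is an illustration only (the helper name `terms` and the choice $c = 10$ are assumptions made here); it builds the terms through the recurrence $t_j = t_{j-1} \cdot c / j$ rather than computing the huge numbers $c^j$ and $j!$ separately.

```python
def terms(c: float, count: int) -> list[float]:
    """Return [c^1/1!, c^2/2!, ..., c^count/count!] via t_j = t_{j-1} * c / j."""
    out = []
    t = 1.0
    for j in range(1, count + 1):
        t *= c / j
        out.append(t)
    return out


ts = terms(10.0, 60)
for j in (1, 5, 10, 20, 40, 60):
    print(j, ts[j - 1])
```

With $c = 10$ the terms first grow, peaking around $j = 10$, and then collapse once $j$ exceeds $|c|$, which is the geometric decay with ratio at most $|c|/k$ used in the example.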


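Example 3.37. What happens to $\sqrt{n+1} - \sqrt{n}$ as $n \to \infty$? Each of the
two terms becomes large but what about their difference? To see what does
happen, we use a trick and write

\begin{align*}
0 < \sqrt{n+1} - \sqrt{n} &= \frac{(\sqrt{n+1} - \sqrt{n})(\sqrt{n+1} + \sqrt{n})}{\sqrt{n+1} + \sqrt{n}} \\
  &= \frac{(n+1) - n}{\sqrt{n+1} + \sqrt{n}} \\
  &= \frac{1}{\sqrt{n+1} + \sqrt{n}} < \frac{1}{\sqrt{n}}
\end{align*}

and so by the Sandwich Principle, we deduce that $\sqrt{n+1} - \sqrt{n} \to 0$ as
$n \to \infty$.

\[ \sqrt{n+1} - \sqrt{n} \to 0 \ \text{ as } n \to \infty \]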
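The conjugate trick of Example 3.37 is also how one would evaluate this difference on a computer, since subtracting two nearly equal square roots loses accuracy. The sketch below is illustrative; the two function names are made up here.

```python
from math import sqrt


def diff_naive(n: int) -> float:
    """sqrt(n+1) - sqrt(n), computed by direct subtraction."""
    return sqrt(n + 1) - sqrt(n)


def diff_conjugate(n: int) -> float:
    """The same quantity rewritten as 1 / (sqrt(n+1) + sqrt(n))."""
    return 1.0 / (sqrt(n + 1) + sqrt(n))


for n in (1, 100, 10**4, 10**8):
    print(n, diff_naive(n), diff_conjugate(n), 1 / sqrt(n))
```

The last column is the bound $1/\sqrt{n}$ from the example; the first two columns agree closely for moderate $n$ and both stay below it.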

Example 3.38. Let $0 < a < 1$ and let $k \in \mathbb{N}$ be fixed. What happens to $n^k a^n$
as $n \to \infty$? The term $n^k$ gets large but the term $a^n$ becomes small as $n$
grows. We have conflicting behaviour.
To investigate this, first let us note that $n^k = (n^{k/n})^n$ and also that
$n^{k/n} = (n^{1/n})^k \to 1^k = 1$ as $n \to \infty$. It follows that $n^{k/n} a \to a$ as $n \to \infty$.
Let $r$ obey $a < r < 1$. Then eventually $n^{k/n} a < r$ (because with $\varepsilon = r - a$,
eventually $n^{k/n} a - a < \varepsilon = r - a$). It follows that eventually

\[ 0 < n^k a^n = (n^{k/n} a)^n < r^n . \]

But $r^n \to 0$ and so by the Sandwich Principle we conclude that $n^k a^n \to 0$
as $n \to \infty$. (There is $N \in \mathbb{N}$ such that $n > N$ implies that

\[ 0 < n^k a^n = (n^{k/n} a)^n < r^n . \]

For $1 \le n \le N$, set $b_n = n^k a^n$ and for $n > N$ set $b_n = r^n$. Then $b_n \to 0$ as
$n \to \infty$ and we have

\[ 0 < n^k a^n \le b_n \]

and so the Sandwich Principle tells us that $n^k a^n \to 0$ as $n \to \infty$.)

\[ n^k a^n \to 0 \ \text{ as } n \to \infty \ \text{ for any fixed } 0 < a < 1 \]


Sequences of functions
Just as one can have a sequence of real numbers, so one can have a sequence
of functions. By this is simply meant a family of functions labelled by the
natural numbers N. Consider, then, a sequence (fn )nN of functions. For
each given x, the sequence (fn (x))nN is just a sequence of real numbers,
as considered already. Here, as always, fn (x) is the notation for the value
taken by the function fn at the real number x. In this way, we get many
sequences, one for each x. Now, for some particular values of x the
sequence (fn (x))nN may converge whereas for other values of x it may not.
Even when it does converge, its limit will, in general, depend on the value
of x. These various values of the limit themselves determine a function of x.
This leads to the following notion of convergence of a sequence of functions.
Definition 3.39. Suppose that $(f_n)_{n \in \mathbb{N}}$ is a sequence of functions each defined
on a particular subset $S$ in $\mathbb{R}$. We say that the sequence $(f_n)_{n \in \mathbb{N}}$ converges
pointwise on $S$ to the function $f$ if for each $x \in S$ the sequence $(f_n(x))_{n \in \mathbb{N}}$ of
real numbers converges to the real number $f(x)$. We write $f_n \to f$ pointwise
on $S$ as $n \to \infty$.
Some examples will illustrate this important idea.
Examples 3.40.
1. Let $f_n(x) = x^n$ and let $S$ be the open interval $S = (-1, 1)$. We have
seen that $x^n \to 0$ as $n \to \infty$ for any $x$ with $|x| < 1$. This simply says
that $(f_n)$ converges pointwise to $f = 0$, the function identically zero on
the set $(-1, 1)$.

2. Let $f_n(x) = x^n$ as above, but now let $S = (-1, 1]$. Then for $|x| < 1$,
we know that $f_n(x) = x^n \to 0$ as $n \to \infty$. Furthermore, with $x = 1$,
we have $f_n(1) = 1^n = 1$, so that $f_n(1) \to 1$ as $n \to \infty$. Let $f$ be the
function on $S = (-1, 1]$ given by
\[ f(x) = \begin{cases} 0, & \text{for } -1 < x < 1, \\ 1, & \text{for } x = 1. \end{cases} \]

Then we can say that $f_n \to f$ pointwise on $(-1, 1]$.

3. Once again, let $f_n(x) = x^n$ but now let $S$ be the interval $S = [-1, 1]$.
We know that for each $x \in (-1, 1]$, the sequence $(f_n(x))$ of real numbers
converges. We must investigate what happens for $x = -1$. We see that
$f_n(-1) = (-1)^n$, so that the sequence $(f_n(-1))_{n \in \mathbb{N}}$ of real numbers does
not converge. This means that there does not exist a function $f$ on
$[-1, 1]$ with the property that $f_n(-1) \to f(-1)$. The conclusion is that
in this case $(f_n)$ does not converge pointwise on $[-1, 1]$ to any function
at all.
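The three cases in Examples 3.40 can be seen by tabulating $f_n(x) = x^n$ at a few sample points. This is a small illustrative sketch; the sample points and the helper name `f` are choices made here, and a finite table can of course only suggest, not prove, convergence.

```python
def f(n: int, x: float) -> float:
    """Return f_n(x) = x**n."""
    return x ** n


for x in (-1.0, -0.5, 0.9, 1.0):
    # For each fixed x, (f_n(x)) is just a sequence of real numbers in n.
    print(x, [f(n, x) for n in (1, 2, 3, 4, 50, 51)])
```

For $x = -0.5$ and $x = 0.9$ the values shrink towards 0, for $x = 1$ they are constantly 1, and for $x = -1$ they keep alternating between $-1$ and $1$, which is why the endpoint $-1$ must be excluded.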


These examples illustrate the obvious but nevertheless crucial point that
pointwise convergence of a sequence of functions involves not only a particu-
lar sequence of functions but also the set on which the pointwise convergence
is to be considered to take place. The notion of pointwise convergence only
makes sense when used together with the set to which it refers.

Chapter 4

Series

Given a sequence $a_1, a_2, \dots$ we wish to discuss the "infinite sum"

\[ a_1 + a_2 + a_3 + \dots \]

Such an expression is called an infinite series and is denoted by $\sum_{k=1}^{\infty} a_k$.
We shall attempt to interpret such a series as a suitable limiting object. To
this end, let $s_n$ be the so-called $n$th partial sum
\[ s_n = \sum_{k=1}^{n} a_k = a_1 + \dots + a_n . \]
Then as $n$ becomes larger, so $s_n$ looks more like the series $\sum_{k=1}^{\infty} a_k$. Of
course, there is the matter of convergence to be considered. The point is
that one can always write down the expression $\sum_{k=1}^{\infty} a_k$ but without some
extra discussion it is not at all clear what it actually means. It is certainly a
combination of symbols, but does it have any reasonable interpretation as
a real number? For example, if it happens to be the case that $a_k = 1$ for
every $k$, then $\sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} 1$. What does this mean? We see that in this
special case, $s_n = n$ which gets large. The answer is that $\sum_{k=1}^{\infty} a_k$ simply
has no meaning in this case. We say that the series $\sum_{k=1}^{\infty} a_k$ diverges.
As another example, suppose that $a_k = (-1)^{k+1}$. Then $a_k = 1$ for odd $k$
and is otherwise equal to $-1$. Then

\[ \sum_{k=1}^{\infty} a_k = 1 - 1 + 1 - 1 + 1 - 1 + \dots \]

which means what exactly? In this example, we see that $s_n = 1$ if $n$ is
odd but is zero if $n$ is even. The partial sums flip interminably between the
two values 1 and 0.
Definition 4.1. The series $\sum_{k=1}^{\infty} a_k$ is said to be convergent if the sequence
of partial sums $(s_n)_{n \in \mathbb{N}}$ converges.
If $s_n \to \alpha$ as $n \to \infty$, then $\alpha$ is said to be the sum of the series and the
expression $\sum_{k=1}^{\infty} a_k$ is defined to be this limit $\alpha$.
A series which is not convergent is said to be divergent.


Example 4.2. Let $a_k = (1/3)^k$, so that

\[ \sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} \frac{1}{3^k} . \]

We see that the partial sums are given by

\[ s_n = \frac13 + \dots + \left(\frac13\right)^{n}
       = \frac{\frac13 - (\frac13)^{n+1}}{1 - \frac13}
       = \frac12 - \frac12 \left(\frac13\right)^{n} \to \frac12 \]
as $n \to \infty$. Hence
\[ \sum_{k=1}^{\infty} \frac{1}{3^k} = \frac12 . \]
Note that the same argument shows that
\[ \sum_{k=1}^{\infty} x^k = \frac{x}{1 - x} \tag{$*$} \]

for any $x$ with $|x| < 1$. (The requirement that $|x| < 1$ ensures that $x^n \to 0$
as $n \to \infty$.)
Note that if we were to ignore the fact that it was not valid but go ahead
anyway and simply set $x = 1$ in the above formula ($*$), then we would have
$\sum_{k=1}^{\infty} 1$ on the left hand side and $1/0$ on the right hand side, neither of
which have meaning as real numbers. Again, if we ignore the fact that it
is invalid but anyway set $x = -1$ in ($*$), then the left hand side becomes
$-(1 - 1 + 1 - 1 + \dots)$ and the right hand side becomes $-\frac12$ which might lead
one to suggest that $1 - 1 + 1 - 1 + \dots$ is in some sense equal to $\frac12$. The fact
is that $1 - 1 + 1 - 1 + \dots$ has no sensible interpretation as a real number.
Returning to the series itself and setting $x = 5$, say, we see that the
partial sum $s_n = 5 + \dots + 5^n \ge 5^n$ so that the sequence $(s_n)$ does not
converge (it is not bounded) and therefore $\sum_{k=1}^{\infty} 5^k$ is divergent. That's it;
there is nothing more to say. The expression $\sum_{k=1}^{\infty} 5^k$ does not represent a real
number and it cannot be manipulated as if it did. (It is tempting to say
that $\sum_{k=1}^{\infty} 5^k$ has no meaning at all. However, it does implicitly carry with
it the discussion here to the effect that the sequence of partial sums does
not converge.)
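The partial sums in Example 4.2 can be computed exactly and compared with the closed form $s_n = (x - x^{n+1})/(1 - x)$, valid for $x \ne 1$. The sketch below is illustrative only; `partial_sum` is a hypothetical helper name, and exact fractions are used for $x = 1/3$ so that the identity can be checked without rounding.

```python
from fractions import Fraction


def partial_sum(x, n: int):
    """Return s_n = x + x^2 + ... + x^n."""
    return sum(x**k for k in range(1, n + 1))


x = Fraction(1, 3)
for n in (1, 2, 5, 10, 20):
    # s_n should equal 1/2 - (1/2)(1/3)^n, so the gap to 1/2 shrinks fast.
    print(n, partial_sum(x, n), float(Fraction(1, 2) - partial_sum(x, n)))

# Outside |x| < 1 the partial sums do not settle down.
print([partial_sum(5, n) for n in range(1, 6)])
print([partial_sum(-1, n) for n in range(1, 7)])
```

For $x = 5$ the partial sums dominate $5^n$, and for $x = -1$ they alternate between two values, matching the discussion of why formula ($*$) must not be used outside $|x| < 1$.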
The following divergence test allows us to immediately spot certain series
as being divergent.
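Proposition 4.3 (Test for divergence). Suppose that the sequence $(a_n)$ fails
to converge to 0 as $n \to \infty$. Then the series $\sum_{k=1}^{\infty} a_k$ diverges.

Proof. We must show that if $\sum_{k=1}^{\infty} a_k$ is convergent then $a_n \to 0$. So suppose
that $\sum_{k=1}^{\infty} a_k$ is convergent with sum $\alpha$, say. This means that $s_n \to \alpha$ as
$n \to \infty$, where $s_n = \sum_{k=1}^{n} a_k$.

Let $\varepsilon > 0$ be given.
We need an $\varepsilon/2$-argument. Since $s_n \to \alpha$, we are assured that there is some
$N' \in \mathbb{N}$ such that
\[ |s_n - \alpha| < \tfrac12 \varepsilon \]
whenever $n > N'$. But then

\begin{align*}
|a_n| = |s_n - s_{n-1}| &= |s_n - \alpha + \alpha - s_{n-1}| \\
  &\le |s_n - \alpha| + |\alpha - s_{n-1}| \\
  &< \tfrac12 \varepsilon + \tfrac12 \varepsilon = \varepsilon
\end{align*}

provided $n > N'$ and $n - 1 > N'$. So if we set $N = N' + 1$, then if $n > N$
we can be sure that
\[ |a_n| < \varepsilon \]
which establishes that $a_n \to 0$ as $n \to \infty$ and the proof is complete.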

Remark 4.4. It is very important to understand what this proposition says


and what it does not say. It says that if the terms of a series fail to converge
to zero, then the series itself is divergent.
It is quite possible to find a series whose terms do converge to zero but
nevertheless, the series is divergent. Such an example is provided by the
series $\sum_{k=1}^{\infty} a_k$ with $a_k = 1/k$.

\[ 1 + \frac12 + \frac13 + \frac14 + \frac15 + \dots \ \text{ is a divergent series} \]

Indeed, the sequence of partial sums $(s_n)$ is not bounded. One can see this
as follows. That portion of the graph of the function $y = 1/x$ between the
values $x = k$ and $x = k + 1$ lies below the line $y = 1/k$. Let $R_k$ denote the
rectangle with height $1/k$ and with base on the interval $[k, k + 1]$. Then the
area of $R_k$ is greater than the area under the graph of $y = 1/x$ between the
values $x = k$ and $x = k + 1$, that is,
\[ \text{area } R_k = \frac{1}{k} > \int_k^{k+1} \frac{1}{x}\,dx = \ln(k + 1) - \ln k . \]
Summing from $k = 1$ to $k = n$, we get
\[ s_n = 1 + \frac12 + \dots + \frac1n > \sum_{k=1}^{n} \bigl( \ln(k + 1) - \ln k \bigr) = \ln(n + 1) - \ln 1 = \ln(n + 1) . \]

But $\ln(n + 1) > \ln n$ which becomes arbitrarily large for large enough $n$.
So we conclude that $(s_n)$ is unbounded and so $\sum_{k=1}^{\infty} a_k$, with $a_k = 1/k$, is
divergent despite the fact that $a_n = 1/n \to 0$ as $n \to \infty$.

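An alternative argument is as follows. One notices that

\[ 1 + \frac12
   + \underbrace{\frac13 + \frac14}_{>\, 2 \cdot \frac14 \,=\, \frac12}
   + \underbrace{\frac15 + \frac16 + \frac17 + \frac18}_{>\, 4 \cdot \frac18 \,=\, \frac12}
   + \underbrace{\frac19 + \dots + \frac1{16}}_{>\, 8 \cdot \frac1{16} \,=\, \frac12}
   + \underbrace{\frac1{17} + \dots + \frac1{32}}_{>\, 16 \cdot \frac1{32} \,=\, \frac12}
   + \dots \]

and so we see that

\[ s_1 = 1, \quad s_2 = s_1 + \tfrac12, \quad s_4 > s_2 + \tfrac12, \quad s_8 > s_4 + \tfrac12, \quad s_{16} > s_8 + \tfrac12, \ \dots \]

and so it follows that $s_{2^j} \ge (j + 2) \frac12$ for $j \in \mathbb{N}$. (This inequality is strict
for $j > 1$, but this is of no consequence here.) So the sequence $(s_n)$ is not
bounded.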
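The next result tells us that we can do arithmetic with convergent series
just as we would expect.

Proposition 4.5. Suppose that $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$
are convergent series. Then the series $\sum_{k=1}^{\infty} \lambda a_k$, for any
$\lambda \in \mathbb{R}$, and $\sum_{k=1}^{\infty} (a_k + b_k)$ are also
convergent and have sums such that
\[ \sum_{k=1}^{\infty} \lambda a_k = \lambda \sum_{k=1}^{\infty} a_k
   \quad\text{and}\quad
   \sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k . \]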

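Proof. We just need to look at the partial sums and their limits. So let
$s_n = \sum_{k=1}^{n} a_k$ and let $t_n = \sum_{k=1}^{n} b_k$ and let
$\alpha = \sum_{k=1}^{\infty} a_k$ and $\beta = \sum_{k=1}^{\infty} b_k$.
By hypothesis, we know that $s_n \to \alpha$ and $t_n \to \beta$ as $n \to \infty$. But
\[ \sum_{k=1}^{n} \lambda a_k = \lambda s_n \to \lambda \alpha \]
and
\[ \sum_{k=1}^{n} (a_k + b_k) = s_n + t_n \to \alpha + \beta \]
as $n \to \infty$. It follows that $\sum_{k=1}^{\infty} \lambda a_k$ is convergent
with sum $\lambda \alpha = \lambda \sum_{k=1}^{\infty} a_k$ and also that
$\sum_{k=1}^{\infty} (a_k + b_k)$ is convergent with sum given by
$\alpha + \beta = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k$, as required.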

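Example 4.6. For $k \in \mathbb{N}$, let $a_k = 9/10^k$. Then
\[ \sum_{k=1}^{\infty} a_k = \frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \dots \]
which is usually referred to as "0.9... recurring". Is this series convergent and
if so, what is its sum? We see that
\[ \sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} \frac{9}{10^k}
   = 9 \sum_{k=1}^{\infty} \frac{1}{10^k} . \]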


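But we have seen earlier that $\sum_{k=1}^{\infty} \frac{1}{10^k}$ is convergent
with sum equal to $\frac{1}{10} / (1 - \frac{1}{10}) = \frac19$. It follows that
\[ \sum_{k=1}^{\infty} \frac{9}{10^k} = 9 \cdot \frac19 = 1. \]

\[ \sum_{k=1}^{\infty} \frac{9}{10^k} = 0.9999\ldots = 1 \]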
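
As a small illustration (a sketch only, with the hypothetical helper name `nines_partial_sum`), exact rational arithmetic shows that the $n$th partial sum of this series falls short of 1 by exactly $1/10^n$, which is why the limit is 1.

```python
from fractions import Fraction

def nines_partial_sum(n):
    """Return 9/10 + 9/10^2 + ... + 9/10^n as an exact fraction."""
    return sum(Fraction(9, 10**k) for k in range(1, n + 1))

for n in (1, 2, 5, 10):
    s_n = nines_partial_sum(n)
    # The gap 1 - s_n equals 1/10^n and shrinks to zero as n grows.
    print(n, s_n, 1 - s_n)
```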

Continuing with this theme, we have the following.


Example 4.7. Let $(a_k)_{k \in \mathbb{N}}$ be any sequence of integers taking
values in the set $\{0, 1, 2, \dots, 9\}$. Then $\sum_{k=1}^{\infty} a_k/10^k$ is
convergent with sum lying in the interval $[0, 1]$.
To see this, let $s_n = \sum_{k=1}^{n} a_k/10^k$ denote the $n$th partial sum of
the series $\sum_{k=1}^{\infty} a_k/10^k$. We note that
$s_{n+1} - s_n = a_{n+1}/10^{n+1} \ge 0$ and so the sequence of partial sums
$(s_n)$ is monotone increasing. Furthermore, since each $a_k$ is an integer in
the range 0 to 9, it follows that $a_k/10 \le 9/10$ and so we can say that
$a_k/10^k \le 9/10^k$. Hence, for any $n \in \mathbb{N}$,
\[ s_n = \sum_{k=1}^{n} \frac{a_k}{10^k} \le \sum_{k=1}^{n} \frac{9}{10^k}
   = 9 \, \frac{\frac{1}{10} - (\frac{1}{10})^{n+1}}{1 - \frac{1}{10}}
   < 9 \, \frac{\frac{1}{10}}{1 - \frac{1}{10}} = 1. \]
We have shown that the sequence $(s_n)$ is monotone increasing and bounded
and therefore it converges. Hence, by definition, the series
$\sum_{k=1}^{\infty} a_k/10^k$ is convergent.
We must now show that $\sum_{k=1}^{\infty} a_k/10^k = \alpha$, say, lies between
0 and 1. However, each $s_n \ge 0$ and since $s_n \to \alpha$ as $n \to \infty$,
it follows that it must also be true that $\alpha \ge 0$. Furthermore, we have
seen that $s_n < 1$ and so we have $1 - s_n > 0$. But $1 - s_n \to 1 - \alpha$ and
so it also follows that $1 - \alpha \ge 0$. Hence $0 \le \alpha \le 1$, as claimed.
We can interpret this as saying that every infinite decimal represents
some real number $x$ lying in the range $0 \le x \le 1$. The converse is also true.
Example 4.8. Let $x$ be a real number satisfying $0 \le x \le 1$. Then there is
a sequence of integers $(a_k)_{k \in \mathbb{N}}$ with values in
$\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$ such that the series
$\sum_{k=1}^{\infty} a_k/10^k$ is convergent with sum equal to $x$.
To show this, we must construct the sequence $(a_k)$. We know how to do
this for $x = 1$ (take $a_k = 9$ for all $k$), so let us suppose now that $0 \le x < 1$.
If we are told that
\[ x = \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \dots \]
then evidently,
\[ 10x = a_1 + \frac{a_2}{10} + \frac{a_3}{10^2} + \dots \]


so that $a_1$ is the integer part of $10x$, $a_1 = [10x]$. Similarly,
\[ 100x = 10a_1 + a_2 + \frac{a_3}{10} + \dots \]
so that $a_2$ is the integer part of $100x - 10a_1$, $a_2 = [100x - 10[10x]]$. In
this way, we can write any $a_k$ in terms of $x$. We simply use this idea to
construct the $a_k$s.
We isolate the following fact: if $u$ is any real number with $0 \le u < 1$, then
there is an integer $a$ in the set $\{0, 1, 2, \dots, 8, 9\}$ and a real number
$\rho$ obeying $0 \le \rho < 1$ such that $10u = a + \rho$. To see this, we first
note that $0 \le 10u < 10$ and so $[10u]$, the integer part of $10u$, lies in the
set $\{0, 1, 2, \dots, 8, 9\}$. Let $a = [10u]$ and let $\rho = 10u - [10u]$. Since
$0 \le w - [w] < 1$ for any real number $w$, we see that $0 \le \rho < 1$ and
$10u = a + \rho$ as required.
Since $0 \le x < 1$, as noted above, we can write $10x$ as $10x = a_1 + \rho_1$
where $a_1$ is an integer in the set $\{0, 1, 2, \dots, 8, 9\}$ and
$0 \le \rho_1 < 1$. Then
\[ x = \frac{a_1}{10} + \frac{\rho_1}{10} . \]
Now, with $\rho_1$ instead of $x$, we can say that we can write $\rho_1$ as
\[ \rho_1 = \frac{a_2}{10} + \frac{\rho_2}{10} \]
for some integer $a_2$ in the set $\{0, 1, 2, \dots, 8, 9\}$ and some real number
$\rho_2$ obeying $0 \le \rho_2 < 1$. Then
\[ x = \frac{a_1}{10} + \frac{\rho_1}{10}
     = \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{\rho_2}{10^2} . \]
Repeating this for $\rho_2$, we get
\[ x = \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \frac{\rho_3}{10^3} \]
with $a_3 \in \{0, 1, 2, \dots, 8, 9\}$ and $0 \le \rho_3 < 1$.
Continuing in this way, we construct integers $a_n$ in the range 0 to 9 and
real numbers $\rho_n$ obeying $0 \le \rho_n < 1$ such that
\[ x = \underbrace{\frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \dots
       + \frac{a_n}{10^n}}_{s_n} + \frac{\rho_n}{10^n} . \]
Finally, we note that
\[ |x - s_n| = \frac{\rho_n}{10^n} \le \frac{9}{10^n} \to 0 \]
as $n \to \infty$, that is, $s_n \to x$ as $n \to \infty$ and so it follows that
the series $\sum_{k=1}^{\infty} a_k/10^k$ converges with sum equal to $x$, and the
proof is complete.
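
The construction above is an algorithm: repeatedly multiply the current remainder by 10 and split off the integer part. The following sketch carries it out with exact fractions; the function name `decimal_digits` and the symbol `rho` for the remainder are illustrative choices, not notation taken from the notes.

```python
from fractions import Fraction
from math import floor

def decimal_digits(x, n):
    """Return ([a_1, ..., a_n], rho_n) for a rational x with 0 <= x < 1.

    Each step writes 10 * rho = a + rho_next with a = [10 * rho],
    so that x = a_1/10 + ... + a_n/10^n + rho_n/10^n.
    """
    if not 0 <= x < 1:
        raise ValueError("x must satisfy 0 <= x < 1")
    digits = []
    rho = Fraction(x)
    for _ in range(n):
        ten_rho = 10 * rho
        a = floor(ten_rho)      # integer part, always in 0..9
        digits.append(a)
        rho = ten_rho - a       # new remainder, again in [0, 1)
    return digits, rho

digits, rho = decimal_digits(Fraction(1, 7), 6)
print(digits, rho)  # digits of 1/7 = 0.142857..., remainder 1/7 again
```

Because the remainder after six steps is again $1/7$, the digits of $1/7$ repeat with period six.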


This provides another proof that any given real number is the limit of
a sequence of rationals. Indeed, for $b \in \mathbb{R}$, write $b = [b] + x$ where
$[b]$ is the integer part of $b$ and $0 \le x < 1$. As discussed above,
$x = \lim s_n$ where each $s_n$ is the partial sum of a series with rational terms
of the form $a_k/10^k$ for suitable $a_k \in \{0, 1, 2, \dots, 9\}$. In particular,
each $s_n$ is rational and so is $[b] + s_n$. However, $[b] + s_n \to [b] + x = b$
and the result follows.
Since $0.99\ldots = 1 = 1.00\ldots$ it is clear that the decimal expansion of a
real number need not be unique. Indeed, further examples are provided by
$0.5 = 0.499\ldots$ or $0.63 = 0.6299\ldots$ and so on. However, this is the only
possible kind of ambiguity as the next theorem shows.

Theorem 4.9. Suppose that $0 \le x < 1$ and that
\[ x = 0.a_1 a_2 \ldots = 0.b_1 b_2 \ldots , \]
that is,
\[ x = \sum_{k=1}^{\infty} \frac{a_k}{10^k} = \sum_{k=1}^{\infty} \frac{b_k}{10^k} \]
where each $a_k$ and $b_k$ belong to $\{0, 1, 2, \dots, 8, 9\}$. Then either
$a_k = b_k$ for all $k \in \mathbb{N}$ or else there is some $N \in \mathbb{N}$ such
that $a_k = b_k$ for $1 \le k < N$ and either $a_N = b_N + 1$ and $a_k = 0$ and
$b_k = 9$ for all $k > N$ or $b_N = a_N + 1$ and $b_k = 0$ and $a_k = 9$ for all
$k > N$.

Proof. We will use the following result.

Lemma 4.10. Suppose that $0 \le \alpha_k \le 9$ and that
$\sum_{k=1}^{\infty} \alpha_k/10^k = 0$. Then $\alpha_k = 0$ for all
$k \in \mathbb{N}$.

Proof of Lemma. Let $s_n = \sum_{k=1}^{n} \alpha_k/10^k$ denote the partial sums
of the series $\sum_{k=1}^{\infty} \alpha_k/10^k$. Then
$s_{n+1} - s_n = \alpha_{n+1}/10^{n+1} \ge 0$ so that $(s_n)$ is a positive
increasing sequence. Moreover, each $s_n$ obeys
$s_n \le \sum_{k=1}^{n} 9/10^k < 1$ so that $(s_n)$ converges, that is,
$\sum_{k=1}^{\infty} \alpha_k/10^k$ is a convergent series. Its value obeys
$s_n \le \sum_{k=1}^{\infty} \alpha_k/10^k$. Hence, for any $m \in \mathbb{N}$,
\[ 0 \le \frac{\alpha_m}{10^m} \le s_m \le \sum_{k=1}^{\infty} \frac{\alpha_k}{10^k} = 0 \]
so that it follows that $\alpha_m = 0$ as claimed and the proof of the lemma is
complete.

We turn now to the proof of the theorem.

Case 1: $x = 0$.
In this case, we have $0 = x = \sum_{k=1}^{\infty} a_k/10^k$ with
$a_k \in \{0, 1, \dots, 9\}$. By the Lemma, we conclude that $a_k = 0$ for all
$k \in \mathbb{N}$. Similarly, $b_k = 0$ for all $k$. Hence $a_k = b_k = 0$ for
$k \in \mathbb{N}$.


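Case 2: $0 < x < 1$.

Suppose that
\[ x = \sum_{k=1}^{\infty} \frac{a_k}{10^k} = \sum_{k=1}^{\infty} \frac{b_k}{10^k} \]
and it is false that $a_k = b_k$ for all $k \in \mathbb{N}$. Let $N$ be the
smallest integer for which $a_k \ne b_k$, so that $a_k = b_k$ for $1 \le k < N$
but $a_N \ne b_N$.
Suppose that $a_N > b_N$. Then $0 \le b_N < a_N \le 9$ and $a_N \ge 1$. We have
\begin{align*}
0 = x - x &= \sum_{k=1}^{\infty} \frac{a_k}{10^k} - \sum_{k=1}^{\infty} \frac{b_k}{10^k} \\
  &= \sum_{k=1}^{\infty} \frac{(a_k - b_k)}{10^k} \\
  &= \sum_{k=N}^{\infty} \frac{(a_k - b_k)}{10^k} \\
  &= \frac{(a_N - b_N)}{10^N} + \frac{(a_{N+1} - b_{N+1})}{10^{N+1}} + \dots
\end{align*}
Multiplying by $10^N$, we see that
\begin{align*}
0 &= (a_N - b_N) + \frac{(a_{N+1} - b_{N+1})}{10} + \dots \\
  &= (a_N - b_N) + \sum_{n=1}^{\infty} \frac{c_n}{10^n}
\end{align*}
where $c_n = a_{N+n} - b_{N+n}$ for all $n \in \mathbb{N}$. Now, we can write
$(a_N - b_N)$ as
\[ a_N - b_N = c + 1 \]
where $c$ is an integer with $c \ge 0$. We also note that each $c_n$ belongs to
the set $\{-9, -8, \dots, 8, 9\}$. Hence, writing
$1 = \sum_{n=1}^{\infty} 9/10^n$, we get
\[ 0 = c + \sum_{n=1}^{\infty} \frac{9}{10^n} + \sum_{n=1}^{\infty} \frac{c_n}{10^n} \]
that is,
\[ 0 = c + \sum_{n=1}^{\infty} \frac{(9 + c_n)}{10^n} . \]
Now, $9 + c_n \ge 0$ and $c \ge 0$ so that both terms on the right hand side above
are non-negative. It must be the case that $c = 0$ and also
$\sum_{n=1}^{\infty} \frac{(9 + c_n)}{10^n} = 0$.
But then $c_n = -9$ for all $n \in \mathbb{N}$, by the Lemma.
Hence $a_N = b_N + 1$ and $a_{N+n} - b_{N+n} = -9$ which implies that
$a_{N+n} = 0$ and $b_{N+n} = 9$ for all $n \in \mathbb{N}$. The case $b_N > a_N$
is handled in the same way with the roles of the $a$s and $b$s interchanged, and
the result follows.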


Returning now to the general theory, it is clear that the convergence of
a series will not be affected by changing the values of a few terms, although
of course, this will change the value of its sum. This is confirmed formally
in the next proposition.

Proposition 4.11. Suppose that $\sum_{k=1}^{\infty} a_k$ is a convergent series
and $\sum_{k=1}^{\infty} b_k$ is any series such that $b_k = a_k$ except for at
most finitely-many $k$. Then $\sum_{k=1}^{\infty} b_k$ is also convergent.
Proof. As always, we look at the partial sums, so let
$s_n = \sum_{k=1}^{n} a_k$ and let $t_n = \sum_{k=1}^{n} b_k$. Evidently,
\[ t_n = \sum_{k=1}^{n} b_k
       = \sum_{k=1}^{n} \underbrace{(b_k - a_k)}_{=\, c_k, \text{ say}} + \sum_{k=1}^{n} a_k
       = \sum_{k=1}^{n} c_k + s_n . \]
Next, let $u_n = \sum_{k=1}^{n} c_k$. Now, by hypothesis, $c_k = 0$ except for at
most finitely-many $k$. In other words, there is some $N \in \mathbb{N}$ such that
$c_k = 0$ for all $k > N$. This means that $u_n$ is eventually constant,
\[ u_n = \sum_{k=1}^{n} c_k = \sum_{k=1}^{N} c_k = u_N , \]
whenever $n > N$, and so $(u_n)$ converges (to the value $u_N$). But
\[ t_n = u_n + s_n \]
and since the right hand side converges, so does the left hand side and the
result follows.

Theorem 4.12 (Comparison Test for positive series). Suppose $0 \le a_k \le b_k$
for all $k \in \mathbb{N}$ and that $\sum_{k=1}^{\infty} b_k$ converges. Then
$\sum_{k=1}^{\infty} a_k$ also converges.

Proof. Let $s_n = \sum_{k=1}^{n} a_k$ and $t_n = \sum_{k=1}^{n} b_k$. By
hypothesis, $(t_n)$ converges and so $(t_n)$ is a bounded sequence. Therefore
there is some $M > 0$ such that
\[ t_n \le M \]
for all $n \in \mathbb{N}$. But since $0 \le a_k \le b_k$, it follows that
$s_n \le t_n$ and so the sequence $(s_n)$ of partial sums is bounded above (by
$M$). Furthermore, $s_{n+1} - s_n = a_{n+1} \ge 0$ and so $(s_n)$ is monotone
increasing. However, we know that a monotone increasing sequence which is
bounded above must converge. Hence the result.



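Example 4.13. Consider the series
$\sum_{k=1}^{\infty} \frac{1}{k^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots$.
Let $s_n = \sum_{k=1}^{n} \frac{1}{k^2}$ denote the $n$th partial sum. Then
$(s_n)$ is increasing. Furthermore, for $n > 1$, we see that
\begin{align*}
s_n &= 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots + \frac{1}{n^2} \\
    &< 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \dots + \frac{1}{(n-1)n} \\
    &= 1 + \Bigl(1 - \frac12\Bigr) + \Bigl(\frac12 - \frac13\Bigr) + \Bigl(\frac13 - \frac14\Bigr)
       + \dots + \Bigl(\frac{1}{n-1} - \frac1n\Bigr) \\
    &= 2 - \frac1n \\
    &< 2
\end{align*}
and so $(s_n)$ is bounded from above and therefore must converge. Hence,
by definition, $\sum_{k=1}^{\infty} \frac{1}{k^2}$ is convergent. Note, however,
that this discussion gives us no hint as to the value of its sum. This is an
example where the convergence of a series can quite sensibly be discussed
without actually knowing what its sum is.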
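
The telescoping bound $s_n < 2 - 1/n$ for $n > 1$ can be confirmed exactly for small $n$. This is a sketch only; `inverse_square_partial_sum` is a hypothetical helper name.

```python
from fractions import Fraction

def inverse_square_partial_sum(n):
    """Return s_n = 1 + 1/2^2 + ... + 1/n^2 as an exact fraction."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))

for n in (2, 10, 100):
    s_n = inverse_square_partial_sum(n)
    bound = 2 - Fraction(1, n)
    # Exact comparison, so no rounding error can hide a failure of the bound.
    print(n, float(s_n), float(bound), s_n < bound)
```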

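Example 4.14. What about the series $\sum_{k=1}^{\infty} \frac{1}{k^4}$?

Since $\frac{1}{k^4} \le \frac{1}{k^2}$ for all $k \in \mathbb{N}$, we can apply
the Comparison Test for positive series to deduce that
$\sum_{k=1}^{\infty} \frac{1}{k^4}$ is convergent. Indeed, since
$\frac{1}{k^\alpha} \le \frac{1}{k^2}$ for all $k \in \mathbb{N}$ for any
$\alpha \ge 2$, we can say that the series $\sum_{k=1}^{\infty} 1/k^\alpha$ is
convergent whenever $\alpha \ge 2$.

\[ \sum_{k=1}^{\infty} \frac{1}{k^{2+\delta}} \quad \text{is convergent for any } \delta \ge 0 \]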

Example 4.15. What about the series $\sum_{k=1}^{\infty} \frac{1}{\sqrt{k}}$?

If this series were convergent, then we could use the inequality
$\frac1k \le \frac{1}{\sqrt{k}}$, for every $k \in \mathbb{N}$, together with the
Comparison Test for positive series to conclude that the series
$\sum_{k=1}^{\infty} 1/k$ were convergent. However, we know this not to be the
case. It follows that the series $\sum_{k=1}^{\infty} 1/\sqrt{k}$ is not
convergent.


Indeed, we can apply this reasoning to the series
$\sum_{k=1}^{\infty} 1/k^\alpha$ for any $\alpha \le 1$.

\[ \sum_{k=1}^{\infty} \frac{1}{k^\alpha} \quad \text{is not convergent for any } \alpha \le 1 \]

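Example 4.16. We have seen above that the series
$\sum_{k=1}^{\infty} \frac{1}{k^\alpha}$ is convergent for $\alpha \ge 2$ but not
convergent for $\alpha \le 1$. It is natural to ask what happens for values of
$\alpha$ lying in the range $1 < \alpha < 2$. We shall see that the series is
convergent for all $\alpha > 1$.
Write $\alpha = 1 + \delta$, where $\delta > 0$ and let
$s_n = \sum_{k=1}^{n} 1/k^\alpha$. Evidently $(s_n)$ is an increasing sequence so
if we can show that it is bounded, then we will be able to conclude that it
converges. The idea is to compare the terms $1/k^\alpha$ with the integral of the
function $y = 1/x^\alpha$ over unit intervals. In fact, over the range
$k \le x \le k + 1$, the function $y = 1/x^{1+\delta}$ is greater than or equal to
$1/(k + 1)^{1+\delta}$ and so
\[ \frac{1}{(k + 1)^{1+\delta}} \le \int_k^{k+1} \frac{dx}{x^{1+\delta}} . \]
Summing over $k$, we find that
\begin{align*}
s_n &= 1 + \frac{1}{2^\alpha} + \frac{1}{3^\alpha} + \dots + \frac{1}{n^\alpha} \\
    &\le 1 + \int_1^2 \frac{dx}{x^{1+\delta}} + \int_2^3 \frac{dx}{x^{1+\delta}}
        + \dots + \int_{n-1}^{n} \frac{dx}{x^{1+\delta}} \\
    &= 1 + \int_1^n \frac{dx}{x^{1+\delta}} \\
    &= 1 + \Bigl[ -\frac{1}{\delta x^\delta} \Bigr]_1^n \\
    &= 1 + \frac{1}{\delta} \Bigl( 1 - \frac{1}{n^\delta} \Bigr) \\
    &\le 1 + \frac{1}{\delta} .
\end{align*}
We see that the sequence $(s_n)$ is bounded from above and since it is also
increasing, it must converge.

\[ \sum_{k=1}^{\infty} \frac{1}{k^\alpha} \quad
   \text{is convergent for all } \alpha > 1 \text{ and divergent for all } \alpha \le 1 \]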

This technique of comparing terms of a series with integrals can be quite
useful. The general idea is contained in the following theorem.


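Theorem 4.17 (Integral Test). Suppose that $\varphi : [1, \infty) \to \mathbb{R}$
is a positive decreasing function such that the sequence of integrals
$\bigl( \int_1^n \varphi(x) \, dx \bigr)_{n \in \mathbb{N}}$ converges as
$n \to \infty$. Then $\sum_{n=1}^{\infty} \varphi(n)$ is convergent.

Proof. Since $\varphi(x) \ge 0$, the sequence of partial sums
$s_n = \sum_{k=1}^{n} \varphi(k)$ is increasing. Now, because $\varphi$ is
decreasing, it follows that $\varphi(k) \le \varphi(x)$ for all
$x \in [k - 1, k]$ for all $k \ge 2$. Hence
$\int_{k-1}^{k} (\varphi(k) - \varphi(x)) \, dx \le 0$, that is,
$\varphi(k) \le \int_{k-1}^{k} \varphi(x) \, dx$. Therefore
\begin{align*}
s_n &= \varphi(1) + \varphi(2) + \varphi(3) + \dots + \varphi(n) \\
    &\le \varphi(1) + \int_1^2 \varphi(x) \, dx + \int_2^3 \varphi(x) \, dx
        + \dots + \int_{n-1}^{n} \varphi(x) \, dx \\
    &= \varphi(1) + \int_1^n \varphi(x) \, dx .
\end{align*}
By hypothesis, the sequence of integrals $\int_1^n \varphi(x) \, dx$ converges
and so is bounded. It follows that the sequence of partial sums $(s_n)$ is
bounded from above and therefore converges. The result follows.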
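
To illustrate the key inequality $s_n \le \varphi(1) + \int_1^n \varphi(x)\,dx$ of the proof, here is a sketch for one particular choice, $\varphi(x) = 1/(1 + x^2)$, which is an assumption made for this example and does not appear in the notes. Its integral from 1 to $n$ is $\arctan n - \arctan 1$. The names `phi`, `partial_sum` and `integral_bound` are hypothetical, and floating point arithmetic is used.

```python
import math

def phi(x):
    """A positive decreasing function on [1, infinity), chosen for illustration."""
    return 1.0 / (1.0 + x * x)

def partial_sum(n):
    """Return s_n = phi(1) + phi(2) + ... + phi(n)."""
    return sum(phi(k) for k in range(1, n + 1))

def integral_bound(n):
    """Return phi(1) + integral of phi over [1, n], using arctan as antiderivative."""
    return phi(1) + math.atan(n) - math.atan(1)

for n in (1, 2, 10, 100, 1000):
    print(n, round(partial_sum(n), 6), round(integral_bound(n), 6),
          partial_sum(n) <= integral_bound(n))
```

Since $\arctan n < \pi/2$, every partial sum stays below $\tfrac12 + \tfrac{\pi}{4}$, so the series $\sum 1/(1+n^2)$ converges by the theorem.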

This theorem can be rephrased in a slightly more general form, as follows.

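Theorem 4.18 (Integral Test). Let $(a_n)$ be a sequence of positive real numbers
and suppose that there is some positive function $\varphi$ such that the sequence
of integrals $\bigl( \int_1^n \varphi(x) \, dx \bigr)_{n \in \mathbb{N}}$ converges
as $n \to \infty$ and such that, for each $k \ge 2$,
\[ a_k \le \varphi(x) \]
for all $(k - 1) \le x \le k$. Then $\sum_{k=1}^{\infty} a_k$ is convergent.

Proof. As usual, let $s_n = \sum_{k=1}^{n} a_k$. Then $(s_n)_{n \in \mathbb{N}}$
is an increasing sequence (because each $a_k \ge 0$). We need only show that
$(s_n)$ is bounded. To see this, note that for $k \ge 2$,
\[ a_k = \int_{k-1}^{k} a_k \, dx \le \int_{k-1}^{k} \varphi(x) \, dx \]
and so
\begin{align*}
s_n &= a_1 + a_2 + \dots + a_n \\
    &\le a_1 + \int_1^2 \varphi(x) \, dx + \int_2^3 \varphi(x) \, dx
        + \dots + \int_{n-1}^{n} \varphi(x) \, dx \\
    &= a_1 + \int_1^n \varphi(x) \, dx .
\end{align*}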


Now, the sequence $\bigl( \int_1^n \varphi(x) \, dx \bigr)$ converges, by
hypothesis, and so it is bounded and therefore there is a constant $C$ such that
$\int_1^n \varphi(x) \, dx \le C$ for all $n \in \mathbb{N}$. Hence, for any $n$,
\[ 0 \le s_n \le a_1 + \int_1^n \varphi(x) \, dx \le a_1 + C \]
which shows that $(s_n)$ is a bounded sequence and the result follows.

The following test for convergence of positive series is very useful.

Theorem 4.19 (D'Alembert's Ratio Test for positive series).
Suppose that $a_n > 0$ for all $n$.

(i) Suppose that there is some $0 < \rho < 1$ and some $N \in \mathbb{N}$ such
that if $n > N$ then $\frac{a_{n+1}}{a_n} < \rho$.
Then the series $\sum_{n=1}^{\infty} a_n$ is convergent.

(ii) If there is $N' \in \mathbb{N}$ such that $\frac{a_{n+1}}{a_n} \ge 1$ for all
$n > N'$, then the series $\sum_{n=1}^{\infty} a_n$ is divergent.

Proof. (i) Suppose that $0 < \rho < 1$ and that $\frac{a_{n+1}}{a_n} < \rho$ for
$n > N$. Then
\begin{align*}
a_{N+2} &< \rho \, a_{N+1} \\
a_{N+3} &< \rho \, a_{N+2} < \rho^2 a_{N+1} \\
a_{N+4} &< \rho \, a_{N+3} < \rho^3 a_{N+1} \\
        &\ \ \vdots \\
a_{N+k+1} &< \rho^k a_{N+1} .
\end{align*}
Hence $a_{N+k+1} < K \rho^{N+k+1}$ for all $k \ge 1$, where we have let
$K = \frac{a_{N+1}}{\rho^{N+1}}$.
In other words, we have $a_n < K \rho^n$ for all $n > N + 1$. Now we construct
a new sequence $(u_n)$ by setting
\[ u_n = \begin{cases} 0, & n \le N + 1 \\ a_n, & n > N + 1 . \end{cases} \]
Then certainly $u_n < K \rho^n$ for all $n$. Now, we know that
$\sum_{n=1}^{\infty} K \rho^n$ is convergent (with sum $K \rho / (1 - \rho)$)
because $0 < \rho < 1$.
By the Comparison Test, it follows that $\sum_{n=1}^{\infty} u_n$ is convergent.
However, $a_n = u_n$ eventually and so it follows that
$\sum_{n=1}^{\infty} a_n$ is also convergent and the proof of (i) is complete.


(ii) Suppose now that $\frac{a_{n+1}}{a_n} \ge 1$ for all $n > N'$. Then for any
$k \in \mathbb{N}$
\[ a_{N'+k} \ge a_{N'+k-1} \ge \dots \ge a_{N'+1} > 0 . \]
This means that it is impossible for $a_n \to 0$ as $n \to \infty$ (every term
after the $(N' + 1)$th is greater than or equal to $a_{N'+1}$). We conclude that
$\sum_{n=1}^{\infty} a_n$ must be divergent.

There is another (weaker) but also very useful version of this theorem.

Theorem 4.20 (D'Alembert's Ratio Test for positive series (2nd version)).
Suppose that $a_n > 0$ for all $n$ and that $\frac{a_{n+1}}{a_n} \to L$ as
$n \to \infty$.

(i) If $L < 1$, then $\sum_{n=1}^{\infty} a_n$ is convergent.

(ii) If $L > 1$, then $\sum_{n=1}^{\infty} a_n$ is divergent.

(There is no claim as to what happens when $L = 1$.)

Proof. (i) Suppose that $\frac{a_{n+1}}{a_n} \to L$ where $0 \le L < 1$. Then for
any $\varepsilon > 0$, we may say that eventually
$\frac{a_{n+1}}{a_n} \in (L - \varepsilon, L + \varepsilon)$.
In particular, $\frac{a_{n+1}}{a_n} < L + \varepsilon$ eventually. Let
$\varepsilon$ be so small that $L + \varepsilon < 1$, then
$\frac{a_{n+1}}{a_n} < \rho$ where $\rho = L + \varepsilon < 1$. By the previous
version of the Theorem, it follows that $\sum_{n=1}^{\infty} a_n$ is convergent.
(ii) Now suppose that $L > 1$. Then for any $\varepsilon > 0$, eventually
$\frac{a_{n+1}}{a_n}$ belongs to the interval
$(L - \varepsilon, L + \varepsilon)$. In particular, eventually
$\frac{a_{n+1}}{a_n} > L - \varepsilon$. But $L > 1$, so if $\varepsilon > 0$ is
chosen so small that $L - \varepsilon > 1$, then we may say that eventually
$\frac{a_{n+1}}{a_n} > L - \varepsilon > 1$ and so $\sum_{n=1}^{\infty} a_n$ is
divergent, by the previous version of the Theorem.

Example 4.21. What can be said when $L = 1$? Without further analysis, the
answer is "nothing". Indeed, there are examples of series which converge
when $L = 1$ and other examples of series which diverge when $L = 1$.
For example, we know that $\sum_{k=1}^{\infty} 1/k^2$ is convergent and we see
that $a_{n+1}/a_n = n^2/(n + 1)^2 \to 1$ as $n \to \infty$, so that $L = 1$ in
this case. However, we also know that $\sum_{k=1}^{\infty} 1/k$ is divergent, but
here again, we see that $a_{n+1}/a_n = n/(n + 1) \to 1 = L$ as $n \to \infty$.

When $L = 1$, the Ratio Test tells us nothing.
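
A quick numerical look at the two ratios makes the point concrete. This is a sketch with hypothetical helper names; both ratios approach 1 even though one series converges and the other diverges.

```python
def ratio_inverse_square(n):
    """a_{n+1}/a_n for a_n = 1/n^2, namely n^2/(n+1)^2."""
    return n**2 / (n + 1)**2

def ratio_harmonic(n):
    """a_{n+1}/a_n for a_n = 1/n, namely n/(n+1)."""
    return n / (n + 1)

for n in (10, 1000, 10**6):
    # Both columns tend to 1, so the limit L cannot distinguish the two series.
    print(n, ratio_inverse_square(n), ratio_harmonic(n))
```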


Example 4.22. For fixed $0 \le c < 1$, the series $\sum_{k=1}^{\infty} k c^k$ is
convergent.
If $c = 0$, there is nothing to prove, so suppose that $0 < c < 1$. Setting
$a_n = n c^n$, we see that
\[ \frac{a_{n+1}}{a_n} = \frac{(n + 1) c^{n+1}}{n c^n} = \frac{(n + 1) c}{n} \to c \]
as $n \to \infty$. Since $a_n > 0$ for all $n$ and since $L = c < 1$, we can apply
the Ratio Test to conclude that $\sum_{k=1}^{\infty} k c^k$ is convergent.
The same argument shows that for any power $p$, the series
$\sum_{k=1}^{\infty} k^p c^k$ is convergent (provided $0 \le c < 1$).

\[ \text{For any } 0 \le c < 1 \text{ and any } p \in \mathbb{N}, \text{ the series }
   \sum_{k=1}^{\infty} k^p c^k \text{ is convergent.} \]

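Theorem 4.23 ($n$th Root Test). Suppose that $a_n > 0$ for all $n \in \mathbb{N}$
and that $(a_n)^{1/n} \to \ell$ as $n \to \infty$.

(i) If $\ell < 1$, the series $\sum_{k=1}^{\infty} a_k$ is convergent.

(ii) If $\ell > 1$, then the series $\sum_{k=1}^{\infty} a_k$ is divergent.

(There is no conclusion when $\ell = 1$.)

Proof. Suppose that $\ell < 1$. Choose $\rho$ such that $\ell < \rho < 1$ and set
$\varepsilon = \rho - \ell$. Then $\varepsilon > 0$ and so there is some
$N \in \mathbb{N}$ such that $|a_n^{1/n} - \ell| < \varepsilon$ whenever $n > N$.
In particular,
\[ (a_n)^{1/n} - \ell < \varepsilon = \rho - \ell \]
i.e., $a_n < \rho^n$, whenever $n > N$. We must show that
$s_n = \sum_{k=1}^{n} a_k$ converges. Since $a_n > 0$, the sequence $(s_n)$ is
monotone increasing so it is enough to show that $(s_n)$ is bounded from above.
But for any $n > N$,
\begin{align*}
s_n &= a_1 + a_2 + \dots + a_n \\
    &= s_N + a_{N+1} + \dots + a_n \\
    &< s_N + \rho^{N+1} + \rho^{N+2} + \dots + \rho^n \\
    &= s_N + \frac{\rho^{N+1} - \rho^{n+1}}{1 - \rho} \\
    &< s_N + \frac{\rho^{N+1}}{(1 - \rho)} .
\end{align*}
Hence, for any $j$,
\[ s_j < s_{N+j} < s_N + \frac{\rho^{N+1}}{1 - \rho} \]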


which shows that the sequence $(s_n)$ is bounded from above and therefore
converges, as claimed.
Next, suppose that $\ell > 1$. Choose $d$ such that $1 < d < \ell$ and let
$\varepsilon = \ell - d$. Then $\varepsilon > 0$ and there is $N \in \mathbb{N}$
such that
\[ a_n^{1/n} \in (\ell - \varepsilon, \ell + \varepsilon) \]
whenever $n > N$. In particular, for $n > N$,
\[ \ell - \varepsilon < a_n^{1/n} \]
which means that $a_n > d^n$. It follows that, for any $n > N$,
\[ s_n = s_N + a_{N+1} + \dots + a_n > a_n > d^n > 1 . \]
From this, we see that it is false that $a_n \to 0$ as $n \to \infty$ and so by
the Test for divergence, $\sum_{n=1}^{\infty} a_n$ is divergent.

We have considered tests applicable only to positive series. The following
is a convergence test for the case when the terms alternate between positive
and negative values.

Theorem 4.24 (Alternating Series Test). Suppose that $(a_n)$ is a positive,
decreasing sequence such that $a_n \to 0$ as $n \to \infty$. Then the
(alternating) series
\[ a_1 - a_2 + a_3 - a_4 + \dots = \sum_{n=1}^{\infty} (-1)^{n+1} a_n \]
is convergent.

Proof. By hypothesis, $a_n \ge 0$, $a_{n+1} \le a_n$ and $a_n \to 0$ as
$n \to \infty$.
Let $s_n = a_1 - a_2 + a_3 - a_4 + \dots + (-1)^{n+1} a_n$ denote the $n$th
partial sum of the series, as usual. We shall consider the two cases when $n$ is
even and when $n$ is odd.
Suppose that $n$ is even, say $n = 2m$. Then
\[ s_{2m+2} = s_{2m} + \underbrace{(a_{2m+1} - a_{2m+2})}_{\ge 0} \]
and so $s_{2m+2} \ge s_{2m}$.
Next, we note that
\begin{align*}
s_{2m} &= a_1 - a_2 + a_3 - a_4 + a_5 - \dots - a_{2m} \\
       &= a_1 - \underbrace{(a_2 - a_3)}_{\ge 0} - \underbrace{(a_4 - a_5)}_{\ge 0}
          - \dots - \underbrace{(a_{2m-2} - a_{2m-1})}_{\ge 0} - \underbrace{a_{2m}}_{\ge 0} \\
       &\le a_1 .
\end{align*}
For notational convenience, let $x_m = s_{2m}$. Then we have shown that $(x_m)$
is increasing and bounded from above (by $a_1$). It follows that $(x_m)$
converges, say $x_m \to \alpha$ as $m \to \infty$.


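Claim: $s_n \to \alpha$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
Then there is $N_1 \in \mathbb{N}$ such that if $m > N_1$ then
$|x_m - \alpha| < \tfrac12 \varepsilon$. Also, there is $N_2 \in \mathbb{N}$ such
that if $n > N_2$ then $|a_n| < \tfrac12 \varepsilon$. Let $N = 2(N_1 + N_2)$.
Let $n > N$ and consider $|s_n - \alpha|$. If $n$ is even, say, $n = 2m$, then
\[ n = 2m > N \implies 2m > 2(N_1 + N_2) \implies m > N_1 \]
and so
\[ |s_n - \alpha| = |s_{2m} - \alpha| = |x_m - \alpha| < \tfrac12 \varepsilon < \varepsilon . \]
If $n$ is odd, say $n = 2k + 1$, then
\[ n = 2k + 1 > N \implies 2k \ge N = 2(N_1 + N_2) \implies k > N_1 . \]
Moreover, since $N > N_2$, we have $n > N \implies n > N_2$ and so we see that
\[ n = 2k + 1 > N \implies \text{both } k > N_1 \text{ and } n > N_2 . \]
Hence
\begin{align*}
|s_n - \alpha| = |s_{2k+1} - \alpha| &= |s_{2k} + a_{2k+1} - \alpha| \\
  &= |x_k - \alpha + a_n| \\
  &\le |x_k - \alpha| + |a_n| \\
  &< \tfrac12 \varepsilon + \tfrac12 \varepsilon \\
  &= \varepsilon .
\end{align*}
So regardless of whether $n$ is even or odd, if $n > N$ then
$|s_n - \alpha| < \varepsilon$. Hence $s_n \to \alpha$ as $n \to \infty$, as
claimed, and we conclude that the alternating series
$\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ is convergent.

Example 4.25. The series
$1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \dots$ converges.
This follows immediately from the Alternating Series Test.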
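
The behaviour used in the proof of the Alternating Series Test can be observed directly for this series: the even partial sums $s_{2m}$ increase and stay below $a_1 = 1$. The sketch below uses exact fractions; `alternating_partial_sum` is a hypothetical helper name. The comparison with $\ln 2$ relies on the standard fact, not established in this section, that the sum of this series is $\ln 2$.

```python
from fractions import Fraction
import math

def alternating_partial_sum(n):
    """Return s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n+1)/n as an exact fraction."""
    return sum(Fraction((-1) ** (k + 1), k) for k in range(1, n + 1))

# Even partial sums x_m = s_{2m}: increasing and bounded above by a_1 = 1.
even_sums = [alternating_partial_sum(2 * m) for m in range(1, 8)]
print([float(x) for x in even_sums])

# Assumed standard fact: the limit is ln 2 (approximately 0.6931).
print(float(alternating_partial_sum(1000)), math.log(2))
```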
Definition 4.26. The series $\sum_{n=1}^{\infty} a_n$ is said to converge
absolutely if the series $\sum_{n=1}^{\infty} |a_n|$ is convergent.
The series $\sum_{n=1}^{\infty} a_n$ is said to converge conditionally if it
converges but does not converge absolutely, i.e., it converges but the series
$\sum_{n=1}^{\infty} |a_n|$ is not convergent.

Example 4.27. We have seen that the series
\[ \sum_{n=1}^{\infty} (-1)^{n+1} \frac1n
   = 1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \dots \]
converges. However, we know that $\sum_{n=1}^{\infty} \frac1n$ does not converge
and so the series $\sum_{n=1}^{\infty} (-1)^{n+1} \frac1n$ is an example of a
conditionally convergent series.


Theorem 4.28. Every absolutely convergent series is convergent.

Proof. Suppose that $\sum_{n=1}^{\infty} a_n$ is absolutely convergent. Let
$t_n = \sum_{k=1}^{n} |a_k|$ and $s_n = \sum_{k=1}^{n} a_k$. Then we know that
$(t_n)$ converges (since $\sum_{n=1}^{\infty} a_n$ converges absolutely). It
follows that $(t_n)$ is a Cauchy sequence. We shall show that $(s_n)$ is also a
Cauchy sequence.
Let $\varepsilon > 0$ be given.
Then there is $N$ such that $n, m > N$ imply that
\[ |t_n - t_m| < \varepsilon . \]
However, for $n > m$,
\[ |s_n - s_m| = |a_{m+1} + \dots + a_n| \le |a_{m+1}| + \dots + |a_n| = |t_n - t_m| \]
and so it follows that
\[ |s_n - s_m| < \varepsilon \]
whenever $n, m > N$, which shows that $(s_n)$ is a Cauchy sequence. But any
Cauchy sequence in $\mathbb{R}$ converges and the result follows.

We know that if $a$ and $b$ are real numbers, then $a + b = b + a$. More
generally, if $a_1, \dots, a_m$ is a collection of $m$ real numbers, then their
sum $a_1 + \dots + a_m$ is the same irrespective of the order in which we choose
to add them together. Now, a series $\sum_{n=1}^{\infty} a_n$ is the result of
adding together real numbers, so it is natural to guess that the order of the
addition does not matter. To discuss this, we shall need the notion of a
rearrangement.

Definition 4.29. The series $\sum_{n=1}^{\infty} b_n$ is a rearrangement of the
series $\sum_{n=1}^{\infty} a_n$ if there is some one-one map $\varphi$ of
$\mathbb{N}$ onto $\mathbb{N}$ such that $b_n = a_{\varphi(n)}$ for each
$n \in \mathbb{N}$. In other words, every $b$ is one of the $a$s and every $a$
appears as some $b$.
Theorem 4.30. Suppose that the series $\sum_{n=1}^{\infty} a_n$ converges
absolutely. Then every rearrangement also converges, with the same sum.

Proof. Let $\sum_{n=1}^{\infty} b_n$ be a rearrangement of
$\sum_{n=1}^{\infty} a_n$. Then there is some one-one map $\varphi$ of
$\mathbb{N}$ onto $\mathbb{N}$ such that $b_n = a_{\varphi(n)}$ for every $n$. Let
$s_n = \sum_{k=1}^{n} a_k$, $t_n = \sum_{k=1}^{n} |a_k|$,
$r_n = \sum_{k=1}^{n} b_k$ and let
$s = \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} s_n$. We must show that
$(r_n)$ converges and that its limit is equal to $s$.
Let $\varepsilon > 0$ be given.
Since $s_n \to s$ and $(t_n)$ is a Cauchy sequence, there is some
$N \in \mathbb{N}$ such that $n, m \ge N$ imply that both
\[ |s_n - s| < \varepsilon/2 \quad\text{and}\quad |t_n - t_m| < \varepsilon/2 . \]
Now, the sequence of $b$s is a relabelling of the $a$s and so for each $j$ there
is some $k_j$ such that $a_j = b_{k_j}$. Let
$N' = \max\{ k_j : 1 \le j \le N \}$ so that the


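collection $a_1, a_2, \dots, a_N$ is included in the collection
$b_1, b_2, \dots, b_{N'}$. Then for any $n > N'$
\[ b_1 + b_2 + \dots + b_n = a_1 + a_2 + \dots + a_N + \Delta_n \]
where $\Delta_n = a_{\ell_1} + \dots + a_{\ell_r}$ for some integers
$\ell_1, \dots, \ell_r$ with $N < \ell_1 < \dots < \ell_r$.
Now
\[ |\Delta_n| \le |a_{\ell_1}| + \dots + |a_{\ell_r}|
   \le \sum_{k=N+1}^{\ell_r} |a_k| = t_{\ell_r} - t_N < \varepsilon/2 \]
and so if $n > N'$
\begin{align*}
|r_n - s| &= |s_N + \Delta_n - s| \\
          &\le |s_N - s| + |\Delta_n| \\
          &\le \varepsilon/2 + \varepsilon/2 = \varepsilon
\end{align*}
and the proof is complete.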
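
The hypothesis of absolute convergence matters. The sketch below applies one particular rearrangement (two odd-indexed terms followed by one even-indexed term, i.e. $a_1, a_3, a_2, a_5, a_7, a_4, \dots$) to two series: the conditionally convergent series $\sum (-1)^{n+1}/n$ and the absolutely convergent series $\sum (-1)^{n+1}/n^2$. The choice of rearrangement, the function names and the use of floating point are all assumptions made for illustration. It is a standard fact, not proved here, that this rearrangement changes the sum of the first series from $\ln 2$ to $\tfrac32 \ln 2$.

```python
import math

def term_harmonic(n):
    """a_n = (-1)^(n+1)/n: conditionally convergent series."""
    return (-1) ** (n + 1) / n

def term_square(n):
    """a_n = (-1)^(n+1)/n^2: absolutely convergent series."""
    return (-1) ** (n + 1) / n**2

def original_sum(term, n_terms):
    """Partial sum a_1 + ... + a_{n_terms} in the original order."""
    return sum(term(n) for n in range(1, n_terms + 1))

def rearranged_sum(term, blocks):
    """Partial sum of the rearrangement a_1, a_3, a_2, a_5, a_7, a_4, ...

    Block j contributes a_{4j-3} + a_{4j-1} + a_{2j}; every index occurs
    exactly once, so this is a one-one map of N onto N.
    """
    total = 0.0
    for j in range(1, blocks + 1):
        total += term(4 * j - 3) + term(4 * j - 1) + term(2 * j)
    return total

print(original_sum(term_harmonic, 15000), rearranged_sum(term_harmonic, 5000))
print(original_sum(term_square, 15000), rearranged_sum(term_square, 5000))
```

For the absolutely convergent series the two printed values agree to several decimal places, as the theorem predicts; for the conditionally convergent one they differ markedly.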

Theorem 4.31 (Cauchy's Condensation Test). Suppose $(a_n)_{n \in \mathbb{N}}$ is
a positive, decreasing sequence of real numbers (that is, $a_n \ge 0$ and
$a_{n+1} \le a_n$). For each $k \in \mathbb{N}$, let $b_k = 2^k a_{2^k}$. Then the
series $\sum_{n=1}^{\infty} a_n$ is convergent if and only if the series
$\sum_{k=1}^{\infty} b_k$ is convergent.
(In other words, either both series converge or neither does.)

Proof. Let $s_n = \sum_{m=1}^{n} a_m$ and $t_k = \sum_{i=1}^{k} b_i$ be the
partial sums of the series under consideration. Since $a_n \ge 0$, the sequences
$(s_n)_{n \in \mathbb{N}}$ and $(t_k)_{k \in \mathbb{N}}$ are increasing
sequences. Now, we know that if an increasing sequence in $\mathbb{R}$ is
bounded from above, then it must converge, so our strategy is to show that
the sequences of partial sums are bounded (from above).
The idea is to estimate the partial sums of the series in terms of each
other by bracketing the $a_n$ terms into groups of size 2, 4, 8, 16, ... and
using the fact that $a_n \ge a_{n+1}$. We note that
\begin{align*}
2a_4 &\le a_3 + a_4 \le 2a_2 \\
4a_8 &\le a_5 + a_6 + a_7 + a_8 \le 4a_4 \\
8a_{16} &\le a_9 + a_{10} + \dots + a_{15} + a_{16} \le 8a_8 \\
        &\ \ \vdots
\end{align*}
Summing, we find that (for $k > 1$)
\[ 2a_4 + 4a_8 + \dots + 2^{k-1} a_{2^k} \le a_3 + a_4 + \dots + a_{2^k}
   \le 2a_2 + 4a_4 + \dots + 2^{k-1} a_{2^{k-1}} . \]


In terms of the $b_k$s, this becomes
\[ \tfrac12 (b_2 + \dots + b_k) \le a_3 + a_4 + \dots + a_{2^k} \le b_1 + b_2 + \dots + b_{k-1} \]
giving the pair of inequalities
\[ \tfrac12 t_k - b_1 \le s_{2^k} - (a_1 + a_2) \le t_{k-1} \qquad (*) \]
Suppose now that the series $\sum_{n=1}^{\infty} a_n$ converges and, for clarity,
let us write $s = \sum_{n=1}^{\infty} a_n = \lim_{j \to \infty} s_j$. Since
$(s_j)$ is increasing, it follows that $s_j \le s$ for all $j \in \mathbb{N}$.
From $(*)$, it follows that
\[ \tfrac12 t_k - b_1 \le s_{2^k} - (a_1 + a_2) \le s - (a_1 + a_2) \]
for all $k \in \mathbb{N}$. Hence $(t_k)$ is a bounded, increasing sequence and so
converges. But, by definition, this means that $\sum_{i=1}^{\infty} b_i$ is
convergent.
Next, suppose that the series $\sum_{i=1}^{\infty} b_i$ converges and write $t$
for its sum. Then $t_k \le t$ for all $k \in \mathbb{N}$. Now, for any
$n \in \mathbb{N}$, it is true that $2^n > n$ (as can be seen by the Binomial
Theorem, as follows: $2^n = (1 + 1)^n = 1 + n + \binom{n}{2} + \dots + 1 > n$).
Hence $s_n \le s_{2^n}$ and so, using $(*)$, we get
\[ s_n \le s_{2^n} \le t_{n-1} + (a_1 + a_2) \le t + (a_1 + a_2) \]
for all $n - 1 \in \mathbb{N}$. Therefore $(s_n)$ is a bounded, increasing
sequence and so converges. Therefore $\sum_{j=1}^{\infty} a_j$ is convergent.
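
The pair of inequalities $(*)$ from the proof can be verified exactly for a concrete sequence. The sketch below takes $a_n = 1/n^2$ (an assumption made for illustration), for which $b_k = 2^k a_{2^k} = 1/2^k$; all function names are hypothetical.

```python
from fractions import Fraction

def a_term(n):
    """A positive decreasing sequence chosen for illustration: a_n = 1/n^2."""
    return Fraction(1, n * n)

def b_term(k):
    """Condensed term b_k = 2^k * a_{2^k}."""
    return 2**k * a_term(2**k)

def s_partial(n):
    """Partial sum s_n = a_1 + ... + a_n."""
    return sum(a_term(m) for m in range(1, n + 1))

def t_partial(k):
    """Partial sum t_k = b_1 + ... + b_k."""
    return sum(b_term(i) for i in range(1, k + 1))

def star_holds(k):
    """Check (1/2) t_k - b_1 <= s_{2^k} - (a_1 + a_2) <= t_{k-1} for k > 1."""
    middle = s_partial(2**k) - (a_term(1) + a_term(2))
    return Fraction(1, 2) * t_partial(k) - b_term(1) <= middle <= t_partial(k - 1)

print([star_holds(k) for k in range(2, 9)])
```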
Example 4.32. We already know that the series $\sum_{n=1}^{\infty} 1/n$ diverges,
but let us consider it again via the Condensation Test. First, we note that
$a_n = 1/n$ satisfies the hypotheses required to apply the Condensation Test.
Now,
\[ b_k = 2^k a_{2^k} = 2^k / 2^k = 1 \]
and so it is clear that $\sum_k b_k = \sum_k 1$ diverges. Applying the
Condensation Test, we conclude that $\sum_{n=1}^{\infty} 1/n$ diverges. (In fact,
we have already shown this from first principles using this method of grouping.)
Next, consider $\sum_{n=1}^{\infty} 1/n^{1+\delta}$ for given $\delta > 0$. Once
again, $a_n = 1/n^{1+\delta}$ satisfies the hypotheses required to apply the
Condensation Test. In this case, we have
\[ b_k = 2^k a_{2^k} = \frac{2^k}{2^{k(1+\delta)}} = \frac{2^k}{2^k \, 2^{k\delta}}
       = \frac{1}{2^{k\delta}} \]
so that $\sum_k b_k$ is a geometric series with common ratio $1/2^\delta$. This
series therefore converges (because $1/2^\delta$ is smaller than 1).


We might say that the series $\sum_n 1/n$ diverges presumably because the terms
$1/n$ do not become small enough quickly enough. Increasing the power of
$n$ from 1 to $1 + \delta$ is sufficient to speed things up so that
$\sum_n 1/n^{1+\delta}$ does converge, no matter how small $\delta$ may be.
Consider the series $\sum_{n=2}^{\infty} 1/(n \ln n)$ (we cannot start this
series with $n = 1$ because $\ln 1 = 0$). Is the change from $a_n = 1/n$ to
$a_n = 1/(n \ln n)$ enough to give convergence of the series?
To investigate its convergence or otherwise, let $a_n = 1/(n \ln n)$ for
$n \ge 2$ and set $a_1 = 5$, say, or any value greater than $1/(2 \ln 2)$. This
choice of $a_1$ is not quite arbitrary but is chosen so that
$(a_n)_{n \in \mathbb{N}}$ satisfies the hypotheses required to apply the
Condensation Test. The series $\sum_{n=2}^{\infty} a_n$ converges if and only if
the series $\sum_{n=1}^{\infty} a_n$ does, regardless of our choice for $a_1$.
Applying the Condensation Test, we may say that the series
$\sum_{n=2}^{\infty} 1/(n \ln n)$ converges if (and only if)
$\sum_{n=1}^{\infty} 2^n a_{2^n}$ does. But
\[ 2^n a_{2^n} = \frac{2^n}{2^n \ln(2^n)} = \frac{1}{\ln 2} \cdot \frac{1}{n} \]
and we know that the series $\sum_n 1/n$ does not converge. We can conclude,
then, that the series $\sum_{n=2}^{\infty} 1/(n \ln n)$ is divergent.

\[ \text{The series } \sum_{n=2}^{\infty} \frac{1}{(n \ln n)} \text{ is divergent.} \]

Chapter 5

Functions

Suppose that $x$ represents the value of the length of a side of a square. Then
its area depends on $x$ and, in fact, is given by the formula: area $= x^2$. The
area is a function of $x$. In general, if $S$ is some given subset of
$\mathbb{R}$, then a real-valued function $f$ on $S$ is a rule or assignment by
which to each element $x \in S$ is associated some real number, denoted by
$f(x)$. We write $f : S \to \mathbb{R}$ which is read as "$f$ maps $S$ into
$\mathbb{R}$". One also writes $x \mapsto f(x)$ which is read as "$x$ is mapped
to the value $f(x)$". The set $S$ is called the domain (of definition) of the
function $f$. If $x \notin S$, then $f(x)$ has not been given a meaning.
More generally, if $A$ and $B$ are given sets, then a mapping $g : A \to B$
is an association $a \mapsto g(a)$ of each element of $A$ to some element
$g(a) \in B$. For example, for each $0 \le t \le 1$, let
$g(t) = \begin{pmatrix} t & 1 \\ 0 & t^3 \end{pmatrix}$. Then $t \mapsto g(t)$ is
an example of a mapping from the interval $[0, 1]$ into the set of $2 \times 2$
real matrices.
In general, if $B$ is equal to either $\mathbb{R}$ or $\mathbb{C}$, then the
mapping is often referred to as a function. Note that a function may be given by
a pretty formula but it does not have to be. For example, the function
$f : \mathbb{R} \to \mathbb{R}$ with $f(x) = 1 + x^2$ is given by a formula. To
get $f(x)$, we just substitute the value of $x$ into the formula. However, the
function
\[ x \mapsto \begin{cases} x^2, & x < -1, \\ 1, & -1 \le x \le 0, \\ x^3 + 1, & x > 0 \end{cases} \]
is a perfectly good function, but is not given by a formula in the same way
as the previous example. In fact, this function seems to be a concoction
constructed from the functions $x^2$, $1$ and $x^3 + 1$. A slightly more involved
example is
\[ x \mapsto \begin{cases}
   0, & \text{if } x \notin \mathbb{Q}, \\
   1/n, & \text{if } x \in \mathbb{Q} \text{ and } x = k/n \text{ (with }
          k \in \mathbb{Z},\ n \in \mathbb{N} \text{ and where } k \text{ and } n
          \text{ have no common divisors).}
\end{cases} \]

For a function to be well-defined, there must be specified

(i) its domain of definition,

(ii) some assignment giving the value it takes at each point of its domain.

It is often very useful to consider the visual representation of $f$ given by
plotting the points $(x, f(x))$ in $\mathbb{R}^2$. This is the graph of $f$.

Examples 5.1.

1. Linear functions: $x \mapsto f(x) = mx + c$ for constants
   $m, c \in \mathbb{R}$ and $x \in S$.

2. Polynomials: $x \mapsto f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$ for
   $x \in S$, where the coefficients $a_0, a_1, \dots, a_n$ are constants in
   $\mathbb{R}$ and $a_n \ne 0$. $n$ is the degree of such a polynomial.

3. Rational functions: $x \mapsto f(x) = \frac{p(x)}{q(x)}$ for $x \in S$ where
   $p$ and $q$ are polynomials. Note that the right hand side is not defined for
   any values of $x$ for which $q(x) = 0$.

4. $S = \mathbb{R}$,
   $f(x) = \begin{cases} 1/x, & x \ne 0, \\ 3, & x = 0. \end{cases}$

5. $S = [-1, 1]$,
   $f(x) = \begin{cases} x^2, & x \ne 0, \\ 2, & x = 0. \end{cases}$

6. $S = \mathbb{R}$,
   $f(x) = \begin{cases} 0, & x \notin \mathbb{Q}, \\ 1, & x \in \mathbb{Q}. \end{cases}$

   (A thought: let
   $g(x) = \begin{cases} 1, & x \notin \mathbb{Q}, \\ 0, & x \in \mathbb{Q} \end{cases}$
   and let $h = f + g$. Then we see that $h(x) = f(x) + g(x) = 1$ for all
   $x \in \mathbb{R}$. Certainly $\int_0^1 h(x) \, dx = 1$ but what are the values
   of $\int_0^1 f(x) \, dx$ and $\int_0^1 g(x) \, dx$ and is it true that
   \[ 1 = \int_0^1 h(x) \, dx = \int_0^1 f(x) \, dx + \int_0^1 g(x) \, dx \ ? ) \]

7. $S = \mathbb{R}$,
   $f(x) = \begin{cases} 0, & x < 0, \\ \frac14, & 0 \le x < 1, \\
   \frac12, & 1 \le x < 6, \\ 1, & x \ge 6. \end{cases}$

   This kind of step-function is familiar from probability theory: it is the
   cumulative distribution function $f(x) = \mathrm{Prob}\{ X \le x \}$ for a
   random variable $X$ taking the values 0, 1 and 6 with probabilities
   $\frac14$, $\frac14$ and $\frac12$, respectively.


Let $f : S \to \mathbb{R}$ be a given function and let $A \subseteq S$.

We say that $f$ is bounded from above on $A$ if there is some $M$ such that
$f(x) \le M$ for all $x \in A$.
Analogously, $f$ is said to be bounded from below on $A$ if there is some $m$
such that $f(x) \ge m$ for all $x \in A$.
If $f$ is both bounded from above and from below on $A$, then we say that $f$
is bounded on $A$.

We say that $f$ is increasing on $A$ if $f(x_1) \le f(x_2)$ for any
$x_1, x_2 \in A$ with $x_1 < x_2$, and decreasing on $A$ if
$f(x_1) \ge f(x_2)$ for any $x_1, x_2 \in A$ with $x_1 < x_2$.

We say that $f$ is strictly increasing on $A$ if $f(x_1) < f(x_2)$ for any
$x_1, x_2 \in A$ with $x_1 < x_2$, and strictly decreasing on $A$ if
$f(x_1) > f(x_2)$ for any $x_1, x_2 \in A$ with $x_1 < x_2$.

Examples 5.2.

1. $S = \mathbb{R}$, $f(x) = x^2$. Then $f$ is bounded from below on
   $\mathbb{R}$ (by $m = 0$) but $f$ is not bounded from above on $\mathbb{R}$.
   $f$ is strictly increasing on $[0, \infty)$ and $f$ is strictly decreasing on
   $(-\infty, 0]$. $f$ is bounded on any bounded interval $[a, b]$. (We see that
   $0 \le f(x) \le \max\{ a^2, b^2 \}$ on $[a, b]$.)

2. $S = [1, \infty)$, $f(x) = 1 - \frac1x$ for $x \in S$. Then $f$ is increasing
   and bounded on $S$. We see that $f$ attains its glb, namely 0, but does not
   attain its lub, 1.

Definition 5.3. Let $f : S \to \mathbb{R}$ and let $x_0 \in S$. We say that $f$
is continuous at the point $x_0$ if for any given $\varepsilon > 0$ there is some
$\delta > 0$ such that
\[ x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon . \]
We say that $f$ is continuous on some given set $A$ if $f$ is continuous at each
point of $A$.

What's going on? Note that continuity is defined at some point $x_0$. The idea is
that one is first given a margin of error, this is the $\varepsilon > 0$. For $f$
to be continuous at the specified point $x_0$, we demand that $f(x)$ be within
distance $\varepsilon$ of $f(x_0)$ as long as $x$ is suitably close to $x_0$,
i.e., $x$ is within some suitable distance $\delta$ of $x_0$. It must be possible
to find such $\delta$ no matter how small $\varepsilon$ is. In general, one must
expect that the smaller $\varepsilon$ is, then the smaller $\delta$ will need to
be. The requirement that $x \in S$ ensures that $f(x)$ actually makes sense in the
first place.
If we set $h = x - x_0$, then we demand that $f(x_0 + h)$ be within distance
$\varepsilon$ of $f(x_0)$ whenever $|h| < \delta$ (provided that
$x_0 + h \in S$). The point $x_0$ and the error value $\varepsilon$ must be given
first. Then one must be able to find a suitable $\delta$ as indicated.


We shall illustrate the idea with a simple example (so no surprises here).

Example 5.4. Let $S = \mathbb{R}$ and set $f(x) = x^2$. Let $x_0$ be arbitrary (but fixed).
We shall show that $f$ is continuous at $x_0$. The procedure is as follows.
Let $\varepsilon > 0$ be given.
We must find some $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$.
For convenience, write $x = x_0 + h$. We see that
\[
|f(x) - f(x_0)| = |x^2 - x_0^2| = |(x_0 + h)^2 - x_0^2| = |2 x_0 h + h^2| . \qquad (*)
\]
How small must $|h|$ be in order for this to be smaller than $\varepsilon$? We do not
need an optimal estimate; any will do. One idea would be to notice that
$|2 x_0 h + h^2| \le |2 x_0 h| + h^2$ and then try to make each of these two terms
smaller than $\tfrac12 \varepsilon$, that is, we try to make sure that both $|2 x_0 h| < \tfrac12 \varepsilon$ and
$h^2 < \tfrac12 \varepsilon$. This suggests the two requirements that $|h| < \varepsilon/(4 |x_0|)$ and
$|h| < \sqrt{\varepsilon/2}$. We must be careful here because it might happen that $x_0 = 0$,
in which case we cannot divide by $|x_0|$. To side-step this nuisance, we shall
consider the two cases $x_0 = 0$ and $x_0 \ne 0$ separately.
So first suppose that $x_0 \ne 0$. Then we simply choose $\delta$ to be the minimum
of the two terms $\varepsilon/(4 |x_0|)$ and $\sqrt{\varepsilon/2}$. This will ensure that if $|h| < \delta$ then
$|f(x) - f(x_0)| < \varepsilon$.
Next, suppose that $x_0 = 0$. Then the right hand side of $(*)$ is simply equal
to $h^2$. If we choose $\delta = \sqrt{\varepsilon}$, then $|h| < \delta$ implies that $h^2 < \varepsilon$ and so, by $(*)$,
we have $|f(x) - f(x_0)| < \varepsilon$.
In either of the cases $x_0 = 0$ or $x_0 \ne 0$, we have exhibited a suitable $\delta > 0$
so that $|x - x_0| < \delta$ implies that $|f(x) - f(x_0)| < \varepsilon$. We have shown that
$f(x) = x^2$ is continuous at any given point $x_0 \in \mathbb{R}$ and the proof is complete.
Notice that the $\delta$ depends on both $\varepsilon$ and $x_0$. We must always expect this to
happen (even though in some trivial situations it might not).
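
The choice of the tolerance in Example 5.4 can be checked numerically. The following is only a sketch: the names `delta_for_square` and `check` are our own, `eps` and `delta` stand for the margin of error and the closeness required of x, and sampling finitely many values of h illustrates the estimate but is of course no substitute for the proof.

```python
import math


def delta_for_square(x0: float, eps: float) -> float:
    """Return the delta chosen in Example 5.4 for f(x) = x^2 at the point x0."""
    if x0 == 0:
        # Case x0 = 0: |f(x) - f(0)| = h^2, so delta = sqrt(eps) works.
        return math.sqrt(eps)
    # Case x0 != 0: make |2*x0*h| < eps/2 and h^2 < eps/2 simultaneously.
    return min(eps / (4 * abs(x0)), math.sqrt(eps / 2))


def check(x0: float, eps: float, samples: int = 1000) -> bool:
    """Sample h with |h| < delta and confirm |f(x0 + h) - f(x0)| < eps."""
    delta = delta_for_square(x0, eps)
    for i in range(-samples + 1, samples):
        h = delta * i / samples  # |h| < delta by construction
        if abs((x0 + h) ** 2 - x0 ** 2) >= eps:
            return False
    return True


print(delta_for_square(3.0, 0.1), check(3.0, 0.1))
print(delta_for_square(0.0, 0.01), check(0.0, 0.01))
```

Trying larger values of `abs(x0)` with the same `eps` shows the returned `delta` shrinking, which is the dependence on the point noted at the end of the example.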

The following theorem gives us an extremely useful characterization of
continuity.

Theorem 5.5. Let $f : S \to \mathbb{R}$ and let $x_0 \in S$. The following two statements
are equivalent:

(i) $f$ is continuous at $x_0$;

(ii) if $(a_n)_{n \in \mathbb{N}}$ is any sequence in $S$ such that $a_n \to x_0$ as $n \to \infty$, then
the sequence $(f(a_n))_{n \in \mathbb{N}}$ converges to $f(x_0)$.

Proof. Suppose that statement (i) holds. To show that (ii) is also true, let
$(a_n)$ be any sequence in $S$ with the property that $a_n \to x_0$ as $n \to \infty$. We
must show that $f(a_n) \to f(x_0)$ as $n \to \infty$.


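Let $\varepsilon > 0$ be given.
By hypothesis, $f$ is continuous at $x_0$ and so there is some $\delta > 0$ such
that
\[
x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon . \qquad (*)
\]
But $a_n \to x_0$ as $n \to \infty$ and so there exists $N \in \mathbb{N}$ such that
\[
n > N \implies |a_n - x_0| < \delta . \qquad (**)
\]
Evidently, $(*)$ and $(**)$ together (with $x = a_n$) tell us that
\[
n > N \implies |f(a_n) - f(x_0)| < \varepsilon
\]
which means that $(f(a_n))$ converges to $f(x_0)$ as $n \to \infty$, as required.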

Now suppose that (ii) holds. We must show that this implies that $f$ is
continuous at $x_0$. Suppose that this were not true, that is, let us
suppose that $f$ is not continuous at the point $x_0 \in S$. What does this mean?
It means that there is some $\varepsilon_0 > 0$ such that it is false that there is some
$\delta > 0$ so that
\[
x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon_0 .
\]
That is, there is some $\varepsilon_0 > 0$ such that no matter what $\delta > 0$ we choose, it
will be false that
\[
x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon_0 .
\]
That is, there is some $\varepsilon_0 > 0$ such that for any $\delta > 0$ there is some $x \in S$
with $|x - x_0| < \delta$ such that it is false that $|f(x) - f(x_0)| < \varepsilon_0$.
That is, there is some $\varepsilon_0 > 0$ such that for any $\delta > 0$ there is some
$x \in S$ with $|x - x_0| < \delta$ such that $|f(x) - f(x_0)| \ge \varepsilon_0$. Note that $x$ may
well depend on $\delta$.
How does this help? For given $n \in \mathbb{N}$, set $\delta = \frac{1}{n}$. Then, according to the
discussion above, there is some point $x \in S$ with $|x - x_0| < \frac{1}{n}$ but such that
$|f(x) - f(x_0)| \ge \varepsilon_0$. The number $x$ could depend on $n$, so let us relabel it
and call it $a_n$. Then
\[
|a_n - x_0| < \frac{1}{n} \quad \text{but} \quad |f(a_n) - f(x_0)| \ge \varepsilon_0 .
\]
If we do this for each $n \in \mathbb{N}$ we get a sequence $(a_n)_{n \in \mathbb{N}}$ in $S$ which clearly
converges to $x_0$. However, because $|f(a_n) - f(x_0)| \ge \varepsilon_0$ for all $n \in \mathbb{N}$, the
sequence $(f(a_n))_{n \in \mathbb{N}}$ does not converge to $f(x_0)$. This is a contradiction (we
started with the hypothesis that (ii) was true). Therefore our assumption
that $f$ was not continuous at $x_0$ is wrong and we conclude that $f$ is indeed
continuous at $x_0$. This completes the proof that the truth of statement (ii)
implies that of statement (i).


We can now apply this theorem, together with various known results
about sequences, to establish some (not very surprising but) basic properties
of continuous functions.

Theorem 5.6. Suppose that $f : S \to \mathbb{R}$, $g : S \to \mathbb{R}$ and that $\alpha \in \mathbb{R}$. Suppose
that $x_0 \in S$ and that $f$ and $g$ are continuous at $x_0$. Then

(i) The sum $f + g$ is continuous at $x_0$.

(ii) $\alpha f$ is continuous at $x_0$.

(iii) The product $f g$ is continuous at $x_0$.

(iv) If $g$ does not vanish on $S$, then the quotient $f/g$ is defined on $S$
and is continuous at $x_0$.

Proof. Suppose that $(a_n)$ is any sequence in $S$ with the property that $a_n \to x_0$
as $n \to \infty$. Then we know from the previous theorem that $f(a_n) \to f(x_0)$
and also that $g(a_n) \to g(x_0)$ as $n \to \infty$. It follows that

(i) The sum $(f + g)(a_n) = f(a_n) + g(a_n) \to f(x_0) + g(x_0) = (f + g)(x_0)$
as $n \to \infty$.

(ii) $\alpha f(a_n) \to \alpha f(x_0)$ as $n \to \infty$.

(iii) The product $(f g)(a_n) = f(a_n) g(a_n) \to f(x_0) g(x_0)$ as $n \to \infty$.

(iv) Since $g$ does not vanish on $S$, the quotient $f/g$ is well-defined on $S$.
Moreover, $g(a_n) \ne 0$ for any $n \in \mathbb{N}$ and so $(f/g)(a_n) = f(a_n)/g(a_n) \to
f(x_0)/g(x_0)$ as $n \to \infty$.

Now applying the previous theorem once again proves (i) to (iv).

Remark 5.7. We could also have proved the above facts directly from the
definition of continuity. For example, a proof that $f + g$ is continuous at $x_0$
is as follows.
Let $\varepsilon > 0$ be given.
Then there is some $\delta' > 0$ such that
\[
|x - x_0| < \delta' \text{ (and } x \in S) \implies |f(x) - f(x_0)| < \tfrac12 \varepsilon . \qquad (*)
\]
The reason for using $\tfrac12 \varepsilon$ rather than $\varepsilon$ will become clear below. Similarly,
there is some $\delta'' > 0$ such that
\[
|x - x_0| < \delta'' \text{ (and } x \in S) \implies |g(x) - g(x_0)| < \tfrac12 \varepsilon . \qquad (**)
\]


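Now, let $\delta = \min\{ \delta', \delta'' \}$. Then, using the two estimates for $f$ and for $g$,
\begin{align*}
|(f + g)(x) - (f + g)(x_0)| &= |f(x) - f(x_0) + g(x) - g(x_0)| \\
&\le |f(x) - f(x_0)| + |g(x) - g(x_0)| \\
&< \tfrac12 \varepsilon + \tfrac12 \varepsilon = \varepsilon
\end{align*}
whenever $|x - x_0| < \delta$ (and $x \in S$) and so, by definition, it follows that $f + g$
is continuous at the point $x_0$ in $S$.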

Remark 5.8. The function $f(x) = x$ is continuous on $\mathbb{R}$ and so with $g = f$, we
deduce from the theorem that $f^2(x)$ is also continuous on $\mathbb{R}$. This is just the
statement that the function $x^2$ is continuous. By induction, we can deduce
from the theorem that products of continuous functions and also finite linear
combinations of continuous functions are continuous, i.e., if $f_1, \dots, f_k$ are
each continuous at $x_0$, then so is the product function $f_1 f_2 \cdots f_k$ as well
as the linear combination $\alpha_1 f_1 + \dots + \alpha_k f_k$, for any $\alpha_1, \dots, \alpha_k \in \mathbb{R}$. In
particular, any power of a continuous function is continuous and taking
$f(x) = x$, we see that any polynomial $a_0 + a_1 x + \dots + a_n x^n$ is continuous
on $\mathbb{R}$.

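Example 5.9. The function $x \mapsto \sqrt{x}$ is continuous on $[0, \infty)$. This can be
shown as follows.

Let $x_0 \in [0, \infty)$ be fixed and let $\varepsilon > 0$ be given. Suppose first that $x_0 > 0$.
For any $x \ge 0$, we have
\begin{align*}
|\sqrt{x} - \sqrt{x_0}| &= \frac{|\sqrt{x} - \sqrt{x_0}| \, |\sqrt{x} + \sqrt{x_0}|}{|\sqrt{x} + \sqrt{x_0}|} \\
&= \frac{|x - x_0|}{\sqrt{x} + \sqrt{x_0}} \\
&\le \frac{|x - x_0|}{\sqrt{x_0}} \\
&< \varepsilon
\end{align*}
provided $|x - x_0| < \delta$, where we have chosen $\delta = \varepsilon \sqrt{x_0}$.

To conclude, consider the case $x_0 = 0$. Then we simply observe that
\[
|\sqrt{x} - \sqrt{x_0}| = \sqrt{x} < \varepsilon
\]
whenever $|x - 0| < \delta$ with $\delta$ chosen to be $\varepsilon^2$.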


Example 5.10. The function $x \mapsto 1/x$, for $x > 0$, is continuous on $(0, \infty)$.

Let $f(x) = 1/x$ for $x > 0$. To show that $f$ is continuous on $(0, \infty)$, let
$x_0 \in (0, \infty)$ be given and suppose that $(a_n)_{n \in \mathbb{N}}$ is any sequence in $(0, \infty)$
such that $a_n \to x_0$ as $n \to \infty$. We know that this implies that $1/a_n \to 1/x_0$,
that is, $f(a_n) \to f(x_0)$ as $n \to \infty$. But this implies that $f$ is continuous at
$x_0$, as required.
Note that $f$ is bounded from below (by 0) but $f$ is not bounded from
above on $(0, \infty)$. For any $M > 0$, there is $k \in \mathbb{N}$ such that $k > M$, by the
Archimedean Property. Hence, if $0 < x < 1/k$, then $f(x) = 1/x > k > M$.
It follows that there is no constant $M$ such that $f(x) < M$ for all $x \in (0, \infty)$,
that is, $f$ is not bounded from above on $(0, \infty)$.

From the example above, we see that if f (x) = 1/x for x in any interval
of the form (0, b), say, then f is continuous on (0, b) but is not bounded
there. This situation cannot happen on closed intervals. This is the content
of the following important theorem.

Theorem 5.11. Suppose that the function $f : [a, b] \to \mathbb{R}$ is continuous on the
closed interval $[a, b]$. Then $f$ is bounded on $[a, b]$.

Proof. We argue by contradiction. Suppose that $f$ is continuous on $[a, b]$ but
is not bounded. Suppose that $f$ is not bounded from above. This means
that for any given $M$ whatsoever, there will be some $x \in [a, b]$ such that
$f(x) > M$. In particular, for each $n \in \mathbb{N}$ (taking $M = n$) we know that
there is some point $a_n$, say, in the interval $[a, b]$ such that $f(a_n) > n$.
Consider the sequence $(a_n)_{n \in \mathbb{N}}$. This sequence lies in the bounded interval
$[a, b]$ and so, by the Bolzano-Weierstrass Theorem, it has a convergent
subsequence $(a_{n_k})_{k \in \mathbb{N}}$, say; $a_{n_k} \to \alpha$ as $k \to \infty$. Since $a \le a_{n_k} \le b$ for all
$k$, it follows that $a \le \alpha \le b$. (The limit of a convergent sequence belonging
to a closed interval also belongs to the same closed interval.) But, by hypothesis,
$f$ is continuous at $\alpha$ and so $a_{n_k} \to \alpha$ implies that $f(a_{n_k}) \to f(\alpha)$.
It is this that will provide our sought after contradiction. By construction,
$f(a_{n_k}) > n_k$ and so it looks rather unlikely that $(f(a_{n_k}))$ could converge.
To see that this is the situation, we observe that there is some $K \in \mathbb{N}$ such
that
\[
|f(a_{n_k}) - f(\alpha)| < 1
\]
for all $k > K$ (because $f(a_{n_k}) \to f(\alpha)$). But then
\begin{align*}
f(a_{n_k}) &= f(a_{n_k}) - f(\alpha) + f(\alpha) \\
&\le |f(a_{n_k}) - f(\alpha)| + f(\alpha) \\
&< 1 + f(\alpha)
\end{align*}
for all $k > K$. However, $f(a_{n_k}) > n_k \ge k$ so $1 + f(\alpha) > k$ for all $k > K$,
which is impossible because $1 + f(\alpha)$ is a fixed real number.
This is a contradiction and we conclude that $f$ is bounded from above.


To show that $f$ is also bounded from below, we consider $g = -f$. Then
$g$ is continuous because $f$ is. The argument just presented, applied to $g$,
shows that $g$ is bounded from above. But this just means that $f$ is bounded
from below and the proof is complete.

Remark 5.12. The two essential ingredients are that $f$ is continuous and that
the interval is both closed and bounded. The boundedness was required so
that we could invoke the Bolzano-Weierstrass Theorem and the fact that it
was closed ensured that $\alpha$, the limit of the Bolzano-Weierstrass convergent
subsequence, actually also belonged to the interval. This in turn guaranteed
that $f$ was not only defined at $\alpha$ but was continuous there.
If we try to relax these requirements, we see that the conclusion of the
theorem need no longer be true. For example, we must insist that $f$ be
continuous. Indeed, consider the function $f$ on the closed interval $[0, 1]$
given by
\[
f(x) = \begin{cases} 0, & \text{for } x = 0 \\ 1/x, & \text{for } 0 < x \le 1. \end{cases}
\]
Evidently $f$ is not bounded on $[0, 1]$ but then $f$ is not continuous at the
point $x = 0$.
Taking $f(x) = 1/x$ for $x \in (0, 1]$, we see that again $f$ is not bounded on
$(0, 1]$, but then $(0, 1]$ is not a closed interval.
Let $f(x) = x$ for $x \in [0, \infty)$. Again, $f$ is not bounded on the interval $[0, \infty)$
but this interval is not bounded.

We have seen that a continuous function on a closed interval is bounded.
The next theorem tells us that it attains its bounds.

Theorem 5.13. Suppose that $f$ is continuous on the closed interval $[a, b]$.
Then there is some $\alpha \in [a, b]$ and $\beta \in [a, b]$ such that $f(\alpha) \le f(x) \le f(\beta)$
for all $x \in [a, b]$. In other words, if $\operatorname{ran} f = \{ f(x) : x \in [a, b] \}$ is the range
of $f$, then $f(\alpha) = \inf \operatorname{ran} f = \min \operatorname{ran} f$ and $f(\beta) = \sup \operatorname{ran} f = \max \operatorname{ran} f$.

Proof. We have seen that $f$ is bounded. Let $m = \inf \operatorname{ran} f$ and $M =
\sup \operatorname{ran} f$. By definition of the supremum, there is some sequence $(y_n)$ in
$\operatorname{ran} f$ such that $y_n \to M$ as $n \to \infty$. Since $y_n \in \operatorname{ran} f$, there is some
$x_n \in [a, b]$ such that $y_n = f(x_n)$. By the Bolzano-Weierstrass Theorem, $(x_n)$
has a convergent subsequence $(x_{n_k})_{k \in \mathbb{N}}$. Let $\beta = \lim_{k \to \infty} x_{n_k}$. Then $\beta \in [a, b]$.
Since $f$ is continuous on $[a, b]$, it follows that $f(x_{n_k}) \to f(\beta)$ as $k \to \infty$.
But $f(x_{n_k}) = y_{n_k}$ and $(y_{n_k})_{k \in \mathbb{N}}$ is a subsequence of the convergent sequence
$(y_n)$. Therefore $(y_{n_k})_{k \in \mathbb{N}}$ converges to the same limit, that is, $y_{n_k} \to M$ as
$k \to \infty$. Since $y_{n_k} = f(x_{n_k}) \to f(\beta)$ as $k \to \infty$, we deduce that $M = f(\beta)$.
That is, $\sup \operatorname{ran} f = f(\beta)$ and so $f(x) \le f(\beta)$ for all $x \in [a, b]$.


We can argue in a similar way to show that there is some $\alpha \in [a, b]$ such
that $m = f(\alpha)$. However, we can draw the same conclusion using the above
result as follows. Note that if $g = -f$, then $g$ is continuous on the interval
$[a, b]$ and $\sup \operatorname{ran} g = -m$. By the argument above, there is some $\alpha \in [a, b]$
such that $-m = g(\alpha)$. This gives the desired result that $m = f(\alpha)$.

Alternative Proof. We know that $f$ is bounded. Let $M = \sup \operatorname{ran} f$. To
show that $f$ achieves its least upper bound $M$, we suppose not and obtain a
contradiction. Since $M$ is an upper bound and is not achieved by $f$, we must
have that $f(x) < M$ for all $x \in [a, b]$. In particular, $M - f$ is continuous and
strictly positive on $[a, b]$. It follows that $h = 1/(M - f)$ is also continuous
and positive on $[a, b]$. But then $h$ is bounded on $[a, b]$ and so there is some
constant $K$ such that $0 < h \le K$ on $[a, b]$, that is,
\[
0 < \frac{1}{M - f} \le K .
\]
Hence $f \le M - 1/K$ which says that $M - 1/K$ is an upper bound for $f$ on
$[a, b]$. But then this contradicts the fact that $M$ is the least upper bound for
$f$ on $[a, b]$. We conclude that $f$ achieves this bound, i.e., there is $\beta \in [a, b]$
such that $f(\beta) = M = \sup \operatorname{ran} f$.
In a similar way, if $f$ does not achieve its greatest lower bound, $m$, then
$f - m$ is continuous and strictly positive on $[a, b]$. Hence there is $L$ such
that
\[
0 < \frac{1}{f - m} \le L
\]
on $[a, b]$. Hence $m + 1/L \le f$ and $m + 1/L$ is a lower bound for $f$ on $[a, b]$.
This contradicts the fact that $m$ is the greatest lower bound for $f$ on $[a, b]$
and we can conclude that $f$ does achieve its greatest lower bound, that is,
there is $\alpha \in [a, b]$ such that $f(\alpha) = m$.

Theorem 5.14 (Intermediate-Value Theorem). Any real-valued function $f$
continuous on the interval $[a, b]$ assumes all values between $f(a)$ and $f(b)$.
In other words, if $\gamma$ lies between the values $f(a)$ and $f(b)$, then there is
some $s$ with $a \le s \le b$ such that $f(s) = \gamma$.

Proof. Suppose $f$ is continuous on $[a, b]$ and let $\gamma$ be any value between $f(a)$
and $f(b)$. If $\gamma = f(a)$, take $s = a$ and if $\gamma = f(b)$ take $s = b$.
Suppose that $f(a) < f(b)$ and let $f(a) < \gamma < f(b)$. Let $A$ be the set
$A = \{ x \in [a, b] : f(x) < \gamma \}$. Then $a \in A$ and so $A$ is a non-empty subset
of the bounded interval $[a, b]$. Hence $A$ is bounded and so has a least upper
bound, $s$, say. We shall show that $f(s) = \gamma$.
Since $s = \operatorname{lub} A$, there is some sequence $(a_n)$ in $A$ such that $a_n \to s$. But
$A \subseteq [a, b]$ and so $a \le a_n \le b$ and it follows that $a \le s \le b$. Furthermore,


by the continuity of $f$ at $s$, it follows that $f(a_n) \to f(s)$. However, $a_n \in A$
and so $f(a_n) < \gamma$ for each $n$ and it follows that $f(s) \le \gamma$. Since, in addition,
$\gamma < f(b)$, we see that $s \ne b$ and so we must have $a \le s < b$.
Let $(t_n)$ be any sequence in $(s, b)$ such that $t_n \to s$. Since $t_n \in [a, b]$
and $t_n > s$, it must be the case that $t_n \notin A$, that is, $f(t_n) \ge \gamma$. Now, $f$
is continuous at $s$ and so $f(t_n) \to f(s)$ which implies that $f(s) \ge \gamma$. We
deduce that $f(s) = \gamma$, as required.

Now suppose that $f(a) > \gamma > f(b)$. Set $g(x) = -f(x)$. Then we have
that $g(a) < -\gamma < g(b)$ and applying the above result to $g$, we can say that
there is $s \in [a, b]$ such that $g(s) = -\gamma$, that is, $f(s) = \gamma$ and the proof is
complete.

Corollary 5.15. Suppose that $f$ is continuous on $[a, b]$. Then $\operatorname{ran} f$, the range
of $f$, is a closed interval $[m, M]$.

Proof. We know that $f$ is bounded and that $f$ achieves its bounds, that is,
there is $\alpha \in [a, b]$ and $\beta \in [a, b]$ such that
\[
m = \inf \operatorname{ran} f = f(\alpha) \le f(x) \le M = \sup \operatorname{ran} f = f(\beta)
\]
for all $x \in [a, b]$. Evidently, $\operatorname{ran} f \subseteq [m, M]$.

Let $c$ obey $m \le c \le M$. By the Intermediate-Value Theorem, there is
some $s$ between $\alpha$ and $\beta$ such that $f(s) = c$. In particular, $c \in \operatorname{ran} f$ and so
we conclude that $\operatorname{ran} f = [m, M]$.

Example 5.16. $f(x) = x^6 + 3x^2 - 1$ has a zero inside the interval $[0, 1]$.
To see this, we simply notice that $f(0) = -1$ and $f(1) = 3$. Since $f$ is
continuous on $\mathbb{R}$, it is continuous on $[0, 1]$ and so, by the Intermediate-Value
Theorem, $f$ assumes every value between $-1$ and 3 over the interval $[0, 1]$.
In particular, there is some $s \in [0, 1]$ such that $f(s) = 0$, as claimed. Of
course, this argument does not tell us whether such $s$ is unique or not. (In
fact, it is, because $f$ is strictly increasing on $[0, \infty)$ and so cannot take any
value twice on $[0, \infty)$. A moment's reflection reveals that $f(x) \ge f(0) = -1$,
$f$ is not bounded from above and $f(-x) = f(x)$. Therefore $f$ assumes every
value in the range $(-1, \infty)$ exactly twice and assumes the value $-1$ at the
single point $x = 0$.)
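
The Intermediate-Value Theorem only asserts that a zero exists, but repeatedly halving an interval on which the function changes sign gives a practical way to approximate one. The following sketch applies this bisection idea to the function of Example 5.16; the helper name `bisect` and the tolerance `tol` are our own choices, and the method assumes only continuity and a sign change at the end-points.

```python
def f(x: float) -> float:
    """The polynomial of Example 5.16."""
    return x**6 + 3 * x**2 - 1


def bisect(func, a: float, b: float, tol: float = 1e-12) -> float:
    """Approximate a zero of func in [a, b], given a sign change at the ends."""
    fa, fb = func(a), func(b)
    if fa * fb > 0:
        raise ValueError("func(a) and func(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = func(mid)
        if fm == 0:
            return mid
        if fa * fm < 0:
            # Sign change in [a, mid]: keep the left half.
            b, fb = mid, fm
        else:
            # Otherwise the sign change is in [mid, b]: keep the right half.
            a, fa = mid, fm
    return (a + b) / 2


root = bisect(f, 0.0, 1.0)
print(root, f(root))
```

Each pass keeps a closed subinterval on which the function still changes sign, so the theorem guarantees that a zero remains trapped inside it.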

Example 5.17 (Thomae's function). We wish to exhibit a function which is
continuous at each irrational point in $[0, 1]$ but is not continuous at any
rational point in $[0, 1]$. Such a function was constructed by Thomae in 1875.
Any rational number $x$ may be written as $x = p/q$ where we may assume
that $p$ and $q$ are coprime and that $p \in \mathbb{Z}$ and $q \in \mathbb{N}$. This done, we define


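$\varphi : \mathbb{Q} \to \mathbb{R}$ by setting $\varphi(x) = 1/q$ where $x = p/q$. For example,
\begin{align*}
\varphi(x) &= 1 && \text{for } x = 1 \\
\varphi(x) &= \tfrac12 && \text{for } x = \tfrac12 \\
\varphi(x) &= \tfrac13 && \text{for } x = \tfrac13, \tfrac23 \\
\varphi(x) &= \tfrac14 && \text{for } x = \tfrac14, \tfrac34 \\
&\ \ \vdots \\
\varphi(x) &= \tfrac1{11} && \text{for } x = \tfrac1{11}, \tfrac2{11}, \dots, \tfrac{10}{11}
\end{align*}
. . . and so on.

Suppose $x \in \mathbb{Q}$ obeys $0 < x < 1$ and that $\varphi(x) = 1/q$. Then $x$ must be of
the form $x = p/q$ for some $p \in \mathbb{N}$ with $1 \le p \le q - 1$. In particular, for any
given $q \in \mathbb{N}$, $\{ x \in \mathbb{Q} : 0 < x < 1 \text{ and } \varphi(x) = 1/q \}$ is a finite set of rational
numbers.
Next, we define $f : [0, 1] \to \mathbb{R}$ with the help of $\varphi$ as follows.
\[
f(x) = \begin{cases} 1, & x = 0 \\ \varphi(x), & x \in \mathbb{Q} \cap [0, 1] \\ 0, & x \notin \mathbb{Q} \cap [0, 1] . \end{cases}
\]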

Claim: $f$ is discontinuous at every rational in $[0, 1]$.

Proof. First we note that $f(0) = 1$ and that $f(x) = 1/q$ when $x$ has the
form $p/q$ (with $p$, $q$ coprime). In any event, $f(x) > 0$ for any given rational
$x$ in $[0, 1]$. Now let $r \in \mathbb{Q} \cap [0, 1]$ be given and let $(x_n)$ be any sequence of
irrationals in $[0, 1]$ which converges to $r$. (For example, if $r \ne 0$ we could let
$x_n = r(1 - 1/(n\sqrt{2}))$ but otherwise let $x_n = 1/(n\sqrt{2})$.) Then $f(x_n) = 0$ for
every $n$ so it cannot be true that $f(x_n) \to f(r)$ (because $f(r) > 0$), that is,
$f$ fails to be continuous at $r$, as claimed.

Claim: $f$ is continuous at every irrational in $[0, 1]$.

Proof. Let $x_0$ be any given irrational number with $0 < x_0 < 1$. Then
$f(x_0) = 0$.
Let $\varepsilon > 0$ be given.
We must show that there is some $\delta > 0$ such that
\[
x \in [0, 1] \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon , \qquad (*)
\]
where we note that $|f(x) - f(x_0)| = |f(x) - 0| = |f(x)|$.

Now $f(0) = f(1) = 1$ and so the inequality $|f(x)| < \varepsilon$ in $(*)$ must fail for $x = 0$ or $x = 1$ if $\varepsilon < 1$.
Furthermore, it fails if $x = p/q$ ($p$, $q$ coprime) and $\varphi(x) = 1/q \ge \varepsilon$, that is,
$q \le 1/\varepsilon$. In other words, it will fail if $x = 0$, $x = 1$ or else $x \in \mathbb{Q} \cap [0, 1]$ and


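$\varphi(x) = 1/q$ where $q \le 1/\varepsilon$. However, there are only finitely-many numbers
$q \in \mathbb{N}$ obeying $q \le 1/\varepsilon$ and so the set
\[
A = \{ r \in \mathbb{Q} \cap [0, 1] : r = 0 \text{ or } \varphi(r) \ge \varepsilon \}
\]
is finite. Write $A = \{ r_1, \dots, r_m \}$.
Since $x_0 \notin \mathbb{Q}$, it follows that $x_0 \ne r_j$ for any $1 \le j \le m$. For each
$1 \le j \le m$, let $\delta_j = |x_0 - r_j|$ and let $\delta = \min\{ \delta_j : 1 \le j \le m \}$. Then
$\delta > 0$ and if $x$ obeys $|x - x_0| < \delta$ it must be the case that $x \ne r_j$ for any
$1 \le j \le m$. It follows that if $x \in [0, 1]$ and obeys $|x - x_0| < \delta$, then either
$x \notin \mathbb{Q}$ and so $f(x) = 0$ or else $x \in \mathbb{Q}$ but $x \notin A$ and so $f(x) = \varphi(x) < \varepsilon$. In
any event, $(*)$ holds and so $f$ is continuous at $x_0$, as required.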
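
The finite exceptional set used in this proof can be listed explicitly. The sketch below works with exact rationals via Python's `fractions.Fraction`; the function names are ours, `eps` plays the role of the given margin of error, and the irrational point is necessarily represented only approximately by a float, so the last function is an illustration rather than an exact computation.

```python
from fractions import Fraction


def thomae(x: Fraction) -> Fraction:
    """Value of Thomae's function at a rational x in [0, 1], namely 1/q for x = p/q in lowest terms."""
    x = Fraction(x)  # Fraction always stores p/q in lowest terms
    return Fraction(1, x.denominator)


def exceptional_set(eps: Fraction) -> list:
    """All rationals r in [0, 1] whose function value is at least eps (a finite set)."""
    found = set()
    q = 1
    while Fraction(1, q) >= eps:
        for p in range(0, q + 1):
            found.add(Fraction(p, q))
        q += 1
    return sorted(found)


def delta_for(x0: float, eps: Fraction) -> float:
    """Distance from x0 to the nearest exceptional rational."""
    return min(abs(x0 - float(r)) for r in exceptional_set(eps))


print(exceptional_set(Fraction(1, 3)))
print(delta_for(2 ** 0.5 / 2, Fraction(1, 3)))
```

Every rational strictly closer to the chosen point than the returned distance has function value below `eps`, which is exactly what the proof requires.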


Differentiability
We know from calculus that the slope of the tangent to the graph of a
function $f$ at some point is given by the so-called derivative at the point
in question. To find this slope, one considers the limiting behaviour of the
Newton quotient
\[
\frac{f(a + h) - f(a)}{h}
\]
as $h$ approaches 0. We wish to set this up formally.

Definition 5.18. We say that the function $f$ is differentiable at the point $a$
if $\lim_{h \to 0,\, h \ne 0} \frac{f(a+h) - f(a)}{h}$ exists, that is, if there is some $\lambda \in \mathbb{R}$ such that for
any $\varepsilon > 0$ there is some $\delta > 0$ such that
\[
0 < |h| < \delta \implies \left| \frac{f(a + h) - f(a)}{h} - \lambda \right| < \varepsilon .
\]
The real number $\lambda$ is called the derivative of $f$ at $a$ and is usually written
as $f'(a)$ or as $\frac{df}{dx}(a)$.

Remarks 5.19.

1. Note that the Newton quotient $\frac{f(a + h) - f(a)}{h}$ is not defined for $h = 0$
and clearly, it will only make any sense if both $f(a)$ and $f(a + h)$ are
defined. We shall take it to be part of the definition that this is true, at
least for suitably small values of $h$. That is, we assume that there is some
(possibly very small) open interval around $a$ of the form $(a - \rho, a + \rho)$
on which $f$ is defined. This means that if a function is defined only
on the integers $\mathbb{Z}$, say, then it will not make any sense to discuss its
differentiability.

2. We see immediately that if $f$ is constant, then $f(a + h) = f(a)$ for any
$h$ and so the Newton quotient is zero for all $h \ne 0$ and therefore $f$ is
indeed differentiable at $a$ with derivative $f'(a) = 0$.

3. Suppose that $f$ is differentiable at $a$ with derivative $f'(a)$. Let $\Phi_{f,a}$ be
the function given by
\[
\Phi_{f,a}(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a} , & x \ne a \\ f'(a) , & x = a. \end{cases}
\]
Then
\[
\Phi_{f,a}(a + h) = \begin{cases} \dfrac{f(a + h) - f(a)}{h} , & h \ne 0 \\ f'(a) , & h = 0. \end{cases}
\]


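By definition of differentiability, for any given $\varepsilon > 0$ there is some $\delta > 0$
such that
\[
0 < |h| < \delta \implies |\Phi_{f,a}(a + h) - f'(a)| < \varepsilon ,
\]
that is,
\[
0 < |h| < \delta \implies |\Phi_{f,a}(a + h) - \Phi_{f,a}(a)| < \varepsilon . \qquad (*)
\]
Now, $(*)$ is still valid if we allow $h = 0$ and so (with $x = a + h$), we see
that
\[
|x - a| < \delta \implies |\Phi_{f,a}(x) - \Phi_{f,a}(a)| < \varepsilon .
\]
In other words, the differentiability of $f$ at $a$ implies that $\Phi_{f,a}$ is continuous
at $x = a$.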
Example 5.20. Let
\[
f(x) = \begin{cases} x^3 , & x \ge 0 \\ x^2 , & x < 0 . \end{cases}
\]
What is $f'(x)$?
Consider the region $x > 0$. Here, $f(x) = x^3$ and so $f$ is differentiable
with derivative $3x^2$ for any $x > 0$. In the region $x < 0$, $f(x) = x^2$ and so
$f'(x) = 2x$ for any $x < 0$. What about $x = 0$? We must argue from first
principles. The Newton quotient (with $a = 0$) is
\[
\frac{f(0 + h) - f(0)}{h} = \begin{cases} \dfrac{h^3 - 0}{h} = h^2 & \text{for } h > 0 \\[1ex] \dfrac{h^2 - 0}{h} = h & \text{for } h < 0 \end{cases}
\ \to 0 \text{ as } h \to 0 .
\]
Hence $f$ is differentiable at $x = 0$ with derivative $f'(0) = 0$.

Proposition 5.21. If f is differentiable at a, then f is continuous at a.

Proof. The idea is straightforward. For $h \ne 0$, we can write
\[
f(a + h) - f(a) = \frac{f(a + h) - f(a)}{h} \, h .
\]
The first factor on the right hand side approaches $f'(a)$ as $h \to 0$ and so the
whole right hand side should approach zero as $h \to 0$. Looking at the left
hand side, this means that $f(a + h)$ approaches $f(a)$ as $h \to 0$. Formally,
with $\Phi_{f,a}$ the function introduced in Remarks 5.19, we have
\[
f(x) = f(a) + \Phi_{f,a}(x) \, (x - a) .
\]
The second term on the right hand side is the product of the two functions $\Phi_{f,a}(x)$ and $(x - a)$,
each being continuous at $x = a$ and so the same is true of their product.
Therefore the right hand side, and hence the left hand side, is continuous at $x = a$, as required.


Example 5.22. The converse to Proposition 5.21 is false. As an example,
consider $f(x) = |x|$ for $x \in \mathbb{R}$. Then $f$ is continuous at every $x \in \mathbb{R}$.
However, $f$ is not differentiable at $x = 0$. Indeed,
\[
\frac{f(0 + h) - f(0)}{h} = \frac{|h + 0| - |0|}{h} = \frac{|h|}{h} = \begin{cases} 1, & \text{if } h > 0 \\ -1, & \text{if } h < 0 \end{cases}
\]
so the Newton quotient does not have a limit as $h \to 0$ (with $h \ne 0$) and
consequently $f$ is not differentiable at $x = 0$.
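
The behaviour of the Newton quotient in Examples 5.20 and 5.22 can be tabulated numerically. This is a sketch only: the names `piecewise` and `newton_quotient` are ours, and evaluating the quotient at a handful of small step sizes suggests, but does not prove, what the limit is.

```python
def piecewise(x: float) -> float:
    """The function of Example 5.20: x^3 for x >= 0 and x^2 for x < 0."""
    return x**3 if x >= 0 else x**2


def newton_quotient(func, a: float, h: float) -> float:
    """(func(a + h) - func(a)) / h, defined only for h != 0."""
    if h == 0:
        raise ValueError("h must be non-zero")
    return (func(a + h) - func(a)) / h


steps = [10.0 ** (-k) for k in range(1, 7)]

# Example 5.20: both one-sided quotients shrink to 0.
right = [newton_quotient(piecewise, 0.0, h) for h in steps]   # roughly h**2
left = [newton_quotient(piecewise, 0.0, -h) for h in steps]   # roughly -h

# Example 5.22: for |x| the one-sided quotients stay at +1 and -1.
abs_right = [newton_quotient(abs, 0.0, h) for h in steps]
abs_left = [newton_quotient(abs, 0.0, -h) for h in steps]

print(right[-1], left[-1])
print(abs_right[-1], abs_left[-1])
```

For the first function both lists tend to the common value 0, whereas for the absolute value the two sides disagree, so no single limit can exist.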

The following are familiar and very important rules.

Proposition 5.23. Suppose that $f$ and $g$ are differentiable at $x_0$.

(i) For any $\alpha \in \mathbb{R}$, $\alpha f$ is differentiable at $x_0$ and $(\alpha f)'(x_0) = \alpha f'(x_0)$.

(ii) The sum $f + g$ is differentiable at $x_0$ and
\[
(f + g)'(x_0) = f'(x_0) + g'(x_0) .
\]
(iii) The product $f g$ is differentiable at $x_0$ and
\[
(f g)'(x_0) = f'(x_0) \, g(x_0) + f(x_0) \, g'(x_0) .
\]
(iv) Suppose that $f \ne 0$. Then $1/f$ is differentiable at $x_0$ and
\[
\left( \frac{1}{f} \right)' (x_0) = - \frac{f'(x_0)}{(f(x_0))^2} .
\]

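Proof. In the following, $h$ is small but $h \ne 0$, and $\Phi_{f,x_0}$ denotes the function introduced in Remarks 5.19.

(i) We have
\begin{align*}
\frac{(\alpha f)(x_0 + h) - (\alpha f)(x_0)}{h} &= \alpha \, \frac{f(x_0 + h) - f(x_0)}{h} \\
&= \alpha \, \Phi_{f,x_0}(x_0 + h) \to \alpha f'(x_0)
\end{align*}
as $h \to 0$.
(ii) We have
\begin{align*}
\frac{(f + g)(x_0 + h) - (f + g)(x_0)}{h} &= \frac{f(x_0 + h) + g(x_0 + h) - f(x_0) - g(x_0)}{h} \\
&= \frac{f(x_0 + h) - f(x_0)}{h} + \frac{g(x_0 + h) - g(x_0)}{h} \\
&\to f'(x_0) + g'(x_0)
\end{align*}
as $h \to 0$.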


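(iii) We have
\begin{align*}
\frac{(f g)(x_0 + h) - (f g)(x_0)}{h} &= \frac{f(x_0 + h) \, g(x_0 + h) - f(x_0) \, g(x_0)}{h} \\
&= \frac{f(x_0 + h) - f(x_0)}{h} \, g(x_0 + h) + f(x_0) \, \frac{g(x_0 + h) - g(x_0)}{h} \\
&\to f'(x_0) \, g(x_0) + g'(x_0) \, f(x_0)
\end{align*}
as $h \to 0$, since $g$ is continuous at $x_0$.
(iv) We have
\begin{align*}
\frac{1/f(x_0 + h) - 1/f(x_0)}{h} &= \frac{1}{h} \left( \frac{1}{f(x_0 + h)} - \frac{1}{f(x_0)} \right) \\
&= \frac{1}{h} \, \frac{f(x_0) - f(x_0 + h)}{f(x_0 + h) \, f(x_0)} \\
&= - \frac{\Phi_{f,x_0}(x_0 + h)}{f(x_0 + h) \, f(x_0)} \\
&\to - \frac{f'(x_0)}{(f(x_0))^2}
\end{align*}
as $h \to 0$ since $f$ is continuous at $x_0$.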

Recall that $f \circ g$ denotes the composition $x \mapsto f(g(x))$ (function of a
function). Of course, for this to be well-defined the range of $g$ must be
contained in the domain of definition of $f$. In the following, we assume that
this is satisfied.

Theorem 5.24 (Chain Rule). Suppose that $g$ is differentiable at $x_0$ and that
$f$ is differentiable at $v_0 = g(x_0)$. Then the composition $f \circ g$ is differentiable
at $x_0$ and
\[
(f \circ g)'(x_0) = f'(g(x_0)) \, g'(x_0) .
\]

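Proof. Suppose that $h$ is small and that $h \ne 0$. Let $v_0 = g(x_0)$ and put
$\kappa = g(x_0 + h) - g(x_0)$ so that $g(x_0 + h) = v_0 + \kappa$. Then, with $\Phi_{f,v_0}$ and $\Phi_{g,x_0}$ as in Remarks 5.19,
\begin{align*}
\frac{(f \circ g)(x_0 + h) - (f \circ g)(x_0)}{h} &= \frac{f(g(x_0 + h)) - f(g(x_0))}{h} \\
&= \frac{f(v_0 + \kappa) - f(v_0)}{h} \\
&= \Phi_{f,v_0}(v_0 + \kappa) \, \kappa \, \frac{1}{h} \quad \text{(even if } \kappa = 0 \text{)} \\
&= \Phi_{f,v_0}(v_0 + \kappa) \, \frac{g(x_0 + h) - g(x_0)}{h} \\
&= \Phi_{f,v_0}(v_0 + \kappa) \, \Phi_{g,x_0}(x_0 + h) .
\end{align*}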


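Now, $g(x_0 + h) \to g(x_0)$ as $h \to 0$ because $g$ is continuous at $x_0$. In other
words, $\kappa = g(x_0 + h) - g(x_0) \to 0$ as $h \to 0$. It follows that
\begin{align*}
\Phi_{f,v_0}(v_0 + \kappa) \, \Phi_{g,x_0}(x_0 + h) &\to \Phi_{f,v_0}(v_0) \, \Phi_{g,x_0}(x_0) \\
&= f'(v_0) \, g'(x_0) \\
&= f'(g(x_0)) \, g'(x_0)
\end{align*}
as $h \to 0$ and the result follows.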

Imagine a function $f(x)$ on the interval $[0, 1]$, say, which has the property
that $f(0) = f(1)$. Can we draw any conclusions about the behaviour of $f(x)$
for $x$ between 0 and 1? It seems clear that either $f$ is constant on $[0, 1]$ or
else goes up and/or down but in any event must have a turning point.
We know from calculus that this should demand that $f'$ be zero somewhere.
However, it is clear that $f$ cannot be entirely arbitrary for this to be true. For
example, suppose that $f(0) = 0 = f(1)$ and that $f(x) = 5x$ for $0 < x < 1$.
Evidently $f'$ is never zero. In fact, $f'(x) = 5$ for $0 < x < 1$. We note that $f$
is not continuous at $x = 1$.
As another example, consider $f(x) = 1 - |x|$ for $x \in [-1, 1]$. We see that
$f(-1) = 0 = f(1)$ but is it true that $f'$ is zero for some $x$ between $-1$ and 1?
No, it is not. We see that $f'(x) = 1$ for $-1 < x < 0$ and that $f'(x) = -1$
for $0 < x < 1$ and $f$ is not differentiable at $x = 0$. In this example, $f$ is
continuous on $[-1, 1]$ but fails to be differentiable on $(-1, 1)$.
If we impose suitable continuity and differentiability hypotheses, then
what we want will be true.

Theorem 5.25 (Rolle's Theorem). Suppose that $f$ is continuous on the closed
interval $[a, b]$ and is differentiable in the open interval $(a, b)$. Suppose further
that $f(a) = f(b)$. Then there is some $\xi \in (a, b)$ such that $f'(\xi) = 0$. (Note
that $\xi$ need not be unique.)

Proof. Since $f$ is continuous on $[a, b]$, it follows that $f$ is bounded and attains
its bounds, by Theorem 5.13. Let $m = \inf\{ f(x) : x \in [a, b] \}$ and let
$M = \sup\{ f(x) : x \in [a, b] \}$, so that
\[
m \le f(x) \le M , \quad \text{for all } x \in [a, b].
\]
If $m = M$, then $f$ is constant on $[a, b]$ and this means that $f'(x) = 0$ for all
$x \in (a, b)$. In this case, any $\xi \in (a, b)$ will do.
Suppose now that $m \ne M$, so that $m < M$. Since $f(a) = f(b)$ at least
one of $m$ or $M$ must be different from this common value $f(a) = f(b)$.
Suppose that $M \ne f(a)$ ($= f(b)$). As noted above, by Theorem 5.13,
there is some $\xi \in [a, b]$ such that $f(\xi) = M$. Now, $M \ne f(a)$ and $M \ne f(b)$
and so $\xi \ne a$ and $\xi \ne b$. It follows that $\xi$ belongs to the open interval $(a, b)$.


We shall show that $f'(\xi) = 0$. To see this, we note that $f(x) \le M = f(\xi)$
for any $x \in [a, b]$ and so (putting $x = \xi + h$) it follows that $f(\xi + h) - f(\xi) \le 0$
provided $|h|$ is small enough to ensure that $\xi + h \in [a, b]$. Hence
\[
\frac{f(\xi + h) - f(\xi)}{h} \le 0 \quad \text{for } h > 0 \text{ and small} \qquad (*)
\]
and
\[
\frac{f(\xi + h) - f(\xi)}{h} \ge 0 \quad \text{for } h < 0 \text{ and small.} \qquad (**)
\]
But the quotient in $(*)$ approaches $f'(\xi)$ as $h \to 0$ which implies that $f'(\xi) \le 0$. On the other
hand, the quotient in $(**)$ approaches $f'(\xi)$ as $h \to 0$ and so $f'(\xi) \ge 0$. Putting these two
results together, we see that it must be the case that $f'(\xi) = 0$, as required.
It remains to consider the case when $M = f(a)$. This must require that
$m < f(a)$ ($= f(b)$). We proceed now just as before to deduce that there
is some $\xi \in (a, b)$ such that $f(\xi) = m$ and so $(*)$ and $(**)$ hold but with
the inequalities reversed. However, the conclusion is the same, namely that
$f'(\xi) = 0$.

Theorem 5.26 (Mean Value Theorem). Suppose that $f$ is continuous on the
closed interval $[a, b]$ and differentiable on the open interval $(a, b)$. Then there
is some $\xi \in (a, b)$ such that
\[
f'(\xi) = \frac{f(b) - f(a)}{b - a} .
\]

Proof. Let $y = \ell(x) = mx + c$ be the straight line passing through the pair
of points $(a, f(a))$ and $(b, f(b))$. Then the slope $m$ is equal to the ratio
$(f(b) - f(a))/(b - a)$.
Let $g(x) = f(x) - \ell(x)$. Evidently, $g$ is continuous on $[a, b]$ and
differentiable on $(a, b)$ (because $\ell$ is). Furthermore, since $\ell(a) = f(a)$ and
$\ell(b) = f(b)$, by construction, we find that $g(a) = 0 = g(b)$. By Rolle's Theorem,
Theorem 5.25, applied to $g$, there is some $\xi \in (a, b)$ such that $g'(\xi) = 0$.
However, $g'(x) = f'(x) - m$ for any $x \in (a, b)$ and so
\[
f'(\xi) = m = \frac{f(b) - f(a)}{b - a}
\]
and the proof is complete.
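
As an illustration of the Mean Value Theorem, one can locate such an intermediate point numerically for a concrete function. The sketch below takes the cube function on the interval from 0 to 2 (our own choice of example) and finds the point by bisection on the difference between the derivative and the chord slope. Note the extra assumptions, which the theorem itself does not need: the derivative is supplied explicitly, is continuous, and the difference changes sign at the end-points.

```python
import math


def f(x: float) -> float:
    return x**3


def df(x: float) -> float:
    """Derivative of f, supplied by hand."""
    return 3 * x**2


def mean_value_point(deriv, slope: float, a: float, b: float, tol: float = 1e-12) -> float:
    """Find xi in (a, b) with deriv(xi) = slope by bisection.

    Assumes deriv is continuous and deriv - slope has opposite signs at a and b.
    """
    ga, gb = deriv(a) - slope, deriv(b) - slope
    if ga * gb >= 0:
        raise ValueError("deriv - slope must change sign on [a, b]")
    while b - a > tol:
        mid = (a + b) / 2
        gm = deriv(mid) - slope
        if ga * gm <= 0:
            b = mid
        else:
            a, ga = mid, gm
    return (a + b) / 2


a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # slope of the chord through the end-points
xi = mean_value_point(df, slope, a, b)
print(slope, xi)
```

Here the chord has slope 4 and the equation for the intermediate point can be solved by hand, giving the value 2 divided by the square root of 3, which the computed value can be compared against.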

We know that a function which is constant on an open interval is differentiable
and that its derivative is zero. The converse is true (so no surprise
there then).


Corollary 5.27. Suppose that $f$ is differentiable on the open interval $(a, b)$
and that $f'(x) = 0$ for all $x \in (a, b)$. Then $f$ is constant on $(a, b)$.

Proof. Let $\alpha$ and $\beta$ be any pair of points in $(a, b)$. We shall show that
$f(\alpha) = f(\beta)$. By relabelling, if necessary, we may suppose that $\alpha < \beta$. By
hypothesis, $f$ is differentiable at each point in the closed interval $[\alpha, \beta]$ and
so is also continuous there, by Proposition 5.21. $f$ obeys the hypotheses
of the Mean Value Theorem on $[\alpha, \beta]$ and so we can say that there is some
$\xi \in (\alpha, \beta)$ such that
\[
f'(\xi) = \frac{f(\beta) - f(\alpha)}{\beta - \alpha} .
\]
However, $f'$ vanishes on $(a, b)$ and so $f'(\xi) = 0$ which means that we must
have $f(\alpha) = f(\beta)$ and the result follows.

Remark 5.28. The Mean Value Theorem can sometimes be useful for obtaining
inequalities. For example, setting $f(x) = \sin x$ and assuming standard
properties of the trigonometric functions, we can apply the Mean Value
Theorem to $f$ on the interval $[0, x]$ for $x > 0$ to find that
\[
f'(\xi) = \frac{f(x) - f(0)}{x - 0} \quad \text{or} \quad \cos \xi = \frac{\sin x}{x}
\]
for some $\xi \in (0, x)$. However, $\cos \xi \le 1$ for all $\xi$ and so we find that $\sin x \le x$
for all $x > 0$.
Similarly, applying the Mean Value Theorem to $f(x) = \ln(1 + x)$ on the
interval $[0, x]$, we find that
\[
f'(\xi) = \frac{f(x) - f(0)}{x - 0} \quad \text{or} \quad \frac{1}{1 + \xi} = \frac{\ln(1 + x)}{x}
\]
for some $\xi \in (0, x)$. But then $1/(1 + \xi) < 1$ and we find that $\ln(1 + x) < x$
for any $x > 0$.
These inequalities could also have easily been obtained from the fact
that the integral of a positive function is positive. Indeed,
\[
x - \sin x = \int_0^x (1 - \cos t) \, dt \ge 0 .
\]
In the same way,
\[
x - \ln(1 + x) = \int_0^x \left( 1 - \frac{1}{1 + t} \right) dt \ge 0 .
\]
In fact, one can show that both integrals are strictly positive if $x > 0$ so this
last method gives the strict inequalities $\sin x < x$ and $\ln(1 + x) < x$ for all
$x > 0$. (In this connection, note that if $\ln(1 + x) = x$, then $1 + x = e^x$. This
is not possible for any $x > 0$ as is seen from the series expansion for $e^x$.)


Suppose that $f$ and $g$ are continuous on $[a, b]$, differentiable on $(a, b)$ and
that $g'$ is never zero on $(a, b)$. The Mean Value Theorem applied to $f$ and $g$
tells us that there is some $\xi$ and $\eta$ in $(a, b)$ such that
\[
\frac{f(b) - f(a)}{b - a} = f'(\xi) \quad \text{and} \quad \frac{g(b) - g(a)}{b - a} = g'(\eta) .
\]
Dividing (and noting that $g(b) - g(a) \ne 0$ since $g'(\eta) \ne 0$, by hypothesis),
gives
\[
\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\eta)} .
\]
It is possible to do a little better.
Theorem 5.29 (Cauchy's Mean Value Theorem). Suppose that $f$ and $g$ are
continuous on $[a, b]$ and differentiable on $(a, b)$. Suppose further that $g'$ is
never zero on $(a, b)$. Then there is some $\xi \in (a, b)$ such that
\[
\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)} .
\]
Proof. First, we observe that if $g(a) = g(b)$ then Rolle's Theorem tells us
that $g'(\eta) = 0$ for some $\eta \in (a, b)$. However, $g'$ has no zeros on $(a, b)$, by
hypothesis, and so it follows as noted above that $g(a) \ne g(b)$. Set
\[
\varphi(x) = \bigl( g(b) - g(a) \bigr) f(x) - \bigl( f(b) - f(a) \bigr) g(x) .
\]
Then
\[
\varphi(a) = g(b) \, f(a) - f(b) \, g(a) = \varphi(b)
\]
and $\varphi$ satisfies the hypotheses of Rolle's Theorem. Hence there is some
$\xi \in (a, b)$ such that $\varphi'(\xi) = 0$, that is,
\[
\bigl( g(b) - g(a) \bigr) f'(\xi) - \bigl( f(b) - f(a) \bigr) g'(\xi) = 0
\]
or
\[
\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)} ,
\]
as required.

Remark 5.30. Notice that interchanging $a$ and $b$ does not affect the left
hand side of the above equality. This means that we can slightly rephrase
Cauchy's Mean Value Theorem to say that for any $a \ne b$ there is some $\xi$
between $a$ and $b$ such that
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)} , \]
regardless of whether $a < b$ or $a > b$.


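Taylor's Theorem
It is convenient to let $f^{(k)}$ denote the $k$th derivative of $f$ (whenever it exists).
Now, if $k > j$, then $\frac{d^k (x^j)}{dx^k} = 0$, whereas if $k \le j$, then we see that
\[ \frac{d^k (x^j)}{dx^k} = j(j - 1) \cdots (j - (k - 1))\, x^{j-k} . \]
For $k < j$ this vanishes when $x = 0$, whereas for $k = j$ it is the constant $k!$. So we see that

\[ \frac{d^k (x^j)}{dx^k} \Big|_{x=0} = 0 \]
for any $k, j \in \mathbb{N}$ with $k \ne j$.
Consider the polynomial $p(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \dots + \alpha_m x^m$. Taking
derivatives and setting $x = 0$, we find $p(0) = \alpha_0$, $p'(0) = \alpha_1$, $p^{(2)}(0) = 2\alpha_2$,
$p^{(3)}(0) = 3!\, \alpha_3$. In general,

\[ p^{(k)}(0) = k!\, \alpha_k . \]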

Now consider some general function $f(x)$ and define

\[ a_0 = f(0), \quad a_1 = f'(0), \quad a_2 = \tfrac{1}{2} f^{(2)}(0), \quad \dots, \quad a_k = \frac{1}{k!} f^{(k)}(0), \quad \dots \]
Let

\[ P_{n-1}(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1} \]
and
\[ R_n(x) = f(x) - P_{n-1}(x) . \]

If $f(x)$ is a polynomial of degree $n - 1$, then $f(x) = P_{n-1}(x)$ and $R_n(x) = 0$.


So, in general, we can think of $P_{n-1}(x)$ as a polynomial approximation to
$f(x)$ and $R_n(x)$ as the remainder. The smaller $R_n(x)$ is, the closer $f(x)$ is to
a polynomial. The question is, what can be said about $R_n(x)$? This is the
content of Taylor's Theorem.
To begin with, we notice that for $k \le n - 1$,
\[ R_n^{(k)}(0) = f^{(k)}(0) - P_{n-1}^{(k)}(0) = f^{(k)}(0) - k!\, a_k = 0 , \]

by our construction of the $a_k$'s. We will use this in the following discussion.


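Now, for $x \ne 0$, we apply Cauchy's Mean Value Theorem to the pair of
functions $R_n(t)$ and $g_n(t) = t^n$, to write
\[ \frac{R_n(x) - R_n(0)}{g_n(x) - g_n(0)} = \frac{R_n'(\xi)}{g_n'(\xi)} \]
for some $\xi$ lying between $0$ and $x$. (It does not matter whether $x > 0$ or
$x < 0$.) Now, any such $\xi$ can be expressed in the form $\xi = \theta_1 x$ for some
$0 < \theta_1 < 1$. Hence
\[ \frac{R_n(x)}{g_n(x)} = \frac{R_n(x) - R_n(0)}{g_n(x) - g_n(0)} = \frac{R_n'(\theta_1 x)}{g_n'(\theta_1 x)} \]
for some $0 < \theta_1 < 1$, since both $R_n(0) = 0$ and $g_n(0) = 0$.

We repeat this argument applied successively to $R_n^{(k)}(t)$ and $g_n^{(k)}(t)$, and
use the facts that $R_n^{(k)}(0) = 0$ and $g_n^{(k)}(0) = 0$ for $k \le n - 1$, to deduce that

\begin{align*}
\frac{R_n(x)}{g_n(x)} &= \frac{R_n(x) - R_n(0)}{g_n(x) - g_n(0)} = \frac{R_n'(\theta_1 x)}{g_n'(\theta_1 x)} && \text{for some } 0 < \theta_1 < 1, \\
\frac{R_n'(\theta_1 x)}{g_n'(\theta_1 x)} &= \frac{R_n'(\theta_1 x) - R_n'(0)}{g_n'(\theta_1 x) - g_n'(0)} = \frac{R_n''(\theta_2 \theta_1 x)}{g_n''(\theta_2 \theta_1 x)} && \text{for some } 0 < \theta_2 < 1, \\
\frac{R_n''(\theta_2 \theta_1 x)}{g_n''(\theta_2 \theta_1 x)} &= \frac{R_n''(\theta_2 \theta_1 x) - R_n''(0)}{g_n''(\theta_2 \theta_1 x) - g_n''(0)} = \frac{R_n^{(3)}(\theta_3 \theta_2 \theta_1 x)}{g_n^{(3)}(\theta_3 \theta_2 \theta_1 x)} && \text{for some } 0 < \theta_3 < 1, \\
&\;\;\vdots \\
\frac{R_n^{(n-1)}(\theta_{n-1} \cdots \theta_1 x)}{g_n^{(n-1)}(\theta_{n-1} \cdots \theta_1 x)} &= \frac{R_n^{(n-1)}(\theta_{n-1} \cdots \theta_1 x) - R_n^{(n-1)}(0)}{g_n^{(n-1)}(\theta_{n-1} \cdots \theta_1 x) - g_n^{(n-1)}(0)} = \frac{R_n^{(n)}(\theta_n \cdots \theta_1 x)}{g_n^{(n)}(\theta_n \cdots \theta_1 x)} && \text{for some } 0 < \theta_n < 1.
\end{align*}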

However, $R_n^{(n)}(s) = f^{(n)}(s) - P_{n-1}^{(n)}(s) = f^{(n)}(s)$, since $P_{n-1}^{(n)}(s) = 0$, and
$g_n^{(n)}(s) = n!$. Let $\theta = \theta_1 \theta_2 \cdots \theta_n$. Then $0 < \theta < 1$ and we get that

\[ \frac{f(x) - P_{n-1}(x)}{x^n} = \frac{R_n(x)}{g_n(x)} = \frac{f^{(n)}(\theta x)}{n!} . \]

We can rewrite this to give

\[ f(x) = P_{n-1}(x) + \frac{x^n}{n!} f^{(n)}(\theta x) \]
for some $0 < \theta < 1$. We have established the following theorem.

Theorem 5.31 (Taylor's Theorem). Suppose $f$ is defined on some interval
$(\alpha, \beta)$ and has derivatives up to order $n$ at all points in $(\alpha, \beta)$. Suppose also
that $0 \in (\alpha, \beta)$ and $x \in (\alpha, \beta)$. Then

\[ f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \dots + \frac{x^{n-1}}{(n-1)!} f^{(n-1)}(0) + R_n(x) \]

where $R_n(x) = \dfrac{x^n}{n!} f^{(n)}(\xi)$ for some $\xi$ between $0$ and $x$.

Remark 5.32. Note that $\xi$ will generally depend on $f$, $x$ and also $n$.

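Example 5.33. Let $f(x) = \ln(1 + x)$ on, say, $(-1, 3)$. The derivatives of $f$ are
given by
\[ f^{(k)}(x) = \frac{(-1)^{k+1} (k - 1)!}{(1 + x)^k} \quad \text{for } k \in \mathbb{N}. \]

For any $x \in (-1, 3)$, by Taylor's Theorem (up to remainder order $n + 1$), we
may say that
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + \frac{(-1)^{n+1} x^n}{n} + R_{n+1}(x) \]
where
\[ R_{n+1}(x) = \frac{x^{n+1}}{(n + 1)!} \, \frac{(-1)^{n+2}\, n!}{(1 + \xi)^{n+1}} = \frac{x^{n+1} (-1)^{n+2}}{(n + 1)(1 + \xi)^{n+1}} \]
for some $\xi$ between $0$ and $x$. Now let $x = 1$. Then $f(1) = \ln 2$ and so
\[ \ln 2 - \Big( 1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \dots + \frac{(-1)^{n+1}}{n} \Big) = R_{n+1}(1) \]

where $R_{n+1}(1) = \dfrac{(-1)^{n+2}}{(n + 1)(1 + \xi)^{n+1}}$ for some $0 < \xi < 1$.
But $|R_{n+1}(1)| < \dfrac{1}{n + 1}$, which means that $R_{n+1}(1) \to 0$ as $n \to \infty$. It
follows that

\[ \ln 2 = 1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \dots = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} . \]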
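
The remainder estimate in Example 5.33 can be checked numerically. The following is only an illustrative sketch in Python (the function name alternating_partial_sum is our own and not part of the notes): it forms the partial sums of the series for ln 2 exactly and compares the error with the bound 1/(n + 1) obtained from Taylor's Theorem.

```python
from fractions import Fraction
import math


def alternating_partial_sum(n):
    """Return 1 - 1/2 + 1/3 - ... + (-1)^(n+1)/n as an exact fraction."""
    return sum(Fraction((-1) ** (k + 1), k) for k in range(1, n + 1))


for n in (10, 100, 1000):
    s_n = float(alternating_partial_sum(n))
    error = abs(math.log(2) - s_n)   # |R_{n+1}(1)|
    bound = 1 / (n + 1)              # bound from Taylor's Theorem
    print(n, s_n, error, bound, error < bound)
```

The convergence is slow: the bound only guarantees an error below 1/(n + 1), so many terms are needed for each extra decimal place.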

There is a further, more general formulation. For fixed $a$, let $g(s) = f(s + a)$
and apply Taylor's Theorem to $g(s)$ to get

\[ g(s) = g(0) + s\, g'(0) + \tfrac{1}{2} s^2 g''(0) + \dots + \frac{s^{n-1}}{(n-1)!} g^{(n-1)}(0) + \frac{s^n}{n!} g^{(n)}(\eta) \]
for some $\eta$ between $0$ and $s$. Now, $g(s) = f(s + a)$ and $g(0) = f(a)$.
Furthermore, by the chain rule, we find that $g^{(k)}(0) = f^{(k)}(a)$ and $g^{(n)}(\eta) =
f^{(n)}(\eta + a)$. But if $\eta$ lies between $0$ and $s$ then $\eta + a$ lies between $a$ and $s + a$.
Putting $x = s + a$, we have $s = x - a$ and so $\xi = \eta + a$ lies between $a$ and $x$.
We arrive at the following version of Taylor's Theorem.
Theorem 5.34 (Taylor's Theorem for $f$ about $a$). Suppose $f$ is defined on
some interval $(\alpha, \beta)$ and has derivatives up to order $n$ at all points in $(\alpha, \beta)$.
Suppose also that $a \in (\alpha, \beta)$ and $x \in (\alpha, \beta)$. Then

\[ f(x) = f(a) + (x - a) f'(a) + \frac{(x - a)^2}{2!} f''(a) + \dots + \frac{(x - a)^{n-1}}{(n-1)!} f^{(n-1)}(a) + R_n(x) \]
where $R_n(x) = \dfrac{(x - a)^n}{n!} f^{(n)}(\xi)$ for some $\xi$ between $a$ and $x$.

Chapter 6

Power Series

Definition 6.1. A series of the form $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$, where the $a_n$ are
constants, is called a power series (about $x = \alpha$).

We notice immediately that such a power series always converges for
$x = \alpha$ (in this case, all terms, except possibly for the $a_0$ term, are zero).
What can be said about the convergence of power series? The following
results explain the situation. By setting $w = x - \alpha$, it is often sufficient to
consider the case $\alpha = 0$, so that the powers are simply powers of $x$, and we
will usually do this.
Proposition 6.2. Suppose that the power series $\sum_{n=0}^{\infty} a_n x^n$ converges for
some value $x = x_0$ with $x_0 \ne 0$. Then it converges absolutely for every
$x$ satisfying $|x| < |x_0|$.

Proof. Let $S_n(x) = \sum_{k=0}^{n} a_k x^k$. By hypothesis, $(S_n(x_0))_{n \in \mathbb{N} \cup \{0\}}$ converges.
In particular, $(a_k x_0^k)$ converges (to zero) and so is a bounded sequence; that
is, there is some $M > 0$ such that $|a_k x_0^k| < M$ for all $k$.
We wish to show that $\sum_{k=0}^{\infty} |a_k x^k|$ converges for every $x$ with $|x| < |x_0|$.
Suppose, then, that $x$ obeys $|x| < |x_0|$ and set $\rho = |x / x_0|$. Evidently,
$0 \le \rho < 1$ and so $\sum_{k=0}^{\infty} \rho^k$ converges. But then

\[ |a_k x^k| = |a_k x_0^k| \, |x / x_0|^k \le M \rho^k \]

and so $\sum_{k=0}^{\infty} |a_k x^k|$ converges by the Comparison Test.

Radius of Convergence of a Power Series


Consider a given power series $\sum_{n=0}^{\infty} a_n x^n$ and let
\[ J = \Big\{ x \in \mathbb{R} : \sum_{n=0}^{\infty} a_n x^n \text{ converges} \Big\} . \]

What can be said about $J$? Certainly, $0 \in J$ and it could happen that this
is the only element of $J$. For example, if $a_n = n^n$, then $a_n x^n = (nx)^n$ and so
no matter how small $x$ is, eventually $|nx| > 1$ provided $x \ne 0$. This means
that for any given $x \ne 0$, it is false that $a_n x^n \to 0$ as $n \to \infty$ and so the
power series cannot converge. In this case $J = \{0\}$.
Suppose that $x_0 \in J$, so that $\sum_{n=0}^{\infty} a_n x_0^n$ is convergent. Then we know
that $\sum_{n=0}^{\infty} a_n x^n$ also converges (absolutely) for every $x$ obeying $|x| < |x_0|$.
In other words, if $x_0 \in J$, then every point in the interval $(-|x_0|, |x_0|)$ also
belongs to $J$. What does this mean for $J$? There are 3 distinct (mutually
exclusive) possibilities.

(i) $J = \{0\}$.

(ii) $J$ is bounded but there is some $t \ne 0$ with $t \in J$
(that is, $J \ne \{0\}$ but is bounded).

(iii) $J$ is unbounded.

We can immediately deduce that if $J$ is not bounded, case (iii), then it must
be the whole of $\mathbb{R}$. Indeed, to say that $J$ is not bounded is to say that for
any $r > 0$, there is some $x \in J$ with $|x| > r$. Hence $[-r, r] \subseteq J$ for all $r > 0$
and so $J = \mathbb{R}$.
Now consider case (ii) and let
\[ A = \Big\{ r > 0 : \sum_{n=0}^{\infty} a_n x^n \text{ converges for } x \in (-r, r) \Big\} . \]

Evidently, if $t \in J$ and $t \ne 0$, then $|t| \in A$, so $A$ is not empty, and $A$ is bounded
because $J$ is. Let $R = \operatorname{lub} A$. Then $R > 0$, otherwise we are in case (i).
Suppose $0 < \rho < R$. Then, by definition of lub, there is $r \in A$ such
that $\rho < r \le R$. But then the series $\sum_{n=0}^{\infty} a_n x^n$ converges (absolutely) for
$x \in (-r, r)$ and, in particular, for $x$ with $|x| = \rho$.
Next, suppose that $x \in \mathbb{R}$ with $|x| = \rho > R$. If $\sum_{n=0}^{\infty} a_n x^n$ were to
converge, then we could deduce that $(-\rho, \rho) \subseteq J$, which would mean that
$\rho \in A$. This contradicts the fact that $R$ is an upper bound for $A$ and so
$\sum_{n=0}^{\infty} a_n x^n$ cannot converge for any such $x$.
Case (ii) means then that there is some $R > 0$ such that $\sum_{n=0}^{\infty} a_n x^n$
converges (absolutely) for all $x$ with $|x| < R$ but diverges for any $x$ with
$|x| > R$. The behaviour of the power series when $|x| = R$ (i.e., $x = \pm R$)
requires separate extra discussion and will depend on the particular power
series. Anything is possible.
This discussion is summarized in the following very important theorem.


Theorem 6.3 (Radius of Convergence Theorem for Power Series). For any
given power series $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$, exactly one of the following three
possibilities applies.

(i) $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$ converges only for $x = \alpha$.

(ii) There is $R > 0$ such that $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$ converges (absolutely)
for all $|x - \alpha| < R$ but diverges for any $x$ with $|x - \alpha| > R$.

(iii) $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$ converges (absolutely) for all $x$.

Definition 6.4. The value $R$ above is called the radius of convergence of the
power series. In case (iii), one says that the series has an infinite radius of
convergence.

Examples 6.5.

1. Consider $\sum_{n=0}^{\infty} x^n$. This series converges if $|x| < 1$ (by the Ratio Test)
and otherwise diverges, so $R = 1$. Note that the series diverges at both
of the boundary values $x = \pm 1$.

2. Consider

\[ \sum_{n=0}^{\infty} a_n x^n = 1 + x + \frac{x^2}{2} + \frac{x^3}{3} + \dots \]

The series converges if $|x| < 1$ (by Comparison with $1 + x + x^2 + \dots$).
If $x = 1$, then it becomes $1 + 1 + \frac{1}{2} + \frac{1}{3} + \dots$ which we know diverges. It
follows that it cannot converge for any $x$ with $|x| > 1$. When $x = -1$, it
becomes $1 - 1 + \frac{1}{2} - \frac{1}{3} + \dots$ which converges. So $1 + x + x^2/2 + x^3/3 + \dots$
converges at $x = -1$ but diverges at $x = 1$.
Replacing $x$ by $-x$, we see that the series

\[ 1 - x + \frac{x^2}{2} - \frac{x^3}{3} + \frac{x^4}{4} - \frac{x^5}{5} + \dots \]
converges for $|x| < 1$ and for $x = 1$ but diverges when $x = -1$.

3. Formally adding together the two series above suggests the power series

\[ 2 + \frac{2x^2}{2} + \frac{2x^4}{4} + \frac{2x^6}{6} + \dots = 1 + 1 + x^2 + \frac{x^4}{2} + \frac{x^6}{3} + \dots \]
which converges for $|x| < 1 = R$ but diverges when $x = \pm 1$.

4. The series
\[ 1 - \frac{x^2}{2} + \frac{x^6}{3} - \frac{x^8}{4} + \dots \]
converges for $|x| < 1 = R$ and also converges for both $x = \pm 1$.

5. The series

\[ \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \]
converges absolutely for all $x \in \mathbb{R}$, by the Ratio Test.
If the power series $\sum_{n=0}^{\infty} a_n x^n$ is differentiated term by term, then the
resulting power series is $\sum_{n=1}^{\infty} n a_n x^{n-1}$. This is called the associated derived
series. The next theorem tells us that this makes sense.

Theorem 6.6. Suppose that $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$.
Then the series $\sum_{n=1}^{\infty} n a_n x^{n-1}$ also has radius of convergence equal to $R$.
(The possibility of an infinite radius of convergence is included.)
Proof. Suppose that $0 < |u| < R$. Let $r > 0$ obey $0 < |u| < r < R$. Then
$\sum_{n=0}^{\infty} |a_n| r^n$ converges. Since $n^{1/n} \to 1$ as $n \to \infty$, it follows that there is
some $N \in \mathbb{N}$ such that $n^{1/n} < r / |u|$ for all $n > N$. Therefore

\[ n |a_n| |u^n| = |a_n| \big( n^{1/n} |u| \big)^n < |a_n| r^n \]

for all $n > N$. By Comparison, it follows that

\[ \sum_{n=1}^{\infty} n a_n u^{n-1} = (1/u) \sum_{n=1}^{\infty} n a_n u^n \]
converges absolutely. It follows that the power series $\sum_{n=1}^{\infty} n a_n x^{n-1}$ has
radius of convergence at least equal to $R$.

On the other hand, if the derived series $\sum_{n=1}^{\infty} n a_n x^{n-1}$ converges absolutely,
then the inequality

\[ |a_n| |x|^n \le |x| \, n |a_n| |x|^{n-1} \]
for $n \ge 1$ implies that $\sum_{n=0}^{\infty} a_n x^n$ converges absolutely, by Comparison. The
result follows.

Remark 6.7. By applying the theorem once again, we see that the power
series $\sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2}$ also has radius of convergence equal to $R$. Of
course, we can now apply the theorem again . . .
The big question is whether the derived series is indeed the derivative of
the original power series. We shall now show that this is true.
We recall that Taylor's Theorem, with 2nd order remainder for a function
$f$ about $x_0$, gives

\[ f(x) = f(x_0) + (x - x_0) f'(x_0) + \frac{f^{(2)}(c)}{2!} (x - x_0)^2 \]
for some $c$ between $x$ and $x_0$. Setting $f(x) = x^k$ gives the equality

\[ x^k - x_0^k = k (x - x_0)\, x_0^{k-1} + \tfrac{1}{2} k(k - 1)\, c_k^{k-2} (x - x_0)^2 \]

for some $c_k$ between $x_0$ and $x$. Note that $c_k$ may depend on $k$ (as well as $x_0$
and $x$). If $x = x_0 + h$, then this becomes

\[ (x_0 + h)^k - x_0^k = h\, k\, x_0^{k-1} + \tfrac{1}{2} k(k - 1)\, c_k^{k-2} h^2 \qquad (*) \]

for some $c_k$ between $x_0$ and $x_0 + h$.


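We can use this to find the derivative of a power series inside its disc
of convergence. Indeed, suppose that the power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$
has radius of convergence $R > 0$. Let $|x_0| < R$ be given and let $r > 0$ obey
$|x_0| < r < R$. Let $h \ne 0$ be so small that $|x_0| + |h| < r$. This means that
$-r < x_0 + h < r$ so that $\sum_{n=0}^{\infty} a_n (x_0 + h)^n$ converges (absolutely). Using
$(*)$, we find that

\[ \frac{f(x_0 + h) - f(x_0)}{h} - \sum_{n=1}^{\infty} n a_n x_0^{n-1} = \tfrac{1}{2} h \sum_{n=2}^{\infty} n(n - 1)\, a_n c_n^{n-2} . \]

Now $c_n$ is between $x_0$ and $x_0 + h$ and both of these points lie in the interval
$(-r, r)$ and so it follows that $c_n \in (-r, r)$, that is, $|c_n| < r$. But then, by
Comparison with the series $\sum_{n=2}^{\infty} n(n - 1) |a_n| r^{n-2}$ (which converges by
Remark 6.7, since $r < R$), the series on the right hand side is convergent and
its sum is bounded in absolute value by $\sum_{n=2}^{\infty} n(n - 1) |a_n| r^{n-2}$, a bound which
does not depend on $h$. The right hand side therefore tends to zero as $h \to 0$,
which gives the desired result that

\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = \sum_{n=1}^{\infty} n a_n x_0^{n-1} . \]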

We have proved the following important theorem.


Theorem 6.8 (Differentiation of Power Series). The power series $\sum_{n=0}^{\infty} a_n x^n$
is differentiable at each point $x_0$ inside its radius of convergence.
Moreover, its derivative is given by the derived series $\sum_{n=1}^{\infty} n a_n x_0^{n-1}$.

Example 6.9. We shall show that

\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots \]
for any $x \in (-1, 1)$. The radius of convergence, $R$, of the power series on
the right hand side is $R = 1$.
Let us begin by guessing that $\ln(1 + x) = a_0 + a_1 x + a_2 x^2 + \dots$. If this is to
be true, then putting $x = 0$, we should have $\ln 1 = a_0 + 0$, that is, $a_0 = 0$.
Differentiating term by term and then setting $x = 0$, we might guess that
$\frac{d}{dx} \ln(1 + x) \big|_{x=0} = a_1$. This gives $1 = a_1$. Differentiating twice (term by
term) and setting $x = 0$, we might guess that $\frac{d^2}{dx^2} \ln(1 + x) \big|_{x=0} = 2 a_2$, that
is, $a_2 = -\frac{1}{2}$. Repeating this, we guess that $a_k = (-1)^{k+1} / k$. So much for
the guessing, now let us justify our reasoning.


Let $g(x)$ be the power series

\[ g(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots . \]
We see that this power series converges for $x = 1$ and so it must converge
absolutely for $|x| < 1$. (This can also be seen directly by the Ratio Test.)
The series does not converge when $x = -1$ and so we deduce that its radius
of convergence is $R = 1$. For any $x$ with $|x| < R = 1$, the power series
can be differentiated and the derivative is that obtained by term by term
differentiation. Hence

\[ g'(x) = 1 - x + x^2 - x^3 + \dots \]

for any $x$ with $|x| < 1$. However, we know that

\[ \frac{1}{1 + x} = 1 - x + x^2 - x^3 + \dots \]
for $|x| < 1$ and so $g'(x) = 1/(1 + x)$ for $x \in (-1, 1)$. But $\frac{d}{dx} \ln(1 + x) =
1/(1 + x)$ for $x \in (-1, 1)$ and so $\ln(1 + x) - g(x)$ has zero derivative on $(-1, 1)$.
It follows that $\ln(1 + x) - g(x)$ is constant on $(-1, 1)$. Setting $x = 0$, we see
that this constant must be $\ln 1 - g(0) = 0$ and so $\ln(1 + x) = g(x)$ on the
interval $(-1, 1)$, as required.
Note that we have shown that $\ln(1 + x) = x - \frac{1}{2} x^2 + \frac{1}{3} x^3 - \dots$ for
any $x \in (-1, 1)$. We have already seen (thanks to Taylor's Theorem) that
$\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$ which means that this expansion is also valid for
$x = 1$.
When $x = -1$, the left hand side becomes $\ln 0$, which is not defined, and
the right hand side becomes the divergent series $-1 - \frac{1}{2} - \frac{1}{3} - \frac{1}{4} - \dots$.

Chapter 7

The elementary functions

We have already used the elementary functions (the trigonometric functions,


exponential function and the logarithm) as examples to illustrate various
aspects of the theory. Now is the time to give their formal definitions.
The trigonometric functions sin x and cos x and the exponential function
exp x are defined as follows.

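Definition 7.1. For any $x \in \mathbb{R}$,

\[ \sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n + 1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots \]

\[ \cos x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots \]

\[ \exp x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots . \]

Each of these power series converges absolutely for all $x \in \mathbb{R}$ (by the Ratio
Test) so they have an infinite radius of convergence.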
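
As an informal illustration of Definition 7.1, the partial sums of the three series can be evaluated directly. The sketch below is in Python; the function names and the numbers of terms are our own choices, and floating-point arithmetic is used, so this is a numerical check rather than part of the theory. Each term is obtained from the previous one by multiplying by the ratio of consecutive terms.

```python
import math


def sin_series(x, terms=30):
    """Partial sum of sum_{n>=0} (-1)^n x^(2n+1) / (2n+1)!."""
    total, term = 0.0, x            # term = (-1)^n x^(2n+1) / (2n+1)!
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total


def cos_series(x, terms=30):
    """Partial sum of sum_{n>=0} (-1)^n x^(2n) / (2n)!."""
    total, term = 0.0, 1.0          # term = (-1)^n x^(2n) / (2n)!
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total


def exp_series(x, terms=60):
    """Partial sum of sum_{n>=0} x^n / n!."""
    total, term = 0.0, 1.0          # term = x^n / n!
    for n in range(terms):
        total += term
        term *= x / (n + 1)
    return total


for x in (0.5, 1.0, 2.0):
    print(x, sin_series(x) - math.sin(x), cos_series(x) - math.cos(x),
          exp_series(x) - math.exp(x))
```

For moderate values of x the differences from the library functions are at the level of rounding error, which reflects how quickly the factorials in the denominators grow.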

Remark 7.2. These are the definitions and so each and every property that
these functions possess must be obtainable from these definitions.

We can see immediately that $\sin 0 = 0$, $\cos 0 = 1$ and $\exp 0 = 1$. We also
note that $\sin(-x) = -\sin x$ (so $\sin x$ is an odd function) and $\cos(-x) = \cos x$
(so $\cos x$ is an even function). Furthermore, by the basic differentiation of
power series theorem, Theorem 6.8, we see that these functions are differentiable
at every $x \in \mathbb{R}$ with derivatives given by term by term differentiation
so that
\begin{align*}
\frac{d}{dx} \sin x &= \frac{d}{dx} \Big( x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots \Big) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots = \cos x \\
\frac{d}{dx} \cos x &= \frac{d}{dx} \Big( 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots \Big) = -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \dots = -\sin x \\
\frac{d}{dx} \exp x &= \frac{d}{dx} \Big( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \Big) = 0 + 1 + x + \frac{x^2}{2!} + \dots = \exp x .
\end{align*}
We shall establish further familiar properties.
Theorem 7.3. For any $x \in \mathbb{R}$, $\sin^2 x + \cos^2 x = 1$.
Proof. Let $\varphi(x) = \sin^2 x + \cos^2 x$. Then we calculate the derivative

\[ \varphi'(x) = 2 \sin x \cos x - 2 \cos x \sin x = 0 . \]

It follows that $\varphi(x)$ is constant on $\mathbb{R}$. In particular,

\[ \varphi(x) = \varphi(0) = \sin^2 0 + \cos^2 0 = 0 + 1 = 1 , \]

that is, $\sin^2 x + \cos^2 x = 1$, as required.

Remark 7.4. Since both terms $\sin^2 x$ and $\cos^2 x$ are non-negative, we can say
that $-1 \le \sin x \le 1$ and also $-1 \le \cos x \le 1$ for all $x \in \mathbb{R}$. The functions
$\sin x$ and $\cos x$ are bounded (by $\pm 1$). This is not at all obvious just by
looking at the power series in their definitions.
Theorem 7.5 (Addition Formulae). For any $a, b \in \mathbb{R}$, we have

\[ \sin(a + b) = \sin a \cos b + \cos a \sin b \]
\[ \cos(a + b) = \cos a \cos b - \sin a \sin b . \]

Proof. Let $\varphi(x) = \sin(\alpha - x) \cos x + \cos(\alpha - x) \sin x$. Then we see that

\begin{align*}
\varphi'(x) &= -\cos(\alpha - x) \cos x - \sin(\alpha - x) \sin x \\
&\quad + \sin(\alpha - x) \sin x + \cos(\alpha - x) \cos x = 0 .
\end{align*}

It follows that $\varphi(x)$ is constant on $\mathbb{R}$ and so $\varphi(x) = \varphi(0)$, that is,

\[ \sin(\alpha - x) \cos x + \cos(\alpha - x) \sin x = \sin \alpha . \]

Putting $\alpha = a + b$ and $x = b$, we obtain the desired formula

\[ \sin(a + b) = \sin a \cos b + \cos a \sin b . \]

The other formula can be obtained similarly. Indeed, let

\[ \psi(x) = \cos(\alpha - x) \cos x - \sin(\alpha - x) \sin x . \]

Then we find that $\psi'(x) = 0$ so that $\psi(x)$ is constant on $\mathbb{R}$. Hence $\psi(x) =
\psi(0) = \cos \alpha$. Again setting $\alpha = a + b$ and $x = b$, we find that

\[ \cos(a + b) = \cos a \cos b - \sin a \sin b \]

and the proof is complete.

Remark 7.6. The formulae

\[ \sin(a - b) = \sin a \cos b - \cos a \sin b \]
\[ \cos(a - b) = \cos a \cos b + \sin a \sin b \]

follow by replacing $b$ by $-b$ and using the facts that $\sin(-b) = -\sin b$ whereas
$\cos(-b) = \cos b$. Notice further that if we set $a = x$ and $b = x$ in this last
formula, then we get

\[ \cos(x - x) = \cos^2 x + \sin^2 x , \]

that is, we recover the formula $\sin^2 x + \cos^2 x = 1$.

The number $\pi$
The elementary geometric approach to the trigonometric functions is by
means of triangles and circles. The number $\pi$ makes its appearance in the
formula relating the circumference and the radius of a circle (or giving the
area $A = \pi r^2$ of a circle of radius $r$). For us here, we must always proceed via
the power series definitions of the trigonometric functions. The identification
of $\pi$ begins with some preliminary properties of the functions $\sin x$ and $\cos x$.
Lemma 7.7.
(i) $\sin x > 0$ for all $x \in (0, 2)$.
(ii) $\cos 2 < 0$.
Proof. (i) Taylor's Theorem (up to order 2) says that

\[ f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(c) \]
for some $c$ between $0$ and $x$. With $f(x) = \sin x$, we obtain

\[ \sin x = 0 + x - \frac{x^2}{2} \sin(c) \ge x - \frac{x^2}{2} \]
for some $c$ between $0$ and $x$. We have used the facts that $\sin 0 = 0$, $\cos 0 = 1$
and $\sin(c) \le 1$. Hence

\[ \sin x \ge x - \tfrac{1}{2} x^2 = \tfrac{1}{2} x (2 - x) > 0 \]

if $0 < x < 2$, as claimed.

(ii) Applying Taylor's Theorem (up to order 4), we may say that there
is some $\xi$ between $0$ and $x$ such that
\[ \cos x = 1 - 0 - \frac{x^2}{2!} + 0 + \frac{x^4}{4!} \cos \xi . \]
But $\cos \xi \le 1$ and so
\[ \cos x \le 1 - \frac{x^2}{2} + \frac{x^4}{4!} . \]
Putting $x = 2$ gives
\[ \cos 2 \le 1 - \frac{4}{2} + \frac{16}{24} = -1 + \frac{2}{3} = -\frac{1}{3} , \]

which implies that $\cos 2 \le -\frac{1}{3} < 0$, as required.

Now we come to the crucial part.

Theorem 7.8. There is a unique $0 < \mu < 2$ such that $\cos \mu = 0$.

Proof. We know that $\cos 0 = 1$ and we have just seen that $\cos 2 < 0$. It
follows by the Intermediate Value Theorem (applied to the function $\cos x$
on the interval $[0, 2]$) that there is some $\mu \in (0, 2)$ such that $\cos \mu = 0$.
We must now show that there is only one such $\mu$. To see this, suppose
that $\cos \nu = 0$ for some $\nu \in (0, 2)$ with $\nu \ne \mu$. Then by Rolle's Theorem,
there is some $\xi$ between $\mu$ and $\nu$ such that $\frac{d}{dx} \cos x \big|_{x=\xi} = 0$, that is, $\sin \xi = 0$.
But we have shown that $\sin x > 0$ on $(0, 2)$. This gives a contradiction and
so we conclude that there can be no such $\nu$. In other words, there is a unique
$\mu$ with $0 < \mu < 2$ such that $\cos \mu = 0$.

Definition 7.9. The real number $\pi$ is defined to be $\pi = 2\mu$, where $\mu$ is the
unique solution in $(0, 2)$ to $\cos \mu = 0$.

All we can say at the moment is that $0 < \pi < 4$. It is known that $\pi$ is
irrational and its decimal expansion is known to some two million decimal
places. Curiously enough, it seems that each of the digits 0, 1, . . . , 9 appears
with about the same frequency in this expansion.
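
The definition of the number as twice the unique zero of cos x in (0, 2) suggests a simple way to approximate it. The following is a sketch only, in Python, with function names of our own choosing: it evaluates cos x from its power series and locates the zero by repeated bisection, which is the Intermediate Value Theorem argument made numerical. The uniqueness of the zero guarantees that cos is positive to its left and negative to its right, so the sign test below is safe.

```python
import math


def cos_series(x, terms=30):
    """Partial sum of the power series defining cos x."""
    total, term = 0.0, 1.0          # term = (-1)^n x^(2n) / (2n)!
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total


def first_zero_of_cos(tol=1e-12):
    """Bisect on [0, 2], using cos 0 = 1 > 0 and cos 2 < 0."""
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid                # the zero lies to the right of mid
        else:
            hi = mid                # the zero lies at or to the left of mid
    return (lo + hi) / 2


approx_pi = 2 * first_zero_of_cos()
print(approx_pi, abs(approx_pi - math.pi))
```

The comparison with math.pi is included only as a sanity check; nothing in the construction uses the library value.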

Theorem 7.10. The number $\pi$ is such that $\sin(\frac{1}{2}\pi) = 1$, $\cos(2\pi) = 1$ and
$\sin(2\pi) = 0$. Furthermore, for any $x \in \mathbb{R}$

\[ \sin(x + 2\pi) = \sin x \]
\[ \cos(x + 2\pi) = \cos x . \]

Proof. By its very definition, we know that $\cos(\frac{1}{2}\pi) = 0$. But since we have
the identity $\sin^2 x + \cos^2 x = 1$, it follows that $\sin(\frac{1}{2}\pi) = \pm 1$. However, we
have seen that $\sin x > 0$ on $(0, 2)$ and so it follows that $\sin(\frac{1}{2}\pi) = 1$.
By the addition formulae, $\sin \pi = 2 \sin(\frac{1}{2}\pi) \cos(\frac{1}{2}\pi) = 0$. This then
implies that $\sin(2\pi) = 2 \sin \pi \cos \pi = 0$. To show that $\cos(2\pi) = 1$, we use
the addition formula again to find that

\[ \cos(2x) = \cos^2 x - \sin^2 x = 1 - 2 \sin^2 x . \]

Setting $x = \pi$, we get $\cos(2\pi) = 1$ because $\sin \pi = 0$.
Finally, using the above results together with the addition formulae, we
calculate

\[ \sin(x + 2\pi) = \sin x \cos(2\pi) + \cos x \sin(2\pi) = \sin x \]
\[ \cos(x + 2\pi) = \cos x \cos(2\pi) - \sin x \sin(2\pi) = \cos x \]

for any $x \in \mathbb{R}$ and the proof is complete.

Properties of the exponential function


We now turn to a discussion of the exponential function.

Proposition 7.11. The function $\exp x$ enjoys the following properties.

(i) $\frac{d}{dx} \exp x = \exp x$ for all $x \in \mathbb{R}$.

(ii) $\exp 0 = 1$.

(iii) For any $a, b \in \mathbb{R}$, $\exp(a + b) = \exp a \, \exp b$.

(iv) $\exp(-x) = 1 / \exp x$ for all $x \in \mathbb{R}$.

(v) $\exp x > 0$ for all $x \in \mathbb{R}$.

Proof. (i) As already noted, this follows because the derivative of the power
series is the power series obtained by differentiating term by term.
(ii) Putting $x = 0$ in the power series gives $\exp 0 = 1$.
(iii) Fix $u \in \mathbb{R}$ and set $\varphi(x) = \exp x \, \exp(u - x)$. Then

\[ \varphi'(x) = \exp x \, \exp(u - x) - \exp x \, \exp(u - x) = 0 \]

for all $x \in \mathbb{R}$. It follows that $\varphi(x)$ is constant, so that $\varphi(x) = \varphi(0)$. But
$\varphi(0) = \exp u$ and so $\varphi(x) = \exp u$. Letting $u = a + b$ and $x = a$, we find
that $\exp a \, \exp b = \exp(a + b)$, as required.
(iv) From the above, we find that $\exp x \, \exp(-x) = \exp 0 = 1$ and so
$\exp(-x) = 1 / \exp x$.
(v) Since $\exp x \, \exp(-x) = 1$ it follows that $\exp x \ne 0$ for any $x \in \mathbb{R}$.
However, it is clear from the power series that $\exp x > 0$ if $x > 0$ and so the
formula $\exp x \, \exp(-x) = 1$ implies that $\exp(-x) > 0$ too.
(Alternatively, one can note that $\exp x = \exp(\frac{1}{2} x) \exp(\frac{1}{2} x) = (\exp(\frac{1}{2} x))^2$
which is positive.)

Because of the property $\exp(a + b) = \exp a \, \exp b$, one often writes $e^x$
for $\exp x$, so this reads $e^{a+b} = e^a e^b$. However, this notation needs some
further discussion. The point is that the symbols $e^2$, say, now appear to
have two interpretations. Firstly as $\exp(2)$ and secondly as the square of
the number $e$. The real number $e$ is defined as $\exp(1)$ and we see that

\[ e^2 = \exp(1)^2 = \exp(1) \exp(1) = \exp(1 + 1) = \exp(2) \]

so the two interpretations actually agree. What about, say, $e^{1/2}$? This is
interpreted as either $\exp(\frac{1}{2})$ or as the square root of $e$. But

\[ \exp(\tfrac{1}{2}) \exp(\tfrac{1}{2}) = \exp(\tfrac{1}{2} + \tfrac{1}{2}) = \exp(1) = e \]

so $\exp(\frac{1}{2})$ is the square root of $e$. This extends to any rational power.

Theorem 7.12. For any $r \in \mathbb{Q}$, $\exp(r) = e^r$, where $e = \exp(1)$.

Proof. If $r = 0$, then $\exp(0) = 1 = e^0$, by definition of the power $e^0$. Now
suppose that $r > 0$ and write $r = p/q$ for $p$ and $q \in \mathbb{N}$. We have

\begin{align*}
(\exp r)^q &= \underbrace{\exp r \cdots \exp r}_{q \text{ factors}} = \exp(rq) \\
&= \exp p = \exp(\underbrace{1 + 1 + \dots + 1}_{p \text{ terms}}) \\
&= \underbrace{\exp 1 \cdots \exp 1}_{p \text{ factors}} = e^p
\end{align*}

and so $\exp(r) = e^{p/q} = e^r$.
Now let $r = -s$ where $s \in \mathbb{Q}$ and $s > 0$. The above discussion tells us
that $\exp(s) = e^s$ so that
\[ \exp(r) = \exp(-s) = \frac{1}{\exp(s)} = \frac{1}{e^s} = e^{-s} = e^r \]

and we are done.

Remark 7.13. This result clarifies the symbolism $e^x$. This can always be
considered as shorthand notation for $\exp x$, but if $x$ is rational, then it can
also mean the $x$th power of the real number $e$. In this (rational) case, the
values are the same, as the theorem shows, so there is no ambiguity.


Remark 7.14. We have seen that the power series expression for $\exp x$ tells us
that $\frac{d}{dx} \exp x = \exp x$ and $\exp 0 = 1$. These properties completely determine
$\exp x$. In fact, if $\varphi(x)$ is the power series $\varphi(x) = a_0 + a_1 x + a_2 x^2 + \dots$, then
the requirement that $\varphi'(x) = \varphi(x)$ demands that

\[ a_1 + 2 a_2 x + 3 a_3 x^2 + 4 a_4 x^3 + \dots = a_0 + a_1 x + a_2 x^2 + \dots \]

This holds if $k a_k = a_{k-1}$ for all $k = 1, 2, 3, \dots$ which means that $a_1 = a_0$,
$a_2 = a_1 / 2 = a_0 / 2$, . . . , $a_k = a_{k-1} / k = a_{k-2} / k(k - 1) = \dots = a_0 / k!$. If
$\varphi(0) = 1$, then $a_0 = 1$ and $a_k = 1/k!$ so we find that $\varphi(x) = \exp x$.
This holds without assuming that we begin with a power series. Indeed,
suppose that $\varphi(x)$ is differentiable on $\mathbb{R}$ and that $\varphi'(x) = \varphi(x)$ and $\varphi(0) = 1$.
We shall show that $\varphi(x) = \exp x$.
Let $g(x) = \varphi(x) \exp(-x)$. Then $g$ is differentiable on $\mathbb{R}$ and

\[ g'(x) = \varphi'(x) \exp(-x) - \varphi(x) \exp(-x) = 0 \]

since $\varphi'(x) = \varphi(x)$. Fix $u \in \mathbb{R}$ and let $(a, b)$ be any interval in $\mathbb{R}$ such that
both $u \in (a, b)$ and $0 \in (a, b)$. Then $g'$ is zero on the interval $(a, b)$ and
so $g$ is constant there. In particular, $g(u) = g(0)$. However, by construction,
$g(0) = \varphi(0) \exp 0 = 1$ and so $g(u) = g(0) = 1$. Hence $\varphi(u) \exp(-u) = 1$
and we finally arrive at the required result that $\varphi(u) = \exp u$.
The function exp x has further interesting properties.
Theorem 7.15. The function $\exp x$ obeys the following.
(i) The map $x \mapsto \exp x$ is one-one from $\mathbb{R}$ onto $(0, \infty)$. In fact, $\exp x$
is strictly increasing on $\mathbb{R}$.

(ii) For any $k \in \mathbb{N}$, $x^k / \exp x \to 0$ as $x \to \infty$.

Proof. (i) From the power series expression for $\exp x$, we see that if $x > 0$
then $\exp x > 1 + x > 1$. Suppose that $a, b \in \mathbb{R}$ and that $a < b$. Then
$b - a > 0$ so that $\exp(b - a) > 1$. Multiplying by $\exp a$ (which is positive),
we see that $\exp(b - a) \exp a > \exp a$, that is, $\exp b > \exp a$. It follows that
$\exp x$ is strictly increasing and so $\exp a = \exp b$ is only possible if $a = b$, that
is, $x \mapsto \exp x$ is one-one.
(Alternatively, the Mean Value Theorem tells us that $(\exp b - \exp a)/(b - a)$
is equal to the derivative of $\exp x$ evaluated at some point between $a$ and $b$.
This derivative is always positive and so $(\exp b - \exp a)$ and $(b - a)$ always
have the same sign. In particular, $\exp a = \exp b$ only if $a = b$.)
We still have to show that $\exp x$ maps $\mathbb{R}$ onto $(0, \infty)$. To see this, let
$\gamma \in (0, \infty)$. We must show that there is some $u \in \mathbb{R}$ such that $\exp u = \gamma$. Let
$\beta > \gamma$ and let $\alpha > 1/\gamma$. Then $\exp \beta > 1 + \beta > \gamma$ and $\exp \alpha > 1 + \alpha > 1/\gamma$,
so that $\exp(-\alpha) = 1 / \exp \alpha < \gamma$. So we have

\[ \exp(-\alpha) < \gamma < \exp \beta . \]

Now $\exp x$ is continuous on $\mathbb{R}$ and so in particular is continuous on the
closed interval $[-\alpha, \beta]$. By the Intermediate Value Theorem, there is some
$u$ between $-\alpha$ and $\beta$ such that $\exp u = \gamma$, as required.
(ii) For $x > 0$, the power series expression for $\exp x$ tells us that

\[ \exp x = \sum_{n=0}^{\infty} \frac{x^n}{n!} > \frac{x^{k+1}}{(k + 1)!} . \]

Hence $0 < x^k / \exp x < (k + 1)! / x$ for $x > 0$ and so $x^k / \exp x \to 0$ as
$x \to \infty$.

Remark 7.16. This last result can be written as $x^k \exp(-x) \to 0$ as $x \to \infty$
or as $(\exp x) / x^k \to \infty$ as $x \to \infty$ and it implies that $x^k \exp x \to 0$ as
$x \to -\infty$.

It is clear from the power series definition (with $x = 1$) that $e > 1 + 1 = 2$.
We can easily obtain an upper bound for $e$ via Taylor's Theorem. Indeed,
$\exp^{(k)}(x) = \exp(x)$ for any $k \in \mathbb{N}$ and $\exp(0) = 1$, so by Taylor's Theorem
up to remainder of order 3, we have
\[ \exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} \exp(c_x) \]
for some $c_x$ between $0$ and $x$. If $x = 1$, then $c_1 < 1$ and so $e^{c_1} < e$ and we
get
\[ e - (1 + 1 + \tfrac{1}{2}) = \tfrac{1}{6} e^{c_1} < \tfrac{1}{6} e , \]
that is,
\[ e < 3 . \]
We can profitably pursue this method of estimation. Taylor's Theorem up
to remainder of order $m + 1$ gives
\[ e^x = 1 + x + \frac{x^2}{2!} + \dots + \frac{x^m}{m!} + R_{m+1} \]
where $R_{m+1} = \frac{x^{m+1}}{(m+1)!} e^{c_x}$. Now setting $x = 1$ and noting the inequalities
$0 < e^{c_1} < e < 3$, we see that $0 < R_{m+1} < \frac{3}{(m+1)!}$.
However, if $m \ge 3$, then $\frac{3}{(m+1)!} < \frac{1}{m!}$ and we deduce that
\[ 0 < e - \Big( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} \Big) < \frac{1}{m!} . \qquad (*) \]
This can be rewritten as
\[ 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} < e < 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} + \frac{1}{m!} \]
for any $m \ge 3$. These estimates allow us to prove the following interesting
fact.
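
To see how sharp these two-sided estimates are, one can evaluate them exactly. The sketch below is in Python, and the function name e_bounds is our own: it returns the lower and upper bounds for e given by the inequalities above, for any m at least 3, as exact fractions.

```python
from fractions import Fraction
from math import factorial
import math


def e_bounds(m):
    """Return (lower, upper) with lower < e < upper, valid for m >= 3."""
    if m < 3:
        raise ValueError("the estimate in the text requires m >= 3")
    lower = sum(Fraction(1, factorial(k)) for k in range(m + 1))
    upper = lower + Fraction(1, factorial(m))
    return lower, upper


for m in (3, 5, 10, 15):
    lower, upper = e_bounds(m)
    # math.e is printed only for comparison; it plays no part in the bounds.
    print(m, float(lower), math.e, float(upper))
```

The gap between the bounds is exactly 1/m!, so each increase in m narrows the interval containing e very rapidly.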


Theorem 7.17. The real number $e$ is irrational.

Proof. The proof is by contradiction. Suppose it were the case that $e \in \mathbb{Q}$
and let $e = p/q$ where $p, q \in \mathbb{N}$. Let $m \in \mathbb{N}$ obey $m > q + 3$ (so that $m$ is
greater than both $q$ and 3). Using the estimate $(*)$ and multiplying through
by $m!$, we see that

\[ 0 < m! \Big( \frac{p}{q} - \Big( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} \Big) \Big) < 1 . \]
However,

\[ m! \Big( \frac{p}{q} - \Big( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} \Big) \Big) = \frac{m!\, p}{q} - \Big( m! + m! + \frac{m!}{2!} + \dots + \frac{m!}{m!} \Big) \]

which is an integer because each term is an integer. This gives us our
contradiction since there is no integer lying strictly between 0 and 1. The
proof is complete.

In fact, we can prove more, namely that all powers and roots of powers
are irrational. That is, $e^{p/q}$ is irrational for any $p, q \in \mathbb{Z}$ with $p \ne 0$ and
$q \ne 0$. (If $q = 0$, then $p/q$ does not make any sense. If $q \ne 0$ but $p = 0$, then
we have $e^{p/q} = e^0 = 1$ which is rational.) In order to show this, we need a
few preliminary results.
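Lemma 7.18. For given $n \in \mathbb{N}$, let $f(x) = \dfrac{x^n (1 - x)^n}{n!}$. Then

(i) $f(x) = \dfrac{1}{n!} \sum_{m=n}^{2n} c_m x^m$, with $c_m \in \mathbb{Z}$.

(ii) If $0 < x < 1$, then $0 < f(x) < 1/n!$.

(iii) $f^{(k)}(0) \in \mathbb{Z}$ and $f^{(k)}(1) \in \mathbb{Z}$ for all $k \ge 0$.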

Proof. (i) The Binomial Theorem tells us that $(1 - x)^n$ can be written as

\[ (1 - x)^n = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \]

for suitable integers $a_0, a_1, \dots, a_n$. In fact, $a_0 = 1$, $a_1 = -n$, $a_2 = n(n - 1)/2$
and so on. In general, $a_m = (-1)^m n! / ((n - m)!\, m!)$.
Alternatively, this can be proved by induction. Indeed, let $P(n)$ be the
statement that $(1 - x)^n = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$ for coefficients
$a_0, a_1, \dots, a_n \in \mathbb{Z}$. Then with $n = 1$, we have $(1 - x)^1 = 1 - x$ and we see
that $P(1)$ is true.
Now suppose that $n \in \mathbb{N}$ and $P(n)$ is true. Then

\[ (1 - x)^{n+1} = (1 - x)(1 - x)^n = (1 - x)(a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n) \]

for coefficients $a_0, a_1, \dots, a_n \in \mathbb{Z}$. Expanding the right hand side gives

\begin{align*}
(1 - x)^{n+1} &= a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \\
&\quad - x (a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n) \\
&= a_0 + (a_1 - a_0) x + (a_2 - a_1) x^2 + \dots \\
&\quad + (a_n - a_{n-1}) x^n - a_n x^{n+1} .
\end{align*}

Evidently, the coefficients all belong to $\mathbb{Z}$ and so $P(n + 1)$ is true. By
induction, it follows that $P(n)$ is true for all $n \in \mathbb{N}$. Multiplying the expansion
of $(1 - x)^n$ by $x^n / n!$ now gives (i), with $c_m = a_{m-n}$.
(ii) If $0 < x < 1$, then also $0 < (1 - x) < 1$ and therefore both $0 < x^n < 1$
and $0 < (1 - x)^n < 1$. Hence $0 < f(x) < 1/n!$.
(iii) We first note that differentiating $k$ times the power $x^m$ and then
setting $x = 0$ gives
\[ \frac{d^k x^m}{dx^k} \Big|_{x=0} = \begin{cases} 0, & \text{if } m \ne k \\ k!, & \text{if } m = k. \end{cases} \]

It follows directly from (i) that

\[ f^{(k)}(0) = \begin{cases} k!\, c_k / n!, & \text{if } n \le k \le 2n \\ 0, & \text{otherwise.} \end{cases} \]

Furthermore, if $k \ge n$, then $k!/n! \in \mathbb{Z}$ and so we see that $f^{(k)}(0) \in \mathbb{Z}$ for
any $k \ge 0$.
Next, we use the relation $f(x) = f(1 - x)$ together with the chain rule
to find $f^{(k)}(1)$. Let $u = 1 - x$. Then $du/dx = -1$ so that
\[ \frac{d f(1 - x)}{dx} = \frac{d f(u)}{du} \frac{du}{dx} = (-1) \frac{d f(u)}{du} . \]
Differentiating $k$ times gives

\[ \frac{d^k f(1 - x)}{dx^k} = (-1)^k \frac{d^k f(u)}{du^k} . \]
Hence, using the equality $f(x) = f(1 - x) = f(u)$, we get

\[ f^{(k)}(x) = \frac{d^k f(x)}{dx^k} = \frac{d^k f(1 - x)}{dx^k} = (-1)^k \frac{d^k f(u)}{du^k} \]
for all $x$. Putting $x = 1$ gives $u = 0$ and so $f^{(k)}(1) = (-1)^k f^{(k)}(0)$. However,
we know that $f^{(k)}(0) \in \mathbb{Z}$ and so $(-1)^k f^{(k)}(0) \in \mathbb{Z}$. That is, $f^{(k)}(1) \in \mathbb{Z}$, as
claimed.
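
Part (iii) of the lemma can be checked by exact computation for small n. The following Python sketch uses helper names of our own choosing (f_coefficients, differentiate, evaluate, derivatives_at); it represents f by its list of rational coefficients and evaluates all derivatives at 0 and at 1.

```python
from fractions import Fraction
from math import comb, factorial


def f_coefficients(n):
    """Coefficients (lowest degree first) of f(x) = x^n (1 - x)^n / n!."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for m in range(n + 1):
        # (1 - x)^n = sum_m (-1)^m C(n, m) x^m; shift by x^n, divide by n!
        coeffs[n + m] = Fraction((-1) ** m * comb(n, m), factorial(n))
    return coeffs


def differentiate(coeffs):
    """Coefficients of the derivative of the given polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at x (exact when x is an int or Fraction)."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def derivatives_at(n, x):
    """Return [f(x), f'(x), ..., f^(2n)(x)] for the lemma's f."""
    coeffs, values = f_coefficients(n), []
    for _ in range(2 * n + 1):
        values.append(evaluate(coeffs, x))
        coeffs = differentiate(coeffs)
    return values


for n in (1, 2, 3, 4):
    at0, at1 = derivatives_at(n, 0), derivatives_at(n, 1)
    all_integers = all(v.denominator == 1 for v in at0 + at1)
    symmetric = all(b == (-1) ** k * a for k, (a, b) in enumerate(zip(at0, at1)))
    print(n, all_integers, symmetric)
```

The second check confirms the relation between the derivatives at 1 and at 0 that the proof obtains from f(x) = f(1 - x).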

We are now in a position to prove the following result we are interested


in concerning the irrationality of various powers of e.

Department of Mathematics
The elementary functions 121

Theorem 7.19. er is irrational for every r Q \ { 0 }.

Proof. We first show that $e^s \notin \mathbb{Q}$ for every $s \in \mathbb{N}$. The proof is by contradiction, so suppose the contrary, namely that there is some $s \in \mathbb{N}$ such that $e^s \in \mathbb{Q}$. Then we can write $e^s = p/q$ for $p, q \in \mathbb{N}$.
Choose and fix $n \in \mathbb{N}$ obeying the inequality $n! > p\, s^{2n+1}$ and let $f(x) = x^n (1-x)^n / n!$ as in the previous lemma, Lemma 7.17. We introduce the following function $F(x)$ defined to be

\[
\begin{aligned}
F(x) = {} & s^{2n} f(x) - s^{2n-1} f'(x) + s^{2n-2} f''(x) \\
& - s^{2n-3} f^{(3)}(x) + \dots - s f^{(2n-1)}(x) + f^{(2n)}(x).
\end{aligned}
\]

Now, by part (i) of Lemma 7.17, $f(x)$ has degree $2n$ and so $f^{(k)}(x) = 0$ for all $k > 2n$. Hence, differentiating the formula above, we find that

\[
\begin{aligned}
F'(x) = {} & s^{2n} f'(x) - s^{2n-1} f''(x) + s^{2n-2} f^{(3)}(x) \\
& - s^{2n-3} f^{(4)}(x) + \dots - s f^{(2n)}(x) + \underbrace{f^{(2n+1)}(x)}_{=0}.
\end{aligned}
\]

Hence (after many cancellations)

\[
\begin{aligned}
F'(x) + sF(x) = {} & s^{2n} f'(x) - s^{2n-1} f''(x) + s^{2n-2} f^{(3)}(x) \\
& - s^{2n-3} f^{(4)}(x) + \dots - s f^{(2n)}(x) \\
& + s^{2n+1} f(x) - s^{2n} f'(x) + s^{2n-1} f''(x) \\
& - s^{2n-2} f^{(3)}(x) + \dots - s^2 f^{(2n-1)}(x) + s f^{(2n)}(x) \\
= {} & s^{2n+1} f(x).
\end{aligned}
\]

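It follows that

\[
\begin{aligned}
\frac{d}{dx}\big( e^{sx} F(x) \big) &= s\, e^{sx} F(x) + e^{sx} F'(x) \\
&= e^{sx} \big( sF(x) + F'(x) \big) \\
&= s^{2n+1} e^{sx} f(x).
\end{aligned}
\]

Hence, writing $I$ for the integral of the right hand side over $[0, 1]$,

\[
\begin{aligned}
I = \int_0^1 s^{2n+1} e^{sx} f(x)\, dx &= \Big[ e^{sx} F(x) \Big]_0^1 \\
&= e^s F(1) - F(0) \\
&= \frac{p}{q}\, F(1) - F(0)
\end{aligned}
\]

since $e^s = p/q$. Therefore

\[ q I = p\, F(1) - q\, F(0). \]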


Now, $s \in \mathbb{N}$ and by Lemma 7.17 we know that $f^{(k)}(0) \in \mathbb{Z}$ and $f^{(k)}(1) \in \mathbb{Z}$ for all $k \ge 0$. It follows from the expression for $F(x)$ that both $F(0) \in \mathbb{Z}$ and $F(1) \in \mathbb{Z}$. Hence $q I \in \mathbb{Z}$. Furthermore, the integrand in the formula for $I$ is positive on $(0, 1)$ and so $I > 0$. It follows that $q I \in \mathbb{N}$.
Now, by Lemma 7.17 again, $0 < f(x) < 1/n!$ for $0 < x < 1$ and $e^{sx} < e^s$ for $x < 1$ and so

\[
\begin{aligned}
0 < qI &= q \int_0^1 s^{2n+1} e^{sx} f(x)\, dx \\
&< q\, s^{2n+1} \int_0^1 \frac{1}{n!}\, e^{sx}\, dx \\
&< q\, s^{2n+1}\, \frac{1}{n!}\, e^s = \frac{p\, s^{2n+1}}{n!} \\
&< 1
\end{aligned}
\]

by our very choice of $n$ at the start. However, there are no integers strictly between 0 and 1 so we have finally arrived at our contradiction and we conclude that $e^s$ is irrational for every $s \in \mathbb{N}$.
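Let $m \in \mathbb{N}$. Since $e^{-m} = 1/e^m$ and we have just shown that $e^m \notin \mathbb{Q}$, it follows that $e^{-m} \notin \mathbb{Q}$ and so we may conclude that $e^s \notin \mathbb{Q}$ for all $s \in \mathbb{Z} \setminus \{0\}$.
Now let $r \in \mathbb{Q} \setminus \{0\}$. Write $r = m/n$ for $m \in \mathbb{Z} \setminus \{0\}$ and $n \in \mathbb{N}$. If $e^r$ were rational, it would follow that $e^m = (e^{m/n})^n = (e^r)^n$ is also rational. But we know that $e^m$ is irrational for every $m \in \mathbb{Z} \setminus \{0\}$ and so it follows that $e^r$ is irrational and the proof is complete.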
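
The step marked "after many cancellations" can be checked mechanically for particular values. The sketch below, in Python with exact rational arithmetic, builds $f$ and $F$ as coefficient lists and confirms $F' + sF = s^{2n+1} f$, as well as the integrality of $F(0)$ and $F(1)$. The function names and the sample values $n = 3$, $s = 2$ are our own illustrative choices and do not come from the text.

```python
from fractions import Fraction
from math import comb, factorial

def poly_derivative(coeffs):
    """Derivative of a polynomial given by coefficients, lowest degree first."""
    return [m * coeffs[m] for m in range(1, len(coeffs))] or [Fraction(0)]

def build_f(n):
    """Coefficients of f(x) = x^n (1-x)^n / n!."""
    c = [Fraction(0)] * (2 * n + 1)
    for j in range(n + 1):
        c[n + j] = Fraction(comb(n, j) * (-1) ** j, factorial(n))
    return c

def build_F(n, s):
    """Coefficients of F = sum_{j=0}^{2n} (-1)^j s^(2n-j) f^(j)."""
    F = [Fraction(0)] * (2 * n + 1)
    d = build_f(n)  # d holds the j-th derivative of f
    for j in range(2 * n + 1):
        for m, coeff in enumerate(d):
            F[m] += (-1) ** j * s ** (2 * n - j) * coeff
        d = poly_derivative(d)
    return F

n, s = 3, 2  # illustrative values
f = build_f(n)
F = build_F(n, s)
dF = poly_derivative(F) + [Fraction(0)]  # pad to the length of F
lhs = [a + s * b for a, b in zip(dF, F)]
rhs = [s ** (2 * n + 1) * c for c in f]
print("F' + sF == s^(2n+1) f :", lhs == rhs)
print("F(0) =", F[0], "  F(1) =", sum(F))
```

Here $F(0)$ is the constant coefficient and $F(1)$ is the sum of all coefficients, which is why the last line reads them off directly.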

Compound Interest
If one pound is invested for one year at an annual interest rate of $100r\%$ and compounded at $n$ regular intervals, the compound interest formula states that its value on maturity is $(1 + r/n)^n$ pounds. For large $n$, this value is approximately equal to $e^r$. We shall see why this is so.

Proposition 7.20. For fixed $r > 0$, let $x_n = (1 + r/n)^n$, $n \in \mathbb{N}$. Then $(x_n)$ is a bounded increasing sequence and so converges.

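Proof. Using the Binomial Theorem, we write

\[
\begin{aligned}
x_n = \Big(1 + \frac{r}{n}\Big)^n &= \sum_{k=0}^{n} \binom{n}{k} \Big(\frac{r}{n}\Big)^k \\
&= \sum_{k=0}^{n} \frac{n(n-1)\cdots(n-(k-1))}{k!} \Big(\frac{r}{n}\Big)^k \\
&= \sum_{k=0}^{n} b_k(n)\, r^k
\end{aligned}
\]

where

\[ b_k(n) = \frac{n(n-1)\cdots(n-(k-1))}{k!\, n^k} = \frac{1}{k!} \Big(1 - \frac{1}{n}\Big)\Big(1 - \frac{2}{n}\Big) \cdots \Big(1 - \frac{k-1}{n}\Big). \]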
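Now, as $n$ increases, $j/n$ decreases and so $1 - \frac{j}{n}$ increases. In other words, for each fixed $k \le n$, $b_k(n) \le b_k(n+1)$, with strict inequality when $k \ge 2$ (for $k = 0$ and $k = 1$ both sides are equal to 1). Since the extra term $b_{n+1}(n+1)\, r^{n+1}$ is strictly positive, it follows that

\[
\begin{aligned}
x_{n+1} = \sum_{k=0}^{n+1} b_k(n+1)\, r^k &= \sum_{k=0}^{n} b_k(n+1)\, r^k + b_{n+1}(n+1)\, r^{n+1} \\
&> \sum_{k=0}^{n} b_k(n)\, r^k = x_n
\end{aligned}
\]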

which shows that $(x_n)$ is an increasing sequence. Moreover, it is clear that

\[ b_k(n) \le \frac{1}{k!} \]

so we see that

\[ x_n = \sum_{k=0}^{n} b_k(n)\, r^k \le \sum_{k=0}^{n} \frac{r^k}{k!} < e^r \]

and the proof is complete.
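
To see the proposition at work numerically, here is a small Python sketch that tabulates $x_n = (1 + r/n)^n$ for a few compounding frequencies. The rate $r = 0.05$ and the list of values of $n$ are hypothetical choices of ours, and floating point arithmetic is only an approximation to the real numbers discussed in the text.

```python
import math

def compound(r, n):
    """Value of one pound after a year at rate r, compounded n times."""
    return (1 + r / n) ** n

r = 0.05  # hypothetical annual rate of 5%
periods = [1, 2, 4, 12, 52, 365]
values = [compound(r, n) for n in periods]

for n, x in zip(periods, values):
    print(f"n = {n:4d}   x_n = {x:.8f}")
print(f"e^r        = {math.exp(r):.8f}")
```

The printed values should increase with $n$ while staying below $e^r$, in line with the bound obtained in the proof.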

We can now establish the result we are interested in.


Theorem 7.21. For any fixed $r > 0$, $(1 + r/n)^n \to e^r$ as $n \to \infty$.

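Proof. Let $\varepsilon > 0$ be given.
Using the notation established above, we know that $x_n = (1 + r/n)^n \to \alpha$ for some $\alpha \in \mathbb{R}$. We must show that $\alpha = e^r$. Let $N_1 \in \mathbb{N}$ be such that if $n > N_1$ then

\[ |x_n - \alpha| < \tfrac{1}{5}\varepsilon. \]

Next, let $s_n = \sum_{k=0}^{n} r^k/k!$ and let $N_2 \in \mathbb{N}$ be such that if $n > N_2$ then

\[ |s_n - e^r| < \tfrac{1}{5}\varepsilon. \]

(We know that $s_n \to e^r$.)
Now we note that for each fixed $k$, $b_k(n) \to 1/k!$ as $n \to \infty$. Fix $N > N_1 + N_2$ and let $N_3 \in \mathbb{N}$ be such that $N_3 > N$ and if $n > N_3$ then

\[ \Big| \sum_{k=0}^{N} b_k(n)\, r^k - \sum_{k=0}^{N} \frac{1}{k!}\, r^k \Big| < \tfrac{1}{5}\varepsilon. \]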


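For any $n > N_3$, we have

\[
\begin{aligned}
|x_n - s_n| &= \Big| \sum_{k=0}^{n} b_k(n)\, r^k - \sum_{k=0}^{n} \frac{1}{k!}\, r^k \Big| \\
&= \Big| \sum_{k=0}^{N} b_k(n)\, r^k - \sum_{k=0}^{N} \frac{1}{k!}\, r^k + \sum_{k=N+1}^{n} b_k(n)\, r^k - \sum_{k=N+1}^{n} \frac{1}{k!}\, r^k \Big| \\
&\le \Big| \sum_{k=0}^{N} b_k(n)\, r^k - \sum_{k=0}^{N} \frac{1}{k!}\, r^k \Big| + 2 \sum_{k=N+1}^{n} \frac{1}{k!}\, r^k \\
&< \tfrac{1}{5}\varepsilon + 2 \sum_{k=N+1}^{\infty} \frac{1}{k!}\, r^k \\
&= \tfrac{1}{5}\varepsilon + 2\,(e^r - s_N) \\
&< \tfrac{1}{5}\varepsilon + \tfrac{2}{5}\varepsilon = \tfrac{3}{5}\varepsilon.
\end{aligned}
\]

But then, for $n > N_3$,

\[ |\alpha - e^r| \le |\alpha - x_n| + |x_n - s_n| + |s_n - e^r| < \tfrac{1}{5}\varepsilon + \tfrac{3}{5}\varepsilon + \tfrac{1}{5}\varepsilon = \varepsilon \]

and we conclude that $\alpha = e^r$.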
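
The proof rests on two facts about the coefficients: the sum of $b_k(n)\, r^k$ over $0 \le k \le n$ reproduces $(1 + r/n)^n$ exactly, and each $b_k(n)$ approaches $1/k!$ from below. The Python sketch below checks both with exact rational arithmetic. The function name `b` mirrors the notation $b_k(n)$, while the sample values ($r = 1/2$, $n = 6$, $k = 3$) are assumptions made only for the illustration.

```python
from fractions import Fraction
from math import factorial

def b(k, n):
    """b_k(n) = (1/k!) (1 - 1/n)(1 - 2/n)...(1 - (k-1)/n), computed exactly."""
    prod = Fraction(1, factorial(k))
    for j in range(1, k):
        prod *= 1 - Fraction(j, n)
    return prod

# The binomial expansion, checked exactly for one sample pair (r, n).
r, n = Fraction(1, 2), 6
expansion = sum(b(k, n) * r ** k for k in range(n + 1))
print("expansion matches (1 + r/n)^n :", expansion == (1 + r / n) ** n)

# For fixed k, the gap 1/k! - b_k(n) shrinks as n grows.
k = 3
gaps = [float(Fraction(1, factorial(k)) - b(k, m)) for m in (10, 100, 1000)]
print("gaps for n = 10, 100, 1000:", gaps)
```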
Corollary 7.22. For any $r > 0$, $(1 - r/n)^n \to e^{-r}$ as $n \to \infty$.
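Proof. Let $y_n = (1 - r^2/n^2)^n$ and suppose that $n$ is so large that $r/n < 1$. For such $n$, we see that

\[
\begin{aligned}
0 < 1 - y_n &= 1 - (1 - r^2/n^2)^n \\
&= 1 - \sum_{k=0}^{n} b_k(n) \Big( \frac{-r^2}{n} \Big)^k \\
&= - \sum_{k=1}^{n} b_k(n) \Big( \frac{-r^2}{n} \Big)^k \\
&\le \sum_{k=1}^{n} b_k(n) \Big( \frac{r^2}{n} \Big)^k \\
&< e^{r^2/n} - 1.
\end{aligned}
\]

Now $e^{r^2/n} \to 1$ and so by the Sandwich Principle, we see that $y_n \to 1$ as $n \to \infty$. However, we then find that

\[ (1 - r/n)^n = \frac{(1 - r^2/n^2)^n}{(1 + r/n)^n} \to \frac{1}{e^r} = e^{-r} \]

as required.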
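
A quick numerical illustration of the corollary, again as a Python sketch under our own choice of names (`y`, `x_minus`) and of the sample value $r = 2$: it prints $y_n$, the bound $e^{r^2/n} - 1$ on $1 - y_n$, and the quantity $(1 - r/n)^n$ next to $e^{-r}$.

```python
import math

def y(r, n):
    """y_n = (1 - r^2/n^2)^n from the proof."""
    return (1 - r ** 2 / n ** 2) ** n

def x_minus(r, n):
    """(1 - r/n)^n, the sequence in the corollary."""
    return (1 - r / n) ** n

r = 2.0  # illustrative value
for n in (10, 100, 1000, 10000):
    gap = 1 - y(r, n)
    bound = math.exp(r ** 2 / n) - 1
    print(f"n = {n:5d}  1 - y_n = {gap:.6f}  bound = {bound:.6f}  "
          f"(1 - r/n)^n = {x_minus(r, n):.6f}")
print(f"e^(-r) = {math.exp(-r):.6f}")
```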


The logarithm
The logarithm is defined via the exponential function. We know that $\exp x$ maps $\mathbb{R}$ one-one onto $(0, \infty)$. This means that to each $x \in (0, \infty)$ there is one and only one $v \in \mathbb{R}$ such that $\exp v = x$.

Definition 7.23. For $x \in (0, \infty)$, $\log x$ is the value $v \in \mathbb{R}$ such that $\exp v = x$.

It follows that $x \mapsto \log x$ maps $(0, \infty)$ onto $\mathbb{R}$.

$\log x$ is defined by the formula $e^{\log x} = x$ for $x > 0$.

Remark 7.24. The notation ln x is also used for the function log x here. The
notation ln emphasizes the fact that this is the logarithm to base e, the
so-called natural logarithm.

Proposition 7.25. The function log x has the following properties.

(i) log 1 = 0 and log e = 1.

(ii) For any s, t > 0, log(st) = log s + log t.

(iii) For any $x > 0$, $\log(1/x) = -\log x$.

(iv) $\log x$ is strictly increasing and $\log x \to \infty$ as $x \to \infty$.

(v) $(\log x)/x \to 0$ as $x \to \infty$.

Proof. We shall make use of the identity $\log(e^s) = s$ for $s \in \mathbb{R}$.

(i) We have $\log 1 = \log(e^0) = 0$. Also $\log e = \log e^1 = 1$.

(ii) For any $s, t > 0$, we have

\[ \log(s\,t) = \log(e^{\log s}\, e^{\log t}) = \log(e^{\log s + \log t}) = \log s + \log t. \]

(iii) We have

\[ \log(1/x) = \log(1/e^{\log x}) = \log(e^{-\log x}) = -\log x. \]

(iv) Suppose that $a < b$. Then $e^{\log a} = a < b = e^{\log b}$ and so we have $\log a < \log b$ because $\exp x$ is strictly increasing.
Now let $M > 0$ be given. Set $m = e^M$. Then if $x > m$, it follows that $\log x > \log m$, that is, $\log x > \log(e^M) = M$.


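(v) Let $v = \log x$. Then $x = e^v$ and

\[ \frac{\log x}{x} = \frac{v}{e^v}. \]

Now, if $x \to \infty$ then also $\log x \to \infty$, that is, $v \to \infty$. However, we already know that $v/e^v \to 0$ as $v \to \infty$ and so $(\log x)/x \to 0$ as $x \to \infty$.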

The proof is complete.
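
The properties in Proposition 7.25 are easy to spot-check with the natural logarithm `math.log` from the Python standard library. The sketch below is only an illustration: the sample values of $s$, $t$ and $x$ are arbitrary choices of ours, and floating point comparisons are made up to rounding via `math.isclose`.

```python
import math

s, t = 3.0, 7.5  # arbitrary positive sample values

# (ii) log(st) = log s + log t and (iii) log(1/s) = -log s, up to rounding.
product_rule = math.isclose(math.log(s * t), math.log(s) + math.log(t))
reciprocal_rule = math.isclose(math.log(1 / s), -math.log(s))
print("product rule holds:   ", product_rule)
print("reciprocal rule holds:", reciprocal_rule)

# (v) (log x)/x becomes small as x grows.
sample_points = (1e1, 1e2, 1e4, 1e8)
ratios = [math.log(x) / x for x in sample_points]
for x, q in zip(sample_points, ratios):
    print(f"x = {x:.0e}   (log x)/x = {q:.3e}")
```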

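Theorem 7.26. The function $\log x$ is continuous at each point in $(0, \infty)$.

Proof. Let $s \in (0, \infty)$ be given and let $\varepsilon > 0$ be given. We know that the function $t \mapsto e^t$ is strictly increasing, that is, $\alpha < \beta$ if and only if $e^\alpha < e^\beta$. This means that

\[ \alpha < t < \beta \iff e^\alpha < e^t < e^\beta. \]

In particular, if $\alpha = \log s - \varepsilon$, $t = \log x$ and $\beta = \log s + \varepsilon$, this becomes

\[ \log s - \varepsilon < \log x < \log s + \varepsilon \iff s\, e^{-\varepsilon} < x < s\, e^{\varepsilon}. \]

Let $\delta = \min\{\, s\, e^{\varepsilon} - s,\; s - s\, e^{-\varepsilon} \,\}$. Then

\[ |x - s| < \delta \implies s - \delta < x < s + \delta \implies s\, e^{-\varepsilon} < x < s\, e^{\varepsilon}. \]

Therefore $\log s - \varepsilon < \log x < \log s + \varepsilon$ or $|\log x - \log s| < \varepsilon$ and it follows that $\log x$ is continuous at $s$.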
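
The choice of $\delta$ in this proof is completely explicit, so it can be tried out directly. The following Python sketch computes $\delta = \min\{ s e^{\varepsilon} - s,\ s - s e^{-\varepsilon} \}$ for sample values of $s$ and $\varepsilon$ (both chosen by us for illustration) and tests a few points $x$ with $|x - s| < \delta$. The function name `delta_for` is hypothetical.

```python
import math

def delta_for(s, eps):
    """The delta from the proof: min(s e^eps - s, s - s e^(-eps))."""
    return min(s * math.exp(eps) - s, s - s * math.exp(-eps))

s, eps = 2.0, 0.01  # illustrative values
d = delta_for(s, eps)

# Sample points strictly inside (s - d, s + d).
xs = [s - 0.999 * d, s - 0.5 * d, s, s + 0.5 * d, s + 0.999 * d]
within = all(abs(math.log(x) - math.log(s)) < eps for x in xs)
print(f"delta = {d:.6f}")
print("|log x - log s| < eps at all sample points:", within)
```

Note that the minimum is always attained by $s - s e^{-\varepsilon}$, since $e^{\varepsilon} - 1 > 1 - e^{-\varepsilon}$ for $\varepsilon > 0$.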

The next theorem tells us what the derivative of log x is.

Theorem 7.27. The function $\log x$ is differentiable at every $s > 0$ and its derivative at $s$ is $1/s$. (In other words, $\frac{d}{dx} \log x = 1/x$ on $(0, \infty)$.)

Proof. Let $s \in (0, \infty)$ be given and let $h$ be small but $h \neq 0$. We must show that the Newton quotient $\frac{1}{h}(\log(s+h) - \log(s))$ approaches $1/s$ as $h \to 0$. To see this, let $v = \log s$ and let $k = \log(s+h) - \log s$, so that $\log(s+h) = v + k$. The continuity of $\log x$ at $s$ implies that $\log(s+h) \to \log(s)$ as $h \to 0$, that is, $k \to 0$ as $h \to 0$. In terms of $v$ and $k$, we have

\[ h = (s+h) - s = e^{\log(s+h)} - e^{\log s} = e^{v+k} - e^v = \exp(v+k) - \exp(v). \]

Note also that since $h \neq 0$, it follows that $s + h \neq s$ and so $\log(s+h) \neq \log s$ which means that $k \neq 0$. Using these remarks, we get

\[
\begin{aligned}
\frac{\log(s+h) - \log(s)}{h} &= \frac{k}{\exp(v+k) - \exp(v)} \\
&= \Big( \frac{\exp(v+k) - \exp(v)}{k} \Big)^{-1} \\
&\to \frac{1}{\exp'(v)} \quad \text{as } h \to 0 \text{ (since also } k \to 0\text{),} \\
&= \frac{1}{\exp(v)} \\
&= \frac{1}{s}
\end{aligned}
\]

as required.
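
The Newton quotient in this proof can be evaluated numerically for shrinking $h$. The Python sketch below does this at the sample point $s = 3$ (our choice) and compares the result with $1/s$; the function name `newton_quotient` is ours. Very small $h$ would eventually suffer from rounding error, so the values of $h$ are kept moderate.

```python
import math

def newton_quotient(s, h):
    """(log(s + h) - log(s)) / h for h != 0."""
    return (math.log(s + h) - math.log(s)) / h

s = 3.0  # illustrative point
for h in (1e-1, 1e-2, 1e-4, 1e-6, -1e-6):
    q = newton_quotient(s, h)
    print(f"h = {h:9.0e}   quotient = {q:.10f}   error = {abs(q - 1 / s):.2e}")
print(f"1/s = {1 / s:.10f}")
```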

For any positive real number $a$, we know what the power $a^k$ means for any $k \in \mathbb{N}$. We also know what $a^{p/q}$ means for $p, q \in \mathbb{N}$: it is the real number whose $q$th power is equal to $a^p$. However, it is not at all clear what a power such as $3^{\sqrt{2}}$ means. We would like to set up a reasonable definition for powers such as this. We need some preliminary results.
Proposition 7.28. For any $a > 0$ and $m, n \in \mathbb{N}$,

\[ \log(a^{m/n}) = \frac{m}{n} \log(a). \]

Proof. First note that $\log(s^k) = k \log(s)$ for any $s > 0$ and $k \in \mathbb{N}$. We shall verify this by induction. For $k \in \mathbb{N}$, let $P(k)$ be the statement $\log(s^k) = k \log(s)$. Evidently, $P(1)$ is true. Using the previous proposition, we see that

\[ \log(s^{k+1}) = \log(s^k\, s) = \log(s^k) + \log(s) = k \log(s) + \log(s) = (k+1) \log(s) \]

if $P(k)$ is true. Hence the truth of $P(k+1)$ follows from that of $P(k)$ and so, by induction, we conclude that $P(k)$ is true for all $k \in \mathbb{N}$.
Let $t = a^{1/n}$ so that $t^n = a$ and $t^m = a^{m/n}$. We have

\[
\begin{aligned}
n \log(a^{m/n}) &= n \log(t^m) \\
&= n m \log(t) \\
&= m \log(t^n) \\
&= m \log(a)
\end{aligned}
\]

and it follows that $\log(a^{m/n}) = \frac{m}{n} \log(a)$.

From the above, we see that $a^{m/n} = \exp(\frac{m}{n} \log(a))$. Moreover, $a^{-m/n} = 1/a^{m/n} = 1/\exp(\frac{m}{n} \log(a)) = \exp(-\frac{m}{n} \log(a))$. Hence $a^r = e^{r \log a}$ for any $a > 0$ and any $r \in \mathbb{Q}$. Now, the left hand side, $a^r$, makes no sense unless $r$ is rational but the right hand side, namely, $e^{r \log a}$ (which is short-hand notation for $\exp(r \log a)$), is well-defined for any real number $r$. This suggests the following definition of the power $a^s$ for any $s \in \mathbb{R}$.


Definition 7.29. For $a > 0$ and $s \in \mathbb{R}$, the power $a^s$ is defined to be

\[ a^s = e^{s \log a}. \]

A further remark is in order here. By setting $a = e$, the real number $\exp(1)$, we have a formula for the power $e^s$. But this is just

\[ e^s = \exp(s \log e) = \exp s \]

since $\log e = 1$. This is in agreement with our penchant for using the short-hand notation $e^s = \exp s$, so everything works out all right; that is, we can think of the expression $e^s$ as being the power, as defined above, or as an abbreviation for the exponential, $\exp s$. These are the same thing. The next proposition tells us that the expected power laws hold.

Proposition 7.30. For any $a \in (0, \infty)$ and any $s, t \in \mathbb{R}$, we have

\[ a^{s+t} = a^s\, a^t \quad \text{and} \quad (a^s)^t = a^{st}. \]

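Proof. From the definition, we have

\[
\begin{aligned}
a^{s+t} = \exp((s+t) \log a) &= \exp(s \log a)\, \exp(t \log a) \\
&= a^s\, a^t.
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
(a^s)^t = \exp(t \log(a^s)) &= \exp(t \log(e^{s \log a})) \\
&= \exp(t\, s \log a) \\
&= a^{st},
\end{aligned}
\]

as required.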
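
Definition 7.29 translates directly into code. The Python sketch below implements $a^s = \exp(s \log a)$ as a function `power` (a name of our choosing) and compares it, up to floating point rounding, with a rational power computed by hand, with Python's built-in `**` operator, and with the two power laws just proved. All sample values are arbitrary.

```python
import math

def power(a, s):
    """a^s defined as exp(s * log a), for a > 0 and any real s."""
    if a <= 0:
        raise ValueError("the base a must be strictly positive")
    return math.exp(s * math.log(a))

# A rational exponent: 8^(2/3) is the number whose cube is 8^2, namely 4.
print(power(8.0, 2 / 3))

# An irrational exponent, compared with the built-in operator.
print(power(3.0, math.sqrt(2)), 3.0 ** math.sqrt(2))

# The power laws, for arbitrary sample values.
a, s, t = 2.5, 1.3, -0.7
law_sum = math.isclose(power(a, s + t), power(a, s) * power(a, t))
law_product = math.isclose(power(power(a, s), t), power(a, s * t))
print("a^(s+t) = a^s a^t :", law_sum)
print("(a^s)^t = a^(st)  :", law_product)
```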

Proposition 7.31. For any $a \in \mathbb{R}$, the function $f(x) = x^a$ is differentiable on $(0, \infty)$ and

\[ f'(x) = a\, x^{a-1}. \]

Proof. From the definition, $f(x) = x^a = e^{a \log x}$ and so the standard rules of differentiation imply that

\[ f'(x) = \frac{a}{x}\, e^{a \log x} = a\, \frac{x^a}{x} = a\, x^{a-1} \]

as claimed.
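
As a final sanity check, the formula $f'(x) = a\, x^{a-1}$ can be compared with a numerical derivative. The Python sketch below uses a central difference, which is our choice of approximation and not something taken from the text; the exponent $a = \pi$, the point $x = 2$ and the step size are likewise illustrative assumptions.

```python
import math

def power(x, a):
    """x^a defined as exp(a * log x), for x > 0."""
    return math.exp(a * math.log(x))

def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

a, x = math.pi, 2.0  # an irrational exponent and a sample point
numeric = central_difference(lambda u: power(u, a), x)
exact = a * power(x, a - 1)
print(f"numerical derivative: {numeric:.8f}")
print(f"a * x^(a-1):          {exact:.8f}")
```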
