
Analysis II: Continuity and Differentiability HT 2010

Janet Dyson

Contents

0. Introduction
1. Limits of Functions
2. Continuity of functions
3. Continuity and Uniform Continuity
4. Boundedness of continuous functions on a closed and bounded interval.
5. Intermediate value theorem
6. Monotonic Functions and Inverse Function Theorem
7. Limits at infinity and infinite limits
8. Uniform Convergence
9. Uniform Convergence: Examples and Applications
10. Differentiation: definitions and elementary results
11. The elementary functions
12. The Mean Value Theorem
13. Applications of the MVT
14. L’Hospital’s Rule
15. Taylor’s Theorem
16. The Binomial Theorem
Introduction

Acknowledgement

These lectures have been developed by a number of lecturers over the years. I would particularly like to thank Professor Roger Heath-Brown who gave the first lectures for this course in its present form in 2003 and Dr Brian Stewart and Dr Zhongmin Qian who allowed me to adapt their lecture notes and use their LaTeX files.

Lectures

To get the most out of the course you must attend the lectures. There will be more explanation in the lectures than there is in the notes.

On the other hand I will not put everything on the board which is in the printed notes. In
some places I have put in extra examples which I will not have time to demonstrate in the
lectures. There is also some extra material which I have put in for interest but which I do not regard as central to the course.

Numbering system:

In the printed notes there are 16 sections. Within each section there are subsections. Theorems, definitions, etc. are numbered consecutively within each subsection. So for example Theorem 1.2.3 is the third result in Section 1, Subsection 2. I will use the numbering in the
printed notes, even though I will omit some subsections in the lectures, so the numbering
will no longer be consecutive.

Exercise sheets

The weekly problem sheets which accompany the lectures are an integral part of the course.
In Analysis, above all, you will only understand the definitions and theorems by using them.
I assume that week 1 tutorials are being devoted to the final sheets from the Michaelmas
Term courses.
I suggest that the problem sheets for this course are tackled in tutorials in weeks 2–8, with the 8th sheet used as vacation work for a tutorial in the first week of Trinity Term.

Corrections

Please email any corrections to me at Janet.Dyson@mansfield.ox.ac.uk


Notation:

I will use this notation (which was used in the Michaelmas Term courses) throughout.
• C: set of all complex numbers - the complex plane.
• R: set of all real numbers - the real line; R ⊂ C.
• Q: the rational numbers, Q ⊂ R.
• N: the natural numbers, 1, 2, . . . , N ⊂ Q.
• ∀: “for all” or “for every” or “whenever”.
• ∃: “there exist(s)” or “there is (are)”.
• Sometimes I will write “s. t.” for “such that”, “resp.” for “respectively”, “iff” for “if and only if”.

Recall the following definition from ‘Introduction to Pure Mathematics’ last term.
If a, b ∈ R then we define intervals as follows:

(a, b) := {x ∈ R : a < x < b}


[a, b] := {x ∈ R : a ≤ x ≤ b}
(−∞, a) := {x ∈ R : x < a},

etc.
1 Limits of Functions

1.1 Sequence limits and completeness

This course builds on the ideas from Analysis I and also uses many of the results from that
course. I have put some of the most important results from Analysis I in these notes but
I will not write them on the board in the lecture. However, I will begin by recalling the
definition of limits for sequences.

Definition 1.1.1. A sequence {zn} of real (or complex) numbers has limit l if ∀ε > 0, ∃N ∈ N such that

|zn − l| < ε ∀n > N.
We denote this by ‘zn → l as n → ∞’ or by ‘limn→∞ zn = l’.

Definition 1.1.2. A sequence {zn } of real (or complex) numbers converges if it has a limit
l.

Often we prove things by contradiction. We start by assuming that what we want is not
true. That means we have to be able to write down the contrapositive of a proposition.
We can do this mechanically: working from the left change every ∀ into ∃, every ∃ into ∀
and negate the simple proposition at the end.
For example, by the first definition, a sequence {zn} does not converge to l (i.e. either {zn} diverges, or zn → a ≠ l) if and only if ∃ε > 0 such that ∀k ∈ N, ∃nk > k s. t.

|z_{n_k} − l| ≥ ε.

Definition 1.1.3. {zn} is called a Cauchy sequence if ∀ε > 0 ∃N ∈ N such that ∀n, m > N,

|zn − zm| < ε.

Here is the key theorem, sometimes called The General Principle for Convergence:

Theorem (Cauchy’s Criterion). A sequence {zn } of real (or complex) numbers converges
if and only if it is a Cauchy sequence.

When mathematicians say that the real number system R and the complex number system
C are complete, what they mean is that this theorem is true. There are no sequences which look as though they converge but don't; there are no ‘gaps’ in the real number line.

According to Cauchy’s criterion, {zn} diverges [i.e. has no finite limit] if and only if ∃ε > 0 such that ∀k ∈ N, there exist [at least] two integers n_{k,1}, n_{k,2} > k s. t. |z_{n_{k,1}} − z_{n_{k,2}}| ≥ ε.
1.2 Compactness

The following theorem demonstrates the “compactness” of a bounded subset.


Theorem. (The Bolzano–Weierstrass Theorem) Any bounded sequence in R (or in C)
has a subsequence which converges to a point in R (in C).

The theorems about continuous functions which we are going to prove in this course rely on the following:
Corollary 1.2.1. A bounded sequence {zn } in R (or in C) converges to a limit l if and only
if all convergent subsequences of {zn } have the same limit l.

Proof. =⇒: This was proved in Analysis I: any subsequence of a convergent sequence tends
to the same limit.
⇐=: We argue by contradiction. Suppose {zn} is divergent.
Since {zn } is bounded, there exists a subsequence {znk } converging to some limit l1 by the
Bolzano–Weierstrass Theorem. Notice that N \ {nk : k ∈ N} can’t be finite: if that were to
happen then {zn } and {znk } would have a common tail and so zn → l1 after all.
We can therefore let {z_{m_k}} be the subsequence obtained by omitting the terms labelled by nk. If this subsequence converges to l1 then it is easy to see that {zn} converges to l1. (In Analysis I we did this for special cases like z_{2n} → l and z_{2n+1} → l; the argument is easily modified.)
So we have that {z_{m_k}} does not converge to l1. Writing down the contrapositive, then, we have: there exists ε0 > 0 such that for every natural number j there exists a natural number, which we denote by rj, such that rj > j and |z_{m_{r_j}} − l1| ≥ ε0.

Now {z_{m_{r_j}}} is bounded, so by the Bolzano–Weierstrass Theorem there exists a convergent subsequence {z_{m_{r_{j_s}}}}, with z_{m_{r_{j_s}}} → l2 say. Then letting s → ∞ in

0 < ε0 ≤ |z_{m_{r_{j_s}}} − l1|

we get, by the preservation of weak inequalities, that

0 < ε0 ≤ |l2 − l1|

and so l1 ≠ l2. We thus have found two subsequences of {zn} which converge to distinct limits.

1.3 Limit points

Before we talk about limits of functions ‘f (x) → l as x → a’ we need to say something about
the sort of points a in a set that we are interested in—we want to exclude points which x
can’t get near!
Definition 1.3.1. Let E ⊆ R (or C). A point p ∈ R (or C) is called a limit point (or an accumulation point, or a cluster point) of E if ∀δ > 0 there is at least one point z ∈ E other than p such that

0 < |z − p| < δ.

Definition 1.3.2. A point which is not a limit point of E is called an isolated point of E.

There are all sorts of exotic examples of limit points but most sets we will consider are
intervals so the following result is crucial:

Theorem 1.3.3. p ∈ R is a limit point of an interval [a, b] ( or (a, b], or [a, b) or (a, b)) if
and only if p ∈ [a, b].

Proof for the interval (a, b]. There are (by trichotomy) only three cases: p < a, p ∈ [a, b], and
p > b. In the first take δ := (a − p)/2 and get a contradiction, in the third take δ := (p − b)/2.
The case p ∈ [a, b] is an exercise.

1.4 Functions

Let f : X → Y , where X and Y are subsets of C or R.

Although there’s no such thing as a typical function, here are three examples, which are often useful as test cases when we formulate definitions and make conjectures.

Example 1.4.1. f(x) = √(1 − x²) with domain E = [−1, 1]. What is its graph? Its graph looks continuous . . . .

Example 1.4.2. Consider the function f on E = (0, 1] given by

f(x) := 1/q if x = p/q in lowest terms; f(x) := 0 when x is irrational.

This time our sketch of the graph is a bit more sketchy. [Try it with Maple.]

Example 1.4.3. The function f(x) = x sin(1/x) with domain R\{0} is an important test case in our work. As x gets close to 0, the values of f oscillate, but they do get close to 0. Once we have formalised this we will see that f has limit 0 as x goes to 0.

1.5 Limits of Functions

Having looked at these examples we make a definition.

Definition 1.5.1. Let E ⊆ R (or C), and f : E → R (or C) be a real (or complex) function.
Let p be a limit point of E and let l be a number. We say that f tends to l as x tends to
p if ∀ε > 0 ∃δ > 0 such that

|f (x) − l| < ε ∀x ∈ E such that 0 < |x − p| < δ.

In symbols we write this as ‘limx→p f (x) = l’ or ‘f (x) → l as x → p.’

Remark 1.5.2. (i) Note that p is not necessarily in E.


(ii) Note that in the definition δ may depend on p and ε.

Example 1.5.3. Let α > 0 be a constant. Consider the function f(x) = |x|^α sin(1/x) on the domain E = R\{0}. Show that f(x) → 0 as x → 0.

Since |sin θ| ≤ 1 we have that ||x|^α sin(1/x)| ≤ |x|^α for any x ≠ 0. Therefore, ∀ε > 0, choose δ = ε^{1/α}. Then

||x|^α sin(1/x) − 0| ≤ |x|^α < ε whenever 0 < |x − 0| < δ.

According to the definition, |x|^α sin(1/x) → 0 as x → 0.

Example 1.5.4. Consider the function f(x) = x² on the domain E = R. Let a ∈ R. Show that f(x) → a² as x → a.

Note that |x² − a²| = |x − a||x + a| ≤ |x − a|(|x| + |a|). So we want to get a bound on x.
Suppose that |x − a| < 1; then

|x| = |x − a + a| ≤ |x − a| + |a| < 1 + |a|.

So ∀ε > 0, choose δ = min{1, ε/(1 + 2|a|)}. Then

|x² − a²| ≤ |x − a|((1 + |a|) + |a|) < ε whenever |x − a| < δ

as required.
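For those who like to experiment, here is a short Python sketch (an illustration only, not part of the printed notes) which checks numerically that the δ chosen above does its job for one particular a and ε:

```python
# Illustration only: check numerically that delta = min{1, eps/(1 + 2|a|)}
# from Example 1.5.4 forces |x^2 - a^2| < eps whenever 0 < |x - a| < delta.
def delta_for(eps, a):
    return min(1.0, eps / (1 + 2 * abs(a)))

a, eps = 3.0, 1e-3
d = delta_for(eps, a)
for k in range(1, 1001):
    x = a + d * (k / 1001.0)          # sample points with 0 < x - a < delta
    assert abs(x * x - a * a) < eps
    x = a - d * (k / 1001.0)          # and with -delta < x - a < 0
    assert abs(x * x - a * a) < eps
print("delta =", d, "works for eps =", eps, "at a =", a)
```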

Theorem 1.5.5. Let f : E → R (or C) and p be a limit point of E. If f has a limit as x → p, then the limit is unique.

Proof. Suppose f(x) → l1 and also f(x) → l2 as x → p, where l1 ≠ l2. Then ½|l1 − l2| > 0, so by definition, ∃δ1 > 0 such that

|f(x) − l1| < ½|l1 − l2| ∀x ∈ E such that 0 < |x − p| < δ1.

Similarly, ∃δ2 > 0 such that

|f(x) − l2| < ½|l1 − l2| ∀x ∈ E such that 0 < |x − p| < δ2.

Let δ = min{δ1, δ2}. Since p is a limit point of E and δ > 0, ∃x0 ∈ E such that 0 < |x0 − p| < δ. However

|l1 − l2| = |(f(x0) − l1) − (f(x0) − l2)|  [add and subtract technique]
         ≤ |f(x0) − l1| + |f(x0) − l2|  [Triangle Law]
         < ½|l1 − l2| + ½|l1 − l2|
         = |l1 − l2|,

a contradiction.

Remark 1.5.6. An exercise in contrapositives: f doesn’t converge to l as x → p (i.e. either f has no limit or f(x) → a ≠ l as x → p) means that ∃ε > 0 such that ∀δ > 0, ∃x ∈ E such that 0 < |x − p| < δ but |f(x) − l| ≥ ε.

The following theorem translates questions about function limits to questions about sequence
limits, and so we can make use of results in Analysis I.
Theorem 1.5.7. Let f : E → R (or C) where E ⊆ R (or C), p be a limit point of E and
l ∈ C. Then the following two statements are equivalent:

(a) f (x) → l as x → p;

(b) For every sequence {pn} in E such that pn ≠ p and limn→∞ pn = p we have that f(pn) → l as n → ∞.

We might say informally that limx→p f (x) = l if and only if f tends to the same limit l along
any sequence in E going to p.

Proof. =⇒: Suppose limx→p f (x) = l. Then ∀ε > 0, ∃δ > 0 such that

|f (x) − l| < ε ∀x ∈ E such that 0 < |x − p| < δ.

Now suppose {pn} is a sequence in E, with pn → p and pn ≠ p. Then ∃N ∈ N such that

|pn − p| < δ ∀n > N.

So, since pn ≠ p,
|f (pn ) − l| < ε ∀n > N.
Hence, limn→∞ f (pn ) = l.

⇐=: Argue by contradiction. Suppose limx→p f(x) = l is not true. Then ∃ε0 > 0 such that ∀δ > 0 (which we choose to be 1/n for arbitrary n) ∃xn ∈ E, with 0 < |xn − p| < 1/n but

|f(xn) − l| ≥ ε0.
Therefore we have found a sequence {xn } which converges to p but {f (xn )} does not tend
to l. Contradiction.

1.6 Example

The above result is very useful when we want to prove that limits do not exist.
Example 1.6.1. Show that limx→0 sin(1/x) doesn’t exist.

Let xn = 1/(πn) and yn = 1/(2πn + π/2). Then both sequences xn and yn tend to 0, but

lim_{n→∞} sin(1/xn) = 0

and

lim_{n→∞} sin(1/yn) = 1.

So limx→0 sin(1/x) cannot exist.
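The two sequences are easy to inspect numerically; a short Python sketch (illustration only):

```python
# Illustration only: along x_n = 1/(pi n) we get sin(1/x_n) = sin(pi n) = 0,
# while along y_n = 1/(2 pi n + pi/2) we get sin(1/y_n) = 1 (Example 1.6.1).
import math

for n in (10, 100, 1000):
    xn = 1.0 / (math.pi * n)
    yn = 1.0 / (2 * math.pi * n + math.pi / 2)
    print(n, math.sin(1 / xn), math.sin(1 / yn))
```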

1.7 Algebra of Limits

We can use the theorem of the previous subsection together with the Algebra of Limits of
Sequences to prove the corresponding results: we get the Algebra of Limits of Functions. We
state the theorem for C but it also holds for R.

Theorem 1.7.1. Let E ⊆ C and let p be a limit point of E. Let f, g : E → C, and let
α, β ∈ C. Suppose that f (x) → A, g(x) → B as x → p. Then the following limits exist and
have the values stated:

(Addition) limx→p (f (x) + g(x)) = A + B;

(Negation) limx→p (−f (x)) = −A;

(Linear Combination) limx→p (α · f + β · g) (x) = αA + βB;

(Product) limx→p (f (x)g(x)) = AB;

(Quotient) if B ≠ 0 then ∃δ > 0 s.t. g(x) ≠ 0 ∀x ∈ E such that 0 < |x − p| < δ, and limx→p (f(x)/g(x)) = A/B;

(Weak Inequality) if f(x) ≥ 0 for all x ∈ E then A ≥ 0.

It is a good exercise to prove these results directly from the definitions; just mimic the
sequence proofs.
Example of proof: If B ≠ 0 then ∃δ > 0 s.t. g(x) ≠ 0 ∀x ∈ E such that 0 < |x − p| < δ, and limx→p (1/g(x)) = 1/B.
I will do it both ways:
(i) Deduction from AOL for sequences: Suppose first that there is no such δ. Then for each
n, ∃pn ∈ E such that 0 < |pn − p| < 1/n, and g(pn ) = 0. But then pn → p, so g(pn ) → B,
giving B = 0, a contradiction. So δ > 0 exists.
Now let {xn } be any sequence in E with xn → p and xn 6= p. We may assume xn ∈ (p−δ, p+δ)
(by tails). Hence g(xn ) 6= 0 and g(xn ) → B. Thus by the AOL for sequences, 1/g(xn ) → 1/B.
Thus 1/g(x) → 1/B as required.
(ii) Direct proof: Take ε = |B|/2 > 0. So ∃δ1 > 0 such that

|g(x) − B| < |B|/2 ∀x ∈ E such that 0 < |x − p| < δ1 .

Thus by the Triangle Law

|g(x)| = |B + (g(x) − B)| ≥ |B| − |g(x) − B| > |B| − |B|/2 = |B|/2

∀x ∈ E such that 0 < |x − p| < δ1. (So in particular, g(x) ≠ 0 whenever 0 < |x − p| < δ1.)
Now, given ε > 0, ∃δ2 > 0 such that

|g(x) − B| < |B|²ε/2 ∀x ∈ E such that 0 < |x − p| < δ2.

Take δ = min{δ1, δ2}. Then if x ∈ E is such that 0 < |x − p| < δ,

|1/g(x) − 1/B| = |g(x) − B| / (|g(x)||B|) < (|B|²ε/2) / (|B|²/2) = ε

as required.
Remark 1.7.2. Note we have also proved above that if limx→p g(x) = B ≠ 0, then there is a positive number δ > 0 such that

|g(x)| > |B|/2 ∀x ∈ E such that 0 < |x − p| < δ.

In particular, |g(x)| > 0 ∀x ∈ E such that 0 < |x − p| < δ.
It can be proved similarly that if g : E → R and B > 0, then ∃δ > 0 such that g(x) > B/2 > 0 ∀x ∈ E such that 0 < |x − p| < δ.

1.8 An extension

Sometimes we want to extend the notion ‘f(x) → l as x → p’ to cover ‘infinity’. Here is one such extension: note that although ∞ appears in the language, we have not given it the status of a number: it can only appear in certain phrases in our mathematical language which are shorthand for quite complicated statements about real numbers.
Definition 1.8.1. Suppose that E ⊆ R is a set which is unbounded above and f : E → R.
Then we write f (x) → ∞ as x → ∞ to mean that ∀B > 0, ∃D > 0 such that
f (x) > B ∀x ∈ E s.t. x > D.

Extra result added 26th January 2010:


Here is a version of the sandwich theorem.
Proposition 1.8.2. Let E ⊆ R and let p be a limit point of E. Let f, m, M : E → R. Suppose
that there exists δ > 0 s.t. m(x) ≤ f (x) ≤ M (x) for all x ∈ E such that 0 < |x − p| < δ and
that m(x) → l, M (x) → l as x → p. Then limx→p f (x) exists and equals l.

Proof. Since m(x) → l and M(x) → l as x → p we have:

∀ε > 0 ∃δ1 > 0 s.t. l − ε < m(x) < l + ε ∀x ∈ E s.t. 0 < |x − p| < δ1, and
∀ε > 0 ∃δ2 > 0 s.t. l − ε < M(x) < l + ε ∀x ∈ E s.t. 0 < |x − p| < δ2.

So if we take δ3 = min{δ, δ1, δ2}, then ∀ε > 0

l − ε < m(x) ≤ f(x) ≤ M(x) < l + ε ∀x ∈ E s.t. 0 < |x − p| < δ3,

and we are done.

2 Continuity of functions

We all have a good informal idea of what it means to say that a function has a continuous
graph: we can draw it without lifting the pencil from the paper. But we want now to use
our precise definition of ‘f (x) → l as x → p’ to discuss the idea of continuity. That is we
want to discuss the precise question of whether f is continuous at a particular point p.

2.1 Definition

In the definition of limx→p f (x), the point p need not belong to the domain E of f . But even
if it does, and f (p) is well-defined, the limit of f at p may have nothing to do with f (p).
The classic example is the function

f(x) := 0 if x ≠ 0; f(x) := 1 if x = 0.

Then limx→0 f(x) = 0 ≠ 1 = f(0).


This example motivates our definition.

Definition 2.1.1. Let f : E → R (or C), where E ⊆ R (or C), and p ∈ E. If ∀ε > 0 ∃δ > 0
such that
|f (x) − f (p)| < ε ∀x ∈ E such that |x − p| < δ
then we say that f is continuous at p.

We continue with the notation of the definition for a moment and see what this means for
isolated and limit points.

Proposition 2.1.2. f is continuous at any isolated point of E.

Proof. As p is isolated there exists δ > 0 such that there are no points x ∈ E with 0 < |x − p| < δ. The inequality required is therefore vacuously true.

Proposition 2.1.3. If p ∈ E is a limit point of E, then f is continuous at p if and only if limx→p f(x) exists and limx→p f(x) = f(p).

Proof. It’s clear that the continuity definition implies the limit one at once. The limit one,
provided the limit is f (p), delivers all that we need for continuity except that the inequality
|f (x) − l| < ε holds for x = p as well as the other points x in |x − p| < δ. But this is
immediate.

2.2 Examples

Example 2.2.1. Let α > 0 be a constant. The function f(x) = |x|^α sin(1/x) is not defined at x = 0, so it makes no sense to ask if it is continuous there. In such circumstances we modify f in some suitable way. So we look at

g(x) := |x|^α sin(1/x) if x ≠ 0; g(x) := 0 if x = 0.

Then 0 is a limit point of the domain, and we calculated before that limx→0 g(x) = 0 = g(0), so g is continuous at 0.

Example 2.2.2. Let f : (0, 1] → R be defined by

f(x) := 1/n if x = m/n in lowest terms; f(x) := 0 if x is irrational.

At which points of (0, 1] is f continuous?

This is very like a problem on the Exercise Sheets, so I won’t give a full proof here, only indicate how I would tackle it.
Every p ∈ (0, 1] is a limit point, so we need to work out limx→p f (x) for each p. We know
that we can do this by looking at limn→∞ f (pn ) for each sequence {pn } converging to p.
We know that there is always a sequence of irrationals {xn } converging to p. (Because, from
Analysis I, for every n ∈ N the interval (p, p + 1/n) contains an irrational number xn .) Then
the sequence {f (xn )} is just the null sequence (0, 0, . . . ) with limit 0.
So perhaps we need to distinguish between rational and irrational points?
Suppose p ≠ 0 is rational, say p = m/n in lowest terms. Then, with {xn} as above, f(xn) → 0 but f(p) = 1/n ≠ 0. Therefore f is not continuous at non-zero rational points.
Now let p be irrational. Some sequences (for example irrational ones) tend to 0 = f(p). But do all sequences have this property? Let pn → p, and consider f(pn). If this does not tend to zero, then for some ε > 0 we can find a subsequence such that f(pnj) > ε. That is, these pnj must be rational and have denominators less than 1/ε. There are only a finite number of such points in the interval, and so, since pn → p, we can find an N such that for n > N all the pn are irrational or have denominators at least 1/ε. Hence we cannot have the claimed subsequence.
Therefore f is continuous at irrational points since for all sequences {pn } we have that
f (pn ) → f (p).
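Since the notes suggest experimenting (e.g. with Maple), here is a small Python sketch of the same function (illustration only), using exact rational arithmetic; rational approximations to an irrational point have growing denominators, so f(pn) → 0:

```python
# Illustration only (not from the notes): the function of Example 2.2.2,
# evaluated with exact rationals. A Fraction is reduced to lowest terms,
# so f(m/n) = 1/n; floats stand in for irrational inputs.
import math
from fractions import Fraction

def f(x):
    return Fraction(1, x.denominator) if isinstance(x, Fraction) else 0

print(f(Fraction(3, 4)))   # 1/4
print(f(Fraction(2, 6)))   # 2/6 reduces to 1/3, so the value is 1/3
# rational approximations p_n -> 1/sqrt(2): denominators grow, so f(p_n) -> 0
for q in (10, 100, 1000, 10000):
    p = Fraction(round(q / math.sqrt(2)), q)
    print(p, f(p))
```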

2.3 Algebraic properties

We can use our characterisation of continuity at limit points in terms of limx→p f (x) together
with the Algebra of Function Limits to prove that the class of functions continuous at p is
closed under all the usual operations. We state the theorem for C but it also holds for R.
Theorem 2.3.1. Let E ⊆ C and let p ∈ E. Let f, g : E → C, and let α, β ∈ C. Suppose
that f, g are continuous at p. Then the following functions are also continuous at p:

(Addition) f (x) + g(x);

(Negation) −f (x);

(Linear Combination) (α · f + β · g) (x);

(Product) (f (x)g(x)); and

(Quotient) (f(x)/g(x)) provided g(p) ≠ 0 (which guarantees that there exists δ such that f(x)/g(x) is defined ∀x ∈ E such that |x − p| < δ).

Proof. Follows directly from the Algebra of Function Limits. However, it is a good exercise² to write out a proof from the definition: again just mimic what was done for the AOL for sequences.

Example 2.3.2. Let f : C → C (or R → R) be a polynomial. Then f is continuous at every point of C (or R).
Further, if f(x) = r(x)/q(x), where r, q : C → C (or R → R) are polynomials, then if q(p) ≠ 0, f is continuous at p.

This follows immediately from the above theorem because the function f (x) = x with domain
C (or R) is continuous.

2.4 Composition of continuous functions

However we can do more than these trivial algebraic results.


Theorem 2.4.1. Let f : E → C and g : f (E) → C, and define h : E → C by

h(x) = (g ◦ f )(x) ≡ g(f (x)) for x ∈ E.

If f is continuous at p ∈ E and g is continuous at f (p), then h is continuous at p.

Proof. For any ε > 0, since g is continuous at f (p), ∃δ1 > 0 such that

|g(y) − g(f (p))| < ε ∀y ∈ f (E) such that |y − f (p)| < δ1 .

That is
|g(f (x)) − g(f (p))| < ε ∀x ∈ E such that |f (x) − f (p)| < δ1 .
However, f is continuous at p, so ∃δ > 0 such that

|f (x) − f (p)| < δ1 ∀x ∈ E such that |x − p| < δ.

Hence
|g(f (x)) − g(f (p))| < ε ∀x ∈ E such that |x − p| < δ
so that h is continuous at p.

Extra result added 26th January 2010:


The following theorem follows immediately from Proposition 2.1.3 and the proof of Theorem
1.5.7. In this case we do not need to avoid sequences which hit the point:
Theorem 1.5.7′. Let f : E → R (or C) where E ⊆ R (or C) and p ∈ E. Then the following two statements are equivalent:

(a) f is continuous at p;

(b) For every sequence {pn} in E such that limn→∞ pn = p we have that f(pn) → f(p) as n → ∞.
² Doing this will reinforce the definitions, but also consolidate your understanding of sequences.

3 Continuity and Uniform Continuity

3.1 Continuous functions on sets

Having made our definition of ‘continuity’ we will see that actually, what usually matters is
not continuity at a point, but continuity at all points of a set, and the interesting sets are
usually intervals or disks. In the later lectures we are going to establish several important
theorems about continuous functions on bounded intervals.
But here is the definition of continuity on a set.
Definition 3.1.1. Let f : E → R (or C). We say that f is continuous on E if f is
continuous at every point of E.

For later use we decode this in terms of εs and δs.


Proposition 3.1.2. Let f : E → R (or C). Then f is continuous on E if,

∀p ∈ E and ∀ε > 0, ∃δ > 0

such that
|f (x) − f (p)| < ε ∀x ∈ E such that |x − p| < δ.

Note that the δ may depend on ε and on the point p.

We are about to look at uniform continuity, in which δ does not depend on p. First we will
consider an example which is not uniformly continuous.

3.2 An Example

We look at an example of a function continuous on a set.

Example 3.2.1. Let f : (0, ∞) → R be given by f(x) := 1/x.
Show that for every p ≠ 0, limx→p 1/x = 1/p, and thus f is continuous on (0, ∞).

By the algebra of limits this is all clear. But we want to analyse what is going on more carefully, to see how the δ is related to ε and to the point p in question.
First,

|f(x) − f(p)| = |1/x − 1/p| = |x − p| / (|x||p|)

and we can see that the problem term is 1/x.
However, |p| > 0, and so when |x − p| < ½|p| we have by the Triangle Law that

|x| ≥ |p| − |x − p| > ½|p|;

so we're going to have to pick δ ≤ ½|p|.

For these x, then, we have that

|f(x) − f(p)| ≤ (2/|p|²) |x − p|

and if we make sure (2/|p|²) |x − p| < ε we will be done.
This can be achieved by choosing

δ := min{½|p|, ½ε|p|²}

which is indeed positive.

Note that for small ε (the interesting ones) the values of δ we need depend heavily on p. Near 1 choosing ½ε will do, but at 10⁻⁶ we need ½ · 10⁻¹² ε. Our function is certainly continuous at every point, but there's no way of controlling over the whole interval how far it strays in a small neighbourhood.
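To see this dependence concretely, a short Python sketch (illustration only):

```python
# Illustration only: for f(x) = 1/x, the delta of Example 3.2.1,
# delta = min{p/2, eps*p^2/2}, collapses as the point p approaches 0.
def delta_for(eps, p):
    return min(p / 2, eps * p * p / 2)

eps = 1e-2
for p in (1.0, 0.1, 1e-3, 1e-6):
    print(p, delta_for(eps, p))
```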

3.3 Uniform Continuity

Sometimes we want to be able to control what happens over a set more ‘uniformly’.

Definition 3.3.1. Let f : E → R (or C). Then f is uniformly continuous on E if,

∀ε > 0, ∃δ > 0

such that
|f (p) − f (x)| < ε ∀p ∈ E and ∀x ∈ E such that |p − x| < δ.

Note the difference³ between this and the definition of ‘continuous on E’. In this, the uniform case, we must find δ on the basis of ε alone, and we have to choose one that will give the inequality for all x ∈ E and p ∈ E. Obviously if we can do this it is very nice: it gives us a way of controlling what happens on a set all at once.

Of course, if f : E → R (or C) is uniformly continuous on E then f is continuous on E.


Here is one class of functions that satisfy the uniform continuity condition.

Example 3.3.2. Suppose that f is Lipschitz continuous in E: that is, assume that there is
a constant M such that

|f (x) − f (y)| 6 M |x − y| ∀x, y ∈ E.

Then f is uniformly continuous on E.


³ For those who like pure formulae,

Continuity on E: ∀p ∈ E ∀ε > 0 ∃δ > 0 ∀x ∈ E [|x − p| < δ =⇒ |f (x) − f (p)| < ε]


Uniform Continuity on E: ∀ε > 0 ∃δ > 0 ∀p ∈ E ∀x ∈ E [|x − p| < δ =⇒ |f (x) − f (p)| < ε]

Swapping ∀s doesn’t give problems, but swapping the ∀p and ∃δ is the crunch.

Take x, y ∈ E. Given ε > 0, choose δ = ε/(M + 1) > 0. Then

|f(x) − f(y)| ≤ M|x − y| ≤ M (ε/(M + 1)) < ε

whenever |y − x| < δ.
Note that our choice of δ does not depend on x or y. For a given ε > 0 we can find a δ that works for all x and y.

Example 3.3.3. f(x) = √x is Lipschitz continuous on [1, ∞), so it is uniformly continuous.

To see the Lipschitz condition note that

|√x − √y| ≤ |x − y|/(√x + √y) ≤ ½|x − y|

for all x, y ≥ 1.

3.4 Continuity implies Uniform Continuity on [a, b]

Our first real theorem is:

Theorem 3.4.1 (Uniform Continuity on [a, b]). If f : [a, b] → R (or C) is continuous, then f is uniformly continuous.

More generally, a continuous function on a closed and bounded set—‘compact set’ as we’ll
say next year—is uniformly continuous.

Proof. Suppose that f were not uniformly continuous. By the contrapositive of ‘uniform continuity’ there would exist ε > 0 such that for any δ > 0 (which we choose as δ = 1/n for arbitrary n) there exists a pair of points xn, yn ∈ [a, b] such that

|xn − yn| < 1/n but |f(xn) − f(yn)| ≥ ε.

Since {xn : n ∈ N} ⊆ [a, b] is bounded, by the Bolzano–Weierstrass Theorem there exists a subsequence {xnk} which converges to some p. Hence p must be a limit point of [a, b], so p ∈ [a, b]. But

|ynk − p| ≤ |xnk − ynk| + |xnk − p| < 1/nk + |xnk − p| → 0.

Thus xnk → p and ynk → p, so that by continuity at p we have

0 < ε ≤ lim_{k→∞} |f(xnk) − f(ynk)| = |f(p) − f(p)| = 0

which is impossible.

3.5 An example on an unbounded interval

Example 3.5.1. f(x) = √x is uniformly continuous on the unbounded interval [0, +∞).

We do this in three steps: we prove uniform continuity on [0, 1], we prove uniform continuity on [1, +∞), and we patch these together.

It is easy to get that √x is continuous on [0, 1]: for p > 0, provided |x − p| < ½p we will get

|√x − √p| ≤ |x − p|/(√x + √p) ≤ (2/(3√p)) |x − p|

and can argue from there. Thus it must be uniformly continuous by Theorem 3.4.1.

Secondly, we have already shown that √x is Lipschitz continuous on [1, ∞), so it is uniformly continuous on [1, ∞).
Now we have to patch these together. This is a standard sort of argument which we do this time as an example.
We have that for all ε > 0, ∃δ1 > 0 such that

|√x − √y| < ½ε ∀x, y ∈ [0, 1] such that |x − y| < δ1,

and ∃δ2 > 0 such that

|√x − √y| < ½ε ∀x, y ∈ [1, ∞) such that |x − y| < δ2.

Choose δ = min{δ1, δ2, ½} > 0. Then, suppose that |x − y| < δ. If x, y ≥ 1 or x, y ≤ 1 we are done.
So suppose that x ∈ [0, 1] and y > 1. Then |x − 1| < δ and |y − 1| < δ so that

|√x − √y| ≤ |√x − √1| + |√y − √1| < ½ε + ½ε = ε.

Hence we have

|√x − √y| < ε

whenever x, y ∈ [0, ∞) are such that |x − y| < δ. By definition, f(x) = √x is uniformly continuous on the unbounded interval [0, +∞).

3.6 A counterexample on a half open interval

The condition that the interval [a, b] is closed cannot be relaxed.

Example 3.6.1. f(x) = 1/x is not uniformly continuous on the half open interval (0, 1]. (See also Example 3.2.1.)

Take ε = 1. We show that there is no δ > 0 such that Definition 3.3.1 holds.
Take sequences xn = 1/n and yn = 1/(n + 1). Then |f(xn) − f(yn)| = 1, but |xn − yn| → 0. So for any δ > 0, there exists n such that |xn − yn| < δ but |f(xn) − f(yn)| ≮ 1. So f is not uniformly continuous.
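A Python sketch of the witness pairs (illustration only):

```python
# Illustration only: pairs x_n = 1/n, y_n = 1/(n+1) from Example 3.6.1.
# They get arbitrarily close, yet f(x_n) and f(y_n) stay exactly 1 apart.
f = lambda x: 1.0 / x
for n in (10, 100, 1000):
    xn, yn = 1.0 / n, 1.0 / (n + 1)
    print(n, abs(xn - yn), abs(f(xn) - f(yn)))
```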

4 Continuous functions on a closed and bounded interval

4.1 Boundedness

We begin with some definitions.

Definition 4.1.1. Let f : E → R (or C), and let M be a non-negative real number. We say
that f is bounded by M on E if

|f(z)| ≤ M ∀z ∈ E.

We also say that M is a bound for f on E. If there is a bound for f on E we say that f is
bounded (on E).

Here is one of the important theorems of the course:

Theorem 4.1.2 (Continuous functions on [a, b] are bounded). If f : [a, b] → R (or C) is continuous, then f is bounded.

Proof. Argue by contradiction. Suppose f were unbounded, then for any n ∈ N, there is at
least one point xn ∈ [a, b] such that |f (xn )| > n. Since {xn } is bounded, by the Bolzano–
Weierstrass Theorem, there exists a subsequence {xnk } converging to p, say. Then p is a limit
point of the interval [a, b] so p ∈ [a, b]. Note that |f(xnk)| > nk ≥ k. Now f is continuous at p and so we have that

f(p) = lim_{k→∞} f(xnk)

so in particular the sequence {f (xnk )} is convergent. Hence, by an Analysis I result, this


sequence is bounded. As its k-th term exceeds k we have a contradiction.
Therefore f must be bounded.

We will now show that these bounds are ‘attained’.

Notation 4.1.3. Let f : E → R be a bounded real-valued function, with E ≠ ∅. Then write

sup_{x∈E} f(x) := sup{f(t) | t ∈ E}
inf_{x∈E} f(x) := inf{f(t) | t ∈ E}

noting that these exist by the Completeness Axiom.

Corollary 4.1.4. Let f : [a, b] → R be continuous. Then sup_{x∈[a,b]} f(x) and inf_{x∈[a,b]} f(x) exist.

Proof. Immediate.

Note 4.1.5. Recall that the supremum is precisely this: an upper bound, such that nothing smaller is an upper bound. It is convenient to translate this into ε-language about functions as follows:

M = sup_{x∈E} f(x) if and only if: ∀x ∈ E, f(x) ≤ M; and ∀ε > 0 ∃xε ∈ E such that f(xε) > M − ε.

We have a similar characterisation of infimum:

m = inf_{x∈E} f(x) if and only if: ∀x ∈ E, f(x) ≥ m; and ∀ε > 0 ∃xε ∈ E such that f(xε) < m + ε.

Here now is our second important theorem; note that it is only for real-valued functions.

Theorem 4.1.6 (Continuous functions on [a, b] attain their bounds). Let f : [a, b] → R be continuous; then f attains (or achieves) its supremum and infimum. That is, there exist points⁴ x1 and x2 in [a, b] such that f(x1) = sup_{x∈[a,b]} f(x) and f(x2) = inf_{x∈[a,b]} f(x).

Proof. (1st Proof: by contradiction.) Let us prove by contradiction that the supremum M
of f is attained.
Assume the contrary, that is

f (t) < M for all t ∈ [a, b].

Consider the function g defined on [a, b] by

g(x) = 1/(M − f(x)),

which is positive and continuous on [a, b]. Therefore g is, as we have proved, bounded on [a, b], by M0 say:

1/(M − f(x)) = g(x) ≤ M0.

It follows that

f(x) ≤ M − 1/M0

for all x ∈ [a, b], which is a contradiction to the fact that M is the least upper bound.
A similar argument deals with the infimum.⁵

As this is such an important theorem we give a different proof.


⁴ Note that x1, x2 may be not unique.
⁵ Or we may apply what we have done to −f and get the result at once since inf{t | t ∈ E} = − sup{−t | t ∈ E}.

Proof. (2nd Proof: direct.) The continuous function f is bounded by our earlier theorem, so that m := inf_{x∈[a,b]} f(x) exists by the Completeness Axiom of the real number system [Analysis I]. Apply the characterisation of infimum we have given, taking ε := 1/n to find a point xn ∈ [a, b] such that

m ≤ f(xn) < m + 1/n.

Now {xn} is bounded, so we may use the Bolzano–Weierstrass Theorem to extract a convergent subsequence {xnk}; suppose we have xnk → p. Then p is a limit point of [a, b] so p ∈ [a, b]. Since f is continuous at p, we have that f(xnk) → f(p). From the inequality

m ≤ f(xnk) < m + 1/nk

we can deduce, as limits preserve weak inequalities, that

m ≤ lim_{k→∞} f(xnk) = f(p) ≤ lim_{k→∞} (m + 1/nk) = m

so that f(p) = m = inf_{x∈[a,b]} f(x).
A similar argument will deal with the supremum.

4.2 A Generalisation

In the proofs we have used only:

(i) [a, b] is bounded;
(ii) [a, b] is closed (i.e. [a, b] contains all limit points of [a, b]);
(iii) f is continuous.

This prompts us to make the following definition:

Definition 4.2.1. A subset A of R (or of C) is compact if it is bounded, and if it contains all its limit points.

Our proofs would then give the more general result:

Theorem. Let f : E → R be a real valued function on a compact subset E of R or C. Then f is bounded, uniformly continuous, and attains its bounds.

5 The Intermediate Value Theorem

So far we have concentrated on extreme values, the supremum and the infimum. What can
we say about possible values between these?

Theorem 5.0.2 (IVT). Let f : [a, b] → R be continuous, and let c be a number between
f (a) and f (b). Then there is at least one ξ ∈ [a, b] such that f (ξ) = c.

This is one of the most important theorems in this course.

Proof. By considering −f instead of f if necessary, we may assume that f(a) ≤ c ≤ f(b). The cases c = f(a) and c = f(b) are trivial, so assume f(a) < c < f(b).
Define g(x) = f(x) − c. Then g(a) < 0 < g(b). Let x1 = a and y1 = b. Divide the interval [x1, y1] into two equal parts. If g(½(x1 + y1)) = 0 then ξ := ½(x1 + y1) will do. Otherwise, we choose x2 = x1 and y2 = ½(x1 + y1) if g(½(x1 + y1)) > 0, or x2 = ½(x1 + y1) and y2 = y1 if g(½(x1 + y1)) < 0. Then

g(x2)g(y2) < 0; [x2, y2] ⊂ [x1, y1]; and |y2 − x2| = ½(y1 − x1).

Applying the same argument to [x2, y2] instead of [x1, y1], we then find that: either g(½(x2 + y2)) = 0 and we can take ξ := ½(x2 + y2), or there exist x3, y3 such that

g(x3)g(y3) < 0; [x3, y3] ⊂ [x2, y2]; and |y3 − x3| = ½(y2 − x2).

By repeating the same procedure, we thus find two sequences xn, yn such that

(i) either g(½(x_{n−1} + y_{n−1})) = 0 and we can take ξ := ½(x_{n−1} + y_{n−1}), or g(xn)g(yn) < 0;

(ii) [xn, yn] ⊂ [x_{n−1}, y_{n−1}] for any n = 2, . . . ;

(iii) |yn − xn| = ½|y_{n−1} − x_{n−1}| = · · · = (1/2^{n−1})|y1 − x1| = (b − a)/2^{n−1}.

Obviously, {xn} is a bounded increasing sequence, and {yn} is a bounded decreasing sequence. Bounded monotone sequences converge and so xn → ξ and yn → ξ′ for some ξ, ξ′ ∈ [a, b]. Since by Algebra of Limits

|ξ′ − ξ| = lim_{n→∞} |yn − xn| = lim_{n→∞} (b − a)/2^{n−1} = 0,

we get ξ = ξ′. Since g is continuous at ξ, we have by Algebra of Limits and the preservation of weak inequalities that

0 ≥ lim_{n→∞} g(xn)g(yn) = lim_{n→∞} g(xn) · lim_{n→∞} g(yn) = g(ξ)².

Hence g(ξ)² = 0 as we are dealing with real numbers, so that g(ξ) = 0. That is, f(ξ) = c.

Remark 5.0.3. The above proof of the IVT also provides a method of finding roots to
f (ξ) = c, but other methods may find roots faster if additional information about f (e.g. that
f is differentiable) is available.
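The interval-halving in the proof is exactly the classical bisection method. Here is a minimal Python sketch of that method (an illustration only, not part of the printed notes), under the assumption that f is continuous with f(a) ≤ c ≤ f(b):

```python
# Minimal bisection sketch (illustration only): find xi in [a, b] with
# f(xi) = c, assuming f is continuous and f(a) <= c <= f(b), by halving
# the interval on which g(x) = f(x) - c changes sign, as in the IVT proof.
def bisect(f, a, b, c, tol=1e-12):
    assert f(a) - c <= 0 <= f(b) - c, "need f(a) <= c <= f(b)"
    while b - a > tol:
        m = (a + b) / 2
        gm = f(m) - c
        if gm == 0:
            return m
        if gm > 0:
            b = m   # sign change of g now lies in [a, m]
        else:
            a = m   # sign change of g now lies in [m, b]
    return (a + b) / 2

# e.g. the root of x^2 = 2 on [0, 2], cf. Example 5.0.6:
print(bisect(lambda x: x * x, 0.0, 2.0, 2.0))   # ~1.4142135623
```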
Corollary 5.0.4. Let {[xn, yn]} be a decreasing net⁶ of closed intervals of R such that the length yn − xn → 0. Then ∩_{n=1}^{∞} [xn, yn] contains exactly one point.

Proof. Just extract the relevant lines of the IVT proof.

Remark 5.0.5. The proof of IVT requires more than what we needed for boundedness and
the attainment of bounds. We have used the fact that [a, b] is unbroken. That is, we have
used the fact that [a, b] is “connected”.
⁶ That is, [x_{n+1}, y_{n+1}] ⊂ [xn, yn] for each n.

Here (for interest) is a sketch of an alternative proof, which identifies ξ as the supremum of
a certain set.

Sketch of alternative proof to IVT. As in the original proof it is sufficient to prove that if g : [a, b] → R is continuous and g(a) < 0 < g(b), then there exists ξ ∈ (a, b) such that g(ξ) = 0.
Define E = {x ∈ [a, b] : g(x) < 0}.
Then E 6= ∅ (why?) and E is bounded (why?). So, by the Completeness Axiom, ξ = sup E
exists. We prove g(ξ) = 0.
Suppose first that g(ξ) < 0 (so ξ ∈ [a, b)). But then, as g is continuous, there exists h > 0
s.t. (ξ, ξ + h) ⊂ [a, b] and g(t) < 0 for t ∈ (ξ, ξ + h). (Proof?) But then ξ + h/2 ∈ E so ξ is
not the sup, a contradiction.
Suppose now that g(ξ) > 0 (so ξ ∈ (a, b]). But then, again because g is continuous, there
exists h > 0 s.t. (ξ − h, ξ) ⊂ [a, b] and g(t) > 0 for t ∈ (ξ − h, ξ). (Proof?) But there exists
t ∈ (ξ − h, ξ] such that g(t) < 0 which is also a contradiction.
Hence g(ξ) = 0 as required.

This proof may seem familiar from Analysis I, where a proof similar to this was used to prove the existence of √2. In fact we can now prove this directly from the IVT.

Example 5.0.6. There exists a unique positive number ξ s.t. ξ² = 2.

Proof: Consider f(x) = x² − 2. Note that f(0) = −2 and f(2) = 2. So f : [0, 2] → R, f(0) < 0 < f(2) and also, as f is a polynomial, it is continuous. Thus, by the IVT, there exists ξ ∈ (0, 2) such that f(ξ) = 0, as required.

More generally the IVT is often used to show that algebraic equations have solutions. In the following, if you draw the graphs of y = e^x and y = αx, you will see that if α = e the curves touch, if α < e they do not meet, but if α > e then they meet twice. The following example shows how to make this graphical argument rigorous using the IVT. It shows that if α > e there exist two solutions. Once we have covered differentiability you will be able to prove that there are exactly two solutions, by using the fact that f′(x) < 0 if x < log α, but f′(x) > 0 if x > log α.

Example 5.0.7. Let α > e. Show that there exist two distinct points xi > 0, i = 1, 2, such that e^{x_i} = αx_i.

Proof: Consider f(x) = e^x − αx. We will prove later that e^x is continuous for all x. Hence f(x) is continuous on [0, ∞). Since e^x is defined by its power series, e^x > x²/2, and thus e^X > αX for any X > 2α. Fix such an X (> log α).
Then f(0) = 1 > 0, f(log α) = α(1 − log α) < 0, and f(X) > 0. So we can apply the IVT to the two intervals [0, log α] and [log α, X] to find that there exist x1 ∈ [0, log α] such that f(x1) = 0, and x2 ∈ [log α, X] such that f(x2) = 0, as required.

5.1 Closed bounded intervals map to closed bounded intervals

We can reformulate the theorems of sections 4 and 5 as follows.

Theorem 5.1.1. Let f : [a, b] → R be a real valued continuous map. Then f ([a, b]) = [m, M ]
for some m, M ∈ R.

That is, a continuous real-valued function maps a closed and bounded interval onto a closed
and bounded interval.

Proof. Let m := inf x∈[a,b] f (x) and M := supx∈[a,b] f (x). These exist by the theorem on
boundedness. Clearly f ([a, b]) ⊆ [m, M ].
By the theorem on the attainment of bounds, there exist ξ ∈ [a, b] and η ∈ [a, b] such that
f (ξ) = m and f (η) = M ; hence m, M ∈ f ([a, b]).
Now let y ∈ [m, M], so f(ξ) ≤ y ≤ f(η). By applying the IVT to f restricted to the interval [ξ, η] (or [η, ξ] as the case may be) we find an x ∈ [ξ, η] ⊆ [a, b] such that f(x) = y; hence y ∈ f([a, b]). Hence [m, M] ⊆ f([a, b]).

6 Monotonic Functions and Inverse Function Theorem

6.1 Monotone Functions

The following definitions require the ordered structure of real numbers, and so are not avail-
able for functions on a subset of the complex plane.

Definition 6.1.1. Let f be a real function on E ⊆ R.

(a) (i) We say that f is increasing if f(x) ≤ f(y) whenever x ≤ y.

(ii) We say that f is strictly increasing if f(x) < f(y) whenever x < y.

(b) (i) We say that f is decreasing if f(x) ≥ f(y) whenever x ≤ y.

(ii) We say that f is strictly decreasing if f(x) > f(y) whenever x < y.

A function is called monotone on E if it is increasing or decreasing on E.

6.2 Continuity of the Inverse Function

Recall that the inverse function was defined in ‘Introduction to Pure Mathematics’ last term.

Definition 6.2.1. Let f : A → B be a function. We say that ‘f is invertible’ if there exists a function g : B → A such that g(f(x)) = x for all x ∈ A and f(g(y)) = y for all y ∈ B. We then call g an inverse of f.

We have seen that continuous functions map intervals to intervals. We want to say something
about the inverse function when it exists. Note that any result about increasing functions
f can be translated into a result about decreasing functions by the simple expedient of
considering the function −f.
We will prove:

Theorem 6.2.2 (Inverse Function Theorem (IFT)). Let f be a strictly increasing and
continuous real function on [a, b]. Then f has a well-defined continuous inverse on [f (a), f (b)].

This is contained in the following theorem.

Theorem 6.2.3. Let f : [a, b] → R be strictly increasing and continuous on [a, b]. Then

(i) f ([a, b]) = [f (a), f (b)].

(ii) there exists a unique function g : [f (a), f (b)] → R such that g(f (x)) = x for all x ∈ [a, b]
and f (g(y)) = y for all y ∈ [f (a), f (b)];

(iii) g is strictly monotone increasing

(iv) g is continuous.

Proof. The first assertion is just Theorem 5.1.1, as in this case m = f(a) and M = f(b).
The second is straightforward: f : [a, b] → [f(a), f(b)] is now 1–1 and onto. So given y ∈ [f(a), f(b)] there exists a unique x ∈ [a, b] such that f(x) = y. Define g(y) = x. So the inverse function exists and is unique.
The third assertion is also straightforward. Assume there exist u, v ∈ [f(a), f(b)] with u < v but g(u) ≥ g(v). But as f is increasing this implies u = f(g(u)) ≥ f(g(v)) = v, a contradiction.
It is the fourth assertion that needs our attention. We must prove that for any y0 ∈
[f (a), f (b)] the function g is continuous at y0 .
Let y0 ∈ (f(a), f(b)). Given ε > 0, if necessary take ε smaller so that g(y0) + ε ∈ [a, b] and g(y0) − ε ∈ [a, b]. Choose δ = min{f(g(y0) + ε) − y0, y0 − f(g(y0) − ε)}. (Draw the graph of g(y) to see why we choose it like this.) Then

y0 − δ < y < y0 + δ
=⇒ f(g(y0) − ε) < y < f(g(y0) + ε)
=⇒ g(f(g(y0) − ε)) < g(y) < g(f(g(y0) + ε))
=⇒ g(y0) − ε < g(y) < g(y0) + ε

and g is continuous at y0 as required. The points y0 = f (a) and y0 = f (b) are similar.

Remark 6.2.4. Note that from Q1, problem sheet 3, if f : [a, b] → R is a continuous, 1–1
function with f (a) < f (b), then f is strictly increasing on [a, b]. So for the Inverse Function
Theorem (IFT) it is sufficient to assume that f : [a, b] → R is continuous and 1–1.

Note 6.2.5. I have not used the notation f −1 for the inverse function. If you do choose to
use it then you must make very clear what you intend the domains of f and f −1 to be. It
is not for nothing that the special notations ‘arcsin’ etc. exist! For example sine and cosine are only invertible on a part of their domain where they are increasing or decreasing.

6.3 Exponentials, Logarithms, Powers etc.

In the following I will consider the functions only on real domains. Some of the results extend
to complex domains.
Recall from Analysis I that functions such as exp(x), sin(x), cos(x), sinh(x) and cosh(x) etc. are defined by their power series, each of which has infinite radius of convergence. Later we
will see that a power series is continuous within its radius of convergence so each of these
functions is continuous on R. For each of them, if we take as domain a closed interval on
which the function is strictly monotone, then we can use the IFT to show the function has
a continuous inverse. (See also Problem sheet 3 Q3)
In particular we can therefore define the exponential function exp : R → R as exp(x) = Σ_{n≥0} x^n/n!. The following properties were proved in Analysis I (though some used results to be proved in this course):

1. exp′ (x) = exp(x);

2. exp(x) exp(y) = exp(x + y);

3. exp 0 = 1 and exp(−x) = 1/ exp(x);

4. exp(x) > 0;

5. exp is strictly increasing and exp : R → (0, ∞) is a bijection and hence invertible. The
inverse is denoted by log : (0, ∞) → R;

6. log(xy) = log(x) + log(y);


7. Let e denote the real number e = exp(1) = Σ_{n≥0} 1/n!; then log e = 1;

8. For any a > 0 and any x ∈ R define a^x = exp(x log a). Then a^{x+y} = a^x a^y. Also e^x = exp(x).

In addition:

9. As noted above exp is continuous. But we can also prove it directly:

Lemma 6.3.1. The function exp is continuous.

Proof. We have

| exp(x + h) − exp(x)| = exp(x)| exp(h) − 1|,

so for |h| < 1 we have, by the Triangle Law and the preservation of ≤ under limits,

| exp(x + h) − exp(x)| ≤ exp(x) Σ_{n≥1} |h|^n/n! ≤ exp(x) Σ_{n≥1} |h|^n = exp(x) |h|/(1 − |h|),

which tends to 0 as h → 0.

10. We can obtain numerous inequalities. For example if x > 0,

exp(x) = 1 + x + x²/2! + lim_{n→∞} Σ_{r=3}^{n} x^r/r! > 1 + x,

and hence also if x > 0,

exp(−x) < 1/(1 + x).

Note: We can also define exp : C → C by exp(z) = Σ_{n≥0} z^n/n!. The first 3 of the above properties also hold in C, and also exp(z) ≠ 0.

We can now apply the Inverse Function Theorem to get:

Lemma 6.3.2. For every y > 0 the function log is continuous at y.

Proof. We apply the theorem by finding an A > 0 such that 1/(1 + A) < y < 1 + A and considering exp : [−A, A] → [exp(−A), exp(A)], as, from (10) above, the image interval then contains y.

6.4 Left-hand and Right-hand limits

For functions defined on an interval, we may talk about right-hand and left-hand limits.

Definition 6.4.1. (i) Let f : [a, b) → R (or C) and p ∈ [a, b); and let l ∈ R (or l ∈ C). We
say that l is the right-hand limit of f at p if, ∀ε > 0, ∃δ > 0 such that

|f (x) − l| < ε ∀x ∈ [a, b) such that 0 < x − p < δ.

We write this as lim_{x→p+} f(x) = l; or as lim_{x→p, x>p} f(x) = l; or sometimes as f(p+) = l.

Similarly we have:
(ii) Let f : (a, b] → R (or C) and p ∈ (a, b]; and let l ∈ R (or l ∈ C). We say that l is the
left-hand limit of f at p if, ∀ε > 0, ∃δ > 0 such that

|f(x) − l| < ε ∀x ∈ (a, b] such that −δ < x − p < 0.

We write this as lim_{x→p−} f(x) = l; or as lim_{x→p, x<p} f(x) = l; or sometimes as f(p−) = l.

The following provides good practice in using the definitions.

Proposition 6.4.2. Let f : (a, b) → C and let p ∈ (a, b). Then the following are equivalent:

(i) limx→p f (x) = l;

(ii) Both limx→p+ f (x) = l and limx→p− f (x) = l.

Example 6.4.3. Consider the function f : R → R given by

f(x) = x if x ≥ 0; f(x) = x + 1 if x < 0.

Then f(0+) = 0 and f(0−) = 1. But limx→0 f(x) does not exist.

6.5 Left-continuity and Right-continuity

We translate the above definitions into ‘continuity’ language.

Definition 6.5.1. (i) We say f is right continuous at p if f(p+) = f(p).⁷


(ii) We say f is left continuous at p if f (p−) = f (p).

Again, for practice prove the following.

Proposition 6.5.2. Let f : (a, b) → R and let p ∈ (a, b). Then the following are equivalent:

(i) f is continuous at p;

(ii) f is both left-continuous at p and right-continuous at p.

Example 6.5.3. Again consider the function

f(x) = x if x ≥ 0; f(x) = x + 1 if x < 0.

Then at 0, f is right continuous but not left continuous. It is not continuous at 0.

6.6 Continuity of Monotone Functions

This will probably be omitted from lectures for lack of time.

We now discuss the continuity of monotone functions. Remember that any result about increasing
functions f can be translated into a result about decreasing functions by the simple expedient of
considering the function −f.

Theorem 6.6.1. Let f : (a, b) → R be an increasing function. Then for every x0 ∈ (a, b) the
right-hand limit f (x0 +) and the left-hand limit f (x0 −) of f at x0 exist.
Moreover, f(x0−) = sup_{a<x<x0} f(x), f(x0+) = inf_{x0<x<b} f(x) and

f(x0−) ≤ f(x0) ≤ f(x0+).


⁷ Note that we are saying that the limit exists and that it equals f(p).

Proof. By hypothesis, {f(x) : a < x < x0} is non-empty and is bounded above by f(x0), and therefore has a least upper bound A := sup_{a<x<x0} f(x). Then A ≤ f(x0). We have to show that f(x0−) = A.
Let ε > 0 be given. It follows from the definition of sup_{a<x<x0} f(x) that there is an xε ∈ (a, x0) such that

A − ε < f(xε) ≤ A.

As x0 − xε > 0, choose δ := x0 − xε. Then x ∈ (xε, x0) if and only if 0 < x0 − x < δ, and thus, as f is increasing,

A − ε < f(xε) ≤ f(x) ≤ A for all 0 < x0 − x < δ.

By definition f(x0−) = A and we are done.
The other inequality can be obtained by a similar argument (a good exercise); or by applying what we have done to the function −f(b − x) on (0, b − a) and juggling with the inequalities.

Remark 6.6.2. Informally we call the difference f (x0 +) − f (x0 −) the “jump” of f at x0 .

7 Limits at infinity and infinite limits

7.1 Limits at infinity: functions of a real variable

We want to extend our definition of the limit ‘limx→a f (x)’ to allow us to talk about the end
points of infinite intervals like (0, ∞).

Definition 7.1.1. Let f be a real or complex valued function defined on a subset E of R,


and let l ∈ R or l ∈ C as the case may be. Suppose that for every b ∈ R the set E ∩ (b, +∞)
is non-empty. We say that f (x) → l as x → +∞ if, ∀ε > 0, ∃B > 0 such that

|f (x) − l| < ε ∀x ∈ E such that x > B.

We write this as limx→+∞ f (x) = l.

Exercise 7.1.2. Make a similar definition for limx→−∞ f (x) = l.

Note 7.1.3. We will often just write ‘f (x) → l as x → ∞’ for ‘f (x) → l as x → +∞’. There
is a slight danger of confusion—see what we say about functions of a complex variable—but
if we take care it will be all right.

7.3 Limits at infinity: functions of a complex variable

Definition 7.3.1. Let f be a real or complex valued function defined on a subset E of C,


and let l ∈ R or l ∈ C as the case may be. Suppose that for every b ∈ R there are points
z ∈ E such that |z| > b. We say that f (z) → l as z → ∞ if, ∀ε > 0, ∃B > 0 such that

|f (z) − l| < ε ∀z ∈ E such that |z| > B.

We write this as limz→∞ f (z) = l.

Note that there may be a mild inconsistency with the previous definition if E ⊆ R. If we are
thinking ‘complex’ we’ll need both the real limits at ±∞ to be equal.

Example 7.3.2. Consider sin z/z as z → ∞. For real values z = x we get that |sin x/x| ≤ 1/|x| → 0 as x → ∞. But for pure imaginary values like zk = 2πik, with k ∈ Z, we'll get that

|sin zk/zk| = (e^{2πk} − e^{−2πk})/(4πk) → ∞ as k → ∞.
Exercise 7.3.3. Write down the contrapositive of ‘f tends to a limit as z → ∞’.

7.4 Tending to infinity. . .

Very briefly we discuss ‘infinite limits’. We must take great care not to deceive ourselves: in
neither R nor C is there a number ∞.

Definition 7.4.1. Let f : E → R be a real valued function on a subset of R or C and let p


be a limit point of E. We say that f (z) tends to +∞ as z → p if ∀B > 0, ∃δ > 0 such that

f (z) > B ∀z ∈ E such that 0 < |z − p| < δ.

We may write this as f (z) → +∞ as z → p.

Exercise 7.4.2. Make a similar definition for f (z) → −∞ as z → p.

For complex valued functions things are easier:

Definition 7.4.3. Let f : E → C be a complex valued function on a subset of R or C and let p be a limit point of E. We say that f(z) tends to ∞ as z → p if ∀B > 0 ∃δ > 0 such that

|f (z)| > B ∀z ∈ E such that 0 < |z − p| < δ.

We may write this as f (z) → ∞ as z → p.

7.5 Euler’s Limit

We prove the following result.

Proposition 7.5.1. The limits lim_{x→∞} (1 + 1/x)^x and lim_{x→−∞} (1 + 1/x)^x exist and are both equal to e.

Proof. First limit: By the continuity of exp, from Problem Sheet 3, Q4b, it is enough to prove that lim_{x→∞} x log(1 + 1/x) = 1, or by AOL that lim_{x→∞} 1/(x log(1 + 1/x)) = 1. Write y = log(1 + 1/x); then

1/(x log(1 + 1/x)) − 1 = (exp(y) − 1 − y)/y.

Note that as 1 + 1/x > 1 for x > 0, we have y > 0, and then

0 ≤ (exp(y) − 1 − y)/y = (Σ_{n≥2} y^n/n!)/y ≤ (Σ_{n≥2} y^n)/y = y/(1 − y).

So if we can show that y → 0 as x → ∞ we are done.
Let ε > 0. By continuity of log at 1 we can find δ such that |log t| < ε for t ∈ (1, 1 + δ). Take K = 1/δ. Then, as x > K implies 1/x < δ,

|y| = |log(1 + 1/x)| < ε ∀x > K.

A similar argument will deal with the other limit.
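The convergence is easy to watch numerically; a Python sketch (illustration only):

```python
# Illustration only: (1 + 1/x)^x tends to e as x -> +infinity, and
# (1 - 1/t)^(-t) (i.e. x -> -infinity with x = -t) tends to e as well.
import math

for x in (10.0, 1e3, 1e6):
    print(x, (1 + 1 / x) ** x, (1 - 1 / x) ** (-x))
print("e =", math.e)
```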

8 Uniform Convergence

8.1 Motivation

Let E ⊆ R (or C), and let p ∈ E be a limit point, so that p = limx→p x. We have seen that
‘continuity at p’ is exactly the right condition to ensure that

lim_{x→p} f(x) = f(lim_{x→p} x),

that is to ensure that ‘taking the limit limx→p ’ and ‘finding the value under f ’ can be
interchanged.

There are many other situations in which we would like to understand whether the order in which we perform two mathematical operations is significant or not:

(i) Suppose we have not just a single function f on E but a whole sequence {fn }. When is
limn→∞ limx→p fn (x) = limx→p limn→∞ fn (x)?

(ii) Similarly, when is lim_{x→p} Σ_{n=0}^{∞} fn(x) = Σ_{n=0}^{∞} lim_{x→p} fn(x)?

(iii) Once we have defined derivatives and integrals (as limits) we will want to know when limn→∞ fn′(x) = (limn→∞ fn(x))′, and when limn→∞ ∫_a^b fn(t) dt = ∫_a^b limn→∞ fn(t) dt?

The answers to some of these questions are given in this lecture and the next.
To see that there are non-trivial problems we look at one typical example.

Example 8.1.1. Consider the sequence of functions {fn}, where fn : [0, 1] → R is given by

fn(x) := −nx + 1 if 0 ≤ x < 1/n; fn(x) := 0 if x ≥ 1/n.

Consider also the function f : [0, 1] → R given by

f(x) := 1 if x = 0; f(x) := 0 if x > 0.

Sketch their graphs, and note that for all x ∈ [0, 1] we have that f(x) = limn→∞ fn(x).

Note that although all the fn are continuous the limit function f is not continuous at 0.
Also,

lim_{x→0} lim_{n→∞} fn(x) = lim_{x→0} f(x) = 0

while

lim_{n→∞} lim_{x→0} fn(x) = lim_{n→∞} 1 = 1

so that

lim_{x→0} lim_{n→∞} fn(x) ≠ lim_{n→∞} lim_{x→0} fn(x).
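A short Python sketch of this failure to interchange limits (illustration only): for each fixed x > 0 the values fn(x) eventually vanish, yet each fn still takes the value 1 at 0, and its supremum over [0, 1] stays equal to 1.

```python
# Illustration only: fn(x) = max(1 - n*x, 0) from Example 8.1.1.
# Pointwise: fn(x) -> 0 for x > 0 while fn(0) = 1 for every n.
def f_n(n, x):
    return max(1 - n * x, 0.0)

for n in (1, 10, 100, 1000):
    sup = max(f_n(n, k / 10000) for k in range(10001))
    print(n, f_n(n, 0.0), f_n(n, 0.01), sup)
```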

8.2 Definition

Just as when we defined ‘uniform continuity’ as a stronger version of ‘continuous at all points’
by insisting on being able to choose one ‘δ’ to deal with all points, so we now strengthen our
definition of ‘convergent’.
So let E ⊆ R (or C) and let fn : E → R (or C) be a sequence of functions. Then for each (fixed) x ∈ E, {fn(x)} is a sequence of real (or complex) numbers. If this sequence converges for every x ∈ E, then the limit depends on x, so we will call it f(x). Thus f : E → R (or C) is a function. Hence we have the definition (using Analysis I):
Definition 8.2.1. By fn converges to f on E we mean that ∀x ∈ E, and ∀ε > 0, ∃N ∈ N
such that
|fn (x) − f (x)| < ε ∀n > N.

So, of course, in general N depends on x. For ‘uniform convergence’ we insist that one N
works for all x.
Definition 8.2.2. By fn converges uniformly to f on E we mean that ∀ε > 0, ∃N ∈ N
such that
|fn (x) − f (x)| < ε ∀n > N and ∀x ∈ E.
We write this as ‘fn → f uniformly on E’ or ‘fn →ᵘ f’.

It is trivial to see that:


Proposition 8.2.3. If the sequence {fn } converges uniformly to f on E then at every point
x ∈ E we have that the sequence {fn (x)} converges to f (x).

There is one special case which we should single out. Suppose that for each n ∈ N we have that sn(x) = Σ_{k=0}^{n} fk(x), and suppose that s : E → R (or C). If we apply the definition to the sequence {sn} and the function s we will get:
Remark 8.2.4. We say that the series Σ fn converges uniformly to s on E if ∀ε > 0, ∃N ∈ N such that

|Σ_{k=0}^{n} fk(x) − s(x)| < ε ∀n > N and ∀x ∈ E.

We may write this as ‘Σ_{n=0}^{∞} fn(x) = s(x) (uniformly on E)’.

8.3 Test for Uniform Convergence

We can re-express the definition in a more practical way:

Theorem 8.3.1. Let E be a non-empty subset of R or C. Let fn, f : E → R (or C). Then the following are equivalent:

(i) fn → f uniformly on E;

(ii) ∃N s.t. ∀n > N the real numbers mn := sup_{x∈E} |fn(x) − f(x)| exist, and moreover mn → 0 as n → ∞.

Proof. (=⇒) Suppose fn → f uniformly on E; then (by definition) ∀ε > 0, ∃N ∈ N such that

|fn(x) − f(x)| < ½ε ∀x ∈ E and ∀n > N.

Hence, for each n > N, ½ε is an upper bound of the set {|fn(x) − f(x)| : x ∈ E}. Then the least upper bounds satisfy

mn = sup_{x∈E} |fn(x) − f(x)| ≤ ½ε < ε ∀n > N.

By the definition of sequence limits, limn→∞ mn = 0.
(⇐=) Suppose the mn exist for all n > N1, and that limn→∞ sup_{x∈E} |fn(x) − f(x)| = 0. Then ∀ε > 0 ∃N > N1 such that

sup_{x∈E} |fn(x) − f(x)| < ε ∀n > N.

Therefore

|fn(x) − f(x)| ≤ sup_{x∈E} |fn(x) − f(x)| < ε ∀x ∈ E and ∀n > N.

This is the definition that fn → f uniformly on E.

Example 8.3.2. Let E = [0, 1) and let fn(x) = x^n. Clearly limn→∞ fn(x) = 0, so f(x) = 0. Then mn = sup_{x∈E} |x^n − 0| = sup_{x∈E} x^n. But xn := (1/2)^{1/n} ∈ E and fn(xn) = 1/2, so that
mn ≥ fn(xn) = 1/2 ↛ 0 as n → ∞,
so fn is not uniformly convergent on [0, 1).
However, if instead we consider E = [0, r], where 0 < r < 1 is a fixed constant, then x^n → 0 uniformly on E, because now
mn = sup_{[0,r]} x^n ≤ r^n → 0 as n → ∞.
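The sup test is easy to explore numerically. Here is a small Python sketch (an illustration only, not part of the course; it samples a finite grid, so it merely estimates the supremum) contrasting E = [0, 1) with E = [0, 0.9]:

```python
import numpy as np

def m_n(n, xs):
    # estimate m_n = sup |f_n(x) - f(x)| for f_n(x) = x^n and f = 0
    # by taking the maximum over the sample points xs
    return np.max(np.abs(xs ** n))

xs_open = np.linspace(0, 1, 10_000, endpoint=False)  # grid in [0, 1)
xs_r = np.linspace(0, 0.9, 10_000)                   # grid in [0, 0.9]

for n in (1, 5, 25, 125):
    print(n, m_n(n, xs_open), m_n(n, xs_r))
# the first column of estimates stays near 1 (no uniform convergence),
# while the second decays like 0.9^n (uniform convergence on [0, 0.9])
```

For E = [0, 1) the true supremum is in fact 1 for every n, which is why mn ↛ 0 there.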

Remark 8.3.3. The test is practical on a closed and bounded interval E = [a, b] in cases where the functions fn and f are differentiable. In such cases the supremum will be achieved either at a, or at b, or at some interior point where d(fn(x) − f(x))/dx = 0. We will prove this later in the course; for the moment use it in exercises.⁸

⁸ Of course we will not use it in building up the theory.

8.4 Cauchy’s Criterion

Just as we found for sequences of numbers, there is a characterisation of uniform convergence which does not depend on knowing the limit function.

Theorem 8.4.1 (Cauchy's Criterion for Uniform Convergence). Let E ⊆ R (or C) and let fn : E → R (or C). Then fn converges uniformly on E if and only if ∀ε > 0, ∃N ∈ N such that

|fn(x) − fm(x)| < ε ∀n, m ≥ N and ∀x ∈ E. (∗)

Proof. (⇒) Suppose fn converges uniformly on E with limit function f. Then ∀ε > 0, ∃N ∈ N such that
|fn(x) − f(x)| < ε/2 ∀n ≥ N and ∀x ∈ E.
So, ∀x ∈ E and ∀n, m ≥ N,
|fn(x) − fm(x)| ≤ |fn(x) − f(x)| + |fm(x) − f(x)| < ε/2 + ε/2 = ε.

(⇐) Conversely, suppose (∗) holds. Then for any x ∈ E, {fn(x)} is a Cauchy sequence, so it is convergent. Let us denote its limit by f(x). For every ε > 0, choose N ∈ N such that
|fn(x) − fm(x)| < ε/2 ∀n, m ≥ N and ∀x ∈ E.
For any fixed n ≥ N and x ∈ E, letting m → ∞ in the above inequality we obtain, by the preservation of weak inequalities, that
|fn(x) − f(x)| = lim_{m→∞} |fn(x) − fm(x)| ≤ ε/2 < ε.
According to the definition, fn → f uniformly on E.


Corollary 8.4.2 (Cauchy's criterion for uniform convergence of series). The series Σ_{n=0}^∞ fn is uniformly convergent on E if and only if ∀ε > 0, ∃N ∈ N such that
|Σ_{k=m+1}^n fk(x)| < ε ∀n > m ≥ N and ∀x ∈ E.

8.5 The M -test

As a consequence, we prove the following simple but important uniform convergence test for
series.
Theorem 8.5.1 (The Weierstrass M-Test). Let E ⊆ R (or C) and fn : E → R (or C). Suppose that there is a sequence {Mn} of real numbers such that

|fn(x)| ≤ Mn ∀x ∈ E.

If Σ_{n=0}^∞ Mn converges then Σ_{n=0}^∞ fn converges uniformly on E.

Note that the Mn must be independent of x.
Proof. By Cauchy's Criterion for the convergence of Σ Mn we have that ∀ε > 0, ∃N ∈ N such that
Σ_{k=m+1}^n Mk < ε ∀n > m ≥ N.
Now by the Triangle Law
|Σ_{k=m+1}^n fk(x)| ≤ Σ_{k=m+1}^n |fk(x)| ≤ Σ_{k=m+1}^n Mk < ε ∀n > m ≥ N and ∀x ∈ E,
which is Cauchy's criterion for the uniform convergence of the series.


Corollary 8.5.2. Suppose the conditions for the M-test hold, and Σ Mn is convergent. Then
|Σ_{n=0}^∞ fn(x)| ≤ Σ_{n=0}^∞ |fn(x)| ≤ Σ_{n=0}^∞ Mn ∀x ∈ E.

Proof. Apply the preservation of weak inequalities as N → ∞ to the obvious inequalities
|Σ_{n=0}^N fn(x)| ≤ Σ_{n=0}^N |fn(x)| ≤ Σ_{n=0}^N Mn ∀x ∈ E.

9 Uniform Convergence: Examples and Applications

9.1 Examples

Example 9.1.1. Let E = [0, 1] and let
fn(x) = x/(1 + n²x²).
Then clearly limn→∞ fn(x) = 0 for every x ∈ E. Using (1 − nx)² ≥ 0 we can see that
0 ≤ fn(x) = (1/2n) · 2nx/(1 + n²x²) ≤ 1/2n → 0
and so get (by looking at 'sups') that fn → f uniformly on [0, 1].

Example 9.1.2. Let E = [0, 1] and let
fn(x) = nx/(1 + n²x²).
Then clearly limn→∞ fn(x) = 0 for every x ∈ [0, 1]. But fn(1/n) = 1/2, so that
sup_{x∈[0,1]} |fn(x) − f(x)| ≥ 1/2 ↛ 0 as n → ∞,
and so fn converges to 0 but not uniformly on [0, 1].


Example 9.1.3. Σ_{n=0}^∞ x^n converges to 1/(1 − x) on (−1, 1), but not uniformly.
From Analysis I, sn(x) = Σ_{k=0}^n x^k = (1 − x^{n+1})/(1 − x) tends to 1/(1 − x) for any |x| < 1. On the other hand
|sn(x) − 1/(1 − x)| = |x|^{n+1}/|1 − x|,
so that (look at x = (n+1)/(n+2))
sup_{x∈(−1,1)} |sn(x) − 1/(1 − x)| ≥ ((n+1)/(n+2))^{n+1} / (1 − (n+1)/(n+2)) = (n+2)/(1 + 1/(n+1))^{n+1} → ∞.
Hence Σ_{n=0}^∞ x^n doesn't converge uniformly.
Example 9.1.4. Σ_{n=0}^∞ x^n converges uniformly on [−r, r] for any 0 < r < 1.

This follows from the M-test with Mn := r^n.
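A numerical sanity check of the last two examples (a sketch only; a grid maximum stands in for the supremum): on [−r, r] the M-test gives the explicit tail bound Σ_{k=n+1}^∞ r^k = r^{n+1}/(1 − r), and the observed error respects it.

```python
import numpy as np

r = 0.8
xs = np.linspace(-r, r, 2001)

def sup_error(n):
    # grid estimate of sup over [-r, r] of |s_n(x) - 1/(1-x)|,
    # where s_n(x) = sum_{k=0}^{n} x^k
    s = sum(xs ** k for k in range(n + 1))
    return np.max(np.abs(s - 1 / (1 - xs)))

for n in (5, 10, 20, 40):
    print(n, sup_error(n), r ** (n + 1) / (1 - r))  # error <= M-test tail bound
```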

9.2 Uniform Convergence preserves continuity

We have already seen that the limit of a sequence of continuous functions may not be continuous. This theorem tells us that 'uniformity' gives us the extra condition we need.
Theorem 9.2.1. Let fn, f : E → R (or C), and fn → f uniformly on E. Suppose all fn are continuous at x0 ∈ E. Then the limit function f is also continuous at x0, so that
lim_{x→x0} lim_{n→∞} fn(x) = lim_{n→∞} fn(x0) = lim_{n→∞} lim_{x→x0} fn(x).

Proof. ∀ε > 0, ∃N ∈ N s.t.
|fn(x) − f(x)| < ε/3 ∀n ≥ N and ∀x ∈ E.
Since fN+1 is continuous at x0, ∃δ > 0 (depending on x0 and ε) such that
|fN+1(x) − fN+1(x0)| < ε/3 for all |x − x0| < δ.
Hence, if |x − x0| < δ then by the Triangle Law
|f(x) − f(x0)| ≤ |f(x) − fN+1(x)| + |fN+1(x) − fN+1(x0)| + |fN+1(x0) − f(x0)| < ε/3 + ε/3 + ε/3 = ε.
By definition, f is continuous at x0.

Note it is very important that N + 1 is fixed, so that δ does not depend on n.
Remark 9.2.2 (Version for series). If Σ_{n=0}^∞ fn converges uniformly on E and every fn is continuous at x0 ∈ E, then
lim_{x→x0} Σ_{n=0}^∞ fn(x) = Σ_{n=0}^∞ fn(x0).
In particular, if fn is continuous on E for all n and Σ_{n=0}^∞ fn converges uniformly on E, then Σ_{n=0}^∞ fn is continuous on E.

9.3 Power Series

We can apply the results of the previous subsection to the important case of power series.
Theorem 9.3.1 (Continuity of Power Series). Suppose the radius of convergence of the power series Σ_{n=0}^∞ an x^n is R, where 0 ≤ R ≤ ∞. Then for every 0 ≤ r < R, Σ_{n=0}^∞ an x^n converges uniformly on the closed disk {x : |x| ≤ r}. Therefore, Σ_{n=0}^∞ an x^n is continuous on the open disk {x : |x| < R}.

Proof. According to the definition of 'radius of convergence', Σ_{n=0}^∞ an x^n is absolutely convergent for |x| < R. In particular, Σ_{n=0}^∞ |an| r^n is convergent. Since
|an x^n| ≤ |an| r^n for all x such that |x| ≤ r,
we have, by the Weierstrass M-test, that Σ_{n=0}^∞ an x^n converges uniformly on {x : |x| ≤ r}. But an x^n is continuous for any n ∈ N. So, for any r < R, Σ_{n=0}^∞ an x^n is continuous for |x| ≤ r, and hence on the open disk {x : |x| < R}.

Note 9.3.2. Note that this says nothing about convergence or continuity at the end-points. If you are interested, subsection 9.5 deals with this in the real case.

Corollary 9.3.3. The functions exp x, sin x, cos x, cosh x and sinh x can all be defined by
power series with infinite radius of convergence so are all continuous on C.

9.4 Integrals and derivatives of sequences

Next term, in the course Analysis III, you will learn how to define integrals, and the proofs
of the following theorems will be given.
Theorem 9.4.1. If fn → f uniformly on [a, b] and if every fn is continuous, then
∫_a^b f = ∫_a^b lim_{n→∞} fn = lim_{n→∞} ∫_a^b fn.

Similarly, if the series Σ_{n=1}^∞ fn converges uniformly on [a, b] and if all fn are continuous, then we may integrate the series term by term:
∫_a^b Σ_{n=1}^∞ fn = Σ_{n=1}^∞ ∫_a^b fn.

Note 9.4.2. However, uniform convergence is not the 'right' condition for integrating a series term by term: we can exchange the order of integration ∫_a^b (which involves a limiting procedure) and limn→∞ under much weaker conditions. The search for correct conditions for term-by-term integration led to the discovery of Lebesgue integration [Part A option: Integration].
Theorem 9.4.3. Let fn(x) → f(x) for each x ∈ [a, b]. Suppose fn′ exists and is continuous on [a, b] for every n, and that fn′ → g uniformly on [a, b]. Then f′ exists and is continuous on [a, b], and
(d/dx) lim_{n→∞} fn(x) = lim_{n→∞} (d/dx) fn(x).

Similarly, if Σ fn converges on [a, b], and if every fn′ exists and is continuous on [a, b], and if Σ fn′ converges uniformly on [a, b], then
(d/dx) Σ_{n=1}^∞ fn = Σ_{n=1}^∞ fn′.
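The uniform convergence of the fn′ really is needed. The following standard counterexample (my addition, not from the printed notes) has fn → 0 uniformly while fn′ does not converge to 0′ = 0:

```python
import numpy as np

xs = np.linspace(0, np.pi, 1001)

for n in (1, 10, 100):
    fn = np.sin(n * xs) / n   # sup |f_n| = 1/n -> 0, so f_n -> 0 uniformly
    dfn = np.cos(n * xs)      # f_n'(x) = cos(nx)
    print(n, np.max(np.abs(fn)), dfn[0])
# f_n'(0) = 1 for every n, so lim f_n' is not (lim f_n)' = 0 at x = 0:
# here the sequence f_n' fails to converge uniformly (indeed at all)
```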

9.5 The end points

This section is likely to be omitted for lack of time.

When 0 < R < ∞ the points where |z| = R need to be handled differently. We only deal with the
real case, so there are two such points R and −R. Scaling (replacing x by x/R or −x/R) lets us deal
only with power series where the radius is 1 and describe what happens at x = 1.
P∞
Theorem 9.5.1 (Abel’s Continuity Theorem).P∞ Suppose that the series n=0 an xn has radius of
convergence R = 1. Suppose further that n=0 an converges.
P∞
Then n=0 an xn converges uniformly on [0, 1].
P∞
Consequently, n=0 an xn is continuous on (−1, 1], and in particular

X ∞
X
lim an xn = an .
x→1−
n=0 n=0

Proof. First note that our general result gives continuity on (−1, 1); it is only the point x = 1 we have to deal with. We will get continuity provided we get uniform convergence on [0, 1].
By Cauchy's Criterion for the convergent series Σ_{n=0}^∞ an we have that, for every ε > 0, there is N such that, for every n ≥ m ≥ N, we have
|Σ_{k=m}^n ak| < ε.
Now fix m ≥ N, and for the partial sums from m use the notation
Ak = Σ_{j=m}^k aj for k ≥ m; and A_{m−1} = 0,
noting that subtracting consecutive sums gives us back the original sequence⁹:
ak = Ak − A_{k−1}.
⁹ Think 'Differentiation undoes Integration'.

By what we have from the Cauchy Criterion above, |Ak| < ε whenever k ≥ m − 1. We have by elementary algebra the following formula¹⁰:
Σ_{k=m}^n ak x^k = Σ_{k=m}^n (Ak − A_{k−1}) x^k
 = Σ_{k=m}^n Ak x^k − Σ_{k=m}^n A_{k−1} x^k
 = Σ_{k=m}^{n−1} Ak (x^k − x^{k+1}) + An x^n.
Hence, by the Triangle Law, we have that
|Σ_{k=m}^n ak x^k| ≤ Σ_{k=m}^{n−1} |Ak| (x^k − x^{k+1}) + |An| x^n
 < ε Σ_{k=m}^{n−1} (x^k − x^{k+1}) + ε x^n
 = ε x^m
 ≤ ε
for any x ∈ [0, 1].


The Cauchy Criterion now yields that Σ_{n=0}^∞ an x^n is uniformly convergent on [0, 1].

9.6 Monotone Sequences of Continuous Functions

This section is likely to be omitted for lack of time.

The theorem of this subsection is a partial converse of our theorem that ‘uniform convergence preserves
continuity’; if the sequence is monotone then the continuity of the limit will give uniformity of
convergence.

Theorem 9.6.1 (The Dini Theorem). Let fn be a sequence of real continuous functions on [a, b], and let f be a real continuous function on [a, b].
Suppose that
lim_{n→∞} fn(x) = f(x) for every x ∈ [a, b]
and that
fn(x) ≥ fn+1(x) for all n and for all x ∈ [a, b].
Then fn → f uniformly on [a, b].

Proof. Let gn(x) = fn(x) − f(x). Then gn is continuous for every n, gn ≥ 0, and limn→∞ gn(x) = 0 for any x ∈ [a, b]. Suppose {gn} were not uniformly convergent on [a, b]. Write down the contrapositive to see that for some ε > 0 and every natural number k there exists a natural number nk ≥ k and a point xk ∈ [a, b] such that
|g_{nk}(xk)| = g_{nk}(xk) ≥ ε.
¹⁰ This is called Abel's summation formula—think 'integration by parts'.

We may choose nk so that k ↦ nk is increasing. We may assume that xk → p—otherwise use the Bolzano–Weierstrass theorem to extract a convergent subsequence of {xk} and use it instead. Then p ∈ [a, b]. For any (fixed) k, since {gn} is decreasing,
ε ≤ g_{nl}(xl) ≤ g_{nk}(xl)
for all l ≥ k. Letting l → ∞ in the above inequality, we obtain
ε ≤ lim_{l→∞} g_{nk}(xl) = g_{nk}(p),
as g_{nk} is continuous at p. This contradicts the assumption that lim_{k→∞} g_{nk}(p) = 0.
Example 9.6.2. Let fn(x) = 1/(1 + nx) for x ∈ (0, 1). Then limn→∞ fn(x) = 0 for every x ∈ (0, 1) and fn is decreasing in n, but fn does not converge uniformly. Dini's theorem doesn't apply, as (0, 1) is not compact.
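Numerically (an illustration only) the failure is easy to see: the supremum of fn(x) = 1/(1 + nx) over (0, 1) stays near 1 for every n, because the 'bad' points slide towards the missing endpoint 0.

```python
import numpy as np

xs = np.linspace(1e-9, 1, 100_000)  # a grid inside (0, 1)

for n in (1, 10, 100, 10_000):
    fn = 1 / (1 + n * xs)
    print(n, np.max(fn))  # close to 1 for every n: the convergence is not uniform
```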

10 Differentiation: definitions and elementary results

10.1 Definitions

In this course we only study differentiability for real (or complex)-valued functions on E,
where E is a subset of the real line R. The theory of the differentiability of complex valued
functions on the complex plane C is very different from the real case and requires another
theory—See Complex Analysis [Part A: Analysis].

Definition 10.1.1. Let f : (a, b) → R (or C), and let x0 ∈ (a, b). By f is differentiable at x0 we mean that the following limit exists:
lim_{x→x0} (f(x) − f(x0))/(x − x0).
When it exists we denote the limit by f′(x0), which we call the derivative of f at x0.

[That is, ∀ε > 0 ∃δ > 0 such that
|(f(x) − f(x0))/(x − x0) − f′(x0)| < ε ∀x ∈ (a, b) such that 0 < |x − x0| < δ.]

For example, it is easy to see that the function x ↦ x is differentiable at every point of R and has derivative f′(x0) = 1 at every point; and the function t ↦ e^{2πit} is differentiable at every point, although we can't yet prove that.
Sometimes it is helpful to also define ‘left-hand’ and ‘right-hand’ versions of these.

Definition 10.1.2. (i) Let f : [a, b) → R (or C), and let x0 ∈ [a, b). We say that f has a right-derivative at x0 if the following limit exists:
lim_{x→x0+} (f(x) − f(x0))/(x − x0).
If the limit exists we denote it by f+′(x0).

(ii) Let f : (a, b] → R (or C), and let x0 ∈ (a, b]. We say that f has a left-derivative at x0 if the following limit exists:
lim_{x→x0−} (f(x) − f(x0))/(x − x0).
If the limit exists we denote it by f−′(x0).

The following result is easily proved (compare what we did for left- and right-continuity).

Proposition 10.1.3. Let f : (a, b) → R (or C). Then the following are equivalent:

(a) f is differentiable at x0 and f ′ (x0 ) = l;

(b) f has both left- and right-derivatives at x0 , and f−′ (x0 ) = l = f+′ (x0 ).

Definition 10.1.4. (i) Suppose that f : (a, b) → R (or C). Then we say that f is differentiable on (a, b) if f is differentiable at every point of (a, b).
(ii) Suppose that f : [a, b] → R (or C). Then we say that f is differentiable on [a, b] if f is differentiable at every point of (a, b), and if f+′(a) and f−′(b) exist.

If you wish you can define differentiable on (a, b] and [a, b) as well.

Remark 10.1.5. Let y = f(x). There are other notations for derivatives:
dy/dx or df(x0)/dx [G. W. Leibnitz];
y′ or f′(x0) [J. L. Lagrange];
Dy or Df(x0) [A. L. Cauchy, in particular for vector-valued functions of several variables].

10.2 An Example

Define a function f : R → R by
f(x) = x² sin(1/x) for x > 0, and f(x) = 0 for x ≤ 0.
Then we can show that
f′(x) = 0 when x < 0; f′(0) = 0; and f′(x) = 2x sin(1/x) − cos(1/x) when x > 0.
The derivative for x ≤ 0 can be found directly from the definition. Later we will see that we can use the chain rule to find the derivative for x > 0.
Note that the derivative is not continuous at the origin. (See problem sheet 5.)

We can get other interesting examples by replacing the 'x²' by x^α and the '1/x' by 1/x^β.
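A numerical look at the discontinuity of the derivative (an illustration, not a proof): along the points xk = 1/(kπ) we have f′(xk) = 2xk sin(kπ) − cos(kπ) = −(−1)^k, so f′ keeps oscillating between values near ±1 as x → 0+, even though f′(0) = 0.

```python
import numpy as np

def fprime(x):
    # derivative of x^2 sin(1/x) for x > 0
    return 2 * x * np.sin(1 / x) - np.cos(1 / x)

for k in (10, 11, 100, 101, 1000, 1001):
    x = 1 / (k * np.pi)
    print(k, x, fprime(x))  # alternates between approximately -1 and +1
```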

10.3 Derivatives and differentials

By looking at the definition of 'limit' in terms of ε and δ (see problem sheet) we can easily prove that:

Proposition 10.3.1. Suppose that f : (a, b) → R is differentiable at x0 ∈ (a, b) and that


f ′ (x0 ) > 0. Then there exists a δ > 0 such that for all x ∈ (x0 , x0 + δ) we have that
f (x) > f (x0 ), and for all x ∈ (x0 − δ, x0 ) we have that f (x) < f (x0 ).

We have corollaries like:

Corollary 10.3.2. Suppose that f : [a, b) → R is right-differentiable at x0 ∈ [a, b) and that


f+′ (x0 ) > 0. Then there exists a δ > 0 such that for all x ∈ (x0 , x0 + δ) we have that
f (x) > f (x0 ).

In fact, if f is differentiable at x0 , then the ‘increment’ of f near x0 can be expressed

f (x) − f (x0 ) = f ′ (x0 ) (x − x0 ) + o(x, x0 )

where o is a function of x and x0 satisfying
lim_{x→x0} o(x, x0)/(x − x0) = 0.

That is, the ‘linear part’ of the increment f (x) − f (x0 ) is f ′ (x0 ) (x − x0 ); all the rest is
small in comparison. This is sometimes called the differential of f at x0 . It is the first
approximation to f near x0 .

10.4 Differentiability and Continuity

Theorem 10.4.1 (Differentiability =⇒ Continuity). Let f : (a, b) → R (or C). If f is


differentiable at x0 ∈ (a, b) then f is continuous at x0 .

Proof. Since
lim_{x→x0} (f(x) − f(x0)) = lim_{x→x0} [(f(x) − f(x0))/(x − x0)] (x − x0)
 = lim_{x→x0} (f(x) − f(x0))/(x − x0) · lim_{x→x0} (x − x0)  [by AOL]
 = f′(x0) × 0 = 0.
Therefore lim_{x→x0} f(x) = f(x0), so that, by definition, f is continuous at x0.

Note: The converse is not true. For example |x| is continuous but is not differentiable at 0.
In fact there exist functions which are continuous everywhere, but not differentiable at any
point! (See Bartle and Sherbert. )

10.5 Algebraic properties

The following results are straightforward consequences of the Algebra of Limits. They let us
build up at once all the calculus we learned at school—provided we can differentiate a few
standard functions (constants, linear functions, exp, sin and cos).

Theorem 10.5.1. Suppose f, g : (a, b) → R (or C) are both differentiable at x0 ∈ (a, b), and λ, µ ∈ R (or C).
(i) [Linearity of differentiation] λ·f + µ·g is differentiable at x0 and
(λ·f + µ·g)′(x0) = λ·f′(x0) + µ·g′(x0).
(ii) [The Product Rule] fg : x ↦ f(x)g(x) is differentiable at x0 and
(fg)′(x0) = f(x0)g′(x0) + f′(x0)g(x0).
(iii) [The Quotient Rule] Suppose g(x0) ≠ 0. Then x ↦ f(x)/g(x) is differentiable at x0 and
(f/g)′(x0) = (f′(x0)g(x0) − f(x0)g′(x0)) / g(x0)².

Proof. (ii) Apply AOL to
(f(x)g(x) − f(x0)g(x0))/(x − x0) = f(x) · (g(x) − g(x0))/(x − x0) + g(x0) · (f(x) − f(x0))/(x − x0).
Let x → x0 and use the definitions of f′(x0) and g′(x0), and the continuity of f at x0, so that f(x) → f(x0).
(iii) See problem sheet 5.

10.6 The Chain Rule

Theorem 10.6.1 (The Chain Rule). Suppose f : (a, b) → R, and that g : (c, d) → R.
Suppose that f ((a, b)) ⊆ (c, d), so that g ◦ f : (a, b) → R is defined.
Suppose further that f is differentiable at x0 ∈ (a, b), and that g is differentiable at f (x0 ).
Then g ◦ f is differentiable at x0 and

(g ◦ f )′ (x0 ) = g ′ (f (x0 )) f ′ (x0 ).

Proof. Write y0 = f(x0), and define a function v on (c, d) by
v(y) = (g(y) − g(y0))/(y − y0) − g′(y0) for all y ≠ y0, and v(y0) = 0.
Note that v(y) → 0 as y → y0, so that v is continuous at y0.

Rewriting the definition of v we see that we have an expression for the increment
g(y) − g(y0) = (y − y0)(g′(y0) + v(y)),
valid for any y ∈ (c, d). In particular
g(f(x)) − g(f(x0)) = (f(x) − f(x0))(g′(y0) + v(f(x))),
so that
(g(f(x)) − g(f(x0)))/(x − x0) = g′(y0) · (f(x) − f(x0))/(x − x0) + v(f(x)) · (f(x) − f(x0))/(x − x0).
Since f is differentiable at x0, f is continuous at x0. But v is continuous at y0 = f(x0) and hence v(f(x)) is continuous at x0. Thus v(f(x)) → 0 as x → x0. Letting x → x0 we obtain, using AOL,
lim_{x→x0} (g(f(x)) − g(f(x0)))/(x − x0) = g′(y0) lim_{x→x0} (f(x) − f(x0))/(x − x0) + lim_{x→x0} v(f(x)) · lim_{x→x0} (f(x) − f(x0))/(x − x0) = g′(y0)f′(x0) + 0 × f′(x0) = f′(x0)g′(y0).

10.7 Higher Derivatives

Suppose that f : (a, b) → R (or C) is differentiable at every point of some (x0 − δ, x0 + δ). Then it makes sense to ask if f′ is differentiable at x0. If it is differentiable then we denote its derivative by f′′(x0).
More generally we can define, in a recursive way, the (n + 1)-th derivative f^{(n+1)}.
Definition 10.7.1. Suppose that f : (a, b) → R (or C) is such that f, f′, ..., f^{(n)} exist at every point of (a, b). Suppose that x0 ∈ (a, b). By f is (n + 1)-times differentiable at x0 we mean that f^{(n)} is differentiable at x0. We write f^{(n+1)}(x0) := (f^{(n)})′(x0).

The following is proved by an easy induction using Linearity and the Product Rule.
Theorem 10.7.2 (The Leibnitz Formula). Let f, g : (a, b) → R (or C) be n-times differentiable on (a, b). Then x ↦ f(x)g(x) is n-times differentiable and
(fg)^{(n)}(x) = Σ_{j=0}^n (n choose j) f^{(j)}(x) g^{(n−j)}(x).

11 The elementary functions

11.1 Differentiating power series

The elementary functions—exp x, cos x, sin x, log x, arctan x—are defined as power series, or
are got as inverse functions of real functions defined by power series.

We start with a lemma:
Lemma 11.1.1. The power series Σ_{n=0}^∞ an x^n and Σ_{n=0}^∞ (n+1) a_{n+1} x^n have the same radius of convergence.

Proof. Let the radii be R and R′; we will show R ≥ R′ and R′ ≥ R.
First suppose that |x1| < R′; then Σ_{n=0}^∞ (n+1) a_{n+1} x^n is absolutely convergent at x = x1. That is, Σ_{n=0}^∞ (n+1)|a_{n+1}||x1|^n converges, and hence (multiplying by |x1|) so does Σ_{n=1}^∞ n|an||x1|^n. Now note that |an x1^n| ≤ n|an||x1|^n for n ≥ 1. Hence by the comparison test Σ_{n=0}^∞ |an||x1|^n converges. Therefore, by definition of 'radius of convergence', we have that R ≥ R′.
Now suppose that |x1| < R, and choose x2 so that |x1| < |x2| < R. Then Σ_{n=0}^∞ |an||x2|^n converges, and so (Analysis I) |an||x2|^n → 0 as n → ∞. But a convergent sequence is bounded (Analysis I), so there exists M such that |an||x2|^n < M for all n. Now
|(n+1) a_{n+1} x1^n| ≤ (n+1) (M/|x2|) |x1/x2|^n,
and as, by the Ratio Test, Σ (n+1)|x1/x2|^n is convergent, we have by the Comparison Test that Σ_{n=0}^∞ (n+1)|a_{n+1}||x1|^n is convergent. By the definition of 'radius of convergence' we have that R′ ≥ R.
Theorem 11.1.2 (Term-by-term differentiation). The power series f(x) := Σ_{n=0}^∞ an x^n and g(x) := Σ_{n=0}^∞ (n+1) a_{n+1} x^n have the same radius of convergence R, and for any x such that |x| < R we have that f is differentiable at x and moreover f′(x) = g(x).

Proof. (not examinable)
The first part is done by the lemma.
Suppose |x| < R; choose some r such that |x| < r < R. (For example, r := (|x| + R)/2 if R < ∞, or r = |x| + 1 if R = ∞.)
For any point w ≠ x such that |w| < r, consider
(f(w) − f(x))/(w − x) − g(x) = Σ_{n=1}^∞ an ((w^n − x^n)/(w − x) − n x^{n−1}) = Σ_{n=2}^∞ an ((w^n − x^n)/(w − x) − n x^{n−1}),
where we have added the series f(w), f(x) and g(x) term by term, which is justified by AOL.
Our aim is to show that
(f(w) − f(x))/(w − x) − g(x) → 0 as w → x.
The binomial identity
(w^n − x^n)/(w − x) = x^{n−1} + x^{n−2} w + ··· + x w^{n−2} + w^{n−1}

is easily proved by induction; then we have that for any w ≠ x and n ≥ 2
(w^n − x^n)/(w − x) − n x^{n−1} = (x^{n−1} + x^{n−2} w + ··· + x w^{n−2} + w^{n−1}) − (x^{n−1} + x^{n−1} + ··· + x^{n−1})
 = Σ_{k=1}^{n−1} (x^{n−1−k} w^k − x^{n−1})
 = Σ_{k=1}^{n−1} x^{n−1−k} (w^k − x^k).

Let
hn(w) = an Σ_{k=1}^{n−1} x^{n−1−k} (w^k − x^k) for n = 2, 3, ···.
Then
(f(w) − f(x))/(w − x) − g(x) = Σ_{n=2}^∞ hn(w).

All hn are continuous on R as they are polynomials in w; and hn(x) = 0 for all n ≥ 2. We claim that Σ_{n=2}^∞ hn(w) converges uniformly on |w| ≤ r. In fact
|hn(w)| ≤ |an| Σ_{k=1}^{n−1} |x|^{n−1−k} (|w|^k + |x|^k) ≤ 2n|an| r^{n−1}.
Now Σ n|an| r^{n−1} is convergent, so that Σ_{n=2}^∞ hn(w) converges uniformly on the closed disk {w : |w| ≤ r} by the Weierstrass M-test. Hence Σ_{n=2}^∞ hn(w) is continuous on the disk |w| ≤ r, as the uniform limit of continuous functions is continuous. Therefore
lim_{w→x} Σ_{n=2}^∞ hn(w) = Σ_{n=2}^∞ hn(x) = 0

so that
lim_{w→x} (f(w) − f(x))/(w − x) = lim_{w→x} [(f(w) − f(x))/(w − x) − g(x)] + g(x) = lim_{w→x} Σ_{n=2}^∞ hn(w) + g(x) = g(x).

11.2 The Exponential Function, Trigonometric Functions, Hyperbolic Functions

The following result follows immediately:
Proposition 11.2.1. The functions exp x, sin x, cos x, cosh x and sinh x can all be defined
by power series with infinite radius of convergence so are all differentiable on R. Further:

(i) exp′ x = exp x.

(ii) cos′ x = − sin x and sin′ x = cos x.

(iii) cosh′ x = sinh x and sinh′ x = cosh x.

Note 11.2.2. The other trigonometric and hyperbolic functions are defined in terms of cos and sin or cosh and sinh. For example tan x := sin x / cos x is defined for those x such that cos x ≠ 0. Then by the quotient rule it is differentiable wherever it is defined, and tan′ x = (cos² x + sin² x)/cos² x. We will soon give an easy proof that cos² x + sin² x = 1.¹¹

11.3 The Inverse Function

Theorem 11.3.1. Let f : [a, b] → [m, M] be a strictly increasing continuous function from [a, b] onto [m, M], with inverse function g : [m, M] → [a, b]. Suppose that f is differentiable at x0 ∈ (a, b) and that f′(x0) ≠ 0. Then g is differentiable at f(x0), and
g′(f(x0)) = 1/f′(x0).

Proof. We have already proved that g is continuous. Write y0 = f(x0). Then for y ≠ y0,
(g(y) − g(y0))/(y − y0) = (x − x0)/(f(x) − f(x0)) = 1/((f(x) − f(x0))/(x − x0)),
where x = g(y), and so y = f(x).
Since g is continuous, x = g(y) → g(y0) = x0 as y → y0. Hence
(f(x) − f(x0))/(x − x0) → f′(x0) as y → y0.
As f′(x0) ≠ 0 we use AOL to see that
lim_{y→y0} (g(y) − g(y0))/(y − y0) = 1/f′(x0)
exists. That is, g is differentiable at y0, and
g′(y0) = 1/f′(x0) = 1/f′(f^{−1}(y0)).
¹¹ The Pythagoras Theorem.

11.4 Logarithms

We continue to deal only with the real case where, in section 6, we defined log : (0, ∞) → R
as the inverse function of the real exponential function.
To see that log is differentiable at any y > 0 proceed as we did when we discussed continuity,
by finding an A such that exp(−A) < y < exp(A) and then using the Inverse Function
Theorem on the differentiable function exp : [−A, A] → [exp(−A), exp(A)]. We will find that
log′ y = 1/exp′(log y) = 1/exp(log y) = 1/y,

which may not be surprising—nevertheless it is good to have a proof.
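One can watch this formula emerge numerically (a sketch, not a proof): symmetric difference quotients of log at y settle down to 1/y.

```python
import math

def dq(f, y, h):
    # symmetric difference quotient (f(y+h) - f(y-h)) / (2h)
    return (f(y + h) - f(y - h)) / (2 * h)

for y in (0.5, 1.0, 2.0, 10.0):
    print(y, dq(math.log, y, 1e-6), 1 / y)  # the last two columns agree closely
```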

11.5 Powers

For any x > 0 and any α ∈ R we defined, in section 6, x^α = exp(α log x). From the Chain Rule and the properties of exponentials and logarithms we therefore have that
(d/dx) x^α = α x^{α−1}.

12 The Mean Value Theorem

12.1 Local maxima and minima

Definition 12.1.1. Let E ⊆ R and f : E → R.
(i) x0 ∈ E is a local maximum if for some δ > 0, f(x) ≤ f(x0) whenever x ∈ (x0 − δ, x0 + δ) ∩ E.
(ii) x0 ∈ E is a local minimum if for some δ > 0, f(x) ≥ f(x0) whenever x ∈ (x0 − δ, x0 + δ) ∩ E.

A local maximum or minimum is called a local extremum. If the inequality is strict we


will say that the extremum is strict.

Here is the crucial property (which, of course, you knew long before you started the course).

Proposition 12.1.2 (Fermat). Let f : (a, b) → R. Suppose that x0 ∈ (a, b) is a local extremum and f is differentiable at x0. Then f′(x0) = 0.

Proof. If x0 is a local maximum, then there exists δ > 0 such that (f(x) − f(x0))/(x − x0) ≤ 0 whenever 0 < x − x0 < δ and x ∈ (a, b), so that
f+′(x0) = lim_{x→x0+} (f(x) − f(x0))/(x − x0) ≤ 0.
On the other hand, (f(x) − f(x0))/(x − x0) ≥ 0 whenever −δ < x − x0 < 0 and x ∈ (a, b), so that
f−′(x0) = lim_{x→x0−} (f(x) − f(x0))/(x − x0) ≥ 0.
Since f is differentiable at x0, f′(x0) = f−′(x0) = f+′(x0), and hence f′(x0) = 0.
Similarly if x0 is a local minimum.

Remark 12.1.3. It is essential that the interval (a, b) is open. Why?

12.2 Rolle’s Theorem

Theorem 12.2.1 (Rolle, 1691). Let f : [a, b] → R be continuous, and suppose that f is
differentiable on (a, b). Suppose further that f (a) = f (b). Then there exists a point ξ ∈ (a, b)
such that f ′ (ξ) = 0.

Proof. If f is constant in [a, b], then f′(x) = 0 for every x ∈ (a, b), so that any point—say ξ = (a + b)/2—will do.
As f is continuous on [a, b] it attains its maximum and minimum on [a, b] (by Theorems 4.1.2 and 4.1.6). Suppose ξ1 is the minimum and ξ2 is the maximum. As f(a) = f(b), either ξ1 or ξ2 lies in the open interval (a, b), or else f is constant and we are done. Suppose then that ξ ∈ (a, b) gives either the maximum or minimum. It is then a local extremum, and by Fermat's result f′(ξ) = 0.

We can express this informally by saying


‘between any two roots of f there is a root of f ′ ’.

Note 12.2.2. (i) Remember that f is differentiable implies that f is continuous. Thus
the hypotheses of Rolle would be satisfied if f was differentiable on [a, b] and f (a) = f (b).
However, often it is important that Rolle holds under the given weaker conditions.
(ii) When using these theorems remember to check ALL conditions including the continuity
and differentiability conditions. For example f : [−1, 1] → R given by f (x) = |x| satisfies all
conditions of Rolle except that f is not differentiable at x = 0. But there is no ξ such that
f ′ (ξ) = 0.

12.3 The Mean Value Theorem

This is one of the most important results in this course. It is a rotated version of Rolle.12
Theorem 12.3.1 (MVT). Let f : [a, b] → R be continuous, and suppose that f is differen-
tiable on (a, b). Then there exists a point ξ ∈ (a, b) such that

f (b) − f (a) = f ′ (ξ)(b − a).


¹² If in an examination you are asked to prove the Mean Value Theorem, then you need to also provide proofs of Fermat's result and Rolle's Theorem.

Proof. Apply Rolle's theorem to the function
F(x) = f(x) − k(x − a),
where k is a constant to be determined. F : [a, b] → R is continuous, and is differentiable on (a, b). We choose k so that F(a) = F(b), that is, k = (f(b) − f(a))/(b − a). Thus Rolle's theorem applies, so for some number ξ between a and b, F′(ξ) = 0. But F′(x) = f′(x) − k, so f′(ξ) = k = (f(b) − f(a))/(b − a), as required.
Note 12.3.2. Suppose we have the hypotheses of the MVT. Then for any a ≤ a1 < b1 ≤ b we can apply the MVT to f restricted to [a1, b1] and get
f(b1) − f(a1) = f′(ξ1)(b1 − a1) for some ξ1 ∈ (a1, b1).
Note that (for a given function f) the value of ξ1 may depend on a1 and b1.

Corollary 12.3.3 (The Taylor Theorem, mark 1). Suppose that we have the hypotheses of the MVT and that x, x + h ∈ [a, b]. Then
f(x + h) − f(x) = f′(x + θh) h for some θ ∈ (0, 1).

Proof. Suppose h < 0; then a ≤ x + h < x ≤ b. From the MVT applied to f on the interval [x + h, x] there exists ξ ∈ (x + h, x) such that
f(x) − f(x + h) = f′(ξ)(−h).
Write ξ = x + θh, and note that x + h < x + θh < x implies—as h < 0—that 0 < θ < 1.
The cases h = 0 and h > 0 are left as exercises.

12.4 The Identity Theorem

Here is one of the most useful consequences of the MVT.


Corollary 12.4.1 (Identity Theorem). Let f : (a, b) → R be differentiable, and satisfy
f ′ (t) = 0 for all t ∈ (a, b). Then f is constant on (a, b).

Proof. Apply MVT to f on [x, y] where x, y are any two points in (a, b). (Note that f
is differentiable on (a, b) implies that f is continuous on (a, b) and hence f is continuous
on [x, y].) Then f (x) − f (y) = f ′ (ξ)(x − y) for some ξ ∈ (x, y). But f ′ (ξ) = 0, so that
f (x) = f (y). Therefore f is constant in (a, b).

Example 12.4.2. Suppose that φ is a function whose derivative is x². Then we have, for all x, that φ(x) = x³/3 + A for some constant A.

Proof. Let f(x) := φ(x) − x³/3; then f is differentiable and we can calculate that f′(x) = x² − (1/3)·3x² = 0. By the Identity Theorem f(x) = A for some constant A. You can justify other 'integrations' similarly: just guess the 'integral' and proceed as above.

Note 12.4.3. Often when applying mathematics we have to solve a differential equation.
Last term you learned methods for guessing solutions. The Identity Theorem gives us a tool
to prove the uniqueness of solutions of DE, and lets us justify these clever tricks (see Section
13.3). Those who do PDEs have already seen this idea this term where you showed that
E ′ (t) = 0 and then deduced that E(t) is a constant (which then turned out to be zero).

12.5 Derivatives and monotonicity

Corollary 12.5.1. Let f : (a, b) → R be differentiable.
(i) If f′(x) ≥ 0 for all x ∈ (a, b) then f is increasing on (a, b).
Proof: Apply the MVT to any [x, y] ⊂ (a, b) to get f(y) − f(x) = f′(ξ)(y − x), a product of non-negative numbers. Hence f(y) ≥ f(x) and we are done.
(ii) If f′(x) ≤ 0 for all x ∈ (a, b) then f is decreasing on (a, b).
(iii) If f′(x) > 0 for all x ∈ (a, b) then f is strictly increasing on (a, b).
(iv) If f′(x) < 0 for all x ∈ (a, b) then f is strictly decreasing on (a, b).

12.6 The Cauchy Mean Value Theorem

Sometimes we are concerned with more than one function, and would like to use the MVT
or a MVT type argument. The following is what we need: except in the most trivial cases it
never helps to apply the MVT to the functions separately—we generate too many distinct
ξ’s.

Corollary 12.6.1 (Cauchy's Mean Value Theorem).¹³ Let f, g : [a, b] → R be continuous on [a, b] and differentiable on (a, b). Suppose that g′(x) ≠ 0 for all x ∈ (a, b). Then for some ξ ∈ (a, b) we have that
f′(ξ)/g′(ξ) = (f(b) − f(a))/(g(b) − g(a)).

Proof. To be supplied later when we need the result.

13 Applications of the MVT

13.1 Exponential and Logarithm

Proposition 13.1.1. exp(x + y) = exp(x) exp(y) for all x, y ∈ R.

Proof. We have to use the Identity Theorem—but on what function? Fixing y and looking
at f (x) = exp(x + y) − exp(x) exp(y) leads to f ′ = f and f (0) = 0 which we could now solve
to get f (x) = 0 (see section 13.3).
¹³ This is where the result belongs logically, but in the lectures it will not appear until later, when we do L'Hospital's Rule.

However a much better (more direct) way is to fix x + y instead. So, fix c ∈ R, and
put g(t) = exp c − exp t exp(c − t). Then we have that g ′ (t) = 0 so that g(t) = g(0) by
the Identity Theorem. Now g(0) = exp c − exp 0 exp c = 0. So for any c, t we have that
exp c − exp t exp(c − t) = 0. Put c := (x + y), and t := x to get the result.

Corollary 13.1.2. log(uv) = log(u) + log(v) for all u, v ∈ (0, ∞).

Proof. From above

exp(log(u) + log(v)) = exp(log(u)) exp(log(v)) = uv = exp(log(uv))

and take logs.

We can also use the MVT to prove the monotonicity of the exponential function.

Proposition 13.1.3. The function exp : R → (0, ∞) is strictly increasing.

Proof. As exp′ x = exp x > 0, the derivative is strictly positive, so exp is strictly increasing by Corollary 12.5.1(iii).

13.2 Trigonometric Functions

Proposition 13.2.1 (The Pythagoras Theorem). For all real x we have that
cos² x + sin² x = 1.

Proof. Let f(x) := cos² x + sin² x − 1. Then by what we have proved about derivatives of trigonometric functions,
f′(x) = 2 cos x (− sin x) + 2 sin x cos x − 0 = 0 for all x.
By the Identity Theorem applied to f on some interval (−A, A) we see that for x ∈ (−A, A) we have that
f(x) = f(0) = cos² 0 + sin² 0 − 1 = 1 + 0 − 1 = 0.
As this is true for any A, we get that f(x) = 0 on R.

Proposition 13.2.2 (Addition Formulae). For all real x, y we have that

(i) cos(x + y) = cos x cos y − sin x sin y;

(ii) sin(x + y) = sin x cos y + cos x sin y.

Proof. It is enough to prove one; the other is got by fixing y and taking the derivative of the resulting function of x.
We recall what we did for exponentials: let
h(x) = cos c − cos x cos(c − x) + sin x sin(c − x),
whose derivative is
h′(x) = 0 + sin x cos(c − x) − cos x sin(c − x) + cos x sin(c − x) − sin x cos(c − x) = 0,
so that by the Identity Theorem h(x) = h(c) = 0. Now put c = x + y to obtain (i).


Proposition 13.2.3. The function cos x := Σ_{k=0}^∞ (−1)^k x^{2k}/(2k)! has a least positive zero, which we denote (for the moment) by α.

Proof. First we need to see that there are positive zeros. Note that
cos 0 = Σ_{k=0}^∞ (−1)^k 0^{2k}/(2k)! = 1 > 0
and (by looking at pairs of terms)
cos x = 1 − x²/2! + x⁴/4! − Σ_{k=1}^∞ (x^{4k+2}/(4k+2)!) (1 − x²/((4k+4)(4k+3))) ≤ 1 − x²/2! + x⁴/4!,
provided x² ≤ (4+4)(4+3). As 1 − x²/2! + x⁴/4! = (1/24)[(x² − 6)² − 12], we see that cos √6 < 0.
By the IVT, cos x has at least one zero in [0, √6].
Now let
S = {t > 0 : cos t = 0}.
Then S ≠ ∅ and S is bounded below, so that α = inf S exists. By definition of inf S, given n there exists tn ∈ S such that α ≤ tn < α + 1/n. Thus tn → α as n → ∞. But cos x is continuous, so that 0 = cos tn → cos α, and hence cos α = 0. But cos 0 = 1 and cos is continuous, so α > 0, and α is the minimum positive zero required.
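The proof traps α inside [0, √6]; a bisection built on the same IVT idea pins it down numerically (an illustration only; of course α will turn out to be π/2 ≈ 1.5707963).

```python
import math

lo, hi = 0.0, math.sqrt(6)  # cos(lo) = 1 > 0 and cos(hi) < 0, as shown above

# cos changes sign on [lo, hi] and has a single zero there,
# so the bisection converges to the least positive zero alpha
for _ in range(60):
    mid = (lo + hi) / 2
    if math.cos(mid) > 0:
        lo = mid
    else:
        hi = mid

print(lo, math.pi / 2)  # both print as 1.5707963..., so alpha = pi/2
```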

Proposition 13.2.4. sin α = 1.

Proof. By Pythagoras, sin α = ±1. Suppose sin α = −1; then by the MVT there would be some ξ ∈ (0, α) such that cos ξ = (sin α − sin 0)/(α − 0) = −1/α < 0. However, cos 0 = 1 and α is the first root of cos, so by the IVT cos ξ cannot be negative.

Proposition 13.2.5 (Periodicity). For all real x we have that

(i) cos(x + α) = − sin x and sin(x + α) = cos x;


(ii) cos(x + 2α) = − cos x and sin(x + 2α) = − sin x;
(iii) cos(x + 4α) = cos x and sin(x + 4α) = sin x.

Proof. We just use the addition formula repeatedly, inserting the values cos α = 0 and
sin α = 1.

Now that we have proved these results, and the danger of using ‘obvious’ but unproved
properties of π has passed we can make the following definition:

Definition 13.2.6. π := 2 · inf{β | β > 0 and cos β = 0} (= 2α).

We need one more result, and then we have established "all" the usual facts about the trigonometric functions.

Proposition 13.2.7. The zeros of cos x are precisely the points {(k + 1/2)π : k ∈ Z}.

Proof. By Proposition 13.2.5(ii) and (iii), for k ∈ Z, cos(π/2 + kπ) = (−1)^k cos(π/2) = 0, so these are all zeros. If β were such that cos β = 0, then as above we have a k such that β0 = β + kπ ∈ (0, π] is a zero of cos x. Clearly β0 is not less than π/2, by definition. Using
cos(π − x) = −cos(−x) = −cos(x)
we see that if β0 > π/2 then π − β0 < π/2 is a zero of cos x, which cannot be. Hence β0 = π/2, and β has the required form.

13.3 Differential Equations

Here is a very fundamental example of how we use the MVT to establish the uniqueness of
a solution of a differential equation.

Example 13.3.1. Show that the general solution of f′(x) = f(x), x ∈ (0, +∞), is f(x) = A exp(x) where A is a constant.

The 'trick' for solving differential equations is to manipulate them so that they look like dF/dx = 0 for some F, and then 'integrate'. We often achieve this by multiplying by 'integrating factors', for which there are recipe books. The same 'trick' lets us apply the MVT (or Identity Theorem) to prove that we have found all the solutions.
Given this differential equation df/dx − f = 0, we would multiply it by e^{−∫1 dx}, rewrite it as (d/dx)(e^{−x} f(x)) = 0, and deduce that e^{−x} f(x) = A.
Let's write this as a piece of pure mathematics!
Consider g(x) := f(x) exp(−x). Then g′(x) = f′(x) exp(−x) − f(x) exp(−x) = 0. Hence, by the Identity Theorem, g(x) is constant; that is, f(x) exp(−x) = A say, and so f(x) = A exp(x).

Example 13.3.2. This example is based on a Calculus question from a Mods Collection.
Find all solutions of
2
y ′′ − y = 0.
1 + x2
(The emphasis for us is on “all”.)

I will probably not do this in the lectures.


The following is all motivated by the method for finding a second solution for second-order
linear ordinary differential equations when one solution is known, which you learnt in the

‘Calculus of One Variable’ course last term. We use the methods from this course to show
that these are all the solutions.
We can check easily that 1 + x² is a solution; so we write
z(x) = y(x)/(1 + x²).
An easy calculation yields
y′ = z′(1 + x²) + 2xz and y′′ = z′′(1 + x²) + 4xz′ + 2z,
so that z must satisfy
z′′(1 + x²) + 4xz′ = 0,
and hence
[z′(1 + x²)²]′ = (z′′(1 + x²) + 4xz′)(1 + x²) = 0.
By the Identity Theorem
z′(1 + x²)² = A
for some constant A, and so
z′(x) = A/(1 + x²)².
Although of course we can't 'integrate up' yet — we don't know what that means — we can take the hint and look at what the integral would be, namely
w(x) = (1/2)[arctan x + x/(1 + x²)];
here arctan is the inverse function of tan. So by the Inverse Function Theorem and the other rules of differentiation which we have established we can check that
w′(x) = 1/(1 + x²)² = z′(x)/A.
Hence by the Identity Theorem z(x) − Aw(x) = B for some constant B, and so the only solutions are
y(x) = (A/2)[(1 + x²) arctan x + x] + B(1 + x²).

13.4 The function (sin x)/x
This is a good example of how the Mean Value Theorem and its various corollaries are used practically.
I will probably not do this in lectures.

Proposition 13.4.1. Let 0 < x < π/2. Then
(i) sin x < x < tan x, and so cos x < (sin x)/x < 1;
(ii) lim_{x→0} (sin x)/x = 1;
(iii) 2/π < (sin x)/x < 1 (Jordan's inequality).

We therefore have the following bounds:
max{cos x, 2/π} < (sin x)/x < 1.

Proof. To prove the first inequality, consider f(x) = tan x − x for x ∈ [0, π/2). Then f is differentiable on (0, π/2) and
f′(x) = 1/cos² x − 1 > 0 for all x ∈ (0, π/2).
Hence f is strictly increasing on [0, π/2); in particular f(x) > f(0) for any x ∈ (0, π/2), which yields tan x > x. Considering x − sin x in the same way will give x > sin x.
The second inequality in (i) is got by inverting and multiplying by sin x; this is justified since sin x > 0 until the smallest positive zero of cos x.
For (ii) we use a version of the sandwich theorem and the continuity of cos x to get that lim_{x→0+} (sin x)/x exists and
1 = lim_{x→0+} cos x ≤ lim_{x→0+} (sin x)/x ≤ 1.
As (sin x)/x is an even function this gives that lim_{x→0} (sin x)/x = 1.
Now consider
h(x) = (sin x)/x for x ∈ (0, π/2].
Then
h′(x) = cos x (x − tan x)/x² < 0 for all x ∈ (0, π/2),
so that h is strictly decreasing, and hence h(x) > h(π/2) = 2/π for any x ∈ (0, π/2); this gives the first inequality of (iii). The second is already included in (i).

14 L’Hôpital’s Rule

This section is devoted to a variety of rules and techniques for calculating limits of quotients.
They derive from results of Guillaume de l’Hôpital; perhaps they are really due to Johann
Bernoulli whose lecture notes l’Hôpital published in 1696.

14.1 The Cauchy Mean Value Theorem

As promised earlier here is the proof of Cauchy’s symmetric form of the MVT. (At first sight
one might think we could just apply the MVT to f and g separately. However, a moment’s
reflection will show that we would then get two different ξ.)

Theorem 14.1.1 (Cauchy's Mean Value Theorem). Let f, g : [a, b] → R be continuous on [a, b] and differentiable on (a, b). Suppose that g′(x) ≠ 0 for all x ∈ (a, b). Then for some ξ ∈ (a, b) we have that
f′(ξ)/g′(ξ) = (f(b) − f(a))/(g(b) − g(a)).

Proof. First, this makes sense: we cannot have g(b) − g(a) = 0, or by Rolle's Theorem there would be a point η ∈ (a, b) with g′(η) = 0.
Now let the function F be defined on [a, b] by the 3×3 determinant with rows (1, 1, 1), (f(x), f(a), f(b)) and (g(x), g(a), g(b)); that is,
F(x) = (f(a)g(b) − f(b)g(a)) + f(x)(g(a) − g(b)) + g(x)(f(b) − f(a)),
which, being a linear combination of f and g, is continuous on [a, b] and differentiable on (a, b). Clearly F(a) = F(b) = 0; so Rolle's Theorem applies and yields a ξ ∈ (a, b) such that F′(ξ) = 0. But
0 = F′(ξ) = 0 + f′(ξ)(g(a) − g(b)) + g′(ξ)(f(b) − f(a)),
and we are done after dividing by the non-zero g′(ξ)(g(b) − g(a)).

14.2 The L’Hôpital Rule

Proposition 14.2.1. Suppose f, g are continuous on [a, a + δ] (for some δ > 0), and differentiable on (a, a + δ), and that f(a) = g(a) = 0. Suppose further that l := lim_{x→a+} f′(x)/g′(x) exists. Then
lim_{x→a+} f(x)/g(x) = lim_{x→a+} f′(x)/g′(x).

Proof. Note that there must exist a δ′ < δ such that on (a, a + δ′] we have g′(x) ≠ 0, for otherwise the function f′(x)/g′(x) would not be defined near a and so the limit l could not exist.
For every x ∈ (a, a + δ′), apply Cauchy's MVT to f, g on the interval [a, x]: there is ξx ∈ (a, x) such that
f(x)/g(x) = (f(x) − f(a))/(g(x) − g(a)) = f′(ξx)/g′(ξx).
But if x → a+, then ξx → a with ξx > a, so that
lim_{x→a+} f′(ξx)/g′(ξx) = l.
Hence
lim_{x→a+} f(x)/g(x) = lim_{x→a+} f′(ξx)/g′(ξx) = l.

Similarly we prove

Corollary 14.2.2. Suppose f, g are continuous on [a − δ, a] (for some δ > 0), and differentiable on (a − δ, a), and that f(a) = g(a) = 0. Suppose further that l := lim_{x→a−} f′(x)/g′(x) exists. Then
lim_{x→a−} f(x)/g(x) = lim_{x→a−} f′(x)/g′(x).

The proof of the following is now immediate.

Corollary 14.2.3 (L'Hôpital's Rule (L'HR)). Suppose f, g are continuous on [a − δ, a + δ] (for some δ > 0), and differentiable on (a − δ, a + δ) \ {a}, and that f(a) = g(a) = 0. Suppose further that l := lim_{x→a} f′(x)/g′(x) exists. Then
lim_{x→a} f(x)/g(x) = lim_{x→a} f′(x)/g′(x).

Note 14.2.4. Sometimes this is called the 0/0 case of L'HR.

14.3 Some Applications

Example 14.3.1. Prove that
lim_{x→0} (1 − cos x)/x² = 1/2.

We argue like this:
lim_{x→0} (1 − cos x)/x² = lim_{x→0} (sin x)/(2x)  [by L'HR, provided this limit exists]
 = lim_{x→0} (cos x)/2  [by L'HR, provided this limit exists]
 = 1/2, and this limit exists by the continuity of cos x;
so the above equalities hold.

To justify this we need to see that L'HR, which we have used twice, is actually applicable. But by standard results we have already proved:
1 − cos x and x² are continuous on [−π/2, π/2], zero at zero, and differentiable on (−π/2, π/2) \ {0}, with derivatives sin x and 2x;
sin x and 2x are continuous on [−π/2, π/2], zero at zero, and differentiable on (−π/2, π/2) \ {0}, with derivatives cos x and 2.

Exercise: Prove similarly that lim_{x→0} (sin x)/x = 1.

Example 14.3.2.
lim_{x→0} log(1 + x)/x = 1.

Again we argue:
lim_{x→0} log(1 + x)/x = lim_{x→0} log′(1 + x)/x′  [by L'HR, provided this limit exists]
 = lim_{x→0} 1/(1 + x)  [the derivative of log t is 1/t]
 = 1  [by continuity of 1/(1 + x)]
—and as this limit exists, the previous equalities hold.
To justify the use of L'HR we need to see that log(1 + x) and x are continuous on [−1/2, 1/2], 0 at 0, and differentiable on (−1/2, 1/2) \ {0}.

Example 14.3.3.
lim_{x→0} (1 + x)^{1/x} = e.

Recall that by definition (1 + x)^{1/x} := exp((1/x) log(1 + x)). So consider first log(1 + x)/x. By the previous example this has limit 1. Now by the continuity of exp we see that
(1 + x)^{1/x} = exp(log(1 + x)/x) → exp(1) = e as x → 0.
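For what it is worth, all three limits can be checked symbolically (a sanity check only, using the third-party sympy library; it is no substitute for verifying the hypotheses as above):

```python
import sympy as sp

x = sp.symbols('x')

print(sp.limit((1 - sp.cos(x)) / x**2, x, 0))  # 1/2
print(sp.limit(sp.log(1 + x) / x, x, 0))       # 1
print(sp.limit((1 + x)**(1 / x), x, 0))        # E, sympy's name for e
```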

14.4 L’Hôpital’s Rule: infinite limits

If we have all the hypotheses for L'Hôpital's rule, except that we have
f′(x)/g′(x) → +∞ as x → a,
then we swap f and g, use L'HR, and conclude that
g(x)/f(x) → 0 as x → a.

14.5 L’Hôpital’s Rule at ∞

Suppose f, g : (a, +∞) → R are continuous and differentiable, with f(x) → 0 and g(x) → 0 as x → ∞. If g′(x) ≠ 0 on (a, +∞) and f′(x)/g′(x) → l as x → ∞, then we can deduce that lim_{x→∞} f(x)/g(x) = l.
All we need do is apply L'HR to the functions F(x) = f(1/x) and G(x) = g(1/x), with F(0) = 0 = G(0), checking carefully that the hypotheses hold.


14.6 L’Hôpital’s Rule—the ∞
case

There is one important variant which we cannot obtain by algebraic manipulation, or by


taking logarithms or exponentials or similar tricks. This will probably not be covered in the
lectures.


Proposition 14.6.1 (L’HR, the ∞ case). Let f, g : (a, a + δ) → R be differentiable for some
′ (x)
δ > 0. Suppose further that f (x) → ∞ and g(x) → ∞ as x → a+ and that limx→a+ fg′ (x)
exists.
Then
f (x) f ′ (x)
lim = lim ′ .
x→a+ g(x) x→a+ g (x)

Note 14.6.2. We do not want to make too much heavy weather in this proof; checking all
the details is a good exercise.


Proof. Write K := lim_{x→a+} f′(x)/g′(x). Let ε > 0; then there exists a δ1 > 0 such that δ1 < δ and
|f′(x)/g′(x) − K| < ε/2 for all x ∈ (a, a + δ1).
Now fix some c in (a, a + δ1).
For any x ∈ (a, c) we apply Cauchy's MVT to f, g on [x, c]: there is a number ξx ∈ (x, c) such that
(f(c) − f(x))/(g(c) − g(x)) = f′(ξx)/g′(ξx).
Since ξx ∈ (x, c) ⊂ (a, a + δ1), we have that
|(f(x) − f(c))/(g(x) − g(c)) − K| = |f′(ξx)/g′(ξx) − K| < ε/2 for all x ∈ (a, c).
(Unlike the 0/0 case we cannot conclude immediately that (f(x) − f(c))/(g(x) − g(c)) → K as x → a+ (although it does!), as there is no guarantee that ξx will tend to a as x → a+.)
Clearing the fraction we have that
|f(x) − f(c) − Kg(x) + Kg(c)| < (ε/2) |g(x) − g(c)|,
so that the Triangle Law gives us
|f(x) − Kg(x)| < (ε/2) |g(x) − g(c)| + |f(c) − Kg(c)|,
or
|f(x)/g(x) − K| < (ε/2) |1 − g(c)/g(x)| + |f(c) − Kg(c)|/|g(x)|.
Now use the fact that g(x) → ∞: we can find a δ2 > 0, with δ2 < δ1, such that
|1 − g(c)/g(x)| < 3/2 and |f(c) − Kg(c)|/|g(x)| < ε/4,
so that for a < x < a + δ2 we have
|f(x)/g(x) − K| < (ε/2)(3/2) + ε/4 = ε,
as required.

14.7 More applications

These examples might be better done by using the standard limit from Analysis I that, if α > 0, x exp(−αx) → 0 as x → ∞.

Example 14.7.1. lim_{x→+∞} (log x)/x^µ = 0 for any µ > 0.

Let g(x) = x^µ = exp(µ log x). Then g′(x) = µx^{µ−1}. So by L'Hôpital's rule (∞/∞ case) we have
lim_{x→+∞} (log x)/x^µ = lim_{x→+∞} (1/x)/(µx^{µ−1})  [provided this limit exists]
 = lim_{x→+∞} 1/(µx^µ) = 0, which does exist.

Example 14.7.2. For any µ > 0, lim_{x→0+} x^µ log x = 0.

We transform this into ∞/∞ form and then apply L'HR:
lim_{x→0+} x^µ log x = lim_{x→0+} (log x)/x^{−µ}
 = lim_{x→0+} log′ x/(x^{−µ})′  [if this limit exists]
 = lim_{x→0+} (1/x)/((−µ)x^{−µ−1}) = lim_{x→0+} x^µ/(−µ) = 0, which does exist.

Finally:

Example 14.7.3. Show that
lim_{x→0} ((sin x)/x)^{1/(1−cos x)} = e^{−1/3}.

Since f(x) = ((sin x)/x)^{1/(1−cos x)} is an even function, we only need to show that lim_{x→0+} f(x) = e^{−1/3}. According to the definition,
f(x) = exp((1/(1 − cos x)) log((sin x)/x)) = exp((log sin x − log x)/(1 − cos x)).
By the L'Hôpital Rule,
lim_{x→0+} (log sin x − log x)/(1 − cos x) = lim_{x→0+} (cos x/sin x − 1/x)/sin x  [provided it exists; recall (sin x)/x → 1]
 = lim_{x→0+} (x cos x − sin x)/(x sin² x)
 = lim_{x→0+} (cos x − x sin x − cos x)/(sin² x + 2x sin x cos x)  [if it exists, using L'Hôpital]
 = −lim_{x→0+} x/(sin x + 2x cos x)
 = −lim_{x→0+} 1/(cos x + 2 cos x − 2x sin x)  [if it exists, using L'Hôpital]
 = −1/3  [continuity].

Since exp is continuous at −1/3,
lim_{x→0+} ((sin x)/x)^{1/(1−cos x)} = lim_{x→0+} exp((log sin x − log x)/(1 − cos x))
 = exp(lim_{x→0+} (log sin x − log x)/(1 − cos x))  [by continuity of exp]
 = exp(−1/3).

14.8 Health Warning

L’Hôpital’s Rule is very seductive. But it is often not the best way to evaluate limits. Taylor’s
Theorem, to which we turn next, is often more useful, and indeed more informative.
If you doubt this, then use L'HR to work out lim_{x→0} (sinh x⁴ − x⁴)/(x − sin x)⁴, and then later use Taylor's Theorem to write it down at sight—and decide which is better.
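For comparison, here is the Taylor computation done symbolically (using sympy; my addition): sinh x⁴ − x⁴ = x¹²/6 + ··· while (x − sin x)⁴ = (x³/6 − ···)⁴ = x¹²/1296 + ···, so the quotient tends to 1296/6 = 216.

```python
import sympy as sp

x = sp.symbols('x')

print(sp.series(sp.sinh(x**4) - x**4, x, 0, 14))  # x**12/6 + O(x**14)
print(sp.series((x - sp.sin(x))**4, x, 0, 14))    # x**12/1296 + O(x**14)
print(sp.limit((sp.sinh(x**4) - x**4) / (x - sp.sin(x))**4, x, 0))  # 216
```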

15 Taylor’s Theorem

15.1 Motivation

Suppose that f : (a − δ, a + δ) → R and that for some n ≥ 1 the derivatives f′, f′′, ..., f^{(n)} exist on the interval. For convenience write f^{(0)} := f.
We can then form the Taylor polynomials
Pn(x) := f(a) + f′(a)(x − a) + (f′′(a)/2!)(x − a)² + ··· + (f^{(n)}(a)/n!)(x − a)^n,
a polynomial of degree n in x. This polynomial 'agrees with f' to the extent that Pn^{(k)}(a) = f^{(k)}(a) for k = 0, ..., n.
We have
P0(x) = f(a)  (constant approximation, not very interesting);
P1(x) = f(a) + f′(a)(x − a)  (linear approximation);
P2(x) = f(a) + f′(a)(x − a) + (f′′(a)/2!)(x − a)²  (quadratic approximation);
and so on.

We might hope, on the basis of our experience, that Pn (x) is a good approximation to f (x);
we’d like to justify that intuition.
We'd also like to consider the power series
P(x) := Σ_{k=0}^∞ (f^{(k)}(a)/k!)(x − a)^k,
that this must equal f (x).
To investigate these questions we will look at the ‘error term’

En (x) := f (x) − Pn (x).

(Clearly, if f has derivatives of all orders, Pn (x) → f (x) as n → ∞ if and only if En (x) → 0.)
Unfortunately, even if f has derivatives of all orders, it need not be true that En (x) → 0 as
n → ∞, so we have to move more carefully. First, we will prove Taylor’s Theorem which will
give us information about En (x). Secondly, in individual cases we have to consider whether
En (x) → 0 as n → ∞.

15.2 A cautionary example

Our intuition, built on experience of polynomials, trigonometric and exponential functions, is misleading. The following example shows us that there are functions f(x), with derivatives of all orders at every point of R, such that Σ (f^{(k)}(0)/k!) x^k is convergent for every x—but for which En(x) ↛ 0.
Consider f : R → R defined by
f(x) = exp(−1/x²) whenever x ≠ 0, and f(0) = 0.

A little experimentation shows that the k-th derivative must look like
f^{(k)}(x) = Qk(1/x) exp(−1/x²) whenever x ≠ 0, and f^{(k)}(0) = 0,
for some polynomial Qk of degree 3k. We can prove this by induction: at points x ≠ 0 this is routine use of linearity, the product rule and the chain rule. But at x = 0 we need to take more care, and use the definition:
(f^{(k)}(x) − f^{(k)}(0))/(x − 0) = (1/x) Qk(1/x) exp(−1/x²) = Σ_{s=1}^{3k+1} a_s exp(−1/x²)/x^s  (for constants a_s),
which we must prove tends to zero as x → 0; if we change the variable to t = 1/x then we have a finite sum of terms like t^s exp(−t²), which we know tend to zero as |t| tends to infinity.
So for this function f the series Σ (f^{(k)}(0)/k!) x^k = 0, and so converges to 0 at every x. But the error term En(x) is the same for all n (it equals f(x)) and so does not tend to 0 at any point except 0.

Note that we can add this function to exp x and sin x and so on, and get functions with
the same set of derivatives at 0 as these functions, so that they will have the same Taylor
polynomials—but are different functions.
Remark 15.2.1. Functions defined and differentiable on C are very different: for them, our
naive intuition is a good guide—but that is next year’s Analysis course.

15.3 Taylor’s Theorem with Lagrange Remainder

We now concentrate on the Taylor polynomial and investigate its difference from the function.

Theorem 15.3.1 (Taylor's Theorem). Let f : [a, b] → R. Suppose that for some n ≥ 1 we have that f, f′, f′′, ..., f^{(n−1)} exist and are continuous on [a, b] and that f^{(n)} exists on (a, b). Then there is a number ξ ∈ (a, b) such that
f(b) = Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(b − a)^k + (f^{(n)}(ξ)/n!)(b − a)^n.

Note 15.3.2. Recall that at the end points a and b 'differentiable' means 'left- (or right-) differentiable'.
Note 15.3.3. The term (f^{(n)}(ξ)/n!)(b − a)^n is called Lagrange's form of the remainder. Note that the crucial parameter ξ may depend on (i) the function f; (ii) the degree n; (iii) the end points a and b.¹⁴

Note 15.3.4. If we set b − a = h, then Taylor's theorem may be stated as
f(a + h) = Σ_{k=0}^{n−1} (f^{(k)}(a)/k!) h^k + (f^{(n)}(a + θh)/n!) h^n  (Tn)
where θ is some number between 0 and 1.

Proof. We look at the Taylor polynomial part,
Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(b − a)^k,
and we use the method of "varying a constant": that is, we look at the following function defined on [a, b]:
F(x) := Σ_{k=0}^{n−1} (f^{(k)}(x)/k!)(b − x)^k.
This is clearly continuous on [a, b], and on (a, b) we have that
F′(x) = Σ_{k=0}^{n−1} (f^{(k+1)}(x)/k!)(b − x)^k − Σ_{k=1}^{n−1} (f^{(k)}(x)/(k−1)!)(b − x)^{k−1}
 = Σ_{k=1}^{n} (f^{(k)}(x)/(k−1)!)(b − x)^{k−1} − Σ_{k=1}^{n−1} (f^{(k)}(x)/(k−1)!)(b − x)^{k−1} = (f^{(n)}(x)/(n−1)!)(b − x)^{n−1}.
Note also that F(a) = Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(b − a)^k and F(b) = f(b).
¹⁴ When applying Taylor's Theorem to different functions (perhaps as similar as f(x) and f(−x)) or different ranges (perhaps as similar as [0, b] and [−b, 0]) it is essential to use a different letter for each ξ that is introduced.

Let G(x) be continuous on [a, b] and differentiable on (a, b), with G′ ≠ 0 on (a, b). We use Cauchy's Mean Value Theorem on this pair of functions to see that there exists a ξ ∈ (a, b) such that
(F(a) − F(b))/(G(a) − G(b)) = F′(ξ)/G′(ξ).
That is,
(Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(b − a)^k − f(b)) / (G(a) − G(b)) = (f^{(n)}(ξ)/(n−1)!)(b − ξ)^{n−1} / G′(ξ).  (∗)
But if we take
G(x) := (b − x)^n,
which is clearly continuous and differentiable with derivative −n(b − x)^{n−1} < 0 on (a, b), then (∗) simplifies at once to
f(b) = Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(b − a)^k + (f^{(n)}(ξ)/n!)(b − a)^n.

We have proved the strongest theorem we could. But often we know a bit more, and can get, for example, this symmetric version:
Corollary 15.3.5 (Taylor's Theorem). Let f : (a − δ, a + δ) → R for some δ > 0. Suppose that for some n ≥ 1 we have that f′, f′′, ..., f^{(n)} exist. Let x ∈ (a − δ, a + δ). Then there is a number ξ between a and x such that
f(x) = Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(x − a)^k + (f^{(n)}(ξ)/n!)(x − a)^n.

Proof. If x > a then this is just the Taylor Theorem we have proved. If x < a we just
use the Taylor Theorem we have proved on the function f (−x) and sort out the signs and
inequalities. If x = a then take ξ = a.

15.4 Other forms of the remainder

In the proof of Taylor's Theorem we may use any function G which is continuous on [a, b], differentiable on (a, b), and such that G′ ≠ 0. Then we will have a ξ ∈ (a, b) such that
f(b) = P_{n−1}(b) + (f^{(n)}(ξ)/(n−1)!) (b − ξ)^{n−1} (G(b) − G(a))/G′(ξ).
By choosing different functions G, you may prove Taylor's Theorem with the remainder in different forms. For example, if we choose G(x) = x − a, then (G(b) − G(a))/G′(ξ) = b − a. Thus
f(b) = P_{n−1}(b) + (f^{(n)}(ξ)/(n−1)!)(b − a)(b − ξ)^{n−1} for some ξ ∈ (a, b).
Exercise 15.4.1. Try G(x) = (x − a)m for a power m > 1 to see what kind of Taylor’s
formula you can get.

15.5 The error estimate

Taylor's Theorem also provides us with an explicit estimate of the difference between f(x) and its n-th Taylor approximation Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(x − a)^k:
Corollary 15.5.1. Let f : [a, b] → R satisfy the conditions in Taylor's Theorem, and let
En := (|b − a|^n/n!) sup_{ξ∈(a,b)} |f^{(n)}(ξ)|. Then
|f(x) − Σ_{k=0}^{n−1} (f^{(k)}(a)/k!)(x − a)^k| ≤ En for all x ∈ [a, b].

Of course this may not be useful, as the supremum may be infinite. If however in a given situation we know a bit more—for example, that f^{(n)} is differentiable on [a, b]—then we can use standard calculus to evaluate En.
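For instance (a sketch of how the estimate gets used, with my choice of example): for f = exp on [0, 1] we have sup |f^{(n)}| = e, so En = e/n!, and the actual error sits comfortably inside the bound.

```python
import math

x, n = 1.0, 8
p = sum(x**k / math.factorial(k) for k in range(n))  # P_{n-1}(x) at a = 0
error = abs(math.exp(x) - p)
bound = math.e * abs(x)**n / math.factorial(n)       # E_n = sup|f^(n)| |x|^n / n!
print(error, bound, error <= bound)                  # ~2.8e-5, ~6.7e-5, True
```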

15.6 Example: the function log(1 + x)

By way of an example we prove the following:

Proposition 15.6.1. We have
log(1 + x) = Σ_{n=1}^∞ (−1)^{n−1} x^n/n for all x ∈ (−1, 1].

Before we start, note that there is no point in trying to prove that these are equal on any larger real domain, as the radius of convergence of the series is, by the ratio test, equal to 1, and at the other end point x = −1 the series is the notoriously divergent Harmonic Series Σ 1/n. But we will prove equality on all of (−1, 1], in particular that log 2 = Σ_{n=1}^∞ (−1)^{n−1}/n.

Proof. Consider f(x) = log(1 + x). We have already proved that on (−1, ∞) the function f is differentiable with f'(x) = \frac{1}{1+x}; and so, by the usual rules, we have f^{(n)}(x) = \frac{(−1)^{n+1}(n−1)!}{(1+x)^n} for all n ≥ 1.

Hence, by Taylor's Theorem (the symmetric version),

log(1 + x) − \sum_{k=1}^{n-1} \frac{(−1)^{k-1} x^k}{k} = \frac{(−1)^{n-1}}{n} \left( \frac{x}{1 + ξ_n} \right)^n

for some ξ_n between 0 and x.

To get our result it will be enough to show that

\left| \frac{x}{1 + ξ_n} \right| ≤ 1

for every n and x ∈ (−1, 1]; for then the remainder has modulus at most 1/n, which tends to 0.

For x > 0 this is no problem: 0 < ξ_n < 1 and so 1 + ξ_n > 1; hence \frac{x}{1 + ξ_n} < x ≤ 1.

For negative x it is not so easy; the nearer x is to −1 the nearer 1 + ξ_n may get to 0. However, if x ≥ −1/2 we have

−1/2 ≤ x ≤ ξ_n ≤ 0

and so

1/2 ≤ 1 + x ≤ 1 + ξ_n ≤ 1,

which implies

2 ≥ \frac{1}{1 + x} ≥ \frac{1}{1 + ξ_n} ≥ 1;

multiplying by x < 0 yields

2x ≤ \frac{x}{1 + x} ≤ \frac{x}{1 + ξ_n} ≤ x.

Now 2x ≥ −1 and x ≤ 1 so we have

\left| \frac{x}{1 + ξ_n} \right| ≤ 1

as required.

That is, the functions log(1 + x) and \sum_{k=1}^{∞} \frac{(−1)^{k-1} x^k}{k} are equal on [−1/2, 1].

What about (−1, −1/2)? We must use a different argument.


Consider the functions f(x) = log(1 + x) and g(x) = \sum_{k=1}^{∞} \frac{(−1)^{k-1} x^k}{k} on (−1, 1). Both are differentiable there; we have proved f'(x) = \frac{1}{1+x}, and by the theorem on term-by-term differentiation of power series

g'(x) = \sum_{k=1}^{∞} (−1)^{k-1} \frac{k x^{k-1}}{k} = \sum_{k=1}^{∞} (−1)^{k-1} x^{k-1} = \frac{1}{1+x}.

Hence f'(x) − g'(x) = 0, so by the Identity Theorem,

f(x) − g(x) = f(0) − g(0) = 0.

That is, on the whole of (−1, 1] we have the required series expansion.

Remark 15.6.2. The last part has actually proved the result for x ∈ (−1, 1). It is only at
x = 1 that we have to prove that the error tends to zero.
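A numerical footnote of my own (not in the printed notes): at the delicate end point x = 1 the convergence is genuine but very slow, just as the alternating series error bound predicts.

import math

# Partial sums of sum_{n>=1} (-1)^(n-1)/n against log 2. After N terms the
# alternating series test bounds the error by the first omitted term, 1/(N+1).
N, total = 10000, 0.0
for n in range(1, N + 1):
    total += (-1) ** (n - 1) / n
print(total, math.log(2), abs(total - math.log(2)))   # error is below 1/10001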

16 The Binomial Theorem

In this section we use many of the theorems we have proved about uniform convergence and continuity, power series, and monotonicity, as well as Taylor's Theorem. As well as proving an important result, we are showing off the techniques we now have available to us.

16.1 Motivation and Preliminary Algebra

By simple induction we can prove that for any natural number n (including 0) we have, for all real or complex x, that

(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k,

where the coefficient \binom{n}{k} of x^k can be proved to be

\binom{n}{k} = \frac{n!}{k!(n−k)!} = \frac{n \cdot (n−1) \cdots (n−k+1)}{k \cdot (k−1) \cdots 1}.

We have also seen in our work on sequences and series that

(1 + x)^{−1} = \sum_{k=0}^{∞} (−1)^k x^k for all |x| < 1,

and here the coefficient of x^k is

(−1)^k = \frac{(−1) \cdot (−2) \cdots (−k)}{k \cdot (k−1) \cdots 1};

and we can prove by induction (for example using differentiation term by term) that for all natural numbers n ≥ 1 we have that

(1 + x)^{−n} = \sum_{k=0}^{∞} \frac{(−n) \cdot (−(n+1)) \cdots (−(n+k−1))}{k \cdot (k−1) \cdots 1} x^k for all |x| < 1.

In this section we are going to generalise these (in the case of some real values of x) to all real exponents p, not just integers. Note that this is altogether deeper: (1 + x)^p is defined for non-integral p, and for (real) x > −1, to be the function exp(p log(1 + x)).

Definition 16.1.1. For all p ∈ R and all k ∈ N we extend the definition of binomial coefficient as follows:

\binom{p}{0} := 1; and \binom{p}{k} := \frac{p(p−1) \cdots (p−k+1)}{k!}.

We now make sure that the key properties of binomial coefficients are still true in this more general setting.

Lemma 16.1.2.

k \binom{p}{k} = p \binom{p−1}{k−1}, for all k ≥ 1.

Proof. If k = 1 then by the definition we must see 1 \cdot \binom{p}{1} = p \cdot \binom{p−1}{0}, that is p = p \cdot 1, which is clear. Otherwise

k \binom{p}{k} = k \frac{p(p−1) \cdots (p−k+1)}{k!} = p \frac{(p−1) \cdots (p−k+1)}{(k−1)!} = p \binom{p−1}{k−1}.

Lemma 16.1.3.

\binom{p}{k} + \binom{p}{k−1} = \binom{p+1}{k}, for all k ≥ 1.

Proof. When k = 1 we must prove \binom{p}{1} + 1 = \binom{p+1}{1}, that is p + 1 = p + 1, which is clear. Otherwise

\binom{p}{k} + \binom{p}{k−1} = \frac{p(p−1) \cdots (p−k+1)}{k!} + \frac{p(p−1) \cdots (p−k+2)}{(k−1)!}
= \frac{p(p−1) \cdots (p−k+2)}{k!} [(p−k+1) + k]
= \frac{(p+1)p(p−1) \cdots (p−k+2)}{k!}
= \binom{p+1}{k}.
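The following sketch (mine, purely illustrative) implements Definition 16.1.1 directly and checks Lemmas 16.1.2 and 16.1.3 numerically at a non-integral exponent; p = 1/2 is an arbitrary choice.

from math import factorial, isclose

def binom(p, k):
    # Definition 16.1.1: p(p-1)...(p-k+1)/k!, with binom(p, 0) = 1
    num = 1.0
    for j in range(k):
        num *= p - j
    return num / factorial(k)

p = 0.5
for k in range(1, 10):
    assert isclose(k * binom(p, k), p * binom(p - 1, k - 1))        # Lemma 16.1.2
    assert isclose(binom(p, k) + binom(p, k - 1), binom(p + 1, k))  # Lemma 16.1.3
print('both identities hold for k = 1, ..., 9')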

16.2 The Real Binomial Theorem

Theorem 16.2.1 (The Binomial Expansion). Let p be a real number. Then

(1 + x)^p = \sum_{k=0}^{∞} \binom{p}{k} x^k for all |x| < 1.

Note that the coefficients are non-zero provided p is not a natural number or zero; as we have a proof of the expansion in that case we may assume that p ∉ N ∪ {0}.

Lemma 16.2.2. The function f defined on (−1, 1) by f(x) := (1 + x)^p is differentiable, and satisfies (1 + x)f'(x) = pf(x). Also, f(0) = 1.

Proof. The derivative is easily got by the chain rule from the definition of f; it is f'(x) = p(1 + x)^{p−1}. Multiply by (1 + x) and get the required relationship. The value at 0 is clear.

Lemma 16.2.3. The radius of convergence of \sum_{k=0}^{∞} \binom{p}{k} x^k is R = 1.

Proof. Use the ratio test; we have that |a_{k+1} x^{k+1} / a_k x^k| is

\left| \frac{p \cdot (p−1) \cdots (p−k)}{(k+1) \cdot k \cdots 1} \cdot \frac{k \cdot (k−1) \cdots 1}{p \cdot (p−1) \cdots (p−k+1)} x \right| = \left| \frac{p−k}{k+1} x \right| → |x| as k → ∞.

Lemma 16.2.4. The function g defined on (−1, 1) by g(x) = \sum_{k=0}^{∞} \binom{p}{k} x^k is differentiable, with derivative satisfying (1 + x)g'(x) = pg(x). Also, g(0) = 1.

Proof.

(1 + x)g'(x) = (1 + x) \sum_{k=0}^{∞} \binom{p}{k} k x^{k−1}
= (1 + x) \sum_{k=1}^{∞} \binom{p}{k} k x^{k−1}
= p(1 + x) \sum_{k=1}^{∞} \binom{p−1}{k−1} x^{k−1}   (by Lemma 16.1.2)
= p \left\{ \sum_{k=1}^{∞} \binom{p−1}{k−1} x^{k−1} + \sum_{k=1}^{∞} \binom{p−1}{k−1} x^k \right\}
= p \left\{ \sum_{m=0}^{∞} \binom{p−1}{m} x^m + \sum_{m=1}^{∞} \binom{p−1}{m−1} x^m \right\}
= p \left\{ 1 + \sum_{m=1}^{∞} \binom{p−1}{m} x^m + \sum_{m=1}^{∞} \binom{p−1}{m−1} x^m \right\}
= p \left\{ 1 + \sum_{m=1}^{∞} \left[ \binom{p−1}{m} + \binom{p−1}{m−1} \right] x^m \right\}
= p \left\{ 1 + \sum_{m=1}^{∞} \binom{p}{m} x^m \right\}   (by Lemma 16.1.3)
= p \sum_{m=0}^{∞} \binom{p}{m} x^m
= p g(x).

Proof of the Binomial Theorem. Consider φ(x) = \frac{g(x)}{f(x)}, which is well-defined on (−1, 1) as f(x) > 0. By the Quotient Rule we can calculate φ'(x), and then use the lemmas:

φ'(x) = \frac{f(x)g'(x) − f'(x)g(x)}{f(x)^2} = \frac{p}{1+x} \cdot \frac{f(x)g(x) − f(x)g(x)}{f(x)^2} = 0.

Hence by the Identity Theorem, φ(x) is constant: φ(x) = φ(0) = 1. This implies that f(x) = g(x) on (−1, 1).
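Again by way of illustration (my own check, with p and x chosen arbitrarily): inside (−1, 1) the partial sums converge to (1 + x)^p geometrically fast, as the ratio test computation in Lemma 16.2.3 suggests.

from math import factorial

def binom(p, k):
    # Definition 16.1.1
    num = 1.0
    for j in range(k):
        num *= p - j
    return num / factorial(k)

p, x = -0.7, 0.4
partial = sum(binom(p, k) * x**k for k in range(40))
print(partial, (1 + x) ** p)   # the two agree to roughly machine precision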

16.3 The end points: preliminary issue

This section will probably be omitted from the lectures.

The existence of these functions and their equality at the end points requires a more sophisticated argument. The following sections should be viewed as illustrations of the way Taylor's Theorem can be exploited, rather than theorems to be learnt.

The cases x = 1 and x = −1 need to be considered separately. But there is a difference between these! For x = −1 we have not yet defined (1 + x)^p; recall our definition for arbitrary real p, that (1 + x)^p := exp(p log(1 + x)). For integral p (such as p = 0) we have the usual algebraic definition, which is consistent with the exp-log definition when both apply. Can we define 0^p sensibly for any other values of p? For p > 0 we'd clearly like to define 0^p = 0. But if we do so, then to preserve the rule of exponents A^p A^q = A^{p+q} we cannot define negative powers; if p > 0 then 0^{−p} makes no sense. So let us extend our definition of (1 + x)^p in this way, in the case when p > 0. But we need to take care.

Lemma 16.3.1. If p > 0 then the function (1 + x)^p is continuous on [−1, ∞).

Lemma 16.3.2. If p > 1 then the function (1 + x)^p is differentiable on [−1, ∞) with derivative p(1 + x)^{p−1}.

Proofs. Exercises.

16.4 The end points: p ≤ −1

Let p ≤ −1. Then as remarked above, the function (1 + x)^p is not defined at x = −1. Further, the expansion is not valid at x = 1:

Proposition 16.4.1. The series \sum_{k=0}^{∞} \frac{p \cdot (p−1) \cdots (p−k+1)}{k \cdot (k−1) \cdots 1} is divergent.

Proof. Write q = −p ≥ 1; then the modulus of the k-th term is

\left| \frac{p \cdot (p−1) \cdots (p−k+1)}{k \cdot (k−1) \cdots 1} \right| = \left| (−1)^k \frac{q}{1} \cdots \frac{q+s}{s+1} \cdots \frac{q+k−1}{k} \right| ≥ 1;

the terms alternate in sign but as they do not tend to 0 the series diverges.
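One can watch this happen (an illustrative check of my own): for p = −3/2 the moduli of the terms grow without bound, roughly like √k.

p = -1.5
term = 1.0                        # binom(p, 0)
for n in range(1, 1001):
    term *= (p - (n - 1)) / n     # binom(p, n) = binom(p, n-1) * (p - n + 1)/n
    if n in (1, 10, 100, 1000):
        print(n, abs(term))       # increasing, roughly like sqrt(n)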

16.5 The end points: −1 < p < 0

Let −1 < p < 0; note that p + 1 > 0. Again the function (1 + x)^p is not defined at x = −1. However, now the expansion is valid at x = 1:

Proposition 16.5.1. The series \sum_{k=0}^{∞} \frac{p \cdot (p−1) \cdots (p−k+1)}{k \cdot (k−1) \cdots 1} is convergent with sum 2^p.
k · (k − 1) · · · · · 1

Proof. We apply Taylor's Theorem to (1 + x)^p on the interval [0, 1] and find, for each n ≥ 1, a point ξ_n ∈ (0, 1) such that

2^p = \sum_{k=0}^{n-1} \frac{p \cdot (p−1) \cdots (p−k+1)}{k \cdot (k−1) \cdots 1} + E_n, where E_n = \frac{p \cdot (p−1) \cdots (p−n+1)}{n \cdot (n−1) \cdots 1} (1 + ξ_n)^{p−n}.

We have then that

|E_n| ≤ \left| \frac{p \cdot (p−1) \cdots (p−n+1)}{n \cdot (n−1) \cdots 1} \right|,

since 1 + ξ_n > 1 and p − n < 0, so that (1 + ξ_n)^{p−n} < 1; and we will have the result if we prove that this tends to 0 as n → ∞. We rewrite the part depending on n as

\left| \frac{[(p+1) − 1] \cdots [(p+1) − s] \cdots [(p+1) − n]}{1 \cdot 2 \cdots n} \right| = \left( 1 − \frac{p+1}{1} \right) \cdot \left( 1 − \frac{p+1}{2} \right) \cdots \left( 1 − \frac{p+1}{s} \right) \cdots \left( 1 − \frac{p+1}{n} \right).

Now exp(−x) + x − 1 vanishes at 0 and has positive derivative on (0, 1), so by the MVT we have that

1 − \frac{p+1}{s} ≤ exp\left( −\frac{p+1}{s} \right),

so that

|E_n| ≤ exp\left( −(p+1) \sum_{s=1}^{n} \frac{1}{s} \right).

As the harmonic series diverges and (p + 1) > 0, we get that E_n → 0 as n → ∞.
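In the same illustrative spirit (this check is mine, not in the notes): taking p = −1/2, the partial sums at x = 1 do creep towards 2^{−1/2} ≈ 0.7071, though slowly, in line with the bound |E_n| ≤ exp(−(p+1) Σ 1/s).

p = -0.5
term, partial = 1.0, 0.0          # term = binom(p, 0) = 1
for n in range(1, 10001):
    partial += term
    term *= (p - (n - 1)) / n     # next coefficient binom(p, n)
    if n in (10, 100, 1000, 10000):
        print(n, partial, 2 ** p) # partial sums approach 0.70710...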

16.6 The end points: 0 < p

Let p > 0. In this case the expansion is valid at both x = 1 and x = −1.

Proposition 16.6.1. The series \sum_{k=0}^{∞} \frac{p \cdot (p−1) \cdots (p−k+1)}{k \cdot (k−1) \cdots 1} is convergent with sum 2^p; and the series \sum_{k=0}^{∞} (−1)^k \frac{p \cdot (p−1) \cdots (p−k+1)}{k \cdot (k−1) \cdots 1} is convergent with sum 0.

Proof. The end point x = +1 is straightforward; use Taylor's Theorem as before and consider the error estimate

E_n = \frac{p \cdot (p−1) \cdots (p−n+1)}{n \cdot (n−1) \cdots 1} (1 + ξ_n)^{p−n}

for some ξ_n ∈ (0, 1). Then

|E_n| ≤ \frac{p}{n} \left| \frac{(p−1) \cdots (p−n+1)}{1 \cdot 2 \cdots (n−1)} \right| 2^p.

Now \left| \frac{p−s}{s} \right| ≤ 1 whenever 2s ≥ p; so we get that

|E_n| ≤ \frac{p}{n} \left| \frac{(p−1) \cdots (p−[p/2])}{1 \cdot 2 \cdots [p/2]} \right| 2^p → 0 as n → ∞

as required.

The end point x = −1 is more difficult. What we do is prove that the sum converges. Noting that as soon as k > p + 1 all the terms have the same sign, we see that this means we have proved that the series is absolutely convergent at x = −1. By comparison, the power series \sum_{k=0}^{∞} \frac{p \cdot (p−1) \cdots (p−k+1)}{k \cdot (k−1) \cdots 1} x^k is then absolutely convergent on the closed interval [−1, 0]. Hence the series is uniformly convergent on that interval (by the Weierstrass M-test); and so its sum is continuous on [−1, 0]. As the series is equal to (1 + x)^p on (−1, 0] we have by continuity that there is equality at −1 as well.

So we must prove that the series converges. We claim that if we can prove this for any p then we can prove it for p + 1. This is because for all n ≥ 2p + 2 we have that \left| \frac{p+1}{p−n+1} \right| ≤ 1; this allows us to compare the n-th terms and see that those for p + 1 are smaller in modulus. As both series are ultimately series of terms of constant sign, the comparison test will yield that convergence for p yields convergence for p + 1. So assume from now on that 0 < p < 1; it will suffice to deal with this case.
The modulus of the n-th term can then be written

|u_n| = \frac{p}{n} \left( 1 − \frac{p}{1} \right) \cdots \left( 1 − \frac{p}{s} \right) \cdots \left( 1 − \frac{p}{n−1} \right)

and so, using again 1 − t ≤ exp(−t), we have that

|u_n| ≤ \frac{p}{n} exp\left( −p \sum_{s=1}^{n-1} \frac{1}{s} \right)
= \frac{p}{n} exp\left( −p \left( \sum_{s=1}^{n-1} \frac{1}{s} − log n \right) \right) exp(−p log n)
= \frac{p}{n} \cdot \frac{1}{n^p} exp\left( −p \left( \sum_{s=1}^{n-1} \frac{1}{s} − log n \right) \right).

Now we have (Integral Test argument) that

\sum_{s=1}^{n-1} \frac{1}{s} − log n → γ as n → ∞ (γ is Euler's constant).

Hence we have a constant C such that

|u_n| ≤ C \frac{1}{n \cdot n^p} for sufficiently large n,

and so, by the Comparison Test, \sum |u_n| converges.^{15}

Footnote 15: \sum \frac{1}{n^s} is convergent for s > 1 by the Integral Test.
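Finally, an illustrative check of my own for the end point x = −1 with p = 1/2: the signed partial sums tend to 0 = 0^{1/2}, and the sums of moduli stay bounded (in fact they tend to 2), as the estimate |u_n| ≤ C/n^{1+p} predicts.

p = 0.5
term, partial, abs_sum = 1.0, 0.0, 0.0   # term = (-1)^k binom(p, k) at k = 0
for n in range(1, 100001):
    partial += term
    abs_sum += abs(term)
    term *= -(p - (n - 1)) / n           # next signed term
print(partial, abs_sum)                  # partial is near 0; abs_sum is near 2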

