Janet Dyson
Contents
0. Introduction
1. Limits of Functions
2. Continuity of functions
3. Continuity and Uniform Continuity
4. Boundedness of continuous functions on a closed and bounded interval
5. Intermediate value theorem
6. Monotonic Functions and Inverse Function Theorem
7. Limits at infinity and infinite limits
8. Uniform Convergence
9. Uniform Convergence: Examples and Applications
10. Differentiation: definitions and elementary results
11. The elementary functions
12. The Mean Value Theorem
13. Applications of the MVT
14. L’Hospital’s Rule
15. Taylor’s Theorem
16. The Binomial Theorem
Introduction
Acknowledgement
These lectures have been developed by a number of lecturers over the years. I would particularly like to thank Professor Roger Heath-Brown, who gave the first lectures for this course in its present form in 2003, and Dr Brian Stewart and Dr Zhongmin Qian, who allowed me to adapt their lecture notes and use their LaTeX files.
Lectures
To get the most out of the course you must attend the lectures. There will be more expla-
nation in the lectures than there is in the notes.
On the other hand I will not put everything on the board which is in the printed notes. In
some places I have put in extra examples which I will not have time to demonstrate in the
lectures. There is also some extra material which I have put in for interest but which I do not regard as central to the course.
Numbering system:
In the printed notes there are 16 sections. Within each section there are subsections. Theorems, definitions, etc. are numbered consecutively within each subsection. So for example
Theorem 1.2.3 is the third result in Section 1, Subsection 2. I will use the numbering in the
printed notes, even though I will omit some subsections in the lectures, so the numbering
will no longer be consecutive.
Exercise sheets
The weekly problem sheets which accompany the lectures are an integral part of the course.
In Analysis above all you will only understand the definitions and theorems by using them.
I assume that week 1 tutorials are being devoted to the final sheets from the Michaelmas
Term courses.
I suggest that the problem sheets for this course are tackled in tutorials in weeks 2–8, with
the 8th sheet used as vacation work for a tutorial in the first week of Trinity Term.
Corrections
Notation
I will use this notation (which was used in courses in MT) throughout.
• C: set of all complex numbers - the complex plane.
• R: set of all real numbers - the real line; R ⊂ C.
• Q: the rational numbers, Q ⊂ R.
• N: the natural numbers, 1, 2, . . . , N ⊂ Q.
• ∀: “for all” or “for every” or “whenever”.
• ∃: “there exist(s)” or “there is (are)”.
• Sometimes I will write “s. t.” for “such that”, “resp.” for “respectively”, and “iff” for “if and only if”.
Recall the following definition from ‘Introduction to Pure Mathematics’ last term.
If a, b ∈ R then we define intervals as follows:
etc.
1 Limits of Functions
This course builds on the ideas from Analysis I and also uses many of the results from that
course. I have put some of the most important results from Analysis I in these notes but
I will not write them on the board in the lecture. However, I will begin by recalling the
definition of limits for sequences.
Definition 1.1.1. A sequence {zn } of real (or complex) numbers has limit l if
∀ε > 0, ∃N ∈ N such that
|zn − l| < ε ∀n > N.
We denote this by ‘zn → l as n → ∞’ or by ‘limn→∞ zn = l’.
Definition 1.1.2. A sequence {zn } of real (or complex) numbers converges if it has a limit
l.
Often we prove things by contradiction. We start by assuming that what we want is not
true. That means we have to be able to write down the contrapositive of a proposition.
We can do this mechanically: working from the left change every ∀ into ∃, every ∃ into ∀
and negate the simple proposition at the end.
For example, by the first definition, a sequence {zn } does not converge to l¹ if and only if
∃ε > 0 such that ∀k ∈ N, ∃nk > k s. t.
|znk − l| ≥ ε.
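The mechanical negation rule above can be displayed alongside the original definition; the following is simply a LaTeX rendering of the two statements already in the text:

```latex
% z_n -> l :
\forall \varepsilon > 0 \;\; \exists N \in \mathbb{N} \;\; \forall n > N : \;\; |z_n - l| < \varepsilon
% Negation: working from the left, swap each quantifier and negate the final inequality:
\exists \varepsilon > 0 \;\; \forall N \in \mathbb{N} \;\; \exists n > N : \;\; |z_n - l| \ge \varepsilon
```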
Definition 1.1.3. {zn } is called a Cauchy sequence if ∀ε > 0 ∃N ∈ N such that ∀n, m >
N
|zn − zm | < ε.
Here is the key theorem, sometimes called The General Principle for Convergence:
Theorem (Cauchy’s Criterion). A sequence {zn } of real (or complex) numbers converges
if and only if it is a Cauchy sequence.
When mathematicians say that the real number system R and the complex number system
C are complete what they mean is that this theorem is true. There are no sequences which
look as though they converge but don’t, there are no ‘gaps’ in the real number line.
According to Cauchy’s criterion, {zn } diverges [i.e. has no finite limit] if and only if ∃ε > 0 such that ∀k ∈ N there exist [at least] two integers nk1 , nk2 > k s. t. |znk1 − znk2 | ≥ ε.
¹ i.e. either {zn } diverges, or zn → a ≠ l.
1.2 Compactness
The proofs of theorems about continuous functions we are going to prove in this course rely
on the following
Corollary 1.2.1. A bounded sequence {zn } in R (or in C) converges to a limit l if and only
if all convergent subsequences of {zn } have the same limit l.
Proof. =⇒: This was proved in Analysis I: any subsequence of a convergent sequence tends
to the same limit.
⇐=: We argue by contradiction. Suppose {zn } is divergent.
Since {zn } is bounded, there exists a subsequence {znk } converging to some limit l1 by the
Bolzano–Weierstrass Theorem. Notice that N \ {nk : k ∈ N} can’t be finite: if that were to
happen then {zn } and {znk } would have a common tail and so zn → l1 after all.
We can therefore let {zmk } be the subsequence obtained by omitting the terms labelled by nk . If
this subsequence converges to l1 then it is easy to see that {zn } converges to l1 . (In Analysis
I we did this for special cases like z2n → l and z2n+1 → l; the argument is easily modified.)
So we have that {zmk } does not converge to l1 . Writing down the contrapositive, then, we have: there exists ε0 > 0 such that for every natural number j there exists a natural number, which we denote by rj , such that rj > j and
0 < ε0 ≤ |zmrj − l1 |.
But {zmrj } is bounded, so by the Bolzano–Weierstrass Theorem it has a convergent subsequence; the limit l2 of that subsequence satisfies |l2 − l1 | ≥ ε0 > 0, so l2 ≠ l1 , contradicting the hypothesis that all convergent subsequences of {zn } have the same limit.
1.3 Limit points
Before we talk about limits of functions ‘f (x) → l as x → a’ we need to say something about
the sort of points a in a set that we are interested in—we want to exclude points which x
can’t get near!
Definition 1.3.1. Let E ⊆ R (or C). A point p ∈ R (or C) is called a limit point (or an accumulation point, or a cluster point) of E if ∀δ > 0 there is at least one point z ∈ E other than p such that
0 < |z − p| < δ.
Definition 1.3.2. A point which is not a limit point of E is called an isolated point of E.
There are all sorts of exotic examples of limit points but most sets we will consider are
intervals so the following result is crucial:
Theorem 1.3.3. p ∈ R is a limit point of an interval [a, b] (or (a, b], or [a, b), or (a, b)) if and only if p ∈ [a, b].
Proof for the interval (a, b]. There are (by trichotomy) only three cases: p < a, p ∈ [a, b], and
p > b. In the first take δ := (a − p)/2 and get a contradiction, in the third take δ := (p − b)/2.
The case p ∈ [a, b] is an exercise.
1.4 Functions
Although there’s no such thing as a typical function, here are three examples which are often useful as test cases when we formulate definitions and make conjectures.
Example 1.4.1. f (x) = √(1 − x²) with domain E = [−1, 1]. What is its graph? Its graph looks continuous . . . .
This time our sketch of the graph is a bit more sketchy. [Try with Maple].
Example 1.4.3. The function f (x) = x sin x1 with domain R\{0} is an important test case
in our work. As x gets close to 0, the values of f oscillate, but they do get close to 0. Once
we have formalised this we will see that f has limit 0 as x goes to 0.
1.5 Limits of functions
Definition 1.5.1. Let E ⊆ R (or C), and f : E → R (or C) be a real (or complex) function. Let p be a limit point of E and let l be a number. We say that f tends to l as x tends to p if ∀ε > 0 ∃δ > 0 such that
|f (x) − l| < ε ∀x ∈ E such that 0 < |x − p| < δ.
Example 1.5.3. Let α > 0 be a constant. Consider the function f (x) = |x|^α sin(1/x) on the domain E = R\{0}. Show that f (x) → 0 as x → 0.
Since | sin θ| ≤ 1 we have that ||x|^α sin(1/x)| ≤ |x|^α for any x ≠ 0. Therefore, given ε > 0, choose δ = ε^{1/α}. Then
||x|^α sin(1/x) − 0| ≤ |x|^α < ε
whenever 0 < |x − 0| < δ.
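As a quick numerical sanity check of the choice δ = ε^{1/α} (an illustration only; the values α = 1/2 and ε = 0.01 are chosen arbitrarily):

```python
import math

alpha = 0.5
eps = 0.01
delta = eps ** (1 / alpha)  # here delta = eps^2 = 1e-4

# sample points with 0 < |x| < delta and check |x|^alpha * sin(1/x) stays below eps
for k in range(1, 100):
    x = delta * k / 100.0
    assert abs(abs(x) ** alpha * math.sin(1 / x)) <= abs(x) ** alpha < eps
```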
Example 1.5.4. Consider the function f (x) = x² on the domain E = R. Let a ∈ R. Show that f (x) → a² as x → a.
Given ε > 0, choose δ = min{1, ε/(2|a| + 1)}. If 0 < |x − a| < δ then |x + a| ≤ |x − a| + 2|a| < 2|a| + 1, so
|x² − a²| = |x − a| |x + a| < (2|a| + 1)|x − a| < ε
as required.
Proof. Suppose f (x) → l1 and also f (x) → l2 as x → p, where l1 ≠ l2 . Then ½|l1 − l2 | > 0, so by definition, ∃δ1 > 0 such that
|f (x) − l1 | < ½|l1 − l2 | ∀x ∈ E such that 0 < |x − p| < δ1 ,
and ∃δ2 > 0 such that
|f (x) − l2 | < ½|l1 − l2 | ∀x ∈ E such that 0 < |x − p| < δ2 .
Let δ = min{δ1 , δ2 }. Since p is a limit point of E and δ > 0, ∃x0 ∈ E such that 0 < |x0 − p| < δ. However
|l1 − l2 | ≤ |f (x0 ) − l1 | + |f (x0 ) − l2 | < |l1 − l2 |,
a contradiction.
The following theorem translates questions about function limits to questions about sequence
limits, and so we can make use of results in Analysis I.
Theorem 1.5.7. Let f : E → R (or C) where E ⊆ R (or C), p be a limit point of E and
l ∈ C. Then the following two statements are equivalent:
(a) f (x) → l as x → p;
(b) For every sequence {pn } in E such that pn ≠ p and limn→∞ pn = p we have that
f (pn ) → l as n → ∞.
We might say informally that limx→p f (x) = l if and only if f tends to the same limit l along
any sequence in E going to p.
Proof. =⇒: Suppose limx→p f (x) = l. Then ∀ε > 0, ∃δ > 0 such that
|f (x) − l| < ε ∀x ∈ E such that 0 < |x − p| < δ.
Since pn → p, ∃N ∈ N such that |pn − p| < δ ∀n > N. So, since pn ≠ p, we have 0 < |pn − p| < δ and hence
|f (pn ) − l| < ε ∀n > N.
Hence, limn→∞ f (pn ) = l.
⇐=: Argue by contradiction. Suppose limx→p f (x) = l is not true. Then ∃ε0 > 0 such that ∀δ > 0—which we choose to be 1/n for arbitrary n—∃xn ∈ E with 0 < |xn − p| < 1/n but
|f (xn ) − l| ≥ ε0 .
Therefore we have found a sequence {xn } which converges to p but {f (xn )} does not tend to l. Contradiction.
1.6 Example
The above result is very useful when we want to prove that limits do not exist.
Example 1.6.1. Show that limx→0 sin x1 doesn’t exist.
Let xn = 1/(πn) and yn = 1/(2πn + π/2). Then both sequences xn and yn tend to 0, but
limn→∞ sin(1/xn ) = 0
and
limn→∞ sin(1/yn ) = 1.
So limx→0 sin(1/x) cannot exist.
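A quick numerical check of the two sequences in Example 1.6.1 (an illustration only, not part of the proof):

```python
import math

# x_n = 1/(pi n):           sin(1/x_n) = sin(pi n)          = 0
# y_n = 1/(2 pi n + pi/2):  sin(1/y_n) = sin(2 pi n + pi/2) = 1
for n in range(1, 50):
    x_n = 1 / (math.pi * n)
    y_n = 1 / (2 * math.pi * n + math.pi / 2)
    assert abs(math.sin(1 / x_n)) < 1e-9       # tends to 0 along x_n
    assert abs(math.sin(1 / y_n) - 1) < 1e-9   # tends to 1 along y_n
```

Both sequences tend to 0, yet sin(1/x) takes values near 0 along one and near 1 along the other, which is exactly why the limit cannot exist.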
1.7 Algebra of Limits
We can use the theorem of the previous subsection together with the Algebra of Limits of
Sequences to prove the corresponding results: we get the Algebra of Limits of Functions. We
state the theorem for C but it also holds for R.
Theorem 1.7.1. Let E ⊆ C and let p be a limit point of E. Let f, g : E → C, and let
α, β ∈ C. Suppose that f (x) → A, g(x) → B as x → p. Then the following limits exist and
have the values stated:
(Quotient) if B ≠ 0 then ∃δ > 0 s.t. g(x) ≠ 0 ∀x ∈ E such that 0 < |x − p| < δ, and limx→p (f (x)/g(x)) = A/B;
It is a good exercise to prove these results directly from the definitions; just mimic the
sequence proofs.
Example of proof: If B ≠ 0 then ∃δ > 0 s.t. g(x) ≠ 0 ∀x ∈ E such that 0 < |x − p| < δ, and limx→p (1/g(x)) = 1/B.
I will do it both ways:
(i) Deduction from AOL for sequences: Suppose first that there is no such δ. Then for each
n, ∃pn ∈ E such that 0 < |pn − p| < 1/n, and g(pn ) = 0. But then pn → p, so g(pn ) → B,
giving B = 0, a contradiction. So δ > 0 exists.
Now let {xn } be any sequence in E with xn → p and xn ≠ p. We may assume xn ∈ (p − δ, p + δ) (by tails). Hence g(xn ) ≠ 0 and g(xn ) → B. Thus by the AOL for sequences, 1/g(xn ) → 1/B. Thus, by Theorem 1.5.7, 1/g(x) → 1/B as required.
(ii) Direct proof: Take ε = |B|/2 > 0. So ∃δ1 > 0 such that
|g(x) − B| < |B|/2 ∀x ∈ E such that 0 < |x − p| < δ1 ,
and hence |g(x)| ≥ |B| − |g(x) − B| > |B|/2 for these x. (So in particular, g(x) ≠ 0 whenever 0 < |x − p| < δ1 .)
Now, given ε > 0, ∃δ2 > 0 such that
|g(x) − B| < |B|²ε/2 ∀x ∈ E such that 0 < |x − p| < δ2 .
Take δ = min{δ1 , δ2 }. Then if x ∈ E is such that 0 < |x − p| < δ,
|1/g(x) − 1/B| = |g(x) − B| / (|g(x)||B|) < (|B|²ε/2) / (|B|²/2) = ε
as required.
Remark 1.7.2. Note we have also proved above that if limx→p g(x) = B ≠ 0, then there is a positive number δ > 0 such that
|g(x)| > |B|/2 ∀x ∈ E such that 0 < |x − p| < δ.
In particular, |g(x)| > 0 ∀x ∈ E such that 0 < |x − p| < δ.
It can be proved similarly that if g : E → R and B > 0, then ∃δ > 0 such that g(x) > B/2 > 0 ∀x ∈ E such that 0 < |x − p| < δ.
1.8 An extension
2 Continuity of functions
We all have a good informal idea of what it means to say that a function has a continuous
graph: we can draw it without lifting the pencil from the paper. But we want now to use
our precise definition of ‘f (x) → l as x → p’ to discuss the idea of continuity. That is we
want to discuss the precise question of whether f is continuous at a particular point p.
2.1 Definition
In the definition of limx→p f (x), the point p need not belong to the domain E of f . But even
if it does, and f (p) is well-defined, the limit of f at p may have nothing to do with f (p).
The classic example is the function
f (x) := 0 if x ≠ 0, and f (x) := 1 otherwise.
Definition 2.1.1. Let f : E → R (or C), where E ⊆ R (or C), and p ∈ E. If ∀ε > 0 ∃δ > 0
such that
|f (x) − f (p)| < ε ∀x ∈ E such that |x − p| < δ
then we say that f is continuous at p.
We continue with the notation of the definition for a moment and see what this means for
isolated and limit points.
Proof. As p is isolated there exists δ > 0 such that there are no points x ∈ E with 0 < |x − p| < δ. The inequality required is therefore vacuously true.
Proof. It’s clear that the continuity definition implies the limit one at once. The limit one,
provided the limit is f (p), delivers all that we need for continuity except that the inequality
|f (x) − l| < ε holds for x = p as well as the other points x in |x − p| < δ. But this is
immediate.
2.2 Examples
Example 2.2.1. Let α > 0 be a constant. The function f (x) = |x|^α sin(1/x) is not defined at x = 0, so it makes no sense to ask if it is continuous there. In such circumstances we modify f in some suitable way. So we look at
g(x) := |x|^α sin(1/x) if x ≠ 0, and g(x) := 0 if x = 0.
Then 0 is a limit point of the domain, and we calculated before that limx→0 g(x) = 0 = g(0),
so g is continuous at 0.
Example 2.2.2. Let f : (0, 1] → R be defined by
f (x) := 1/n if x = m/n in lowest terms, and f (x) := 0 if x is irrational.
This is very like a problem on the Exercise Sheets, so I won’t give a full proof here, only indicate how I would tackle it.
Every p ∈ (0, 1] is a limit point, so we need to work out limx→p f (x) for each p. We know
that we can do this by looking at limn→∞ f (pn ) for each sequence {pn } converging to p.
We know that there is always a sequence of irrationals {xn } converging to p. (Because, from
Analysis I, for every n ∈ N the interval (p, p + 1/n) contains an irrational number xn .) Then
the sequence {f (xn )} is just the null sequence (0, 0, . . . ) with limit 0.
So perhaps we need to distinguish between rational and irrational points?
Suppose p is rational, say p = m/n in lowest terms. Then, with {xn } as above, f (xn ) → 0 but f (p) = 1/n ≠ 0. Therefore f is not continuous at rational points.
Now let p be irrational. Some sequences (for example irrational ones) tend to 0 = f (p). But do all sequences have this property? Let pn → p, and consider f (pn ). If this does not tend to zero, then for some ε > 0 we can find a subsequence such that f (pnj ) ≥ ε. That is, these pnj must be rational and have denominators at most 1/ε. There are only a finite number of such points in the interval, and so, since pn → p, we can find an N such that for n > N all the pn are irrational or have denominators greater than 1/ε. Hence we cannot have the claimed subsequence.
Therefore f is continuous at irrational points since for all sequences {pn } we have that
f (pn ) → f (p).
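The function of Example 2.2.2 can be experimented with in exact arithmetic. This is only an illustration, with decimal truncations of the irrational √2 − 1 = 0.41421356 . . . standing in for a sequence of rationals tending to an irrational point; note that Python’s `Fraction` type stores m/n in lowest terms automatically:

```python
import math
from fractions import Fraction

def f(x: Fraction) -> Fraction:
    # f(m/n) = 1/n with m/n in lowest terms
    return Fraction(1, x.denominator)

assert f(Fraction(1, 2)) == Fraction(1, 2)   # f(1/2) = 1/2
assert f(Fraction(2, 4)) == Fraction(1, 2)   # 2/4 reduces to 1/2
assert f(Fraction(3, 7)) == Fraction(1, 7)

# Decimal truncations of sqrt(2) - 1 have rapidly growing denominators,
# so f(p_k) -> 0 along this rational sequence converging to an irrational.
values = []
for k in range(1, 12):
    p_k = Fraction(int((math.sqrt(2) - 1) * 10**k), 10**k)
    values.append(f(p_k))
assert all(values[i] >= values[i + 1] for i in range(len(values) - 1))
assert values[-1] <= Fraction(1, 10**9)
```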
We can use our characterisation of continuity at limit points in terms of limx→p f (x) together
with the Algebra of Function Limits to prove that the class of functions continuous at p is
closed under all the usual operations. We state the theorem for C but it also holds for R.
Theorem 2.3.1. Let E ⊆ C and let p ∈ E. Let f, g : E → C, and let α, β ∈ C. Suppose
that f, g are continuous at p. Then the following functions are also continuous at p:
(Negation) −f (x);
(Quotient) f (x)/g(x), provided g(p) ≠ 0 (which guarantees that there exists δ > 0 such that f (x)/g(x) is defined ∀x ∈ E such that |x − p| < δ).
Proof. Follows directly from the Algebra of Function Limits. However, it is a good exercise² to write out a proof from the definition—again just mimic what was done for the AOL for sequences.
This follows immediately from the above theorem because the function f (x) = x with domain
C (or R) is continuous.
Proof. For any ε > 0, since g is continuous at f (p), ∃δ1 > 0 such that
|g(y) − g(f (p))| < ε for all y in the domain of g such that |y − f (p)| < δ1 .
That is,
|g(f (x)) − g(f (p))| < ε ∀x ∈ E such that |f (x) − f (p)| < δ1 .
However, f is continuous at p, so ∃δ > 0 such that
|f (x) − f (p)| < δ1 ∀x ∈ E such that |x − p| < δ.
Hence
|g(f (x)) − g(f (p))| < ε ∀x ∈ E such that |x − p| < δ,
so that h is continuous at p.
(b) For every sequence {pn } in E such that limn→∞ pn = p we have that f (pn ) → f (p) as
n → ∞.
² Doing this will reinforce the definitions, but also consolidate your understanding of sequences.
3 Continuity and Uniform Continuity
Having made our definition of ‘continuity’ we will see that actually, what usually matters is
not continuity at a point, but continuity at all points of a set, and the interesting sets are
usually intervals or disks. In the later lectures we are going to establish several important
theorems about continuous functions on bounded intervals.
But here is the definition of continuity on a set.
Definition 3.1.1. Let f : E → R (or C). We say that f is continuous on E if f is continuous at every point of E. That is: ∀p ∈ E, ∀ε > 0, ∃δ > 0 (depending on p and ε)
such that
|f (x) − f (p)| < ε ∀x ∈ E such that |x − p| < δ.
We are about to look at uniform continuity, in which δ does not depend on p. First we will
consider an example which is not uniformly continuous.
3.2 An Example
Consider f (x) = 1/x on E = R \ {0}. By the algebra of limits f is continuous at every p ∈ E, so this is all clear. But we want to analyse what is going on more carefully, to see how the δ is related to ε and the point x in question.
First,
|f (x) − f (p)| = |1/x − 1/p| = |x − p| / (|x||p|)
and we can see that the problem term is 1/x.
However, |p| > 0, and so when |x − p| < ½|p| we have by the Triangle Law that |x| ≥ |p| − |x − p| > ½|p|.
For these x, then, we have that
|f (x) − f (p)| ≤ (2/|p|²) |x − p|
and if we make sure (2/|p|²) |x − p| < ε we will be done. This can be achieved by choosing
δ := min( ½|p|, ½ε|p|² ).
Sometimes we want to be able to control what happens over a set more ‘uniformly’.
Definition 3.3.1. Let f : E → R (or C). We say that f is uniformly continuous on E if
∀ε > 0, ∃δ > 0
such that
|f (p) − f (x)| < ε ∀p ∈ E and ∀x ∈ E such that |p − x| < δ.
Note the difference³ between this and the definition of ‘continuous on E’. In this, the uniform
case, we must find δ on the basis of ε alone, and we have to choose one that will give the
inequality for all x ∈ E and p ∈ E. Obviously if we can do this it is very nice, it gives us a
way of controlling what happens on a set all at once.
Example 3.3.2. Suppose that f is Lipschitz continuous in E: that is, assume that there is a constant M ≥ 0 such that
|f (x) − f (y)| ≤ M |x − y| ∀x, y ∈ E.
Then f is uniformly continuous on E.
³ Swapping ∀s doesn’t give problems, but swapping the ∀p and ∃δ is the crunch.
Take x, y ∈ E. Given ε > 0, choose δ = ε/(M + 1) > 0. Then
|f (x) − f (y)| ≤ M |x − y| ≤ M (ε/(M + 1)) < ε
whenever |y − x| < δ.
Note that our choice of δ does not depend on x or y. For a given ε > 0 we can find a δ that
works for all x and y.
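A numerical illustration of the δ = ε/(M + 1) recipe for a concrete Lipschitz function; the choice f (x) = 3x + 1 with M = 3 is arbitrary, picked only to show that one δ serves at widely separated points:

```python
eps = 0.1
M = 3.0                    # Lipschitz constant of f(x) = 3x + 1
delta = eps / (M + 1)      # one delta chosen from eps alone

def f(x):
    return 3 * x + 1

# the same delta works at widely separated base points
for base in (-1000.0, 0.0, 1000.0):
    x = base
    y = base + 0.9 * delta  # so |x - y| < delta
    assert abs(x - y) < delta
    assert abs(f(x) - f(y)) < eps
```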
Example 3.3.3. f (x) = √x is Lipschitz continuous on [1, ∞), so it is uniformly continuous.
More generally, a continuous function on a closed and bounded set—‘compact set’ as we’ll say next year—is uniformly continuous:
Theorem 3.4.1. Let f : [a, b] → R (or C) be continuous on the closed bounded interval [a, b]. Then f is uniformly continuous on [a, b].
Proof. Suppose that f were not uniformly continuous. By the contrapositive of ‘uniform continuity’ there would exist ε > 0 such that for any δ > 0—which we choose as δ = 1/n for arbitrary n—there exists a pair of points xn , yn ∈ [a, b] such that
|xn − yn | < 1/n but |f (xn ) − f (yn )| ≥ ε.
By the Bolzano–Weierstrass Theorem {xn } has a subsequence {xnk } converging to some p ∈ [a, b]; since |xnk − ynk | < 1/nk we also have ynk → p. As f is continuous at p, both f (xnk ) → f (p) and f (ynk ) → f (p), so f (xnk ) − f (ynk ) → 0, while |f (xnk ) − f (ynk )| ≥ ε for every k,
which is impossible.
3.5 An example on an unbounded interval
Example 3.5.1. f (x) = √x is uniformly continuous on the unbounded interval [0, +∞).
We do this in three steps: we prove uniform continuity on [0, 1], we prove uniform continuity
on [1, +∞), and we patch these together.
√
It is easy to get that x is continuous on [0, 1]; provided |x − p| < 21 p we will get
¯√
¯ x − √p¯ 6 √|x − p| 2
¯
√ 6 √ |x − p|
x+ p 3 p
and can argue from there. Thus it must be uniformly continuous by Theorem 3.4.1.
Secondly, we have already shown that √x is Lipschitz continuous on [1, ∞), so it is uniformly continuous on [1, ∞).
Now we have to patch these together. This is a standard sort of argument which we do this
time as an example.
We have that for all ε > 0, ∃δ1 > 0 such that
|√x − √y| < ½ε ∀x, y ∈ [0, 1] such that |x − y| < δ1 ,
and ∃δ2 > 0 such that
|√x − √y| < ½ε ∀x, y ∈ [1, ∞) such that |x − y| < δ2 .
Take δ = min{δ1 , δ2 }. If x, y ∈ [0, ∞) with |x − y| < δ, then either both points lie in [0, 1], or both lie in [1, ∞), or 1 lies between them; in the last case |x − 1| < δ and |1 − y| < δ, so by the Triangle Law |√x − √y| ≤ |√x − 1| + |1 − √y| < ½ε + ½ε = ε.
Hence we have
|√x − √y| < ε
whenever x, y ∈ [0, ∞) such that |x − y| < δ. By definition, f (x) = √x is uniformly continuous on the unbounded interval [0, +∞).
Example: f (x) = 1/x is not uniformly continuous on (0, 1].
Take ε = 1. We show that there is no δ > 0 such that Definition 3.3.1 holds.
Take sequences xn = 1/n and yn = 1/(n + 1). Then |f (xn ) − f (yn )| = |n − (n + 1)| = 1, but |xn − yn | → 0. So for any δ > 0, there exists n such that |xn − yn | < δ but |f (xn ) − f (yn )| ≥ 1. So f is not uniformly continuous.
4 Continuous functions on a closed and bounded interval
4.1 Boundedness
Definition 4.1.1. Let f : E → R (or C), and let M be a non-negative real number. We say
that f is bounded by M on E if
|f (z)| 6 M ∀z ∈ E.
We also say that M is a bound for f on E. If there is a bound for f on E we say that f is
bounded (on E).
Theorem. Let f : [a, b] → R (or C) be continuous. Then f is bounded on [a, b].
Proof. Argue by contradiction. Suppose f were unbounded; then for any n ∈ N, there is at least one point xn ∈ [a, b] such that |f (xn )| > n. Since {xn } is bounded, by the Bolzano–Weierstrass Theorem, there exists a subsequence {xnk } converging to p, say. Then p is a limit point of the interval [a, b], so p ∈ [a, b]. Note that |f (xnk )| > nk ≥ k. Now f is continuous at p and so we have that
f (p) = limk→∞ f (xnk );
but |f (xnk )| > k → ∞, so {f (xnk )} is unbounded and cannot converge, a contradiction.
Corollary 4.1.4. Let f : [a, b] → R be continuous. Then supx∈[a,b] f (x) and inf x∈[a,b] f (x) exist.
Proof. Immediate.
Note 4.1.5. Recall that the supremum is precisely this: an upper bound, such that nothing
smaller is an upper bound. It is convenient to translate this into ε-language about functions
as follows:
M = sup_{x∈E} f (x) if and only if (i) ∀x ∈ E, f (x) ≤ M; and (ii) ∀ε > 0 ∃xε ∈ E such that f (xε ) > M − ε.
Here now is our second important theorem; note that it is only for real-valued functions.
Theorem 4.1.6 (Continuous functions on [a, b] attain their bounds). Let f : [a, b] → R be continuous; then f attains (or achieves) its supremum and infimum. That is, there exist points x1 and x2 in [a, b] such that f (x1 ) = supx∈[a,b] f (x) and f (x2 ) = inf x∈[a,b] f (x).
Proof. (1st Proof: by contradiction.) Let us prove by contradiction that the supremum M of f is attained.
Assume the contrary, that is, f (x) < M for all x ∈ [a, b]. Consider
g(x) := 1/(M − f (x)),
which is positive and continuous on [a, b]. Therefore g is, as we have proved, bounded on [a, b], by M0 say:
1/(M − f (x)) = g(x) ≤ M0 .
It follows that
f (x) ≤ M − 1/M0
for all x ∈ [a, b], which is a contradiction to the fact that M is the least upper bound.
A similar argument deals with the infimum.
Proof. (2nd Proof: direct.) The continuous function f is bounded by our earlier theorem, so that m := inf x∈[a,b] f (x) exists by the Completeness Axiom of the real number system [Analysis I]. Apply the characterisation of infimum we have given, taking ε := 1/n to find a point xn ∈ [a, b] such that
m ≤ f (xn ) < m + 1/n.
Now {xn } is bounded, so we may use the Bolzano–Weierstrass Theorem to extract a convergent subsequence {xnk }; suppose we have xnk → p. Then p is a limit point of [a, b] so p ∈ [a, b]. Since f is continuous at p, we have that f (xnk ) → f (p). From the inequality
m ≤ f (xnk ) < m + 1/nk
we can deduce, as limits preserve weak inequalities, that
m ≤ limk→∞ f (xnk ) = f (p) ≤ limk→∞ (m + 1/nk ) = m,
so f (p) = m and the infimum is attained. The argument for the supremum is similar.
4.2 A Generalisation
So far we have concentrated on extreme values, the supremum and the infimum. What can
we say about possible values between these?
5 Intermediate Value Theorem
Theorem 5.0.2 (IVT). Let f : [a, b] → R be continuous, and let c be a number between f (a) and f (b). Then there is at least one ξ ∈ [a, b] such that f (ξ) = c.
Proof. By considering −f instead of f if necessary, we may assume that f (a) ≤ c ≤ f (b). The cases c = f (a) and c = f (b) are trivial, so assume f (a) < c < f (b).
Define g(x) = f (x) − c. Then g(a) < 0 < g(b). Let x1 = a and y1 = b. Divide the interval [x1 , y1 ] into two equal parts. If g(½(x1 + y1 )) = 0 then ξ := ½(x1 + y1 ) will do. Otherwise, we choose x2 = x1 and y2 = ½(x1 + y1 ) if g(½(x1 + y1 )) > 0, or x2 = ½(x1 + y1 ) and y2 = y1 if g(½(x1 + y1 )) < 0. Then
g(x2 )g(y2 ) < 0; [x2 , y2 ] ⊂ [x1 , y1 ]; and |y2 − x2 | = ½(y1 − x1 ).
Apply the same argument to [x2 , y2 ] instead of [x1 , y1 ]; we then find that either g(½(x2 + y2 )) = 0 and we can take ξ := ½(x2 + y2 ), or there exist x3 , y3 such that
g(x3 )g(y3 ) < 0; [x3 , y3 ] ⊂ [x2 , y2 ]; and |y3 − x3 | = ½(y2 − x2 ).
By repeating the same procedure, we thus find two sequences xn , yn such that
g(xn )g(yn ) < 0; [xn+1 , yn+1 ] ⊂ [xn , yn ]; and |yn − xn | = (y1 − x1 )/2^{n−1}.
Obviously, {xn } is a bounded increasing sequence, and {yn } is a bounded decreasing sequence. Bounded monotone sequences converge and so xn → ξ and yn → ξ′ for some ξ, ξ′ ∈ [a, b]. Since by the Algebra of Limits ξ′ − ξ = lim(yn − xn ) = 0, we have ξ′ = ξ. Also g(xn )g(yn ) < 0 for every n, so, as g is continuous, g(ξ)² = lim g(xn )g(yn ) ≤ 0; hence g(ξ) = 0, i.e. f (ξ) = c.
Remark 5.0.3. The above proof of the IVT also provides a method of finding roots to
f (ξ) = c, but other methods may find roots faster if additional information about f (e.g. that
f is differentiable) is available.
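The repeated-halving construction in the proof is exactly the bisection method; here is a minimal sketch in code (an illustration, not part of the notes; the tolerance and the sample function are chosen arbitrarily):

```python
def bisect(g, x, y, tol=1e-12):
    """Find a root of g in [x, y], assuming g continuous and g(x) < 0 < g(y)."""
    assert g(x) < 0 < g(y)
    while y - x > tol:
        mid = (x + y) / 2
        if g(mid) == 0:
            return mid
        if g(mid) > 0:
            y = mid          # keep the sign change in [x, mid], as in the proof
        else:
            x = mid
    return (x + y) / 2

# Example: f(x) = x^2, c = 2 on [1, 2]; g = f - c has its root at sqrt(2)
root = bisect(lambda t: t * t - 2, 1.0, 2.0)
assert abs(root - 2 ** 0.5) < 1e-9
```

Each pass halves the interval while preserving the sign change, mirroring the sequences xn, yn of the proof.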
Corollary 5.0.4. Let {[xn , yn ]} be a decreasing net⁶ of closed intervals of R such that the length yn − xn → 0. Then ⋂_{n=1}^∞ [xn , yn ] contains exactly one point.
Remark 5.0.5. The proof of IVT requires more than what we needed for boundedness and
the attainment of bounds. We have used the fact that [a, b] is unbroken. That is, we have
used the fact that [a, b] is “connected”.
⁶ That is, [xn+1 , yn+1 ] ⊂ [xn , yn ] for each n.
Here (for interest) is a sketch of an alternative proof, which identifies ξ as the supremum of
a certain set.
Sketch of alternative proof of IVT. As in the original proof it is sufficient to prove that if g : [a, b] → R is continuous and g(a) < 0 < g(b), then there exists ξ ∈ (a, b) such that g(ξ) = 0.
Define E = {x ∈ [a, b] : g(x) < 0}.
Then E 6= ∅ (why?) and E is bounded (why?). So, by the Completeness Axiom, ξ = sup E
exists. We prove g(ξ) = 0.
Suppose first that g(ξ) < 0 (so ξ ∈ [a, b)). But then, as g is continuous, there exists h > 0
s.t. (ξ, ξ + h) ⊂ [a, b] and g(t) < 0 for t ∈ (ξ, ξ + h). (Proof?) But then ξ + h/2 ∈ E so ξ is
not the sup, a contradiction.
Suppose now that g(ξ) > 0 (so ξ ∈ (a, b]). But then, again because g is continuous, there
exists h > 0 s.t. (ξ − h, ξ) ⊂ [a, b] and g(t) > 0 for t ∈ (ξ − h, ξ). (Proof?) But there exists
t ∈ (ξ − h, ξ] such that g(t) < 0 which is also a contradiction.
Hence g(ξ) = 0 as required.
More generally the IVT is often used to show that algebraic equations have solutions. In
the following, if you draw the graphs of y = ex and y = αx, you will see that if α = e the
curves touch, if α < e they do not meet, but if α > e then they meet twice. The following
example shows how to make this graphical argument rigorous using the IVT. It shows that
if α > e there exist two solutions. Once we have covered differentiability you will be able to
prove that there are exactly two solutions, by using the fact that f ′ (x) < 0 if x < log α, but
f ′ (x) > 0 if x > log α.
Example 5.0.7. Let α > e. Show that there exist two distinct points xi > 0, i = 1, 2, such
that exi = αxi .
Proof: Consider f (x) = e^x − αx. We will prove later that e^x is continuous for all x. Hence f (x) is continuous on [0, ∞). e^x is defined by its power series, so that e^x > x²/2; thus e^X > αX for any X > 2α. Fix such an X (> log α).
Then f (0) = 1 > 0, f (log α) = α(1 − log α) < 0, and f (X) > 0. So we can apply the IVT to the two intervals [0, log α] and [log α, X] to find that there exist x1 ∈ [0, log α] such that f (x1 ) = 0, and x2 ∈ [log α, X] such that f (x2 ) = 0, as required.
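For a concrete α > e the two intervals in the example can be searched numerically. This is an illustration only, with α = 4 chosen arbitrarily; the bisection helper is a sketch, not something from the notes:

```python
import math

alpha = 4.0                       # any alpha > e works
f = lambda x: math.exp(x) - alpha * x

def bisect(g, x, y, tol=1e-12):
    # assumes g(x) and g(y) have opposite signs on [x, y]
    while y - x > tol:
        mid = (x + y) / 2
        if g(x) * g(mid) <= 0:
            y = mid
        else:
            x = mid
    return (x + y) / 2

X = 2 * alpha + 1                 # f(X) > 0 since e^x > x^2/2
x1 = bisect(f, 0.0, math.log(alpha))   # f(0) > 0 > f(log alpha)
x2 = bisect(f, math.log(alpha), X)     # f(log alpha) < 0 < f(X)
assert 0 < x1 < math.log(alpha) < x2 < X
assert abs(f(x1)) < 1e-6 and abs(f(x2)) < 1e-6
```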
5.1 Closed bounded intervals map to closed bounded intervals
Theorem 5.1.1. Let f : [a, b] → R be a real valued continuous map. Then f ([a, b]) = [m, M ]
for some m, M ∈ R.
That is, a continuous real-valued function maps a closed and bounded interval onto a closed
and bounded interval.
Proof. Let m := inf x∈[a,b] f (x) and M := supx∈[a,b] f (x). These exist by the theorem on
boundedness. Clearly f ([a, b]) ⊆ [m, M ].
By the theorem on the attainment of bounds, there exist ξ ∈ [a, b] and η ∈ [a, b] such that
f (ξ) = m and f (η) = M ; hence m, M ∈ f ([a, b]).
Now let y ∈ [m, M ], so f (ξ) ≤ y ≤ f (η). By applying the IVT to f restricted to the interval [ξ, η] (or [η, ξ], as the case may be) we find an x between ξ and η, hence x ∈ [a, b], such that f (x) = y; hence y ∈ f ([a, b]). Hence [m, M ] ⊆ f ([a, b]).
6 Monotonic Functions and Inverse Function Theorem
The following definitions require the ordered structure of the real numbers, and so are not available for functions on a subset of the complex plane.
Recall that the inverse function was defined in ‘Introduction to Pure Mathematics’ last term.
We have seen that continuous functions map intervals to intervals. We want to say something
about the inverse function when it exists. Note that any result about increasing functions
f can be translated into a result about decreasing functions by the simple expedient of
considering the functions −f .
We will prove:
Theorem 6.2.2 (Inverse Function Theorem (IFT)). Let f be a strictly increasing and
continuous real function on [a, b]. Then f has a well-defined continuous inverse on [f (a), f (b)].
Theorem 6.2.3. Let f : [a, b] → R be strictly increasing and continuous on [a, b]. Then
(i) f ([a, b]) = [f (a), f (b)];
(ii) there exists a unique function g : [f (a), f (b)] → R such that g(f (x)) = x for all x ∈ [a, b] and f (g(y)) = y for all y ∈ [f (a), f (b)];
(iii) g is strictly increasing;
(iv) g is continuous.
Proof. The first assertion is just Theorem 5.1.1 as in this case m = f (a) and M = f (b).
The second is straightforward: f : [a, b] → [f (a), f (b)] is now 1–1 and onto. So given y ∈ [f (a), f (b)] there exists a unique x ∈ [a, b] such that f (x) = y. Define g(y) = x. So the inverse function exists and is unique.
The third assertion is also straightforward. Assume there exist u, v ∈ [f (a), f (b)] with u < v but g(u) ≥ g(v). But as f is strictly increasing this implies u = f (g(u)) ≥ f (g(v)) = v, a contradiction.
It is the fourth assertion that needs our attention. We must prove that for any y0 ∈
[f (a), f (b)] the function g is continuous at y0 .
Let y0 ∈ (f (a), f (b)). Given ε > 0, if necessary take ε smaller so that g(y0 ) + ε ∈ [a, b] and g(y0 ) − ε ∈ [a, b]. Choose δ = min{f (g(y0 ) + ε) − y0 , y0 − f (g(y0 ) − ε)}. (Draw the graph of g(y) to see why we choose it like this.) Then
y0 − δ < y < y0 + δ
=⇒ f (g(y0 ) − ε) < y < f (g(y0 ) + ε)
=⇒ g(f (g(y0 ) − ε)) < g(y) < g(f (g(y0 ) + ε))
=⇒ g(y0 ) − ε < g(y) < g(y0 ) + ε
and g is continuous at y0 as required. The cases y0 = f (a) and y0 = f (b) are similar.
Remark 6.2.4. Note that from Q1, problem sheet 3, if f : [a, b] → R is a continuous, 1–1
function with f (a) < f (b), then f is strictly increasing on [a, b]. So for the Inverse Function
Theorem (IFT) it is sufficient to assume that f : [a, b] → R is continuous and 1–1.
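Since a strictly increasing continuous f maps [a, b] onto [f (a), f (b)], the inverse g(y) can be computed by the same bisection idea. A sketch for illustration, with the arbitrary choice f (x) = x³ + x on [0, 2]:

```python
def f(x):
    return x ** 3 + x          # strictly increasing and continuous on [0, 2]

def g(y, a=0.0, b=2.0, tol=1e-12):
    # inverse of f on [f(a), f(b)], located by bisection using monotonicity
    assert f(a) <= y <= f(b)
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) < y:
            a = mid
        else:
            b = mid
    return (a + b) / 2

for y in (0.0, 1.0, 2.0, 5.0, 10.0):
    assert abs(f(g(y)) - y) < 1e-8   # f(g(y)) = y on [f(0), f(2)] = [0, 10]
assert abs(g(2.0) - 1.0) < 1e-9      # since f(1) = 2
```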
Note 6.2.5. I have not used the notation f −1 for the inverse function. If you do choose to use it then you must make very clear what you intend the domains of f and f −1 to be. It is not for nothing that the special notations ‘arcsin’ etc. exist! For example, sine and cosine are only invertible on a part of their domain where they are increasing or decreasing.
In the following I will consider the functions only on real domains. Some of the results extend
to complex domains.
Recall from Analysis I that functions such as exp(x), sin(x), cos(x), sinh(x) and cosh(x) etc
are defined by their power series each of which has infinite radius of convergence. Later we
will see that a power series is continuous within its radius of convergence so each of these
functions is continuous on R. For each of them, if we take as domain a closed interval on
which the function is strictly monotone, then we can use the IFT to show the function has
a continuous inverse. (See also Problem sheet 3 Q3)
In particular we can therefore define the exponential function exp : R → R as exp(x) = Σ_{n≥0} x^n /n!. The following properties were proved in Analysis I (though some used results to be proved in this course):
4. exp(x) > 0;
5. exp is strictly increasing and exp : R → (0, ∞) is a bijection and hence invertible. The
inverse is denoted by log : (0, ∞) → R;
8. For any a > 0 and any x ∈ R define a^x = exp(x log a). Then a^{x+y} = a^x a^y; also e^x = exp(x);
In addition
Proof. We have
|exp(x + h) − exp(x)| = exp(x) |exp(h) − 1|,
so for |h| < 1 we have, by the Triangle Law and the preservation of ≤ under limits,
|exp(x + h) − exp(x)| ≤ exp(x) ∑_{n≥1} |h|ⁿ/n! ≤ exp(x) ∑_{n≥1} |h|ⁿ = exp(x) · |h|/(1 − |h|),
which tends to 0 as h → 0.
Proof. We apply the theorem by finding an A > 0 such that 1/(1 + A) < y < 1 + A and
considering exp : [−A, A] → [exp(−A), exp(A)], as, from (10) above, the image interval then
contains y.
For functions defined on an interval, we may talk about right-hand and left-hand limits.
Definition 6.4.1. (i) Let f : [a, b) → R (or C) and p ∈ [a, b); and let l ∈ R (or l ∈ C). We say that l is the right-hand limit of f at p if, ∀ε > 0, ∃δ > 0 such that
|f(x) − l| < ε whenever p < x < p + δ.
We write this as f(x) → l as x → p+, or lim_{x→p+} f(x) = l.
Similarly we have:
(ii) Let f : (a, b] → R (or C) and p ∈ (a, b]; and let l ∈ R (or l ∈ C). We say that l is the left-hand limit of f at p if, ∀ε > 0, ∃δ > 0 such that
|f(x) − l| < ε whenever p − δ < x < p.
We write this as f(x) → l as x → p−, or lim_{x→p−} f(x) = l.
Proposition 6.4.2. Let f : (a, b) → C and let p ∈ (a, b). Then the following are equivalent:
(i) lim_{x→p} f(x) = l;
(ii) the right-hand and left-hand limits of f at p both exist and both equal l.
Proposition 6.5.2. Let f : (a, b) → R and let p ∈ (a, b). Then the following are equivalent:
(i) f is continuous at p;
(ii) the right-hand and left-hand limits of f at p both exist and both equal f(p).
We now discuss the continuity of monotone functions. Remember that any result about increasing functions f can be translated into a result about decreasing functions by the simple expedient of considering the function −f.
Theorem 6.6.1. Let f : (a, b) → R be an increasing function. Then for every x0 ∈ (a, b) the
right-hand limit f (x0 +) and the left-hand limit f (x0 −) of f at x0 exist.
Moreover, f(x0−) = sup_{a<x<x0} f(x), f(x0+) = inf_{x0<x<b} f(x), and f(x0−) ≤ f(x0) ≤ f(x0+).
Proof. By hypothesis, {f(x) : a < x < x0} is non-empty and is bounded above by f(x0), and therefore has a least upper bound A := sup_{a<x<x0} f(x). Then A ≤ f(x0). We have to show that f(x0−) = A.
Let ε > 0 be given. It follows from the definition of sup_{a<x<x0} f(x) that there is an xε ∈ (a, x0) such that
A − ε < f(xε) ≤ A.
As x0 − xε > 0, choose δ := x0 − xε. Then x ∈ (xε, x0) if and only if 0 < x0 − x < δ, and thus, as f is increasing,
A − ε < f(xε) ≤ f(x) ≤ A for all x with 0 < x0 − x < δ.
By definition f(x0−) = A and we are done.
The other inequality can be obtained by a similar argument (a good exercise); or by applying what
we have done to the function −f (b − x) on (0, b − a) and juggling with the inequalities.
Remark 6.6.2. Informally we call the difference f (x0 +) − f (x0 −) the “jump” of f at x0 .
We want to extend our definition of the limit ‘limx→a f (x)’ to allow us to talk about the end
points of infinite intervals like (0, ∞).
Note 7.1.3. We will often just write ‘f (x) → l as x → ∞’ for ‘f (x) → l as x → +∞’. There
is a slight danger of confusion—see what we say about functions of a complex variable—but
if we take care it will be all right.
Note that there may be a mild inconsistency with the previous definition if E ⊆ R. If we are
thinking ‘complex’ we’ll need both the real limits at ±∞ to be equal.
Example 7.3.2. Consider (sin z)/z as z → ∞. For real values z = x we get that |sin x / x| ≤ 1/|x| → 0 as x → ∞. But for pure imaginary values like z_k = 2πik, with k ∈ Z, we get that
|sin z_k / z_k| = (e^{2πk} − e^{−2πk})/(4πk) → ∞ as k → ∞.
Exercise 7.3.3. Write down the contrapositive of ‘f tends to a limit as z → ∞’.
Very briefly we discuss ‘infinite limits’. We must take great care not to deceive ourselves: in
neither R nor C is there a number ∞.
We show that (1 + 1/x)^x tends to a limit as x → +∞, equal to e.
Let y = log(1 + 1/x). Then
1/(x log(1 + 1/x)) − 1 = (exp(y) − 1 − y)/y.
Note that as 1 + 1/x > 1 for x > 0, we have y > 0, and then
0 ≤ (exp(y) − 1 − y)/y = (∑_{n≥2} yⁿ/n!)/y ≤ (∑_{n≥2} yⁿ)/y = y/(1 − y).
So if we can show that y → 0 as x → ∞ we are done.
Let ε > 0. By continuity of log at 1 we can find δ > 0 such that |log t| < ε for t ∈ (1, 1 + δ). Take K = 1/δ. Then, as x > K implies 1/x < δ,
|y| = |log(1 + 1/x)| < ε ∀x > K.
8 Uniform Convergence
8.1 Motivation
Let E ⊆ R (or C), and let p ∈ E be a limit point, so that p = limx→p x. We have seen that
‘continuity at p’ is exactly the right condition to ensure that
lim_{x→p} f(x) = f(lim_{x→p} x) = f(p),
that is, to ensure that ‘taking the limit lim_{x→p}’ and ‘finding the value under f’ can be interchanged.
There are many other situations in which we would like to understand whether the order in which we perform two mathematical operations is significant or not:
(i) Suppose we have not just a single function f on E but a whole sequence {fn }. When is
limn→∞ limx→p fn (x) = limx→p limn→∞ fn (x)?
(iii) Once we have defined derivatives and integrals—as limits—we will want to know when
lim_{n→∞} f_n′(x) = (lim_{n→∞} f_n(x))′, and when lim_{n→∞} ∫_a^b f_n(t) dt = ∫_a^b lim_{n→∞} f_n(t) dt?
The answers to some of these questions are given in this lecture and the next.
To see that there are non-trivial problems we look at one typical example.
Example 8.1.1. Consider the sequence of functions {f_n}, where f_n : [0, 1] → R is given by
f_n(x) = −nx + 1 if 0 ≤ x < 1/n,
f_n(x) = 0 if x ≥ 1/n.
Sketch their graphs, and note that for all x ∈ [0, 1] the pointwise limit f(x) = lim_{n→∞} f_n(x) exists.
Note that although all the fn are continuous the limit function f is not continuous at 0.
Also,
lim_{x→0} lim_{n→∞} f_n(x) = lim_{x→0} f(x) = 0,
while
lim_{n→∞} lim_{x→0} f_n(x) = lim_{n→∞} 1 = 1,
so that
lim_{x→0} lim_{n→∞} f_n(x) ≠ lim_{n→∞} lim_{x→0} f_n(x).
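The failure of the interchange in Example 8.1.1 is easy to check numerically. The following Python sketch (an illustration added for concreteness, not part of the printed notes) evaluates f_n at and near 0:

```python
# Example 8.1.1 numerically: f_n(x) = 1 - n*x for 0 <= x < 1/n, and 0 otherwise.

def f_n(n, x):
    """The tent-like function of Example 8.1.1 on [0, 1]."""
    return 1 - n * x if x < 1 / n else 0.0

# Pointwise limit: f(0) = 1, but f(x) = 0 for every fixed x > 0.
assert all(f_n(n, 0) == 1 for n in (1, 10, 1000))   # f_n(0) = 1 for every n
assert f_n(10**6, 0.25) == 0.0                      # for fixed x > 0, f_n(x) -> 0

# Hence lim_{x->0} lim_{n->inf} f_n(x) = 0, while
# lim_{n->inf} lim_{x->0} f_n(x) = lim_{n->inf} f_n(0) = 1.
```

The two iterated limits really do come out as 0 and 1 respectively, matching the computation above.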
8.2 Definition
Just as when we defined ‘uniform continuity’ as a stronger version of ‘continuous at all points’
by insisting on being able to choose one ‘δ’ to deal with all points, so we now strengthen our
definition of ‘convergent’.
So let E ⊆ R (or C) and let f_n : E → R (or C) be a sequence of functions. Then for each (fixed) x ∈ E, {f_n(x)} is a sequence of real (or complex) numbers. If this sequence converges for every x ∈ E, then the limit will depend on x, so we call it f(x). Thus f : E → R (or C) is a function. Hence we have the definition (using Analysis I):
Definition 8.2.1. By fn converges to f on E we mean that ∀x ∈ E, and ∀ε > 0, ∃N ∈ N
such that
|fn (x) − f (x)| < ε ∀n > N.
So, of course, in general N depends on x. For ‘uniform convergence’ we insist that one N
works for all x.
Definition 8.2.2. By fn converges uniformly to f on E we mean that ∀ε > 0, ∃N ∈ N
such that
|fn (x) − f (x)| < ε ∀n > N and ∀x ∈ E.
We write this as ‘f_n → f uniformly on E’ or ‘f_n →ᵘ f’.
There is one special case which we should single out. Suppose that for each n ∈ N we have that s_n(x) = ∑_{k=0}^n f_k(x), and suppose that s : E → R (or C). If we apply the definition to the sequence {s_n} and the function s we get:
Remark 8.2.4. We say that the series ∑ f_n converges uniformly to s on E if ∀ε > 0, ∃N ∈ N such that
|∑_{k=0}^n f_k(x) − s(x)| < ε ∀n > N and ∀x ∈ E.
8.3 Test for Uniform Convergence
(i) fn → f uniformly on E;
(ii) ∃N s.t. ∀n > N , the real numbers mn := supx∈E |fn (x) − f (x)| exist and moreover
mn → 0 as n → ∞.
Proof. (=⇒)
Suppose f_n → f uniformly on E. Then (by definition) ∀ε > 0, ∃N ∈ N such that
|f_n(x) − f(x)| < ½ε ∀x ∈ E and ∀n > N.
Hence, for each n > N, ½ε is an upper bound of the set {|f_n(x) − f(x)| : x ∈ E}. Then the least upper bounds satisfy
m_n = sup_{x∈E} |f_n(x) − f(x)| ≤ ½ε < ε ∀n > N.
By the definition of sequence limits, lim_{n→∞} m_n = 0.
(⇐=)
Suppose the m_n exist for all n > N₁, and that lim_{n→∞} sup_{x∈E} |f_n(x) − f(x)| = 0. Then ∀ε > 0, ∃N ≥ N₁ such that
sup_{x∈E} |f_n(x) − f(x)| < ε ∀n > N.
Therefore
|f_n(x) − f(x)| ≤ sup_{x∈E} |f_n(x) − f(x)| < ε ∀x ∈ E and ∀n > N.
This is the definition of f_n → f uniformly on E.
Example 8.3.2. Let E = [0, 1) and let f_n(x) = xⁿ. Clearly lim_{n→∞} f_n(x) = 0, so f(x) = 0. Then m_n = sup_{x∈E} |xⁿ − 0| = sup_{x∈E} xⁿ. But x_n = (1/2)^{1/n} ∈ E and f_n(x_n) = 1/2, so that
m_n ≥ f_n(x_n) = 1/2 ↛ 0 as n → ∞,
so f_n is not uniformly convergent on [0, 1).
However, suppose instead we consider E = [0, r], where 0 < r < 1 is a fixed constant. Then xⁿ → 0 uniformly on E, because now
m_n = sup_{[0,r]} xⁿ ≤ rⁿ → 0 as n → ∞.
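As a quick numerical companion to Example 8.3.2 (a sketch, not part of the notes), we can approximate m_n = sup |xⁿ| by sampling and watch it stay near 1 on [0, 1) but shrink like rⁿ on [0, r]:

```python
# Approximate the sup of x^n over [0, right) by sampling (right endpoint excluded).

def sup_xn(n, right, samples=10_000):
    """Crude approximation to sup_{0 <= x < right} x^n."""
    return max((k / samples * right) ** n for k in range(samples))

# On [0, 1): the sup stays near 1, so m_n does not tend to 0 (no uniformity).
assert sup_xn(50, 1.0) > 0.99

# On [0, 0.9]: m_n = 0.9^n -> 0, so the convergence IS uniform there.
assert 0.9 ** 50 < 0.01
```

The sampled sup on [0, 1) is ≥ (0.9999)⁵⁰ ≈ 0.995, mirroring the argument with x_n = (1/2)^{1/n}.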
Remark 8.3.3. The test is practical on a closed and bounded interval E = [a, b] in cases where the functions f_n and f are differentiable. In such cases the supremum will be achieved either at a, or at b, or at some interior point where d(f_n(x) − f(x))/dx = 0. We will prove this later in the course; for the moment use it in exercises.⁸
⁸ Of course we will not use it in building up the theory.
8.4 Cauchy’s Criterion
The sequence {f_n} converges uniformly on E if and only if ∀ε > 0, ∃N ∈ N such that
|f_n(x) − f_m(x)| < ε ∀n, m > N and ∀x ∈ E. (*)
Proof. (=⇒) Suppose f_n converges uniformly on E with limit function f. Then ∀ε > 0, ∃N ∈ N such that
|f_n(x) − f(x)| < ½ε ∀n > N and ∀x ∈ E.
So, ∀x ∈ E and ∀n, m > N,
|f_n(x) − f_m(x)| ≤ |f_n(x) − f(x)| + |f(x) − f_m(x)| < ½ε + ½ε = ε.
(⇐=) Conversely, suppose (*) holds. Then for any x ∈ E, {f_n(x)} is a Cauchy sequence, so it is convergent. Let us denote its limit by f(x). For every ε > 0, choose N ∈ N such that
|f_n(x) − f_m(x)| < ε ∀n, m > N and ∀x ∈ E.
For any fixed n > N and x ∈ E, letting m → ∞ in the above inequality we obtain, by the preservation of weak inequalities, that
|f_n(x) − f(x)| ≤ ε ∀n > N and ∀x ∈ E,
and hence f_n → f uniformly on E.
As a consequence, we prove the following simple but important uniform convergence test for
series.
Theorem 8.5.1 (The Weierstrass M-Test). Let E ⊆ R (or C) and fn : E → R (or C).
Suppose that there is a sequence {M_n} of real numbers such that
|f_n(x)| ≤ M_n ∀x ∈ E.
If ∑_{n=0}^∞ M_n converges then ∑_{n=0}^∞ f_n converges uniformly on E.
Note that the Mn must be independent of x.
Proof. By Cauchy’s Criterion for the convergence of ∑ M_n we have that ∀ε > 0, ∃N ∈ N such that
∑_{k=m+1}^n M_k < ε ∀n > m > N.
Then for all x ∈ E,
|∑_{k=m+1}^n f_k(x)| ≤ ∑_{k=m+1}^n |f_k(x)| ≤ ∑_{k=m+1}^n M_k < ε ∀n > m > N,
so by Cauchy’s Criterion for uniform convergence the series ∑ f_n converges uniformly on E.
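The point of the M-test is that one tail bound works for every x at once. Here is a hedged numerical illustration (not from the notes) with f_n(x) = sin(nx)/n², M_n = 1/n², using the elementary estimate ∑_{k>m} 1/k² ≤ 1/m:

```python
# Weierstrass M-test illustration: f_k(x) = sin(k x)/k^2 with M_k = 1/k^2.
import math

def tail_bound(m):
    """sum_{k > m} 1/k^2 <= 1/m -- a bound independent of x."""
    return 1 / m

def tail(x, m, upto=20_000):
    """|sum_{k=m+1}^{upto} sin(k x)/k^2| at the point x."""
    return abs(sum(math.sin(k * x) / k**2 for k in range(m + 1, upto)))

# The same bound 1/m covers every sample point x simultaneously.
for x in (0.0, 1.0, 2.5, -7.3):
    assert tail(x, 100) <= tail_bound(100)
```

That uniformity in x is exactly what the Cauchy Criterion for uniform convergence asks for.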
9.1 Examples
But f_n(1/n) = 1/2, so that
sup_{x∈[0,1]} |f_n(x) − f(x)| ≥ 1/2 ↛ 0 as n → ∞.
We have already seen that the limit of a sequence of continuous functions may not be con-
tinuous. This theorem tells us that ‘uniformity’ gives us the extra condition we need.
Theorem 9.2.1. Let f_n, f : E → R (or C), and f_n → f uniformly on E. Suppose all f_n are continuous at x0 ∈ E. Then the limit function f is also continuous at x0, so that
lim_{x→x0} lim_{n→∞} f_n(x) = lim_{n→∞} lim_{x→x0} f_n(x).
Proof. Let ε > 0. By uniform convergence there is N ∈ N such that |f_n(x) − f(x)| < ⅓ε for all n > N and all x ∈ E. As f_{N+1} is continuous at x0 there is δ > 0 such that |f_{N+1}(x) − f_{N+1}(x0)| < ⅓ε whenever x ∈ E and |x − x0| < δ. Then for such x,
|f(x) − f(x0)| ≤ |f(x) − f_{N+1}(x)| + |f_{N+1}(x) − f_{N+1}(x0)| + |f_{N+1}(x0) − f(x0)|
< ⅓ε + ⅓ε + ⅓ε = ε.
By definition, f is continuous at x0.
Note it is very important that N + 1 is fixed, so that δ does not depend on n.
Remark 9.2.2 (Version for series). If ∑_{n=0}^∞ f_n converges uniformly on E and every f_n is continuous at x0 ∈ E, then
lim_{x→x0} ∑_{n=0}^∞ f_n(x) = ∑_{n=0}^∞ f_n(x0).
In particular, if f_n is continuous on E for all n and ∑_{n=0}^∞ f_n converges uniformly on E, then ∑_{n=0}^∞ f_n is continuous on E.
We can apply the results of the previous subsection to the important case of power series.
Theorem 9.3.1 (Continuity of Power Series). Suppose the radius of convergence of the power series ∑_{n=0}^∞ a_n xⁿ is R, where 0 ≤ R ≤ ∞. Then for every r with 0 ≤ r < R, ∑_{n=0}^∞ a_n xⁿ converges uniformly on the closed disk {x : |x| ≤ r}. Therefore ∑_{n=0}^∞ a_n xⁿ is continuous on the open disk {x : |x| < R}.
Proof. According to the definition of ‘radius of convergence’, ∑_{n=0}^∞ a_n xⁿ is absolutely convergent for |x| < R. In particular, ∑_{n=0}^∞ |a_n| rⁿ is convergent. Since
|a_n xⁿ| ≤ |a_n| rⁿ for all x such that |x| ≤ r,
we have, by the Weierstrass M-test, that ∑_{n=0}^∞ a_n xⁿ converges uniformly on {x : |x| ≤ r}.
But a_n xⁿ is continuous for any n ∈ N. So, for any r < R, ∑_{n=0}^∞ a_n xⁿ is continuous for |x| ≤ r, and hence on the open disk {x : |x| < R}.
Note 9.3.2. Note that this says nothing about convergence or continuity at the end-points.
If you are interested subsection 9.5 deals with this in the real case.
Corollary 9.3.3. The functions exp x, sin x, cos x, cosh x and sinh x can all be defined by
power series with infinite radius of convergence so are all continuous on C.
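For exp, Theorem 9.3.1 can be seen numerically: on any disk |x| ≤ r the truncation error of the Taylor partial sums is bounded by the tail ∑_{k>N} rᵏ/k!, independently of x. A hedged sketch (not part of the notes):

```python
# Uniform convergence of the exp power series on [-r, r], here with r = 2.
import math

def partial_exp(x, N):
    """Partial sum sum_{k=0}^{N} x^k / k! of the exponential series."""
    return sum(x**k / math.factorial(k) for k in range(N + 1))

r, N = 2.0, 20
# One tail bound, valid simultaneously for every x with |x| <= r:
tail = sum(r**k / math.factorial(k) for k in range(N + 1, 60))

errs = [abs(math.exp(x) - partial_exp(x, N)) for x in (-2.0, -1.0, 0.0, 1.5, 2.0)]
assert max(errs) <= tail + 1e-12   # the same bound works at every sample point
assert tail < 1e-12                # and for N = 20 the bound is already tiny
```

The single bound `tail` controlling the error at every x is precisely the M-test mechanism of the proof above.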
Next term, in the course Analysis III, you will learn how to define integrals, and the proofs
of the following theorems will be given.
Theorem 9.4.1. If f_n → f uniformly on [a, b] and if every f_n is continuous, then
∫_a^b f = ∫_a^b lim_{n→∞} f_n = lim_{n→∞} ∫_a^b f_n.
Note 9.4.2. However, uniform convergence is not the ‘right’ condition for integrating a series term by term: we can exchange the order of integration ∫_a^b (which involves a limiting procedure) and lim_{n→∞} under much weaker conditions. The search for correct conditions for term-by-term integration led to the discovery of Lebesgue integration [Part A option: Integration].
Theorem 9.4.3. Let fn (x) → f (x) for each x ∈ [a, b]. Suppose fn′ exists and is continuous
on [a, b] for every n, and that fn′ → g uniformly on [a, b]. Then f ′ exists and is continuous
on [a, b], and
(d/dx) lim_{n→∞} f_n(x) = lim_{n→∞} (d/dx) f_n(x).
Similarly, if ∑ f_n converges on [a, b], if every f_n′ exists and is continuous on [a, b], and if ∑ f_n′ converges uniformly on [a, b], then
(d/dx) ∑_{n=1}^∞ f_n = ∑_{n=1}^∞ f_n′.
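Term-by-term differentiation can be sanity-checked numerically for the geometric series ∑ xⁿ = 1/(1 − x) on |x| ≤ r < 1, whose term-wise derivative is ∑ n xⁿ⁻¹ = 1/(1 − x)². A sketch added for illustration (not part of the notes):

```python
# Term-by-term differentiation of the geometric series at x = 0.5.

def s(x, N=200):
    """Partial sum of sum_{n=0}^{N} x^n (approximates 1/(1-x) for |x| < 1)."""
    return sum(x**n for n in range(N + 1))

def s_prime(x, N=200):
    """Term-by-term derivative: sum_{n=1}^{N} n x^(n-1)."""
    return sum(n * x**(n - 1) for n in range(1, N + 1))

x, h = 0.5, 1e-6
fd = (s(x + h) - s(x - h)) / (2 * h)            # central difference quotient
assert abs(fd - s_prime(x)) < 1e-4              # derivative of sum = sum of derivatives
assert abs(s_prime(x) - 1 / (1 - x)**2) < 1e-9  # and it equals 1/(1-x)^2 = 4
```

Of course the justification is exactly Theorem 11.1.2 below, since r = 0.5 lies inside the radius of convergence R = 1.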
When 0 < R < ∞ the points where |z| = R need to be handled differently. We only deal with the
real case, so there are two such points R and −R. Scaling (replacing x by x/R or −x/R) lets us deal
only with power series where the radius is 1 and describe what happens at x = 1.
Theorem 9.5.1 (Abel’s Continuity Theorem). Suppose that the series ∑_{n=0}^∞ a_n xⁿ has radius of convergence R = 1. Suppose further that ∑_{n=0}^∞ a_n converges.
Then ∑_{n=0}^∞ a_n xⁿ converges uniformly on [0, 1].
Consequently, ∑_{n=0}^∞ a_n xⁿ is continuous on (−1, 1], and in particular
lim_{x→1−} ∑_{n=0}^∞ a_n xⁿ = ∑_{n=0}^∞ a_n.
Proof. First note that our general result gives continuity on (−1, 1); it is only the point x = 1 we
have to deal with. We will get continuity provided we get uniform convergence on [0, 1].
P∞
By Cauchy’s Criterion for the convergent n=0 an we have that, for every ε > 0, there is N such
that, for every n > m > N we have ¯ ¯
¯Xn ¯
ak ¯ < ε.
¯ ¯
¯
¯ ¯
k=m
Now fix m > N , and for the partial sums from m use the notation
k
X
Ak = aj for k > m; and Am−1 = 0
j=m
noting that subtracting consecutive sums gives us back the original sequence9
ak = Ak − Ak−1 .
9
Think ‘Differentiation undoes Integration’.
By what we have from the Cauchy Criterion above, |A_k| < ε whenever k ≥ m − 1. We have by elementary algebra the following formula¹⁰:
∑_{k=m}^n a_k xᵏ = ∑_{k=m}^n (A_k − A_{k−1}) xᵏ
= ∑_{k=m}^n A_k xᵏ − ∑_{k=m}^n A_{k−1} xᵏ
= ∑_{k=m}^{n−1} A_k (xᵏ − xᵏ⁺¹) + A_n xⁿ.
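Abel's theorem can be seen at work on a concrete series. With a_n = (−1)ⁿ⁺¹/n we have ∑ a_n xⁿ = log(1 + x) for |x| < 1 and ∑ a_n = log 2, and the power-series values do approach log 2 as x → 1−. A numerical sketch (an added illustration, not from the notes):

```python
# Abel's Continuity Theorem illustrated with a_n = (-1)^(n+1)/n.
import math

def series(x, N=10_000):
    """Partial sum of sum_{n=1}^{N} (-1)^(n+1) x^n / n (approximates log(1+x))."""
    return sum((-1)**(n + 1) * x**n / n for n in range(1, N + 1))

# Inside the disk the series matches log(1 + x):
assert abs(series(0.5) - math.log(1.5)) < 1e-9

# Near x = 1 the values approach sum a_n = log 2, as Abel's theorem predicts:
assert abs(series(0.99) - math.log(2)) < 1e-2
```

Note that at x = 1 the series converges only conditionally; Abel's theorem is what licenses taking the limit through x = 1−.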
The theorem of this subsection is a partial converse of our theorem that ‘uniform convergence preserves
continuity’; if the sequence is monotone then the continuity of the limit will give uniformity of
convergence.
Theorem 9.6.1 (The Dini Theorem). Let f_n be a sequence of real continuous functions on [a, b]; and let f be a real continuous function on [a, b].
Suppose that
lim_{n→∞} f_n(x) = f(x) for every x ∈ [a, b]
and that
f_n(x) ≥ f_{n+1}(x) for all n and for all x ∈ [a, b].
Then f_n → f uniformly on [a, b].
Proof. Let gn (x) = fn (x)−f (x). Then gn is continuous for every n, gn > 0 and limn→∞ gn (x) = 0 for
any x ∈ [a, b]. Suppose {gn } were not uniformly convergent on [a, b]. Write down the contrapositive
to see that for some ε > 0, and every natural number k there exists a natural number nk > k and a
point xk ∈ [a, b] such that
|gnk (xk )| = gnk (xk ) > ε.
¹⁰ This is called Abel’s summation formula—think ‘integration by parts’.
We may choose nk so that k → nk is increasing. We may assume that xk → p—otherwise use the
Bolzano–Weierstrass theorem to extract a convergent subsequence of {xk } and use it instead. Then
p ∈ [a, b]. For any (fixed) k, since {g_n} is decreasing, g_{n_k}(x_j) ≥ g_{n_j}(x_j) ≥ ε for all j ≥ k; letting j → ∞ gives
g_{n_k}(p) ≥ ε,
as g_{n_k} is continuous at p. This contradicts the assumption that lim_{k→∞} g_{n_k}(p) = 0.
Example 9.6.2. Let f_n(x) = 1/(1 + nx) for x ∈ (0, 1). Then lim_{n→∞} f_n(x) = 0 for every x ∈ (0, 1), f_n is decreasing in n, but f_n does not converge uniformly. Dini’s theorem doesn’t apply, as (0, 1) is not compact.
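The failure of uniformity in Example 9.6.2 shows up immediately if we evaluate f_n at the moving point x = 1/n. A numerical sketch (an added illustration, not part of the notes):

```python
# Example 9.6.2: f_n(x) = 1/(1 + n x) on (0, 1) -- pointwise but not uniform.

def f(n, x):
    return 1 / (1 + n * x)

# Pointwise convergence to 0 at any fixed x in (0, 1):
assert f(10**6, 0.5) < 1e-5

# But at the moving point x = 1/n the value is always 1/2, so
# sup_{x in (0,1)} f_n(x) never falls below 1/2:
assert abs(f(1000, 1 / 1000) - 0.5) < 1e-12
```

Dini's hypotheses fail only in the domain: replacing (0, 1) by a closed interval [c, 1] with c > 0 restores uniform convergence.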
10.1 Definitions
In this course we only study differentiability for real (or complex)-valued functions on E,
where E is a subset of the real line R. The theory of the differentiability of complex valued
functions on the complex plane C is very different from the real case and requires another
theory—See Complex Analysis [Part A: Analysis].
Definition 10.1.1. Let f : (a, b) → R (or C), and let x0 ∈ (a, b). By f is differentiable
at x0 we mean that the following limit exists:
lim_{x→x0} (f(x) − f(x0))/(x − x0).
When it exists we denote the limit by f ′ (x0 ) which we call the derivative of f at x0 .
For example, it is easy to see that the function x ↦ x is differentiable at every point of R
and has derivative f ′ (x0 ) = 1 at every point; and the function t 7→ e2πit is differentiable at
every point, although we can’t yet prove that.
Sometimes it is helpful to also define ‘left-hand’ and ‘right-hand’ versions of these.
Definition 10.1.2. (i) Let f : [a, b) → R (or C), and let x0 ∈ [a, b). We say that f has a
right-derivative at x0 if the following limit exists
lim_{x→x0+} (f(x) − f(x0))/(x − x0).
If the limit exists we denote it by f₊′(x0).
(ii) Let f : (a, b] → R (or C), and let x0 ∈ (a, b]. We say that f has a left-derivative at
x0 if the following limit exists
lim_{x→x0−} (f(x) − f(x0))/(x − x0).
If the limit exists we denote it by f−′ (x0 ).
The following result is easily proved (compare what we did for left- and right-continuity).
Proposition 10.1.3. Let f : (a, b) → R (or C) and let x0 ∈ (a, b). Then the following are equivalent:
(a) f is differentiable at x0 with f′(x0) = l;
(b) f has both left- and right-derivatives at x0, and f₋′(x0) = l = f₊′(x0).
Definition 10.1.4. (i) Suppose that f : (a, b) → R (or C). Then we say that f is differ-
entiable on (a, b) if f is differentiable at every point of (a, b).
(ii) Suppose that f : [a, b] → R (or C). Then we say that f is differentiable on [a, b] if f
is differentiable at every point of (a, b), and if f+′ (a) and f−′ (b) exist.
If you wish you can define differentiable on (a, b] and [a, b) as well.
Remark 10.1.5. Let y = f(x). There are other notations for derivatives:
dy/dx or df/dx (x0) [G. W. Leibnitz]
y′ or f′(x0) [J. L. Lagrange]
Dy or Df(x0) [A. L. Cauchy, in particular for vector-valued functions of several variables].
10.2 An Example
Define a function f : R → R by
f(x) = x² sin(1/x) for x > 0, and f(x) = 0 for x ≤ 0.
The derivative for x ≤ 0 can be found directly from the definition. Later we will see that we can use the chain rule to find the derivative for x > 0.
Note that the derivative is not continuous at the origin. (See problem sheet 5.)
We can get other interesting examples by replacing the ‘x²’ by x^α and the ‘1/x’ by 1/x^β.
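The two phenomena of this example—f′(0) = 0 from the difference quotient, yet f′ discontinuous at 0—are easy to observe numerically. A sketch added for illustration (not part of the notes), using the chain-rule formula f′(x) = 2x sin(1/x) − cos(1/x) for x > 0:

```python
# f(x) = x^2 sin(1/x) for x > 0, f(x) = 0 for x <= 0.
import math

def f(x):
    return x * x * math.sin(1 / x) if x > 0 else 0.0

def fprime(x):
    """f'(x) = 2x sin(1/x) - cos(1/x), valid for x > 0 (by the chain rule)."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# The difference quotient at 0 is squeezed: |f(h)/h| = |h sin(1/h)| <= |h| -> 0.
for h in (1e-3, 1e-6, 1e-9):
    assert abs(f(h) / h) <= h

# Yet along x_k = 1/(2*pi*k) -> 0+ we have f'(x_k) close to -1, not 0.
for k in (10, 100, 1000):
    xk = 1 / (2 * math.pi * k)
    assert abs(fprime(xk) + 1.0) < 1e-2
```

So f′(0) exists and equals 0, but f′(x) oscillates between values near ±1 arbitrarily close to the origin.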
10.3 Derivatives and differentials
By looking at the definition of ‘limit’ in terms of ε and δ (see problem sheet) we can easily prove that if f is differentiable at x0 and we write
f(x) = f(x0) + f′(x0)(x − x0) + o(x, x0),
then
lim_{x→x0} o(x, x0)/(x − x0) = 0.
That is, the ‘linear part’ of the increment f(x) − f(x0) is f′(x0)(x − x0); all the rest is small in comparison. The map x ↦ f′(x0)(x − x0) is sometimes called the differential of f at x0. It is the first approximation to f near x0.
Proof. Since
lim_{x→x0} (f(x) − f(x0)) = lim_{x→x0} [(f(x) − f(x0))/(x − x0)] · (x − x0)
= lim_{x→x0} (f(x) − f(x0))/(x − x0) × lim_{x→x0} (x − x0) (by AOL)
= f′(x0) × 0
= 0,
we have lim_{x→x0} f(x) = f(x0); that is, f is continuous at x0.
Note: The converse is not true. For example |x| is continuous but is not differentiable at 0.
In fact there exist functions which are continuous everywhere, but not differentiable at any
point! (See Bartle and Sherbert. )
10.5 Algebraic properties
The following results are straightforward consequences of the Algebra of Limits. They let us
build up at once all the calculus we learned at school—provided we can differentiate a few
standard functions (constants, linear functions, exp, sin and cos).
Theorem 10.5.1. Suppose f, g : (a, b) → R (or C) are both differentiable at x0 ∈ (a, b),
and λ, µ ∈ R (or C).
(i) [Linearity of differentiation] λ·f + µ·g is differentiable at x0 and (λf + µg)′(x0) = λf′(x0) + µg′(x0).
Theorem 10.6.1 (The Chain Rule). Suppose f : (a, b) → R, and that g : (c, d) → R.
Suppose that f ((a, b)) ⊆ (c, d), so that g ◦ f : (a, b) → R is defined.
Suppose further that f is differentiable at x0 ∈ (a, b), and that g is differentiable at f (x0 ).
Then g ◦ f is differentiable at x0 and (g ◦ f)′(x0) = g′(f(x0)) f′(x0).
Rewriting the definition of v we see that we have an expression for the increment
g(y) − g(y0) = (y − y0)(g′(y0) + v(y)),
so that
(g(f(x)) − g(f(x0)))/(x − x0) = g′(y0) · (f(x) − f(x0))/(x − x0) + v(f(x)) · (f(x) − f(x0))/(x − x0).
Suppose that f : (a, b) → R (or C) is differentiable at every point of some (x0 − δ, x0 + δ). Then it makes sense to ask if f′ is differentiable at x0. If it is, we denote its derivative by f′′(x0).
More generally we can define in a recursive way (n + 1)-th derivatives f (n+1) .
Definition 10.7.1. Suppose that f : (a, b) → R (or C) is such that f , f ′ ,. . . ,f (n) exist at
every point of (a, b). Suppose that x0 ∈ (a, b). By f is (n + 1)-times differentiable at x0
we mean that f (n) is differentiable at x0 . We write f (n+1) (x0 ) := f (n)′ (x0 ).
The following is proved by an easy induction using Linearity and the Product Rule.
Theorem 10.7.2 (The Leibnitz Formula). Let f, g : (a, b) → R (or C) be n-times differen-
tiable on (a, b). Then x 7→ f (x)g(x) is n-times differentiable and
(fg)^{(n)}(x) = ∑_{j=0}^{n} \binom{n}{j} f^{(j)}(x) g^{(n−j)}(x).
The elementary functions—exp x, cos x, sin x, log x, arctan x—are defined as power series, or
are got as inverse functions of real functions defined by power series.
We start with a lemma:
Lemma 11.1.1. The power series ∑_{n=0}^∞ a_n xⁿ and ∑_{n=0}^∞ (n + 1) a_{n+1} xⁿ have the same radius of convergence.
Proof. Let the radii be R and R′; we will show R ≥ R′ and R′ ≥ R.
First suppose that |x₁| < R′; then ∑_{n=0}^∞ (n + 1) a_{n+1} x₁ⁿ is absolutely convergent. That is, ∑_{n=0}^∞ (n + 1)|a_{n+1}||x₁|ⁿ converges. Now note that, for n ≥ 1, |a_n x₁ⁿ| ≤ n|a_n||x₁|ⁿ⁻¹ · |x₁|. Hence by the comparison test ∑_{n=0}^∞ |a_n||x₁|ⁿ converges. Therefore, by definition of ‘radius of convergence’, we have that R ≥ R′.
Now suppose that |x₁| < R; and choose x₂ so that |x₁| < |x₂| < R. Then ∑_{n=0}^∞ |a_n||x₂|ⁿ converges, and so (Analysis I) |a_n||x₂|ⁿ → 0 as n → ∞. But a convergent sequence is bounded (Analysis I), so there exists M such that |a_n||x₂|ⁿ < M for all n. Now
|(n + 1) a_{n+1} x₁ⁿ| ≤ (n + 1) (M/|x₂|) |x₁/x₂|ⁿ,
and as, by the Ratio Test, ∑ (n + 1)|x₁/x₂|ⁿ is convergent, we have by the Comparison Test that ∑_{n=0}^∞ (n + 1)|a_{n+1}||x₁|ⁿ is convergent. By the definition of ‘radius of convergence’ we have that R′ ≥ R.
Theorem 11.1.2 (Term-by-term differentiation). The power series f(x) := ∑_{n=0}^∞ a_n xⁿ and g(x) := ∑_{n=0}^∞ (n + 1) a_{n+1} xⁿ have the same radius of convergence R, and for any x such that |x| < R, f is differentiable at x and moreover f′(x) = g(x).
where we have added the series f(w), f(x) and g(x) term by term, which is justified by AOL. Our aim is to show that
(f(w) − f(x))/(w − x) − g(x) → 0 as w → x.
The binomial identity
(wⁿ − xⁿ)/(w − x) = xⁿ⁻¹ + xⁿ⁻²w + ⋯ + xwⁿ⁻² + wⁿ⁻¹
is easily proved by induction; then we have that for any w ≠ x and n ≥ 2
(wⁿ − xⁿ)/(w − x) − nxⁿ⁻¹ = (xⁿ⁻¹ + xⁿ⁻²w + ⋯ + xwⁿ⁻² + wⁿ⁻¹) − (xⁿ⁻¹ + xⁿ⁻¹ + ⋯ + xⁿ⁻¹)
= ∑_{k=1}^{n−1} (xⁿ⁻¹⁻ᵏ wᵏ − xⁿ⁻¹)
= ∑_{k=1}^{n−1} xⁿ⁻¹⁻ᵏ (wᵏ − xᵏ).
Let
h_n(w) = a_n ∑_{k=1}^{n−1} xⁿ⁻¹⁻ᵏ (wᵏ − xᵏ) for n = 2, 3, ⋯.
Then
(f(w) − f(x))/(w − x) − g(x) = ∑_{n=2}^∞ h_n(w).
All h_n are continuous on R, as they are polynomials in w; and h_n(x) = 0 for all n ≥ 2. We claim that ∑_{n=2}^∞ h_n(w) converges uniformly in |w| ≤ r. In fact
|h_n(w)| ≤ |a_n| ∑_{k=1}^{n−1} |x|ⁿ⁻¹⁻ᵏ (|w|ᵏ + |x|ᵏ) ≤ 2n|a_n| rⁿ⁻¹.
Now ∑ n|a_n| rⁿ⁻¹ is convergent, so that ∑_{n=2}^∞ h_n(w) converges uniformly in the closed disk {w : |w| ≤ r} by the Weierstrass M-test. Hence ∑_{n=2}^∞ h_n(w) is continuous in the disk |w| ≤ r, as the uniform limit of continuous functions is continuous. Therefore
lim_{w→x} ∑_{n=2}^∞ h_n(w) = ∑_{n=2}^∞ h_n(x) = 0,
so that
lim_{w→x} (f(w) − f(x))/(w − x) = lim_{w→x} [(f(w) − f(x))/(w − x) − g(x)] + g(x)
= lim_{w→x} ∑_{n=2}^∞ h_n(w) + g(x)
= g(x).
Proposition 11.2.1. The functions exp x, sin x, cos x, cosh x and sinh x can all be defined
by power series with infinite radius of convergence so are all differentiable on R. Further:
Note 11.2.2. The other trigonometric and hyperbolic functions are defined in terms of cos and sin, or cosh and sinh. For example tan x := sin x/cos x is defined for those x such that cos x ≠ 0. Then by the quotient rule it is differentiable wherever it is defined, and tan′x = (cos²x + sin²x)/cos²x. We will soon give an easy proof that cos²x + sin²x = 1.¹¹
Theorem 11.3.1. Let f : [a, b] → [m, M ] be a strictly increasing continuous function from
[a, b] on to [m, M ], with inverse function g : [m, M ] → [a, b]. Suppose that f is differentiable
at x0 ∈ (a, b) and that f ′ (x0 ) 6= 0. Then g is differentiable at f (x0 ), and
g′(f(x0)) = 1/f′(x0).
Proof. We have already proved that g is continuous. Write y0 = f(x0) and, for y ≠ y0, set x = g(y), so x ≠ x0. Then
(g(y) − g(y0))/(y − y0) = (x − x0)/(f(x) − f(x0)) = 1/[(f(x) − f(x0))/(x − x0)].
As y → y0 we have x = g(y) → x0, by the continuity of g, and (f(x) − f(x0))/(x − x0) → f′(x0), so
lim_{y→y0} (g(y) − g(y0))/(y − y0) = 1/f′(x0).
¹¹ The Pythagoras Theorem.
11.4 Logarithms
We continue to deal only with the real case where, in section 6, we defined log : (0, ∞) → R
as the inverse function of the real exponential function.
To see that log is differentiable at any y > 0 proceed as we did when we discussed continuity,
by finding an A such that exp(−A) < y < exp(A) and then using the Inverse Function
Theorem on the differentiable function exp : [−A, A] → [exp(−A), exp(A)]. We will find that
log′ y = 1/exp′(log y) = 1/exp(log y) = 1/y.
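The formula log′ y = 1/y (an instance of the Inverse Function Theorem formula g′(f(x0)) = 1/f′(x0)) can be checked numerically with a difference quotient. A sketch added for illustration, not part of the notes:

```python
# Numerical check of log'(y) = 1/y via a central difference quotient.
import math

def dlog(y, h=1e-6):
    """Central difference approximation to the derivative of log at y > 0."""
    return (math.log(y + h) - math.log(y - h)) / (2 * h)

for y in (0.5, 1.0, math.e, 10.0):
    assert abs(dlog(y) - 1 / y) < 1e-8   # agrees with 1/exp'(log y) = 1/y
```

The central difference is used because its error is O(h²), small enough to see agreement to eight decimal places.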
11.5 Powers
For any x > 0 and any α ∈ R in section 6 we defined xα = exp(α log x). From the Chain
Rule and the properties of exponentials and logarithms we therefore have that
(d/dx) x^α = α x^{α−1}.
Here is the crucial property (which, of course, you knew long before you started the course): if f : (a, b) → R has a local maximum or local minimum at x0 ∈ (a, b) and f is differentiable at x0, then f′(x0) = 0.
Proof. If x0 is a local maximum, then there exists δ > 0 such that (f(x) − f(x0))/(x − x0) ≤ 0 whenever 0 < x − x0 < δ and x ∈ (a, b), so that
f₊′(x0) = lim_{x→x0+} (f(x) − f(x0))/(x − x0) ≤ 0.
On the other hand, (f(x) − f(x0))/(x − x0) ≥ 0 whenever −δ < x − x0 < 0 and x ∈ (a, b), so that
f₋′(x0) = lim_{x→x0−} (f(x) − f(x0))/(x − x0) ≥ 0.
Since f is differentiable at x0, f′(x0) = f₋′(x0) = f₊′(x0), and hence f′(x0) = 0.
Similarly if x0 is a local minimum.
Theorem 12.2.1 (Rolle, 1691). Let f : [a, b] → R be continuous, and suppose that f is
differentiable on (a, b). Suppose further that f (a) = f (b). Then there exists a point ξ ∈ (a, b)
such that f ′ (ξ) = 0.
Proof. If f is constant on [a, b], then f′(x) = 0 for every x ∈ (a, b), so that any point—say ξ = ½(a + b)—will do.
As f is continuous on [a, b] it attains its maximum and minimum on [a, b] (by Theorems 4.1.2
and 4.1.6.) Suppose ξ1 is the minimum and ξ2 is the maximum. As f (a) = f (b), either ξ1
or ξ2 lies in the open interval (a, b), or else f is constant and we are done. Suppose then
that ξ ∈ (a, b) gives either the maximum or minimum. It is then a local extremum, and by
Fermat’s result f ′ (ξ) = 0.
Note 12.2.2. (i) Remember that f is differentiable implies that f is continuous. Thus
the hypotheses of Rolle would be satisfied if f was differentiable on [a, b] and f (a) = f (b).
However, often it is important that Rolle holds under the given weaker conditions.
(ii) When using these theorems remember to check ALL conditions including the continuity
and differentiability conditions. For example f : [−1, 1] → R given by f (x) = |x| satisfies all
conditions of Rolle except that f is not differentiable at x = 0. But there is no ξ such that
f ′ (ξ) = 0.
This is one of the most important results in this course. It is a rotated version of Rolle.12
Theorem 12.3.1 (MVT). Let f : [a, b] → R be continuous, and suppose that f is differentiable on (a, b). Then there exists a point ξ ∈ (a, b) such that
f(b) − f(a) = f′(ξ)(b − a).
Proof. Apply Rolle’s theorem to the function
g(x) = f(x) − [(f(b) − f(a))/(b − a)](x − a).
Note that (for a given function f ) the value of ξ1 may depend on a1 and b1 .
Corollary 12.3.3 (The Taylor Theorem, mark 1). Suppose that we have the hypotheses of the MVT and that x, x + h ∈ [a, b]. Then
f(x + h) = f(x) + h f′(x + θh) for some θ ∈ (0, 1).
Proof. Suppose h < 0; then a 6 x + h < x 6 b. From the MVT applied to f on the interval
[x + h, x] there exists ξ ∈ (x + h, x) such that
f (x) − f (x + h) = f ′ (ξ)(−h).
Write ξ = x + θh, and note that x + h < x + θh < x implies—as h < 0—that 0 < θ < 1.
The cases h = 0 and h > 0 are left as exercises.
Proof. Apply MVT to f on [x, y] where x, y are any two points in (a, b). (Note that f
is differentiable on (a, b) implies that f is continuous on (a, b) and hence f is continuous
on [x, y].) Then f (x) − f (y) = f ′ (ξ)(x − y) for some ξ ∈ (x, y). But f ′ (ξ) = 0, so that
f (x) = f (y). Therefore f is constant in (a, b).
Example 12.4.2. Suppose that φ is a function whose derivative is x². Then we have, for all x, that φ(x) = ⅓x³ + A for some constant A.
Proof. Let f(x) := φ(x) − ⅓x³; then f is differentiable and we can calculate that f′(x) = x² − ⅓·3x² = 0. By the Identity Theorem f(x) = A for some constant A. You can justify other ‘integrations’ similarly: just guess the ‘integral’ and proceed as above.
Note 12.4.3. Often when applying mathematics we have to solve a differential equation.
Last term you learned methods for guessing solutions. The Identity Theorem gives us a tool
to prove the uniqueness of solutions of DE, and lets us justify these clever tricks (see Section
13.3). Those who do PDEs have already seen this idea this term where you showed that
E ′ (t) = 0 and then deduced that E(t) is a constant (which then turned out to be zero).
Sometimes we are concerned with more than one function, and would like to use the MVT
or a MVT type argument. The following is what we need: except in the most trivial cases it
never helps to apply the MVT to the functions separately—we generate too many distinct
ξ’s.
Proof. We have to use the Identity Theorem—but on what function? Fixing y and looking
at f (x) = exp(x + y) − exp(x) exp(y) leads to f ′ = f and f (0) = 0 which we could now solve
to get f (x) = 0 (see section 13.3).
¹³ This is where the result belongs logically, but in the lectures it will not appear until later, when we do L’Hospital’s Rule.
However a much better (more direct) way is to fix x + y instead. So, fix c ∈ R, and
put g(t) = exp c − exp t exp(c − t). Then we have that g ′ (t) = 0 so that g(t) = g(0) by
the Identity Theorem. Now g(0) = exp c − exp 0 exp c = 0. So for any c, t we have that
exp c − exp t exp(c − t) = 0. Put c := (x + y), and t := x to get the result.
We can also use the MVT to prove the monotonicity of the exponential function.
Proposition 13.2.1 (The Pythagoras Theorem). For all real x we have that
cos2 x + sin2 x = 1.
Proof. Let f(x) := cos²x + sin²x − 1. Then, by what we have proved about derivatives of trigonometric functions,
f′(x) = 2 cos x(−sin x) + 2 sin x cos x = 0 for all x.
By the Identity Theorem applied to f on some interval (−A, A) we see that for x ∈ (−A, A) we have that
f(x) = f(0) = cos²0 + sin²0 − 1 = 1 + 0 − 1 = 0.
As this is true for any A we get that f(x) = 0 on R.
Proof. It is enough to prove one, the other is got by fixing y and taking the derivative of the
resulting function of x.
We recall what we did for exponentials: fix c ∈ R and let
g(t) = cos t cos(c − t) − sin t sin(c − t),
whose derivative is
g′(t) = −sin t cos(c − t) + cos t sin(c − t) − cos t sin(c − t) + sin t cos(c − t) = 0.
Proof. First we need to see that there are positive zeros. Note that
cos 0 = ∑_{k=0}^∞ (−1)ᵏ 0²ᵏ/(2k)! = 1 > 0.
Also cos x ≤ 1 − x²/2! + x⁴/4! provided x² ≤ (4 + 4)(4 + 3). As
1 − x²/2! + x⁴/4! = (1/24)[(x² − 6)² − 12],
we see that cos √6 < 0.
By the IVT, cos x has at least one zero in [0, √6].
Now let
S = {t > 0 : cos t = 0}.
Then S 6= ∅ and S is bounded below, so that α = inf S exists. By definition of inf S, given n
there exists tn such that α ≤ tn < α + 1/n. Thus tn → α as n → ∞. But cos x is continuous,
so that cos tn → cos α, and hence cos α = 0. But cos 0 = 1 so that α is the minimum positive
zero required.
Proof. By Pythagoras, sin α = ±1. Suppose sin α = −1; then by the MVT there would be some ξ ∈ (0, α) such that
cos ξ = (sin α − sin 0)/(α − 0) = −1/α < 0.
However, cos 0 = 1, and α is the first root, so by the IVT cos ξ cannot be negative.
Proof. We just use the addition formula repeatedly, inserting the values cos α = 0 and
sin α = 1.
Now that we have proved these results, and the danger of using ‘obvious’ but unproved
properties of π has passed we can make the following definition:
Definition 13.2.6. π := 2 · inf{β | β > 0 and cos β = 0}(= 2α).
We need one more result, and then we have established “all” the usual facts about the
trigonometric functions.
Proposition 13.2.7. The zeros of cos x are at precisely the points {(k + ½)π : k ∈ Z}.
Proof. By Proposition 13.2.5(ii) and (iii), for k ∈ Z, cos(½π + kπ) = (−1)^k cos ½π = 0, so
these are all zeros. If β were such that cos β = 0 then as above we have a k such that
β₀ = β + kπ ∈ (0, π] is a zero of cos x. Clearly β₀ cannot be less than ½π, by definition of α. Using
cos(π − x) = −cos x
we see that if β₀ > ½π then π − β₀ < ½π is a positive zero of cos x, which cannot be. Hence β₀ = ½π,
and β has the required form.
Here is a very fundamental example of how we use the MVT to establish the uniqueness of
a solution of a differential equation.
Example 13.3.1. Show that the general solution of f′(x) = f(x), x ∈ (0, +∞), is
f(x) = A exp(x) where A is a constant.
The 'trick' for solving differential equations is to manipulate them so that they look like
$\frac{d}{dx}F = 0$ for some F, and then 'integrate'. We often achieve this by multiplying by 'integrating
factors', for which there are recipe books. The same 'trick' lets us apply the MVT
(or Identity Theorem) to prove that we have a solution.
Given this differential equation $\frac{df}{dx} - f = 0$ we would multiply it by $e^{-\int 1\,dx}$, rewrite it as
$\frac{d}{dx}\left(e^{-x}f(x)\right) = 0$ and deduce that $e^{-x}f(x) = A$.
Let’s write this as a piece of pure mathematics!
Consider g(x) := f(x) exp(−x). Then g′(x) = f′(x) exp(−x) − f(x) exp(−x) = 0. Hence, by
the Identity Theorem, g(x) is constant; that is, f(x) exp(−x) = A say, and so f(x) = A exp(x).
Example 13.3.2. This example is based on a Calculus question from a Mods Collection.
Find all solutions of
$$y'' - \frac{2}{1+x^2}\,y = 0.$$
(The emphasis for us is on "all".) Solutions of this equation were found in the
'Calculus of One Variable' course last term. We use the methods from this course to show
that these are all the solutions.
We can check easily that (1 + x²) is a solution; so we write
$$z(x) = \frac{y(x)}{1+x^2}.$$
An easy calculation yields
$$(1+x^2)z''(x) + 4x\,z'(x) = 0, \quad\text{and hence}\quad z'(x) = \frac{A}{(1+x^2)^2}$$
for some constant A. Now let w(x) := ½[arctan x + x/(1 + x²)];
here arctan is the inverse function of tan. So by the Inverse Function Theorem and the other
rules of differentiation which we have established we can check that
$$w'(x) = \frac{1}{(1+x^2)^2} = z'(x)/A.$$
Hence by the Identity Theorem z(x) − Aw(x) = B for some constant B, and so the only
solutions are
$$y(x) = \frac{A}{2}\left[(1+x^2)\arctan x + x\right] + B(1+x^2).$$
13.4 The function (sin x)/x

This is a good example of how the Mean Value Theorem and its various corollaries are used practically.
I will probably not do this in lectures.

For 0 < x < ½π:
(i) sin x < x < tan x, and so cos x < (sin x)/x < 1;
(ii) lim_{x→0} (sin x)/x = 1;
(iii) 2/π < (sin x)/x < 1 (Jordan's inequality).
We therefore have the following bounds:
$$\max\left\{\cos x, \frac{2}{\pi}\right\} < \frac{\sin x}{x} < 1.$$
Proof. To prove the first inequality, consider f(x) = tan x − x, for x ∈ [0, ½π). Then f is differentiable
on (0, ½π) and
$$f'(x) = \frac{1}{\cos^2 x} - 1 > 0 \quad\text{for all } x \in (0, \tfrac12\pi).$$
Hence f is strictly increasing on [0, ½π); in particular f(x) > f(0) for any x ∈ (0, ½π), which yields
tan x > x. Considering x − sin x in the same way will give x > sin x.
The second inequality in (i) is got by inverting and multiplying by sin x; this is justified since sin x > 0
until the smallest positive zero of cos x.
For (ii) we use a version of the sandwich theorem and the continuity of cos x to get that lim_{x→0+} (sin x)/x
exists and
$$1 = \lim_{x\to0+}\cos x \le \lim_{x\to0+}\frac{\sin x}{x} \le 1.$$
As (sin x)/x is an even function this gives that lim_{x→0} (sin x)/x = 1.
Now consider
$$h(x) = \frac{\sin x}{x} \quad\text{for } x \in (0, \tfrac12\pi].$$
Then
$$h'(x) = \frac{\cos x\,(x - \tan x)}{x^2} < 0 \quad\text{for all } x \in (0, \tfrac12\pi)$$
so that h is strictly decreasing, and hence h(x) > h(½π) = 2/π for any x ∈ (0, ½π); this gives the first
inequality of (iii). The second is already included in (i).
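A spot check (not in the notes) of the bounds just proved, at sample points of (0, ½π):

```python
import math

# Sample max{cos x, 2/pi} < sin(x)/x < 1 at interior points of (0, pi/2).
for i in range(1, 50):
    x = i * (math.pi / 2) / 50
    ratio = math.sin(x) / x
    assert max(math.cos(x), 2 / math.pi) < ratio < 1

print("bounds hold at all 49 sample points")
```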
14 L’Hôpital’s Rule
This section is devoted to a variety of rules and techniques for calculating limits of quotients.
They derive from results of Guillaume de l’Hôpital; perhaps they are really due to Johann
Bernoulli whose lecture notes l’Hôpital published in 1696.
As promised earlier here is the proof of Cauchy’s symmetric form of the MVT. (At first sight
one might think we could just apply the MVT to f and g separately. However, a moment’s
reflection will show that we would then get two different ξ.)
Proof. First, this makes sense: we cannot have g(b) − g(a) = 0, for by Rolle's Theorem there
would then be a point η ∈ (a, b) with g′(η) = 0.
Now let the function F be defined on [a, b] by
$$F(x) := \begin{vmatrix} 1 & 1 & 1 \\ f(x) & f(a) & f(b) \\ g(x) & g(a) & g(b) \end{vmatrix}$$
that is
$$F(x) = [f(a)g(b) - f(b)g(a)] - f(x)[g(b) - g(a)] + g(x)[f(b) - f(a)].$$
Then F(a) = F(b) = 0, so by Rolle's Theorem there is a ξ ∈ (a, b) with
$$F'(\xi) = -f'(\xi)[g(b) - g(a)] + g'(\xi)[f(b) - f(a)] = 0,$$
which rearranges to the required formula.
Proposition 14.2.1. Suppose f, g are continuous on [a, a + δ] (for some δ > 0), and
differentiable in (a, a + δ), and that f(a) = g(a) = 0. Suppose further that l := lim_{x→a+} f′(x)/g′(x)
exists.
Then
$$\lim_{x\to a+}\frac{f(x)}{g(x)} = \lim_{x\to a+}\frac{f'(x)}{g'(x)}.$$
Proof. Note that there must exist a δ′ < δ such that on (a, a + δ′] we have that g′(x) ≠ 0,
for otherwise the function f′(x)/g′(x) would not be defined near a and so this limit could
not be defined.
For every x ∈ (a, a + δ′), apply Cauchy's MVT to f, g on the interval [a, x]: there is ξ_x ∈ (a, x)
such that
$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(\xi_x)}{g'(\xi_x)}.$$
But if x → a+, then ξ_x → a with ξ_x > a, so that
$$\lim_{x\to a+}\frac{f'(\xi_x)}{g'(\xi_x)} = l.$$
Hence
$$\lim_{x\to a+}\frac{f(x)}{g(x)} = \lim_{x\to a+}\frac{f'(\xi_x)}{g'(\xi_x)} = l.$$
Similarly we prove
Corollary 14.2.2. Suppose f, g are continuous on [a − δ, a] (for some δ > 0), and differentiable
in (a − δ, a), and that f(a) = g(a) = 0. Suppose further that l := lim_{x→a−} f′(x)/g′(x) exists.
Then
$$\lim_{x\to a-}\frac{f(x)}{g(x)} = \lim_{x\to a-}\frac{f'(x)}{g'(x)}.$$
Combining these two results gives the two-sided version: under the corresponding hypotheses on both sides of a,
$$\lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(x)}{g'(x)}.$$
Note 14.2.4. Sometimes this is called the 0/0 case of L'HR.
Example 14.3.1.
$$\lim_{x\to0}\frac{1-\cos x}{x^2} = \lim_{x\to0}\frac{\sin x}{2x} \quad\text{by L'HR, provided this limit exists}$$
$$= \lim_{x\to0}\frac{\cos x}{2} \quad\text{by L'HR, provided this limit exists}$$
$$= \frac12, \quad\text{and this limit exists by the continuity of } \cos x;$$
so the above equalities hold.
To justify this we need to see that L'HR, which we have used twice, is actually applicable. But
by standard results we have already proved:
1 − cos x and x² are continuous on [−½π, ½π], zero at zero, and differentiable on (−½π, ½π) \ {0},
with derivatives sin x and 2x;
sin x and 2x are continuous on [−½π, ½π], zero at zero, and differentiable on (−½π, ½π) \ {0},
with derivatives cos x and 2.
Exercise: Prove similarly that lim_{x→0} (sin x)/x = 1.
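Not in the notes: both limits can be watched numerically (the two quotients below approach ½ and 1 respectively as x shrinks).

```python
import math

# (1 - cos x)/x^2 -> 1/2 and (sin x)/x -> 1 as x -> 0.
for x in (0.1, 0.01, 0.001):
    print(x, (1 - math.cos(x)) / x ** 2, math.sin(x) / x)
```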
Example 14.3.2.
$$\lim_{x\to0}\frac{\log(1+x)}{x} = 1.$$
Again we argue:
$$\lim_{x\to0}\frac{\log(1+x)}{x} = \lim_{x\to0}\frac{1/(1+x)}{1} \quad\text{by L'HR, provided this limit exists (the derivative of } \log t \text{ is } 1/t)$$
$$= 1 \quad\text{by continuity of } \frac{1}{1+x};$$
as this exists the previous equalities hold.
To justify the use of L'HR we need to see that log(1 + x) and x are continuous on [−½, ½], 0
at 0, and differentiable on (−½, ½) \ {0}.
Example 14.3.3.
$$\lim_{x\to0}(1+x)^{1/x} = e.$$
Recall that by definition (1 + x)^{1/x} := exp((1/x) log(1 + x)). So consider first log(1 + x)/x. By the
previous example this has limit 1. Now by the continuity of exp(x) we see that
$$(1+x)^{1/x} = \exp\left(\frac{\log(1+x)}{x}\right) \to \exp(1) = e \quad\text{as } x\to0.$$
If we have all the hypotheses for L'Hôpital's rule, except that we have
$$\frac{f'(x)}{g'(x)} \to +\infty \quad\text{as } x\to a,$$
then we swap f and g, use L'HR and conclude that
$$\frac{g(x)}{f(x)} \to 0 \quad\text{as } x\to a.$$
Suppose f, g : (a, +∞) → R are continuous and differentiable, with f(x) → 0 and g(x) → 0
as x → ∞. If g′(x) ≠ 0 on (a, +∞) and f′(x)/g′(x) → l as x → ∞, then we can deduce that
lim_{x→∞} f(x)/g(x) = l.
All we need do is apply L'HR to the functions F(x) = f(1/x) and G(x) = g(1/x), with F(0) =
0 = G(0), checking carefully that the hypotheses hold.
14.6 L'Hôpital's Rule—the ∞/∞ case

Proposition 14.6.1 (L'HR, the ∞/∞ case). Let f, g : (a, a + δ) → R be differentiable for some
δ > 0. Suppose further that f(x) → ∞ and g(x) → ∞ as x → a+ and that lim_{x→a+} f′(x)/g′(x)
exists.
Then
$$\lim_{x\to a+}\frac{f(x)}{g(x)} = \lim_{x\to a+}\frac{f'(x)}{g'(x)}.$$
Note 14.6.2. We do not want to make too much heavy weather in this proof; checking all
the details is a good exercise.
Proof. Write K := lim_{x→a+} f′(x)/g′(x). Let ε > 0; then there exists a δ₁ > 0 such that δ₁ < δ and
$$\left|\frac{f'(x)}{g'(x)} - K\right| < \frac{\varepsilon}{2} \quad\text{for all } x \in (a, a+\delta_1).$$
Fix c := a + δ₁. For x ∈ (a, c), Cauchy's MVT applied to f, g on [x, c] gives a ξ ∈ (x, c) with
$$\frac{f(x) - f(c)}{g(x) - g(c)} = \frac{f'(\xi)}{g'(\xi)},$$
so that |f(x) − f(c) − K(g(x) − g(c))| < ½ε|g(x) − g(c)|. Dividing by |g(x)| (which is non-zero for x near a),
$$\left|\frac{f(x)}{g(x)} - K\right| < \frac{\varepsilon}{2}\left|1 - \frac{g(c)}{g(x)}\right| + \frac{|f(c) - Kg(c)|}{|g(x)|}.$$
Now use the fact that g(x) → ∞; we can find a δ₂ > 0, with δ₂ < δ₁, such that
$$\left|1 - \frac{g(c)}{g(x)}\right| < \frac32 \quad\text{and}\quad \frac{|f(c) - Kg(c)|}{|g(x)|} < \frac{\varepsilon}{4}$$
for a < x < a + δ₂; so for such x we have
$$\left|\frac{f(x)}{g(x)} - K\right| < \frac{\varepsilon}{2}\cdot\frac32 + \frac{\varepsilon}{4} = \varepsilon$$
as required.
14.7 More applications
These examples might be better done by using the standard limit from Analysis I that, if
α > 0, then x exp(−αx) → 0 as x → ∞.
Example 14.7.1. lim_{x→+∞} (log x)/x^μ = 0 for any μ > 0.
Let g(x) = x^μ = exp(μ log x). Then g′(x) = μx^{μ−1}. So by L'Hôpital's rule (the ∞/∞ case) we have
$$\lim_{x\to+\infty}\frac{\log x}{x^\mu} = \lim_{x\to+\infty}\frac{1/x}{\mu x^{\mu-1}} \quad\text{provided this limit exists}$$
$$= \lim_{x\to+\infty}\frac{1}{\mu x^\mu} = 0, \quad\text{which does exist.}$$
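The convergence is slow for small μ; a numerical sketch (not in the notes) with the illustrative value μ = 0.1:

```python
import math

# log(x)/x^mu -> 0 as x -> infinity, however small mu > 0 is.
mu = 0.1
for x in (1e2, 1e10, 1e50, 1e100):
    print(x, math.log(x) / x ** mu)
```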
Finally

Example 14.7.3. Show that
$$\lim_{x\to0}\left(\frac{\sin x}{x}\right)^{\frac{1}{1-\cos x}} = e^{-\frac13}.$$
Since f(x) = ((sin x)/x)^{1/(1−cos x)} is an even function, we only need to show that lim_{x→0+} f(x) = e^{−1/3}.
According to definition
$$f(x) = \exp\left(\frac{1}{1-\cos x}\log\frac{\sin x}{x}\right) = \exp\left(\frac{\log\sin x - \log x}{1-\cos x}\right).$$
By the L'Hôpital Rule,
$$\lim_{x\to0+}\frac{\log\sin x - \log x}{1-\cos x} = \lim_{x\to0+}\frac{\frac{\cos x}{\sin x} - \frac{1}{x}}{\sin x} \quad\left[\text{provided it exists; recall } \frac{\sin x}{x}\to1\right]$$
$$= \lim_{x\to0+}\frac{x\cos x - \sin x}{x\sin^2 x}$$
$$= \lim_{x\to0+}\frac{\cos x - x\sin x - \cos x}{\sin^2 x + 2x\sin x\cos x} \quad[\text{if it exists, using L'Hôpital}]$$
$$= -\lim_{x\to0+}\frac{x}{\sin x + 2x\cos x}$$
$$= -\lim_{x\to0+}\frac{1}{\cos x + 2\cos x - 2x\sin x} \quad[\text{if it exists, using L'Hôpital}]$$
$$= -\frac13 \quad[\text{by continuity}].$$
Since exp is continuous at −⅓,
$$\lim_{x\to0+}\left(\frac{\sin x}{x}\right)^{\frac{1}{1-\cos x}} = \lim_{x\to0+}\exp\left(\frac{\log\sin x - \log x}{1-\cos x}\right)$$
$$= \exp\left(\lim_{x\to0+}\frac{\log\sin x - \log x}{1-\cos x}\right) \quad[\text{by continuity of exp}]$$
$$= \exp\left(-\frac13\right).$$
L’Hôpital’s Rule is very seductive. But it is often not the best way to evaluate limits. Taylor’s
Theorem, to which we turn next, is often more useful, and indeed more informative.
If you doubt this, then use L'HR to work out lim_{x→0} (sinh x⁴ − x⁴)/(x − sin x)⁴, and then later use Taylor's
Theorem to write it down at sight—and decide which is better.
15 Taylor’s Theorem
15.1 Motivation
Suppose that f : (a − δ, a + δ) → R and that for some n ≥ 1 the derivatives f′, f″, ..., f^(n)
exist on the interval. For convenience write f^(0) := f.
We can then form the Taylor polynomials
$$P_n(x) := \sum_{k=0}^{n}\frac{f^{(k)}(a)}{k!}(x-a)^k.$$
We might hope, on the basis of our experience, that P_n(x) is a good approximation to f(x);
we'd like to justify that intuition.
We'd also like to consider the power series
$$P(x) := \sum_{k=0}^{\infty}\frac{f^{(k)}(a)}{k!}(x-a)^k$$
which is called the Taylor expansion of f at a. Our previous experience leads us to conjecture
that this must equal f (x).
To investigate these questions we will look at the 'error term'
$$E_n(x) := f(x) - P_n(x).$$
(Clearly, if f has derivatives of all orders, P_n(x) → f(x) as n → ∞ if and only if E_n(x) → 0.)
Unfortunately, even if f has derivatives of all orders, it need not be true that E_n(x) → 0 as
n → ∞, so we have to proceed more carefully. First, we will prove Taylor's Theorem, which will
give us information about En (x). Secondly, in individual cases we have to consider whether
En (x) → 0 as n → ∞.
Consider the function f defined by
$$f(x) = \begin{cases} \exp(-1/x^2) & \text{whenever } x\neq0, \\ 0 & \text{for } x=0.\end{cases}$$
A little experimentation shows that the k-th derivative must look like
$$f^{(k)}(x) = \begin{cases} Q_k(1/x)\exp(-1/x^2) & \text{whenever } x\neq0, \\ 0 & \text{for } x=0,\end{cases}$$
for some polynomial Q_k of degree 3k. We can prove this by induction: at points x ≠ 0 this
is routine use of linearity, the product rule and the chain rule. But at x = 0 we need to take
more care, and use the definition:
$$\frac{f^{(k)}(x) - f^{(k)}(0)}{x - 0} = \frac1x\,Q_k\!\left(\frac1x\right)\exp\left(-\frac1{x^2}\right) = \sum_{s=1}^{3k+1} a_s\,\frac{\exp(-1/x^2)}{x^s}$$
(for suitable coefficients a_s), which we must prove tends to zero as x → 0; if we change the variable to t = 1/x then we have
a finite sum of terms like t^s exp(−t²), which we know tend to zero as |t| tends to infinity.
So for this function f the series Σ_k f^(k)(0)x^k/k! is identically 0, so converges to 0 at every x. But the error
term E_n(x) is the same for all n (it equals f(x)) and so does not tend to 0 at any point
except 0.
Note that we can add this function to exp x and sin x and so on, and get functions with
the same set of derivatives at 0 as these functions, so that they will have the same Taylor
polynomials—but are different functions.
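Not in the notes: the flatness at 0 is easy to see numerically, since f(x)/x^k is minute for every k (this is the quantity controlling each difference quotient in the induction).

```python
import math

# f(x) = exp(-1/x^2) vanishes at 0 faster than any power of x.
def f(x):
    return math.exp(-1 / x ** 2) if x != 0 else 0.0

x = 0.05
for k in (1, 5, 20):
    print(k, f(x) / x ** k)    # all of these are astronomically small
```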
Remark 15.2.1. Functions defined and differentiable on C are very different: for them, our
naive intuition is a good guide—but that is next year’s Analysis course.
15.3 Taylor’s Theorem with Lagrange Remainder
We now concentrate on the Taylor polynomial and investigate its difference from the function.
Theorem 15.3.1 (Taylor’s Theorem). Let f : [a, b] → R. Suppose that for some n > 1 we
have that f , f ′ , f ′′ , . . . , f (n−1) exist and are continuous on [a, b] and that f (n) exists on (a, b).
Then there is a number ξ ∈ (a, b) such that
$$f(b) = \sum_{k=0}^{n-1}\frac{f^{(k)}(a)}{k!}(b-a)^k + \frac{f^{(n)}(\xi)}{n!}(b-a)^n.$$
Note 15.3.2. Recall that at the end points a and b 'differentiable' means 'left- (or right-)
differentiable'.
Note 15.3.3. The term f^(n)(ξ)(b − a)^n/n! is called Lagrange's form of the remainder. Note
that the crucial parameter ξ may depend on (i) the function f; (ii) the degree n; (iii) the
end points a and b.
Proof. We use the method of 'varying a constant': that is, we look at the following function
defined on [a, b]:
$$F(x) := \sum_{k=0}^{n-1}\frac{f^{(k)}(x)}{k!}(b-x)^k.$$
Let G(x) be continuous on [a, b] and differentiable on (a, b). We use Cauchy's Mean Value
Theorem on this pair of functions to see that there exists a ξ ∈ (a, b) such that
$$\frac{F(a) - F(b)}{G(a) - G(b)} = \frac{F'(\xi)}{G'(\xi)}.$$
That is,
$$\frac{\sum_{k=0}^{n-1}\frac{f^{(k)}(a)}{k!}(b-a)^k - f(b)}{G(a) - G(b)} = \frac{\frac{f^{(n)}(\xi)}{(n-1)!}(b-\xi)^{n-1}}{G'(\xi)}. \qquad (*)$$
But if we take
G(x) := (b − x)^n,
which is clearly continuous and differentiable on (a, b) with derivative −n(b − x)^{n−1} < 0,
then (*) simplifies at once to
$$f(b) = \sum_{k=0}^{n-1}\frac{f^{(k)}(a)}{k!}(b-a)^k + \frac{f^{(n)}(\xi)}{n!}(b-a)^n.$$
We have proved the strongest theorem we could. But often we know a bit more, and can
get, for example, this symmetric version:
Corollary 15.3.5 (Taylor's Theorem). Let f : (a − δ, a + δ) → R for some δ > 0. Suppose
that for some n ≥ 1 we have that f′, f″, ..., f^(n) exist. Let x ∈ (a − δ, a + δ). Then there is
a number ξ between a and x such that
$$f(x) = \sum_{k=0}^{n-1}\frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{f^{(n)}(\xi)}{n!}(x-a)^n.$$
Proof. If x > a then this is just the Taylor Theorem we have proved. If x < a we just
use the Taylor Theorem we have proved on the function f(−x) and sort out the signs and
inequalities. If x = a then take ξ = a.
In the proof of Taylor's Theorem we may use any function G which is continuous in [a, b],
differentiable in (a, b), and such that G′ ≠ 0. Then we will have a ξ ∈ (a, b) such that
$$f(b) = P_{n-1}(b) + \frac{G(b) - G(a)}{G'(\xi)}\cdot\frac{f^{(n)}(\xi)}{(n-1)!}(b-\xi)^{n-1}.$$
In particular, taking G(x) = x gives Cauchy's form of the remainder:
$$f(b) = P_{n-1}(b) + (b-a)\,\frac{f^{(n)}(\xi)}{(n-1)!}(b-\xi)^{n-1} \quad\text{for some } \xi\in(a,b).$$
Exercise 15.4.1. Try G(x) = (x − a)^m for a power m ≥ 1 to see what kind of Taylor
formula you can get.
15.5 The error estimate
Taylor's Theorem also provides us with an explicit estimate of the difference between f(x)
and its n-th Taylor approximation Σ_{k=0}^{n−1} f^(k)(a)(x − a)^k/k!:
Corollary 15.5.1. Let f : [a, b] → R satisfy the conditions in Taylor's Theorem, and let
$$E_n := \frac{|b-a|^n}{n!}\sup_{\xi\in(a,b)}\left|f^{(n)}(\xi)\right|.$$
Then
$$\left|f(x) - \sum_{k=0}^{n-1}\frac{f^{(k)}(a)}{k!}(x-a)^k\right| \le E_n \quad\text{for all } x\in[a,b].$$
Of course this may not be useful, as the supremum may be infinite. If however in a given
situation we know a bit more—for example, that f^(n) is differentiable on [a, b]—then we can
use standard calculus to evaluate E_n.
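A sketch (not in the notes) of the Corollary in action for f = sin, a = 0, where sup|f^(n)| ≤ 1 and so E_n = |x|^n/n!:

```python
import math

# Compare sin x with its Taylor polynomial of degree < n at 0, and with
# the error bound E_n = |x|^n / n! from Corollary 15.5.1 (sup|f^(n)| <= 1).
def taylor_sin(x, n):
    # sum_{k=0}^{n-1} f^(k)(0) x^k / k!; only odd k contribute for sin
    return sum((-1) ** (k // 2) * x ** k / math.factorial(k)
               for k in range(1, n, 2))

x, n = 1.2, 8
err = abs(math.sin(x) - taylor_sin(x, n))
bound = abs(x) ** n / math.factorial(n)
print(err, bound)
assert err <= bound
```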
Before we start, note that there is no point in trying to prove that these are equal on any
larger real domain, as the radius of convergence of the series is, by the ratio test, equal to 1,
and at the other end point x = −1 the series is the notoriously divergent Harmonic Series
Σ 1/n.
But we will prove equality on all of (−1, 1], in particular that
$$\log 2 = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}.$$
Proof. Consider f(x) = log(1 + x). We have already proved that on (−1, ∞) the function f
is differentiable with f′(x) = 1/(1 + x); and so, by the usual rules, we have
$$f^{(n)}(x) = \frac{(-1)^{n+1}(n-1)!}{(1+x)^n} \quad\text{for all } n\ge1.$$
Hence, by Taylor's Theorem (the symmetric version), for each n there is a ξ_n between 0 and x such that
$$\log(1+x) - \sum_{k=1}^{n-1}\frac{(-1)^{k-1}x^k}{k} = \frac{(-1)^{n-1}}{n}\left(\frac{x}{1+\xi_n}\right)^n.$$
So it will be enough to show that |x/(1 + ξ_n)| ≤ 1, for then the right-hand side is at most 1/n in modulus, which tends to 0.
For x > 0 this is no problem: 0 < ξ_n < 1, and so 1 + ξ_n > 1; hence x/(1 + ξ_n) < x ≤ 1.
For negative x it is not so easy; the nearer x is to −1, the nearer 1 + ξ_n may get to 0. However,
if x ≥ −½ we have
−½ ≤ x ≤ ξ_n ≤ 0
and so
½ ≤ 1 + x ≤ 1 + ξ_n ≤ 1,
which implies
2 ≥ 1/(1 + x) ≥ 1/(1 + ξ_n) ≥ 1;
multiplying by x < 0 yields
2x ≤ x/(1 + x) ≤ x/(1 + ξ_n) ≤ x.
Now 2x ≥ −1 and x ≤ 1, so we have
|x/(1 + ξ_n)| ≤ 1
as required.
That is, the functions log(1 + x) and Σ_{k=1}^∞ (−1)^{k−1}x^k/k are equal on [−½, 1]. For the remaining
x ∈ (−1, −½) we can instead note that the power series may be differentiated term by term inside its
radius of convergence, and its derivative Σ_{k=1}^∞ (−1)^{k−1}x^{k−1} = 1/(1 + x) agrees with that of
log(1 + x); as both functions vanish at 0, the Identity Theorem gives equality on all of (−1, 1).
That is, on the whole of (−1, 1] we have the required series expansion.
Remark 15.6.2. The last part has actually proved the result for x ∈ (−1, 1). It is only at
x = 1 that we have to prove that the error tends to zero.
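Not in the notes: the partial sums do converge to log 2, though slowly (the error after n terms is about 1/(2n)):

```python
import math

# Partial sums of the alternating harmonic series versus log 2.
total = 0.0
for n in range(1, 100_001):
    total += (-1) ** (n - 1) / n

print(total, math.log(2))
```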
16 The Binomial Theorem

In this section we use many of the theorems we have proved about uniform convergence and
continuity, power series, and monotonicity, as well as Taylor's Theorem. As well as proving an
important result we are showing off the techniques we now have available to us.
16.1 Motivation and Preliminary Algebra
By simple induction we can prove that for any natural number n (including 0) we have for
all real or complex x that
$$(1+x)^n = \sum_{k=0}^{n}\binom{n}{k}x^k;$$
where the coefficient $\binom{n}{k}$ of x^k can be proved to be
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{n\cdot(n-1)\cdots(n-k+1)}{k\cdot(k-1)\cdots1}.$$
In this section we are going to generalise these—in the case of some real values of x—to all
values of n, not just integers. Note that this is altogether deeper: (1 + x)p is defined for
non-integral p, and for (real) x > −1, to be the function exp(p log(1 + x)).
Definition 16.1.1. For all p ∈ R and all k ∈ N we extend the definition of binomial
coefficient as follows:
$$\binom{p}{0} := 1; \quad\text{and}\quad \binom{p}{k} := \frac{p(p-1)\cdots(p-k+1)}{k!}.$$
We now make sure that the key properties of binomial coefficients are still true in this more
general setting.
Lemma 16.1.2.
$$k\binom{p}{k} = p\binom{p-1}{k-1}, \quad\text{for all } k\ge1.$$
Lemma 16.1.3.
$$\binom{p}{k} + \binom{p}{k-1} = \binom{p+1}{k}, \quad\text{for all } k\ge1.$$
Note that the coefficients are non-zero provided p is not a natural number or zero; as we
have a proof of the expansion in that case we may assume that p ∉ N ∪ {0}.
Lemma 16.2.2. The function f defined on (−1, 1) by f (x) := (1 + x)p is differentiable, and
satisfies (1 + x)f ′ (x) = pf (x). Also, f (0) = 1.
Proof. The derivative is easily got by the chain rule from the definition of f ; it is f ′ (x) =
p(1+x)p−1 . Multiply by (1+x) and get the required relationship. The value at 0 is clear.
Proof. Use the ratio test; we have that |a_{k+1}x^{k+1}/a_k x^k| is
$$\left|\frac{p\cdot(p-1)\cdots(p-k)}{(k+1)\cdot k\cdots1}\cdot\frac{k\cdot(k-1)\cdots1}{p\cdot(p-1)\cdots(p-k+1)}\,x\right| = \left|\frac{p-k}{k+1}\,x\right| \to |x| \quad\text{as } k\to\infty.$$
Lemma 16.2.4. The function g defined on (−1, 1) by $g(x) = \sum_{k=0}^{\infty}\binom{p}{k}x^k$ is differentiable,
with derivative satisfying (1 + x)g′(x) = pg(x). Also, g(0) = 1.
Proof.
$$\begin{aligned}
(1+x)g'(x) &= (1+x)\sum_{k=0}^{\infty}\binom{p}{k}kx^{k-1} = (1+x)\sum_{k=1}^{\infty}\binom{p}{k}kx^{k-1}\\
&= p(1+x)\sum_{k=1}^{\infty}\binom{p-1}{k-1}x^{k-1}\\
&= p\left\{\sum_{k=1}^{\infty}\binom{p-1}{k-1}x^{k-1} + \sum_{k=1}^{\infty}\binom{p-1}{k-1}x^{k}\right\}\\
&= p\left\{\sum_{m=0}^{\infty}\binom{p-1}{m}x^{m} + \sum_{m=1}^{\infty}\binom{p-1}{m-1}x^{m}\right\}\\
&= p\left\{1 + \sum_{m=1}^{\infty}\left[\binom{p-1}{m} + \binom{p-1}{m-1}\right]x^{m}\right\}\\
&= p\left\{1 + \sum_{m=1}^{\infty}\binom{p}{m}x^{m}\right\} = p\sum_{m=0}^{\infty}\binom{p}{m}x^{m} = pg(x).
\end{aligned}$$
Proof of the Binomial Theorem. Consider φ(x) = g(x)/f(x), which is well-defined on (−1, 1) as
f(x) > 0. By the Quotient Rule and the lemmas,
$$\varphi'(x) = \frac{g'(x)f(x) - g(x)f'(x)}{f(x)^2} = \frac{p\,g(x)f(x) - g(x)\,p\,f(x)}{(1+x)f(x)^2} = 0.$$
Hence by the Identity Theorem, φ(x) is constant: φ(x) = φ(0) = 1. This implies that
f(x) = g(x) on (−1, 1).
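A sketch (not in the notes) comparing partial sums of the binomial series with (1 + x)^p for a non-integral p; the values p = 0.5, x = −0.7 are illustrative choices.

```python
import math

# binom(p, k) = p(p-1)...(p-k+1)/k! for real p, via a running product.
def binom(p, k):
    out = 1.0
    for j in range(k):
        out *= (p - j) / (j + 1)
    return out

p, x = 0.5, -0.7
partial = sum(binom(p, k) * x ** k for k in range(200))
print(partial, (1 + x) ** p)   # the two agree closely
```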
16.3 The end points: preliminary issue
The existence of these functions and their equality at the end points require a more sophisticated
argument. The following sections should be viewed as illustrations of the way Taylor's
Theorem can be exploited, rather than theorems to be learnt.
The cases x = 1 or x = −1 need to be considered separately. But there is a difference between
these! For x = −1 we have not yet defined (1 + x)^p; recall our definition for arbitrary real p, that
(1 + x)^p := exp(p log(1 + x)). For integral p (such as p = 0) we have the usual algebraic definition,
which is consistent with the exp-log definition when both apply. Can we define 0^p sensibly for any
other values of p? For p > 0 we'd clearly like to define 0^p = 0. But if we do so, then to preserve the
rule of exponents A^p A^q = A^{p+q} we cannot define negative powers; if p > 0 then 0^{−p} makes no sense.
So let us extend our definition of (1 + x)^p in this way, in the case when p > 0.
But we need to take care.
Lemma 16.3.1. If p > 0 then the function (1 + x)^p is continuous on [−1, ∞).
Lemma 16.3.2. If p ≥ 1 then the function (1 + x)^p is differentiable on [−1, ∞) with derivative
p(1 + x)^{p−1}.
Proofs. Exercises.
Let p ≤ −1. Then as remarked above, the function (1 + x)^p is not defined at x = −1. Further, the
expansion is not valid at x = 1:
Proposition 16.4.1. The series $\sum_{k=0}^{\infty}\frac{p\cdot(p-1)\cdots(p-k+1)}{k\cdot(k-1)\cdots1}$ is divergent.
Proof. Since p ≤ −1 we have |p − k| = k − p ≥ k + 1, so the moduli of the terms do not decrease;
the terms alternate in sign but as they do not tend to 0 the series diverges.
Let −1 < p < 0; note that p + 1 > 0. Again the function (1 + x)^p is not defined at x = −1. However,
now the expansion is valid at x = 1:
Proposition 16.5.1. The series $\sum_{k=0}^{\infty}\frac{p\cdot(p-1)\cdots(p-k+1)}{k\cdot(k-1)\cdots1}$ is convergent with sum 2^p.
Proof. We apply Taylor's Theorem to (1 + x)^p on the interval [0, 1] and find, for each n ≥ 1, a point
ξ_n ∈ (0, 1) such that
$$2^p = \sum_{k=0}^{n-1}\frac{p\cdot(p-1)\cdots(p-k+1)}{k\cdot(k-1)\cdots1} + E_n \quad\text{where}\quad E_n = \frac{p\cdot(p-1)\cdots(p-n+1)}{n\cdot(n-1)\cdots1}(1+\xi_n)^{p-n}.$$
We have then that
$$|E_n| \le \left|\frac{p\cdot(p-1)\cdots(p-n+1)}{n\cdot(n-1)\cdots1}\right|,$$
since 0 < (1 + ξ_n)^{p−n} < 1; and we will have the result if we prove that this tends to 0 as n → ∞. We rewrite the part depending
on n as
$$\left|\frac{[(p+1)-1]\cdots[(p+1)-s]\cdots[(p+1)-n]}{1\cdot2\cdots n}\right| = \left(1-\frac{p+1}{1}\right)\cdot\left(1-\frac{p+1}{2}\right)\cdots\left(1-\frac{p+1}{s}\right)\cdots\left(1-\frac{p+1}{n}\right).$$
Now exp(−x) + x − 1 has positive derivative on (0, 1) so by the MVT we have that
$$1 - \frac{p+1}{s} \le \exp\left(-\frac{p+1}{s}\right),$$
so that
$$|E_n| \le \exp\left(-(p+1)\sum_{s=1}^{n}\frac{1}{s}\right).$$
Since p + 1 > 0 and the Harmonic Series diverges, the right-hand side tends to 0 as n → ∞, as required.
Finally let p > 0 (and p ∉ N); then the expansion is valid at both end points x = ±1.
Proof. The end point x = +1 is straightforward; use Taylor's Theorem as before and consider the
error estimate
$$E_n = \frac{p\cdot(p-1)\cdots(p-n+1)}{n\cdot(n-1)\cdots1}(1+\xi_n)^{p-n}$$
for some ξ_n ∈ (0, 1). Then
$$|E_n| \le \frac{p}{n}\left|\frac{(p-1)\cdots(p-n+1)}{1\cdot2\cdots(n-1)}\right|\,2^p.$$
Now |(p − s)/s| ≤ 1 whenever 2s ≥ p; so we get that
$$|E_n| \le \frac{p}{n}\left|\frac{(p-1)\cdots(p-\lfloor p/2\rfloor)}{1\cdot2\cdots\lfloor p/2\rfloor}\right|\,2^p \to 0 \quad\text{as } n\to\infty$$
as required. The end point x = −1 is more difficult. What we do is prove that the sum converges.
Noting that as soon as k > p + 1 all the terms have the same sign, we see that this means
we have proved that the series is absolutely convergent. Now by the properties of power series
$\sum_{k=0}^{\infty}\frac{p\cdot(p-1)\cdots(p-k+1)}{k\cdot(k-1)\cdots1}x^k$ is absolutely convergent on (−1, 1). In particular we have that
the series is absolutely convergent on the closed interval [−1, 0]. Hence the series is uniformly convergent
on that interval; and so its sum is continuous on [−1, 0]. As the series is equal to (1 + x)^p
on (−1, 0] we have by continuity that there is equality at −1 as well.
So we must prove that the series converges. We claim that if we can prove this for any p then we can
prove it for p + 1. This is because for all n ≥ 2p + 2 we have that
$$\left|\frac{p+1}{p-n+1}\right| \le 1;$$
this allows us to compare the n-th terms and see that those for p + 1 are smaller in modulus. As both series are
ultimately series of terms of constant sign, the comparison test will yield that convergence for p
yields convergence for p + 1. So assume from now on that 0 < p < 1; it will suffice to deal with this
case.
The modulus of the n-th term can then be written
$$|u_n| = \frac{p}{n}\left(1-\frac{p}{1}\right)\cdots\left(1-\frac{p}{s}\right)\cdots\left(1-\frac{p}{n-1}\right).$$
¹⁵ The series Σ 1/nˢ is convergent for s > 1 by the Integral Test.