Introduction to Real Analysis
by Jiří Lebl
April 26, 2011
Typeset in LaTeX.
Copyright © 2009–2011 Jiří Lebl
This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0
United States License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons,
171 Second Street, Suite 300, San Francisco, California, 94105, USA.
You can use, print, duplicate, share these notes as much as you want. You can base your own notes
on these and reuse parts if you keep the license the same. If you plan to use these commercially (sell
them for more than just duplicating cost), then you need to contact me and we will work something
out. If you are printing a course pack for your students, then it is ﬁne if the duplication service is
charging a fee for printing and selling the printed copy. I consider that duplicating cost.
During the writing of these notes, the author was in part supported by NSF grant DMS-0900885.
See http://www.jirka.org/ra/ for more information (including contact information).
Contents
Introduction
0.1 Notes about these notes
0.2 About analysis
0.3 Basic set theory
1 Real Numbers
1.1 Basic properties
1.2 The set of real numbers
1.3 Absolute value
1.4 Intervals and the size of R
2 Sequences and Series
2.1 Sequences and limits
2.2 Facts about limits of sequences
2.3 Limit superior, limit inferior, and Bolzano-Weierstrass
2.4 Cauchy sequences
2.5 Series
3 Continuous Functions
3.1 Limits of functions
3.2 Continuous functions
3.3 Min-max and intermediate value theorems
3.4 Uniform continuity
4 The Derivative
4.1 The derivative
4.2 Mean value theorem
4.3 Taylor’s theorem
5 The Riemann Integral
5.1 The Riemann integral
5.2 Properties of the integral
5.3 Fundamental theorem of calculus
6 Sequences of Functions
6.1 Pointwise and uniform convergence
6.2 Interchange of limits
6.3 Picard’s theorem
Further Reading
Index
Introduction
0.1 Notes about these notes
This book is a one-semester course in basic analysis. These were my lecture notes for teaching Math
444 at the University of Illinois at Urbana-Champaign (UIUC) in the Fall semester of 2009. The course is
a first course in mathematical analysis aimed at students who do not necessarily wish to continue on to
graduate study in mathematics. A prerequisite for the course is a basic proof course, for example
one using the (unfortunately rather pricey) book [DW]. The course does not cover topics such as
metric spaces, which a more advanced course would. It should be possible to use these notes for the
beginning of a more advanced course, but further material should be added.
The book normally used for the class at UIUC is Bartle and Sherbert, Introduction to Real
Analysis third edition [BS]. The structure of the notes mostly follows the syllabus of UIUC Math 444
and therefore has some similarities with [BS]. Some topics covered in [BS] are covered in slightly
different order, some topics differ substantially from [BS] and some topics are not covered at all.
For example, we will deﬁne the Riemann integral using Darboux sums and not tagged partitions.
The Darboux approach is far more appropriate for a course of this level. In my view, [BS] seems
to be targeting a different audience than this course, and that is the reason for writing this present
book. The generalized Riemann integral is not covered at all.
As the integral is treated more lightly, we can spend some extra time on the interchange of limits
and in particular on a section on Picard’s theorem on the existence and uniqueness of solutions of
ordinary differential equations if time allows. This theorem is a wonderful example that uses many
results proved in the book.
Other excellent books exist. My favorite is without doubt Rudin’s excellent Principles of
Mathematical Analysis [R2] or as it is commonly and lovingly called baby Rudin (to distinguish
it from his other great analysis textbook). I have taken a lot of inspiration and ideas from Rudin.
However, Rudin is a bit more advanced and ambitious than this present course. For those that
wish to continue mathematics, Rudin is a ﬁne investment. An inexpensive alternative to Rudin is
Rosenlicht’s Introduction to Analysis [R1]. Rosenlicht may not be as dry as Rudin for those just
starting out in mathematics. There is also the freely downloadable Introduction to Real Analysis by
William Trench [T] for those that do not wish to invest much money.
I want to mention a note about the style of some of the proofs. Many proofs that are traditionally
done by contradiction, I prefer to do by a direct proof or at least by a contrapositive. While the
book does include proofs by contradiction, I only do so when the contrapositive statement seemed
too awkward, or when the contradiction follows rather quickly. In my opinion, contradiction is
more likely to get the beginning student into trouble. In a contradiction proof, we are arguing about
objects that do not exist. In a direct proof or a contrapositive proof one can be guided by intuition,
but in a contradiction proof, intuition usually leads us astray.
I also try to avoid unnecessary formalism where it is unhelpful. Furthermore, the proofs and the
language get slightly less formal as we progress through the book, as more and more details are left
out to avoid clutter.
As a general rule, I will use := instead of = to define an object rather than to simply show
equality. I use this symbol rather more liberally than is usual. I may use it even when the context is
“local,” that is, I may simply define a function f(x) := x^2 for a single exercise or example.
If you are teaching (or being taught) with [BS], here is the correspondence of the sections. The
correspondences are only approximate; the material in these notes and in [BS] differs, as described
above.
Section    Section in [BS]
§0.3       §1.1–§1.3
§1.1       §2.1 and §2.3
§1.2       §2.3 and §2.4
§1.3       §2.2
§1.4       §2.5
§2.1       parts of §3.1, §3.2, §3.3, §3.4
§2.2       §3.2
§2.3       §3.3 and §3.4
§2.4       §3.5
§2.5       §3.7
§3.1       §4.1–§4.2
§3.2       §5.1 (and §5.2?)
§3.3       §5.3 ?
§3.4       §5.4
§4.1       §6.1
§4.2       §6.2
§4.3       §6.3
§5.1       §7.1, §7.2
§5.2       §7.2
§5.3       §7.3
§6.1       §8.1
§6.2       §8.2
§6.3       Not in [BS]
It is possible to skip or skim some material in the book as it is not used later on. The optional
material is marked in the notes that appear below every section title. Section §0.3 can be covered
lightly, or left as reading. The material within is considered prerequisite. The section on Taylor’s
theorem (§4.3) can safely be skipped as it is never used later. Uncountability of R in §1.4 can safely
be skipped. The alternative proof of Bolzano-Weierstrass in §2.3 can safely be skipped. And of
course, the section on Picard’s theorem can also be skipped if there is no time at the end of the
course, though I have not marked the section optional.
Finally I would like to acknowledge Jana Maříková and Glen Pugh for teaching with the notes
and finding many typos and errors. I would also like to thank Dan Stoneham, Frank Beatrous, and
an anonymous reader for suggestions and finding errors and typos.
0.2 About analysis
Analysis is the branch of mathematics that deals with inequalities and limiting processes. The
present course will deal with the most basic concepts in analysis. The goal of the course is to
acquaint the reader with the basic concepts of rigorous proof in analysis, and also to set a ﬁrm
foundation for calculus of one variable.
Calculus has prepared you (the student) for using mathematics without telling you why what
you have learned is true. To use (or teach) mathematics effectively, you cannot simply know what is
true, you must know why it is true. This course is to tell you why calculus is true. It is here to give
you a good understanding of the concept of a limit, the derivative, and the integral.
Let us give an analogy to make the point. An auto mechanic that has learned to change the oil,
ﬁx broken headlights, and charge the battery, will only be able to do those simple tasks. He will
not be able to work independently to diagnose and ﬁx problems. A high school teacher that does
not understand the deﬁnition of the Riemann integral will not be able to properly answer all the
student’s questions that could come up. To this day I remember several nonsensical statements I
heard from my calculus teacher in high school who simply did not understand the concept of the
limit, though he could “do” all problems in calculus.
We will start with discussion of the real number system, most importantly its completeness
property, which is the basis for all that we will talk about. We will then discuss the simplest form
of a limit, that is, the limit of a sequence. We will then move to study functions of one variable,
continuity, and the derivative. Next, we will deﬁne the Riemann integral and prove the fundamental
theorem of calculus. We will end with discussion of sequences of functions and the interchange of
limits.
Let me give perhaps the most important difference between analysis and algebra. In algebra, we
prove equalities directly. That is, we prove that an object (a number perhaps) is equal to another
object. In analysis, we generally prove inequalities. To illustrate the point, consider the following
statement.
Let x be a real number. If 0 ≤ x < ε is true for all real numbers ε > 0, then x = 0.
This statement is the general idea of what we do in analysis. If we wish to show that x = 0, we
will show that 0 ≤ x < ε for all positive ε.
The term “real analysis” is a little bit of a misnomer. I prefer to normally use just “analysis.”
The other type of analysis, that is, “complex analysis” really builds up on the present material,
rather than being distinct. Furthermore, a more advanced course on “real analysis” would talk about
complex numbers often. I suspect the nomenclature is just historical baggage.
Let us get on with the show. . .
0.3 Basic set theory
Note: 1–3 lectures (some material can be skipped or covered lightly)
Before we can start talking about analysis we need to fix some language. Modern∗ analysis
uses the language of sets, and therefore that is where we will start. We will talk about sets in a
rather informal way, using the so-called “naïve set theory.” Do not worry, that is what the majority of
mathematicians use, and it is hard to get into trouble.
It will be assumed that the reader has seen basic set theory and has had a course in basic proof
writing. This section should be thought of as a refresher.
0.3.1 Sets
Definition 0.3.1. A set is just a collection of objects called elements or members of a set. A set
with no objects is called the empty set and is denoted by ∅ (or sometimes by {}).
The best way to think of a set is like a club with a certain membership. For example, the students
who play chess are members of the chess club. However, do not take the analogy too far. A set is
only deﬁned by the members that form the set; two sets that have the same members are the same
set.
Most of the time we will consider sets of numbers. For example, the set
S := {0, 1, 2}
is the set containing the three elements 0, 1, and 2. We write
1 ∈ S
to denote that the number 1 belongs to the set S. That is, 1 is a member of S. Similarly we write
7 ∉ S
to denote that the number 7 is not in S. That is, 7 is not a member of S. The elements of all sets
under consideration come from some set we call the universe. For simplicity, we often consider the
universe to be a set that contains only the elements (for example numbers) we are interested in. The
universe is generally understood from context and is not explicitly mentioned. In this course, our
universe will most often be the set of real numbers.
The elements of a set will usually be numbers. Do note, however, the elements of a set can also
be other sets, so we can have a set of sets as well.
A set can contain some of the same elements as another set. For example,
T := {0, 2}
contains the numbers 0 and 2. In this case all elements of T also belong to S. We write T ⊂ S. More
formally we have the following deﬁnition.
∗ The term “modern” refers to the late 19th century up to the present.
Definition 0.3.2.
(i) A set A is a subset of a set B if x ∈ A implies that x ∈ B, and we write A ⊂ B. That is, all
members of A are also members of B.
(ii) Two sets A and B are equal if A ⊂ B and B ⊂ A. We write A = B. That is, A and B contain
exactly the same elements. If it is not true that A and B are equal, then we write A ≠ B.
(iii) A set A is a proper subset of B if A ⊂ B and A ≠ B. We write A ⊊ B.
When A = B, we consider A and B to just be two names for the same exact set. For example, for
S and T defined above we have T ⊂ S, but T ≠ S. So T is a proper subset of S. At this juncture, we
also mention the set building notation,
{x ∈ A : P(x)}.
This notation refers to a subset of the set A containing all elements of A that satisfy the property
P(x). The notation is sometimes abbreviated (A is not mentioned) when understood from context.
Furthermore, x is sometimes replaced with a formula to make the notation easier to read. Let us see
some examples of sets.
Example 0.3.3: The following are sets including the standard notations for these.
(i) The set of natural numbers, N := {1, 2, 3, . . .}.
(ii) The set of integers, Z := {0, −1, 1, −2, 2, . . .}.
(iii) The set of rational numbers, Q := {m/n : m, n ∈ Z and n ≠ 0}.
(iv) The set of even natural numbers, {2m : m ∈ N}.
(v) The set of real numbers, R.
Note that N ⊂ Z ⊂ Q ⊂ R.
There are many operations we will want to do with sets.
Deﬁnition 0.3.4.
(i) A union of two sets A and B is defined as
A ∪ B := {x : x ∈ A or x ∈ B}.
(ii) An intersection of two sets A and B is defined as
A ∩ B := {x : x ∈ A and x ∈ B}.
(iii) A complement of B relative to A (or set-theoretic difference of A and B) is defined as
A \ B := {x : x ∈ A and x ∉ B}.
(iv) We just say complement of B and write B^c if A is understood from context. A is either the
entire universe or is the obvious set that contains B.
(v) We say that sets A and B are disjoint if A ∩ B = ∅.
The notation B^c may be a little vague at this point. But for example if the set B is a subset of the
real numbers R, then B^c will mean R \ B. If B is naturally a subset of the natural numbers, then B^c
is N \ B. If ambiguity would ever arise, we will use the set difference notation A \ B.
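The operations just defined can be tried out on small finite sets. The following is only a sketch: the particular sets and the choice of universe are assumptions made for illustration, and Python's built-in set type stands in for the sets of the text.

```python
# Finite stand-ins for the sets in the text; the universe is an assumption of this sketch.
universe = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4}

union = A | B                  # A ∪ B
intersection = A & B           # A ∩ B
difference = A - B             # A \ B, the complement of B relative to A
complement_B = universe - B    # B^c, with the universe understood from context

print(union, intersection, difference, complement_B)
print(A.isdisjoint({7, 8}))    # disjoint means the intersection is empty
```

Note that the complement only makes sense once the universe is fixed, which mirrors the remark about B^c above.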
[Figure: four Venn diagrams of sets A and B, with panels A ∪ B, A ∩ B, A \ B, and B^c.]
Figure 1: Venn diagrams of set operations.
We illustrate the operations on the Venn diagrams in Figure 1. Let us now establish one of the most
basic theorems about sets and logic.
Theorem 0.3.5 (DeMorgan). Let A, B, C be sets. Then
(B ∪ C)^c = B^c ∩ C^c,
(B ∩ C)^c = B^c ∪ C^c,
or, more generally,
A \ (B ∪ C) = (A \ B) ∩ (A \ C),
A \ (B ∩ C) = (A \ B) ∪ (A \ C).
Proof. We note that the first statement is proved by the second statement if we assume that set A is
our “universe.”
Let us prove A \ (B ∪ C) = (A \ B) ∩ (A \ C). Remember the definition of equality of sets. First,
we must show that if x ∈ A \ (B ∪ C), then x ∈ (A \ B) ∩ (A \ C). Second, we must also show that if
x ∈ (A \ B) ∩ (A \ C), then x ∈ A \ (B ∪ C).
So let us assume that x ∈ A \ (B ∪ C). Then x is in A, but not in B nor C. Hence x is in A and not
in B, that is, x ∈ A \ B. Similarly x ∈ A \ C. Thus x ∈ (A \ B) ∩ (A \ C).
On the other hand suppose that x ∈ (A \ B) ∩ (A \ C). In particular x ∈ (A \ B) and so x ∈ A and
x ∉ B. Also as x ∈ (A \ C), then x ∉ C. Hence x ∈ A \ (B ∪ C).
The proof of the other equality is left as an exercise.
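A finite check is no substitute for the proof, but it can help build confidence in the statement. The sketch below tests both identities for every choice of B and C among the subsets of a small set; the helper names `subsets` and `de_morgan_holds` and the particular sets are assumptions made for illustration.

```python
from itertools import combinations

def subsets(s):
    """Return all subsets of the finite set s as a list of Python sets."""
    items = sorted(s)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def de_morgan_holds(A, B, C):
    """Check both identities of Theorem 0.3.5 for the given finite sets."""
    first = A - (B | C) == (A - B) & (A - C)
    second = A - (B & C) == (A - B) | (A - C)
    return first and second

A = {1, 2, 3, 4}
# B and C need not be subsets of A, so the element 5 is allowed as well.
candidates = subsets(A | {5})
all_ok = all(de_morgan_holds(A, B, C) for B in candidates for C in candidates)
print(all_ok)
```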
We will also need to intersect or union several sets at once. If there are only finitely many, then
we just apply the union or intersection operation several times. However, suppose that we have an
infinite collection of sets (a set of sets) {A_1, A_2, A_3, . . .}. We define
⋃_{n=1}^∞ A_n := {x : x ∈ A_n for some n ∈ N},
⋂_{n=1}^∞ A_n := {x : x ∈ A_n for all n ∈ N}.
We could also have sets indexed by two integers. For example, we could have the set of sets
{A_{1,1}, A_{1,2}, A_{2,1}, A_{1,3}, A_{2,2}, A_{3,1}, . . .}. Then we can write
⋃_{n=1}^∞ ⋃_{m=1}^∞ A_{n,m} = ⋃_{n=1}^∞ ( ⋃_{m=1}^∞ A_{n,m} ).
And similarly with intersections.
It is not hard to see that we could take the unions in any order. However, switching unions and
intersections is not generally permitted without proof. For example:
⋃_{n=1}^∞ ⋂_{m=1}^∞ {k ∈ N : mk < n} = ⋃_{n=1}^∞ ∅ = ∅.
However,
⋂_{m=1}^∞ ⋃_{n=1}^∞ {k ∈ N : mk < n} = ⋂_{m=1}^∞ N = N.
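The two inner computations in this example can be justified as follows. This is a short sketch of the reasoning, written in LaTeX, and is not part of the original argument.

```latex
% Fix n. For any k \in \mathbb{N}, the choice m = n gives mk = nk \geq n,
% so no k satisfies mk < n for every m.
\bigcap_{m=1}^\infty \{ k \in \mathbb{N} : mk < n \} = \emptyset
\quad \text{for every } n \in \mathbb{N}.

% Fix m. For any k \in \mathbb{N}, the choice n = mk + 1 gives mk < n,
% so every k satisfies mk < n for some n.
\bigcup_{n=1}^\infty \{ k \in \mathbb{N} : mk < n \} = \mathbb{N}
\quad \text{for every } m \in \mathbb{N}.
```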
0.3.2 Induction
A common method of proof is the principle of induction. We start with the set of natural numbers
N = {1, 2, 3, . . .}. We note that the natural ordering on N (that is, 1 < 2 < 3 < 4 < ⋯) has a
wonderful property. The natural numbers N ordered in the natural way possess the well ordering
property or the well ordering principle.
Well ordering property of N. Every nonempty subset of N has a least (smallest) element.
The principle of induction is the following theorem, which is equivalent to the well ordering
property of the natural numbers.
Theorem 0.3.6 (Principle of induction). Let P(n) be a statement depending on a natural number n.
Suppose that
(i) (basis statement) P(1) is true,
(ii) (induction step) if P(n) is true, then P(n+1) is true.
Then P(n) is true for all n ∈ N.
Proof. Suppose that S is the set of natural numbers m for which P(m) is not true. Suppose that S is
nonempty. Then S has a least element by the well ordering principle. Let us call m the least element
of S. We know that 1 ∉ S by assumption. Therefore m > 1 and m − 1 is a natural number as well.
Since m was the least element of S, we know that P(m − 1) is true. But by the induction step we can
see that P(m − 1 + 1) = P(m) is true, contradicting the statement that m ∈ S. Therefore S is empty
and P(n) is true for all n ∈ N.
Sometimes it is convenient to start at a different number than 1, but all that changes is the
labeling. The assumption that P(n) is true in “if P(n) is true, then P(n+1) is true” is usually called
the induction hypothesis.
Example 0.3.7: Let us prove that for all n ∈ N we have
2^{n−1} ≤ n!.
We let P(n) be the statement that 2^{n−1} ≤ n! is true. By plugging in n = 1, we can see that P(1) is
true.
Suppose that P(n) is true. That is, suppose that 2^{n−1} ≤ n! holds. Multiply both sides by 2 to
obtain
2^n ≤ 2(n!).
As 2 ≤ (n + 1) when n ∈ N, we have 2(n!) ≤ (n + 1)(n!) = (n + 1)!. That is,
2^n ≤ 2(n!) ≤ (n + 1)!,
and hence P(n + 1) is true. By the principle of induction, we see that P(n) is true for all n, and
hence 2^{n−1} ≤ n! is true for all n ∈ N.
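The inequality and the chain used in the induction step can be spot-checked numerically. The sketch below only tests finitely many n, so it illustrates the statement rather than proving it; the function names and the cutoff 50 are assumptions made for illustration.

```python
from math import factorial

def P(n):
    """The statement P(n): 2^(n-1) <= n!."""
    return 2 ** (n - 1) <= factorial(n)

def induction_step_chain(n):
    """The chain 2^n <= 2(n!) <= (n+1)! from the induction step."""
    return 2 ** n <= 2 * factorial(n) <= factorial(n + 1)

base_case = P(1)
all_checked = all(P(n) and induction_step_chain(n) for n in range(1, 51))
print(base_case, all_checked)
```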
Example 0.3.8: We claim that for all c ≠ 1, we have that
1 + c + c^2 + ⋯ + c^n = (1 − c^{n+1}) / (1 − c).
Proof: It is easy to check that the equation holds with n = 1. Suppose that it is true for n. Then
1 + c + c^2 + ⋯ + c^n + c^{n+1} = (1 + c + c^2 + ⋯ + c^n) + c^{n+1}
= (1 − c^{n+1}) / (1 − c) + c^{n+1}
= (1 − c^{n+1} + (1 − c) c^{n+1}) / (1 − c)
= (1 − c^{n+2}) / (1 − c).
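The formula can also be compared against a direct sum using exact rational arithmetic, which avoids rounding issues. This is a sketch only: the function names and the sample values of c are assumptions made for illustration.

```python
from fractions import Fraction

def geometric_sum(c, n):
    """Compute 1 + c + c^2 + ... + c^n term by term."""
    return sum(c ** j for j in range(n + 1))

def closed_form(c, n):
    """Compute (1 - c^(n+1)) / (1 - c); requires c != 1."""
    return (1 - c ** (n + 1)) / (1 - c)

samples = [Fraction(3, 7), Fraction(-5, 2), Fraction(2), Fraction(0)]
agrees = all(
    geometric_sum(c, n) == closed_form(c, n)
    for c in samples
    for n in range(1, 20)
)
print(agrees)
```

The case c = 1 is excluded because the denominator 1 − c would be zero, which matches the hypothesis of the claim.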
There is an equivalent principle called strong induction. The proof that strong induction is
equivalent to induction is left as an exercise.
Theorem 0.3.9 (Principle of strong induction). Let P(n) be a statement depending on a natural
number n. Suppose that
(i) (basis statement) P(1) is true,
(ii) (induction step) if P(k) is true for all k = 1, 2, . . . , n, then P(n+1) is true.
Then P(n) is true for all n ∈ N.
0.3.3 Functions
Informally, a set-theoretic function f taking a set A to a set B is a mapping that to each x ∈ A assigns
a unique y ∈ B. We write f : A → B. For example, we could define a function f : S → T taking
S = {0, 1, 2} to T = {0, 2} by assigning f(0) := 2, f(1) := 2, and f(2) := 0. That is, a function
f : A → B is a black box, into which we can stick an element of A and the function will spit out an
element of B. Sometimes f is called a mapping and we say that f maps A to B.
Often, functions are defined by some sort of formula; however, you should really think of a
function as just a very big table of values. The subtle issue here is that a single function can have
several different formulas, all giving the same function. Also, a function need not have any formula
that computes its values.
To define a function rigorously, first let us define the Cartesian product.
Definition 0.3.10. Let A and B be sets. Then the Cartesian product is the set of tuples defined as
follows.
A × B := {(x, y) : x ∈ A, y ∈ B}.
For example, the set [0, 1] × [0, 1] is a set in the plane bounded by a square with vertices (0, 0),
(0, 1), (1, 0), and (1, 1). When A and B are the same set we sometimes use a superscript 2 to denote
such a product. For example [0, 1]^2 = [0, 1] × [0, 1], or R^2 = R × R (the Cartesian plane).
Definition 0.3.11. A function f : A → B is a subset of A × B such that for each x ∈ A, there is a
unique (x, y) ∈ f. Sometimes the set f is called the graph of the function rather than the function
itself.
The set A is called the domain of f (and sometimes confusingly denoted D(f)). The set
R(f) := {y ∈ B : there exists an x such that (x, y) ∈ f}
is called the range of f.
Note that R( f ) can possibly be a proper subset of B, while the domain of f is always equal to A.
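For finite sets, the definition of a function as a set of pairs can be checked directly. The sketch below uses the example f : S → T from the beginning of this subsection; the helper names `is_function` and `range_of`, and the sets `not_f` and `g`, are assumptions made for illustration.

```python
def is_function(pairs, A, B):
    """Check that pairs is a subset of A x B with exactly one pair (x, y) for each x in A."""
    if not all(x in A and y in B for (x, y) in pairs):
        return False
    return all(sum(1 for (x, _) in pairs if x == a) == 1 for a in A)

def range_of(pairs):
    """Return R(f), the set of second entries that actually occur."""
    return {y for (_, y) in pairs}

S = {0, 1, 2}
T = {0, 2}
f = {(0, 2), (1, 2), (2, 0)}      # f(0) := 2, f(1) := 2, f(2) := 0
not_f = {(0, 2), (0, 0), (1, 2)}  # 0 appears twice and 2 never appears
g = {(0, 2), (1, 2), (2, 2)}      # a function whose range is a proper subset of T

print(is_function(f, S, T), range_of(f))
print(is_function(not_f, S, T))
print(is_function(g, S, T), range_of(g))
```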
Example 0.3.12: From calculus, you are most familiar with functions taking real numbers to real
numbers. However, you have seen some other types of functions as well. For example the derivative
is a function mapping the set of differentiable functions to the set of all functions. Another example
is the Laplace transform, which also takes functions to functions. Yet another example is the
function that takes a continuous function g defined on the interval [0, 1] and returns the number
∫_0^1 g(x) dx.
Definition 0.3.13. Let f : A → B be a function. Let C ⊂ A. Define the image (or direct image) of C
as
f(C) := {f(x) ∈ B : x ∈ C}.
Let D ⊂ B. Define the inverse image as
f^{−1}(D) := {x ∈ A : f(x) ∈ D}.
Example 0.3.14: Define the function f : R → R by f(x) := sin(πx). Then f([0, 1/2]) = [0, 1],
f^{−1}({0}) = Z, etc. . . .
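On a finite domain, the image and the inverse image can be computed by following the definitions literally. The sketch below uses the squaring function on a small set of integers instead of the sine example, since the latter involves infinite sets; the names `image` and `inverse_image` and the chosen domain are assumptions made for illustration.

```python
def image(f, C):
    """Return f(C) = {f(x) : x in C}."""
    return {f(x) for x in C}

def inverse_image(f, A, D):
    """Return f^{-1}(D) = {x in A : f(x) in D}, where A is the domain."""
    return {x for x in A if f(x) in D}

A = set(range(-3, 4))          # the domain {-3, ..., 3}

def f(x):
    return x * x

print(image(f, {-1, 0, 1}))
print(inverse_image(f, A, {1, 4}))
print(inverse_image(f, A, {5}))  # nothing in A maps to 5
```

Note that the inverse image is defined even though this f has no inverse function, and that it can be empty.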
Proposition 0.3.15. Let f : A → B. Let C, D be subsets of B. Then
f^{−1}(C ∪ D) = f^{−1}(C) ∪ f^{−1}(D),
f^{−1}(C ∩ D) = f^{−1}(C) ∩ f^{−1}(D),
f^{−1}(C^c) = (f^{−1}(C))^c.
Read the last line as f^{−1}(B \ C) = A \ f^{−1}(C).
Proof. Let us start with the union. Suppose that x ∈ f^{−1}(C ∪ D). That means that x maps to C or D.
Thus f^{−1}(C ∪ D) ⊂ f^{−1}(C) ∪ f^{−1}(D). Conversely if x ∈ f^{−1}(C), then x ∈ f^{−1}(C ∪ D). Similarly
for x ∈ f^{−1}(D). Hence f^{−1}(C ∪ D) ⊃ f^{−1}(C) ∪ f^{−1}(D), and we have equality.
The rest of the proof is left as an exercise.
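Before attempting the remaining parts, it may help to see all three identities hold on one concrete finite example. This is a sketch under assumed choices of A, B, C, D, and f, and checking one example is of course not a proof.

```python
A = set(range(-3, 4))    # domain {-3, ..., 3}
B = set(range(0, 10))    # codomain {0, ..., 9}

def f(x):
    return x * x         # maps A into B

def inverse_image(D):
    """Return f^{-1}(D) = {x in A : f(x) in D}."""
    return {x for x in A if f(x) in D}

C = {0, 1, 2}
D = {1, 4, 9}

union_ok = inverse_image(C | D) == inverse_image(C) | inverse_image(D)
intersection_ok = inverse_image(C & D) == inverse_image(C) & inverse_image(D)
complement_ok = inverse_image(B - C) == A - inverse_image(C)
print(union_ok, intersection_ok, complement_ok)
```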
The proposition does not hold for direct images. We do have the following weaker result.
Proposition 0.3.16. Let f : A → B. Let C, D be subsets of A. Then
f(C ∪ D) = f(C) ∪ f(D),
f(C ∩ D) ⊂ f(C) ∩ f(D).
The proof is left as an exercise.
Definition 0.3.17. Let f : A → B be a function. The function f is said to be injective or one-to-one
if f(x_1) = f(x_2) implies x_1 = x_2. In other words, f^{−1}({y}) is empty or consists of a single element
for all y ∈ B. We then call f an injection.
The function f is said to be surjective or onto if f (A) = B. We then call f a surjection.
Finally, a function that is both an injection and a surjection is said to be bijective and we say it
is a bijection.
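For a function on a finite set, stored as a table of values, these three properties can be tested mechanically. In the sketch below a Python dict plays the role of the table, with its keys forming the domain; the helper names and the example functions `g` and `h` are assumptions made for illustration, while `f` is the example f : S → T from earlier in this section.

```python
def is_injective(f):
    """f is a dict x -> f(x); injective means no value is hit twice."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, B):
    """Surjective means the image of the whole domain equals B."""
    return set(f.values()) == set(B)

def is_bijective(f, B):
    return is_injective(f) and is_surjective(f, B)

f = {0: 2, 1: 2, 2: 0}   # f : {0, 1, 2} -> {0, 2}
g = {0: 1, 1: 2, 2: 3}   # g : {0, 1, 2} -> {1, 2, 3}
h = {0: 1, 1: 2}         # h : {0, 1} -> {1, 2, 3}

print(is_injective(f), is_surjective(f, {0, 2}))
print(is_bijective(g, {1, 2, 3}))
print(is_injective(h), is_surjective(h, {1, 2, 3}))
```

Whether a function is surjective depends on the set B it is declared to map into, which is why B is passed explicitly.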
When f : A → B is a bijection, then f^{−1}({y}) is always a unique element of A, and we could
then consider f^{−1} as a function f^{−1} : B → A. In this case we call f^{−1} the inverse function of f. For
example, for the bijection f(x) := x^3 we have f^{−1}(x) = ∛x.
A ﬁnal piece of notation for functions that we will need is the composition of functions.
Definition 0.3.18. Let f : A → B, g : B → C. Then we define a function g ∘ f : A → C as follows.
(g ∘ f)(x) := g(f(x)).
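Composition translates directly into code: apply f first, then g. The following is a small sketch in which the helper `compose` and the two sample functions are assumptions made for illustration.

```python
def compose(g, f):
    """Return the function g ∘ f, that is, x -> g(f(x))."""
    return lambda x: g(f(x))

def f(x):
    return x + 1

def g(y):
    return 2 * y

g_after_f = compose(g, f)
f_after_g = compose(f, g)
print(g_after_f(3), f_after_g(3))
```

The two results differ, which shows that the order of composition matters in general.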
0.3.4 Cardinality
A very subtle issue in set theory and one generating a considerable amount of confusion among
students is that of cardinality, or “size” of sets. The concept of cardinality is important in modern
mathematics in general and in analysis in particular. In this section, we will see the ﬁrst really
unexpected theorem.
Definition 0.3.19. Let A and B be sets. We say A and B have the same cardinality when there exists
a bijection f : A → B. We denote by |A| the equivalence class of all sets with the same cardinality
as A and we simply call |A| the cardinality of A.
Note that A has the same cardinality as the empty set if and only if A itself is the empty set. We
then write |A| := 0.
Definition 0.3.20. Suppose that A has the same cardinality as {1, 2, 3, . . . , n} for some n ∈ N. We
then write |A| := n, and we say that A is finite. When A is the empty set, we also call A finite.
We say that A is infinite or “of infinite cardinality” if A is not finite.
That the notation |A| = n is justified we leave as an exercise. That is, for each nonempty finite set
A, there exists a unique natural number n such that there exists a bijection from A to {1, 2, 3, . . . , n}.
We can also order sets by size.
Definition 0.3.21. We write
|A| ≤ |B|
if there exists an injection from A to B. We write |A| = |B| if A and B have the same cardinality. We
write |A| < |B| if |A| ≤ |B|, but A and B do not have the same cardinality.
We state without proof that A and B have the same cardinality if and only if |A| ≤ |B| and
|B| ≤ |A|. This is the so-called Cantor-Bernstein-Schroeder theorem. Furthermore, if A and B are
any two sets, we can always write |A| ≤ |B| or |B| ≤ |A|. The issues surrounding this last statement
are very subtle. As we will not require either of these two statements, we omit proofs.
The interesting cases of sets are inﬁnite sets. We start with the following deﬁnition.
Definition 0.3.22. If |A| = |N|, then A is said to be countably infinite. If A is finite or countably
infinite, then we say A is countable. If A is not countable, then A is said to be uncountable.
Note that the cardinality of N is usually denoted as ℵ_0 (read as aleph-naught)†.
Example 0.3.23: The set of even natural numbers has the same cardinality as N. Proof: Given an
even natural number, write it as 2n for some n ∈ N. Then create a bijection taking 2n to n.
In fact, let us mention without proof the following characterization of infinite sets: A set is
infinite if and only if it is in one-to-one correspondence with a proper subset of itself.
Example 0.3.24: N × N is a countably infinite set. Proof: Arrange the elements of N × N as follows
(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), . . . . That is, always write down first all the elements whose
two entries sum to k, then write down all the elements whose entries sum to k + 1 and so on. Then
define a bijection with N by letting 1 go to (1, 1), 2 go to (1, 2) and so on.
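The arrangement described in this example can be produced by a short generator. This is a sketch of the listing only; the name `pairs` and the use of a Python generator are assumptions made for illustration.

```python
from itertools import count, islice

def pairs():
    """Yield the elements of N x N grouped by the sum of their two entries."""
    for total in count(2):                # total = first entry + second entry
        for first in range(1, total):
            yield (first, total - first)

first_six = list(islice(pairs(), 6))
print(first_six)
```

The natural number j is then sent to the j-th pair produced. Every pair (a, b) shows up after finitely many steps, in the group with sum a + b, and no pair is repeated.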
Example 0.3.25: The set of rational numbers is countable. Proof: (informal) Follow the same
procedure as in the previous example, writing 1/1, 1/2, 2/1, etc. . . . However, leave out any fraction
(such as 2/2) that has already appeared.
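The informal procedure can be sketched in code for the positive rationals, which is the part of the listing written out above. A fraction has already appeared exactly when its numerator and denominator share a common factor, so the sketch skips those; the name `positive_rationals` and this gcd test are assumptions made for illustration, and handling zero and the negative rationals is left aside.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def positive_rationals():
    """Yield 1/1, 1/2, 2/1, 1/3, 3/1, ... with each positive rational appearing once."""
    for total in count(2):                  # total = numerator + denominator
        for num in range(1, total):
            den = total - num
            if gcd(num, den) == 1:          # skip fractions such as 2/2 that already appeared
                yield Fraction(num, den)

first_five = list(islice(positive_rationals(), 5))
print(first_five)
```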
For completeness we mention the following statement. If A ⊂ B and B is countable, then A is
countable. Similarly if A is uncountable, then B is uncountable. As we will not need this statement
in the sequel, and as the proof requires the Cantor-Bernstein-Schroeder theorem mentioned above,
we will not give it here.
We give the ﬁrst truly striking result. First, we need a notation for the set of all subsets of a set.
Deﬁnition 0.3.26. If A is a set, we deﬁne the power set of A, denoted by P(A), to be the set of all
subsets of A.
† For the fans of the TV show Futurama, there is a movie theater in one episode called an ℵ_0-plex.
For example, if A := {1, 2}, then P(A) = {∅, {1}, {2}, {1, 2}}. Note that for a finite set A of
cardinality n, the cardinality of P(A) is 2^n. This fact is left as an exercise. That is, the cardinality
of P(A) is strictly larger than the cardinality of A, at least for finite sets. What is an unexpected
and striking fact is that this statement is still true for infinite sets.
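For a small finite set the power set can be listed by machine. The sketch below reproduces the example A = {1, 2} from the text; the function name `power_set` is an assumption made for illustration, and `frozenset` is used because Python sets cannot contain ordinary mutable sets.

```python
from itertools import combinations

def power_set(A):
    """Return the list of all subsets of the finite set A, each as a frozenset."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

subsets_of_A = power_set({1, 2})
print(subsets_of_A)
print(len(subsets_of_A))
```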
Theorem 0.3.27 (Cantor). |A| < |P(A)|. In particular, there exists no surjection from A onto
P(A).
Proof. There of course exists an injection f : A → P(A). For any x ∈ A, define f(x) := {x}.
Therefore |A| ≤ |P(A)|.
To finish the proof, we have to show that no function f : A → P(A) is a surjection. Suppose
that f : A → P(A) is a function. So for x ∈ A, f(x) is a subset of A. Define the set
B := {x ∈ A : x ∉ f(x)}.
We claim that B is not in the range of f and hence f is not a surjection. Suppose that there exists
an x_0 such that f(x_0) = B. Either x_0 ∈ B or x_0 ∉ B. If x_0 ∈ B, then x_0 ∉ f(x_0) = B, which is a
contradiction. If x_0 ∉ B, then x_0 ∈ f(x_0) = B, which is again a contradiction. Thus such an x_0 does
not exist. Therefore, B is not in the range of f, and f is not a surjection. As f was an arbitrary
function, no surjection can exist.
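The construction of the set B in the proof can be carried out explicitly when A is a small finite set. The sketch below runs through every function f from a three-element set to its power set and checks that B is never in the range of f. The names `diagonal_set` and `missed_every_time` and the choice A = {0, 1, 2} are assumptions made for illustration; the finite case is already covered by counting, so this only illustrates the mechanism of the proof.

```python
from itertools import combinations, product

def diagonal_set(A, f):
    """Return B = {x in A : x not in f(x)} for f given as a dict from A to subsets of A."""
    return frozenset(x for x in A if x not in f[x])

A = [0, 1, 2]
power_set = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

# Enumerate every function f : A -> P(A) and record whether B lies in the range of f.
missed_every_time = True
for values in product(power_set, repeat=len(A)):
    f = dict(zip(A, values))
    if diagonal_set(A, f) in f.values():
        missed_every_time = False

print(missed_every_time)
```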
One particular consequence of this theorem is that there do exist uncountable sets, as P(N)
must be uncountable. This fact is related to the fact that the set of real numbers (which we study
in the next chapter) is uncountable. The existence of uncountable sets may seem unintuitive, and
the theorem caused quite a controversy at the time it was announced. The theorem not only says
that uncountable sets exist, but that there in fact exist progressively larger and larger inﬁnite sets N,
P(N), P(P(N)), P(P(P(N))), etc. . . .
0.3.5 Exercises
Exercise 0.3.1: Show A \ (B ∩ C) = (A \ B) ∪ (A \ C).
Exercise 0.3.2: Prove that the principle of strong induction is equivalent to the standard induction.
Exercise 0.3.3: Finish the proof of Proposition 0.3.15.
Exercise 0.3.4: a) Prove Proposition 0.3.16.
b) Find an example for which equality of sets in f(C ∩ D) ⊂ f(C) ∩ f(D) fails. That is, find an f,
A, B, C, and D such that f(C ∩ D) is a proper subset of f(C) ∩ f(D).
Exercise 0.3.5 (Tricky): Prove that if A is finite, then there exists a unique number n such that there
exists a bijection between A and {1, 2, 3, . . . , n}. In other words, the notation |A| := n is justified.
Hint: Show that if n > m, then there is no injection from {1, 2, 3, . . . , n} to {1, 2, 3, . . . , m}.
Exercise 0.3.6: Prove
a) A∩(B∪C) = (A∩B) ∪(A∩C)
b) A∪(B∩C) = (A∪B) ∩(A∪C)
Exercise 0.3.7: Let A ∆ B denote the symmetric difference, that is, the set of all elements that belong
to either A or B, but not to both A and B.
a) Draw a Venn diagram for A ∆ B.
b) Show A ∆ B = (A \ B) ∪ (B \ A).
c) Show A ∆ B = (A ∪ B) \ (A ∩ B).
Exercise 0.3.8: For each n ∈ N, let A_n := {(n + 1)k : k ∈ N}.
a) Find A_1 ∩ A_2.
b) Find ⋃_{n=1}^∞ A_n.
c) Find ⋂_{n=1}^∞ A_n.
Exercise 0.3.9: Determine P(S) (the power set) for each of the following:
a) S = / 0,
b) S =¦1¦,
c) S =¦1, 2¦,
d) S =¦1, 2, 3, 4¦.
Exercise 0.3.10: Let f : A →B and g: B →C be functions.
a) Prove that if g◦ f is injective, then f is injective.
b) Prove that if g◦ f is surjective, then g is surjective.
c) Find an explicit example where $g \circ f$ is bijective, but neither $f$ nor $g$ is bijective.
Exercise 0.3.11: Prove that $n < 2^n$ by induction.
Exercise 0.3.12: Show that for a finite set $A$ of cardinality $n$, the cardinality of $P(A)$ is $2^n$.
Exercise 0.3.13: Prove
\[ \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n+1)} = \frac{n}{n+1} \]
for all $n \in \mathbb{N}$.
Exercise 0.3.14: Prove
\[ 1^3 + 2^3 + \cdots + n^3 = \left( \frac{n(n+1)}{2} \right)^2 \]
for all $n \in \mathbb{N}$.
Exercise 0.3.15: Prove that $n^3 + 5n$ is divisible by 6 for all $n \in \mathbb{N}$.
Exercise 0.3.16: Find the smallest $n \in \mathbb{N}$ such that $2(n+5)^2 < n^3$ and call it $n_0$. Show that $2(n+5)^2 < n^3$ for all $n \geq n_0$.
Exercise 0.3.17: Find all $n \in \mathbb{N}$ such that $n^2 < 2^n$.
Exercise 0.3.18: Finish the proof that the principle of induction is equivalent to the well ordering
property of N. That is, prove the well ordering property for N using the principle of induction.
Exercise 0.3.19: Give an example of a countable collection of finite sets $A_1, A_2, \ldots$, whose union is not a finite set.
Exercise 0.3.20: Give an example of a countable collection of infinite sets $A_1, A_2, \ldots$, with $A_j \cap A_k$ being infinite for all $j$ and $k$, such that $\bigcap_{j=1}^{\infty} A_j$ is nonempty and finite.
Chapter 1
Real Numbers
1.1 Basic properties
Note: 1.5 lectures
The main object we work with in analysis is the set of real numbers. As this set is so fundamental,
often much time is spent on formally constructing the set of real numbers. However, we will take an
easier approach here and just assume that a set with the correct properties exists. We need to start
with some basic deﬁnitions.
Deﬁnition 1.1.1. A set A is called an ordered set, if there exists a relation < such that
(i) For any x, y ∈ A, exactly one of x < y, x = y, or y < x holds.
(ii) If x < y and y < z, then x < z.
For example, the rational numbers Q are an ordered set by letting x < y if and only if y −x is a
positive rational number. Similarly, N and Z are also ordered sets.
We will write x ≤y if x < y or x = y. We deﬁne > and ≥ in the obvious way.
Deﬁnition 1.1.2. Let E ⊂A, where A is an ordered set.
(i) If there exists a b ∈ A such that x ≤b for all x ∈ E, then we say E is bounded above and b is
an upper bound of E.
(ii) If there exists a b ∈ A such that x ≥b for all x ∈ E, then we say E is bounded below and b is a
lower bound of E.
(iii) If there exists an upper bound $b_0$ of $E$ such that whenever $b$ is any upper bound for $E$ we have $b_0 \leq b$, then $b_0$ is called the least upper bound or the supremum of $E$. We write
\[ \sup E := b_0. \]
(iv) Similarly, if there exists a lower bound $b_0$ of $E$ such that whenever $b$ is any lower bound for $E$ we have $b_0 \geq b$, then $b_0$ is called the greatest lower bound or the infimum of $E$. We write
\[ \inf E := b_0. \]
Note that a supremum or infimum for $E$ (even if they exist) need not be in $E$. For example the set $\{x \in \mathbb{Q} : x < 1\}$ has a least upper bound of 1, but 1 is not in the set itself.
Definition 1.1.3. An ordered set $A$ has the least-upper-bound property if every nonempty subset $E \subset A$ that is bounded above has a least upper bound, that is, $\sup E$ exists in $A$.
Sometimes the least-upper-bound property is called the completeness property or the Dedekind completeness property.
Example 1.1.4: The set $\mathbb{Q}$ of rational numbers does not have the least-upper-bound property. The set $\{x \in \mathbb{Q} : x^2 < 2\}$ does not have a supremum in $\mathbb{Q}$. The obvious supremum $\sqrt{2}$ is not rational. Suppose that $x^2 = 2$ for some $x \in \mathbb{Q}$. Write $x = m/n$ in lowest terms. So $(m/n)^2 = 2$ or $m^2 = 2n^2$. Hence $m^2$ is divisible by 2 and so $m$ is divisible by 2. We write $m = 2k$ and so we have $(2k)^2 = 2n^2$. We divide by 2 and note that $2k^2 = n^2$ and hence $n$ is divisible by 2. But that is a contradiction as we said $m/n$ was in lowest terms.
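The argument above can be accompanied by a small numerical experiment. The sketch below is only an illustration under assumptions (the function names and the denominator cutoff are invented here): it searches for fractions $m/n$ with $m^2 = 2n^2$ up to a given denominator, and it computes, for a fixed denominator, the largest fraction whose square is still below 2.

```python
from fractions import Fraction

def rational_square_roots_of_two(max_denominator):
    """List fractions m/n with n <= max_denominator and (m/n)^2 == 2."""
    hits = []
    for n in range(1, max_denominator + 1):
        # Only n < m < 2n can work, because 1 < (m/n)^2 = 2 < 4.
        for m in range(n + 1, 2 * n):
            if m * m == 2 * n * n:
                hits.append(Fraction(m, n))
    return hits

def best_lower_approximation(n):
    """Largest fraction m/n (fixed denominator n) whose square is below 2."""
    m = max(m for m in range(0, 2 * n + 1) if m * m < 2 * n * n)
    return Fraction(m, n)

no_roots = rational_square_roots_of_two(200)
approximations = [best_lower_approximation(n) for n in (10, 100, 1000)]
```

The search finds nothing, in agreement with the proof, while the approximations keep growing with squares below 2. This suggests, but of course does not prove, that no rational number can serve as the supremum.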
That $\mathbb{Q}$ does not have the least-upper-bound property is one of the most important reasons why we work with $\mathbb{R}$ in analysis. The set $\mathbb{Q}$ is just fine for algebraists. But analysts require the least-upper-bound property to do any work. We also require our real numbers to have many algebraic properties. In particular, we require that they are a field.
Definition 1.1.5. A set $F$ is called a field if it has two operations defined on it, addition $x + y$ and multiplication $xy$, and if it satisfies the following axioms.
(A1) If $x \in F$ and $y \in F$, then $x + y \in F$.
(A2) (commutativity of addition) $x + y = y + x$ for all $x, y \in F$.
(A3) (associativity of addition) $(x + y) + z = x + (y + z)$ for all $x, y, z \in F$.
(A4) There exists an element $0 \in F$ such that $0 + x = x$ for all $x \in F$.
(A5) For every element $x \in F$ there exists an element $-x \in F$ such that $x + (-x) = 0$.
(M1) If $x \in F$ and $y \in F$, then $xy \in F$.
(M2) (commutativity of multiplication) $xy = yx$ for all $x, y \in F$.
(M3) (associativity of multiplication) $(xy)z = x(yz)$ for all $x, y, z \in F$.
(M4) There exists an element $1 \in F$ (and $1 \neq 0$) such that $1x = x$ for all $x \in F$.
(M5) For every $x \in F$ such that $x \neq 0$ there exists an element $1/x \in F$ such that $x(1/x) = 1$.
(D) (distributive law) $x(y + z) = xy + xz$ for all $x, y, z \in F$.
Example 1.1.6: The set Q of rational numbers is a ﬁeld. On the other hand Z is not a ﬁeld, as it
does not contain multiplicative inverses.
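As a quick sanity check of Example 1.1.6, the following Python sketch tests the field axioms on a handful of rational numbers using exact arithmetic from the standard `fractions` module. The sample values and the helper names `check_field_axioms` and `has_integer_inverse` are assumptions made for illustration. Checking finitely many samples is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product

samples = [Fraction(-3, 2), Fraction(0), Fraction(1), Fraction(2, 7), Fraction(5)]

def check_field_axioms(xs):
    """Assert the axioms (A2)-(A5), (M2)-(M5), (D) on the given sample values."""
    zero, one = Fraction(0), Fraction(1)
    for x, y, z in product(xs, repeat=3):
        assert x + y == y + x                      # (A2)
        assert (x + y) + z == x + (y + z)          # (A3)
        assert x * y == y * x                      # (M2)
        assert (x * y) * z == x * (y * z)          # (M3)
        assert x * (y + z) == x * y + x * z        # (D)
    for x in xs:
        assert zero + x == x                       # (A4)
        assert x + (-x) == zero                    # (A5)
        assert one * x == x                        # (M4)
        if x != zero:
            assert x * (1 / x) == one              # (M5)
    return True

def has_integer_inverse(k):
    """True if the integer k has a multiplicative inverse that is an integer."""
    return k != 0 and Fraction(1, k).denominator == 1

rationals_ok = check_field_axioms(samples)
two_invertible_in_Z = has_integer_inverse(2)
```

The integer 2 has the inverse $1/2$ in $\mathbb{Q}$, but no inverse in $\mathbb{Z}$, which is exactly why (M5) fails for $\mathbb{Z}$.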
Deﬁnition 1.1.7. A ﬁeld F is said to be an ordered ﬁeld if F is also an ordered set such that:
(i) For x, y, z ∈ F, x < y implies x +z < y +z.
(ii) For $x, y \in F$, $x > 0$ and $y > 0$ implies $xy > 0$.
If x > 0, we say x is positive. If x < 0, we say x is negative. We also say x is nonnegative if x ≥0,
and x is nonpositive if x ≤0.
Proposition 1.1.8. Let F be an ordered ﬁeld and x, y, z ∈ F. Then:
(i) If $x > 0$, then $-x < 0$ (and vice versa).
(ii) If $x > 0$ and $y < z$, then $xy < xz$.
(iii) If $x < 0$ and $y < z$, then $xy > xz$.
(iv) If $x \neq 0$, then $x^2 > 0$.
(v) If $0 < x < y$, then $0 < 1/y < 1/x$.
Note that (iv) implies in particular that $1 > 0$.
Proof. Let us prove (i). The inequality $x > 0$ implies by item (i) of the definition of ordered field that $x + (-x) > 0 + (-x)$. Now apply the algebraic properties of fields to obtain $0 > -x$. The "vice versa" follows by a similar calculation.
For (ii), first notice that $y < z$ implies $0 < z - y$ by applying item (i) of the definition of ordered fields. Now apply item (ii) of the definition of ordered fields to obtain $0 < x(z - y)$. By algebraic properties we get $0 < xz - xy$, and again applying item (i) of the definition we obtain $xy < xz$.
Part (iii) is left as an exercise.
To prove part (iv) first suppose that $x > 0$. Then by item (ii) of the definition of ordered fields we obtain that $x^2 > 0$ (use $y = x$). If $x < 0$, we can use part (iii) of this proposition. Plug in $y = x$ and $z = 0$.
Finally to prove part (v), notice that $1/x$ cannot be equal to zero (why?). If $1/x < 0$, then $-1/x > 0$ by (i). Then apply part (ii) (as $x > 0$) to obtain $x(-1/x) > 0x$ or $-1 > 0$, which contradicts $1 > 0$ by using part (i) again. Similarly $1/y > 0$. Hence $(1/x)(1/y) > 0$ by definition and we have
\[ (1/x)(1/y)x < (1/x)(1/y)y. \]
By algebraic properties we get $1/y < 1/x$.
The product of two positive numbers (elements of an ordered field) is positive. However, it is not
true that if the product is positive, then each of the two factors must be positive. We do have the
following proposition.
Proposition 1.1.9. Let x, y ∈ F where F is an ordered ﬁeld. Suppose that xy > 0. Then either both
x and y are positive, or both are negative.
Proof. It is clear that both possibilities can in fact happen. If either x or y is zero, then xy is zero
and hence not positive. Hence we can assume that x and y are nonzero, and we simply need to show
that if they have opposite signs, then xy < 0. Without loss of generality suppose that x > 0 and
y < 0. Multiply y < 0 by x to get xy < 0x = 0. The result follows by contrapositive.
1.1.1 Exercises
Exercise 1.1.1: Prove part (iii) of Proposition 1.1.8.
Exercise 1.1.2: Let S be an ordered set. Let A ⊂S be a nonempty ﬁnite subset. Then A is bounded.
Furthermore, inf A exists and is in A and sup A exists and is in A. Hint: Use induction.
Exercise 1.1.3: Let $x, y \in F$, where $F$ is an ordered field. Suppose that $0 < x < y$. Show that $x^2 < y^2$.
Exercise 1.1.4: Let S be an ordered set. Let B ⊂S be bounded (above and below). Let A ⊂B be a
nonempty subset. Suppose that all the inf’s and sup’s exist. Show that
inf B ≤inf A ≤sup A ≤sup B.
Exercise 1.1.5: Let S be an ordered set. Let A ⊂ S and suppose that b is an upper bound for A.
Suppose that b ∈ A. Show that b = sup A.
Exercise 1.1.6: Let S be an ordered set. Let A ⊂ S be a nonempty subset that is bounded above.
Suppose that $\sup A$ exists and that $\sup A \notin A$. Show that $A$ contains a countably infinite subset. In particular, $A$ is infinite.
Exercise 1.1.7: Find a (nonstandard) ordering of the set of natural numbers $\mathbb{N}$ such that there exists a proper subset $A \subsetneq \mathbb{N}$ and such that $\sup A$ exists in $\mathbb{N}$ but $\sup A \notin A$.
1.2 The set of real numbers
Note: 2 lectures
1.2.1 The set of real numbers
We ﬁnally get to the real number system. Instead of constructing the real number set from the
rational numbers, we simply state their existence as a theorem without proof. Notice that Q is an
ordered ﬁeld.
Theorem 1.2.1. There exists a unique∗ ordered field $\mathbb{R}$ with the least-upper-bound property such that $\mathbb{Q} \subset \mathbb{R}$.
Note that also N ⊂Q. As we have seen, 1 > 0. By induction (exercise) we can prove that n > 0
for all n ∈ N. Similarly we can easily verify all the statements we know about rational numbers and
their natural ordering.
Let us prove one of the most basic but useful results about the real numbers. The following
proposition is essentially how an analyst proves that a number is zero.
Proposition 1.2.2. If x ∈ R is such that x ≥0 and x ≤ε for all ε ∈ R where ε > 0, then x = 0.
Proof. If $x > 0$, then $0 < x/2 < x$ (why?). Taking $\varepsilon = x/2$ obtains a contradiction. Thus $x = 0$.
A more general and related simple fact is that any time we have two real numbers $a < b$, then there is another real number $c$ such that $a < c < b$. Just take for example $c = \frac{a+b}{2}$ (why?). In fact, there are infinitely many real numbers between $a$ and $b$.
The most useful property of $\mathbb{R}$ for analysts, however, is not just that it is an ordered field, but that it has the least-upper-bound property. Essentially we want $\mathbb{Q}$, but we also want to take suprema (and infima) willy-nilly. So what we do is to throw in enough numbers to obtain $\mathbb{R}$.
We have already seen that $\mathbb{R}$ must contain elements that are not in $\mathbb{Q}$ because of the least-upper-bound property. We have seen that there is no rational square root of two. The set $\{x \in \mathbb{Q} : x^2 < 2\}$ implies the existence of the real number $\sqrt{2}$ that is not rational, although this fact requires a bit of work.
Example 1.2.3: Claim: There exists a unique positive real number $r$ such that $r^2 = 2$. We denote $r$ by $\sqrt{2}$.
Proof. Take the set $A := \{x \in \mathbb{R} : x^2 < 2\}$. First we must note that if $x^2 < 2$, then $x < 2$. To see this fact, note that $x \geq 2$ implies $x^2 \geq 4$ (use Proposition 1.1.8; we will not explicitly mention its use from now on), hence any number such that $x \geq 2$ is not in $A$. Thus $A$ is bounded above. As $1 \in A$, then $A$ is nonempty.
∗Uniqueness is up to isomorphism, but we wish to avoid excessive use of algebra. For us, it is simply enough to assume that a set of real numbers exists. See Rudin [R2] for the construction and more details.
Let us define $r := \sup A$. We will show that $r^2 = 2$ by showing that $r^2 \geq 2$ and $r^2 \leq 2$. This is the way analysts show equality, by showing two inequalities. Note that we already know that $r \geq 1 > 0$.
Let us first show that $r^2 \geq 2$. Take a number $s \geq 1$ such that $s^2 < 2$. Note that $2 - s^2 > 0$. Therefore $\frac{2 - s^2}{2(s+1)} > 0$. We can choose an $h \in \mathbb{R}$ such that $0 < h < \frac{2 - s^2}{2(s+1)}$. Furthermore, we can assume that $h < 1$.
Claim: $0 < a < b$ implies $b^2 - a^2 < 2(b - a)b$. Proof: Write
\[ b^2 - a^2 = (b - a)(a + b) < (b - a)2b. \]
Let us use the claim by plugging in $a = s$ and $b = s + h$. We obtain
\begin{align*}
(s + h)^2 - s^2 &< h \, 2(s + h) \\
&< 2h(s + 1) && \text{since } h < 1 \\
&< 2 - s^2 && \text{since } h < \frac{2 - s^2}{2(s + 1)}.
\end{align*}
This implies that $(s + h)^2 < 2$. Hence $s + h \in A$, but as $h > 0$ we have $s + h > s$. Hence, $s < r = \sup A$. As $s \geq 1$ was an arbitrary number such that $s^2 < 2$, and $r \geq 1$, we cannot have $r^2 < 2$ (otherwise taking $s = r$ would give $r < r$). It follows that $r^2 \geq 2$.
Now take a number $s > 0$ such that $s^2 > 2$. Hence $s^2 - 2 > 0$, and as before $\frac{s^2 - 2}{2s} > 0$. We can choose an $h \in \mathbb{R}$ such that $0 < h < \frac{s^2 - 2}{2s}$ and $h < s$.
Again we use the fact that $0 < a < b$ implies $b^2 - a^2 < 2(b - a)b$. We plug in $a = s - h$ and $b = s$ (note that $s - h > 0$). We obtain
\begin{align*}
s^2 - (s - h)^2 &< 2hs \\
&< s^2 - 2 && \text{since } h < \frac{s^2 - 2}{2s}.
\end{align*}
By subtracting $s^2$ from both sides and multiplying by $-1$, we find $(s - h)^2 > 2$. Therefore $s - h \notin A$. Furthermore, if $x \geq s - h$, then $x^2 \geq (s - h)^2 > 2$ (as $x > 0$ and $s - h > 0$) and so $x \notin A$, and so $s - h$ is an upper bound for $A$. Hence $r = \sup A \leq s - h < s$, or in other words $s > r$. As $s > 0$ was an arbitrary number such that $s^2 > 2$, we cannot have $r^2 > 2$ (otherwise taking $s = r$ would give $r > r$). Thus $r^2 \leq 2$.
Together, $r^2 \geq 2$ and $r^2 \leq 2$ imply $r^2 = 2$. The existence part is finished. We still need to handle uniqueness. Suppose that $s \in \mathbb{R}$ is such that $s^2 = 2$ and $s > 0$. Thus $s^2 = r^2$. However, if $0 < s < r$, then $s^2 < r^2$. Similarly, $0 < r < s$ implies $r^2 < s^2$. Hence $s = r$.
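The two halves of the existence proof are constructive enough to run. The Python sketch below is an illustration only; the function names `push_up` and `push_down` and the specific choice of $h$ (half of the largest allowed value) are assumptions, since the proof only requires that some admissible $h$ exists. Given $s \geq 1$ with $s^2 < 2$ it produces a larger element of $A$, and given $s > 0$ with $s^2 > 2$ it produces a smaller upper bound of $A$.

```python
from fractions import Fraction

def push_up(s):
    """Given s >= 1 with s^2 < 2, return s + h with h > 0 and (s + h)^2 < 2."""
    s = Fraction(s)
    assert s >= 1 and s * s < 2
    bound = (2 - s * s) / (2 * (s + 1))
    h = min(bound, Fraction(1)) / 2   # any h with 0 < h < bound and h < 1 works
    return s + h

def push_down(s):
    """Given s > 0 with s^2 > 2, return s - h with 0 < h < s and (s - h)^2 > 2."""
    s = Fraction(s)
    assert s > 0 and s * s > 2
    bound = (s * s - 2) / (2 * s)
    h = min(bound, s) / 2             # any h with 0 < h < bound and h < s works
    return s - h

lower, upper = Fraction(1), Fraction(2)
for _ in range(5):
    lower, upper = push_up(lower), push_down(upper)
```

After each step `lower` is still in $A$ and `upper` is still an upper bound for $A$, so $\sup A$ stays trapped between them, exactly as in the argument that $r^2$ can be neither below nor above 2.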
The number $\sqrt{2} \notin \mathbb{Q}$. The set $\mathbb{R} \setminus \mathbb{Q}$ is called the set of irrational numbers. We have seen that $\mathbb{R} \setminus \mathbb{Q}$ is nonempty; later on we will see that it is actually very large.
Using the same technique as above, we can show that a positive real number $x^{1/n}$ exists for all $n \in \mathbb{N}$ and all $x > 0$. That is, for each $x > 0$, there exists a positive real number $r$ such that $r^n = x$. The proof is left as an exercise.
1.2.2 Archimedean property
As we have seen, in any interval, there are plenty of real numbers. But there are also inﬁnitely many
rational numbers in any interval. The following is one of the most fundamental facts about the real
numbers. The two parts of the next theorem are actually equivalent, even though it may not seem
like that at ﬁrst sight.
Theorem 1.2.4.
(i) (Archimedean property) If x, y ∈ R and x > 0, then there exists an n ∈ N such that
nx > y.
(ii) (Q is dense in R) If x, y ∈ R and x < y, then there exists an r ∈ Q such that x < r < y.
Proof. Let us prove (i). We can divide through by $x$ and then what (i) says is that for any real number $t := y/x$, we can find a natural number $n$ such that $n > t$. In other words, (i) says that $\mathbb{N} \subset \mathbb{R}$ is unbounded. Suppose for contradiction that $\mathbb{N}$ is bounded. Let $b := \sup \mathbb{N}$. The number $b - 1$ cannot possibly be an upper bound for $\mathbb{N}$ as it is strictly less than $b$. Thus there exists an $m \in \mathbb{N}$ such that $m > b - 1$. We can add one to obtain $m + 1 > b$, which contradicts $b$ being an upper bound.
Now let us tackle (ii). First assume that x ≥0. Note that y −x > 0. By (i), there exists an n ∈ N
such that
n(y −x) > 1.
Also by (i) the set $A := \{k \in \mathbb{N} : k > nx\}$ is nonempty. By the well ordering property of $\mathbb{N}$, $A$ has a least element $m$. As $m \in A$, then $m > nx$. As $m$ is the least element of $A$, $m - 1 \notin A$. If $m > 1$, then $m - 1 \in \mathbb{N}$, but $m - 1 \notin A$ and so $m - 1 \leq nx$. If $m = 1$, then $m - 1 = 0$, and $m - 1 \leq nx$ still holds as $x \geq 0$. In other words,
\[ m - 1 \leq nx < m. \]
We divide through by $n$ to get $x < m/n$. On the other hand from $n(y - x) > 1$ we obtain $ny > 1 + nx$. As $nx \geq m - 1$ we get that $1 + nx \geq m$ and hence $ny > m$ and therefore $y > m/n$.
Now assume that $x < 0$. If $y > 0$, then we can just take $r = 0$. If $y \leq 0$, then note that $0 \leq -y < -x$ and, by the case already proved, find a rational $q$ such that $-y < q < -x$. Then take $r = -q$.
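The proof of (ii) is effectively an algorithm, and can be sketched in Python with exact rational arithmetic. This is a hedged illustration: the function name `rational_between` is invented here, and instead of splitting into the cases $x \geq 0$ and $x < 0$ it uses `math.floor`, which picks the least integer $m$ with $m > nx$ directly for any sign of $x$.

```python
import math
from fractions import Fraction

def rational_between(x, y):
    """Return a Fraction r with x < r < y, following the idea of the proof."""
    x, y = Fraction(x), Fraction(y)
    if not x < y:
        raise ValueError("need x < y")
    # Archimedean property: choose n in N with n * (y - x) > 1.
    n = math.floor(1 / (y - x)) + 1
    # m is the least integer with m > n * x, so m - 1 <= n * x < m.
    m = math.floor(n * x) + 1
    return Fraction(m, n)

r1 = rational_between(Fraction(1, 3), Fraction(1, 2))
r2 = rational_between(-2.5, -2.25)
```

For $x = 1/3$ and $y = 1/2$ the sketch picks $n = 7$ and $m = 3$, so $r = 3/7$. Floating-point inputs are converted to the exact rational they represent, so the result is exact for those values, not for the decimal numbers one may have had in mind.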
Let us state and prove a simple but useful corollary of the Archimedean property. Other
corollaries are easy consequences and we leave them as exercises.
Corollary 1.2.5. $\inf \{1/n : n \in \mathbb{N}\} = 0$.
Proof. Let $A := \{1/n : n \in \mathbb{N}\}$. Obviously $A$ is not empty. Furthermore, $1/n > 0$ and so 0 is a lower bound, so $b := \inf A$ exists. As 0 is a lower bound, then $b \geq 0$. Suppose that $b > 0$. By the Archimedean property there exists an $n$ such that $nb > 1$, or in other words $b > 1/n$. However, $1/n \in A$, contradicting the fact that $b$ is a lower bound. Hence $b = 0$.
1.2.3 Using supremum and inﬁmum
To make using suprema and infima even easier, we want to be able to always write $\sup A$ and $\inf A$ without worrying about $A$ being bounded and nonempty. We make the following natural definitions.
Deﬁnition 1.2.6. Let A ⊂R be a set.
(i) If A is empty, then sup A :=−∞.
(ii) If A is not bounded above, then sup A :=∞.
(iii) If A is empty, then inf A :=∞.
(iv) If A is not bounded below, then inf A :=−∞.
For convenience, we will sometimes treat $\infty$ and $-\infty$ as if they were numbers, except we will not allow arbitrary arithmetic with them. We can make $\mathbb{R}^* := \mathbb{R} \cup \{-\infty, \infty\}$ into an ordered set by letting
\[ -\infty < \infty \quad \text{and} \quad -\infty < x \quad \text{and} \quad x < \infty \quad \text{for all } x \in \mathbb{R}. \]
The set $\mathbb{R}^*$ is called the set of extended real numbers. It is possible to define some arithmetic on $\mathbb{R}^*$, but we will refrain from doing so as it leads to easy mistakes because $\mathbb{R}^*$ will not be a field.
Now we can take suprema and inﬁma without fear. Let us say a little bit more about them. First
we want to make sure that suprema and inﬁma are compatible with algebraic operations. For a set
A ⊂R and a number x deﬁne
\[ x + A := \{x + y \in \mathbb{R} : y \in A\}, \]
\[ xA := \{xy \in \mathbb{R} : y \in A\}. \]
Proposition 1.2.7. Let A ⊂R.
(i) If x ∈ R, then sup(x +A) = x +sup A.
(ii) If x ∈ R, then inf(x +A) = x +inf A.
(iii) If x > 0, then sup(xA) = x(sup A).
(iv) If x > 0, then inf(xA) = x(inf A).
(v) If x < 0, then sup(xA) = x(inf A).
(vi) If x < 0, then inf(xA) = x(sup A).
Do note that multiplying a set by a negative number switches supremum for an infimum and vice versa.
Proof. Let us only prove the first statement. The rest are left as exercises.
Suppose that $b$ is an upper bound for $A$. That is, $y \leq b$ for all $y \in A$. Then $x + y \leq x + b$, and so $x + b$ is an upper bound for $x + A$. In particular, if $b = \sup A$, then
\[ \sup(x + A) \leq x + b = x + \sup A. \]
The other direction is similar. If $b$ is an upper bound for $x + A$, then $x + y \leq b$ for all $y \in A$ and so $y \leq b - x$. So $b - x$ is an upper bound for $A$. If $b = \sup(x + A)$, then
\[ \sup A \leq b - x = \sup(x + A) - x. \]
And the result follows.
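For a finite nonempty set the supremum is simply the maximum and the infimum is the minimum, so all six statements of Proposition 1.2.7 can be checked directly. The sketch below assumes an arbitrary sample set and the illustrative helper names `translate` and `scale`; it is a check on one example, not a proof.

```python
from fractions import Fraction

A = {Fraction(-2), Fraction(1, 3), Fraction(5, 2)}

def translate(x, A):
    """x + A := {x + y : y in A}."""
    return {x + y for y in A}

def scale(x, A):
    """xA := {xy : y in A}."""
    return {x * y for y in A}

pos, neg = Fraction(3), Fraction(-3)

checks = [
    max(translate(pos, A)) == pos + max(A),   # (i)   sup(x + A) = x + sup A
    min(translate(pos, A)) == pos + min(A),   # (ii)  inf(x + A) = x + inf A
    max(scale(pos, A)) == pos * max(A),       # (iii) x > 0
    min(scale(pos, A)) == pos * min(A),       # (iv)  x > 0
    max(scale(neg, A)) == neg * min(A),       # (v)   x < 0 swaps sup and inf
    min(scale(neg, A)) == neg * max(A),       # (vi)  x < 0 swaps inf and sup
]
```

Items (v) and (vi) show the swap mentioned in the text: scaling by $-3$ sends the largest element $5/2$ to the smallest element $-15/2$.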
Sometimes we will need to apply supremum twice. Here is an example.
Proposition 1.2.8. Let A, B ⊂R such that x ≤y whenever x ∈ A and y ∈ B. Then sup A ≤inf B.
Proof. First note that any x ∈ A is a lower bound for B. Therefore x ≤inf B. Now inf B is an upper
bound for A and therefore sup A ≤inf B.
We have to be careful about strict inequalities and taking suprema and inﬁma. Note that x < y
whenever x ∈ A and y ∈ B still only implies sup A ≤ inf B, and not a strict inequality. This is an
important subtle point that comes up often.
For example, take $A := \{0\}$ and take $B := \{1/n : n \in \mathbb{N}\}$. Then $0 < 1/n$ for all $n \in \mathbb{N}$. However, $\sup A = 0$ and $\inf B = 0$ as we have seen.
1.2.4 Maxima and minima
By Exercise 1.1.2 we know that a nonempty finite set of numbers always has a supremum and an infimum that are contained in the set itself. In this case we usually do not use the words supremum or infimum.
When we have a set $A$ of real numbers bounded above, such that $\sup A \in A$, then we can use the word maximum and the notation $\max A$ to denote the supremum. Similarly for infimum. When a set $A$ is bounded below and $\inf A \in A$, then we can use the word minimum and the notation $\min A$. For example,
\[ \max \{1, 2.4, \pi, 100\} = 100, \]
\[ \min \{1, 2.4, \pi, 100\} = 1. \]
While writing sup and inf may be technically correct in this situation, max and min are generally used to emphasize that the supremum or infimum is in the set itself.
1.2.5 Exercises
Exercise 1.2.1: Prove that if $t > 0$ ($t \in \mathbb{R}$), then there exists an $n \in \mathbb{N}$ such that $\frac{1}{n^2} < t$.
Exercise 1.2.2: Prove that if t > 0 (t ∈ R), then there exists an n ∈ N such that n−1 ≤t < n.
Exercise 1.2.3: Finish proof of Proposition 1.2.7.
Exercise 1.2.4: Let $x, y \in \mathbb{R}$. Suppose that $x^2 + y^2 = 0$. Prove that $x = 0$ and $y = 0$.
Exercise 1.2.5: Show that $\sqrt{3}$ is irrational.
Exercise 1.2.6: Let $n \in \mathbb{N}$. Show that $\sqrt{n}$ is either an integer or it is irrational.
Exercise 1.2.7: Prove the arithmetic-geometric mean inequality. That is, for two positive real numbers $x, y$ we have
\[ \sqrt{xy} \leq \frac{x + y}{2}. \]
Furthermore, equality occurs if and only if $x = y$.
Exercise 1.2.8: Show that for any two real numbers such that $x < y$, we have an irrational number $s$ such that $x < s < y$. Hint: Apply the density of $\mathbb{Q}$ to $\frac{x}{\sqrt{2}}$ and $\frac{y}{\sqrt{2}}$.
Exercise 1.2.9: Let $A$ and $B$ be two bounded sets of real numbers. Let $C := \{a + b : a \in A, b \in B\}$. Show that $C$ is a bounded set and that
\[ \sup C = \sup A + \sup B \quad \text{and} \quad \inf C = \inf A + \inf B. \]
Exercise 1.2.10: Let $A$ and $B$ be two bounded sets of nonnegative real numbers. Let $C := \{ab : a \in A, b \in B\}$. Show that $C$ is a bounded set and that
\[ \sup C = (\sup A)(\sup B) \quad \text{and} \quad \inf C = (\inf A)(\inf B). \]
Exercise 1.2.11 (Hard): Given $x > 0$ and $n \in \mathbb{N}$, show that there exists a unique positive real number $r$ such that $x = r^n$. Usually $r$ is denoted by $x^{1/n}$.
1.3 Absolute value
Note: 0.5-1 lecture
A concept we will encounter over and over is the concept of absolute value. You want to think
of the absolute value as the “size” of a real number. Let us give a formal deﬁnition.
\[ |x| := \begin{cases} x & \text{if } x \geq 0, \\ -x & \text{if } x < 0. \end{cases} \]
Let us give the main features of the absolute value as a proposition.
Proposition 1.3.1.
(i) $|x| \geq 0$, and $|x| = 0$ if and only if $x = 0$.
(ii) $|-x| = |x|$ for all $x \in \mathbb{R}$.
(iii) $|xy| = |x| \, |y|$ for all $x, y \in \mathbb{R}$.
(iv) $|x|^2 = x^2$ for all $x \in \mathbb{R}$.
(v) $|x| \leq y$ if and only if $-y \leq x \leq y$.
(vi) $-|x| \leq x \leq |x|$ for all $x \in \mathbb{R}$.
Proof. (i): This statement is obvious from the definition.
(ii): Suppose that $x > 0$, then $|-x| = -(-x) = x = |x|$. Similarly when $x < 0$, or $x = 0$.
(iii): If $x$ or $y$ is zero, then the result is obvious. When $x$ and $y$ are both positive, then $|x| \, |y| = xy$. The product $xy$ is also positive and hence $xy = |xy|$. If $x$ and $y$ are both negative, then $|x| \, |y| = (-x)(-y) = xy$, which is again positive and hence equal to $|xy|$. Finally, without loss of generality assume that $x > 0$ and $y < 0$. Then $|x| \, |y| = x(-y) = -(xy)$. Now $xy$ is negative and hence $|xy| = -(xy)$.
(iv): Obvious if $x = 0$ and if $x > 0$. If $x < 0$, then $|x|^2 = (-x)^2 = x^2$.
(v): Suppose that $|x| \leq y$. If $x > 0$, then $x \leq y$. Obviously $y \geq 0$ and hence $-y \leq 0 < x$, so $-y \leq x \leq y$ holds. If $x < 0$, then $|x| \leq y$ means $-x \leq y$. Negating both sides we get $x \geq -y$. Again $y \geq 0$ and so $y \geq 0 > x$. Hence, $-y \leq x \leq y$. If $x = 0$, then as $y \geq 0$ it is obviously true that $-y \leq 0 = x = 0 \leq y$.
On the other hand, suppose that $-y \leq x \leq y$ is true. If $x \geq 0$, then $x \leq y$ is equivalent to $|x| \leq y$. If $x < 0$, then $-y \leq x$ implies $(-x) \leq y$, which is equivalent to $|x| \leq y$.
(vi): Just apply (v) with $y = |x|$.
A property used frequently enough to give it a name is the so-called triangle inequality.
Proposition 1.3.2 (Triangle Inequality). $|x + y| \leq |x| + |y|$ for all $x, y \in \mathbb{R}$.
Proof. From Proposition 1.3.1 we have $-|x| \leq x \leq |x|$ and $-|y| \leq y \leq |y|$. We add these two inequalities to obtain
\[ -(|x| + |y|) \leq x + y \leq |x| + |y|. \]
Again by Proposition 1.3.1 we have that $|x + y| \leq |x| + |y|$.
There are other versions of the triangle inequality that are applied often.
Corollary 1.3.3. Let $x, y \in \mathbb{R}$.
(i) (reverse triangle inequality) $\bigl| \, |x| - |y| \, \bigr| \leq |x - y|$.
(ii) $|x - y| \leq |x| + |y|$.
Proof. Let us plug in $x = a - b$ and $y = b$ into the standard triangle inequality to obtain
\[ |a| = |a - b + b| \leq |a - b| + |b|, \]
or $|a| - |b| \leq |a - b|$. Switching the roles of $a$ and $b$ we obtain $|b| - |a| \leq |b - a| = |a - b|$. Now applying Proposition 1.3.1 again we obtain the reverse triangle inequality.
The second version of the triangle inequality is obtained from the standard one by just replacing $y$ with $-y$ and noting again that $|-y| = |y|$.
Corollary 1.3.4. Let $x_1, x_2, \ldots, x_n \in \mathbb{R}$. Then
\[ |x_1 + x_2 + \cdots + x_n| \leq |x_1| + |x_2| + \cdots + |x_n|. \]
Proof. We will proceed by induction. Note that it is true for $n = 1$ trivially and $n = 2$ is the standard triangle inequality. Now suppose that the corollary holds for $n$. Take $n + 1$ numbers $x_1, x_2, \ldots, x_{n+1}$ and compute, first using the standard triangle inequality, and then the induction hypothesis
\begin{align*}
|x_1 + x_2 + \cdots + x_n + x_{n+1}| &\leq |x_1 + x_2 + \cdots + x_n| + |x_{n+1}| \\
&\leq |x_1| + |x_2| + \cdots + |x_n| + |x_{n+1}|.
\end{align*}
Let us see an example of the use of the triangle inequality.
Example 1.3.5: Find a number $M$ such that $|x^2 - 9x + 1| \leq M$ for all $-1 \leq x \leq 5$.
Using the triangle inequality, write
\[ |x^2 - 9x + 1| \leq |x^2| + |9x| + |1| = |x|^2 + 9|x| + 1. \]
It is obvious that $|x|^2 + 9|x| + 1$ is largest when $|x|$ is largest. In the interval provided, $|x|$ is largest when $x = 5$ and so $|x| = 5$. One possibility for $M$ is
\[ M = 5^2 + 9(5) + 1 = 71. \]
There are, of course, other $M$ that work. The bound of 71 is much higher than it need be, but we didn't ask for the best possible $M$, just one that works.
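To see how generous the bound $M = 71$ is, one can sample the polynomial numerically. The Python sketch below is a rough check under assumptions (the grid spacing of $1/1000$ is an arbitrary choice, and sampling alone is not a proof). On this grid the largest value of $|x^2 - 9x + 1|$ is $19.25$, attained at $x = 4.5$, which is the vertex of the parabola.

```python
def p(x):
    """The polynomial from Example 1.3.5."""
    return x * x - 9 * x + 1

# Sample the interval [-1, 5] with step 1/1000.
grid = [-1 + k / 1000 for k in range(6001)]
largest = max(abs(p(x)) for x in grid)
triangle_bound = 5 ** 2 + 9 * 5 + 1
```

The triangle inequality bound ignores the cancellation between $x^2$ and $-9x$, which is why it overshoots by so much.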
The last example leads us to the concept of bounded functions.
Definition 1.3.6. Suppose $f \colon D \to \mathbb{R}$ is a function. We say $f$ is bounded if there exists a number $M$ such that $|f(x)| \leq M$ for all $x \in D$.
In the example we have shown that $x^2 - 9x + 1$ is bounded when considered as a function on $D = \{x : -1 \leq x \leq 5\}$. On the other hand, if we consider the same polynomial as a function on the whole real line $\mathbb{R}$, then it is not bounded.
If a function $f \colon D \to \mathbb{R}$ is bounded, then we can talk about its supremum and its infimum. We write
\[ \sup_{x \in D} f(x) := \sup f(D), \]
\[ \inf_{x \in D} f(x) := \inf f(D). \]
To illustrate some common issues, let us prove the following proposition.
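Proposition 1.3.7. If $f \colon D \to \mathbb{R}$ and $g \colon D \to \mathbb{R}$ are bounded functions and
\[ f(x) \leq g(x) \quad \text{for all } x \in D, \]
then
\[ \sup_{x \in D} f(x) \leq \sup_{x \in D} g(x) \quad \text{and} \quad \inf_{x \in D} f(x) \leq \inf_{x \in D} g(x). \tag{1.1} \]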
You should be careful with the variables. The $x$ on the left side of the inequality in (1.1) is different from the $x$ on the right. You should really think of the first inequality as
\[ \sup_{x \in D} f(x) \leq \sup_{y \in D} g(y). \]
Let us prove this inequality. If $b$ is an upper bound for $g(D)$, then $f(x) \leq g(x) \leq b$ and hence $b$ is an upper bound for $f(D)$. Therefore taking the least upper bound we get that for all $x$
\[ f(x) \leq \sup_{y \in D} g(y). \]
But that means that $\sup_{y \in D} g(y)$ is an upper bound for $f(D)$, hence is greater than or equal to the least upper bound of $f(D)$:
\[ \sup_{x \in D} f(x) \leq \sup_{y \in D} g(y). \]
The second inequality (the statement about the inf) is left as an exercise.
Do note that a common mistake is to conclude that
\[ \sup_{x \in D} f(x) \leq \inf_{y \in D} g(y). \tag{1.2} \]
The inequality (1.2) is not true given the hypothesis of the claim above. For this stronger inequality we need the stronger hypothesis
\[ f(x) \leq g(y) \quad \text{for all } x \in D \text{ and } y \in D. \]
The proof is left as an exercise.
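A small numerical example makes the difference between (1.1) and (1.2) visible. The sketch below is an illustration under assumptions: the domain `D` is a finite sample of $[0, 1]$ and the functions `f` and `g` are chosen arbitrarily, so that sup and inf reduce to max and min.

```python
# A finite stand-in for the domain D: the points 0, 0.1, ..., 1.
D = [k / 10 for k in range(11)]

def f(x):
    return x

def g(x):
    return x + 0.5

pointwise = all(f(x) <= g(x) for x in D)   # hypothesis of Proposition 1.3.7

sup_f = max(f(x) for x in D)
sup_g = max(g(x) for x in D)
inf_f = min(f(x) for x in D)
inf_g = min(g(x) for x in D)
```

Here $f(x) \leq g(x)$ at every point, and both inequalities in (1.1) hold, yet $\sup f = 1$ is larger than $\inf g = 0.5$, so (1.2) fails. The pointwise hypothesis compares $f$ and $g$ only at the same point, while (1.2) compares values at different points.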
1.3.1 Exercises
Exercise 1.3.1: Let $\varepsilon > 0$. Show that $|x - y| < \varepsilon$ if and only if $x - \varepsilon < y < x + \varepsilon$.
Exercise 1.3.2: Show that
a) $\max\{x, y\} = \frac{x + y + |x - y|}{2}$
b) $\min\{x, y\} = \frac{x + y - |x - y|}{2}$
Exercise 1.3.3: Find a number $M$ such that $|x^3 - x^2 + 8x| \leq M$ for all $-2 \leq x \leq 10$.
Exercise 1.3.4: Finish the proof of Proposition 1.3.7. That is, prove that given any set $D$, and two bounded functions $f \colon D \to \mathbb{R}$ and $g \colon D \to \mathbb{R}$ such that $f(x) \leq g(x)$ for all $x \in D$, then
\[ \inf_{x \in D} f(x) \leq \inf_{x \in D} g(x). \]
Exercise 1.3.5: Let $f \colon D \to \mathbb{R}$ and $g \colon D \to \mathbb{R}$ be functions.
a) Suppose that $f(x) \leq g(y)$ for all $x \in D$ and $y \in D$. Show that
\[ \sup_{x \in D} f(x) \leq \inf_{x \in D} g(x). \]
b) Find a specific $D$, $f$, and $g$, such that $f(x) \leq g(x)$ for all $x \in D$, but
\[ \sup_{x \in D} f(x) > \inf_{x \in D} g(x). \]
1.4 Intervals and the size of R
Note: 0.5-1 lecture (proof of uncountability of R can be optional)
You have seen the notation for intervals before, but let us give a formal deﬁnition here. For
a, b ∈ R such that a < b we deﬁne
\[ [a, b] := \{x \in \mathbb{R} : a \leq x \leq b\}, \]
\[ (a, b) := \{x \in \mathbb{R} : a < x < b\}, \]
\[ (a, b] := \{x \in \mathbb{R} : a < x \leq b\}, \]
\[ [a, b) := \{x \in \mathbb{R} : a \leq x < b\}. \]
The interval $[a, b]$ is called a closed interval and $(a, b)$ is called an open interval. The intervals of the form $(a, b]$ and $[a, b)$ are called half-open intervals.
The above intervals were all bounded intervals, since both $a$ and $b$ were real numbers. We define unbounded intervals,
\[ [a, \infty) := \{x \in \mathbb{R} : a \leq x\}, \]
\[ (a, \infty) := \{x \in \mathbb{R} : a < x\}, \]
\[ (-\infty, b] := \{x \in \mathbb{R} : x \leq b\}, \]
\[ (-\infty, b) := \{x \in \mathbb{R} : x < b\}. \]
For completeness we define $(-\infty, \infty) := \mathbb{R}$.
We have already seen that any open interval $(a, b)$ (where $a < b$ of course) must be nonempty. For example, it contains the number $\frac{a+b}{2}$. An unexpected fact is that from a set-theoretic perspective, all intervals have the same "size," that is, they all have the same cardinality. For example the map $f(x) := 2x$ takes the interval $[0, 1]$ bijectively to the interval $[0, 2]$.
Or, maybe more interestingly, the function $f(x) := \tan(x)$ is a bijective map from $(-\pi/2, \pi/2)$ to $\mathbb{R}$, hence the bounded interval $(-\pi/2, \pi/2)$ has the same cardinality as $\mathbb{R}$. It is not completely straightforward to construct a bijective map from $[0, 1]$ to say $(0, 1)$, but it is possible.
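The tangent bijection and its inverse are available in Python's standard `math` module, so the correspondence between a bounded open interval and the real line can be tried out numerically. This is a floating-point sketch only; the function names are illustrative, and rounding means the round trip is approximate rather than exact.

```python
import math

def to_real_line(x):
    """f(x) := tan(x), defined here only on the open interval (-pi/2, pi/2)."""
    if not -math.pi / 2 < x < math.pi / 2:
        raise ValueError("x must lie in (-pi/2, pi/2)")
    return math.tan(x)

def to_interval(t):
    """The inverse map arctan, sending a real number into (-pi/2, pi/2)."""
    return math.atan(t)

reals = [-1000.0, -1.0, 0.0, 2.5, 1000.0]
images = [to_interval(t) for t in reals]
round_trip = [to_real_line(x) for x in images]
```

Every real number in the sample lands strictly inside the interval, and applying the tangent recovers it up to rounding error.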
And do not worry, there does exist a way to measure the “size” of subsets of real numbers that
“sees” the difference between [0, 1] and [0, 2]. However, its proper deﬁnition requires much more
machinery than we have right now.
Let us say more about the cardinality of intervals and hence about the cardinality of $\mathbb{R}$. We have seen that there exist irrational numbers, that is, $\mathbb{R} \setminus \mathbb{Q}$ is nonempty. The question is, how many irrational numbers are there? It turns out there are a lot more irrational numbers than rational numbers. We have seen that $\mathbb{Q}$ is countable, and we will show in a little bit that $\mathbb{R}$ is uncountable. In fact, the cardinality of $\mathbb{R}$ is the same as the cardinality of $P(\mathbb{N})$, although we will not prove this claim.
Theorem 1.4.1 (Cantor). R is uncountable.
We give a modiﬁed version of Cantor’s original proof from 1874 as this proof requires the least
setup. Normally this proof is stated as a contradiction proof, but a proof by contrapositive is easier
to understand.
Proof. Let $X \subset \mathbb{R}$ be a countable subset such that for any two numbers $a < b$, there is an $x \in X$ such that $a < x < b$. If $\mathbb{R}$ were countable, then we could take $X = \mathbb{R}$. If we can show that $X$ must be a proper subset, then $X$ cannot be equal to $\mathbb{R}$ and $\mathbb{R}$ must be uncountable.
As $X$ is countable, there is a bijection from $\mathbb{N}$ to $X$. Consequently, we can write $X$ as a sequence of real numbers $x_1, x_2, x_3, \ldots$, such that each number in $X$ is given by $x_j$ for some $j \in \mathbb{N}$.
Let us construct two other sequences of real numbers $a_1, a_2, a_3, \ldots$ and $b_1, b_2, b_3, \ldots$. Let $a_1 := 0$ and $b_1 := 1$. Next, for each $k > 1$:
(i) Define $a_k := x_j$, where $j$ is the smallest $j \in \mathbb{N}$ such that $x_j \in (a_{k-1}, b_{k-1})$. As an open interval is nonempty, we know that such an $x_j$ always exists by our assumption on $X$.
(ii) Next, define $b_k := x_j$ where $j$ is the smallest $j \in \mathbb{N}$ such that $x_j \in (a_k, b_{k-1})$.
Claim: $a_j < b_k$ for all $j$ and $k$ in $\mathbb{N}$. This is because $a_j < a_{j+1}$ for all $j$ and $b_k > b_{k+1}$ for all $k$. If there did exist a $j$ and a $k$ such that $a_j \geq b_k$, then there is an $n$ such that $a_n \geq b_n$ (why?), which is not possible by definition.
Let $A = \{a_j : j \in \mathbb{N}\}$ and $B = \{b_j : j \in \mathbb{N}\}$. We have seen before that
\[ \sup A \leq \inf B. \]
Define $y = \sup A$. The number $y$ cannot be a member of $A$. If $y = a_j$ for some $j$, then $y < a_{j+1}$, which is impossible. Similarly $y$ cannot be a member of $B$.
If $y \notin X$, then we are done; we have shown that $X$ is a proper subset of $\mathbb{R}$. If $y \in X$, then there exists some $k$ such that $y = x_k$. Notice however that $y \in (a_m, b_m)$ for all $m \in \mathbb{N}$ and $y \in (a_m, b_{m-1})$ for all $m > 1$. We claim that this means that $y$ would be picked for $a_m$ or $b_m$ in one of the steps, which would be a contradiction. To see the claim note that the smallest $j$ such that $x_j$ is in $(a_{k-1}, b_{k-1})$ or $(a_k, b_{k-1})$ always becomes larger in every step. Hence eventually we will reach a point where $x_j = y$. In this case we would make either $a_k = y$ or $b_k = y$, which is a contradiction.
Therefore, the sequence $x_1, x_2, \ldots$ cannot contain all elements of $\mathbb{R}$ and thus $\mathbb{R}$ is uncountable.
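The construction of the sequences $a_k$ and $b_k$ can be carried out explicitly for a concrete countable set. In the Python sketch below, $X$ is assumed to be the set of rational numbers in $(0, 1)$, enumerated by increasing denominator; this enumeration and all function names are illustrative choices, not part of the proof. Since $a_1 = 0$ and $b_1 = 1$, only density inside $(0, 1)$ is actually used.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval():
    """Enumerate the rationals in (0, 1) as x_1, x_2, ..., by denominator."""
    q = 2
    while True:
        for p in range(1, q):
            if gcd(p, q) == 1:
                yield Fraction(p, q)
        q += 1

def first_in(lo, hi):
    """Return x_j with the smallest index j such that lo < x_j < hi."""
    for x in rationals_in_unit_interval():
        if lo < x < hi:
            return x

def nested_endpoints(steps):
    """Return the lists [a_1, ..., a_{steps+1}] and [b_1, ..., b_{steps+1}]."""
    a = [Fraction(0)]
    b = [Fraction(1)]
    for _ in range(steps):
        a_next = first_in(a[-1], b[-1])    # step (i) of the construction
        b_next = first_in(a_next, b[-1])   # step (ii) of the construction
        a.append(a_next)
        b.append(b_next)
    return a, b

a, b = nested_endpoints(5)
```

The lists `a` and `b` are strictly increasing and strictly decreasing respectively, with every $a_j$ below every $b_k$. The number $\sup A$ is squeezed between them, and the proof shows that it is never one of the enumerated $x_j$.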
1.4.1 Exercises
Exercise 1.4.1: For a < b, construct an explicit bijection from (a, b] to (0, 1].
Exercise 1.4.2: Suppose that f : [0, 1] →(0, 1) is a bijection. Construct a bijection from [−1, 1] to
R using f .
Exercise 1.4.3 (Hard): Show that the cardinality of R is the same as the cardinality of P(N). Hint:
If you have a binary representation of a real number in the interval [0, 1], then you have a sequence
of 1’s and 0’s. Use the sequence to construct a subset of N. The tricky part is to notice that some
numbers have more than one binary representation.
Exercise 1.4.4 (Hard): Construct an explicit bijection from $(0, 1]$ to $(0, 1)$. Hint: One approach is as follows: First map $(1/2, 1]$ to $(0, 1/2]$, then map $(1/4, 1/2]$ to $(1/2, 3/4]$, etc. Write down the map explicitly, that is, write down an algorithm that tells you exactly what number goes where. Then prove that the map is a bijection.
Exercise 1.4.5 (Hard): Construct an explicit bijection from [0, 1] to (0, 1).
Chapter 2
Sequences and Series
2.1 Sequences and limits
Note: 2.5 lectures
Analysis is essentially about taking limits. The most basic type of a limit is a limit of a sequence
of real numbers. We have already seen sequences used informally. Let us give the formal deﬁnition.
Definition 2.1.1. A sequence is a function $x \colon \mathbb{N} \to \mathbb{R}$. Instead of $x(n)$ we will usually denote the $n$th element in the sequence by $x_n$. We will use the notation $\{x_n\}$ or more precisely
\[ \{x_n\}_{n=1}^{\infty} \]
to denote a sequence.
A sequence $\{x_n\}$ is bounded if there exists a $B \in \mathbb{R}$ such that
\[ |x_n| \leq B \quad \text{for all } n \in \mathbb{N}. \]
In other words, the sequence $\{x_n\}$ is bounded whenever the set $\{x_n : n \in \mathbb{N}\}$ is bounded.
For example, $\{1/n\}_{n=1}^{\infty}$, or simply $\{1/n\}$, stands for the sequence $1, 1/2, 1/3, 1/4, 1/5, \ldots$. When we need to give a concrete sequence we will often give each term as a formula in terms of $n$. The sequence $\{1/n\}$ is a bounded sequence ($B = 1$ will suffice). On the other hand the sequence $\{n\}$ stands for $1, 2, 3, 4, \ldots$, and this sequence is not bounded (why?).
While the notation for a sequence is similar∗ to that of a set, the notions are distinct. For example, the sequence $\{(-1)^n\}$ is the sequence $-1, 1, -1, 1, -1, 1, \ldots$, whereas the set of values, the range of the sequence, is just the set $\{-1, 1\}$. We could write this set as $\{(-1)^n : n \in \mathbb{N}\}$. When ambiguity could arise, we use the words sequence or set to distinguish the two concepts.
Another example of a sequence is the constant sequence. That is a sequence $\{c\} = c, c, c, c, \ldots$ consisting of a single constant $c \in \mathbb{R}$.
∗ [BS] use the notation $(x_n)$ to denote a sequence instead of $\{x_n\}$, which is what [R2] uses. Both are common.
We now get to the idea of a limit of a sequence. We will see in Proposition 2.1.6 that the notation
below is well deﬁned. That is, if a limit exists, then it is unique. So it makes sense to talk about the
limit of a sequence.
Definition 2.1.2. A sequence $\{x_n\}$ is said to converge to a number $x \in \mathbb{R}$, if for every $\varepsilon > 0$, there exists an $M \in \mathbb{N}$ such that $|x_n - x| < \varepsilon$ for all $n \geq M$. The number $x$ is said to be the limit of $\{x_n\}$. We will write
\[ \lim_{n\to\infty} x_n := x. \]
A sequence that converges is said to be convergent. Otherwise, the sequence is said to be divergent.
It is good to know intuitively what a limit means. It means that eventually every number in the
sequence is close to the number x. More precisely, we can be arbitrarily close to the limit, provided
we go far enough in the sequence. It does not mean we will ever reach the limit. It is possible, and
quite common, that there is no $x_n$ in the sequence that equals the limit $x$.
When we write $\lim x_n = x$ for some real number $x$, we are saying two things. First, that $\{x_n\}$ is convergent, and second that the limit is $x$.
The above deﬁnition is one of the most important deﬁnitions in analysis, and it is necessary to
understand it perfectly. The key point in the deﬁnition is that given any ε > 0, we can ﬁnd an M.
The M can depend on ε, so we only pick an M once we know ε. Let us illustrate this concept on a
few examples.
Example 2.1.3: The constant sequence $1, 1, 1, 1, \ldots$ is convergent and the limit is 1. For every $\varepsilon > 0$, we can pick $M = 1$.
Example 2.1.4: The sequence $\{1/n\}$ is convergent and
\[ \lim_{n\to\infty} \frac{1}{n} = 0. \]
Let us verify this claim. Given an $\varepsilon > 0$, we can find an $M \in \mathbb{N}$ such that $0 < 1/M < \varepsilon$ (Archimedean property at work). Then for all $n \geq M$ we have that
\[ |x_n - 0| = \left| \frac{1}{n} \right| = \frac{1}{n} \leq \frac{1}{M} < \varepsilon. \]
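The choice of $M$ in Example 2.1.4 can be made concrete. The following is a minimal sketch, not part of the original notes: the helper name `find_M` is hypothetical, and exact rational arithmetic is used so that the strict inequality is not blurred by rounding. It checks only finitely many $n$, so it illustrates the definition rather than proving anything.

```python
import math
from fractions import Fraction

def find_M(eps):
    """Return a natural number M with 1/M < eps.

    Since M = floor(1/eps) + 1 > 1/eps, we get 1/M < eps; this is the
    Archimedean property made explicit. `find_M` is an illustrative name.
    """
    eps = Fraction(eps)
    return math.floor(1 / eps) + 1

eps = Fraction(1, 100)
M = find_M(eps)

# Spot-check the definition of the limit on a finite range of n >= M.
assert all(abs(Fraction(1, n) - 0) < eps for n in range(M, M + 1000))
```

Note that $M$ is computed only after $\varepsilon$ is known, which is exactly the order of quantifiers in Definition 2.1.2.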
Example 2.1.5: The sequence $\{(-1)^n\}$ is divergent. If there were a limit $x$, then for $\varepsilon = \frac{1}{2}$ we expect an $M$ that satisfies the definition. Suppose such an $M$ exists, then for an even $n \geq M$ we compute
\[ 1/2 > |x_n - x| = |1 - x| \quad \text{and} \quad 1/2 > |x_{n+1} - x| = |-1 - x| . \]
But
\[ 2 = |1 - x - (-1 - x)| \leq |1 - x| + |-1 - x| < 1/2 + 1/2 = 1, \]
and that is a contradiction.
Proposition 2.1.6. A convergent sequence has a unique limit.
The proof of this proposition exhibits a useful technique in analysis. Many proofs follow the
same general scheme. We want to show a certain quantity is zero. We write the quantity using the
triangle inequality as two quantities, and we estimate each one by arbitrarily small numbers.
Proof. Suppose that the sequence $\{x_n\}$ has the limit $x$ and the limit $y$. Take an arbitrary $\varepsilon > 0$. From the definition we find an $M_1$ such that for all $n \geq M_1$, $|x_n - x| < \varepsilon/2$. Similarly we find an $M_2$ such that for all $n \geq M_2$ we have $|x_n - y| < \varepsilon/2$. Now take $M := \max\{M_1, M_2\}$. For $n \geq M$ (so that both $n \geq M_1$ and $n \geq M_2$) we have
\[ |y - x| = |x_n - x - (x_n - y)| \leq |x_n - x| + |x_n - y| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
As $|y - x| < \varepsilon$ for all $\varepsilon > 0$, then $|y - x| = 0$ and $y = x$. Hence the limit (if it exists) is unique.
Proposition 2.1.7. A convergent sequence $\{x_n\}$ is bounded.
Proof. Suppose that $\{x_n\}$ converges to $x$. Thus there exists an $M \in \mathbb{N}$ such that for all $n \geq M$ we have $|x_n - x| < 1$. Let $B_1 := |x| + 1$ and note that for $n \geq M$ we have
\[ |x_n| = |x_n - x + x| \leq |x_n - x| + |x| < 1 + |x| = B_1. \]
The set $\{|x_1|, |x_2|, \ldots, |x_{M-1}|\}$ is a finite set and hence let
\[ B_2 := \max\{|x_1|, |x_2|, \ldots, |x_{M-1}|\}. \]
Let $B := \max\{B_1, B_2\}$. Then for all $n \in \mathbb{N}$ we have
\[ |x_n| \leq B. \]
The sequence $\{(-1)^n\}$ shows that the converse does not hold. A bounded sequence is not necessarily convergent.
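Example 2.1.8: The sequence $\left\{ \frac{n^2+1}{n^2+n} \right\}$ converges and
\[ \lim_{n\to\infty} \frac{n^2+1}{n^2+n} = 1. \]
Given any $\varepsilon > 0$, find $M \in \mathbb{N}$ such that $\frac{1}{M+1} < \varepsilon$. Then for any $n \geq M$ we have
\[ \left| \frac{n^2+1}{n^2+n} - 1 \right| = \left| \frac{n^2+1-(n^2+n)}{n^2+n} \right| = \left| \frac{1-n}{n^2+n} \right| = \frac{n-1}{n^2+n} \leq \frac{n}{n^2+n} = \frac{1}{n+1} \leq \frac{1}{M+1} < \varepsilon. \]
Therefore, $\lim \frac{n^2+1}{n^2+n} = 1$.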
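The estimate in Example 2.1.8 can be spot-checked numerically. This is a small sketch under stated assumptions: the names `x` and `find_M` are illustrative, and the search loop simply looks for the first $M$ with $1/(M+1) < \varepsilon$. A finite check like this does not replace the proof.

```python
from fractions import Fraction

def x(n):
    """n-th term of the sequence (n^2 + 1) / (n^2 + n), as an exact fraction."""
    return Fraction(n * n + 1, n * n + n)

def find_M(eps):
    """Smallest natural number M with 1/(M + 1) < eps (illustrative helper)."""
    M = 1
    while Fraction(1, M + 1) >= eps:
        M += 1
    return M

eps = Fraction(1, 10)
M = find_M(eps)

# The key bound |x_n - 1| <= 1/(n + 1) from the example, on a finite range.
assert all(abs(x(n) - 1) <= Fraction(1, n + 1) for n in range(1, 500))
# Consequently |x_n - 1| < eps for the sampled n >= M.
assert all(abs(x(n) - 1) < eps for n in range(M, M + 500))
```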
2.1.1 Monotone sequences
The simplest type of a sequence is a monotone sequence. Checking that a monotone sequence
converges is as easy as checking that it is bounded. It is also easy to ﬁnd the limit for a convergent
monotone sequence, provided we can ﬁnd the supremum or inﬁmum of a countable set of numbers.
Definition 2.1.9. A sequence $\{x_n\}$ is monotone increasing if $x_n \leq x_{n+1}$ for all $n \in \mathbb{N}$. A sequence $\{x_n\}$ is monotone decreasing if $x_n \geq x_{n+1}$ for all $n \in \mathbb{N}$. If a sequence is either monotone increasing or monotone decreasing, we simply say the sequence is monotone. Some authors also use the word monotonic.
Theorem 2.1.10. A monotone sequence $\{x_n\}$ is bounded if and only if it is convergent.
Furthermore, if $\{x_n\}$ is monotone increasing and bounded, then
\[ \lim_{n\to\infty} x_n = \sup\{x_n : n \in \mathbb{N}\}. \]
If $\{x_n\}$ is monotone decreasing and bounded, then
\[ \lim_{n\to\infty} x_n = \inf\{x_n : n \in \mathbb{N}\}. \]
Proof. Let us suppose that the sequence is monotone increasing. Suppose that the sequence is bounded. That means that there exists a $B$ such that $x_n \leq B$ for all $n$, that is, the set $\{x_n : n \in \mathbb{N}\}$ is bounded from above. Let
\[ x := \sup\{x_n : n \in \mathbb{N}\}. \]
Let $\varepsilon > 0$ be arbitrary. As $x$ is the supremum, there must be at least one $M \in \mathbb{N}$ such that $x_M > x - \varepsilon$ (otherwise $x - \varepsilon$ would be a smaller upper bound). As $\{x_n\}$ is monotone increasing, it is easy to see (by induction) that $x_n \geq x_M$ for all $n \geq M$. Hence
\[ |x_n - x| = x - x_n \leq x - x_M < \varepsilon. \]
Hence the sequence converges to $x$. We already know that a convergent sequence is bounded, which completes the other direction of the implication.
The proof for monotone decreasing sequences is left as an exercise.
Example 2.1.11: Take the sequence $\left\{ \frac{1}{\sqrt{n}} \right\}$.
First we note that $\frac{1}{\sqrt{n}} > 0$ and hence the sequence is bounded from below. Let us show that it is monotone decreasing. We start with $\sqrt{n+1} \geq \sqrt{n}$ (why is that true?). From this inequality we obtain
\[ \frac{1}{\sqrt{n+1}} \leq \frac{1}{\sqrt{n}} . \]
So the sequence is monotone decreasing, bounded from below (and hence bounded). We can apply the theorem to note that the sequence is convergent and that in fact
\[ \lim_{n\to\infty} \frac{1}{\sqrt{n}} = \inf \left\{ \frac{1}{\sqrt{n}} : n \in \mathbb{N} \right\} . \]
We already know that the infimum is greater than or equal to 0, as 0 is a lower bound. Take a number $b \geq 0$ such that $b \leq \frac{1}{\sqrt{n}}$ for all $n$. We can square both sides to obtain
\[ b^2 \leq \frac{1}{n} \quad \text{for all } n \in \mathbb{N}. \]
We have seen before that this implies that $b^2 \leq 0$ (a consequence of the Archimedean property). As we also have $b^2 \geq 0$, then $b^2 = 0$ and hence $b = 0$. Hence $b = 0$ is the greatest lower bound and hence the limit.
Example 2.1.12: Be careful however. We have to show that a monotone sequence is bounded in order to use Theorem 2.1.10. For example, take the sequence $\{1 + 1/2 + \cdots + 1/n\}$. This is a monotone increasing sequence that grows very slowly. We will see, once we get to series, that this sequence has no upper bound and so does not converge. It is not at all obvious that this sequence has no bound.
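To get a feeling for how slowly the sequence in Example 2.1.12 grows, one can compute a few terms exactly. The sketch below is only an illustration: the function name `harmonic` is ours, and the comparison with $1 + k/2$ at indices $n = 2^k$ is a standard fact that the notes establish later with series, checked here only for small $k$.

```python
from fractions import Fraction

def harmonic(n):
    """Exact value of 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# The terms increase, but very slowly.
for n in (1, 10, 100, 1000):
    print(n, float(harmonic(n)))

# Finite check: at n = 2^k the term is at least 1 + k/2, so the terms
# eventually exceed any proposed bound B (this is not a proof).
assert all(harmonic(2 ** k) >= 1 + Fraction(k, 2) for k in range(0, 9))
```

The point of the example stands: looking at the first thousand terms gives no hint that the sequence is unbounded, so boundedness must be proved before Theorem 2.1.10 is applied.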
A common example of where monotone sequences arise is the following proposition. The proof
is left as an exercise.
Proposition 2.1.13. Let $S \subset \mathbb{R}$ be a nonempty bounded set. Then there exist monotone sequences $\{x_n\}$ and $\{y_n\}$ such that $x_n, y_n \in S$ and
\[ \sup S = \lim_{n\to\infty} x_n \quad \text{and} \quad \inf S = \lim_{n\to\infty} y_n . \]
2.1.2 Tail of a sequence
Definition 2.1.14. For a sequence $\{x_n\}$, the $K$-tail (where $K \in \mathbb{N}$) or just the tail of the sequence is the sequence starting at $K+1$, usually written as
\[ \{x_{n+K}\}_{n=1}^{\infty} \quad \text{or} \quad \{x_n\}_{n=K+1}^{\infty} . \]
The main result about the tail of a sequence is the following proposition.
Proposition 2.1.15. For any $K \in \mathbb{N}$, the sequence $\{x_n\}_{n=1}^{\infty}$ converges if and only if the $K$-tail $\{x_{n+K}\}_{n=1}^{\infty}$ converges. Furthermore, if the limit exists, then
\[ \lim_{n\to\infty} x_n = \lim_{n\to\infty} x_{n+K} . \]
Proof. Define $y_n := x_{n+K}$. We wish to show that $\{x_n\}$ converges if and only if $\{y_n\}$ converges, and furthermore that the limits are equal.
Suppose that $\{x_n\}$ converges to some $x \in \mathbb{R}$. That is, given an $\varepsilon > 0$, there exists an $M \in \mathbb{N}$ such that $|x - x_n| < \varepsilon$ for all $n \geq M$. Note that $n \geq M$ implies $n + K \geq M$. Therefore, it is true that for all $n \geq M$ we have that
\[ |x - y_n| = |x - x_{n+K}| < \varepsilon. \]
Therefore $\{y_n\}$ converges to $x$.
Now suppose that $\{y_n\}$ converges to $x \in \mathbb{R}$. That is, given an $\varepsilon > 0$, there exists an $M' \in \mathbb{N}$ such that $|x - y_n| < \varepsilon$ for all $n \geq M'$. Let $M := M' + K$. Then $n \geq M$ implies that $n - K \geq M'$. Thus, whenever $n \geq M$ we have
\[ |x - x_n| = |x - y_{n-K}| < \varepsilon. \]
Therefore $\{x_n\}$ converges to $x$.
Essentially, the limit does not care about how the sequence begins, it only cares about the tail of
the sequence. That is, the beginning of the sequence may be arbitrary.
2.1.3 Subsequences
A very useful concept related to sequences is that of a subsequence. A subsequence of $\{x_n\}$ is a sequence that contains only some of the numbers from $\{x_n\}$ in the same order.
Definition 2.1.16. Let $\{x_n\}$ be a sequence. Let $\{n_i\}$ be a strictly increasing sequence of natural numbers (that is $n_1 < n_2 < n_3 < \cdots$). The sequence
\[ \{x_{n_i}\}_{i=1}^{\infty} \]
is called a subsequence of $\{x_n\}$.
For example, take the sequence $\{1/n\}$. The sequence $\{1/(3n)\}$ is a subsequence. To see how these two sequences fit in the definition, take $n_i := 3i$. Note that the numbers in the subsequence must come from the original sequence, so $1, 0, 1/3, 0, 1/5, \ldots$ is not a subsequence of $\{1/n\}$. Similarly order must be preserved, so the sequence $1, 1/3, 1/2, 1/5, \ldots$ is not a subsequence of $\{1/n\}$.
Note that a tail of a sequence is one type of subsequence. For an arbitrary subsequence, we have
the following proposition.
Proposition 2.1.17. If $\{x_n\}$ is a convergent sequence, then any subsequence $\{x_{n_i}\}$ is also convergent and
\[ \lim_{n\to\infty} x_n = \lim_{i\to\infty} x_{n_i} . \]
Proof. Suppose that $\lim_{n\to\infty} x_n = x$. That means that for every $\varepsilon > 0$ we have an $M \in \mathbb{N}$ such that for all $n \geq M$
\[ |x_n - x| < \varepsilon. \]
It is not hard to prove (do it!) by induction that $n_i \geq i$. Hence $i \geq M$ implies that $n_i \geq M$. Thus, for all $i \geq M$ we have
\[ |x_{n_i} - x| < \varepsilon, \]
and we are done.
Example 2.1.18: Do note that the implication in the other direction is not true. For example, take the sequence $0, 1, 0, 1, 0, 1, \ldots$. That is $x_n = 0$ if $n$ is odd, and $x_n = 1$ if $n$ is even. It is not hard to see that $\{x_n\}$ is divergent, however, the subsequence $\{x_{2n}\}$ converges to 1 and the subsequence $\{x_{2n+1}\}$ converges to 0. See also Theorem 2.3.7.
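The two subsequences of Example 2.1.18 can be written out directly from Definition 2.1.16. This is a minimal sketch: the helper `subsequence` and its parameters are hypothetical names, and it lists only finitely many terms, so it shows how the index map $i \mapsto n_i$ works rather than proving convergence.

```python
def x(n):
    """The sequence 0, 1, 0, 1, ...: 0 for odd n, 1 for even n."""
    return 0 if n % 2 == 1 else 1

def subsequence(seq, index, count):
    """First `count` terms seq(n_1), seq(n_2), ... where n_i = index(i).

    The index map must be strictly increasing, as Definition 2.1.16 requires.
    """
    idx = [index(i) for i in range(1, count + 1)]
    if any(a >= b for a, b in zip(idx, idx[1:])):
        raise ValueError("index map must be strictly increasing")
    return [seq(n) for n in idx]

evens = subsequence(x, lambda i: 2 * i, 5)      # terms x_2, x_4, ...
odds = subsequence(x, lambda i: 2 * i + 1, 5)   # terms x_3, x_5, ...
print(evens, odds)
```

Both subsequences are constant, hence convergent, yet their limits differ, which is why the full sequence cannot converge.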
2.1.4 Exercises
In the following exercises, feel free to use what you know from calculus to ﬁnd the limit, if it exists.
But you must prove that you have found the correct limit, or prove that the sequence is divergent.
Exercise 2.1.1: Is the sequence $\{3n\}$ bounded? Prove or disprove.
Exercise 2.1.2: Is the sequence $\{n\}$ convergent? If so, what is the limit?
Exercise 2.1.3: Is the sequence $\left\{ \frac{(-1)^n}{2n} \right\}$ convergent? If so, what is the limit?
Exercise 2.1.4: Is the sequence $\{2^{-n}\}$ convergent? If so, what is the limit?
Exercise 2.1.5: Is the sequence $\left\{ \frac{n}{n+1} \right\}$ convergent? If so, what is the limit?
Exercise 2.1.6: Is the sequence $\left\{ \frac{n}{n^2+1} \right\}$ convergent? If so, what is the limit?
Exercise 2.1.7: Let $\{x_n\}$ be a sequence.
a) Show that $\lim x_n = 0$ (that is, the limit exists and is zero) if and only if $\lim |x_n| = 0$.
b) Find an example such that $\{|x_n|\}$ converges and $\{x_n\}$ diverges.
Exercise 2.1.8: Is the sequence $\left\{ \frac{2^n}{n!} \right\}$ convergent? If so, what is the limit?
Exercise 2.1.9: Show that the sequence $\left\{ \frac{1}{\sqrt[3]{n}} \right\}$ is monotone, bounded, and use Theorem 2.1.10 to find the limit.
Exercise 2.1.10: Show that the sequence $\left\{ \frac{n+1}{n} \right\}$ is monotone, bounded, and use Theorem 2.1.10 to find the limit.
Exercise 2.1.11: Finish the proof of Theorem 2.1.10 for monotone decreasing sequences.
Exercise 2.1.12: Prove Proposition 2.1.13.
Exercise 2.1.13: Let $\{x_n\}$ be a convergent monotone sequence. Suppose that there exists a $k \in \mathbb{N}$ such that
\[ \lim_{n\to\infty} x_n = x_k . \]
Show that $x_n = x_k$ for all $n \geq k$.
Exercise 2.1.14: Find a convergent subsequence of the sequence $\{(-1)^n\}$.
Exercise 2.1.15: Let $\{x_n\}$ be a sequence defined by
\[ x_n := \begin{cases} n & \text{if $n$ is odd,} \\ 1/n & \text{if $n$ is even.} \end{cases} \]
a) Is the sequence bounded? (prove or disprove)
b) Is there a convergent subsequence? If so, find it.
Exercise 2.1.16: Let $\{x_n\}$ be a sequence. Suppose that there are two convergent subsequences $\{x_{n_i}\}$ and $\{x_{m_i}\}$. Suppose that
\[ \lim_{i\to\infty} x_{n_i} = a \quad \text{and} \quad \lim_{i\to\infty} x_{m_i} = b, \]
where $a \neq b$. Prove that $\{x_n\}$ is not convergent, without using Proposition 2.1.17.
2.2 Facts about limits of sequences
Note: 2.5 lectures
In this section we will go over some basic results about the limits of sequences. We start with
looking at how sequences interact with inequalities.
2.2.1 Limits and inequalities
A basic lemma about limits is the so-called squeeze lemma. It allows us to show convergence of sequences in difficult cases if we can find two other simpler convergent sequences that “squeeze” the original sequence.
Lemma 2.2.1 (Squeeze lemma). Let $\{a_n\}$, $\{b_n\}$, and $\{x_n\}$ be sequences such that
\[ a_n \leq x_n \leq b_n \quad \text{for all } n \in \mathbb{N}. \]
Suppose that $\{a_n\}$ and $\{b_n\}$ converge and
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n . \]
Then $\{x_n\}$ converges and
\[ \lim_{n\to\infty} x_n = \lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n . \]
The intuitive idea of the proof is best illustrated on a picture, see Figure 2.1. If $x$ is the limit of $a_n$ and $b_n$, then if they are both within $\varepsilon/3$ of $x$, then the distance between $a_n$ and $b_n$ is at most $2\varepsilon/3$. As $x_n$ is between $a_n$ and $b_n$ it is at most $2\varepsilon/3$ from $a_n$. Since $a_n$ is at most $\varepsilon/3$ away from $x$, then $x_n$ must be at most $\varepsilon$ away from $x$. Let us follow through on this intuition rigorously.
Figure 2.1: Squeeze lemma in picture (the points $a_n$, $x$, $x_n$, $b_n$ marked on a number line).
Proof. Let $x := \lim a_n = \lim b_n$. Let $\varepsilon > 0$ be given.
Find an $M_1$ such that for all $n \geq M_1$ we have that $|a_n - x| < \varepsilon/3$, and an $M_2$ such that for all $n \geq M_2$ we have $|b_n - x| < \varepsilon/3$. Set $M := \max\{M_1, M_2\}$. Suppose that $n \geq M$. We compute
\[ |x_n - a_n| = x_n - a_n \leq b_n - a_n = |b_n - x + x - a_n| \leq |b_n - x| + |x - a_n| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \frac{2\varepsilon}{3} . \]
Armed with this information we estimate
\[ |x_n - x| = |x_n - x + a_n - a_n| \leq |x_n - a_n| + |a_n - x| < \frac{2\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]
And we are done.
Example 2.2.2: A simple example of how to use the squeeze lemma is to compute limits of sequences using limits that are already known. For example, suppose that we have the sequence $\left\{ \frac{1}{n\sqrt{n}} \right\}$. Since $\sqrt{n} \geq 1$ for all $n \in \mathbb{N}$ we have
\[ 0 \leq \frac{1}{n\sqrt{n}} \leq \frac{1}{n} \]
for all $n \in \mathbb{N}$. We already know that $\lim 1/n = 0$. Hence, using the constant sequence $\{0\}$ and the sequence $\{1/n\}$ in the squeeze lemma, we conclude that
\[ \lim_{n\to\infty} \frac{1}{n\sqrt{n}} = 0. \]
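The squeezing inequality of Example 2.2.2 can be observed numerically. The sketch below uses floating point, so it is only an illustration; the function names are ours and the range of $n$ is an arbitrary choice.

```python
import math

def x(n):
    """n-th term 1 / (n * sqrt(n)), in floating point."""
    return 1 / (n * math.sqrt(n))

def squeezed(n):
    """Check a_n <= x_n <= b_n with a_n = 0 and b_n = 1/n."""
    return 0 <= x(n) <= 1 / n

# Both bounding sequences tend to 0, and x_n sits between them.
assert all(squeezed(n) for n in range(1, 5001))
for n in (1, 10, 100, 1000):
    print(n, 0, x(n), 1 / n)
```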
Limits also preserve inequalities.
Lemma 2.2.3. Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences and
\[ x_n \leq y_n , \]
for all $n \in \mathbb{N}$. Then
\[ \lim_{n\to\infty} x_n \leq \lim_{n\to\infty} y_n . \]
Proof. Let $x := \lim x_n$ and $y := \lim y_n$. Let $\varepsilon > 0$ be given. Find an $M_1$ such that for all $n \geq M_1$ we have $|x_n - x| < \varepsilon/2$. Find an $M_2$ such that for all $n \geq M_2$ we have $|y_n - y| < \varepsilon/2$. In particular, for $n \geq \max\{M_1, M_2\}$ we have $x - x_n < \varepsilon/2$ and $y_n - y < \varepsilon/2$. We add these inequalities to obtain
\[ y_n - x_n + x - y < \varepsilon, \quad \text{or} \quad y_n - x_n < y - x + \varepsilon. \]
Since $x_n \leq y_n$ we have $0 \leq y_n - x_n$ and hence
\[ 0 < y - x + \varepsilon, \quad \text{or} \quad -\varepsilon < y - x. \]
In other words, $x - y < \varepsilon$ for all $\varepsilon > 0$. That means that $x - y \leq 0$, as we have seen that a nonnegative number less than any positive $\varepsilon$ is zero. Therefore $x \leq y$.
We give an easy corollary that can be proved using constant sequences and an application of
Lemma 2.2.3. The proof is left as an exercise.
Corollary 2.2.4.
i) Let $\{x_n\}$ be a convergent sequence such that $x_n \geq 0$, then
\[ \lim_{n\to\infty} x_n \geq 0. \]
ii) Let $a, b \in \mathbb{R}$ and let $\{x_n\}$ be a convergent sequence such that
\[ a \leq x_n \leq b, \]
for all $n \in \mathbb{N}$. Then
\[ a \leq \lim_{n\to\infty} x_n \leq b. \]
Note in Lemma 2.2.3 we cannot simply replace all the nonstrict inequalities with strict inequalities. For example, let $x_n := -1/n$ and $y_n := 1/n$. Then $x_n < y_n$, $x_n < 0$, and $y_n > 0$ for all $n$. However, these inequalities are not preserved by the limit operation as we have $\lim x_n = \lim y_n = 0$. The moral of this example is that strict inequalities may become nonstrict inequalities when limits are applied. That is, if we know that $x_n < y_n$ for all $n$, we can only conclude that
\[ \lim_{n\to\infty} x_n \leq \lim_{n\to\infty} y_n . \]
This issue is a common source of errors.
2.2.2 Continuity of algebraic operations
Limits interact nicely with algebraic operations.
Proposition 2.2.5. Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences.
(i) The sequence $\{z_n\}$, where $z_n := x_n + y_n$, converges and
\[ \lim_{n\to\infty} (x_n + y_n) = \lim_{n\to\infty} z_n = \lim_{n\to\infty} x_n + \lim_{n\to\infty} y_n . \]
(ii) The sequence $\{z_n\}$, where $z_n := x_n - y_n$, converges and
\[ \lim_{n\to\infty} (x_n - y_n) = \lim_{n\to\infty} z_n = \lim_{n\to\infty} x_n - \lim_{n\to\infty} y_n . \]
(iii) The sequence $\{z_n\}$, where $z_n := x_n y_n$, converges and
\[ \lim_{n\to\infty} (x_n y_n) = \lim_{n\to\infty} z_n = \left( \lim_{n\to\infty} x_n \right) \left( \lim_{n\to\infty} y_n \right) . \]
(iv) If $\lim y_n \neq 0$, and $y_n \neq 0$ for all $n$, then the sequence $\{z_n\}$, where $z_n := \frac{x_n}{y_n}$, converges and
\[ \lim_{n\to\infty} \frac{x_n}{y_n} = \lim_{n\to\infty} z_n = \frac{\lim x_n}{\lim y_n} . \]
Proof. Let us start with (i). Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences and let $z_n := x_n + y_n$. Let $x := \lim x_n$ and $y := \lim y_n$. Let $z := x + y$.
Let $\varepsilon > 0$ be given. Find an $M_1$ such that for all $n \geq M_1$ we have $|x_n - x| < \varepsilon/2$. Find an $M_2$ such that for all $n \geq M_2$ we have $|y_n - y| < \varepsilon/2$. Take $M := \max\{M_1, M_2\}$. For all $n \geq M$ we have
\[ |z_n - z| = |(x_n + y_n) - (x + y)| = |x_n - x + y_n - y| \leq |x_n - x| + |y_n - y| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Therefore (i) is proved. Proof of (ii) is almost identical and is left as an exercise.
Let us tackle (iii). Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences and let $z_n := x_n y_n$. Let $x := \lim x_n$ and $y := \lim y_n$. Let $z := xy$.
Let $\varepsilon > 0$ be given. As $\{x_n\}$ is convergent, it is bounded. Therefore, find a $B > 0$ such that $|x_n| \leq B$ for all $n \in \mathbb{N}$. Find an $M_1$ such that for all $n \geq M_1$ we have $|x_n - x| < \frac{\varepsilon}{2(|y|+1)}$. Find an $M_2$ such that for all $n \geq M_2$ we have $|y_n - y| < \frac{\varepsilon}{2B}$. Take $M := \max\{M_1, M_2\}$. For all $n \geq M$ we have
\begin{align*}
|z_n - z| &= |(x_n y_n) - (xy)| \\
&= |x_n y_n - (x + x_n - x_n) y| \\
&= |x_n (y_n - y) + (x_n - x) y| \\
&\leq |x_n (y_n - y)| + |(x_n - x) y| \\
&= |x_n| \, |y_n - y| + |x_n - x| \, |y| \\
&\leq B \, |y_n - y| + |x_n - x| \, |y| \\
&< B \frac{\varepsilon}{2B} + \frac{\varepsilon}{2(|y| + 1)} |y| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
Finally let us tackle (iv). Instead of proving (iv) directly, we prove the following simpler claim:
Claim: If $\{y_n\}$ is a convergent sequence such that $\lim y_n \neq 0$ and $y_n \neq 0$ for all $n \in \mathbb{N}$, then
\[ \lim_{n\to\infty} \frac{1}{y_n} = \frac{1}{\lim y_n} . \]
Once the claim is proved, we take the sequence $\{1/y_n\}$ and multiply it by the sequence $\{x_n\}$ and apply item (iii).
Proof of claim: Let $\varepsilon > 0$ be given. Let $y := \lim y_n$. Find an $M$ such that for all $n \geq M$ we have
\[ |y_n - y| < \min \left\{ |y|^2 \frac{\varepsilon}{2} , \frac{|y|}{2} \right\} . \]
Note that
\[ |y| = |y - y_n + y_n| \leq |y - y_n| + |y_n| , \]
or in other words $|y_n| \geq |y| - |y - y_n|$. Now $|y_n - y| < \frac{|y|}{2}$ implies that
\[ |y| - |y_n - y| > \frac{|y|}{2} . \]
Therefore
\[ |y_n| \geq |y| - |y - y_n| > \frac{|y|}{2} \]
and consequently
\[ \frac{1}{|y_n|} < \frac{2}{|y|} . \]
Now we can finish the proof of the claim,
\[ \left| \frac{1}{y_n} - \frac{1}{y} \right| = \left| \frac{y - y_n}{y y_n} \right| = \frac{|y - y_n|}{|y| \, |y_n|} < \frac{|y - y_n|}{|y|} \, \frac{2}{|y|} < \frac{|y|^2 \frac{\varepsilon}{2}}{|y|} \, \frac{2}{|y|} = \varepsilon. \]
And we are done.
By plugging in constant sequences, we get several easy corollaries. If $c \in \mathbb{R}$ and $\{x_n\}$ is a convergent sequence, then for example
\[ \lim_{n\to\infty} c x_n = c \left( \lim_{n\to\infty} x_n \right) \quad \text{and} \quad \lim_{n\to\infty} (c + x_n) = c + \lim_{n\to\infty} x_n . \]
Similarly with subtraction and division.
As we can take limits past multiplication we can show that $\lim x_n^k = (\lim x_n)^k$. That is, we can take limits past powers. Let us see if we can do the same with roots.
Proposition 2.2.6. Let $\{x_n\}$ be a convergent sequence such that $x_n \geq 0$. Then
\[ \lim_{n\to\infty} \sqrt{x_n} = \sqrt{ \lim_{n\to\infty} x_n } . \]
Of course to even make this statement, we need to apply Corollary 2.2.4 to show that $\lim x_n \geq 0$ so that we can take the square root without worry.
Proof. Let $\{x_n\}$ be a convergent sequence and let $x := \lim x_n$.
First suppose that $x = 0$. Let $\varepsilon > 0$ be given. Then there is an $M$ such that for all $n \geq M$ we have $x_n = |x_n| < \varepsilon^2$, or in other words $\sqrt{x_n} < \varepsilon$. Hence
\[ \left| \sqrt{x_n} - \sqrt{x} \right| = \sqrt{x_n} < \varepsilon. \]
Now suppose that $x > 0$ (and hence $\sqrt{x} > 0$). Then
\[ \left| \sqrt{x_n} - \sqrt{x} \right| = \left| \frac{x_n - x}{\sqrt{x_n} + \sqrt{x}} \right| = \frac{1}{\sqrt{x_n} + \sqrt{x}} |x_n - x| \leq \frac{1}{\sqrt{x}} |x_n - x| . \]
We leave the rest of the proof to the reader.
A similar proof works for the $k$th root. That is, we also obtain $\lim x_n^{1/k} = (\lim x_n)^{1/k}$. We leave this to the reader as a challenging exercise.
We may also want to take the limit past the absolute value sign.
Proposition 2.2.7. If $\{x_n\}$ is a convergent sequence, then $\{|x_n|\}$ is convergent and
\[ \lim_{n\to\infty} |x_n| = \left| \lim_{n\to\infty} x_n \right| . \]
Proof. We simply note the reverse triangle inequality
\[ \big| |x_n| - |x| \big| \leq |x_n - x| . \]
Hence if $|x_n - x|$ can be made arbitrarily small, so can $\big| |x_n| - |x| \big|$. Details are left to the reader.
2.2.3 Recursively deﬁned sequences
Once we know we can interchange limits and algebraic operations, we will actually be able to easily compute the limits for a large class of sequences. One such class is that of recursively defined sequences, that is, sequences where the next number in the sequence is computed using a formula from a fixed number of preceding numbers in the sequence.
Example 2.2.8: Let $\{x_n\}$ be defined by $x_1 := 2$ and
\[ x_{n+1} := x_n - \frac{x_n^2 - 2}{2 x_n} . \]
We must first find out if this sequence is well defined; that is, we must show we never divide by zero. Then we must find out if the sequence converges. Only then can we attempt to find the limit.
First let us prove that $x_n > 0$ for all $n$ (then the sequence is well defined). Let us show this by induction. We know that $x_1 = 2 > 0$. For the induction step, suppose that $x_n > 0$. Then
\[ x_{n+1} = x_n - \frac{x_n^2 - 2}{2 x_n} = \frac{2 x_n^2 - x_n^2 + 2}{2 x_n} = \frac{x_n^2 + 2}{2 x_n} . \]
If $x_n > 0$, then $x_n^2 + 2 > 0$ and hence $x_{n+1} > 0$. Next let us show that the sequence is monotone decreasing. If we can show that $x_n^2 - 2 \geq 0$ for all $n$, then $x_{n+1} \leq x_n$ for all $n$. Obviously $x_1^2 - 2 = 4 - 2 = 2 > 0$. For an arbitrary $n$ we have that
\[ x_{n+1}^2 - 2 = \left( \frac{x_n^2 + 2}{2 x_n} \right)^2 - 2 = \frac{x_n^4 + 4 x_n^2 + 4 - 8 x_n^2}{4 x_n^2} = \frac{x_n^4 - 4 x_n^2 + 4}{4 x_n^2} = \frac{\left( x_n^2 - 2 \right)^2}{4 x_n^2} . \]
Since $x_n > 0$ and any number squared is nonnegative, we have that $x_{n+1}^2 - 2 \geq 0$ for all $n$. Therefore, $\{x_n\}$ is monotone decreasing and bounded, and therefore the limit exists. It remains to find out what the limit is.
Let us write
\[ 2 x_n x_{n+1} = x_n^2 + 2. \]
Since $\{x_{n+1}\}$ is the 1-tail of $\{x_n\}$, it converges to the same limit. Let us define $x := \lim x_n$. We can take the limit of both sides to obtain
\[ 2 x^2 = x^2 + 2, \]
or $x^2 = 2$. As $x \geq 0$, we know that $x = \sqrt{2}$.
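The first few terms of the sequence in Example 2.2.8 can be computed exactly with rational arithmetic. The following is a small sketch, not part of the original text: the names `step` and `xs` are ours, and the assertions check, for these few terms only, the three facts proved in the example (positivity, $x_n^2 - 2 \geq 0$, and monotone decrease).

```python
from fractions import Fraction

def step(x):
    """One application of the recursion x_{n+1} = x_n - (x_n^2 - 2) / (2 x_n)."""
    return x - (x * x - 2) / (2 * x)

xs = [Fraction(2)]            # x_1 := 2
for _ in range(5):
    xs.append(step(xs[-1]))   # x_2, ..., x_6

assert all(x > 0 for x in xs)                     # well defined: no division by zero
assert all(x * x - 2 >= 0 for x in xs)            # x_n^2 - 2 >= 0
assert all(b <= a for a, b in zip(xs, xs[1:]))    # monotone decreasing

for x in xs:
    print(x, float(x * x - 2))
```

The printed values of $x_n^2 - 2$ shrink very quickly, which is consistent with the limit satisfying $x^2 = 2$.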
You should, however, be careful. Before taking any limits, you must make sure the sequence
converges. Let us see an example.
Example 2.2.9: Suppose $x_1 := 1$ and $x_{n+1} := x_n^2 + x_n$. If we blindly assumed that the limit exists (call it $x$), then we would get the equation $x = x^2 + x$, from which we might conclude that $x = 0$. However, it is not hard to show that $\{x_n\}$ is unbounded and therefore does not converge.
The thing to notice in this example is that the method still works, but it depends on the initial value $x_1$. If we made $x_1 = 0$, then the sequence converges and the limit really is 0. An entire branch of mathematics, called dynamics, deals precisely with these issues.
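The dependence on the initial value in Example 2.2.9 is easy to see by listing a few terms. This is an illustrative sketch only; the function name `orbit` is hypothetical and integer arithmetic keeps every value exact.

```python
def orbit(x1, count):
    """First `count` terms of x_{n+1} = x_n^2 + x_n starting from x_1 = x1."""
    xs = [x1]
    for _ in range(count - 1):
        xs.append(xs[-1] ** 2 + xs[-1])
    return xs

print(orbit(1, 6))   # starting at 1 the terms grow very quickly
print(orbit(0, 6))   # starting at 0 the sequence is constant
```

Starting at $x_1 = 1$ every term is at least double the previous one, so the sequence cannot be bounded, while starting at $x_1 = 0$ it is the constant sequence $0, 0, 0, \ldots$ with limit 0.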
2.2.4 Some convergence tests
Sometimes it is not necessary to go back to the definition of convergence to prove that a sequence is convergent. First a simple test. Essentially, the main idea is that $\{x_n\}$ converges to $x$ if and only if $\{|x_n - x|\}$ converges to zero.
Proposition 2.2.10. Let $\{x_n\}$ be a sequence. Suppose that there is an $x \in \mathbb{R}$ and a convergent sequence $\{a_n\}$ such that
\[ \lim_{n\to\infty} a_n = 0 \]
and
\[ |x_n - x| \leq a_n \]
for all $n$. Then $\{x_n\}$ converges and $\lim x_n = x$.
Proof. Let $\varepsilon > 0$ be given. Note that $a_n \geq 0$ for all $n$. Find an $M \in \mathbb{N}$ such that for all $n \geq M$ we have $a_n = |a_n - 0| < \varepsilon$. Then, for all $n \geq M$ we have
\[ |x_n - x| \leq a_n < \varepsilon. \]
As the proposition shows, to study when a sequence has a limit is the same as studying when
another sequence goes to zero. For some special sequences we can test the convergence easily. First
let us compute the limit of a very speciﬁc sequence.
Proposition 2.2.11. Let $c > 0$.
(i) If $c < 1$, then
\[ \lim_{n\to\infty} c^n = 0. \]
(ii) If $c > 1$, then $\{c^n\}$ is unbounded.
Proof. First let us suppose that $c > 1$. We write $c = 1 + r$ for some $r > 0$. By induction (or using the binomial theorem if you know it) we see that
\[ c^n = (1 + r)^n \geq 1 + nr. \]
By the Archimedean property of the real numbers, the sequence $\{1 + nr\}$ is unbounded (for any number $B$, we can find an $n$ such that $nr \geq B - 1$). Therefore $\{c^n\}$ is unbounded.
Now let $c < 1$. Write $c = \frac{1}{1+r}$, where $r > 0$. Then
\[ c^n = \frac{1}{(1+r)^n} \leq \frac{1}{1 + nr} \leq \frac{1}{r} \, \frac{1}{n} . \]
As $\left\{ \frac{1}{n} \right\}$ converges to zero, so does $\left\{ \frac{1}{r} \frac{1}{n} \right\}$. Hence, $\{c^n\}$ converges to zero.
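The two inequalities driving the proof of Proposition 2.2.11 can be checked on a finite range with exact arithmetic. This is a minimal sketch: the value $r = 1/2$ and the function names are hypothetical choices made only for illustration, and the check is no substitute for the induction in the proof.

```python
from fractions import Fraction

r = Fraction(1, 2)           # hypothetical choice of r > 0
c_small = 1 / (1 + r)        # the case 0 < c < 1, here c = 2/3

def bernoulli_holds(n):
    """(1 + r)^n >= 1 + n r, the bound used for the case c > 1."""
    return (1 + r) ** n >= 1 + n * r

def decay_bound_holds(n):
    """c^n <= 1/(1 + n r) <= (1/r)(1/n), the bound used for the case c < 1."""
    return c_small ** n <= 1 / (1 + n * r) <= (1 / r) * Fraction(1, n)

assert all(bernoulli_holds(n) for n in range(1, 60))
assert all(decay_bound_holds(n) for n in range(1, 60))
```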
If we look at the above proposition, we note that the ratio of the $(n+1)$th term and the $n$th term is $c$. We can generalize this simple result to a larger class of sequences. The following lemma will come up again once we get to series.
Lemma 2.2.12 (Ratio test for sequences). Let $\{x_n\}$ be a sequence such that $x_n \neq 0$ for all $n$ and such that the limit
\[ L := \lim_{n\to\infty} \frac{|x_{n+1}|}{|x_n|} \]
exists.
(i) If $L < 1$, then $\{x_n\}$ converges and $\lim x_n = 0$.
(ii) If $L > 1$, then $\{x_n\}$ is unbounded (hence diverges).
Even if $L$ exists, but $L = 1$, the lemma says nothing. We cannot make any conclusion based on that information alone. For example, consider the sequences $1, 1, 1, 1, \ldots$ and $1, -1, 1, -1, 1, \ldots$.
Proof. Suppose $L < 1$. As $\frac{|x_{n+1}|}{|x_n|} \geq 0$, we have that $L \geq 0$. Pick $r$ such that $L < r < 1$. As $r - L > 0$, there exists an $M \in \mathbb{N}$ such that for all $n \geq M$ we have
\[ \left| \frac{|x_{n+1}|}{|x_n|} - L \right| < r - L. \]
Therefore,
\[ \frac{|x_{n+1}|}{|x_n|} < r. \]
For $n > M$ (that is for $n \geq M + 1$) we write
\[ |x_n| = |x_M| \, \frac{|x_n|}{|x_{n-1}|} \, \frac{|x_{n-1}|}{|x_{n-2}|} \cdots \frac{|x_{M+1}|}{|x_M|} < |x_M| \, r r \cdots r = |x_M| \, r^{n-M} = \left( |x_M| \, r^{-M} \right) r^n . \]
The sequence $\{r^n\}$ converges to zero and hence $|x_M| \, r^{-M} r^n$ converges to zero. By Proposition 2.2.10, the $M$-tail of $\{x_n\}$ converges to zero and therefore $\{x_n\}$ converges to zero.
Now suppose $L > 1$. Pick $r$ such that $1 < r < L$. As $L - r > 0$, there exists an $M \in \mathbb{N}$ such that for all $n \geq M$ we have
\[ \left| \frac{|x_{n+1}|}{|x_n|} - L \right| < L - r. \]
Therefore,
\[ \frac{|x_{n+1}|}{|x_n|} > r. \]
Again for $n > M$ we write
\[ |x_n| = |x_M| \, \frac{|x_n|}{|x_{n-1}|} \, \frac{|x_{n-1}|}{|x_{n-2}|} \cdots \frac{|x_{M+1}|}{|x_M|} > |x_M| \, r r \cdots r = |x_M| \, r^{n-M} = \left( |x_M| \, r^{-M} \right) r^n . \]
The sequence $\{r^n\}$ is unbounded (since $r > 1$), and therefore $|x_n|$ cannot be bounded (if $|x_n| \leq B$ for all $n$, then $r^n < \frac{B}{|x_M|} r^M$ for all $n$, which is impossible). Consequently, $\{x_n\}$ cannot converge.
Example 2.2.13: A simple example of using the above lemma is to prove that
\[ \lim_{n\to\infty} \frac{2^n}{n!} = 0. \]
Proof: We find that
\[ \frac{2^{n+1} / (n+1)!}{2^n / n!} = \frac{2^{n+1}}{2^n} \, \frac{n!}{(n+1)!} = \frac{2}{n+1} . \]
It is not hard to see that $\left\{ \frac{2}{n+1} \right\}$ converges to zero. The conclusion follows by the lemma.
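The ratio computed in Example 2.2.13 can be verified term by term. The sketch below uses exact fractions; the function name `x` is ours, and the final comparison against $10^{-6}$ is an arbitrary illustration of how small the terms become.

```python
from fractions import Fraction
from math import factorial

def x(n):
    """n-th term 2^n / n! as an exact fraction."""
    return Fraction(2 ** n, factorial(n))

# The ratio of consecutive terms is exactly 2 / (n + 1), so L = 0 < 1.
assert all(x(n + 1) / x(n) == Fraction(2, n + 1) for n in range(1, 50))

# The terms themselves become small quickly.
assert x(20) < Fraction(1, 10 ** 6)
```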
2.2.5 Exercises
Exercise 2.2.1: Prove Corollary 2.2.4. Hint: Use constant sequences and Lemma 2.2.3.
Exercise 2.2.2: Prove part (ii) of Proposition 2.2.5.
Exercise 2.2.3: Prove that if $\{x_n\}$ is a convergent sequence, $k \in \mathbb{N}$, then
\[ \lim_{n\to\infty} x_n^k = \left( \lim_{n\to\infty} x_n \right)^k . \]
Hint: Use induction.
Exercise 2.2.4: Suppose that $x_1 := \frac{1}{2}$ and $x_{n+1} := x_n^2$. Show that $\{x_n\}$ converges and find $\lim x_n$. Hint: You cannot divide by zero!
Exercise 2.2.5: Let $x_n := \frac{n - \cos(n)}{n}$. Use the squeeze lemma to show that $\{x_n\}$ converges and find the limit.
Exercise 2.2.6: Let $x_n := \frac{1}{n^2}$ and $y_n := \frac{1}{n}$. Define $z_n := \frac{x_n}{y_n}$ and $w_n := \frac{y_n}{x_n}$. Do $\{z_n\}$ and $\{w_n\}$ converge? What are the limits? Can you apply Proposition 2.2.5? Why or why not?
Exercise 2.2.7: True or false, prove or find a counterexample. If $\{x_n\}$ is a sequence such that $\{x_n^2\}$ converges, then $\{x_n\}$ converges.
Exercise 2.2.8: Show that
\[ \lim_{n\to\infty} \frac{n^2}{2^n} = 0. \]
Exercise 2.2.9: Suppose that $\{x_n\}$ is a sequence and suppose that for some $x \in \mathbb{R}$, the limit
\[ L := \lim_{n\to\infty} \frac{|x_{n+1} - x|}{|x_n - x|} \]
exists and $L < 1$. Show that $\{x_n\}$ converges to $x$.
Exercise 2.2.10 (Challenging): Let $\{x_n\}$ be a convergent sequence such that $x_n \geq 0$ and $k \in \mathbb{N}$. Show that
\[ \lim_{n\to\infty} x_n^{1/k} = \left( \lim_{n\to\infty} x_n \right)^{1/k} . \]
Hint: Find an expression $q$ such that $\frac{x_n^{1/k} - x^{1/k}}{x_n - x} = \frac{1}{q}$.
2.3 Limit superior, limit inferior, and Bolzano-Weierstrass
Note: 1.5-2 lectures, alternative proof of BW optional
In this section we study bounded sequences and their subsequences. In particular we define the so-called limit superior and limit inferior of a bounded sequence and talk about limits of subsequences. Furthermore, we prove the so-called Bolzano-Weierstrass theorem†, which is an indispensable tool in analysis.
We have seen that every convergent sequence is bounded, but there exist many bounded divergent sequences. For example, the sequence $\{(-1)^n\}$ is bounded, but we have seen it is divergent. All is not lost however and we can still compute certain limits with a bounded divergent sequence.
2.3.1 Upper and lower limits
There are ways of creating monotone sequences out of any sequence, and in this way we get the so-called limit superior and limit inferior. These limits will always exist for bounded sequences.
Note that if a sequence $\{x_n\}$ is bounded, then the set $\{x_k : k \in \mathbb{N}\}$ is bounded. Then for every $n$ the set $\{x_k : k \geq n\}$ is also bounded (as it is a subset).
Definition 2.3.1. Let $\{x_n\}$ be a bounded sequence. Let $a_n := \sup\{x_k : k \geq n\}$ and $b_n := \inf\{x_k : k \geq n\}$. We note that the sequence $\{a_n\}$ is bounded monotone decreasing and $\{b_n\}$ is bounded monotone increasing (more on this point below). We define
\[ \limsup_{n\to\infty} x_n := \lim_{n\to\infty} a_n , \qquad \liminf_{n\to\infty} x_n := \lim_{n\to\infty} b_n . \]
For a bounded sequence, liminf and limsup always exist. It is possible to define liminf and limsup for unbounded sequences if we allow $\infty$ and $-\infty$. It is not hard to generalize the following results to include unbounded sequences, however, we will restrict our attention to bounded ones.
Let us see why $\{a_n\}$ is a decreasing sequence. As $a_n$ is the least upper bound for $\{x_k : k \geq n\}$, it is also an upper bound for the subset $\{x_k : k \geq (n+1)\}$. Therefore $a_{n+1}$, the least upper bound for $\{x_k : k \geq (n+1)\}$, has to be less than or equal to $a_n$, that is, $a_n \geq a_{n+1}$. Similarly, $\{b_n\}$ is an increasing sequence. It is left as an exercise to show that if $\{x_n\}$ is bounded, then $\{a_n\}$ and $\{b_n\}$ must be bounded.
Proposition 2.3.2. Let $\{x_n\}$ be a bounded sequence. Define $a_n$ and $b_n$ as in the definition above.
(i) $\limsup_{n\to\infty} x_n = \inf\{a_n : n \in \mathbb{N}\}$ and $\liminf_{n\to\infty} x_n = \sup\{b_n : n \in \mathbb{N}\}$.
(ii) $\liminf_{n\to\infty} x_n \leq \limsup_{n\to\infty} x_n$.
† Named after the Czech mathematician Bernhard Placidus Johann Nepomuk Bolzano (1781-1848), and the
German mathematician Karl Theodor Wilhelm Weierstrass (1815-1897).
Proof. The first item in the proposition follows as the sequences $\{a_n\}$ and $\{b_n\}$ are monotone.
For the second item, we note that $b_n \leq a_n$, as the inf of a set is less than or equal to its sup.
We know that $\{a_n\}$ and $\{b_n\}$ converge to the limsup and the liminf (respectively). We can apply
Lemma 2.2.3 to note that
\[ \lim_{n\to\infty} b_n \leq \lim_{n\to\infty} a_n. \]
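Example 2.3.3: Let $\{x_n\}$ be defined by
\[ x_n := \begin{cases} \frac{n+1}{n} & \text{if $n$ is odd,} \\ 0 & \text{if $n$ is even.} \end{cases} \]
Let us compute the liminf and limsup of this sequence:
\[ \liminf_{n\to\infty} x_n = \lim_{n\to\infty} \left( \inf\{x_k : k \geq n\} \right) = \lim_{n\to\infty} 0 = 0. \]
For the limit superior we write
\[ \limsup_{n\to\infty} x_n = \lim_{n\to\infty} \left( \sup\{x_k : k \geq n\} \right). \]
It is not hard to see that
\[ \sup\{x_k : k \geq n\} = \begin{cases} \frac{n+1}{n} & \text{if $n$ is odd,} \\ \frac{n+2}{n+1} & \text{if $n$ is even.} \end{cases} \]
We leave it to the reader to show that the limit is 1. That is,
\[ \limsup_{n\to\infty} x_n = 1. \]
Do note that the sequence $\{x_n\}$ is not a convergent sequence.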
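The tail suprema and infima of Example 2.3.3 can be tabulated numerically. The following is only a sketch: the names `x`, `tail_sup_inf`, and the cutoff `horizon` are our own, and a finite maximum stands in for the supremum of the infinite tail. For this particular sequence that is harmless, since the odd-indexed terms decrease and so the supremum of each tail is attained at its first odd index.

```python
from fractions import Fraction

def x(n):
    """Term of Example 2.3.3: (n+1)/n for odd n, 0 for even n."""
    return Fraction(n + 1, n) if n % 2 == 1 else Fraction(0)

def tail_sup_inf(n, horizon):
    """Max and min of x_k for n <= k <= horizon (finite stand-in for the tail)."""
    tail = [x(k) for k in range(n, horizon + 1)]
    return max(tail), min(tail)

# a_n decreases toward 1 while b_n stays at 0.
for n in (1, 2, 5, 10, 100):
    a_n, b_n = tail_sup_inf(n, horizon=1000)
    print(n, a_n, b_n)
```

The printed values of `a_n` agree with the case formula for the supremum in the example, and `b_n` is always 0.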
We can associate with limsup and liminf certain subsequences.
Theorem 2.3.4. If $\{x_n\}$ is a bounded sequence, then there exists a subsequence $\{x_{n_k}\}$ such that
\[ \lim_{k\to\infty} x_{n_k} = \limsup_{n\to\infty} x_n. \]
Similarly, there exists a (perhaps different) subsequence $\{x_{n_k}\}$ such that
\[ \lim_{k\to\infty} x_{n_k} = \liminf_{n\to\infty} x_n. \]
Proof. Define $a_n := \sup\{x_k : k \geq n\}$. Write $x := \limsup x_n = \lim a_n$. Define the subsequence as
follows. Pick $n_1 := 1$ and work inductively. Suppose we have defined the subsequence until $n_k$ for
some $k$. Now pick some $m > n_k$ such that
\[ a_{(n_k+1)} - x_m < \frac{1}{k+1}. \]
We can do this as $a_{(n_k+1)}$ is the supremum of the set $\{x_n : n \geq n_k+1\}$ and hence there are elements of
the sequence arbitrarily close (or even equal) to the supremum. Set $n_{k+1} := m$. The subsequence
$\{x_{n_k}\}$ is defined. Next we need to prove that it has the right limit.
Note that $a_{(n_{k-1}+1)} \geq a_{n_k}$ (why?) and that $a_{n_k} \geq x_{n_k}$. Therefore, for every $k > 1$ we have
\[ |a_{n_k} - x_{n_k}| = a_{n_k} - x_{n_k} \leq a_{(n_{k-1}+1)} - x_{n_k} < \frac{1}{k}. \]
Let us show that $\{x_{n_k}\}$ is convergent to $x$. Note that the subsequence need not be monotone. Let
$\epsilon > 0$ be given. As $\{a_n\}$ converges to $x$, the subsequence $\{a_{n_k}\}$ converges to $x$. Thus there
exists an $M_1 \in \mathbb{N}$ such that for all $k \geq M_1$ we have
\[ |a_{n_k} - x| < \frac{\epsilon}{2}. \]
Find an $M_2 \in \mathbb{N}$ such that
\[ \frac{1}{M_2} \leq \frac{\epsilon}{2}. \]
Take $M := \max\{M_1, M_2\}$ and compute. For all $k \geq M$ we have
\[ |x - x_{n_k}| = |a_{n_k} - x_{n_k} + x - a_{n_k}| \leq |a_{n_k} - x_{n_k}| + |x - a_{n_k}| < \frac{1}{k} + \frac{\epsilon}{2} \leq \frac{1}{M_2} + \frac{\epsilon}{2} \leq \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]
We leave the statement for liminf as an exercise.
2.3.2 Using limit inferior and limit superior
The advantage of liminf and limsup is that we can always write them down for any (bounded)
sequence. Working with liminf and limsup is a little bit like working with limits, although there
are subtle differences. If we can somehow compute them, we can also compute the limit of the
sequence, if it exists.
Theorem 2.3.5. Let $\{x_n\}$ be a bounded sequence. Then $\{x_n\}$ converges if and only if
\[ \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n. \]
Furthermore, if $\{x_n\}$ converges, then
\[ \lim_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n. \]
Proof. Define $a_n$ and $b_n$ as in Definition 2.3.1. Now note that
\[ b_n \leq x_n \leq a_n. \]
If $\liminf x_n = \limsup x_n$, then we know that $\{a_n\}$ and $\{b_n\}$ have limits and that these two limits are
the same. By the squeeze lemma (Lemma 2.2.1), $\{x_n\}$ converges and
\[ \lim_{n\to\infty} b_n = \lim_{n\to\infty} x_n = \lim_{n\to\infty} a_n. \]
Now suppose that $\{x_n\}$ converges to $x$. We know by Theorem 2.3.4 that there exists a subsequence
$\{x_{n_k}\}$ that converges to $\limsup x_n$. As $\{x_n\}$ converges to $x$, we know that every subsequence
converges to $x$ and therefore $\limsup x_n = x$. Similarly $\liminf x_n = x$.
Limit superior and limit inferior behave nicely with subsequences.
Proposition 2.3.6. Suppose that $\{x_n\}$ is a bounded sequence and $\{x_{n_k}\}$ is a subsequence. Then
\[ \liminf_{n\to\infty} x_n \leq \liminf_{k\to\infty} x_{n_k} \leq \limsup_{k\to\infty} x_{n_k} \leq \limsup_{n\to\infty} x_n. \]
Proof. The middle inequality has been noted before already. We will prove the third inequality, and
leave the first inequality as an exercise.
That is, we want to prove that $\limsup x_{n_k} \leq \limsup x_n$. Define $a_j := \sup\{x_k : k \geq j\}$ as usual.
Also define $c_j := \sup\{x_{n_k} : k \geq j\}$. It is not true that $\{c_j\}$ is necessarily a subsequence of $\{a_j\}$. However,
as $n_k \geq k$ for all $k$, we have that $\{x_{n_k} : k \geq j\} \subset \{x_k : k \geq j\}$. A supremum of a subset is less than or
equal to the supremum of the set and therefore
\[ c_j \leq a_j. \]
We apply Lemma 2.2.3 to conclude that
\[ \lim_{j\to\infty} c_j \leq \lim_{j\to\infty} a_j, \]
which is the desired conclusion.
Limit superior and limit inferior are in fact the largest and smallest subsequential limits. If the
subsequence in the previous proposition is convergent, then of course we have that
$\liminf x_{n_k} = \lim x_{n_k} = \limsup x_{n_k}$. Therefore,
\[ \liminf_{n\to\infty} x_n \leq \lim_{k\to\infty} x_{n_k} \leq \limsup_{n\to\infty} x_n. \]
Similarly we also get the following useful test for convergence of a bounded sequence. We leave
the proof as an exercise.
Theorem 2.3.7. A bounded sequence $\{x_n\}$ is convergent and converges to $x$ if and only if every
convergent subsequence $\{x_{n_k}\}$ converges to $x$.
2.3.3 Bolzano-Weierstrass theorem
While it is not true that every bounded sequence is convergent, the Bolzano-Weierstrass theorem tells us
that we can at least find a convergent subsequence. The version of Bolzano-Weierstrass that we
present in this section is the Bolzano-Weierstrass theorem for sequences.
Theorem 2.3.8 (Bolzano-Weierstrass). Suppose that a sequence $\{x_n\}$ of real numbers is bounded.
Then there exists a convergent subsequence $\{x_{n_i}\}$.
Proof. We can use Theorem 2.3.4. It says that there exists a subsequence whose limit is $\limsup x_n$.
The reader might complain right now that Theorem 2.3.4 is strictly stronger than the
Bolzano-Weierstrass theorem as presented above. That is true. However, Theorem 2.3.4 only applies to the
real line, but Bolzano-Weierstrass applies in more general contexts (that is, in $\mathbb{R}^n$) with pretty much
the exact same statement.
As the theorem is so important to analysis, we present an explicit proof. The following proof
generalizes more easily to different contexts.
Alternate proof of Bolzano-Weierstrass. As the sequence is bounded, there exist two numbers
$a_1 < b_1$ such that $a_1 \leq x_n \leq b_1$ for all $n \in \mathbb{N}$.
We will define a subsequence $\{x_{n_i}\}$ and two sequences $\{a_i\}$ and $\{b_i\}$, such that $\{a_i\}$ is monotone
increasing, $\{b_i\}$ is monotone decreasing, $a_i \leq x_{n_i} \leq b_i$ and such that $\lim a_i = \lim b_i$. That $\{x_{n_i}\}$
converges follows by the squeeze lemma.
We define the sequence inductively. We will always assume that $a_i < b_i$. Further we will always
have that $x_n \in [a_i, b_i]$ for infinitely many $n \in \mathbb{N}$. We have already defined $a_1$ and $b_1$. We can take
$n_1 := 1$, that is $x_{n_1} = x_1$.
Now suppose we have defined the subsequence $x_{n_1}, x_{n_2}, \ldots, x_{n_k}$, and the sequences $\{a_i\}$ and $\{b_i\}$
up to some $k \in \mathbb{N}$. We find $y = \frac{a_k + b_k}{2}$. It is clear that $a_k < y < b_k$. If there exist infinitely many $j \in \mathbb{N}$
such that $x_j \in [a_k, y]$, then set $a_{k+1} := a_k$, $b_{k+1} := y$, and pick $n_{k+1} > n_k$ such that $x_{n_{k+1}} \in [a_k, y]$. If
there are not infinitely many $j$ such that $x_j \in [a_k, y]$, then it must be true that there are infinitely
many $j \in \mathbb{N}$ such that $x_j \in [y, b_k]$. In this case pick $a_{k+1} := y$, $b_{k+1} := b_k$, and pick $n_{k+1} > n_k$ such
that $x_{n_{k+1}} \in [y, b_k]$.
Now we have the sequences defined. What is left to prove is that $\lim a_i = \lim b_i$. The
limits exist as the sequences are monotone and bounded. From the construction, it is clear that $b_i - a_i$ is cut in
half in each step. Therefore $b_{i+1} - a_{i+1} = \frac{b_i - a_i}{2}$. By induction, we obtain that
\[ b_i - a_i = \frac{b_1 - a_1}{2^{i-1}}. \]
Let $x := \lim a_i$. As $\{a_i\}$ is monotone we have that
\[ x = \sup\{a_i : i \in \mathbb{N}\}. \]
Now let $y := \lim b_i = \inf\{b_i : i \in \mathbb{N}\}$. We have $x \leq y$ as $a_i < b_i$ for all $i$. As the sequences are
monotone, then for any $i$ we have (why?)
\[ y - x \leq b_i - a_i = \frac{b_1 - a_1}{2^{i-1}}. \]
As $\frac{b_1 - a_1}{2^{i-1}}$ is arbitrarily small and $y - x \geq 0$, we have that $y - x = 0$. We finish by the squeeze
lemma.
Yet another proof of the Bolzano-Weierstrass theorem proves the following claim, which is left
as a challenging exercise. Claim: Every sequence has a monotone subsequence.
2.3.4 Exercises
Exercise 2.3.1: Suppose that $\{x_n\}$ is a bounded sequence. Define $a_n$ and $b_n$ as in Definition 2.3.1.
Show that $\{a_n\}$ and $\{b_n\}$ are bounded.
Exercise 2.3.2: Suppose that $\{x_n\}$ is a bounded sequence. Define $b_n$ as in Definition 2.3.1. Show
that $\{b_n\}$ is an increasing sequence.
Exercise 2.3.3: Finish the proof of Proposition 2.3.6. That is, suppose that $\{x_n\}$ is a bounded
sequence and $\{x_{n_k}\}$ is a subsequence. Prove $\liminf_{n\to\infty} x_n \leq \liminf_{k\to\infty} x_{n_k}$.
Exercise 2.3.4: Prove Theorem 2.3.7.
Exercise 2.3.5: a) Let $x_n := \frac{(-1)^n}{n}$, find $\limsup x_n$ and $\liminf x_n$.
b) Let $x_n := \frac{(n-1)(-1)^n}{n}$, find $\limsup x_n$ and $\liminf x_n$.
Exercise 2.3.6: Let $\{x_n\}$ and $\{y_n\}$ be sequences such that $x_n \leq y_n$ for all $n$. Then show that
\[ \limsup_{n\to\infty} x_n \leq \limsup_{n\to\infty} y_n \]
and
\[ \liminf_{n\to\infty} x_n \leq \liminf_{n\to\infty} y_n. \]
Exercise 2.3.7: Let $\{x_n\}$ and $\{y_n\}$ be bounded sequences.
a) Show that $\{x_n + y_n\}$ is bounded.
b) Show that
\[ (\liminf_{n\to\infty} x_n) + (\liminf_{n\to\infty} y_n) \leq \liminf_{n\to\infty} (x_n + y_n). \]
Hint: Find a subsequence $\{x_{n_i} + y_{n_i}\}$ of $\{x_n + y_n\}$ that converges. Then find a subsequence
$\{x_{n_{m_i}}\}$ of $\{x_{n_i}\}$ that converges. Then apply what you know about limits.
c) Find an explicit $\{x_n\}$ and $\{y_n\}$ such that
\[ (\liminf_{n\to\infty} x_n) + (\liminf_{n\to\infty} y_n) < \liminf_{n\to\infty} (x_n + y_n). \]
Hint: Look for examples that do not have a limit.
Exercise 2.3.8: Let $\{x_n\}$ and $\{y_n\}$ be bounded sequences (from the previous exercise we know that
$\{x_n + y_n\}$ is bounded).
a) Show that
\[ (\limsup_{n\to\infty} x_n) + (\limsup_{n\to\infty} y_n) \geq \limsup_{n\to\infty} (x_n + y_n). \]
Hint: See previous exercise.
b) Find an explicit $\{x_n\}$ and $\{y_n\}$ such that
\[ (\limsup_{n\to\infty} x_n) + (\limsup_{n\to\infty} y_n) > \limsup_{n\to\infty} (x_n + y_n). \]
Hint: See previous exercise.
Exercise 2.3.9: If $S \subset \mathbb{R}$ is a set, then $x \in \mathbb{R}$ is a cluster point if for every $\epsilon > 0$, the set
$(x - \epsilon, x + \epsilon) \cap S \setminus \{x\}$ is not empty. That is, if there are points of $S$ arbitrarily close to $x$. For example,
$S := \{1/n : n \in \mathbb{N}\}$ has a unique (only one) cluster point 0, but $0 \notin S$. Prove the following version of
the Bolzano-Weierstrass theorem:
Theorem. Let $S \subset \mathbb{R}$ be a bounded infinite set, then there exists at least one cluster point of $S$.
Hint: If $S$ is infinite, then $S$ contains a countably infinite subset. That is, there is a sequence
$\{x_n\}$ of distinct numbers in $S$.
Exercise 2.3.10 (Challenging): a) Prove that any sequence contains a monotone subsequence.
Hint: For a sequence $\{a_n\}$, call $n \in \mathbb{N}$ a peak if $a_m \leq a_n$ for all $m \geq n$. Now there are two possibilities: either the
sequence has at most finitely many peaks, or it has infinitely many peaks.
b) Now conclude the Bolzano-Weierstrass theorem.
2.4. CAUCHY SEQUENCES 65
2.4 Cauchy sequences
Note: 0.5-1 lecture
Often we wish to describe a certain number by a sequence that converges to it. In this case, it is
impossible to use the number itself in the proof that the sequence converges. It would be nice if we
could check for convergence without being able to find the limit.
Definition 2.4.1. A sequence $\{x_n\}$ is a Cauchy sequence‡ if for every $\epsilon > 0$ there exists an $M \in \mathbb{N}$
such that for all $n \geq M$ and all $k \geq M$ we have
\[ |x_n - x_k| < \epsilon. \]
Intuitively what it means is that the terms of the sequence are eventually arbitrarily close to each
other. We would expect such a sequence to be convergent. It turns out that is true because $\mathbb{R}$ is
complete (has the least-upper-bound property). First, let us look at some examples.
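Example 2.4.2: The sequence $\{1/n\}$ is a Cauchy sequence.
Proof: Let $\epsilon > 0$ be given. Take $M > 2/\epsilon$. Then for $n \geq M$ we have that $1/n < \epsilon/2$. Therefore, for
all $n, k \geq M$ we have
\[ \left| \frac{1}{n} - \frac{1}{k} \right| \leq \left| \frac{1}{n} \right| + \left| \frac{1}{k} \right| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]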
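The choice of $M$ in Example 2.4.2 can be spot-checked on a finite window of indices. This is only a sanity check, not a proof: the helper names `cauchy_index` and `max_gap` and the window size are our own, and a finite window cannot cover all $n, k \geq M$.

```python
from fractions import Fraction
import math

def cauchy_index(eps):
    """Smallest integer M with M > 2/eps, the choice made in Example 2.4.2."""
    return math.floor(2 / eps) + 1

def max_gap(M, count):
    """Largest |1/n - 1/k| over the window M <= n, k < M + count."""
    idx = range(M, M + count)
    return max(abs(Fraction(1, n) - Fraction(1, k)) for n in idx for k in idx)

eps = Fraction(1, 10)
M = cauchy_index(eps)          # here M = 21
print(M, max_gap(M, 200) < eps)
```

Exact rational arithmetic (`Fraction`) is used so that the comparison with $\epsilon$ is not affected by floating-point rounding.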
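Example 2.4.3: The sequence $\{\frac{n+1}{n}\}$ is a Cauchy sequence.
Proof: Given $\epsilon > 0$, find $M$ such that $M > 2/\epsilon$. Then for $n, k \geq M$ we have that $1/n < \epsilon/2$ and
$1/k < \epsilon/2$. Therefore
\[ \left| \frac{n+1}{n} - \frac{k+1}{k} \right| = \left| \frac{k(n+1) - n(k+1)}{nk} \right| = \left| \frac{kn + k - nk - n}{nk} \right| = \left| \frac{k - n}{nk} \right| \leq \left| \frac{k}{nk} \right| + \left| \frac{-n}{nk} \right| = \frac{1}{n} + \frac{1}{k} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]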
Proposition 2.4.4. A Cauchy sequence is bounded.
‡ Named after the French mathematician Augustin-Louis Cauchy (1789-1857).
Proof. Suppose that $\{x_n\}$ is Cauchy. Pick $M$ such that for all $n, k \geq M$ we have $|x_n - x_k| < 1$. In
particular, we have that for all $n \geq M$
\[ |x_n - x_M| < 1. \]
Or by the reverse triangle inequality, $|x_n| - |x_M| \leq |x_n - x_M| < 1$. Hence for $n \geq M$ we have
\[ |x_n| < 1 + |x_M|. \]
Let
\[ B := \max\{|x_1|, |x_2|, \ldots, |x_M|, 1 + |x_M|\}. \]
Then $|x_n| \leq B$ for all $n \in \mathbb{N}$.
Theorem 2.4.5. A sequence of real numbers is Cauchy if and only if it converges.
Proof. Let $\epsilon > 0$ be given and suppose that $\{x_n\}$ converges to $x$. Then there exists an $M$ such that
for $n \geq M$ we have
\[ |x_n - x| < \frac{\epsilon}{2}. \]
Hence for $n \geq M$ and $k \geq M$ we have
\[ |x_n - x_k| = |x_n - x + x - x_k| \leq |x_n - x| + |x - x_k| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]
Alright, that direction was easy. Now suppose that $\{x_n\}$ is Cauchy. We have shown that $\{x_n\}$ is
bounded. If we can show that
\[ \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n, \]
then $\{x_n\}$ must be convergent by Theorem 2.3.5.
Define $a := \liminf x_n$ and $b := \limsup x_n$. If we can show $a = b$, then the sequence converges.
By Theorem 2.3.4, there exist subsequences $\{x_{n_i}\}$ and $\{x_{m_i}\}$, such that
\[ \lim_{i\to\infty} x_{n_i} = a \quad \text{and} \quad \lim_{i\to\infty} x_{m_i} = b. \]
Given an $\epsilon > 0$, there exists an $M_1$ such that for all $i \geq M_1$ we have $|x_{n_i} - a| < \epsilon/3$ and an $M_2$ such
that for all $i \geq M_2$ we have $|x_{m_i} - b| < \epsilon/3$. There also exists an $M_3$ such that for all $n, k \geq M_3$ we
have $|x_n - x_k| < \epsilon/3$. Let $M := \max\{M_1, M_2, M_3\}$. Now note that if $i \geq M$, then $n_i \geq M$ and $m_i \geq M$.
Hence
\[ |a - b| = |a - x_{n_i} + x_{n_i} - x_{m_i} + x_{m_i} - b| \leq |a - x_{n_i}| + |x_{n_i} - x_{m_i}| + |x_{m_i} - b| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon. \]
As $|a - b| < \epsilon$ for all $\epsilon > 0$, then $a = b$ and therefore the sequence converges.
Remark 2.4.6. The statement of this theorem is sometimes used to define the completeness
property of the real numbers. That is, we can say that $\mathbb{R}$ is Cauchy-complete (or sometimes just
complete). We have proved above that as $\mathbb{R}$ has the least-upper-bound property, then $\mathbb{R}$ is
Cauchy-complete. We can "complete" $\mathbb{Q}$ by "throwing in" just enough points to make all Cauchy sequences
converge (we omit the details). The resulting field will have the least-upper-bound property. The
advantage of using Cauchy sequences to define completeness is that this idea generalizes to more
abstract settings.
2.4.1 Exercises
Exercise 2.4.1: Prove that $\{\frac{n^2 - 1}{n^2}\}$ is Cauchy using directly the definition of Cauchy sequences.
Exercise 2.4.2: Let $\{x_n\}$ be a sequence such that there exists a $0 < C < 1$ such that
\[ |x_{n+1} - x_n| \leq C |x_n - x_{n-1}|. \]
Prove that $\{x_n\}$ is Cauchy. Hint: You can freely use the formula (for $C \neq 1$)
\[ 1 + C + C^2 + \cdots + C^n = \frac{1 - C^{n+1}}{1 - C}. \]
Exercise 2.4.3: Suppose that $F$ is an ordered field that contains the rational numbers $\mathbb{Q}$. We can
define a convergent sequence and Cauchy sequence in $F$ in exactly the same way as before. Suppose
that every Cauchy sequence in $F$ is convergent. Prove that $F$ has the least-upper-bound property.
Exercise 2.4.4: Let $\{x_n\}$ and $\{y_n\}$ be sequences such that $\lim y_n = 0$. Suppose that for all $k \in \mathbb{N}$
and for all $m \geq k$ we have
\[ |x_m - x_k| \leq y_k. \]
Show that $\{x_n\}$ is Cauchy.
Exercise 2.4.5: Suppose that a Cauchy sequence $\{x_n\}$ is such that for every $M \in \mathbb{N}$, there exists a
$k \geq M$ and an $n \geq M$ such that $x_k < 0$ and $x_n > 0$. Using simply the definition of a Cauchy sequence
and of a convergent sequence, show that the sequence converges to 0.
2.5 Series
Note: 2 lectures
A fundamental object in mathematics is that of a series. In fact, when foundations of analysis
were being developed, the motivation was to understand series. Understanding series is very
important in applications of analysis. For example, solving differential equations often includes
series, and differential equations are the basis for understanding almost all of modern science.
2.5.1 Definition
Definition 2.5.1. Given a sequence $\{x_n\}$, we write the formal object
\[ \sum_{n=1}^{\infty} x_n \quad \text{or sometimes just} \quad \sum x_n \]
and call it a series. A series converges, if the sequence $\{s_n\}$ defined by
\[ s_n := \sum_{k=1}^{n} x_k = x_1 + x_2 + \cdots + x_n, \]
converges. If $x := \lim s_n$, we write
\[ \sum_{n=1}^{\infty} x_n = x. \]
In this case, we treat $\sum_{n=1}^{\infty} x_n$ as a number. The numbers $s_n$ are called partial sums.
On the other hand, if the sequence $\{s_n\}$ diverges, we say that the series is divergent. In this case,
$\sum x_n$ is simply a formal object and not a number.
In other words, for a convergent series we have
\[ \sum_{n=1}^{\infty} x_n = \lim_{n\to\infty} \sum_{k=1}^{n} x_k. \]
We should be careful however to only use this equality if the limit on the right actually exists. That
is, the right-hand side does not make sense (the limit does not exist) if the series does not converge.
Remark 2.5.2. Before going further, let us remark that it is sometimes convenient to start the series
at an index different from 1. That is, for example we can write
\[ \sum_{n=0}^{\infty} r^n := \sum_{n=1}^{\infty} r^{n-1}. \]
The left-hand side is more convenient to write. The idea is the same as the notation for the tail of a
sequence.
Remark 2.5.3. It is common to write the series $\sum x_n$ as
\[ x_1 + x_2 + x_3 + \cdots \]
with the understanding that the ellipsis indicates that this is a series and not a simple sum. We will
not use this notation as it easily leads to mistakes in proofs.
Example 2.5.4: The series
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} \]
converges and the limit is 1. That is,
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{2^k} = 1. \]
First we prove the following equality
\[ \left( \sum_{k=1}^{n} \frac{1}{2^k} \right) + \frac{1}{2^n} = 1. \]
Note that the equation is easy to see when $n = 1$. The proof follows by induction, which we leave
as an exercise. Let $s_n$ be the partial sum. We write
\[ |1 - s_n| = \left| 1 - \sum_{k=1}^{n} \frac{1}{2^k} \right| = \left| \frac{1}{2^n} \right| = \frac{1}{2^n}. \]
The sequence $\{\frac{1}{2^n}\}$ converges to zero and so $\{|1 - s_n|\}$ converges to zero. So, $\{s_n\}$ converges to 1.
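The identity used in Example 2.5.4, that the $n$th partial sum plus $\frac{1}{2^n}$ equals 1, can be confirmed for small $n$ with exact rational arithmetic. The function name `partial_sum` is our own; the check covers only finitely many $n$ and does not replace the induction left as an exercise.

```python
from fractions import Fraction

def partial_sum(n):
    """s_n = 1/2 + 1/4 + ... + 1/2^n as an exact fraction."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))

for n in range(1, 11):
    s_n = partial_sum(n)
    # The gap 1 - s_n is exactly 1/2^n, which tends to zero.
    assert s_n + Fraction(1, 2**n) == 1
    print(n, s_n, 1 - s_n)
```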
For $-1 < r < 1$, the geometric series
\[ \sum_{n=0}^{\infty} r^n \]
converges. In fact, $\sum_{n=0}^{\infty} r^n = \frac{1}{1-r}$. The proof is left as an exercise to the reader. The proof consists
of showing that
\[ \sum_{k=0}^{n-1} r^k = \frac{1 - r^n}{1 - r}, \]
and then taking the limit.
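As a sanity check of the finite geometric sum formula, the two sides can be compared exactly for a few rational values of $r$. The names `geometric_partial` and `closed_form` are our own, and the sample values of $r$ are arbitrary; the second assertion records that the gap to $\frac{1}{1-r}$ is $\frac{r^n}{1-r}$, which is what goes to zero when $|r| < 1$.

```python
from fractions import Fraction

def geometric_partial(r, n):
    """Sum of r^k for k = 0, ..., n-1."""
    return sum(r**k for k in range(n))

def closed_form(r, n):
    """(1 - r^n) / (1 - r), valid for r != 1."""
    return (1 - r**n) / (1 - r)

for r in (Fraction(1, 3), Fraction(-1, 2), Fraction(9, 10)):
    for n in (1, 2, 5, 20):
        assert geometric_partial(r, n) == closed_form(r, n)
        assert 1 / (1 - r) - geometric_partial(r, n) == r**n / (1 - r)
```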
A fact we will use a lot is the following analogue of looking at the tail of a sequence.
Proposition 2.5.5. Let $\sum x_n$ be a series. Let $M \in \mathbb{N}$. Then
\[ \sum_{n=1}^{\infty} x_n \text{ converges if and only if } \sum_{n=M}^{\infty} x_n \text{ converges.} \]
Proof. We look at partial sums of the two series (for $k \geq M$)
\[ \sum_{n=1}^{k} x_n = \left( \sum_{n=1}^{M-1} x_n \right) + \sum_{n=M}^{k} x_n. \]
Note that $\sum_{n=1}^{M-1} x_n$ is a fixed number. Now use Proposition 2.2.5 to finish the proof.
2.5.2 Cauchy series
Definition 2.5.6. A series $\sum x_n$ is said to be Cauchy or a Cauchy series, if the sequence of partial
sums $\{s_n\}$ is a Cauchy sequence.
A sequence of real numbers converges if and only if it is Cauchy. Therefore a series is convergent
if and only if it is Cauchy.
The series $\sum x_n$ is Cauchy if for every $\epsilon > 0$, there exists an $M \in \mathbb{N}$, such that for every $n \geq M$
and $k \geq M$ we have
\[ \left| \left( \sum_{j=1}^{k} x_j \right) - \left( \sum_{j=1}^{n} x_j \right) \right| < \epsilon. \]
Without loss of generality we can assume that $n < k$. Then we write
\[ \left| \left( \sum_{j=1}^{k} x_j \right) - \left( \sum_{j=1}^{n} x_j \right) \right| = \left| \sum_{j=n+1}^{k} x_j \right| < \epsilon. \]
We have proved the following simple proposition.
Proposition 2.5.7. The series $\sum x_n$ is Cauchy if for every $\epsilon > 0$, there exists an $M \in \mathbb{N}$ such that for
every $n \geq M$ and every $k > n$ we have
\[ \left| \sum_{j=n+1}^{k} x_j \right| < \epsilon. \]
2.5.3 Basic properties
Proposition 2.5.8. Let $\sum x_n$ be a convergent series. Then the sequence $\{x_n\}$ is convergent and
\[ \lim_{n\to\infty} x_n = 0. \]
Proof. Let $\epsilon > 0$ be given. As $\sum x_n$ is convergent, it is Cauchy. Thus we can find an $M$ such that for
every $n \geq M$ we have
\[ \epsilon > \left| \sum_{j=n+1}^{n+1} x_j \right| = |x_{n+1}|. \]
Hence for every $n \geq M+1$ we have that $|x_n| < \epsilon$.
Hence if a series converges, the terms of the series go to zero. However, the implication goes
only one way. Let us give an example.
Example 2.5.9: The series $\sum \frac{1}{n}$ diverges (despite the fact that $\lim \frac{1}{n} = 0$). This is the famous
harmonic series§.
We will simply show that the sequence of partial sums is unbounded, and hence cannot converge.
Write the partial sums $s_n$ for $n = 2^k$ as:
\[ s_1 = 1, \]
\[ s_2 = (1) + \left( \frac{1}{2} \right), \]
\[ s_4 = (1) + \left( \frac{1}{2} \right) + \left( \frac{1}{3} + \frac{1}{4} \right), \]
\[ s_8 = (1) + \left( \frac{1}{2} \right) + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} \right), \]
\[ \vdots \]
\[ s_{2^k} = 1 + \sum_{j=1}^{k} \left( \sum_{m=2^{j-1}+1}^{2^j} \frac{1}{m} \right). \]
We note that $1/3 + 1/4 \geq 1/4 + 1/4 = 1/2$ and $1/5 + 1/6 + 1/7 + 1/8 \geq 1/8 + 1/8 + 1/8 + 1/8 = 1/2$. More
generally
\[ \sum_{m=2^{k-1}+1}^{2^k} \frac{1}{m} \geq \sum_{m=2^{k-1}+1}^{2^k} \frac{1}{2^k} = (2^{k-1}) \frac{1}{2^k} = \frac{1}{2}. \]
Therefore
\[ s_{2^k} = 1 + \sum_{j=1}^{k} \left( \sum_{m=2^{j-1}+1}^{2^j} \frac{1}{m} \right) \geq 1 + \sum_{j=1}^{k} \frac{1}{2} = 1 + \frac{k}{2}. \]
As $\{\frac{k}{2}\}$ is unbounded by the Archimedean property, that means that $\{s_{2^k}\}$ is unbounded, and
therefore $\{s_n\}$ is unbounded. Hence $\{s_n\}$ diverges, and consequently $\sum \frac{1}{n}$ diverges.
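The lower bound $s_{2^k} \geq 1 + \frac{k}{2}$ from the harmonic series example can be checked exactly for small $k$. The function name `harmonic` and the range of $k$ are our own choices; the computation only illustrates the estimate and says nothing beyond the values tested.

```python
from fractions import Fraction

def harmonic(n):
    """n-th partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, m) for m in range(1, n + 1))

for k in range(0, 11):
    s = harmonic(2**k)
    assert s >= 1 + Fraction(k, 2)
    print(k, float(s), float(1 + Fraction(k, 2)))
```

The partial sums grow very slowly, which is why the divergence is not apparent from computing a few terms.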
Convergent series are linear. That is, we can multiply them by constants and add them and these
operations are done term by term.
Proposition 2.5.10 (Linearity of series). Let $\alpha \in \mathbb{R}$ and $\sum x_n$ and $\sum y_n$ be convergent series.
(i) Then $\sum \alpha x_n$ is a convergent series and
\[ \sum_{n=1}^{\infty} \alpha x_n = \alpha \sum_{n=1}^{\infty} x_n. \]
§ The divergence of the harmonic series was known before the theory of series was made rigorous. In fact the proof
we give is the earliest proof and was given by Nicole Oresme (1323-1382).
(ii) Then $\sum (x_n + y_n)$ is a convergent series and
\[ \sum_{n=1}^{\infty} (x_n + y_n) = \left( \sum_{n=1}^{\infty} x_n \right) + \left( \sum_{n=1}^{\infty} y_n \right). \]
Proof. For the first item, we simply write the $n$th partial sum
\[ \sum_{k=1}^{n} \alpha x_k = \alpha \left( \sum_{k=1}^{n} x_k \right). \]
We look at the right-hand side and note that the constant multiple of a convergent sequence is
convergent. Hence, we simply can take the limit of both sides to obtain the result.
For the second item we also look at the $n$th partial sum
\[ \sum_{k=1}^{n} (x_k + y_k) = \left( \sum_{k=1}^{n} x_k \right) + \left( \sum_{k=1}^{n} y_k \right). \]
We look at the right-hand side and note that the sum of convergent sequences is convergent. Hence,
we simply can take the limit of both sides to obtain the proposition.
Note that multiplying series is not as simple as adding, and we will not cover this topic here. It
is not true, of course, that we can multiply term by term, since that strategy does not work even for
finite sums.
2.5.4 Absolute convergence
Since monotone sequences are easier to work with than arbitrary sequences, it is generally easier
to work with series $\sum x_n$ where $x_n \geq 0$ for all $n$. Then the sequence of partial sums is monotone
increasing. Let us formalize this statement as a proposition.
Proposition 2.5.11. If $x_n \geq 0$ for all $n$, then $\sum x_n$ converges if and only if the sequence of partial
sums is bounded.
The following criterion often gives a convenient way to test for convergence of a series.
Definition 2.5.12. A series $\sum x_n$ converges absolutely if the series $\sum |x_n|$ converges. If a series
converges, but does not converge absolutely, we say it is conditionally convergent.
Proposition 2.5.13. If the series $\sum x_n$ converges absolutely, then it converges.
Proof. We know that a series is convergent if and only if it is Cauchy. Hence suppose that $\sum |x_n|$ is
Cauchy. That is, for every $\epsilon > 0$, there exists an $M$ such that for all $k \geq M$ and $n > k$ we have that
\[ \sum_{j=k+1}^{n} |x_j| = \left| \sum_{j=k+1}^{n} |x_j| \right| < \epsilon. \]
We can apply the triangle inequality for a finite sum to obtain
\[ \left| \sum_{j=k+1}^{n} x_j \right| \leq \sum_{j=k+1}^{n} |x_j| < \epsilon. \]
Hence $\sum x_n$ is Cauchy and therefore it converges.
Of course, if $\sum x_n$ converges absolutely, the limits of $\sum x_n$ and $\sum |x_n|$ are in general different. Computing
one will not help us compute the other.
Absolutely convergent series have many wonderful properties for which we do not have space
in these notes. For example, absolutely convergent series can be rearranged arbitrarily.
We state without proof that
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n} \]
converges. On the other hand we have already seen that
\[ \sum_{n=1}^{\infty} \frac{1}{n} \]
diverges. Therefore $\sum \frac{(-1)^n}{n}$ is a conditionally convergent series.
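The contrast between the two series can be seen numerically. The sketch below uses floating-point partial sums; the function names are our own. The comparison value $-\ln 2$ is a standard fact about the alternating harmonic series that is not established in these notes, so treat it as an outside assumption.

```python
import math

def alternating_partial(n):
    """Partial sum of (-1)^k / k for k = 1, ..., n."""
    return sum((-1) ** k / k for k in range(1, n + 1))

def harmonic_partial(n):
    """Partial sum of 1/k for k = 1, ..., n."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    # The first column settles near -ln 2; the second keeps growing.
    print(n, alternating_partial(n), harmonic_partial(n))

print(-math.log(2))
```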
2.5.5 Comparison test and the p-series
We have noted above that for a series to converge the terms not only have to go to zero, but they
have to go to zero "fast enough." If we know about convergence of a certain series we can use the
following comparison test to see if the terms of another series go to zero "fast enough."
Proposition 2.5.14 (Comparison test). Let $\sum x_n$ and $\sum y_n$ be series such that $0 \leq x_n \leq y_n$ for all
$n \in \mathbb{N}$.
(i) If $\sum y_n$ converges, then so does $\sum x_n$.
(ii) If $\sum x_n$ diverges, then so does $\sum y_n$.
Proof. Since the terms of the series are all nonnegative, the sequences of partial sums are both
monotone increasing. We note that since $x_n \leq y_n$ for all $n$, the partial sums satisfy
\[ \sum_{k=1}^{n} x_k \leq \sum_{k=1}^{n} y_k. \qquad (2.1) \]
If the series $\sum y_n$ converges, the partial sums for the series are bounded. Therefore the right-hand
side of (2.1) is bounded for all $n$. Hence the partial sums for $\sum x_n$ are also bounded. Since the partial
sums are a monotone increasing sequence they are convergent. The first item is thus proved.
On the other hand if $\sum x_n$ diverges, the sequence of partial sums must be unbounded since it is
monotone increasing. That is, the partial sums for $\sum x_n$ are eventually bigger than any real number. Putting this
together with (2.1) we see that for any $B \in \mathbb{R}$, there is an $n$ such that
\[ B \leq \sum_{k=1}^{n} x_k \leq \sum_{k=1}^{n} y_k. \]
Hence the partial sums for $\sum y_n$ are also unbounded, and hence $\sum y_n$ also diverges.
A useful series to use with the comparison test is the p-series.
Proposition 2.5.15 (p-series or the p-test). For $p > 0$, the series
\[ \sum_{n=1}^{\infty} \frac{1}{n^p} \]
converges if and only if $p > 1$.
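Proof. First suppose that $p \leq 1$. As $n \geq 1$, we have $\frac{1}{n^p} \geq \frac{1}{n}$. Since $\sum \frac{1}{n}$ diverges, we see that $\sum \frac{1}{n^p}$ must diverge for
all $p \leq 1$ by the comparison test.
Now suppose that $p > 1$. We proceed in a similar fashion as we did in the case of the harmonic
series, but instead of showing that the sequence of partial sums is unbounded we show that it is
bounded. Since the terms of the series are positive, the sequence of partial sums is monotone
increasing. If we show that it is bounded, it must converge. Let $s_k$ denote the $k$th partial sum.
\[ s_1 = 1, \]
\[ s_3 = (1) + \left( \frac{1}{2^p} + \frac{1}{3^p} \right), \]
\[ s_7 = (1) + \left( \frac{1}{2^p} + \frac{1}{3^p} \right) + \left( \frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p} \right), \]
\[ \vdots \]
\[ s_{2^k - 1} = 1 + \sum_{j=1}^{k-1} \left( \sum_{m=2^j}^{2^{j+1}-1} \frac{1}{m^p} \right). \]
Instead of estimating from below, we estimate from above. In particular, as $p > 1$, then $2^p < 3^p$,
and hence $\frac{1}{2^p} + \frac{1}{3^p} < \frac{1}{2^p} + \frac{1}{2^p}$. Similarly
$\frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p} < \frac{1}{4^p} + \frac{1}{4^p} + \frac{1}{4^p} + \frac{1}{4^p}$. Therefore, for $k \geq 2$,
\[ s_{2^k - 1} = 1 + \sum_{j=1}^{k-1} \left( \sum_{m=2^j}^{2^{j+1}-1} \frac{1}{m^p} \right) < 1 + \sum_{j=1}^{k-1} \left( \sum_{m=2^j}^{2^{j+1}-1} \frac{1}{(2^j)^p} \right) = 1 + \sum_{j=1}^{k-1} \frac{2^j}{(2^j)^p} = 1 + \sum_{j=1}^{k-1} \left( \frac{1}{2^{p-1}} \right)^j. \]
As $p > 1$, then $\frac{1}{2^{p-1}} < 1$. Then by using the result of Exercise 2.5.2, we note that
\[ \sum_{j=1}^{\infty} \left( \frac{1}{2^{p-1}} \right)^j \]
converges. Therefore
\[ s_{2^k - 1} < 1 + \sum_{j=1}^{k-1} \left( \frac{1}{2^{p-1}} \right)^j \leq 1 + \sum_{j=1}^{\infty} \left( \frac{1}{2^{p-1}} \right)^j. \]
As $\{s_n\}$ is a monotone sequence, $s_n \leq s_{2^k - 1}$ for all $n \leq 2^k - 1$. Thus for all $n$,
\[ s_n < 1 + \sum_{j=1}^{\infty} \left( \frac{1}{2^{p-1}} \right)^j. \]
The sequence of partial sums is bounded and hence converges.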
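The bound obtained in the proof can be compared with actual partial sums. In the sketch below, `geometric_bound` evaluates $1 + \sum_{j=1}^{\infty} r^j = 1 + \frac{r}{1-r}$ with $r = \frac{1}{2^{p-1}}$, using the geometric series formula; the function names, the floating-point arithmetic, and the sample values of $p$ are our own choices.

```python
def p_partial_sum(p, n):
    """n-th partial sum of the p-series, computed in floating point."""
    return sum(1.0 / m**p for m in range(1, n + 1))

def geometric_bound(p):
    """1 + r/(1 - r) with r = 1/2^(p-1); only meaningful for p > 1."""
    r = 1.0 / 2 ** (p - 1)
    return 1.0 + r / (1.0 - r)

for p in (1.5, 2, 3):
    # Every partial sum stays below the bound, as the proof predicts.
    print(p, p_partial_sum(p, 10000), geometric_bound(p))
```

The bound is far from sharp; it only shows boundedness, which is all the proof needs.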
Note that neither the p-series test nor the comparison test will tell us what the sum converges
to. They only tell us that a limit of the partial sums exists. For example, while we know that $\sum 1/n^2$
converges, it is far harder to find¶ that the limit is $\pi^2/6$. In fact, if we treat $\sum 1/n^p$ as a function of $p$,
we get the so-called Riemann $\zeta$ function. Understanding the behavior of this function contains one
of the most famous problems in mathematics today and has applications in seemingly unrelated
areas such as modern cryptography.
Example 2.5.16: The series $\sum \frac{1}{n^2+1}$ converges.
Proof: First note that $\frac{1}{n^2+1} < \frac{1}{n^2}$ for all $n \in \mathbb{N}$. Note that $\sum \frac{1}{n^2}$ converges by the p-series test.
Therefore, by the comparison test, $\sum \frac{1}{n^2+1}$ converges.
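The termwise comparison behind Example 2.5.16 carries over to partial sums, which can be checked exactly for the first few terms. The helper `partial_sums` and the cutoff of 50 terms are our own; the bound 2 used at the end is the one derived for $p = 2$ in the proof of the p-series test.

```python
from fractions import Fraction

def partial_sums(term, n):
    """Return the list [s_1, ..., s_n] of partial sums of term(1) + term(2) + ..."""
    sums, total = [], Fraction(0)
    for k in range(1, n + 1):
        total += term(k)
        sums.append(total)
    return sums

x_sums = partial_sums(lambda k: Fraction(1, k * k + 1), 50)
y_sums = partial_sums(lambda k: Fraction(1, k * k), 50)

# Inequality (2.1): each partial sum of the smaller series is dominated.
assert all(s <= t for s, t in zip(x_sums, y_sums))
# Both sequences of partial sums stay below the bound 2.
assert y_sums[-1] < 2
print(float(x_sums[-1]), float(y_sums[-1]))
```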
¶ Demonstration of this fact is what made the Swiss mathematician Leonhard Paul Euler (1707-1783) famous.
2.5.6 Ratio test
Proposition 2.5.17 (Ratio test). Let $\sum x_n$ be a series such that
\[ L := \lim_{n\to\infty} \frac{|x_{n+1}|}{|x_n|} \]
exists. Then
(i) If $L < 1$, then $\sum x_n$ converges absolutely.
(ii) If $L > 1$, then $\sum x_n$ diverges.
Proof. From Lemma 2.2.12 we note that if $L > 1$, then $\{x_n\}$ diverges. Since it is a necessary condition
for the convergence of series that the terms go to zero, we know that $\sum x_n$ must diverge.
Thus suppose that $L < 1$. We will argue that $\sum |x_n|$ must converge. The proof is similar to that
of Lemma 2.2.12. Of course $L \geq 0$. Now pick $r$ such that $L < r < 1$. As $r - L > 0$, there exists an
$M \in \mathbb{N}$ such that for all $n \geq M$
\[ \left| \frac{|x_{n+1}|}{|x_n|} - L \right| < r - L. \]
Therefore,
\[ \frac{|x_{n+1}|}{|x_n|} < r. \]
For $n > M$ (that is for $n \geq M+1$) write
\[ |x_n| = |x_M| \frac{|x_n|}{|x_{n-1}|} \frac{|x_{n-1}|}{|x_{n-2}|} \cdots \frac{|x_{M+1}|}{|x_M|} < |x_M| \, r r \cdots r = |x_M| r^{n-M} = (|x_M| r^{-M}) r^n. \]
For $n > M$ we can therefore write the partial sum as
\[ \sum_{k=1}^{n} |x_k| = \left( \sum_{k=1}^{M} |x_k| \right) + \left( \sum_{k=M+1}^{n} |x_k| \right) \leq \left( \sum_{k=1}^{M} |x_k| \right) + \left( \sum_{k=M+1}^{n} (|x_M| r^{-M}) r^k \right) \leq \left( \sum_{k=1}^{M} |x_k| \right) + (|x_M| r^{-M}) \left( \sum_{k=M+1}^{n} r^k \right). \]
As $0 < r < 1$ the geometric series $\sum_{k=0}^{\infty} r^k$ converges and thus of course $\sum_{k=M+1}^{\infty} r^k$ converges as
well (why?). Thus we can take the limit as $n$ goes to infinity on the right-hand side to obtain
\[ \sum_{k=1}^{n} |x_k| \leq \left( \sum_{k=1}^{M} |x_k| \right) + (|x_M| r^{-M}) \left( \sum_{k=M+1}^{n} r^k \right) \leq \left( \sum_{k=1}^{M} |x_k| \right) + (|x_M| r^{-M}) \left( \sum_{k=M+1}^{\infty} r^k \right). \]
The right-hand side is a number that does not depend on $n$. Hence the sequence of partial sums of
$\sum |x_n|$ is bounded and therefore $\sum |x_n|$ is convergent. Thus $\sum x_n$ is absolutely convergent.
Example 2.5.18: The series
\[ \sum_{n=1}^{\infty} \frac{2^n}{n!} \]
converges absolutely.
Proof: We write
\[ \lim_{n\to\infty} \frac{2^{n+1}/(n+1)!}{2^n/n!} = \lim_{n\to\infty} \frac{2}{n+1} = 0. \]
Therefore, the series converges absolutely by the ratio test.
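The quantity the ratio test asks for in Example 2.5.18 is the ratio of consecutive terms of $\sum \frac{2^n}{n!}$. A short exact computation (with our own helper names `term` and `ratio`) confirms for small $n$ that this ratio equals $\frac{2}{n+1}$, which tends to 0.

```python
from fractions import Fraction
from math import factorial

def term(n):
    """x_n = 2^n / n! as an exact fraction."""
    return Fraction(2**n, factorial(n))

def ratio(n):
    """|x_{n+1}| / |x_n| for the series of Example 2.5.18."""
    return term(n + 1) / term(n)

for n in range(1, 21):
    assert ratio(n) == Fraction(2, n + 1)

print([str(ratio(n)) for n in (1, 2, 5, 10, 20)])
```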
2.5.7 Exercises
Exercise 2.5.1: For $r \neq 1$, prove
\[ \sum_{k=0}^{n-1} r^k = \frac{1 - r^n}{1 - r}. \]
Hint: Let $s := \sum_{k=0}^{n-1} r^k$, then compute $s(1 - r) = s - rs$, and solve for $s$.
Exercise 2.5.2: Prove that for $-1 < r < 1$ we have
\[ \sum_{n=0}^{\infty} r^n = \frac{1}{1 - r}. \]
Hint: Use the previous exercise.
Exercise 2.5.3: Decide the convergence or divergence of the following series.
a) $\sum_{n=1}^{\infty} \frac{3}{9n+1}$
b) $\sum_{n=1}^{\infty} \frac{1}{2n-1}$
c) $\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2}$
d) $\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$
e) $\sum_{n=1}^{\infty} n e^{-n^2}$
Exercise 2.5.4:
a) Prove that if $\sum_{n=1}^{\infty} x_n$ converges, then $\sum_{n=1}^{\infty} (x_{2n} + x_{2n+1})$ also converges.
b) Find an explicit example where the converse does not hold.
Exercise 2.5.5: For $j = 1, 2, \ldots, n$, let $\{x_{j,k}\}_{k=1}^{\infty}$ denote $n$ sequences. Suppose that for each $j$
\[ \sum_{k=1}^{\infty} x_{j,k} \]
is convergent. Then show
\[ \sum_{j=1}^{n} \left( \sum_{k=1}^{\infty} x_{j,k} \right) = \sum_{k=1}^{\infty} \left( \sum_{j=1}^{n} x_{j,k} \right). \]
Exercise 2.5.6: Prove the following stronger version of the ratio test: Let $\sum x_n$ be a series.
a) If there is an $N$ and a $\rho < 1$ such that for all $n \geq N$ we have $\frac{|x_{n+1}|}{|x_n|} < \rho$, then the series converges.
b) If there is an $N$ such that for all $n \geq N$ we have $\frac{|x_{n+1}|}{|x_n|} \geq 1$, then the series diverges.
Exercise 2.5.7: Let $\{x_n\}$ be a decreasing sequence such that $\sum x_n$ converges. Show that $\lim_{n\to\infty} n x_n = 0$.
Chapter 3
Continuous Functions
3.1 Limits of functions
Note: 3 lectures
Before we can deﬁne continuity of functions, we need to visit a somewhat more general notion
of a limit. That is, given a function f : S →R, we want to see how f (x) behaves as x tends to a
certain point.
3.1.1 Cluster points
First, let us return to a concept we have previously seen in an exercise.
Definition 3.1.1. Let $S \subset \mathbb{R}$ be a set. A number $x \in \mathbb{R}$ is called a cluster point of $S$ if for every
$\epsilon > 0$, the set $(x - \epsilon, x + \epsilon) \cap S \setminus \{x\}$ is not empty.
That is, $x$ is a cluster point of $S$ if there are points of $S$ arbitrarily close to $x$. Another way of
phrasing the definition is to say that $x$ is a cluster point of $S$ if for every $\epsilon > 0$, there exists a $y \in S$
such that $y \neq x$ and $|x - y| < \epsilon$.
Let us see some examples.
(i) The set $\{1/n : n \in \mathbb{N}\}$ has a unique cluster point zero.
(ii) The cluster points of the open interval $(0, 1)$ are all points in the closed interval $[0, 1]$.
(iii) For the set $\mathbb{Q}$, the set of cluster points is the whole real line $\mathbb{R}$.
(iv) For the set $[0, 1) \cup \{2\}$, the set of cluster points is the interval $[0, 1]$.
(v) The set $\mathbb{N}$ has no cluster points in $\mathbb{R}$.
Proposition 3.1.2. Let $S \subset \mathbb{R}$. Then $x \in \mathbb{R}$ is a cluster point of $S$ if and only if there exists a convergent sequence of numbers $\{x_n\}$ such that $x_n \neq x$, $x_n \in S$, and $\lim x_n = x$.
Proof. First suppose that $x$ is a cluster point of $S$. For any $n \in \mathbb{N}$, we pick $x_n$ to be an arbitrary point of $(x - 1/n, x + 1/n) \cap S \setminus \{x\}$, which we know is nonempty because $x$ is a cluster point of $S$. Then $x_n$ is within $1/n$ of $x$, that is,
$$|x - x_n| < 1/n.$$
As $\{1/n\}$ converges to zero, $\{x_n\}$ converges to $x$.
On the other hand, if we start with a sequence of numbers $\{x_n\}$ in $S$ converging to $x$ such that $x_n \neq x$ for all $n$, then for every $\epsilon > 0$ there is an $M$ such that in particular $|x_M - x| < \epsilon$. That is, $x_M \in (x-\epsilon, x+\epsilon) \cap S \setminus \{x\}$, so this set is nonempty and $x$ is a cluster point of $S$.
3.1.2 Limits of functions
If a function $f$ is defined on a set $S$ and $c$ is a cluster point of $S$, then we can define the limit of $f(x)$ as $x$ gets close to $c$. Do note that it is irrelevant for the definition whether $f$ is defined at $c$ or not. Furthermore, even if the function is defined at $c$, the limit of the function as $x$ goes to $c$ could very well be different from $f(c)$.
Definition 3.1.3. Let $f \colon S \to \mathbb{R}$ be a function and $c$ be a cluster point of $S$. Suppose that there exists an $L \in \mathbb{R}$ such that for every $\epsilon > 0$, there exists a $\delta > 0$ such that whenever $x \in S \setminus \{c\}$ and $|x - c| < \delta$, then
$$|f(x) - L| < \epsilon.$$
In this case we say that $f(x)$ converges to $L$ as $x$ goes to $c$. We also say that $L$ is the limit of $f(x)$ as $x$ goes to $c$. We write
$$\lim_{x \to c} f(x) := L,$$
or
$$f(x) \to L \quad \text{as} \quad x \to c.$$
If no such $L$ exists, then we say that the limit does not exist or that $f$ diverges at $c$.
Again the notation and language we are using above assumes that the limit is unique even though
we have not yet proved that. Let us do that now.
Proposition 3.1.4. Let $c$ be a cluster point of $S \subset \mathbb{R}$ and let $f \colon S \to \mathbb{R}$ be a function such that $f(x)$ converges as $x$ goes to $c$. Then the limit of $f(x)$ as $x$ goes to $c$ is unique.
Proof. Let $L_1$ and $L_2$ be two numbers that both satisfy the definition. Take an $\epsilon > 0$ and find a $\delta_1 > 0$ such that $|f(x) - L_1| < \epsilon/2$ for all $x \in S$ with $|x - c| < \delta_1$ and $x \neq c$. Also find $\delta_2 > 0$ such that $|f(x) - L_2| < \epsilon/2$ for all $x \in S$ with $|x - c| < \delta_2$ and $x \neq c$. Put $\delta := \min\{\delta_1, \delta_2\}$. Suppose that $x \in S$, $|x - c| < \delta$, and $x \neq c$; such an $x$ exists because $c$ is a cluster point of $S$. Then
$$|L_1 - L_2| = |L_1 - f(x) + f(x) - L_2| \leq |L_1 - f(x)| + |f(x) - L_2| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
As $|L_1 - L_2| < \epsilon$ for arbitrary $\epsilon > 0$, we get $L_1 = L_2$.
Example 3.1.5: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined as $f(x) := x^2$. Then
$$\lim_{x \to c} f(x) = \lim_{x \to c} x^2 = c^2.$$
Proof: First let $c$ be fixed. Let $\epsilon > 0$ be given. Take
$$\delta := \min\left\{ 1, \frac{\epsilon}{2|c|+1} \right\}.$$
Take $x \neq c$ such that $|x - c| < \delta$. In particular, $|x - c| < 1$. Then by the reverse triangle inequality we get
$$|x| - |c| \leq |x - c| < 1.$$
Adding $2|c|$ to both sides we obtain $|x| + |c| < 2|c| + 1$. We can now compute
$$\begin{aligned}
|f(x) - c^2| &= |x^2 - c^2| \\
&= |(x+c)(x-c)| \\
&= |x+c| \, |x-c| \\
&\leq (|x| + |c|) \, |x-c| \\
&< (2|c|+1) \, |x-c| \\
&< (2|c|+1) \frac{\epsilon}{2|c|+1} = \epsilon.
\end{aligned}$$
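The choice of $\delta$ in this proof can be spot-checked numerically. The following Python sketch is only an illustration and not part of the proof: the helper names `delta_for` and `worst_error` are hypothetical, and random sampling in floating point can suggest the inequality $|x^2 - c^2| < \epsilon$ but can never establish it.

```python
import random


def delta_for(c: float, eps: float) -> float:
    """The delta from the proof of Example 3.1.5: min{1, eps / (2|c| + 1)}."""
    return min(1.0, eps / (2.0 * abs(c) + 1.0))


def worst_error(c: float, eps: float, trials: int = 10_000, seed: int = 0) -> float:
    """Largest |x^2 - c^2| seen over random x with 0 < |x - c| < delta."""
    rng = random.Random(seed)
    delta = delta_for(c, eps)
    worst = 0.0
    for _ in range(trials):
        # Sample strictly inside the interval (c - delta, c + delta).
        x = c + 0.999 * delta * rng.uniform(-1.0, 1.0)
        if x == c:
            continue  # the definition of the limit ignores x = c
        worst = max(worst, abs(x * x - c * c))
    return worst


if __name__ == "__main__":
    for c in (-3.0, 0.0, 0.5, 10.0):
        for eps in (1.0, 1e-3):
            print(c, eps, delta_for(c, eps), worst_error(c, eps))
```

In every row printed, the last number should stay below the given `eps`, which is what the chain of inequalities above predicts.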
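Example 3.1.6: Let $S := [0, 1)$. Define
$$f(x) := \begin{cases} x & \text{if } x > 0, \\ 1 & \text{if } x = 0. \end{cases}$$
Then
$$\lim_{x \to 0} f(x) = 0,$$
even though $f(0) = 1$.
Proof: Let $\epsilon > 0$ be given. Let $\delta := \epsilon$. Then for $x \in S$, $x \neq 0$, and $|x - 0| < \delta$ we get
$$|f(x) - 0| = |x| < \delta = \epsilon.$$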
3.1.3 Sequential limits
Let us connect the limit as deﬁned above with limits of sequences.
Lemma 3.1.7. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ be a function.
Then $f(x) \to L$ as $x \to c$ if and only if for every sequence $\{x_n\}$ of numbers such that $x_n \in S$, $x_n \neq c$, and $\lim x_n = c$, the sequence $\{f(x_n)\}$ converges to $L$.
Proof. Suppose that $f(x) \to L$ as $x \to c$. Now suppose that $\{x_n\}$ is a sequence as in the lemma. We wish to show that $\{f(x_n)\}$ converges to $L$. Let $\epsilon > 0$ be given. Find a $\delta > 0$ such that if $x \in S \cap (c-\delta, c+\delta) \setminus \{c\}$, then we have $|f(x) - L| < \epsilon$. We know that $\{x_n\}$ converges to $c$, hence find an $M$ such that for $n \geq M$ we have that $|x_n - c| < \delta$. Therefore $x_n \in S \cap (c-\delta, c+\delta) \setminus \{c\}$, and thus
$$|f(x_n) - L| < \epsilon.$$
Thus $\{f(x_n)\}$ converges to $L$.
For the other direction, we will use proof by contrapositive. Suppose that it is not true that $f(x) \to L$ as $x \to c$. The negation of the definition is that there exists an $\epsilon > 0$ such that for every $\delta > 0$ there exists an $x \in S$ with $|x - c| < \delta$, $x \neq c$, and $|f(x) - L| \geq \epsilon$.
Let us use $1/n$ for $\delta$ in the above statement. We have that for every $n$, there exists a point $x_n \in S$, $x_n \neq c$ and $|x_n - c| < 1/n$, such that $|f(x_n) - L| \geq \epsilon$. Then $\{x_n\}$ converges to $c$, but $\{f(x_n)\}$ does not converge to $L$. This is precisely the negation of the statement that $\{f(x_n)\}$ converges to $L$ for every such sequence. And we are done.
Example 3.1.8: $\lim_{x \to 0} \sin(1/x)$ does not exist, but $\lim_{x \to 0} x \sin(1/x) = 0$. See Figure 3.1.
Figure 3.1: Graphs of $\sin(1/x)$ and $x \sin(1/x)$. Note that the computer cannot properly graph $\sin(1/x)$ near zero as it oscillates too fast.
Proof: Let us work with $\sin(1/x)$ first. Let us define a sequence $x_n := \frac{1}{\pi n + \pi/2}$. It is not hard to see that $\lim x_n = 0$. Furthermore,
$$\sin(1/x_n) = \sin(\pi n + \pi/2) = (-1)^n.$$
Therefore, $\{\sin(1/x_n)\}$ does not converge. Thus, by Lemma 3.1.7, $\lim_{x \to 0} \sin(1/x)$ does not exist.
Now let us look at $x \sin(1/x)$. Let $\{x_n\}$ be a sequence such that $x_n \neq 0$ for all $n$ and such that $\lim x_n = 0$. Notice that $|\sin(t)| \leq 1$ for any $t \in \mathbb{R}$. Therefore,
$$|x_n \sin(1/x_n) - 0| = |x_n| \, |\sin(1/x_n)| \leq |x_n|.$$
As $x_n$ goes to $0$, $|x_n|$ goes to zero, and hence $\{x_n \sin(1/x_n)\}$ converges to zero. By Lemma 3.1.7, $\lim_{x \to 0} x \sin(1/x) = 0$.
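The two behaviors in this example can be seen along the specific sequence $x_n = 1/(\pi n + \pi/2)$ used in the proof. The short Python sketch below tabulates a few terms; the function names `x_n` and `sample` are hypothetical, and the values are floating-point approximations, so they illustrate the argument rather than replace it.

```python
import math


def x_n(n: int) -> float:
    """The sequence x_n = 1 / (pi*n + pi/2) from the proof; it tends to 0."""
    return 1.0 / (math.pi * n + math.pi / 2.0)


def sample(n_max: int = 8):
    """Rows (n, x_n, sin(1/x_n), x_n * sin(1/x_n)) for n = 1, ..., n_max."""
    rows = []
    for n in range(1, n_max + 1):
        x = x_n(n)
        s = math.sin(1.0 / x)
        rows.append((n, x, s, x * s))
    return rows


if __name__ == "__main__":
    for n, x, s, xs in sample():
        print(f"n={n}  x_n={x:.6f}  sin(1/x_n)={s:+.6f}  x_n*sin(1/x_n)={xs:+.6f}")
```

The third column keeps alternating in sign with absolute value close to 1, while the fourth is squeezed between $-|x_n|$ and $|x_n|$ and shrinks toward zero.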
Using the lemma above we can start applying anything we know about sequential limits to limits of functions. Let us give a few important examples.
Corollary 3.1.9. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ and $g \colon S \to \mathbb{R}$ be functions. Suppose that the limits of $f(x)$ and $g(x)$ as $x$ goes to $c$ both exist, and that
$$f(x) \leq g(x) \quad \text{for all } x \in S.$$
Then
$$\lim_{x \to c} f(x) \leq \lim_{x \to c} g(x).$$
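Proof. Let $\{x_n\}$ be a sequence of numbers from $S \setminus \{c\}$ that converges to $c$. Such a sequence exists by Proposition 3.1.2. Let
$$L_1 := \lim_{x \to c} f(x), \quad \text{and} \quad L_2 := \lim_{x \to c} g(x).$$
By Lemma 3.1.7 we know $\{f(x_n)\}$ converges to $L_1$ and $\{g(x_n)\}$ converges to $L_2$. As $f(x_n) \leq g(x_n)$ for all $n$, we obtain $L_1 \leq L_2$ using Lemma 2.2.3.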
By applying constant functions, we get the following corollary. The proof is left as an exercise.
Corollary 3.1.10. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ be a function, and suppose that the limit of $f(x)$ as $x$ goes to $c$ exists. Suppose that there are two real numbers $a$ and $b$ such that
$$a \leq f(x) \leq b \quad \text{for all } x \in S.$$
Then
$$a \leq \lim_{x \to c} f(x) \leq b.$$
By applying Lemma 3.1.7 in the same way as above we also get the following corollaries, whose
proofs are again left as an exercise.
Corollary 3.1.11. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$, $g \colon S \to \mathbb{R}$, and $h \colon S \to \mathbb{R}$ be functions. Suppose that
$$f(x) \leq g(x) \leq h(x) \quad \text{for all } x \in S,$$
and the limits of $f(x)$ and $h(x)$ as $x$ goes to $c$ both exist, and
$$\lim_{x \to c} f(x) = \lim_{x \to c} h(x).$$
Then the limit of $g(x)$ as $x$ goes to $c$ exists and
$$\lim_{x \to c} g(x) = \lim_{x \to c} f(x) = \lim_{x \to c} h(x).$$
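Corollary 3.1.12. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ and $g \colon S \to \mathbb{R}$ be functions. Suppose that the limits of $f(x)$ and $g(x)$ as $x$ goes to $c$ both exist. Then
(i) $\lim_{x \to c} \bigl( f(x) + g(x) \bigr) = \Bigl( \lim_{x \to c} f(x) \Bigr) + \Bigl( \lim_{x \to c} g(x) \Bigr)$.
(ii) $\lim_{x \to c} \bigl( f(x) - g(x) \bigr) = \Bigl( \lim_{x \to c} f(x) \Bigr) - \Bigl( \lim_{x \to c} g(x) \Bigr)$.
(iii) $\lim_{x \to c} \bigl( f(x) g(x) \bigr) = \Bigl( \lim_{x \to c} f(x) \Bigr) \Bigl( \lim_{x \to c} g(x) \Bigr)$.
(iv) If $\lim_{x \to c} g(x) \neq 0$, and $g(x) \neq 0$ for all $x \in S$, then
$$\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{\lim_{x \to c} f(x)}{\lim_{x \to c} g(x)}.$$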
3.1.4 Restrictions and limits
It is not necessary to always consider all of S. Sometimes we may be able to just work with the
function deﬁned on a smaller set.
Definition 3.1.13. Let $f \colon S \to \mathbb{R}$ be a function. Let $A \subset S$. Define the function $f|_A \colon A \to \mathbb{R}$ by
$$f|_A(x) := f(x) \quad \text{for } x \in A.$$
The function $f|_A$ is called the restriction of $f$ to $A$.
The function $f|_A$ is simply the function $f$ taken on a smaller domain. The following proposition is the analogue of taking a tail of a sequence.
Proposition 3.1.14. Let $S \subset \mathbb{R}$ and let $c \in \mathbb{R}$. Let $A \subset S$ be a subset such that there is some $\alpha > 0$ such that $A \cap (c-\alpha, c+\alpha) = S \cap (c-\alpha, c+\alpha)$. Let $f \colon S \to \mathbb{R}$ be a function.
(i) The point $c$ is a cluster point of $A$ if and only if $c$ is a cluster point of $S$.
(ii) Supposing $c$ is a cluster point of $S$, then $f(x) \to L$ as $x \to c$ if and only if $f|_A(x) \to L$ as $x \to c$.
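Proof. First let $c$ be a cluster point of $A$. Since $A \subset S$, if $(A \setminus \{c\}) \cap (c-\epsilon, c+\epsilon)$ is nonempty, then $(S \setminus \{c\}) \cap (c-\epsilon, c+\epsilon)$ is nonempty. This holds for every $\epsilon > 0$, and thus $c$ is a cluster point of $S$. On the other hand, if $c$ is a cluster point of $S$, then for $\epsilon > 0$ such that $\epsilon < \alpha$ we get that $(A \setminus \{c\}) \cap (c-\epsilon, c+\epsilon) = (S \setminus \{c\}) \cap (c-\epsilon, c+\epsilon)$, which is nonempty. This is true for all $\epsilon < \alpha$, and these sets only grow as $\epsilon$ grows, hence $(A \setminus \{c\}) \cap (c-\epsilon, c+\epsilon)$ must be nonempty for all $\epsilon > 0$. Thus $c$ is a cluster point of $A$.
Now suppose that $f(x) \to L$ as $x \to c$. Hence for every $\epsilon > 0$ there is a $\delta > 0$ such that if $x \in S \setminus \{c\}$ and $|x - c| < \delta$, then $|f(x) - L| < \epsilon$. As $A \subset S$, if $x$ is in $A \setminus \{c\}$, then $x$ is in $S \setminus \{c\}$, and hence $f|_A(x) \to L$ as $x \to c$.
Now suppose that $f|_A(x) \to L$ as $x \to c$. Hence for every $\epsilon > 0$ there is a $\delta > 0$ such that if $x \in A \setminus \{c\}$ and $|x - c| < \delta$, then $\bigl| f|_A(x) - L \bigr| < \epsilon$. If we picked $\delta > \alpha$, then set $\delta := \alpha$. If $|x - c| < \delta$, then $x \in S \setminus \{c\}$ if and only if $x \in A \setminus \{c\}$. Thus $|f(x) - L| = \bigl| f|_A(x) - L \bigr| < \epsilon$.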
3.1.5 Exercises
Exercise 3.1.1: Find the limit or prove that the limit does not exist.
a) $\lim_{x \to c} \sqrt{x}$, for $c \geq 0$.
b) $\lim_{x \to c} x^2 + x + 1$, for any $c \in \mathbb{R}$.
c) $\lim_{x \to 0} x^2 \cos(1/x)$
d) $\lim_{x \to 0} \sin(1/x) \cos(1/x)$
e) $\lim_{x \to 0} \sin(x) \cos(1/x)$
Exercise 3.1.2: Prove Corollary 3.1.10.
Exercise 3.1.3: Prove Corollary 3.1.11.
Exercise 3.1.4: Prove Corollary 3.1.12.
Exercise 3.1.5: Let A ⊂S. Show that if c is a cluster point of A, then c is a cluster point of S. Note
the difference from Proposition 3.1.14.
Exercise 3.1.6: Let $A \subset S$. Suppose that $c$ is a cluster point of $A$ and it is also a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ be a function. Show that if $f(x) \to L$ as $x \to c$, then $f|_A(x) \to L$ as $x \to c$. Note the difference from Proposition 3.1.14.
Exercise 3.1.7: Find an example of a function $f \colon [-1, 1] \to \mathbb{R}$ such that for $A := [0, 1]$, the restriction $f|_A(x) \to 0$ as $x \to 0$, but the limit of $f(x)$ as $x \to 0$ does not exist. Note why you cannot apply Proposition 3.1.14.
Exercise 3.1.8: Find example functions f and g such that the limit of neither f (x) nor g(x) exists
as x →0, but such that the limit of f (x) +g(x) exists as x →0.
3.2 Continuous functions
Note: 2.5 lectures
You have undoubtedly heard of continuous functions in your schooling. A high school criterion
for this concept is that a function is continuous if we can draw its graph without lifting the pen from
the paper. While that intuitive concept may be useful in simple situations, we will require a rigorous
concept. The following deﬁnition took three great mathematicians (Bolzano, Cauchy, and ﬁnally
Weierstrass) to get correctly and its ﬁnal form dates only to the late 1800s.
3.2.1 Deﬁnition and basic properties
Definition 3.2.1. Let $S \subset \mathbb{R}$. Let $f \colon S \to \mathbb{R}$ be a function. Let $c \in S$ be a number. We say that $f$ is continuous at $c$ if for every $\epsilon > 0$ there is a $\delta > 0$ such that whenever $x \in S$ and $|x - c| < \delta$, then $|f(x) - f(c)| < \epsilon$.
When $f \colon S \to \mathbb{R}$ is continuous at all $c \in S$, then we simply say that $f$ is a continuous function.
This definition is one of the most important to get right in analysis, and it is not an easy one to understand. Note that $\delta$ not only depends on $\epsilon$, but also on $c$. That is, we need not be able to pick one $\delta$ that works for all $c \in S$.
Sometimes we say that $f$ is continuous on $A \subset S$. Then we mean that $f$ is continuous at all $c \in A$. It is left as an exercise to prove that if $f$ is continuous on $A$, then $f|_A$ is continuous.
It is no accident that the deﬁnition of a continuous function is similar to the deﬁnition of a limit
of a function. The main feature of continuous functions is that these are precisely the functions that
behave nicely with limits.
Proposition 3.2.2. Suppose that $f \colon S \to \mathbb{R}$ is a function and $c \in S$. Then
(i) If $c$ is not a cluster point of $S$, then $f$ is continuous at $c$.
(ii) If $c$ is a cluster point of $S$, then $f$ is continuous at $c$ if and only if the limit of $f(x)$ as $x \to c$ exists and
$$\lim_{x \to c} f(x) = f(c).$$
(iii) $f$ is continuous at $c$ if and only if for every sequence $\{x_n\}$ where $x_n \in S$ and $\lim x_n = c$, the sequence $\{f(x_n)\}$ converges to $f(c)$.
Proof. Let us start with the first item. Suppose that $c$ is not a cluster point of $S$. Then there exists a $\delta > 0$ such that $S \cap (c-\delta, c+\delta) = \{c\}$. Therefore, for any $\epsilon > 0$, simply pick this given $\delta$. The only $x \in S$ such that $|x - c| < \delta$ is $x = c$. Therefore $|f(x) - f(c)| = |f(c) - f(c)| = 0 < \epsilon$.
Let us move to the second item. Suppose that $c$ is a cluster point of $S$. Let us first suppose that $\lim_{x \to c} f(x) = f(c)$. Then for every $\epsilon > 0$ there is a $\delta > 0$ such that if $x \in S \setminus \{c\}$ and $|x - c| < \delta$, then $|f(x) - f(c)| < \epsilon$. As $|f(c) - f(c)| = 0 < \epsilon$, the definition of continuity at $c$ is satisfied. On the other hand, suppose that $f$ is continuous at $c$. For every $\epsilon > 0$, there exists a $\delta > 0$ such that for $x \in S$ where $|x - c| < \delta$ we have $|f(x) - f(c)| < \epsilon$. Then the statement is, of course, still true if $x \in S \setminus \{c\} \subset S$. Therefore $\lim_{x \to c} f(x) = f(c)$.
For the third item, suppose that $f$ is continuous at $c$. Let $\{x_n\}$ be a sequence such that $x_n \in S$ and $\lim x_n = c$. Let $\epsilon > 0$ be given. Find $\delta > 0$ such that $|f(x) - f(c)| < \epsilon$ for all $x \in S$ such that $|x - c| < \delta$. Now find an $M \in \mathbb{N}$ such that for $n \geq M$ we have $|x_n - c| < \delta$. Then for $n \geq M$ we have that $|f(x_n) - f(c)| < \epsilon$, so $\{f(x_n)\}$ converges to $f(c)$.
Let us prove the converse by contrapositive. Suppose that $f$ is not continuous at $c$. This means that there exists an $\epsilon > 0$ such that for all $\delta > 0$, there exists an $x \in S$ such that $|x - c| < \delta$ and $|f(x) - f(c)| \geq \epsilon$. Let us define a sequence $\{x_n\}$ as follows. Let $x_n \in S$ be such that $|x_n - c| < 1/n$ and $|f(x_n) - f(c)| \geq \epsilon$. As $f$ is not continuous at $c$, we can do this. Now $\{x_n\}$ is a sequence of numbers in $S$ such that $\lim x_n = c$ and such that $|f(x_n) - f(c)| \geq \epsilon$ for all $n \in \mathbb{N}$. Thus $\{f(x_n)\}$ does not converge to $f(c)$ (it may or may not converge, but it definitely does not converge to $f(c)$).
The last item in the proposition is particularly powerful. It allows us to quickly apply what we
know about limits of sequences to continuous functions and even to prove that certain functions are
continuous.
Example 3.2.3: $f \colon (0, \infty) \to \mathbb{R}$ defined by $f(x) := 1/x$ is continuous.
Proof: Fix $c \in (0, \infty)$. Let $\{x_n\}$ be a sequence in $(0, \infty)$ such that $\lim x_n = c$. Then we know that
$$\lim_{n \to \infty} \frac{1}{x_n} = \frac{1}{\lim x_n} = \frac{1}{c} = f(c).$$
Thus $f$ is continuous at $c$. As $f$ is continuous at all $c \in (0, \infty)$, $f$ is continuous.
We have previously shown that $\lim_{x \to c} x^2 = c^2$ directly. Therefore the function $x^2$ is continuous. However, we can use the continuity of algebraic operations with respect to limits of sequences, which we proved in the previous chapter, to prove a much more general result.
Proposition 3.2.4. Let $f \colon \mathbb{R} \to \mathbb{R}$ be a polynomial. That is,
$$f(x) = a_d x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0,$$
for some constants $a_0, a_1, \ldots, a_d$. Then $f$ is continuous.
Proof. Fix $c \in \mathbb{R}$. Let $\{x_n\}$ be a sequence such that $\lim x_n = c$. Then
$$\begin{aligned}
\lim_{n \to \infty} f(x_n) &= \lim_{n \to \infty} \left( a_d x_n^d + a_{d-1} x_n^{d-1} + \cdots + a_1 x_n + a_0 \right) \\
&= a_d (\lim x_n)^d + a_{d-1} (\lim x_n)^{d-1} + \cdots + a_1 (\lim x_n) + a_0 \\
&= a_d c^d + a_{d-1} c^{d-1} + \cdots + a_1 c + a_0 = f(c).
\end{aligned}$$
Thus $f$ is continuous at $c$. As $f$ is continuous at all $c \in \mathbb{R}$, $f$ is continuous.
By similar reasoning, or by appealing to Corollary 3.1.12 we can prove the following. The
details of the proof are left as an exercise.
Proposition 3.2.5. Let $f \colon S \to \mathbb{R}$ and $g \colon S \to \mathbb{R}$ be functions continuous at $c \in S$.
(i) The function $h \colon S \to \mathbb{R}$ defined by $h(x) := f(x) + g(x)$ is continuous at $c$.
(ii) The function $h \colon S \to \mathbb{R}$ defined by $h(x) := f(x) - g(x)$ is continuous at $c$.
(iii) The function $h \colon S \to \mathbb{R}$ defined by $h(x) := f(x) g(x)$ is continuous at $c$.
(iv) If $g(x) \neq 0$ for all $x \in S$, then the function $h \colon S \to \mathbb{R}$ defined by $h(x) := \frac{f(x)}{g(x)}$ is continuous at $c$.
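Example 3.2.6: The functions $\sin(x)$ and $\cos(x)$ are continuous. In the following computations we use the sum-to-product trigonometric identities. We also use the simple facts that $|\sin(x)| \leq |x|$, $|\cos(x)| \leq 1$, and $|\sin(x)| \leq 1$.
$$\begin{aligned}
|\sin(x) - \sin(c)| &= \left| 2 \sin\left( \frac{x-c}{2} \right) \cos\left( \frac{x+c}{2} \right) \right| \\
&= 2 \left| \sin\left( \frac{x-c}{2} \right) \right| \left| \cos\left( \frac{x+c}{2} \right) \right| \\
&\leq 2 \left| \sin\left( \frac{x-c}{2} \right) \right| \\
&\leq 2 \left| \frac{x-c}{2} \right| = |x - c|
\end{aligned}$$
$$\begin{aligned}
|\cos(x) - \cos(c)| &= \left| -2 \sin\left( \frac{x-c}{2} \right) \sin\left( \frac{x+c}{2} \right) \right| \\
&= 2 \left| \sin\left( \frac{x-c}{2} \right) \right| \left| \sin\left( \frac{x+c}{2} \right) \right| \\
&\leq 2 \left| \sin\left( \frac{x-c}{2} \right) \right| \\
&\leq 2 \left| \frac{x-c}{2} \right| = |x - c|
\end{aligned}$$
The claim that $\sin$ and $\cos$ are continuous follows by taking an arbitrary sequence $\{x_n\}$ converging to $c$. Details are left to the reader.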
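One possible way to fill in those details for $\sin$ is sketched below. It assumes that the intended route is the sequential criterion, item (iii) of Proposition 3.2.2, together with the squeeze lemma for sequences from the previous chapter; the argument for $\cos$ would be word for word the same.

```latex
% A sketch under the stated assumptions, not necessarily the author's intended argument.
Let $\{x_n\}$ be a sequence with $\lim x_n = c$.
Applying the estimate $|\sin(x) - \sin(c)| \leq |x - c|$ with $x = x_n$ gives
\[
  0 \leq |\sin(x_n) - \sin(c)| \leq |x_n - c| \qquad \text{for all } n .
\]
As $\lim |x_n - c| = 0$, the squeeze lemma gives $\lim |\sin(x_n) - \sin(c)| = 0$,
that is, $\{\sin(x_n)\}$ converges to $\sin(c)$.
Since the sequence was arbitrary, $\sin$ is continuous at $c$,
and since $c \in \mathbb{R}$ was arbitrary, $\sin$ is continuous.
```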
3.2.2 Composition of continuous functions
You have probably already realized that one of the basic tools in constructing complicated functions out of simple ones is composition. A very useful property of continuous functions is that compositions of continuous functions are again continuous. Recall that for two functions $f$ and $g$, the composition $f \circ g$ is defined by $(f \circ g)(x) := f\bigl(g(x)\bigr)$.
Proposition 3.2.7. Let $A, B \subset \mathbb{R}$ and $f \colon B \to \mathbb{R}$ and $g \colon A \to B$ be functions. If $g$ is continuous at $c \in A$ and $f$ is continuous at $g(c)$, then $f \circ g \colon A \to \mathbb{R}$ is continuous at $c$.
Proof. Let $\{x_n\}$ be a sequence in $A$ such that $\lim x_n = c$. As $g$ is continuous at $c$, $\{g(x_n)\}$ converges to $g(c)$. As $f$ is continuous at $g(c)$, $\{f\bigl(g(x_n)\bigr)\}$ converges to $f\bigl(g(c)\bigr)$. Thus $f \circ g$ is continuous at $c$.
Example 3.2.8: Claim: $\bigl( \sin(1/x) \bigr)^2$ is a continuous function on $(0, \infty)$.
Proof: First note that $1/x$ is a continuous function on $(0, \infty)$ and $\sin(x)$ is a continuous function on $(0, \infty)$ (actually on all of $\mathbb{R}$, but $(0, \infty)$ is the range of $1/x$). Hence the composition $\sin(1/x)$ is continuous. We also know that $x^2$ is continuous on the interval $[-1, 1]$ (the range of $\sin$). Thus the composition $\bigl( \sin(1/x) \bigr)^2$ is also continuous on $(0, \infty)$.
3.2.3 Discontinuous functions
Let us spend a bit of time on discontinuous functions. If we state the contrapositive of the third item of Proposition 3.2.2 as a separate claim, we get an easy-to-use test for discontinuities.
Proposition 3.2.9. Let $f \colon S \to \mathbb{R}$ be a function. Suppose that for some $c \in S$, there exists a sequence $\{x_n\}$, $x_n \in S$, with $\lim x_n = c$ such that $\{f(x_n)\}$ does not converge to $f(c)$ (or does not converge at all). Then $f$ is not continuous at $c$.
We say that $f$ is discontinuous at $c$, or that it has a discontinuity at $c$.
Example 3.2.10: The function $f \colon \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := \begin{cases} -1 & \text{if } x < 0, \\ 1 & \text{if } x \geq 0, \end{cases}$$
is not continuous at $0$.
Proof: Simply take the sequence $\{-1/n\}$. Then $f(-1/n) = -1$ and so $\lim f(-1/n) = -1$, but $f(0) = 1$.
Example 3.2.11: For an extreme example we take the so-called Dirichlet function.
$$f(x) := \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases}$$
The function $f$ is discontinuous at all $c \in \mathbb{R}$.
Proof: Suppose that $c$ is rational. Then we can take a sequence $\{x_n\}$ of irrational numbers such that $\lim x_n = c$. Then $f(x_n) = 0$ and so $\lim f(x_n) = 0$, but $f(c) = 1$. If $c$ is irrational, then take a sequence of rational numbers $\{x_n\}$ that converges to $c$. Then $\lim f(x_n) = 1$ but $f(c) = 0$.
As a ﬁnal example, let us yet again test the limits of your intuition. Can there exist a function
that is continuous on all irrational numbers, but discontinuous at all rational numbers? Note that
there are rational numbers arbitrarily close to any irrational number. But, perhaps strangely, the
answer is yes. The following example is called the Thomae function∗ or the popcorn function.
Example 3.2.12: Let $f \colon (0, 1) \to \mathbb{R}$ be defined by
$$f(x) := \begin{cases} 1/k & \text{if } x = m/k \text{ where } m, k \in \mathbb{N} \text{ and } m \text{ and } k \text{ have no common divisors}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases}$$
Then $f$ is continuous at all $c \in (0, 1)$ that are irrational and is discontinuous at all rational $c$. See the graph of the function in Figure 3.2.
Figure 3.2: Graph of the “popcorn function.”
Proof: Suppose that $c = m/k$ is rational, written in lowest terms. Then take a sequence of irrational numbers $\{x_n\}$ in $(0, 1)$ such that $\lim x_n = c$. Then $\lim f(x_n) = \lim 0 = 0$ but $f(c) = 1/k \neq 0$. So $f$ is discontinuous at $c$.
Now suppose that $c$ is irrational and hence $f(c) = 0$. Take a sequence $\{x_n\}$ of numbers in $(0, 1)$ such that $\lim x_n = c$. For a given $\epsilon > 0$, find $K \in \mathbb{N}$ such that $1/K < \epsilon$ by the Archimedean property. If $m/k$ is written in lowest terms (no common divisors) and $m/k \in (0, 1)$, then $m < k$. It is then obvious that there are only finitely many rational numbers in $(0, 1)$ whose denominator $k$ in lowest terms is less than $K$. As $c$ is irrational, it is not one of these finitely many numbers, so they all lie at some positive distance from $c$. Hence there is an $M$ such that for $n \geq M$, all the rational numbers $x_n$ have a denominator larger than or equal to $K$. Thus for $n \geq M$
$$|f(x_n) - 0| = f(x_n) \leq 1/K < \epsilon.$$
Therefore $f$ is continuous at irrational $c$.
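The counting step in this proof (only finitely many rationals in $(0, 1)$ have a lowest-terms denominator below $K$) can be made concrete with exact rational arithmetic. The Python sketch below is an illustration only: the names `thomae` and `small_denominator_rationals` are hypothetical, and since a computer cannot represent an irrational number exactly, the code evaluates the function at rational points only.

```python
from fractions import Fraction


def thomae(x: Fraction) -> Fraction:
    """Popcorn function at a rational x in (0, 1): 1/k for x = m/k in lowest terms."""
    if not (0 < x < 1):
        raise ValueError("x must lie in (0, 1)")
    # Fraction always stores m/k in lowest terms, so the denominator is k.
    return Fraction(1, x.denominator)


def small_denominator_rationals(K: int) -> list:
    """All rationals in (0, 1) whose lowest-terms denominator is less than K."""
    return sorted({Fraction(m, k) for k in range(2, K) for m in range(1, k)})


if __name__ == "__main__":
    K = 6
    exceptional = small_denominator_rationals(K)
    print(len(exceptional), "rationals in (0, 1) have denominator <", K)
    print([str(q) for q in exceptional])
```

Every rational in $(0, 1)$ outside the printed finite list has denominator at least $K$, so the function value there is at most $1/K$. This is the fact the proof uses once the sequence has moved away from the finitely many exceptional points.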
3.2.4 Exercises
Exercise 3.2.1: Using the definition of continuity directly prove that $f \colon \mathbb{R} \to \mathbb{R}$ defined by $f(x) := x^2$ is continuous.
∗ Named after the German mathematician Johannes Karl Thomae (1840–1921).
Exercise 3.2.2: Using the definition of continuity directly prove that $f \colon (0, \infty) \to \mathbb{R}$ defined by $f(x) := 1/x$ is continuous.
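Exercise 3.2.3: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined by
$$f(x) := \begin{cases} x & \text{if } x \text{ is rational}, \\ x^2 & \text{if } x \text{ is irrational}. \end{cases}$$
Using the definition of continuity directly prove that $f$ is continuous at $1$ and discontinuous at $2$.
Exercise 3.2.4: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined by
$$f(x) := \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Is $f$ continuous? Prove your assertion.
Exercise 3.2.5: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined by
$$f(x) := \begin{cases} x \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Is $f$ continuous? Prove your assertion.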
Exercise 3.2.6: Prove Proposition 3.2.5.
Exercise 3.2.7: Prove the following statement. Let $S \subset \mathbb{R}$ and $A \subset S$. Let $f \colon S \to \mathbb{R}$ be a continuous function. Then the restriction $f|_A$ is continuous.
Exercise 3.2.8: Suppose that $S \subset \mathbb{R}$. Suppose that for some $c \in \mathbb{R}$ and $\alpha > 0$, we have $A = (c-\alpha, c+\alpha) \subset S$. Let $f \colon S \to \mathbb{R}$ be a function. Prove that if $f|_A$ is continuous at $c$, then $f$ is continuous at $c$.
Exercise 3.2.9: Give an example of functions f : R →R and g: R →R such that the function h
deﬁned by h(x) := f (x) +g(x) is continuous, but f and g are not continuous. Can you ﬁnd f and g
that are nowhere continuous, but h is a continuous function?
Exercise 3.2.10: Let f : R →R and g: R →R be continuous functions. Suppose that for all
rational numbers r, f (r) = g(r). Show that f (x) = g(x) for all x.
Exercise 3.2.11: Let f : R →R be continuous. Suppose that f (c) > 0. Show that there exists an
α > 0 such that for all x ∈ (c −α, c +α) we have f (x) > 0.
Exercise 3.2.12: Let f : Z →R be a function. Show that f is continuous.
3.3 Min-max and intermediate value theorems
Note: 1.5 lectures
Let us now state and prove some very important results about continuous functions deﬁned on
the real line. In particular, on closed bounded intervals of the real line.
3.3.1 Min-max theorem
Recall that a function $f \colon [a, b] \to \mathbb{R}$ is bounded if there exists a $B \in \mathbb{R}$ such that $|f(x)| < B$ for all $x \in [a, b]$. We have the following lemma.
Lemma 3.3.1. Let f : [a, b] →R be a continuous function. Then f is bounded.
Proof. Let us prove this by contrapositive. Suppose that $f$ is not bounded. Then for each $n \in \mathbb{N}$, there is an $x_n \in [a, b]$ such that
$$|f(x_n)| \geq n.$$
Now $\{x_n\}$ is a bounded sequence as $a \leq x_n \leq b$. By the Bolzano-Weierstrass theorem, there is a convergent subsequence $\{x_{n_i}\}$. Let $x := \lim x_{n_i}$. Since $a \leq x_{n_i} \leq b$ for all $i$, then $a \leq x \leq b$. The limit $\lim f(x_{n_i})$ does not exist as the sequence is not bounded, since $|f(x_{n_i})| \geq n_i \geq i$. On the other hand, $f(x)$ is a finite number and
$$f(x) = f\left( \lim_{i \to \infty} x_{n_i} \right).$$
Thus $f$ is not continuous at $x$.
The main point will not be just that f is bounded, but the minimum and the maximum are
actually achieved. Recall from calculus that f : S →R achieves an absolute minimum at c ∈ S if
f (x) ≥ f (c) for all x ∈ S.
On the other hand, f achieves an absolute maximum at c ∈ S if
f (x) ≤ f (c) for all x ∈ S.
We simply say that f achieves an absolute minimum or an absolute maximum on S if such a c ∈ S
exists. It turns out that if S is a closed and bounded interval, then f must have an absolute minimum
and an absolute maximum.
Theorem 3.3.2 (Minimum-maximum theorem). Let $f \colon [a, b] \to \mathbb{R}$ be a continuous function. Then $f$ achieves both an absolute minimum and an absolute maximum on $[a, b]$.
Proof. We have shown that $f$ is bounded by the lemma. Therefore, the set $f([a, b]) = \{ f(x) : x \in [a, b] \}$ has a supremum and an infimum. From what we know about suprema and infima, there exist sequences in the set $f([a, b])$ that approach them. That is, there are sequences $\{f(x_n)\}$ and $\{f(y_n)\}$, where $x_n, y_n$ are in $[a, b]$, such that
$$\lim_{n \to \infty} f(x_n) = \inf f([a, b]) \quad \text{and} \quad \lim_{n \to \infty} f(y_n) = \sup f([a, b]).$$
We are not done yet; we need to find where the minimum and the maximum are. The problem is that the sequences $\{x_n\}$ and $\{y_n\}$ need not converge. We know that $\{x_n\}$ and $\{y_n\}$ are bounded (their elements belong to a bounded interval $[a, b]$). We apply the Bolzano-Weierstrass theorem. Hence there exist convergent subsequences $\{x_{n_i}\}$ and $\{y_{m_i}\}$. Let
$$x := \lim_{i \to \infty} x_{n_i} \quad \text{and} \quad y := \lim_{i \to \infty} y_{m_i}.$$
Then as $a \leq x_{n_i} \leq b$, we have that $a \leq x \leq b$. Similarly $a \leq y \leq b$, so $x$ and $y$ are in $[a, b]$. Now we use the fact that the limit of a subsequence of a convergent sequence is the same as the limit of the sequence, together with the continuity of $f$, to obtain
$$\inf f([a, b]) = \lim_{n \to \infty} f(x_n) = \lim_{i \to \infty} f(x_{n_i}) = f\left( \lim_{i \to \infty} x_{n_i} \right) = f(x).$$
Similarly,
$$\sup f([a, b]) = \lim_{n \to \infty} f(y_n) = \lim_{i \to \infty} f(y_{m_i}) = f\left( \lim_{i \to \infty} y_{m_i} \right) = f(y).$$
Therefore, $f$ achieves an absolute minimum at $x$ and $f$ achieves an absolute maximum at $y$.
Example 3.3.3: The function $f(x) := x^2 + 1$ defined on the interval $[-1, 2]$ achieves a minimum at $x = 0$, where $f(0) = 1$. It achieves a maximum at $x = 2$, where $f(2) = 5$. Do note that the domain of definition matters. If we instead took the domain to be $[-10, 10]$, then $x = 2$ would no longer be a maximum of $f$. Instead the maximum would be achieved at both $x = 10$ and $x = -10$.
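A quick numerical look at this example is sketched below. It samples $f(x) = x^2 + 1$ on an evenly spaced grid, which is an assumption of the sketch rather than anything the theorem requires; sampling can only suggest where extrema lie, while the theorem is what guarantees that they exist. The name `sampled_extrema` is hypothetical.

```python
def f(x: float) -> float:
    """The function of Example 3.3.3."""
    return x * x + 1.0


def sampled_extrema(a: float, b: float, n: int = 300):
    """Smallest and largest sampled values of f on an even grid of [a, b].

    Returns ((x_min, f(x_min)), (x_max, f(x_max))) over the n + 1 grid points.
    """
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    x_min = min(xs, key=f)
    x_max = max(xs, key=f)
    return (x_min, f(x_min)), (x_max, f(x_max))


if __name__ == "__main__":
    print(sampled_extrema(-1.0, 2.0))
    print(sampled_extrema(-10.0, 10.0))
```

On $[-1, 2]$ the sampled maximum sits at the right endpoint, while on $[-10, 10]$ it moves to an endpoint of the larger interval, in line with the remark that the domain matters.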
Let us show by examples that the different hypotheses of the theorem are truly necessary.
Example 3.3.4: The function $f(x) := x$, defined on the whole real line, achieves neither a minimum nor a maximum. So it is important that we are looking at a bounded interval.
Example 3.3.5: The function $f(x) := 1/x$, defined on $(0, 1)$, achieves neither a minimum nor a maximum. The values of the function are unbounded as we approach $0$. Also as we approach $x = 1$, the values of the function approach $1$, but $f(x) > 1$ for all $x \in (0, 1)$. There is no $x \in (0, 1)$ such that $f(x) = 1$. So it is important that we are looking at a closed interval.
Example 3.3.6: Continuity is important. Define $f \colon [0, 1] \to \mathbb{R}$ by $f(x) := 1/x$ for $x > 0$ and let $f(0) := 0$. Then the function does not achieve a maximum. The problem is that the function is not continuous at $0$.
3.3.2 Bolzano’s intermediate value theorem
Bolzano’s intermediate value theorem is one of the cornerstones of analysis. It is sometimes called simply the intermediate value theorem, or just Bolzano’s theorem. To prove Bolzano’s theorem we prove the following simpler lemma.
Lemma 3.3.7. Let f : [a, b] →R be a continuous function. Suppose that f (a) < 0 and f (b) > 0.
Then there exists a c ∈ [a, b] such that f (c) = 0.
Proof. The proof will follow by defining two sequences $\{a_n\}$ and $\{b_n\}$ inductively as follows.
(i) Let $a_1 := a$ and $b_1 := b$.
(ii) If $f\left( \frac{a_n + b_n}{2} \right) \geq 0$, let $a_{n+1} := a_n$ and $b_{n+1} := \frac{a_n + b_n}{2}$.
(iii) If $f\left( \frac{a_n + b_n}{2} \right) < 0$, let $a_{n+1} := \frac{a_n + b_n}{2}$ and $b_{n+1} := b_n$.
From the definition of the two sequences it is obvious that if $a_n < b_n$, then $a_{n+1} < b_{n+1}$. Thus by induction $a_n < b_n$ for all $n$. Once we know that fact we can see that $a_n \leq a_{n+1}$ and $b_n \geq b_{n+1}$ for all $n$. Finally we notice that
$$b_{n+1} - a_{n+1} = \frac{b_n - a_n}{2}.$$
By induction we can see that
$$b_n - a_n = \frac{b_1 - a_1}{2^{n-1}} = 2^{1-n} (b - a).$$
As $\{a_n\}$ and $\{b_n\}$ are monotone and bounded, they converge. Let $c := \lim a_n$ and $d := \lim b_n$. As $a_n < b_n$ for all $n$, then $c \leq d$. Furthermore, as $a_n$ is increasing and $b_n$ is decreasing, $c$ is the supremum of the $a_n$ and $d$ is the infimum of the $b_n$. Thus $d - c \leq b_n - a_n$ for all $n$. Thus
$$|d - c| = d - c \leq b_n - a_n \leq 2^{1-n} (b - a)$$
for all $n$. As $2^{1-n} (b - a) \to 0$ as $n \to \infty$, we see that $c = d$. By construction, for all $n$
$$f(a_n) < 0 \quad \text{and} \quad f(b_n) \geq 0.$$
We can use the fact that $\lim a_n = \lim b_n = c$, and use continuity of $f$ to take limits in those inequalities to get
$$f(c) = \lim f(a_n) \leq 0 \quad \text{and} \quad f(c) = \lim f(b_n) \geq 0.$$
As $f(c) \geq 0$ and $f(c) \leq 0$ we know that $f(c) = 0$.
Notice that the proof tells us how to ﬁnd the c. Therefore the proof is not only useful for us pure
mathematicians, but it is a very useful idea in applied mathematics.
Theorem 3.3.8 (Bolzano’s intermediate value theorem). Let f : [a, b] →R be a continuous function.
Suppose that there exists a y such that f (a) < y < f (b) or f (a) > y > f (b). Then there exists a
c ∈ [a, b] such that f (c) = y.
The theorem says that a continuous function on a closed interval achieves all the values between
the values at the endpoints.
Proof. If f (a) < y < f (b), then deﬁne g(x) := f (x) −y. Then we see that g(a) < 0 and g(b) > 0
and we can apply Lemma 3.3.7 to g. If g(c) = 0, then f (c) = y.
Similarly if f (a) > y > f (b), then deﬁne g(x) := y − f (x). Then again g(a) < 0 and g(b) > 0
and we can apply Lemma 3.3.7. Again if g(c) = 0, then f (c) = y.
Of course as we said, if a function is continuous, then the restriction to a subset is continuous. So if $f \colon S \to \mathbb{R}$ is continuous and $[a, b] \subset S$, then $f|_{[a,b]}$ is also continuous. Hence, we generally apply the theorem to a function continuous on some large set $S$, but we restrict attention to an interval.
Example 3.3.9: The polynomial $f(x) := x^3 - 2x^2 + x - 1$ has a real root in $[1, 2]$. We simply notice that $f(1) = -1$ and $f(2) = 1$. Hence there must exist a point $c \in [1, 2]$ such that $f(c) = 0$. To find a better approximation of the root we could follow the proof of Lemma 3.3.7. For example, next we would look at $1.5$ and find that $f(1.5) = -0.625$. Therefore, there is a root of the equation in $[1.5, 2]$. Next we look at $1.75$ and note that $f(1.75) \approx -0.016$. Hence there is a root of $f$ in $[1.75, 2]$. Next we look at $1.875$ and find that $f(1.875) \approx 0.44$, thus there is a root in $[1.75, 1.875]$. We follow this procedure until we gain sufficient precision.
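The procedure of this example, which follows the sequences $\{a_n\}$ and $\{b_n\}$ from the proof of Lemma 3.3.7, can be sketched in a few lines of Python. The function name `bisect`, the stopping tolerance `tol`, and the step cap `max_steps` are choices made for this illustration; floating-point arithmetic also means the result is an approximation of the interval the proof describes.

```python
from typing import Callable, Tuple


def bisect(f: Callable[[float], float], a: float, b: float,
           tol: float = 1e-10, max_steps: int = 200) -> Tuple[float, float]:
    """Shrink [a, b] by repeated halving, keeping f(a) < 0 and f(b) >= 0.

    Mirrors the sequences a_n, b_n in the proof of Lemma 3.3.7 and returns
    the final interval, which contains a root whenever f is continuous.
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    steps = 0
    while b - a > tol and steps < max_steps:
        mid = (a + b) / 2.0
        if f(mid) >= 0:
            b = mid  # case (ii) of the proof
        else:
            a = mid  # case (iii) of the proof
        steps += 1
    return a, b


def p(x: float) -> float:
    """The polynomial of Example 3.3.9."""
    return x**3 - 2.0 * x**2 + x - 1.0


if __name__ == "__main__":
    print(bisect(p, 1.0, 2.0, tol=0.0, max_steps=3))  # the three steps done by hand
    print(bisect(p, 1.0, 2.0))
```

Since the interval length halves at every step, reaching a tolerance `tol` from an interval of length $b - a$ takes about $\log_2\bigl((b-a)/\text{tol}\bigr)$ steps, which is the sense in which a desired precision is reached after finitely many steps.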
The technique above is the simplest method of ﬁnding roots of polynomials. Finding roots of
polynomials is perhaps the most common problem in applied mathematics. In general it is very
hard to do quickly, precisely and automatically. We can use the intermediate value theorem to ﬁnd
roots for any continuous function, not just a polynomial.
There are better and faster methods of finding roots of equations, for example Newton’s method. One advantage of the above method is its simplicity. Another advantage is that the moment we find an initial interval where the intermediate value theorem can be applied, we are guaranteed that we will find a root up to a desired precision after finitely many steps.
Do note that the theorem guarantees at least one $c$ such that $f(c) = y$. There could be many different roots of the equation $f(c) = y$. If we follow the procedure of the proof, we are guaranteed to find approximations to one such root. We will need to work harder to find any other roots that may exist.
Let us prove the following interesting result about polynomials. Note that polynomials of even degree may not have any real roots. For example, there is no real number $x$ such that $x^2 + 1 = 0$. Polynomials of odd degree, on the other hand, always have at least one real root.
Proposition 3.3.10. Let $f(x)$ be a polynomial of odd degree. Then $f$ has a real root.
Proof. Suppose $f$ is a polynomial of odd degree $d$. Then we can write
$$f(x) = a_d x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0,$$
where $a_d \neq 0$. We can divide by $a_d$ to obtain a polynomial
$$g(x) = x^d + b_{d-1} x^{d-1} + \cdots + b_1 x + b_0,$$
where $b_k = a_k / a_d$. We look at the sequence $\{g(n)\}$ for $n \in \mathbb{N}$. We look at
$$\begin{aligned}
\left| \frac{b_{d-1} n^{d-1} + \cdots + b_1 n + b_0}{n^d} \right| &= \frac{\left| b_{d-1} n^{d-1} + \cdots + b_1 n + b_0 \right|}{n^d} \\
&\leq \frac{|b_{d-1}| n^{d-1} + \cdots + |b_1| n + |b_0|}{n^d} \\
&\leq \frac{|b_{d-1}| n^{d-1} + \cdots + |b_1| n^{d-1} + |b_0| n^{d-1}}{n^d} \\
&= \frac{n^{d-1} \left( |b_{d-1}| + \cdots + |b_1| + |b_0| \right)}{n^d} \\
&= \frac{1}{n} \left( |b_{d-1}| + \cdots + |b_1| + |b_0| \right).
\end{aligned}$$
Therefore
$$\lim_{n \to \infty} \frac{b_{d-1} n^{d-1} + \cdots + b_1 n + b_0}{n^d} = 0.$$
Thus there exists an $M \in \mathbb{N}$ such that
$$\left| \frac{b_{d-1} M^{d-1} + \cdots + b_1 M + b_0}{M^d} \right| < 1,$$
or in other words
$$\left| b_{d-1} M^{d-1} + \cdots + b_1 M + b_0 \right| < M^d.$$
Therefore $g(M) > 0$.
Next we look at the sequence $\{g(-n)\}$. By a similar argument (exercise) we find that there exists some $K \in \mathbb{N}$ such that $b_{d-1} (-K)^{d-1} + \cdots + b_1 (-K) + b_0 < K^d$ and therefore $g(-K) < 0$ (why?). In the proof make sure you use the fact that $d$ is odd. In particular, this means that $(-n)^d = -(n^d)$.
Now we appeal to the intermediate value theorem, which implies that there must be a $c \in [-K, M]$ such that $g(c) = 0$. As $g(x) = \frac{f(x)}{a_d}$, we see that $f(c) = 0$, and the proof is done.
Example 3.3.11: An interesting fact is that there do exist discontinuous functions that have the intermediate value property. For example, the function
\[ f(x) := \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0, \end{cases} \]
is not continuous at 0, however it has the intermediate value property. That is, for any $a < b$, and any $y$ such that $f(a) < y < f(b)$ or $f(a) > y > f(b)$, there exists a $c \in (a, b)$ such that $f(c) = y$. The proof is left as an exercise.
3.3.3 Exercises
Exercise 3.3.1: Find an example of a discontinuous function f : [0, 1] →R where the intermediate
value theorem fails.
Exercise 3.3.2: Find an example of a bounded discontinuous function $f \colon [0, 1] \to \mathbb{R}$ that has neither an absolute minimum nor an absolute maximum.
Exercise 3.3.3: Let $f \colon (0, 1) \to \mathbb{R}$ be a continuous function such that $\lim_{x \to 0} f(x) = \lim_{x \to 1} f(x) = 0$. Show that $f$ achieves either an absolute minimum or an absolute maximum on $(0, 1)$ (but perhaps not both).
Exercise 3.3.4: Let
\[ f(x) := \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
Show that $f$ has the intermediate value property. That is, for any $a < b$, if there exists a $y$ such that $f(a) < y < f(b)$ or $f(a) > y > f(b)$, then there exists a $c \in (a, b)$ such that $f(c) = y$.
Exercise 3.3.5: Suppose that $g(x)$ is a polynomial of odd degree $d$ such that
\[ g(x) = x^d + b_{d-1} x^{d-1} + \cdots + b_1 x + b_0 , \]
for some real numbers $b_0, b_1, \ldots, b_{d-1}$. Show that there exists a $K \in \mathbb{N}$ such that $g(-K) < 0$. Hint: Make sure to use the fact that $d$ is odd. You will have to use that $(-n)^d = -(n^d)$.
Exercise 3.3.6: Suppose that $g(x)$ is a polynomial of even degree $d$ such that
\[ g(x) = x^d + b_{d-1} x^{d-1} + \cdots + b_1 x + b_0 , \]
for some real numbers $b_0, b_1, \ldots, b_{d-1}$. Suppose that $g(0) < 0$. Show that $g$ has at least two distinct real roots.
Exercise 3.3.7: Suppose that f : [a, b] →R is a continuous function. Prove that the direct image
f ([a, b]) is a closed and bounded interval.
3.4 Uniform continuity
Note: 1.5 lectures
3.4.1 Uniform continuity
We have made a fuss of saying that the δ in the deﬁnition of continuity depended on the point c.
There are situations when it is advantageous to have a δ independent of any point. Let us therefore
deﬁne this concept.
Definition 3.4.1. Let $S \subset \mathbb{R}$. Let $f \colon S \to \mathbb{R}$ be a function. Suppose that for any $\epsilon > 0$ there exists a $\delta > 0$ such that whenever $x, c \in S$ and $|x - c| < \delta$, then $|f(x) - f(c)| < \epsilon$. Then we say $f$ is uniformly continuous.
It is not hard to see that a uniformly continuous function must be continuous. The only difference
in the deﬁnitions is that for a given ε > 0 we pick a δ > 0 that works for all c ∈ S. That is, δ
can no longer depend on c, it only depends on ε. Do note that the domain of deﬁnition of the
function makes a difference now. A function that is not uniformly continuous on a larger set, may
be uniformly continuous when restricted to a smaller set.
Example 3.4.2: The function $f \colon (0, 1) \to \mathbb{R}$, defined by $f(x) := 1/x$, is not uniformly continuous, but it is continuous. Given $\epsilon > 0$, for $\epsilon > |1/x - 1/y|$ to hold we must have
\[ \epsilon > |1/x - 1/y| = \frac{|y - x|}{|xy|} = \frac{|y - x|}{xy} , \]
or
\[ |x - y| < xy\epsilon . \]
Therefore, to satisfy the definition of uniform continuity we would have to have $\delta \leq xy\epsilon$ for all $x, y$ in $(0, 1)$, but that would mean that $\delta \leq 0$. Therefore there is no single $\delta > 0$.
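A quick numerical experiment makes the failure visible. The sketch below is only illustrative: the particular value of `delta` and the sample points are assumptions. It fixes one candidate $\delta$ and shows that pairs of points closer than $\delta$ can still have values of $1/x$ arbitrarily far apart.

```python
def f(x):
    return 1.0 / x


delta = 1e-3  # any fixed candidate delta > 0 (assumed value)

# The points x and y = x + delta/2 are always closer than delta,
# yet the gap |f(x) - f(y)| grows without bound as x approaches 0.
gaps = []
for k in range(1, 7):
    x = 10.0 ** (-k)
    y = x + delta / 2
    gaps.append(abs(f(x) - f(y)))
```

Whatever $\epsilon$ we started with, the gaps eventually exceed it, so this $\delta$ does not work; the same happens for any other fixed choice of $\delta$.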
Example 3.4.3: The function $f \colon [0, 1] \to \mathbb{R}$, defined by $f(x) := x^2$, is uniformly continuous. Write (note that $0 \leq x, c \leq 1$)
\[ |x^2 - c^2| = |x + c| \, |x - c| \leq (|x| + |c|) \, |x - c| \leq (1 + 1) |x - c| . \]
Therefore given $\epsilon > 0$, let $\delta := \epsilon/2$. Then if $|x - c| < \delta$, then $|x^2 - c^2| < \epsilon$.

However, $f \colon \mathbb{R} \to \mathbb{R}$, defined by $f(x) := x^2$, is not uniformly continuous. Suppose it is, then for all $\epsilon > 0$, there would exist a $\delta > 0$ such that if $|x - c| < \delta$, then $|x^2 - c^2| < \epsilon$. Take $x > 0$ and let $c := x + \delta/2$. Write
\[ \epsilon \geq |x^2 - c^2| = |x + c| \, |x - c| = (2x + \delta/2) \, \delta/2 \geq \delta x . \]
Therefore $x \leq \epsilon/\delta$ for all $x > 0$, which is a contradiction.
We have seen that if $f$ is defined on an interval that is either not closed or not bounded, then $f$ can be continuous, but not uniformly continuous. For a closed and bounded interval $[a, b]$, we can, however, make the following statement.
Theorem 3.4.4. Let $f \colon [a, b] \to \mathbb{R}$ be a continuous function. Then $f$ is uniformly continuous.

Proof. We will prove the statement by contrapositive. Let us suppose that $f$ is not uniformly continuous, and we will find some $c \in [a, b]$ where $f$ is not continuous. Let us negate the definition of uniformly continuous. There exists an $\epsilon > 0$ such that for every $\delta > 0$, there exist points $x, y$ in $[a, b]$ with $|x - y| < \delta$ and $|f(x) - f(y)| \geq \epsilon$.

So for the $\epsilon > 0$ above, we can find sequences $\{ x_n \}$ and $\{ y_n \}$ such that $|x_n - y_n| < 1/n$ and such that $|f(x_n) - f(y_n)| \geq \epsilon$. By Bolzano-Weierstrass, there exists a convergent subsequence $\{ x_{n_k} \}$. Let $c := \lim x_{n_k}$. Note that as $a \leq x_{n_k} \leq b$, then $a \leq c \leq b$. Write
\[ |c - y_{n_k}| = |c - x_{n_k} + x_{n_k} - y_{n_k}| \leq |c - x_{n_k}| + |x_{n_k} - y_{n_k}| < |c - x_{n_k}| + 1/n_k . \]
As $|c - x_{n_k}|$ goes to zero as does $1/n_k$ as $k$ goes to infinity, we see that $\{ y_{n_k} \}$ converges and the limit is $c$. We now show that $f$ is not continuous at $c$. We estimate
\[
\begin{aligned}
|f(c) - f(x_{n_k})| & = |f(c) - f(y_{n_k}) + f(y_{n_k}) - f(x_{n_k})| \\
& \geq |f(y_{n_k}) - f(x_{n_k})| - |f(c) - f(y_{n_k})| \\
& \geq \epsilon - |f(c) - f(y_{n_k})| .
\end{aligned}
\]
Or in other words
\[ |f(c) - f(x_{n_k})| + |f(c) - f(y_{n_k})| \geq \epsilon . \]
Therefore, at least one of the sequences $\{ f(x_{n_k}) \}$ or $\{ f(y_{n_k}) \}$ cannot converge to $f(c)$ (else the left-hand side of the inequality goes to zero while the right-hand side is positive). Thus $f$ cannot be continuous at $c$.
3.4.2 Continuous extension
Before we get to continuous extension, we show the following useful lemma. It says that uniformly
continuous functions behave nicely with respect to Cauchy sequences. The main difference here is
that for a Cauchy sequence we no longer know where the limit ends up and it may not end up in the
domain of the function.
Lemma 3.4.5. Let $f \colon S \to \mathbb{R}$ be a uniformly continuous function. Let $\{ x_n \}$ be a Cauchy sequence in $S$. Then $\{ f(x_n) \}$ is Cauchy.

Proof. Let $\epsilon > 0$ be given. Then there is a $\delta > 0$ such that $|f(x) - f(y)| < \epsilon$ whenever $|x - y| < \delta$. Now find an $M \in \mathbb{N}$ such that for all $n, k \geq M$ we have $|x_n - x_k| < \delta$. Then for all $n, k \geq M$ we have $|f(x_n) - f(x_k)| < \epsilon$.
An application of the above lemma is the following theorem. It says that a function on an open
interval is uniformly continuous if and only if it can be extended to a continuous function on the
closed interval.
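Theorem 3.4.6. A function $f \colon (a, b) \to \mathbb{R}$ is uniformly continuous if and only if the limits
\[ L_a := \lim_{x \to a} f(x) \qquad \text{and} \qquad L_b := \lim_{x \to b} f(x) \]
exist and if the function $\tilde{f} \colon [a, b] \to \mathbb{R}$ defined by
\[ \tilde{f}(x) := \begin{cases} f(x) & \text{if } x \in (a, b), \\ L_a & \text{if } x = a, \\ L_b & \text{if } x = b, \end{cases} \]
is continuous.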
Proof. One direction is not hard to prove. If $\tilde{f}$ is continuous, then it is uniformly continuous by Theorem 3.4.4. As $f$ is the restriction of $\tilde{f}$ to $(a, b)$, then $f$ is also uniformly continuous (easy exercise).

Now suppose that $f$ is uniformly continuous. We must first show that the limits $L_a$ and $L_b$ exist. Let us concentrate on $L_a$. Take a sequence $\{ x_n \}$ in $(a, b)$ such that $\lim x_n = a$. The sequence is a Cauchy sequence and hence by Lemma 3.4.5, the sequence $\{ f(x_n) \}$ is Cauchy and therefore convergent. We have some number $L_1 := \lim f(x_n)$. Now take another sequence $\{ y_n \}$ in $(a, b)$ such that $\lim y_n = a$. By the same reasoning we get $L_2 := \lim f(y_n)$. If we can show that $L_1 = L_2$, then the limit $L_a = \lim_{x \to a} f(x)$ exists. Let $\epsilon > 0$ be given, find $\delta > 0$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \epsilon/3$. Now find $M \in \mathbb{N}$ such that for $n \geq M$ we have $|a - x_n| < \delta/2$, $|a - y_n| < \delta/2$, $|f(x_n) - L_1| < \epsilon/3$, and $|f(y_n) - L_2| < \epsilon/3$. Then for $n \geq M$ we have
\[ |x_n - y_n| = |x_n - a + a - y_n| \leq |x_n - a| + |a - y_n| < \delta/2 + \delta/2 = \delta . \]
So
\[
\begin{aligned}
|L_1 - L_2| & = |L_1 - f(x_n) + f(x_n) - f(y_n) + f(y_n) - L_2| \\
& \leq |L_1 - f(x_n)| + |f(x_n) - f(y_n)| + |f(y_n) - L_2| \\
& \leq \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon .
\end{aligned}
\]
As $\epsilon > 0$ was arbitrary, $L_1 = L_2$. Thus $L_a$ exists. To show that $L_b$ exists is left as an exercise.

Now that we know that the limits $L_a$ and $L_b$ exist, we are done. If $\lim_{x \to a} f(x)$ exists, then $\lim_{x \to a} \tilde{f}(x)$ exists (see Proposition 3.1.14). Similarly with $L_b$. Hence $\tilde{f}$ is continuous at $a$ and $b$. And since $f$ is continuous at $c \in (a, b)$, then $\tilde{f}$ is continuous at $c \in (a, b)$.
3.4.3 Lipschitz continuous functions
Definition 3.4.7. Let $f \colon S \to \mathbb{R}$ be a function such that there exists a number $K$ such that for all $x$ and $y$ in $S$ we have
\[ |f(x) - f(y)| \leq K |x - y| . \]
Then $f$ is said to be Lipschitz continuous.
A large class of functions is Lipschitz continuous. Be careful however. As for uniformly
continuous functions, the domain of deﬁnition of the function is important. See the examples below
and the exercises. First let us justify using the word “continuous.”
Proposition 3.4.8. A Lipschitz continuous function is uniformly continuous.
Proof. Let $f \colon S \to \mathbb{R}$ be a function and let $K$ be a constant such that for all $x, y$ in $S$ we have $|f(x) - f(y)| \leq K |x - y|$.
Let $\epsilon > 0$ be given. Take $\delta := \epsilon/K$. For any $x$ and $y$ in $S$ such that $|x - y| < \delta$ we have that
\[ |f(x) - f(y)| \leq K |x - y| < K\delta = K \frac{\epsilon}{K} = \epsilon . \]
Therefore $f$ is uniformly continuous.
We can interpret Lipschitz continuity geometrically. Suppose $f$ is a Lipschitz continuous function with some constant $K$. The inequality can be rewritten to say that for $x \neq y$ we have
\[ \left| \frac{f(x) - f(y)}{x - y} \right| \leq K . \]
The quantity $\frac{f(x) - f(y)}{x - y}$ is the slope of the line between the points $\bigl( x, f(x) \bigr)$ and $\bigl( y, f(y) \bigr)$. Therefore, $f$ is Lipschitz continuous if every line that intersects the graph of $f$ in at least two points has slope whose absolute value is less than or equal to $K$.
Example 3.4.9: The functions $\sin(x)$ and $\cos(x)$ are Lipschitz continuous. We have seen (Example 3.2.6) the following two inequalities.
\[ |\sin(x) - \sin(y)| \leq |x - y| \qquad \text{and} \qquad |\cos(x) - \cos(y)| \leq |x - y| . \]
Hence sin and cos are Lipschitz continuous with $K = 1$.
Example 3.4.10: The function $f \colon [1, \infty) \to \mathbb{R}$ defined by $f(x) := \sqrt{x}$ is Lipschitz continuous.
\[ \left| \sqrt{x} - \sqrt{y} \right| = \left| \frac{x - y}{\sqrt{x} + \sqrt{y}} \right| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} . \]
As $x \geq 1$ and $y \geq 1$, we can see that $\frac{1}{\sqrt{x} + \sqrt{y}} \leq \frac{1}{2}$. Therefore
\[ \left| \sqrt{x} - \sqrt{y} \right| = \left| \frac{x - y}{\sqrt{x} + \sqrt{y}} \right| \leq \frac{1}{2} |x - y| . \]
On the other hand $f \colon [0, \infty) \to \mathbb{R}$ defined by $f(x) := \sqrt{x}$ is not Lipschitz continuous. Let us see why. Suppose that we have
\[ \left| \sqrt{x} - \sqrt{y} \right| \leq K |x - y| , \]
for some $K$. Let $y = 0$ to obtain $\sqrt{x} \leq Kx$. If $K > 0$, then for $x > 0$ we then get $1/K \leq \sqrt{x}$. This cannot possibly be true for all $x > 0$. Thus no such $K > 0$ can exist and $f$ is not Lipschitz continuous.

Note that the last example shows a function that is uniformly continuous but not Lipschitz continuous. To see that $\sqrt{x}$ is uniformly continuous on $[0, \infty)$ note that it is uniformly continuous on $[0, 1]$ by Theorem 3.4.4. It is also Lipschitz (and therefore uniformly continuous) on $[1, \infty)$. It is not hard (exercise) to show that this means that $\sqrt{x}$ is uniformly continuous on $[0, \infty)$.
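The contrast between the two domains can be checked numerically. The following is a rough sketch, not a proof: the sample grid and the helper name `ratio` are assumptions made for illustration. It computes secant slopes of $\sqrt{x}$ on $[1, \infty)$ and then with one endpoint fixed at 0.

```python
import math


def ratio(x, y):
    # Absolute slope of the secant line through (x, sqrt(x)) and (y, sqrt(y)).
    return abs(math.sqrt(x) - math.sqrt(y)) / abs(x - y)


# On [1, infinity) the ratio equals 1/(sqrt(x) + sqrt(y)), so it stays below 1/2.
samples = [1.0 + 0.37 * k for k in range(1, 200)]
worst_on_1_inf = max(ratio(x, y) for x in samples for y in samples if x != y)

# With y = 0 the ratio equals 1/sqrt(x), which blows up as x approaches 0.
near_zero = [ratio(10.0 ** (-k), 0.0) for k in range(1, 8)]
```

The first quantity is bounded by the Lipschitz constant $1/2$ found above, while the second list has no upper bound, in line with the argument that no $K$ works on $[0, \infty)$.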
3.4.4 Exercises
Exercise 3.4.1: Let $f \colon S \to \mathbb{R}$ be uniformly continuous. Let $A \subset S$. Then the restriction $f|_A$ is uniformly continuous.
Exercise 3.4.2: Let $f \colon (a, b) \to \mathbb{R}$ be a uniformly continuous function. Finish the proof of Theorem 3.4.6 by showing that the limit $\lim_{x \to b} f(x)$ exists.
Exercise 3.4.3: Show that $f \colon (c, \infty) \to \mathbb{R}$ for some $c > 0$ and defined by $f(x) := 1/x$ is Lipschitz continuous.
Exercise 3.4.4: Show that $f \colon (0, \infty) \to \mathbb{R}$ defined by $f(x) := 1/x$ is not Lipschitz continuous.
Exercise 3.4.5: Let $A, B$ be intervals. Let $f \colon A \to \mathbb{R}$ and $g \colon B \to \mathbb{R}$ be uniformly continuous functions such that $f(x) = g(x)$ for $x \in A \cap B$. Define the function $h \colon A \cup B \to \mathbb{R}$ by $h(x) := f(x)$ if $x \in A$ and $h(x) := g(x)$ if $x \in B \setminus A$. a) Prove that if $A \cap B \neq \emptyset$, then $h$ is uniformly continuous. b) Find an example where $A \cap B = \emptyset$ and $h$ is not even continuous.
Exercise 3.4.6: Let $f \colon \mathbb{R} \to \mathbb{R}$ be a polynomial of degree $d \geq 2$. Show that $f$ is not Lipschitz continuous.
Exercise 3.4.7: Let $f \colon (0, 1) \to \mathbb{R}$ be a bounded continuous function. Show that the function $g(x) := x(1 - x) f(x)$ is uniformly continuous.
Exercise 3.4.8: Show that $f \colon (0, \infty) \to \mathbb{R}$ defined by $f(x) := \sin(1/x)$ is not uniformly continuous.
Exercise 3.4.9: Let $f \colon \mathbb{Q} \to \mathbb{R}$ be a uniformly continuous function. Show that there exists a uniformly continuous function $\tilde{f} \colon \mathbb{R} \to \mathbb{R}$ such that $f(x) = \tilde{f}(x)$ for all $x \in \mathbb{Q}$.
Chapter 4
The Derivative
4.1 The derivative
Note: 1 lecture
The idea of a derivative is the following. Let us suppose that the graph of a function looks locally like a straight line. We can then talk about the slope of this line. The slope tells us how fast the value of the function is changing at the particular point. Of course, we are leaving out any function that has corners or discontinuities. Let us be precise.
4.1.1 Deﬁnition and basic properties
Definition 4.1.1. Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ be a function, and let $c \in I$. Suppose that the limit
\[ L := \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \]
exists. Then we say that $f$ is differentiable at $c$ and we say that $L$ is the derivative of $f$ at $c$ and we write $f'(c) := L$.
If $f$ is differentiable at all $c \in I$, then we simply say that $f$ is differentiable, and then we obtain a function $f' \colon I \to \mathbb{R}$.
The expression $\frac{f(x) - f(c)}{x - c}$ is called the difference quotient.
The graphical interpretation of the derivative is depicted in Figure 4.1. The left-hand plot gives the line through $\bigl( c, f(c) \bigr)$ and $\bigl( x, f(x) \bigr)$ with slope $\frac{f(x) - f(c)}{x - c}$. As we take the limit as $x$ goes to $c$, we get the right-hand plot. On this plot we can see that the derivative of the function at the point $c$ is the slope of the line tangent to the graph of $f$ at the point $\bigl( c, f(c) \bigr)$.
Note that we allow I to be a closed interval and we allow c to be an endpoint of I. Some calculus
books will not allow c to be an endpoint of an interval, but all the theory still works by allowing it,
and it will make our work easier.
Figure 4.1: Graphical interpretation of the derivative. (Left plot: line through $c$ and $x$ with slope $\frac{f(x) - f(c)}{x - c}$. Right plot: tangent line at $c$ with slope $f'(c)$.)
Example 4.1.2: Let $f(x) := x^2$ defined on the whole real line. Then we find that
\[ \lim_{x \to c} \frac{x^2 - c^2}{x - c} = \lim_{x \to c} (x + c) = 2c . \]
Therefore $f'(c) = 2c$.
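The limit of the difference quotient for $x^2$ can also be watched numerically. This is a small sketch under stated assumptions: the point `c = 3.0` and the helper name `difference_quotient` are chosen only for illustration, and floating point arithmetic only approximates the limit.

```python
def difference_quotient(f, c, x):
    # (f(x) - f(c)) / (x - c), defined for x != c.
    return (f(x) - f(c)) / (x - c)


def f(x):
    return x * x


c = 3.0  # assumed sample point
# For f(x) = x^2 the quotient simplifies to x + c, so it approaches 2c.
quotients = [difference_quotient(f, c, c + 10.0 ** (-k)) for k in range(1, 7)]
errors = [abs(q - 2 * c) for q in quotients]
```

The errors shrink roughly like $|x - c|$, as expected from the simplification $\frac{x^2 - c^2}{x - c} = x + c$.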
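Example 4.1.3: The function $f(x) := |x|$ is not differentiable at the origin. When $x > 0$, then
\[ \frac{|x| - |0|}{x - 0} = 1 , \]
and when $x < 0$ we have
\[ \frac{|x| - |0|}{x - 0} = -1 . \]
Hence the difference quotient has no limit as $x$ goes to 0.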
A famous example of Weierstrass shows that there exists a continuous function that is not
differentiable at any point. The construction of this function is beyond the scope of this book. On
the other hand, a differentiable function is always continuous.
Proposition 4.1.4. Let $f \colon I \to \mathbb{R}$ be differentiable at $c \in I$. Then $f$ is continuous at $c$.

Proof. We know that the limits
\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = f'(c) \qquad \text{and} \qquad \lim_{x \to c} (x - c) = 0 \]
exist. Furthermore,
\[ f(x) - f(c) = \left( \frac{f(x) - f(c)}{x - c} \right) (x - c) . \]
Therefore the limit of $f(x) - f(c)$ exists and
\[ \lim_{x \to c} \bigl( f(x) - f(c) \bigr) = f'(c) \cdot 0 = 0 . \]
Hence $\lim_{x \to c} f(x) = f(c)$, and $f$ is continuous at $c$.
One of the most important properties of the derivative is linearity. The derivative is the approxi
mation of a function by a straight line, that is, we are trying to approximate the function at a point
by a linear function. It then makes sense that the derivative is linear.
Proposition 4.1.5. Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ and $g \colon I \to \mathbb{R}$ be differentiable at $c \in I$, and let $\alpha \in \mathbb{R}$.
(i) Define $h \colon I \to \mathbb{R}$ by $h(x) := \alpha f(x)$. Then $h$ is differentiable at $c$ and $h'(c) = \alpha f'(c)$.
(ii) Define $h \colon I \to \mathbb{R}$ by $h(x) := f(x) + g(x)$. Then $h$ is differentiable at $c$ and $h'(c) = f'(c) + g'(c)$.

Proof. First, let $h(x) = \alpha f(x)$. For $x \in I$, $x \neq c$ we have
\[ \frac{h(x) - h(c)}{x - c} = \frac{\alpha f(x) - \alpha f(c)}{x - c} = \alpha \frac{f(x) - f(c)}{x - c} . \]
The limit as $x$ goes to $c$ exists on the right by Corollary 3.1.12. We get
\[ \lim_{x \to c} \frac{h(x) - h(c)}{x - c} = \alpha \lim_{x \to c} \frac{f(x) - f(c)}{x - c} . \]
Therefore $h$ is differentiable at $c$, and the derivative is computed as given.

Next, define $h(x) := f(x) + g(x)$. For $x \in I$, $x \neq c$ we have
\[ \frac{h(x) - h(c)}{x - c} = \frac{\bigl( f(x) + g(x) \bigr) - \bigl( f(c) + g(c) \bigr)}{x - c} = \frac{f(x) - f(c)}{x - c} + \frac{g(x) - g(c)}{x - c} . \]
The limit as $x$ goes to $c$ exists on the right by Corollary 3.1.12. We get
\[ \lim_{x \to c} \frac{h(x) - h(c)}{x - c} = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} + \lim_{x \to c} \frac{g(x) - g(c)}{x - c} . \]
Therefore $h$ is differentiable at $c$ and the derivative is computed as given.
It is not true that the derivative of a product of two functions is the product of the derivatives. Instead we get the so-called product rule or the Leibniz rule∗.
∗ Named for the German mathematician Gottfried Wilhelm Leibniz (1646–1716).
Proposition 4.1.6 (Product Rule). Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ and $g \colon I \to \mathbb{R}$ be functions differentiable at $c$. If $h \colon I \to \mathbb{R}$ is defined by
\[ h(x) := f(x) g(x) , \]
then $h$ is differentiable at $c$ and
\[ h'(c) = f(c) g'(c) + f'(c) g(c) . \]
The proof of the product rule is left as an exercise. The key is to use the identity $f(x) g(x) - f(c) g(c) = f(x) \bigl( g(x) - g(c) \bigr) + g(c) \bigl( f(x) - f(c) \bigr)$.
Proposition 4.1.7 (Quotient Rule). Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ and $g \colon I \to \mathbb{R}$ be differentiable at $c$ and $g(x) \neq 0$ for all $x \in I$. If $h \colon I \to \mathbb{R}$ is defined by
\[ h(x) := \frac{f(x)}{g(x)} , \]
then $h$ is differentiable at $c$ and
\[ h'(c) = \frac{f'(c) g(c) - f(c) g'(c)}{\bigl( g(c) \bigr)^2} . \]
Again the proof is left as an exercise.
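Before attempting the proofs, it can be reassuring to test the two rules numerically. The sketch below is an illustration only: the functions $f = \sin$ and $g(x) = x^2 + 1$, the point `c`, and the helper `numeric_derivative` are all assumptions. It compares the product rule $f(c) g'(c) + f'(c) g(c)$ and the standard quotient rule $\bigl( f'(c) g(c) - f(c) g'(c) \bigr) / \bigl( g(c) \bigr)^2$ against symmetric difference quotients.

```python
import math


def numeric_derivative(h, c, step=1e-6):
    # Symmetric difference quotient; an approximation, not the exact limit.
    return (h(c + step) - h(c - step)) / (2 * step)


f, df = math.sin, math.cos   # f and its derivative


def g(x):
    return x * x + 1         # never zero, so f/g is defined everywhere


def dg(x):
    return 2 * x


c = 0.7  # assumed sample point
product_rule = f(c) * dg(c) + df(c) * g(c)
quotient_rule = (df(c) * g(c) - f(c) * dg(c)) / g(c) ** 2

product_numeric = numeric_derivative(lambda x: f(x) * g(x), c)
quotient_numeric = numeric_derivative(lambda x: f(x) / g(x), c)
```

Agreement to many digits is of course not a proof, but a sign error in either formula would show up immediately in such a check.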
4.1.2 Chain rule
A useful rule for computing derivatives is the chain rule.
Proposition 4.1.8 (Chain Rule). Let $I_1, I_2$ be intervals, let $g \colon I_1 \to I_2$ be differentiable at $c \in I_1$, and $f \colon I_2 \to \mathbb{R}$ be differentiable at $g(c)$. If $h \colon I_1 \to \mathbb{R}$ is defined by
\[ h(x) := (f \circ g)(x) = f\bigl( g(x) \bigr) , \]
then $h$ is differentiable at $c$ and
\[ h'(c) = f'\bigl( g(c) \bigr) g'(c) . \]
Proof. Let $d := g(c)$. Define
\[
u(y) := \begin{cases} \frac{f(y) - f(d)}{y - d} & \text{if } y \neq d, \\ f'(d) & \text{if } y = d, \end{cases}
\qquad
v(x) := \begin{cases} \frac{g(x) - g(c)}{x - c} & \text{if } x \neq c, \\ g'(c) & \text{if } x = c. \end{cases}
\]
By the definition of the limit we see that $\lim_{y \to d} u(y) = f'(d)$ and $\lim_{x \to c} v(x) = g'(c)$ (the functions $u$ and $v$ are continuous at $d$ and $c$ respectively). Therefore,
\[ f(y) - f(d) = u(y) (y - d) \qquad \text{and} \qquad g(x) - g(c) = v(x) (x - c) . \]
We plug in to obtain
\[ h(x) - h(c) = f\bigl( g(x) \bigr) - f\bigl( g(c) \bigr) = u\bigl( g(x) \bigr) \bigl( g(x) - g(c) \bigr) = u\bigl( g(x) \bigr) \bigl( v(x) (x - c) \bigr) . \]
Therefore,
\[ \frac{h(x) - h(c)}{x - c} = u\bigl( g(x) \bigr) v(x) . \]
We note that $\lim_{x \to c} v(x) = g'(c)$, $g$ is continuous at $c$, that is $\lim_{x \to c} g(x) = g(c)$, and finally that $\lim_{y \to g(c)} u(y) = f'\bigl( g(c) \bigr)$. Therefore the limit of the right-hand side exists and is equal to $f'\bigl( g(c) \bigr) g'(c)$. Thus $h$ is differentiable at $c$ and the limit is $f'\bigl( g(c) \bigr) g'(c)$.
4.1.3 Exercises
Exercise 4.1.1: Prove the product rule. Hint: Use $f(x) g(x) - f(c) g(c) = f(x) \bigl( g(x) - g(c) \bigr) + g(c) \bigl( f(x) - f(c) \bigr)$.
Exercise 4.1.2: Prove the quotient rule. Hint: You can do this directly, but it may be easier to find the derivative of $1/x$ and then use the chain rule and the product rule.
Exercise 4.1.3: Prove that $x^n$ is differentiable and find the derivative. Hint: Use the product rule.
Exercise 4.1.4: Prove that a polynomial is differentiable and find the derivative. Hint: Use the previous exercise.
Exercise 4.1.5: Let
\[ f(x) := \begin{cases} x^2 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{otherwise.} \end{cases} \]
Prove that $f$ is differentiable at 0, but discontinuous at all points except 0.
Exercise 4.1.6: Assume the inequality $|x - \sin(x)| \leq x^2$. Prove that sin is differentiable at 0, and find the derivative at 0.
Exercise 4.1.7: Using the previous exercise, prove that sin is differentiable at all $x$ and that the derivative is $\cos(x)$. Hint: Use the sum-to-product trigonometric identity as we did before.
Exercise 4.1.8: Let $f \colon I \to \mathbb{R}$ be differentiable. Define $f^n$ to be the function defined by $f^n(x) := \bigl( f(x) \bigr)^n$. Prove that $(f^n)'(x) = n \bigl( f(x) \bigr)^{n-1} f'(x)$.
Exercise 4.1.9: Suppose that $f \colon \mathbb{R} \to \mathbb{R}$ is a differentiable Lipschitz continuous function. Prove that $f'$ is a bounded function.
Exercise 4.1.10: Let $I_1, I_2$ be intervals. Let $f \colon I_1 \to I_2$ be a bijective function and $g \colon I_2 \to I_1$ be the inverse. Suppose that both $f$ is differentiable at $c \in I_1$ and $f'(c) \neq 0$ and $g$ is differentiable at $f(c)$. Use the chain rule to find a formula for $g'\bigl( f(c) \bigr)$ (in terms of $f'(c)$).
4.2 Mean value theorem
Note: 2 lectures (some applications may be skipped)
4.2.1 Relative minima and maxima
Definition 4.2.1. Let $S \subset \mathbb{R}$ be a set and let $f \colon S \to \mathbb{R}$ be a function. The function $f$ is said to have a relative maximum at $c \in S$ if there exists a $\delta > 0$ such that for all $x \in S$ such that $|x - c| < \delta$ we have $f(x) \leq f(c)$. The definition of relative minimum is analogous.

Theorem 4.2.2. Let $f \colon [a, b] \to \mathbb{R}$ be a function differentiable at $c \in (a, b)$, and suppose $c$ is a relative minimum or a relative maximum of $f$. Then $f'(c) = 0$.

Proof. We will prove the statement for a maximum. For a minimum the statement follows by considering the function $-f$.
Let $c$ be a relative maximum of $f$. In particular as long as $|x - c| < \delta$ we have $f(x) - f(c) \leq 0$. Then we look at the difference quotient. If $x > c$ we note that
\[ \frac{f(x) - f(c)}{x - c} \leq 0 , \]
and if $x < c$ we have
\[ \frac{f(x) - f(c)}{x - c} \geq 0 . \]
We now take sequences $\{ x_n \}$ and $\{ y_n \}$, such that $x_n > c$, and $y_n < c$ for all $n \in \mathbb{N}$, and such that $\lim x_n = \lim y_n = c$. Since $f$ is differentiable at $c$ we know that
\[ 0 \geq \lim_{n \to \infty} \frac{f(x_n) - f(c)}{x_n - c} = f'(c) = \lim_{n \to \infty} \frac{f(y_n) - f(c)}{y_n - c} \geq 0 . \]
Therefore $f'(c) = 0$.
4.2.2 Rolle’s theorem
Suppose that a function is zero at both endpoints of an interval. Intuitively it should attain a minimum or a maximum in the interior of the interval. Then at a minimum or a maximum, the derivative should be zero. See Figure 4.2 for the geometric idea. This is the content of the so-called Rolle's theorem.

Theorem 4.2.3 (Rolle). Let $f \colon [a, b] \to \mathbb{R}$ be a continuous function differentiable on $(a, b)$ such that $f(a) = f(b) = 0$. Then there exists a $c \in (a, b)$ such that $f'(c) = 0$.

Figure 4.2: Point where tangent line is horizontal, that is $f'(c) = 0$.

Proof. As $f$ is continuous on $[a, b]$ it attains an absolute minimum and an absolute maximum in $[a, b]$. If it attains an absolute maximum at $c \in (a, b)$, then $c$ is also a relative maximum and we apply Theorem 4.2.2 to find that $f'(c) = 0$. If the absolute maximum is at $a$ or at $b$, then we look for the absolute minimum. If the absolute minimum is at $c \in (a, b)$, then again we find that $f'(c) = 0$. So suppose that the absolute minimum is also at $a$ or $b$. Hence the absolute minimum is 0 and the absolute maximum is 0, and therefore the function is identically zero. Thus $f'(x) = 0$ for all $x \in [a, b]$, so pick an arbitrary $c \in (a, b)$.
4.2.3 Mean value theorem
We extend Rolle’s theorem to functions that attain different values at the endpoints.
Theorem 4.2.4 (Mean value theorem). Let $f \colon [a, b] \to \mathbb{R}$ be a continuous function differentiable on $(a, b)$. Then there exists a point $c \in (a, b)$ such that
\[ f(b) - f(a) = f'(c) (b - a) . \]

Proof. The theorem follows easily from Rolle's theorem. Define the function $g \colon [a, b] \to \mathbb{R}$ by
\[ g(x) := f(x) - f(b) + \bigl( f(b) - f(a) \bigr) \frac{b - x}{b - a} . \]
Then we know that $g$ is a differentiable function on $(a, b)$ continuous on $[a, b]$ such that $g(a) = 0$ and $g(b) = 0$. Thus there exists $c \in (a, b)$ such that $g'(c) = 0$.
\[ 0 = g'(c) = f'(c) + \bigl( f(b) - f(a) \bigr) \frac{-1}{b - a} . \]
Or in other words $f'(c) (b - a) = f(b) - f(a)$.
For a geometric interpretation of the mean value theorem, see Figure 4.3. The idea is that the value $\frac{f(b) - f(a)}{b - a}$ is the slope of the line between the points $\bigl( a, f(a) \bigr)$ and $\bigl( b, f(b) \bigr)$. Then $c$ is the point such that $f'(c) = \frac{f(b) - f(a)}{b - a}$, that is, the tangent line at the point $\bigl( c, f(c) \bigr)$ has the same slope as the line between $\bigl( a, f(a) \bigr)$ and $\bigl( b, f(b) \bigr)$.

Figure 4.3: Graphical interpretation of the mean value theorem.
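For a concrete function the point $c$ can be located numerically. The sketch below rests on assumptions made for illustration: the function $f(x) = x^3$ on $[0, 2]$, the helper `bisect`, and the use of halving on $f'(x) - \text{slope}$, which works here because this particular $f'$ is continuous and changes sign relative to the slope. The mean value theorem itself guarantees only that $c$ exists, not a way of finding it.

```python
def bisect(F, lo, hi, tol=1e-12):
    # Assumes F is continuous with F(lo) and F(hi) of opposite signs.
    flo = F(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = F(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return (lo + hi) / 2


def f(x):
    return x ** 3


def df(x):
    return 3 * x ** 2


a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)            # slope of the secant line, here 4
c = bisect(lambda x: df(x) - slope, a, b)  # solve f'(c) = slope on (a, b)
```

In this example the equation $3c^2 = 4$ can be solved by hand, giving $c = 2/\sqrt{3}$, which the computed value should match closely.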
4.2.4 Applications
We can now solve our very ﬁrst differential equation.
Proposition 4.2.5. Let $I$ be an interval and let $f \colon I \to \mathbb{R}$ be a differentiable function such that $f'(x) = 0$ for all $x \in I$. Then $f$ is a constant.

Proof. We will show this by contrapositive. Suppose that $f$ is not constant, then there exist $x$ and $y$ in $I$ such that $x < y$ and $f(x) \neq f(y)$. Then $f$ restricted to $[x, y]$ satisfies the hypotheses of the mean value theorem. Therefore there is a $c \in (x, y)$ such that
\[ f(y) - f(x) = f'(c) (y - x) . \]
As $y \neq x$ and $f(y) \neq f(x)$ we see that $f'(c) \neq 0$.
Now that we know what it means for the function to stay constant, let us look at increasing and
decreasing functions. We say that f : I →R is increasing (resp. strictly increasing) if x < y implies
f (x) ≤ f (y) (resp. f (x) < f (y)). We deﬁne decreasing and strictly decreasing in the same way by
switching the inequalities for f .
Proposition 4.2.6. Let $f \colon I \to \mathbb{R}$ be a differentiable function.
(i) $f$ is increasing if and only if $f'(x) \geq 0$ for all $x \in I$.
(ii) $f$ is decreasing if and only if $f'(x) \leq 0$ for all $x \in I$.

Proof. Let us prove the first item. Suppose that $f$ is increasing, then for all $x$ and $c$ in $I$ with $x \neq c$ we have
\[ \frac{f(x) - f(c)}{x - c} \geq 0 . \]
Taking a limit as $x$ goes to $c$ we see that $f'(c) \geq 0$.
For the other direction, suppose that $f'(x) \geq 0$ for all $x \in I$. Let $x < y$ in $I$. Then by the mean value theorem there is some $c \in (x, y)$ such that
\[ f(x) - f(y) = f'(c) (x - y) . \]
As $f'(c) \geq 0$, and $x - y < 0$, then $f(x) - f(y) \leq 0$ and so $f$ is increasing.
We leave the decreasing part to the reader as an exercise.
Example 4.2.7: We can make a similar but weaker statement about strictly increasing and decreasing functions. If $f'(x) > 0$ for all $x \in I$, then $f$ is strictly increasing. The proof is left as an exercise. However, the converse is not true. For example, $f(x) := x^3$ is a strictly increasing function but $f'(0) = 0$.
Another application of the mean value theorem is the following result about location of extrema.
The theorem is stated for an absolute minimum and maximum, but the way it is applied to ﬁnd
relative minima and maxima is to restrict f to an interval (c −δ, c +δ).
Proposition 4.2.8. Let $f \colon (a, b) \to \mathbb{R}$ be continuous. Let $c \in (a, b)$ and suppose $f$ is differentiable on $(a, c)$ and $(c, b)$.
(i) If $f'(x) \leq 0$ for $x \in (a, c)$ and $f'(x) \geq 0$ for $x \in (c, b)$, then $f$ has an absolute minimum at $c$.
(ii) If $f'(x) \geq 0$ for $x \in (a, c)$ and $f'(x) \leq 0$ for $x \in (c, b)$, then $f$ has an absolute maximum at $c$.

Proof. Let us prove the first item. The second is left to the reader. Let $x$ be in $(a, c)$ and $\{ y_n \}$ a sequence such that $x < y_n < c$ and $\lim y_n = c$. By the previous proposition, the function is decreasing on $(a, c)$ so $f(x) \geq f(y_n)$. The function is continuous at $c$ so we can take the limit to get $f(x) \geq f(c)$ for all $x \in (a, c)$.
Similarly take $x \in (c, b)$ and $\{ y_n \}$ a sequence such that $c < y_n < x$ and $\lim y_n = c$. The function is increasing on $(c, b)$ so $f(x) \geq f(y_n)$. By continuity of $f$ we get $f(x) \geq f(c)$ for all $x \in (c, b)$. Thus $f(x) \geq f(c)$ for all $x \in (a, b)$.

Note that the converse of the proposition does not hold. See Example 4.2.10 below.
4.2.5 Continuity of derivatives and the intermediate value theorem
Derivatives of functions satisfy an intermediate value property. The theorem is usually called Darboux's theorem.

Theorem 4.2.9 (Darboux). Let $f \colon [a, b] \to \mathbb{R}$ be differentiable. Suppose that there exists a $y \in \mathbb{R}$ such that $f'(a) < y < f'(b)$ or $f'(a) > y > f'(b)$. Then there exists a $c \in (a, b)$ such that $f'(c) = y$.

Proof. Suppose without loss of generality that $f'(a) < y < f'(b)$. Define
\[ g(x) := yx - f(x) . \]
As $g$ is continuous on $[a, b]$, then $g$ attains a maximum at some $c \in [a, b]$.
Now compute $g'(x) = y - f'(x)$. Thus $g'(a) > 0$. We can find an $x > a$ such that
\[ g'(a) - \frac{g(x) - g(a)}{x - a} < g'(a) . \]
Thus $\frac{g(x) - g(a)}{x - a} > 0$ or $g(x) - g(a) > 0$ or $g(x) > g(a)$. Thus $a$ cannot possibly be a maximum.
Similarly as $g'(b) < 0$, we find an $x < b$ such that $\frac{g(x) - g(b)}{x - b} - g'(b) < -g'(b)$ or that $g(x) > g(b)$, thus $b$ cannot possibly be a maximum.
Therefore $c \in (a, b)$. Then as $c$ is a maximum of $g$ we find $g'(c) = 0$ and $f'(c) = y$.
And as we have seen, there do exist noncontinuous functions that have the intermediate value
property. While it is hard to imagine at ﬁrst, there do exist functions that are differentiable
everywhere and the derivative is not continuous.
Example 4.2.10: Let $f \colon \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f(x) := \begin{cases} \bigl( x \sin(1/x) \bigr)^2 & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
We claim that $f$ is differentiable, but $f' \colon \mathbb{R} \to \mathbb{R}$ is not continuous at the origin. Furthermore, $f$ has a minimum at 0, but the derivative changes sign infinitely often near the origin.

That $f$ has an absolute minimum at 0 is easy to see by definition. We know that $f(x) \geq 0$ for all $x$ and $f(0) = 0$.

The function $f$ is differentiable for $x \neq 0$ and the derivative is $2 \sin(1/x) \bigl( x \sin(1/x) - \cos(1/x) \bigr)$. As an exercise show that for $x_n = \frac{4}{(8n+1)\pi}$ we have $\lim f'(x_n) = -1$, and for $y_n = \frac{4}{(8n+3)\pi}$ we have $\lim f'(y_n) = 1$. Hence if $f'$ exists at 0, then it cannot be continuous.

Let us see that $f'$ exists at 0. We claim that the derivative is zero. In other words $\left| \frac{f(x) - f(0)}{x - 0} - 0 \right|$ goes to zero as $x$ goes to zero. For $x \neq 0$ we have
\[ \left| \frac{f(x) - f(0)}{x - 0} - 0 \right| = \left| \frac{x^2 \sin^2(1/x)}{x} \right| = \left| x \sin^2(1/x) \right| \leq |x| . \]
And of course as $x$ tends to zero, then $|x|$ tends to zero and hence $\left| \frac{f(x) - f(0)}{x - 0} - 0 \right|$ goes to zero. Therefore, $f$ is differentiable at 0 and the derivative at 0 is 0.
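The two limits in the example can be checked numerically before proving them. This is only a sanity check under stated assumptions: the index `n = 10_000` and the function names are chosen for illustration, and floating point evaluation of $\sin(1/x)$ near 0 is approximate.

```python
import math


def df(x):
    # Derivative of f(x) = (x sin(1/x))^2 for x != 0, as computed in the example.
    s, c = math.sin(1 / x), math.cos(1 / x)
    return 2 * s * (x * s - c)


def x_n(n):
    return 4 / ((8 * n + 1) * math.pi)


def y_n(n):
    return 4 / ((8 * n + 3) * math.pi)


n = 10_000  # assumed sample index; both points are then very close to 0
left, right = df(x_n(n)), df(y_n(n))
```

Both sample points lie within about $2 \times 10^{-5}$ of the origin, yet the derivative is near $-1$ at one and near $1$ at the other, so $f'$ cannot have the limit $f'(0) = 0$ at the origin.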
It is sometimes useful to assume that the derivative of a differentiable function is continuous. If $f \colon I \to \mathbb{R}$ is differentiable and the derivative $f'$ is continuous on $I$, then we say that $f$ is continuously differentiable. It is then common to write $C^1(I)$ for the set of continuously differentiable functions on $I$.
4.2.6 Exercises
Exercise 4.2.1: Finish proof of Proposition 4.2.6.
Exercise 4.2.2: Finish proof of Proposition 4.2.8.
Exercise 4.2.3: Suppose that $f \colon \mathbb{R} \to \mathbb{R}$ is a differentiable function such that $f'$ is a bounded function. Then show that $f$ is a Lipschitz continuous function.
Exercise 4.2.4: Suppose that $f \colon [a, b] \to \mathbb{R}$ is differentiable and $c \in [a, b]$. Then show that there exists a sequence $\{ x_n \}$, $x_n \neq c$, such that
\[ f'(c) = \lim_{n \to \infty} f'(x_n) . \]
Do note that this does not imply that $f'$ is continuous (why?).
Exercise 4.2.5: Suppose that $f \colon \mathbb{R} \to \mathbb{R}$ is a function such that $|f(x) - f(y)| \leq |x - y|^2$ for all $x$ and $y$. Show that $f(x) = C$ for some constant $C$. Hint: Show that $f$ is differentiable at all points and compute the derivative.
Exercise 4.2.6: Suppose that $I$ is an interval and $f \colon I \to \mathbb{R}$ is a differentiable function. If $f'(x) > 0$ for all $x \in I$, show that $f$ is strictly increasing.
Exercise 4.2.7: Suppose $f \colon (a, b) \to \mathbb{R}$ is a differentiable function such that $f'(x) \neq 0$ for all $x \in (a, b)$. Suppose that there exists a point $c \in (a, b)$ such that $f'(c) > 0$. Prove that $f'(x) > 0$ for all $x \in (a, b)$.
Exercise 4.2.8: Suppose that $f \colon (a, b) \to \mathbb{R}$ and $g \colon (a, b) \to \mathbb{R}$ are differentiable functions such that $f'(x) = g'(x)$ for all $x \in (a, b)$, then show that there exists a constant $C$ such that $f(x) = g(x) + C$.
4.3 Taylor’s theorem
Note: 0.5 lecture (optional section)
4.3.1 Derivatives of higher orders
When $f \colon I \to \mathbb{R}$ is differentiable, then we obtain a function $f' \colon I \to \mathbb{R}$. The function $f'$ is called the first derivative of $f$. If $f'$ is differentiable, we denote by $f'' \colon I \to \mathbb{R}$ the derivative of $f'$. The function $f''$ is called the second derivative of $f$. We can similarly obtain $f'''$, $f''''$, and so on. However, with a larger number of derivatives the notation would get out of hand. Therefore we denote by $f^{(n)}$ the $n$th derivative of $f$.
When f possesses n derivatives, we say that f is n times differentiable.
4.3.2 Taylor’s theorem
Taylor’s theorem† is a generalization of the mean value theorem. It tells us that up to a small error, any $n$ times differentiable function can be approximated at a point $x_0$ by a polynomial. The error of this approximation behaves like $(x-x_0)^n$ near the point $x_0$. To see why this is a good approximation notice that for a big $n$, $(x-x_0)^n$ is very small in a small interval around $x_0$.
Definition 4.3.1. For a function $f$ defined near a point $x_0 \in \mathbb{R}$, define the $n$th Taylor polynomial for $f$ at $x_0$ as
\[
P_n(x) := \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x-x_0)^k
= f(x_0) + f'(x_0)(x-x_0) + \frac{f''(x_0)}{2} (x-x_0)^2 + \frac{f^{(3)}(x_0)}{6} (x-x_0)^3 + \cdots + \frac{f^{(n)}(x_0)}{n!} (x-x_0)^n .
\]
Taylor’s theorem tells us that a function behaves like its $n$th Taylor polynomial. We can think of the theorem as a generalization of the mean value theorem, which is really Taylor’s theorem for the first derivative.
Theorem 4.3.2 (Taylor). Suppose $f \colon [a,b] \to \mathbb{R}$ is a function with $n$ continuous derivatives on $[a,b]$ and such that $f^{(n+1)}$ exists on $(a,b)$. Given distinct points $x_0$ and $x$ in $[a,b]$, we can find a point $c$ between $x_0$ and $x$ such that
\[ f(x) = P_n(x) + \frac{f^{(n+1)}(c)}{(n+1)!} (x-x_0)^{n+1} . \]
† Named for the English mathematician Brook Taylor (1685–1731).
The term $R_n(x) := \frac{f^{(n+1)}(c)}{(n+1)!} (x-x_0)^{n+1}$ is called the remainder term. This way of writing it is called the Lagrange form of the remainder term. There are other ways to write the remainder term, but we will skip those.
Proof. Find a number $M$ solving the equation
\[ f(x) = P_n(x) + M (x-x_0)^{n+1} . \]
Define a function $g(s)$ by
\[ g(s) := f(s) - P_n(s) - M (s-x_0)^{n+1} . \]
A simple computation shows that $P_n^{(k)}(x_0) = f^{(k)}(x_0)$ for $k = 0, 1, 2, \ldots, n$ (the zeroth derivative corresponds simply to the function itself). Therefore,
\[ g(x_0) = g'(x_0) = g''(x_0) = \cdots = g^{(n)}(x_0) = 0 . \]
In particular $g(x_0) = 0$. On the other hand $g(x) = 0$. Thus by the mean value theorem there exists an $x_1$ between $x_0$ and $x$ such that $g'(x_1) = 0$. Applying the mean value theorem to $g'$ we obtain that there exists $x_2$ between $x_0$ and $x_1$ (and therefore between $x_0$ and $x$) such that $g''(x_2) = 0$. We repeat the argument $n+1$ times to obtain a number $x_{n+1}$ between $x_0$ and $x_n$ (and therefore between $x_0$ and $x$) such that $g^{(n+1)}(x_{n+1}) = 0$.
Now we simply let $c := x_{n+1}$. We compute the $(n+1)$th derivative of $g$ to find
\[ g^{(n+1)}(s) = f^{(n+1)}(s) - (n+1)! \, M . \]
Plugging in $c$ for $s$ we obtain that $M = \frac{f^{(n+1)}(c)}{(n+1)!}$, and we are done.
In the proof we have computed that $P_n^{(k)}(x_0) = f^{(k)}(x_0)$ for $k = 0, 1, 2, \ldots, n$. Therefore the Taylor polynomial has the same derivatives as $f$ at $x_0$ up to the $n$th derivative. That is why the Taylor polynomial is a good approximation to $f$.
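As a numerical illustration of Taylor's theorem, the following Python sketch evaluates the Taylor polynomials of $\sin$ at $x_0 = 0$ and compares the actual error with the bound coming from the Lagrange form of the remainder. The choice of $\sin$, the function names `taylor_sin` and `lagrange_bound`, and the sample point $x = 0.5$ are assumptions made only for this example. The bound uses the fact that every derivative of $\sin$ is at most 1 in absolute value.

```python
import math

def taylor_sin(x, n, x0=0.0):
    """Evaluate the nth Taylor polynomial of sin at x0 (hypothetical helper)."""
    # The derivatives of sin repeat with period 4: sin, cos, -sin, -cos.
    derivs = [math.sin(x0), math.cos(x0), -math.sin(x0), -math.cos(x0)]
    return sum(derivs[k % 4] / math.factorial(k) * (x - x0) ** k
               for k in range(n + 1))

def lagrange_bound(x, n, x0=0.0):
    """Bound for |R_n(x)|, using |f^(n+1)(c)| <= 1 when f = sin."""
    return abs(x - x0) ** (n + 1) / math.factorial(n + 1)

x = 0.5
for n in (1, 3, 5, 7):
    error = abs(math.sin(x) - taylor_sin(x, n))
    print(f"n={n}: error={error:.3e}, bound={lagrange_bound(x, n):.3e}")
```

In each printed row the error should not exceed the bound, and both shrink quickly as $n$ grows, which is the sense in which the Taylor polynomial approximates the function near $x_0$.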
In simple terms, a differentiable function is locally approximated by a line; that is the definition of the derivative. There does exist a converse to Taylor’s theorem, which we will neither state nor prove, saying that if a function is locally approximated in a certain way by a polynomial of degree $d$, then it has $d$ derivatives.
4.3.3 Exercises
Exercise 4.3.1: Compute the nth Taylor Polynomial at 0 for the exponential function.
Exercise 4.3.2: Suppose that $p$ is a polynomial of degree $d$. Given any $x_0 \in \mathbb{R}$, show that the $(d+1)$th Taylor polynomial for $p$ at $x_0$ is equal to $p$.
Exercise 4.3.3: Let $f(x) := |x|^3$. Compute $f'(x)$ and $f''(x)$ for all $x$, but show that $f^{(3)}(0)$ does not exist.
Chapter 5
The Riemann Integral
5.1 The Riemann integral
Note: 1.5 lectures
We now get to the fundamental concept of an integral. There is often confusion among students of calculus between “integral” and “antiderivative.” The integral is (informally) the area under the curve, nothing else. That we can compute an antiderivative using the integral is a nontrivial result we have to prove. In this chapter we will define the Riemann integral∗ using the Darboux integral†, which is technically simpler than (and equivalent to) the traditional definition as done by Riemann.
5.1.1 Partitions and lower and upper integrals
We want to integrate a bounded function deﬁned on an interval [a, b]. We ﬁrst deﬁne two auxiliary
integrals that can be deﬁned for all bounded functions. Only then can we talk about the Riemann
integral and the Riemann integrable functions.
Definition 5.1.1. A partition $P$ of the interval $[a,b]$ is a finite sequence of points $x_0, x_1, x_2, \ldots, x_n$ such that
\[ a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b . \]
We write
\[ \Delta x_i := x_i - x_{i-1} . \]
We say $P$ is of size $n$.
∗ Named after the German mathematician Georg Friedrich Bernhard Riemann (1826–1866).
† Named after the French mathematician Jean-Gaston Darboux (1842–1917).
Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Let $P$ be a partition of $[a,b]$. Define
\begin{align*}
m_i & := \inf \{ f(x) : x_{i-1} \leq x \leq x_i \} , \\
M_i & := \sup \{ f(x) : x_{i-1} \leq x \leq x_i \} , \\
L(P,f) & := \sum_{i=1}^n m_i \Delta x_i , \\
U(P,f) & := \sum_{i=1}^n M_i \Delta x_i .
\end{align*}
We call $L(P,f)$ the lower Darboux sum and $U(P,f)$ the upper Darboux sum.
The geometric idea of Darboux sums is indicated in Figure 5.1. The lower sum is the area of the shaded rectangles, and the upper sum is the area of the entire rectangles. The width of the $i$th rectangle is $\Delta x_i$, the height of the shaded rectangle is $m_i$ and the height of the entire rectangle is $M_i$.
Figure 5.1: Sample Darboux sums.
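To make the definition of the Darboux sums concrete, here is a small Python sketch. It relies on a simplifying assumption that is not part of the definition: the function is nondecreasing, so that on each subinterval the infimum $m_i$ is the value at the left endpoint and the supremum $M_i$ is the value at the right endpoint. For a general bounded function the infimum and supremum cannot be read off from finitely many samples. The helper name `darboux_sums_nondecreasing` and the sample function $x^2$ on $[0,1]$ are chosen only for illustration.

```python
def darboux_sums_nondecreasing(f, partition):
    """Return (L(P, f), U(P, f)) for a nondecreasing function f.

    Assumption: f is nondecreasing, so on [x_{i-1}, x_i] we have
    m_i = f(x_{i-1}) and M_i = f(x_i).
    """
    lower = 0.0
    upper = 0.0
    for left, right in zip(partition, partition[1:]):
        dx = right - left            # this is Delta x_i
        lower += f(left) * dx        # m_i * Delta x_i
        upper += f(right) * dx       # M_i * Delta x_i
    return lower, upper

def square(x):
    return x * x

P = [0.0, 0.5, 1.0]
L, U = darboux_sums_nondecreasing(square, P)
print(L, U)  # 0.125 0.625
```

Here $0 \leq x^2 \leq 1$ on $[0,1]$, and the computed sums lie between $0 \cdot (1-0)$ and $1 \cdot (1-0)$, with the lower sum below the upper sum.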
Proposition 5.1.2. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Let $m, M \in \mathbb{R}$ be such that for all $x$ we have $m \leq f(x) \leq M$. For any partition $P$ of $[a,b]$ we have
\[ m(b-a) \leq L(P,f) \leq U(P,f) \leq M(b-a) . \tag{5.1} \]
Proof. Let $P$ be a partition. Then note that $m \leq m_i$ for all $i$ and $M_i \leq M$ for all $i$. Also $m_i \leq M_i$ for all $i$. Finally $\sum_{i=1}^n \Delta x_i = (b-a)$. Therefore,
\[
m(b-a) = m \sum_{i=1}^n \Delta x_i = \sum_{i=1}^n m \Delta x_i \leq \sum_{i=1}^n m_i \Delta x_i \leq \sum_{i=1}^n M_i \Delta x_i \leq \sum_{i=1}^n M \Delta x_i = M \sum_{i=1}^n \Delta x_i = M(b-a) .
\]
Hence we get (5.1). In other words, the set of lower sums and the set of upper sums are bounded sets.
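Definition 5.1.3. Now that we know that the sets of lower and upper Darboux sums are bounded, define
\begin{align*}
\underline{\int_a^b} f(x)\,dx & := \sup \{ L(P,f) : P \text{ a partition of } [a,b] \} , \\
\overline{\int_a^b} f(x)\,dx & := \inf \{ U(P,f) : P \text{ a partition of } [a,b] \} .
\end{align*}
We call $\underline{\int}$ the lower Darboux integral and $\overline{\int}$ the upper Darboux integral. To avoid worrying about the variable of integration, we will often simply write
\[ \underline{\int_a^b} f := \underline{\int_a^b} f(x)\,dx \qquad \text{and} \qquad \overline{\int_a^b} f := \overline{\int_a^b} f(x)\,dx . \]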
It is not clear from the deﬁnition when the lower and upper Darboux integrals are the same
number. In general they can be different.
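Example 5.1.4: Take the Dirichlet function $f \colon [0,1] \to \mathbb{R}$, where $f(x) := 1$ if $x \in \mathbb{Q}$ and $f(x) := 0$ if $x \notin \mathbb{Q}$. Then
\[ \underline{\int_0^1} f = 0 \qquad \text{and} \qquad \overline{\int_0^1} f = 1 . \]
The reason is that for every $i$ we have that $m_i = \inf \{ f(x) : x \in [x_{i-1},x_i] \} = 0$ and $\sup \{ f(x) : x \in [x_{i-1},x_i] \} = 1$. Thus
\begin{align*}
L(P,f) & = \sum_{i=1}^n 0 \cdot \Delta x_i = 0 , \\
U(P,f) & = \sum_{i=1}^n 1 \cdot \Delta x_i = \sum_{i=1}^n \Delta x_i = 1 .
\end{align*}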
Remark 5.1.5. The same definition is used when $f$ is defined on a larger set $S$ such that $[a,b] \subset S$. In that case, we use the restriction of $f$ to $[a,b]$ and we must ensure that the restriction is bounded on $[a,b]$.
To compute the integral we will often take a partition P and make it ﬁner. That is, we will cut
intervals in the partition into yet smaller pieces.
Definition 5.1.6. Let $P := \{ x_0, x_1, \ldots, x_n \}$ and $\tilde{P} := \{ \tilde{x}_0, \tilde{x}_1, \ldots, \tilde{x}_m \}$ be partitions of $[a,b]$. We say $\tilde{P}$ is a refinement of $P$ if as sets $P \subset \tilde{P}$.
That is, $\tilde{P}$ is a refinement of a partition if it contains all the points in $P$ and perhaps some other points in between. For example, $\{ 0, 0.5, 1, 2 \}$ is a partition of $[0,2]$ and $\{ 0, 0.2, 0.5, 1, 1.5, 1.75, 2 \}$ is a refinement. The main reason for introducing refinements is the following proposition.
Proposition 5.1.7. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function, and let $P$ be a partition of $[a,b]$. Let $\tilde{P}$ be a refinement of $P$. Then
\[ L(P,f) \leq L(\tilde{P},f) \qquad \text{and} \qquad U(\tilde{P},f) \leq U(P,f) . \]
Proof. The tricky part of this proof is to get the notation correct. Let $\tilde{P} := \{ \tilde{x}_0, \tilde{x}_1, \ldots, \tilde{x}_m \}$ be a refinement of $P := \{ x_0, x_1, \ldots, x_n \}$. Then $x_0 = \tilde{x}_0$ and $x_n = \tilde{x}_m$. In fact, we can find integers $k_0 < k_1 < \cdots < k_n$ such that $x_j = \tilde{x}_{k_j}$ for $j = 0, 1, 2, \ldots, n$.
Let $\Delta \tilde{x}_j := \tilde{x}_j - \tilde{x}_{j-1}$. We get that
\[ \Delta x_j = \sum_{p=k_{j-1}+1}^{k_j} \Delta \tilde{x}_p . \]
Let $m_j$ be as before and correspond to the partition $P$. Let $\tilde{m}_j := \inf \{ f(x) : \tilde{x}_{j-1} \leq x \leq \tilde{x}_j \}$. Now, $m_j \leq \tilde{m}_p$ for $k_{j-1} < p \leq k_j$. Therefore,
\[
m_j \Delta x_j = m_j \sum_{p=k_{j-1}+1}^{k_j} \Delta \tilde{x}_p = \sum_{p=k_{j-1}+1}^{k_j} m_j \Delta \tilde{x}_p \leq \sum_{p=k_{j-1}+1}^{k_j} \tilde{m}_p \Delta \tilde{x}_p .
\]
So
\[
L(P,f) = \sum_{j=1}^n m_j \Delta x_j \leq \sum_{j=1}^n \sum_{p=k_{j-1}+1}^{k_j} \tilde{m}_p \Delta \tilde{x}_p = \sum_{j=1}^m \tilde{m}_j \Delta \tilde{x}_j = L(\tilde{P},f) .
\]
The proof of $U(\tilde{P},f) \leq U(P,f)$ is left as an exercise.
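The effect of refinement can be checked numerically. The Python sketch below uses the partition $\{0, 0.5, 1, 2\}$ of $[0,2]$ and its refinement $\{0, 0.2, 0.5, 1, 1.5, 1.75, 2\}$ mentioned above, together with the exponential function, which is an arbitrary choice made for this illustration. As an assumption, the helper only handles nondecreasing functions, for which the infimum and supremum on a subinterval are the endpoint values.

```python
import math

def darboux_sums_nondecreasing(f, partition):
    """Return (L(P, f), U(P, f)), assuming f is nondecreasing."""
    lower = 0.0
    upper = 0.0
    for left, right in zip(partition, partition[1:]):
        dx = right - left
        lower += f(left) * dx    # infimum is attained at the left endpoint
        upper += f(right) * dx   # supremum is attained at the right endpoint
    return lower, upper

P = [0, 0.5, 1, 2]
P_refined = [0, 0.2, 0.5, 1, 1.5, 1.75, 2]   # contains every point of P

L_P, U_P = darboux_sums_nondecreasing(math.exp, P)
L_R, U_R = darboux_sums_nondecreasing(math.exp, P_refined)

# Refining raises the lower sum and lowers the upper sum.
print(L_P <= L_R <= U_R <= U_P)  # True
```

This is only a check of one instance, not a proof. The value of the integral, $e^2 - 1$, lies between the two refined sums.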
Armed with refinements we can prove the following. The key point of this proposition is the inequality that says that the lower Darboux integral is less than or equal to the upper Darboux integral.
Proposition 5.1.8. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Let $m, M \in \mathbb{R}$ be such that for all $x$ we have $m \leq f(x) \leq M$. Then
\[ m(b-a) \leq \underline{\int_a^b} f \leq \overline{\int_a^b} f \leq M(b-a) . \tag{5.2} \]
Proof. By Proposition 5.1.2 we have for any partition $P$
\[ m(b-a) \leq L(P,f) \leq U(P,f) \leq M(b-a) . \]
The inequality $m(b-a) \leq L(P,f)$ implies $m(b-a) \leq \underline{\int_a^b} f$. Also $U(P,f) \leq M(b-a)$ implies $\overline{\int_a^b} f \leq M(b-a)$.
The key point of this proposition is the middle inequality in (5.2). Let $P_1, P_2$ be partitions of $[a,b]$. Define the partition $\tilde{P} := P_1 \cup P_2$. The set $\tilde{P}$ is a partition of $[a,b]$. Furthermore, $\tilde{P}$ is a refinement of $P_1$ and it is also a refinement of $P_2$. By Proposition 5.1.7 we have $L(P_1,f) \leq L(\tilde{P},f)$ and $U(\tilde{P},f) \leq U(P_2,f)$. Putting it all together we have
\[ L(P_1,f) \leq L(\tilde{P},f) \leq U(\tilde{P},f) \leq U(P_2,f) . \]
In other words, for two arbitrary partitions $P_1$ and $P_2$ we have $L(P_1,f) \leq U(P_2,f)$. Now we recall Proposition 1.2.8. Taking the supremum and infimum over all partitions we get
\[ \sup \{ L(P,f) : P \text{ a partition} \} \leq \inf \{ U(P,f) : P \text{ a partition} \} . \]
In other words $\underline{\int_a^b} f \leq \overline{\int_a^b} f$.
5.1.2 Riemann integral
We can ﬁnally deﬁne the Riemann integral. However, the Riemann integral is only deﬁned on a
certain class of functions, called the Riemann integrable functions.
Definition 5.1.9. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Suppose that
\[ \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx . \]
Then $f$ is said to be Riemann integrable. The set of Riemann integrable functions on $[a,b]$ is denoted by $\mathcal{R}[a,b]$. When $f \in \mathcal{R}[a,b]$ we define
\[ \int_a^b f(x)\,dx := \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx . \]
As before, we often simply write
\[ \int_a^b f := \int_a^b f(x)\,dx . \]
The number $\int_a^b f$ is called the Riemann integral of $f$, or sometimes simply the integral of $f$.
By appealing to Proposition 5.1.8 we immediately obtain the following proposition.
Proposition 5.1.10. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded Riemann integrable function. Let $m, M \in \mathbb{R}$ be such that $m \leq f(x) \leq M$ for all $x \in [a,b]$. Then
\[ m(b-a) \leq \int_a^b f \leq M(b-a) . \]
Often we will use a weaker form of this proposition. That is, if $|f(x)| \leq M$ for all $x \in [a,b]$, then
\[ \left| \int_a^b f \right| \leq M(b-a) . \]
Example 5.1.11: We can also integrate constant functions using Proposition 5.1.8. If $f(x) := c$ for some constant $c$, then we can take $m = M = c$. Then in the inequality (5.2) all the inequalities must be equalities. Thus $f$ is integrable on $[a,b]$ and $\int_a^b f = c(b-a)$.
Example 5.1.12: Let $f \colon [0,2] \to \mathbb{R}$ be defined by
\[
f(x) :=
\begin{cases}
1 & \text{if } x < 1, \\
1/2 & \text{if } x = 1, \\
0 & \text{if } x > 1.
\end{cases}
\]
We claim that $f$ is Riemann integrable and that $\int_0^2 f = 1$.
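Proof: Let $0 < \epsilon < 1$ be arbitrary. Let $P := \{ 0, 1-\epsilon, 1+\epsilon, 2 \}$ be a partition. We will use the notation from the definition of the Darboux sums. Then
\begin{align*}
m_1 & = \inf \{ f(x) : x \in [0,1-\epsilon] \} = 1 , & M_1 & = \sup \{ f(x) : x \in [0,1-\epsilon] \} = 1 , \\
m_2 & = \inf \{ f(x) : x \in [1-\epsilon,1+\epsilon] \} = 0 , & M_2 & = \sup \{ f(x) : x \in [1-\epsilon,1+\epsilon] \} = 1 , \\
m_3 & = \inf \{ f(x) : x \in [1+\epsilon,2] \} = 0 , & M_3 & = \sup \{ f(x) : x \in [1+\epsilon,2] \} = 0 .
\end{align*}
Furthermore, $\Delta x_1 = 1-\epsilon$, $\Delta x_2 = 2\epsilon$ and $\Delta x_3 = 1-\epsilon$. We compute
\begin{align*}
L(P,f) & = \sum_{i=1}^3 m_i \Delta x_i = 1 \cdot (1-\epsilon) + 0 \cdot 2\epsilon + 0 \cdot (1-\epsilon) = 1-\epsilon , \\
U(P,f) & = \sum_{i=1}^3 M_i \Delta x_i = 1 \cdot (1-\epsilon) + 1 \cdot 2\epsilon + 0 \cdot (1-\epsilon) = 1+\epsilon .
\end{align*}
Thus,
\[ \overline{\int_0^2} f - \underline{\int_0^2} f \leq U(P,f) - L(P,f) = (1+\epsilon) - (1-\epsilon) = 2\epsilon . \]
By Proposition 5.1.8 we have $\underline{\int_0^2} f \leq \overline{\int_0^2} f$. As $\epsilon$ was arbitrary we see that $\overline{\int_0^2} f = \underline{\int_0^2} f$. So $f$ is Riemann integrable. Finally,
\[ 1-\epsilon = L(P,f) \leq \int_0^2 f \leq U(P,f) = 1+\epsilon . \]
Hence, $\left| \int_0^2 f - 1 \right| \leq \epsilon$. As $\epsilon$ was arbitrary, we have that $\int_0^2 f = 1$.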
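The computation in Example 5.1.12 can be reproduced with a few lines of Python. The sketch relies on an observation that is not used in the text: this particular $f$ is nonincreasing, so on each closed subinterval the infimum is the value at the right endpoint and the supremum is the value at the left endpoint. The helper name `darboux_sums_nonincreasing` and the sample values of $\epsilon$ are assumptions made for the illustration.

```python
def f(x):
    """The step function from Example 5.1.12."""
    if x < 1:
        return 1.0
    if x == 1:
        return 0.5
    return 0.0

def darboux_sums_nonincreasing(f, partition):
    """Return (L(P, f), U(P, f)), assuming f is nonincreasing."""
    lower = 0.0
    upper = 0.0
    for left, right in zip(partition, partition[1:]):
        dx = right - left
        lower += f(right) * dx   # infimum is attained at the right endpoint
        upper += f(left) * dx    # supremum is attained at the left endpoint
    return lower, upper

for eps in (0.25, 0.1, 0.01):
    P = [0.0, 1.0 - eps, 1.0 + eps, 2.0]
    L, U = darboux_sums_nonincreasing(f, P)
    print(f"eps={eps}: L={L:.4f}, U={U:.4f}, U-L={U - L:.4f}")
```

Up to rounding, the printed values are $1-\epsilon$, $1+\epsilon$ and $2\epsilon$, so the lower and upper sums squeeze the value 1 as $\epsilon$ shrinks.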
5.1.3 More notation
When $f \colon S \to \mathbb{R}$ is defined on a larger set $S$ and $[a,b] \subset S$, we write $\int_a^b f$ to mean the Riemann integral of the restriction of $f$ to $[a,b]$ (provided the restriction is Riemann integrable of course). Furthermore, when $f \colon S \to \mathbb{R}$ is a function and $[a,b] \subset S$, we say that $f$ is Riemann integrable on $[a,b]$ if the restriction of $f$ to $[a,b]$ is Riemann integrable.
It will be useful to define the integral $\int_a^b f$ even if $a > b$. Suppose that $b < a$ and that $f \in \mathcal{R}[b,a]$. Then define
\[ \int_a^b f := - \int_b^a f . \]
Also for any function $f$ we define
\[ \int_a^a f := 0 . \]
At times, the variable $x$ will already have some meaning. When we need to write down the variable of integration, we may simply use a different letter. For example,
\[ \int_a^b f(s)\,ds := \int_a^b f(x)\,dx . \]
5.1.4 Exercises
Exercise 5.1.1: Let $f \colon [0,1] \to \mathbb{R}$ be defined by $f(x) := x^3$ and let $P := \{ 0, 0.1, 0.4, 1 \}$. Compute $L(P,f)$ and $U(P,f)$.
Exercise 5.1.2: Let $f \colon [0,1] \to \mathbb{R}$ be defined by $f(x) := x$. Compute $\int_0^1 f$ using the definition of the integral (but feel free to use Proposition 5.1.8).
Exercise 5.1.3: Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Suppose that there exists a sequence of partitions $\{ P_k \}$ of $[a,b]$ such that
\[ \lim_{k\to\infty} \bigl( U(P_k,f) - L(P_k,f) \bigr) = 0 . \]
Show that $f$ is Riemann integrable and that
\[ \int_a^b f = \lim_{k\to\infty} U(P_k,f) = \lim_{k\to\infty} L(P_k,f) . \]
Exercise 5.1.4: Finish proof of Proposition 5.1.7.
Exercise 5.1.5: Suppose that $f \colon [-1,1] \to \mathbb{R}$ is defined as
\[
f(x) :=
\begin{cases}
1 & \text{if } x > 0, \\
0 & \text{if } x \leq 0.
\end{cases}
\]
Prove that $f \in \mathcal{R}[-1,1]$ and compute $\int_{-1}^1 f$ using the definition of the integral (feel free to use Proposition 5.1.8).
Exercise 5.1.6: Let $c \in (a,b)$ and let $d \in \mathbb{R}$. Define $f \colon [a,b] \to \mathbb{R}$ as
\[
f(x) :=
\begin{cases}
d & \text{if } x = c, \\
0 & \text{if } x \neq c.
\end{cases}
\]
Prove that $f \in \mathcal{R}[a,b]$ and compute $\int_a^b f$ using the definition of the integral (feel free to use Proposition 5.1.8).
Exercise 5.1.7: Suppose that $f \colon [a,b] \to \mathbb{R}$ is Riemann integrable. Let $\epsilon > 0$ be given. Then show that there exists a partition $P = \{ x_0, x_1, \ldots, x_n \}$ such that if we pick any set of points $\{ c_1, c_2, \ldots, c_n \}$ where $c_k \in [x_{k-1},x_k]$, then
\[ \left| \int_a^b f - \sum_{k=1}^n f(c_k) \Delta x_k \right| < \epsilon . \]
5.2 Properties of the integral
Note: 2 lectures
5.2.1 Additivity
The next result we prove is usually referred to as the additive property of the integral. First we prove
the additivity property for the lower and upper Darboux integrals.
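Lemma 5.2.1. Suppose $a < b < c$ and $f \colon [a,c] \to \mathbb{R}$ is a bounded function. Then
\[ \underline{\int_a^c} f = \underline{\int_a^b} f + \underline{\int_b^c} f \]
and
\[ \overline{\int_a^c} f = \overline{\int_a^b} f + \overline{\int_b^c} f . \]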
Proof. If we have partitions $P_1 := \{ x_0, x_1, \ldots, x_k \}$ of $[a,b]$ and $P_2 := \{ x_k, x_{k+1}, \ldots, x_n \}$ of $[b,c]$, then we have a partition $P := \{ x_0, x_1, \ldots, x_n \}$ of $[a,c]$ (simply taking the union of $P_1$ and $P_2$). Then
\[
L(P,f) = \sum_{j=1}^n m_j \Delta x_j = \sum_{j=1}^k m_j \Delta x_j + \sum_{j=k+1}^n m_j \Delta x_j = L(P_1,f) + L(P_2,f) .
\]
When we take the supremum over all $P_1$ and $P_2$, we are taking a supremum over all partitions $P$ of $[a,c]$ that contain $b$. If $Q$ is a partition of $[a,c]$ such that $P = Q \cup \{ b \}$, then $P$ is a refinement of $Q$ and so $L(Q,f) \leq L(P,f)$. Therefore, taking a supremum only over the $P$ such that $P$ contains $b$ is sufficient to find the supremum of $L(P,f)$. Therefore we obtain
\begin{align*}
\underline{\int_a^c} f & = \sup \{ L(P,f) : P \text{ a partition of } [a,c] \} \\
& = \sup \{ L(P,f) : P \text{ a partition of } [a,c], b \in P \} \\
& = \sup \{ L(P_1,f) + L(P_2,f) : P_1 \text{ a partition of } [a,b], P_2 \text{ a partition of } [b,c] \} \\
& = \sup \{ L(P_1,f) : P_1 \text{ a partition of } [a,b] \} + \sup \{ L(P_2,f) : P_2 \text{ a partition of } [b,c] \} \\
& = \underline{\int_a^b} f + \underline{\int_b^c} f .
\end{align*}
Similarly, for $P$, $P_1$, and $P_2$ as above we obtain
\[
U(P,f) = \sum_{j=1}^n M_j \Delta x_j = \sum_{j=1}^k M_j \Delta x_j + \sum_{j=k+1}^n M_j \Delta x_j = U(P_1,f) + U(P_2,f) .
\]
We wish to take the infimum on the right over all $P_1$ and $P_2$, and so we are taking the infimum over all partitions $P$ of $[a,c]$ that contain $b$. If $Q$ is a partition of $[a,c]$ such that $P = Q \cup \{ b \}$, then $P$ is a refinement of $Q$ and so $U(Q,f) \geq U(P,f)$. Therefore, taking an infimum only over the $P$ such that $P$ contains $b$ is sufficient to find the infimum of $U(P,f)$. We obtain
\[ \overline{\int_a^c} f = \overline{\int_a^b} f + \overline{\int_b^c} f . \]
Theorem 5.2.2. Let $a < b < c$. A function $f \colon [a,c] \to \mathbb{R}$ is Riemann integrable if and only if $f$ is Riemann integrable on $[a,b]$ and $[b,c]$. If $f$ is Riemann integrable, then
\[ \int_a^c f = \int_a^b f + \int_b^c f . \]
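Proof. Suppose that $f \in \mathcal{R}[a,c]$, then $\overline{\int_a^c} f = \underline{\int_a^c} f = \int_a^c f$. We apply the lemma to get
\[
\int_a^c f = \underline{\int_a^c} f = \underline{\int_a^b} f + \underline{\int_b^c} f \leq \overline{\int_a^b} f + \overline{\int_b^c} f = \overline{\int_a^c} f = \int_a^c f .
\]
Thus the inequality is an equality and
\[ \underline{\int_a^b} f + \underline{\int_b^c} f = \overline{\int_a^b} f + \overline{\int_b^c} f . \]
As we also know that $\underline{\int_a^b} f \leq \overline{\int_a^b} f$ and $\underline{\int_b^c} f \leq \overline{\int_b^c} f$, we can conclude that
\[ \underline{\int_a^b} f = \overline{\int_a^b} f \qquad \text{and} \qquad \underline{\int_b^c} f = \overline{\int_b^c} f . \]
Thus $f$ is Riemann integrable on $[a,b]$ and $[b,c]$ and the desired formula holds.
Now assume that the restrictions of $f$ to $[a,b]$ and to $[b,c]$ are Riemann integrable. We again apply the lemma to get
\[
\underline{\int_a^c} f = \underline{\int_a^b} f + \underline{\int_b^c} f = \int_a^b f + \int_b^c f = \overline{\int_a^b} f + \overline{\int_b^c} f = \overline{\int_a^c} f .
\]
Therefore $f$ is Riemann integrable on $[a,c]$, and the integral is computed as indicated.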
An easy consequence of the additivity is the following corollary. We leave the details to the
reader as an exercise.
Corollary 5.2.3. If $f \in \mathcal{R}[a,b]$ and $[c,d] \subset [a,b]$, then the restriction $f|_{[c,d]}$ is in $\mathcal{R}[c,d]$.
5.2.2 Linearity and monotonicity
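Proposition 5.2.4 (Linearity). Let $f$ and $g$ be in $\mathcal{R}[a,b]$ and $\alpha \in \mathbb{R}$.
(i) $\alpha f$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx . \]
(ii) $f+g$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b \bigl( f(x)+g(x) \bigr)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx . \]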
Proof. Let us prove the first item. First suppose that $\alpha \geq 0$. For a partition $P$ we notice that (details are left to the reader)
\[ L(P,\alpha f) = \alpha L(P,f) \qquad \text{and} \qquad U(P,\alpha f) = \alpha U(P,f) . \]
We know that for a bounded set of real numbers we can move multiplication by a positive number $\alpha$ past the supremum. Hence,
\begin{align*}
\underline{\int_a^b} \alpha f(x)\,dx & = \sup \{ L(P,\alpha f) : P \text{ a partition} \} \\
& = \sup \{ \alpha L(P,f) : P \text{ a partition} \} \\
& = \alpha \sup \{ L(P,f) : P \text{ a partition} \} \\
& = \alpha \underline{\int_a^b} f(x)\,dx .
\end{align*}
Similarly we show that
\[ \overline{\int_a^b} \alpha f(x)\,dx = \alpha \overline{\int_a^b} f(x)\,dx . \]
The conclusion now follows for $\alpha \geq 0$.
To finish the proof of the first item, we need to show that $\int_a^b -f(x)\,dx = - \int_a^b f(x)\,dx$. The proof of this fact is left as an exercise.
The proof of the second item is also left as an exercise (it is not as trivial as it may appear at first glance).
Proposition 5.2.5 (Monotonicity). Let $f$ and $g$ be in $\mathcal{R}[a,b]$ and let $f(x) \leq g(x)$ for all $x \in [a,b]$. Then
\[ \int_a^b f \leq \int_a^b g . \]
Proof. Let $P$ be a partition of $[a,b]$. Then let
\[ m_i := \inf \{ f(x) : x \in [x_{i-1},x_i] \} \qquad \text{and} \qquad \tilde{m}_i := \inf \{ g(x) : x \in [x_{i-1},x_i] \} . \]
As $f(x) \leq g(x)$, we have $m_i \leq \tilde{m}_i$. Therefore,
\[ L(P,f) = \sum_{i=1}^n m_i \Delta x_i \leq \sum_{i=1}^n \tilde{m}_i \Delta x_i = L(P,g) . \]
We can now take the supremum over all $P$ to obtain that
\[ \underline{\int_a^b} f \leq \underline{\int_a^b} g . \]
As $f$ and $g$ are Riemann integrable, the conclusion follows.
5.2.3 Continuous functions
We say that a function $f \colon [a,b] \to \mathbb{R}$ has finitely many discontinuities if there exists a finite set $S := \{ x_1, x_2, \ldots, x_n \} \subset [a,b]$, $A := [a,b] \setminus S$, and the restriction $f|_A$ is continuous. Before we prove that bounded functions with finitely many discontinuities are Riemann integrable, we need some lemmas. The first lemma says that bounded continuous functions are Riemann integrable.
Lemma 5.2.6. Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Then $f \in \mathcal{R}[a,b]$.
Proof. As $f$ is continuous on a closed bounded interval, it is uniformly continuous. Let $\epsilon > 0$ be given. Then find a $\delta > 0$ such that $|x-y| < \delta$ implies $|f(x)-f(y)| < \frac{\epsilon}{b-a}$.
Let $P := \{ x_0, x_1, \ldots, x_n \}$ be a partition of $[a,b]$ such that $\Delta x_i < \delta$ for all $i = 1, 2, \ldots, n$. For example, take $n$ such that $\frac{b-a}{n} < \delta$ and let $x_i := \frac{i}{n}(b-a) + a$. Then for all $x, y \in [x_{i-1},x_i]$ we have that $|x-y| \leq \Delta x_i < \delta$ and hence
\[ f(x) - f(y) < \frac{\epsilon}{b-a} . \]
As $f$ is continuous on $[x_{i-1},x_i]$ it attains a maximum and a minimum. Let $x$ be the point where $f$ attains the maximum and $y$ be the point where $f$ attains the minimum. Then $f(x) = M_i$ and $f(y) = m_i$ in the notation from the definition of the integral. Therefore,
\[ M_i - m_i < \frac{\epsilon}{b-a} . \]
And so
\begin{align*}
\overline{\int_a^b} f - \underline{\int_a^b} f & \leq U(P,f) - L(P,f) \\
& = \sum_{i=1}^n M_i \Delta x_i - \sum_{i=1}^n m_i \Delta x_i \\
& = \sum_{i=1}^n (M_i - m_i) \Delta x_i \\
& < \frac{\epsilon}{b-a} \sum_{i=1}^n \Delta x_i \\
& = \frac{\epsilon}{b-a} (b-a) = \epsilon .
\end{align*}
As $\epsilon > 0$ was arbitrary,
\[ \overline{\int_a^b} f = \underline{\int_a^b} f , \]
and $f$ is Riemann integrable on $[a,b]$.
The second lemma says that we need the function only to be “Riemann integrable inside the interval,” as long as it is bounded. It also tells us how to compute the integral.
Lemma 5.2.7. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function that is Riemann integrable on $[a',b']$ for all $a', b'$ such that $a < a' < b' < b$. Then $f \in \mathcal{R}[a,b]$. Furthermore, if $a < a_n < b_n < b$ are such that $\lim a_n = a$ and $\lim b_n = b$, then
\[ \int_a^b f = \lim_{n\to\infty} \int_{a_n}^{b_n} f . \]
Proof. Let $M > 0$ be a real number such that $|f(x)| \leq M$. Pick two sequences of numbers $a < a_n < b_n < b$ such that $\lim a_n = a$ and $\lim b_n = b$. Then Lemma 5.2.1 says that the lower and upper integral are additive and the hypothesis says that $f$ is integrable on $[a_n,b_n]$. Therefore
\[
\underline{\int_a^b} f = \underline{\int_a^{a_n}} f + \int_{a_n}^{b_n} f + \underline{\int_{b_n}^b} f \geq -M(a_n-a) + \int_{a_n}^{b_n} f - M(b-b_n) .
\]
Note that $M > 0$ and $(b-a) \geq (b_n-a_n)$. We thus have
\[ -M(b-a) \leq -M(b_n-a_n) \leq \int_{a_n}^{b_n} f \leq M(b_n-a_n) \leq M(b-a) . \]
Thus the sequence of numbers $\left\{ \int_{a_n}^{b_n} f \right\}$ is bounded and hence by Bolzano-Weierstrass has a convergent subsequence indexed by $n_k$. Let us call $L$ the limit of the subsequence $\left\{ \int_{a_{n_k}}^{b_{n_k}} f \right\}$. We look at
\[ \underline{\int_a^b} f \geq -M(a_{n_k}-a) + \int_{a_{n_k}}^{b_{n_k}} f - M(b-b_{n_k}) \]
and we take the limit on the right-hand side to obtain
\[ \underline{\int_a^b} f \geq -M \cdot 0 + L - M \cdot 0 = L . \]
Next use the additivity of the upper integral to obtain
\[
\overline{\int_a^b} f = \overline{\int_a^{a_n}} f + \int_{a_n}^{b_n} f + \overline{\int_{b_n}^b} f \leq M(a_n-a) + \int_{a_n}^{b_n} f + M(b-b_n) .
\]
We take the same subsequence $\left\{ \int_{a_{n_k}}^{b_{n_k}} f \right\}$ and take the limit of the inequality
\[ \overline{\int_a^b} f \leq M(a_{n_k}-a) + \int_{a_{n_k}}^{b_{n_k}} f + M(b-b_{n_k}) \]
to obtain
\[ \overline{\int_a^b} f \leq M \cdot 0 + L + M \cdot 0 = L . \]
Thus $\overline{\int_a^b} f = \underline{\int_a^b} f = L$ and hence $f$ is Riemann integrable and $\int_a^b f = L$.
To prove the final statement of the lemma we note that we can use Theorem 2.3.7. We have shown that every convergent subsequence $\left\{ \int_{a_{n_k}}^{b_{n_k}} f \right\}$ converges to $L$. Therefore, the sequence $\left\{ \int_{a_n}^{b_n} f \right\}$ is convergent and converges to $L$.
Theorem 5.2.8. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function with finitely many discontinuities. Then $f \in \mathcal{R}[a,b]$.
Proof. We divide the interval into finitely many intervals $[a_i,b_i]$ so that $f$ is continuous on the interior $(a_i,b_i)$. If $f$ is continuous on $(a_i,b_i)$, then it is continuous and hence integrable on $[c_i,d_i]$ for all $a_i < c_i < d_i < b_i$. By Lemma 5.2.7 the restriction of $f$ to $[a_i,b_i]$ is integrable. By additivity of the integral (and simple induction) $f$ is integrable on the union of the intervals.
Sometimes it is convenient (or necessary) to change certain values of a function and then
integrate. The next result says that if we change the values only at ﬁnitely many points, the integral
does not change.
Proposition 5.2.9. Let $f \colon [a,b] \to \mathbb{R}$ be Riemann integrable. Let $g \colon [a,b] \to \mathbb{R}$ be a function such that $f(x) = g(x)$ for all $x \in [a,b] \setminus S$, where $S$ is a finite set. Then $g$ is a Riemann integrable function and
\[ \int_a^b g = \int_a^b f . \]
Sketch of proof. Using additivity of the integral, we could split up the interval [a, b] into smaller
intervals such that f (x) = g(x) holds for all x except at the endpoints (details are left to the reader).
Therefore, without loss of generality suppose that f (x) = g(x) for all x ∈ (a, b). The proof
follows by Lemma 5.2.7, and is left as an exercise.
5.2.4 Exercises
Exercise 5.2.1: Let $f$ be in $\mathcal{R}[a,b]$. Prove that $-f$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b -f(x)\,dx = - \int_a^b f(x)\,dx . \]
Exercise 5.2.2: Let $f$ and $g$ be in $\mathcal{R}[a,b]$. Prove that $f+g$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b \bigl( f(x)+g(x) \bigr)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx . \]
Hint: Use Proposition 5.1.7 to find a single partition $P$ such that $U(P,f)-L(P,f) < \epsilon/2$ and $U(P,g)-L(P,g) < \epsilon/2$.
Exercise 5.2.3: Let $f \colon [a,b] \to \mathbb{R}$ be Riemann integrable. Let $g \colon [a,b] \to \mathbb{R}$ be a function such that $f(x) = g(x)$ for all $x \in (a,b)$. Prove that $g$ is Riemann integrable and that
\[ \int_a^b g = \int_a^b f . \]
Exercise 5.2.4: Prove the mean value theorem for integrals. That is, prove that if $f \colon [a,b] \to \mathbb{R}$ is continuous, then there exists a $c \in [a,b]$ such that $\int_a^b f = f(c)(b-a)$.
Exercise 5.2.5: Suppose $f \colon [a,b] \to \mathbb{R}$ is a continuous function such that $f(x) \geq 0$ for all $x \in [a,b]$ and $\int_a^b f = 0$. Prove that $f(x) = 0$ for all $x$.
Exercise 5.2.6: Suppose $f \colon [a,b] \to \mathbb{R}$ is a continuous function and $\int_a^b f = 0$. Prove that there exists a $c \in [a,b]$ such that $f(c) = 0$ (compare with the previous exercise).
Exercise 5.2.7: Suppose $f \colon [a,b] \to \mathbb{R}$ and $g \colon [a,b] \to \mathbb{R}$ are continuous functions such that $\int_a^b f = \int_a^b g$. Show that there exists a $c \in [a,b]$ such that $f(c) = g(c)$.
Exercise 5.2.8: Let $f \in \mathcal{R}[a,b]$. Let $\alpha, \beta, \gamma$ be arbitrary numbers in $[a,b]$ (not necessarily ordered in any way). Prove that
\[ \int_\alpha^\gamma f = \int_\alpha^\beta f + \int_\beta^\gamma f . \]
Recall what $\int_a^b f$ means if $b \leq a$.
Exercise 5.2.9: Prove Corollary 5.2.3.
Exercise 5.2.10: Suppose that $f \colon [a,b] \to \mathbb{R}$ has finitely many discontinuities. Show that as a function of $x$ the expression $|f(x)|$ has finitely many discontinuities and is thus Riemann integrable. Then show that
\[ \left| \int_a^b f(x)\,dx \right| \leq \int_a^b |f(x)|\,dx . \]
Exercise 5.2.11 (Hard): Show that the Thomae or popcorn function (see also Example 3.2.12) is Riemann integrable. Therefore, there exists a function discontinuous at all rational numbers (a dense set) that is Riemann integrable.
In particular, define $f \colon [0,1] \to \mathbb{R}$ by
\[
f(x) :=
\begin{cases}
1/k & \text{if } x = m/k \text{ where } m, k \in \mathbb{N} \text{ and } m \text{ and } k \text{ have no common divisors,} \\
0 & \text{if } x \text{ is irrational.}
\end{cases}
\]
Show that $\int_0^1 f = 0$.
If $I \subset \mathbb{R}$ is a bounded interval, then the function
\[
\varphi_I(x) :=
\begin{cases}
1 & \text{if } x \in I, \\
0 & \text{otherwise,}
\end{cases}
\]
is called an elementary step function.
Exercise 5.2.12: Let $I$ be an arbitrary bounded interval (you should consider all types of intervals: closed, open, half-open) and $a < b$, then using only the definition of the integral show that the elementary step function $\varphi_I$ is integrable on $[a,b]$, and find the integral in terms of $a$, $b$, and the endpoints of $I$.
When a function $f$ can be written as
\[ f(x) = \sum_{k=1}^n \alpha_k \varphi_{I_k}(x) \]
for some real numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$ and some bounded intervals $I_1, I_2, \ldots, I_n$, then $f$ is called a step function.
Exercise 5.2.13: Using the previous exercise, show that a step function is integrable on any interval $[a,b]$. Furthermore, find the integral in terms of $a$, $b$, the endpoints of $I_k$ and the $\alpha_k$.
5.3 Fundamental theorem of calculus
Note: 1.5 lectures
In this section we discuss and prove the fundamental theorem of calculus. This is the one theorem on which the entirety of integral calculus is built, hence the name. The theorem relates the seemingly unrelated concepts of integral and derivative. It tells us how to compute the antiderivative of a function using the integral.
5.3.1 First form of the theorem
Theorem 5.3.1. Let $F \colon [a,b] \to \mathbb{R}$ be a continuous function, differentiable on $(a,b)$. Let $f \in \mathcal{R}[a,b]$ be such that $f(x) = F'(x)$ for $x \in (a,b)$. Then
\[ \int_a^b f = F(b) - F(a) . \]
It is not hard to generalize the theorem to allow a ﬁnite number of points in [a, b] where F is not
differentiable, as long as it is continuous. This generalization is left as an exercise.
Proof. Let $P$ be a partition of $[a,b]$. For each interval $[x_{i-1},x_i]$, use the mean value theorem to find a $c_i \in [x_{i-1},x_i]$ such that
\[ f(c_i) \Delta x_i = F'(c_i) (x_i - x_{i-1}) = F(x_i) - F(x_{i-1}) . \]
Using the notation from the definition of the integral, $m_i \leq f(c_i) \leq M_i$. Therefore,
\[ m_i \Delta x_i \leq F(x_i) - F(x_{i-1}) \leq M_i \Delta x_i . \]
We now sum over $i = 1, 2, \ldots, n$ to get
\[ \sum_{i=1}^n m_i \Delta x_i \leq \sum_{i=1}^n \bigl( F(x_i) - F(x_{i-1}) \bigr) \leq \sum_{i=1}^n M_i \Delta x_i . \]
We notice that in the sum all the middle terms cancel and we end up simply with $F(x_n) - F(x_0) = F(b) - F(a)$. The sums on the left and on the right are the lower and the upper sum respectively.
\[ L(P,f) \leq F(b) - F(a) \leq U(P,f) . \]
We can now take the supremum of $L(P,f)$ over all $P$ and the inequality yields
\[ \underline{\int_a^b} f \leq F(b) - F(a) . \]
Similarly, taking the infimum of $U(P,f)$ over all partitions $P$ yields
\[ F(b) - F(a) \leq \overline{\int_a^b} f . \]
As $f$ is Riemann integrable, we have
\[ \int_a^b f = \underline{\int_a^b} f \leq F(b) - F(a) \leq \overline{\int_a^b} f = \int_a^b f . \]
And we are done as the inequalities must be equalities.
The theorem is often used to solve integrals. If we know that the function $f(x)$ is the derivative of some other function $F(x)$, then we can find an explicit expression for $\int_a^b f$.
Example 5.3.2: Suppose we are trying to compute
\[ \int_0^1 x^2\,dx . \]
We notice that $x^2$ is the derivative of $\frac{x^3}{3}$, therefore we use the fundamental theorem to write down
\[ \int_0^1 x^2\,dx = \frac{1^3}{3} - \frac{0^3}{3} = \frac{1}{3} . \]
5.3.2 Second form of the theorem
The second form of the fundamental theorem gives us a way to solve the differential equation $F'(x) = f(x)$, where $f(x)$ is a known function and we are trying to find an $F$ that satisfies the equation.
Theorem 5.3.3. Let $f \colon [a,b] \to \mathbb{R}$ be a Riemann integrable function. Define
\[ F(x) := \int_a^x f . \]
First, $F$ is continuous on $[a,b]$. Second, if $f$ is continuous at $c \in [a,b]$, then $F$ is differentiable at $c$ and $F'(c) = f(c)$.
Proof. First as $f$ is bounded, there is an $M > 0$ such that $|f(x)| \leq M$. Suppose $x, y \in [a,b]$. Then using an exercise from an earlier section we note
\[ |F(x) - F(y)| = \left| \int_a^x f - \int_a^y f \right| = \left| \int_y^x f \right| \leq M |x-y| . \]
Do note that it does not matter if $x < y$ or $x > y$. Therefore $F$ is Lipschitz continuous and hence continuous.
Now suppose that $f$ is continuous at $c$. Let $\epsilon > 0$ be given. Let $\delta > 0$ be such that $|x-c| < \delta$ implies $|f(x) - f(c)| < \epsilon$ for $x \in [a,b]$. In particular for such $x$ we have
\[ f(c) - \epsilon \leq f(x) \leq f(c) + \epsilon . \]
Thus, if $x > c$,
\[ (f(c) - \epsilon)(x-c) \leq \int_c^x f \leq (f(c) + \epsilon)(x-c) . \]
When $c > x$, the inequalities are reversed. In either case, assuming $c \neq x$, dividing by $x-c$ gives
\[ f(c) - \epsilon \leq \frac{\int_c^x f}{x-c} \leq f(c) + \epsilon . \]
As
\[ \frac{F(x) - F(c)}{x-c} = \frac{\int_a^x f - \int_a^c f}{x-c} = \frac{\int_c^x f}{x-c} , \]
we have that
\[ \left| \frac{F(x) - F(c)}{x-c} - f(c) \right| \leq \epsilon . \]
Of course, if $f$ is continuous on $[a,b]$, then it is automatically Riemann integrable, $F$ is differentiable on all of $[a,b]$ and $F'(x) = f(x)$ for all $x \in [a,b]$.
Remark 5.3.4. The second form of the fundamental theorem of calculus still holds if we let $d \in [a,b]$ and define
\[ F(x) := \int_d^x f . \]
That is, we can use any point of $[a,b]$ as our base point. The proof is left as an exercise.
A common misunderstanding of the integral for calculus students is to think of integrals whose solution cannot be given in closed form as somehow deficient. This is not the case. Most integrals we write down are not computable in closed form. Moreover, even some integrals that we consider to be in closed form are not really. For example, how does a computer find the value of $\ln x$? One way to do it is to simply note that we define the natural log as the antiderivative of $1/x$ such that $\ln 1 = 0$. Therefore,
\[ \ln x := \int_1^x 1/s \, ds . \]
Then we can numerically approximate the integral. So morally, we did not really “simplify” $\int_1^x 1/s \, ds$ by writing down $\ln x$. We simply gave the integral a name. If we require numerical answers, it is possible that we will end up doing the calculation by approximating an integral anyway.
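As a sketch of what such a numerical approximation might look like, the following Python code brackets $\ln x$ for $x > 1$ between a lower and an upper Darboux sum of $1/s$ on $[1,x]$. Since $1/s$ is decreasing, the infimum on each subinterval is the value at the right endpoint and the supremum is the value at the left endpoint. The function name `ln_bounds`, the uniform partition, and the number of subintervals are assumptions made for this example. This is not how production math libraries compute logarithms.

```python
import math

def ln_bounds(x, n=1000):
    """Lower and upper Darboux sums of 1/s on [1, x], for x > 1.

    Uses a uniform partition with n subintervals (an arbitrary choice).
    """
    dx = (x - 1.0) / n
    points = [1.0 + i * dx for i in range(n + 1)]
    lower = sum(dx / right for right in points[1:])    # inf of 1/s at right end
    upper = sum(dx / left for left in points[:-1])     # sup of 1/s at left end
    return lower, upper

lower, upper = ln_bounds(2.0)
print(lower, math.log(2.0), upper)
```

The library value `math.log(2.0)` should fall between the two sums, and the gap between them shrinks like $1/n$ as the partition gets finer.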
Another common function where integrals cannot be evaluated symbolically is the erf function defined as
\[ \operatorname{erf}(x) := \frac{2}{\sqrt{\pi}} \int_0^x e^{-s^2}\,ds . \]
This function comes up very often in applied mathematics. It is simply the antiderivative of $(2/\sqrt{\pi})\,e^{-x^2}$ that is zero at zero. The second form of the fundamental theorem tells us that we can write the function as an integral. If we wish to compute any particular value, we numerically approximate the integral.
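One simple way to carry out such an approximation is the midpoint rule, sketched below in Python. The function name `erf_midpoint` and the number of subintervals are assumptions made for this illustration, and the midpoint rule is just one of many possible quadrature schemes. The result is compared against the standard library's `math.erf`.

```python
import math

def erf_midpoint(x, n=1000):
    """Approximate erf(x) = (2/sqrt(pi)) * integral of exp(-s^2) from 0 to x.

    Uses the midpoint rule with n equal subintervals (an arbitrary choice).
    """
    dx = x / n
    total = sum(math.exp(-((i + 0.5) * dx) ** 2) for i in range(n))
    return 2.0 / math.sqrt(math.pi) * total * dx

for x in (0.5, 1.0, 2.0):
    print(x, erf_midpoint(x), math.erf(x))
```

For these inputs the two columns should agree to several decimal places. Increasing `n` tightens the agreement at the cost of more function evaluations.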
5.3.3 Change of variables
A theorem often used in calculus to solve integrals is the change of variables theorem. Let us prove
it now. Recall that a function is continuously differentiable if it is differentiable and the derivative is
continuous.
Theorem 5.3.5 (Change of variables). Let $g \colon [a,b] \to \mathbb{R}$ be a continuously differentiable function. If $g([a,b]) \subset [c,d]$ and $f \colon [c,d] \to \mathbb{R}$ is continuous, then
\[ \int_a^b f\bigl(g(x)\bigr)\, g'(x)\,dx = \int_{g(a)}^{g(b)} f(s)\,ds . \]
Proof. As $g$, $g'$, and $f$ are continuous, we know that $f\bigl(g(x)\bigr)\,g'(x)$ is a continuous function on $[a,b]$, therefore Riemann integrable.
Define
\[
  F(y) := \int_{g(a)}^{y} f(s)\,ds .
\]
By the second form of the fundamental theorem of calculus (using Exercise 5.3.4 below), $F$ is a differentiable function and $F'(y) = f(y)$. Now we apply the chain rule. Write
\[
  (F \circ g)'(x) = F'\bigl(g(x)\bigr)\,g'(x) = f\bigl(g(x)\bigr)\,g'(x) .
\]
Next we note that $F\bigl(g(a)\bigr) = 0$ and we use the first form of the fundamental theorem to obtain
\[
  \int_{g(a)}^{g(b)} f(s)\,ds = F\bigl(g(b)\bigr) = F\bigl(g(b)\bigr) - F\bigl(g(a)\bigr)
  = \int_a^b (F \circ g)'(x)\,dx = \int_a^b f\bigl(g(x)\bigr)\,g'(x)\,dx .
\]
The substitution theorem is often used to solve integrals by changing them to integrals we know
or which we can solve using the fundamental theorem of calculus.
Example 5.3.6: From an exercise, we know that the derivative of $\sin(x)$ is $\cos(x)$. Therefore we can solve
\[
  \int_0^{\sqrt{\pi}} x\cos(x^2)\,dx = \int_0^{\pi} \frac{\cos(s)}{2}\,ds
  = \frac{1}{2}\int_0^{\pi} \cos(s)\,ds = \frac{\sin(\pi) - \sin(0)}{2} = 0 .
\]
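The substitution in this example can be sanity-checked numerically. The sketch below approximates both sides with a midpoint rule; the helper `midpoint_rule` and the variable names `left` and `right` are hypothetical names introduced for this illustration, and agreement of the two numbers is of course only evidence, not a proof.

```python
import math

def midpoint_rule(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

# Left side: integral of x*cos(x^2) over [0, sqrt(pi)].
left = midpoint_rule(lambda x: x * math.cos(x * x), 0.0, math.sqrt(math.pi))
# Right side after substituting s = x^2: integral of cos(s)/2 over [0, pi].
right = midpoint_rule(lambda s: math.cos(s) / 2.0, 0.0, math.pi)

if __name__ == "__main__":
    print(left, right)
```

Both values should be very close to zero, in line with the computation above.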
However, beware that we must satisfy the hypotheses of the theorem. The following example demonstrates a common mistake for students of calculus. We must not simply move symbols around; we should always be careful that those symbols really make sense.
Example 5.3.7: Suppose we write down
\[
  \int_{-1}^{1} \frac{\ln|x|}{x}\,dx .
\]
It may be tempting to take $g(x) := \ln|x|$. Then take $g'(x) = \frac{1}{x}$ and try to write
\[
  \int_{g(-1)}^{g(1)} s\,ds = \int_0^0 s\,ds = 0 .
\]
This “solution” is not correct, and it does not say that we can solve the given integral. The first problem is that $\frac{\ln|x|}{x}$ is not Riemann integrable on $[-1,1]$ (it is unbounded). The integral we wrote down simply does not make sense. Secondly, $\frac{\ln|x|}{x}$ is not even continuous on $[-1,1]$. Finally, $g$ is not continuous on $[-1,1]$ either.
5.3.4 Exercises
Exercise 5.3.1: Compute
\[
  \frac{d}{dx}\left( \int_{-x}^{x} e^{s^2}\,ds \right) .
\]
Exercise 5.3.2: Compute
\[
  \frac{d}{dx}\left( \int_{0}^{x^2} \sin(s^2)\,ds \right) .
\]
Exercise 5.3.3: Suppose $F \colon [a,b] \to \mathbb{R}$ is continuous and differentiable on $[a,b] \setminus S$, where $S$ is a finite set. Suppose there exists an $f \in \mathcal{R}[a,b]$ such that $f(x) = F'(x)$ for $x \in [a,b] \setminus S$. Show that $\int_a^b f = F(b) - F(a)$.
Exercise 5.3.4: Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Let $c \in [a,b]$ be arbitrary. Define
\[
  F(x) := \int_c^x f .
\]
Prove that $F$ is differentiable and that $F'(x) = f(x)$ for all $x \in [a,b]$.
Exercise 5.3.5: Prove integration by parts. That is, suppose that $F$ and $G$ are differentiable functions on $[a,b]$ and suppose that $F'$ and $G'$ are Riemann integrable. Then prove
\[
  \int_a^b F(x)G'(x)\,dx = F(b)G(b) - F(a)G(a) - \int_a^b F'(x)G(x)\,dx .
\]
Exercise 5.3.6: Suppose that $F$ and $G$ are differentiable functions defined on $[a,b]$ such that $F'(x) = G'(x)$ for all $x \in [a,b]$. Show that $F$ and $G$ differ by a constant. That is, show that there exists a $C \in \mathbb{R}$ such that $F(x) - G(x) = C$.
The next exercise shows how we can use the integral to “smooth out” a nondifferentiable function.
Exercise 5.3.7: Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Let $\varepsilon > 0$ be a constant. For $x \in [a+\varepsilon, b-\varepsilon]$, define
\[
  g(x) := \frac{1}{2\varepsilon} \int_{x-\varepsilon}^{x+\varepsilon} f .
\]
(i) Show that $g$ is differentiable and find the derivative.
(ii) Let $f$ be differentiable and fix $x \in (a,b)$ (and let $\varepsilon$ be small enough). What happens to $g'(x)$ as $\varepsilon$ gets smaller?
(iii) Find $g$ for $f(x) := |x|$, $\varepsilon = 1$ (you can assume that $[a,b]$ is large enough).
Exercise 5.3.8: Suppose that $f \colon [a,b] \to \mathbb{R}$ is continuous. Suppose that $\int_a^x f = \int_x^b f$ for all $x \in [a,b]$. Show that $f(x) = 0$ for all $x \in [a,b]$.
Exercise 5.3.9: Suppose that $f \colon [a,b] \to \mathbb{R}$ is continuous and $\int_a^x f = 0$ for all rational $x$ in $[a,b]$. Show that $f(x) = 0$ for all $x \in [a,b]$.
Chapter 6
Sequences of Functions
6.1 Pointwise and uniform convergence
Note: 1.5 lectures
Up till now when we have talked about sequences we always talked about sequences of numbers.
However, a very useful concept in analysis is to use a sequence of functions. For example, many
times a solution to some differential equation is found by ﬁnding approximate solutions only. Then
the real solution is some sort of limit of those approximate solutions.
The tricky part is that when talking about sequences of functions, there is not a single notion of
a limit. We will talk about two common notions of a limit of a sequence of functions.
6.1.1 Pointwise convergence
Definition 6.1.1. Let $f_n \colon S \to \mathbb{R}$ be functions. We say the sequence $\{f_n\}$ converges pointwise to $f \colon S \to \mathbb{R}$, if for every $x \in S$ we have
\[
  f(x) = \lim_{n\to\infty} f_n(x) .
\]
It is common to say that $f_n \colon S \to \mathbb{R}$ converges to $f$ on $T \subset S$ for some $f \colon T \to \mathbb{R}$. In that case we, of course, mean that $f(x) = \lim f_n(x)$ for every $x \in T$. We simply mean that the restrictions of $f_n$ to $T$ converge pointwise to $f$.
Example 6.1.2: The sequence of functions $f_n(x) := x^{2n}$ converges to $f \colon [-1,1] \to \mathbb{R}$ on $[-1,1]$, where
\[
  f(x) =
  \begin{cases}
    1 & \text{if } x = -1 \text{ or } x = 1, \\
    0 & \text{otherwise.}
  \end{cases}
\]
See Figure 6.1.
[Figure 6.1: Graphs of $f_1$, $f_2$, $f_3$, and $f_8$ for $f_n(x) := x^{2n}$.]
To see that this is so, first take $x \in (-1,1)$. Then $x^2 < 1$. We have seen before that
\[
  |x^{2n} - 0| = (x^2)^n \to 0 \quad \text{as } n \to \infty .
\]
Therefore $\lim f_n(x) = 0$.
When $x = 1$ or $x = -1$, then $x^{2n} = 1$ and hence $\lim f_n(x) = 1$. We also note that $\{f_n(x)\}$ does not converge for any other $x$.
Often, functions are given as a series. In this case, we simply use the notion of pointwise
convergence to ﬁnd the values of the function.
Example 6.1.3: We write
\[
  \sum_{k=0}^{\infty} x^k
\]
to denote the limit of the functions
\[
  f_n(x) := \sum_{k=0}^{n} x^k .
\]
When studying series, we have seen that on $x \in (-1,1)$ the $f_n$ converge pointwise to
\[
  \frac{1}{1-x} .
\]
The subtle point here is that while $\frac{1}{1-x}$ is defined for all $x \neq 1$, and $f_n$ are defined for all $x$ (even at $x = 1$), convergence only happens on $(-1,1)$.
Therefore, when we write
\[
  f(x) := \sum_{k=0}^{\infty} x^k
\]
we mean that $f$ is defined on $(-1,1)$ and is the pointwise limit of the partial sums.
Example 6.1.4: Let $f_n(x) := \sin(xn)$. Then $f_n$ does not converge pointwise to any function on any interval. It may converge at certain points, such as when $x = 0$ or $x = \pi$. It is left as an exercise that in any interval $[a,b]$, there exists an $x$ such that $\sin(xn)$ does not have a limit as $n$ goes to infinity.
Before we move to uniform convergence, let us reformulate pointwise convergence in a different way. We leave the proof to the reader; it is a simple application of the definition of convergence of a sequence of real numbers.
Proposition 6.1.5. Let $f_n \colon S \to \mathbb{R}$ and $f \colon S \to \mathbb{R}$ be functions. Then $\{f_n\}$ converges pointwise to $f$ if and only if for every $x \in S$, and every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
\[
  |f_n(x) - f(x)| < \varepsilon
\]
for all $n \geq N$.
The key point here is that N can depend on x, not just on ε. That is, for each x we can pick a
different N. If we could pick one N for all x, we would have what is called uniform convergence.
6.1.2 Uniform convergence
Definition 6.1.6. Let $f_n \colon S \to \mathbb{R}$ be functions. We say the sequence $\{f_n\}$ converges uniformly to $f \colon S \to \mathbb{R}$, if for every $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that for all $n \geq N$ we have
\[
  |f_n(x) - f(x)| < \varepsilon \quad \text{for all } x \in S .
\]
Note the fact that $N$ now cannot depend on $x$. Given $\varepsilon > 0$ we must find an $N$ that works for all $x \in S$. Because of Proposition 6.1.5 we easily see that uniform convergence implies pointwise convergence.
Proposition 6.1.7. Let $f_n \colon S \to \mathbb{R}$ be a sequence of functions that converges uniformly to $f \colon S \to \mathbb{R}$. Then $\{f_n\}$ converges pointwise to $f$.
The converse does not hold.
Example 6.1.8: The functions $f_n(x) := x^{2n}$ do not converge uniformly on $[-1,1]$, even though they converge pointwise. To see this, suppose for contradiction that they did. Take $\varepsilon := 1/2$; then there would have to exist an $N$ such that $x^{2N} < 1/2$ for all $x \in [0,1)$ (as $f_n(x)$ converges to $0$ on $(-1,1)$). But that means that for any sequence $\{x_k\}$ in $[0,1)$ such that $\lim x_k = 1$ we have $x_k^{2N} < 1/2$. On the other hand $x^{2N}$ is a continuous function of $x$ (it is a polynomial), therefore we obtain a contradiction
\[
  1 = 1^{2N} = \lim_{k\to\infty} x_k^{2N} \leq 1/2 .
\]
However, if we restrict our domain to $[-a,a]$ where $0 < a < 1$, then $f_n$ converges uniformly to $0$ on $[-a,a]$. Again to see this note that $a^{2n} \to 0$ as $n \to \infty$. Thus given $\varepsilon > 0$, pick $N \in \mathbb{N}$ such that $a^{2n} < \varepsilon$ for all $n \geq N$. Then for any $x \in [-a,a]$ we have $|x| \leq a$. Therefore, for $n \geq N$,
\[
  |x^{2n}| = |x|^{2n} \leq a^{2n} < \varepsilon .
\]
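The two halves of this example can also be seen numerically. The sketch below does not follow the contradiction argument above. Instead it uses a more direct observation, stated here as an assumption of the illustration: the point $x_n := (1/2)^{1/(2n)}$ lies in $[0,1)$ and satisfies $f_n(x_n) = 1/2$, so the supremum of $|f_n - 0|$ over $[0,1)$ never drops below $1/2$. On $[-a,a]$ the supremum is $a^{2n}$, which does go to zero. The names `f`, `witness`, and the choice $a = 0.9$ are hypothetical.

```python
def f(n, x):
    """The function f_n(x) = x^(2n)."""
    return x ** (2 * n)

def witness(n):
    """A point of [0, 1) where f_n equals 1/2, so sup |f_n - 0| >= 1/2 on [0, 1)."""
    return 0.5 ** (1.0 / (2 * n))

a = 0.9  # any fixed 0 < a < 1 works; on [-a, a] the supremum of f_n is f(n, a)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        x_n = witness(n)
        print(n, x_n, f(n, x_n), f(n, a))
```

The third printed column stays at about 0.5 while the fourth shrinks rapidly, matching the failure of uniform convergence on $[-1,1]$ and its success on $[-a,a]$.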
6.1.3 Convergence in uniform norm
For bounded functions there is another more abstract way to think of uniform convergence. To
every bounded function we can assign a certain nonnegative number (called the uniform norm).
This number measures the “distance” of the function from 0. Then we can “measure” how far two
functions are from each other. We can then simply translate a statement about uniform convergence
into a statement of a certain sequence of real numbers converging to zero.
Definition 6.1.9. Let $f \colon S \to \mathbb{R}$ be a bounded function. Define
\[
  \|f\|_u := \sup\{ |f(x)| : x \in S \} .
\]
$\|\cdot\|_u$ is called the uniform norm.
Proposition 6.1.10. A sequence of bounded functions $f_n \colon S \to \mathbb{R}$ converges uniformly to $f \colon S \to \mathbb{R}$, if and only if
\[
  \lim_{n\to\infty} \|f_n - f\|_u = 0 .
\]
Proof. First suppose that $\lim \|f_n - f\|_u = 0$. Let $\varepsilon > 0$ be given. Then there exists an $N$ such that for $n \geq N$ we have $\|f_n - f\|_u < \varepsilon$. As $\|f_n - f\|_u$ is the supremum of $|f_n(x) - f(x)|$, we see that for all $x$ we have $|f_n(x) - f(x)| < \varepsilon$.
On the other hand, suppose that $f_n$ converges uniformly to $f$. Let $\varepsilon > 0$ be given. Then find $N$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $x \in S$ and all $n \geq N$. Taking the supremum we see that $\|f_n - f\|_u \leq \varepsilon$ for all $n \geq N$. As $\varepsilon > 0$ was arbitrary, $\lim \|f_n - f\|_u = 0$.
Sometimes it is said that $f_n$ converges to $f$ in uniform norm instead of converges uniformly. The proposition says that the two notions are the same thing.
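Example 6.1.11: Let $f_n \colon [0,1] \to \mathbb{R}$ be defined by $f_n(x) := \frac{nx + \sin(nx^2)}{n}$. Then we claim that $f_n$ converge uniformly to $f(x) := x$. Let us compute:
\[
  \begin{aligned}
    \|f_n - f\|_u
    &= \sup\left\{ \left| \frac{nx + \sin(nx^2)}{n} - x \right| : x \in [0,1] \right\} \\
    &= \sup\left\{ \frac{|\sin(nx^2)|}{n} : x \in [0,1] \right\} \\
    &\leq \sup\{ 1/n : x \in [0,1] \} \\
    &= 1/n .
  \end{aligned}
\]
As $1/n \to 0$, the claim follows from Proposition 6.1.10.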
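The bound $\|f_n - f\|_u \leq 1/n$ can be probed numerically. The sketch below takes the maximum of $|f_n(x) - x|$ over an evenly spaced grid; a grid maximum is only a lower estimate of the true supremum, so this illustrates the bound rather than proving it. The function name and the grid size are assumptions of the illustration.

```python
import math

def sup_distance_on_grid(n, points=2001):
    """Largest value of |f_n(x) - x| over an evenly spaced grid in [0, 1]."""
    worst = 0.0
    for i in range(points):
        x = i / (points - 1)
        f_n = (n * x + math.sin(n * x * x)) / n
        worst = max(worst, abs(f_n - x))
    return worst

if __name__ == "__main__":
    for n in (1, 5, 50, 500):
        print(n, sup_distance_on_grid(n), 1.0 / n)
```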
Using uniform norm, we can deﬁne Cauchy sequences in a similar way as Cauchy sequences of
real numbers.
Definition 6.1.12. Let $f_n \colon S \to \mathbb{R}$ be bounded functions. We say that the sequence is Cauchy in the uniform norm or uniformly Cauchy if for every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that for $m, k \geq N$ we have
\[
  \|f_m - f_k\|_u < \varepsilon .
\]
Proposition 6.1.13. Let $f_n \colon S \to \mathbb{R}$ be bounded functions. Then $\{f_n\}$ is Cauchy in the uniform norm if and only if there exists an $f \colon S \to \mathbb{R}$ such that $\{f_n\}$ converges uniformly to $f$.
Proof. Let us first suppose that $\{f_n\}$ is Cauchy in the uniform norm. Let us define $f$. Fix $x$, then the sequence $\{f_n(x)\}$ is Cauchy because
\[
  |f_m(x) - f_k(x)| \leq \|f_m - f_k\|_u .
\]
Thus $\{f_n(x)\}$ converges to some real number so define
\[
  f(x) := \lim_{n\to\infty} f_n(x) .
\]
Therefore, $f_n$ converges pointwise to $f$. To show that convergence is uniform, let $\varepsilon > 0$ be given and find an $N$ such that for $m, k \geq N$ we have $\|f_m - f_k\|_u < \varepsilon/2$. Again this implies that for all $x$ we have $|f_m(x) - f_k(x)| < \varepsilon/2$. Now we can simply take the limit as $k$ goes to infinity. Then $|f_m(x) - f_k(x)|$ goes to $|f_m(x) - f(x)|$. Therefore for all $x$ and all $m \geq N$ we get
\[
  |f_m(x) - f(x)| \leq \varepsilon/2 < \varepsilon .
\]
And hence $f_n$ converges uniformly.
For the other direction, suppose that $\{f_n\}$ converges uniformly to $f$. Given $\varepsilon > 0$, find $N$ such that for all $n \geq N$ we have $|f_n(x) - f(x)| < \varepsilon/4$ for all $x \in S$. Therefore for all $m, k \geq N$ we have
\[
  |f_m(x) - f_k(x)| = |f_m(x) - f(x) + f(x) - f_k(x)|
  \leq |f_m(x) - f(x)| + |f(x) - f_k(x)| < \varepsilon/4 + \varepsilon/4 .
\]
We can now take supremum over all $x$ to obtain
\[
  \|f_m - f_k\|_u \leq \varepsilon/2 < \varepsilon .
\]
6.1.4 Exercises
Exercise 6.1.1: Let $f$ and $g$ be bounded functions on $[a,b]$. Show that
\[
  \|f + g\|_u \leq \|f\|_u + \|g\|_u .
\]
Exercise 6.1.2: a) Find the pointwise limit of $\frac{e^{x/n}}{n}$ for $x \in \mathbb{R}$.
b) Is the limit uniform on $\mathbb{R}$?
c) Is the limit uniform on $[0,1]$?
Exercise 6.1.3: Suppose $f_n \colon S \to \mathbb{R}$ are functions that converge uniformly to $f \colon S \to \mathbb{R}$. Suppose that $A \subset S$. Show that the restrictions $f_n|_A$ converge uniformly to $f|_A$.
Exercise 6.1.4: Suppose that $\{f_n\}$ and $\{g_n\}$ defined on some set $A$ converge to $f$ and $g$ respectively pointwise. Show that $\{f_n + g_n\}$ converges pointwise to $f + g$.
Exercise 6.1.5: Suppose that $\{f_n\}$ and $\{g_n\}$ defined on some set $A$ converge to $f$ and $g$ respectively uniformly on $A$. Show that $\{f_n + g_n\}$ converges uniformly to $f + g$ on $A$.
Exercise 6.1.6: Find an example of sequences of functions $\{f_n\}$ and $\{g_n\}$ that converge uniformly to some $f$ and $g$ on some set $A$, but such that $f_n g_n$ (the product) does not converge uniformly to $fg$ on $A$. Hint: Let $A := \mathbb{R}$, let $f(x) := g(x) := x$. You can even pick $f_n = g_n$.
Exercise 6.1.7: Suppose that there exists a sequence of functions $\{g_n\}$ uniformly converging to $0$ on $A$. Now suppose that we have a sequence of functions $f_n$ and a function $f$ on $A$ such that
\[
  |f_n(x) - f(x)| \leq g_n(x)
\]
for all $x \in A$. Show that $f_n$ converges uniformly to $f$ on $A$.
Exercise 6.1.8: Let $\{f_n\}$, $\{g_n\}$ and $\{h_n\}$ be sequences of functions on $[a,b]$. Suppose that $f_n$ and $h_n$ converge uniformly to some function $f \colon [a,b] \to \mathbb{R}$ and suppose that $f_n(x) \leq g_n(x) \leq h_n(x)$ for all $x \in [a,b]$. Show that $g_n$ converges uniformly to $f$.
Exercise 6.1.9: Let $f_n \colon [0,1] \to \mathbb{R}$ be a sequence of increasing functions (that is, $f_n(x) \geq f_n(y)$ whenever $x \geq y$). Suppose that $f_n(0) = 0$ and that $\lim_{n\to\infty} f_n(1) = 0$. Show that $f_n$ converges uniformly to $0$.
Exercise 6.1.10: Let $\{f_n\}$ be a sequence of functions defined on $[0,1]$. Suppose that there exists a sequence of numbers $x_n \in [0,1]$ such that
\[
  f_n(x_n) = 1 .
\]
Prove or disprove the following statements.
a) True or false: There exists $\{f_n\}$ as above that converges to $0$ pointwise.
b) True or false: There exists $\{f_n\}$ as above that converges to $0$ uniformly on $[0,1]$.
6.2 Interchange of limits
Note: 1.5 lectures
Large parts of modern analysis deal mainly with the question of the interchange of two limiting
operations. It is easy to see that when we have a chain of two limits, we cannot always just swap the
limits. For example,
\[
  0 = \lim_{n\to\infty} \left( \lim_{k\to\infty} \frac{n/k}{n/k + 1} \right)
  \neq \lim_{k\to\infty} \left( \lim_{n\to\infty} \frac{n/k}{n/k + 1} \right) = 1 .
\]
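A quick numerical look at this double sequence shows the same asymmetry. In the sketch below, one index is held fixed while the other is made very large, which mimics evaluating the inner limit first. The function name `a` and the particular index values are choices made only for this illustration.

```python
def a(n, k):
    """The double sequence (n/k) / (n/k + 1)."""
    return (n / k) / (n / k + 1)

# Inner limit in k first: n fixed, k very large, value near 0.
inner_k_first = a(10, 10**12)
# Inner limit in n first: k fixed, n very large, value near 1.
inner_n_first = a(10**12, 10)

if __name__ == "__main__":
    print(inner_k_first, inner_n_first)
```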
When talking about sequences of functions, interchange of limits comes up quite often. We treat
two cases. First we look at continuity of the limit, and second we will look at the integral of the
limit.
6.2.1 Continuity of the limit
If we have a sequence of continuous functions, is the limit continuous? Suppose that $f$ is the (pointwise) limit of $f_n$. If $x_k \to x$, we are interested in the following interchange of limits. The equality we have to prove (it is not always true) is marked with a question mark.
\[
  \lim_{k\to\infty} f(x_k) = \lim_{k\to\infty} \Bigl( \lim_{n\to\infty} f_n(x_k) \Bigr)
  \overset{?}{=} \lim_{n\to\infty} \Bigl( \lim_{k\to\infty} f_n(x_k) \Bigr)
  = \lim_{n\to\infty} f_n(x) = f(x) .
\]
In particular, we wish to find conditions on the sequence $\{f_n\}$ so that the above equation holds. It turns out that if we simply require pointwise convergence, then the limit of a sequence of functions need not be continuous, and the above equation need not hold.
Example 6.2.1: Let $f_n \colon [0,1] \to \mathbb{R}$ be defined as
\[
  f_n(x) :=
  \begin{cases}
    1 - nx & \text{if } x < 1/n, \\
    0 & \text{if } x \geq 1/n.
  \end{cases}
\]
See Figure 6.2.
Each function $f_n$ is continuous. Now fix an $x \in (0,1]$. Note that for $n > 1/x$ we have $x > 1/n$. Therefore for $n > 1/x$ we have $f_n(x) = 0$. Thus
\[
  \lim_{n\to\infty} f_n(x) = 0 .
\]
On the other hand if $x = 0$, then
\[
  \lim_{n\to\infty} f_n(0) = \lim_{n\to\infty} 1 = 1 .
\]
Thus the pointwise limit of $f_n$ is the function $f \colon [0,1] \to \mathbb{R}$ defined by
\[
  f(x) :=
  \begin{cases}
    1 & \text{if } x = 0, \\
    0 & \text{if } x > 0.
  \end{cases}
\]
The function $f$ is not continuous at $0$.
[Figure 6.2: Graph of $f_n(x)$.]
If we, however, require the convergence to be uniform, the limits can be interchanged.
Theorem 6.2.2. Let $f_n \colon [a,b] \to \mathbb{R}$ be a sequence of continuous functions. Suppose that $\{f_n\}$ converges uniformly to $f \colon [a,b] \to \mathbb{R}$. Then $f$ is continuous.
Proof. Let $x \in [a,b]$ be fixed. Let $\{x_n\}$ be a sequence in $[a,b]$ converging to $x$.
Let $\varepsilon > 0$ be given. As $f_k$ converges uniformly to $f$, we find a $k \in \mathbb{N}$ such that
\[
  |f_k(y) - f(y)| < \varepsilon/3
\]
for all $y \in [a,b]$. As $f_k$ is continuous at $x$, we can find an $N \in \mathbb{N}$ such that for $m \geq N$ we have
\[
  |f_k(x_m) - f_k(x)| < \varepsilon/3 .
\]
Thus for $m \geq N$ we have
\[
  \begin{aligned}
    |f(x_m) - f(x)|
    &= |f(x_m) - f_k(x_m) + f_k(x_m) - f_k(x) + f_k(x) - f(x)| \\
    &\leq |f(x_m) - f_k(x_m)| + |f_k(x_m) - f_k(x)| + |f_k(x) - f(x)| \\
    &< \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon .
  \end{aligned}
\]
Therefore $\{f(x_m)\}$ converges to $f(x)$ and hence $f$ is continuous at $x$. As $x$ was arbitrary, $f$ is continuous everywhere.
6.2.2 Integral of the limit
Again, if we simply require pointwise convergence, then the integral of a limit of a sequence of
functions need not be the limit of the integrals.
Example 6.2.3: Let $f_n \colon [0,1] \to \mathbb{R}$ be defined as
\[
  f_n(x) :=
  \begin{cases}
    0 & \text{if } x = 0, \\
    n - n^2 x & \text{if } 0 < x < 1/n, \\
    0 & \text{if } x \geq 1/n.
  \end{cases}
\]
See Figure 6.3.
[Figure 6.3: Graph of $f_n(x)$.]
Each $f_n$ is Riemann integrable (it is bounded and continuous on $(0,1]$). Furthermore it is easy to compute that
\[
  \int_0^1 f_n = \int_0^{1/n} f_n = 1/2 .
\]
Let us compute the pointwise limit of $f_n$. Fix an $x \in (0,1]$. For $n > 1/x$ we have $x > 1/n$ and thus $f_n(x) = 0$. Therefore
\[
  \lim_{n\to\infty} f_n(x) = 0 .
\]
We also have $f_n(0) = 0$ for all $n$. Therefore the pointwise limit of $\{f_n\}$ is the zero function. Thus
\[
  1/2 = \lim_{n\to\infty} \int_0^1 f_n(x)\,dx
  \neq \int_0^1 \Bigl( \lim_{n\to\infty} f_n(x) \Bigr)\,dx = \int_0^1 0\,dx = 0 .
\]
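The behavior in this example is easy to observe numerically. The sketch below evaluates the functions from the example and approximates their integrals with a midpoint rule; the names `f` and `integral_on_unit_interval` and the step count are assumptions of the illustration. The integrals stay at $1/2$ while the value at any fixed $x > 0$ is eventually $0$.

```python
def f(n, x):
    """The spike function f_n from the example, on [0, 1]."""
    if x <= 0 or x >= 1 / n:
        return 0.0
    return n - n * n * x

def integral_on_unit_interval(n, steps=20_000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    width = 1.0 / steps
    return width * sum(f(n, (i + 0.5) * width) for i in range(steps))

if __name__ == "__main__":
    for n in (1, 4, 20):
        print(n, integral_on_unit_interval(n), f(n, 0.3))
```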
But, as for continuity, if we require the convergence to be uniform, the limits can be interchanged.
Theorem 6.2.4. Let $f_n \colon [a,b] \to \mathbb{R}$ be a sequence of Riemann integrable functions. Suppose that $\{f_n\}$ converges uniformly to $f \colon [a,b] \to \mathbb{R}$. Then $f$ is Riemann integrable and
\[
  \int_a^b f = \lim_{n\to\infty} \int_a^b f_n .
\]
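Proof. Let $\varepsilon > 0$ be given. As $f_n$ goes to $f$ uniformly, we can find an $M \in \mathbb{N}$ such that for all $n \geq M$ we have $|f_n(x) - f(x)| < \frac{\varepsilon}{2(b-a)}$ for all $x \in [a,b]$. Note that $f_n$ is integrable and compute, with $\overline{\int}$ and $\underline{\int}$ denoting the upper and lower integrals,
\[
  \begin{aligned}
    \overline{\int_a^b} f - \underline{\int_a^b} f
    &= \overline{\int_a^b} \bigl( f(x) - f_n(x) + f_n(x) \bigr)\,dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) + f_n(x) \bigr)\,dx \\
    &= \overline{\int_a^b} \bigl( f(x) - f_n(x) \bigr)\,dx + \overline{\int_a^b} f_n(x)\,dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) \bigr)\,dx - \underline{\int_a^b} f_n(x)\,dx \\
    &= \overline{\int_a^b} \bigl( f(x) - f_n(x) \bigr)\,dx + \int_a^b f_n(x)\,dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) \bigr)\,dx - \int_a^b f_n(x)\,dx \\
    &= \overline{\int_a^b} \bigl( f(x) - f_n(x) \bigr)\,dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) \bigr)\,dx \\
    &\leq \frac{\varepsilon}{2(b-a)}(b-a) + \frac{\varepsilon}{2(b-a)}(b-a) = \varepsilon .
  \end{aligned}
\]
The inequality follows from Proposition 5.1.8 and using the fact that for all $x \in [a,b]$ we have $\frac{-\varepsilon}{2(b-a)} < f(x) - f_n(x) < \frac{\varepsilon}{2(b-a)}$. As $\varepsilon > 0$ was arbitrary, $f$ is Riemann integrable.
Now we can compute $\int_a^b f$. We will apply Proposition 5.1.10 in the calculation. Again, for $n \geq M$ ($M$ is the same as above) we have
\[
  \left| \int_a^b f - \int_a^b f_n \right|
  = \left| \int_a^b \bigl( f(x) - f_n(x) \bigr)\,dx \right|
  \leq \frac{\varepsilon}{2(b-a)}(b-a) = \frac{\varepsilon}{2} < \varepsilon .
\]
Therefore $\bigl\{ \int_a^b f_n \bigr\}$ converges to $\int_a^b f$.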
Example 6.2.5: Suppose we wish to compute
\[
  \lim_{n\to\infty} \int_0^1 \frac{nx + \sin(nx^2)}{n}\,dx .
\]
It is impossible to compute the integrals for any particular $n$ using calculus as $\sin(nx^2)$ has no closed-form antiderivative. However, we can compute the limit. We have shown before that $\frac{nx + \sin(nx^2)}{n}$ converges uniformly on $[0,1]$ to the function $f(x) := x$. By Theorem 6.2.4, the limit exists and
\[
  \lim_{n\to\infty} \int_0^1 \frac{nx + \sin(nx^2)}{n}\,dx = \int_0^1 x\,dx = 1/2 .
\]
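Although the individual integrals have no closed form, they can be approximated, and the approximations should approach $1/2$. The following sketch uses a midpoint rule; the function name `integral_of_f_n` and the step count are assumptions made for this illustration.

```python
import math

def integral_of_f_n(n, steps=100_000):
    """Midpoint-rule approximation of the integral over [0, 1] of (n*x + sin(n*x^2)) / n."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += (n * x + math.sin(n * x * x)) / n
    return total * width

if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, integral_of_f_n(n))
```

Since $|\sin(nx^2)/n| \leq 1/n$, each integral differs from $1/2$ by at most $1/n$, which is what the printed values should reflect.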
Example 6.2.6: If convergence is only pointwise, the limit need not even be Riemann integrable. For example, on $[0,1]$ define
\[
  f_n(x) :=
  \begin{cases}
    1 & \text{if } x = p/q \text{ in lowest terms and } q \leq n, \\
    0 & \text{otherwise.}
  \end{cases}
\]
As $f_n$ differs from the zero function at finitely many points (there are only finitely many fractions in $[0,1]$ with denominator less than or equal to $n$), $f_n$ is integrable and $\int_0^1 f_n = \int_0^1 0 = 0$. It is an easy exercise to show that $f_n$ converges pointwise to the Dirichlet function
\[
  f(x) :=
  \begin{cases}
    1 & \text{if } x \in \mathbb{Q}, \\
    0 & \text{otherwise,}
  \end{cases}
\]
which is not Riemann integrable.
6.2.3 Exercises
Exercise 6.2.1: While uniform convergence can preserve continuity, it does not preserve differentiability. Find an explicit example of a sequence of differentiable functions on $[-1,1]$ that converge uniformly to a function $f$ such that $f$ is not differentiable. Hint: Consider $|x|^{1+1/n}$, show that these functions are differentiable, converge uniformly, and then show that the limit is not differentiable.
Exercise 6.2.2: Let $f_n(x) = \frac{x^n}{n}$. Show that $f_n$ converges uniformly to a differentiable function $f$ on $[0,1]$ (find $f$). However, show that $f'(1) \neq \lim_{n\to\infty} f_n'(1)$.
Note: The previous two exercises show that we cannot simply swap limits with derivatives, even if the convergence is uniform. See also Exercise 6.2.7 below.
Exercise 6.2.3: Let $f \colon [0,1] \to \mathbb{R}$ be a bounded function. Find $\lim_{n\to\infty} \int_0^1 \frac{f(x)}{n}\,dx$.
Exercise 6.2.4: Show $\lim_{n\to\infty} \int_1^2 e^{-nx^2}\,dx = 0$. Feel free to use what you know about the exponential function from calculus.
Exercise 6.2.5: Find an example of a sequence of continuous functions on $(0,1)$ that converges pointwise to a continuous function on $(0,1)$, but the convergence is not uniform.
Note: In the previous exercise, $(0,1)$ was picked for simplicity. For a more challenging exercise, replace $(0,1)$ with $[0,1]$.
Exercise 6.2.6: True/False; prove or find a counterexample to the following statement: If $\{f_n\}$ is a sequence of everywhere discontinuous functions on $[0,1]$ that converge uniformly to a function $f$, then $f$ is everywhere discontinuous.
Exercise 6.2.7: For a continuously differentiable function $f \colon [a,b] \to \mathbb{R}$, define
\[
  \|f\|_{C^1} := \|f\|_u + \|f'\|_u .
\]
Suppose that $\{f_n\}$ is a sequence of continuously differentiable functions such that for every $\varepsilon > 0$, there exists an $M$ such that for all $n, k \geq M$ we have
\[
  \|f_n - f_k\|_{C^1} < \varepsilon .
\]
Show that $\{f_n\}$ converges uniformly to some continuously differentiable function $f \colon [a,b] \to \mathbb{R}$.
For the following two exercises let us define for a Riemann integrable function $f \colon [0,1] \to \mathbb{R}$ the following number
\[
  \|f\|_{L^1} := \int_0^1 |f(x)|\,dx .
\]
(It is true that $|f|$ is always integrable if $f$ is, even if we have not proved that fact.) This norm defines another very common type of convergence called the $L^1$-convergence, which is however a bit more subtle.
Exercise 6.2.8: Suppose that $\{f_n\}$ is a sequence of Riemann integrable functions on $[0,1]$ that converge uniformly to $0$. Show that
\[
  \lim_{n\to\infty} \|f_n\|_{L^1} = 0 .
\]
Exercise 6.2.9: Find a sequence of Riemann integrable functions $\{f_n\}$ on $[0,1]$ that converge pointwise to $0$, but $\lim_{n\to\infty} \|f_n\|_{L^1}$ does not exist (is $\infty$).
Exercise 6.2.10 (Hard): Prove Dini’s theorem: Let $f_n \colon [a,b] \to \mathbb{R}$ be a sequence of continuous functions such that
\[
  0 \leq f_{n+1}(x) \leq f_n(x) \leq \cdots \leq f_1(x) \quad \text{for all } n \in \mathbb{N} .
\]
Suppose that $f_n$ converges pointwise to $0$. Show that $f_n$ converges to zero uniformly.
Exercise 6.2.11: Suppose that $f_n \colon [a,b] \to \mathbb{R}$ is a sequence of continuous functions that converges pointwise to a continuous $f \colon [a,b] \to \mathbb{R}$. Suppose that for any $x \in [a,b]$ the sequence $\{|f_n(x) - f(x)|\}$ is monotone. Show that the sequence $\{f_n\}$ converges uniformly.
Exercise 6.2.12: Find a sequence of Riemann integrable functions $f_n \colon [0,1] \to \mathbb{R}$ such that $f_n$ converges to the zero function pointwise and such that a) $\bigl\{ \int_0^1 f_n \bigr\}$ increases without bound, b) $\bigl\{ \int_0^1 f_n \bigr\}$ is the sequence $-1, 1, -1, 1, -1, 1, \ldots$.
6.3 Picard’s theorem
Note: 1.5–2 lectures
A course such as this one should have a pièce de résistance caliber theorem. We pick a theorem whose proof combines everything we have learned. It is more sophisticated than the fundamental theorem of calculus, the first highlight theorem of this course. The theorem we are talking about is Picard’s theorem* on existence and uniqueness of a solution to an ordinary differential equation. Both the statement and the proof are beautiful examples of what one can do with all that we have learned. It is also a good example of how analysis is applied, as differential equations are indispensable in science.
6.3.1 First order ordinary differential equation
Modern science is described in the language of differential equations. That is, equations that involve not only the unknown, but also its derivatives. The simplest nontrivial form of a differential equation is the so-called first order ordinary differential equation
\[
  y' = F(x, y) .
\]
Generally we also specify that $y(x_0) = y_0$. The solution of the equation is a function $y(x)$ such that $y(x_0) = y_0$ and $y'(x) = F\bigl(x, y(x)\bigr)$.
When F involves only the x variable, the solution is given by the fundamental theorem of
calculus. On the other hand, when F depends on both x and y we need far more ﬁrepower. It is not
always true that a solution exists, and if it does, that it is the unique solution. Picard’s theorem gives
us certain sufﬁcient conditions for existence and uniqueness.
6.3.2 The theorem
We will need to define continuity in two variables. First, a point in $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ is denoted by an ordered pair $(x, y)$. To make matters simple let us give the following sequential definition of continuity.
Definition 6.3.1. Let $U \subset \mathbb{R}^2$ be a set and $F \colon U \to \mathbb{R}$ be a function. Let $(x, y) \in U$ be a point. The function $F$ is continuous at $(x, y)$ if for every sequence $\{(x_n, y_n)\}$ of points in $U$ such that $\lim x_n = x$ and $\lim y_n = y$, we have that
\[
  \lim_{n\to\infty} F(x_n, y_n) = F(x, y) .
\]
We say $F$ is continuous if it is continuous at all points in $U$.
* Named for the French mathematician Charles Émile Picard (1856–1941).
Theorem 6.3.2 (Picard’s theorem on existence and uniqueness). Let $I, J \subset \mathbb{R}$ be closed bounded intervals and let $I^0$ and $J^0$ be their interiors. Suppose $F \colon I \times J \to \mathbb{R}$ is continuous and Lipschitz in the second variable, that is, there exists a number $L$ such that
\[
  |F(x, y) - F(x, z)| \leq L|y - z| \quad \text{for all } y, z \in J,\ x \in I .
\]
Let $(x_0, y_0) \in I^0 \times J^0$. Then there exists an $h > 0$ and a unique differentiable $f \colon [x_0 - h, x_0 + h] \to \mathbb{R}$, such that
\[
  f'(x) = F\bigl(x, f(x)\bigr) \quad \text{and} \quad f(x_0) = y_0 . \tag{6.1}
\]
Proof. Suppose that we could find a solution $f$, then by the fundamental theorem of calculus we can integrate the equation $f'(x) = F\bigl(x, f(x)\bigr)$, $f(x_0) = y_0$ and write it as the integral equation
\[
  f(x) = y_0 + \int_{x_0}^{x} F\bigl(t, f(t)\bigr)\,dt . \tag{6.2}
\]
The idea of our proof is that we will try to plug in approximations to a solution to the right-hand side of (6.2) to get better approximations on the left-hand side of (6.2). We hope that in the end the sequence will converge and solve (6.2) and hence (6.1). The technique below is called Picard iteration, and the individual functions $f_k$ are called the Picard iterates.
Without loss of generality, suppose that $x_0 = 0$ (exercise below). Another exercise tells us that $F$ is bounded as it is continuous. Let $M := \sup\{ |F(x, y)| : (x, y) \in I \times J \}$. Without loss of generality, we can assume $M > 0$ (why?). Pick $\alpha > 0$ such that $[-\alpha, \alpha] \subset I$ and $[y_0 - \alpha, y_0 + \alpha] \subset J$. Define
\[
  h := \min\left\{ \alpha, \frac{\alpha}{M + L\alpha} \right\} . \tag{6.3}
\]
Now note that $[-h, h] \subset I$.
Set $f_0(x) := y_0$. We will define $f_k$ inductively. Assuming that $f_{k-1}([-h, h]) \subset [y_0 - \alpha, y_0 + \alpha]$, we see that $F\bigl(t, f_{k-1}(t)\bigr)$ is a well defined function of $t$ for $t \in [-h, h]$. Further assuming that $f_{k-1}$ is continuous on $[-h, h]$, then $F\bigl(t, f_{k-1}(t)\bigr)$ is continuous as a function of $t$ on $[-h, h]$ by an exercise. Therefore we can define
\[
  f_k(x) := y_0 + \int_0^x F\bigl(t, f_{k-1}(t)\bigr)\,dt ,
\]
and $f_k$ is continuous on $[-h, h]$ by the fundamental theorem of calculus. To see that $f_k$ maps $[-h, h]$ to $[y_0 - \alpha, y_0 + \alpha]$, we compute for $x \in [-h, h]$
\[
  |f_k(x) - y_0| = \left| \int_0^x F\bigl(t, f_{k-1}(t)\bigr)\,dt \right|
  \leq M|x| \leq Mh \leq M \frac{\alpha}{M + L\alpha} \leq \alpha .
\]
We can now define $f_{k+1}$ and so on, and we have defined a sequence $\{f_k\}$ of functions. We simply need to show that it converges to a function $f$ that solves the equation (6.2) and therefore (6.1).
We wish to show that the sequence $\{f_k\}$ converges uniformly to some function on $[-h, h]$. First, for $t \in [-h, h]$ we have the following useful bound
\[
  \bigl| F\bigl(t, f_n(t)\bigr) - F\bigl(t, f_k(t)\bigr) \bigr| \leq L|f_n(t) - f_k(t)| \leq L\|f_n - f_k\|_u ,
\]
where $\|f_n - f_k\|_u$ is the uniform norm, that is the supremum of $|f_n(t) - f_k(t)|$ for $t \in [-h, h]$. Now note that $|x| \leq h \leq \frac{\alpha}{M + L\alpha}$. Therefore
\[
  \begin{aligned}
    |f_n(x) - f_k(x)|
    &= \left| \int_0^x F\bigl(t, f_{n-1}(t)\bigr)\,dt - \int_0^x F\bigl(t, f_{k-1}(t)\bigr)\,dt \right| \\
    &= \left| \int_0^x \Bigl( F\bigl(t, f_{n-1}(t)\bigr) - F\bigl(t, f_{k-1}(t)\bigr) \Bigr)\,dt \right| \\
    &\leq L\|f_{n-1} - f_{k-1}\|_u\,|x| \\
    &\leq \frac{L\alpha}{M + L\alpha}\|f_{n-1} - f_{k-1}\|_u .
  \end{aligned}
\]
Let $C := \frac{L\alpha}{M + L\alpha}$ and note that $C < 1$. Taking supremum on the left-hand side we get
\[
  \|f_n - f_k\|_u \leq C\|f_{n-1} - f_{k-1}\|_u .
\]
Without loss of generality, suppose that $n \geq k$. Then by induction we can show that
\[
  \|f_n - f_k\|_u \leq C^k\|f_{n-k} - f_0\|_u .
\]
Now for any $x \in [-h, h]$ we have
\[
  |f_{n-k}(x) - f_0(x)| = |f_{n-k}(x) - y_0| \leq \alpha .
\]
Therefore
\[
  \|f_n - f_k\|_u \leq C^k\|f_{n-k} - f_0\|_u \leq C^k\alpha .
\]
As $C < 1$, $\{f_n\}$ is uniformly Cauchy and by Proposition 6.1.13 we obtain that $\{f_n\}$ converges uniformly on $[-h, h]$ to some function $f \colon [-h, h] \to \mathbb{R}$. The function $f$ is the uniform limit of continuous functions and therefore continuous.
We now need to show that $f$ solves (6.2). First, as before we notice
\[
  \bigl| F\bigl(t, f_n(t)\bigr) - F\bigl(t, f(t)\bigr) \bigr| \leq L|f_n(t) - f(t)| \leq L\|f_n - f\|_u .
\]
As $\|f_n - f\|_u$ converges to $0$, then $F\bigl(t, f_n(t)\bigr)$ converges uniformly to $F\bigl(t, f(t)\bigr)$. It is easy to see (why?) that the convergence is then uniform on $[0, x]$ (or $[x, 0]$ if $x < 0$). Therefore,
\[
  \begin{aligned}
    y_0 + \int_0^x F\bigl(t, f(t)\bigr)\,dt
    &= y_0 + \int_0^x F\bigl(t, \lim_{n\to\infty} f_n(t)\bigr)\,dt \\
    &= y_0 + \int_0^x \lim_{n\to\infty} F\bigl(t, f_n(t)\bigr)\,dt && \text{(by continuity of $F$)} \\
    &= \lim_{n\to\infty} \left( y_0 + \int_0^x F\bigl(t, f_n(t)\bigr)\,dt \right) && \text{(by uniform convergence)} \\
    &= \lim_{n\to\infty} f_{n+1}(x) = f(x) .
  \end{aligned}
\]
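We can now apply the fundamental theorem of calculus to show that $f$ is differentiable and its derivative is $F\bigl(x, f(x)\bigr)$. It is obvious that $f(0) = y_0$.
Finally, what is left to do is to show uniqueness. Suppose $g \colon [-h, h] \to \mathbb{R}$ is another solution. As before we use the fact that $\bigl| F\bigl(t, f(t)\bigr) - F\bigl(t, g(t)\bigr) \bigr| \leq L\|f - g\|_u$. Then
\[
  \begin{aligned}
    |f(x) - g(x)|
    &= \left| y_0 + \int_0^x F\bigl(t, f(t)\bigr)\,dt - \left( y_0 + \int_0^x F\bigl(t, g(t)\bigr)\,dt \right) \right| \\
    &= \left| \int_0^x \Bigl( F\bigl(t, f(t)\bigr) - F\bigl(t, g(t)\bigr) \Bigr)\,dt \right| \\
    &\leq L\|f - g\|_u\,|x| \leq Lh\|f - g\|_u \leq \frac{L\alpha}{M + L\alpha}\|f - g\|_u .
  \end{aligned}
\]
As we said before $C = \frac{L\alpha}{M + L\alpha} < 1$. By taking supremum over $x \in [-h, h]$ on the left-hand side we obtain
\[
  \|f - g\|_u \leq C\|f - g\|_u .
\]
This is only possible if $\|f - g\|_u = 0$. Therefore, $f = g$, and the solution is unique.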
6.3.3 Examples
Let us look at some examples. We note that the proof of the theorem actually gives us an explicit way to find an $h$ that works. It does not, however, give us the best $h$. It is often possible to find a much larger $h$ for which the theorem works.
The proof also gives us the Picard iterates as approximations to the solution. Therefore the proof
actually tells us how to obtain the solution, not just that the solution exists.
Example 6.3.3: Let us look at the equation
\[
  f'(x) = f(x), \qquad f(0) = 1 .
\]
That is, we let $F(x, y) = y$, and we are looking for a function $f$ such that $f'(x) = f(x)$. We pick any $I$ that contains $0$ in the interior. We pick an arbitrary $J$ that contains $1$ in its interior. We can always pick $L = 1$. The theorem guarantees an $h > 0$ such that there exists a unique solution $f \colon [-h, h] \to \mathbb{R}$. This solution is usually denoted by
\[
  e^x := f(x) .
\]
We leave it to the reader to verify that by picking $I$ and $J$ large enough the proof of the theorem guarantees that we will be able to pick $\alpha$ such that we get any $h$ we want as long as $h < 1/3$.
Of course, we know (though we have not proved) that this function exists as a function for all $x$. It is possible to show (we omit the proof) that for any $x_0$ and $y_0$ the proof of the theorem above always guarantees an arbitrary $h$ as long as $h < 1/3$. The key point is that $L = 1$ no matter what $x_0$ and $y_0$ are. Therefore, we get a unique function defined in a neighborhood $[-h, h]$ for any $h < 1/3$. After defining the function on $[-h, h]$ we find a solution on the interval $[0, 2h]$ and notice that the two functions must coincide on $[0, h]$ by uniqueness. We can thus iteratively construct the exponential for all $x \in \mathbb{R}$. Do note that up until now we did not yet have proof of the existence of the exponential function.
Let us see the Picard iterates for this function. First we start with $f_0(x) := 1$. Then
\[
  \begin{aligned}
    f_1(x) &= 1 + \int_0^x f_0(s)\,ds = x + 1, \\
    f_2(x) &= 1 + \int_0^x f_1(s)\,ds = 1 + \int_0^x (s + 1)\,ds = \frac{x^2}{2} + x + 1, \\
    f_3(x) &= 1 + \int_0^x f_2(s)\,ds = 1 + \int_0^x \left( \frac{s^2}{2} + s + 1 \right) ds = \frac{x^3}{6} + \frac{x^2}{2} + x + 1 .
  \end{aligned}
\]
We recognize the beginning of the Taylor series for the exponential.
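The iterates above can be generated mechanically. The following is a minimal sketch of Picard iteration for the specific equation $f' = f$, $f(0) = 1$, where every iterate is a polynomial and the integral can be computed exactly on coefficients. The representation (a list of coefficients $[c_0, c_1, c_2, \ldots]$) and the names `picard_step` and `picard_iterates` are assumptions of this illustration, and the sketch does not handle a general $F$.

```python
from fractions import Fraction

def picard_step(coeffs, y0=Fraction(1)):
    """One Picard step for f' = f: return y0 + integral from 0 to x of p(s) ds.

    A polynomial p is stored as its coefficient list [c0, c1, c2, ...].
    """
    # Integrating c * s^k from 0 to x gives c / (k + 1) * x^(k + 1).
    integrated = [c / (k + 1) for k, c in enumerate(coeffs)]
    return [y0] + integrated

def picard_iterates(count):
    """Return the list [f_0, f_1, ..., f_count] of Picard iterates."""
    iterates = [[Fraction(1)]]  # f_0(x) = 1
    for _ in range(count):
        iterates.append(picard_step(iterates[-1]))
    return iterates

if __name__ == "__main__":
    for k, coeffs in enumerate(picard_iterates(4)):
        print(k, [str(c) for c in coeffs])
```

Exact rational arithmetic is used so that the coefficients $1, 1, 1/2, 1/6, \ldots$ appear without rounding.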
Example 6.3.4: Suppose we have the equation
\[ f'(x) = \bigl( f(x) \bigr)^2 \qquad \text{and} \qquad f(0) = 1. \]
From elementary differential equations we know that
\[ f(x) = \frac{1}{1-x} \]
is the solution. Do note that the solution is only defined on $(-\infty,1)$. That is, we will be able to use
$h < 1$, but never a larger $h$. Note that the function that takes $y$ to $y^2$ is simply not Lipschitz as a
function on all of $\mathbb{R}$. As we approach $x = 1$ from the left we note that the solution becomes larger
and larger. The derivative of the solution grows as $y^2$, and therefore the $L$ required will have to be
larger and larger as $y_0$ grows. Thus if we apply the theorem with $x_0$ close to 1 and
$y_0 = \frac{1}{1-x_0}$ we find that the $h$ that the proof guarantees will be smaller and smaller as $x_0$
approaches 1.
The proof of the theorem guarantees an $h$ of about 0.1123 (we omit the calculation) for $x_0 = 0$,
even though we see from above that any $h < 1$ should work.
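As a numerical illustration only (it proves nothing), one can check that $f(x) = 1/(1-x)$ satisfies $f' = f^2$ at a few sample points and watch $y_0 = f(x_0)$ grow as $x_0$ approaches 1. The sketch below assumes a central-difference approximation of the derivative. The names f, derivative, and residual, the step size, and the sample points are all hypothetical choices made for this example.

```python
def f(x):
    """Candidate solution f(x) = 1 / (1 - x), defined only for x < 1."""
    return 1.0 / (1.0 - x)


def derivative(g, x, eps=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + eps) - g(x - eps)) / (2.0 * eps)


def residual(x):
    """Relative size of f'(x) - f(x)**2; close to zero if f solves the equation at x."""
    return abs(derivative(f, x) - f(x) ** 2) / f(x) ** 2


if __name__ == "__main__":
    for x0 in (0.0, 0.5, 0.9, 0.99):
        y0 = f(x0)
        # The derivative of y -> y**2 is 2*y, so near y0 any Lipschitz constant
        # for y -> y**2 must be at least about 2*y0.
        print(f"x0={x0:<5} y0={y0:8.2f} residual={residual(x0):.1e} 2*y0={2 * y0:8.2f}")
```

The growing value of $2 y_0$ is a rough stand-in for the Lipschitz constant $L$ needed near $y_0$, which is why the guaranteed $h$ shrinks.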
Example 6.3.5: Suppose we start with the equation
\[ f'(x) = 2 \sqrt{|f(x)|}, \qquad f(0) = 0. \]
Note that $F(x,y) = 2\sqrt{|y|}$ is not Lipschitz in $y$ (why?). Therefore the equation does not satisfy the
hypotheses of the theorem. The function
\[ f(x) = \begin{cases} x^2 & \text{if } x \geq 0, \\ -x^2 & \text{if } x < 0, \end{cases} \]
is a solution, but $g(x) = 0$ is also a solution.
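The failure of uniqueness can be checked directly. The sketch below, which is an illustration and not part of the original notes, evaluates both sides of $f'(x) = 2\sqrt{|f(x)|}$ for the two solutions at a few sample points. The derivative of $f$ is entered by hand as $2|x|$, which is an assumption the reader should verify. All function names are hypothetical.

```python
import math


def F(y):
    """Right-hand side F(x, y) = 2 * sqrt(|y|); it does not depend on x."""
    return 2.0 * math.sqrt(abs(y))


def f(x):
    """First solution: x**2 for x >= 0 and -x**2 for x < 0, that is, x * |x|."""
    return x * abs(x)


def f_prime(x):
    """Derivative of f computed by hand: 2 * |x| (also valid at x = 0)."""
    return 2.0 * abs(x)


def g(x):
    """Second solution: identically zero, so its derivative is zero as well."""
    return 0.0


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        # Both differences should vanish: f' - F(f) and g' - F(g).
        print(x, f_prime(x) - F(f(x)), 0.0 - F(g(x)))
```

Both solutions satisfy the same initial condition $f(0) = g(0) = 0$, yet they differ for every $x \neq 0$.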
6.3.4 Exercises
Exercise 6.3.1: Let $I, J \subset \mathbb{R}$ be intervals. Let $F \colon I \times J \to \mathbb{R}$ be a
continuous function of two variables and suppose that $f \colon I \to J$ is a continuous function. Show that
$F\bigl(x, f(x)\bigr)$ is a continuous function on $I$.
Exercise 6.3.2: Let $I, J \subset \mathbb{R}$ be closed bounded intervals. Show that if
$F \colon I \times J \to \mathbb{R}$ is continuous, then $F$ is bounded.
Exercise 6.3.3: We have proved Picard's theorem under the assumption that $x_0 = 0$. Prove the full
statement of Picard's theorem for an arbitrary $x_0$.
Exercise 6.3.4: Let $f'(x) = x f(x)$ be our equation. Start with the initial condition $f(0) = 2$ and
find the Picard iterates $f_0, f_1, f_2, f_3, f_4$.
Exercise 6.3.5: Suppose that $F \colon I \times J \to \mathbb{R}$ is a function that is continuous in the first
variable, that is, for any fixed $y$ the function that takes $x$ to $F(x,y)$ is continuous. Further, suppose
that $F$ is Lipschitz in the second variable, that is, there exists a number $L$ such that
\[ |F(x,y) - F(x,z)| \leq L |y - z| \qquad \text{for all } y, z \in J,\ x \in I. \]
Show that $F$ is continuous as a function of two variables. Therefore, the hypotheses in the theorem
could be made even weaker.
Further Reading
[BS] Robert G. Bartle and Donald R. Sherbert, Introduction to real analysis, 3rd ed., John Wiley
& Sons Inc., New York, 2000.
[DW] John P. D'Angelo and Douglas B. West, Mathematical Thinking: Problem-Solving and
Proofs, 2nd ed., Prentice Hall, 1999.
[R1] Maxwell Rosenlicht, Introduction to analysis, Dover Publications Inc., New York, 1986.
Reprint of the 1968 edition.
[R2] Walter Rudin, Principles of mathematical analysis, 3rd ed., McGraw-Hill Book Co., New
York, 1976. International Series in Pure and Applied Mathematics.
[T] William F. Trench, Introduction to real analysis, Pearson Education, 2003.
http://ramanujan.math.trinity.edu/wtrench/texts/TRENCH_REAL_ANALYSIS.PDF.
Index
absolute convergence, 72
absolute maximum, 92
absolute minimum, 92
absolute value, 31
additive property of the integral, 125
Archimedean property, 27
arithmetic-geometric mean inequality, 30
bijection, 15
bijective, 15
Bolzano’s intermediate value theorem, 95
Bolzano’s theorem, 95
Bolzano-Weierstrass theorem, 61
bounded above, 21
bounded below, 21
bounded function, 33, 92
bounded interval, 35
bounded sequence, 39
Cantor’s theorem, 17, 35
cardinality, 15
Cartesian product, 13
Cauchy in the uniform norm, 143
Cauchy sequence, 65
Cauchy series, 70
Cauchy-complete, 67
chain rule, 106
change of variables theorem, 136
closed interval, 35
cluster point, 63, 79
comparison test for series, 73
complement relative to, 10
complete, 67
completeness property, 22
composition of functions, 15
conditionally convergent, 72
constant sequence, 39
continuous at c, 86
continuous function, 86
continuous function of two variables, 151
continuously differentiable, 114
converge, 40
convergent sequence, 40
convergent series, 68
converges, 80
converges absolutely, 72
converges in uniform norm, 142
converges pointwise, 139
converges uniformly, 141
countable, 16
countably infinite, 16
Darboux sum, 118
Darboux’s theorem, 113
decreasing, 111
Dedekind completeness property, 22
DeMorgan’s theorem, 10
density of rational numbers, 27
difference quotient, 103
differentiable, 103
differential equation, 151
Dini’s theorem, 150
direct image, 14
Dirichlet function, 89
discontinuity, 89
discontinuous, 89
disjoint, 10
divergent sequence, 40
divergent series, 68
diverges, 80
domain, 14
element, 8
elementary step function, 132
empty set, 8
equal, 9
existence and uniqueness theorem, 152
extended real numbers, 28
field, 22
finite, 15
finitely many discontinuities, 128
first derivative, 115
first order ordinary differential equation, 151
function, 13
fundamental theorem of calculus, 133
graph, 14
greatest lower bound, 22
half-open interval, 35
harmonic series, 71
image, 14
increasing, 111
induction, 12
induction hypothesis, 12
infimum, 22
infinite, 15
injection, 15
injective, 15
integers, 9
integration by parts, 138
intermediate value theorem, 95
intersection, 9
interval, 35
inverse function, 15
inverse image, 14
irrational, 26
Lagrange form, 116
least upper bound, 21
least-upper-bound property, 22
Leibniz rule, 105
limit, 80
limit inferior, 57
limit of a function, 80
limit of a sequence, 40
limit superior, 57
linearity of series, 71
linearity of the derivative, 105
linearity of the integral, 127
Lipschitz continuous, 101
lower bound, 21
lower Darboux integral, 119
lower Darboux sum, 118
mapping, 13
maximum, 29
Maximum-minimum theorem, 92
mean value theorem, 110
mean value theorem for integrals, 131
member, 8
minimum, 29
Minimum-maximum theorem, 92
monotone decreasing sequence, 42
monotone increasing sequence, 42
monotone sequence, 42
monotonic sequence, 42
monotonicity of the integral, 127
n times differentiable, 115
naïve set theory, 8
natural numbers, 9
negative, 23
nonnegative, 23
nonpositive, 23
nth derivative, 115
nth Taylor polynomial for f, 115
one-to-one, 15
onto, 15
open interval, 35
ordered field, 23
ordered set, 21
p-series, 74
p-test, 74
partial sums, 68
partition, 117
Picard iterate, 152
Picard iteration, 152
Picard’s theorem, 152
pointwise convergence, 139
polynomial, 87
popcorn function, 90, 132
positive, 23
power set, 16
principle of induction, 12
principle of strong induction, 13
product rule, 106
proper subset, 9
quotient rule, 106
range, 14
range of a sequence, 39
ratio test for sequences, 55
ratio test for series, 76
rational numbers, 9
real numbers, 21
refinement of a partition, 119
relative maximum, 109
relative minimum, 109
remainder term in Taylor’s formula, 116
restriction, 84
reverse triangle inequality, 32
Riemann integrable, 121
Riemann integral, 121
Rolle’s theorem, 109
second derivative, 115
sequence, 39
series, 68
set, 8
set building notation, 9
set theory, 8
set-theoretic difference, 10
set-theoretic function, 13
squeeze lemma, 47
step function, 132
strictly decreasing, 111
strictly increasing, 111
subsequence, 44
subset, 9
supremum, 21
surjection, 15
surjective, 15
symmetric difference, 18
tail of a sequence, 44
Taylor polynomial, 115
Taylor’s theorem, 115
Thomae function, 90, 132
triangle inequality, 31
unbounded interval, 35
uncountable, 16
uniform convergence, 141
uniform norm, 142
uniform norm convergence, 142
uniformly Cauchy, 143
uniformly continuous, 98
union, 9
universe, 8
upper bound, 21
upper Darboux integral, 119
upper Darboux sum, 118
Venn diagram, 10
well ordering principle, 12
well ordering property, 12
Introduction
0.1 Notes about these notes
This book is a one semester course in basic analysis. These were my lecture notes for teaching Math 444 at the University of Illinois at Urbana-Champaign (UIUC) in Fall semester 2009. The course is a first course in mathematical analysis aimed at students who do not necessarily wish to continue a graduate study in mathematics. A prerequisite for the course is a basic proof course, for example one using the (unfortunately rather pricey) book [DW]. The course does not cover topics such as metric spaces, which a more advanced course would. It should be possible to use these notes for a beginning of a more advanced course, but further material should be added.
The book normally used for the class at UIUC is Bartle and Sherbert, Introduction to Real Analysis third edition [BS]. The structure of the notes mostly follows the syllabus of UIUC Math 444 and therefore has some similarities with [BS]. Some topics covered in [BS] are covered in slightly different order, some topics differ substantially from [BS] and some topics are not covered at all. For example, we will define the Riemann integral using Darboux sums and not tagged partitions. The Darboux approach is far more appropriate for a course of this level. In my view, [BS] seems to be targeting a different audience than this course, and that is the reason for writing this present book. The generalized Riemann integral is not covered at all.
As the integral is treated more lightly, we can spend some extra time on the interchange of limits and in particular on a section on Picard's theorem on the existence and uniqueness of solutions of ordinary differential equations if time allows. This theorem is a wonderful example that uses many results proved in the book.
Other excellent books exist. My favorite is without doubt Rudin's excellent Principles of Mathematical Analysis [R2] or as it is commonly and lovingly called baby Rudin (to distinguish it from his other great analysis textbook). I have taken a lot of inspiration and ideas from Rudin. However, Rudin is a bit more advanced and ambitious than this present course. For those that wish to continue mathematics, Rudin is a fine investment. An inexpensive alternative to Rudin is Rosenlicht's Introduction to Analysis [R1]. Rosenlicht may not be as dry as Rudin for those just starting out in mathematics. There is also the freely downloadable Introduction to Real Analysis by William Trench [T] for those that do not wish to invest much money.
I want to mention a note about the style of some of the proofs. Many proofs that are traditionally done by contradiction, I prefer to do by a direct proof or at least by a contrapositive. While the book does include proofs by contradiction, I only do so when the contrapositive statement seemed too awkward, or when the contradiction follows rather quickly. In my opinion, contradiction is more likely to get the beginning student into trouble. In a contradiction proof, we are arguing about objects that do not exist. In a direct proof or a contrapositive proof one can be guided by intuition, but in a contradiction proof, intuition usually leads us astray.
I also try to avoid unnecessary formalism where it is unhelpful. Furthermore, the proofs and the language get slightly less formal as we progress through the book, as more and more details are left out to avoid clutter.
As a general rule, I will use := instead of = to define an object rather than to simply show equality. I use this symbol rather more liberally than is usual. I may use it even when the context is “local,” that is, I may simply define a function $f(x) := x^2$ for a single exercise or example.
If you are teaching (or being taught) with [BS], here is the correspondence of the sections. The correspondences are only approximate, the material in these notes and in [BS] differs, as described above.
[Table: section of these notes and the corresponding section in [BS]; entries not recoverable.]
It is possible to skip or skim some material in the book as it is not used later on. The optional material is marked in the notes that appear below every section title. Section §0.3 can be covered lightly, or left as reading. The material within is considered prerequisite. The section on Taylor's theorem (§4.3) can safely be skipped as it is never used later. Uncountability of $\mathbb{R}$ in §1.4 can safely be skipped. The alternative proof of Bolzano-Weierstrass in §2.3 can safely be skipped. And of course, the section on Picard's theorem can also be skipped if there is no time at the end of the course, though I have not marked the section optional.
Finally I would like to acknowledge Jana Maříková and Glen Pugh for teaching with the notes and finding many typos and errors. I would also like to thank Dan Stoneham, Frank Beatrous, and an anonymous reader for suggestions and finding errors and typos.
0.2 About analysis
Analysis is the branch of mathematics that deals with inequalities and limiting processes. The present course will deal with the most basic concepts in analysis. The goal of the course is to acquaint the reader with the basic concepts of rigorous proof in analysis, and also to set a firm foundation for calculus of one variable.
Calculus has prepared you (the student) for using mathematics without telling you why what you have learned is true. To use (or teach) mathematics effectively, you cannot simply know what is true, you must know why it is true. This course is to tell you why calculus is true. It is here to give you a good understanding of the concept of a limit, the derivative, and the integral.
Let us give an analogy to make the point. An auto mechanic that has learned to change the oil, fix broken headlights, and charge the battery, will only be able to do those simple tasks. He will not be able to work independently to diagnose and fix problems. A high school teacher that does not understand the definition of the Riemann integral will not be able to properly answer all the student's questions that could come up. To this day I remember several nonsensical statements I heard from my calculus teacher in high school who simply did not understand the concept of the limit, though he could “do” all problems in calculus.
We will start with discussion of the real number system, most importantly its completeness property, which is the basis for all that we will talk about. We will then discuss the simplest form of a limit, that is, the limit of a sequence. We will then move to study functions of one variable, continuity, and the derivative. Next, we will define the Riemann integral and prove the fundamental theorem of calculus. We will end with discussion of sequences of functions and the interchange of limits.
Let me give perhaps the most important difference between analysis and algebra. In algebra, we prove equalities directly. That is, we prove that an object (a number perhaps) is equal to another object. In analysis, we generally prove inequalities. To illustrate the point, consider the following statement.
Let $x$ be a real number. If $0 \leq x < \varepsilon$ is true for all real numbers $\varepsilon > 0$, then $x = 0$.
This statement is the general idea of what we do in analysis. If we wish to show that $x = 0$, we will show that $0 \leq x < \varepsilon$ for all positive $\varepsilon$.
The term “real analysis” is a little bit of a misnomer. I prefer to normally use just “analysis.” The other type of analysis, that is, “complex analysis” really builds up on the present material, rather than being distinct. Furthermore, a more advanced course on “real analysis” would talk about complex numbers often. I suspect the nomenclature is just historical baggage.
Let us get on with the show. . . .
0.3 Basic set theory
Note: 1–3 lectures (some material can be skipped or covered lightly)
Before we can start talking about analysis we need to fix some language. Modern∗ analysis uses the language of sets, and therefore that's where we will start. We will talk about sets in a rather informal way, using the so-called “naïve set theory.” Do not worry, that is what majority of mathematicians use, and it is hard to get into trouble.
It will be assumed that the reader has seen basic set theory and has had a course in basic proof writing. This section should be thought of as a refresher.
0.3.1 Sets
Definition 0.3.1. A set is just a collection of objects called elements or members of a set. A set with no objects is called the empty set and is denoted by $\emptyset$ (or sometimes by $\{\}$).
The best way to think of a set is like a club with a certain membership. For example, the students who play chess are members of the chess club. However, do not take the analogy too far. A set is only defined by the members that form the set. That is, two sets that have the same members are the same set.
Most of the time we will consider sets of numbers. For example, the set
\[ S := \{0, 1, 2\} \]
is the set containing the three elements 0, 1, and 2. We write
\[ 1 \in S \]
to denote that the number 1 belongs to the set $S$. That is, 1 is a member of $S$. Similarly we write
\[ 7 \notin S \]
to denote that the number 7 is not in $S$. That is, 7 is not a member of $S$. The elements of all sets under consideration come from some set we call the universe. For simplicity, we often consider the universe to be a set that contains only the elements (for example numbers) we are interested in. The universe is generally understood from context and is not explicitly mentioned. In this course, our universe will most often be the set of real numbers.
The elements of a set will usually be numbers. Do note, however, the elements of a set can also be other sets, so we can have a set of sets as well.
A set can contain some of the same elements as another set. For example,
\[ T := \{0, 2\} \]
contains the numbers 0 and 2. In this case all elements of $T$ also belong to $S$. We write $T \subset S$. More formally we have the following definition.
∗ The term “modern” refers to late 19th century up to the present.
Definition 0.3.2.
(i) A set $A$ is a subset of a set $B$ if $x \in A$ implies that $x \in B$, and we write $A \subset B$. That is, all members of $A$ are also members of $B$.
(ii) Two sets $A$ and $B$ are equal if $A \subset B$ and $B \subset A$. We write $A = B$. That is, $A$ and $B$ contain exactly the same elements. If it is not true that $A$ and $B$ are equal, then we write $A \neq B$.
(iii) A set $A$ is a proper subset of $B$ if $A \subset B$ and $A \neq B$. We write $A \subsetneq B$.
When $A = B$, we consider $A$ and $B$ to just be two names for the same exact set. For example, for $S$ and $T$ defined above we have $T \subset S$, but $T \neq S$. So $T$ is a proper subset of $S$. At this juncture, we also mention the set building notation,
\[ \{ x \in A : P(x) \}. \]
This notation refers to a subset of the set $A$ containing all elements of $A$ that satisfy the property $P(x)$. The notation is sometimes abbreviated ($A$ is not mentioned) when understood from context. Furthermore, $x$ is sometimes replaced with a formula to make the notation easier to read. Let us see some examples of sets.
Example 0.3.3: The following are sets including the standard notations for these.
(i) The set of natural numbers, $\mathbb{N} := \{1, 2, 3, \ldots\}$.
(ii) The set of integers, $\mathbb{Z} := \{0, -1, 1, -2, 2, \ldots\}$.
(iii) The set of rational numbers, $\mathbb{Q} := \{ \frac{m}{n} : m, n \in \mathbb{Z} \text{ and } n \neq 0 \}$.
(iv) The set of even natural numbers, $\{2m : m \in \mathbb{N}\}$.
(v) The set of real numbers, $\mathbb{R}$.
Note that $\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R}$.
There are many operations we will want to do with sets.
Definition 0.3.4.
(i) A union of two sets $A$ and $B$ is defined as
\[ A \cup B := \{ x : x \in A \text{ or } x \in B \}. \]
(ii) An intersection of two sets $A$ and $B$ is defined as
\[ A \cap B := \{ x : x \in A \text{ and } x \in B \}. \]
(iii) A complement of $B$ relative to $A$ (or set-theoretic difference of $A$ and $B$) is defined as
\[ A \setminus B := \{ x : x \in A \text{ and } x \notin B \}. \]
(iv) We just say complement of $B$ and write $B^c$ if $A$ is understood from context. $A$ is either the entire universe or is the obvious set that contains $B$.
(v) We say that sets $A$ and $B$ are disjoint if $A \cap B = \emptyset$.
The notation $B^c$ may be a little vague at this point. But for example if the set $B$ is a subset of the real numbers $\mathbb{R}$, then $B^c$ will mean $\mathbb{R} \setminus B$. If $B$ is naturally a subset of the natural numbers, then $B^c$ is $\mathbb{N} \setminus B$. If ambiguity would ever arise, we will use the set difference notation $A \setminus B$.
[Figure 1: Venn diagrams of set operations. Panels: $A \cup B$, $A \cap B$, $A \setminus B$, $B^c$.]
We illustrate the operations on the Venn diagrams in Figure 1. Let us now establish one of most basic theorems about sets and logic.
Theorem 0.3.5 (DeMorgan). Let $A, B, C$ be sets. Then
\[ (B \cup C)^c = B^c \cap C^c, \qquad (B \cap C)^c = B^c \cup C^c, \]
or, more generally,
\[ A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C), \qquad A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C). \]
Proof. We note that the first statement is proved by the second statement if we assume that set $A$ is our “universe.”
Let us prove $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$. Remember the definition of equality of sets. First, we must show that if $x \in A \setminus (B \cup C)$, then $x \in (A \setminus B) \cap (A \setminus C)$. Second, we must also show that if $x \in (A \setminus B) \cap (A \setminus C)$, then $x \in A \setminus (B \cup C)$.
So let us assume that $x \in A \setminus (B \cup C)$. Then $x$ is in $A$, but not in $B$ nor $C$. Hence $x$ is in $A$ and not in $B$, that is, $x \in A \setminus B$. Similarly $x \in A \setminus C$. Thus $x \in (A \setminus B) \cap (A \setminus C)$.
On the other hand suppose that $x \in (A \setminus B) \cap (A \setminus C)$. In particular $x \in (A \setminus B)$ and so $x \in A$ and $x \notin B$. Also as $x \in (A \setminus C)$, then $x \notin C$. Hence $x \in A \setminus (B \cup C)$.
The proof of the other equality is left as an exercise.
We will also need to intersect or union several sets at once. If there are only finitely many, then we just apply the union or intersection operation several times. However, suppose that we have an infinite collection of sets (a set of sets) $\{A_1, A_2, A_3, \ldots\}$. We define
\[ \bigcup_{n=1}^\infty A_n := \{ x : x \in A_n \text{ for some } n \in \mathbb{N} \}, \qquad \bigcap_{n=1}^\infty A_n := \{ x : x \in A_n \text{ for all } n \in \mathbb{N} \}. \]
We could also have sets indexed by two integers. For example, we could have the set of sets $\{A_{1,1}, A_{1,2}, A_{2,1}, A_{1,3}, A_{2,2}, A_{3,1}, \ldots\}$. Then we can write
\[ \bigcup_{n=1}^\infty \bigcup_{m=1}^\infty A_{n,m} = \bigcup_{n=1}^\infty \left( \bigcup_{m=1}^\infty A_{n,m} \right). \]
And similarly with intersections.
It is not hard to see that we could take the unions in any order. However, switching unions and intersections is not generally permitted without proof. For example:
\[ \bigcup_{n=1}^\infty \bigcap_{m=1}^\infty \{ k \in \mathbb{N} : mk < n \} = \bigcup_{n=1}^\infty \emptyset = \emptyset. \]
However,
\[ \bigcap_{m=1}^\infty \bigcup_{n=1}^\infty \{ k \in \mathbb{N} : mk < n \} = \bigcap_{m=1}^\infty \mathbb{N} = \mathbb{N}. \]
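The DeMorgan identities for relative complements, $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$ and $A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$, can be spot-checked on small finite sets. The following sketch is only a sanity check on one hypothetical choice of $A$, $B$, $C$, not a proof. It uses Python's built-in set type, where `-` is set difference, `|` is union, and `&` is intersection.

```python
# Hypothetical sample sets; A plays the role of the universe.
A = set(range(10))
B = {1, 2, 3, 4}
C = {3, 4, 5, 6}

# A \ (B u C) compared with (A \ B) n (A \ C)
lhs_union = A - (B | C)
rhs_union = (A - B) & (A - C)

# A \ (B n C) compared with (A \ B) u (A \ C)
lhs_inter = A - (B & C)
rhs_inter = (A - B) | (A - C)

print(lhs_union == rhs_union, sorted(lhs_union))
print(lhs_inter == rhs_inter, sorted(lhs_inter))
```

Changing the sample sets and rerunning is a quick way to build intuition before writing out the element-chasing proof.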
12 INTRODUCTION 0.}. That is. Suppose that (i) (basis statement) P(1) is true.7: Let us prove that for all n ∈ N we have 2n−1 ≤ n!.2 Induction A common method of proof is the principle of induction. That is. we know that P(m − 1) is true.3. By the principle of induction. we have 2(n!) ≤ (n + 1)(n!) = (n + 1)!. Let us call m the least element of S. we see that P(n) is true for all n. Proof. then P(n + 1) is true.6 (Principle of induction).3. . . The principle of induction is the following theorem. . (ii) (induction step) if P(n) is true. We start with the set of natural numbers N = {1. and hence 2n−1 ≤ n! is true for all n ∈ N. We note that the natural ordering on N (that is. and hence P(n + 1) is true. Multiply both sides by 2 to obtain 2n ≤ 2(n!). but all that changes is the labeling. / Since m was the least element of S. As 2 ≤ (n + 1) when n ∈ N. then P(n + 1) is true” is usually called the induction hypothesis. 1 < 2 < 3 < 4 < · · · ) has a wonderful property. Then P(n) is true for all n ∈ N. We let P(n) be the statement that 2n−1 ≤ n! is true. Therefore m > 1 and m − 1 is a natural number as well. But by the induction step we can see that P(m − 1 + 1) = P(m) is true. The assumption that P(n) is true in “if P(n) is true. Every nonempty subset of N has a least (smallest) element.3. 3. The natural numbers N ordered in the natural way possess the well ordering property or the well ordering principle. 2n ≤ 2(n!) ≤ (n + 1)!. Suppose that S is the set of natural numbers m for which P(m) is not true. suppose that 2n−1 ≤ n! holds. contradicting the statement that m ∈ S. Then S has a least element by the well ordering principle. Suppose that S is nonempty. Theorem 0. We know that 1 ∈ S by assumption. we can see that P(1) is true. which is equivalent to the well ordering property of the natural numbers. Sometimes it is convenient to start at a different number than 1. Suppose that P(n) is true. Therefore S is empty and P(n) is true for all n ∈ N. By plugging in n = 1. 2. 
Example 0. Well ordering property of N. Let P(n) be a statement depending on a natural number n. .
Example 0.3.8: We claim that for all $c \neq 1$, we have that
\[ 1 + c + c^2 + \cdots + c^n = \frac{1 - c^{n+1}}{1 - c}. \]
Proof: It is easy to check that the equation holds with $n = 1$. Suppose that it is true for $n$. Then
\begin{align*}
1 + c + c^2 + \cdots + c^n + c^{n+1} &= (1 + c + c^2 + \cdots + c^n) + c^{n+1} \\
&= \frac{1 - c^{n+1}}{1 - c} + c^{n+1} \\
&= \frac{1 - c^{n+1} + (1 - c) c^{n+1}}{1 - c} \\
&= \frac{1 - c^{n+2}}{1 - c}.
\end{align*}
There is an equivalent principle called strong induction. The proof that strong induction is equivalent to induction is left as an exercise.
Theorem 0.3.9 (Principle of strong induction). Let $P(n)$ be a statement depending on a natural number $n$. Suppose that
(i) (basis statement) $P(1)$ is true,
(ii) (induction step) if $P(k)$ is true for all $k = 1, 2, \ldots, n$, then $P(n+1)$ is true.
Then $P(n)$ is true for all $n \in \mathbb{N}$.
0.3.3 Functions
Informally, a set-theoretic function $f$ taking a set $A$ to a set $B$ is a mapping that to each $x \in A$ assigns a unique $y \in B$. We write $f \colon A \to B$. For example, we could define a function $f \colon S \to T$ taking $S = \{0, 1, 2\}$ to $T = \{0, 2\}$ by assigning $f(0) := 2$, $f(1) := 2$, and $f(2) := 0$. That is, a function $f \colon A \to B$ is a black box, into which we can stick an element of $A$ and the function will spit out an element of $B$. Sometimes $f$ is called a mapping and we say that $f$ maps $A$ to $B$.
Often, functions are defined by some sort of formula, however, you should really think of a function as just a very big table of values. The subtle issue here is that a single function can have several different formulas, all giving the same function. Also a function need not have any formula being able to compute its values.
To define a function rigorously first let us define the Cartesian product.
Definition 0.3.10. Let $A$ and $B$ be sets. Then the Cartesian product is the set of tuples defined as follows.
\[ A \times B := \{ (x, y) : x \in A,\ y \in B \}. \]
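For finite sets, the Cartesian product and the “table of values” view of a function can be written down explicitly. The sketch below is an illustration only: it builds $S \times T$ for $S = \{0, 1, 2\}$ and $T = \{0, 2\}$ with the standard library function `itertools.product` and stores the assignment $f(0) := 2$, $f(1) := 2$, $f(2) := 0$ as a dictionary. The variable names are hypothetical.

```python
from itertools import product

S = {0, 1, 2}
T = {0, 2}

# S x T as a set of ordered pairs (x, y) with x in S and y in T.
cartesian = set(product(S, T))

# The assignment f(0) := 2, f(1) := 2, f(2) := 0 as a table of values.
f_table = {0: 2, 1: 2, 2: 0}

# The same function viewed as a set of pairs (x, f(x)).
f_pairs = set(f_table.items())

print(sorted(cartesian))
print(sorted(f_pairs), f_pairs <= cartesian)
```

The set of pairs contains exactly one pair for each element of $S$, which is what makes the table a function.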
For example, the set $[0,1] \times [0,1]$ is a set in the plane bounded by a square with vertices $(0,0)$, $(0,1)$, $(1,0)$, and $(1,1)$. When $A$ and $B$ are the same set we sometimes use a superscript 2 to denote such a product. For example $[0,1]^2 = [0,1] \times [0,1]$, or $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ (the Cartesian plane).
Definition 0.3.11. A function $f \colon A \to B$ is a subset of $A \times B$ such that for each $x \in A$, there is a unique $(x, y) \in f$. Sometimes the set $f$ is called the graph of the function rather than the function itself.
The set $A$ is called the domain of $f$ (and sometimes confusingly denoted $D(f)$). The set
\[ R(f) := \{ y \in B : \text{there exists an } x \text{ such that } (x, y) \in f \} \]
is called the range of $f$.
Note that $R(f)$ can possibly be a proper subset of $B$, while the domain of $f$ is always equal to $A$.
Example 0.3.12: From calculus, you are most familiar with functions taking real numbers to real numbers. However, you have seen some other types of functions as well. For example the derivative is a function mapping the set of differentiable functions to the set of all functions. Another example is the Laplace transform, which also takes functions to functions. Yet another example is the function that takes a continuous function $g$ defined on the interval $[0,1]$ and returns the number $\int_0^1 g(x)\,dx$.
Definition 0.3.13. Let $f \colon A \to B$ be a function. Let $C \subset A$. Define the image (or direct image) of $C$ as
\[ f(C) := \{ f(x) \in B : x \in C \}. \]
Let $D \subset B$. Define the inverse image as
\[ f^{-1}(D) := \{ x \in A : f(x) \in D \}. \]
Example 0.3.14: Define the function $f \colon \mathbb{R} \to \mathbb{R}$ by $f(x) := \sin(\pi x)$. Then $f([0, 1/2]) = [0,1]$, $f^{-1}(\{0\}) = \mathbb{Z}$, etc. . . .
Proposition 0.3.15. Let $f \colon A \to B$. Let $C, D$ be subsets of $B$. Then
\begin{align*}
f^{-1}(C \cup D) &= f^{-1}(C) \cup f^{-1}(D), \\
f^{-1}(C \cap D) &= f^{-1}(C) \cap f^{-1}(D), \\
f^{-1}(C^c) &= \bigl( f^{-1}(C) \bigr)^c.
\end{align*}
Read the last line as $f^{-1}(B \setminus C) = A \setminus f^{-1}(C)$.
Proof. Let us start with the union. Suppose that $x \in f^{-1}(C \cup D)$. That means that $x$ maps to $C$ or $D$. Thus $f^{-1}(C \cup D) \subset f^{-1}(C) \cup f^{-1}(D)$. Conversely if $x \in f^{-1}(C)$, then $x \in f^{-1}(C \cup D)$. Similarly for $x \in f^{-1}(D)$. Hence $f^{-1}(C \cup D) \supset f^{-1}(C) \cup f^{-1}(D)$, and we have equality.
The rest of the proof is left as an exercise.
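The identities for inverse images, $f^{-1}(C \cup D) = f^{-1}(C) \cup f^{-1}(D)$, $f^{-1}(C \cap D) = f^{-1}(C) \cap f^{-1}(D)$, and $f^{-1}(B \setminus C) = A \setminus f^{-1}(C)$, can be checked on a small example. The sketch below is an illustration under stated assumptions, not a proof: the function $f(x) = x^2$ on $A = \{-2, \ldots, 2\}$, the target set $B$, the subsets $C$ and $D$, and the helper names image and inverse_image are all hypothetical choices.

```python
A = {-2, -1, 0, 1, 2}
B = {0, 1, 2, 3, 4}

# Hypothetical example function f : A -> B, f(x) = x**2, stored as a table of values.
f = {x: x * x for x in A}


def image(f, C):
    """Direct image f(C) = {f(x) : x in C} for a subset C of the domain."""
    return {f[x] for x in C}


def inverse_image(f, D):
    """Inverse image f^{-1}(D) = {x in the domain : f(x) in D}."""
    return {x for x in f if f[x] in D}


C = {0, 1}
D = {1, 4}

if __name__ == "__main__":
    print(inverse_image(f, C | D) == inverse_image(f, C) | inverse_image(f, D))
    print(inverse_image(f, C & D) == inverse_image(f, C) & inverse_image(f, D))
    print(inverse_image(f, B - C) == A - inverse_image(f, C))
    # The range of f is a proper subset of B in this example.
    print(sorted(image(f, A)))
```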
The proposition does not hold for direct images. We do have the following weaker result.

Proposition 0.3.16. Let f : A → B. Let C, D be subsets of A. Then
    f(C ∪ D) = f(C) ∪ f(D),
    f(C ∩ D) ⊂ f(C) ∩ f(D).
The proof is left as an exercise.

Definition 0.3.17. Let f : A → B be a function. The function f is said to be injective or one-to-one if f(x1) = f(x2) implies x1 = x2. In other words, f^{-1}({y}) is empty or consists of a single element for all y ∈ B. We then call f an injection.
The function f is said to be surjective or onto if f(A) = B. We then call f a surjection.
Finally, a function that is both an injection and a surjection is said to be bijective and we say it is a bijection.

When f : A → B is a bijection, then f^{-1}({y}) always consists of exactly one element of A, and we can then consider f^{-1} as a function f^{-1} : B → A. In this case we call f^{-1} the inverse function of f. For example, for the bijection f(x) := x^3 we have f^{-1}(x) = ∛x.

A final piece of notation for functions that we will need is the composition of functions.

Definition 0.3.18. Let f : A → B, g : B → C. Then we define a function g ∘ f : A → C as follows.
    (g ∘ f)(x) := g(f(x)).

0.3.4 Cardinality

A very subtle issue in set theory and one generating a considerable amount of confusion among students is that of cardinality, or "size" of sets. The concept of cardinality is important in modern mathematics in general and in analysis in particular. In this section, we will see the first really unexpected theorem.

Definition 0.3.19. Let A and B be sets. We say A and B have the same cardinality when there exists a bijection f : A → B. We denote by |A| the equivalence class of all sets with the same cardinality as A and we simply call |A| the cardinality of A.

Note that A has the same cardinality as the empty set if and only if A itself is the empty set. We then write |A| := 0.

Definition 0.3.20. Suppose that A has the same cardinality as {1, 2, 3, ..., n} for some n ∈ N. We then write |A| := n, and we say that A is finite. When A is the empty set, we also call A finite. We say that A is infinite or "of infinite cardinality" if A is not finite.
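For maps between finite sets, the definitions above can be checked mechanically. The following is a minimal sketch, assuming a function is stored as a Python dict from domain elements to values; the helper names (is_injective, is_surjective, is_bijective, compose) are illustrative and not part of the notes.

```python
def is_injective(f, domain):
    """f(x1) == f(x2) implies x1 == x2, checked on a finite domain."""
    images = [f[x] for x in domain]
    return len(set(images)) == len(images)

def is_surjective(f, domain, codomain):
    """The direct image f(domain) equals the codomain."""
    return {f[x] for x in domain} == set(codomain)

def is_bijective(f, domain, codomain):
    """Both an injection and a surjection."""
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

def compose(g, f):
    """Return g ∘ f as a dict, so that (g ∘ f)(x) = g(f(x))."""
    return {x: g[y] for x, y in f.items()}

A = {"a", "b", "c"}
B = {1, 2, 3}
f = {"a": 2, "b": 3, "c": 1}   # a bijection from A to {1, 2, 3}, so |A| = 3
g = {1: "x", 2: "x", 3: "y"}   # not injective: g(1) = g(2)
g_after_f = compose(g, f)      # the composition g ∘ f : A -> {"x", "y"}
```

Here f witnesses that A has the same cardinality as {1, 2, 3}, while g fails injectivity because two inputs share a value.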
That the notation |A| = n is justified we leave as an exercise. That is, for each nonempty finite set A, there exists a unique natural number n such that there exists a bijection from A to {1, 2, 3, ..., n}.

We can also order sets by size.

Definition 0.3.21. We write |A| ≤ |B| if there exists an injection from A to B. We write |A| = |B| if A and B have the same cardinality. We write |A| < |B| if |A| ≤ |B|, but A and B do not have the same cardinality.

We state without proof that A and B have the same cardinality if and only if |A| ≤ |B| and |B| ≤ |A|. This is the so-called Cantor-Bernstein-Schroeder theorem. Furthermore, if A and B are any two sets, we can always write |A| ≤ |B| or |B| ≤ |A|. The issues surrounding this last statement are very subtle. As we will not require either of these two statements, we omit proofs.

The interesting cases of sets are infinite sets. We start with the following definition.

Definition 0.3.22. If |A| = |N|, then A is said to be countably infinite. If A is finite or countably infinite, then we say A is countable. If A is not countable, then A is said to be uncountable.

Note that the cardinality of N is usually denoted as ℵ0 (read as aleph-naught)†.

Example 0.3.23: The set of even natural numbers has the same cardinality as N. Proof: Given an even natural number, write it as 2n for some n ∈ N. Then create a bijection taking 2n to n.

In fact, let us mention without proof the following characterization of infinite sets: A set is infinite if and only if it is in one-to-one correspondence with a proper subset of itself.

Example 0.3.24: N × N is a countably infinite set. Proof: Arrange the elements of N × N as follows (1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), .... That is, always write down first all the elements whose two entries sum to k, then write down all the elements whose entries sum to k + 1 and so on. Then define a bijection with N by letting 1 go to (1, 1), 2 go to (1, 2) and so on.

Example 0.3.25: The set of rational numbers is countable. Proof: (informal) Follow the same procedure as in the previous example, writing 1/1, 1/2, 2/1, etc. However, leave out any fraction (such as 2/2) that has already appeared.

For completeness we mention the following statement. If A ⊂ B and B is countable, then A is countable. Similarly if A is uncountable, then B is uncountable. As we will not need this statement in the sequel, and as the proof requires the Cantor-Bernstein-Schroeder theorem mentioned above, we will not give it here.

We give the first truly striking result. First, we need a notation for the set of all subsets of a set.

Definition 0.3.26. If A is a set, we define the power set of A, denoted by P(A), to be the set of all subsets of A.

† For the fans of the TV show Futurama, there is a movie theater in one episode called an ℵ0-plex.
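The diagonal listing used in Examples 0.3.24 and 0.3.25 can be written out as a short program. This is only a sketch under stated assumptions: the generator names are hypothetical, and the second generator lists only the positive rationals, which is the part of Q the informal proof writes down explicitly.

```python
from fractions import Fraction
from itertools import count, islice

def pairs_by_diagonal():
    """Yield (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), ... in that order."""
    for k in count(2):            # k is the sum of the two entries
        for a in range(1, k):
            yield (a, k - a)

def positive_rationals():
    """Yield each positive rational once, skipping repeats such as 2/2."""
    seen = set()
    for a, b in pairs_by_diagonal():
        q = Fraction(a, b)
        if q not in seen:
            seen.add(q)
            yield q

first_pairs = list(islice(pairs_by_diagonal(), 6))
first_rationals = list(islice(positive_rationals(), 5))
```

The position of a pair in the first listing is the natural number it is matched with, which is exactly the bijection with N described in the proof.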
As an example of a power set, if A := {1, 2}, then P(A) = {∅, {1}, {2}, {1, 2}}. Note that for a finite set A of cardinality n, the cardinality of P(A) is 2^n. This fact is left as an exercise. In particular, the cardinality of P(A) is strictly larger than the cardinality of A, at least for finite sets. What is an unexpected and striking fact is that this statement is still true for infinite sets.

Theorem 0.3.27 (Cantor). |A| < |P(A)|. In particular, there exists no surjection from A onto P(A).

Proof. There of course exists an injection f : A → P(A). For any x ∈ A, define f(x) := {x}. Therefore |A| ≤ |P(A)|.

To finish the proof, we have to show that no function f : A → P(A) is a surjection. Suppose that f : A → P(A) is a function. So for x ∈ A, f(x) is a subset of A. Define the set
    B := {x ∈ A : x ∉ f(x)}.
We claim that B is not in the range of f and hence f is not a surjection. Suppose that there exists an x0 such that f(x0) = B. Either x0 ∈ B or x0 ∉ B. If x0 ∈ B, then x0 ∉ f(x0) = B, which is a contradiction. If x0 ∉ B, then x0 ∈ f(x0) = B, which is again a contradiction. Thus such an x0 does not exist. Therefore, B is not in the range of f, and f is not a surjection. As f was an arbitrary function, no surjection can exist.

One particular consequence of this theorem is that there do exist uncountable sets, as P(N) must be uncountable. This fact is related to the fact that the set of real numbers (which we study in the next chapter) is uncountable. The existence of uncountable sets may seem unintuitive, and the theorem caused quite a controversy at the time it was announced. The theorem not only says that uncountable sets exist, but that there in fact exist progressively larger and larger infinite sets N, P(N), P(P(N)), P(P(P(N))), etc.

0.3.5 Exercises

Exercise 0.3.1: Show A \ (B ∩ C) = (A \ B) ∪ (A \ C).

Exercise 0.3.2: Prove that the principle of strong induction is equivalent to the standard induction.

Exercise 0.3.3: Finish the proof of Proposition 0.3.15.

Exercise 0.3.4: a) Prove Proposition 0.3.16. b) Find an example for which equality of sets in f(C ∩ D) ⊂ f(C) ∩ f(D) fails. That is, find an f, A, B, C, and D such that f(C ∩ D) is a proper subset of f(C) ∩ f(D).

Exercise 0.3.5 (Tricky): Prove that if A is finite, then there exists a unique number n such that there exists a bijection between A and {1, 2, 3, ..., n}. In other words, the notation |A| := n is justified. Hint: Show that if n > m, then there is no injection from {1, 2, 3, ..., n} to {1, 2, 3, ..., m}.
Exercise 0.3.6: Prove
a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

Exercise 0.3.7: Let A∆B denote the symmetric difference, that is, the set of all elements that belong to either A or B, but not to both A and B.
a) Draw a Venn diagram for A∆B.
b) Show A∆B = (A \ B) ∪ (B \ A).
c) Show A∆B = (A ∪ B) \ (A ∩ B).

Exercise 0.3.8: For each n ∈ N, let A_n := {(n + 1)k : k ∈ N}.
a) Find A_1 ∩ A_2.
b) Find ⋃_{n=1}^∞ A_n.
c) Find ⋂_{n=1}^∞ A_n.

Exercise 0.3.9: Determine P(S) (the power set) for each of the following:
a) S = ∅,
b) S = {1},
c) S = {1, 2},
d) S = {1, 2, 3, 4}.

Exercise 0.3.10: Let f : A → B and g : B → C be functions.
a) Prove that if g ∘ f is injective, then f is injective.
b) Prove that if g ∘ f is surjective, then g is surjective.
c) Find an explicit example where g ∘ f is bijective, but neither f nor g are bijective.

Exercise 0.3.11: Prove that n < 2^n by induction.

Exercise 0.3.12: Show that for a finite set A of cardinality n, the cardinality of P(A) is 2^n.

Exercise 0.3.13: Prove 1/(1·2) + 1/(2·3) + ··· + 1/(n(n+1)) = n/(n+1) for all n ∈ N.

Exercise 0.3.14: Prove 1^3 + 2^3 + ··· + n^3 = (n(n+1)/2)^2 for all n ∈ N.

Exercise 0.3.15: Prove that n^3 + 5n is divisible by 6 for all n ∈ N.

Exercise 0.3.16: Find the smallest n ∈ N such that 2(n + 5)^2 < n^3 and call it n0. Show that 2(n + 5)^2 < n^3 for all n ≥ n0.

Exercise 0.3.17: Find all n ∈ N such that n^2 < 2^n.

Exercise 0.3.18: Finish the proof that the principle of induction is equivalent to the well ordering property of N. That is, prove the well ordering property for N using the principle of induction.

Exercise 0.3.19: Give an example of a countable collection of finite sets A_1, A_2, ..., whose union is not a finite set.

Exercise 0.3.20: Give an example of a countable collection of infinite sets A_1, A_2, ..., with A_j ∩ A_k being infinite for all j and k, such that ⋂_{j=1}^∞ A_j is nonempty and finite.
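The set B := {x ∈ A : x ∉ f(x)} from the proof of Theorem 0.3.27 can be computed explicitly when A is a small finite set. The sketch below, with hypothetical helper names power_set and diagonal_set, runs through every function f : A → P(A) for A = {1, 2, 3} and checks that B is never one of the values of f. This only illustrates the argument on one finite set; it is not a substitute for the proof.

```python
from itertools import combinations, product

def power_set(A):
    """All subsets of A, as frozensets."""
    elems = sorted(A)
    return [frozenset(c)
            for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def diagonal_set(f, A):
    """B := {x in A : x not in f(x)}, for f given as a dict A -> P(A)."""
    return frozenset(x for x in A if x not in f[x])

A = {1, 2, 3}
subsets = power_set(A)
elems = sorted(A)

# Every f : A -> P(A) is a choice of one subset per element (8**3 = 512 maps).
# For each of them, the diagonal set B is missing from the range of f.
missed_every_time = all(
    diagonal_set(dict(zip(elems, values)), A) not in values
    for values in product(subsets, repeat=len(elems))
)

example_f = {1: frozenset({1}), 2: frozenset(), 3: frozenset({1, 2})}
example_B = diagonal_set(example_f, A)   # 1 is in f(1); 2 and 3 are not in f(2), f(3)
```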
Chapter 1

Real Numbers

1.1 Basic properties

Note: 1.5 lectures

The main object we work with in analysis is the set of real numbers. As this set is so fundamental, often much time is spent on formally constructing the set of real numbers. However, we will take an easier approach here and just assume that a set with the correct properties exists. We need to start with some basic definitions.

Definition 1.1.1. A set A is called an ordered set, if there exists a relation < such that
(i) For any x, y ∈ A, exactly one of x < y, x = y, or y < x holds.
(ii) If x < y and y < z, then x < z.

For example, the rational numbers Q are an ordered set by letting x < y if and only if y − x is a positive rational number. Similarly, N and Z are also ordered sets.

We will write x ≤ y if x < y or x = y. We define > and ≥ in the obvious way.

Definition 1.1.2. Let E ⊂ A, where A is an ordered set.
(i) If there exists a b ∈ A such that x ≤ b for all x ∈ E, then we say E is bounded above and b is an upper bound of E.
(ii) If there exists a b ∈ A such that x ≥ b for all x ∈ E, then we say E is bounded below and b is a lower bound of E.
(iii) If there exists an upper bound b0 of E such that whenever b is any upper bound for E we have b0 ≤ b, then b0 is called the least upper bound or the supremum of E. We write sup E := b0.
(iv) Similarly, if there exists a lower bound b0 of E such that whenever b is any lower bound for E we have b0 ≥ b, then b0 is called the greatest lower bound or the infimum of E. We write inf E := b0.

Note that a supremum or infimum for E (even if they exist) need not be in E. For example the set {x ∈ Q : x < 1} has a least upper bound of 1, but 1 is not in the set itself.

Definition 1.1.3. An ordered set A has the least-upper-bound property if every nonempty subset E ⊂ A that is bounded above has a least upper bound, that is sup E exists in A.

Sometimes the least-upper-bound property is called the completeness property or the Dedekind completeness property.

Example 1.1.4: For example Q does not have the least-upper-bound property. The set {x ∈ Q : x^2 < 2} does not have a supremum. The obvious supremum √2 is not rational. Suppose that x^2 = 2 for some x ∈ Q. Write x = m/n in lowest terms. So (m/n)^2 = 2 or m^2 = 2n^2. Hence m^2 is divisible by 2 and so m is divisible by 2. We write m = 2k and so we have (2k)^2 = 2n^2. We divide by 2 and note that 2k^2 = n^2 and hence n is divisible by 2. But that is a contradiction as we said m/n was in lowest terms.

That Q does not have the least-upper-bound property is one of the most important reasons why we work with R in analysis. The set Q is just fine for algebraists. But analysts require the least-upper-bound property to do any work. We also require our real numbers to have many algebraic properties. In particular, we require that they are a field.

Definition 1.1.5. A set F is called a field if it has two operations defined on it, addition x + y and multiplication xy, and if it satisfies the following axioms.
(A1) If x ∈ F and y ∈ F, then x + y ∈ F.
(A2) (commutativity of addition) x + y = y + x for all x, y ∈ F.
(A3) (associativity of addition) (x + y) + z = x + (y + z) for all x, y, z ∈ F.
(A4) There exists an element 0 ∈ F such that 0 + x = x for all x ∈ F.
(A5) For every element x ∈ F there exists an element −x ∈ F such that x + (−x) = 0.
(M1) If x ∈ F and y ∈ F, then xy ∈ F.
(M2) (commutativity of multiplication) xy = yx for all x, y ∈ F.
(M3) (associativity of multiplication) (xy)z = x(yz) for all x, y, z ∈ F.
(M4) There exists an element 1 (and 1 ≠ 0) such that 1x = x for all x ∈ F.
(M5) For every x ∈ F such that x ≠ 0 there exists an element 1/x ∈ F such that x(1/x) = 1.
(D) (distributive law) x(y + z) = xy + xz for all x, y, z ∈ F.

Example 1.1.6: The set Q of rational numbers is a field. On the other hand Z is not a field, as it does not contain multiplicative inverses.

Definition 1.1.7. A field F is said to be an ordered field if F is also an ordered set such that:
(i) For x, y, z ∈ F, x < y implies x + z < y + z.
(ii) For x, y ∈ F, x > 0 and y > 0 implies xy > 0.
If x > 0, we say x is positive. If x < 0, we say x is negative. We also say x is nonnegative if x ≥ 0, and x is nonpositive if x ≤ 0.

Proposition 1.1.8. Let F be an ordered field and x, y, z ∈ F. Then:
(i) If x > 0, then −x < 0 (and vice-versa).
(ii) If x > 0 and y < z, then xy < xz.
(iii) If x < 0 and y < z, then xy > xz.
(iv) If x ≠ 0, then x^2 > 0.
(v) If 0 < x < y, then 0 < 1/y < 1/x.
Note that (iv) implies in particular that 1 > 0.

Proof. Let us prove (i). The inequality x > 0 implies by item (i) of the definition of ordered field that x + (−x) > 0 + (−x). Now apply the algebraic properties of fields to obtain 0 > −x. The "vice-versa" follows by a similar calculation.

For (ii), first notice that y < z implies 0 < z − y by applying item (i) of the definition of ordered fields. Now apply item (ii) of the definition of ordered fields to obtain 0 < x(z − y). By algebraic properties we get 0 < xz − xy, and again applying item (i) of the definition we obtain xy < xz.

Part (iii) is left as an exercise.

To prove part (iv) first suppose that x > 0. Then by item (ii) of the definition of ordered fields we obtain that x^2 > 0 (use y = x). If x < 0, we can use part (iii) of this proposition. Plug in y = x and z = 0.

Finally to prove part (v), notice that 1/x cannot be equal to zero (why?). If 1/x < 0, then −1/x > 0 by (i). Then apply part (ii) (as x > 0) to obtain x(−1/x) > 0x or −1 > 0, which contradicts 1 > 0 by using part (i) again. Hence 1/x > 0. Similarly 1/y > 0. Hence (1/x)(1/y) > 0 by definition and we have (1/x)(1/y)x < (1/x)(1/y)y. By algebraic properties we get 1/y < 1/x.
The product of two positive numbers (elements of an ordered field) is positive. However, it is not true that if the product is positive, then each of the two factors must be positive. We do have the following proposition.

Proposition 1.1.9. Let x, y ∈ F where F is an ordered field. Suppose that xy > 0. Then either both x and y are positive, or both are negative.

Proof. It is clear that both possibilities can in fact happen. If either x or y is zero, then xy is zero and hence not positive. Hence we can assume that x and y are nonzero, and we simply need to show that if they have opposite signs, then xy < 0. Without loss of generality suppose that x > 0 and y < 0. Multiply y < 0 by x to get xy < 0x = 0. The result follows by contrapositive.

1.1.1 Exercises

Exercise 1.1.1: Prove part (iii) of Proposition 1.1.8.

Exercise 1.1.2: Let S be an ordered set. Let A ⊂ S be a nonempty finite subset. Then A is bounded. Furthermore, inf A exists and is in A and sup A exists and is in A. Hint: Use induction.

Exercise 1.1.3: Let x, y ∈ F, where F is an ordered field. Suppose that 0 < x < y. Show that x^2 < y^2.

Exercise 1.1.4: Let S be an ordered set. Let B ⊂ S be bounded (above and below). Let A ⊂ B be a nonempty subset. Suppose that all the inf's and sup's exist. Show that inf B ≤ inf A ≤ sup A ≤ sup B.

Exercise 1.1.5: Let S be an ordered set. Let A ⊂ S and suppose that b is an upper bound for A. Suppose that b ∈ A. Show that b = sup A.

Exercise 1.1.6: Let S be an ordered set. Let A ⊂ S be a nonempty subset that is bounded above. Suppose that sup A exists and that sup A ∉ A. Show that A contains a countably infinite subset. In particular, A is infinite.

Exercise 1.1.7: Find a (nonstandard) ordering of the set of natural numbers N such that there exists a proper subset A ⊊ N and such that sup A exists in N but sup A ∉ A.
1.2 The set of real numbers

Note: 2 lectures

1.2.1 The set of real numbers

We finally get to the real number system. Instead of constructing the real number set from the rational numbers, we simply state their existence as a theorem without proof. Notice that Q is an ordered field.

Theorem 1.2.1. There exists a unique∗ ordered field R with the least-upper-bound property such that Q ⊂ R.

Note that also N ⊂ Q. As we have seen, 1 > 0. By induction (exercise) we can prove that n > 0 for all n ∈ N. Similarly we can easily verify all the statements we know about rational numbers and their natural ordering.

Let us prove one of the most basic but useful results about the real numbers. The following proposition is essentially how an analyst proves that a number is zero.

Proposition 1.2.2. If x ∈ R is such that x ≥ 0 and x ≤ ε for all ε ∈ R where ε > 0, then x = 0.

Proof. If x > 0, then 0 < x/2 < x (why?). Taking ε = x/2 obtains a contradiction. Thus x = 0.

A more general and related simple fact is that any time we have two real numbers a < b, then there is another real number c such that a < c < b. Just take for example c = (a + b)/2 (why?). In fact, there are infinitely many real numbers between a and b.

The most useful property of R for analysts, however, is not just that it is an ordered field, but that it has the least-upper-bound property. Essentially we want Q, but we also want to take suprema (and infima) willy-nilly. So what we do is to throw in enough numbers to obtain R.

We have already seen that R must contain elements that are not in Q because of the least-upper-bound property. We have seen that there is no rational square root of two. The set {x ∈ Q : x^2 < 2} implies the existence of the real number √2 that is not rational, although this fact requires a bit of work.

∗ Uniqueness is up to isomorphism, but we wish to avoid excessive use of algebra. For us, it is simply enough to assume that a set of real numbers exists. See Rudin [R2] for the construction and more details.

Example 1.2.3: Claim: There exists a unique positive real number r such that r^2 = 2. We denote r by √2.

Proof. Take the set A := {x ∈ R : x^2 < 2}. First we must note that if x^2 < 2, then x < 2. To see this fact, note that x ≥ 2 implies x^2 ≥ 4 (use Proposition 1.1.8; we will not explicitly mention its use from now on), hence any number such that x ≥ 2 is not in A. Thus A is bounded above. As 1 ∈ A, the set A is nonempty.

Let us define r := sup A. We will show that r^2 = 2 by showing that r^2 ≥ 2 and r^2 ≤ 2. This is the way analysts show equality, by showing two inequalities. Note that we already know that r ≥ 1 > 0.

Let us first show that r^2 ≥ 2. Take a number s ≥ 1 such that s^2 < 2. Note that 2 − s^2 > 0. Therefore (2 − s^2)/(2(s + 1)) > 0. We can choose an h ∈ R such that 0 < h < (2 − s^2)/(2(s + 1)). Furthermore, we can assume that h < 1.

Claim: 0 < a < b implies b^2 − a^2 < 2(b − a)b. Proof: Write b^2 − a^2 = (b − a)(a + b) < (b − a)2b.

Let us use the claim by plugging in a = s and b = s + h. We obtain
    (s + h)^2 − s^2 < h2(s + h)
                    < 2h(s + 1)    (since h < 1)
                    < 2 − s^2      (since h < (2 − s^2)/(2(s + 1))).
This implies that (s + h)^2 < 2. Hence s + h ∈ A, but as h > 0 we have s + h > s. Hence, s < r = sup A. As s ≥ 1 was an arbitrary number such that s^2 < 2, it follows that r^2 ≥ 2; otherwise we could take s = r and obtain r < r.

Now take a number s > 0 such that s^2 > 2. Hence s^2 − 2 > 0, and as before (s^2 − 2)/(2s) > 0. We can choose an h ∈ R such that 0 < h < (s^2 − 2)/(2s) and h < s.

Again we use the fact that 0 < a < b implies b^2 − a^2 < 2(b − a)b. We plug in a = s − h and b = s (note that s − h > 0). We obtain
    s^2 − (s − h)^2 < 2hs
                    < s^2 − 2      (since h < (s^2 − 2)/(2s)).
By subtracting s^2 from both sides and multiplying by −1, we find (s − h)^2 > 2. Therefore s − h ∉ A.

Furthermore, if x ≥ s − h, then x^2 ≥ (s − h)^2 > 2 (as x > 0 and s − h > 0) and so x ∉ A, and so s − h is an upper bound for A. However, s − h < s, or in other words s > r = sup A. Thus r^2 ≤ 2; otherwise we could take s = r and obtain r > r.

Together, r^2 ≥ 2 and r^2 ≤ 2 imply r^2 = 2. The existence part is finished. We still need to handle uniqueness. Suppose that s ∈ R is such that s^2 = 2 and s > 0. Thus s^2 = r^2. However, if 0 < s < r, then s^2 < r^2. Similarly, 0 < r < s implies r^2 < s^2. Hence s = r.

The number √2 ∉ Q. The set R \ Q is called the set of irrational numbers. We have seen that R \ Q is nonempty; later on we will see that it is actually very large.

Using the same technique as above, we can show that a positive real number x^{1/n} exists for all n ∈ N and all x > 0. That is, for each x > 0, there exists a positive real number r such that r^n = x. The proof is left as an exercise.
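The two choices of h in the proof of Example 1.2.3 can be tried out with exact rational arithmetic. The following is a minimal sketch, assuming rational starting values; the function names step_up and step_down and the specific choice of h as half of the allowed bound are illustrative assumptions, since the proof only requires that h lie below the stated bounds.

```python
from fractions import Fraction

def step_up(s):
    """Given a Fraction s >= 1 with s*s < 2, return s + h with (s + h)**2 < 2."""
    assert s >= 1 and s * s < 2
    bound = (2 - s * s) / (2 * (s + 1))
    h = min(bound, Fraction(1)) / 2      # ensures 0 < h < bound and h < 1
    return s + h

def step_down(s):
    """Given a Fraction s > 0 with s*s > 2, return s - h with (s - h)**2 > 2."""
    assert s > 0 and s * s > 2
    bound = (s * s - 2) / (2 * s)
    h = min(bound, s) / 2                # ensures 0 < h < bound and h < s
    return s - h

low, high = Fraction(1), Fraction(2)
for _ in range(5):
    # low stays in A = {x : x^2 < 2}; high stays an upper bound of A.
    low, high = step_up(low), step_down(high)
```

Every element of A can be pushed up inside A, and every positive number whose square exceeds 2 can be pushed down while remaining an upper bound. This is why neither r^2 < 2 nor r^2 > 2 is possible for r = sup A.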
1.2.2 Archimedean property

As we have seen, there are plenty of real numbers in any interval. But there are also infinitely many rational numbers in any interval. The following is one of the most fundamental facts about the real numbers. The two parts of the next theorem are actually equivalent, even though it may not seem like that at first sight.

Theorem 1.2.4.
(i) (Archimedean property) If x, y ∈ R and x > 0, then there exists an n ∈ N such that nx > y.
(ii) (Q is dense in R) If x, y ∈ R and x < y, then there exists an r ∈ Q such that x < r < y.

Proof. Let us prove (i). We can divide through by x and then what (i) says is that for any real number t := y/x, we can find a natural number n such that n > t. In other words, (i) says that N ⊂ R is unbounded. Suppose for contradiction that N is bounded. Let b := sup N. The number b − 1 cannot possibly be an upper bound for N as it is strictly less than b. Thus there exists an m ∈ N such that m > b − 1. We can add one to obtain m + 1 > b, which contradicts b being an upper bound.

Now let us tackle (ii). First assume that x ≥ 0. Note that y − x > 0. By (i), there exists an n ∈ N such that n(y − x) > 1. Also by (i) the set A := {k ∈ N : k > nx} is nonempty. By the well ordering property of N, A has a least element m. As m ∈ A, then m > nx. As m is the least element of A, m − 1 ∉ A. If m > 1, then m − 1 ∈ N, but m − 1 ∉ A and so m − 1 ≤ nx. If m = 1, then m − 1 = 0, and m − 1 ≤ nx still holds as x ≥ 0. In other words,
    m − 1 ≤ nx < m.
We divide through by n to get x < m/n. On the other hand from n(y − x) > 1 we obtain ny > 1 + nx. As nx ≥ m − 1 we get that 1 + nx ≥ m and hence ny > m and therefore y > m/n.

Now assume that x < 0. If y > 0, then we can just take r = 0. If y ≤ 0, then note that 0 ≤ −y < −x and, by the case already proved, find a rational q such that −y < q < −x. Then take r = −q.

Let us state and prove a simple but useful corollary of the Archimedean property. Other corollaries are easy consequences and we leave them as exercises.

Corollary 1.2.5. inf{1/n : n ∈ N} = 0.

Proof. Let A := {1/n : n ∈ N}. Obviously A is not empty. Furthermore, 1/n > 0 and so 0 is a lower bound, so b := inf A exists. As 0 is a lower bound, then b ≥ 0. Suppose that b > 0. By the Archimedean property there exists an n such that nb > 1, or in other words b > 1/n. However, 1/n ∈ A, contradicting the fact that b is a lower bound. Hence b = 0.
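The proof of part (ii) of Theorem 1.2.4 is constructive, and for rational endpoints it can be run directly. This is a sketch under stated assumptions: the function name rational_between is hypothetical, the inputs are taken to be Fraction values so that the arithmetic is exact, and the integer floor is used to pick m, which handles negative x in one step instead of the reflection argument used in the text.

```python
from fractions import Fraction
from math import floor

def rational_between(x, y):
    """Return m/n with x < m/n < y, for Fractions x < y."""
    assert x < y
    n = floor(1 / (y - x)) + 1    # a natural number with n * (y - x) > 1
    m = floor(n * x) + 1          # the least integer with m > n * x
    return Fraction(m, n)

r1 = rational_between(Fraction(1, 3), Fraction(1, 2))
r2 = rational_between(Fraction(-5, 4), Fraction(-1))
```

As in the proof, m − 1 ≤ nx < m gives x < m/n, and n(y − x) > 1 gives ny > 1 + nx ≥ m, so m/n < y.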
1.2.3 Using supremum and infimum

To make using suprema and infima even easier, we want to be able to always write sup A and inf A without worrying about A being bounded and nonempty. We make the following natural definitions.

Definition 1.2.6. Let A ⊂ R be a set.
(i) If A is empty, then sup A := −∞.
(ii) If A is not bounded above, then sup A := ∞.
(iii) If A is empty, then inf A := ∞.
(iv) If A is not bounded below, then inf A := −∞.

For convenience, we will sometimes treat ∞ and −∞ as if they were numbers, except we will not allow arbitrary arithmetic with them. We can make R∗ := R ∪ {−∞, ∞} into an ordered set by letting
    −∞ < ∞ and −∞ < x and x < ∞ for all x ∈ R.
The set R∗ is called the set of extended real numbers. It is possible to define some arithmetic on R∗, but we will refrain from doing so as it leads to easy mistakes because R∗ will not be a field.

Now we can take suprema and infima without fear. Let us say a little bit more about them. First we want to make sure that suprema and infima are compatible with algebraic operations. For a set A ⊂ R and a number x define
    x + A := {x + y ∈ R : y ∈ A},
    xA := {xy ∈ R : y ∈ A}.

Proposition 1.2.7. Let A ⊂ R be bounded and nonempty.
(i) If x ∈ R, then sup(x + A) = x + sup A.
(ii) If x ∈ R, then inf(x + A) = x + inf A.
(iii) If x > 0, then sup(xA) = x(sup A).
(iv) If x > 0, then inf(xA) = x(inf A).
(v) If x < 0, then sup(xA) = x(inf A).
(vi) If x < 0, then inf(xA) = x(sup A).

Do note that multiplying a set by a negative number switches supremum for an infimum and vice-versa.

Proof. Let us only prove the first statement. The rest are left as exercises.

Suppose that b is an upper bound for A. That is, y ≤ b for all y ∈ A. Then x + y ≤ x + b, and so x + b is an upper bound for x + A. In particular, if b = sup A, then
    sup(x + A) ≤ x + b = x + sup A.
The other direction is similar. If b is an upper bound for x + A, then x + y ≤ b for all y ∈ A and so y ≤ b − x. So b − x is an upper bound for A. If b = sup(x + A), then
    sup A ≤ b − x = sup(x + A) − x.
And the result follows.

Sometimes we will need to apply supremum twice. Here is an example.

Proposition 1.2.8. Let A, B ⊂ R such that x ≤ y whenever x ∈ A and y ∈ B. Then sup A ≤ inf B.

Proof. First note that any x ∈ A is a lower bound for B. Therefore x ≤ inf B. Now inf B is an upper bound for A and therefore sup A ≤ inf B.

We have to be careful about strict inequalities and taking suprema and infima. Note that x < y whenever x ∈ A and y ∈ B still only implies sup A ≤ inf B, and not a strict inequality. This is an important subtle point that comes up often. For example, take A := {0} and take B := {1/n : n ∈ N}. Then 0 < 1/n for all n ∈ N. However, sup A = 0 and inf B = 0 as we have seen.

1.2.4 Maxima and minima

By Exercise 1.1.2 we know that a finite set of numbers always has a supremum or an infimum that is contained in the set itself. In this case we usually do not use the words supremum or infimum.

When we have a set A of real numbers bounded above, such that sup A ∈ A, then we can use the word maximum and the notation max A to denote the supremum. Similarly for infimum. When a set A is bounded below and inf A ∈ A, then we can use the word minimum and the notation min A. For example,
    max{1, 2.4, π, 100} = 100,
    min{1, 2.4, π, 100} = 1.
While writing sup and inf may be technically correct in this situation, max and min are generally used to emphasize that the supremum or infimum is in the set itself.
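For a nonempty finite set the supremum is the maximum and the infimum is the minimum, so the identities of Proposition 1.2.7 can be spot-checked on a small example. This is only a sketch on one hypothetical set, with illustrative helper names translate and scale; it verifies an instance and proves nothing in general.

```python
from fractions import Fraction

def translate(x, A):
    """x + A := {x + y : y in A}."""
    return {x + y for y in A}

def scale(x, A):
    """xA := {x * y : y in A}."""
    return {x * y for y in A}

A = {Fraction(-1), Fraction(1, 2), Fraction(3)}
x_pos, x_neg = Fraction(2), Fraction(-2)

# For a nonempty finite set, sup is max and inf is min.
sup_shift_ok = max(translate(5, A)) == 5 + max(A)          # item (i)
sup_scale_ok = max(scale(x_pos, A)) == x_pos * max(A)      # item (iii)
sup_flip_ok = max(scale(x_neg, A)) == x_neg * min(A)       # item (v)
inf_flip_ok = min(scale(x_neg, A)) == x_neg * max(A)       # item (vi)
```

The last two checks show the switch between supremum and infimum when the multiplier is negative.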
1.2.5 Exercises

Exercise 1.2.1: Prove that if t > 0 (t ∈ R), then there exists an n ∈ N such that 1/n^2 < t.

Exercise 1.2.2: Prove that if t > 0 (t ∈ R), then there exists an n ∈ N such that n − 1 ≤ t < n.

Exercise 1.2.3: Finish proof of Proposition 1.2.7.

Exercise 1.2.4: Let x, y ∈ R. Suppose that x^2 + y^2 = 0. Prove that x = 0 and y = 0.

Exercise 1.2.5: Show that √3 is irrational.

Exercise 1.2.6: Let n ∈ N. Show that √n is either an integer or it is irrational.

Exercise 1.2.7: Prove the arithmetic-geometric mean inequality. That is, for two positive real numbers x, y we have
    √(xy) ≤ (x + y)/2.
Furthermore, equality occurs if and only if x = y.

Exercise 1.2.8: Show that for any two real numbers such that x < y, we have an irrational number s such that x < s < y. Hint: Apply the density of Q to x/√2 and y/√2.

Exercise 1.2.9: Let A and B be two bounded sets of real numbers. Let C := {a + b : a ∈ A, b ∈ B}. Show that C is a bounded set and that
    sup C = sup A + sup B and inf C = inf A + inf B.

Exercise 1.2.10: Let A and B be two bounded sets of nonnegative real numbers. Let C := {ab : a ∈ A, b ∈ B}. Show that C is a bounded set and that
    sup C = (sup A)(sup B) and inf C = (inf A)(inf B).

Exercise 1.2.11 (Hard): Given x > 0 and n ∈ N, show that there exists a unique positive real number r such that x = r^n. Usually r is denoted by x^{1/n}.
Hence.1. If x < 0. and x = 0 if and only if x = 0. If x > 0. Finally without loss of generality assume that x > 0 and y < 0. −y ≤ x ≤ y. suppose that −y ≤ x ≤ y is true.3. then x ≤ y is equivalent to x ≤ y.3 Absolute value Note: 0. A property used frequently enough to give it a name is the socalled triangle inequality. (vi) − x ≤ x ≤ x for all x ∈ R. then −y ≤ x implies (−x) ≤ y. (iii): If x or y is zero. Proposition 1. (i) x ≥ 0. Let us give the main features of the absolute value as a proposition. then as y ≥ 0 it is obviously true that −y ≤ 0 = x = 0 ≤ y. If x < 0. Let us give a formal deﬁnition. If x ≥ 0. then the result is obvious. (ii): Suppose that x > 0. then x2 = (−x)2 = x2 . then x ≤ y.1. then x ≤ y means −x ≤ y. (i): This statement is obvious from the deﬁnition. (v): Suppose that x ≤ y. (v) x ≤ y if and only if −y ≤ x ≤ y. Then x y = x(−y) = −(xy). (ii) −x = x for all x ∈ R. ABSOLUTE VALUE 31 1. If x = 0. if x < 0.3. then x y = xy. x + y ≤ x + y for all x.2 (Triangle Inequality). xy is also positive and hence xy = xy. Now xy is negative and hence xy = −(xy). or x = 0. Proof. When x and y are both positive. (iv): Obvious if x = 0 and if x > 0. Negating both sides we get x ≥ −y. y ∈ R. then −x = −(−x) = x = x. On the other hand. Similarly when x < 0.3. y ∈ R. (iv) x2 = x2 for all x ∈ R. which is equivalent to x ≤ y. Proposition 1. Again y ≥ 0 and so y ≥ 0 > x. (iii) xy = x y for all x. Obviously y ≥ 0 and hence −y ≤ 0 < x so −y ≤ x ≤ y holds. You want to think of the absolute value as the “size” of a real number. . x := x −x if x ≥ 0.51 lecture A concept we will encounter over and over is the concept of absolute value. If x < 0. (vi): Just apply (v) with y = x.
We will proceed by induction. Proof. From Proposition 1. It is obvious that x2 + 9x + 1 is largest when x is largest.3. ﬁrst using the standard triangle inequality. x2 . but we didn’t ask for the best possible M. . Let x1 . xn ∈ R. One possibility for M is M = 52 + 9(5) + 1 = 71. x2 . y ∈ R (i) (reverse triangle inequality) (ii) x − y ≤ x + y. Proof. We add these two inequalities to obtain −(x + y) ≤ x + y ≤ x + y . In the interval provided. .3.3. . There are other versions of the triangle inequality that are applied often. Using the triangle inequality. Corollary 1. The bound of 71 is much higher than it need be.4. Let us plug in x = a − b and y = b into the standard triangle inequality to obtain a = a − b + b ≤ a − b + b .5: Find a number M such that x2 − 9x + 1 ≤ M for all −1 ≤ x ≤ 5. Then x1 + x2 + · · · + xn  ≤ x1  + x2  + · · · + xn  . . (x − y) ≤ x − y.1 we have − x ≤ x ≤ x and − y ≤ y ≤ y. and then the induction hypothesis x1 + x2 + · · · + xn + xn+1  ≤ x1 + x2 + · · · + xn  + xn+1  ≤ x1  + x2  + · · · + xn  + xn+1 . . xn+1 and compute. . Switching the roles of a and b we obtain or b − a ≤ b − a = a − b. The second version of the triangle inequality is obtained from the standard one by just replacing y with −y and noting again that −y = y. of course.1 again we obtain the reverse triangle inequality.3. or a − b ≤ a − b. Let us see an example of the use of the triangle inequality. Now suppose that the corollary holds for n. just one that works. . Again by Proposition 1.3.3. Example 1. Note that it is true for n = 1 trivially and n = 2 is the standard triangle inequality. Take n + 1 numbers x1 . Now applying Proposition 1.32 CHAPTER 1. x is largest when x = 5 and so x = 5. There are. Corollary 1. Let x.3. write x2 − 9x + 1 ≤ x2  + 9x + 1 = x2 + 9x + 1. . .1 we have that x + y ≤ x + y. other M that work. REAL NUMBERS Proof.
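To see how generous the bound M = 71 from Example 1.3.5 is, one can sample the polynomial on the interval. The sketch below uses exact rational sample points; the grid spacing of 1/100 and the function names p and triangle_bound are assumptions made for illustration, and sampling only gives the largest value at the sample points.

```python
from fractions import Fraction

def p(x):
    """The polynomial x^2 - 9x + 1 from Example 1.3.5."""
    return x * x - 9 * x + 1

def triangle_bound(x):
    """|x|^2 + 9|x| + 1, the bound produced by the triangle inequality."""
    return abs(x) ** 2 + 9 * abs(x) + 1

# Sample the interval [-1, 5] in steps of 1/100, using exact arithmetic.
grid = [Fraction(k, 100) for k in range(-100, 501)]
largest_sampled = max(abs(p(x)) for x in grid)
M = triangle_bound(Fraction(5))
```

On this grid the largest sampled value of |x^2 − 9x + 1| is 77/4, attained at x = 9/2, well below M = 71.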
For this stronger inequality we need the stronger hypothesis f (x) ≤ g(y) The proof is left as an exercise. We write sup f (x) := sup f (D). y∈D But that means that supy∈D g(y) is an upper bound for f (D). hence is greater than or equal to the least upper bound of f (D). x∈D y∈D Let us prove this inequality. then it is not bounded.1) You should be careful with the variables. If b is an upper bound for g(D). x∈D x∈D inf f (x) := inf f (D). let us prove the following proposition.2) is not true given the hypothesis of the claim above. for all x ∈ D and y ∈ D.2) The inequality (1.1) is different from the x on the right. Proposition 1.7.6. then f (x) ≤ g(x) ≤ b and hence b is an upper bound for f (D).1. Do note that a common mistake is to conclude that sup f (x) ≤ inf g(y). Suppose f : D → R is a function. In the example we have shown that x2 − 9x + 1 is bounded when considered as a function on D = {x : −1 ≤ x ≤ 5}. To illustrate some common issues. . and inf f (x) ≤ inf g(x).3. You should really think of the ﬁrst inequality as sup f (x) ≤ sup g(y). x∈D y∈D (1. sup f (x) ≤ sup g(y). if we consider the same polynomial as a function on the whole real line R. 33 Deﬁnition 1.3. then we can talk about its supremum and its inﬁmum.3. On the other hand. Therefore taking the least upper bound we get that for all x f (x) ≤ sup g(y). x∈D y∈D The second inequality (the statement about the inf) is left as an exercise. ABSOLUTE VALUE The last example leads us to the concept of bounded functions. x∈D x∈D (1. We say f is bounded if there exists a number M such that  f (x) ≤ M for all x ∈ D. The x on the left side of the inequality in (1. If a function f : D → R is bounded. If f : D → R and g : D → R are bounded functions and f (x) ≤ g(x) then sup f (x) ≤ sup g(x) x∈D x∈D for all x ∈ D.
1.3.1 Exercises

Exercise 1.3.1: Let $\varepsilon > 0$. Show that $|x - y| < \varepsilon$ if and only if $x - \varepsilon < y < x + \varepsilon$.

Exercise 1.3.2: Show that
a) $\max\{x, y\} = \dfrac{x + y + |x - y|}{2}$
b) $\min\{x, y\} = \dfrac{x + y - |x - y|}{2}$

Exercise 1.3.3: Find a number $M$ such that $|x^3 - x^2 + 8x| \leq M$ for all $-2 \leq x \leq 10$.

Exercise 1.3.4: Finish the proof of Proposition 1.3.7. That is, prove that given any set $D$, and two bounded functions $f \colon D \to \mathbb{R}$ and $g \colon D \to \mathbb{R}$ such that $f(x) \leq g(x)$, then
\[ \inf_{x \in D} f(x) \leq \inf_{x \in D} g(x). \]

Exercise 1.3.5: Let $f \colon D \to \mathbb{R}$ and $g \colon D \to \mathbb{R}$ be functions.
a) Suppose that $f(x) \leq g(y)$ for all $x \in D$ and $y \in D$. Show that
\[ \sup_{x \in D} f(x) \leq \inf_{x \in D} g(x). \]
b) Find a specific $D$, $f$, and $g$, such that $f(x) \leq g(x)$ for all $x \in D$, but
\[ \sup_{x \in D} f(x) > \inf_{x \in D} g(x). \]
1.4 Intervals and the size of R

Note: 0.5-1 lecture (proof of uncountability of R can be optional)

You have seen the notation for intervals before, but let us give a formal definition here. For $a, b \in \mathbb{R}$ such that $a < b$ we define
\[ [a, b] := \{ x \in \mathbb{R} : a \leq x \leq b \}, \]
\[ (a, b) := \{ x \in \mathbb{R} : a < x < b \}, \]
\[ (a, b] := \{ x \in \mathbb{R} : a < x \leq b \}, \]
\[ [a, b) := \{ x \in \mathbb{R} : a \leq x < b \}. \]
The interval $[a, b]$ is called a closed interval and $(a, b)$ is called an open interval. The intervals of the form $(a, b]$ and $[a, b)$ are called half-open intervals.

The above intervals were all bounded intervals, since both $a$ and $b$ were real numbers. We define unbounded intervals,
\[ [a, \infty) := \{ x \in \mathbb{R} : a \leq x \}, \]
\[ (a, \infty) := \{ x \in \mathbb{R} : a < x \}, \]
\[ (-\infty, b] := \{ x \in \mathbb{R} : x \leq b \}, \]
\[ (-\infty, b) := \{ x \in \mathbb{R} : x < b \}. \]
For completeness we define $(-\infty, \infty) := \mathbb{R}$.

We have already seen that any open interval $(a, b)$ (where $a < b$ of course) must be nonempty. For example, it contains the number $\frac{a+b}{2}$. An unexpected fact is that from a set-theoretic perspective, all intervals have the same "size," that is, they all have the same cardinality. For example the map $f(x) := 2x$ takes the interval $[0, 1]$ bijectively to the interval $[0, 2]$.

Or, maybe more interestingly, the function $f(x) := \tan(x)$ is a bijective map from $(-\pi/2, \pi/2)$ to $\mathbb{R}$, hence the bounded interval $(-\pi/2, \pi/2)$ has the same cardinality as $\mathbb{R}$. It is not completely straightforward to construct a bijective map from $[0, 1]$ to say $(0, 1)$, but it is possible.

And do not worry, there does exist a way to measure the "size" of subsets of real numbers that "sees" the difference between $[0, 1]$ and $[0, 2]$. However, its proper definition requires much more machinery than we have right now.

Let us say more about the cardinality of intervals and hence about the cardinality of $\mathbb{R}$. We have seen that there exist irrational numbers, that is $\mathbb{R} \setminus \mathbb{Q}$ is nonempty. The question is, how many irrational numbers are there? It turns out there are a lot more irrational numbers than rational numbers. We have seen that $\mathbb{Q}$ is countable, and we will show in a little bit that $\mathbb{R}$ is uncountable. In fact, the cardinality of $\mathbb{R}$ is the same as the cardinality of $\mathcal{P}(\mathbb{N})$, although we will not prove this claim.

Theorem 1.4.1 (Cantor). $\mathbb{R}$ is uncountable.
We give a modified version of Cantor's original proof from 1874 as this proof requires the least setup. Normally this proof is stated as a contradiction proof, but a proof by contrapositive is easier to understand.

Proof. Let $X \subset \mathbb{R}$ be a countable subset such that for any two numbers $a < b$, there is an $x \in X$ such that $a < x < b$. If $\mathbb{R}$ were countable, then we could take $X = \mathbb{R}$. If we can show that $X$ must be a proper subset, then $X$ cannot equal $\mathbb{R}$ and $\mathbb{R}$ must be uncountable.

As $X$ is countable, there is a bijection from $\mathbb{N}$ to $X$. Consequently, we can write $X$ as a sequence of real numbers $x_1, x_2, x_3, \ldots$, such that each number in $X$ is given by some $x_j$ for some $j \in \mathbb{N}$.

Let us construct two other sequences of real numbers $a_1, a_2, a_3, \ldots$ and $b_1, b_2, b_3, \ldots$. Let $a_1 := 0$ and $b_1 := 1$. Next, for each $k > 1$:

(i) Define $a_k := x_j$, where $j$ is the smallest $j \in \mathbb{N}$ such that $x_j \in (a_{k-1}, b_{k-1})$. As an open interval is nonempty, we know that such an $x_j$ always exists by our assumption on $X$.

(ii) Next, define $b_k := x_j$ where $j$ is the smallest $j \in \mathbb{N}$ such that $x_j \in (a_k, b_{k-1})$.

Claim: $a_j < b_k$ for all $j$ and $k$ in $\mathbb{N}$. This is because $a_j < a_{j+1}$ for all $j$ and $b_k > b_{k+1}$ for all $k$. If there did exist a $j$ and a $k$ such that $a_j \geq b_k$, then there is an $n$ such that $a_n \geq b_n$ (why?), which is not possible by definition.

Let $A = \{ a_j : j \in \mathbb{N} \}$ and $B = \{ b_j : j \in \mathbb{N} \}$. We have seen before that
\[ \sup A \leq \inf B. \]
Define $y = \sup A$. The number $y$ cannot be a member of $A$. If $y = a_j$ for some $j$, then $y < a_{j+1}$, which is impossible. Similarly $y$ cannot be a member of $B$.

If $y \notin X$, then we are done; we have shown that $X$ is a proper subset of $\mathbb{R}$. If $y \in X$, then there exists some $k$ such that $y = x_k$. Notice however that $y \in (a_m, b_m)$ and $y \in (a_m, b_{m-1})$ for all $m \in \mathbb{N}$. We claim that this means that $y$ would be picked for $a_m$ or $b_m$ in one of the steps, which would be a contradiction. To see the claim note that the smallest $j$ such that $x_j$ is in $(a_{k-1}, b_{k-1})$ or $(a_k, b_{k-1})$ always becomes larger in every step. Hence eventually we will reach a point where $x_j = y$. In this case we would make either $a_k = y$ or $b_k = y$, which is a contradiction.

Therefore, the sequence $x_1, x_2, \ldots$ cannot contain all elements of $\mathbb{R}$ and thus $\mathbb{R}$ is uncountable.

1.4.1 Exercises

Exercise 1.4.1: For $a < b$, construct an explicit bijection from $(a, b]$ to $(0, 1]$.

Exercise 1.4.2: Suppose that $f \colon [0, 1] \to (0, 1)$ is a bijection. Construct a bijection from $[-1, 1]$ to $\mathbb{R}$ using $f$.
Exercise 1.4.3 (Hard): Show that the cardinality of $\mathbb{R}$ is the same as the cardinality of $\mathcal{P}(\mathbb{N})$. Hint: If you have a binary representation of a real number in the interval $[0, 1]$, then you have a sequence of 1's and 0's. Use the sequence to construct a subset of $\mathbb{N}$. The tricky part is to notice that some numbers have more than one binary representation.

Exercise 1.4.4 (Hard): Construct an explicit bijection from $(0, 1]$ to $(0, 1)$. Hint: One approach is as follows: First map $(1/2, 1]$ to $(0, 1/2]$, then map $(1/4, 1/2]$ to $(1/2, 3/4]$, etc. Write down the map explicitly, that is, write down an algorithm that tells you exactly what number goes where. Then prove that the map is a bijection.

Exercise 1.4.5 (Hard): Construct an explicit bijection from $[0, 1]$ to $(0, 1)$.
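The construction of the sequences $a_k$ and $b_k$ in the proof of Theorem 1.4.1 can be tried out on a concrete countable set. The sketch below is only an illustration under stated assumptions: it takes $X$ to be the rationals, enumerated by increasing denominator (one possible enumeration among many), and it only lists those in $(0, 1)$ up to a cutoff, since the construction never looks outside $(a_1, b_1) = (0, 1)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def enumerate_rationals(max_den):
    """Rationals in (0, 1) in lowest terms, ordered by denominator.

    Stands in for the sequence x_1, x_2, ... restricted to (0, 1).
    The cutoff max_den only keeps the list finite for this sketch.
    """
    return [Fraction(p, q)
            for q in range(2, max_den + 1)
            for p in range(1, q) if gcd(p, q) == 1]

def first_in(xs, lo, hi):
    """Return x_j with the smallest index j such that lo < x_j < hi."""
    for x in xs:
        if lo < x < hi:
            return x
    raise ValueError("enumeration cutoff too small for this many steps")

def cantor_steps(xs, steps):
    """Build a_1, ..., a_{steps+1} and b_1, ..., b_{steps+1} as in the proof."""
    a, b = [Fraction(0)], [Fraction(1)]        # a_1 := 0, b_1 := 1
    for _ in range(steps):
        a.append(first_in(xs, a[-1], b[-1]))   # a_k chosen from (a_{k-1}, b_{k-1})
        b.append(first_in(xs, a[-1], b[-1]))   # b_k chosen from (a_k, b_{k-1})
    return a, b

xs = enumerate_rationals(200)
a, b = cantor_steps(xs, 5)
for ak, bk in zip(a, b):
    print(ak, bk)
```

The printed pairs show $a_k$ increasing and $b_k$ decreasing with $a_k < b_k$ throughout, as in the claim in the proof. The number $y = \sup A$ trapped between them is the one the enumeration never picks up.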
Chapter 2

Sequences and Series

2.1 Sequences and limits

Note: 2.5 lectures

Analysis is essentially about taking limits. The most basic type of a limit is a limit of a sequence of real numbers. We have already seen sequences used informally. Let us give the formal definition.

Definition 2.1.1. A sequence is a function $x \colon \mathbb{N} \to \mathbb{R}$. Instead of $x(n)$ we will usually denote the $n$th element in the sequence by $x_n$. We will use the notation $\{x_n\}$ or more precisely
\[ \{x_n\}_{n=1}^{\infty} \]
to denote a sequence.

A sequence $\{x_n\}$ is bounded if there exists a $B \in \mathbb{R}$ such that
\[ |x_n| \leq B \qquad \text{for all } n \in \mathbb{N}. \]
In other words, the sequence $\{x_n\}$ is bounded whenever the set $\{ x_n : n \in \mathbb{N} \}$ is bounded.

When we need to give a concrete sequence we will often give each term as a formula in terms of $n$. For example, $\{1/n\}_{n=1}^{\infty}$, or simply $\{1/n\}$, stands for the sequence $1, 1/2, 1/3, 1/4, 1/5, \ldots$. The sequence $\{1/n\}$ is a bounded sequence ($B = 1$ will suffice). On the other hand the sequence $\{n\}$ stands for $1, 2, 3, 4, \ldots$, and this sequence is not bounded (why?).

While the notation for a sequence is similar* to that of a set, the notions are distinct. For example, the sequence $\{(-1)^n\}$ is the sequence $-1, 1, -1, 1, -1, 1, \ldots$, whereas the set of values, the range of the sequence, is just the set $\{-1, 1\}$. We could write this set as $\{ (-1)^n : n \in \mathbb{N} \}$. When ambiguity could arise, we use the words sequence or set to distinguish the two concepts.

Another example of a sequence is the constant sequence. That is a sequence $\{c\} = c, c, c, c, \ldots$ consisting of a single constant $c \in \mathbb{R}$.

* [BS] use the notation $(x_n)$ to denote a sequence instead of $\{x_n\}$, which is what [R2] uses. Both are common.
We now get to the idea of a limit of a sequence. We will see in Proposition 2.1.6 that the notation below is well defined. That is, if a limit exists, then it is unique. So it makes sense to talk about the limit of a sequence.

Definition 2.1.2. A sequence $\{x_n\}$ is said to converge to a number $x \in \mathbb{R}$, if for every $\varepsilon > 0$, there exists an $M \in \mathbb{N}$ such that $|x_n - x| < \varepsilon$ for all $n \geq M$. The number $x$ is said to be the limit of $\{x_n\}$. We will write
\[ \lim_{n \to \infty} x_n := x. \]
A sequence that converges is said to be convergent. Otherwise, the sequence is said to be divergent.

It is good to know intuitively what a limit means. It means that eventually every number in the sequence is close to the number $x$. More precisely, we can be arbitrarily close to the limit, provided we go far enough in the sequence. It does not mean we will ever reach the limit. It is possible, and quite common, that there is no $x_n$ in the sequence that equals the limit $x$.

When we write $\lim x_n = x$ for some real number $x$, we are saying two things. First, that $\{x_n\}$ is convergent, and second that the limit is $x$.

The above definition is one of the most important definitions in analysis, and it is necessary to understand it perfectly. The key point in the definition is that given any $\varepsilon > 0$, we can find an $M$. The $M$ can depend on $\varepsilon$, so we only pick an $M$ once we know $\varepsilon$. Let us illustrate this concept on a few examples.

Example 2.1.3: The constant sequence $1, 1, 1, 1, \ldots$ is convergent and the limit is 1. For every $\varepsilon > 0$, we can pick $M = 1$.

Example 2.1.4: The sequence $\{1/n\}$ is convergent and
\[ \lim_{n \to \infty} \frac{1}{n} = 0. \]
Let us verify this claim. Given an $\varepsilon > 0$, we can find an $M \in \mathbb{N}$ such that $0 < 1/M < \varepsilon$ (Archimedean property at work). Then for all $n \geq M$ we have that
\[ |x_n - 0| = \left| \frac{1}{n} \right| = \frac{1}{n} \leq \frac{1}{M} < \varepsilon. \]

Example 2.1.5: The sequence $\{(-1)^n\}$ is divergent. If there were a limit $x$, then for $\varepsilon = \frac{1}{2}$ we expect an $M$ that satisfies the definition. Suppose such an $M$ exists, then for an even $n \geq M$ we compute
\[ 1/2 > |x_n - x| = |1 - x| \qquad \text{and} \qquad 1/2 > |x_{n+1} - x| = |-1 - x| . \]
But
\[ 2 = |1 - x - (-1 - x)| \leq |1 - x| + |-1 - x| < 1/2 + 1/2 = 1, \]
and that is a contradiction.
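The order of quantifiers in Definition 2.1.2 (first $\varepsilon$, then $M$) can be made concrete for the sequence $\{1/n\}$ of Example 2.1.4. The following is a minimal sketch: the function name `find_M` is hypothetical, the choice $M = \lfloor 1/\varepsilon \rfloor + 1$ is just one $M$ that the Archimedean property guarantees, and the loop only spot-checks a finite stretch of the tail, so it is an illustration rather than a proof.

```python
import math

def find_M(eps: float) -> int:
    """Return a natural number M with 1/M < eps.

    floor(1/eps) + 1 is strictly larger than 1/eps, so its reciprocal
    is strictly smaller than eps.
    """
    return math.floor(1 / eps) + 1

for eps in (0.5, 0.1, 0.003):
    M = find_M(eps)                      # M is chosen only after eps is known
    # Spot-check |x_n - 0| < eps on a finite part of the tail n >= M.
    assert all(abs(1 / n - 0) < eps for n in range(M, M + 10_000))
    print(eps, M)
```

Smaller values of $\varepsilon$ force larger values of $M$, which is exactly the dependence of $M$ on $\varepsilon$ described above.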
Proposition 2.1.6. A convergent sequence has a unique limit.

The proof of this proposition exhibits a useful technique in analysis. Many proofs follow the same general scheme. We want to show a certain quantity is zero. We write the quantity using the triangle inequality as two quantities, and we estimate each one by arbitrarily small numbers.

Proof. Suppose that the sequence $\{x_n\}$ has the limit $x$ and the limit $y$. Take an arbitrary $\varepsilon > 0$. From the definition we find an $M_1$ such that for all $n \geq M_1$, $|x_n - x| < \varepsilon/2$. Similarly we find an $M_2$ such that for all $n \geq M_2$ we have $|x_n - y| < \varepsilon/2$. Now take $M := \max\{M_1, M_2\}$. For $n \geq M$ (so that both $n \geq M_1$ and $n \geq M_2$) we have
\[ |y - x| = |x_n - x - (x_n - y)| \leq |x_n - x| + |x_n - y| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
As $|y - x| < \varepsilon$ for all $\varepsilon > 0$, then $|y - x| = 0$ and $y = x$. Hence the limit (if it exists) is unique.

Proposition 2.1.7. A convergent sequence $\{x_n\}$ is bounded.

Proof. Suppose that $\{x_n\}$ converges to $x$. Thus there exists an $M \in \mathbb{N}$ such that for all $n \geq M$ we have $|x_n - x| < 1$. Let $B_1 := |x| + 1$ and note that for $n \geq M$ we have
\[ |x_n| = |x_n - x + x| \leq |x_n - x| + |x| < 1 + |x| = B_1. \]
The set $\{ |x_1|, |x_2|, \ldots, |x_{M-1}| \}$ is a finite set and hence let
\[ B_2 := \max\{ |x_1|, |x_2|, \ldots, |x_{M-1}| \}. \]
Let $B := \max\{B_1, B_2\}$. Then for all $n \in \mathbb{N}$ we have $|x_n| \leq B$.

The sequence $\{(-1)^n\}$ shows that the converse does not hold. A bounded sequence is not necessarily convergent.

Example 2.1.8: The sequence $\left\{ \frac{n^2+1}{n^2+n} \right\}$ converges and
\[ \lim_{n \to \infty} \frac{n^2+1}{n^2+n} = 1. \]
Given any $\varepsilon > 0$, find $M \in \mathbb{N}$ such that $\frac{1}{M+1} < \varepsilon$. Then for any $n \geq M$ we have
\[ \left| \frac{n^2+1}{n^2+n} - 1 \right| = \left| \frac{n^2 + 1 - (n^2 + n)}{n^2+n} \right| = \left| \frac{1-n}{n^2+n} \right| = \frac{n-1}{n^2+n} \leq \frac{n}{n^2+n} = \frac{1}{n+1} \leq \frac{1}{M+1} < \varepsilon. \]
Therefore, $\lim \frac{n^2+1}{n^2+n} = 1$.

2.1.1 Monotone sequences

The simplest type of a sequence is a monotone sequence. Checking that a monotone sequence converges is as easy as checking that it is bounded. It is also easy to find the limit for a convergent monotone sequence, provided we can find the supremum or infimum of a countable set of numbers.

Definition 2.1.9. A sequence $\{x_n\}$ is monotone increasing if $x_n \leq x_{n+1}$ for all $n \in \mathbb{N}$. A sequence $\{x_n\}$ is monotone decreasing if $x_n \geq x_{n+1}$ for all $n \in \mathbb{N}$. If a sequence is either monotone increasing or monotone decreasing, we simply say the sequence is monotone. Some authors also use the word monotonic.

Theorem 2.1.10. A monotone sequence $\{x_n\}$ is bounded if and only if it is convergent.

Furthermore, if $\{x_n\}$ is monotone increasing and bounded, then
\[ \lim_{n \to \infty} x_n = \sup \{ x_n : n \in \mathbb{N} \}. \]
If $\{x_n\}$ is monotone decreasing and bounded, then
\[ \lim_{n \to \infty} x_n = \inf \{ x_n : n \in \mathbb{N} \}. \]

Proof. Let us suppose that the sequence is monotone increasing. Suppose that the sequence is bounded. That means that there exists a $B$ such that $x_n \leq B$ for all $n$, that is the set $\{ x_n : n \in \mathbb{N} \}$ is bounded from above. Let
\[ x := \sup \{ x_n : n \in \mathbb{N} \}. \]
Let $\varepsilon > 0$ be arbitrary. As $x$ is the supremum, there must be at least one $M \in \mathbb{N}$ such that $x_M > x - \varepsilon$. As $\{x_n\}$ is monotone increasing, it is easy to see (by induction) that $x_n \geq x_M$ for all $n \geq M$. Hence
\[ |x_n - x| = x - x_n \leq x - x_M < \varepsilon. \]
Hence the sequence converges to $x$. We already know that a convergent sequence is bounded, which completes the other direction of the implication.

The proof for monotone decreasing sequences is left as an exercise.

Example 2.1.11: Take the sequence $\left\{ \frac{1}{\sqrt{n}} \right\}$.

First we note that $\frac{1}{\sqrt{n}} > 0$ and hence the sequence is bounded from below. Let us show that it is monotone decreasing. We start with $\sqrt{n+1} \geq \sqrt{n}$ (why is that true?). From this inequality we obtain
\[ \frac{1}{\sqrt{n+1}} \leq \frac{1}{\sqrt{n}}. \]
So the sequence is monotone decreasing, bounded from below (and hence bounded). We can apply the theorem to note that the sequence is convergent and that in fact
\[ \lim_{n \to \infty} \frac{1}{\sqrt{n}} = \inf \left\{ \frac{1}{\sqrt{n}} : n \in \mathbb{N} \right\}. \]
We already know that the infimum is greater than or equal to 0, as 0 is a lower bound. Take a number $b \geq 0$ such that $b \leq \frac{1}{\sqrt{n}}$ for all $n$. We can square both sides to obtain
\[ b^2 \leq \frac{1}{n} \qquad \text{for all } n \in \mathbb{N}. \]
We have seen before that this implies that $b^2 \leq 0$ (a consequence of the Archimedean property). As we also have $b^2 \geq 0$, then $b^2 = 0$ and hence $b = 0$. Hence $b = 0$ is the greatest lower bound and hence the limit.

Example 2.1.12: Be careful however. We have to show that a monotone sequence is bounded in order to use Theorem 2.1.10. For example, take the sequence $\{ 1 + 1/2 + \cdots + 1/n \}$. This is a monotone increasing sequence that grows very slowly. We will see, once we get to series, that this sequence has no upper bound and so does not converge. It is not at all obvious that this sequence has no bound.

A common example of where monotone sequences arise is the following proposition. The proof is left as an exercise.

Proposition 2.1.13. Let $S \subset \mathbb{R}$ be a nonempty bounded set. Then there exist monotone sequences $\{x_n\}$ and $\{y_n\}$ such that $x_n, y_n \in S$ and
\[ \sup S = \lim_{n \to \infty} x_n \qquad \text{and} \qquad \inf S = \lim_{n \to \infty} y_n. \]
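How slowly the sequence $\{ 1 + 1/2 + \cdots + 1/n \}$ of Example 2.1.12 grows can be seen numerically. The sketch below compares the partial sums at $n = 2^k$ with the quantity $1 + k/2$. That the partial sum at $2^k$ is at least $1 + k/2$ is a classical grouping estimate which is not proved in this section, so treat the comparison as an illustration only. The function name `harmonic` is a hypothetical helper.

```python
def harmonic(n: int) -> float:
    """Partial sum 1 + 1/2 + ... + 1/n, computed in floating point."""
    return sum(1 / k for k in range(1, n + 1))

# Compare the partial sum at n = 2^k with the lower estimate 1 + k/2.
for k in range(0, 16, 3):
    n = 2**k
    print(n, harmonic(n), 1 + k / 2)
```

The partial sums do keep growing, but it takes thousands of terms to pass even modest bounds, which is why unboundedness is not at all obvious from a table of values.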
2.1.2 Tail of a sequence

Definition 2.1.14. For a sequence $\{x_n\}$, the $K$-tail (where $K \in \mathbb{N}$) or just the tail of the sequence is the sequence starting at $K + 1$, usually written as
\[ \{x_{n+K}\}_{n=1}^{\infty} \qquad \text{or} \qquad \{x_n\}_{n=K+1}^{\infty}. \]

The main result about the tail of a sequence is the following proposition.

Proposition 2.1.15. For any $K \in \mathbb{N}$, the sequence $\{x_n\}_{n=1}^{\infty}$ converges if and only if the $K$-tail $\{x_{n+K}\}_{n=1}^{\infty}$ converges. Furthermore, if the limit exists, then
\[ \lim_{n \to \infty} x_n = \lim_{n \to \infty} x_{n+K}. \]

Proof. Define $y_n := x_{n+K}$. We wish to show that $\{x_n\}$ converges if and only if $\{y_n\}$ converges, and furthermore that the limits are equal.

Suppose that $\{x_n\}$ converges to some $x \in \mathbb{R}$. That is, given an $\varepsilon > 0$, there exists an $M \in \mathbb{N}$ such that $|x - x_n| < \varepsilon$ for all $n \geq M$. Note that $n \geq M$ implies $n + K \geq M$. Therefore, it is true that for all $n \geq M$ we have that
\[ |x - y_n| = |x - x_{n+K}| < \varepsilon. \]
Therefore $\{y_n\}$ converges to $x$.

Now suppose that $\{y_n\}$ converges to $x \in \mathbb{R}$. That is, given an $\varepsilon > 0$, there exists an $M' \in \mathbb{N}$ such that $|x - y_n| < \varepsilon$ for all $n \geq M'$. Let $M := M' + K$. Then $n \geq M$ implies that $n - K \geq M'$. Thus, whenever $n \geq M$ we have
\[ |x - x_n| = |x - y_{n-K}| < \varepsilon. \]
Therefore $\{x_n\}$ converges to $x$.

Essentially, the limit does not care about how the sequence begins, it only cares about the tail of the sequence. That is, the beginning of the sequence may be arbitrary.

2.1.3 Subsequences

A very useful concept related to sequences is that of a subsequence. A subsequence of $\{x_n\}$ is a sequence that contains only some of the numbers from $\{x_n\}$ in the same order.

Definition 2.1.16. Let $\{x_n\}$ be a sequence. Let $\{n_i\}$ be a strictly increasing sequence of natural numbers (that is $n_1 < n_2 < n_3 < \cdots$). The sequence
\[ \{x_{n_i}\}_{i=1}^{\infty} \]
is called a subsequence of $\{x_n\}$.
For example, take the sequence $\{1/n\}$. The sequence $\{1/3n\}$ is a subsequence. To see how these two sequences fit in the definition, take $n_i := 3i$. Note that the numbers in the subsequence must come from the original sequence, so $1, 0, 1/3, 0, 1/5, \ldots$ is not a subsequence of $\{1/n\}$. Similarly order must be preserved, so the sequence $1, 1/3, 1/2, 1/5, \ldots$ is not a subsequence of $\{1/n\}$.

Note that a tail of a sequence is one type of subsequence. For an arbitrary subsequence, we have the following proposition.

Proposition 2.1.17. If $\{x_n\}$ is a convergent sequence, then any subsequence $\{x_{n_i}\}$ is also convergent and
\[ \lim_{n \to \infty} x_n = \lim_{i \to \infty} x_{n_i}. \]

Proof. Suppose that $\lim_{n \to \infty} x_n = x$. That means that for every $\varepsilon > 0$ we have an $M \in \mathbb{N}$ such that for all $n \geq M$
\[ |x_n - x| < \varepsilon. \]
It is not hard to prove (do it!) by induction that $n_i \geq i$. Hence $i \geq M$ implies that $n_i \geq M$. Thus, for all $i \geq M$ we have
\[ |x_{n_i} - x| < \varepsilon, \]
and we are done.

Example 2.1.18: Do note that the implication in the other direction is not true. For example, take the sequence $0, 1, 0, 1, 0, 1, \ldots$. That is $x_n = 0$ if $n$ is odd, and $x_n = 1$ if $n$ is even. It is not hard to see that $\{x_n\}$ is divergent, however, the subsequence $\{x_{2n}\}$ converges to 1 and the subsequence $\{x_{2n+1}\}$ converges to 0. See also Theorem 2.3.7.

2.1.4 Exercises

In the following exercises, feel free to use what you know from calculus to find the limit, if it exists. But you must prove that you have found the correct limit, or prove that the sequence is divergent.

Exercise 2.1.1: Is the sequence $\{3n\}$ bounded? Prove or disprove.

Exercise 2.1.2: Is the sequence $\{n\}$ convergent? If so, what is the limit?

Exercise 2.1.3: Is the sequence $\left\{ \dfrac{(-1)^n}{2n} \right\}$ convergent? If so, what is the limit?

Exercise 2.1.4: Is the sequence $\{2^{-n}\}$ convergent? If so, what is the limit?

Exercise 2.1.5: Is the sequence $\left\{ \dfrac{n}{n+1} \right\}$ convergent? If so, what is the limit?

Exercise 2.1.6: Is the sequence $\left\{ \dfrac{n}{n^2+1} \right\}$ convergent? If so, what is the limit?
Exercise 2.1.7: Let $\{x_n\}$ be a sequence.
a) Show that $\lim x_n = 0$ (that is, the limit exists and is zero) if and only if $\lim |x_n| = 0$.
b) Find an example such that $\{|x_n|\}$ converges and $\{x_n\}$ diverges.

Exercise 2.1.8: Is the sequence $\left\{ \dfrac{2^n}{n!} \right\}$ convergent? If so, what is the limit?

Exercise 2.1.9: Show that the sequence $\left\{ \dfrac{1}{\sqrt[3]{n}} \right\}$ is monotone, bounded, and use Theorem 2.1.10 to find the limit.

Exercise 2.1.10: Show that the sequence $\left\{ \dfrac{n+1}{n} \right\}$ is monotone, bounded, and use Theorem 2.1.10 to find the limit.

Exercise 2.1.11: Finish the proof of Theorem 2.1.10 for monotone decreasing sequences.

Exercise 2.1.12: Prove Proposition 2.1.13.

Exercise 2.1.13: Let $\{x_n\}$ be a convergent monotone sequence. Suppose that there exists a $k \in \mathbb{N}$ such that
\[ \lim_{n \to \infty} x_n = x_k. \]
Show that $x_n = x_k$ for all $n \geq k$.

Exercise 2.1.14: Find a convergent subsequence of the sequence $\{(-1)^n\}$.

Exercise 2.1.15: Let $\{x_n\}$ be a sequence defined by
\[ x_n := \begin{cases} n & \text{if $n$ is odd,} \\ 1/n & \text{if $n$ is even.} \end{cases} \]
a) Is the sequence bounded? (prove or disprove)
b) Is there a convergent subsequence? If so, find it.

Exercise 2.1.16: Let $\{x_n\}$ be a sequence. Suppose that there are two convergent subsequences $\{x_{n_i}\}$ and $\{x_{m_i}\}$. Suppose that
\[ \lim_{i \to \infty} x_{n_i} = a \qquad \text{and} \qquad \lim_{i \to \infty} x_{m_i} = b, \]
where $a \neq b$. Prove that $\{x_n\}$ is not convergent, without using Proposition 2.1.17.
2.2 Facts about limits of sequences

Note: 2.5 lectures

In this section we will go over some basic results about the limits of sequences. We start with looking at how sequences interact with inequalities.

2.2.1 Limits and inequalities

A basic lemma about limits is the so-called squeeze lemma. It allows us to show convergence of sequences in difficult cases if we can find two other simpler convergent sequences that "squeeze" the original sequence.

Lemma 2.2.1 (Squeeze lemma). Let $\{a_n\}$, $\{b_n\}$, and $\{x_n\}$ be sequences such that
\[ a_n \leq x_n \leq b_n \qquad \text{for all } n \in \mathbb{N}. \]
Suppose that $\{a_n\}$ and $\{b_n\}$ converge and
\[ \lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n. \]
Then $\{x_n\}$ converges and
\[ \lim_{n \to \infty} x_n = \lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n. \]

The intuitive idea of the proof is best illustrated on a picture, see Figure 2.1. If $x$ is the limit of $a_n$ and $b_n$, then if they are both within $\varepsilon/3$ of $x$, then the distance between $a_n$ and $b_n$ is at most $2\varepsilon/3$. As $x_n$ is between $a_n$ and $b_n$ it is at most $2\varepsilon/3$ from $a_n$. Since $a_n$ is at most $\varepsilon/3$ away from $x$, then $x_n$ must be at most $\varepsilon$ away from $x$. Let us follow through on this intuition rigorously.

[Figure 2.1: Squeeze lemma in picture (the points $a_n$, $x$, $x_n$, $b_n$ marked on a line).]

Proof. Let $x := \lim a_n = \lim b_n$. Let $\varepsilon > 0$ be given.

Find an $M_1$ such that for all $n \geq M_1$ we have that $|a_n - x| < \varepsilon/3$, and an $M_2$ such that for all $n \geq M_2$ we have $|b_n - x| < \varepsilon/3$. Set $M := \max\{M_1, M_2\}$. Suppose that $n \geq M$. We compute
\[ |x_n - a_n| = x_n - a_n \leq b_n - a_n = |b_n - x + x - a_n| \leq |b_n - x| + |x - a_n| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \frac{2\varepsilon}{3}. \]
Armed with this information we estimate
\[ |x_n - x| = |x_n - x + a_n - a_n| \leq |x_n - a_n| + |a_n - x| < \frac{2\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]
And we are done.

Example 2.2.2: A simple example of how to use the squeeze lemma is to compute limits of sequences using limits that are already known. For example, suppose that we have the sequence $\left\{ \frac{1}{n\sqrt{n}} \right\}$. Since $\sqrt{n} \geq 1$ for all $n \in \mathbb{N}$ we have
\[ 0 \leq \frac{1}{n\sqrt{n}} \leq \frac{1}{n} \]
for all $n \in \mathbb{N}$. We already know that $\lim 1/n = 0$. Hence, using the constant sequence $\{0\}$ and the sequence $\{1/n\}$ in the squeeze lemma, we conclude that
\[ \lim_{n \to \infty} \frac{1}{n\sqrt{n}} = 0. \]

Limits also preserve inequalities.

Lemma 2.2.3. Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences and
\[ x_n \leq y_n \]
for all $n \in \mathbb{N}$. Then
\[ \lim_{n \to \infty} x_n \leq \lim_{n \to \infty} y_n. \]

Proof. Let $x := \lim x_n$ and $y := \lim y_n$. Let $\varepsilon > 0$ be given. Find an $M_1$ such that for all $n \geq M_1$ we have $|x_n - x| < \varepsilon/2$. Find an $M_2$ such that for all $n \geq M_2$ we have $|y_n - y| < \varepsilon/2$. In particular, for $n \geq \max\{M_1, M_2\}$ we have $x - x_n < \varepsilon/2$ and $y_n - y < \varepsilon/2$. We add these inequalities to obtain
\[ y_n - x_n + x - y < \varepsilon, \qquad \text{or} \qquad y_n - x_n < y - x + \varepsilon. \]
Since $x_n \leq y_n$ we have $0 \leq y_n - x_n$ and hence $0 < y - x + \varepsilon$, or $-\varepsilon < y - x$. In other words, $x - y < \varepsilon$ for all $\varepsilon > 0$. That means that $x - y \leq 0$, as we have seen that a nonnegative number less than any positive $\varepsilon$ is zero. Therefore $x \leq y$.
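The squeeze in Example 2.2.2 can be observed numerically. This sketch only checks the inequality $0 \leq \frac{1}{n\sqrt{n}} \leq \frac{1}{n}$ at a handful of arbitrarily chosen indices. The function name `x` is chosen here for illustration, and a finite check is of course no substitute for the lemma.

```python
import math

def x(n: int) -> float:
    """The n-th term 1 / (n * sqrt(n)) of the squeezed sequence."""
    return 1 / (n * math.sqrt(n))

# The lower sequence is the constant 0 and the upper sequence is 1/n.
for n in (1, 10, 100, 10_000, 1_000_000):
    assert 0 <= x(n) <= 1 / n
    print(n, 0, x(n), 1 / n)
```

Both outer columns tend to 0, and the middle column is trapped between them.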
We give an easy corollary that can be proved using constant sequences and an application of Lemma 2.2.3. The proof is left as an exercise.

Corollary 2.2.4.
i) Let $\{x_n\}$ be a convergent sequence such that $x_n \geq 0$, then
\[ \lim_{n \to \infty} x_n \geq 0. \]
ii) Let $a, b \in \mathbb{R}$ and let $\{x_n\}$ be a convergent sequence such that
\[ a \leq x_n \leq b, \]
for all $n \in \mathbb{N}$. Then
\[ a \leq \lim_{n \to \infty} x_n \leq b. \]

Note in Lemma 2.2.3 we cannot simply replace all the nonstrict inequalities with strict inequalities. For example, let $x_n := -1/n$ and $y_n := 1/n$. Then $x_n < y_n$, $x_n < 0$, and $y_n > 0$ for all $n$. However, these inequalities are not preserved by the limit operation as we have $\lim x_n = \lim y_n = 0$. The moral of this example is that strict inequalities may become nonstrict inequalities when limits are applied. That is, if we know that $x_n < y_n$ for all $n$, we can only conclude that
\[ \lim_{n \to \infty} x_n \leq \lim_{n \to \infty} y_n. \]
This issue is a common source of errors.

2.2.2 Continuity of algebraic operations

Limits interact nicely with algebraic operations.

Proposition 2.2.5. Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences.

(i) The sequence $\{z_n\}$, where $z_n := x_n + y_n$, converges and
\[ \lim_{n \to \infty} (x_n + y_n) = \lim_{n \to \infty} z_n = \lim_{n \to \infty} x_n + \lim_{n \to \infty} y_n. \]

(ii) The sequence $\{z_n\}$, where $z_n := x_n - y_n$, converges and
\[ \lim_{n \to \infty} (x_n - y_n) = \lim_{n \to \infty} z_n = \lim_{n \to \infty} x_n - \lim_{n \to \infty} y_n. \]

(iii) The sequence $\{z_n\}$, where $z_n := x_n y_n$, converges and
\[ \lim_{n \to \infty} (x_n y_n) = \lim_{n \to \infty} z_n = \left( \lim_{n \to \infty} x_n \right) \left( \lim_{n \to \infty} y_n \right). \]
(iv) If $\lim y_n \neq 0$, and $y_n \neq 0$ for all $n$, then the sequence $\{z_n\}$, where $z_n := \dfrac{x_n}{y_n}$, converges and
\[ \lim_{n \to \infty} \frac{x_n}{y_n} = \lim_{n \to \infty} z_n = \frac{\lim x_n}{\lim y_n}. \]

Proof. Let us start with (i). Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences and let $z_n := x_n + y_n$. Let $x := \lim x_n$ and $y := \lim y_n$. Let $z := x + y$.

Let $\varepsilon > 0$ be given. Find an $M_1$ such that for all $n \geq M_1$ we have $|x_n - x| < \varepsilon/2$. Find an $M_2$ such that for all $n \geq M_2$ we have $|y_n - y| < \varepsilon/2$. Take $M := \max\{M_1, M_2\}$. For all $n \geq M$ we have
\[ |z_n - z| = |(x_n + y_n) - (x + y)| = |x_n - x + y_n - y| \leq |x_n - x| + |y_n - y| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Therefore (i) is proved. Proof of (ii) is almost identical and is left as an exercise.

Let us tackle (iii). Let $\{x_n\}$ and $\{y_n\}$ be convergent sequences and let $z_n := x_n y_n$. Let $x := \lim x_n$ and $y := \lim y_n$. Let $z := xy$.

Let $\varepsilon > 0$ be given. As $\{x_n\}$ is convergent, it is bounded. Therefore, find a $B > 0$ such that $|x_n| \leq B$ for all $n \in \mathbb{N}$. Find an $M_1$ such that for all $n \geq M_1$ we have $|x_n - x| < \frac{\varepsilon}{2(|y|+1)}$. Find an $M_2$ such that for all $n \geq M_2$ we have $|y_n - y| < \frac{\varepsilon}{2B}$. Take $M := \max\{M_1, M_2\}$. For all $n \geq M$ we have
\[
\begin{aligned}
|z_n - z| &= |(x_n y_n) - (xy)| \\
&= |x_n y_n - (x + x_n - x_n) y| \\
&= |x_n (y_n - y) + (x_n - x) y| \\
&\leq |x_n (y_n - y)| + |(x_n - x) y| \\
&= |x_n| \, |y_n - y| + |x_n - x| \, |y| \\
&\leq B \, |y_n - y| + |x_n - x| \, |y| \\
&< B \frac{\varepsilon}{2B} + \frac{\varepsilon}{2(|y|+1)} |y| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}
\]

Finally let us tackle (iv). Instead of proving (iv) directly, we prove the following simpler claim:

Claim: If $\{y_n\}$ is a convergent sequence such that $\lim y_n \neq 0$ and $y_n \neq 0$ for all $n \in \mathbb{N}$, then
\[ \lim_{n \to \infty} \frac{1}{y_n} = \frac{1}{\lim y_n}. \]

Once the claim is proved, we take the sequence $\{1/y_n\}$ and multiply it by the sequence $\{x_n\}$ and apply item (iii).
Proof of claim: Let $\varepsilon > 0$ be given. Let $y := \lim y_n$. Find an $M$ such that for all $n \geq M$ we have
\[ |y_n - y| < \min \left\{ |y|^2 \frac{\varepsilon}{2}, \ \frac{|y|}{2} \right\}. \]
Note that
\[ |y| = |y - y_n + y_n| \leq |y - y_n| + |y_n| , \]
or in other words $|y_n| \geq |y| - |y - y_n|$. Now $|y_n - y| < \frac{|y|}{2}$ implies that
\[ |y| - |y_n - y| > \frac{|y|}{2}. \]
Therefore
\[ |y_n| \geq |y| - |y - y_n| > \frac{|y|}{2} \]
and consequently
\[ \frac{1}{|y_n|} < \frac{2}{|y|}. \]
Now we can finish the proof of the claim.
\[ \left| \frac{1}{y_n} - \frac{1}{y} \right| = \left| \frac{y - y_n}{y y_n} \right| = \frac{|y - y_n|}{|y| \, |y_n|} < \frac{|y - y_n|}{|y|} \, \frac{2}{|y|} < \frac{|y|^2 \frac{\varepsilon}{2}}{|y|} \, \frac{2}{|y|} = \varepsilon. \]
And we are done.

By plugging in constant sequences, we get several easy corollaries. If $c \in \mathbb{R}$ and $\{x_n\}$ is a convergent sequence, then for example
\[ \lim_{n \to \infty} c x_n = c \lim_{n \to \infty} x_n \qquad \text{and} \qquad \lim_{n \to \infty} (c + x_n) = c + \lim_{n \to \infty} x_n. \]
Similarly with subtraction and division.

As we can take limits past multiplication we can show that $\lim x_n^k = (\lim x_n)^k$. That is, we can take limits past powers. Let us see if we can do the same with roots.

Proposition 2.2.6. Let $\{x_n\}$ be a convergent sequence such that $x_n \geq 0$. Then
\[ \lim_{n \to \infty} \sqrt{x_n} = \sqrt{ \lim_{n \to \infty} x_n }. \]
Of course to even make this statement, we need to apply Corollary 2.2.4 to show that $\lim x_n \geq 0$ so that we can take the square root without worry.

Proof. Let $\{x_n\}$ be a convergent sequence and let $x := \lim x_n$.

First suppose that $x = 0$. Let $\varepsilon > 0$ be given. Then there is an $M$ such that for all $n \geq M$ we have $x_n = |x_n| < \varepsilon^2$, or in other words $\sqrt{x_n} < \varepsilon$. Hence
\[ \left| \sqrt{x_n} - \sqrt{x} \right| = \sqrt{x_n} < \varepsilon. \]

Now suppose that $x > 0$ (and hence $\sqrt{x} > 0$).
\[ \left| \sqrt{x_n} - \sqrt{x} \right| = \left| \frac{x_n - x}{\sqrt{x_n} + \sqrt{x}} \right| = \frac{1}{\sqrt{x_n} + \sqrt{x}} |x_n - x| \leq \frac{1}{\sqrt{x}} |x_n - x| . \]
We leave the rest of the proof to the reader.

A similar proof works for the $k$th root. That is, we also obtain $\lim x_n^{1/k} = (\lim x_n)^{1/k}$. We leave this to the reader as a challenging exercise.

We may also want to take the limit past the absolute value sign.

Proposition 2.2.7. If $\{x_n\}$ is a convergent sequence, then $\{|x_n|\}$ is convergent and
\[ \lim_{n \to \infty} |x_n| = \left| \lim_{n \to \infty} x_n \right| . \]

Proof. We simply note the reverse triangle inequality
\[ \bigl| |x_n| - |x| \bigr| \leq |x_n - x| . \]
Hence if $|x_n - x|$ can be made arbitrarily small, so can $\bigl| |x_n| - |x| \bigr|$. Details are left to the reader.

2.2.3 Recursively defined sequences

Once we know we can interchange limits and algebraic operations, we will actually be able to easily compute the limits for a large class of sequences. One such class are recursively defined sequences. That is, sequences where the next number in the sequence is computed using a formula from a fixed number of preceding numbers in the sequence.
Example 2.2.8: Let $\{x_n\}$ be defined by $x_1 := 2$ and
\[ x_{n+1} := x_n - \frac{x_n^2 - 2}{2 x_n}. \]
We must find out if this sequence is well defined; we must show we never divide by zero. Then we must find out if the sequence converges. Only then can we attempt to find the limit.

First let us prove that $x_n > 0$ for all $n$ (then the sequence is well defined). Let us show this by induction. We know that $x_1 = 2 > 0$. For the induction step, suppose that $x_n > 0$. Then
\[ x_{n+1} = x_n - \frac{x_n^2 - 2}{2 x_n} = \frac{2 x_n^2 - x_n^2 + 2}{2 x_n} = \frac{x_n^2 + 2}{2 x_n}. \]
If $x_n > 0$, then $x_n^2 + 2 > 0$ and hence $x_{n+1} > 0$.

Next let us show that the sequence is monotone decreasing. If we can show that $x_n^2 - 2 \geq 0$ for all $n$, then $x_{n+1} \leq x_n$ for all $n$. Obviously $x_1^2 - 2 = 4 - 2 = 2 > 0$. For an arbitrary $n$ we have that
\[ x_{n+1}^2 - 2 = \left( \frac{x_n^2 + 2}{2 x_n} \right)^2 - 2 = \frac{x_n^4 + 4 x_n^2 + 4 - 8 x_n^2}{4 x_n^2} = \frac{x_n^4 - 4 x_n^2 + 4}{4 x_n^2} = \frac{\left( x_n^2 - 2 \right)^2}{4 x_n^2}. \]
Since $x_n > 0$ and any number squared is nonnegative, we have that $x_{n+1}^2 - 2 \geq 0$ for all $n$. Therefore, $\{x_n\}$ is monotone decreasing and bounded, and therefore the limit exists. It remains to find out what the limit is.

Let us write
\[ 2 x_n x_{n+1} = x_n^2 + 2. \]
Since $\{x_{n+1}\}$ is the 1-tail of $\{x_n\}$, it converges to the same limit. Let us define $x := \lim x_n$. We can take the limit of both sides to obtain
\[ 2 x^2 = x^2 + 2, \]
or $x^2 = 2$. As $x \geq 0$, we know that $x = \sqrt{2}$.

You should, however, be careful. Before taking any limits, you must make sure the sequence converges. Let us see an example.

Example 2.2.9: Suppose $x_1 := 1$ and $x_{n+1} := x_n^2 + x_n$. If we blindly assumed that the limit exists (call it $x$), then we would get the equation $x = x^2 + x$, from which we might conclude that $x = 0$. However, it is not hard to show that $\{x_n\}$ is unbounded and therefore does not converge.

The thing to notice in this example is that the method still works, but it depends on the initial value $x_1$. If we made $x_1 = 0$, then the sequence converges and the limit really is 0. An entire branch of mathematics, called dynamics, deals precisely with these issues.
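The two recursions in Examples 2.2.8 and 2.2.9 are easy to iterate, and doing so makes the contrast visible. The sketch below uses exact rational arithmetic so that the inequalities proved in Example 2.2.8 can be checked without rounding. The names `step`, `xs`, and `ys` and the number of iterations are arbitrary choices for illustration.

```python
from fractions import Fraction

def step(x):
    """One step x_{n+1} = x_n - (x_n^2 - 2) / (2 x_n) from Example 2.2.8."""
    return x - (x * x - 2) / (2 * x)

xs = [Fraction(2)]                      # x_1 := 2
for _ in range(5):
    xs.append(step(xs[-1]))

for prev, nxt in zip(xs, xs[1:]):
    assert nxt > 0                      # well defined: we never divide by zero
    assert nxt * nxt - 2 >= 0           # x_n^2 - 2 >= 0
    assert nxt <= prev                  # monotone decreasing

print([float(x) for x in xs])           # values approach sqrt(2)

# Example 2.2.9: x_1 := 1, x_{n+1} := x_n^2 + x_n grows without bound.
ys = [1]
for _ in range(5):
    ys.append(ys[-1] ** 2 + ys[-1])
print(ys)
```

The first list settles down quickly near $\sqrt{2}$, while the second list explodes, even though solving $x = x^2 + x$ would suggest the limit 0.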
2.2.4 Some convergence tests

Sometimes it is not necessary to go back to the definition of convergence to prove that a sequence is convergent. First a simple test. Essentially, the main idea is that $\{x_n\}$ converges to $x$ if and only if $\{|x_n - x|\}$ converges to zero.

Proposition 2.2.10. Let $\{x_n\}$ be a sequence. Suppose that there is an $x \in \mathbb{R}$ and a convergent sequence $\{a_n\}$ such that
\[ \lim_{n \to \infty} a_n = 0 \]
and
\[ |x_n - x| \leq a_n \]
for all $n$. Then $\{x_n\}$ converges and $\lim x_n = x$.

Proof. Let $\varepsilon > 0$ be given. Note that $a_n \geq 0$ for all $n$. Find an $M \in \mathbb{N}$ such that for all $n \geq M$ we have $a_n = |a_n - 0| < \varepsilon$. Then, for all $n \geq M$ we have
\[ |x_n - x| \leq a_n < \varepsilon. \]

As the proposition shows, to study when a sequence has a limit is the same as studying when another sequence goes to zero. For some special sequences we can test the convergence easily. First let us compute the limit of a very specific sequence.

Proposition 2.2.11. Let $c > 0$.
(i) If $c < 1$, then
\[ \lim_{n \to \infty} c^n = 0. \]
(ii) If $c > 1$, then $\{c^n\}$ is unbounded.

Proof. First let us suppose that $c > 1$. We write $c = 1 + r$ for some $r > 0$. By induction (or using the binomial theorem if you know it) we see that
\[ c^n = (1 + r)^n \geq 1 + nr. \]
By the Archimedean property of the real numbers, the sequence $\{1 + nr\}$ is unbounded (for any number $B$, we can find an $n$ such that $nr \geq B - 1$). Therefore $c^n$ is unbounded.

Now let $c < 1$. Write $c = \frac{1}{1+r}$, where $r > 0$. Then
\[ c^n = \frac{1}{(1+r)^n} \leq \frac{1}{1 + nr} \leq \frac{1}{r} \, \frac{1}{n}. \]
As $\left\{ \frac{1}{n} \right\}$ converges to zero, so does $\left\{ \frac{1}{r} \frac{1}{n} \right\}$. Hence, $\{c^n\}$ converges to zero.
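Both estimates used in the proof above, $(1+r)^n \geq 1 + nr$ and $\frac{1}{(1+r)^n} \leq \frac{1}{r} \frac{1}{n}$, can be checked exactly for sample values. This is a sketch only: the function name `check_estimates`, the particular values of $r$, and the range of $n$ are arbitrary choices, and a finite check does not replace the induction.

```python
from fractions import Fraction

def check_estimates(r: Fraction, n_max: int) -> bool:
    """Check (1+r)^n >= 1 + n r and (1/(1+r))^n <= 1/(r n) for 1 <= n <= n_max."""
    c = 1 + r
    for n in range(1, n_max + 1):
        if c**n < 1 + n * r:            # estimate used for c > 1
            return False
        if (1 / c)**n > 1 / (r * n):    # estimate used for c < 1
            return False
    return True

for r in (Fraction(1, 100), Fraction(1, 10), Fraction(1), Fraction(5)):
    print(r, check_estimates(r, 200))
```

Exact fractions are used instead of floating point so that the comparisons are not affected by rounding.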
If we look at the above proposition, we note that the ratio of the $(n+1)$th term and the $n$th term is $c$. We can generalize this simple result to a larger class of sequences. The following lemma will come up again once we get to series.

Lemma 2.2.12 (Ratio test for sequences). Let $\{x_n\}$ be a sequence such that $x_n \neq 0$ for all $n$ and such that the limit
\[ L := \lim_{n \to \infty} \frac{|x_{n+1}|}{|x_n|} \]
exists.
(i) If $L < 1$, then $\{x_n\}$ converges and $\lim x_n = 0$.
(ii) If $L > 1$, then $\{x_n\}$ is unbounded (hence diverges).

Even if $L$ exists, but $L = 1$, the lemma says nothing. We cannot make any conclusion based on that information alone. For example, consider the sequences $1, 1, 1, 1, \ldots$ and $1, -1, 1, -1, 1, \ldots$.
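Proof. Suppose $L < 1$. As $\frac{|x_{n+1}|}{|x_n|} \ge 0$, we have that $L \ge 0$. Pick $r$ such that $L < r < 1$. As $r - L > 0$, there exists an $M \in \mathbb{N}$ such that for all $n \ge M$ we have
\[ \left| \frac{|x_{n+1}|}{|x_n|} - L \right| < r - L . \]
Therefore,
\[ \frac{|x_{n+1}|}{|x_n|} < r . \]
For $n > M$ (that is for $n \ge M + 1$) we write
\[ |x_n| = |x_M| \, \frac{|x_n|}{|x_{n-1}|} \, \frac{|x_{n-1}|}{|x_{n-2}|} \cdots \frac{|x_{M+1}|}{|x_M|} < |x_M| \, r r \cdots r = |x_M| \, r^{n-M} = (|x_M| \, r^{-M}) \, r^n . \]
The sequence $\{r^n\}$ converges to zero and hence $|x_M| \, r^{-M} r^n$ converges to zero. By Proposition 2.2.10, the $M$-tail of $\{x_n\}$ converges to zero and therefore $\{x_n\}$ converges to zero.

Now suppose $L > 1$. Pick $r$ such that $1 < r < L$. As $L - r > 0$, there exists an $M \in \mathbb{N}$ such that for all $n \ge M$ we have
\[ \left| \frac{|x_{n+1}|}{|x_n|} - L \right| < L - r . \]
Therefore,
\[ \frac{|x_{n+1}|}{|x_n|} > r . \]
Again for $n > M$ we write
\[ |x_n| = |x_M| \, \frac{|x_n|}{|x_{n-1}|} \, \frac{|x_{n-1}|}{|x_{n-2}|} \cdots \frac{|x_{M+1}|}{|x_M|} > |x_M| \, r r \cdots r = |x_M| \, r^{n-M} = (|x_M| \, r^{-M}) \, r^n . \]
The sequence $\{r^n\}$ is unbounded (since $r > 1$), and therefore $\{x_n\}$ cannot be bounded (if $|x_n| \le B$ for all $n$, then $r^n < \frac{B}{|x_M|} r^M$ for all $n > M$, which is impossible). Consequently, $\{x_n\}$ cannot converge.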
Example 2.2.13: A simple example of using the above lemma is to prove that
\[ \lim_{n\to\infty} \frac{2^n}{n!} = 0 . \]
Proof: We find that
\[ \frac{2^{n+1}/(n+1)!}{2^n/n!} = \frac{2^{n+1}}{2^n} \, \frac{n!}{(n+1)!} = \frac{2}{n+1} . \]
It is not hard to see that $\{\frac{2}{n+1}\}$ converges to zero. The conclusion follows by the lemma.
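The computation in Example 2.2.13 can be reproduced with exact arithmetic. The following is a small illustrative sketch, not part of the original notes; the helper names `term` and `ratio` are assumptions introduced here.

```python
from fractions import Fraction
from math import factorial

def term(n):
    """The n-th term 2**n / n! as an exact fraction."""
    return Fraction(2 ** n, factorial(n))

def ratio(n):
    """Ratio of the (n+1)-th term to the n-th term."""
    return term(n + 1) / term(n)

# The ratio should equal 2/(n+1), and the terms themselves shrink quickly.
for n in (1, 5, 10, 20):
    print(n, ratio(n), float(term(n)))
```

Since the ratios tend to $0 < 1$, the lemma applies with $L = 0$.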
2.2.5 Exercises
Exercise 2.2.1: Prove Corollary 2.2.4. Hint: Use constant sequences and Lemma 2.2.3.

Exercise 2.2.2: Prove part (ii) of Proposition 2.2.5.

Exercise 2.2.3: Prove that if $\{x_n\}$ is a convergent sequence, $k \in \mathbb{N}$, then
\[ \lim_{n\to\infty} x_n^k = \left( \lim_{n\to\infty} x_n \right)^k . \]
Hint: Use induction.

Exercise 2.2.4: Suppose that $x_1 := \frac{1}{2}$ and $x_{n+1} := x_n^2$. Show that $\{x_n\}$ converges and find $\lim x_n$. Hint: You cannot divide by zero!

Exercise 2.2.5: Let $x_n := \frac{n - \cos(n)}{n}$. Use the squeeze lemma to show that $\{x_n\}$ converges and find the limit.

Exercise 2.2.6: Let $x_n := \frac{1}{n^2}$ and $y_n := \frac{1}{n}$. Define $z_n := \frac{x_n}{y_n}$ and $w_n := \frac{y_n}{x_n}$. Do $\{z_n\}$ and $\{w_n\}$ converge? What are the limits? Can you apply Proposition 2.2.5? Why or why not?

Exercise 2.2.7: True or false, prove or find a counterexample. If $\{x_n\}$ is a sequence such that $\{x_n^2\}$ converges, then $\{x_n\}$ converges.

Exercise 2.2.8: Show that
\[ \lim_{n\to\infty} \frac{n^2}{2^n} = 0 . \]

Exercise 2.2.9: Suppose that $\{x_n\}$ is a sequence and suppose that for some $x \in \mathbb{R}$, the limit
\[ L := \lim_{n\to\infty} \frac{|x_{n+1} - x|}{|x_n - x|} \]
exists and $L < 1$. Show that $\{x_n\}$ converges to $x$.

Exercise 2.2.10 (Challenging): Let $\{x_n\}$ be a convergent sequence such that $x_n \ge 0$ and $k \in \mathbb{N}$. Then
\[ \lim_{n\to\infty} x_n^{1/k} = \left( \lim_{n\to\infty} x_n \right)^{1/k} . \]
Hint: Find an expression $q$ such that $\frac{x_n^{1/k} - x^{1/k}}{x_n - x} = \frac{1}{q}$.
2.3 Limit superior, limit inferior, and Bolzano-Weierstrass
Note: 1.5-2 lectures, alternative proof of BW optional

In this section we study bounded sequences and their subsequences. In particular we define the so-called limit superior and limit inferior of a bounded sequence and talk about limits of subsequences. Furthermore, we prove the so-called Bolzano-Weierstrass theorem†, which is an indispensable tool in analysis.

We have seen that every convergent sequence is bounded, but there exist many bounded divergent sequences. For example, the sequence $\{(-1)^n\}$ is bounded, but we have seen it is divergent. All is not lost however and we can still compute certain limits with a bounded divergent sequence.
2.3.1 Upper and lower limits
There are ways of creating monotone sequences out of any sequence, and in this way we get the so-called limit superior and limit inferior. These limits will always exist for bounded sequences.

Note that if a sequence $\{x_n\}$ is bounded, then the set $\{x_k : k \in \mathbb{N}\}$ is bounded. Then for every $n$ the set $\{x_k : k \ge n\}$ is also bounded (as it is a subset).

Definition 2.3.1. Let $\{x_n\}$ be a bounded sequence. Let $a_n := \sup\{x_k : k \ge n\}$ and $b_n := \inf\{x_k : k \ge n\}$. We note that the sequence $\{a_n\}$ is bounded monotone decreasing and $\{b_n\}$ is bounded monotone increasing (more on this point below). We define
\[ \limsup_{n\to\infty} x_n := \lim_{n\to\infty} a_n , \qquad \liminf_{n\to\infty} x_n := \lim_{n\to\infty} b_n . \]
For a bounded sequence, liminf and limsup always exist. It is possible to define liminf and limsup for unbounded sequences if we allow $\infty$ and $-\infty$. It is not hard to generalize the following results to include unbounded sequences, however, we will restrict our attention to bounded ones.

Let us see why $\{a_n\}$ is a decreasing sequence. As $a_n$ is the least upper bound for $\{x_k : k \ge n\}$, it is also an upper bound for the subset $\{x_k : k \ge (n + 1)\}$. Therefore $a_{n+1}$, the least upper bound for $\{x_k : k \ge (n + 1)\}$, has to be less than or equal to $a_n$, that is, $a_n \ge a_{n+1}$. Similarly, $\{b_n\}$ is an increasing sequence. It is left as an exercise to show that if $\{x_n\}$ is bounded, then $\{a_n\}$ and $\{b_n\}$ must be bounded.

Proposition 2.3.2. Let $\{x_n\}$ be a bounded sequence. Define $a_n$ and $b_n$ as in the definition above.
(i) $\limsup_{n\to\infty} x_n = \inf\{a_n : n \in \mathbb{N}\}$ and $\liminf_{n\to\infty} x_n = \sup\{b_n : n \in \mathbb{N}\}$.
(ii) $\liminf_{n\to\infty} x_n \le \limsup_{n\to\infty} x_n$.
† Named after the Czech mathematician Bernhard Placidus Johann Nepomuk Bolzano (1781–1848), and the German mathematician Karl Theodor Wilhelm Weierstrass (1815–1897).
Proof. The first item in the proposition follows as the sequences $\{a_n\}$ and $\{b_n\}$ are monotone.

For the second item, we note that $b_n \le a_n$, as the inf of a set is less than or equal to its sup. We know that $\{a_n\}$ and $\{b_n\}$ converge to the limsup and the liminf (respectively). We can apply Lemma 2.2.3 to note that
\[ \lim_{n\to\infty} b_n \le \lim_{n\to\infty} a_n . \]

Example 2.3.3: Let $\{x_n\}$ be defined by
\[ x_n := \begin{cases} \frac{n+1}{n} & \text{if $n$ is odd,} \\ 0 & \text{if $n$ is even.} \end{cases} \]
Let us compute the lim inf and lim sup of this sequence.
\[ \liminf_{n\to\infty} x_n = \lim_{n\to\infty} \left( \inf\{x_k : k \ge n\} \right) = \lim_{n\to\infty} 0 = 0 . \]
For the limit superior we write
\[ \limsup_{n\to\infty} x_n = \lim_{n\to\infty} \left( \sup\{x_k : k \ge n\} \right) . \]
It is not hard to see that
\[ \sup\{x_k : k \ge n\} = \begin{cases} \frac{n+1}{n} & \text{if $n$ is odd,} \\ \frac{n+2}{n+1} & \text{if $n$ is even.} \end{cases} \]
We leave it to the reader to show that the limit is 1. That is,
\[ \limsup_{n\to\infty} x_n = 1 . \]
Do note that the sequence $\{x_n\}$ is not a convergent sequence.

We can associate with lim sup and lim inf certain subsequences.

Theorem 2.3.4. If $\{x_n\}$ is a bounded sequence, then there exists a subsequence $\{x_{n_k}\}$ such that
\[ \lim_{k\to\infty} x_{n_k} = \limsup_{n\to\infty} x_n . \]
Similarly, there exists a (perhaps different) subsequence $\{x_{n_k}\}$ such that
\[ \lim_{k\to\infty} x_{n_k} = \liminf_{n\to\infty} x_n . \]
Proof. Define $a_n := \sup\{x_k : k \ge n\}$. Write $x := \limsup x_n = \lim a_n$. Define the subsequence as follows. Pick $n_1 := 1$ and work inductively. Suppose we have defined the subsequence until $n_k$ for some $k$. Now pick some $m > n_k$ such that
\[ a_{(n_k+1)} - x_m < \frac{1}{k+1} . \]
We can do this as $a_{(n_k+1)}$ is a supremum of the set $\{x_n : n \ge n_k + 1\}$ and hence there are elements of the sequence arbitrarily close (or even equal) to the supremum. Set $n_{k+1} := m$. The subsequence $\{x_{n_k}\}$ is defined. Next we need to prove that it has the right limit.

Note that $a_{(n_{k-1}+1)} \ge a_{n_k}$ (why?) and that $a_{n_k} \ge x_{n_k}$. Therefore, for every $k > 1$ we have
\[ |a_{n_k} - x_{n_k}| = a_{n_k} - x_{n_k} \le a_{(n_{k-1}+1)} - x_{n_k} < \frac{1}{k} . \]

Let us show that $\{x_{n_k}\}$ is convergent to $x$. Note that the subsequence need not be monotone. Let $\varepsilon > 0$ be given. As $\{a_n\}$ converges to $x$, the subsequence $\{a_{n_k}\}$ also converges to $x$. Thus there exists an $M_1 \in \mathbb{N}$ such that for all $k \ge M_1$ we have
\[ |a_{n_k} - x| < \frac{\varepsilon}{2} . \]
Find an $M_2 \in \mathbb{N}$ with $M_2 > 1$ such that
\[ \frac{1}{M_2} \le \frac{\varepsilon}{2} . \]
Take $M := \max\{M_1, M_2\}$ and compute. For all $k \ge M$ we have
\[ |x - x_{n_k}| = |a_{n_k} - x_{n_k} + x - a_{n_k}| \le |a_{n_k} - x_{n_k}| + |x - a_{n_k}| < \frac{1}{k} + \frac{\varepsilon}{2} \le \frac{1}{M_2} + \frac{\varepsilon}{2} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon . \]
We leave the statement for lim inf as an exercise.

2.3.2 Using limit inferior and limit superior

The advantage of lim inf and lim sup is that we can always write them down for any (bounded) sequence. Working with lim inf and lim sup is a little bit like working with limits, although there are subtle differences. If we could somehow compute them, we can also compute the limit of the sequence if it exists.
Theorem 2.3.5. Let $\{x_n\}$ be a bounded sequence. Then $\{x_n\}$ converges if and only if
\[ \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n . \]
Furthermore, if $\{x_n\}$ converges, then
\[ \lim_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n . \]

Proof. Define $a_n$ and $b_n$ as in Definition 2.3.1. Now note that
\[ b_n \le x_n \le a_n . \]
If $\liminf x_n = \limsup x_n$, then we know that $\{a_n\}$ and $\{b_n\}$ have limits and that these two limits are the same. By the squeeze lemma (Lemma 2.2.1), $\{x_n\}$ converges and
\[ \lim_{n\to\infty} b_n = \lim_{n\to\infty} x_n = \lim_{n\to\infty} a_n . \]

Now suppose that $\{x_n\}$ converges to $x$. We know by Theorem 2.3.4 that there exists a subsequence $\{x_{n_k}\}$ that converges to $\limsup x_n$. As $\{x_n\}$ converges to $x$, we know that every subsequence converges to $x$ and therefore $\limsup x_n = x$. Similarly $\liminf x_n = x$.

Limit superior and limit inferior behave nicely with subsequences.

Proposition 2.3.6. Suppose that $\{x_n\}$ is a bounded sequence and $\{x_{n_k}\}$ is a subsequence. Then
\[ \liminf_{n\to\infty} x_n \le \liminf_{k\to\infty} x_{n_k} \le \limsup_{k\to\infty} x_{n_k} \le \limsup_{n\to\infty} x_n . \]

Proof. The middle inequality has been noted before already. We will prove the third inequality, and leave the first inequality as an exercise.

That is, we want to prove that $\limsup x_{n_k} \le \limsup x_n$. Define $a_j := \sup\{x_k : k \ge j\}$ as usual. Also define $c_j := \sup\{x_{n_k} : k \ge j\}$. It is not true that $\{c_j\}$ is necessarily a subsequence of $\{a_j\}$. However, as $n_k \ge k$ for all $k$, we have that $\{x_{n_k} : k \ge j\} \subset \{x_k : k \ge j\}$. A supremum of a subset is less than or equal to the supremum of the set and therefore
\[ c_j \le a_j . \]
We apply Lemma 2.2.3 to conclude that
\[ \lim_{j\to\infty} c_j \le \lim_{j\to\infty} a_j , \]
which is the desired conclusion.
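The tail suprema $a_n$ and tail infima $b_n$ of Definition 2.3.1 can be explored numerically for the sequence of Example 2.3.3. The sketch below is an illustration only, not part of the original notes: a computer can only inspect finitely many terms, so the hypothetical parameter `horizon` truncates each tail. For this particular sequence the truncation does not change the values, because the supremum and infimum of each tail are attained within the first two terms of the tail.

```python
from fractions import Fraction

def x(n):
    """Sequence of Example 2.3.3: (n+1)/n for odd n, 0 for even n."""
    return Fraction(n + 1, n) if n % 2 == 1 else Fraction(0)

def tail_sup(n, horizon):
    """Maximum of x(k) over n <= k <= horizon (a truncated stand-in for a_n)."""
    return max(x(k) for k in range(n, horizon + 1))

def tail_inf(n, horizon):
    """Minimum of x(k) over n <= k <= horizon (a truncated stand-in for b_n)."""
    return min(x(k) for k in range(n, horizon + 1))

horizon = 1000  # assumed cutoff, chosen only for illustration
for n in (1, 2, 3, 4, 99, 100):
    print(n, tail_sup(n, horizon), tail_inf(n, horizon))
```

The printed suprema decrease toward 1 while the infima stay at 0, in line with $\limsup x_n = 1$ and $\liminf x_n = 0$.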
Limit superior and limit inferior are in fact the largest and smallest subsequential limits. If the subsequence in the previous proposition is convergent, then of course we have that $\liminf x_{n_k} = \lim x_{n_k} = \limsup x_{n_k}$. Therefore,
\[ \liminf_{n\to\infty} x_n \le \lim_{k\to\infty} x_{n_k} \le \limsup_{n\to\infty} x_n . \]

Similarly we also get the following useful test for convergence of a bounded sequence. We leave the proof as an exercise.

Theorem 2.3.7. A bounded sequence $\{x_n\}$ is convergent and converges to $x$ if and only if every convergent subsequence $\{x_{n_k}\}$ converges to $x$.

2.3.3 Bolzano-Weierstrass theorem

While it is not true that a bounded sequence is convergent, the Bolzano-Weierstrass theorem tells us that we can at least find a convergent subsequence. The version of Bolzano-Weierstrass that we will present in this section is the Bolzano-Weierstrass for sequences.

Theorem 2.3.8 (Bolzano-Weierstrass). Suppose that a sequence $\{x_n\}$ of real numbers is bounded. Then there exists a convergent subsequence $\{x_{n_i}\}$.

Proof. We can use Theorem 2.3.4. It says that there exists a subsequence whose limit is $\limsup x_n$.

The reader might complain right now that Theorem 2.3.4 is strictly stronger than the Bolzano-Weierstrass theorem as presented above. That is true. However, Theorem 2.3.4 only applies to the real line, but Bolzano-Weierstrass applies in more general contexts (that is, in $\mathbb{R}^n$) with pretty much the exact same statement.

As the theorem is so important to analysis, we present an explicit proof. The following proof generalizes more easily to different contexts.

Alternate proof of Bolzano-Weierstrass. As the sequence is bounded, there exist two numbers $a_1 < b_1$ such that $a_1 \le x_n \le b_1$ for all $n \in \mathbb{N}$.

We will define a subsequence $\{x_{n_i}\}$ and two sequences $\{a_i\}$ and $\{b_i\}$, such that $\{a_i\}$ is monotone increasing, $\{b_i\}$ is monotone decreasing, $a_i \le x_{n_i} \le b_i$ and such that $\lim a_i = \lim b_i$. That $\{x_{n_i}\}$ converges follows by the squeeze lemma.

We define the sequence inductively. We will always assume that $a_i < b_i$. Further we will always have that $x_n \in [a_i, b_i]$ for infinitely many $n \in \mathbb{N}$. We have already defined $a_1$ and $b_1$. We can take $n_1 := 1$, that is $x_{n_1} = x_1$.

Now suppose we have defined the subsequence $x_{n_1}, x_{n_2}, \ldots, x_{n_k}$, and the sequences $\{a_i\}$ and $\{b_i\}$ up to some $k \in \mathbb{N}$. We find $y = \frac{a_k + b_k}{2}$. It is clear that $a_k < y < b_k$. If there exist infinitely many $j \in \mathbb{N}$ such that $x_j \in [a_k, y]$, then set $a_{k+1} := a_k$, $b_{k+1} := y$, and pick $n_{k+1} > n_k$ such that $x_{n_{k+1}} \in [a_k, y]$. If there are not infinitely many $j$ such that $x_j \in [a_k, y]$, then it must be true that there are infinitely many $j \in \mathbb{N}$ such that $x_j \in [y, b_k]$. In this case pick $a_{k+1} := y$, $b_{k+1} := b_k$, and pick $n_{k+1} > n_k$ such that $x_{n_{k+1}} \in [y, b_k]$.

Now we have the sequences defined. What is left to prove is that $\lim a_i = \lim b_i$. Obviously the limits exist as the sequences are monotone. From the construction, it is obvious that $b_i - a_i$ is cut in half in each step. Therefore $b_{i+1} - a_{i+1} = \frac{b_i - a_i}{2}$. By induction, we obtain that
\[ b_i - a_i = \frac{b_1 - a_1}{2^{i-1}} . \]

Let $x := \lim a_i$. As $\{a_i\}$ is monotone we have that
\[ x = \sup\{a_i : i \in \mathbb{N}\} . \]
Now let $y := \lim b_i = \inf\{b_i : i \in \mathbb{N}\}$. Obviously $x \le y$ as $a_i < b_i$ for all $i$. As the sequences are monotone, then for any $i$ we have (why?)
\[ y - x \le b_i - a_i = \frac{b_1 - a_1}{2^{i-1}} . \]
As $\frac{b_1 - a_1}{2^{i-1}}$ is arbitrarily small and $y - x \ge 0$, we have that $y - x = 0$. We finish by the squeeze lemma.

Yet another proof of the Bolzano-Weierstrass theorem proves the following claim, which is left as a challenging exercise. Claim: Every sequence has a monotone subsequence.

2.3.4 Exercises

Exercise 2.3.1: Suppose that $\{x_n\}$ is a bounded sequence. Define $a_n$ and $b_n$ as in Definition 2.3.1. Show that $\{a_n\}$ and $\{b_n\}$ are bounded.

Exercise 2.3.2: Suppose that $\{x_n\}$ is a bounded sequence. Define $b_n$ as in Definition 2.3.1. Show that $\{b_n\}$ is an increasing sequence.

Exercise 2.3.3: Finish the proof of Proposition 2.3.6. That is, suppose that $\{x_n\}$ is a bounded sequence and $\{x_{n_k}\}$ is a subsequence. Prove $\liminf_{n\to\infty} x_n \le \liminf_{k\to\infty} x_{n_k}$.

Exercise 2.3.4: Prove Theorem 2.3.7.

Exercise 2.3.5:
a) Let $x_n := \frac{(-1)^n}{n}$, find $\limsup x_n$ and $\liminf x_n$.
b) Let $x_n := \frac{(n-1)(-1)^n}{n}$, find $\limsup x_n$ and $\liminf x_n$.
Exercise 2.3.6: Let $\{x_n\}$ and $\{y_n\}$ be bounded sequences such that $x_n \le y_n$ for all $n$. Then show that
\[ \limsup_{n\to\infty} x_n \le \limsup_{n\to\infty} y_n \qquad \text{and} \qquad \liminf_{n\to\infty} x_n \le \liminf_{n\to\infty} y_n . \]

Exercise 2.3.7: Let $\{x_n\}$ and $\{y_n\}$ be bounded sequences.
a) Show that $\{x_n + y_n\}$ is bounded.
b) Show that
\[ (\liminf_{n\to\infty} x_n) + (\liminf_{n\to\infty} y_n) \le \liminf_{n\to\infty} (x_n + y_n) . \]
Hint: Find a subsequence $\{x_{n_i} + y_{n_i}\}$ of $\{x_n + y_n\}$ that converges. Then find a subsequence $\{x_{n_{m_i}}\}$ of $\{x_{n_i}\}$ that converges. Then apply what you know about limits.
c) Find an explicit $\{x_n\}$ and $\{y_n\}$ such that
\[ (\liminf_{n\to\infty} x_n) + (\liminf_{n\to\infty} y_n) < \liminf_{n\to\infty} (x_n + y_n) . \]
Hint: Look for examples that do not have a limit.

Exercise 2.3.8: Let $\{x_n\}$ and $\{y_n\}$ be bounded sequences (from the previous exercise we know that $\{x_n + y_n\}$ is bounded).
a) Show that
\[ (\limsup_{n\to\infty} x_n) + (\limsup_{n\to\infty} y_n) \ge \limsup_{n\to\infty} (x_n + y_n) . \]
Hint: See previous exercise.
b) Find an explicit $\{x_n\}$ and $\{y_n\}$ such that
\[ (\limsup_{n\to\infty} x_n) + (\limsup_{n\to\infty} y_n) > \limsup_{n\to\infty} (x_n + y_n) . \]
Hint: See previous exercise.

Exercise 2.3.9: If $S \subset \mathbb{R}$ is a set, then $x \in \mathbb{R}$ is a cluster point if for every $\varepsilon > 0$, the set $(x - \varepsilon, x + \varepsilon) \cap S \setminus \{x\}$ is not empty. That is, if there are points of $S$ arbitrarily close to $x$. For example, $S := \{1/n : n \in \mathbb{N}\}$ has a unique (only one) cluster point $0$, but $0 \notin S$. Prove the following version of the Bolzano-Weierstrass theorem:

Theorem. Let $S \subset \mathbb{R}$ be a bounded infinite set, then there exists at least one cluster point of $S$.

Hint: If $S$ is infinite, then $S$ contains a countably infinite subset. That is, there is a sequence $\{x_n\}$ of distinct numbers in $S$.

Exercise 2.3.10 (Challenging):
a) Prove that any sequence contains a monotone subsequence. Hint: Call $n \in \mathbb{N}$ a peak if $a_m \le a_n$ for all $m \ge n$. Now there are two possibilities: either the sequence has at most finitely many peaks, or it has infinitely many peaks.
b) Now conclude the Bolzano-Weierstrass theorem.
2.4 Cauchy sequences

Note: 0.5-1 lecture

Often we wish to describe a certain number by a sequence that converges to it. In this case, it is impossible to use the number itself in the proof that the sequence converges. It would be nice if we could check for convergence without being able to find the limit.

Definition 2.4.1. A sequence $\{x_n\}$ is a Cauchy sequence‡ if for every $\varepsilon > 0$ there exists an $M \in \mathbb{N}$ such that for all $n \ge M$ and all $k \ge M$ we have
\[ |x_n - x_k| < \varepsilon . \]

‡ Named after the French mathematician Augustin-Louis Cauchy (1789–1857).

Intuitively what it means is that the terms of the sequence are eventually arbitrarily close to each other. We would expect such a sequence to be convergent. It turns out that is true because $\mathbb{R}$ is complete (has the least-upper-bound property). First, let us look at some examples.

Example 2.4.2: The sequence $\{1/n\}$ is a Cauchy sequence.

Proof: Let $\varepsilon > 0$ be given. Take $M > 2/\varepsilon$. Then for $n \ge M$ we have that $1/n < \varepsilon/2$. Therefore, for all $n, k \ge M$ we have
\[ \left| \frac{1}{n} - \frac{1}{k} \right| \le \left| \frac{1}{n} \right| + \left| \frac{1}{k} \right| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon . \]

Example 2.4.3: The sequence $\{\frac{n+1}{n}\}$ is a Cauchy sequence.

Proof: Given $\varepsilon > 0$, find $M$ such that $M > 2/\varepsilon$. Then for $n, k \ge M$ we have that $1/n < \varepsilon/2$ and $1/k < \varepsilon/2$. Therefore for $n, k \ge M$ we have
\[ \left| \frac{n+1}{n} - \frac{k+1}{k} \right| = \left| \frac{k(n+1) - n(k+1)}{nk} \right| = \left| \frac{kn + k - nk - n}{nk} \right| = \left| \frac{k - n}{nk} \right| \le \left| \frac{k}{nk} \right| + \left| \frac{-n}{nk} \right| = \frac{1}{n} + \frac{1}{k} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon . \]

Proposition 2.4.4. A Cauchy sequence is bounded.
Proof. Suppose that $\{x_n\}$ is Cauchy. Pick $M$ such that for all $n, k \ge M$ we have $|x_n - x_k| < 1$. In particular, we have that for all $n \ge M$
\[ |x_n - x_M| < 1 . \]
Or by the reverse triangle inequality, $|x_n| - |x_M| \le |x_n - x_M| < 1$. Hence for $n \ge M$ we have
\[ |x_n| < 1 + |x_M| . \]
Let
\[ B := \max\{ |x_1|, |x_2|, \ldots, |x_M|, 1 + |x_M| \} . \]
Then $|x_n| \le B$ for all $n \in \mathbb{N}$.

Theorem 2.4.5. A sequence of real numbers is Cauchy if and only if it converges.

Proof. Let $\varepsilon > 0$ be given and suppose that $\{x_n\}$ converges to $x$. Then there exists an $M$ such that for $n \ge M$ we have
\[ |x_n - x| < \frac{\varepsilon}{2} . \]
Hence for $n \ge M$ and $k \ge M$ we have
\[ |x_n - x_k| = |x_n - x + x - x_k| \le |x_n - x| + |x - x_k| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon . \]

Alright, that direction was easy. Now suppose that $\{x_n\}$ is Cauchy. We have shown that $\{x_n\}$ is bounded. If we can show that
\[ \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n , \]
then $\{x_n\}$ must be convergent by Theorem 2.3.5.

Define $a := \liminf x_n$ and $b := \limsup x_n$. If we can show $a = b$, then the sequence converges. By Theorem 2.3.4, there exist subsequences $\{x_{n_i}\}$ and $\{x_{m_i}\}$, such that
\[ \lim_{i\to\infty} x_{n_i} = a \qquad \text{and} \qquad \lim_{i\to\infty} x_{m_i} = b . \]
Given an $\varepsilon > 0$, there exists an $M_1$ such that for all $i \ge M_1$ we have $|x_{n_i} - a| < \varepsilon/3$ and an $M_2$ such that for all $i \ge M_2$ we have $|x_{m_i} - b| < \varepsilon/3$. There also exists an $M_3$ such that for all $n, k \ge M_3$ we have $|x_n - x_k| < \varepsilon/3$. Let $M := \max\{M_1, M_2, M_3\}$. Now note that if $i \ge M$, then $n_i \ge M$ and $m_i \ge M$. Hence
\[ |a - b| = |a - x_{n_i} + x_{n_i} - x_{m_i} + x_{m_i} - b| \le |a - x_{n_i}| + |x_{n_i} - x_{m_i}| + |x_{m_i} - b| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon . \]
As $|a - b| < \varepsilon$ for all $\varepsilon > 0$, then $a = b$ and therefore the sequence converges.
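The choice $M > 2/\varepsilon$ from Example 2.4.2 can be tested on a finite window of indices. The following sketch is not part of the original notes; the names `cauchy_index` and `largest_gap` and the cutoff `horizon` are assumptions introduced for illustration, and a finite check can only illustrate the definition, not prove it.

```python
from fractions import Fraction
from math import floor

def cauchy_index(eps):
    """Smallest natural number M with M > 2/eps, as in Example 2.4.2."""
    return floor(2 / eps) + 1

def largest_gap(M, horizon):
    """Largest |1/n - 1/k| over all M <= n, k <= horizon."""
    return max(abs(Fraction(1, n) - Fraction(1, k))
               for n in range(M, horizon + 1)
               for k in range(M, horizon + 1))

eps = Fraction(1, 10)  # sample tolerance
M = cauchy_index(eps)
print(M, largest_gap(M, M + 100) < eps)
```

For the sequence $\{1/n\}$ the largest gap on the window is attained at the two endpoints, so enlarging `horizon` only pushes the gap toward $1/M$, which is still below $\varepsilon/2$.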
Remark 2.4.6. The statement of this proposition is sometimes used to define the completeness property of the real numbers. That is, we can say that $\mathbb{R}$ is Cauchy-complete (or sometimes just complete). We have proved above that as $\mathbb{R}$ has the least-upper-bound property, then $\mathbb{R}$ is Cauchy-complete. We can "complete" $\mathbb{Q}$ by "throwing in" just enough points to make all Cauchy sequences converge (we omit the details). The resulting field will have the least-upper-bound property. The advantage of using Cauchy sequences to define completeness is that this idea generalizes to more abstract settings.

2.4.1 Exercises

Exercise 2.4.1: Prove that $\{\frac{n^2 - 1}{n^2}\}$ is Cauchy using directly the definition of Cauchy sequences.

Exercise 2.4.2: Let $\{x_n\}$ be a sequence such that there exists a $0 < C < 1$ such that
\[ |x_{n+1} - x_n| \le C \, |x_n - x_{n-1}| . \]
Prove that $\{x_n\}$ is Cauchy. Hint: You can freely use the formula (for $C \ne 1$)
\[ 1 + C + C^2 + \cdots + C^n = \frac{1 - C^{n+1}}{1 - C} . \]

Exercise 2.4.3: Suppose that $F$ is an ordered field that contains the rational numbers $\mathbb{Q}$. We can define a convergent sequence and Cauchy sequence in $F$ in exactly the same way as before. Suppose that every Cauchy sequence in $F$ is convergent. Prove that $F$ has the least-upper-bound property.

Exercise 2.4.4: Let $\{x_n\}$ and $\{y_n\}$ be sequences such that $\lim y_n = 0$. Suppose that for all $k \in \mathbb{N}$ and for all $m \ge k$ we have
\[ |x_m - x_k| \le y_k . \]
Show that $\{x_n\}$ is Cauchy.

Exercise 2.4.5: Suppose that a Cauchy sequence $\{x_n\}$ is such that for every $M \in \mathbb{N}$, there exists a $k \ge M$ and an $n \ge M$ such that $x_k < 0$ and $x_n > 0$. Using simply the definition of a Cauchy sequence and of a convergent sequence, show that the sequence converges to 0.
1. That is. The numbers sn are called partial sums. k=1 ∞ n=1 ∑ xn = x. The idea is the same as the notation for the tail of a sequence. when foundations of analysis were being developed. For example. If x := lim sn .5.2. if the sequence {sn } deﬁned by n sn := converges. That is. . ∞ The lefthand side is more convenient to write. we write the formal object ∞ n=1 ∑ xn or sometimes just ∑ xn and call it a series. Understanding series is very important in applications of analysis.68 CHAPTER 2. In fact. ∑ xn is simply a formal object and not a number. On the other hand. for example we can write ∞ n=0 ∑r n := n=1 ∑ rn−1. the motivation was to understand series.5. k=1 We should be careful however to only use this equality if the limit on the right actually exists.5.5 Series Note: 2 lectures A fundamental object in mathematics is that of a series. solving differential equations often includes series. Remark 2. SEQUENCES AND SERIES 2. we write ∑ xk = x1 + x2 + · · · + xn. let us remark that it is sometimes convenient to start the series at an index different from 1. In this case. and differential equations are the basis for understanding almost all of modern science. In other words. we treat as a number. 2. if the sequence {sn } diverges. Given a sequence {xn }. In this case.1 Deﬁnition Deﬁnition 2. A series converges. for a convergent series we have ∞ n=1 n n→∞ ∑∞ xn n=1 ∑ xn = lim ∑ xk . we say that the series is divergent. Before going further. the righthand side does not make sense (the limit does not exist) if the series does not converge.
Remark 2.5.3. It is common to write the series $\sum x_n$ as
\[ x_1 + x_2 + x_3 + \cdots \]
with the understanding that the ellipsis indicates that this is a series and not a simple sum. We will not use this notation as it easily leads to mistakes in proofs.

Example 2.5.4: The series
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} \]
converges and the limit is 1. That is,
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{2^k} = 1 . \]
First we prove the following equality
\[ \left( \sum_{k=1}^{n} \frac{1}{2^k} \right) + \frac{1}{2^n} = 1 . \]
Note that the equation is easy to see when $n = 1$. The proof follows by induction, which we leave as an exercise. Let $s_n$ be the partial sum. We write
\[ |1 - s_n| = \left| 1 - \sum_{k=1}^{n} \frac{1}{2^k} \right| = \left| \frac{1}{2^n} \right| = \frac{1}{2^n} . \]
The sequence $\{\frac{1}{2^n}\}$ converges to zero and so $\{|1 - s_n|\}$ converges to zero. So, $\{s_n\}$ converges to 1.

For $-1 < r < 1$, the geometric series
\[ \sum_{n=0}^{\infty} r^n \]
converges. In fact, $\sum_{n=0}^{\infty} r^n = \frac{1}{1-r}$. The proof is left as an exercise to the reader. The proof consists of showing that
\[ \sum_{k=0}^{n-1} r^k = \frac{1 - r^n}{1 - r} , \]
and then taking the limit.

A fact we will use a lot is the following analogue of looking at the tail of a sequence.

Proposition 2.5.5. Let $\sum x_n$ be a series. Let $M \in \mathbb{N}$. Then
\[ \sum_{n=1}^{\infty} x_n \text{ converges if and only if } \sum_{n=M}^{\infty} x_n \text{ converges.} \]
Proof. We look at partial sums of the two series (for $k \ge M$)
\[ \sum_{n=1}^{k} x_n = \left( \sum_{n=1}^{M-1} x_n \right) + \sum_{n=M}^{k} x_n . \]
Note that $\sum_{n=1}^{M-1} x_n$ is a fixed number. Now use Proposition 2.2.5 to finish the proof.

2.5.2 Cauchy series

Definition 2.5.6. A series $\sum x_n$ is said to be Cauchy or a Cauchy series, if the sequence of partial sums $\{s_n\}$ is a Cauchy sequence.

A sequence of real numbers converges if and only if it is Cauchy. Therefore a series is convergent if and only if it is Cauchy.

The series $\sum x_n$ is Cauchy if for every $\varepsilon > 0$, there exists an $M \in \mathbb{N}$, such that for every $n \ge M$ and $k \ge M$ we have
\[ \left| \left( \sum_{j=1}^{k} x_j \right) - \left( \sum_{j=1}^{n} x_j \right) \right| < \varepsilon . \]
Without loss of generality we can assume that $n < k$. Then we write
\[ \left| \left( \sum_{j=1}^{k} x_j \right) - \left( \sum_{j=1}^{n} x_j \right) \right| = \left| \sum_{j=n+1}^{k} x_j \right| < \varepsilon . \]
We have proved the following simple proposition.

Proposition 2.5.7. The series $\sum x_n$ is Cauchy if for every $\varepsilon > 0$, there exists an $M \in \mathbb{N}$ such that for every $n \ge M$ and every $k > n$ we have
\[ \left| \sum_{j=n+1}^{k} x_j \right| < \varepsilon . \]

2.5.3 Basic properties

Proposition 2.5.8. Let $\sum x_n$ be a convergent series. Then the sequence $\{x_n\}$ is convergent and
\[ \lim_{n\to\infty} x_n = 0 . \]

Proof. Let $\varepsilon > 0$ be given. As $\sum x_n$ is convergent, it is Cauchy. Thus we can find an $M$ such that for every $n \ge M$ we have
\[ \varepsilon > \left| \sum_{j=n+1}^{n+1} x_j \right| = |x_{n+1}| . \]
Hence for every $n \ge M + 1$ we have that $|x_n| < \varepsilon$.
Hence if a series converges, the terms of the series go to zero. However, the implication goes only one way. Let us give an example.

Example 2.5.9: The series $\sum \frac{1}{n}$ diverges (despite the fact that $\lim \frac{1}{n} = 0$). This is the famous harmonic series§.

We will simply show that the sequence of partial sums is unbounded, and hence cannot converge. Write the partial sums $s_n$ for $n = 2^k$ as:
\[ s_1 = 1, \]
\[ s_2 = (1) + \left( \frac{1}{2} \right), \]
\[ s_4 = (1) + \left( \frac{1}{2} \right) + \left( \frac{1}{3} + \frac{1}{4} \right), \]
\[ s_8 = (1) + \left( \frac{1}{2} \right) + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} \right), \]
\[ \vdots \]
\[ s_{2^k} = 1 + \sum_{j=1}^{k} \left( \sum_{m=2^{j-1}+1}^{2^j} \frac{1}{m} \right) . \]
We note that $1/3 + 1/4 \ge 1/4 + 1/4 = 1/2$ and $1/5 + 1/6 + 1/7 + 1/8 \ge 1/8 + 1/8 + 1/8 + 1/8 = 1/2$. More generally
\[ \sum_{m=2^{k-1}+1}^{2^k} \frac{1}{m} \ge \sum_{m=2^{k-1}+1}^{2^k} \frac{1}{2^k} = (2^{k-1}) \frac{1}{2^k} = \frac{1}{2} . \]
Therefore
\[ s_{2^k} = 1 + \sum_{j=1}^{k} \left( \sum_{m=2^{j-1}+1}^{2^j} \frac{1}{m} \right) \ge 1 + \sum_{j=1}^{k} \frac{1}{2} = 1 + \frac{k}{2} . \]
As $\{\frac{k}{2}\}$ is unbounded by the Archimedean property, that means that $\{s_{2^k}\}$ is unbounded, and therefore $\{s_n\}$ is unbounded. Hence $\{s_n\}$ diverges, and consequently $\sum \frac{1}{n}$ diverges.

§ The divergence of the harmonic series was known before the theory of series was made rigorous. In fact the proof we give is the earliest proof and was given by Nicole Oresme (1323–1382).

Convergent series are linear. That is, we can multiply them by constants and add them and these operations are done term by term.

Proposition 2.5.10 (Linearity of series). Let $\alpha \in \mathbb{R}$ and $\sum x_n$ and $\sum y_n$ be convergent series.
(i) Then $\sum \alpha x_n$ is a convergent series and
\[ \sum_{n=1}^{\infty} \alpha x_n = \alpha \sum_{n=1}^{\infty} x_n . \]
(ii) Then $\sum (x_n + y_n)$ is a convergent series and
\[ \sum_{n=1}^{\infty} (x_n + y_n) = \left( \sum_{n=1}^{\infty} x_n \right) + \left( \sum_{n=1}^{\infty} y_n \right) . \]

Proof. For the first item, we simply write the $n$th partial sum
\[ \sum_{k=1}^{n} \alpha x_k = \alpha \left( \sum_{k=1}^{n} x_k \right) . \]
We look at the right-hand side and note that the constant multiple of a convergent sequence is convergent. Hence, we simply can take the limit of both sides to obtain the result.

For the second item we also look at the $n$th partial sum
\[ \sum_{k=1}^{n} (x_k + y_k) = \left( \sum_{k=1}^{n} x_k \right) + \left( \sum_{k=1}^{n} y_k \right) . \]
We look at the right-hand side and note that the sum of convergent sequences is convergent. Hence, we simply can take the limit of both sides to obtain the proposition.

Note that multiplying series is not as simple as adding, and we will not cover this topic here. It is not true, of course, that we can multiply term by term, since that strategy does not work even for finite sums.

2.5.4 Absolute convergence

Since monotone sequences are easier to work with than arbitrary sequences, it is generally easier to work with series $\sum x_n$ where $x_n \ge 0$ for all $n$. Then the sequence of partial sums is monotone increasing. Let us formalize this statement as a proposition.

Proposition 2.5.11. If $x_n \ge 0$ for all $n$, then $\sum x_n$ converges if and only if the sequence of partial sums is bounded.

The following criterion often gives a convenient way to test for convergence of a series.

Definition 2.5.12. A series $\sum x_n$ converges absolutely if the series $\sum |x_n|$ converges. If a series converges, but does not converge absolutely, we say it is conditionally convergent.

Proposition 2.5.13. If the series $\sum x_n$ converges absolutely, then it converges.
Proof. We know that a series is convergent if and only if it is Cauchy. Hence suppose that $\sum |x_n|$ is Cauchy. That is for every $\varepsilon > 0$, there exists an $M$ such that for all $k \ge M$ and $n > k$ we have that
\[ \sum_{j=k+1}^{n} |x_j| = \left| \sum_{j=k+1}^{n} |x_j| \right| < \varepsilon . \]
We can apply the triangle inequality for a finite sum to obtain
\[ \left| \sum_{j=k+1}^{n} x_j \right| \le \sum_{j=k+1}^{n} |x_j| < \varepsilon . \]
Hence $\sum x_n$ is Cauchy and therefore it converges.

Of course, if $\sum x_n$ converges absolutely, the limits of $\sum x_n$ and $\sum |x_n|$ are in general different. Computing one will not help us compute the other.

Absolutely convergent series have many wonderful properties for which we do not have space in these notes. For example, absolutely convergent series can be rearranged arbitrarily.

We state without proof that
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n} \]
converges. On the other hand we have already seen that
\[ \sum_{n=1}^{\infty} \frac{1}{n} \]
diverges. Therefore $\sum \frac{(-1)^n}{n}$ is a conditionally convergent series.

2.5.5 Comparison test and the p-series

We have noted above that for a series to converge the terms not only have to go to zero, but they have to go to zero "fast enough." If we know about convergence of a certain series we can use the following comparison test to see if the terms of another series go to zero "fast enough."

Proposition 2.5.14 (Comparison test). Let $\sum x_n$ and $\sum y_n$ be series such that $0 \le x_n \le y_n$ for all $n \in \mathbb{N}$.
(i) If $\sum y_n$ converges, then so does $\sum x_n$.
(ii) If $\sum x_n$ diverges, then so does $\sum y_n$.
Proof. Since the terms of the series are all nonnegative, the sequences of partial sums are both monotone increasing. We note that since $x_n \le y_n$ for all $n$, then the partial sums satisfy
\[ \sum_{k=1}^{n} x_k \le \sum_{k=1}^{n} y_k . \qquad (2.1) \]
If the series $\sum y_n$ converges the partial sums for the series are bounded. Therefore the right-hand side of (2.1) is bounded for all $n$. Hence the partial sums for $\sum x_n$ are also bounded. Since the partial sums are a monotone increasing sequence they are convergent. The first item is thus proved.

On the other hand if $\sum x_n$ diverges, the sequence of partial sums must be unbounded since it is monotone increasing. That is, the partial sums for $\sum x_n$ are bigger than any real number. Putting this together with (2.1) we see that for any $B \in \mathbb{R}$, there is an $n$ such that
\[ B \le \sum_{k=1}^{n} x_k \le \sum_{k=1}^{n} y_k . \]
Hence the partial sums for $\sum y_n$ are also unbounded, and hence $\sum y_n$ also diverges.

A useful series to use with the comparison test is the p-series.

Proposition 2.5.15 (p-series or the p-test). For $p > 0$, the series
\[ \sum_{n=1}^{\infty} \frac{1}{n^p} \]
converges if and only if $p > 1$.

Proof. As $n \ge 1$ and $p \le 1$, then $\frac{1}{n^p} \ge \frac{1}{n}$. Since $\sum \frac{1}{n}$ diverges, we see that $\sum \frac{1}{n^p}$ must diverge for all $p \le 1$.

Now suppose that $p > 1$. We proceed in a similar fashion as we did in the case of the harmonic series, but instead of showing that the sequence of partial sums is unbounded we show that it is bounded.

Since the terms of the series are positive, the sequence of partial sums is monotone increasing. If we show that it is bounded, it must converge. Let $s_k$ denote the $k$th partial sum.
\[ s_1 = 1, \]
\[ s_3 = (1) + \left( \frac{1}{2^p} + \frac{1}{3^p} \right), \]
\[ s_7 = (1) + \left( \frac{1}{2^p} + \frac{1}{3^p} \right) + \left( \frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p} \right), \]
\[ \vdots \]
\[ s_{2^k - 1} = 1 + \sum_{j=1}^{k-1} \left( \sum_{m=2^j}^{2^{j+1}-1} \frac{1}{m^p} \right) . \]
Example 2. +1 Therefore. For example. Understanding the behavior of this function contains one of the most famous problems in mathematics today and has applications in seemingly unrelated areas such as modern cryptography. then 2 p < 3 p . Similarly 41p + 51p + 61p + 71p < 41p + 41p + 41p + 41p .5. In particular. and hence 21p + 31p < 21p + 21p . Therefore k 2 j+1 −1 m=2 s2k −1 = 1 + ∑ k j=1 ∑j ∑j 1 mp 1 (2 j ) p 2 j+1 −1 m=2 < 1+ ∑ k j=1 = 1+ ∑ k j=1 2j (2 j ) p 1 2 p−1 j = 1+ ∑ As p > 1. ∑ n21 converges. as p > 1. +1 1 1 Proof: First note that n21 < n2 for all n ∈ N.16: The series ∑ n21 converges.5. j=1 The sequence of partial sums is bounded and hence converges.2. j=1 < 1. we estimate from above. They only tell us that a limit of the partial sums exists. ∞ sn < 1 + ∑ 1 2 p−1 j . if we treat ∑ 1/n p as a function of p. Then by using the result of Exercise 2. we get the socalled Riemann ζ function. Note that ∑ n2 converges by the pseries test. then all sn ≤ s2k −1 for all n ≤ 2k − 1. Therefore k s2k −1 < 1 + ∑ 1 2 p−1 j ∞ j=1 ≤ 1+ ∑ 1 2 p−1 j . Note that neither the pseries test nor the comparison test will tell us what the sum converges to. In fact. +1 ¶ Demonstration of this fact is what made the Swiss mathematician Leonhard Paul Euler (1707 – 1783) famous. . Thus for all n. we note that ∞ j=1 ∑ 1 2 p−1 j . SERIES 75 Instead of estimating from below. converges. while we know that ∑ 1/n2 converges it is far harder to ﬁnd¶ that the limit is π 2/2. then 1 2 p−1 . j=1 As {sn } is a monotone sequence.5.2. by the comparison test.
2.5.6 Ratio test
Proposition 2.5.17 (Ratio test). Let $\sum x_n$ be a series such that
\[ L := \lim_{n\to\infty} \frac{|x_{n+1}|}{|x_n|} \]
exists. Then
(i) If $L < 1$, then $\sum x_n$ converges absolutely.
(ii) If $L > 1$, then $\sum x_n$ diverges.
Proof. From Lemma 2.2.12 we note that if $L > 1$, then $\{x_n\}$ diverges. Since it is a necessary condition for the convergence of series that the terms go to zero, we know that $\sum x_n$ must diverge.
Thus suppose that $L < 1$. We will argue that $\sum |x_n|$ must converge. The proof is similar to that of Lemma 2.2.12. Of course $L \geq 0$. Now pick $r$ such that $L < r < 1$. As $r - L > 0$, there exists an $M \in \mathbb{N}$ such that for all $n \geq M$
\[ \Bigl| \frac{|x_{n+1}|}{|x_n|} - L \Bigr| < r - L . \]
Therefore,
\[ \frac{|x_{n+1}|}{|x_n|} < r . \]
For $n > M$ (that is for $n \geq M+1$) write
\[ |x_n| = |x_M| \, \frac{|x_n|}{|x_{n-1}|} \, \frac{|x_{n-1}|}{|x_{n-2}|} \cdots \frac{|x_{M+1}|}{|x_M|} < |x_M| \, r r \cdots r = |x_M| \, r^{n-M} = (|x_M| \, r^{-M}) r^n . \]
For $n > M$ we can therefore write the partial sum as
\[ \sum_{k=1}^n |x_k| = \sum_{k=1}^M |x_k| + \sum_{k=M+1}^n |x_k| \leq \sum_{k=1}^M |x_k| + \sum_{k=M+1}^n (|x_M| \, r^{-M}) r^k = \sum_{k=1}^M |x_k| + (|x_M| \, r^{-M}) \sum_{k=M+1}^n r^k . \]
As $0 < r < 1$ the geometric series $\sum_{k=0}^\infty r^k$ converges and thus of course $\sum_{k=M+1}^\infty r^k$ converges as well (why?). Thus we can take the limit as $n$ goes to infinity on the right-hand side to obtain
\[ \sum_{k=1}^n |x_k| \leq \sum_{k=1}^M |x_k| + (|x_M| \, r^{-M}) \sum_{k=M+1}^n r^k \leq \sum_{k=1}^M |x_k| + (|x_M| \, r^{-M}) \sum_{k=M+1}^\infty r^k . \]
The right-hand side is a number that does not depend on $n$. Hence the sequence of partial sums of $\sum |x_n|$ is bounded and therefore $\sum |x_n|$ is convergent. Thus $\sum x_n$ is absolutely convergent.
Example 2.5.18: The series
\[ \sum_{n=1}^\infty \frac{2^n}{n!} \]
converges absolutely.
Proof: We compute the limit of the ratios of consecutive terms:
\[ \lim_{n\to\infty} \frac{2^{n+1}/(n+1)!}{2^n/n!} = \lim_{n\to\infty} \frac{2}{n+1} = 0 . \]
Therefore, the series converges absolutely by the ratio test.
2.5.7 Exercises
Exercise 2.5.1: For $r \neq 1$, prove
\[ \sum_{k=0}^{n-1} r^k = \frac{1 - r^n}{1 - r} . \]
Hint: Let $s := \sum_{k=0}^{n-1} r^k$, then compute $s(1-r) = s - rs$, and solve for $s$.
Exercise 2.5.2: Prove that for $-1 < r < 1$ we have
\[ \sum_{n=0}^\infty r^n = \frac{1}{1-r} . \]
Hint: Use the previous exercise.
Exercise 2.5.3: Decide the convergence or divergence of the following series.
a) $\sum_{n=1}^\infty \frac{3}{9n+1}$
b) $\sum_{n=1}^\infty \frac{1}{2n-1}$
c) $\sum_{n=1}^\infty \frac{(-1)^n}{n^2}$
d) $\sum_{n=1}^\infty \frac{1}{n(n+1)}$
e) $\sum_{n=1}^\infty n e^{-n^2}$
Exercise 2.5.4:
a) Prove that if $\sum_{n=1}^\infty x_n$ converges, then $\sum_{n=1}^\infty (x_{2n} + x_{2n+1})$ also converges.
b) Find an explicit example where the converse does not hold.
Exercise 2.5.5: For $j = 1, 2, \ldots, n$, let $\{x_{j,k}\}_{k=1}^\infty$ denote $n$ sequences. Suppose that for each $j$
\[ \sum_{k=1}^\infty x_{j,k} \]
is convergent. Then show
\[ \sum_{j=1}^n \Bigl( \sum_{k=1}^\infty x_{j,k} \Bigr) = \sum_{k=1}^\infty \Bigl( \sum_{j=1}^n x_{j,k} \Bigr) . \]
Exercise 2.5.6: Prove the following stronger version of the ratio test: Let $\sum x_n$ be a series.
a) If there is an $N$ and a $\rho < 1$ such that for all $n \geq N$ we have $\frac{|x_{n+1}|}{|x_n|} < \rho$, then the series converges.
b) If there is an $N$ such that for all $n \geq N$ we have $\frac{|x_{n+1}|}{|x_n|} \geq 1$, then the series diverges.
Exercise 2.5.7: Let $\{x_n\}$ be a decreasing sequence such that $\sum x_n$ converges. Show that $\lim_{n\to\infty} n x_n = 0$.
Chapter 3
Continuous Functions
3.1 Limits of functions
Note: 3 lectures
Before we can define continuity of functions, we need to visit a somewhat more general notion of a limit. That is, given a function $f \colon S \to \mathbb{R}$, we want to see how $f(x)$ behaves as $x$ tends to a certain point.
3.1.1 Cluster points
First, let us return to a concept we have previously seen in an exercise.
Definition 3.1.1. Let $S \subset \mathbb{R}$ be a set. A number $x \in \mathbb{R}$ is called a cluster point of $S$ if for every $\varepsilon > 0$, the set $(x - \varepsilon, x + \varepsilon) \cap S \setminus \{x\}$ is not empty.
That is, $x$ is a cluster point of $S$ if there are points of $S$ arbitrarily close to $x$. Another way of phrasing the definition is to say that $x$ is a cluster point of $S$ if for every $\varepsilon > 0$, there exists a $y \in S$ such that $y \neq x$ and $|x - y| < \varepsilon$.
Let us see some examples.
(i) The set $\{1/n : n \in \mathbb{N}\}$ has a unique cluster point zero.
(ii) The cluster points of the open interval $(0, 1)$ are all points in the closed interval $[0, 1]$.
(iii) For the set $\mathbb{Q}$, the set of cluster points is the whole real line $\mathbb{R}$.
(iv) For the set $[0, 1) \cup \{2\}$, the set of cluster points is the interval $[0, 1]$.
(v) The set $\mathbb{N}$ has no cluster points in $\mathbb{R}$.
Proposition 3.1.2. Let $S \subset \mathbb{R}$. Then $x \in \mathbb{R}$ is a cluster point of $S$ if and only if there exists a convergent sequence of numbers $\{x_n\}$ such that $x_n \neq x$, $x_n \in S$, and $\lim x_n = x$.
Proof. First suppose that $x$ is a cluster point of $S$. For any $n \in \mathbb{N}$, we pick $x_n$ to be an arbitrary point of $(x - 1/n, x + 1/n) \cap S \setminus \{x\}$, which we know is nonempty because $x$ is a cluster point of $S$. Then $x_n$ is within $1/n$ of $x$, that is, $|x - x_n| < 1/n$. As $\{1/n\}$ converges to zero, $\{x_n\}$ converges to $x$.
On the other hand, if we start with a sequence of numbers $\{x_n\}$ in $S$ converging to $x$ such that $x_n \neq x$ for all $n$, then for every $\varepsilon > 0$ there is an $M$ such that in particular $|x_M - x| < \varepsilon$. That is, $x_M \in (x - \varepsilon, x + \varepsilon) \cap S \setminus \{x\}$.
3.1.2 Limits of functions
If a function $f$ is defined on a set $S$ and $c$ is a cluster point of $S$, then we can define the limit of $f(x)$ as $x$ gets close to $c$. Do note that it is irrelevant for the definition if $f$ is defined at $c$ or not. Furthermore, even if the function is defined at $c$, the limit of the function as $x$ goes to $c$ could very well be different from $f(c)$.
Definition 3.1.3. Let $f \colon S \to \mathbb{R}$ be a function and $c$ be a cluster point of $S$. Suppose that there exists an $L \in \mathbb{R}$ and for every $\varepsilon > 0$, there exists a $\delta > 0$ such that whenever $x \in S \setminus \{c\}$ and $|x - c| < \delta$, then $|f(x) - L| < \varepsilon$. In this case we say that $f(x)$ converges to $L$ as $x$ goes to $c$. We also say that $L$ is the limit of $f(x)$ as $x$ goes to $c$. We write
\[ \lim_{x \to c} f(x) := L, \]
or $f(x) \to L$ as $x \to c$. If no such $L$ exists, then we say that the limit does not exist or that $f$ diverges at $c$.
Again the notation and language we are using above assumes that the limit is unique even though we have not yet proved that. Let us do that now.
Proposition 3.1.4. Let $c$ be a cluster point of $S \subset \mathbb{R}$ and let $f \colon S \to \mathbb{R}$ be a function such that $f(x)$ converges as $x$ goes to $c$. Then the limit of $f(x)$ as $x$ goes to $c$ is unique.
Proof. Let $L_1$ and $L_2$ be two numbers that both satisfy the definition. Take an $\varepsilon > 0$ and find a $\delta_1 > 0$ such that $|f(x) - L_1| < \varepsilon/2$ for all $x \in S$, $|x - c| < \delta_1$, and $x \neq c$. Also find $\delta_2 > 0$ such that $|f(x) - L_2| < \varepsilon/2$ for all $x \in S$, $|x - c| < \delta_2$, and $x \neq c$. Put $\delta := \min\{\delta_1, \delta_2\}$. Suppose that $x \in S$, $|x - c| < \delta$, and $x \neq c$. Then
\[ |L_1 - L_2| = |L_1 - f(x) + f(x) - L_2| \leq |L_1 - f(x)| + |f(x) - L_2| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon . \]
As $|L_1 - L_2| < \varepsilon$ for arbitrary $\varepsilon > 0$, then $L_1 = L_2$.
Example 3.1.5: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined as $f(x) := x^2$. Then
\[ \lim_{x \to c} f(x) = \lim_{x \to c} x^2 = c^2 . \]
Proof: First let $c$ be fixed. Let $\varepsilon > 0$ be given. Take
\[ \delta := \min \Bigl\{ 1, \frac{\varepsilon}{2|c| + 1} \Bigr\} . \]
Take $x \neq c$ such that $|x - c| < \delta$. In particular, $|x - c| < 1$. Then by the reverse triangle inequality we get
\[ |x| - |c| \leq |x - c| < 1 . \]
Adding $2|c|$ to both sides we obtain $|x| + |c| < 2|c| + 1$. We can now compute
\[ |f(x) - c^2| = |x^2 - c^2| = |(x + c)(x - c)| = |x + c| \, |x - c| \leq (|x| + |c|) \, |x - c| < (2|c| + 1) \, |x - c| < (2|c| + 1) \frac{\varepsilon}{2|c| + 1} = \varepsilon . \]
Example 3.1.6: Let $S := [0, 1)$. Define
\[ f(x) := \begin{cases} x & \text{if } x > 0, \\ 1 & \text{if } x = 0. \end{cases} \]
Then
\[ \lim_{x \to 0} f(x) = 0, \]
even though $f(0) = 1$.
Proof: Let $\varepsilon > 0$ be given. Let $\delta := \varepsilon$. Then for $x \in S$, $x \neq 0$, and $|x - 0| < \delta$ we get
\[ |f(x) - 0| = |x| < \delta = \varepsilon . \]
3.1.3 Sequential limits
Let us connect the limit as defined above with limits of sequences.
Lemma 3.1.7. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ be a function. Then $f(x) \to L$ as $x \to c$, if and only if for every sequence $\{x_n\}$ of numbers such that $x_n \in S$, $x_n \neq c$, and such that $\lim x_n = c$, we have that the sequence $\{f(x_n)\}$ converges to $L$.
Proof. Suppose that $f(x) \to L$ as $x \to c$. Now suppose that $\{x_n\}$ is a sequence as in the lemma. We wish to show that $\{f(x_n)\}$ converges to $L$. Let $\varepsilon > 0$ be given. Find a $\delta > 0$ such that if $x \in S \cap (c - \delta, c + \delta) \setminus \{c\}$, then we have $|f(x) - L| < \varepsilon$. We know that $\{x_n\}$ converges to $c$, hence find an $M$ such that for $n \geq M$ we have that $|x_n - c| < \delta$. Therefore $x_n \in S \cap (c - \delta, c + \delta) \setminus \{c\}$, and thus $|f(x_n) - L| < \varepsilon$. Thus $\{f(x_n)\}$ converges to $L$.
For the other direction, we will use proof by contrapositive. Suppose that it is not true that $f(x) \to L$ as $x \to c$. The negation of the definition is that there exists an $\varepsilon > 0$ such that for every $\delta > 0$ there exists an $x \in S$ with $|x - c| < \delta$ and $x \neq c$, but $|f(x) - L| \geq \varepsilon$. Let us use $1/n$ for $\delta$ in the above statement. We have that for every $n$, there exists a point $x_n \in S$, $x_n \neq c$, and $|x_n - c| < 1/n$ such that $|f(x_n) - L| \geq \varepsilon$. The sequence $\{x_n\}$ then converges to $c$, while $\{f(x_n)\}$ does not converge to $L$. This is precisely the negation of the sequential statement. And we are done.
Example 3.1.8: $\lim_{x \to 0} \sin(1/x)$ does not exist, but $\lim_{x \to 0} x \sin(1/x) = 0$. See Figure 3.1.
Figure 3.1: Graphs of $\sin(1/x)$ and $x \sin(1/x)$. Note that the computer cannot properly graph $\sin(1/x)$ near zero as it oscillates too fast.
Proof: Let us work with $\sin(1/x)$ first. Let us define a sequence $x_n := \frac{1}{\pi n + \pi/2}$. It is not hard to see that $\lim x_n = 0$. Furthermore, $\sin(1/x_n) = \sin(\pi n + \pi/2) = (-1)^n$. Therefore, $\{\sin(1/x_n)\}$ does not converge. Thus, by Lemma 3.1.7, $\lim_{x \to 0} \sin(1/x)$ does not exist.
Now let us look at $x \sin(1/x)$. Let $\{x_n\}$ be a sequence such that $x_n \neq 0$ for all $n$ and such that $\lim x_n = 0$. Notice that $|\sin(t)| \leq 1$ for any $t \in \mathbb{R}$. Therefore,
\[ |x_n \sin(1/x_n) - 0| = |x_n| \, |\sin(1/x_n)| \leq |x_n| . \]
As $x_n$ goes to $0$, then $|x_n|$ goes to zero, and hence $\{x_n \sin(1/x_n)\}$ converges to zero. By Lemma 3.1.7, $\lim_{x \to 0} x \sin(1/x) = 0$.
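The two halves of Example 3.1.8 can be seen numerically by evaluating both functions along the sequence $x_n = 1/(\pi n + \pi/2)$ from the proof. This is a small illustrative sketch, not part of the argument; the helper names `x_seq`, `oscillating`, and `damped` are invented for this example, and floating-point values only approximate the exact ones.

```python
import math


def x_seq(n: int) -> float:
    """The sequence x_n = 1 / (pi*n + pi/2) from the proof; it tends to 0."""
    return 1.0 / (math.pi * n + math.pi / 2)


def oscillating(x: float) -> float:
    """sin(1/x), defined for x != 0."""
    return math.sin(1.0 / x)


def damped(x: float) -> float:
    """x*sin(1/x), defined for x != 0."""
    return x * math.sin(1.0 / x)


if __name__ == "__main__":
    for n in range(1, 9):
        x = x_seq(n)
        # sin(1/x_n) keeps jumping between values near -1 and +1,
        # while |x_n * sin(1/x_n)| <= |x_n| shrinks to 0.
        print(f"n={n}  x_n={x:.5f}  sin(1/x_n)={oscillating(x):+.5f}  "
              f"x_n*sin(1/x_n)={damped(x):+.5f}")
```

Along this sequence the first column of function values alternates in sign and so cannot converge, while the second is squeezed to zero, matching the two claims of the example.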
Using the lemma above we can start applying anything we know about sequential limits to limits of functions. Let us give a few important examples.
Corollary 3.1.9. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ and $g \colon S \to \mathbb{R}$ be functions. Suppose that the limits of $f(x)$ and $g(x)$ as $x$ goes to $c$ both exist, and that
\[ f(x) \leq g(x) \quad \text{for all } x \in S . \]
Then
\[ \lim_{x \to c} f(x) \leq \lim_{x \to c} g(x) . \]
Proof. Let $\{x_n\}$ be a sequence of numbers from $S \setminus \{c\}$ that converges to $c$. Let
\[ L_1 := \lim_{x \to c} f(x) \quad \text{and} \quad L_2 := \lim_{x \to c} g(x) . \]
By Lemma 3.1.7 we know $\{f(x_n)\}$ converges to $L_1$ and $\{g(x_n)\}$ converges to $L_2$. We obtain $L_1 \leq L_2$ using Lemma 2.2.3.
By applying constant functions, we get the following corollary. The proof is left as an exercise.
Corollary 3.1.10. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ be a function, and suppose that the limit of $f(x)$ as $x$ goes to $c$ exists. Suppose that there are two real numbers $a$ and $b$ such that
\[ a \leq f(x) \leq b \quad \text{for all } x \in S . \]
Then
\[ a \leq \lim_{x \to c} f(x) \leq b . \]
By applying Lemma 3.1.7 in the same way as above we also get the following corollaries, whose proofs are again left as an exercise. Corollary 3.1.11. Let S ⊂ R and c be a cluster point of S. Let f : S → R, g : S → R, and h : S → R be functions. Suppose that f (x) ≤ g(x) ≤ h(x) for all x ∈ S,
and the limits of $f(x)$ and $h(x)$ as $x$ goes to $c$ both exist, and
\[ \lim_{x \to c} f(x) = \lim_{x \to c} h(x) . \]
Then the limit of $g(x)$ as $x$ goes to $c$ exists and
\[ \lim_{x \to c} g(x) = \lim_{x \to c} f(x) = \lim_{x \to c} h(x) . \]
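Corollary 3.1.12. Let $S \subset \mathbb{R}$ and $c$ be a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ and $g \colon S \to \mathbb{R}$ be functions. Suppose that the limits of $f(x)$ and $g(x)$ as $x$ goes to $c$ both exist. Then
(i) $\lim_{x \to c} \bigl( f(x) + g(x) \bigr) = \bigl( \lim_{x \to c} f(x) \bigr) + \bigl( \lim_{x \to c} g(x) \bigr)$.
(ii) $\lim_{x \to c} \bigl( f(x) - g(x) \bigr) = \bigl( \lim_{x \to c} f(x) \bigr) - \bigl( \lim_{x \to c} g(x) \bigr)$.
(iii) $\lim_{x \to c} \bigl( f(x) g(x) \bigr) = \bigl( \lim_{x \to c} f(x) \bigr) \bigl( \lim_{x \to c} g(x) \bigr)$.
(iv) If $\lim_{x \to c} g(x) \neq 0$, and $g(x) \neq 0$ for all $x \in S$, then
\[ \lim_{x \to c} \frac{f(x)}{g(x)} = \frac{\lim_{x \to c} f(x)}{\lim_{x \to c} g(x)} . \]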
3.1.4 Restrictions and limits
It is not necessary to always consider all of $S$. Sometimes we may be able to just work with the function defined on a smaller set.
Definition 3.1.13. Let $f \colon S \to \mathbb{R}$ be a function. Let $A \subset S$. Define the function $f|_A \colon A \to \mathbb{R}$ by
\[ f|_A(x) := f(x) \quad \text{for } x \in A . \]
The function $f|_A$ is called the restriction of $f$ to $A$.
The function $f|_A$ is simply the function $f$ taken on a smaller domain. The following proposition is the analogue of taking a tail of a sequence.
Proposition 3.1.14. Let $S \subset \mathbb{R}$ and let $c \in \mathbb{R}$. Let $A \subset S$ be a subset such that there is some $\alpha > 0$ such that $A \cap (c - \alpha, c + \alpha) = S \cap (c - \alpha, c + \alpha)$. Let $f \colon S \to \mathbb{R}$ be a function.
(i) The point $c$ is a cluster point of $A$ if and only if $c$ is a cluster point of $S$.
(ii) Supposing $c$ is a cluster point of $S$, then $f(x) \to L$ as $x \to c$ if and only if $f|_A(x) \to L$ as $x \to c$.
Proof. First let $c$ be a cluster point of $A$. Since $A \subset S$, if $(A \setminus \{c\}) \cap (c - \varepsilon, c + \varepsilon)$ is nonempty for every $\varepsilon > 0$, then $(S \setminus \{c\}) \cap (c - \varepsilon, c + \varepsilon)$ is nonempty for every $\varepsilon > 0$, and thus $c$ is a cluster point of $S$. On the other hand, if $c$ is a cluster point of $S$, then for $\varepsilon > 0$ such that $\varepsilon < \alpha$ we get that $(A \setminus \{c\}) \cap (c - \varepsilon, c + \varepsilon) = (S \setminus \{c\}) \cap (c - \varepsilon, c + \varepsilon)$. This is true for all $\varepsilon < \alpha$ and hence $(A \setminus \{c\}) \cap (c - \varepsilon, c + \varepsilon)$ must be nonempty for all $\varepsilon > 0$. Thus $c$ is a cluster point of $A$.
Now suppose that $f(x) \to L$ as $x \to c$. Hence for every $\varepsilon > 0$ there is a $\delta > 0$ such that if $x \in S \setminus \{c\}$ and $|x - c| < \delta$, then $|f(x) - L| < \varepsilon$. As $A \subset S$, if $x$ is in $A \setminus \{c\}$, then $x$ is in $S \setminus \{c\}$, and hence $f|_A(x) \to L$ as $x \to c$.
Now suppose that $f|_A(x) \to L$ as $x \to c$. Hence for every $\varepsilon > 0$ there is a $\delta > 0$ such that if $x \in A \setminus \{c\}$ and $|x - c| < \delta$, then $|f|_A(x) - L| < \varepsilon$. If we picked $\delta > \alpha$, then set $\delta := \alpha$. If $|x - c| < \delta$, then $x \in S \setminus \{c\}$ if and only if $x \in A \setminus \{c\}$. Thus $|f(x) - L| = |f|_A(x) - L| < \varepsilon$.
3.1.5 Exercises
Exercise 3.1.1: Find the limit or prove that the limit does not exist
a) $\lim_{x \to c} \sqrt{x}$, for $c \geq 0$.
b) $\lim_{x \to c} x^2 + x + 1$, for any $c \in \mathbb{R}$.
c) $\lim_{x \to 0} x^2 \cos(1/x)$
d) $\lim_{x \to 0} \sin(1/x) \cos(1/x)$
e) $\lim_{x \to 0} \sin(x) \cos(1/x)$
Exercise 3.1.2: Prove Corollary 3.1.10.
Exercise 3.1.3: Prove Corollary 3.1.11.
Exercise 3.1.4: Prove Corollary 3.1.12.
Exercise 3.1.5: Let $A \subset S$. Show that if $c$ is a cluster point of $A$, then $c$ is a cluster point of $S$. Note the difference from Proposition 3.1.14.
Exercise 3.1.6: Let $A \subset S$. Suppose that $c$ is a cluster point of $A$ and it is also a cluster point of $S$. Let $f \colon S \to \mathbb{R}$ be a function. Show that if $f(x) \to L$ as $x \to c$, then $f|_A(x) \to L$ as $x \to c$. Note the difference from Proposition 3.1.14.
Exercise 3.1.7: Find an example of a function $f \colon [-1, 1] \to \mathbb{R}$ such that for $A := [0, 1]$, the restriction $f|_A(x) \to 0$ as $x \to 0$, but the limit of $f(x)$ as $x \to 0$ does not exist. Note why you cannot apply Proposition 3.1.14.
Exercise 3.1.8: Find example functions $f$ and $g$ such that the limit of neither $f(x)$ nor $g(x)$ exists as $x \to 0$, but such that the limit of $f(x) + g(x)$ exists as $x \to 0$.
3.2 Continuous functions
Note: 2.5 lectures
You have undoubtedly heard of continuous functions in your schooling. A high school criterion for this concept is that a function is continuous if we can draw its graph without lifting the pen from the paper. While that intuitive concept may be useful in simple situations, we will require a rigorous concept. The following definition took three great mathematicians (Bolzano, Cauchy, and finally Weierstrass) to get correctly and its final form dates only to the late 1800s.
3.2.1 Definition and basic properties
Definition 3.2.1. Let $S \subset \mathbb{R}$. Let $c \in S$ be a number. Let $f \colon S \to \mathbb{R}$ be a function. We say that $f$ is continuous at $c$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that $|x - c| < \delta$, $x \in S$, implies that $|f(x) - f(c)| < \varepsilon$.
When $f \colon S \to \mathbb{R}$ is continuous at all $c \in S$, then we simply say that $f$ is a continuous function. Sometimes we say that $f$ is continuous on $A \subset S$. Then we mean that $f$ is continuous at all $c \in A$. It is left as an exercise to prove that if $f$ is continuous on $A$, then $f|_A$ is continuous.
This definition is one of the most important to get correctly in analysis, and it is not an easy one to understand. Note that $\delta$ not only depends on $\varepsilon$, but also on $c$. That is, we do not have to pick one $\delta$ for all $c \in S$.
It is no accident that the definition of a continuous function is similar to the definition of a limit of a function. The main feature of continuous functions is that these are precisely the functions that behave nicely with limits.
Proposition 3.2.2. Suppose that $f \colon S \to \mathbb{R}$ is a function and $c \in S$. Then
(i) If $c$ is not a cluster point of $S$, then $f$ is continuous at $c$.
(ii) If $c$ is a cluster point of $S$, then $f$ is continuous at $c$ if and only if the limit of $f(x)$ as $x \to c$ exists and $\lim_{x \to c} f(x) = f(c)$.
(iii) $f$ is continuous at $c$ if and only if for every sequence $\{x_n\}$ where $x_n \in S$ and $\lim x_n = c$, the sequence $\{f(x_n)\}$ converges to $f(c)$.
Proof. Let us start with the first item. Suppose that $c$ is not a cluster point of $S$. Then there exists a $\delta > 0$ such that $S \cap (c - \delta, c + \delta) = \{c\}$. Therefore, for any $\varepsilon > 0$, simply pick this given delta. The only $x \in S$ such that $|x - c| < \delta$ is $x = c$. Therefore $|f(x) - f(c)| = |f(c) - f(c)| = 0 < \varepsilon$.
Let us move to the second item. Suppose that $c$ is a cluster point of $S$. Let us first suppose that $\lim_{x \to c} f(x) = f(c)$. Then for every $\varepsilon > 0$ there is a $\delta > 0$ such that if $x \in S \setminus \{c\}$ and $|x - c| < \delta$,
then $|f(x) - f(c)| < \varepsilon$. As $|f(c) - f(c)| = 0 < \varepsilon$, then the definition of continuity at $c$ is satisfied. On the other hand, suppose that $f$ is continuous at $c$. For every $\varepsilon > 0$, there exists a $\delta > 0$ such that for $x \in S$ where $|x - c| < \delta$ we have $|f(x) - f(c)| < \varepsilon$. Then the statement is, of course, still true if $x \in S \setminus \{c\} \subset S$. Therefore $\lim_{x \to c} f(x) = f(c)$.
For the third item, suppose that $f$ is continuous at $c$. Let $\{x_n\}$ be a sequence such that $x_n \in S$ and $\lim x_n = c$. Let $\varepsilon > 0$ be given. Find $\delta > 0$ such that $|f(x) - f(c)| < \varepsilon$ for all $x \in S$ such that $|x - c| < \delta$. Now find an $M \in \mathbb{N}$ such that for $n \geq M$ we have $|x_n - c| < \delta$. Then for $n \geq M$ we have that $|f(x_n) - f(c)| < \varepsilon$, so $\{f(x_n)\}$ converges to $f(c)$.
Let us prove the converse by contrapositive. Suppose that $f$ is not continuous at $c$. This means that there exists an $\varepsilon > 0$ such that for all $\delta > 0$, there exists an $x \in S$ such that $|x - c| < \delta$ and $|f(x) - f(c)| \geq \varepsilon$. Let us define a sequence $x_n$ as follows. Let $x_n \in S$ be such that $|x_n - c| < 1/n$ and $|f(x_n) - f(c)| \geq \varepsilon$. As $f$ is not continuous at $c$, we can do this. Now $\{x_n\}$ is a sequence of numbers in $S$ such that $\lim x_n = c$ and such that $|f(x_n) - f(c)| \geq \varepsilon$ for all $n \in \mathbb{N}$. Thus $\{f(x_n)\}$ does not converge to $f(c)$ (it may or may not converge, but it definitely does not converge to $f(c)$).
The last item in the proposition is particularly powerful. It allows us to quickly apply what we know about limits of sequences to continuous functions and even to prove that certain functions are continuous.
Example 3.2.3: $f \colon (0, \infty) \to \mathbb{R}$ defined by $f(x) := 1/x$ is continuous.
Proof: Fix $c \in (0, \infty)$. Let $\{x_n\}$ be a sequence in $(0, \infty)$ such that $\lim x_n = c$. Then we know that
\[ \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} \frac{1}{x_n} = \frac{1}{\lim x_n} = \frac{1}{c} = f(c) . \]
Thus $f$ is continuous at $c$. As $f$ is continuous at all $c \in (0, \infty)$, $f$ is continuous.
We have previously shown that $\lim_{x \to c} x^2 = c^2$ directly. Therefore the function $x^2$ is continuous. However, we can use the continuity of algebraic operations with respect to limits of sequences we have proved in the previous chapter to prove a much more general result.
Proposition 3.2.4. Let $f \colon \mathbb{R} \to \mathbb{R}$ be a polynomial. That is
\[ f(x) = a_d x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0 , \]
for some constants $a_0, a_1, \ldots, a_d$. Then $f$ is continuous.
Proof. Fix $c \in \mathbb{R}$. Let $\{x_n\}$ be a sequence such that $\lim x_n = c$. Then
\[ \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} \bigl( a_d x_n^d + a_{d-1} x_n^{d-1} + \cdots + a_1 x_n + a_0 \bigr) = a_d (\lim x_n)^d + a_{d-1} (\lim x_n)^{d-1} + \cdots + a_1 (\lim x_n) + a_0 = a_d c^d + a_{d-1} c^{d-1} + \cdots + a_1 c + a_0 = f(c) . \]
Thus $f$ is continuous at $c$. As $f$ is continuous at all $c \in \mathbb{R}$, $f$ is continuous.
By similar reasoning, or by appealing to Corollary 3.1.12 we can prove the following. The details of the proof are left as an exercise.
Proposition 3.2.5. Let $f \colon S \to \mathbb{R}$ and $g \colon S \to \mathbb{R}$ be functions continuous at $c \in S$.
(i) The function $h \colon S \to \mathbb{R}$ defined by $h(x) := f(x) + g(x)$ is continuous at $c$.
(ii) The function $h \colon S \to \mathbb{R}$ defined by $h(x) := f(x) - g(x)$ is continuous at $c$.
(iii) The function $h \colon S \to \mathbb{R}$ defined by $h(x) := f(x) g(x)$ is continuous at $c$.
(iv) If $g(x) \neq 0$ for all $x \in S$, then the function $h \colon S \to \mathbb{R}$ defined by $h(x) := \frac{f(x)}{g(x)}$ is continuous at $c$.
Example 3.2.6: The functions $\sin(x)$ and $\cos(x)$ are continuous. In the following computations we use the sum-to-product trigonometric identities. We also use the simple facts that $|\sin(x)| \leq |x|$, $|\cos(x)| \leq 1$, and $|\sin(x)| \leq 1$.
\[ |\sin(x) - \sin(c)| = \Bigl| 2 \sin\Bigl(\frac{x-c}{2}\Bigr) \cos\Bigl(\frac{x+c}{2}\Bigr) \Bigr| = 2 \Bigl| \sin\Bigl(\frac{x-c}{2}\Bigr) \Bigr| \Bigl| \cos\Bigl(\frac{x+c}{2}\Bigr) \Bigr| \leq 2 \Bigl| \sin\Bigl(\frac{x-c}{2}\Bigr) \Bigr| \leq 2 \Bigl| \frac{x-c}{2} \Bigr| = |x - c| \]
\[ |\cos(x) - \cos(c)| = \Bigl| -2 \sin\Bigl(\frac{x-c}{2}\Bigr) \sin\Bigl(\frac{x+c}{2}\Bigr) \Bigr| = 2 \Bigl| \sin\Bigl(\frac{x-c}{2}\Bigr) \Bigr| \Bigl| \sin\Bigl(\frac{x+c}{2}\Bigr) \Bigr| \leq 2 \Bigl| \sin\Bigl(\frac{x-c}{2}\Bigr) \Bigr| \leq 2 \Bigl| \frac{x-c}{2} \Bigr| = |x - c| \]
The claim that $\sin$ and $\cos$ are continuous follows by taking an arbitrary sequence $\{x_n\}$ converging to $c$. Details are left to the reader.
3.2.2 Composition of continuous functions
You have probably already realized that one of the basic tools in constructing complicated functions out of simple ones is composition. A very useful property of continuous functions is that compositions of continuous functions are again continuous. Recall that for two functions $f$ and $g$, the composition $f \circ g$ is defined by $(f \circ g)(x) := f\bigl(g(x)\bigr)$.
Proposition 3.2.7. Let $A, B \subset \mathbb{R}$ and $f \colon B \to \mathbb{R}$ and $g \colon A \to B$ be functions. If $g$ is continuous at $c \in A$ and $f$ is continuous at $g(c)$, then $f \circ g \colon A \to \mathbb{R}$ is continuous at $c$.
Proof. Let $\{x_n\}$ be a sequence in $A$ such that $\lim x_n = c$. Then as $g$ is continuous at $c$, $\{g(x_n)\}$ converges to $g(c)$. As $f$ is continuous at $g(c)$, $\{f(g(x_n))\}$ converges to $f(g(c))$. Thus $f \circ g$ is continuous at $c$.
Example 3.2.8: Claim: $\bigl(\sin(1/x)\bigr)^2$ is a continuous function on $(0, \infty)$.
Proof: First note that $1/x$ is a continuous function on $(0, \infty)$ and $\sin(x)$ is a continuous function on $(0, \infty)$ (actually on all of $\mathbb{R}$, but $(0, \infty)$ is the range for $1/x$). Hence the composition $\sin(1/x)$ is continuous. We also know that $x^2$ is continuous on the interval $[-1, 1]$ (the range of $\sin$). Thus the composition $\bigl(\sin(1/x)\bigr)^2$ is also continuous on $(0, \infty)$.
3.2.3 Discontinuous functions
Let us spend a bit of time on discontinuous functions. If we state the contrapositive of the third item of Proposition 3.2.2 as a separate claim we get an easy to use test for discontinuities.
Proposition 3.2.9. Let $f \colon S \to \mathbb{R}$ be a function. Suppose that for some $c \in S$, there exists a sequence $\{x_n\}$, $x_n \in S$, and $\lim x_n = c$ such that $\{f(x_n)\}$ does not converge to $f(c)$ (or does not converge at all), then $f$ is not continuous at $c$.
We say that $f$ is discontinuous at $c$, or that it has a discontinuity at $c$.
Example 3.2.10: The function $f \colon \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := \begin{cases} -1 & \text{if } x < 0, \\ 1 & \text{if } x \geq 0, \end{cases} \]
is not continuous at $0$.
Proof: Simply take the sequence $\{-1/n\}$. Then $f(-1/n) = -1$ and so $\lim f(-1/n) = -1$, but $f(0) = 1$.
Example 3.2.11: For an extreme example we take the so-called Dirichlet function.
\[ f(x) := \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases} \]
The function $f$ is discontinuous at all $c \in \mathbb{R}$.
Proof: Suppose that $c$ is rational. Then we can take a sequence $\{x_n\}$ of irrational numbers such that $\lim x_n = c$. Then $f(x_n) = 0$ and so $\lim f(x_n) = 0$, but $f(c) = 1$. If $c$ is irrational, then take a sequence of rational numbers $\{x_n\}$ that converges to $c$. Then $\lim f(x_n) = 1$ but $f(c) = 0$.
As a final example, let us yet again test the limits of your intuition. Can there exist a function that is continuous on all irrational numbers, but discontinuous at all rational numbers? Note that there are rational numbers arbitrarily close to any irrational number. But, perhaps strangely, the answer is yes. The following example is called the Thomae function∗ or the popcorn function.
Example 3.2.12: Let $f \colon (0, 1) \to \mathbb{R}$ be defined by
\[ f(x) := \begin{cases} 1/k & \text{if } x = m/k \text{ where } m, k \in \mathbb{N} \text{ and } m \text{ and } k \text{ have no common divisors}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases} \]
Then $f$ is continuous at all $c \in (0, 1)$ that are irrational and is discontinuous at all rational $c$. See the graph of the function in Figure 3.2.
Figure 3.2: Graph of the "popcorn function."
Proof: Suppose that $c = m/k$ is rational. Then take a sequence of irrational numbers $\{x_n\}$ such that $\lim x_n = c$. Then $\lim f(x_n) = \lim 0 = 0$ but $f(c) = 1/k \neq 0$. So $f$ is discontinuous at $c$.
Now suppose that $c$ is irrational and hence $f(c) = 0$. Take a sequence $\{x_n\}$ of numbers in $(0, 1)$ such that $\lim x_n = c$. For a given $\varepsilon > 0$, find $K \in \mathbb{N}$ such that $1/K < \varepsilon$ by the Archimedean property. If $m/k$ is written in lowest terms (no common divisors) and $m/k \in (0, 1)$, then $m < k$. It is then obvious that there are only finitely many rational numbers in $(0, 1)$ whose denominator $k$ in lowest terms is less than $K$. Hence there is an $M$ such that for $n \geq M$, all the rational numbers $x_n$ have a denominator larger than or equal to $K$. Thus for $n \geq M$
\[ |f(x_n) - 0| = f(x_n) \leq 1/K < \varepsilon . \]
Therefore $f$ is continuous at irrational $c$.
∗ Named after the German mathematician Johannes Karl Thomae (1840–1921).
3.2.4 Exercises
Exercise 3.2.1: Using the definition of continuity directly prove that $f \colon \mathbb{R} \to \mathbb{R}$ defined by $f(x) := x^2$ is continuous.
Exercise 3.2.2: Using the definition of continuity directly prove that $f \colon (0, \infty) \to \mathbb{R}$ defined by $f(x) := 1/x$ is continuous.
Exercise 3.2.3: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) := \begin{cases} x & \text{if } x \text{ is rational}, \\ x^2 & \text{if } x \text{ is irrational}. \end{cases} \]
Using the definition of continuity directly prove that $f$ is continuous at $1$ and discontinuous at $2$.
Exercise 3.2.4: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) := \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
Is $f$ continuous? Prove your assertion.
Exercise 3.2.5: Let $f \colon \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) := \begin{cases} x \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
Is $f$ continuous? Prove your assertion.
Exercise 3.2.6: Prove Proposition 3.2.5.
Exercise 3.2.7: Prove the following statement. Let $S \subset \mathbb{R}$ and $A \subset S$. Let $f \colon S \to \mathbb{R}$ be a continuous function. Then the restriction $f|_A$ is continuous.
Exercise 3.2.8: Suppose that $S \subset \mathbb{R}$. Suppose that for some $c \in \mathbb{R}$ and $\alpha > 0$, we have $A = (c - \alpha, c + \alpha) \subset S$. Let $f \colon S \to \mathbb{R}$ be a function. Prove that if $f|_A$ is continuous at $c$, then $f$ is continuous at $c$.
Exercise 3.2.9: Give an example of functions $f \colon \mathbb{R} \to \mathbb{R}$ and $g \colon \mathbb{R} \to \mathbb{R}$ such that the function $h$ defined by $h(x) := f(x) + g(x)$ is continuous, but $f$ and $g$ are not continuous. Can you find $f$ and $g$ that are nowhere continuous, but $h$ is a continuous function?
Exercise 3.2.10: Let $f \colon \mathbb{R} \to \mathbb{R}$ and $g \colon \mathbb{R} \to \mathbb{R}$ be continuous functions. Suppose that for all rational numbers $r$, $f(r) = g(r)$. Show that $f(x) = g(x)$ for all $x$.
Exercise 3.2.11: Let $f \colon \mathbb{R} \to \mathbb{R}$ be continuous. Suppose that $f(c) > 0$. Show that there exists an $\alpha > 0$ such that for all $x \in (c - \alpha, c + \alpha)$ we have $f(x) > 0$.
Exercise 3.2.12: Let $f \colon \mathbb{Z} \to \mathbb{R}$ be a function. Show that $f$ is continuous.
3.3 Min-max and intermediate value theorems
Note: 1.5 lectures
Let us now state and prove some very important results about continuous functions defined on the real line. In particular, on closed bounded intervals of the real line.
3.3.1 Min-max theorem
Recall that a function $f \colon [a, b] \to \mathbb{R}$ is bounded if there exists a $B \in \mathbb{R}$ such that $|f(x)| < B$ for all $x \in [a, b]$. We have the following lemma.
Lemma 3.3.1. Let $f \colon [a, b] \to \mathbb{R}$ be a continuous function. Then $f$ is bounded.
Proof. Let us prove this by contrapositive. Suppose that $f$ is not bounded, then for each $n \in \mathbb{N}$, there is an $x_n \in [a, b]$, such that $|f(x_n)| \geq n$. Now $\{x_n\}$ is a bounded sequence as $a \leq x_n \leq b$. By the Bolzano-Weierstrass theorem, there is a convergent subsequence $\{x_{n_i}\}$. Let $x := \lim x_{n_i}$. Since $a \leq x_{n_i} \leq b$ for all $i$, then $a \leq x \leq b$. The limit $\lim f(x_{n_i})$ does not exist as the sequence is not bounded as $|f(x_{n_i})| \geq n_i \geq i$. On the other hand $f(x)$ is a finite number and
\[ f(x) = f\Bigl( \lim_{i \to \infty} x_{n_i} \Bigr) . \]
Thus $f$ is not continuous at $x$.
Recall from calculus that $f \colon S \to \mathbb{R}$ achieves an absolute minimum at $c \in S$ if $f(x) \geq f(c)$ for all $x \in S$. On the other hand, $f$ achieves an absolute maximum at $c \in S$ if $f(x) \leq f(c)$ for all $x \in S$. We simply say that $f$ achieves an absolute minimum or an absolute maximum on $S$ if such a $c \in S$ exists. It turns out that if $S$ is a closed and bounded interval, then $f$ must have an absolute minimum and an absolute maximum. The main point will not be just that $f$ is bounded, but that the minimum and the maximum are actually achieved.
Theorem 3.3.2 (Minimum-maximum theorem). Let $f \colon [a, b] \to \mathbb{R}$ be a continuous function. Then $f$ achieves both an absolute minimum and an absolute maximum on $[a, b]$.
b].3. b]) = { f (x) : x ∈ [a. yn are in [a. So it is important that we are looking at a bounded interval. Therefore. The problem is that the sequences {xn } and {yn } need not converge. where xn . Example 3. Similarly. such that n→∞ lim f (xn ) = inf f ([a.3. Therefore. Example 3.3. Deﬁne f : [0. 1) achieves neither a minimum. Now we apply that a limit of a subsequence is the same as the limit of the sequence if it converged to get. so x and y are in [a. the values of the function approach 1 as well but f (x) > 1 for all x ∈ (0. That is.3. b]). there are sequences { f (xn )} and { f (yn )}. There is no x ∈ (0. Let x := lim xni i→∞ and y := lim yni . We know that {xn } and {yn } are bounded (their elements belong to a bounded interval [a. Then the function does not achieve a maximum. b]). deﬁned on (0. From what we know about suprema and inﬁma. achieves neither a minimum.3. we have that a ≤ x ≤ b.6: Continuity is important. MINMAX AND INTERMEDIATE VALUE THEOREMS 93 Proof. If we instead took the domain to be [−10. .3. we need to ﬁnd where the minimum and the maxima are. 2] achieves a minimum at x = 0 when f (0) = 1. the set f ([a.5: The function f (x) := 1/x. 1) such that f (x) = 1. 1). Also as we approach x = 1. We are not done yet. 1] → R by f (x) := 1/x for x > 0 and let f (0) := 0. We have shown that f is bounded by the lemma. f achieves an absolute minimum at x and f achieves an absolute maximum at y.4: The function f (x) := x. b]) = lim f (xn ) = lim f (xni ) = f n→∞ i→∞ i→∞ lim xni = f (x). So it is important that we are looking at a closed interval. The values of the function are unbounded as we approach 0. Example 3. sup f ([a. b]. Instead the maximum would be achieved at either x = 10 or x = −10. b]) and n→∞ lim f (yn ) = sup f ([a. It achieves a maximum at x = 2 where f (2) = 5. nor a maximum. 10]. Hence there exist convergent subsequences {xni } and {yni }. i→∞ Then as a ≤ xni ≤ b. Do note that the domain of deﬁnition matters. 
b]} has a supremum and an inﬁmum. Similarly a ≤ y ≤ b. We apply the BolzanoWeierstrass theorem. deﬁned on the whole real line. nor a maximum. b]) that approach them.3: The function f (x) := x2 + 1 deﬁned on the interval [−1. then x = 2 would no longer be a maximum of f . The problem is that the function is not continuous at 0. b]) = lim f (yn ) = lim f (yni ) = f n→∞ i→∞ i→∞ lim yni = f (y). Example 3. and we apply the continuity of f to obtain inf f ([a. Let us show by examples that the different hypotheses of the theorem are truly necessary. there exist sequences in the set f ([a.
3.3.2 Bolzano's intermediate value theorem
Bolzano's intermediate value theorem is one of the cornerstones of analysis. It is sometimes called only the intermediate value theorem, or just Bolzano's theorem. To prove Bolzano's theorem we prove the following simpler lemma.
Lemma 3.3.7. Let $f \colon [a, b] \to \mathbb{R}$ be a continuous function. Suppose that $f(a) < 0$ and $f(b) > 0$. Then there exists a $c \in [a, b]$ such that $f(c) = 0$.
Proof. The proof will follow by defining two sequences $\{a_n\}$ and $\{b_n\}$ inductively as follows.
(i) Let $a_1 := a$ and $b_1 := b$.
(ii) If $f\bigl(\frac{a_n + b_n}{2}\bigr) \geq 0$, let $a_{n+1} := a_n$ and $b_{n+1} := \frac{a_n + b_n}{2}$.
(iii) If $f\bigl(\frac{a_n + b_n}{2}\bigr) < 0$, let $a_{n+1} := \frac{a_n + b_n}{2}$ and $b_{n+1} := b_n$.
From the definition of the two sequences it is obvious that if $a_n < b_n$, then $a_{n+1} < b_{n+1}$. Thus by induction $a_n < b_n$ for all $n$. Once we know that fact we can see that $a_n \leq a_{n+1}$ and $b_n \geq b_{n+1}$ for all $n$. Finally we notice that
\[ b_{n+1} - a_{n+1} = \frac{b_n - a_n}{2} . \]
By induction we can see that
\[ b_n - a_n = \frac{b_1 - a_1}{2^{n-1}} = 2^{1-n} (b - a) . \]
As $\{a_n\}$ and $\{b_n\}$ are monotone and bounded (both lie in $[a, b]$), they converge. Let $c := \lim a_n$ and $d := \lim b_n$. As $a_n < b_n$ for all $n$, then $c \leq d$. Furthermore, as $a_n$ is increasing and $b_n$ is decreasing, $c$ is the supremum of the $a_n$ and $d$ is the infimum of the $b_n$. Thus $d - c \leq b_n - a_n$ for all $n$. Thus
\[ |d - c| = d - c \leq b_n - a_n \leq 2^{1-n} (b - a) \]
for all $n$. As $2^{1-n}(b - a) \to 0$ as $n \to \infty$, we see that $c = d$. By construction, for all $n$
\[ f(a_n) < 0 \quad \text{and} \quad f(b_n) \geq 0 . \]
We can use the fact that $\lim a_n = \lim b_n = c$, and use continuity of $f$ to take limits in those inequalities to get
\[ f(c) = \lim f(a_n) \leq 0 \quad \text{and} \quad f(c) = \lim f(b_n) \geq 0 . \]
As $f(c) \geq 0$ and $f(c) \leq 0$ we know that $f(c) = 0$.
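The inductive construction of $\{a_n\}$ and $\{b_n\}$ in the proof of Lemma 3.3.7 is an algorithm, and it can be sketched in a few lines. The code below is only an illustration under stated assumptions: the function name `bisect_lemma` and the parameter `steps` are made up here, the arithmetic is floating-point rather than exact, and the sample polynomial $x^3 - 2x^2 + x - 1$ on $[1, 2]$ is the one from Example 3.3.9.

```python
from typing import Callable, Tuple


def bisect_lemma(f: Callable[[float], float], a: float, b: float,
                 steps: int) -> Tuple[float, float]:
    """Run `steps` halvings of [a, b] following rules (ii) and (iii) of the proof.

    Assumes f(a) < 0 and f(b) > 0, as in the hypotheses of the lemma.
    Returns the pair (a_n, b_n) with n = steps + 1.
    """
    if not (f(a) < 0 and f(b) > 0):
        raise ValueError("need f(a) < 0 and f(b) > 0")
    for _ in range(steps):
        mid = (a + b) / 2
        if f(mid) >= 0:
            b = mid   # rule (ii): keep a_n, move b to the midpoint
        else:
            a = mid   # rule (iii): move a to the midpoint, keep b_n
    return a, b


def sample_polynomial(x: float) -> float:
    """The polynomial x^3 - 2x^2 + x - 1 from Example 3.3.9."""
    return x**3 - 2 * x**2 + x - 1


if __name__ == "__main__":
    for steps in (1, 2, 3, 20):
        lo, hi = bisect_lemma(sample_polynomial, 1.0, 2.0, steps)
        print(f"after {steps} steps: root lies in [{lo}, {hi}]")
```

Each pass halves the interval, so after $n - 1$ passes its length is $2^{1-n}(b - a)$, exactly as computed in the proof. The invariant $f(a_n) < 0 \leq f(b_n)$ is what guarantees a zero stays trapped between the endpoints.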
Notice that the proof tells us how to find the $c$. Therefore the proof is not only useful for us pure mathematicians, but it is a very useful idea in applied mathematics.

Theorem 3.3.8 (Bolzano's intermediate value theorem). Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Suppose that there exists a $y$ such that $f(a) < y < f(b)$ or $f(a) > y > f(b)$. Then there exists a $c \in [a,b]$ such that $f(c) = y$.

The theorem says that a continuous function on a closed interval achieves all the values between the values at the endpoints.

Proof. If $f(a) < y < f(b)$, then define $g(x) := f(x) - y$. Then we see that $g(a) < 0$ and $g(b) > 0$ and we can apply Lemma 3.3.7 to $g$. If $g(c) = 0$, then $f(c) = y$.

Similarly if $f(a) > y > f(b)$, then define $g(x) := y - f(x)$. Then again $g(a) < 0$ and $g(b) > 0$ and we can apply Lemma 3.3.7. Again if $g(c) = 0$, then $f(c) = y$.

Of course as we said, if a function is continuous, then the restriction to a subset is continuous. So if $f \colon S \to \mathbb{R}$ is continuous and $[a,b] \subset S$, then $f|_{[a,b]}$ is also continuous. Hence, we generally apply the theorem to a function continuous on some large set $S$, but we restrict attention to an interval.

Example 3.3.9: The polynomial $f(x) := x^3 - 2x^2 + x - 1$ has a real root in $[1,2]$. We simply notice that $f(1) = -1$ and $f(2) = 1$. Hence there must exist a point $c \in [1,2]$ such that $f(c) = 0$. To find a better approximation of the root we could follow the proof of Lemma 3.3.7. For example, next we would look at 1.5 and find that $f(1.5) = -0.625$. Therefore, there is a root of the equation in $[1.5, 2]$. Next we look at 1.75 and note that $f(1.75) \approx -0.016$. Hence there is a root of $f$ in $[1.75, 2]$. Next we look at 1.875 and find that $f(1.875) \approx 0.44$, thus there is a root in $[1.75, 1.875]$. We follow this procedure until we gain sufficient precision.

The technique above is the simplest method of finding roots of polynomials. Finding roots of polynomials is perhaps the most common problem in applied mathematics. In general it is very hard to do quickly, precisely and automatically. We can use the intermediate value theorem to find roots for any continuous function, not just a polynomial.

There are better and faster methods of finding roots of equations, for example Newton's method. One advantage of the above method is its simplicity. Another advantage is that the moment we find an initial interval where the intermediate value theorem can be applied, we are guaranteed that we will find a root up to a desired precision after finitely many steps.

Do note that the theorem guarantees at least one $c$ such that $f(c) = y$. There could be many different roots of the equation $f(c) = y$. If we follow the procedure of the proof, we are guaranteed to find approximations to one such root. We will need to work harder to find any other roots that may exist.

Let us prove the following interesting result about polynomials. Note that polynomials of even degree may not have any real roots. For example, there is no real number $x$ such that $x^2 + 1 = 0$. Odd polynomials, on the other hand, always have at least one real root.

Proposition 3.3.10. Let $f(x)$ be a polynomial of odd degree. Then $f$ has a real root.
Proof. Suppose $f$ is a polynomial of odd degree $d$. Then we can write
\[ f(x) = a_d x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0 , \]
where $a_d \neq 0$. We can divide by $a_d$ to obtain a polynomial
\[ g(x) = x^d + b_{d-1} x^{d-1} + \cdots + b_1 x + b_0 , \]
where $b_k = a_k / a_d$. We look at the sequence $\{g(n)\}$ for $n \in \mathbb{N}$. We look at
\[
\begin{aligned}
\left| \frac{b_{d-1} n^{d-1} + \cdots + b_1 n + b_0}{n^d} \right|
&= \frac{\left| b_{d-1} n^{d-1} + \cdots + b_1 n + b_0 \right|}{n^d} \\
&\leq \frac{|b_{d-1}| n^{d-1} + \cdots + |b_1| n + |b_0|}{n^d} \\
&\leq \frac{|b_{d-1}| n^{d-1} + \cdots + |b_1| n^{d-1} + |b_0| n^{d-1}}{n^d} \\
&= \frac{n^{d-1} \bigl( |b_{d-1}| + \cdots + |b_1| + |b_0| \bigr)}{n^d} \\
&= \frac{1}{n} \bigl( |b_{d-1}| + \cdots + |b_1| + |b_0| \bigr) .
\end{aligned}
\]
Therefore
\[ \lim_{n \to \infty} \frac{b_{d-1} n^{d-1} + \cdots + b_1 n + b_0}{n^d} = 0 . \]
Thus there exists an $M \in \mathbb{N}$ such that
\[ \left| \frac{b_{d-1} M^{d-1} + \cdots + b_1 M + b_0}{M^d} \right| < 1 , \]
or in other words $\left| b_{d-1} M^{d-1} + \cdots + b_1 M + b_0 \right| < M^d$. Therefore $g(M) > 0$.

Next we look at the sequence $\{g(-n)\}$. By a similar argument (exercise) we find that there exists some $K \in \mathbb{N}$ such that $b_{d-1} (-K)^{d-1} + \cdots + b_1 (-K) + b_0 < K^d$ and therefore $g(-K) < 0$ (why?). In the proof make sure you use the fact that $d$ is odd. In particular, this means that $(-n)^d = -(n^d)$.

Now we appeal to the intermediate value theorem, which implies that there must be a $c \in [-K, M]$ such that $g(c) = 0$. As $g(x) = \frac{f(x)}{a_d}$, we see that $f(c) = 0$, and the proof is done.

Example 3.3.11: An interesting fact is that there do exist discontinuous functions that have the intermediate value property. For example, the function
\[ f(x) := \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0, \end{cases} \]
is not continuous at 0, however it has the intermediate value property. That is, for any $a < b$, and any $y$ such that $f(a) < y < f(b)$ or $f(a) > y > f(b)$, there exists a $c$ such that $f(c) = y$. Proof is left as an exercise.

3.3.3 Exercises

Exercise 3.3.1: Find an example of a discontinuous function $f \colon [0,1] \to \mathbb{R}$ where the intermediate value theorem fails.

Exercise 3.3.2: Find an example of a bounded discontinuous function $f \colon [0,1] \to \mathbb{R}$ that has neither an absolute minimum nor an absolute maximum.

Exercise 3.3.3: Let $f \colon (0,1) \to \mathbb{R}$ be a continuous function such that $\lim_{x \to 0} f(x) = \lim_{x \to 1} f(x) = 0$. Show that $f$ achieves either an absolute minimum or an absolute maximum on $(0,1)$ (but perhaps not both).

Exercise 3.3.4: Let
\[ f(x) := \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
Show that $f$ has the intermediate value property. That is, for any $a < b$, if there exists a $y$ such that $f(a) < y < f(b)$ or $f(a) > y > f(b)$, then there exists a $c \in (a,b)$ such that $f(c) = y$.

Exercise 3.3.5: Suppose that $g(x)$ is a polynomial of odd degree $d$ such that $g(x) = x^d + b_{d-1} x^{d-1} + \cdots + b_1 x + b_0$, for some real numbers $b_0, b_1, \ldots, b_{d-1}$. Show that there exists a $K \in \mathbb{N}$ such that $g(-K) < 0$. Hint: Make sure to use the fact that $d$ is odd. You will have to use that $(-n)^d = -(n^d)$.

Exercise 3.3.6: Suppose that $g(x)$ is a polynomial of even degree $d$ such that $g(x) = x^d + b_{d-1} x^{d-1} + \cdots + b_1 x + b_0$, for some real numbers $b_0, b_1, \ldots, b_{d-1}$. Suppose that $g(0) < 0$. Show that $g$ has at least two distinct real roots.

Exercise 3.3.7: Suppose that $f \colon [a,b] \to \mathbb{R}$ is a continuous function. Prove that the direct image $f([a,b])$ is a closed and bounded interval.
3.4 Uniform continuity

Note: 1.5 lectures

3.4.1 Uniform continuity

We have made a fuss of saying that the $\delta$ in the definition of continuity depended on the point $c$. There are situations when it is advantageous to have a $\delta$ independent of any point. Let us therefore define this concept.

Definition 3.4.1. Let $S \subset \mathbb{R}$. Let $f \colon S \to \mathbb{R}$ be a function. Suppose that for any $\varepsilon > 0$ there exists a $\delta > 0$ such that whenever $x, c \in S$ and $|x - c| < \delta$, then $|f(x) - f(c)| < \varepsilon$. Then we say $f$ is uniformly continuous.

It is not hard to see that a uniformly continuous function must be continuous. The only difference in the definitions is that for a given $\varepsilon > 0$ we pick a $\delta > 0$ that works for all $c \in S$. That is, $\delta$ can no longer depend on $c$, it only depends on $\varepsilon$. Do note that the domain of definition of the function makes a difference now. A function that is not uniformly continuous on a larger set may be uniformly continuous when restricted to a smaller set.

Example 3.4.2: $f \colon (0,1) \to \mathbb{R}$, defined by $f(x) := 1/x$ is not uniformly continuous, but it is continuous. Given $\varepsilon > 0$, then for $\varepsilon > |1/x - 1/y|$ to hold we must have
\[ \varepsilon > |1/x - 1/y| = \frac{|y - x|}{|xy|} = \frac{|y - x|}{xy} , \]
or
\[ |x - y| < xy\varepsilon . \]
Therefore, to satisfy the definition of uniform continuity we would have to have $\delta \leq xy\varepsilon$ for all $x, y$ in $(0,1)$, but that would mean that $\delta \leq 0$. Therefore there is no single $\delta > 0$.

Example 3.4.3: $f \colon [0,1] \to \mathbb{R}$, defined by $f(x) := x^2$ is uniformly continuous. Write (note that $0 \leq x, c \leq 1$)
\[ |x^2 - c^2| = |x + c| \, |x - c| \leq (|x| + |c|) \, |x - c| \leq (1 + 1) |x - c| . \]
Therefore given $\varepsilon > 0$, let $\delta := \varepsilon/2$. Then if $|x - c| < \delta$, then $|x^2 - c^2| < \varepsilon$.

However, $f \colon \mathbb{R} \to \mathbb{R}$, defined by $f(x) := x^2$ is not uniformly continuous. Suppose it is, then for all $\varepsilon > 0$, there would exist a $\delta > 0$ such that if $|x - c| < \delta$, then $|x^2 - c^2| < \varepsilon$. Take $x > 0$ and let $c := x + \delta/2$. Write
\[ \varepsilon \geq |x^2 - c^2| = |x + c| \, |x - c| = (2x + \delta/2)\,\delta/2 \geq \delta x . \]
Therefore $x \leq \varepsilon/\delta$ for all $x > 0$, which is a contradiction.
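The contradiction argument for $x^2$ on the whole real line can be replayed numerically. The sketch below is only an illustration in floating point: for a proposed $\delta$ it produces the pair $x$, $c := x + \delta/2$ from the argument, with the hypothetical choice $x := 2\varepsilon/\delta$ (any $x > \varepsilon/\delta$ would do). The function name `witness` is an assumption made here for illustration.

```python
def witness(eps, delta):
    """Return (x, c) with |x - c| < delta but |x**2 - c**2| > eps.

    Follows the text: c = x + delta/2, so |x**2 - c**2| = (2x + delta/2) * delta/2 >= delta * x.
    Choosing x = 2 * eps / delta makes delta * x = 2 * eps, which exceeds eps.
    """
    x = 2 * eps / delta
    c = x + delta / 2
    return x, c


eps = 1.0
for delta in (1.0, 0.1, 0.001):
    x, c = witness(eps, delta)
    print(delta, x, abs(x - c), abs(x * x - c * c))
```

However small $\delta$ is chosen, the points simply move farther out along the real line, which is exactly why no single $\delta$ works for all $c$.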
we can. Proof. Theorem 3. For closed and bounded interval [a. By BolzanoWeierstrass. then f can be continuous. there exists a convergent subsequence {xnk }. we see that {ynk } converges and the limit is c.3.4. There exists an ε > 0 such that for every δ > 0. b] → R be a continuous function. As c − xnk  goes to zero as does 1/nk as k goes to inﬁnity. Let us suppose that f is not uniformly continuous. Proof. y in S with x − y < δ and  f (x) − f (y) ≥ ε.4. Let f : S → R be a uniformly continuous function. We now show that f is not continuous at c. Then there is a δ > 0 such that  f (x) − f (y) < ε whenever x − y < δ . k ≥ M we have  f (xn ) − f (xk ) < ε. Therefore. we can ﬁnd sequences {xn } and {yn } such that xn − yn  < 1/n and such that  f (xn ) − f (yn ) ≥ ε. b] where f is not continuous. but not uniformly continuous. We estimate  f (c) − f (xnk ) =  f (c) − f (ynk ) + f (ynk ) − f (xnk ) ≥  f (ynk ) − f (xnk ) −  f (c) − f (ynk ) ≥ ε −  f (c) − f (ynk ) . there exist points x. We will prove the statement by contrapositive. Let {xn } be a Cauchy sequence in S. then a ≤ c ≤ b. The main difference here is that for a Cauchy sequence we no longer know where the limit ends up and it may not end up in the domain of the function.2 Continuous extension Before we get to continuous extension. Let us negate the deﬁnition of uniformly continuous. Thus f cannot be continuous at c. It says that uniformly continuous functions behave nicely with respect to Cauchy sequences. k ≥ M we have xn − xk  < δ . 3. make the following statement. Then f is uniformly continuous.4. however. Write c − ynk  = c − xnk + xnk − ynk  ≤ c − xnk  + xnk − ynk  < c − xnk  + 1/nk . . UNIFORM CONTINUITY 99 We have seen that if f is deﬁned on an interval that is either not closed or not bounded. b].4. So for the ε > 0 above. we show the following useful lemma. Lemma 3. Or in other words  f (c) − f (xnk ) +  f (c) − f (ynk ) ≥ ε. Then for all n. Then { f (xn )} is Cauchy. and we will ﬁnd some c ∈ [a. 
Note that as a ≤ xnk ≤ b.5. Now ﬁnd an M ∈ N such that for all n. at least one of the sequences { f (xnk )} or { f (ynk )} cannot converge to f (c) (else the left hand side of the inequality goes to zero while the righthand side is positive). Let f : [a. Let c := lim xnk .4. Let ε > 0 be given.
An application of the above lemma is the following theorem. It says that a function on an open interval is uniformly continuous if and only if it can be extended to a continuous function on the closed interval.

Theorem 3.4.6. A function $f \colon (a,b) \to \mathbb{R}$ is uniformly continuous if and only if the limits
\[ L_a := \lim_{x \to a} f(x) \qquad \text{and} \qquad L_b := \lim_{x \to b} f(x) \]
exist and if the function $\tilde{f} \colon [a,b] \to \mathbb{R}$ defined by
\[ \tilde{f}(x) := \begin{cases} f(x) & \text{if } x \in (a,b), \\ L_a & \text{if } x = a, \\ L_b & \text{if } x = b, \end{cases} \]
is continuous.

Proof. One direction is not hard to prove. If $\tilde{f}$ is continuous, then it is uniformly continuous by Theorem 3.4.4. As $f$ is the restriction of $\tilde{f}$ to $(a,b)$, then $f$ is also uniformly continuous (easy exercise).

Now suppose that $f$ is uniformly continuous. We must first show that the limits $L_a$ and $L_b$ exist. Let us concentrate on $L_a$. Take a sequence $\{x_n\}$ in $(a,b)$ such that $\lim x_n = a$. The sequence is a Cauchy sequence and hence by Lemma 3.4.5, the sequence $\{f(x_n)\}$ is Cauchy and therefore convergent. We have some number $L_1 := \lim f(x_n)$. Now take another sequence $\{y_n\}$ in $(a,b)$ such that $\lim y_n = a$. By the same reasoning we get $L_2 := \lim f(y_n)$. If we can show that $L_1 = L_2$, then the limit $L_a = \lim_{x \to a} f(x)$ exists. Let $\varepsilon > 0$ be given, find $\delta > 0$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon/3$. Now find $M \in \mathbb{N}$ such that for $n \geq M$ we have $|a - x_n| < \delta/2$, $|a - y_n| < \delta/2$, $|f(x_n) - L_1| < \varepsilon/3$, and $|f(y_n) - L_2| < \varepsilon/3$. Then for $n \geq M$ we have
\[ |x_n - y_n| = |x_n - a + a - y_n| \leq |x_n - a| + |a - y_n| < \delta/2 + \delta/2 = \delta . \]
So
\[
\begin{aligned}
|L_1 - L_2| &= |L_1 - f(x_n) + f(x_n) - f(y_n) + f(y_n) - L_2| \\
&\leq |L_1 - f(x_n)| + |f(x_n) - f(y_n)| + |f(y_n) - L_2| \\
&\leq \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon .
\end{aligned}
\]
As $\varepsilon > 0$ was arbitrary, $L_1 = L_2$. Thus $L_a$ exists. To show that $L_b$ exists is left as an exercise.

Now that we know that the limits $L_a$ and $L_b$ exist, we are done. If $\lim_{x \to a} f(x)$ exists, then $\lim_{x \to a} \tilde{f}(x)$ exists (see Proposition 3.1.14). Similarly with $L_b$. Hence $\tilde{f}$ is continuous at $a$ and $b$. And since $f$ is continuous at $c \in (a,b)$, then $\tilde{f}$ is continuous at $c \in (a,b)$.
3.4.3 Lipschitz continuous functions

Definition 3.4.7. Let $f \colon S \to \mathbb{R}$ be a function such that there exists a number $K$ such that for all $x$ and $y$ in $S$ we have
\[ |f(x) - f(y)| \leq K |x - y| . \]
Then $f$ is said to be Lipschitz continuous.

A large class of functions is Lipschitz continuous. Be careful however. As for uniformly continuous functions, the domain of definition of the function is important. See the examples below and the exercises. First let us justify using the word "continuous."

Proposition 3.4.8. A Lipschitz continuous function is uniformly continuous.

Proof. Let $f \colon S \to \mathbb{R}$ be a function and let $K$ be a constant such that for all $x, y$ in $S$ we have $|f(x) - f(y)| \leq K |x - y|$. We may assume $K > 0$, since enlarging $K$ preserves the inequality. Let $\varepsilon > 0$ be given. Take $\delta := \varepsilon/K$. For any $x$ and $y$ in $S$ such that $|x - y| < \delta$ we have that
\[ |f(x) - f(y)| \leq K |x - y| < K\delta = K \frac{\varepsilon}{K} = \varepsilon . \]
Therefore $f$ is uniformly continuous.

We can interpret Lipschitz continuity geometrically. Suppose $f$ is a Lipschitz continuous function with some constant $K$. The inequality can be rewritten that for $x \neq y$ we have
\[ \left| \frac{f(x) - f(y)}{x - y} \right| \leq K . \]
The quantity $\frac{f(x) - f(y)}{x - y}$ is the slope of the line between the points $\bigl(x, f(x)\bigr)$ and $\bigl(y, f(y)\bigr)$. Therefore, $f$ is Lipschitz continuous with constant $K$ if every line that intersects the graph of $f$ in at least two points has slope at most $K$ in absolute value.

Example 3.4.9: The functions $\sin(x)$ and $\cos(x)$ are Lipschitz continuous. We have seen (Example 3.2.6) the following two inequalities:
\[ |\sin(x) - \sin(y)| \leq |x - y| \qquad \text{and} \qquad |\cos(x) - \cos(y)| \leq |x - y| . \]
Hence sin and cos are Lipschitz continuous with $K = 1$.

Example 3.4.10: The function $f \colon [1,\infty) \to \mathbb{R}$ defined by $f(x) := \sqrt{x}$ is Lipschitz continuous. We have
\[ \left| \sqrt{x} - \sqrt{y} \right| = \left| \frac{x - y}{\sqrt{x} + \sqrt{y}} \right| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} . \]
As $x \geq 1$ and $y \geq 1$, we can see that $\frac{1}{\sqrt{x} + \sqrt{y}} \leq \frac{1}{2}$. Therefore
\[ \left| \sqrt{x} - \sqrt{y} \right| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} \leq \frac{1}{2} |x - y| . \]

On the other hand $f \colon [0,\infty) \to \mathbb{R}$ defined by $f(x) := \sqrt{x}$ is not Lipschitz continuous. Let us see why. Suppose that we have
\[ \left| \sqrt{x} - \sqrt{y} \right| \leq K |x - y| , \]
for some $K$. Let $y = 0$ to obtain $\sqrt{x} \leq Kx$. If $K > 0$, then for $x > 0$ we then get $1/K \leq \sqrt{x}$. This cannot possibly be true for all $x > 0$. Thus no such $K > 0$ can exist and $f$ is not Lipschitz continuous.

Note that the last example shows an example of a function that is uniformly continuous but not Lipschitz continuous. To see that $\sqrt{x}$ is uniformly continuous on $[0,\infty)$ note that it is uniformly continuous on $[0,1]$ by Theorem 3.4.4. It is also Lipschitz (and therefore uniformly continuous) on $[1,\infty)$. It is not hard (exercise) to show that this means that $\sqrt{x}$ is uniformly continuous on $[0,\infty)$.

3.4.4 Exercises

Exercise 3.4.1: Let $f \colon S \to \mathbb{R}$ be uniformly continuous. Let $A \subset S$. Show that the restriction $f|_A$ is uniformly continuous.

Exercise 3.4.2: Let $f \colon (a,b) \to \mathbb{R}$ be a uniformly continuous function. Finish the proof of Theorem 3.4.6 by showing that the limit $\lim_{x \to b} f(x)$ exists.

Exercise 3.4.3: Show that $f \colon (c,\infty) \to \mathbb{R}$ for some $c > 0$ and defined by $f(x) := 1/x$ is Lipschitz continuous.

Exercise 3.4.4: Show that $f \colon (0,\infty) \to \mathbb{R}$ defined by $f(x) := 1/x$ is not Lipschitz continuous.

Exercise 3.4.5: Let $A, B$ be intervals. Let $f \colon A \to \mathbb{R}$ and $g \colon B \to \mathbb{R}$ be uniformly continuous functions such that $f(x) = g(x)$ for $x \in A \cap B$. Define the function $h \colon A \cup B \to \mathbb{R}$ by $h(x) := f(x)$ if $x \in A$ and $h(x) := g(x)$ if $x \in B \setminus A$. a) Prove that if $A \cap B \neq \emptyset$, then $h$ is uniformly continuous. b) Find an example where $A \cap B = \emptyset$ and $h$ is not even continuous.

Exercise 3.4.6: Let $f \colon \mathbb{R} \to \mathbb{R}$ be a polynomial of degree $d \geq 2$. Show that $f$ is not Lipschitz continuous.

Exercise 3.4.7: Let $f \colon (0,1) \to \mathbb{R}$ be a bounded continuous function. Show that the function $g(x) := x(1 - x) f(x)$ is uniformly continuous.

Exercise 3.4.8: Show that $f \colon (0,\infty) \to \mathbb{R}$ defined by $f(x) := \sin(1/x)$ is not uniformly continuous.

Exercise 3.4.9: Let $f \colon \mathbb{Q} \to \mathbb{R}$ be a uniformly continuous function. Show that there exists a uniformly continuous function $\tilde{f} \colon \mathbb{R} \to \mathbb{R}$ such that $f(x) = \tilde{f}(x)$ for all $x \in \mathbb{Q}$.
Chapter 4

The Derivative

4.1 The derivative

Note: 1 lecture

The idea of a derivative is the following. Let us suppose that a graph of a function looks locally like a straight line. We can then talk about the slope of this line. The slope tells us how fast the value of the function is changing at the particular point. Of course, we are leaving out any function that has corners or discontinuities. Let us be precise.

4.1.1 Definition and basic properties

Definition 4.1.1. Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ be a function, and let $c \in I$. Suppose that the limit
\[ L := \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \]
exists. Then we say that $f$ is differentiable at $c$ and we say that $L$ is the derivative of $f$ at $c$ and we write $f'(c) := L$.

If $f$ is differentiable at all $c \in I$, then we simply say that $f$ is differentiable, and then we obtain a function $f' \colon I \to \mathbb{R}$.

The expression $\frac{f(x) - f(c)}{x - c}$ is called the difference quotient.

The graphical interpretation of the derivative is depicted in Figure 4.1. The left-hand plot gives the line through $\bigl(c, f(c)\bigr)$ and $\bigl(x, f(x)\bigr)$ with slope $\frac{f(x) - f(c)}{x - c}$. As we take the limit as $x$ goes to $c$, we get the right-hand plot. On this plot we can see that the derivative of the function at the point $c$ is the slope of the line tangent to the graph of $f$ at the point $\bigl(c, f(c)\bigr)$.

Note that we allow $I$ to be a closed interval and we allow $c$ to be an endpoint of $I$. Some calculus books will not allow $c$ to be an endpoint of an interval, but all the theory still works by allowing it, and it will make our work easier.
[Figure 4.1: Graphical interpretation of the derivative. Left panel: the line through $(c, f(c))$ and $(x, f(x))$, with slope $\frac{f(x) - f(c)}{x - c}$. Right panel: the tangent line at $c$, with slope $f'(c)$.]
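Example 4.1.2: Let $f(x) := x^2$ defined on the whole real line. Then we find that
\[ \lim_{x \to c} \frac{x^2 - c^2}{x - c} = \lim_{x \to c} (x + c) = 2c . \]
Therefore $f'(c) = 2c$.

Example 4.1.3: The function $f(x) := |x|$ is not differentiable at the origin. When $x > 0$, then
\[ \frac{|x| - |0|}{x - 0} = 1 , \]
and when $x < 0$ we have
\[ \frac{|x| - |0|}{x - 0} = -1 . \]
The difference quotient therefore has no limit as $x$ goes to 0.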
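The two examples can be checked numerically by evaluating difference quotients at points closer and closer to $c$. This is a rough floating-point sketch, not a proof; the helper name `difference_quotient` and the sample points are choices made here for illustration.

```python
def difference_quotient(f, c, x):
    """Slope of the line through (c, f(c)) and (x, f(x)); requires x != c."""
    return (f(x) - f(c)) / (x - c)


def square(x):
    return x * x


# Example 4.1.2 at c = 3: the quotients approach 2 * 3 = 6.
for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(square, 3.0, 3.0 + h))

# Example 4.1.3 at c = 0: the quotient is 1 from the right and -1 from the left.
print(difference_quotient(abs, 0.0, 1e-6))
print(difference_quotient(abs, 0.0, -1e-6))
```

For $x^2$ the quotient equals $x + c$ exactly, so the printed values differ from 6 by about $h$; for $|x|$ the one-sided values never agree, no matter how small the step.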
A famous example of Weierstrass shows that there exists a continuous function that is not differentiable at any point. The construction of this function is beyond the scope of this book. On the other hand, a differentiable function is always continuous.

Proposition 4.1.4. Let $f \colon I \to \mathbb{R}$ be differentiable at $c \in I$, then it is continuous at $c$.

Proof. We know that the limits
\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = f'(c) \qquad \text{and} \qquad \lim_{x \to c} (x - c) = 0 \]
exist. Furthermore,
\[ f(x) - f(c) = \left( \frac{f(x) - f(c)}{x - c} \right) (x - c) . \]
Therefore the limit of $f(x) - f(c)$ exists and
\[ \lim_{x \to c} \bigl( f(x) - f(c) \bigr) = f'(c) \cdot 0 = 0 . \]
Hence $\lim_{x \to c} f(x) = f(c)$, and $f$ is continuous at $c$.
One of the most important properties of the derivative is linearity. The derivative is the approximation of a function by a straight line, that is, we are trying to approximate the function at a point by a linear function. It then makes sense that the derivative is linear.

Proposition 4.1.5. Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ and $g \colon I \to \mathbb{R}$ be differentiable at $c \in I$, and let $\alpha \in \mathbb{R}$.

(i) Define $h \colon I \to \mathbb{R}$ by $h(x) := \alpha f(x)$. Then $h$ is differentiable at $c$ and $h'(c) = \alpha f'(c)$.
(ii) Define $h \colon I \to \mathbb{R}$ by $h(x) := f(x) + g(x)$. Then $h$ is differentiable at $c$ and $h'(c) = f'(c) + g'(c)$.

Proof. First, let $h(x) = \alpha f(x)$. For $x \in I$, $x \neq c$ we have
\[ \frac{h(x) - h(c)}{x - c} = \frac{\alpha f(x) - \alpha f(c)}{x - c} = \alpha \frac{f(x) - f(c)}{x - c} . \]
The limit as $x$ goes to $c$ exists on the right by Corollary 3.1.12. We get
\[ \lim_{x \to c} \frac{h(x) - h(c)}{x - c} = \alpha \lim_{x \to c} \frac{f(x) - f(c)}{x - c} . \]
Therefore $h$ is differentiable at $c$, and the derivative is computed as given.

Next, define $h(x) := f(x) + g(x)$. For $x \in I$, $x \neq c$ we have
\[ \frac{h(x) - h(c)}{x - c} = \frac{\bigl( f(x) + g(x) \bigr) - \bigl( f(c) + g(c) \bigr)}{x - c} = \frac{f(x) - f(c)}{x - c} + \frac{g(x) - g(c)}{x - c} . \]
The limit as $x$ goes to $c$ exists on the right by Corollary 3.1.12. We get
\[ \lim_{x \to c} \frac{h(x) - h(c)}{x - c} = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} + \lim_{x \to c} \frac{g(x) - g(c)}{x - c} . \]
Therefore $h$ is differentiable at $c$ and the derivative is computed as given.

It is not true that the derivative of a product of two functions is the product of the derivatives. Instead we get the so-called product rule or the Leibniz rule∗.
∗ Named for the German mathematician Gottfried Wilhelm Leibniz (1646–1716).
Proposition 4.1.6 (Product Rule). Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ and $g \colon I \to \mathbb{R}$ be functions differentiable at $c$. If $h \colon I \to \mathbb{R}$ is defined by
\[ h(x) := f(x) g(x) , \]
then $h$ is differentiable at $c$ and
\[ h'(c) = f(c) g'(c) + f'(c) g(c) . \]

The proof of the product rule is left as an exercise. The key is to use the identity $f(x)g(x) - f(c)g(c) = f(x) \bigl( g(x) - g(c) \bigr) + g(c) \bigl( f(x) - f(c) \bigr)$.

Proposition 4.1.7 (Quotient Rule). Let $I$ be an interval, let $f \colon I \to \mathbb{R}$ and $g \colon I \to \mathbb{R}$ be differentiable at $c$ and $g(x) \neq 0$ for all $x \in I$. If $h \colon I \to \mathbb{R}$ is defined by
\[ h(x) := \frac{f(x)}{g(x)} , \]
then $h$ is differentiable at $c$ and
\[ h'(c) = \frac{f'(c) g(c) - f(c) g'(c)}{\bigl( g(c) \bigr)^2} . \]

Again the proof is left as an exercise.
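The product and quotient formulas can be sanity-checked numerically, which is a handy way to catch a sign slip. The sketch below compares a central-difference estimate with the two formulas for the illustrative pair $f = \sin$, $g = \exp$ at $c = 0.7$; it assumes the standard facts $\sin' = \cos$ and $\exp' = \exp$, which are not established in this section, and the helper `derivative` is only a numerical stand-in.

```python
import math


def derivative(f, c, h=1e-6):
    """Central-difference estimate of f'(c); an approximation, not a proof."""
    return (f(c + h) - f(c - h)) / (2 * h)


f, df = math.sin, math.cos   # assumed: the derivative of sin is cos
g, dg = math.exp, math.exp   # assumed: the derivative of exp is exp
c = 0.7

product_numeric = derivative(lambda x: f(x) * g(x), c)
product_formula = f(c) * dg(c) + df(c) * g(c)

quotient_numeric = derivative(lambda x: f(x) / g(x), c)
quotient_formula = (df(c) * g(c) - f(c) * dg(c)) / g(c) ** 2

print(product_numeric, product_formula)
print(quotient_numeric, quotient_formula)
```

Agreement to many digits is evidence that the formulas are stated correctly, but the actual proofs are the exercises below.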
4.1.2 Chain rule
A useful rule for computing derivatives is the chain rule.

Proposition 4.1.8 (Chain Rule). Let $I_1, I_2$ be intervals, let $g \colon I_1 \to I_2$ be differentiable at $c \in I_1$, and $f \colon I_2 \to \mathbb{R}$ be differentiable at $g(c)$. If $h \colon I_1 \to \mathbb{R}$ is defined by
\[ h(x) := (f \circ g)(x) = f\bigl( g(x) \bigr) , \]
then $h$ is differentiable at $c$ and
\[ h'(c) = f'\bigl( g(c) \bigr) g'(c) . \]

Proof. Let $d := g(c)$. Define
\[
u(y) := \begin{cases} \frac{f(y) - f(d)}{y - d} & \text{if } y \neq d, \\ f'(d) & \text{if } y = d, \end{cases}
\qquad
v(x) := \begin{cases} \frac{g(x) - g(c)}{x - c} & \text{if } x \neq c, \\ g'(c) & \text{if } x = c. \end{cases}
\]
By the definition of the limit we see that $\lim_{y \to d} u(y) = f'(d)$ and $\lim_{x \to c} v(x) = g'(c)$ (the functions $u$ and $v$ are continuous at $d$ and $c$ respectively). Therefore,
\[ f(y) - f(d) = u(y)(y - d) \qquad \text{and} \qquad g(x) - g(c) = v(x)(x - c) . \]
We plug in to obtain
\[ h(x) - h(c) = f\bigl( g(x) \bigr) - f\bigl( g(c) \bigr) = u\bigl( g(x) \bigr) \bigl( g(x) - g(c) \bigr) = u\bigl( g(x) \bigr) \bigl( v(x)(x - c) \bigr) . \]
Therefore, for $x \neq c$,
\[ \frac{h(x) - h(c)}{x - c} = u\bigl( g(x) \bigr) v(x) . \]
We note that $\lim_{x \to c} v(x) = g'(c)$, that $g$ is continuous at $c$, that is $\lim_{x \to c} g(x) = g(c)$, and finally that $\lim_{y \to g(c)} u(y) = f'\bigl( g(c) \bigr)$. Therefore the limit of the right-hand side exists and is equal to $f'\bigl( g(c) \bigr) g'(c)$. Thus $h$ is differentiable at $c$ and the limit is $f'\bigl( g(c) \bigr) g'(c)$.

4.1.3 Exercises

Exercise 4.1.1: Prove the product rule. Hint: Use $f(x)g(x) - f(c)g(c) = f(x) \bigl( g(x) - g(c) \bigr) + g(c) \bigl( f(x) - f(c) \bigr)$.

Exercise 4.1.2: Prove the quotient rule. Hint: You can do this directly, but it may be easier to find the derivative of $1/x$ and then use the chain rule and the product rule.

Exercise 4.1.3: Prove that $x^n$ is differentiable and find the derivative. Hint: Use the product rule.

Exercise 4.1.4: Prove that a polynomial is differentiable and find the derivative. Hint: Use the previous exercise.

Exercise 4.1.5: Let
\[ f(x) := \begin{cases} x^2 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{otherwise.} \end{cases} \]
Prove that $f$ is differentiable at 0, but discontinuous at all points except 0.

Exercise 4.1.6: Assume the inequality $|x - \sin(x)| \leq x^2$. Prove that sin is differentiable at 0, and find the derivative at 0.

Exercise 4.1.7: Using the previous exercise, prove that sin is differentiable at all $x$ and that the derivative is $\cos(x)$. Hint: Use the sum-to-product trigonometric identity as we did before.

Exercise 4.1.8: Let $f \colon I \to \mathbb{R}$ be differentiable. Define $f^n$ to be the function defined by $f^n(x) := \bigl( f(x) \bigr)^n$. Prove that $(f^n)'(x) = n \bigl( f(x) \bigr)^{n-1} f'(x)$.
Exercise 4.1.9: Suppose that $f \colon \mathbb{R} \to \mathbb{R}$ is a differentiable Lipschitz continuous function. Prove that $f'$ is a bounded function.

Exercise 4.1.10: Let $I_1, I_2$ be intervals. Let $f \colon I_1 \to I_2$ be a bijective function and $g \colon I_2 \to I_1$ be the inverse. Suppose that both $f$ is differentiable at $c \in I_1$ and $f'(c) \neq 0$ and $g$ is differentiable at $f(c)$. Use the chain rule to find a formula for $g'\bigl( f(c) \bigr)$ (in terms of $f'(c)$).
4.2 Mean value theorem

Note: 2 lectures (some applications may be skipped)

4.2.1 Relative minima and maxima

Definition 4.2.1. Let $S \subset \mathbb{R}$ be a set and let $f \colon S \to \mathbb{R}$ be a function. The function $f$ is said to have a relative maximum at $c \in S$ if there exists a $\delta > 0$ such that for all $x \in S$ such that $|x - c| < \delta$ we have $f(x) \leq f(c)$. The definition of relative minimum is analogous.

Theorem 4.2.2. Let $f \colon [a,b] \to \mathbb{R}$ be a function differentiable at $c \in (a,b)$, and suppose $c$ is a relative minimum or a relative maximum of $f$. Then $f'(c) = 0$.

Proof. We will prove the statement for a maximum. For a minimum the statement follows by considering the function $-f$.

Let $c$ be a relative maximum of $f$. In particular, for the $\delta > 0$ from the definition, as long as $|x - c| < \delta$ we have $f(x) - f(c) \leq 0$. Then we look at the difference quotient. If $x > c$ we note that
\[ \frac{f(x) - f(c)}{x - c} \leq 0 , \]
and if $x < c$ we have
\[ \frac{f(x) - f(c)}{x - c} \geq 0 . \]
We now take sequences $\{x_n\}$ and $\{y_n\}$, such that $x_n > c$, and $y_n < c$ for all $n \in \mathbb{N}$, and such that $\lim x_n = \lim y_n = c$. Since $f$ is differentiable at $c$ we know that
\[ 0 \geq \lim_{n \to \infty} \frac{f(x_n) - f(c)}{x_n - c} = f'(c) = \lim_{n \to \infty} \frac{f(y_n) - f(c)}{y_n - c} \geq 0 . \]
Hence $f'(c) = 0$.

4.2.2 Rolle's theorem

Suppose that a function is zero on both endpoints of an interval. Intuitively it should attain a minimum or a maximum in the interior of the interval. Then at a minimum or a maximum, the derivative should be zero. See Figure 4.2 for the geometric idea. This is the content of the so-called Rolle's theorem.

Theorem 4.2.3 (Rolle). Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function differentiable on $(a,b)$ such that $f(a) = f(b) = 0$. Then there exists a $c \in (a,b)$ such that $f'(c) = 0$.
[Figure 4.2: Point where the tangent line is horizontal, that is $f'(c) = 0$.]

Proof. As $f$ is continuous on $[a,b]$ it attains an absolute minimum and an absolute maximum in $[a,b]$. If it attains an absolute maximum at $c \in (a,b)$, then $c$ is also a relative maximum and we apply Theorem 4.2.2 to find that $f'(c) = 0$. If the absolute maximum is at $a$ or at $b$, then we look for the absolute minimum. If the absolute minimum is at $c \in (a,b)$, then again we find that $f'(c) = 0$. So suppose that the absolute minimum is also at $a$ or $b$. Hence the absolute minimum is 0 and the absolute maximum is 0, and therefore the function is identically zero. Thus $f'(x) = 0$ for all $x \in [a,b]$, so pick an arbitrary $c \in (a,b)$.

4.2.3 Mean value theorem

We extend Rolle's theorem to functions that attain different values at the endpoints.

Theorem 4.2.4 (Mean value theorem). Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function differentiable on $(a,b)$. Then there exists a point $c \in (a,b)$ such that
\[ f(b) - f(a) = f'(c)(b - a) . \]

Proof. The theorem follows easily from Rolle's theorem. Define the function $g \colon [a,b] \to \mathbb{R}$ by
\[ g(x) := f(x) - f(b) + \bigl( f(b) - f(a) \bigr) \frac{b - x}{b - a} . \]
Then we know that $g$ is a differentiable function on $(a,b)$ continuous on $[a,b]$ such that $g(a) = 0$ and $g(b) = 0$. Thus there exists $c \in (a,b)$ such that $g'(c) = 0$. That is,
\[ 0 = g'(c) = f'(c) + \bigl( f(b) - f(a) \bigr) \frac{-1}{b - a} . \]
Or in other words $f'(c)(b - a) = f(b) - f(a)$.
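The mean value theorem only asserts that some $c$ exists. In a concrete case the point can be located numerically. The following sketch treats the illustrative function $f(x) = x^3$ on $[0,2]$, where $f'(x) = 3x^2$ is continuous and satisfies $f'(a) < \text{slope} < f'(b)$; under that extra assumption (which the theorem itself does not need), halving the interval on $f'(x) - \text{slope}$ finds $c$. The name `mean_value_point` is hypothetical.

```python
def mean_value_point(df, slope, a, b, steps=60):
    """Approximate a c in (a, b) with df(c) == slope by repeated halving.

    Assumes df is continuous and df(a) < slope < df(b); this is an extra
    assumption made for the sketch, not a hypothesis of the theorem.
    """
    for _ in range(steps):
        mid = (a + b) / 2
        if df(mid) >= slope:
            b = mid
        else:
            a = mid
    return (a + b) / 2


def f(x):
    return x ** 3


def df(x):
    return 3 * x ** 2


a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # slope of the line through the endpoints
c = mean_value_point(df, slope, a, b)
print(c, df(c) * (b - a), f(b) - f(a))
```

Here the exact answer is $c = 2/\sqrt{3}$, since $3c^2 = 4$; the printed values of $f'(c)(b-a)$ and $f(b) - f(a)$ should agree up to rounding.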
For a geometric interpretation of the mean value theorem, see Figure 4.3. The idea is that the value $\frac{f(b) - f(a)}{b - a}$ is the slope of the line between the points $\bigl(a, f(a)\bigr)$ and $\bigl(b, f(b)\bigr)$. Then $c$ is the point such that $f'(c) = \frac{f(b) - f(a)}{b - a}$, that is, the tangent line at the point $\bigl(c, f(c)\bigr)$ has the same slope as the line between $\bigl(a, f(a)\bigr)$ and $\bigl(b, f(b)\bigr)$.

[Figure 4.3: Graphical interpretation of the mean value theorem. The plot marks the points $(a, f(a))$, $(b, f(b))$, and $c$.]

4.2.4 Applications

We can now solve our very first differential equation.

Proposition 4.2.5. Let $I$ be an interval and let $f \colon I \to \mathbb{R}$ be a differentiable function such that $f'(x) = 0$ for all $x \in I$. Then $f$ is a constant.

Proof. We will show this by contrapositive. Suppose that $f$ is not constant, then there exist $x$ and $y$ in $I$ such that $x < y$ and $f(x) \neq f(y)$. Then $f$ restricted to $[x,y]$ satisfies the hypotheses of the mean value theorem. Therefore there is a $c \in (x,y)$ such that
\[ f(y) - f(x) = f'(c)(y - x) . \]
As $y \neq x$ and $f(y) \neq f(x)$ we see that $f'(c) \neq 0$.

Now that we know what it means for the function to stay constant, let us look at increasing and decreasing functions. We say that $f \colon I \to \mathbb{R}$ is increasing (resp. strictly increasing) if $x < y$ implies $f(x) \leq f(y)$ (resp. $f(x) < f(y)$). We define decreasing and strictly decreasing in the same way by switching the inequalities for $f$.

Proposition 4.2.6. Let $f \colon I \to \mathbb{R}$ be a differentiable function.

(i) $f$ is increasing if and only if $f'(x) \geq 0$ for all $x \in I$.
(ii) $f$ is decreasing if and only if $f'(x) \leq 0$ for all $x \in I$.

Proof. Let us prove the first item. Suppose that $f$ is increasing, then for all $x$ and $c$ in $I$ with $x \neq c$ we have
\[ \frac{f(x) - f(c)}{x - c} \geq 0 . \]
Taking a limit as $x$ goes to $c$ we see that $f'(c) \geq 0$.

For the other direction, suppose that $f'(x) \geq 0$ for all $x \in I$. Let $x < y$ in $I$. Then by the mean value theorem there is some $c \in (x,y)$ such that
\[ f(x) - f(y) = f'(c)(x - y) . \]
As $f'(c) \geq 0$, and $x - y < 0$, then $f(x) - f(y) \leq 0$ and so $f$ is increasing.

We leave the decreasing part to the reader as exercise.

Example 4.2.7: We can make a similar but weaker statement about strictly increasing and decreasing functions. If $f'(x) > 0$ for all $x \in I$, then $f$ is strictly increasing. The proof is left as an exercise. However, the converse is not true. For example, $f(x) := x^3$ is a strictly increasing function but $f'(0) = 0$.

Another application of the mean value theorem is the following result about location of extrema. The theorem is stated for an absolute minimum and maximum, but the way it is applied to find relative minima and maxima is to restrict $f$ to an interval $(c - \delta, c + \delta)$.

Proposition 4.2.8. Let $f \colon (a,b) \to \mathbb{R}$ be continuous. Let $c \in (a,b)$ and suppose $f$ is differentiable on $(a,c)$ and $(c,b)$.

(i) If $f'(x) \leq 0$ for $x \in (a,c)$ and $f'(x) \geq 0$ for $x \in (c,b)$, then $f$ has an absolute minimum at $c$.
(ii) If $f'(x) \geq 0$ for $x \in (a,c)$ and $f'(x) \leq 0$ for $x \in (c,b)$, then $f$ has an absolute maximum at $c$.

Proof. Let us prove the first item. The second is left to the reader. Let $x$ be in $(a,c)$ and $\{y_n\}$ a sequence such that $x < y_n < c$ and $\lim y_n = c$. By the previous proposition, the function is decreasing on $(a,c)$ so $f(x) \geq f(y_n)$. The function is continuous at $c$ so we can take the limit to get $f(x) \geq f(c)$ for all $x \in (a,c)$.

Similarly take $x \in (c,b)$ and $\{y_n\}$ a sequence such that $c < y_n < x$ and $\lim y_n = c$. The function is increasing on $(c,b)$ so $f(x) \geq f(y_n)$. By continuity of $f$ we get $f(x) \geq f(c)$ for all $x \in (c,b)$. Thus $f(x) \geq f(c)$ for all $x \in (a,b)$.

Note that the converse of the proposition does not hold. See Example 4.2.10 below.
4.2.5 Continuity of derivatives and the intermediate value theorem

Derivatives of functions satisfy an intermediate value property. The theorem is usually called Darboux's theorem.

Theorem 4.2.9 (Darboux). Let $f \colon [a,b] \to \mathbb{R}$ be differentiable. Suppose that there exists a $y \in \mathbb{R}$ such that $f'(a) < y < f'(b)$ or $f'(a) > y > f'(b)$. Then there exists a $c \in (a,b)$ such that $f'(c) = y$.

Proof. Suppose without loss of generality that $f'(a) < y < f'(b)$. Define $g(x) := yx - f(x)$. As $g$ is continuous on $[a,b]$, $g$ attains a maximum at some $c \in [a,b]$.

Now compute $g'(x) = y - f'(x)$. Thus $g'(a) > 0$. By the definition of the derivative, we can find an $x > a$ such that
\[ g'(a) - \frac{g(x)-g(a)}{x-a} < g'(a) . \]
Thus $\frac{g(x)-g(a)}{x-a} > 0$, or $g(x) - g(a) > 0$, or $g(x) > g(a)$. Thus $a$ cannot possibly be a maximum. Similarly, as $g'(b) < 0$, we find an $x < b$ such that
\[ \frac{g(x)-g(b)}{x-b} - g'(b) < -g'(b) , \]
or that $g(x) > g(b)$; thus $b$ cannot possibly be a maximum.

Therefore $c \in (a,b)$. Then as $c$ is a maximum of $g$ we find $g'(c) = 0$ and $f'(c) = y$.

And as we have seen, there do exist noncontinuous functions that have the intermediate value property. While it is hard to imagine at first, there do exist functions that are differentiable everywhere and the derivative is not continuous.

Example 4.2.10: Let $f \colon \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f(x) := \begin{cases} \bigl( x \sin(1/x) \bigr)^2 & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
We claim that $f$ is differentiable, but $f' \colon \mathbb{R} \to \mathbb{R}$ is not continuous at the origin. Furthermore, $f$ has a minimum at 0, but the derivative changes sign infinitely often near the origin.

That $f$ has an absolute minimum at 0 is easy to see by definition. We know that $f(x) \geq 0$ for all $x$ and $f(0) = 0$.

The function $f$ is differentiable for $x \neq 0$ and the derivative is $2 \sin(1/x) \bigl( x \sin(1/x) - \cos(1/x) \bigr)$. As an exercise show that for $x_n = \frac{4}{(8n+1)\pi}$ we have $\lim f'(x_n) = -1$, and for $y_n = \frac{4}{(8n+3)\pi}$ we have $\lim f'(y_n) = 1$. Hence if $f'$ exists at 0, then it cannot be continuous.

Let us see that $f'$ exists at 0. We claim that the derivative is zero. In other words, $\left| \frac{f(x)-f(0)}{x-0} - 0 \right|$ goes to zero as $x$ goes to zero. For $x \neq 0$ we have
\[ \left| \frac{f(x)-f(0)}{x-0} - 0 \right| = \left| \frac{x^2 \sin^2(1/x)}{x} \right| = \left| x \sin^2(1/x) \right| \leq |x| . \]
And of course as $x$ tends to zero, $|x|$ tends to zero and hence $\left| \frac{f(x)-f(0)}{x-0} - 0 \right|$ goes to zero. Therefore, $f$ is differentiable at 0 and the derivative at 0 is 0.

It is sometimes useful to assume that the derivative of a differentiable function is continuous. If $f \colon I \to \mathbb{R}$ is differentiable and the derivative $f'$ is continuous on $I$, then we say that $f$ is continuously differentiable. It is then common to write $C^1(I)$ for the set of continuously differentiable functions on $I$.

4.2.6 Exercises

Exercise 4.2.1: Finish the proof of Proposition 4.2.6.

Exercise 4.2.2: Finish the proof of Proposition 4.2.8.

Exercise 4.2.3: Suppose that $f \colon \mathbb{R} \to \mathbb{R}$ is a differentiable function such that $f'$ is a bounded function. Show that $f$ is a Lipschitz continuous function.

Exercise 4.2.4: Suppose that $f \colon [a,b] \to \mathbb{R}$ is differentiable and $c \in [a,b]$. Show that there exists a sequence $\{x_n\}$, $x_n \neq c$, such that
\[ f'(c) = \lim_{n\to\infty} f'(x_n) . \]
Do note that this does not imply that $f'$ is continuous (why?).

Exercise 4.2.5: Suppose that $f \colon \mathbb{R} \to \mathbb{R}$ is a function such that $|f(x) - f(y)| \leq |x-y|^2$ for all $x$ and $y$. Show that $f(x) = C$ for some constant $C$. Hint: Show that $f$ is differentiable at all points and compute the derivative.

Exercise 4.2.6: Suppose that $I$ is an interval and $f \colon I \to \mathbb{R}$ is a differentiable function. If $f'(x) > 0$ for all $x \in I$, show that $f$ is strictly increasing.

Exercise 4.2.7: Suppose $f \colon (a,b) \to \mathbb{R}$ and $g \colon (a,b) \to \mathbb{R}$ are differentiable functions such that $f'(x) = g'(x)$ for all $x \in (a,b)$. Show that there exists a constant $C$ such that $f(x) = g(x) + C$.

Exercise 4.2.8: Suppose that $f \colon (a,b) \to \mathbb{R}$ is a differentiable function such that $f'(x) \neq 0$ for all $x \in (a,b)$. Suppose that there exists a point $c \in (a,b)$ such that $f'(c) > 0$. Prove that $f'(x) > 0$ for all $x \in (a,b)$.
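The sign changes of the derivative in Example 4.2.10 can be observed numerically. The following is a minimal sketch, not a proof; the helper names `f_prime`, `x_n`, and `y_n` are illustrative and do not come from the text. It evaluates the formula for $f'$ at the two sequences from the example.

```python
import math

def f_prime(x):
    """Derivative of f(x) = (x*sin(1/x))**2, valid for x != 0."""
    s, c = math.sin(1.0 / x), math.cos(1.0 / x)
    return 2.0 * s * (x * s - c)

def x_n(n):
    """Points where 1/x = 2*pi*n + pi/4, so sin(1/x) = cos(1/x)."""
    return 4.0 / ((8 * n + 1) * math.pi)

def y_n(n):
    """Points where 1/x = 2*pi*n + 3*pi/4, so sin(1/x) = -cos(1/x)."""
    return 4.0 / ((8 * n + 3) * math.pi)

# Both sequences tend to 0, yet f' tends to -1 along one and to 1 along the other.
for n in (1, 10, 100, 1000):
    print(n, f_prime(x_n(n)), f_prime(y_n(n)))
```

Since both sequences converge to 0 while the derivative values approach different limits, the output is consistent with $f'$ failing to be continuous at the origin.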
4.3 Taylor's theorem

Note: 0.5 lecture (optional section)

4.3.1 Derivatives of higher orders

When $f \colon I \to \mathbb{R}$ is differentiable, then we obtain a function $f' \colon I \to \mathbb{R}$. The function $f'$ is called the first derivative of $f$. If $f'$ is differentiable, we denote by $f'' \colon I \to \mathbb{R}$ the derivative of $f'$. The function $f''$ is called the second derivative of $f$. We can similarly obtain $f'''$, $f''''$, and so on. However, with a larger number of derivatives the notation would get out of hand. Therefore we denote by $f^{(n)}$ the $n$th derivative of $f$.

When $f$ possesses $n$ derivatives, we say that $f$ is $n$ times differentiable.

4.3.2 Taylor's theorem

Taylor's theorem† is a generalization of the mean value theorem. It tells us that up to a small error, any $n$ times differentiable function can be approximated at a point $x_0$ by a polynomial. The error of this approximation behaves like $(x-x_0)^n$ near the point $x_0$. To see why this is a good approximation notice that for a big $n$, $(x-x_0)^n$ is very small in a small interval around $x_0$.

† Named for the English mathematician Brook Taylor (1685–1731).

Definition 4.3.1. For a function $f$ defined near a point $x_0 \in \mathbb{R}$, define the $n$th Taylor polynomial for $f$ at $x_0$ as
\[ P_n(x) := \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x-x_0)^k = f(x_0) + f'(x_0)(x-x_0) + \frac{f''(x_0)}{2} (x-x_0)^2 + \frac{f^{(3)}(x_0)}{6} (x-x_0)^3 + \cdots + \frac{f^{(n)}(x_0)}{n!} (x-x_0)^n . \]

Taylor's theorem tells us that a function behaves like its $n$th Taylor polynomial. We can think of the theorem as a generalization of the mean value theorem, which is really Taylor's theorem for the first derivative.

Theorem 4.3.2 (Taylor). Suppose $f \colon [a,b] \to \mathbb{R}$ is a function with $n$ continuous derivatives on $[a,b]$ and such that $f^{(n+1)}$ exists on $(a,b)$. Given distinct points $x_0$ and $x$ in $[a,b]$, we can find a point $c$ between $x_0$ and $x$ such that
\[ f(x) = P_n(x) + \frac{f^{(n+1)}(c)}{(n+1)!} (x-x_0)^{n+1} . \]

The term $R_n(x) := \frac{f^{(n+1)}(c)}{(n+1)!} (x-x_0)^{n+1}$ is called the remainder term. This form of the remainder term is called the Lagrange form of the remainder. There are other ways to write the remainder term but we will skip those.

Proof. Find a number $M$ solving the equation
\[ f(x) = P_n(x) + M (x-x_0)^{n+1} . \]
Define a function $g(s)$ by
\[ g(s) := f(s) - P_n(s) - M (s-x_0)^{n+1} . \]
A simple computation shows that $P_n^{(k)}(x_0) = f^{(k)}(x_0)$ for $k = 0, 1, 2, \ldots, n$ (the zeroth derivative corresponds simply to the function itself). Therefore,
\[ g(x_0) = g'(x_0) = g''(x_0) = \cdots = g^{(n)}(x_0) = 0 . \]
In particular $g(x_0) = 0$. On the other hand $g(x) = 0$. Thus by the mean value theorem there exists an $x_1$ between $x_0$ and $x$ such that $g'(x_1) = 0$. Applying the mean value theorem to $g'$ we obtain that there exists $x_2$ between $x_0$ and $x_1$ (and therefore between $x_0$ and $x$) such that $g''(x_2) = 0$. We repeat the argument $n+1$ times to obtain a number $x_{n+1}$ between $x_0$ and $x_n$ (and therefore between $x_0$ and $x$) such that $g^{(n+1)}(x_{n+1}) = 0$.

Now we simply let $c := x_{n+1}$. We compute the $(n+1)$th derivative of $g$ to find
\[ g^{(n+1)}(s) = f^{(n+1)}(s) - (n+1)! \, M . \]
Plugging in $c$ for $s$ we obtain that $M = \frac{f^{(n+1)}(c)}{(n+1)!}$, and we are done.

In the proof we have computed that $P_n^{(k)}(x_0) = f^{(k)}(x_0)$ for $k = 0, 1, 2, \ldots, n$. Therefore the Taylor polynomial has the same derivatives as $f$ at $x_0$ up to the $n$th derivative. That is why the Taylor polynomial is a good approximation to $f$.

In simple terms, a differentiable function is locally approximated by a line; that is the definition of the derivative. There does exist a converse to Taylor's theorem, which we will neither state nor prove, saying that if a function is locally approximated in a certain way by a polynomial of degree $d$, then it has $d$ derivatives.

4.3.3 Exercises

Exercise 4.3.1: Compute the $n$th Taylor polynomial at 0 for the exponential function.

Exercise 4.3.2: Suppose that $p$ is a polynomial of degree $d$. Given any $x_0 \in \mathbb{R}$, show that the $(d+1)$th Taylor polynomial for $p$ at $x_0$ is equal to $p$.

Exercise 4.3.3: Let $f(x) := |x|^3$. Compute $f'(x)$ and $f''(x)$ for all $x$, but show that $f^{(3)}(0)$ does not exist.
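Taylor's theorem with the Lagrange form of the remainder can be tried out on a concrete function. The sketch below is only an illustration under stated assumptions: it uses $f = \sin$ and $x_0 = 0$ (a choice made here, not in the text), where every derivative is bounded by 1 in absolute value, so $|R_n(x)| \leq |x|^{n+1}/(n+1)!$. The function names are hypothetical.

```python
import math

def sin_derivative_at_zero(k):
    """k-th derivative of sin at 0; the values cycle through 0, 1, 0, -1."""
    return (0.0, 1.0, 0.0, -1.0)[k % 4]

def taylor_sin(x, n):
    """n-th Taylor polynomial P_n of sin at x0 = 0, evaluated at x."""
    return sum(sin_derivative_at_zero(k) / math.factorial(k) * x ** k
               for k in range(n + 1))

def lagrange_bound(x, n):
    """Bound on |R_n(x)| using |f^(n+1)(c)| <= 1 for f = sin."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

x = 0.5
for n in (1, 3, 5, 7):
    error = abs(math.sin(x) - taylor_sin(x, n))
    # The actual error should not exceed the remainder bound.
    print(n, error, lagrange_bound(x, n))
```

The printed errors shrink quickly as $n$ grows, in line with the factorial in the denominator of the remainder term.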
Chapter 5 The Riemann Integral

5.1 The Riemann integral

Note: 1.5 lectures

We now get to the fundamental concept of an integral. There is often confusion among students of calculus between "integral" and "antiderivative." The integral is (informally) the area under the curve, nothing else. That we can compute an antiderivative using the integral is a nontrivial result we have to prove. In this chapter we will define the Riemann integral∗ using the Darboux integral†, which is technically simpler than (and equivalent to) the traditional definition as done by Riemann.

∗ Named after the German mathematician Georg Friedrich Bernhard Riemann (1826–1866).
† Named after the French mathematician Jean-Gaston Darboux (1842–1917).

5.1.1 Partitions and lower and upper integrals

We want to integrate a bounded function defined on an interval $[a,b]$. We first define two auxiliary integrals that can be defined for all bounded functions. Only then can we talk about the Riemann integral and the Riemann integrable functions.

Definition 5.1.1. A partition $P$ of the interval $[a,b]$ is a finite sequence of points $x_0, x_1, x_2, \ldots, x_n$ such that
\[ a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b . \]
We write
\[ \Delta x_i := x_i - x_{i-1} . \]
We say $P$ is of size $n$.
Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Let $P$ be a partition of $[a,b]$. Define
\[ m_i := \inf \{ f(x) : x_{i-1} \leq x \leq x_i \} , \qquad M_i := \sup \{ f(x) : x_{i-1} \leq x \leq x_i \} , \]
\[ L(P,f) := \sum_{i=1}^n m_i \Delta x_i , \qquad U(P,f) := \sum_{i=1}^n M_i \Delta x_i . \]
We call $L(P,f)$ the lower Darboux sum and $U(P,f)$ the upper Darboux sum.

The geometric idea of Darboux sums is indicated in Figure 5.1. The lower sum is the area of the shaded rectangles, and the upper sum is the area of the entire rectangles. The width of the $i$th rectangle is $\Delta x_i$, the height of the shaded rectangle is $m_i$ and the height of the entire rectangle is $M_i$.

Figure 5.1: Sample Darboux sums.

Proposition 5.1.2. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Let $m, M \in \mathbb{R}$ be such that for all $x$ we have $m \leq f(x) \leq M$. For any partition $P$ of $[a,b]$ we have
\[ m(b-a) \leq L(P,f) \leq U(P,f) \leq M(b-a) . \tag{5.1} \]

Proof. Let $P$ be a partition. Then note that $m \leq m_i$ for all $i$ and $M_i \leq M$ for all $i$. Also $m_i \leq M_i$ for all $i$. Finally $\sum_{i=1}^n \Delta x_i = (b-a)$. Therefore,
\[ m(b-a) = m \sum_{i=1}^n \Delta x_i = \sum_{i=1}^n m \Delta x_i \leq \sum_{i=1}^n m_i \Delta x_i \leq \sum_{i=1}^n M_i \Delta x_i \leq \sum_{i=1}^n M \Delta x_i = M \sum_{i=1}^n \Delta x_i = M(b-a) . \]
Hence we get (5.1). In other words, the sets of lower and upper sums are bounded sets.
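The definitions of $L(P,f)$ and $U(P,f)$ translate directly into a short computation when the infimum and supremum on each subinterval are easy to locate. The following is a minimal sketch under an added assumption: the function is increasing, so $m_i = f(x_{i-1})$ and $M_i = f(x_i)$. The function $f(x) = x^2$ on $[0,2]$, the partition, and the helper names are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def darboux_sums_increasing(f, partition):
    """Return (L(P, f), U(P, f)) for an increasing function f.

    For increasing f the infimum on [x_{i-1}, x_i] is f(x_{i-1}) and the
    supremum is f(x_i), so no search over the subinterval is needed.
    """
    lower = upper = 0
    for left, right in zip(partition, partition[1:]):
        dx = right - left            # this is Delta x_i
        lower += f(left) * dx        # m_i * Delta x_i
        upper += f(right) * dx       # M_i * Delta x_i
    return lower, upper

def square(x):
    return x * x

# Partition {0, 1/2, 1, 2} of [0, 2], kept exact with rational arithmetic.
P = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]
lower, upper = darboux_sums_increasing(square, P)
print(lower, upper)  # prints: 9/8 37/8
```

Here $m = 0$ and $M = 4$ bound $f$ on $[0,2]$, and the values satisfy $0 \leq 9/8 \leq 37/8 \leq 8$, as inequality (5.1) requires.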
Definition 5.1.3. Now that we know that the sets of lower and upper Darboux sums are bounded, define
\[ \underline{\int_a^b} f(x)\,dx := \sup \{ L(P,f) : P \text{ a partition of } [a,b] \} , \]
\[ \overline{\int_a^b} f(x)\,dx := \inf \{ U(P,f) : P \text{ a partition of } [a,b] \} . \]
We call $\underline{\int}$ the lower Darboux integral and $\overline{\int}$ the upper Darboux integral. To avoid worrying about the variable of integration, we will often simply write
\[ \underline{\int_a^b} f := \underline{\int_a^b} f(x)\,dx \qquad \text{and} \qquad \overline{\int_a^b} f := \overline{\int_a^b} f(x)\,dx . \]

It is not clear from the definition when the lower and upper Darboux integrals are the same number. In general they can be different.

Example 5.1.4: Take the Dirichlet function $f \colon [0,1] \to \mathbb{R}$, where $f(x) := 1$ if $x \in \mathbb{Q}$ and $f(x) := 0$ if $x \notin \mathbb{Q}$. Then
\[ \underline{\int_0^1} f = 0 \qquad \text{and} \qquad \overline{\int_0^1} f = 1 . \]
The reason is that for every $i$ we have that $m_i = \inf \{ f(x) : x \in [x_{i-1},x_i] \} = 0$ and $M_i = \sup \{ f(x) : x \in [x_{i-1},x_i] \} = 1$. Thus
\[ L(P,f) = \sum_{i=1}^n 0 \cdot \Delta x_i = 0 , \qquad U(P,f) = \sum_{i=1}^n 1 \cdot \Delta x_i = \sum_{i=1}^n \Delta x_i = 1 . \]

Remark 5.1.5. The same definition is used when $f$ is defined on a larger set $S$ such that $[a,b] \subset S$. In that case, we use the restriction of $f$ to $[a,b]$ and we must ensure that the restriction is bounded on $[a,b]$.

To compute the integral we will often take a partition $P$ and make it finer. That is, we will cut intervals in the partition into yet smaller pieces.

Definition 5.1.6. Let $P := \{ x_0, x_1, \ldots, x_n \}$ and $\tilde{P} := \{ \tilde{x}_0, \tilde{x}_1, \ldots, \tilde{x}_m \}$ be partitions of $[a,b]$. We say $\tilde{P}$ is a refinement of $P$ if as sets $P \subset \tilde{P}$.

That is, $\tilde{P}$ is a refinement of a partition if it contains all the points in $P$ and perhaps some other points in between. For example, $\{0, 0.5, 1, 2\}$ is a partition of $[0,2]$ and $\{0, 0.2, 0.5, 1, 1.5, 1.75, 2\}$ is a refinement. The main reason for introducing refinements is the following proposition.
Proposition 5.1.7. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function, and let $P$ be a partition of $[a,b]$. Let $\tilde{P}$ be a refinement of $P$. Then
\[ L(P,f) \leq L(\tilde{P},f) \qquad \text{and} \qquad U(\tilde{P},f) \leq U(P,f) . \]

Proof. The tricky part of this proof is to get the notation correct. Let $\tilde{P} := \{ \tilde{x}_0, \tilde{x}_1, \ldots, \tilde{x}_m \}$ be a refinement of $P := \{ x_0, x_1, \ldots, x_n \}$. Then $x_0 = \tilde{x}_0$ and $x_n = \tilde{x}_m$. In fact, we can find integers $k_0 < k_1 < \cdots < k_n$ such that $x_j = \tilde{x}_{k_j}$ for $j = 0, 1, 2, \ldots, n$.

Let $\Delta \tilde{x}_j := \tilde{x}_j - \tilde{x}_{j-1}$. We get that
\[ \Delta x_j = \sum_{p=k_{j-1}+1}^{k_j} \Delta \tilde{x}_p . \]
Let $m_j$ be as before and correspond to the partition $P$. Let $\tilde{m}_j := \inf \{ f(x) : \tilde{x}_{j-1} \leq x \leq \tilde{x}_j \}$. Now, $m_j \leq \tilde{m}_p$ for $k_{j-1} < p \leq k_j$. Therefore,
\[ m_j \Delta x_j = m_j \sum_{p=k_{j-1}+1}^{k_j} \Delta \tilde{x}_p = \sum_{p=k_{j-1}+1}^{k_j} m_j \Delta \tilde{x}_p \leq \sum_{p=k_{j-1}+1}^{k_j} \tilde{m}_p \Delta \tilde{x}_p . \]
So
\[ L(P,f) = \sum_{j=1}^n m_j \Delta x_j \leq \sum_{j=1}^n \sum_{p=k_{j-1}+1}^{k_j} \tilde{m}_p \Delta \tilde{x}_p = \sum_{j=1}^m \tilde{m}_j \Delta \tilde{x}_j = L(\tilde{P},f) . \]
The proof of $U(\tilde{P},f) \leq U(P,f)$ is left as an exercise.

Armed with refinements we can prove the following. The key point of this proposition is the inequality that says that the lower Darboux integral is less than or equal to the upper Darboux integral.

Proposition 5.1.8. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Let $m, M \in \mathbb{R}$ be such that for all $x$ we have $m \leq f(x) \leq M$. Then
\[ m(b-a) \leq \underline{\int_a^b} f \leq \overline{\int_a^b} f \leq M(b-a) . \tag{5.2} \]

Proof. By Proposition 5.1.2 we have for any partition $P$
\[ m(b-a) \leq L(P,f) \leq U(P,f) \leq M(b-a) . \]
The inequality $m(b-a) \leq L(P,f)$ implies $m(b-a) \leq \underline{\int_a^b} f$. Also $U(P,f) \leq M(b-a)$ implies $\overline{\int_a^b} f \leq M(b-a)$.

The key point of this proposition is the middle inequality in (5.2). Let $P_1$, $P_2$ be partitions of $[a,b]$. Define the partition $\tilde{P} := P_1 \cup P_2$. Then $\tilde{P}$ is a partition of $[a,b]$. Furthermore, $\tilde{P}$ is a refinement of $P_1$ and it is also a refinement of $P_2$. By Proposition 5.1.7 we have $L(P_1,f) \leq L(\tilde{P},f)$ and $U(\tilde{P},f) \leq U(P_2,f)$. Putting it all together we have
\[ L(P_1,f) \leq L(\tilde{P},f) \leq U(\tilde{P},f) \leq U(P_2,f) . \]
In other words, for two arbitrary partitions $P_1$ and $P_2$ we have $L(P_1,f) \leq U(P_2,f)$. Now we recall Proposition 1.2.7. Taking the supremum and infimum over all partitions we get
\[ \sup \{ L(P,f) : P \text{ a partition} \} \leq \inf \{ U(P,f) : P \text{ a partition} \} . \]
In other words $\underline{\int_a^b} f \leq \overline{\int_a^b} f$.

5.1.2 Riemann integral

We can finally define the Riemann integral. However, the Riemann integral is only defined on a certain class of functions, called the Riemann integrable functions.

Definition 5.1.9. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Suppose that
\[ \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx . \]
Then $f$ is said to be Riemann integrable. The set of Riemann integrable functions on $[a,b]$ is denoted by $\mathcal{R}[a,b]$. When $f \in \mathcal{R}[a,b]$ we define
\[ \int_a^b f(x)\,dx := \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx . \]
As before, we often simply write
\[ \int_a^b f := \int_a^b f(x)\,dx . \]
The number $\int_a^b f$ is called the Riemann integral of $f$, or sometimes simply the integral of $f$.

By appealing to Proposition 5.1.8 we immediately obtain the following proposition.

Proposition 5.1.10. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded Riemann integrable function. Let $m, M \in \mathbb{R}$ be such that $m \leq f(x) \leq M$. Then
\[ m(b-a) \leq \int_a^b f \leq M(b-a) . \]

Often we will use a weaker form of this proposition. That is, if $|f(x)| \leq M$ for all $x \in [a,b]$, then
\[ \left| \int_a^b f \right| \leq M(b-a) . \]
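The effect of refining a partition, as described in Proposition 5.1.7, can be seen on a small example. This is a hedged sketch with assumptions added here: the function $f(x) = x^2$ on $[0,2]$ is an illustrative increasing function (so infima and suprema sit at the endpoints of each subinterval), and the helper name `lower_upper` is hypothetical. The two partitions are the ones used as an example of a refinement in the text.

```python
from fractions import Fraction

def lower_upper(f, partition):
    """(L(P, f), U(P, f)) for an increasing f: inf at left end, sup at right end."""
    pieces = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pieces)
    upper = sum(f(right) * (right - left) for left, right in pieces)
    return lower, upper

def square(x):
    return x * x

P = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]
# Refinement: contains every point of P plus some points in between.
P_refined = [Fraction(0), Fraction(1, 5), Fraction(1, 2), Fraction(1),
             Fraction(3, 2), Fraction(7, 4), Fraction(2)]

L_P, U_P = lower_upper(square, P)
L_ref, U_ref = lower_upper(square, P_refined)
# Refining can only raise the lower sum and lower the upper sum.
print(float(L_P), float(L_ref), float(U_ref), float(U_P))
```

The four printed numbers appear in nondecreasing order, which matches the chain $L(P,f) \leq L(\tilde{P},f) \leq U(\tilde{P},f) \leq U(P,f)$ used in the proof of Proposition 5.1.8.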
Example 5.1.11: We can also integrate constant functions using Proposition 5.1.8. If $f(x) := c$ for some constant $c$, then we can take $m = M = c$. Then in the inequality (5.2) all the inequalities must be equalities. Thus $f$ is integrable on $[a,b]$ and $\int_a^b f = c(b-a)$.

Example 5.1.12: Let $f \colon [0,2] \to \mathbb{R}$ be defined by
\[ f(x) := \begin{cases} 1 & \text{if } x < 1, \\ 1/2 & \text{if } x = 1, \\ 0 & \text{if } x > 1. \end{cases} \]
We claim that $f$ is Riemann integrable and that $\int_0^2 f = 1$.

Proof: Let $0 < \varepsilon < 1$ be arbitrary. Let $P := \{ 0, 1-\varepsilon, 1+\varepsilon, 2 \}$ be a partition. We will use the notation from the definition of the Darboux sums. Then
\[ m_1 = \inf \{ f(x) : x \in [0,1-\varepsilon] \} = 1 , \qquad M_1 = \sup \{ f(x) : x \in [0,1-\varepsilon] \} = 1 , \]
\[ m_2 = \inf \{ f(x) : x \in [1-\varepsilon,1+\varepsilon] \} = 0 , \qquad M_2 = \sup \{ f(x) : x \in [1-\varepsilon,1+\varepsilon] \} = 1 , \]
\[ m_3 = \inf \{ f(x) : x \in [1+\varepsilon,2] \} = 0 , \qquad M_3 = \sup \{ f(x) : x \in [1+\varepsilon,2] \} = 0 . \]
Furthermore, $\Delta x_1 = 1-\varepsilon$, $\Delta x_2 = 2\varepsilon$ and $\Delta x_3 = 1-\varepsilon$. We compute
\[ L(P,f) = \sum_{i=1}^3 m_i \Delta x_i = 1 \cdot (1-\varepsilon) + 0 \cdot 2\varepsilon + 0 \cdot (1-\varepsilon) = 1-\varepsilon , \]
\[ U(P,f) = \sum_{i=1}^3 M_i \Delta x_i = 1 \cdot (1-\varepsilon) + 1 \cdot 2\varepsilon + 0 \cdot (1-\varepsilon) = 1+\varepsilon . \]
Thus,
\[ \overline{\int_0^2} f - \underline{\int_0^2} f \leq U(P,f) - L(P,f) = (1+\varepsilon) - (1-\varepsilon) = 2\varepsilon . \]
By Proposition 5.1.8 we have $\underline{\int_0^2} f \leq \overline{\int_0^2} f$. As $\varepsilon$ was arbitrary we see that $\overline{\int_0^2} f = \underline{\int_0^2} f$. So $f$ is Riemann integrable. Finally,
\[ 1-\varepsilon = L(P,f) \leq \int_0^2 f \leq U(P,f) = 1+\varepsilon . \]
Hence, $\left| \int_0^2 f - 1 \right| \leq \varepsilon$. As $\varepsilon$ was arbitrary, we have that $\int_0^2 f = 1$.

5.1.3 More notation

When $f \colon S \to \mathbb{R}$ is defined on a larger set $S$ and $[a,b] \subset S$, we write $\int_a^b f$ to mean the Riemann integral of the restriction of $f$ to $[a,b]$ (provided the restriction is Riemann integrable of course).
Furthermore, when $f \colon S \to \mathbb{R}$ is a function and $[a,b] \subset S$, we say that $f$ is Riemann integrable on $[a,b]$ if the restriction of $f$ to $[a,b]$ is Riemann integrable.

It will be useful to define the integral $\int_a^b f$ even if $a$ is not less than $b$. Suppose that $b < a$ and that $f \in \mathcal{R}[b,a]$, then define
\[ \int_a^b f := - \int_b^a f . \]
Also for any function $f$ we define
\[ \int_a^a f := 0 . \]

At times, the variable $x$ will already have some meaning. When we need to write down the variable of integration, we may simply use a different letter. For example,
\[ \int_a^b f(s)\,ds := \int_a^b f(x)\,dx . \]

5.1.4 Exercises

Exercise 5.1.1: Let $f \colon [0,1] \to \mathbb{R}$ be defined by $f(x) := x^3$ and let $P := \{ 0, 0.1, 0.4, 1 \}$. Compute $L(P,f)$ and $U(P,f)$.

Exercise 5.1.2: Let $f \colon [0,1] \to \mathbb{R}$ be defined by $f(x) := x$. Compute $\int_0^1 f$ using the definition of the integral (but feel free to use Proposition 5.1.8).

Exercise 5.1.3: Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function. Suppose that there exists a sequence of partitions $\{ P_k \}$ of $[a,b]$ such that
\[ \lim_{k\to\infty} \bigl( U(P_k,f) - L(P_k,f) \bigr) = 0 . \]
Show that $f$ is Riemann integrable and that
\[ \int_a^b f = \lim_{k\to\infty} U(P_k,f) = \lim_{k\to\infty} L(P_k,f) . \]

Exercise 5.1.4: Finish the proof of Proposition 5.1.7.

Exercise 5.1.5: Suppose that $f \colon [-1,1] \to \mathbb{R}$ is defined as
\[ f(x) := \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x \leq 0. \end{cases} \]
Prove that $f \in \mathcal{R}[-1,1]$ and compute $\int_{-1}^1 f$ using the definition of the integral (feel free to use Proposition 5.1.8).
Exercise 5.1.6: Let $c \in (a,b)$ and let $d \in \mathbb{R}$. Define $f \colon [a,b] \to \mathbb{R}$ as
\[ f(x) := \begin{cases} d & \text{if } x = c, \\ 0 & \text{if } x \neq c. \end{cases} \]
Prove that $f \in \mathcal{R}[a,b]$ and compute $\int_a^b f$ using the definition of the integral (feel free to use Proposition 5.1.8).

Exercise 5.1.7: Suppose that $f \colon [a,b] \to \mathbb{R}$ is Riemann integrable. Let $\varepsilon > 0$ be given. Show that there exists a partition $P = \{ x_0, x_1, \ldots, x_n \}$ such that if we pick any set of points $\{ c_1, c_2, \ldots, c_n \}$ where $c_k \in [x_{k-1},x_k]$, then
\[ \left| \int_a^b f - \sum_{k=1}^n f(c_k) \Delta x_k \right| < \varepsilon . \]
5.2 Properties of the integral

Note: 2 lectures

5.2.1 Additivity

The next result we prove is usually referred to as the additive property of the integral. First we prove the additivity property for the lower and upper Darboux integrals.

Lemma 5.2.1. If $a < b < c$ and $f \colon [a,c] \to \mathbb{R}$ is a bounded function, then
\[ \underline{\int_a^c} f = \underline{\int_a^b} f + \underline{\int_b^c} f \qquad \text{and} \qquad \overline{\int_a^c} f = \overline{\int_a^b} f + \overline{\int_b^c} f . \]

Proof. If we have partitions $P_1 := \{ x_0, x_1, \ldots, x_k \}$ of $[a,b]$ and $P_2 := \{ x_k, x_{k+1}, \ldots, x_n \}$ of $[b,c]$, then we have a partition $P := \{ x_0, x_1, \ldots, x_n \}$ of $[a,c]$ (simply taking the union of $P_1$ and $P_2$). Then
\[ L(P,f) = \sum_{j=1}^n m_j \Delta x_j = \sum_{j=1}^k m_j \Delta x_j + \sum_{j=k+1}^n m_j \Delta x_j = L(P_1,f) + L(P_2,f) . \]
When we take the supremum over all $P_1$ and $P_2$, we are taking a supremum over all partitions $P$ of $[a,c]$ that contain $b$. If $Q$ is a partition of $[a,c]$ such that $P = Q \cup \{ b \}$, then $P$ is a refinement of $Q$ and so $L(Q,f) \leq L(P,f)$. Therefore, taking a supremum only over the $P$ such that $P$ contains $b$ is sufficient to find the supremum of $L(P,f)$. Therefore we obtain
\[ \begin{aligned} \underline{\int_a^c} f & = \sup \{ L(P,f) : P \text{ a partition of } [a,c] \} \\ & = \sup \{ L(P,f) : P \text{ a partition of } [a,c],\ b \in P \} \\ & = \sup \{ L(P_1,f) + L(P_2,f) : P_1 \text{ a partition of } [a,b],\ P_2 \text{ a partition of } [b,c] \} \\ & = \sup \{ L(P_1,f) : P_1 \text{ a partition of } [a,b] \} + \sup \{ L(P_2,f) : P_2 \text{ a partition of } [b,c] \} \\ & = \underline{\int_a^b} f + \underline{\int_b^c} f . \end{aligned} \]

Similarly, for $P$, $P_1$, and $P_2$ as above we obtain
\[ U(P,f) = \sum_{j=1}^n M_j \Delta x_j = \sum_{j=1}^k M_j \Delta x_j + \sum_{j=k+1}^n M_j \Delta x_j = U(P_1,f) + U(P_2,f) . \]
We wish to take the infimum on the right over all $P_1$ and $P_2$, and so we are taking the infimum over all partitions $P$ of $[a,c]$ that contain $b$. If $Q$ is a partition of $[a,c]$ such that $P = Q \cup \{ b \}$, then $P$ is a refinement of $Q$ and so $U(Q,f) \geq U(P,f)$. Therefore, taking an infimum only over the $P$ such that $P$ contains $b$ is sufficient to find the infimum of $U(P,f)$. We obtain
\[ \overline{\int_a^c} f = \overline{\int_a^b} f + \overline{\int_b^c} f . \]

Theorem 5.2.2. Let $a < b < c$. A function $f \colon [a,c] \to \mathbb{R}$ is Riemann integrable if and only if $f$ is Riemann integrable on $[a,b]$ and $[b,c]$. If $f$ is Riemann integrable, then
\[ \int_a^c f = \int_a^b f + \int_b^c f . \]

Proof. Suppose that $f \in \mathcal{R}[a,c]$, then $\overline{\int_a^c} f = \underline{\int_a^c} f = \int_a^c f$. We apply the lemma to get
\[ \int_a^c f = \underline{\int_a^c} f = \underline{\int_a^b} f + \underline{\int_b^c} f \leq \overline{\int_a^b} f + \overline{\int_b^c} f = \overline{\int_a^c} f = \int_a^c f . \]
Thus the inequality is an equality and
\[ \underline{\int_a^b} f + \underline{\int_b^c} f = \overline{\int_a^b} f + \overline{\int_b^c} f . \]
As we also know that $\underline{\int_a^b} f \leq \overline{\int_a^b} f$ and $\underline{\int_b^c} f \leq \overline{\int_b^c} f$, we can conclude that
\[ \underline{\int_a^b} f = \overline{\int_a^b} f \qquad \text{and} \qquad \underline{\int_b^c} f = \overline{\int_b^c} f . \]
Thus $f$ is Riemann integrable on $[a,b]$ and $[b,c]$ and the desired formula holds.

Now assume that the restrictions of $f$ to $[a,b]$ and to $[b,c]$ are Riemann integrable. We again apply the lemma to get
\[ \underline{\int_a^c} f = \underline{\int_a^b} f + \underline{\int_b^c} f = \int_a^b f + \int_b^c f = \overline{\int_a^b} f + \overline{\int_b^c} f = \overline{\int_a^c} f . \]
Therefore $f$ is Riemann integrable on $[a,c]$, and the integral is computed as indicated.

An easy consequence of the additivity is the following corollary. We leave the details to the reader as an exercise.

Corollary 5.2.3. If $f \in \mathcal{R}[a,b]$ and $[c,d] \subset [a,b]$, then the restriction $f|_{[c,d]}$ is in $\mathcal{R}[c,d]$.
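The first step in the proof of Lemma 5.2.1, that a partition of $[a,c]$ containing $b$ splits the lower sum into two pieces, can be checked on a small case. The sketch below makes assumptions not in the text: it takes $f(x) = x^2$ on $[0,2]$ with $b = 1$, relies on $f$ being increasing to read off each infimum at the left endpoint, and uses a hypothetical helper name.

```python
from fractions import Fraction

def lower_sum_increasing(f, partition):
    """L(P, f) for an increasing f, where m_j = f(x_{j-1})."""
    return sum(f(left) * (right - left)
               for left, right in zip(partition, partition[1:]))

def square(x):
    return x * x

P1 = [Fraction(0), Fraction(1, 2), Fraction(1)]   # partition of [a, b] = [0, 1]
P2 = [Fraction(1), Fraction(3, 2), Fraction(2)]   # partition of [b, c] = [1, 2]
P = P1 + P2[1:]                                   # union: a partition of [0, 2] containing b

# L(P, f) should equal L(P1, f) + L(P2, f).
print(lower_sum_increasing(square, P),
      lower_sum_increasing(square, P1) + lower_sum_increasing(square, P2))
```

Both printed values agree, which is the identity $L(P,f) = L(P_1,f) + L(P_2,f)$ from the proof; the same splitting works for upper sums.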
5.2.2 Linearity and monotonicity

Proposition 5.2.4 (Linearity). Let $f$ and $g$ be in $\mathcal{R}[a,b]$ and $\alpha \in \mathbb{R}$.

(i) $\alpha f$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx . \]

(ii) $f + g$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b \bigl( f(x) + g(x) \bigr)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx . \]

Proof. Let us prove the first item. First suppose that $\alpha \geq 0$. For a partition $P$ we notice that (details are left to the reader)
\[ L(P,\alpha f) = \alpha L(P,f) \qquad \text{and} \qquad U(P,\alpha f) = \alpha U(P,f) . \]
We know that for a bounded set of real numbers we can move multiplication by a positive number $\alpha$ past the supremum. Hence,
\[ \begin{aligned} \underline{\int_a^b} \alpha f(x)\,dx & = \sup \{ L(P,\alpha f) : P \text{ a partition} \} \\ & = \sup \{ \alpha L(P,f) : P \text{ a partition} \} \\ & = \alpha \sup \{ L(P,f) : P \text{ a partition} \} \\ & = \alpha \underline{\int_a^b} f(x)\,dx . \end{aligned} \]
Similarly we show that
\[ \overline{\int_a^b} \alpha f(x)\,dx = \alpha \overline{\int_a^b} f(x)\,dx . \]
The conclusion now follows for $\alpha \geq 0$.

To finish the proof of the first item, we need to show that $\int_a^b -f(x)\,dx = - \int_a^b f(x)\,dx$. The proof of this fact is left as an exercise.

The proof of the second item is also left as an exercise (it is not as trivial as it may appear at first glance).

Proposition 5.2.5 (Monotonicity). Let $f$ and $g$ be in $\mathcal{R}[a,b]$ and let $f(x) \leq g(x)$ for all $x \in [a,b]$. Then
\[ \int_a^b f \leq \int_a^b g . \]
Proof. Let $P := \{ x_0, x_1, \ldots, x_n \}$ be a partition of $[a,b]$. Then let
\[ m_i := \inf \{ f(x) : x \in [x_{i-1},x_i] \} \qquad \text{and} \qquad \tilde{m}_i := \inf \{ g(x) : x \in [x_{i-1},x_i] \} . \]
As $f(x) \leq g(x)$, then $m_i \leq \tilde{m}_i$. Therefore,
\[ L(P,f) = \sum_{i=1}^n m_i \Delta x_i \leq \sum_{i=1}^n \tilde{m}_i \Delta x_i = L(P,g) . \]
We can now take the supremum over all $P$ to obtain that
\[ \underline{\int_a^b} f \leq \underline{\int_a^b} g . \]
As $f$ and $g$ are Riemann integrable, the conclusion follows.

5.2.3 Continuous functions

We say that a function $f \colon [a,b] \to \mathbb{R}$ has finitely many discontinuities if there exists a finite set $S := \{ x_1, x_2, \ldots, x_n \} \subset [a,b]$ such that, with $A := [a,b] \setminus S$, the restriction $f|_A$ is continuous. Before we prove that bounded functions with finitely many discontinuities are Riemann integrable, we need some lemmas. The first lemma says that bounded continuous functions are Riemann integrable.

Lemma 5.2.6. Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Then $f \in \mathcal{R}[a,b]$.

Proof. As $f$ is continuous on a closed bounded interval, it is uniformly continuous. Let $\varepsilon > 0$ be given. Then find a $\delta > 0$ such that $|x-y| < \delta$ implies $|f(x)-f(y)| < \frac{\varepsilon}{b-a}$.

Let $P := \{ x_0, x_1, \ldots, x_n \}$ be a partition of $[a,b]$ such that $\Delta x_i < \delta$ for all $i = 1, 2, \ldots, n$. For example, take $n$ such that $\frac{b-a}{n} < \delta$ and let $x_i := \frac{i}{n}(b-a) + a$. Then for all $x, y \in [x_{i-1},x_i]$ we have that $|x-y| \leq \Delta x_i < \delta$ and hence
\[ |f(x)-f(y)| < \frac{\varepsilon}{b-a} . \]
As $f$ is continuous on $[x_{i-1},x_i]$ it attains a maximum and a minimum. Let $x$ be the point where $f$ attains the maximum and $y$ be the point where $f$ attains the minimum. Then $f(x) = M_i$ and $f(y) = m_i$ in the notation from the definition of the integral. Therefore,
\[ M_i - m_i < \frac{\varepsilon}{b-a} . \]
And so
\[ \begin{aligned} \overline{\int_a^b} f - \underline{\int_a^b} f & \leq U(P,f) - L(P,f) \\ & = \sum_{i=1}^n M_i \Delta x_i - \sum_{i=1}^n m_i \Delta x_i \\ & = \sum_{i=1}^n (M_i - m_i) \Delta x_i \\ & < \frac{\varepsilon}{b-a} \sum_{i=1}^n \Delta x_i \\ & = \frac{\varepsilon}{b-a} (b-a) = \varepsilon . \end{aligned} \]
As $\varepsilon > 0$ was arbitrary,
\[ \overline{\int_a^b} f = \underline{\int_a^b} f , \]
and $f$ is Riemann integrable on $[a,b]$.

The second lemma says that we need the function to only be "Riemann integrable inside the interval," as long as it is bounded. It also tells us how to compute the integral.

Lemma 5.2.7. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function that is Riemann integrable on $[a',b']$ for all $a', b'$ such that $a < a' < b' < b$. Then $f \in \mathcal{R}[a,b]$. Furthermore, if $a < a_n < b_n < b$ are such that $\lim a_n = a$ and $\lim b_n = b$, then
\[ \int_a^b f = \lim_{n\to\infty} \int_{a_n}^{b_n} f . \]

Proof. Let $M > 0$ be a real number such that $|f(x)| \leq M$. Pick two sequences of numbers $a < a_n < b_n < b$ such that $\lim a_n = a$ and $\lim b_n = b$. Then Lemma 5.2.1 says that the lower and upper integral are additive and the hypothesis says that $f$ is integrable on $[a_n,b_n]$. Therefore
\[ \underline{\int_a^b} f = \underline{\int_a^{a_n}} f + \int_{a_n}^{b_n} f + \underline{\int_{b_n}^b} f \geq -M(a_n - a) + \int_{a_n}^{b_n} f - M(b - b_n) . \]
Note that $M > 0$ and $(b-a) \geq (b_n - a_n)$. We thus have
\[ -M(b-a) \leq -M(b_n - a_n) \leq \int_{a_n}^{b_n} f \leq M(b_n - a_n) \leq M(b-a) . \]
Thus the sequence of numbers $\left\{ \int_{a_n}^{b_n} f \right\}$ is bounded and hence by Bolzano-Weierstrass has a convergent subsequence indexed by $n_k$. Let us call $L$ the limit of the subsequence $\left\{ \int_{a_{n_k}}^{b_{n_k}} f \right\}$. We look at
\[ \underline{\int_a^b} f \geq -M(a_{n_k} - a) + \int_{a_{n_k}}^{b_{n_k}} f - M(b - b_{n_k}) \]
and we take the limit on the right-hand side to obtain
\[ \underline{\int_a^b} f \geq -M \cdot 0 + L - M \cdot 0 = L . \]
Next use the additivity of the upper integral to obtain
\[ \overline{\int_a^b} f = \overline{\int_a^{a_n}} f + \int_{a_n}^{b_n} f + \overline{\int_{b_n}^b} f \leq M(a_n - a) + \int_{a_n}^{b_n} f + M(b - b_n) . \]
We take the same subsequence $\left\{ \int_{a_{n_k}}^{b_{n_k}} f \right\}$ and take the limit of the inequality
\[ \overline{\int_a^b} f \leq M(a_{n_k} - a) + \int_{a_{n_k}}^{b_{n_k}} f + M(b - b_{n_k}) \]
to obtain
\[ \overline{\int_a^b} f \leq M \cdot 0 + L + M \cdot 0 = L . \]
Thus $\overline{\int_a^b} f = \underline{\int_a^b} f = L$ and hence $f$ is Riemann integrable and $\int_a^b f = L$.

To prove the final statement of the lemma we note that we can use Theorem 2.3.7. We have shown that every convergent subsequence $\left\{ \int_{a_{n_k}}^{b_{n_k}} f \right\}$ converges to $L$. Therefore, the sequence $\left\{ \int_{a_n}^{b_n} f \right\}$ is convergent and converges to $L$.

Theorem 5.2.8. Let $f \colon [a,b] \to \mathbb{R}$ be a bounded function with finitely many discontinuities. Then $f \in \mathcal{R}[a,b]$.

Proof. We divide the interval into finitely many intervals $[a_i,b_i]$ so that $f$ is continuous on the interior $(a_i,b_i)$. If $f$ is continuous on $(a_i,b_i)$, then it is continuous and hence integrable on $[c_i,d_i]$ for all $a_i < c_i < d_i < b_i$. By Lemma 5.2.7 the restriction of $f$ to $[a_i,b_i]$ is integrable. By additivity of the integral (and simple induction) $f$ is integrable on the union of the intervals.

Sometimes it is convenient (or necessary) to change certain values of a function and then integrate. The next result says that if we change the values only at finitely many points, the integral does not change.

Proposition 5.2.9. Let $f \colon [a,b] \to \mathbb{R}$ be Riemann integrable. Let $g \colon [a,b] \to \mathbb{R}$ be a function such that $f(x) = g(x)$ for all $x \in [a,b] \setminus S$, where $S$ is a finite set. Then $g$ is a Riemann integrable function and
\[ \int_a^b g = \int_a^b f . \]
Sketch of proof. Using additivity of the integral, we could split up the interval $[a,b]$ into smaller intervals such that $f(x) = g(x)$ holds for all $x$ except at the endpoints (details are left to the reader). Therefore, without loss of generality suppose that $f(x) = g(x)$ for all $x \in (a,b)$. The proof follows by Lemma 5.2.7, and is left as an exercise.

5.2.4 Exercises

Exercise 5.2.1: Let $f$ be in $\mathcal{R}[a,b]$. Prove that $-f$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b -f(x)\,dx = - \int_a^b f(x)\,dx . \]

Exercise 5.2.2: Let $f$ and $g$ be in $\mathcal{R}[a,b]$. Prove that $f + g$ is in $\mathcal{R}[a,b]$ and
\[ \int_a^b \bigl( f(x) + g(x) \bigr)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx . \]
Hint: Use Proposition 5.1.7 to find a single partition $P$ such that $U(P,f) - L(P,f) < \varepsilon/2$ and $U(P,g) - L(P,g) < \varepsilon/2$.

Exercise 5.2.3: Let $f \colon [a,b] \to \mathbb{R}$ be Riemann integrable. Let $g \colon [a,b] \to \mathbb{R}$ be a function such that $f(x) = g(x)$ for all $x \in (a,b)$. Prove that $g$ is Riemann integrable and that
\[ \int_a^b g = \int_a^b f . \]

Exercise 5.2.4: Prove the mean value theorem for integrals. That is, prove that if $f \colon [a,b] \to \mathbb{R}$ is continuous, then there exists a $c \in [a,b]$ such that $\int_a^b f = f(c)(b-a)$.

Exercise 5.2.5: Suppose $f \colon [a,b] \to \mathbb{R}$ is a continuous function such that $f(x) \geq 0$ for all $x \in [a,b]$ and $\int_a^b f = 0$. Prove that $f(x) = 0$ for all $x$.

Exercise 5.2.6: Suppose $f \colon [a,b] \to \mathbb{R}$ is a continuous function and $\int_a^b f = 0$. Prove that there exists a $c \in [a,b]$ such that $f(c) = 0$ (compare with the previous exercise).

Exercise 5.2.7: Suppose $f \colon [a,b] \to \mathbb{R}$ and $g \colon [a,b] \to \mathbb{R}$ are continuous functions such that $\int_a^b f = \int_a^b g$. Show that there exists a $c \in [a,b]$ such that $f(c) = g(c)$.

Exercise 5.2.8: Let $f \in \mathcal{R}[a,b]$. Let $\alpha, \beta, \gamma$ be arbitrary numbers in $[a,b]$ (not necessarily ordered in any way). Prove that
\[ \int_\alpha^\gamma f = \int_\alpha^\beta f + \int_\beta^\gamma f . \]
Recall what $\int_a^b f$ means if $b \leq a$.
Exercise 5.2.9: Prove Corollary 5.2.3.

Exercise 5.2.10: Suppose that $f \colon [a,b] \to \mathbb{R}$ has finitely many discontinuities. Show that as a function of $x$ the expression $|f(x)|$ has finitely many discontinuities and is thus Riemann integrable. Then show that
\[ \left| \int_a^b f(x)\,dx \right| \leq \int_a^b |f(x)|\,dx . \]

Exercise 5.2.11 (Hard): Show that the Thomae or popcorn function (see also Example 3.2.12) is Riemann integrable. Therefore, there exists a function discontinuous at all rational numbers (a dense set) that is Riemann integrable. In particular, define $f \colon [0,1] \to \mathbb{R}$ by
\[ f(x) := \begin{cases} 1/k & \text{if } x = m/k \text{ where } m, k \in \mathbb{N} \text{ and } m \text{ and } k \text{ have no common divisors,} \\ 0 & \text{if } x \text{ is irrational.} \end{cases} \]
Show that $\int_0^1 f = 0$.

If $I \subset \mathbb{R}$ is a bounded interval, then the function
\[ \varphi_I(x) := \begin{cases} 1 & \text{if } x \in I, \\ 0 & \text{otherwise,} \end{cases} \]
is called an elementary step function.

Exercise 5.2.12: Let $I$ be an arbitrary bounded interval (you should consider all types of intervals: closed, open, half-open) and $a < b$. Using only the definition of the integral show that the elementary step function $\varphi_I$ is integrable on $[a,b]$, and find the integral in terms of $a$, $b$, and the endpoints of $I$.

When a function $f$ can be written as
\[ f(x) = \sum_{k=1}^n \alpha_k \varphi_{I_k}(x) \]
for some real numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$ and some bounded intervals $I_1, I_2, \ldots, I_n$, then $f$ is called a step function.

Exercise 5.2.13: Using the previous exercise, show that a step function is integrable on any interval $[a,b]$. Furthermore, find the integral in terms of $a$, $b$, the endpoints of $I_k$ and the $\alpha_k$.
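The recipe in the proof of Lemma 5.2.6 (pick $\delta$ from uniform continuity, then a partition with all $\Delta x_i < \delta$) can be followed step by step for a specific function. The sketch below is an illustration under assumptions added here: it uses $f(x) = \sqrt{x}$ on $[0,1]$, for which $|\sqrt{x} - \sqrt{y}| \leq \sqrt{|x-y|}$, so $\delta = \varepsilon^2$ works, and it uses that $f$ is increasing to read off $M_i$ and $m_i$ at the endpoints. The function names are hypothetical.

```python
import math

def darboux_gap_sqrt(n):
    """U(P, f) - L(P, f) for f = sqrt on [0, 1] with n equal subintervals.

    sqrt is increasing, so M_i = sqrt(x_i) and m_i = sqrt(x_{i-1}).
    """
    points = [i / n for i in range(n + 1)]
    return sum((math.sqrt(right) - math.sqrt(left)) * (right - left)
               for left, right in zip(points, points[1:]))

def partition_size_for(eps):
    """A size n with (b - a)/n < delta, using delta = eps**2 and b - a = 1."""
    delta = eps ** 2
    return math.floor(1.0 / delta) + 1

for eps in (0.5, 0.1, 0.01):
    n = partition_size_for(eps)
    # The gap between upper and lower sums should come out below eps.
    print(eps, n, darboux_gap_sqrt(n))
```

For this particular function the gap is roughly $1/n$, far smaller than the $\varepsilon$ the proof guarantees; the proof only needs some partition that works, not the most economical one.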
5.3 Fundamental theorem of calculus

Note: 1.5 lectures

In this section we discuss and prove the fundamental theorem of calculus. This is the one theorem on which the entirety of integral calculus is built, hence the name. The theorem relates the seemingly unrelated concepts of integral and derivative. It tells us how to compute the antiderivative of a function using the integral.

5.3.1 First form of the theorem

Theorem 5.3.1. Let $F \colon [a,b] \to \mathbb{R}$ be a continuous function, differentiable on $(a,b)$. Let $f \in \mathcal{R}[a,b]$ be such that $f(x) = F'(x)$ for $x \in (a,b)$. Then
\[ \int_a^b f = F(b) - F(a) . \]

It is not hard to generalize the theorem to allow a finite number of points in $[a,b]$ where $F$ is not differentiable, as long as it is continuous. This generalization is left as an exercise.

Proof. Let $P$ be a partition of $[a,b]$. For each interval $[x_{i-1},x_i]$, use the mean value theorem to find a $c_i \in [x_{i-1},x_i]$ such that
\[ f(c_i) \Delta x_i = F'(c_i)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1}) . \]
Using the notation from the definition of the integral, $m_i \leq f(c_i) \leq M_i$. Therefore,
\[ m_i \Delta x_i \leq F(x_i) - F(x_{i-1}) \leq M_i \Delta x_i . \]
We now sum over $i = 1, 2, \ldots, n$ to get
\[ \sum_{i=1}^n m_i \Delta x_i \leq \sum_{i=1}^n \bigl( F(x_i) - F(x_{i-1}) \bigr) \leq \sum_{i=1}^n M_i \Delta x_i . \]
We notice that in the middle sum all the intermediate terms cancel and we end up simply with $F(x_n) - F(x_0) = F(b) - F(a)$. The sums on the left and on the right are the lower and the upper sum respectively. So
\[ L(P,f) \leq F(b) - F(a) \leq U(P,f) . \]
We can now take the supremum of $L(P,f)$ over all $P$ and the inequality yields
\[ \underline{\int_a^b} f \leq F(b) - F(a) . \]
Similarly, taking the infimum of $U(P,f)$ over all partitions $P$ yields
\[ F(b) - F(a) \leq \overline{\int_a^b} f . \]
As $f$ is Riemann integrable, we have
\[ \int_a^b f = \underline{\int_a^b} f \leq F(b) - F(a) \leq \overline{\int_a^b} f = \int_a^b f . \]
And we are done as the inequalities must be equalities.

The theorem is often used to solve integrals. Suppose we know that the function $f(x)$ is a derivative of some other function $F(x)$, then we can find an explicit expression for $\int_a^b f$.

Example 5.3.2: Suppose we are trying to compute
\[ \int_0^1 x^2\,dx . \]
We notice that $x^2$ is the derivative of $\frac{x^3}{3}$, therefore we use the fundamental theorem to write down
\[ \int_0^1 x^2\,dx = \frac{1^3}{3} - \frac{0^3}{3} = \frac{1}{3} . \]

5.3.2 Second form of the theorem

The second form of the fundamental theorem gives us a way to solve the differential equation $F'(x) = f(x)$, where $f(x)$ is a known function and we are trying to find an $F$ that satisfies the equation.

Theorem 5.3.3. Let $f \colon [a,b] \to \mathbb{R}$ be a Riemann integrable function. Define
\[ F(x) := \int_a^x f . \]
First, $F$ is continuous on $[a,b]$. Second, if $f$ is continuous at $c \in [a,b]$, then $F$ is differentiable at $c$ and $F'(c) = f(c)$.

Proof. First as $f$ is bounded, there is an $M > 0$ such that $|f(x)| \leq M$. Suppose $x, y \in [a,b]$. Then using an exercise from an earlier section we note
\[ |F(x) - F(y)| = \left| \int_a^x f - \int_a^y f \right| = \left| \int_y^x f \right| \leq M |x-y| . \]
Do note that it does not matter if $x < y$ or $x > y$. Therefore $F$ is Lipschitz continuous and hence continuous.

Now suppose that $f$ is continuous at $c$. Let $\varepsilon > 0$ be given. Let $\delta > 0$ be such that $|x-c| < \delta$ implies $|f(x)-f(c)| < \varepsilon$ for $x \in [a,b]$. In particular for such $x$ we have $f(c) - \varepsilon \leq f(x) \leq f(c) + \varepsilon$. Thus for $x > c$,
\[ \bigl( f(c) - \varepsilon \bigr)(x-c) \leq \int_c^x f \leq \bigl( f(c) + \varepsilon \bigr)(x-c) . \]
When $c > x$ the inequalities are reversed, since then $x - c < 0$ and $\int_c^x f = -\int_x^c f$. In either case, dividing by $x-c$ (with $x \neq c$) gives
\[ f(c) - \varepsilon \leq \frac{\int_c^x f}{x-c} \leq f(c) + \varepsilon . \]
As
\[ \frac{F(x)-F(c)}{x-c} = \frac{\int_a^x f - \int_a^c f}{x-c} = \frac{\int_c^x f}{x-c} , \]
we have that
\[ \left| \frac{F(x)-F(c)}{x-c} - f(c) \right| \leq \varepsilon . \]

Of course, if $f$ is continuous on $[a,b]$, then it is automatically Riemann integrable, $F$ is differentiable on all of $[a,b]$ and $F'(x) = f(x)$ for all $x \in [a,b]$.

Remark 5.3.4. The second form of the fundamental theorem of calculus still holds if we let $d \in [a,b]$ and define
\[ F(x) := \int_d^x f . \]
That is, we can use any point of $[a,b]$ as our base point. The proof is left as an exercise.

A common misunderstanding of the integral for calculus students is to think of integrals whose solution cannot be given in closed form as somehow deficient. This is not the case. Most integrals we write down are not computable in closed form. Plus even some integrals that we consider in closed form are not really. For example, how does a computer find the value of $\ln x$? One way to do it is to simply note that we define the natural log as the antiderivative of $1/x$ such that $\ln 1 = 0$. Therefore,
\[ \ln x := \int_1^x 1/s\,ds . \]
Then we can numerically approximate the integral. So morally, we did not really "simplify" $\int_1^x 1/s\,ds$ by writing down $\ln x$. We simply gave the integral a name. If we require numerical answers, it is possible that we will end up doing the calculation by approximating an integral anyway.
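The remark about approximating $\ln x$ numerically can be made concrete with the Darboux sums from Section 5.1. The following is a minimal sketch, not how any particular library computes logarithms: it uses that $1/s$ is decreasing on $[1,x]$ for $x > 1$, so the infimum on each subinterval is at the right endpoint and the supremum at the left endpoint. The function name `log_bounds` and the choice of a uniform partition are assumptions made for this illustration.

```python
import math

def log_bounds(x, n):
    """Lower and upper Darboux sums of 1/s on [1, x] (x > 1), n equal pieces.

    1/s is decreasing, so the infimum on each piece is at the right endpoint
    and the supremum is at the left endpoint.
    """
    dx = (x - 1.0) / n
    points = [1.0 + i * dx for i in range(n + 1)]
    lower = sum(dx / right for right in points[1:])
    upper = sum(dx / left for left in points[:-1])
    return lower, upper

lower, upper = log_bounds(2.0, 1000)
# The library value of ln 2 should sit between the two sums.
print(lower, math.log(2.0), upper)
```

The gap between the two sums is $\Delta x \,(1 - 1/x)$, so it shrinks in proportion to $1/n$; practical numerical integration would use a faster-converging rule, but the squeeze between lower and upper sums is exactly the definition at work.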
Another common function where integrals cannot be evaluated symbolically is the erf function defined as
\[ \operatorname{erf}(x) := \frac{2}{\sqrt{\pi}} \int_0^x e^{-s^2} \, ds . \]
This function comes up very often in applied mathematics. It is simply the antiderivative of $(2/\sqrt{\pi}) \, e^{-x^2}$ that is zero at zero. The second form of the fundamental theorem tells us that we can write the function as an integral. If we wish to compute any particular value, we numerically approximate the integral.

5.3.3 Change of variables

A theorem often used in calculus to solve integrals is the change of variables theorem. Let us prove it now. Recall that a function is continuously differentiable if it is differentiable and the derivative is continuous.

Theorem 5.3.5 (Change of variables). Let $g \colon [a,b] \to \mathbb{R}$ be a continuously differentiable function. If $g([a,b]) \subset [c,d]$ and $f \colon [c,d] \to \mathbb{R}$ is continuous, then
\[ \int_a^b f\bigl(g(x)\bigr) \, g'(x) \, dx = \int_{g(a)}^{g(b)} f(s) \, ds . \]

Proof. As $g$, $g'$, and $f$ are continuous, we know that $f\bigl(g(x)\bigr) \, g'(x)$ is a continuous function on $[a,b]$, therefore Riemann integrable. Define
\[ F(y) := \int_{g(a)}^y f(s) \, ds . \]
By the second form of the fundamental theorem of calculus (using Exercise 5.3.4 below) $F$ is a differentiable function and $F'(y) = f(y)$. Now we apply the chain rule. Write
\[ (F \circ g)'(x) = F'\bigl(g(x)\bigr) \, g'(x) = f\bigl(g(x)\bigr) \, g'(x) . \]
Next we note that $F\bigl(g(a)\bigr) = 0$ and we use the first form of the fundamental theorem to obtain
\[ \int_{g(a)}^{g(b)} f(s) \, ds = F\bigl(g(b)\bigr) = F\bigl(g(b)\bigr) - F\bigl(g(a)\bigr) = \int_a^b (F \circ g)'(x) \, dx = \int_a^b f\bigl(g(x)\bigr) \, g'(x) \, dx . \]

The substitution theorem is often used to solve integrals by changing them to integrals we know or which we can solve using the fundamental theorem of calculus.
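As a numerical sanity check of the change of variables formula, the following sketch compares both sides for one illustrative choice, $g(x) = x^2$ and $f(s) = \cos(s)$ on $[0, 1.5]$. The helper `midpoint_rule` and the chosen interval are assumptions made for this example only.

```python
import math

def midpoint_rule(func, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))

def g(x):
    return x * x          # continuously differentiable on [0, 1.5]

def g_prime(x):
    return 2.0 * x

f = math.cos              # continuous on g([0, 1.5]) = [0, 2.25]

a, b = 0.0, 1.5
left_side = midpoint_rule(lambda x: f(g(x)) * g_prime(x), a, b)   # integral of f(g(x)) g'(x) dx
right_side = midpoint_rule(f, g(a), g(b))                          # integral of f(s) ds
print(left_side, right_side)
```

Both numbers approximate the same value, which the fundamental theorem identifies as sin(2.25) - sin(0). Agreement on one example is of course not a proof, only a check that the hypotheses were applied correctly.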
Example 5.3.6: From an exercise, we know that the derivative of $\sin(x)$ is $\cos(x)$. Therefore we can solve
\[ \int_0^{\sqrt{\pi}} x \cos(x^2) \, dx = \int_0^{\pi} \frac{\cos(s)}{2} \, ds = \frac{1}{2} \int_0^{\pi} \cos(s) \, ds = \frac{\sin(\pi) - \sin(0)}{2} = 0 . \]

However, beware that we must satisfy the hypotheses of the theorem. The following example demonstrates a common mistake for students of calculus. We must not simply move symbols around; we should always be careful that those symbols really make sense.

Example 5.3.7: Suppose we write down
\[ \int_{-1}^{1} \frac{\ln |x|}{x} \, dx . \]
It may be tempting to take $g(x) := \ln |x|$. Then take $g'(x) = \frac{1}{x}$ and try to write
\[ \int_{g(-1)}^{g(1)} s \, ds = \int_0^0 s \, ds = 0 . \]
This "solution" is not correct, and it does not say that we can solve the given integral. The first problem is that $\frac{\ln |x|}{x}$ is not Riemann integrable on $[-1,1]$ (it is unbounded). The integral we wrote down simply does not make sense. Secondly, $\frac{\ln |x|}{x}$ is not even continuous on $[-1,1]$. Finally, $g$ is not continuous on $[-1,1]$ either.

5.3.4 Exercises

Exercise 5.3.1: Compute $\dfrac{d}{dx} \left( \int_{-x}^{x} e^{s^2} \, ds \right)$.

Exercise 5.3.2: Compute $\dfrac{d}{dx} \left( \int_{0}^{x^2} \sin(s^2) \, ds \right)$.

Exercise 5.3.3: Suppose $F \colon [a,b] \to \mathbb{R}$ is continuous and differentiable on $[a,b] \setminus S$, where $S$ is a finite set. Suppose there exists an $f \in \mathcal{R}[a,b]$ such that $f(x) = F'(x)$ for $x \in [a,b] \setminus S$. Show that $\int_a^b f = F(b) - F(a)$.

Exercise 5.3.4: Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Let $c \in [a,b]$ be arbitrary. Define
\[ F(x) := \int_c^x f . \]
Prove that $F$ is differentiable and that $F'(x) = f(x)$ for all $x \in [a,b]$.
Exercise 5.3.5: Prove integration by parts. That is, suppose that $F$ and $G$ are differentiable functions on $[a,b]$ and suppose that $F'$ and $G'$ are Riemann integrable. Then prove
\[ \int_a^b F(x) G'(x) \, dx = F(b)G(b) - F(a)G(a) - \int_a^b F'(x) G(x) \, dx . \]

Exercise 5.3.6: Suppose that $F$ and $G$ are differentiable functions defined on $[a,b]$ such that $F'(x) = G'(x)$ for all $x \in [a,b]$. Show that $F$ and $G$ differ by a constant. That is, show that there exists a $C \in \mathbb{R}$ such that $F(x) - G(x) = C$.

The next exercise shows how we can use the integral to "smooth out" a nondifferentiable function.

Exercise 5.3.7: Let $f \colon [a,b] \to \mathbb{R}$ be a continuous function. Let $\varepsilon > 0$ be a constant. For $x \in [a + \varepsilon, b - \varepsilon]$, define
\[ g(x) := \frac{1}{2\varepsilon} \int_{x-\varepsilon}^{x+\varepsilon} f . \]
(i) Show that $g$ is differentiable and find the derivative.
(ii) Let $f$ be differentiable and fix $x \in (a,b)$ (and let $\varepsilon$ be small enough). What happens to $g'(x)$ as $\varepsilon$ gets smaller?
(iii) Find $g$ for $f(x) := |x|$, $\varepsilon = 1$ (you can assume that $[a,b]$ is large enough).

Exercise 5.3.8: Suppose that $f \colon [a,b] \to \mathbb{R}$ is continuous. Suppose that $\int_a^x f = \int_x^b f$ for all $x \in [a,b]$. Show that $f(x) = 0$ for all $x \in [a,b]$.

Exercise 5.3.9: Suppose that $f \colon [a,b] \to \mathbb{R}$ is continuous and $\int_a^x f = 0$ for all rational $x$ in $[a,b]$. Show that $f(x) = 0$ for all $x \in [a,b]$.
Chapter 6

Sequences of Functions

6.1 Pointwise and uniform convergence

Note: 1.5 lecture

Up till now when we have talked about sequences we always talked about sequences of numbers. However, a very useful concept in analysis is to use a sequence of functions. For example, many times a solution to some differential equation is found by finding approximate solutions only. Then the real solution is some sort of limit of those approximate solutions.

The tricky part is that when talking about sequences of functions, there is not a single notion of a limit. We will talk about two common notions of a limit of a sequence of functions.

6.1.1 Pointwise convergence

Definition 6.1.1. Let $f_n \colon S \to \mathbb{R}$ be functions. We say the sequence $\{f_n\}$ converges pointwise to $f \colon S \to \mathbb{R}$, if for every $x \in S$ we have
\[ f(x) = \lim_{n\to\infty} f_n(x) . \]

It is common to say that $f_n \colon S \to \mathbb{R}$ converges to $f$ on $T \subset \mathbb{R}$ for some $f \colon T \to \mathbb{R}$. In that case we, of course, mean that $f(x) = \lim f_n(x)$ for every $x \in T$. We simply mean that the restrictions of $f_n$ to $T$ converge pointwise to $f$.

Example 6.1.2: The sequence of functions $f_n(x) := x^{2n}$ converges to $f \colon [-1,1] \to \mathbb{R}$ on $[-1,1]$, where
\[ f(x) = \begin{cases} 1 & \text{if } x = -1 \text{ or } x = 1, \\ 0 & \text{otherwise.} \end{cases} \]
See Figure 6.1.
Figure 6.1: Graphs of $f_1$, $f_2$, $f_3$, and $f_8$ for $f_n(x) := x^{2n}$.

To see that this is so, first take $x \in (-1,1)$. Then $x^2 < 1$. We have seen before that
\[ |x^{2n} - 0| = (x^2)^n \to 0 \quad \text{as } n \to \infty . \]
Therefore $\lim f_n(x) = 0$. When $x = 1$ or $x = -1$, then $x^{2n} = 1$ and hence $\lim f_n(x) = 1$. We also note that $f_n(x)$ does not converge for any other $x$.

Often, functions are given as a series. In this case, we simply use the notion of pointwise convergence to find the values of the function.

Example 6.1.3: We write
\[ \sum_{k=0}^{\infty} x^k \]
to denote the limit of the functions
\[ f_n(x) := \sum_{k=0}^{n} x^k . \]
When studying series, we have seen that on $x \in (-1,1)$ the $f_n$ converge pointwise to
\[ \frac{1}{1-x} . \]
The subtle point here is that while $\frac{1}{1-x}$ is defined for all $x \neq 1$, and $f_n$ are defined for all $x$ (even at $x = 1$), convergence only happens on $(-1,1)$. Therefore, when we write
\[ f(x) := \sum_{k=0}^{\infty} x^k \]
we mean that $f$ is defined on $(-1,1)$ and is the pointwise limit of the partial sums.
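The pointwise convergence of the partial sums in Example 6.1.3 can be observed numerically. The sketch below is illustrative only; the name `partial_sum` and the sample points are choices made here, not part of the text.

```python
def partial_sum(x, n):
    """f_n(x) = x**0 + x**1 + ... + x**n, the n-th partial sum of the geometric series."""
    return sum(x ** k for k in range(n + 1))

# Inside (-1, 1) the partial sums approach 1 / (1 - x).
for x in (0.5, -0.5, 0.9):
    print(x, partial_sum(x, 200), 1.0 / (1.0 - x))

# At x = 1 every f_n is defined, but f_n(1) = n + 1 has no finite limit.
print(partial_sum(1.0, 200))
```

The last line illustrates the subtle point of the example: each $f_n$ makes sense at $x = 1$, yet the sequence of values there does not converge.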
Example 6.1.4: Let $f_n(x) := \sin(xn)$. Then $f_n$ does not converge pointwise to any function on any interval. It may converge at certain points, such as when $x = 0$ or $x = \pi$. It is left as an exercise that in any interval $[a,b]$, there exists an $x$ such that $\sin(xn)$ does not have a limit as $n$ goes to infinity.

Before we move to uniform convergence, let us reformulate pointwise convergence in a different way. We leave the proof to the reader; it is a simple application of the definition of convergence of a sequence of real numbers.

Proposition 6.1.5. Let $f_n \colon S \to \mathbb{R}$ and $f \colon S \to \mathbb{R}$ be functions. Then $\{f_n\}$ converges pointwise to $f$ if and only if for every $x \in S$, and every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that
\[ |f_n(x) - f(x)| < \varepsilon \]
for all $n \geq N$.

The key point here is that $N$ can depend on $x$, not just on $\varepsilon$. That is, for each $x$ we can pick a different $N$. If we could pick one $N$ for all $x$, we would have what is called uniform convergence.

6.1.2 Uniform convergence

Definition 6.1.6. Let $f_n \colon S \to \mathbb{R}$ be functions. We say the sequence $\{f_n\}$ converges uniformly to $f \colon S \to \mathbb{R}$, if for every $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that for all $n \geq N$ we have
\[ |f_n(x) - f(x)| < \varepsilon \quad \text{for all } x \in S . \]

Note the fact that $N$ now cannot depend on $x$. Given $\varepsilon > 0$ we must find an $N$ that works for all $x \in S$. Because of Proposition 6.1.5 we easily see that uniform convergence implies pointwise convergence.

Proposition 6.1.7. Let $f_n \colon S \to \mathbb{R}$ be a sequence of functions that converges uniformly to $f \colon S \to \mathbb{R}$. Then $\{f_n\}$ converges pointwise to $f$.

The converse does not hold.

Example 6.1.8: The functions $f_n(x) := x^{2n}$ do not converge uniformly on $[-1,1]$, even though they converge pointwise. To see this, suppose for contradiction that they did. Take $\varepsilon := 1/2$; then there would have to exist an $N$ such that $x^{2N} < 1/2$ for all $x \in [0,1)$ (as $f_n(x)$ converges to 0 on $(-1,1)$). But that means that for any sequence $\{x_k\}$ in $[0,1)$ such that $\lim x_k = 1$ we have $x_k^{2N} < 1/2$. On the other hand $x^{2N}$ is a continuous function of $x$ (it is a polynomial), therefore we obtain a contradiction
\[ 1 = 1^{2N} = \lim_{k\to\infty} x_k^{2N} \leq 1/2 . \]

However, if we restrict our domain to $[-a,a]$ where $0 < a < 1$, then $f_n$ converges uniformly to 0 on $[-a,a]$. Again to see this note that $a^{2n} \to 0$ as $n \to \infty$. Thus given $\varepsilon > 0$, pick $N \in \mathbb{N}$ such that $a^{2n} < \varepsilon$ for all $n \geq N$. Then for any $x \in [-a,a]$ we have $|x| \leq a$. Therefore, for $n \geq N$
\[ |x^{2n}| = |x|^{2n} \leq a^{2n} < \varepsilon . \]
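The contrast in Example 6.1.8 can be seen numerically. In the sketch below, `witness` is a hypothetical helper that produces, for each n, a point of [0, 1) where $x^{2n}$ equals 1/2 (up to rounding), while on $[-a, a]$ the largest value of $x^{2n}$ is $a^{2n}$. The choice a = 0.9 is only an illustration.

```python
def f(n, x):
    """f_n(x) = x**(2n)."""
    return x ** (2 * n)

def witness(n):
    """A point x_n in [0, 1) with f_n(x_n) = 1/2 up to rounding.

    Its existence shows sup |f_n - 0| >= 1/2 on [0, 1) for every n.
    """
    return 0.5 ** (1.0 / (2 * n))

a = 0.9
for n in (1, 10, 100, 1000):
    # On [-a, a] the supremum of |f_n| is a**(2n), which tends to 0.
    print(n, a ** (2 * n), witness(n), f(n, witness(n)))
```

The second column shrinks to zero, matching uniform convergence on $[-a,a]$; the last column stays near 1/2 however large n is, so no single N can work on all of $[0,1)$.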
6.1.3 Convergence in uniform norm

For bounded functions there is another more abstract way to think of uniform convergence. To every bounded function we can assign a certain nonnegative number (called the uniform norm). This number measures the "distance" of the function from 0. Then we can "measure" how far two functions are from each other. We can then simply translate a statement about uniform convergence into a statement of a certain sequence of real numbers converging to zero.

Definition 6.1.9. Let $f \colon S \to \mathbb{R}$ be a bounded function. Define
\[ \|f\|_u := \sup \{ |f(x)| : x \in S \} . \]
$\|\cdot\|_u$ is called the uniform norm.

Proposition 6.1.10. A sequence of bounded functions $f_n \colon S \to \mathbb{R}$ converges uniformly to $f \colon S \to \mathbb{R}$, if and only if
\[ \lim_{n\to\infty} \|f_n - f\|_u = 0 . \]

Proof. First suppose that $\lim \|f_n - f\|_u = 0$. Let $\varepsilon > 0$ be given. Then there exists an $N$ such that for $n \geq N$ we have $\|f_n - f\|_u < \varepsilon$. As $\|f_n - f\|_u$ is the supremum of $|f_n(x) - f(x)|$, we see that for all $x$ we have $|f_n(x) - f(x)| < \varepsilon$.

On the other hand, suppose that $f_n$ converges uniformly to $f$. Let $\varepsilon > 0$ be given. Then find $N$ such that for $n \geq N$ we have $|f_n(x) - f(x)| < \varepsilon$ for all $x \in S$. Taking the supremum we see that $\|f_n - f\|_u \leq \varepsilon$. Hence $\lim \|f_n - f\|_u = 0$.

Sometimes it is said that $f_n$ converges to $f$ in uniform norm instead of converges uniformly. The proposition says that the two notions are the same thing.

Example 6.1.11: Let $f_n \colon [0,1] \to \mathbb{R}$ be defined by $f_n(x) := \frac{nx + \sin(nx^2)}{n}$. Then we claim that $f_n$ converge uniformly to $f(x) := x$. Let us compute:
\[ \|f_n - f\|_u = \sup \left\{ \left| \frac{nx + \sin(nx^2)}{n} - x \right| : x \in [0,1] \right\} = \sup \left\{ \frac{|\sin(nx^2)|}{n} : x \in [0,1] \right\} \leq \sup \{ 1/n : x \in [0,1] \} = 1/n . \]

Using the uniform norm, we can define Cauchy sequences in a similar way as Cauchy sequences of real numbers.
Definition 6.1.12. Let $f_n \colon S \to \mathbb{R}$ be bounded functions. We say that the sequence is Cauchy in the uniform norm or uniformly Cauchy if for every $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that for $m, k \geq N$ we have
\[ \|f_m - f_k\|_u < \varepsilon . \]

Proposition 6.1.13. Let $f_n \colon S \to \mathbb{R}$ be bounded functions. Then $\{f_n\}$ is Cauchy in the uniform norm if and only if there exists an $f \colon S \to \mathbb{R}$ such that $\{f_n\}$ converges uniformly to $f$.

Proof. Let us first suppose that $\{f_n\}$ is Cauchy in the uniform norm. Let us define $f$. Fix $x$; then the sequence $\{f_n(x)\}$ is Cauchy because
\[ |f_m(x) - f_k(x)| \leq \|f_m - f_k\|_u . \]
Thus $\{f_n(x)\}$ converges to some real number, so define
\[ f(x) := \lim_{n\to\infty} f_n(x) . \]
Therefore, $f_n$ converges pointwise to $f$. To show that convergence is uniform, let $\varepsilon > 0$ be given and find an $N$ such that for $m, k \geq N$ we have $\|f_m - f_k\|_u < \varepsilon/2$. Again this implies that for all $x$ we have $|f_m(x) - f_k(x)| < \varepsilon/2$. Now we can simply take the limit as $k$ goes to infinity. Then $|f_m(x) - f_k(x)|$ goes to $|f_m(x) - f(x)|$. Therefore for all $x$ and all $m \geq N$ we get
\[ |f_m(x) - f(x)| \leq \varepsilon/2 < \varepsilon . \]
And hence $f_n$ converges uniformly.

For the other direction, suppose that $\{f_n\}$ converges uniformly to $f$. Given $\varepsilon > 0$, find $N$ such that for all $n \geq N$ we have $|f_n(x) - f(x)| < \varepsilon/4$ for all $x \in S$. Therefore for all $m, k \geq N$ we have
\[ |f_m(x) - f_k(x)| = |f_m(x) - f(x) + f(x) - f_k(x)| \leq |f_m(x) - f(x)| + |f(x) - f_k(x)| < \varepsilon/4 + \varepsilon/4 . \]
We can now take the supremum over all $x$ to obtain
\[ \|f_m - f_k\|_u \leq \varepsilon/2 < \varepsilon . \]
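The uniform norm and the uniformly Cauchy condition can be estimated numerically for the sequence $f_n(x) = (nx + \sin(nx^2))/n$ on $[0,1]$. The sketch below samples an evenly spaced grid, so it only gives a lower estimate of each supremum; the helper names and the grid size are assumptions made for this illustration.

```python
import math

def f(n, x):
    """f_n(x) = (n x + sin(n x^2)) / n on [0, 1]."""
    return (n * x + math.sin(n * x * x)) / n

def grid_sup(func, samples=10_001):
    """Largest |func(x)| over an evenly spaced grid in [0, 1] (a lower estimate of the sup)."""
    return max(abs(func(i / (samples - 1))) for i in range(samples))

def distance_to_limit(n):
    """Grid estimate of the uniform norm of f_n - f, where f(x) = x."""
    return grid_sup(lambda x: f(n, x) - x)

def distance_between(m, k):
    """Grid estimate of the uniform norm of f_m - f_k."""
    return grid_sup(lambda x: f(m, x) - f(k, x))

for n in (1, 10, 100):
    print(n, distance_to_limit(n), 1.0 / n)
print(distance_between(50, 80), 1.0 / 50 + 1.0 / 80)
```

The first loop is consistent with the bound 1/n on the distance to the limit, and the last line with the triangle-inequality bound 1/m + 1/k, which is the estimate used in the second half of the proof above.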
6.1.4 Exercises
Exercise 6.1.1: Let $f$ and $g$ be bounded functions on $[a,b]$. Show that
\[ \|f + g\|_u \leq \|f\|_u + \|g\|_u . \]

Exercise 6.1.2:
a) Find the pointwise limit of $\dfrac{e^{x/n}}{n}$ for $x \in \mathbb{R}$.
b) Is the limit uniform on $\mathbb{R}$?
c) Is the limit uniform on $[0,1]$?
Exercise 6.1.3: Suppose $f_n \colon S \to \mathbb{R}$ are functions that converge uniformly to $f \colon S \to \mathbb{R}$. Suppose that $A \subset S$. Show that the restrictions $f_n|_A$ converge uniformly to $f|_A$.

Exercise 6.1.4: Suppose that $\{f_n\}$ and $\{g_n\}$ defined on some set $A$ converge to $f$ and $g$ respectively pointwise. Show that $\{f_n + g_n\}$ converges pointwise to $f + g$.

Exercise 6.1.5: Suppose that $\{f_n\}$ and $\{g_n\}$ defined on some set $A$ converge to $f$ and $g$ respectively uniformly on $A$. Show that $\{f_n + g_n\}$ converges uniformly to $f + g$ on $A$.

Exercise 6.1.6: Find an example of sequences of functions $\{f_n\}$ and $\{g_n\}$ that converge uniformly to some $f$ and $g$ on some set $A$, but such that $f_n g_n$ (the product) does not converge uniformly to $fg$ on $A$. Hint: Let $A := \mathbb{R}$, let $f(x) := g(x) := x$. You can even pick $f_n = g_n$.

Exercise 6.1.7: Suppose that there exists a sequence of functions $\{g_n\}$ uniformly converging to 0 on $A$. Now suppose that we have a sequence of functions $f_n$ and a function $f$ on $A$ such that
\[ |f_n(x) - f(x)| \leq g_n(x) \]
for all $x \in A$. Show that $f_n$ converges uniformly to $f$ on $A$.

Exercise 6.1.8: Let $\{f_n\}$, $\{g_n\}$ and $\{h_n\}$ be sequences of functions on $[a,b]$. Suppose that $f_n$ and $h_n$ converge uniformly to some function $f \colon [a,b] \to \mathbb{R}$ and suppose that $f_n(x) \leq g_n(x) \leq h_n(x)$ for all $x \in [a,b]$. Show that $g_n$ converges uniformly to $f$.

Exercise 6.1.9: Let $f_n \colon [0,1] \to \mathbb{R}$ be a sequence of increasing functions (that is, $f_n(x) \geq f_n(y)$ whenever $x \geq y$). Suppose that $f_n(0) = 0$ and that $\lim_{n\to\infty} f_n(1) = 0$. Show that $f_n$ converges uniformly to 0.

Exercise 6.1.10: Let $\{f_n\}$ be a sequence of functions defined on $[0,1]$. Suppose that there exists a sequence of numbers $x_n \in [0,1]$ such that
\[ f_n(x_n) = 1 . \]
Prove or disprove the following statements.
a) True or false: There exists $\{f_n\}$ as above that converges to 0 pointwise.
b) True or false: There exists $\{f_n\}$ as above that converges to 0 uniformly on $[0,1]$.
6.2 Interchange of limits
Note: 1.5 lectures

Large parts of modern analysis deal mainly with the question of the interchange of two limiting operations. It is easy to see that when we have a chain of two limits, we cannot always just swap the limits. For example,
\[ 0 = \lim_{n\to\infty} \lim_{k\to\infty} \frac{n/k}{n/k + 1} \neq \lim_{k\to\infty} \lim_{n\to\infty} \frac{n/k}{n/k + 1} = 1 . \]

When talking about sequences of functions, interchange of limits comes up quite often. We treat two cases. First we look at continuity of the limit, and second we will look at the integral of the limit.
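The two iterated limits in the example above can be verified by first simplifying the fraction. The following is only a worked restatement of that computation, not an additional result:
\[ \frac{n/k}{n/k + 1} = \frac{n}{n + k}, \qquad \lim_{k\to\infty} \frac{n}{n+k} = 0 \ \text{ for each fixed } n, \qquad \lim_{n\to\infty} \frac{n}{n+k} = 1 \ \text{ for each fixed } k . \]
Taking the remaining limit of the constant 0 (respectively the constant 1) gives the two different values 0 and 1.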
6.2.1 Continuity of the limit
If we have a sequence of continuous functions, is the limit continuous? Suppose that $f$ is the (pointwise) limit of $f_n$. If $x_k \to x$, we are interested in the following interchange of limits. The equality we have to prove (it is not always true) is marked with a question mark.
\[ \lim_{k\to\infty} f(x_k) = \lim_{k\to\infty} \lim_{n\to\infty} f_n(x_k) \overset{?}{=} \lim_{n\to\infty} \lim_{k\to\infty} f_n(x_k) = \lim_{n\to\infty} f_n(x) = f(x) . \]
In particular, we wish to find conditions on the sequence $\{f_n\}$ so that the above equation holds. It turns out that if we simply require pointwise convergence, then the limit of a sequence of functions need not be continuous, and the above equation need not hold.

Example 6.2.1: Let $f_n \colon [0,1] \to \mathbb{R}$ be defined as
\[ f_n(x) := \begin{cases} 1 - nx & \text{if } x < 1/n, \\ 0 & \text{if } x \geq 1/n. \end{cases} \]
See Figure 6.2. Each function $f_n$ is continuous. Now fix an $x \in (0,1]$. Note that for $n > 1/x$ we have $x > 1/n$. Therefore for $n > 1/x$ we have $f_n(x) = 0$. Thus
\[ \lim_{n\to\infty} f_n(x) = 0 . \]
On the other hand if $x = 0$, then
\[ \lim_{n\to\infty} f_n(0) = \lim_{n\to\infty} 1 = 1 . \]
Thus the pointwise limit of $f_n$ is the function $f \colon [0,1] \to \mathbb{R}$ defined by
\[ f(x) := \begin{cases} 1 & \text{if } x = 0, \\ 0 & \text{if } x > 0. \end{cases} \]
The function $f$ is not continuous at 0.
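A small sketch of Example 6.2.1 may help to see why the limit fails to be continuous at 0. The function names below are illustrative; the code simply evaluates $f_n$ at a fixed positive point, at 0, and at the moving point $1/(2n)$.

```python
def f_n(n, x):
    """f_n(x) = 1 - n x for x < 1/n and 0 for x >= 1/n, on [0, 1]."""
    return 1.0 - n * x if x < 1.0 / n else 0.0

def pointwise_limit(x):
    """The pointwise limit: 1 at x = 0 and 0 for x > 0."""
    return 1.0 if x == 0 else 0.0

for n in (1, 10, 100, 1000):
    # Fixed x = 0.01: the value is eventually 0.  At x = 0 the value is always 1.
    # At the moving point 1/(2n) the value stays near 1/2, so convergence is not uniform.
    print(n, f_n(n, 0.01), f_n(n, 0.0), f_n(n, 1.0 / (2 * n)))
```

Each $f_n$ is continuous, yet the values at 0 and at any fixed positive point separate in the limit, which is exactly the failed interchange of limits discussed above.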
Figure 6.2: Graph of $f_n(x)$.

If we, however, require the convergence to be uniform, the limits can be interchanged.

Theorem 6.2.2. Let $f_n \colon [a,b] \to \mathbb{R}$ be a sequence of continuous functions. Suppose that $\{f_n\}$ converges uniformly to $f \colon [a,b] \to \mathbb{R}$. Then $f$ is continuous.

Proof. Let $x \in [a,b]$ be fixed. Let $\{x_n\}$ be a sequence in $[a,b]$ converging to $x$.

Let $\varepsilon > 0$ be given. As $f_k$ converges uniformly to $f$, we find a $k \in \mathbb{N}$ such that
\[ |f_k(y) - f(y)| < \varepsilon/3 \]
for all $y \in [a,b]$. As $f_k$ is continuous at $x$, we can find an $N \in \mathbb{N}$ such that for $m \geq N$ we have
\[ |f_k(x_m) - f_k(x)| < \varepsilon/3 . \]
Thus for $m \geq N$ we have
\[ |f(x_m) - f(x)| = |f(x_m) - f_k(x_m) + f_k(x_m) - f_k(x) + f_k(x) - f(x)| \leq |f(x_m) - f_k(x_m)| + |f_k(x_m) - f_k(x)| + |f_k(x) - f(x)| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon . \]
Therefore $\{f(x_m)\}$ converges to $f(x)$ and hence $f$ is continuous at $x$. As $x$ was arbitrary, $f$ is continuous everywhere.

6.2.2 Integral of the limit

Again, if we simply require pointwise convergence, then the integral of a limit of a sequence of functions need not be the limit of the integrals.
Example 6.2.3: Let $f_n \colon [0,1] \to \mathbb{R}$ be defined as
\[ f_n(x) := \begin{cases} 0 & \text{if } x = 0, \\ n - n^2 x & \text{if } 0 < x < 1/n, \\ 0 & \text{if } x \geq 1/n. \end{cases} \]
See Figure 6.3.

Figure 6.3: Graph of $f_n(x)$.

Each $f_n$ is Riemann integrable (it is bounded and continuous on $(0,1]$). Furthermore it is easy to compute that
\[ \int_0^1 f_n = \int_0^{1/n} f_n = 1/2 . \]
Let us compute the pointwise limit of $f_n$. Now fix an $x \in (0,1]$. For $n > 1/x$ we have $x > 1/n$ and thus $f_n(x) = 0$. Therefore
\[ \lim_{n\to\infty} f_n(x) = 0 . \]
We also have $f_n(0) = 0$ for all $n$. Therefore the pointwise limit of $\{f_n\}$ is the zero function. Thus
\[ 1/2 = \lim_{n\to\infty} \int_0^1 f_n(x) \, dx \neq \int_0^1 \left( \lim_{n\to\infty} f_n(x) \right) dx = \int_0^1 0 \, dx = 0 . \]

But, as for continuity, if we require the convergence to be uniform, the limits can be interchanged.

Theorem 6.2.4. Let $f_n \colon [a,b] \to \mathbb{R}$ be a sequence of Riemann integrable functions. Suppose that $\{f_n\}$ converges uniformly to $f \colon [a,b] \to \mathbb{R}$. Then $f$ is Riemann integrable and
\[ \int_a^b f = \lim_{n\to\infty} \int_a^b f_n . \]
Proof. Let $\varepsilon > 0$ be given. As $f_n$ goes to $f$ uniformly, we can find an $M \in \mathbb{N}$ such that for all $n \geq M$ we have $|f_n(x) - f(x)| < \frac{\varepsilon}{2(b-a)}$ for all $x \in [a,b]$. Note that $f_n$ is integrable and compute
\[ \begin{aligned}
\overline{\int_a^b} f - \underline{\int_a^b} f
&= \overline{\int_a^b} \bigl( f(x) - f_n(x) + f_n(x) \bigr) \, dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) + f_n(x) \bigr) \, dx \\
&= \overline{\int_a^b} \bigl( f(x) - f_n(x) \bigr) \, dx + \overline{\int_a^b} f_n(x) \, dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) \bigr) \, dx - \underline{\int_a^b} f_n(x) \, dx \\
&= \overline{\int_a^b} \bigl( f(x) - f_n(x) \bigr) \, dx + \int_a^b f_n(x) \, dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) \bigr) \, dx - \int_a^b f_n(x) \, dx \\
&= \overline{\int_a^b} \bigl( f(x) - f_n(x) \bigr) \, dx - \underline{\int_a^b} \bigl( f(x) - f_n(x) \bigr) \, dx \\
&\leq \frac{\varepsilon}{2(b-a)} (b-a) + \frac{\varepsilon}{2(b-a)} (b-a) = \varepsilon .
\end{aligned} \]
The inequality follows from Proposition 5.1.8 and using the fact that for all $x \in [a,b]$ we have $\frac{-\varepsilon}{2(b-a)} < f(x) - f_n(x) < \frac{\varepsilon}{2(b-a)}$. As $\varepsilon > 0$ was arbitrary, $f$ is Riemann integrable.

Now we can compute $\int_a^b f$. We will apply Proposition 5.1.10 in the calculation. Again, for $n \geq M$ ($M$ is the same as above) we have
\[ \left| \int_a^b f - \int_a^b f_n \right| = \left| \int_a^b \bigl( f(x) - f_n(x) \bigr) \, dx \right| \leq \frac{\varepsilon}{2(b-a)} (b-a) = \frac{\varepsilon}{2} < \varepsilon . \]
Therefore $\left\{ \int_a^b f_n \right\}$ converges to $\int_a^b f$.

Example 6.2.5: Suppose we wish to compute
\[ \lim_{n\to\infty} \int_0^1 \frac{nx + \sin(nx^2)}{n} \, dx . \]
It is impossible to compute the integrals for any particular $n$ using calculus as $\sin(nx^2)$ has no closed-form antiderivative. However, we can compute the limit. We have shown before that $\frac{nx + \sin(nx^2)}{n}$ converges uniformly on $[0,1]$ to the function $f(x) := x$. By Theorem 6.2.4, the limit exists and
\[ \lim_{n\to\infty} \int_0^1 \frac{nx + \sin(nx^2)}{n} \, dx = \int_0^1 x \, dx = 1/2 . \]

Example 6.2.6: If convergence is only pointwise, the limit need not even be Riemann integrable. For example, on $[0,1]$ define
\[ f_n(x) := \begin{cases} 1 & \text{if } x = p/q \text{ in lowest terms and } q \leq n, \\ 0 & \text{otherwise.} \end{cases} \]
As $f_n$ differs from the zero function at finitely many points (there are only finitely many fractions in $[0,1]$ with denominator less than or equal to $n$), $f_n$ is integrable and $\int_0^1 f_n = \int_0^1 0 = 0$. It is an easy exercise to show that $f_n$ converges pointwise to the Dirichlet function
\[ f(x) := \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{otherwise,} \end{cases} \]
which is not Riemann integrable.

6.2.3 Exercises

Exercise 6.2.1: While uniform convergence can preserve continuity, it does not preserve differentiability. Find an explicit example of a sequence of differentiable functions on $[-1,1]$ that converge uniformly to a function $f$ such that $f$ is not differentiable. Hint: Consider $|x|^{1+1/n}$, show that these functions are differentiable, converge uniformly, and then show that the limit is not differentiable.

Exercise 6.2.2: Let $f_n(x) = \frac{x^n}{n}$. Show that $f_n$ converges uniformly to a differentiable function $f$ on $[0,1]$ (find $f$). However, show that $f'(1) \neq \lim_{n\to\infty} f_n'(1)$.

Note: The previous two exercises show that we cannot simply swap limits with derivatives, even if the convergence is uniform. See also Exercise 6.2.7 below.

Exercise 6.2.3: Let $f \colon [0,1] \to \mathbb{R}$ be a Riemann integrable (hence bounded) function. Find $\lim_{n\to\infty} \int_0^1 \frac{f(x)}{n} \, dx$.

Exercise 6.2.4: Show $\lim_{n\to\infty} \int_1^2 e^{-nx^2} \, dx = 0$. Feel free to use what you know about the exponential function from calculus.

Exercise 6.2.5: Find an example of a sequence of continuous functions on $(0,1)$ that converges pointwise to a continuous function on $(0,1)$, but the convergence is not uniform.

Note: In the previous exercise, $(0,1)$ was picked for simplicity. For a more challenging exercise, replace $(0,1)$ with $[0,1]$.

Exercise 6.2.6: True/False; prove or find a counterexample to the following statement: If $\{f_n\}$ is a sequence of everywhere discontinuous functions on $[0,1]$ that converge uniformly to a function $f$, then $f$ is everywhere discontinuous.

Exercise 6.2.7: For a continuously differentiable function $f \colon [a,b] \to \mathbb{R}$, define
\[ \|f\|_{C^1} := \|f\|_u + \|f'\|_u . \]
Suppose that $\{f_n\}$ is a sequence of continuously differentiable functions such that for every $\varepsilon > 0$, there exists an $M$ such that for all $n, k \geq M$ we have
\[ \|f_n - f_k\|_{C^1} < \varepsilon . \]
Show that $\{f_n\}$ converges uniformly to some continuously differentiable function $f \colon [a,b] \to \mathbb{R}$.
For the following two exercises let us define for a Riemann integrable function $f \colon [0,1] \to \mathbb{R}$ the following number
\[ \|f\|_{L^1} := \int_0^1 |f(x)| \, dx . \]
(It is true that $|f|$ is always integrable if $f$ is, even if we have not proved that fact.) This norm defines another very common type of convergence called the $L^1$-convergence, which is however a bit more subtle.

Exercise 6.2.8: Suppose that $\{f_n\}$ is a sequence of Riemann integrable functions on $[0,1]$ that converge uniformly to 0. Show that
\[ \lim_{n\to\infty} \|f_n\|_{L^1} = 0 . \]

Exercise 6.2.9: Find a sequence of Riemann integrable functions $\{f_n\}$ on $[0,1]$ that converge pointwise to 0, but
\[ \lim_{n\to\infty} \|f_n\|_{L^1} \text{ does not exist (is } \infty\text{).} \]

Exercise 6.2.10 (Hard): Prove Dini's theorem: Let $f_n \colon [a,b] \to \mathbb{R}$ be a sequence of continuous functions such that
\[ 0 \leq f_{n+1}(x) \leq f_n(x) \leq \cdots \leq f_1(x) \quad \text{for all } n \in \mathbb{N} . \]
Suppose that $f_n$ converges pointwise to 0. Show that $f_n$ converges to zero uniformly.

Exercise 6.2.11: Suppose that $f_n \colon [a,b] \to \mathbb{R}$ is a sequence of continuous functions that converges pointwise to a continuous $f \colon [a,b] \to \mathbb{R}$. Suppose that for any $x \in [a,b]$ the sequence $\{ |f_n(x) - f(x)| \}$ is monotone. Show that the sequence $\{f_n\}$ converges uniformly.

Exercise 6.2.12: Find a sequence of Riemann integrable functions $f_n \colon [0,1] \to \mathbb{R}$ such that $f_n$ converges to the zero function pointwise and such that
a) $\left\{ \int_0^1 f_n \right\}$ increases without bound,
b) $\left\{ \int_0^1 f_n \right\}$ is the sequence $-1, 1, -1, 1, -1, 1, \ldots$.
6.3 Picard's theorem

Note: 1.5–2 lectures

A course such as this one should have a pièce de résistance caliber theorem. We pick a theorem whose proof combines everything we have learned. It is more sophisticated than the fundamental theorem of calculus, the first highlight theorem of this course. The theorem we are talking about is Picard's theorem∗ on existence and uniqueness of a solution to an ordinary differential equation. Both the statement and the proof are beautiful examples of what one can do with all that we have learned. It is also a good example of how analysis is applied as differential equations are indispensable in science.

6.3.1 First order ordinary differential equation

Modern science is described in the language of differential equations. That is, equations that involve not only the unknown, but also its derivatives. The simplest nontrivial form of a differential equation is the so-called first order ordinary differential equation
\[ y' = F(x, y) . \]
Generally we also specify that $y(x_0) = y_0$. The solution of the equation is a function $y(x)$ such that $y(x_0) = y_0$ and $y'(x) = F\bigl(x, y(x)\bigr)$.

When $F$ involves only the $x$ variable, the solution is given by the fundamental theorem of calculus. On the other hand, when $F$ depends on both $x$ and $y$ we need far more firepower. It is not always true that a solution exists, and if it does, that it is the unique solution. Picard's theorem gives us certain sufficient conditions for existence and uniqueness.

6.3.2 The theorem

We will need to define continuity in two variables. First, a point in $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ is denoted by an ordered pair $(x, y)$. To make matters simple let us give the following sequential definition of continuity.

Definition 6.3.1. Let $U \subset \mathbb{R}^2$ be a set and $F \colon U \to \mathbb{R}$ be a function. Let $(x, y) \in U$ be a point. The function $F$ is continuous at $(x, y)$ if for every sequence $\{(x_n, y_n)\}$ of points in $U$ such that $\lim x_n = x$ and $\lim y_n = y$, we have that
\[ \lim_{n\to\infty} F(x_n, y_n) = F(x, y) . \]
We say $F$ is continuous if it is continuous at all points in $U$.

∗ Named for the French mathematician Charles Émile Picard (1856–1941).
Theorem 6.3.2 (Picard's theorem on existence and uniqueness). Let $I, J \subset \mathbb{R}$ be closed bounded intervals and let $I_0$ and $J_0$ be their interiors. Suppose $F \colon I \times J \to \mathbb{R}$ is continuous and Lipschitz in the second variable, that is, there exists a number $L$ such that
\[ |F(x, y) - F(x, z)| \leq L |y - z| \quad \text{for all } y, z \in J, \ x \in I . \]
Let $(x_0, y_0) \in I_0 \times J_0$. Then there exists an $h > 0$ and a unique differentiable $f \colon [x_0 - h, x_0 + h] \to \mathbb{R}$, such that
\[ f'(x) = F\bigl(x, f(x)\bigr) \quad \text{and} \quad f(x_0) = y_0 . \tag{6.1} \]

Proof. Suppose that we could find a solution $f$. Then by the fundamental theorem of calculus we can integrate the equation $f'(x) = F\bigl(x, f(x)\bigr)$, $f(x_0) = y_0$ and write it as the integral equation
\[ f(x) = y_0 + \int_{x_0}^x F\bigl(t, f(t)\bigr) \, dt . \tag{6.2} \]
The idea of our proof is that we will try to plug in approximations to a solution to the right-hand side of (6.2) to get better approximations on the left-hand side of (6.2). We hope that in the end the sequence will converge and solve (6.2) and hence (6.1). The technique below is called Picard iteration, and the individual functions $f_k$ are called the Picard iterates.

Without loss of generality, suppose that $x_0 = 0$ (exercise below). Another exercise tells us that $F$ is bounded as it is continuous. Let $M := \sup \{ |F(x, y)| : (x, y) \in I \times J \}$. Without loss of generality, we can assume $M > 0$ (why?). Pick $\alpha > 0$ such that $[-\alpha, \alpha] \subset I$ and $[y_0 - \alpha, y_0 + \alpha] \subset J$. Define
\[ h := \min \left\{ \alpha, \frac{\alpha}{M + L\alpha} \right\} . \tag{6.3} \]
Now note that $[-h, h] \subset I$.

Set $f_0(x) := y_0$. We will define $f_k$ inductively. Assuming that $f_{k-1}([-h, h]) \subset [y_0 - \alpha, y_0 + \alpha]$, we see that $F\bigl(t, f_{k-1}(t)\bigr)$ is a well defined function of $t$ for $t \in [-h, h]$. Further assuming that $f_{k-1}$ is continuous on $[-h, h]$, then $F\bigl(t, f_{k-1}(t)\bigr)$ is continuous as a function of $t$ on $[-h, h]$ by an exercise. Therefore we can define
\[ f_k(x) := y_0 + \int_0^x F\bigl(t, f_{k-1}(t)\bigr) \, dt , \]
and $f_k$ is continuous on $[-h, h]$ by the fundamental theorem of calculus. To see that $f_k$ maps $[-h, h]$ to $[y_0 - \alpha, y_0 + \alpha]$, we compute for $x \in [-h, h]$
\[ |f_k(x) - y_0| = \left| \int_0^x F\bigl(t, f_{k-1}(t)\bigr) \, dt \right| \leq M |x| \leq M h \leq M \frac{\alpha}{M + L\alpha} \leq \alpha . \]
We can now define $f_{k+1}$ and so on, and we have defined a sequence $\{f_k\}$ of functions. We simply need to show that it converges to a function $f$ that solves the equation (6.2) and therefore (6.1).
We wish to show that the sequence $\{f_k\}$ converges uniformly to some function on $[-h, h]$. First, for $t \in [-h, h]$ we have the following useful bound
\[ \bigl| F\bigl(t, f_n(t)\bigr) - F\bigl(t, f_k(t)\bigr) \bigr| \leq L |f_n(t) - f_k(t)| \leq L \|f_n - f_k\|_u , \]
where $\|f_n - f_k\|_u$ is the uniform norm, that is the supremum of $|f_n(t) - f_k(t)|$ for $t \in [-h, h]$. Now note that $|x| \leq h \leq \frac{\alpha}{M + L\alpha}$. Therefore
\[ \begin{aligned}
|f_n(x) - f_k(x)| &= \left| \int_0^x F\bigl(t, f_{n-1}(t)\bigr) \, dt - \int_0^x F\bigl(t, f_{k-1}(t)\bigr) \, dt \right| \\
&= \left| \int_0^x \Bigl( F\bigl(t, f_{n-1}(t)\bigr) - F\bigl(t, f_{k-1}(t)\bigr) \Bigr) \, dt \right| \\
&\leq L \|f_{n-1} - f_{k-1}\|_u \, |x| \\
&\leq \frac{L\alpha}{M + L\alpha} \|f_{n-1} - f_{k-1}\|_u .
\end{aligned} \]
Let $C := \frac{L\alpha}{M + L\alpha}$, and note that $C < 1$. Taking supremum on the left-hand side we get
\[ \|f_n - f_k\|_u \leq C \|f_{n-1} - f_{k-1}\|_u . \]
Without loss of generality, suppose that $n \geq k$. Then by induction we can show that
\[ \|f_n - f_k\|_u \leq C^k \|f_{n-k} - f_0\|_u . \]
Now compute: for any $x \in [-h, h]$ we have
\[ |f_{n-k}(x) - f_0(x)| = |f_{n-k}(x) - y_0| \leq \alpha . \]
Therefore
\[ \|f_n - f_k\|_u \leq C^k \|f_{n-k} - f_0\|_u \leq C^k \alpha . \]
As $C < 1$, $\{f_n\}$ is uniformly Cauchy and by Proposition 6.1.13 we obtain that $\{f_n\}$ converges uniformly on $[-h, h]$ to some function $f \colon [-h, h] \to \mathbb{R}$. The function $f$ is the uniform limit of continuous functions and therefore continuous.

We now need to show that $f$ solves (6.2). First, as before we notice
\[ \bigl| F\bigl(t, f_n(t)\bigr) - F\bigl(t, f(t)\bigr) \bigr| \leq L |f_n(t) - f(t)| \leq L \|f_n - f\|_u . \]
As $\|f_n - f\|_u$ converges to 0, then $F\bigl(t, f_n(t)\bigr)$ converges uniformly to $F\bigl(t, f(t)\bigr)$. It is easy to see (why?) that the convergence is then uniform on $[0, x]$ (or $[x, 0]$ if $x < 0$). Therefore,
\[ \begin{aligned}
y_0 + \int_0^x F\bigl(t, f(t)\bigr) \, dt &= y_0 + \int_0^x F\bigl(t, \lim_{n\to\infty} f_n(t)\bigr) \, dt \\
&= y_0 + \int_0^x \lim_{n\to\infty} F\bigl(t, f_n(t)\bigr) \, dt && \text{(by continuity of } F\text{)} \\
&= \lim_{n\to\infty} \left( y_0 + \int_0^x F\bigl(t, f_n(t)\bigr) \, dt \right) && \text{(by uniform convergence)} \\
&= \lim_{n\to\infty} f_{n+1}(x) = f(x) .
\end{aligned} \]
We can now apply the fundamental theorem of calculus to show that $f$ is differentiable and its derivative is $F\bigl(x, f(x)\bigr)$. It is obvious that $f(0) = y_0$.

Finally, what is left to do is to show uniqueness. Suppose $g \colon [-h, h] \to \mathbb{R}$ is another solution. As before we use the fact that $\bigl| F\bigl(t, f(t)\bigr) - F\bigl(t, g(t)\bigr) \bigr| \leq L \|f - g\|_u$. Then
\[ \begin{aligned}
|f(x) - g(x)| &= \left| y_0 + \int_0^x F\bigl(t, f(t)\bigr) \, dt - \left( y_0 + \int_0^x F\bigl(t, g(t)\bigr) \, dt \right) \right| \\
&= \left| \int_0^x \Bigl( F\bigl(t, f(t)\bigr) - F\bigl(t, g(t)\bigr) \Bigr) \, dt \right| \\
&\leq L \|f - g\|_u \, |x| \leq L h \|f - g\|_u \leq \frac{L\alpha}{M + L\alpha} \|f - g\|_u .
\end{aligned} \]
As we said before $C = \frac{L\alpha}{M + L\alpha} < 1$. By taking supremum over $x \in [-h, h]$ on the left-hand side we obtain
\[ \|f - g\|_u \leq C \|f - g\|_u . \]
This is only possible if $\|f - g\|_u = 0$. Therefore, $f = g$, and the solution is unique.

6.3.3 Examples

Let us look at some examples. We note that the proof of the theorem actually gives us an explicit way to find an $h$ that works. It does not however give us the best $h$. It is often possible to find a much larger $h$ for which the theorem works.

The proof also gives us the Picard iterates as approximations to the solution. Therefore the proof actually tells us how to obtain the solution, not just that the solution exists.

Example 6.3.3: Let us look at the equation
\[ f'(x) = f(x), \qquad f(0) = 1 . \]
That is, we let $F(x, y) = y$, and we are looking for a function $f$ such that $f'(x) = f(x)$. We pick any $I$ that contains 0 in the interior. We pick an arbitrary $J$ that contains 1 in its interior. We can always pick $L = 1$. The theorem guarantees an $h > 0$ such that there exists a unique solution $f \colon [-h, h] \to \mathbb{R}$. This solution is usually denoted by
\[ e^x := f(x) . \]
We leave it to the reader to verify that by picking $I$ and $J$ large enough the proof of the theorem guarantees that we will be able to pick $\alpha$ such that we get any $h$ we want as long as $h < 1/3$.

Of course, we know (though we have not proved) that this function exists as a function for all $x$. It is possible to show (we omit the proof) that for any $x_0$ and $y_0$ the proof of the theorem above always guarantees an arbitrary $h$ as long as $h < 1/3$. The key point is that $L = 1$ no matter
what $x_0$ and $y_0$ are. Therefore, we get a unique function defined in a neighborhood $[-h, h]$ for any $h < 1/3$. After defining the function on $[-h, h]$ we find a solution on the interval $[0, 2h]$ and notice that the two functions must coincide on $[0, h]$ by uniqueness. We can thus iteratively construct the exponential for all $x \in \mathbb{R}$. Do note that up until now we did not yet have proof of the existence of the exponential function.

Let us see the Picard iterates for this function. First we start with $f_0(x) := 1$. Then
\[ \begin{aligned}
f_1(x) &= 1 + \int_0^x f_0(s) \, ds = x + 1, \\
f_2(x) &= 1 + \int_0^x f_1(s) \, ds = 1 + \int_0^x (s + 1) \, ds = \frac{x^2}{2} + x + 1, \\
f_3(x) &= 1 + \int_0^x f_2(s) \, ds = 1 + \int_0^x \left( \frac{s^2}{2} + s + 1 \right) ds = \frac{x^3}{6} + \frac{x^2}{2} + x + 1 .
\end{aligned} \]
We recognize the beginning of the Taylor series for the exponential.

Example 6.3.4: Suppose we have the equation
\[ f'(x) = \bigl( f(x) \bigr)^2 \quad \text{and} \quad f(0) = 1 . \]
From elementary differential equations we know that
\[ f(x) = \frac{1}{1 - x} \]
is the solution. Do note that the solution is only defined on $(-\infty, 1)$. That is, we will be able to use $h < 1$, but never a larger $h$. Note that the function that takes $y$ to $y^2$ is simply not Lipschitz as a function on all of $\mathbb{R}$. As we approach $x = 1$ from the left we note that the solution becomes larger and larger. The derivative of the solution grows as $y^2$, and therefore the $L$ required will have to be larger and larger as $y_0$ grows. Thus if we apply the theorem with $x_0$ close to 1 and $y_0 = \frac{1}{1 - x_0}$ we find that the $h$ that the proof guarantees will be smaller and smaller as $x_0$ approaches 1.

The proof of the theorem guarantees an $h$ of about 0.1123 (we omit the calculation) for $x_0 = 0$, even though we see from above that any $h < 1$ should work.

Example 6.3.5: Suppose we start with the equation
\[ f'(x) = 2 \sqrt{|f(x)|}, \qquad f(0) = 0 . \]
Note that $F(x, y) = 2\sqrt{|y|}$ is not Lipschitz in $y$ (why?). Therefore the equation does not satisfy the hypotheses of the theorem. The function
\[ f(x) = \begin{cases} x^2 & \text{if } x \geq 0, \\ -x^2 & \text{if } x < 0, \end{cases} \]
is a solution, but $g(x) = 0$ is also a solution.
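The Picard iterates for $f' = f$, $f(0) = 1$ can also be generated mechanically. The sketch below assumes $F(x, y) = y$, so every iterate is a polynomial that can be stored as a list of exact rational coefficients; the helper names `picard_step` and `picard_iterates` are illustrative and not from the text.

```python
from fractions import Fraction

def picard_step(coeffs, y0=Fraction(1)):
    """One Picard step for F(x, y) = y.

    coeffs[i] is the coefficient of x**i in f_{k-1}; the result holds the
    coefficients of f_k(x) = y0 + integral from 0 to x of f_{k-1}(s) ds.
    """
    # Integrate term by term: c * x**i becomes c / (i + 1) * x**(i + 1).
    integrated = [c / (i + 1) for i, c in enumerate(coeffs)]
    return [y0] + integrated

def picard_iterates(count, y0=Fraction(1)):
    """Return the coefficient lists of f_0, f_1, ..., f_count."""
    iterates = [[y0]]                     # f_0(x) = y0
    for _ in range(count):
        iterates.append(picard_step(iterates[-1], y0))
    return iterates

for k, coeffs in enumerate(picard_iterates(3)):
    print(k, [str(c) for c in coeffs])
```

The coefficients produced are 1, 1, 1/2, 1/6, that is $1/k!$, in agreement with the hand computation of $f_1$, $f_2$, $f_3$ and with the Taylor series of the exponential. For a general $F$ the integrals would not stay polynomial, so this exact approach is specific to the example.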
6.3.4 Exercises

Exercise 6.3.1: Let $I, J \subset \mathbb{R}$ be intervals. Let $F : I \times J \to \mathbb{R}$ be a continuous function of two variables and suppose that $f : I \to J$ is a continuous function. Show that $F\bigl(x, f(x)\bigr)$ is a continuous function on $I$.

Exercise 6.3.2: Let $I, J \subset \mathbb{R}$ be closed bounded intervals. Show that if $F : I \times J \to \mathbb{R}$ is continuous, then $F$ is bounded.

Exercise 6.3.3: We have proved Picard's theorem under the assumption that $x_0 = 0$. Prove the full statement of Picard's theorem for an arbitrary $x_0$.

Exercise 6.3.4: Let $f'(x) = x f(x)$ be our equation. Start with the initial condition $f(0) = 2$ and find the Picard iterates $f_0, f_1, f_2, f_3, f_4$.

Exercise 6.3.5: Suppose that $F : I \times J \to \mathbb{R}$ is a function that is continuous in the first variable, that is, for any fixed $y$ the function that takes $x$ to $F(x,y)$ is continuous. Further, suppose that $F$ is Lipschitz in the second variable, that is, there exists a number $L$ such that
\[ |F(x,y) - F(x,z)| \leq L\,|y - z| \quad \text{for all } y, z \in J,\ x \in I . \]
Show that $F$ is continuous as a function of two variables. Therefore, the hypotheses in the theorem could be made even weaker.