
Advanced Calculus of One Variable

J.W. Thomas
Professor of Mathematics
Colorado State University
Fort Collins, CO 80543

June, 2007
Contents

1 Introduction to the Real Numbers 5


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Introduction to Proofs . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Some Preliminaries to the Definition of the Real Numbers . . . . 13
1.4 Definition of the Real Numbers . . . . . . . . . . . . . . . . . . . 20
1.5 Some Properties of the Real Numbers . . . . . . . . . . . . . . . 25
1.6 Principle of Mathematical Induction . . . . . . . . . . . . . . . . 30

2 Some Topology of R 35
2.1 Introductory Set Theory . . . . . . . . . . . . . . . . . . . . . . . 35
2.2 Basic Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3 Compactness in R . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3 Limits of Sequences 53
3.1 Definition of Sequential Limit . . . . . . . . . . . . . . . . . . . . 53
3.2 Applications of the Definition of a Sequential Limit . . . . . . . . 58
3.3 Some Sequential Limit Theorems . . . . . . . . . . . . . . . . . . 66
3.4 More Sequential Limit Theorems . . . . . . . . . . . . . . . . . . 72
3.5 The Monotone Convergence Theorem . . . . . . . . . . . . . . . 80
3.6 Infinite Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

4 Limits of Functions 89
4.1 Definition of the Limit of a Function . . . . . . . . . . . . . . . . 89
4.2 Applications of the Definition of the Limit . . . . . . . . . . . . . 95
4.3 Limit Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.4 Limits at Infinity, Infinite Limits and One-sided Limits . . . . . 109

5 Continuity 117
5.1 An Introduction to Continuity . . . . . . . . . . . . . . . . . . . 117
5.2 Some Examples of Continuity Proofs . . . . . . . . . . . . . . . . 122
5.3 Basic Continuity Theorems . . . . . . . . . . . . . . . . . . . . . 126
5.4 More Continuity Theorems . . . . . . . . . . . . . . . . . . . . . 130
5.5 Uniform Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.6 Rational Exponents . . . . . . . . . . . . . . . . . . . . . . . . . 142


6 Differentiation 147
6.1 An Introduction to Differentiation . . . . . . . . . . . . . . . . . 147
6.2 Computation of Some Derivatives . . . . . . . . . . . . . . . . . . 152
6.3 Some Differentiation Theorems . . . . . . . . . . . . . . . . . . . 156
6.4 L’Hospital’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

7 Integration 171
7.1 An Introduction to Integration: Upper and Lower Sums . . . . . 171
7.2 The Darboux Integral . . . . . . . . . . . . . . . . . . . . . . . . 177
7.3 Some Topics in Integration . . . . . . . . . . . . . . . . . . . . . 183
7.4 More Topics in Integration . . . . . . . . . . . . . . . . . . . . . . 188
7.5 The Fundamental Theorem of Calculus . . . . . . . . . . . . . . . 195
7.6 The Riemann Integral . . . . . . . . . . . . . . . . . . . . . . . . 201
7.7 Logarithm and Exponential Functions . . . . . . . . . . . . . . . 206
7.8 Improper Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . 211

8 Sequences and Series 213


8.1 Approximation by Taylor Polynomials . . . . . . . . . . . . . . . 213
8.2 Sequences and Series . . . . . . . . . . . . . . . . . . . . . . . . . 219
8.3 Tests for Convergence . . . . . . . . . . . . . . . . . . . . . . . . 223
8.4 Power Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
8.5 Uniform Convergence of Sequences . . . . . . . . . . . . . . . . . 241
8.6 Uniform Convergence of Series . . . . . . . . . . . . . . . . . . . 248

Index 257
Chapter 1

Introduction to the Real Numbers

1.1 Introduction
Most students feel that they have an understanding of the real numbers and/or
the real line. Throughout most of your education you have drawn a line—
sometimes called a number line—having a positive (and hence negative) di-
rection and with the integers indicated. From this you could approximately
designate any other real number.
You used two of these as axes when you graphed functions. When you
learned how to take limits as x → a, you used a number line as the x-axis when
you (or the instructor or the book) explained the meaning of the concept of a
limit. When you were introduced to integrals, an interval on the real line was
subdivided which helped to give you an approximation to the area under some
function defined on the interval—to aid in the definition of the integral.
Unless you were given a non-standard introduction to these concepts you
really didn’t know enough about the real line to know what’s missing. You
surely don’t know enough about the real line to be able to prove the important
calculus theorems.
This isn’t as bad as it may sound. When Isaac Newton and Gottfried Wil-
helm Leibniz invented calculus in the late 1600’s, they used a very intuitive
approach. After a while some people started pointing out that there were
inconsistencies in their approaches. Over the next 200 years many of the great
mathematicians worked on rigorizing calculus. In 1754 Jean-le-Rond d’Alembert
decided that it was necessary to give a rigorous treatment of limits. Joseph Louis
Lagrange published his first paper rigorizing calculus in 1797. As a part of his
work on hypergeometric series, in 1812 Carl Friedrich Gauss gave a rigorous
discussion of the convergence of an infinite series. And finally, Augustin-Louis
Cauchy in 1821 answered d’Alembert’s call and introduced a theory of limits.
In 1874 Karl Weierstrass gave an example of an everywhere continuous,

nowhere differentiable function. This example illustrated that geometric
intuition was not an adequate tool for analytic studies. Weierstrass realized that
to perform rigorous analysis there must be an understanding of the real num-
ber system. Weierstrass instigated a program known as the arithmetization of
analysis which through the work of Weierstrass and his followers established
the rigorous treatment of the real number system as a foundation for classical
analysis. Weierstrass died in 1897.
Summarizing the situation, it took the mathematical community 200 years from
the time the ideas of calculus were first introduced until an understanding emerged
of how and why calculus really works. So it’s not too bad that you may have started
learning calculus two or three years ago and some of the important essentials
were skipped.
This chapter serves as an introduction to the set of real numbers. There are
at least three common approaches to introducing the set of real numbers. For
many people the approaches using either Dedekind cuts (where the real numbers
are represented by sets of rational numbers) or Cauchy sequences (where the real
numbers are represented by equivalence classes of Cauchy sequences of rational
numbers) are more satisfying in that you actually construct the set of reals.
In either case, however, neither the rationals nor the reals look like numbers
that you are accustomed to using. Instead of using either of these approaches
we shall describe the set of real numbers by a suitable set of postulates. One
advantage of this approach is that it is the fastest (and we don’t want to spend
too much time on it). In addition, the set of postulates gives us a very explicit
list of the most basic properties satisfied by real numbers.
In addition to introducing the real numbers in this chapter, we will discuss
certain aspects of proofs. We feel that this is necessary, or at least advantageous,
to help the reader understand the proofs given in this text.
Before we get started we will review some things that we know (or at least
we think we know), introduce some notation and discuss some useful results and
ideas. We begin by defining the sets

• the set of natural numbers N = {1, 2, 3, · · · },

• the set of integers Z = {· · · , −3, −2, −1, 0, 1, 2, 3, · · · },

• the set of rational numbers Q = {m/n : m, n ∈ Z, n ≠ 0}.

Of course N ⊂ Z ⊂ Q. The description of Q is not ideal. It’s not wrong
but it doesn’t take care of the fact that there are multiple descriptions of each
rational number, i.e. 1/2 = 3/6 = 123/246, etc. One complicated way around
this is to define a rational number as the equivalence class of rational numbers
that are equal. It’s a bit easier to just always consider the rational number
in “reduced form,” where common factors of the numerator
and the denominator have been divided out. We will take this latter approach.
Hence the “rational numbers” 1/2, 3/6 and 123/246 are all represented by 1/2.
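The reduction to lowest terms just described is easy to experiment with. As a pure illustration (not part of the formal development), Python’s standard library `Fraction` type performs exactly this reduction:

```python
from fractions import Fraction
from math import gcd

# Fraction reduces m/n by dividing out the greatest common factor,
# so all three descriptions below are the same rational number.
assert Fraction(1, 2) == Fraction(3, 6) == Fraction(123, 246)

# The same reduction done by hand with the gcd.
m, n = 123, 246
g = gcd(m, n)
print(m // g, "/", n // g)  # prints 1 / 2
```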
One way to introduce the need for the real number system is to start with the
set of rational numbers and decide that something important is missing. If you
graph the function f(x) = 2 − x² really carefully using only the rational numbers
on the x-axis, you see that the graph passes through the x-axis without hitting
the axis—we don’t want that, but then you really can’t graph it that carefully.
Many of you have seen the common proof that √2 is not rational, which
goes as follows. (Read the proof carefully. It’s not terribly important to be
able to reproduce the proof, but it is important that you can follow the proof.)
Assume the statement is false, i.e. that √2 is rational. Then √2 = m/n where m and n are in
N, n ≠ 0 and m/n is in reduced form. If we square both sides we get m² = 2n².
Since m² is a multiple of 2, m² is even. This implies that m is even. (If m
is not even, i.e. m is odd, then m = 2k + 1 for some integer k. But then
m² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1 is odd. This is a contradiction.)
If m is even, then m can be written as m = 2k. But then the facts that m = 2k
and m² = 2n² imply that m² = 4k² = 2n² or n² = 2k². Thus n must also
be even. This is a contradiction to the fact that we assumed that m/n was in
reduced form.
Thus we see that √2 is not rational. But do we really care? Do we need
a √2 in our lives? With a little bit of thought about the diagonal of the unit
square or the graph of the function f(x) = 2 − x², it’s reasonably clear that we
want to have √2 in our number system, i.e. Q is not enough. In general, we do
not want to work on domains that have holes in them like Q has at √2.
It should be pretty clear that there are a lot of rational numbers (there are
a lot of natural numbers and there are clearly a lot more rational numbers than
there are natural numbers). A little thought will also convince you that there
are also a lot of numbers on the number line that are not rational. A proof
similar to the one given above will show that √3 and √7 are not rational. It’s
also easy to prove that q√2 (Example 1.2.2) and q + √2 (HW 1.2.2(a)) are not
rational for any nonzero q ∈ Q. Since there are a lot of what we think of as numbers
that are not rational, there are many holes in Q.
What we would like to do is to figure out how to fill in the holes in Q with
the numbers that are not rational. This is close to what the
approaches to building the real numbers using either Dedekind cuts or Cauchy
sequences do. We will really approach this from the other direction. We will
define the set of real numbers and then show that this set is what we want.

Before we go on, we would like to include a topic related to our work with the
rational numbers. The following nice result makes it easy to show that a given
number is not rational—and it shows more. Consider the following proposition.

Proposition 1.1.1 Consider the polynomial equation

a₀xⁿ + a₁xⁿ⁻¹ + · · · + aₙ₋₁x + aₙ = 0    (1.1.1)

where a₀, a₁, · · · , aₙ are integers, a₀ ≠ 0, aₙ ≠ 0 and n ≥ 1. Let r ∈ Q be a root
of equation (1.1.1), where r = p/q is expressed in reduced form. Then q divides a₀
and p divides aₙ.
Proof: If you observe carefully, you will see that the proof of this proposition
is really very similar to the way that we proved that √2 was not rational.
If r = p/q is a root of equation (1.1.1), then a₀(p/q)ⁿ + a₁(p/q)ⁿ⁻¹ + · · · +
aₙ₋₁(p/q) + aₙ = 0. Multiplying by qⁿ we get

a₀pⁿ + a₁pⁿ⁻¹q + · · · + aₙ₋₁pqⁿ⁻¹ + aₙqⁿ = 0.    (1.1.2)

Solving for a₀pⁿ allows us to rewrite equation (1.1.2) as

a₀pⁿ = −q[a₁pⁿ⁻¹ + a₂pⁿ⁻²q + · · · + aₙqⁿ⁻¹].

Since everything inside of the brackets is an integer, q must divide a₀pⁿ. Since
p/q is in reduced form, no factors of q divide out with any part of pⁿ (see
HW1.1.1). Thus, q divides a₀.
Likewise, we rewrite equation (1.1.2) as

aₙqⁿ = −p[a₀pⁿ⁻¹ + a₁pⁿ⁻²q + · · · + aₙ₋₁qⁿ⁻¹].

We use the same argument as before. Because p must divide aₙqⁿ and no factors
of p can divide out any factors of qⁿ, p must divide aₙ.
Before we show you how nice this result is in relation to our work with
rational numbers, let us remind you that you have probably used this result
before. A while after you learned how to factor polynomials in your algebra
classes, you were faced with factoring polynomials of degree greater than or
equal to three. You were given a problem like “factor 2x³ + 3x² − 8x + 3.” You
were taught to try to divide by x ± 3, x ± 3/2, x ± 1/2 and x ± 1. These potential
roots were formed by trying all rationals p/q where p is a factor of aₙ = 3 (±3,
±1) and q is a factor of a₀ = 2 (±2, ±1), i.e. by applying Proposition 1.1.1. If
and when you were lucky enough to divide 2x³ + 3x² − 8x + 3 by x − 1 you got

(2x³ + 3x² − 8x + 3)/(x − 1) = 2x² + 5x − 3.

You then factored the quadratic term, which gives you the complete factorization

2x³ + 3x² − 8x + 3 = (x − 1)(2x − 1)(x + 3).

If none of the potential roots satisfies the equation, Proposition 1.1.1 implies
that there are no rational roots. In your algebra class you usually didn’t have to
worry about that, since they were trying to teach you how to factor—one of the
potential roots always satisfied the equation.
Our application of Proposition 1.1.1 goes as follows. Consider the equation
x² − 2 = 0. By Proposition 1.1.1 we know that if there are going to be rational
roots to this equation, they will be either ±2 or ±1. It is easy to try these
four potential roots and see that none of them satisfies the equation x² − 2 =
0. Therefore the equation has no rational roots. Solving for x we know that
x = ±√2 represents the solutions to this equation. Therefore ±√2 must not be
rational.
This same approach can be used to produce many numbers that are not
rational. Many, such as √13, are as easy as √2. For some it is more difficult
to find the appropriate algebraic equation associated with the number, but the
method still works. For example, consider the number ∛((4 − √2)/3). Set x = ∛((4 − √2)/3).
Then x³ = (4 − √2)/3, 3x³ = 4 − √2, 3x³ − 4 = −√2 and (3x³ − 4)² = 2. Expand
this last expression and apply Proposition 1.1.1 to the resulting polynomial
(with integer coefficients). Surely ∛((4 − √2)/3) is a root of this polynomial.

HW 1.1.1 Assume that p and q have the prime factorizations p = p₁ · · · pⱼ and
q = q₁ · · · qₖ, respectively, and that p/q is in reduced form. Prove that if q divides a₀pⁿ
(where a₀ is an integer), then q divides a₀.

HW 1.1.2 Prove that √13 is not rational.

HW 1.1.3 Prove that 3/2 + √13 is not rational.
HW 1.1.4 Prove that ∛((4 − √2)/3) is not rational.

1.2 Introduction to Proofs


Before we proceed with the next step of defining the set of real numbers, we
pause to include a short discussion of proofs. This topic is surely a bit of a
detour but it may prove to be helpful. In a text such as this, proofs are very
important. It is a time in your mathematical career when you see why things
are true. It is a time when you learn to write a proof that convinces the reader
that what you claim is indeed true. Probably most importantly, you learn to
read mathematics (specifically mathematical proofs) critically and to
evaluate whether the writer’s argument is sound and whether what the writer claims
is in fact true.
Two types of proofs are important in mathematics: the direct proof and the
indirect proof. We will first discuss the simplest case, direct proof.
Direct Proofs: A direct proof is a valid argument with true premises. In
our case the true premises are usually axioms and definitions that have been
given, or previously proved results. If the statement to be proved is in the
form p implies q (which our statements will often be), then we can include the
statement p as one of the true premises. A valid argument should be defined and
studied in a logic class—but not many logic classes exist anymore. We will try
to show you what a valid argument is through a series of examples—just about
all (hopefully all) of the proofs in this text give examples of valid proofs. A
valid argument is a series of logical implications relating known facts, resulting
in the desired conclusion. We should understand that we have already given
an example of a direct proof in the proof of Proposition 1.1.1. Consider the
following easy example.
Example 1.2.1 Prove that r₁, r₂ ∈ Q implies r₁ + r₂ ∈ Q.

Solution: The list of known facts mentioned as a part of the proof includes the statement
r₁, r₂ ∈ Q, the definition of Q and all known properties of arithmetic for rational numbers.
(We know more. That means that there are more potential hypotheses, but these would not
tend to be relevant here.) The argument to prove this statement can be given as follows:
r₁, r₂ ∈ Q implies that r₁ = m₁/n₁ and r₂ = m₂/n₂ for m₁, m₂, n₁, n₂ ∈ Z with n₁, n₂ ≠ 0 (by the
definition of Q). Then r₁ + r₂ = m₁/n₁ + m₂/n₂ = (m₁n₂ + n₁m₂)/(n₁n₂) (by known arithmetic).
m₁, m₂, n₁, n₂ ∈ Z implies that m₁n₂ + n₁m₂ and n₁n₂ are in Z (because Z is closed with
respect to addition and multiplication), and n₁n₂ ≠ 0. Therefore r₁ + r₂ is the ratio of two
integers, i.e. r₁ + r₂ is rational (by the definition of Q).

This is a very easy proof but it is hoped that it shows explicitly what the
“true hypotheses” are and how these hypotheses fit together with the valid
argument to construct the proof. We will have more difficult direct proofs, but
more difficult direct proofs will just be more difficult analogs to this proof. We
should realize that the statement p implies q can also be written as if p, then q,
p is a sufficient condition for q, p only if q and q is a necessary condition for p.
Depending on the author you may see all of these different expressions.
And finally we discuss again what we mean by true premises. It is difficult
when you move from the “do as I say” world of mathematics to the “prove it”
world of mathematics. At this time you “know” a lot of things that have been
told to you—things that have not been based on a firm mathematical founda-
tion. Students sometimes have trouble knowing what they can assume are true
premises. It is clear that you can assume anything that we have given you as
postulates, definitions or anything that you or we have proved (which so far
is almost nothing). We did cheat a bit when we told you that you know
about the integers, the arithmetic for integers and consequently the arithmetic
for rationals. Actually, the facts that you know for the integers include a small
set of postulates and results proved from those postulates. Because we had to
start somewhere, we assume that you know those. When it is necessary, we will
include some of the properties of the integers—postulated and/or proved. Just
about every other true premise that you will have to use or we will use will be
included in this text. If we cheat, we will try to remember to tell you that we
are cheating.
Indirect Proofs: Indirect proofs are very common in analysis. There are
certain results that are very difficult to prove directly yet can be easily proved
using an indirect proof. The indirect proofs are based on the logical concepts of
the contrapositive and the contradiction. As you are probably aware, we have
already given you two examples of indirect proofs when we proved “√2 is not
rational” and “m² even implies that m is even” using a proof by contradiction.
We discuss first the use of the contrapositive in proof.
The Contrapositive: When the statement we wish to prove is if r, then s, a
common approach to proving the statement is to consider the contrapositive of
the statement. For this short discussion we will write the implication as r → s
and read it as r implies s. The contrapositive of the statement r → s is the
statement (∼ s) → (∼ r) where (∼ s) means “not s” and (∼ r) means “not r”.
r s r→s ∼s ∼r (∼ s) → (∼ r)
T T T F F T
T F F T F F
F T T F T T
F F T T T T

Table 1.2.1: Truth table for the contrapositive.

We refer to ∼ s and ∼ r as the negation of s and r, respectively. An example of
an implication that we proved earlier is n² is even implies that n is even (where
the context implied that n ∈ Z). The contrapositive of this statement is n is
not even implies n² is not even, or n is odd implies n² is odd. It is not immediately
clear or easy to see that the statements n² is even implies that n is even and n is odd
implies n² is odd are equivalent. The easiest way to see this is to construct a simple truth
table including r → s and (∼ s) → (∼ r).
The first two columns of the Table 1.2.1 list all combinations of truth values
of the statements r and s; both r and s can be true or false. Column 3 can
be thought of as the definition of the truth value of the implication. The point is
that the implication is only bad (i.e. false) if a true hypothesis implies a false
conclusion (row 2). Otherwise the implication is true. Columns 4 and 5 give the
truth values of the statements ∼ s and ∼ r (opposite of those of s and r). And
finally, column 6 gives the truth values of the statement (∼ s) → (∼ r) based
on the definition of the truth values of the implication (i.e. false only when a
true statement will imply a false statement) and the truth values of ∼ s and
∼ r. We note that the truth values of r → s and (∼ s) → (∼ r) are the same.
That means that the statements r → s and (∼ s) → (∼ r) are equivalent.
The result of the argument is that proving the statement r implies s is equivalent
to proving the statement not s implies not r, or, using our example, proving the
statement m² is even implies m is even is equivalent to proving the statement
m is odd implies m² is odd. This is good because the latter statement is very
easy to prove directly. The first statement is very difficult (if not impossible)
to prove directly. This proof was given in the previous section as a part of the
proof that √2 is not rational (where we proved it using a proof by contradiction).
Therefore we use the easy argument to prove that n is odd implies n² is odd.
Then this will also imply that n² is even implies n is even.
Contradiction: The second type of indirect proof is based on the logical con-
cept of a contradiction. A contradiction is a statement that is false for all
combinations of the truth values. A proof by contradiction, or when you are
really pleased with your proof you might refer to it as a proof by reductio ad
absurdum, is based on the fact that if p1 is a statement, then it is impossible for
p1 and ∼ p1 to both be true. Recall that a proof is a valid argument with true
premises. If we lump everything that we know to be true (including anything
that we can prove based on what we know to be true) into one statement called
pK (where if the statement we wish to prove looks like r → s, we also include
the r in pK ), a proof is of the form pK → q where q is true whenever pK is true.


When we prove a statement s or r → s by contradiction, we begin by assum-
ing that the statement s is false. We then proceed to use this information to
prove that some statement p1 is false, where p1 is one of the statements included
in pK or r, in the case that the statement that we want to prove is of the form
r → s. In either case we will have assumed initially that p1 is true and have
proved that p1 is false, which is a contradiction both by Webster’s definition and
by a mathematical definition—because then p1 ∧ (∼ p1 ) is surely always false
(where the symbol ∧ stands for “and”). Thus our original assumption that the
statement s was false must be erroneous—thus the statement s must be true.
In Section 1.1 we proved that √2 was not rational. We did it at that
time—before we discussed proofs—because we wanted to convince you that
there were a lot of numbers other than the rational numbers. We gave a proof
by contradiction. Our statement was “√2 is not rational”. We assumed that this
was false, i.e. that √2 is rational. What we included in pK (everything that
we know to be true) at that time was very nebulous, but we did emphasize that
a part of the definition of the rational numbers was that p ∈ Q implies that
p = m/n where m/n is in reduced form. For our proof p1 is the statement that
p ∈ Q implies that p = m/n where m/n is in reduced form. We then proceeded
from the assumption that s was false, i.e. that √2 is rational, to prove that the form
assumed, m/n, was not in reduced form, i.e. we proved that p1 was false. Thus
we had our contradiction. And of course we should add that as a part of the
proof that m/n was not in reduced form we included a small contradiction
proof when we proved “m² is even implies that m is even”—which we saw can
also easily be proved by proving the contrapositive of the original statement.
We next illustrate a proof by contradiction by considering an easier proof.

Example 1.2.2 Prove that q√2 is not rational for any nonzero q ∈ Q.

Solution: We first note that this statement can be reworded as: q ∈ Q, q ≠ 0, implies q√2 is
not rational. We know that in order to provide a direct proof of this statement we would assume
that q is rational, recall “everything else that we know is true”, and devise an argument that
shows that q√2 is not rational. This would be very difficult. Instead we assume that the
desired result is false, i.e. that q√2 is rational (along with the true hypothesis and everything
we know to be true). Then we know that we can write q√2 as q√2 = m/n where m, n ∈ Z, so that
√2 = m/(qn). It is not difficult to show that m/(qn) is rational when m, n ∈ Z and q ∈ Q. Therefore,
√2 is rational. However, part of “everything we know to be true” is that √2 is not rational
(we proved it). Thus we have that √2 is rational and √2 is not rational. This is clearly a
contradiction. Therefore q√2 is not rational.
Notice that it would also be very easy to use Proposition 1.1.1 to prove this result—but
we included this result to give a clear example of a proof by contradiction.

There are clearly some similarities between proofs using the contrapositive
and contradiction. If the statement that we want to prove is of the form if r then
s, then we know that we can prove this statement by proving the contrapositive,
if ∼ s then ∼ r. As we mentioned earlier if we were to try to prove this statement
by contradiction we would include the statement r as a part of pK —the things
that we know to be true. We then proceed by assuming that s is false, and
we will complete the proof if we prove some statement p1 is false, where p1
is part of pK —something we know to be true. If as a part of our proof by
contradiction the statement we prove false is p1 = r, this is a perfectly good
proof by contradiction—because r was thrown in with the other things that we
knew were true. However we should realize that we have then proved if ∼ s
then ∼ r—i.e. we have in effect proved the contrapositive. Anything that we
can prove by proving the contrapositive can be proved by contradiction—using
the same proof.
We will be proving statements using direct and indirect proofs throughout
the rest of this text. There must be an explicit reason for every step of a proof.
To emphasize this fact, in the beginning we will try to explicitly give a reason
for each step. After a while we will revert to the approach that is generally used
in mathematics where we might give an explicit reason for some of the more
difficult steps but will assume that the reader can see the reasons for the other
steps (the reasons we maintain are “clear”). However, every step is taken for a
reason. If you do not understand why some particular step is done, ask.

HW 1.2.1 (a) Prove that p, q ∈ Q and q ≠ 0 implies that p/q ∈ Q.

(b) Prove that if p is rational, then p + 17/3 is rational.

HW 1.2.2 (a) If q ∈ Q, prove that q + √2 is not rational.
(b) If q ∈ Q and x is not rational, prove that q + x is not rational.

1.3 Some Preliminaries to the Definition of the Real Numbers

It is now time to introduce the real numbers, R. In this section we give the
easy part of the definition, the structures of a field and an order. We give the
appropriate arithmetic properties by defining a field. We then add the order
properties by defining the order structure. You have probably been introduced
to the field properties before, and maybe the order relation. You at least have
used all of these properties often in your previous mathematics work.
Before we define a field we thought it might be good to be careful about
equality. We will use an equality as a part of our definition of a field (and about
everything else). Everyone knows what equality means—sort of. An acceptable
notion of equality on a set Q must satisfy the following properties: (i) For a ∈ Q,
a = a (reflexive law). (ii) If a, b ∈ Q and a = b, then b = a (symmetric law).
(iii) If a, b, c ∈ Q, a = b and b = c, then a = c (transitive law). There are times
when one of the steps in a proof is technically a result of one of these properties
of equality. We want to make sure that you realize that there are reasons for
all steps—and some of these reasons are due to a precise definition of equality
given in (i)–(iii) above.
We are now ready to start with our definition of a field.

Definition 1.3.1 Let Q be a set on which two operations, + and ·, called
addition and multiplication, are defined for any two elements a, b ∈ Q.
We assume that Q is closed with respect to addition and multiplication, i.e. if
a, b ∈ Q, then a + b ∈ Q and a · b ∈ Q. The set Q is said to be a field if addition
and multiplication in Q satisfy the following properties.
a1. For any a, b ∈ Q, a + b = b + a (addition is commutative).
a2. For any a, b, c ∈ Q, a + (b + c) = (a + b) + c (addition is associative).
a3. For any a ∈ Q there exists an element θ of Q such that a + θ = a (existence
of an additive identity).
a4. For any a ∈ Q there exists an element −a of Q such that a + (−a) = θ
(existence of an additive inverse).
m1. For any a, b ∈ Q, a · b = b · a (multiplication is commutative).
m2. For any a, b, c ∈ Q, a · (b · c) = (a · b) · c (multiplication is associative).
m3. For any a ∈ Q there exists an element 1 of Q such that a · 1 = a (existence
of a multiplicative identity).
m4. For any a ∈ Q such that a ≠ θ there exists an element a⁻¹ of Q such that
a · a⁻¹ = 1 (existence of a multiplicative inverse).
d1. For any a, b, c ∈ Q, a · (b + c) = a · b + a · c (multiplication is distributive over
addition).
The set Q is said to be an integral domain if Q, + and · satisfy properties a1,
a2, a3, a4, m1, m2, m3 and d1, along with the following property: if a, b, c ∈ Q,
c ≠ θ and c · a = c · b, then a = b (cancellation law).

You see that the field properties consist of the very basic properties satisfied
by the addition and multiplication that you have used since grade school. When
you were working in N, Z, Q or R, you, your teachers and your books probably
wrote a · b as ab, θ as 0, 1 as 1 and a⁻¹ as 1/a. We will stick with the more
formal notation at this time. After we “have the reals” we will revert to the
usual notation of ab, 0, etc.
It should be easy to see that N is neither a field nor an integral domain because
it does not contain additive inverses; Z is not a field because it does not
contain multiplicative inverses (but it is an integral domain); and Q is a field
(and an integral domain). There are many other fields that are very important
in mathematics.
As a part of our definition of a field above, we assumed that we have operations
addition and multiplication defined on the set. We emphasize that we
want to assume that these operations are uniquely defined. This is a trivial
idea, but it is important. That is, if we have a + b = a + c and b = b′, then
we also have a + b′ = a + c. For the obvious reason we will sometimes refer to
this as the substitution law—and after a while we will not refer to it, we will
just do it. Of course we have the analogous substitution law associated with
multiplication.
As a part of our definition, if Q is a field, it possesses the basic properties
that are generally familiar to us. However, there are many more properties
associated with a field that are also familiar to us. The point is that there are
many very useful properties in a field that follow from the field axioms. (Though
we will try, we will probably not prove all of the properties in the right order.
We surely will not prove all of the necessary properties. We want to emphasize
that all of the necessary (or desired) properties can be shown to follow from the
field axioms.) We include the following proposition that will give us some of
these properties.
Proposition 1.3.2 Suppose that Q is a field. Then the following properties are
satisfied.
(i) If a, b, c ∈ Q and a + c = b + c, then a = b.
(ii) If a, b, c ∈ Q, c ≠ θ and a · c = b · c, then a = b. (This also shows that if Q
is a field, then Q is an integral domain.)
(iii) If a ∈ Q, then a · θ = θ.
(iv) The additive and multiplicative inverses are unique.
(v) If a, b ∈ Q, then (−a) · b = −(a · b).
(vi) If a, b ∈ Q, then (−a) · (−b) = a · b.
(vii) If a, b ∈ Q and a · b = θ, then a = θ or b = θ.
Proof: (i) c ∈ Q implies there exists −c ∈ Q such that c + (−c) = θ (a4). By
the reflexive law of equality, (a + c) + (−c) = (a + c) + (−c). Since a + c = b + c,
the substitution law implies that (a + c) + (−c) = (b + c) + (−c). Then using
a2 twice we have a + (c + (−c)) = b + (c + (−c)). By a4 (twice) this becomes
a + θ = b + θ which implies (by a3 twice) that a = b.
Note that if we applied HW1.3.2–(b), we could have begun this proof with
a + c = b + c implies that (a + c) + (−c) = (b + c) + (−c) and then proceeded
as above. However, the proof of HW1.3.2 uses the reflexive law of equality and
the substitution law.
(ii) It should be logical that this proof is analogous to the proof given for
(i)—properties (i) and (ii) are essentially the same properties, (i) with respect
to addition and (ii) with respect to multiplication. Because we have the
hypothesis that c ≠ θ, by m4 there exists c⁻¹ ∈ Q such that c · c⁻¹ = 1. By
the multiplication analog of HW1.3.2–(b) we see that a · c = b · c implies that
(a · c) · c⁻¹ = (b · c) · c⁻¹. Then by m2, m4 and m3, a = b.
(iii) Properties a3 and a1 imply that for a · θ ∈ Q (the closure with respect to
multiplication implies that if a, θ ∈ Q, then a · θ ∈ Q), (a · θ) + θ = a · θ and
a · θ + θ = θ + a · θ, or θ + a · θ = a · θ (by substitution). Then by a3 (applied
to θ), substitution and d1, we have θ + a · θ = a · θ = a · (θ + θ) = a · θ + a · θ.
Using part (i) of this proposition we have θ = a · θ.
(iv) We will show that the multiplicative inverse is unique. The common way
to approach a uniqueness result is to assume that there are two (at least two),
i.e. for a ∈ Q, a ≠ θ, there exists a⁻¹ and a⋆ in Q such that a · a⁻¹ = 1 and
a · a⋆ = 1. Then by substitution we have a · a⁻¹ = a · a⋆. Then by part (ii)
of this proposition we see that a⁻¹ = a⋆. The proof of the uniqueness of the
additive inverse follows in the same way.
(v) The element −(a · b) ∈ Q is an element that satisfies (a · b) + (−(a · b)) = θ.
Thus if we can show that (a · b) + ((−a) · b) = θ, we will be done (by part (iv) of
this proposition). We have a · b + ((−a) · b) = b · (a + (−a)) (using m1 twice and
then d1) = b · θ (by a4 and substitution) = θ (by part (iii) of this proposition).
Therefore −(a · b) = (−a) · b.
(vi) It is easy to use part (v) of this proposition along with m1 to show that
(−a) · (−b) = −(−(a · b)). Then HW1.3.2–(a) implies that −(−(a · b)) = a · b.
(vii) If a and b both equal θ, then we are done. If b ≠ θ, then there exists b⁻¹
such that b · b⁻¹ = 1 (by m4). Then a · b = θ implies that (a · b) · b⁻¹ = θ · b⁻¹
by substitution. The right hand side equals θ by m1 and part (iii) of this
proposition. Then θ = (a · b) · b⁻¹ = a · (b · b⁻¹) = a · 1 = a by m2, m4 and m3.
Therefore a = θ.
The proof of the case a ≠ θ is clearly the same (with b replaced by a).
Of course there are more properties. The purpose of the above proposition
is to illustrate to you how some of the other properties that you know can be
proved from the field axioms.
We want to emphasize here that in the proofs above we used only the axioms
and properties that we had previously proved. It’s not terribly important that
you can prove these properties. It would be nice if you’re capable of proving
some reasonably easy properties using the axioms and previous results. It is
very important that you are able to read these proofs and verify that they are
correct (which we hope they are). In the proofs given we tried to be complete,
giving each step and giving a reason for each step. As we move along we will
ease up on some of the completeness, assuming that the reader understands the
reasons for some of the “simple” steps. When we are done with this section, we
will assume that you know and/or have proved all basic arithmetic properties
of a field. You will have seen proofs of some of these properties such as those
given in Proposition 1.3.2 and HW1.3.2. There are countless other little facts
concerning fields (the rational numbers and the reals—when we really know
what the rational numbers and the real numbers are) that we will need to use.
So that proofs of these facts do not slow down our subsequent work, we will not
fill in every detail and we will assume that you have proved all of these facts or
could prove them if someone wanted a proof.
We next would like to extend our definition to that of an ordered field. As
with the equality, an order must satisfy certain properties. A necessary part of
defining an order and an ordered field Q is to identify a set P ⊂ Q of positive
elements. We will use the notation that if a ∈ P we will write a > θ. We now
proceed to define an ordered field. We define the ordered field with respect to
the order >.
Definition 1.3.3 Suppose that Q is a field in which we identify a set of positive
elements P ⊂ Q. The set Q along with > is said to be an ordered field if it
satisfies the following properties.
o1. The sum of two positive elements is positive, i.e. a, b ∈ P implies that
a + b ∈ P.
o2. The product of two positive elements is positive, i.e. a, b ∈ P implies that
a · b ∈ P.
o3. For a ∈ Q, one and only one of the following alternatives hold: either a is
positive, a = θ, or −a is positive, i.e. a > θ, a = θ or −a > θ.
You should recognize these three properties as being common facts that you
have used in the past when dealing with inequalities. One of the pertinent facts
is that these three axioms are all you need to get everything you know and/or
need to know about inequalities. Of course we need—want—inequalities defined
on the entire set Q and the other inequalities that you know exist defined, <,
≥ and ≤. We make the following definition.

Definition 1.3.4 Suppose Q is an ordered field. If a, b ∈ Q, we say that b > a
if b − a > θ. Also, we say that
(i) b < a if and only if a > b,
(ii) b ≥ a if and only if b > a or b = a, and
(iii) b ≤ a if and only if b < a or b = a.

It should then be reasonably easy to see that Z is not an ordered field (Z is not
a field) and that Q is an ordered field (use P = {m/n : m, n ∈ Z and mn > 0}).
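The claim that Q is an ordered field with this choice of P can be spot-checked with exact rational arithmetic. The following sketch (an illustration only, not a proof; the names `samples` and `positive` are ours) tests o1–o3 on a finite sample of rationals using Python's `fractions` module:

```python
# Sampling check of the order axioms o1-o3 for Q; membership in P,
# the set of positive rationals, is the test "a > 0".
from fractions import Fraction
from itertools import product

samples = [Fraction(m, n) for m, n in product(range(-3, 4), range(1, 4))]
positive = lambda a: a > 0  # a is in P, i.e. a > theta

for a in samples:
    # o3 (trichotomy): exactly one of "a in P", "a = theta", "-a in P"
    assert [positive(a), a == 0, positive(-a)].count(True) == 1
    for b in samples:
        if positive(a) and positive(b):
            assert positive(a + b)   # o1: sum of positives is positive
            assert positive(a * b)   # o2: product of positives is positive

print("o1-o3 hold on the sample")
```

Exhausting a finite sample of course proves nothing about all of Q; the point is only to make the three axioms concrete.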
Just in case you are not familiar with the “if and only if” statement, we pause
to fill you in. For example, statement (i) above means that “b < a implies a > b”
and “a > b implies b < a”. This is a little bit unfair in this case because it is
a definition. Definitions are always “if and only if” statements. However, it is
still the case that an “if and only if” statement gives the implication in both
directions. We will encounter this in cases not involving a definition later.
As with the arithmetic properties of the field, the axioms above are then
used to prove a variety of properties concerning ordered fields. We state some
of these properties in the following proposition where we include some of the
very basic results that follow directly from Definition 1.3.3.

Proposition 1.3.5 Let Q along with the operations +, · and > be an ordered
field. Then the following properties hold.
(i) If a, b, c ∈ Q, a > b and b > c, then a > c (transitive law).
(ii) If a, b, c ∈ Q and a > b, then a + c > b + c.
(iii) If a, b, c ∈ Q, a > b and c > θ, then a · c > b · c.
(iv) If a, b ∈ Q and a > b, then −b > −a.
(v) If a ∈ Q and a ≠ θ, then a² > θ.

Proof: (i) Since a > b and b > c, we have a − b > θ and b − c > θ. Then
by property o1 of Definition 1.3.3 we know that (a − b) + (b − c) > θ. Then by
using Definition 1.3.1 a2 (a couple times) and a4 we get a − c > θ or a > c.
(ii) a > b implies that a − b > θ. Then using a3, a4, a2, a1 etc, shows that
a−b = a−b+θ = (a−b)+(c+(−c)) = (a+c)−b+(−c) = (a+c)+((−b)+(−c))
= (a + c) − (b + c) or a + c > b + c.
(iii) a > b implies that a − b > θ. Applying o2 of Definition 1.3.3 to a − b and
c > θ gives (a − b) · c > θ. Then d1 implies that a · c − b · c > θ or a · c > b · c.
(iv) If a > b, then by part (ii) of this result a + (−b) > b + (−b). By a4 and
a3 this becomes a + (−b) > θ. We next use a1 to fix up the right hand side,
apply part (ii) of this result again (this time with −a), and then clean it all up
with a4 and a3 (on the left) and a3 to get (−b) > (−a).
(v) If a ∈ Q, then by o3 a > θ or −a > θ (we assumed that a ≠ θ). The
case when a > θ follows immediately from o2. If −a > θ, by o2 we have
(−a) · (−a) > θ. Then part (vi) of Proposition 1.3.2 gives us our desired result.

There are a lot of different properties of ordered fields. In the next
proposition we include three more very important results.

Proposition 1.3.6 Let Q be an ordered field. Then the following properties
hold.
(i) 1 > θ.
(ii) If a ∈ Q and a > θ, then a⁻¹ > θ.
(iii) If a, b ∈ Q and b > a > θ, then a⁻¹ > b⁻¹ > θ.

Proof: (i) We begin by noticing that by Proposition 1.3.5-(v) we get 1² > θ.
By m3, 1² = 1 so we have 1 > θ.
(ii) Suppose false, i.e. suppose that a > θ and a⁻¹ is not greater than θ, i.e.
a⁻¹ = θ or −a⁻¹ > θ. If a⁻¹ = θ, then a · a⁻¹ = a · θ = θ. This is a contradiction
to the fact that a · a⁻¹ = 1 ≠ θ.
We next consider the case when −a⁻¹ > θ. By o2 we see that a · (−a⁻¹) > θ.
But a · (−a⁻¹) = −(a · a⁻¹) (by Proposition 1.3.2-(v)) = −1 (by m4), so
−1 > θ. This contradicts part (i) of this proposition. Therefore a⁻¹ > θ.
(iii) Since b > a > θ, by part (ii) of this proposition we see that a⁻¹ > θ
and b⁻¹ > θ. Then since b > a, by m4 and Proposition 1.3.5-(iii) we see that
1 = b · b⁻¹ > a · b⁻¹. Then by Proposition 1.3.5-(iii), m1 and m4 we get
a⁻¹ > a · b⁻¹ · a⁻¹ = b⁻¹, which along with the fact that b⁻¹ > θ gives us
the desired result.
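Proposition 1.3.6 can be illustrated (not proved) with exact rationals. In the sketch below the sample pairs are arbitrary choices of ours; part (iii) says that taking reciprocals reverses the order of two positive elements:

```python
# Illustration of Proposition 1.3.6 with exact rational arithmetic:
# a > 0 implies 1/a > 0, and b > a > 0 implies 1/a > 1/b > 0.
from fractions import Fraction

pairs = [(Fraction(1, 3), Fraction(2)),
         (Fraction(2, 7), Fraction(5, 7)),
         (Fraction(1, 100), Fraction(1, 99))]

for a, b in pairs:
    assert b > a > 0
    assert 1 / a > 0 and 1 / b > 0   # part (ii): inverses are positive
    assert 1 / a > 1 / b > 0         # part (iii): inversion reverses order

print("ok")
```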
Again as we saw with our field properties we use the properties of our order
structure to prove a variety of additional properties of the ordered field. As you
will see, as we start applying properties of our field and later our order structure
we will need some additional properties of ordered fields. We do not want to try
to prove all of these results so we either have to pause and prove these properties
when we need them, assign them as homework so we can assume that they have
been proved, or cheat a bit and assume that everyone can prove them if there
is a need for a proof—which in reality is the approach that we will usually use.
All of the work concerning orders done above was done with respect to
the order >. You know that we have defined other order relations, <, ≥ and
≤. These other order relations will satisfy properties analogous to those found
above for >. Most of the results that we want for <, ≥ and ≤ will follow
from Definitions 1.3.3 and 1.3.4 and Propositions 1.3.5 and 1.3.6—along with a
careful consideration of results following from the fact that a = θ. We cannot
give all of the possible results—even all of the results directly like the previous
theorems—for all flavors of inequalities. We include some of these properties
without proof in the following proposition.

Proposition 1.3.7 Let Q be an ordered field. Then the following properties
hold.
(i) If a, b, c ∈ Q, a < b and b < c, then a < c.
(ii) If a, b, c ∈ Q and a < b, then a + c < b + c.
(iii) If a, b, c ∈ Q, a < b and c > θ, then a · c < b · c.
(iv) If a, b, c ∈ Q, a > b and c < θ, then a · c < b · c.
(v) If a, b, c ∈ Q, a < b and c < θ, then a · c > b · c.
(vi) If a, b, c ∈ Q, a ≥ b and b ≥ c, then a ≥ c.
(vii) If a, b, c ∈ Q, a ≥ b, then a + c ≥ b + c.
(viii) If a, b, c ∈ Q, a > b and c ≥ θ, then a · c ≥ b · c.
(ix) If a, b, c ∈ Q, a ≥ b and c > θ, then a · c ≥ b · c.
(x) If a, b, c ∈ Q and a ≤ b, then a + c ≤ b + c.
We quit. Of course there are more ≤ results and other results; we thought
that (x) was enough. We assume that you are generally aware of the correct
results and hope that, based on the propositions proved in this section, you
could prove all reasonable true results.
With the definition of the ordered field and the properties we have proved,
we have the algebra of the reals. As we have seen the set of rationals, Q, is an
ordered field and we have claimed that the set of rationals is not good enough.
We need more which we will add in the next section.
HW 1.3.1 (True or False and why)
(a) If Q is a field and θ is the additive identity in Q, then θ · θ = θ.
(b) If Q is a field and θ is the additive identity in Q, then −θ = θ.
(c) Suppose that Q is a field and 1 and θ are the multiplicative and additive
identities, respectively. Then 1 and θ are unique.
(d) Suppose that Q is a field and a · x = b. Then x = a⁻¹ · b.
(e) If Q is an ordered field, a ∈ Q and a < θ, then a⁻¹ < θ.
(f) If Q is an ordered field, a, b ∈ Q and θ ≤ a < b, then a² < ab < b².
HW 1.3.2 (a) Prove that if Q is a field and a ∈ Q, then −(−a) = a.
(b) Prove that if Q is a field, a, b, c ∈ Q and a = b, then a + c = b + c.
(c) Suppose that Q is an ordered field and a, b, c, d ∈ Q are such that a > b and
c > d. Prove that a + c > b + d.
(d) Suppose Q is an ordered field and a, b ∈ Q are such that ab > θ. Prove that
either a > θ and b > θ, or a < θ and b < θ.
HW 1.3.3 Suppose that Q is an ordered field. (a) Prove that if a, b ∈ Q and
θ ≤ a ≤ b, then a² ≤ b². (Note: Essentially the same proof will prove that for
a, b ∈ Q and θ ≤ a < b, then a² < b².)
(b) For the moment, for a ∈ Q and a ≥ θ define √a to be that number such that
(√a)² = a (if such a number exists; see Example 1.5.1 and Definition 5.6.1). Prove
that if a, b ∈ Q and θ ≤ a ≤ b, then √a ≤ √b. Hint: Use the contrapositive or
contradiction.
HW 1.3.4 (a) Define Q1 to be the set of all 2 × 2 matrices along with the
traditional matrix addition and multiplication. Prove or disprove that Q1 is a
field.
(b) Define Q2 to be the set of all 2 × 2 invertible matrices along with the
traditional matrix addition and multiplication. Prove or disprove that Q2 is a
field.
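Without settling HW 1.3.4, one piece of evidence is easy to check by computation: 2 × 2 matrix multiplication is not commutative, so axiom m1 already fails for Q1. The sketch below (the helper `matmul` and the sample matrices are ours) exhibits one such pair:

```python
# Evidence toward HW 1.3.4 (not a full solution): two 2x2 matrices
# whose products in the two orders differ, so m1 fails.
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
assert matmul(A, B) != matmul(B, A)   # A.B = [[2,1],[1,1]], B.A = [[1,1],[1,2]]
print("matrix multiplication is not commutative")
```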
HW 1.3.5 Let Q be an ordered field. Prove the following statements.
(a) If a, b, c ∈ Q, a < b and c < θ, then a · c > b · c. (Proposition 1.3.7-(v))
(b) If a, b, c ∈ Q, a > b and c ≥ θ, then a · c ≥ b · c. (Proposition 1.3.7-(viii))
(c) If a, b, c ∈ Q, a < b and c ≤ θ, then a · c ≥ b · c.

1.4 Definition of the Real Numbers


We have one more step in the definition of the real numbers—the difficult step.
Before we proceed we need a few easy definitions.

Definition 1.4.1 Let Q be an ordered field and let S be a nonempty subset of
Q.
(i) If M ∈ Q is such that s ≤ M for all s ∈ S, then M is said to be an upper
bound of S.
(ii) If m ∈ Q is such that s ≥ m for all s ∈ S, then m is said to be a lower
bound of S.
(iii) If a nonempty subset of Q has an upper bound, it is said to be bounded
above. If a nonempty subset of Q has a lower bound, it is said to be bounded
below. If a nonempty subset of Q has both an upper and lower bound, then it
is said to be bounded. If a set does not have an upper bound or a lower bound,
then the set is said to be unbounded.

It is easy to see that in the set of rational numbers, Q (an ordered field), 7 is
an upper bound of the set S1 = {−3, −2, −1, 3, 4} and −5 is a lower bound of
S1; there is no upper bound of the set
S2 = {−17, −3/2, −1/2, 0, 2, 8/3, 4, 32/5, 32/3, 128/7, · · · } (the elements of
the set continue to increase without bound) and −23.1 is a lower bound of
S2; 7 is an upper bound of the set S3 = {r ∈ Q : r = 7 − 1/n for some n ∈ N}
and 6 is a lower bound of S3; and −1 is an upper bound of the set S4 =
{· · · , −4, −3, −2, −1, −3/2, −5/4, −9/8, −17/16, · · · } and S4 has no lower bound.
Also, both 4 and 4.00001 are upper bounds and −3 and −3.1 are lower bounds
of the set S5 = {r ∈ Q : −3 < r ≤ 4}. Note that 3.9999 is not an upper bound
of S5. It is also the case that −17 is a lower bound of the set S2, 0 is an
upper bound of the set S4, and 14 and −10, 9 and 5, and −10 and 10 are
upper and lower bounds of S1, S3 and S5, respectively.
We note that upper and lower bounds of a set may be elements of the set
(for example −1 ∈ S4 and 6 ∈ S3 ). And of course by the other examples, we
see that upper and lower bounds need not be elements of the set, nor are they
unique.
We next include the definitions of least upper bound and greatest lower
bound. They are related to upper and lower bounds, are more subtle than upper
and lower bounds, and are extremely important.

Definition 1.4.2 Let Q be an ordered field and let S be a nonempty subset of
Q.
(i) If M ∗ ∈ Q is such that M ∗ is an upper bound of S, and for any upper bound
of S, M , M ∗ ≤ M , then M ∗ is said to be the least upper bound of S. We denote
the least upper bound of S by M ∗ = lub(S). Another word that is used for the
least upper bound of S is the supremum of S and is written as sup(S).
(ii) If m∗ ∈ Q is such that m∗ is a lower bound of S, and for any lower bound of
S, m, m∗ ≥ m, then m∗ is said to be the greatest lower bound of S. We denote
the greatest lower bound of S by m∗ = glb(S). Another word that is used for
the greatest lower bound of S is the infimum of S and is written as inf(S).

Let us emphasize the fact that the least upper bound must be an upper bound.
Hence, if the set does not have an upper bound, the least upper bound of the set
does not exist. Likewise, if a set does not have a lower bound, then the greatest
lower bound of the set does not exist.
It should be easy to see that for the five sets S1, S2, S3, S4 and S5, glb(S1) =
−3 and lub(S1) = 4, glb(S2) = −17 and lub(S2) does not exist, glb(S3) = 6 and
lub(S3) = 7, glb(S4) does not exist and lub(S4) = −1, and glb(S5) = −3 and
lub(S5) = 4 (where the facts that lub(S4) = −1 and glb(S5) = −3 are the two
that should be considered carefully).
Least upper bounds (as in S1, S4 and S5) and greatest lower bounds (as
in S1, S2 and S3) may be elements of the set, but that is not a requirement
(as 7 ∉ S3 and −3 ∉ S5). You should note that the upper and lower bounds
need not be close to the set (as 1000 is an upper bound of S5), whereas the
least upper bound and greatest lower bound must be close to the set (close to
at least one element of the set). We note that if the set is finite (meaning the
set has a finite number of elements), the least upper bound and the greatest
lower bound will always be the largest and the smallest elements of the set,
respectively (as with the set S1). That need not be the case for sets with an
infinite number of elements as can be seen by lub(S3) and glb(S5).
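For a finite set these claims can be checked directly by machine: an upper bound is verified element by element, and the maximum is the least upper bound. A sketch using the set S1 above (the helper name `is_upper_bound` is ours):

```python
# For a finite set, lub = max and glb = min; illustrated on S1.
S1 = {-3, -2, -1, 3, 4}

def is_upper_bound(M, S):
    """M is an upper bound of S when s <= M for every s in S."""
    return all(s <= M for s in S)

assert is_upper_bound(7, S1)            # 7 is an upper bound, as claimed
assert not is_upper_bound(3, S1)        # 3 is not: 4 in S1 exceeds it
assert max(S1) == 4 and min(S1) == -3   # lub(S1) = 4, glb(S1) = -3
# Any upper bound M of S1 satisfies M >= 4, so 4 is the least upper bound.
print("ok")
```

No such shortcut exists for infinite sets like S3, which is exactly why the least upper bound has to be handled with care there.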
It is not difficult to prove any of the above claims. For example, if you were
forced to prove that −5 is a lower bound of the set S1, you would only have to
list the elements of the set, noting that −5 ≤ −3, −5 ≤ −2, −5 ≤ −1, −5 ≤ 3
and −5 ≤ 4. Therefore, −5 ≤ s for all s ∈ S1, so −5 is a lower bound of S1.
If you wanted to prove that −3.0001 is a lower bound of the set S5, you
would only have to show that if s ∈ S5, then −3.0001 < −3 < s. Therefore, if
s ∈ S5, then s ≥ −3.0001, so −3.0001 is a lower bound of the set S5.
To prove that a given value is the greatest lower bound of a set or the least
upper bound of a set is a bit more difficult. To prove that glb(S5 ) = −3 we must
first prove that −3 is a lower bound of S5 —but this proof is almost identical to
the proof given above that −3.0001 is a lower bound of S5 .
We next must prove that −3 is the greatest lower bound of S5. The way
to prove this is by contradiction. Assume that m∗ = glb(S5) and m∗ > −3.
Then we can find a number r = (−3 + m∗)/2 that will be in S5 (because
r = (−3 + m∗)/2 > (−3 + (−3))/2 = −3) but m∗ ≰ r (r = (−3 + m∗)/2 <
(m∗ + m∗)/2 = m∗, so m∗ is not a lower bound of S5). This contradicts the fact
that the greatest lower bound of a set must also be a lower bound of the set.
Therefore there cannot be a greatest lower bound of S5 that is greater than −3.
Since −3 is a lower bound of S5, it must be the greatest lower bound of S5. We
should note that choosing r to be the average of m∗ and −3 is a common step
in obtaining a contradiction in proofs concerning least upper and greatest lower
bounds—though it’s not always −3.
To prove that lub(S3) = 7 is more difficult. It's easy to show that 7 is an
upper bound of S3. If we assume that r = lub(S3) < 7, all we have to do is to
show that there is some element of S3 that is greater than r, i.e. we must show
that there is some n0 ∈ N such that r < 7 − 1/n0. This is what we call intuitively
clear, but not proved. In HW1.5.2 we will use Corollary 1.5.5–(b) to complete
the proof that lub(S3) = 7.
Before we proceed we want one more example. Consider the set S6 = {x ∈
Q : x² < 2}. It is not hard to show that 2 is an upper bound of S6: if x ∈ S6
and x > 2, then x² > 4, which is a contradiction (and −2 is a lower bound).
It’s not obvious that S6 doesn’t have a least upper bound. Since S6 is defined
as a subset of the rational numbers, if S6 has a least upper
√ bound, it has to be
a rational. If we cheat and claim that we know that 2 has a non-repeating,
non-terminating decimal expansion (though at this time we really don’t know
about decimal expansions) and that rational numbers have decimal expansions
that terminate or repeat from some point on, we can do some calculations and
figure out that no rational number can be the least upper bound of S6 . We
prove this fact in the following example.
Example 1.4.1 Show that S6 = {x ∈ Q : x² < 2} does not have a least upper bound (in
Q).
Solution: We will prove this statement by contradiction. We assume that L = lub(S6)
exists and L ∈ Q. Since 1² < 2 and 2² > 2, 1 ≤ L ≤ 2. We shall show that L² = 2 (which we
know is impossible for L ∈ Q by the proof given in Section 1.1).
First suppose that L² < 2. Choose α = (2 − L²)/5. Note that α ∈ Q. Also note that
α > 1/5 would imply that L² < 1, which contradicts the fact that 1 ≤ L. Thus 0 < α ≤ 1/5 < 1.
We see that
(L + α)² = L² + 2αL + α² < L² + 5α = L² + 5(2 − L²)/5 = 2 (since L ≤ 2 and α² < α).
Thus L + α ∈ S6, L is not an upper bound of S6 and we have a contradiction. Therefore
L² ≮ 2.
Next suppose that L² > 2 and choose α = (L² − 2)/4. If α > 1/2, then L² > 4, which
contradicts the fact that L ≤ 2. Thus 0 < α ≤ 1/2 < 1. We see that
(L − α)² = L² − 2αL + α² > L² − 2αL ≥ L² − 4α = L² − 4(L² − 2)/4 = 2 (since L ≤ 2).
Thus L − α is an upper bound of S6 and L − α < L, so that L is not the least upper bound
of S6. This is a contradiction. Therefore L² ≯ 2.
The only choice left is that L² = 2, but we know this is impossible. Therefore lub(S6)
does not exist.
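The first half of the argument in Example 1.4.1 can be replayed with exact rational arithmetic: for any rational L with 1 ≤ L ≤ 2 and L² < 2, the element L + α with α = (2 − L²)/5 lies in S6 and exceeds L, so no such L is an upper bound of S6. A sketch (the sample values of L are arbitrary rationals whose squares are below 2):

```python
# The contradiction step of Example 1.4.1, run with exact rationals:
# if L is rational, 1 <= L <= 2 and L^2 < 2, then L + alpha is a
# member of S6 that exceeds L, where alpha = (2 - L^2)/5.
from fractions import Fraction

for L in [Fraction(1), Fraction(7, 5), Fraction(41, 29), Fraction(239, 169)]:
    assert 1 <= L <= 2 and L * L < 2
    alpha = (2 - L * L) / 5
    assert 0 < alpha < 1
    bigger = L + alpha
    assert bigger > L and bigger * bigger < 2  # bigger is in S6, yet > L

print("ok")
```

However close the rational L creeps toward 2 when squared, the construction always produces a strictly larger member of S6, which is the computational shadow of the fact that lub(S6) does not exist in Q.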

We emphasize that in the last example the point is that the lub(S6 ) does
not exist in Q. That is important.
We are finally ready to define the set of real numbers.
Definition 1.4.3 An ordered field Q is complete if and only if every nonempty
subset of Q that is bounded above has a least upper bound.
We refer to Definition 1.4.3 as the completeness axiom.
Definition 1.4.4 The set of real numbers, R, is a complete ordered field.


We see that the set of real numbers is quite nice. The set is an ordered field
so that we get all of the properties of arithmetic and inequalities that we have
known and used since childhood. In addition, we get the completeness axiom.
We will see that the fact that the reals are complete will be extremely important
to almost every aspect of our work (it is the concept that delayed the rigor of
calculus for 200 years). There will be many times when you are working on
some proof that you want to use the largest element in a set. However, there
are many nice sets that are bounded above and do not have a largest element,
such as S3 . The least upper bound of S3 is not the largest element in S3 —it’s
not even in the set. However, 7 is very close to all of the large elements in
the set. We will be able to use the least upper bound approximately where we
wanted to use the largest element.
However, we have a problem. We know what we want the set of real numbers
to be: the numbers that are plotted on the real line, the numbers that come
up on our calculator screens, etc. Our definition is a very abstract definition.
To begin with we would have to prove the existence of a complete
ordered field. It would surely be embarrassing to define the set of reals to be
a complete ordered field and have someone else come back and prove that no
such set existed. After that, we have to worry about the fact that if we were
using some complete ordered field as our set of reals, someone else may be using
some other complete ordered field—so we might get different results (when you
bought a new calculator, you might have to decide which complete ordered field
you wanted it to be based on). This is not really a problem. However, we are
only going to clear up this situation by stating the following theorem that we
give without proof.
Theorem 1.4.5 There exists one and (except for isomorphic fields) only one
complete ordered field.
The proof of this theorem would take us too far out of our way to be useful at
this time. The fact that there are isomorphic complete ordered fields is not a
problem. Two fields are isomorphic if there is a one-to-one (which we will define
later) mapping between the fields that preserves the arithmetic operations. For
our purposes isomorphic complete ordered fields can be considered the same.
When we work with the set of real numbers, we still want R to contain
N, Z and Q. This is not a problem. We won't do it, but it is not difficult
to use 1 = 1, 2 = 1 + 1, · · · to define N within any field, i.e. N = {x ∈ R :
x = 1 or x = k + 1 for some k ∈ N}. Using the approach that we have by defining
the set of real numbers by a set of postulates, the above description is the
definition of the set of natural numbers. Likewise, we can use the additive
inverses and the multiplicative inverses to get Z and Q, respectively. Thus any
complete ordered field will contain N, Z and Q along with their properties.
When the natural numbers are developed without first defining the set of real
numbers, there are several sets of axioms that are used to define the natural
numbers. Since we defined the set of real numbers and defined the natural
numbers as a particular subset of the reals, we don’t use any of these axioms.
Of course we want our set of natural numbers to satisfy these axioms–otherwise
either something is wrong with the sets of axioms or something is wrong with
our definition of N. In our setting these axioms can be proved as theorems. We
will not prove all of these results though when we need them, we will use them.
We do want to give you one of the common sets of axioms, called the Peano
Postulates:
• PP1: 1 is a natural number.
• PP2: For each natural number k there exists exactly one natural number,
called the successor of k, which we denote by k + 1.
• PP3: 1 is not the successor of any natural number.
• PP4: If k + 1 = j + 1, then k = j.
• PP5: Let M be a set of natural numbers such that (i) M contains 1
and (ii) M contains x + 1 whenever it contains x. Then M contains all the
natural numbers.

It should be clear that the Peano Postulates are a long way from the real
numbers–in other words, if you use this approach, you have a lot of work to
do before you get to R. Also based on our definition of R, some of the prop-
erties that we have proved in R and the definition of N, it is easy to see that
PP1, PP4 and PP5 are true. It is not too difficult to see that PP3 can be proved
using PP5; and PP2 follows from the result that for k ∈ N there are no natural
numbers between k and k + 1, which follows from PP3. We prove PP3 in
Example 1.6.4 as one of our examples of the application of proof by mathematical
induction. You will see in Section 1.6 that postulate PP5 is a very important
property of the natural numbers.
We know from our work in Section 1.1 that there exist real numbers that
are not rational. We define
• the set of irrational numbers, I = {x ∈ R : x ∉ Q} = R − Q.
Obviously by the definition of I, Q ∩ I = ∅ and R = Q ∪ I. In Section 1.1 we
showed that there were a lot of real numbers that are not rational, i.e. that are
irrational. Hence not only do we know that I is not empty, I ≠ ∅, but I is large.
And finally, to this point we have tried to be careful to use the formal notation
of · for multiplication, a⁻¹ for the multiplicative inverse, θ for the additive identity
and 1 for the multiplicative identity. Now that we have defined R and made the
argument (some of it not proved) that R is the set of reals that we have always
used, we will change to a more traditional notation. We will write a · b as ab,
a · b⁻¹ as a/b, θ as 0, and 1 as 1.

HW 1.4.1 (True or False and why)


(a) If S is a set of real numbers and M and m are upper and lower bounds of
S, respectively, then M and m are unique.
(b) If S is a bounded set of real numbers and M ∗ and m∗ are least upper and
greatest lower bounds of S, respectively, then M ∗ and m∗ are unique.
(c) If S is a bounded set of real numbers and S∗ ⊂ S, then lub(S) ≥ lub(S∗)
and glb(S) ≤ glb(S∗).
(d) If S is a bounded set of real numbers, then glb(S) < lub(S).
(e) If S is a bounded set of real numbers, then glb(S) ≤ lub(S).
(f) Suppose that S is a bounded set of positive, real numbers, and set m∗ =
glb(S) and M∗ = lub(S). Define S′ = {x ∈ R : x = 1/y, y ∈ S}. Then
glb(S′) = 1/M∗ and lub(S′) = 1/m∗.

HW 1.4.2 Suppose that S ⊂ R contains only a finite number of elements.
Prove that M∗ = lub(S) exists and M∗ ∈ S.

1.5 Some Properties of the Real Numbers


We want to emphasize that the important addition to our knowledge base in
the last section is the fact that the set of real numbers must be complete. This
section is very much a continuation of the last section. We begin by stating and
proving a series of very important results related to the completeness. The first
result shows that if a set is bounded below, then the set has a greatest lower
bound.
Proposition 1.5.1 If S is a nonempty subset of R that is bounded below, then
S has a greatest lower bound.

Proof: Let S′ = {−x ∈ R : x ∈ S}. If m is a lower bound of S, then m ≤ s for
all s ∈ S. This is the same as −m ≥ −s for all −s ∈ S′. Therefore, −m is an
upper bound of S′. By the completeness of R, S′ has a least upper bound, say
M∗ = lub(S′). Our claim is that m∗ = −M∗ = glb(S). M∗ is an upper bound
of S′ so for all −s ∈ S′, M∗ ≥ −s. Then m∗ = −M∗ ≤ s for all s ∈ S and m∗
is a lower bound of S.
If g is a lower bound of S and g > m∗, then −g will be an upper bound of
S′ and −g < −m∗ = M∗. Thus if m∗ is not the greatest lower bound of S, then
M∗ is not the least upper bound of S′. Therefore by reductio ad absurdum (or
contradiction) m∗ = glb(S).
If we return to the sets S1 –S5 described earlier and consider these sets as
subsets of the reals, R (S1 –S5 were previously considered as subsets of Q and
Q ⊂ R), then the least upper bounds and greatest lower bounds of these sets
are the same as before. If proofs were needed, the proofs would be essentially
the same as before. If we consider a different set S5′ = {x ∈ R : −3 < x ≤ 4} =
(−3, 4] (where S5′ contains all of the real numbers between −3 and 4, including
4, whereas S5 contained only the rational numbers in that range), it is easy to
see (again using essentially the same arguments as before) that lub(S5′) = 4 and
glb(S5′) = −3.
The case of special interest is the set S6 . Recall in Example 1.4.1 that when
S6 was considered a subset of Q, we found that the least upper bound of S6 did

not exist. Since S6 is an example of a set bounded above that does not have a
least upper bound, we obtain the following result.

Proposition 1.5.2 The set of rationals, Q, is not complete.

Now consider S6 as a subset of R and define S6′ = {x ∈ R : x² < 2}. As we
showed before with S6 considered as a subset of Q, S6 and S6′ are both bounded
above and below in R by 2 and −2, respectively. By the definition of R we know
that both lub(S6) and lub(S6′) exist. We don't explicitly know what value these
least upper bounds assume (even though deep in our hearts we know they are
both √2—the number that when squared gives 2). Consider the following
example.

Example 1.5.1 Let L = lub(S6′). Show that L² = 2.

Solution: Other than the fact that in this case we know (because R is complete) that
L = lub(S6′) exists, the proof of this result is the same as part of the proof given in Example
1.4.1.
We assume that L² < 2, define α = (2 − L²)/5, show that L + α ∈ S6′ and obtain the
contradiction to the fact that L is an upper bound of S6′.
We then assume that L² > 2, define α = (L² − 2)/4, show that L − α is an upper bound
of S6′ and contradict the fact that L is the least upper bound of S6′. (One large difference
between these proofs is the fact that in this case L and α may be—will be—irrational. This
makes no difference in the necessary computations.) Therefore we know that L² = 2.

The completeness of the reals allows us to define √2 as √2 = lub(S6′), and
we know that √2 satisfies (√2)² = 2. This approach also allows us to define
square roots of all positive real numbers.
This is a big deal. When I was a young student, I was told that we let √2 be
the number such that when squared gives 2—and I would guess most of you were
given the same introduction. No one questioned whether or not such a number
might exist—I surely never questioned it. It wasn't until I had to define √2 for
students that I started wondering why we never discuss the existence. You now
know that √2 exists and why. You should also realize that if you consider S6 as
a subset of R, since S6 is surely bounded, lub(S6) will exist in R and will equal
√2.
As we stated earlier the completeness axiom is a very important and essential
part of the definition of the set of real numbers. To better describe this property
of R and to make this property easier to use, we next include several useful
results that follow from the definition of the least upper and greatest lower
bounds and/or the completeness axiom. These results are very important in that
often when we need to use the completeness of the reals, we will use Proposition
1.5.3–Corollary 1.5.5 rather than the definition of completeness.
We begin with a result that illustrates our earlier claim that when a set
doesn’t have or might not have a largest or smallest element, the least upper
bound and the greatest lower bound can be used to find elements of the set
that are arbitrarily close to being the largest or smallest elements of the set—
arbitrarily close to the least upper or greatest lower bounds of the set.

Proposition 1.5.3 (a) Suppose S ⊂ R is bounded above. Let M∗ = lub(S).
Then for every ε > 0 there exists x0 ∈ S such that M∗ − x0 < ε.
(b) Suppose S ⊂ R is bounded below. Let m∗ = glb(S). Then for every ε > 0
there exists x0 ∈ S such that x0 − m∗ < ε.

Proof: (a) Suppose false, i.e. suppose that for some ε₀ > 0 there is no
element x ∈ S greater than M∗ − ε₀. Then M∗ − ε₀ will be an upper bound for
the set S—for all x ∈ S, x ≤ M∗ − ε₀. But this is a contradiction to the fact
that M∗ is the least upper bound of S.
(b) The proof of (b) is similar—do be careful with the inequalities.
Remember that m∗ and M ∗ may or may not be in the set. Though we
cannot choose the smallest or largest element in the set, we can always find an
element in the set that is arbitrarily close to m∗ and M ∗ . Often when we are
using an argument where we would like to use the smallest or largest element
in a set (and can’t make the claim that there is such an element), we can use
the elements provided by the above proposition that are arbitrarily close to
the greatest lower bound and least upper bound of the set.
We might also make special note of the argument used in the proof of (a)
above. The proof is not difficult. For many students it is difficult to negate
the original statement. The statement is that “for every ε > 0 there is an x0
that satisfies an inequality.” To negate that statement, you need “some ε > 0
for which there is no x0 that will satisfy the inequality,” or “some ε > 0 for
which every x ∈ S fails to satisfy the inequality.” Analysis results often
involve convoluted statements. It is often difficult to negate these convoluted
statements.
We next obtain a very important corollary known as the Archimedean prop-
erty (and of course, it doesn’t really deserve to be a corollary).

Corollary 1.5.4 For any positive real numbers a and b, there is an n ∈ N such
that na > b.

Proof: Suppose false, i.e. suppose that na ≤ b for all n ∈ N. Set S = {na :
n ∈ N}. Since we are assuming that na ≤ b for all n ∈ N, the set S is bounded
above by b. The completeness axiom implies that S has a least upper bound,
let M ∗ = lub(S).
By Proposition 1.5.3–(a) there exists an element of S, n0 a, n0 ∈ N, such
that M∗ − n0 a < a. (The statement must be true for any ε > 0. We’re applying
the proposition with ε = a.) Then we have M∗ < (n0 + 1)a for n0 + 1 ∈ N so
M ∗ is not an upper bound of S. This is a contradiction so there must be an
n ∈ N such that na > b.
In the next result we give two special cases of the Archimedean property that
are basic and seem obvious (or as the very clever graduate students would say,
“intuitively clear to the casual observer”). The result makes it clear that the
completeness axiom is very important because without it we could not make
these seemingly obvious claims. By first choosing b = c and a = 1, and then
choosing a = 1 and b = , we obtain the following corollary.

Corollary 1.5.5 (a) For any positive real number c there is an n ∈ N such that
n > c.
(b) For any ε > 0 there is an n ∈ N such that 1/n < ε.
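For a concrete feel for part (b), here is a small Python sketch (the helper name is ours): given ε > 0, the choice n = ⌊1/ε⌋ + 1 guarantees n > 1/ε, hence 1/n < ε.

```python
import math

# Illustration of Corollary 1.5.5(b): for any eps > 0 the natural number
# n = floor(1/eps) + 1 satisfies n > 1/eps, and therefore 1/n < eps.
def archimedean_n(eps):
    n = math.floor(1 / eps) + 1
    assert 1 / n < eps       # the property the corollary guarantees
    return n

for eps in (0.5, 0.1, 1e-6, 3.0):
    print(eps, archimedean_n(eps))
```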

We next include two properties of the set of reals that both illustrate the
complexity of the reals.

Proposition 1.5.6 Let a, b ∈ R such that a < b.


(a) There exists r ∈ Q such that a < r < b.
(b) There exists x ∈ I such that a < x < b.

Proof: (a) By Corollary 1.5.5–(b) we can choose n ∈ N so that 1/n < b − a. Let
m be the smallest integer such that m > na (or m/n > a). Then m − 1 ≤ na
and

m/n = (m − 1)/n + 1/n ≤ a + 1/n < a + (b − a) = b.

Therefore r = m/n satisfies a < r < b.
(b) By part (a) there exists r ∈ Q such that a < r < b. By Corollary 1.5.5–(b)
there exists n ∈ N such that 1/n < (b − r)/√2, or r + √2/n < b. Then
a < r < r + √2/n < b. By Example 1.2.2 and HW 1.2.2–(b), r + √2/n ∈ I.
Note that part of the proof of (a) included “let m be the smallest integer such
that m > na.” This type of “obvious” statement is commonly assumed to be true
and nothing more is said about it. However, when requested, you must be able
to justify the statement. This statement is called the Well-Ordering principle
and is stated as every non-empty subset of N contains a smallest natural number
or if M ⊂ N is non-empty, then glb(M ) ∈ M . We will prove this statement in
Example 1.6.5 in Section 1.6 but want to emphasize that though we prove this
later, we are not using any sort of circular argument.
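The construction in the proof of Proposition 1.5.6–(a) is completely effective, and we can mimic it with exact rational arithmetic. The following Python sketch (the function name is ours) chooses n with 1/n < b − a and then takes m to be the smallest integer exceeding na.

```python
from fractions import Fraction
import math

# Mirror the proof of Proposition 1.5.6(a) with exact rational arithmetic:
# pick n with 1/n < b - a, then let m be the smallest integer with m > n*a.
def rational_between(a, b):
    n = math.floor(1 / (b - a)) + 1   # guarantees 1/n < b - a
    m = math.floor(n * a) + 1         # smallest integer with m > n*a
    return Fraction(m, n)

a, b = Fraction(1, 3), Fraction(1, 2)
r = rational_between(a, b)
assert a < r < b
print(r)  # a rational strictly between 1/3 and 1/2
```

The assertion is exactly the chain of inequalities in the proof: m − 1 ≤ na gives m/n ≤ a + 1/n < b.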
Before we leave this section we include a definition of the absolute value of a
real number and some properties of absolute value that don’t necessarily fit in
this section but we surely will need soon.

Definition 1.5.7 The absolute value of x ∈ R is defined as |x| = x if x ≥ 0,
and |x| = −x if x < 0.

Proposition 1.5.8 (i) |x| ≥ 0 for all x ∈ R. |x| = 0 only if x = 0.
(ii) |xy| = |x||y| for all x, y ∈ R.
(iii) −|x| ≤ x ≤ |x| for all x ∈ R.
(iv) For a ∈ R with a ≥ 0, |x| ≤ a if and only if −a ≤ x ≤ a.
(v) |x + y| ≤ |x| + |y| for all x, y ∈ R.
(vi) |x − y| ≥ ||x| − |y|| ≥ |x| − |y| for all x, y ∈ R.

Proof: We will claim that the proofs of (i) and (iii) are trivial (they follow directly
from the definition). Likewise we would like to think that property (ii) is clear

and/or claim that property (ii) is very clear if you consider the four cases x ≥ 0,
y ≥ 0; x ≥ 0, y < 0; x < 0, y ≥ 0; and x < 0, y < 0.
(iv) Earlier we discussed an “if and only if” statement as a part of a definition and
promised that we would see it again as a part of a theorem (ok, a proposition).
Recall that the “if and only if” statement means that we have implications going
in each direction. Often to prove “if and only if” you prove both directions
separately. Sometimes, as is the case here, you can prove both directions at the
same time.
We consider the statement |x| ≤ a. If we only consider x values greater
than or equal to zero, this statement becomes |x| = x ≤ a so that statement
is equivalent to 0 ≤ x ≤ a. If we only consider x values less than zero, this
statement becomes |x| = −x ≤ a or 0 > x ≥ −a. Since x is either greater than or
equal to zero or less than zero, the statement |x| ≤ a is equivalent to 0 ≤ x ≤ a
or 0 > x ≥ −a. If we consider this set of x values carefully, we see that it is the
same as −a ≤ x ≤ a.
(v) Property (v) is well known as the triangular inequality and is an important
property of the absolute value. We will use it often. Having proved properties
(iii) and (iv), property (v) is easy to prove. Using property (iii) twice, for any
x, y ∈ R we have −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y|. Adding these inequalities
gives −|x| − |y| ≤ x + y ≤ |x| + |y| (consider carefully why it is permissible
to add these inequalities). By property (iv) this last inequality implies that
|x + y| ≤ |x| + |y|.
(vi) Property (vi) is another useful property of the absolute value. We will refer
to property (vi) as the backwards triangular inequality. The proof of property
(vi) is a trick. Consider the following two computations.

|x| = |(x − y) + y| ≤ |x − y| + |y| (triangular inequality) so |x| − |y| ≤ |x − y|

and

|y| = |(y − x) + x| ≤ |y − x| + |x| so − (|x| − |y|) = |y| − |x| ≤ |y − x| = |x − y|.

Then since ||x| − |y|| = |x| − |y| or −(|x| − |y|), we have ||x| − |y|| ≤ |x − y|.
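Properties (v) and (vi) are easy to spot-check numerically. The following Python snippet is an illustration only (checking finitely many random reals proves nothing), but it is a useful way to catch a misremembered direction of the inequality.

```python
import random

# Spot-check Proposition 1.5.8(v) and (vi): the triangular inequality and
# the backwards triangular inequality, on randomly chosen reals.
random.seed(0)
for _ in range(1000):
    x = random.uniform(-100, 100)
    y = random.uniform(-100, 100)
    assert abs(x + y) <= abs(x) + abs(y) + 1e-12          # (v)
    assert abs(x - y) >= abs(abs(x) - abs(y)) - 1e-12     # (vi)
print("all checks passed")
```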
Before we give the next properties, we define the following notation (some
of which we have already used, but at least in this way we know that you
understand the notation). For a ≤ b we define the closed interval [a, b] as
[a, b] = {x ∈ R : a ≤ x ≤ b}. For a < b we define the open interval (a, b) as
(a, b) = {x ∈ R : a < x < b}. And we use the obvious combinations of the
notation for the half open–half closed intervals (a, b] and [a, b).
Proposition 1.5.9 For r ∈ R and r > 0 the following three statements are
equivalent: |x − a| < r, a − r < x < a + r and x ∈ (a − r, a + r).

Proof: If we do as we did with property (iv) in Proposition 1.5.8 and


consider two cases, x − a ≥ 0 and x − a < 0, it is easy to see that the first two
expressions are equivalent. The equivalence of the third statement comes from
the second statement and the definition of the open interval.

Infinity:
To include a discussion about infinity in this section is a bit odd since we
want to make it very clear that ±∞ 6∈ R, i.e. ∞ and −∞ are not real numbers.
But we will do it anyway. Often the extended reals are defined to include R and
±∞. Plus and minus infinity do fit into our order system in that ±∞ are such
that for x ∈ R, −∞ < x < ∞, i.e. ∞ is larger than any real number and −∞
is smaller than any real number. Above we defined [a, b], (a, b), etc for a, b ∈ R.
We can logically extend these definitions to the unbounded intervals (a, ∞) =
{x ∈ R : a < x < ∞}, [a, ∞) = {x ∈ R : a ≤ x < ∞}, (−∞, a] = {x ∈ R :
−∞ < x ≤ a}, (−∞, a) = {x ∈ R : −∞ < x < a}, and even (−∞, ∞) = R.
Notice that since we want these intervals to be subsets of R, ±∞ are not included
in any of these sets.
At times we will have to do some arithmetic with infinities, so we define for
a ∈ R: a + ∞ = ∞, a − ∞ = −∞, a · ∞ = ∞ for a > 0 and a · ∞ = −∞ for a < 0.
We emphasize that ∞ − ∞ and 0 · ∞ are not defined (we don’t know what order
of “large” the infinity represents). And finally, since N ⊂ R, for all n ∈ N we
have 1 ≤ n < ∞.

HW 1.5.1 (True, False and why)


(a) Suppose that S ⊂ R is such that x ∈ S implies x ≥ 0. Then glb(S) ≥ 0.
(b) We know that for x, y ∈ R, |x + y| ≤ |x| + |y|. If x ≤ 0 and y ≤ 0,
|x + y| = |x| + |y|.
(c) If x, y ∈ I, then x + y ∈ I.
(d) If x, y ∈ I, then xy ∈ I.

(e) For all n ∈ N, √n ∈ I.

HW 1.5.2 In Section 1.4 we considered S3 = {r ∈ Q : r = 7 − 1/n for some n ∈ N}
as a subset of the rationals Q. We claimed that lub(S3) = 7. Prove it.

1.6 Principle of Mathematical Induction


In this section we consider the topic of proof by mathematical induction. Mathe-
matical induction is a very important form of proof in mathematics. It would be
easy to say that the topic of mathematical induction should not be included in a
chapter titled An Introduction to the Real Numbers. Because it is a convenient
time and place for this topic, we include it here.
Recall the fifth Peano Postulate, PP5: Let M be a set of natural numbers
such that (i) M contains 1 and (ii) M contains x+1 whenever it contains x, then
M contains all the natural numbers. From this postulate—which in our setting
followed immediately from the definition of the set of natural numbers—we
obtain the following theorem.

Theorem 1.6.1 Let P (n) be a proposition that is defined for every n ∈ N.


Suppose that P (1) is true, and that P (k + 1) is true whenever P (k) is true.
Then P (n) is true for all n ∈ N.

This theorem is referred to as the Principle of Mathematical Induction and


follows easily from the fifth Peano Postulate by setting M = {n ∈ N : P (n) is true}.
It is important for us to be able to use the Principle of Mathematical Induc-
tion, Theorem 1.6.1, as a method of proof: proof by mathematical induction.
We shall introduce proofs by math induction (short for mathematical induction
or the principle of mathematical induction) by a variety of examples. In each
example we will use a common template—which in order to avoid confusion, we
suggest that you follow.
Example 1.6.1 Prove that for r ≠ 1, Σ_{j=0}^{n} r^j = (1 − r^{n+1})/(1 − r).

Solution: We want to use the principle of mathematical induction. For this problem the
proposition P(n) is the identity Σ_{j=0}^{n} r^j = (1 − r^{n+1})/(1 − r) (note that r ≠ 1 is
needed so that we may divide by 1 − r).
Step 1: Prove that P(1) is true. Σ_{j=0}^{1} r^j = 1 + r and (1 − r^{1+1})/(1 − r) =
(1 − r²)/(1 − r) = 1 + r. Therefore the proposition is true when n = 1.
Step 2: Assume that P(k) is true, i.e. Σ_{j=0}^{k} r^j = (1 − r^{k+1})/(1 − r).
Step 3: Prove that P(k + 1) is true, i.e. Σ_{j=0}^{k+1} r^j = (1 − r^{(k+1)+1})/(1 − r).

Σ_{j=0}^{k+1} r^j = Σ_{j=0}^{k} r^j + r^{k+1} = (1 − r^{k+1})/(1 − r) + r^{k+1} by the assumption in Step 2   (1.6.1)
= (1 − r^{k+2})/(1 − r).   (1.6.2)

(Notice that in the first step of (1.6.1) we take the last term of the summation Σ_{j=0}^{k+1} r^j out of
the summation, changing the upper limit of the summation to k and including the last term
separately.) Therefore P is true for n = k + 1.
By the principle of mathematical induction P is true for all n, i.e. Σ_{j=0}^{n} r^j = (1 − r^{n+1})/(1 − r).
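The geometric-sum identity can be confirmed exactly for particular values using rational arithmetic. The following Python check (the helper name is ours, and checking cases is of course not a substitute for the induction proof) compares the direct sum with the closed form.

```python
from fractions import Fraction

# Exact check of the geometric-sum formula from Example 1.6.1 for a few
# rational values of r (r != 1) and several n, using Fraction arithmetic
# so there is no floating-point error to worry about.
def geometric_sum(r, n):
    return sum(r**j for j in range(n + 1))

for r in (Fraction(1, 2), Fraction(-2, 3), Fraction(5)):
    for n in range(6):
        assert geometric_sum(r, n) == (1 - r**(n + 1)) / (1 - r)
print("formula verified")
```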

We want to emphasize that all proofs by math induction follow the above
template. In Step 1 you prove that the proposition is true for n = 1. In Step 2
you assume that the proposition is true for n = k—this assumption is referred
to as the inductive assumption. In Step 3 you prove that the proposition is true
for n = k + 1–using the inductive assumption as a part of the proof. If you
are able to prove that the proposition is true for n = k + 1 without using the
inductive assumption, you would have a direct proof of the proposition—math
induction would not be necessary.
You should recognize the formula in Example 1.6.1 as the formula for the
sum of a geometric series. A common proof of this formula is to write

S = 1 + r + r² + · · · + r^{n−1} + r^n and note that   (1.6.3)
rS = r + r² + r³ + · · · + r^n + r^{n+1}.   (1.6.4)

Subtracting equation (1.6.4) from equation (1.6.3) gives S − rS = (1 − r)S =
1 − r^{n+1} (the rest of the terms cancel) or S = (1 − r^{n+1})/(1 − r). The point that we
want to make is that this is a nice derivation but it is not a direct proof of the
formula. To be able to write rS as r + r² + r³ + · · · + r^n + r^{n+1} we are applying
a “rule” r Σ_{j=1}^{n} a_j = Σ_{j=1}^{n} r a_j—which is an extension of the distributive property
c(a + b) = ca + cb. If we want to be picky (and at times we do), this formula
should be proved. This formula is proved true by math induction. Likewise
when we are computing S − rS by subtracting the right hand side of equation
(1.6.4) from the right hand side of equation (1.6.3), we are using an extension
of the associative property of addition which can be proved by math induction.
Hence, this nice derivation (that didn’t seem to be using mathematical induction)
involved several steps that could be or should be proved by math induction.
In general, when you do algebra involving expressions that include three
dots, you are probably doing an easy math induction proof. Another common
form of an easy math induction proof is when you write your desired result after
you’ve written several terms of the result and added the abbreviation “etc.” It’s
perfectly ok to do easy results this way—we all do them—but you should at
least realize that they’re true by the principle of mathematical induction.
Example 1.6.2 Prove that Σ_{j=1}^{n} j = n(n + 1)/2.

Solution: Step 1: Prove true for n = 1. Σ_{j=1}^{1} j = 1 = 1(1 + 1)/2. Therefore the proposition
is true for n = 1.
Step 2: Assume true for n = k, i.e. Σ_{j=1}^{k} j = k(k + 1)/2.
Step 3: Prove true for n = k + 1, i.e. prove that Σ_{j=1}^{k+1} j = (k + 1)(k + 2)/2.

Σ_{j=1}^{k+1} j = Σ_{j=1}^{k} j + (k + 1) = k(k + 1)/2 + (k + 1) by assumption in Step 2   (1.6.5)
= (k + 1)(k/2 + 1) = (k + 1)(k + 2)/2.   (1.6.6)

Therefore the proposition is true for n = k + 1.
By the principle of mathematical induction the proposition is true for all n.
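As with Example 1.6.1, the closed form is easy to confirm for any particular n; a check of many cases is not a proof, but it is a useful sanity test (here in Python, using exact integer arithmetic).

```python
# Check the closed form from Example 1.6.2 against a direct sum for many n.
# Integer arithmetic in Python is exact, so == is a legitimate comparison.
for n in range(1, 200):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
print("sum formula verified")
```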

There are many of these summation formulas that can and are proved by
math induction. You should note that except for details, the proofs are very
similar.
We next include a proof by math induction that is somewhat different from
the preceding two.
Example 1.6.3 If m, n ∈ N and a ∈ R, then a^m a^n = a^{m+n}.

Solution: Before we begin we should note that the definition of a^m is what we call an
inductive definition: define a^1 = a and for any k ∈ N, define a^{k+1} as a^{k+1} = a^k a. We now
begin our proof by fixing m.
Step 1: Prove that the proposition is true for n = 1. Since a^m a^1 = a^{m+1} by the definition
above, the proposition is true for n = 1.
Step 2: Assume that the proposition is true for n = k, i.e. assume that a^m a^k = a^{m+k}.
Step 3: Prove that the proposition is true for n = k + 1, i.e. prove that a^m a^{k+1} = a^{m+k+1}.

a^m a^{k+1} = a^m (a^k a^1) = (a^m a^k) a = a^{m+k} a by the inductive hypothesis   (1.6.7)
= a^{m+k+1} by the definition of a^m given above.   (1.6.8)

Therefore the proposition is true for n = k + 1 and by the principle of mathematical
induction the proposition is true for all n, i.e. a^m a^n = a^{m+n}.

We show how another basic property of the integers can be proved in the
following example.
Example 1.6.4 1 ≤ n for all n ∈ N.
Solution: Step 1: Prove true for n = 1. Clearly 1 ≤ 1 so the proposition is true for n = 1.
Step 2: Assume true for n = k, i.e. 1 ≤ k.
Step 3: Prove true for n = k + 1, i.e. 1 ≤ k + 1.
By adding 1 to both sides of the inequality 1 ≤ k (using the inductive hypothesis and
(x) of Proposition 1.3.7) we get 2 ≤ k + 1. By Proposition 1.3.6-(i) we have 1 > 0–which we
know implies that 0 < 1. Adding 1 to both sides gives 1 < 2. We then have 1 < 2 ≤ k + 1, so
1 < k + 1, which implies that 1 ≤ k + 1.

In the last example of the application of mathematical induction we prove


a very important property of the natural numbers, the Well-Ordered Principle.
As we will see, we will prove the Well-Ordered Principle by contradiction, using
mathematical induction to arrive at the contradiction.
Example 1.6.5 Suppose that M ⊂ N and M 6= ∅. Then glb(M ) ∈ M .
Solution: Before we proceed we wish to emphasize that this statement can be reworded as
follows. If M is a nonempty subset of the natural numbers, then M contains a smallest natural
number.
We begin this proof by supposing that the statement is false, i.e. there exists a nonempty
subset of the natural numbers M that does not contain a smallest natural number. Since by
Example 1.6.4 we know that 1 is the smallest natural number, we know that 1 ∉ M. Let
T = {k ∈ N : k < m for all m ∈ M}. By the definition of T it is clear that M ∩ T = ∅. We
will use math induction to prove that T = N.
Step 1: Because 1 ∉ M (since 1 is the smallest natural number, if 1 were in M it would be
the smallest natural number in M) and 1 ≤ n for all n ∈ N, 1 ∈ T.
Step 2: Suppose that k ∈ T, i.e. k < m for all m ∈ M.
Step 3: Prove that k + 1 ∈ T.
Let h ∈ N be such that h < k + 1. Then h ≤ k because there cannot be a natural
number between k and k + 1. By the definition of T, h ∈ T (h ≤ k < m for all m ∈ M) and
h ∉ M (h ≤ k and k < m for all m ∈ M). Thus if k + 1 were in M, k + 1 would be the smallest
element of M—but, of course, M does not have a smallest element, so k + 1 ∉ M. Since k < m
for all m ∈ M, there is no natural number between k and k + 1, and k + 1 ∉ M, we conclude
that k + 1 < m for all m ∈ M. Therefore k + 1 ∈ T.
By induction T = N. But since M ∩ T = ∅ and M ≠ ∅, this is a contradiction. Therefore,
glb(M) ∈ M.

If you are not inclined to just believe that for k ∈ N there are no natural
numbers between k and k + 1—and we surely hope you wouldn’t believe that,

consider the following short proof. (Sometimes it makes life tough but you
must be careful what you believe to be obvious.) For α ∈ N suppose that there
exists a natural number β such that α < β < α + 1. Then β − α > 0 and
α + 1 − β > 0. But since these are natural numbers and 1 is the smallest natural
number, this implies that β − α ≥ 1 and α + 1 − β ≥ 1. Then we see that
(β − α) + (α + 1 − β) = 1 ≥ 2. This is a contradiction so for α ∈ N there are no
natural numbers between α and α + 1 by reductio ad absurdum.
HW 1.6.1 Prove that Σ_{j=1}^{n} j² = n(n + 1)(2n + 1)/6.

HW 1.6.2 Prove that if m, n ∈ N and a ∈ R, then (a^n)^m = a^{nm}.
HW 1.6.3 For n ∈ N and a, b ∈ R prove that (a + b)^n = Σ_{k=0}^{n} \binom{n}{k} a^{n−k} b^k,
where \binom{n}{k} = n!/((n − k)! k!).
HW 1.6.4 For n ∈ N and a, b ∈ R prove that a^n − b^n = (a − b) Σ_{j=0}^{n−1} a^{n−1−j} b^j.

HW 1.6.5 Suppose that Q is an ordered field (or the reals) and suppose that
a, b ∈ Q and θ ≤ a ≤ b. Then for n ∈ N we have an ≤ bn .

HW 1.6.6 For 0 < c < 1 prove that 0 < cn < 1 for all n ∈ N.
Chapter 2

Some Topology of R

2.1 Introductory Set Theory


As a part of introducing some topology of the real numbers we will include
some basic set theory. It would be easy to say that it’s a bit late to include set
theory—we have already used sets and set notation. However we felt that to
be able to discuss some of the properties of R that we now want to introduce,
we want to be sure that you know what we are talking about—and it is not the
case that we didn’t care if you didn’t know what we were talking about earlier.
Some of this material might be review but bear with us.

Definition 2.1.1 (a) We say that A is a subset of B and write A ⊂ B (or


B ⊃ A) if x ∈ A implies that x ∈ B.
(b) If A ⊂ B and there exists an x ∈ B such that x 6∈ A, then A is said to be a
proper subset of B.
(c) If A ⊂ B and B ⊂ A, we say that A equals B and write A = B.
(d) We call the set that does not contain any elements the empty set and write
the empty set as ∅.

The sets in which we will be interested will almost always be subsets of the
real numbers—but none of the general definitions require that to be the case.
We have already seen that N ⊂ Z ⊂ Q ⊂ R—all of the subsets being proper
subsets. We also have I ⊂ R. We note that for any set A, A ⊂ A (clearly x ∈ A
implies that x ∈ A) and ∅ ⊂ A (if x ∈ ∅, then x ∈ A because there are no x’s in
∅).
We will often want to combine two or more sets in various ways. We make
the following definition.

Definition 2.1.2 Suppose that S is a set and there exists a family of sets as-
sociated with S in that for any α ∈ S there exists the set Eα.
(a) We define the union of the sets Eα, α ∈ S, to be the set E such that x ∈ E
if and only if x ∈ Eα for some α ∈ S. We write E = ∪_{α∈S} Eα. If we have only
two sets, E1 and E2, we write E = E1 ∪ E2. If S = {1, 2, · · · , n} (we have n
sets, E1, · · · , En), we write E = ∪_{k=1}^{n} Ek or E = E1 ∪ E2 ∪ · · · ∪ En. If S = N,
then we write E = ∪_{k=1}^{∞} Ek.
(b) We define the intersection of the sets Eα, α ∈ S, to be the set E such that
x ∈ E if and only if x ∈ Eα for all α ∈ S. We write E = ∩_{α∈S} Eα. If we have
only two sets, E1 and E2, we write E = E1 ∩ E2. If S = {1, 2, · · · , n} (we have
n sets, E1, · · · , En), we write E = ∩_{k=1}^{n} Ek or E = E1 ∩ E2 ∩ · · · ∩ En. If S = N,
then we write E = ∩_{k=1}^{∞} Ek.

We note that the union contains all of the points that are in any of the sets
under consideration while the intersection contains the points that are in all of
the sets under consideration. It is easy to see that

• {1, 2, 3, 4, 5, 6, 7} ∪ {5, 6, 7, 8} = {1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7} ∩


{5, 6, 7, 8} = {5, 6, 7}

• Q ∪ I = R, Q ∩ I = ∅

• (1, 10)∪{1, 10} = [1, 10], [1, 10]∪[10, 20] = [1, 20], [1, 10)∪[10, 20] = [1, 20],
[1, 10] ∩ [10, 20] = {10}, [1, 10) ∩ [10, 20] = ∅

We can immediately obtain an assortment of properties pertaining to unions
and intersections, which we include in the following proposition.

Proposition 2.1.3 For the sets A, B and C we obtain the following properties.
(a) A ⊂ A ∪ B
(b) A ∩ B ⊂ A
(c) A ∪ ∅ = A
(d) A ∩ ∅ = ∅
(e) If A ⊂ B, then A ∪ B = B and A ∩ B = A.
(f ) A ∪ B = B ∪ A, A ∩ B = B ∩ A Commutative Laws
(g) (A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C) Associative Laws
(h) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) Distributive Law

Proof: We will not prove all of these—hopefully most of these proofs are very
easy for you. We will prove three of the properties to illustrate some methods
of proofs of set properties.
(b) To prove the set containment in property (b) we begin with an x ∈ A ∩ B.
This implies that x ∈ A and x ∈ B. Therefore x ∈ A and we are done.
(h) In the proof of property (b) we applied the definition of set containment
and proved that if x ∈ A ∩ B, then x ∈ A. To prove property (h) we must apply
the definition of equality of sets, Definition 2.1.1–(c), and prove containment in
both directions, i.e. we must prove that A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C) and
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C).

A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C): We suppose that x ∈ A ∩ (B ∪ C). Then


we know that x ∈ A, and x ∈ B or x ∈ C. If x ∈ B, then x ∈ A ∩ B. If
x ∈ C, then x ∈ A ∩ C. Thus we know that x ∈ A ∩ B or x ∈ A ∩ C. Therefore
x ∈ (A ∩ B) ∪ (A ∩ C) and A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C).
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C): We now suppose that x ∈ (A ∩ B) ∪ (A ∩ C).
Then we know that x ∈ A ∩ B or x ∈ A ∩ C, i.e. we know that x ∈ A and
x ∈ B, or x ∈ A and x ∈ C. Thus in either case we know that x ∈ A. We also
know that x must be in either B or C (or both, but we don’t care much about
this possibility). Thus x ∈ A and x ∈ B ∪ C, or x ∈ A ∩ (B ∪ C). Therefore
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C).
By Definition 2.1.1 we have that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
(g) We will prove both properties given in (g) using Venn Diagrams. It’s not
clear what sort of proof Venn Diagrams provide but they are very nicely illus-
trative. We note in Figure Figure 2.1.1 below in the left box we draw three
supposedly arbitrary sets, A, B and C. We cross-hatch A with vertical lines, B
with horizontal lines and C with slanted lines. It is clear that A ∪ B is the set
cross-hatched with either vertical or horizontal lines. Then (A ∪ B) ∪ C is the
set that is cross-hatched with vertical or horizontal lines, or slanted lines,
i.e. the set that is cross-hatched in any manner.
We then proceed to the right box. We cross-hatch A, B and C as we did
in the box on the left. Then the set (B ∪ C) is the set that is cross-hatched
with either horizontal lines or slanted lines, and A ∪ (B ∪ C) is the set that is
cross-hatched with vertical lines, or horizontal lines or slanted lines, i.e. the set
that is cross-hatched in any manner. It is clear that the region denoting the set
(A ∪ B) ∪ C is the same as the region A ∪ (B ∪ C), so the sets are equal.
To prove the property (A ∩ B) ∩ C = A ∩ (B ∩ C), we note that in the left box
(A ∩ B) is the set cross-hatched with vertical and horizontal lines. We then note
that the set (A ∩ B) ∩ C is the set cross-hatched with vertical and horizontal
lines, and slanted lines, i.e. the region cross-hatched in all three directions. We
then note that in the right box the region (B ∩ C) is the region cross-hatched
with horizontal lines and slanted lines, so the region A ∩ (B ∩ C) will be the
region cross-hatched with vertical, and horizontal and slanted lines, i.e. the
region cross-hatched in all three directions. It is clear that these regions are
equal so we know that (A ∩ B) ∩ C = A ∩ (B ∩ C).
As we stated earlier it is not clear how rigorous the Venn Diagram proof
is, but it is a helpful method—because it’s so visual. We will not prove the
remaining properties. The proofs of the rest are very similar to the proofs given
above—and are all easier than the proof of (h).
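The set identities in Proposition 2.1.3 are easy to test on concrete finite sets; here is a Python check of the distributive law (h), using the built-in set operators `&` (intersection) and `|` (union). A check on one example is an illustration, not a proof.

```python
# The distributive law of Proposition 2.1.3(h) on concrete finite sets:
# A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 6}

assert A & (B | C) == (A & B) | (A & C)
print(A & (B | C))  # the set {3, 4}
```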
We next define the complement of a set. To discuss the complement it is
necessary to have a universe. The entirety of the set of elements under consid-
eration is called the universal set or the universe. In the proof of property (g)
above (using Venn diagrams) the universe was the set of points contained in the
box. Generally for us the universe will be either R or a subset of R. When it
is not emphasized with respect to what we are taking the complement, assume

Figure 2.1.1: Venn Diagrams used to prove that (A ∪ B) ∪ C = A ∪ (B ∪ C)


and (A ∩ B) ∩ C = A ∩ (B ∩ C).

it is with respect to R. At the same time we define a concept that is strongly


related to the complement, the difference of two sets.
Definition 2.1.4 (a) For two sets A and B, we define the difference of A and
B (or the complement of B with respect to A) as A − B = {x ∈ A : x ∉ B}.
(b) The complement of the set A with respect to the universe U is the set A^c =
{x ∈ U : x ∉ A}, or A^c = U − A.

If A1 = (−∞, 4), A2 = (2, 5) and A3 = (−∞, 5], it is easy to see that A1^c =
[4, ∞), A2^c = (−∞, 2] ∪ [5, ∞) and A3^c = (5, ∞). If we wanted the complement
of A1 with respect to A3, then A3 − A1 = [4, 5].
We next state the very basic but important result concerning complements
of sets.
Proposition 2.1.5 (A^c)^c = A
It should be very easy to see that the above result is true. Probably the easiest
way is to draw the very simple Venn Diagram representing the left hand side of
the equality.
We next prove a very important result related to complements referred to
as DeMorgan’s Laws.
Proposition 2.1.6 Consider the set A and the family of sets {Eα : α ∈ S}.
(a) A − ∪_{α∈S} Eα = ∩_{α∈S} (A − Eα)
(b) A − ∩_{α∈S} Eα = ∪_{α∈S} (A − Eα)
(c) ( ∪_{α∈S} Eα )^c = ∩_{α∈S} Eα^c
(d) ( ∩_{α∈S} Eα )^c = ∪_{α∈S} Eα^c

Proof: (a) The proof of property (a) follows by carefully applying the definition of set equality. We begin by assuming that x ∈ A − ∪_{α∈S} Eα. Then we know that x ∈ A and x ∉ ∪_{α∈S} Eα. The statement that x ∉ ∪_{α∈S} Eα is a very strong statement. It means that x ∉ Eα for any α ∈ S—if x ∈ Eα0 for some α0 ∈ S, then x ∈ ∪_{α∈S} Eα. Thus x ∈ A and x ∉ Eα, so x ∈ A − Eα—and this holds for every α ∈ S. Therefore x ∈ ∩_{α∈S} (A − Eα), and A − ∪_{α∈S} Eα ⊂ ∩_{α∈S} (A − Eα).
We next assume that x ∈ ∩_{α∈S} (A − Eα). This implies that x ∈ A − Eα for every α ∈ S. Therefore, x ∈ A and for every α ∈ S, x ∉ Eα. This implies that x ∉ ∪_{α∈S} Eα—because if x ∈ Eα0 for some α0 ∈ S, then x ∉ A − Eα0. Since x ∈ A and x ∉ ∪_{α∈S} Eα, then x ∈ A − ∪_{α∈S} Eα, or ∩_{α∈S} (A − Eα) ⊂ A − ∪_{α∈S} Eα.
Therefore A − ∪_{α∈S} Eα = ∩_{α∈S} (A − Eα).

(b) The proof of property (b) is very similar to that of property (a). We assume that x ∈ A − ∩_{α∈S} Eα. Then x ∈ A and x ∉ ∩_{α∈S} Eα. The statement x ∉ ∩_{α∈S} Eα implies that x ∉ Eα0 for some (at least one) α0 ∈ S. But then x ∈ A − Eα0, so x ∈ ∪_{α∈S} (A − Eα) and A − ∩_{α∈S} Eα ⊂ ∪_{α∈S} (A − Eα).
If x ∈ ∪_{α∈S} (A − Eα), then x ∈ A − Eα0 for some (again, at least one) α0 ∈ S. This implies that x ∈ A and x ∉ Eα0. But if x ∉ Eα0, then x ∉ ∩_{α∈S} Eα (to be in the intersection, x must be in all of the Eα). Therefore x ∈ A − ∩_{α∈S} Eα and ∪_{α∈S} (A − Eα) ⊂ A − ∩_{α∈S} Eα.
We then have A − ∩_{α∈S} Eα = ∪_{α∈S} (A − Eα).

(c) and (d) Properties (c) and (d) follow from properties (a) and (b), respectively, by letting A = U, the universal set.
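For finite families DeMorgan's Laws can also be confirmed by direct computation. The Python sketch below is our illustration only—the universe and the sets Eα are arbitrary choices—and it checks all four identities of Proposition 2.1.6:

```python
# Check De Morgan's laws (Proposition 2.1.6) on a small finite family E_alpha.
U = set(range(12))                           # finite stand-in for the universe
A = {0, 1, 2, 3, 4, 5, 6}
family = [{1, 2, 3}, {3, 4, 8}, {2, 3, 9}]   # the sets E_alpha

union_E = set().union(*family)
inter_E = set.intersection(*family)

# (a) A - (union of E_alpha) = intersection of (A - E_alpha)
assert A - union_E == set.intersection(*[A - E for E in family])
# (b) A - (intersection of E_alpha) = union of (A - E_alpha)
assert A - inter_E == set().union(*[A - E for E in family])
# (c) and (d): the complement versions are (a) and (b) with A = U
assert U - union_E == set.intersection(*[U - E for E in family])
assert U - inter_E == set().union(*[U - E for E in family])
```

Of course such a computation checks only one family of sets; the proof above is what establishes the laws in general.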

HW 2.1.1 (True or False and why)
(a) A ⊂ A ∩ B
(b) B − A = B ∩ A^c
(c) For Ek = (−1/k, 1/k), k ∈ N, E = ∪_{k=1}^{∞} Ek = ∅.
(d) For Ek = (−k, k), k ∈ N, E = ∩_{k=1}^{∞} Ek = R.
(e) A ∪ B = [A − (A ∩ B)] ∪ B

HW 2.1.2 Give set containment proofs of parts (c) and (g) of Theorem 2.1.3.

HW 2.1.3 Give Venn diagram proofs of part (h) of Proposition 2.1.3 and part
(c) of Proposition 2.1.6.

2.2 Basic Topology


Topology is the study of a family of sets with certain properties used to define
a topological space. Many of the properties proved in the topological space are
useful in the study of analysis. But we do not want or need all of the results
and abstractness of topology. We are going to study calculus on R so we want
some of the relevant concepts of the topology of R that will help us. The title of
the chapter and the title of this section are very appropriate. We do not claim
to be giving you the topology of R. As the titles imply we are going to give you
some of the basic topology on R—the topology that we want to use. In this
section we will introduce some of the most basic topology of the reals. In later
sections, when appropriate, we will add more topological results.
We begin by defining several ideas related to subsets of R.
Definition 2.2.1 Suppose x0 ∈ R and E is a subset of R.
(a) A neighborhood of a point x0 is the set Nr(x0) = {x ∈ R : |x − x0| < r} for some r > 0. The number r is called the radius of the neighborhood.
(b) A point x0 is a limit point (an accumulation point) of a set E if every neighborhood of x0 contains a point x ≠ x0 such that x ∈ E. We call the set of limit points of E the derived set of E and denote it by E′.
(c) If x0 ∈ E and x0 is not a limit point of E, then x0 is said to be an isolated point of E, i.e. x0 is an isolated point of E if x0 ∈ E and there exists a neighborhood of x0, Nr(x0), such that Nr(x0) ∩ E = {x0}.
(d) The set E is closed if every limit point of E is in E, i.e. E′ ⊂ E.
(e) A point x0 ∈ E is an interior point of E if there is a neighborhood N of x0 such that N ⊂ E. We call the set of interior points of E the interior of E and denote it by E°.
(f) The set E is open if every point of E is an interior point (E ⊂ E°, so that E = E°).
(g) The set E is dense in R if x ∈ R implies that x is a limit point of E or x ∈ E.
It should be easy to see from Proposition 1.5.9 that neighborhoods of a point x0 are intervals (x0 − r, x0 + r) where r > 0. Thus the intervals (0.9, 1.1), (0.5, 1.5) and (0.995, 1.005) are all neighborhoods of the point x0 = 1 with radii 0.1, 0.5, and 0.005, respectively.
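Because membership in Nr(x0) is just the inequality |x − x0| < r, a neighborhood test is a one-line computation. The sketch below is our illustration, not part of the development:

```python
# Membership in N_r(x0) = {x : |x - x0| < r} is a single strict inequality.
def in_neighborhood(x, x0, r):
    """True exactly when x lies in the neighborhood N_r(x0)."""
    return abs(x - x0) < r

assert in_neighborhood(1.04, 1.0, 0.1)     # 1.04 is in (0.9, 1.1)
assert not in_neighborhood(1.1, 1.0, 0.1)  # endpoints are excluded
```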
Example 2.2.1 Define the following sets: E1 = [0, 1], E2 = (0, 1), E3 = {1, 1/2, 1/3, · · · }.
(a) Show that E1′ = [0, 1] = E1.
(b) Show that E2′ = [0, 1].
(c) Show that E3′ = {0}.
Solution: (a) We begin by considering any point x0 ∈ E1, x0 ≠ 1, and let Nr(x0) be any neighborhood of x0. We note that the point x1 = min{x0 + r/2, 1} is in E1 and Nr(x0), and x1 ≠ x0, so the point x0 is a limit point of E1. Note that the point that we used, x1 = min{x0 + r/2, 1}, is not a very nice looking point, but we needed to be careful to choose a point that would be in both E1 and Nr(x0)—the choice being 1 when r is too large. Draw a picture to see why this point works.
If x0 = 1 and Nr(x0) is an arbitrary neighborhood of x0—for any r—then the point x1 = max{x0 − r/2, 0} is in E1 and Nr(x0), and x1 ≠ x0, so the point x0 = 1 is a limit point of E1. Thus every point in [0, 1] is a limit point of E1 = [0, 1].

If x0 ∉ [0, 1], say x0 > 1, then with r = (x0 − 1)/2 the neighborhood Nr(x0) is such that Nr(x0) ∩ E1 = ∅. Thus x0 is not a limit point of E1. A similar argument shows that any x0 such that x0 < 0 is not a limit point of E1.
Thus only the points in [0, 1] are limit points of E1 = [0, 1], i.e. E1′ = [0, 1] = E1.
We would like to emphasize that by the definition of a limit point, for the point x0 to be a limit point of E1, it must be shown that every neighborhood of x0 contains an element of E1 different from x0. To show that a point x0 is not a limit point of E1, we only have to show that there exists one neighborhood of x0 that does not contain any elements of E1 other than x0.
(b) If we consider the set E2 = (0, 1) and let x0 be an arbitrary point of E2, then for any neighborhood of x0, Nr(x0), the point x1 = min{x0 + r/2, (1 + x0)/2} will be in both E2 and Nr(x0), and not equal to x0. (Again we emphasize that we need the nasty looking point x1 because we use x0 + r/2 when r is sufficiently small and use (1 + x0)/2 when r is large.) Thus every point in (0, 1) is a limit point of E2.
Since for any r > 0 the neighborhoods Nr(0) = (−r, r) and Nr(1) = (1 − r, 1 + r) contain the points x0 = min{r/2, 1/2} and x1 = max{1 − r/2, 1/2}, respectively—both points in E2—and surely x0 ≠ 0 and x1 ≠ 1, the points x = 0 and x = 1 are both limit points of E2.
The same argument used for the set E1 can be used to show that all points x ∉ [0, 1] are not limit points of (0, 1). Thus only the points in [0, 1] are limit points of E2 = (0, 1), i.e. E2′ = [0, 1].
(c) To find E3′ for the set E3 = {1, 1/2, 1/3, · · · } is easy but messy. The easiest way is to first determine some facts concerning E3. It is very intuitive that given some element of E3 other than 1, say 1/m, m ∈ N, the elements of E3 that are closest to 1/m in value are 1/(m − 1) (the next larger element in the set) and 1/(m + 1) (the next smaller element in the set). Of course we must be able to prove these statements—if someone asked. The easiest way to prove them is to use the second Peano Postulate, which can be stated as: there are no natural numbers between m − 1 and m, or between m and m + 1. If there were some element of E3, say 1/k, such that 1/m < 1/k < 1/(m − 1), then we would have k < m and k > m − 1, which contradicts PP2.
If we proceed and choose any specific element of E3, say x0 = 1/1004, it is not difficult to see that the neighborhood Nr(1/1004) where r = 0.0000001 will not contain any elements of E3 other than x0 (because we can compute the values of the elements of E3 that are closest to x0—they lie at a distance of roughly 1/1009020 from x0). The same argument will work for a general element of E3, 1/m—Nr(1/m) with r = (1/2)(1/m − 1/(m + 1)) will always work. For x0 = 1 the neighborhood N = (0.99, 1.01) will be such that N contains no points of E3 other than x0 = 1. Thus no points of E3 are limit points of E3.
If we consider a point x0 > 1, then the neighborhood Nr(x0) with r = (x0 − 1)/2 will not contain any elements of E3. Thus x0 > 1 is not a limit point of E3.
If we consider a point x0 < 0, then the neighborhood Nr(x0) with r = −x0/2 will not contain any elements of E3. Thus x0 < 0 is not a limit point of E3.
We now consider a point x0 such that x0 ∉ E3 and 0 < x0 < 1. We know that there must be two elements of E3, say x1 = 1/(m − 1) and x2 = 1/m, such that x2 < x0 < x1—choose m by setting x2 = 1/m = lub{y ∈ E3 : y < x0} (you must prove that this least upper bound will be in E3) and let x1 = 1/(m − 1) be the next larger element in the set. We can then set r = min{(x0 − x2)/2, (x1 − x0)/2} and note that Nr(x0) ∩ E3 = ∅. Therefore, the points x0 such that x0 ∉ E3 and 0 < x0 < 1 are not limit points of E3.
The last point that we have to consider is x0 = 0. We let Nr(0) denote any neighborhood of x0, i.e. we consider any r. Then by Corollary 1.5.5–(b) with ε = r we see that there exists an n such that 1/n < r, i.e. 1/n ∈ Nr(0). Thus x0 = 0 is a limit point of E3. Thus we see that the only limit point of the set E3 is x0 = 0, i.e. E3′ = {0}.
As we promised you, the proof was not nice. However, it is a good example of a case where you consider different classes of points separately—showing that some of the points are limit points and that some of the points are not limit points.
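The final step of the argument—that every neighborhood Nr(0) contains some 1/n—is constructive. The Python sketch below is our numerical illustration of it; the "+ 2" is only a guard against floating-point rounding and is not needed in exact arithmetic:

```python
import math

# Every neighborhood N_r(0) contains a point 1/n of E3 = {1, 1/2, 1/3, ...},
# by the Archimedean property: pick any n > 1/r.
def element_of_E3_in_Nr0(r):
    """Return a point 1/n of E3 lying in N_r(0), for r > 0."""
    n = math.floor(1 / r) + 2      # then n > 1/r, so 1/n < r
    return 1.0 / n

for r in (0.5, 0.01, 1e-6):
    x = element_of_E3_in_Nr0(r)
    assert 0 < x < r               # a point of E3 in N_r(0), distinct from 0
```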

It should be clear that all of the points in E3 are isolated points. Likewise, if we consider the set N, since none of the points of N are limit points (for any k ∈ N, the neighborhood N1/2(k) does not contain any elements of N other than k), all of the points in N are isolated. It should also be easy to see that no points of E1 or E2 are isolated points—all of the points in both E1 and E2 are limit points.
Example 2.2.2 Let E1 , E2 and E3 be as in Example 2.2.1.
(a) Show that E1 is a closed set.
(b) Show that E2 is not a closed set.
(c) Show that E3 is not a closed set.
Solution: These proofs are very easy based on the work done in Example 2.2.1. Since we saw that E1′ = [0, 1] = E1, clearly all of the limit points of E1 are contained in E1 and the set E1 is closed.
Since we found that E2′ = [0, 1], we see that the limit points 0 and 1 do not belong to E2, so the set E2 is not closed. Likewise, since we saw that the only limit point of E3 is the point 0 and 0 ∉ E3, the set E3 is not closed.

We should note that if we considered E4 = E3 ∪ {0}, E4 would surely be closed—almost any time you define a new set by adjoining the limit points to a set, the resulting set will be closed. (Can you give an example where that is not the case?) Also, since we saw that N has no limit points, the set N is closed—any set that has no limit points is closed. This statement would then also imply that the empty set, ∅, is closed. It should be really easy to see that R′ = R and hence that R is closed—for x0 ∈ R, the whole neighborhood Nr(x0) ⊂ R for any r.
Example 2.2.3 Let E1, E2 and E3 be as in Example 2.2.1.
(a) Show that E1° = (0, 1).
(b) Show that E2° = (0, 1) = E2.
(c) Show that E3° = ∅.
Solution: (a) For x0 ∈ (0, 1) it should be easy to see that Nr(x0) ⊂ E1 if r = min{x0/2, (1 − x0)/2}. Thus x0 ∈ (0, 1) implies x0 ∈ E1°. It is easy to see that there is no r such that Nr(0) or Nr(1) will be contained in E1—in the first case −r/2 ∉ E1 and in the second case 1 + r/2 ∉ E1. And since a point must be an element of the set to be an interior point, we do not have to consider any other points. Therefore E1° = (0, 1).
(b) For points x0 ∈ (0, 1) exactly the same argument used in part (a) will show that x0 is an interior point of E2. All other points are not in E2, so E2° = (0, 1).
(c) In part (c) of Example 2.2.1 we showed that for x0 ∈ E3 there was a neighborhood Nr(x0) that did not contain any points of E3 other than x0. So surely Nr(x0) ⊄ E3.
Clearly, any neighborhood Nr1(x0) with r1 < r will also not contain any points of E3 other than x0. Thus Nr1(x0) ⊄ E3.
And though a neighborhood Nr1(x0) with r1 > r may contain some elements of E3, such a neighborhood will also always contain Nr(x0)—which contains a large number of points that are not in E3. Thus there is no neighborhood of x0 that is contained in E3.
Since only points of E3 need be considered, E3° = ∅.

It should be clear that the set N does not contain any interior points and every
point in R is an interior point.
It should now be easy to see that since 0 ∉ E1° (1 would work too), the set E1 is not open. Since E2° = (0, 1) = E2 (i.e. every point in E2 is an interior point), the set E2 is an open set. Clearly, since 1/120012 ∉ E3° (and of course any element of E3 would work here), the set E3 is not open. And finally, it should be easy to see that N is not open and R is open.
The question of whether a set is dense in R is more difficult, but we do not want to consider many examples. Hopefully it is clear that sets like E1, E2, E3 and N are clearly not dense in R—you must have much bigger sets than these to be dense in R. It should be clear that E = R is trivially dense in R. The two important examples were already considered in Proposition 1.5.6. Consider the following.
Example 2.2.4 (a) Show that Q is dense in R.
(b) Show that I is dense in R.
Solution: (a) By the definition of I, a point x0 ∈ R is either in Q or in I. Let x0 be an arbitrary point of R. We must show that x0 is in Q or x0 is a limit point of Q. If x0 ∈ Q, we are done. Suppose that x0 ∉ Q (so x0 ∈ I). Consider Nr(x0) for any r, i.e. the interval (x0 − r, x0 + r). Then by Proposition 1.5.6–(a) (with a chosen to be x0 − r and b chosen to be x0 + r) there exists a rational r1 such that x0 − r < r1 < x0 + r. Therefore x0 is a limit point of Q.
Since any point of R is either in Q or a limit point of Q, Q is dense in R.
(b) The proof of part (b) follows the same pattern as the proof of part (a) except that we use part (b) of Proposition 1.5.6 instead of part (a).
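The rational-in-an-interval step is in fact constructive: choosing n with 1/n < b − a, some multiple m/n must land in (a, b). The hypothetical helper below is our sketch of this construction, not the text's proof, and it assumes the floating-point arithmetic does not disturb the floors:

```python
import math
from fractions import Fraction

def rational_between(a, b):
    """Return a Fraction r with a < r < b, assuming a < b."""
    n = math.floor(1 / (b - a)) + 1    # then 1/n < b - a
    m = math.floor(a * n) + 1          # smallest integer with m/n > a
    return Fraction(m, n)              # and m/n <= a + 1/n < b

r = rational_between(math.sqrt(2), 1.5)
assert math.sqrt(2) < r < 1.5          # a rational inside an interval with an irrational endpoint
```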

Hopefully the above examples give us an understanding of the ideas presented in Definition 2.2.1. We now proceed to prove some important properties concerning limit points, and open and closed sets.

Proposition 2.2.2 A neighborhood is an open set.

Proof: You should note that this proof is, and should be, very similar to the proof that E2° = E2 = (0, 1) and that E2 is open in Example 2.2.3–(b) and the statements following that example. We write the neighborhood N as N = (x0 − r, x0 + r). If we choose any point y0 ∈ N, then it is clear that the neighborhood of y0, Nr1(y0) = (y0 − r1, y0 + r1) where r1 = (1/2) min{r − (x0 − y0), r − (y0 − x0)}, is in N (draw a picture to help see that this is true). Thus y0 is an interior point of N, so N is open (N° = N).

Proposition 2.2.3 If x0 is a limit point of E ⊂ R and N is any neighborhood of x0, then N contains infinitely many points of E.

Proof: Since N is a neighborhood of x0 and x0 is a limit point of E, we know that there exists a point of E in N different from x0. Call this point x1. Then consider the neighborhood of x0, N1 = (x0 − d1, x0 + d1) where d1 = max{x0 − x1, x1 − x0} (we're just choosing the positive distance from x0 to x1). It should be clear by the construction of N1 that N1 ⊂ N and x1 ∉ N1.
Then since x0 is still a limit point of E and N1 is a neighborhood of x0, there exists x2 ∈ N1 such that x2 ∈ E and x2 ≠ x0. Since x1 ∉ N1, x2 ≠ x1. Let N2 = (x0 − d2, x0 + d2) where d2 = max{x0 − x2, x2 − x0}. Then N2 is another neighborhood of x0 and by construction N2 ⊂ N1 ⊂ N, and x1 ∉ N2 and x2 ∉ N2. Since x0 is a limit point of E, there exists x3 ∈ N2 such that x3 ∈ E, x3 ≠ x0, x3 ≠ x1 and x3 ≠ x2.
Inductively, we define a set of points {x1, x2, · · · } and neighborhoods N1, N2, · · · such that for any n, xn+1 ∈ Nn and xn+1 ∈ E (because Nn is a neighborhood of x0 and x0 is a limit point of E), where Nn+1 is defined to be Nn+1 = (x0 − dn+1, x0 + dn+1) with dn+1 = max{x0 − xn+1, xn+1 − x0}, so that Nn+1 is a neighborhood of x0 and Nn+1 ⊂ Nn ⊂ N.
Thus the points x1, x2, · · · are distinct by construction and all lie in both N and E, so N contains infinitely many points of E.
From this result we obtain the following useful corollary.
Corollary 2.2.4 If E is a finite set, then E has no limit points.
Proposition 2.2.5 The set E ⊂ R is open if and only if E^c is closed.

Proof: (⇐) We suppose that E^c is closed and that x0 is an arbitrary point of E. Then x0 ∉ E^c (definition of E^c) and x0 is not a limit point of E^c (because E^c is closed, it contains all of its limit points). Since x0 is not a limit point of E^c, we know that there exists a neighborhood of x0, N, such that N ∩ E^c = ∅, i.e. N ⊂ E. Therefore x0 is an interior point of E, E ⊂ E°, so E is open.
(⇒) Now suppose that E is open and that x0 is a limit point of E^c. Then every neighborhood of x0 contains a point of E^c, i.e. no neighborhood of x0 is contained in E. Therefore, x0 is not an interior point of E. Since E is open (E = E°), this implies that x0 ∉ E, i.e. x0 ∈ E^c. Thus E^c is closed.
We then obtain the following corollary to the above result.
Corollary 2.2.6 The set F ⊂ R is closed if and only if F c is open.
We close this section with a very important result. We leave the proof of this proposition to the reader in HW2.2.3 and HW2.2.4.
Proposition 2.2.7 (a) If E1, E2 ⊂ R are open, then E1 ∪ E2 is open.
(b) Suppose E1, E2, · · · ⊂ R are open. Then ∪_{k=1}^{∞} Ek is open.
(c) If E1, E2 ⊂ R are open, then E1 ∩ E2 is open.
(d) If E1, E2 ⊂ R are closed, then E1 ∩ E2 is closed.
(e) Suppose E1, E2, · · · ⊂ R are closed. Then ∩_{k=1}^{∞} Ek is closed.
(f) If E1, E2 ⊂ R are closed, then E1 ∪ E2 is closed.

HW 2.2.1 (True or False and why)


(a) The set E = {x ∈ [0, 1] : x ∈ Q} = [0, 1] ∩ Q is open.
(b) The set E = {x ∈ [0, 1] : x ∈ I} = [0, 1] ∩ I is closed.
(c) The set E = [0, 1] ∪ {x : x = 1 + 1/n, n ∈ N} is closed.
(d) If E = [0, 1] ∩ Q, then E° = (0, 1).
(e) A neighborhood is closed.
(f) If E is a finite set, E is closed.
HW 2.2.2 Determine the limit points of the set {x ∈ R : x = 1/n + 1/m, n, m ∈ N}.
HW 2.2.3 (a) Suppose E1 , E2 ⊂ R are open. Prove that E1 ∪ E2 is open.
(b) Suppose E1 , E2 ⊂ R are open. Prove that E1 ∩ E2 is open.
(c) Suppose E1 , E2 ⊂ R are closed. Prove that E1 ∪ E2 is closed.
(d) Suppose E1 , E2 ⊂ R are closed. Prove that E1 ∩ E2 is closed.


HW 2.2.4 (a) Suppose E1, E2, · · · ⊂ R are open. Prove that ∪_{k=1}^{∞} Ek is open.
(b) Suppose E1, E2, · · · ⊂ R are closed. Prove that ∩_{k=1}^{∞} Ek is closed.
(c) Suppose E1, E2, · · · ⊂ R are open. Show that ∩_{k=1}^{∞} Ek need not be open.
(d) Suppose E1, E2, · · · ⊂ R are closed. Show that ∪_{k=1}^{∞} Ek need not be closed.

2.3 Compactness in R
The concept of compactness of sets is very important in analysis. We will use compactness results in later chapters, and you will probably use them throughout your mathematical career. This is a tough section where the proofs are difficult—probably more difficult than those we have done so far. If the section seems too tough at this time, read the results, consider the examples carefully, and maybe try the proofs again later when you use the results—and hopefully are better at proofs.
We make the following two definitions.

Definition 2.3.1 The collection {Gα}α∈S of open subsets of R is an open cover of the set E ⊂ R if E ⊂ ∪_{α∈S} Gα.

Definition 2.3.2 A set K ⊂ R is said to be compact if every open cover of K contains a finite subcover.

The concept of compactness is an abstract concept. We will give several examples of compact and non-compact sets, but as you will see, this is difficult.
If we consider the collection of sets Gk = (k − 1/2, k + 1/2), k ∈ N, it should be clear that {Gk} is an open cover of the set N. It should also be clear that we cannot choose a finite subcover. Thus the set N is not compact.
Also consider the set (0, 1] and the collection of sets Gk = (1/k, 2), k = 1, 2, · · · . For any x ∈ (0, 1] there exists k ∈ N such that 1/k < x (Corollary 1.5.5-(b)), i.e. x ∈ (1/k, 2). Thus the collection {Gk} covers (0, 1]. Let Gα1, · · · , Gαn be any finite sub-collection of {Gk}. One of these sets will be associated with the largest k value—the smallest 1/k, k = k0 (let k0 = lub{α1, · · · , αn}). Then the point (1/2)(1/k0) is not included in ∪_{k=1}^{n} Gαk = (1/k0, 2), i.e. the set (0, 1] is not compact.
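This failure of finite subcovers can be tested directly. In the Python sketch below (our illustration; the particular finite sub-collection is an arbitrary choice) the point (1/2)(1/k0) always escapes:

```python
# The cover G_k = (1/k, 2) of (0, 1] has no finite subcover: any finite choice
# of indices misses the point (1/2)(1/k0), where k0 is the largest index used.
def covered(x, indices):
    """Is x in the union of the sets G_k = (1/k, 2) for k in indices?"""
    return any(1.0 / k < x < 2.0 for k in indices)

finite_subcover = [1, 3, 7, 50]         # an arbitrary finite sub-collection
k0 = max(finite_subcover)
witness = 0.5 * (1.0 / k0)              # a point of (0, 1] below every 1/k used
assert 0 < witness <= 1
assert not covered(witness, finite_subcover)
assert covered(witness, range(1, 200))  # but some larger G_k does contain it
```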
It would be nice to have an example of a compact set. Suppose that E is a finite subset of R, say E = {a1, · · · , aK}, and {Gα}α∈S is any open cover of E, i.e. E ⊂ ∪_{α∈S} Gα. Then for each aj ∈ E there must exist some Gαj in the collection {Gα} such that aj ∈ Gαj—aj may be in a lot of Gα's but who cares. Then {Gαj}_{j=1}^{K} is a finite subcover of E, E ⊂ ∪_{j=1}^{K} Gαj, so the set E is compact.
We understand that the set E in this last example is a trivial example whereas N and (0, 1] are more interesting sets. The truth of the matter is that in general it is much easier to prove that a set is not compact (to show that it is not compact, you only have to find one cover that has no finite subcover) than it is to prove that a set is compact (to show that it is compact you must show that for any open cover, you can find a finite subcover). Later we will use some of our theorems to produce other sets that are compact (and some that are not compact).
We need two different types of results concerning compactness. We first need
some general methods that help us determine when and if a given set is compact.
In addition we need some results that give us some of the useful properties of
compact sets—this is why we need and want the concept of compactness. We
begin with the following result.
Proposition 2.3.3 If K ⊂ R is compact, then K is closed.

Proof: We will prove this result by showing that K^c is open (and then apply Propositions 2.2.5 and 2.1.5).
Suppose that x ∈ K^c. Then for any point y ∈ K, we can choose neighborhoods Vy and Wy of the points x and y, respectively, of radius r = |x − y|/4 (and since x ≠ y, r > 0). The collection of sets {Wy}, y ∈ K, will surely define an open cover of the set K—y ∈ K implies y ∈ Wy. Since the set K is compact, we can choose a finite number of sets Wy1, Wy2, · · · , Wyn that cover K, i.e. K ⊂ W = ∪_{k=1}^{n} Wyk.
Let V = ∩_{k=1}^{n} Vyk. Since each Vyk is a neighborhood of the point x and we are considering only a finite number of such neighborhoods, V will also be a neighborhood of the point x—of radius min{|x − y1|/4, · · · , |x − yn|/4}. (Note that the sets Vyk, k = 1, · · · , n form a nested set of neighborhoods all about the point x—we don't know in what order. The set V will be the smallest of those neighborhoods.) Since Vyk ∩ Wyk = ∅ for k = 1, · · · , n, V ∩ W = ∅. Since K ⊂ W, V ⊂ K^c—draw a picture, it's easy. Therefore V is a neighborhood of x ∈ K^c such that V ⊂ K^c, so x is an interior point of the set K^c. Since x was an arbitrary point of K^c, the set K^c is open, and then K = (K^c)^c is closed.
Since we know that the set N is closed but not compact, we know that we cannot obtain the converse of the above result. We can, however, prove the following "partial converse."
Proposition 2.3.4 If the set K ⊂ R is compact and F ⊂ K is closed, then F
is compact.

Proof: Let the collection of sets {Vα} be an open cover of F. Since F ⊂ K, the sets {Vα} will cover part of K. Since F is closed, we know that F^c is open. Since the set F^c will cover the part of K that the collection {Vα} might not cover, the collection of sets {Vα} along with F^c will cover K. (It may be the case that {Vα} sometimes covers all of K without F^c, but we don't care.) Since K is compact we can choose a finite subcover, Vα1, · · · , Vαn plus maybe F^c. Since F ⊂ K, this subcover must cover F also.
If F^c was included in the subcover, then we can throw it out and Vα1, · · · , Vαn will still cover F—because F^c didn't cover any part of F.
Otherwise, if F^c was not included in the subcover, then clearly Vα1, · · · , Vαn covers F.
In either case we have found a finite subcover of the collection of sets {Vα} which covers F. Therefore F is compact.
You should understand that the above proof is especially abstract since you start with a given open cover about which you know nothing. You still must find a finite subcover—and we do. That is generally a tough job.
We next give a result that will be very important to us later. We will often care whether certain sets have limit points. This result guarantees that a set has a limit point whenever the set is infinite and is contained in a compact set.
Proposition 2.3.5 If K ⊂ R is compact, and the set E is an infinite subset of
K, then E has a limit point in K.

Proof: Suppose the result is false, i.e. suppose that K is compact, E ⊂ K is infinite, and E has no limit points in K. Then for any x ∈ K (which would not be a limit point of E) there exists a neighborhood of x, Nx, such that if x ∈ E (it need not be), then Nx ∩ E = {x}, and if x ∉ E, then Nx ∩ E = ∅. (Since x is not a limit point of E, it is not the case that every neighborhood of x contains a point of E other than possibly x. So there is some neighborhood of x that does not contain any point of E other than possibly x.)
The collection of all such neighborhoods, Nx, x ∈ K, is surely an open cover of K (the sets are open because they are neighborhoods, and they cover K because for any x ∈ K, x ∈ Nx). Clearly no finite subcollection of these sets can cover E—each set Nx contains at most one point of E and the set E is infinite. If no finite subcollection can cover E, no finite subcollection can cover K since E ⊂ K. This contradicts the fact that the set K is compact. Therefore the set E has at least one limit point in K.
When you think about the next statement it probably seems clear. We need
this result proved because it will be very important for us.
Proposition 2.3.6 Suppose that {In}_{n=1}^{∞} is a collection of closed intervals in R such that In+1 ⊂ In for all n = 1, 2, · · · . Then ∩_{n=1}^{∞} In is not empty.

Proof: Write the intervals as In = [an, bn]. Let E = {an : n = 1, 2, · · · } and x = lub(E) (the least upper bound of E exists because an ≤ b1 for all n). Then it is clear that x ≥ an for all n.
We note that In+1 ⊂ In implies that an ≤ an+1 ≤ bn+1 ≤ bn. We claim that for any natural numbers n and m, an ≤ an+m ≤ bn+m ≤ bn. This can be proved by fixing n and using induction on m.
Step 1: We know that since In+1 ⊂ In, the statement is true for m = 1.
Step 2: We assume that the statement is true for m = k, i.e. an ≤ an+k ≤ bn+k ≤ bn.
Step 3: We now prove that the statement is true for m = k + 1, i.e. an ≤ an+(k+1) ≤ bn+(k+1) ≤ bn. We know from the hypothesis of the proposition that I(n+k)+1 ⊂ In+k. This implies that an+k ≤ a(n+k)+1 ≤ b(n+k)+1 ≤ bn+k. This along with the inductive hypothesis implies that an ≤ an+k ≤ a(n+k)+1 ≤ b(n+k)+1 ≤ bn+k ≤ bn, i.e. an ≤ an+(k+1) ≤ bn+(k+1) ≤ bn, which is what we were to prove.
Therefore, by the Principle of Mathematical Induction,

an ≤ an+m ≤ bn+m ≤ bn for all n and m. (2.3.1)

Interchanging the roles of n and m, this also shows that

am ≤ an+m ≤ bn+m ≤ bm for all n and m. (2.3.2)

Using the first three inequalities of (2.3.1) and the last inequality of (2.3.2) we see that an ≤ an+m ≤ bn+m ≤ bm. Thus for any m, bm is an upper bound of E. Therefore x = lub(E) ≤ bm for all m. Thus, since am ≤ x ≤ bm for all m, x ∈ Im for all m, i.e. x ∈ ∩_{n=1}^{∞} In. Therefore ∩_{n=1}^{∞} In ≠ ∅.
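The proof is constructive: x = lub{an} lies in every interval. For a finite family the least upper bound of the left endpoints is simply their maximum, which the Python sketch below (our illustration, with the arbitrary choice In = [−1/n, 1/n]) exploits:

```python
# Nested closed intervals I_n = [a_n, b_n] with I_{n+1} contained in I_n;
# here I_n = [-1/n, 1/n] for n = 1, ..., 99 as an illustration.
intervals = [(-1.0 / n, 1.0 / n) for n in range(1, 100)]

x = max(a for a, b in intervals)    # lub of the left endpoints (a max, finitely many)
assert all(a <= x <= b for a, b in intervals)   # x lies in every I_n
```

For the infinite family the intersection is the single point 0, and the point x computed above approaches it as more intervals are included.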
We next start proving some theorems that will give us a better idea of what
compact sets might look like. We begin with the first, very basic result.
Proposition 2.3.7 For a, b ∈ R with a < b the set [a, b] is compact.

Proof: We begin by setting a0 = a, b0 = b, denoting the interval [a0, b0] by I0 and assuming that the set I0 is not compact, i.e. that there exists an open cover of I0, {Gα}, which contains no finite subcover.
We next consider the intervals [a0, c0] and [c0, b0] where c0 = (a0 + b0)/2. At least one of these two intervals cannot be covered by a finite subcollection of {Gα}—if both subintervals could be covered by a finite subcollection of {Gα}, so could their union, which is I0. Denote whichever subinterval cannot be covered by a finite subcollection of {Gα} by I1 and denote the end points of this interval by a1 and b1 (if neither subinterval can be covered by a finite subcollection, choose either).
We next consider the intervals [a1, c1] and [c1, b1] where c1 = (a1 + b1)/2. Again at least one of these two intervals cannot be covered by a finite subcollection of {Gα}—denote this subinterval by I2.
We inductively define a collection of closed intervals, {In}_{n=0}^{∞}, that satisfy the following properties: (i) In+1 ⊂ In for n = 0, 1, 2, · · · , (ii) In is not covered by any finite subcollection of {Gα}, and (iii) the length of the interval In is (b − a)/2^n.
We apply Proposition 2.3.6 to this collection of closed intervals to get x0 such that x0 ∈ ∩_{n=0}^{∞} In, i.e. x0 ∈ In for all n. Since x0 ∈ I0 (and all the others) and the collection {Gα} covers I0, x0 ∈ Gα0 for some α0. Since Gα0 is open, there exists a neighborhood of x0, say Nr(x0) for some r > 0, such that x0 ∈ Nr(x0) and Nr(x0) ⊂ Gα0. If we choose n0 such that (b − a)/2^{n0} < r/2, then In0 will be contained in Nr(x0)—remember x0 ∈ In for all n. But then In0 ⊂ Nr(x0) ⊂ Gα0, so the single set Gα0 is a finite subcollection of {Gα} that covers In0. This contradicts (ii) above.
Therefore there is no open cover of [a, b] without a finite subcover, and the set [a, b] is compact.

We should be a little careful above where we chose n0 such that (b − a)/2^{n0} < r/2. However, we can do this. By Corollary 1.5.4 (the Archimedean property) we can choose n0 such that n0 > 2(b − a)/r (letting "a" = 1 and "b" = 2(b − a)/r in Corollary 1.5.4—where "a" and "b" are the a and b of the Archimedean property). It is then easy to use Mathematical Induction to prove that 2^n > n for all n, so that 2^{n0} > n0 > 2(b − a)/r, i.e. (b − a)/2^{n0} < r/2. We don't really want to stop and prove everything like this, but we must realize that we must be ready and able to do so if asked.
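The count n0 is easy to compute by repeated halving. The Python sketch below is our illustration of the estimate (b − a)/2^{n0} < r/2; the loop terminates precisely because of the Archimedean property:

```python
# The bisection in the proof halves the interval length at each step, so the
# length of I_n is (b - a)/2**n; it eventually drops below r/2.
def halvings_needed(a, b, r):
    """Smallest n with (b - a)/2**n < r/2."""
    n, length = 0, b - a
    while length >= r / 2:
        length /= 2.0
        n += 1
    return n

n0 = halvings_needed(0.0, 1.0, 0.01)
assert (1.0 - 0.0) / 2**n0 < 0.01 / 2          # length of I_{n0} is below r/2
assert (1.0 - 0.0) / 2**(n0 - 1) >= 0.01 / 2   # and n0 is the smallest such n
```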
We next prove a very important theorem that gives a characterization of compact sets. This result is known as the Heine-Borel Theorem.
Theorem 2.3.8 (Heine-Borel Theorem) A set E ⊂ R is compact if and
only if E is closed and bounded.

Proof: (⇒) We begin by assuming that the set E is compact but is not
bounded. If E is not bounded we know that the set E either does not have
an upper bound or does not have a lower bound. Let’s suppose that E does not
have an upper bound. Then there exist points xn ∈ E such that xn > n for
n = 1, 2, · · · . Clearly the set E1 = {x1 , x2 , · · · } is an infinite subset of E that
does not have a limit point in E (the set E1 doesn’t even have a limit point in
R). This contradicts Proposition 2.3.5. Therefore the set E must be bounded.
We now suppose that E is not closed. This implies that there is a limit point of E, x0, such that x0 ∉ E. From Proposition 2.2.3 we know that every neighborhood of x0 contains infinitely many points of E. We will use a construction similar to that used in Proposition 2.2.3. Since x0 is a limit point of E, there exists a point x1 ∈ E such that x1 ∈ N1(x0) (the neighborhood of radius 1). Likewise, there is a point x2 ∈ E such that x2 ∈ N1/2(x0). In general (or inductively), there exists a point xn ∈ E such that xn ∈ N1/n(x0) for n = 1, 2, · · · . Set E1 = {x1, x2, · · · }. E1 is a subset of E and is an infinite set. (Otherwise an infinite number of the xj's would have to be equal to some single point; since xn ∈ N1/n(x0) for every n, that point would have to be x0, and x0 ∉ E.)
We want to show that E1 does not have a limit point in E. Since x0 ∉ E, any limit point of E1 lying in E cannot be x0. We will next show that nothing else will be a limit point of E1. For any y0 ∈ R, y0 ≠ x0, we have

|xn − y0| = |(x0 − y0) − (x0 − xn)| ≥ |x0 − y0| − |x0 − xn|  (by Prop 1.5.8-(vi))
          ≥ |x0 − y0| − 1/n  (since the point xn ∈ N1/n(x0)).   (2.3.3)
If we choose n0 so that 1/n0 < (1/2)|x0 − y0| (which is possible by Corollary 1.5.5–(b)), then for all n ≥ n0, 1/n < (1/2)|x0 − y0|. Then by (2.3.3) we have

|xn − y0| ≥ |x0 − y0| − (1/2)|x0 − y0| = (1/2)|x0 − y0|.

Since a neighborhood of y0 of radius less than (1/2)|x0 − y0| can include only a finite number of elements of E1, y0 cannot be a limit point of E1. Thus E1 is
an infinite subset of the compact set E which has no limit point in E. This
contradicts Proposition 2.3.5. Therefore the set E is closed.
(⇐) Since E is bounded, there exist a, b ∈ R such that a < b and E ⊂ [a, b]. Since [a, b] is compact (by Proposition 2.3.7) and E is closed, E is compact by Proposition 2.3.4, which is what we were to prove.
Proposition 2.3.7 gives us a lot of compact sets. Theorem 2.3.8 makes it easier yet to determine whether certain sets are compact. For example we know that the sets (0, 1), [0, 1] ∩ Q and [0, ∞) are not compact. And the sets {0, 1, 1/2, 1/3, · · · } and [0, 1] ∪ {3/2} ∪ [2, 3] are compact. The next result helps us use the compact sets that we have to build more.
Proposition 2.3.9 (a) If E1 , E2 ⊂ R are compact, then E1 ∪ E2 is compact.
(b) If E1 , E2 ⊂ R are compact, then E1 ∩ E2 is compact.
Proof: (a) Suppose that {Gα} is an open cover of E1 ∪ E2. Then {Gα} is an open cover of E1 (so we can find a finite subcover of E1) and of E2 (so we can find a finite subcover of E2). If we include all of the sets in these two subcovers, we will get a finite subcover of E1 ∪ E2.
(b) Since E1 , E2 ⊂ R are compact, we know from Theorem 2.3.8 that E1 and
E2 are both closed and bounded. By HW2.2.3-(d) we know that E1 ∩ E2 is
closed. It should be easy to see that it also follows that E1 ∩ E2 is bounded.
Hence E1 ∩ E2 is compact.
We next prove the converse of Proposition 2.3.5. We should realize that this
next result along with Proposition 2.3.5 provides an alternative to the definition
of compactness.
Proposition 2.3.10 If K ⊂ R is such that any infinite subset of K has a limit point in K, then K is compact.
Proof: Consider the following statement.

Result**: If K ⊂ R is such that every infinite subset E of K has a limit point in K, then K is closed and bounded.
It should be clear that if we can prove the above result, we can apply The-
orem 2.3.8 to get the desired result.
In effect Result** has already been proved—in disguise. In the ⇒ direction
of the proof of Theorem 2.3.8 we supposed that the set was not closed and
bounded (doing one at a time) and showed that we had an infinite subset of K
that did not have a limit point in K (which would contradict the hypothesis of
Result**)—and hence contradicted Proposition 2.3.5. This same proof (without
the contradiction of Proposition 2.3.5) will prove Result**. Then as we said, we
apply Theorem 2.3.8 and get the desired result.
We close with a result that will be very useful later.

Theorem 2.3.11 Every bounded infinite subset of R has a limit point in R.
Proof: Let E be a bounded infinite subset of R. Since E is bounded, E ⊂ [a, b] for some a and b in R. Since [a, b] is compact, by Proposition 2.3.5 E has a limit point in [a, b], i.e. E has a limit point in R.
HW 2.3.1 (True or False and why)
(a) The set [0, 2] ∪ [3, 4] is compact.
(b) If E ⊂ R is bounded, E is compact.
(c) If E1 , E2 ⊂ R and E1 ∪ E2 is compact, then E1 and E2 are compact.
(d) The set [0, 1] ∩ I is compact.
(e) If E is open and bounded, then E c is compact.
HW 2.3.2 (a) Prove that if E1 , · · · , En ⊂ R are compact, then ∪_{j=1}^{n} Ej is compact.
(b) Show that if E1 , E2 , · · · ⊂ R are compact, it is not necessarily the case that ∪_{j=1}^{∞} Ej is compact.
HW 2.3.3 (a) Give an open cover of the set (0, 1) that does not have a finite
subcover.
(b) Give an open cover of the set [1, ∞) that does not have a finite subcover.
Chapter 3

Limits of Sequences

3.1 Definition of Sequential Limit
Hopefully we now have enough of an understanding of some of the background material so that we can start considering some of the traditional topics of calculus. The first topics that we will study are sequences and limits of sequences. It is highly likely that your first calculus course did not start with limits of sequences, but we think it is the most insightful place to start. We assume that you have worked with functions and know what a function is, but to make the material as concrete as possible we begin with the definition of a function.
Definition 3.1.1 Suppose that D and R are subsets of R. If f is a rule that assigns one, and only one, element y ∈ R to each x ∈ D, then f is said to be a function from D to R. The set D is referred to as the domain of f and R is the range of f. We write f : D → R and often denote an element y ∈ R that is associated with x ∈ D by the function f as y = f(x). f is also called a map from D into R.
We note that f (x) must be defined for each element x ∈ D. We also note that it
is not necessary that each element of R be associated with some element of D.
For D1 ⊂ D we define the set f (D1 ) = {y ∈ R : y = f (x) for some x ∈ D1 }.
f (D1 ) is called the image of D1 . Obviously, f (D) ⊂ R and f (D1 ) ⊂ R. Since
every element of R need not be associated with some element of D, f (D) need
not be equal to R. If f (D) = R, f is said to be onto and we say f maps D
onto R. When working with functions, the domain and range can be any sort
of sets. In our work the domain and the range will not only be subsets of the
set of real numbers, but (except for the definition of a sequence) will most often
be intervals of R or all of R. We will not dwell on these definitions now—we will try to make a point to explicitly define the domain, range, etc. in examples later.
Sequences: Definition and Examples As we see in the next definition, a
sequence is just a special function.
Definition 3.1.2 A sequence is a function with domain
{n ∈ Z : n ≥ m} for some m ∈ Z.
We note that usually m = 0 or 1. We did not specify the range of the function in the definition of a sequence. The range can really be any set–but in our work it will almost always be a subset of the reals. Using this definition we could define a sequence by setting D = N and f(n) = 1/(n^2 + 1) for each n ∈ D. But this is not how it's usually done. Because the potential domains can easily be listed in order (especially N), we would usually write the above sequence as
1/2, 1/5, 1/10, · · · ,
where we assume that the reader can figure out the rest of the terms. If we
think that there’s a good chance that the reader will not be able to recognize
the general term, we might write
1/2, 1/5, 1/10, · · · , 1/(n^2 + 1), · · ·
or just {1/(n^2 + 1)}_{n=1}^{∞}. Often the sequence will be listed without a specific description of the domain such as
1, 2, 5, 10, · · ·
or
3/4, 8/9, 15/16, · · · ,
where, while you are figuring out the formula that generates the sequence, you will be expected to also come up with the domain. You should realize that the domain and formula are not unique–but they had better be equivalent. For example, the last sequence could be expressed as 1 − 1/n^2 for n = 2, 3, · · · , or you could write the same sequence as (n^2 + 2n)/(n + 1)^2 for n = 1, 2, 3, · · · . When we are discussing a general sequence, instead of using the function notation we will write the sequence as a1 , a2 , · · · , {an} for n = 1, 2, · · · , or {an}_{n=1}^{∞}.
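The claim that the two formulas above describe the same sequence is easy to spot-check; this little sketch (ours, not part of the text) compares the first ten terms exactly:

```python
from fractions import Fraction

# Two equivalent descriptions of 3/4, 8/9, 15/16, ... with different domains.
first = [Fraction(1) - Fraction(1, n**2) for n in range(2, 12)]       # n = 2, 3, ...
second = [Fraction(n**2 + 2 * n, (n + 1)**2) for n in range(1, 11)]  # n = 1, 2, ...

assert first == second
assert first[:3] == [Fraction(3, 4), Fraction(8, 9), Fraction(15, 16)]
```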
If we return to our discussion of plus and minus infinity from Section 1.5,
recall that we specifically included the property that for all n ∈ N we have
1 ≤ n < ∞. This allows us to consider the natural numbers sequentially, starting
at 1 and approaching infinity. Likewise, using the set N as our indexing set, we
can consider a sequence {an } starting at a1 and continuing with increasing n
as n approaches infinity. We are interested in which value, if there is such a
value, an approaches (gets close to) as n approaches infinity. We will write this limiting value as lim an as n → ∞ or lim_{n→∞} an. Just how we treat n approaching ∞ hopefully will be made clear below, i.e. how we treat "large n" in Definition 3.1.3 below. It should not be hard to see that the limits of the three sequences given above are 0, ∞ and 1. Of course, we must make the claim in a very precise manner. We need a definition so that anyone using the definition will get the same results. Anyone using the idea of the limit of a sequence will know precisely what they mean.
When you are just beginning, it is not easy or clear to imagine how the limit
of a sequence should be defined. We make the following definition.
Definition 3.1.3 Consider a real sequence {an} and L ∈ R. lim_{n→∞} an = L if for every ε > 0 there exists an N ∈ R such that n > N implies that |an − L| < ε.
If lim_{n→∞} an = L, we say that the sequence {an} converges to L; we sometimes write an → L as n → ∞ (read: an approaches L as n approaches ∞), or just an → L, assuming that the reader knows that n will be going to ∞.
An explanation of the definition of a limit that we like to use is "for every measure of closeness to L" (that's what ε measures) there exists "a measure of closeness to ∞" (that's what N measures) so that whenever "n is close to ∞", "an is close to L." Recall the statements preceding the definition where we discussed the sequence an as "n gets large," and "n approaches infinity." These concepts have been made rigorous by the requirement that there exists an N such that for all n > N, something happens. Thus we have taken an idea or concept of "n approaching ∞" and made the notion rigorous so that it is possible to use in a mathematical context. Of course when we prove theorems and/or prove limits, we use the definition, not the assortments of words that we have used to try to give an understanding of the idea of a limit. When mathematics is done, we must be precise and use the definition.
Figure 3.1.1: Plot of a sequence and the y = L ± ε corridor.
Sequential Limit: Graphical Description Another description of the limit of a sequence that is useful for some is to consider the sequence graphically. In Figure 3.1.1 we have plotted a fictitious sequence {an}. We have plotted the point (0, L) and horizontal dashed lines coming out of the points (0, L ± ε). The corridor within the dashed lines represents y-coordinate values that are "close to L" (within a given ε). The definition of lim_{n→∞} an = L requires that for any ε (no matter how large or how small you make the corridor around L—the corridor being small is usually the problem) there must be a value of n, call it N, so that from that point on, all of the points (to the right) are within the given corridor. In general, when the corridor is smaller (the ε is smaller), the N must get larger.
Comments: Sequential Limits (i) Given a sequence {an }, the definition
does not help you decide what L should be—but of course it is necessary to have this L to apply the definition. One way to think about it is that you have to "guess the L" and then try to prove that the limit is L. Really we know that we have methods for determining L from our basic calculus course (the methods were not rigorously proved but surely would be sufficient to guess what L should be). We will repeat these results (rigorously) in later sections.
(ii) We emphasize that the value N can be any real number. By notation
(an N usually represents an integer) we imply that N is an integer. It always
can be chosen as an integer but need not be. It is sometimes more convenient to
choose N as a particular real number rather than go through the song and dance
that it is the next largest integer greater than some particular real number–we
will make this clear later.
(iii) We want to emphasize (and we will beat this to death) that when we apply Definition 3.1.3 we will always follow two steps. Step 1: For a given ε define N. How we find N is immaterial–we will develop methods for finding N. Step 2: Show that the defined N works—that n > N implies that |an − L| < ε. We will repeat and emphasize this procedure often.
(iv) We note that the definition of a limit can be given in terms of neighborhoods introduced in Section 2.2. The statements "for every ε > 0" and "|an − L| < ε" (and their use) can be replaced by "for every neighborhood of L, Nε(L)" and "an ∈ Nε(L)". We can also define a neighborhood of infinity (even though infinity is not in R) as follows.
Definition 3.1.4 A neighborhood of infinity, ∞, is the set NR(∞) = {x ∈ R : x > R} for some R > 0. A neighborhood of minus infinity, −∞, is the set NR(−∞) = {x ∈ R : x < −R} for some R > 0.
We can then write Definition 3.1.3 as follows: lim_{n→∞} an = L if for every neighborhood of L, Nε(L), there exists a neighborhood of infinity, NN(∞), such that n ∈ NN(∞) ⇒ an ∈ Nε(L). Other than notation there is no difference between this version of the definition and the one given in Definition 3.1.3.
(v) In addition to being able to write Definition 3.1.3 in terms of neigh-
borhoods we get results connecting limit points of sets and limits of sequences.
Proposition 3.1.5 Suppose {an} is a real sequence and lim_{n→∞} an = L.
(a) Any neighborhood of L, Nr (L), contains infinitely many points of the se-
quence {an }.
(b) If we consider E = {a1 , a2 , a3 , · · · } as a subset of R (instead of as a sequence)
and E is an infinite set (the an ’s do not all equal L from some point on), then
L is a limit point of E.
Proof: The proofs of both parts are easy. (a) We know by Definition 3.1.3
that for any neighborhood of L, Nr (L), there exists an N and all of the points
of {an } for n > N are in that neighborhood. Note that for all we know all of
these sequence values could be the same—say if the sequence was a constant
sequence.
(b) If we consider any neighborhood of L, Nr (L), and apply the definition of
the limit of a sequence to the sequence {an }, then there exists an N ∈ R such
that n ∈ N and n > N implies that an ∈ Nr (L). Since we have assumed that
the set E is infinite, we can find a point in Nr (L) ∩ E that is different from L.
Note that it is important for part (b) to assume that the set E is infinite.
For the sequence {an } where an = 1 for all n, then an → 1 but 1 is not a limit
point of the set {a1 , a2 , · · · } = {1}.
When you think about the definition of the limit of a sequence—or the graph
of a sequence—the above result is not surprising. The other direction—and it’s
not really a converse—is a bit more of a surprise.
Proposition 3.1.6 Suppose that E ⊂ R and the point x0 is a limit point of E. Then there exists a sequence of points {xn} ⊂ E such that xn → x0.
Proof: We have essentially already proved this result. In the proof of Theorem 2.3.8 we considered the sequence of neighborhoods of x0, N1/n(x0), n = 1, 2, · · · , and chose a point from each neighborhood, xn ∈ N1/n(x0). Clearly |xn − x0| < 1/n for all n. Then for any ε > 0 we can use Corollary 1.5.5–(b) to obtain N ∈ N such that 1/N < ε. (Step 1: Define N.) Then for n > N we have |xn − x0| < 1/n < 1/N < ε. (Step 2: Show that the defined N works.) Thus lim_{n→∞} xn = x0.
Thus we see that when E is a bigger set (has many points), if x0 is a limit point of E, we can always find a sequence of points of E converging to x0. For example if E = [0, 2), 3/2 is a limit point of E and xn = 3/2 + 1/(4n) → 3/2. Also 1 is a limit point of E and xn = 1 + 1/(2n) → 1. And 2 is a limit point of E and 2 − 1/n → 2.
Note that the sequence produced by the proof of Theorem 2.3.8 is such that xn ≠ x0 for any n. This is not necessary for this proof.
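The three sequences just mentioned can be checked by hand, but a quick exact computation (our sketch; the names in_E and examples are ours) confirms that each stays in E = [0, 2) and closes in on its limit point:

```python
from fractions import Fraction

def in_E(x):
    # Membership in E = [0, 2).
    return 0 <= x < 2

# Pairs (formula for x_n, its limit point) for the three examples in the text.
examples = [
    (lambda n: Fraction(3, 2) + Fraction(1, 4 * n), Fraction(3, 2)),
    (lambda n: 1 + Fraction(1, 2 * n), Fraction(1)),
    (lambda n: 2 - Fraction(1, n), Fraction(2)),
]
for f, limit in examples:
    assert all(in_E(f(n)) for n in range(1, 200))   # the points lie in E
    assert abs(f(1000) - limit) < Fraction(1, 500)  # and approach the limit
```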
Of course we need some experience with proving limits—finding the N and
showing that it works. We will do that in the next section.
HW 3.1.1 (True or False and why)
(a) The set [0, 3/2) defines a sequence.
(b) Let {an } be a sequence that converges to L. Then the set
{x : x = an for some n} cannot be a closed set.
(c) If lim_{n→∞} an = L, then the set {L, a1 , a2 , · · · } is a closed set.
(d) If we can show that for any ε > 0, n > 0 implies that |an − L| < ε, then lim_{n→∞} an = L.
(e) If n > N implies that |an − L| < 10^{−1000}, then lim_{n→∞} an = L.
HW 3.1.2 Plot several terms of the sequence {(3n^2 + 4)/(4n^2 + 3)}_{n=1}^{∞}. Do you think that lim_{n→∞} (3n^2 + 4)/(4n^2 + 3) exists? If yes, what do you think this limit is?
3.2 Applications of the Definition of a Sequential Limit
In the last section we introduced the definition of a sequential limit. In this
section we will learn how to apply the definition to particular sequences. Re-
member, we apply Definition 3.1.3 as a two step process, Step 1: Define N and
Step 2: Show that the N works. Let us begin with the following example. It is
probably the second easiest example possible but it is very important.
Example 3.2.1 Prove that lim_{n→∞} 1/n = 0.
Solution: We will first do this problem graphically. Generally this is not the way to prove limits—it only works for easy problems. However we want to illustrate how it works with the picture and eventually to make a point. We note in Figure 3.2.1 that the sequence {1/n} is plotted and the dashed lines y = 0 ± ε = ±ε are drawn. We notice that the sequence decreases as n gets larger. (It's easy to see that for n > m, 1/n < 1/m.) After

Figure 3.2.1: Plot of the sequence {1/n} and the y = L ± ε corridor.

a while—exaggerated on this plot—the points representing the plot of the sequence enter the corridor formed by y = ±ε and never leave it. The points will never leave the corridor because they are positive and decreasing. It is of interest to see if we can sort of compute when the points will cross the line y = ε. We set 1/n = ε and solve for n as n = 1/ε. Of course ε would have to be special for 1/ε to be an integer. However, it should be clear that if we set N = 1/ε (this is the definition of N required by Step 1: we did it graphically, but we don't really care how we got it), then if n > N, the plotted values of 1/n have entered into the ±ε corridor, and because 1/n < ε (because n > N = 1/ε) and 1/n > 0, these plotted values will never leave the ±ε corridor, i.e. 1/n < ε. (This shows that the N defined as N = 1/ε works: Step 2.) Hence, we have defined an N = 1/ε such that if n > N = 1/ε, then |1/n − 0| < ε. Therefore by the definition of limit, Definition 3.1.3, lim_{n→∞} 1/n = 0. We could also complete Step 2 by noting if
n > N = 1/ε, then |1/n − 0| = 1/n < 1/N = ε. (This also shows that the N works: Step 2.)
The second way we will do this problem is the way that limit proofs are most often done. We suppose that ε > 0 is given. We need N so that n > N implies that |1/n − 0| = 1/n < ε. This last inequality is equivalent to n > 1/ε. Therefore if we choose N = 1/ε (definition of N: Step 1), then n > N = 1/ε implies that |1/n − 0| = 1/n < 1/N = ε (Step 2: The defined N works). Therefore 1/n → 0 as n → ∞.
Notice that in each method, the first graphically and the second algebraically,
we define N and then show that this N works, i.e. we satisfy the definition of
the limit. This is always the way limit proofs are done when we are applying
the definition of the limit. The first method illustrates that it really makes
no difference how you find N . If we can show rigorously that a particular N
works—even if we only guessed it—we are done.
And finally, we note that the N that we found, N = 1/ε, depended on ε. This is perfectly permissible. The statement in the definition is that "for every ε > 0 there exists an N." The same N surely does not have to work for all ε. It is logical that N would generally have to depend on ε (it is not a requirement) and with this dependence, we can still satisfy the definition. We should understand that generally N will depend on ε in a way that as ε gets smaller, N will get larger—as with N = 1/ε.
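A numerical spot-check of Example 3.2.1 (a sketch of ours, and not a substitute for the proof—checking finitely many n proves nothing) illustrates that N = 1/ε does what Step 2 claims:

```python
# For each epsilon, verify |1/n - 0| < epsilon for the (finitely many)
# sampled n with n > N = 1/epsilon.
def N_works(epsilon, n_max=10_000):
    N = 1 / epsilon
    return all(abs(1 / n - 0) < epsilon
               for n in range(1, n_max + 1) if n > N)

assert all(N_works(eps) for eps in (0.5, 0.1, 0.01, 0.001))
```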
Example 3.2.2 Prove that lim_{n→∞} (1 − 1/n^2) = 1.
Solution: Again, we assume that we are given ε > 0. We want an N such that n > N implies |(1 − 1/n^2) − 1| = |−1/n^2| = 1/n^2 < ε. This last inequality is equivalent to n^2 > 1/ε or n > √(1/ε) = 1/√ε (because n > 0). Thus if we choose N = 1/√ε (or such that 1/N^2 = ε) (Define N: Step 1) and let n > N, we have |(1 − 1/n^2) − 1| = |−1/n^2| = 1/n^2 < 1/N^2 = ε (Step 2: N works). Thus lim_{n→∞} (1 − 1/n^2) = 1.
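The same kind of finite spot-check (ours; evidence, not proof) works for Example 3.2.2 with N = 1/√ε:

```python
import math

def N_works(epsilon, n_max=10_000):
    # n > N = 1/sqrt(epsilon) should force |(1 - 1/n^2) - 1| = 1/n^2 < epsilon.
    N = 1 / math.sqrt(epsilon)
    return all(abs((1 - 1 / n**2) - 1) < epsilon
               for n in range(1, n_max + 1) if n > N)

assert all(N_works(eps) for eps in (0.5, 0.1, 0.01, 1e-4))
```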
You should note the proof of a limit involves the first step, where we set |an − L| < ε and solve this inequality for n. This shows us how we should define N–by setting the inequality we want to be satisfied as true, we are able to see what we need to make it true. This is sort of a "mathematical limits dance." Other than the resulting definition of N, this is not technically a part of the proof. It is a very common approach to help find N. We then show that this N works. And we emphasize, after we perform the first step and define N, we always must show that N satisfies the definition of a limit. If you understand all of the parts of the dance, this is often easy.
Also, as a part of the analysis above we first had an inequality n^2 > 1/ε and took the square root of both sides. As a part of solving the inequality for n, we often have to perform operations on both sides of an inequality. The question is why you can take the square root of both sides (or perform some other operation on both sides of an inequality). In the case of the square root,
we proved that it was permissible in HW1.3.3–(b). To help us in general we say that a function g defined on an interval of the real line is said to be increasing if x < y implies that g(x) < g(y). We know that g(x) = √x is increasing–if not because of HW1.3.3–(b), because we know what the graph of y = √x looks like. So if x < y, then √x < √y, i.e. you can take the square root of both sides of an inequality. Later we had n > N = 1/√ε and squared both sides. This is possible because g(x) = x^2 is an increasing function also (or we can use HW1.3.3–(a)).
Example 3.2.3 Prove that lim_{n→∞} (2n + 3)/(5n + 7) = 2/5.
Solution: Suppose that ε > 0 is given. We want to find an N such that n > N implies that |(2n + 3)/(5n + 7) − 2/5| < ε, or

|(5(2n + 3) − 2(5n + 7))/(5(5n + 7))| = 1/(5(5n + 7)) = (1/5) · 1/(5n + 7) < ε.

This inequality is the same as 5n + 7 > 1/(5ε) or n > (1/5)(1/(5ε) − 7), i.e. we have done the dance. Define N = (1/5)(1/(5ε) − 7) (Step 1: Define N). Then if n > N = (1/5)(1/(5ε) − 7), we get 5n + 7 > 1/(5ε) or (1/5) · 1/(5n + 7) < ε. Therefore n > N implies |(2n + 3)/(5n + 7) − 2/5| < ε (Step 2: N works), and (2n + 3)/(5n + 7) → 2/5 as n → ∞.
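Again, a finite numerical check (our sketch only) of the N defined in Example 3.2.3:

```python
def N_works(epsilon, n_max=10_000):
    # n > N = (1/(5*epsilon) - 7)/5 should force
    # |(2n + 3)/(5n + 7) - 2/5| = 1/(5(5n + 7)) < epsilon.
    N = (1 / (5 * epsilon) - 7) / 5
    return all(abs((2 * n + 3) / (5 * n + 7) - 2 / 5) < epsilon
               for n in range(1, n_max + 1) if n > N)

assert all(N_works(eps) for eps in (0.5, 0.1, 0.01, 1e-3))
```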
Before we move on to limits that do not exist, we prove one more limit. We
emphasize that we are cheating. As a part of the next example we will use the
natural logarithm, ln and the exponential, exp. We will define these functions
and prove properties of the ln and exp functions in Section 7.7. We could wait
for this example until then but are probably better served doing it now. There
are no circular arguments involved.
Example 3.2.4 Prove that lim_{n→∞} 1/2^n = 0.
Solution: We proceed as we did in the last example. We suppose that we are given ε > 0 and we need to find N such that n > N implies that |1/2^n − 0| = 1/2^n < ε. This last inequality is equivalent to 2^n > 1/ε. Taking the logarithm base e of both sides gives (because ln is an increasing function) ln 2^n = n ln 2 > ln(1/ε) = − ln ε, or n > − ln ε/ ln 2.
Thus we see that if we choose N = − ln ε/ ln 2 (Step 1: Define N) and consider n > N, then we have n > N = − ln ε/ ln 2 or n ln 2 = ln 2^n > − ln ε = ln(1/ε). Taking the exponential of both sides, we get 2^n > 1/ε, 1/2^n < ε or |1/2^n − 0| < ε (N works: Step 2), and therefore 1/2^n approaches 0 as n approaches ∞.
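For Example 3.2.4, a finite check (our sketch) of N = − ln ε/ ln 2; note that nothing breaks for ε ≥ 1, consistent with the remark that follows:

```python
import math

def N_works(epsilon, n_max=2_000):
    # n > N = -ln(epsilon)/ln(2) should force 1/2**n < epsilon.
    N = -math.log(epsilon) / math.log(2)
    return all(1 / 2**n < epsilon
               for n in range(1, n_max + 1) if n > N)

# Works for small epsilon, and for epsilon >= 1 as well (N is then negative).
assert all(N_works(eps) for eps in (0.5, 0.1, 0.01, 2.0))
```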
We note that we can take the ln and exp of both sides of the inequality because they are both increasing functions–cheating again, but think of the graphs of these functions and it will be clear that they are increasing.
You should also note that we have used the fact that ln 2 ≈ .69 > 0, allowing us to divide both sides of the inequality by ln 2, keeping the direction of the inequality the same. Also note that we write the definition of N as N = − ln ε/ ln 2. This is a logical way to write it because for ε small (less than 1 but positive), ln ε < 0. But do note that nothing we have done is illegal if ε is not small, say ε ≥ 1. The definition must be satisfied for all ε > 0.
Sequences that don’t converge It should not surprise you that there are
sequences that do not have limits. Here we want to discuss how lim an does
n→∞
not exist for some sequences and what we have to do to prove that a limit does
not exist. If you think about it, you should realize that it might be difficult
to show that the limit does not exist. You have to show that it is impossible
to satisfy Definition 3.1.3 no matter what real L you choose, i.e. we have to
show that for any L ∈ R there exists an  for which no N can be found (no
N such that n > N implies that |an − L| < ). There are generally two ways
that the limit does not exist. We have put the requirement in Definition 3.1.3
that L ∈ R and we know that ±∞ 6∈ R. Therefore, the sequences that want ∞ to
approach ∞ (or −∞) such as the sequence given in Section 3.1, n2 + 1 n=0 ,
will not satisfy Definition 3.1.3. (We will give a definition later for what we
mean when the limit is infinite.) The other case where the limit does not exists
is when it oscillates back and forth between two distinct numbers–or close to
two distinct numbers, or three. The literature seems to be confusing on how
they refer to the two situations of non-convergence. At the moment we will say
that if a sequence does not satisfy Definition 3.1.3 for any L ∈ R, the sequence
does not converge (that’s the only convergence definition we have at this time).
Some of the literature will refer to non-convergence as divergence. We will save
divergence for limits of ±∞ which we will introduce when we introduce infinite
limits (but at this time do not exist). Consider the following example.
Example 3.2.5 Prove that lim_{n→∞} (n^2 + 1) does not exist.
Solution: This is really a fairly easy case. Let L be any element of R and choose ε = 1. Note that from what we said above, if no N can be found for this situation (any L ∈ R and one ε), then the limit will not exist. If we were to be able to satisfy the definition, we must satisfy the following inequality, |n^2 + 1 − L| < 1. This inequality is the same as L − 1 < n^2 + 1 < L + 1. Hopefully, it is clear that it is the right inequality that will not be satisfied for large n—you should notice that for some value of n, n^2 + 1 will get larger than L + 1 and stay larger for all of the rest of the n's. Rewrite the right inequality as n^2 < L or n < √L (allowable since n > 0).
By Corollary 1.5.5–(a) we know that for √L ∈ R, there exists n0 ∈ N such that n0 > √L. Of course, if this inequality is satisfied for some particular n0 ∈ N, it will also be satisfied for all n ≥ n0. Since it is impossible to find an N such that n > N implies n < √L (because for all n ≥ n0, n > √L), it is impossible to find an N such that n > N implies |n^2 + 1 − L| < 1 for any L ∈ R. Since Definition 3.1.3 cannot be satisfied for any L ∈ R, lim_{n→∞} (n^2 + 1) does not exist.
Note that when we wrote √L, we were assuming that L ≥ 0. If someone is silly enough to guess that the limit might be L where L < 0, it is easy to see that n^2 + 1 ≮ L + 1 (the right side of the original inequality) for all n ∈ N, n ≥ 1. Therefore the limit can't be negative.
The above example is a reasonably easy sequence to consider—however, all more complicated sequences that approach ∞ (or −∞) are handled in the same way with more difficult algebra.
We next consider an example that is a classic case of nonexistence. All other examples of nonexistence where the sequence oscillates among two or more different values can be done in a similar fashion.
Example 3.2.6 Prove that lim_{n→∞} (−1)^n does not exist.
Solution: As in the last example we must show that for any L ∈ R, there is an ε > 0 for which no N exists (no N such that n > N implies that |an − L| < ε). For whatever real number L we think might be the limiting value, we must satisfy |(−1)^n − L| < ε or

L − ε < (−1)^n < L + ε.   (3.2.1)

And if the sequence is to have a limit, we must find an N so that the last inequality is satisfied for all n > N.
If you were trying to guess the limit and were naive enough to think that the limit would exist, you might guess that the limit is 1 or you might guess that it is −1. After all, these are the values that are assumed often in this sequence.
If we choose L = 1 (as our first guess) and set ε = 1, then we would have to satisfy the inequality (3.2.1), 1 − 1 = 0 < (−1)^n < 1 + 1 = 2. It should be clear that this inequality cannot be satisfied for any odd values of n, when (−1)^n = −1. Therefore it is impossible to find an N so that n > N implies 0 < (−1)^n < 2.
Likewise if we were to choose L = −1 and ε = 1, the inequality −2 < (−1)^n < 0 would not be satisfied for even values of n, so no appropriate N can be defined. Therefore lim_{n→∞} (−1)^n does not equal 1 or −1.
Finally, consider some L such that L ≠ 1 and L ≠ −1. Choose ε = min{|L − 1|/2, |L − (−1)|/2} (where by min{|L − 1|/2, |L − (−1)|/2} we mean the minimum of the two values). You might want to draw a picture describing this choice. It should be clear that this ε has been chosen so that |(−1)^n − L| is never less than ε for any n—when n is even |(−1)^n − L| = |1 − L| > |1 − L|/2 ≥ ε and when n is odd |(−1)^n − L| = |−1 − L| > |1 + L|/2 ≥ ε. Therefore lim_{n→∞} (−1)^n does not equal L (where L is anything but 1 or −1).
Therefore, since Definition 3.1.3 cannot be satisfied with any L ∈ R, lim_{n→∞} (−1)^n does not exist.
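The choice of ε in the last part of Example 3.2.6 can be illustrated numerically (our sketch; the names L and never_close are ours): for a few guesses L other than ±1, no term of (−1)^n ever comes within ε of L.

```python
def never_close(L, n_max=1_000):
    # epsilon = min(|L - 1|, |L + 1|)/2, as chosen in Example 3.2.6.
    epsilon = min(abs(L - 1), abs(L + 1)) / 2
    return all(abs((-1)**n - L) >= epsilon for n in range(1, n_max + 1))

assert all(never_close(L) for L in (0.0, 0.5, 2.0, -3.0))
```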
Figure 3.2.2: Plot of the sequence {n^2 + 1} and the y = L ± 1 corridor.

We should note that both of these cases of nonexistence of limits can be illus-
trated graphically—you should be very careful before you claim that the picture
gives you a proof. You will be asked to graphically illustrate the nonexistence of
several limits of sequences in HW3.2.2. In Figure 3.2.2 we draw a picture much
like we did in Figure 3.1.1—choose some L and ε, plot the point (0, L) and draw
the lines y = L ± ε. To illustrate the non-existence in Example 3.2.5 we choose an arbitrary L and let ε = 1. We then plot some sequence values an = n^2 + 1. We note that sooner or later the sequence points go outside of the y = L ± ε corridor (actually above the corridor) and stay out of there forever—we did not get to plot many points in Figure 3.2.2 because n^2 + 1 grows large quickly. Since the sequence clearly leaves the L ± ε corridor and never comes back, the limit is surely not equal to L.
Figure 3.2.3: Plot of the sequence {(−1)^n} and the y = 1 ± 1/2 corridor.

Figure 3.2.4: Plot of the sequence {(−1)^n} and the y = L ± ε corridor where ε < min{|L − 1|, |L − (−1)|}.
To illustrate the non-existence in Example 3.2.6 we would draw two plots similar to that in Figure 3.1.1. For the first plot, Figure 3.2.3, we would choose L = 1 and ε = 1/2, and note that every other point of the sequence ((−1)^n for
n odd) would be outside of the y = 1 ± 1/2 corridor–forever. We could draw a similar plot for L = −1—and make a similar argument. In Figure 3.2.4 we draw a plot for an arbitrary L not equal to 1 or −1 and choose ε to be smaller than the distances from L to 1 or −1. We note that the sequence values would never be in the y = L ± ε corridor.
As stated earlier, be very careful about claiming that the above arguments are proofs. They can be made to be a part of a proof if done carefully and if they include some of the arguments and reasons given in Examples 3.2.5 and 3.2.6. The pictures alone are at best "bad proofs."
We see from the work in this section that it is not trivial to apply the
definition of the limit of a sequence. You might wonder if it's because we
have the wrong definition. So that we have the definition here for comparison
purposes, we recall that lim_{n→∞} a_n = L

if for every ε > 0 there exists N such that n > N implies |a_n − L| < ε.

One might ask if it is necessary to apply the definition so that it works for every
ε > 0. We might try the following candidate for the definition.

F1: if for some ε > 0 there exists N such that n > N implies |a_n − L| < ε.
If this were the definition, life would be much easier. Consider a_n = 1/n and
choose ε = 0.1. If we choose N = 13, then n > N = 13 implies that

|1/n − 0| = 1/n < 1/N = 1/13 < 0.1.
If F1 were the definition, the above computation would imply that lim_{n→∞} 1/n = 0.
This is the same result that we got in Example 3.2.1 (which we hope our intuition
tells us is the correct limit). We see that F1 is easy to apply. However, using
the same ε = 0.1 we see that if we choose N = 100, then n > N = 100 implies
that

|1/n − 0.001| ≤ 1/n + 0.001 < 0.01 + 0.001 < 0.1.

So the same ε would imply that lim_{n→∞} 1/n = 0.001. Further calculations using
F1 would give us a large assortment of answers for lim_{n→∞} 1/n (this makes it very
difficult to grade homework). And different choices of ε would give us more
values of the limit. Clearly F1 is a bad choice—it is not a strong enough criterion
to serve as our definition.
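We can see F1 fail at uniqueness numerically. The sketch below is our own (the helper `satisfies_F1` is our name, and the finite range of n is a stand-in for "all n > N", which is convincing here because 1/n is decreasing): with ε = 0.1 fixed, several different values of L all pass the F1 test for a_n = 1/n.

```python
# F1 asks only that SOME eps > 0 work.  With eps = 0.1 fixed, many
# different values of L satisfy |1/n - L| < eps for all n > N, so F1
# cannot pin down a unique "limit" for a_n = 1/n.

def satisfies_F1(L, eps, N, n_max=10_000):
    # check |1/n - L| < eps for n = N+1, ..., n_max (a finite stand-in
    # for "all n > N"; since 1/n is decreasing this is convincing)
    return all(abs(1/n - L) < eps for n in range(N + 1, n_max + 1))

eps = 0.1
assert satisfies_F1(0.0,   eps, N=13)    # the "right" limit passes...
assert satisfies_F1(0.001, eps, N=100)   # ...but so does L = 0.001
assert satisfies_F1(0.05,  eps, N=100)   # ...and plenty of others
```

So F1 "proves" that 1/n converges to 0, to 0.001 and to 0.05 all at once, which is exactly the complaint made in the text.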
If we instead tried

F2: if for every ε > 0 and all N ∈ N, n > N implies |a_n − L| < ε,

this proposal would be in big trouble. Here we are claiming that for any ε > 0, the
implication in the definition must be true for all N. That is just too strong of a
requirement. If we return to a_n = 1/n and choose a small ε, say ε = 0.1, there is
no way that n > N = 1 will imply that |1/n − 0| < 0.1; i.e. for n = 2 > N = 1,
|1/n − 0| = 1/2 > 0.1. So either lim_{n→∞} 1/n ≠ 0 or F2 will not make it as a definition
replacement.
There are a few other candidates—bad candidates—that we could discuss,
but by now you surely get the point: we had better live with Definition 3.1.3.
You will see in the next section that using Definition 3.1.3 we will always get a
unique result—not like F1. We have seen that it is possible to apply Definition
3.1.3 to prove limits that are intuitively correct—not like F2. So make sure that
you now forget F1 and F2: the F stood for "false".
And finally, we want to emphasize that it is the tail end of the sequence
that determines whether or not the sequence converges—and since n → ∞, no
matter at which number you start the tail, it will be a very long tail. In the first
example of this section we showed that 1/n → 0 as n → ∞. That makes the
sequence {1/n} a nice sequence. In the last example of this section we showed
that lim_{n→∞} (−1)^n does not exist, i.e. the sequence {(−1)^n} is not a nice sequence.
Now let us construct a rather strange sequence by defining a_n = (−1)^n for
n = 1, · · · , 100,000 and a_n = 1/n for n > 100,000. The tail end of the sequence
will be nice so that the sequence will converge—again to 0. Specifically we saw
in Example 3.2.1 that as a part of the proof of the convergence of {1/n} to 0,
we defined N = 1/ε. To prove that the sequence {a_n} converges to 0, we define
N = max{1/ε, 100,000}. This way the proof never knows that we were working
with a strange sequence.
If we define a sequence {b_n} by b_n = 1/n for n = 1, · · · , 100,000,000 and
b_n = (−1)^n for n > 100,000,000, the sequence sure looks good for a long time—
many n's. But lim_{n→∞} b_n does not exist (after a long time the sequence values will
start bouncing back and forth between 1 and −1 and do that forever). If the
tail end of a sequence is bad, the sequence will be bad.

HW 3.2.1 (True or false and why.)
(a) Suppose that the sequence {a_n} is such that |a_n − 7| < 1/n for all n ∈ N.
Then lim_{n→∞} a_n = 7.
(b) Consider the sequence {a_n}, n = 1, 2, · · · where a_n = c ∈ R for all n (i.e.
the sequence c, c, c, · · · ). Then lim_{n→∞} a_n = c.
(c) The limit lim_{n→∞} (−1)^n/n does not exist.
(d) The limit lim_{n→∞} (−n) does not exist.
(e) If lim_{n→∞} a_n = 0, then lim_{n→∞} 2a_n = 0.

HW 3.2.2 (a) Consider the sequence {(2n^2 + 4)/(n + 3)}_{n=1}^∞. Illustrate graphically
that lim_{n→∞} (2n^2 + 4)/(n + 3) does not exist.
(b) Consider the sequence {(−1)^n + 1/n}_{n=1}^∞. Illustrate graphically that
lim_{n→∞} [(−1)^n + 1/n] does not exist.
(c) Prove that lim_{n→∞} [(−1)^n + 1/n] does not exist (i.e. use the definition).

HW 3.2.3 (a) Prove that the limit lim_{n→∞} (2n^2 + 4)/(3n^2 + 1) = 2/3 (use the definition).
(b) Prove that the limit lim_{n→∞} (2n^2 + 4)/(3n^2 + n + 1) = 2/3 (use the definition).

HW 3.2.4 Suppose that {a_n} and {b_n} are sequences such that lim_{n→∞} a_n =
lim_{n→∞} b_n = 0. Prove that lim_{n→∞} a_n b_n = 0.

3.3 Some Sequential Limit Theorems


We want to be able to compute limits and know what we have computed is the
correct result but we do not want to have to apply the definition every time.
Hence we now want to move on to the propositions, theorems and corollaries
(all referred to collectively as theorems) concerning limits of sequences. You
probably already know most of these theorems from your basic calculus course.
Most of these theorems are the building blocks that allow you to compute the
limit of a sequence without using the definition—but as you will see, the defini-
tion of a limit is the core of the proof of all of these theorems. The limits that
we compute using the limit theorems will be as rigorous as the limits that we
have proved using the definition because all of the results that we use will have
been rigorously proved.
We begin with a discussion of one of the common hypotheses of the theorems.
We note in Proposition 3.3.1 we assume that lim an exists. This is a common
n→∞
assumption for most of the propositions in this section and the next section.
What does this mean? Moreso, what do we get to use from this assumption?
This is very easy—and very, very important—if we return to Defintion 3.1.3.
In the first place the hypothesis ensures us that there is some L such that
lim an = L. For the sequence {an } and this L, the hypothesis ensures us that
n→∞
for any  > 0 we can find an N ∈ R such that n > N implies that |an − L| < .
The hypothesis doesn’t tell us what N and L are or how to find them. It just
guarantees that there are such an N and L—and that’s all we need. As we said
above, just about every proposition in this section and the next will have this
type of hypothesis. Think clearly each time just how it’s being used.
We begin with a result that is probably unlike results you have seen before
but is basic. From our basic calculus class we know how to compute some limits.
From the last section, for a given sequence {a_n} and L we know how to apply
the definition of a sequential limit to show that the sequence {a_n} and L
satisfy the definition. But what if, after you read Example 3.2.3 and gain an
understanding of how Definition 3.1.3 was used to show that lim_{n→∞} (2n + 3)/(5n + 7) = 2/5,
one of your classmates says that the limit is really 5/2 and claims that she can
apply Definition 3.1.3 to prove it. Could the text (and your reading of the text)
and your classmate both be right? Has it been made clear that a sequence can't
have two distinct limits that satisfy the definition? We answer these questions
with the following proposition.

Proposition 3.3.1 Suppose {a_n} is a real sequence and lim_{n→∞} a_n exists. Then the
limit is unique.

Proof: A common way to prove uniqueness is to use contradiction. Thus
we assume that the above statement is false, i.e. we assume that there exist
L1, L2 ∈ R such that L1 ≠ L2, lim_{n→∞} a_n = L1 and lim_{n→∞} a_n = L2. By these
assumptions we know that for any ε1 > 0 there exists an N1 such that n > N1
implies |a_n − L1| < ε1, and for any ε2 > 0 there exists an N2 such that n > N2
implies |a_n − L2| < ε2. Another way to write this is

for n > N1, L1 − ε1 < a_n < L1 + ε1,

and

for n > N2, L2 − ε2 < a_n < L2 + ε2.

Figure 3.3.1: Plot of the y = L1 ± ε1 and y = L2 ± ε2 corridors, and some sequence
points—trying to be in both corridors.

To help see why this assumption is clearly false, we have included the plot
in Figure 3.3.1. Since L1 ≠ L2, we have chosen ε1 and ε2 sufficiently small so
that the L1 ± ε1 and L2 ± ε2 corridors do not intersect. Yet for all n > N1 all of
the values a_n must be in the L1 ± ε1 corridor and for n > N2 all of the values a_n
must be in the L2 ± ε2 corridor—we try to illustrate this in the plot but it isn't
going to happen—the question marks signify the fact that we can't put them in
both places.
For the proof we set ε1 = ε2 = |L1 − L2|/2. For convenience and without
loss of generality, we assume that L1 > L2 so ε1 = ε2 = (L1 − L2)/2—one of
the two values must be larger than the other. Define N = max{N1, N2}. Then
for all n > N (so that n > N1 and n > N2) we will have

L1 − ε1 = (L1 + L2)/2 < a_n < L1 + ε1 = (3L1 − L2)/2 (3.3.1)

and

L2 − ε2 = (3L2 − L1)/2 < a_n < L2 + ε2 = (L1 + L2)/2. (3.3.2)

The leftmost inequality in statement (3.3.1) and the rightmost inequality in
statement (3.3.2) give us, for all n > N = max{N1, N2},

(L1 + L2)/2 < a_n < (L1 + L2)/2. (3.3.3)

This is surely a contradiction. Therefore no such L1 and L2 exist, and the
limit is unique.
Hence we see that if such a claim about the results of Example 3.2.3 is
made, either the text or your classmate must be wrong (and we're betting on
the classmate).
We next state and prove a proposition that includes several of our basic
sequential limit theorems.

Proposition 3.3.2 Suppose that {a_n} and {b_n} are real sequences, lim_{n→∞} a_n =
L1, lim_{n→∞} b_n = L2 and c ∈ R. We then have the following results.
(a) lim_{n→∞} (a_n + b_n) = L1 + L2.
(b) lim_{n→∞} c a_n = c lim_{n→∞} a_n = cL1.
(c) There exists a K ∈ R such that |a_n| ≤ K for all n.
(d) lim_{n→∞} (a_n b_n) = L1 L2.

Proof: (a) Suppose ε > 0 is given. The first two hypotheses give us that

for any ε1 > 0 there exists an N1 such that n > N1 implies that |a_n − L1| < ε1, (3.3.4)

and

for any ε2 > 0 there exists an N2 such that n > N2 implies that |b_n − L2| < ε2. (3.3.5)

We note that

|(a_n + b_n) − (L1 + L2)| = |(a_n − L1) + (b_n − L2)| ≤ |a_n − L1| + |b_n − L2|, (3.3.6)

where the last inequality follows from the triangular inequality, Proposition
1.5.8–(v).
Define ε1 = ε2 = ε/2 (the hypotheses presented in (3.3.4) and (3.3.5) allow
for any ε1 and ε2), N = max{N1, N2} (Step 1: Define N) and let n > N (so
that the last inequalities in both (3.3.4) and (3.3.5) with ε1 and ε2 replaced by
ε/2 will hold true). Then from inequality (3.3.6) we get

|(a_n + b_n) − (L1 + L2)| ≤ |a_n − L1| + |b_n − L2| < ε/2 + ε/2 = ε

(Step 2: Show that the defined N works). Then by Definition 3.1.3 a_n + b_n →
L1 + L2 as n → ∞.
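The ε/2 bookkeeping in part (a) can be watched in action. The following Python fragment is our own sketch, not part of the text: the sample sequences a_n = 1/n and b_n = 2 + 1/n^2 and the helper `check_sum_rule` are our choices, and the two N's written below are the N's that the convergence hypotheses hand us for ε/2.

```python
import math

def check_sum_rule(eps):
    # a_n = 1/n -> 0  and  b_n = 2 + 1/n^2 -> 2 (sample sequences)
    N_a = 1 / (eps / 2)             # n > N_a gives |a_n - 0| < eps/2
    N_b = math.sqrt(1 / (eps / 2))  # n > N_b gives |b_n - 2| < eps/2
    N = max(N_a, N_b)               # Step 1: define N
    n = int(N) + 1                  # any n > N
    # Step 2: the defined N works for the sum
    return abs((1/n + 2 + 1/n**2) - (0 + 2)) < eps

assert check_sum_rule(0.1)
assert check_sum_rule(1e-6)
```

The point of the sketch is structural: N is built from the two N's supplied by the hypotheses, exactly as in the proof.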
(b) We begin by noting that if c = 0 the result is very easy because the sequence
{c a_n} will be the zero sequence and c lim_{n→∞} a_n will be zero—the result then
follows from HW3.2.1-(b).
Hence, we assume that c ≠ 0. We suppose that we are given an ε > 0. The
hypothesis that lim_{n→∞} a_n = L1 gives us that for any ε1 > 0 there exists an N1 ∈ R
such that n > N1 implies |a_n − L1| < ε1. We need an N such that n > N implies
that |c a_n − cL1| < ε. By Proposition 1.5.8–(ii) this last inequality is the same as
|c| |a_n − L1| < ε. Thus we see that if we apply our hypothesis with ε1 = ε/|c| and
N = N1, for n > N we have |c a_n − cL1| = |c| |a_n − L1| < |c|ε1 = |c|(ε/|c|) = ε.
Therefore by Definition 3.1.3 lim_{n→∞} c a_n = cL1.
(c) This statement is clearly different from parts (a), (b), and (d). As we shall
see, this result is both a very important tool and generally a very important
property of convergent sequences. We begin by choosing ε1 = 1 and applying the
hypothesis that lim_{n→∞} a_n = L1 to get an N1 ∈ R such that n > N1 implies that
|a_n − L1| < ε1 = 1. Then by this last inequality and the backwards triangular
inequality, Proposition 1.5.8–(vi), we have for n > N1

|a_n| − |L1| ≤ |a_n − L1| < ε1 = 1,

or |a_n| < |L1| + 1. This inequality bounds most of the sequence {a_n}. If we let
N0 = [N1], where the bracket function is defined by: [x] is the largest integer less
than or equal to x, then the inequality |a_n| < |L1| + 1 for n > N1 bounds |a_n| for
n = N0 + 1, N0 + 2, · · · . Thus we set K = max{|a_1|, |a_2|, · · · , |a_{N0}|, |L1| + 1}
and we have our desired result.
We note that in the above proof it would have been convenient if we had
always defined N to be a natural number (so that we didn't have to use the
[N]). However, we have had many instances when it has been convenient for
us to only require that N ∈ R. We should also note that it is the completeness
axiom that assures us that such an integer N0 exists. We are defining [N] to be
the least upper bound of the set {n ∈ N : n ≤ N}—which surely exists because
the set is bounded above by N.
(d) Suppose ε > 0 is given. The first two hypotheses give us that

for any ε1 > 0 there exists an N1 such that n > N1 implies that |a_n − L1| < ε1, (3.3.7)

and

for any ε2 > 0 there exists an N2 such that n > N2 implies that |b_n − L2| < ε2. (3.3.8)

We must find an N such that n > N implies that |a_n b_n − L1 L2| < ε. It should
not surprise you that the proof will be similar to that given in part (a)—with a
different dance. We note that

|a_n b_n − L1 L2| = |a_n(b_n − L2) + L2(a_n − L1)| ≤ |a_n(b_n − L2)| + |L2(a_n − L1)|
= |a_n| |b_n − L2| + |L2| |a_n − L1|. (3.3.9)

(To verify the first step just multiply out the second expression. The first
inequality is due to the triangular inequality, Proposition 1.5.8–(v)—we will use
this often. The last step just uses |xy| = |x||y|, Proposition 1.5.8–(ii).) Then
starting with expression (3.3.9) and using (3.3.7), n > N1, (3.3.8), n > N2 and
part (c) of this proposition, we get

|a_n b_n − L1 L2| ≤ |a_n| |b_n − L2| + |L2| |a_n − L1| < Kε2 + |L2| ε1, (3.3.10)

where K is the bound of the sequence {a_n} given by part (c) of this proposition.
Thus we see that if we choose ε2 = ε/(2K), ε1 = ε/(2|L2|) and N = max{N1, N2}
(so that both of the inequalities in (3.3.7) and (3.3.8) are satisfied), we have
|a_n b_n − L1 L2| < ε whenever n > N. Therefore lim_{n→∞} a_n b_n = L1 L2.
We note that in the last step we must assume that K and |L2| are nonzero.
It is easy to see that we can assume that K ≠ 0—if K is a bound so is K +
7. Assuming that |L2| = 0 is a real assumption so we must prove the result
separately for this case. If |L2| = 0, then L2 = 0 and L1 L2 = 0, and the
fact that lim_{n→∞} b_n = 0 implies that for any ε2 > 0 there exists an N2 such that
n > N2 implies that |b_n| < ε2. Then let ε2 = ε/K and statement (3.3.10) can be
replaced by

|a_n b_n − 0| = |a_n||b_n| ≤ K|b_n| < K(ε/K) = ε.

Thus lim_{n→∞} a_n b_n = 0 = L1 L2.
We should note that when many mathematicians are doing proofs such as
this one, they will often let ε1 = ε2 = ε, obtain expression (3.3.10) (with ε
replacing ε1 and ε2) and claim that they are done. And they are. The last term
of expression (3.3.10) would then be (K + |L2|)ε. Because of the ε we are able to make
|a_n b_n − L1 L2| arbitrarily small—which is really our goal. However, we don't
technically satisfy Definition 3.1.3. But it should also be clear at this time that
the (K + |L2|)ε term can be fixed up so as to give the desired result. Textbooks
will generally fix it up so that they always end with just an ε at the end of
the inequality—it's just a bit cleaner. But don't be surprised if you see this
"sloppier" (but correct) approach in classes and talks.
We now have some of the basic results that let us compute easy limits. We
know from Example 3.2.1 that lim_{n→∞} 1/n = 0. It should be easy to see that
we can use part (d) of the above theorem to get lim_{n→∞} 1/n^2 = 0 and another
application of part (d) will give lim_{n→∞} 1/n^3 = 0. We are able to obtain the
following more general result.

Example 3.3.1 Prove that lim_{n→∞} 1/n^k = 0 for any k ∈ N.
Solution: We hope that you realize that this result is a natural for mathematical induction.
Step 1: Prove true for k = 1. Example 3.2.1 shows that it is true for k = 1.
Step 2: Assume true for k = j, i.e. lim_{n→∞} 1/n^j = 0.
Step 3: Prove true for k = j + 1, i.e. prove that lim_{n→∞} 1/n^{j+1} = 0. This proof is an easy
application of part (d) of Proposition 3.3.2. We write 1/n^{j+1} as (1/n^j)(1/n). We know from Example
3.2.1 that lim_{n→∞} 1/n = 0. We know from the inductive assumption, Step 2, that lim_{n→∞} 1/n^j = 0.
Then by part (d) of Proposition 3.3.2 we have

lim_{n→∞} 1/n^{j+1} = lim_{n→∞} 1/n^j · lim_{n→∞} 1/n = 0.

Therefore the proposition is true for k = j + 1.
By the principle of mathematical induction the proposition is true for all k ∈ N.

Note that you must be careful not to mix up the fact that usually our
statements to be proved by math induction were given in terms of n and we
used k as our dummy index. In this case, since n was already in use, our
statement is given in terms of k and we used j as our dummy index. It is only
a matter of notation.
Also, we might be inclined to want to prove the above result using part (d)
of Proposition 3.3.2 (k − 1 times) and Example 3.2.1 to show that

lim_{n→∞} 1/n^k = lim_{n→∞} 1/n · · · lim_{n→∞} 1/n (repeated k times) = 0.

This is a perfectly good approach. Hopefully you realize that when you include the
"three dots" you are including a math induction proof in disguise—and hopefully
an easy one. The result needed here is the extension of part (d) of Proposition
3.3.2 that can be stated as follows. Let {a_{jn}}_{n=1}^∞, j = 1, · · · , k denote k real
sequences such that lim_{n→∞} a_{jn} = L_j for j = 1, · · · , k. Then lim_{n→∞} a_{1n} · · · a_{kn} =
L1 · · · L_k. It is hoped that you realize that this statement can be proved easily
by mathematical induction—like the proof given above.
We next note that we can use parts (a) and (b) of Proposition 3.3.2, and
Example 3.3.1 to show that

lim_{n→∞} (1 − 1/n^2) = lim_{n→∞} 1 + lim_{n→∞} (−1)(1/n^2) = 1 + lim_{n→∞} (−1) · lim_{n→∞} 1/n^2 = 1 + (−1)·0 = 1,

which is the same result we got in Example 3.2.2. We note that once we have
proved Proposition 3.3.2, HW3.2.1-(b) and Example 3.3.1, the proof of the
limit given here is every bit as rigorous as the proof given in Example 3.2.2.
In the next example we prove a more general result that can be useful. Let p
denote a k-th degree polynomial p(x) = a0 x^k + a1 x^{k−1} + · · · + a_{k−1} x + a_k where
a0, a1, · · · , a_k are real.
Example 3.3.2 Prove that

lim_{n→∞} p(1/n) = lim_{n→∞} [a0 (1/n^k) + a1 (1/n^{k−1}) + · · · + a_{k−1} (1/n) + a_k] = a_k.

Solution: Again it should be clear that this proof could be done by induction. Instead, we
will prove this result using the extension of part (a) of Proposition 3.3.2 along with part (b)
and Example 3.2.1 to see that

lim_{n→∞} p(1/n) = lim_{n→∞} [a0 (1/n^k) + a1 (1/n^{k−1}) + · · · + a_{k−1} (1/n) + a_k]
= lim_{n→∞} [a0 (1/n^k)] + lim_{n→∞} [a1 (1/n^{k−1})] + · · · + lim_{n→∞} [a_{k−1} (1/n)] + lim_{n→∞} [a_k]
= lim_{n→∞} a0 · lim_{n→∞} 1/n^k + lim_{n→∞} a1 · lim_{n→∞} 1/n^{k−1} + · · · + lim_{n→∞} a_{k−1} · lim_{n→∞} 1/n + lim_{n→∞} a_k
= a_k.

This is a nice straightforward proof of the desired result but it does depend strongly on
the use of the extension of part (a) of Proposition 3.3.2—which you should be confident follows
easily from part (a) or you should prove it.
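Numerically, Example 3.3.2 says that p(1/n) closes in on the constant term. The fragment below is our own check with a sample polynomial of our choosing (the text's a_0, …, a_k are general; here we pick p(x) = 3x^2 − 2x + 7, so a_k = 7):

```python
# Example 3.3.2 numerically: p(1/n) -> a_k, the constant term.
# Sample polynomial (our choice): p(x) = 3x^2 - 2x + 7, so a_k = 7.

def p(x):
    return 3*x**2 - 2*x + 7

# p(1/n) approaches the constant term 7 as n grows
assert abs(p(1/10**6) - 7) < 1e-5
assert abs(p(1/10**9) - 7) < 1e-8
```

Every non-constant term carries a factor of 1/n to some positive power, so each of them dies off, which is the content of the limit computation above.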

HW 3.3.1 (True or False and why) (a) If lim_{n→∞} |a_n| exists, then lim_{n→∞} a_n exists.
(b) If lim_{n→∞} a_n/b_n and lim_{n→∞} b_n exist, then lim_{n→∞} a_n exists.
(c) If lim_{n→∞} a_n exists, then lim_{n→∞} a_n^3 exists.
(d) If lim_{n→∞} a_n^3 exists, then lim_{n→∞} a_n exists.
(e) If lim_{n→∞} a_n b_n exists, then lim_{n→∞} a_n and lim_{n→∞} b_n exist.

HW 3.3.2 Prove that if lim_{n→∞} a_n = 0, then lim_{n→∞} |a_n| = 0.

HW 3.3.3 Prove that if lim_{n→∞} a_n = L, then lim_{n→∞} |a_n| = |L|.

HW 3.3.4 Consider the sequence {a_n}.
(a) Prove that lim_{n→∞} a_n = L if and only if lim_{n→∞} [a_n − L] = 0.
(b) Prove that lim_{n→∞} a_n = L implies that lim_{n→∞} |a_n − L| = 0.
(c) Show that lim_{n→∞} |a_n − L| = 0 does not imply that lim_{n→∞} a_n = L.

3.4 More Sequential Limit Theorems


As the section title indicates there are more results that we need concerning
sequential limits. As you will see this section is full of results—and hence a bit
longer than we would prefer—we thought that it would be ridiculous to include
another section entitled "More, More Limit Theorems." The first result is a
very basic result that you probably already know. Hopefully sometime during
your elementary calculus class you found limits such as lim_{n→∞} (2n^2 + n − 3)/(3n^2 + 3n + 3). You
wrote

lim_{n→∞} (2n^2 + n − 3)/(3n^2 + 3n + 3) = lim_{n→∞} (2 + 1/n − 3/n^2)/(3 + 3/n + 3/n^2)
= [lim_{n→∞} (2 + 1/n − 3/n^2)] / [lim_{n→∞} (3 + 3/n + 3/n^2)] = 2/3.

The first step above follows from the fact that the expression inside of the limit is
exactly the same for the first two terms—the second term is found by multiplying
the first by (1/n^2)/(1/n^2). In the second step of the calculation we would be using part
(b) of Proposition 3.4.1 given below. Then two applications of Example 3.3.2
finish the computation (or in place of Example 3.3.2 you can use parts (a), (b) of Proposition 3.3.2 and
Example 3.3.1). Thus we next include the quotient rule for sequential limits.
Proposition 3.4.1 Suppose that {a_n} and {b_n} are real sequences, lim_{n→∞} a_n =
L1 and lim_{n→∞} b_n = L2. Then we have the following results.
(a) If L2 ≠ 0, then there exist an M > 0 and an N3 ∈ R such that |b_n| ≥ M
for all n > N3.
(b) If L2 ≠ 0, then lim_{n→∞} a_n/b_n = L1/L2.
Proof: (a) As in the other proofs, the hypotheses for this result imply that
for every ε2 > 0 there exists an N2 ∈ R so that for n > N2, |b_n − L2| < ε2.
We are also given the fact that L2 ≠ 0. This is another result for which it is
convenient to draw a picture. In Figure 3.4.1 we have used the fact that L2 ≠ 0
to choose an ε2 so that the L2 ± ε2 corridor forces the sequence values to be
away from zero for all n greater than some N3, i.e. we have chosen an ε2 so
that y = 0 is not in the L2 ± ε2 corridor.
The easiest way to accomplish this is to choose ε2 = |L2|/2. Then the
hypothesis implies that there exists an N2 ∈ R such that n > N2 implies that
|b_n − L2| < |L2|/2 or |L2 − b_n| = |b_n − L2| < |L2|/2. Then by the backwards
triangular inequality, Proposition 1.5.8–(vi), for n > N3 = N2 we get

|L2| − |b_n| ≤ |L2 − b_n| < |L2|/2

or |b_n| > |L2|/2.

Figure 3.4.1: Plot of a sequence and the y = L2 ± ε2 corridor.
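Part (a) can be watched on a concrete sequence. This sketch is ours, not the text's: we take the sample sequence b_n = (−1)^n/n + 3, so L2 = 3, and use ε2 = |L2|/2 as in the proof (the value N2 = 1 below is an assumption that happens to work for this sequence, since |b_n − 3| = 1/n).

```python
# Part (a) concretely: b_n = (-1)^n/n + 3 -> L2 = 3.  With
# eps_2 = |L2|/2 = 1.5, past N2 every term stays within 1.5 of L2,
# and therefore every |b_n| exceeds |L2|/2.

def b(n):
    return (-1)**n / n + 3

L2 = 3.0
eps2 = abs(L2) / 2
N2 = 1  # assumption of this sketch: |b_n - 3| = 1/n < 1.5 for n > 1

assert all(abs(b(n) - L2) < eps2 for n in range(N2 + 1, 5000))
# ... and so the sequence is bounded away from zero, with M = |L2|/2:
assert all(abs(b(n)) > abs(L2) / 2 for n in range(N2 + 1, 5000))
```

The corridor L2 ± |L2|/2 cannot contain y = 0, so every tail term is at least |L2|/2 away from zero; that is all part (a) claims.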

(b) Before we proceed we note that

a_n/b_n − L1/L2 = (L2 a_n − L1 b_n)/(L2 b_n) = [L2(a_n − L1) − L1(b_n − L2)]/(L2 b_n). (3.4.1)

We like the a_n − L1 and b_n − L2 terms in the numerator because we can make
them small. The L2 and L1 are also good in the numerator because they're
constants. The minus sign between the two terms does not cause us trouble
because when we use the triangular inequality to separate the two terms, we can
use the fact that the triangular inequality will give us |x − y| = |x + (−y)| ≤
|x| + |−y| = |x| + |y|—so it's as if the minus sign isn't really there. The b_n term
in the denominator is the term that might cause us the most problems but we have
part (a) of this proposition. So let's begin.
Suppose ε > 0 is given. The first two hypotheses give us that

for any ε1 > 0 there exists an N1 such that n > N1 implies that |a_n − L1| < ε1, (3.4.2)

and

for any ε2 > 0 there exists an N2 such that n > N2 implies that |b_n − L2| < ε2. (3.4.3)

Since L2 ≠ 0, we can apply part (a) of this proposition to get an M and N3
such that n > N3 implies |b_n| ≥ M. If we choose n so that n > N1, n > N2
and n > N3, i.e. choose n > N = max{N1, N2, N3}, we can return to equality
(3.4.1) to see that

|a_n/b_n − L1/L2| = |L2(a_n − L1) − L1(b_n − L2)| / |L2 b_n| ≤ [|L2(a_n − L1)| + |L1(b_n − L2)|] / |L2 b_n| (3.4.4)
= [|L2| |a_n − L1| + |L1| |b_n − L2|] / (|L2| |b_n|) < (|L2| ε1 + |L1| ε2) / (|L2| M). (3.4.5)

Thus if we choose ε1 = Mε/2 and ε2 = M|L2|ε/(2|L1|), we can apply inequal-
ities (3.4.4)–(3.4.5) to see that for n > N

|a_n/b_n − L1/L2| < [|L2|(Mε/2) + |L1|(M|L2|ε/(2|L1|))] / (|L2| M) = ε.

Therefore a_n/b_n → L1/L2 as n → ∞.
Note that in the above argument we have assumed that L1 ≠ 0. If L1 = 0,
we have that for any ε1 > 0 there exists an N1 such that n > N1 implies that
|a_n| < ε1. Thus

|a_n/b_n − 0| = |a_n|/|b_n| < ε1/M.

Then choosing N = max{N1, N3} and ε1 = Mε, we see that |a_n/b_n − 0| < ε
for all n > N, so that

lim_{n→∞} a_n/b_n = 0 = L1/L2.

In the introduction to Proposition 3.4.1 we included a calculation using the
quotient rule. To obtain a more general result we let p and q denote k-th
and m-th degree polynomials p(x) = a0 x^k + a1 x^{k−1} + · · · + a_{k−1} x + a_k and
q(x) = b0 x^m + b1 x^{m−1} + · · · + b_{m−1} x + b_m, respectively, where a0, a1, · · · , a_k
and b0, b1, · · · , b_m are real. We obtain the following result.
Example 3.4.1 (a) If k = m and b0 ≠ 0, then

lim_{n→∞} p(n)/q(n) = a0/b0. (3.4.6)

(b) If k < m and b0 ≠ 0, then

lim_{n→∞} p(n)/q(n) = 0. (3.4.7)

(c) If k > m and bj ≠ 0 for some j = 1, · · · , m, then lim_{n→∞} p(n)/q(n) does not exist.
n→∞ q(n)
Solution: We want to make you very aware that in the last example we used the polynomial
p and calculated a limit of p(1/n). In this example we have p(n) and q(n) in our limit
statements. As you will see, we will juggle things so that we can still use the results of the last
example—but that's why we are getting a0 and b0 in our answer and not a_k and b_m.
(a) (k = m) Part (a) of this result is quite easy—it can be proved precisely the way you
computed these limits in your first calculus classes (except this time we will be using a quotient
result that we have proved). We note that

lim_{n→∞} p(n)/q(n) = lim_{n→∞} (a0 n^k + a1 n^{k−1} + · · · + a_{k−1} n + a_k)/(b0 n^m + b1 n^{m−1} + · · · + b_{m−1} n + b_m)
= lim_{n→∞} (a0 + a1/n + · · · + a_{k−1}/n^{k−1} + a_k/n^k)/(b0 + b1/n + · · · + b_{k−1}/n^{k−1} + b_k/n^k)   [multiply top and bottom by 1/n^k]
= [lim_{n→∞} (a0 + a1/n + · · · + a_k/n^k)] / [lim_{n→∞} (b0 + b1/n + · · · + b_k/n^k)]   [part (b), Proposition 3.4.1]
= a0/b0   [by applying Example 3.3.2 twice].

(b) (k < m) For this case we proceed similarly to the way that we proceeded in part (a)—we
will multiply the top and bottom by 1/n^m. We get

lim_{n→∞} p(n)/q(n) = lim_{n→∞} (a0 n^k + a1 n^{k−1} + · · · + a_{k−1} n + a_k)/(b0 n^m + b1 n^{m−1} + · · · + b_{m−1} n + b_m)
= lim_{n→∞} (a0/n^{m−k} + a1/n^{m−k+1} + · · · + a_{k−1}/n^{m−1} + a_k/n^m)/(b0 + b1/n + · · · + b_{m−1}/n^{m−1} + b_m/n^m)   [multiply top and bottom by 1/n^m]
= [lim_{n→∞} (a0/n^{m−k} + · · · + a_k/n^m)] / [lim_{n→∞} (b0 + · · · + b_m/n^m)]   [part (b), Proposition 3.4.1]
= 0/b0 = 0   [by applying Example 3.3.2 twice].

(c) (k > m) This statement is true but the proof is too ugly to include here.
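Parts (a) and (b) are easy to check numerically. The fragment below is our own illustration using the polynomial pair from HW 3.2.3 (the helper names `ratio` and `ratio_lower` are ours): p(n) = 2n^2 + 4 over q(n) = 3n^2 + 1 tends to a0/b0 = 2/3, while the lower-degree numerator 2n + 4 over the same q(n) tends to 0.

```python
# Example 3.4.1 numerically: equal degrees give a0/b0, and a
# lower-degree numerator gives 0.

def ratio(n):          # p(n)/q(n) with k = m = 2  ->  2/3
    return (2*n**2 + 4) / (3*n**2 + 1)

def ratio_lower(n):    # degree 1 over degree 2  ->  0
    return (2*n + 4) / (3*n**2 + 1)

assert abs(ratio(10**6) - 2/3) < 1e-10
assert abs(ratio_lower(10**6)) < 1e-5
```

A direct computation shows ratio(n) − 2/3 = 10/(3(3n^2 + 1)), which is why the convergence in the first assertion is so fast.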

There are other results that we need or would like concerning limits of se-
quences. We can use the definition to prove that the sequence {(−1)^n/n} converges to zero.
However, a tool that can be used to prove the convergence of this limit and many
others is the following proposition, referred to as the Sandwich Theorem.

Proposition 3.4.2 Suppose that {a_n}, {b_n} and {c_n} are real sequences for
which lim_{n→∞} a_n = lim_{n→∞} c_n = L and a_n ≤ b_n ≤ c_n for all n greater than some N1.
Then lim_{n→∞} b_n = L.

Proof: We suppose that we are given an ε > 0. The two limit hypotheses give
us that there exists an N2 such that n > N2 implies that |a_n − L| < ε, or

L − ε < a_n < L + ε, (3.4.8)

and there exists an N3 such that n > N3 implies that |c_n − L| < ε, or

L − ε < c_n < L + ε. (3.4.9)

Then if we use the left inequality of (3.4.8), the right inequality of (3.4.9) and
the hypothesis that a_n ≤ b_n ≤ c_n, we find that if n > N = max{N1, N2, N3},
then

L − ε < a_n ≤ b_n ≤ c_n < L + ε,

or |b_n − L| < ε. Therefore lim_{n→∞} b_n = L.
It should then be easy to see that we can use the inequality −|a_n| ≤
(−1)^n a_n ≤ |a_n| (see HW3.3.2) and Proposition 3.4.2 to obtain the following
result.

Corollary 3.4.3 If lim_{n→∞} a_n = 0, then lim_{n→∞} (−1)^n a_n = 0.
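The sandwich for the motivating example (−1)^n/n can be verified term by term. The code below is our own numerical sketch of how Corollary 3.4.3 uses Proposition 3.4.2 (the helper names are ours):

```python
# The Sandwich Theorem applied as in Corollary 3.4.3:
# -1/n <= (-1)^n/n <= 1/n, and both outer sequences tend to 0.

def a(n): return -1/n            # lower sequence, -> 0
def b(n): return (-1)**n / n     # the squeezed sequence
def c(n): return 1/n             # upper sequence, -> 0

# the sandwich inequality holds for every n tested ...
assert all(a(n) <= b(n) <= c(n) for n in range(1, 10_000))
# ... so b_n is trapped in the same eps-corridor as a_n and c_n
eps = 1e-3
N = 1 / eps
assert all(abs(b(n)) < eps for n in range(int(N) + 1, int(N) + 5000))
```

Note that the single N built from the outer sequences' N's is all the proof needs; nothing special about (−1)^n/n itself is ever used.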

More Topology on R  In Chapter 2 we worked with limit points of sets.
Though the concepts are very different, in Propositions 3.1.5 and 3.1.6 we saw
that there was a connection between limits of sequences and limit points of sets. We next
include some additional topological results involving sequences.

Proposition 3.4.4 Suppose that E ⊂ R is closed and {x_n} is a sequence con-
tained in E that converges to x0. Then x0 ∈ E.

Proof: A sequence {x_n} can converge to x0 in two ways: either x_n = x0 for
all n > N for some N (in which case x0 ∈ E because {x_n} ⊂ E) or the set of
points E1 = {x_1, x_2, · · · } is infinite (in which case by Proposition 3.1.5-(b) x0
is a limit point of E1, and since E1 ⊂ E, x0 is a limit point of E—and since
E is closed, x0 ∈ E). In either case, x0 ∈ E.
Subsequences  Though we saw that the sequence {(−1)^n} does not converge,
we can choose the subsequence of even terms {a_{2n}} and note that a_{2n} = 1 → 1.
Thus we see that even when a sequence does not converge, it is possible that a
subsequence might converge. We begin with the following definition.

Definition 3.4.5 Consider the real sequence {a_n} and the sequence {n_k} ⊂ N
such that n_1 < n_2 < n_3 < · · · . The sequence {a_{n_k}}_{k=1}^∞ is called a subsequence of
{a_n}. If lim_{k→∞} a_{n_k} exists, the limit is called a subsequential limit.

We then have the following result.

Proposition 3.4.6 Suppose that {a_n} is a real sequence and L ∈ R. Then lim_{n→∞} a_n =
L if and only if every subsequence of {a_n} converges to L.

Proof: (⇒) If lim_{n→∞} a_n = L, then for every ε > 0 there exists an N ∈ R such
that n > N implies that |a_n − L| < ε. Consider any subsequence {a_{n_k}} of {a_n}.
Clearly if n_k > N, then |a_{n_k} − L| < ε. Let K be such that n_K ≤ N < n_{K+1}
(n_K can be defined to be lub{n_k : n_k ≤ N}). Then k > K implies that n_k > N
and |a_{n_k} − L| < ε. Therefore lim_{k→∞} a_{n_k} = L.
(⇐) Suppose false, i.e. every subsequence of {a_n} converges to L but lim_{n→∞} a_n ≠
L (either the limit doesn't exist or it exists and does not equal L). The limit lim_{n→∞} a_n ≠
L if for some ε > 0 and every N ∈ R there exists an n > N for which |a_n − L| ≥ ε.
Let N = 1 and denote by n_1 a value (of n) such that |a_{n_1} − L| ≥ ε.
Then let N = n_1 and denote by n_2 the element of N such that n_2 > N = n_1
and |a_{n_2} − L| ≥ ε.
Continue in this fashion and get a sequence of natural numbers {n_k} such that
n_1 < n_2 < n_3 < · · · and |a_{n_k} − L| ≥ ε for all k. Thus the subsequence {a_{n_k}}
does not converge to L. This is a contradiction, so lim_{n→∞} a_n = L.
n→∞
The next result, known as the Bolzano–Weierstrass Theorem, is an important
result for later work.

Theorem 3.4.7 (Bolzano–Weierstrass Theorem) If the set E ⊂ R is


bounded, then every sequence in E has a convergent subsequence.

Proof: Let {xk } be a sequence in E. If the set E1 = {xk : k ∈ N} is a finite


set, then at least one value, say a ∈ E1 , must be repeated infinitely often in the
sequence {xn }. If we consider the subsequence {xnj }∞ j=1 where xnj = a for all
j, then the subsequence is clearly convergent.
If the set E1 is infinite, we proceed with a construction much the same as
we used in Proposition 2.3.7. Since E1 is bounded, there is an closed interval
I1 = [a1 , b1 ], a1 < b1 , such that E1 ⊂ I1 .
Let c1 = (a1 + b1 )/2 and consider the closed intervals [a1 , c1 ] and [c1 , b1 ]. One of
these intervals must contain infinitely many points of E1 (the sequence {xn }),
call this interval I2 and write this closed interval as I2 = [a2 , b2 ]. (If both
intervals [a1 , c1 ] and [c1 , b1 ] contained only finitely many points of E1 , then
[a1 , b1 ] = [a1 , c1 ] ∪ [c1 , b1 ] would contain only finitely many points of E1 .)
Let c2 = (a2 + b2 )/2 and consider the closed intervals [a2 , c2 ] and [c2 , b2 ]. One of
these intervals must contain infinitely many points of E1 (the sequence {xn }),
call this interval I3 and write this closed interval as I3 = [a3 , b3 ].
In general suppose that we have defined I1 ⊃ I2 ⊃ · · · ⊃ In , where Ij = [aj , bj ],
j = 1, · · · n. Let cn = (an + bn )/2 and consider the closed intervals [an , cn ]
and [cn , bn ]. One of these intervals must contain infinitely many points of E1
(the sequence {xn }), call this interval In+1 and write this closed interval as
In+1 = [an+1 , bn+1 ].
We have the nested sequence of closed intervals {In} (In ⊃ In+1) such that each interval In contains infinitely many points of E1 and the length of the interval In is (b1 − a1)/2^{n−1}. By Proposition 2.3.6 we know that ∩_{n=1}^∞ In is not empty. Let x0 be such that x0 ∈ ∩_{n=1}^∞ In. (Because the length of the intervals goes to zero, there is really only one point in the intersection—but we don't care—finding one point is enough.)
Now choose the subsequence {xnj } as follows:
Choose xn1 as one of the terms of the sequence {xn } such that xn1 ∈ I1 (since
I1 contains infinitely many elements of E1 , this is surely possible).
Choose xn2 ∈ I2 from the terms of the part of the original sequence {xn1 +1 , · · · }
(i.e. such that n2 > n1 )—I2 contained infinitely many elements of E1 so there
are still enough to choose from.
In general choose xnj ∈ Ij so that nj > nj−1 —since Ij contained infinitely
many elements of E1 there are still plenty of elements to choose from. Do so for
all j, j = 1, · · · .

Since x0 ∈ ∩_{n=1}^∞ In, x0 ∈ Ij for all j. Since xnj ∈ Ij also, |xnj − x0| ≤ (b1 − a1)/2^{j−1} → 0 as j → ∞ and the subsequence {xnj} converges to x0.
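The bisection construction in this proof can be carried out numerically. The sketch below (in Python; the helper names and the particular sequence are our own illustration, not part of the formal development) stands in for "contains infinitely many points" by counting among the first 10,000 terms, and recovers a convergent subsequence of xn = (−1)^n (1 + 1/n).

```python
# Numerical sketch (not part of the proof): the bisection construction from
# Theorem 3.4.7, applied to the bounded sequence x_n = (-1)^n * (1 + 1/n).
# At each stage we keep the half-interval holding "infinitely many" terms,
# approximated here by counting terms among the first `count` indices.

def bolzano_weierstrass_subsequence(x, a, b, stages=12, count=10_000):
    """Pick indices n_1 < n_2 < ... with x(n_j) in the j-th nested interval."""
    indices = []
    last = 0
    for _ in range(stages):
        c = (a + b) / 2
        # choose the half containing more sample points
        left = sum(1 for n in range(1, count) if a <= x(n) <= c)
        right = sum(1 for n in range(1, count) if c <= x(n) <= b)
        a, b = (a, c) if left >= right else (c, b)
        # choose a term beyond the previous index that lies in the kept half
        n = next(n for n in range(last + 1, count) if a <= x(n) <= b)
        indices.append(n)
        last = n
    return indices, (a + b) / 2

x = lambda n: (-1) ** n * (1 + 1 / n)
idx, limit = bolzano_weierstrass_subsequence(x, -2.0, 2.0)
# the chosen indices increase and the subsequence closes in on a cluster point
```

Here the construction homes in on the cluster point −1; the terms x(n_j) it selects are trapped in intervals of length (b1 − a1)/2^{j−1}, exactly as in the proof.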
One very easy result that we obtain from the Bolzano–Weierstrass Theorem
is the following.
Corollary 3.4.8 If the set K ⊂ R is compact, then every sequence in K has a
convergent subsequence that converges to a point in K.

Proof: From the Heine–Borel Theorem, Theorem 2.3.8, we know that K is bounded. By the Bolzano–Weierstrass Theorem, Theorem 3.4.7, we know that if {xn} is a sequence in K, then the sequence {xn} has a convergent subsequence {xnj}. Again by the Heine–Borel Theorem, since K is compact, K is closed. Then by Proposition 3.4.4 the subsequence {xnj} converges to some x0 ∈ K.
Cauchy Sequences and the Cauchy Criterion There is an idea strongly
related to convergence in the reals that at times can be very helpful when
discussing convergence of sequences. At first look it doesn’t appear that this
should be the right place to include this result. We include this definition
and proposition here because the proof depends on the Bolzano–Weierstrass
Theorem, 3.4.7. We begin with the following definition—notice how similar it
is to that of convergence of a sequence.
Definition 3.4.9 Consider a real sequence {an}. The sequence is said to be a Cauchy sequence if for every ε > 0 there exists an N ∈ R such that n, m ∈ N and n, m > N implies that |an − am| < ε.
Thus we see that whereas {an} converges to L if for all large n's, the an's get close to L, the sequence is a Cauchy sequence if for all large n's and m's, the terms an and am get close to each other. It is easy to see that the sequence {1, 1, · · · } is a Cauchy sequence—choose N = 1. It is also easy to see that the sequence {an} where an = 1/n is a Cauchy sequence. (If N is chosen so that N = 2/ε, then n, m > N implies that |1/n − 1/m| ≤* 1/n + 1/m < 2/N = ε where the step labeled ≤* is true because of the triangular inequality, Proposition 1.5.8-(v).) And finally a sequence such as {n} is not a Cauchy sequence. If we choose ε = 1, then for any N and n > N, we can find an m > N, say m = n + 5, such that |an − am| = |n − m| = 5 > ε.
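These two examples are easy to probe numerically. The following sketch (ours, purely illustrative; finitely many indices can suggest but never prove the Cauchy property) measures the largest gap |an − am| over a window of indices beyond N. For an = 1/n the gaps fall under ε once N = 2/ε, while for an = n they never shrink.

```python
# Not a proof, just an illustration of Definition 3.4.9.

def max_gap(a, N, probe=200):
    """Largest |a(n) - a(m)| seen over a window of indices just beyond N."""
    start = int(N) + 1
    window = range(start, start + probe)
    return max(abs(a(n) - a(m)) for n in window for m in window)

eps = 0.01
assert max_gap(lambda n: 1 / n, N=2 / eps) < eps   # Cauchy behaviour for 1/n
assert max_gap(lambda n: n, N=2 / eps) >= 1        # the gaps for a_n = n persist
```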
We next include a lemma that is really part of the proof of the Cauchy
criterion. We are separating it out because it may be useful in its own right.

Lemma 3.4.10 If the sequence {an} is a Cauchy sequence, then the sequence is bounded.

Proof: Suppose that {an} is a Cauchy sequence. Choose ε = 1 and let N ∈ R be such that n, m > N implies that |an − am| < 1. Choose a fixed M ∈ N such that M > N. Then for n > M > N we have |an − aM| < 1. Then by the backwards inequality, Proposition 1.5.8-(vi), we get |an| − |aM| ≤ |an − aM| < 1, or if n > M, |an| < |aM| + 1. Then the sequence {an} is bounded by max{|a1|, · · · , |aM−1|, |aM| + 1}.
We now proceed with a very important theorem. Note that the proof in one
direction is difficult—read it carefully.

Proposition 3.4.11 (Cauchy Criterion for Convergence) Consider a real sequence {an}. The sequence {an} is convergent if and only if the sequence {an} is a Cauchy sequence.

Proof: (⇒) Begin by supposing that an → L. We know that for any ε1 > 0 there exists N ∈ R such that n > N implies that |an − L| < ε1. Now suppose that we are given an ε > 0, choose ε1 = ε/2 and let N ∈ R be the value promised us by the convergence of the sequence {an}. Then if n, m > N, |an − am| = |(an − L) + (L − am)| ≤* |an − L| + |L − am| < 2ε1 = ε where the step labeled ≤* is due to the triangular inequality, Proposition 1.5.8-(v). Therefore {an} is a Cauchy sequence.
(⇐) Suppose that {an } is a Cauchy sequence. Let E be the set of points
{a1 , a2 , · · · }. By Lemma 3.4.10 the sequence {an }, and hence the set E, is
bounded. Then we know by the Bolzano–Weierstrass Theorem, Theorem 3.4.7,
the sequence {an } has a convergent subsequence, say {ank }. Let L be such that
ank → L as k → ∞.
We will now proceed to prove that {an} converges to L. Suppose ε > 0 is given and let N ∈ R be such that n, m ∈ N and n, m > N implies |an − am| < ε/2 (because {an} is a Cauchy sequence). Let N2 ∈ R be such that k ∈ N and k > N2 implies that |ank − L| < ε/2 (because ank → L). Let nK (where nK is one of the subscripts from the subsequence, n1, n2, · · · ) be a fixed integer such that K > N2. In addition require that nK > N—if we have found an appropriate nK, we can always use a larger value. Hence, we have that |anK − L| < ε/2 and if n > N we have that |anK − an| < ε/2. Thus for n > N,

|an − L| = |(an − anK) + (anK − L)| ≤* |an − anK| + |anK − L| < ε/2 + ε/2 = ε

where the step labeled ≤* follows by the triangular inequality. Therefore an → L as n → ∞.
So we see that the Cauchy criterion provides us with an alternative approach to proving convergence. Because we showed that the sequence {an} where an = 1/n is a Cauchy sequence, we know that {an} converges (which we already knew). However, using this approach we do not know or need to know what the sequence converges to. That is the magic of the Cauchy criterion. If you look more closely at how we proved that the sequence {1/n} was Cauchy, it should be pretty clear that the approach is very similar to the approach for showing that a sequence converges—except that we have two of the terms an in the absolute value and we do not have the limit. As we will see when we consider series in Chapter 8, the Cauchy criterion for convergence can be very useful—because we often do not know the sum of our series.

HW 3.4.1 (True or False and why) (a) If lim_{n→∞} 1/an exists, then lim_{n→∞} an exists.
(b) Consider the sequence {an }. If the subsequences {a2n } and {a2n+1 } both
converge, then the sequence {an } converges.
(c) If {an } is a sequence of rationals in [0, 1], then {an } has a subsequence that
converges to a rational in [0, 1].
(d) If an < 0 for all n > N for some N ∈ R and lim_{n→∞} an exists, then lim_{n→∞} an ≤ 0.
(e) The sequence {1/n^2} is a Cauchy sequence.
(f) There is a sequence that consists of the set of rationals in [0, 1].
HW 3.4.2 Prove that the sequence {n sin(1/n)} converges.

HW 3.4.3 Prove that there exists a subsequence {nk } of N such that {cos nk }
converges.

HW 3.4.4 Use the definition, Definition 3.4.9, to prove that {1/n3 } is a Cauchy
sequence.

HW 3.4.5 Let {an } represent the sequence that contains all of the rational
numbers in [0, 1]. Explain why the sequence {an } is not convergent. Prove
that {an } has a convergent subsequence. Describe one of the convergent subse-
quences of {an }.

3.5 The Monotone Convergence Theorem


At this time the methods we have to prove convergence of sequences are (i) to use the definition (if we know what the limit is) and (ii) to use the limit theorems to reduce our limit to one or more known limits (usually getting back eventually to lim_{n→∞} c = c or lim_{n→∞} 1/n = 0). In this section we will include a third approach for proving the convergence of sequences: the convergence of monotone sequences. Monotone sequences are a very important class of sequences. We begin with the following definition.
Definition 3.5.1 (a) The sequence {an} is said to be monotonically increasing if an+1 ≥ an for all n ∈ N.
If {an} is such that an+1 > an for all n ∈ N, the sequence is said to be strictly increasing.
(b) The sequence {an} is said to be monotonically decreasing if an+1 ≤ an for all n ∈ N.
If {an} is such that an+1 < an for all n ∈ N, the sequence is said to be strictly decreasing.
A sequence {an} is said to be monotone if it is either monotonically increasing or decreasing.

It should not be hard to see that the sequences {−1/n}, {−1/n^2}, {1 − 1/n^2} and {3^n} are monotonically increasing (they're strictly increasing too) and that the sequences {1/n}, {1/n^2}, {1 + 1/n^2} and {(1/2)^n} are monotonically decreasing (and strictly decreasing). Likewise, it should be clear that the sequences {(−1)^n}, {(−1)^n (1/n)} and {1 + (−1)^n (1/n)} are not monotonic sequences. The easiest approach to demonstrate that a sequence such as {1 − 1/n^2} is monotone increasing is by setting an+1 ≥? an, i.e. 1 − 1/(n + 1)^2 ≥? 1 − 1/n^2—which at this time we do not know is true (we have placed a question mark, ?, over the inequality to indicate that you don't know that it is true), and then simplify the inequality with reversible steps until you arrive at an inequality that you know is true or that you know is false. In this case we see that

1 − 1/(n + 1)^2 ≥? 1 − 1/n^2 is the same as
1/(n + 1)^2 ≤? 1/n^2 (subtract 1 from both sides and multiply both sides by −1) is the same as
n^2 ≤? (n + 1)^2 = n^2 + 2n + 1 (multiply both sides by n^2 and (n + 1)^2 and simplify) is the same as
0 ≤? 2n + 1 (subtract n^2 from both sides).

We know that 0 ≤ 2n + 1 is true for all n ∈ N. Then it should be clear that we can trace the steps used above backwards (add n^2 to both sides, write n^2 + 2n + 1 as (n + 1)^2, divide both sides by n^2 and (n + 1)^2, multiply both sides by −1 and add 1 to both sides) to actually prove that an+1 = 1 − 1/(n + 1)^2 ≥ 1 − 1/n^2 = an for all n ∈ N. You will see that most people do the first calculation and do not do the second—and it shouldn't be necessary amongst friends. You should realize (or verify) that the two groups of monotonic sequences given above can be proved to be such by the same method.
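If you want to double-check such a computation numerically before writing the reversible-steps argument, something like the following sketch (ours, not a proof; it only tests finitely many n) will do.

```python
# Quick finite check (not a proof) that a_{n+1} >= a_n, which the algebra
# above reduces to the always-true inequality 0 <= 2n + 1.

def is_monotone_increasing(a, terms=1000):
    """Check a(n+1) >= a(n) over the first `terms` indices only."""
    return all(a(n + 1) >= a(n) for n in range(1, terms))

assert is_monotone_increasing(lambda n: 1 - 1 / n**2)
assert is_monotone_increasing(lambda n: -1 / n)
assert not is_monotone_increasing(lambda n: 1 + (-1)**n / n)  # not monotone
```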
It is much easier to prove that a sequence is not monotonic. Consider the sequence {1 + (−1)^n (1/n)}. If we write out three terms in a row (the first three make the arithmetic easier), 1 − 1 = 0, 1 + (1/2) = 3/2, 1 − 1/3 = 2/3, we see that since a1 = 0 < a2 = 3/2 the sequence is not monotonically decreasing (but it may be monotonically increasing) and since a2 = 3/2 > a3 = 2/3 the sequence is not monotonically increasing. Therefore the sequence is not monotonic.
The above sequences are easy sequences and are some of the easiest sequences
to show whether or not they are monotonic. There are sequences where it is
more difficult to show that they are monotonic. Sometimes the algebra required
to perform the computations analogous to that done above is next to impossible.
One approach (which is cheating at this time but will be perfectly OK soon—
after Chapter 6) is to use the fact that if the derivative of a function is positive
(or negative), then the function is increasing (or decreasing). (This is something
that we have not proved yet but we should know that it is true from our basic calculus course.) For example consider the sequence an = (n^2 + 3n + 1)/(2n + 3). Because

d/dx [(x^2 + 3x + 1)/(2x + 3)] = (2x^2 + 6x + 7)/(2x + 3)^2 > 0 for x ≥ 1,

the function f(x) = (x^2 + 3x + 1)/(2x + 3) is increasing. Then for n ∈ N we see that n < n + 1 implies that

an = f(n) < f(n + 1) = an+1

or that the sequence {an} is monotonically increasing (and strictly increasing).
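A quick numerical check of this derivative shortcut (ours, and not a proof; positivity of f′ on [1, ∞) is what the argument actually uses) might look like:

```python
# Sanity check (not a proof) of the derivative shortcut: f'(x) > 0 on [1, oo)
# forces f(n) < f(n+1), so a_n = (n^2 + 3n + 1)/(2n + 3) strictly increases.

def f(x):
    return (x**2 + 3 * x + 1) / (2 * x + 3)

def fprime(x):  # (2x^2 + 6x + 7)/(2x + 3)^2, by the quotient rule
    return (2 * x**2 + 6 * x + 7) / (2 * x + 3) ** 2

assert all(fprime(x / 10) > 0 for x in range(10, 1000))  # positive on a grid
assert all(f(n) < f(n + 1) for n in range(1, 1000))      # strictly increasing
```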

One last comment before we proceed to the Monotone Convergence Theorem. We notice that we have always proved that our sequences were strictly increasing
or decreasing. This was done this way because for the Monotone Convergence
Theorem, we only need that the sequences are monotonic. When it is important
to have the strict monotonicity, it is not difficult to shift gears, get it and use
it.

Theorem 3.5.2 (Monotone Convergence Theorem)


(a) If the sequence {an } is monotonically increasing and bounded above, the
sequence converges, and converges to lub{an : n ∈ N}.
(b) If the sequence {an } is monotonically decreasing and bounded below, the
sequence converges, and converges to glb{an : n ∈ N}.
(c) If a monotonic sequence is not bounded, then it does not converge.

Proof: This is a very important theorem that is especially nice because the
proof is really easy—in fact, when you think about it, it’s obvious. Consider
part (a). If the sequence is monotonically increasing, then it surely cannot be
the type of sequence that does not converge because it oscillates back and forth
between two distinct numbers. If the sequence is bounded, the sequence cannot
be the type of sequence that does not converge because it goes off to infinity.
There’s really nothing left.
We begin the proof as usual by supposing that we are given ε > 0. For convenience let S = {an : n ∈ N} and L = lub(S)—which exists because the sequence is bounded above. Recall that from Proposition 1.5.3–(a) we know that for any ε > 0 there exists some an0 ∈ S such that L − an0 < ε. Then for all n > N = n0 (Step 1: Define N), by the fact that the sequence {an} is monotonically increasing, an ≥ an0 > L − ε. Also for all n > N = n0 (really for all n), because L = lub(S) is an upper bound of S, an ≤ L < L + ε. Therefore, for n > N we have

L − ε < an < L + ε or |an − L| < ε

so lim_{n→∞} an = L.
(b) We will not include the proof of part (b). You should make sure that
you understand that part (b) follows from Proposition 1.5.3–(b) in the same
way that (a) followed from Proposition 1.5.3–(a)—or you could consider the
sequence {−an } and apply part (a) of this theorem.
(c) This statement was only included in the proposition for completeness.
The contrapositive of the statement is that if the sequence is convergent, it
is bounded—but we already know that to be true for any sequence (monotone
or not) by Proposition 3.3.2–(c).
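As an illustration (ours, not a proof) of part (a), the increasing sequence an = 1 − 1/n is bounded above by 1, and its terms crowd up against lub{an : n ∈ N} = 1:

```python
# Illustration (not a proof) of Theorem 3.5.2-(a): a monotonically increasing
# sequence bounded above converges to the least upper bound of its terms.

a = [1 - 1 / n for n in range(1, 10_001)]
assert all(x <= y for x, y in zip(a, a[1:]))  # monotonically increasing
assert all(x <= 1 for x in a)                 # 1 is an upper bound
assert 1 - a[-1] < 1e-3                       # terms crowd up against lub = 1
```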
The Monotone Convergence Theorem has many applications and is an im-
portant theorem. At this time we will use it to prove a very useful limit.
Example 3.5.1 Prove that if |c| < 1 then lim_{n→∞} c^n = 0.
Solution: (You should be aware that if c = 1, the limit is one. If c > 1, the sequence is
unbounded so the limit does not exist—or as we will soon show, the limit is infinity. If c ≤ −1,
the limit does not exist. We will not prove these now.)
Case 1: Suppose we make it easy and assume that 0 < c < 1. From HW 1.6.6 we see that 0 < c^n < 1 for all n ∈ N (an induction proof). If an = c^n, then by the fact that c < 1 and Proposition 1.3.7-(iii) we have an+1 = c^{n+1} = c^n · c < c^n · 1 = an. Thus the sequence {an = c^n} is monotonically decreasing. Also since an = c^n > 0, the sequence is bounded below. Thus by Theorem 3.5.2–(b) we know that lim_{n→∞} c^n exists and equals L = glb(S) where S = {c^n : n ∈ N}.
Notice that since 0 < c^n for all n ∈ N, by HW 1.5.1–(a) L ≥ 0. To show that L = 0 we suppose false, i.e. suppose that L > 0. Since L is a lower bound of S, L ≤ c^m for all m ∈ N. Specifically, for n ∈ N, L ≤ c^{n+1} also. Then c^n = c^{n+1}/c ≥ L/c, so L/c is a lower bound of S. But since c < 1, L/c > L. This contradicts the fact that L = glb(S). Therefore L = 0 and lim_{n→∞} c^n = 0.
Case 2 & 3: If c = 0, then c^n = 0 for all n so the result follows from HW 3.2.1-(b). If −1 < c < 0, then we can write c^n = (−1)^n |c|^n and the result follows from Case 1 and Corollary 3.4.3 (−1 < c < 0 implies that 0 < |c| < 1, so Case 1 implies that |c|^n → 0).
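A numerical illustration of this example (ours; it proves nothing, but it shows the monotone decrease toward the greatest lower bound 0, and the alternating case being dragged to 0 as well):

```python
# Illustration (not part of the proof) of Example 3.5.1: for |c| < 1 the
# powers c^n shrink toward 0 in absolute value.

def powers(c, n):
    return [c**k for k in range(1, n + 1)]

p = powers(0.9, 200)
assert all(x > y for x, y in zip(p, p[1:]))  # strictly decreasing (0 < c < 1)
assert p[-1] < 1e-9                          # 0.9**200 is already tiny
q = powers(-0.5, 50)
assert abs(q[-1]) < 1e-15                    # |c|^n -> 0 drags c^n to 0 too
```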

Note that this example includes the limit proved in Example 3.2.4. Hopefully you realize that the limit considered above could also be proved using the same approach as we used in Example 3.2.4.
We next use Example 3.5.1, Proposition 3.4.2 and Corollary 3.4.3 to prove
the convergence of another important limit.
Example 3.5.2 Prove that lim_{n→∞} a^n/n! = 0 for any a ∈ R.
Solution: Before we proceed, recognize that this is a strong result. No matter how large a is (and if a is large, a^n will get really large), eventually n! gets to be big enough to dominate the a^n term. (If you are interested, set a = 100 and compute a^n/n! for n = 150, 151, 152. You'll see they are getting smaller but they have a long way to go.) If you look at this proof carefully, you will see exactly how and why this happens.
To make the solution a bit easier we consider Case 1: a > 0. We begin by choosing M ∈ N such that M > a (we can do this by Corollary 1.5.4). Then for n > M, we see that

a^n/n! = a^n/(M!(M + 1) · · · n) ≤ a^n/(M! M^{n−M})   (there were n − M factors)
       = (M^M/M!)(a/M)^n.                                                  (3.5.1)

Since M is fixed, 0 < a^n/n! ≤ (M^M/M!)(a/M)^n and (a/M)^n → 0 (because a/M < 1), we apply Proposition 3.4.2 to see that the sequence {a^n/n!} converges to 0.
As in the last example, Cases 2 & 3 are easy. When a = 0 we have the trivial zero sequence. When a < 0, we see that a^n/n! = (−1)^n |a|^n/n! so the result follows from Case 1 and Corollary 3.4.3.
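The experiment suggested above (a = 100) can be carried out exactly with Python's big integers and Fraction. This is an illustration only, not part of the proof:

```python
# Exact computation of a^n/n! for a = 100, using arbitrary-precision
# rationals; an illustration of Example 3.5.2, not part of the proof.
from fractions import Fraction
from math import factorial

def term(a, n):
    return Fraction(a**n, factorial(n))

t150, t151, t152 = (term(100, n) for n in (150, 151, 152))
assert t150 > t151 > t152               # decreasing once n > M = 100 ...
assert float(t152) > 1e30               # ... but still astronomically large
assert float(term(100, 500)) < 1e-100   # eventually n! wins completely
```

This is exactly the behavior the proof predicts: past n = M each new factor multiplies the term by a/n < 1, so the decrease starts immediately at n = 101, but it takes hundreds more factors before the terms are actually small.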

You might note that the limit proved in Example 3.5.2 can be proved using the Monotone Convergence Theorem directly. It is an interesting application of the Monotone Convergence Theorem in that the sequence is not monotonically decreasing—set a = 100 and compute a1, a2, a3, a150, a151 and a152. To apply the Monotone Convergence Theorem you apply it to the sequence {a^n/n!}, n = M, M + 1, · · · (where M is as in the last example). Again we begin with Case 1 where a > 0. Since n + 1 > M and M > a, we see that

a^{n+1}/(n + 1)! = (a^n/n!)(a/(n + 1)) < (a^n/n!)(a/M) < a^n/n!,

so the sequence is monotonically decreasing. The sequence is bounded below by zero, so the limit exists and equals L = glb(S) = glb({a^n/n! : n ≥ M}). As in Example 3.5.1 we assume that L > 0 and note that a^{n+1}/(n + 1)! ≥ L for any n and

a^n/n! = (a^{n+1}/(n + 1)!)/(a/(n + 1)) ≥ L/(a/M).

Since this is true for any n ∈ N, ML/a is also a lower bound and ML/a > L so L cannot be the greatest lower bound. Therefore L = 0.
We see that the tail end of the given sequence converges, hence the sequence converges—recall the discussion of tail ends of sequences at the end of Section 3.2.

HW 3.5.1 (True or False and why) (a) The sequence {sin(1/n)} is monotone.
(b) The sequence {n + (−1)n /n} is monotone.
(c) The sequence {n/2n } is monotone.
(d) The sequence {(n + 1)/(n + 2)} is monotonically decreasing.
(e) If {an } and {bn } are Cauchy sequences, then {an +bn } is a Cauchy sequence.
(f) If {an } and {bn } are Cauchy sequences, then {an bn } is a Cauchy sequence.
(g) If the sequence {an } has a convergent subsequence, then {an } converges.

HW 3.5.2 Suppose S ⊂ R is bounded above and not empty, and set s = lub(S).
Prove that there exists a monotonically increasing sequence {an } ⊂ S such that
s = lim an .
n→∞

HW 3.5.3 Suppose {an} is a monotone increasing sequence and {bn} is a monotone decreasing sequence such that an ≤ bn for all n ∈ N. Show that limn→∞ an and limn→∞ bn both exist and limn→∞ an ≤ limn→∞ bn.

HW 3.5.4 Suppose the monotone sequence {an} has a convergent subsequence. Prove that {an} is convergent.

3.6 Infinite Limits


As we stated earlier we do want to have the concept of infinite limits. The fact that lim_{n→∞} (n^2 + 1) does not exist is not on an equal footing with the fact that lim_{n→∞} (−1)^n does not exist.
When we introduced the limit of a sequence we gave you the following explanation (that we told you we liked): "for every measure of closeness to L" there exists "a measure of closeness to ∞" so that whenever "n is close to ∞", "an is close to L." To be able to define when lim_{n→∞} an = ∞ it should be clear that we want a definition that will satisfy "for every measure of closeness to ∞" there exists "a measure of closeness to ∞" so that whenever "n is close to ∞", "an is close to ∞." We use the same type of measure of closeness of an to ∞ as we do for the measure of closeness of n to ∞. We obtain the following definition.
Definition 3.6.1 Consider a real sequence {an}. (a) lim_{n→∞} an = ∞ if for every M > 0 there exists an N ∈ R such that n > N implies that an > M.
(b) lim_{n→∞} an = −∞ if for every M < 0 there exists an N ∈ R such that n > N implies that an < M.
We will say either that an converges to ∞ (or −∞) or an diverges to ∞ (or −∞). From this point on we will no longer claim that a limit such as lim_{n→∞} (n^2 + 1) does not exist. We will say that lim_{n→∞} (1/n) exists and equals 0, lim_{n→∞} (n^2 + 1) exists and equals ∞ and lim_{n→∞} (−1)^n does not exist.
Since we have made the claim that lim_{n→∞} (n^2 + 1) = ∞, we had better prove it.
Example 3.6.1 Prove that lim_{n→∞} (n^2 + 1) = ∞.
Solution: As you will see the proofs of infinite limits are very much like the proofs of finite
limits—maybe easier. We still will have two basic steps. Step 1: Define N, and Step 2: Show
that N works. We suppose that we are given an M > 0. We want an N so that n > N implies that n^2 + 1 > M. As we did in the case of finite limits, we solve this inequality for n, i.e.

n^2 + 1 > M is the same as n^2 > M − 1 is the same as n > √(M − 1).

Therefore we want to define N = √(M − 1) (Step 1: Define N). Then if n > N = √(M − 1), n^2 > M − 1 and n^2 + 1 > M (Step 2: N works).
Before we say that we are done we should note that what we have done above is not quite correct. The definition must hold for any M > 0. But if 0 < M < 1, then M − 1 < 0 so we cannot take the square root of M − 1—but using an M between 0 and 1 to measure whether a sequence is going to infinity is not the smartest thing to do anyway. However, we must satisfy the definition (this technicality is analogous to large ε's when we are considering finite limits). The approach is to take two cases, 0 < M < 1 and M ≥ 1.
Case 1: (0 < M < 1) Choose N = 1. Then n > N = 1 implies that n^2 + 1 > M (this is assuming that the sequence starts at either n = 0 or n = 1).
Case 2: (M ≥ 1) Proceed as we did originally—now √(M − 1) makes sense.
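Checking this choice of N numerically (an illustration, not part of the solution; the helper name N_for is ours):

```python
# Numerical check (not the proof) of Example 3.6.1: with N = sqrt(M - 1) for
# M >= 1, and N = 1 for 0 < M < 1, every natural n > N gives n^2 + 1 > M.
from math import sqrt

def N_for(M):
    return sqrt(M - 1) if M >= 1 else 1.0

for M in (0.5, 1.0, 10.0, 10**6):
    N = N_for(M)
    n = int(N) + 1  # the first natural number beyond N
    assert n > N and n**2 + 1 > M
```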

We include one more infinite limit example because we hinted at the result in the last section—but we warn you that, as was the case with Example 3.2.4, we will again cheat in that we will use the logarithm and exponential functions.

Example 3.6.2 Prove that if c > 1 then lim_{n→∞} c^n = ∞.
Proof: As before we assume that we are given M > 0. We want N so that n > N implies that c^n > M. We solve the last inequality for n by taking the logarithm of both sides to get ln c^n = n ln c > ln M or n > ln M/ln c. We choose N = ln M/ln c (Step 1: Define N). Then n > N = ln M/ln c implies that n ln c > ln M or ln c^n > ln M. Taking the exponential of both sides (the exponential function is also increasing) gives c^n > M (Step 2: N works). Therefore c^n → ∞ as n → ∞.
We should note that some of the reasons that make the above steps correct include the following facts. The logarithm and exponential functions are increasing so the inequalities stay in the same direction when these functions are applied. We were given that c > 1 so that ln c > 0, so the inequalities stay in the same direction when we divide by or multiply by ln c. And if 0 < M < 1, ln M < 0 but it's permissible to have a negative N because if we assume that the sequence starts at either n = 0 or n = 1, then for all n ≥ 0 > N = ln M/ln c we have c^n > M.

We should include the last few cases here. If c = 1, the sequence is the trivial sequence of all ones so c^n → 1. If c = −1, the sequence is the sequence that we considered in Example 3.2.6 so lim_{n→∞} c^n does not exist. If c < −1, the sequence is clearly unbounded, so by Theorem 3.5.2–(c) the sequence does not converge to any finite value—Theorem 3.5.2 only included convergence in R. Using Example 3.6.2 we see that c^{2n} → ∞ and c^{2n+1} → −∞. Thus the sequence {c^n} cannot converge to either ∞ or −∞. Since there is nothing left, lim_{n→∞} c^n does not exist.
We want to emphasize the point that the limit theorems stated and proved
in Sections 3.3 and 3.4 do not apply to infinite limits—we always had the as-
sumption that the limits were L, L1 or L2 and they were in R. It should not
surprise you that there are limit theorems for infinite limits—and that they are
not as nice as the theorems for finite limits. We include some of the results
without proof. The proofs of these results are easy. You should know that there
are more results available.

Proposition 3.6.2 Suppose that {an} and {bn} are real sequences. We have the following results.
(a) If lim_{n→∞} an = ∞ and lim_{n→∞} bn = L2 where L2 ∈ R or L2 = ∞, then an + bn → ∞ as n → ∞.
(b) If lim_{n→∞} an = −∞ and lim_{n→∞} bn = L2 where L2 ∈ R or L2 = −∞, then an + bn → −∞ as n → ∞.
(c) If lim_{n→∞} an = ∞ and c ∈ R is such that c > 0, then can → ∞.
(d) If lim_{n→∞} an = −∞ and c ∈ R is such that c > 0, then can → −∞.

HW 3.6.1 (True or False and why)
(a) lim_{n→∞} (2n^2 − 3n^3) = ∞ − ∞ = 0.
(b) lim_{n→∞} (2n^2 − 3n^3) does not exist.
(c) lim_{n→∞} (2n^2 − 3n^3) = −∞.
(d) lim_{n→∞} 2n^2/(3n^3 + 1) = (lim_{n→∞} 2n^2)/(lim_{n→∞} (3n^3 + 1)) = ∞/∞ = 1.
(e) lim_{n→∞} 2n^2/(3n^3 + 1) = 0.

HW 3.6.2 Prove that lim_{n→∞} n^2/(n + 1) = ∞.
Chapter 4

Limits of Functions

4.1 Definition of the Limit of a Function


In a way the title of this chapter is bad or misleading. We saw that a sequence is a function (a function that has N as its domain) and we defined a limit of a sequence. The difference is that in this chapter we will define limits of functions defined on the reals or subsets of the reals that are generally much larger than N. Where in the case of the limit of a sequence we considered the limit of f(n) as n approaches infinity, we will now consider the limit of f(x) as x approaches x0 for some x0 ∈ R. The limit that we will consider in this chapter is the limit that you studied so hard in your basic calculus course and used to define the derivative.
We begin by considering f : D → R where the domain and range of f, D and R, are subsets of R. We suppose that x0 ∈ R but very importantly do not require that x0 ∈ D. We do, however, require that x0 is a limit point of D, i.e. every neighborhood of x0 must contain x ∈ D, x ≠ x0. Thus x0 need not be in D but it must be close to D. Of course, we will write the limit of f(x) as x approaches x0 equals L as lim_{x→x0} f(x) = L.
The limit considered in this chapter will be analogous to the sequential limit so we must be able to characterize the limit by "for every measure of closeness to L" there exists "a measure of closeness to x0" so that whenever "x is close to x0", "f(x) is close to L." It should not surprise us that we can handle "f(x) is close to L" very much as we did for the sequential limit. The difference is that instead of n being close to ∞, we must now have the concept that x is close to x0—but that idea should not be too difficult to comprehend. We make the following definition.

Definition 4.1.1 Suppose that f : D → R, D, R ⊂ R, x0, L ∈ R and x0 is a limit point of D. We say that lim_{x→x0} f(x) = L if for every ε > 0 there exists a real δ > 0 such that x ∈ D, 0 < |x − x0| < δ implies that |f(x) − L| < ε.
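For readers who like to experiment, Definition 4.1.1 can be probed on a grid. The checker below is our illustration (it samples finitely many x, so it can only refute a candidate δ, never certify one). Here f(x) = x^2, x0 = 2, L = 4, and δ = ε/5 works for the ε's tried because |x^2 − 4| = |x − 2||x + 2| < 5|x − 2| when |x − 2| < 1.

```python
# Brute-force illustration (not a proof) of the epsilon-delta condition.

def delta_works(f, x0, L, eps, delta, samples=10_000):
    """Check |f(x) - L| < eps on a grid of x with 0 < |x - x0| < delta."""
    for k in range(1, samples):
        for sign in (-1, 1):
            x = x0 + sign * delta * k / samples  # 0 < |x - x0| < delta
            if not abs(f(x) - L) < eps:
                return False
    return True

for eps in (1.0, 0.1, 0.001):
    assert delta_works(lambda x: x**2, 2.0, 4.0, eps, delta=eps / 5)
```

Note that a failing grid point genuinely disproves a δ, while a passing grid only suggests it; the actual verification is the algebraic estimate in the lead-in.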


If lim_{x→x0} f(x) = L we say that f(x) converges to L as x goes to x0, or sometimes write f(x) → L as x → x0.
We see that the biggest difference between the definition of a sequential limit and Definition 4.1.1 is that the statement "there exists an N such that n > N implies that |an − L| < ε" is replaced by "there exists a δ such that 0 < |x − x0| < δ implies that |f(x) − L| < ε." The measure of closeness to infinity is all of the n's greater than some N whereas the measure of closeness to x0 is all of the x's within some δ distance from x0.
There are two pieces of the above definition of which we want to make special note. The first is the "0 <" part of the requirement that we want to consider x's such that 0 < |x − x0| < δ. We want (need?) the limit to be applicable to derivatives where the function under consideration is of the form (f(x) − f(x0))/(x − x0), i.e. we eventually want to use limits to define derivatives. This function is not defined at x = x0 so if we want to take a limit of this function as x approaches x0, we surely do not want to require that x ever equals x0.
Another point of the definition is that we only consider x's such that 0 < |x − x0| < δ and x ∈ D. Of course we don't want to consider any x's that are not in D—because then it would be stupid to write f(x) if x ∉ D. However, the requirement that x0 is a limit point of D ensures us that there are points in the domain D that are arbitrarily close to x0, i.e. there are some points in D such that 0 < |x − x0| < δ. Otherwise the limit definition is nonsensical at x0.
Graphical description of the definition of a limit There is a useful graphical description of Definition 4.1.1.

[Figure 4.1.1: Plot of a function, the y = L ± ε corridor and the x0 − δ2 to x0 + δ1 corridor.]

In Figure 4.1.1 we first plotted a function, chose a point x0, and then projected that point up to the curve and across to the y-axis to define L. Thus that part of the plot gives us the function, f, the
point at which we want the limit, x0, and the limiting point, L. We are given an ε > 0 so we plot the points L ± ε. We then project these two points across to the curve and down to the x-axis. We denote these two points as x0 − δ2 and x0 + δ1. This notation is really defining the size of δ1 and δ2.
We note that whenever the curve is nonlinear, δ1 ≠ δ2. In this case δ1 < δ2. More importantly you should realize that for any x between x0 − δ2 and x0 + δ1, f(x) will be between L − ε and L + ε—you choose any such x, project the point vertically to the curve and then horizontally to the y-axis. We want to find a δ so that whenever 0 < |x0 − x| < δ (or x0 − δ < x < x0 + δ, x ≠ x0), then f(x) will satisfy |f(x) − L| < ε (or L − ε < f(x) < L + ε). If we choose δ = min{δ1, δ2} (Step 1: Define δ), the point x0 + δ will be at x0 + δ1 (because we claimed that δ1 < δ2 so in this case δ = min{δ1, δ2} = δ1) and x0 − δ will be inside of x0 − δ2—between x0 − δ2 and x0. Hence by Figure 4.1.1 it should be clear that whenever 0 < |x0 − x| < δ, |f(x) − L| < ε (Step 2: δ works).
You should realize that anytime you have an acceptable candidate for the δ, i.e. one that works, you can always choose a smaller δ. For example, it is clear from the picture that everything between x0 − δ1 (remembering that δ1 < δ2) and x0 + δ1 will get mapped into the region (L − ε, L + ε). So it should be clear that if we chose δ = δ1/13, then all points in the interval (x0 − δ, x0 + δ) = (x0 − δ1/13, x0 + δ1/13) would also get mapped into the region (L − ε, L + ε). And, of course there is nothing special about 13 (except that it is a very nice integer). In this case any δ such that 0 < δ < δ1 will work.
The second note that we should make about this example is that we have not done anything to eliminate the point "x0" from our deliberations, i.e. we have not done anything to allow for the "0 <" part of the requirement 0 < |x − x0| < δ. The reason is that in this case the function is sufficiently nice that we don't have to. In this case it is clear that f(x0) = L so that when |x − x0| is actually zero, i.e. when x = x0, then |f(x) − L| = |f(x0) − L| = 0 < ε. The point is that once we have the δ, we only need to satisfy: if x is such that 0 < |x − x0| < δ, then |f(x) − L| < ε. If whenever x is such that |x − x0| < δ, then |f(x) − L| < ε, the above statement will be satisfied (plus nice info at one extra point that we didn't need). This happens because f is a nice function. We will see that this is not always the case. Even in this case we could have not defined f at x0 or defined f at x0 to be anything that we wanted. We would get the same picture (except for right at x0) and the same result.
It should not surprise you to hear that we can also rewrite Definition 4.1.1
in terms of neighborhoods. We define a punctured neighborhood of a point
x0 to be the set (x0 − r, x0 + r) − {x0 } = (x0 − r, x0 ) ∪ (x0 , x0 + r) for some
r > 0, i.e. the same as a neighborhood of x0 except that we eliminate the point
x0. We denote a punctured neighborhood of x0 by N̂r (x0 ). We can then restate
Definition 4.1.1 as follows: lim_{x→x0} f (x) = L if for every neighborhood of L, Nε (L),
there exists a punctured neighborhood of x0, N̂δ (x0 ), such that x ∈ N̂δ (x0 ) ∩ D
implies that f (x) ∈ Nε (L). Again there is only a difference of notation between
this version of the definition and Definition 4.1.1.
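For readers who like to experiment, the neighborhood formulation is easy to mirror in code. The following Python sketch is our own illustration (not part of the text's formal development); it checks on a few sample points that membership in the punctured neighborhood N̂r(x0) is exactly neighborhood membership together with x ≠ x0.

```python
def in_nbhd(x, x0, r):
    # x is in the neighborhood (x0 - r, x0 + r)
    return abs(x - x0) < r

def in_punctured_nbhd(x, x0, r):
    # x is in (x0 - r, x0) U (x0, x0 + r), i.e. the neighborhood minus {x0}
    return 0 < abs(x - x0) < r

x0, r = 2.0, 0.5
for x in (1.4, 1.8, 2.0, 2.3, 2.6):
    # punctured membership = ordinary membership AND x != x0
    assert in_punctured_nbhd(x, x0, r) == (in_nbhd(x, x0, r) and x != x0)
print("punctured-neighborhood identity checked on sample points")
```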
92 4. Limits of Functions

Two limit theorems Before we proceed to apply the definitions to some spe-
cific examples, we are going to prove two propositions. The first is the analog
to Proposition 3.3.1. It would be best—in fact it is imperative—that when we
do have a value of L satisfying Definition 4.1.1, there isn’t some other L1 that
would also satisfy the definition. We have the following proposition.
Proposition 4.1.2 Suppose that f : D → R, D ⊂ R, x0 ∈ R and x0 is a
limit point of D. If lim_{x→x0} f (x) exists, it is unique.

Proof: The proof of this proposition is very similar to that of Proposition
3.3.1. We suppose the proposition is false and that there are at least two limits,
lim_{x→x0} f (x) = L1 and lim_{x→x0} f (x) = L2, L1 ≠ L2. For convenience let us suppose
that L1 > L2 (one or the other must be larger). Choose ε = |L1 − L2 |/2 =
(L1 − L2 )/2. (After you have finished reading this proof, draw a picture to see
why this is a good choice of ε.) Since lim_{x→x0} f (x) = L1, we know that for the ε
given there exists a δ1 such that 0 < |x − x0 | < δ1 implies that |f (x) − L1 | < ε.
This inequality can be rewritten as

−ε + L1 < f (x) < ε + L1 or (L1 + L2 )/2 < f (x) < (3L1 − L2 )/2. (4.1.1)

Likewise, since lim_{x→x0} f (x) = L2, we know that for the ε given above there
exists a δ2 such that 0 < |x − x0 | < δ2 implies that |f (x) − L2 | < ε. This
inequality can be rewritten as

−ε + L2 < f (x) < ε + L2 or (3L2 − L1 )/2 < f (x) < (L1 + L2 )/2. (4.1.2)

Let δ = min{δ1 , δ2 } and consider x such that 0 < |x − x0 | < δ. Then both
inequalities (4.1.1) and (4.1.2) will be satisfied. If we take the leftmost part of
inequality (4.1.1) and the rightmost part of inequality (4.1.2), we get

(L1 + L2 )/2 < f (x) < (L1 + L2 )/2.

Of course this is impossible, so we have a contradiction (and because x0 is a limit
point of D we know that there are some values of x at which this contradiction
actually occurs). Hence two such L's do not exist and our limit is unique.
We note that the hypothesis that "x0 is a limit point of D" is a very impor-
tant hypothesis for this result. If x0 is not required to be a limit point of D, the
limit would not be unique at x0—the limit could be anything at such points.
Our next result will be very important to us. It is logical to try to relate the
limit of Definition 4.1.1 to that of sequential limits. We do this with the
following proposition.
Proposition 4.1.3 Suppose that f : D → R, D ⊂ R, x0 , L ∈ R and x0 is a
limit point of D. Then lim_{x→x0} f (x) = L if and only if for any sequence {an } such
that an ∈ D for all n, an ≠ x0 for any n, and lim_{n→∞} an = x0, we have
lim_{n→∞} f (an ) = L.

Proof: (⇒) We begin by assuming the hypothesis that lim_{x→x0} f (x) = L and suppose
that we are given a sequence {an } with an ∈ D for all n, an ≠ x0 for any n and
an → x0. We also suppose that we are given some ε > 0. We must find an N
such that n > N implies that |f (an ) − L| < ε.
Because lim_{x→x0} f (x) = L, we get a δ such that

if 0 < |x − x0 | < δ, then |f (x) − L| < ε. (4.1.3)

We apply the definition of the fact that an → x0 with the "traditional ε replaced
by δ" to get an N ∈ R such that

n > N implies that |an − x0 | < δ. (4.1.4)

(Step 1: Define N.)
Now suppose that n > N. We first apply statement (4.1.4) above to see that
|an − x0 | < δ. Because we assumed that the sequence {an } satisfies
an ≠ x0 for all n, we know that 0 < |an − x0 |, i.e. for n > N we have
0 < |an − x0 | < δ. We then apply statement (4.1.3) (with x replaced by an) to
see that |f (an ) − L| < ε (Step 2: N works). Therefore lim_{n→∞} f (an ) = L.
(⇐) We now assume that if {an } is any sequence such that an ∈ D for all n,
an ≠ x0 for any n and an → x0, then lim_{n→∞} f (an ) = L. We assume that the
proposition is false, i.e. that lim_{x→x0} f (x) does not converge to L. This means
that there is some ε such that for any δ there exists an x-value, xδ, such that
0 < |xδ − x0 | < δ, xδ ∈ D and |f (xδ ) − L| ≥ ε, i.e. for any δ there is at least
one bad value xδ. Think carefully about this negation.
The emphasis is that the above last statement is true for any δ.
Let δ = 1: Then there exists an xδ value, call it a1, such that 0 < |a1 − x0 | < 1
and |f (a1 ) − L| ≥ ε. a1 will be in D and a1 ≠ x0.
Let δ = 1/2: Then there exists an xδ value, call it a2, such that 0 < |a2 − x0 | <
1/2 and |f (a2 ) − L| ≥ ε. (It happens for any δ.) a2 ∈ D and a2 ≠ x0.
We could go next to 1/3, then 1/4, etc., except that it gets old. We'll jump
to a general n.
Let δ = 1/n: Then there exists an xδ value, call it an, such that 0 < |an − x0 | <
1/n and |f (an ) − L| ≥ ε. an ∈ D and an ≠ x0.
And of course this works for all n ∈ N. We have a sequence {an } such that
an ≠ x0 for all n (true because of the "0 <" part of the restriction). We also
have |an − x0 | < 1/n for all n. This implies that an → x0. (See HW3.2.1–(a).)
Then by our hypothesis we know that f (an ) → L, i.e. for any ε > 0 (including
specifically the ε given to us above) there exists an N such that n > N implies
|f (an ) − L| < ε. But for this sequence we have that |f (an ) − L| ≥ ε for all
n ∈ N. This is a contradiction; therefore the assumption that "lim_{x→x0} f (x) does
not converge to L" is false and lim_{x→x0} f (x) = L.

Comments concerning Proposition 4.1.3 (i) We first note that since we
have an "if and only if" result with the definition on one side, this gives us a
statement equivalent to our definition. The right side of the
above proposition is used as the definition of a limit in some textbooks. Our
definition is surely the more traditional one. Once we have Proposition 4.1.3,
who cares: we can use either the definition or Proposition 4.1.3, whichever
best suits us at the time.
(ii) It might seem that the restrictions on the sequences {an } are not espe-
cially nice in that we always have to assume that an ≠ x0 for any n. However,
it is fairly obvious that this is necessary—it's necessary because f may not be
defined at x = x0. A lot of the sequences that converge to x0 take on the value
x0 once or many times, for example the very nice sequence {x0 , x0 , · · · }. Such
sequences are not allowed—because we do have and want the "0 <" as a part
of our definition of a limit. But in the end, as long as you remember that the
restriction is necessary, it doesn't seem to cause undue difficulties.
(iii) And finally, it might seem that it would be very hard to apply the ⇐
direction of Proposition 4.1.3 because you have to consider a lot of sequences—
all of the sequences such that an ∈ D for all n, an ≠ x0 for any n and an → x0.
But often this is not a terrible burden if you can just consider a general sequence.
Application of Proposition 4.1.3 is an especially nice way to show that a
given limit does not exist or is not L. If we can find one sequence {an } such that
an ≠ x0 and an → x0 but f (an ) ↛ L, then we know that lim_{x→x0} f (x) ≠ L (f (an )
must approach L for all such sequences). If we can find one sequence {an } such
that an ≠ x0 and an → x0 but lim_{n→∞} f (an ) does not exist, then lim_{x→x0} f (x) does
not exist (lim_{n→∞} f (an ) must exist and equal L for all such sequences).
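This two-sequence strategy is easy to try out numerically. The sketch below is only an illustration (the function f(x) = |x|/x and the two test sequences are our own choices, not from the text): it evaluates f along two sequences converging to 0 and gets two different limiting values, which by Proposition 4.1.3 is exactly the evidence that lim_{x→0} f(x) cannot exist.

```python
def f(x):
    # f(x) = |x|/x, i.e. +1 for x > 0 and -1 for x < 0 (undefined at 0)
    return abs(x) / x

# Two sequences a_n = 1/n and b_n = -1/n; both converge to 0, neither hits 0.
a = [1 / n for n in range(1, 1001)]
b = [-1 / n for n in range(1, 1001)]

fa = {f(x) for x in a}   # values of f along {a_n}
fb = {f(x) for x in b}   # values of f along {b_n}

print(fa, fb)  # f(a_n) is constantly 1.0, f(b_n) is constantly -1.0
```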

HW 4.1.1 (True or False and why)
(a) Suppose D = [0, 1] ∪ {2} and define f : D → R by f (x) = x². For any
ε > 0, any x such that 0 < |x − 2| < 1, x ∈ D implies |f (x) − 4| < ε. Then
lim_{x→2} f (x) = 4.
(b) Suppose D = [0, 1) and define f : D → R by f (x) = x². Let {an } be any
sequence such that an ∈ [0, 1), an ≠ 0, and an → 1 as n → ∞. By Proposition
3.3.2-(d), lim_{n→∞} f (an ) = lim_{n→∞} an² = 1 · 1 = 1. Then lim_{x→1} f (x) = 1.
(c) Suppose f : D → R, D ⊂ R, x0 ∈ D is such that for any sequence {an }
with an ∈ D and an → x0, we have f (an ) → f (x0 ). Then lim_{x→x0} f (x) = f (x0 ).
(d) Suppose f : D → R, D ⊂ R, x0 , L ∈ R and x0 is a limit point of D. The
negation of the statement lim_{x→x0} f (x) = L is "for some ε > 0 there exists a δ such
that 0 < |x − x0 | < δ implies |f (x) − L| ≥ ε."
(e) lim_{x→2} (3x + 2) = 8

HW 4.1.2 Prove that lim_{x→0} |x| = 0. (Hint: Consider HW3.3.2.)
HW 4.1.3 Prove that lim_{x→1} (2x + 3) = 5. (Hint: Consider Proposition 3.3.2.)

HW 4.1.4 Suppose that F (y) < 0 for all y in some punctured neighborhood of
y0. Suppose that lim_{y→y0} F (y) exists. Prove that lim_{y→y0} F (y) ≤ 0.

4.2 Applications of the Definition of the Limit


In the last section we introduced the definition of a limit of a function. In
this section we will learn how to apply the definition to particular functions and
points. Again we want to emphasize that when we are applying Definition 4.1.1,
we will always follow the two steps, Step 1: Define δ, and Step 2: Show that
the δ works.
In addition to introducing the definition of a limit in the last section we
also proved Proposition 4.1.3—which gave an alternative equivalent definition
of the limit of a function. All of these examples can be done both using the
definition and using Proposition 4.1.3. As you will see, most often it is easier
to apply Proposition 4.1.3 (we have already done most of the work in Sections
3.3 and 3.4). However, we do want you to be familiar with the definition. For
this reason we will do each of these examples twice: using Definition 4.1.1 and
using Proposition 4.1.3.
We now consider several examples.
Example 4.2.1 Prove that lim_{x→3} (2x + 3) = 9.
Using Definition 4.1.1: We suppose that we are given an ε > 0. We must find the δ (Step 1)
that will satisfy the definition. This is an easy example, and it is easy to use the graphical approach
to find the δ. In HW4.2.2 you will be given the problem of proving this limit graphically. At this
time we will introduce the method that is the most common approach because it works for a
wider class of problems—the method is very close to the method used for proving sequential
limits.
We need x to satisfy |f (x) − 9| = |(2x + 3) − 9| = |2x − 6| < ε. This last inequality
is the same as |2(x − 3)| = 2|x − 3| < ε or |x − 3| < ε/2. According to Definition 4.1.1,
we must find a δ such that 0 < |x − 3| < δ implies that |f (x) − 9| < ε. But the above
calculation shows that |f (x) − 9| < ε is equivalent to |x − 3| < ε/2. Thus we choose δ = ε/2
and require that x satisfy |x − 3| < δ = ε/2. We can multiply by 2 to get 2|x − 3| < ε or
|2(x − 3)| = |(2x + 3) − 9| = |f (x) − 9| < ε.
Thus, if we choose δ = ε/2 (Step 1: Define δ), the above calculation shows that this
delta works, i.e. |x − 3| < δ implies that |f (x) − 9| < ε (Step 2: The δ works). Therefore
lim_{x→3} (2x + 3) = 9.
Note again that, as in the case of the example given in Figure 4.1.1, we have shown that
|x − 3| < δ implies that |f (x) − 9| < ε where we are only required to show that 0 < |x − 3| < δ
implies that |f (x) − 9| < ε. It is always permissible to show something stronger than what
we need to show. Again this is possible for this example because it is just about the second
easiest example possible.
Using Proposition 4.1.3: We suppose that we are given a sequence {an } such that an ≠ 3
for any n and an → 3—any such sequence. Then we know by Proposition 3.3.2 parts (a) and
(b), and HW3.2.1-(b) that

lim_{n→∞} (2an + 3) = lim_{n→∞} (2an ) + lim_{n→∞} 3 = 2 lim_{n→∞} an + 3 = 2(3) + 3 = 9.

Therefore lim_{x→3} (2x + 3) = 9. Admittedly much easier.
Note in the above example that when we applied the definition, we started with
the inequality that we need to be satisfied, |f (x) − 9| < ε. We then proceeded
to manipulate this inequality until we were able to isolate a term of the form
|x − 3|. This led to an easy definition of δ. The algebra of inequalities will not
always be this easy, but it will always be possible to isolate the term |x − x0 |.
Observe this occurrence as we proceed.
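As an informal companion to Example 4.2.1, the following sketch samples points with 0 < |x − 3| < δ and confirms that δ = ε/2 keeps f(x) = 2x + 3 within ε of 9. This is a sampled sanity check of our own, not a proof; the grid size is an arbitrary choice.

```python
def delta_works(f, x0, L, eps, delta, samples=100_000):
    """Check |f(x) - L| < eps for sampled x with 0 < |x - x0| < delta."""
    for i in range(1, samples + 1):
        h = delta * i / (samples + 1)      # 0 < h < delta
        if abs(f(x0 + h) - L) >= eps or abs(f(x0 - h) - L) >= eps:
            return False
    return True

f = lambda x: 2 * x + 3
for eps in (1.0, 0.1, 0.001):
    assert delta_works(f, x0=3.0, L=9.0, eps=eps, delta=eps / 2)
print("delta = eps/2 passed all sampled checks")
```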
Example 4.2.2 Prove that lim_{x→2} x² = 4.
Solution: Using Definition 4.1.1: We begin as we did in the last problem. We suppose
that ε > 0 is given. We must find δ. Eventually we must satisfy the inequality |f (x) − L| =
|x² − 4| = |(x − 2)(x + 2)| = |x − 2||x + 2| < ε. Notice that the |x − 2| term is included in the
second-to-last term of the inequality—as we promised it would be. The next step is tougher.
We cannot divide by |x + 2|, get |x − 2| < ε/|x + 2| and define δ = ε/|x + 2|, even though this
would be analogous to what we did in Example 4.2.1. The δ that we find can depend on ε (like the
N's almost always depended on ε). If we are taking a limit as x approaches a general point
x0, the δ can depend on x0—it's a fixed value. δ cannot depend on x.
We used the boldface above because we did want to make that point extremely clear—
otherwise (and maybe in spite of it) someone would make that mistake. The last couple of sentences
of the above paragraph are very important.
We return to the inequality that we want satisfied, |x − 2||x + 2| < ε. The technique we
use is to bound the |x + 2| term. We do this by choosing a temporary fixed δ1, say δ1 = 1,
and assuming that |x − 2| < δ1 = 1. Then −1 < x − 2 < 1, 1 < x < 3 and 3 < x + 2 < 5.
The last inequality implies that |x + 2| < 5. Could it be less for some x? Of course it could.
However, it could be very close to 5—and never bigger. Therefore if we assume that x satisfies
|x − 2| < δ1 = 1, then |x − 2||x + 2| < 5|x − 2|. If we then set 5|x − 2| < ε, we see that
|x − 2| < ε/5, so it seems logical to define δ = ε/5. But this is wrong. If we review this
paragraph carefully, we see that |x − 2||x + 2| < 5|x − 2| < 5δ = 5(ε/5) = ε only if x satisfies
|x − 2| < δ1 = 1 and |x − 2| < δ = ε/5.
Therefore the way to do it is to forget our earlier definition of δ and define δ to be
δ = min{1, ε/5} (Step 1: Define δ). Then if x satisfies |x − 2| < δ, x will satisfy both
|x − 2| < 1 and |x − 2| < ε/5. Then

|x² − 4| = |x − 2||x + 2| <∗ 5|x − 2| <∗∗ 5(ε/5) = ε

(Step 2: Show that the defined δ works), where inequality "<∗" is satisfied because |x −
2| < 1 (δ = min{1, ε/5} ≤ 1) and inequality "<∗∗" is satisfied because |x − 2| < ε/5 (δ =
min{1, ε/5} ≤ ε/5). Therefore lim_{x→2} x² = 4.
Using Proposition 4.1.3: We suppose that we are given a sequence {an } such that an ≠ 2
for any n and an → 2. By Proposition 3.3.2–(d) we see that

lim_{n→∞} f (an ) = lim_{n→∞} an² = (lim_{n→∞} an )(lim_{n→∞} an ) = 2 · 2 = 4.

Therefore lim_{x→2} x² = 4.

For the application of the definition, if you compare Examples 4.2.1 and 4.2.2,
you realize that the difference is that the function in Example 4.2.1 is linear, and
that is why it is so easy to apply the definition. For most functions (at least all
nonlinear functions) you will have to apply some version of the method (trick?)
used in Example 4.2.2. Of course, application of Proposition 4.1.3 lets us skip
these difficulties. Recall that in the proof of Proposition 3.3.2–(d), we used
part (c) of the proposition—the result that guaranteed the boundedness of a
convergent sequence. Thus the application of Proposition 4.1.3 to find the limit
in Example 4.2.2 also uses a boundedness result—albeit very indirectly.
We might note that in the application of the definition we used δ1 = 1 (we
call it δ1 because it's sort of the first approximation of our δ) because 1 is a really
nice number. If we had used δ1 = 1/2, we would have gotten 7/2 < x + 2 < 9/2.
In this case we see that |x + 2| < 9/2, so |x − 2||x + 2| < (9/2)|x − 2| and we
would define δ to be δ = min{1/2, ε/(9/2)}. If instead we had used δ1 = 2, then
we would find that |x + 2| < 6 and would eventually define δ = min{2, ε/6}.
Any of these choices would give you a correct result. As we see in the next
example, it is sometimes important to be careful how we choose δ1.
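The same kind of sampled check can be run on Example 4.2.2 with each of the three candidate δ's just discussed. Again this is only a numerical illustration of our own; the grids and the particular ε values are arbitrary choices.

```python
def delta_works(f, x0, L, eps, delta, samples=50_000):
    """Sample 0 < |x - x0| < delta and check |f(x) - L| < eps."""
    return all(
        abs(f(x0 + s * delta * i / (samples + 1)) - L) < eps
        for i in range(1, samples + 1) for s in (-1, 1)
    )

sq = lambda x: x * x
for eps in (2.0, 0.5, 0.01):
    # the three candidate deltas derived from delta_1 = 1, 1/2 and 2
    for delta in (min(1, eps / 5), min(0.5, eps / 4.5), min(2, eps / 6)):
        assert delta_works(sq, x0=2.0, L=4.0, eps=eps, delta=delta)
print("all three candidate deltas passed the sampled checks")
```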
Example 4.2.3 Prove that lim_{x→−2} (x − 2)/(x + 3) = −4.
Solution: Using Definition 4.1.1: For this problem we proceed as we have before and
assume that ε > 0 is given. We want a δ so that when 0 < |x − (−2)| = |x + 2| < δ,
|(x − 2)/(x + 3) − (−4)| < ε. We see that

|(x − 2)/(x + 3) − (−4)| = |5(x + 2)/(x + 3)| = 5|x + 2|/|x + 3|. (4.2.1)

We note that the |x + 2| term is there—as we promised it would be. So as in Example 4.2.2
we must bound the rest. But in this case we must be more careful. If we chose δ1 = 1 as
we did before (and 1 is such a nice number), then 5/|x + 3| would be unbounded on the set
of x such that |x + 2| < δ1 = 1. (|x + 2| < 1 implies that −3 < x < −1, and 5/|x + 3| goes to
infinity as x goes to −3.) Hence we must be a little bit more careful and choose δ1 = 1/2. If
x is such that |x + 2| < 1/2, then −5/2 < x < −3/2 and 1/2 < x + 3 < 3/2. Thus we see
that if |x + 2| < 1/2, then |x + 3| > 1/2 (and it's only the bad luck of the numbers that the two
1/2's appear) and 5/|x + 3| < 5/(1/2) = 10. Thus we return to equation (4.2.1) and see that
if |x + 2| < 1/2, then

|(x − 2)/(x + 3) − (−4)| = 5|x + 2|/|x + 3| < 10|x + 2|. (4.2.2)

Thus define δ = min{ε/10, 1/2} (Step 1: Define δ). Then if 0 < |x + 2| < δ, 5/|x + 3| < 10
and 10|x + 2| < 10(ε/10) = ε. Therefore if 0 < |x + 2| < δ,

|(x − 2)/(x + 3) − (−4)| = 5|x + 2|/|x + 3| < 10|x + 2| < ε

(Step 2: δ works), and lim_{x→−2} (x − 2)/(x + 3) = −4.
Using Proposition 4.1.3: We suppose that we are given a sequence {an } such that an ∈ D
for all n (which in this case means that an ≠ −3 for any n), an ≠ −2 for any n and an → −2.
By Proposition 3.4.1–(b), Proposition 3.3.2–(a) and HW3.2.1-(b) we see that

lim_{n→∞} f (an ) = lim_{n→∞} (an − 2)/(an + 3) = −4/1 = −4.

Therefore lim_{x→−2} (x − 2)/(x + 3) = −4.
Note that again in this problem the "0 <" part of the restriction on x is not
important. The function is well behaved at x = −2 (and equals −4). However,
because in this problem the function f blows up near x = −3, we must be careful to
restrict the δ in the application of the definition and the sequence {an } in the
application of Proposition 4.1.3.
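A numerical check makes the danger of the pole at x = −3 visible: δ = min{1/2, ε/10} passes a sampled test, while the careless choice δ = 1 fails because sample points get close to x = −3, where the function blows up. As before, this sketch is an illustration of our own, not a proof.

```python
def delta_works(f, x0, L, eps, delta, samples=50_000):
    """Sample 0 < |x - x0| < delta and check |f(x) - L| < eps."""
    return all(
        abs(f(x0 + s * delta * i / (samples + 1)) - L) < eps
        for i in range(1, samples + 1) for s in (-1, 1)
    )

g = lambda x: (x - 2) / (x + 3)
eps = 0.5
assert delta_works(g, x0=-2.0, L=-4.0, eps=eps, delta=min(0.5, eps / 10))
# With delta_1 = 1 the bound fails: sample points approach the pole at
# x = -3, where g blows up.
assert not delta_works(g, x0=-2.0, L=-4.0, eps=eps, delta=1.0)
print("min{1/2, eps/10} works; delta = 1 does not")
```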
The next example that we consider is an important problem. The limit con-
sidered is an example of a limit used to compute a derivative—a very important
use of limits in calculus.
Example 4.2.4 Prove that lim_{x→4} (x³ − 64)/(x − 4) = 48.
Solution: Using Definition 4.1.1: For convenience define f (x) = (x³ − 64)/(x − 4). Note that f (4)
is not defined. When you try to evaluate f at x = 4, you get zero over zero. This does not
mean that we will not be able to evaluate the limit given above—and hopefully you realize
this if you remember your limit work related to derivatives.
We proceed as usual: assume that we are given an ε > 0 and want to find a δ so that
0 < |x − 4| < δ will imply that |f (x) − 48| = |(x³ − 64)/(x − 4) − 48| < ε. We start with the
expression f (x) − 48 and note that

(x³ − 64)/(x − 4) − 48 = (x − 4)(x² + 4x + 16)/(x − 4) − 48 (4.2.3)

(if you don't believe the factoring, multiply the expression on the right to see that you get
x³ − 64 back). We have an x − 4 factor in both the numerator and the denominator of the first
term on the right. We want to divide them out. In general you have to be careful in doing
this, but in this case it is completely permissible. The requirement on x will be 0 < |x − 4| < δ.
The meaning of the "0 < |x − 4|" part of the inequality is that x − 4 ≠ 0. And if x − 4 ≠ 0, we
can divide them out. Hence, returning to equation (4.2.3), we get

(x³ − 64)/(x − 4) − 48 = (x − 4)(x² + 4x + 16)/(x − 4) − 48 = (x² + 4x + 16) − 48 = x² + 4x − 32 = (x − 4)(x + 8).

We promised you that there would always be an x − 4 factor in the simplified version of
f (x) − L. Thus

|(x³ − 64)/(x − 4) − 48| = |(x − 4)(x + 8)| = |x − 4||x + 8|. (4.2.4)

The |x − 4| term will be made less than δ as it has been in Examples 4.2.1–4.2.3. The |x + 8|
term must be bounded as we bounded |x + 2| in Example 4.2.2 and 5/|x + 3| in Example 4.2.3.
Hence we require that x satisfy |x − 4| < δ1 = 1 and notice that this gives us the following:
|x − 4| < 1 ⇒ −1 < x − 4 < 1 ⇒ 3 < x < 5 ⇒ 11 < x + 8 < 13. Therefore if |x − 4| < δ1 = 1,
then |x + 8| < 13. More importantly, it is also clear that if 0 < |x − 4| < δ1 = 1, then
|x + 8| < 13—if it's satisfied on the larger set, it's satisfied on the smaller set. Returning to
equation (4.2.4), we see that if we require that x satisfy |x − 4| < δ1 = 1, then

|(x³ − 64)/(x − 4) − 48| = |x − 4||x + 8| < 13|x − 4|. (4.2.5)

And finally, if we define δ = min{1, ε/13} (Step 1: Define δ) and require that 0 < |x − 4| < δ
(so that 0 < |x − 4| < 1 and 0 < |x − 4| < ε/13), we continue with equation (4.2.5) to get

|(x³ − 64)/(x − 4) − 48| = |x − 4||x + 8| < 13|x − 4| < 13(ε/13) = ε

(Step 2: δ works). Therefore (x³ − 64)/(x − 4) → 48 as x → 4.
Using Proposition 4.1.3: We suppose that we are given a sequence {an } such that an ∈ D
for all n (i.e. an ≠ 4 for any n) and an → 4. Then

lim_{n→∞} (an³ − 64)/(an − 4) = lim_{n→∞} (an − 4)(an² + 4an + 16)/(an − 4) (4.2.6)
= lim_{n→∞} (an² + 4an + 16) = 4² + 4 · 4 + 16 = 48. (4.2.7)

We note that it is permissible to divide out the an − 4 term between lines (4.2.6) and (4.2.7)
because we have assumed that an ≠ 4 for any n. Therefore (x³ − 64)/(x − 4) → 48 as x → 4.
We should emphasize that it would be wrong to apply Proposition 3.4.1–(b) after step
(4.2.6) and then try some sort of division.
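As a quick informal check of Example 4.2.4, evaluating the difference quotient along the sequence an = 4 + 1/n (a sequence of our own choosing, in the spirit of Proposition 4.1.3) shows the values settling toward 48:

```python
f = lambda x: (x**3 - 64) / (x - 4)   # undefined at x = 4 itself

# a_n = 4 + 1/n converges to 4 and never equals 4
values = [f(4 + 1 / n) for n in (1, 10, 100, 10_000)]
print(values)

# f(a_n) should approach 48 as n grows
assert abs(values[-1] - 48) < 1e-2
```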

You might have noticed that when we wrote the limits in the preceding
examples, we did not usually explicitly define the function and the domain. We
wrote the expression for the function as a part of the limit statement (as you
did in your basic calculus course) and assumed that you knew the domain. This
is common. Know the domain? We really assume that the domain is chosen as
the largest set on which the expression can be defined—in the case of Examples
4.2.1 and 4.2.2, D = R; in Example 4.2.3, D = R − {−3}; in Example 4.2.4,
D = R − {4}; etc. Of course in these cases the requirement that x0 is a limit
point of D was always satisfied.
In the solution of Example 4.2.4 using the definition, we were able to factor
an x − 4 term out of x³ − 64 in equation (4.2.3). This was not luck. If the
limit is to exist, it will always be there. Remember that when we tried to evaluate
f (4) we got 0/0. The zero in the numerator implies that there's an x − 4 factor in
there—somewhere; sometimes it's hard to see. For all of the problems that result
from applying the definition of a derivative, you will always have the x − 4 term
in the numerator that will divide out with the x − 4 term in the denominator.
But remember, it was the "0 <" part of the restriction on x that allowed
us to divide out the x − 4 terms. This was essential. Likewise, this is
an example where, if you choose to prove the limit using Proposition 4.1.3, the
hypothesis "an ≠ x0 for any n" becomes important. In the application of
Proposition 4.1.3, it is this assumption on the sequence {an } that allows us to
divide out the an − 4 terms.
There are other problems, beyond the limits involved in computing derivatives,
that require the "0 <" restriction on x (or the an ≠ x0 assumption) and a
division. You could make up a function that when factored looked like
(x − 2)²(x² + x + 1)/(x − 2)² (you can multiply it out if you'd like to make
it look like a real example) and try to calculate the limit of that function as
x → 2. The limit would be 7. You would use the "0 <" restriction to divide
out the (x − 2)² terms and then would have

(x − 2)²(x² + x + 1)/(x − 2)² − 7 = (x² + x + 1) − 7 = x² + x − 6 = (x + 3)(x − 2).

The last expression contains the x − 2 term (we promised), and if we were applying
the definition, we would proceed by bounding |x + 3| the way that we have done
before.
If the function is of the form f (x) = (x + 2)²h(x)/(x + 2), where h(−2) ≠ 0 (and we
want the limit of f as x → −2), then only one x + 2 term will divide out (you
only have one in the denominator—what else could you do) and the limit would
be 0 because of the x + 2 term that is left in the numerator. This is just a really
nice limit.
If the function is of the form f (x) = (x − 3)²h(x)/(x − 3)³, where h(3) ≠ 0—
emphasizing the fact that the degree of the term in the numerator is smaller
than that in the denominator, then you could divide out only two of the x − 3
terms, and the x − 3 term that was left in the denominator would cause the limit
to not exist.
Let us emphasize again: all of these slight variations of the problem given
in Example 4.2.4 work because of the "0 <" part of the restriction on x in
Definition 4.1.1. We see that it does not come into play in problems involving easy
limits but is important on the class of limits associated with derivatives—and
similar problems.
Nonconvergence of Limits: Of course, if we have a definition of convergence
of limits and some examples of application of the definition, we must have some
examples where the function doesn't converge to a limit. As in the case of
nonconvergence of sequential limits, proving that a limit does not exist using
the definition is often difficult. You must show that for some ε > 0 there does
not exist any δ such that 0 < |x − x0 | < δ implies that |f (x) − L| < ε (using
the notation as given in Definition 4.1.1), i.e. for some ε > 0 and any δ, there
exists an xδ such that 0 < |xδ − x0 | < δ and |f (xδ ) − L| ≥ ε.
In general, it is usually much easier and more natural to use Proposition
4.1.3 to show that a limit does not exist. Again, we do want you to see that it
is possible to use the definition in these arguments and how to use it.
The application of Proposition 4.1.3 is a bit different from before. Consider
the ⇒ direction of the proposition: if lim_{x→x0} f (x) = L, then for any sequence
{an } such that an ∈ D for all n, an ≠ x0 for any n and lim_{n→∞} an = x0, we have
lim_{n→∞} f (an ) = L. The contrapositive of this statement reads
something like the following: if it is not the case that for any sequence {an } such
that an ∈ D for all n, an ≠ x0 for any n and lim_{n→∞} an = x0 we have lim_{n→∞} f (an ) = L,
then lim_{x→x0} f (x) ≠ L. How does one satisfy the statement "it is not the case that
for any sequence {an } such that an ∈ D for all n, an ≠ x0 for any n and
lim_{n→∞} an = x0 we have lim_{n→∞} f (an ) = L"? It is easy. One way is to find a sequence
that satisfies the properties an ∈ D for all n, an ≠ x0 for any n and an → x0, but
for which the limit lim_{n→∞} f (an ) does not exist. That implies not only that the limit is not
some particular L but that the limit does not exist. Another way is to find two
sequences {an } and {bn } such that an , bn ∈ D for all n, an ≠ x0 and bn ≠ x0 for
any n, an → x0 and bn → x0 as n → ∞, and lim_{n→∞} f (an ) ≠ lim_{n→∞} f (bn ). Not only
will this imply that the original limit is not L, but it will also imply that it can't
be anything else either (because we have at least two unequal candidates), i.e.
the limit does not exist.
We will include three examples of nonconvergence. The first will not satisfy
Definition 4.1.1 because it wants to have an infinite limit and Definition 4.1.1
requires that L ∈ R (and as in the case of sequential limits, we will later
define what it means to have an infinite limit). The second will be analogous to
the sequential limit example given in Example 3.2.6—there will be two logical
limits, so neither (and nothing else) will satisfy the definition. And the last—
probably the most interesting—will not have a limit just because it is a really
nasty function. As we mentioned earlier, we will show nonconvergence by the
definition because we want you to see how it is done. Because we feel that the
natural approach is to apply Proposition 4.1.3, we will give that approach first.
Example 4.2.5 Prove that lim_{x→0} 1/x² does not exist.
Solution: (Using Proposition 4.1.3) Consider the sequence {1/n}. This sequence satisfies
the properties 1/n ≠ 0 for any n and 1/n → 0. Thus if the limit were to exist, the sequence
{1/(1/n)²} = {n²} would have to converge to some L—the resulting limit. Clearly this is not
the case, in that 1/(1/n)² = n² → ∞ as n → ∞. Therefore lim_{x→0} 1/x² does not exist.
(Using Definition 4.1.1) If you evaluate the function 1/x² near zero, we would hope that
you figure out what is happening. Consider ε = 1 (remember, we only have to show that
it's bad for one particular ε) and suppose that the limit is some L ∈ R, where for the moment
we assume that L > −1. We must show that for any δ it is not the case that 0 < |x| < δ implies
|1/x² − L| < ε = 1 (or L − 1 < 1/x² < L + 1), i.e. we must show that for any δ there exists an xδ
such that 0 < |xδ | < δ and |1/xδ² − L| ≥ ε = 1.
Choose xδ = min{δ/2, 1/(2√(L + 1))}. Then xδ is such that 0 < |xδ | < δ and 0 < xδ <
1/√(L + 1). We note that 0 < xδ < 1/√(L + 1) implies that 1/xδ² > L + 1, or 1/xδ² − L ≥ 1. Thus
lim_{x→0} 1/x² cannot equal L. How did we choose this xδ that worked so well? Of course we worked
backwards—knowing that we could choose xδ small enough so that 1/xδ² would be greater
than L + 1 (and δ/2 would guarantee that it is between 0 and δ).
If L ≤ −1 (and note that this implies that −L ≥ 1), then for any δ we choose xδ = δ/2
and note that |1/xδ² − L| = 4/δ² − L ≥ 4/δ² + 1 > 1, i.e. |1/xδ² − L| ≥ 1. Thus lim_{x→0} 1/x² cannot equal
L.
And of course, if the limit cannot equal L for L > −1 and cannot equal L for L ≤ −1,
then the limit cannot exist.
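The sequential half of this argument is easy to see numerically: along an = 1/n the values f(an) = n² blow up, so they cannot converge to any real L. The sketch below is an illustration only, not part of the proof.

```python
f = lambda x: 1 / x**2

# along a_n = 1/n -> 0, f(a_n) = n^2 grows without bound,
# so {f(a_n)} cannot converge to any real L
for n in (1, 10, 100, 1000):
    assert abs(f(1 / n) - n**2) < 1e-6 * n**2   # f(1/n) is essentially n^2
print([f(1 / n) for n in (1, 10, 100, 1000)])
```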

Note that even though this limit does not exist, it is handy here to have the
"0 <" part of the restriction on x in the definition of a limit so that 1/x² need
not be defined at x = 0—otherwise we would have been done long ago.
Example 4.2.6 For f defined as f (x) = 1 if x ≥ 0 and f (x) = 0 if x < 0, prove that
lim_{x→0} f (x) does not exist.
Solution: (Using Proposition 4.1.3) The approach we use for this example is to use
two sequences—remember that Proposition 4.1.3 must hold for all sequences {an } such that
an ≠ x0 and an → x0. We first consider the sequence {1/n}. We know that 1/n ≠ 0 for any
n and 1/n → 0. It is easy to see that f (1/n) → 1 (since all of the 1/n's are positive,
f (1/n) = 1 for all n—so this is just an application of HW3.2.1-(b)). Then we consider the
sequence {−1/n}. Again we notice that the sequence satisfies the hypotheses of Proposition
4.1.3, but this time f (−1/n) → 0. Therefore lim_{x→0} f (x) does not exist.
(Using Definition 4.1.1) We approach this proof similarly to the way that we proved that
lim_{n→∞} (−1)ⁿ did not exist in Example 3.2.6. Case 1: We first guess that maybe the limit is
1. We choose ε = 1/2 (remember that we only have to show that for some ε > 0 there is no
appropriate δ). If this limit is not to be 1, we have to show that for any δ there
is an xδ such that 0 < |xδ | < δ, or −δ < xδ < δ, xδ ≠ 0, and |f (xδ ) − 1| ≥ ε = 1/2. Now
consider any δ and choose xδ = −δ/2. Then xδ satisfies |xδ | < δ and xδ ≠ 0. Since f (xδ ) = 0
(xδ < 0), |f (xδ ) − 1| = 1 ≥ ε = 1/2. Thus we know that lim_{x→0} f (x) ≠ 1.
Case 2: We next guess that the limit might be 0. We again choose ε = 1/2 and consider any
δ. This time choose xδ = δ/2. Then since |f (xδ ) − 0| = |1 − 0| = 1 ≥ ε = 1/2, lim_{x→0} f (x) ≠ 0.
Case 3: We next consider the most difficult case, and assume that lim_{x→0} f (x) = L where L
is any real number other than 1 or 0 (the two cases that we have already considered). Then
choose ε = min{|L|/2, |L − 1|/2}. For any δ choose xδ = δ/2. Then |f (xδ ) − L| = |1 − L| >
|L − 1|/2 ≥ ε. Thus lim_{x→0} f (x) ≠ L for L ≠ 1, L ≠ 0. (We could just as well have chosen
xδ = −δ/2. Draw a picture to show why our choice of ε works in this case.)
Since we have exhausted all possible limits in R, lim_{x→0} f (x) does not exist.

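Though not a proof, the two-sequence argument used in the solution above is easy to check numerically. A quick sketch in Python (the name f is ours, purely for illustration):

```python
# Numerical illustration (not a proof) of the two-sequence argument:
# evaluate the step function of Example 4.2.6 along 1/n and along -1/n.
def f(x):
    return 1 if x >= 0 else 0

right_values = [f(1 / n) for n in range(1, 6)]   # f(1/n) = 1 for every n
left_values = [f(-1 / n) for n in range(1, 6)]   # f(-1/n) = 0 for every n
```

Since the two sequences produce the constant values 1 and 0 respectively, no single L can serve as the limit.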
In the next example we will use the sine function—and we have never defined
it (but we used it earlier). We assume that your trigonometry course gave a
sufficiently rigorous definition of these functions. We proceed with our last
case of non-existence.
Example 4.2.7 Define the function f : R → R by f(x) = sin(1/x) if x ≠ 0 and f(0) = 0.
Prove that lim_{x→0} f(x) does not exist.
It is especially instructive for this example to get a plot of the function. We see on the
plot below that like the sine function −1 ≤ f (x) ≤ 1. But as x nears zero, 1/x goes through
odd multiples of π/2 (giving values ±1), multiples of π (giving values of 0) and everything
else in between—many times.
Figure 4.2.1: Plot of the function f(x) = sin(1/x) for x ≠ 0 and f(0) = 0 (shown
for −3 ≤ x ≤ 3; the values oscillate between −1 and 1).
Solution: (Using Proposition 4.1.3) Again we choose two sequences converging to
zero. We choose the sequence {aₙ} where aₙ = 1/(nπ) and the sequence {bₙ} where bₙ =
2/[(4n + 1)π]. Both of these sequences will clearly never equal 0 and both of these sequences
will converge to zero. It is easy to see that f(aₙ) = 0 for all n and f(bₙ) = 1 for all n.
Therefore, f(aₙ) → 0, f(bₙ) → 1 and lim_{x→0} f(x) does not exist.
(Using Definition 4.1.1) Case 1: L ≠ 0. For any δ we can find an x₀ of the form x₀ =
1/(n₀π) for some n₀ ∈ N that satisfies 0 < |x₀| < δ—this follows from Corollary 1.5.5–(b)
(there are many such n₀'s). Then if we suppose that the limit exists and is some L other than
0, we choose ε = |L|/2 and note that |f(x₀) − L| = |0 − L| = |L| > |L|/2 so it is impossible to
satisfy Definition 4.1.1 for any δ (so lim_{x→0} f(x) ≠ L for L ≠ 0).
Case 2: L = 0. We next suppose that the limit is 0 (it is the only value left). We choose
ε = 1/2. Then for any δ we can find an x₀ such that 0 < |x₀| < δ and x₀ =
2/[(2n₀ + 1)π] for some n₀ ∈ N (one over an odd multiple of π/2). For this value of x₀,
|f(x₀) − 0| = |±1| = 1 > 1/2 = ε. Thus again it is impossible to satisfy Definition 4.1.1 for
any δ (so lim_{x→0} f(x) ≠ 0).
Therefore, lim_{x→0} f(x) does not exist.
Note that while f defined in Example 4.2.7 is a terribly nasty function—especially
near 0—for any x₀ ≠ 0 (even very near 0), lim_{x→x₀} f(x) exists and equals sin(1/x₀).
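The two sequences used in the solution are easy to check numerically. A quick illustrative sketch (the names f, a_vals and b_vals are ours):

```python
import math

# Evaluate f(x) = sin(1/x) along the two sequences from Example 4.2.7:
# a_n = 1/(n*pi) gives sin(n*pi) = 0, while b_n = 2/((4n+1)*pi) gives
# sin((4n+1)*pi/2) = 1, so the two image sequences have different limits.
def f(x):
    return math.sin(1 / x) if x != 0 else 0

a_vals = [f(1 / (n * math.pi)) for n in range(1, 8)]
b_vals = [f(2 / ((4 * n + 1) * math.pi)) for n in range(1, 8)]
```

Up to floating-point rounding, a_vals is a list of zeros and b_vals is a list of ones, matching the proof.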
HW 4.2.1 (True or False and Why)
(a) lim_{x→0} |x| = 0.
(b) lim_{x→−2} |x| = 2.
(c) Suppose f : D → R, D ⊂ R, x₀ ∈ R. If f(x₀) is defined (x₀ ∈ D), then
lim_{x→x₀} f(x) = f(x₀).
(d) lim_{x→2} x²/(2x − 5) = −4.
(e) Consider the function defined by f(x) = 1 for x < 0, f(0) = 0, and f(x) = −1
for x > 0. Then lim_{x→0} f(x) = 0.
HW 4.2.2 Use the graphical approach to show that lim_{x→3} (2x + 3) = 9. Specifically,
find the δ₁ and δ₂ (of Figure 4.1.1), determine δ and show that it works. Explain
why δ₁ = δ₂ in this example.
HW 4.2.3 (a) Prove that lim_{x→4} 7 = 7. Show this using the graphical approach
and then prove it twice—first using Definition 4.1.1 and then using Proposition 4.1.3.
(b) Prove that for any x₀, c ∈ R, lim_{x→x₀} c = c.
HW 4.2.4 Define the function f : R → R by f(x) = x² + x + 1 if x ≠ 2 and f(2) = 12.
Prove that lim_{x→2} f(x) = 7—prove it twice, first using Definition 4.1.1 and then
using Proposition 4.1.3.
HW 4.2.5 Prove that lim_{x→3} x²/(x − 4) = −9—prove it twice, first using Definition
4.1.1 and then using Proposition 4.1.3.
4.3 Limit Theorems
We don't want to have to apply Definition 4.1.1 or Proposition 4.1.3 every
time we have a limit. As was the case with sequential limits, we shall develop
limit theorems that allow us to compute a large number of limits. You already
know most of these limit theorems from your elementary calculus class. Of
course we will now include the proofs of these theorems. And it should not
be a surprise to you that the limit theorems will look very much like the limit
theorems that we proved for limits of sequences. As with the proofs of conver-
gence of the specific limits done in Section 4.2, parts (a), (b), (d) and (f) of
Proposition 4.3.1 given below can be proved by either using the definition or
Proposition 4.1.3. Again we feel that you should see both approaches. For that
reason we will include both proofs for these parts. Since the proofs applying
Definition 4.1.1 are very similar to the proofs of the analogous results for limits
of sequences, and the proofs applying Proposition 4.1.3 are pretty easy, we will
give reasonably abbreviated versions of these proofs.
Proposition 4.3.1 Consider the functions f, g : D → R where D ⊂ R, suppose
that c, x₀ ∈ R and x₀ is a limit point of D. Suppose lim_{x→x₀} f(x) = L₁ and
lim_{x→x₀} g(x) = L₂. We then have the following results.
(a) lim_{x→x₀} (f(x) + g(x)) = lim_{x→x₀} f(x) + lim_{x→x₀} g(x) = L₁ + L₂.
(b) lim_{x→x₀} cf(x) = c lim_{x→x₀} f(x) = cL₁.
(c) There exist δ₃, K ∈ R such that for x ∈ D and 0 < |x − x₀| < δ₃,
|f(x)| < K.
(d) lim_{x→x₀} f(x)g(x) = (lim_{x→x₀} f(x))(lim_{x→x₀} g(x)) = L₁L₂.
(e) If L₂ ≠ 0, then there exist δ₄, M ∈ R such that if x ∈ D and 0 < |x − x₀| < δ₄,
then |g(x)| > M.
(f) If L₂ ≠ 0, then
    lim_{x→x₀} f(x)/g(x) = (lim_{x→x₀} f(x))/(lim_{x→x₀} g(x)) = L₁/L₂.
Proof: So that we don't have to repeat it every time, throughout this proof
let {aₙ} be any sequence such that aₙ ∈ D for all n, aₙ ≠ x₀ for any n, and
aₙ → x₀.
(a) (Using Definition 4.1.1) We suppose that we are given an ε > 0. We
apply the hypothesis lim_{x→x₀} f(x) = L₁ with ε₁ = ε/2 to get a
δ₁ such that x ∈ D, 0 < |x − x₀| < δ₁ implies that |f(x) − L₁| < ε₁ = ε/2,
and the hypothesis lim_{x→x₀} g(x) = L₂ with ε₂ = ε/2 to get a
δ₂ such that x ∈ D, 0 < |x − x₀| < δ₂ implies that |g(x) − L₂| < ε₂ = ε/2.
Then if we let δ = min{δ₁, δ₂} (Step 1: Define δ) and require that x ∈ D and
0 < |x − x₀| < δ, we have
    |(f(x) + g(x)) − (L₁ + L₂)| = |(f(x) − L₁) + (g(x) − L₂)|
        ≤ |f(x) − L₁| + |g(x) − L₂| < ε/2 + ε/2 = ε.
(Step 2: δ works.) Therefore lim_{x→x₀} (f(x) + g(x)) = L₁ + L₂.
(Using Proposition 4.1.3) We note that by Proposition 3.3.2–(a)
    lim_{n→∞} (f(aₙ) + g(aₙ)) = lim_{n→∞} f(aₙ) + lim_{n→∞} g(aₙ) = L₁ + L₂.
Since this holds true for any such sequence {aₙ}, by Proposition 4.1.3 we get
lim_{x→x₀} (f(x) + g(x)) = L₁ + L₂.
(b) (Using Definition 4.1.1) If c ≠ 0, we apply the hypothesis lim_{x→x₀} f(x) = L₁
with ε₁ = ε/|c|. Then setting δ = δ₁ will give the desired result.
If c = 0, the result is trivial since cf(x) = 0 for all x ∈ D—so it follows from
HW4.2.3-(b).
(Using Proposition 4.1.3) Since lim_{n→∞} cf(aₙ) = c lim_{n→∞} f(aₙ) by Proposition
3.3.2–(b), the result follows.
(c) (We do not give a proof of this result based on Proposition 4.1.3—it is
possible but it would not be very insightful.) Using the hypothesis lim_{x→x₀} f(x) = L₁
with ε₁ = 1, we get a δ₃ such that x ∈ D and 0 < |x − x₀| < δ₃ implies that
|f(x) − L₁| < ε₁ = 1. Then by the backwards triangular inequality, Proposition
1.5.8–(vi), we see that for all x such that x ∈ D and 0 < |x − x₀| < δ₃,
    |f(x)| − |L₁| ≤ |f(x) − L₁| < 1,
or |f(x)| < 1 + |L₁|. If we set K = 1 + |L₁|, we are done.
(d) (Using Definition 4.1.1) We suppose that we are given an ε > 0. We apply
the hypothesis lim_{x→x₀} f(x) = L₁ with ε₁ = ε/(2|L₂|) to get a
δ₁ such that x ∈ D, 0 < |x − x₀| < δ₁ implies that |f(x) − L₁| < ε₁ = ε/(2|L₂|),
and the hypothesis lim_{x→x₀} g(x) = L₂ with ε₂ = ε/(2K) to get a
δ₂ such that x ∈ D, 0 < |x − x₀| < δ₂ implies that |g(x) − L₂| < ε₂ = ε/(2K).
We set δ = min{δ₁, δ₂, δ₃} (where δ₃ follows from part (c) of this proposition)
(Step 1: Define δ) and note that if x is such that x ∈ D and 0 < |x − x₀| < δ
(x satisfies all three restrictions),
    |f(x)g(x) − L₁L₂| = |f(x)(g(x) − L₂) + L₂(f(x) − L₁)|
        ≤ |f(x)||g(x) − L₂| + |L₂||f(x) − L₁| < Kε₂ + |L₂|ε₁.
Then using the fact that ε₁ = ε/(2|L₂|) and ε₂ = ε/(2K) (Step 2: δ works), the
result follows.
Note that generally K ≠ 0—or we can always choose it to be so. Also, the choice
of ε₁ requires L₂ ≠ 0. If L₂ = 0,
the result follows by choosing ε₂ = ε/K and δ = min{δ₂, δ₃}, and noting that
    |f(x)g(x) − 0| = |f(x)||g(x)| < Kε₂ = ε
whenever x ∈ D and 0 < |x − x₀| < δ—we use the hypothesis lim_{x→x₀} f(x) = L₁
only to get K.
(Using Proposition 4.1.3) The result follows from Proposition 3.3.2–(d).
(e) (Again we do not include a proof of this result based on Proposition 4.1.3.)
Since L₂ is assumed to be nonzero, we use the hypothesis lim_{x→x₀} g(x) = L₂
with ε₂ = |L₂|/2 and obtain a δ₄ such that x ∈ D and 0 < |x − x₀| < δ₄ implies that
|g(x) − L₂| < ε₂ = |L₂|/2. We have
    |L₂| − |g(x)| ≤ |L₂ − g(x)| = |g(x) − L₂| < ε₂ = |L₂|/2
by the backwards triangular inequality, Proposition 1.5.8–(vi).
Thus when x ∈ D and 0 < |x − x₀| < δ₄, |g(x)| > |L₂| − |L₂|/2 = |L₂|/2. If we
set M = |L₂|/2, we are done.
(f) (Using Definition 4.1.1) We suppose that we are given an ε > 0 and apply
the hypothesis lim_{x→x₀} f(x) = L₁ with respect to ε₁ to get δ₁ such that x ∈ D
and 0 < |x − x₀| < δ₁ implies |f(x) − L₁| < ε₁; the hypothesis lim_{x→x₀} g(x) = L₂
with respect to ε₂ to get δ₂ such that x ∈ D and 0 < |x − x₀| < δ₂ implies
|g(x) − L₂| < ε₂; and L₂ ≠ 0 and part (e) of this proposition to get δ₄ such that
x ∈ D and 0 < |x − x₀| < δ₄ implies that |g(x)| > M. Then we set δ = min{δ₁, δ₂, δ₄} (Step
1: Define δ), require x to satisfy x ∈ D and 0 < |x − x₀| < δ and note that
    |f(x)/g(x) − L₁/L₂| = |f(x)L₂ − L₁g(x)|/|L₂g(x)| = |(f(x) − L₁)L₂ + L₁(L₂ − g(x))|/|L₂g(x)|
        ≤ (|f(x) − L₁||L₂| + |L₁||L₂ − g(x)|)/(|g(x)||L₂|) < (ε₁|L₂| + |L₁|ε₂)/(M|L₂|).
Thus we see that if we choose ε₁ as ε₁ = Mε/2 and ε₂ as ε₂ = Mε|L₂|/(2|L₁|),
then |f(x)/g(x) − L₁/L₂| < ε (Step 2: δ works) and lim_{x→x₀} f(x)/g(x) = L₁/L₂.
We have assumed that L₂ ≠ 0 and we know by part (e) of this proposition
that g(x) ≠ 0—so we can put them in the denominator. Note that the choice of ε₂
requires L₁ ≠ 0. If L₁ = 0, the result
follows by choosing δ = min{δ₁, δ₄} and ε₁ = Mε.
(Using Proposition 4.1.3) Since by Proposition 3.4.1–(b)
    lim_{n→∞} f(aₙ)/g(aₙ) = (lim_{n→∞} f(aₙ))/(lim_{n→∞} g(aₙ)) = L₁/L₂
for any such sequence {aₙ}, the result follows from Proposition 4.1.3.
Notice that in this proof we don't use part (d) of this proposition. It should
be clear that we use the hypothesis that L₂ ≠ 0 via Proposition 3.4.1–(a), which
is then used to prove part (b) of Proposition 3.4.1.
Parts (a), (b), (d) and (f) of the above proposition are basic tools used in
the calculation of limits. However, to use these tools—which are always used to
simplify a given limit to a set of easier limits—we need some easier limits. In
the next proposition we provide one of the easy limits that we need.
Proposition 4.3.2 (a) For any x₀ ∈ R, lim_{x→x₀} x = x₀.
(b) Consider f₁, ..., fₙ : D → R where D ⊂ R, x₀, L₁, ..., Lₙ ∈ R and x₀
is a limit point of D. Suppose that lim_{x→x₀} fⱼ(x) = Lⱼ, j = 1, ..., n. Then
lim_{x→x₀} f₁(x)·f₂(x)···fₙ(x) = L₁·L₂···Lₙ.
(c) Suppose x₀ ∈ R and k ∈ N. Then lim_{x→x₀} xᵏ = x₀ᵏ.
Proof: The proofs of these results are very easy. Result (a) follows by choosing
δ = ε in Definition 4.1.1. Property (b) is an elementary application of
mathematical induction using part (d) of Proposition 4.3.1. And then the result
given in part (c) follows from parts (a) and (b) of this proposition (or by applying
Proposition 4.3.1–(d) k − 1 times along with part (a) of this proposition).
We next include an inductive version of Proposition 4.3.1–(a) and use this
result—along with the other parts of Proposition 4.3.1—to compute a large class of
limits. Let p and q denote mth and nth degree polynomials, respectively, p(x) =
a₀xᵐ + a₁xᵐ⁻¹ + ··· + aₘ₋₁x + aₘ and q(x) = b₀xⁿ + b₁xⁿ⁻¹ + ··· + bₙ₋₁x + bₙ.
Proposition 4.3.3 (a) Consider f₁, ..., fₙ : D → R where D ⊂ R,
x₀, L₁, ..., Lₙ ∈ R and x₀ is a limit point of D. Suppose that lim_{x→x₀} fⱼ(x) = Lⱼ,
j = 1, ..., n. Then lim_{x→x₀} (f₁(x) + f₂(x) + ··· + fₙ(x)) = L₁ + L₂ + ··· + Lₙ.
(b) For all x₀ ∈ R, lim_{x→x₀} p(x) = p(x₀) = a₀x₀ᵐ + a₁x₀ᵐ⁻¹ + ··· + aₘ₋₁x₀ + aₘ.
(c) If x₀ ∈ R and q(x₀) ≠ 0, then
    lim_{x→x₀} p(x)/q(x) = p(x₀)/q(x₀) = (a₀x₀ᵐ + a₁x₀ᵐ⁻¹ + ··· + aₘ₋₁x₀ + aₘ)/(b₀x₀ⁿ + b₁x₀ⁿ⁻¹ + ··· + bₙ₋₁x₀ + bₙ).
Proof: As was the case with Proposition 4.3.2 the proof of this proposition is
also easy. Part (a) can be proved by applying mathematical induction along with
part (a) of Proposition 4.3.1. The result given in part (b) then follows from part
(a) of this result, Proposition 4.3.1–(b) and Proposition 4.3.2–(c). And finally,
to prove part (c) we apply the quotient rule from Proposition 4.3.1–(f) along
with part (b) of this proposition.
We are now able to compute a large class of limits. We have intentionally
skipped functions involving irrational exponents and functions of the form aˣ
(which are two other very basic "easy limits" that we use along with Proposition
4.3.1 to compute limits) because we will give a rigorous mathematical
introduction to these functions in Section 7.7—so discussing their limits at this time
would be cheating. If we returned to the examples considered in Section 4.2,
we would now be able to compute the limits considered in Examples 4.2.1–4.2.3
very easily. To compute a limit such as that considered in Example 4.2.4, we
proceed much the way we did in our elementary course and compute as follows.
    lim_{x→2} (x⁴ − 16)/(x − 2) = lim_{x→2} (x − 2)(x + 2)(x² + 4)/(x − 2)
        = lim_{x→2} (x + 2)(x² + 4)   because we know that x − 2 ≠ 0
        = 32   by Proposition 4.3.3–(b).
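A quick numerical sanity check of this computation (illustrative only; the name g is ours):

```python
# The quotient (x^4 - 16)/(x - 2) should approach 32 as x approaches 2,
# since it equals (x + 2)(x^2 + 4) for x != 2.
def g(x):
    return (x**4 - 16) / (x - 2)

approach = [g(2 + h) for h in (0.1, 0.01, 0.001, 0.0001)]
# each successive value lies closer to 32 than the one before it
```

The quotient itself is undefined at x = 2, but that is irrelevant to the limit, exactly as the "0 <" part of Definition 4.1.1 allows.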
Before we leave this section we include one more limit result—the Sandwich Theorem,
analogous to the sequential Sandwich Theorem, Proposition 3.4.2.
Proposition 4.3.4 Consider the functions f, g, h : D → R where D ⊂ R,
suppose that x₀ ∈ R and x₀ is a limit point of D. Suppose lim_{x→x₀} f(x) =
lim_{x→x₀} g(x) = L and there exists a δ₁ such that f(x) ≤ h(x) ≤ g(x) for x ∈ D
and 0 < |x − x₀| < δ₁. Then lim_{x→x₀} h(x) = L.
Proof: Suppose ε > 0 is given. Let δ₂ and δ₃ be such that x ∈ D and 0 < |x − x₀| < δ₂
implies |f(x) − L| < ε, or L − ε < f(x) < L + ε, and x ∈ D and 0 < |x − x₀| < δ₃
implies |g(x) − L| < ε, or L − ε < g(x) < L + ε. Let δ = min{δ₁, δ₂, δ₃} (Step 1: Define
δ) and suppose that x satisfies x ∈ D and 0 < |x − x₀| < δ. Then
    L − ε < f(x) ≤ h(x) ≤ g(x) < L + ε,
or L − ε < h(x) < L + ε. Thus |h(x) − L| < ε (Step 2: δ works) so lim_{x→x₀} h(x) = L.
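The Sandwich Theorem in action, with a hypothetical example of our own (not taken from the text): h(x) = x² sin(1/x) is squeezed between f(x) = −x² and g(x) = x², so lim_{x→0} h(x) = 0. A quick numerical illustration:

```python
import math

# h(x) = x^2 sin(1/x) satisfies -x^2 <= h(x) <= x^2 for all x != 0,
# so the Sandwich Theorem forces h(x) -> 0 as x -> 0.
def h(x):
    return x * x * math.sin(1 / x)

xs = [10.0 ** (-k) for k in range(1, 7)]          # sample points shrinking to 0
squeezed = all(-x * x <= h(x) <= x * x for x in xs)
values = [h(x) for x in xs]                        # shrink toward 0 with the bounds
```

Note that neither one-sided behavior of sin(1/x) matters here; the two squeezing parabolas do all the work.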
HW 4.3.1 (True or False and why)
(a) Suppose f : D → R, D ⊂ R, x₀, L ∈ R, x₀ is a limit point of D and
lim_{x→x₀} f(x) = L. Then lim_{x→x₀} |f(x)| = |L|.
(b) Suppose f : D → R, D ⊂ R, x₀, L ∈ R, x₀ is a limit point of D, lim_{x→x₀} f(x) =
L and L > 0. Then there exists a neighborhood of x₀, N(x₀), such that f(x) > 0
for all x ∈ N(x₀).
(c) Consider f, g : D → R, D ⊂ R, x₀ ∈ R and x₀ is a limit point of D. Suppose
further that lim_{x→x₀} (f(x) + g(x)) and lim_{x→x₀} g(x) exist. Then lim_{x→x₀} f(x) exists.
(d) Suppose f : D → R, D ⊂ R, x₀ ∈ R and x₀ is a limit point of D. Suppose
further that lim_{x→x₀} f(x) exists and there exists a punctured neighborhood of x₀,
N̂_δ(x₀), such that f(x) > 0 for x ∈ N̂_δ(x₀). It may be the case that lim_{x→x₀} f(x) = 0.
(e) Suppose f, g, h : D → R, D ⊂ R, x₀ ∈ R and x₀ is a limit point of D.
Suppose also that lim_{x→x₀} f(x) = lim_{x→x₀} g(x) = L and there exists a δ such that
f(x) < h(x) < g(x) for x satisfying 0 < |x − x₀| < δ. In this situation it is not
necessarily the case that lim_{x→x₀} h(x) = L.
HW 4.3.2 Prove that lim_{x→2} (x − 2)/(√x − √2) = 2√2.
HW 4.3.3 Suppose f, g : D → R, D ⊂ R, x₀ ∈ R and x₀ is a limit point of
D. Suppose further that f(x) ≤ g(x) on some punctured neighborhood of x₀,
N̂_δ(x₀), and both limits lim_{x→x₀} f(x) and lim_{x→x₀} g(x) exist. Prove that
lim_{x→x₀} f(x) ≤ lim_{x→x₀} g(x).
4.4 Limits at Infinity, Infinite Limits and One-sided Limits
Limits at Infinity For all of the limits considered so far in this chapter both
x₀ and L must be real. In this section we want to introduce limits where x₀
and/or L are ±∞. Of course we need definitions to extend the limit concepts
to these situations. If you think about it a bit, you should realize that we will
want lim_{x→∞} f(x) to be very much like the sequential limit—except the definition
will now have to allow for x values in some interval (N, ∞) rather than the
discrete points of N. Note that for convenience we define the limits at infinity
for functions defined on intervals such as (−∞, a) or (a, ∞) for some a. We
could give these definitions for domains smaller than these intervals, but we would
have to fix up the domains so that we guaranteed that ±∞ was a limit point
of D—which we haven't and don't really want to define. We begin with the
following definition.
Definition 4.4.1 For f : (a, ∞) → R for some a ∈ R and L ∈ R, we say that
lim_{x→∞} f(x) = L if for every ε > 0 there exists an N ∈ R such that x > N implies
that |f(x) − L| < ε.
Likewise, if f is defined on (−∞, a) for some a, lim_{x→−∞} f(x) = L if for every
ε > 0 there exists an N ∈ R such that x < N implies that |f(x) − L| < ε.

You probably computed some limits of this sort in your basic calculus class.
One of the common applications of limits at ±∞ is to determine asymptotes
to curves. Methods for computing limits at infinity are similar to the methods
for computing sequential limits. For example, the approach used to calculate a
limit such as lim_{x→∞} (2x² + x − 3)/(3x² + 3x + 3) is to perform the following computation.
    lim_{x→∞} (2x² + x − 3)/(3x² + 3x + 3) = lim_{x→∞} (2 + 1/x − 3/x²)/(3 + 3/x + 3/x²)
        = (lim_{x→∞} (2 + 1/x − 3/x²))/(lim_{x→∞} (3 + 3/x + 3/x²)) = 2/3.
(Compare this result with the limit evaluated at the beginning of Section 3.4.)
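Numerically the quotient settles down to 2/3 quickly; a quick illustrative check (the name r is ours):

```python
# (2x^2 + x - 3)/(3x^2 + 3x + 3) should approach 2/3 as x grows,
# since the lower-order terms become negligible next to x^2.
def r(x):
    return (2 * x**2 + x - 3) / (3 * x**2 + 3 * x + 3)

vals = [r(10.0 ** k) for k in (1, 2, 4, 6)]
# the error |r(x) - 2/3| shrinks steadily as x grows
```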
To perform the above computation we first multiplied the numerator and
denominator by 1/x² and then used "limit of a quotient is the quotient of the
limits", "limit of a sum is the sum of the limits", "limit of a constant times a
function is the constant times the limit", "limit of a constant is that constant"
and "the limit of 1/xᵏ as x goes to infinity is zero" (k ∈ N). Clearly, at present
we do not have these results, and hopefully equally clearly, these results are
completely analogous to the results proved for sequences in Propositions 3.3.2,
3.4.1, Example 3.3.1 and HW3.2.1-(b). We include these results in the following
two propositions.
Proposition 4.4.2 Suppose that f, g : (a, ∞) → R for some a ∈ R, lim_{x→∞} f(x) =
L₁, lim_{x→∞} g(x) = L₂ and c ∈ R. We have the following results.
(a) lim_{x→∞} (f(x) + g(x)) = L₁ + L₂.
(b) lim_{x→∞} cf(x) = c lim_{x→∞} f(x) = cL₁.
(c) There exist N, K ∈ R such that for x ∈ (a, ∞) and x > N, |f(x)| ≤ K.
(d) lim_{x→∞} f(x)g(x) = L₁L₂.
(e) If L₂ ≠ 0, then there exist N, M ∈ R such that |g(x)| ≥ M for all x > N.
(f) If L₂ ≠ 0, then lim_{x→∞} f(x)/g(x) = L₁/L₂.
Proposition 4.4.3 (a) For c ∈ R, lim_{x→∞} c = c.
(b) lim_{x→∞} (1/x) = 0.
(c) For k ∈ N, lim_{x→∞} (1/xᵏ) = 0.
We are not going to prove these two propositions. Their proofs are just
copies of the analogous sequential results. Likewise, we could also take some
particular examples such as lim_{x→∞} (2x + 3)/(5x + 7) = 2/5 and use Definition 4.4.1 to prove
this statement. We will not do that because such a proof would be almost
identical to the proof given in Example 3.2.3 (when we did the analogous result
for sequences). Also we should add that there are versions of Propositions 4.4.2
and 4.4.3 for the case when x approaches −∞. And finally note that we have
not mentioned a result analogous to Example 3.5.1. It is possible to prove that
for 0 < c < 1, lim_{x→∞} cˣ = 0. However, since we will wait until Section 7.7 to
define cˣ, we do not consider this limit at this time.
Infinite Limits In Example 4.2.5 we showed that lim_{x→0} (1/x²) does not exist. We
mentioned as a part of the proof that the limit wanted to go to infinity—so since
according to Definition 4.1.1 it is necessary that L ∈ R, the limit cannot exist.
We want to be able to show that the limit above does not exist in a much nicer
way than the nonexistence of the limits considered in Examples 4.2.6 and 4.2.7.
Just as we included an alternative definition for sequences converging to infinity
in Section 3.6, we want the same concept for limits of functions. Consider the
following definition.
Definition 4.4.4 (a) Suppose that f : D → R, D ⊂ R, x₀ ∈ R and x₀ is a
limit point of D. We say that lim_{x→x₀} f(x) = ∞ if for every M > 0 there exists a
δ such that x ∈ D, 0 < |x − x₀| < δ implies that f(x) > M.
(b) lim_{x→x₀} f(x) = −∞ if for every M < 0 there exists a δ such that x ∈ D,
0 < |x − x₀| < δ implies that f(x) < M.
We now return to the consideration of the example given in Example 4.2.5.
Example 4.4.1 Prove that lim_{x→0} (1/x²) = ∞.
Solution: We suppose that we are given an M > 0. We must find a δ so that 0 < |x − 0| < δ
implies that f(x) > M, i.e. we need 1/x² > M. This last inequality is equivalent to x² < 1/M.
This inequality is satisfied if |x| < 1/√M. Thus we define δ = 1/√M (Step 1: Define δ) and
suppose that 0 < |x| < δ = 1/√M. Then |x|² = x² < 1/M and 1/x² > M, which is what we
had to prove (Step 2: δ works). Therefore lim_{x→0} 1/x² = ∞.
Note that we did not consider x = 0 at all. This is a place where the "0 <" part of the
requirement is important.
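The δ found in this proof can be sanity-checked numerically; the particular M and sample points below are arbitrary choices of ours:

```python
import math

# For any M > 0, delta = 1/sqrt(M) should force 1/x^2 > M
# whenever 0 < |x| < delta, exactly as in Example 4.4.1.
M = 1000.0
delta = 1 / math.sqrt(M)
test_points = [t * delta for t in (0.9, 0.5, 0.1, -0.5, -0.9)]

in_range = all(0 < abs(x) < delta for x in test_points)    # points satisfy 0 < |x| < delta
exceeds_M = all(1 / x**2 > M for x in test_points)         # and each gives 1/x^2 > M
```

A check like this cannot replace the proof (it samples only finitely many points), but it is a useful way to catch an algebra slip in the choice of δ.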
If we wanted to, we could now prove some theorems pertaining to infinite
limits. This surely would be overkill. However, we should be aware that it is
possible to obtain all of the results analogous to those in Proposition 3.6.2. And
finally, we should realize that we could also define infinite limits as x approaches
either positive or negative infinity, e.g. lim_{x→−∞} (x² + 1) = ∞. Hopefully you
are now capable of piecing together the definitions given above to obtain the necessary
definition for such limits—if they are needed. Limits such as this last one are
not common.
One-sided Limits If you feel that this topic does not fit particularly well with
the other two topics in the section, you are right—we had to find a place to
put it. Quite literally there are times when instead of approaching the limiting
point x0 from either side, we want to only consider points to the right or left of
x0 . We did this when we considered limits at ±∞ but in that case there were
no points "on the other side." We have three easy approaches to this idea—we
shall do all three.
We begin by considering f : D → R where D ⊂ R and x0 ∈ R (where we
always keep in mind the most common case where D = [a, b]). We define two
new functions f + : D+ → R where D+ = D ∩ (x0 , ∞) and f − : D− → R where
D− = D ∩ (−∞, x0 ). We make the following definition.
Definition 4.4.5 (a) If x₀ is a limit point of D⁺, we define lim_{x→x₀+} f(x) = lim_{x→x₀} f⁺(x).
We refer to this limit as the limit of f as x approaches x₀ from the right or the
right hand limit of f at x₀.
(b) If x₀ is a limit point of D⁻, we define lim_{x→x₀−} f(x) = lim_{x→x₀} f⁻(x).
We refer to this limit as the limit of f as x approaches x₀ from the left or the
left hand limit of f at x₀.
We should note that the functions f + and f − are just copies of f to the right
and the left of x0 , respectively—hence using f + and f − we get the right and left
hand limits of f , respectively. The fact that Definition 4.1.1 is a very general
definition of a limit allows us to easily define the right and left hand limits. Also
notice that it is still a requirement that x0 is a limit point of D+ and D− —this
is to guarantee that we have enough points of D on either side of x0 to allow us
to apply Definition 4.1.1. Note that if x₀ is a limit point of either D⁺ or D⁻,
then x₀ will also be a limit point of D. However, there are sets D for which x₀ is
a limit point of D but x₀ is not a limit point of both D⁺ and D⁻—consider
x₀ = 1 and D = (0, 1).
Before we look at some results concerning right and left hand limits, we
include the more common definition in the following result.
Proposition 4.4.6 Suppose that f : D → R where D ⊂ R and x₀, L ∈ R.
(a) Suppose that x₀ is a limit point of D ∩ (x₀, ∞). Then lim_{x→x₀+} f(x) = L if
and only if for every ε > 0 there exists a δ such that x ∈ D and 0 < x − x₀ < δ
implies that |f(x) − L| < ε.
(b) Suppose that x₀ is a limit point of D ∩ (−∞, x₀). Then lim_{x→x₀−} f(x) = L if
and only if for every ε > 0 there exists a δ such that x ∈ D and 0 < x₀ − x < δ
implies that |f(x) − L| < ε.
Proof: (a) (⇒) We begin by assuming that lim_{x→x₀+} f(x) = L, i.e. lim_{x→x₀} f⁺(x) =
L. This means that for every ε > 0 there exists a δ such that x ∈ D⁺ and 0 <
|x − x₀| < δ implies that |f⁺(x) − L| < ε. We note that if x ∈ D⁺ = D ∩ (x₀, ∞),
then |x − x₀| = x − x₀, so 0 < |x − x₀| < δ is equivalent to 0 < x − x₀ < δ. Also,
note that if x ∈ D⁺, then x ∈ D also. And finally, for x ∈ D⁺, f⁺(x) = f(x).
Thus for the δ given we see that x ∈ D and 0 < x − x₀ < δ implies
|f(x) − L| < ε.
(⇐) We will skip the proof of this direction because it is so similar to the proof
given for part (b).
(b) (⇒) We will skip the proof of this direction because it is so similar to the
proof given for part (a).
(⇐) We suppose that for every ε > 0 there exists a δ so that x ∈ D and
0 < x₀ − x < δ implies |f(x) − L| < ε. Note that x ∈ D and 0 < x₀ − x < δ
is equivalent to x ∈ D⁻ and 0 < |x − x₀| < δ. Also, if x ∈ D and 0 < x₀ − x < δ,
then f(x) = f⁻(x). For x ∈ D⁻ and 0 < x₀ − x = |x − x₀| < δ we have
|f⁻(x) − L| < ε. Thus lim_{x→x₀} f⁻(x) = L, or lim_{x→x₀−} f(x) = L.
The way that we apply Proposition 4.4.6 in a one-sided limit proof is very
similar to the way that we applied Definition 4.1.1—except that we now only
need to consider points on one side of x0 .
The third characterization of one-sided limits should be very familiar to us.
In Proposition 4.1.3 we gave a sequential characterization of limits—we can do
the same thing for one-sided limits. We state the following proposition.
Proposition 4.4.7 Suppose that f : D → R, D ⊂ R, x₀, L ∈ R.
(a) Suppose that x₀ is a limit point of D ∩ (x₀, ∞). Then lim_{x→x₀+} f(x) = L if
and only if for any sequence {aₙ} such that aₙ ∈ D for all n, aₙ > x₀ for all
n, and lim_{n→∞} aₙ = x₀, we have lim_{n→∞} f(aₙ) = L.
(b) Suppose that x₀ is a limit point of D ∩ (−∞, x₀). Then lim_{x→x₀−} f(x) = L if
and only if for any sequence {aₙ} such that aₙ ∈ D for all n, aₙ < x₀ for all
n, and lim_{n→∞} aₙ = x₀, we have lim_{n→∞} f(aₙ) = L.
Proof: We will skip this proof because it is so much like the proof of Proposition
4.1.3. For the (⇒) direction of part (a), for a given ε > 0 the right hand limit
hypothesis gives a δ; this δ, used as the "ε" in the assumption that aₙ → x₀
(and it works because we have assumed that aₙ > x₀), gives an N which is the
N that we need to prove that f(aₙ) → L as n → ∞. The (⇒) direction of part
(b) is essentially the same.
To prove the (⇐) directions we again assume false and use this assumption
to create a sequence {aₙ} that contradicts our hypothesis—because the one-sided
limit is used in our contradiction assumption, the sequence will be either greater
than or less than x₀.
We emphasize that when we want to prove things about one-sided limits, we
will generally use Propositions 4.4.6 and 4.4.7. We used Definition 4.4.5 as our
definition of one-sided limits to emphasize the "one-sidedness" of the functions
when we consider one-sided limits.
In Example 4.2.2 we proved that lim_{x→2} x² = 4. It is very easy to show that
lim_{x→2+} x² = 4 and lim_{x→2−} x² = 4. If we use Proposition 4.4.6 we can choose
δ = min{1, ε/5} (the same δ used in Example 4.2.2). If we apply Proposition
4.4.7 to prove that lim_{x→2+} x² = 4, we use a sequence aₙ → 2 with aₙ > 2 and
the fact that lim_{n→∞} aₙ² = (lim_{n→∞} aₙ)(lim_{n→∞} aₙ) = 2·2 = 4. And of course the
application of Proposition 4.4.7 to lim_{x→2−} x² is similar except that this time we
assume that the sequence satisfies aₙ < 2. We do not try to apply Definition
4.4.5 directly—either of Propositions 4.4.6 and 4.4.7 is a much cleaner way to
prove one-sided limits.
If f(x) = 1 for x ≥ 0 and f(x) = 0 for x < 0, we showed in Example 4.2.6 that
lim_{x→0} f(x) does not exist. It is very easy to show that lim_{x→0+} f(x) = 1 and
lim_{x→0−} f(x) = 0. If we
were to apply Proposition 4.4.6, we can choose δ = 1 (or anything else) for both
of them. If we apply Proposition 4.4.7, the results follow because f(aₙ) = 1 if
aₙ > 0 and f(aₙ) = 0 if aₙ < 0.
In Example 4.2.7 we showed that lim_{x→0} f(x) does not exist when f(x) =
sin(1/x) for x ≠ 0 and f(0) = 0. The easiest way to show that lim_{x→0+} f(x) does not exist is
to use Proposition 4.4.7 with the two different sequences aₙ = 1/(nπ) and bₙ =
2/[(4n + 1)π]—where just as we did in Example 4.2.7 we find that f(aₙ) → 0 and
f(bₙ) → 1. To show that lim_{x→0−} f(x) does not exist we again apply Proposition
4.4.7, this time with the sequences {−aₙ} and {−bₙ}. The fact that these
one-sided limits do not exist should be clear by looking at Figure 4.2.1.
We do not define one-sided infinite limits but it should be clear that it is
possible to define such limits. In Example 4.2.5 we showed that lim_{x→0} 1/x² does not
exist in R, and then in Example 4.4.1 we showed that lim_{x→0} 1/x² = ∞. If we are
given M > 0 and choose δ = 1/√M—just as we did in Example 4.4.1—we can
show that 0 < x < δ = 1/√M implies that 1/x² > M. Thus lim_{x→0+} 1/x² = ∞. The
same δ can be used to prove that lim_{x→0−} 1/x² = ∞ also.
The limit lim_{x→0} 1/x is slightly different. This limit does not exist either by
Definition 4.1.1 (i.e. not in R) or by Definition 4.4.4. However, it is easy to
show that lim_{x→0+} 1/x = ∞ and lim_{x→0−} 1/x = −∞. Proof of the right hand limit is as
follows. Suppose M > 0 is given. Choose δ = 1/M. Then if 0 < x < δ = 1/M,
1/x > 1/δ = M. Therefore lim_{x→0+} 1/x = ∞. The proof of the left hand limit is
similar.
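A quick numerical look at 1/x from each side of 0 (the sample points are our own):

```python
# From the right of 0, 1/x blows up toward +infinity;
# from the left, it is unboundedly negative.
right_vals = [1 / x for x in (0.1, 0.01, 0.001, 0.0001)]
left_vals = [1 / x for x in (-0.1, -0.01, -0.001, -0.0001)]
```

The two one-sided behaviors differ, which is why no single value—finite or infinite—works for the two-sided limit.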
Before we leave this topic we want to include one important result. We
notice that when we considered the left and right hand limits of f(x) = x² at
x₀ = 2, we got 4—the same value as lim_{x→2} x². When we considered the functions
    f(x) = 1 if x ≥ 0, f(x) = 0 if x < 0,   and   f(x) = sin(1/x) if x ≠ 0, f(0) = 0,
for both of which the limit lim_{x→0} f(x) did not exist, we see that in one case both
one-sided limits exist but are different and in the other case neither of the one-sided
limits exists. These examples pretty much illustrate all possibilities of the
following theorem.

Proposition 4.4.8 Consider the function f : D → R where D, R ⊂ R, suppose


that L, x0 ∈ R and x0 is a limit point of both D ∩ (x0 , ∞) and D ∩ (−∞, x0 ).
Then lim f (x) exists if and only if both lim f (x) and lim f (x) exist and
x→x0 x→x0 + x→x0 −
are equal.
Proof: (⇒) We assume that lim_{x→x₀} f(x) exists, i.e. for every ε > 0 there exists
a δ such that x ∈ D and 0 < |x − x₀| < δ implies that |f(x) − L| < ε for some
L ∈ R. Note that 0 < |x − x₀| < δ implies that 0 < x − x₀ < δ or 0 < x₀ − x < δ.
We assumed that x₀ is a limit point of D ∩ (x₀, ∞) and we know that we have
a δ such that x ∈ D and 0 < x − x₀ < δ implies that |f(x) − L| < ε. Thus by
Proposition 4.4.6-(a), lim_{x→x₀+} f(x) = L. Also, x₀ is a limit point of D ∩ (−∞, x₀)
and we have a δ such that x ∈ D and 0 < x₀ − x < δ implies that |f(x) − L| < ε.
Thus by Proposition 4.4.6-(b), lim_{x→x₀−} f(x) = L.

(⇐) Suppose  > 0 is given. We suppose that lim f (x) = L and lim f (x) =
x→x0 + x→x0 −
L. Then there exists a δ1 and δ2 such that if x satisfies either 0 < x − x0 < δ1
or 0 < x0 − x < δ2 implies that |f (x) − L| < . Let δ = min{δ1 , δ2 } (Step 1:
Define δ). Then if x satisfies 0 < |x − x0 | < δ, x satisfies 0 < x − x0 < δ or
0 < x0 − x < δ. So x satifies 0 < x − x0 < δ ≤ δ1 or 0 < x0 − x < δ ≤ δ2 . Thus
|f (x) − L| <  (Step 2: δ works) and lim f (x) = L.
x→∞
There are times, when we want to prove a particular limit, that the above theorem is very useful. We can handle the function on each side of x0 separately, get the same one-sided limits and hence prove our limit result. The theorem is also a very useful tool for proving that a particular limit does not exist: compute both one-sided limits, and if either one does not exist or they are not equal, the limit does not exist.
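Although the text develops everything by hand, the dichotomy above is easy to see numerically. A minimal Python sketch (our own illustration, not part of the text's formal development) samples the step function from each side of x0 = 0:

```python
# Numerical sketch (not a proof): sample the step function
# f(x) = 1 for x >= 0, 0 for x < 0, from each side of x0 = 0.

def f(x):
    return 1.0 if x >= 0 else 0.0

right = [f(10.0 ** -k) for k in range(1, 8)]   # x -> 0+
left = [f(-10.0 ** -k) for k in range(1, 8)]   # x -> 0-

print(right)  # all 1.0: the right-hand limit is 1
print(left)   # all 0.0: the left-hand limit is 0
# The one-sided limits exist but differ, so lim_{x->0} f(x) does not exist.
```

The samples only suggest the one-sided limits; the proof that they differ is the ε-δ argument of the text.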

HW 4.4.1 (True or False and why)
(a) If x0 is a limit point of D ⊂ R, then x0 is a limit point of D+ = D ∩ (x0, ∞) and D− = D ∩ (−∞, x0).
(b) Suppose f : [0, 1) → R and that lim_{x→1−} f(x) exists. Then lim_{x→1+} f(x) exists also.
(c) Suppose f : [0, 1] → R and that lim_{x→1−} f(x) exists. Then lim_{x→1+} f(x) exists also.
(d) Suppose f : [0, 1) → R and that lim_{x→1−} f(x) exists. Then lim_{x→1} f(x) exists.
(e) Suppose that f(x) = 1/x^4 and g(x) = x. We know that lim_{x→0} f(x) = ∞ and lim_{x→0} g(x) = 0. Then lim_{x→0} f(x)g(x) = 0 · ∞ = 0.

HW 4.4.2 (a) Prove that lim_{x→0} 1/x^4 = ∞. (b) Prove that lim_{x→∞} 1/x^4 = 0.
(c) Prove that lim_{x→∞} sin x does not exist. (d) Prove that lim_{x→∞} x/(2x − 1) = 1/2.

HW 4.4.3 Suppose that f(x) = { x^2 if x ≠ 3, 14 if x = 3 }. Prove that lim_{x→3+} f(x) = lim_{x→3−} f(x) = 9.

HW 4.4.4 Suppose that f(x) = { sin(1/x) if x > 0, 0 if x < 0, 12 if x = 0 }. Compute lim_{x→0+} f(x) and lim_{x→0−} f(x) if they exist. If they do not exist, prove that they do not exist.
Chapter 5

Continuity

5.1 An Introduction to Continuity


In the preceding chapters we have been building basics and tools. In this
chapter we introduce the concept of a continuous function and results related
to continuous functions. The class of continuous functions is a very important
set of functions in many areas of mathematics. Also there are a lot of very nice
and useful properties of continuous functions. We begin with the definition of
continuity.
Definition 5.1.1 Consider a function f : D → R where D ⊂ R and a point x0 ∈ D. The function f is continuous at x0 if for every ε > 0 there exists a δ such that x ∈ D and |x − x0| < δ imply that |f(x) − f(x0)| < ε.
If the function f is continuous at x for all x ∈ D, then f is said to be continuous
on D.
If the function f is not continuous at a point x = x0 , then f is said to be
discontinuous at x = x0 .
In the last chapter we studied what it meant for a function to have a limit at a
point. Often the definition of continuity is given in terms of limits—especially
in the elementary calculus texts. We state the following proposition.
Proposition 5.1.2 Consider a function f : D → R where D ⊂ R and a point
x0 ∈ D. Suppose that x0 is a limit point of the set D. If lim_{x→x0} f(x) = f(x0), then the function f is continuous at x = x0.
If x0 ∈ D and x0 is a limit point of D but lim_{x→x0} f(x) does not exist, or exists but does not equal f(x0), then f is not continuous at x = x0.

Proof: This proof is very easy. Let ε > 0 be given. If we apply the definition of lim_{x→x0} f(x) = f(x0) we get a δ such that 0 < |x − x0| < δ implies that |f(x) − f(x0)| < ε. But this is almost what we need to satisfy Definition 5.1.1. We need to get rid of the "0 <" requirement. But when x = x0, we know that |f(x) − f(x0)| = 0 < ε, so the "0 <" part of the restriction on x is completely unnecessary. Therefore f is continuous at x = x0.
If x0 ∈ D is a limit point of D and it is not the case that lim_{x→x0} f(x) = f(x0) (either the limit does not exist or it exists but equals something other than f(x0)), then there is some ε0 so that for any δ there is an xδ ∈ D such that 0 < |xδ − x0| < δ and |f(xδ) − f(x0)| ≥ ε0. This negates Definition 5.1.1, so f is not continuous at x = x0.
We want to emphasize that this result is not an "if and only if" result, i.e. it does not provide us with a statement that is equivalent to the definition of continuity—but as we shall see later, it is close. Some texts use this as their definition—usually only basic calculus texts. The function that we considered in HW 4.1.1 shows that the hypotheses given in Proposition 5.1.2 are not equivalent to Definition 5.1.1. In HW 4.1.1-(a) we considered the domain D = [0, 1] ∪ {2} and the function f : D → R, f(x) = x^2. The True-False question was whether lim_{x→2} f(x) exists and equals 4. Of course the answer is False because for that limit to exist, 2 must be a limit point of D. However, f is continuous at x = 2—if we set δ = 1/2, Definition 5.1.1 is satisfied. It is not a requirement for continuity at x = x0 that lim_{x→x0} f(x) exist. A function is continuous at isolated points of its domain—the limit of the function will not exist at those points. This is not a terribly important distinction for our work. Clearly Proposition 5.1.2 can be used to prove continuity at points of the domain that are limit points of the domain—which in a text at this level is most of them.
In Example 4.2.1 we showed that lim_{x→3} (2x + 3) = 9. If we set f1(x) = 2x + 3 and choose the domain to be R (a reasonable domain for that function), we note that f1 is defined at x = 3 and f1(3) = 9. Thus by Proposition 5.1.2 the function f1 is continuous at x = 3. It would be equally easy to mimic the work done in Example 4.2.1 (omitting the "0 <" part) to show that f1 satisfies Definition 5.1.1 at x = 3. (Choose δ = ε/2. Then for |x − 3| < δ, |f1(x) − 9| = |(2x + 3) − 9| = 2|x − 3| < 2δ = ε.) It is equally easy to see—using either Definition 5.1.1 or Proposition 5.1.2 (or we could use Proposition 4.3.3-(b) along with Proposition 5.1.2)—that the function f1 is continuous at x = x0 for any x0 ∈ R. Thus f1 is continuous on R.
Likewise, we showed in Example 4.2.2 that the function f2(x) = x^2 is continuous at x = 2 (given that we define f2 on some reasonable domain such as D = R) and in Example 4.2.3 that the function f3(x) = (x − 2)/(x + 3) is continuous at x = −2. Note that in the case of the function f3, the largest (most logical) domain that we can choose on which f3 will be continuous is the set D3 = {x : x ∈ R and x ≠ −3}.
Recall that in the work done in Examples 4.2.1–4.2.3, the "0 <" part of the definition of a limit was not relevant. We noted that in each case |f(x) − L| < ε when x = x0 also—in fact in each of these cases |f(x) − L| = 0 when x = x0. This is exactly why f1, f2 and f3 are continuous at x = 3, x = 2 and x = −2, respectively.

If we consider f2 and f3 at arbitrary points of D2 and D3 , respectively, we


can use either Definition 5.1.1, or Proposition 4.3.3 and Proposition 5.1.2 to
show that f2 is continuous on D2 and f3 is continuous on D3 . Hopefully it is
obvious that f3 is not continuous at x = −3. A function cannot be continuous
at a point that is not in the domain of the function.
In Example 4.2.4 we showed that lim_{x→4} (x^3 − 64)/(x − 4) = 48. However, if we define f4(x) = (x^3 − 64)/(x − 4) and allow the domain to be what is sort of f4's natural domain, D4 = {x : x ∈ R and x ≠ 4}, then f4 is surely not continuous at x = 4 since f4 is not defined at x = 4. If we were to define a new function f8 so that f8(x) = f4(x) for all x ∈ R, x ≠ 4, and define f8(4) = 48, then the domain of f8 would be D8 = R, f8 would be continuous at x = 4 and f8 would be continuous on all of R—use Proposition 5.1.2.
And finally, we showed in Examples 4.2.5, 4.2.6 and 4.2.7 that the functions

f5(x) = 1/x^2,   f6(x) = { 1 if x ≥ 0, 0 if x < 0 }   and   f7(x) = { sin(1/x) if x ≠ 0, 0 if x = 0 }

are not continuous at x = 0. This can be seen by the last part of Proposition 5.1.2 because 0 is a limit point of the domain of each of these functions and the limit as x approaches 0 of each of these functions does not exist. In the case of f5 it is even easier. If a function isn't defined at a particular point, there is no way that the function can be continuous at that point.
A Graphical Example In Figure 5.1.1 we plot a function whose domain is assumed to be, approximately, the set of points above which a graph is drawn. We make the following claims.

Figure 5.1.1: Plot of a function continuous at xA , discontinuous at xB –xF . A


small open circle denotes a point at which the function is not defined. A small
filled circle denotes one point of definition.

• At point xA, though the graph has a "corner" at that point, the function is continuous at that point. (Generally, a function is continuous at well-defined corners.)

• At point xB, the function is defined at x = xB but will not be continuous at xB. Though lim_{x→xB} f(x) exists, there is no way that lim_{x→xB} f(x) will equal f(xB). When x is near xB, f(x) is not near f(xB).
• The function is not continuous at points xC, xD and xE. These points are similar to the point x = 0 considered in Example 4.2.6, and the proof that the limit does not exist at points xC and xD would be very similar to the argument used in Example 4.2.6. We want to emphasize that each of these points represents a jump in the function. The points xC and xD are points where the function is defined on the left and right side of the jump, respectively. At the point xE the function is not continuous because it has a jump at that point and it is not defined at the jump point. A function cannot be continuous at a point at which it is not defined.
• The function is not continuous at point xF. Even though the function is nicely behaved on both sides of the point xF, the function must be defined at a point to be continuous at that point. Note that lim_{x→xF} f(x) exists, and if we were to define f at the point xF to be lim_{x→xF} f(x), then the function f would be continuous at x = xF.
Before we leave this section we include one of the basic continuity theorems.
We see that this result is the continuity analog to the limit result, Proposition
4.1.3.
Proposition 5.1.3 Suppose that f : D → R, D ⊂ R and x0 ∈ D. Then f is continuous at x = x0 if and only if for every sequence {an} such that an ∈ D for all n and lim_{n→∞} an = x0, we have lim_{n→∞} f(an) = f(x0).

Proof: Before we begin let's look at some of the differences between the above statement and that given in Proposition 4.1.3. Because we now assume that x0 ∈ D and because we no longer have the "0 <" restriction on the range of x, we now do not require that an ≠ x0. In addition, in the above statement we no longer require that x0 be a limit point of D. We know that when we consider the continuity of a function, it is permissible to have isolated points in D and the function will always be continuous at those isolated points. Despite these differences the proof of this result is essentially identical to that of Proposition 4.1.3.
(⇒) We are assuming that f is continuous at x0 ∈ D. We consider any sequence {an} where an ∈ D and an → x0. We suppose that we are given an ε > 0. The continuity of f at x = x0 implies that there exists a δ such that |x − x0| < δ implies that |f(x) − f(x0)| < ε. If we apply the definition of the fact that an → x0 with the traditional "ε" replaced by δ, we get an N such that n > N implies that |an − x0| < δ. The statement that f is continuous at x = x0 then implies that for n > N, |f(an) − f(x0)| < ε. Thus f(an) → f(x0).
(⇐) We suppose that f is not continuous at x0, i.e. for some ε0 > 0, for any δ there exists an xδ ∈ D such that |xδ − x0| < δ and |f(xδ) − f(x0)| ≥ ε0.

Let δ = 1 (it's true for any δ) and get an x1 ∈ D such that |x1 − x0| < 1 and |f(x1) − f(x0)| ≥ ε0.
Let δ = 1/2 and get an x2 ∈ D such that |x2 − x0| < 1/2 and |f(x2) − f(x0)| ≥ ε0.
And in general,
let δ = 1/n and get an xn ∈ D such that |xn − x0| < 1/n and |f(xn) − f(x0)| ≥ ε0.
Thus we have a sequence {xn} such that xn ∈ D for all n, xn → x0 and f(xn) does not converge to f(x0). This is a contradiction, so f is continuous at x = x0.
We should be mildly concerned that the proofs of Propositions 4.1.3 and
5.1.3 are the same—we pointed out the differences between the two results.
This can be explained fairly easily if we consider two separate cases: when x0
is a limit point of D and when it is not. For the first case we can consider
the continuity in terms of limits so it shouldn’t be surprising that the proof of
Proposition 5.1.3 is very much like the proof of Proposition 4.1.3. When x0 is
not a limit point of D both the continuity and the sequential statement are very
easy—so this case is included trivially.
There are some texts that use the right hand side of Proposition 5.1.3 as the definition of continuity—since the proposition is an "if and only if" result, this is completely permissible. The definition of continuity given in Definition 5.1.1 is the most common definition. Of course there are many times that the sequential characterization of continuity is very useful. We feel that you must be comfortable with using both Definition 5.1.1 and Proposition 5.1.3 (just as we tried to force you to work with both Definition 4.1.1 and Proposition 4.1.3). Specifically, as was the case with limits, when we want to show that a function is not continuous at a given point, it is usually easier to apply Proposition 5.1.3—providing a sequence {an} for which {f(an)} does not converge or converges to something different from f(x0). When push comes to shove, we will use whichever characterization is best at the time.
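The sequential route to disproving continuity can be illustrated concretely. The Python sketch below (our own illustration, not the text's) applies the test of Proposition 5.1.3 to f(x) = sin(1/x) for x ≠ 0, f(0) = 0, with the sequence a_n = 1/(2πn + π/2):

```python
import math

# Sequential test (Proposition 5.1.3): a_n = 1/(2*pi*n + pi/2) -> 0,
# but f(a_n) = sin(2*pi*n + pi/2) = 1 for every n, so f(a_n) -> 1,
# which is not f(0) = 0; hence f is not continuous at 0.

def f(x):
    return math.sin(1.0 / x) if x != 0 else 0.0

a = [1.0 / (2 * math.pi * n + math.pi / 2) for n in range(1, 8)]
values = [f(x) for x in a]
print(values)  # each value is (numerically) 1.0
```

One convergent sequence with the wrong image limit is all Proposition 5.1.3 requires.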

HW 5.1.1 (True or False and why)


(a) Suppose f : [0, 1) → R is defined as f(x) = x^2. We know that lim_{x→1} f(x) = 1. Then the function f is continuous at x = 1.
(b) Suppose f : N → R defines a sequence, i.e. f(n) = an. The function f is continuous at all n ∈ N.
(c) Consider the function f : D = [0, 1] ∩ Q → R defined by f(x) = x^2. Then f is continuous on D.
(d) Consider the function f : D = [0, 2) ∪ (2, 13) → R where f(x) = x^2. Then f is continuous at x = 2.
(e) Consider the function f(x) = { sin(1/x) if x ≠ 0, 0 if x = 0 }. Since both an = 1/(nπ) → 0 and f(an) → 0 = f(0) as n → ∞, f is continuous at x = 0.
(f) Consider the function f(x) = (x^2 − 1)/(x − 1). If we let an = 1 + 1/n and bn = 1 − 1/n, we see that f(an) → 2 and f(bn) → 2. Thus f is continuous at x = 1.

5.2 Some Examples of Continuity Proofs


In this section we include an assortment of proofs of continuity. Hopefully after
our work with limits, you are getting to be pretty good at these. We felt that
you should see some. In general we can use Definition 5.1.1 or Propositions
5.1.2 and 5.1.3—whichever appears to be best at the time. In this section we
will use a variety of methods so that you get a taste of each of the above results
(not necessarily what is best at the time).
We begin with an example where we use the definition for the specific case
and Proposition 5.1.2 for the general case.
Example 5.2.1 Consider the function f(x) = (x^2 − 1)/(x + 3) on the domain D = {x ∈ R : x ≠ −3}.
(a) Prove that f is continuous at x = 3.
(b) Prove that f is continuous at x = x0 for x0 ∈ D.
Solution: (a) We begin by assuming that we are given an ε > 0. We must find a δ so that |x − 3| < δ implies that

|f(x) − f(3)| = |(x^2 − 1)/(x + 3) − 4/3| = |3x^2 − 4x − 15| / (3|x + 3|) = |3x + 5||x − 3| / (3|x + 3|) < ε.
Note the |x − 3| term in the numerator. As was the case with limits, it will always be there. As we did with the limit proofs, we must bound the term |3x + 5| / (3|x + 3|). We begin as we did with the limit problems and choose δ1 = 1 and restrict x so that |x − 3| < δ1 = 1. Then
|x − 3| < 1 ⇒ 2 < x < 4 ⇒ 6 < 3x < 12 ⇒ 11 < 3x + 5 < 17, so |3x + 5| < 17.
Likewise
|x − 3| < 1 ⇒ 2 < x < 4 ⇒ 5 < x + 3 < 7 ⇒ |x + 3| > 5, so 3|x + 3| > 15.
Therefore, if |x − 3| < δ1 = 1, then |3x + 5| / (3|x + 3|) < 17/15 and

|f(x) − f(3)| = |3x + 5||x − 3| / (3|x + 3|) < (17/15)|x − 3|.

Define δ = min{1, (15/17)ε} (there is still Step 1: Define δ) and require that x satisfy |x − 3| < δ. Then

|f(x) − f(3)| = |3x + 5||x − 3| / (3|x + 3|) <* (17/15)|x − 3| <** (17/15)(15/17)ε = ε

(and Step 2: δ works), where the "<*" inequality is due to the fact that |x − 3| < δ ≤ 1 and the "<**" inequality is due to the fact that |x − 3| < δ ≤ (15/17)ε. Therefore the function f is continuous at x = 3.
This result could have been proved using either Proposition 5.1.2 or 5.1.3—and using either of these would be easier than the above proof.
(b) Originally (in the preparation of the text) we used Definition 5.1.1 to prove the continuity of f at x0 ∈ D. It was good because it showed that it could be done, but it was brutal—so we took it out. Just believe us that it is possible! Probably the easiest way to prove continuity at x0 ∈ D is to use Proposition 5.1.2. It should not be hard to see that any x0 ∈ D is a limit point of D. Then by Proposition 4.3.1, parts (a), (b), (d) and (f), we see that

lim_{x→x0} f(x) = lim_{x→x0} (x^2 − 1)/(x + 3) = (x0^2 − 1)/(x0 + 3) = f(x0).

Therefore f is continuous at any x0 ∈ D.
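As a sanity check (not a substitute for the proof), the δ found in part (a) can be tested by sampling. The function f and the grid below are ours; for each ε we verify that every sampled x with |x − 3| < δ keeps |f(x) − f(3)| below ε:

```python
# Grid check of Example 5.2.1(a): f(x) = (x^2 - 1)/(x + 3), f(3) = 4/3,
# delta = min{1, (15/17) eps}. For each eps, record the worst sampled
# error over |x - 3| < delta.

def f(x):
    return (x * x - 1.0) / (x + 3.0)

worsts = {}
for eps in (1.0, 0.1, 0.01):
    delta = min(1.0, (15.0 / 17.0) * eps)
    xs = [3.0 - delta + 2 * delta * k / 1000.0 for k in range(1, 1000)]
    worsts[eps] = max(abs(f(x) - 4.0 / 3.0) for x in xs)

print(worsts)  # each worst-case sampled error is below the corresponding eps
```

A grid can never prove an ε-δ claim, but a failure here would immediately expose a wrong δ.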

We next consider the absolute value function at x = 0. Recall that the graph of the absolute value function has a corner at x = 0 (like point xA on the graph of f in Figure 5.1.1). Functions are continuous at corners of the graph.
Example 5.2.2 Show that the function f(x) = |x| is continuous at x = 0.
Solution: Note that f is defined on R (which we will assume to be the domain of f). Clearly x = 0 is a limit point of the domain. Since |x| = x for x ≥ 0 and |x| = −x for x < 0, and lim_{x→0+} x = 0 = lim_{x→0−} (−x), Proposition 4.4.8 gives lim_{x→0} |x| = 0 = |0|. Then by Proposition 5.1.2, | · | is continuous at x = 0.
It should be clear that the absolute value function is also continuous at all other points of R.

We next prove the continuity of the sine and cosine functions—first at θ = 0


and then for general θ. The continuity of the sine and cosine functions can then
be used to prove the continuity of the remaining trigonometric functions at the
points where these functions are defined.
Example 5.2.3 (a) Show that for sufficiently small θ
−|θ| ≤ sin θ ≤ |θ| and − |θ| ≤ 1 − cos θ ≤ |θ|.
(b) Prove that sine function is continuous at θ = 0.
(c) Prove that cosine function is continuous at θ = 0.
(d) Show that the sine and cosine functions are continuous on R.
Solution: (a) We consider the picture given in Figure 5.2.1. We begin by noting that most of this argument will be true for more than "sufficiently small" θ. However, as we shall see, we only need the result for small θ and do not want to have to worry about what happens when θ gets larger than π/2 or smaller than −π/2 (we can always restrict θ in the limit proof so that |θ| < δ1 = 1).

Figure 5.2.1: The circle has radius 1, A and P are on the circumference of the
circle and P Q is perpendicular to OA

We see that |AP| ≤ |θ| (equality when θ = 0)—where the absolute value signs are included to allow for a negative angle θ. From triangle △OQP we see that |QP| = |sin θ| and |OQ| = cos θ—which gives |AQ| = 1 − cos θ. Then applying the Pythagorean Theorem to triangle △AQP we see that

sin^2 θ + (1 − cos θ)^2 = |AP|^2 ≤ θ^2.

Therefore sin^2 θ ≤ θ^2 and (1 − cos θ)^2 ≤ θ^2. If we take the square roots of both inequalities we get |sin θ| ≤ |θ| and |1 − cos θ| ≤ |θ|. (Notice that this is one of the times that you must be very careful to note that √(a^2) = |a|—not √(a^2) = a.)
(b) and (c) By Example 5.2.2 we know that lim_{θ→0} (±|θ|) = 0. Then by part (a) of this example and Proposition 4.3.4, lim_{θ→0} sin θ = 0 = sin 0. Since θ = 0 is a limit point of R we can apply Proposition 5.1.2 to see that the sine function is continuous at θ = 0.
It should be easy to see that the proof that lim_{θ→0} (1 − cos θ) = 0 is the same as the proof given above for the sine function. From this we see that lim_{θ→0} cos θ = 1 = cos 0 and that the cosine function is continuous at θ = 0.
(Alternative proof:) Once we have the inequalities from part (a), it is also easy to prove the continuity of sine and cosine using Definition 5.1.1. (If ε > 0 is given, then by choosing δ = ε, we see that |θ − 0| < δ implies that −ε = −δ < −|θ| ≤ 1 − cos θ ≤ |θ| < δ = ε. So cosine is continuous at θ = 0.)
(d) Let θ0 ∈ R. We consider lim_{θ→θ0} sin θ. Note that by a trig identity

sin θ = sin[θ0 + (θ − θ0)] = sin θ0 cos(θ − θ0) + cos θ0 sin(θ − θ0). (5.2.1)

By part (b) lim_{h→0} sin h = 0; hence for any ε > 0 there exists a δ such that |h| < δ implies that |sin h| < ε. Then if |θ − θ0| < δ, we have |sin(θ − θ0)| < ε. Therefore lim_{θ→θ0} sin(θ − θ0) = 0.
In part (b) we found that lim_{θ→0} cos θ = 1. Hence lim_{θ→θ0} cos(θ − θ0) = 1. Thus by these limits, equation (5.2.1) and parts (a) and (b) of Proposition 4.3.1, we see that lim_{θ→θ0} sin θ = sin θ0 (1) + cos θ0 (0) = sin θ0. Therefore the sine function is continuous at θ = θ0 (for any θ0 ∈ R).
To prove the continuity of the cosine function at θ = θ0 we use the identity

cos θ = cos [θ0 + (θ − θ0 )] = cos θ0 cos(θ − θ0 ) − sin θ0 sin(θ − θ0 )

and proceed as we did in the proof of the continuity of the sine function.
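For completeness, the cosine step that is left to the reader can be written out; each limit used below is one already established above:

```latex
\begin{align*}
\lim_{\theta\to\theta_0}\cos\theta
 &= \lim_{\theta\to\theta_0}\bigl[\cos\theta_0\cos(\theta-\theta_0)
    - \sin\theta_0\sin(\theta-\theta_0)\bigr] \\
 &= \cos\theta_0\cdot 1 - \sin\theta_0\cdot 0 \\
 &= \cos\theta_0 .
\end{align*}
```

Since this holds for every θ0 ∈ R, the cosine function is continuous on R by Proposition 5.1.2.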

The next example is a fun example. Before we get working, notice the function f defined in Example 5.2.4. Recall that in Example 4.2.7 we considered a similar function (without the x term multiplying the sine term) that was not continuous at x = 0. As we did in Example 4.2.7 it is useful here to look at the plot of f. In Figure 5.2.2 we see that the plot of f squeezes down to zero when x is near zero. This is the attribute of this function that makes it continuous at x = 0, whereas the function given in Example 4.2.7 was not continuous at x = 0.

Example 5.2.4 Define the function f : R → R by f(x) = { x sin(1/x) if x ≠ 0, 0 if x = 0 }. Show that f is continuous at x = 0.
Proof: It is easy to use Definition 5.1.1 to prove that f is continuous at x = 0. Let ε > 0 be given, define δ = ε and consider x values that satisfy |x| < δ. Then

|x sin(1/x) − 0| ≤ |x| < δ = ε.

Therefore f is continuous at x = 0.
It should be clear that f is also continuous at all other points in R.
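The one-line squeeze |x sin(1/x)| ≤ |x| in this proof can be echoed numerically; the sample points below are our own choice and only illustrate the inequality:

```python
import math

# Numerical echo of the squeeze in Example 5.2.4: |x sin(1/x)| <= |x|
# for every x != 0, which is why delta = eps works in Definition 5.1.1.

def f(x):
    return x * math.sin(1.0 / x) if x != 0 else 0.0

xs = [(-1) ** k * 10.0 ** (-k) for k in range(1, 10)]  # points from both sides of 0
checks = [abs(f(x)) <= abs(x) for x in xs]
print(checks)  # all True
```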

We include one more example (that is also a fun example) that introduces
a useful, interesting function.
Figure 5.2.2: Plot of the function f(x) = x sin(1/x) for x ≠ 0 and f(0) = 0.

Example 5.2.5 Define the function f : D = [0, 1] → R by f(x) = { 1 if x ∈ Q, 0 if x ∈ I }. Show that f is discontinuous at all points x ∈ D = [0, 1].
Solution: First consider x0 ∈ [0, 1] ∩ I. Let ε = 1/2. Consider any δ. We know by Proposition 1.5.6-(a) that there exists rδ ∈ Q such that rδ ∈ (x0 − δ, x0 + δ), i.e. rδ satisfies |rδ − x0| < δ and |f(rδ) − f(x0)| = |1 − 0| = 1 > ε = 1/2. Therefore f is not continuous at x0.
Likewise consider x0 ∈ [0, 1] ∩ Q. Let ε = 1/2. Consider any δ. We know by Proposition 1.5.6-(b) that there exists iδ ∈ I such that iδ ∈ (x0 − δ, x0 + δ), i.e. iδ satisfies |iδ − x0| < δ and |f(iδ) − f(x0)| = |0 − 1| = 1 > ε = 1/2. Therefore f is not continuous at x0.
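The rational-approximation half of this argument can be made concrete. In the Python sketch below (ours; x0 is a floating-point stand-in for the irrational √2/2), the decimal truncations r_n are rational, converge to x0, and so f(r_n) = 1 for every n while f(x0) = 0:

```python
from fractions import Fraction
import math

# At the irrational point x0 = sqrt(2)/2, the rational truncations
# r_n = round(x0 * 10^n) / 10^n converge to x0, but f(r_n) = 1 for all n
# while f(x0) = 0, so f(r_n) does not converge to f(x0).

x0 = math.sqrt(2) / 2          # floating-point stand-in for sqrt(2)/2
r = [Fraction(round(x0 * 10 ** n), 10 ** n) for n in range(1, 8)]
gaps = [abs(float(q) - x0) for q in r]
print(gaps)  # shrinking toward 0: r_n -> x0
```

Each gap is at most half a unit in the last decimal place, which is all the convergence we need.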

HW 5.2.1 (True or False and why)


(a) Set D = [0, 1] ∪ {3} and define f on D by f(x) = 1 for x ∈ D ∩ Q and f(x) = 0 for x ∈ D ∩ I. Then f is discontinuous at all points of D.
(b) Suppose f : [−1, 1] → R is defined as follows: for x ∈ [−1, 1] ∩ Q, f(x) = x^2, and for x ∈ [−1, 1] ∩ I, f(x) = −x^2. Then the function f is continuous at x = 0.
(c) The function f defined in part (b) is discontinuous for all x ∈ [−1, 1], x ≠ 0.
(d) If we consider the function f defined in Example 5.2.5, the sequence {√2/2 + 1/n}_{n=5}^∞ can be used to show that the function f is discontinuous at x = √2/2.
(e) We saw in Example 5.2.2 that the function f(x) = |x| is continuous at x = 0. The function g(x) = |−x| is not continuous at x = 0.

HW 5.2.2 (a) Use Definition 5.1.1 to prove that f (x) = |x| is continuous at
x = 0.
(b) Use Definition 5.1.1 to prove that f (x) = |x| is continuous at x = 2.

HW 5.2.3 (a) Prove that f (x) = |x − 3| is continuous at x = 3.


(b) Prove that f is continuous at x = 2.
(c) Prove that f is continuous on R.

HW 5.2.4 Consider the function f1 defined on R as f1(x) = { x^3 if x ≥ 0, 3x − 1 if x < 0 }, f2 defined on R as f2(x) = { x^3 if x ≥ 0, 3x if x < 0 }, and f3 defined on [−1, 1] as f3(x) = { x^3 if x ∈ [−1, 1] ∩ Q, −x^3 if x ∈ [−1, 1] ∩ I }. At which points are f1, f2 and f3 continuous? Show why.

HW 5.2.5 Prove that any polynomial is continuous on R.

HW 5.2.6 Prove that any rational function is continuous at all points where
the denominator is nonzero.

5.3 Basic Continuity Theorems


Because the concept of continuity is important, there are a lot of important
continuity theorems. We will begin with the most basic of these theorems.

Proposition 5.3.1 Consider f, g : D → R for D ⊂ R, x0 ∈ D, c ∈ R, and


suppose that f and g are continuous at x = x0 . We then have the following
results.
(a) cf is continuous at x = x0 .
(b) f ± g is continuous at x = x0 .
(c) f g is continuous at x = x0 .
(d) If g(x0 ) 6= 0, then f /g is continuous at x = x0 .

Proof: The proofs of (a)-(d) follow from Proposition 5.1.3 along with Proposi-
tions 3.3.2 and 3.4.1. We consider any sequence {an } such that an ∈ D for all n
and an → x0 . Then by the continuity hypothesis and Proposition 5.1.3 we know
that f (an ) → f (x0 ) and g(an ) → g(x0 ). Then by Proposition 3.3.2 we know that
cf (an ) → cf (x0 ), (f + g)(an ) = f (an ) + g(an ) → f (x0 ) + g(x0 ) = (f + g)(x0 )
and (f g)(an ) = f (an )g(an ) → f (x0 )g(x0 ) = (f g)(x0 ). Then by Proposition
5.1.3 cf , f + g and f g are continuous at x = x0 .
Likewise, for any sequence {an} such that an ∈ D for all n and an → x0, the continuity of f and g at x0 implies that f(an) → f(x0) and g(an) → g(x0).
Since g(x0 ) 6= 0, Proposition 3.4.1 implies that (f /g)(an ) = f (an )/g(an ) →
f (x0 )/g(x0 ) = (f /g)(x0 ). Then by Proposition 5.1.3, f /g is continuous at
x = x0 .
We must realize that the above results can also be proved based on Definition
5.1.1—similar to the proof of Proposition 4.3.1 given using Definition 4.1.1.
Also, we want to emphasize that Proposition 5.3.1 implies that if f and g are
continuous on D ⊂ R, then cf , f ± g, f g are continuous on D. And, f /g is
continuous on {x ∈ D : g(x) 6= 0}—which is the natural domain of f /g.
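The sequential proof above can be watched in action numerically. In this Python sketch (our illustration; the functions and the sequence are ours), f(a_n) + g(a_n) and f(a_n)g(a_n) approach (f + g)(x0) and (fg)(x0) as a_n → x0:

```python
# Sequential view of Proposition 5.3.1 with f(x) = x^2, g(x) = 2x + 3,
# x0 = 1 and a_n = 1 + 1/n: sums approach (f+g)(1) = 6 and products
# approach (fg)(1) = 5.

f = lambda x: x * x
g = lambda x: 2 * x + 3

a = [1.0 + 1.0 / n for n in (10, 100, 1000, 10000)]
sums = [f(x) + g(x) for x in a]
prods = [f(x) * g(x) for x in a]
print(sums[-1], prods[-1])  # close to 6 and 5 respectively
```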

In addition to the results given above, we also have the results analogous to parts (c) and (e) of Proposition 4.3.1. The result is a useful tool in the study of continuity. We state the following proposition.
Proposition 5.3.2 Consider f : D → R for D ⊂ R, x0 ∈ D, and suppose that
f is continuous at x = x0 .
(a) There exists a K ∈ R and δ > 0 such that |x − x0 | < δ implies |f (x)| ≤ K.
(b) If f (x0 ) 6= 0, there exists M ∈ R and δ > 0 such that |x − x0 | < δ implies
|f (x)| ≥ M .

We don’t prove the above result—the proof is the same as those of parts (c)
and (e) of Proposition 4.3.1.
The next result could be pieced together by multiple applications of parts of Proposition 5.3.1—but we don't have to work that hard. We have already done the work in Section 4.3.
Example 5.3.1 (a) For n ∈ N the function f (x) = xn is continuous on R.
(b) All polynomials are continuous on R.
(c) All rational functions are continuous at all points at which the denominator is not zero.
Solution: All of the points under consideration are limit points of the domains. Then part (a) follows from Proposition 4.3.2-(c) along with Proposition 5.1.2. Parts (b) and (c) follow from parts (b) and (c) of Proposition 4.3.3.

There is a series of basic continuity theorems that we must consider. We include the following result.
Proposition 5.3.3 Consider f, g : D → R for D ⊂ R, x0 ∈ D, c ∈ R, and
suppose that f and g are continuous at x = x0 . We then have the following
results.
(a) The function F (x) = max{f (x), g(x)} is continuous at x = x0 .
(b) The function G(x) = min{f (x), g(x)} is continuous at x = x0 .

Proof: (a) Suppose ε > 0 is given. Then there exist δ1 and δ2 such that

|x − x0| < δ1 implies that |f(x) − f(x0)| < ε, i.e. f(x0) − ε < f(x) < f(x0) + ε, (5.3.1)

and

|x − x0| < δ2 implies that |g(x) − g(x0)| < ε, i.e. g(x0) − ε < g(x) < g(x0) + ε. (5.3.2)

Let δ = min{δ1, δ2} (Step 1: Define δ). Then for x satisfying |x − x0| < δ,

max{f(x0), g(x0)} − ε = max{f(x0) − ε, g(x0) − ε} < max{f(x), g(x)} (5.3.3)

and

max{f(x), g(x)} < max{f(x0) + ε, g(x0) + ε} = max{f(x0), g(x0)} + ε, (5.3.4)

or F(x0) − ε < F(x) < F(x0) + ε (Step 2: δ works). Thus we have |F(x) − F(x0)| < ε, so F is continuous at x = x0. Look at the computation given

in (5.3.3) and (5.3.4) carefully. It’s easy but looks difficult. You start with
max{f (x), g(x)} and replace each of them by the inequalities given by state-
ments (5.3.1) and (5.3.2).
(b) Of course the proof of part (b) is essentially the same. We again consider statements (5.3.1) and (5.3.2). This time, taking minimums, we get G(x0) − ε < G(x) < G(x0) + ε, i.e. |G(x) − G(x0)| < ε. Thus G is continuous at x = x0.
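A remark of ours, not the text's: Proposition 5.3.3 can alternatively be deduced from Proposition 5.3.1, the continuity of the absolute value function (Example 5.2.2) and the composition result proved below as Proposition 5.3.5, via the identities

```latex
\max\{f,g\} = \frac{(f+g) + |f-g|}{2},
\qquad
\min\{f,g\} = \frac{(f+g) - |f-g|}{2}.
```

Since f ± g is continuous at x0 by Proposition 5.3.1-(b), and |f − g| is the absolute value function composed with the continuous function f − g, both max{f, g} and min{f, g} are continuous at x0 by one more application of Proposition 5.3.1.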
Next we want to give a result that will expand the number of functions that
we know are continuous. Before we give the result we include the definition of
the composite function.

Definition 5.3.4 For D ⊂ R consider f : D → R and g : U → R where


f (D) ⊂ U . Then the composition of f and g, g ◦ f : D → R is defined as
g ◦ f (x) = g(f (x)) for all x ∈ D.

We use the composition to define some more interesting functions: (i) f(x) = x^2 + 1 and g(y) = √y implies that g ∘ f(x) = √(x^2 + 1); (ii) f(θ) = θ − π/2 and g(y) = sin y implies that g ∘ f(θ) = sin(θ − π/2); etc. We then have the following basic result concerning continuity of a composite function.

Proposition 5.3.5 Suppose that f : D → R, g : U → R, f(D) ⊂ U, f is continuous at x0 and g is continuous at f(x0). Then g ∘ f is continuous at x = x0.

Proof: We suppose that ε > 0 is given. Since g is continuous at f(x0), there exists a δ1 such that |y − f(x0)| < δ1 implies that |g(y) − g(f(x0))| < ε. Since f is continuous at x0 (applying the definition of the continuity of f at x0 with δ1 in place of the traditional "ε"), there exists a δ such that |x − x0| < δ implies that |f(x) − f(x0)| < δ1.
Then for |x − x0| < δ we have |f(x) − f(x0)| < δ1, which implies that |g(f(x)) − g(f(x0))| < ε, i.e. g ∘ f is continuous at x = x0.
We next define the maxima and minima that you worked with a lot in your basic course.

Definition 5.3.6 Consider the function f : D → R where D ⊂ R and x0 ∈ D.


(a) The point (x0 , f (x0 )) is said to be a maximum (or local maximum) of f if
there exists a neighborhood of x0 , N , such that f (x) ≤ f (x0 ) for all x ∈ N ∩ D.
(b) The point (x0 , f (x0 )) is said to be an absolute maximum of f on D if f (x) ≤
f (x0 ) for all x ∈ D.
(c) The point (x0 , f (x0 )) is said to be a minimum (or local minimum) of f if
there exists a neighborhood of x0 , N , such that f (x) ≥ f (x0 ) for all x ∈ N ∩ D.
(d) The point (x0 , f (x0 )) is said to be an absolute minimum of f on D if f (x) ≥
f (x0 ) for all x ∈ D.

It is easy to see that the function f(x) = −x^2 defined on [−1, 1] has a maximum at (0, 0)—it is an absolute maximum. This function has minimums at both points (−1, −1) and (1, −1)—which are both absolute minimums. Note that this means that the absolute maximum or minimum need not be unique. Note that the same function defined on the set (−1, 1) does not have any minimum—and then surely does not have an absolute minimum. We also note that if we define a function f : D → R, D = (−2, −1) ∪ {0} ∪ (1, 2), by f(x) = x^2, then by the definition (0, f(0)) = (0, 0) is both a maximum and a minimum—not very satisfying but acceptable.
We next prove a useful lemma and a very important theorem concerning continuous functions—a result that is much stronger than Proposition 5.3.2-(a).

Lemma 5.3.7 Suppose that f : [a, b] → R and f is continuous on [a, b]. Then
there exists an M ∈ R such that f (x) ≤ M for all x ∈ [a, b].

Proof: Suppose false, i.e. suppose there is no M such that f(x) ≤ M for all x ∈ [a, b].
For M = 1 there exists an x1 ∈ [a, b] such that f(x1) > 1 (otherwise M = 1 would work).
For M = 2 there exists an x2 ∈ [a, b] such that f(x2) > 2.
And, in general, for each n ∈ N there exists an xn ∈ [a, b] such that f(xn) > n.
{xn} is a sequence in [a, b]. Since [a, b] is compact, by Corollary 3.4.8 we know that the sequence {xn} has a subsequence {xnj} and there is an x0 ∈ [a, b] such that xnj → x0 as j → ∞. Then by the continuity of f on [a, b] and Proposition 5.1.3 we know that f(xnj) → f(x0). Since the sequence {f(xnj)}_{j=1}^∞ is convergent, we know by Proposition 3.3.2-(c) that the sequence is bounded. This contradicts the fact that f(xnj) > nj ≥ j for all j. Therefore the set f([a, b]) is bounded above.

Theorem 5.3.8 Suppose that f : [a, b] → R and f is continuous on [a, b]. Then
f has an absolute maximum and an absolute minimum on [a, b].

Proof: Let S = f ([a, b]). By Lemma 5.3.7 S is bounded above. Thus by the
completeness axiom, Definition 1.4.3, M ∗ = lub(S) exists. Note that to find an
absolute maximum of f on [a, b], we must find an x0 such that f (x0 ) = M ∗ .
Recall that by Proposition 1.5.3–(a) for every ε > 0 there exists an s ∈ S
such that M ∗ − s < ε. In our case Proposition 1.5.3–(a) gives that for every ε > 0
there exists an x ∈ [a, b] (and an associated f (x)) such that M ∗ − f (x) < ε—all
points in S look like f (x)—and are associated with an x ∈ [a, b].
Let ε = 1. We get x1 ∈ [a, b] such that M ∗ − f (x1 ) < 1.
Let ε = 1/2. We get x2 ∈ [a, b] such that M ∗ − f (x2 ) < 1/2.
In general, let ε = 1/n for n ∈ N. We get xn ∈ [a, b] such that M ∗ − f (xn ) < 1/n.

Hence we have M ∗ − 1/n < f (xn ) ≤ M ∗ for all n ∈ N (because M ∗ is an upper


bound) so f (xn ) → M ∗ —using either Definition 3.1.3 or Proposition 3.4.2.
All of the xn ’s are in [a, b]. Since [a, b] is compact, by Corollary 3.4.8 we know
that there exists a subsequence of {xn }, {xnj }, such that xnj → x0 for some
x0 ∈ [a, b]. Thus f (xnj ) → f (x0 ).

By Proposition 3.4.6 we know that f (xnj ) → M ∗ . By Proposition 3.3.1 we


know that the limit must be unique. Thus M ∗ = f (x0 ) and (x0 , f (x0 )) is an
absolute maximum.
To show that f has an absolute minimum on [a, b] we consider the function
g = −f . If f is continuous on [a, b], the function g will be continuous on [a, b].
The absolute maximum of g on [a, b] will be the absolute minimum of f on [a, b].

HW 5.3.1 (True or False and why)


(a) Suppose f, g : D → R, D ⊂ R, x0 ∈ D. If f + g is continuous at x = x0 ,
then f and g are continuous at x = x0 .
(b) Suppose f : [0, 1] → R such that f 2 is continuous on [0, 1]. Then f is
continuous on [0, 1].
(c) Suppose f, g : D → R, D ⊂ R. Then max{f (x), g(x) : x ∈ D} = max{f (x) :
x ∈ D} max{g(x) : x ∈ D}.

(d) Consider f (x) = √x defined on [0, ∞) and g(x) = x − 1 defined on R. Then
f ◦ g is continuous on [1, ∞).
(e) Consider f : [0, 1] → R. Then f has a maximum on [0, 1].

HW 5.3.2 Suppose that f : [0, 1] → R is continuous at the point x = x0 and


f (x0 ) > 0. Prove that there exists an n ∈ N such that f (x) > 0 for all x in the
neighborhood N1/n (x0 ).

HW 5.3.3 (a) Suppose f : D → R, D ⊂ R, x0 ∈ D, is continuous at x = x0 .


Prove that |f | (defined by |f |(x) = |f (x)|) is continuous at x = x0 .
(b) Prove that for f, g : D → R and x ∈ D, min{f (x), g(x)} = (1/2)[f (x) + g(x)] − (1/2)|f (x) − g(x)|.
(c) If f and g are continuous at x = x0 , prove that G(x) = min{f (x), g(x)}
is continuous at x = x0 (give a proof different from that given in Proposition
5.3.3).

5.4 More Continuity Theorems


We next prove a very important basic theorem concerning continuous functions—
the result yields an approximate characterisation of continuity.

Theorem 5.4.1 (Intermediate Value Theorem: IVT) Suppose that f : [a, b] →


R and f is continuous on [a, b]. Let c ∈ R be between f (a) and f (b). Then there
exists x0 ∈ (a, b) such that f (x0 ) = c.

Proof: We have two cases, f (a) < c < f (b) and f (b) < c < f (a). We will
consider the first case—the second case will follow in the same way.
This will be a constructive proof. Let a1 = a and b1 = b.
Let m1 = (a1 + b1 )/2. If f (m1 ) ≤ c, define a2 = m1 and b2 = b1 . If f (m1 ) > c,
define a2 = a1 and b2 = m1 . Note that this construction divides the interval

in half, and chooses the half so that f (a2 ) ≤ c < f (b2 )—specifically we have
a = a1 ≤ a2 < b2 ≤ b1 = b and f (a2 ) ≤ c < f (b2 ).
Let m2 = (a2 + b2 )/2. If f (m2 ) ≤ c, define a3 = m2 and b3 = b2 . If f (m2 ) > c,
define a3 = a2 and b3 = m2 . We have a = a1 ≤ a2 ≤ a3 < b3 ≤ b2 ≤ b1 = b and
f (a3 ) ≤ c < f (b3 ).
We continue in this fashion and inductively obtain an and bn , n = 1, 2, · · ·
such that

a = a1 ≤ a2 ≤ a3 ≤ · · · ≤ an < bn ≤ bn−1 ≤ · · · ≤ b1 = b

and f (an ) ≤ c < f (bn ) for all n ∈ N. We have a sequence of closed intervals
[an , bn ] such that f (an ) ≤ c < f (bn ) and bn − an = (1/2)[bn−1 − an−1 ] = · · · = (1/2^(n−1))[b1 − a1 ] = (1/2^(n−1))[b − a].
Clearly {an } is a monotonically increasing sequence that is bounded above
by b. Therefore by the Monotone Convergence Theorem, Theorem 3.5.2, there
exists α ≤ b such that an → α. Likewise the sequence {bn } is a monotonically
decreasing sequence bounded below by a. Thus by the Monotone Convergence
Theorem there exists a β ≥ a such that bn → β.
We see that limn→∞ [bn − an ] = limn→∞ (1/2^(n−1))[b − a] = 0 and limn→∞ [bn −
an ] = β − α. Thus α = β. Call it x0 .
We have f (an ) ≤ c < f (bn ) and limn→∞ f (an ) = limn→∞ f (bn ) = f (x0 ).
By the Sandwich Theorem, Proposition 3.4.2, (where the center sequence will
be the constant sequence {c, c, · · · }) we have f (x0 ) = c.
We might note that one of the nice applications of the IVT is to prove the
existence of a solution of an equation of the form f (x) = 0. The approach is to
find a and b in the domain of the function such that f (a) < 0, f (b) > 0 and f is
continuous on [a, b]. The IVT then implies that there exists an x0 ∈ [a, b] such
that f (x0 ) = 0. For example consider the function f (x) = x5 + x + 1. We note
that f (−1) = −1, f (1) = 3 and f is surely continuous on the interval [−1, 1].
Therefore by the Intermediate Value Theorem there exists an x0 ∈ [−1, 1] such
that f (x0 ) = 0. Can you find such an x0 ? Can you approximate it? (Use your
calculator.)
Can you find a sequence of points that converges to the solution? Use the
proof of the previous theorem: it’s called the Bisection Method. Suppose
we know that f (a) < 0, f (b) > 0 and f is continuous on [a, b]. We then use a
construction that we have used in the proof of the IVT. Let {an }, {mn } and
{bn } be as defined in the proof. But we don’t compute them all—that would
take a long time even on the best computer. We continue with this computation
until for some mn , f (mn ) is sufficiently small. We use that value of mn as an
approximation of the solution of f (x) = 0. We note that the proof of the IVT
proves that the sequence {an } converges to the solution of f (x) = 0. It’s easy
to see that the sequence {mn } will also converge to the solution of f (x) = 0.
(You might prove it—just to show that it is easy.) We really don’t know how
fast this convergence is taking place (the Bisection method is not the fastest
method) but we can get an excellent approximation to solutions of equations
using this method. For example if we again consider f (x) = x5 + x + 1, set

a = −1 and b = 1, and perform the iteration, we get m1 = 0.0, m2 = −0.5,


m3 = −0.75, m4 = −0.875, m5 = −0.8125, m6 = −0.7813, m7 = −0.7656,
m8 = −0.7578. We see that f (−0.7578) = −0.0077 and we stopped because we
chose ”sufficiently small” to be 0.01. And of course, if you wanted to find the
solutions to x5 + x + 1 = 7 instead, you would consider the function f (x) =
(x5 + x + 1) − 7. You should understand that the importance of the scheme
(even though it’s too slow) is that the scheme is a convergent scheme that can
be used to compute the solution to a predefined accuracy.
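The bisection computation just described is easy to carry out on a computer. The following Python sketch (the function name bisect and the tolerance argument are our choices, not the text's) reproduces the iteration above for f (x) = x5 + x + 1 on [−1, 1]:

```python
def bisect(f, a, b, tol=0.01, max_steps=100):
    """Bisection as in the proof of the IVT, specialized to c = 0.

    Assumes f is continuous on [a, b] with f(a) < 0 < f(b).
    Returns a midpoint m with |f(m)| < tol.
    """
    for _ in range(max_steps):
        m = (a + b) / 2
        if abs(f(m)) < tol:   # "f(m) sufficiently small" -- stop here
            return m
        if f(m) <= 0:         # keep the half on which f changes sign
            a = m
        else:
            b = m
    raise RuntimeError("tolerance not reached within max_steps bisections")

root = bisect(lambda x: x**5 + x + 1, -1.0, 1.0)
print(root)  # -0.7578125, which rounds to the m8 = -0.7578 reported above
```

Tracing the successive midpoints of this run reproduces m1 = 0.0, m2 = −0.5, m3 = −0.75, and so on through m8.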
We next include a corollary to Theorem 5.4.1 that will be useful to us in
the next section. Before we proceed we want to emphasize that by interval, we
mean any of the different types of intervals we have introduced—closed, open,
part open and part closed, unbounded, etc. We state the following result.
Corollary 5.4.2 Suppose f : I → R where I ⊂ R is an interval and f is
continuous on I. Then f (I) is an interval.

Proof: If f (I) is not an interval, there must be an f (a), f (b) ∈ f (I) and a
c ∈ R such that c is between f (a) and f (b) but c ∉ f (I). This would contradict
Theorem 5.4.1 applied to f on [a, b] (where for convenience we assume that
a < b).
Often we are interested in when and where the functions are increasing and
decreasing—if you recall, you probably used these ideas in your basic class when
you used calculus to plot the graphs of some functions. We will use these ideas
in a very powerful way to help us study the inverse of functions. Before we
proceed we make the following definitions.

Definition 5.4.3 Consider the function f : D → R where D ⊂ R.


(a) f is said to be increasing on D if for all x, y ∈ D such that x < y we have
f (x) ≤ f (y).
(b) f is said to be decreasing on D if for all x, y ∈ D such that x < y we have
f (x) ≥ f (y).
(c) f is said to be strictly increasing on D if for all x, y ∈ D such that x < y
we have f (x) < f (y).
(d) f is said to be strictly decreasing on D if for all x, y ∈ D such that x < y
we have f (x) > f (y).
If the function f is either increasing or decreasing, we say that f is mono-
tone. If f is either strictly increasing or strictly decreasing, then we say that f
is strictly monotone.

We note that f2 (x) = x2 is not monotone on R. We also note that f2


is strictly increasing on [0, ∞) and strictly decreasing on (−∞, 0]. We also
note that f3 (x) = x3 is strictly increasing on R—these all can be seen by
graphing the functions—and these all can be proved by using methods similar
to those used in HW1.3.3-(a). A more complicated function is given by
f7 (x) = x − 4 if x < 0 and f7 (x) = 2x + 3 if x ≥ 0—graph it and it should be
clear that f7 is increasing on R.
If we define f8 (x) = −x + 4 if x < 0, f8 (x) = 2 if 0 ≤ x ≤ 1, and f8 (x) = −4x
if x > 1, and graph it, it is easy to see that
f8 is decreasing but not strictly decreasing.
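A quick way to get a feel for f7 and f8 is to evaluate them at a few sample points. The Python below (our own illustration) checks that the sampled values of f7 never decrease while those of f8 never increase, and that f8 repeats the value 2, so it is not strictly decreasing. Sampling only illustrates monotonicity; it does not prove it.

```python
def f7(x):
    # piecewise function from the text: x - 4 for x < 0, 2x + 3 for x >= 0
    return x - 4 if x < 0 else 2 * x + 3

def f8(x):
    # piecewise: -x + 4 for x < 0, constant 2 on [0, 1], -4x for x > 1
    if x < 0:
        return -x + 4
    if x <= 1:
        return 2
    return -4 * x

xs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
# f7: sampled values never decrease as x increases
assert all(f7(x) <= f7(y) for x, y in zip(xs, xs[1:]))
# f8: sampled values never increase, but the value 2 repeats on [0, 1]
assert all(f8(x) >= f8(y) for x, y in zip(xs, xs[1:]))
assert f8(0.0) == f8(1.0) == 2
```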
We next prove a result we think is surprising in that we get continuity with
a rather strange hypothesis. Read this proof carefully—it is a very technical
proof.
Proposition 5.4.4 Consider f : D → R where D ⊂ R. Assume that f is
monotone on D. If f (D) is an interval, then f is continuous on D.

Proof: Consider the case when f is increasing. The case of f decreasing will
be the same. Let x0 ∈ D and suppose that ε > 0 is given. We must find a δ so
that |x − x0 | < δ implies that |f (x) − f (x0 )| < ε, i.e.

f (x0 ) − ε < f (x) < f (x0 ) + ε. (5.4.1)

The method of proof will be to first work to the right of x0 to limit δ so that
for x ∈ (x0 , x0 + δ) f cannot grow too much (not more than to f (x0 ) + ε). We
will then do the same thing to the left of x0 .
Consider the right most part of inequality (5.4.1): f (x) < f (x0 ) + ε.
If f (x) ≤ f (x0 ) for all x ∈ D, the desired inequality is satisfied and we can
choose δ1 = 1.
Otherwise, let x∗ ∈ D be such that f (x∗ ) > f (x0 ). Then x0 < x∗ (because f
is increasing) and the interval [f (x0 ), f (x∗ )] is contained in f (D) because f (D)
is assumed to be an interval. Let y ∗∗ = min{f (x0 ) + ε/2, f (x∗ )}. Then the
interval [f (x0 ), y ∗∗ ] is contained in f (D). Thus there exists an x∗∗ ∈ D such
that f (x∗∗ ) = y ∗∗ . Then x0 < x < x∗∗ implies that f (x0 ) ≤ f (x) ≤ f (x∗∗ ) =
y ∗∗ < f (x0 ) + ε. Let δ2 = x∗∗ − x0 .
Now consider the left most part of inequality (5.4.1): f (x0 ) − ε < f (x).
If f (x) ≥ f (x0 ) for all x ∈ D, the desired inequality is satisfied and we can
choose δ3 = 1.
Otherwise, let x∗ ∈ D be such that f (x∗ ) < f (x0 ). Then x∗ < x0 and the
interval [f (x∗ ), f (x0 )] ⊂ f (D). Let y ∗∗ = max{f (x0 ) − ε/2, f (x∗ )}. Then
[y ∗∗ , f (x0 )] ⊂ f (D) and there exists x∗∗ ∈ D such that f (x∗∗ ) = y ∗∗ . Then
x∗∗ < x < x0 implies that f (x0 ) − ε < y ∗∗ = f (x∗∗ ) ≤ f (x) ≤ f (x0 ). Let
δ4 = x0 − x∗∗ .
Thus we see that if we define δ = min{δ1 , δ2 , δ3 , δ4 } (Step 1: Define δ) and
require that |x − x0 | < δ, then |f (x) − f (x0 )| < ε (Step 2: δ works).
Notice that the functions f7 and f8 considered earlier are both monotone but
are not continuous—neither f7 (R) nor f8 (R) is an interval. Check it out.
We next state a result that is a bit strange because we already have this result—it
is a combination of Corollary 5.4.2 and Proposition 5.4.4. We do so because we
want to emphasize this result in this form.
Corollary 5.4.5 Consider f : I → R where I ⊂ R is an interval. Assume
that f is monotone on I. Then f is continuous on I if and only if f (I) is an
interval.

There are times when we are given a function for which it is very important
to us to know that the function has an inverse and to be able to determine
properties of that inverse. You have been using inverse functions for a long
time (it is a very basic problem to be given some y = f (x) and to want to solve
for x)—sometimes you might have been aware that you were using an inverse,
and other times you might not have been aware. We begin with the following
definition.
Definition 5.4.6 Consider f : D → R where D ⊂ R. The function f is said
to be one-to-one (often written 1-1) if f (x) = f (y) implies that x = y.
In your basic calculus course when you studied one-to-one functions you used
what is called the horizontal line test—that is, draw an arbitrary horizontal line
on the graph of the function; the function is one-to-one if the line intersects
the graph in at most one point. It should be clear that this description of the
horizontal line test is equivalent to Definition 5.4.6—though less rigorous.
To prove that the function f1 (x) = √x defined on [0, ∞) is one-to-one, we note
that f1 (x) = f1 (y) is the same as √x = √y. If we then square both sides, we
find that x = y—which is what we must prove. Graph f1 to see how the
horizontal line test works. The function f2 (x) = x2 is surely not one-to-one on
R (f2 (−1) = f2 (1)). Again, plot the function and draw the horizontal line. If
we consider f2 on [0, ∞) instead, then f2 is one-to-one.
A statement that is equivalent to Definition 5.4.6 is as follows: the function
f is said to be one-to-one if for each element y ∈ f (D) there exists one and only
one element x ∈ D such that f (x) = y. The definition of one-to-one allows us
to make the following definition.
Definition 5.4.7 Consider f : D → R where D ⊂ R. Assume that the function
f is one-to-one. We define the function f −1 : f (D) → D by f −1 (y) = x if
f (x) = y. The function f −1 is called the inverse of f . When f −1 exists, f is
said to be invertible.
Note that the definition that f is one-to-one is exactly what is needed to make
f −1 a function, i.e. for each y ∈ f (D) there exists one and only one x ∈ D such
that f −1 (y) = x. We also note that by rewriting the statement in Definition
5.4.7 we see that f and f −1 satisfy f −1 (f (x)) = x for x ∈ D and f (f −1 (y)) = y
for y ∈ f (D).
If we consider f2 (x) = x2 on [0, ∞), let y = x2 and solve for x, we get
x = ±√y. Since x must be greater than or equal to zero, f2−1 (y) = √y. Note
that since f2 ([0, ∞)) = [0, ∞), the domain of f2−1 is also [0, ∞). If we next
consider f3 (x) = x3 on R, we note that f3 (R) = R and f3−1 (y) = ∛y for all
y ∈ R.
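We can illustrate the relations f −1 (f (x)) = x numerically for f2 and f3 . In the Python sketch below (our own illustration) the cube root of a negative number is computed via copysign, because fractional powers of negative floats are not defined in Python:

```python
import math

def f2(x):       # x^2, restricted to [0, infinity) so it is one-to-one
    return x * x

def f2_inv(y):   # inverse on f2's range [0, infinity)
    return math.sqrt(y)

def f3(x):       # x^3 on all of R
    return x ** 3

def f3_inv(y):   # cube root, handling negative y via the sign transfer
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

# f_inv(f(x)) = x, up to floating-point roundoff
for x in [0.0, 0.5, 2.0, 7.0]:
    assert abs(f2_inv(f2(x)) - x) < 1e-12
for x in [-2.0, -0.5, 0.0, 1.5]:
    assert abs(f3_inv(f3(x)) - x) < 1e-12
```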
We obtain the following important but very easy result.
Proposition 5.4.8 Consider f : D → R where D ⊂ R. Assume that f is
strictly monotone on D. Then f is one-to-one on D.

Proof: Suppose that f is strictly increasing—the proof for f strictly decreasing
will be the same—and suppose that f is not one-to-one. Then there exist x ≠ y
such that f (x) = f (y). If x ≠ y, then either x < y or y < x—either case is a
contradiction to the fact that f is strictly increasing.
It’s clear that the converse of Proposition 5.4.8 is not true when we consider
a function like f (x) = 1/x defined on R − {0}. However we are able to obtain
the following result.

Proposition 5.4.9 Consider the function f : I → R where I ⊂ R is an in-


terval. Assume that f is one-to-one and continuous on I. Then f is strictly
monotone on I.

Proof: We begin by choosing arbitrary a, b ∈ I. For convenience assume that


a < b. Since f is one-to-one, we know that f (a) ≠ f (b)—so that either f (a) <
f (b) or f (a) > f (b). Consider the case of f (a) < f (b). If f is to be strictly
monotone, in this situation it must be the case that f is strictly increasing on
[a, b]. Assume false, i.e. assume that f is not strictly increasing on [a, b], i.e.
assume there exists some x1 , x2 ∈ [a, b] such that x1 < x2 and f (x1 ) ≥ f (x2 )—
since f is one-to-one, we know we would really have f (x1 ) > f (x2 ).
We have two cases. For each case it might help to draw a picture of
the situation—give it a try. Case 1: f (a) < f (x1 ). We then choose c =
max{(f (x1 ) + f (x2 ))/2, (f (a) + f (x1 ))/2}. Since f is continuous on I, f is con-
tinuous on [a, x1 ]. Also c is between f (a) and f (x1 ). Therefore by the IVT,
Theorem 5.4.1, we know that there exists y1 ∈ [a, x1 ] such that f (y1 ) = c.
Also f is continuous on [x1 , x2 ] and c is between f (x1 ) and f (x2 ). Thus
again by the IVT we know that there exists y2 ∈ [x1 , x2 ] such that f (y2 ) = c.
This is a contradiction to the fact that f is one-to-one.
Case 2: f (a) > f (x1 ). In this case f (b) > f (a) > f (x1 ) > f (x2 ). Similar
to the last case, we set c = min{(f (x1 ) + f (x2 ))/2, (f (x2 ) + f (b))/2}, apply the
IVT with respect to c on [x1 , x2 ] and [x2 , b], and arrive at a contradiction to the
fact that f is one-to-one on I. Therefore f is strictly increasing on I.
The case when f (a) > f (b) is essentially the same.

Proposition 5.4.10 Suppose that f : D → R and D ⊂ R. (a) If f is strictly


increasing on D, then f −1 : f (D) → D is strictly increasing on f (D).
(b) If f is strictly decreasing, then f −1 : f (D) → D is strictly decreasing on
f (D).

Proof: (a) We assume that f −1 is not strictly increasing, i.e. suppose that
u < v and f −1 (u) ≮ f −1 (v). Then f −1 (u) ≥ f −1 (v). Suppose x and y are
such that f (x) = u and f (y) = v. Then u < v is the same as f (x) < f (y) and
f −1 (u) ≥ f −1 (v) implies that x ≥ y.
This contradicts the fact that f is strictly increasing: when f is strictly increasing,
x > y implies f (x) > f (y) and x = y implies f (x) = f (y), i.e. if we have x ≥ y,
we have u = f (x) ≥ f (y) = v, contradicting u < v.
(b) The proof when f is strictly decreasing is very similar to that given in part
(a).

The next two results are the ultimate results relating the continuity proper-
ties of f −1 to those of f .

Proposition 5.4.11 Suppose that I ⊂ R is an interval and f : I → R is strictly


monotone on I. Then f −1 : f (I) → I is continuous.

Proof: Suppose that f is strictly increasing—the proof of the case for f strictly
decreasing is the same. We know then by Proposition 5.4.10–(a) that f −1 :
f (I) → I is strictly increasing. Then since f −1 (f (I)) = I is an interval (and
f (I) is the domain of f −1 ), by Proposition 5.4.4 we see that f −1 is continuous
on f (I).

Proposition 5.4.12 Suppose that f : I → R where I ⊂ R is an interval. Assume that f is


one-to-one and continuous on I. Then f −1 : f (I) → I is continuous.

Proof: This is an easy combination of Propositions 5.4.9 and 5.4.11. From


Proposition 5.4.9 we see that f is strictly monotone on I. Then by Proposition
5.4.11 we have that f −1 is continuous on f (I).
The result of the two propositions—f −1 is continuous—is the same. The
difference between the two propositions is in the hypotheses. The fact that
we assume f continuous (and one-to-one) in Proposition 5.4.12 seems like a
stronger hypothesis than assuming strict monotonicity as we do in Proposition
5.4.11. However, there are times when it is preferable to be able to assume one-
to-one rather than monotonicity—and it’s not a terrible assumption to assume that f
is continuous. The real point is that we have both results. Whatever we want
to use in the end, we will have.

HW 5.4.1 (True or False and why)


(a) There is at least one solution to the equation x4 − 3x3 + 2x2 − x − 1 = 0.
(b) Consider the function f : R → R defined by f (x) = sin x. We know that
f (R) = [−1, 1] is an interval. Then f is invertible on [−1, 1].
(c) Suppose f : D → R, D ⊂ R, is one-to-one and continuous on D. Then f is
strictly monotone.
(d) Suppose f : D → R, D ⊂ R, is monotone. Then f is invertible.
(e) Suppose f : D → R, D ⊂ R, is continuous on D. Then f (D) is an interval.

HW 5.4.2 Prove that the equation x6 + x4 − 3x3 − x + 1 = 0 has at least one


solution. Find an approximation of a solution to the equation.

HW 5.4.3 Suppose that the function f : [0, 1] → R is continuous and satisfies


f ([0, 1]) ⊂ Q. Prove that f is a constant function.

HW 5.4.4 Show that the function f (x) = 2x − cos x is invertible on R. Can


you explicitly find an expression for f −1 ?
HW 5.4.5 Define the function f by f (x) = 3x − 2 if x < 0 and f (x) = 2x + 1 if x ≥ 0.
(a) Show that the function f is strictly increasing.
(b) Determine f (R).
(c) Show that f −1 exists.
(d) Prove that f −1 is continuous at x = −2.
(e) Determine where f −1 is continuous.

5.5 Uniform Continuity


The set of continuous functions on some domain D is an important set of func-
tions. There is another level of smoothness that we attach to functions that
yields another important class of functions: uniformly continuous functions. As
we shall see, the idea of uniform continuity is truly tied to the set D, whereas
continuity was defined pointwise—a function was then considered continuous on
the set D if it was continuous at each individual point of D. We begin with the
definition.

Definition 5.5.1 Consider the function f : D → R where D ⊂ R. f is said
to be uniformly continuous on D if for every ε > 0 there exists a δ such that
x, y ∈ D and |x − y| < δ implies that |f (x) − f (y)| < ε.
This definition should be observed carefully and contrasted with Definition 5.1.1.
If we consider the function f (x) = x2 defined on D = (0, 1), we hope that it is
clear that f is continuous on (0, 1). However, when we proceed to show that it
is continuous at each point in (0, 1), we might begin by considering x0 = 0.1.
Then given ε > 0 we write |x2 − (0.1)2 | = |x − 0.1||x + 0.1| and realize that this
is one of the applications of the definition of continuity where we must restrict
the range of x and bound the term |x + 0.1|. Suppose we let δ1 = 0.1 (no specific
tie to the fact that x0 = 0.1 except for the fact that both were chosen because
0.1 is a nice small number) and restrict x so that |x − 0.1| < δ1 = 0.1. Then
|x + 0.1| < 3/10 so we set δ0.1 = min{0.1, 10ε/3} (Step 1: Define δ), suppose
that |x − 0.1| < δ0.1 and continue with our previous calculation to get
|x2 − (0.1)2 | = |x − 0.1||x + 0.1| <∗ (3/10)|x − 0.1| <∗∗ (3/10)(10ε/3) = ε
(Step 2: δ works) where the ”<∗ ” inequality is true because δ0.1 ≤ 0.1 and the
”<∗∗ ” inequality is true because δ0.1 ≤ 10ε/3. Therefore f (x) = x2 is continuous
at x = 0.1.
If we continue by next choosing x0 = 0.9, we can do some work to note
that we can choose δ0.9 = min{0.1, 10ε/19} (Step 1: Define δ), suppose that
|x − 0.9| < δ0.9 and note that
|x2 − (0.9)2 | = |x − 0.9||x + 0.9| < (19/10)|x − 0.9| < (19/10)(10ε/19) = ε
(Step 2: δ works). Therefore f is continuous at x = 0.9. (Do the necessary
calculation.)

To continue with showing that f is continuous on D = (0, 1) we have many


more points to consider. But let us consider the two points we have already
considered. We first admit that we could have bounded our x values differently
(chosen δ1 larger than 0.1) and gotten different δ’s—but that wouldn’t change
our point. The end result is that we prove continuity at these two points using
radically different δ’s. For example, if ε = 0.001, then δ0.1 = (0.001)(10/3) and
δ0.9 = (0.001)(10/19). Not only do we get different δ’s, in this case we get radically
different δ’s.
When we want to prove that f (x) = x2 is uniformly continuous on D = (0, 1),
for a given ε > 0 we must find a δ so that x, y ∈ (0, 1) and |x − y| < δ
implies that |x2 − y 2 | < ε. That means it must work when we choose y = 0.1
and it must work when we choose y = 0.9. It seems as if (we don’t really know
for sure) δ0.1 will not work everywhere because it is a lot bigger than δ0.9 .
Since δ0.9 < δ0.1 , δ0.9 would work at both y = 0.1 and y = 0.9. Is there any
reason to believe that it would work everywhere? If a function f is continuous
on a set D and we are given an ε > 0, it is perfectly permissible for the δ to be
different at every point in D.
Thus the question is: can we choose a δ that will work everywhere (if we
can’t, then f is not uniformly continuous on (0, 1)), and how do we do it? If
we return to Figure 4.1.1 and consider what determines the size of δ (in the
case of Figure 4.1.1, the δ1 and the δ2 ) it should be clear that the steepness
of the graph at and near the point is what determines the quantity needed for
δ—the steeper the curve the smaller the δ. Hence, we want to choose the point
of D = (0, 1) that requires the smallest δ, i.e. the point at which the graph is
the steepest, construct a continuity proof at that point to determine the δ and
show that this δ will work for all x, y ∈ D.
Hopefully we know what the graph of f (x) = x2 looks like on (0, 1). It should
be clear that there is not a point in (0, 1) at which the graph is the steepest, but
it is also clear that the closer we get to x = 1, the steeper the curve gets. We
consider the point x0 = 1. But this seems ridiculous because x0 = 1 ∉ D = (0, 1).
Who cares. If we can determine a δ that works, we will be done. We know that
f (x) = x2 could have been defined on all of [0, 1] so a continuity proof at x0 = 1
will make sense. We again consider |x2 − (1)2 | = |x − 1||x + 1|. If we restrict x
so that |x − 1| < δ1 = 0.1, or 0.9 < x < 1.1, then for x ∈ [0, 1], |x + 1| ≤ 2.1.
We then set δ = min{0.1, ε/2.1} (Step 1: Define δ), suppose that x satisfies
x ∈ [0, 1] and |x − 1| < δ and continue with our previous calculation to get
|x2 − 1| = |x − 1||x + 1| ≤ 2.1|x − 1| < 2.1(ε/2.1) = ε
(Step 2: δ works). Thus, if we considered f (x) = x2 defined on [0, 1], we would
know that f is continuous at x0 = 1.
We now have a δ = min{0.1, ε/2.1} that our earlier argument indicates
might work to show that f is uniformly continuous on (0, 1). We suppose that
x, y ∈ D = (0, 1) satisfy |x − y| < δ and consider |f (x) − f (y)| = |x2 − y 2 | =
|x − y||x + y|. Clearly for x, y ∈ (0, 1), |x + y| < 2. Thus we have
|f (x) − f (y)| = |x2 − y 2 | = |x − y||x + y| < 2|x − y| < 2δ ≤ 2ε/2.1 < ε.

Therefore f is uniformly continuous on D = (0, 1).
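The δ = min{0.1, ε/2.1} just derived can be spot-checked numerically. The Python below (our own illustration; the number of sample points is arbitrary) takes points x spread through (0, 1), pairs each with a nearby y, and verifies that |x − y| < δ forces |x2 − y 2 | < ε for ε = 0.001:

```python
eps = 0.001
delta = min(0.1, eps / 2.1)   # the delta derived above, roughly 0.000476

n = 1000
for k in range(1, n):
    x = k / n                 # sample points spread through (0, 1)
    y = x + 0.9 * delta       # a nearby point, still inside (0, 1)
    assert abs(x - y) < delta
    assert abs(x * x - y * y) < eps   # the uniform-continuity estimate holds
```

Passing such a check does not prove uniform continuity, of course; the proof above does, and the check merely confirms the arithmetic.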


In summary we see that when we prove that a function is continuous on a set,
the derived δ may be different at each point of the set. To prove uniform con-
tinuity we must find a δ that works uniformly throughout the entire domain.
We saw that if a function is going to be uniformly continuous, one way to find
the correct δ is to consider continuity at the steepest point of the graph of the
function. Moreover, in the example considered above we first literally proved (at
least determined the correct δ) uniform continuity of f (x) = x2 on the larger
domain [0, 1]—and then used this information to prove uniform continuity on
(0, 1). In effect, we used the following proposition which is trivial to prove.
Proposition 5.5.2 Suppose the function f : D → R, D ⊂ R, is uniformly
continuous on D. If D1 ⊂ D, then f is uniformly continuous on D1 .

One very important but easy result is the following.


Proposition 5.5.3 Suppose the function f : D → R, D ⊂ R, is uniformly con-
tinuous on D. If x0 is any point in D, then f is continuous at x = x0 .

It should be pretty clear how we find a function that is not uniformly
continuous—a function that doesn’t have a steepest point, whose graph keeps
getting steeper and steeper. Consider the function f : D = (0, 1) → R defined
by f (x) = 1/x. It is not difficult to show that f is continuous on D. It’s more
difficult to show that f is not uniformly continuous. To show that f is not
uniformly continuous on D = (0, 1) we must show that for some ε > 0 and any
δ there exist xδ , yδ ∈ D such that |xδ − yδ | < δ and |1/xδ − 1/yδ | ≥ ε. Let ε = 1
and consider any δ > 0. The graph of f (x) = 1/x on (0, 1) gets steep near x = 0
so that’s where we have to work. Consider the points xn = 1/n and yn = 1/2n
for n ∈ N. Then |xn − yn | = 1/2n and |f (xn ) − f (yn )| = |1/xn − 1/yn | = n.
Clearly we can find an n such that 1/2n < δ (and this will hold for all larger n)
and n ≥ ε = 1. For this value of n we have |xn − yn | < δ and |1/xn − 1/yn | =
n ≥ 1. Thus f is not uniformly continuous on (0, 1).
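Numerically the failure is easy to see. In the Python below (our own illustration) the pairs xn = 1/n and yn = 1/2n squeeze together while the gap between f (xn ) and f (yn ) grows without bound:

```python
def f(x):
    return 1.0 / x

for n in [1, 10, 100, 1000]:
    xn, yn = 1.0 / n, 1.0 / (2 * n)
    # the points get arbitrarily close as n grows ...
    assert abs(xn - yn) <= 1.0 / (2 * n) + 1e-15
    # ... but |f(xn) - f(yn)| = |n - 2n| = n keeps growing
    assert abs(abs(f(xn) - f(yn)) - n) < 1e-9
```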
There is a result that makes this last proof a bit easier. Since it is an ”if and
only if” result, the following proposition provides for an alternative definition
of uniform continuity.
Proposition 5.5.4 Suppose the function f : D → R, D ⊂ R. The function f
is uniformly continuous if and only if for all sequences {un }, {vn } in D such
that limn→∞ [un − vn ] = 0, we have limn→∞ [f (un ) − f (vn )] = 0.

Proof: (⇒) Let ε > 0 be given. Since f is uniformly continuous on D, there
exists δ such that x, y ∈ D and |x − y| < δ implies that |f (x) − f (y)| < ε.
Suppose that {un } and {vn } are two sequences in D such that limn→∞ [un − vn ] =
0. Apply the definition of the limit of a sequence to this statement with the

“traditional” ε replaced by δ. Then there exists an N ∈ R such that n > N
implies that |(un − vn ) − 0| < δ. Then for n > N we have |un − vn | < δ so we can
apply the definition of uniform continuity given above to get |f (un ) − f (vn )| < ε.
Therefore limn→∞ [f (un ) − f (vn )] = 0.
(⇐) Suppose f is not uniformly continuous on D, i.e. suppose that for some ε0 >
0 and any δ there exist xδ , yδ ∈ D such that |xδ − yδ | < δ and |f (xδ ) − f (yδ )| ≥ ε0 .
We inductively define two sequences {xn } and {yn } in the following manner.
Set δ = 1. Then there exist x1 , y1 ∈ D such that |x1 − y1 | < 1 and |f (x1 ) −
f (y1 )| ≥ ε0 .
Set δ = 1/2. Then there exist x2 , y2 ∈ D such that |x2 − y2 | < 1/2 and |f (x2 ) −
f (y2 )| ≥ ε0 .
In general set δ = 1/n. Then there exist xn , yn ∈ D such that |xn − yn | < 1/n
and |f (xn ) − f (yn )| ≥ ε0 for all n ∈ N.
We thus have two sequences {xn }, {yn } such that xn − yn → 0 as n → ∞ and
f (xn ) − f (yn ) does not converge to 0. This is a contradiction to the hypothesis.
We feel that when the above statement is used as the definition it is a rather
odd definition. However, Proposition 5.5.4 gives us an excellent approach to
show that a function is not uniformly continuous. For the example considered
earlier, f (x) = 1/x defined on D = (0, 1), we define un = 1/n and vn = 1/2n.
Then
limn→∞ [un − vn ] = limn→∞ [1/n − 1/2n] = limn→∞ 1/2n = 0
and
limn→∞ [f (un ) − f (vn )] = limn→∞ (n − 2n) = − limn→∞ n = −∞ (not zero).

Thus by Proposition 5.5.4 f is not uniformly continuous on D = (0, 1). Note


that this is essentially what we did earlier—but now we have a proposition that
we can easily apply.
We next include a result that is a logical and necessary result—we had an
analogous result for limits and continuity.
Proposition 5.5.5 Suppose that f, g : D → R, D ⊂ R, are uniformly continu-
ous on D. If c1 , c2 ∈ R then c1 f + c2 g is uniformly continuous on D.
Proof: Let ε > 0 be given. Since f and g are uniformly continuous on D,
for ε1 > 0, ε2 > 0 there exist δ1 , δ2 such that x, y ∈ D, |x − y| < δ1 implies
|f (x) − f (y)| < ε1 and x, y ∈ D, |x − y| < δ2 implies |g(x) − g(y)| < ε2 . Let
δ = min{δ1 , δ2 } (Step 1: Define δ) and assume x, y ∈ D and |x − y| < δ. Then

|(c1 f (x) + c2 g(x)) − (c1 f (y) + c2 g(y))| ≤ |c1 ||f (x) − f (y)| + |c2 ||g(x) − g(y)|
< |c1 |ε1 + |c2 |ε2 .

Then if we choose ε1 = ε/(2|c1 |) and ε2 = ε/(2|c2 |) (assuming c1 , c2 ≠ 0—if
either constant is zero, the corresponding term vanishes and causes no difficulty),
we have |(c1 f (x) + c2 g(x)) − (c1 f (y) + c2 g(y))| < ε/2 + ε/2 = ε (Step 2: δ works)
so c1 f + c2 g is uniformly continuous on D.

Notice that we have not included results for products and quotients of uni-
formly continuous functions. See HW5.5.3. We have one more very important
result relating continuity and uniform continuity.

Proposition 5.5.6 Suppose that f : [a, b] → R is continuous on [a, b]. Then f


is uniformly continuous on [a, b].

Proof: Suppose ε > 0 is given. For x0 ∈ [a, b], since f is continuous at x0 ,
there exists a δx0 such that |x − x0 | < δx0 implies |f (x) − f (x0 )| < ε/2.
This can be done for every x0 ∈ [a, b], i.e. for each x0 ∈ [a, b] we get a δx0 .
This construction produces the collection of sets {Gx0 }x0 ∈[a,b] , where
Gx0 = (x0 − δx0 /2, x0 + δx0 /2) = {x ∈ R : |x − x0 | < δx0 /2}. The sets Gx0 are
clearly open because they’re open intervals. We get [a, b] ⊂ ∪x0 ∈[a,b] Gx0 because
there is an open interval around each point of [a, b], i.e. the collection of open
sets {Gx0 }x0 ∈[a,b] is an open cover of [a, b].
Then by the fact that [a, b] is compact, Proposition 2.3.7 and the definition
of compactness, Definition 2.3.1, there exists a finite subcover of [a, b], i.e. there
exist a finite number of these open intervals Gx1, · · · , Gxn such that
[a, b] ⊂ ∪_{j=1}^{n} Gxj. Remember that these open sets are intervals with radius
(1/2)δxj, j = 1, · · · , n. Let δ = (1/2) min{δx1, · · · , δxn} (Step 1: Define δ).
Now consider x, y ∈ [a, b] such that |x − y| < δ. Since {Gx1, · · · , Gxn} covers
[a, b] and x ∈ [a, b], there exists Gxi0 such that x ∈ Gxi0. Then
|x − xi0| < (1/2)δxi0 < δxi0 and |f(x) − f(xi0)| < ε/2. Also

|y − xi0| = |(y − x) + (x − xi0)| ≤* |y − x| + |x − xi0| < δ + (1/2)δxi0 ≤** (1/2)δxi0 + (1/2)δxi0 = δxi0,

where the "≤*" inequality is due to the triangle inequality, and the "≤**"
inequality is due to the definition of δ = (1/2) min{δx1, · · · , δxn}. Thus we have
|f(y) − f(xi0)| < ε/2. Then

|f(x) − f(y)| = |(f(x) − f(xi0)) + (f(xi0) − f(y))| ≤* |f(x) − f(xi0)| + |f(xi0) − f(y)| < ε/2 + ε/2 = ε

(Step 2: δ works), where the "≤*" inequality is due to the triangle inequality.
Therefore f is uniformly continuous on [a, b].
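By contrast with 1/x on (0, 1), Proposition 5.5.6 guarantees a single δ for f(x) = x² on [0, 1]: since |x² − y²| = |x + y||x − y| ≤ 2|x − y| there, δ = ε/2 works at every pair of points. A small random spot-check (our illustration, with arbitrary sample values, not part of the text):

```python
# For f(x) = x**2 on [0, 1], the single choice delta = eps/2 works at
# every pair of nearby points -- the conclusion of Proposition 5.5.6.
import random

random.seed(0)
eps = 1e-3
delta = eps / 2
for _ in range(10000):
    x = random.random()
    # pick y within delta of x, clipped back into [0, 1]
    y = min(1.0, max(0.0, x + random.uniform(-delta, delta)))
    assert abs(x * x - y * y) < eps
```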
We next state the more general result; the proof is exactly the same as that
of Proposition 5.5.6.

Proposition 5.5.7 Suppose that f : K → R, K ⊂ R compact, is continuous
on K. Then f is uniformly continuous on K.
HW 5.5.1 (True or False and why)
(a) If f is uniformly continuous on (0, 1), then f is uniformly continuous on
[0, 1].
(b) If f is uniformly continuous on (0, 1) and continuous at points x = 0 and
x = 1, then f is uniformly continuous on [0, 1].
(c) If the domain of f is all of R, then f cannot be uniformly continuous.
(d) If D is the domain of f and f (D) = R, then f cannot be uniformly continuous
on D.
(e) The set D = [0, 1] ∩ Q is not compact. If D is the domain of f , then f
cannot be uniformly continuous on D.
HW 5.5.2 (a) Show that the function f : (0, 1) → R defined by f(x) = 3x² + 1
is uniformly continuous.
(b) Show that f : (2, ∞) → R defined by f(x) = 1/x² is uniformly continuous.
(c) Show that the function f : R → R defined by f(x) = x³ is not uniformly
continuous on R.

HW 5.5.3 Suppose f, g : D → R, D ⊂ R, are both uniformly continuous on D.
(a) Show that f g need not be uniformly continuous on D.
(b) Suppose f and g are bounded on D. Prove that f g is uniformly continuous
on D.
(c) Show that f /g need not be uniformly continuous on D.
(d) Suppose that f and g are bounded on D and g is bounded below by a
positive number on D. If possible, prove that f /g is uniformly continuous. If
not possible, show why.
5.6 Rational Exponents
In Example 1.5.1 in Section 1.5 we used the completeness of R to define √2. We
mentioned at that time that the same approach could be used to define square
roots of the rest of the positive reals, i.e. we could define the function f(x) = √x.
It would be possible to proceed in this fashion to define the functions x1/n for
n ∈ N , n ≥ 2. After the functions x1/n are defined we could consider limits of
these functions, continuity of these functions and any other operations that we
might want to apply to functions.
We decided not to proceed in this fashion. We have not used rational
exponents (except for our work with √2) until this time. We will now give an
alternative, slick approach to defining rational exponents. We use Proposition
5.4.8 to define the functions x^(1/n) (when they should exist) and Proposition
5.4.11 to show that these functions are continuous. We begin by considering the
function √x.
Example 5.6.1 Consider the function f(x) = x² on D = [0, ∞). Show that f is invertible
and that f⁻¹ is continuous on [0, ∞).
Solution: We first note that f (D) = [0, ∞) and as we saw in Section 5.4, f is strictly
increasing on D. By Proposition 5.4.8 we know that f is one-to-one, i.e. f is invertible. As
usual denote the inverse of f by f −1 . Of course f −1 : f (D) = [0, ∞) → D = [0, ∞).
In addition since D = [0, ∞) is an interval, by Proposition 5.4.11 we know that f −1 is
continuous on f (D) = [0, ∞).
In addition we note that by Proposition 5.4.10 we know that f⁻¹ is strictly
increasing.
Also, recall that f and f⁻¹ satisfy f⁻¹(f(x)) = x for all x ∈ D = [0, ∞)
and f(f⁻¹(y)) = y for all y ∈ f(D) = [0, ∞). Since f(x) = x², these identities
imply that f⁻¹(x²) = x for x ∈ [0, ∞) and (f⁻¹(y))² = y for all y ∈ [0, ∞).
The last identity suggests that we make the following definition.
Definition 5.6.1 For y ∈ [0, ∞) define √y = y^(1/2) = f⁻¹(y). √y is referred to
as the square root of y.
As you will see, this definition will be usurped by Definition 5.6.2 given below.
We included the definition of √y for emphasis. You should realize that at this
time the only properties we have are √(x²) = x for x ∈ [0, ∞) and (√y)² = y for
all y ∈ [0, ∞), the two identities associated with the definition of an inverse
function.
We next consider the function f (x) = xn for n ∈ N defined on D = [0, ∞).
We see that f (D) = [0, ∞). Using induction along with a calculation similar
to that used to show that the function g(x) = x2 is strictly increasing, we see
that f is strictly increasing on D (see HW5.6.2). Again by Proposition 5.4.8 we
know that f is one-to-one, i.e. f is invertible. Denote the inverse of f by f −1 .
Then f −1 : f (D) = [0, ∞) → D = [0, ∞) and since D = [0, ∞) is an interval,
by Proposition 5.4.11 we know that f −1 is continuous on f (D) = [0, ∞).
As always f and f⁻¹ must satisfy the identity

f⁻¹(f(x)) = x for all x ∈ D = [0, ∞), or specifically f⁻¹(x^n) = x,   (5.6.1)

and

f(f⁻¹(y)) = y for all y ∈ f(D) = [0, ∞), or specifically (f⁻¹(y))^n = y for all y ∈ [0, ∞).   (5.6.2)
We make the following definition.

Definition 5.6.2 For y ∈ [0, ∞) and n ∈ N define ⁿ√y = y^(1/n) = f⁻¹(y). ⁿ√y
is referred to as the nth root of y.
Hence the nth root of y is defined as the inverse of the function
f(x) = x^n. With this definition and the identities given in (5.6.1) and (5.6.2)
we get the following identities:

(a) (x^n)^(1/n) = x for x ∈ [0, ∞) and (b) (y^(1/n))^n = y for all y ∈ [0, ∞).   (5.6.3)
Now that the nth roots are defined we have to decide what we want to do
with these definitions. To begin with we make the following extensions of the
above definition.

1
Definition 5.6.3 (a) For n is a negative we define y 1/n = y−1/n for y ∈ (0, ∞).
m r 1/n
 m
(b) For r ∈ Q, r = n , we define y = y for y ∈ (0, ∞). If y = 0 and
r > 0, define y r = 0.
Now that we have x^r defined we have work to do. We noted in Example
1.6.3 and HW1.6.2 that for m, n ∈ N and a > 0 we have a^m a^n = a^(m+n) and
(a^m)^n = a^(mn), respectively. Of course we would like these properties to be true
for rationals also. But before we prove these arithmetic properties we must
prove that x^r is well defined. The problem is that r = m/n and r = mk/(nk) are equal
rationals. We need to know that x^(m/n) = x^(mk/nk).
Proposition 5.6.4 x^r is well defined.

Proof: We note that

(x^m)^k = x^(km) =* ((x^(1/kn))^(kn))^(km) =** ((x^(1/kn))^(km))^(kn) =*** (x^(km/kn))^(kn)   (5.6.4)

and

(x^m)^k =* (((x^(1/n))^n)^m)^k =** ((x^(1/n))^m)^(kn) =*** (x^(m/n))^(kn),   (5.6.5)

where in both cases the **-equalities are due to integer algebra and the *- and
***-equalities are due to (5.6.3)-(b) and Definition 5.6.3-(b), the definition of
y^r. Thus we have (x^(km/kn))^(kn) = (x^(m/n))^(kn). Then because h(u) = u^(kn) is
one-to-one on [0, ∞), we get x^(km/kn) = x^(m/n), so x^r is well defined.
We should note that in the last step, where we used the fact that h is one-to-
one, we could equally have said that we were taking the kn-th root of both sides
of the equality; but you might recall that the kn-th root exists precisely
because h is one-to-one.
Now that we know that x^r is well defined, it is time to start developing the
necessary arithmetic properties. We begin with the fractional part of our
arithmetic properties.
Proposition 5.6.5 Suppose m, n ∈ N. Then we get the following results.
(a) (x^m)^(1/n) = (x^(1/n))^m = x^(m/n)
(b) x^(1/n) x^(1/m) = x^(1/n + 1/m)
(c) (x^(1/n))^(1/m) = x^(1/mn)
Proof: (a) We note that by (5.6.3)-(b), x^m = ((x^m)^(1/n))^n, and by (5.6.3)-(b)
and integer algebra, x^m = ((x^(1/n))^n)^m = ((x^(1/n))^m)^n. Then since h(u) = u^n is
one-to-one, we get (x^m)^(1/n) = (x^(1/n))^m, and this last expression is the definition
of x^(m/n).

(b) We note that


h mn im+n  m+n mn h imn  1 1 mn
m+n 1/mn 1/mn
x = x = x = x(m+n)/mn = x n + m

and
h n im h m in
xm+n = xm xn = x1/n x1/m
 nm  nm h inm
= x1/n x1/m = x1/n x1/m

—where again the steps


mnare due
h to integer
inm algebra and Definition 5.6.3-(b).
1 1
Thus we have x n + m = x1/n x1/m . Since h(u) = unm is one-to-one,
1 1 1 1
we have x n x m = x n + m .
(c) Since x = (x^(1/nm))^(nm), x = (x^(1/n))^n = (((x^(1/n))^(1/m))^m)^n = ((x^(1/n))^(1/m))^(mn),
and h(u) = u^(mn) is one-to-one, we see that (x^(1/n))^(1/m) = x^(1/nm); again the reasons for the
steps are the f-f⁻¹ identity (5.6.3)-(b) and integer algebra.
We now proceed to derive the final results for rational exponents. From now
on we will stop giving the reasons for each of the steps; if you have read the proofs
of Propositions 5.6.4 and 5.6.5, you know the reasons.
Proposition 5.6.6 Suppose that r = m/n, s = p/q ∈ Q where m, n, p, q ∈ N.
Then we have the following.
(a) x^r x^s = x^(r+s)
(b) (x^r)^s = x^(rs)

Proof: (a) We see that since


nq  nq  (mq+np) nq
xr+s = x(mq+np)/nq = x1/nq
h nq imq+np h n imq h q inp
= x1/nq = xmq+np = xmq xnp = x1/n x1/q
h m inq h p inq
nq
= x1/n x1/q = (xr xs )

and h(u) = unq is one-to-one, xr+s = xr xs .


(b) Since
h nq imp  nqmp h mp inq
nq
xmp = x1/nq = x1/nq = x1/nq = (xrs ) ,
 n m p  nmp h m inp
mp m p 1/n
x = (x ) = x = x1/n = x1/n
h q inp
np 1/q
= (xr ) = (xr )
h iqnp h p inq
1/q 1/q s nq
= (xr ) = (xr ) = [(xr ) ] ,
146 5. Continuity

s
and h(u) = unq is one-to-one, xrs = (xr ) .
Thus we now have the arithmetic properties for rational exponents that we
have all known for a long time. The proofs given above are a bit tedious, but we
hope that you realize that you now have a rigorous treatment of these definitions
and properties.
We notice that the above definitions and analysis are all done for x ∈ (0, ∞).
When the appropriate rational exponents are positive, the same properties can be
proved for x = 0. It is necessary to restrict to [0, ∞) because x^n is not
one-to-one on R for n even, and therefore not invertible there. It should be clear
that it is possible to define y^(1/n) on R for n odd. For n odd we could repeat the
construction given earlier for f(x) = x^n and arrive at the definition of y^(1/n) as
y^(1/n) = f⁻¹(y), which is good because we have all taken the cube root of −27
sometime in our careers and got −3. You can do most of the arithmetic that we
developed for roots, etc. defined on [0, ∞). However, you do have to be careful.
For example, we know that 1/3 and 2/6 are two representations of the same rational
number. But (−27)^(1/3) = −3, ((−27)²)^(1/6) = 3, and ((−27)^(1/6))² is not defined.
This is not good, i.e. you must be careful when you start taking odd roots of
negative numbers.
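The same caution bites in floating-point arithmetic. In Python, `(-27) ** (1/3)` does not return −3: a negative base raised to a fractional exponent yields a complex principal root, so the real odd root has to be built by hand (a sketch; `odd_root` is our hypothetical helper, not part of the text):

```python
import math

z = (-27) ** (1 / 3)          # a complex principal cube root, not -3
a = ((-27) ** 2) ** (1 / 6)   # fine: 729**(1/6), approximately 3

def odd_root(y, n):
    # real n-th root for odd n, mirroring y^(1/n) = f^{-1}(y), f(x) = x**n
    r = abs(y) ** (1.0 / n)
    return r if y >= 0 else -r

print(type(z).__name__, a, odd_root(-27, 3))
```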
And finally we remember that we are in a chapter entitled Continuity. This
has been a very nice application of some of our continuity results but we will
now return to continuity. We obtain the following result.
Proposition 5.6.7 Suppose that r ∈ Q and define f : [0, ∞) → R by f(x) = x^r.
Then f is continuous on [0, ∞).

Proof: We write r as r = m/n, and define g(x) = x^(1/n) and h(x) = x^m. Then
f(x) = x^(m/n) = (x^(1/n))^m = h ◦ g(x). We know that h is continuous everywhere (it
is an easy polynomial). We found earlier, by Proposition 5.4.11, that g(y) = y^(1/n)
is continuous on [0, ∞) because g = F⁻¹ where F(x) = x^n. Then by Proposition
5.3.5 we see that f is continuous on [0, ∞).

HW 5.6.1 (True or False and why)


(a) If we consider f (x) = x3 defined on R, then f −1 is defined and continuous
on R. √
(b) For x ∈ R we can define x3/2 by x3 .
√ 3
(c) For x ∈ R we can define x3/2 by ( x) . √
3 √ 2
(d) For x ∈ R we can define x2/3 by either x2 or ( 3 x) .
(e) Suppose
√ r ∈ Q, r = m/n, and n is odd. Define f : R → R by f (x) = xr , or
n m
f (x) = x . Then f is continuous on R.

HW 5.6.2 (a) Prove that f (x) = x3 for x ∈ R is strictly increasing.


(b) Prove that the function f (x) = xn , x ∈ [0, ∞), n ∈ N, is strictly increasing.
Chapter 6

Differentiation

6.1 An Introduction to Differentiation
In your first course in calculus you learned about the derivative and a variety of
applications of differentiation. You found the slopes of tangent lines, velocities
and accelerations of particles, maximums and minimums, an assortment of
different rates of change, and more. The importance of the concept of a derivative
should be clear. We begin with the definition.

Definition 6.1.1 Suppose that the function f : [a, b] → R. If x0 ∈ [a, b], then
f (x) − f (x0 )
f is said to be differentiable at x = x0 if lim exists. The limit
x→x0 x − x0
0
is the derivative of f at x0 and is denoted by f (x0 ). If E ⊂ [a, b] and f is
differentiable at each point of E, then f is said to be differentiable on E. The
function f 0 : E → R defined to be the derivative at each point of E is called the
derivative function. A common notation for the derivative function is to write
dy
the function as y = f (x) and denote the derivative of f as dx —at a particular
dy dy
0 d
point x0 write either dx (x0 ) or dx x=x0 . We also denote f (x) by dx f (x).

There is an important alternative form of the limit given in the definition above.
It should be clear that if we replace the x in the limit lim_{x→x0} (f(x) − f(x0))/(x − x0) by
x0 + h, then x → x0 is the same as h → 0. Thus an alternative definition
of the derivative is given by lim_{h→0} (f(x0 + h) − f(x0))/h. There are times when this
particular limit is preferable to the limit given in Definition 6.1.1 above.
In the above definition the derivative is defined at x = a and x = b, and
the derivatives at these points are in reality right and left hand derivatives,
respectively. We can also define right and left hand derivatives at interior points
of [a, b] by using right and left hand limits, i.e. the right hand derivative of f at
x = x0 ∈ (a, b) is defined by f′(x0+) = lim_{x→x0+} (f(x) − f(x0))/(x − x0), and the left hand
derivative of f at x = x0 ∈ (a, b) is defined by f′(x0−) = lim_{x→x0−} (f(x) − f(x0))/(x − x0).
We will not do much with one-sided derivatives. Generally the results that you
need for one-sided derivatives are not difficult.
Since hopefully we are good at taking limits, it is not difficult to apply
Definition 6.1.1. In Example 4.2.4 we showed that lim_{x→4} (x³ − 64)/(x − 4) = 48, i.e. if
f(x) = x³ we showed that f′(4) = 48. We can just as easily show that

f′(x0) = lim_{x→x0} (f(x) − f(x0))/(x − x0) = lim_{x→x0} (x³ − x0³)/(x − x0)
= lim_{x→x0} (x − x0)(x² + x x0 + x0²)/(x − x0) = lim_{x→x0} (x² + x x0 + x0²) = 3x0².
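The limit above is easy to watch numerically (our illustration of Definition 6.1.1, not part of the text): difference quotients of x³ at x0 = 4 settle toward 3·4² = 48.

```python
# Difference quotients (f(x0 + h) - f(x0)) / h for f(x) = x**3 at x0 = 4
# approach f'(4) = 3 * 4**2 = 48 as h -> 0.
def diff_quotient(f, x0, h):
    return (f(x0 + h) - f(x0)) / h

f = lambda x: x ** 3
for h in (1e-1, 1e-3, 1e-5):
    print(h, diff_quotient(f, 4.0, h))
```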
We next include an extremely nice result, one that is necessary for us to be
able to proceed.
Proposition 6.1.2 Consider f : [a, b] → R and x0 ∈ [a, b]. If f is differentiable
at x = x0 , then f is continuous at x = x0 .
Proof: Note that f(x) = ((f(x) − f(x0))/(x − x0))·(x − x0) + f(x0). Then we see that

lim_{x→x0} f(x) = lim_{x→x0} [((f(x) − f(x0))/(x − x0))·(x − x0) + f(x0)]
= lim_{x→x0} (f(x) − f(x0))/(x − x0) · lim_{x→x0} (x − x0) + lim_{x→x0} f(x0)
= f′(x0) · 0 + f(x0) = f(x0),

where we can apply the appropriate limit theorems because all of the individual
limits exist. Since x0 ∈ [a, b], x0 is a limit point of [a, b]. Therefore by
Proposition 5.1.2 f is continuous at x = x0.
The above result shows that there is a hierarchy of properties of functions.
Continuous functions may be nice, but differentiable functions are nicer. It is
easy to see, by considering the absolute value function at the origin (which we
will do soon), that the converse of this result is surely not true.
In your basic calculus course the very important tools that you used
constantly to compute derivatives were "the derivative of the sum is the sum of the
derivatives, the derivative of a constant times a function is the constant times the
derivative, the product rule and the quotient rule." We now include these results.
Proposition 6.1.3 Suppose that f, g : [a, b] → R, x0 ∈ [a, b], c ∈ R, and
f′(x0) and g′(x0) exist. Then we have the following results.
(a) (cf)′(x0) = c f′(x0)
(b) (f + g)′(x0) = f′(x0) + g′(x0)
(c) (fg)′(x0) = f′(x0)g(x0) + f(x0)g′(x0)
(d) If g(x0) ≠ 0, then (f/g)′(x0) = (f′(x0)g(x0) − f(x0)g′(x0))/[g(x0)]².
Proof: (a) & (b) The proofs of (a) and (b) are direct applications of
Proposition 4.3.1 parts (b) and (a).
(c) We note that

((fg)(x) − (fg)(x0))/(x − x0) = (f(x)g(x) − f(x0)g(x0))/(x − x0)
= f(x)·(g(x) − g(x0))/(x − x0) + g(x0)·(f(x) − f(x0))/(x − x0).

(We added and subtracted terms to get the last expression; if you simplify it,
you will see that it is the same as the one before it.) Then

lim_{x→x0} ((fg)(x) − (fg)(x0))/(x − x0) = lim_{x→x0} [f(x)·(g(x) − g(x0))/(x − x0) + g(x0)·(f(x) − f(x0))/(x − x0)]
= lim_{x→x0} f(x) · lim_{x→x0} (g(x) − g(x0))/(x − x0) + g(x0) · lim_{x→x0} (f(x) − f(x0))/(x − x0)   (6.1.1)
(by Proposition 4.3.1-(a), (b) & (d))
= f(x0)g′(x0) + g(x0)f′(x0).   (6.1.2)

(To allow us to take the limits that get us from (6.1.1) to (6.1.2) we use the fact
that if f is differentiable at x0, then f is continuous at x0, Proposition 6.1.2, and
of course Definition 6.1.1.)
Therefore we get the product rule, (fg)′(x0) = f(x0)g′(x0) + f′(x0)g(x0).
(d) We attack the quotient rule in a similar way. We note that

((f/g)(x) − (f/g)(x0))/(x − x0) = (f(x)/g(x) − f(x0)/g(x0))/(x − x0) = (f(x)g(x0) − g(x)f(x0))/(g(x)g(x0)(x − x0))
= (1/(g(x)g(x0))) [g(x0)·(f(x) − f(x0))/(x − x0) − f(x0)·(g(x) − g(x0))/(x − x0)].

(To get from the second expression to the last one we have added and subtracted
terms again. You can simplify the last expression to see that it is equal to the
one before it.) Then

lim_{x→x0} ((f/g)(x) − (f/g)(x0))/(x − x0)
= lim_{x→x0} (1/(g(x)g(x0))) [g(x0)·(f(x) − f(x0))/(x − x0) − f(x0)·(g(x) − g(x0))/(x − x0)]   (6.1.3)
= (1/[g(x0)]²) [g(x0)f′(x0) − f(x0)g′(x0)].   (6.1.4)

(Note that to get to (6.1.4) from (6.1.3) we have used parts (a), (b), (d) and
(f) of Proposition 4.3.1 along with Definition 6.1.1. Again it is very important
that, by Proposition 6.1.2, since g is differentiable at x0, g is continuous at
x0, and nonzero, so that we can take the limit in the denominator.)

Thus we have the quotient rule, (f/g)′(x0) = (g(x0)f′(x0) − f(x0)g′(x0))/[g(x0)]².
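The product and quotient rules can be spot-checked numerically (our illustration; the centered difference quotient is simply a convenient approximation to the limit in Definition 6.1.1):

```python
import math

def num_deriv(F, x0, h=1e-6):
    # centered difference quotient approximating F'(x0)
    return (F(x0 + h) - F(x0 - h)) / (2 * h)

f, fp = math.sin, math.cos                       # f and its known derivative
g, gp = (lambda x: x * x + 1), (lambda x: 2 * x)
x0 = 0.7

product_rule = fp(x0) * g(x0) + f(x0) * gp(x0)
quotient_rule = (fp(x0) * g(x0) - f(x0) * gp(x0)) / g(x0) ** 2

assert abs(num_deriv(lambda x: f(x) * g(x), x0) - product_rule) < 1e-6
assert abs(num_deriv(lambda x: f(x) / g(x), x0) - quotient_rule) < 1e-6
```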
One of the very basic and useful theorems that you learned and used often
in your Calc I course was the Chain Rule. We state the following theorem.
Proposition 6.1.4 Consider the functions f : [a, b] → R, g : [c, d] → R where
f ([a, b]) ⊂ [c, d] and x0 ∈ [a, b]. Suppose that f is differentiable at x = x0 ∈ [a, b]
and g is differentiable at y = f (x0 ) ∈ [c, d]. Then g ◦f is differentiable at x = x0
and (g ◦ f )0 (x0 ) = g 0 (f (x0 ))f 0 (x0 ).
Proof: You should realize that this is a difficult proof. The proof given here
is clearly not difficult, but it is tricky. Read it carefully; otherwise, before you
know what we are doing, we will be done.
Define h : [c, d] → R by

h(y) = (g(y) − g(f(x0)))/(y − f(x0)) if y ≠ f(x0), and h(y) = g′(f(x0)) if y = f(x0).

Since g is differentiable at y = f(x0), h is continuous at y = f(x0); clearly

lim_{y→f(x0)} h(y) = lim_{y→f(x0)} (g(y) − g(f(x0)))/(y − f(x0)) = g′(f(x0)) = h(f(x0)).

Note that g(y) − g(f(x0)) = h(y)(y − f(x0)) for all y ∈ [c, d]; specifically check
the identity at y = f(x0). We let y = f(x) and get g(f(x)) − g(f(x0)) =
h(f(x))(f(x) − f(x0)) for all x ∈ [a, b].
Thus

(g ◦ f)′(x0) = lim_{x→x0} (g ◦ f(x) − g ◦ f(x0))/(x − x0) = lim_{x→x0} h(f(x))·(f(x) − f(x0))/(x − x0).   (6.1.5)

Since f is differentiable at x = x0, (f(x) − f(x0))/(x − x0) → f′(x0). Also, since f is
differentiable at x = x0, then f is continuous at x = x0. And finally, since f
is continuous at x = x0 and h is continuous at y = f(x0), by Proposition 5.3.5
h ◦ f is continuous at x = x0. Returning to (6.1.5) we get

(g ◦ f)′(x0) = lim_{x→x0} h(f(x))·(f(x) − f(x0))/(x − x0) = h(f(x0))f′(x0),

or (g ◦ f)′(x0) = g′(f(x0))f′(x0).
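The chain rule value g′(f(x0))f′(x0) can also be compared against a difference quotient for g ◦ f (our numerical illustration, not part of the proof):

```python
import math

f, fp = (lambda x: x ** 2 + 1), (lambda x: 2 * x)
g, gp = math.sin, math.cos
x0, h = 0.3, 1e-6

chain_value = gp(f(x0)) * fp(x0)                    # g'(f(x0)) * f'(x0)
quotient = (g(f(x0 + h)) - g(f(x0 - h))) / (2 * h)  # centered quotient for g o f
assert abs(quotient - chain_value) < 1e-6
```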
Often in texts of a variety of levels the justification of the chain rule is given
approximately as follows. We note that

(g(f(x)) − g(f(x0)))/(x − x0) = ((g(f(x)) − g(f(x0)))/(f(x) − f(x0))) · ((f(x) − f(x0))/(x − x0))
= ((g(y) − g(f(x0)))/(y − f(x0))) · ((f(x) − f(x0))/(x − x0)),   (6.1.6)

where we have set y = f(x). The argument made is that as x → x0, y → f(x0), so
(6.1.6) implies (g ◦ f)′(x0) = g′(f(x0))f′(x0). Most often, if you read the texts
carefully, they do not claim that it is a proof. But you have to read it carefully.
The difference is between the statements
lim_{y→f(x0)} (g(y) − g(f(x0)))/(y − f(x0))   (6.1.7)

and

lim_{x→x0} (g(f(x)) − g(f(x0)))/(f(x) − f(x0)).   (6.1.8)

Expression (6.1.8) is what we really have, and we sneakily replaced it by (6.1.7).
They are not the same. Clearly the limit in (6.1.7) is g′(f(x0)). The problem
with (6.1.8) is that the function f may be such that f(x) − f(x0) has zeros in every
neighborhood of x = x0. In that case it should be clear that for any given ε we cannot
find a δ such that 0 < |x − x0| < δ implies |(g(f(x)) − g(f(x0)))/(f(x) − f(x0)) − L| < ε, for any
L (including L = g′(f(x0))), because no matter which δ is chosen, we get zeros
in the denominator. Thus our proof given above dances around this difficulty.
The "non-proof" given in the last paragraph is useful if given honestly. It is
a good indication that the Chain Rule is true. If you add the hypothesis that
"for some δ1 the function f satisfies f(x) ≠ f(x0) when 0 < |x − x0| < δ1,"
then it is a proof. And lastly, the type of function that could cause the problems
described in the last paragraph is the function f3 defined in Example 6.2.4; so,
as you will see, it has to get fairly ugly.
HW 6.1.1 (True or False and why)
(a) Suppose f : [0, 1] → R, x0 ∈ [0, 1], is such that f² is differentiable at x = x0.
Then f is differentiable at x = x0.
(b) Suppose f : [a, b] → R, x0 ∈ [a, b], is such that f is differentiable at x = x0 .
Then f 2 is differentiable at x = x0 .
(c) Suppose f : [a, b] → R, x0 ∈ [a, b], is such that f is continuous at x = x0 .
Then f is differentiable at x = x0 .
(d) Suppose f, g : [a, b] → R, x0 ∈ [a, b], are such that f + g is differentiable at
x = x0 . Then f and g are differentiable at x = x0 .
(e) Suppose f, g : [a, b] → R, x0 ∈ [a, b], are such that f + g and f are differen-
tiable at x = x0 . Then g is differentiable at x = x0 .
HW 6.1.2 Suppose that f1, · · · , fn : [a, b] → R, x0 ∈ [a, b], are all differentiable
at x = x0. Then prove that f1 + · · · + fn is differentiable at x = x0.
HW 6.1.3 Suppose f : [a, b] → R, g : [c, d] → R, h : [e1, e2] → R are such
that f([a, b]) ⊂ [c, d], g([c, d]) ⊂ [e1, e2], f is differentiable at x = x0, g is
differentiable at y = f(x0) and h is differentiable at z = g ◦ f(x0). Prove that
(h ◦ g ◦ f)′(x0) = h′(g ◦ f(x0)) g′(f(x0)) f′(x0).
6.2 Computation of Some Derivatives
Before we can proceed we should compute some derivatives. Definition 6.1.1,
Proposition 6.1.3 and Proposition 6.1.4 give us tools that allow us to compute
derivatives and reduce a problem involving a difficult expression to several easier
problems—that’s how we used these results in our basic course. We begin with
the derivatives of a few of the basic functions.
Example 6.2.1 Show that
(a) (d/dx) c = 0 where c ∈ R.
(b) (d/dx) x = 1.
(c) (d/dx) x^n = n x^(n−1) for n ∈ Z.
Solution: (a) We note that

(d/dx) c = lim_{x→x0} (f(x) − f(x0))/(x − x0) = lim_{x→x0} (c − c)/(x − x0) = 0.
(b) We see that

(d/dx) x = lim_{x→x0} (f(x) − f(x0))/(x − x0) = lim_{x→x0} (x − x0)/(x − x0) = 1.

Remember that we can divide out the x − x0 terms because of the "0 <" part of the definition
of a limit.
(c) For n = 0 the statement is true by part (a). We next prove the formula for n ∈ N. We
prove this statement by mathematical induction, i.e. (d/dx) x^n = n x^(n−1) for n ∈ N.
Step 1: Show true for n = 1: The statement is true for n = 1 by part (b) of this example, i.e.
(d/dx) x¹ = 1·x⁰ = 1.
Step 2: Assume true for n = k, i.e. assume that (d/dx) x^k = k x^(k−1).
Step 3: Prove true for n = k + 1, i.e. prove that (d/dx) x^(k+1) = (k + 1)x^k. We note that

(d/dx) x^(k+1) = (d/dx)(x·x^k) =* x·(d/dx) x^k + x^k·(d/dx) x = x·(k x^(k−1)) + x^k·1 = k x^k + x^k = (k + 1)x^k,

where step "=*" is due to Proposition 6.1.3-(c), the product rule.
By mathematical induction the statement is true for all n ∈ N, i.e. (d/dx) x^n = n x^(n−1).
And finally we consider n ∈ Z, n < 0. Then we have

(d/dx) x^n = (d/dx) (1/x^(−n)), where now we should note that −n > 0,
= (0·x^(−n) − 1·(−n)x^(−n−1))/[x^(−n)]², by Proposition 6.1.3-(d), the quotient rule,
= n x^(−n−1+2n) = n x^(n−1).

Thus for all n ∈ Z we have (d/dx) x^n = n x^(n−1).
We should add that when n = 0, x^n is not defined at x = 0. Thus anytime
we wrote x⁰, we were using the domain (−∞, 0) ∪ (0, ∞). Also, of course, in part (c) when
we considered n < 0 we do not want to include x = 0. At all other times our domain is R.
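The negative-exponent case can be checked numerically as well (our illustration): for n = −2 the difference quotients of x^n approach n x^(n−1) = −2x⁻³.

```python
# d/dx x**n = n * x**(n - 1) tested for n = -2 at x0 = 1.5 (away from 0).
n, x0, h = -2, 1.5, 1e-6
exact = n * x0 ** (n - 1)
quotient = ((x0 + h) ** n - (x0 - h) ** n) / (2 * h)
assert abs(quotient - exact) < 1e-6
```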
Note that a common approach to Example 6.2.1-(c) for n > 0 is to apply
the definition and note that

lim_{x→x0} (x^n − x0^n)/(x − x0) = lim_{x→x0} (x − x0)(x^(n−1) + x^(n−2) x0 + · · · + x x0^(n−2) + x0^(n−1))/(x − x0)   (6.2.1)
= lim_{x→x0} (x^(n−1) + x^(n−2) x0 + · · · + x x0^(n−2) + x0^(n−1)) = n x0^(n−1).
We must realize that this proof has two "obvious" mathematical induction proofs
hidden in the middle: the "· · ·"s.
If we apply parts (a), (b) and (c) of Proposition 6.1.3 along with Example
6.2.1, we see that any polynomial is differentiable and (a0 x^m + a1 x^(m−1) + · · · +
a_(m−1) x + a_m)′ = m a0 x^(m−1) + (m − 1) a1 x^(m−2) + · · · + a_(m−1). Likewise, if in addition
we apply part (d) of Proposition 6.1.3, we find that any rational function is
differentiable at all points where the denominator is not zero and

(p(x)/q(x))′ = (p′(x)q(x) − p(x)q′(x))/[q(x)]²,

where p and q are polynomials.
We now include several more examples where we compute the derivative of
a function or show that the function is not differentiable.
Example 6.2.2 Show that for x ∈ (0, ∞), (d/dx) √x = 1/(2√x).
Solution: We apply the definition and note that

(d/dx) √x = lim_{x→x0} (√x − √x0)/(x − x0) = lim_{x→x0} ((√x − √x0)/(x − x0)) · ((√x + √x0)/(√x + √x0))
= lim_{x→x0} (x − x0)/((x − x0)(√x + √x0)) = 1/(2√x0).

Again we get to divide out the x − x0 term because in the definition of a limit, we only consider
x − x0 ≠ 0.
We note that since √x is not defined for x < 0, we know that we cannot
worry about the derivative there. If we consider x0 = 0, we see that

lim_{x→0+} (√x − 0)/(x − 0) = lim_{x→0+} 1/√x = ∞.

(We have used a one-sided limit to emphasize the fact that we cannot consider
x < 0. We don't really know that this limit is ∞, even though we hope we
do; we could use the methods in Section 4.4 to prove that this limit is ∞.)
Since this limit does not exist in R, the derivative of √x does not exist at
x = 0. However, this computation is useful when we use the derivative to give
the slope of the tangent to the curve. The above computation shows that at
x = 0 the tangent line is vertical, which is surely better information than just
being told that there is no tangent at that point.
If we think about the approach used for the above example, we can use the
analogous approach to show that (d/dx) ³√x = 1/(3 ³√(x²)) for x ∈ R, x ≠ 0, that
(d/dx) ⁴√x = 1/(4 ⁴√(x³)) for x ∈ (0, ∞), etc.
We next include an example containing an interesting limit and the very
important application of that limit that gives the derivatives of the trig
functions.
Example 6.2.3 (a) Prove that lim_{θ→0} (sin θ)/θ = 1.
(b) Prove that lim_{θ→0} (1 − cos θ)/θ = 0.
(c) Show that (d/dx) sin x = cos x.
Solution: (a) In Figure 6.2.1 we notice that given that angle ∠POA is θ, then |OB| = cos θ,
|BP| = sin θ and |AQ| = tan θ. We also note that the area of triangle △OAP is (1/2) sin θ,
the area of the sector OAP is (1/2)θ and the area of triangle △OAQ is (1/2) tan θ (remember that
|OA| = 1). Also, the area of triangle △OAP is less than the area of sector OAP, which is less than
the area of triangle △OAQ, i.e. we have that

(1/2) sin θ < (1/2)θ < (1/2) tan θ, or 1 < θ/(sin θ) < 1/(cos θ).

Inverting these inequalities very carefully gives us that cos θ < (sin θ)/θ < 1. Since lim_{θ→0} cos θ = 1
(Example 5.2.3-(c)) and lim_{θ→0} 1 = 1, we can apply the Sandwich Theorem, Proposition 4.3.4,
to see that lim_{θ→0} (sin θ)/θ = 1.
Figure 6.2.1: This is a quarter circle of radius 1. PB and QA are perpendicular
to the x axis.
(b) Given part (a), part (b) is easy. We see that

(1 − cos θ)/θ = ((1 − cos θ)/θ)·((1 + cos θ)/(1 + cos θ)) = (1 − cos²θ)/(θ(1 + cos θ)) = sin²θ/(θ(1 + cos θ)) = ((sin θ)/θ)·((sin θ)/(1 + cos θ)).

Then from part (a) and the fact that lim_{θ→0} (sin θ)/(1 + cos θ) = 0 (we know both sin and cos are
continuous at 0), we apply Proposition 4.3.1 to obtain the desired result.
(c) In order to apply the definition of the derivative to f(x) = sin x we note that

(f(x + h) − f(x))/h = (sin(x + h) − sin x)/h = (sin x cos h + sin h cos x − sin x)/h
= sin x·((cos h − 1)/h) + cos x·((sin h)/h).   (6.2.2)

Then using parts (a) and (b) we get

f′(x) = lim_{h→0} (f(x + h) − f(x))/h = lim_{h→0} [sin x·((cos h − 1)/h) + cos x·((sin h)/h)] = cos x.
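The two limits in parts (a) and (b), and the resulting derivative, can be watched numerically (our illustration, not part of the solution):

```python
import math

# sin(h)/h -> 1 and (cos(h) - 1)/h -> 0 as h -> 0 (parts (a) and (b)).
for h in (1e-2, 1e-4, 1e-6):
    print(h, math.sin(h) / h, (math.cos(h) - 1) / h)

# Part (c): difference quotients of sin at x approach cos(x).
x, h = 1.0, 1e-6
quotient = (math.sin(x + h) - math.sin(x)) / h
assert abs(quotient - math.cos(x)) < 1e-5
```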
Of course the derivatives of the rest of the trig functions follow from the
derivative of the sine function.
In Example 4.2.7 we showed that the limit of the function

f1(x) = sin(1/x) if x ≠ 0, and f1(x) = 0 if x = 0

does not exist at x = 0; hence we know that f1 is not continuous at x = 0, so
it is surely not differentiable at x = 0. In Example 5.2.4 we showed that the
function

f2(x) = x sin(1/x) if x ≠ 0, and f2(x) = 0 if x = 0

is continuous at x = 0; thus it is at least a candidate for differentiability. We
next include the following example.
Example 6.2.4 (a) Show that f2 is not differentiable at x = 0.
(b) Show that the function f3 : R → R defined by

f3(x) = x² sin(1/x) if x ≠ 0, and f3(x) = 0 if x = 0

is differentiable at x = 0.
Solution: (a) We note that

lim_{x→0} (f2(x) − f2(0))/(x − 0) = lim_{x→0} sin(1/x).

We know that this last limit does not exist, by the same approach that we used to show that
f1 was not continuous at x = 0. Therefore the derivative of f2 does not exist at x = 0.
(b) We start the same way that we started with part (a) and note that

lim_{x→0} (f3(x) − f3(0))/(x − 0) = lim_{x→0} x sin(1/x).

We showed in Example 5.2.4 that this last limit exists and equals zero. Therefore f3 is
differentiable at x = 0 and f3′(0) = 0.
You should view the functions f1, f2, and f3 as a related series of functions,
admittedly not especially nice functions, that have obvious similarities. The
function f2 is smoothed enough so that it is continuous at x = 0 (where f1 was
not) but not differentiable there. The function f3 is smoothed more; it is
differentiable at x = 0 and hence also continuous there. In addition, we should
realize that all of the functions f1, f2 and f3 are differentiable when x ≠ 0.
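The different behavior of f2 and f3 at x = 0 is visible in their difference quotients (our numerical illustration): for f2 the quotient is sin(1/x), which keeps oscillating, while for f3 it is x sin(1/x), which is squeezed to 0.

```python
import math

f2 = lambda x: x * math.sin(1 / x) if x != 0 else 0.0
f3 = lambda x: x * x * math.sin(1 / x) if x != 0 else 0.0

for x in (1e-2, 1e-4, 1e-6):
    dq2 = (f2(x) - f2(0)) / x   # equals sin(1/x): oscillates, no limit
    dq3 = (f3(x) - f3(0)) / x   # equals x*sin(1/x): tends to 0
    print(x, dq2, dq3)
```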
HW 6.2.1 Consider the function f(x) = x³ − 2x² + x − 1 defined on R.
(a) Use Definition 6.1.1 to compute f′(2).
(b) Compute f′(x).
HW 6.2.2 Compute (d/dx) x^(1/3).

HW 6.2.3 Compute lim_{θ→0} (sin 3θ)/(sin 5θ).
HW 6.2.4 Consider the function f(x) = x² if x ∈ [−1, 1] ∩ Q, and f(x) = −x² if x ∈ [−1, 1] ∩ I.
(a) Is f differentiable at x ∈ [−1, 1], x ≠ 0? If so, compute f′(x).
(b) Is f differentiable at x = 0? If so, compute f′(0).

HW 6.2.5 We saw in Example 6.2.4 that f3 is differentiable at x = 0.
(a) Show that f3 is differentiable for x ≠ 0.
(b) Determine where f3' is continuous.
6.3 Some Differentiation Theorems

Now that we have some of the basic properties of the derivative, it is time to develop some additional applications of differentiation. There are many very important applications of differentiation, and you have surely seen some of these in your basic course.
We begin with a very important result—but one we want now more as a lemma. Recall that in Section 5.3 we defined maximums and minimums as the maximum and minimum in some neighborhood about a point—Definition 5.3.6 (a) and (c).
Proposition 6.3.1 Suppose the function f is such that f : (a, b) → R for some a, b ∈ R, a < b, and suppose that f is differentiable at x0 ∈ (a, b). If f has a maximum or minimum at the point (x0, f(x0)), then f'(x0) = 0.
Proof: Consider the case where (x0, f(x0)) is a maximum. Let N ⊂ (a, b) be a neighborhood of x0 so that f(x) ≤ f(x0) for all x ∈ N. Since

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$$

exists (f is differentiable at x0), by Proposition 4.4.8 both one-sided limits

$$\lim_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0} \quad \text{and} \quad \lim_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0}$$

exist and are equal. If x ∈ N and x < x0, then x − x0 < 0 and f(x) − f(x0) ≤ 0, and hence (f(x) − f(x0))/(x − x0) ≥ 0 (the slope is positive going uphill).

Claim 1: lim_{x→x0−} (f(x) − f(x0))/(x − x0) ≥ 0. We prove this claim by contradiction. Assume that the limit is negative, i.e. there exists L < 0 such that for every ε > 0 there exists δ such that 0 < x0 − x < δ implies

$$\left| \frac{f(x) - f(x_0)}{x - x_0} - L \right| < \varepsilon.$$

Choose ε = |L|/2. Then there exists a δ such that 0 < x0 − x < δ implies

$$\left| \frac{f(x) - f(x_0)}{x - x_0} - L \right| < \frac{|L|}{2}, \quad \text{or} \quad -\frac{|L|}{2} + L < \frac{f(x) - f(x_0)}{x - x_0} < \frac{|L|}{2} + L.$$

Since L < 0, L + |L|/2 = L/2 = −|L|/2, so (f(x) − f(x0))/(x − x0) < −|L|/2 < 0. This contradicts the fact that (f(x) − f(x0))/(x − x0) ≥ 0 for x ∈ (x0 − δ, x0) ∩ N, so we know that lim_{x→x0−} (f(x) − f(x0))/(x − x0) ≥ 0.

Claim 2: lim_{x→x0+} (f(x) − f(x0))/(x − x0) ≤ 0. This proof is very much like the last case. We show that if x ∈ N and x > x0, then (f(x) − f(x0))/(x − x0) ≤ 0 (the slope is negative going downhill). We then assume that lim_{x→x0+} (f(x) − f(x0))/(x − x0) = L > 0, apply the definition of the right-hand limit with ε = L/2 and arrive at a contradiction. Therefore lim_{x→x0+} (f(x) − f(x0))/(x − x0) ≤ 0.

Since the left-hand limit is ≥ 0, the right-hand limit is ≤ 0 and they are equal, they both must be zero. Therefore lim_{x→x0} (f(x) − f(x0))/(x − x0) = 0, i.e. f'(x0) = 0.
We can prove the analogous result for a minimum using a completely similar argument, or by considering the function −f: if f has a minimum at x0, then −f will have a maximum at x0.
Of course we know that Proposition 6.3.1 gives us a powerful tool for finding local maximums and minimums. We set f'(x) = 0—we called the solutions critical points in our basic course. This gives us all of the maximums and minimums at points at which f is differentiable, and probably a few extras. We then develop methods (which we will discuss more later) to determine which of these critical points are actually maximums or minimums. Then if we consider any points at which f is not differentiable, and maybe some endpoints, we have the maximums and minimums.
However, at this time we wanted Proposition 6.3.1 to help us prove the following theorem, commonly referred to as Rolle's Theorem.

Theorem 6.3.2 (Rolle's Theorem) Suppose that f : [a, b] → R is continuous on [a, b], differentiable on (a, b) and such that f(a) = f(b). Then there exists a ξ ∈ (a, b) such that f'(ξ) = 0.
Proof: We know from Theorem 5.3.8 that there exist x0, y0 ∈ [a, b] such that f(x0) = glb{f(x) : x ∈ [a, b]} and f(y0) = lub{f(x) : x ∈ [a, b]}. If both x0 and y0 are endpoints of [a, b] (and f(a) = f(b)), then f is constant on [a, b] and f'(x) = 0 for x ∈ (a, b), so we can choose ξ = (a + b)/2. Otherwise, either x0 or y0 is in (a, b), say x0 ∈ (a, b). Then by Proposition 6.3.1 we know that f'(x0) = 0, i.e. we can set ξ = x0.
The real reason that we want Rolle's Theorem is to help us prove the Mean Value Theorem, abbreviated MVT, which is a very important result.
Theorem 6.3.3 (Mean Value Theorem (MVT)) Suppose that f : [a, b] → R is continuous on [a, b] and differentiable on (a, b). Then there exists a ξ ∈ (a, b) such that

$$f'(\xi) = \frac{f(b) - f(a)}{b - a}.$$

Proof: This result is very easy to prove as long as we know the "trick." We set

$$h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a).$$

Then h(a) = h(b) = 0, h is continuous on [a, b] and h is differentiable on (a, b). Thus by Rolle's Theorem, Theorem 6.3.2, there exists ξ ∈ (a, b) such that h'(ξ) = 0, i.e.

$$h'(\xi) = f'(\xi) - \frac{f(b) - f(a)}{b - a} = 0,$$

which is what we were to prove.
Often the results of the Mean Value Theorem will be given in the form f(b) − f(a) = f'(ξ)(b − a).
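The conclusion of the MVT can be checked numerically for a concrete function. The sketch below (an illustration only; the choice f(x) = x³ on [0, 2] is ours) locates a ξ with f'(ξ) = (f(b) − f(a))/(b − a) by bisection; here the slope is 4 and ξ = 2/√3.

```python
def bisect(g, lo, hi, tol=1e-12):
    # standard bisection; assumes g(lo) and g(hi) have opposite signs
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

f = lambda x: x**3
a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)    # (8 - 0)/2 = 4
fprime = lambda x: 3 * x**2        # derivative of x^3
xi = bisect(lambda x: fprime(x) - slope, a, b)
print(xi)                          # about 1.1547, i.e. 2/sqrt(3)
```

For this f the equation f'(ξ) = 4 could of course be solved by hand; bisection is used only to mimic what one would do for a less convenient f.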
Figure 6.3.1: Plot of a function f, the line from (a, f(a)) to (b, f(b)) and the tangent line at x = ξ.

In Figure 6.3.1 we include a picture that is commonly used to illustrate the Mean Value Theorem. Note that the slope of the line from (a, f(a)) to (b, f(b)) is (f(b) − f(a))/(b − a). Also notice that the tangent line at x = ξ is parallel to the line from (a, f(a)) to (b, f(b)). Hence they have the same slope—and the conclusion of the MVT is that there always exists such a ξ ∈ (a, b) so that the slope of the tangent at the point (ξ, f(ξ)) is equal to the slope of the line from (a, f(a)) to (b, f(b)).
The Mean Value Theorem is important in the form given in Theorem 6.3.3 but is most important for some of the corollaries that follow easily from the theorem. Notice that we state these results for some open interval in R—including something like (a, b), (a, ∞), (−∞, b) or all of R. We begin with two results that are related to integration—even though at this time we don't know what integration is.

Corollary 6.3.4 Suppose that I is some open interval of R, f : I → R is differentiable on I and the function f is such that f'(x) = 0 for x ∈ I. Then f is constant on I.
Proof: Choose any two values x0, y0 ∈ I where x0 ≠ y0, say x0 < y0. Since f is differentiable on I we know that f is differentiable on (x0, y0), and by Proposition 6.1.2 f is continuous on [x0, y0]. We can then apply the MVT and get f(y0) − f(x0) = f'(ξ)(y0 − x0) for some ξ ∈ (x0, y0). Since f'(x) = 0 for all x ∈ I, f'(ξ) = 0 and we have that f(x0) = f(y0). Since this is true for any x0, y0 ∈ I, f must be constant on I.
Corollary 6.3.5 Suppose that I is some open interval of R, f, g : I → R are such that f and g are differentiable on I and f'(x) = g'(x) for all x ∈ I. Then there exists a constant C ∈ R such that f(x) = g(x) + C for all x ∈ I.

Proof: Define h by h(x) = f(x) − g(x). If we then apply Corollary 6.3.4 to the function h, we see that h is constant on I. This is what we wanted to prove.
As we stated earlier, you should recall that you used both of the above results often in your basic course. We next give the results that relate increasing and decreasing functions to their derivatives. Recall that in Definition 5.4.3 we defined increasing and decreasing, and strictly increasing and strictly decreasing, functions. We state the following corollary.
Corollary 6.3.6 Suppose that I is some open interval of R and f : I → R is differentiable on I. We then have the following results.
(a) If f'(x) > 0 for all x ∈ I, then f is strictly increasing on I.
(b) If f'(x) ≥ 0 for all x ∈ I, then f is increasing on I.
(c) If f'(x) < 0 for all x ∈ I, then f is strictly decreasing on I.
(d) If f'(x) ≤ 0 for all x ∈ I, then f is decreasing on I.

Proof: (a) Suppose x, y ∈ I are such that x < y. Then f is differentiable on (x, y) and continuous on [x, y], so we can apply the MVT. We get f(y) − f(x) = f'(ξ)(y − x) for some ξ ∈ (x, y). Since f'(ξ) > 0 and y − x > 0, we get that f(y) − f(x) > 0, or f(x) < f(y), i.e. f is strictly increasing.
The proofs of (b), (c) and (d) follow from the MVT in exactly the same way.
We should realize that the application of Corollary 6.3.6 along with Proposition 6.3.1 gives us a method for categorizing the maximums and minimums of a function. We use Proposition 6.3.1 (along with listing the points where the derivative does not exist) to find the potential maximums and minimums, the critical points. We handle points at which the function is not defined separately. We then evaluate f' at one point in the interval between each pair of adjacent critical points to determine whether f is strictly increasing or decreasing in that interval—if we have all critical points listed, the sign of f' cannot change in the interval. We then classify a critical point as a maximum if the curve of the function is increasing to the left of the critical point and decreasing to the right of the critical point, and as a minimum if the curve of the function is decreasing to the left of the critical point and increasing to the right of the critical point.
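The recipe just described is mechanical enough to sketch in code. The Python fragment below (our illustration; the example f(x) = x³ − 3x with f'(x) = 3x² − 3 is not from the text) samples the sign of f' once between consecutive critical points and labels each critical point by the sign change.

```python
def classify(fprime, critical_points, pad=0.5):
    # Evaluate f' at one sample point in each interval between critical
    # points (and just outside the extreme ones) and classify each
    # critical point by the sign change of f' across it.
    pts = sorted(critical_points)
    samples = ([pts[0] - pad]
               + [(p + q) / 2 for p, q in zip(pts, pts[1:])]
               + [pts[-1] + pad])
    signs = [fprime(s) > 0 for s in samples]
    labels = {}
    for i, p in enumerate(pts):
        if signs[i] and not signs[i + 1]:
            labels[p] = "max"      # increasing then decreasing
        elif not signs[i] and signs[i + 1]:
            labels[p] = "min"      # decreasing then increasing
        else:
            labels[p] = "neither"
    return labels

# f(x) = x^3 - 3x has f'(x) = 3x^2 - 3 with critical points x = -1, 1.
print(classify(lambda x: 3 * x**2 - 3, [-1.0, 1.0]))
# {-1.0: 'max', 1.0: 'min'}
```

This only works under the hypothesis stated in the text: the list of critical points must be complete, so that the sign of f' cannot change within a sampled interval.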
Also we get a very useful result from Corollary 6.3.6. From Proposition 5.4.8 we saw that if a function f is strictly monotone on the domain of the function, then the function is one-to-one. Then from Corollary 6.3.6 and Proposition 5.4.8 we obtain the following useful result.

Corollary 6.3.7 Suppose I is some open interval of R and f : I → R is differentiable on I. We then have the following results.
(a) If f'(x) > 0 for all x ∈ I, then f is one-to-one on I.
(b) If f'(x) < 0 for all x ∈ I, then f is one-to-one on I.
We next return to the situation of inverse functions. In Section 5.4, Propositions 5.4.11 and 5.4.12, we proved that if I is an interval and f is either strictly monotone on I or one-to-one and continuous on I, then f⁻¹ is continuous on f(I). We next give a differentiability result for f⁻¹.

Proposition 6.3.8 Suppose that f : I → R where I ⊂ R is an interval. Assume that f is one-to-one and continuous on I. If x0 ∈ I is not an end point of I, f is differentiable at x = x0 and f'(x0) ≠ 0, then f⁻¹ : f(I) → R is differentiable at y0 = f(x0) and

$$\left(f^{-1}\right)'(y_0) = \frac{1}{f'(x_0)}. \qquad (6.3.1)$$
Proof: Read this proof carefully. It is a very technical proof. As discussed earlier, we already know that f⁻¹ is continuous on f(I).
We know that lim_{x→x0} (f(x) − f(x0))/(x − x0) = f'(x0) ≠ 0. Hence

$$\frac{1}{f'(x_0)} = \frac{1}{\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}} = \lim_{x \to x_0} \frac{1}{\frac{f(x) - f(x_0)}{x - x_0}} = \lim_{x \to x_0} \frac{x - x_0}{f(x) - f(x_0)}.$$

Thus for every ε > 0 there exists δ such that 0 < |x − x0| < δ implies

$$\left| \frac{x - x_0}{f(x) - f(x_0)} - \frac{1}{f'(x_0)} \right| < \varepsilon. \qquad (6.3.2)$$

Let g = f⁻¹. Since g is continuous at y0, for every ε1 > 0 there exists δ1 such that 0 < |y − y0| < δ1 implies |g(y) − g(y0)| < ε1. Apply this continuity argument with ε1 = δ (from the preceding paragraph) and call the resulting δ1 η, i.e. 0 < |y − y0| < η implies |g(y) − g(y0)| < δ, or |g(y) − x0| < δ. Note also that g(y) ≠ x0 when y ≠ y0, because g is one-to-one and g(y0) = x0, so in fact 0 < |g(y) − x0| < δ. Then (6.3.2) implies that

$$\left| \frac{g(y) - x_0}{f(g(y)) - f(x_0)} - \frac{1}{f'(x_0)} \right| < \varepsilon$$

(where the last expression follows by replacing x in (6.3.2) by g(y)).
Note that because x0 = g(y0), f(g(y)) = y and f(x0) = y0,

$$\frac{g(y) - x_0}{f(g(y)) - f(x_0)} - \frac{1}{f'(x_0)} = \frac{g(y) - g(y_0)}{y - y_0} - \frac{1}{f'(x_0)}.$$

Therefore we have that for every ε > 0 there exists an η such that 0 < |y − y0| < η implies

$$\left| \frac{g(y) - g(y_0)}{y - y_0} - \frac{1}{f'(x_0)} \right| < \varepsilon, \quad \text{i.e.} \quad \lim_{y \to y_0} \frac{g(y) - g(y_0)}{y - y_0} = \frac{1}{f'(x_0)},$$

which is what we were to prove.
We next give a nice application of Proposition 6.3.8. In Section 5.6 we used Proposition 5.4.8 to define y^{1/n}, we used Proposition 5.4.11 to show that the function y^{1/n} is continuous on [0, ∞), and then used the composition of y^{1/n} and x^m to define x^r, r ∈ Q, and to show that x^r is continuous. We now want to extend these results to show that x^r is differentiable. To do so in the most pleasant way we return to Example 5.6.1 and consider √y. (Recall that in Example 6.2.2 we proved that

$$\frac{d}{dy} \sqrt{y} = \frac{1}{2\sqrt{y}}.$$

We did so using more elementary methods—methods that did not extend as nicely to y^{1/n} and that weren't as nice as these.)
Example 6.3.1 Consider the function f(x) = x² on D = (0, ∞). Show that f⁻¹(y) = √y is differentiable on (0, ∞) and that

$$\frac{d}{dy} \sqrt{y} = \frac{1}{2\sqrt{y}} = \frac{1}{2} y^{-1/2}.$$

Solution: We should recall that we already know that f is invertible, that f⁻¹ is continuous on (0, ∞) and that f⁻¹(y) = √y. We see that it is very easy to apply Proposition 6.3.8 to f—we know that f : I = (0, ∞) → R, I is surely an interval, and f is one-to-one and continuous on I. We let y0 be an arbitrary element of (0, ∞) and let x0 ∈ (0, ∞) be such that y0 = f(x0) = x0², i.e. x0 = √y0. Then we know from Proposition 6.3.8 that f⁻¹ is differentiable at y0 and

$$\frac{d}{dy} f^{-1}(y_0) = \frac{1}{f'(x_0)} = \frac{1}{2x_0} = \frac{1}{2\sqrt{y_0}}.$$

This is what we were to prove.
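The formula (f⁻¹)'(y0) = 1/f'(x0) is easy to sanity-check numerically. The sketch below (an illustration only) compares a central difference quotient for the derivative of √y at y0 = 4 against 1/f'(x0) = 1/(2x0) with x0 = 2.

```python
import math

f = lambda x: x * x                  # f(x) = x^2 on (0, infinity)
fprime = lambda x: 2 * x             # f'(x) = 2x
finv = math.sqrt                     # f^{-1}(y) = sqrt(y)

def dfinv_numeric(y0, h=1e-6):
    # central difference quotient approximating (f^{-1})'(y0)
    return (finv(y0 + h) - finv(y0 - h)) / (2 * h)

y0 = 4.0
x0 = finv(y0)                        # x0 = 2
print(dfinv_numeric(y0), 1 / fprime(x0))   # both about 0.25
```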
Of course we next extend the above result to the function y^{1/n}.

Example 6.3.2 Consider the function f(x) = xⁿ on D = (0, ∞). Show that f⁻¹(y) = y^{1/n} = ⁿ√y is differentiable on (0, ∞) and that

$$\frac{d}{dy} y^{1/n} = \frac{1}{n} y^{\frac{1}{n} - 1}.$$

Solution: We proceed exactly as we did in the previous example. We know from Section 5.6 that f is invertible, that f⁻¹ is continuous on (0, ∞) and f⁻¹(y) = y^{1/n}. It is also easy to see that f satisfies the hypotheses of Proposition 6.3.8. Again let x0 ∈ D = (0, ∞) and y0 ∈ f(D) = (0, ∞) be such that y0 = f(x0) = x0ⁿ, or x0 = y0^{1/n}. Then by Proposition 6.3.8 we know that f⁻¹ is differentiable for any y0 ∈ f(D) = (0, ∞) and

$$\frac{d}{dy} f^{-1}(y_0) = \frac{1}{f'(x_0)} = \frac{1}{n x_0^{n-1}} = \frac{1}{n \left(y_0^{1/n}\right)^{n-1}} = \frac{1}{n y_0^{1 - 1/n}} = \frac{1}{n} y_0^{\frac{1}{n} - 1},$$

which is what we were to prove.
Of course the next step is to apply the Chain Rule, Proposition 6.1.4, to prove that for r ∈ Q, r = m/n,

$$\frac{d}{dx} x^r = \frac{d}{dx} \left(x^{1/n}\right)^m = m \left(x^{1/n}\right)^{m-1} \cdot \frac{1}{n} x^{\frac{1}{n} - 1} = \frac{m}{n} x^{\frac{m-1}{n} + \frac{1}{n} - 1} = r x^{r-1}.$$

This is important enough that we state this result in the form of a proposition.
162 6. Differentiation

Proposition 6.3.9 Consider r = m/n ∈ Q and the function g : (0, ∞) → R defined by g(x) = x^r. The function g is continuous and differentiable on (0, ∞), and for any x0 ∈ (0, ∞) we have g'(x0) = r x0^{r−1}.
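Proposition 6.3.9 can also be checked numerically for a particular rational exponent. The sketch below (an illustration; the choice r = 2/3 and x0 = 8 is ours) compares a central difference quotient for x^{2/3} with the closed form r·x0^{r−1}.

```python
def deriv(f, x0, h=1e-6):
    # central difference approximation to f'(x0)
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

r = 2.0 / 3.0                        # r = m/n = 2/3
g = lambda x: x ** r                 # g(x) = x^r on (0, infinity)
x0 = 8.0
print(deriv(g, x0), r * x0 ** (r - 1))   # both about 1/3
```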
The inverse trig functions provide us with another nice application of Propositions 5.4.11 and 6.3.8. In our basic functions course, some time after defining the trig functions we defined the inverse trig functions—for the sine function, probably very intuitively as θ = sin⁻¹ x, the "angle whose sine is x." The interesting, and sometimes tough, part is that the sine function is not one-to-one. We can get around this problem easily. Suppose for the moment we write sin x to denote the sine function defined on R and define the restriction of sin to [−π/2, π/2] by Sin x—this is a temporary, uncommon notation used to make the point. It should be reasonably clear that though sin is not one-to-one, Sin is one-to-one—we have restricted the domain, just as you did in your basic course, so as to make the restriction one-to-one. We then have the following result.
Example 6.3.3 Consider the function f = Sin : D = [−π/2, π/2] → R where f(x) = Sin x = sin x. Show that f⁻¹ exists on f(D) = [−1, 1], f⁻¹ is continuous on [−1, 1], f⁻¹ is differentiable on (−1, 1) and

$$\left(f^{-1}\right)'(x_0) = \frac{1}{\sqrt{1 - x_0^2}} \quad \text{for } x_0 \in (-1, 1).$$

Solution: Above we stated that it was "reasonably clear" that Sin is one-to-one. We also need to know that the Sin function is monotone. Since we know that d/dx sin x = cos x and cos x > 0 on (−π/2, π/2), by Corollaries 6.3.6 and 6.3.7 we know that the Sin function is strictly increasing and one-to-one on the interval (−π/2, π/2). Since we also know that the sine function does not equal ±1 on the open interval (−π/2, π/2), we can include the end points to see that the function Sin is strictly increasing and one-to-one on [−π/2, π/2].
Since f is strictly monotone, we can apply Proposition 5.4.11 to see that f⁻¹ is continuous on f(D) = [−1, 1]. We know from Example 5.2.3-(d) that the sine function is continuous on R—hence f(x) = Sin x is continuous on D = [−π/2, π/2]. Then since we already know that f is one-to-one, by Proposition 6.3.8 f⁻¹ = sin⁻¹ is differentiable on (−1, 1), and for x ∈ (−1, 1) and θ ∈ (−π/2, π/2) such that sin θ = x, we have

$$\frac{d}{dx} \sin^{-1} x = \frac{1}{f'(\theta)} = \frac{1}{\cos\theta}.$$

We know that cos θ = ±√(1 − sin²θ). Because cos θ ≥ 0 for θ ∈ [−π/2, π/2], we have cos θ = √(1 − sin²θ). Also sin θ = x, so we have

$$\frac{d}{dx} \sin^{-1} x = \frac{1}{\sqrt{1 - x^2}},$$

the formula you learned in your basic course.
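The arcsine formula just derived lends itself to the same kind of numerical spot check as before. The sketch below (an illustration only, using Python's built-in math.asin for sin⁻¹) compares a central difference quotient with 1/√(1 − x²) at a few points of (−1, 1).

```python
import math

def deriv(f, x0, h=1e-6):
    # central difference approximation to f'(x0)
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

# d/dx sin^{-1} x should match 1/sqrt(1 - x^2) on (-1, 1)
for x0 in [-0.5, 0.0, 0.5]:
    print(deriv(math.asin, x0), 1 / math.sqrt(1 - x0 * x0))
```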
Of course we know that in this case f⁻¹ is usually written as sin⁻¹. You should be careful with this notation. In some texts—usually old ones—they will write sin⁻¹ for the inverse of sin (not a function, since sin is not one-to-one) and Arcsin for the inverse of Sin. This is nice notation because it emphasizes the fact that sin is not one-to-one, but it doesn't seem to be used much anymore. Just use the notation sin⁻¹ to denote the inverse of the Sin function and never talk about the inverse of the sin function—because that inverse isn't even a function. But be careful.
Also we should realize that we could next consider the cosine, tangent, secant, etc. functions. Just like the sine function, none of these functions is one-to-one. Thus we restrict the domain as you did in your basic class (sometimes differently from the domain used to define sin⁻¹) and proceed as we did in Example 6.3.3. We emphasize that having different domains for these functions can make things difficult when we have more than one of them interacting with each other. Also we must be careful when using a calculator.
HW 6.3.1 (True or False and why)
(a) Suppose f, g : (a, b) → R are such that f and g are differentiable on (a, b), f'(x) = g'(x) for x ∈ (a, b) and f((a + b)/2) = g((a + b)/2). Then f(x) = g(x) for all x ∈ (a, b).
(b) Suppose f : R → R is strictly increasing on R. Then f'(x) > 0 for x ∈ R.
(c) Suppose f : [−1, 1] → R has a maximum at x = 0. Then f'(0) = 0.
(d) The function f(x) = x + sin x is strictly monotone on [0, ∞).
(e) Suppose f : (−3, 3) → R is differentiable on (−3, 3) and such that |f'(x)| > 0 on (−3, 3). Then f is strictly monotone on (−3, 3).
HW 6.3.2 Consider the function f(x) = |x| for x ∈ R. Show that if a < 0 < b, then there is no ξ ∈ (a, b) such that f(b) − f(a) = f'(ξ)(b − a). Why does this not contradict the Mean Value Theorem?
HW 6.3.3 Consider the function f : R → R defined by f(x) = 2x − cos x.
(a) Prove that f is invertible and f⁻¹ is continuous on R.
(b) Prove that f⁻¹ is differentiable and find df⁻¹/dy at y = −π.
HW 6.3.4 Suppose f : D → R, D ⊂ R, is differentiable on D and for some M satisfies |f'(x)| ≤ M for all x ∈ D. Prove that f is uniformly continuous on D.
HW 6.3.5 Suppose f : [a, b] → R is continuous on [a, b] and differentiable on (a, b). Prove that if f'(x) ≠ 0 for all x ∈ (a, b), then f is one-to-one.
6.4 L'Hospital's Rule

In Proposition 4.3.1-(f) we found that if f → L1, g → L2 and L2 ≠ 0, then f/g → L1/L2. In Example 4.2.4 we found that

$$\lim_{x \to 4} \frac{x^3 - 64}{x - 4} = 48,$$

i.e. we found the limit of a quotient when the limit in the denominator is zero. We did this by dividing out the x − 4 term in the numerator and the denominator. And finally in Example 6.2.3 we proved that

$$\lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1$$

—another example where the limit exists while the limit in the denominator is zero. This time we were unable to divide out the θ term, so we had to work much harder to find the limit and prove that 1 is the correct limit.
In this section we introduce L'Hospital's Rule—a method for finding certain limits of quotients. We begin by introducing the easiest version—a version that will satisfy many of our needs.
Proposition 6.4.1 (L'Hospital's Rule) Suppose f, g : I → R where I is an interval, x0 ∈ I, f(x0) = g(x0) = 0, f and g are differentiable at x0 and g'(x0) ≠ 0. Then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{f'(x_0)}{g'(x_0)}.$$

Proof: We note that for x ∈ I, x ≠ x0, we have

$$\frac{f(x)}{g(x)} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)} = \frac{\dfrac{f(x) - f(x_0)}{x - x_0}}{\dfrac{g(x) - g(x_0)}{x - x_0}}.$$

We know that because f and g are differentiable at x0, the limits in both the numerator and denominator exist, and the limit in the denominator is nonzero. Thus

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{\lim_{x \to x_0} \dfrac{f(x) - f(x_0)}{x - x_0}}{\lim_{x \to x_0} \dfrac{g(x) - g(x_0)}{x - x_0}} = \frac{f'(x_0)}{g'(x_0)},$$

which is what we were to prove.
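Proposition 6.4.1 can be illustrated numerically on the limit from Example 4.2.4. The sketch below (ours, not the text's) computes f'(4)/g'(4) for f(x) = x³ − 64 and g(x) = x − 4 and compares it with the quotient f/g evaluated at a point near 4.

```python
# lim_{x->4} (x^3 - 64)/(x - 4): f(4) = g(4) = 0, so Proposition 6.4.1
# gives the limit as f'(4)/g'(4) = 3*4^2 / 1 = 48, matching Example 4.2.4.
f = lambda x: x**3 - 64
g = lambda x: x - 4
fp = lambda x: 3 * x**2    # f'
gp = lambda x: 1.0         # g'

lhospital = fp(4.0) / gp(4.0)              # 48.0
direct = f(4.0 + 1e-7) / g(4.0 + 1e-7)     # quotient at a point near 4
print(lhospital, direct)
```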

We note that this version of L’Hospital’s Rule is enough to find both of the
x3 − 64
singular limits that we have considered in the past: lim , Example 4.2.4,
x→4 x − 4
sin θ
and lim , Example 6.2.3-(a). Also note that if you consider a simple limit
x→0 θ
3x + 4
such as lim = −7, and blindly apply L’Hospital’s Rule, we would get
x→1 2x − 3
that the limit is 32 —the wrong answer. As is often the case you must be careful
that the functions involved satisfy the hypotheses. The functions f (x) = 3x + 4
and g(x) = 2x − 3 do not satisfy the hypotheses f (1) = 0 and g(1) = 0. And
finally, note that if I is a closed interval and x0 is an endpoint, then we have a
version of L’Hospital’s Rule for one-sided derivatives.
There are more difficult versions of L’Hospital’s Rule—and at times there is a
need for these more difficult versions. We will include several of these different
versions of L’Hospital’s Theorem. We will prove several of these results and
state several without proof. These proofs will depend strongly on the Cauchy
Mean Value Theorem—a generalization of the Mean Value Theorem, Theorem
6.3.3.
Proposition 6.4.2 (Cauchy Mean Value Theorem (CMVT)) Suppose f, g : [a, b] → R are continuous on [a, b], differentiable on (a, b) and g is such that g'(x) ≠ 0 on (a, b). Then there exists ξ ∈ (a, b) such that

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)}.$$

Proof: We prove this theorem very much as we proved the Mean Value Theorem—we use a trick and Rolle's Theorem, Theorem 6.3.2. We define a function h by h(x) = f(x) − m g(x) where

$$m = \frac{f(b) - f(a)}{g(b) - g(a)}.$$

Then it is easy to see that h(a) = h(b), h is continuous on [a, b] and h is differentiable on (a, b). Thus by Rolle's Theorem, Theorem 6.3.2, there exists ξ ∈ (a, b) such that h'(ξ) = 0. Since h'(x) = f'(x) − m g'(x) and g'(x) ≠ 0 on (a, b), we have the desired result.
We should note that if we wanted to pretend that we were discovering the proof, we could set h(x) = f(x) − m g(x) (without telling what m is) and choose m so that h(a) = h(b). It's still a trick—but a nice trick. Also we should notice that if we set g(x) = x, we get the Mean Value Theorem, Theorem 6.3.3. For this reason Proposition 6.4.2 is sometimes referred to as the Generalized Mean Value Theorem.
Before we move on to some of the versions of L'Hospital's Rule, we want to discuss a few ideas that we will use often. The first is that though we will use the CMVT in each of our results, we will always use a contorted version of the theorem. We will always work with an x and y in our domain where y < x. Then, for example, in part (a) of Proposition 6.4.3, we use the fact that

$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{\dfrac{f(x)}{g(x)} - \dfrac{f(y)}{g(x)}}{1 - \dfrac{g(y)}{g(x)}} \qquad (6.4.1)$$

and apply the CMVT to the left hand side. Equation (6.4.1) is easily seen to be true by simplifying the right hand side.
We should also note that in our applications of the CMVT it is always the case that g(x) ≠ g(y)—because we will always assume that g'(x) ≠ 0 on our interval I, the Mean Value Theorem, Theorem 6.3.3, implies that g(x) − g(y) = g'(ξ)(x − y) for some ξ ∈ (y, x). Thus g(x) − g(y) ≠ 0.
And finally, another operation we will use often is the following. Again in part (a) of Proposition 6.4.3 we will have

$$\left| \frac{\dfrac{f(x)}{g(x)} - \dfrac{f(y)}{g(x)}}{1 - \dfrac{g(y)}{g(x)}} - A \right| < \varepsilon$$

where x is fixed and f and g approach zero as y → c+. We let y → c+ and get

$$\left| \frac{\dfrac{f(x)}{g(x)} - 0}{1 - 0} - A \right| \le \varepsilon.$$

We should realize that this follows from HW 4.1.4.
Since some flavor of each of the above statements will appear in each proof, we thought that we'd belabor the idea once and in the proofs just proceed as if we know what we're doing. Thus we begin with the following result, where we consider several of the possibilities when x is approaching a real number from the right hand side.
Proposition 6.4.3 (L'Hospital's Rule) Suppose that f, g : I = (c, a) → R, c, a ∈ R, where f, g are differentiable on I and g is assumed to satisfy g(x) ≠ 0 and g'(x) ≠ 0 on I. We then have the following results.
(a) If lim_{x→c+} f(x) = lim_{x→c+} g(x) = 0 and lim_{x→c+} f'(x)/g'(x) = A ∈ R, then lim_{x→c+} f(x)/g(x) = A.
(b) If lim_{x→c+} f(x) = lim_{x→c+} g(x) = 0 and lim_{x→c+} f'(x)/g'(x) = ∞, then lim_{x→c+} f(x)/g(x) = ∞.
(c) If lim_{x→c+} g(x) = ∞ and lim_{x→c+} f'(x)/g'(x) = A ∈ R, then lim_{x→c+} f(x)/g(x) = A.
(d) If lim_{x→c+} g(x) = ∞ and lim_{x→c+} f'(x)/g'(x) = ∞, then lim_{x→c+} f(x)/g(x) = ∞.
Proof: (a) Suppose ε > 0 is given and let ε1 = ε/2. Since lim_{x→c+} f'(x)/g'(x) = A, for this ε1 we know that there exists a δ such that ξ ∈ (c, c + δ) implies that |f'(ξ)/g'(ξ) − A| < ε1. Choose x, y ∈ (c, c + δ) such that y < x. By the CMVT there exists ξy ∈ (y, x) such that

$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\xi_y)}{g'(\xi_y)}.$$

Note that ξy is also in (c, c + δ). Then

$$\left| \frac{\dfrac{f(x)}{g(x)} - \dfrac{f(y)}{g(x)}}{1 - \dfrac{g(y)}{g(x)}} - A \right| = \left| \frac{f(x) - f(y)}{g(x) - g(y)} - A \right| = \left| \frac{f'(\xi_y)}{g'(\xi_y)} - A \right| < \varepsilon_1.$$

We then let y → c+, noting that ξy ∈ (c, c + δ) for all of the y's, and get |f(x)/g(x) − A| ≤ ε1 = ε/2 < ε for any x ∈ (c, c + δ). Therefore lim_{x→c+} f(x)/g(x) = A.
You might notice that this proof isn't too different from the proof of Proposition 6.4.1, except that in this case, since we cannot evaluate f or g at x = c, we use x and y and then let y → c+. Otherwise the proofs are really very similar.
(b) Suppose K > 0 is given—this time we are proving that a limit is infinite, so we begin with K in place of the traditional ε. Let K1 = 2K. Since lim_{x→c+} f'(x)/g'(x) = ∞, there exists a δ such that ξ ∈ (c, c + δ) implies that f'(ξ)/g'(ξ) > K1. Choose x, y ∈ (c, c + δ) with y < x. By the CMVT there exists ξy ∈ (y, x) such that

$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\xi_y)}{g'(\xi_y)}.$$

Then we have

$$\frac{\dfrac{f(x)}{g(x)} - \dfrac{f(y)}{g(x)}}{1 - \dfrac{g(y)}{g(x)}} = \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\xi_y)}{g'(\xi_y)} > K_1$$

—since ξy ∈ (y, x) ⊂ (c, c + δ)—and it's true for any y and ξy as long as y < x. Let y → c+ and get f(x)/g(x) ≥ K1 = 2K > K for all x ∈ (c, c + δ). Thus lim_{x→c+} f(x)/g(x) = ∞.
(c) The proof of statement (c) is difficult. We feel that it is important for you to know that there is a rigorous proof. We also feel that it is important that you are able to read and understand such a proof—even when it is tough. We proceed.
Suppose ε > 0 is given. Let ε1 = min{1, ε/2}. Since lim_{x→c+} f'(x)/g'(x) = A, for this ε1 > 0 there exists δ1 such that ξ ∈ (c, c + δ1) implies |f'(ξ)/g'(ξ) − A| < ε1. Choose x, y ∈ I such that y < x < c + δ1. By the CMVT there exists ξy ∈ (y, x), and since ξy ∈ (y, x) ⊂ (c, c + δ1),

$$\left| \frac{\dfrac{f(y)}{g(y)} - \dfrac{f(x)}{g(y)}}{1 - \dfrac{g(x)}{g(y)}} - A \right| = \left| \frac{f(y) - f(x)}{g(y) - g(x)} - A \right| = \left| \frac{f'(\xi_y)}{g'(\xi_y)} - A \right| < \varepsilon_1. \qquad (6.4.2)$$

Since lim_{y→c+} g(y) = ∞, for a fixed x there exists δ4 such that y ∈ (c, c + δ4) implies that 1 − g(x)/g(y) > 0. Set ε2 = ε1/(2[2 + |A|]). Again using the fact that lim_{y→c+} g(y) = ∞, there exists δ5 such that y ∈ (c, c + δ5) implies that g(y) > |f(x)|/ε2, and there exists δ6 such that y ∈ (c, c + δ6) implies that g(y) > |g(x)|/ε2. Let δ = min{δ1, δ4, δ5, δ6}. Then y ∈ (c, c + δ) implies that |f(x)|/g(y) < ε2 and |g(x)|/g(y) < ε2.
Now from (6.4.2) we have

$$-\varepsilon_1 + A < \frac{\dfrac{f(y)}{g(y)} - \dfrac{f(x)}{g(y)}}{1 - \dfrac{g(x)}{g(y)}} < A + \varepsilon_1$$

or

$$(-\varepsilon_1 + A)\left(1 - \frac{g(x)}{g(y)}\right) + \frac{f(x)}{g(y)} < \frac{f(y)}{g(y)} < (\varepsilon_1 + A)\left(1 - \frac{g(x)}{g(y)}\right) + \frac{f(x)}{g(y)}. \qquad (6.4.3)$$

Note that |g(x)|/g(y) < ε2 implies that −ε2 < g(x)/g(y) < ε2. From this we see that 1 − g(x)/g(y) > 1 − ε2 and 1 − g(x)/g(y) < 1 + ε2. Also, |f(x)|/g(y) < ε2 implies that −ε2 < f(x)/g(y) < ε2. Then inequality (6.4.3) gives

$$(-\varepsilon_1 + A)(1 - \varepsilon_2) - \varepsilon_2 < \frac{f(y)}{g(y)} < (\varepsilon_1 + A)(1 + \varepsilon_2) + \varepsilon_2$$

or

$$-\varepsilon_1 + A - \varepsilon_2(1 + A - \varepsilon_1) < \frac{f(y)}{g(y)} < \varepsilon_1 + A + \varepsilon_2(1 + A + \varepsilon_1). \qquad (6.4.4)$$

Using the fact that ε2 = ε1/(2[2 + |A|]) and the fact that ε1 ≤ 1, the extra term on the right hand side of (6.4.4) becomes

$$\varepsilon_2(1 + A + \varepsilon_1) \le \varepsilon_2(2 + A) \le \varepsilon_2[2 + |A|] = \varepsilon_1/2 < \varepsilon_1.$$

The extra term on the left side of (6.4.4) (without the minus sign) becomes

$$\varepsilon_2(1 + A - \varepsilon_1) < \varepsilon_2[1 + A] \le \varepsilon_2[1 + |A|] = \frac{\varepsilon_1}{2} \cdot \frac{1 + |A|}{2 + |A|} \le \frac{\varepsilon_1}{2} < \varepsilon_1.$$

These allow us to write inequality (6.4.4) as

$$-\varepsilon + A \le -2\varepsilon_1 + A = -\varepsilon_1 + A - \varepsilon_1 < -\varepsilon_1 + A - \varepsilon_2(1 + A - \varepsilon_1) < \frac{f(y)}{g(y)}$$

and

$$\frac{f(y)}{g(y)} < \varepsilon_1 + A + \varepsilon_2(1 + A + \varepsilon_1) < 2\varepsilon_1 + A \le \varepsilon + A,$$

i.e. |f(y)/g(y) − A| < ε for all y ∈ (c, c + δ). Thus lim_{y→c+} f(y)/g(y) = A.
(d) Let K > 0 be given. Let K1 = 2K + 1. Since lim_{x→c+} f'(x)/g'(x) = ∞, there exists a δ1 such that ξ ∈ (c, c + δ1) implies that f'(ξ)/g'(ξ) > K1. Choose x, y ∈ (c, c + δ1) such that y < x. Then by the CMVT there exists ξy ∈ (y, x) such that

$$\frac{\dfrac{f(y)}{g(y)} - \dfrac{f(x)}{g(y)}}{1 - \dfrac{g(x)}{g(y)}} = \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\xi_y)}{g'(\xi_y)} > K_1. \qquad (6.4.5)$$

As in the proof of part (c), choose δ7 (replacing δ5 and δ6) so that y ∈ (c, c + δ7) implies |g(x)|/g(y) < 1/2 (which implies that 3/2 > 1 − g(x)/g(y) > 1/2) and |f(x)|/g(y) < 1/2 (which implies that −1/2 < f(x)/g(y) < 1/2). Finally, if δ = min{δ1, δ7} and y ∈ (c, c + δ), then inequality (6.4.5) gives

$$\frac{f(y)}{g(y)} > \left(1 - \frac{g(x)}{g(y)}\right) K_1 + \frac{f(x)}{g(y)} > \frac{1}{2} K_1 - \frac{1}{2} = \frac{1}{2}(2K + 1) - \frac{1}{2} = K.$$

Therefore lim_{y→c+} f(y)/g(y) = ∞.
We next state, and do not prove, a version of L'Hospital's Rule for the case when x approaches −∞. Originally we had included the proofs in the text but then decided that it did no good to include these proofs if no one reads them. They are tough. If you are interested in the proofs, see Advanced Calculus, Robert C. James.
Proposition 6.4.4 (L'Hospital's Rule) Suppose that f, g : I = (−∞, a) → R, a ∈ R, where f, g are differentiable on I and g is assumed to satisfy g(x) ≠ 0 and g'(x) ≠ 0 on I. We then have the following results.
(a) If lim_{x→−∞} f(x) = lim_{x→−∞} g(x) = 0 and lim_{x→−∞} f'(x)/g'(x) = A ∈ R, then lim_{x→−∞} f(x)/g(x) = A.
(b) If lim_{x→−∞} f(x) = lim_{x→−∞} g(x) = 0 and lim_{x→−∞} f'(x)/g'(x) = ∞, then lim_{x→−∞} f(x)/g(x) = ∞.
(c) If lim_{x→−∞} g(x) = ∞ and lim_{x→−∞} f'(x)/g'(x) = A ∈ R, then lim_{x→−∞} f(x)/g(x) = A.
(d) If lim_{x→−∞} g(x) = ∞ and lim_{x→−∞} f'(x)/g'(x) = ∞, then lim_{x→−∞} f(x)/g(x) = ∞.
Of course we also have results analogous to Propositions 6.4.3 and 6.4.4 where the limits are taken as x → c− and x → ∞, respectively.
HW 6.4.1 Compute the following limits.
(a) lim_{x→0+} x ln x
(b) lim_{x→∞} x² sin(1/x)
(c) lim_{x→∞} (ln x¹⁷)/x
(d) lim_{x→∞} x e^{−x}
Chapter 7

Integration

7.1 An Introduction to Integration: Upper and Lower Sums
The concept of the integral is very important. An integral is an abstract way
to perform a summation. We know of its application to areas, work, distance-
velocity-acceleration, and much, much more. Generally the treatment of in-
tegration given in the basic course is less than adequate—integration is more
difficult than differentiation and continuity. In this chapter we introduce the
concept of the integral and develop basic results concerning integration. Specif-
ically, in this section we will lay the ground work for the definition. When we
feel that we want motivation for the integral, we will use the fact that we want
the integral to represent the area under a given curve—hence in this section we
will give upper and lower approximations of the area in terms of sums.
Consider an interval [a, b] where a < b, and for n ∈ N consider P =
{x0 , · · · , xn } where a = x0 < x1 < · · · < xn−1 < xn = b. P is called a partition
of [a, b]. The points xi , i = 0, · · · , n, are called partition points. The intervals
[xi−1 , xi ], i = 1, · · · , n, are called partition intervals. Note that P = {a, b} is
the most trivial partition of [a, b]. When we write a partition of [a, b], we will
write the partition as P = {x0 , x1 , · · · , xn }, assuming that we all know that
x0 = a, xn = b and xi−1 < xi , for i = 1, · · · , n. We should note that we could technically allow two points xi−1 and xi such that xi−1 = xi —but the partition interval [xi−1 , xi ] would contribute nothing to the definitions given below—so why include such points? We don’t.

Definition 7.1.1 Consider the function f : [a, b] → R where f is bounded on [a, b], and P is a partition of [a, b], P = {x0 , x1 , · · · , xn }.
(a) For each i, i = 1, · · · , n, define mi = glb{f (x) : x ∈ [xi−1 , xi ]} and Mi = lub{f (x) : x ∈ [xi−1 , xi ]}. (Note that these glb’s and lub’s exist because f is bounded on [a, b].)


(b) Define L(f, P ) = Σ_{i=1}^n mi (xi − xi−1 ) and U (f, P ) = Σ_{i=1}^n Mi (xi − xi−1 ). The values L(f, P ) and U (f, P ) are called the lower and upper Darboux sums of f based on P , respectively, or just the lower and upper sums.
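As a concrete numerical illustration of Definition 7.1.1 (a sketch, not part of the text), the sums L(f, P ) and U (f, P ) can be computed directly. To keep the glb’s and lub’s exact we assume an increasing f , so that on each partition interval they are attained at the endpoints; a general bounded f would require the true infima and suprema.

```python
# Numerical sketch of Definition 7.1.1 (illustration only).
# We take f increasing, so on [x_{i-1}, x_i] the glb of f is f(x_{i-1})
# and the lub is f(x_i); a general bounded f would need true infima/suprema.

def lower_sum(f, P):
    """L(f, P): sum of m_i (x_i - x_{i-1}), with m_i = f(x_{i-1}) for increasing f."""
    return sum(f(P[i - 1]) * (P[i] - P[i - 1]) for i in range(1, len(P)))

def upper_sum(f, P):
    """U(f, P): sum of M_i (x_i - x_{i-1}), with M_i = f(x_i) for increasing f."""
    return sum(f(P[i]) * (P[i] - P[i - 1]) for i in range(1, len(P)))

f = lambda x: 2 * x + 3          # the function f3 of Example 7.1.3 below
P = [0.0, 0.25, 0.5, 0.75, 1.0]  # a partition of [0, 1]
print(lower_sum(f, P), upper_sum(f, P))  # 3.75 and 4.25: L <= area (= 4) <= U
```

The printed values bracket the area under the line y = 2x + 3 on [0, 1], exactly as Figure 7.1.1 suggests.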

We notice in Figure 7.1.1 that for the given partition P , L(f, P ) (repre-
sented by the area under the thin horizontal lines) gives a lower approximation
for the area under the curve y = f (x), a ≤ x ≤ b, and U (f, P ) (represented
by the area under the thick horizontal lines) gives an upper approximation for
the area under the curve y = f (x), a ≤ x ≤ b. Note that when we use the word “approximation”, we do not mean that it necessarily provides an accurate approximation. You should realize that if we include more points in the partition, the values L(f, P ) and U (f, P ) will provide better approximations of the area compared to the approximation given with respect to the partition pictured—add a point to the partition in the figure, draw the new version of the segments indicating the new upper and lower sums, and note that the new upper and lower sums give values closer to the area under the curve y = f (x).

Figure 7.1.1: Plot of the function y = f (x) on [a, b], a partition indicated on
[a, b] and the step function representing the upper and lower sums, U (f, P ) and
L(f, P ), respectively.

Consider the following examples.


Example 7.1.1 Let f1 denote the constant function f1 (x) = k for x ∈ [a, b]. Let P = {x0 , x1 , · · · , xn } be a partition of [a, b]. Compute L(f1 , P ) and U (f1 , P ).
Solution: Let [xi−1 , xi ] denote one of the partition intervals associated with partition P . It should be clear that mi = glb{f1 (x) : x ∈ [xi−1 , xi ]} = k and Mi = k. Then

L(f1 , P ) = Σ_{i=1}^n mi (xi − xi−1 ) = k Σ_{i=1}^n (xi − xi−1 ) = k(b − a).

Also

U (f1 , P ) = Σ_{i=1}^n Mi (xi − xi−1 ) = k Σ_{i=1}^n (xi − xi−1 ) = k(b − a).

Example 7.1.2 Consider the function f2 : [0, 1] → R defined by f2 (x) = 1 if x ∈ Q ∩ [0, 1] and f2 (x) = 0 if x ∈ I ∩ [0, 1], and let P = {x0 , · · · , xn } be a partition of [0, 1]. Compute L(f2 , P ) and U (f2 , P ).
Solution: Recall that f2 is the same function considered in Example 5.2.5. Let [xi−1 , xi ] denote a partition interval of partition P . Since by Proposition 1.5.6-(a) there exists a q ∈ Q such that q ∈ (xi−1 , xi ), we see that Mi = 1. Also, since by Proposition 1.5.6-(b) there exists p ∈ (xi−1 , xi ) such that p ∈ I, we see that mi = 0. This is true for every i, i = 1, · · · , n. Thus

L(f2 , P ) = Σ_{i=1}^n mi (xi − xi−1 ) = Σ_{i=1}^n 0 · (xi − xi−1 ) = 0

and

U (f2 , P ) = Σ_{i=1}^n Mi (xi − xi−1 ) = Σ_{i=1}^n 1 · (xi − xi−1 ) = 1.

Example 7.1.3 Consider f3 : [0, 1] → R defined by f3 (x) = 2x + 3 and the partition of [0, 1], P = {x0 , · · · , xn }. Compute L(f3 , P ) and U (f3 , P ).
Solution: Since f3 is increasing, it is easy to see that on the partition interval [xi−1 , xi ], mi = 2xi−1 + 3 and Mi = 2xi + 3, i = 1, · · · , n. Thus

L(f3 , P ) = Σ_{i=1}^n mi (xi − xi−1 ) = Σ_{i=1}^n (2xi−1 + 3)(xi − xi−1 ) = 2 Σ_{i=1}^n xi xi−1 − 2 Σ_{i=1}^n xi−1² + 3,

and

U (f3 , P ) = Σ_{i=1}^n Mi (xi − xi−1 ) = Σ_{i=1}^n (2xi + 3)(xi − xi−1 ) = 2 Σ_{i=1}^n xi² − 2 Σ_{i=1}^n xi xi−1 + 3.

This is not very nice—and there’s no way to make it nice.

If instead we choose the partition P1 = {0, 1/n, 2/n, · · · , n/n = 1}, we get

L(f3 , P1 ) = Σ_{i=1}^n mi (xi − xi−1 ) = Σ_{i=1}^n [2(i − 1)/n + 3] (i/n − (i − 1)/n) = (2/n²) Σ_{i=1}^n (i − 1) + 3 = (2/n²) · n(n − 1)/2 + 3 = n(n − 1)/n² + 3

and

U (f3 , P1 ) = Σ_{i=1}^n Mi (xi − xi−1 ) = Σ_{i=1}^n [2i/n + 3] (i/n − (i − 1)/n) = (2/n²) Σ_{i=1}^n i + 3 = (2/n²) · n(n + 1)/2 + 3 = n(n + 1)/n² + 3.

These are much nicer expressions. You’d think they’d be useful for something—but remember that this is a very nice and specific partition.
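A quick numerical check of these closed forms (an illustration, not part of the text): since f3 (x) = 2x + 3 is increasing, the glb and lub on each interval of the uniform partition sit at its endpoints, so the direct sums should reproduce n(n − 1)/n² + 3 and n(n + 1)/n² + 3.

```python
# Check the closed forms of Example 7.1.3 on the uniform partition
# P1 = {0, 1/n, ..., 1}.  f3(x) = 2x + 3 is increasing, so the glb/lub on
# each partition interval are the values at its endpoints.

def darboux_sums_f3(n):
    f = lambda x: 2 * x + 3
    xs = [i / n for i in range(n + 1)]
    L = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    U = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return L, U

for n in (4, 100):
    L, U = darboux_sums_f3(n)
    print(L, n * (n - 1) / n**2 + 3)  # direct lower sum vs closed form
    print(U, n * (n + 1) / n**2 + 3)  # direct upper sum vs closed form
```

The two columns agree (up to floating-point rounding) for every n tried.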

We now proceed to develop some results comparing and relating the upper
and lower sums. Because it is clear that mi ≤ Mi , i = 1, · · · , n, we obtain the
following result.
Proposition 7.1.2 Suppose f : [a, b] → R where f is bounded on [a, b] and P
is a partition of [a, b]. Then L(f, P ) ≤ U (f, P ).

Remember that we want ∫_a^b f to give us the area under the curve. If we look at Figure 7.1.1 again, we note that for that to be true we at least must define ∫_a^b f so that for any partition P , L(f, P ) ≤ ∫_a^b f ≤ U (f, P ), i.e. we must squeeze ∫_a^b f between the two sides of the inequality given in Proposition 7.1.2.
We next state and prove an easy but necessary lemma for our later work.
Lemma 7.1.3 Suppose that f : [a, b] → R is bounded such that m ≤ f (x) ≤ M
for all x ∈ [a, b]. Then for any partition P of [a, b], m(b − a) ≤ L(f, P ) and
U (f, P ) ≤ M (b − a).

Proof: Let the partition P be given by P = {x0 , · · · , xn }. Since m ≤ f (x) for all x ∈ [a, b], surely m ≤ mi = glb{f (x) : x ∈ [xi−1 , xi ]} for all i, i = 1, · · · , n. Then

m(b − a) = m Σ_{i=1}^n (xi − xi−1 ) = Σ_{i=1}^n m(xi − xi−1 ) ≤ Σ_{i=1}^n mi (xi − xi−1 ) = L(f, P ),

which is one of the inequalities that we were to prove. The other inequality
follows in the same manner.
One of the problems that we have is that we have defined the upper and
lower Darboux sums for a particular partition. We have already discussed the
fact that if we make the partition finer, the upper and lower sums give us better approximations of the area—the value that we want for ∫_a^b f . We must have ways to connect the sums for different partitions. The next few definitions and propositions do this job for us. We begin with the following
definition.
Definition 7.1.4 Let P and P ∗ be partitions of [a, b] given by P = {x0 , · · · , xn }
and P ∗ = {y0 , · · · , ym }. If P ⊂ P ∗ , then P ∗ is said to be a refinement of P .
Note that since the partitions P and P ∗ (and any other partitions that we may
define) are given as a set of points in [a, b], the set containment definition above
makes sense.
We should note that the easiest way to get a simple refinement of the partition P is to add one point, i.e. if P = {x0 , · · · , xn }, choose a point yI ∈ (xI−1 , xI ) and let P ∗ = {x0 , · · · , xI−1 , yI , xI , · · · , xn }. We should then realize that if P ∗
is a refinement of the partition P , it is possible to consider a series of one-point
refinements P0 , · · · , Pk such that P0 = P , Pk = P ∗ and Pj is a one-point refine-
ment of Pj−1 for j = 1, · · · , k. The construction consists of choosing P0 = P
and adding one of the points of P ∗ − P = {x : x ∈ P ∗ and x 6∈ P } at each step.
This observation makes several of the proofs given below much easier.
We next prove two lemmas, the first that relates the upper and lower sums
on a partition and refinements of that partition and the second that relates the
lower sum and upper sums with respect to different partitions.
Lemma 7.1.5 Suppose f : [a, b] → R where f is bounded on [a, b], P is a
partition of [a, b] and P ∗ is a refinement of P . Then L(f, P ) ≤ L(f, P ∗ ) and
U (f, P ∗ ) ≤ U (f, P ).

Proof: Let P # be a one-point refinement of the partition P where P = {x0 , · · · , xn } and the extra point in P # , yi , is such that xi−1 ≤ yi ≤ xi . Then
to compare L(f, P ) and L(f, P # ) we only need to compare the contributions to
both of these values from the interval [xi−1 , xi ]. The contribution of this interval
to L(f, P ) is the value mi (xi −xi−1 ) where mi = glb{f (x) : x ∈ [xi−1 , xi ]}. The
contribution of this same interval to L(f, P # ) is the value m1 (yi −xi−1 )+m2 (xi −
yi ) where m1 = glb{f (x) : x ∈ [xi−1 , yi ]} and m2 = glb{f (x) : x ∈ [yi , xi ]}. Of
course we note that xi − xi−1 = (yi − xi−1 ) + (xi − yi ).
We make the claim that m1 ≥ mi and m2 ≥ mi . To see that this is true
we note that since {f (x) : x ∈ [xi−1 , yi ]} ⊂ {f (x) : x ∈ [xi−1 , xi ]}, mi =
glb{f (x) : x ∈ [xi−1 , xi ]} will be a lower bound of {f (x) : x ∈ [xi−1 , yi ]}. Hence
mi ≤ f (x) for all x ∈ [xi−1 , yi ] and thus mi ≤ m1 = glb{f (x) : x ∈ [xi−1 , yi ]}, which also follows from HW 1.4.1-(c). The fact that m2 ≥ mi follows in the
same manner.
Thus
mi (xi − xi−1 ) = mi [(yi − xi−1 ) + (xi − yi )]
= mi (yi − xi−1 ) + mi (xi − yi ) ≤ m1 (yi − xi−1 ) + m2 (xi − yi ),
so L(f, P ) ≤ L(f, P # ).
Now if P ∗ is any refinement of the partition P , from the discussion given preceding this lemma we know that there exist one-point refinements P0 , · · · , Pk
such that P0 = P , Pk = P ∗ and Pj is a one-point refinement of Pj−1 for
j = 1, · · · , k. Thus by taking k steps involving the one-point refinement ar-
gument given above (really an easy proof by induction) we see that L(f, P ) =
L(f, P0 ) ≤ L(f, P1 ) ≤ · · · ≤ L(f, Pk ) = L(f, P ∗ ) which is what we were to
prove.
The proof that U (f, P ∗ ) ≤ U (f, P ) is very similar. Using the one-point refinement of P , P # , described above, the key step in the argument for U is that

Mi = lub{f (x) : x ∈ [xi−1 , xi ]} ≥ M1 = lub{f (x) : x ∈ [xi−1 , yi ]}

and

Mi = lub{f (x) : x ∈ [xi−1 , xi ]} ≥ M2 = lub{f (x) : x ∈ [yi , xi ]},

which follow from HW 1.4.1-(c). The desired result then follows.
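Lemma 7.1.5 can also be watched numerically (a sketch, not from the text, using an increasing f so that endpoint values give the exact glb and lub on each interval): refining a partition can only raise the lower sum and lower the upper sum.

```python
# Numerical illustration of Lemma 7.1.5: refine a partition and compare
# the sums.  f is increasing, so the glb/lub on each partition interval
# are the values of f at the endpoints of that interval.

def sums(f, P):
    L = sum(f(P[i - 1]) * (P[i] - P[i - 1]) for i in range(1, len(P)))
    U = sum(f(P[i]) * (P[i] - P[i - 1]) for i in range(1, len(P)))
    return L, U

f = lambda x: x * x                    # increasing on [0, 1]
P = [0.0, 0.5, 1.0]                    # a partition of [0, 1]
P_star = [0.0, 0.25, 0.5, 0.75, 1.0]   # a refinement: P is a subset of P_star

L, U = sums(f, P)
L_star, U_star = sums(f, P_star)
print(L <= L_star, U_star <= U)        # True True, as the lemma asserts
```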

Lemma 7.1.6 Suppose that f : [a, b] → R where f is bounded on [a, b], and
suppose that P1 and P2 are partitions of [a, b]. Then L(f, P1 ) ≤ U (f, P2 ).

Proof: Let P ∗ be any common refinement of P1 and P2 , i.e. P ∗ is a refinement


of P1 and P ∗ is a refinement of P2 . The smallest such common refinement is
found by setting P ∗ = P1 ∪ P2 . Then by Lemma 7.1.5 we have L(f, P1 ) ≤
L(f, P ∗ ) and U (f, P ∗ ) ≤ U (f, P2 ). By Proposition 7.1.2 we have L(f, P ∗ ) ≤
U (f, P ∗ ). Thus we have
L(f, P1 ) ≤ L(f, P ∗ ) ≤ U (f, P ∗ ) ≤ U (f, P2 ).
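Lemma 7.1.6 says that a lower sum can never exceed an upper sum, even when the two partitions are completely unrelated. A small sketch (again assuming an increasing f , so that endpoint values are the exact glb and lub on each interval):

```python
# Two unrelated partitions of [0, 1]; Lemma 7.1.6 asserts that every
# lower sum is <= every upper sum, whichever partitions are used.

def sums(f, P):
    L = sum(f(P[i - 1]) * (P[i] - P[i - 1]) for i in range(1, len(P)))
    U = sum(f(P[i]) * (P[i] - P[i - 1]) for i in range(1, len(P)))
    return L, U

f = lambda x: x ** 3                  # increasing, so endpoints give glb/lub
P1 = [0.0, 1 / 3, 2 / 3, 1.0]
P2 = [0.0, 0.1, 0.9, 1.0]
L1, U1 = sums(f, P1)
L2, U2 = sums(f, P2)
print(L1 <= U2 and L2 <= U1)          # True: the cross-partition comparison holds
```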

If we return to Examples 7.1.1, 7.1.2 and 7.1.3, it’s really easy to see that f1 and f2 satisfy L(fj , P ) ≤ U (fj , P ) for j = 1, 2. It’s more difficult to see that
Proposition 7.1.2 is satisfied for f3 —but it is true because xi−1 (xi − xi−1 ) ≤
xi (xi − xi−1 ) (sum both sides of the inequality and add 3). Because the upper
and lower sums for the functions f1 and f2 are so trivial, it is easy to see that
f1 and f2 will satisfy Lemma 7.1.5 for any refinement. Again because the upper
and lower sums for f3 are so difficult, it is difficult to see that f3 will satisfy
Lemma 7.1.5—except by approximately reproducing the proof of Lemma 7.1.5
with respect to the function f3 and some particular refined partition. And finally,
while it is again easy to see that functions f1 and f2 will satisfy Lemma 7.1.6
with respect to any two partitions, it is very difficult to see that the upper and
lower sums of the function f3 with respect to the partitions P and P1 will satisfy
Lemma 7.1.6. It should be clear that if we were to consider upper and lower
sums for more complex functions, it would be next to impossible to compare
these upper and lower sums—especially with respect to very complex different
partitions. It is for that reason that the above lemmas are so important.
HW 7.1.1 (True or False and why)
(a) Suppose f : [0, 1] → R and f (x) > 0 for all x ∈ [0, 1]. Let P be some
partition of [0, 1]. Then L(f, P ) > 0.
(b) Suppose f : [0, 1] → R and let P be some partition of [0, 1]. It is possible
that L(f, P ) < 0 and U (f, P ) > 0.
(c) Suppose that P and P ∗ are partitions of [0, 1]. Then P ∩ P ∗ is a partition
of [0, 1]. Also P ∩ P ∗ is a refinement of both P and P ∗ .
HW 7.1.2 Consider the function f (x) = x and the partition of [0, 1], Pn = {0, 1/n, 2/n, · · · , 1}.
(a) Compute L(f, Pn ) and U (f, Pn ).
(b) Compute U (f, Pn ) − L(f, Pn ).
(c) Show that U (f, Pn ) − L(f, Pn ) > 0.
(d) Compute lim_{n→∞} [U (f, Pn ) − L(f, Pn )].

HW 7.1.3 Consider the function f (x) = x and the partition of [0, 1], Pn = {0, 1/(2n), · · · , n/(2n), 1}.
(a) Compute L(f, Pn ) and U (f, Pn ).
(b) Compute U (f, Pn ) − L(f, Pn ).
(c) Show that U (f, Pn ) − L(f, Pn ) > 0.
(d) Compute lim_{n→∞} [U (f, Pn ) − L(f, Pn )].
HW 7.1.4 Consider the function f (x) = x if x ∈ Q ∩ [0, 1], f (x) = 0 if x ∈ I ∩ [0, 1], and the partition of [0, 1], Pn = {0, 1/n, 2/n, · · · , 1}.
(a) Compute L(f, Pn ) and U (f, Pn ).
(b) Compute U (f, Pn ) − L(f, Pn ) and lim_{n→∞} [U (f, Pn ) − L(f, Pn )].

7.2 The Darboux Integral


In the last section we computed upper and lower Darboux sums that gave us
approximations of the area under the curve from above and below, respectively.
If these sums weren’t so terrible to work with, we could live with them—that’s
approximately what they do numerically. However, the upper and lower sums are not very nice analytic tools—and we can do better—much better. In the
following definition we use the upper and lower sums to get a step nearer to the
area under the curve.
Definition 7.2.1 Consider f : [a, b] → R where f is bounded on [a, b].
(a) We define the lower integral of f on [a, b] to be

∫̲_a^b f = lub{L(f, P ) : P is a partition of [a, b]}.

(b) We define the upper integral of f on [a, b] to be

∫̄_a^b f = glb{U (f, P ) : P is a partition of [a, b]}.

We first note that if P ∗ is any fixed partition of [a, b], then by Lemma 7.1.6
L(f, P ) ≤ U (f, P ∗ ) for any partition P . Because U (f, P ∗ ) is an upper bound
for the set {L(f, P ) : P is a partition of [a, b]} we know that lub{L(f, P ) :
P is a partition of [a, b]} exists. Likewise, L(f, P ∗ ) is a lower bound of the set
{U (f, P ) : P is a partition of [a, b]} so that glb{U (f, P ) : P is a partition of [a, b]}
exists.
If we again return to Example 7.1.1, we see that L(f1 , P ) = U (f1 , P ) = k(b − a) for any partition P . Thus ∫̲_a^b f1 = ∫̄_a^b f1 = k(b − a). If we consider the function f2 introduced in Example 7.1.2, we see that since L(f2 , P ) = 0 for any partition P , ∫̲_0^1 f2 = 0, and since U (f2 , P ) = 1 for any partition P , ∫̄_0^1 f2 = 1. And finally, it should be reasonably easy to see that since L(f3 , P ) and U (f3 , P ) from Example 7.1.3 are so complex, it is too difficult to try to use these expressions to determine ∫̲_0^1 f3 or ∫̄_0^1 f3 . Even though L(f3 , P1 ) and U (f3 , P1 ) found in Example 7.1.3 are much nicer, the least upper bound and the greatest lower bound in Definition 7.2.1 above must be taken over all partitions of [0, 1], not just a few nice ones—so knowing L(f3 , P1 ) and U (f3 , P1 ) does not help us determine ∫̲_0^1 f3 or ∫̄_0^1 f3 (at least at this time).
We see by Lemma 7.1.5 that as we add points to a partition, the upper sums get smaller. We define ∫̄_a^b f to be the glb of these upper sums. Likewise we know that as we add points to a partition, the lower sums get larger. The lower integral ∫̲_a^b f is defined to be the lub of these lower sums. Hence, ∫̄_a^b f and ∫̲_a^b f squeeze in to provide better upper and lower approximations of the area under the curve y = f (x) from a to b, respectively.
We next prove a result that our intuition should tell us is obvious.
Proposition 7.2.2 Suppose that f : [a, b] → R, f is bounded on [a, b] and let P be any partition of [a, b]. Then L(f, P ) ≤ ∫̲_a^b f ≤ ∫̄_a^b f ≤ U (f, P ).

Proof: The first and last inequalities follow from the facts that the lub must be an upper bound of the set of all lower sums and the glb must be a lower bound of the set of all upper sums, respectively.
Let P1 and P2 be any two partitions of [a, b]. By Lemma 7.1.6 we know that L(f, P2 ) ≤ U (f, P1 ). Hence U (f, P1 ) is an upper bound of the set {L(f, P2 ) : P2 is a partition of [a, b]}. Therefore

∫̲_a^b f = lub{L(f, P2 ) : P2 is a partition of [a, b]} ≤ U (f, P1 ).

Then ∫̲_a^b f is a lower bound of the set {U (f, P1 ) : P1 is a partition of [a, b]}. Therefore ∫̲_a^b f ≤ glb{U (f, P1 ) : P1 is a partition of [a, b]} = ∫̄_a^b f , which is what we were to prove.
The upper sums U (f, P ) and the upper integral ∫̄_a^b f approximate the area under the curve y = f (x) from above, and the lower sums L(f, P ) and the lower integral ∫̲_a^b f approximate the area under the curve y = f (x) from below. Since we want the integral ∫_a^b f to give the area under the curve y = f (x), it is reasonably logical to make the following definition.
Definition 7.2.3 Suppose f : [a, b] → R and f is bounded on [a, b]. We say that f is integrable on [a, b] if ∫̲_a^b f = ∫̄_a^b f . If f is integrable on [a, b], we write ∫_a^b f = ∫̲_a^b f (or ∫_a^b f = ∫̄_a^b f ).

We call ∫_a^b f the Darboux integral of f from a to b. We will actually drop the “Darboux” from here on and refer to ∫_a^b f as the integral of f from a to

b. We want to make it clear that this is the same integral that you studied in your basic class. We tacked on the “Darboux” to differentiate it from the “Riemann” integral that we define in Section 7.6—at which time we immediately prove that the Riemann and the Darboux integrals are the same. We use the Darboux definition because it makes some of the proofs easier and because we feel that it is more intuitive.
Before we discuss the integral we want to emphasize that while we denote the integral of f from a to b by ∫_a^b f , the most common notation (especially in the basic course) is to denote the integral by ∫_a^b f (x) dx. There are some advantages to this latter notation. The “dx” sort of reminds us that there is an xi − xi−1 in the definition of the upper and lower sums. Also, later when we want to make a change of variables, the “dx” term is very useful for reminding us what we want to do. The notation ∫_a^b f (x) dx can also be difficult to understand when we study differentials. In our basic course we had a “dx” in the integral and a “dx” as a part of differentials, with no apparent connection. (We surely won’t have that problem since we won’t discuss differentials.) In any case we will generally use the notation ∫_a^b f to denote the integral of f from a to b—though whenever it seems convenient or more clear to use the ∫_a^b f (x) dx notation, we will use it.

We return to the statement made just before we gave the definition of the integral, “it is reasonably logical to make the following definition.” It wouldn’t be a good definition if it were always the case that ∫̲_a^b f < ∫̄_a^b f —then no function would be integrable. It is only a good definition—and then our logic is affirmed—if for a large set of nice functions we can in fact get equality—and for some functions we do not get equality. For a first glimpse of what we have, we return to our Examples 7.1.1, 7.1.2 and 7.1.3 from Section 7.1. For the function f1 (x) = k defined on the interval [a, b] introduced in Example 7.1.1, we see that ∫_a^b f1 = k(b − a). If we consider the function f2 introduced in Example 7.1.2—and the subsequent work on f2 —we see that since ∫̲_0^1 f2 = 0 < 1 = ∫̄_0^1 f2 , the function f2 is not integrable on [0, 1]. And of course, we can’t say much about the function f3 .
We want to emphasize that the integral of f on the interval [a, b] is defined only for functions that are bounded on [a, b]. We saw above that in the case of the function f2 , a function can be bounded and still not be integrable. But we should also realize that f2 is not a nice function, i.e. even a bounded function, if it is nasty enough, may fail to be integrable.
To be able to show that more functions are integrable (in practice, other
than for theoretical purposes, we don’t care too much about the function f2 ),
we need some methods and results. We begin with a very powerful and useful
theorem, the Archimedes-Riemann Theorem.

Theorem 7.2.4 (The Archimedes-Riemann Theorem (A-R Theorem)) Consider f : [a, b] → R where f is bounded on [a, b]. The function f is integrable on [a, b] if and only if there exists a sequence of partitions of [a, b], {Pn }, n = 1, · · · , such that

lim_{n→∞} [U (f, Pn ) − L(f, Pn )] = 0. (7.2.1)

If there exists such a sequence of partitions, then

lim_{n→∞} L(f, Pn ) = lim_{n→∞} U (f, Pn ) = ∫_a^b f. (7.2.2)

Proof: (⇐) Let {Pn }, n = 1, · · · , be a sequence of partitions of [a, b] such that lim_{n→∞} [U (f, Pn ) − L(f, Pn )] = 0. By Proposition 7.2.2 we know that for all n

0 ≤ ∫̄_a^b f − ∫̲_a^b f ≤ U (f, Pn ) − L(f, Pn ).

By the Sandwich Theorem, Proposition 3.4.2, we see that

0 ≤ ∫̄_a^b f − ∫̲_a^b f ≤ lim_{n→∞} [U (f, Pn ) − L(f, Pn )] = 0

(notice that the two sequences on the left are constant sequences), so ∫̲_a^b f = ∫̄_a^b f and f is integrable on [a, b].
(⇒) We now assume that f is integrable on [a, b]. Then ∫̲_a^b f = ∫̄_a^b f = ∫_a^b f . Since ∫̲_a^b f = lub{L(f, P ) : P is a partition of [a, b]}, for every n ∈ N there exists a partition of [a, b], Pn∗ , such that ∫̲_a^b f − L(f, Pn∗ ) < 1/n. (Recall that by Proposition 1.5.3-(a) for any ε > 0 there exists a partition of [a, b], P , such that ∫̲_a^b f − L(f, P ) < ε.) Likewise (using Proposition 1.5.3-(b)) there exists a partition of [a, b], Pn# , such that U (f, Pn# ) − ∫̄_a^b f < 1/n. Let Pn be the common refinement of Pn∗ and Pn# , Pn = Pn∗ ∪ Pn# . Doing this construction for each n ∈ N gives us a sequence of partitions of [a, b], {Pn }.
We note that L(f, Pn∗ ) ≤ L(f, Pn ) and U (f, Pn ) ≤ U (f, Pn# ). Thus we have

L(f, Pn ) ≥ L(f, Pn∗ ) > ∫̲_a^b f − 1/n (7.2.3)

and

U (f, Pn ) ≤ U (f, Pn# ) < ∫̄_a^b f + 1/n. (7.2.4)

Subtracting (7.2.3) from (7.2.4) gives

0 ≤ U (f, Pn ) − L(f, Pn ) < [∫̄_a^b f + 1/n] − [∫̲_a^b f − 1/n] =∗ 2/n

where “=∗ ” is true because of our hypothesis that f is integrable on [a, b] (in which case ∫̲_a^b f = ∫̄_a^b f ). Therefore by Proposition 3.4.2 we have

0 ≤ lim_{n→∞} [U (f, Pn ) − L(f, Pn )] ≤ lim_{n→∞} 2/n = 0,

or lim_{n→∞} [U (f, Pn ) − L(f, Pn )] = 0, which is what we wanted to prove.
Now let {Pn } be such a sequence of partitions that satisfies the above “if and only if” statement. If we use (7.2.3), the fact that ∫̲_a^b f = ∫̄_a^b f (if there is such a sequence of partitions, then f is integrable), and the fact that ∫_a^b f ≥ L(f, Pn ) for all n, we get 0 ≤ ∫_a^b f − L(f, Pn ) < 1/n for all n—which implies that lim_{n→∞} L(f, Pn ) = ∫_a^b f . A similar argument using (7.2.4) can be used to prove that lim_{n→∞} U (f, Pn ) = ∫_a^b f .
We claimed before we stated this theorem that it was “powerful and useful.” If we consider an integral using Definitions 7.2.1 and 7.2.3, we must consider all partitions of [a, b]—this is difficult because there are a lot of partitions. To consider a particular integral using Theorem 7.2.4, we need only use a sequence of partitions—and we can choose a very nice sequence of partitions. For example, when we considered the function f3 defined by f3 (x) = 2x + 3 in Example 7.1.3, we found that for a general partition the upper and lower sums are not very nice. However, when we considered the partition P1 = {0, 1/n, 2/n, · · · , n/n = 1}, we found that L(f3 , P1 ) = n(n − 1)/n² + 3 and U (f3 , P1 ) = n(n + 1)/n² + 3. Thus if we define the sequence of partitions {Pn } by Pn = {0, 1/n, 2/n, · · · , n/n = 1} (the same as P1 but for all n), we see that U (f3 , Pn ) − L(f3 , Pn ) = 2/n. Thus it is clear that lim_{n→∞} [U (f3 , Pn ) − L(f3 , Pn )] = 0, and by Theorem 7.2.4 we see that

∫_0^1 f3 = lim_{n→∞} L(f3 , Pn ) = lim_{n→∞} [n(n − 1)/n² + 3] = 4.
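The same recipe works for other functions. As an illustration not worked in the text, take g(x) = x² on [0, 1] with the uniform partitions: since g is increasing, U (g, Pn ) − L(g, Pn ) = (g(1) − g(0))/n = 1/n → 0, so the sequence satisfies the hypotheses of Theorem 7.2.4 and the lower sums converge to ∫_0^1 g = 1/3. A numerical sketch:

```python
# Apply the A-R Theorem numerically to g(x) = x^2 on [0, 1] with uniform
# partitions (illustration only).  g is increasing, so endpoint values give
# the exact glb/lub on each partition interval.

def sums(g, n):
    xs = [i / n for i in range(n + 1)]
    L = sum(g(xs[i - 1]) / n for i in range(1, n + 1))
    U = sum(g(xs[i]) / n for i in range(1, n + 1))
    return L, U

g = lambda x: x * x
for n in (10, 100, 1000):
    L, U = sums(g, n)
    print(n, U - L, L)  # U - L = 1/n -> 0, and L -> 1/3
```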
We should note that since Theorem 7.2.4 is an “if and only if” result with “f is integrable” on one side, the theorem gives us a result that is equivalent to the definition. In this case this is very important because the result given by Theorem 7.2.4 is easier to use than the definition. To make this result a bit easier to discuss we make the following definition.

Definition 7.2.5 Suppose that f : [a, b] → R is such that f is bounded on [a, b].
Let {Pn } be a sequence of partitions of [a, b]. The sequence {Pn } is said to be
an Archimedian sequence of partitions for f on [a, b] if U (f, Pn ) − L(f, Pn ) → 0
as n → ∞.
Of course we can then reword Theorem 7.2.4 as follows: Consider f : [a, b] → R such that f is bounded on [a, b]. The function f is integrable on [a, b] if and only if there exists an Archimedian sequence of partitions for f on [a, b]. Also, if {Pn } is an Archimedian sequence of partitions for f on [a, b], then ∫_a^b f = lim_{n→∞} U (f, Pn ) = lim_{n→∞} L(f, Pn ).
Before we leave this section we include another theorem that is only a slight
variation of Theorem 7.2.4 but is sometimes useful—and gives us another char-
acterization of integrability.
Theorem 7.2.6 (Riemann Theorem) Suppose f : [a, b] → R is bounded on [a, b]. Then f is integrable on [a, b] if and only if for every ε > 0 there exists a partition of [a, b], P , such that 0 ≤ U (f, P ) − L(f, P ) < ε.

Proof: (⇒) If f is integrable on [a, b], we know from the A-R Theorem, Theorem 7.2.4, that there exists an Archimedian sequence of partitions of [a, b], {Pn }, so that U (f, Pn ) − L(f, Pn ) → 0 as n → ∞, i.e. for every ε > 0 there exists N ∈ R such that n ≥ N implies that |U (f, Pn ) − L(f, Pn )| < ε. Let n0 ∈ N be such that n0 > N . Then the partition Pn0 is such that 0 ≤ U (f, Pn0 ) − L(f, Pn0 ) = |U (f, Pn0 ) − L(f, Pn0 )| < ε, which is what we were to prove.
(⇐) We are given that for any ε > 0 there exists a partition of [a, b], P , such that 0 ≤ U (f, P ) − L(f, P ) < ε. Then for each n ∈ N there exists a partition of [a, b], Pn , such that U (f, Pn ) − L(f, Pn ) < 1/n (letting ε = 1/n). Then

0 ≤ U (f, Pn ) − L(f, Pn ) < 1/n → 0 as n → ∞.

Therefore by the A-R Theorem, Theorem 7.2.4, f is integrable on [a, b].
It’s obvious from the proof that the Riemann Theorem is only a slight vari-
ation of the A-R Theorem. There are times when it is more convenient to
only have to produce one partition to prove integrability. For those times the
Riemann Theorem is convenient.
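For a concrete instance (a sketch, not from the text): for the increasing function f (x) = x on [0, 1], a uniform partition with n pieces has U (f, P ) − L(f, P ) = (f (1) − f (0))/n = 1/n, so given ε > 0 it suffices to take n > 1/ε to produce the single partition the Riemann Theorem asks for.

```python
import math

# Given eps > 0, produce one partition of [0, 1] witnessing Riemann's
# Theorem (Theorem 7.2.6) for the increasing function f(x) = x:
# on a uniform partition with n pieces, U - L = 1/n, so take n > 1/eps.

def riemann_witness(eps):
    n = math.ceil(1 / eps) + 1             # guarantees n > 1/eps
    P = [i / n for i in range(n + 1)]
    f = lambda x: x
    L = sum(f(P[i - 1]) / n for i in range(1, n + 1))
    U = sum(f(P[i]) / n for i in range(1, n + 1))
    return P, U - L

P, gap = riemann_witness(0.01)
print(len(P) - 1, gap)  # 101 subintervals; gap = 1/101 < 0.01
```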
HW 7.2.1 (True or False and why)
(a) Suppose f : [0, 1] → R is integrable on [0, 1] and ∫_0^1 f = 0. Then for any partition of [0, 1], P , U (f, P ) = L(f, P ) = 0.
(b) Suppose f : [0, 1] → R is integrable on [0, 1] and ∫_0^1 f = 0. Then f (x) = 0 for all x ∈ [0, 1].
(c) Suppose f : [0, 1] → R is integrable on [0, 1]. Then f is continuous on [0, 1].
(d) Suppose f : [0, 1] → R is integrable on [0, 1] and f (x) > 0 for all x ∈ [0, 1]. It is possible that ∫_0^1 f < 0.
(e) Consider f : [0, 1] → R defined by f (x) = 1/(x − 1/2)², x ≠ 1/2, and f (1/2) = 0. The function f is integrable on [0, 1].
HW 7.2.2 Consider the function f (x) = x on [0, 1] (and recall HW 7.1.2).
(a) Show that ∫_0^1 f exists and equals 1/2.
(b) Find a partition of [0, 1] that satisfies Riemann’s Theorem, Theorem 7.2.6, i.e. for any ε > 0 there exists a partition P such that U (f, P ) − L(f, P ) < ε.
(c) Do the results of HW 7.1.3 give any information as to whether or not f is integrable on [0, 1]? Why?
HW 7.2.3 Consider the function f (x) = x if x ∈ Q ∩ [0, 1], f (x) = 0 if x ∈ I ∩ [0, 1] (and recall HW 7.1.4).
(a) Explain why the results of HW 7.1.4 do not directly show that ∫_0^1 f does not exist.
(b) Show that for any partition of [0, 1], P , U (f, P ) > 1/2.
(c) Prove that ∫_0^1 f does not exist.
HW 7.2.4 Prove that f (x) = |x| is integrable on [−1, 1]. Find ∫_{−1}^1 f .

7.3 Some Topics in Integration


Now that we have a definition of the integral and we have Theorem 7.2.4 to help us, we could show that integrals exist by applying the A-R Theorem to particular functions, as we did with the function f3 in Section 7.2. Instead we will use Theorems 7.2.4 and 7.2.6 to find some large classes of integrable functions. We begin by showing that monotonic functions are integrable.
Proposition 7.3.1 Suppose that f : [a, b] → R is a bounded, monotonic func-
tion. Then f is integrable on [a, b].

Proof: Before we start, let us emphasize the approach that we shall use. By
Theorem 7.2.4 if we can find an Archimedian sequence of partitions for f on
[a, b], then we know that the function f is integrable on [a, b]. Thus we work to
find the appropriate sequence of partitions.
Consider the sequence of partitions {Pn } defined for each n ∈ N by

Pn = {x0 = a, x1 = a + (1/n)(b − a), · · · , xn−1 = a + ((n − 1)/n)(b − a), xn = b}. (7.3.1)

We will use this sequence of partitions often. Notice that it is very regular in that the partition points are equally spaced throughout the interval.

Let us assume that f is monotonically increasing—we could prove the case for f monotonically decreasing in a similar way, or consider the negative of the function and apply this result. Note that on any partition interval [xi−1 , xi ], the fact that f is monotonically increasing implies that mi = f (xi−1 ) and Mi = f (xi ), i.e. xi−1 ≤ x ≤ xi implies that f (xi−1 ) ≤ f (x) ≤ f (xi ). Therefore

U (f, Pn ) − L(f, Pn ) = Σ_{i=1}^n Mi (xi − xi−1 ) − Σ_{i=1}^n mi (xi − xi−1 )
= (1/n)(b − a) Σ_{i=1}^n (Mi − mi )   (remember that xi − xi−1 = (b − a)/n)
= (1/n)(b − a) Σ_{i=1}^n [f (xi ) − f (xi−1 )] = (1/n)(b − a)[f (b) − f (a)] → 0

as n → ∞, since the last sum telescopes. Thus {Pn } is an Archimedian sequence of partitions for f on [a, b], so we know that f is integrable on [a, b].
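The telescoping identity at the heart of this proof can be seen numerically (a sketch, using an example monotone function not discussed in the text): for increasing f on the uniform partition, U (f, Pn ) − L(f, Pn ) collapses to (1/n)(b − a)[f (b) − f (a)].

```python
import math

# Illustrate the proof of Proposition 7.3.1: for increasing f on the uniform
# partition of [a, b] into n pieces, the sum of (M_i - m_i) telescopes, so
# U(f, Pn) - L(f, Pn) = (1/n)(b - a)(f(b) - f(a)).

def gap(f, a, b, n):
    xs = [a + i * (b - a) / n for i in range(n + 1)]
    h = (b - a) / n
    L = sum(f(xs[i - 1]) * h for i in range(1, n + 1))
    U = sum(f(xs[i]) * h for i in range(1, n + 1))
    return U - L

f = math.exp                       # monotonically increasing on [0, 1]
for n in (10, 100, 1000):
    print(gap(f, 0.0, 1.0, n), (math.e - 1) / n)  # the two columns agree
```

The printed columns match (up to floating-point rounding), confirming the telescoping step.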
As you will see, often the difficulty in the proof is to define the correct sequence of partitions. Since the expression “Archimedian sequence of partitions for f on [a, b]” is a tedious statement, from this time on we will just state that the sequence of partitions is “Archimedian” and assume that you know that it is for some f (the right f ) and on some interval (the right interval).
We next prove the integrability of a very large and important class of functions, the continuous functions.
Proposition 7.3.2 If f : [a, b] → R is continuous on [a, b], then f is integrable
on [a, b].

Proof: Let ε > 0 be given. Clearly since f is continuous on [a, b], by Lemma 5.3.7 we know that f is bounded on [a, b]—thus it makes sense to consider whether f is integrable on [a, b]. Consider the partition Pn = {x0 , x1 , · · · , xn } where xi = a + i(b − a)/n, i = 0, · · · , n. Recall that by Proposition 5.5.6, f continuous on [a, b] implies that f is uniformly continuous on [a, b]. Thus for ε/(b − a) > 0 there exists a δ such that |x − y| < δ implies that |f (x) − f (y)| < ε/(b − a). Choose n so that (b − a)/n < δ. (Recall that we know we can find such an n by Corollary 1.5.5-(b).)
Consider the partition interval [xi−1 , xi ]. By Theorem 5.3.8 we know that f assumes its absolute minimum and absolute maximum on [xi−1 , xi ], say at the points x̲i and x̄i , respectively, so that mi = f (x̲i ) and Mi = f (x̄i ). Because n was chosen so that xi − xi−1 = (b − a)/n < δ and x̲i , x̄i ∈ [xi−1 , xi ], we have |x̄i − x̲i | < δ, and by the uniform continuity of f , Mi − mi = |f (x̄i ) − f (x̲i )| < ε/(b − a). Then

U (f, Pn ) − L(f, Pn ) = Σ_{i=1}^n (Mi − mi )(xi − xi−1 ) < [ε/(b − a)] Σ_{i=1}^n (xi − xi−1 ) = ε.

Therefore by Theorem 7.2.6, f is integrable on [a, b].


We have several large classes of integrable functions. We next provide results
that let us expand our cache of integrable functions, allow us to manipulate the
integrals and compute some of the integrals. There are many integration results
that we could include. We will try to include the theorems that you have
probably already seen and used in your basic course—this usually implies that
they are important theorems—and some theorems that are useful for further
analysis results. We will surely miss some nice results but we will assume that
you will now be able to read and or develop proofs for these results that we do
not include. We begin with the following proposition.
Proposition 7.3.3 Suppose f : [a, b] → R is integrable on [a, b]. Let c ∈ (a, b). Then f is integrable on [a, c] and [c, b], and ∫_a^b f = ∫_a^c f + ∫_c^b f .

Proof: Since f is integrable on [a, b], we know from Theorem 7.2.4 that there exists an Archimedian sequence of partitions of [a, b], {Pn}, so that U(f, Pn) − L(f, Pn) → 0. Suppose that Pn = {x0, x1, · · · , xn}. Then for each n there will be some k so that xk ≤ c < xk+1. Define the three partitions Pn′ = Pn ∪ {c}, Pn^[a,c] = {x0, · · · , xk, c} and Pn^[c,b] = {c, xk+1, · · · , xn}—where of course, if xk = c, no new point is really added, and if xk < c, the one new point c is added. Of course these constructions are valid for each n.
We note that since Pn′ is a refinement of the partition Pn, by Proposition 7.1.2 and Lemma 7.1.5, 0 ≤ U(f, Pn′) − L(f, Pn′) ≤ U(f, Pn) − L(f, Pn). Then by the Sandwich Theorem, Proposition 3.4.2, U(f, Pn′) − L(f, Pn′) → 0 as n → ∞ and {Pn′} is an Archimedian sequence of partitions for f on [a, b].
By the definition of Pn′, Pn^[a,c] and Pn^[c,b], we see that U(f, Pn′) = U(f, Pn^[a,c]) + U(f, Pn^[c,b]) and L(f, Pn′) = L(f, Pn^[a,c]) + L(f, Pn^[c,b]). Then because

0 ≤ U(f, Pn^[a,c]) − L(f, Pn^[a,c]) ≤ [U(f, Pn^[a,c]) − L(f, Pn^[a,c])] + [U(f, Pn^[c,b]) − L(f, Pn^[c,b])] = U(f, Pn′) − L(f, Pn′)

and

0 ≤ U(f, Pn^[c,b]) − L(f, Pn^[c,b]) ≤ [U(f, Pn^[c,b]) − L(f, Pn^[c,b])] + [U(f, Pn^[a,c]) − L(f, Pn^[a,c])] = U(f, Pn′) − L(f, Pn′),

by two applications of the Sandwich Theorem, Proposition 3.4.2, we get

U(f, Pn^[a,c]) − L(f, Pn^[a,c]) → 0 and U(f, Pn^[c,b]) − L(f, Pn^[c,b]) → 0 as n → ∞.

Therefore {Pn^[a,c]} and {Pn^[c,b]} are Archimedian sequences of partitions for f on [a, c] and [c, b], respectively, and f is integrable on [a, c] and on [c, b].
And finally, since f is integrable on [a, b], [a, c] and [c, b]; L(f, Pn′) = L(f, Pn^[a,c]) + L(f, Pn^[c,b]); L(f, Pn′) → ∫_a^b f; L(f, Pn^[a,c]) → ∫_a^c f; and L(f, Pn^[c,b]) → ∫_c^b f, we see that

∫_a^b f = lim_{n→∞} L(f, Pn′) = lim_{n→∞} [L(f, Pn^[a,c]) + L(f, Pn^[c,b])] = ∫_a^c f + ∫_c^b f.
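Proposition 7.3.3 can also be checked numerically. In this Python sketch (an illustration only, not part of the text) we use the increasing function f(x) = x³ on [0, 2] with c = 1; for an increasing function the glb on each cell is attained at the left endpoint, so lower sums are easy to write down, and the lower sum over a partition containing c splits exactly into the two pieces, just as in the proof.

```python
# A numerical check (illustration only) of Proposition 7.3.3 for the increasing
# function f(x) = x**3 on [0, 2] with c = 1: the lower sum over a partition of
# [a, b] that contains c splits exactly into the lower sums over [a, c] and
# [c, b], and the combined sums converge to the integral over [0, 2].
def lower_sum(f, a, b, n):
    """Lower Darboux sum of an increasing f over the uniform n-partition of [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i - 1) * h) * h for i in range(1, n + 1))

f = lambda x: x ** 3
n = 2000
# L(f, Pn') = L(f, Pn^[a,c]) + L(f, Pn^[c,b]) by construction:
whole = lower_sum(f, 0.0, 1.0, n) + lower_sum(f, 1.0, 2.0, n)
print(whole)                       # close to 4, the integral of x^3 over [0, 2]
print(lower_sum(f, 0.0, 1.0, n))   # close to 1/4, the integral over [0, 1]
```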
We next include several very basic, important results for computing integrals.

Proposition 7.3.4 Suppose that f, g : [a, b] → R are integrable on [a, b] and suppose that c, c1, c2 ∈ R. Then we have the following results.
(a) cf is integrable on [a, b] and ∫_a^b cf = c ∫_a^b f.
(b) f + g is integrable on [a, b] and ∫_a^b (f + g) = ∫_a^b f + ∫_a^b g.
(c) c1 f + c2 g is integrable on [a, b] and ∫_a^b (c1 f + c2 g) = c1 ∫_a^b f + c2 ∫_a^b g.
Proof: Since f and g are both integrable, there exists an Archimedian sequence
for each of f and g. Let {Pn } be the common refinement of these two Archime-
dian sequences, i.e. then U (f, Pn ) − L(f, Pn ) → 0 and U (g, Pn ) − L(g, Pn ) → 0
as n → ∞. Suppose Pn = {x0 , · · · , xn }.
(a) This is a very simple property. As you will see, the proof is tough; hold on and read carefully. Let us begin by defining the following useful notation: Mi^f = lub{f(x) : x ∈ [xi−1, xi]}, mi^f = glb{f(x) : x ∈ [xi−1, xi]}, Mi^cf = lub{cf(x) : x ∈ [xi−1, xi]} and mi^cf = glb{cf(x) : x ∈ [xi−1, xi]}. To prove part (a) we need a relationship between U(f, Pn) and U(cf, Pn), and between L(f, Pn) and L(cf, Pn).
Case 1: c ≥ 0: We begin by showing that Mi^cf = cMi^f for any i. If c = 0, this is very easy since Mi^cf = mi^cf = cMi^f = cmi^f = 0. So we consider c > 0. Since Mi^f is an upper bound of {f(x) : x ∈ [xi−1, xi]}, surely cMi^f is an upper bound of {cf(x) : x ∈ [xi−1, xi]} and Mi^cf ≤ cMi^f (because Mi^cf is the least upper bound of the set). Likewise, since Mi^cf is an upper bound of {cf(x) : x ∈ [xi−1, xi]}, it is clear that Mi^cf/c is an upper bound of the set {f(x) : x ∈ [xi−1, xi]}. Thus Mi^f ≤ Mi^cf/c (since Mi^f is the least upper bound of the set), or cMi^f ≤ Mi^cf. Therefore cMi^f = Mi^cf. The proof that cmi^f = mi^cf is very similar—with glb's replacing lub's.
We then have U(cf, Pn) = cU(f, Pn), L(cf, Pn) = cL(f, Pn) and

U(cf, Pn) − L(cf, Pn) = c[U(f, Pn) − L(f, Pn)] → 0 as n → ∞.

Thus {Pn} is an Archimedian sequence for cf, cf is integrable and

∫_a^b cf = lim_{n→∞} L(cf, Pn) = c lim_{n→∞} L(f, Pn) = c ∫_a^b f.
Case 2: c < 0: When c < 0, the proofs that Mi^cf = cmi^f and mi^cf = cMi^f are very similar to the proofs given in Case 1, except that c < 0 reverses the inequalities and interchanges the roles of the Mi's and the mi's. For example, since Mi^f = lub{f(x) : x ∈ [xi−1, xi]}, Mi^f ≥ f(x) for all x ∈ [xi−1, xi]. Thus cMi^f ≤ cf(x) (remember c < 0) for all x ∈ [xi−1, xi] and cMi^f ≤ mi^cf because mi^cf is the glb of the set {cf(x) : x ∈ [xi−1, xi]}. Also, since mi^cf ≤ cf(x) for all x ∈ [xi−1, xi], mi^cf/c ≥ f(x) for all x ∈ [xi−1, xi], i.e. mi^cf/c ≥ Mi^f (because Mi^f is the least upper bound of the set {f(x) : x ∈ [xi−1, xi]}), or mi^cf ≤ cMi^f. Thus mi^cf = cMi^f.
We then have

U(cf, Pn) = Σᵢ₌₁ⁿ Mi^cf (xi − xi−1) = Σᵢ₌₁ⁿ cmi^f (xi − xi−1) = cL(f, Pn),

L(cf, Pn) = Σᵢ₌₁ⁿ mi^cf (xi − xi−1) = Σᵢ₌₁ⁿ cMi^f (xi − xi−1) = cU(f, Pn)

and

U(cf, Pn) − L(cf, Pn) = cL(f, Pn) − cU(f, Pn) = −c[U(f, Pn) − L(f, Pn)] → 0.

(Note that it is easy to prove that an → 0 implies that −an → 0 also.) Thus {Pn} is an Archimedian sequence for cf, cf is integrable and

∫_a^b cf = lim_{n→∞} L(cf, Pn) = c lim_{n→∞} U(f, Pn) = c ∫_a^b f.
(b) Note that for this proof it is important that we took {Pn} to be the common refinement of the Archimedian sequences for both f and g. We didn't need it for part (a), but we need it here. For this proof we define Mi^f, mi^f as we did in part (a), and Mi^g, mi^g, Mi^(f+g) and mi^(f+g) analogously. The technical inequalities that we need, L(f, Pn) + L(g, Pn) ≤ L(f + g, Pn) and U(f + g, Pn) ≤ U(f, Pn) + U(g, Pn), follow easily from the inequalities mi^f + mi^g ≤ mi^(f+g) and Mi^(f+g) ≤ Mi^f + Mi^g. For example, for x ∈ [xi−1, xi] we see that mi^f + mi^g ≤ f(x) + g(x). Thus mi^f + mi^g is a lower bound of the set {f(x) + g(x) : x ∈ [xi−1, xi]} and mi^f + mi^g ≤ mi^(f+g) (because mi^(f+g) is the greatest lower bound of the set), which is one of the inequalities that we wanted to prove. The other inequality follows in the same manner.
Thus since both U(f, Pn) − L(f, Pn) → 0 and U(g, Pn) − L(g, Pn) → 0, we have

U(f + g, Pn) − L(f + g, Pn) ≤ U(f, Pn) + U(g, Pn) − [L(f, Pn) + L(g, Pn)]
= [U(f, Pn) − L(f, Pn)] + [U(g, Pn) − L(g, Pn)] → 0 as n → ∞.

Therefore {Pn} is an Archimedian sequence for f + g on [a, b], f + g is integrable on [a, b], and lim_{n→∞} L(f + g, Pn) = lim_{n→∞} U(f + g, Pn) = ∫_a^b (f + g).
Since we have L(f, Pn) + L(g, Pn) ≤ L(f + g, Pn) ≤ U(f + g, Pn) ≤ U(f, Pn) + U(g, Pn), taking the limit of all parts of the inequality as n → ∞ gives

∫_a^b f + ∫_a^b g ≤ ∫_a^b (f + g) ≤ ∫_a^b (f + g) ≤ ∫_a^b f + ∫_a^b g

(we don't really need the extra ∫_a^b (f + g) in the inequality). Since the first term and the last term in the inequality are equal, they all must be equal and ∫_a^b (f + g) = ∫_a^b f + ∫_a^b g (or you could apply the Sandwich Theorem).
(c) Part (c) follows from parts (a) and (b).
The above proof was not necessarily difficult but very technical. However,
we must realize that it is the proof of an important theorem.
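The bookkeeping in Case 2 of part (a), where multiplication by a negative constant swaps upper and lower sums, can be seen directly. The following Python sketch (an illustration only, not part of the text) approximates the lub/glb on each cell by dense sampling, which is adequate for a smooth function, and verifies U(cf, Pn) = cL(f, Pn) and L(cf, Pn) = cU(f, Pn) for c < 0.

```python
# Illustration (not in the text) of the bookkeeping in Proposition 7.3.4(a):
# for c < 0 the upper sum of cf is c times the *lower* sum of f, and vice versa.
import math

def sums(f, a, b, n):
    """Upper and lower Darboux sums, approximating lub/glb on each cell by
    sampling (adequate for this smooth example)."""
    h = (b - a) / n
    upper = lower = 0.0
    for i in range(n):
        samples = [f(a + i * h + j * h / 50) for j in range(51)]
        upper += max(samples) * h
        lower += min(samples) * h
    return upper, lower

f = lambda x: math.sin(x)
c = -3.0
Uf, Lf = sums(f, 0.0, 2.0, 400)
Ucf, Lcf = sums(lambda x: c * f(x), 0.0, 2.0, 400)
print(abs(Ucf - c * Lf))   # tiny: U(cf, P) = c * L(f, P)
print(abs(Lcf - c * Uf))   # tiny: L(cf, P) = c * U(f, P)
```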
HW 7.3.1 (True or False and why)
(a) If f : [a, b] → R is integrable on [a, b], then |f| is integrable on [a, b].
(b) Suppose f : [a, b] → R. If |f| is integrable on [a, b], then f is integrable on [a, b].
(c) If f, g : [0, 1] → R are not integrable on [0, 1], then f + g is not integrable on [0, 1].
(d) If f : [0, 1] → R is not integrable on [0, 1], then cf is not integrable on [0, 1] for c ∈ R.
(e) If f, g : [0, 1] → R are such that f is continuous on [0, 1] and g is strictly increasing on [0, 1], then f + 2g is integrable on [0, 1].
HW 7.3.2 Define f : [0, 2] → R by f(x) = x if x ∈ [0, 1] and f(x) = 3 if x ∈ (1, 2].
(a) Find an Archimedian sequence of partitions that shows that f is integrable on [0, 2]. Find ∫_0^2 f.
(b) Use the theorems of this section to prove that f is integrable on [0, 2].
(c) Prove that f is integrable on [1, 2].
HW 7.3.3 Suppose f, g : [a, b] → R are such that f is continuous on [a, b] and g(x) = f(x) on [a, b] except for a finite number of points. Show that g is integrable on [a, b] and ∫_a^b g = ∫_a^b f.
HW 7.3.4 Consider the function f : [−2, 2] → R defined by f(x) = −1 if x ∈ [−2, −1], f(x) = x if x ∈ (−1, 1), and f(x) = 1 if x ∈ [1, 2]. Prove that f is integrable on [−2, 2].
7.4 More Topics in Integration
There are many basic, important properties of integration—too many to fit into
one section. Thus this section is actually a continuation of the last section. The
next proposition that we include at first seems very general and probably not
familiar. The proof is tough so pay attention. As you will see the proposition
will provide for us some very useful corollaries.
Proposition 7.4.1 Suppose f : [a, b] → R is integrable on [a, b] and φ : [c, d] → R is continuous on [c, d], where f([a, b]) ⊂ [c, d]. Then the function φ ◦ f : [a, b] → R is integrable on [a, b].
Proof: We will work to find a partition of [a, b] that allows us to apply Riemann's Theorem, Theorem 7.2.6. Let ε > 0 be given, let K = lub{|φ(y)| : y ∈ [c, d]} and set ε1 = ε/(b − a + 2K) (we'll see why this is the correct choice of ε1 later). Since φ is continuous on [c, d], φ is uniformly continuous on [c, d]. So given ε1 > 0 there exists δ such that y1, y2 ∈ [c, d] and |y1 − y2| < δ implies that |φ(y1) − φ(y2)| < ε1. Choose δ < ε1—we can always make our δ smaller. Since f is integrable on [a, b], by Riemann's Theorem, Theorem 7.2.6, there exists a partition P such that U(f, P) − L(f, P) < δ². (Theorem 7.2.6 says that we can find such a partition P for any ε > 0—we're using δ² in place of the ε in the theorem.) Suppose P is given by P = {x0, · · · , xn}. The partition P is the partition we want to use in our application of Theorem 7.2.6 to show that φ ◦ f is integrable on [a, b], i.e. we must show that U(φ ◦ f, P) − L(φ ◦ f, P) < ε.
We know that

U(f, P) − L(f, P) = Σᵢ₌₁ⁿ (Mi^f − mi^f)(xi − xi−1) < δ²   (7.4.1)

where we will use the notation used in the last theorem: for any F, Mi^F = lub{F(x) : x ∈ [xi−1, xi]} and mi^F = glb{F(x) : x ∈ [xi−1, xi]}.
Since U(f, P) − L(f, P) must be "small", and since Mi^f − mi^f and xi − xi−1 are both greater than or equal to zero, each term (Mi^f − mi^f)(xi − xi−1) must be "small". There are two ways a term (Mi^f − mi^f)(xi − xi−1) can be made "small"—either Mi^f − mi^f is "small" or xi − xi−1 is "small".
Let S1 be the set of indices for which Mi^f − mi^f is "small", i.e. S1 = {i : 1 ≤ i ≤ n and Mi^f − mi^f < δ}. Let S2 = {i : 1 ≤ i ≤ n and Mi^f − mi^f ≥ δ}. Note that we have now partially defined what we mean by "small", and though we have defined the set S2 to be the set of indices on which Mi^f − mi^f is not "small", it is for these partition intervals that we had better have xi − xi−1 be small—because we have decided that Mi^f − mi^f is not.
We note that

U(φ ◦ f, P) − L(φ ◦ f, P) = Σᵢ₌₁ⁿ (Mi^φ◦f − mi^φ◦f)(xi − xi−1)
= Σ_{i∈S1} (Mi^φ◦f − mi^φ◦f)(xi − xi−1) + Σ_{i∈S2} (Mi^φ◦f − mi^φ◦f)(xi − xi−1).   (7.4.2)
We will handle the two sums separately.
For i ∈ S1: We note that for i ∈ S1 we have Mi^f − mi^f < δ. We are interested in Mi^φ◦f − mi^φ◦f. To aid us we prove the following two claims.
Claim 1: Mi^f − mi^f = lub{f(x) − f(y) : x, y ∈ [xi−1, xi]}
Claim 2: Mi^φ◦f − mi^φ◦f = lub{φ(f(x)) − φ(f(y)) : x, y ∈ [xi−1, xi]}
Proof of Claim 2: Recall that Mi^φ◦f = lub{φ ◦ f(x) : x ∈ [xi−1, xi]} and mi^φ◦f = glb{φ ◦ f(y) : y ∈ [xi−1, xi]}. By Proposition 1.5.3-(a), for every ε3 > 0 there exists φ ◦ f(x*) ∈ {φ ◦ f(x) : x ∈ [xi−1, xi]} such that Mi^φ◦f − φ ◦ f(x*) < ε3/2. Also by Proposition 1.5.3-(b) there exists φ ◦ f(y*) ∈ {φ ◦ f(y) : y ∈ [xi−1, xi]} such that φ ◦ f(y*) − mi^φ◦f < ε3/2. Then Mi^φ◦f − mi^φ◦f − [φ(f(x*)) − φ(f(y*))] < ε3, i.e. for ε3 > 0 there exists

φ ◦ f(x*) − φ ◦ f(y*) ∈ {φ(f(x)) − φ(f(y)) : x, y ∈ [xi−1, xi]}

so that Mi^φ◦f − mi^φ◦f − [φ(f(x*)) − φ(f(y*))] < ε3. Thus, again by Proposition 1.5.3-(a), Mi^φ◦f − mi^φ◦f = lub{φ(f(x)) − φ(f(y)) : x, y ∈ [xi−1, xi]}.
The proof of Claim 1 is essentially the same—a bit easier.
By Claim 1 and the fact that Mi^f − mi^f < δ, we see that for any x, y ∈ [xi−1, xi] we have |f(x) − f(y)| < δ. Then by the uniform continuity of φ, we have |φ(f(x)) − φ(f(y))| < ε1 for any x, y ∈ [xi−1, xi]. Therefore, by Claim 2, Mi^φ◦f − mi^φ◦f ≤ ε1 and we have

Σ_{i∈S1} (Mi^φ◦f − mi^φ◦f)(xi − xi−1) ≤ ε1 Σ_{i∈S1} (xi − xi−1) ≤ ε1(b − a).   (7.4.3)
For i ∈ S2: We note that for i ∈ S2 we have Mi^f − mi^f ≥ δ. Note that Mi^φ◦f = lub{φ ◦ f(x) : x ∈ [xi−1, xi]} ≤ K (x ∈ [xi−1, xi] implies φ(f(x)) ≤ |φ(f(x))| ≤ K) and mi^φ◦f = glb{φ ◦ f(x) : x ∈ [xi−1, xi]} ≥ −K (x ∈ [xi−1, xi] implies φ(f(x)) ≥ −|φ(f(x))| ≥ −K). Then Mi^φ◦f − mi^φ◦f ≤ 2K.
If for each i ∈ S2 we multiply both sides of the inequality Mi^f − mi^f ≥ δ by (xi − xi−1) (all positive) and sum over S2, we get Σ_{i∈S2} (Mi^f − mi^f)(xi − xi−1) ≥ δ Σ_{i∈S2} (xi − xi−1), or
δ Σ_{i∈S2} (xi − xi−1) ≤ Σ_{i∈S2} (Mi^f − mi^f)(xi − xi−1) ≤ Σᵢ₌₁ⁿ (Mi^f − mi^f)(xi − xi−1) < δ²

(by (7.4.1)), so Σ_{i∈S2} (xi − xi−1) < δ. Thus

U(φ ◦ f, P) − L(φ ◦ f, P) = Σ_{i∈S1} (Mi^φ◦f − mi^φ◦f)(xi − xi−1) + Σ_{i∈S2} (Mi^φ◦f − mi^φ◦f)(xi − xi−1)
≤ ε1(b − a) + 2K Σ_{i∈S2} (xi − xi−1) < ε1(b − a) + 2Kδ < ε1(b − a + 2K) = ε.

Therefore by Theorem 7.2.6 the function φ ◦ f is integrable on [a, b].
That too was a very technical proof. As we said earlier Proposition 7.4.1 is
useful for its corollaries. We begin with the following.
Corollary 7.4.2 Suppose f, g : [a, b] → R are integrable on [a, b]. Then
(a) f² is integrable on [a, b]
(b) fⁿ is integrable on [a, b] for any n ∈ N
(c) |f| is integrable on [a, b]
(d) f g is integrable on [a, b].
Proof: The proof of part (a) consists of letting φ(y) = y²—which is surely continuous everywhere—and applying Proposition 7.4.1. The proof of part (b) is a reasonably nice induction proof using part (a). To obtain part (c) we again use Proposition 7.4.1, this time with φ(y) = |y|. And part (d) follows by noting that f g = (1/4)[(f + g)² − (f − g)²] and using Proposition 7.3.4 along with part (a) of this corollary.
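The algebraic identity behind part (d) deserves a second look; it is what lets us build the integrability of a product out of squares, sums and differences. The following Python sketch (an illustration only, not part of the text) simply checks the identity pointwise for two arbitrary functions.

```python
# A quick sanity check (illustration only) of the identity behind Corollary
# 7.4.2(d): f*g = (1/4)*((f + g)**2 - (f - g)**2) pointwise.
import math, random

random.seed(1)
f = lambda x: math.cos(3 * x)
g = lambda x: x * x - 1

for _ in range(1000):
    x = random.uniform(-5, 5)
    lhs = f(x) * g(x)
    rhs = 0.25 * ((f(x) + g(x)) ** 2 - (f(x) - g(x)) ** 2)
    assert abs(lhs - rhs) < 1e-9
print("identity verified at 1000 random points")
```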
We should realize that we can let φ be any continuous function on [c, d]—which will give us a lot of different integrable composite functions. For example, if we let φ(x) = cx we see that the function cf is integrable if f is integrable—part (a) of Proposition 7.3.4. We next prove two reasonably easy propositions that will give us a lot of interesting, integrable functions.
Proposition 7.4.3 (a) Suppose that f : [a, b] → R is integrable on [a, c] and [c, b] for some c ∈ (a, b). Then f is integrable on [a, b] and ∫_a^b f = ∫_a^c f + ∫_c^b f.
(b) Suppose that f : [c0, ck+1] → R and c1, · · · , ck satisfy c0 < c1 < · · · < ck < ck+1, and f is integrable on [cj−1, cj] for j = 1, · · · , k + 1. Then f is integrable on [c0, ck+1] and ∫_{c0}^{ck+1} f = Σⱼ₌₁^{k+1} ∫_{cj−1}^{cj} f.
Proof: (a) Since f is integrable on [a, c] and [c, b], by Theorem 7.2.6 for any ε > 0 there exists a partition P1 of [a, c] and a partition P2 of [c, b] such that U(f, P1) − L(f, P1) < ε/2 and U(f, P2) − L(f, P2) < ε/2. Let P be the partition P = P1 ∪ P2. Clearly P is a partition of [a, b] and U(f, P) − L(f, P) = [U(f, P1) − L(f, P1)] + [U(f, P2) − L(f, P2)] < ε. Therefore by Theorem 7.2.6 the function f is integrable on [a, b]. We can then apply Proposition 7.3.3 to see that ∫_a^b f = ∫_a^c f + ∫_c^b f.
(b) Part (b) follows from part (a) using mathematical induction.
Proposition 7.4.4 Suppose f : [a, b] → R is bounded on [a, b] and continuous on (a, b). Then f is integrable on [a, b].
Proof: Since f is bounded on [a, b], there exists some K such that |f(x)| ≤ K for x ∈ [a, b]. Let x0 = a, x1 = a + (b − a)/n (for some n, yet to be determined), xn−1 = b − (b − a)/n and xn = b. Note that lub{f(x) : x ∈ [x0, x1]} ≤ K (for x ∈ [x0, x1], f(x) ≤ |f(x)| ≤ K), glb{f(x) : x ∈ [x0, x1]} ≥ −K (for x ∈ [x0, x1], f(x) ≥ −|f(x)| ≥ −K), lub{f(x) : x ∈ [xn−1, xn]} ≤ K, and glb{f(x) : x ∈ [xn−1, xn]} ≥ −K. Thus M1 − m1 ≤ 2K and Mn − mn ≤ 2K.
Now suppose we are given ε > 0. Choose n above so that 4K(b − a)/n < ε/2. Since f is continuous on (a, b), we know that f is continuous on [x1, xn−1]. Then by Riemann's Theorem, Theorem 7.2.6, we know that there exists a partition P* of [x1, xn−1] such that U(f, P*) − L(f, P*) < ε/2. Write the partition P* as P* = {x1, x2, · · · , xn−1} (which is clearly not in the form that we promised we'd write all partitions—but for a good reason). Let P = {x0, x1, · · · , xn−1, xn}, i.e. P = P* ∪ {x0, xn}. Then clearly P is a partition of [a, b] and

U(f, P) − L(f, P) = (M1 − m1)(x1 − a) + [U(f, P*) − L(f, P*)] + (Mn − mn)(b − xn−1)
< 2K(b − a)/n + ε/2 + 2K(b − a)/n < ε.

Then using Theorem 7.2.6 again we get that f is integrable on [a, b].
It might not be clear how wide a range of integrable functions this result produces. If a < c1 < · · · < ck < b, and f is continuous on (a, c1), (ck, b) and (cj−1, cj) for j = 2, · · · , k, then we say that f is piecewise continuous on [a, b]. If we assume that f is defined at a, c1, · · · , ck, b (which forces f to be bounded on [a, b]), then by Propositions 7.4.3 and 7.4.4 we see that f is integrable on [a, b] and

∫_a^b f = ∫_a^{c1} f + Σⱼ₌₂ᵏ ∫_{cj−1}^{cj} f + ∫_{ck}^b f.

Also, if we have the same setting a < c1 < · · · < ck < b and S is defined to be constant on each open interval (a, c1), (ck, b) and (cj−1, cj), j = 2, · · · , k, then S is said to be a step function. If, in addition, S is defined at the points a, c1, · · · , ck, b, then S is piecewise continuous on [a, b] and is integrable on [a, b]. Thus we have a lot of integrable functions, a bit more interesting than just continuous functions. In fact, we note that the nasty function f(x) = sin(1/x) for x ≠ 0, f(0) = 0, is integrable on [0, 1]. Can you find ∫_0^1 f? Note also that f is integrable on [−1, 1].
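Although sin(1/x) oscillates wildly near 0, Proposition 7.4.4 applies because the function is bounded on [0, 1] and continuous on (0, 1). The following Python sketch (an illustration only, not part of the text) watches midpoint Riemann sums settle down; the limiting value is roughly 0.504, an assumption you can check against tables of the cosine integral.

```python
# Numerical exploration (not a proof): f(x) = sin(1/x) for x != 0, f(0) = 0 is
# bounded on [0, 1] and continuous on (0, 1), so Proposition 7.4.4 says it is
# integrable. Midpoint Riemann sums settle down despite the wild oscillation
# near 0; the true value is roughly 0.504.
import math

def midpoint_sum(f, a, b, n):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * h for i in range(n))

f = lambda x: math.sin(1.0 / x) if x != 0 else 0.0
for n in (10_000, 100_000):
    print(n, midpoint_sum(f, 0.0, 1.0, n))
```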
We next prove an intuitive result that, as we will see later, is a useful tool.
Proposition 7.4.5 Suppose that f, g : [a, b] → R are integrable on [a, b] and satisfy f(x) ≤ g(x) for all x ∈ [a, b]. Then ∫_a^b f ≤ ∫_a^b g.
Proof: Since f and g are integrable, they both have Archimedian sequences of partitions on [a, b]. Let {Pn} be the common refinement of both sequences, i.e. {Pn} will satisfy U(f, Pn) → ∫_a^b f and U(g, Pn) → ∫_a^b g as n → ∞. Since f(x) ≤ g(x) on [a, b], f(x) ≤ g(x) on any of the partition intervals [xi−1, xi] and we get Mi^f ≤ Mi^g (using the notation defined in the proof of Proposition 7.3.4) and U(f, Pn) ≤ U(g, Pn). If we then let n → ∞, we get ∫_a^b f ≤ ∫_a^b g.
We next combine Propositions 7.3.4 and 7.4.5 to obtain the following important result.
Proposition 7.4.6 Suppose that f : [a, b] → R is integrable on [a, b].
(a) Then |∫_a^b f| ≤ ∫_a^b |f|.
(b) If |f(x)| ≤ M for all x ∈ [a, b], then |∫_a^b f| ≤ M(b − a).
Proof: (a) We note that by the definition of absolute value we have −|f(x)| ≤ f(x) ≤ |f(x)|. By Corollary 7.4.2-(c) we know that f integrable implies that |f| is integrable. Then by Proposition 7.4.5 we have ∫_a^b (−|f|) ≤ ∫_a^b f ≤ ∫_a^b |f|. Applying Proposition 7.3.4 gives −∫_a^b |f| ≤ ∫_a^b f ≤ ∫_a^b |f|, which gives us |∫_a^b f| ≤ ∫_a^b |f|.
(b) Before we start, let us emphasize that |f(x)| ≤ M is not really an extra hypothesis. Because f is already assumed to be integrable, we know f must be bounded. We're now just saying that it's bounded by M.
If we apply part (a) of this proposition and Proposition 7.4.5, we get |∫_a^b f| ≤ ∫_a^b |f| ≤ ∫_a^b M = M(b − a), which is what we were to prove.
Everything we have done with respect to integrals has been over a range a
to b where a < b. It is convenient and necessary to have integral results for
integrals from d to c with d ≥ c—you probably have already used such integrals
in your basic course. To allow for this we make the following definition.
Definition 7.4.7 Suppose f : [a, b] → R is integrable on [a, b]. Let c, d ∈ [a, b] be such that c < d. We define
(a) ∫_c^c f = 0 and
(b) ∫_d^c f = −∫_c^d f.
We then have a variety of results that "fix up" our previous results, now allowing for the integrals defined in Definition 7.4.7. The results generally follow previous analogous results for integrals ∫_c^d f with c < d. We include some of these results in the following proposition.
Proposition 7.4.8 Suppose f, g : [a, b] → R are integrable on [a, b]. We then have the following results.
(a) For x1, x2, x3 ∈ [a, b], ∫_{x1}^{x3} f = ∫_{x1}^{x2} f + ∫_{x2}^{x3} f.
(b) If f(x) ≤ g(x) for x ∈ [a, b] and x1, x2 ∈ [a, b], then ∫_{x1}^{x2} f ≤ ∫_{x1}^{x2} g if x1 ≤ x2, and ∫_{x1}^{x2} f ≥ ∫_{x1}^{x2} g if x1 > x2.
(c) If x1, x2 ∈ [a, b], then |∫_{x1}^{x2} f| ≤ |∫_{x1}^{x2} |f||.
(d) If x1, x2 ∈ [a, b] and |f(x)| ≤ M for x ∈ [a, b], then |∫_{x1}^{x2} f| ≤ M|x2 − x1|.
Proof: (a) We note that if x1 < x2 < x3, this result is the same as Proposition 7.3.3. Suppose that x3 < x1 < x2. Then ∫_{x3}^{x2} f = ∫_{x3}^{x1} f + ∫_{x1}^{x2} f, or −∫_{x2}^{x3} f = −∫_{x1}^{x3} f + ∫_{x1}^{x2} f, or ∫_{x1}^{x3} f = ∫_{x1}^{x2} f + ∫_{x2}^{x3} f, which is what we wanted to prove. The results for the other orders of x1, x2 and x3 follow in the same manner.
(b) The first part of (b) is the same as Proposition 7.4.5. If x2 < x1, using Proposition 7.4.5 again gives ∫_{x2}^{x1} f ≤ ∫_{x2}^{x1} g. Using Definition 7.4.7 this can be rewritten as −∫_{x1}^{x2} f ≤ −∫_{x1}^{x2} g, which is equivalent to the inequality that we must prove.
(c) First note that if x1 < x2, the inequality |∫_{x1}^{x2} f| ≤ ∫_{x1}^{x2} |f| follows from Proposition 7.4.6-(a)—and the outer set of absolute value signs on the right-hand side is not necessary. If x1 = x2, the inequality is trivial because the values on both sides of the inequality are zero. And finally, if x1 > x2, then

|∫_{x1}^{x2} f| = |−∫_{x2}^{x1} f| = |∫_{x2}^{x1} f| ≤ ∫_{x2}^{x1} |f|

by Proposition 7.4.6-(a). Then since ∫_{x2}^{x1} |f| = −∫_{x1}^{x2} |f| = |∫_{x1}^{x2} |f||, we have the desired result.
(d) If x1 < x2, this result follows from Proposition 7.4.6-(b). If x1 = x2, both sides of the inequality are zero—so the result is true. If x1 > x2, we apply Proposition 7.4.6-(b) to get |∫_{x2}^{x1} f| ≤ M(x1 − x2). Then

|∫_{x1}^{x2} f| = |−∫_{x2}^{x1} f| = |∫_{x2}^{x1} f| ≤ M(x1 − x2) = M|x2 − x1|.
There may be some other integration results that must be adjusted to allow
for arbitrary limits and from this time on we will assume that you are able to
complete them.
HW 7.4.1 (True or False and why)
(a) Suppose f : [a, b] → R is integrable on [a, b]. Then the function g defined by g(x) = sin(f(x)), x ∈ [a, b], is integrable on [a, b].
(b) Suppose f : [a, b] → R is integrable on [a, b], and x1, x2 ∈ [a, b]. Then |∫_{x1}^{x2} f| ≤ ∫_{x1}^{x2} |f|.
(c) Suppose f : [a, b] → R is such that f(x) > 0 for x ∈ [a, b]. Then ∫_a^b f > 0.
(d) Suppose f : [a, b] → R is integrable on [a, b] and such that f(x) > 0 for x ∈ [a, b]. Then ∫_a^b f > 0.
(e) Suppose f : [a, b] → R is integrable on [a, b] and there exists a c > 0 such that f(x) ≥ c for all x ∈ [a, b]. Then 1/f is integrable on [a, b].
HW 7.4.2 Recall that we earlier defined the functions f1(x) = sin(1/x) for x ≠ 0, f1(0) = 0; f2(x) = x sin(1/x) for x ≠ 0, f2(0) = 0; and f3(x) = x² sin(1/x) for x ≠ 0, f3(0) = 0. Which of the functions f1, f2 and f3 are integrable on [−1, 1]? Prove it.
HW 7.4.3 Suppose f : [0, 1] → R is continuous on [0, 1] and that ∫_0^1 f > 0. Prove that there exists an interval (α, β) ⊂ [0, 1] such that f(x) > 0 for x ∈ (α, β).
HW 7.4.4 Suppose f, g : [a, b] → R are integrable on [a, b]. Prove that ∫_a^b |f + g| ≤ ∫_a^b |f| + ∫_a^b |g|.
HW 7.4.5 Suppose f : [a, b] → R, g : [c, d] → R are such that f ([a, b]) ⊂ [c, d],
f is continuous on [a, b] and g is integrable on [c, d]. Prove or disprove that g ◦ f
is integrable on [a, b].
7.5 The Fundamental Theorem of Calculus
So far we have defined the integral and derived properties of integrals. There are many applications of integration, but if we were not able to compute integrals, these applications would not be very useful—at least not until numerical integration became routine. We all know that there are methods for computing integrals. In this section we state and prove the Fundamental Theorem of Calculus and some related results—the theorems that will allow us to compute integrals. One might guess that a theorem with a name like "the Fundamental Theorem" of anything might be important. So read carefully.
We consider a function f : [a, b] → R that is integrable on [a, b]. We note that by Proposition 7.3.3 (and Definition 7.4.7) f is also integrable on [a, x] for any x, a ≤ x ≤ b. Define F : [a, b] → R by F(x) = ∫_a^x f. We will use this notation throughout this section. We begin with our first result, which gives us a very basic property of F.
Proposition 7.5.1 If f : [a, b] → R is integrable on [a, b], then F is uniformly continuous on [a, b].

Proof: Suppose x, y ∈ [a, b]. Then

F(y) − F(x) = ∫_a^y f − ∫_a^x f = ∫_a^x f + ∫_x^y f − ∫_a^x f (by Prop. 7.4.8-(a)) = ∫_x^y f.

Since f is integrable, we know that f is bounded, i.e. there exists K ∈ R such that |f(x)| ≤ K for x ∈ [a, b]. Then by Proposition 7.4.8-(d), |F(y) − F(x)| = |∫_x^y f| ≤ K|y − x|. Thus, given any ε > 0 we can choose δ = ε/K and see that |y − x| < δ implies that |F(y) − F(x)| < ε. Therefore F is uniformly continuous on [a, b].
We next add a hypothesis that makes f a bit nicer, and we see that it makes
F nicer. You could say that this result shows how integration is the reverse
operation of differentiation.
Proposition 7.5.2 Suppose f : [a, b] → R is integrable on [a, b]. If f is continuous at x = c ∈ [a, b], then F′(c) exists and F′(c) = f(c).

Proof: We want to proceed in the obvious way and consider (F(x) − F(c))/(x − c). We begin by noting that F(x) − F(c) = ∫_a^x f − ∫_a^c f = ∫_c^x f. Also we note that since f(c) is a constant (c is a fixed point in [a, b]), f(c) = (1/(x − c)) f(c) ∫_c^x 1 = (1/(x − c)) ∫_c^x f(c)—for x ≠ c. Then for x ≠ c we have

|(F(x) − F(c))/(x − c) − f(c)| = |(1/(x − c)) ∫_c^x f − (1/(x − c)) ∫_c^x f(c)|
= |(1/(x − c)) ∫_c^x (f − f(c))| = (1/|x − c|) |∫_c^x [f − f(c)]|.   (7.5.1)

We assume that we have an ε > 0 given. By the continuity of f at x = c we get a δ so that |x − c| < δ and x ∈ [a, b] implies that |f(x) − f(c)| < ε. Then if we consider x satisfying 0 < |x − c| < δ and x ∈ [a, b], return to equation (7.5.1) and apply Proposition 7.4.8-(d), we see that

|(F(x) − F(c))/(x − c) − f(c)| = (1/|x − c|) |∫_c^x [f − f(c)]| ≤ (1/|x − c|) ε|x − c| = ε.   (7.5.2)
Thus we have for a given ε > 0 a δ such that 0 < |x − c| < δ and x ∈ [a, b] implies that |(F(x) − F(c))/(x − c) − f(c)| < ε. Therefore lim_{x→c} (F(x) − F(c))/(x − c) = f(c), or F′(c) = f(c).
Note that continuity gave us |f(x) − f(c)| < ε for all x satisfying |x − c| < δ. When we used this inequality in equation (7.5.2) we only used it for 0 < |x − c| < δ. This is all we need when we are working with the definition of a derivative (a limit), and it is fine to use less than what we have.
We now return to our basic calculus course and define the antiderivative of
a function.
Definition 7.5.3 Consider some interval I ⊂ R and f : I → R. If the function ℱ is such that ℱ′(x) = f(x) for all x ∈ I, then ℱ is said to be an antiderivative of f on I.
We then have the following theorem, the Fundamental Theorem of Calculus.
Theorem 7.5.4 Suppose f : [a, b] → R is continuous on [a, b]. Then ℱ : [a, b] → R satisfies ℱ(x) − ℱ(a) = ∫_a^x f for all x ∈ [a, b] if and only if ℱ is an antiderivative of f on [a, b].
Proof: (⇒) We assume that there is a function ℱ such that ℱ(x) − ℱ(a) = ∫_a^x f. Using the notation of this section we can rewrite this expression as ℱ(x) − ℱ(a) = F(x), and this expression holds for all x ∈ [a, b]. Then since by Proposition 7.5.2 F is differentiable on [a, b], we know that ℱ is differentiable on [a, b] (ℱ(a) is a constant). Also by Proposition 7.5.2 and the fact that we know that the derivative of a constant is zero, we see that ℱ′(x) = F′(x) = f(x). Thus ℱ is an antiderivative of f.
(⇐) If ℱ : [a, b] → R is such that ℱ′(x) = f(x) for x ∈ [a, b], we know that ℱ′(x) = F′(x). By Corollary 6.3.5 there exists a C ∈ R such that ℱ(x) = F(x) + C. Since F(a) = 0, we evaluate the last expression at x = a to see that ℱ(a) = F(a) + C = C. Thus we have ℱ(x) = F(x) + ℱ(a), or ℱ(x) − ℱ(a) = F(x) = ∫_a^x f, which is what we were to prove.
If we let x = b, we get the following corollary, which looks more like the result we applied so often in our basic course.
Corollary 7.5.5 Suppose f : [a, b] → R is continuous on [a, b] and ℱ is an antiderivative of f on [a, b]. Then ℱ(b) − ℱ(a) = ∫_a^b f.
Thus, as we have done so often before, we evaluate

∫_0^1 (x² + x + 1) = [x³/3 + x²/2 + x]_0^1 = (1/3 + 1/2 + 1) − 0 = 11/6 = ℱ(1) − ℱ(0)

where ℱ(x) = x³/3 + x²/2 + x.
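The worked example is also a convenient place to see the two sides of the Fundamental Theorem agree numerically. The following Python sketch (an illustration only, not part of the text) compares the antiderivative value 11/6 with a midpoint Riemann sum.

```python
# Checking the worked example numerically (illustration only): the
# antiderivative value 11/6 agrees with a Riemann sum for the integral of
# x^2 + x + 1 over [0, 1].
def midpoint_sum(f, a, b, n):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * h for i in range(n))

f = lambda x: x * x + x + 1.0
antider = lambda x: x ** 3 / 3 + x ** 2 / 2 + x   # an antiderivative of f
approx = midpoint_sum(f, 0.0, 1.0, 10_000)
exact = antider(1.0) - antider(0.0)               # = 11/6
print(approx, exact)
```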
We next include several very nice results, the first two of which are by-products of our previous results. We begin with integration by parts. Integration by parts is usually presented as a technique for evaluating integrals—integrals that we cannot evaluate using easier methods. However, with the advent of computer and calculator calculus systems, integration by parts is not as necessary as an integration technique as it was in the past. Integration by parts is, however, an important tool in analysis, as we shall see in the next chapter. We proceed with the following result.
Proposition 7.5.6 (Integration by Parts) Suppose f, g : [a, b] → R are differentiable on [a, b] and are such that f′, g′ : [a, b] → R are continuous on [a, b] (f and g are continuously differentiable on [a, b]). Then

∫_a^b f′g = [f(b)g(b) − f(a)g(a)] − ∫_a^b fg′.
Proof: The proof of the integration by parts formula is very easy. We all remember that the product formula for differentiation gives us (d/dx)(fg) = f′g + fg′. We integrate both sides of this equality, use Corollary 7.5.5 (noting that fg is an antiderivative of the function (d/dx)(fg)) and get

∫_a^b (d/dx)(fg) = f(b)g(b) − f(a)g(a) = ∫_a^b f′g + ∫_a^b fg′.

Rearranged, this is the formula for integration by parts.
Note that when we say “integrate both sides of the equation,” we are just
considering the fact that when we have two equal functions, the integrals of
these functions must be equal.
Thus we are now able to evaluate such integrals as ∫ x sin x, ∫ x⁷ arctan x, etc. We cannot use integration by parts on ∫ xeˣ or ∫ x ln x yet, because we have still not introduced the exponential and logarithm functions—but we will soon be able to do these integrals also.
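As a concrete numerical check (an illustration only, not part of the text), take f(x) = −cos x (so f′ = sin) and g(x) = x on [0, π]; both sides of the integration by parts formula should come out near π, since ∫_0^π x sin x dx = π.

```python
# A numerical check (illustration only) of the integration by parts formula
# with f(x) = -cos(x) (so f' = sin) and g(x) = x on [0, pi]; the integral of
# x*sin(x) over [0, pi] equals pi.
import math

def midpoint_sum(h_fun, a, b, n):
    h = (b - a) / n
    return sum(h_fun(a + (i + 0.5) * h) * h for i in range(n))

a, b = 0.0, math.pi
f = lambda x: -math.cos(x)      # f' = sin
g = lambda x: x                 # g' = 1
lhs = midpoint_sum(lambda x: math.sin(x) * g(x), a, b, 20_000)                  # integral of f'g
rhs = (f(b) * g(b) - f(a) * g(a)) - midpoint_sum(lambda x: f(x), a, b, 20_000)  # [fg] - integral of fg'
print(lhs, rhs)   # both near pi
```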
Another technique that is a common tool for the evaluation of integrals is
that of substitution. Substitution is a very important result—for the evalua-
tion of integrals and for the general manipulation of integrals in all sorts of
applications. We include the following result.
Proposition 7.5.7 (Substitution) Suppose that φ : [a, b] → R is continuously differentiable on [a, b] and that f : φ([a, b]) → R is continuous on φ([a, b]). Then

∫_a^b (f ◦ φ) φ′ = ∫_{φ(a)}^{φ(b)} f.

Proof: Before we start our work we might note that we could write the result
Z b Z φ(b)
of the above proposition as f (φ(t))φ0 (t) dt = f (x) dx. Substitution is
a φ(a)
7.5 Fundamental Theorem 199

one of the places that this latter notation is very nice—it reminds you that you
Z φ(b)
are making the substitution x = φ(t) in the integral f (x) dx.
φ(a)
We begin by noting that since φ is continuous on the interval [a, b], φ attains both its maximum and minimum on [a, b], Theorem 5.3.8, i.e. there exist x_M, x_m ∈ [a, b] such that φ(x_M) is the maximum value of φ on [a, b] and φ(x_m) is the minimum value of φ on [a, b]. Suppose for convenience that x_m < x_M. For any y_0 ∈ [φ(x_m), φ(x_M)], by the IVT, Theorem 5.4.1, we know that there is an x_0 ∈ [x_m, x_M] such that φ(x_0) = y_0. Thus φ([a, b]) = [φ(x_m), φ(x_M)] is an interval.
Now let c = φ(a) and d = φ(b). We note that since φ([a, b]) is an interval, φ([a, b]) will contain the interval with end points c and d, either [c, d] or [d, c]. Define F : φ([a, b]) → R by F(x) = ∫_c^x f and define h : [a, b] → R by h = F ∘ φ. Then by the Chain Rule, Proposition 6.1.4, and Proposition 7.5.2, we see that h′(x) = F′(φ(x)) φ′(x) = f(φ(x)) φ′(x). If we integrate both sides of this last expression and apply Corollary 7.5.5, we get
∫_a^b h′ = h(b) − h(a) = ∫_a^b f(φ(x)) φ′(x) dx.  (7.5.3)

Since h(a) = F(φ(a)) = F(c) = 0 and h(b) = F(φ(b)) = F(d) = ∫_c^d f, equation (7.5.3) becomes ∫_c^d f = ∫_a^b f(φ(x)) φ′(x) dx, or ∫_{φ(a)}^{φ(b)} f = ∫_a^b (f ∘ φ) φ′, which is what we were to prove.
Thus we can now consider an integral such as ∫_0^{1/2} 1/√(1 − x²) dx. We choose φ(θ) = sin θ. Then φ(0) = 0 and φ(π/6) = 1/2. Thus
∫_0^{1/2} 1/√(1 − x²) dx = ∫_{φ(0)}^{φ(π/6)} 1/√(1 − x²) dx = ∫_0^{π/6} φ′(θ)/√(1 − [φ(θ)]²) dθ
= ∫_0^{π/6} cos θ/√(1 − sin² θ) dθ = ∫_0^{π/6} 1 dθ = π/6.
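We can also check this substitution numerically. The snippet below is our own sketch (the helper `midpoint_integral` is a name we introduce): it approximates both sides of the substitution identity for φ(θ) = sin θ and f(x) = 1/√(1 − x²), each of which should be near π/6 ≈ 0.5236.

```python
import math

def midpoint_integral(f, a, b, n=200000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Left side: int_0^{1/2} 1/sqrt(1 - x^2) dx.
left = midpoint_integral(lambda x: 1.0 / math.sqrt(1.0 - x * x), 0.0, 0.5)

# Right side after substituting x = phi(t) = sin t:
# int_0^{pi/6} f(phi(t)) phi'(t) dt = int_0^{pi/6} cos t / sqrt(1 - sin^2 t) dt.
right = midpoint_integral(
    lambda t: math.cos(t) / math.sqrt(1.0 - math.sin(t) ** 2), 0.0, math.pi / 6)

print(left, right, math.pi / 6)   # all three approximately 0.5236
```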

We next include a theorem that may not be familiar to you. We will find it useful in the next chapter, and it can be used in a variety of interesting ways. The theorem is called the Mean Value Theorem for Integrals.

Theorem 7.5.8 (Mean Value Theorem for Integrals) Suppose that f : [a, b] → R is continuous on [a, b], and p : [a, b] → R is integrable on [a, b] and such that p(x) ≥ 0 for x ∈ [a, b]. Then there exists c ∈ [a, b] such that
∫_a^b f p = f(c) ∫_a^b p.  (7.5.4)

Proof: We know from Corollary 7.4.2-(d) that since f and p are integrable on [a, b], f p is integrable on [a, b]. We also know from Theorem 5.3.8 that f assumes its maximum and minimum on [a, b], i.e. there exist m, M ∈ R such that m ≤ f(x) ≤ M for x ∈ [a, b], and there exist x_m, x_M ∈ [a, b] such that f(x_m) = m and f(x_M) = M. Because p(x) ≥ 0 we know also that m p(x) ≤ f(x) p(x) ≤ M p(x) for all x ∈ [a, b]. Therefore
m ∫_a^b p ≤ ∫_a^b f p ≤ M ∫_a^b p.  (7.5.5)

If ∫_a^b p = 0, then by (7.5.5) ∫_a^b f p = 0 and we can choose any c ∈ [a, b] to satisfy equation (7.5.4). Otherwise we rewrite inequality (7.5.5) as
f(x_m) = m ≤ (∫_a^b f p) / (∫_a^b p) ≤ M = f(x_M).
Then by the Intermediate Value Theorem, Theorem 5.4.1 (applied on either [x_m, x_M] or [x_M, x_m] depending on whether x_m ≤ x_M or x_M < x_m), there exists c between x_m and x_M such that f(c) = (∫_a^b f p) / (∫_a^b p), which is the same as (7.5.4).
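For a concrete illustration of Theorem 7.5.8 (our own numerical sketch, not part of the proof, and `midpoint_integral` is a helper name we introduce), take f(x) = x² and p(x) = x on [0, 1]. Then ∫ f p = 1/4 and ∫ p = 1/2, so f(c) = 1/2 and c = 1/√2 works.

```python
import math

def midpoint_integral(f, a, b, n=100000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x * x          # continuous on [0, 1]
p = lambda x: x              # integrable and nonnegative on [0, 1]

num = midpoint_integral(lambda x: f(x) * p(x), 0.0, 1.0)   # int f p = 1/4
den = midpoint_integral(p, 0.0, 1.0)                       # int p   = 1/2
mean = num / den                                           # f(c) must equal 1/2
c = math.sqrt(mean)                                        # solve x^2 = 1/2 on [0, 1]
print(c)   # approximately 0.7071, i.e. 1/sqrt(2)
```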

HW 7.5.1 (True or False and why)
(a) Suppose f : [a, b] → R is continuous on [a, b]. Then (d/dx)[∫_x^b f] = f(x).
(b) Suppose f : [a, b] → R is continuous on [a, b]. Then (d/dx)[∫_x^a f] = −f(x).
(c) Consider f : [−2, 2] → R defined by f(x) = −2 if x ∈ [−2, −1], f(x) = x if x ∈ (−1, 1) and f(x) = 3 if x ∈ [1, 2]. Then the function F(x) = ∫_{−2}^x f is continuous at points x ∈ [−2, −1) ∪ (−1, 1) ∪ (1, 2] and discontinuous at x = −1 and x = 1.
(d) The function F defined in part (c) is differentiable for all x ∈ [−2, 2].
(e) Suppose f : [a, b] → R is continuous on [a, b]. Then there exists c ∈ [a, b] such that ∫_a^b f = f(c)(b − a).

HW 7.5.2 Consider the function defined in HW 7.5.1-(c). Compute F. Plot F.

HW 7.5.3 Calculate the following three integrals, verifying all steps.
(a) ∫_{−1}^2 x³ (b) ∫_{−1}^3 x cos x (c) ∫_1^2 1/√(2x − 1).

HW 7.5.4 Suppose f : [a, b] → R is integrable. Show that there may not be a c ∈ [a, b] such that ∫_a^b f = f(c)(b − a).

HW 7.5.5 Suppose f, φ, ψ : [a, b] → R are such that f is continuous on [a, b] and φ and ψ are differentiable on (a, b). Show that for x ∈ (a, b),
(d/dx) ∫_{ψ(x)}^{φ(x)} f = f(φ(x)) φ′(x) − f(ψ(x)) ψ′(x).

7.6 The Riemann Integral


The integral studied in the basic calculus course is most often referred to as the Riemann integral. We called the integral that we defined the Darboux integral, or just the integral, to distinguish it from the integral defined in this section. As we will see in Theorem 7.6.3 the name is not relevant because the integrals are always equal. It is important to introduce the definition given below because this is the most common definition introduced in the basic calculus courses. We begin with some definitions.
For a partition of [a, b], P = {x0 , x1 , · · · , xn−1 , xn }, we define the gap of P
to be gap(P ) = max{xi − xi−1 : i = 1, · · · , n}. Thus the gap(P ) is the length
of the largest partition interval. We then make the following definition.
Definition 7.6.1 Consider the function f : [a, b] → R where f is bounded on [a, b] and let P = {x_0, x_1, · · · , x_n} be a partition of [a, b].
(a) A Riemann sum of f with respect to the partition P is the sum
S_n(f, P) = Σ_{i=1}^n f(ξ_i)(x_i − x_{i−1})  (7.6.1)
where ξ_i is any point in [x_{i−1}, x_i].
(b) The function f is said to be Riemann integrable on [a, b] if there exists a real number (R)∫_a^b f so that for every ε > 0 there exists a δ such that |(R)∫_a^b f − S_n(f, P)| < ε for all partitions P of [a, b] with gap(P) < δ and all different choices of S_n(f, P).

Before we move on let us emphasize some important points here. Riemann sums are difficult to work with in that for a given partition there are many different sums: you get a different value for each choice of ξ_i ∈ [x_{i−1}, x_i] for each i = 1, · · · , n. That is why we included the statement “all different choices of S_n(f, P)” in the definition of the Riemann integral—it’s usually not there. The fact that we must be able to show that |(R)∫_a^b f − S_n(f, P)| < ε for arbitrary ξ_i ∈ [x_{i−1}, x_i] (besides all partitions P for which gap(P) < δ) can make working with this definition difficult.
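To see the "all choices of ξ_i" requirement in action, the sketch below (our own illustration; the helper `riemann_sum` is a name we introduce) evaluates Riemann sums of f(x) = x² on [0, 1] over uniform partitions with randomly chosen tags ξ_i. As gap(P) = 1/n shrinks, every such sum approaches ∫_0^1 x² = 1/3 no matter how the tags are chosen.

```python
import random

def riemann_sum(f, partition, xis):
    """S_n(f, P) = sum of f(xi_i) (x_i - x_{i-1}) over the partition intervals."""
    return sum(f(xi) * (partition[i + 1] - partition[i])
               for i, xi in enumerate(xis))

f = lambda x: x * x                       # integral over [0, 1] is 1/3
random.seed(0)
sums = []
for n in (10, 100, 10000):
    P = [i / n for i in range(n + 1)]     # uniform partition, gap(P) = 1/n
    # an arbitrary tag in each subinterval [x_{i-1}, x_i]
    xis = [random.uniform(P[i], P[i + 1]) for i in range(n)]
    sums.append(riemann_sum(f, P, xis))
print(sums)   # values approach 1/3 as gap(P) -> 0
```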
The definition given above is the most common definition used in elementary
textbooks. This is probably because it is the definition that can be given as
quickly as possible. In most texts this definition is given before they consider
limits of sequences—let alone limits of partial sums.
We are going to do very little with Definition 7.6.1. As we stated earlier the main result will be Theorem 7.6.3 where we prove that the Riemann integral and the Darboux integral are the same. Before we do this, we state the following easy result.

Proposition 7.6.2 Consider f : [a, b] → R where f is bounded on [a, b]. If f is Riemann integrable on [a, b], then there exists a sequence of partitions of [a, b], {P_n}, such that S_n(f, P_n) → (R)∫_a^b f as n → ∞ for all choices of ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n, i.e. lim_{n→∞} S_n(f, P_n) = (R)∫_a^b f for arbitrary ξ_i.

Proof: We begin by setting ε_n = 1/n, n = 1, · · · and applying Definition 7.6.1. For each n we obtain a δ_n such that for any partition of [a, b], P_n^*, with gap(P_n^*) < δ_n we get |S_n(f, P_n^*) − (R)∫_a^b f| < 1/n. For each n choose one such partition, call it P_n. Then we have a sequence of partitions of [a, b] such that S_n(f, P_n) → (R)∫_a^b f as n → ∞.
Thus we see that the Riemann integral can be evaluated by a sequence of
the Riemann sums over a sequence of partitions—much like the result of the
Archimedes-Riemann Theorem, Theorem 7.2.4. The real result that we want is
that the Riemann integral defined by Definition 7.6.1 is the same as the Darboux
integral defined by Definition 7.2.3.

Theorem 7.6.3 Consider f : [a, b] → R where f is bounded on [a, b]. Then f


is Riemann integrable if and only if f is Darboux integrable, and in either case
the integrals are equal.

Proof: (⇒) We’ll do the easier one first. Suppose ε > 0 is given. Since f is Riemann integrable there exists a δ so that |(R)∫_a^b f − S_n(f, P)| < ε/3 for all partitions P with gap(P) < δ, or (R)∫_a^b f − ε/3 < S_n(f, P) < (R)∫_a^b f + ε/3, and this must hold for all choices ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n.
Choose one such partition P and consider the left half of the inequality, (R)∫_a^b f − ε/3 < S_n(f, P). Since this inequality must hold for any choice of ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n, we can take the greatest lower bound of both sides of this inequality over all such possible choices of ξ_i to get
(R)∫_a^b f − ε/3 ≤ glb{S_n(f, P) : ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n}.
But we claim that the term on the right is just L(f, P). To see that this is the case consider the contribution of one particular partition interval [x_{i−1}, x_i] to S_n(f, P) and L(f, P). As before set m_i = glb{f(x) : x ∈ [x_{i−1}, x_i]}. Then clearly for any ξ_i ∈ [x_{i−1}, x_i] we have m_i ≤ f(ξ_i). Thus for any choice of the arbitrary points {ξ_i}, L(f, P) ≤ S_n(f, P) and L(f, P) is a lower bound of the set S = {S_n(f, P) : ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n}. To see that L(f, P) is the greatest lower bound of S, we note that by the definition of m_i, for any ε > 0 and for each partition interval [x_{i−1}, x_i] we can choose ξ_i^* ∈ [x_{i−1}, x_i] such that f(ξ_i^*) − m_i < ε/(b − a). Then for the particular Riemann sum associated with {ξ_i^*}, S^*(f, P), we have S^*(f, P) − L(f, P) < ε.
Thus glb{S_n(f, P) : ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n} is just an ugly way of writing L(f, P) and we have
(R)∫_a^b f − ε/3 ≤ L(f, P).  (7.6.2)
Repeat this process with the inequality S_n(f, P) < (R)∫_a^b f + ε/3, this time taking the least upper bound of both sides, to get
U(f, P) ≤ (R)∫_a^b f + ε/3.  (7.6.3)
If we combine inequalities (7.6.2) and (7.6.3), we get U(f, P) − L(f, P) ≤ 2ε/3 < ε.
By Riemann’s Theorem, Theorem 7.2.6, we know that f is integrable (Darboux
integrable).
Z b
If we then use the fact that for any partition P we have L(f, P ) ≤ f ≤
a
U (f, P ) along with inequalities (7.6.2) and (7.6.3), we get
Z b Z b Z b
 
(R) f − ≤ L(f, P ) ≤ f ≤ U (f, P ) ≤ (R) f+
a 3 a a 3
Z b Z b Z b Z b
 
or − ≤ f − (R) f ≤ . Since  is arbitrary, we have f = (R) f.
3 a a 3 a a

(⇐) This is a tough proof, but also an interesting, good proof. But the proof is not as hard as it looks—we do it very carefully. If f is integrable on [a, b], then for ε > 0 by Definitions 7.2.1 and 7.2.3 there exists a partition of [a, b], P′ = {x_0, · · · , x_n}, such that
U(f, P′) − ∫_a^b f = |U(f, P′) − ∫_a^b f| < ε/4.  (7.6.4)

Since f is bounded on [a, b], there exists M such that |f(x)| ≤ M for all x ∈ [a, b]. Set δ_1 = ε/(16Mn) and let P = {y_0, y_1, · · · , y_m} be any partition of [a, b] such that gap(P) < δ_1. Let P^* be the common refinement of P′ and P, P^* = P′ ∪ P. By Lemma 7.1.5, U(f, P^*) ≤ U(f, P′). Then from (7.6.4) we get
U(f, P^*) − ∫_a^b f < ε/4.  (7.6.5)

We next want to transfer the information from inequality (7.6.5) to partition P. To do this we want to compare U(f, P^*) and U(f, P), and we will do this by looking at 0 ≤ U(f, P) − U(f, P^*). Write P^* as P^* = {z_0, z_1, · · · , z_p} where p will be less than or equal to m + (n − 1)—but we don’t care about this. Define the notation M_i^{P*} = lub{f(x) : x ∈ [z_{i−1}, z_i]}, i = 1, · · · , p, and M_j^P = lub{f(x) : x ∈ [y_{j−1}, y_j]}, j = 1, · · · , m. We note the following facts.

• If a partition interval of P, [y_{j−1}, y_j], contains no points of P′, then this partition interval is the same as one of the partition intervals of P^*, say [z_{i−1}, z_i], so M_j^P = M_i^{P*} and the contribution of this interval to U(f, P) − U(f, P^*) is zero.

• If a partition interval of P, [y_{j−1}, y_j], contains one point of P′, then this partition interval is the same as two adjacent partition intervals of P^*, say [z_{i−1}, z_i] and [z_i, z_{i+1}], and the contribution of this interval to U(f, P) − U(f, P^*) is
M_j^P (y_j − y_{j−1}) − [M_i^{P*}(z_i − z_{i−1}) + M_{i+1}^{P*}(z_{i+1} − z_i)]
= M_j^P [(z_{i+1} − z_i) + (z_i − z_{i−1})] − [M_i^{P*}(z_i − z_{i−1}) + M_{i+1}^{P*}(z_{i+1} − z_i)]
= (M_j^P − M_i^{P*})(z_i − z_{i−1}) + (M_j^P − M_{i+1}^{P*})(z_{i+1} − z_i).
Since either M_j^P = M_i^{P*} or M_j^P = M_{i+1}^{P*}, at least one of these two terms will be zero and the contribution to U(f, P) − U(f, P^*) will be (M_j^P − M_i^{P*})(z_i − z_{i−1}) or (M_j^P − M_{i+1}^{P*})(z_{i+1} − z_i); in either case the contribution will be less than or equal to 2Mδ_1 (for example, |(M_j^P − M_i^{P*})(z_i − z_{i−1})| ≤ |M_j^P − M_i^{P*}| δ_1 ≤ (|M_j^P| + |M_i^{P*}|) δ_1 ≤ 2Mδ_1).

• If a partition interval of P, [y_{j−1}, y_j], contains two points of P′, then this partition interval is the same as three adjacent partition intervals of P^*. We play the same game—this time with three intervals of P^*—and find that the contribution to U(f, P) − U(f, P^*) is less than or equal to 2 · 2Mδ_1, where the boldface 2 indicates that there will be two terms contributing—still only one term drops out.

• etc. If a partition interval of P, [y_{j−1}, y_j], contains k points of P′, then this partition interval is the same as k + 1 adjacent partition intervals of P^* and will contribute less than or equal to k·2Mδ_1 to U(f, P) − U(f, P^*).

Thus we see that because each interior point of P′ contributes less than or equal to 2Mδ_1 to U(f, P) − U(f, P^*),
0 ≤ U(f, P) − U(f, P^*) = |U(f, P) − U(f, P^*)| ≤ (n − 1)2Mδ_1
or
U(f, P) ≤ U(f, P^*) + (n − 1)2Mδ_1 < U(f, P^*) + (n − 1)2M ε/(16Mn) < U(f, P^*) + ε/8.
If we combine this inequality with inequality (7.6.5), we get
U(f, P) − ∫_a^b f < 3ε/8.  (7.6.6)

We have derived inequality (7.6.6) very carefully. In a like manner we can show that there exists a δ_2 such that if gap(P) < δ_2, we get
∫_a^b f − L(f, P) < 3ε/8.  (7.6.7)
(To show that we are right—in that we claim that “we can show”—you might try deriving inequality (7.6.7).)
Take δ = min{δ_1, δ_2} and suppose we are given a partition of [a, b], P, such that gap(P) < δ—we then get both inequalities (7.6.6) and (7.6.7).
Because on any partition interval we have m_i ≤ f(ξ_i) ≤ M_i, we have L(f, P) ≤ S_n(f, P) ≤ U(f, P). The right half of this inequality along with inequality (7.6.6) gives S_n(f, P) < ∫_a^b f + 3ε/8 and the left half of the inequality along with inequality (7.6.7) gives ∫_a^b f − S_n(f, P) < 3ε/8; or |∫_a^b f − S_n(f, P)| < 3ε/8 < ε. Thus by Definition 7.6.1 f is Riemann integrable and (R)∫_a^b f = ∫_a^b f.

As we promised, the above proof is difficult. However it is especially neat because we are given the partition P′ and an inequality with respect to P′ by the hypothesis, and then we are given another partition P and want essentially the same inequality with respect to P. We do this by defining P^* and using P^* to pass the inequality from P′ to P.

HW 7.6.1 Suppose f : [0, 1] → R is integrable. Prove that
∫_0^1 f = lim_{n→∞} Σ_{i=1}^n f(i/n)(1/n).
Note also that lim_{n→∞} Σ_{i=0}^{n−1} f(i/n)(1/n) and lim_{n→∞} Σ_{i=1}^n f((2i − 1)/(2n))(1/n) are also equal to ∫_0^1 f.

HW 7.6.2 (a) Suppose f : [0, 1] → R and suppose lim_{n→∞} Σ_{i=1}^n f(i/n)(1/n) exists. Show that f is not necessarily integrable on [0, 1].
(b) Show also that neither of the other limits of sums considered in HW 7.6.1 will imply the integrability of f either.

7.7 Logarithm and Exponential Functions


We have been reasonably careful not to use logarithms yet because one of the logical ways to define a logarithm is to use the integral as a part of the definition (at least we haven’t used them often—surely we haven’t used them for anything important). We’ve also stayed away from exponentials where the exponent is anything other than a rational number, i.e. we have not allowed irrational exponents—and we want them and need them. Approximately half of the basic calculus books use this approach to define the logarithm and exponential functions—the books that are not referred to as “early transcendentals”. We make the following definition.
Definition 7.7.1 For x > 0 we define ln x = ∫_1^x (1/t) dt, which we call the logarithm of x.

We now want to develop some of the properties of the logarithm function—all


of which you probably already know. We begin with the following result.

Proposition 7.7.2 (a) The function ln : (0, ∞) → R is continuous on (0, ∞).
(b) The function ln is differentiable on (0, ∞), and (d/dx) ln x = 1/x.
(c) The function ln is strictly increasing.
(d) ln 1 = 0.

Proof: We are going to apply Propositions 7.5.1 and 7.5.2. Both of these propositions considered a function f defined on a closed interval [a, b]. For this result we must consider the function 1/t defined on (0, ∞). However, for any x_0 ∈ (0, ∞) we can consider [x_0/2, 2x_0] and apply Propositions 7.5.1 and 7.5.2 to see that f(x) = ln x is both continuous and differentiable at x = x_0, and (d/dx) ln x |_{x=x_0} = 1/x_0. Thus for any x ∈ (0, ∞), (d/dx) ln x = 1/x.
Since (d/dx) ln x = 1/x > 0 on (0, ∞), the function ln is strictly increasing by Corollary 6.3.6-(a).
And by Definition 7.4.7-(a) we see that ln 1 = ∫_1^1 (1/t) dt = 0.
We next need to show that the logarithm function just defined satisfies the basic properties that we all know logarithm functions are supposed to satisfy.

Proposition 7.7.3 For a, x ∈ (0, ∞) and r ∈ Q


(a) ln(ax) = ln a + ln x
(b) ln(a/x) = ln a − ln x
(c) ln xr = r ln x

Proof: Notice that the derivative found in part (b) of Proposition 7.7.2 along with the Chain Rule, Proposition 6.1.4, gives (d/dx) ln f(x) = f′(x)/f(x).
(a) We consider the expression ln(ax) where a ∈ (0, ∞) is some constant. Then (d/dx) ln(ax) = a/(ax) = 1/x. Also (d/dx)(ln a + ln x) = 0 + 1/x = 1/x. Since (d/dx) ln(ax) = (d/dx)(ln a + ln x) = 1/x, by Corollary 6.3.5 we know that ln(ax) = ln a + ln x + C where C is some constant. This last equality must be true for all x ∈ (0, ∞). We set x = 1 to see that ln a = ln a + ln 1 + C = ln a + C, or C = 0. Thus ln(ax) = ln a + ln x.
(b) We note that (d/dx) ln(a/x) = (x/a)(−a/x²) = −1/x and (d/dx)(ln a − ln x) = 0 − 1/x = −1/x. Thus ln(a/x) = ln a − ln x + C. If we set x = 1, we get ln a = ln a − ln 1 + C = ln a + C, or C = 0. Thus ln(a/x) = ln a − ln x.
(c) Since (d/dx) ln x^r = (1/x^r) r x^{r−1} = r/x and (d/dx) r ln x = r/x, ln x^r = r ln x + C. If we let x = 1, we see that C = 0 and hence ln x^r = r ln x. Note that we have only considered part (c) for r ∈ Q. This is because we have not defined x^r for r ∈ I—so surely we could not decide how to differentiate x^r for r ∈ I.
We next consider ln 2, which by using a calculator we know is approximately equal to 0.69—but we can’t use that. We note that ln 2 = ∫_1^2 (1/t) dt. We also note that on [1, 2] we have 1/t ≥ 1/2 (look at the graph of 1/t). Thus we know that ln 2 = ∫_1^2 (1/t) dt ≥ ∫_1^2 (1/2) dt = 1/2—and it is true that 0.69 ≥ 1/2.
Using this inequality we see that ln 2^n = n ln 2 ≥ n/2 and lim_{n→∞} ln 2^n = ∞. Then because the ln function is increasing, we know that lim_{x→∞} ln x = ∞. (Because lim_{n→∞} ln 2^n = ∞, for every R > 0 there exists an N ∈ R so that for any n > N, ln 2^n > R. Then for any R > 0 we can choose K = 2^{N+1}. Then because the ln function is increasing, for any x > K, ln x > R.)
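Since Definition 7.7.1 defines ln x as an integral, we can approximate it directly and watch the facts above appear numerically. The sketch below is our own illustration; `midpoint_integral` and `ln_via_integral` are names we introduce. It checks that ln 2 ≥ 1/2 and that ln 2³ = 3 ln 2.

```python
import math

def midpoint_integral(f, a, b, n=100000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def ln_via_integral(x):
    """Definition 7.7.1: ln x = int_1^x (1/t) dt for x > 0."""
    return midpoint_integral(lambda t: 1.0 / t, 1.0, x)

ln2 = ln_via_integral(2.0)
print(ln2)                               # about 0.6931, and indeed >= 1/2
print(ln_via_integral(8.0), 3 * ln2)     # ln 2^3 = 3 ln 2
```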
Likewise we want to show that lim_{x→0+} ln x = −∞. We first show that ln 2^{−n} = −n ln 2 ≤ −n/2. Part (b) below then follows using an argument similar to the one used for part (a). We then have the following result that will help us to understand the plot of the ln function.

Proposition 7.7.4 (a) lim_{x→∞} ln x = ∞
(b) lim_{x→0+} ln x = −∞.

If you look at a plot of the ln function—use your calculator—you will see that there is some x_0 ∈ (0, ∞) so that ln x_0 = 1. This can be proved by first noting that ln 1 = 0 and ln 2³ = 3 ln 2 ≥ 3(1/2) = 3/2. Then by the Intermediate Value Theorem, Theorem 5.4.1, we know that there exists x_0 ∈ (1, 8) such that ln x_0 = 1. We make the following definition.
208 7. Integration

Definition 7.7.5 The real number e is defined to be that value such that ln e =
1.

It should be reasonably clear that the same argument used above can be used to prove that for any y ∈ (0, ∞) there exists an x ∈ (1, ∞) such that ln x = y—for any such y use some n so that ln 2^n > y and then apply the IVT with ln on (1, 2^n). This implies that ln(1, ∞) = (0, ∞).
Likewise, we can apply the same approach to show that ln(0, 1] = (−∞, 0]—remember ln 1 = 0. Specifically, consider y_0 = −11. We note that ln 2^{−24} = −24 ln 2 ≤ −24(1/2) = −12 < −11 and ln 1 = 0 > −11, so we can apply the IVT to conclude that there exists some x_0 ∈ (2^{−24}, 1) such that ln x_0 = y_0 = −11. More generally, if you consider any y_0 ∈ (−∞, 0), we can choose an n such that ln 2^{−n} = −n ln 2 ≤ −n/2 < y_0 (and we do have ln 1 = 0 > y_0). We can apply the IVT to conclude that there exists x_0 ∈ (2^{−n}, 1) such that ln x_0 = y_0. This implies that ln(0, 1] = (−∞, 0]. Thus we have the following result.

Proposition 7.7.6 ln(0, ∞) is an interval—specifically ln(0, ∞) = (−∞, ∞).

At this time we assume that we know almost everything that we want to know about the logarithm function. We are now ready to move on to define the exponential function. Because we know by Proposition 7.7.2-(c) that the ln function is strictly increasing, we know that the ln function is one-to-one. Thus we know that the inverse of the ln function exists on ln(0, ∞) = (−∞, ∞), so we can make the following definition.

Definition 7.7.7 Define the exponential function, exp : (−∞, ∞) → (0, ∞), as exp(x) = ln^{−1} x.

We want to make it very clear that at this time there is no special relationship between the exponential function defined above and anything of the form a^x—we still don’t know what the latter expression means. However, we do have tools to help us look at the exp function. We can use either Proposition 5.4.11 or 5.4.12 to prove the following result.

Proposition 7.7.8 The function exp : (−∞, ∞) → (0, ∞) is continuous on R.

The next property we would like to investigate regarding the exponential function is differentiability. Hopefully we remember that in Section 6.3 we developed everything we need in Proposition 6.3.8. We have the following result.

Proposition 7.7.9 The function exp : (−∞, ∞) → (0, ∞) is differentiable at y_0 = ln x_0 for any y_0 ∈ (−∞, ∞), and
(d/dy) exp(y) |_{y=y_0} = 1 / [(d/dx) ln x |_{x=x_0}] = exp(y_0).  (7.7.1)

Proof: The domain of the function ln is an interval—I = (0, ∞). The function ln is one-to-one and continuous on I, x_0 is not an end point of I, ln is differentiable at x = x_0 and (d/dx) ln x |_{x=x_0} = 1/x_0 ≠ 0 for any x_0 ∈ (0, ∞). Note that since y_0 = ln x_0, then x_0 = ln^{−1} y_0 = exp(y_0). Thus by Proposition 6.3.8 we get
(d/dy) exp(y) |_{y=y_0} = 1 / [(d/dx) ln x |_{x=x_0}] = 1/(1/x_0) = x_0 = exp(y_0).

The exponential function will inherit other more basic properties from the
logarithm function. Some of these properties are included in the following propo-
sition.

Proposition 7.7.10 (a) exp(0) = 1
(b) exp(1) = e
(c) For r ∈ Q, exp(r) = e^r.

Proof: Parts (a) and (b) follow since ln 1 = 0 and ln e = 1, respectively. Remember that for r rational, e^r has been defined earlier. Part (c) follows from the fact that ln e^r = r ln e = r. Thus e^r = ln^{−1} r = exp(r).
We want to define some sort of exponential a^x that makes sense for all x ∈ R—specifically here we want to define e^x. Above we see that on the rationals e^r and exp(r) are the same—so we’re close. Thus we define the following.

Definition 7.7.11 For x ∈ R define e^x = exp(x).

We should emphasize that we are really only defining e^x for x ∈ I since it is already defined on Q. It’s acceptable to state it the way we do because by Proposition 7.7.10-(c), we know that for r ∈ Q they are the same anyway. We should also emphasize that by Definition 7.7.11 and Proposition 7.7.9 we have (d/dx) e^x = e^x.
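The identity (d/dx) e^x = e^x is easy to probe numerically: a symmetric difference quotient of the exponential should reproduce the function itself. This is only a sanity check of the derivative formula, our own sketch using Python's math.exp in place of our exp.

```python
import math

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-1.0, 0.0, 0.5, 2.0):
    # the derivative of exp at x should match exp(x) itself
    print(x, central_diff(math.exp, x), math.exp(x))
```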
One of the very important results follows immediately because of the basic function-inverse function identity, Definition 5.4.7.

Proposition 7.7.12 (a) e^{ln x} = x for x > 0
(b) ln e^x = x for x ∈ R

There are, of course, some very basic properties that we want exponentials to satisfy. In Section 5.6, Proposition 5.6.6, we showed that for r, s ∈ Q we have x^r x^s = x^{r+s} and (x^r)^s = x^{rs}. We want and need e^{x_1} e^{x_2} = e^{x_1+x_2} and (e^{x_1})^{x_2} = e^{x_1 x_2} for x_1, x_2 ∈ R. We have the following.

Proposition 7.7.13 For x_1, x_2 ∈ R we have
(a) e^{x_1} e^{x_2} = e^{x_1+x_2} and
(b) (e^{x_1})^{x_2} = e^{x_1 x_2}.

Proof: This proposition could be proved using the same approach that we used to prove Proposition 7.7.3. Instead of using that approach we will show how these properties follow from results proved in Proposition 7.7.3.
(a) Suppose y_1 and y_2 are such that y_1 = e^{x_1} and y_2 = e^{x_2}—then also x_1 = ln y_1 and x_2 = ln y_2. Then x_1 + x_2 = ln y_1 + ln y_2 = ln(y_1 y_2) by Proposition 7.7.3-(a). Then taking the exponential of both sides gives e^{x_1+x_2} = e^{ln(y_1 y_2)} = y_1 y_2 = e^{x_1} e^{x_2}.
(b) In a similar way we note that x_1 x_2 = x_2 ln e^{x_1} = ln (e^{x_1})^{x_2}. Then taking the exponential of both sides yields e^{x_1 x_2} = (e^{x_1})^{x_2}.
Note that the expression “taking the exponential of both sides” is perfectly
legal. We have two equal values, x1 + x2 = ln(y1 y2 ) in part (a). Hence from the
definition of a function, if we evaluate the exponential function at these equal
values, we can expect to get equal outputs.
We do want and need more general exponentials. To accomplish this we make the following definition.
Definition 7.7.14 For a > 0 and x ∈ R we define a^x = e^{x ln a}.

We next would have to state and prove all of the relevant properties related to the function a^x. We want at least the following properties: a^0 = 1, a^1 = a, a^{x_1} a^{x_2} = a^{x_1+x_2}, (a^{x_1})^{x_2} = a^{x_1 x_2} and (d/dx) a^x = a^x ln a. We will not prove these properties, but you should be able to see that they follow easily from the analogous properties for the exponential e^x.
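As a numerical sketch of the definition a^x = e^{x ln a} and of the derivative property above (our own check, not a proof; the function name `ax` is ours), we can implement a^x directly and compare a difference quotient of a^x with a^x ln a.

```python
import math

a = 3.0

def ax(x):
    """Definition 7.7.14: a^x = e^{x ln a} for a > 0."""
    return math.exp(x * math.log(a))

print(ax(0.0), ax(1.0), ax(2.0))   # a^0 = 1, a^1 = 3, a^2 = 9

# d/dx a^x = a^x ln a, checked with a symmetric difference quotient
x0, h = 0.7, 1e-6
deriv = (ax(x0 + h) - ax(x0 - h)) / (2 * h)
print(deriv, ax(x0) * math.log(a))   # the two values agree
```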
And finally we want one last very important function defined, x^r for r ∈ R and x ∈ (0, ∞). The function x^r is already defined, x^r = e^{r ln x}. Of course for certain values of r (many values of r) we can actually define x^r for any x ∈ R (x², x³, x^{2/3}, etc)—but for many values of r (at least r = 1/2, r = π, etc) x^r just doesn’t make any sense for x < 0. And clearly the definition x^r = e^{r ln x} only makes sense for positive x.
The most basic properties of x^r follow from the properties of the exponential and logarithm. We note that because ln x^r = ln e^{r ln x} = r ln x, we see that now (essentially by definition) ln x^r satisfies the property given in Proposition 7.7.3-(c)—this time for any r ∈ R (instead of only r ∈ Q).
The property that we need badly is the derivative property. We already have that (d/dx) x^r = r x^{r−1} for r ∈ Q. We also have the extension of this result.
Proposition 7.7.15 For r ∈ R and x ∈ (0, ∞), (d/dx) x^r = r x^{r−1}.
Proof: We note that (d/dx) x^r = (d/dx) e^{r ln x} = e^{r ln x} (r/x) = x^r (r/x) = r x^{r−1}, which is the desired result.
HW 7.7.1 (True or False and why)
(a) ln 2^n ≥ n/2 implies that lim_{n→∞} ln 2^n = ∞.
(b) If ln 2x + 3 ln 8x = 1, then x = 1/10.
(c) For x > 0, (d/dx) x^x = (1 + ln x) x^x.
(d) The function sin x is defined for x ∈ [0, 2π].
(e) exp(ln x) = x for x > 0.

HW 7.7.2 Let f(θ) = cos θ. (a) Show that f is not one-to-one. Restrict f in a way so that the restriction, f_r, is one-to-one.
(b) Prove the existence of f_r^{−1}, prove the continuity of f_r^{−1} and compute (d/dx) f_r^{−1}(x), i.e. compute (d/dx) cos^{−1} x.

7.8 Improper Integrals


Two important assumptions made as a part of the definition of the integral were that the functions were bounded and the interval was finite—and it’s easy to see that for many of the integration results proved, these were important assumptions. However, there are many times that we want or need some sort of integral of an unbounded function or some sort of integral over an infinite interval. In this section we introduce an extension of the integral to the improper integral—an integral that allows for unbounded functions and/or infinite intervals. We want to emphasize that the integral considered in this section is not the Darboux-Riemann integral considered in the rest of the chapter—but the improper integral is a very important and useful extension of the Darboux-Riemann integral.
We want a definition for integrals ∫_a^b f where f may be unbounded at a, at b or at some c ∈ (a, b). Likewise we want integrals of the form ∫_a^b f where b is infinity, a is minus infinity or both. We do this by considering each possibility separately. To do this we make the following definition.

Definition 7.8.1 (a) Suppose that f : (a, b] → R is such that f is integrable on [c, b] for any c ∈ (a, b]. Suppose further that lim_{c→a+} ∫_c^b f exists. Define ∫_a^b f = lim_{c→a+} ∫_c^b f.
(b) Suppose that f : [a, b) → R is such that f is integrable on [a, c] for any c ∈ [a, b). Suppose further that lim_{c→b−} ∫_a^c f exists. Define ∫_a^b f = lim_{c→b−} ∫_a^c f.
(c) Suppose that f : [a, ∞) → R is such that f is integrable on [a, c] for any c ∈ [a, ∞). Suppose further that lim_{c→∞} ∫_a^c f exists. Define ∫_a^∞ f = lim_{c→∞} ∫_a^c f.
(d) Suppose that f : (−∞, b] → R is such that f is integrable on [c, b] for any c ∈ (−∞, b]. Suppose further that lim_{c→−∞} ∫_c^b f exists. Define ∫_{−∞}^b f = lim_{c→−∞} ∫_c^b f.
(e) Suppose that f : [a, c) ∪ (c, b] → R is such that ∫_a^c f and ∫_c^b f exist and are finite. Define ∫_a^b f = ∫_a^c f + ∫_c^b f.
(f) Suppose that f : R → R is such that f is integrable on [−c_1, c_1] for every c_1 > 0. Define ∫_{−∞}^∞ f = ∫_{−∞}^c f + ∫_c^∞ f for any c ∈ R.

When we require that the limits above exist, we first consider that they exist in R. In this case we can easily develop a whole boatload of properties for the improper integral. We assume that you understand which properties you would want (or just return to the previous sections of this chapter and decide which of the properties will still be true for the improper integral) and that you could prove most of these properties if you wanted them—usually using a combination of integration results and limit results. We are not going to do this here. We wanted to introduce the improper integral because it is an important concept, and we wanted to emphasize that the improper integral is not part of the usual definition of the Darboux-Riemann integral.
It is possible to allow the limits above to be ±∞. There’s nothing wrong with allowing infinite values, but especially in parts (e) and (f) of the above definition you must be careful with your “infinite arithmetic.” Also in this case you must be careful about the “obvious” properties of the integral. Again you must be careful not to perform any illegal operations with ±∞. Other than that, all is fine.
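Definition 7.8.1-(c) suggests a direct numerical experiment (our own sketch; `midpoint_integral` is a helper name we introduce): the proper integrals ∫_1^c x^{−2} dx equal 1 − 1/c, so they tend to 1 as c → ∞, and that limit is the value of the improper integral ∫_1^∞ x^{−2} dx.

```python
def midpoint_integral(f, a, b, n=100000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# int_1^c x^{-2} dx = 1 - 1/c, which tends to 1 as c -> infinity.
vals = [midpoint_integral(lambda x: 1.0 / (x * x), 1.0, c)
        for c in (10.0, 100.0, 1000.0)]
print(vals)   # approaches 1

# By contrast, int_1^c (1/x) dx = ln c grows without bound,
# so the improper integral of 1/x over [1, infinity) does not exist.
```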
Chapter 8

Sequences and Series

8.1 Approximation by Taylor Polynomials


The functions e^x, sin x, 1/√(1 − x²), etc. are nice functions—especially when you are using a calculator or computer—but they are not as nice as polynomials. Specifically, polynomials can be evaluated completely based on multiplication, subtraction and addition. Thus when you build your computer, if you teach it how to multiply, subtract and add, your computer can also evaluate polynomials. These other functions are not that simple (even division creates problems). The way that most computers evaluate the more complex functions is to approximate them by polynomials.
There are many other applications where it is useful to have a polynomial
approximation to a function. Generally, polynomials are just easier to use. In
this section we will show one way to obtain a polynomial approximation of a
function. The approximation will include the error term which is extremely
important since we must know that our approximation is a sufficiently good
approximation—how good depends on our application. The main tool that we
will use is integration by parts, Proposition 7.5.6. We will use integration by parts in the form ∫_a^b F′(t)G(t) dt = [FG]_a^b − ∫_a^b F(t)G′(t) dt, where we see that it is convenient to include the variable of integration explicitly because we will have two variables in our formulas.
We consider a function f and desire to find a polynomial approximation of
f near x = a. At this time we will not worry about the necessary assumptions
on our function—they will be included when we state our proposition. We
begin by noting that by the Fundamental Theorem of Calculus, Theorem 7.5.4,
∫_a^x f′(t) dt = f(x) − f(a), or
f(x) = f(a) + ∫_a^x f′(t) dt.  (8.1.1)
We write expression (8.1.1) as f(x) = T0(x) + R0(x) where T0(x) = f(a) and R0(x) = ∫_a^x f′(t) dt. T0 is referred to as the zeroth order Taylor polynomial of the function f about x = a and R0 is the zeroth order error term—of course the trivial case—and generally T0 would not be a very good approximation of f.
We obtain the next order of approximation by integrating ∫_a^x f′(t) dt by parts. We let G(t) = f′(t) and F′(t) = 1. Then G′(t) = f″(t) and F(t) = t − c.
You should take note of the last step carefully. The dummy variable in the integral ∫_a^x f′(t) dt is t. Hence, if you were to integrate by parts without being especially clever (or even sneaky), you would say that F(t) = t. However, there is no special reason that you could not use F = t + 1 or F = t + π instead. The only requirement is that the derivative of F with respect to t must be 1. Since the integration (and hence, the differentiation) is with respect to t, x is a constant with respect to this operation (no different from 1, π or c). Since this is the choice we want, it is perfectly OK to set F(t) = t − x. Then application of integration by parts gives
∫_a^x f′(t) dt = ∫_a^x F′(t)G(t) dt = [F G]_a^x − ∫_a^x F(t)G′(t) dt
= 0 − (a − x)f′(a) − ∫_a^x (t − x)f″(t) dt = (x − a)f′(a) − ∫_a^x (t − x)f″(t) dt.

If we plug this result into (8.1.1), we get

f(x) = f(a) + (x − a)f′(a) − ∫_a^x (t − x)f″(t) dt.    (8.1.2)

Expression (8.1.2) can be written as f(x) = T1(x) + R1(x) where T1(x) = f(a) + (x − a)f′(a) is the first order Taylor polynomial of f at x = a and R1(x) = ∫_a^x (x − t)f″(t) dt is the first order error term. It should be clear that
T1 is a reasonably good approximation of f near x = a. The equation y = T1 (x)
is the equation of the line tangent to y = f (x) at x = a.
If we continue in the same fashion, we obtain the following result.
Proposition 8.1.1 Suppose f : I → R where I is some open interval containing a and f is n + 1 times continuously differentiable on I. Then for x ∈ I, f can be written as

f(x) = Tn(x) + Rn(x)    (8.1.3)

where

Tn(x) = Σ_{k=0}^{n} (1/k!) f^(k)(a)(x − a)^k    (8.1.4)

and

Rn(x) = (1/n!) ∫_a^x (x − t)^n f^(n+1)(t) dt.    (8.1.5)

The polynomial Tn is called the nth order Taylor polynomial of f about x = a and Rn is called the nth order Taylor error term.
Proof: We apply mathematical induction.
Step 1: Equations (8.1.3)–(8.1.5) are true for n = 1 (by the derivation preceding this proposition).
Step 2: Assume that equations (8.1.3)–(8.1.5) are true for n = m, i.e. assume that f can be written as f(x) = Tm(x) + Rm(x) where Tm(x) = Σ_{k=0}^{m} (1/k!) f^(k)(a)(x − a)^k and Rm(x) = (1/m!) ∫_a^x (x − t)^m f^(m+1)(t) dt.
Step 3: We now prove that equations (8.1.3)–(8.1.5) are true for n = m + 1. We integrate the expression Rm by parts, letting G(t) = f^(m+1)(t) and F′(t) = (1/m!)(x − t)^m, and get

(1/m!) ∫_a^x (x − t)^m f^(m+1)(t) dt = [−(1/(m + 1)!)(x − t)^{m+1} f^(m+1)(t)]_{t=a}^{t=x} − ∫_a^x (−(1/(m + 1)!)(x − t)^{m+1}) f^(m+2)(t) dt
= (1/(m + 1)!)(x − a)^{m+1} f^(m+1)(a) + (1/(m + 1)!) ∫_a^x (x − t)^{m+1} f^(m+2)(t) dt.
Thus

Rm(x) = (1/(m + 1)!)(x − a)^{m+1} f^(m+1)(a) + (1/(m + 1)!) ∫_a^x (x − t)^{m+1} f^(m+2)(t) dt

and we can write

f(x) = Tm(x) + Rm(x) = [Tm(x) + (1/(m + 1)!)(x − a)^{m+1} f^(m+1)(a)] + (1/(m + 1)!) ∫_a^x (x − t)^{m+1} f^(m+2)(t) dt
or f (x) = Tm+1 (x) + Rm+1 (x). Therefore equations (8.1.3)–(8.1.5) are true for
n = m + 1.
Therefore equations (8.1.3)–(8.1.5) are true for all n by mathematical induction.
We note that if we choose a = 0 we obtain the following special case which
is very common.
Proposition 8.1.2 Suppose f : I → R where I is some open interval contain-
ing 0 and f is n + 1 times continuously differentiable on I. Then for x ∈ I, f
can be written as
f (x) = Tn (x) + Rn (x) (8.1.6)

where

Tn(x) = Σ_{k=0}^{n} (1/k!) f^(k)(0) x^k    (8.1.7)

and

Rn(x) = (1/n!) ∫_0^x (x − t)^n f^(n+1)(t) dt.    (8.1.8)

We can consider the function f(x) = e^x and easily obtain an expression for the Taylor polynomial for f about x = 0.
Example 8.1.1 Obtain the Taylor polynomial and error term for f(x) = e^x about x = 0.
Solution: It is easy to see that for any n, f^(n)(0) = 1. Then we can write Tn(x) = Σ_{k=0}^{n} (1/k!) x^k and Rn(x) = (1/n!) ∫_0^x (x − t)^n e^t dt.
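For readers who like to experiment, the polynomial above is easy to evaluate numerically. The following is an editorial sketch (added in editing, not part of the text; the function name taylor_exp is ours):

```python
import math

def taylor_exp(x, n):
    """Evaluate T_n(x) = sum_{k=0}^{n} x**k / k! for f(x) = e^x about 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

# The error |e^x - T_n(x)| shrinks quickly as n grows.
for n in (2, 5, 10):
    print(n, abs(math.exp(1.0) - taylor_exp(1.0, n)))
```

The rapid decay of the error is exactly what the factorial in Rn suggests.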

Example 8.1.2 Consider the function f(x) = 1/(x + 1). Compute Taylor polynomials and error terms for f about x = 2 for n = 4 and for general n.
Solution: We begin by making a table for derivatives of f at x = 2.

n      f^(n)(x)                              f^(n)(2)
0      (x + 1)^(−1)                          3^(−1)
1      −(x + 1)^(−2)                         −3^(−2)
2      2!(x + 1)^(−3)                        2!·3^(−3)
3      −3!(x + 1)^(−4)                       −3!·3^(−4)
4      4!(x + 1)^(−5)                        4!·3^(−5)
5      −5!(x + 1)^(−6)                       −5!·3^(−6)
n      (−1)^n n!(x + 1)^(−(n+1))             (−1)^n n!·3^(−(n+1))
n+1    (−1)^{n+1}(n + 1)!(x + 1)^(−(n+2))

It is then easy to see that T4(x) = 1/3 − (1/9)(x − 2) + (1/27)(x − 2)^2 − (1/81)(x − 2)^3 + (1/243)(x − 2)^4 and R4(x) = −5 ∫_2^x (x − t)^4 (t + 1)^(−6) dt; and Tn(x) = Σ_{k=0}^{n} (−1)^k (1/3^{k+1})(x − 2)^k and Rn(x) = (−1)^{n+1}(n + 1) ∫_2^x (x − t)^n (t + 1)^(−(n+2)) dt.
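A quick numerical check of T4 (an editorial Python sketch, not part of the text; the name taylor_inv is ours) confirms that the polynomial tracks f near x = 2:

```python
def taylor_inv(x, n):
    """T_n for f(x) = 1/(x + 1) about 2: sum_{k=0}^{n} (-1)**k (x - 2)**k / 3**(k + 1)."""
    return sum((-1)**k * (x - 2)**k / 3**(k + 1) for k in range(n + 1))

x = 2.3
print(abs(1.0 / (x + 1) - taylor_inv(x, 4)))  # small, since x is near 2
```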

The title of this section was Approximation by Taylor Polynomials. The


function Tn does an especially good job of approximating f at x = a since Tn and the first n derivatives of Tn evaluated at x = a give f(a) and the first n derivatives of f evaluated at x = a. For Tn to provide an approximation of f
for values of x other than x = a, it is clear that Rn will have to be small. If we
think about what a polynomial looks like, it is clear that a polynomial cannot
approximate a general function everywhere. To see how well Tn approximates
f , you might plot some of the Taylor polynomials found in Examples 8.1.1 and
8.1.2 along with the given functions. The best that we can hope for is that Tn approximates f near x = a—we show this with the following result, which we refer to as the Taylor Inequality.

Proposition 8.1.3 (Taylor Inequality) Suppose f : I → R where I is some


open interval containing a, f is n + 1 times continuously differentiable on I and
[a − r, a + r] ⊂ I for some particular r. Suppose further that there exists M such
that |f (n+1) (x)| ≤ M for x ∈ I. Then
|Rn(x)| ≤ (M/(n + 1)!)|x − a|^{n+1} for x ∈ I    (8.1.9)

or |Rn(x)| ≤ (M/(n + 1)!) r^{n+1} for x ∈ [a − r, a + r].

Proof: We note that by Proposition 7.4.8-(c) we get

|Rn(x)| = |(1/n!) ∫_a^x (x − t)^n f^(n+1)(t) dt| ≤ (1/n!) |∫_a^x |(x − t)^n f^(n+1)(t)| dt|.

Using our hypothesis on f and Proposition 7.4.8-(b) we get

|Rn(x)| ≤ (M/n!) |∫_a^x |(x − t)^n| dt|.

(To obtain this last result we must be careful. When x ≥ a, everything is positive and the statement is true without the outside absolute value signs. When x < a, by Proposition 7.4.8-(b) we get ∫_a^x |(x − t)^n f^(n+1)(t)| dt ≥ M ∫_a^x |(x − t)^n| dt. Because these two integrals are negative, we get

|∫_a^x |(x − t)^n f^(n+1)(t)| dt| ≤ M |∫_a^x |(x − t)^n| dt|.)

Next we must compute |∫_a^x |(x − t)^n| dt|—carefully. Probably the easiest way is to consider x ≥ a and show that

∫_a^x |(x − t)^n| dt = ∫_a^x (x − t)^n dt = (x − a)^{n+1}/(n + 1) = |x − a|^{n+1}/(n + 1).

Then consider x < a and show that

∫_a^x |(x − t)^n| dt = ∫_a^x (t − x)^n dt = −(a − x)^{n+1}/(n + 1) = −|x − a|^{n+1}/(n + 1).

In either case |∫_a^x |(x − t)^n| dt| = |x − a|^{n+1}/(n + 1), and we get

|Rn(x)| ≤ (M/(n + 1)!)|x − a|^{n+1} ≤ (M/(n + 1)!) r^{n+1}.

We should note that the result of Proposition 8.1.3, equation (8.1.9), can also be expressed as |f(x) − Tn(x)| ≤ (M/(n + 1)!) r^{n+1} for x ∈ [a − r, a + r]. This expression makes it extremely clear how well Tn approximates f.
The (n + 1)! in the denominator is one part of the above result that makes the error small on [a − r, a + r]. Also, if r is small, then r^{n+1} makes Rn small. Consider the following example that is based on Example
8.1.1.

Example 8.1.3 Return to Example 8.1.1.
(a) Find the Taylor polynomial approximation of f(x) = e^x associated with n = 3. Apply the Taylor inequality, Proposition 8.1.3, with r = 3 to obtain an error bound on [−3, 3] for this approximation.
(b) Repeat part (a) with r = 0.1.
(c) Repeat part (a) with n = 27 and r = 3.
Solution: (a) We see that if we choose r = 3 and n = 3, then M = e^3 ≈ 20.09, T3(x) = 1 + x + (1/2)x^2 + (1/6)x^3 and by Proposition 8.1.3 |R3(x)| = |e^x − T3(x)| ≤ (e^3/24)·3^4 ≈ 67.79 on [−3, 3]. This is not very good.
(b) If instead we choose r = 0.1 and n = 3, then M = e^{0.1} ≈ 1.11, T3 is the same and |R3(x)| = |e^x − T3(x)| ≤ (e^{0.1}/24)·0.1^4 ≈ 4.60·10^{−6} on [−0.1, 0.1]. This is a very good result.
(c) If we want r = 3, we can choose n = 27 (or some other insanely large n), not write out T27, and see that |R27(x)| ≤ (e^3/28!)·3^{28} ≈ 1.51·10^{−15}. So if we especially want a large interval, it is possible to find a sufficiently high order Taylor polynomial that will approximate f(x) = e^x.
Note that it is clear that if we make n large enough, the bound on the error will get small—lim_{n→∞} 3^n/n! = 0 by Example 3.5.2.
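The numbers in part (a) are easy to reproduce. The following editorial sketch (added in editing; the helper names T and bound are ours) computes the bound and checks that the actual error respects it:

```python
import math

def T(x, n):
    """nth Taylor polynomial of e^x about 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def bound(M, r, n):
    """The Taylor inequality bound M * r**(n + 1) / (n + 1)!."""
    return M * r**(n + 1) / math.factorial(n + 1)

b = bound(math.exp(3), 3, 3)       # part (a): about 67.79
err = abs(math.exp(3) - T(3, 3))   # actual error at x = 3
print(b, err)
```

As the bound guarantees, err never exceeds b, though the bound can be quite pessimistic.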

Thus we see that we can approximate e^x well with a low order Taylor polynomial on a small interval (with r small). It may not be very nice, but we also see that if for some reason we want or need a large interval, we can use a high order Taylor polynomial to approximate e^x on the large interval.
Likewise we can revisit the function considered in Example 8.1.2, f(x) = 1/(x + 1). We clearly have to choose r so that −1 ∉ [2 − r, 2 + r]. If we choose r = 1 and n = 4, then we may take M = 5!·1^{−6} = 120, and |R4(x)| ≤ (120/5!)·1^5 = 1 on the interval [1, 3]—again not very good. If we instead choose r = 0.5 and n = 4, then we may take M = 5!·1.5^{−6} ≈ 10.53, and |R4(x)| ≤ (10.53/5!)·0.5^5 ≈ 2.74·10^{−3} on the interval [1.5, 2.5]. This is a better result.
We see that in this case if r is a bit larger (1 or larger), M gets large—large enough that the (n + 1)! in the denominator of (8.1.9) doesn't help make R4 small. And of course, if r ≥ 1, the r^{n+1} term doesn't help make R4 small either.

HW 8.1.1 (True or False and why)


(a) If n is sufficiently large and r is sufficiently small (but > 0), then Tn (x) =
f (x) on [a − r, a + r].
(b) On any interval [a−r, a+r] the derivative f (n) (x) gets small as n gets larger.
(c) A sufficient hypothesis for Proposition 8.1.1 is that each of the functions f^(k) be integrable, k = 1, · · · , n + 1.
(d) If f is a fourth degree polynomial, then T4 (x) = f (x) for all x ∈ R and
R4 (x) = 0 for all x ∈ R.
(e) If Rn (x) = 0 for all x ∈ R and some n ∈ N, then f is a polynomial.

HW 8.1.2 Begin with f expressed as f (x) = T1 (x) + R1 (x) as in equation


(8.1.2). Derive T2 (x) and R2 (x)—of course such that f (x) = T2 (x) + R2 (x).

HW 8.1.3 Consider the function f (x) = sin x. (a) Compute the Taylor poly-
nomial and error term about x = 0 for n = 4 and for a general n.
(b) Apply the Taylor inequality, Proposition 8.1.3, on [−1, 1] to determine a
bound on the error for both cases.
(c) Use the result from part (b) for general n to determine an n0 such that
| sin x − Tn (x)| ≤ 1.0 · 10−10 for all x ∈ [−1, 1].

8.2 Sequences and Series


Convergence of sequences of functions In Proposition 8.1.3 we see that if the function f is defined on [a − r, a + r], then |f(x) − Tn(x)| ≤ (M/(n + 1)!) r^{n+1} where we worked to find M, n and r so that (M/(n + 1)!) r^{n+1} is small. How small?
It depends on how accurately we want to approximate f .
Hopefully the inequality above reminds you of convergence of sequences. If we return to the function f(x) = e^x and the sequence of Taylor polynomials found in Example 8.1.1, choose r = 3 and plot f along with a number of the Tn's for different n's, it is clear that Tn converges to f by the English definition of "converges". For a fixed x ∈ [−3, 3], since by the Taylor inequality −(e^3/(n + 1)!)·3^{n+1} ≤ e^x − Tn(x) ≤ (e^3/(n + 1)!)·3^{n+1} and lim_{n→∞} 3^{n+1}/(n + 1)! = 0 (Example 3.5.2), by the Sandwich Theorem, Proposition 3.4.2, we know that lim_{n→∞} [e^x − Tn(x)] = 0, or Tn(x) → f(x) = e^x—for fixed x ∈ [−3, 3].
We formalize the concept of a sequence of functions converging to a given function with the following definition.
Definition 8.2.1 Suppose f, fn : D → R for D ⊂ R, n = 1, 2, · · · . If for each x ∈ D the limit lim_{n→∞} fn(x) exists and equals f(x), then we say that the sequence {fn} converges pointwise to f on D. We write fn → f.

We have defined pointwise convergence of a sequence of functions. There are other types of convergence—we will include uniform convergence later. When there is no doubt that the convergence is pointwise, the "pointwise" will often be omitted.
There is an abundance of easy, important sequences of functions. Consider the following examples.
Example 8.2.1 Define f1n, f1 : D = [0, 1] → R for n ∈ N by f1n(x) = x^n and f1(x) = 0 for 0 ≤ x < 1, f1(x) = 1 for x = 1. Show that f1n → f1 pointwise.
Solution: We note that
• since f1n(0) = 0 for all n, then f1n(0) → 0 = f1(0),
• for 0 < x < 1, since lim_{n→∞} x^n = 0 by Example 3.5.1, f1n(x) → 0 = f1(x), and
• since f1n(1) = 1 for all n, then f1n(1) → 1 = f1(1).
Thus f1n → f1 pointwise on [0, 1].
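The behavior in Example 8.2.1 is easy to observe numerically. The following is an editorial sketch (not part of the text): for fixed x < 1 the values x^n die out, while at x = 1 they do not.

```python
def f1n(x, n):
    """The nth function in the sequence, f_{1n}(x) = x**n on [0, 1]."""
    return x**n

# For large n the values at x < 1 are essentially 0; at x = 1 they stay 1.
for x in (0.0, 0.5, 0.9, 1.0):
    print(x, f1n(x, 200))
```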

Example 8.2.2 Define f2n, f2 : [0, 1] → R for n ∈ N by f2n(x) = x^n/n and f2(x) = 0 for x ∈ [0, 1]. Show that f2n → f2 pointwise on [0, 1].
Solution: For any x ∈ [0, 1], lim_{n→∞} x^n/n = 0—thus f2n → f2 pointwise on [0, 1].

Example 8.2.3 Define f3n, f3 : [0, 1] → R for n ∈ N by f3n(x) = nx/(1 + n^2 x^2) and f3(x) = 0 for x ∈ [0, 1]. Show that f3n → f3 pointwise on [0, 1].
Solution: Since f3n(0) = 0 for all n, then f3n(0) → 0 = f3(0). For x satisfying 0 < x ≤ 1, lim_{n→∞} nx/(1 + n^2 x^2) = 0 = f3(x). Thus f3n → f3 pointwise on [0, 1].

Series We started this discussion talking about the manner in which the Taylor polynomial associated with f, Tn, converges to f. Specifically, let Tn denote the Taylor polynomial associated with f(x) = e^x and consider the domain D = [−3, 3]. Earlier we found that lim_{n→∞} Tn(x) = e^x for x ∈ [−3, 3]. Thus the sequence {Tn} converges to e^x pointwise on D = [−3, 3].
It should be clear that {Tn} is a different sort of sequence from {f1n}, {f2n} and {f3n} defined above. Recall that the sequence of Taylor polynomials Tn associated with f(x) = e^x is given by Tn(x) = Σ_{k=0}^{n} (1/k!) x^k. All Taylor polynomials look similar—given as a sum of n + 1 terms. When we take the limit as n approaches ∞, we are computing an infinite sum. We want to understand what we mean by Σ_{k=0}^{∞} (1/k!) x^k. Sums such as these are referred to as series of functions. To provide a logical setting to discuss series of functions we introduce series of real numbers.
For a sequence {a1, a2, · · · } where ai ∈ R for all i = 1, 2, · · · , we want to discuss what we mean by Σ_{i=1}^{∞} ai, the sum of an infinite number of real numbers. We define the partial sums of {ai} by sn = Σ_{i=1}^{n} ai for n ∈ N and consider the sequence of partial sums, {sn}.

Definition 8.2.2 Consider the real sequence {ai} and the associated sequence of partial sums {sn}, sn = Σ_{i=1}^{n} ai. If the sequence {sn} is convergent, say to s, we say that the series Σ_{i=1}^{∞} ai converges and we define Σ_{i=1}^{∞} ai = s = lim_{n→∞} sn. We refer to Σ_{i=1}^{∞} ai as an infinite series, or just a series. If the sequence {sn} does not converge, we say that the series Σ_{i=1}^{∞} ai does not converge. If sn → ±∞, we say that the series Σ_{i=1}^{∞} ai diverges to ±∞, respectively—but make sure you understand that a series that diverges to ±∞ does not converge in R.

Consider the following example.



Example 8.2.4 Consider the real series Σ_{i=1}^{∞} ai where ai = r^i for some r ∈ R. Then the series Σ_{i=1}^{∞} ai converges if and only if |r| < 1.
Solution: Recall that in Example 1.6.1 we showed that the formula for the sum of a finite geometric series is given by Σ_{j=0}^{n} R^j = (1 − R^{n+1})/(1 − R) (where r was changed to R for convenience). Applying this formula to the series given above gives the formula for the partial sum sn = Σ_{i=1}^{n} r^i = r(1 − r^n)/(1 − r). If r = 1, we use the fact that sn = Σ_{i=1}^{n} r^i = Σ_{i=1}^{n} 1 = n to see that the sequence {sn} diverges to infinity. When r ≠ 1, lim_{n→∞} sn exists if and only if lim_{n→∞} r^n exists. By Examples 3.2.6, 3.5.1, 3.6.2 and the discussion following Example 3.6.2 we know that lim_{n→∞} r^n exists if and only if |r| < 1.
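As a numerical illustration (an editorial sketch, not from the text), the partial sums sn = Σ_{i=1}^{n} r^i for |r| < 1 settle down to r/(1 − r):

```python
def partial_sum(r, n):
    """s_n = sum_{i=1}^{n} r**i, computed directly from the definition."""
    return sum(r**i for i in range(1, n + 1))

# For r = 1/2 the limit r/(1 - r) is 1; for r = 0.9 it is 9.
print(partial_sum(0.5, 50), partial_sum(0.9, 200))
```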

The geometric series is very nice, but it is almost the only series for which we can write and work explicitly with the sequence of partial sums (telescoping series give one more example).
When we consider the convergence of a series, it is sometimes useful to realize that when we are showing that sn → s, where s is to be the sum of the series Σ_{i=1}^{∞} ai, we must consider s − sn—as in |s − sn| < ε. And s − sn = Σ_{i=n+1}^{∞} ai. Thus to show that a series converges, we must show that the sum of the "tail end" of the series is arbitrarily small.
And finally, one other approach that is extremely useful when working with the convergence of series is to use the Cauchy criterion for the convergence of the sequence {sn} introduced in Section 3.4. Recall that when we discussed the Cauchy criterion, we noted that it was a case where we did not need to know the limit of the sequence. This is especially convenient when we are working with series in that we hardly ever know or can guess the sum of the series. We include the application of the Cauchy criterion to the convergence of series in the following proposition.

Proposition 8.2.3 Consider the real sequence {ai}. The series Σ_{i=1}^{∞} ai converges if and only if for every ε > 0 there exists N ∈ R such that m, n ∈ N, m ≥ n and m, n > N implies that |Σ_{i=n}^{m} ai| < ε.

Proof: This result follows from Proposition 3.4.11 in that {sn} is convergent if and only if the sequence {sn} is a Cauchy sequence. The sequence {sn} is a Cauchy sequence if for every ε > 0 there exists an N ∈ R such that n, m ∈ N and n, m > N implies |sm − sn| < ε. This can easily be adjusted by setting N* = N + 1 and requiring m, n > N*, which implies that |sm − s_{n−1}| < ε. If we take m ≥ n (one of the two must be larger), then sm − s_{n−1} = Σ_{i=n}^{m} ai. The result follows.
We used the example of convergence of Taylor polynomials to motivate the
convergence of series. We now realize that the convergence of Taylor polynomials is really the convergence of a series, a Taylor series. For that reason (and the
fact that it is an important concept) we now define what we mean by the
pointwise convergence of a series of functions.

Definition 8.2.4 Consider the sequence of functions {fi(x)} where for each i, fi : D → R, D ⊂ R. If for each x ∈ D the real series Σ_{i=1}^{∞} fi(x) is convergent, say to s(x), then we say that the series of functions Σ_{i=1}^{∞} fi(x) converges pointwise to s(x).

Begin by noting that the notation used above is not very good. At the function level it would be better to say that the series of functions Σ_{i=1}^{∞} fi converges pointwise to s—but the above notation is reasonably common.
We should note that we can also consider the sequence of partial sums of functions, sn(x) = Σ_{i=1}^{n} fi(x), and say that if the sequence {sn(x)} converges pointwise, say to s(x), then the series of functions Σ_{i=1}^{∞} fi(x) is said to converge pointwise and is defined to be equal to s(x). We emphasize that this approach to convergence of series of functions is completely equivalent to the convergence defined in Definition 8.2.4.
In our consideration of the convergence of sequences of Taylor polynomials we have already given a very common example of a series of functions. Since Tn(x) was really a partial sum, when we considered the convergence of the Taylor polynomials of f(x) = e^x on [−3, 3], we were proving the pointwise convergence of the series of functions Σ_{i=0}^{∞} (1/i!) x^i on [−3, 3] (and we hope that you realize that it is not important that we considered general series starting at i = 1 and the Taylor series starting at i = 0). Because we expanded f(x) = e^x about x = 0, the series given above is the Maclaurin series of f. In general we make the following definition.

Definition 8.2.5 Let I be a neighborhood of x = a and suppose f : I → R has derivatives of all orders at x = a. Then Σ_{k=0}^{∞} (f^(k)(a)/k!)(x − a)^k is called the Taylor series expansion of f about x = a. When a = 0, the Taylor series is most often referred to as the Maclaurin series.

HW 8.2.1 Prove that the sequence of functions {fn } where fn : [0, 1] → R is


defined by fn (x) = nx(1 − x2 )n converges pointwise to f where f (x) = 0 for all
x ∈ [0, 1].

HW 8.2.2 Consider the sequence of functions {fn} where fn : R → R is defined by fn(x) = x^2/(1 + x^2)^n. Show that the series Σ_{n=0}^{∞} fn(x) = Σ_{n=0}^{∞} x^2/(1 + x^2)^n converges pointwise and determine the limiting function.
HW 8.2.3 Determine the Taylor series of the function f(x) = 1/(x + 1) about x = 2.

8.3 Tests for Convergence


As a part of our discussion of the pointwise convergence of the Taylor polyno-
mials, we considered real series for each fixed x—the sequence of Taylor polyno-
mials for a fixed x. For these Taylor series we were able to prove convergence by
the use of Taylor’s Inequality, Proposition 8.1.3. For general series (and hence
series of functions) we do not have a result as nice as Taylor’s Inequality—and
they are surely not all as nice as a geometric series. For this reason we need and
will develop a set of tools that can be used to prove convergence of series. We
begin with an obvious result of Definition 8.2.2 and Proposition 3.3.2-(a) and
(b).

Proposition 8.3.1 Suppose Σ_{i=1}^{∞} ai and Σ_{i=1}^{∞} bi are two convergent real series and c ∈ R. Then
(a) Σ_{i=1}^{∞} (ai + bi) converges and Σ_{i=1}^{∞} (ai + bi) = Σ_{i=1}^{∞} ai + Σ_{i=1}^{∞} bi, and
(b) Σ_{i=1}^{∞} cai converges and Σ_{i=1}^{∞} cai = c Σ_{i=1}^{∞} ai.

When you look back to Proposition 3.3.2, you might ask "what about part (d)?" We don't know it yet but we will find later that Σ_{i=1}^{∞} (−1)^i/√i is convergent (twice) but Σ_{i=1}^{∞} ((−1)^i/√i)·((−1)^i/√i) = Σ_{i=1}^{∞} 1/i is not. Hence there is no nice result that gives convergence for a series resulting from a term by term product of two convergent series.
The next result is very easy but was a very important tool in your basic course.

Proposition 8.3.2 If the series Σ_{i=1}^{∞} ai converges, then lim_{i→∞} ai = 0.

Proof: If sn represents the partial sum associated with the convergent series s = Σ_{i=1}^{∞} ai, we know that both limits lim_{n→∞} sn and lim_{n→∞} s_{n−1} exist and equal s = Σ_{i=1}^{∞} ai. Then an = sn − s_{n−1} → s − s = 0.
As we said earlier, the result given in Proposition 8.3.2 is very important—
but not in the form given in Proposition 8.3.2. For this reason we state the
contrapositive of Proposition 8.3.2 as the following corollary—called the “test
for divergence” in the basic course.

Corollary 8.3.3 (Test for Divergence) Consider the real series Σ_{i=1}^{∞} ai. If lim_{i→∞} ai ≠ 0, then the series Σ_{i=1}^{∞} ai does not converge.

One thing that we want to emphasize is that the statement "lim_{i→∞} ai ≠ 0" can be satisfied if either the limit does not exist, or the limit exists and is not equal to zero. Of course this corollary can be used to show that the series Σ_{i=1}^{∞} (−1)^i, Σ_{i=1}^{∞} 2^i and Σ_{i=1}^{∞} sin(i) do not converge. We do not know it yet but the series Σ_{i=1}^{∞} 1/i does not converge. For this series ai = 1/i → 0. Hence we emphasize that the converse of Proposition 8.3.2 is not true.
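The harmonic series makes this point vividly. A short editorial sketch (not part of the text) shows the terms going to 0 while the partial sums keep growing, roughly like ln n:

```python
import math

def harmonic(n):
    """Partial sum of the harmonic series, sum_{i=1}^{n} 1/i."""
    return sum(1.0 / i for i in range(1, n + 1))

for n in (10, 1000, 100000):
    print(n, harmonic(n), math.log(n))  # partial sums track ln(n) + 0.577...
```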
We next include a concept that will be very important to us later. We begin by including the definition of absolute and conditional convergence.
Definition 8.3.4 Suppose {ai} is a real sequence. We say that the series Σ_{i=1}^{∞} ai is absolutely convergent if the series Σ_{i=1}^{∞} |ai| is convergent. If the series Σ_{i=1}^{∞} ai is convergent but not absolutely convergent, then the series is said to be conditionally convergent.

We then state and prove the following result.
Proposition 8.3.5 Suppose {ai} is a real sequence. If the series Σ_{i=1}^{∞} ai is absolutely convergent, then it is convergent.

Proof: This is one of the results where it is very convenient to use the Cauchy criterion for the convergence of a series given in Proposition 8.2.3. We suppose that we are given an ε > 0. Since Σ_{i=1}^{∞} ai is absolutely convergent, we know that there exists N ∈ R such that n, m ∈ N, n, m > N and m > n implies Σ_{i=n}^{m} |ai| < ε (where the outer absolute value signs are not really needed). Then m, n ∈ N, m, n > N and m > n implies—by multiple applications of the triangle inequality, Proposition 1.5.8-(v), or an easy math induction proof—|Σ_{i=n}^{m} ai| ≤ Σ_{i=n}^{m} |ai| < ε. Thus by the Cauchy criterion for convergence of series, the series Σ_{i=1}^{∞} ai converges.

We see that if we consider the series Σ_{i=1}^{∞} 1/2^i, which we know is convergent because it is a geometric series associated with r = 1/2 < 1, then we immediately know that the series Σ_{i=1}^{∞} (−1)^i/2^i is convergent—or any other series like this where some of the terms are negative. We will apply this result often in a similar way.
The series results given so far do not directly help us decide whether or not series converge. When we worked with sequences, we had many methods that helped find limits. We next state and prove a series of results that help determine whether or not a series is convergent. We begin with the integral test (recall that we considered improper integrals in Section 7.8).


Proposition 8.3.6 (Integral Test) Suppose that Σ_{i=1}^{∞} ai is a real series and suppose f : [1, ∞) → R is a positive, decreasing continuous function for which f(i) = ai for i ∈ N. Then Σ_{i=1}^{∞} ai converges if and only if the integral ∫_1^∞ f exists.

Proof: Before we proceed we emphasize that we are assuming that ∫_1^∞ f exists in R (we do not include convergence to ∞ for this assumption). Since f is decreasing on the interval [i − 1, i], we know that f(i − 1) ≥ f(t) ≥ f(i) for any t ∈ [i − 1, i]. Hence by Proposition 7.4.5, f(i − 1) = ∫_{i−1}^{i} f(i − 1) dt ≥ ∫_{i−1}^{i} f ≥ ∫_{i−1}^{i} f(i) dt = f(i), or a_{i−1} ≥ ∫_{i−1}^{i} f ≥ ai. If we sum from i = 2 to i = n and apply Proposition 7.4.3, we get

s_{n−1} ≥ ∫_1^n f ≥ sn − a1.    (8.3.1)

(⇒) We assume s = Σ_{i=1}^{∞} ai converges. Since ai ≥ 0 for all i, the left side of inequality (8.3.1) yields ∫_1^n f ≤ s_{n−1} ≤ Σ_{i=1}^{∞} ai = s. Therefore, the sequence {bn} defined by bn = ∫_1^n f is a bounded, monotone increasing sequence. By the Monotone Convergence Theorem, Theorem 3.5.2-(a), the limit lim_{n→∞} bn = lim_{n→∞} ∫_1^n f exists and equals L = lub{bn : n ∈ N}. Because L ≥ bn for any n, the convergence of the sequence {bn} can be expressed as follows: for every ε > 0 there exists N ∈ R such that n > N implies that |bn − L| = L − bn < ε.
The above limit is not enough to show that the improper integral ∫_1^∞ f exists. We must show that lim_{R→∞} ∫_1^R f exists. We claim that this limit does in fact exist and will equal L. Suppose that ε > 0 is given. Choose N based on the convergence of the sequence {bn}. Let N1 = N + 1 and suppose that R > N1. Note that since f(x) ≥ 0, we can use Propositions 7.3.3 and 7.4.5 to show that ∫_1^R f ≥ ∫_1^{[R]} f where [R] is the greatest integer function. Then [R] > N and

|L − ∫_1^R f| = L − ∫_1^R f ≤ L − ∫_1^{[R]} f = L − b_{[R]} < ε.

Therefore lim_{R→∞} ∫_1^R f exists and the improper integral ∫_1^∞ f exists.
(⇐) We now assume that the integral ∫_1^∞ f exists. By the right side of inequality (8.3.1), the fact that f is positive and the fact that ∫_1^∞ f exists, we see that sn − a1 ≤ ∫_1^n f ≤ ∫_1^∞ f—the sequence {sn − a1} is bounded. Since ai ≥ 0 for all i, the sequence {sn} is increasing—so the sequence {sn − a1} is also increasing. By the Monotone Convergence Theorem, Theorem 3.5.2-(a), the sequence {sn − a1} is convergent. Thus the sequence {sn} is also convergent (using Proposition 3.3.2-(a)) and the series Σ_{i=1}^{∞} ai converges.
The form of the integral test given in Proposition 8.3.6 is not the form that we are accustomed to using. We rewrite Proposition 8.3.6 in the following corollary, where we include one of the implications from Proposition 8.3.6 and the contrapositive of the other implication.

Corollary 8.3.7 (Integral Test) Suppose that Σ_{i=1}^{∞} ai is a real series and suppose f : [1, ∞) → R is a positive, decreasing continuous function for which f(i) = ai for i ∈ N.
(a) If the improper integral ∫_1^∞ f exists, then the series Σ_{i=1}^{∞} ai is convergent.
(b) If the improper integral ∫_1^∞ f does not exist, then the series Σ_{i=1}^{∞} ai is not convergent.

The integral test is an especially good result because it gives us a large number of convergent series easily. It is easy to see that ∫_1^∞ 1/x^p dx exists for p > 1 and does not exist for p ≤ 1. Thus we get the p-series: Σ_{i=1}^{∞} 1/i^p converges for p > 1 and diverges if p ≤ 1. Note that when we choose p = 1, we see that the series Σ_{i=1}^{∞} 1/i diverges. This series, called the harmonic series, is the series that we used as an example of a divergent series earlier in this section. There are some other series on which the integral test can be used, but not many important ones. We also note at this time that if we apply the idea of absolute convergence along with some of these p-series, we obtain more convergent series. We see that since Σ_{i=1}^{∞} 1/i^3 converges, then Σ_{i=1}^{∞} (−1)^i/i^3 also converges. Of course we can find many more convergent series using this method.
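To see the p-series dichotomy numerically, compare partial sums for p = 2 and p = 1 (an editorial sketch, not part of the text):

```python
def p_series(p, n):
    """Partial sum of the p-series, sum_{i=1}^{n} 1 / i**p."""
    return sum(1.0 / i**p for i in range(1, n + 1))

print(p_series(2, 100000))  # settles near pi**2 / 6 = 1.6449...
print(p_series(1, 100000))  # still growing
```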
The next result, the comparison test, is important but is often difficult to
use.

Proposition 8.3.8 (Comparison Test) Suppose that {ai} and {bi} are real, positive sequences and suppose that for some N1 ∈ N, ai ≤ bi for all i ≥ N1. If the series Σ_{i=1}^{∞} bi converges, then the series Σ_{i=1}^{∞} ai converges.


Proof: Since Σ_{i=1}^{∞} bi converges, we know from Proposition 8.2.3 that for every ε > 0 there exists an N2 ∈ R such that n, m ∈ N, n, m > N2 and m > n implies that Σ_{i=n}^{m} bi < ε (where no absolute value signs are needed since {bi} is assumed to be a positive sequence). If we then let N = max{N1, N2}, we know that for n, m ∈ N, n, m > N and m > n we have Σ_{i=n}^{m} ai ≤ Σ_{i=n}^{m} bi < ε. Therefore, again by Proposition 8.2.3, we know that Σ_{i=1}^{∞} ai converges.
We mentioned earlier that the comparison test is often difficult to use. If we consider a series such as Σ_{i=1}^{∞} 1/(i^2 + i + 1), it is easy to see that 1/(i^2 + i + 1) ≤ 1/i^2. The series Σ_{i=1}^{∞} 1/i^2 converges because it is a p-series with p = 2. Hence, by the comparison test the series Σ_{i=1}^{∞} 1/(i^2 + i + 1) converges.

X 1
If we instead consider a series such as 2
we have to be more
i=1
i −i+1
1 1
clever. We note that i2 −i+ 1 ≥ i2 −2i+ 1 = (i−1)2 , then ≥ 2
(i − 1)2 i −i+1

X 1
for i ≥ 2. The series is a p-series with p = 2 so it is convergent—
i=2
(i − 1)2
it is not exactly in the form of a p-series but it should be clear that with a
change of variable j = i − 1, we see that it is exactly in the form of a p-series.

X 1
Then by the Comparsion Test, 2−i+1
is convergent. We note that the
i=1
i
series in Proposition 8.3.8 both start at i = 1 where in this example the series

X 1
starts at i = 2. This is no problem. We could add an i = 1 term,
i=2
(i − 1)2

X 1
say b1 = 13. The series 13 + will still be convergent and Proposition
i=2
(i − 1)2
8.3.8 will apply with N1 = 2.
Just as we did following the integral test, we can apply the comparison test in conjunction with absolute convergence. Using the comparison test and the fact that |sin i|/i^2 ≤ 1/i^2, it is easy to see that the series Σ_{i=1}^{∞} |sin i|/i^2 is convergent. Then using Proposition 8.3.5 we know that Σ_{i=1}^{∞} (sin i)/i^2 converges.
The next convergence test is an extremely nice result that takes care of most
of the difficulties associated with the comparison test.

Proposition 8.3.9 (Limit Comparison Test) Suppose that {ai} and {bi} are positive, real sequences.
(a) If lim_{i→∞} ai/bi ≠ 0, then the series Σ_{i=1}^{∞} ai is convergent if and only if the series Σ_{i=1}^{∞} bi is convergent. Note that (a) can be worded as follows:
(a1) If lim_{i→∞} ai/bi ≠ 0 and Σ_{i=1}^{∞} bi converges, then Σ_{i=1}^{∞} ai converges, and
(a2) If lim_{i→∞} ai/bi ≠ 0 and Σ_{i=1}^{∞} bi does not converge, then Σ_{i=1}^{∞} ai does not converge.
(b) If lim_{i→∞} ai/bi = 0 and Σ_{i=1}^{∞} bi converges, then Σ_{i=1}^{∞} ai converges.

Proof: The statement of the proposition above really consists of parts (a) and (b). Parts (a1) and (a2) are rewordings of part (a)—one implication and the contrapositive of the other implication. Statements (a1) and (a2) are in a form much easier to apply than that of (a).

(a) (⇒) We assume that $\lim_{i \to \infty} \frac{a_i}{b_i} = r \ne 0$ and that the series $\sum_{i=1}^{\infty} a_i$ is convergent. Since $a_i$ and $b_i$ are positive, r > 0. Because $\lim_{i \to \infty} \frac{a_i}{b_i} = r > 0$, for every ε > 0 there exists N ∈ R such that i > N implies that $\left| \frac{a_i}{b_i} - r \right| < \epsilon$, or

$$r - \epsilon < \frac{a_i}{b_i} < r + \epsilon. \qquad (8.3.2)$$

Since the sequence $\{b_i\}$ is assumed positive, inequality (8.3.2) can be rewritten as

$$(r - \epsilon) b_i < a_i < (r + \epsilon) b_i. \qquad (8.3.3)$$

Choose ε = r/2. Then for i > N we have $(r/2) b_i < a_i$. By the comparison test, Proposition 8.3.8, since $\sum_{i=1}^{\infty} a_i$ converges, $\sum_{i=1}^{\infty} (r/2) b_i$ converges. By Proposition 8.3.1-(b) this implies that $\sum_{i=1}^{\infty} b_i$ is also convergent.

(⇐) The proof of this direction is almost identical to the previous proof. The difference is that this time the right-hand half of inequality (8.3.3) is used along with the comparison test to show that if $\sum_{i=1}^{\infty} b_i$ converges, then $\sum_{i=1}^{\infty} a_i$ is also convergent—try it.
ai
(b) If lim = 0, then for  > 0 there exists an N ∈ R such that i > N implies
n→∞ bi
ai
that <  (no absolute value signs are necessary because both sequences are
bi
positive). Thus for i > N we have

ai < bi . (8.3.4)


230 8. Sequences and Series

Thus by the Proposition 8.3.1-(b) and the comparison test, the convergence of

X ∞
X
bi implies the convergence of ai .
i=1 i=1
Hopefully you remember from your basic course that you can easily prove the convergence of $\sum_{i=1}^{\infty} \frac{1}{i^2 - i + 1}$ by setting $a_i = \frac{1}{i^2 - i + 1}$, $b_i = \frac{1}{i^2}$ and applying part (a1) of the limit comparison test (realizing that $\sum_{i=1}^{\infty} b_i$ converges because it is a p-series with p = 2). This is much easier than applying the comparison test.

To show that $\sum_{i=1}^{\infty} \frac{i^2 + i + 1}{i^3 + i^2 + i + 1}$ does not converge, we set $a_i = \frac{i^2 + i + 1}{i^3 + i^2 + i + 1}$, $b_i = \frac{1}{i}$, show that $\frac{a_i}{b_i} \to 1$ as i → ∞, and apply part (a2) of the limit comparison test to see that $\sum_{i=1}^{\infty} \frac{i^2 + i + 1}{i^3 + i^2 + i + 1}$ does not converge (recall that $\sum_{i=1}^{\infty} \frac{1}{i}$ diverges since it is a p-series with p = 1).

Generally, the limit comparison test allows you to prove the convergence or divergence of a series by "comparing" the series with a known, much nicer series that is similar to the original series—similar in that the limit $\lim_{i \to \infty} a_i/b_i = r$ exists.
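A quick numerical sketch of the first example (the function names are ours): the ratio $a_i/b_i$ visibly settles down to r = 1, which is all the limit comparison test needs:

```python
# a_i = 1/(i^2 - i + 1) compared against the p-series term b_i = 1/i^2.
def a(i):
    return 1.0 / (i * i - i + 1)

def b(i):
    return 1.0 / (i * i)

# a_i/b_i = i^2/(i^2 - i + 1) -> 1 != 0, so by part (a1) the series
# sum a_i converges because sum b_i does (p-series with p = 2).
ratios = [a(i) / b(i) for i in (10, 100, 1000, 10000)]
```

The ratios shrink toward 1 from above, mirroring the algebraic limit computation in the text.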
We next introduce the convergence test that might be the most important test of them all. The ratio test is applicable to series that are almost geometric series—as we shall see from the proof and the examples that follow. Of course the ratio test will work on a geometric series, but we don't need it there.
Proposition 8.3.10 (Ratio Test) Consider a real sequence of non-zero elements $\{a_i\}$.

(a) Suppose that there exists r, 0 < r < 1, and an N ∈ N such that i ≥ N implies that $\left| \frac{a_{i+1}}{a_i} \right| \le r$. Then the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent.

(b) Suppose that there exists r, r ≥ 1, and an N ∈ N such that i ≥ N implies that $\left| \frac{a_{i+1}}{a_i} \right| \ge r$. Then the series $\sum_{i=1}^{\infty} a_i$ does not converge.

Proof: (a) Suppose that there exists r, 0 < r < 1, and an N ∈ N such that i ≥ N implies that $\left| \frac{a_{i+1}}{a_i} \right| \le r$. Note the following.

• $\left| \frac{a_{N+1}}{a_N} \right| \le r$ implies that $|a_{N+1}| \le r |a_N|$.

• $\left| \frac{a_{N+2}}{a_{N+1}} \right| \le r$ implies that $|a_{N+2}| \le r |a_{N+1}| \le r^2 |a_N|$.

• Claim: m ≥ 1 implies that $|a_{N+m}| \le r^m |a_N|$.

Proof by mathematical induction:

Step 1: True for m = 1, given above.

Step 2: Assume true for m = k, i.e. assume that $|a_{N+k}| \le r^k |a_N|$.

Step 3: Since $\left| \frac{a_{N+k+1}}{a_{N+k}} \right| \le r$, we have $|a_{N+k+1}| \le r |a_{N+k}| \le r \cdot r^k |a_N|$ (by the inductive hypothesis). Thus $|a_{N+k+1}| \le r^{k+1} |a_N|$, i.e. the claim is true for m = k + 1.

Therefore by mathematical induction the statement is true for all m, i.e. for m ≥ 1, $|a_{N+m}| \le r^m |a_N|$.

We know that the series $\sum_{m=1}^{\infty} r^m |a_N|$ is convergent since it is a geometric series. By the comparison test we then know that $\sum_{m=1}^{\infty} a_{N+m}$ is absolutely convergent. And finally, since $\sum_{i=1}^{N} a_i$ is a finite sum, we then know that $\sum_{i=1}^{N} a_i + \sum_{m=1}^{\infty} a_{N+m} = \sum_{i=1}^{\infty} a_i$ is absolutely convergent.

(b) Suppose that there exists r, r ≥ 1, and an N ∈ N such that i ≥ N implies that $\left| \frac{a_{i+1}}{a_i} \right| \ge r$. By a mathematical induction proof similar to that used in part (a) we can show that m ≥ 1 implies that $|a_{N+m}| \ge r^m |a_N|$. Since r ≥ 1, it is clear that $|a_{N+m}| \not\to 0$ as m → ∞. Thus it is impossible that $a_{N+m} \to 0$ (because if $a_{N+m} \to 0$, then $|a_{N+m}| \to 0$). And if $a_{N+m} \not\to 0$, it should be clear that $a_i \not\to 0$. Thus by Corollary 8.3.3 we know that the series $\sum_{i=1}^{\infty} a_i$ does not converge.
We should note that the above version of the ratio test is not the version
usually included in the basic calculus texts. We include the following version of
the ratio test.

Corollary 8.3.11 (Ratio Test) Consider a real sequence of non-zero elements $\{a_i\}$. Suppose that $\lim_{i \to \infty} \left| \frac{a_{i+1}}{a_i} \right| = r$. Then

(a) if r < 1, the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent;

(b) if r > 1, the series $\sum_{i=1}^{\infty} a_i$ is not convergent;

(c) if r = 1, no prediction can be made.
Proof: (a) If $\lim_{i \to \infty} \left| \frac{a_{i+1}}{a_i} \right| = r < 1$, then for every ε > 0 there exists N ∈ R such that i > N implies that $\left| \left| \frac{a_{i+1}}{a_i} \right| - r \right| < \epsilon$. Choose ε = (1 − r)/2 and set $N_1 = [N] + 1$. Then for i ≥ N₁ we have

$$0 < \left| \frac{a_{i+1}}{a_i} \right| < r + \frac{1 - r}{2} = \frac{r + 1}{2} < 1.$$

Thus by part (a) of Proposition 8.3.10 the series $\sum_{i=1}^{\infty} a_i$ converges absolutely.

(b) Again we know that for every ε > 0 there exists N ∈ R such that i > N implies that $\left| \left| \frac{a_{i+1}}{a_i} \right| - r \right| < \epsilon$—but now r > 1. Choose ε = (r − 1)/2 and set $N_1 = [N] + 1$. Then for i ≥ N₁ we have

$$1 < \frac{r + 1}{2} = r - \frac{r - 1}{2} < \left| \frac{a_{i+1}}{a_i} \right|.$$

Thus we can use part (b) of Proposition 8.3.10 to see that the series $\sum_{i=1}^{\infty} a_i$ does not converge.

(c) If we consider $\sum_{i=1}^{\infty} \frac{1}{i}$, we see that $\lim_{i \to \infty} \frac{1/(i+1)}{1/i} = 1$, and we know that the series $\sum_{i=1}^{\infty} \frac{1}{i}$ does not converge. If we consider $\sum_{i=1}^{\infty} \frac{1}{i^2}$, we see that $\lim_{i \to \infty} \frac{1/(i+1)^2}{1/i^2} = 1$, and we know that the series $\sum_{i=1}^{\infty} \frac{1}{i^2}$ converges. Hence the condition r = 1 does not determine whether or not a series converges.
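The three cases of the corollary can be seen numerically (the names are ours). For $a_i = i/2^i$ the ratio tends to 1/2 and the series converges absolutely, while for both 1/i (divergent) and 1/i² (convergent) the ratio tends to 1, so r = 1 genuinely gives no information:

```python
def limiting_ratio(term, i):
    """Approximate lim |a_{i+1}/a_i| by evaluating at one large index i."""
    return abs(term(i + 1) / term(i))

r_fast = limiting_ratio(lambda i: i / 2.0**i, 500)     # tends to 1/2 < 1
r_harmonic = limiting_ratio(lambda i: 1.0 / i, 10**6)  # tends to 1
r_p2 = limiting_ratio(lambda i: 1.0 / i**2, 10**6)     # also tends to 1
```

The first ratio is safely below 1; the last two are indistinguishable even though one series diverges and the other converges.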
Notice that as part of the proofs of parts (a) and (b) we had to do a little dance to deal with the fact that we only require the "N" in the definition of the limit of a sequence to be in R—using $N_1 = [N] + 1$ when we needed an integer. There were times when it was convenient to allow N to be any real number that works. In the above proof we had to pay for that earlier convenience.
Another well known test is the root test. At times the root test is clearly the
natural test to use. In most other cases the root test is very difficult to apply.
We state the following result.
Proposition 8.3.12 (Root Test) Consider a real sequence $\{a_i\}$.

(a) Suppose that there exists r, 0 < r < 1, and an N ∈ N such that i ≥ N implies that $|a_i|^{1/i} \le r$. Then the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent.

(b) Suppose that there exists r, r ≥ 1, and an N ∈ N such that i ≥ N implies that $|a_i|^{1/i} \ge r$. Then the series $\sum_{i=1}^{\infty} a_i$ does not converge.
Proof: (a) If $|a_i|^{1/i} \le r$ for i ≥ N, then $|a_i| \le r^i$. Then since $\sum_{i=1}^{\infty} r^i$ converges (it is a geometric series and r < 1), the series $\sum_{i=1}^{\infty} |a_i|$ converges—of course we really get that the tail end of the series converges (from N on), which implies that the whole series converges. Therefore the series $\sum_{i=1}^{\infty} a_i$ converges absolutely.

(b) Since $|a_i|^{1/i} \ge r$, we have $|a_i| \ge r^i$. Then since r ≥ 1, we know that $|a_i| \not\to 0$, which implies that $a_i \not\to 0$—which by Corollary 8.3.3 implies that the series $\sum_{i=1}^{\infty} a_i$ does not converge.
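A case where the root test is the natural choice can be sketched numerically (the names are ours): for $a_i = \left(\frac{i}{2i+1}\right)^i$ the i-th root is simply $\frac{i}{2i+1} \le \frac{1}{2}$ for every i, so Proposition 8.3.12-(a) applies directly with r = 1/2:

```python
# i-th root of a_i = (i/(2i+1))^i; algebraically this is i/(2i+1),
# which is bounded by 1/2, so the series converges absolutely.
def ith_root(i):
    a_i = (i / (2.0 * i + 1.0)) ** i
    return a_i ** (1.0 / i)

roots = [ith_root(i) for i in (1, 5, 50, 500)]
```

The computed roots increase toward 1/2 but never exceed it, matching the algebraic bound.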
We should note that, like the ratio test, the statement of Proposition 8.3.12 is not the version that is usually given in basic calculus texts. And as was the case with the ratio test, the more traditional root test will follow from Proposition 8.3.12.
There is one more important test for convergence that we must consider. We should realize that until this time all of our tests were for positive series or they gave absolute convergence (the ratio and root tests). The only way that we proved convergence of a series that was not positive was to use Proposition 8.3.5—absolute convergence implies convergence. We next consider a class of series that are not positive, called alternating series. We now include the following definition and the associated convergence theorem.

Definition 8.3.13 Consider a real sequence of positive elements $\{a_i\}$. The series $\sum_{i=1}^{\infty} (-1)^{i+1} a_i$ is said to be an alternating series.

Note that we set the exponent on the −1 term to be i + 1 just so that the first
term would be positive—that seems a bit neater. This is not important. It is
still an alternating series if it starts out negative and the result given below is
equally true for alternating series that start with a negative term.

Proposition 8.3.14 (Alternating Series Test) Consider a real, positive, decreasing sequence $\{a_i\}$ and suppose that $\lim_{i \to \infty} a_i = 0$. Then the alternating series $\sum_{i=1}^{\infty} (-1)^i a_i$ converges.
i=1

Proof: Consider the sequence of partial sums $s_{2n} = (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2n-1} - a_{2n})$ (we work with the positive-first arrangement; the sign of the series does not affect its convergence). Since $a_k - a_{k+1} \ge 0$, the sequence is increasing. Also, since $a_k - a_{k+1} \ge 0$ for k = 2, ..., 2n − 2, and $a_{2n} > 0$,

$$s_{2n} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2n-2} - a_{2n-1}) - a_{2n} \le a_1,$$

i.e. the sequence $\{s_{2n}\}$ is bounded above. Then by the Monotone Convergence Theorem, Theorem 3.5.2-(a), the sequence $\{s_{2n}\}$ converges—say to s. Then since $s_{2n+1} = s_{2n} + a_{2n+1}$ and $a_{2n+1} \to 0$, we see that $\{s_{2n+1}\}$ also converges to s.

Claim: The sequence $\{s_n\}$ converges to s. Let ε > 0 be given and let N₁ be such that n > N₁ implies that $|s_{2n} - s| < \epsilon$, i.e. if 2n > 2N₁, then $|s - s_{2n}| < \epsilon$. Let N₂ be such that n > N₂ implies that $|s - s_{2n+1}| < \epsilon$, i.e. if 2n + 1 > 2N₂ + 1, then $|s - s_{2n+1}| < \epsilon$.

Then if we define N = max{2N₁, 2N₂ + 1}, n > N implies that $|s - s_n| < \epsilon$, so $\lim_{n \to \infty} s_n = s$ and the alternating series converges—to s.
We emphasize here that to apply the alternating series test we must show that (i) the sequence is decreasing, (ii) $a_i \to 0$, and (iii) the series is alternating. We used the fact earlier that the series $\sum_{i=1}^{\infty} \frac{(-1)^i}{\sqrt{i}}$ converges. It is easy to see that (i) $a_{i+1} = \frac{1}{\sqrt{i+1}} < \frac{1}{\sqrt{i}} = a_i$ (which is the same as $\sqrt{i} < \sqrt{i+1}$, or i < i + 1), and (ii) $\lim_{i \to \infty} \frac{1}{\sqrt{i}} = 0$. Surely the series is alternating. Hence by the alternating series test the series $\sum_{i=1}^{\infty} \frac{(-1)^i}{\sqrt{i}}$ converges.
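The squeezing mechanism in the proof above is easy to watch for this series (the names are ours): odd-indexed partial sums of $\sum (-1)^i/\sqrt{i}$ increase, even-indexed ones decrease, and consecutive sums trap the limit within the next term:

```python
import math

def s(n):
    """n-th partial sum of sum_{i>=1} (-1)^i / sqrt(i)."""
    return sum((-1) ** i / math.sqrt(i) for i in range(1, n + 1))

# The two monotone subsequences of partial sums squeeze onto the limit;
# the gap between consecutive sums is exactly the next term 1/sqrt(n+1).
s_odd, s_even = s(1001), s(1000)
gap = s_even - s_odd
```

Since `s_odd < limit < s_even` and the gap is $1/\sqrt{1001}$, the limit is already pinned down to about ±0.016 here.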

HW 8.3.1 (True or False and why)

(a) If $\lim_{n \to \infty} a_n = 0$, then $\sum_{n=0}^{\infty} a_n$ converges.

(b) Since the series $\sum_{n=1}^{\infty} \frac{1}{n}$ does not converge, the series $\sum_{n=1}^{\infty} (-1)^n \frac{1}{n}$ does not converge.

(c) The integral test implies that the series $\sum_{n=2}^{\infty} \frac{1}{n \ln n}$ converges.

(d) If $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} |a_n|$ converges.

(e) Suppose $a_n \le b_n$ for n = 1, 2, ... and $\sum_{n=1}^{\infty} b_n$ converges. Then $\sum_{n=1}^{\infty} a_n$ converges.

HW 8.3.2 Tell which test (if any) will determine whether the following series are convergent or not.

(a) $\sum_{n=1}^{\infty} \frac{n}{2^n}$  (b) $\sum_{n=1}^{\infty} \frac{1}{n^n}$  (c) $\sum_{n=1}^{\infty} \frac{1}{n^2 + 1}$  (d) $\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}$

8.4 Power Series


In Section 8.2 we showed that the Maclaurin series of f (x) = ex about x = 0
converges pointwise to f (x) = ex on [−3, 3].
From Section 8.3 we now have other methods for proving convergence of series. For example, if we consider that same Maclaurin series, $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$, and apply the ratio test, we see that

$$\left| \frac{a_{k+1}}{a_k} \right| = \left| \frac{\frac{1}{(k+1)!} x^{k+1}}{\frac{1}{k!} x^k} \right| = \left| \frac{x}{k+1} \right| \to 0$$

as k → ∞ for all x ∈ R. Thus we have just shown that the natural infinite series associated with $f(x) = e^x$ converges on all of R—a much better result than that proved in Section 8.2, where we proved it converged on [−3, 3].
But note several important things. Here we proved that the series $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$ converges for all x ∈ R, but we did not prove that it converges to $f(x) = e^x$. We should also note that if we apply the same approach using the Taylor inequality as we did in Section 8.2 on a larger interval, say [−88, 88], we still get convergence: $f^{(n+1)}(x) = e^x$ for all n, so $M = e^{88}$, $|T_n(x) - e^x| \le \frac{e^{88}}{(n+1)!} 88^{n+1}$, and $\frac{88^{n+1}}{(n+1)!} \to 0$ as n → ∞ (Example 3.5.2) implies that $T_n(x) \to e^x$ for all x ∈ [−88, 88]. And since this argument works for any interval [−R, R] ⊂ R, the series $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$ converges to $f(x) = e^x$ for all x ∈ R. Then we can write $f(x) = \sum_{k=0}^{\infty} \frac{1}{k!} x^k$ on R.
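This convergence is easy to see in floating point (the code and its names are ours). Building the partial sums $T_n(x) = \sum_{k=0}^{n} x^k/k!$ with an incremental term avoids computing large powers and factorials directly:

```python
import math

def T(n, x):
    """Partial sum sum_{k=0}^{n} x^k / k! of the Maclaurin series of e^x."""
    term, total = 1.0, 1.0  # the k = 0 term
    for k in range(1, n + 1):
        term *= x / k       # turns x^(k-1)/(k-1)! into x^k/k!
        total += term
    return total
```

For instance, T(60, 5.0) already agrees with $e^5 \approx 148.41$ to machine precision, and the same happens at negative x.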
We can find an assortment of Taylor-Maclaurin series expansions and are
able to prove convergence of the series to the given function—as we see in the
following result.

Proposition 8.4.1 Let I be a neighborhood of x = a and suppose $f : I \to R$ has derivatives of all orders at x = a. Suppose further that there exist r and M such that the interval [a − r, a + r] ⊂ I and $|f^{(n)}(x)| \le M$ for all x ∈ [a − r, a + r] and all n ∈ N. Then the Taylor series of f converges and $f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k$, i.e. the Taylor series converges and is equal to the function f(x) on [a − r, a + r].

Proof: Let $T_n$ be the Taylor polynomial approximation of f about x = a, $T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k$, the partial sum of the series $\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k$. By Proposition 8.1.1 we know that $f(x) - T_n(x) = R_n(x)$, where $R_n(x) = \frac{1}{n!} \int_a^x (x - t)^n f^{(n+1)}(t) \, dt$. Then by the Taylor inequality, Proposition 8.1.3 (since $|f^{(n)}(x)| \le M$ for all x ∈ [a − r, a + r] and all n ∈ N, we have $|f^{(n+1)}| \le M$ for x ∈ [a − r, a + r]),

$$|f(x) - T_n(x)| \le |R_n(x)| \le \frac{M}{(n+1)!} r^{n+1}.$$

By Example 3.5.2, $M \frac{r^{n+1}}{(n+1)!} \to 0$ as n → ∞. This implies that for any x ∈ [a − r, a + r], $|f(x) - T_n(x)| \to 0$ as n → ∞, i.e. $T_n(x) \to f(x)$ as n → ∞, i.e. $\{T_n\}$ converges to f pointwise on [a − r, a + r]. Therefore the series $\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k$ converges pointwise to f on [a − r, a + r] and we can write $f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k$ on [a − r, a + r].
To apply this proposition we must understand that the choice of r may be very important. When we considered $f(x) = e^x$ and its associated Maclaurin series, the choice of r was not important—we could find a bound M for any interval [−R, R] or any [a − r, a + r]. This does not always work. Consider the following example.
Example 8.4.1 Find the Maclaurin series expansion of f(x) = ln(x + 1) and analyze the convergence of this series.

Solution: It is easy to see that $f^{(k)}(x) = (-1)^{k+1} (k-1)! (x+1)^{-k}$ for k = 1, 2, .... Thus since $f^{(k)}(0) = (-1)^{k+1} (k-1)!$ for k = 1, 2, ... (and of course $f^{(0)}(0) = 0$), we see that the Maclaurin series expansion of f(x) = ln(x + 1) is given by $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$.

The Taylor polynomial and the associated error term of f(x) = ln(x + 1) about x = 0 are given by $T_n(x) = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k} x^k$ and $R_n(x) = (-1)^n \int_0^x (x - t)^n (t + 1)^{-(n+1)} \, dt$. Notice that we do not have the n! term in the denominator to help us make $R_n$ small. Also note that we surely do not want to have r ≥ 1—because then [a − r, a + r] = [−r, r] would contain t = −1, where $f^{(n)}(t)$ is not defined. And finally, if we consider $R_n$ on [−r, r] for r < 1, the maximum of the term $(t + 1)^{-(n+1)}$ occurs at t = −r and is given by $(1 - r)^{-(n+1)}$. This term cannot be bounded as a function of n (try r = 1/2—then it is really easy to see). And of course it will be impossible to find an M that bounds $f^{(n)}$ on [−r, r].

Thus we cannot use Proposition 8.4.1. Moreover, the expression for $R_n$ doesn't look good—the term $(t + 1)^{-(n+1)}$ will be large with respect to n. We do, however, have methods to consider the convergence of the series. If we let $b_k = \frac{(-1)^{k+1}}{k} x^k$, we see by the ratio test that $\left| \frac{b_{k+1}}{b_k} \right| = \frac{k}{k+1} |x| \to |x|$, so the series converges for |x| < 1 (and does not converge for |x| > 1). If we let x = −1, we see that the series does not converge because it is the negative of the p-series with p = 1. And if we let x = 1, it is easy to see that the series converges by the alternating series test. Thus the series converges for x ∈ (−1, 1] and does not converge elsewhere.

However, when the series converges, we do not know that it converges to ln(x + 1). We will return to this example in Example 8.6.1.
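Numerically the partial sums do appear to converge to ln(x + 1) on (−1, 1], anticipating the result of Example 8.6.1 (the code and its names are ours):

```python
import math

def log_series(x, n):
    """Partial sum sum_{k=1}^{n} (-1)^(k+1) x^k / k."""
    return sum((-1) ** (k + 1) * x**k / k for k in range(1, n + 1))

inside = log_series(0.5, 60)        # |x| < 1: geometric-rate convergence
endpoint = log_series(1.0, 100000)  # x = 1: slow alternating convergence
```

Inside the interval the error shrinks like $|x|^n$, while at the end point x = 1 the alternating-series bound only gives an error of order 1/n—visible in how many terms each computation needs.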

Thus we see that though Proposition 8.4.1 is a very nice result, there are times
(lots of times) that it does not apply.
Power Series. In our discussion of Taylor and Maclaurin series above we started with a function f, used that function to generate a series, and then proved that the series converged to f on some interval. There are times when we want to go approximately in the other direction. We begin with a series of functions, prove that the series is convergent, and define a function to be the result of the convergent series. We begin with the following definition.

Definition 8.4.2 Consider the real sequence $\{a_k\}_{k=0}^{\infty}$. The series $\sum_{k=0}^{\infty} a_k (x - a)^k$ is said to be a power series about x = a.

We first note that we started the power series at k = 0. A power series can equally well start at k = 1 or any other particular value. It is very traditional to start power series at k = 0—and that's ok. There is a slight problem with starting the power series at k = 0. The first term is then $a_0 x^0$, and we do want the power series to be well-defined at x = 0. And of course $x^0$ is not defined at x = 0. Thus we want to emphasize that we write $\sum_{k=0}^{\infty} a_k x^k$ and mean $a_0 + \sum_{k=1}^{\infty} a_k x^k$.
We should also note that power series are commonly defined for complex sequences of numbers. We restricted our power series to real coefficients because here we are interested in real functions and real series. Everything that we do can be generalized to complex power series. And finally, we will work with power series about x = 0—everything that we do can be translated to results about x = a.
Of course we see that a Taylor series expansion gives us a power series.
There are power series where it is not clear that they come from a Taylor series.
Power series appear in a variety of applications. One of the common reasons
for generating power series is when we find power series solutions to ordinary
differential equations—including the resulting power series that define Bessel’s
functions, hypergeometric functions and others.
Consider the following examples of power series.
Example 8.4.2 Discuss the convergence of the following power series:

(a) $\sum_{k=0}^{\infty} k! x^k$  (b) $\sum_{k=0}^{\infty} \frac{x^k}{k!}$  (c) $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$  (d) $\sum_{k=0}^{\infty} x^k$

Solution: (a) Let $b_k = k! x^k$. Applying the ratio test to the power series (a), we see that $\left| \frac{b_{k+1}}{b_k} \right| = (k+1)|x| \to \infty$ as k → ∞ if x ≠ 0. Thus $\sum_{k=0}^{\infty} k! x^k$ does not converge for any x ∈ R, x ≠ 0. The series converges to $a_0$ for x = 0, i.e. series (a) converges on the set {0}.

(b) Let $b_k = x^k/k!$. Then $\left| \frac{b_{k+1}}{b_k} \right| = \frac{|x|}{k+1} \to 0$ as k → ∞. Thus the series $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ converges absolutely for all x ∈ R.

(c) Let $b_k = (-1)^k x^k/k$. Then $\left| \frac{b_{k+1}}{b_k} \right| = \frac{k}{k+1} |x| \to |x|$ as k → ∞. Thus by the ratio test the series $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$ converges absolutely on {x ∈ R : |x| < 1} = (−1, 1) and does not converge on {x ∈ R : |x| > 1} = (−∞, −1) ∪ (1, ∞).

The ratio test tells us nothing about the convergence at x = −1 and x = 1. At x = −1 the series becomes $\sum_{k=1}^{\infty} \frac{1}{k}$, which we know diverges—a p-series with p = 1. At x = 1 the series becomes $\sum_{k=1}^{\infty} (-1)^k \frac{1}{k}$, which converges by the alternating series test.

Therefore the series $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$ converges on (−1, 1] and does not converge on (−∞, −1] ∪ (1, ∞).

(d) Let $b_k = x^k$. Then $\left| \frac{b_{k+1}}{b_k} \right| = |x| \to |x|$ as k → ∞. Thus by the ratio test the series $\sum_{k=0}^{\infty} x^k$ converges on {x ∈ R : |x| < 1} = (−1, 1) and does not converge on {x ∈ R : |x| > 1} = (−∞, −1) ∪ (1, ∞). As in (c), the ratio test tells us nothing about the convergence at x = ±1. At x = −1 and x = 1 the series become $\sum_{k=0}^{\infty} (-1)^k$ and $\sum_{k=0}^{\infty} 1$, respectively. Both of these series fail to converge by the test for divergence, Corollary 8.3.3.

Therefore the series $\sum_{k=0}^{\infty} x^k$ converges on (−1, 1) and does not converge on (−∞, −1] ∪ [1, ∞).
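Part (d) is the familiar geometric series, so the interval of convergence can be checked directly (the names are ours): inside (−1, 1) the partial sums approach 1/(1 − x), while at the end points the terms do not tend to 0:

```python
def geom_partial(x, n):
    """Partial sum sum_{k=0}^{n} x^k of the geometric power series."""
    return sum(x**k for k in range(n + 1))

inside = geom_partial(0.5, 200)               # close to 1/(1 - 0.5) = 2
edge_terms = [(-1.0) ** k for k in (10, 11)]  # at x = -1 the terms stay +/-1
```

The bounded, convergent behavior for |x| < 1 and the non-vanishing terms at |x| = 1 are exactly the dichotomy the ratio test and the divergence test detect.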

We see from the above example that the ratio test is a powerful tool that can be used to determine the convergence of power series. Also, we see that all the different possibilities occur—convergence at a single point, convergence on all of R, and convergence on an interval, with or without the end points of the interval.
The first result concerning power series is a proposition that describes the
convergence of a power series—results that we sort of see from the previous
example.

Proposition 8.4.3 (a) If the power series $\sum_{k=0}^{\infty} a_k x^k$ converges for $x = x_0$ and z is such that $|z| < |x_0|$, then $\sum_{k=0}^{\infty} a_k x^k$ converges absolutely for x = z.

(b) If the power series $\sum_{k=0}^{\infty} a_k x^k$ does not converge at $x = x_0$ and z is such that $|z| > |x_0|$, then $\sum_{k=0}^{\infty} a_k z^k$ does not converge.

X
Proof: (a) If ak xk0 converges, then ak xk0 → 0 as k → ∞. Choosing  = 1
k=0
we know that there exists N ∈ R such that k > N implies that ak xk0 < 1. If z
is such that |z| < |x0 |, then
k k
z z
|ak z k | = |ak xk0 | <
x0 x0
8.4 Power Series 239


z k

z X
for k > N . Since < 1, the series

x0 is a convergent geometric series.

x0
k=0
X∞
By the comparison test, Proposition 8.3.8, the series |ak z k | is convergent,
k=N +1

X ∞
X
i.e. the series ak z k and hence the series ak z k ae absolutely convergent.
k=N +1 k=0

X
(b) Suppose the statement is false, i.e. suppose the series ak z k converges.
k=0

X
Then since |x0 | < |z|, by part (a) of this result we know that ak xk0 con-
k=0
verges absolutely. This is a contradiction to the hypothesis. Therefore the

X
series ak z k does not converge.
k=0
We want to be able to describe (somewhat) the set of convergence of a power series, as we found in Example 8.4.2. We make the following definition.

Definition 8.4.4 Define the radius of convergence R of a power series $\sum_{k=0}^{\infty} a_k x^k$ as

$$R = \mathrm{lub} \left\{ y \in R : \sum_{k=0}^{\infty} a_k y^k \text{ is absolutely convergent} \right\}.$$

We then obtain the following result.



Proposition 8.4.5 If R is the radius of convergence of the power series $\sum_{k=0}^{\infty} a_k x^k$, then the series converges absolutely for |x| < R and does not converge for |x| > R.

If |x| = R, the series may converge conditionally, may converge absolutely, or may not converge.
Proof: Suppose that |x| < R. By the definition of R we know that there exists an $x_0$ such that $|x| < x_0 < R$ and $\sum_{k=0}^{\infty} a_k x_0^k$ is absolutely convergent. Then by Proposition 8.4.3-(a) the series $\sum_{k=0}^{\infty} a_k x^k$ is absolutely convergent.

Now suppose that |x| > R and suppose that the result is false, i.e. suppose that $\sum_{k=0}^{\infty} a_k x^k$ converges. By Proposition 8.4.3-(a), $\sum_{k=0}^{\infty} a_k z^k$ will converge absolutely for all z such that |z| < |x|. Hence R ≥ |x|. This contradicts the assumption that |x| > R. Therefore the series $\sum_{k=0}^{\infty} a_k x^k$ does not converge.

Example 8.4.2-(c) shows that if |x| = R, the series may converge conditionally or not at all. HW 8.4.2-(a) shows that it is possible that the series may converge absolutely when |x| = R.
We see that for a given power series, the series will converge (absolutely) for |x| < R, not converge for |x| > R, and may or may not converge for |x| = R. When the series converges, we want to use the power series to define a function on the domain {x ∈ R : |x| < R}—or maybe a bit more; we may want to include x = ±R. It's not hard to see that wherever the series converges, we can define a function $f(x) = \sum_{k=0}^{\infty} a_k x^k$. As we always do in calculus once we have a new function, we ask whether the function is continuous, differentiable and/or integrable. The most obvious approach is to hope that because $a_k x^k$ is continuous for each k (so $\sum_{k=0}^{n} a_k x^k$ is continuous for any n), then $\sum_{k=0}^{\infty} a_k x^k$ will be continuous. And, since $a_k x^k$ is differentiable for each k (so $\sum_{k=0}^{n} a_k x^k$ is differentiable for any n), then $\sum_{k=0}^{\infty} a_k x^k$ will be differentiable and

$$\left( \sum_{k=0}^{\infty} a_k x^k \right)' = \sum_{k=1}^{\infty} \left( a_k x^k \right)'.$$

And, finally, since $a_k x^k$ is integrable for each k (so $\sum_{k=0}^{n} a_k x^k$ is integrable for any n), then $\sum_{k=0}^{\infty} a_k x^k$ will be integrable and

$$\int_a^b \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} \int_a^b a_k x^k.$$

The fact is that at this time we cannot make these claims or answer these questions. Pointwise convergence is not enough. We mentioned earlier that there were other kinds of convergence. In the next section we will give ourselves the necessary structure to answer these questions.

HW 8.4.1 (True or False and why)

(a) If R is the radius of convergence of the power series $\sum_{k=0}^{\infty} a_k x^k$, then the series converges at x = −R.

(b) Suppose that $\sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$ is the Maclaurin series for the function f. Suppose that for each k ∈ N, $f^{(k)}$ is bounded on the interval [−r, r]. Then the Maclaurin series converges to f pointwise on [−r, r].

(c) The power series $\sum_{k=1}^{\infty} k x^k$ converges only at x = 0.

(d) The radius of convergence of the power series $\sum_{k=1}^{\infty} k^k x^k$ is R = e.

(e) Suppose that $|a_k| < 1$ for all k ∈ N. Then the power series $\sum_{k=1}^{\infty} a_k x^k$ converges on all of R.

HW 8.4.2 Discuss the convergence of the following power series:

(a) $\sum_{k=1}^{\infty} \frac{1}{k^2} x^k$  (b) $\sum_{k=1}^{\infty} (-1)^k \frac{1}{k^2} x^k$  (c) $\sum_{k=0}^{\infty} (-1)^k k x^k$  (d) $\sum_{k=1}^{\infty} \frac{k}{k^2 + 1} x^k$


HW 8.4.3 Discuss the convergence of the power series $\sum_{k=1}^{\infty} (\sin k) x^k$.

HW 8.4.4 Compute the Maclaurin series of f (x) = sin x. Determine for which
values of x the series converges. Prove or disprove that when the series con-
verges, it converges to f (x) = sin x.

8.5 Uniform Convergence of Sequences


We mentioned in the last section that we need more—we need a stronger form
of convergence than pointwise convergence to give us the results that we want
and need. As should be obvious from the title of this section, we will introduce
uniform convergence. Remember that convergence of series was defined in terms
of the convergence of sequences. For that reason we shall return to consideration
of uniform convergence of sequences of functions—this is the easiest way to
introduce uniform convergence and it is an important concept even without
consideration of series of functions.
We begin with three traditional examples that every grade school child should know. Specifically, we will show that we were right when we said that pointwise convergence is not enough. Recall that in Example 8.2.1 we defined a sequence of functions $\{f_{1n}\}$ on [0, 1] by $f_{1n}(x) = x^n$ and a function $f_1$ by

$$f_1(x) = \begin{cases} 0 & \text{for } 0 \le x < 1 \\ 1 & \text{for } x = 1, \end{cases}$$

and showed that $f_{1n} \to f_1$ pointwise on [0, 1]. Consider the following example.
Consider the following example.
Example 8.5.1 Suppose $f_n, f : D \to R$, D ⊂ R, and $f_n \to f$ pointwise on D. Suppose that $f_n$ is continuous on D for all n ∈ N. Show that it need not be the case that f is continuous.

Solution: Consider the sequence of functions defined above, $\{f_{1n}\}$, and the limiting function $f_1$. We saw that $f_{1n} \to f_1$ pointwise on [0, 1]. We know that $f_{1n}$ is continuous on [0, 1] for each n ∈ N. It should be clear that $f_1$ is not continuous at x = 1. Therefore we have an example of a pointwise convergent sequence of continuous functions that converges to a discontinuous function.

In Example 8.2.2 we considered another example of a pointwise convergent sequence. We defined $f_{2n}, f_2 : [0, 1] \to R$ for n ∈ N by $f_{2n}(x) = \frac{x^n}{n}$ and $f_2(x) = 0$. We showed that $f_{2n} \to f_2$ pointwise on [0, 1]. Consider the related example.
Example 8.5.2 Suppose $f_n, f : D \to R$, D ⊂ R, and $f_n \to f$ pointwise on D. Suppose that $f_n$ and f are differentiable on D for all n ∈ N. Suppose also that the sequence of derivatives $\{f_n'\}$ converges pointwise on D to $f^*$. Show that it need not be the case that $f' = f^*$, i.e. show that the derivative of the limit need not be the limit of the derivatives.

Solution: Obviously we want to consider the sequence $\{f_{2n}\}$ and the limiting function $f_2$. Clearly each function $f_{2n}$ and $f_2$ is differentiable, with $f_{2n}'(x) = x^{n-1}$ and $f_2'(x) = 0$ for all x ∈ [0, 1]. We know from Example 8.2.1 that $f_{2n}' \to f_1$ pointwise on [0, 1], where $f_1$ is as defined in Examples 8.2.1 and 8.5.1 (the sequence $\{x^{n-1}\}$ has the same pointwise limit as $\{x^n\}$). And clearly $f_1(1) = 1 \ne 0 = f_2'(1)$, so $f_1 \ne f_2'$. Therefore, the limit of the derivatives need not be equal to the derivative of the limit.

Before we get to work we include one more traditional example that shows
the inadequacy of pointwise convergence. Consider the following example.
Example 8.5.3 Define $f_{4n}, f_4 : [0, 2] \to R$ by $f_4(x) = 0$ for x ∈ [0, 2] and

$$f_{4n}(x) = \begin{cases} n^2 x & x \in [0, 1/n] \\ n - n^2 (x - 1/n) & x \in [1/n, 2/n] \\ 0 & \text{elsewhere in } [0, 2] \end{cases}$$

(the "teepee" function whose graph goes from (0, 0) to (1/n, n) to (2/n, 0)). Show that $f_{4n}(x) \to f_4(x)$ for all x ∈ [0, 2] and that $\lim_{n \to \infty} \int_0^2 f_{4n} \ne \int_0^2 f_4$, i.e. the limit of the integrals is not equal to the integral of the limit.

Solution: If we choose any x ∈ (0, 2], it is easy to see that there is an N so that n > N implies $f_{4n}(x) = 0$ (choose N such that N > 2/x). Thus $f_{4n}(x) \to 0 = f_4(x)$ as n → ∞. By definition $f_{4n}(0) = 0$ for all n. Thus $f_{4n}(0) \to 0 = f_4(0)$ as n → ∞. Therefore $f_{4n} \to f_4$ pointwise.

Since $\int_0^2 f_{4n}$ is the area under the teepee, $\int_0^2 f_{4n} = \frac{1}{2} \cdot \frac{2}{n} \cdot n = 1$ for all n. And clearly $\int_0^2 f_4 = 0$. Thus

$$\lim_{n \to \infty} \int_0^2 f_{4n} = \lim_{n \to \infty} 1 = 1 \ne 0 = \int_0^2 f_4.$$
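A numerical sketch of the example (the names are ours), using a simple midpoint rule for the integral:

```python
# The "teepee" function of Example 8.5.3.
def f4n(n, x):
    if 0.0 <= x <= 1.0 / n:
        return n * n * x
    if 1.0 / n < x <= 2.0 / n:
        return n - n * n * (x - 1.0 / n)
    return 0.0

def integral(n, steps=200000):
    """Midpoint rule on [0, 2]; the exact area of the teepee is 1."""
    h = 2.0 / steps
    return sum(f4n(n, (j + 0.5) * h) for j in range(steps)) * h

area = integral(10)
pointwise = [f4n(n, 0.5) for n in (2, 5, 100)]  # spikes, then drops to 0
```

Each integral stays at 1 even as the function values at any fixed x eventually vanish—the mass escapes into an ever taller, thinner spike near 0.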

Thus we see that if we want such properties as (1) the limit of a sequence of
continuous functions is continuous, (2) the limit of the derivatives of a sequence
of functions is equal to the derivative of the limit of the sequence of functions
and (3) the integral of the limit of a sequence of functions is equal to the limit
of the integral of the sequence of functions, we need something stronger than
pointwise convergence. And these are the properties that we want for a variety
of reasons, including the fact that we want the continuity, differentiability and
integrability of power series. For this reason we make the following definition of
the uniform convergence of a sequence of functions.
Definition 8.5.1 Consider the sequence of functions $\{f_n\}$, $f_n : D \to R$, and a function $f : D \to R$ for D ⊂ R. The sequence $\{f_n\}$ is said to converge uniformly to f on D if for every ε > 0 there exists an N ∈ R such that n > N implies that $|f_n(x) - f(x)| < \epsilon$ for all x ∈ D.

The emphasis in the above definition is that the N that is provided must work for all x ∈ D. We see in Figure 8.5.1 that we have drawn an ε neighborhood about the function f. The definition of uniform convergence requires that for n > N, all functions $f_n$ must lie entirely within the ε-tube around f. Consider the following examples.
Example 8.5.4 Consider $\{f_{2n}\}$, $f_2$ defined just prior to Example 8.5.2 and also in Example 8.2.2. Prove that $f_{2n} \to f_2$ uniformly on [0, 1].

Solution: We suppose that we are given an ε > 0. We must find an N that works for all x ∈ [0, 1]. If you know what the plots of the various $f_{2n}$ look like—or if you plot a few of them—you realize that the sequence $\{f_{2n}\}$ converges to $f_2$ most slowly at x = 1, i.e. it is the worst point. Thus we consider the convergence of the sequence $\{f_{2n}(1)\}$ to 0. As we did in our study of sequences, we need $\left| \frac{1}{n} - 0 \right| = \frac{1}{n} < \epsilon$. Thus we see that if we choose N = 1/ε, then n > N implies $\left| \frac{1}{n} - 0 \right| = \frac{1}{n} < \frac{1}{N} = \epsilon$. Therefore $\lim_{n \to \infty} f_{2n}(1) = \lim_{n \to \infty} \frac{1}{n} = 0$.

But more importantly, we now consider the sequence $\{f_{2n}\}$ and $f_2$. If n > N, then

$$|f_{2n}(x) - f_2(x)| = \left| \frac{x^n}{n} - 0 \right| = \frac{x^n}{n} \le \frac{1}{n} < \frac{1}{N} = \epsilon.$$

Notice that this sequence of inequalities holds for all x ∈ [0, 1]. Therefore $f_{2n} \to f_2$ uniformly.
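The "worst point" computation is exactly the supremum-norm criterion: $f_{2n} \to f_2$ uniformly because $\sup_{x \in [0,1]} |f_{2n}(x) - f_2(x)| = 1/n \to 0$. A grid-based sketch (the names are ours):

```python
def sup_error(n, samples=1000):
    """Largest |f_2n(x) - f_2(x)| = x^n/n over a grid on [0, 1]."""
    return max((j / samples) ** n / n for j in range(samples + 1))

errors = [sup_error(n) for n in (1, 10, 100)]  # attained at x = 1: equals 1/n
```

The maximum error is attained at the worst point x = 1 for every n, and it decreases to 0, which is what puts the whole sequence inside any ε-tube eventually.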

Figure 8.5.1: Plot of a function and an ε neighborhood (ε-tube) of the function.

We found the N in the above example by choosing the N associated with x = 1 because it was the "worst point." Another way to describe the approach we used is to compute the maximum of $|f_{2n}(x) - f_2(x)|$. (This was a very nice example because the maximum occurred at x = 1 for all n.) Since this maximum approaches zero, it is clear that for large n, $f_{2n}$ will eventually be in the ε-tube about $f_2$ and stay there. This is a common approach to proving uniform convergence.

There are three basic results concerning uniform convergence of interest to us at this time. As you will see, the results are directly related to Examples 8.5.1, 8.5.2 and 8.5.3.

Proposition 8.5.2 Consider the function f and sequence of continuous func-


tions {fn }, f, fn : D → R for D ⊂ R. If fn → f uniformly on D, then f is
continuous.

Proof: Consider some x0 ∈ D and suppose that we are given an ε > 0. (We
must find a δ such that |x − x0| < δ implies that |f(x) − f(x0)| < ε.)
Since fn → f uniformly on D, we know that there exists N ∈ R such that
n > N implies |f(y) − fn(y)| < ε/3 for all y ∈ D (so it would hold for x0 ∈ D
and any x ∈ D also). Choose some particular n0 > N. Then we know that fn0
is continuous on D, so there exists a δ such that |x − x0| < δ and x ∈ D implies
that |fn0(x) − fn0(x0)| < ε/3. Then |x − x0| < δ and x ∈ D implies that

|f(x) − f(x0)| = |(f(x) − fn0(x)) + (fn0(x) − fn0(x0)) + (fn0(x0) − f(x0))|
≤* |f(x) − fn0(x)| + |fn0(x) − fn0(x0)| + |fn0(x0) − f(x0)|
< ε/3 + ε/3 + ε/3 = ε

where inequality “≤∗ ” follows from two applications of the triangular inequality,
Proposition 1.5.8-(v). Therefore f is continuous at x0 —for any x0 ∈ D, so f is
continuous on D.
If we return to Example 8.5.4, we see that since the functions f2n are con-
tinuous for all n and the fact that the sequence {f2n } converges uniformly to
f2 , by Proposition 8.5.2 we know that the function f2 is continuous—but that’s
pretty easy since we know that f2 (x) = 0 for x ∈ [0, 1].
Next consider the sequence of functions {f1n} and the function f1 used in
Example 8.5.1 (and also in Example 8.2.1). By Proposition 8.5.2 and the fact
that f1 is not continuous, we then know that the sequence of functions {f1n}
does not converge uniformly.
The next result that we consider is the interaction of uniform convergence
and integration—see Example 8.5.3.

Proposition 8.5.3 Consider the functions f, fn : [a, b] → R, a < b and n ∈ N,
where the functions fn are continuous on [a, b] for all n and the sequence {fn}
converges uniformly to f on [a, b].
(a) If we define Fn, F : [a, b] → R by Fn(x) = ∫_a^x fn and F(x) = ∫_a^x f, then
Fn → F uniformly on [a, b].
(b) If we define Fn, F : [a, b] → R by Fn(x) = ∫_x^b fn and F(x) = ∫_x^b f, then
Fn → F uniformly on [a, b].
(c) lim_{n→∞} ∫_a^b fn = ∫_a^b f.

Proof: (a) Suppose ε > 0 is given. Since {fn} converges uniformly to f on
[a, b], we know by Proposition 8.5.2 that the function f is continuous—thus it
is integrable. Define ε1 = ε/(b − a). Since the sequence {fn} converges to f
uniformly, there exists N ∈ R such that n ≥ N implies that |fn(t) − f(t)| < ε1
for all t ∈ [a, b]. Then for any x ∈ [a, b]

|Fn(x) − F(x)| = |∫_a^x fn − ∫_a^x f| = |∫_a^x (fn − f)| ≤* ∫_a^x |fn(t) − f(t)| dt <# ∫_a^x ε1 ≤ (b − a)ε1 = ε    (8.5.1)

where inequality “≤*” follows from Proposition 7.4.6-(a) and inequality “<#”
follows from Proposition 7.4.6-(b). Hence, Fn → F uniformly on [a, b].
(b) The proof of part (b) is nearly identical to the proof of part (a).
(c) If we apply the convergence of {Fn} to F at x = b given in part (a) of this
proposition, we see that lim_{n→∞} ∫_a^b fn = ∫_a^b f.
Of course one of the fast and easy results we get from Proposition 8.5.3 is that
the sequence {f4n} considered in Example 8.5.3—the Teepee functions—does
not converge uniformly to f4 on [0, 2]—otherwise ∫_0^2 f4n = 1 would converge to
∫_0^2 f4 = 0.
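The failure of the integrals to converge can be made concrete numerically. The formula for f4n below is an assumption on our part, chosen to be consistent with the text's description of the Teepee functions (height n at x = 1/n, integral 1); the sketch checks that ∫_0^2 f4n = 1 for every n even though f4n → f4 = 0 pointwise:

```python
def f4n(x, n):
    """An assumed formula for the 'Teepee' functions: linear from (0, 0)
    up to (1/n, n), back down to (2/n, 0), and 0 on [2/n, 2]."""
    if x <= 1.0 / n:
        return n * n * x
    if x <= 2.0 / n:
        return n * n * (2.0 / n - x)
    return 0.0

def trapezoid(g, nodes):
    """Trapezoid rule; exact for piecewise-linear g whose kinks are nodes."""
    return sum((nodes[i + 1] - nodes[i]) * (g(nodes[i]) + g(nodes[i + 1])) / 2.0
               for i in range(len(nodes) - 1))

for n in (2, 10, 100):
    # grid on [0, 2] containing the kink points 1/n and 2/n
    nodes = sorted(set([2.0 * i / 10000 for i in range(10001)] + [1.0 / n, 2.0 / n]))
    integral = trapezoid(lambda x: f4n(x, n), nodes)
    # the integral stays 1 for every n, even though f4n -> 0 pointwise
    assert abs(integral - 1.0) < 1e-9
```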
And finally, we consider our last result involving uniform convergence. Suppose that fn → f—some sort of convergence. There are many times that we
would like to be able to obtain the derivative of f by taking the limit of the
sequence of derivatives {fn′}. We state the following proposition.
Proposition 8.5.4 Consider the sequence of functions, {fn}, fn : [a, b] → R,
a < b and n ∈ N, where each fn is continuously differentiable on [a, b]. Suppose
there exists some x0 ∈ [a, b] such that {fn(x0)} converges and the sequence of
functions {fn′} converges uniformly on [a, b]. Then the sequence of functions
{fn} converges uniformly on [a, b], say to the function f, the function f is
differentiable on [a, b] and f′(x) = lim_{n→∞} fn′(x) for all x ∈ [a, b].

Proof: Let g be such that fn′ → g uniformly. Since each fn′ is continuous on
[a, b] and fn′ → g uniformly, then g is continuous on [a, b]. Consider the sequence
{fn′} on [x0, b]. Clearly the sequence {fn′} converges uniformly to g on [x0, b].
Then by Proposition 8.5.3-(a)

lim_{n→∞} ∫_{x0}^x fn′ = ∫_{x0}^x g,    (8.5.2)

and the convergence of {∫_{x0}^x fn′} to ∫_{x0}^x g is uniform on [x0, b]. Also by the
Fundamental Theorem, Theorem 7.5.4,

∫_{x0}^x fn′ = fn(x) − fn(x0).    (8.5.3)

Combining equations (8.5.2) and (8.5.3) gives lim_{n→∞} [fn(x) − fn(x0)] = ∫_{x0}^x g.
Since we know that lim_{n→∞} fn(x0) exists, we can add lim_{n→∞} fn(x0) to the last
expression and get

lim_{n→∞} fn(x) = ∫_{x0}^x g + lim_{n→∞} fn(x0).    (8.5.4)

Thus the sequence {fn(x)} converges for each x ∈ [x0, b]. Because the convergence of {∫_{x0}^x fn′} to ∫_{x0}^x g is uniform, the convergence of {fn(x)} is uniform on
[x0, b]. Denote this limit by f, i.e. f(x) = ∫_{x0}^x g + lim_{n→∞} fn(x0).
By Proposition 7.5.2 we see that f is differentiable and f′(x) = g(x) (the
derivative of the limit term is zero), i.e. f′(x) = lim_{n→∞} fn′(x) for x ∈ [x0, b].
If we essentially repeat the above proof, this time applying Proposition 8.5.3-(b)
instead of part (a) (where we obtained equation (8.5.2)), we find that f′(x) =
lim_{n→∞} fn′(x) for x ∈ [a, x0]. If we combine these results, we get the desired result
on [a, b].
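Proposition 8.5.4 can be illustrated numerically. The functions below are our own illustrative choice, not from the text: fn(x) = x² + sin(nx)/n², whose derivatives fn′(x) = 2x + cos(nx)/n converge uniformly to 2x, so fn converges uniformly to f(x) = x² and f′ = lim fn′:

```python
import math

# Our own example (not from the text) satisfying the hypotheses of
# Proposition 8.5.4: fn(x) = x**2 + sin(n*x)/n**2 on [-1, 1], so
# fn'(x) = 2*x + cos(n*x)/n converges uniformly to 2*x, and the limit
# f(x) = x**2 indeed has f'(x) = 2*x = lim fn'(x).
xs = [i / 1000.0 - 1.0 for i in range(2001)]   # grid on [-1, 1]

def fn(x, n):
    return x * x + math.sin(n * x) / n**2

def dfn(x, n):
    return 2.0 * x + math.cos(n * x) / n

n = 1000
# sup-norm distances are bounded by 1/n**2 and 1/n respectively
assert max(abs(fn(x, n) - x * x) for x in xs) <= 1.0 / n**2 + 1e-15
assert max(abs(dfn(x, n) - 2.0 * x) for x in xs) <= 1.0 / n + 1e-15
```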
Thus we see by Propositions 8.5.2–8.5.4 that if we want the limit of a sequence of
functions to inherit certain properties of the sequence, we need uniform convergence.
Earlier we used Proposition 8.5.2 to show that the sequence {f1n} does not
converge uniformly and Proposition 8.5.3 to show that the sequence of functions
{f4n} does not converge uniformly. The proofs are completely rigorous, but they
are sort of cheating.
Of course the fact that these sequences do not converge uniformly can be
proved using the definition of uniform convergence, Definition 8.5.1. To prove
that a sequence does not converge uniformly we must show that there is at least
one ε > 0 so that for all N ∈ R there will be an n > N and at least one x0 ∈ D for
which |fn(x0) − f(x0)| ≥ ε.
For example consider {f4n} and choose ε = 1/2. The maximum value of
|f4n(x) − f4(x)| occurs at x = 1/n and equals n for every n. For every N ∈ R
and any n > N, x0 = 1/n ∈ [0, 2] is a point such that |f4n(x0) − f4(x0)| =
n ≥ 1 > ε. Therefore the sequence {f4n} does not converge uniformly to f4.
Likewise, if we next consider the sequence {f1n} and choose ε = 1/2, for any
N ∈ R we must find an n > N and x0 ∈ [0, 1] so that |f1n(x0) − f1(x0)| ≥ ε. If
you plot the function y = x^n for a few n’s, you will see that the point is going
to occur near x = 1 (but surely not at x = 1). Let N ∈ R (any such N) and
suppose n is any integer greater than N. We need to find x0 < 1 such that
|x0^n − 0| = x0^n ≥ 1/2, or taking the n-th root of both sides (realizing that the
n-th root function is increasing) gives x0 ≥ (1/2)^{1/n}. So we could surely choose
x0 = (1 + (1/2)^{1/n})/2 and see that f1n ↛ f1 uniformly on [0, 1].
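The choice of x0 made above is easy to sanity-check numerically; the short sketch below (our own check, not from the text) verifies that x0 lies strictly between (1/2)^{1/n} and 1, so that x0^n ≥ 1/2:

```python
# Sanity check of the choice x0 = (1 + (1/2)**(1/n))/2 made above:
# x0 lies strictly between (1/2)**(1/n) and 1, so x0**n >= 1/2.
for n in (5, 50, 500):
    x0 = (1.0 + 0.5**(1.0 / n)) / 2.0
    assert 0.5**(1.0 / n) < x0 < 1.0
    assert x0**n > 0.5          # hence |f1n(x0) - f1(x0)| >= 1/2
```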

We notice that proving that a sequence of functions does not converge uni-
formly is not easy—but generally showing that any type of limit does not exist
is not easy.
Before we leave we include one more approach to proving uniform conver-
gence. Recall that when we studied convergence of sequences, we included the
8.5 Uniform Convergence 247

Cauchy criterion for convergence of a sequence, Definition 3.4.9 and Proposition


3.4.11. We begin with the following definition of a uniform Cauchy criterion.
Definition 8.5.5 Consider a sequence of functions {fn}, fn : D → R for D ⊂
R. The sequence {fn} is said to be a uniform Cauchy sequence if for every
ε > 0 there exists an N ∈ R such that n, m ∈ N and n, m > N implies that
|fn(x) − fm(x)| < ε for all x ∈ D.

Hopefully it is clear that, as with the Cauchy criterion for sequences, the
advantage of using the uniform Cauchy criterion comes when you really don’t know
the limiting function. Also, as was the case with the Cauchy criterion for sequences, our major application for the uniform Cauchy criterion will be when
we use it to show uniform convergence of series. We do need the convergence
result—analogous to Proposition 3.4.11.
Proposition 8.5.6 Consider a sequence of functions {fn}, fn : D → R for
D ⊂ R. The sequence {fn} converges uniformly on D to some function f,
f : D → R, if and only if the sequence is a uniform Cauchy sequence.

Proof: (⇒) We suppose that the sequence {fn} converges uniformly to f on
D and suppose that ε > 0 is given. Then we know that there exists N ∈ R such
that n > N implies |fn(x) − f(x)| < ε/2 for all x ∈ D. Then m, n > N implies
that

|fn(x) − fm(x)| = |(fn(x) − f(x)) + (f(x) − fm(x))| ≤ |fn(x) − f(x)|
+ |f(x) − fm(x)| < ε/2 + ε/2 = ε

for all x ∈ D. Thus the sequence {fn} is a uniform Cauchy sequence.


(⇐) If the sequence {fn} is a uniform Cauchy sequence on D, then for each
x ∈ D the sequence {fn(x)} is a Cauchy sequence. Then by Proposition 3.4.11
the sequence {fn(x)} converges—call this limit f(x). Let ε > 0 be given. Since
{fn} is a uniform Cauchy sequence, there exists an N ∈ R such that n, m > N
implies |fn(x) − fm(x)| < ε/2 for all x ∈ D. If we let m → ∞, then we have
|fn(x) − f(x)| ≤ ε/2 < ε for all x ∈ D. Therefore the sequence {fn} converges
uniformly to f.
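As a concrete illustration of Definition 8.5.5 (our own example, not from the text), the partial sums of Σ sin(kx)/k² form a uniform Cauchy sequence: the sup over x of |sn(x) − sm(x)| is bounded by a tail of the convergent series Σ 1/k², independently of x. A numerical sketch:

```python
import math

# Our own illustrative example (not from the text): partial sums of
# sum_{k>=1} sin(k*x)/k**2 on [0, 2*pi] form a uniform Cauchy sequence.
xs = [2.0 * math.pi * i / 1000 for i in range(1001)]

def s(n, x):
    """n-th partial sum at the point x."""
    return sum(math.sin(k * x) / k**2 for k in range(1, n + 1))

m, n = 20, 40
# sup_x |s_n(x) - s_m(x)| is bounded by the tail sum of 1/k**2,
# uniformly in x (Definition 8.5.5)
sup_diff = max(abs(s(n, x) - s(m, x)) for x in xs)
tail = sum(1.0 / k**2 for k in range(m + 1, n + 1))
assert sup_diff <= tail + 1e-12
```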
This section gives only a brief introduction to uniform convergence of se-
quences. There are other versions of the basic theorems, Propositions 8.5.2–
8.5.4, with weaker hypotheses (stronger results with more difficult proofs). Uni-
form convergence is an important enough concept to deserve more space and
work—but not in this text. We have tried to give you enough so that in the
next section we can go on and discuss the uniform convergence of power series
and the resulting power series results.

HW 8.5.1 (True or False and why)


(a) Suppose f, fn : D → R, D ⊂ R, are such that fn is continuous on D for all
n, f is continuous on D and fn → f pointwise on D. Then fn → f uniformly
on D.

(b) Suppose f, fn : D → R, D ⊂ R, are such that fn → f uniformly on D, and


c ∈ R. Then cfn → cf uniformly on D.
(c) Suppose f, fn : D → R, D ⊂ R, are such that fn → f . It may be the case
that {fn } does not converge to f pointwise on D.
(d) Suppose f, fn : D → R, D ⊂ R, are such that fn → f uniformly on D.
Then |fn | → |f | uniformly on D.
(e) Suppose the sequence fn : D → R, D ⊂ R, is uniformly Cauchy on D and
each function fn is differentiable on D. Then the sequence {fn0 } is uniformly
Cauchy on D.

HW 8.5.2 Consider the following sequences of functions. Find the pointwise


limit of the sequences on their domains and determine whether the convergence
is uniform.
(a) fn(x) = nx/(1 + n²x²), x ∈ [0, 1]    (b) fn(x) = x/(nx + 1), x ∈ [0, 1]
(c) fn(x) = e^{−nx²}, x ∈ R    (d) fn(x) = x/(1 + nx²), x ∈ R

HW 8.5.3 Suppose f, fn, g, gn : D → R, D ⊂ R, are such that fn → f and
gn → g uniformly on D. Prove that for α, β ∈ R, αfn + βgn → αf + βg uniformly
on D.

HW 8.5.4 Consider f, fn : [0, 1] → R defined by fn(x) = (1 − x²)^n and f(x) =
0 for x ∈ [0, 1]. Compute ∫_0^1 fn and ∫_0^1 f. Discuss these results.

HW 8.5.5 (a) Suppose f, fn : D → R, D ⊂ R, are such that fn → f uniformly


on D and each function fn is bounded on D. Prove that f is bounded on D.
(b) Find a sequence of functions {fn } and function f all with domain D such
that fn → f pointwise on D, each function fn is bounded on D, but f is not
bounded on D.

8.6 Uniform Convergence of Series



We now want to return to series of functions of the form Σ_{i=1}^∞ fi(x). Hopefully it
should be reasonably clear that the uniform convergence of series of functions
will be a special case of the uniform convergence of sequences of functions. In
spite of this, for emphasis, we set the partial sum of the series to be sn(x) =
Σ_{i=1}^n fi(x) and make the following definition.

Definition 8.6.1 Consider the sequence of functions {fi(x)} where for each
i ∈ N, fi : D → R, D ⊂ R. If the sequence of partial sums, {sn(x)}, converges
uniformly on D, say to s(x), then the series Σ_{i=1}^∞ fi(x) converges uniformly to
s(x).

Earlier we saw that methods for proving convergence of sequences were not
especially useful for proving convergence of series. Likewise, the methods of
proving uniform convergence of sequences of functions aren’t very useful for
proving the uniform convergence of a series of functions. There is one excellent
result that we will use, the Weierstrass test for uniform convergence. The Weier-
strass test is to uniform convergence of series of functions that the comparison
test is to convergence of real series. For that reason before we state and prove
the Weierstrass Theorem, we prove the following proposition which also defines
the uniform Cauchy criterion for a series.

Proposition 8.6.2 (Cauchy Criterion for Series of Functions) Consider
the sequence of functions {fi} where for each i ∈ N, fi : D → R, D ⊂ R.
The series Σ_{i=1}^∞ fi(x) converges uniformly on D if and only if for every ε > 0
there exists N ∈ R such that m, n ∈ N, m ≥ n and m, n > N implies that
|Σ_{i=n}^m fi(x)| < ε for all x ∈ D.

Proof: This result follows directly from Proposition 8.5.6. Let sn(x) =
Σ_{i=1}^n fi(x). The series Σ_{i=1}^∞ fi(x) converges uniformly if and only if the sequence
{sn} converges uniformly. Also sm(x) − sn−1(x) = Σ_{i=n}^m fi(x). Thus the sequence
{sn} is a uniform Cauchy sequence if and only if the sequence {fi} satisfies: for
every ε > 0 there exists N ∈ R such that m, n ∈ N, m ≥ n and m, n > N
implies that |Σ_{i=n}^m fi(x)| < ε for all x ∈ D.
The result then follows from Proposition 8.5.6. (Again you should realize
that replacing n by n − 1 in the Cauchy criterion for {sn} does not cause any
problem.)
We now proceed with the following theorem.

Theorem 8.6.3 (Weierstrass Theorem-Weierstrass Test) Suppose that
Σ_{i=1}^∞ Mi is a convergent series of nonnegative numbers. Suppose further that
{fk}_{k=1}^∞ is a sequence of functions, fk : D → R, D ⊂ R and k ∈ N, such that
|fk(x)| ≤ Mk for x ∈ D and k ∈ N. Then Σ_{i=1}^∞ fi(x) converges absolutely for
each x ∈ D and converges uniformly on D.

Proof: Since |fk(x)| ≤ Mk for each x ∈ D and k ∈ N, the series Σ_{i=1}^∞ |fi(x)|
converges for each x ∈ D by the comparison test, Proposition 8.3.8. Thus the
series Σ_{i=1}^∞ fi(x) converges absolutely for each x ∈ D.
Define sn, s and mn by sn(x) = Σ_{i=1}^n fi(x), s(x) = Σ_{i=1}^∞ fi(x) and mn = Σ_{i=1}^n Mi.
Let ε > 0 be given. Since the series Σ_{i=1}^∞ Mi converges, we know by Proposition
8.2.3 that the series satisfies the Cauchy criterion, i.e. there exists N ∈ R such
that m, n ∈ N, m ≥ n and m, n > N implies that Σ_{i=n}^m Mi < ε.
Suppose m, n > N and m ≥ n. Then

|sm(x) − sn−1(x)| = |Σ_{i=n}^m fi(x)| ≤* Σ_{i=n}^m |fi(x)| ≤ Σ_{i=n}^m Mi < ε

for all x ∈ D, where inequality “≤*” is due to repeated applications of the
triangular inequality. Therefore the series Σ_{i=1}^∞ fi(x) satisfies the uniform Cauchy
criterion on D and hence converges uniformly on D.
Applying the Weierstrass test to power series we obtain the following result.

Proposition 8.6.4 Suppose that the radius of convergence of the power series
Σ_{i=0}^∞ ai x^i is R and R0 is any value such that 0 < R0 < R. Then the power series
converges uniformly on [−R0, R0].

Proof: Let r be some value such that R0 < r < R. Then by the definition
of radius of convergence, Definition 8.4.4, the power series Σ_{i=0}^∞ ai r^i converges
absolutely. For any x ∈ [−R0, R0], |ak x^k| ≤ |ak r^k|. By the Weierstrass Theorem
the power series Σ_{i=0}^∞ ai x^i converges uniformly on [−R0, R0].
You should realize that the above result shows us that power series are very
nice. When they do converge, they just about always converge uniformly—
except possibly at ±R. Since we know that the series may not converge at ±R,
we cannot make a stronger statement. That is surely nicer than most sequences
and series.
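For the geometric series this uniform convergence on [−R0, R0] is completely explicit; the sketch below (our own numerical illustration, not from the text) checks that on [−R0, R0] with R0 < R = 1 the sup-norm error of the n-th partial sum of Σ x^i = 1/(1 − x) is at most R0^{n+1}/(1 − R0):

```python
# A numerical sketch of Proposition 8.6.4 for the geometric series
# sum x**i = 1/(1 - x) (R = 1): on [-R0, R0] the sup-norm error of the
# n-th partial sum is at most R0**(n+1)/(1 - R0).
R0 = 0.5
xs = [R0 * (i / 500.0 - 1.0) for i in range(1001)]   # grid on [-R0, R0]

def partial(n, x):
    return sum(x**i for i in range(n + 1))

for n in (5, 10, 20):
    sup_err = max(abs(partial(n, x) - 1.0 / (1.0 - x)) for x in xs)
    # remainder is x**(n+1)/(1 - x), bounded by R0**(n+1)/(1 - R0)
    assert sup_err <= R0**(n + 1) / (1.0 - R0) + 1e-12
```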
We are now ready to apply the fact that power series converges uniformly to
obtain the properties we developed for sequences in the last section. The first
is very easy. Since each of the terms in the power series is continuous and the
convergence is uniform on any closed interval contained in (−R, R), we obtain
continuity on the interval [−R0 , R0 ] for any 0 < R0 < R.


Proposition 8.6.5 Consider the power series Σ_{i=0}^∞ ai x^i with radius of convergence R. The function f : (−R, R) → R defined by f(x) = Σ_{i=0}^∞ ai x^i is continuous
on (−R, R).

We next want to differentiate a power series term by term. To see when
and if this is possible, we return to Proposition 8.5.4. We see that we easily
satisfy the hypothesis that the sequence of partial sums {sn} converges at a
point—we’ll only consider differentiating the series in (−R, R) and the series
converges on all of (−R, R). We also easily satisfy the hypothesis that each sn
is continuous. The difficulty is satisfying the hypothesis that the sequence of
derivatives of partial sums, {sn′}, converges uniformly. We want the derivative
of the series to be Σ_{i=1}^∞ i ai x^{i−1}. Thus we must show that this “derivative” power
series converges uniformly.

Proposition 8.6.6 Consider the power series Σ_{i=0}^∞ ai x^i with radius of convergence R. The function f : (−R, R) → R defined by f(x) = Σ_{i=0}^∞ ai x^i is differentiable on (−R, R), f′(x) = Σ_{i=1}^∞ i ai x^{i−1} and the radius of convergence of the
series for f′ is at least R.

Proof: As we said in our introduction to this proposition, we wish to apply
Proposition 8.5.4 to the sequence of functions {sn} where sn(x) = Σ_{i=0}^n ai x^i. Let
R0 ∈ R be such that 0 < R0 < R. We discussed how we easily satisfy the
hypotheses that sn must be continuous for each n and that there must exist
one point x0 ∈ [−R0, R0] for which {sn(x0)} converges—we know that {sn}
converges absolutely on (−R, R).
Claim: Σ_{i=1}^∞ i ai x^{i−1} converges absolutely for all x ∈ (−R, R). For r such
that 0 < r < R we know that Σ_{i=0}^∞ ai r^i is convergent. Consider any x such
that |x| < r, set Ai = i ai x^{i−1} and Bi = ai r^i, i = 1, · · · . We apply the limit
comparison test, Proposition 8.3.9-(b). Note that

lim_{i→∞} Ai/Bi = lim_{i→∞} i ai x^{i−1}/(ai r^i) = lim_{i→∞} (i/r)(x/r)^{i−1} = 0.

(To see that this last limit is zero let ρ be such that 0 < ρ < 1 and consider the
limit lim_{y→∞} yρ^y. Write this limit as lim_{y→∞} y/ρ^{−y} and apply L’Hospital’s rule, the
x → ∞ version of Proposition 6.4.4-(c), or HW 6.4.1-(d).)
Thus by the limit comparison test the series Σ_{i=1}^∞ i ai x^{i−1} converges absolutely
for |x| < r for any r, 0 < r < R—and since this holds true for any r < R, the
radius of convergence of Σ_{i=1}^∞ i ai x^{i−1} is at least R.
Therefore we know that the series Σ_{i=1}^∞ i ai x^{i−1} converges uniformly on
[−R0, R0] for any R0 < R, and by Proposition 8.5.4 we know that the function
f is differentiable and f′(x) = Σ_{i=1}^∞ i ai x^{i−1}. And since this is true for any R0,
0 < R0 < R, the function f is differentiable on (−R, R).
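Term-by-term differentiation is easy to watch in action for the exponential series. The sketch below (our own illustration, not from the text) sums Σ x^i/i! and its term-by-term derivative Σ (i+1)a_{i+1} x^i; both partial sums agree with e^x, since (e^x)′ = e^x:

```python
import math

def series(coef, x, N):
    """Partial sum of sum_{i<N} coef(i) * x**i."""
    return sum(coef(i) * x**i for i in range(N))

a = lambda i: 1.0 / math.factorial(i)    # Maclaurin coefficients of exp(x)
da = lambda i: (i + 1) * a(i + 1)        # term-by-term derivative: (i+1) a_{i+1}

x = 0.7
# both the series and its differentiated series sum to exp(x)
assert abs(series(a, x, 30) - math.exp(x)) < 1e-12
assert abs(series(da, x, 30) - math.exp(x)) < 1e-12
```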
Notice that we do not prove that the radius of convergence of the differen-
tiated power series is R—we proved that it was at least R. We will come back
in Proposition 8.6.10 and show that the radius of convergence of the derivative
power series is actually R.
Once we know that we can always differentiate a power series and that the
derivative series converges too, we know that we can do it again. We obtain the
following corollary.

Corollary 8.6.7 Consider the power series Σ_{i=0}^∞ ai x^i with radius of convergence
R. The function f : (−R, R) → R defined by f(x) = Σ_{i=0}^∞ ai x^i has derivatives of
all orders on (−R, R) and ai = f^{(i)}(0)/i!.
Of course the above result follows from applying Proposition 8.6.6 many times
and evaluating the result at x = 0 (which we can do by Proposition 8.6.5). We
should realize that the above proposition gives us the very nice result that when
a power series is convergent, so that it defines a function f, the power series is
the Maclaurin series for the function f. Using this result we also obtain the following
corollary.

Corollary 8.6.8 Consider the power series Σ_{i=0}^∞ ai x^i and Σ_{i=0}^∞ bi x^i, both of which
converge on (−r, r) for some r, 0 < r. If Σ_{i=0}^∞ ai x^i = Σ_{i=0}^∞ bi x^i for x ∈ (−r, r),
then ak = bk for k = 0, 1, 2, · · · .
These are very nice results when it comes to differentiating power series. In
the next result we see that we obtain an analogous result concerning integration
of power series.

Proposition 8.6.9 Consider the power series Σ_{i=0}^∞ ai x^i with radius of convergence R. The function f : (−R, R) → R defined by f(x) = Σ_{i=0}^∞ ai x^i is integrable on any closed interval contained in (−R, R), for any x ∈ (−R, R)
∫_0^x f = Σ_{i=0}^∞ (ai/(i + 1)) x^{i+1}, and the radius of convergence of the series for ∫_0^x f
is R.

Proof: Let x be such that x ∈ (−R, R) and suppose that |x| < R0 < R. If we
use the fact that the power series Σ_{i=0}^∞ ai x^i converges uniformly on [−R0, R0],
Proposition 8.5.3 implies that ∫_0^x f = lim_{n→∞} ∫_0^x Σ_{i=0}^n ai t^i dt = lim_{n→∞} Σ_{i=0}^n (ai/(i + 1)) x^{i+1},
which gives the desired result. We note that since this is
true for any x ∈ (−R, R), the radius of convergence of Σ_{i=0}^∞ (ai/(i + 1)) x^{i+1} is at least
R.
Suppose now that the radius of convergence of the power series Σ_{i=0}^∞ (ai/(i + 1)) x^{i+1}
were greater than R, say R* > R. Because the derivative of the series
Σ_{i=0}^∞ (ai/(i + 1)) x^{i+1} is Σ_{i=0}^∞ ai x^i, Proposition 8.6.6 implies that the radius of convergence
of Σ_{i=0}^∞ ai x^i is at least R*. But this is a contradiction to the hypotheses of the
proposition. Thus the radius of convergence of the power series Σ_{i=0}^∞ (ai/(i + 1)) x^{i+1}
is R.
Notice that we were careful in the last part of the above proof and used the
fact that the radius of convergence of the differentiated power series is at least
as large as the radius of convergence of the original series. That’s all we had and
it was all we needed. If we return to Proposition 8.6.6 and use the same type of
argument used in the last part of the above proof, the fact that the integral of
the series Σ_{i=1}^∞ i ai x^{i−1} is Σ_{i=1}^∞ ai x^i and Proposition 8.6.9, we obtain the following
result—surely the radius of convergence doesn’t care about the constant term.

Proposition 8.6.10 Consider the power series Σ_{i=0}^∞ ai x^i with radius of convergence R. The radius of convergence of the series Σ_{i=1}^∞ i ai x^{i−1} is R.

Note that in Proposition 8.6.9 we integrated the power series from 0 to x—
we did so because the result looks nicer that way. We could have integrated
the function f on any interval [a, b] ⊂ (−R, R). Also, notice that the radius of
convergence of the differentiated and integrated series is the same as that of the
original series. We mention specifically however that the convergence of these
series may differ at the points x = ±R—you might want to find examples that
will illustrate this.
There are many applications of Propositions 8.6.5-8.6.9. An example that
we alluded to earlier is when power series are used to find the solutions to
differential equations. This is not a very popular topic in differential equation
courses anymore but is still important. The power series solution is found by
inserting a general power series into the differential equation and combining like
terms. After the power series solution is computed it is really nice to know that
the series is in fact a solution in that it can be differentiated the appropriate
number of times, etc.
Other, more fun, applications of Propositions 8.6.6 and 8.6.9 arise when we use
differentiation and integration of known power series to find other power series.
For example, we know that 1/(x² + 1) = 1 − x² + x⁴ − x⁶ + · · · (it’s a geometric
series) and the radius of convergence is R = 1. Hence by integrating both
sides from 0 to x we see that tan⁻¹ x = x − x³/3 + x⁵/5 − x⁷/7 + · · ·—the radius of
convergence of this resulting series is also R = 1.
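The integrated series above can be checked numerically; the sketch below (our own check, not from the text) compares a partial sum of the arctangent series with the library arctangent at a point inside (−1, 1):

```python
import math

# Sketch: partial sums of the arctangent series above approach atan(x)
# for |x| < 1 (here x = 0.5, 30 terms).
x = 0.5
s = sum((-1)**k * x**(2 * k + 1) / (2 * k + 1) for k in range(30))
assert abs(s - math.atan(x)) < 1e-12
```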
And finally, we now return to the Maclaurin series considered in Example
8.4.1, where we considered the convergence of the Maclaurin series of the function
f(x) = ln(x + 1). Sad to say, the results we have obtained since that time do
not make it easier to prove this result. We can prove that the series converges
to ln(x + 1). However, the methods we shall use, though basic, are not clearly
applicable to other functions and series.

Example 8.6.1 Prove that the series Σ_{k=1}^∞ ((−1)^{k+1}/k) x^k converges to f(x) = ln(x + 1) on
(−1, 1].
Solution: We know that we can write f(x) = Tn(x) + Rn(x) where Tn(x) = Σ_{k=1}^n ((−1)^{k+1}/k) x^k
and Rn(x) = (−1)^n ∫_0^x (x − t)^n/(1 + t)^{n+1} dt. Clearly, if we can show that Rn(x) → 0 for x ∈ (−1, 1]
as n → ∞, then we will have proved our result.
Case 1: 0 ≤ x ≤ 1: We see that

|Rn(x)| = |∫_0^x (x − t)^n/(1 + t)^{n+1} dt| ≤* ∫_0^x (x − t)^n dt = x^{n+1}/(n + 1) → 0

where inequality “≤*” follows from Proposition 7.4.5.
Case 2: −1 < x < 0: For −1 < x < 0 we have

|Rn(x)| = |∫_0^x (x − t)^n/(1 + t)^{n+1} dt| = ∫_x^0 ((t − x)/(1 + t))^n (1/(1 + t)) dt.

Since ((t − x)/(1 + t))^n (1/(1 + t)) ≥ 0, we can apply the Mean Value Theorem for Integrals, Theorem
7.5.8, to get

|Rn(x)| = ∫_x^0 ((t − x)/(1 + t))^n (1/(1 + t)) dt = ((tn − x)/(1 + tn))^n (1/(1 + tn)) ∫_x^0 dt = (−x)((tn − x)/(1 + tn))^n (1/(1 + tn))

where tn ∈ [x, 0]—we emphasize that tn does depend on n.
Since x ≤ tn ≤ 0 implies that 1 + x ≤ 1 + tn ≤ 1 and 0 < 1 + x, and |x| = −x, we see that

(−x)((tn − x)/(1 + tn))^n (1/(1 + tn)) ≤ ((tn + |x|)/(1 + tn))^n (|x|/(1 + x)) < ((tn|x| + |x|)/(1 + tn))^n (|x|/(1 + x)) = |x|^{n+1}/(1 + x).

Thus |Rn(x)| < |x|^{n+1}/(1 + x) → 0 as n → ∞.

Therefore in both cases Rn(x) → 0 as n → ∞, so f(x) = ln(x + 1) = Σ_{k=1}^∞ ((−1)^{k+1}/k) x^k for
x ∈ (−1, 1].
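The remainder bounds from the two cases above can be tested numerically. The sketch below (our own check, not from the text) compares ln(1 + x) with the partial sums Tn(x) and verifies that |Rn(x)| lies under the Case 1 bound x^{n+1}/(n + 1) for x ≥ 0 and the Case 2 bound |x|^{n+1}/(1 + x) for x < 0:

```python
import math

def T(n, x):
    """Partial sum of the Maclaurin series of ln(1 + x)."""
    return sum((-1)**(k + 1) * x**k / k for k in range(1, n + 1))

def bound(n, x):
    """Remainder bounds from the two cases of Example 8.6.1."""
    if x >= 0:
        return x**(n + 1) / (n + 1)          # Case 1
    return abs(x)**(n + 1) / (1.0 + x)       # Case 2

for x in (0.9, 0.5, -0.5, -0.9):
    for n in (10, 30):
        Rn = math.log(1.0 + x) - T(n, x)
        assert abs(Rn) <= bound(n, x) + 1e-12
```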

One important fact that we should emphasize is that the series converges
to ln(x + 1) on only (−1, 1]—we found as a part of Example 8.4.1 that the
radius of convergence of the series Σ_{k=1}^∞ ((−1)^{k+1}/k) x^k is R = 1, i.e. the series diverges
for |x| > 1. We notice that even though the function ln(x + 1) is defined for
x ∈ (−1, ∞), the Maclaurin series doesn’t converge on the (1, ∞) part of the
domain. This is just a fact of power series—and a Maclaurin series is a power
series—that they always converge only on a symmetric interval (−R, R)—and
maybe the endpoints. We cannot do better.
In Proposition 8.4.1 we found a tool for proving that Taylor series-Maclaurin
series converged to the function that generated the series. This result works well
for a variety of functions, exp, sine, cosine, etc. We saw in Example 8.4.1 that
Proposition 8.4.1 will not work for all functions—specifically for f (x) = ln(x+1).
We were able to prove that the Maclaurin series for ln(x + 1) will converge to
ln(x + 1) on (−1, 1], but we really used ad hoc methods—methods that will not
necessarily carry over to other examples. There are no methods that will work
for all Taylor series-Maclaurin series. To illustrate how bad it can really be,
consider the following very important example.
Example 8.6.2 Consider the function f defined by f(x) = e^{−1/x²} if x ≠ 0 and f(x) = 0 if
x = 0.
Find the Maclaurin series of f —if it exists—and determine for which values of x the series
converges, and for which values of x the series converges to f (x).
Solution: To determine the Maclaurin series of f we begin by computing the derivatives at
x = 0. Note that each of the equalities “=*” follows by L’Hospital’s rule.
f′(0) = lim_{h→0} (f(h) − f(0))/(h − 0) = lim_{h→0} e^{−1/h²}/h = lim_{h→0} h^{−1}/e^{1/h²} =* lim_{h→0} (−h^{−2})/(−2h^{−3}e^{1/h²}) = (1/2) lim_{h→0} h/e^{1/h²} = 0

f″(0) = lim_{h→0} (f′(h) − f′(0))/(h − 0) = lim_{h→0} 2h^{−3}e^{−1/h²}/h = 2 lim_{h→0} h^{−4}/e^{1/h²} =* 2 lim_{h→0} (−4h^{−5})/(−2h^{−3}e^{1/h²}) = 4 lim_{h→0} h^{−2}/e^{1/h²} =* 4 lim_{h→0} (−2h^{−3})/(−2h^{−3}e^{1/h²}) = 4 lim_{h→0} 1/e^{1/h²} = 0

f‴(0) = lim_{h→0} (−6 e^{−1/h²}/h⁵ + 4 e^{−1/h²}/h⁷) = · · · = 0
We don’t know how you feel about it but we’re tired of these computations by now. It should
be reasonably clear that all derivatives of f evaluated at x = 0 will involve one or more limits
of the form lim_{h→0} h^{−k} e^{−1/h²}. Hopefully the above computations convince you that all of these
limits can be computed and are equal to zero. How would we prove this? To prove that the
particular limits lim_{h→0} h^{−k} e^{−1/h²} are zero we must use mathematical induction—for even and
odd k separately—but we don’t really want to do that here.
Thus we see that f^{(k)}(0) = 0 for k = 0, 1, 2, · · · . Hence the Maclaurin series expansion
Σ_{k=0}^∞ (f^{(k)}(0)/k!) x^k exists and is the identically zero series, and we see f(x) ≠ Σ_{k=0}^∞ (f^{(k)}(0)/k!) x^k for
all x ≠ 0.
The function used in this example is clearly a non-standard function. Plot it in various
neighborhoods of x = 0 to see what it looks like. However, the example does show that if you
compute a Maclaurin series-Taylor series expansion, you do not necessarily get the original
function back again.
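The extreme flatness of this function at 0 is visible even in a crude numerical experiment. The sketch below (our own check, not from the text) shows that the scaled difference quotients f(h)/h^k are already astronomically small for moderate h, consistent with f^{(k)}(0) = 0 for every k:

```python
import math

def f(x):
    """The function of Example 8.6.2."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

# f is so flat at 0 that every scaled difference quotient f(h)/h**k is
# already astronomically small for moderate h.
for h in (0.1, 0.05):
    for k in (1, 2, 3):
        assert abs(f(h) / h**k) < 1e-30
```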


HW 8.6.1 (a) Prove that the series Σ_{n=0}^∞ x^n converges uniformly to 1/(1 − x) on
[−R, R] for 0 < R < 1.
(b) Discuss whether or not the series Σ_{n=0}^∞ x^n converges uniformly to 1/(1 − x) on
(−1, 1).

HW 8.6.2 (a) Prove that the series Σ_{n=0}^∞ (−1)^n x^n converges uniformly to 1/(1 + x)
on [−R, R] for 0 < R < 1.
(b) Discuss whether or not the series Σ_{n=0}^∞ (−1)^n x^n converges uniformly to 1/(1 + x)
on (−1, 1).

HW 8.6.3 (a) Show that 1/(1 + x²) = Σ_{n=0}^∞ (−1)^n x^{2n} has a radius of convergence
of R = 1.
(b) Use part (a) to determine the power series expansion of tan⁻¹ x—and the
radius of convergence of the series. Justify your results.
Index

{an}∞n=1, 54
am, 33
ax, 210
[x], 69
f ◦ g, 128
∫ab f, 178
e, 208
∅, 35
A = B, 35
ex, 209
exp(x), 208
f : D → R, 53
glb(S), 21
inf(S), 21
−∞, 30
∞, 30
∫ab f dx, 179
E°, 40
∩x∈S Eα, 36
E1 ∩ E2, 36
limn→∞ an, 55
limx→−∞ f(x), 109
limx→∞ f(x), 109
limx→x0− f(x), 111
limx→x0+ f(x), 111
limx→x0 f(x) = −∞, 110
limx→x0 f(x) = L, 89
limx→x0 f(x) = ∞, 110
E′, 40
ln x, 206
L(f, P), 172
∫ab f, 177
lub(S), 21
min, 62
A − B, 38
N, 6
p implies q, 10
if p, then q, 10
p is a sufficient condition for q, 10
p only if q, 10
q is a necessary condition for p, 10
Q, 6
R, 13
A ⊂ B, 35
sup(S), 21
∪x∈S Eα, 36
E1 ∪ E2, 36
∩nk=1 Ek, 36
∪nk=1 Ek, 36
U(f, P), 172
∫ab f, 177
Z, 6

absolute convergence, 224
absolute maximum, 128
absolute minimum, 128
absolute value, 28
  of a function, 191
accumulation point, 40
addition, 13
alternating series, 233
antiderivative, 197
A-R Theorem, 180
Archimedes-Riemann Theorem, 180
Archimedian property, 27
Archimedian sequence
  partitions, 182
arithmetization of analysis, 6
associative
  addition, 14
  multiplication, 14
Associative Laws
  set theory, 36
backwards triangular inequality, 29
Bisection Method, 131
Bolzano-Weierstrass Theorem, 77
bounded, 20
bounded above, 20, 82
bounded below, 20, 82
bracket function, 69
cancellation law, 14
Cauchy criterion
  sequences, 79
  sequences of functions, 247
  series, 221
Cauchy sequence, 6, 78
Cauchy, Augustus-Louis, 5
chain rule, 150
closed
  with respect to addition, 14
  with respect to multiplication, 14
closed interval, 29
closed set, 40
common refinement, 175
commutative
  addition, 14
  multiplication, 14
Commutative Laws
  set theory, 36
compact, 78
compact set, 45
comparison test, 227
complement, 38
complete, 22
complete ordered field, 23
completeness axiom, 22
composite function, 128
  continuity, 128
  differentiation, 150
conditional convergence, 224
continuity, 117, 126
continuous
  at a point
    definition, 117
contradiction, 10, 11
contrapositive, 10
converge
  infinity, 85
convergence
  Cauchy criterion, 79
  sequence of functions
    pointwise, 219
  series, 221
  series of functions
    pointwise, 222
converges, 55
critical point, 157
d’Alembert, Jean-le-Rond, 5
Darboux integral, 178
decreasing function, 132
Dedekind cuts, 6
DeMorgan’s Laws, 38
dense, 40
derivative, 147
  left hand, 148
  right hand, 148
derivative function, 147
derived set, 40
difference
  sets, 38
differentiable
  at a point, 147
  on a set, 147
differentiable function, 6
direct proof, 9
discontinuous, 117
distributive, 14
diverge
  infinity, 85
divergence
  series, 221
domain, 53
empty set, 35
equality, 13
equivalence class, 6
equivalent statements, 11
exponential function, 208
extended reals, 30
field, 13
finite set, 21
finite subcover, 45
function, 53
Fundamental Theorem of Calculus, 197
Gauss, Carl Friedrich, 5
greatest lower bound, 21
harmonic series, 227
horizontal line test, 134
hypergeometric series, 5
identity
  addition, 14
  multiplication, 14
image, 53
implication, 10
increasing
  function, 82
increasing function, 60, 132
indirect proof, 9, 10
inductive assumption, 31
infimum, 21
infinite limit
  sequence, 85
infinite limits
  sequences, 85
infinite series, 5
infinity, 30
integers, 6
integrable, 178
integral, 5, 178
integral domain, 14
integral test, 225
interior of a set, 40
interior point, 40
Intermediate Value Theorem, 130
intersection, 36
interval, 29, 132
inverse
  addition, 14
  multiplication, 14
inverse function, 134
  derivative, 160
invertible, 134
isolated point, 40
isomorphic, 23
IVT, 130
L’Hospital’s Rule, 164
Lagrange, Joseph Louis, 5
least upper bound, 21
left hand derivative, 148
left hand limit, 111
Leibniz, Wilhelm, 5
limit, 5
  at ±∞, 109
  from the left, 111
  from the right, 111
  function, 89
  infinite, 110
  one-sided, 111
limit comparison test, 228
limit point, 40
limits
  sequences, 53, 72
limits of sequences, 53
local maximum, 128
local minimum, 128
logarithm, 206
lower bound, 20
lower Darboux sums, 172
lower integral, 177, 178
lower sums, 172, 178
Maclaurin series, 223
map, 53
mathematical induction, 30
maximum, 128, 156
Mean Value Theorem, 157, 158
minimum, 62, 128, 156
monotone, 81
Monotone Convergence Theorem, 80, 82
monotone sequence, 80
monotonic function, 183
monotonic sequences, 81
monotonically decreasing, 81–84
monotonically increasing, 81, 82
multiplication, 13
MVT, 158
natural numbers, 6
negation, 11
neighborhood, 40
  infinity, 56
Newton, Isaac, 5
non-convergence, 61
nonexistence
  limit, 62
one-sided limits, 111
one-to-one, 23
one-to-one function, 134
onto, 53
open cover, 45
sequential limits, 73
radius
  neighborhood, 40
radius of convergence, 239
range, 53
ratio test, 230
rational functions
  continuity, 127
rational numbers, 6
rational roots, 8
real line, 5
real number system, 6
real numbers, 20, 23
reduced form, 6
reductio ad absurdum, 11
refinement
open interval, 29 common, 175
open set, 40 partition, 174
order, 13, 16 reflexive law
order structure, 13 equality, 13
ordered field, 16 Riemann Theorem, 182
right hand derivative, 148
p-series, 227 right hand limit, 111
partial sums, 220 Rolle’s Theorem, 157
partition, 171 root, 7
partition interval, 171 Root Test, 232
partition points, 171
Peano Postulate, 31 Sandwich Theorem, 188
Peano Postulates, 24 sequences, 75
piecewise continuous, 192 sequence, 53, 55
polynomial equation, 7 sequential limit, 53
polynomials series, 221
continuity, 127 functions, 220
power series, 237 set
premise, 9 closed, 40
Principle of Mathematical Induction, compact, 45
30, 31 derived, 40
product rule interior, 40
differentiation, 148 open, 40
proof set containment, 36
mathematical induction, 30 set theory, 35
proper subset, 35 sine, 102
step function, 192
quotient rule strictly decreasing, 81, 82
differentiation, 149 strictly decreasing function, 132
Index 261

strictly increasing, 81, 82


strictly increasing function, 132
subsequence, 76
subset, 35
substitution law, 14
successor, 24
supremum, 21
symmetric law
equality, 13

Taylor error term, 215


Taylor Inequality, 217
Taylor polynomial, 215
Taylor series, 223
theory of limits, 5
topological space, 40
topology, 40
transitive law
equality, 13
triangular inequality, 29
trigonometric functions, 102
truth table, 11
truth value, 11

uniform Cauchy criterion, 247


series of functions, 249
uniform continuity, 137
uniform convergence, 241, 242
functions, 242
sequences, 241
series, 248
union, 36
universal set, 37
universe, 37
upper bound, 20, 83
upper Darboux sums, 172
upper integral, 177, 178
upper sums, 172, 178

valid argument, 9, 11
Venn Diagram, 37

Weierstrass Test, 249


Weierstrass Theorem, 249
Weierstrass, Karl, 5
Well-Ordered Principle, 33
Well-Ordering principle, 28
