J.W. Thomas
Professor of Mathematics
Colorado State University
Fort Collins, CO 80543
June, 2007
Contents
2 Some Topology of R 35
2.1 Introductory Set Theory . . . . . . . . . . . . . . . . . . . . . . . 35
2.2 Basic Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3 Compactness in R . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3 Limits of Sequences 53
3.1 Definition of Sequential Limit . . . . . . . . . . . . . . . . . . . . 53
3.2 Applications of the Definition of a Sequential Limit . . . . . . . . 58
3.3 Some Sequential Limit Theorems . . . . . . . . . . . . . . . . . . 66
3.4 More Sequential Limit Theorems . . . . . . . . . . . . . . . . . . 72
3.5 The Monotone Convergence Theorem . . . . . . . . . . . . . . . 80
3.6 Infinite Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4 Limits of Functions 89
4.1 Definition of the Limit of a Function . . . . . . . . . . . . . . . . 89
4.2 Applications of the Definition of the Limit . . . . . . . . . . . . . 95
4.3 Limit Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.4 Limits at Infinity, Infinite Limits and One-sided Limits . . . . . 109
5 Continuity 117
5.1 An Introduction to Continuity . . . . . . . . . . . . . . . . . . . 117
5.2 Some Examples of Continuity Proofs . . . . . . . . . . . . . . . . 122
5.3 Basic Continuity Theorems . . . . . . . . . . . . . . . . . . . . . 126
5.4 More Continuity Theorems . . . . . . . . . . . . . . . . . . . . . 130
5.5 Uniform Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.6 Rational Exponents . . . . . . . . . . . . . . . . . . . . . . . . . 142
6 Differentiation 147
6.1 An Introduction to Differentiation . . . . . . . . . . . . . . . . . 147
6.2 Computation of Some Derivatives . . . . . . . . . . . . . . . . . . 152
6.3 Some Differentiation Theorems . . . . . . . . . . . . . . . . . . . 156
6.4 L’Hospital’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7 Integration 171
7.1 An Introduction to Integration: Upper and Lower Sums . . . . . 171
7.2 The Darboux Integral . . . . . . . . . . . . . . . . . . . . . . . . 177
7.3 Some Topics in Integration . . . . . . . . . . . . . . . . . . . . . 183
7.4 More Topics in Integration . . . . . . . . . . . . . . . . . . . . . . 188
7.5 The Fundamental Theorem of Calculus . . . . . . . . . . . . . . . 195
7.6 The Riemann Integral . . . . . . . . . . . . . . . . . . . . . . . . 201
7.7 Logarithm and Exponential Functions . . . . . . . . . . . . . . . 206
7.8 Improper Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Index 257
Chapter 1
Real Numbers
1.1 Introduction
Most students feel that they have an understanding of the real numbers and/or the real line. Throughout most of your education you have drawn a line—sometimes called a number line—having a positive (and hence negative) direction and with the integers indicated. From this you could approximately designate any other real number.
You used two of these as axes when you graphed functions. When you
learned how to take limits as x → a, you used a number line as the x-axis when
you (or the instructor or the book) explained the meaning of the concept of a
limit. When you were introduced to integrals, an interval on the real line was
subdivided which helped to give you an approximation to the area under some
function defined on the interval—to aid in the definition of the integral.
Unless you were given a non-standard introduction to these concepts, you really didn't know enough about the real line to know what's missing. You surely don't know enough about the real line to be able to prove the important calculus theorems.
This isn't as bad as it may sound. When Isaac Newton and Gottfried Wilhelm Leibniz invented calculus in the late 1600s, they used a very intuitive approach. After a while some people started pointing out that there were inconsistencies in their approaches. Over the next 200 years many of the great mathematicians worked on rigorizing calculus. In 1754 Jean le Rond d'Alembert decided that it was necessary to give a rigorous treatment of limits. Joseph Louis Lagrange published his first paper rigorizing calculus in 1797. As a part of his work on hypergeometric series, in 1812 Carl Friedrich Gauss gave a rigorous discussion of the convergence of an infinite series. And finally, Augustin-Louis Cauchy in 1821 answered d'Alembert's call and introduced a theory of limits. In 1874 Karl Weierstrass gave an example of an everywhere continuous, nowhere differentiable function.
If you graph the function f(x) = 2 − x² really carefully using only the rational numbers on the x-axis, you will see that the graph passes through the x-axis without hitting the axis—we don't want that, but then you really can't graph it that carefully.
Many of you have seen the common proof that √2 is not rational, which goes as follows. (Read the proof carefully. It's not terribly important to be able to reproduce the proof, but it is important that you can follow the proof.) Assume false, i.e. that √2 is rational. Then √2 = m/n where m and n are in N, n ≠ 0 and m/n is in reduced form. If we square both sides we get m² = 2n². Since m² is a multiple of 2, m² is even. This implies that m is even. (If m is not even, i.e. m is odd, then m = 2k + 1 for some integer k. But then m² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1 is odd. This is a contradiction.) If m is even, then m can be written as m = 2k. But then the facts that m = 2k and m² = 2n² imply that m² = 4k² = 2n² or n² = 2k². Thus n must also be even. This is a contradiction to the fact that we assumed that m/n was in reduced form.
Thus we see that √2 is not rational. But do we really care? Do we need a √2 in our lives? With a little bit of thought about the diagonal of the unit square or the graph of the function f(x) = 2 − x², it's reasonably clear that we want to have √2 in our number system, i.e. Q is not enough. In general, we do not want to work on domains that have holes in them like Q has at √2.
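As an aside, the impossibility that the proof establishes can be spot-checked (though of course not proved) by brute force: if √2 were rational, some pair of natural numbers would satisfy m² = 2n². A short search—our illustration, not part of the text—finds none:

```python
# Brute-force sanity check (not a proof): if sqrt(2) = m/n for naturals m, n,
# then m^2 = 2*n^2; search a range of pairs for such a solution.
def rational_sqrt2_candidates(bound):
    return [(m, n) for n in range(1, bound) for m in range(1, bound)
            if m * m == 2 * n * n]

print(rational_sqrt2_candidates(1000))  # -> []
```

Only the proof above rules out all pairs, of course; the search merely shows what the contradiction predicts.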
It should be pretty clear that there are a lot of rational numbers (there are a lot of natural numbers and there are clearly a lot more rational numbers than there are natural numbers). A little thought will also convince you that there are also a lot of numbers on the number line that are not rational. A proof similar to the one given above will show that √3 and √7 are not rational. It's also easy to prove that q√2 (Example 1.2.2) and q + √2 (HW1.2.2–(a)) are not rational for any q ∈ Q, q ≠ 0. Since there are a lot of what we think of as numbers that are not rational, there are many holes in Q.
What we would like to do is figure out how to fill in the holes in the rational number line—the missing numbers that are not rational. This is close to what the approaches to building the real numbers using either Dedekind cuts or Cauchy sequences do. We will really approach this from the other direction. We will define the set of real numbers and then show that this set is what we want.
Before we go on, we would like to include a topic related to our work with the
rational numbers. The following nice result makes it easy to show that a given
number is not rational—and it shows more. Consider the following proposition.
Proposition 1.1.1 Suppose that a0, a1, . . . , an are integers, a0 ≠ 0, and that r = p/q, in reduced form, is a rational root of the equation
a0 xⁿ + a1 xⁿ⁻¹ + · · · + an−1 x + an = 0.    (1.1.1)
Then p divides an and q divides a0.
Proof: If you observe carefully, you will see that the proof of this proposition is really very similar to the way that we proved that √2 was not rational.
If r = p/q is a root of equation (1.1.1), then
a0 (p/q)ⁿ + a1 (p/q)ⁿ⁻¹ + · · · + an−1 (p/q) + an = 0.
Multiplying by qⁿ we get
a0 pⁿ + a1 pⁿ⁻¹ q + · · · + an−1 p qⁿ⁻¹ + an qⁿ = 0.
We use the same argument as before. Because p must divide an qⁿ (p divides every other term) and no factor of p can divide any factor of qⁿ (p/q is in reduced form), p must divide an.
Before we show you how nice this result is in relation to our work with
rational numbers, let us remind you that you have probably used this result
before. A while after you learned how to factor polynomials in your algebra
classes, you were faced with factoring polynomials of degree greater than or
equal to three. You were given a problem like “factor 2x3 + 3x2 − 8x + 3.” You
were taught to try to divide by x ± 3, x ± 3/2, x ± 1/2 and x ± 1. These potential
roots were formed by trying all rationals p/q where p is a factor of an = 3 (±3, ±1) and q is a factor of a0 = 2 (±2, ±1), i.e. by applying Proposition 1.1.1. If and when you were lucky enough to divide 2x³ + 3x² − 8x + 3 by x − 1 you got
(2x³ + 3x² − 8x + 3)/(x − 1) = 2x² + 5x − 3.
You then factored the quadratic term, which gives you the complete factorization
2x³ + 3x² − 8x + 3 = (x − 1)(2x − 1)(x + 3).
If none of the potential roots satisfies the equation, Proposition 1.1.1 implies that there are no rational roots. In your algebra class you usually didn't have to worry about that since they were trying to teach you how to factor—one of the potential roots always satisfied the equation.
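The candidate-listing recipe just described is easy to mechanize. The sketch below (an illustration only; the function names are ours) enumerates the candidates ±p/q of Proposition 1.1.1 and tests them on the example polynomial:

```python
from fractions import Fraction

def divisors(k):
    k = abs(k)
    return [d for d in range(1, k + 1) if k % d == 0]

def rational_root_candidates(coeffs):
    # coeffs = [a0, a1, ..., an] for a0*x^n + ... + an, following the text's
    # indexing: a0 is the leading coefficient and an the constant term.
    return {Fraction(sign * p, q)
            for p in divisors(coeffs[-1])
            for q in divisors(coeffs[0])
            for sign in (1, -1)}

def rational_roots(coeffs):
    def value(x):                      # Horner evaluation of the polynomial
        v = Fraction(0)
        for c in coeffs:
            v = v * x + c
        return v
    return sorted(r for r in rational_root_candidates(coeffs) if value(r) == 0)

print(rational_roots([2, 3, -8, 3]))   # the roots -3, 1/2 and 1
```

Applied to x² − 2, i.e. rational_roots([1, 0, -2]), the same search returns an empty list—exactly the argument used below to conclude that ±√2 is not rational.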
Our application of Proposition 1.1.1 goes as follows. Consider the equation x² − 2 = 0. By Proposition 1.1.1 we know that if there are going to be rational roots to this equation, they will be either ±2 or ±1. It is easy to try these four potential roots and see that none of them satisfy the equation x² − 2 = 0. Therefore the equation has no rational roots. Solving for x we know that x = ±√2 represents the solutions to this equation. Therefore ±√2 must not be rational.
This same approach can be used to produce many numbers that are not rational. Many, such as √13, are as easy as √2. For some it is more difficult to find the appropriate algebraic equation associated with the number, but the method still works. For example, consider the number ∛((4 − √2)/3). Set x = ∛((4 − √2)/3). Then x³ = (4 − √2)/3, 3x³ = 4 − √2, 3x³ − 4 = −√2 and (3x³ − 4)² = 2. Expand this last expression and apply Proposition 1.1.1 to the resulting polynomial (with integer coefficients). Surely ∛((4 − √2)/3) is a root of this polynomial.
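Carrying out the expansion that the text leaves to the reader (a routine step, recorded here for convenience):

```latex
\left(3x^{3}-4\right)^{2} = 2
\;\Longrightarrow\;
9x^{6} - 24x^{3} + 16 = 2
\;\Longrightarrow\;
9x^{6} - 24x^{3} + 14 = 0 .
```

By Proposition 1.1.1 a rational root p/q of 9x⁶ − 24x³ + 14 would need p to divide 14 and q to divide 9; none of the finitely many candidates works (indeed, a rational x would make 3x³ − 4 = ±√2 rational), so ∛((4 − √2)/3) is not rational.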
HW 1.1.1 Assume that p and q have the prime factorizations p = p1 · · · pmp and q = q1 · · · qmq, respectively, and that p/q is in reduced form. Prove that if q divides a0 pⁿ (where a0 is an integer), then q divides a0.
HW 1.1.2 Prove that √13 is not rational.
HW 1.1.3 Prove that 3/2 + √13 is not rational.
HW 1.1.4 Prove that ∛((4 − √2)/3) is not rational.
This is a very easy proof, but it is hoped that it shows explicitly what the "true hypotheses" are and how these hypotheses fit together with the valid argument to construct the proof. We will have more difficult direct proofs, but they will just be more difficult analogs of this proof. We should realize that the statement "p implies q" can also be written as "if p, then q," "p is a sufficient condition for q," "p only if q" and "q is a necessary condition for p." Depending on the author, you may see all of these different expressions.
And finally we discuss again what we mean by true premises. It is difficult to move from the "do as I say" world of mathematics to the "prove it" world of mathematics. At this time you "know" a lot of things that have been told to you—things that have not been based on a firm mathematical foundation. Students sometimes have trouble knowing what they can assume are true premises. It is clear that you can assume anything that we have given you as postulates or definitions, and anything that you or we have proved (which so far is almost nothing). We did cheat a bit when we told you that you know about the integers, the arithmetic for integers and consequently the arithmetic for rationals. Actually, the facts that you know for the integers include a small set of postulates and results proved from those postulates. Because we had to start somewhere, we assume that you know those. When it is necessary, we will include some of the properties of the integers—postulated and/or proved. Just about every other true premise that you or we will use will be included in this text. If we cheat, we will try to remember to tell you that we are cheating.
Indirect Proofs: Indirect proofs are very common in analysis. There are
certain results that are very difficult to prove directly yet can be easily proved
using an indirect proof. The indirect proofs are based on the logical concepts of
the contrapositive and the contradiction. As you are probably aware, we have already given you two examples of indirect proofs when we proved "√2 is not rational" and "m² even implies that m is even" using a proof by contradiction.
We discuss first the use of the contrapositive in proof.
The Contrapositive: When the statement we wish to prove is if r, then s, a
common approach to proving the statement is to consider the contrapositive of
the statement. For this short discussion we will write the implication as r → s
and read it as r implies s. The contrapositive of the statement r → s is the
statement (∼ s) → (∼ r) where (∼ s) means “not s” and (∼ r) means “not r”.
r | s | r → s | ∼s | ∼r | (∼s) → (∼r)
T | T |   T   | F  | F  |     T
T | F |   F   | T  | F  |     F
F | T |   T   | F  | T  |     T
F | F |   T   | T  | T  |     T
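The truth table can be confirmed mechanically; this short script (an aside, not part of the text) checks all four rows:

```python
from itertools import product

def implies(p, q):
    # "p implies q" is false only when p is true and q is false.
    return (not p) or q

# Row-by-row check that r -> s and (~s) -> (~r) always agree.
for r, s in product([True, False], repeat=2):
    assert implies(r, s) == implies(not s, not r)
print("r -> s and (~s) -> (~r) agree on all four rows")
```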
There are clearly some similarities between proofs using the contrapositive
and contradiction. If the statement that we want to prove is of the form if r then
s, then we know that we can prove this statement by proving the contrapositive,
if ∼s then ∼r. As we mentioned earlier, if we were to try to prove this statement by contradiction we would include the statement r as a part of pK—the things that we know to be true. We then proceed by assuming that s is false, and we will complete the proof if we prove that some statement p1 is false, where p1 is one of the statements that we know to be true.
You see that the field properties consist of the very basic properties satisfied
by the addition and multiplication that you have used since grade school. When
you were working in N, Z, Q or R, you, your teachers and your books probably
wrote a · b as ab, θ as 0, 1 as 1 and a−1 as 1/a. We will stick with the more
formal notation at this time. After we “have the reals” we will revert to the
usual notation of ab, 0, etc.
It should be easy to see that N is neither a field nor an integral domain because it does not contain additive inverses, Z is not a field because it does not contain multiplicative inverses (but it is an integral domain), and Q is a field (and an integral domain). There are many other fields that are very important in mathematics.
As a part of our definition of a field above, we assumed that we have the operations addition and multiplication defined on the set. We emphasize that we want to assume that these operations are uniquely defined. This is a trivial idea, but it is important. That is, if we have a + b = a + c and b = b′, then we also have a + b′ = a + c. For the obvious reason we will sometimes refer to this as the substitution law—and after a while we will not refer to it, we will just do it. Of course we have the analogous substitution law associated with multiplication.
As a part of our definition, if Q is a field, it possesses the basic properties
that are generally familiar to us. However, there are many more properties
associated with a field that are also familiar to us. The point is that there are
many very useful properties in a field that follow from the field axioms. (Though
we will try, we will probably not prove all of the properties in the right order.
We surely will not prove all of the necessary properties. We want to emphasize
that all of the necessary (or desired) properties can be shown to follow from the
field axioms.) We include the following proposition that will give us some of
these properties.
Proposition 1.3.2 Suppose that Q is a field. Then the following properties are satisfied.
(i) If a, b, c ∈ Q and a + c = b + c, then a = b.
(ii) If a, b, c ∈ Q, c ≠ θ and a · c = b · c, then a = b. (This also shows that if Q is a field, then Q is an integral domain.)
(iii) If a ∈ Q, then a · θ = θ.
(iv) The additive and multiplicative inverses are unique.
(v) If a, b ∈ Q, then (−a) · b = −(a · b).
(vi) If a, b ∈ Q, then (−a) · (−b) = a · b.
(vii) If a, b ∈ Q and a · b = θ, then a = θ or b = θ.
Proof: (i) c ∈ Q implies there exists −c ∈ Q such that c + (−c) = θ (a4). By
the reflexive law of equality, (a + c) + (−c) = (a + c) + (−c). Since a + c = b + c,
the substitution law implies that (a + c) + (−c) = (b + c) + (−c). Then using
a2 twice we have a + (c + (−c)) = b + (c + (−c)). By a4 (twice) this becomes
a + θ = b + θ which implies (by a3 twice) that a = b.
Note that if we applied HW1.3.2–(b), we could have begun this proof with a + c = b + c implies that (a + c) + (−c) = (b + c) + (−c) and then proceeded as above. However, the proof of HW1.3.2 uses the reflexive law of equality and the substitution law.
(ii) It should be logical that this proof is analogous to the proof given for
(i)—properties (i) and (ii) are essentially the same properties, (i) with respect
to addition and (ii) with respect to multiplication. Because we have the hy-
pothesis that c 6= θ, by m4 there exists c−1 ∈ Q such that c · c−1 = 1. By
the multiplication analog of HW1.3.2–(b) we see that a · c = b · c implies that
(a · c) · c−1 = (b · c) · c−1 . Then by m2, m4 and m3, a = b.
(iii) Properties a3 and a1 imply that for a · θ ∈ Q (the closure with respect to
multiplication implies that if a, θ ∈ Q, then a · θ ∈ Q), (a · θ) + θ = a · θ and
a · θ + θ = θ + a · θ, or θ + a · θ = a · θ (by substitution). Then by a4 (applied
to θ), substitution and d1, we have θ + a · θ = a · θ = a · (θ + θ) = a · θ + a · θ.
Using part (i) of this proposition we have θ = a · θ.
(iv) We will show that the multiplicative inverse is unique. The common way to approach a uniqueness result is to assume that there are two (at least two), i.e. for a ∈ Q, a ≠ θ, there exist a−1 and a* in Q such that a · a−1 = 1 and a · a* = 1. Then by substitution we have a · a−1 = a · a*. Then by part (ii) of this proposition we see that a−1 = a*. The proof of the uniqueness of the additive inverse follows in the same way.
(v) The element −(a · b) ∈ Q is an element that satisfies (a · b) + (−(a · b)) = θ.
Thus if we can show that (a · b) + ((−a) · b) = θ, we will be done (by part (iv) of
this proposition). We have a · b + ((−a) · b) = b · a + b · (−a) (using m1 twice) = b · (a + (−a)) (by d1) = b · θ (by a4 and substitution) = θ (by part (iii) of this proposition). Therefore −(a · b) = (−a) · b.
(vi) It is easy to use part (v) of this proposition along with m1 to show that
(−a) · (−b) = −(−(a · b)). Then HW1.3.2–(a) implies that −(−(a · b)) = a · b.
(vii) If a and b both equal θ, then we are done. If b 6= θ, then there exists b−1
such that b · b−1 = 1 (by m4). Then a · b = θ implies that (a · b) · b−1 = θ · b−1
by substitution. The right hand side equals θ by m1 and part (iii) of this
proposition. Then θ = (a · b) · b−1 = a · (b · b−1 ) = a · 1 = a by m2, m4 and m3.
Therefore a = θ.
The proof of the case if a 6= θ is clearly the same (with b replaced by a).
Of course there are more properties. The purpose of the above proposition
is to illustrate to you how some of the other properties that you know can be
proved from the field axioms.
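Because the proofs use only the field axioms, the proposition holds in any field, not just Q or R. A quick machine check on the finite field Z_5 (integers mod 5—our stand-in, chosen because exhaustive checking is possible, not a set used by the text) illustrates this:

```python
# Sanity check of parts of Proposition 1.3.2 on the finite field Z_5.
# The proposition holds in any field; a finite one permits exhaustive checking.
P = 5
els = range(P)
add = lambda a, b: (a + b) % P
mul = lambda a, b: (a * b) % P
neg = lambda a: (-a) % P          # additive inverse; theta is 0 here

assert all(mul(a, 0) == 0 for a in els)                                   # (iii)
assert all(mul(neg(a), b) == neg(mul(a, b)) for a in els for b in els)    # (v)
assert all(mul(neg(a), neg(b)) == mul(a, b) for a in els for b in els)    # (vi)
assert all(a == 0 or b == 0 for a in els for b in els if mul(a, b) == 0)  # (vii)
print("parts (iii), (v), (vi), (vii) verified on Z_5")
```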
We want to emphasize here that in the proofs above we used only the axioms
and properties that we had previously proved. It’s not terribly important that
you can prove these properties. It would be nice if you’re capable of proving
some reasonably easy properties using the axioms and previous results. It is
very important that you are able to read these proofs and verify that they are
correct (which we hope they are). In the proofs given we tried to be complete,
giving each step and giving a reason for each step. As we move along we will
ease up on some of the completeness, assuming that the reader understands the
reasons for some of the “simple” steps. When we are done with this section, we
will assume that you know and/or have proved all basic arithmetic properties
of a field. You will have seen proofs of some of these properties such as those
given in Proposition 1.3.2 and HW1.3.2. There are countless other little facts
concerning fields (the rational numbers and the reals—when we really know
what the rational numbers and the real numbers are) that we will need to use.
So that proofs of these facts do not slow down our subsequent work, we will not
fill in every detail and we will assume that you have proved all of these facts or
could prove them if someone wanted a proof.
We next would like to extend our definition to that of an ordered field. As with equality, an order must satisfy certain properties. A necessary part of defining an order and an ordered field Q is to identify a set P ⊂ Q of positive elements. We will use the notation that if a ∈ P we will write a > θ. We now proceed to define an ordered field. We define the ordered field with respect to the order >.
Definition 1.3.3 Suppose that Q is a field in which we identify a set of positive elements P ⊂ Q. The set Q along with > is said to be an ordered field if the following properties are satisfied.
o1. The sum of two positive elements is positive, i.e. a, b ∈ P implies that
a + b ∈ P.
o2. The product of two positive elements is positive, i.e. a, b ∈ P implies that
a · b ∈ P.
o3. For a ∈ Q, one and only one of the following alternatives hold: either a is
positive, a = θ, or −a is positive, i.e. a > θ, a = θ or −a > θ.
You should recognize these three properties as being common facts that you have used in the past when dealing with inequalities. One of the pertinent facts is that these three axioms are all you need to get everything you know and/or need to know about inequalities. Of course we need—want—the inequality defined on the entire set Q, and the other inequalities that you know, <, ≥ and ≤, defined as well. We make the following definition.
It should then be reasonably easy to see that Z is not an ordered field (Z is not a field) and that Q is an ordered field (use P = {m/n : m, n ∈ Z and mn > 0}).
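The claim that P = {m/n : mn > 0} makes Q an ordered field can be spot-checked mechanically; the sketch below (illustrative only, using Python's Fraction to model Q) tests o1–o3 on a grid of rationals:

```python
from fractions import Fraction

# Model the text's positivity set for Q: m/n is positive iff m*n > 0.
# (Fraction normalizes the denominator to be positive, so this reduces
# to checking the numerator.)
pos = lambda r: r.numerator * r.denominator > 0

sample = [Fraction(m, n) for m in range(-3, 4) for n in range(1, 4)]
# o1 and o2: sums and products of positive elements are positive.
assert all(pos(a + b) and pos(a * b)
           for a in sample for b in sample if pos(a) and pos(b))
# o3 (trichotomy): exactly one of "a positive", "a = theta", "-a positive".
assert all([pos(a), a == 0, pos(-a)].count(True) == 1 for a in sample)
print("o1, o2, o3 hold on the sample grid of rationals")
```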
Just in case you are not familiar with the "if and only if" statement, we pause to fill you in. For example, statement (i) above means that "b < a implies a > b" and "a > b implies b < a". This is a little bit unfair in this case because it is a definition. Definitions are always "if and only if" statements. However, it is still the case that an "if and only if" statement gives the implication in both directions. We will encounter this in cases not involving a definition later.
As with the arithmetic properties of the field, the axioms above are then
used to prove a variety of properties concerning ordered fields. We state some
of these properties in the following proposition where we include some of the
very basic results that follow directly from Definition 1.3.3.
Proposition 1.3.5 Let Q along with the operations +, · and > be an ordered field. Then the following properties hold.
(i) If a, b, c ∈ Q, a > b and b > c, then a > c (transitive law).
(ii) If a, b, c ∈ Q and a > b, then a + c > b + c.
(iii) If a, b, c ∈ Q, a > b and c > θ, then a · c > b · c.
(iv) If a, b ∈ Q and a > b, then −b > −a.
(v) If a ∈ Q and a ≠ θ, then a² > θ.
Proof: (i) Since a > b and b > c, we have a − b > θ and b − c > θ. Then
by property o1 of Definition 1.3.3 we know that (a − b) + (b − c) > θ. Then by
using Definition 1.3.1 a2 (a couple times) and a4 we get a − c > θ or a > c.
(ii) a > b implies that a − b > θ. Then using a3, a4, a2, a1 etc, shows that
a−b = a−b+θ = (a−b)+(c+(−c)) = (a+c)−b+(−c) = (a+c)+((−b)+(−c))
= (a + c) − (b + c) or a + c > b + c.
(iii) a > b implies that a − b > θ. Applying o2 of Definition 1.3.3 to a − b and c > θ gives (a − b) · c > θ. Then d1 implies that a · c − b · c > θ or a · c > b · c.
(iv) If a > b, then by part (ii) of this result a + (−b) > b + (−b). By a4 and a3 this becomes a + (−b) > θ. We next use a1 to fix up the right-hand side, apply part (ii) of this result again (this time with −a), and then clean it all up with a4 and a3 (on the left) and a3 to get (−b) > (−a).
(v) If a ∈ Q, then by o3 a > θ or −a > θ (we assumed that a 6= θ). The
case when a > θ follows immediately from o2. If −a > θ, by o2 we have
(−a) · (−a) > θ. Then part (vi) of Proposition 1.3.2 gives us our desired result.
There are a lot of different properties of ordered fields. In the next proposition we include three more very important results.
It is easy to see that in the set of rational numbers, Q, (an ordered field) 7 is
an upper bound of the set S1 = {−3, −2, −1, 3, 4} and −5 is a lower bound of
S1 , there is no upper bound of the set
S2 = {−17, −3/2, −1/2, 0, 2, 8/3, 4, 32/5, 32/3, 128/7, · · · } (the elements of
the set continue to increase without bound) and −23.1 is a lower bound of
S2 , 7 is an upper bound of the set S3 = {r ∈ Q : r = 7 − 1/n for all n ∈ N}
and 6 is a lower bound of S3 , and −1 is an upper bound of the set S4 =
{· · · , −4, −3, −2, −1, −3/2, −5/4, −9/8, −17/16, · · · } and S4 has no lower bound.
Also, both 4 and 4.00001 are upper bounds and −3 and −3.1 are lower bounds
of the set S5 = {r ∈ Q : −3 < r ≤ 4}. Note that 3.9999 is not an upper bound
of S5. It is also the case that −17 is also a lower bound of the set S2, 0 is also an upper bound of the set S4, and 14 and −10, 9 and 5, and 10 and −10 are upper and lower bounds of S1, S3 and S5, respectively.
We note that upper and lower bounds of a set may be elements of the set
(for example −1 ∈ S4 and 6 ∈ S3 ). And of course by the other examples, we
see that upper and lower bounds need not be elements of the set, nor are they
unique.
We next include the definitions of least upper bound and greatest lower bound. They are related to upper and lower bounds, are more difficult than upper and lower bounds, and are extremely important.
(i) If M* ∈ Q is such that M* is an upper bound of S, and for any upper bound M of S we have M* ≤ M, then M* is said to be the least upper bound of S. We denote the least upper bound of S by M* = lub(S). Another term that is used for the least upper bound of S is the supremum of S, written sup(S).
(ii) If m* ∈ Q is such that m* is a lower bound of S, and for any lower bound m of S we have m* ≥ m, then m* is said to be the greatest lower bound of S. We denote the greatest lower bound of S by m* = glb(S). Another term that is used for the greatest lower bound of S is the infimum of S, written inf(S).
Let us emphasize the fact that the least upper bound must be an upper bound.
Hence, if the set does not have an upper bound, the least upper bound of the set
does not exist. Likewise, if a set does not have a lower bound, then the greatest
lower bound of the set does not exist.
It should be easy to see that for the five sets S1, S2, S3, S4 and S5: glb(S1) = −3 and lub(S1) = 4, glb(S2) = −17 and lub(S2) does not exist, glb(S3) = 6 and lub(S3) = 7, glb(S4) does not exist and lub(S4) = −1, and glb(S5) = −3 and lub(S5) = 4 (where the facts that lub(S4) = −1 and glb(S5) = −3 are the two that should be considered carefully).
Least upper bounds (as in S1 , S4 and S5 ) and greatest lower bounds (as
in S1 , S2 and S3 ) may be elements of the set, but that is not a requirement
(as 7 6∈ S3 and −3 6∈ S5 ). You should note that the upper and lower bounds
need not be close to the set (as 1000 is an upper bound of S5 ), whereas the
least upper bound and greatest lower bound must be close to the set (close to
at least one element of the set). We note that if the set is finite (meaning the set has a finite number of elements), the least upper bound and the greatest lower bound will always be the largest and the smallest elements of the set, respectively (as with the set S1). That need not be the case for sets with an infinite number of elements, as can be seen by lub(S3) and glb(S5).
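For finite sets the observation is literal: the least upper bound is the maximum and the greatest lower bound is the minimum, as a one-line check on S1 shows (an aside, not part of the text):

```python
# For a finite set the lub is just its largest element and the glb its
# smallest, as with S1 from the text.
S1 = [-3, -2, -1, 3, 4]
print(min(S1), max(S1))  # -> -3 4
```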
It is not difficult to prove any of the above claims. For example, if you were forced to prove that −5 is a lower bound of the set S1, you would only have to list the elements of the set, noting that −5 ≤ −3, −5 ≤ −2, −5 ≤ −1, −5 ≤ 3 and −5 ≤ 4. Therefore, −5 ≤ s for all s ∈ S1, so −5 is a lower bound of S1.
If you wanted to prove that −3.0001 is a lower bound of the set S5, you would only have to show that if s ∈ S5, then −3.0001 < −3 < s. Therefore, if s ∈ S5, then s ≥ −3.0001, so −3.0001 is a lower bound of the set S5.
To prove that a given value is the greatest lower bound of a set or the least
upper bound of a set is a bit more difficult. To prove that glb(S5 ) = −3 we must
first prove that −3 is a lower bound of S5 —but this proof is almost identical to
the proof given above that −3.0001 is a lower bound of S5 .
We next must prove that −3 is the greatest lower bound of S5. The way to prove this is by contradiction. Assume that m* = glb(S5) and m* > −3. Then we can find a number r = (−3 + m*)/2 that will be in S5 (because r = (−3 + m*)/2 > (−3 + (−3))/2 = −3), but m* ≤ r fails (r = (−3 + m*)/2 < (m* + m*)/2 = m*, so m* is not a lower bound of S5). This contradicts the fact that the greatest lower bound of a set must also be a lower bound of the set. Therefore there cannot be a greatest lower bound of S5 that is greater than −3.
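The midpoint construction in the contradiction argument can be played out concretely: for any claimed greatest lower bound m* > −3, the rational r = (−3 + m*)/2 lies in S5 yet is smaller than m*. A small sketch (illustration only; the function name is ours):

```python
from fractions import Fraction

# The midpoint trick: given a claimed greatest lower bound m* > -3 of
# S5 = {r in Q : -3 < r <= 4}, the rational r = (-3 + m*)/2 defeats it.
def counterexample(m_star):
    r = (Fraction(-3) + m_star) / 2
    assert Fraction(-3) < r <= 4   # r is an element of S5 ...
    assert r < m_star              # ... yet r is smaller than m*
    return r

print(counterexample(Fraction(-2)))       # -> -5/2
print(counterexample(Fraction(-29, 10)))  # -> -59/20
```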
We emphasize that in the last example the point is that the lub(S6 ) does
not exist in Q. That is important.
We are finally ready to define the set of real numbers.
Definition 1.4.3 An ordered field Q is complete if and only if every nonempty subset of Q that is bounded above has a least upper bound.
We refer to Definition 1.4.3 as the completeness axiom.
numbers as a particular subset of the reals, we don't use any of these axioms. Of course we want our set of natural numbers to satisfy these axioms—otherwise either something is wrong with the sets of axioms or something is wrong with our definition of N. In our setting these axioms can be proved as theorems. We will not prove all of these results, though when we need them, we will use them.
We do want to give you one of the common sets of axioms, called the Peano
Postulates:
• PP1: 1 is a natural number.
• PP2: For each natural number k there exists exactly one natural number, called the successor of k, which we denote by k + 1.
• PP3: 1 is not the successor of any natural number.
• PP4: If k+1 = j+1 , then k = j.
• PP5: Let M be a set of natural numbers such that (i) M contains 1
and (ii) M contains x+1 whenever it contains x, then M contains all the
natural numbers.
It should be clear that the Peano Postulates are a long way from the real numbers—in other words, if you use this approach, you have a lot of work to do before you get to R. Also, based on our definition of R, some of the properties that we have proved in R and the definition of N, it is easy to see that PP1, PP4 and PP5 are true. It is not too difficult to see that PP3 can be proved using PP5; and PP2 follows from the result that for k ∈ N there are no natural numbers between k and k + 1, which follows from PP3. We prove PP3 in Example 1.6.4 as one of our examples of the application of proof by mathematical induction. You will see in Section 1.6 that postulate PP5 is a very important property of the natural numbers.
We know from our work in Section 1.1 that there exist real numbers that
are not rational. We define
• the set of irrational numbers, I = {x ∈ R : x ∉ Q} = R − Q.
Obviously by the definition of I, Q ∩ I = ∅ and R = Q ∪ I. In Section 1.1 we
showed that there were a lot of real numbers that are not rational, i.e. that are
irrational. Hence not only do we know that I is not empty, I 6= ∅, but I is large.
And finally, to this point we have tried to be careful to use the formal notation of · for multiplication, a−1 for the multiplicative inverse, θ for the additive identity and 1 for the multiplicative identity. Now that we have defined R and made the argument (some of it not proved) that R is the set of reals that we have always used, we will change to a more traditional notation. We will write a · b as ab, a · b−1 as a/b, θ as 0 and 1 as 1.
(b) If S is a bounded set of real numbers and M ∗ and m∗ are least upper and
greatest lower bounds of S, respectively, then M ∗ and m∗ are unique.
(c) If S is a bounded set of real numbers and S ∗ ⊂ S, then lub(S) ≥ lub(S ∗ )
and glb(S) ≤ glb(S ∗ ).
(d) If S is a bounded set of real numbers, then glb(S) < lub(S).
(e) If S is a bounded set of real numbers, then glb(S) ≤ lub(S).
(f) Suppose that S is a bounded set of positive real numbers, and set m* =
glb(S) and M* = lub(S). Define S′ = {x ∈ R : x = 1/y, y ∈ S}. Then
glb(S′) = 1/M* and lub(S′) = 1/m*.
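Property (f) can be sanity-checked numerically on a finite set, where glb and lub reduce to min and max. A Python sketch with arbitrary sample values (illustrative only, not part of the text's development):

```python
# Check property (f) on a finite set of positive reals, where glb = min, lub = max.
S = [0.5, 2.0, 4.0, 0.25]
m_star, M_star = min(S), max(S)      # glb(S), lub(S)
S_prime = [1 / y for y in S]         # S' = {1/y : y in S}
assert min(S_prime) == 1 / M_star    # glb(S') = 1/M*
assert max(S_prime) == 1 / m_star    # lub(S') = 1/m*
```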
not exist. Since S6 is an example of a set bounded above that does not have a
least upper bound, we obtain the following result.
The completeness of the reals allows us to define √2 as √2 = lub(S6′) and
we know that √2 satisfies (√2)^2 = 2. This approach also allows us to define
square roots of all positive real numbers.
This is a big deal. When I was a young student, I was told that we let √2 be
the number such that when squared gives 2—and I would guess most of you were
given the same introduction. No one questioned whether or not such a number
might exist—I surely never questioned it. It wasn't until I had to define √2 for
students that I started wondering why we never discuss the existence. You now
know that √2 exists and why. You should also realize that if you consider S6 as
a subset of R, since S6 is surely bounded, lub(S6) will exist in R and will equal
√2.
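The least-upper-bound characterization of √2 also suggests a computation: bisection keeps a point of the set on the left and an upper bound on the right, closing in on the lub. A Python sketch (the iteration count of 50 is an arbitrary choice):

```python
# Approximate sqrt(2) as the lub of S = {x > 0 : x^2 < 2} by bisection.
lo, hi = 1.0, 2.0          # lo lies in S, hi is an upper bound of S
for _ in range(50):
    mid = (lo + hi) / 2
    if mid * mid < 2:      # mid is in S, so the lub is at least mid
        lo = mid
    else:                  # mid is an upper bound of S
        hi = mid
assert abs(lo * lo - 2) < 1e-12
```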
As we stated earlier the completeness axiom is a very important and essential
part of the definition of the set of real numbers. To better describe this property
of R and to make this property easier to use, we next include several useful
results that follow from the definition of the least upper and greatest lower
bounds and/or the completeness axiom. These results are very important in that
often when we need to use the completeness of the reals, we will use Proposition
1.5.3–Corollary 1.5.5 rather than the definition of completeness.
We begin with a result that illustrates our earlier claim that when a set
doesn’t have or might not have a largest or smallest element, the least upper
bound and the greatest lower bound can be used to find elements of the set
that are arbitrarily close to being the largest or smallest elements of the set—
arbitrarily close to the least upper or greatest lower bounds of the set.
1.5 Properties of the Real Numbers 27
Proof: (a) Suppose false, i.e. suppose that for some ε0 > 0 there is no
element x ∈ S greater than M* − ε0. Then M* − ε0 will be an upper bound for
the set S—for all x ∈ S, x ≤ M* − ε0. But this is a contradiction to the fact
that M* is the least upper bound of S.
(b) The proof of (b) is similar—do be careful with the inequalities.
Remember that m∗ and M ∗ may or may not be in the set. Though we
cannot choose the smallest or largest element in the set, we can always find an
element in the set that is arbitrarily close to m∗ and M ∗ . Often when we are
using an argument where we would like to use the smallest or largest element
in a set (and can’t make the claim that there is such an element), we can use
the elements provided by the above proposition that are arbitrarily close to
the greatest lower bound and least upper bound of the set.
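For a concrete instance of this claim, take S = {1 − 1/n : n ∈ N}, whose least upper bound is 1; for every ε > 0 the Archimedean property produces an element of S above 1 − ε. A small Python illustration (the set and the function name are ours, not the text's):

```python
# For S = {1 - 1/n : n in N}, lub(S) = 1: for each eps > 0 exhibit an
# element of S exceeding 1 - eps, via the Archimedean choice 1/n < eps.
def element_above(eps):
    n = int(1 / eps) + 1       # n > 1/eps, so 1/n < eps
    return 1 - 1 / n           # element of S with 1 - 1/n > 1 - eps
for eps in [0.5, 0.1, 1e-3]:
    assert element_above(eps) > 1 - eps
```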
We might also make special note of the argument used in the proof of (a)
above. The proof is not difficult. For many students it is difficult to negate
the original statement. The statement is that "for every ε > 0 there is an x0
that satisfies an inequality." To negate that statement, you need "some ε > 0
for which there is no x0 that will satisfy the inequality," or "some ε > 0 for
which every x ∈ S does not satisfy the inequality." Analysis results often
involve convoluted statements. It is often difficult to negate these convoluted
statements.
We next obtain a very important corollary known as the Archimedean property (and of course, it doesn't really deserve to be a corollary).
Corollary 1.5.4 For any positive real numbers a and b, there is an n ∈ N such
that na > b.
Proof: Suppose false, i.e. suppose that na ≤ b for all n ∈ N. Set S = {na :
n ∈ N}. Since we are assuming that na ≤ b for all n ∈ N, the set S is bounded
above by b. The completeness axiom implies that S has a least upper bound,
let M ∗ = lub(S).
By Proposition 1.5.3–(a) there exists an element of S, n0 a, n0 ∈ N, such
that M* − n0 a < a. (The statement must be true for any ε > 0. We're applying
the proposition with ε = a.) Then we have M* < (n0 + 1)a for n0 + 1 ∈ N so
M ∗ is not an upper bound of S. This is a contradiction so there must be an
n ∈ N such that na > b.
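The proof above is non-constructive (it argues by contradiction), but numerically the Archimedean n is easy to produce: n = ⌊b/a⌋ + 1 works. A Python sketch under that assumption (sample pairs are arbitrary):

```python
import math

# For positive a, b, the choice n = floor(b/a) + 1 satisfies n*a > b:
# n > b/a, so n*a > b. A sketch of the constructive version, not a proof.
def archimedean_n(a, b):
    return math.floor(b / a) + 1

for a, b in [(0.5, 7.3), (3.0, 2.0), (2.0, 10.0)]:
    n = archimedean_n(a, b)
    assert n * a > b
```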
In the next result we give two special cases of the Archimedean property that
are basic and seem obvious (or as the very clever graduate students would say,
“intuitively clear to the casual observer”). The result makes it clear that the
completeness axiom is very important because without it we could not make
these seemingly obvious claims. By first choosing b = c and a = 1, and then
choosing a = 1 and b = , we obtain the following corollary.
28 1. Real Numbers
Corollary 1.5.5 (a) For any positive real number c there is an n ∈ N such that
n > c.
(b) For any ε > 0 there is an n ∈ N such that 1/n < ε.
We next include two properties of the set of reals that both illustrate the
complexity of the reals.
Proof: We will claim that the proofs of (i) and (iii) are trivial (they follow directly
from the definition). Likewise we would like to think that property (ii) is clear
1.5 Properties of the Real Numbers 29
and/or claim that property (ii) is very clear if you consider the four cases x ≥ 0,
y ≥ 0; x ≥ 0, y < 0; x < 0, y ≥ 0; and x < 0, y < 0.
(iv) Earlier we discussed an "if and only if" statement as a part of a definition and
promised that we would see it again as a part of a theorem (ok, a proposition).
Recall that the “if and only if” statement means that we have implications going
in each direction. Often to prove “if and only if” you prove both directions
separately. Sometimes, as is the case here, you can prove both directions at the
same time.
We consider the statement |x| ≤ a. If we only consider x values greater
than or equal to zero, this statement becomes |x| = x ≤ a so that statement
is equivalent to 0 ≤ x ≤ a. If we only consider x values less than zero, this
statement becomes |x| = −x ≤ a or 0 > x ≥ −a. Since x is either greater than or
equal to zero or less than zero, the statement |x| ≤ a is equivalent to 0 ≤ x ≤ a
or 0 > x ≥ −a. If we consider this set of x values carefully, we see that it is the
same as −a ≤ x ≤ a.
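Property (iv) can be spot-checked exhaustively on a small grid of values; a quick Python check (the grid is an arbitrary choice, and a check is of course not a proof):

```python
# Spot check of property (iv): |x| <= a if and only if -a <= x <= a.
for a in [0.0, 0.5, 2.0]:
    for x in [v / 4 for v in range(-12, 13)]:
        assert (abs(x) <= a) == (-a <= x <= a)
```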
(v) Property (v) is well known as the triangular inequality and is an important
property of the absolute value. We will use it often. Having proved properties
(iii) and (iv), property (v) is easy to prove. Using property (iii) twice, for any
x, y ∈ R we have −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y|. Adding these inequalities
gives −|x| − |y| ≤ x + y ≤ |x| + |y| (consider carefully why it is permissible
to add these inequalities). By property (iv) this last inequality implies that
|x + y| ≤ |x| + |y|.
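Both the triangular inequality (v) and the backwards form discussed next can be spot-checked on sample pairs; a brief Python check (samples are arbitrary):

```python
# Spot check of the triangular inequality (v) and its backwards form (vi).
pairs = [(3.5, -1.2), (-4.0, -7.25), (0.0, 2.5), (6.0, 6.0)]
for x, y in pairs:
    assert abs(x + y) <= abs(x) + abs(y)          # property (v)
    assert abs(abs(x) - abs(y)) <= abs(x - y)     # property (vi)
```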
(vi) Property (vi) is another useful property of the absolute value. We will refer
to property (vi) as the backwards triangular inequality. The proof of property
(vi) is a trick. Consider the following two computations: |x| = |(x − y) + y| ≤
|x − y| + |y|, which gives |x| − |y| ≤ |x − y|, and |y| = |(y − x) + x| ≤ |x − y| + |x|,
which gives −(|x| − |y|) ≤ |x − y|. Together these yield ||x| − |y|| ≤ |x − y|.
Infinity:
To include a discussion about infinity in this section is a bit odd since we
want to make it very clear that ±∞ ∉ R, i.e. ∞ and −∞ are not real numbers.
But we will do it anyway. Often the extended reals are defined to include R and
±∞. Plus and minus infinity do fit into our order system in that ±∞ are such
that for x ∈ R, −∞ < x < ∞, i.e. ∞ is larger than any real number and −∞
is smaller than any real number. Above we defined [a, b], (a, b), etc. for a, b ∈ R.
We can logically extend these definitions to the unbounded intervals (a, ∞) =
{x ∈ R : a < x < ∞}, [a, ∞) = {x ∈ R : a ≤ x < ∞}, (−∞, a] = {x ∈ R :
−∞ < x ≤ a}, (−∞, a) = {x ∈ R : −∞ < x < a}, and even (−∞, ∞) = R.
Notice that since we want these intervals to be subsets of R, ±∞ are not included
in any of these sets.
At times we will have to do some arithmetic with infinities so we define for
a ∈ R, a + ∞ = ∞, a − ∞ = −∞, a∞ = ∞ for a > 0 and a∞ = −∞ for a < 0.
We emphasize that ∞ − ∞ and 0 · ∞ are not defined (we don't know what order
of "large" the infinity represents). And finally, since N ⊂ R, for all n ∈ N we
have 1 ≤ n < ∞.
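As an aside, IEEE floating point arithmetic adopts essentially the same conventions, including leaving ∞ − ∞ and 0 · ∞ undefined (they produce NaN). A Python illustration—the analogy is informal, since the extended reals here are a definition, not floating point:

```python
import math

# IEEE floats mirror the conventions adopted above for finite a.
inf = float("inf")
a = 42.0
assert a + inf == inf and a - inf == -inf
assert 3.0 * inf == inf and -2.0 * inf == -inf
assert math.isnan(inf - inf)   # undefined, as in the text
assert math.isnan(0.0 * inf)   # likewise undefined
```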
= (1 − r^{k+2})/(1 − r). (1.6.2)
(Notice that in the first step of (1.6.1) we take the last term of the summation
∑_{j=0}^{k+1} r^j out of the summation, changing the upper limit of the summation to k and including the last term
separately.) Therefore P is true for n = k + 1.
By the principle of mathematical induction P is true for all n, i.e. ∑_{j=0}^{n} r^j = (1 − r^{n+1})/(1 − r).
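The closed form just proved is easy to check numerically for small n and several values of r ≠ 1; a Python sketch (sample values and tolerance are arbitrary choices):

```python
# Numeric check of the geometric-sum closed form from Example 1.6.1.
def geom_sum(r, n):
    return sum(r**j for j in range(n + 1))   # sum_{j=0}^{n} r^j

for r in [0.5, -0.25, 3.0]:
    for n in range(6):
        assert abs(geom_sum(r, n) - (1 - r**(n + 1)) / (1 - r)) < 1e-9
```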
We want to emphasize that all proofs by math induction follow the above
template. In Step 1 you prove that the proposition is true for n = 1. In Step 2
you assume that the proposition is true for n = k—this assumption is referred
to as the inductive assumption. In Step 3 you prove that the proposition is true
for n = k + 1–using the inductive assumption as a part of the proof. If you
are able to prove that the proposition is true for n = k + 1 without using the
inductive assumption, you would have a direct proof of the proposition—math
induction would not be necessary.
You should recognize the formula in Example 1.6.1 as the formula for the
sum of a geometric series. A common proof of this formula is to write
S = 1 + r + r^2 + · · · + r^{n−1} + r^n and note that (1.6.3)
rS = r + r^2 + r^3 + · · · + r^n + r^{n+1}. (1.6.4)
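Subtracting (1.6.4) from (1.6.3) telescopes the middle terms; the omitted step of this common proof can be sketched as follows (our reconstruction of the standard argument, assuming r ≠ 1):

```latex
S - rS = (1 + r + \cdots + r^{n}) - (r + r^{2} + \cdots + r^{n+1})
       = 1 - r^{n+1},
\qquad\text{so}\qquad
S = \frac{1 - r^{n+1}}{1 - r}.
```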
∑_{j=1}^{k+1} j = ∑_{j=1}^{k} j + (k + 1) = k(k + 1)/2 + (k + 1) by assumption in Step 2 (1.6.5)
= (k + 1)(k/2 + 1) = (k + 1)(k + 2)/2. (1.6.6)
Therefore the proposition is true for n = k + 1.
By the principle of mathematical induction the proposition is true for all n.
There are many of these summation formulas that can be and are proved by
math induction. You should note that except for details, the proofs are very
similar.
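For instance, the identity established in (1.6.5)–(1.6.6) can be spot-checked in Python (a check for small n, not a substitute for the induction proof):

```python
# Numeric spot check: 1 + 2 + ... + n = n(n+1)/2, exact in integer arithmetic.
for n in range(1, 50):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
```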
We next include a proof by math induction that is somewhat different from
the preceding two.
Example 1.6.3 If m, n ∈ N and a ∈ R, then a^m a^n = a^{m+n}.
1.6 Math Induction 33
Solution: Before we begin we should note that the definition of a^m is what we call an
inductive definition: Define a^1 = a and for any k ∈ N, define a^{k+1} as a^{k+1} = a^k a. We now
begin our proof by fixing m.
Step 1: Prove that the proposition is true for n = 1. Since a^m a^1 = a^{m+1}, by the definition
above the proposition is true for n = 1.
Step 2: Assume that the proposition is true for n = k, i.e. assume that a^m a^k = a^{m+k}.
Step 3: Prove that the proposition is true for n = k + 1, i.e. prove that a^m a^{k+1} = a^{m+k+1}.
a^m a^{k+1} = a^m a^k a^1 = a^m a^k a = a^{m+k} a = a^{(m+k)+1} = a^{m+k+1} by the inductive hypothesis and the inductive definition. (1.6.7)
We show how another basic property of the integers can be proved in the
following example.
Example 1.6.4 1 ≤ n for all n ∈ N.
Solution: Step 1: Prove true for n = 1. Clearly 1 ≤ 1 so the proposition is true for n = 1.
Step 2: Assume true for n = k, i.e. 1 ≤ k.
Step 3: Prove true for n = k + 1, i.e. 1 ≤ k + 1.
By adding 1 to both sides of the inequality 1 ≤ k (using the inductive hypothesis and
(x) of Proposition 1.3.7) we get 2 ≤ k + 1. By Proposition 1.3.6-(i) we have 1 > 0–which we
know implies that 0 < 1. Adding 1 to both sides gives 1 < 2. We then have 1 < 2 < k + 1 or
1 < k + 1. This implies that 1 ≤ k + 1.
If you are not inclined to just believe that for k ∈ N there are no natural
numbers between k and k + 1—and we surely hope you wouldn’t believe that,
consider the following short proof. (Sometimes it makes life tough but you
must be careful what you believe to be obvious.) For α ∈ N suppose that there
exists a natural number β such that α < β < α + 1. Then β − α > 0 and
α + 1 − β > 0. But since these are natural numbers and 1 is the smallest natural
number, this implies that β − α ≥ 1 and α + 1 − β ≥ 1. Then we see that
(β − α) + (α + 1 − β) = 1 ≥ 2. This is a contradiction so for α ∈ N there are no
natural numbers between α and α + 1 by reductio ad absurdum.
HW 1.6.1 Prove that ∑_{j=1}^{n} j^2 = n(n + 1)(2n + 1)/6.
HW 1.6.2 Prove that if m, n ∈ N and a ∈ R, then (a^n)^m = a^{nm}.
HW 1.6.3 For n ∈ N and a, b ∈ R prove that (a + b)^n = ∑_{k=0}^{n} (n choose k) a^{n−k} b^k
where (n choose k) = n!/((n − k)! k!).
HW 1.6.4 For n ∈ N and a, b ∈ R prove that a^n − b^n = (a − b) ∑_{j=0}^{n−1} a^{n−1−j} b^j.
HW 1.6.5 Suppose that Q is an ordered field (or the reals) and suppose that
a, b ∈ Q and θ ≤ a ≤ b. Then for n ∈ N we have an ≤ bn .
HW 1.6.6 For 0 < c < 1 prove that 0 < cn < 1 for all n ∈ N.
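Several of these homework identities can be spot-checked numerically before attempting the induction proofs; a Python sketch with arbitrary sample values (these confirm instances only—the exercises still ask for proofs):

```python
# Spot checks of HW 1.6.1 and HW 1.6.4 for small n (exact integer arithmetic).
for n in range(1, 20):
    assert sum(j * j for j in range(1, n + 1)) == n * (n + 1) * (2 * n + 1) // 6

a, b = 7, 3   # arbitrary sample values
for n in range(1, 10):
    assert a**n - b**n == (a - b) * sum(a**(n - 1 - j) * b**j for j in range(n))
```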
Chapter 2
Some Topology of R
The sets in which we will be interested will almost always be subsets of the
real numbers—but none of the general definitions require that to be the case.
We have already seen that N ⊂ Z ⊂ Q ⊂ R—all of the subsets being proper
subsets. We also have I ⊂ R. We note that for any set A, A ⊂ A (clearly x ∈ A
implies that x ∈ A) and ∅ ⊂ A (if x ∈ ∅, then x ∈ A because there are no x’s in
∅).
We will often want to combine two or more sets in various ways. We make
the following definition.
Definition 2.1.2 Suppose that S is a set and there exists a family of sets as-
sociated with S in that for any α ∈ S there exists the set Eα .
(a) We define the union of the sets Eα, α ∈ S, to be the set E such that x ∈ E
if and only if x ∈ Eα for some α ∈ S. We write E = ∪_{α∈S} Eα. If we have only
36 2. Topology
We note that the union contains all of the points that are in any of the sets
under consideration while the intersection contains the points that are in all of
the sets under consideration. It is easy to see that
• Q ∪ I = R, Q ∩ I = ∅
• (1, 10)∪{1, 10} = [1, 10], [1, 10]∪[10, 20] = [1, 20], [1, 10)∪[10, 20] = [1, 20],
[1, 10] ∩ [10, 20] = {10}, [1, 10) ∩ [10, 20] = ∅
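Finite analogues of these examples can be played with directly using Python's built-in sets; here {1, …, 10} and {10, …, 20} are integer stand-ins for the intervals (illustrative only):

```python
# Finite stand-ins for the interval examples, using Python sets.
A = set(range(1, 11))    # plays the role of [1, 10]
B = set(range(10, 21))   # plays the role of [10, 20]
assert A | B == set(range(1, 21))   # cf. [1,10] ∪ [10,20] = [1,20]
assert A & B == {10}                # cf. [1,10] ∩ [10,20] = {10}
```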
Proposition 2.1.3 For the sets A, B and C we obtain the following properties.
(a) A ⊂ A ∪ B
(b) A ∩ B ⊂ A
(c) A ∪ ∅ = A
(d) A ∩ ∅ = ∅
(e) If A ⊂ B, then A ∪ B = B and A ∩ B = A.
(f ) A ∪ B = B ∪ A, A ∩ B = B ∩ A Commutative Laws
(g) (A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C) Associative Laws
(h) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) Distributive Law
Proof: We will not prove all of these—hopefully most of these proofs are very
easy for you. We will prove three of the properties to illustrate some methods
of proofs of set properties.
(b) To prove the set containment in property (b) we begin with an x ∈ A ∩ B.
This implies that x ∈ A and x ∈ B. Therefore x ∈ A and we are done.
(h) In the proof of property (b) we applied the definition of set containment
and proved that if x ∈ A ∩ B, then x ∈ A. To prove property (h) we must apply
the definition of equality of sets, Theorem 2.1.1–(c), and prove containment
both directions, i.e. we must prove that A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C) and
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C).
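Before (or after) writing out the containment proof, the distributive law (h) is easy to check on small concrete sets; a Python spot check (the sample sets are ours):

```python
# Checking the distributive law (h) on small concrete sets.
A, B, C = {1, 2, 3, 4}, {3, 4, 5}, {4, 6}
assert A & (B | C) == (A & B) | (A & C)
```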
2.1 Set Theory 37
If A1 = (−∞, 4), A2 = (2, 5) and A3 = (−∞, 5], it is easy to see that A1^c =
[4, ∞), A2^c = (−∞, 2] ∪ [5, ∞) and A3^c = (5, ∞). If we wanted the complement
of A1 with respect to A3 , then A3 − A1 = [4, 5].
We next state the very basic but important result concerning complements
of sets.
Proposition 2.1.5 (A^c)^c = A
It should be very easy to see that the above result is true. Probably the easiest
way is to draw the very simple Venn Diagram representing the left hand side of
the equality.
We next prove a very important result related to complements referred to
as DeMorgan’s Laws.
Proposition 2.1.6 Consider the set A, and the family of sets associated with
S, Eα .
(a) A − ∪_{α∈S} Eα = ∩_{α∈S} (A − Eα )
(b) A − ∩_{α∈S} Eα = ∪_{α∈S} (A − Eα )
(c) (∪_{α∈S} Eα )^c = ∩_{α∈S} Eα^c
(d) (∩_{α∈S} Eα )^c = ∪_{α∈S} Eα^c
Proof: (a) The proof of property (a) follows by carefully applying the definition
of set equality. We begin by assuming that x ∈ A − ∪_{α∈S} Eα . Then we know
that x ∈ A and x ∉ ∪_{α∈S} Eα . The statement that x ∉ ∪_{α∈S} Eα is a very strong
statement. This means that x ∉ Eα for any α ∈ S—if x ∈ Eα0 for some α0 ∈ S,
then x ∈ ∪_{α∈S} Eα . Thus x ∈ A and x ∉ Eα so x ∈ A − Eα —and this holds for
any α ∈ S. Therefore x ∈ ∩_{α∈S} (A − Eα ), and A − ∪_{α∈S} Eα ⊂ ∩_{α∈S} (A − Eα ).
(b) The proof of property (b) is very similar to that of property (a). We assume
that x ∈ A − ∩_{α∈S} Eα . Then x ∈ A and x ∉ ∩_{α∈S} Eα . The statement x ∉ ∩_{α∈S} Eα
implies that x ∉ Eα0 for some (at least one) α0 ∈ S. But then x ∈ A − Eα0 so
x ∈ ∪_{α∈S} (A − Eα ) and A − ∩_{α∈S} Eα ⊂ ∪_{α∈S} (A − Eα ).
(c) and (d) Properties (c) and (d) follow from properties (a) and (b),
respectively, by letting A = U, the universal set.
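DeMorgan's laws (a) and (b) can be checked on a small indexed family of finite sets; a Python sketch (the family below is a hypothetical example of ours):

```python
# DeMorgan's laws (a) and (b) on a finite family {E1, E2, E3} inside A.
A = set(range(10))
family = [{0, 1, 2}, {2, 3}, {5, 6, 7}]
union = set().union(*family)            # the union of the E's
inter = set.intersection(*family)       # the intersection of the E's
assert A - union == set.intersection(*[A - E for E in family])   # law (a)
assert A - inter == set().union(*[A - E for E in family])        # law (b)
```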
HW 2.1.2 Give set containment proofs of parts (c) and (g) of Proposition 2.1.3.
HW 2.1.3 Give Venn diagram proofs of part (h) of Proposition 2.1.3 and part
(c) of Proposition 2.1.6.
If x0 ∉ [0, 1], say x0 > 1, then if r = (x0 − 1)/2, the neighborhood of x0 is such that
Nr (x0 ) ∩ E1 = ∅. Thus x0 is not a limit point of E1 . A similar argument shows that any x0
such that x0 < 0 is not a limit point of E1 .
Thus only the points in [0, 1] are limit points of E1 = [0, 1], i.e. E1′ = [0, 1] = E1 .
We would like to emphasize that by the definition of a limit point, for the point x0 to be
a limit point of E1 , it must be shown that every neighborhood of x0 contains an element of E1
different from x0 . To show that a point x0 is not a limit point of E1 , we only have to show
that there exists one neighborhood of x0 that does not contain any elements of E1 other than
x0 .
(b) If we consider the set E2 = (0, 1) and let x0 be an arbitrary point of E2 , then for any
neighborhood of x0 , Nr (x0 ), the point x1 = min{x0 + r/2, (1 + x0 )/2} will be in both E2
and Nr (x0 ), and not equal to x0 . (Again we emphasize that we need the nasty looking point
x1 because we use x0 + r/2 when r is sufficiently small and use (1 + x0 )/2 when r is large.)
Thus every point in (0, 1) is a limit point of E2 .
Since for any r > 0 the neighborhoods Nr (0) = (−r, r) and Nr (1) = (1 − r, 1 + r) contain
the points x0 = min{r/2, 1/2} and x1 = max{1 − r/2, 1/2}, respectively—both points in
E2 —and surely x0 ≠ 0 and x1 ≠ 1, the points x = 0 and x = 1 are both limit points of E2 .
The same argument used for the set E1 can be used to show that all points x ∉ [0, 1]
are not limit points of (0, 1). Thus only the points in [0, 1] are limit points of E2 = (0, 1) or
E2′ = [0, 1].
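The witness point x1 = min{x0 + r/2, (1 + x0)/2} used above can be spot-checked in Python: it lies in (0, 1), lies in Nr(x0), and differs from x0 (the sample values of x0 and r are arbitrary):

```python
# Spot check of the witness point used for E2 = (0, 1).
for x0 in [0.1, 0.5, 0.99]:
    for r in [1e-3, 0.5, 10.0]:
        x1 = min(x0 + r / 2, (1 + x0) / 2)
        assert 0 < x1 < 1 and abs(x1 - x0) < r and x1 != x0
```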
(c) To find E3′ for the set E3 = {1, 1/2, 1/3, · · · } is easy but messy. The easiest way is to
first determine some facts concerning E3 .
other than 1, say 1/m, m ∈ N , then the elements of E3 that are closest to 1/m in value are
1/(m − 1) (the next larger element in the set) and 1/(m + 1) (the next smaller element in the
set). Of course we must be able to prove these statements—if someone asked. The easiest
way to prove these is to use the second Peano Postulate, which can be restated as saying that there are no
natural numbers between m − 1 and m, or between m and m + 1—if there is some element of E3 , say
1/k, such that 1/m < 1/k < 1/(m − 1), then we have k < m and k > m − 1, which contradicts
PP2.
If we proceed and choose any specific element of E3 , say x0 = 1/1004, it is not difficult
to see that the neighborhood Nr (1/1004) where r = 0.00001 will not contain any elements
of E3 other than x0 (because we can compute the value of the elements of E3 that are
closest to x0 ). This same argument will work for a general element of E3 , 1/m—Nr (1/m) with
r = (1/2)(1/m − 1/(m + 1)) will always work. For x0 = 1 the neighborhood N = (0.99, 1.01) will be
such that N contains no points of E3 other than x0 = 1. Thus no points in E3 are limit points
of E3 .
If we consider the point x0 > 1, then the neighborhood Nr (x0 ) with r = (x0 − 1)/2 will
not contain any elements of E3 . Thus x0 > 1 is not a limit point of E3 .
If we consider the point x0 < 0, then the neighborhood Nr (x0 ) with r = −x0 /2 will not
contain any elements of E3 . Thus x0 < 0 is not a limit point of E3 .
We now consider a point x0 such that x0 ∉ E3 and 0 < x0 < 1. We know that there must be two
elements of E3 , say x1 = 1/(m − 1) and x2 = 1/m, such that x2 < x0 < x1 —choose m by
setting x2 = 1/m = lub{y ∈ E3 : y < x0 } (you must prove that this least upper bound will
be in E3 ) and let x1 = 1/(m − 1) be the value of the next largest element in the set. We can
then set r = min{(x0 − x2 )/2, (x1 − x0 )/2} and note that Nr (x0 ) ∩ E3 = ∅. Therefore, the
points x0 such that x0 ∉ E3 and 0 < x0 < 1 are not limit points of E3 .
The last point that we have to consider is the point x0 = 0. We let Nr (0) denote any
neighborhood of x0 , i.e. consider any r. Then by Corollary 1.5.5–(b) with ε = r we see that
there exists an n such that 1/n < r, i.e. 1/n ∈ Nr (0). Thus x0 = 0 is a limit point of E3 . Thus
we see that the only limit point of the set E3 is x0 = 0, i.e. E3′ = {0}.
As we promised you, the proof was not nice. However, it is a good example of a case
where you consider different classes of points separately—showing that some of the points are
limit points and that some of the points are not limit points.
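The isolating radius used above for a general element 1/m of E3 can also be checked numerically: within r = (1/2)(1/m − 1/(m + 1)) of 1/m there is no other element of E3. A Python sketch (the search range of 1000 is an arbitrary truncation of N):

```python
# Isolation of 1/m in E3 = {1/n : n in N}: within r = (1/2)(1/m - 1/(m+1))
# of 1/m, the only element of E3 found is 1/m itself.
for m in range(1, 30):
    r = 0.5 * (1 / m - 1 / (m + 1))
    hits = [n for n in range(1, 1000) if abs(1 / n - 1 / m) < r]
    assert hits == [m]
```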
It should be clear that all of the points in E3 are isolated points. Likewise,
if we consider the set N, since none of the points of N are limit points (for any
k ∈ N, the neighborhood N1/2 (k) does not contain any elements of N), all of
the points in N are isolated. It should also be easy to see that no points of E1
or E2 are isolated points—all of the points in both E1 and E2 are limit points.
Example 2.2.2 Let E1 , E2 and E3 be as in Example 2.2.1.
(a) Show that E1 is a closed set.
(b) Show that E2 is not a closed set.
(c) Show that E3 is not a closed set.
Solution: These proofs are very easy based on the work done in Example 2.2.1. Since we
saw that E1′ = [0, 1] = E1 , clearly all of the limit points of E1 are contained in E1 and
the set E1 is closed.
Since we found that E2′ = [0, 1], we see that the limit points 0 and 1 do not belong to E2
so the set E2 is not closed. Likewise, since we saw that the only limit point of E3 is the point
0 and 0 ∉ E3 , the set E3 is not closed.
It should be clear that the set N does not contain any interior points and every
point in R is an interior point.
It should now be easy to see that since 0 ∉ E1^o (1 would work too), the set
E1 is not open. Since E2^o = (0, 1) = E2 (i.e. every point in E2 is an interior
point), the set E2 is an open set. Clearly since 1/120012 ∉ E3^o (and of course
any element of E3 would work here), the set E3 is not open. And finally, it
should be easy to see that N is not open and R is open.
The question of whether a set is dense in R is more difficult but we do not
want to consider many examples. Hopefully it is clear that sets like E1 , E2 , E3
2.2 Basic Topology 43
and N are clearly not dense in R—you must have much bigger sets than these
to be dense in R. It should be clear that E = R will be trivially dense in R. The
two important examples were already considered in Proposition 1.5.6. Consider
the following.
Example 2.2.4 (a) Show that Q is dense in R.
(b) Show that I is dense in R.
Solution: (a) By the definition of I a point x0 ∈ R is either in Q or in I. Let x0 be an
arbitrary point of R. We must show that x0 is in Q or x0 is a limit point of Q. Thus if x0 ∈ Q,
we are done. Suppose that x0 6∈ Q (so x0 ∈ I). Consider Nr (x0 ) for any r, i.e. the interval
(x0 − r, x0 + r). Then by Proposition 1.5.6–(a) (with a chosen to be x0 − r and b chosen to
be x0 + r) there exists a rational r1 such that x0 − r < r1 < x0 + r. Therefore x0 is a limit
point of Q.
Since any point of R is either in Q or a limit point of Q, Q is dense in R.
(b) The proof of part (b) follows the same pattern as the proof of part (a) except that we use
part (b) of Proposition 1.5.6 instead of part (a).
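Proposition 1.5.6–(a) also has a constructive flavor: between any a < b one may take a denominator q with 1/q < b − a and numerator p = ⌊aq⌋ + 1. A hedged Python sketch of that standard construction (the function name is ours, and the "+2" simply guards against rounding at the boundary):

```python
import math

# Construct a rational p/q strictly between a and b (a < b): choose q with
# 1/q < b - a, then p = floor(a*q) + 1 satisfies a < p/q <= a + 1/q < b.
def rational_between(a, b):
    q = int(1 / (b - a)) + 2   # +2 guards against float rounding at the edge
    p = math.floor(a * q) + 1
    return p, q

for a, b in [(0.1, 0.11), (-2.5, -2.4), (3.0, 4.0)]:
    p, q = rational_between(a, b)
    assert a < p / q < b
```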
Proof: You should note that this proof is and should be very similar to the
proof that E2o = E2 = (0, 1) and that E2 is open in Example 2.2.3–(b) and the
statements following that example. We write the neighborhood N as N = (x0 −
r, x0 + r). If we choose any point y0 ∈ N , then it is clear that the neighborhood of
y0 , Nr1 (y0 ) = (y0 − r1 , y0 + r1 ) where r1 = (1/2) min{r − (x0 − y0 ), r − (y0 − x0 )}, is
in N (draw a picture to help see that this is true). Thus y0 is an interior point
of N so N is open (N o = N ).
HW 2.2.4 (a) Suppose E1 , E2 , · · · ⊂ R are open. Prove that ∪_{k=1}^{∞} Ek is open.
(b) Suppose E1 , E2 , · · · ⊂ R are closed. Prove that ∩_{k=1}^{∞} Ek is closed.
(c) Suppose E1 , E2 , · · · ⊂ R are open. Show that ∩_{k=1}^{∞} Ek need not be open.
(d) Suppose E1 , E2 , · · · ⊂ R are closed. Show that ∪_{k=1}^{∞} Ek need not be closed.
2.3 Compactness in R
The concept of compactness of sets is very important in analysis. We will use
compactness results in later chapters and you will probably use them throughout
your mathematical career. This is a tough section where the proofs are difficult—
probably more difficult than we have done so far. If the section seems too tough
at this time, read the results, consider the examples carefully and maybe try the
proofs again later when you use the results—and hopefully are better at proofs.
We make the following two definitions.
Definition 2.3.1 The collection {Gα }α∈S of open subsets of R is an open cover
of the set E ⊂ R if E ⊂ ∪_{α∈S} Gα .
not compact, you only have to find one cover that has no finite subcover, than
it is to prove that a set is compact (to show that it is compact you must show
that for any open cover, you can find a finite subcover). Later we will use some
of our theorems to produce other sets that are compact (and some that are not
compact).
We need two different types of results concerning compactness. We first need
some general methods that help us determine when and if a given set is compact.
In addition we need some results that give us some of the useful properties of
compact sets—this is why we need and want the concept of compactness. We
begin with the following result.
Proposition 2.3.3 If K ⊂ R is compact, then K is closed.
Proof: We will prove this result by showing that K c is open (and then apply
Propositions 2.2.5 and 2.1.5).
Suppose that x ∈ K c . Then for any point y ∈ K, we can choose neighbor-
hoods Vy and Wy of points x and y, respectively, of radius r = |x − y|/4 (and
since x 6= y, r > 0). The collection of sets {Wy }, y ∈ K, will surely define
an open cover of the set K—y ∈ K implies y ∈ Wy . Since the set K is compact, we can choose a finite number of sets Wy1 , Wy2 , · · · , Wyn that cover K, i.e.
K ⊂ W = ∪_{k=1}^{n} Wyk .
Let V = ∩_{k=1}^{n} Vyk . Since each Vyk is a neighborhood of the point x and we
are considering only a finite number of such neighborhoods, V will also be a
neighborhood of the point x—of radius min{|x − y1 |/4, · · · , |x − yn |/4}. (Note
that the sets Vyk , k = 1, · · · , n form a nested set of neighborhoods all about the
point x—we don’t know what order. The set V will be the smallest of those
neighborhoods.) Since Vyk ∩ Wyk = ∅ for k = 1, · · · , n, V ∩ W = ∅. Since
K ⊂ W , V ⊂ K c —draw a picture, it’s easy. Therefore V is a neighborhood of
x ∈ K c such that V ⊂ K c , so x is an interior point of the set K c . Since x was
an arbitrary point of K c , the set K c is open, and then K = (K c )c is closed.
Since we know that the set N is closed but not compact, we know that
we cannot obtain the converse of the above result. We can however prove the
following "partial converse."
Proposition 2.3.4 If the set K ⊂ R is compact and F ⊂ K is closed, then F
is compact.
Otherwise, if F^c was not included in the subcover, then clearly Vα1 , · · · , Vαn
cover F .
In either case we have found a subcover of the collection of sets {Vα } which
covers F . Therefore F is compact.
You should understand that the above proof is especially abstract since you
start with a given open cover about which you know nothing. You still
must find a finite subcover—and we do. That is generally a tough job.
We next give a result that will be very important to us later. We will care
whether sets have limit points. This result guarantees that a set has a limit point
whenever the set is infinite and is contained in a compact set.
Proposition 2.3.5 If K ⊂ R is compact, and the set E is an infinite subset of
K, then E has a limit point in K.
Proof: Suppose the result is false, i.e. suppose that K is compact and E ⊂ K
is infinite and E has no limit points in K. Then for any x ∈ K (which would
not be a limit point of E) there exists a neighborhood of x, Nx , such that if
x ∈ E (it need not be), then Nx ∩ E = {x} and if x 6∈ E, then Nx ∩ E = ∅.
(Since x is not a limit point of E, it is not the case that every neighborhood of
x contains a point of E other than maybe x. Or there is some neighborhood of
x that does not contain any point of E other than maybe x.)
The collection of all such neighborhoods, Nx , x ∈ K, is surely an open
cover of K (the sets are open because they are neighborhoods and they cover
K because for any x ∈ K, x ∈ Nx ). Clearly no finite subcover of this collection
of sets can cover E—each set Nx contains at most one point of E and the set E
is infinite. If no finite subcover of this collection can cover E, no finite subcover
of this collection can cover K since E ⊂ K. This contradicts the fact that the
set K is compact. Therefore the set E has at least one limit point in K.
When you think about the next statement it probably seems clear. We need
this result proved because it will be very important for us.
Proposition 2.3.6 Suppose that {In }_{n=1}^{∞} is a collection of closed intervals in
R such that In+1 ⊂ In for all n = 1, 2, · · · ; then ∩_{n=1}^{∞} In is not empty.
This along with the inductive hypothesis implies that an ≤ an+k ≤ a(n+k)+1 ≤
b(n+k)+1 ≤ bn+k ≤ bn , i.e. an ≤ an+(k+1) ≤ bn+(k+1) ≤ bn which is what we
were to prove.
Therefore, by the Principle of Mathematical Induction, an ≤ an+k ≤ bn+k ≤ bn for all n, k ∈ N.
Using the first three inequalities of (2.3.1) and the last inequality of (2.3.2)
we see that an ≤ an+m ≤ bn+m ≤ bm . Thus for any m, bm is an upper bound
of E. Therefore x = lub(E) ≤ bm for all m. Thus, since am ≤ x ≤ bm for all m,
x ∈ Im for all m and x ∈ ∩_{n=1}^{∞} In . Therefore ∩_{n=1}^{∞} In ≠ ∅.
We next start proving some theorems that will give us a better idea of what
compact sets might look like. We begin with the first, very basic result.
Proposition 2.3.7 For a, b ∈ R with a < b the set [a, b] is compact.
We should be a little careful above where we chose n0 such that (b − a)/2^{n0} <
r/2. However, we can do this. By Corollary 1.5.4 (the Archimedean property) we
can choose n0 such that n0 > 2(b − a)/r (letting "a" = 1 and "b" = 2(b − a)/r
in Corollary 1.5.4—where "a" and "b" are the a and b of the Archimedean
property). It is then easy to use Mathematical Induction to prove that 2^n > n
for all n ∈ N, so that 2^{n0} > n0 > 2(b − a)/r. We don't really want to stop and prove everything like this but we
must realize that we must be ready and able to do so if asked.
We next prove a very important theorem that gives a characterization of
compact sets. This result is known as the Heine-Borel Theorem.
Theorem 2.3.8 (Heine-Borel Theorem) A set E ⊂ R is compact if and
only if E is closed and bounded.
Proof: (⇒) We begin by assuming that the set E is compact but is not
bounded. If E is not bounded we know that the set E either does not have
an upper bound or does not have a lower bound. Let’s suppose that E does not
have an upper bound. Then there exist points xn ∈ E such that xn > n for
n = 1, 2, · · · . Clearly the set E1 = {x1 , x2 , · · · } is an infinite subset of E that
does not have a limit point in E (the set E1 doesn’t even have a limit point in
R). This contradicts Proposition 2.3.5. Therefore the set E must be bounded.
We now suppose that E is not closed. This implies that there is a limit point of E, x0, such that x0 ∉ E. From Proposition 2.2.3 we know that every neighborhood of x0 contains infinitely many points of E. We will use a construction similar to that used in Proposition 2.2.3. Since x0 is a limit point of E, there exists a point x1 ∈ E such that x1 ∈ N1(x0) (neighborhood of radius 1). Likewise, there is a point x2 ∈ E such that x2 ∈ N1/2(x0). In general (or inductively), there exists a point xn ∈ E such that xn ∈ N1/n(x0) for n = 1, 2, · · ·. Set E1 = {x1, x2, · · ·}. E1 is a subset of E and is an infinite set. (Otherwise an infinite number of the xj's would have to be equal to a common value; since that value would lie in N1/n(x0) for arbitrarily large n, it would have to be x0, and since x0 ∉ E while each xn ∈ E, this is impossible.)
We want to show that E1 does not have a limit point in E. Since x0 ∉ E, we know that the limit point is not x0. We will next show that nothing else can be a limit point of E1. For any y0 ∈ R, y0 ≠ x0, set r = |y0 − x0|/2. Since |xn − x0| < 1/n, for all n > 2/|y0 − x0| we have |xn − y0| ≥ |y0 − x0| − |xn − x0| > 2r − r = r, so the neighborhood Nr(y0) contains at most finitely many points of E1 and y0 cannot be a limit point of E1. Thus E1 is an infinite subset of the compact set E which has no limit point in E. This contradicts Proposition 2.3.5. Therefore the set E is closed.
(⇐) Since E is bounded, there exist a, b ∈ R such that a < b and E ⊂ [a, b]. Since [a, b] is compact (by Proposition 2.3.7) and E is closed, E is compact by Proposition 2.3.4, which is what we were to prove.
Proposition 2.3.7 gives us a lot of compact sets. Theorem 2.3.8 makes it easier yet to determine whether certain sets are compact. For example, we know that the sets (0, 1), [0, 1] ∩ Q and [0, ∞) are not compact, and the sets {0, 1, 1/2, 1/3, · · ·} and [0, 1] ∪ {3/2} ∪ [2, 3] are compact. The next result helps us use the compact sets that we have to build more.
HW 2.3.3 (a) Give an open cover of the set (0, 1) that does not have a finite
subcover.
(b) Give an open cover of the set [1, ∞) that does not have a finite subcover.
Chapter 3
Limits of Sequences
We note that f (x) must be defined for each element x ∈ D. We also note that it
is not necessary that each element of R be associated with some element of D.
For D1 ⊂ D we define the set f (D1 ) = {y ∈ R : y = f (x) for some x ∈ D1 }.
f (D1 ) is called the image of D1 . Obviously, f (D) ⊂ R and f (D1 ) ⊂ R. Since
every element of R need not be associated with some element of D, f (D) need
not be equal to R. If f (D) = R, f is said to be onto and we say f maps D
onto R. When working with functions, the domain and range can be any sort
of sets. In our work the domain and the range will not only be subsets of the
set of real numbers, but (except for the definition of a sequence) will most often
be intervals of R or all of R. We will not dwell on these definitions now—we will try to make a point to explicitly define the domain, range, etc., in examples later.
Sequences: Definition and Examples As we see in the next definition, a
sequence is just a special function.
We can then write Definition 3.1.3 as follows: lim_{n→∞} an = L if for every neighborhood of L, N(L), there exists a neighborhood of infinity, NN(∞), such that n ∈ NN(∞) ⇒ an ∈ N(L). Other than notation there is no difference between this version of the definition and the one given in Definition 3.1.3.
(v) In addition to being able to write Definition 3.1.3 in terms of neigh-
borhoods we get results connecting limit points of sets and limits of sequences.
Proof: The proofs of both parts are easy. (a) We know by Definition 3.1.3 that for any neighborhood of L, Nr(L), there exists an N such that all of the points of {an} for n > N are in that neighborhood. Note that for all we know all of these sequence values could be the same—say if the sequence was a constant sequence.
(b) If we consider any neighborhood of L, Nr (L), and apply the definition of
the limit of a sequence to the sequence {an }, then there exists an N ∈ R such
that n ∈ N and n > N implies that an ∈ Nr (L). Since we have assumed that
the set E is infinite, we can find a point in Nr (L) ∩ E that is different from L.
Note that it is important for part (b) to assume that the set E is infinite.
For the sequence {an } where an = 1 for all n, then an → 1 but 1 is not a limit
point of the set {a1 , a2 , · · · } = {1}.
When you think about the definition of the limit of a sequence—or the graph
of a sequence—the above result is not surprising. The other direction—and it’s
not really a converse—is a bit more of a surprise.
Proof: We have essentially already proved this result. In the proof of Theorem 2.3.8 we considered the sequence of neighborhoods of x0, N1/n(x0), n = 1, 2, · · ·, and chose a point from each neighborhood, xn ∈ N1/n(x0). Clearly |xn − x0| < 1/n for all n. Then for any ε > 0 we can use Corollary 1.5.5–(b) to obtain N ∈ N such that 1/N < ε. (Step 1: Define N.) Then for n > N we have |xn − x0| < 1/n < 1/N < ε. (Step 2: Show that the defined N works.) Thus lim_{n→∞} xn = x0.
Thus we see that when E is a bigger set (has many points), if x0 is a limit point of E, we can always find a sequence of points of E converging to x0. For example, if E = [0, 2), then 3/2 is a limit point of E and xn = 3/2 + 1/(4n) → 3/2. Also 1 is a limit point of E and xn = 1 + 1/(2n) → 1. And 2 is a limit point of E and 2 − 1/n → 2.
Note that the sequence produced by the proof of Theorem 2.3.8 is such that xn ≠ x0 for any n. That extra property is not needed for this proof.
Of course we need some experience with proving limits—finding the N and
showing that it works. We will do that in the next section.
a while—exaggerated on this plot—the points representing the plot of the sequence enter the corridor formed by y = ±ε and never leave it. The points will never leave the corridor because they are positive and decreasing. It is of interest to see if we can sort of compute when the points will cross the line y = ε. We set 1/n = ε and solve for n as n = 1/ε. Of course ε would have to be special for 1/ε to be an integer. However, it should be clear that if we set N = 1/ε (this is the definition of N required by Step 1: we did it graphically, but we don't really care how we got it), then if n > N, the plotted values of 1/n have entered into the ±ε corridor, and because 1/n < ε (because n > N = 1/ε) and 1/n > 0, these plotted values will never leave the ±ε corridor, i.e. |1/n − 0| < ε. (This shows that the N defined as N = 1/ε works: Step 2.) Hence, we have defined an N = 1/ε such that if n > N = 1/ε, then |1/n − 0| < ε. Therefore by the definition of limit, Definition 3.1.3, lim_{n→∞} 1/n = 0. We could also complete Step 2 by noting that if n > N = 1/ε, then |1/n − 0| = 1/n < 1/N = ε. (This also shows that the N works: Step 2.)

The second way we will do this problem is the way that limit proofs are most often done. We suppose that ε > 0 is given. We need N so that n > N implies that |1/n − 0| = 1/n < ε. This last inequality is equivalent to n > 1/ε. Therefore if we choose N = 1/ε (definition of N: Step 1), then n > N = 1/ε implies that |1/n − 0| = 1/n < 1/N = ε (Step 2: The defined N works). Therefore 1/n → 0 as n → ∞.
Notice that in each method, the first graphically and the second algebraically,
we define N and then show that this N works, i.e. we satisfy the definition of
the limit. This is always the way limit proofs are done when we are applying
the definition of the limit. The first method illustrates that it really makes
no difference how you find N . If we can show rigorously that a particular N
works—even if we only guessed it—we are done.
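The two-step pattern (define N, then show it works) can be checked numerically for lim 1/n = 0. A sketch only: the helper name `N_for` and the sample ε values are made up for illustration.

```python
def N_for(eps):
    # Step 1 (define N): the dance |1/n - 0| = 1/n < eps gives n > 1/eps
    return 1.0 / eps

for eps in (0.3, 0.07, 3e-4):
    N = N_for(eps)
    for n in range(int(N) + 1, int(N) + 100):
        # Step 2 (N works): n > N = 1/eps forces 1/n < 1/N = eps
        assert abs(1.0 / n - 0.0) < eps
```

Note that, just as in the text, the N produced depends on ε: smaller ε forces a larger N.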
And finally, we note that the N that we found, N = 1/ε, depended on ε. This is perfectly permissible. The statement in the definition is that "for every ε > 0 there exists an N." The same N surely does not have to work for every ε. It is logical that N would generally have to depend on ε (though it is not a requirement), and with this dependence we can still satisfy the definition. We should understand that generally N will depend on ε in such a way that as ε gets smaller, N will get larger—as with N = 1/ε.
Example 3.2.2 Prove that lim_{n→∞} (1 − 1/n^2) = 1.
Solution: Again, we assume that we are given ε > 0. We want an N such that n > N implies |(1 − 1/n^2) − 1| = |−1/n^2| = 1/n^2 < ε. This last inequality is equivalent to n^2 > 1/ε or n > √(1/ε) = 1/√ε (because n > 0). Thus if we choose N = 1/√ε (or such that 1/N^2 = ε) (Define N: Step 1) and let n > N, we have |(1 − 1/n^2) − 1| = |−1/n^2| = 1/n^2 < 1/N^2 = ε (Step 2: N works). Thus lim_{n→∞} (1 − 1/n^2) = 1.
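The N chosen in Example 3.2.2 can be given a numeric sanity check; this is an illustration of the argument, not part of the proof, and the ε values are arbitrary (chosen so that 1/√ε is not an integer, to avoid floating-point edge cases).

```python
import math

def N_for(eps):
    # Step 1: |(1 - 1/n**2) - 1| = 1/n**2 < eps  iff  n > 1/sqrt(eps)
    return 1.0 / math.sqrt(eps)

for eps in (0.3, 0.02, 2e-4):
    N = N_for(eps)
    for n in range(int(N) + 1, int(N) + 50):
        # Step 2: any n > N keeps the term within eps of the limit 1
        assert abs((1 - 1 / n**2) - 1) < eps
```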
You should note the proof of a limit involves the first step, where we set |an − L| < ε and solve this inequality for n. This shows us how we should define N—by setting down the inequality we want to be satisfied as if it were true, we are able to see what we need to make it true. This is sort of a "mathematical limits dance." Other than the resulting definition of N, this step is not technically a part of the proof. It is a very common approach to help find N. We then show that this N works. And we emphasize: after we perform the first step and define N, we always must show that N satisfies the definition of a limit. If you understand all of the parts of the dance, this is often easy.
Also, as a part of the analysis above we first had an inequality n^2 > 1/ε and took the square root of both sides. As a part of solving the inequality for n, we often have to perform operations on both sides of an inequality. The question is why you can take the square root of both sides (or perform some other operation on both sides of an inequality)? In the case of the square root, the key fact is that the square root function is increasing on the positive reals, so taking square roots of both sides of an inequality between positive quantities preserves the direction of the inequality.
Before we move on to limits that do not exist, we prove one more limit. We emphasize that we are cheating: as a part of the next example we will use the natural logarithm, ln, and the exponential, exp. We will define these functions and prove properties of the ln and exp functions in Section 7.7. We could wait on this example until then, but we are probably better served doing it now. There are no circular arguments involved.
Example 3.2.4 Prove that lim_{n→∞} 1/2^n = 0.
Solution: We proceed as we did in the last example. We suppose that we are given ε > 0 and we need to find N such that n > N implies that |1/2^n − 0| = 1/2^n < ε. This last inequality is equivalent to 2^n > 1/ε. Taking the logarithm base e of both sides gives (because ln is an increasing function) ln 2^n = n ln 2 > ln(1/ε) = −ln ε, or n > −ln ε/ln 2.
Thus we see that if we choose N = −ln ε/ln 2 (Step 1: Define N) and consider n > N, then we have n > N = −ln ε/ln 2 or n ln 2 = ln 2^n > −ln ε = ln(1/ε). Taking the exponential of both sides, we get 2^n > 1/ε, i.e. 1/2^n < ε or |1/2^n − 0| < ε (N works: Step 2), and therefore 1/2^n approaches 0 as n approaches ∞.
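As a numeric sanity check of the N chosen in Example 3.2.4 (an illustration only, with arbitrary ε values):

```python
import math

def N_for(eps):
    # Step 1: 2**n > 1/eps  iff  n > -ln(eps)/ln(2)
    return -math.log(eps) / math.log(2)

for eps in (0.3, 0.01, 1e-6):
    N = N_for(eps)
    for n in range(int(N) + 1, int(N) + 40):
        # Step 2: every n > N keeps 1/2**n below eps
        assert abs(1 / 2**n - 0) < eps
```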
We note that we can take the ln and exp of both sides of the inequality because they are both increasing functions—cheating again, but think of the graphs of these functions and it will be clear that they are increasing.
You should also note that we have used the fact that ln 2 ≈ 0.69 > 0, allowing us to divide both sides of the inequality by ln 2 while keeping the direction of the inequality the same. Also note that we write the definition of N as N = −ln ε/ln 2 = ln(1/ε)/ln 2.
We should note that both of these cases of nonexistence of limits can be illustrated graphically—you should be very careful before you claim that the picture gives you a proof. You will be asked to graphically illustrate the nonexistence of several limits of sequences in HW3.2.2. In Figure 3.2.2 we draw a picture much
like we did in Figure 3.1.1—choose some L and ε, plot the point (0, L) and draw the lines y = L ± ε. To illustrate the non-existence in Example 3.2.5 we choose an arbitrary L and let ε = 1. We then plot some sequence values an = n^2 + 1. We note that sooner or later the sequence points go outside of the y = L ± ε corridor (actually above the corridor) and stay out of there forever—we did not get to plot many points in Figure 3.2.2 because n^2 + 1 grows large quickly. Since the sequence clearly leaves the L ± ε corridor and never comes back, the limit is surely not equal to L.
if for every ε > 0 there exists N such that n > N implies |an − L| < ε.
One might ask if it is necessary to apply the definition so that it works for every ε > 0. We might try the following candidate for the definition.
F1: if for some ε > 0 there exists N such that n > N implies |an − L| < ε.
If this were the definition, life would be much easier. Consider an = 1/n and choose ε = 0.1. If we choose N = 13, then n > N = 13 implies that
|1/n − 0| = 1/n < 1/N = 1/13 < 0.1.
If F1 were the definition, the above computation would imply that lim_{n→∞} 1/n = 0. This is the same result that we got in Example 3.2.1 (which we hope our intuition tells us is the correct limit). We see that F1 is easy to apply. However, using the same ε = 0.1 we see that if we choose N = 100, then n > N = 100 implies that
|1/n − 0.001| ≤ |1/n| + 0.001 < 0.01 + 0.001 < 0.1.
So the same ε would imply that lim_{n→∞} 1/n = 0.001. Further calculations using F1 would give us a large assortment of answers for lim_{n→∞} 1/n (this makes it very difficult to grade homework). And different choices of ε would give us more values of the limit. Clearly F1 is a bad choice—it is not a strong enough criterion to serve as our definition.
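The weakness of F1 can be demonstrated numerically. A sketch: `satisfies_F1` is a throwaway helper (not from the text) that checks F1 over a long finite window as a stand-in for the infinite tail of the sequence.

```python
# With eps fixed at 0.1, the weakened criterion F1 accepts many "limits" for 1/n.
eps = 0.1

def satisfies_F1(L, N, how_many=10_000):
    # F1: for SOME eps there is an N with |a_n - L| < eps for all n > N
    return all(abs(1 / n - L) < eps for n in range(int(N) + 1, how_many))

assert satisfies_F1(0.0, 13)      # the true limit passes...
assert satisfies_F1(0.001, 100)   # ...but so does 0.001
assert satisfies_F1(0.05, 100)    # ...and 0.05: F1 cannot pin down a unique limit
```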
If we instead tried
F2: if for every ε > 0 and for every N, n > N implies |an − L| < ε,
this proposal would be in big trouble. Here we are claiming that for any ε > 0, the implication in the definition must be true for all N. That is just too strong of a requirement.
HW 3.2.2 (a) Consider the sequence {(2n^2 + 4)/(n + 3)}_{n=1}^∞. Illustrate graphically that lim_{n→∞} (2n^2 + 4)/(n + 3) does not exist.
(b) Consider the sequence {(−1)^n + 1/n}_{n=1}^∞. Illustrate graphically that lim_{n→∞} ((−1)^n + 1/n) does not exist.
(c) Prove that lim_{n→∞} ((−1)^n + 1/n) does not exist (i.e. use the definition).
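Related to the sequence in HW 3.2.2(b), here is a quick numeric look (not a proof, and not the requested graph): the terms cluster near two different values, +1 and −1, so no single L can capture the whole tail.

```python
a = lambda n: (-1) ** n + 1 / n

evens = [a(n) for n in range(4, 200, 2)]  # terms with even n sit near +1
odds = [a(n) for n in range(3, 200, 2)]   # terms with odd n sit near -1
assert all(abs(x - 1) < 0.5 for x in evens)
assert all(abs(x + 1) < 0.5 for x in odds)
```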
HW 3.2.3 (a) Prove that the limit lim_{n→∞} (2n^2 + 4)/(3n^2 + 1) = 2/3 (use the definition).
(b) Prove that the limit lim_{n→∞} (2n^2 + 4)/(3n^2 + n + 1) = 2/3 (use the definition).
HW 3.2.4 Suppose that {an} and {bn} are sequences such that lim_{n→∞} an = lim_{n→∞} bn = 0. Prove that lim_{n→∞} an·bn = 0.
the definition of a sequential limit to show that the sequence {an} and L satisfy the definition. But what if, after you read Example 3.2.3 and gain an understanding of how Definition 3.1.3 was used to show that lim_{n→∞} (2n + 3)/(5n + 7) = 2/5, one of your classmates says that the limit is really 5/2 and claims that she can apply Definition 3.1.3 to prove it? Could the text (and your reading of the text) and your classmate both be right? Has it been made clear that a sequence can't have two distinct limits that satisfy the definition? We answer these questions with the following proposition.
Proposition 3.3.1 Suppose {an} is a real sequence and lim_{n→∞} an exists. The limit is unique.
for n > N1, L1 − ε1 < an < L1 + ε1,
and
for n > N2, L2 − ε2 < an < L2 + ε2.
To help see why this assumption is clearly false, we have included the plot in Figure 3.3.1. Since L1 ≠ L2, we have chosen ε1 and ε2 sufficiently small so that the L1 ± ε1 and L2 ± ε2 corridors do not intersect. Yet for all n > N1 all of the values an must be in the L1 ± ε1 corridor, and for n > N2 all of the values an must be in the L2 ± ε2 corridor—we try to illustrate this in the plot, but it isn't going to happen—the question marks signify the fact that we can't put them in both places.
For the proof we set ε1 = ε2 = |L1 − L2|/2. For convenience and without loss of generality, we assume that L1 > L2, so ε1 = ε2 = (L1 − L2)/2—one of the two values must be larger than the other. Define N = max{N1, N2}. Then for all n > N (so that n > N1 and n > N2) we will have
(L1 + L2)/2 = L1 − (L1 − L2)/2 < an < L1 + (L1 − L2)/2,  (3.3.1)
and
L2 − (L1 − L2)/2 < an < L2 + (L1 − L2)/2 = (L1 + L2)/2.  (3.3.2)
The left most inequality in statement (3.3.1) and the right most inequality in statement (3.3.2) give us, for all n > N = max{N1, N2},
(L1 + L2)/2 < an < (L1 + L2)/2,
which is impossible. Therefore the limit is unique.
Proof: (a) Suppose ε > 0 is given. The first two hypotheses give us that
for any ε1 > 0 there exists an N1 such that n > N1 implies that |an − L1| < ε1,  (3.3.4)
and
for any ε2 > 0 there exists an N2 such that n > N2 implies that |bn − L2| < ε2.  (3.3.5)
We note that, taking ε1 = 1 in (3.3.4), for n > N1 we have
|an| = |(an − L1) + L1| ≤ |an − L1| + |L1| < 1 + |L1|,
or |an| < |L1| + 1. This inequality bounds most of the sequence {an}. If we let
N0 = [N1], where the bracket function is defined by letting [x] be the largest integer less than or equal to x, then the inequality |an| < |L1| + 1 for n > N1 bounds |an| for n = N0 + 1, N0 + 2, · · ·. Thus we set K = max{|a1|, |a2|, · · ·, |aN0|, |L1| + 1} and we have our desired result.
We note that in the above proof it would have been convenient if we had always defined N to be a natural number (so that we didn't have to use [N]). However, we have had many instances when it has been convenient for us to only require that N ∈ R. We should also note that it is the completeness axiom that assures us that such an integer N0 exists. We are defining [N] to be the least upper bound of the set {n ∈ N : n ≤ N}—which surely exists because the set is bounded above by N.
(d) Suppose ε > 0 is given. The first two hypotheses give us that
for any ε1 > 0 there exists an N1 such that n > N1 implies that |an − L1| < ε1,  (3.3.7)
and
for any ε2 > 0 there exists an N2 such that n > N2 implies that |bn − L2| < ε2.  (3.3.8)
We must find an N such that n > N implies that |an·bn − L1·L2| < ε. It should not surprise you that the proof will be similar to that given in part (a)—with a different dance. We note that
|an·bn − L1·L2| = |an·bn − an·L2 + an·L2 − L1·L2| ≤ |an||bn − L2| + |L2||an − L1|.  (3.3.9)
(To verify the first step just multiply out the second expression. The first inequality is due to the triangular inequality, Proposition 1.5.8–(v)—we will use this often. The last step just uses |xy| = |x||y|, Proposition 1.5.8–(ii).) Then starting with expression (3.3.9) and using (3.3.7), n > N1, (3.3.8), n > N2 and part (c) of this proposition, we get
|an·bn − L1·L2| ≤ K·ε2 + |L2|·ε1,  (3.3.10)
where K is the bound of the sequence {an} given by part (c) of this proposition. Thus we see that if we choose ε2 = ε/(2K), ε1 = ε/(2|L2|) and N = max{N1, N2} (so that both of the inequalities in (3.3.7) and (3.3.8) are satisfied), we have |an·bn − L1·L2| < ε whenever n > N. Therefore lim_{n→∞} an·bn = L1·L2.
We note that in the last step we must assume that K and |L2| are nonzero. It is easy to see that we can assume that K ≠ 0—if K is a bound so is K + 7. Assuming that |L2| = 0 is a real assumption, so we must prove the result separately for this case. If |L2| = 0, then L2 = 0 and L1·L2 = 0, and the fact that lim_{n→∞} bn = 0 implies that for any ε2 > 0 there exists an N2 such that n > N2 implies that |bn| < ε2. Then let ε2 = ε/K and statement (3.3.10) can be replaced by
|an·bn − 0| = |an||bn| ≤ K|bn| < K(ε/K) = ε.
Thus lim_{n→∞} an·bn = 0 = L1·L2.
We should note that when many mathematicians are doing proofs such as this one, they will often let ε1 = ε2 = ε, obtain expression (3.3.10) (with ε replacing ε1 and ε2) and claim that they are done. And they are. The right-hand side of expression (3.3.10) would be (K + |L2|)ε. Because of the ε we are able to make |an·bn − L1·L2| arbitrarily small—which is really our goal. However, we don't technically satisfy Definition 3.1.3. But it should also be clear at this time that the (K + |L2|)ε term can be fixed up so as to give the desired result. Textbooks will generally fix it up so that they always end with just an ε at the end of the inequality—it's just a bit cleaner. But don't be surprised if you see this "sloppier" (but correct) approach in classes and talks.
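The key estimate behind part (d) can be spot-checked numerically. A sketch with made-up limits L1, L2 and randomly perturbed sequences; the inequality itself is the one derived in (3.3.9).

```python
import random

# Check: |a_n*b_n - L1*L2| <= |a_n|*|b_n - L2| + |L2|*|a_n - L1|
random.seed(0)
L1, L2 = 2.0, -3.0
for n in range(1, 1001):
    an = L1 + random.uniform(-1, 1) / n   # a_n -> L1
    bn = L2 + random.uniform(-1, 1) / n   # b_n -> L2
    lhs = abs(an * bn - L1 * L2)
    rhs = abs(an) * abs(bn - L2) + abs(L2) * abs(an - L1)
    assert lhs <= rhs + 1e-12  # tiny slack for floating-point rounding
```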
We now have some of the basic results that let us compute easy limits. We know from Example 3.2.1 that lim_{n→∞} 1/n = 0. It should be easy to see that we can use part (d) of the above theorem to get lim_{n→∞} 1/n^2 = 0, and another application of part (d) will give lim_{n→∞} 1/n^3 = 0. We are able to obtain the following more general result.
Example 3.3.1 Prove that lim_{n→∞} 1/n^k = 0 for any k ∈ N.
Solution: We hope that you realize that this result is a natural for mathematical induction.
Step 1: Prove true for k = 1. Example 3.2.1 shows that it is true for k = 1.
Step 2: Assume true for k = j, i.e. lim_{n→∞} 1/n^j = 0.
Step 3: Prove true for k = j + 1, i.e. prove that lim_{n→∞} 1/n^{j+1} = 0. This proof is an easy application of part (d) of Proposition 3.3.2. We write 1/n^{j+1} as (1/n^j)(1/n). We know from Example 3.2.1 that lim_{n→∞} 1/n = 0. We know from the inductive assumption, Step 2, that lim_{n→∞} 1/n^j = 0. Then by part (d) of Proposition 3.3.2 we have
lim_{n→∞} 1/n^{j+1} = (lim_{n→∞} 1/n^j)(lim_{n→∞} 1/n) = 0.
Therefore the proposition is true for k = j + 1.
By the principle of mathematical induction the proposition is true for all k ∈ N.
Note that you must be careful to not mix up the fact that usually our
statements to be proved by math induction were given in terms of n and we
used k as our dummy index. In this case, since n was already in use, our
statement is given in terms of k and we used j as our dummy index. It is only
a matter of notation.
Also, we might be inclined to want to prove the above result using part (d) of Proposition 3.3.2 (k − 1 times) and Example 3.2.1 to show that
lim_{n→∞} 1/n^k = (lim_{n→∞} 1/n) · · · (lim_{n→∞} 1/n)  (repeated k times)
= 0.
This is a perfectly good approach. Hopefully you realize that when you include the "three dots" you are including a math induction proof in disguise—and hopefully an easy one. The result needed here is the extension of part (d) of Proposition 3.3.2 that can be stated as follows. Let {a_{j,n}}_{n=1}^∞, j = 1, · · ·, k, denote k real sequences such that lim_{n→∞} a_{j,n} = L_j for j = 1, · · ·, k. Then lim_{n→∞} a_{1,n} · · · a_{k,n} = L_1 · · · L_k. It is hoped that you realize that this statement can be proved easily by mathematical induction—like the proof given above.
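The conclusion of Example 3.3.1 can also be checked directly from the "dance": for a_n = 1/n^k the inequality 1/n^k < ε is equivalent to n > (1/ε)^{1/k}. A numeric illustration with arbitrary (k, ε) pairs:

```python
# For a_n = 1/n**k, N = (1/eps)**(1/k) works; checking a few (k, eps) pairs.
for k in (1, 2, 5):
    for eps in (0.1, 1e-3):
        N = (1 / eps) ** (1 / k)
        for n in range(int(N) + 1, int(N) + 30):
            assert 1 / n**k < eps
```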
We next note that we can use parts (a) and (b) of Proposition 3.3.2 and Example 3.3.1 to show that
lim_{n→∞} (1 − 1/n^2) = lim_{n→∞} 1 + lim_{n→∞} (−1)(1/n^2) = 1 + (lim_{n→∞} (−1))(lim_{n→∞} 1/n^2) = 1 + (−1)·0 = 1,
which is the same result we got in Example 3.2.2. We note that once we have proved Proposition 3.3.2, HW3.2.1-(b) and Example 3.3.1, the proof of the limit given here is every bit as rigorous as the proof given in Example 3.2.2.
In the next example we prove a more general result that can be useful. Let p
denote a k-th degree polynomial p(x) = a0 xk + a1 xk−1 + · · · + ak−1 x + ak where
a0 , a1 , · · · , ak are real.
Example 3.3.2 Prove that lim_{n→∞} p(1/n) = ak.
Solution: Again it should be clear that this proof could be done by induction. Instead, we will prove this result using the extension of part (a) of Proposition 3.3.2 along with part (b) and Example 3.2.1 to see that
lim_{n→∞} p(1/n) = lim_{n→∞} [a0(1/n^k) + a1(1/n^{k−1}) + · · · + ak−1(1/n) + ak]
= lim_{n→∞} a0(1/n^k) + lim_{n→∞} a1(1/n^{k−1}) + · · · + lim_{n→∞} ak−1(1/n) + lim_{n→∞} ak
= (lim_{n→∞} a0)(lim_{n→∞} 1/n^k) + · · · + (lim_{n→∞} ak−1)(lim_{n→∞} 1/n) + lim_{n→∞} ak
= ak.
This is a nice straightforward proof of the desired result, but it does depend strongly on the use of the extension of part (a) of Proposition 3.3.2—which you should be confident follows easily from part (a), or you should prove it.
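The polynomial limit can be illustrated numerically with a hypothetical cubic (the coefficients below are made up; the point is only that p(1/n) approaches the constant term ak = p(0)):

```python
def p(x):
    # a sample polynomial p(x) = 2x**3 - x + 5, so a_k = 5
    return 2 * x**3 - x + 5

diffs = [abs(p(1 / n) - 5) for n in (10, 100, 1000)]
assert diffs == sorted(diffs, reverse=True)  # the error shrinks as n grows
assert diffs[-1] < 1e-2
```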
HW 3.3.1 (True or False and why) (a) If lim_{n→∞} |an| exists, then lim_{n→∞} an exists.
(b) If lim_{n→∞} an/bn and lim_{n→∞} bn exist, then lim_{n→∞} an exists.
(c) If lim_{n→∞} an exists, then lim_{n→∞} a3n exists.
(d) If lim_{n→∞} a3n exists, then lim_{n→∞} an exists.
(e) If lim_{n→∞} an·bn exists, then lim_{n→∞} an and lim_{n→∞} bn exist.
The first step above follows from the fact that the expression inside of the limit is exactly the same for the first two terms—the second term is found by multiplying the first by (1/n^2)/(1/n^2). In the second step of the calculation we would be using part (b) of Proposition 3.4.1 given below, and then two applications of Example 3.3.2 (or in place of Example 3.3.2 you can use parts (a), (b) of Proposition 3.3.2 and Example 3.3.1). Thus we next include the quotient rule for sequential limits.
Proposition 3.4.1 Suppose that {an} and {bn} are real sequences, lim_{n→∞} an = L1 and lim_{n→∞} bn = L2. Then we have the following results.
(a) If L2 ≠ 0, then there exist an M ∈ R and an N3 ∈ R such that |bn| ≥ M for all n > N3.
(b) If L2 ≠ 0, then lim_{n→∞} an/bn = L1/L2.
Proof: (a) As in the other proofs, the hypotheses for this result imply that for every ε2 > 0 there exists an N2 ∈ R so that for n > N2, |bn − L2| < ε2. We are also given the fact that L2 ≠ 0. This is another result for which it is convenient to draw a picture. In Figure 3.4.1 we have used the fact that L2 ≠ 0 to choose an ε2 so that the L2 ± ε2 corridor forces the sequence values to be away from zero for all n greater than some N3, i.e. we have chosen an ε2 so that y = 0 is not in the L2 ± ε2 corridor.
The easiest way to accomplish this is to choose ε2 = |L2|/2. Then the hypothesis implies that there exists an N2 ∈ R such that n > N2 implies that |bn − L2| < |L2|/2 or |L2 − bn| = |bn − L2| < |L2|/2. Then by the backwards triangular inequality, Proposition 1.5.8–(vi), for n > N3 = N2 we get
|bn| ≥ |L2| − |L2 − bn| > |L2| − |L2|/2 = |L2|/2,
so (a) holds with M = |L2|/2.
Thus if we choose ε1 = Mε/2 and ε2 = M|L2|ε/(2|L1|), we can apply inequalities (3.4.4)–(3.4.5) to see that for n > N
|an/bn − L1/L2| < (|L2|(Mε/2) + |L1|[M|L2|ε/(2|L1|)])/(|L2|M) = ε.
Therefore an/bn → L1/L2 as n → ∞.
Note that in the above argument we have assumed that L1 ≠ 0. If L1 = 0, we have that for any ε1 > 0 there exists an N1 such that n > N1 implies that |an| < ε1. Thus for n > N = max{N1, N3},
|an/bn − 0| = |an|/|bn| < ε1/M.
Then choosing ε1 = Mε, we see that |an/bn − 0| < ε for n > N, so
lim_{n→∞} an/bn = 0 = L1/L2.
There are other results that we need or would like concerning limits of sequences. We can use the definition to prove that (−1)^n/n converges to zero. However, a tool that can be used to prove the convergence of this limit and many others is the following proposition, referred to as the Sandwich Theorem.
Proposition 3.4.2 Suppose that {an}, {bn} and {cn} are real sequences for which lim_{n→∞} an = lim_{n→∞} cn = L and an ≤ bn ≤ cn for all n greater than some N1. Then lim_{n→∞} bn = L.
Proof: We suppose that we are given an ε > 0. The two limit hypotheses give us that there exists an N2 such that n > N2 implies that |an − L| < ε, or
L − ε < an < L + ε,  (3.4.8)
and there exists an N3 such that n > N3 implies that |cn − L| < ε, or
L − ε < cn < L + ε.  (3.4.9)
Then if we use the left inequality of (3.4.8), the right inequality of (3.4.9) and the hypothesis that an ≤ bn ≤ cn, we find that if n > N = max{N1, N2, N3}, then
L − ε < an ≤ bn ≤ cn < L + ε,
or |bn − L| < ε. Therefore lim_{n→∞} bn = L.
It should then be easy to see that we can use the inequality −|an | ≤
(−1)n an ≤ |an | (see HW3.3.2) and Proposition 3.4.2 to obtain the following
result.
Corollary 3.4.3 If limn→∞ an = 0, then limn→∞ (−1)n an = 0.
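The sandwich setup for (−1)^n/n can be illustrated numerically (a sketch, not a proof; the ε value is arbitrary):

```python
# Sandwich: -1/n <= (-1)**n/n <= 1/n, and both outer sequences tend to 0.
for n in range(1, 200):
    bn = (-1) ** n / n
    assert -1 / n <= bn <= 1 / n

# the N = 1/eps that works for the outer sequences also pins down b_n
eps = 1e-3
n = int(1 / eps) + 1
assert abs((-1) ** n / n) < eps
```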
Proof: (⇒) If lim_{n→∞} an = L, then for every ε > 0 there exists an N ∈ R such that n > N implies that |an − L| < ε. Consider any subsequence of {an}, {ank}. Clearly if nk > N, then |ank − L| < ε. Let K be such that nK ≤ N < nK+1 (nK can be defined to be lub{nk : nk ≤ N}). Then k > K implies that nk > N and |ank − L| < ε. Therefore lim_{k→∞} ank = L.
(⇐) Suppose false, i.e. every subsequence of {an} converges to L but lim_{n→∞} an ≠ L (either it doesn't exist or it exists and does not equal L). The limit lim_{n→∞} an ≠ L if for some ε > 0 and every N ∈ R there exists an n > N for which |an − L| ≥ ε.
Let N = 1 and denote by n1 the value (of n) such that |an1 − L| ≥ ε.
Then let N = n1 and denote by n2 the element of N such that n2 > N = n1 and |an2 − L| ≥ ε.
Continue in this fashion and get a sequence of natural numbers {nk} such that n1 < n2 < n3 < · · · and |ank − L| ≥ ε for all k. Thus the subsequence {ank} does not converge to L. This is a contradiction, so lim_{n→∞} an = L.
The next result is an important result for later work, known as the Bolzano–Weierstrass Theorem.
empty. Let x0 be such that x0 ∈ ∩_{n=1}^∞ In. (Because the length of the intervals goes to zero, there is really only one point in the intersection—but we don't care—finding one point is enough.)
Now choose the subsequence {xnj } as follows:
Choose xn1 as one of the terms of the sequence {xn } such that xn1 ∈ I1 (since
I1 contains infinitely many elements of E1 , this is surely possible).
Choose xn2 ∈ I2 from the terms of the part of the original sequence {xn1 +1 , · · · }
(i.e. such that n2 > n1 )—I2 contained infinitely many elements of E1 so there
are still enough to choose from.
In general choose xnj ∈ Ij so that nj > nj−1 —since Ij contained infinitely
many elements of E1 there are still plenty of elements to choose from. Do so for
all j, j = 1, · · · .
Since x0 ∈ ∩_{n=1}^∞ In, x0 ∈ Ij for all j. Since xnj ∈ Ij also, |xnj − x0| ≤ (b1 − a1)/2^{j−1} → 0 as j → ∞, and the subsequence {xnj} converges to x0.
One very easy result that we obtain from the Bolzano–Weierstrass Theorem
is the following.
Corollary 3.4.8 If the set K ⊂ R is compact, then every sequence in K has a
convergent subsequence that converges to a point in K.
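The interval-halving construction in the Bolzano–Weierstrass proof can be mimicked numerically. This is a loose sketch: the bounded sequence x_n = nφ mod 1 is made up for illustration, and a finite index pool stands in for the proof's "infinitely many" terms (so the majority count below only approximates "the half containing infinitely many elements").

```python
import math

phi = (1 + math.sqrt(5)) / 2
x = lambda n: (n * phi) % 1.0   # a bounded sequence in [0, 1]

pool = list(range(1, 100_001))  # finite stand-in for an infinite index set
a, b = 0.0, 1.0
last, sub = 0, []
for _ in range(8):
    m = (a + b) / 2
    left = [n for n in pool if x(n) <= m]
    # keep the half holding more of the remaining terms
    if len(left) >= len(pool) - len(left):
        pool, b = left, m
    else:
        pool = [n for n in pool if x(n) > m]
        a = m
    last = next(n for n in pool if n > last)  # enforce n_j > n_{j-1}
    sub.append(x(last))

# successive subsequence values are trapped in nested halved intervals,
# so they crowd together just as in the proof
assert abs(sub[-1] - sub[-2]) <= (b - a) * 2
```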
ε = 1, then for any N and n > N, we can find an m > N, say m = n + 5, such that |an − am| = |n − m| = 5 > ε.
We next include a lemma that is really part of the proof of the Cauchy
criterion. We are separating it out because it may be useful in its own right.
Lemma 3.4.10 If the sequence {an } is a Cauchy sequence, then the sequence
is bounded.
Proof: (⇒) Begin by supposing that an → L. We know that for any ε1 > 0 there exists N ∈ R such that n > N implies that |an − L| < ε1. Now suppose that we are given an ε > 0, choose ε1 = ε/2 and let N ∈ R be the value promised us by the convergence of the sequence {an}. Then if n, m > N,
|an − am| = |(an − L) + (L − am)| ≤* |an − L| + |L − am| < 2ε1 = ε,
where the step labeled ≤* is due to the triangular inequality, Proposition 1.5.8-(v). Therefore {an} is a Cauchy sequence.
(⇐) Suppose that {an } is a Cauchy sequence. Let E be the set of points
{a1 , a2 , · · · }. By Lemma 3.4.10 the sequence {an }, and hence the set E, is
bounded. Then we know by the Bolzano–Weierstrass Theorem, Theorem 3.4.7,
the sequence {an } has a convergent subsequence, say {ank }. Let L be such that
ank → L as k → ∞.
We will now proceed to prove that {an} converges to L. Suppose ε > 0 is given and let N ∈ R be such that n, m ∈ N and n, m > N implies |an − am| < ε/2 (because {an} is a Cauchy sequence). Let N2 ∈ R be such that k ∈ N and k > N2 implies that |ank − L| < ε/2 (because ank → L). Let nK (where nK is one of the subscripts from the subsequence, n1, n2, · · ·) be a fixed integer such that K > N2. In addition require that nK > N—if we have found an appropriate nK, we can always use a larger value. Hence, we have that |anK − L| < ε/2 and if n > N we have that |anK − an| < ε/2. Thus for n > N,
|an − L| = |(an − anK) + (anK − L)| ≤* |an − anK| + |anK − L| < ε/2 + ε/2 = ε,
where the step labeled ≤* follows by the triangular inequality. Therefore an → L as n → ∞.
HW 3.4.1 (True or False and why) (a) If lim_{n→∞} 1/an exists, then lim_{n→∞} an exists.
(b) Consider the sequence {an }. If the subsequences {a2n } and {a2n+1 } both
converge, then the sequence {an } converges.
(c) If {an } is a sequence of rationals in [0, 1], then {an } has a subsequence that
converges to a rational in [0, 1].
(d) If an < 0 for all n > N for some N ∈ R and lim an exists, then lim an ≤ 0.
n→∞ n→∞
(e) The sequence {1/n^2} is a Cauchy sequence.
(f) There is a sequence that consists of the set of rationals in [0, 1].
HW 3.4.2 Prove that the sequence {(1/n) sin n} converges.
HW 3.4.3 Prove that there exists a subsequence {nk } of N such that {cos nk }
converges.
HW 3.4.4 Use the definition, Definition 3.4.9, to prove that {1/n3 } is a Cauchy
sequence.
HW 3.4.5 Let {an } represent the sequence that contains all of the rational
numbers in [0, 1]. Explain why the sequence {an } is not convergent. Prove
that {an } has a convergent subsequence. Describe one of the convergent subse-
quences of {an }.
It should not be hard to see that the sequences {−1/n}, {−1/n^2}, {1 − 1/n^2} and {3^n} are monotonically increasing (they're strictly increasing too) and that the sequences {1/n}, {1/n^2}, {1 + 1/n^2} and {(1/2)^n} are monotonically decreasing (and strictly decreasing). Likewise, it should be clear that the sequences {(−1)^n}, {(−1)^n (1/n)} and {1 + (−1)^n (1/n)} are not monotonic sequences. The easiest approach to demonstrate that a sequence such as {1 − 1/n^2} is monotone increasing is by setting a_{n+1} ≥? a_n, i.e.

    1 − 1/(n+1)^2 ≥? 1 − 1/n^2

—which at this time we do not know is true (we have placed a question mark, ?, over the inequality to indicate that you don't know that it is true), and then simplifying the inequality with reversible steps until you arrive at an inequality that you know is true or that you know is false. In this case we see that

    1 − 1/(n+1)^2 ≥? 1 − 1/n^2   is the same as
    1/(n+1)^2 ≤? 1/n^2   (subtract 1 from both sides and multiply both sides by −1) is the same as
    n^2 ≤? (n+1)^2 = n^2 + 2n + 1   (multiply both sides by n^2 (n+1)^2 and simplify) is the same as
    0 ≤? 2n + 1   (subtract n^2 from both sides).

The last inequality is clearly true for all n ∈ N, and since every step is reversible, the sequence {1 − 1/n^2} is monotonically increasing.
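The reversible-steps argument can be sanity-checked numerically. The sketch below (plain Python; the function name and cutoff are our own choices) verifies both the original inequality a_{n+1} ≥ a_n and the reduced inequality 0 ≤ 2n + 1 for a range of n; of course a finite check is an illustration, not a proof.

```python
# Check that a_n = 1 - 1/n^2 is monotonically increasing for many n.
# A finite numerical sanity check, not a proof.
def a(n):
    return 1.0 - 1.0 / n**2

assert all(a(n + 1) > a(n) for n in range(1, 5000))

# The equivalent reduced inequality 0 <= 2n + 1 from the last step:
assert all(0 <= 2 * n + 1 for n in range(1, 5000))
```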
Theorem 3.5.2 (Monotone Convergence Theorem) Suppose that {a_n} is a sequence of real numbers. (a) If {a_n} is monotonically increasing and bounded above, then {a_n} converges. (b) If {a_n} is monotonically decreasing and bounded below, then {a_n} converges. (c) If {a_n} is unbounded, then {a_n} does not converge.
Proof: This is a very important theorem that is especially nice because the proof is really easy—in fact, when you think about it, it's obvious. Consider part (a). If the sequence is monotonically increasing, then it surely cannot be the type of sequence that does not converge because it oscillates back and forth between two distinct numbers. If the sequence is bounded, the sequence cannot be the type of sequence that does not converge because it goes off to infinity. There's really nothing left.
We begin the proof as usual by supposing that we are given ε > 0. For convenience let S = {a_n : n ∈ N} and L = lub(S)—which exists because the sequence is bounded above. Recall that from Proposition 1.5.3–(a) we know that for any ε > 0 there exists some a_{n_0} ∈ S such that L − a_{n_0} < ε. Then for all n > N = n_0 (Step 1: Define N), by the fact that the sequence {a_n} is monotonically increasing, a_n ≥ a_{n_0} > L − ε. Also for all n > N = n_0 (really for all n), because L = lub(S) is an upper bound of S, a_n ≤ L < L + ε. Therefore, for n > N we have L − ε < a_n < L + ε, i.e. |a_n − L| < ε, so lim_{n→∞} a_n = L.
(b) We will not include the proof of part (b). You should make sure that
you understand that part (b) follows from Proposition 1.5.3–(b) in the same
way that (a) followed from Proposition 1.5.3–(a)—or you could consider the
sequence {−an } and apply part (a) of this theorem.
(c) This statement was only included in the theorem for completeness.
The contrapositive of the statement is that if the sequence is convergent, it
is bounded—but we already know that to be true for any sequence (monotone
or not) by Proposition 3.3.2–(c).
The Monotone Convergence Theorem has many applications and is an im-
portant theorem. At this time we will use it to prove a very useful limit.
Example 3.5.1 Prove that if |c| < 1 then lim_{n→∞} c^n = 0.
Solution: (You should be aware that if c = 1, the limit is one. If c > 1, the sequence is
unbounded so the limit does not exist—or as we will soon show, the limit is infinity. If c ≤ −1,
the limit does not exist. We will not prove these now.)
Case 1: Suppose we make it easy and assume that 0 < c < 1. From HW1.6.6 we see that 0 < c^n < 1 for all n ∈ N (an induction proof). If a_n = c^n, then by the fact that c < 1 and Proposition 1.3.7–(iii) we have a_{n+1} = c^{n+1} = c^n·c < c^n·1 = a_n. Thus the sequence {a_n = c^n} is monotonically decreasing. Also since a_n = c^n > 0, the sequence is bounded below. Thus by Theorem 3.5.2–(b) we know that lim_{n→∞} c^n exists and equals L = glb(S) where S = {c^n : n ∈ N}.
Notice that since 0 < c^n for all n ∈ N, by HW1.5.1–(a) L ≥ 0. To show that L = 0 we suppose false, i.e. suppose that L > 0. Since L is a lower bound of S, L ≤ c^m for all m ∈ N. Specifically, for n ∈ N, L ≤ c^{n+1} also. Then c^n = c^{n+1}/c ≥ L/c, so L/c is a lower bound of S. But since c < 1, L/c > L. This contradicts the fact that L = glb(S). Therefore L = 0 and lim_{n→∞} c^n = 0.
Case 2 & 3: If c = 0, then c^n = 0 for all n so the result follows from HW3.2.1–(b). If −1 < c < 0, then we can write c^n = (−|c|)^n = (−1)^n |c|^n and the result follows from Case 1 and Corollary 3.4.3 (−1 < c < 0 implies that 0 < |c| < 1, so Case 1 implies that |c|^n → 0).
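A quick numerical illustration of Example 3.5.1 (a sketch in plain Python; the value c = 0.9 and the cutoffs are our own choices): for 0 < c < 1 the powers c^n decrease, stay positive, and become arbitrarily small.

```python
# Illustrate Example 3.5.1 with c = 0.9: the sequence c^n is
# monotonically decreasing, bounded below by 0, and heads to 0.
c = 0.9
seq = [c**n for n in range(1, 501)]

assert all(later < earlier for earlier, later in zip(seq, seq[1:]))
assert all(term > 0 for term in seq)
assert seq[-1] < 1e-20   # c^500 is already tiny
```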
Note that this example includes the limit proved in Example 3.2.4. Hopefully you realize that the limit considered above could also be proved using the same approach as we used in Example 3.2.4.
We next use Example 3.5.1, Proposition 3.4.2 and Corollary 3.4.3 to prove
the convergence of another important limit.
Example 3.5.2 Prove that lim_{n→∞} a^n/n! = 0 for any a ∈ R.
Solution: Before we proceed, recognize that this is a strong result. No matter how large a is (and if a is large, a^n will get really large), eventually n! gets to be big enough to dominate the a^n term. (If you are interested, set a = 100 and compute a^n/n! for n = 150, 151, 152. You'll see they are getting smaller but they have a long way to go. If you look at this proof carefully, you will see exactly how and why this happens.)
To make the solution a bit easier we consider Case 1: a > 0. We begin by choosing M ∈ N such that M > a (we can do this by Corollary 1.5.4). Then for n > M, we see that

    a^n/n! = a^n/(M!·(M+1)···n) ≤ a^n/(M!·M^{n−M})   (there were n − M factors, each at least M)
           = (M^M/M!)·(a/M)^n.   (3.5.1)

Since M is fixed, 0 < a^n/n! ≤ (M^M/M!)·(a/M)^n and (a/M)^n → 0 (because a/M < 1), we apply Proposition 3.4.2 to see that the sequence {a^n/n!} converges to 0.
As in the last example Cases 2 & 3 are easy. When a = 0 we have the trivial zero sequence. When a < 0, we see that a^n/n! = (−1)^n·|a|^n/n! so the result follows from Case 1 and Corollary 3.4.3.
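The remark about a = 100 can be checked with exact integer arithmetic (a sketch; the helper name `term` is our own): the terms a^n/n! grow while n is small, but once n exceeds a each new factor a/(n+1) is less than 1 and the terms shrink.

```python
from fractions import Fraction
from math import factorial

# Terms of Example 3.5.2 computed exactly: term(a, n) = a^n / n!.
def term(a, n):
    return Fraction(a**n, factorial(n))

# With a = 100 the terms still grow early on ...
assert term(100, 50) > term(100, 5)
# ... but once n > 100 each factor a/(n+1) < 1 and they decrease:
t150, t151, t152 = term(100, 150), term(100, 151), term(100, 152)
assert t150 > t151 > t152
# They are getting smaller but "have a long way to go":
assert t152 > 1
```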
You might note that the limit proved in Example 3.5.2 can be proved using the Monotone Convergence Theorem directly. It is an interesting application of the Monotone Convergence Theorem in that the sequence is not monotonically decreasing—set a = 100 and compute a_1, a_2, a_3, a_150, a_151 and a_152. To apply the Monotone Convergence Theorem you apply it to the sequence {a^n/n!}_{n=M}^{∞} (where M is as in the last example). Again we begin with Case 1 where a > 0.
Since n + 1 > M and M > a, we see that

    a^{n+1}/(n+1)! = (a^n/n!)·(a/(n+1)) < (a^n/n!)·(a/M) < a^n/n!,
so the sequence is monotonically decreasing. The sequence is bounded below by zero, so the limit exists and equals L = glb(S) where S = {a^n/n! : n ∈ N, n ≥ M}. As in Example 3.5.1 we assume that L > 0 and note that a^{n+1}/(n+1)! ≥ L for any n and

    a^n/n! = (a^{n+1}/(n+1)!)/(a/(n+1)) ≥ L/(a/(n+1)) > L/(a/M) = ML/a.

Since this is true for any such n, ML/a is also a lower bound and ML/a > L (because M > a), so L cannot be the greatest lower bound. Therefore L = 0.
We see that the tail end of the given sequence converges, hence the sequence converges—recall the discussion of tail ends of sequences at the end of Section 3.2.
HW 3.5.1 (True or False and why) (a) The sequence {sin(1/n)} is monotone.
(b) The sequence {n + (−1)n /n} is monotone.
(c) The sequence {n/2n } is monotone.
(d) The sequence {(n + 1)/(n + 2)} is monotonically decreasing.
(e) If {an } and {bn } are Cauchy sequences, then {an +bn } is a Cauchy sequence.
(f) If {an } and {bn } are Cauchy sequences, then {an bn } is a Cauchy sequence.
(g) If the sequence {an } has a convergent subsequence, then {an } converges.
HW 3.5.2 Suppose S ⊂ R is bounded above and not empty, and set s = lub(S). Prove that there exists a monotonically increasing sequence {a_n} ⊂ S such that s = lim_{n→∞} a_n.
that N works. We suppose that we are given an M > 0. We want an N so that n > N implies
that n2 + 1 > M . As we did in the case of finite limits, we solve this inequality for n, i.e.
    n^2 + 1 > M is the same as n^2 > M − 1 is the same as n > √(M − 1).

Therefore we want to define N = √(M − 1) (Step 1: Define N). Then if n > N = √(M − 1), then n^2 > M − 1 and n^2 + 1 > M (Step 2: N works).
Before we say that we are done we should note that what we have done above is not quite correct. The definition must hold for any M > 0. But if 0 < M < 1, then M − 1 < 0 so we cannot take the square root of M − 1—but using an M between 0 and 1 to measure whether a sequence is going to infinity is not the smartest thing to do anyway. However, we must satisfy the definition (this technicality is analogous to large ε's when we are considering finite limits). The approach is to take two cases, 0 < M < 1 and M ≥ 1.
Case 1: (0 < M < 1) Choose N = 1. Then n > N = 1 implies that n^2 + 1 > M (this is assuming that the sequence starts at either n = 0 or n = 1).
Case 2: (M ≥ 1) Proceed as we did originally—now √(M − 1) makes sense.
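The two-case choice of N can be exercised numerically (a sketch; the function name `choose_N` is our own):

```python
from math import sqrt, floor

# Choice of N for proving n^2 + 1 -> infinity, following the two cases.
def choose_N(M):
    if M < 1:            # Case 1: any n >= 1 already works
        return 1.0
    return sqrt(M - 1)   # Case 2: N = sqrt(M - 1)

for M in (0.5, 1.0, 10.0, 1e6):
    n = floor(choose_N(M)) + 1   # smallest integer n > N
    assert n**2 + 1 > M          # Step 2: N works
```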
We include one more infinite limit example because we hinted at the result in the last section—but we warn you that, as was the case with Example 3.2.4, we will again cheat in that we will use the logarithm and exponential functions.
Example 3.6.2 Prove that if c > 1 then lim_{n→∞} c^n = ∞.
Proof: As before we assume that we are given M > 0. We want N so that n > N implies that c^n > M. We solve the last inequality for n by taking the logarithm of both sides to get ln c^n = n ln c > ln M, or n > ln M/ln c. We choose N = ln M/ln c (Step 1: Define N). Then n > N = ln M/ln c implies that n ln c > ln M or ln c^n > ln M. Taking the exponential of both sides (the exponential function is also increasing) gives c^n > M (Step 2: N works). Therefore c^n → ∞ as n → ∞.
We should note that some of the reasons that make the above steps correct include the following facts. The logarithm and exponential functions are increasing, so the inequalities stay in the same direction when these functions are applied. We were given that c > 1 so that ln c > 0, so the inequalities stay in the same direction when we divide by or multiply by ln c. And if 0 < M < 1, ln M < 0, but it's permissible to have a negative N because if we assume that the sequence starts at either n = 0 or n = 1, then for all n ≥ 0 > N = ln M/ln c we have c^n > M.
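A numerical check of Example 3.6.2's choice N = ln M/ln c (a sketch; the names are our own). Note that N can be negative when 0 < M < 1, which, as observed above, is harmless:

```python
from math import log, floor

# N = ln M / ln c from Example 3.6.2; any integer n > N gives c^n > M.
def choose_N(c, M):
    return log(M) / log(c)   # negative when 0 < M < 1; that is fine

for c, M in [(2.0, 10.0), (1.01, 1e6), (3.0, 0.5)]:
    n = floor(choose_N(c, M)) + 1   # smallest integer n > N
    assert c**n > M                 # Step 2: N works
```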
We should include the last few cases here. If c = 1, the sequence is the trivial sequence of all ones so c^n → 1. If c = −1, the sequence is the sequence that we considered in Example 3.2.6 so lim_{n→∞} c^n does not exist. If c < −1, the sequence is clearly unbounded, so by Theorem 3.5.2–(c) the sequence does not converge to any finite value—Theorem 3.5.2 only included convergence in R. Using Example 3.6.2 we see that c^{2n} → ∞ and c^{2n+1} → −∞. Thus the sequence {c^n} cannot converge to either ±∞. Since there is nothing left, lim_{n→∞} c^n does not exist.
We want to emphasize the point that the limit theorems stated and proved
in Sections 3.3 and 3.4 do not apply to infinite limits—we always had the as-
sumption that the limits were L, L1 or L2 and they were in R. It should not
surprise you that there are limit theorems for infinite limits—and that they are
not as nice as the theorems for finite limits. We include some of the results
without proof. The proofs of these results are easy. You should know that there
are more results available.
Proposition 3.6.2 Suppose that {an } and {bn } are real sequences. We have
the following results.
HW 3.6.2 Prove that lim_{n→∞} n^2/(n + 1) = ∞.
Chapter 4
Limits of Functions
[Figure 4.1.1: the graph of f near x_0, showing L and L ± ε on the y-axis and x_0, x_0 − δ_2, and x_0 + δ_1 on the x-axis.]
the y-axis to define L. Thus that part of the plot gives us the function, f, the point at which we want the limit, x_0, and the limiting point, L. We are given an ε > 0 so we plot the points L ± ε. We then project these two points across to the curve and down to the x-axis. We denote these two points as x_0 − δ_2 and x_0 + δ_1. This notation is really defining the size of δ_1 and δ_2.
We note that whenever the curve is nonlinear, δ_1 ≠ δ_2. In this case δ_1 < δ_2. More importantly you should realize that for any x between x_0 − δ_2 and x_0 + δ_1, f(x) will be between L − ε and L + ε—you choose any such x, project the point vertically to the curve and then horizontally to the y-axis. We want to find a δ so that whenever 0 < |x_0 − x| < δ (or x_0 − δ < x < x_0 + δ, x ≠ x_0), then f(x) will satisfy |f(x) − L| < ε (or L − ε < f(x) < L + ε). If we choose δ = min{δ_1, δ_2} (Step 1: Define δ), the point x_0 + δ will be at x_0 + δ_1 (because we claimed that δ_1 < δ_2, so in this case δ = min{δ_1, δ_2} = δ_1) and x_0 − δ will be inside of x_0 − δ_2—between x_0 − δ_2 and x_0. Hence by Figure 4.1.1 it should be clear that whenever 0 < |x_0 − x| < δ, |f(x) − L| < ε (Step 2: δ works).
You should realize that anytime you have an acceptable candidate for the δ, i.e. one that works, you can always choose a smaller δ. For example, it is clear from the picture that everything between x_0 − δ_1 (remembering that δ_1 < δ_2) and x_0 + δ_1 will get mapped into the region (L − ε, L + ε). So it should be clear that if we chose δ = δ_1/13, then all points in the interval (x_0 − δ, x_0 + δ) = (x_0 − δ_1/13, x_0 + δ_1/13) would also get mapped into the region (L − ε, L + ε). And, of course, there is nothing special about 13 (except that it is a very nice integer). In this case any δ such that 0 < δ < δ_1 will work.
The second note that we should make about this example is that we have not done anything to eliminate the point x_0 from our deliberations, i.e. we have not done anything to allow for the "0 <" part of the requirement 0 < |x − x_0| < δ. The reason is that in this case the function is sufficiently nice that we don't have to. In this case it is clear that f(x_0) = L, so that when |x − x_0| is actually zero, i.e. when x = x_0, then |f(x) − L| = |f(x_0) − L| = 0 < ε. The point is that once we have the δ, we only need to satisfy: if x is such that 0 < |x − x_0| < δ, then |f(x) − L| < ε. If whenever x is such that |x − x_0| < δ we have |f(x) − L| < ε, the above statement will be satisfied (plus nice info at one extra point that we didn't need). This happens because f is a nice function. We will see that this is not always the case. Even in this case we could have not defined f at x_0, or defined f at x_0 to be anything that we wanted. We would get the same picture (except for right at x_0) and the same result.
It should not surprise you to hear that we can also rewrite Definition 4.1.1
in terms of neighborhoods. We define a punctured neighborhood of a point
x0 to be the set (x0 − r, x0 + r) − {x0 } = (x0 − r, x0 ) ∪ (x0 , x0 + r) for some
r > 0, i.e. the same as a neighborhood of x0 except that we eliminate the point
x0 . We denote a punctured neighborhood of x0 by N̂r (x0 ). We can then restate
Definition 4.1.1 as follows: lim f (x) = L if for every neighborhood of L, N (L),
x→x0
there exists a punctured neighborhood of x0 , N̂δ (x0 ), such that x ∈ N̂δ (x0 ) ∩ D
implies that f (x) ∈ N (L). Again there is only a difference of notation between
this version of the definition and Definition 4.1.1.
Two limit theorems Before we proceed to apply the definitions to some spe-
cific examples, we are going to prove two propositions. The first is the analog
to Proposition 3.3.1. It would be best—in fact it is imperative—that when we
do have a value of L satisfying Definition 4.1.1, there isn’t some other L1 that
would also satisfy the definition. We have the following proposition.
Proposition 4.1.2 Suppose that f : D → R, D, R ⊂ R, x0 ∈ R and x0 is a
limit point of D. If lim f (x) exists, it is unique.
x→x0
Proof: Suppose false, i.e. suppose that there are two values L_1 and L_2 satisfying the definition with, say, L_1 > L_2. Let ε = (L_1 − L_2)/2. Since lim_{x→x_0} f(x) = L_1, there exists a δ_1 such that 0 < |x − x_0| < δ_1 implies that |f(x) − L_1| < ε. This inequality can be rewritten as

    −ε + L_1 < f(x) < ε + L_1, or (L_1 + L_2)/2 < f(x) < (3L_1 − L_2)/2.   (4.1.1)

Likewise, since lim_{x→x_0} f(x) = L_2, we know that for the ε given above there exists a δ_2 such that 0 < |x − x_0| < δ_2 implies that |f(x) − L_2| < ε. This inequality can be rewritten as

    −ε + L_2 < f(x) < ε + L_2, or (3L_2 − L_1)/2 < f(x) < (L_1 + L_2)/2.   (4.1.2)

Let δ = min{δ_1, δ_2} and consider x such that 0 < |x − x_0| < δ. Then both inequalities (4.1.1) and (4.1.2) will be satisfied. If we take the leftmost part of inequality (4.1.1) and the rightmost part of inequality (4.1.2) we get (L_1 + L_2)/2 < f(x) < (L_1 + L_2)/2, which is impossible. Hence L_1 = L_2.
Proof: (⇒) We begin by assuming the hypothesis that lim_{x→x_0} f(x) = L and suppose that we are given a sequence {a_n} with a_n ∈ D for all n, a_n ≠ x_0 for any n and a_n → x_0. We also suppose that we are given some ε > 0. We must find an N such that n > N implies that |f(a_n) − L| < ε.
Because lim_{x→x_0} f(x) = L, we get a δ such that

    0 < |x − x_0| < δ implies that |f(x) − L| < ε.   (4.1.3)

We apply the definition of the fact that a_n → x_0 with the "traditional ε replaced by δ" to get an N ∈ R such that

    n > N implies that |a_n − x_0| < δ   (4.1.4)

(Step 1: Define N).
Now suppose that n > N. We first apply statement (4.1.4) above to see that |a_n − x_0| < δ. By the fact that we assumed that the sequence {a_n} satisfied a_n ≠ x_0 for all n, we know that 0 < |a_n − x_0|, i.e. for n > N we have 0 < |a_n − x_0| < δ. We then apply statement (4.1.3) (with x replaced by a_n) to see that |f(a_n) − L| < ε (Step 2: N works). Therefore lim_{n→∞} f(a_n) = L.
(⇐) We now assume that if {a_n} is any sequence such that a_n ∈ D for all n, a_n ≠ x_0 for any n and a_n → x_0, then lim_{n→∞} f(a_n) = L. We assume that the proposition is false, i.e. that lim_{x→x_0} f(x) does not converge to L. This means that there is some ε > 0 such that for any δ there exists an x-value, x_δ, such that 0 < |x_δ − x_0| < δ, x_δ ∈ D and |f(x_δ) − L| ≥ ε, i.e. for any δ there is at least one bad value x_δ. Think carefully about this negation. The emphasis is that the above last statement is true for any δ.
Let δ = 1: Then there exists an x_δ value, call it a_1, such that 0 < |a_1 − x_0| < 1 and |f(a_1) − L| ≥ ε. a_1 will be in D and a_1 ≠ x_0.
Let δ = 1/2: Then there exists an x_δ value, call it a_2, such that 0 < |a_2 − x_0| < 1/2 and |f(a_2) − L| ≥ ε. (It happens for any δ.) a_2 ∈ D and a_2 ≠ x_0.
We could go next to 1/3, then 1/4, etc. except that it gets old. We'll jump to a general n.
Let δ = 1/n: Then there exists an x_δ value, call it a_n, such that 0 < |a_n − x_0| < 1/n and |f(a_n) − L| ≥ ε. a_n ∈ D and a_n ≠ x_0.
And of course this works for all n ∈ N. We have a sequence {a_n} such that a_n ≠ x_0 for all n (true because of the "0 <" part of the restriction). We also have |a_n − x_0| < 1/n for all n. This implies that a_n → x_0. (See HW3.2.1–(a).) Then by our hypothesis we know that f(a_n) → L, i.e. for any ε > 0 (including specifically the ε given to us above) there exists an N such that n > N implies |f(a_n) − L| < ε. But for this sequence we have that |f(a_n) − L| ≥ ε for all n ∈ N. This is a contradiction; therefore the assumption that "lim_{x→x_0} f(x) does not converge to L" is false and lim_{x→x_0} f(x) = L.
HW 4.1.4 Suppose that F(y) < 0 for all y in some punctured neighborhood of y_0. Suppose that lim_{y→y_0} F(y) exists. Prove that lim_{y→y_0} F(y) ≤ 0.
Note in the above example when we applied the definition, we started with the inequality that we need to be satisfied, |f(x) − 9| < ε. We then proceeded
to manipulate this inequality until we were able to isolate a term of the form |x − 3|. This led to an easy definition of δ. The algebra of inequalities will not always be as easy, but it will always be possible to isolate the term |x − x_0|. Observe this occurrence as we proceed.
Example 4.2.2 Prove that lim_{x→2} x^2 = 4.
Solution: Using Definition 4.1.1: We begin as we did in the last problem. We suppose that ε > 0 is given. We must find δ. Eventually we must satisfy the inequality |f(x) − L| = |x^2 − 4| = |(x − 2)(x + 2)| = |x − 2||x + 2| < ε. Notice that the |x − 2| term is included in the second to the last term of the inequality—as we promised it would be. The next step is tougher. We cannot divide by |x + 2|, get |x − 2| < ε/|x + 2| and define δ = ε/|x + 2|. This would be analogous to what we did in Example 4.2.1. The δ that we find can depend on ε (like the N's almost always depended on ε). If we are taking a limit as x approaches a general point x_0, the δ can depend on x_0—it's a fixed value. δ cannot depend on x.
We used the bold face above but we did want to make that point extremely clear—otherwise (and maybe in spite of it) someone would make that mistake. The last couple of sentences of the above paragraph are very important.
We return to the inequality that we want satisfied, |x − 2||x + 2| < ε. The technique we use is to bound the |x + 2| term. How we do this is to choose a temporary fixed δ_1, say δ_1 = 1, and assume that |x − 2| < δ_1 = 1. Then −1 < x − 2 < 1, 1 < x < 3 and 3 < x + 2 < 5. The last inequality implies that |x + 2| < 5. Could it be less for some x? Of course it could. However it could be very close to 5—and never bigger. Therefore if we assume that x satisfies |x − 2| < δ_1 = 1, then |x − 2||x + 2| < 5|x − 2|. If we then set 5|x − 2| < ε, we see that |x − 2| < ε/5, so we see that it's logical to define δ = ε/5. But this is wrong. If we review this paragraph carefully, we see that |x − 2||x + 2| < 5|x − 2| < 5δ = 5(ε/5) = ε only if x satisfies |x − 2| < δ_1 = 1 and |x − 2| < δ = ε/5.
Therefore the way to do it is to forget our earlier definition of δ and define δ to be δ = min{1, ε/5} (Step 1: Define δ). Then if x satisfies |x − 2| < δ, x will satisfy both |x − 2| < 1 and |x − 2| < ε/5. Then

    |x^2 − 4| = |x − 2||x + 2| <* 5|x − 2| <** 5(ε/5) = ε

(Step 2: Show that the defined δ works) where inequality "<*" is satisfied because |x − 2| < 1 (δ = min{1, ε/5} ≤ 1) and inequality "<**" is satisfied because |x − 2| < ε/5 (δ = min{1, ε/5} ≤ ε/5). Therefore lim_{x→2} x^2 = 4.
Using Proposition 4.1.3: We suppose that we are given a sequence {a_n} such that a_n ≠ 2 for any n and a_n → 2. By Proposition 3.3.2–(d) we see that

    lim_{n→∞} f(a_n) = lim_{n→∞} a_n^2 = (lim_{n→∞} a_n)(lim_{n→∞} a_n) = 2·2 = 4.

Therefore lim_{x→2} x^2 = 4.
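The choice δ = min{1, ε/5} can be spot-checked numerically at sample points (a sanity check on finitely many x, not a proof; the sample grid is our own choice):

```python
# Spot-check delta = min{1, eps/5} for lim_{x->2} x^2 = 4.
def delta(eps):
    return min(1.0, eps / 5.0)

for eps in (1.0, 0.1, 0.001):
    d = delta(eps)
    # sample points x with 0 < |x - 2| < delta
    samples = [2.0 + t * d for t in (-0.999, -0.5, 0.25, 0.999)]
    assert all(abs(x**2 - 4.0) < eps for x in samples)
```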
For the application of the definition, if you compare Examples 4.2.1 and 4.2.2,
you realize that the difference is that the function in Example 4.2.1 is linear and
that is why it is so easy to apply the definition. For most functions (at least all
nonlinear functions) you will have to apply some version of the method (trick?)
used in Example 4.2.2. Of course, application of Proposition 4.1.3 lets us skip
these difficulties. Recall that in the proof of Proposition 3.3.2–(d), we used
part (c) of the proposition—the result that guaranteed the boundedness of a
convergent sequence. Thus the application of Proposition 4.1.3 to find the limit
in Example 4.2.2 also uses a boundedness result—albeit, very indirectly.
We might note that in the application of the definition we used δ_1 = 1 (we call it δ_1 because it's sort of the first approximation of our δ) because 1 is a really nice number. If we had used δ_1 = 1/2, we would have gotten 7/2 < x + 2 < 9/2. In this case we see that |x + 2| < 9/2, so |x − 2||x + 2| < (9/2)|x − 2| and we would define δ to be δ = min{1/2, ε/(9/2)}. If instead we had used δ_1 = 2, then we would find that |x + 2| < 6 and would eventually define δ = min{2, ε/6}.
Any of these choices would give you a correct result. As we see in the next
example it is sometimes important to be careful how we choose δ1 .
Example 4.2.3 Prove that lim_{x→−2} (x − 2)/(x + 3) = −4.
Solution: Using Definition 4.1.1: For this problem we proceed as we have before and assume that ε > 0 is given. We want a δ so that when 0 < |x − (−2)| = |x + 2| < δ, |(x − 2)/(x + 3) − (−4)| < ε. We see that

    |(x − 2)/(x + 3) − (−4)| = |5(x + 2)/(x + 3)| = 5|x + 2|/|x + 3|.   (4.2.1)
We note that the |x + 2| term is there—as we promised it would be. So as in Example 4.2.2 we must bound the rest. But in this case we must be more careful. If we chose δ_1 = 1 as we did before (and 1 is such a nice number), then 5/|x + 3| would be unbounded on the set of x such that |x + 2| < δ_1 = 1. (|x + 2| < 1 implies that −3 < x < −1, and 5/|x + 3| goes to infinity as x goes to −3.) Hence we must be a little bit more careful and choose δ_1 = 1/2. If x is such that |x + 2| < 1/2, then −5/2 < x < −3/2 and 1/2 < x + 3 < 3/2. Thus we see that if |x + 2| < 1/2, then |x + 3| > 1/2 (and it's only the bad luck of the numbers that the two 1/2's appear) and 5/|x + 3| < 5/(1/2) = 10. Thus we return to equation (4.2.1) and see that if |x + 2| < 1/2, then
    |(x − 2)/(x + 3) − (−4)| = 5|x + 2|/|x + 3| < 10|x + 2|.   (4.2.2)

Thus define δ = min{ε/10, 1/2} (Step 1: Define δ). Then if 0 < |x + 2| < δ, 5/|x + 3| < 10 and 10|x + 2| < 10(ε/10) = ε. Therefore if 0 < |x + 2| < δ,

    |(x − 2)/(x + 3) − (−4)| = 5|x + 2|/|x + 3| < 10|x + 2| < ε

(Step 2: δ works), and lim_{x→−2} (x − 2)/(x + 3) = −4.
Using Proposition 4.1.3: We suppose that we are given a sequence {a_n} such that a_n ∈ D for all n (which in this case means that a_n ≠ −3 for any n), a_n ≠ −2 for any n and a_n → −2. By Proposition 3.4.1–(b), Proposition 3.3.2–(a) and HW3.2.1–(b) we see that

    lim_{n→∞} f(a_n) = lim_{n→∞} (a_n − 2)/(a_n + 3) = −4/1 = −4.
Note that again in this problem the "0 <" part of the restriction on x is not important. The function is well behaved at x = −2 (and equals −4). However, because in this problem the function f blows up near x = −3, we must be careful to restrict the δ in the application of the definition and the sequence {a_n} in the application of Proposition 4.1.3.
The next example that we consider is an important problem. The limit con-
sidered is an example of a limit used to compute a derivative—a very important
use of limits in calculus.
Example 4.2.4 Prove that lim_{x→4} (x^3 − 64)/(x − 4) = 48.
Solution: Using Definition 4.1.1: For convenience define f(x) = (x^3 − 64)/(x − 4). Note that f(4) is not defined. When you try to evaluate f at x = 4, you get zero over zero. This does not
mean that we will not be able to evaluate the limit given above—and hopefully you realize
this if you remember your limit work related to derivatives.
We proceed as usual, assume that we are given an ε > 0 and want to find a δ so that 0 < |x − 4| < δ will imply that |f(x) − 48| = |(x^3 − 64)/(x − 4) − 48| < ε. We start with the expression f(x) − 48 and note that

    (x^3 − 64)/(x − 4) − 48 = (x − 4)(x^2 + 4x + 16)/(x − 4) − 48   (4.2.3)
(if you don’t believe the factoring, multiply the expression on the right to see that you get
x3 − 64 back). We have an x − 4 factor in both the numerator and the denominator of the first
term on the right. We want to divide them out. In general you have to be careful in doing
this but in this case it is completely permissible. The requirement on x will be 0 < |x − 4| < δ.
The meaning of the part of the inequality 0 < |x − 4| is that x − 4 6= 0. And if x − 4 6= 0, we
can divide them out. Hence returning to equation (4.2.3) we get

    (x^3 − 64)/(x − 4) − 48 = (x − 4)(x^2 + 4x + 16)/(x − 4) − 48 = (x^2 + 4x + 16) − 48 = x^2 + 4x − 32 = (x − 4)(x + 8).
We promised you that there would always be an x − 4 factor in the simplified version of f(x) − L. Thus

    |(x^3 − 64)/(x − 4) − 48| = |(x − 4)(x + 8)| = |x − 4||x + 8|.   (4.2.4)
The |x − 4| term will be made less than δ as it has been in Examples 4.2.1–4.2.3. The |x + 8| term must be bounded as we bounded |x + 2| in Example 4.2.2 and 5/|x + 3| in Example 4.2.3. Hence we require that |x − 4| satisfy |x − 4| < δ_1 = 1 and notice that this gives us the following: |x − 4| < 1 ⇒ −1 < x − 4 < 1 ⇒ 3 < x < 5 ⇒ 11 < x + 8 < 13. Therefore if |x − 4| < δ_1 = 1, then |x + 8| < 13. More importantly, it is also clear that if 0 < |x − 4| < δ_1 = 1, then |x + 8| < 13—if it's satisfied on the larger set, it's satisfied on the smaller set. Returning to equation (4.2.4) we see that if we require that x satisfy |x − 4| < δ_1 = 1, then

    |(x^3 − 64)/(x − 4) − 48| = |x − 4||x + 8| < 13|x − 4|.   (4.2.5)
And finally, if we define δ = min{1, ε/13} (Step 1: Define δ) and require that 0 < |x − 4| < δ (so that 0 < |x − 4| < 1 and 0 < |x − 4| < ε/13), we continue with equation (4.2.5) to get

    |(x^3 − 64)/(x − 4) − 48| = |x − 4||x + 8| < 13|x − 4| < 13(ε/13) = ε

(Step 2: δ works). Therefore (x^3 − 64)/(x − 4) → 48 as x → 4.
Using Proposition 4.1.3: We suppose that we are given a sequence {a_n} such that a_n ∈ D for all n (i.e. a_n ≠ 4 for any n) and a_n → 4. Then

    lim_{n→∞} (a_n^3 − 64)/(a_n − 4) = lim_{n→∞} (a_n − 4)(a_n^2 + 4a_n + 16)/(a_n − 4)   (4.2.6)
        = lim_{n→∞} (a_n^2 + 4a_n + 16) = 4^2 + 4·4 + 16 = 48.   (4.2.7)

We note that it is permissible to divide out the a_n − 4 term between lines (4.2.6) and (4.2.7) because we have assumed that a_n ≠ 4 for any n. Therefore (x^3 − 64)/(x − 4) → 48 as x → 4.
We should emphasize that it would be wrong to apply Proposition 3.4.1–(b) after step
(4.2.6) and then try some sort of division.
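The choice δ = min{1, ε/13} from the definition-based solution can also be spot-checked at sample points (a sketch, not a proof; the grid is our own choice). Note that x = 4 itself must be excluded, since f is undefined there:

```python
# Spot-check delta = min{1, eps/13} for lim_{x->4} (x^3 - 64)/(x - 4) = 48.
# x = 4 itself is excluded: f is undefined there (0/0).
def f(x):
    return (x**3 - 64.0) / (x - 4.0)

def delta(eps):
    return min(1.0, eps / 13.0)

for eps in (1.0, 0.01):
    d = delta(eps)
    # sample points x with 0 < |x - 4| < delta
    samples = [4.0 + t * d for t in (-0.999, -0.5, 0.5, 0.999)]
    assert all(abs(f(x) - 48.0) < eps for x in samples)
```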
You might have noticed that when we wrote the limits in the preceding examples, we did not usually explicitly define the function and the domain. We wrote the expression for the function as a part of the limit statement (as you
did in your basic calculus course) and assumed that you knew the domain. This
is common. Know the domain? We really assume that the domain is chosen as
the largest set on which the expression can be defined—in the case of Examples
4.2.1 and 4.2.2, D = R, in Example 4.2.3 D = R − {−3}, in Example 4.2.4
D = R − {4}, etc. Of course in these cases the requirement that x0 is a limit
point of D was always satisfied.
In the solution of Example 4.2.4 using the definition, we were able to factor an x − 4 term out of x^3 − 64 in equation (4.2.3). This was not luck. If the limit is to exist, it will always be there. Remember when we tried to evaluate f(4) we got 0/0. The zero in the numerator implies that there's an x − 4 factor in there—somewhere; sometimes it's hard to see. For all of the problems that result from applying the definition of a derivative, you will always have the x − 4 term in the numerator that will divide out with the x − 4 term in the denominator.
But remember it was the ”0 <” part of the restriction on x that allowed
us to divide out the x − 4 terms. This was essential. Likewise, this is
an example that if you choose to prove the limit using Proposition 4.1.3, the
hypothesis ”an 6= x0 for any n” becomes important. In the application of
Proposition 4.1.3, it is this assumption on the sequence {an } that allows us to
divide out the an − 4 terms.
There are other problems that require the "0 <" restriction on x (or the a_n ≠ x_0 assumption) and a division, other than the limits involved in computing derivatives. You could make up a function that when factored looked like (x − 2)^2(x^2 + x + 1)/(x − 2)^2 (you can multiply it out if you'd like to make it look like a real example) and try to calculate the limit of that function as x → 2. The limit would be 7. You would use the "0 <" restriction to divide out the (x − 2)^2 terms and then would have

    (x − 2)^2(x^2 + x + 1)/(x − 2)^2 − 7 = (x^2 + x + 1) − 7 = x^2 + x − 6 = (x + 3)(x − 2).

The last expression contains the x − 2 term (we promised), and if we were applying the definition, we would proceed by bounding |x + 3| the way that we have done before.
If the function is of the form f(x) = (x + 2)^2 h(x)/(x + 2) where h(−2) ≠ 0 (and we want the limit of f as x → −2), then only one x + 2 term will divide out (you only have one in the denominator—what else could you do) and the limit would be 0 because of the x + 2 term that is left in the numerator. This is just a really nice limit.
If the function is of the form f(x) = (x − 3)^2 h(x)/(x − 3)^3 where h(3) ≠ 0—emphasizing the fact that the degree of the term in the numerator is smaller than that in the denominator—then you could divide out only two of the x − 3 terms, and the x − 3 term that was left in the denominator would cause the limit to not exist.
Let us emphasize again: all of these slight variations of the problem given in Example 4.2.4 work because of the "0 <" part of the restriction on x in Definition 4.1.1. We see that it does not come into play in problems involving easy limits but is important on the class of limits associated with derivatives—and similar problems.
Nonconvergence of Limits: Of course, if we have a definition of convergence of limits and some examples of application of the definition, we must have some examples where the function doesn't converge to a limit. As in the case of nonconvergence of sequential limits, proving that a limit does not exist using the definition is often difficult. You must show that for some ε > 0 there does not exist any δ such that 0 < |x − x_0| < δ implies that |f(x) − L| < ε (using the notation as given in Definition 4.1.1), i.e. for some ε > 0 and any δ, there exists an x_δ such that 0 < |x_δ − x_0| < δ and |f(x_δ) − L| ≥ ε.
In general, it is usually much easier and more natural to use Proposition
4.1.3 to show that a limit does not exist. Again we do want you to see that it
is possible to use the definition in these arguments and how to use it.
The application of Proposition 4.1.3 is a bit different from before. Consider the ⇒ direction of the proposition: if lim_{x→x_0} f(x) = L, then for any sequence {a_n} such that a_n ∈ D for all n, a_n ≠ x_0 for any n and lim_{n→∞} a_n = x_0, we have lim_{n→∞} f(a_n) = L. Of course, the contrapositive of this statement would read something like the following: if it is not the case that for any sequence {a_n} such that a_n ∈ D for all n, a_n ≠ x_0 for any n and lim_{n→∞} a_n = x_0 we have lim_{n→∞} f(a_n) = L, then lim_{x→x_0} f(x) ≠ L. How does one satisfy the statement "it is not the case that for any sequence {a_n} such that a_n ∈ D for all n, a_n ≠ x_0 for any n and lim_{n→∞} a_n = x_0, then lim_{n→∞} f(a_n) = L"? It is easy. One way is to find a sequence
that satisfies the properties an ∈ D for all n, an 6= x0 for any n and an → x0 , but
the limit lim f (an ) does not exist. That implies that not only is the limit not
n→∞
some particular L but that the limit does not exist. Another way is to find two
sequences {an } and {bn } such that an , bn ∈ D for all n, an 6= x0 and bn 6= x0 for
any n, an → x0 and bn → x0 as n → ∞, and lim f (an ) 6= lim f (bn ). Not only
n→∞ n→∞
will this imply that the original limit is not L but will also imply that it can’t
be anything else either (because we have at least two nonequal candidates), i.e.
the limit does not exist.
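This two-sequence test is easy to illustrate numerically. The sketch below is our own illustration (not part of the text) using the hypothetical function f(x) = x/|x|, defined for x ≠ 0: the sequences 1/n and −1/n both tend to 0, but f is constantly 1 along one and −1 along the other.

```python
# Illustration only, not a proof: f(x) = x/|x| for x != 0 has no limit at 0
# because two sequences tending to 0 give different limits of f.
def f(x):
    return x / abs(x)

f_along_a = [f(1.0 / n) for n in range(1, 6)]   # a_n = 1/n  -> 0
f_along_b = [f(-1.0 / n) for n in range(1, 6)]  # b_n = -1/n -> 0
print(f_along_a)  # [1.0, 1.0, 1.0, 1.0, 1.0]
print(f_along_b)  # [-1.0, -1.0, -1.0, -1.0, -1.0]
```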
We will include three examples of nonconvergence. The first will not satisfy
Definition 4.1.1 because it wants to have an infinite limit and Definition 4.1.1
requires that L ∈ R (and as in the case with sequential limits, we will later define what it means to have an infinite limit). The second will be analogous to
the sequential limit example given in Example 3.2.6—there will be two logical
limits, so neither (and nothing else) will satisfy the definition. And the last—
probably the most interesting—will not have a limit just because it is a really
nasty function. As we mentioned earlier we will show nonconvergence by the
definition because we want you to see how it is done. Because we feel that the
natural approach is to apply Proposition 4.1.3, we will give that approach first.
Example 4.2.5 Prove that lim_{x→0} 1/x² does not exist.
Solution: (Using Proposition 4.1.3) Consider the sequence {1/n}. This sequence satisfies the properties 1/n ≠ 0 for any n and 1/n → 0. Thus if the limit were to exist, the sequence {1/(1/n)²} = {n²} would have to converge to some L—the resulting limit. Clearly this is not the case in that 1/(1/n)² = n² → ∞ as n → ∞. Therefore lim_{x→0} 1/x² does not exist.

(Using Definition 4.1.1) If you evaluate the function 1/x² near zero, we would hope that you can figure out what is happening. Consider ε = 1 (remember, we only have to show that it's bad for one particular ε) and suppose that the limit is some L ∈ R, where for the moment we assume that L > −1. We must show that for any δ we do not satisfy "0 < |x| < δ implies |1/x² − L| < ε = 1, or L − 1 < 1/x² < L + 1", i.e. we must show that for any δ there exists an xδ such that 0 < |xδ| < δ and |1/xδ² − L| ≥ ε = 1.

Choose xδ = min{δ/2, 1/(2√(L + 1))}. Then xδ satisfies 0 < |xδ| < δ and 0 < xδ < 1/√(L + 1). We note that 0 < xδ < 1/√(L + 1) implies that 1/xδ² > L + 1, or 1/xδ² − L ≥ 1. Thus lim_{x→0} 1/x² cannot equal L. How did we choose this xδ that worked so well? Of course we worked backwards—knowing that we could choose xδ small enough so that 1/xδ² would be greater than L + 1 (and δ/2 would guarantee that it is between 0 and δ).

If L ≤ −1 (and note that this implies that −L ≥ 1), then for any δ we choose xδ = δ/2 and note that 1/xδ² − L = 4/δ² − L ≥ 4/δ² + 1 > 1, so |1/xδ² − L| ≥ 1. Thus lim_{x→0} 1/x² cannot equal L.

And of course, if the limit cannot equal L for L > −1 and cannot equal L for L ≤ −1, then the limit cannot exist.
Note that even though this limit does not exist, it is handy here having the "0 <" part of the restriction on x in the definition of a limit so that 1/x² need not be defined at x = 0—otherwise we would have been done long ago.
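As a quick sanity check (not a substitute for the proof), one can evaluate 1/x² along the sequence 1/n and watch the values blow up:

```python
# Numeric illustration of Example 4.2.5: along x_n = 1/n the values of
# f(x) = 1/x^2 are n^2, which grow without bound as n -> infinity.
def f(x):
    return 1.0 / x**2

values = [round(f(1.0 / n)) for n in range(1, 6)]
print(values)  # [1, 4, 9, 16, 25]
```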
Example 4.2.6 For f defined as f(x) = 1 if x ≥ 0 and f(x) = 0 if x < 0, prove that lim_{x→0} f(x) does not exist.
Solution: (Using Proposition 4.1.3) The approach we use for this example is to use two sequences—remember that Proposition 4.1.3 must hold for all sequences {an} such that an ≠ x0 and an → x0. We first consider the sequence {1/n}. We know that 1/n ≠ 0 for any n and 1/n → 0. It is easy to see that f(1/n) → 1 (since all of the 1/n's are positive, f(1/n) = 1 for all n—so this is just an application of HW3.2.1-(b)). Then we consider the sequence {−1/n}. Again we notice that the sequence satisfies the hypothesis of Proposition 4.1.3, but this time f(−1/n) → 0. Therefore lim_{x→0} f(x) does not exist.

(Using Definition 4.1.1) We approach this proof similar to the way that we proved that lim_{n→∞} (−1)^n did not exist in Example 3.2.6. Case 1: We first guess that maybe the limit is 1. We choose ε = 1/2 (remember that we only have to show that for some ε > 0 there is no appropriate δ). To show that the limit is not 1, we must show that for any δ there is an xδ such that 0 < |xδ| < δ, or −δ < xδ < δ, xδ ≠ 0, and |f(xδ) − 1| ≥ ε = 1/2. Now consider any δ and choose xδ = −δ/2. Then xδ satisfies |xδ| < δ and xδ ≠ 0. Since f(xδ) = 0 (xδ < 0), |f(xδ) − 1| = 1 ≥ ε = 1/2. Thus we know that lim_{x→0} f(x) ≠ 1.

Case 2: We next guess that the limit might be 0. We again choose ε = 1/2 and consider any δ. This time choose xδ = δ/2. Then since |f(xδ) − 0| = |1 − 0| = 1 ≥ ε = 1/2, lim_{x→0} f(x) ≠ 0.
Case 3: We next consider the most difficult case, and assume that lim_{x→0} f(x) = L where L is any real number other than 1 or 0 (the two cases that we have already considered). Then choose ε = min{|L|/2, |L − 1|/2}. For any δ choose xδ = δ/2. Then |f(xδ) − L| = |1 − L| > |L − 1|/2 ≥ ε. Thus lim_{x→0} f(x) ≠ L for L ≠ 1 and L ≠ 0. (We could just as well have chosen xδ = −δ/2. Draw a picture to show why our choice of ε works in this case.)

Since we have exhausted all possible limits in R, lim_{x→0} f(x) does not exist.
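The two-sequence version of the argument in Example 4.2.6 can be checked mechanically; this sketch mirrors the solution above:

```python
# Along a_n = 1/n > 0 the step function is constantly 1; along b_n = -1/n < 0
# it is constantly 0. Two different sequential limits rule out any limit at 0.
def f(x):
    return 1 if x >= 0 else 0

right_values = {f(1.0 / n) for n in range(1, 50)}  # values along 1/n
left_values = {f(-1.0 / n) for n in range(1, 50)}  # values along -1/n
print(right_values, left_values)  # {1} {0}
```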
In the next example we will use the sine function—and we have never defined
it (but we used it earlier). We assume that your trigonometry course gave a
sufficiently rigorous definition of these functions. We proceed with our last
case of non-existence.
Example 4.2.7 Define the function f : R → R by f(x) = sin(1/x) if x ≠ 0 and f(x) = 0 if x = 0. Prove that lim_{x→0} f(x) does not exist.
It is especially instructive for this example to get a plot of the function. We see on the
plot below that like the sine function −1 ≤ f (x) ≤ 1. But as x nears zero, 1/x goes through
odd multiples of π/2 (giving values ±1), multiples of π (giving values of 0) and everything
else in between—many times.
[Figure: plot of f(x) = sin(1/x) on −3 ≤ x ≤ 3; the graph oscillates between −1 and 1 ever more rapidly as x approaches 0.]
Case 2: L = 0. We next suppose that the limit is 0 (it is the only value left). We choose ε = 1/2. Then for any δ, we can find an xδ such that 0 < |xδ| < δ and xδ = 2/[(2n0 + 1)π] for some n0 ∈ N (one over an odd multiple of π/2). For this value of xδ, |f(xδ) − 0| = |±1| = 1 > 1/2 = ε. Thus again it is impossible to satisfy Definition 4.1.1 for any δ (so lim_{x→0} f(x) ≠ 0).

Therefore, lim_{x→0} f(x) does not exist.
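The sequential version of this argument is also easy to check numerically; here is a hedged sketch using the two families of points mentioned above (reciprocals of multiples of π and of odd multiples of π/2):

```python
import math

# Both sequences below tend to 0, yet sin(1/x) is (numerically) 0 along the
# first and 1 along the second, so sin(1/x) can have no limit at 0.
f = lambda x: math.sin(1.0 / x)

a = [1.0 / (n * math.pi) for n in range(1, 5)]            # 1/a_n = n*pi
b = [2.0 / ((4 * n + 1) * math.pi) for n in range(1, 5)]  # 1/b_n = pi/2 + 2n*pi
print(all(abs(f(x)) < 1e-9 for x in a))        # True
print(all(abs(f(x) - 1.0) < 1e-9 for x in b))  # True
```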
HW 4.2.2 Use the graphical approach to show that lim_{x→3} (2x + 3) = 9. Specifically find the δ1 and δ2 (of Figure 4.1.1), determine δ and show that it works. Explain why δ1 = δ2 in this example.
HW 4.2.3 (a) Prove that lim_{x→4} 7 = 7. Show this using the graphical approach and then prove it twice—first using Definition 4.1.1 and then using Proposition 4.1.3.
(b) Prove that for any x0, c ∈ R, lim_{x→x0} c = c.
HW 4.2.4 Define the function f : R → R by f(x) = x² + x + 1 if x ≠ 2 and f(x) = 12 if x = 2. Prove that lim_{x→2} f(x) = 7—prove it twice, first using Definition 4.1.1 and then using Proposition 4.1.3.
HW 4.2.5 Prove that lim_{x→3} x²/(x − 4) = −9—prove it twice, first using Definition 4.1.1 and then using Proposition 4.1.3.
limit theorems that allow us to compute a large number of limits. You already
know most of these limit theorems from your elementary calculus class. Of
course we will now include the proofs of these theorems. And it should not
be a surprise to you that the limit theorems will look very much like the limit
theorems that we proved for limits of sequences. As with the proofs of conver-
gence of the specific limits done in Section 4.2, parts (a), (b), (d) and (f) of
Proposition 4.3.1 given below can be proved by either using the definition or
Proposition 4.1.3. Again we feel that you should see both approaches. For that
reason we will include both proofs for these parts. Since the proofs applying Definition 4.1.1 are very similar to the analogous proofs for limits of sequences, and the proofs applying Proposition 4.1.3 are pretty easy, we will give reasonably abbreviated versions of these proofs.
Proposition 4.3.1 Consider the functions f, g : D → R where D ⊂ R, suppose that c, x0 ∈ R and x0 is a limit point of D. Suppose lim_{x→x0} f(x) = L1 and lim_{x→x0} g(x) = L2. We then have the following results.
(a) lim_{x→x0} (f(x) + g(x)) = lim_{x→x0} f(x) + lim_{x→x0} g(x) = L1 + L2.
(b) lim_{x→x0} cf(x) = c lim_{x→x0} f(x) = cL1.
(c) There exist δ3, K ∈ R such that for x ∈ D and 0 < |x − x0| < δ3, |f(x)| < K.
(d) lim_{x→x0} f(x)g(x) = (lim_{x→x0} f(x))(lim_{x→x0} g(x)) = L1 L2.
(e) If L2 ≠ 0, then there exist δ4, M ∈ R such that if x ∈ D and 0 < |x − x0| < δ4, then |g(x)| > M.
(f) If L2 ≠ 0, then lim_{x→x0} f(x)/g(x) = (lim_{x→x0} f(x))/(lim_{x→x0} g(x)) = L1/L2.
Proof: So that we don't have to repeat it every time, throughout this proof let {an} be any sequence such that an ∈ D for all n, an ≠ x0 for any n, and an → x0.
(a) (Using Definition 4.1.1) We suppose that we are given an ε > 0. We apply the hypothesis lim_{x→x0} f(x) = L1 with ε1 = ε/2 to get a

Since this holds true for any such sequence {an}, by Proposition 4.1.3 we get lim_{x→x0} (f(x) + g(x)) = L1 + L2.
(b) (Using Definition 4.1.1) If c ≠ 0, we apply the hypothesis lim_{x→x0} f(x) = L1 with ε1 = ε/|c|. Then setting δ = δ1 will give the desired result.
If c = 0, the result is trivial since cf(x) = 0 for all x ∈ D—so it follows from HW4.2.3-(b).
(Using Proposition 4.1.3) Since lim_{n→∞} cf(an) = c lim_{n→∞} f(an) by Proposition 3.3.2–(b), the result follows.
(c) (We do not give a proof of this result based on Proposition 4.1.3—it is possible but it would not be very insightful.) Using the hypothesis lim_{x→x0} f(x) = L1 with ε1 = 1, we get a δ3 such that x ∈ D and 0 < |x − x0| < δ3 implies that |f(x) − L1| < ε1 = 1. Then by the backwards triangular inequality, Proposition 1.5.8–(vi), we see that for all x such that x ∈ D and 0 < |x − x0| < δ3,
Then using the fact that ε1 = ε/(2|L2|) and ε2 = ε/(2K) (Step 2: δ works), the result follows.
Note that generally K ≠ 0—or we can always choose it to be so. If L2 = 0, the result follows by choosing ε2 = ε/K and δ = min{δ2, δ3}, and noting that |f(x)g(x)| ≤ |f(x)||g(x)| < Kε2 = ε whenever x ∈ D and 0 < |x − x0| < δ—we use the hypothesis lim_{x→x0} f(x) = L1 only to get K.
(Using Proposition 4.1.3) The result follows from Proposition 3.3.2–(d).
(e) (Again we do not include a proof of this result based on Proposition 4.1.3.) Since L2 is assumed to be nonzero, we use the hypothesis lim_{x→x0} g(x) = L2 with ε2 = |L2|/2 and obtain a δ4 such that 0 < |x − x0| < δ4 implies that |g(x) − L2| < ε2 = |L2|/2. We have

|L2| − |g(x)| ≤ |L2 − g(x)| = |g(x) − L2| < ε2 = |L2|/2

by the backwards triangular inequality, Proposition 1.5.8–(vi). Thus when x ∈ D and 0 < |x − x0| < δ4, |g(x)| > |L2| − |L2|/2 = |L2|/2. If we set M = |L2|/2, we are done.
(f) (Using Definition 4.1.1) We suppose that we are given an ε > 0 and apply the hypotheses: lim_{x→x0} f(x) = L1 with respect to ε1 to get δ1 such that x ∈ D and 0 < |x − x0| < δ1 implies |f(x) − L1| < ε1; lim_{x→x0} g(x) = L2 with respect to ε2 to get δ2 such that x ∈ D and 0 < |x − x0| < δ2 implies |g(x) − L2| < ε2; and L2 ≠ 0 and part (e) of this proposition to get δ4 such that x ∈ D and 0 < |x − x0| < δ4 implies that |g(x)| > M. Then we set δ = min{δ1, δ2, δ4} (Step 1: Define δ), require x to satisfy x ∈ D and 0 < |x − x0| < δ, and note that

|f(x)/g(x) − L1/L2| = |f(x)L2 − L1 g(x)| / |L2 g(x)| = |(f(x) − L1)L2 + L1(L2 − g(x))| / |L2 g(x)|
  ≤ (|f(x) − L1||L2| + |L1||L2 − g(x)|) / (|g(x)||L2|) < (ε1 |L2| + |L1| ε2) / (M |L2|).

Thus we see that if we choose ε1 as ε1 = Mε/2 and ε2 as ε2 = Mε|L2|/(2|L1|), then |f(x)/g(x) − L1/L2| < ε (Step 2: δ works) and lim_{x→x0} f(x)/g(x) = L1/L2.
We have assumed that L2 ≠ 0 and we know by part (e) of this proposition that g(x) ≠ 0—so we can put them in the denominator. If L1 = 0, the result follows by choosing δ = min{δ1, δ4} and ε1 = Mε.
(Using Proposition 4.1.3) Since by Proposition 3.4.1–(b)

lim_{n→∞} f(an)/g(an) = (lim_{n→∞} f(an)) / (lim_{n→∞} g(an)) = L1/L2

for any such sequence {an}, the result follows from Proposition 4.1.3.
Notice that in this proof we don't use part (d) of this proposition. It should be clear that we use the hypothesis that L2 ≠ 0 via Proposition 3.4.1–(a), which is then used to prove part (b) of Proposition 3.4.1.
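The limit laws of Proposition 4.3.1 can be sanity-checked numerically. The sketch below is an illustration under our own choice of f, g and x0 (not part of the text): f(x) = x² and g(x) = x + 1 near x0 = 2, where L1 = 4 and L2 = 3.

```python
# Numeric sanity check of the sum, product and quotient laws at x0 = 2.
f = lambda x: x * x        # L1 = 4
g = lambda x: x + 1.0      # L2 = 3

x = 2.0 + 1e-8  # a point close to (but different from) x0
assert abs((f(x) + g(x)) - 7.0) < 1e-6       # (a): sum law
assert abs(f(x) * g(x) - 12.0) < 1e-6        # (d): product law
assert abs(f(x) / g(x) - 4.0 / 3.0) < 1e-6   # (f): quotient law, L2 != 0
print("limit laws consistent near x0 = 2")
```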
Parts (a), (b), (d) and (f) of the above proposition are basic tools used in
the calculation of limits. However, to use these tools—which are always used to
simplify a given limit to a set of easier limits—we need some easier limits. In
the next proposition we provide one of the easy limits that we need.
Proof: The proofs of these results are very easy. Result (a) follows by choosing δ = ε in Definition 4.1.1. Property (b) is an elementary application of mathematical induction using part (d) of Proposition 4.3.1. And then the result given in part (c) follows from parts (a) and (b) of this proposition (or by applying Proposition 4.3.1–(d) k − 1 times along with part (a) of this proposition).
We next include an inductive version of Proposition 4.3.1–(a) and use this result—along with the other parts of Proposition 4.3.1—to compute a large class of limits. Let p and q denote mth and nth degree polynomials, respectively, p(x) = a0 x^m + a1 x^(m−1) + · · · + am−1 x + am and q(x) = b0 x^n + b1 x^(n−1) + · · · + bn−1 x + bn.

lim_{x→x0} p(x)/q(x) = p(x0)/q(x0) = (a0 x0^m + a1 x0^(m−1) + · · · + am−1 x0 + am) / (b0 x0^n + b1 x0^(n−1) + · · · + bn−1 x0 + bn).
Proof: As was the case with Proposition 4.3.2 the proof of this proposition is
also easy. Part (a) can be proved by applying mathematical induction along with
part (a) of Proposition 4.3.1. The result given in part (b) then follows from part
(a) of this result, Proposition 4.3.1–(b) and Proposition 4.3.2–(c). And finally,
to prove part (c) we apply the quotient rule from Proposition 4.3.1–(f) along
with part (b) of this proposition.
We are now able to compute a large class of limits. We have intentionally skipped functions involving irrational exponents and functions of the form a^x (which are two other very basic "easy limits" that we use along with Proposition 4.3.1 to compute limits) because we will give a rigorous mathematical introduction to these functions in Section 7.7—so discussing their limits at this time would be cheating. If we returned to the examples considered in Section 4.2, we would now be able to compute the limits considered in Examples 4.2.1–4.2.3 very easily. To compute a limit such as that considered in Example 4.2.4, we
proceed much the way we did in our elementary course and compute as follows.

lim_{x→2} (x⁴ − 16)/(x − 2) = lim_{x→2} (x − 2)(x + 2)(x² + 4)/(x − 2)
  = lim_{x→2} (x + 2)(x² + 4)   (because we know that x − 2 ≠ 0)
  = 32   (by Proposition 4.3.3–(b)).
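The cancellation step can be checked numerically (an illustration only; the quotient is undefined at x = 2, so we evaluate only at nearby points):

```python
# Numeric check of lim_{x->2} (x^4 - 16)/(x - 2) = 32.
def quotient(x):
    return (x**4 - 16.0) / (x - 2.0)

for h in (1e-3, 1e-5, 1e-7):
    print(quotient(2.0 + h))  # values approach 32 as h shrinks
assert abs(quotient(2.0 + 1e-7) - 32.0) < 1e-4
```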
Before we leave this section we include one more limit result—the Sandwich Theorem for functions, analogous to the sequential Sandwich Theorem, Proposition 3.4.2.
Proposition 4.3.4 Consider the functions f, g, h : D → R where D ⊂ R, suppose that x0 ∈ R and x0 is a limit point of D. Suppose lim_{x→x0} f(x) = lim_{x→x0} g(x) = L and there exists a δ1 such that f(x) ≤ h(x) ≤ g(x) for x ∈ D and 0 < |x − x0| < δ1. Then lim_{x→x0} h(x) = L.
Proof: Suppose ε > 0 is given. Let δ2 and δ3 be such that 0 < |x − x0| < δ2 implies |f(x) − L| < ε, or L − ε < f(x) < L + ε, and 0 < |x − x0| < δ3 implies |g(x) − L| < ε, or L − ε < g(x) < L + ε. Let δ = min{δ1, δ2, δ3} (Step 1: Define δ) and suppose that x satisfies x ∈ D and 0 < |x − x0| < δ. Then

L − ε < f(x) ≤ h(x) ≤ g(x) < L + ε,

or L − ε < h(x) < L + ε. Thus |h(x) − L| < ε (Step 2: δ works), so lim_{x→x0} h(x) = L.
HW 4.3.2 Prove that lim_{x→2} (x − 2)/(√x − √2) = 2√2.
4.4 Limits at Infinity, Infinite Limits and One-sided Limits
You probably computed some limits of this sort in your basic calculus class. One of the common applications of limits at ±∞ is to determine asymptotes to curves. Methods for computing limits at infinity are similar to the methods for computing sequential limits. For example, the approach used to calculate a limit such as lim_{x→∞} (2x² + x − 3)/(3x² + 3x + 3) is to perform the following computation.
(Compare this result with the limit evaluated at the beginning of Section 3.4.) To perform the above computation we first multiplied the numerator and denominator by 1/x² and then used "limit of a quotient is the quotient of the limits", "limit of a sum is the sum of the limits", "limit of a constant times a function is the constant times the limit", "limit of a constant is that constant" and "the limit of 1/x^k as x goes to infinity is zero" (k ∈ N). Clearly, at present we do not have these results, and hopefully equally clearly, these results are
We are not going to prove these two propositions. Their proofs are just copies of the analogous sequential results. Likewise, we could also take some particular examples such as lim_{x→∞} (2x + 3)/(5x + 7) = 2/5 and use Definition 4.4.1 to prove this statement. We will not do that because such a proof would be almost identical to the proof given in Example 3.2.3 (when we did the analogous result for sequences). Also we should add that there are versions of Propositions 4.4.2 and 4.4.3 for the case when x approaches −∞. And finally note that we have not mentioned a result analogous to Example 3.5.1. It is possible to prove that for 0 < c < 1, lim_{x→∞} c^x = 0. However, since we will wait until Section 7.7 to define c^x, we do not consider this limit at this time.
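The "divide by the highest power of x" recipe for limits at infinity is easy to check numerically; a hedged sketch for the example above:

```python
# Numeric illustration: (2x^2 + x - 3)/(3x^2 + 3x + 3) -> 2/3 as x -> infinity,
# since dividing through by x^2 leaves (2 + 1/x - 3/x^2)/(3 + 3/x + 3/x^2).
def r(x):
    return (2.0 * x**2 + x - 3.0) / (3.0 * x**2 + 3.0 * x + 3.0)

for x in (10.0, 1e3, 1e6):
    print(r(x))  # values approach 0.666...
assert abs(r(1e6) - 2.0 / 3.0) < 1e-5
```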
Infinite Limits: In Example 4.2.5 we showed that lim_{x→0} (1/x²) does not exist. We mentioned as a part of the proof that the limit wanted to go to infinity—so since according to Definition 4.1.1 it is necessary that L ∈ R, the limit cannot exist. We want to be able to show that the limit above does not exist in a much nicer way than the nonexistence of the limits considered in Examples 4.2.6 and 4.2.7. Just as we included an alternative definition for sequences converging to infinity in Section 3.6, we want the same concept for limits of functions. Consider the following definition.
Definition 4.4.4 (a) Suppose that f : D → R, D ⊂ R, x0 ∈ R and x0 is a limit point of D. We say that lim_{x→x0} f(x) = ∞ if for every M > 0 there exists a δ such that x ∈ D and 0 < |x − x0| < δ implies that f(x) > M.
(b) lim_{x→x0} f(x) = −∞ if for every M < 0 there exists a δ such that x ∈ D and 0 < |x − x0| < δ implies that f(x) < M.
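Definition 4.4.4 can be exercised concretely. For f(x) = 1/x², given M > 0 the choice δ = 1/√M works, since 0 < |x| < δ forces x² < 1/M and hence 1/x² > M. A hedged sketch of that bookkeeping:

```python
import math

# For f(x) = 1/x^2, delta = 1/sqrt(M) witnesses Definition 4.4.4(a):
# every x with 0 < |x| < delta satisfies f(x) > M.
def delta_for(M):
    return 1.0 / math.sqrt(M)

for M in (10.0, 1e4, 1e8):
    x = delta_for(M) / 2.0  # a sample point with 0 < |x| < delta
    assert 1.0 / x**2 > M
print("1/x^2 exceeds every bound M near x = 0")
```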
We should note that the functions f + and f − are just copies of f to the right
and the left of x0 , respectively—hence using f + and f − we get the right and left
hand limits of f , respectively. The fact that Definition 4.1.1 is a very general
definition of a limit allows us to easily define the right and left hand limits. Also
notice that it is still a requirement that x0 is a limit point of D+ and D− —this
is to guarantee that we have enough points of D on either side of x0 to allow us
to apply Definition 4.1.1. Note that if x0 is a limit point of either D+ or D−, then x0 will also be a limit point of D. However, there are sets D for which x0 is a limit point of D but x0 is not a limit point of both D+ and D−—consider x0 = 1 and D = (0, 1).
Before we look at some results concerning right and left hand limits, we
include the more common definition in the following result.
Proposition 4.4.6 Suppose that f : D → R where D ⊂ R and x0, L ∈ R.
(a) Suppose that x0 is a limit point of D ∩ (x0, ∞). Then lim_{x→x0+} f(x) = L if and only if for every ε > 0 there exists a δ such that x ∈ D and 0 < x − x0 < δ implies that |f(x) − L| < ε.
(b) Suppose that x0 is a limit point of D ∩ (−∞, x0). Then lim_{x→x0−} f(x) = L if and only if for every ε > 0 there exists a δ such that x ∈ D and 0 < x0 − x < δ implies that |f(x) − L| < ε.
Proof: (a) (⇒) We begin by assuming that lim_{x→x0+} f(x) = L, i.e. lim_{x→x0} f+(x) = L. This means that for every ε > 0 there exists a δ such that x ∈ D+ and 0 < |x − x0| < δ implies that |f+(x) − L| < ε. We note that if x ∈ D+ = D ∩ (x0, ∞), then |x − x0| = x − x0, so 0 < |x − x0| < δ is equivalent to 0 < x − x0 < δ. Also, note that if x ∈ D+, then x ∈ D also. And finally, for x ∈ D+, f+(x) = f(x). Thus for the δ given we see that x ∈ D and 0 < |x − x0| = x − x0 < δ implies |f(x) − L| < ε.
(⇐) We will skip the proof of this direction because it is so similar to the proof given for part (b).
(b) (⇒) We will skip the proof of this direction because it is so similar to the proof given for part (a).
(⇐) We suppose that for every ε > 0 there exists a δ such that x ∈ D and 0 < x0 − x < δ implies |f(x) − L| < ε. Note that "x ∈ D and 0 < x0 − x < δ" is equivalent to "x ∈ D− and |x − x0| < δ". Also, if x ∈ D and 0 < x0 − x < δ, then f(x) = f−(x). For x ∈ D− and 0 < x0 − x = |x − x0| < δ we have |f−(x) − L| < ε. Thus lim_{x→x0} f−(x) = L, or lim_{x→x0−} f(x) = L.
The way that we apply Proposition 4.4.6 in a one-sided limit proof is very
similar to the way that we applied Definition 4.1.1—except that we now only
need to consider points on one side of x0 .
The third characterization of one-sided limits should be very familiar to us.
In Proposition 4.1.3 we gave a sequential characterization of limits—we can do
the same thing for one-sided limits. We state the following proposition.
Proposition 4.4.7 Suppose that f : D → R, D ⊂ R, x0, L ∈ R.
(a) Suppose that x0 is a limit point of D ∩ (x0, ∞). Then lim_{x→x0+} f(x) = L if and only if for any sequence {an} such that an ∈ D for all n, an > x0 for all n, and lim_{n→∞} an = x0, we have lim_{n→∞} f(an) = L.
(b) Suppose that x0 is a limit point of D ∩ (−∞, x0). Then lim_{x→x0−} f(x) = L if and only if for any sequence {an} such that an ∈ D for all n, an < x0 for all n, and lim_{n→∞} an = x0, we have lim_{n→∞} f(an) = L.
Proof: We will skip this proof because it is so much like the proof of Proposition 4.1.3. For the (⇒) direction of part (a), for a given ε > 0 the right hand limit hypothesis gives a δ; this δ, used as the "ε" in the assumption that an → x0 (and it works because we have assumed that an > x0), gives an N which is the N that we need to prove that f(an) → L as n → ∞. The (⇒) direction of part (b) is essentially the same.
To prove the (⇐) directions we again assume false and use this assumption to create a sequence {an} that contradicts our hypothesis—because the one-sided limit is used in our contradiction assumption, the sequence will be either greater than or less than x0.
We emphasize that when we want to prove things about one-sided limits, we will generally use Propositions 4.4.6 and 4.4.7. We used Definition 4.4.5 as our definition of one-sided limits to emphasize the "one-sidedness" of the functions when we consider one-sided limits.
In Example 4.2.2 we proved that lim_{x→2} x² = 4. It is very easy to show that lim_{x→2+} x² = 4 and lim_{x→2−} x² = 4. If we use Proposition 4.4.6 we can choose δ = min{1, ε/5} (the same δ used in Example 4.2.2). If we apply Proposition 4.4.7 to prove that lim_{x→2+} x² = 4, we use a sequence an → 2 with an > 2 and the fact that lim_{n→∞} an² = (lim_{n→∞} an)(lim_{n→∞} an) = 2 · 2 = 4. And of course the application of Proposition 4.4.7 to lim_{x→2−} x² is similar except that this time we assume that the sequence satisfies an < 2. We do not try to apply Definition 4.4.5 directly—either of Propositions 4.4.6 and 4.4.7 is a much cleaner way to prove one-sided limits.
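The δ = min{1, ε/5} bookkeeping for lim_{x→2+} x² = 4 can be checked mechanically; a hedged sketch:

```python
# For x with 0 < x - 2 < delta = min(1, eps/5) we have
# |x^2 - 4| = (x + 2)(x - 2) < 5 * delta <= eps.
def delta_for(eps):
    return min(1.0, eps / 5.0)

for eps in (1.0, 1e-3, 1e-6):
    x = 2.0 + delta_for(eps) / 2.0  # a point with 0 < x - 2 < delta
    assert abs(x * x - 4.0) < eps
print("delta = min(1, eps/5) works for lim_{x->2+} x^2 = 4")
```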
If f(x) = 1 for x ≥ 0 and f(x) = 0 for x < 0, we showed in Example 4.2.6 that lim_{x→0} f(x) does not exist. It is very easy to show that lim_{x→0+} f(x) = 1 and lim_{x→0−} f(x) = 0. If we were to apply Proposition 4.4.6, we can choose δ = 1 (or anything else) for both of them. If we apply Proposition 4.4.7, the results follow because f(an) = 1 if an > 0 and f(an) = 0 if an < 0.
In Example 4.2.7 we showed that lim_{x→0} f(x) does not exist when f(x) = sin(1/x) for x ≠ 0 and f(0) = 0. The easiest way to show that lim_{x→0+} f(x) does not exist is
Proof: (⇒) We assume that lim_{x→x0} f(x) exists, i.e. for every ε > 0 there exists a δ such that x ∈ D and 0 < |x − x0| < δ implies that |f(x) − L| < ε for some L ∈ R. Note that 0 < |x − x0| < δ implies that 0 < x − x0 < δ or 0 < x0 − x < δ. We assumed that x0 is a limit point of D ∩ (x0, ∞) and we know that we have a δ such that x ∈ D and 0 < x − x0 < δ implies that |f(x) − L| < ε. Thus by Proposition 4.4.6-(a), lim_{x→x0+} f(x) = L. Also, x0 is a limit point of D ∩ (−∞, x0) and we have a δ such that x ∈ D and 0 < x0 − x < δ implies that |f(x) − L| < ε. Thus by Proposition 4.4.6-(b), lim_{x→x0−} f(x) = L.
(⇐) Suppose ε > 0 is given. We suppose that lim_{x→x0+} f(x) = L and lim_{x→x0−} f(x) = L. Then there exist δ1 and δ2 such that if x satisfies either 0 < x − x0 < δ1 or 0 < x0 − x < δ2, then |f(x) − L| < ε. Let δ = min{δ1, δ2} (Step 1: Define δ). Then if x satisfies 0 < |x − x0| < δ, x satisfies 0 < x − x0 < δ or 0 < x0 − x < δ. So x satisfies 0 < x − x0 < δ ≤ δ1 or 0 < x0 − x < δ ≤ δ2. Thus |f(x) − L| < ε (Step 2: δ works) and lim_{x→x0} f(x) = L.
There are times when we want to prove a particular limit for which the above theorem is very useful. We can handle the function on each side of x0 separately, get the same one-sided limits and hence prove our limit result. The theorem is also a very useful tool for proving that a particular limit does not exist: you compute both one-sided limits, and if either one doesn't exist or they are not equal, the limit doesn't exist.
HW 4.4.2 (a) Prove that lim_{x→0} 1/x⁴ = ∞. (b) Prove that lim_{x→∞} 1/x⁴ = 0.
(c) Prove that lim_{x→∞} sin x does not exist. (d) Prove that lim_{x→∞} x/(2x − 1) = 1/2.
HW 4.4.3 Suppose that f(x) = x² if x ≠ 3 and f(x) = 14 if x = 3. Prove that lim_{x→3+} f(x) = lim_{x→3−} f(x) = 9.
HW 4.4.4 Suppose that f(x) = sin(1/x) if x > 0, f(x) = 0 if x < 0, and f(x) = 12 if x = 0. Compute lim_{x→0+} f(x) and lim_{x→0−} f(x) if they exist. If they do not exist, prove that they do not exist.
Chapter 5
Continuity
Proof: This proof is very easy. Let ε > 0 be given. If we apply the definition of lim_{x→x0} f(x) = f(x0) we get a δ such that 0 < |x − x0| < δ implies that |f(x) − f(x0)| < ε. But this is almost what we need to satisfy Definition 5.1.1. We need to get rid of the "0 <" requirement. But when x = x0, we know that
• At point xA, though the graph has a "corner" at that point, the function is continuous at that point. (Generally, a function is continuous at well-defined corners.)
Proof: Before we begin let’s look at some of the differences between the above
statement and that given in Proposition 4.1.3. Because we now assume that
x0 ∈ D and because we no longer have the "0 <" restriction on the range of x,
we now do not require that an 6= x0 . In addition, in the above statement we no
longer require that x0 is a limit point of D. We know that when we consider
the continuity of a function, it is permissible to have isolated points in D and
the function will always be continuous at those isolated points. Despite these
differences the proof of this result is essentially identical to that of Proposition
4.1.3.
(⇒) We are assuming that f is continuous at x0 ∈ D. We consider any sequence {an} where an ∈ D and an → x0. We suppose that we are given an ε > 0. The continuity of f at x = x0 implies that there exists a δ such that |x − x0| < δ implies that |f(x) − f(x0)| < ε. If we apply the definition of the fact that an → x0 with the traditional "ε" replaced by δ, we get an N such that n > N implies that |an − x0| < δ. The statement that f is continuous at x = x0 then implies that for n > N, |f(an) − f(x0)| < ε. Thus f(an) → f(x0).
(⇐) We suppose that f is not continuous at x0, i.e. for some ε0 > 0 and any δ there exists an xδ ∈ D such that |xδ − x0| < δ and |f(xδ) − f(x0)| ≥ ε0.
Let δ = 1 (it's true for any δ) and get an x1 ∈ D such that |x1 − x0| < 1 and |f(x1) − f(x0)| ≥ ε0.
Let δ = 1/2 and get an x2 ∈ D such that |x2 − x0| < 1/2 and |f(x2) − f(x0)| ≥ ε0.
And in general, let δ = 1/n and get an xn ∈ D such that |xn − x0| < 1/n and |f(xn) − f(x0)| ≥ ε0.
Thus we have a sequence {xn} such that xn ∈ D for all n, xn → x0 and f(xn) ↛ f(x0). This is a contradiction, so f is continuous at x = x0.
We should be mildly concerned that the proofs of Propositions 4.1.3 and 5.1.3 are essentially the same even though we pointed out the differences between the two results.
This can be explained fairly easily if we consider two separate cases: when x0
is a limit point of D and when it is not. For the first case we can consider
the continuity in terms of limits so it shouldn’t be surprising that the proof of
Proposition 5.1.3 is very much like the proof of Proposition 4.1.3. When x0 is
not a limit point of D both the continuity and the sequential statement are very
easy—so this case is included trivially.
There are some texts that use the right hand side of Proposition 5.1.3 as the definition of continuity—since the proposition is an "if and only if" result, this is completely permissible. The definition of continuity given in Definition 5.1.1 is the most common definition. Of course there are many times that the sequential characterization of continuity is very useful. We feel that you must be comfortable with using both Definition 5.1.1 and Proposition 5.1.3 (just as we tried to force you to work with both Definition 4.1.1 and Proposition 4.1.3). Specifically, as was the case with limits, when we want to show that a function is not continuous at a given point, it is usually easier to apply Proposition 5.1.3—providing a sequence {an} for which {f(an)} does not converge or converges to something different from f(x0). When push comes to shove, we will use whichever characterization is best at the time.
We next consider the absolute value function at x = 0. Recall that the graph
of the absolute value function has a corner at x = 0 (like point xA on the graph
of f in Figure 5.2.1). Functions are continuous at corners of the graph.
Example 5.2.2 Show that the function f(x) = |x| is continuous at x = 0.
Solution: Note that f is defined on R (which we will assume to be the domain of f). Clearly x = 0 is a limit point of the domain. Since |x| = x for x ≥ 0 and |x| = −x for x < 0, and lim_{x→0} x = 0 and lim_{x→0} (−x) = 0, the two one-sided limits of |x| at 0 are both 0, so lim_{x→0} |x| = 0 = |0|. Then by Proposition 5.1.2, | · | is continuous at
x = 0.
It should be clear that the absolute value function is also continuous at all other points
of R.
Figure 5.2.1: The circle has radius 1, A and P are on the circumference of the
circle and P Q is perpendicular to OA
We see that |AP| ≤ |θ| (equality when θ = 0)—where the absolute value signs are included to allow for a negative angle θ. From triangle △OQP we see that |QP| = |sin θ| and |OQ| = cos θ—which gives |AQ| = 1 − cos θ. Then applying the Pythagorean Theorem to triangle △AQP we see that

sin²θ + (1 − cos θ)² = |AP|² ≤ θ².
Therefore sin²θ ≤ θ² and (1 − cos θ)² ≤ θ². If we take the square roots of both inequalities we get |sin θ| ≤ |θ| and |1 − cos θ| ≤ |θ|. (Notice that this is one of the times that you must be very careful to note that √(a²) = |a|—not √(a²) = a.)
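The two inequalities just derived hold for all real θ, and a quick numeric sweep is a useful sanity check (illustration only):

```python
import math

# Check |sin t| <= |t| and |1 - cos t| <= |t| on a grid of sample points.
samples = [(k - 100) / 50.0 for k in range(201)]  # roughly [-2, 2]
ok = all(
    abs(math.sin(t)) <= abs(t) + 1e-12
    and abs(1.0 - math.cos(t)) <= abs(t) + 1e-12
    for t in samples
)
print(ok)  # True
```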
(b) and (c) By Example 5.2.2 we know that lim_{θ→0} (±|θ|) = 0. Then by part (a) of this example and Proposition 4.3.4, lim_{θ→0} sin θ = 0 = sin 0. Since θ = 0 is a limit point of R, we can apply Proposition 5.1.2 to see that the sine function is continuous at θ = 0.
It should be easy to see that the proof that lim_{θ→0} (1 − cos θ) = 0 is the same as the proof given above for the sine function. From this we see that lim_{θ→0} cos θ = 1 = cos 0 and that the cosine function is continuous at θ = 0.
(Alternative proof:) Once we have the inequalities from part (a), it is also easy to prove
the continuity of sine and cosine using Definition 5.1.1. (If ε > 0 is given, then by choosing
δ = ε, we see that |θ − 0| < δ implies that −ε = −δ < −|θ| ≤ 1 − cos θ ≤ |θ| < δ = ε. So cosine
is continuous at θ = 0.)
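The ε-δ argument above leans entirely on the two inequalities from part (a). As a quick numerical sanity check (a sketch for illustration, not part of the proof), we can verify | sin t| ≤ |t| and |1 − cos t| ≤ |t| on a sample of angles:

```python
import math

# Spot-check the two inequalities from part (a) on sample angles; with
# delta = epsilon they give the continuity of sine and cosine at 0.
for t in [-1.0, -0.5, -0.01, 0.01, 0.5, 1.0]:
    assert abs(math.sin(t)) <= abs(t)
    assert abs(1 - math.cos(t)) <= abs(t)
print("both inequalities hold at all sample angles")
```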
(d) Let θ0 ∈ R. We consider lim_{θ→θ0} sin θ. Note that by a trig identity
sin θ = sin((θ − θ0) + θ0) = sin(θ − θ0) cos θ0 + cos(θ − θ0) sin θ0.   (5.2.1)
By part (b) lim_{h→0} sin h = 0, hence for any ε > 0 there exists a δ such that |h| < δ implies that
| sin h| < ε. Then: if |θ − θ0| < δ, we have | sin(θ − θ0)| < ε. Therefore lim_{θ→θ0} sin(θ − θ0) = 0.
In part (b) we found that lim_{θ→0} cos θ = 1. Hence, lim_{θ→θ0} cos(θ − θ0) = 1. Thus by these
limits, equation (5.2.1) and parts (a) and (b) of Proposition 4.3.1 we see that lim_{θ→θ0} sin θ =
sin θ0 (1) + cos θ0 (0) = sin θ0. Therefore the sine function is continuous at θ = θ0 (for any
θ0 ∈ R).
To prove the continuity of the cosine function at θ = θ0 we use the identity
cos θ = cos((θ − θ0) + θ0) = cos(θ − θ0) cos θ0 − sin(θ − θ0) sin θ0
and proceed as we did in the proof of the continuity of the sine function.
The next example is a fun example. Before we get working notice the function f defined
in Example 5.2.4. Recall that in Example 4.2.7 we considered a similar function (without
the x term multiplying the sine term) that was not continuous at x = 0. As we did in
Example 4.2.7 it is useful here to look at the plot of f . In Figure 5.2.2 we see that the plot
of f squeezes down to zero when x is near zero. This is the attribute of this function that
makes it continuous at x = 0 whereas the function given in Example 4.2.7 was not continuous
at x = 0.
Example 5.2.4 Define the function f : R → R by
f (x) = x sin(1/x) if x ≠ 0, and f (x) = 0 if x = 0.
Show that f is continuous at x = 0.
Proof: It is easy to use Definition 5.1.1 to prove that f is continuous at x = 0. Let ε > 0 be
given, define δ = ε and consider x values that satisfy |x| < δ. Then
|x sin(1/x) − 0| ≤ |x| < δ = ε.
Therefore f is continuous at x = 0.
It should be clear that f is also continuous for all other points in R.
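Since the proof rests on the single inequality |x sin(1/x) − 0| ≤ |x|, it is easy to check numerically; the following sketch (an illustration, not part of the proof) evaluates f at points approaching 0:

```python
import math

def f(x):
    # The function of Example 5.2.4, extended by f(0) = 0.
    return x * math.sin(1.0 / x) if x != 0 else 0.0

# The continuity proof uses |f(x) - 0| <= |x|; check it near x = 0.
for n in range(1, 8):
    x = 10.0 ** (-n)
    assert abs(f(x)) <= x and abs(f(-x)) <= x
    print(f"x = ±{x:.0e}: |f(x)| <= |x| holds")
```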
We include one more example (that is also a fun example) that introduces
a useful, interesting function.
[Figure 5.2.2: the plot of f for x ∈ [−0.5, 0.5], squeezing down to zero near x = 0.]
Example 5.2.5 Define the function f : D = [0, 1] → R by
f (x) = 1 if x ∈ Q, and f (x) = 0 if x ∈ I.
Show that f is discontinuous at all points x ∈ D = [0, 1].
Solution: First consider x0 ∈ [0, 1] ∩ I. Let ε = 1/2. Consider any δ. We know by Proposition
1.5.6-(a) that there exists rδ ∈ Q such that rδ ∈ (x0 − δ, x0 + δ), i.e. rδ satisfies |rδ − x0| < δ
and |f (rδ) − f (x0)| = |1 − 0| = 1 > ε = 1/2. Therefore f is not continuous at x0.
Likewise consider x0 ∈ [0, 1] ∩ Q. Let ε = 1/2. Consider any δ. We know by Proposition
1.5.6-(b) that there exists iδ ∈ I such that iδ ∈ (x0 − δ, x0 + δ), i.e. iδ satisfies |iδ − x0| < δ
and |f (iδ) − f (x0)| = |0 − 1| = 1 > ε = 1/2. Therefore f is not continuous at x0.
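The density argument can be illustrated numerically. The helper below is a hypothetical stand-in for Proposition 1.5.6-(a), written only for illustration: it produces a rational in any interval (x0 − δ, x0 + δ), so f takes the value 1 within every δ of an irrational point x0 where f (x0) = 0.

```python
import math
from fractions import Fraction

def rational_in(a, b):
    # A rational in (a, b): choose q with 1/q < (b - a)/2; the first
    # multiple of 1/q strictly above a then lies in (a, b).
    q = int(2.0 / (b - a)) + 1
    p = math.floor(a * q) + 1
    return Fraction(p, q)

x0 = 1.0 / math.sqrt(2)          # an irrational point of [0, 1]
for k in range(1, 6):
    delta = 10.0 ** (-k)
    r = rational_in(x0 - delta, x0 + delta)
    assert abs(float(r) - x0) < delta   # so |f(r) - f(x0)| = 1 >= 1/2
    print(f"delta = {delta:.0e}: found rational {r}")
```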
HW 5.2.2 (a) Use Definition 5.1.1 to prove that f (x) = |x| is continuous at
x = 0.
(b) Use Definition 5.1.1 to prove that f (x) = |x| is continuous at x = 2.
HW 5.2.6 Prove that any rational function is continuous at all points where
the denominator is nonzero.
Proof: The proofs of (a)-(d) follow from Proposition 5.1.3 along with Proposi-
tions 3.3.2 and 3.4.1. We consider any sequence {an } such that an ∈ D for all n
and an → x0 . Then by the continuity hypothesis and Proposition 5.1.3 we know
that f (an ) → f (x0 ) and g(an ) → g(x0 ). Then by Proposition 3.3.2 we know that
cf (an ) → cf (x0 ), (f + g)(an ) = f (an ) + g(an ) → f (x0 ) + g(x0 ) = (f + g)(x0 )
and (f g)(an ) = f (an )g(an ) → f (x0 )g(x0 ) = (f g)(x0 ). Then by Proposition
5.1.3 cf , f + g and f g are continuous at x = x0 .
Likewise, for any sequence {an } such that an ∈ D for all n and an → x0 ,
the continuity of f at x0 implies that f (an ) → f (x0 ) and g(an ) → g(x0 ).
Since g(x0) ≠ 0, Proposition 3.4.1 implies that (f /g)(an) = f (an)/g(an) →
f (x0 )/g(x0 ) = (f /g)(x0 ). Then by Proposition 5.1.3, f /g is continuous at
x = x0 .
We must realize that the above results can also be proved based on Definition
5.1.1—similar to the proof of Proposition 4.3.1 given using Definition 4.1.1.
Also, we want to emphasize that Proposition 5.3.1 implies that if f and g are
continuous on D ⊂ R, then cf , f ± g, f g are continuous on D. And f /g is
continuous on {x ∈ D : g(x) ≠ 0}, which is the natural domain of f /g.
In addition to the results given above, we also have the results analogous
to parts (c) and (e) of Proposition 4.3.1. The result is a useful tool in the study
of continuity. We state the following proposition.
Proposition 5.3.2 Consider f : D → R for D ⊂ R, x0 ∈ D, and suppose that
f is continuous at x = x0 .
(a) There exist a K ∈ R and δ > 0 such that |x − x0| < δ implies |f (x)| ≤ K.
(b) If f (x0) ≠ 0, there exist M > 0 and δ > 0 such that |x − x0| < δ implies
|f (x)| ≥ M .
We don’t prove the above result—the proof is the same as those of parts (c)
and (e) of Proposition 4.3.1.
The next result could be pieced together by multiple applications of parts of
Proposition 5.3.1, but we don't have to work that hard. We have already done
the work in Section 4.3.
Example 5.3.1 (a) For n ∈ N the function f (x) = xn is continuous on R.
(b) All polynomials are continuous on R.
(c) All rational functions are continuous at all points at which the denominator is not zero.
Solution: All of the points under consideration are limit points of the domains. Then part
(a) follows from Proposition 4.3.2-(c) along with Proposition 5.1.2. Parts (b) and (c) follow
from parts (b) and (c) of Proposition 4.3.3.
Proof: (a) Suppose ε > 0 is given. Then there exist δ1 and δ2 such that
|x − x0| < δ1 implies that |f (x) − f (x0)| < ε, or f (x0) − ε < f (x) < f (x0) + ε   (5.3.1)
and
|x − x0| < δ2 implies that |g(x) − g(x0)| < ε, or g(x0) − ε < g(x) < g(x0) + ε.   (5.3.2)
Let δ = min{δ1, δ2} (Step 1: Define δ). Then for x satisfying |x − x0| < δ
max{f (x0), g(x0)} − ε = max{f (x0) − ε, g(x0) − ε} < max{f (x), g(x)}   (5.3.3)
and
max{f (x), g(x)} < max{f (x0) + ε, g(x0) + ε} = max{f (x0), g(x0)} + ε   (5.3.4)
or F (x0) − ε < F (x) < F (x0) + ε, where F = max{f, g} (Step 2: δ works). Thus we have
|F (x) − F (x0)| < ε so F is continuous at x = x0. Look at the computation given
in (5.3.3) and (5.3.4) carefully. It's easy but looks difficult. You start with
max{f (x), g(x)} and replace each of them by the inequalities given by statements
(5.3.1) and (5.3.2).
(b) Of course the proof of part (b) will be the same. We again consider statements
(5.3.1) and (5.3.2). This time taking the minimums, we get G(x0) − ε <
G(x) < G(x0) + ε, where G = min{f, g}, or |G(x) − G(x0)| < ε. Thus G is continuous at x = x0.
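A small numerical illustration of the proof (a sketch with arbitrarily chosen f, g, x0 and tolerances, not part of the argument): any δ that works simultaneously for f and g also works for F = max{f, g}.

```python
# Illustrate the continuity of F = max{f, g} with f(x) = x and g(x) = x^2
# at x0 = 1: a delta that works for both f and g keeps F within epsilon.
f = lambda x: x
g = lambda x: x * x
F = lambda x: max(f(x), g(x))

x0, eps = 1.0, 1e-3
delta = 1e-4        # small enough for both f and g near x0 = 1
for x in [x0 - delta / 2, x0 + delta / 2, x0 + 0.9 * delta]:
    assert abs(F(x) - F(x0)) < eps
print("F = max(f, g) stays within eps of F(x0) on the delta-interval")
```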
Next we want to give a result that will expand the number of functions that
we know are continuous. Before we give the result we include the definition of
the composite function.
It is easy to see that the function f (x) = −x² defined on [−1, 1] has a maximum
at (0, 0), and it is an absolute maximum. This function has minimums at both
points (−1, −1) and (1, −1), which are both absolute minimums. Note that
this means that the absolute maximum or minimum need not be unique. Note
that the same function defined on the set (−1, 1) does not have any minimum,
and then surely does not have an absolute minimum. We also note that if we
define a function f : D → R, D = (−2, −1) ∪ {0} ∪ (1, 2), by f (x) = x², then by
the definition x = 0 is both a maximum and a minimum, not very satisfying
but acceptable.
We next prove a useful lemma and a very important theorem concerning
continuous functions, a result that is much stronger than Proposition 5.3.2–(a).
Lemma 5.3.7 Suppose that f : [a, b] → R and f is continuous on [a, b]. Then
there exists an M ∈ R such that f (x) ≤ M for all x ∈ [a, b].
Proof: Suppose false, i.e. suppose there is no M such that f (x) ≤ M for all x ∈ [a, b].
For M = 1 there exists an x1 ∈ [a, b] such that f (x1 ) > 1 (otherwise M = 1
would work).
For M = 2 there exists an x2 ∈ [a, b] such that f (x2 ) > 2.
And, in general, for each n ∈ N there exists an xn ∈ [a, b] such that f (xn ) > n.
{xn } is a sequence in [a, b]. Since [a, b] is compact, by Corollary 3.4.8 we know
that the sequence {xn } has a subsequence, {xnj } and there is an x0 ∈ [a, b],
such that xnj → x0 as j → ∞. Then by the continuity of f on [a, b] and
Proposition 5.1.3 we know that f (xnj) → f (x0). Since the sequence {f (xnj)}∞_{j=1}
is convergent, we know by Proposition 3.3.2–(c) that the sequence is bounded.
This contradicts the fact that f (xnj) > nj ≥ j for all j. Therefore the set
f ([a, b]) is bounded above.
Theorem 5.3.8 Suppose that f : [a, b] → R and f is continuous on [a, b]. Then
f has an absolute maximum and an absolute minimum on [a, b].
Proof: Let S = f ([a, b]). By Lemma 5.3.7 S is bounded above. Thus by the
completeness axiom, Definition 1.4.3, M ∗ = lub(S) exists. Note that to find an
absolute maximum of f on [a, b], we must find an x0 such that f (x0 ) = M ∗ .
Recall that by Proposition 1.5.3–(a) for every ε > 0 there exists an s ∈ S
such that M ∗ − s < ε. In our case Proposition 1.5.3–(a) gives that for every ε > 0
there exists an x ∈ [a, b] (and an associated f (x)) such that M ∗ − f (x) < ε (all
points in S look like f (x) and are associated with an x ∈ [a, b]).
Let ε = 1. We get x1 ∈ [a, b] such that M ∗ − f (x1) < 1.
Let ε = 1/2. We get x2 ∈ [a, b] such that M ∗ − f (x2) < 1/2.
In general, let ε = 1/n for n ∈ N. We get xn ∈ [a, b] such that M ∗ − f (xn) < 1/n.
Proof: We have two cases, f (a) < c < f (b) and f (b) < c < f (a). We will
consider the first case—the second case will follow in the same way.
This will be a constructive proof. Let a1 = a and b1 = b.
Let m1 = (a1 + b1 )/2. If f (m1 ) ≤ c, define a2 = m1 and b2 = b1 . If f (m1 ) > c,
define a2 = a1 and b2 = m1. Note that this construction divides the interval
in half, and chooses the half so that f (a2) ≤ c < f (b2). Specifically we have
a = a1 ≤ a2 < b2 ≤ b1 = b and f (a2) ≤ c < f (b2).
Let m2 = (a2 + b2 )/2. If f (m2 ) ≤ c, define a3 = m2 and b3 = b2 . If f (m2 ) > c,
define a3 = a2 and b3 = m2 . We have a = a1 ≤ a2 ≤ a3 < b3 ≤ b2 ≤ b1 = b and
f (a3 ) ≤ c < f (b3 ).
We continue in this fashion and inductively obtain an and bn , n = 1, 2, · · ·
such that
a = a1 ≤ a2 ≤ a3 ≤ · · · ≤ an < bn ≤ bn−1 ≤ · · · ≤ b1 = b
and f (an) ≤ c < f (bn) for all n ∈ N. We have a sequence of closed intervals
[an, bn] such that f (an) ≤ c < f (bn) and bn − an = (1/2)[bn−1 − an−1] = · · · =
(1/2^{n−1})[b1 − a1] = (1/2^{n−1})[b − a].
Clearly {an } is a monotonically increasing sequence that is bounded above
by b. Therefore by the Monotone Convergence Theorem, Theorem 3.5.2, there
exists α ≤ b such that an → α. Likewise the sequence {bn } is a monotonically
decreasing sequence bounded below by a. Thus by the Monotone Convergence
Theorem there exists a β ≥ a such that bn → β.
We see that lim_{n→∞}[bn − an] = lim_{n→∞}(1/2^{n−1})[b − a] = 0 and lim_{n→∞}[bn −
an] = β − α. Thus α = β. Call it x0.
We have f (an) ≤ c < f (bn) and lim_{n→∞} f (an) = lim_{n→∞} f (bn) = f (x0).
By the Sandwich Theorem, Proposition 3.4.2, (where the center sequence will
be the constant sequence {c, c, · · · }) we have f (x0) = c.
We might note that one of the nice applications of the IVT is to prove the
existence of a solution of an equation of the form f (x) = 0. The approach is to
find a and b in the domain of the function such that f (a) < 0, f (b) > 0 and f is
continuous on [a, b]. The IVT then implies that there exists an x0 ∈ [a, b] such
that f (x0 ) = 0. For example consider the function f (x) = x5 + x + 1. We note
that f (−1) = −1, f (1) = 3 and f is surely continuous on the interval [−1, 1].
Therefore by the Intermediate Value Theorem there exists an x0 ∈ [−1, 1] such
that f (x0) = 0. Can you find such an x0? Can you approximate it? (Use your
calculator.)
Can you find a sequence of points that converges to the solution? Use the
proof of the previous proposition: it’s called the Bisection Method. Suppose
we know that f (a) < 0, f (b) > 0 and f is continuous on [a, b]. We then use a
construction that we have used in the proof of the IVT. Let {an }, {mn } and
{bn } be as defined in the proof. But we don’t compute them all—that would
take a long time even on the best computer. We continue with this computation
until for some mn, |f (mn)| is sufficiently small. We use that value of mn as an
approximation of the solution of f (x) = 0. We note that the proof of the IVT
proves that the sequence {an } converges to the solution of f (x) = 0. It’s easy
to see that the sequence {mn } will also converge to the solution of f (x) = 0.
(You might prove it—just to show that it is easy.) We really don’t know how
fast this convergence is taking place (the Bisection method is not the fastest
method) but we can get an excellent approximation to solutions of equations
using this method. For example if we again consider f (x) = x5 + x + 1, set
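The construction described above can be sketched in code. The sketch below is an assumed implementation (the tolerance is chosen arbitrarily) of the Bisection Method applied to f (x) = x⁵ + x + 1 on [−1, 1]:

```python
def bisect(f, a, b, tol=1e-10):
    # Bisection Method from the proof of the IVT with c = 0: maintain
    # f(a_n) <= 0 < f(b_n) and halve the bracket until |f(m_n)| is small.
    assert f(a) < 0 < f(b)
    while True:
        m = (a + b) / 2.0
        if abs(f(m)) < tol or (b - a) < tol:
            return m
        if f(m) <= 0:
            a = m          # keep the right half: f(a_{n+1}) <= 0 < f(b_{n+1})
        else:
            b = m          # keep the left half

f = lambda x: x ** 5 + x + 1
root = bisect(f, -1.0, 1.0)
print(root)                # a point where f is numerically zero, near -0.755
```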
Proof: If f (I) is not an interval, there must be an f (a), f (b) ∈ f (I) and a
c ∈ R such that c is between f (a) and f (b) but c 6∈ f (I). This would contradict
Theorem 5.4.1 applied to f on [a, b] (where for convenience we assume that
a < b).
Often we are interested in when and where the functions are increasing and
decreasing—if you recall, you probably used these ideas in your basic class when
you used calculus to plot the graphs of some functions. We will use these ideas
in a very powerful way to help us study the inverse of functions. Before we
proceed we make the following definitions.
Proof: Consider the case when f is increasing. The case of f decreasing will
be the same. Let x0 ∈ D and suppose that ε > 0 is given. We must find a δ so
that |x − x0| < δ implies that |f (x) − f (x0)| < ε, i.e.
f (x0) − ε < f (x) < f (x0) + ε.   (5.4.1)
The method of proof will be to first work to the right of x0 to limit δ so that
for x ∈ (x0, x0 + δ), f cannot grow too much (not more than to f (x0) + ε). We
will then do the same thing to the left of x0.
Consider the right most part of inequality (5.4.1): f (x) < f (x0) + ε.
If f (x) ≤ f (x0 ) for all x ∈ D, the desired inequality is satisfied and we can
choose δ1 = 1.
Otherwise, let x∗ ∈ D be such that f (x∗ ) > f (x0 ). Then x0 < x∗ (because f
is increasing) and the interval [f (x0 ), f (x∗ )] is contained in f (D) because f (D)
is assumed to be an interval. Let y∗∗ = min{f (x0) + ε/2, f (x∗)}. Then the
interval [f (x0), y∗∗] is contained in f (D). Thus there exists an x∗∗ ∈ D such
that f (x∗∗) = y∗∗. Then x0 < x < x∗∗ implies that f (x0) ≤ f (x) ≤ f (x∗∗) =
y∗∗ < f (x0) + ε. Let δ2 = x∗∗ − x0.
Now consider the left most part of inequality (5.4.1): f (x0) − ε < f (x).
If f (x) ≥ f (x0), we are done. Let δ3 = 1.
Otherwise, let x∗ ∈ D be such that f (x∗) < f (x0). Then x∗ < x0 and the
interval [f (x∗), f (x0)] ⊂ f (D). Let y∗∗ = max{f (x0) − ε/2, f (x∗)}. Then
[y∗∗, f (x0)] ⊂ f (D) and there exists x∗∗ ∈ D such that f (x∗∗) = y∗∗. Then x∗∗ < x < x0
implies that f (x0) − ε < y∗∗ = f (x∗∗) ≤ f (x) ≤ f (x0). Let δ4 = x0 − x∗∗.
Thus we see that if we define δ = min{δ1, δ2, δ3, δ4} (Step 1: Define δ) and
require that |x − x0| < δ, then |f (x) − f (x0)| < ε (Step 2: δ works).
Notice that the functions f7 and f8 considered earlier are both monotone but
are not continuous; neither f7(R) nor f8(R) is an interval. Check it out.
We next state a result that is a bit strange because we already have this result:
it is a combination of Corollary 5.4.2 and Proposition 5.4.4. We do so because we
want to emphasize this result in this form.
Corollary 5.4.5 Consider f : I → R where I ⊂ R is an interval. Assume
that f is monotone on I. Then f is continuous on I if and only if f (I) is an
interval.
There are times when we are given a function for which it is very important
to us to know that the function has an inverse and to be able to determine
properties of that inverse. You have been using inverse functions for a long
time (it is a very basic task to be given some y = f (x) and want to solve for x);
sometimes you might have been aware that you were using an inverse, and other
times you might not have been aware. We begin with the following definition.
Definition 5.4.6 Consider f : D → R where D ⊂ R. The function f is said
to be one-to-one (often written 1-1) if f (x) = f (y) implies that x = y.
In your basic calculus course when you studied one-to-one functions you used
what is called the horizontal line test—that is, draw an arbitrary horizontal line
on the graph of the function; the function is one-to-one if every such line intersects
the graph at most once. It should be clear that this description of the
horizontal line test is equivalent to Definition 5.4.6, though less rigorous.
To prove that the function f1(x) = √x defined on [0, ∞) is one-to-one, we note that f1(x) =
f1(y) is the same as √x = √y. If we then square both sides, we find that x = y,
which is what we must prove. Graph f1 to see how the horizontal line test works.
The function f2(x) = x² is surely not one-to-one on R (f2(−1) = f2(1)). Again,
plot the function and draw the horizontal line. If we consider f2 on [0, ∞)
instead, then f2 is one-to-one.
A statement that is equivalent to Definition 5.4.6 is as follows: The function
f is said to be one-to-one if for each element y ∈ f (D) there exists one and only
one element x ∈ D such that f (x) = y. The definition of one-to-one allows us
to make the following definition.
Definition 5.4.7 Consider f : D → R where D ⊂ R. Assume that the function
f is one-to-one. We define the function f −1 : f (D) → D by f −1 (y) = x if
f (x) = y. The function f −1 is called the inverse of f . When f −1 exists, f is
said to be invertible.
Note that the definition that f is one-to-one is exactly what is needed to make
f ⁻¹ a function, i.e. for each y ∈ f (D) there exists one and only one x ∈ D such
that f −1 (y) = x. We also note that by rewriting the statement in Definition
5.4.7 we see that f and f −1 satisfy f −1 (f (x)) = x for x ∈ D and f (f −1 (y)) = y
for y ∈ f (D).
If we consider f2(x) = x² on [0, ∞), let y = x² and solve for x, we get
x = ±√y. Since x must be greater than or equal to zero, f2⁻¹(y) = √y. Note
that since f2([0, ∞)) = [0, ∞), the domain of f2⁻¹ is also [0, ∞). If we next
consider f3(x) = x³ on R, we note that f3(R) = R and f3⁻¹(y) = ∛y for all
y ∈ R.
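A quick numerical sketch (illustration only) of the inverse relationships f ⁻¹(f (x)) = x for f2 and f3:

```python
# Check f2^{-1}(f2(x)) = x for f2(x) = x^2 on [0, inf) and
# f3^{-1}(f3(x)) = x for f3(x) = x^3 on R.
f2 = lambda x: x * x
f2_inv = lambda y: y ** 0.5

def f3_inv(y):
    # Real cube root; y ** (1/3) alone misbehaves for negative y.
    return y ** (1.0 / 3.0) if y >= 0 else -((-y) ** (1.0 / 3.0))

for x in [0.0, 0.5, 1.0, 3.0, 9.0]:
    assert abs(f2_inv(f2(x)) - x) < 1e-12
for x in [-8.0, -1.0, 0.0, 2.0]:
    assert abs(f3_inv(x ** 3) - x) < 1e-9
print("inverse identities hold on all sample points")
```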
We obtain the following important but very easy result.
Proposition 5.4.8 Consider f : D → R where D ⊂ R. Assume that f is
strictly monotone on D. Then f is one-to-one on D.
Proof: Suppose that x, y ∈ D are such that f (x) = f (y). If x ≠ y, then either x < y or y < x;
either case is a contradiction to the fact that f is strictly monotone.
It’s clear that the converse of Proposition 5.4.8 is not true when we consider
a function like f (x) = 1/x defined on R − {0}. However we are able to obtain
the following result.
Proof: (a) We assume that f ⁻¹ is not strictly increasing, i.e. suppose that
u < v and f ⁻¹(u) is not less than f ⁻¹(v). Then f ⁻¹(u) ≥ f ⁻¹(v). Suppose x and y are
such that f (x) = u and f (y) = v. Then u < v is the same as f (x) < f (y) and
f ⁻¹(u) ≥ f ⁻¹(v) implies that x ≥ y.
This contradicts the fact that f is increasing because when f is increasing
x > y implies f (x) > f (y) and x = y implies f (x) = f (y). i.e. if we have x ≥ y,
we have u = f (x) ≥ f (y) = v.
(b) The proof when f is strictly decreasing is very similar to that given in part
(a).
The next two results are the ultimate results relating the continuity proper-
ties of f −1 to those of f .
Proof: Suppose that f is strictly increasing—the proof of the case for f strictly
decreasing is the same. We know then by Proposition 5.4.10–(a) that f −1 :
f (I) → I is strictly increasing. Then since f −1 (f (I)) = I is an interval (and
f (I) is the domain of f −1 ), by Proposition 5.4.5 we see that f −1 is continuous
on f (I).
find an n such that 1/(2n) < δ (and this will hold for all larger n) and take ε = 1.
For this value of n we have |xn − yn| < δ and |1/xn − 1/yn| = n ≥ 1 = ε. Thus f is not
uniformly continuous on (0, 1).
There is a result that makes this last proof a bit easier. Since it is an "if and
only if" result, the following proposition provides an alternative definition
of uniform continuity.
Proposition 5.5.4 Suppose the function f : D → R, D ⊂ R. The function f
is uniformly continuous if and only if for all sequences {un}, {vn} in D such
that lim_{n→∞}[un − vn] = 0 we have lim_{n→∞}[f (un) − f (vn)] = 0.
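For the function f (x) = 1/x on (0, 1) this criterion is easy to see in action; the following sketch (the particular sequences are chosen for illustration) uses un = 1/n and vn = 1/(n + 1):

```python
# Proposition 5.5.4 applied to f(x) = 1/x on (0, 1): with u_n = 1/n and
# v_n = 1/(n + 1) we have u_n - v_n -> 0, yet f(u_n) - f(v_n) = -1 for
# every n, so f is not uniformly continuous on (0, 1).
f = lambda x: 1.0 / x
for n in [10, 100, 1000, 10000]:
    u, v = 1.0 / n, 1.0 / (n + 1)
    print(f"n = {n:5d}: u_n - v_n = {u - v:10.2e}, f(u_n) - f(v_n) = {f(u) - f(v):.3f}")
```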
|(c1 f (x) + c2 g(x)) − (c1 f (y) + c2 g(y))| ≤ |c1||f (x) − f (y)| + |c2||g(x) − g(y)|
< |c1|ε1 + |c2|ε2.
Notice that we have not included results for products and quotients of uni-
formly continuous functions. See HW5.5.3. We have one more very important
result relating continuity and uniform continuity.
HW 5.5.2 (a) Show that the function f : (0, 1) → R defined by f (x) = 3x2 + 1
is uniformly continuous.
(b) Show that f : (2, ∞) → R defined by f (x) = 1/x2 is uniformly continuous.
(c) Show that the function f : R → R defined by f (x) = x3 is not uniformly
continuous on R.
Example 5.6.1 Consider the function f (x) = x2 on D = [0, ∞). Show that f is invertible
and that f −1 is continuous on [0, ∞).
Solution: We first note that f (D) = [0, ∞) and as we saw in Section 5.4, f is strictly
increasing on D. By Proposition 5.4.8 we know that f is one-to-one, i.e. f is invertible. As
usual denote the inverse of f by f −1 . Of course f −1 : f (D) = [0, ∞) → D = [0, ∞).
In addition since D = [0, ∞) is an interval, by Proposition 5.4.11 we know that f −1 is
continuous on f (D) = [0, ∞).
Now that the nth roots are defined we have to decide what we want to do
with these definitions. To begin with we make the following extensions of the
above definition.
Definition 5.6.3 (a) For n a negative integer we define y^{1/n} = 1/y^{−1/n} for y ∈ (0, ∞).
(b) For r ∈ Q, r = m/n, we define y^r = (y^{1/n})^m for y ∈ (0, ∞). If y = 0 and
r > 0, define y^r = 0.
Now that we have x^r defined we have work to do. We noted in Example
1.6.3 and HW1.6.2 that for m, n ∈ N and a > 0 we have a^m a^n = a^{m+n} and
(a^m)^n = a^{mn}, respectively. Of course we would like these properties to be true
for rationals also. But before we prove these arithmetic properties we must
prove that x^r is well defined. The problem is that r = m/n and r = mk/(nk) are equal
rationals. We need to know that x^{m/n} = x^{mk/nk}.
Proposition 5.6.4 x^r is well defined.
where in both cases the **-equalities are due to integer algebra and the *-equalities
and ***-equalities are due to Definition 5.6.3-(b), the definition of
y^r. Thus we have (x^{km/kn})^{kn} = (x^{m/n})^{kn}. Then because h(u) = u^{kn} is
one-to-one on [0, ∞), we get x^{km/kn} = x^{m/n} so x^r is well defined.
We should note that in the last step where we used the fact that h is one-to-
one, we could have equally said that we were taking the kn-th root of both sides
of the equality—but you might recall that the fact that the kn-th root exists is
because h is one-to-one.
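Well-definedness can also be spot-checked numerically. The sketch below (floating point only, so equality holds up to roundoff) computes x^{m/n} as (x^{1/n})^m, per Definition 5.6.3-(b), for equal fractions m/n and km/kn:

```python
def rational_power(x, m, n):
    # x^(m/n) for x > 0, computed as the n-th root raised to the m-th power.
    return (x ** (1.0 / n)) ** m

for x in [0.5, 2.0, 7.0]:
    for m, n, k in [(2, 3, 5), (3, 4, 2), (5, 2, 3)]:
        a = rational_power(x, m, n)
        b = rational_power(x, k * m, k * n)
        assert abs(a - b) < 1e-9 * max(1.0, a)
print("x^(m/n) agrees with x^(km/kn) on all sample values")
```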
Now that we know that x^r is well defined it's time to start developing the
necessary arithmetic properties. We begin with the fractional part of our
arithmetic properties.
Proposition 5.6.5 Suppose m, n ∈ N. Then we get the following results.
(a) (x^m)^{1/n} = (x^{1/n})^m = x^{m/n}
(b) x^{1/n} x^{1/m} = x^{1/n + 1/m}
(c) (x^{1/n})^{1/m} = x^{1/mn}
Proof: (a) We note that by Definition 5.6.3-(b), x^m = [(x^m)^{1/n}]^n, and by Definition 5.6.3-(b)
and integer algebra x^m = [(x^{1/n})^n]^m = [(x^{1/n})^m]^n. Then since h(u) = u^n is
one-to-one, we get (x^m)^{1/n} = (x^{1/n})^m, and this last expression is the definition
of x^{m/n}.
and
x^{m+n} = x^m x^n = [(x^{1/n})^n]^m [(x^{1/m})^m]^n = (x^{1/n})^{nm} (x^{1/m})^{nm} = [x^{1/n} x^{1/m}]^{nm}
and since h(u) = u^{nq} is one-to-one, x^{rs} = (x^r)^s.
Thus we now have the arithmetic properties for rational exponents that we
have all known for a long time. The proofs given above are a bit gross but we
hope that you realize that you now have a rigorous treatment of these definitions
and properties.
We notice that the above definitions and analysis are all done for x ∈ (0, ∞).
When the appropriate rationals are positive, then the same properties can be
proved for x = 0. It is necessary to only consider [0, ∞) because x^n is not
one-to-one on R for n even—therefore it’s not invertible. It should be clear
that it’s possible to define y 1/n on R for n odd. For n odd we could repeat the
construction given earlier for f (x) = xn and arrive at the definition of y 1/n as
y 1/n = f −1 (y)—which is good because we have all taken the cube root of −27
sometime in our careers and got −3. You can do most of the arithmetic that we
developed for roots, etc. defined on [0, ∞). However you do have to be careful.
For example we know that 1/3 and 2/6 are two representations of the same rational
number. But (−27)^{1/3} = −3, ((−27)^2)^{1/6} = 3 and (−27)^{1/6} is not defined.
This is not good, i.e. you must be careful when you start taking odd roots of
negative numbers.
And finally we remember that we are in a chapter entitled Continuity. This
has been a very nice application of some of our continuity results but we will
now return to continuity. We obtain the following result.
Proposition 5.6.7 Suppose that r ∈ Q and define f : [0, ∞) → R by f (x) = xr .
Then f is continuous on [0, ∞).
Differentiation
Definition 6.1.1 Suppose that the function f : [a, b] → R. If x0 ∈ [a, b], then
f is said to be differentiable at x = x0 if lim_{x→x0} [f (x) − f (x0)]/(x − x0) exists. The limit
is the derivative of f at x0 and is denoted by f ′(x0). If E ⊂ [a, b] and f is
differentiable at each point of E, then f is said to be differentiable on E. The
function f ′ : E → R defined to be the derivative at each point of E is called the
derivative function. A common notation for the derivative function is to write
the function as y = f (x) and denote the derivative of f as dy/dx; at a particular
point x0 write either (dy/dx)(x0) or dy/dx|_{x=x0}. We also denote f ′(x) by (d/dx)f (x).
There is an important alternative form of the limit given in the definition above.
It should be clear that if we replace the x in the limit lim_{x→x0} [f (x) − f (x0)]/(x − x0) by
x0 + h, then x → x0 is the same as h → 0. Thus an alternative definition
of the derivative is given by lim_{h→0} [f (x0 + h) − f (x0)]/h. There are times that this
particular limit is preferable to the limit given in Definition 6.1.1 above.
In the above definition the derivative is defined at x = a and x = b, and
the derivatives at these points will in reality be right and left hand derivatives,
respectively. We can also define right and left hand derivatives at interior points
of [a, b] by using right and left hand limits, i.e. the right hand derivative of f at
x = x0 ∈ (a, b) is defined by f ′(x0+) = lim_{x→x0+} [f (x) − f (x0)]/(x − x0), and the left hand
derivative of f at x = x0 ∈ (a, b) is defined by f ′(x0−) = lim_{x→x0−} [f (x) − f (x0)]/(x − x0).
We will not do much with one sided derivatives. Generally the results that you
need for one sided derivatives are not difficult.
Since hopefully we are good at taking limits, it is not difficult to apply
Definition 6.1.1. In Example 4.2.4 we showed that lim_{x→4} (x³ − 64)/(x − 4) = 48, i.e. if
f (x) = x³ we showed that f ′(4) = 48. We can just as easily show that
f ′(x0) = lim_{x→x0} [f (x) − f (x0)]/(x − x0) = lim_{x→x0} (x³ − x0³)/(x − x0) = lim_{x→x0} (x² + x x0 + x0²) = 3x0².
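The computation can be watched numerically; the sketch below evaluates the difference quotient of Definition 6.1.1 for f (x) = x³ at x0 = 4 and shows it approaching f ′(4) = 48:

```python
# Difference quotient (f(x) - f(x0)) / (x - x0) for f(x) = x^3 at x0 = 4;
# as x -> x0 the quotient approaches f'(4) = 3 * 4^2 = 48.
f = lambda x: x ** 3
x0 = 4.0
for h in [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]:
    x = x0 + h
    q = (f(x) - f(x0)) / (x - x0)
    print(f"x = {x:<12} quotient = {q:.6f}")
```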
We next include an extremely nice result that is necessary for
us to be able to proceed.
Proposition 6.1.2 Consider f : [a, b] → R and x0 ∈ [a, b]. If f is differentiable
at x = x0 , then f is continuous at x = x0 .
Proof: Note that f (x) = [f (x) − f (x0)]/(x − x0) · (x − x0) + f (x0). Then we see that
lim_{x→x0} f (x) = lim_{x→x0} ([f (x) − f (x0)]/(x − x0) · (x − x0) + f (x0))
= lim_{x→x0} [f (x) − f (x0)]/(x − x0) · lim_{x→x0}(x − x0) + lim_{x→x0} f (x0)
= f ′(x0) · 0 + f (x0) = f (x0)
where we can apply the appropriate limit theorems because all of the individ-
ual limits exist. Since x0 ∈ [a, b], x0 is a limit point of [a, b]. Therefore by
Proposition 5.1.2 f is continuous at x = x0 .
The above result shows that there is a hierarchy of properties of functions.
Continuous functions may be nice but differentiable functions are nicer. It is
easy to see by considering the absolute value function at the origin—which we
will do soon—that the converse of this result is surely not true.
In your basic calculus course the very important tools that you used constantly
to compute derivatives were "derivative of the sum is the sum of the
derivatives, derivative of a constant times a function is the constant times the
derivative, the product rule and the quotient rule." We now include these results.
Proof: (a) & (b) The proofs of (a) and (b) are direct applications of Propo-
sition 4.3.1 parts (b) and (a).
(c) We note that
[(f g)(x) − (f g)(x0)]/(x − x0) = f (x) · [g(x) − g(x0)]/(x − x0) + g(x0) · [f (x) − f (x0)]/(x − x0),   (6.1.1)
so that
lim_{x→x0} [(f g)(x) − (f g)(x0)]/(x − x0) = f (x0)g′(x0) + f ′(x0)g(x0).   (6.1.2)
(To allow us to take the limits that get us from (6.1.1) to (6.1.2) we use the fact
that if f differentiable at x0 , then f is continuous at x0 , Proposition 6.1.2, and
of course Definition 6.1.1.)
Therefore we get the product rule, (f g)0 (x0 ) = f (x0 )g 0 (x0 ) + f 0 (x0 )g(x0 ).
(d) We attack the quotient rule in a similar way. We note that
[(f /g)(x) − (f /g)(x0)]/(x − x0) = [f (x)/g(x) − f (x0)/g(x0)]/(x − x0) = [f (x)g(x0) − g(x)f (x0)]/[g(x)g(x0)(x − x0)]
= (1/[g(x)g(x0)]) ( g(x0) · [f (x) − f (x0)]/(x − x0) − f (x0) · [g(x) − g(x0)]/(x − x0) ).
(To get from the third term to the last term we have added and subtracted
things again. You can simplify the last expression to see that it is equal to the
second to the last expression.) Then
lim_{x→x0} [(f /g)(x) − (f /g)(x0)]/(x − x0) = lim_{x→x0} (1/[g(x)g(x0)]) ( g(x0) · [f (x) − f (x0)]/(x − x0)
− f (x0) · [g(x) − g(x0)]/(x − x0) )   (6.1.3)
= (1/[g(x0)]²) [g(x0)f ′(x0) − f (x0)g′(x0)].   (6.1.4)
(Note that to get to (6.1.4) from (6.1.3) we have used parts (a), (b), (d) and
(f) of Proposition 4.3.1 along with Definition 6.1.1. Again it is very important
that by Proposition 6.1.2 since g is differentiable at x0 , then g is continuous at
x0 —and nonzero—so that we can take the limit in the denominator.)
Proof: You should realize that this is a difficult proof. The proof given here
is clearly not difficult but it’s tricky. Read it carefully—otherwise before you
know what we’re doing, we’ll be done. (
g(y)−g(f (x0 ))
y−f (x0 ) if y 6= f (x0 )
Define h : [c, d] → R by h(y) = 0
g (f (x0 )) if y = f (x0 ).
Since g is differentiable at y = f (x0), h is continuous at y = f (x0)—clearly
lim_{y→f (x0)} h(y) = lim_{y→f (x0)} [g(y) − g(f (x0))]/(y − f (x0)) = g′(f (x0)) = h(f (x0)).
Note that g(y) − g(f (x0 )) = h(y)(y − f (x0 )) for all y ∈ [c, d]—specifically check
the identity at y = f (x0 ). We let y = f (x) and get g(f (x)) − g(f (x0 )) =
h(f (x))(f (x) − f (x0 )) for all x ∈ [a, b].
Thus
(g ◦ f )′(x0) = lim_{x→x0} [g ◦ f (x) − g ◦ f (x0)]/(x − x0) = lim_{x→x0} h(f (x)) · [f (x) − f (x0)]/(x − x0).   (6.1.5)
Since f is differentiable at x = x0, [f (x) − f (x0)]/(x − x0) → f ′(x0). Also, since f is
differentiable at x = x0, then f is continuous at x = x0. And finally, since f
is continuous at x = x0 and h is continuous at y = f (x0), by Proposition 5.3.5
h ◦ f is continuous at x = x0. Returning to (6.1.5) we get
(g ◦ f )′(x0) = lim_{x→x0} h(f (x)) · [f (x) − f (x0)]/(x − x0) = h(f (x0))f ′(x0),
or (g ◦ f )0 (x0 ) = g 0 (f (x0 ))f 0 (x0 ).
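A numerical sanity check of the conclusion (a sketch with arbitrarily chosen f, g and x0, for illustration only): compare a symmetric difference quotient of g ∘ f with g′(f (x0))f ′(x0).

```python
import math

# Check (g o f)'(x0) = g'(f(x0)) * f'(x0) for f(x) = x^2, g(y) = sin(y).
f, fp = (lambda x: x * x), (lambda x: 2.0 * x)
g, gp = math.sin, math.cos

def num_deriv(F, x0, h=1e-6):
    # Symmetric difference quotient approximation of F'(x0).
    return (F(x0 + h) - F(x0 - h)) / (2.0 * h)

x0 = 1.3
lhs = num_deriv(lambda x: g(f(x)), x0)
rhs = gp(f(x0)) * fp(x0)
assert abs(lhs - rhs) < 1e-6
print("chain rule verified numerically at x0 = 1.3")
```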
Often in texts of a variety of levels the justification of the chain rule is given
approximately as follows. We note that
[g(f (x)) − g(f (x0))]/(x − x0) = [g(f (x)) − g(f (x0))]/[f (x) − f (x0)] · [f (x) − f (x0)]/(x − x0)
= [g(y) − g(f (x0))]/(y − f (x0)) · [f (x) − f (x0)]/(x − x0),   (6.1.6)
where we have set y = f (x). The argument made is as x → x0 , y → f (x0 ) so
(6.1.6) implies (g ◦ f )0 (x0 ) = g 0 (f (x0 ))f 0 (x0 ). Most often if you read the texts
carefully, they do not claim that it’s a proof. But you have to read it carefully.
The difference is between the statements
lim_{y→f (x0)} [g(y) − g(f (x0))]/(y − f (x0))   (6.1.7)
and
lim_{x→x0} [g(f (x)) − g(f (x0))]/[f (x) − f (x0)].   (6.1.8)
(Notice also that f (x) − f (x0) may be zero for x arbitrarily near x0, so the quotient in
(6.1.8) need not even be defined.)
We must realize that this proof has two "obvious" mathematical induction proofs
hidden in the middle—the · · · 's.
If we apply parts (a), (b) and (c) of Proposition 6.1.3 along with Example
6.2.1, we see that any polynomial is differentiable and (a0 x^m + a1 x^{m−1} + · · · +
am−1 x + am)′ = m a0 x^{m−1} + (m − 1) a1 x^{m−2} + · · · + am−1. Likewise, if in addition
we apply part (d) of Proposition 6.1.3 we find that any rational function is
differentiable at all points where the denominator is not zero and
[p(x)/q(x)]′ = [p′(x) q(x) − p(x) q′(x)]/[q(x)]².
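As a quick numerical sanity check of the quotient formula (our own sketch, not part of the text; the particular p and q are assumptions chosen for illustration), take p(x) = x² + 1 and q(x) = x − 2 at x0 = 1, where q(x0) = −1 ≠ 0; the formula gives (p/q)′(1) = (2 · (−1) − 2 · 1)/1 = −4.

```python
def dq(F, x0, h=1e-6):
    # centered difference quotient approximating F'(x0)
    return (F(x0 + h) - F(x0 - h)) / (2 * h)

p   = lambda x: x**2 + 1      # p'(x) = 2x
q   = lambda x: x - 2         # q'(x) = 1
dp  = lambda x: 2 * x
dqx = lambda x: 1.0

x0 = 1.0                      # q(1) = -1, so the formula applies here
numeric = dq(lambda x: p(x) / q(x), x0)
formula = (dp(x0) * q(x0) - p(x0) * dqx(x0)) / q(x0) ** 2
print(numeric, formula)       # both are -4 (up to rounding)
```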
Again we get to divide out the x − x0 term because in the definition of a limit we only consider x − x0 ≠ 0.
We note that since √x is not defined for x < 0, we know that we cannot worry about the derivative there. If we consider x0 = 0, we see that
lim_{x→0+} [√x − 0]/(x − 0) = lim_{x→0+} 1/√x = ∞.
(We have used a one-sided limit to emphasize the fact that we cannot consider x < 0. We don't really know yet that this limit is ∞—though we could use the methods in Section 4.4 to prove that it is.) Since this limit does not exist in R, the derivative of √x does not exist at x = 0.
does not exist at x = 0. However, this computation is useful when we use the
derivative to give the slope of the tangent to the curve. The above computation
shows that at x = 0 the tangent line is vertical—that’s surely better information
than just telling us that there is no tangent at that point.
If we think about the approach used for the above example, we can use the analogous approach to show that d/dx x^{1/3} = 1/(3 x^{2/3}) for x ∈ R, x ≠ 0, that d/dx x^{1/4} = 1/(4 x^{3/4}) for x ∈ (0, ∞), etc.
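A quick numerical check of the first of these formulas (our own sketch, not from the text; the point x0 = 8 is chosen only because the answer is clean): at x0 = 8 the formula gives 1/(3 · 8^{2/3}) = 1/12, and a difference quotient for x^{1/3} agrees.

```python
x0 = 8.0
h = 1e-6
cube_root = lambda x: x ** (1.0 / 3.0)       # fine for x >= 0
numeric = (cube_root(x0 + h) - cube_root(x0 - h)) / (2 * h)
formula = 1.0 / (3.0 * cube_root(x0) ** 2)   # 1/(3 * x0^(2/3)) = 1/12 here
print(numeric, formula)
```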
We next include an example that contains an interesting limit and the very important application of that limit that gives the derivatives of the trig functions.
Example 6.2.3 (a) Prove that lim_{θ→0} sin θ/θ = 1.
(b) Prove that lim_{θ→0} (1 − cos θ)/θ = 0.
(c) Show that d/dx sin x = cos x.
Solution: (a) In Figure 6.2.1 we notice that given that angle ∠POA is θ, then |OB| = cos θ, |BP| = sin θ and |AQ| = tan θ. We also note that the area of triangle △OAP is (1/2) sin θ, the area of the sector OAP is (1/2)θ and the area of triangle △OAQ is (1/2) tan θ (remember that |OA| = 1). Also, the area of triangle △OAP is less than the area of sector OAP, which is less than the area of triangle △OAQ, i.e. we have that

(1/2) sin θ < (1/2)θ < (1/2) tan θ  or  1 < θ/sin θ < 1/cos θ.
Inverting these inequalities very carefully gives us that cos θ < sin θ/θ < 1. Since lim_{θ→0} cos θ = 1 (Example 5.2.3-(c)) and lim_{θ→0} 1 = 1, we can apply the Sandwich Theorem, Proposition 4.3.4, to see that lim_{θ→0} sin θ/θ = 1.
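The two halves of the sandwich cos θ < sin θ/θ < 1 can also be observed numerically. The sketch below is our own check, not part of the text; the sample values of θ are arbitrary small positive numbers.

```python
import math

# check cos t < sin(t)/t < 1 for several small positive t,
# and watch sin(t)/t approach 1
for t in [0.5, 0.1, 0.01, 0.001]:
    r = math.sin(t) / t
    assert math.cos(t) < r < 1.0
    print(t, r)                # the ratios climb toward 1
```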
Of course the derivatives of the rest of the trig functions follow from the derivative of the sine function.
In Example 4.2.7 we showed that the limit as x → 0 of the function

f1(x) = sin(1/x) if x ≠ 0,  f1(x) = 0 if x = 0

does not exist.
HW 6.2.2 Compute d/dx x^{1/3}.

HW 6.2.3 Compute lim_{θ→0} sin 3θ/sin 5θ.
HW 6.2.4 Consider the function f(x) = x² if x ∈ [−1, 1] ∩ Q, f(x) = −x² if x ∈ [−1, 1] ∩ I.
(a) Is f differentiable at x ∈ [−1, 1], x ≠ 0? If so, compute f′(x).
(b) Is f differentiable at x = 0? If so, compute f′(0).
Proposition 6.3.1 Suppose the function f is such that f : (a, b) → R for some a, b ∈ R, a < b, and suppose that f is differentiable at x0 ∈ (a, b). If f has a maximum or minimum at the point (x0, f(x0)), then f′(x0) = 0.
Proof: Consider the case where (x0, f(x0)) is a maximum. Let N ⊂ (a, b) be a neighborhood of x0 so that f(x) ≤ f(x0) for all x ∈ N. Since lim_{x→x0} [f(x) − f(x0)]/(x − x0) exists (f is differentiable at x0), by Proposition 4.4.8 both lim_{x→x0−} [f(x) − f(x0)]/(x − x0) and lim_{x→x0+} [f(x) − f(x0)]/(x − x0) exist and are equal. If x ∈ N and x < x0, then x − x0 < 0 and f(x) − f(x0) ≤ 0, and hence [f(x) − f(x0)]/(x − x0) ≥ 0 (the slope is positive going uphill).
Claim 1: lim_{x→x0−} [f(x) − f(x0)]/(x − x0) ≥ 0. We prove this claim by contradiction. We assume that lim_{x→x0−} [f(x) − f(x0)]/(x − x0) < 0, i.e. there exists L < 0 such that for every ε > 0 there exists δ such that 0 < x0 − x < δ implies |[f(x) − f(x0)]/(x − x0) − L| < ε. Choose ε = |L|/2. Then there exists a δ such that 0 < x0 − x < δ implies that |[f(x) − f(x0)]/(x − x0) − L| < |L|/2, or −|L|/2 + L < [f(x) − f(x0)]/(x − x0) < |L|/2 + L. Since L < 0, L + |L|/2 = L/2 = −|L|/2, so [f(x) − f(x0)]/(x − x0) < −|L|/2. This contradicts the fact that [f(x) − f(x0)]/(x − x0) ≥ 0 for x ∈ (x0 − δ, x0) ∩ N, so we know that lim_{x→x0−} [f(x) − f(x0)]/(x − x0) ≥ 0.
Claim 2: lim_{x→x0+} [f(x) − f(x0)]/(x − x0) ≤ 0. This proof is very much like the last case. We show that if x ∈ N and x > x0, then [f(x) − f(x0)]/(x − x0) ≤ 0 (the slope is negative going downhill). We then assume that lim_{x→x0+} [f(x) − f(x0)]/(x − x0) = L > 0, apply the definition of the right hand limit with ε = L/2 and arrive at a contradiction. Therefore lim_{x→x0+} [f(x) − f(x0)]/(x − x0) ≤ 0.
Since lim_{x→x0−} [f(x) − f(x0)]/(x − x0) ≥ 0 and lim_{x→x0+} [f(x) − f(x0)]/(x − x0) ≤ 0 and they are equal, they both must be zero. Therefore lim_{x→x0} [f(x) − f(x0)]/(x − x0) = 0, i.e. f′(x0) = 0.
We can prove the analogous result for a minimum using a completely similar argument or by considering the function −f. If f has a minimum at x0, then −f will have a maximum at x0.
Of course Proposition 6.3.1 gives us a powerful tool for finding local maximums and minimums. We solve f′(x) = 0—we called these points critical points in our basic course. This gives us all of the maximums and minimums at points at which f is differentiable, and probably a few extras. We then develop methods (which we will discuss more later) to determine which of these critical points are actually maximums or minimums. Then, if we also consider any points at which f is not differentiable and maybe some endpoints, we have the maximums and minimums.
However, at this time we want Proposition 6.3.1 to help us prove the following theorem, commonly referred to as Rolle's Theorem.
Theorem 6.3.2 (Rolle's Theorem) Suppose f is continuous on [a, b], differentiable on (a, b) and f(a) = f(b). Then there exists ξ ∈ (a, b) such that f′(ξ) = 0.
Proof: We know from Theorem 5.3.8 that there exist x0, y0 ∈ [a, b] such that f(x0) = glb{f(x) : x ∈ [a, b]} and f(y0) = lub{f(x) : x ∈ [a, b]}. If both x0 and y0 are endpoints of [a, b], then (since f(a) = f(b)) f is constant on [a, b] and f′(x) = 0 for x ∈ (a, b), so we can choose ξ = (a + b)/2. Otherwise, either x0 or y0 is in (a, b), say x0 ∈ (a, b). Then by Proposition 6.3.1 we know that f′(x0) = 0, i.e. we can set ξ = x0.
The real reason that we want Rolle’s Theorem is to help us prove the Mean
Value Theorem, abbreviated by MVT, which is a very important result.
Figure 6.3.1: Plot of a function f , the line from (a, f (a)) to (b, f (b)) and the
tangent line at x = ξ.
Proof: Define h by h(x) = f (x) − g(x). If we then apply Corollary 6.3.4 to the
function h, we see that h is constant on I. This is what we wanted to prove.
As we stated earlier you should recall that you used both of the above results
often in your basic course. We next give the results that relate increasing-
decreasing functions with their derivatives. Recall that in Definition 5.4.3 we
defined increasing and decreasing, and strictly increasing and strictly decreasing
functions. We state the following corollary.
We should realize that the application of Corollary 6.3.6 along with Proposition 6.3.1 gives us a method for categorizing the maximums and minimums of a function. We use Proposition 6.3.1 (along with listing the points where the derivative does not exist) to find the potential maximums and minimums, the critical points. We handle points at which the function is not defined separately. We then evaluate f′ at one point in the interval between each pair of adjacent critical points to determine whether f is strictly increasing or decreasing in that interval—if we have all critical points listed, the sign of f′ cannot change in the interval.
We then classify the critical point as a maximum if the curve of the function
is increasing to the left of the critical point and decreasing to the right of the
critical point. We classify the critical point as a minimum if the curve of the
function is decreasing to the left of the critical point and increasing to the right
of the critical point.
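The procedure just described can be sketched in a few lines. In the following, the function f(x) = x³ − 3x, the sample offset eps, and the helper name classify are our own choices, not from the text; f′(x) = 3x² − 3 gives critical points x = ±1, and the sign of f′ on either side classifies each one.

```python
# First-derivative test, sketched for f(x) = x^3 - 3x (our own example).
# f'(x) = 3x^2 - 3, so the critical points are x = -1 and x = 1.
fprime = lambda x: 3 * x**2 - 3

def classify(c, eps=0.5):
    # evaluate f' on each side of the critical point c; eps must be
    # small enough that no other critical point lies in (c-eps, c+eps)
    left, right = fprime(c - eps), fprime(c + eps)
    if left > 0 > right:
        return "maximum"
    if left < 0 < right:
        return "minimum"
    return "neither"

print(classify(-1.0), classify(1.0))   # maximum minimum
```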
Also we get a very useful result from Corollary 6.3.6. From Proposition 5.4.8
we saw that if a function f is strictly monotone on the domain of the function,
then the function is one-to-one. Then from Corollary 6.3.6 and Proposition 5.4.8
we obtain the following useful result.
1/f′(x0) = 1/(lim_{x→x0} [f(x) − f(x0)]/(x − x0)) = lim_{x→x0} 1/([f(x) − f(x0)]/(x − x0)) = lim_{x→x0} (x − x0)/(f(x) − f(x0)).
Thus for every ε > 0 there exists δ such that 0 < |x − x0| < δ implies

|(x − x0)/(f(x) − f(x0)) − 1/f′(x0)| < ε. (6.3.2)

Therefore we have that for every ε > 0 there exists an η such that 0 < |y − y0| < η implies |(g(y) − g(y0))/(y − y0) − 1/f′(x0)| < ε, or lim_{y→y0} (g(y) − g(y0))/(y − y0) = 1/f′(x0)—which is what we were to prove.
We next give a nice application of Proposition 6.3.8. In Section 5.6 we used Proposition 5.4.8 to define y^{1/n}, we used Proposition 5.4.11 to show that the function y^{1/n} is continuous on [0, ∞), and then used the composition of y^{1/n} and x^m to define x^r, r ∈ Q, and show that x^r is continuous. We now want to extend
these results to show that x^r is differentiable. To do so in the most pleasant way we return to Example 5.6.1 and consider √y. (Recall that in Example 6.2.2 we proved that d/dy √y = 1/(2√y). We did so using more elementary methods—methods that did not extend as nicely to y^{1/n}, and that weren't as nice as these.)
Example 6.3.1 Consider the function f(x) = x² on D = (0, ∞). Show that f⁻¹(y) = √y is differentiable on (0, ∞) and that d/dy √y = 1/(2√y) = (1/2) y^{−1/2}.
Solution: We should recall that we already know that f is invertible, that f⁻¹ is continuous on (0, ∞) and that f⁻¹(y) = √y. We see that it is very easy to apply Proposition 6.3.8 to f—we know that f : I = (0, ∞) → R, I is surely an interval, and f is one-to-one and continuous on I. We let y0 be an arbitrary element of (0, ∞) and let x0 ∈ (0, ∞) be such that y0 = f(x0) = x0², i.e. x0 = √y0. Then we know from Proposition 6.3.8 that f⁻¹ is differentiable at y0 and (f⁻¹)′(y0) = 1/f′(x0) = 1/(2x0) = 1/(2√y0). This is what we were to prove.
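The conclusion of Example 6.3.1 is easy to check numerically. The sketch below is our own check, not part of the text; the point y0 = 2.5 is an arbitrary choice. It compares a difference quotient for √y at y0 with 1/f′(x0) = 1/(2x0), where x0 = √y0.

```python
import math

y0 = 2.5
x0 = math.sqrt(y0)              # x0 satisfies f(x0) = x0^2 = y0
h = 1e-6
numeric = (math.sqrt(y0 + h) - math.sqrt(y0 - h)) / (2 * h)
via_inverse = 1.0 / (2.0 * x0)  # 1/f'(x0) with f(x) = x^2
print(numeric, via_inverse)
```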
Of course the next step is to apply the Chain Rule, Proposition 6.1.4, to prove that for r ∈ Q, r = m/n, the function x^r is differentiable and (x^r)′ = r x^{r−1}.
The inverse trig functions provide us with another nice application of the
Propositions 5.4.11 and 6.3.8. In our basic functions course some time after
defining the trig functions we defined the inverse trig functions—for the sine function, probably very intuitively, as θ = sin⁻¹ x, the “angle whose sine is x.” The interesting part, and sometimes the tough part, is that the sine function is
not one-to-one. We can get around this problem easily. Suppose for the moment
we write sin x to denote the sine function defined on R and define the restriction
of sin to [−π/2, π/2] by Sin x—this is a temporary, uncommon notation used to
make the point. It should be reasonably clear that though sin is not one-to-one,
Sin is one-to-one—we have restricted the domain, just as you did in your basic
course, so as to make the restriction one-to-one. We then have the following
result.
Example 6.3.3 Consider the function f = Sin : D = [−π/2, π/2] → R where f(x) = Sin x = sin x. Show that f⁻¹ exists on f(D) = [−1, 1], f⁻¹ is continuous on [−1, 1], f⁻¹ is differentiable on (−1, 1) and (f⁻¹)′(x0) = 1/√(1 − x0²) for x0 ∈ (−1, 1).
Solution: Above we stated that it was “reasonably clear” that Sin is one-to-one. We also need to know that the Sin function is monotone. Since we know that d/dx sin x = cos x and cos x > 0 on (−π/2, π/2), by Corollaries 6.3.6 and 6.3.7 we know that the Sin function is strictly increasing and one-to-one on the interval (−π/2, π/2). Since we also know that the sine function does not equal ±1 on the open interval (−π/2, π/2), we can include the end points to see that the function Sin is strictly increasing and one-to-one on [−π/2, π/2].
Since f is strictly monotone, we can apply Proposition 5.4.11 to see that f⁻¹ is continuous on f(D) = [−1, 1]. We know from Example 5.2.3-(d) that the sine function is continuous on R—hence f(x) = Sin x is continuous on D = [−π/2, π/2]. Then, since we already know that f is one-to-one, by Proposition 6.3.8 f⁻¹ = sin⁻¹ is differentiable on (−1, 1), and for x ∈ (−1, 1) and θ ∈ [−π/2, π/2] such that sin θ = x, we have d/dx sin⁻¹ x = 1/f′(θ) = 1/cos θ. We know that cos θ = ±√(1 − sin²θ). Because for θ ∈ [−π/2, π/2] we know that cos θ ≥ 0, we have cos θ = √(1 − sin²θ). Also sin θ = x, so we have d/dx sin⁻¹ x = 1/√(1 − x²)—the formula you learned in your basic course.
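Again the formula can be checked numerically. The sketch below is our own check, not from the text; it uses the library function asin for sin⁻¹ and compares a difference quotient at the arbitrarily chosen point x0 = 0.3 with 1/√(1 − x0²).

```python
import math

x0 = 0.3
h = 1e-7
numeric = (math.asin(x0 + h) - math.asin(x0 - h)) / (2 * h)
formula = 1.0 / math.sqrt(1.0 - x0 * x0)
print(numeric, formula)
```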
Of course we know that in this case f −1 is usually written as sin−1 . You should
be careful with this notation. In some texts—usually old ones—they will write
sin−1 as the inverse of sin (not a function since sin is not one-to-one) and Arcsin
as the inverse of Sin. This is nice notation because it emphasizes the fact that
sin is not one-to-one but it doesn’t seem to be used much anymore. Just use the
notation sin−1 to denote the inverse of the Sin function and never talk about
the inverse of the sin function—because the inverse isn’t even a function. But
be careful.
Also we should realize that we could next consider the cosine, tangent, secant, etc. functions. Just like the sine function, none of these functions is one-to-one. Thus we restrict the domain as you did in your basic class (sometimes different from the domain used to define sin⁻¹) and proceed as we did in Example 6.3.3. We emphasize that having different domains for these functions can make things difficult when we have more than one of them interacting with each other. Also we must be careful when using a calculator.
HW 6.3.2 Consider the function f(x) = |x| for x ∈ R. Show that if a < 0 < b, then there is no ξ ∈ (a, b) such that f(b) − f(a) = f′(ξ)(b − a). Why does this not contradict the Mean Value Theorem?
We know that because f and g are differentiable at x0, the limits in both the numerator and denominator exist. Also, the limit in the denominator is nonzero. Thus

lim_{x→x0} f(x)/g(x) = (lim_{x→x0} [f(x) − f(x0)]/(x − x0)) / (lim_{x→x0} [g(x) − g(x0)]/(x − x0)) = f′(x0)/g′(x0).
Proof: We prove this theorem very much as we proved the Mean Value Theorem—we use a trick and Rolle's Theorem, Theorem 6.3.2. We define a function h by h(x) = f(x) − m g(x) where m = [f(b) − f(a)]/[g(b) − g(a)]. Then it is easy to see that h(a) = h(b), h is continuous on [a, b] and h is differentiable on (a, b). Thus by Rolle's Theorem, Theorem 6.3.2, there exists ξ ∈ (a, b) such that h′(ξ) = 0. Since h′(x) = f′(x) − m g′(x) and g′(x) ≠ 0 on (a, b), we have the desired result.
and apply the CMVT to the left hand side. Equation (6.4.1) is easily seen to
be true by simplifying the right hand side.
We should also note that in our applications of the CMVT it is always the case that g(x) ≠ g(y)—because we will always assume that g′(x) ≠ 0 on our interval I, the Mean Value Theorem, Theorem 6.3.3, implies that g(x) − g(y) = g′(ξ)(x − y) for some ξ ∈ (y, x). Thus g(x) − g(y) ≠ 0.
And finally, another operation we will use often is the following. Again, in part (a) of Proposition 6.4.3 we will have

|[f(x)/g(x) − f(y)/g(x)] / [1 − g(y)/g(x)] − A| < ε

where x is fixed and f and g approach zero as y → c+. We let y → c+ and get |[f(x)/g(x) − 0]/[1 − 0] − A| ≤ ε, i.e. |f(x)/g(x) − A| ≤ ε. We should realize that this follows from HW 4.1.4.
Since some flavor of each of the above statements will appear in each proof,
we thought that we’d belabor the idea once and in the proofs just proceed as if
we know what we’re doing. Thus we begin with the following result where we
consider several of the possibilities when x is approaching a real number from
the right hand side.
Proposition 6.4.3 (L'Hospital's Rule) Suppose that f, g : I = (c, a) → R, c, a ∈ R, where f, g are differentiable on I and g is assumed to satisfy g(x) ≠ 0, g′(x) ≠ 0 on I. We then have the following results.
(a) If lim_{x→c+} f(x) = lim_{x→c+} g(x) = 0 and lim_{x→c+} f′(x)/g′(x) = A ∈ R, then lim_{x→c+} f(x)/g(x) = A.
(b) If lim_{x→c+} f(x) = lim_{x→c+} g(x) = 0 and lim_{x→c+} f′(x)/g′(x) = ∞, then lim_{x→c+} f(x)/g(x) = ∞.
(c) If lim_{x→c+} g(x) = ∞ and lim_{x→c+} f′(x)/g′(x) = A ∈ R, then lim_{x→c+} f(x)/g(x) = A.
(d) If lim_{x→c+} g(x) = ∞ and lim_{x→c+} f′(x)/g′(x) = ∞, then lim_{x→c+} f(x)/g(x) = ∞.
Proof: (a) Suppose ε > 0 is given and let ε1 = ε/2. Since lim_{x→c+} f′(x)/g′(x) = A, for ε1 given we know that there exists a δ such that ξ ∈ (c, c + δ) implies that |f′(ξ)/g′(ξ) − A| < ε1. Choose x, y ∈ (c, c + δ) such that y < x. By the CMVT there exists ξy ∈ (y, x) such that [f(x) − f(y)]/[g(x) − g(y)] = f′(ξy)/g′(ξy), so |[f(x) − f(y)]/[g(x) − g(y)] − A| < ε1. We then let y → c+, noting that ξy ∈ (c, c + δ) for all of the y's, and get |f(x)/g(x) − A| ≤ ε1 = ε/2 < ε for any x ∈ (c, c + δ). Therefore lim_{x→c+} f(x)/g(x) = A.
You might notice that this proof isn’t too different from the proof of Propo-
sition 6.4.1, except in this case since we cannot evaluate f or g at x = c, we use
x and y and then let y → c+. Otherwise the proofs are really very similar.
(b) Suppose K > 0 is given—this time we are proving that a limit is infinite, so we begin with K in place of the traditional ε. Let K1 = 2K. Since lim_{x→c+} f′(x)/g′(x) = ∞, there exists a δ such that ξ ∈ (c, c + δ) implies that f′(ξ)/g′(ξ) > K1. Choose x, y ∈ (c, c + δ) with y < x. By the CMVT there exists ξy ∈ (y, x) such that [f(x) − f(y)]/[g(x) − g(y)] = f′(ξy)/g′(ξy). Then we have

[f(x)/g(x) − f(y)/g(x)] / [1 − g(y)/g(x)] = [f(x) − f(y)]/[g(x) − g(y)] = f′(ξy)/g′(ξy) > K1

—since ξy ∈ (y, x) ⊂ (c, c + δ)—and it's true for any y and ξy as long as y < x. Let y → c+ and get f(x)/g(x) ≥ K1 = 2K > K for all x ∈ (c, c + δ). Thus lim_{x→c+} f(x)/g(x) = ∞.
(c) The proof of statement (c) is difficult. We feel that it is important for you
to know that there is a rigorous proof. We also feel that it is important that
you are able to read and understand such a proof—even when it is tough. We
proceed.
Suppose ε > 0 is given. Let ε1 = min{1, ε/2}. Since lim_{x→c+} f′(x)/g′(x) = A, for ε1 > 0 given, there exists δ1 such that ξ ∈ (c, c + δ1) implies |f′(ξ)/g′(ξ) − A| < ε1. Choose x, y ∈ I such that y < x < c + δ1. Then, since ξy ∈ (y, x) ⊂ (c, c + δ1),

|[f(y)/g(y) − f(x)/g(y)] / [1 − g(x)/g(y)] − A| = |[f(y) − f(x)]/[g(y) − g(x)] − A| = |f′(ξy)/g′(ξy) − A| < ε1. (6.4.2)
Since lim_{y→c+} g(y) = ∞, for a fixed x there exists δ4 such that y ∈ (c, c + δ4) implies that 1 − g(x)/g(y) > 0. Set ε2 = ε1/(2[2 + |A|]). Again using the fact that lim_{y→c+} g(y) = ∞, there exists δ5 such that y ∈ (c, c + δ5) implies that g(y) > |f(x)|/ε2, and there exists δ6 such that y ∈ (c, c + δ6) implies that g(y) > |g(x)|/ε2. Let δ = min{δ1, δ4, δ5, δ6}. Then y ∈ (c, c + δ) implies that |f(x)|/g(y) < ε2 and |g(x)|/g(y) < ε2.
Now from (6.4.2) we have

−ε1 + A < [f(y)/g(y) − f(x)/g(y)] / [1 − g(x)/g(y)] < A + ε1

or

(−ε1 + A)(1 − g(x)/g(y)) + f(x)/g(y) < f(y)/g(y) < (ε1 + A)(1 − g(x)/g(y)) + f(x)/g(y). (6.4.3)
Note that |g(x)|/g(y) < ε2 implies that −ε2 < g(x)/g(y) < ε2. From this we see that 1 − g(x)/g(y) > 1 − ε2 and 1 − g(x)/g(y) < 1 + ε2. Also, |f(x)|/g(y) < ε2 implies that −ε2 < f(x)/g(y) < ε2. Then inequality (6.4.3) gives
(−ε1 + A)(1 − ε2) − ε2 < f(y)/g(y) < (ε1 + A)(1 + ε2) + ε2

or

−ε1 + A − ε2(1 + A − ε1) < f(y)/g(y) < ε1 + A + ε2(1 + A + ε1). (6.4.4)
Using the fact that ε2 = ε1/(2[2 + |A|]) and the fact that ε1 ≤ 1, the extra term on the right hand side of (6.4.4) becomes

ε2(1 + A + ε1) ≤ ε2(2 + A) ≤ ε2[2 + |A|] = ε1/2 < ε1.

The extra term on the left side of (6.4.4) (without the minus sign) becomes

ε2(1 + A − ε1) < ε2[1 + A] ≤ ε2[1 + |A|] = (ε1/2) · (1 + |A|)/(2 + |A|) ≤ ε1/2 < ε1.
Hence

−ε + A ≤ −2ε1 + A = −ε1 + A − ε1 < −ε1 + A − ε2(1 + A − ε1) < f(y)/g(y)

and

f(y)/g(y) < ε1 + A + ε2(1 + A + ε1) < 2ε1 + A ≤ ε + A,

or |f(y)/g(y) − A| < ε for all y ∈ (c, c + δ). Thus lim_{y→c+} f(y)/g(y) = A.
(d) Let K > 0 be given. Let K1 = 2K + 1. Since lim_{x→c+} f′(x)/g′(x) = ∞, there exists a δ1 such that ξ ∈ (c, c + δ1) implies that f′(ξ)/g′(ξ) > K1. Choose x, y ∈ (c, c + δ1) such that y < x. Then by the CMVT there exists ξy ∈ (y, x) such that

[f(y)/g(y) − f(x)/g(y)] / [1 − g(x)/g(y)] = [f(x) − f(y)]/[g(x) − g(y)] = f′(ξy)/g′(ξy) > K1. (6.4.5)

Solving (6.4.5) for f(y)/g(y) and using the fact that lim_{y→c+} g(y) = ∞ (so that, for the fixed x, g(x)/g(y) → 0 and f(x)/g(y) → 0), we see that for y sufficiently close to c, f(y)/g(y) > K1 − 1 = 2K > K. Therefore lim_{y→c+} f(y)/g(y) = ∞.
We next state but do not prove a version of l'Hospital's Rule for the case when x approaches −∞. Originally we had included the proofs in the text but then decided that it did no good to include these proofs if no one reads them. They are tough. If you are interested in the proofs see Advanced Calculus, Robert C. James.
(b) If lim_{x→−∞} f(x) = lim_{x→−∞} g(x) = 0 and lim_{x→−∞} f′(x)/g′(x) = ∞, then lim_{x→−∞} f(x)/g(x) = ∞.
(c) If lim_{x→−∞} g(x) = ∞ and lim_{x→−∞} f′(x)/g′(x) = A ∈ R, then lim_{x→−∞} f(x)/g(x) = A.
(d) If lim_{x→−∞} g(x) = ∞ and lim_{x→−∞} f′(x)/g′(x) = ∞, then lim_{x→−∞} f(x)/g(x) = ∞.
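A standard numerical illustration of the 0/0 case (our own example, not from the text): for f(x) = 1 − cos x and g(x) = x², we have f′(x)/g′(x) = sin x/(2x) → 1/2 as x → 0+, so l'Hospital's Rule predicts f(x)/g(x) → 1/2, and the computed ratios agree.

```python
import math

f = lambda x: 1.0 - math.cos(x)
g = lambda x: x * x
# f'(x)/g'(x) = sin(x)/(2x) -> 1/2 as x -> 0+, so l'Hospital's Rule
# predicts f(x)/g(x) -> 1/2 as well
for x in [0.1, 0.01, 0.001]:
    print(x, f(x) / g(x))      # the ratios close in on 0.5
```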
Integration
(b) Define L(f, P) = Σ_{i=1}^{n} m_i (x_i − x_{i−1}) and U(f, P) = Σ_{i=1}^{n} M_i (x_i − x_{i−1}). The values L(f, P) and U(f, P) are called the lower and upper Darboux sums of f based on P, respectively, or just the lower and upper sums.
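The definition translates directly into a computation. The sketch below is our own illustration, not from the text; the exact m_i and M_i are approximated by sampling each subinterval densely (an assumption that is harmless here, since for the monotone test function f(x) = x the extremes sit at the subinterval endpoints). With the uniform partition of [0, 1] and n = 10, the sums come out to L(f, P) = (n − 1)/(2n) = 0.45 and U(f, P) = (n + 1)/(2n) = 0.55.

```python
def darboux_sums(f, partition, samples=200):
    # L(f,P) and U(f,P), with m_i and M_i approximated by sampling
    # each subinterval densely (exact when the extremes occur at
    # sample points, e.g. for monotone f)
    L = U = 0.0
    for a, b in zip(partition, partition[1:]):
        vals = [f(a + (b - a) * k / samples) for k in range(samples + 1)]
        L += min(vals) * (b - a)
        U += max(vals) * (b - a)
    return L, U

n = 10
P = [i / n for i in range(n + 1)]          # uniform partition of [0, 1]
L, U = darboux_sums(lambda x: x, P)
print(L, U)   # (n-1)/(2n) = 0.45 and (n+1)/(2n) = 0.55
```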
We notice in Figure 7.1.1 that for the given partition P, L(f, P) (represented by the area under the thin horizontal lines) gives a lower approximation for the area under the curve y = f(x), a ≤ x ≤ b, and U(f, P) (represented by the area under the thick horizontal lines) gives an upper approximation for the area under the curve y = f(x), a ≤ x ≤ b. Note that when we use the word “approximation”, we do not mean that it necessarily provides an accurate approximation. You should realize that if we include more points in the partition, the values L(f, P) and U(f, P) will provide better approximations of the area compared to the approximation given with respect to the partition pictured—add a point to the partition in the figure, draw the new version of the segments indicating the new upper and lower sums, and note that the new upper and lower sums give values closer to the area under the curve y = f(x).
Figure 7.1.1: Plot of the function y = f (x) on [a, b], a partition indicated on
[a, b] and the step function representing the upper and lower sums, U (f, P ) and
L(f, P ), respectively.
Also

U(f1, P) = Σ_{i=1}^{n} M_i (x_i − x_{i−1}) = Σ_{i=1}^{n} k (x_i − x_{i−1}) = k(b − a).
Example 7.1.2 Consider the function f2 : [0, 1] → R defined by f2(x) = 1 if x ∈ Q ∩ [0, 1], f2(x) = 0 if x ∈ I ∩ [0, 1], and let P = {x0, · · · , xn} be a partition of [0, 1]. Compute L(f2, P) and U(f2, P).
Solution: Recall that f2 is the same function considered in Example 5.2.5. Let [x_{i−1}, x_i] denote a partition interval of partition P and assume that x_{i−1} < x_i. Since by Proposition 1.5.6-(a) there exists a q ∈ Q such that q ∈ (x_{i−1}, x_i), we see that M_i = 1. Also, since by Proposition 1.5.6-(b) there exists p ∈ (x_{i−1}, x_i) such that p ∈ I, we see that m_i = 0. This is true for every i, i = 1, · · · , n. Thus

L(f2, P) = Σ_{i=1}^{n} m_i (x_i − x_{i−1}) = Σ_{i=1}^{n} 0 · (x_i − x_{i−1}) = 0

and

U(f2, P) = Σ_{i=1}^{n} M_i (x_i − x_{i−1}) = Σ_{i=1}^{n} 1 · (x_i − x_{i−1}) = 1.
and

U(f3, P) = Σ_{i=1}^{n} M_i (x_i − x_{i−1}) = Σ_{i=1}^{n} (2x_i + 3)(x_i − x_{i−1}) = 2 Σ_{i=1}^{n} x_i² − 2 Σ_{i=1}^{n} x_i x_{i−1} + 3.
and

U(f3, P1) = Σ_{i=1}^{n} M_i (x_i − x_{i−1}) = Σ_{i=1}^{n} (2(i/n) + 3)((i/n) − ((i − 1)/n)) = (2/n²) Σ_{i=1}^{n} i + 3 = (2/n²) · n(n + 1)/2 + 3 = (n + 1)/n + 3.

These are much nicer expressions. You'd think they'd be useful for something—but remember that this is a very nice and specific partition.
We now proceed to develop some results comparing and relating the upper
and lower sums. Because it is clear that mi ≤ Mi , i = 1, · · · , n, we obtain the
following result.
Proposition 7.1.2 Suppose f : [a, b] → R where f is bounded on [a, b] and P
is a partition of [a, b]. Then L(f, P ) ≤ U (f, P ).
Remember that we want ∫_a^b f to give us the area under the curve. If we look at Figure 7.1.1 again, we note that for that to be true we at least must define ∫_a^b f so that for any partition P, L(f, P) ≤ ∫_a^b f ≤ U(f, P), i.e. we must squeeze ∫_a^b f into the inequality given in Proposition 7.1.2.
We next state and prove an easy but necessary lemma for our later work.
Lemma 7.1.3 Suppose that f : [a, b] → R is bounded such that m ≤ f (x) ≤ M
for all x ∈ [a, b]. Then for any partition P of [a, b], m(b − a) ≤ L(f, P ) and
U (f, P ) ≤ M (b − a).
which is one of the inequalities that we were to prove. The other inequality
follows in the same manner.
One of the problems that we have is that we have defined the upper and lower Darboux sums for a particular partition. We have already discussed the fact that if we make the partition finer, the upper and lower sums give us better approximations of the area—the value that we want for ∫_a^b f. We must have ways to connect the sums for different partitions. The next few definitions and propositions do this job for us. We begin with the following definition.
Definition 7.1.4 Let P and P ∗ be partitions of [a, b] given by P = {x0 , · · · , xn }
and P ∗ = {y0 , · · · , ym }. If P ⊂ P ∗ , then P ∗ is said to be a refinement of P .
Note that since the partitions P and P ∗ (and any other partitions that we may
define) are given as a set of points in [a, b], the set containment definition above
makes sense.
We should note that the easiest way to get a simple refinement of the partition P is to add one point, i.e. if P = {x0, · · · , xn}, choose a point yI ∈ (x_{I−1}, x_I) and let P* = {x0, · · · , x_{I−1}, yI, xI, · · · , xn}. We should then realize that if P* is a refinement of the partition P, it is possible to consider a sequence of one-point refinements P0, · · · , Pk such that P0 = P, Pk = P*, and Pj is a one-point refinement of P_{j−1} for j = 1, · · · , k. The construction consists of choosing P0 = P and adding one of the points of P* − P = {x : x ∈ P* and x ∉ P} at each step. This observation makes several of the proofs given below much easier.
We next prove two lemmas, the first that relates the upper and lower sums
on a partition and refinements of that partition and the second that relates the
lower sum and upper sums with respect to different partitions.
Lemma 7.1.5 Suppose f : [a, b] → R where f is bounded on [a, b], P is a
partition of [a, b] and P ∗ is a refinement of P . Then L(f, P ) ≤ L(f, P ∗ ) and
U (f, P ∗ ) ≤ U (f, P ).
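Lemma 7.1.5 is easy to observe numerically. The sketch below is our own example, not part of the text; the sums are computed by dense sampling of each subinterval, which is exact here because f(x) = x² is increasing on [0, 1]. It compares the sums for a partition P and a one-point refinement P*.

```python
def sums(f, P, k=400):
    # approximate lower/upper Darboux sums by dense sampling
    L = U = 0.0
    for a, b in zip(P, P[1:]):
        v = [f(a + (b - a) * j / k) for j in range(k + 1)]
        L += min(v) * (b - a)
        U += max(v) * (b - a)
    return L, U

f = lambda x: x * x
P      = [0.0, 0.25, 0.5, 1.0]
P_star = [0.0, 0.25, 0.5, 0.75, 1.0]   # refinement: one extra point
L1, U1 = sums(f, P)
L2, U2 = sums(f, P_star)
print(L1 <= L2, U2 <= U1)   # refining raises L and lowers U
```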
Lemma 7.1.6 Suppose that f : [a, b] → R where f is bounded on [a, b], and
suppose that P1 and P2 are partitions of [a, b]. Then L(f, P1 ) ≤ U (f, P2 ).
If we return to Examples 7.1.1, 7.1.2 and 7.1.3, it's really easy to see that f1 and f2 satisfy L(fj, P) ≤ U(fj, P) for j = 1, 2. It's more difficult to see that Proposition 7.1.2 is satisfied for f3—but it is true because x_{i−1}(x_i − x_{i−1}) ≤ x_i(x_i − x_{i−1}) (sum both sides of the inequality and add 3). Because the upper and lower sums for the functions f1 and f2 are so trivial, it is easy to see that f1 and f2 will satisfy Lemma 7.1.5 for any refinement. Again, because the upper and lower sums for f3 are so complicated, it is difficult to see that f3 will satisfy Lemma 7.1.5—except by approximately reproducing the proof of Lemma 7.1.5 with respect to the function f3 and some particular refined partition. And finally, while it is again easy to see that the functions f1 and f2 will satisfy Lemma 7.1.6 with respect to any two partitions, it is very difficult to see that the upper and lower sums of the function f3 with respect to the partitions P and P1 will satisfy Lemma 7.1.6. It should be clear that if we were to consider upper and lower sums for more complex functions, it would be next to impossible to compare these upper and lower sums—especially with respect to very complex different partitions. It is for that reason that the above lemmas are so important.
HW 7.1.1 (True or False and why)
(a) Suppose f : [0, 1] → R and f (x) > 0 for all x ∈ [0, 1]. Let P be some
partition of [0, 1]. Then L(f, P ) > 0.
(b) Suppose f : [0, 1] → R and let P be some partition of [0, 1]. It is possible
that L(f, P ) < 0 and U (f, P ) > 0.
(c) Suppose that P and P ∗ are partitions of [0, 1]. Then P ∩ P ∗ is a partition
of [0, 1]. Also P ∩ P ∗ is a refinement of both P and P ∗ .
HW 7.1.2 Consider the function f(x) = x and the partition of [0, 1], Pn = {0, 1/n, 2/n, · · · , 1}.
(a) Compute L(f, Pn ) and U (f, Pn ).
(b) Compute U (f, Pn ) − L(f, Pn ).
(c) Show that U (f, Pn ) − L(f, Pn ) > 0.
(d) Compute lim [U (f, Pn ) − L(f, Pn )].
n→∞
HW 7.1.3 Consider the function f(x) = x and the partition of [0, 1], Pn = {0, 1/(2n), 2/(2n), · · · , n/(2n), 1}.
(a) Compute L(f, Pn) and U(f, Pn).
(b) Compute U (f, Pn ) − L(f, Pn ).
(c) Show that U (f, Pn ) − L(f, Pn ) > 0.
(d) Compute lim [U (f, Pn ) − L(f, Pn )].
n→∞
HW 7.1.4 Consider the function f(x) = x if x ∈ Q ∩ [0, 1], f(x) = 0 if x ∈ I ∩ [0, 1], and the partition of [0, 1], Pn = {0, 1/n, 2/n, · · · , 1}.
(a) Compute L(f, Pn ) and U (f, Pn ).
(b) Compute U (f, Pn ) − L(f, Pn ) and lim [U (f, Pn ) − L(f, Pn )].
n→∞
We first note that if P ∗ is any fixed partition of [a, b], then by Lemma 7.1.6
L(f, P ) ≤ U (f, P ∗ ) for any partition P . Because U (f, P ∗ ) is an upper bound
for the set {L(f, P ) : P is a partition of [a, b]} we know that lub{L(f, P ) :
P is a partition of [a, b]} exists. Likewise, L(f, P ∗ ) is a lower bound of the set
{U (f, P ) : P is a partition of [a, b]} so that glb{U (f, P ) : P is a partition of [a, b]}
exists.
If we again return to Example 7.1.1, we see that L(f1, P) = U(f1, P) = k(b − a) for any partition P. Thus the lower and upper integrals of f1 both equal k(b − a). If we consider the function f2 introduced in Example 7.1.2, we see that since L(f2, P) = 0 for any partition P, the lower integral of f2 over [0, 1] is 0, and since U(f2, P) = 1 for any partition P, the upper integral of f2 over [0, 1] is 1. And finally, since L(f3, P) and U(f3, P) from Example 7.1.3 are so complex, it is too difficult to try to use these expressions to determine the lower or upper integral of f3. Even though L(f3, P1) and U(f3, P1) found in Example 7.1.3 are much nicer, the least upper bound and the greatest lower bound in Definition 7.2.1 above must be taken over all partitions of [0, 1], not just a few nice ones—so knowing L(f3, P1) and U(f3, P1) does not help us determine the lower or upper integral of f3 (at least at this time).
We see by Lemma 7.1.5 that as we add points to a partition, the upper sums get smaller. We define the upper integral ∫̄_a^b f to be the glb of these upper sums. Likewise we know that as we add points to a partition, the lower sums get larger. The lower integral ∫̲_a^b f is defined to be the lub of these lower sums. Hence, ∫̲_a^b f and ∫̄_a^b f squeeze in to provide a better lower and upper approximation of the area under the curve y = f(x) from a to b, respectively.
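The squeezing described here can be watched happen for f(x) = x on [0, 1] (our own illustration, not from the text): with the uniform partition Pn, f is increasing, so m_i = (i − 1)/n and M_i = i/n exactly, giving L(f, Pn) = (n − 1)/(2n) and U(f, Pn) = (n + 1)/(2n), both of which approach 1/2.

```python
# For f(x) = x on [0,1] with the uniform partition {0, 1/n, ..., 1},
# f is increasing, so m_i = (i-1)/n and M_i = i/n exactly:
#   L(f,Pn) = (n-1)/(2n),  U(f,Pn) = (n+1)/(2n).
def L(n): return sum((i - 1) / n * (1 / n) for i in range(1, n + 1))
def U(n): return sum(i / n * (1 / n) for i in range(1, n + 1))

for n in [10, 100, 1000]:
    print(n, L(n), U(n))       # both columns close in on 1/2
```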
We next prove a result that our intuition should tell us is obvious.
Proposition 7.2.2 Suppose that f : [a, b] → R, f is bounded on [a, b], and let P be any partition of [a, b]. Then L(f, P) ≤ ∫̲_a^b f ≤ ∫̄_a^b f ≤ U(f, P).
Proof: The first inequality and the last inequality follow from the fact that the lub and the glb must be an upper bound and a lower bound of the set of all possible lower and upper sums, respectively.
Let P1 and P2 be any two partitions of [a, b]. By Lemma 7.1.6 we know that L(f, P2) ≤ U(f, P1). Hence U(f, P1) is an upper bound of the set {L(f, P2) : P2 is a partition of [a, b]}. Therefore

∫̲_a^b f = lub{L(f, P2) : P2 is a partition of [a, b]} ≤ U(f, P1).

Then ∫̲_a^b f is a lower bound of the set {U(f, P1) : P1 is a partition of [a, b]}. Therefore ∫̲_a^b f ≤ glb{U(f, P1) : P1 is a partition of [a, b]} = ∫̄_a^b f, which is what we were to prove.
The upper sums U(f, P) and the upper integral ∫̄_a^b f approximate the area under the curve y = f(x) from above, and the lower sums L(f, P) and the lower integral ∫̲_a^b f approximate the area under the curve y = f(x) from below. Since we want the integral ∫_a^b f to give the area under the curve y = f(x), it is reasonably logical to make the following definition.
Definition 7.2.3 Suppose f : [a, b] → R and f is bounded on [a, b]. We say that f is integrable on [a, b] if ∫̲_a^b f = ∫̄_a^b f. If f is integrable on [a, b], we write ∫_a^b f = ∫̲_a^b f or ∫_a^b f = ∫̄_a^b f.
Z b
We call
f the Darboux integral of f from a to b. We will actually drop
a Z b
the Darboux from here on and refer to f as the integral of f from a to
a
7.2 The Integral 179
b. We want to make it clear that this is the same integral that you studied
in your basic class. We tacked on the ”Darboux” to differentiate it from the
”Riemann” integral that we define in Section 7.6—at which time we immediately
prove that the Riemann and the Darboux integrals are the same. We use the
Darboux definition because it makes some of the proofs easier and because we
feel that it is more intuitive.
Before we discuss the integral we want to emphasize that while we denote the integral of f from a to b by \int_a^b f, the most common notation (especially in the basic course) is to denote the integral by \int_a^b f(x) dx. There are some advantages to this latter notation. The "dx" sort of reminds us that there is an x_i − x_{i−1} in the definition of the upper and lower sums. Also, later when we want to make a change of variables, the "dx" term is very useful for reminding us what we want to do. The notation \int_a^b f(x) dx can, however, be difficult to understand when we study differentials. In our basic course we had a "dx" in the integral and a "dx" as a part of differentials, with no apparent connection. (We surely won't have that problem since we won't discuss differentials.) In any case we will generally use the notation \int_a^b f to denote the integral of f from a to b—though whenever it seems convenient or more clear to use the \int_a^b f(x) dx notation, we will use it.
We return to the statement made just before we gave the definition of the integral, "it is reasonably logical to make the following definition." It wouldn't be a good definition if it were always the case that \underline{\int_a^b} f < \overline{\int_a^b} f—then no function would be integrable. It is only a good definition—and then our logic is affirmed—if for a large set of nice functions we can in fact get equality—and for some functions we do not get equality. For a first glimpse of what we have, we
return to our Examples 7.1.1, 7.1.2 and 7.1.3 from Section 7.1. For the function
f(x) = k defined on the interval [a, b] introduced in Example 7.1.1, we see that \int_a^b f_1 = k(b − a). If we consider the function f_2 introduced in Example 7.1.2—and the subsequent work on f_2—we see that since \underline{\int_0^1} f_2 = 0 < 1 = \overline{\int_0^1} f_2, the function f_2 is not integrable on [0, 1]. And of course, we can't say much about the function f_3.
We want to emphasize that the integral of f on the interval [a, b] is defined only for functions that are bounded on [a, b]. We saw above, in the case of the function f_2, that a function can be bounded and still not integrable. But we should also realize that f_2 is not a nice function; i.e. if a bounded function is pretty nasty, it may not be integrable.
To be able to show that more functions are integrable (in practice, other
than for theoretical purposes, we don’t care too much about the function f2 ),
we need some methods and results. We begin with a very powerful and useful
theorem, the Archimedes-Riemann Theorem.
0 ≤ \overline{\int_a^b} f − \underline{\int_a^b} f ≤ U(f, P_n) − L(f, P_n).

Letting n → ∞ (notice that the two sequences on the left are constant sequences) gives \underline{\int_a^b} f = \overline{\int_a^b} f, so f is integrable on [a, b].
(⇒) We now assume that f is integrable on [a, b]. Then \int_a^b f = \underline{\int_a^b} f = \overline{\int_a^b} f. Since \underline{\int_a^b} f = lub{L(f, P) : P is a partition of [a, b]}, for every n ∈ N there exists a partition of [a, b], P_n^∗, such that \int_a^b f − L(f, P_n^∗) < 1/n. (Recall that by Proposition 1.5.3-(a) for any ε > 0 there exists a partition of [a, b], P, such that \int_a^b f − L(f, P) < ε.) Likewise (using Proposition 1.5.3-(b)) there exists a partition of [a, b], P_n^#, such that U(f, P_n^#) − \int_a^b f < 1/n. Let P_n be the common refinement of P_n^∗ and P_n^#, P_n = P_n^∗ ∪ P_n^#. Doing this construction for each n ∈ N gives us a sequence of partitions of [a, b], {P_n}.
We note that L(f, P_n^∗) ≤ L(f, P_n) and U(f, P_n) ≤ U(f, P_n^#). Thus we have

L(f, P_n) ≥ L(f, P_n^∗) > \int_a^b f − 1/n (7.2.3)

and

U(f, P_n) ≤ U(f, P_n^#) < \int_a^b f + 1/n. (7.2.4)
where "=^∗" is true because of our hypothesis that f is integrable on [a, b] (in which case \underline{\int_a^b} f = \overline{\int_a^b} f). Therefore by Proposition 3.4.2 we have

0 ≤ lim_{n→∞} [U(f, P_n) − L(f, P_n)] ≤ lim_{n→∞} 2/n = 0,

or lim_{n→∞} [U(f, P_n) − L(f, P_n)] = 0, which is what we wanted to prove.
Now let {P_n} be such a sequence of partitions that satisfies the above "if and only if" statement. If we use (7.2.3), the fact that \underline{\int_a^b} f = \int_a^b f (if there is such a sequence of partitions, then f is integrable), and the fact that \int_a^b f ≥ L(f, P_n) for all n, we get 0 ≤ \int_a^b f − L(f, P_n) < 1/n for all n—which implies that lim_{n→∞} L(f, P_n) = \int_a^b f. A similar argument using (7.2.4) can be used to prove that lim_{n→∞} U(f, P_n) = \int_a^b f.
We claimed before we stated this theorem that it was "powerful and useful." If we consider an integral using Definitions 7.2.1 and 7.2.3, we must consider all partitions of [a, b]—this is difficult because there are a lot of partitions. To consider a particular integral using Theorem 7.2.4, we can use only a sequence of partitions—and we can choose a very nice sequence of partitions. For example, when we considered the function f_3 defined by f_3(x) = 2x + 3 in Example 7.1.3, we found that for a general partition the upper and lower sums are not very nice. However, when we considered the partition P_1 = {0, 1/n, 2/n, · · · , n/n = 1}, we found that L(f_3, P_1) = n(n−1)/n^2 + 3 and U(f_3, P_1) = n(n+1)/n^2 + 3. Thus if we define the sequence of partitions {P_n} to be P_n = {0, 1/n, 2/n, · · · , n/n = 1} (the same as P_1 but for all n), we see that U(f_3, P_n) − L(f_3, P_n) = 2/n. Thus it is clear that lim_{n→∞} [U(f_3, P_n) − L(f_3, P_n)] = 0 and that by Theorem 7.2.4 we see that

\int_0^1 f_3 = lim_{n→∞} L(f_3, P_n) = lim_{n→∞} [n(n−1)/n^2 + 3] = 4.
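The computation above can be checked numerically. The sketch below is our own illustration (the helper names are ours, not the text's): it computes the Darboux sums of f_3(x) = 2x + 3 on the uniform partition P_n = {0, 1/n, ..., n/n}, using the fact that f_3 is increasing, so the glb on [x_{i−1}, x_i] is f_3(x_{i−1}) and the lub is f_3(x_i).

```python
# Our own numerical illustration: Darboux sums of f3(x) = 2x + 3 on the
# uniform partition Pn = {0, 1/n, ..., n/n}.

def f3(x):
    return 2 * x + 3

def lower_sum(f, n):
    # L(f, Pn): glb on each subinterval (left endpoint, f increasing) times width 1/n
    return sum(f((i - 1) / n) * (1 / n) for i in range(1, n + 1))

def upper_sum(f, n):
    # U(f, Pn): lub on each subinterval (right endpoint, f increasing) times width 1/n
    return sum(f(i / n) * (1 / n) for i in range(1, n + 1))

for n in (10, 100, 1000):
    L, U = lower_sum(f3, n), upper_sum(f3, n)
    print(n, L, U, U - L)   # U - L = 2/n, and both sums approach 4
```

Both sums squeeze toward 4, matching the limit computed above.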
We should note that since Theorem 7.2.4 is an “if and only if” result with
“f is integrable” on one side, the theorem gives us a result that is equivalent to the
definition. In this case it is very important because the result given by Theorem
7.2.4 is easier to use than the definition. To make this result a bit easier to
discuss we make the following definition.
Definition 7.2.5 Suppose that f : [a, b] → R is such that f is bounded on [a, b].
Let {Pn } be a sequence of partitions of [a, b]. The sequence {Pn } is said to be
an Archimedian sequence of partitions for f on [a, b] if U (f, Pn ) − L(f, Pn ) → 0
as n → ∞.
Of course we can then reword Theorem 7.2.4 as follows: Consider f : [a, b] →
R such that f is bounded on [a, b]. The function f is integrable on [a, b] if
and only if there exists an Archimedian sequence of partitions for f on [a, b].
Also, if there exists an Archimedian sequence of partitions {P_n} for f on [a, b], then

\int_a^b f = lim_{n→∞} U(f, P_n) = lim_{n→∞} L(f, P_n).
Before we leave this section we include another theorem that is only a slight
variation of Theorem 7.2.4 but is sometimes useful—and gives us another char-
acterization of integrability.
Theorem 7.2.6 (Riemann Theorem) Suppose f : [a, b] → R is bounded on [a, b]. Then f is integrable on [a, b] if and only if for every ε > 0 there exists a partition of [a, b], P, such that 0 ≤ U(f, P) − L(f, P) < ε.
Proof: (⇒) If f is integrable on [a, b], we know from the A-R Theorem, Theorem 7.2.4, that there exists an Archimedian sequence of partitions of [a, b], {P_n}, so that U(f, P_n) − L(f, P_n) → 0 as n → ∞, i.e. for every ε > 0 there exists N ∈ R such that n ≥ N implies that |U(f, P_n) − L(f, P_n)| < ε. Let n_0 ∈ N be such that n_0 > N. Then the partition P_{n_0} is such that 0 ≤ U(f, P_{n_0}) − L(f, P_{n_0}) = |U(f, P_{n_0}) − L(f, P_{n_0})| < ε, which is what we were to prove.
(⇐) We are given that for any ε > 0 there exists a partition of [a, b], P, such that 0 ≤ U(f, P) − L(f, P) < ε. Then for each n ∈ N there exists a partition of [a, b], P_n, such that U(f, P_n) − L(f, P_n) < 1/n (letting ε = 1/n). Then

0 ≤ U(f, P_n) − L(f, P_n) < 1/n → 0 as n → ∞.
Therefore by the A-R Theorem, Theorem 7.2.4, f is integrable on [a, b].
It’s obvious from the proof that the Riemann Theorem is only a slight vari-
ation of the A-R Theorem. There are times when it is more convenient to
only have to produce one partition to prove integrability. For those times the
Riemann Theorem is convenient.
HW 7.2.1 (True or False and why)
(a) Suppose f : [0, 1] → R is integrable on [0, 1] and \int_0^1 f = 0. Then for any partition of [0, 1], P, U(f, P) = L(f, P) = 0.
(b) Suppose f : [0, 1] → R is integrable on [0, 1] and \int_0^1 f = 0. Then f(x) = 0 for all x ∈ [0, 1].
(c) Suppose f : [0, 1] → R is integrable on [0, 1]. Then f is continuous on [0, 1].
(d) Suppose f : [0, 1] → R is integrable on [0, 1] and f(x) > 0 for all x ∈ [0, 1]. It is possible that \int_0^1 f < 0.
7.3 Some Integration Topics 183
Proof: Before we start, let us emphasize the approach that we shall use. By
Theorem 7.2.4 if we can find an Archimedian sequence of partitions for f on
[a, b], then we know that the function f is integrable on [a, b]. Thus we work to
find the appropriate sequence of partitions.
Consider the sequence of partitions {P_n} defined for each n ∈ N by

P_n = {x_0 = a, x_1 = a + (1/n)(b − a), · · · , x_{n−1} = a + ((n−1)/n)(b − a), x_n = b}. (7.3.1)

We will use this sequence of partitions often. Notice that it is very regular in that the partition points are equally spaced throughout the interval.
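The uniform partition (7.3.1) is simple to generate. The following sketch is ours (the helper name is hypothetical, not from the text):

```python
# A sketch of the uniform partition (7.3.1).

def uniform_partition(a, b, n):
    # the n + 1 equally spaced points x_i = a + (i/n)(b - a), i = 0, ..., n
    return [a + i * (b - a) / n for i in range(n + 1)]

P = uniform_partition(0.0, 1.0, 4)
print(P)                                        # [0.0, 0.25, 0.5, 0.75, 1.0]
widths = [xr - xl for xl, xr in zip(P, P[1:])]
print(widths)                                   # every subinterval has width (b - a)/n
```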
Proof: Let ε > 0 be given. Clearly since f is continuous on [a, b], by Lemma 5.3.7 we know that f is bounded on [a, b]—thus it makes sense to consider whether f is integrable on [a, b]. Consider the partition P_n = {x_0, x_1, · · · , x_n} where x_i = a + i(b − a)/n, i = 0, · · · , n. Recall that by Proposition 5.5.6, f continuous on [a, b] implies that f is uniformly continuous on [a, b]. Thus for ε/(b − a) > 0 there exists a δ such that |x − y| < δ implies that |f(x) − f(y)| < ε/(b − a). Choose n so that (b − a)/n < δ. (Recall that we know we can find such an n by Corollary 1.5.5-(b).)
Consider the partition interval [x_{i−1}, x_i]. By Theorem 5.3.8 we know that f assumes its absolute minimum and absolute maximum on [x_{i−1}, x_i]. Let (\underline{x}_i, f(\underline{x}_i)) and (\overline{x}_i, f(\overline{x}_i)) denote the absolute minimum and maximum of f on [x_{i−1}, x_i], and note that m_i = f(\underline{x}_i) and M_i = f(\overline{x}_i). Because n was chosen so that (b − a)/n < δ and x_i − x_{i−1} = (b − a)/n, the fact that \underline{x}_i, \overline{x}_i ∈ [x_{i−1}, x_i] implies that |\overline{x}_i − \underline{x}_i| < δ, so by the uniform continuity of f, M_i − m_i = |f(\overline{x}_i) − f(\underline{x}_i)| < ε/(b − a). Then

U(f, P_n) − L(f, P_n) = \sum_{i=1}^n (M_i − m_i)(x_i − x_{i−1}) < (ε/(b − a)) \sum_{i=1}^n (x_i − x_{i−1}) = ε.
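The argument above can be traced numerically on a concrete function. The example below is our own (not the text's): for f(x) = x^2 on [0, 1] we have |f(x) − f(y)| ≤ 2|x − y|, so δ = ε/2 answers ε/(b − a) = ε, and any n with 1/n < δ forces U(f, P_n) − L(f, P_n) < ε.

```python
import math

# Our own example: the continuity argument applied to f(x) = x^2 on [0, 1].

def f(x):
    return x * x   # increasing on [0, 1]: glb at left endpoint, lub at right endpoint

def upper_minus_lower(n):
    U = sum(f(i / n) / n for i in range(1, n + 1))
    L = sum(f((i - 1) / n) / n for i in range(1, n + 1))
    return U - L

for eps in (0.1, 0.01, 0.001):
    n = math.ceil(2 / eps) + 1       # guarantees 1/n < delta = eps/2
    gap = upper_minus_lower(n)
    assert gap < eps                 # U(f, Pn) - L(f, Pn) < eps, as the proof promises
    print(eps, n, gap)
```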
Proof: Since f is integrable on [a, b], we know from Theorem 7.2.4 that there exists an Archimedian sequence of partitions of [a, b], {P_n}, so that U(f, P_n) − L(f, P_n) → 0. Suppose that P_n = {x_0, x_1, · · · , x_n}. Then for each n there will be some k so that x_k ≤ c < x_{k+1}. Define the three partitions P_n' = P_n ∪ {c}, P_n^{[a,c]} = {x_0, · · · , x_k, c} and P_n^{[c,b]} = {c, x_{k+1}, · · · , x_n}—where of course, if x_k = c, no new point is really added, and if x_k < c, the one new point c is added. Of course these constructions are valid for each n.
We note that since P_n' is a refinement of the partition P_n, by Proposition 7.1.2 and Lemma 7.1.5, 0 ≤ U(f, P_n') − L(f, P_n') ≤ U(f, P_n) − L(f, P_n). Then by the Sandwich Theorem, Proposition 3.4.2, U(f, P_n') − L(f, P_n') → 0 as n → ∞ and {P_n'} is an Archimedian sequence of partitions for f on [a, b].
By the definition of P_n', P_n^{[a,c]} and P_n^{[c,b]}, we see that U(f, P_n') = U(f, P_n^{[a,c]}) + U(f, P_n^{[c,b]}) and L(f, P_n') = L(f, P_n^{[a,c]}) + L(f, P_n^{[c,b]}). Then because 0 ≤ U(f, P_n^{[a,c]}) − L(f, P_n^{[a,c]}) ≤ U(f, P_n') − L(f, P_n') and 0 ≤ U(f, P_n^{[c,b]}) − L(f, P_n^{[c,b]}) ≤ U(f, P_n') − L(f, P_n'), both differences tend to zero, and {P_n^{[a,c]}} and {P_n^{[c,b]}} are Archimedian sequences of partitions for f on [a, c] and [c, b], respectively.
We next include several very basic, important results for computing integrals.
Proof: Since f and g are both integrable, there exists an Archimedian sequence of partitions for each of f and g. Let {P_n} be the common refinement of these two sequences (P_n is the union of the nth partitions of each); then U(f, P_n) − L(f, P_n) → 0 and U(g, P_n) − L(g, P_n) → 0 as n → ∞. Suppose P_n = {x_0, · · · , x_n}.
(a) This is a very simple property. As you will see, though, the proof is tough. Hold on and read carefully. Let us begin by defining the following useful notation: M_i^f = lub{f(x) : x ∈ [x_{i−1}, x_i]}, m_i^f = glb{f(x) : x ∈ [x_{i−1}, x_i]}, M_i^{cf} = lub{cf(x) : x ∈ [x_{i−1}, x_i]} and m_i^{cf} = glb{cf(x) : x ∈ [x_{i−1}, x_i]}. To prove part (a) we need a relationship between U(f, P_n) and U(cf, P_n), and between L(f, P_n) and L(cf, P_n).
Case 1: c ≥ 0: We begin by showing that M_i^{cf} = cM_i^f for any i. If c = 0, then it is very easy since M_i^{cf} = m_i^{cf} = cM_i^f = cm_i^f = 0. So we consider c > 0. Since M_i^f is an upper bound of {f(x) : x ∈ [x_{i−1}, x_i]}, then surely cM_i^f is an upper bound of {cf(x) : x ∈ [x_{i−1}, x_i]} and M_i^{cf} ≤ cM_i^f (because M_i^{cf} is the least upper bound of the set). Likewise, since M_i^{cf} is an upper bound of {cf(x) : x ∈ [x_{i−1}, x_i]}, it is clear that M_i^{cf}/c is an upper bound of the set {f(x) : x ∈ [x_{i−1}, x_i]}. Thus M_i^f ≤ M_i^{cf}/c (since M_i^f is the least upper bound of the set), or cM_i^f ≤ M_i^{cf}. Therefore cM_i^f = M_i^{cf}. The proof that cm_i^f = m_i^{cf} is very similar—with glb's replacing lub's.
We then have U(cf, P_n) = cU(f, P_n), L(cf, P_n) = cL(f, P_n) and
are very similar to the proofs given in Case 1, except that c < 0 reverses the inequalities and replaces the M_i by the m_i. For example, since M_i^f = lub{f(x) : x ∈ [x_{i−1}, x_i]}, we have M_i^f ≥ f(x) for all x ∈ [x_{i−1}, x_i]. Thus cM_i^f ≤ cf(x) (remember that c < 0), so cM_i^f is a lower bound of {cf(x) : x ∈ [x_{i−1}, x_i]} and cM_i^f ≤ m_i^{cf} (because m_i^{cf} is the greatest lower bound of the set). Likewise, since m_i^{cf} ≤ cf(x) for all x, m_i^{cf}/c is an upper bound of {f(x) : x ∈ [x_{i−1}, x_i]}, so M_i^f ≤ m_i^{cf}/c (since M_i^f is the least upper bound of the set {f(x) : x ∈ [x_{i−1}, x_i]}) or m_i^{cf} ≤ cM_i^f. Thus m_i^{cf} = cM_i^f.
We then have

U(cf, P_n) = \sum_{i=1}^n M_i^{cf}(x_i − x_{i−1}) = \sum_{i=1}^n cm_i^f(x_i − x_{i−1}) = cL(f, P_n),

L(cf, P_n) = \sum_{i=1}^n m_i^{cf}(x_i − x_{i−1}) = \sum_{i=1}^n cM_i^f(x_i − x_{i−1}) = cU(f, P_n)

and
(we don't really need the extra \int_a^b (f + g) in the inequality). Since the first term and the last term in the inequality are equal, they all must be equal, and \int_a^b (f + g) = \int_a^b f + \int_a^b g (or you could apply the Sandwich Theorem).
(c) Part (c) follows from parts (a) and (b).
The above proof was not necessarily difficult, but it was very technical. However, we must realize that it is the proof of an important theorem.
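The Case 2 identities U(cf, P) = cL(f, P) and L(cf, P) = cU(f, P) for c < 0 can be checked numerically. The sketch below is our own (all helper names are ours); it uses an increasing f, so that the glb and lub on each subinterval sit at its endpoints.

```python
# Our own numerical check of the scaling identities for c < 0.

def f(x):
    return 2 * x + 1   # increasing on [0, 1]

def darboux_sums(g, pts):
    # (L(g, P), U(g, P)) for a g that is monotone on each subinterval,
    # so the min/max over each subinterval are attained at its endpoints
    lo = hi = 0.0
    for xl, xr in zip(pts, pts[1:]):
        vals = (g(xl), g(xr))
        lo += min(vals) * (xr - xl)
        hi += max(vals) * (xr - xl)
    return lo, hi

pts = [i / 8 for i in range(9)]   # uniform partition of [0, 1]
c = -3.0
Lf, Uf = darboux_sums(f, pts)
Lcf, Ucf = darboux_sums(lambda x: c * f(x), pts)
print(Lf, Uf, Lcf, Ucf)
assert abs(Ucf - c * Lf) < 1e-12 and abs(Lcf - c * Uf) < 1e-12
```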
Proof: We will work to find a partition of [a, b] that allows us to apply Riemann's Theorem, Theorem 7.2.6. Let ε > 0 be given, let K = lub{|φ(y)| : y ∈ [c, d]} and set ε_1 = ε/(b − a + 2K) (we'll see why this is the correct choice of ε_1 later). Since φ is continuous on [c, d], φ is uniformly continuous on [c, d]. So given ε_1 > 0 there exists δ such that y_1, y_2 ∈ [c, d] and |y_1 − y_2| < δ implies that |φ(y_1) − φ(y_2)| < ε_1. Choose δ < ε_1—we can always make our δ smaller. Since f is integrable on [a, b], by the Riemann Theorem, Theorem 7.2.6, there exists a partition P such that U(f, P) − L(f, P) < δ^2. (Theorem 7.2.6 said that we could find such a partition P for any ε > 0—we're using δ^2 in place of the ε in the theorem.) Suppose P is given by P = {x_0, · · · , x_n}. The partition P is the partition we want to use in our application of Theorem 7.2.6 to show that φ ◦ f is integrable on [a, b], i.e. we must show that U(φ ◦ f, P) − L(φ ◦ f, P) < ε.
We know that

U(f, P) − L(f, P) = \sum_{i=1}^n (M_i^f − m_i^f)(x_i − x_{i−1}) < δ^2 (7.4.1)

where we will use the notation used in the last theorem: for any F, M_i^F = lub{F(x) : x ∈ [x_{i−1}, x_i]} and m_i^F = glb{F(x) : x ∈ [x_{i−1}, x_i]}.
Since U(f, P) − L(f, P) must be "small", and since M_i^f − m_i^f and x_i − x_{i−1} are both greater than or equal to zero, each term (M_i^f − m_i^f)(x_i − x_{i−1}) must be "small". There are two ways the terms (M_i^f − m_i^f)(x_i − x_{i−1}) can be made "small"—either M_i^f − m_i^f is "small" or x_i − x_{i−1} is "small".
Let S_1 be the set of indices for which M_i^f − m_i^f is "small", i.e. S_1 = {i : 1 ≤ i ≤ n and M_i^f − m_i^f < δ}. Let S_2 = {i : 1 ≤ i ≤ n and M_i^f − m_i^f ≥ δ}. Note that we have now partially defined what we mean by "small", and though we have defined the set S_2 to be the set of indices on which M_i^f − m_i^f is not "small", it is for these partition intervals that we had better have x_i − x_{i−1} be small—because we have decided that M_i^f − m_i^f is not.
We note that

U(φ ◦ f, P) − L(φ ◦ f, P) = \sum_{i=1}^n (M_i^{φ◦f} − m_i^{φ◦f})(x_i − x_{i−1})
= \sum_{i∈S_1} (M_i^{φ◦f} − m_i^{φ◦f})(x_i − x_{i−1}) + \sum_{i∈S_2} (M_i^{φ◦f} − m_i^{φ◦f})(x_i − x_{i−1}). (7.4.2)
so that M_i^{φ◦f} − m_i^{φ◦f} − [φ(f(x^∗)) − φ(f(y^∗))] < ε_3. Thus again by Proposition 1.5.3-(a), M_i^{φ◦f} − m_i^{φ◦f} = lub{φ(f(x)) − φ(f(y)) : x, y ∈ [x_{i−1}, x_i]}.
The proof of Claim 1 is essentially the same—a bit easier.
By Claim 1 and the fact that M_i^f − m_i^f < δ, we see that for any x, y ∈ [x_{i−1}, x_i] we have |f(x) − f(y)| < δ. Then by the uniform continuity of φ, we have |φ(f(x)) − φ(f(y))| < ε_1 for any x, y ∈ [x_{i−1}, x_i]. Therefore M_i^{φ◦f} − m_i^{φ◦f} ≤ ε_1 and we have

\sum_{i∈S_1} (M_i^{φ◦f} − m_i^{φ◦f})(x_i − x_{i−1}) ≤ ε_1 \sum_{i∈S_1} (x_i − x_{i−1}) ≤ ε_1(b − a). (7.4.3)
δ \sum_{i∈S_2} (x_i − x_{i−1}) ≤ \sum_{i∈S_2} (M_i^f − m_i^f)(x_i − x_{i−1}) ≤ \sum_{i=1}^n (M_i^f − m_i^f)(x_i − x_{i−1}) < δ^2

(by (7.4.1)), or \sum_{i∈S_2} (x_i − x_{i−1}) < δ. Thus

U(φ ◦ f, P) − L(φ ◦ f, P) = \sum_{i∈S_1} (M_i^{φ◦f} − m_i^{φ◦f})(x_i − x_{i−1}) + \sum_{i∈S_2} (M_i^{φ◦f} − m_i^{φ◦f})(x_i − x_{i−1}) ≤ ε_1(b − a) + 2K \sum_{i∈S_2} (x_i − x_{i−1}) < ε_1(b − a) + 2Kδ < ε_1(b − a + 2K) = ε.
Proof: The proof of part (a) consists of letting φ(y) = y^2—which is surely continuous everywhere—and applying Proposition 7.4.1. The proof of part (b) is a reasonably nice induction proof using part (a). To obtain part (c) we again use Proposition 7.4.1, this time with φ(y) = |y|. And part (d) follows by noting that fg = (1/4)[(f + g)^2 − (f − g)^2] and using Proposition 7.3.4 along with part (a) of this proposition.
We should realize that we can let φ be any continuous function on [c, d]—which will give us a lot of different integrable composite functions. For example, if we let φ(x) = cx we see that the function cf is integrable if f is integrable—part (a) of Proposition 7.3.4. We next prove two reasonably easy propositions that will give us a lot of interesting, integrable functions.
Proof: (a) Since f is integrable on [a, c] and [c, b], by Theorem 7.2.6 for any ε > 0 there exists a partition P_1 of [a, c] and a partition P_2 of [c, b] such that U(f, P_1) − L(f, P_1) < ε/2 and U(f, P_2) − L(f, P_2) < ε/2. Let P be the partition P = P_1 ∪ P_2. Clearly P is a partition of [a, b] and U(f, P) − L(f, P) = [U(f, P_1) − L(f, P_1)] + [U(f, P_2) − L(f, P_2)] < ε. Therefore by Theorem 7.2.6 the function f is integrable on [a, b]. We can then apply Proposition 7.3.3 to see that \int_a^b f = \int_a^c f + \int_c^b f.
(b) Part (b) follows from part (a) using mathematical induction.
Proof: Since f is bounded on [a, b], there exists some K such that |f (x)| ≤ K
for x ∈ [a, b]. Let x0 = a, x1 = a + (b − a)/n (for some n, yet to be determined),
xn−1 = b − (b − a)/n and xn = b. Note that lub{f (x) : x ∈ [x0 , x1 ]} ≤ K
(for x ∈ [x0 , x1 ], f (x) ≤ |f (x)| ≤ K), glb{f (x) : x ∈ [x0 , x1 ]} ≥ −K (for
x ∈ [x0 , x1 ], f (x) ≥ −|f (x)| ≥ −K), lub{f (x) : x ∈ [xn−1 , xn ]} ≤ K, and
glb{f (x) : x ∈ [xn−1 , xn ]} ≥ −K. Thus M1 − m1 ≤ 2K and Mn − mn ≤ 2K.
Now suppose we are given ε > 0. Choose n above so that 4K(b − a)/n < ε/2. Since f is continuous on (a, b), we know that f is continuous on [x_1, x_{n−1}]. Then by Riemann's Theorem, Theorem 7.2.6, we know that there exists a partition P^∗ of [x_1, x_{n−1}] such that U(f, P^∗) − L(f, P^∗) < ε/2. Write the partition P^∗ as P^∗ = {x_1, x_2, · · · , x_{n−1}} (which is clearly not in the form that we promised we'd write all partitions—but for a good reason). Let P = {x_0, x_1, · · · , x_{n−1}, x_n}, i.e. P = P^∗ ∪ {x_0, x_n}. Then clearly P is a partition of [a, b] and
Then using Theorem 7.2.6 again we get that f is integrable on [a, b].
It might not be clear what range of integrable functions this result produces. If a < c_1 < · · · < c_k < b, and f is continuous on (a, c_1), (c_k, b) and (c_{j−1}, c_j) for j = 2, · · · , k, then we say that f is piecewise continuous on [a, b]. If we assume that f is defined at a, c_1, · · · , c_k, b (which forces f to be bounded on [a, b]), then by Propositions 7.4.3 and 7.4.4 we see that f is integrable on [a, b] and

\int_a^b f = \int_a^{c_1} f + \sum_{j=2}^k \int_{c_{j−1}}^{c_j} f + \int_{c_k}^b f.
Also if we have the same setting a < c_1 < · · · < c_k < b and S is defined to be constant on each open interval (a, c_1), (c_k, b) and (c_{j−1}, c_j), j = 2, · · · , k, then S is said to be a step function. If, in addition, S is defined at the points a, c_1, · · · , c_k, b, then S is piecewise continuous on [a, b] and is integrable on [a, b]. Thus we have a lot of integrable functions, a bit more interesting than just continuous functions. In fact, we note that the nasty function

f(x) = sin(1/x) if x ≠ 0, and f(x) = 0 if x = 0,

is integrable on [0, 1]. Can you find \int_0^1 f? Note also that f is integrable on [−1, 1].
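As a concrete illustration of integrating a step function piece by piece, here is our own example (the function S and the numbers are ours, not the text's): S jumps at c_1 = 1 and c_2 = 2, and its integral over [0, 3] is just the sum of (constant value) × (interval length).

```python
# Our own example of a step function on [0, 3] and its piecewise integral.

def S(x):
    if x < 1.0:
        return 2.0
    if x < 2.0:
        return -1.0
    return 5.0

# value from the piecewise decomposition: (constant value) * (interval length)
exact = 2.0 * 1.0 + (-1.0) * 1.0 + 5.0 * 1.0

# a midpoint Riemann sum over a fine uniform partition reproduces it; the jump
# points 1.0 and 2.0 fall on partition points, so no subinterval straddles a jump
n = 3000
approx = sum(S((i + 0.5) * 3.0 / n) * (3.0 / n) for i in range(n))
print(exact, approx)
```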
We next prove an intuitive result that, as we will see later, is a useful tool.
Proof: Since f and g are integrable, they both have Archimedian sequences of partitions on [a, b]. Let {P_n} be the common refinement of the two sequences; then {P_n} will satisfy U(f, P_n) → \int_a^b f and U(g, P_n) → \int_a^b g as n → ∞. Since f(x) ≤ g(x) on [a, b], f(x) ≤ g(x) on any of the partition intervals [x_{i−1}, x_i] and we get M_i^f ≤ M_i^g (using notation that we defined in the proof of part (b) of Proposition 7.3.4) and U(f, P_n) ≤ U(g, P_n). If we then let n → ∞, we get \int_a^b f ≤ \int_a^b g.
We next combine Propositions 7.3.4 and 7.4.5 to obtain the following impor-
tant result.
7.4 More Integration Topics 193
Proposition 7.4.6 Suppose that f : [a, b] → R is integrable on [a, b].
(a) Then |\int_a^b f| ≤ \int_a^b |f|.
(b) If |f(x)| ≤ M for all x ∈ [a, b], then |\int_a^b f| ≤ M(b − a).
Proof: (a) We note that by the definition of absolute value we have −|f(x)| ≤ f(x) ≤ |f(x)|. By Corollary 7.4.2-(c) we know that f integrable implies that |f| is integrable. Then by Proposition 7.4.5 we have \int_a^b (−|f|) ≤ \int_a^b f ≤ \int_a^b |f|, i.e. −\int_a^b |f| ≤ \int_a^b f ≤ \int_a^b |f|, which gives |\int_a^b f| ≤ \int_a^b |f|.
(b) Before we start let us emphasize that |f(x)| ≤ M is not really a hypothesis. Because f is already assumed to be integrable, we know f must be bounded. We're now just saying that it's bounded by M.
If we apply part (a) of this proposition and Proposition 7.4.5, we get |\int_a^b f| ≤ \int_a^b |f| ≤ \int_a^b M = M(b − a), which is what we were to prove.
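Both parts of Proposition 7.4.6 can be sanity-checked numerically. The example below is ours (the function, the quadrature helper and the numbers are our own choices): for f(x) = x − 1/2 on [0, 1] we have \int f = 0, \int |f| = 1/4 and M = 1/2.

```python
# Our own numerical sanity check of Proposition 7.4.6.

def midpoint_integral(g, a, b, n=10000):
    # midpoint-rule approximation of the integral of g over [a, b]
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x - 0.5
I = midpoint_integral(f, 0.0, 1.0)                    # approximately 0
J = midpoint_integral(lambda x: abs(f(x)), 0.0, 1.0)  # approximately 1/4
assert abs(I) <= J + 1e-9                  # |int f| <= int |f|
assert abs(I) <= 0.5 * (1.0 - 0.0) + 1e-9  # |int f| <= M (b - a)
print(I, J)
```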
Everything we have done with respect to integrals has been over a range a
to b where a < b. It is convenient and necessary to have integral results for
integrals from d to c with d ≥ c—you probably have already used such integrals
in your basic course. To allow for this we make the following definition.
We then have a variety of results that “fix up” our previous results, now allowing for integrals defined in Definition 7.4.7. The results generally follow previous analogous results for integrals \int_c^d f with c < d. We include some of these results in the following proposition.
\int_{x_1}^{x_2} f ≤ \int_{x_1}^{x_2} g if x_1 ≤ x_2, and \int_{x_1}^{x_2} f ≥ \int_{x_1}^{x_2} g if x_1 > x_2.
(c) If x_1, x_2 ∈ [a, b], then |\int_{x_1}^{x_2} f| ≤ |\int_{x_1}^{x_2} |f||.
(d) If x_1, x_2 ∈ [a, b] and |f(x)| ≤ M for x ∈ [a, b], then |\int_{x_1}^{x_2} f| ≤ M|x_2 − x_1|.
There may be some other integration results that must be adjusted to allow
for arbitrary limits and from this time on we will assume that you are able to
complete them.
7.5 Fundamental Theorem 195
\int_a^b |f + g| ≤ \int_a^b |f| + \int_a^b |g|.
HW 7.4.5 Suppose f : [a, b] → R, g : [c, d] → R are such that f ([a, b]) ⊂ [c, d],
f is continuous on [a, b] and g is integrable on [c, d]. Prove or disprove that g ◦ f
is integrable on [a, b].
Thus we have, for a given ε > 0, a δ such that 0 < |x − c| < δ and x ∈ [a, b] implies that |\frac{F(x) − F(c)}{x − c} − f(c)| < ε. Therefore lim_{x→c} \frac{F(x) − F(c)}{x − c} = f(c), or F'(c) = f(c).
Note that continuity gave us |f(x) − f(c)| < ε for all x satisfying |x − c| < δ. When we used this inequality in equation (7.5.2) we only used it for 0 < |x − c| < δ. This is all we want to use when we are working with the definition of a derivative (a limit), and it is ok to use less than what we have.
We now return to our basic calculus course and define the antiderivative of
a function.
Definition 7.5.3 Consider some interval I ⊂ R and f : I → R. If the function
F is such that F 0 (x) = f (x) for all x ∈ I, then F is said to be an antiderivative
of f on I.
Proof: (⇒) We assume that there is a function \mathcal{F} such that \mathcal{F}(x) − \mathcal{F}(a) = \int_a^x f. Using the notation of this section we can rewrite this expression as \mathcal{F}(x) − \mathcal{F}(a) = F(x), and this expression holds for all x ∈ [a, b]. Then since by Proposition 7.5.2 F is differentiable on [a, b], we know that \mathcal{F} is differentiable on [a, b] (\mathcal{F}(a) is a constant). Also by Proposition 7.5.2 and the fact that we know that the derivative of a constant is zero, we see that \mathcal{F}'(x) = F'(x) = f(x). Thus \mathcal{F} is an antiderivative of f.
(⇐) If \mathcal{F} : [a, b] → R is such that \mathcal{F}'(x) = f(x) for x ∈ [a, b], we know that \mathcal{F}'(x) = F'(x). By Corollary 6.3.5 there exists a C ∈ R such that \mathcal{F}(x) = F(x) + C. Since F(a) = 0, we evaluate the last expression at x = a to see that \mathcal{F}(a) = F(a) + C = C. Thus we have \mathcal{F}(x) = F(x) + \mathcal{F}(a), or \mathcal{F}(x) − \mathcal{F}(a) = F(x) = \int_a^x f, which is what we were to prove.
If we let x = b, we get the following corollary that looks more like the result
we applied so often in our basic course.
Corollary 7.5.5 Suppose f : [a, b] → R is continuous on [a, b] and \mathcal{F} is an antiderivative of f on [a, b]. Then \mathcal{F}(b) − \mathcal{F}(a) = \int_a^b f.
where \mathcal{F}(x) = x^3/3 + x^2/2 + x.
We next include several very nice results, the first two of which are by-products of our previous results. We begin with integration by parts. Integration by parts is usually presented as a technique for evaluating integrals—integrals that we cannot evaluate using easier methods. However, with the advent of computer and calculator calculus systems, integration by parts is not as necessary as an integration technique as it was in the past. Integration by parts is, however, an important tool in analysis, as we shall see in the next chapter. We proceed with the following result.
Proposition 7.5.6 (Integration by Parts) Suppose f, g : [a, b] → R are differentiable on [a, b] and are such that f', g' : [a, b] → R are continuous on [a, b] (f and g are continuously differentiable on [a, b]). Then

\int_a^b f'g = [f(b)g(b) − f(a)g(a)] − \int_a^b fg'.

Proof: The proof of the integration by parts formula is very easy. We all remember that the product formula for differentiation gives us \frac{d}{dx}(fg) = f'g + fg'. We integrate both sides of this equality, use Corollary 7.5.5 (noting that fg is an antiderivative of the function \frac{d}{dx}(fg)) and get

\int_a^b \frac{d}{dx}(fg) = f(b)g(b) − f(a)g(a) = \int_a^b f'g + \int_a^b fg'.
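The formula is easy to verify numerically on a concrete pair of functions. The check below is our own example (the functions and the quadrature helper are ours): with f(x) = x^2 and g(x) = x^3 on [0, 1], we have \int f'g = 2/5, f(1)g(1) − f(0)g(0) = 1 and \int fg' = 3/5, and indeed 2/5 = 1 − 3/5.

```python
# Our own numeric check of the integration by parts formula.

def midpoint_integral(g, a, b, n=20000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f, fp = (lambda x: x ** 2), (lambda x: 2 * x)       # f and f'
g, gp = (lambda x: x ** 3), (lambda x: 3 * x ** 2)  # g and g'

lhs = midpoint_integral(lambda x: fp(x) * g(x), 0.0, 1.0)
rhs = (f(1.0) * g(1.0) - f(0.0) * g(0.0)) - midpoint_integral(lambda x: f(x) * gp(x), 0.0, 1.0)
print(lhs, rhs)   # both approximately 0.4
assert abs(lhs - rhs) < 1e-6
```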
Proof: Before we start our work we might note that we could write the result of the above proposition as \int_a^b f(φ(t))φ'(t) dt = \int_{φ(a)}^{φ(b)} f(x) dx. Substitution is one of the places where this latter notation is very nice—it reminds you that you are making the substitution x = φ(t) in the integral \int_{φ(a)}^{φ(b)} f(x) dx.
We begin by noting that since φ is continuous on the interval [a, b], φ attains both its maximum and minimum on [a, b] (Theorem 5.3.8), i.e. there exist x_M, x_m ∈ [a, b] such that φ(x_M) is the maximum value of φ on [a, b] and φ(x_m) is the minimum value of φ on [a, b]. Suppose for convenience that x_m < x_M. For any y_0 ∈ [φ(x_m), φ(x_M)], by the IVT, Theorem 5.4.1, we know that there is an x_0 ∈ [x_m, x_M] such that φ(x_0) = y_0. Thus φ([a, b]) = [φ(x_m), φ(x_M)] is an interval.
Now let c = φ(a) and d = φ(b). We note that since φ([a, b]) is an interval, φ([a, b]) will contain the interval with end points c and d—either [c, d] or [d, c]. Define F : φ([a, b]) → R by F(x) = \int_c^x f and define h : [a, b] → R by h = F ◦ φ. Then by the Chain Rule, Proposition 6.1.4, and Proposition 7.5.2, we see that h'(x) = F'(φ(x))φ'(x) = f(φ(x))φ'(x). If we integrate both sides of this last expression and apply Corollary 7.5.5, we get

\int_a^b h' = h(b) − h(a) = \int_a^b f(φ(x))φ'(x) dx. (7.5.3)

Since h(a) = F(φ(a)) = F(c) = 0 and h(b) = F(φ(b)) = F(d) = \int_c^d f, equation (7.5.3) becomes \int_c^d f = \int_a^b f(φ(x))φ'(x) dx, or \int_{φ(a)}^{φ(b)} f = \int_a^b (f ◦ φ)φ', which is what we were to prove.
Thus we can now consider an integral such as \int_0^{1/2} \frac{1}{\sqrt{1 − x^2}}. We choose φ(θ) = sin θ. Then φ(0) = 0 and φ(π/6) = 1/2. Thus

\int_0^{1/2} \frac{1}{\sqrt{1 − x^2}} = \int_{φ(0)}^{φ(π/6)} \frac{1}{\sqrt{1 − x^2}} = \int_0^{π/6} \frac{1}{\sqrt{1 − [φ(θ)]^2}} φ'(θ) = \int_0^{π/6} \frac{\cos θ}{\sqrt{1 − \sin^2 θ}} = \int_0^{π/6} 1 = \frac{π}{6}.
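This substitution example is easy to confirm numerically. The check below is ours (the quadrature helper is our own): it evaluates both sides of the change of variables and compares them with π/6.

```python
import math

# Our own numeric check of the substitution example: with phi(t) = sin t,
# int_0^{1/2} dx / sqrt(1 - x^2) = int_0^{pi/6} cos t / sqrt(1 - sin^2 t) dt = pi/6.

def midpoint_integral(g, a, b, n=20000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

lhs = midpoint_integral(lambda x: 1.0 / math.sqrt(1.0 - x * x), 0.0, 0.5)
rhs = midpoint_integral(lambda t: math.cos(t) / math.sqrt(1.0 - math.sin(t) ** 2), 0.0, math.pi / 6)
print(lhs, rhs, math.pi / 6)   # all three agree to several decimal places
```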
We next include a theorem that may not be familiar to you. We will find it useful in the next chapter, and it can be used in a variety of interesting ways. The theorem is called the Mean Value Theorem for Integrals.
Proof: We know from Corollary 7.4.2-(d) that since f and p are integrable on [a, b], fp is integrable on [a, b]. We also know from Theorem 5.3.8 that f assumes its maximum and minimum on [a, b], i.e. there exist m, M ∈ R such that m ≤ f(x) ≤ M for x ∈ [a, b], and there exist x_m, x_M ∈ [a, b] such that f(x_m) = m and f(x_M) = M. Because p(x) ≥ 0 we know also that mp(x) ≤ f(x)p(x) ≤ Mp(x) for all x ∈ [a, b]. Therefore

m \int_a^b p ≤ \int_a^b fp ≤ M \int_a^b p. (7.5.5)

If \int_a^b p = 0, then \int_a^b fp = 0 and we can choose any c ∈ [a, b] to satisfy equation (7.5.4). Otherwise we rewrite inequality (7.5.5) as

f(x_m) = m ≤ \frac{\int_a^b fp}{\int_a^b p} ≤ M = f(x_M).
Before we move on let us emphasize some important points here. Riemann sums are very difficult to work with in that for a given partition there are many different sums—you get a different value for each choice of ξ_i ∈ [x_{i−1}, x_i] for each i = 1, · · · , n. That is why we included the statement "all different choices of S_n(f, P)" in the definition of the Riemann integral—it's usually not there. The fact that we must be able to show that |(R)\int_a^b f − S_n(f, P)| < ε for arbitrary ξ_i ∈ [x_{i−1}, x_i] (besides all partitions P for which gap(P) < δ) can make working with this definition difficult.
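The dependence on the tags ξ_i can be seen concretely. The sketch below is our own illustration (the function and names are ours): for f(x) = x on [0, 1], a Riemann sum S_n(f, P) with randomly chosen tags ξ_i ∈ [x_{i−1}, x_i] always lies between L(f, P) and U(f, P), and every choice of tags converges to \int_0^1 f = 1/2 as gap(P) → 0.

```python
import random

# Our own illustration: Riemann sums with arbitrary tags for f(x) = x on [0, 1].

random.seed(0)   # deterministic tag choices for reproducibility

def riemann_sum(f, pts, tags):
    return sum(f(t) * (xr - xl) for xl, xr, t in zip(pts, pts[1:], tags))

f = lambda x: x
for n in (10, 100, 1000):
    pts = [i / n for i in range(n + 1)]
    tags = [random.uniform(xl, xr) for xl, xr in zip(pts, pts[1:])]
    S = riemann_sum(f, pts, tags)
    L = sum(xl / n for xl in pts[:-1])   # lower sum (f increasing, uniform widths)
    U = sum(xr / n for xr in pts[1:])    # upper sum
    assert L <= S <= U
    print(n, S)   # approaches 0.5
```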
The definition given above is the most common definition used in elementary
textbooks. This is probably because it is the definition that can be given as
quickly as possible. In most texts this definition is given before they consider
limits of sequences—let alone limits of partial sums.
We are going to do very little with Definition 7.6.1. As we stated earlier, the main result will be Theorem 7.6.3, where we prove that the Riemann integral and the Darboux integral are the same. Before we do this, we state the following easy result.
Proof: (⇒) We'll do the easier one first. Suppose ε > 0 is given. Since f is Riemann integrable there exists a δ so that |(R)\int_a^b f − S_n(f, P)| < ε/3 for all partitions P with gap(P) < δ, or (R)\int_a^b f − ε/3 < S_n(f, P) < (R)\int_a^b f + ε/3—and this must hold for all choices ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n.
Choose one such partition P and consider the left half of the inequality, (R)\int_a^b f − ε/3 < S_n(f, P). Since this inequality must hold for any choice of ξ_i ∈ [x_{i−1}, x_i], i = 1, · · · , n, we can take the greatest lower bound of both sides
7.6 Riemann Integral 203
(⇐) This is a tough proof, but also an interesting, good proof. The proof is not as hard as it looks—we just do it very carefully. If f is integrable on [a, b], then for ε > 0, by Definitions 7.2.1 and 7.2.3, there exists a partition of [a, b], P' = {x_0, · · · , x_n}, such that

U(f, P') − \overline{\int_a^b} f = U(f, P') − \int_a^b f < ε/4. (7.6.4)
Since f is bounded on [a, b], there exists M such that |f(x)| ≤ M for all x ∈ [a, b]. Set δ_1 = ε/(16Mn) and let P = {y_0, y_1, · · · , y_m} be any partition of [a, b] such that gap(P) < δ_1. Let P^∗ be the common refinement of P' and P, P^∗ = P' ∪ P. By Lemma 7.1.5, U(f, P^∗) ≤ U(f, P'). Then from (7.6.4) we get

U(f, P^∗) − \int_a^b f < ε/4. (7.6.5)

Thus we see that because each interior point of P' contributes less than or equal to 2Mδ_1 to U(f, P) − U(f, P^∗),

U(f, P) ≤ U(f, P^∗) + (n − 1)2Mδ_1 < U(f, P^∗) + (n − 1)2M · ε/(16Mn) < U(f, P^∗) + ε/8.

If we combine this inequality with inequality (7.6.5), we get

U(f, P) − \int_a^b f < 3ε/8. (7.6.6)

(To show that we are right—in that we claim that "we can show"—you might try deriving inequality (7.6.7).)
Take δ = min{δ1 , δ2 } and suppose we are given a partition of [a, b], P , such
that gap(P ) < δ—we then get both inequalities (7.6.6) and (7.6.7).
Because on any partition interval we have $m_i \le f(\xi_i) \le M_i$, we have $L(f, P) \le S_n(f, P) \le U(f, P)$. The right half of this inequality along with inequality (7.6.6) gives $S_n(f, P) < \int_a^b f + \frac{3\varepsilon}{8}$, and the left half of the inequality along with inequality (7.6.7) gives $\int_a^b f - S_n(f, P) < \frac{3\varepsilon}{8}$; or $\left|\int_a^b f - S_n(f, P)\right| < \frac{3\varepsilon}{8} < \varepsilon$. Thus by Definition 7.6.1, $f$ is Riemann integrable and $(R)\int_a^b f = \int_a^b f$.
HW 7.6.2 (a) Suppose $f : [0, 1] \to \mathbb{R}$ and suppose $\lim_{n\to\infty} \sum_{i=1}^n f\left(\frac{i}{n}\right)\frac{1}{n}$ exists. Show that $f$ is not necessarily integrable on $[0, 1]$.
(b) Show also that neither of the other limits of sums considered in HW7.6.1
will imply the integrability of f either.
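One candidate to keep in mind for part (a)—our suggestion, since the exercise leaves the choice of example to you—is the Dirichlet-type function that is $1$ at every rational and $0$ at every irrational: each sample point $i/n$ is rational, so the sums above are constantly $1$ and their limit exists, yet every lower Darboux sum is $0$ and the function is not integrable. A small numerical sketch (the function and helper names are ours):

```python
from fractions import Fraction

def dirichlet(x):
    # 1 on the rationals, 0 on the irrationals; here every sample
    # point i/n is an exact rational, so the value is always 1.
    return 1 if isinstance(x, Fraction) else 0

def uniform_sum(f, n):
    # The sum from the exercise: sum_{i=1}^n f(i/n) * (1/n).
    return sum(f(Fraction(i, n)) * Fraction(1, n) for i in range(1, n + 1))

# Every uniform sample sum equals 1, independent of n.
sums = [uniform_sum(dirichlet, n) for n in (10, 100, 1000)]
```

The point is that sampling only at the rationals $i/n$ cannot detect the wild behavior of the function between the sample points.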
Proof: We are going to apply Propositions 7.5.1 and 7.5.2. Both of these propositions considered a function $f$ defined on a closed interval $[a, b]$. For this result we must consider the function $1/t$ defined on $(0, \infty)$. However, for any $x_0 \in (0, \infty)$ we can consider $[x_0/2, 2x_0]$ and apply Propositions 7.5.1 and 7.5.2 to see that $f(x) = \ln x$ is both continuous and differentiable at $x = x_0$, and $\left.\frac{d}{dx}\ln x\right|_{x=x_0} = \frac{1}{x_0}$. Thus for any $x \in (0, \infty)$, $\frac{d}{dx}\ln x = \frac{1}{x}$.
Since $\frac{d}{dx}\ln x = \frac{1}{x} > 0$ on $(0, \infty)$, the function $\ln$ is strictly increasing by Corollary 6.3.6-(a).
And by Definition 7.4.7-(a) we see that $\ln 1 = \int_1^1 \frac{1}{t}\,dt = 0$.
We next need to show that the logarithm function we have defined satisfies the basic properties that we all know logarithm functions are supposed to satisfy.
Proof: Notice that the derivative found in part (b) of Proposition 7.7.2 along with the Chain Rule, Proposition 6.1.4, gives $\frac{d}{dx}\ln f(x) = \frac{f'(x)}{f(x)}$.
(a) We consider the expression $\ln(ax)$ where $a \in (0, \infty)$ is some constant. Then $\frac{d}{dx}\ln(ax) = \frac{a}{ax} = \frac{1}{x}$. Also $\frac{d}{dx}(\ln a + \ln x) = 0 + \frac{1}{x} = \frac{1}{x}$. Since $\frac{d}{dx}\ln(ax) = \frac{d}{dx}(\ln a + \ln x) = \frac{1}{x}$, by Corollary 6.3.5 we know that $\ln(ax) = \ln a + \ln x + C$ where $C$ is some constant. This last equality must be true for all $x \in (0, \infty)$. We set $x = 1$ to see that $\ln a = \ln a + \ln 1 + C = \ln a + C$ or $C = 0$. Thus $\ln(ax) = \ln a + \ln x$.
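This addition law can also be checked numerically straight from the definition $\ln x = \int_1^x \frac{1}{t}\,dt$, using a midpoint Riemann sum; this is only a sanity check on the computation above, and the helper name and step count are our own choices:

```python
import math

def ln_via_integral(x, n=100000):
    # ln x = integral from 1 to x of dt/t, approximated with a
    # midpoint Riemann sum (the sum is signed, so it also works
    # for 0 < x < 1, where h is negative).
    h = (x - 1.0) / n
    return sum(h / (1.0 + (i + 0.5) * h) for i in range(n))

a, x = 3.0, 5.0
lhs = ln_via_integral(a * x)                      # ln(ax)
rhs = ln_via_integral(a) + ln_via_integral(x)     # ln a + ln x
```

With $10^5$ subintervals the two sides agree to well within $10^{-6}$, and both agree with the library value of $\ln 15$.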
(b) We note that $\frac{d}{dx}\ln\frac{a}{x} = \frac{1}{a/x}\left(-\frac{a}{x^2}\right) = -\frac{1}{x}$ and $\frac{d}{dx}(\ln a - \ln x) = 0 - \frac{1}{x} = -\frac{1}{x}$. Thus $\ln(a/x) = \ln a - \ln x + C$. If we set $x = 1$, we get $\ln a = \ln a - \ln 1 + C = \ln a + C$ or $C = 0$. Thus $\ln(a/x) = \ln a - \ln x$.
(c) Since $\frac{d}{dx}\ln x^r = \frac{1}{x^r}rx^{r-1} = \frac{r}{x}$ and $\frac{d}{dx}(r\ln x) = r\frac{1}{x}$, $\ln x^r = r\ln x + C$. If we let $x = 1$, we see that $C = 0$ and hence $\ln x^r = r\ln x$. Note that we have only considered part (c) for $r \in \mathbb{Q}$. This is because we have not defined $x^r$ for $r \in \mathbb{I}$—so surely we could not decide how to differentiate $x^r$ for $r \in \mathbb{I}$.
We next consider $\ln 2$—which by using a calculator we know is approximately equal to $0.69$, but we can't use that. We note that $\ln 2 = \int_1^2 (1/t)\,dt$. We also note that on $[1, 2]$ we have $\frac{1}{t} \ge \frac{1}{2}$ (look at the graph of $1/t$). Thus we know that $\ln 2 = \int_1^2 (1/t)\,dt \ge \int_1^2 (1/2)\,dt = \frac{1}{2}$—and it is true that $0.69 \ge 1/2$.
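The estimate $\ln 2 \ge 1/2$ can also be watched numerically with a lower Darboux sum: since $1/t$ is decreasing, the infimum on each subinterval is the right-endpoint value, and each of those values is at least $1/2$ on $[1, 2]$. A quick sketch (the variable names are ours):

```python
# Lower Darboux sum for ln 2 = integral_1^2 dt/t on a uniform
# partition; 1/t is decreasing, so the infimum on each subinterval
# is the value at the right endpoint 1 + (i+1)/n.
n = 1000
lower = sum((1.0 / (1.0 + (i + 1.0) / n)) * (1.0 / n) for i in range(n))
```

Each right-endpoint value is at least $1/2$, so the lower sum is at least $1/2$; at the same time the lower sum never exceeds the integral $\ln 2$.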
Using this inequality we see that $\ln 2^n = n\ln 2 \ge n/2$ and $\lim_{n\to\infty}\ln 2^n = \infty$. Then because the $\ln$ function is increasing, we know that $\lim_{x\to\infty}\ln x = \infty$. (Because $\lim_{n\to\infty}\ln 2^n = \infty$, for every $R > 0$ there exists an $N \in \mathbb{R}$ so that for any $n > N$, $\ln 2^n > R$. Then for any $R > 0$ we can choose $K = 2^{N+1}$. Then because the $\ln$ function is increasing, for any $x > K$, $\ln x > R$.)
Likewise we want to show that $\lim_{x\to 0^+}\ln x = -\infty$. We first show that $\ln 2^{-n} = -n\ln 2 \le -n/2$. Part (b) below then follows using an argument similar to the one used for part (a). We then have the following result that will help us to understand the plot of the $\ln$ function.
Definition 7.7.5 The real number e is defined to be that value such that ln e =
1.
It should be reasonably clear that the same argument used above can be used to prove that for any $y \in (0, \infty)$ there exists an $x \in (1, \infty)$ such that $\ln x = y$—for any such $y$ choose an $n$ so that $\ln 2^n > y$ and then apply the IVT with $\ln$ on $(1, 2^n)$. This implies that $\ln(1, \infty) = (0, \infty)$.
Likewise, we can apply the same approach to show that $\ln(0, 1] = (-\infty, 0]$—remember $\ln 1 = 0$. Specifically, consider $y_0 = -11$. We note that $\ln 2^{-24} = -24\ln 2 \le -24(1/2) = -12 < -11$ and $\ln 1 = 0 > -11$, so we can apply the IVT to imply that there exists some $x_0 \in (2^{-24}, 1)$ such that $\ln x_0 = y_0 = -11$. Or more generally, if you consider any $y_0 \in (-\infty, 0)$, we can choose an $n$ such that $\ln 2^{-n} = -n\ln 2 \le -n/2 < y_0$ (and we do have $\ln 1 = 0 > y_0$). We can apply the IVT to imply that there exists $x_0 \in (2^{-n}, 1)$ such that $\ln x_0 = y_0$. This implies that $\ln(0, 1] = (-\infty, 0]$. Thus we have the following result.
Definition 7.7.7 Define the exponential function, exp : (−∞, ∞) → (0, ∞), as
exp(x) = ln−1 x.
We want to make it very clear that at this time there is no special relationship
between the exponential function defined above and anything of the form ax —
we still don’t know what the latter expression means. However, we do have
tools to help us look at the exp function. We can use either Proposition 5.4.11
or 5.4.12 to prove the following result.
$$\left.\frac{d}{dy}\exp(y)\right|_{y=y_0} = \frac{1}{\left.\frac{d}{dx}\ln x\right|_{x=x_0}} = \exp(y_0). \tag{7.7.1}$$
Proof: The domain of the function $\ln$ is an interval—$I = (0, \infty)$. The function $\ln$ is one-to-one and continuous on $I$, $x_0$ is not an end point of $I$, $\ln$ is differentiable at $x = x_0$ and $\left.\frac{d}{dx}\ln x\right|_{x=x_0} = \frac{1}{x_0} \ne 0$ for any $x_0 \in (0, \infty)$. Note that since $y_0 = \ln x_0$, then $x_0 = \ln^{-1} y_0 = \exp(y_0)$. Thus by Proposition 6.3.8 we get
$$\left.\frac{d}{dy}\exp(y)\right|_{y=y_0} = \frac{1}{\left.\frac{d}{dx}\ln x\right|_{x=x_0}} = \frac{1}{1/x_0} = x_0 = \exp(y_0).$$
The exponential function will inherit other more basic properties from the
logarithm function. Some of these properties are included in the following propo-
sition.
There are, of course, some very basic properties that we want exponentials to satisfy. In Section 5.6, Proposition 5.6.6 we showed that for $r, s \in \mathbb{Q}$, we have $x^r x^s = x^{r+s}$ and $(x^r)^s = x^{rs}$. We want and need $e^{x_1}e^{x_2} = e^{x_1+x_2}$ and $\left(e^{x_1}\right)^{x_2} = e^{x_1 x_2}$ for $x_1, x_2 \in \mathbb{R}$. We have the following.
Proof: This proposition could be proved using the same approach that we used to prove Proposition 7.7.3. Instead of using that approach we will show how these properties follow from results proved in Proposition 7.7.3.
(a) Suppose $y_1$ and $y_2$ are such that $y_1 = e^{x_1}$ and $y_2 = e^{x_2}$—then also $x_1 = \ln y_1$ and $x_2 = \ln y_2$. Then $x_1 + x_2 = \ln y_1 + \ln y_2 = \ln(y_1 y_2)$ by Proposition 7.7.3-(a). Then taking the exponential of both sides gives $e^{x_1+x_2} = e^{\ln(y_1 y_2)} = y_1 y_2 = e^{x_1}e^{x_2}$.
(b) In a similar way we note that $x_1 x_2 = x_2\ln e^{x_1} = \ln\left(e^{x_1}\right)^{x_2}$. Then taking the exponential of both sides yields $e^{x_1 x_2} = \left(e^{x_1}\right)^{x_2}$.
Note that the expression “taking the exponential of both sides” is perfectly
legal. We have two equal values, x1 + x2 = ln(y1 y2 ) in part (a). Hence from the
definition of a function, if we evaluate the exponential function at these equal
values, we can expect to get equal outputs.
We do want and need more general exponentials. To accomplish this we
make the following definition.
Definition 7.7.14 For a > 0 and x ∈ R we define ax = ex ln a .
We next would have to state and prove all of the relevant properties related to the function $a^x$. We want at least the following properties: $a^0 = 1$, $a^1 = a$, $a^{x_1}a^{x_2} = a^{x_1+x_2}$, $\left(a^{x_1}\right)^{x_2} = a^{x_1 x_2}$ and $\frac{d}{dx}a^x = a^x\ln a$. We will not prove these properties but you should be able to see that they follow easily from the analogous properties for the exponential—$e^x$.
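Definition 7.7.14 and two of these properties can be checked numerically; the sketch below verifies the addition law and approximates $\frac{d}{dx}a^x$ with a central difference quotient (the function name and the particular values of $a$, $x_1$, $x_2$ are our own choices for the sketch):

```python
import math

def power(a, x):
    # Definition 7.7.14: a^x = e^{x ln a} for a > 0.
    return math.exp(x * math.log(a))

a, x1, x2 = 2.5, 1.3, -0.7
# Addition law: a^{x1} a^{x2} should equal a^{x1 + x2}.
law = abs(power(a, x1) * power(a, x2) - power(a, x1 + x2))

# Central difference approximation of (d/dx) a^x at x = x1; the
# claimed derivative is a^{x1} ln a.
h = 1e-6
deriv = (power(a, x1 + h) - power(a, x1 - h)) / (2 * h)
```

The difference quotient agrees with $a^{x_1}\ln a$ to several digits, which is of course only evidence, not a proof.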
And finally we want one last very important function defined, $x^r$ for some $r \in \mathbb{R}$ and $x \in (0, \infty)$. The function $x^r$ is already defined, $x^r = e^{r\ln x}$. Of course for certain values of $r$ (many values of $r$) we can actually define $x^r$ for any $x \in \mathbb{R}$ ($x^2$, $x^3$, $x^{2/3}$, etc.)—but for many values of $r$ (at least $r = 1/2$, $r = \pi$, etc.) $x^r$ just doesn't make any sense for $x < 0$. And clearly the definition $x^r = e^{r\ln x}$ only makes sense for positive $x$.
The most basic properties of $x^r$ follow from the properties of the exponential and logarithm. We note that because $\ln x^r = \ln e^{r\ln x} = r\ln x$, we see that now (essentially by definition) $\ln x^r$ satisfies the property given in Proposition 7.7.3-(c)—this time for any $r \in \mathbb{R}$ (instead of only $r \in \mathbb{Q}$).
The property that we need badly is the derivative property. We already have that $\frac{d}{dx}x^r = rx^{r-1}$ for $r \in \mathbb{Q}$. We also have the extension of this result.
Proposition 7.7.15 For $r \in \mathbb{R}$ and $x \in (0, \infty)$, $\frac{d}{dx}x^r = rx^{r-1}$.
Proof: We note that $\frac{d}{dx}x^r = \frac{d}{dx}e^{r\ln x} = e^{r\ln x}\frac{r}{x} = x^r\frac{r}{x} = rx^{r-1}$, which is the desired result.
HW 7.7.1 (True or False and why)
(a) $\ln 2^n \ge \frac{n}{2}$ implies that $\lim_{n\to\infty}\ln 2^n = \infty$.
(b) If $\ln 2x + 3\ln 8x = 1$, then $x = \frac{1}{10}$.
(c) For $x > 0$, $\frac{d}{dx}x^x = (1 + \ln x)x^x$.
(d) The function $\sin x$ is defined for $x \in [0, 2\pi]$.
(e) $\exp(\ln x) = x$ for $x > 0$.
HW 7.7.2 Let $f(\theta) = \cos\theta$. (a) Show that $f$ is not one-to-one. Restrict $f$ in a way so that the restriction, $f_r$, is one-to-one.
(b) Prove the existence of $f_r^{-1}$, prove the continuity of $f_r^{-1}$ and compute $\frac{d}{dx}f_r^{-1}(x)$, i.e. compute $\frac{d}{dx}\cos^{-1}x$.
finite. Define $\int_a^\infty f = \int_a^c f + \int_c^\infty f$.
(f) Suppose that $f : \mathbb{R} \to \mathbb{R}$ is such that $f$ is integrable on $[-c_1, c_1]$ for every $c_1 > 0$. Define $\int_{-\infty}^{\infty} f = \int_{-\infty}^{c} f + \int_c^{\infty} f$ for any $c \in \mathbb{R}$.
When we require that the limits above exist, we first consider that they exist
in R. In this case we can easily develop a whole boat load of properties for the
improper integral. We assume that you understand which properties you
would want (or just return to the previous sections of this chapter and decide
which of the properties will still be true for the improper integral) and that
you could prove most of these properties if you wanted them—usually using a
combination of integration results and limit results. We are not going to do this
here. We wanted to introduce the improper integral because it is an important
concept, and we wanted to emphasize that the improper integral is not part of
the usual definition of the Darboux-Riemann integral.
It is possible to allow limits above that include ±∞. There’s nothing wrong
with allowing infinite values but especially in parts (e) and (f) of the above
definition you must be careful with your “infinite arithmetic.” Also in this case
you must be careful about the “obvious” properties of the integral. Again you
must be careful not to perform any illegal operations with ±∞. Other than
that, all is fine.
Chapter 8

Sequences and Series
$R_0(x) = \int_a^x f'(t)\,dt$. $T_0$ is referred to as the zeroth order Taylor polynomial of the function $f$ about $x = a$ and $R_0$ is the zeroth order error term—of course the trivial case—and generally $T_0$ would not be a very good approximation of $f$.
We obtain the next order of approximation by integrating $\int_a^x f'(t)\,dt$ by parts. We let $G(t) = f'(t)$ and $F'(t) = 1$. Then $G'(t) = f''(t)$ and $F(t) = t - x$. You should take note of the last step carefully. The dummy variable in the integral $\int_a^x f'(t)\,dt$ is $t$. Hence, if you were to integrate by parts without being especially clever (or even sneaky), you would say that $F(t) = t$. However, there is no special reason that you could not use $F = t + 1$ or $F = t + \pi$ instead. The only requirement is that the derivative of $F$ with respect to $t$ must be $1$. Since the integration (and hence, the differentiation) is with respect to $t$, $x$ is a constant with respect to this operation (no different from $1$, $\pi$ or $c$). Since we want it, it is perfectly ok to set $F(t) = t - x$. Then application of integration by parts gives
$$\int_a^x f'(t)\,dt = \int_a^x F'(t)G(t)\,dt = \left[FG\right]_a^x - \int_a^x F(t)G'(t)\,dt$$
$$= 0 - (a - x)f'(a) - \int_a^x (t - x)f''(t)\,dt = (x - a)f'(a) - \int_a^x (t - x)f''(t)\,dt.$$
and
$$R_n(x) = \frac{1}{n!}\int_a^x (x - t)^n f^{(n+1)}(t)\,dt. \tag{8.1.5}$$
where
$$T_n(x) = \sum_{k=0}^n \frac{1}{k!}f^{(k)}(0)x^k \tag{8.1.7}$$
and
$$R_n(x) = \frac{1}{n!}\int_0^x (x - t)^n f^{(n+1)}(t)\,dt. \tag{8.1.8}$$
We can consider the function $f(x) = e^x$ and can easily obtain an expression for the Taylor polynomial for $f$ about $x = 0$.
Example 8.1.1 Obtain the Taylor polynomial and error term for $f(x) = e^x$ about $x = 0$.
Solution: It is easy to see that for any $n$, $f^{(n)}(0) = 1$. Then we can write $T_n(x) = \sum_{k=0}^n \frac{1}{k!}x^k$ and $R_n(x) = \frac{1}{n!}\int_0^x (x - t)^n e^t\,dt$.
Example 8.1.2 Consider the function $f(x) = \frac{1}{x+1}$. Compute Taylor polynomials and error terms for $f$ about $x = 2$ for $n = 4$ and for general $n$.
Solution: We begin by making a table for derivatives of $f$ at $x = 2$: $f^{(k)}(x) = (-1)^k k!(x+1)^{-(k+1)}$, so $f^{(k)}(2) = (-1)^k k!/3^{k+1}$. It is then easy to see that
$$T_4(x) = \frac{1}{3} - \frac{1}{9}(x-2) + \frac{1}{27}(x-2)^2 - \frac{1}{81}(x-2)^3 + \frac{1}{243}(x-2)^4$$
and $R_4(x) = -5\int_2^x (x-t)^4(t+1)^{-6}\,dt$; and $T_n(x) = \sum_{k=0}^n (-1)^k\frac{1}{3^{k+1}}(x-2)^k$ and $R_n(x) = (-1)^{n+1}(n+1)\int_2^x (x-t)^n(t+1)^{-(n+2)}\,dt$.
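The coefficients in $T_4$ can be checked against the closed form $f^{(k)}(2)/k! = (-1)^k/3^{k+1}$ coming from the table of derivatives; a quick numerical sketch (the names are ours):

```python
# Taylor coefficients of f(x) = 1/(x+1) about x = 2, from the
# closed form f^(k)(2)/k! = (-1)^k / 3^(k+1).
coeffs = [(-1) ** k / 3 ** (k + 1) for k in range(5)]

def T4(x):
    # The degree-4 Taylor polynomial about x = 2.
    return sum(c * (x - 2) ** k for k, c in enumerate(coeffs))

# Near x = 2 the polynomial should track f very closely.
err = abs(T4(2.1) - 1.0 / (2.1 + 1.0))
```

At $x = 2.1$ the polynomial matches $f(2.1) = 1/3.1$ to better than $10^{-6}$, consistent with the smallness of $R_4$ near the center of expansion.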
(To obtain this last result we must be careful. When $x \ge a$, everything is positive and the statement is true without the outside absolute value signs. When $x < a$, by Proposition 7.4.8-(b) we get $\int_a^x (a-t)^n f^{(n+1)}(t)\,dt \ge M\int_a^x |(a-t)^n|\,dt$. Because these two integrals are negative, we get
$$\left|\int_a^x (a-t)^n f^{(n+1)}(t)\,dt\right| \le M\left|\int_a^x |(a-t)^n|\,dt\right|.)$$
Next we must compute $\left|\int_a^x |(a-t)^n|\,dt\right|$—carefully. Probably the easiest way is to consider $x \ge a$ and show that
$$\int_a^x |(a-t)^n|\,dt = \int_a^x (t-a)^n\,dt = \frac{(x-a)^{n+1}}{n+1} = \frac{|x-a|^{n+1}}{n+1}.$$
Then consider $x < a$ and show that
$$\int_a^x |(a-t)^n|\,dt = \int_a^x (a-t)^n\,dt = -\frac{(a-x)^{n+1}}{n+1} = -\frac{|x-a|^{n+1}}{n+1}.$$
In either case $\left|\int_a^x |(a-t)^n|\,dt\right| = \frac{|x-a|^{n+1}}{n+1}$, and we get
$$|R_n(x)| \le \frac{M}{(n+1)!}|x-a|^{n+1} \le \frac{M}{(n+1)!}r^{n+1}.$$
We should note that the result of Proposition 8.1.3, equation (8.1.9), can also be expressed as $|f(x) - T_n(x)| \le \frac{M}{(n+1)!}r^{n+1}$ for $x \in [a-r, a+r]$. This expression makes it extremely clear how well $T_n$ approximates $f$.
The $(n+1)!$ in the denominator is one part of the above result that makes the error small on $[a-r, a+r]$. Also, if $r$ is small, then $r^{n+1}$ makes $R_n$ small. Consider the following example that is based on Example 8.1.1.
Thus we see that we can approximate $e^x$ well with a low order Taylor polynomial on a small interval (with $r$ small). It may not be very nice, but we also see that if for some reason we want or need a large interval, we can use a high order Taylor polynomial to approximate $e^x$ on the large interval.
Likewise we can revisit the example considered in Example 8.1.2, $f(x) = \frac{1}{x+1}$; we clearly have to choose $r$ so that $-1 \not\in [2-r, 2+r]$. If we choose $r = 1$ and $n = 4$, then $M = 5!\,1^{-6} = 120$ and $|R_4(x)| \le \frac{5!\,1^{-6}}{5!}1^6 = 1$ on the interval $[1, 3]$—again not very good. If we instead choose $r = 0.5$ and $n = 4$, then $M = 5!\,1.5^{-6} \approx 10.53$ and $|R_4(x)| \le \frac{5!\,1.5^{-6}}{5!}(0.5)^6 \approx 1.37\cdot 10^{-3}$ on the interval $[1.5, 2.5]$. This is a better result.
We see that in this case if $r$ is a bit larger ($1$ or larger), $M$ gets large—large enough so that the $(n+1)!$ in the denominator of (8.1.9) doesn't help make $R_4$ small. And of course, if $r \ge 1$, the $r^n$ term doesn't help make $R_4$ small either.
HW 8.1.3 Consider the function f (x) = sin x. (a) Compute the Taylor poly-
nomial and error term about x = 0 for n = 4 and for a general n.
(b) Apply the Taylor inequality, Proposition 8.1.3, on [−1, 1] to determine a
bound on the error for both cases.
(c) Use the result from part (b) for general n to determine an n0 such that
| sin x − Tn (x)| ≤ 1.0 · 10−10 for all x ∈ [−1, 1].
Example 8.2.2 Define $f_{2n}, f_2 : [0, 1] \to \mathbb{R}$ for $n \in \mathbb{N}$ by $f_{2n}(x) = \frac{x^n}{n}$ and $f_2(x) = 0$ for $x \in [0, 1]$. Show that $f_{2n} \to f_2$ pointwise on $[0, 1]$.
Solution: For any $x \in [0, 1]$, $\lim_{n\to\infty}\frac{x^n}{n} = 0$—thus $f_{2n} \to f_2$ pointwise on $[0, 1]$.
Example 8.2.3 Define $f_{3n}, f_3 : [0, 1] \to \mathbb{R}$ for $n \in \mathbb{N}$ by $f_{3n}(x) = \frac{nx}{1 + n^2x^2}$ and $f_3(x) = 0$ for $x \in [0, 1]$. Show that $f_{3n} \to f_3$ pointwise on $[0, 1]$.
Solution: Since $f_{3n}(0) = 0$ for all $n$, then $f_{3n}(0) \to 0 = f_3(0)$. For $x$ satisfying $0 < x \le 1$, $\lim_{n\to\infty}\frac{nx}{1 + n^2x^2} = 0 = f_3(x)$. Thus $f_{3n} \to f_3$ pointwise on $[0, 1]$.
Series

We started this discussion talking about the manner in which the Taylor polynomial associated with $f$, $T_n$, converges to $f$. Specifically, let $T_n$ denote the Taylor polynomial associated with $f(x) = e^x$ and consider the domain $D = [-3, 3]$. Earlier we found that $\lim_{n\to\infty}T_n(x) = e^x$ for $x \in [-3, 3]$. Thus the sequence $\{T_n\}$ converges to $e^x$ pointwise on $D = [-3, 3]$.
It should be clear that $\{T_n\}$ is a different sort of sequence from $\{f_{1n}\}$, $\{f_{2n}\}$ and $\{f_{3n}\}$ defined above. Recall that the sequence of Taylor polynomials $T_n$ associated with $f(x) = e^x$ is given by $T_n(x) = \sum_{k=0}^n \frac{1}{k!}x^k$. All Taylor polynomials look similar—given as a sum of $n + 1$ terms. When we take the limit as $n$ approaches $\infty$, we are computing an infinite sum. We want to understand what we mean by $\sum_{k=0}^\infty \frac{1}{k!}x^k$. Sums such as these are referred to as series of functions. To provide a logical setting to discuss series of functions we introduce series of real numbers.
For a sequence $\{a_1, a_2, \cdots\}$ where $a_i \in \mathbb{R}$ for all $i = 1, 2, \cdots$, we want to discuss what we mean by $\sum_{i=1}^\infty a_i$, the sum of an infinite number of real numbers. We define partial sums of $\{a_i\}$ by $s_n = \sum_{i=1}^n a_i$ for $n \in \mathbb{N}$ and consider the sequence of partial sums, $\{s_n\}$.
Definition 8.2.2 Consider the real sequence $\{a_i\}$ and the associated sequence of partial sums $\{s_n\}$, $s_n = \sum_{i=1}^n a_i$. If the sequence $\{s_n\}$ is convergent, say to $s$, we say that the series $\sum_{i=1}^\infty a_i$ converges and we define $\sum_{i=1}^\infty a_i = s = \lim_{n\to\infty} s_n$. We refer to $\sum_{i=1}^\infty a_i$ as an infinite series, or just a series. If the sequence $\{s_n\}$ does not converge, we say that the series $\sum_{i=1}^\infty a_i$ does not converge. If $s_n \to \pm\infty$, we say that the series $\sum_{i=1}^\infty a_i$ diverges to $\pm\infty$, respectively—but make sure you understand that a series that diverges to $\pm\infty$ does not converge in $\mathbb{R}$.
The geometric series is very nice, but it is almost the only series for which we can write and work explicitly with the sequence of partial sums (telescoping series give one more example).
When we consider the convergence of a series, it is sometimes useful to realize that when we are showing that $s_n \to s$, where $s$ is to be the sum of the series $\sum_{i=1}^\infty a_i$, we must consider $s - s_n$—as in $|s - s_n| < \varepsilon$. And $s - s_n = \sum_{i=n+1}^\infty a_i$. Thus to show that a series converges, we must show that the sum of the "tail end" of the series is arbitrarily small.
And finally, one other approach that is extremely useful when working with the convergence of series is to use the Cauchy criterion for the convergence of the sequence $\{s_n\}$ introduced in Section 3.4. Recall that when we discussed the Cauchy criterion, we noted that it was a case where we did not need to know the limit of the sequence. This is especially convenient when we are working with series in that we hardly ever know or can guess the sum of the series. We include the application of the Cauchy criterion to the convergence of series in the following proposition.
Proposition 8.2.3 Consider the real sequence $\{a_i\}$. The series $\sum_{i=1}^\infty a_i$ converges if and only if for every $\varepsilon > 0$ there exists $N \in \mathbb{R}$ such that $m, n \in \mathbb{N}$, $m \ge n$ and $m, n > N$ implies that $\left|\sum_{i=n}^m a_i\right| < \varepsilon$.
Proof: This result follows from Proposition 3.4.11 in that $\{s_n\}$ is convergent if and only if the sequence $\{s_n\}$ is a Cauchy sequence. The sequence $\{s_n\}$ is a Cauchy sequence if for every $\varepsilon > 0$ there exists an $N \in \mathbb{R}$ such that $n, m \in \mathbb{N}$ and $n, m > N$ implies $|s_m - s_n| < \varepsilon$. This can easily be adjusted by setting $N^* = N + 1$ and requiring $m, n > N^*$, which implies that $|s_m - s_{n-1}| < \varepsilon$. If we take $m \ge n$ (one of the two must be larger), then $s_m - s_{n-1} = \sum_{i=n}^m a_i$. The result follows.
We used the example of convergence of Taylor polynomials to motivate the convergence of series. We now realize that the convergence of Taylor polynomials is really the convergence of a series, a Taylor series. For that reason (and the fact that it is an important concept) we now define what we mean by the pointwise convergence of a series of functions.
Definition 8.2.4 Consider the sequence of functions $\{f_i(x)\}$ where for each $i$, $f_i : D \to \mathbb{R}$, $D \subset \mathbb{R}$. If for each $x \in D$ the real series $\sum_{i=1}^\infty f_i(x)$ is convergent, say to $s(x)$, then we say that the series of functions $\sum_{i=1}^\infty f_i(x)$ converges pointwise to $s(x)$.
Begin by noting that the notation used above is not very good. At the function level it would be better to say that the series of functions $\sum_{i=1}^\infty f_i$ converges pointwise to $s$—but the above notation is reasonably common.
We should note that we can also consider the sequence of partial sums of functions, $s_n(x) = \sum_{i=1}^n f_i(x)$, and say that if the sequence $\{s_n(x)\}$ converges pointwise, say to $s(x)$, then the series of functions $\sum_{i=1}^\infty f_i(x)$ is said to converge pointwise and is defined to be equal to $s(x)$. We emphasize that this approach to convergence of series of functions is completely equivalent to the convergence defined in Definition 8.2.4.
In our consideration of the convergence of sequences of Taylor polynomials we have already given a very common example of a series of functions. Since $T_n(x)$ was really a partial sum, when we considered the convergence of the Taylor polynomials of $f(x) = e^x$ on $[-3, 3]$, we were proving the pointwise convergence of the series of functions $\sum_{i=0}^\infty \frac{1}{i!}x^i$ on $[-3, 3]$ (and we hope that you realize that it is not important that we considered general series starting at $i = 1$ and the Taylor series starting with $i = 0$). Because we expanded $f(x) = e^x$ about $x = 0$, the series given above is the Maclaurin series of $f$. In general we make the following definition.
When you look back to Proposition 3.3.2, you might ask "what about part (d)?" We don't know it yet, but we will find later that $\sum_{i=1}^\infty \frac{(-1)^i}{\sqrt{i}}$ is convergent, yet the term-by-term product of this series with itself, $\sum_{i=1}^\infty \frac{(-1)^i}{\sqrt{i}}\frac{(-1)^i}{\sqrt{i}} = \sum_{i=1}^\infty \frac{1}{i}$, is not. Hence there is no nice result that gives convergence for a series resulting from a term by term product of two convergent series.
The next result is very easy but was a very important tool in your basic course.
Proposition 8.3.2 If the series $\sum_{i=1}^\infty a_i$ converges, then $\lim_{i\to\infty} a_i = 0$.
Proof: If $s_n$ represents the partial sum associated with the convergent series $s = \sum_{i=1}^\infty a_i$, we know that both limits $\lim_{n\to\infty} s_n$ and $\lim_{n\to\infty} s_{n-1}$ exist and equal $s = \sum_{i=1}^\infty a_i$. Then $a_n = s_n - s_{n-1} \to s - s = 0$.
As we said earlier, the result given in Proposition 8.3.2 is very important—
but not in the form given in Proposition 8.3.2. For this reason we state the
contrapositive of Proposition 8.3.2 as the following corollary—called the “test
for divergence” in the basic course.
∞
X
Corollary 8.3.3 (Test for Divergence) Consider the real series ai . If
i=1
∞
X
lim ai 6= 0, then the series ai does not converge.
i→∞
i=1
One thing that we want to emphasize is that the statement "$\lim_{i\to\infty} a_i \ne 0$" can be satisfied if either the limit does not exist, or the limit exists and is not equal to zero. Of course this corollary can be used to show that the series $\sum_{i=1}^\infty (-1)^i$, $\sum_{i=1}^\infty 2^i$ and $\sum_{i=1}^\infty \sin(i)$ do not converge. We do not know it yet, but the series $\sum_{i=1}^\infty \frac{1}{i}$ does not converge. For this series $a_i = 1/i \to 0$. Hence we emphasize that the converse of Proposition 8.3.2 is not true.
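The harmonic series is worth a numerical look: its terms $1/i$ certainly go to $0$, but its partial sums keep growing (roughly like $\ln n$), which is consistent with the claim that the converse fails. A sketch (the names are ours):

```python
import math

# The terms 1/i go to 0, yet the partial sums s_n of the harmonic
# series keep growing--roughly like ln n--so {s_n} cannot converge.
s = 0.0
checkpoints = {}
for i in range(1, 100001):
    s += 1.0 / i
    if i in (100, 10000, 100000):
        checkpoints[i] = s
```

The recorded partial sums keep increasing and track $\ln n + \gamma$ (with $\gamma \approx 0.5772$, a fact from outside this text) closely, so no finite limit is in sight even though the terms are tiny.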
We next include a concept that will be very important to us later. We begin by including the definition of absolute and conditional convergence.
Definition 8.3.4 Suppose $\{a_i\}$ is a real sequence. We say that the series $\sum_{i=1}^\infty a_i$ is absolutely convergent if the series $\sum_{i=1}^\infty |a_i|$ is convergent. If the series $\sum_{i=1}^\infty a_i$ is convergent but not absolutely convergent, then the series is said to be conditionally convergent.
Proof: This is one of the results where it is very convenient to consider the Cauchy criterion for the convergence of a series given in Proposition 8.2.3. We suppose that we are given an $\varepsilon > 0$. Since $\sum_{i=1}^\infty a_i$ is absolutely convergent, we know that there exists $N \in \mathbb{R}$ such that $n, m \in \mathbb{N}$, $n, m > N$ and $m > n$ implies $\left|\sum_{i=n}^m |a_i|\right| < \varepsilon$ (where the outer absolute value signs are not really needed). Then $m, n \in \mathbb{N}$, $m, n > N$ and $m > n$ implies—by multiple applications of the triangle inequality, Proposition 1.5.8-(v), or an easy math induction proof—$\left|\sum_{i=n}^m a_i\right| \le \sum_{i=n}^m |a_i| < \varepsilon$. Thus by the Cauchy criterion for convergence of series, the series $\sum_{i=1}^\infty a_i$ converges.
We see that if we consider the series $\sum_{i=1}^\infty \frac{1}{2^i}$, which we know is convergent because it is a geometric series associated with $r = 1/2 < 1$, then we immediately know that the series $\sum_{i=1}^\infty (-1)^i\frac{1}{2^i}$ is convergent—or any other series like this where some of the terms are negative. We will apply this result often in a similar way.
The series results given so far do not directly help us decide whether or not series converge. When we worked with sequences, we had many methods that helped find limits. We next state and prove a series of results that help determine whether or not a series is convergent. We begin with the integral test (recall that we considered improper integrals in Section 7.8).
Proposition 8.3.6 (Integral Test) Suppose that $\sum_{i=1}^\infty a_i$ is a real series and suppose $f : [1, \infty) \to \mathbb{R}$ is a positive, decreasing continuous function for which $f(i) = a_i$ for $i \in \mathbb{N}$. Then $\sum_{i=1}^\infty a_i$ converges if and only if the integral $\int_1^\infty f$ exists.
Proof: Before we proceed we emphasize that we are assuming that $\int_1^\infty f$ exists in $\mathbb{R}$ (we do not include convergence to $\infty$ for this assumption). Since $f$ is decreasing on the interval $[i-1, i]$, we know that $f(i-1) \ge f(t) \ge f(i)$ for any $t \in [i-1, i]$. Hence by Proposition 7.4.5, $f(i-1) = \int_{i-1}^i f(i-1) \ge \int_{i-1}^i f \ge \int_{i-1}^i f(i) = f(i)$, or $a_{i-1} \ge \int_{i-1}^i f \ge a_i$. If we sum from $i = 2$ to $i = n$ and
exists. We must show that $\lim_{R\to\infty}\int_1^R f$ exists. We claim that this limit does in fact exist and will equal $L$. Suppose that $\varepsilon > 0$ is given. Choose the $N$ based on the convergence of the sequence $\{b_n\}$. Let $N_1 = N + 1$ and suppose that $R > N_1$. Note that since $f(x) \ge 0$, we can use Propositions 7.3.3 and 7.4.5 to show that $\int_1^R f \ge \int_1^{[R]} f$ where $[R]$ is the greatest integer function. Then $[R] > N$ and
$$\left|L - \int_1^R f\right| = L - \int_1^R f \le L - \int_1^{[R]} f = L - b_{[R]} < \varepsilon.$$
Therefore $\lim_{R\to\infty}\int_1^R f$ exists and the improper integral $\int_1^\infty f$ exists.
(⇐) We now assume that the integral $\int_1^\infty f$ exists. By the right side of inequality (8.3.1), the fact that $f$ is positive and the fact that $\int_1^\infty f$ exists, we see that $\int_1^n f \le \int_1^\infty f$.
We now state the following corollary, where we include one of the implications from Proposition 8.3.6 and the contrapositive of the other implication.
Corollary 8.3.7 (Integral Test) Suppose that $\sum_{i=1}^\infty a_i$ is a real series and suppose $f : [1, \infty) \to \mathbb{R}$ is a positive, decreasing continuous function for which $f(i) = a_i$ for $i \in \mathbb{N}$.
(a) If the improper integral $\int_1^\infty f$ exists, then the series $\sum_{i=1}^\infty a_i$ is convergent.
(b) If the improper integral $\int_1^\infty f$ does not exist, then the series $\sum_{i=1}^\infty a_i$ is not convergent.
Proposition 8.3.8 (Comparison Test) Suppose that $\{a_i\}$ and $\{b_i\}$ are real, positive sequences and suppose that for some $N_1 \in \mathbb{N}$, $a_i \le b_i$ for all $i \ge N_1$. If the series $\sum_{i=1}^\infty b_i$ converges, then the series $\sum_{i=1}^\infty a_i$ converges.
Proof: Since $\sum_{i=1}^\infty b_i$ converges, we know from Proposition 8.2.3 that for every $\varepsilon > 0$ there exists an $N_2 \in \mathbb{R}$ such that $n, m \in \mathbb{N}$, $n, m > N_2$ and $m > n$ implies that $\sum_{i=n}^m b_i < \varepsilon$ (where no absolute value signs are needed since $\{b_i\}$ was assumed to be a positive sequence). If we then let $N = \max\{N_1, N_2\}$, we know that for $n, m \in \mathbb{N}$, $n, m > N$ and $m > n$ we have $\sum_{i=n}^m a_i \le \sum_{i=n}^m b_i < \varepsilon$. Therefore again by Proposition 8.2.3 we know that $\sum_{i=1}^\infty a_i$ converges.
We mentioned earlier that the comparison test is often difficult to use. If we consider a series such as $\sum_{i=1}^\infty \frac{1}{i^2+i+1}$, it is easy to see that $\frac{1}{i^2+i+1} \le \frac{1}{i^2}$. The series $\sum_{i=1}^\infty \frac{1}{i^2}$ converges because it is a $p$-series with $p = 2$. Hence, by the comparison test the series $\sum_{i=1}^\infty \frac{1}{i^2+i+1}$ converges.
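The comparison underlying this example is easy to watch numerically: every partial sum of $\sum \frac{1}{i^2+i+1}$ is trapped below the corresponding partial sum of $\sum \frac{1}{i^2}$, which in turn stays below $\pi^2/6$ (the known sum of the $p$-series with $p = 2$, a fact we have not proved here). A sketch (the names are ours):

```python
# Each term 1/(i^2+i+1) sits below 1/i^2, so the partial sums of the
# first series stay below those of the p-series with p = 2.
N = 100000
s_a = sum(1.0 / (i * i + i + 1) for i in range(1, N + 1))
s_b = sum(1.0 / (i * i) for i in range(1, N + 1))
```

The bounded, increasing partial sums are exactly what the comparison test converts into convergence.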
If we instead consider a series such as $\sum_{i=1}^\infty \frac{1}{i^2-i+1}$, we have to be more clever. We note that $i^2-i+1 \ge i^2-2i+1 = (i-1)^2$, so $\frac{1}{(i-1)^2} \ge \frac{1}{i^2-i+1}$ for $i \ge 2$. The series $\sum_{i=2}^\infty \frac{1}{(i-1)^2}$ is a $p$-series with $p = 2$ so it is convergent—it is not exactly in the form of a $p$-series, but it should be clear that with a change of variable $j = i - 1$, we see that it is exactly in the form of a $p$-series. Then by the Comparison Test, $\sum_{i=1}^\infty \frac{1}{i^2-i+1}$ is convergent. We note that the series in Proposition 8.3.8 both start at $i = 1$ where in this example the series $\sum_{i=2}^\infty \frac{1}{(i-1)^2}$ starts at $i = 2$. This is no problem. We could add an $i = 1$ term, say $b_1 = 13$. The series $13 + \sum_{i=2}^\infty \frac{1}{(i-1)^2}$ will still be convergent and Proposition 8.3.8 will apply with $N_1 = 2$.
Just as we did following the integral test, we can apply the comparison test in conjunction with absolute convergence. Using the comparison test and the fact that $\frac{|\sin i|}{i^2} \le \frac{1}{i^2}$, it is easy to see that the series $\sum_{i=1}^\infty \frac{|\sin i|}{i^2}$ is convergent. Then using Proposition 8.3.5 we know that $\sum_{i=1}^\infty \frac{\sin i}{i^2}$ converges.
The next convergence test is an extremely nice result that takes care of most
of the difficulties associated with the comparison test.
Proposition 8.3.9 (Limit Comparison Test) Suppose that $\{a_i\}$ and $\{b_i\}$ are positive, real sequences.
(a) If $\lim_{i\to\infty}\frac{a_i}{b_i} \ne 0$, then the series $\sum_{i=1}^\infty a_i$ is convergent if and only if the series $\sum_{i=1}^\infty b_i$ is convergent. Note that (a) can be worded as follows:
(a1) If $\lim_{i\to\infty}\frac{a_i}{b_i} \ne 0$ and $\sum_{i=1}^\infty b_i$ converges, then $\sum_{i=1}^\infty a_i$ converges, and
(a2) If $\lim_{i\to\infty}\frac{a_i}{b_i} \ne 0$ and $\sum_{i=1}^\infty b_i$ does not converge, then $\sum_{i=1}^\infty a_i$ does not converge.
(b) If $\lim_{i\to\infty}\frac{a_i}{b_i} = 0$ and $\sum_{i=1}^\infty b_i$ converges, then $\sum_{i=1}^\infty a_i$ converges.
Proof: The statement of the proposition above really consists of parts (a) and
(b). Parts (a1 ) and (a2 ) are rewordings of part (a)—one implication and the
contrapositive of the other implication. Statements (a1 ) and (a2 ) are in a form
much easier to apply than that of (a).
(a) (⇒) We assume that $\lim_{i\to\infty}\frac{a_i}{b_i} = r \ne 0$ and the series $\sum_{i=1}^\infty a_i$ is convergent. Since $a_i$ and $b_i$ are positive, $r > 0$. Because $\lim_{i\to\infty}\frac{a_i}{b_i} = r > 0$, for every $\varepsilon > 0$ there exists $N \in \mathbb{R}$ such that $i > N$ implies that $\left|\frac{a_i}{b_i} - r\right| < \varepsilon$ or
$$r - \varepsilon < \frac{a_i}{b_i} < r + \varepsilon. \tag{8.3.2}$$
Since the sequence $\{b_i\}$ is assumed positive, inequality (8.3.2) can be rewritten as
$$(r - \varepsilon)b_i < a_i < (r + \varepsilon)b_i. \tag{8.3.3}$$
Choose $\varepsilon = r/2$. Then for $i > N$ we have $(r/2)b_i < a_i$. By the comparison test, Proposition 8.3.8, since $\sum_{i=1}^\infty a_i$ converges, $\sum_{i=1}^\infty (r/2)b_i$ converges. By Proposition 8.3.1-(b) this implies that $\sum_{i=1}^\infty b_i$ is also convergent.
(⇐) The proof of this direction is almost identical to the previous proof. The difference is that this time, the right hand half of inequality (8.3.3) is used along with the comparison test to show that if $\sum_{i=1}^\infty b_i$ converges, then $\sum_{i=1}^\infty a_i$ is also convergent—try it.
(b) If $\lim_{i\to\infty}\frac{a_i}{b_i} = 0$, then for $\varepsilon > 0$ there exists an $N \in \mathbb{R}$ such that $i > N$ implies that $\frac{a_i}{b_i} < \varepsilon$ (no absolute value signs are necessary because both sequences are positive). Thus for $i > N$ we have $a_i < \varepsilon b_i$.
Thus by Proposition 8.3.1-(b) and the comparison test, the convergence of $\sum_{i=1}^\infty b_i$ implies the convergence of $\sum_{i=1}^\infty a_i$.
Hopefully you remember from your basic course that you can easily prove the convergence of $\sum_{i=1}^\infty \frac{1}{i^2-i+1}$ by setting $a_i = \frac{1}{i^2-i+1}$, $b_i = \frac{1}{i^2}$ and applying part (a1) of the limit comparison test (realizing that $\sum_{i=1}^\infty b_i$ converges because it is a $p$-series with $p = 2$). This is much easier than applying the comparison test.
To show that $\sum_{i=1}^\infty \frac{i^2+i+1}{i^3+i^2+i+1}$ does not converge, we set $a_i = \frac{i^2+i+1}{i^3+i^2+i+1}$, $b_i = \frac{1}{i}$, show that $\frac{a_i}{b_i} \to 1$ as $i \to \infty$, and apply part (a2) of the limit comparison test to see that $\sum_{i=1}^\infty \frac{i^2+i+1}{i^3+i^2+i+1}$ does not converge (recall that $\sum_{i=1}^\infty \frac{1}{i}$ diverges since it is a $p$-series with $p = 1$).
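The computation $a_i/b_i \to 1$ in this example is easy to check numerically (the helper names are ours):

```python
# a_i and b_i from the example above; the ratio a_i/b_i tends to 1.
def a(i):
    return (i * i + i + 1) / (i ** 3 + i ** 2 + i + 1)

def b(i):
    return 1.0 / i

ratios = [a(i) / b(i) for i in (10, 1000, 100000)]
```

Algebraically $a_i/b_i = 1 - \frac{1}{i^3+i^2+i+1}$, so the computed ratios climb toward $1$ from below, exactly the nonzero limit that part (a2) requires.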
Generally, the limit comparison test allows you to prove the convergence or divergence of a series by "comparing" the series with a known, much nicer series that is similar to the original series—similar in that the limit $a_i/b_i \to r$ exists.
We next introduce the convergence test that might be the most important test of them all. The ratio test is applicable to series that are almost geometric series—as we shall see from the proof and the examples that follow. Of course the ratio test will work on a geometric series, but we don't need it there.
Proposition 8.3.10 (Ratio Test) Consider a real sequence of non-zero elements $\{a_i\}$.
(a) Suppose that there exists $r$, $0 < r < 1$, and an $N \in \mathbb{N}$ such that $i \geq N$ implies that $\left| \frac{a_{i+1}}{a_i} \right| \leq r$. Then the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent.
(b) Suppose that there exists $r$, $r \geq 1$, and an $N \in \mathbb{N}$ such that $i \geq N$ implies that $\left| \frac{a_{i+1}}{a_i} \right| \geq r$. Then the series $\sum_{i=1}^{\infty} a_i$ does not converge.
Proof: (a) Suppose that there exists $r$, $0 < r < 1$, and an $N \in \mathbb{N}$ such that $i \geq N$ implies that $\left| \frac{a_{i+1}}{a_i} \right| \leq r$. Note the following.
• $\left| \frac{a_{N+1}}{a_N} \right| \leq r$ implies that $|a_{N+1}| \leq r|a_N|$
• $\left| \frac{a_{N+2}}{a_{N+1}} \right| \leq r$ implies that $|a_{N+2}| \leq r|a_{N+1}| \leq r^2|a_N|$
• Claim: $m \geq 1$ implies that $|a_{N+m}| \leq r^m|a_N|$
(b) Suppose that there exists $r$, $r \geq 1$, and an $N \in \mathbb{N}$ such that $i \geq N$ implies that $\left| \frac{a_{i+1}}{a_i} \right| \geq r$. By a mathematical induction proof similar to that used in part (a) we can show that $m \geq 1$ implies that $|a_{N+m}| \geq r^m|a_N|$. Since $r \geq 1$, it is clear that $|a_{N+m}| \not\to 0$ as $m \to \infty$. Thus it is impossible that $a_{N+m} \to 0$ (because if $a_{N+m} \to 0$, then $|a_{N+m}| \to 0$). And if $a_{N+m} \not\to 0$, it should be clear that $a_i \not\to 0$. Thus by Corollary 8.3.3 we know that the series $\sum_{i=1}^{\infty} a_i$ does not converge.
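The geometric domination $|a_{N+m}| \leq r^m|a_N|$ at the heart of part (a) can be watched numerically. The series $a_i = i/2^i$ in the sketch below is our own choice of example, not the text's; for it $\left|a_{i+1}/a_i\right| = (i+1)/(2i) \leq 3/4$ for every $i \geq 2$, so we may take $N = 2$ and $r = 3/4$:

```python
# Geometric domination from the ratio test proof: |a_{N+m}| <= r^m |a_N|.
# Example series (ours): a_i = i / 2^i, where |a_{i+1}/a_i| = (i+1)/(2i)
# is at most 3/4 for every i >= 2, so N = 2 and r = 3/4.
N, r = 2, 0.75

def a(i):
    return i / 2**i

checks = [abs(a(N + m)) <= r**m * abs(a(N)) for m in range(1, 25)]
print(all(checks))
```

The terms of the series are squeezed under a convergent geometric series, which is exactly how the comparison test finishes the proof.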
We should note that the above version of the ratio test is not the version
usually included in the basic calculus texts. We include the following version of
the ratio test.
really get that the tail end of the series converges (from $N$ on), which implies that the whole series converges. Therefore the series $\sum_{i=1}^{\infty} a_i$ converges absolutely.
(b) Since $|a_i|^{1/i} \geq r$, we have $|a_i| \geq r^i$. Then since $r \geq 1$, we know that $|a_i| \not\to 0$, which implies that $a_i \not\to 0$—which by Corollary 8.3.3 implies that the series $\sum_{i=1}^{\infty} a_i$ does not converge.
We should note that like the ratio test, the statement of Proposition 8.3.12 is not the version that is usually given in basic calculus texts. And as was the case with the ratio test, the more traditional root test will follow from Proposition 8.3.12.
There is one more important test for convergence that we must consider. We should realize that until this time all of our tests were for positive series or they gave absolute convergence (the ratio and root tests). The only way that we proved convergence of a series that was not positive was to use Proposition 8.3.5—absolute convergence implies convergence. We next consider a class of series that are not positive, called alternating series. We now include the following definition and the associated convergence theorem.
Note that we set the exponent on the −1 term to be i + 1 just so that the first
term would be positive—that seems a bit neater. This is not important. It is
still an alternating series if it starts out negative and the result given below is
equally true for alternating series that start with a negative term.
i.e. the sequence $\{s_{2n}\}$ is bounded above. Then by the Monotone Convergence Theorem, Theorem 3.5.2-(a), the sequence $\{s_{2n}\}$ converges—say to $s$. Then since $s_{2n+1} = s_{2n} + a_{2n+1}$ and the fact that $a_{2n+1} \to 0$, we see that $\{s_{2n+1}\}$ also converges to $s$.
Claim: The sequence $\{s_n\}$ converges to $s$. Let $\epsilon > 0$ be given and let $N_1$ be such that $n > N_1$ implies that $|s_{2n} - s| < \epsilon$, i.e. if $2n > 2N_1$, then $|s - s_{2n}| < \epsilon$. Let $N_2$ be such that $n > N_2$ implies that $|s - s_{2n+1}| < \epsilon$, i.e. if $2n + 1 > 2N_2 + 1$, then $|s - s_{2n+1}| < \epsilon$.
Then if we define $N = \max\{2N_1, 2N_2 + 1\}$, $n > N$ implies that $|s - s_n| < \epsilon$, so $\lim_{n \to \infty} s_n = s$ or the series $\sum_{i=1}^{\infty} (-1)^{i+1} a_i$ converges—to $s$.
We emphasize here that to apply the alternating series test we must show that the sequence is (i) decreasing, (ii) $a_i \to 0$ and that (iii) the series is alternating. We used the fact earlier that the series $\sum_{i=1}^{\infty} \frac{(-1)^i}{\sqrt{i}}$ converges. It is easy to see that (i) $a_{i+1} = \frac{1}{\sqrt{i+1}} < \frac{1}{\sqrt{i}} = a_i$ (which is the same as $\sqrt{i} < \sqrt{i+1}$ or $i < i + 1$), and (ii) $\lim_{i \to \infty} \frac{1}{\sqrt{i}} = 0$. Surely the series is alternating. Hence by the alternating series test the series $\sum_{i=1}^{\infty} \frac{(-1)^i}{\sqrt{i}}$ converges.
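To get a feel for how this convergence behaves, here is a small numerical sketch of our own. For an alternating series with decreasing terms, every partial sum past $s_n$ stays within $a_{n+1} = 1/\sqrt{n+1}$ of $s_n$, a computable version of the usual alternating series error bound:

```python
import math

# Partial sums of the alternating series sum_{i=1}^infty (-1)^i / sqrt(i).
def s(n):
    return sum((-1) ** i / math.sqrt(i) for i in range(1, n + 1))

# Every partial sum past s_n lies within a_{n+1} = 1/sqrt(n+1) of s_n,
# so we check |s_m - s_n| <= 1/sqrt(n+1) for a much larger m.
n, m = 100, 50000
gap, bound = abs(s(m) - s(n)), 1 / math.sqrt(n + 1)
print(gap, bound)
```

The slow $1/\sqrt{n}$ error decay also shows why this series converges only conditionally: the absolute series $\sum 1/\sqrt{i}$ is a divergent $p$-series with $p = 1/2$.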
HW 8.3.2 Tell which test (if any) will determine whether the following series are convergent or not.
(a) $\sum_{n=1}^{\infty} \frac{n}{2^n}$  (b) $\sum_{n=1}^{\infty} \frac{1}{n^n}$  (c) $\sum_{n=1}^{\infty} \frac{1}{n^2 + 1}$  (d) $\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}$
From Section 8.3 we now have other methods for proving convergence of series. For example if we consider that same Maclaurin series, $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$, and apply the ratio test, we see that
\[
\left| \frac{a_{k+1}}{a_k} \right| = \frac{\left| \frac{1}{(k+1)!} x^{k+1} \right|}{\left| \frac{1}{k!} x^k \right|} = \frac{|x|}{k+1} \to 0
\]
as $k \to \infty$ for all $x \in \mathbb{R}$. Thus we have just shown that the series that is the natural infinite series associated with $f(x) = e^x$ converges on all of $\mathbb{R}$—a much better result than that proved in Section 8.2 where we proved it converged on $[-3, 3]$.
But note several important things. Here we proved that the series $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$ converges for all $x \in \mathbb{R}$, but we did not prove that it converges to $f(x) = e^x$. We should also note that if we apply the same approach using the Taylor inequality as we did in Section 8.2 on a larger interval, say $[-88, 88]$, we still get convergence, i.e. $f^{(n+1)}(x) = e^x$ for all $n$ so $M = e^{88}$, $|T_n(x) - e^x| \leq \frac{e^{88}}{(n+1)!} 88^{n+1}$, and $\frac{88^{n+1}}{(n+1)!} \to 0$ as $n \to \infty$ (Example 3.5.2) implies that $T_n(x) \to e^x$ for all $x \in [-88, 88]$. And since this argument will work for any interval $[-R, R] \subset \mathbb{R}$, the series $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$ converges to $f(x) = e^x$ for all $x \in \mathbb{R}$. Then we can write $f(x) = \sum_{k=0}^{\infty} \frac{1}{k!} x^k$ on $\mathbb{R}$.
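We can watch this convergence happen numerically. The sketch below (an illustration of ours, computing each term incrementally to avoid huge factorials) compares the partial sums $T_n(x) = \sum_{k=0}^{n} x^k/k!$ with $e^x$ at a moderately large $x$:

```python
import math

# Partial sums T_n(x) of the Maclaurin series of e^x, with the term
# x^k / k! built up incrementally from the previous term.
def T(n, x):
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # term is now x^k / k!
        total += term
    return total

x = 20.0
print(T(200, x), math.exp(x))
```

Even at $x = 20$ the factorial in the denominator eventually crushes $x^k$, just as $88^{n+1}/(n+1)! \to 0$ in the argument above.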
We can find an assortment of Taylor-Maclaurin series expansions and prove convergence of the series to the given function—as we see in the following result.
Thus we see that though Proposition 8.4.1 is a very nice result, there are times
(lots of times) that it does not apply.
Power Series In our discussion of Taylor and Maclaurin series above we started with a function $f$, used that function to generate a series, and then proved that the series converged to $f$ on some interval. There are times when we want to go approximately in the other direction. We begin with a series of functions, prove that the series is convergent and define a function to be the result of the convergent series. We begin with the following definition.
Definition 8.4.2 Consider the real sequence $\{a_k\}_{k=0}^{\infty}$. The series $\sum_{k=0}^{\infty} a_k (x - a)^k$ is said to be a power series about $x = a$.
We first note that we started the power series at $k = 0$. A power series can equally well start at $k = 1$ or any other particular value. It is very traditional to start power series at $k = 0$—and that's ok. There is a slight problem starting the power series at $k = 0$. The first term is then $a_0 x^0$, and we do want the power series to be well-defined at $x = 0$. And of course $x^0$ is not defined at $x = 0$. Thus we want to emphasize that we write $\sum_{k=0}^{\infty} a_k x^k$ and mean $a_0 + \sum_{k=1}^{\infty} a_k x^k$.
We should also note that power series are commonly defined for complex sequences of numbers. We restricted our power series to real coefficients because here we are interested in real functions and real series. Everything that we do can be generalized to complex power series. And finally, we will work with power series about $x = 0$—everything that we do can be translated to results about $x = a$.
Of course we see that a Taylor series expansion gives us a power series.
There are power series where it is not clear that they come from a Taylor series.
Power series appear in a variety of applications. One of the common reasons
for generating power series is when we find power series solutions to ordinary
differential equations—including the resulting power series that define Bessel’s
functions, hypergeometric functions and others.
Consider the following examples of power series.
Example 8.4.2 Discuss the convergence of the following power series:
(a) $\sum_{k=0}^{\infty} k! x^k$  (b) $\sum_{k=0}^{\infty} \frac{x^k}{k!}$  (c) $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$  (d) $\sum_{k=0}^{\infty} x^k$
Solution (a) Let $b_k = k! x^k$. Applying the ratio test to the power series (a) we see that $\left| \frac{b_{k+1}}{b_k} \right| = (k+1)|x| \to \infty$ as $k \to \infty$ if $x \neq 0$. Thus $\sum_{k=0}^{\infty} k! x^k$ does not converge for any $x \in \mathbb{R}$, $x \neq 0$. The series converges to $a_0$ for $x = 0$, i.e. series (a) converges on the set $\{0\}$.
(b) Let $b_k = x^k/k!$. Then $\left| \frac{b_{k+1}}{b_k} \right| = \frac{|x|}{k+1} \to 0$ as $k \to \infty$. Thus the series $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ converges absolutely for all $x \in \mathbb{R}$.
(c) Let $b_k = (-1)^k x^k/k$. Then $\left| \frac{b_{k+1}}{b_k} \right| = \frac{k}{k+1}|x| \to |x|$ as $k \to \infty$. Thus by the ratio test the series $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$ converges absolutely on $\{x \in \mathbb{R} : |x| < 1\} = (-1, 1)$ and does not converge on $\{x \in \mathbb{R} : |x| > 1\} = (-\infty, -1) \cup (1, \infty)$.
We see from the above example that the ratio test is a powerful tool that can be used to determine the convergence of power series. Also, we see that we get all of the different possibilities—convergence at one point, convergence on all of $\mathbb{R}$, and convergence on an interval, including the end points of the interval or not.
The first result concerning power series is a proposition that describes the
convergence of a power series—results that we sort of see from the previous
example.
Proposition 8.4.3 (a) If the power series $\sum_{k=0}^{\infty} a_k x^k$ converges for $x = x_0$ and $z$ is such that $|z| < |x_0|$, then $\sum_{k=0}^{\infty} a_k x^k$ converges absolutely for $x = z$.
(b) If the power series $\sum_{k=0}^{\infty} a_k x^k$ does not converge at $x = x_0$ and $z$ is such that $|z| > |x_0|$, then $\sum_{k=0}^{\infty} a_k z^k$ does not converge.
Proof: (a) If $\sum_{k=0}^{\infty} a_k x_0^k$ converges, then $a_k x_0^k \to 0$ as $k \to \infty$. Choosing $\epsilon = 1$ we know that there exists $N \in \mathbb{R}$ such that $k > N$ implies that $|a_k x_0^k| < 1$. If $z$ is such that $|z| < |x_0|$, then
\[
|a_k z^k| = |a_k x_0^k| \left| \frac{z}{x_0} \right|^k < \left| \frac{z}{x_0} \right|^k
\]
for $k > N$. Since $\left| \frac{z}{x_0} \right| < 1$, the series $\sum_{k=0}^{\infty} \left| \frac{z}{x_0} \right|^k$ is a convergent geometric series. By the comparison test, Proposition 8.3.8, the series $\sum_{k=N+1}^{\infty} |a_k z^k|$ is convergent, i.e. the series $\sum_{k=N+1}^{\infty} a_k z^k$ and hence the series $\sum_{k=0}^{\infty} a_k z^k$ are absolutely convergent.
(b) Suppose the statement is false, i.e. suppose the series $\sum_{k=0}^{\infty} a_k z^k$ converges. Then since $|x_0| < |z|$, by part (a) of this result we know that $\sum_{k=0}^{\infty} a_k x_0^k$ converges absolutely. This is a contradiction to the hypothesis. Therefore the series $\sum_{k=0}^{\infty} a_k z^k$ does not converge.
We want to be able to describe (somewhat) the set of convergence of a power
series as we found in Example 8.4.2. We make the following definition.
Definition 8.4.4 Define the radius of convergence $R$ of a power series $\sum_{k=0}^{\infty} a_k x^k$ as
\[
R = \operatorname{lub}\left\{ y \in \mathbb{R} : \sum_{k=0}^{\infty} a_k y^k \text{ is absolutely convergent} \right\}.
\]
Example 8.4.2-(c) shows that if $|x| = R$, the series may converge conditionally, or not at all. HW 8.4.2-(a) shows that it is possible that the series may converge absolutely when $|x| = R$.
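Definition 8.4.4 can be probed numerically. Using series (c) of Example 8.4.2, where $R = 1$, the sketch below (our illustration only; a finite partial sum cannot prove convergence or divergence) looks at partial sums of $\sum_k |y|^k/k$ at one $y$ inside the radius and one outside:

```python
# Probing the lub in Definition 8.4.4 for a_k = (-1)^k / k  (R = 1):
# partial sums of sum_{k>=1} |y|^k / k stay bounded for |y| < 1 and
# grow without bound for |y| > 1.
def abs_partial_sum(y, n):
    return sum(abs(y) ** k / k for k in range(1, n + 1))

inside = abs_partial_sum(0.9, 2000)    # settles near -ln(0.1) ~ 2.303
outside = abs_partial_sum(1.1, 2000)   # already enormous at n = 2000
print(inside, outside)
```

The set of $y$'s giving absolute convergence is an interval about 0, which is why a single least upper bound describes it.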
We see that for a given power series, the series will converge (absolutely) for $|x| < R$, not converge for $|x| > R$ and may, or may not, converge for $|x| = R$. When the series converges, we want to use the power series to define a function on the domain $\{x \in \mathbb{R} : |x| < R\}$—or maybe a bit more, we may want to include $x = \pm R$. It's not hard to see that wherever the series converges, we can define a function $f(x) = \sum_{k=0}^{\infty} a_k x^k$. As we always do in calculus once we have a new function, we ask the question of whether the function is continuous, differentiable and/or integrable. The most obvious approach is to hope that because $a_k x^k$ is continuous for each $k$ (so $\sum_{k=0}^{n} a_k x^k$ is continuous for any $n$), then $\sum_{k=0}^{\infty} a_k x^k$ will be continuous. And, since $a_k x^k$ is differentiable for each $k$ (so $\sum_{k=0}^{n} a_k x^k$ is differentiable for any $n$), then $\sum_{k=0}^{\infty} a_k x^k$ will be differentiable and
\[
\left( \sum_{k=0}^{\infty} a_k x^k \right)' = \sum_{k=0}^{\infty} \left( a_k x^k \right)'.
\]
And, finally, since $a_k x^k$ is integrable for each $k$ (so $\sum_{k=0}^{n} a_k x^k$ is integrable for any $n$), then $\sum_{k=0}^{\infty} a_k x^k$ will be integrable and
\[
\int_a^b \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} \int_a^b a_k x^k.
\]
The fact is that at this time we cannot make these claims or answer these
questions. Pointwise convergence is not enough. We mentioned earlier that
there were other kinds of convergence. In the next section we will give ourselves
the necessary structure to answer these questions.
(d) The radius of convergence of the power series $\sum_{k=1}^{\infty} k^k x^k$ is $R = e$.
(e) Suppose for all $k \in \mathbb{N}$, $|a_k| < 1$. Then the power series $\sum_{k=1}^{\infty} a_k x^k$ converges on $\mathbb{R}$.
HW 8.4.3 Discuss the convergence of the power series $\sum_{k=1}^{\infty} (\sin k) x^k$.
HW 8.4.4 Compute the Maclaurin series of f (x) = sin x. Determine for which
values of x the series converges. Prove or disprove that when the series con-
verges, it converges to f (x) = sin x.
$f_2(x) = 0$. We showed that $f_{2n} \to f_2$ pointwise on $[0, 1]$. Consider the related example.
Example 8.5.2 Suppose $f_n, f : D \to \mathbb{R}$, $D \subset \mathbb{R}$, and $f_n \to f$ pointwise on $D$. Suppose that $f_n$ and $f$ are differentiable on $D$ for all $n \in \mathbb{N}$. Suppose also that the sequence of derivatives $\{f_n'\}$ converges pointwise on $D$ to $f^*$. Show that it need not be the case that $f' = f^*$, i.e. show that the derivative of the limit need not be the limit of the derivatives.
Solution: Obviously we want to consider the sequence $\{f_{2n}\}$ and limiting function $f_2$. Clearly each function $f_{2n}$ and $f_2$ is differentiable, and $f_{2n}'(x) = x^{n-1}$ and $f_2'(x) = 0$ for all $x \in [0, 1]$. We know from Example 8.2.1 that $f_{2n}' \to f_1$ pointwise on $[0, 1]$ where $f_1$ is as was defined in Examples 8.2.1 and 8.5.1. And clearly, $f_1 \neq f_2'$ on $[0, 1]$ (they differ at $x = 1$). Therefore, the limit of the derivatives need not be equal to the derivative of the limit.
Before we get to work we include one more traditional example that shows
the inadequacy of pointwise convergence. Consider the following example.
Example 8.5.3 Define $f_{4n}, f_4 : [0, 2] \to \mathbb{R}$ by $f_4(x) = 0$ for $x \in [0, 2]$ and
\[
f_{4n}(x) = \begin{cases} n^2 x & x \in [0, 1/n] \\ n - n^2(x - 1/n) & x \in [1/n, 2/n] \\ 0 & \text{elsewhere in } [0, 2] \end{cases}
\]
(the Teepee function that goes from $(0, 0)$, to $(1/n, n)$, to $(2/n, 0)$). Show that $f_{4n}(x) \to f_4(x)$ for all $x \in [0, 2]$ and that $\lim_{n \to \infty} \int_0^2 f_{4n} \neq \int_0^2 f_4$, i.e. the limit of the integrals is not equal to the integral of the limit.
Solution: If we choose any $x \in (0, 2]$ it is easy to see that there is an $N$ so that $n > N$ implies $f_{4n}(x) = 0$ (choose $N$ such that $N > 2/x$). Thus $f_{4n}(x) \to 0 = f_4(x)$ as $n \to \infty$. By definition $f_{4n}(0) = 0$ for all $n$. Thus $f_{4n}(0) \to 0 = f_4(0)$ as $n \to \infty$. Therefore $f_{4n} \to f_4$ pointwise.
Since $\int_0^2 f_{4n}$ is the area under the Teepee, $\int_0^2 f_{4n} = \frac{1}{2} \cdot \frac{2}{n} \cdot n = 1$ for all $n$. And clearly $\int_0^2 f_4 = 0$. Thus
\[
\lim_{n \to \infty} \int_0^2 f_{4n} = \lim_{n \to \infty} 1 \neq \int_0^2 f_4 = 0.
\]
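A numerical version of this example (our sketch; the midpoint rule stands in for the exact integral) shows the pointwise limit 0 coexisting with integrals equal to 1:

```python
# The Teepee functions: pointwise limit 0 on [0, 2], yet the integral
# over [0, 2] equals 1 for every n (triangle of base 2/n and height n).
def f4n(n, x):
    if 0 <= x <= 1 / n:
        return n * n * x
    if 1 / n < x <= 2 / n:
        return n - n * n * (x - 1 / n)
    return 0.0

def integral(n, steps=100000):
    # midpoint rule on [0, 2]; essentially exact on a piecewise linear function
    h = 2 / steps
    return sum(f4n(n, (j + 0.5) * h) for j in range(steps)) * h

for n in (5, 50):
    print(n, f4n(n, 1.0), integral(n))
```

The mass of the Teepee escapes toward $x = 0$ while growing taller, which is exactly what uniform convergence will rule out.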
Thus we see that if we want such properties as (1) the limit of a sequence of
continuous functions is continuous, (2) the limit of the derivatives of a sequence
of functions is equal to the derivative of the limit of the sequence of functions
and (3) the integral of the limit of a sequence of functions is equal to the limit
of the integral of the sequence of functions, we need something stronger than
pointwise convergence. And these are the properties that we want for a variety
of reasons, including the fact that we want the continuity, differentiability and
integrability of power series. For this reason we make the following definition of
the uniform convergence of a sequence of functions.
Definition 8.5.1 Consider the sequence of functions $\{f_n\}$, $f_n : D \to \mathbb{R}$, and function $f : D \to \mathbb{R}$ for $D \subset \mathbb{R}$. The sequence $\{f_n\}$ is said to converge uniformly to $f$ on $D$ if for every $\epsilon > 0$ there exists an $N \in \mathbb{R}$ such that $n > N$ implies that $|f_n(x) - f(x)| < \epsilon$ for all $x \in D$.
The emphasis in the above definition is that the $N$ that is provided must work for all $x \in D$. We see in Figure 8.5.1 that we have drawn an $\epsilon$-neighborhood about the function $f$. The definition of uniform convergence requires that for $n > N$, all functions $f_n$ must be entirely within the $\epsilon$-tube around $f$. Consider the following examples.
Example 8.5.4 Consider $\{f_{2n}\}$, $f_2$ defined just prior to Example 8.5.2 and also in Example 8.2.2. Prove that $f_{2n} \to f_2$ uniformly on $[0, 1]$.
Solution: We suppose that we are given an $\epsilon > 0$. We must find an $N$ that must work for all $x \in [0, 1]$. If you know what the plots of the various $f_{2n}$ look like—or if you plot a few of these—you realize that the sequence $\{f_{2n}\}$ converges to $f_2$ the most slowly at $x = 1$, i.e. it is the worst point. Thus we consider the convergence of the sequence $\{f_{2n}(1)\}$ to 0. As we did in our study of sequences, we need $\left| \frac{1}{n} - 0 \right| = \frac{1}{n} < \epsilon$. Thus we see that if we choose $N = 1/\epsilon$, then $n > N$ implies $\left| \frac{1}{n} - 0 \right| = \frac{1}{n} < \frac{1}{N} = \epsilon$. Therefore $\lim_{n \to \infty} f_{2n}(1) = \lim_{n \to \infty} \frac{1}{n} = 0$.
But more importantly, we now consider the sequence $\{f_{2n}\}$ and $f_2$. If $n > N$, then
\[
|f_{2n}(x) - f_2(x)| = \left| \frac{x^n}{n} - 0 \right| = \frac{x^n}{n} \leq \frac{1}{n} < \frac{1}{N} = \epsilon.
\]
Notice that this sequence of inequalities holds for all $x \in [0, 1]$. Therefore $f_{2n} \to f_2$ uniformly.
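The contrast between this uniform convergence and the merely pointwise convergence of $\{f_{1n}\}$ can be seen by computing sup-distances on a grid (a numerical sketch of ours only; a grid maximum is not a true supremum):

```python
# Grid approximation of sup |f_n(x) - f(x)| on [0, 1].
grid = [j / 1000 for j in range(1001)]

def sup_dist_f2(n):
    # f_2n(x) = x^n / n with limit f_2 = 0: the sup distance is 1/n
    return max(x ** n / n for x in grid)

def sup_dist_f1(n):
    # f_1n(x) = x^n with pointwise limit 0 on [0, 1) and 1 at x = 1;
    # away from x = 1 the distance to the limit is x^n itself
    return max(x ** n for x in grid if x < 1)

print([sup_dist_f2(n) for n in (10, 100, 1000)])
print([sup_dist_f1(n) for n in (10, 100, 1000)])
```

The first list of sup-distances marches to 0 like $1/n$; the second stays bounded away from 0 because the grid points near $x = 1$ refuse to settle, which is exactly the failure of uniform convergence.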
[Figure 8.5.1: the $\epsilon$-tube about the function $f$.]
Proof: Consider some $x_0 \in D$ and suppose that we are given an $\epsilon > 0$. (We must find a $\delta$ such that $|x - x_0| < \delta$ implies that $|f(x) - f(x_0)| < \epsilon$.)
Since $f_n \to f$ uniformly on $D$, we know that there exists $N \in \mathbb{R}$ such that $n > N$ implies $|f(y) - f_n(y)| < \epsilon/3$ for all $y \in D$ (so it would hold for $x_0 \in D$ and any $x \in D$ also). Choose some particular $n_0 > N$. Then we know that $f_{n_0}$ is continuous on $D$ so there exists a $\delta$ such that $|x - x_0| < \delta$ and $x \in D$ implies that $|f_{n_0}(x) - f_{n_0}(x_0)| < \epsilon/3$. Then $|x - x_0| < \delta$ and $x \in D$ implies that
\begin{align*}
|f(x) - f(x_0)| &= |(f(x) - f_{n_0}(x)) + (f_{n_0}(x) - f_{n_0}(x_0)) + (f_{n_0}(x_0) - f(x_0))| \\
&\leq^* |f(x) - f_{n_0}(x)| + |f_{n_0}(x) - f_{n_0}(x_0)| + |f_{n_0}(x_0) - f(x_0)| \\
&< \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon
\end{align*}
where inequality “$\leq^*$” follows from two applications of the triangular inequality, Proposition 1.5.8-(v). Therefore $f$ is continuous at $x_0$—for any $x_0 \in D$, so $f$ is continuous on $D$.
If we return to Example 8.5.4, we see that since the functions $f_{2n}$ are continuous for all $n$ and the sequence $\{f_{2n}\}$ converges uniformly to $f_2$, by Proposition 8.5.2 we know that the function $f_2$ is continuous—but that's pretty easy since we know that $f_2(x) = 0$ for $x \in [0, 1]$.
Next consider the sequence of functions $\{f_{1n}\}$ and the function $f_1$ used in Example 8.5.1 (and also in Example 8.2.1). By Proposition 8.5.2 and the fact that $f_1$ is not continuous, we then know that the sequence of functions $\{f_{1n}\}$ does not converge uniformly.
The next result that we consider is the interaction of uniform convergence
and integration—see Example 8.5.3.
$\int_0^2 f_4 = 0$.
And finally, we consider our last result involving uniform convergence. Sup-
pose that fn → f —some sort of convergence. There are many times that we
would like to be able to obtain the derivative of f by taking the limit of the
sequence of derivatives {fn0 }. We state the following proposition.
Proposition 8.5.4 Consider the sequence of functions $\{f_n\}$, $f_n : [a, b] \to \mathbb{R}$, $a < b$ and $n \in \mathbb{N}$, where each $f_n$ is continuously differentiable on $[a, b]$. Suppose there exists some $x_0 \in [a, b]$ such that $\{f_n(x_0)\}$ converges and the sequence of functions $\{f_n'\}$ converges uniformly on $[a, b]$. Then the sequence of functions $\{f_n\}$ converges uniformly on $[a, b]$, say to the function $f$, the function $f$ is differentiable on $[a, b]$ and $f'(x) = \lim_{n \to \infty} f_n'(x)$ for all $x \in [a, b]$.
Proof: Let $g$ be such that $f_n' \to g$ uniformly. Since each $f_n'$ is continuous on $[a, b]$ and $f_n' \to g$ uniformly, $g$ is continuous on $[a, b]$. Consider the sequence $\{f_n'\}$ on $[x_0, b]$. Clearly the sequence $\{f_n'\}$ converges uniformly to $g$ on $[x_0, b]$. Then by Proposition 8.5.3-(a)
\[
\lim_{n \to \infty} \int_{x_0}^x f_n' = \int_{x_0}^x g, \tag{8.5.2}
\]
and the convergence of $\left\{ \int_{x_0}^x f_n' \right\}$ to $\int_{x_0}^x g$ is uniform on $[x_0, b]$. Also by the Fundamental Theorem, Theorem 7.5.4,
\[
\int_{x_0}^x f_n' = f_n(x) - f_n(x_0). \tag{8.5.3}
\]
Combining equations (8.5.2) and (8.5.3) gives $\lim_{n \to \infty} [f_n(x) - f_n(x_0)] = \int_{x_0}^x g$. Since we know that $\lim_{n \to \infty} f_n(x_0)$ exists, we can add $\lim_{n \to \infty} f_n(x_0)$ to the last limit. Thus the sequence $\{f_n(x)\}$ converges for each $x \in [x_0, b]$. Because the convergence of $\left\{ \int_{x_0}^x f_n' \right\}$ to $\int_{x_0}^x g$ is uniform, the convergence of $\{f_n(x)\}$ is uniform on $[x_0, b]$. Denote this limit by $f$, i.e. $f(x) = \int_{x_0}^x g + \lim_{n \to \infty} f_n(x_0)$.
By Proposition 7.5.2 we see that $f$ is differentiable and $f'(x) = g(x)$ (the derivative of the limit term is zero), i.e. $f'(x) = \lim_{n \to \infty} f_n'(x)$ for $x \in [x_0, b]$.
If we essentially repeat the above proof, this time applying Proposition 8.5.3-(b) instead of part (a) (when we got equation (8.5.2)), we find that $f'(x) = \lim_{n \to \infty} f_n'(x)$ for $x \in [a, x_0]$. If we combine these results, we get the desired result on $[a, b]$.
Thus we see by Propositions 8.5.2-8.5.4 that if we want the limit of a sequence of functions to inherit certain properties of the sequence, we need uniform convergence.
Earlier we used Proposition 8.5.2 to show that the sequence {f1n } does not
converge uniformly and Proposition 8.5.3 to show that the sequence of functions
{f4n } does not converge uniformly. The proofs are completely rigorous but it’s
sort of cheating.
Of course the fact that these sequences do not converge uniformly can be proved using the definition of uniform convergence, Definition 8.5.1. To prove that a sequence does not converge uniformly we must show that there is at least one $\epsilon > 0$ so that for all $N \in \mathbb{R}$ there will be an $n > N$ and at least one $x_0 \in D$ for which $|f_n(x_0) - f(x_0)| \geq \epsilon$.
For example consider $\{f_{4n}\}$ and choose $\epsilon = 1/2$. The maximum value of $|f_{4n}(x) - f_4(x)|$ occurs at $x = 1/n$ and equals $n$ for every $n$. For every $N \in \mathbb{R}$ and any $n > N$, $x_0 = 1/n \in [0, 2]$ is a point such that $|f_{4n}(x_0) - f_4(x_0)| = n \geq 1 > \epsilon$. Therefore the sequence $\{f_{4n}\}$ does not converge uniformly to $f_4$.
Likewise, if we next consider the sequence $\{f_{1n}\}$ and choose $\epsilon = 1/2$, for any $N \in \mathbb{R}$ we must find an $n > N$ and $x_0 \in [0, 1]$ so that $|f_{1n}(x_0) - f_1(x_0)| \geq \epsilon$. If you plot the function $y = x^n$ for a few $n$'s, you will see that the point is going to occur near $x = 1$ (but surely not at $x = 1$). Let $N \in \mathbb{R}$ (any such $N$) and suppose $n$ is any integer greater than $N$. We need to find $x_0 < 1$ such that $|x_0^n - 0| = x_0^n \geq 1/2$, or taking the $n$-th root of both sides (realizing that the $n$-th root function is increasing) gives $x_0 \geq \sqrt[n]{1/2}$. So we could surely choose $x_0 = \left(1 + \sqrt[n]{1/2}\right)/2$ and see that $f_{1n} \not\to f_1$ uniformly on $[0, 1]$.
We notice that proving that a sequence of functions does not converge uni-
formly is not easy—but generally showing that any type of limit does not exist
is not easy.
Before we leave we include one more approach to proving uniform conver-
gence. Recall that when we studied convergence of sequences, we included the
Hopefully it is clear that, as with the Cauchy criterion for sequences, the advantage of using the uniform Cauchy criterion comes when you really don't know the limiting function. Also, as was the case with the Cauchy criterion for sequences, our major application of the uniform Cauchy criterion will be to show uniform convergence of series. We do need the convergence result—analogous to Proposition 3.4.11.
Proposition 8.5.6 Consider a sequence of functions $\{f_n\}$, $f_n : D \to \mathbb{R}$ for $D \subset \mathbb{R}$. The sequence $\{f_n\}$ converges uniformly on $D$ to some function $f$, $f : D \to \mathbb{R}$, if and only if the sequence is a uniformly Cauchy sequence.
\[
|f_n(x) - f_m(x)| = |(f_n(x) - f(x)) + (f(x) - f_m(x))| \leq |f_n(x) - f(x)| + |f(x) - f_m(x)| < \epsilon/2 + \epsilon/2 = \epsilon
\]
Definition 8.6.1 Consider the sequence of functions $\{f_i(x)\}$ where for each $i \in \mathbb{N}$, $f_i : D \to \mathbb{R}$, $D \subset \mathbb{R}$. If the sequence of partial sums, $\{s_n(x)\}$, converges uniformly on $D$, say to $s(x)$, then the series $\sum_{i=1}^{\infty} f_i(x)$ converges uniformly to $s(x)$.
Earlier we saw that methods for proving convergence of sequences were not especially useful for proving convergence of series. Likewise, the methods of proving uniform convergence of sequences of functions aren't very useful for proving the uniform convergence of a series of functions. There is one excellent result that we will use, the Weierstrass test for uniform convergence. The Weierstrass test is to uniform convergence of series of functions what the comparison test is to convergence of real series. For that reason, before we state and prove the Weierstrass Theorem, we prove the following proposition, which also defines the uniform Cauchy criterion for a series.
Proof: This result follows directly from Proposition 8.5.6. Let $s_n(x) = \sum_{i=1}^{n} f_i(x)$. The series $\sum_{i=1}^{\infty} f_i(x)$ converges uniformly if and only if the sequence $\{s_n\}$ converges uniformly. Also $s_m(x) - s_{n-1}(x) = \sum_{i=n}^{m} f_i(x)$. Thus the sequence $\{s_n\}$ is a uniform Cauchy sequence if and only if the sequence $\{f_i\}$ satisfies: for every $\epsilon > 0$ there exists $N \in \mathbb{R}$ such that $m, n \in \mathbb{N}$, $m \geq n$ and $m, n > N$ implies that $\left| \sum_{i=n}^{m} f_i(x) \right| < \epsilon$ for all $x \in D$.
The result then follows from Proposition 8.5.6. (Again you should realize that replacing $n$ by $n - 1$ in the Cauchy criterion for $\{s_n\}$ does not cause any problem.)
We now proceed with the following theorem.
series $\sum_{i=1}^{\infty} f_i(x)$ converges absolutely for each $x \in D$.
Define $s_n$, $s$ and $m_n$ by $s_n(x) = \sum_{i=1}^{n} f_i(x)$, $s(x) = \sum_{i=1}^{\infty} f_i(x)$ and $m_n = \sum_{i=1}^{n} M_i$. Let $\epsilon > 0$ be given. Since the series $\sum_{i=1}^{\infty} M_i$ converges, we know by Proposition 8.2.3 that the series satisfies the Cauchy criterion, i.e. there exists $N \in \mathbb{R}$ such that $m, n \in \mathbb{N}$, $m \geq n$ and $m, n > N$ implies that $\sum_{i=n}^{m} M_i < \epsilon$.
Suppose $m, n > N$ and $m \geq n$. Then
\[
|s_m(x) - s_{n-1}(x)| = \left| \sum_{i=n}^{m} f_i(x) \right| \leq^* \sum_{i=n}^{m} |f_i(x)| \leq \sum_{i=n}^{m} M_i < \epsilon
\]
Proposition 8.6.4 Suppose that the radius of convergence of the power series $\sum_{i=0}^{\infty} a_i x^i$ is $R$ and $R_0$ is any value such that $0 < R_0 < R$. Then the power series converges uniformly on $[-R_0, R_0]$.
Proof: Let $r$ be some value such that $R_0 < r < R$. Then by the definition of radius of convergence, Definition 8.4.4, the power series $\sum_{i=0}^{\infty} a_i r^i$ converges absolutely. For any $x \in [-R_0, R_0]$, $|a_k x^k| \leq |a_k r^k|$. By the Weierstrass Theorem the power series $\sum_{i=0}^{\infty} a_i x^i$ converges uniformly on $[-R_0, R_0]$.
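Here is a numerical sketch of the Weierstrass bound in this proof, for the concrete series $\sum_k x^k/k!$ (our choice of example) with $R_0 = 2$: on $[-R_0, R_0]$ each tail of the series is dominated by the corresponding tail of $\sum_k M_k$ with $M_k = R_0^k/k!$.

```python
import math

# Tail sum_{k=N}^{N+59} |x|^k / k! over a grid of x in [-R0, R0],
# compared with the dominating tail of M_k = R0^k / k!.
R0, N = 2.0, 20

def tail(x):
    return sum(abs(x) ** k / math.factorial(k) for k in range(N, N + 60))

grid = [-R0 + 2 * R0 * j / 400 for j in range(401)]
tail_sup = max(tail(x) for x in grid)
M_tail = sum(R0 ** k / math.factorial(k) for k in range(N, N + 60))
print(tail_sup, M_tail)
```

One bound controls the tail at every $x$ simultaneously; that single, $x$-independent bound is exactly what uniform convergence requires.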
You should realize that the above result shows us that power series are very nice. When they do converge, they just about always converge uniformly—except possibly at $\pm R$. Since we know that the series may not converge at $\pm R$, we cannot make a stronger statement. That is surely nicer than most sequences and series.
We are now ready to apply the fact that power series converge uniformly to obtain the properties we developed for sequences in the last section. The first is very easy. Since each of the terms in the power series is continuous and the convergence is uniform on any closed interval contained in $(-R, R)$, we obtain continuity on the interval $[-R_0, R_0]$ for any $0 < R_0 < R$.
Proposition 8.6.5 Consider the power series $\sum_{i=0}^{\infty} a_i x^i$ with radius of convergence $R$. The function $f : (-R, R) \to \mathbb{R}$ defined by $f(x) = \sum_{i=0}^{\infty} a_i x^i$ is continuous on $(-R, R)$.
\[
\lim_{i \to \infty} \frac{A_i}{B_i} = \lim_{i \to \infty} \frac{|i a_i x^{i-1}|}{|a_i r^i|} = \lim_{i \to \infty} \frac{i}{r} \left| \frac{x}{r} \right|^{i-1} = 0.
\]
(To see that this last limit is zero let $\rho$ be such that $0 < \rho < 1$ and consider the limit $\lim_{y \to \infty} y \rho^y$. Write this limit as $\lim_{y \to \infty} \frac{y}{\rho^{-y}}$ and apply L'Hospital's rule, the $x \to \infty$ version of Proposition 6.4.4-(c), or HW 6.4.1-(d).)
Thus by the limit comparison test the series $\sum_{i=1}^{\infty} i a_i x^{i-1}$ converges absolutely for $|x| < r$ for any $r$, $0 < r < R$—and since this holds true for any $r < R$, the radius of convergence of $\sum_{i=1}^{\infty} i a_i x^{i-1}$ is at least $R$.
Therefore we know that the series $\sum_{i=1}^{\infty} i a_i x^{i-1}$ converges uniformly on $[-R_0, R_0]$ for any $R_0 < R$, and by Proposition 8.5.4 we know that the function $f$ is differentiable and $f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}$. And since this is true for any $R_0$, $0 < R_0 < R$, the function $f$ is differentiable on $(-R, R)$.
Notice that we do not prove that the radius of convergence of the differen-
tiated power series is R—we proved that it was at least R. We will come back
in Proposition 8.6.10 and show that the radius of convergence of the derivative
power series is actually R.
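Term-by-term differentiation can be sanity-checked numerically. In the sketch below (our example, using the familiar series $\sum_k x^k/k!$), the differentiated partial sum agrees with a centered-difference derivative of the original partial sum:

```python
import math

# f_n(x)  = sum_{k=0}^{n} x^k / k!        (partial sum of the power series)
# f_n'(x) = sum_{k=1}^{n} k x^{k-1} / k!  (term-by-term derivative)
def partial(n, x):
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def dpartial(n, x):
    return sum(k * x ** (k - 1) / math.factorial(k) for k in range(1, n + 1))

x, h = 1.5, 1e-6
# centered difference approximation to the derivative of the partial sum
numeric = (partial(40, x + h) - partial(40, x - h)) / (2 * h)
print(numeric, dpartial(40, x), math.exp(x))
```

For this particular series the differentiated series reproduces the original one shifted by an index, which is one way of seeing that $(e^x)' = e^x$.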
Once we know that we can always differentiate a power series and that the
derivative series converges too, we know that we can do it again. We obtain the
following corollary.
Corollary 8.6.7 Consider the power series $\sum_{i=0}^{\infty} a_i x^i$ with radius of convergence $R$. The function $f : (-R, R) \to \mathbb{R}$ defined by $f(x) = \sum_{i=0}^{\infty} a_i x^i$ has derivatives of all orders on $(-R, R)$ and $a_i = f^{(i)}(0)/i!$.
Of course the above result follows from applying Proposition 8.6.6 many times and evaluating the result at $x = 0$ (which we can do by Proposition 8.6.5). We should realize that the above proposition gives us the very nice result that when a power series converges, so that it defines a function $f$, the power series is the Maclaurin series for the function $f$. Using this result we also obtain the following corollary.
Corollary 8.6.8 Consider the power series $\sum_{i=0}^{\infty} a_i x^i$ and $\sum_{i=0}^{\infty} b_i x^i$, both of which converge on $(-r, r)$ for some $r$, $0 < r$. If $\sum_{i=0}^{\infty} a_i x^i = \sum_{i=0}^{\infty} b_i x^i$ for $x \in (-r, r)$, then $a_k = b_k$ for $k = 0, 1, 2, \cdots$.
These are very nice results when it comes to differentiating power series. In the next result we see that we obtain an analogous result concerning integration of power series.
Proposition 8.6.9 Consider the power series $\sum_{i=0}^{\infty} a_i x^i$ with radius of convergence $R$. The function $f : (-R, R) \to \mathbb{R}$ defined by $f(x) = \sum_{i=0}^{\infty} a_i x^i$ is integrable on any closed interval contained in $(-R, R)$, for any $x \in (-R, R)$
\[
\int_0^x f = \sum_{i=0}^{\infty} \frac{a_i}{i+1} x^{i+1},
\]
and the radius of convergence of the series for $\int_0^x f$ is $R$.
Proof: Let $x$ be such that $x \in (-R, R)$ and suppose that $|x| < R_0 < R$. If we use the fact that the power series $\sum_{i=0}^{\infty} a_i x^i$ converges uniformly on $[-R_0, R_0]$, Proposition 8.5.3 implies that
\[
\int_0^x f = \lim_{n \to \infty} \int_0^x \sum_{i=0}^{n} a_i x^i = \lim_{n \to \infty} \sum_{i=0}^{n} \frac{a_i}{i+1} x^{i+1}
\]
converges, which gives the desired result. We note that since this is true for any $x \in (-R, R)$, the radius of convergence of $\sum_{i=0}^{\infty} \frac{a_i}{i+1} x^{i+1}$ is at least $R$.
Suppose now that the radius of convergence of the power series $\sum_{i=0}^{\infty} \frac{a_i}{i+1} x^{i+1}$ were greater than $R$, say $R^* > R$. Because the derivative of the series $\sum_{i=0}^{\infty} \frac{a_i}{i+1} x^{i+1}$ is $\sum_{i=0}^{\infty} a_i x^i$, Proposition 8.6.6 implies that the radius of convergence of $\sum_{i=0}^{\infty} a_i x^i$ is at least $R^*$. But this is a contradiction to the hypotheses of the proposition. Thus the radius of convergence of the power series $\sum_{i=0}^{\infty} \frac{a_i}{i+1} x^{i+1}$ is $R$.
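A small numerical sketch of term-by-term integration (our example, not the text's): integrating the geometric series $\sum_{i \geq 0} x^i = 1/(1-x)$ term by term gives $\sum_{i \geq 0} x^{i+1}/(i+1)$, which should match $\int_0^x \frac{dt}{1-t} = -\ln(1-x)$ on $(-1, 1)$.

```python
import math

# Term-by-term integral of the geometric series on (-1, 1):
# sum_{i=0}^{n} x^{i+1} / (i+1)  should approach  -ln(1 - x).
def integrated_series(n, x):
    return sum(x ** (i + 1) / (i + 1) for i in range(n + 1))

x = 0.5
print(integrated_series(200, x), -math.log(1 - x))
```

Note that both series have the same radius of convergence $R = 1$, just as Proposition 8.6.9 asserts, even though the integrated series happens to pick up convergence at the endpoint $x = -1$.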
Notice that we were careful in the last part of the above proof and used the fact that the radius of convergence of the differentiated power series is at least as large as the radius of convergence of the original series. That's all we had and it was all we needed. If we return to Proposition 8.6.6, use the same type of argument used in the last part of the above proof, the fact that the integral of the series $\sum_{i=1}^{\infty} i a_i x^{i-1}$ is $\sum_{i=1}^{\infty} a_i x^i$ and Proposition 8.6.9, we obtain the following result—surely the radius of convergence doesn't care about the constant term.
Proposition 8.6.10 Consider the power series $\sum_{i=0}^{\infty} a_i x^i$ with radius of convergence $R$. The radius of convergence of the series $\sum_{i=1}^{\infty} i a_i x^{i-1}$ is $R$.
Since $\left( \frac{t - x}{1 + t} \right)^n \frac{1}{1 + t} \geq 0$, we can apply the Mean Value Theorem for Integrals, Theorem 7.5.8, to get
\[
|R_n(x)| = \int_x^0 \left( \frac{t - x}{1 + t} \right)^n \frac{1}{1 + t} \, dt = \left( \frac{t_n - x}{1 + t_n} \right)^n \frac{1}{1 + t_n} \int_x^0 dt = \left( \frac{t_n - x}{1 + t_n} \right)^n \frac{1}{1 + t_n} (-x)
\]
where $t_n \in [x, 0]$—emphasize that $t_n$ does depend on $n$.
Since $x \leq t_n \leq 0$ implies that $1 + x \leq 1 + t_n \leq 1$ and $0 < 1 + x$, and $|x| = -x$, we see that
\[
\left( \frac{t_n - x}{1 + t_n} \right)^n \frac{1}{1 + t_n} (-x) \leq \left( \frac{t_n + |x|}{1 + t_n} \right)^n \frac{|x|}{1 + x} < \left( \frac{t_n |x| + |x|}{1 + t_n} \right)^n \frac{|x|}{1 + x} = \frac{|x|^{n+1}}{1 + x}.
\]
Thus $|R_n(x)| < \frac{|x|^{n+1}}{1 + x} \to 0$ as $n \to \infty$.
Therefore in both cases $R_n(x) \to 0$ as $n \to \infty$, so $f(x) = \ln(x + 1) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$ for $x \in (-1, 1]$.
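The convergence just established can be observed numerically. The following sketch of ours compares partial sums of the series with $\ln(1 + x)$ at a few points of $(-1, 1]$; convergence at the endpoint $x = 1$ is noticeably slower, consistent with the remainder estimates above degrading near the ends of the interval:

```python
import math

# Partial sums of sum_{k=1}^{n} (-1)^{k+1} x^k / k versus ln(1 + x).
def s(n, x):
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

for x in (-0.5, 0.5, 1.0):
    print(x, s(2000, x), math.log(1 + x))
```

At $x = 1$ this is the alternating harmonic series, so the error after $n$ terms is only on the order of $1/(n+1)$.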
One important fact that we should emphasize is that the series converges to $\ln(x + 1)$ on only $(-1, 1]$—we found as a part of Example 8.4.1 that the radius of convergence of the series $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$ is $R = 1$, i.e. the series diverges for $|x| > 1$. We notice that even though the function $\ln(x + 1)$ is defined for $x \in (-1, \infty)$, the Maclaurin series doesn't converge on the $(1, \infty)$ part of the domain. This is just a fact of power series—and a Maclaurin series is a power series—that they always converge only on a symmetric interval $(-R, R)$—and maybe the endpoints. We cannot do better.
In Proposition 8.4.1 we found a tool for proving that Taylor series-Maclaurin series converge to the function that generated the series. This result works well for a variety of functions, exp, sine, cosine, etc. We saw in Example 8.4.1 that Proposition 8.4.1 will not work for all functions—specifically for $f(x) = \ln(x + 1)$. We were able to prove that the Maclaurin series for $\ln(x + 1)$ will converge to $\ln(x + 1)$ on $(-1, 1]$, but we really used ad hoc methods—methods that will not necessarily carry over to other examples. There are no methods that will work for all Taylor series-Maclaurin series. To illustrate how bad it can really be, consider the following very important example.
Example 8.6.2 Consider the function
$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$
Find the Maclaurin series of $f$, if it exists, and determine for which values of $x$ the series converges, and for which values of $x$ the series converges to $f(x)$.
Solution: To determine the Maclaurin series of $f$ we begin by computing the derivatives at $x = 0$. Note that each of the equalities "$\stackrel{*}{=}$" follows by L'Hospital's rule.
$$f'(0) = \lim_{h\to 0} \frac{f(h)-f(0)}{h-0} = \lim_{h\to 0} \frac{e^{-1/h^2}}{h} = \lim_{h\to 0} \frac{h^{-1}}{e^{1/h^2}} \stackrel{*}{=} \lim_{h\to 0} \frac{-h^{-2}}{-2h^{-3}\,e^{1/h^2}} = \frac{1}{2}\lim_{h\to 0} \frac{h}{e^{1/h^2}} = 0$$
$$f''(0) = \lim_{h\to 0} \frac{f'(h)-f'(0)}{h-0} = \lim_{h\to 0} \frac{2h^{-3}e^{-1/h^2}}{h} = 2\lim_{h\to 0} \frac{h^{-4}}{e^{1/h^2}} \stackrel{*}{=} 2\lim_{h\to 0} \frac{-4h^{-5}}{-2h^{-3}\,e^{1/h^2}}$$
$$= 4\lim_{h\to 0} \frac{h^{-2}}{e^{1/h^2}} \stackrel{*}{=} 4\lim_{h\to 0} \frac{-2h^{-3}}{-2h^{-3}\,e^{1/h^2}} = 4\lim_{h\to 0} \frac{1}{e^{1/h^2}} = 0$$
$$f'''(0) = \lim_{h\to 0}\left(-6\,\frac{e^{-1/h^2}}{h^5} + 4\,\frac{e^{-1/h^2}}{h^7}\right) = \cdots = 0$$
We don't know how you feel about it but we're tired of these computations by now. It should be reasonably clear that all derivatives of $f$ evaluated at $x = 0$ will involve one or more limits of the form $\lim_{h\to 0} h^{-k} e^{-1/h^2}$. Hopefully the above computations convince you that all of these limits can be computed and are equal to zero. How would we prove this? To prove that the particular limits $\lim_{h\to 0} h^{-k} e^{-1/h^2}$ are zero we must use mathematical induction (for even and odd $k$ separately), but we don't really want to do that here.
Thus we see that $f^{(k)}(0) = 0$ for $k = 0, 1, 2, \ldots$. Hence the Maclaurin series expansion $\sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$ exists and is the identically zero series, and we see $f(x) \ne \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$ for all $x \ne 0$.
The function used in this example is clearly a non-standard function. Plot it in various neighborhoods of $x = 0$ to see what it looks like. However, the example does show that if you compute a Maclaurin series or Taylor series expansion, you do not necessarily get the original function back again.
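A brief numerical look at this function (purely illustrative) shows why every Maclaurin coefficient vanishes even though $f(x) \ne 0$ for $x \ne 0$: near $0$ the values $e^{-1/x^2}$ are astonishingly small, and $h^{-k} e^{-1/h^2} \to 0$ for any fixed $k$.

```python
import math

def f(x):
    """The flat function of Example 8.6.2: e^{-1/x^2} for x != 0, else 0."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

for x in (0.5, 0.2, 0.1):
    print(x, f(x))   # f(0.1) = e^{-100}, on the order of 1e-44

# h^{-k} * e^{-1/h^2} -> 0 as h -> 0, for any fixed k; at h = 1e-3 the
# exponential factor already underflows ordinary floats to 0.0.
h, k = 1e-3, 10
print(h ** (-k) * math.exp(-1.0 / h**2))
```

The plot suggested in the text confirms the picture: the graph hugs the $x$-axis so tightly near $0$ that no polynomial term can detect it.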
HW 8.6.1 (a) Prove that the series $\sum_{n=0}^{\infty} x^n$ converges uniformly to $\frac{1}{1-x}$ on $[-R,R]$ for $0 < R < 1$.
(b) Discuss whether or not the series $\sum_{n=0}^{\infty} x^n$ converges uniformly to $\frac{1}{1-x}$ on $(-1,1)$.
HW 8.6.2 (a) Prove that the series $\sum_{n=0}^{\infty} (-1)^n x^n$ converges uniformly to $\frac{1}{1+x}$ on $[-R,R]$ for $0 < R < 1$.
(b) Discuss whether or not the series $\sum_{n=0}^{\infty} (-1)^n x^n$ converges uniformly to $\frac{1}{1+x}$ on $(-1,1)$.
HW 8.6.3 (a) Show that $\frac{1}{1+x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}$ has a radius of convergence of $R = 1$.
(b) Use part (a) to determine the power series expansion of $\tan^{-1} x$, and the radius of convergence of the series. Justify your results.