
Chapter 1

Real Number System

1.1 Introduction
Definition 1.1.1. An Ordered set is a set S with a relation, denoted by <, with the following
two properties:

1. For any x, y ∈ S, one and only one of the following statements is true:

x<y x=y y<x

2. For x, y, z ∈ S, if x < y, y < z then x < z.

Definition 1.1.2. A Field is a set F with two operations addition and multiplication, which satisfy
the following axioms:
(A) Axioms for addition

A1.) If x, y ∈ F then x + y ∈ F

A2.) x + y = y + x ∀x, y ∈ F

A3.) x + (y + z) = (x + y) + z ∀x, y, z ∈ F

A4.) F contains an element 0 such that x + 0 = x for all x ∈ F .

A5.) For every x ∈ F there is an element −x such that x + (−x) = 0

(M) Axioms for multiplication

M1.) If x, y ∈ F then xy ∈ F

M2.) xy = yx ∀x, y ∈ F

M3.) x(yz) = (xy)z ∀x, y, z ∈ F

M4.) F contains an element 1 such that 1x = x for all x ∈ F .

M5.) For every x ∈ F, x ≠ 0, there is an element x⁻¹ ∈ F such that xx⁻¹ = 1.

The Distributive Law

x(y + z) = xy + xz ∀ x, y, z ∈ F.

Notation: We will denote the set of all natural numbers by N, set of all integers by Z and the
set of all rationals by Q.
The set of rational numbers, Q, is both an ordered set and a field, and hence satisfies all the
properties listed in the definitions above. Rational numbers further satisfy the following property:
For p, q ∈ Q: p > 0, q > 0 ⇒ pq > 0, p + q > 0

There are infinitely many rational numbers; in fact, between any two rationals p and q there is
another rational, e.g., (p + q)/2. If we continue taking the midpoint of the previous midpoint and one
of the extreme points, then we see that between any two rationals there are infinitely many
rationals. But we will see below that Q by itself does not provide a whole (or a complete) picture,
and that there are ‘gaps’ in Q.

Proposition 1.1.3. x² = 2 has no solution in Q.

Proof. Assume that it does have a solution, say x = m/n for some m, n ∈ Z, n ≠ 0, and
suppose that gcd(m, n) = 1 (i.e., m and n have no common factor other than 1). Then
\[ \left(\frac{m}{n}\right)^2 = 2 \;\Rightarrow\; m^2 = 2n^2 \;\Rightarrow\; m^2 \text{ is even} \;\Rightarrow\; m \text{ is even}. \]
Let m = 2k for some k ∈ Z. Then
\[ (2k)^2 = 2n^2 \;\Rightarrow\; n^2 = 2k^2 \;\Rightarrow\; n^2 \text{ is even} \;\Rightarrow\; n \text{ is even}. \]
If both m and n are even, then 2 is a common factor. This contradicts gcd(m, n) = 1.

More generally, for every prime p, x² = p has no solution in Q.

Define A = {q ∈ Q : q > 0, q² < 2} and B = {q ∈ Q : q > 0, q² > 2}. We show (in the
theorem below) that A has no largest element and B has no smallest element.
Theorem 1.1.4. (a) For every p ∈ A, ∃ q ∈ A such that p < q.
(b) For every p ∈ B, ∃ q ∈ B such that p > q.
Proof. (a) Suppose p ∈ A, so p² < 2. Pick h ∈ Q with 0 < h < 1 and
\[ h < \frac{2 - p^2}{2p + 1}. \]
Let q = p + h. Then q > p since h > 0, and
\begin{align*}
q^2 = p^2 + 2ph + h^2 &= p^2 + (2p + h)h \\
&< p^2 + (2p + 1)h && (\text{since } h < 1) \\
&< p^2 + (2 - p^2) = 2.
\end{align*}
So q² < 2, i.e., q ∈ A.

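(b) Now suppose p ∈ B, so p² > 2. Let
\[ q = p - \frac{p^2 - 2}{2p} = \frac{2p^2}{2p} - \frac{p^2 - 2}{2p} = \frac{p^2 + 2}{2p} = \frac{p}{2} + \frac{1}{p}. \]
Then 0 < q < p and
\begin{align*}
q^2 &= p^2 - 2p \cdot \frac{p^2 - 2}{2p} + \left(\frac{p^2 - 2}{2p}\right)^2 \\
&> p^2 - (p^2 - 2) = 2.
\end{align*}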

So q ∈ B.
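
The two constructions in the proof can be tried out with exact rational arithmetic. The following is a minimal sketch; the function names `step_up` and `step_down` are ours, and taking h to be half of the allowed bound is just one admissible choice, not the only one.

```python
from fractions import Fraction

def step_up(p):
    """Given a rational p > 0 with p*p < 2, return a larger rational q with q*q < 2."""
    bound = min(Fraction(1), (2 - p * p) / (2 * p + 1))
    h = bound / 2          # any rational h with 0 < h < bound would do
    return p + h

def step_down(p):
    """Given a rational p > 0 with p*p > 2, return a smaller rational q > 0 with q*q > 2."""
    return p / 2 + 1 / p

p = Fraction(7, 5)         # 49/25 < 2, so p is in A
q = step_up(p)
u = Fraction(3, 2)         # 9/4 > 2, so u is in B
v = step_down(u)
print(q, v)
```

Repeating `step_up` never leaves A and repeating `step_down` never leaves B, which is exactly the statement that A has no largest and B has no smallest element.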

1.2 Dedekind Cuts


Definition 1.2.1. A set α in Q is a cut if

(i) α ≠ ∅ and α ≠ Q.

(ii) If p ∈ α and q < p then q ∈ α.

(iii) α contains no largest rational, that is, for every p ∈ α there is a q ∈ α such that p < q.

Remarks: The following facts will be used frequently and follow from the definition of a cut.

1. p ∉ α ⇐⇒ p > q for all q ∈ α.
Reason: This statement is the contrapositive form of the statement that p ∈ α ⇐⇒ ∃ q ∈ α such
that q ≥ p.

2. If p ∈ α and q ∉ α then p < q.
Reason: If p ∈ α and q ≤ p, then by part (ii) of the definition of a cut, q ∈ α.

Definition 1.2.2. An upper number of a cut α is a rational q which is not in α, i.e., q ∈ Q \α.
Every element p in α is called a lower number of the cut α.

So a “cut” divides the set of rationals into two disjoint sets.

Definition 1.2.3. Let r ∈ Q. Define r∗ := {p ∈ Q : p < r}, r∗ is called a rational cut.

Theorem 1.2.4. Let r ∈ Q, then r∗ , defined above, is a cut. Also, r is the smallest upper number
of r∗ . In particular, 0∗ and 1∗ are cuts.

Proof. (i) Since r − 1 ∈ Q and r − 1 < r, r − 1 ∈ r∗. So r∗ ≠ ∅.
Also, since r ∉ r∗, r∗ ≠ Q.
(ii) Let p ∈ r∗ and q < p. Since p < r and q < p, q < r. Thus q ∈ r∗.
(iii) Let p ∈ r∗. Then p < (p + r)/2 < r, so (p + r)/2 ∈ r∗. Thus r∗ has no largest element.

Since r ∉ r∗, r is an upper number of r∗. If p < r, then p ∈ r∗, so no number smaller than r is
an upper number.

Definition 1.2.5. We say that two cuts α and β are equal (written α = β) if α and β are equal
as sets, i.e., p ∈ α ⇐⇒ p ∈ β.
Definition 1.2.6. Let α and β be cuts. We say α < β (or β > α) if α is a proper subset of β,
i.e., for all p ∈ Q, p ∈ α ⇒ p ∈ β, and ∃ q ∈ Q such that q ∈ β but q ∉ α.
We say α ≤ β if α < β or α = β. And, α ≥ β means β ≤ α.
If α > 0∗ , we say α is positive. If α ≥ 0∗ , we say α is non-negative.

Theorem 1.2.7. (O1. Trichotomy Law) Let α and β be cuts. Then exactly one of α = β or
α < β or α > β holds.
Proof. Case I: α = β.
By definition, neither α < β nor α > β holds, since neither set is a proper subset of the other.
Case II: α ≠ β.
Then some rational belongs to exactly one of α and β. We show that β \ α and α \ β cannot both be
nonempty. Suppose ∃ p ∈ Q such that p ∈ β, p ∉ α, and also ∃ q ∈ Q such that
q ∈ α, q ∉ β. Since p ∈ β and q ∉ β, p < q. On the other hand, q ∈ α and p ∉ α implies q < p.
This is a contradiction. Hence exactly one of β \ α and α \ β is nonempty, that is, exactly one of
α < β or β < α holds.
Theorem 1.2.8. (O2. Transitivity of “ < ”) Let α, β and γ be cuts. If α < β and β < γ, then
α < γ.
Proof. Let α < β and β < γ. Then α ⊆ β ⊆ γ. Also, ∃ p ∈ Q such that p ∈ β, p ∉ α, and ∃ q ∈ Q such that q ∈ γ,
q ∉ β.
Since p ∈ β and q ∉ β, p < q. Further, since p ∉ α and p < q, q ∉ α. Hence q ∈ γ and q ∉ α,
that is, α is a proper subset of γ, so α < γ.

Definition 1.2.9. (Sum of cuts) Let α and β be cuts. Define α + β := {r = p + q : p ∈ α, q ∈ β}.

Theorem 1.2.10. (A1. Closure) If α and β are cuts then α + β is a cut.


Proof. (i) Since α, β ≠ ∅, α + β ≠ ∅.
Since α, β ≠ Q, ∃ s, t ∈ Q such that s ∉ α and t ∉ β. So s > p for all p ∈ α and t > q for all
q ∈ β. Thus s + t > p + q for all p ∈ α and q ∈ β, i.e., s + t ∉ α + β. Hence α + β ≠ Q.
(ii) Let r ∈ α + β and s < r. Then r = p + q for some p ∈ α and q ∈ β.
We can write s = p + (s − p), where p ∈ α and s − p < r − p = q, so s − p ∈ β. Thus s ∈ α + β.
(iii) Let r ∈ α + β, so r = p + q for some p ∈ α and q ∈ β. Since α is a cut, p is not the largest
element of α, i.e., ∃ s ∈ α, s > p. Then s + q > p + q = r and s + q ∈ α + β.

Theorem 1.2.11. Let α, β and γ be cuts. Then


(i) A2. Commutativity: α + β = β + α
(ii) A3. Associativity: (α + β) + γ = α + (β + γ)
(iii) A4. Identity: α + 0∗ = α.

Proof. (i) and (ii) are easy to prove. For instance, (i) follows from the fact that p + q = q + p for
all p, q ∈ Q.
(iii) Let r ∈ α + 0∗ . So r = p + q, p ∈ α, q ∈ 0∗ . Since q < 0, p + q < p, and hence r = p + q ∈ α.
This proves that α + 0∗ ⊆ α.
Let r ∈ α. Pick s ∈ α such that s > r. Let q = r − s, then q < 0 ⇒ q ∈ 0∗ , and r = s + q. So
r ∈ α + 0∗ . Hence α ⊆ α + 0∗ .

Definition 1.2.12. (Negative of a cut) Let α be a cut. Define


−α = {p ∈ Q : −p is an upper number of α but not the smallest upper number}.

Theorem 1.2.13. If α is a cut, then −α is a cut.

Proof. Write β := −α.
(i) Since α ≠ Q, ∃ p ∉ α. Choose q > p; then q is an upper number of α which is not the
smallest. Thus −q ∈ β and β ≠ ∅.
Since α ≠ ∅, ∃ p ∈ α. Then p is not an upper number of α. Hence −p ∉ β, so β ≠ Q.
(ii) Let p ∈ β and q ∈ Q, q < p. Then −p ∉ α and −q > −p. So −q ∉ α, i.e., −q is an upper
number and not the smallest. Thus q ∈ β.
(iii) Let p ∈ β. Then −p ∉ α and −p is not the smallest upper number of α. So pick q ∈ Q
with −q < −p, −q ∉ α.
Let r = (p + q)/2; then −q < −r < −p. So −r is an upper number of α which is not the smallest. So
∃ r > p with r ∈ β.

Lemma 1.2.14. Let α be a cut and let p ∈ Q, p > 0. Then ∃ q, r ∈ Q such that q ∈ α, r ∉ α,
r is not the smallest upper number of α, and r − q = p.

Proof. Let s ∈ α. For n = 0, 1, 2, . . ., let $s_n = s + np$. Then ∃ a unique integer m ≥ 0 such that $s_m \in \alpha$ and
$s_{m+1} \notin \alpha$. If $s_{m+1}$ is not the smallest upper number of α, then let $q = s_m$, $r = s_{m+1}$. Otherwise let
\[ q = s_m + \frac{p}{2}, \qquad r = s_{m+1} + \frac{p}{2}. \]
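
The walk s, s + p, s + 2p, . . . used in this proof can be sketched in code when a cut is given by a membership test. The names `in_alpha` and `straddle` are illustrative assumptions, and the example cut (all rationals q with q ≤ 0 or q² < 2) has no smallest upper number, so the "otherwise" case of the proof never arises for it.

```python
from fractions import Fraction

def in_alpha(q):
    """Membership test for an example cut: all rationals q with q <= 0 or q*q < 2."""
    return q <= 0 or q * q < 2

def straddle(in_cut, s, p):
    """Starting from s in the cut, step by p > 0 until the next step leaves the cut.

    Returns (s_m, s_{m+1}) with s_m in the cut and s_{m+1} outside it.
    """
    assert in_cut(s) and p > 0
    while in_cut(s + p):
        s += p
    return s, s + p

q, r = straddle(in_alpha, Fraction(0), Fraction(1, 10))
print(q, r, r - q)
```

The loop terminates because a cut is not all of Q, so some s + np must eventually be an upper number.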

Theorem 1.2.15. (A5. Inverse) Let α be a cut, then −α is a unique cut such that α + (−α) = 0∗ .

Proof. Let β := −α. Then we need to show α + β = 0∗.
“⊂”: Let p ∈ α + β. Then p = q + r, q ∈ α, r ∈ β. So
−r ∉ α ⇒ −r > q ⇒ q + r < 0 ⇒ p < 0 ⇒ p ∈ 0∗.

“⊃”: Let p ∈ 0∗, so p < 0. By the lemma, ∃ q ∈ α and r ∉ α such that r is not the smallest upper number
of α and r − q = −p.

So −r ∈ β ⇒ p = q − r = q + (−r) ∈ α + β.

Uniqueness: If α + β1 = α + β2 = 0∗ , then

β2 = 0∗ + β2 = (α + β1 ) + β2 = (α + β2 ) + β1 = 0∗ + β1 = β1 .

Proposition 1.2.16. Let α be a cut. Then α > 0∗ if and only if −α < 0∗ .


Proof. (⇒): Let 0∗ < α, then ∃ p ∈ α, p ∉ 0∗.
If p = 0, we can choose q ∈ α, q > p = 0, so that q ≠ 0, q ∈ α, and q ∉ 0∗. Thus we can assume
that p ≠ 0.
Claim: −p ∈ 0∗ and −p ∉ −α.
Since p = −(−p) ∈ α, p is not an upper number of α. So −p ∉ −α.
Since p ∉ 0∗ and p ≠ 0 ⇒ p > 0 ⇒ −p < 0 ⇒ −p ∈ 0∗.

(⇐): Let −α < 0∗, then ∃ p ∈ 0∗ such that p ∉ −α. Choose q > p such that q ∈ 0∗.
Claim: −q ∈ α and −q ∉ 0∗.
Since q ∈ 0∗ ⇒ q < 0 ⇒ −q > 0 ⇒ −q ∉ 0∗.
Since p ∉ −α and q > p, q ∉ −α. By definition, q ∉ −α implies that either −q is not an upper
number of α (i.e., −q ∈ α) or −q is the smallest upper number of α.
In the first case, we are done.
If −q is the smallest upper number of α, then −q is the smallest number such that −q ∉ α. Now
p < q ⇒ −q < −p, so −p ∉ α and also −p is not the smallest upper number. Thus p = −(−p) ∈ −α,
which is a contradiction, since p ∉ −α.

Proposition 1.2.17. Let r ∈ Q, then −r∗ = (−r)∗


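Proof. Let r ∈ Q, then
\begin{align*}
-r^* &= \{p \in \mathbb{Q} : -p \text{ is an upper number of } r^* \text{ but not the smallest upper number}\} \\
&= \{p \in \mathbb{Q} : -p \notin r^* \text{ and } -p \neq r\} \\
&= \{p \in \mathbb{Q} : -p \ge r \text{ and } -p \neq r\} \\
&= \{p \in \mathbb{Q} : -p > r\} \\
&= \{p \in \mathbb{Q} : p < -r\} = (-r)^*.
\end{align*}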

Theorem 1.2.18. (O3. Order axiom) Let α, β, γ be cuts such that α < β, then α + γ < β + γ.
In particular, if α = 0∗ , then β + γ > 0∗ whenever γ > 0∗ and β > 0∗ .
Proof. Let α < β. Since α ⊆ β, every p + q with p ∈ α and q ∈ γ lies in β + γ, so α + γ ⊆ β + γ,
i.e., α + γ ≤ β + γ.
If α + γ = β + γ, then adding −γ to both sides and using A2–A5 gives α = β, which contradicts
α < β. Hence α + γ < β + γ.

Definition 1.2.19. (Multiplication of positive cuts) Let α, β > 0∗, then define


αβ = {p ∈ Q : p < rs for some r ∈ α, s ∈ β, r > 0, s > 0}.
Theorem 1.2.20. (M1. Closure) If α and β are cuts, α, β > 0∗, then αβ is a cut.
Definition 1.2.21. (Multiplication for cuts, in general) Define α0∗ = 0∗α = 0∗ and
\[ \alpha\beta = \begin{cases} -[(-\alpha)\beta] & \alpha < 0^*,\ \beta > 0^* \\ -[\alpha(-\beta)] & \alpha > 0^*,\ \beta < 0^* \\ (-\alpha)(-\beta) & \alpha < 0^*,\ \beta < 0^* \end{cases} \]

Theorem 1.2.22. Let α, β, γ be cuts. Then


(i) M2. Commutativity: αβ = βα
(ii) M3. Associativity: α(βγ) = (αβ)γ
(iii) M4. Identity: α1∗ = α.
Theorem 1.2.23. (M5. Inverse) Let α ≠ 0∗. Then there exists a unique cut, written as 1/α, such
that α · (1/α) = 1∗.
Theorem 1.2.24. Let α, β, γ be cuts. Then
(i) D. Distributive Law: α.(β + γ) = α.β + α.γ
(ii) O4. Order axiom: If 0∗ ≤ α < β and γ > 0∗, then α.γ < β.γ
In particular, for α = 0∗ , β.γ > 0∗ whenever β, γ > 0∗ .
Theorem 1.2.25. Let r, s ∈ Q, then
(i) r∗ + s∗ = (r + s)∗
(ii) r∗ s∗ = (rs)∗
(iii) r < s ⇐⇒ r∗ < s∗
Proof. (i) Let p ∈ r∗ + s∗, then p = q + t for some q ∈ r∗, t ∈ s∗.
So p = q + t < r + s ⇒ p ∈ (r + s)∗.
Conversely, let p ∈ (r + s)∗, then p < r + s.
Let h = r + s − p > 0 and define q := r − h/2, t := s − h/2.
Then q < r and t < s and q + t = r + s − h = p. Hence p ∈ r∗ + s∗.

(iii) (⇒): Let r < s. Then r∗ ⊆ s∗, since p < r implies p < s. Also r ∈ s∗, and we know r ∉ r∗. So r∗ < s∗.
(⇐): Let r∗ < s∗. Then ∃ p ∈ s∗, p ∉ r∗, i.e., p < s and p ≥ r. Thus r ≤ p < s, and so
r < s.

1.3 Construction of Real Numbers - Dedekind Cuts

Definition 1.3.1. We define the set of real numbers to be the set of all cuts, denoted by R.
The set of irrational numbers is defined to be the set of non-rational cuts.
Remark 1.3.2. 1. We have seen that the set of cuts satisfies the addition properties (A1–A5), the
multiplication properties (M1–M5), the distributive property (D), and the order properties (O1–O4).
Any set which satisfies these 15 properties is called an ordered field. Thus, the set of
cuts, and hence the set of real numbers, is an ordered field.
2. The set of rational numbers is identified with the set of rational cuts, and also satisfies these
15 axioms. Thus, the set of rational numbers is not just a subset, but a sub-field, of R.
Definition 1.3.3. Let S be any ordered set in R (i.e., satisfies O1, O2), and let E ⊆ S. Then E is
bounded above if ∃ β ∈ S such that x ≤ β for all x ∈ E.
Similarly, E is bounded below if ∃ β ∈ S such that x ≥ β for all x ∈ E.

Definition 1.3.4. Let S be an ordered set and E ⊆ S which is bounded above. Then α ∈ S is a
least upper bound of E if
(i) α is an upper bound of E (i.e., x ≤ α for all x ∈ E);
(ii) If β < α, then β is not an upper bound of E.
We write α = sup E.
A greatest lower bound can be similarly defined.

Definition 1.3.5. If every non-empty subset E of S which is bounded above, has a least upper
bound in S, then S is said to have the least upper bound property.
The greatest lower bound property can be similarly defined.

Theorem 1.3.6. R has the least upper bound property.


Proof. We will show that the set of cuts has the least upper bound property. Let S be a nonempty
set of cuts which is bounded above, and let a cut β be an upper bound for S.
Define $\gamma := \bigcup_{\alpha \in S} \alpha$.
Claim 1: γ is a cut.
(i) Since S is nonempty and each α ∈ S satisfies α ≠ ∅, γ ≠ ∅.
Since α ≤ β, i.e., α ⊆ β, for all α ∈ S, we have γ ⊆ β. Since β ≠ Q, ∃ p ∈ Q such that p ∉ β,
and hence p ∉ γ. Hence γ ≠ Q.
(ii) Let p ∈ γ, and q ∈ Q, q < p. Then $p \in \alpha_0$ for some $\alpha_0 \in S$, which is a cut. Then $q \in \alpha_0 \subseteq \gamma$.
(iii) Let p ∈ γ, then $p \in \alpha_0$ for some $\alpha_0 \in S$, which is a cut. So ∃ $q \in \alpha_0 \subseteq \gamma$ such that q > p.
Claim 2: γ = sup S.
(i) Let α ∈ S, then α ⊆ γ ⇒ α ≤ γ. So γ is an upper bound for S.
(ii) Let β be any upper bound of S, that is, α ≤ β for all α ∈ S. Then
\[ \alpha \subseteq \beta \ \ \forall\, \alpha \in S \;\Rightarrow\; \bigcup_{\alpha \in S} \alpha \subseteq \beta \;\Rightarrow\; \gamma \le \beta. \]

Theorem 1.3.7. (Archimedean Property of R) If x, y ∈ R and x > 0, then ∃ n ∈ N such that
nx > y.

Proof. (By contradiction) Let x > 0, y ∈ R. Suppose nx ≤ y for all n ∈ N.


Let S = {nx : n ∈ N}, then S is a nonempty subset of R which is bounded above by y. Then
by Theorem 1.3.6, S has a supremum in R.
Let α = sup S. Then α ≥ nx for all n ∈ N. Also, since α − x < α, α − x cannot be an upper
bound for S. So ∃ m ∈ N such that

α − x < mx ⇒ α < (m + 1)x.


This is a contradiction, since α ≥ nx for all n ∈ N. Thus our assumption must be false, and the Archimedean
property holds.

Theorem 1.3.8. Q is dense in R, that is, for every x, y ∈ R, x < y, ∃ r ∈ Q such that x < r < y.

Proof. Let x, y ∈ R, x < y. Let x, y correspond to cuts α and β, respectively. Then α < β. So ∃
p ∈ Q such that p ∈ β, p ∉ α. Choose r ∈ Q such that r ∈ β, r > p.
So r ∈ β and r ∉ r∗ ⇒ r∗ < β.
Since p < r, p ∈ r∗, but p ∉ α ⇒ α < r∗.
Thus, ∃ r such that α < r∗ < β, which means x < r < y.
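
Both the Archimedean property and density can be illustrated computationally. The sketch below uses exact rationals as stand-ins for real numbers, and it finds the rational between x and y by the familiar Archimedean argument (choose n with n(y − x) > 1, then the first integer m above nx) rather than by the cut-based argument of the proof above. The function names are ours.

```python
from fractions import Fraction
from math import floor

def archimedean_n(x, y):
    """Return a natural number n with n*x > y, for rationals x > 0 and y."""
    if x <= 0:
        raise ValueError("x must be positive")
    return max(1, floor(y / x) + 1)

def rational_between(x, y):
    """Return a rational r with x < r < y, for rationals x < y."""
    n = archimedean_n(y - x, 1)     # n*(y - x) > 1
    m = floor(n * x) + 1            # n*x < m <= n*x + 1 < n*y
    return Fraction(m, n)

x = Fraction(1, 3)
y = x + Fraction(1, 10**6)
r = rational_between(x, y)
print(archimedean_n(Fraction(1, 1000), Fraction(7, 3)), r)
```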

Absolute Value: Let x ∈ R. Then the absolute value of x is defined to be
\[ |x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0 \end{cases} \]

Proposition 1.3.9. Let x and y be in R. Then

(i) |x| = 0 if and only if x = 0;

(ii) |−x| = |x|;

(iii) x ≤ |x|;

(iv) |xy| = |x| |y|;

(v) if y ≥ 0, then |x| ≤ y if and only if −y ≤ x ≤ y;

(vi) |x + y| ≤ |x| + |y|.

Proof. Parts (i)-(v) are easy exercises which can be proved by considering two cases in each part,
namely, x ≥ 0 and x < 0. We prove part (vi). Consider
\begin{align*}
|x + y|^2 = (x + y)^2 &= x^2 + 2xy + y^2 \\
&\le x^2 + |2xy| + y^2 \\
&= |x|^2 + 2|x||y| + |y|^2 \\
&= (|x| + |y|)^2.
\end{align*}
Taking square roots of both sides gives (vi).

Corollary 1.3.10. For x, y ∈ R,

(i) |x − y| ≤ |x| + |y|,

(ii) ||x| − |y|| ≤ |x − y|.

1.4 Finite, Countable, and Uncountable Sets
Let f be a mapping from A into B. Then f is one-to-one (1-1) if for any $x_1, x_2 \in A$, $f(x_1) = f(x_2)$
implies $x_1 = x_2$.
f is onto if f (A) = B, i.e., for any y ∈ B there exists an x ∈ A such that f (x) = y.
If f is both 1-1 and onto, it is called a bijection.

Definition 1.4.1. Two sets A and B are said to be in 1-1 correspondence if there exists a
one-one mapping from A onto B. We also say that A and B have the same cardinal number or that
A and B are equivalent, and write A ∼ B.

Proposition 1.4.2. The equivalence relation between sets satisfies the following properties:

(i) Reflexive: A ∼ A.

(ii) Symmetric: If A ∼ B then B ∼ A.

(iii) Transitive: If A ∼ B and B ∼ C, then A ∼ C.

Any relation with these three properties is called an equivalence relation.


Proof. Let A, B, C be sets.

(i) The identity map I : A → A, I(x) = x, is one-one and onto.

(ii) Let A ∼ B, then there is a bijection f : A → B. Then f⁻¹ is a bijection from B onto A.

(iii) Let f : A → B and g : B → C be bijections, then g ◦ f is a bijection from A onto C.

Definition 1.4.3. Let $J_n = \{1, 2, \ldots, n\}$ and N be the set of all positive integers. For any set A,
we say

• A is finite if A = ∅ or $A \sim J_n$ for some n.

• A is infinite if A is not finite.

• A is countable if A ∼ N.

• A is uncountable if A is neither finite nor countable.

Note: Let A and B be finite sets. Then A ∼ B if and only if A and B have the same number of
elements. For infinite sets, however, "the same number of elements" has no prior meaning.
Countable sets are sometimes also called denumerable or enumerable sets.

Examples:

1. E ∼ N, where E is the set of all positive even integers. Define f : N → E as f (n) = 2n, then
f is a bijection.

2. Z ∼ N, where Z is the set of all integers. Define f : N → Z as
\[ f(n) = \begin{cases} \dfrac{n}{2} & \text{if } n \text{ is even} \\[4pt] -\dfrac{n - 1}{2} & \text{if } n \text{ is odd} \end{cases} \]
Then f is a bijection.
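
As a quick sanity check of Example 2, the map and a candidate inverse can be written down and compared on an initial segment of N. This is only a finite spot check, not a proof, and the name `f_inverse` is ours.

```python
def f(n):
    """Map N = {1, 2, 3, ...} to Z: even n goes to n/2, odd n goes to -(n - 1)/2."""
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

def f_inverse(z):
    """Candidate inverse: positive z comes from 2z, non-positive z from 1 - 2z."""
    return 2 * z if z > 0 else 1 - 2 * z

print([f(n) for n in range(1, 8)])
```

The printed values list the integers in the order 0, 1, −1, 2, −2, 3, −3, which is how f enumerates Z.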
Definition 1.4.4. A sequence of real numbers is defined to be a map f from N into R. If $f(n) = x_n$,
then we write the sequence as $(x_1, x_2, \ldots, x_n, \ldots)$, where $x_n$ is called the nth term of the sequence
$(x_n)$.
If A is countable, then there is a bijection from N onto A. Thus we can say that a set
A is countable if and only if A is the range of a sequence of distinct terms, i.e.,
$A = \{x_n : n = 1, 2, \ldots\}$.

Theorem 1.4.5. Every infinite subset of a countable set is countable.


Proof. Let A be a countable set, and let B ⊆ A be infinite. Let $A = \{x_n : n = 1, 2, \ldots\}$. Let $n_1$ be
the smallest positive integer such that $x_{n_1} \in B$. Proceed like this: choose $n_k$ such that $n_k$ is the
smallest integer greater than $n_{k-1}$ such that $x_{n_k} \in B$.
Define f : N → B as $f(k) = x_{n_k}$. Then by the choice of the subsequence, any x ∈ B will be $x_{n_k}$
for some k, which shows f is onto. Since all $x_n$'s are distinct, all $x_{n_k}$'s are distinct. Thus if k ≠ p,
then $x_{n_k} \neq x_{n_p}$. So f is one-one. Thus f is a bijection, and B is countable.

Definition 1.4.6. Let A be a set, and for every element α ∈ A, there is a set Eα . We say
{Eα : α ∈ A} is a collection of sets or family of sets. Such a set A is called an index set.

The union of a family of sets is defined to be the set S such that x ∈ S if and only if $x \in E_\alpha$ for
some α ∈ A; written as
\[ \bigcup_{\alpha \in A} E_\alpha. \]
If A = N, then we write $\bigcup_{n=1}^{\infty} E_n$.

The intersection of a family of sets is defined to be the set P such that x ∈ P if and only if $x \in E_\alpha$
for all α ∈ A; written as
\[ \bigcap_{\alpha \in A} E_\alpha. \]
If A = N, then we write $\bigcap_{n=1}^{\infty} E_n$.

Example: Let A = {x ∈ R : 0 < x ≤ 1}. For every x ∈ A, let $E_x = \{y \in \mathbb{R} : 0 < y < x\}$.
If x, z ∈ A, then x < z if and only if $E_x \subset E_z$. Moreover,
\[ \bigcup_{x \in A} E_x = E_1, \qquad \bigcap_{x \in A} E_x = \emptyset. \]

Theorem 1.4.7. A countable union of countable sets is countable. That is, if $(E_n)_{n=1}^{\infty}$ is a sequence
of countable sets, then $S = \bigcup_{n=1}^{\infty} E_n$ is countable.

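Proof. For every n ∈ N, let $E_n = \{x_{nk} : k = 1, 2, \ldots\}$. Arrange the elements of S in an array
as follows:
\[ \begin{array}{ccccc}
x_{11} & x_{12} & x_{13} & x_{14} & \cdots \\
x_{21} & x_{22} & x_{23} & x_{24} & \cdots \\
x_{31} & x_{32} & x_{33} & x_{34} & \cdots \\
x_{41} & x_{42} & x_{43} & x_{44} & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array} \]
The array shows that the elements of S can be arranged in a sequence, diagonal by diagonal:

\[ x_{11};\ x_{21}, x_{12};\ x_{31}, x_{22}, x_{13};\ x_{41}, x_{32}, x_{23}, x_{14};\ \ldots \]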

Omitting the elements which repeat will give a subsequence (xp ) of the above sequence. This shows
that S ∼ N, and hence is countable.
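
The diagonal arrangement can be sketched as a generator. In this illustration the array entry in row n and column k is supplied by a function `x(n, k)`; that interface and the names `diagonal_pairs` and `enumerate_union` are assumptions made for the example, which uses the positive rationals k/n as entries.

```python
from fractions import Fraction
from itertools import count, islice

def diagonal_pairs():
    """Yield index pairs in the order (1,1); (2,1), (1,2); (3,1), (2,2), (1,3); ..."""
    for total in count(2):                    # n + k is constant along each diagonal
        for n in range(total - 1, 0, -1):
            yield n, total - n

def enumerate_union(x):
    """Enumerate the union of the rows of the array x(n, k), omitting repeats."""
    seen = set()
    for n, k in diagonal_pairs():
        value = x(n, k)
        if value not in seen:
            seen.add(value)
            yield value

first = list(islice(enumerate_union(lambda n, k: Fraction(k, n)), 5))
print(first)
```

Every entry of the array sits on some finite diagonal, so it is reached after finitely many steps; skipping repeats is what turns the sequence into one with distinct terms.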

Corollary 1.4.8. The set of all rationals is countable.


Proof. We can write
\[ \mathbb{Q} = \left\{ \frac{m}{n} : m \in \mathbb{Z},\ n \in \mathbb{N} \right\} = \bigcup_{n \in \mathbb{N}} \left\{ \frac{m}{n} : m \in \mathbb{Z} \right\}. \]
For a fixed n, the set {m/n : m ∈ Z} is countable, since the map f(m) = m/n is a bijection from Z
onto it (easy to prove!) and we saw earlier that Z is
countable. So Q is a countable union of countable sets. Thus, by the above theorem, Q is countable.

Theorem 1.4.9. Let A be the set of all sequences whose elements are the digits 0 and 1. Then A is
uncountable.

Proof. Let E be a countable subset of A, i.e., $E = \{s_1, s_2, s_3, \ldots\}$ with each $s_i \in A$. Define a sequence s such
that the nth term of s is 1 if the nth term of $s_n$ is 0, and vice versa. Then s ∉ E, but s ∈ A. Thus, E is
a proper subset of A. Since E was arbitrary, we have shown that every countable subset of A is a
proper subset of A. Thus A cannot be countable, since A cannot be a proper subset of itself.
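
The diagonal construction in this proof is easy to write down. In the sketch below a 0-1 sequence is modelled as a function on indices 0, 1, 2, . . . (a shift from the 1-based indexing of the text), and only a finite list of sequences is used, so this illustrates the flipping step rather than the full argument.

```python
def diagonal_flip(seqs):
    """Given sequences seqs[0], seqs[1], ... (functions n -> 0 or 1),
    return a sequence s whose n-th term differs from the n-th term of seqs[n]."""
    return lambda n: 1 - seqs[n](n)

seqs = [
    lambda n: 0,         # 0, 0, 0, ...
    lambda n: 1,         # 1, 1, 1, ...
    lambda n: n % 2,     # 0, 1, 0, 1, ...
]
s = diagonal_flip(seqs)
print([s(n) for n in range(len(seqs))])
```

Because s disagrees with the n-th listed sequence in position n, it cannot coincide with any sequence in the list.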

Remark: The above theorem can be used to show that the set of real numbers, R, is
uncountable. To see this, first recall that every real number has a binary expansion. For example, reading the left-hand sides in base 2,
\begin{align*}
1001 &= 1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 9, \\
.01 &= 0 \times 2^{-1} + 1 \times 2^{-2} = \frac{1}{4} = .25, \\
1001.01 &= 9.25.
\end{align*}
Note that .01 = .010000000000 . . .. Thus every real number x between 0 and 1 can be associated
with a sequence of 0’s and 1’s (not necessarily unique though!) using the binary expansion of x.
Let T be the set of all sequences of 0’s and 1’s. Without getting into the details, let's take it on
faith that we can define a bijection from T onto a subset S of the interval (0, 1), making T ∼ S.
Thus, S is an uncountable subset of R. This implies that R must be uncountable, because if R were
countable, then S would also be countable.
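
The arithmetic in the binary examples can be reproduced exactly. The helper below, whose name `binary_value` is ours, evaluates a finite binary string with an optional binary point; it does not handle infinite expansions.

```python
from fractions import Fraction

def binary_value(bits):
    """Evaluate a finite binary string such as '1001.01' as an exact rational."""
    whole, _, frac = bits.partition(".")
    value = Fraction(int(whole, 2)) if whole else Fraction(0)
    for i, b in enumerate(frac, start=1):
        value += Fraction(int(b), 2 ** i)   # digit b contributes b * 2^(-i)
    return value

print(binary_value("1001"), binary_value(".01"), binary_value("1001.01"))
```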

Chapter 2

Metric Spaces

2.1 Introduction
A metric space can be thought of as a generalization of the real line to an abstract space which has a
notion of length or distance between two elements, similar to that on R. The main concepts and
results one studies in an undergraduate real analysis course are

• convergence of a sequence in R,

• Cauchy sequences in R,

• continuity of a function on R,

• every Cauchy sequence converges,

• a function f is continuous at x if and only if $x_n \to x$ implies $f(x_n) \to f(x)$.

We will see that we can generalize these concepts and several results to this new abstract space.
In particular, we generalize them to R², R³, . . . , Rⁿ, . . .. Thus, we will develop a better understanding
of the concepts in higher dimensions as well as revisit them on R.

Terminology: X will denote a set, not necessarily R or Rⁿ, although R and Rⁿ are the spaces of
most interest. Also, we will call x ∈ X a ‘point’ in X, by analogy with a geometrical point
in R or Rⁿ. For ease of understanding, most of the time we will confine ourselves to examples on R, R² or
R³.

2.2 Basic definitions and examples


Definition 2.2.1. A metric on a set X is a function d : X × X → [0, ∞), (x, y) ↦ d(x, y),
which satisfies the following properties for all x, y, z ∈ X:

(i) d(x, y) = 0 ⇐⇒ x = y,

(ii) Symmetry: d(x, y) = d(y, x),

(iii) Triangle inequality: d(x, y) ≤ d(x, z) + d(z, y).

The set X together with metric d is called a metric space, written (X, d).

Examples:
1. Let d : R × R → [0, ∞) be d(x, y) = |x − y|. Then (R, d) [also written as (R, |·|)] is a metric
space.

It is easy to see that d(x, y) satisfies properties (i) and (ii). Property (iii) follows from the
triangle inequality of absolute value as follows:

|x − y| = |x + (−z + z) − y| = |(x − z) + (z − y)| ≤ |x − z| + |z − y| .


2. Consider R² and $\vec{x} \in \mathbb{R}^2$, $\vec{x} = (x_1, x_2)$. We define the length of a vector $\vec{x}$ as $\|\vec{x}\| = \sqrt{x_1^2 + x_2^2}$.

Let d : R² × R² → [0, ∞) be
\[ d(\vec{x}, \vec{y}) = \|\vec{x} - \vec{y}\| = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}. \]

Then (R², d) is a metric space.

Properties (i) and (ii) are easy. To prove property (iii) one needs to first prove the following,
and then use the same trick as in Example 1 of adding and subtracting $\vec{z}$.

Show that $\|\vec{x} + \vec{y}\| \le \|\vec{x}\| + \|\vec{y}\|$.

The distance defined on R² (and in particular on R) can be generalized to Rⁿ, and is called the Euclidean
distance or metric.

3. If (X, d) is a metric space and Y ⊂ X, then (Y, d) is also a metric space. Thus, Q, N and Z
are metric spaces in their own right with the Euclidean metric, d(x, y) = |x − y|.

4. If (X, d) is a metric space, and $D_n(x, y) = n\,d(x, y)$, then $(X, D_n)$ is a metric space for all n ∈ N.

Thus if there is one metric on X, then there are infinitely many metrics on X.

5. If (X, d) is a metric space, and $D(x, y) = \dfrac{d(x, y)}{1 + d(x, y)}$, then (X, D) is a metric space.
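
The metric axioms can be spot-checked numerically on a finite sample of points. The sketch below does this for the Euclidean metric on R² and for the metrics built from it in Examples 4 and 5; it also shows that the squared Euclidean distance fails the triangle inequality. The helper names and the tolerance are illustrative choices, and passing such a check on finitely many points is evidence, not a proof.

```python
from itertools import product
from math import hypot

def euclid(x, y):
    return hypot(x[0] - y[0], x[1] - y[1])

def scaled(d, n):
    """Example 4: D_n(x, y) = n * d(x, y)."""
    return lambda x, y: n * d(x, y)

def bounded(d):
    """Example 5: D(x, y) = d(x, y) / (1 + d(x, y))."""
    return lambda x, y: d(x, y) / (1 + d(x, y))

def looks_like_metric(d, points, tol=1e-12):
    """Check properties (i)-(iii) on every triple drawn from a finite sample."""
    for x, y, z in product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):
            return False
        if abs(d(x, y) - d(y, x)) > tol:
            return False
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False
    return True

pts = [(0, 0), (1, 0), (2, 0), (0, 3), (-1, 2)]
squared = lambda x, y: euclid(x, y) ** 2     # not a metric: 4 > 1 + 1 on (0,0), (1,0), (2,0)
print(looks_like_metric(euclid, pts), looks_like_metric(squared, pts))
```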

Definition 2.2.2. A ball in a metric space (X, d) is a set defined as

\[ B_r(a) = \{x \in X : d(x, a) < r\}, \]

where a is called its center and r > 0 its radius.

Note:

(i) The empty set ∅ and the whole space X may be considered balls with r = 0 and r = ∞.

(ii) A ball need not be ‘round’ !!

Examples:

1. In (R, |·|),
\[ B_r(a) = \{x \in \mathbb{R} : |x - a| < r\} = (a - r, a + r). \]
Also, any open interval (a, b) is a ball with center (a + b)/2 and radius (b − a)/2.
2. In R² with the Euclidean metric,

\[ B_r(\vec{a}) = \{\vec{x} \in \mathbb{R}^2 : \|\vec{x} - \vec{a}\| < r\} \]

is a circular disk with center $\vec{a}$ and radius r, without the circular perimeter.

3. In R³ with the Euclidean metric, a ball $B_r(\vec{a})$ is a solid sphere centered at $\vec{a}$ with radius r, without
the outer surface.

4. If we consider R² with the metric $d(\vec{x}, \vec{y}) = \max\{|x_1 - y_1|, |x_2 - y_2|\}$, where $\vec{x} = (x_1, x_2)$ and
$\vec{y} = (y_1, y_2)$, then with this metric a ball in R² is a square without the outer perimeter.

5. Since (R, |·|) is a metric space and N ⊆ R, (N, |·|) is a metric space. In this metric,
$B_{1/2}(m) = \{m\}$ and, for m ≥ 2, $B_2(m) = \{m - 1, m, m + 1\}$.
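
Balls can be computed directly when the space is replaced by a finite sample of points. In the sketch below the first ten naturals stand in for N and a small integer grid stands in for R²; these samples and the helper name `ball` are assumptions made for illustration.

```python
def ball(points, d, a, r):
    """Open ball B_r(a) = {x : d(x, a) < r}, restricted to a finite sample of points."""
    return {x for x in points if d(x, a) < r}

naturals = range(1, 11)                         # finite stand-in for N
dist = lambda x, y: abs(x - y)

grid = [(i, j) for i in range(-3, 4) for j in range(-3, 4)]
euclid = lambda x, y: ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) ** 0.5
max_metric = lambda x, y: max(abs(x[0] - y[0]), abs(x[1] - y[1]))

print(ball(naturals, dist, 5, 0.5), ball(naturals, dist, 5, 2), ball(naturals, dist, 1, 2))
print(len(ball(grid, euclid, (0, 0), 1.2)), len(ball(grid, max_metric, (0, 0), 1.2)))
```

With radius 1.2 the Euclidean ball picks out the origin and its four axis neighbours, while the max-metric ball picks out the full 3 × 3 square of grid points, which matches the picture of a "square" ball in Example 4.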

Definition 2.2.3. Let A ⊂ X. A point x ∈ A is called an interior point of A if

$\exists\, \varepsilon > 0$ such that $B_\varepsilon(x) \subseteq A$.

In this case we also say that A is a neighborhood of x.

A point x ∈ X (x not in A) is called an exterior point of A if

$\exists\, \varepsilon > 0$ such that $B_\varepsilon(x) \subseteq X \setminus A$.

A boundary point of A is a point which is neither interior nor exterior.

Examples: In (R, |.|), let A = [0, 1], then all points 0 < x < 1 are interior points. Exterior points
are all points in (−∞, 0) ∪ (1, ∞). Thus, 0 and 1 are the boundary points.

Definition 2.2.4. A set A in X is open if all its points are interior points, i.e., for every x ∈ A,
$\exists\, \varepsilon > 0$ such that $B_\varepsilon(x) \subseteq A$.

Examples:

1. The empty set is open.

2. X is open because every ball around any point of X is a subset of X.

3. (a, b) is open in R, but [a, b] or {a} are not.

Theorem 2.2.5. All balls are open sets in X.

Proof. Let $B_r(a)$ be a ball in X, and $x \in B_r(a)$. Then d(x, a) < r. Let ε := r − d(x, a), which is
positive.
Claim: $B_\varepsilon(x) \subseteq B_r(a)$.
Let $y \in B_\varepsilon(x)$, then d(y, x) < ε. So

d(y, a) ≤ d(y, x) + d(x, a) < ε + d(x, a) = r.

So, $y \in B_r(a)$.

Note: The converse of the above theorem does not hold, i.e., not every open set is a ball. For
example, (1, 2) ∪ (4, 5) is an open set but not a ball in R.

Theorem 2.2.6. A set is open if and only if it is a union of balls.

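Proof. Let A be an open set in X. Then,

for each x ∈ A, $\exists\, \varepsilon_x > 0$ such that $B_{\varepsilon_x}(x) \subseteq A$.

So,
\[ A = \bigcup_{x \in A} \{x\} \subseteq \bigcup_{x \in A} B_{\varepsilon_x}(x) \subseteq A. \]
Hence, $A = \bigcup_{x \in A} B_{\varepsilon_x}(x)$.
Conversely, let $A = \bigcup_\alpha B_\alpha$, where each $B_\alpha$ is a ball. If x ∈ A, then $x \in B_{\alpha_0}$ for some $\alpha_0$. Since balls
are open, there exists ε > 0 such that $B_\varepsilon(x) \subseteq B_{\alpha_0} \subseteq A$.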

Theorem 2.2.7. The following are two basic properties of open sets:

(i) Any union of open sets is open.

(ii) Any finite intersection of open sets is open.

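Proof. (i) Let $\{A_\alpha : \alpha \in I\}$ be a family of open sets, and $B = \bigcup_{\alpha \in I} A_\alpha$. Let x ∈ B, then $x \in A_{\alpha_0}$
for some $\alpha_0 \in I$. Since $A_{\alpha_0}$ is open,

\[ \exists\, \varepsilon > 0 \text{ such that } B_\varepsilon(x) \subseteq A_{\alpha_0} \subseteq \bigcup_{\alpha \in I} A_\alpha = B. \]

(ii) Let $B = \bigcap_{i=1}^{n} A_i$, where $A_i$ is open in X for all i = 1, 2, . . . , n. Let x ∈ B, then $x \in A_i$ for all
i = 1, 2, . . . , n. Since each $A_i$ is open, for each i, $\exists\, \varepsilon_i > 0$ such that $B_{\varepsilon_i}(x) \subset A_i$.

Let $\varepsilon := \min\{\varepsilon_i : i = 1, 2, \ldots, n\}$. Then for each i, $B_\varepsilon(x) \subseteq B_{\varepsilon_i}(x) \subseteq A_i$. Thus
$B_\varepsilon(x) \subseteq \bigcap_{i=1}^{n} A_i = B$.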

Examples: (a, ∞) is open in R, since (a, ∞) = ∪n∈N (a, n), and each (a, n) is a ball in R. Similarly,
(−∞, b) is open in R.
Counterexample: An arbitrary intersection of open sets need not be open. For each n ∈ N, let
$A_n = \left(-\frac{1}{n}, \frac{1}{n}\right)$; then the $A_n$'s are open in R. But $\bigcap_{n \in \mathbb{N}} A_n = \{0\}$, which is not open in R.

Can you think of an example in R² (with the Euclidean metric) of infinitely many open sets whose
intersection is not open?
Remark: If X is a metric space and Y ⊆ X, then a set A in Y may be open in Y but not open
in X. For example, let X = R and Y = N, and A = {5}, then A is open in N, since $\{5\} = B_{1/2}(5)$.
But A is not open in R.

Theorem 2.2.8. Distinct points in a metric space X can be separated by disjoint balls. That is,
if x, y ∈ X, x ≠ y, then ∃ r > 0 such that $B_r(x) \cap B_r(y) = \emptyset$.
Proof. Let x ≠ y, then d(x, y) > 0. Let r = d(x, y)/2; we claim that $B_r(x) \cap B_r(y) = \emptyset$.
Suppose, to the contrary, that $z \in B_r(x) \cap B_r(y)$. Then d(z, x) < r and d(z, y) < r. So,

d(x, y) ≤ d(x, z) + d(z, y) < r + r = 2r = d(x, y)

=⇒ d(x, y) < d(x, y),

which is a contradiction. Hence $B_r(x) \cap B_r(y) = \emptyset$.

Remark: A space which satisfies the property in above theorem is called a Hausdorff Space.
Thus, metric spaces are Hausdorff.

Definition 2.2.9. A set F in X is called closed if X \ F is open in X.

Examples:

1. X and ∅ are closed, since X = X \ ∅ and ∅ = X \ X.

2. In R, [a, b] is closed, since [a, b] = R \ ((−∞, a) ∪ (b, ∞)).

3. {a} is closed in R for all a ∈ R, since {a} = R \ ((−∞, a) ∪ (a, ∞)).

4. (a, b] is neither open nor closed in R.

5. {m} is both open and closed in N.

Proposition 2.2.10. Let x ∈ X, then {x} is closed in X, that is, singletons are closed sets.

Proof. We need to prove that A = X \ {x} is an open set in X. Let y ∈ A, then y ≠ x. Since X is
Hausdorff, ∃ r > 0 such that $B_r(x) \cap B_r(y) = \emptyset$.
We claim that $B_r(y) \subset A$.
Let $z \in B_r(y)$, then clearly z ∈ X since $B_r(y) \subseteq X$. Further, z ∉ {x}, i.e., z ≠ x. To see this,
suppose z = x, then $z \in B_r(x)$. We also know $z \in B_r(y)$, which means $z \in B_r(x) \cap B_r(y)$. But this
is not possible since $B_r(x) \cap B_r(y) = \emptyset$. So, x ≠ z. This proves our claim.

Theorem 2.2.11. The following are two basic properties of closed sets:

(i) Finite union of closed sets in X is closed in X,

(ii) Any intersection of closed sets in X is closed in X.

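Proof. (i) Let $\{F_i : i = 1, 2, \ldots, n\}$ be a finite collection of closed sets. Then each $X \setminus F_i$ is open in
X, hence $\bigcap_{i=1}^{n} (X \setminus F_i)$ is open in X (by Theorem 2.2.7). Since

\[ X \setminus \Big(\bigcup_{i=1}^{n} F_i\Big) = \bigcap_{i=1}^{n} (X \setminus F_i), \]

$\bigcup_{i=1}^{n} F_i$ is closed in X.
(ii) Let $\{F_\alpha : \alpha \in I\}$ be an arbitrary collection of closed sets. Then each $X \setminus F_\alpha$ is open in X,
hence $\bigcup_{\alpha \in I} (X \setminus F_\alpha)$ is open in X (by Theorem 2.2.7). Since

\[ X \setminus \Big(\bigcap_{\alpha \in I} F_\alpha\Big) = \bigcup_{\alpha \in I} (X \setminus F_\alpha), \]

$\bigcap_{\alpha \in I} F_\alpha$ is closed in X.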

Corollary 2.2.12. Any finite subset in X is closed in X.

Proof. Let $F = \{x_1, x_2, \ldots, x_n\}$, then $F = \bigcup_{i=1}^{n} \{x_i\}$. Since each $\{x_i\}$ is closed in X (by Proposition
2.2.10), F is closed in X by the above theorem.

Proposition 2.2.13. A closed ball C = {x ∈ X : d(x, a) ≤ r}, a ∈ X, r > 0, is a closed set in X.

Proof is left as an exercise.

Definition 2.2.14. A point x is a limit point of a set A if

for each ε > 0, ∃ y ≠ x such that $y \in A \cap B_\varepsilon(x)$;

that is, every open ball around x contains a point of A which is different from x.
A point in A which is not a limit point is called an isolated point of A.

In other words, a limit point is a point which cannot be separated or isolated from the set (using
open balls). They lie either in the interior or on the boundary of the set. Limit points are also
called cluster points or accumulation points.
Examples:

1. In R, if A = (a, b), its limit points are the points of [a, b].

2. In R, if A = [a, b], its limit points are the points of [a, b].

3. In R, if A = {1/n : n ∈ N}, its only limit point is 0.

4. In N, any A ⊆ N has no limit points.

5. Any finite set in R has no limit points.

Theorem 2.2.15. A set is closed if and only if it contains all its limit points.

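Proof. Let F be a closed set and x be a limit point of F. Suppose x ∉ F; then x ∈ X \ F, which
is open. So, ∃ ε > 0 such that $B_\varepsilon(x) \subseteq X \setminus F$. But this means that $B_\varepsilon(x) \cap F = \emptyset$, which is a
contradiction since x is a limit point of F.
Conversely, suppose F is a set containing all its limit points, and let x ∈ X \ F (i.e., x ∉ F).
Then x is not a limit point of F. Therefore,
\begin{align*}
& \exists\, \varepsilon > 0 \text{ such that } \forall\, y \neq x,\ y \notin F \cap B_\varepsilon(x) \\
\Rightarrow\ & F \cap B_\varepsilon(x) \subseteq \{x\} \subseteq X \setminus F, \text{ so } F \cap B_\varepsilon(x) = \emptyset \\
\Rightarrow\ & x \in B_\varepsilon(x) \subseteq X \setminus F \;\Rightarrow\; X \setminus F \text{ is open.}
\end{align*}
Hence F is closed.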

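Definition 2.2.16. The closure of a set $A$, denoted $\overline{A}$, is defined to be the set $A$ together with its limit points:

\[ \overline{A} = A \cup \{\text{limit points of } A\}. \]

Some facts about closure of sets:

• $\overline{A}$ is the smallest closed set containing $A$, i.e., if $F$ is any closed set containing $A$, then
$\overline{A} \subseteq F$.

• $A$ is closed if and only if $A = \overline{A}$.
Thus, $A = \{\frac{1}{n} : n \in \mathbb{N}\}$ is not closed, since $\overline{A} = A \cup \{0\}$.

• If $A \subseteq B$, then $\overline{A} \subseteq \overline{B}$.
The converse is not true. For example, let $A = [0, 1]$ and $B = (0, 1)$; then $\overline{A} = \overline{B} = [0, 1]$, but
$A \not\subseteq B$.

• $\overline{A \cup B} = \overline{A} \cup \overline{B}$.

• $\overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$.
Equality does not hold in general. For example, let $A = (1, 2)$ and $B = (2, 3)$; then $\overline{A} \cap \overline{B} =
\{2\}$, but $\overline{A \cap B} = \overline{\emptyset} = \emptyset$.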

2.3 Convergence in metric spaces

A sequence in a metric space X is a function from N into X, x(n) = xn , written as (xn ).

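Definition 2.3.1. A sequence $(x_n)$ in $X$ converges to a point $x$ in $X$ if

for each $\varepsilon > 0$, $\exists\, N \in \mathbb{N}$ such that $x_n \in B_\varepsilon(x)$ $\forall\, n \ge N$,

or equivalently,

for each $\varepsilon > 0$, $\exists\, N \in \mathbb{N}$ such that $d(x_n, x) < \varepsilon$ $\forall\, n \ge N$.

We write $x_n \to x$ as $n \to \infty$, or simply $x_n \to x$, or $\lim_{n \to \infty} x_n = x$.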

Note:
1. In other words, any open ball centered at $x$ contains all but finitely many terms of the
sequence (at most the first $N - 1$ terms lie outside it).

2. The $N$ above depends upon $\varepsilon$. Therefore, we sometimes also write it as $N(\varepsilon)$ or $N_\varepsilon$.

3. Convergence in R: Since in $\mathbb{R}$, $d(x_n, x) = |x_n - x|$, it is clear that the above definition
generalizes the notion of convergence in $\mathbb{R}$:

$(x_n)$ converges in $\mathbb{R}$ to $x$ $\iff$ for each $\varepsilon > 0$, $\exists\, N \in \mathbb{N}$ such that $|x_n - x| < \varepsilon$ $\forall\, n \ge N$.

Theorem 2.3.2. Limit of a sequence is unique.


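Proof. Let $(x_n)$ converge to $x$ and $y$ in $X$. Let $\varepsilon > 0$; then for $\varepsilon/2 > 0$,

since $x_n \to x$, $\exists\, N_1$ such that $d(x_n, x) < \varepsilon/2$ $\forall\, n \ge N_1$;

since $x_n \to y$, $\exists\, N_2$ such that $d(x_n, y) < \varepsilon/2$ $\forall\, n \ge N_2$.

Let $N = \max\{N_1, N_2\}$, so that $n \ge N$ implies $n \ge N_1$ and $n \ge N_2$. Let $n \ge N$; then

\[ d(x, y) \le d(x, x_n) + d(x_n, y) < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]

Since $\varepsilon > 0$ was arbitrary, $d(x, y) = 0$, which implies $x = y$.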

An alternate way to prove the above result is:


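Proof. Let $(x_n)$ converge to $x$ and $y$ in $X$, and suppose $x \neq y$. By the Hausdorff property, $\exists\, \varepsilon > 0$ such that
$B_\varepsilon(x) \cap B_\varepsilon(y) = \emptyset$.

Since $x_n \to x$, $\exists\, N_1$ such that $x_n \in B_\varepsilon(x)$ $\forall\, n \ge N_1$;

since $x_n \to y$, $\exists\, N_2$ such that $x_n \in B_\varepsilon(y)$ $\forall\, n \ge N_2$.

Let $n_0 \ge \max\{N_1, N_2\}$; then $x_{n_0} \in B_\varepsilon(x)$ and $x_{n_0} \in B_\varepsilon(y)$, which contradicts $B_\varepsilon(x) \cap B_\varepsilon(y) = \emptyset$. Thus
$x = y$.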

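Theorem 2.3.3. Let $(x_n)$ be a sequence in $A$ with $x_n \to x$. Then $x \in \overline{A}$.

Proof. If $x \in A$, then $x \in \overline{A}$.
If $(x_n)$ is 'eventually' a constant sequence, i.e., $x_n = c$ for all $n \ge N$ for some $N$, then $x = c \in
A \subseteq \overline{A}$.
Suppose $x \notin A$ and $(x_n)$ is not eventually a constant sequence. Since each $x_n \in A$ and $x \notin A$, we have $x_n \neq x$ for every $n$. Now

$x_n \to x \Rightarrow$ for any $\varepsilon > 0$ $\exists\, N$ such that $x_n \in B_\varepsilon(x)$ $\forall\, n \ge N$.

But this implies that every open ball centered at $x$ contains points of $A$, namely the $x_n$ with $n \ge N$, different from
$x$. Hence $x$ is a limit point of $A$, and so $x \in \overline{A}$.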

Corollary 2.3.4. If $(x_n)$ is a convergent sequence in $F$, and $F$ is a closed set, then $\lim_{n \to \infty} x_n \in F$.

Proof. This follows from the above theorem and the fact that $F = \overline{F}$.

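Theorem 2.3.5. If $x$ is a limit point of a set $A$ in $X$, then there exists a sequence $(x_n)$ in $A$ such that
$x_n \to x$.

Proof. Construction of $(x_n)$: For each $n \in \mathbb{N}$, let $\varepsilon_n = 1/n$. Since $x$ is a limit point of $A$, for
each $\varepsilon_n > 0$, $\exists\, x_n \in B_{\varepsilon_n}(x) \cap A$, $x_n \neq x$. Thus, $d(x_n, x) < \varepsilon_n = 1/n$.
Let $\varepsilon > 0$; then by the Archimedean property of $\mathbb{R}$, $\exists\, N \in \mathbb{N}$ such that $1/N < \varepsilon$. If $n \ge N$, then
\[ d(x_n, x) < \frac{1}{n} \le \frac{1}{N} < \varepsilon. \]
Thus $x_n \to x$.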

Convergence in R

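In $\mathbb{R}$ with the Euclidean metric $|\cdot|$, $B_\varepsilon(x) = (x - \varepsilon, x + \varepsilon)$, and

\[ y \in (x - \varepsilon, x + \varepsilon) \iff x - \varepsilon < y < x + \varepsilon \iff -\varepsilon < y - x < \varepsilon \iff |x - y| < \varepsilon. \]

Keeping these in mind, we will consider the following as the definition of convergence in $\mathbb{R}$.

Definition 2.3.6. A sequence $(x_n)$ in $\mathbb{R}$ converges to $x \in \mathbb{R}$ if for every $\varepsilon > 0$ $\exists\, N$ such that
$x_n \in (x - \varepsilon, x + \varepsilon)$ (or $|x_n - x| < \varepsilon$) for all $n \ge N$.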

Recall: Archimedean Property of R: This property has various equivalent forms, and it says that
in R there is no infinitely small or infinitely large number:

- For every x > 0 (however small), ∃ N such that 1/N < x.

- For every x ∈ R (however large), ∃ N such that x < N .

We will use the Archimedean property in the $\varepsilon$-$N$ proofs below, to find the $N$ corresponding to a given $\varepsilon$.
(The real number $x$ will be in terms of $\varepsilon$.)
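Example 2.3.7. 1. $\lim_{n \to \infty} \frac{1}{n} = 0$.
Let $\varepsilon > 0$; then by the Archimedean Property of $\mathbb{R}$, $\exists\, N$ such that $\frac{1}{N} < \varepsilon$.
If $n \ge N$, then $\frac{1}{n} \le \frac{1}{N} < \varepsilon$.
That is, $\exists\, N$ such that $\left| \frac{1}{n} - 0 \right| < \varepsilon$ for all $n \ge N$.

2. $\lim_{n \to \infty} \frac{1}{(n+1)^2} = 0$. Let $\varepsilon > 0$, and consider

\[ \left| \frac{1}{(n+1)^2} - 0 \right| = \frac{1}{(n+1)^2} < \frac{1}{n^2} \le \frac{1}{n}. \]

Since $1/n \to 0$, $\exists\, N$ such that $1/n < \varepsilon$ for all $n \ge N$. Then for $n \ge N$, by the above,
\[ \left| \frac{1}{(n+1)^2} - 0 \right| < \frac{1}{n} < \varepsilon. \]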
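3. The sequence $(1, 0, 1, 0, 1, \dots)$ does not converge in $\mathbb{R}$.
Suppose the above sequence converges to a point $a \in \mathbb{R}$. Let $\varepsilon = \frac{1}{2}$; then $\exists\, N$ such that $|x_n - a| < \frac{1}{2}$
for all $n \ge N$.
Since both values $0$ and $1$ occur among the terms $x_n$ with $n \ge N$, in particular $|0 - a| < \frac{1}{2}$ and $|1 - a| < \frac{1}{2}$, and
\[ 1 = |0 - 1| \le |0 - a| + |a - 1| < \frac{1}{2} + \frac{1}{2} = 1. \]
This is a contradiction.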

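4. If $0 < b < 1$, then $\lim_{n \to \infty} b^n = 0$.

Let $\varepsilon > 0$; we want to show: $\exists\, N \in \mathbb{N}$ such that $|b^n - 0| < \varepsilon$ for all $n \ge N$.
Consider
\[ |b^n - 0| < \varepsilon \iff b^n < \varepsilon \iff n \ln b < \ln \varepsilon \iff n > \frac{\ln \varepsilon}{\ln b} \quad (\text{since } \ln b < 0). \]
By the Archimedean property, for $x = \frac{\ln \varepsilon}{\ln b}$, we can choose an $N \in \mathbb{N}$ such that $\frac{\ln \varepsilon}{\ln b} < N$.
Then if $n \ge N$, we have $n > \frac{\ln \varepsilon}{\ln b}$, which implies that $|b^n - 0| < \varepsilon$.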
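
The choice of $N$ in examples 1 and 4 above can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names `n_for_reciprocal` and `n_for_power` are hypothetical, floating-point arithmetic stands in for the reals, and only a finite range of $n$ is sampled, so it supports the proofs but does not replace them.

```python
import math


def n_for_reciprocal(eps):
    """Return an N with 1/N < eps, as the Archimedean property guarantees."""
    return math.floor(1 / eps) + 1


def n_for_power(b, eps):
    """Return an N with b**n < eps for all n >= N, assuming 0 < b < 1."""
    if eps >= 1:
        return 1  # b**n < 1 <= eps already holds for every n >= 1
    # n > ln(eps)/ln(b) is equivalent to b**n < eps because ln(b) < 0
    return math.floor(math.log(eps) / math.log(b)) + 1


for eps in (0.5, 0.03, 0.0007):
    N = n_for_reciprocal(eps)
    M = n_for_power(0.9, eps)
    print(f"eps={eps}: 1/n < eps from n={N}; 0.9**n < eps from n={M}")
```

Note how $N$ grows as $\varepsilon$ shrinks, which is the dependence $N = N(\varepsilon)$ mentioned earlier.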

Proposition 2.3.8. Let a ∈ R, then there exists a sequence of rationals (qn ) and a sequence of
irrationals (rn ) such that qn → a and rn → a, as n → ∞.

Proof. Let $a \in \mathbb{R}$. We first prove that there exists a sequence of rationals $(q_n)$ such that $q_n \to a$.

Step 1: (Construction of the rational sequence) Consider $B_1(a) = (a - 1, a + 1)$; since $\mathbb{Q}$ is
dense in $\mathbb{R}$, $\exists\, q_1 \in \mathbb{Q} \cap B_1(a)$. Consider $B_{1/2}(a) = (a - 1/2, a + 1/2)$; since $\mathbb{Q}$ is dense in $\mathbb{R}$, $\exists\,
q_2 \in \mathbb{Q} \cap B_{1/2}(a)$. Continuing this way, we get the $n$th term $q_n \in \mathbb{Q}$ such that $q_n \in B_{1/n}(a)$.

Step 2: ($q_n \to a$.) Let $\varepsilon > 0$; since $1/n \to 0$, $\exists\, N$ such that $1/n < \varepsilon$ for all $n \ge N$. For $n \ge N$,
since $q_n \in B_{1/n}(a)$, $|q_n - a| < 1/n < \varepsilon$. Thus $q_n \to a$.

Similarly using the fact that irrationals are dense in R, the second part can be proved.

Complete Metric Spaces

Definition 2.3.9. A sequence $(x_n)$ in $X$ is Cauchy if

for each $\varepsilon > 0$ $\exists\, N \in \mathbb{N}$ such that $d(x_n, x_m) < \varepsilon$ $\forall\, n, m \ge N$.

Thus a Cauchy sequence is a sequence in which the terms keep getting closer to each other.

Theorem 2.3.10. Every convergent sequence is Cauchy.

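Proof. Let $(x_n)$ be a sequence in $X$ with $x_n \to x$. Let $\varepsilon > 0$. Then $\exists\, N \in \mathbb{N}$ such that $d(x_n, x) < \varepsilon/2$
for all $n \ge N$.
Let $m, n \ge N$; then

\[ d(x_m, x_n) \le d(x_m, x) + d(x, x_n) < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]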

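Remark: The converse of the above theorem is not true in a general metric space; that is, a Cauchy
sequence need not be convergent. For example:
Let $X = \mathbb{Q}$. Consider the recursively defined sequence $(x_n)$ in $\mathbb{Q}$ given by $x_{n+1} = \frac{x_n}{2} + \frac{1}{x_n}$, with
$x_1 = 1$. This sequence converges in $\mathbb{R}$ (to $\sqrt{2}$), so it is Cauchy. If $(x_n)$ converged to some $x$ in $\mathbb{Q}$, then $x$ would have to satisfy $x = \frac{x}{2} + \frac{1}{x}$, which means $x^2 = 2$. But $x^2 = 2$ has no solution in $\mathbb{Q}$ (Proposition 1.1.3).
Thus $(x_n)$ is Cauchy but does not converge in $\mathbb{Q}$.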
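
To make the remark concrete, the recursion can be run in exact rational arithmetic. This is a minimal sketch, assuming Python's standard `fractions.Fraction` type as a model of $\mathbb{Q}$; the helper name `sqrt2_iterates` is hypothetical. Every term is an exact rational, successive terms crowd together, and $x_n^2$ approaches 2 without ever equalling it.

```python
from fractions import Fraction


def sqrt2_iterates(count):
    """First `count` terms of x_{n+1} = x_n/2 + 1/x_n with x_1 = 1, as exact rationals."""
    x = Fraction(1)
    terms = [x]
    for _ in range(count - 1):
        x = x / 2 + 1 / x  # stays in Q: sums and quotients of rationals are rational
        terms.append(x)
    return terms


terms = sqrt2_iterates(6)
for n, x in enumerate(terms, start=1):
    # x*x - 2 is computed exactly; it shrinks rapidly but is never 0
    print(n, x, float(x * x - 2))
```

The printed gaps $x_n^2 - 2$ shrink very quickly, which is consistent with the sequence being Cauchy while its only possible limit lies outside $\mathbb{Q}$.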

Definition 2.3.11. A metric space is complete if every Cauchy sequence converges.

Some facts and remarks:

1. It follows that the set of rational numbers Q is not complete. But the set of real numbers R
is complete. This is the main reason why we use the real numbers rather than the rationals
in most applications of calculus.

2. X = (0, 1) is not complete, since xn = 1/n is a sequence in X which is Cauchy but not
convergent in X.

3. Every metric space $X$ can be completed, i.e., there is a complete metric space $\widetilde{X}$ which contains
$X$ as a dense subset; for example, $\widetilde{\mathbb{Q}} = \mathbb{R}$ and $\widetilde{(0, 1)} = [0, 1]$.

4. R is complete, that is a sequence in R is Cauchy if and only if it is convergent.

5. In a complete metric space X, any subset F is complete if and only if F is closed in X.

2.4 Continuity

Some set theory facts, we will use in this section:


Let $f : X \to Y$, $A_1, A_2, A \subseteq X$ and $B_1, B_2, B \subseteq Y$; then
(i) $A_1 \subseteq A_2 \Rightarrow f(A_1) \subseteq f(A_2)$,

(ii) $B_1 \subseteq B_2 \Rightarrow f^{-1}(B_1) \subseteq f^{-1}(B_2)$,

(iii) $A \subseteq f^{-1}[f(A)]$, with equality if $f$ is one-one,

(iv) $f[f^{-1}(B)] \subseteq B$, with equality if $f$ is onto,

(v) $f^{-1}[Y \setminus B] = X \setminus f^{-1}(B)$.

In this section, X, Y will mean (X, dX ) and (Y, dY ).


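Definition 2.4.1. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces, and $f : X \to Y$. Then $f$ is continuous
at $x \in X$ if
$\forall\, \varepsilon > 0$, $\exists\, \delta > 0$ such that $d_X(x, y) < \delta \Rightarrow d_Y(f(x), f(y)) < \varepsilon$.
Since $d(a, b) < r \iff b \in B_r(a)$, this can be rewritten as

$\forall\, \varepsilon > 0$, $\exists\, \delta > 0$ such that $y \in B_\delta(x) \Rightarrow f(y) \in B_\varepsilon(f(x))$,

which can be further rewritten as

$\forall\, \varepsilon > 0$, $\exists\, \delta > 0$ such that $f[B_\delta(x)] \subseteq B_\varepsilon(f(x))$.

We say $f$ is continuous on $X$ if $f$ is continuous at each $x \in X$.

We say $f$ is discontinuous at $x$ if $f$ is not continuous at $x$, i.e.,

$\exists\, \varepsilon > 0$ such that $\forall\, \delta > 0$ $\exists\, y \in B_\delta(x)$ with $f(y) \notin B_\varepsilon(f(x))$ [i.e., $f[B_\delta(x)] \not\subseteq B_\varepsilon(f(x))$].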

The following result characterizes continuous functions as the functions which pull open sets back
to open sets. In a general topology, this is considered as the definition of a continuous function.
Theorem 2.4.2. Let $f : X \to Y$. Then $f$ is continuous $\iff$ for every open set $V$ in $Y$, $f^{-1}(V)$
is open in $X$.
Proof. Let $f$ be a continuous function and let $V$ be an open set in $Y$. We want to show that
$U := f^{-1}(V)$ is open in $X$. Let $x$ be any point in $U$. Then $f(x) \in V$, which is open. Hence

$\exists\, \varepsilon > 0$ such that $f(x) \in B_\varepsilon(f(x)) \subseteq V$.

Since $f$ is continuous, for this $\varepsilon > 0$,

$\exists\, \delta > 0$ such that $f[B_\delta(x)] \subseteq B_\varepsilon(f(x))$.

So,

$f[B_\delta(x)] \subseteq V \Rightarrow f^{-1}(f[B_\delta(x)]) \subseteq f^{-1}(V)$   ($\because B_1 \subseteq B_2 \Rightarrow f^{-1}(B_1) \subseteq f^{-1}(B_2)$)

$\Rightarrow B_\delta(x) \subseteq f^{-1}(V) = U$.   ($\because A \subseteq f^{-1}(f(A))$)

Thus,
$\exists\, \delta > 0$ such that $B_\delta(x) \subseteq f^{-1}(V) = U$, that is, $U$ is open in $X$.

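Conversely, assume that for every open set $V$ in $Y$, $f^{-1}(V)$ is open in $X$.

Let $x \in X$ and $\varepsilon > 0$. We want a $\delta > 0$ such that $f[B_\delta(x)] \subseteq B_\varepsilon(f(x))$.
Let $V = B_\varepsilon(f(x))$. Since $B_\varepsilon(f(x))$ is open in $Y$, $f^{-1}(V) = f^{-1}[B_\varepsilon(f(x))]$ is open in $X$ and contains $x$; that
is,

$\exists\, \delta > 0$ such that $x \in B_\delta(x) \subseteq f^{-1}[B_\varepsilon(f(x))]$.

Now

$B_\delta(x) \subseteq f^{-1}[B_\varepsilon(f(x))] \Rightarrow f[B_\delta(x)] \subseteq f\big[f^{-1}[B_\varepsilon(f(x))]\big]$   ($\because A_1 \subseteq A_2 \Rightarrow f(A_1) \subseteq f(A_2)$)
$\subseteq B_\varepsilon(f(x))$.   ($\because f(f^{-1}(B)) \subseteq B$)

Thus,
$\exists\, \delta > 0$ such that $f[B_\delta(x)] \subseteq B_\varepsilon(f(x))$, that is, $f$ is continuous at $x$.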

Note: A continuous function need not map open sets to open sets. For example, consider the
constant function f : R → R, f (x) = 1. Then f is continuous on R, and f [(0, 2)] = {1}, (0, 2) is
open in R but its image under f is {1} which is not open in R.

Corollary 2.4.3. f is continuous ⇐⇒ for every closed set F of Y , f −1 (F ) is closed in X.

Proof is left as an exercise.

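Theorem 2.4.4 (Sequential Criterion for Continuity). Let $f : X \to Y$. Then

$f$ is continuous $\iff$ for every sequence $(x_n)$ in $X$, $x_n \to x \Rightarrow f(x_n) \to f(x)$ (or $f(\lim_{n \to \infty} x_n) = \lim_{n \to \infty} f(x_n)$).

Proof. Let $f$ be a continuous function, and let $(x_n)$ be a sequence converging to $x$ in $X$.

We want to show: $f(x_n) \to f(x)$ in $Y$.
Let $\varepsilon > 0$, and consider $B_\varepsilon(f(x))$, which is open in $Y$. Since $f$ is continuous,

$\exists\, \delta > 0$ such that $f[B_\delta(x)] \subseteq B_\varepsilon(f(x))$.

Since $x_n \to x$, for this $\delta > 0$,

$\exists\, N \in \mathbb{N}$ such that $n \ge N \Rightarrow x_n \in B_\delta(x)$.

So,
$n \ge N \Rightarrow x_n \in B_\delta(x) \Rightarrow f(x_n) \in f[B_\delta(x)] \subseteq B_\varepsilon(f(x))$.

Thus,
$\exists\, N \in \mathbb{N}$ such that $n \ge N \Rightarrow f(x_n) \in B_\varepsilon(f(x))$, that is, $f(x_n) \to f(x)$.

Conversely, suppose $f$ is not continuous. Then there is a point $x \in X$ such that

$\exists\, \varepsilon > 0$ such that $\forall\, \delta > 0$, $f[B_\delta(x)] \not\subseteq B_\varepsilon(f(x))$.

In particular, for each $n$, take $\delta = 1/n$. Then

$\exists\, \varepsilon > 0$ such that $\forall\, n$, $f[B_{1/n}(x)] \not\subseteq B_\varepsilon(f(x))$.
For each $n \in \mathbb{N}$, choose $x_n \in B_{1/n}(x)$ such that $f(x_n) \notin B_\varepsilon(f(x))$.
Since $x_n \in B_{1/n}(x)$, we have $d(x_n, x) < 1/n$, so $x_n \to x$. But $f(x_n) \not\to f(x)$, as $d(f(x_n), f(x)) \ge \varepsilon$ for
all $n$.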

Thus we see that continuous functions preserve convergence.

Proposition 2.4.5. If f : X → Y and g : Y → Z are continuous, then so is g ◦ f : X → Z.

Proof. Let $W$ be an open set in $Z$. We want to show that $(g \circ f)^{-1}[W]$ is open in $X$. Since
$g$ is continuous, $g^{-1}(W)$ is open in $Y$. Since $f$ is continuous, $f^{-1}[g^{-1}(W)]$ is open in $X$. But
$f^{-1}[g^{-1}(W)] = (g \circ f)^{-1}[W]$.

Note: If f : X → Y is continuous and f −1 : Y → X exists, f −1 may not be continuous.

Definition 2.4.6. A homeomorphism between two metric spaces $X$ and $Y$ is a map $\phi : X \to Y$
such that
$\phi$ is bijective,
$\phi$ is continuous,
$\phi^{-1}$ is continuous.

We say $X$ is homeomorphic to $Y$ and write $X \sim Y$.

In other words, two spaces are homeomorphic when one can be obtained from the other by a
continuous map whose inverse is also continuous; in effect, convergence in one space is equivalent to
convergence in the other.

Proposition 2.4.7. (a, b) is homeomorphic to (0, 1).

Proof. Define $\phi : (0, 1) \to (a, b)$ by $\phi(t) = (b - a)t + a$. Then $\phi$ is bijective, and $\phi^{-1} : (a, b) \to (0, 1)$
is given by $\phi^{-1}(t) = \frac{t - a}{b - a}$. Both $\phi$ and $\phi^{-1}$ are linear functions, which are continuous (we will prove this using
an $\varepsilon$-$\delta$ proof later).

Continuity of functions on R

Definition 2.4.8. Let $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}$. Then $f$ is continuous at a point $c \in D$ if

for every $\varepsilon > 0$, $\exists\, \delta > 0$ such that, for $x \in D$, $|x - c| < \delta \Rightarrow |f(x) - f(c)| < \varepsilon$.
If $f$ is continuous at all points in $D$, then it is said to be continuous on $D$.

So, a function is not continuous (i.e., is discontinuous) at a point $c$ if

$\exists\, \varepsilon > 0$ such that $\forall\, \delta > 0$, $\exists\, x \in D$ such that $|x - c| < \delta$ but $|f(x) - f(c)| \ge \varepsilon$.

Example: Define $f : (0, 4) \to \mathbb{R}$ by
\[ f(x) = \begin{cases} x & 0 < x < 1 \\ 5 & x = 1 \\ 2 & 1 < x < 4 \end{cases} \]

Then $f$ is not continuous at $x = 1$. Take $\varepsilon = 1$. We show that there is no $\delta > 0$ such that if
$x$ is in $(0, 4)$ and $x$ is in the $\delta$-ball centered at 1, then $f(x)$ is within 1 [$= \varepsilon$] of 5 [$= f(1)$].
Now $|f(x) - 5| < 1 \Rightarrow 4 < f(x) < 6$.
If $0 < x < 1$, then $f(x) = x$ and so $0 < f(x) < 1$.
If $1 < x < 4$, then $f(x) = 2$.
So there is no open interval (i.e., ball) centered at 1 on which $f(x)$ is within 1 of 5. Hence $f$ is not
continuous at $x = 1$.
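
The negated definition can be exercised directly on this example. The sketch below is an illustration only: `f` encodes the piecewise function above, and `bad_point` is a hypothetical helper that, for a given $\delta$, returns one witness $x$ with $|x - 1| < \delta$ but $|f(x) - f(1)| \ge 1$. Floating-point numbers stand in for reals.

```python
def f(x):
    """The piecewise function of the example, defined on (0, 4)."""
    if 0 < x < 1:
        return x
    if x == 1:
        return 5
    if 1 < x < 4:
        return 2
    raise ValueError("x must lie in (0, 4)")


def bad_point(delta):
    """Return x in (0, 4) with |x - 1| < delta and |f(x) - f(1)| >= 1."""
    # Step a little to the right of 1, where f takes the value 2.
    return 1 + min(delta, 1) / 2


for delta in (2, 0.5, 1e-9):
    x = bad_point(delta)
    print(delta, x, abs(f(x) - f(1)))
```

Whatever $\delta$ is tried, the witness has $|f(x) - f(1)| = 3$, so no $\delta$ works for $\varepsilon = 1$.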

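Examples: Let's see in class how to use this $\varepsilon$-$\delta$ definition to prove continuity of the following
functions on their domains or at the given point.

1. $f : \mathbb{R} \to \mathbb{R}$, $f(x) = 2x - 9$.

2. $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$.

3. $f : [0, \infty) \to \mathbb{R}$, $f(x) = \sqrt{x}$.

4. $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \frac{x^3 - 4}{x^2 + 1}$, at $x = 2$.

5. $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \begin{cases} x \sin \frac{1}{x} & x \neq 0 \\ 0 & x = 0 \end{cases}$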

Sequential Criterion for Continuity: A function $f : D \to \mathbb{R}$ is continuous at $c \in D$ if and only if
for every sequence $(x_n)$ in $D$, $x_n \to c \Rightarrow f(x_n) \to f(c)$.

Let us see, with an example, how to use this criterion to prove continuity or discontinuity of a
function.

Example: Let $f(x) = \begin{cases} x & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q} \end{cases}$. Prove that $f$ is continuous only at $x = 0$.

We need to show continuity at x = 0, and prove f is discontinuous everywhere else.


Continuity at x = 0:
We will use the following lemma to show continuity at x = 0.

Lemma 2.4.9. Let $f : D \to \mathbb{R}$ with $c \in D$. Then $f$ is continuous at $c$ if the following two
conditions hold:

(i) every sequence of rationals $(q_n)$ in $D$ with $q_n \to c$ satisfies $f(q_n) \to f(c)$;

(ii) every sequence of irrationals $(r_n)$ in $D$ with $r_n \to c$ satisfies $f(r_n) \to f(c)$.

So, to use the lemma above to show continuity at $x = 0$, let $(q_n)$ and $(r_n)$ be sequences of rationals
and irrationals respectively, such that $q_n \to 0$ and $r_n \to 0$.
Then $f(q_n) = q_n \to 0 = f(0)$ and $f(r_n) = 0 \to 0 = f(0)$. Hence $f$ is continuous at 0.

Discontinuity at every $x \neq 0$:
Let $x \neq 0$, and suppose, for contradiction, that $f$ is continuous at $x$.
Case 1: $x \in \mathbb{Q}$
Since $\mathbb{R} \setminus \mathbb{Q}$ is dense in $\mathbb{R}$, we get a sequence $(r_n)$ in $\mathbb{R} \setminus \mathbb{Q}$ such that $r_n \to x$ (by Proposition
2.3.8).
Then $f(r_n)$ must converge to $f(x) = x$. But $f(r_n) = 0 \to 0$ and $0 \neq x$. This is a contradiction.
Case 2: $x \in \mathbb{R} \setminus \mathbb{Q}$
Since $\mathbb{Q}$ is dense in $\mathbb{R}$, we get a sequence $(q_n)$ in $\mathbb{Q}$ such that $q_n \to x$ (by Proposition 2.3.8).
Then $f(q_n)$ must converge to $f(x) = 0$. But $f(q_n) = q_n \to x$ and $x \neq 0$. This is a contradiction.
So in both cases there is a sequence $x_n \to x$ with $f(x_n) \not\to f(x)$; hence $f$ is discontinuous at all $x \neq 0$.

Uniform Continuity
We will spend almost all of this subsection discussing uniform continuity of functions on R. But
to emphasize the fact that the notion of uniform continuity is defined for any metric space, we will
start with a definition of a uniformly continuous function on a general metric space X.

Definition 2.4.10. A function $f : X \to Y$ is said to be uniformly continuous on $X$ if

for every $\varepsilon > 0$, $\exists\, \delta > 0$ such that, for all $x, y \in X$, $d_X(x, y) < \delta \Rightarrow d_Y(f(x), f(y)) < \varepsilon$.

Note: The concepts of uniform continuity of a function and continuity of a function are different
in following ways:
1. Uniform continuity is a property of a function on a set, whereas continuity can be defined at
a single point. To ask whether a given function is uniformly continuous at a certain point is
meaningless.
2. In the definition of continuity, the $\delta$ depends upon $\varepsilon$ and the point $c$. But in uniform continuity,
$\delta$ depends only upon $\varepsilon$; that is, for each $\varepsilon > 0$, we can find a $\delta > 0$ which will work for all
points $x$ of $X$.
3. Every uniformly continuous function on $X$ is continuous on $X$, but the converse is not true
(in general). A counterexample is $f(x) = x^2$, which is continuous on $\mathbb{R}$, but we will show
shortly that it is not uniformly continuous on $\mathbb{R}$.

We will now talk about uniform continuity of functions on $\mathbb{R}$. Let $f : D \to \mathbb{R}$, where $D \subseteq \mathbb{R}$. Then
$f$ is uniformly continuous on $D$ if
for every $\varepsilon > 0$, $\exists\, \delta > 0$ such that, for all $x, y \in D$, $|x - y| < \delta \Rightarrow |f(x) - f(y)| < \varepsilon$.
Thus, $f : D \to \mathbb{R}$ is not uniformly continuous if
$\exists\, \varepsilon > 0$ such that $\forall\, \delta > 0$, $\exists\, x, y \in D$ such that $|x - y| < \delta$ but $|f(x) - f(y)| \ge \varepsilon$.
Let's try to understand uniform continuity graphically: The left graph below shows that for a
uniformly continuous function, given an $\varepsilon > 0$, one should be able to find a $\delta$ that works for all points
$c$. If we get two different $\delta$, the smaller of them will work for both points. The graph on the right
shows a function which is not uniformly continuous: we cannot come up with one $\delta > 0$ which will
work for all points. Think about why we cannot find one $\delta$ which will work for all points. What is it
about the right graph that will not allow it?

Examples: The following are examples of functions, some of which are uniformly continuous and some of which are
not. We will see the proofs in class.

1. Let $f(x) = mx + b$ (any linear function); then $f$ is uniformly continuous on $\mathbb{R}$.

2. Let $f(x) = x^2$; then $f$ is not uniformly continuous on $\mathbb{R}$ or on $[a, \infty)$.

3. Let $f(x) = x^2$; then $f$ is uniformly continuous on $[a, b]$.

4. Let $f(x) = \frac{1}{x}$; then $f$ is uniformly continuous on $[1, \infty)$, but not on $(0, 1]$.

5. Let $f(x) = \sqrt{x}$; then $f$ is uniformly continuous on $[1, \infty)$ and also on $[0, 1]$.

You may ask: are there any conditions under which a continuous function becomes
uniformly continuous? The answer is yes; for instance, see the following result.

Theorem 2.4.11. A continuous function on a closed and bounded interval [a, b], is uniformly
continuous on [a, b].

We will skip the proof for now. Later in the course, time permitting, we will see that this result
follows from a general result in metric spaces (in topology, in fact) involving the concept of “compact
sets”.
This result confirms again that $f(x) = x^2$ is uniformly continuous on $[a, b]$, even though it is
not uniformly continuous on $\mathbb{R}$.
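
One way to see the failure of uniform continuity of $f(x) = x^2$ on $\mathbb{R}$ is to produce, for every $\delta$, an explicit pair violating the condition with $\varepsilon = 1$. The following sketch is one possible choice of witnesses and is not taken from the notes: with $x = 1/\delta$ and $y = x + \delta/2$ we get $|x - y| = \delta/2 < \delta$ and $|x^2 - y^2| = 1 + \delta^2/4 \ge 1$. The helper name `witness_not_uniform` is hypothetical, and exact rationals are used so that the check is not blurred by rounding.

```python
from fractions import Fraction


def witness_not_uniform(delta, eps=1):
    """Return (x, y) with |x - y| < delta but |x**2 - y**2| >= eps, for f(x) = x**2."""
    delta = Fraction(delta)
    eps = Fraction(eps)
    x = eps / delta      # far out on the real line, where x**2 is steep
    y = x + delta / 2    # within delta of x
    return x, y


for delta in (Fraction(1), Fraction(1, 10), Fraction(1, 10 ** 6)):
    x, y = witness_not_uniform(delta)
    print(delta, abs(x - y) < delta, abs(x * x - y * y) >= 1)
```

The witnesses move off to infinity as $\delta$ shrinks, which is exactly what is impossible on a closed and bounded interval $[a, b]$.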

Chapter 3

Differentiation

3.1 Review: Limit of a Function


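We begin by making clear that, in order to talk about the limit of a function $f : D \to \mathbb{R}$ at a
point $c$, unlike continuity, $f$ need not be defined at $c$. That is, $c$ need not be a point in the
domain of $f$. However, we do require $c$ to be close to the domain $D$, in the sense that $c$ must be
an accumulation point of $D$.

Definition 3.1.1. (Limit of a Function at a point) Let $D \subset \mathbb{R}$, let $f : D \to \mathbb{R}$, let $c$ be an
accumulation point of $D$, and let $L \in \mathbb{R}$. Then the limit of $f$ at $c$ is $L$ if
for every open ball centered at $L$, $(L - \varepsilon, L + \varepsilon)$, there exists an open ball centered at $c$, $(c - \delta, c + \delta)$,
such that
\[ x \in D \cap (c - \delta, c + \delta) \setminus \{c\} \Rightarrow f(x) \in (L - \varepsilon, L + \varepsilon) \quad \big(\text{i.e., } f[D \cap (c - \delta, c + \delta) \setminus \{c\}] \subseteq (L - \varepsilon, L + \varepsilon)\big). \]
$\Updownarrow$
For every $\varepsilon > 0$, $\exists\, \delta > 0$ such that if $x \in D \cap (c - \delta, c + \delta)$ and $x \neq c$, then $f(x) \in (L - \varepsilon, L + \varepsilon)$.
$\Updownarrow$
For every $\varepsilon > 0$, $\exists\, \delta > 0$ such that if $x \in D$ and $0 < |x - c| < \delta$ then $|f(x) - L| < \varepsilon$.
If the limit of $f$ at $c$ does not exist, we say that $f$ diverges at $c$.
We often write
\[ L = \lim_{x \to c} f(x) \quad \text{or} \quad L = \lim_{x \to c} f \quad \text{or} \quad f(x) \to L \text{ as } x \to c. \]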

Theorem 3.1.2 (Algebra of Limits). Let $f : D \to \mathbb{R}$ and $g : D \to \mathbb{R}$, and let $c$ be an accumulation
point of $D$. Let $\lim_{x \to c} f = L$ and $\lim_{x \to c} g = M$; then

(i) $\lim_{x \to c} (f + g) = L + M$

(ii) $\lim_{x \to c} (f - g) = L - M$

(iii) $\lim_{x \to c} (f \cdot g) = LM$

(iv) If $b \in \mathbb{R}$, then $\lim_{x \to c} (bf) = bL$

(v) If $g(x) \neq 0$ for all $x \in D$ and if $M \neq 0$, then $\lim_{x \to c} (f/g) = L/M$

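One-Sided Limits:

• Left-Hand Limit: $\lim_{x \to c^-} f = L$
if given any $\varepsilon > 0$ there exists a $\delta > 0$ such that for all $x \in D$ with $0 < c - x < \delta$, we have
$|f(x) - L| < \varepsilon$.

• Right-Hand Limit: $\lim_{x \to c^+} f = L$
if given any $\varepsilon > 0$ there exists a $\delta > 0$ such that for all $x \in D$ with $0 < x - c < \delta$, we have
$|f(x) - L| < \varepsilon$.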

Remark: Continuity and Limit are related as follows:


Let $f : D \to \mathbb{R}$ and let $c \in D$ be an accumulation point of $D$. Then the following are equivalent:

(i) $f$ is continuous at $c$,

(ii) $\lim_{x \to c} f = f(c)$,

(iii) $\lim_{x \to c^-} f = f(c) = \lim_{x \to c^+} f$.

Theorem 3.1.3. (Sequential Criterion for Limits) Let $f : D \to \mathbb{R}$, and let $c$ be an accumulation
point of $D$. Then the following are equivalent:

(i) $\lim_{x \to c} f = L$

(ii) For every sequence $(x_n)$ in $D$ which converges to $c$ such that $x_n \neq c$ for all $n \in \mathbb{N}$, the sequence
$(f(x_n))$ converges to $L$.

3.2 The Derivative
In order to talk about the derivative, we will consider functions with interval domain, i.e., f : I → R,
where I is an interval of one of the following forms:

(a, b), [a, b), (a, b], [a, b], (a, ∞), [a, ∞), (−∞, b), (−∞, b]

It is possible to define the derivative of a function having a non-interval domain, but the significance
of the concept is most naturally apparent for functions defined on intervals. Consequently we shall
limit our attention to such functions.

Definition 3.2.1. (Derivative of a function) The derivative of a function $f : I \to \mathbb{R}$ at $c \in I$, denoted by
$f'(c)$, is defined by
\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}, \]
provided this limit exists. The function is then said to be differentiable at $c$.
If $D \subset I$ and if $f$ is differentiable at each point of $D$, then $f$ is said to be differentiable on $D$.

Notation and Remarks

• Let $y = f(x)$; then $f'(x)$ is also written as $\frac{df(x)}{dx}$ or $\frac{dy}{dx}$.
• Geometric Interpretation: $f'(c)$ denotes the slope of the tangent line to the graph of $f$ at the point
$(c, f(c))$. The graph of a differentiable function is a smooth curve which does not have any sharp
turns or breaks.

• $\varepsilon$-$\delta$ definition: $f$ is differentiable at $c$ with derivative $f'(c)$ iff for each $\varepsilon > 0$ there exists a $\delta > 0$ such that if
$x \in I$ and $0 < |x - c| < \delta$ then $\left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| < \varepsilon$.
• Replacing $x - c$ with $h$ in the above definition, we obtain an equivalent expression which is often
used:
\[ f'(c) = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}. \]

Theorem 3.2.2. If $f : I \to \mathbb{R}$ is differentiable at $c$, then $f$ is continuous at $c$.
Proof. For $x \in I$, $x \neq c$,
\[ f(x) - f(c) = \left( \frac{f(x) - f(c)}{x - c} \right) (x - c). \]
Using the algebra of limits, and that $f$ is differentiable at $c$, we have
\[ \lim_{x \to c} (f(x) - f(c)) = \lim_{x \to c} \left( \frac{f(x) - f(c)}{x - c} \right) \cdot \lim_{x \to c} (x - c) = f'(c) \cdot 0 = 0. \]
So $\lim_{x \to c} f(x) = f(c)$; hence $f$ is continuous at $c$.

Example 3.2.3. The converse of the above result is not true. To see this, consider $f(x) = |x|$. Then we
know $f$ is continuous at $c = 0$, but for $x \neq 0$,
\[ \frac{f(x) - f(0)}{x - 0} = \frac{|x|}{x} = \begin{cases} 1 & x > 0 \\ -1 & x < 0 \end{cases} \]
So $\lim_{x \to 0} \frac{f(x) - f(0)}{x - 0}$ does not exist $\Rightarrow$ $f$ is not differentiable at 0.

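Example 3.2.4. Let's prove that the following functions are differentiable using the definition of the
derivative, and find their derivatives:
1. $f(x) = x^3$ for $x \in \mathbb{R}$

2. $f(x) = 1/\sqrt{x}$ for $x > 0$
3. $f(x) = x^{3/2}$ for $x > 0$
4. $f(x) = x^{1/3}$ for $x \in \mathbb{R}$, $x \neq 0$

1. Let $c \in \mathbb{R}$ and $x \neq c$; then
\[ \frac{f(x) - f(c)}{x - c} = \frac{x^3 - c^3}{x - c} = \frac{(x - c)(x^2 + xc + c^2)}{x - c} = x^2 + xc + c^2. \]
Thus $f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} (x^2 + xc + c^2) = 3c^2$.

2. Let $c > 0$ and $x > 0$, $x \neq c$; then
\begin{align*}
\frac{f(x) - f(c)}{x - c} &= \frac{\frac{1}{\sqrt{x}} - \frac{1}{\sqrt{c}}}{x - c} = \frac{\sqrt{c} - \sqrt{x}}{(x - c)\sqrt{x}\sqrt{c}} \\
&= \frac{c - x}{(x - c)\sqrt{x}\sqrt{c}(\sqrt{c} + \sqrt{x})} \quad \text{(by rationalization)} \\
&= \frac{-1}{\sqrt{x}\sqrt{c}(\sqrt{c} + \sqrt{x})}.
\end{align*}
Thus $f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} \frac{-1}{\sqrt{x}\sqrt{c}(\sqrt{c} + \sqrt{x})} = \frac{-1}{2c^{3/2}}$.

3. Let $c > 0$ and $x > 0$, $x \neq c$; then
\begin{align*}
\frac{f(x) - f(c)}{x - c} &= \frac{x^{3/2} - c^{3/2}}{x - c} = \frac{x^{3/2} - c^{3/2}}{x - c} \cdot \frac{x^{3/2} + c^{3/2}}{x^{3/2} + c^{3/2}} \\
&= \frac{x^3 - c^3}{(x - c)(x^{3/2} + c^{3/2})} \\
&= \frac{(x - c)(x^2 + xc + c^2)}{(x - c)(x^{3/2} + c^{3/2})}.
\end{align*}
Thus $f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} \frac{x^2 + xc + c^2}{x^{3/2} + c^{3/2}} = \frac{3c^2}{2c^{3/2}} = \frac{3}{2} c^{1/2}$.

4. Let $c \neq 0$ and $x \neq c$; then
\[ \frac{f(x) - f(c)}{x - c} = \frac{x^{1/3} - c^{1/3}}{x - c}. \]
Note that
\begin{align*}
x - c = (x^{1/3})^3 - (c^{1/3})^3 &= [x^{1/3} - c^{1/3}][(x^{1/3})^2 + x^{1/3} c^{1/3} + (c^{1/3})^2] \\
&= [x^{1/3} - c^{1/3}][x^{2/3} + x^{1/3} c^{1/3} + c^{2/3}].
\end{align*}
Thus
\begin{align*}
f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} &= \lim_{x \to c} \frac{x^{1/3} - c^{1/3}}{[x^{1/3} - c^{1/3}][x^{2/3} + x^{1/3} c^{1/3} + c^{2/3}]} \\
&= \lim_{x \to c} \frac{1}{x^{2/3} + x^{1/3} c^{1/3} + c^{2/3}} \\
&= \frac{1}{3c^{2/3}}.
\end{align*}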
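
The four derivatives just computed can be sanity-checked numerically by evaluating the difference quotient at a small step. This is only a sketch under stated assumptions: the helper name `difference_quotient`, the sample point $c = 2$ and the step $h = 10^{-6}$ are arbitrary illustrative choices, and floating-point agreement to a few digits is evidence, not a proof.

```python
def difference_quotient(f, c, h):
    """(f(c + h) - f(c)) / h, the quantity whose limit as h -> 0 is f'(c)."""
    return (f(c + h) - f(c)) / h


# Pairs (function, derivative formula obtained in Example 3.2.4), for x, c > 0.
examples = [
    (lambda x: x ** 3, lambda c: 3 * c ** 2),
    (lambda x: x ** -0.5, lambda c: -1 / (2 * c ** 1.5)),
    (lambda x: x ** 1.5, lambda c: 1.5 * c ** 0.5),
    (lambda x: x ** (1 / 3), lambda c: 1 / (3 * c ** (2 / 3))),
]

c = 2.0
for f, fprime in examples:
    approx = difference_quotient(f, c, 1e-6)
    print(approx, fprime(c))
```

Shrinking `h` further improves the agreement only up to a point, since rounding error in the numerator eventually dominates.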

Theorem 3.2.5. (Algebraic Rules for Differentiation) Let $f, g : I \to \mathbb{R}$ both be differentiable at
$c \in I$. Let $a \in \mathbb{R}$. Then
(i) $f \pm g$ are differentiable at $c$, with $(f \pm g)'(c) = f'(c) \pm g'(c)$.

(ii) $fg$ is differentiable at $c$, with $(fg)'(c) = f'(c)g(c) + f(c)g'(c)$.

(iii) $af$ is differentiable at $c$, with $(af)'(c) = af'(c)$.

(iv) If $g(c) \neq 0$ then $\frac{f}{g}$ is differentiable at $c$, with $\left( \frac{f}{g} \right)'(c) = \frac{g(c)f'(c) - f(c)g'(c)}{(g(c))^2}$.

Before we begin the proof, we state a lemma which we need for part (iv). This lemma states a
useful property of continuous functions: if a continuous function is non-zero at a point $c$,
then it stays non-zero in a neighborhood (or an open ball, in metric terms) around $c$.
Lemma 3.2.6. If $g : D \to \mathbb{R}$ is continuous at $c \in D$, and $g(c) \neq 0$, then $\exists\, \delta > 0$ such that $g(x) \neq 0$
for all $x \in (c - \delta, c + \delta) \cap D$.

Proof (of the theorem). Parts (i) and (iii) are easy, and are left as an exercise. We will prove parts
(ii) and (iv).
(ii) Let $x \in I$, $x \neq c$, and $h(x) = f(x)g(x)$; then
\begin{align*}
\frac{h(x) - h(c)}{x - c} &= \frac{f(x)g(x) - f(c)g(c)}{x - c} \\
&= \frac{f(x)g(x) - f(x)g(c) + f(x)g(c) - f(c)g(c)}{x - c} \\
&= \frac{f(x)(g(x) - g(c)) + g(c)(f(x) - f(c))}{x - c} \\
&= \frac{f(x)(g(x) - g(c))}{x - c} + \frac{g(c)(f(x) - f(c))}{x - c}.
\end{align*}

Thus,
\begin{align*}
h'(c) = \lim_{x \to c} \frac{h(x) - h(c)}{x - c} &= \lim_{x \to c} \left( \frac{f(x)(g(x) - g(c))}{x - c} + \frac{g(c)(f(x) - f(c))}{x - c} \right) \\
&= \lim_{x \to c} \frac{f(x)(g(x) - g(c))}{x - c} + \lim_{x \to c} \frac{g(c)(f(x) - f(c))}{x - c} \\
&= \lim_{x \to c} f(x) \cdot \lim_{x \to c} \frac{g(x) - g(c)}{x - c} + g(c) \cdot \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \\
&= f(c)g'(c) + g(c)f'(c) \quad (\text{using continuity of } f \text{ at } c \text{ and that } f', g' \text{ exist at } c).
\end{align*}

(iv) Let $h(x) = \frac{f(x)}{g(x)}$, and let $x \in I$, $x \neq c$.
Note that we know $g(c) \neq 0$. But we also want $g(x) \neq 0$ when $x \neq c$ is close to $c$, since
otherwise $h(x)$ will not be defined close to $c$. Luckily, $g$ is differentiable and hence continuous
at $c$. Thus by the lemma above, $\exists\, \delta > 0$ such that $g(x) \neq 0$ for all $x \in (c - \delta, c + \delta) \cap I$.

Let $x \in (c - \delta, c + \delta) \cap I$, $x \neq c$. Consider
\begin{align*}
\frac{h(x) - h(c)}{x - c} &= \frac{f(x)/g(x) - f(c)/g(c)}{x - c} \\
&= \frac{f(x)g(c) - f(c)g(x)}{g(x)g(c)(x - c)} \\
&= \frac{f(x)g(c) - f(c)g(c) + f(c)g(c) - f(c)g(x)}{g(x)g(c)(x - c)} \\
&= \frac{g(c)[f(x) - f(c)] + f(c)[g(c) - g(x)]}{g(x)g(c)(x - c)} \\
&= \frac{f(x) - f(c)}{g(x)(x - c)} - \frac{f(c)[g(x) - g(c)]}{g(x)g(c)(x - c)}.
\end{align*}

Thus,
\begin{align*}
h'(c) = \lim_{x \to c} \frac{h(x) - h(c)}{x - c} &= \lim_{x \to c} \left( \frac{f(x) - f(c)}{g(x)(x - c)} - \frac{f(c)[g(x) - g(c)]}{g(x)g(c)(x - c)} \right) \\
&= \lim_{x \to c} \frac{f(x) - f(c)}{g(x)(x - c)} - \lim_{x \to c} \frac{f(c)[g(x) - g(c)]}{g(x)g(c)(x - c)} \\
&= \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \cdot \lim_{x \to c} \frac{1}{g(x)} - \frac{f(c)}{g(c)} \lim_{x \to c} \frac{g(x) - g(c)}{x - c} \cdot \lim_{x \to c} \frac{1}{g(x)} \\
&= \frac{f'(c)}{g(c)} - \frac{f(c)g'(c)}{g^2(c)} \quad (\text{using continuity of } g \text{ at } c \text{ and that } f', g' \text{ exist at } c) \\
&= \frac{f'(c)g(c) - f(c)g'(c)}{g^2(c)}.
\end{align*}

The derivative is defined as the limit of a quotient (the difference quotient)
$\frac{f(x) - f(c)}{x - c}$. When we take the limit $x \to c$, we always assume $x \neq c$. The following provides an equivalent
characterization of the derivative which does not involve fractions. It also helps reduce theorems on
derivatives to theorems on continuity.

Theorem 3.2.7. [Caratheodory's Theorem] Let $f : I \to \mathbb{R}$ and $c \in I$. Then $f$ is differentiable at $c$
if and only if there exists a function $\phi : I \to \mathbb{R}$ which is continuous at $c$ and satisfies

\[ f(x) - f(c) = \phi(x)(x - c) \quad \text{for all } x \in I. \qquad (*) \]

In this case, we also have that $\phi(c) = f'(c)$.

Proof. ($\Rightarrow$) Suppose $f'(c)$ exists, and define $\phi$ by
\[ \phi(x) := \begin{cases} \dfrac{f(x) - f(c)}{x - c} & \text{for } x \neq c,\ x \in I \\ f'(c) & \text{for } x = c \end{cases} \]
Since $\lim_{x \to c} \phi(x) = f'(c) = \phi(c)$, $\phi$ is continuous at $c$. It is clear that $\phi$ satisfies the equation $(*)$.

($\Leftarrow$) Suppose $\phi$ is continuous at $c$ and satisfies $(*)$.

Let $x \neq c$; then using $(*)$ and the continuity of $\phi$ at $c$, we have

\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} \phi(x) = \phi(c). \]

Therefore, $f$ is differentiable at $c$ and $f'(c) = \phi(c)$.

Theorem 3.2.8. (Chain Rule) Let $I$ and $J$ be intervals in $\mathbb{R}$, let $f : I \to J$ and $g : J \to \mathbb{R}$, and
let $c \in I$, with $f$ differentiable at $c$ and $g$ differentiable at $f(c)$. Then $g \circ f$ is differentiable at $c$ and

\[ (g \circ f)'(c) = g'(f(c)) f'(c). \]

Proof. Since $f'(c)$ exists, Caratheodory's Theorem implies that there exists a function $\phi : I \to \mathbb{R}$
such that $\phi$ is continuous at $c$ and

\[ f(x) - f(c) = \phi(x)(x - c) \quad \forall\, x \in I, \text{ where } \phi(c) = f'(c). \]

Also, since $g'(f(c))$ exists, there is a function $\psi : J \to \mathbb{R}$ such that $\psi$ is continuous at $d := f(c)$ and

\[ g(y) - g(d) = \psi(y)(y - d) \quad \forall\, y \in J, \text{ where } \psi(d) = g'(d). \]

Substituting $y = f(x)$ and $d = f(c)$, we get

\[ g(f(x)) - g(f(c)) = \psi(f(x))(f(x) - f(c)) = \psi(f(x))[\phi(x)(x - c)] = [(\psi \circ f)(x) \cdot \phi(x)](x - c) \]

for all $x \in I$. Since $f$ is differentiable at $c$, it is continuous at $c$, so $\psi \circ f$ is continuous at $c$; hence $(\psi \circ f) \cdot \phi$ is continuous at $c$, and its value at $c$ is $\psi(f(c))\phi(c) = g'(f(c)) f'(c)$.
By Caratheodory's Theorem, $g \circ f$ is differentiable at $c$ and $(g \circ f)'(c) = g'(f(c)) f'(c)$.

3.3 Mean Value Theorem and its Applications

The Mean Value Theorem is one of the fundamental results in analysis, as the corollaries, examples,
applications will indicate. We will see its applications to uniform continuity, finding zeroes/roots
of a function, approximation of functions by polynomials (Taylor’s Theorem). We will also see
how Mean Value Theorem permits one to draw conclusions about the nature of a function from
information about its derivative.
Geometric Interpretation of Mean Value Theorem: There is a point on the curve y = f (x)
at which the tangent line is parallel to the line segment through the points (a, f (a)) and (b, f (b)).

We need some definitions and preliminary results to prove this theorem. These results are
important results, in their own right.

Definition 3.3.1. (Absolute Maximum and Minimum) Let $f : I \to \mathbb{R}$. Then
$f$ has an absolute maximum (respectively, an absolute minimum) on $I$ if there is a $c \in I$ such that
$f(x) \le f(c)$ (respectively, $f(c) \le f(x)$) for all $x \in I$.

The following important fact about continuous functions, which we will use in the proof of the Mean
Value Theorem, says that a continuous function on a closed and bounded interval attains both its
maximum and minimum.

Theorem 3.3.2. If f : [a, b] → R is continuous on [a, b], then f has an absolute maximum and an
absolute minimum on [a, b].

Definition 3.3.3. (Local Maximum and Minimum) Let $f : I \to \mathbb{R}$ with $c \in I$. Then $f$ has
a local maximum (respectively, a local minimum) at $c$ if there is a neighborhood (i.e., an open ball)
$U = (c - \delta, c + \delta)$ of $c$ such that $f(x) \le f(c)$ (respectively, $f(c) \le f(x)$) for all $x \in I \cap U$. We call
a local maximum or local minimum a local extremum.

Theorem 3.3.4. Let $f : I \to \mathbb{R}$ be differentiable at $c \in I$, where $c$ is an interior point of $I$ (that is, $c$
is not an end point of $I$). If $f$ has a local maximum or a local minimum at $c$, then $f'(c) = 0$.

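Proof. Let $f$ have a local maximum at $c$. By definition, and since $c$ is an interior point of $I$, $\exists\, \delta > 0$ such that $(c - \delta, c + \delta) \subseteq I$ and $f(x) \le f(c)$ for all
$x \in (c - \delta, c + \delta)$.
If $c - \delta < x < c$, then $f(x) - f(c) \le 0$ and $x - c < 0$, so $\frac{f(x) - f(c)}{x - c} \ge 0$. This implies $f'(c) = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0$.
If $c < x < c + \delta$, then $f(x) - f(c) \le 0$ and $x - c > 0$, so $\frac{f(x) - f(c)}{x - c} \le 0$. This implies $f'(c) = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \le 0$.
Hence $f'(c) = 0$.
The proof for $f$ having a local minimum at $c$ is similar.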

Theorem 3.3.5. [Mean Value Theorem] Let f : [a, b] → R be continuous on [a, b] and differen-
tiable on (a, b), then there is a c ∈ (a, b) such that

f (b) − f (a) = f 0 (c)(b − a).

Proof. Define F : [a, b] → R by

F (x) = [f (b) − f (a)]x − (b − a)f (x).

Since f is continuous on [a, b] and differentiable on (a, b), so is F , and
$F'(x) = [f(b) - f(a)] - (b - a)f'(x)$ for x ∈ (a, b). Also $F(a) = a f(b) - b f(a) = F(b)$.
It suffices to show that there is a c ∈ (a, b) such that $F'(c) = 0$, since then $f(b) - f(a) = f'(c)(b - a)$.
Case I: F is constant on [a, b]. Then $F'(x) = 0$ for all x ∈ (a, b).
Case II: F is not constant on [a, b]. Since F is continuous on [a, b], it has an absolute maximum
and an absolute minimum on [a, b]. If both absolute extrema occurred at the end points a and b,
then F (a) = F (b) would force the maximum and the minimum of F to be equal, so F would be
constant on [a, b], a contradiction. Thus, one of the absolute extrema must occur in the interior
of [a, b], i.e., ∃ c ∈ (a, b) such that F (c) is an extremum. By Theorem 3.3.4, $F'(c) = 0$.
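
As a numerical illustration of Theorem 3.3.5 (not part of the proof), the following Python sketch locates a point c for the sample function f (x) = x^3 on [0, 2]. The helper name mvt_point and the use of bisection are assumptions made for this illustration only; bisection works here because f'(x) minus the secant slope changes sign on the interval.

```python
import math


def mvt_point(f, df, a, b, tol=1e-12):
    """Approximate a c in (a, b) with df(c) = (f(b) - f(a)) / (b - a).

    Hypothetical helper: uses bisection, so it assumes that
    df(x) - slope has opposite signs (or a zero) at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return df(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("df - slope does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # a zero of g lies in [lo, mid]
        else:
            lo = mid  # a zero of g lies in [mid, hi]
    return (lo + hi) / 2


# f(x) = x^3 on [0, 2]: the secant slope is 4, so 3c^2 = 4 and c = 2/sqrt(3).
c = mvt_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
exact = 2 / math.sqrt(3)
print(c, exact)
```

The theorem only guarantees that some c exists; for this sample function the point is unique and can be compared with the closed form 2/√3.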

Consequence I: Uniform Continuity

Corollary 3.3.6. Let f : I → R be differentiable on I. If f has a bounded derivative on I, then f
is uniformly continuous on I.

Proof. Let M > 0 be such that $|f'(x)| \le M$ for all x ∈ I. Let $\varepsilon > 0$. We need
to show: ∃ δ > 0 such that for all x, y ∈ I, $|x - y| < \delta \Rightarrow |f(x) - f(y)| < \varepsilon$.
By the Mean Value Theorem, for any distinct x, y ∈ I, there is a c between x and y such that

\[ f(x) - f(y) = f'(c)(x - y). \]

Thus,

\[ |f(x) - f(y)| = |f'(c)|\,|x - y| \le M |x - y|, \]

and this bound holds trivially when x = y.

Let $\delta = \varepsilon / M$, and let x, y ∈ I with $|x - y| < \delta$. Then by the argument above we have
\[ |f(x) - f(y)| \le M |x - y| < M\delta = \varepsilon. \]


Note: The converse of the above corollary is not true. For example, $f(x) = \sqrt{x}$, as we have seen, is
uniformly continuous on [0, ∞), but its derivative $f'(x) = \frac{1}{2\sqrt{x}}$ is unbounded on (0, ∞).

Example 3.3.7. It follows from the above corollary that f (x) = sin x and g(x) = cos x are uniformly
continuous on R, since their derivatives on R are cos x and − sin x respectively, which are bounded
in absolute value by 1.
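
The choice of δ in the proof of Corollary 3.3.6 can be checked numerically for f (x) = sin x with M = 1. The sketch below is only a spot check on sampled points, not a proof; the sample grid and the name delta_for are assumptions made for illustration.

```python
import math


def delta_for(eps, M):
    """Return delta = eps / M, the choice made in the proof of Corollary 3.3.6."""
    return eps / M


eps = 1e-3
M = 1.0  # |cos x| <= 1 bounds the derivative of sin on R
delta = delta_for(eps, M)

# Sample pairs (x, y) with |x - y| < delta at many different locations on R.
worst = 0.0
for k in range(-1000, 1001):
    x = 0.37 * k
    y = x + 0.999 * delta
    worst = max(worst, abs(math.sin(x) - math.sin(y)))

print(worst < eps)
```

The same δ works at every sampled location, which is the point of uniform continuity: δ depends on ε and M only, not on x.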

Consequence II: Understanding the function via its derivative

Theorem 3.3.8. Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b), and let
$f'(x) = 0$ for all x ∈ (a, b). Then f is constant on [a, b].

Proof. Claim: f (x) = f (a) for all x ∈ [a, b].

Let x ∈ (a, b], and apply the Mean Value Theorem to f on [a, x]. There is a c ∈ (a, x) such that
$f(x) - f(a) = f'(c)(x - a)$. But $f'(t) = 0$ for all t ∈ (a, b), so $f'(c) = 0$, and f (x) − f (a) = 0.
Thus f (x) = f (a) for all x ∈ [a, b].

Corollary 3.3.9. Let f, g : [a, b] → R be continuous on [a, b] and differentiable on (a, b), and let
$f'(x) = g'(x)$ for all x ∈ (a, b). Then f = g + C on [a, b], where C is a constant.

Proposition 3.3.10. Let f be differentiable on I. Then

(i) f is increasing on I if and only if $f'(x) \ge 0$ for all x in I;

(ii) f is decreasing on I if and only if $f'(x) \le 0$ for all x in I.

Proof. (i) Let f be increasing on I, and let c ∈ I. For x ∈ I with x ≠ c:

If x < c, then f (x) ≤ f (c), so $\frac{f(x) - f(c)}{x - c} \ge 0$.
If x > c, then f (x) ≥ f (c), so $\frac{f(x) - f(c)}{x - c} \ge 0$.
Hence
\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \ge 0. \]
Since c ∈ I was arbitrary, $f'(x) \ge 0$ for all x ∈ I.
Conversely, suppose $f'(x) \ge 0$ for all x ∈ I. Let x, y ∈ I with x < y. Then by the Mean
Value Theorem, ∃ c ∈ (x, y) such that

\[ f(y) - f(x) = f'(c)(y - x) \ge 0 \quad (\text{since } f'(c) \ge 0 \text{ and } y > x). \]

Thus x ≤ y ⇒ f (x) ≤ f (y).

(ii) Similar.
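
A small numerical illustration of Proposition 3.3.10 (i), using the sample function f (x) = x^3 − 3x on [1, 4], where f'(x) = 3x^2 − 3 ≥ 0. The function and the sample grid are assumptions chosen for this sketch; checking finitely many points illustrates the statement but does not prove it.

```python
def f(x):
    return x**3 - 3 * x


def df(x):
    return 3 * x**2 - 3


# Sample points 1.00, 1.01, ..., 4.00 in the interval [1, 4].
xs = [1 + 0.01 * k for k in range(301)]

# The derivative is nonnegative at every sample point ...
deriv_nonneg = all(df(x) >= 0 for x in xs)

# ... and the function values never decrease along the sample.
nondecreasing = all(f(u) <= f(v) for u, v in zip(xs, xs[1:]))

print(deriv_nonneg, nondecreasing)
```

Note that f'(1) = 0, so the derivative may vanish at a point while f is still increasing in the sense used in the proposition.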

Consequence III: Roots of a function, establishing inequalities, and approximations

Corollary 3.3.11. [Rolle’s Theorem] Let f : [a, b] → R be continuous on [a, b] and differentiable
on (a, b), and let f (a) = f (b). Then there is a c ∈ (a, b) such that

\[ f'(c) = 0. \]

Proof. By the Mean Value Theorem, there is a c ∈ (a, b) such that $f'(c)(b - a) = f(b) - f(a) = 0$. Since b − a ≠ 0, it follows that $f'(c) = 0$.

Proposition 3.3.12. If f : I → R is differentiable on I and f has two distinct roots in I, then $f'$
has at least one root in I.

Proof. Let f have roots at a, b ∈ I with a < b, i.e., f (a) = 0 = f (b). Since f is differentiable on I, it is
continuous on [a, b] and differentiable on (a, b). Then by Rolle’s Theorem, applied to f
on [a, b], ∃ c ∈ (a, b) such that $f'(c) = 0$.

Example 3.3.13. Let $f(x) = 2x^5 + x^3 + 3x$. Then f has exactly one root.

Since f (0) = 0, x = 0 is a root. Suppose f has two distinct roots a < b, i.e., f (a) = 0 = f (b).
Then by Rolle’s Theorem ∃ c ∈ (a, b) such that $f'(c) = 0$. But $f'(x) = 10x^4 + 3x^2 + 3 \ge 3$ for all
x ∈ R, a contradiction. Hence f has exactly one root.
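
The two facts used in Example 3.3.13 can be spot-checked numerically. The sketch below samples f and f' on [−5, 5]; the grid is an assumption made for illustration, and the check complements the argument rather than replacing it.

```python
def f(x):
    return 2 * x**5 + x**3 + 3 * x


def df(x):
    return 10 * x**4 + 3 * x**2 + 3


# Sample points -5.00, -4.99, ..., 5.00.
xs = [k / 100 for k in range(-500, 501)]

# The derivative never drops below 3 on the sample (its value at x = 0).
min_slope = min(df(x) for x in xs)

# Consequently f is strictly increasing along the sample, so it can
# cross zero at most once; it does so at x = 0.
strictly_increasing = all(f(u) < f(v) for u, v in zip(xs, xs[1:]))

print(min_slope, strictly_increasing, f(0.0))
```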

Example 3.3.14. Prove the following inequalities:

(i) $e^x \ge 1 + x$ for all x ∈ R    (ii) $|\sin x| \le x$ for all x ≥ 0

(i) Case I: x = 0. Then $e^x = 1 = x + 1$, so the inequality holds.

Case II: x > 0.
Since $f(t) = e^t$ is differentiable on R, apply the Mean Value Theorem to f on [0, x]. So,
∃ c ∈ (0, x) such that
\[ e^x - e^0 = e^c (x - 0) \;\Rightarrow\; e^x - 1 = e^c x. \]
Since 0 < c < x, and $e^t$ is an increasing function,
\[ x e^0 < x e^c < x e^x \;\Rightarrow\; x < e^x - 1 < x e^x \;\Rightarrow\; e^x > x + 1. \]
Case III: x < 0. Apply the Mean Value Theorem to $f(t) = e^t$ on [x, 0]: ∃ c ∈ (x, 0) such that
$1 - e^x = e^c (0 - x) = -x e^c$. Since c < 0, we have $0 < e^c < 1$, and −x > 0, so $1 - e^x < -x$, i.e., $e^x > x + 1$.

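(ii) If x = 0, both sides equal 0. If x > 0, apply the Mean Value Theorem to f (t) = sin t on [0, x]. So ∃ c ∈ (0, x) such that

\[ \sin x - \sin 0 = (\cos c)\,x \;\Rightarrow\; \sin x = (\cos c)\,x \;\Rightarrow\; |\sin x| = |\cos c|\,|x| \le |x| = x. \]

Thus −x ≤ sin x ≤ x for all x ≥ 0.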
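
Both inequalities of Example 3.3.14 can be spot-checked on a grid of sample points. This is a sanity check under the assumption that the sampled range [−20, 20] is representative; it does not replace the Mean Value Theorem argument.

```python
import math

# Sample points -20.0, -19.9, ..., 20.0.
xs = [k / 10 for k in range(-200, 201)]

# (i) e^x >= 1 + x for every sampled real x.
exp_ok = all(math.exp(x) >= 1 + x for x in xs)

# (ii) |sin x| <= x for every sampled x >= 0.
sin_ok = all(abs(math.sin(x)) <= x for x in xs if x >= 0)

print(exp_ok, sin_ok)
```

Equality holds in both inequalities at x = 0, which is why that case is treated separately in the argument.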

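Example 3.3.15. Estimate $\sqrt{89}$.

Consider $f(x) = \sqrt{x}$ on [81, 89]. By the Mean Value Theorem, ∃ c ∈ (81, 89) such that
\[ \sqrt{89} - \sqrt{81} = \frac{1}{2\sqrt{c}}(89 - 81) \;\Rightarrow\; \sqrt{89} - 9 = \frac{4}{\sqrt{c}}. \]
Since 81 < c < 89 < 100, we have $9 < \sqrt{c} < 10$, and hence
\[ \frac{4}{10} < \frac{4}{\sqrt{c}} < \frac{4}{9}. \]
So
\[ \frac{4}{10} < \sqrt{89} - 9 < \frac{4}{9} \;\Rightarrow\; \frac{2}{5} + 9 < \sqrt{89} < \frac{4}{9} + 9 \;\Rightarrow\; 9.4 < \sqrt{89} < 9.4445. \]

Note: Using a calculator, $\sqrt{89} \approx 9.4339811320566$.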
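
The bounds 9 + 4/10 and 9 + 4/9 obtained from the Mean Value Theorem can be verified exactly by squaring them, and compared with a floating-point square root. The variable names below are assumptions made for this sketch.

```python
import math
from fractions import Fraction

# Bounds from 4/10 < sqrt(89) - 9 < 4/9.
lower = 9 + Fraction(4, 10)  # uses sqrt(c) < 10
upper = 9 + Fraction(4, 9)   # uses sqrt(c) > 9

# Exact check without floating point: lower^2 < 89 < upper^2.
exact_ok = lower**2 < 89 < upper**2

# Floating-point comparison with the computed square root.
root = math.sqrt(89)
print(float(lower), root, float(upper), exact_ok)
```

The interval has width 4/9 − 4/10 = 2/45, so this estimate pins down the square root of 89 to within about 0.045.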
