You are on page 1of 206

Lecture Notes in Calculus

Raz Kupferman
Institute of Mathematics
The Hebrew University
May 5, 2011
2
Contents
1 Real numbers 1
1.1 Axioms of field . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Axioms of order (as taught in 2009) . . . . . . . . . . . . . . . . 7
1.3 Axioms of order (as taught in 2010, 2011) . . . . . . . . . . . . . 10
1.4 Absolute values . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Special sets of numbers . . . . . . . . . . . . . . . . . . . . . . . 16
1.6 The Archimedean property . . . . . . . . . . . . . . . . . . . . . 18
1.7 Axiom of completeness . . . . . . . . . . . . . . . . . . . . . . . 22
1.8 Rational powers . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.9 Real-valued powers . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.10 Addendum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2 Functions 43
2.1 Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.2 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.3 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.4 Limits and order . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.5 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.6 Theorems about continuous functions . . . . . . . . . . . . . . . 70
2.7 Infinite limits and limits at infinity . . . . . . . . . . . . . . . . . 78
2.8 Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.9 Uniform continuity . . . . . . . . . . . . . . . . . . . . . . . . . 84
ii CONTENTS
3 Derivatives 91
3.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.2 Rules of differentiation . . . . . . . . . . . . . . . . . . . . . . . 98
3.3 Another look at derivatives . . . . . . . . . . . . . . . . . . . . . 102
3.4 The derivative and extrema . . . . . . . . . . . . . . . . . . . . . 104
3.5 Derivatives of inverse functions . . . . . . . . . . . . . . . . . . . 117
3.6 Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.7 Taylor’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4 Integration theory 133
4.1 Definition of the integral . . . . . . . . . . . . . . . . . . . . . . 133
4.2 Integration theorems . . . . . . . . . . . . . . . . . . . . . . . . 142
4.3 The fundamental theorem of calculus . . . . . . . . . . . . . . . . 153
4.4 Riemann sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
4.5 The trigonometric functions . . . . . . . . . . . . . . . . . . . . 158
4.6 The logarithm and the exponential . . . . . . . . . . . . . . . . . 163
4.7 Integration methods . . . . . . . . . . . . . . . . . . . . . . . . . 167
5 Sequences 171
5.1 Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
5.2 Limits of sequences . . . . . . . . . . . . . . . . . . . . . . . . . 172
5.3 Infinite series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
CONTENTS iii
Foreword
1. Texbooks: Spivak, Meizler, Hochman.
iv CONTENTS
Chapter 1
Real numbers
In this course we will cover the calculus of real univariate functions, which was
developed during more than two centuries. The pioneers were Isaac Newton
(1642-1737) and Gottfried Wilelm Leibniz (1646-1716). Some of their follow-
ers who will be mentioned along this course are Jakob Bernoulli (1654-1705),
Johann Bernoulli (1667-1748) and Leonhard Euler (1707-1783). These first two
generations of mathematicians developed most of the practice of calculus as we
know it; they could integrate many functions, solve many differential equations,
and sum up a large number of infinite series using a wealth of sophisticated analyt-
ical techniques. Yet, they were sometimes very vague about definitions and their
theory often laid on shaky grounds. The sound theory of calculus as we know it
today, and as we are going to learn it in this course was mostly developed through-
out the 19th century, notably by Joseph-Louis Lagrange (1736-1813), Augustin
Louis Cauchy (1789-1857), Georg Friedrich Bernhard Riemann (1826-1866), Pe-
ter Gustav Lejeune-Dirichlet (1805-1859), Joseph Liouville (1804-1882), Jean-
Gaston Darboux (1842-1917), and Karl Weierstrass (1815-1897). As calculus
was being established on firmer grounds, the theory of functions needed a thor-
ough revision of the concept of real numbers. In this context, we should mention
Georg Cantor (1845-1918) and Richard Dedekind (1831-1916).
1.1 Axioms of field
Even though we have been using “numbers” as an elementary notion since first
grade, a rigorous course of calculus should start by putting even such basic con-
2 Chapter 1
cept on axiomatic grounds.
The set of real numbers will be defined as an instance of a complete, ordered
field (לש רודס הדש).
Definition 1.1 A field F (הדש) is a non-empty set on which two binary opera-
tions are defined
1
: an operation which we call addition, and denote by +, and
an operation which we call multiplication and denote by (or by nothing, as in
a b = ab). The operations on elements of a field satisfy nine defining properties,
which we list now.
1. Addition is associative (יצוביק): for all a, b, c ∈ F,
(a + b) + c = a + (b + c).
It should be noted that addition was defined as a binary operation. As such,
there is no a-priori meaning to a +b +c. In fact, the meaning is ambiguous,
as we could first add a + b and then add c to the sum, or conversely, add
b + c, and then add the sum to a. The first axiom states that in either case,
the result is the same.
What about the addition of four elements, a + b + c + d? Does it require a
separate axiom of equivalence? We would like the following additions
((a + b) + c) + d
(a + (b + c)) + d
a + ((b + c) + d)
a + (b + (c + d))
(a + b) + (c + d)
to be equivalent. It is easy to see that the equivalence follows fromthe axiom
of associativity. Likewise (although it requires some non-trivial work), we
can show that the n-fold addition
a
1
+ a
2
+ + a
n
is defined unambiguously.
1
Abinary operation in F is a function that “takes” an ordered pair of elements in F and “returns”
an element in F. That is, to every a, b ∈ F corresponds one and only one a + b ∈ F and a b ∈ F.
In formal notation,
(∀a, b ∈ F)(∃!c ∈ F) : (c = a + b).
Real numbers 3
2. Existence of an additive-neutral element (רוביחל ילרטיינ רביא): there
exists an element 0 ∈ F such that for all a ∈ F
a + 0 = 0 + a = a.
3. Existence of an additive inverse (ידגנ רביא): for all a ∈ F there exists an
element b ∈ F, such that
a + b = b + a = 0.
It is customary to denote this additive inverse by (−a).
2
With the first three axioms there are a few things we can show. For example,
that
a + b = a + c implies b = c.
The proof requires all three axioms:
a + b = a + c
(−a) + (a + b) = (−a) + (a + c) (existence of inverse)
((−a) + a)) + b = ((−a) + a) + c (associativity)
0 + b = 0 + c (property of inverse)
b = c. (0 is the neutral element)
It follows at once that every element a ∈ F has a unique additive inverse,
for if b and c were both additive inverses of a, then a + b = 0 = a + c, from
which follows that b = c. Hence we can refer to the additive inverse of a,
which justifies the notation (−a).
We can also show that there is a unique element in F satisfying the neutral
property (it is not an a priori fact). Suppose there were a, b ∈ F such that
a + b = a.
Then adding (−a) (on the left!) and using the law of associativity we get
that b = 0.
2
The notation (−a) assumes implicitly that the inverse is unique. This is not an assumption, but
follows from the first three axioms.
4 Chapter 1
Comment: We only postulate the operations of addition and multiplication.
Subtraction is a short-hand notation for the addition of the additive inverse,
a − b
def
= a + (−b).
Comment: A set satisfying the first three axioms is called a group (הרובח).
By themselves, those three axioms have many implications, which can fill
an entire course.
4. Addition is commutative (יפוליח): for all a, b ∈ F
a + b = b + a.
With this law we finally obtain that any (finite!) summation of elements of
F can be re-arranged in any order.
On the other hand, our experience with numbers tells us that a − b = b − a
implies that a = b. There is no way we can prove it using only the first four
axioms! Indeed, all we obtain from it that
a + (−b) = b + (−a) (given)
(a + (−b)) + (b + a) = (b + (−a)) + (a + b) (commutativity)
a + ((−b) + b) + a = b + ((−a) + a) + b (associativity)
a + 0 + a = b + 0 + b (property of additive inverse)
a + a = b + b, (property of neutral element)
but how can we deduce from that that a = b?
With very little comments, we can state now the corresponding laws for
multiplication:
5. Multiplication is associative: for all a, b, c ∈ F,
a (b c) = (a b) c.
6. Existence of a multiplicative-neutral element: there exists an element
1 ∈ F such that for all a ∈ F,
a 1 = 1 a = a.
Real numbers 5
7. Existence of a multiplicative inverse (יכפוה רביא): for every a 0 there
exists an element a
−1
∈ F such that
3
a a
−1
= a
−1
a = 1.
The condition that a 0 has strong implications. For example, from the
fact that a b = a c we can deduce that b = c only if a 0.
Comment: Division is defined as multiplication by the inverse,
a/b
def
= a b
−1
.
8. Multiplication is commutative: for every a, b ∈ F,
a b = b a.
The last axiom relates the addition and multiplication operations:
9. Distributive law (וביקה קוח): for all a, b, c ∈ F,
a (b + c) = a b + a c.
We can revisit now the a − b = b − a problem. We can now proceed,
a + a = b + b
1 a + 1 a = 1 b + 1 b (property of neutral element)
(1 + 1) a = (1 + 1) b (distributive law).
We would be done if we knew that 1 + 1 0, for we would multiply both
sides by (1 + 1)
−1
. However, this does not follow from the axioms of field!
Example: Consider the following set F = ¦e, f ¦ with the following properties,
+ e f
e e f
f f e
e f
e e e
f e f
3
Here again, the uniqueness of the multiplicative inverse has to be proved, without which there
is no justification to the notation a
−1
.
6 Chapter 1
It takes some explicit verification to check that this is indeed a field (in fact the
smallest possible field), with e being the additive neutral and f being the multi-
plicative neutral (do you recognize this field?).
Note that
e − f = e + (−f ) = e + f = f + (−e) = f − e,
and yet e f , which is then no wonder that we can’t prove, based only on the
field axioms, that f − e = e − f implies that e = f .
Proposition 1.1 For every a ∈ F, a 0 = 0.
Proof : Using sequentially the property of the neutral element and the distributive
law:
a 0 = a (0 + 0) = a 0 + a 0.
Adding to both sides −(a 0) we obtain
a 0 = 0.
B
Comment: Suppose it were the case that 1 = 0, i.e., that the same element of F is
both the additive neutral and the multiplicative neutral. It would follows that for
every a ∈ F,
a = a 1 = a 0 = 0,
which means that 0 is the only element of F. This is indeed a field according to
the axioms, however a very boring one. Thus, we rule it out and require the field
to satisfy 0 1.
Comment: In principle, every algebraic identity should be proved from the ax-
ioms of field. In practice, we will assume henceforth that all known algebraic
manipulations are valid. For example, we will not bother to prove that
a − b = −(b − a)
(even though the proof is very easy). The only exception is (of course) if you get
to prove an algebraic identity as an assignment.
Real numbers 7
Exercise 1.1 Let F be a field. Prove, based on the field axioms, that
' If ab = a for some field element a 0, then b = 1.
^ a
3
− b
3
= (a − b)(a
2
+ ab + b
2
).
· If c 0 then a/b = (ac)/(bc).
´ If b, d 0 then
a
b
+
c
d
=
ad + bc
bd
.
Justify every step in your proof!
1.2 Axioms of order (as taught in 2009)
The real numbers contain much more structure than just being a field. They also
form an ordered set. The property of being ordered can be formalized by three
axioms:
Definition 1.2 (רוּדָס הדשׁ) A field F is said to be ordered if it has a distin-
guished subset P ⊂ F, which we call the positive elements, such that the following
three properties are satisfied:
1. Trichotomy: every element a ∈ F satisfies one and only one of the follow-
ing properties. Either (i) a ∈ P, or (ii) (−a) ∈ P, or (iii) a = 0.
2. Closure under addition: if a, b ∈ P then a + b ∈ P.
3. Closure under multiplication: if a, b ∈ P then a b ∈ P.
We supplement these axioms with the following definitions:
a < b means b − a ∈ P
a > b means b < a
a ≤ b means a < b or a = b
a ≥ b means a > b or a = b.
Comment: a > 0 means, by definition, that a−0 = a ∈ P. Similarly, a < 0 means,
by definition, that 0 − a = (−a) ∈ P.
We can then show a number of well-known properties of real numbers:
8 Chapter 1
Proposition 1.2 For every a, b ∈ F either (i) a < b, or (ii) a > b, or (iii) a = b.
Proof : This is an immediate consequence of the trichotomy law, along with our
definitions. We are asked to show that either (i) a−b ∈ P, or (ii) b−a = −(a−b) ∈
P, or (iii) that a − b = 0. That is, we are asked to show that a − b satisfies the
trichotomy, which is an axiom. B
Proposition 1.3 If a < b then a + c < b + c.
Proof :
(b + c) − (a + c) = b − a ∈ P.
B
Proposition 1.4 (Transitivity) If a < b and b < c then a < c.
Proof :
c − a = (c − b)
.,,.
in P
+(b − a)
.,,.
in P
.,,.
in P by closure under addition
.
B
Proposition 1.5 If a < 0 and b < 0 then ab > 0.
Proof : We have seen above that a, b < 0 is synonymous to (−a), (−b) ∈ P. By
the closure under multiplication, it follows that (−a)(−b) > 0. It remains to check
that the axioms of field imply that ab = (−a)(−b). B
Corollary 1.1 If a 0 then a a ≡ a
2
> 0.
Real numbers 9
Proof : If a 0, then by the trichotomy either a > 0, in which case a
2
> 0 follows
from the closure under multiplication, or a < 0 in which case a
2
> 0 follows from
the previous proposition, with a = b. B
Corollary 1.2 1 > 0.
Proof : This follows from the fact that 1 = 1
2
. B
Another corollary is that the field of complex numbers (yes, this is not within
the scope of the present course) cannot be ordered, since i
2
= (−1), which implies
that (−1) has to be positive, hence 1 has to be negative, which violates the previous
corollary.
Proposition 1.6 If a < b and c > 0 then ac < bc.
Proof : It is given that b − a ∈ P and c ∈ P. Then,
bc − ac = (b − a)
.,,.
in P
c
.,,.
in P
.,,.
in P by closure under multiplication
.
B
Exercise 1.2 Let F be an ordered field. Prove based on the axioms that
' a < b if and only if (−b) < (−a).
^ If a < b and c > d, then a − c < b − d.
· If a > 1 then a
2
> a.
´ If 0 < a < 1 then a
2
< a.
I If 0 ≤ a < b and 0 ≤ c < d then ac < bd.
I If 0 ≤ a < b then a
2
< b
2
.
Exercise 1.3 Prove using the axioms of ordered fields and using induction on
n that
10 Chapter 1
' If 0 ≤ x < y then x
n
< y
n
.
^ If x < y and n is odd then x
n
< y
n
.
· If x
n
= y
n
and n is odd then x = y.
´ If x
n
= y
n
and n is even then either x = y of x = −y.
I Conclude that
x
n
= y
n
if and only if x = y or x = −y.
Exercise 1.4 Let F be an ordered field, with multiplicative neutral term 1; we
also denote 1 +1 = 2. Prove that (a +b)/2 is well defined for all a, b ∈ F, and that
if a < b, then
a <
a + b
2
< b.
Exercise 1.5 Show that there can be no ordered field that has a finite number
of elements.
1.3 Axioms of order (as taught in 2010, 2011)
The real numbers contain much more structure than just being a field. They also
form an ordered set. The property of being ordered can be formalized by four
axioms:
Definition 1.3 (רוּדָס הדשׁ) A field F is said to be ordered if there exists a rela-
tion < (a relation is a property that every ordered pair of elements either satisfy
or not), such that the following four properties are satisfied:
O1 Trichotomy: every pair of elements a, b ∈ F satisfies one and only one of
the following properties. Either (i) a < b, or (ii) b < a, or (iii) a = b.
O2 Transitivity: if a < b and b < c then a < c.
O3 Invariance under addition: if a < b then a + c < b + c for all c ∈ F.
O4 Invariance under multiplication by positive numbers: if a < b then ac <
bc for all 0 < c.
Real numbers 11
Comment: The set of numbers a for which 0 < a are called the positive numbers.
The set of numbers a for which a < 0 are called the negative numbers.
We supplement these axioms with the following definitions:
a > b means b < a
a ≤ b means a < b or a = b
a ≥ b means a > b or a = b.
Proposition 1.7 The set of positive numbers is closed under addition and multi-
plication, namely
If 0 < a, b then 0 < a + b and 0 < ab.
Proof : Closure under addition follows from the fact that 0 < a implies that
b = 0 + b (neutral element)
< a + b, (O3)
and b < a + b follows from transitivity.
Closure under multiplication follows from the fact that
0 = 0 a (proved)
< b a. (O4)
B
(2 hrs, 2010) (2 hrs, 2011)
Proposition 1.8 0 < a if and only if (−a) < 0, i.e., a number is positive if and
only if its additive inverse is negative.
Proof : Suppose 0 < a, then
(−a) = 0 + (−a) (neutral element)
< a + (−a) (O3)
= 0 (additive inverse)
12 Chapter 1
i.e., (−a) < 0. Conversely, if (−a) < 0 then
0 = (−a) + a (additive inverse)
< 0 + a (O3)
= a (neutral element)
B
Corollary 1.3 The set of positive numbers is not empty.
Proof : Since by assumption 1 0, it follows by O1 that either 0 < 1 or 1 < 0. In
the latter case 0 < (−1), hence either 1 or (−1) is positive. B
Proposition 1.9 0 < 1.
Proof : By trichotomy, either 0 < 1, or 1 < 0, or 0 = 1. The third possibility is
ruled out by assumption. Suppose then, by contradiction, that 1 < 0. Then for
every 0 < a (and we know that at least one such exists),
a = a 1 < a 0 = 0,
i.e., a < 0 which violates the trichotomy axiom. Hence 0 < 1. B
Proposition 1.10 0 < a if and only if 0 < a
−1
, i.e., a number is positive if and
only if its multiplicative inverse is also positive.
Proof : Let 0 < a and suppose by contradiction that a
−1
< 0 (it can’t be zero
because a
−1
a = 1). Then,
1 = a a
−1
< a 0 = 0,
which is a contradiction. B
Real numbers 13
Proposition 1.11 If a < 0 and b < 0 then ab > 0.
Proof : We have seen above that a, b < 0 implies that 0 < (−a), (−b). By the
closure under multiplication, it follows that (−a)(−b) > 0. It remains to check that
the axioms of field imply that ab = (−a)(−b). B
Corollary 1.4 If a 0 then a a ≡ a
2
> 0.
Proof : If a 0, then by the trichotomy either a > 0, in which case a
2
> 0 follows
from the closure under multiplication, or a < 0 in which case a
2
> 0 follows from
the previous proposition, with a = b. B
Another corollary is that the field of complex numbers (which is not within the
scope of the present course) cannot be ordered, since i
2
= (−1), which implies
that (−1) has to be positive, hence 1 has to be negative, which is a violation.
1.4 Absolute values
Definition 1.4 For every element a in an ordered field F we define the absolute
value (טלחומ רע),
¦a¦ =
_
¸
¸
_
¸
¸
_
a a ≥ 0
(−a) a < 0.
We see right away that ¦a¦ = 0 if and only if a = 0, and otherwise ¦a¦ > 0.
Proposition 1.12 (Triangle inequality (שלושמה ויוויש יא)) For every a, b ∈ F,
¦a + b¦ ≤ ¦a¦ + ¦b¦.
Proof : We need to examine four cases (by trichotomy):
1. a ≥ 0 and b ≥ 0.
14 Chapter 1
2. a ≤ 0 and b ≥ 0.
3. a ≥ 0 and b ≤ 0.
4. a ≤ 0 and b ≤ 0.
In the first case, ¦a¦ = a and ¦b¦ = b, hence
¦a + b¦ = a + b = ¦a¦ + ¦b¦.
(Hey, does this mean that ¦a + b¦ ≤ ¦a¦ + ¦b¦?) In the fourth case, ¦a¦ = (−a),
¦b¦ = (−b), and 0 < −(a + b), hence
¦a + b¦ = −(a + b) = (−a) + (−b) = ¦a¦ + ¦b¦.
It remains to show the second case, as the third case follows by interchanging the
roles of a and b. Let a ≤ 0 and b ≥ 0, i.e., ¦a¦ = (−a) and ¦b¦ = b. We need to show
that
¦a + b¦ ≤ (−a) + b.
We can divide this case into two sub-cases. Either a + b ≥ 0, in which case
¦a + b¦ = a + b ≤ (−a) + b = ¦a¦ + ¦b¦.
If a + b < 0, then
¦a + b¦ = −(a + b) = (−a) + (−b) ≤ (−a) + b = ¦a¦ + ¦b¦,
which completes the proof
4
. B
Comment: By replacing b by (−b) we obtain that
¦a − b¦ ≤ ¦a¦ + ¦b¦.
The following version of the triangle inequality is also useful:
4
We have used the fact that a ≥ 0 implies (−a) ≤ a and a ≤ 0 implies a ≤ (−a). This is very
easy to show. For example,
a > 0 =⇒ a + (−a) > 0 + (−a) =⇒ 0 > (−a),
and (−a) < a follows from transitivity.
Real numbers 15
Proposition 1.13 (Reverse triangle inequality) For every a, b ∈ F,
¦¦a¦ − ¦b¦¦ ≤ ¦a + b¦.
Proof : By the triangle inequality,
∀a, c ∈ F, ¦a − c¦ ≤ ¦a¦ + ¦c¦.
Set c = b + a, in which case ¦b¦ ≤ ¦a¦ + ¦b + a¦, or
¦b¦ − ¦a¦ ≤ ¦b + a¦.
By interchanging the roles of a and b,
¦a¦ − ¦b¦ ≤ ¦a + b¦.
It follows that
5
¦¦a¦ − ¦b¦¦ ≤ ¦a + b¦,
B
Proposition 1.14 For every a, b, c ∈ F,
¦a + b + c¦ ≤ ¦a¦ + ¦b¦ + ¦c¦.
Proof : Use the triangle inequality twice,
¦a + b + c¦ = ¦a + (b + c)¦ ≤ ¦a¦ + ¦b + c¦ ≤ ¦a¦ + ¦b¦ + ¦c¦.
B
(2 hrs, 2009)
Exercise 1.6 Let min(x, y) and max(x, y) denote the smallest and the largest
of the two arguments, respectively. Show that
max(x, y) =
1
2
(x + y + ¦x − y¦) and min(x, y) =
1
2
(x + y − ¦x − y¦) .
5
Here we used the fact that any property satisfied by a and (−a) is also satisfied by ¦a¦.
16 Chapter 1
1.5 Special sets of numbers
What kind of sets can have the properties of an ordered field? We start by intro-
ducing the natural numbers (ייעבטה ירפסמה), by taking the element 1 (which
exists by the axioms), and naming the numbers
1 + 1, 1 + 1 + 1, 1 + 1 + 1 + 1, etc.
That these numbers are all distinct follows from the axioms of order (each number
is greater than its predecessor because 0 < 1). That we give them names, such as
2, 3, 4, and may represent them using a decimal system is immaterial. We denote
the set of natural numbers by N. Clearly the natural numbers do not form a field
(for example, there are no additive inverses because all the numbers are positive).
We can then augment the set of natural numbers by adding the numbers
0, −1, −(1 + 1), −(1 + 1 + 1), etc.
This forms the set of integers (ימלשה), which we denote by Z. The integers form
a commutative group with respect to addition, but are not a field since multiplica-
tive inverses do not exist
6
.
The set of numbers can be further augmented by adding all integer quotients,
m/n, n 0. This forms the set of rational numbers, which we denote by Q. It
should be noted that the rational numbers are the set of integer quotients modulo
an equivalence relation
7
. A rational number has infinitely many representations
as the quotient of two integers.
6
How do we know that (1 + 1)
−1
, for example, is not an integer? We know that it is positive
and non-zero. Suppose that it were true that 1 < (1 + 1)
−1
, then
1 + 1 = (1 + 1) 1 < (1 + 1) (1 + 1)
−1
= 1,
which is a violation of the axioms of order.
7
A few words about equivalence relations (תוליקש סחי). An equivalence relation on a set S is
a property that any two elements either have or don’t. If two elements a, b have this property, we
say that a is equivalent to b, and denote it by a ∼ b. An equivalence relation has to be symmetric
(a ∼ b implies b ∼ a), reflexive (a ∼ a for all a) and transitive (a ∼ b and b ∼ c implies a ∼ c).
Thus, with every element a ∈ S we can associate an equivalence class (תוליקש תקלחמ), which
is ¦b : b ∼ a¦. Every element belongs to one and only one equivalence class. An equivalence
relation partitions S into a collection of disjoint equivalence classes; we call this set S modulo the
equivalence relation ∼, and denote it S/ ∼.
Real numbers 17
Proposition 1.15 Let b, d 0. Then
a
b
=
c
d
if and only if ad = bc.
Proof : Multiply/divide both sides by bd (which cannot be zero if b, d 0). B
It can be checked that the rational numbers form an ordered field.
(4 hrs, 2010) (4 hrs, 2011)
In this course we will not teach mathematical induction, nor inductive defini-
tions; we will assume these concepts to be understood. As an example of an
inductive definition, we define for all a ∈ Q and n ∈ N, the n-th power, by setting,
a
1
= a and a
k+1
= a a
k
.
As an example of an inductive proof, consider the following proposition:
Proposition 1.16 Let a, b > 0 and n ∈ N. Then
a > b if and only if a
n
> b
n
.
Proof : We proceed by induction. This is certainly true for n = 1. Suppose this
were true for n = k. Then
a > b implies a
k
> b
k
implies a
k+1
= a a
k
> a b
k
> b b
k
= b
k+1
.
Conversely, if a
k+1
> b
k+1
then a ≤ b would imply that a
k+1
≤ b
k+1
, hence a > b.
B
Another inequality, which we will need occasionally is:
Proposition 1.17 (Bernoulli inequality) For all x > (−1) and n ∈ N,
(1 + x)
n
≥ 1 + nx.
18 Chapter 1
Proof : The proof is by induction. For n = 1 both sides are equal. Suppose this
were true for n = k. Then,
(1 + x)
k+1
= (1 + x)(1 + x)
k
≥ (1 + x)(1 + kx) = 1 + (k + 1)x + kx
2
≥ 1 + (k + 1)x,
where in the left-most inequality we have used explicitly the fact that 1 + x > 0.
B
Exercise 1.7 The following exercise deal with logical assertions, and is in
preparation to the type of assertions we will be dealing with all along this course.
For each of the following statements write its negation in hebrew, without using
the word “no”. Then write both the statement and its negation, using logical
quantifiers like ∀ and ∃.
' For every integer n, n
2
≥ n holds.
^ There exists a natural number M such that n < M for all natural numbers n.
· For every integer n, m, either n ≥ m or −n ≥ −m.
´ For every natural number n there exist natural numbers a, b,, such that b < n,
1 < a and n = ab.
(3 hrs, 2009)
1.6 The Archimedean property
We are aiming at constructing an entity that matches our notions and experiences
associated with numbers. Thus far, we defined an ordered field, and showed that it
must contain a set, which we called the integers, along with all numbers express-
ible at ratios of integers—the rational numbers Q. Whether it contains additional
elements is left for the moment unanswered. The point is that Q already has the
property of being an ordered field.
Before proceeding, we will need the following definitions:
Definition 1.5 A set of elements A ⊂ F is said to be bounded from above (וסח
ליעלמ) if there exists an element M ∈ F, such that a ≤ M for all a ∈ A, i.e.,
(∃M ∈ F) : (∀a ∈ A)(a ≤ M).
Real numbers 19
Such an element M is called an upper bound (ליעלמ סח) for A. A is said to be
bounded from below (ערלמ וסח) if
(∃m ∈ F) : (∀a ∈ A)(a ≥ m).
Such an element m is called a lower bound (ערלמ סח) for A. A set is said to be
bounded (וסח) if it is bounded both from above and below.
Note that if a set is upper bounded, then the upper bound is not unique, for if M is
an upper bound, so are M + 1, M + 2, and so on.
Proposition 1.18 A set A is bounded if and only if
(∃M ∈ F) : (∀a ∈ A)(¦a¦ ≤ M).
Proof :
8
Suppose A is bounded, then
(∃M
1
∈ F) : (∀a ∈ A)(a ≤ M
1
)
(∃M
2
∈ F) : (∀a ∈ A)(a ≥ M
2
).
That is,
(∃M
1
∈ F) : (∀a ∈ A)(a ≤ M
1
)
(∃M
2
∈ F) : (∀a ∈ A)((−a) ≤ (−M
2
)).
By taking M = max(M
1
, −M
2
) we obtain
(∀a ∈ A)(a ≤ M ∧ (−a) ≤ M),
i.e.,
(∀a ∈ A)(¦a¦ ≤ M).
The other direction isn’t much different. B
8
Once and for all: P if Q means that “if Q holds then P holds”, or Q implies P. P only if
Q means that “if Q does not hold, then P does not hold either”, which implies that P implies Q.
Thus, “if and only if” means that each one implies the other.
20 Chapter 1
A few notational conventions: for a < b we use the following notations for an
open (חותפ עטק), closed (רוגס עטק), and semi-open (רוגס יצח עטק) segment,
(a, b) = ¦x ∈ F : a < x < b¦
[a, b] = ¦x ∈ F : a ≤ x ≤ b¦
(a, b] = ¦x ∈ F : a < x ≤ b¦
[a, b) = ¦x ∈ F : a ≤ x < b¦.
Each of these sets is bounded (above and below).
We also use the following notations for sets that are bounded on one side,
(a, ∞) = ¦x ∈ F : x > a¦
[a, ∞) = ¦x ∈ F : x ≥ a¦
(−∞, a) = ¦x ∈ F : x < a¦
(−∞, a] = ¦x ∈ F : x ≤ a¦.
The first two sets are bounded from below, whereas the last two are bounded from
above. It should be emphasized that ±∞ are not members of F! The above is
purely a notation. Thus, “being less than infinity” does not mean being less than
a field element called infinity.
(5 hrs, 2011)
Consider now the set of natural numbers, N. Does it have an upper bound? Our
“geometrical picture” of the number axis clearly indicates that this set is un-
bounded, but can we prove it? We can easily prove the following:
Proposition 1.19 The natural numbers are not upper bounded by a natural num-
ber.
Proof : Suppose n ∈ N were an upper bound for N, i.e.,
(∀k ∈ N) : (k ≤ n).
However, m = n + 1 ∈ N and m > n, i.e.,
(∃m ∈ N) : (n < m),
which is a contradiction. B
Similarly,
Real numbers 21
Corollary 1.5 The natural numbers are not bounded from above by any rational
number
Proof : Suppose that p/q ∈ Q was an upper bound for N. Since, by the axioms of
order, p/q ≤ p, it would imply the existence of a natural number p that is upper
bound for N.
9
B
(5 hrs, 2010)
It may however be possible that an irrational element in F be an upper bound for
N. Can we rule out this possibility? It turns out that this cannot be proved from
the axioms of an ordered field. Since, however, the boundedness of the naturals is
so basic to our intuition, we may impose it as an additional axiom, known as the
axiom of the Archimedean field: the set of natural numbers is unbounded from
above
10
.
This axiom has a number of immediate consequences:
Proposition 1.20 In an Archimedean ordered field F:
(∀a ∈ F)(∃n ∈ N) : (a < n).
Proof : If this weren’t the case, then N would be bounded. Indeed, the negation of
the proposition is
(∃a ∈ F) : (∀n ∈ N)(n ≤ a),
i.e., a is an upper bound for N. B
Corollary 1.6 In an Archimedean ordered field F:
(∀ > 0)(∃n ∈ N) : (1/n < ).
9
Convince yourself that p/q ≤ p for all p, q ∈ N.
10
This additional assumption is temporary. The Archimedean property will be provable once
we add our last axiom.
22 Chapter 1
Proof : By the previous proposition,
(∀ > 0)(∃n ∈ N) : (0 < 1/ < n)
.,,.
(0<1/n<)
.
B
Corollary 1.7 In an Archimedean ordered field F:
(∀x, y > 0)(∃n ∈ N) : (y < nx).
Comment: This is really what is meant by the Archimedean property. For every
x, y > 0, a segment of length y can be covered by a finite number of segments of
length x.
y
x x x x x x
Proof : By the previous corollary with = y/x,
(∀x, y > 0)(∃n ∈ N) : (y/x < n)
.,,.
(y<nx)
.
B
1.7 Axiom of completeness
The Greeks knew already that the field of rational numbers is “incomplete”, in the
sense that there is no rational number whose square equals 2 (whereas they knew
by Pythagoras’ theorem that this should be the length of the diagonal of a unit
square). In fact, let’s prove it:
Proposition 1.21 There is no r ∈ Q such that r
2
= 2.
Real numbers 23
Proof : Suppose, by contradiction, that r = n/m, where n, m ∈ N (we can assume
that r is positive) satisfies r
2
= 2. Although it requires some number theoreti-
cal knowledge, we will assume that it is known that any rational number can be
brought into a form where m and n have no common divisor. By assumption,
m
2
/n
2
= 2, i.e., m
2
= 2n
2
. This means that m
2
is even, from which follows that m
is even, hence m = 2k for some k ∈ N. Hence, 4k
2
= 2n
2
, or n
2
= 2k
2
, from which
follows that n is even, contradicting the assumption that m and n have no common
divisor. B
Proof : Another proof: suppose again that r = n/m is irreducible and r
2
= 2.
Since, n
2
= 2m
2
, then
m
2
< n
2
< 4m
2
,
it follows that
m < n < 2m < 2n,
and
0 < n − m < m and 0 < 2m − n < n.
Consider now the ratio
q =
2m − n
n − m
,
whose numerator and denominator are both smaller than that of r. By elementary
arithmetic
q
2
=
4m
2
− 4mn + n
2
n
2
− 2nm + m
2
=
4 − 4n/m + n
2
/m
2
n
2
/m
2
− 2n/m + 1
=
6 − 4n/m
3 − 2n/m
= 2,
which is a contradiction. B
Exercise 1.8 Prove that there is no r ∈ Q such that r
2
= 3.
Exercise 1.9 What fails if we try to apply the same arguments to prove that
there is no r ∈ Q such that r
2
= 4 (an assertion that happens to be false)?
A way to cope with this “missing number” would be to add

2 “by hand” to the
set of rational numbers, along with all the numbers obtained by field operations
involving

2 and rational numbers (this is called in algebra a field extension)
24 Chapter 1
(הבחרה הדש)
11
. But then, what about

3? We could add all the square roots of
all positive rational numbers. And then, what about
3

2? Shall we add all n-th
roots? But then, what about a solution to the equation x
5
+ x + 1 = 0 (it cannot be
expressed in terms of roots as a result of Galois’ theory)?
It turns out that a single additional axiom, known as the axiom of completeness
(תומלשה תמויסכא), completes the set of rational numbers in one fell swoop, such
to provide solutions to all those questions. Let us try to look in more detail in
what sense is the field of rational numbers “incomplete”. Consider the following
two sets,
A = ¦x ∈ Q : 0 < x, x
2
≤ 2¦
B = ¦y ∈ Q : 0 < y, 2 ≤ y
2
¦.
Every element in B is greater or equal than every element in A (by transitivity and
by the fact that 0 < x ≤ y implies x
2
≤ y
2
). Formally,
(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).
Does there exist a rational number that “separates” the two sets, i.e., does there
exist an element c ∈ Q such that
(∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b) ?
It can be shown that if there existed such a c it would satisfy c
2
= 2, hence such a
c does not exist. This observation motivates the following definition:
Definition 1.6 An ordered field F is said to be complete if for every two non-
empty sets, A, B ⊆ F satisfying
(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).
there exists a c ∈ F such that
(∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b).
We will soon see how this axiom “completes” the field Q.
We next introduce more definitions. Let’s start with a motivating example:
11
It can be shown that this field extension of Q consists of all elements of the form
¦a + b

2 : a, b ∈ Q¦.
Real numbers 25
Example: For any set [a, b) = ¦x : a ≤ x < b¦, b is an upper bound, but so is any
larger element, e.g., b + 1. In fact, it is clear that b is the least upper bound.
Definition 1.7 Let A ⊆ F be a set. An element M ∈ F is called a least upper
bound (וילע סח) for A if (i) it is an upper bound for A, and (ii) if M
·
is also an
upper bound for A then M ≤ M
·
. That is,
(∀a ∈ A)(a ≤ M)
(∀M
·
∈ F)( if (∀a ∈ A)(a ≤ M
·
) then (M ≤ M
·
)).
We can see right away that a least upper bound, if it exists, is unique:
Proposition 1.22 Let A be a set and let M be a least upper bound for A. If M
·
is
also a least upper bound for A then M = M
·
.
Proof : It follows immediately from the definition of the least upper bound. If
M and M
·
are both least upper bounds, then both are in particular upper bounds,
hence M ≤ M
·
and M
·
≤ M, which implies that M = M
·
. B
We call the least upper bound of a set (if it exists!) a supremum, and denote it
by sup A. Similarly, the greatest lower bound (ותחת סח) of a set is called the
infimum, and it is denoted by inf A.
12
(7 hrs, 2011)
We can provide an equivalent definition of the least upper bound:
Proposition 1.23 Let A be a set. A number M is a least upper bound if and only
if (i) it is an upper bound, and (ii)
(∀ > 0)(∃a ∈ A) : (a > M − ).
12
In some books the notations lub (least upper bound) and glb (greatest lower bound) are used
instead of sup and inf.
26 Chapter 1
M
A
!!
a
Proof : There are two directions to be proved.
1. Suppose first that M were a least upper bound for A, i.e.,
(∀a ∈ A)(a ≤ M) and (∀M
·
∈ F)(if (∀a ∈ A)(a ≤ M
·
) then (M ≤ M
·
)).
Suppose, by contradiction, that there exists an > 0 such that a ≤ M− for
all a ∈ A, i.e., that
(∃ > 0) : (∀a ∈ A)(a < M − ),
then M − is an upper bound for A, smaller than the least upper bound,
which is a contradiction.
2. Conversely, suppose that M is an upper bound and
(∀ > 0)(∃a ∈ A) : (a > M − ).
Suppose that M was not a least upper bound. Then there exists a smaller
upper bound M
·
< M. Take = M − M
·
. Then, a ≤ M
·
= M − for all
a ∈ A, or in formal notation,
(∃ > 0) : (∀a ∈ A)(a < M − ),
which is a contradiction.
B
Examples: What are the least upper bounds (if they exist) in the following exam-
ples:
' [−5, 15] (answer: 15).
Real numbers 27
^ [−5, 15) (answer: 15).
· [−5, 15] ∪ ¦20¦ (answer: 20).
´ [−5, 15] ∪ (17, 18) (answer: 18).
I [−5, ∞) (answer: none).
I ¦1 − 1/n : n ∈ N¦ (answer: 1).
Exercise 1.10 Let F be an ordered field, and ∅ A ⊆ F. Prove that M = inf A
if and only if M is a lower bound for A, and
(∀ > 0)(∃a ∈ A) : (a < M + ).
Exercise 1.11 Show that the axiom of completeness is equivalent to the fol-
lowing statement: for every two non-empty sets A, B ⊂ F, satisfying that a ∈ A
and b ∈ B implies a ≤ b, there exists an element c ∈ F, such that
a ≤ c ≤ b for all a ∈ A and b ∈ B.
Exercise 1.12 For each of the following sets, determine whether it is upper
bounded, lower bounded, and if they are find their infimum and/or supremum:
' ¦(−2)
n
: n ∈ N¦.
^ ¦1/n : n ∈ Z, n 0¦.
· ¦1 + (−1)
n
: n ∈ N¦.
´ ¦1/n + (−1)
n
: n ∈ N¦.
I ¦n/(2n + 1) : n ∈ N¦.
I ¦x : ¦x
2
− 1¦ < 3¦.
Exercise 1.13 Let A, B be non-empty sets of real numbers. We define
A + B = ¦a + b : a ∈ A, b ∈ B¦
A − B = ¦a − b : a ∈ A, b ∈ B¦
A B = ¦a b : a ∈ A, b ∈ B¦.
Find A + B, A B and A A in each of the following cases:
' A = ¦1, 2, 3¦ and B = ¦−1, −2, −1/2, 1/2¦.
^ A = B = [0, 1].
28 Chapter 1
Exercise 1.14 Let A be a non-empty set of real numbers. Is each of the fol-
lowing statements necessarity true? Prove it or give a counter example:
' A + A = ¦2¦ A.
^ A − A = ¦0¦.
· ∀x ∈ A A, x ≥ 0.
Exercise 1.15 In each of the following items, A, B are non-empty subsets of a
complete ordered field. For the moment, you can only use the axiom of complete-
ness for the least upper bound.
' Suppose that A is lower bounded and set B = ¦x : −x ∈ A¦. Prove that B is
upper bounded, that A has an infimum, and sup B = −inf A.
^ Show that if A is upper bounded and B is lower bounded, then sup(A− B) =
sup A − inf B.
· Show that if A, B are lower bounded then inf(A + B) = inf A + inf B.
´ Suppose that A, B are upper bounded. It is necessarily true that sup(A B) =
sup A sup B?
I Show that if A, B only contain non-negative terms and are upper bounded,
then sup(A B) = sup A sup B.
Definition 1.8 Let A ⊂ F be a subset of an ordered field. It is said to have a
maximum if there exists an element M ∈ A which is an upper bound for A. We
denote
M = max A.
It is said to have a minimum if there exists an element m ∈ A which is a lower
bound for A. We denote
m = min A.
(7 hrs, 2010)
Proposition 1.24 If a set A has a maximum, then the maximum is the least upper
bound, i.e.,
sup A = max A.
Similarly, if A has a minimum, then the minimum is the greatest lower bound,
inf A = min A.
Real numbers 29
Proof : Let M = max A.
13
Then M is an upper bound for A, and for every > 0,
A ÷ M > M − ,
that is (∀ > 0)(∃a ∈ A)(a > M−), which proves that M is the least upper bound.
B
Examples: What are the maxima (if they exist) in the following examples:
' [−5, 15] (answer: 15).
^ [−5, 15) (answer: none).
· [−5, 15] ∪ ¦20¦ (answer: 20).
´ [−5, 15] ∪ (17, 18) (answer: none).
I [−5, ∞) (answer: none).
I ¦1 − 1/n : n ∈ N¦ (answer: none).
(5 hrs, 2009)
Proposition 1.25 Every finite set in an ordered field has a minimum and a maxi-
mum.
Proof : The proof is by induction on the size of the set. B
Corollary 1.8 Every finite set in an ordered field has a least upper bound and a
greatest lower bound.
Consider now the field Q of rational numbers, and its subset
A = ¦x ∈ Q : 0 < x, x
2
< 2¦.
This set is bounded from above, as 2, for example, is an upper bound. Indeed, if
x ∈ A, then
x
2
< 2 < 2
2
,
13
We emphasize that not every (non-empty) set has a maximum. The statement is that if a
maximum exists, then it is also the supremum.
30 Chapter 1
which implies that x < 2.
0 1 2
A
an upper bound
But does A have a least upper bound?
Proposition 1.26 Suppose that A has a least upper bound, then
(sup A)
2
= 2.
Proof : We assume that A has a least upper bound, and denote
α = sup A.
Clearly, 1 < α < 2.
Suppose, by contradiction, that α
2
< 2. Relying on the Archimedean property, we
choose n to be an integer large enough, such that
n > max
_
1
α
,
6
2 − α
2
_
.
We are going to show that there exists a rational number r > α, such that r
2
< 2,
which means that α is not even an upper bound for A—contradiction:
(α + 1/n)
2
= (α)
2
+
2
n
α +
1
n
2
< α
2
+
3
n
α (use 1/n < α)
< α
2
+
6
n
(use α < 2)
< α
2
+ 2 − α
2
= 2, (use 6/n < 2 − α
2
)
which means that α + 1/n ∈ A.
Real numbers 31
Similarly, suppose that α
2
> 2. This time we set
n >

α
2
− 2
.
We are going to show that there exists a rational number r < α, such that r
2
> 2,
which means that r is an upper bound for A smaller than α—again, a contradiction:
(α − 1/n)
2
= α
2

2
n
α +
1
n
2
> α
2

2
n
α (omit 1/n
2
)
> α
2
+ 2 − α
2
= 2, (use n/2α > α
2
− 2)
which means that, indeed, α − 1/n is an upper bound for A. B
Since we saw that there is no r ∈ Q such that r
2
= 2, it follows that there is no
supremum to this set in Q. We will now see that completeness guarantees the
existence of a lowest upper bound:
Proposition 1.27 In a complete ordered field every non-empty set that is upper-
bounded has a least upper bound.
Proof : Let A ⊂ F be a non-empty set that is upper-bounded. Define
B = ¦b ∈ F : b is an upper bound for A¦,
which by assumption is non-empty. Clearly,
(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).
By the axiom of completeness,
(∃c ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b).
Clearly c is an upper bound for A smaller or equal than every other upper bound,
hence
c = sup A.
B
(8 hrs, 2010)
The following proposition is an immediate consequence:
32 Chapter 1
Proposition 1.28 In a complete ordered field every non-empty set that is bounded
from below has a greatest lower bound (an infimum).
Proof : Let m be a lower bound for A, i.e.,
(∀a ∈ A)(m ≤ a).
Consider the set
B = ¦(−a) : a ∈ A¦ .
Since
(∀a ∈ A)((−m) ≥ (−a)),
it follows that (−m) is a upper bound for B. By the axiom of completeness, B has
a least upper bound, which we denote by M. Thus,
(∀a ∈ A)(M ≥ (−a)) and (∀ > 0)(∃a ∈ A) : ((−a) > M − ).
This means that
(∀a ∈ A)((−M) ≤ a) and (∀ > 0)(∃a ∈ A) : (a < (−M) + ).
This means that (−M) = inf A. B
With that, we define the set of real numbers as a complete ordered field, which we
denote by R. It turns out that this defines the set uniquely, up to a relabeling of its
elements (i.e., up to an isomorphism).
14
The axiom of completeness can be used to prove the Archimedean property. In
other words, the axiom of completeness solves at the same time the “problems”
pointed out in the previous section.
Proposition 1.29 (Archimedean property) N is not bounded from above.
14
Strictly speaking, we should prove that such an animal exists. Constructing the set of real
numbers from the set of rational numbers is beyond the scope of this course. You are strongly
recommended, however, to read about it. See for example the construction of Dedekind cuts.
Real numbers 33
Proof : Suppose N was bounded. Then it would have a least upper bound M,
(∃M ∈ R) : (∀n ∈ N) (n ≤ M).
Since n ∈ N implies n + 1 ∈ N,
(∀n ∈ N) ((n + 1) ≤ M),
i.e.,
(∀n ∈ N) (n ≤ (M − 1)).
This means that M−1 is an upper bound, contradicting the fact that M is the least
upper bound. B
(9 hrs, 2011)
We next prove an important facts about the density (תופיפצ) of rational numbers
within the reals.
Proposition 1.30 (The rational numbers are dense in the reals) Let x, y ∈ R,
such that x < y. There exists a rational number q ∈ Q such that x < q < y.
In formal notation,
(∀x, y ∈ R, x < y)(∃q ∈ Q) : (x < q < y).
Proof : Since the natural numbers are not bounded, there exists an m ∈ Z ⊂ Q
such that m < x. Also there exists a natural number n ∈ N such that 1/n < y − x.
Consider now the sequence of rational numbers,
r
k
= m +
k
n
, k = 0, 1, 2, . . . .
From the Archimedean property follows that there exists a k such that
r
k
= m +
k
n
> x.
Let k be the smallest such number (the exists a smallest k because every finite set
has a minimum), i.e.,
r
k
> x and r
k−1
≤ x.
34 Chapter 1
Hence,
x < r
k
= r
k−1
+
1
n
≤ x +
1
n
< y,
which completes the proof.
x y !
q
1/"
If we start #om an integer poin$
on the le% of x and proceed by rational
steps of size less than &y'x(, then w)
reach a rational point between x and y.
B
The following proposition provides a useful fact about complete ordered fields.
We will use it later in the course.
Proposition 1.31 ((יכתחה תמל)) Let A, B ⊂ F be two non-empty sets in a com-
plete ordered field, such that
(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b).
Then the following three statements are equivalent:
' (∃!M ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ M ≤ b).
^ sup A = inf B.
· (∀ > 0)(∃a ∈ A, b ∈ B) : (b − a < ).
Comment: The claim is not that the three items follow from the given data. The
claim is that each statement implies the two other.
Proof : Suppose that the first statement holds. We have already seen that sup A is
a lower bound for B. This implies at once that sup A ≤ inf B. If sup A < inf B,
then any number m in between would satisfy a < m < b for all a ∈ A, b ∈ B,
contradicting the assumption, hence ' implies ®.
Real numbers 35
By the properties of the infimum and the supremum,
(∀ > 0)(∃a ∈ A) : (a > sup A − /2)
(∀ > 0)(∃b ∈ B) : (b < inf B + /2).
That is,
(∀ > 0)(∃a ∈ A) : (a + /2 > sup A)
(∀ > 0)(∃b ∈ B) : (b − /2 < inf B),
i.e., (∀ > 0)(∃a ∈ A, b ∈ B) such that
b − a < (inf B + /2) − (sup A − /2) = .
Thus, ® implies ·.
It remains to show that · implies '. Suppose M was not unique, i.e., there were
M
1
< M
2
such that
(∀a ∈ A ∧ ∀b ∈ B)(a ≤ M
1
< M
2
≤ b).
Set m = (M
1
+ M
2
)/2 and = (M
2
− M
1
). Then,
(∀a ∈ A, b ∈ B) (a ≤ m − /2, b ≥ m + /2)
.,,.
b−a≥(m+/2)−(m−/2)=
,
i.e., if ' does not hold then · does not hold. This concludes the proof. B
(7 hrs, 2009)
Exercise 1.16 In this exercise you may assume that

2 is irrational.
' Show that if a + b

2 = 0, where a, b ∈ Q, then a = b = 0.
^ Conclude that a + b

2 = c + d

2, where a, b, c, d ∈ Q, then a = c and
b = d.
· Show that if a + b

2 ∈ Q, where a, b ∈ Q, then b = 0.
´ Show that the irrational numbers are dense, namely that there exists an irra-
tional number between every two.
I Show that the set
¦a + b

2 : a, b ∈ Q¦
with the standard addition and multiplication operations is a field.
36 Chapter 1
Exercise 1.17 Show that the set
¦m/2
n
: m ∈ Z, n ∈ N¦
is dense.
Exercise 1.18 Let A, B ⊆ R be non-empty sets. Prove or disprove each of the
following statements:
' If A ⊆ B and B is bounded from above then A is bounded from above.
^ If A ⊆ B and B is bounded from below then A is bounded from below.
· If every b ∈ B is an upper bound for A then every a ∈ A is a lower bound
form B.
´ A is bounded from above if and only if A ∩ Z is bounded from above.
Exercise 1.19 Let A ⊆ R be a non-empty set. Determine for each of the fol-
lowing statements whether it is equivalent, contradictory, implied by, or implying
the statement that s = sup A, or none of the above:
' a ≤ s for all a ∈ A.
^ For every a ∈ A there exists a r ∈ R, such that a < t < s.
· For every x ∈ R such that x > s there exists an a ∈ A such that s < a < x.
´ For every finite subset B ⊂ A, s ≥ max B.
I s is the supremum of A ∪ ¦s¦.
1.8 Rational powers
We defined the integer powers recursively,
x
1
= x x
k+1
= x x
k
.
We also define for x 0, x
0
= 1 and x
−n
= 1/x
n
.
Proposition 1.32 (Properties of integer powers) Let x, y 0 and m, n ∈ Z, then
' x
m
x
n
= x
m+n
.
Real numbers 37
^ (x
m
)
n
= x
mn
.
· (xy)
n
= x
n
y
n
.
´ 0 < α < β and n > 0 implies α
n
< β
n
.
I 0 < α < β and n < 0 implies α
n
> β
n
.
I α > 1 and n > m implies α
n
> α
m
.
I 0 < α < 1 and n > m implies α
n
< α
m
.
Proof : Do it. B
(10 hrs, 2011)
Definition 1.9 Let x > 0. An n-th root of x is a positive number y, such that
y
n
= x.
It is easy to see that if x has an n-th root then it is unique (since y
n
= z
n
, would
imply that y = z). Thus, we denote it by either
n

x, or by x
1/n
.
Theorem 1.1 (Existence and uniqueness of roots)
(∀x > 0 ∧ ∀n ∈ N)(∃!y > 0) : (y
n
= x).
(10 hrs, 2010)
Proof : Consider the set
S = ¦z ≥ 0 : z
n
< x¦.
This is a non-empty set (it contains zero) and bounded from above, since
if x ≤ 1 and z ∈ S then z
n
< x ≤ 1, which implies that z ≤ 1
if x > 1 and z ∈ S then z
n
< x, which implies that z ≤ x,
It follows that regardless of the value of x, z ∈ S implies that z ≤ max(1, x), and
the latter is hence an upper bound for S . By the axiom of completeness, there
exists a unique y = sup S , which is the “natural suspect” for being an n-th root of
x. Indeed, we will show that y
n
= x.
38 Chapter 1
Claim: y is positive Indeed,
0 <
x
1 + x
< 1,
hence
_
x
1 + x
_
n
<
x
1 + x
< x.
Thus, x/(1 + x) ∈ S , which implies that y > 0.
Claim: y
n
≥ x Suppose, by contradiction that y
n
< x. Set
0 < < min
_
1,
x − y
n
nx
_
.
Then,
nx < x − y
n
,
and
y
n
< x(1 − n) ≤ x(1 − )
n
,
where we used the Bernoulli inequality. This finally implies that
_
y
1 −
_
n
< x.
Thus, we found a number greater than y, whose n-th power is less than x, i.e., in
S . This contradicts the assumption that y is an upper bound for S .
Claim: y
n
≤ x Suppose, by contradiction that y
n
> x. Set
0 < < min
_
1,
y
n
− x
ny
n
_
.
Then,
ny
n
< y
n
− x,
and
[(1 − )y]
n
= (1 − )
n
y
n
≥ (1 − n)y
n
> x,
where we used once more the Bernoulli inequality. Thus, we found a number less
than y, whose n-th power is greater than x, i.e., it is an upper bound for S . This
contradicts the assumption that y is the least upper bound for S .
Real numbers 39
From the two inequalities follows (by trichotomy) that y
n
= x. B
Having shown that every positive number has a unique n-th root, we may proceed
to define rational powers.
Definition 1.10 Let r = m/n ∈ Q, then for all x > 0
x
r
= (x
m
)
1/n
.
There is one little delicacy with this definition. Recall that rational numbers do
not have a unique representation as the ratio of two integers. We thus need to show
that the above definition is independent of the representation
15
. In other words, if
ad = bc, with a, b, c, d ∈ Z, b, d 0, then for all x > 0
x
a/b
= x
c/d
.
This is a non-trivial fact. We need to prove that ad = bc implies that
sup¦z > 0 : z
b
< x
a
¦ = sup¦z > 0 : z
d
< x
c
¦.
The arguments goes as follows:
(x
a/b
)
bd
= (((x
a
)
1/b
)
b
)
d
= (x
a
)
d
= x
ad
= x
bc
= (x
c
)
b
= (((x
c
)
1/d
)
d
)
b
= (x
c/d
)
bd
,
hence x
a/b
= x
c/d
.
(8 hrs, 2009)
Proposition 1.33 (Properties of rational powers) Let x, y > 0 and r, s ∈ Q, then
' x
r
x
s
= x
r+s
.
^ (x
r
)
s
= x
rs
.
· (xy)
r
= x
r
y
r
.
´ 0 < α < β and r > 0 implies α
r
< β
r
.
I 0 < α < β and r < 0 implies α
r
> β
r
.
I α > 1 and r > s implies α
r
> α
s
.
I 0 < α < 1 and r > s implies α
r
< α
s
.
15
This is not obvious. Suppose we wanted to define the notion of even/odd for rational numbers,
by saying that a rational number is even if its numerator is even. Such as definition would imply
that 2/6 is even, but 1/3 (which is the same number!) is odd.
40 Chapter 1
Proof : We are going to prove only two items. Start with the first. Let r = a/b and
s = c/d. Using the (proved!) laws for integer powers,
(x
a/b
x
c/d
)
bd
= (x
a/b
)
bd
(x
c/d
)
bd
= = x
ad
x
bc
= x
ad+bc
= (x
a/b+c/d
)
bd
,
from which follows that
x
a/b
x
c/d
= x
a/b+c/d
.
Take then the fourth item. Let r = a/b. We already know that α
a
< β
a
. Thus
it remains to show α
1/b
< β
1/b
. This has to be because α
1/b
≥ β
1/b
would have
implied (from the rules for integer powers!) that α ≥ β. B
1.9 Real-valued powers
Delayed to much later in the course.
1.10 Addendum
Existence of non-Archimedean ordered fields Does there exist ordered fields
that do not satisfy the Archimedean property? The answer is positive. We will give
one example, bearing in mind that some of the arguments rely on later material.
Consider the set of rational functions. It is easy to see that this set forms a field
with respect to function addition and multiplication, with f (x) ≡ 0 for additive
neutral element and f (x) ≡ 1 for multiplicative neutral element. Ignore the fact
that a rational function may be undefined at a finite collection of points. The
”natural elements” are the constant functions, f (x) ≡ 1, f (x) ≡ 2, etc. We endow
this set with an order relation by defining the set P of positive elements as the
rational functions f that are eventually positive for x sufficiently large. We hace
f > g if f (x) > g(x) for x sufficiently large. It is easy to see that the set of “natural
elements” are bounded, say, by the element f (x) = x (which obviously belongs to
the set).
Connected set A set A ⊆ R will be said to be connected (רישק) if x, y ∈ A and
x < z < y implies z ∈ A (that is, every point between two elements in the set is
Real numbers 41
also n the set). We state that a (non-empty) connected set can only be of one of
the following forms:
¦a¦, (a, b), [a, b), (a, b], [a, b]
(a, ∞), [a, ∞), (−∞, b), (−∞, b], (−∞, ∞).
That is, it can only be a single point, an open, closed or semi-open segments, an
open or closed ray, or the whole line.
Exercise 1.20 Prove it.
42 Chapter 1
Chapter 2
Functions
2.1 Basic definitions
What is a function? There is a standard way of defining functions, but we will
deliberately be a little less formal than perhaps we should, and define a function
as a “machine”, which when provided with a number, returns a number (only one,
and always the same for the same input). In other words, a function is a rule that
assigns real numbers to real numbers.
A function is defined by three elements:
' A domain (וּחת): a subset of R. The numbers which may be “fed into the
machine”.
^ A range (חווט): another subset of R. Numbers that may be “emitted by the
machine”. We do not exclude the possibility that some of these numbers
may never be returned. We only require that every number returned by the
function belongs to its range.
· A transformation rule (הקתעה). The crucial point is that to every number
in its domain corresponds one and only one number in its range (תוּיכרע דח).
We normally denote functions by letters, like we do for real numbers (and for any
other mathematical entity). To avoid ambiguities, the function has to be defined
properly. For example, we may denote a function by the letter f . If A ⊆ R is its
domain, and B ⊆ R is its range, we write f : A → B ( f maps the set A into the set
B). The transformation rule has to specify what number in B is assigned by the
44 Chapter 2
function to each number x ∈ A. We denote the assignment by f (x) (the function
f evaluated at x). That the assignment rule is “assign f (x) to x” is denoted by
f : x ·→ f (x) (pronounced “ f maps x to f (x)”)
1
.
Definition 2.1 Let f : A → B be a function. Its image (הנומת) is the subset of B
of numbers that are actually assigned by the function. That is,
Image( f ) = ¦y ∈ B : ∃x ∈ A : f (x) = y¦.
The function f is said to be onto B (לע) if B is its image. f is said to be one-to-
one (תיכרע דח דח) if to each number in its image corresponds a unique number
in its domain, i.e.,
(∀y ∈ Image( f ))(∃!x ∈ A) : ( f (x) = y).
Examples:
' A function that assigns to every real number its square. If we denote the
function by f , then
f : R → R and f : x ·→ x
2
.
We may also write f (x) = x
2
.
We should not say, however, that “the function f is x
2
”. In particular, we
may use any letter other than x as an argument for f . Thus, the functions
f : R → R, defined by the transformation rules f (x) = x
2
, f (t) = t
2
,
f (α) = α
2
and f (ξ) = ξ
2
are identical
2
.
Also, it turns out that the function f only returns non-negative numbers.
There is however nothing wrong with the definition of the range as the
whole of R. We could limit the range to be the set [0, ∞), but not to the
set [1, 2].
^ A function that assigns to every w ±1 the number (w
3
+3w+5)/(w
2
−1).
If we denote this function by g, then
g : R \ ¦±1¦ → R and g : w ·→
w
3
+ 3w + 5
w
2
− 1
.
1
Programmers: think of f : A → B as defining the “type” or “syntax” of the function, and of
f : x ·→ f (x) as defining the “action” of the function.
2
It takes a great skill not to confuse the Greek letters ζ (zeta) and ξ (xi).
Functions 45
· A function that assigns to every −17 ≤ x ≤ π/3 its square. This function
differs from the function in the first example because the two functions do
not have the same domain (different “syntax” but same “routine”).
´ A function that assigns to every real numbers the value zero if it is irrational
and one if it is rational. This function is known as the Dirichlet function.
We have f : R → ¦0, 1¦, with
f : x ·→
_
¸
¸
_
¸
¸
_
0 x is irrational
1 x is rational.
I A function defined on the domain
A = ¦2, 17, π
2
/17, 36/π¦ ∪ ¦a + b

2 : a, b ∈ Q¦,
such that
x ·→
_
¸
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
¸
_
5 x = 2
36/π x = 17
28 x = π
2
/17 or 36/π
16 otherwise.
The range may be taken to be R, but the image is ¦6, 16, 28, 36/π¦.
I A function defined on R \ Q (the irrational numbers), which assigns to x
the number of 7’s in its decimal expansion, if this number is finite. If this
number is infinite, then it returns −π. This example differs from the previous
ones in that we do not have an assignment rule in closed form (how the heck
do we compute f (x)?). Nevertheless it provides a legal assignment rule
3
.
I For every n ∈ N we may define the n-th power function f
n
: R → R, by
f
n
: x ·→ x
n
. Here again, we will avoid referring to “the function x
n
”. The
function f
1
: x ·→ x is known as the identity function, often denoted by Id,
namely
Id : R → R, Id : x ·→ x.
3
We limited the range of the function to the irrationals because rational numbers may have a
non-unique decimal representation, e.g.,
0.7 = 0.6999999 . . . .
46 Chapter 2
I There are many functions that you all know since high school, such as the
sine, the cosine, the exponential, and the logarithm. These function re-
quire a careful, sometimes complicated, definition. It is our choice in the
present course to assume that the meaning of these functions (along with
their domains and ranges) are well-understood.
Given several functions, they can be combined together to form new functions.
For example, functions form a vector space over the reals. Let f : A → R and
g : B → R be given functions, and a, b ∈ R. We may define a new function
4
a f + b g : A ∩ B → R, a f + b g : x ·→ a f (x) + b g(x).
This is a vector field whose zero element is the zero function, x ·→ 0; given a
function f , its inverse is −f = (−1) f .
Moreover, functions form an algebra. We may define the product of two func-
tions,
f g : A ∩ B → R, f g : x ·→ f (x)g(x),
as well as their quotient,
f /g : A ∩ ¦z ∈ B : g(z) 0¦ → R, f /g : x ·→ f (x)/g(x),
(12 hrs, 2011)
A third operation that can combine two functions is composition. Let f : A → B
and g : B → C. We define
g ◦ f : A → C, g ◦ f : x ·→ g( f (x)).
For example, if f is the sine function and g is the square function, then
5
g ◦ f : ξ ·→ sin
2
ξ and f ◦ g : ζ ·→ sin ζ
2
,
i.e., composition is non-commutative. On the other hand, composition is associa-
tive, namely,
( f ◦ g) ◦ h = f ◦ (g ◦ h).
4
This is not trivial. We are adding “machines”, not numbers.
5
It is definitely a habit to use the letter x as generic argument for functions. Beware of fixations!
Any letter including Ξ is as good.
Functions 47
Note that for every function f ,
Id ◦ f = f ◦ Id = f ,
so that the identity is the neutral element with respect to function composition.
This should not be confused with the fact that x ·→ 1 is the neutral element with
respect to function multiplication.
Example: Consider the function f that assigns the rule
f : x ·→
x + x
2
+ x sin x
2
x sin x + x sin
2
x
.
This function can be written as
f =
Id +Id Id +Id sin ◦(Id Id)
Id sin +Id sin sin
.

Example: Recall the n-th power functions f
n
. A function P is called a polynomial
of degree n if there exist real numbers (a
i
)
n
i=0
, with a
n
0, such that
P =
n

k=0
a
k
f
k
.
The union over all n’s of polynomials of degree n is the set of polynomials. A
function is called rational if it is the ratio of two polynomials.
(10 hrs, 2009)
Exercise 2.1 The following exercise deals with the algebra of functions. Con-
sider the three following functions,
s(x) = sin x P(x) = 2
x
and S (x) = x
2
,
all three defined on R.
' Express the following functions without using the symbols s, P, and S : (i)
(s ◦ P)(y), (ii) (S ◦ s)(ξ), and (iii) (S ◦ P ◦ s)(t) + (s ◦ P)(t).
48 Chapter 2
^ Write the following functions using only the symbols s, P, and S , along
with the binary operators +, , and ◦: (i) f (x) = 2
sin x
, (ii) f (t) = 2
2
x
, (iii)
f (u) = sin(2
u
+ 2
u
2
), (iv) f (y) = sin(sin(sin(2
2
sin y
))), and (v) f (a) = 2
sin
2
a
+
sin(a
2
) + 2
sin(a
2
+sin a)
.
Exercise 2.2 An open segment I is called symmetric if there exists an a > 0
such that I = (−a, a). Let f : I → R where I is a symmetric segment; the function
f is called even if f (−x) = f (x) for every x ∈ I, and odd if f (−x) = −f (x) for
every x ∈ I.
Let f , g : I → R, where I is a symmetric segments, and consider the four possibili-
ties of f , g being either even or odd. In each case, determine whether the specified
function is even, odd, or not necessarily either two. Explain your statement, and
where applicable give a counter example:
' f + g.
^ f g.
· f ◦ g (here we assume that g : I → J and f : J → R, where I and J are
symmetric segments).
Exercise 2.3 Show that any function f : R → R can be written as
f = E + O,
where E is even and O is odd. Prove then that this decomposition is unique. That
is, if f = E + O = E
·
+ O
·
, then E = E
·
and O = O
·
.
Exercise 2.4 The following exercise deals with polynomials.
' Prove that for every polynomial of degree ≥ 1 and for every real number
a there exists a polynomial g and a real number b, such that f (x) = (x −
a)g(x) + b (hint: use induction on the degree of the polynomial).
^ Prove that if f is a polynomial satisfying f (a) = 0, then there exists a
polynomial g such that f (x) = (x − a)g(x).
· Conclude that if f is a polynomial of degree n, then there are at most n real
numbers a, such that f (a) = 0 (at most n roots).
Exercise 2.5 Let a, b, c ∈ R and a 0, and let f : A → B be given by
f : x ·→ ax
2
+ bx + c.
Functions 49
' Suppose B = R. Find the largest segment A for which f is one-to-one.
^ Suppose A = R. Find the largest segment B for which f is onto.
· Find segments A and B as large as possible for which f is one-to-one and
onto.
Exercise 2.6 Let f satisfy the relation f (x + y) = f (x) + f (y) for all x, y ∈ R.
' Prove inductively that
f (x
1
+ x
2
+ + x
n
) = f (x
1
) + f (x
2
) + + f (x
n
)
for all x
1
, . . . , x
n
∈ R.
^ Prove the existence of a real number c, such that f (x) = cx for all x ∈ Q.
(Note that we claim nothing for irrational arguments. )
2.2 Graphs
As you know, we can associate with every function a graph. What is a graph? A
drawing? A graph has to be thought of as a subset of the plane. For a function
f : A → B, we define the graph of f to be the set
Graph( f ) = ¦(x, y) : x ∈ A, y = f (x)¦ ⊂ A × B.
The defining property of the function is that it is uniquely defined, i.e.,
(x, y) ∈ Graph( f ) and (x, z) ∈ Graph( f ) implies y = z.
The function is one-to-one if
(x, y) ∈ Graph( f ) and (w, y) ∈ Graph( f ) implies x = w.
It is onto B if
(∀y ∈ B)(∃x ∈ A) : ((x, y) ∈ Graph( f )).
The definition we gave to a function as an assignment rule is, strictly speaking, not
a formal one. The standard way to define a function is via its graph. A function is
a graph—a subset of the Cartesian product space A × B.
50 Chapter 2
Comment: In fact, this is the general way to define functions. Let A and B be two
sets (of anything!). A function f : A → B is a subset of the Cartesian product
Graph( f ) ⊂ A × B = ¦(a, b) : a ∈ A, b ∈ B¦,
satisfying
(∀a ∈ A)(∃!b ∈ B) : ((a, b) ∈ Graph( f )).
(12 hrs, 2010)
2.3 Limits
Definition 2.2 Let x ∈ R. A neighborhood of x (הביבס) is an open segment
(a, b) that contains the point x (note that since the segment is open, x cannot be
a boundary point). A punctured neighborhood of x (תֶבֶקוּנמ הביבס) is a set
(a, b) \ ¦x¦ where a < x < b.
a a
punctured neighborhood
neighborhood
Definition 2.3 Let A ⊂ R. A point a ∈ A is called an interior point of A (הדוּקנ
תימינפ) if it had a neighborhood contained in A.
Notation: We will mostly deal with symmetric neighborhoods (whether punc-
tured or not), i.e., neighborhoods of a of the form
¦x : ¦x − a¦ < δ¦
for some δ > 0. We will introduce the following notations for neighborhoods,
B(a, δ) = (a − δ, a + δ)
B

(a, δ) = ¦x : 0 < ¦x − a¦ < δ¦.
Functions 51
We will also define one-sided neighborhoods,
B
+
(a, δ) = [a, a + δ)
B

+
(a, δ) = (a, a + δ)
B

(a, δ) = (a − δ, a]
B


(a, δ) = (a − δ, a).
Example: For A = [0, 1], the point 1/2 is an interior point, but not the point 0.

Example: The set Q ⊂ R has no interior points, and neither does its complement,
R \ Q.
In this section we explain the meaning of the limit of a function at a point. Infor-
mally, we say that:
The limit of a function f at a point a is , if we can make f assign
a value as close to as we wish, by making its argument sufficiently
close to a (excluding the value a itself).
Note that the function does not need to be equal to at the point a; in fact, it does
not even need to be defined at the point a.
Example: Consider the function
f : R → R f : x ·→ 3x.
We claim that the limit of this function at the point 5 is 15. This means that we
can make f (x) be as close to 15 as we wish, by making x sufficiently close to 5,
with 5 itself being excluded. Suppose you want f (x) to differ from 15 by less than
1/100. This means that you want
15 −
1
100
< f (x) = 3x < 15 +
1
100
.
This requirement is guaranteed if
5 −
1
300
< x < 5 +
1
300
.
52 Chapter 2
Thus, if we take x to differ from 5 by less than 1/300 (but more than zero!), then
we are guaranteed to have f (x) within the desired range. Since we can repeat this
construction for any number other than 1/100, then we conclude that the limit of
f at 5 is 15. We write,
lim
5
f = 15.
although the more common notation is
6
lim
x→5
f (x) = 15.
We can be more precise. Suppose you want f (x) to differ from 15 by less than ,
for some > 0 of your choice. In other words, you want
¦ f (x) − 15¦ = ¦3x − 15¦ < .
This is guaranteed if ¦x − 5¦ < /3, thus given > 0, choosing x within a sym-
metric punctured neighborhood of 5 of length 2/3 guarantees that f (x) is within
a distance of from 15.
This example insinuates what would be a formal definition of the limit of a func-
tion at a point:
Definition 2.4 Let f : A → B with a ∈ A an interior point. We say that the limit
of f at a is , if for every > 0, there exists a δ > 0, such that
7
¦ f (x) − ¦ < for all x ∈ B

(a, δ).
In formal notation,
(∀ > 0)(∃δ > 0) : (∀x ∈ B

(a, δ))(¦ f (x) − ¦ < ).
6
The choice of not using the common notation becomes clear when you realize that you could
as well write
lim
ℵ→5
f (ℵ) = 15 or lim
ξ→5
f (ξ) = 15.
7
The game is “you give me and I give you δ in return”.
Functions 53
a
l
a
l
! !
"
Player A: This is my !.
Player B: No problem. Here is my ".
Since Player B can find a " for every choice of ! made by Player A,it fo!ows tha"
the limit of f at a is l.
Example: Consider the square function f : R → R, f : x ·→ x
2
. We want to show
that
lim
3
f = 9.
By definition, we need to show that for every > 0 we can find a δ > 0, such that
¦ f (x) − 9¦ < whenever x ∈ B

(3, δ).
Thus, the game it to respond with an appropriate δ for every .
To find the appropriate δ, we examine the condition that needs to be satisfied:
¦ f (x) − 9¦ = ¦x − 3¦ ¦x + 3¦ < .
If we impose that ¦x − 3¦ < δ then
¦ f (x) − 9¦ < ¦x + 3¦ δ.
If the ¦x +3¦ wasn’t there we would have set δ = and we would be done. Instead,
we note that by the triangle inequality,
¦x + 3¦ = ¦x − 3 + 6¦ ≤ ¦x − 3¦ + 6,
so that for all ¦x − 3¦ < δ,
¦ f (x) − 9¦ ≤ ¦x − 3¦ (¦x − 3¦ + 6) < (6 + δ)δ.
54 Chapter 2
Now recall: given we have the freedom to choose δ > 0 such to make the right-
hand side less or equal than . We may freely require for example that d ≤ 1, in
which case
¦ f (x) − 9¦ < (6 + δ)δ ≤ 7δ.
If we furthermore take δ ≤ /7, then we reach the desired goal.
To summarize, given > 0 we take δ = min(1, /7), in which case (∀x : 0 <
¦x − 3¦ < δ),
¦ f (x) − 9¦ = ¦x − 3¦ ¦x + 3¦ ≤ ¦x − 3¦ (¦x − 3¦ + 6) < (6 + δ)δ ≤ (6 + 1)

7
= .
This proves (by definition) that lim
3
f = 9.
We can show, in general, that
lim
a
f = a
2
for all a ∈ R. [do it!]
(13 hrs, 2010)
Example: The next example is the function f : (0, ∞), f : x ·→ 1/x. We are going
to show that for every a > 0,
lim
a
f =
1
a
.
First fix a; it is not a variable. Let > 0 be given. We first observe that
¸
¸
¸
¸
¸
f (x) −
1
a
¸
¸
¸
¸
¸
=
¸
¸
¸
¸
¸
1
x

1
a
¸
¸
¸
¸
¸
=
¦x − a¦
ax
.
We need to be careful that the domain of x does not include zero. We start by
requiring that ¦x − a¦ < a/2, which at once implies that x > a/2, hence
¸
¸
¸
¸
¸
f (x) −
1
a
¸
¸
¸
¸
¸
<
2¦x − a¦
a
2
.
If we further require that ¦x − a¦ < a
2
/2, then ¦ f (x) − 1/a¦ < . To conclude,
¸
¸
¸
¸
¸
f (x) −
1
a
¸
¸
¸
¸
¸
< whenever x ∈ B

(a, δ),
for δ = min(a/2, a
2
/2).
(14 hrs, 2011)
Functions 55
Example: Consider next the Dirichlet function, f : R → R,
f : x ·→
_
¸
¸
_
¸
¸
_
0 x is irrational
1 x is rational.
We will show that f does not have a limit at zero. To show that for all ,
lim
0
f ,
we need to show the negation of
(∃ ∈ R) : (∀ > 0)(∃δ > 0) : (∀x ∈ B

(0, δ))( f (x) ∈ B(, )),
i.e.,
(∀ ∈ R)(∃ > 0) : (∀δ > 0)(∃x ∈ B

(0, δ)) : (¦ f (x) − ¦ ≥ ).
Indeed, take = 1/4. Then no matter what δ is, there exist points 0 < ¦x¦ < δ for
which f (x) = 0 and points for which f (x) = 1, i.e.,
(∀δ > 0)(∃x, y ∈ B

(0, δ)) : ( f (x) = 0, f (y) = 1).
No can satisfy both
¦0 − ¦ <
1
4
and ¦1 − ¦ <
1
4
.
I.e.,
(∀ ∈ R, ∀δ > 0)(∃x ∈ B

(0, δ)) : ( f (x) B(,
1
4
)).

(12 hrs, 2009)
Comment: It is important to stress what is the negation that the limit f at a is :
There exists an > 0, such that for all δ > 0, there exists an x, which
satisfies 0 < ¦x − a¦ < δ, but not ¦ f (x) − ¦ < .
Example: Consider, in contrast, the function f : R → R,
f : x ·→
_
¸
¸
_
¸
¸
_
0 x is irrational
x x is rational.
56 Chapter 2
Here we show that
lim
0
f = 0.
Indeed, let > 0 be given, then by taking δ = ,
x ∈ B

(0, δ) implies ¦ f (x) − 0¦ < .

Having a formal definition of a limit, and having seen a number of example, we
are in measure to prove general theorems about limits. The first theorem states
that a limit, if it exists, is unique.
Theorem 2.1 (Uniqueness of the limit) A function f : A → B has at most one
limit at any interior point a ∈ A.
Proof : Suppose, by contradiction, that
lim
a
f = and lim
a
f = m,
with m. By definition, there exists a δ
1
> 0 such that
¦ f (x) − ¦ <
1
4
¦ − m¦ whenever x ∈ B

(a, δ
1
),
and there exists a δ
2
> 0 such that
¦ f (x) − m¦ <
1
4
¦ − m¦ whenever x ∈ B

(a, δ
2
).
Take δ = min(δ
1
, δ
2
). Then, whenever x ∈ B

(a, δ),
¦ − m¦ = ¦( f (x) − m) − ( f (x) − )¦ ≤ ¦ f (x) − ¦ + ¦ f (x) − m¦ <
1
2
¦ − m¦,
which is a contradiction.
Functions 57
a
l
!
"
Player A: This is my !.
Player B: I give up. No " fits.
Since Player B cannot find a " for a particular choice of ! made by Player A,it fo!ows tha"
the limit of f at a cannot be both l and #.
#
!
a
l
!
#
!
B
Exercise 2.7 In each of the following items, determine whether the limit ex-
ists, and if it does calculate its value. You have to calculate the limit directly from
its definition (without using, for example, limits arithmetic—see below).
' lim
2
f , where f (x) = 7x.
^ lim
0
f , where f (x) = x
2
+ 3.
· lim
0
f , where f (x) = 1/x.
´ lim
a
f , where
f (x) =
_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
_
1 x > 0
0 x = 0
−1 x < 0,
and a ∈ R.
I lim
a
f , where f (x) = ¦x¦.
I lim
a
f , where f (x) =

¦x¦.
Exercise 2.8 Let I be an open interval in R, with a ∈ I. Let f , g : I \ ¦a¦ → R
(i.e., defined on a punctured neighborhood of a). Let J ⊂ I be an open interval
containing the point a, such that f (x) = g(x) for all x ∈ J \ ¦a¦. Show that
lim
a
f exists if and only if lim
a
g exists, and if they do exist they are equal. (This
exercise shows that the limit of a function at a point only depends on its properties
in a neighborhood of that point).
58 Chapter 2
Exercise 2.9 The Riemann function R : R → R is defined as
R(x) =
_
¸
¸
_
¸
¸
_
1/q x = p/q ∈ Q
0 x Q,
where the representation p/q is in reduced form (for example R(4/6) = R(2/3) =
1/3). Note that R(0) = R(0/1) = 1. Show that lim
a
R exists everywhere and is
equal to zero.
We have seen above various examples of limits. In each case, we had to “work
hard” to prove what the limit was, by showing that the definition was satisfied.
This becomes impractical when the functions are more complex. Thus, we need
to develop theorems that will make our task easier. But first, a lemma:
Lemma 2.1 Let , m ∈ R. Then,
'
¦x − ¦ <

2
and ¦y − m¦ <

2
imply ¦(x + y) − ( + m)¦ < .
^
¦x − ¦ < min
_
1,

2(¦m¦ + 1)
_
and ¦y − m¦ < min
_
1,

2(¦¦ + 1)
_
imply
¦xy − m¦ < .
· If m 0 and
¦y − m¦ < min
_
¦m¦
2
,
¦m¦
2

2
_
,
then y 0 and
¸
¸
¸
¸
¸
1
y

1
m
¸
¸
¸
¸
¸
< .
(15 hrs, 2011)
Functions 59
Proof : The first item is obvious (the triangle inequality). Consider the second
item. If the two conditions are satisfied then
¦x¦ = ¦x − + ¦ < ¦x − ¦ + ¦¦ < ¦¦ + 1,
hence,
¦xy − m¦ = ¦x(y − m) + m(x − )¦ ≤ ¦x¦¦y − m¦ + ¦m¦¦x − ¦
<
¦¦ + 1
2(¦¦ + 1)
+
¦m¦
2(¦m¦ + 1)
< .
For the third item it is clear that ¦y − m¦ < ¦m¦/2 implies that ¦y¦ > ¦m¦/2, and in
particular, y 0. Then
¸
¸
¸
¸
¸
1
y

1
m
¸
¸
¸
¸
¸
=
¦y − m¦
¦y¦¦m¦
<
¦m¦
2

2¦m¦(¦m¦/2)
= .
B
(13 hrs, 2009)
(15 hrs, 2010)
Theorem 2.2 (Limits arithmetic) Let f and g be two functions defined in a neigh-
borhood of a, such that
lim
a
f = and lim
a
g = m.
Then,
lim
a
( f + g) = + m and lim
a
( f g) = m.
If, moreover, 0, then
lim
a
_
1
f
_
=
1

.
Proof : This is an immediate consequence of the above lemma. Since, given > 0,
we know that there exists a δ > 0, such that
8
¦ f (x) − ¦ <

2
and ¦g(x) − m¦ <

2
whenever x ∈ B

(a, δ).
8
This δ > 0 is already the smallest of the two δ’s corresponding to each of the functions.
60 Chapter 2
It follows that
¦( f (x) + g(x)) − ( + m)¦ < whenever x ∈ B

(a, δ).
There also exists a δ > 0, such that
¦ f (x) − ¦ < min
_
1,

2(¦m¦ + 1)
_
and ¦g(x) − m¦ < min
_
1,

2(¦¦ + 1)
_
whenever x ∈ B

(a, δ). It follows that
¦ f (x)g(x) − m¦ < whenever x ∈ B

(a, δ).
Finally, there exists a δ > 0, such that
¦ f (x) − ¦ < min
_
¦¦
2
,
¦¦
2

2
_
whenever x ∈ B

(a, δ).
It follows that
¸
¸
¸
¸
¸
1
f (x)

1

¸
¸
¸
¸
¸
< whenever x ∈ B

(a, δ).
B
Comment: It is important to point out that the fact that f + g has a limit at a does
not imply that either f or g has a limit at that point; take for example the functions
f (x) = 5 + 1/x and g(x) = 6 − 1/x at x = 0.
Example: What does it take now to show that
9
lim
x→a
x
3
+ 7x
5
x
2
+ 1
=
a
3
+ 7a
5
a
2
+ 1
.
We need to show that for all c ∈ R
lim
a
c = c,
and that for all k ∈ N,
lim
x→a
x
k
= a
k
,
which follows by induction once we show that for k = 1.
9
This is the standard notation to what we write in these notes as
f : ξ ·→
ξ
3
+ 7ξ
5
ξ
2
+ 1
and lim
a
f =
a
3
+ 7a
5
a
2
+ 1
.
Functions 61
Exercise 2.10 Show that
lim
x→a
x
n
= a
n
, a ∈ R, n ∈ N,
directly from the definition of the limit, i.e., without using limits arithmetic.
Exercise 2.11 Let f , g be defined in a punctured neighborhood of a point a.
Prove or disprove:
' If lim
a
f and lim
a
g do not exist, then lim
a
( f + g) does not exist either.
^ If lim
a
f and lim
a
g do not exist, then lim
a
( f g) does not exist either
· If both lim
a
f and lim
a
( f + g) exist, then lim
a
g also exists.
´ If both lim
a
f and lim
a
( f g) exist, then lim
a
g also exists.
I If lim
a
f exists and lim
a
( f +g) does not exist, then lim
a
( f +g) does no exist.
I If g is bounded in a punctured neighborhood of a and lim
a
f = 0, then
lim
a
( f g) = 0.
We conclude this section by extending the notion of a limit to that of a one-sided
limit.
Definition 2.5 Let f : A → B with a ∈ A an interior point. We say that the limit
on the right (ימימ לובג) of f at a is , if for every > 0, there exists a δ > 0, such
that
¦ f (x) − ¦ < whenever x ∈ B

+
(a, δ).
We write,
lim
a
+
f = .
An analogous definition is given for the limit on the left (לאמשמ לובג).
62 Chapter 2
a
l
a
l
! !
"
Player A: This is my !.
Player B: No problem. Here is my ".
Since Player B can find a " for every choice of ! made by Player A,it fo!ows tha"
the right#limit of f at a is l.
Exercise 2.12 We denote by f : x ·→ [x| the lower integer value of x.
' Draw the graph of this function.
^ Find
lim
a

f and lim
a
+
f
for a ∈ R. Prove it based on the definition of one-sided limits.
Exercise 2.13 Show that f : (0, ∞) → R, f : x ·→

x satisfies
lim
a
f =

a,
for all a > 0.
2.4 Limits and order
In this section we will prove a number of properties pertinent to limits and order.
First a definition:
Definition 2.6 Let f : A → B and a ∈ A an interior point. We say that f is
locally bounded (תימוקמ המוסח) near a if there exists a punctured neighborhood
U of a, such that the set
¦ f (x) : x ∈ U¦
Functions 63
is bounded. Equivalently, there exists a δ > 0 such that the set
¦ f (x) : x ∈ B

(a, δ)¦
is bounded.
Comment: As in the previous section, we consider punctured neighborhoods of a
point, and we don’t even care if the function is defined at the point.
Proposition 2.1 Let f and g be two functions defined in a punctured neighbor-
hood U of a point a. Suppose that
f (x) < g(x) ∀x ∈ U,
and that
lim
a
f = and lim
a
g = m.
Then ≤ m.
Comment: The very fact that f and g have limits at a is an assumption.
Comment: Note that even though f (x) < g(x) is a strict inequality, the resulting
inequality of the limits is in a weak sense. To see what this must be the case,
consider the example
f (x) = ¦x¦ and g(x) = 2¦x¦.
Even though f (x) < g(x) in an open neighborhood of 0,
lim
0
f = lim
0
g = 0.
Proof : Suppose, by contradiction, that > g and set =
1
2
( − m). Thus, there
exists a δ > 0 such that
(∀x ∈ B

(a, δ)) : (¦ f (x) − ¦ < and ¦g(x) − m¦ < ).
In particular, for every such x,
f (x) > − = −
− m
2
= m +
− m
2
= m + > g(x),
which is a contradiction. B
64 Chapter 2
Proposition 2.2 Let f and g be two functions defined in a punctured neighbor-
hood U of a point a. Suppose that
lim
a
f = and lim
a
g = m,
with < m. Then there exists a δ > 0 such that
(∀x ∈ B

(a, δ)) : ( f (x) < g(x)).
Comment: This time both inequalities are strong.
Proof : Let =
1
2
(m − ). There exists a δ > 0 such that
(∀x ∈ B

(a, δ)) : (¦ f (x) − ¦ < and ¦g(x) − m¦ < ).
For every such x,
f (x) < + < + + (g(x) − m + ) = g(x).
B
Proposition 2.3 Let f , g, h be defined in a punctured neighborhood U of a. Sup-
pose that
(∀x ∈ U) : ( f (x) ≤ g(x) ≤ h(x)),
and
lim
a
f = lim
a
h = .
Then
lim
a
g = .
Proof : It is given that for every > 0 there exists a δ > 0 such that
(∀x ∈ B

(a, δ)) : ( − < f (x) and h(x) < + ).
Functions 65
Then for every such x,
− < f (x) ≤ g(x) ≤ h(x) < + ,
i.e.,
(∀x ∈ B

(a, δ)) : (¦g(x) − ¦ < ).
B
(17 hrs, 2011)
Proposition 2.4 Let f be defined in a punctured neighborhood U of a. If the limit
lim
a
f =
exists, then f is locally bounded near a.
Proof : Immediate from the definition of the limit. B
Proposition 2.5 Let f be defined in a punctured neighborhood of a point a. Then,
lim
a
f = if and only if lim
a
( f − ) = 0.
Proof : Very easy. B
Proposition 2.6 Let f and g be defined in a punctured neighborhood of a point
a. Suppose that
lim
a
f = 0
whereas g is locally bounded near a. Then,
lim
a
( f g) = 0.
Proof : Very easy. B
66 Chapter 2
Comment: Note that we do not require g to have a limit at a.
With the above tools, here is another way of proving the product property of the
arithmetic of limits (this time without and δ).
Proposition 2.7 (Arithmetic of limits, product) Let f , g be defined in a punc-
tured neighborhood of a point a, and
lim
a
f = and lim
a
g = m.
Then
lim
a
( f g) = m.
Proof : Write
f g − m = ( f − )
.,,.
limit zero
g
.,,.
loc. bdd
.,,.
limit zero
− (g − m)
.,,.
limit zero
.,,.
limit zero
.
B
(17 hrs, 2010)
2.5 Continuity
Definition 2.7 A function f : A → R is said to be continuous at an inner point
a ∈ A, if it has a limit at a, and
lim
a
f = f (a).
Comment: If f has a limit at a and the limit differs from f (a) (or that f (a) is
undefined) , then we say that f has a removable discontinuity (הקילס תופיצר יא)
at a.
Examples:
' We saw that the function f (x) = x
2
has a limit at 3, and that this limit was
equal 9. Hence, f is continuous at x = 3.
Functions 67
^ We saw that the function f (x) = 1/x has a limit at any a > 0 and that this
limit equals 1/a. By a similar calculation we could have shown that it has a
limit at any a < 0 and that this limit also equals 1/a. Hence, f is continuous
at all x 0. Since f is not even defined at x = 0, then it is not continuous at
that point.
· The function f (x) = x sin 1/x (it was seen in the tutoring session) is contin-
uous at all x 0. At zero it is not defined, but if we define f (0) = 0, then it
is continuous for all x ∈ R (by the bounded times limit zero argument).
´ The function
f : x ·→
_
¸
¸
_
¸
¸
_
x x ∈ Q
0 x Q,
is continuous at x = 0, however it is not continuous at any other point,
because the limit of f at a 0 does not exist.
I The elementary functions sin and cos are continuous everywhere. To prove
that sin is continuous, we need to prove that for every > 0 there exists a
δ > 0 such that
¦ sin x − sin a¦ < whenever x ∈ B

(a, δ).
We will take it for granted at the moment.
Our theorems on limit arithmetic imply right away similar theorems for continu-
ity:
Theorem 2.3 If f : A → R and g : A → R are continuous at a ∈ A then f +g and
f g are continuous at a. Moreover, if f (a) 0 then 1/ f is continuous at a.
Proof : Obvious. B
Comment: Recall that in the definition of the limit, we said that the limit of f at a
is if for every > 0 there exists a δ > 0 such that
¦ f (x) − ¦ < whenever x ∈ B

(a, δ).
There was an emphasis on the fact that the point x itself is excluded. Continuity
is defined as that for every > 0 there exists a δ > 0 such that
¦ f (x) − f (a)¦ < whenever x ∈ B

(a, δ).
68 Chapter 2
Note that we may require that this be true whenever x ∈ B(a, δ). There is no need
to exclude the point x = a.
(19 hrs, 2010)
With that, we have all the tools to show that a function like, say,
f (x) =
sin
2
x + x
2
+ x
4
sin x
1 + sin
2
x
is continuous everywhere in R. But what about a function like sin x
2
. Do we have
the tools to show that it is continuous. No. We don’t yet have a theorem for the
composition of continuous functions.
Theorem 2.4 Let g : A → B and f : B → C. Suppose that g is continuous at an
inner point a ∈ A, and that f is continuous at g(a) ∈ B, which is an inner point.
Then, f ◦ g is continuous at a.
Proof : Let > 0. Since f is continuous at g(a), then there exists a δ
1
> 0, such
that
¦ f (y) − f (g(a))¦ < whenever y ∈ B(g(a), δ
1
).
Since g is continuous at a, then there also exists a δ
2
> 0, such that
¦g(x) − g(a)¦ < δ
1
or g(x) ∈ B(g(a), δ
1
) whenever x ∈ B(a, δ
2
).
Combining the two we get that
¦ f (g(x)) − f (g(a))¦ < whenever B(a, δ
2
).
B
Definition 2.8 A function f is said to be right-continuous (ימימ הפיצר) at a if
lim
a
+
f = f (a).
A similar definition is given for left-continuity (לאמשמ הפיצר).
So far, we have only dealt with continuity at points. Usually, we are interested in
continuity on intervals.
Functions 69
Definition 2.9 A function f is said to be continuous on an open interval (a, b) if
it is continuous at all x ∈ (a, b). It is said to be continuous on a closed interval
[a, b] if it is continuous on the open interval, and in addition it is right-continuous
at a and left-continuous at b.
Comment: We usually think of continuous function as “well-behaved”. One
should be careful with such interpretations; see for example
f (x) =
_
¸
¸
_
¸
¸
_
x sin
1
x
x 0
0 x = 0.
Exercise 2.14 Suppose f : R → R satisfies f (a + b) = f (a) + f (b) for all
a, b ∈ R. )We have already seen that this implies that f (x) = cx, for some constant
c for all x ∈ Q.) Suppose that in addition f is continuous at zero. Prove that f is
continuous everywhere.
Exercise 2.15 ' Prove that if f : A → B is continuous in A, then so is ¦ f ¦
(where ¦ f ¦(x)
def
= ¦ f (x)¦).
^ Prove that if f : A → R and g : A → R are continuous then so are max( f , g)
and min( f , g).
Exercise 2.16 Suppose that g, h : R → R are continuous at a and satisfy
g(a) = h(a). We define a new function that “glues” g and h at a:
f (x) =
_
¸
¸
_
¸
¸
_
g(x) x ≤ a
h(x) x > a.
Prove that f is continuous at a.
Exercise 2.17 Here is another “gluing” problem: Suppose that g is continuous
in [a, b], that h is continuous is [b, c], and g(b) = h(b). Define
f (x) =
_
¸
¸
_
¸
¸
_
g(x) x ∈ [a, b]
h(x) x ∈ [b, c].
Prove that f is continuous in [a, c].
70 Chapter 2
Example: Here is another “crazy function” due to Johannes Karl Thomae (1840-
1921):
r(x) =
_
¸
¸
_
¸
¸
_
1/q x = p/q
0 x Q,
where x = p/q assumes that x is rational in reduced form. This function has the
wonderful property of being continuous at all x Qand discontinuous at all x ∈ Q
(this is because its limit is everywhere zero). It took some more time until Vito
Volterra proved in 1881 that there can be no function that is continuous on Q and
discontinuous on R \ Q.
2.6 Theorems about continuous functions
Theorem 2.5 Suppose that f is continuous at a and f (a) > 0. Then there exists a
neighborhood of a in which f (x) > 0. That is, there exists a δ > 0 such that
f (x) > 0 whenever x ∈ B(a, δ).
Comment: An analogous theorem holds if f (a) < 0. Also, a one-sided version
can be proved, whereby if f is right-continuous at a and f (a) > 0, then
f (x) > 0 whenever x ∈ B
+
(a, δ).
Comment: Note that continuity is only required at the point a.
Proof : Since f is continuous at a, there exists a δ > 0 such that
¦ f (x) − f (a)¦ <
f (a)
2
whenever x ∈ B(a, δ).
But, ¦ f (x) − f (a)¦ <
f (a)
2
implies
f (a) − f (x) ≤ ¦ f (a) − f (x)¦ <
f (a)
2
,
Functions 71
i.e., f (x) > f (a)/2 > 0.
a
f!a"
1/2 f!a"
3/2 f!a"
!
B
This theorem can be viewed as a lemma for the following important theorem:
(15 hrs, 2009)
(20 hrs, 2010)
Theorem 2.6 (Intermediate value theorem ייניבה רע טפשמ) Suppose f is
continuous on an interval [a, b], with f (a) < 0 and f (b) > 0. Then, there ex-
ists a point c ∈ (a, b), such that f (c) = 0.
(19 hrs, 2011)
Comment: Continuity is required on the whole intervals, for consider f : [0, 1] →
R:
f (x) =
_
¸
¸
_
¸
¸
_
−1 0 ≤ x <
1
2
+1
1
2
≤ x ≤ 1.
Comment: If f (a) > 0 and f (b) < 0 the same conclusion holds for just replace f
by (−f ).
Proof : Consider the set
A = ¦x ∈ [a, b] : f (y) < 0 for all y ∈ [a, x]¦.
72 Chapter 2
The set is non empty for it contains a. Actually, the previous theorems guarantees
the existence of a δ > 0 such that [a, a+δ) ⊆ A. This set is also bounded by b. The
previous theorem guarantees even the existence of a δ > 0 such that b − δ is an
upper bound for A. By the axiom of completeness, there exists a number c ∈ (a, b)
such that
c = sup A.
f!a"
b c
f!b"
a
A
Suppose it were true that f (c) < 0. By the previous theorem, there exists a δ > 0
such that
f (x) < 0 whenever c ≤ x ≤ c + δ,
Because c is the least upper bound of a it follows that also
f (x) < 0 whenever x < c,
i.e., f (x) < 0 for all x ∈ [a, c + δ], which means that c + δ ∈ A, contradicting the
fact that c is an upper bound for A.
Suppose it were true that f (c) > 0. By the previous theorem, there exists a δ > 0
such that
f (x) > 0 whenever c − δ ≤ x ≤ c,
which implies that c − δ is an upper bound for A, i.e., c cannot be the least upper
bound of A. By the trichotomy property, we conclude that f (c) = 0. B
Corollary 2.1 If f is continuous on a closed interval [a, b], such that
f (a) < α < f (b),
then there exists a point c ∈ (a, b) at which f (c) = α.
Functions 73
Proof : Apply the previous theorem for g(x) = f (x) − α. B
Exercise 2.18 Suppose that f is continuous on [a, b] with f (x) ∈ Q for all
x ∈ [a, b]. What can be said about f ?
Exercise 2.19 (i) Suppose that f , g : [a, b] → R are continuous such that
f (a) < g(a) and f (b) > g(b). Show that there exists a point c ∈ [a, b] such that
f (c) = g(c). (ii) A fly flies from Tel-Aviv to Or Akiva (along the shortest path).
He starts its way at 10:00 and finishes at 16:00. The following day are makes his
way back along the very same path, starting again at 10:00 and finishing at 16:00.
Show that there will be a point along the way that he will cross a the exact same
time on both days.
Exercise 2.20 (i) Let f : [0, 1] → R be continuous, f (0) = f (1). Show
that there exists an x
0
∈ [0,
1
2
], such that f (x
0
) = f (x
0
+
1
2
). (Hint: consider
g(x) = f (x +
1
2
) − f (x).) (ii) Show that there exists an x
0
∈ [0, 1 −
1
n
], such that
f (x
0
) = f (x
0
+
1
n
).
Lemma 2.2 If f is continuous at a, then there is a δ > 0 such that f has an upper
bound on the interval (a − δ, a + δ).
Proof : This is obvious, for by the definition of continuity, there exists a δ > 0,
such that
f (x) ∈ B( f (a), 1) whenever x ∈ B(a, δ),
i.e., f (x) < f (a) + 1 on this interval. B
Theorem 2.7 (Karl Theodor Wilhelm Weierstraß) If f is continuous on [a, b]
then it is bounded from above of that interval, that is, there exists a number M,
such that f (x) < M for all x ∈ [a, b].
Comment: Here too, continuity is needed on the whole interval, for look at the
“counter example” f : [−1, 1] → R,
f (x) =
_
¸
¸
_
¸
¸
_
1/x x 0
0 x = 0.
74 Chapter 2
Comment: It is crucial that the interval [a, b] be closed. The function f : (0, 1] →
R, f : x ·→ 1/x is continuous on the semi-open interval, but it is not bounded from
above.
Proof : Let
A = ¦x ∈ [a, b] : f is bounded from above on [a, x]¦.
By the previous lemma, there exists a δ > 0 such that a + δ ∈ A. Also A is upper
bounded by b, hence there exists a c ∈ (a, b], such that
c = sup A.
We claim that c = b, for if c < b, then by the previous lemma, there exists a
neighborhood of c in which f is upper bounded, and c cannot be an upper bound
for A. B
Comment: In essence, this theorem is based on the fact that if a continuous func-
tion is bounded up to a point, then it is bounded up to a little farther. To be able to
take such increments up to b we need the axiom of completeness.
(17 hrs, 2009)
(22 hrs, 2010)
(20 hrs, 2011)
Theorem 2.8 (Weierstraß, Maximum principle ומיסקמה ורקע) If f is con-
tinuous on [a, b], then there exists a point c ∈ [a, b] such that
f (x) ≤ f (c) for all x ∈ [a, b].
Comment: Of course, there is a corresponding minimum principle.
Proof : We have just proved that f is upper bounded on [a, b], i.e., the set
A = ¦ f (x) : a ≤ x ≤ b¦
Functions 75
is upper bounded. This set is non-empty for it contains the point f (a). By the
axiom of completeness it has a least upper bound, which we denote by
α = sup A.
We need to show that this supremum is in fact a maximum; that there exists a
point c ∈ [a, b], for which f (c) = α.
Suppose, by contradiction, that this were not the case, i.e., that f (x) < α for all
x ∈ [a, b]. We define then a new function g : [a, b] → R,
g =
1
α − f
.
This function is defined everywhere on [a, b] (since we assumed that the denom-
inator does not vanish), it is continuous and positive. We will show that g is not
upper bounded on [a, b], contradicting thus the previous theorem. Indeed, since
α = sup A:
For all M > 0 there exists a y ∈ [a, b] such that f (y) > α − 1/M.
For this y,
g(y) >
1
α − (α − 1/M)
= M,
i.e., for every M > 0 there exists a point in [a, b] at which f takes a value greater
than M. B
Comment: Here too we needed continuity on a closed interval, for consider the
“counter examples”
f : [0, 1) → R f (x) = x
2
,
and
f : [0, 1] → R f (x) =
_
¸
¸
_
¸
¸
_
x
2
x < 1
0 x = 1
.
These functions do not attain a maximum in [0, 1].
Exercise 2.21 Let f : [a, b] → R be continuous, with f (a) = f (b) = 0 and
M = sup¦ f (x) : a ≤ x ≤ b¦ > 0.
Prove that for every 0 < c < M, the set ¦x ∈ [a, b] : f (x) = c¦ has at least two
elements.
76 Chapter 2
Exercise 2.22 Let f : [a, b] → R be continuous. Prove or disprove each of
the following statements:
' It is possible that f (x) > 0 for all x ∈ [a, b], and that for every > 0 there
exists an x ∈ [a, b] for which f (x) < .
^ There exists an x
0
∈ [a, b] such that for every [a, b] ÷ x x
0
, f (x) < f (x
0
).
Theorem 2.9 If n ∈ N is odd, then the equation
f (x) = x
n
+ a
n−1
x
n−1
+ . . . a
1
x + a
0
= 0
has a root (a solution) for any set of constants a
0
, . . . , a
n−1
.
Proof : The idea it to show that existence of points a, b for which f (b) > 0 and
f (a) < 0, and apply the intermediate value theorem, based on the fact that poly-
nomials are continuous. The only technical issue is to find such points a, b in a
way that works for all choices of a
0
, . . . , a
n−1
.
Let
M = max(1, 2n¦a
0
¦, . . . , 2n¦a
n−1
¦).
Then for ¦x¦ > M,
f (x)
x
n
= 1 +
a
n−1
x
+ +
a
0
x
n
(u + v ≥ u − ¦v¦) ≥ 1 −
¦a
n−1
¦
¦x¦
− −
¦a
0
¦
¦x
n
¦
(¦x
n
¦ > ¦x¦) ≥ 1 −
¦a
n−1
¦
¦x¦
− −
¦a
0
¦
¦x¦
(¦x¦ > ¦a
i
¦) ≥ 1 −
¦a
n−1
¦
2n¦a
n−1
¦
− −
¦a
0
¦
2n¦a
0
¦
= 1 − n
1
2n
> 0.
Since the sign of x
n
is the same as the sign of x, it follows that f (x) is positive for
x ≥ M and negative for x ≤ −M, hence there exists a a < c < b such that f (c) = 0.
B
(18 hrs, 2009)
The next theorem deals with the case where n is even:
Functions 77
Theorem 2.10 Let n be even. Then the function
f (x) = x
n
+ a
n−1
x
n−1
+ . . . a
1
x + a
0
= 0
has a minimum. Namely, there exists a c ∈ R, such that f (c) ≤ f (x) for all x ∈ R.
Proof : The idea is simple. We are going to show that there exists a b > 0 for
which f (x) > f (0) for all ¦x¦ ≥ b. Then, looking at the function f in the interval
[−b, b], we know that it assumes a minimum in this interval, i.e., that there exists
a c ∈ [−b, b] such that f (c) ≤ f (x) for all x ∈ [−b, b]. In particular, f (c) ≤ f (0),
which in turn is less than f (x) for all x [−b, b]. It follows that f (c) ≤ f (x) for
all x ∈ R.
It remains to find such a b. Let M be defined as in the previous theorem. Then,
for all ¦x¦ > M, using the fact that n is even,
f (x) = x
n
_
1 +
a
n−1
x
+ +
a
0
x
n
_
≥ x
n
_
1 −
¦a
n−1
¦
¦x¦
− −
¦a
0
¦
¦x
n
¦
_
≥ x
n
_
1 −
¦a
n−1
¦
¦x¦
− −
¦a
0
¦
¦x¦
_
≥ x
n
_
1 −
¦a
n−1
¦
2n¦a
n−1
¦
− −
¦a
0
¦
2n¦a
0
¦
_
=
1
2
x
n
.
Let then b > max(M,
n
_
2¦ f (0)¦), from which follows that f (x) > f (0) for ¦x¦ > b.
This concludes the proof. B
Corollary 2.2 Consider the equation
f (x) = x
n
+ a
n−1
x
n−1
+ . . . a
1
x + a
0
= α
with n even. Then there exists a number m such that this equation has a solution
for all α ≥ m but has no solution for α < m.
78 Chapter 2
Proof : According to the previous theorem there exists a c ∈ R such that f (c) is
the minimum of f . If α < f (c) then for all x, f (x) ≥ f (c) > α, i.e., there is no
solution. For α = f (c), c is a solution. For α > f (c) we have f (c) < α and for
large enough x, f (x) > α, hence, by the intermediate value theorem a root exists.
B
2.7 Infinite limits and limits at infinity
In this section we extend the notions of limits discussed in previous sections to
two cases: (i) the limit of a function at a point is infinite, and (ii) the limit of a
function at infinity. Before we start recall: infinity is not a real number!.
Definition 2.10 Let f : A → B be a function. We say that the limit of f at a is
infinity, denoted
lim
a
f = ∞,
if for every M there exists a δ > 0, such that
f (x) > M whenever 0 < ¦x − a¦ < δ.
Similarly,
lim
a
f = −∞,
if for every M there exists a δ > 0, such that
f (x) < M whenever 0 < ¦x − a¦ < δ.
Example: Consider the function
f : x ·→
_
¸
¸
_
¸
¸
_
1/¦x¦ x 0
17 x = 0.
The limit of f at zero is infinity. Indeed given M we take δ = 1/M, then
0 < ¦x¦ < δ implies f (x) > M.

Functions 79
Definition 2.11 Let f : R → R. We say that
lim

f = ,
if for every > 0 there exists an M, such that
¦ f (x) − ¦ < whenever x > M.
Similarly,
lim
−∞
f = ,
if for every > 0 there exists an M, such that
¦ f (x) − ¦ < whenever x < M.
Example: Consider the function f : x ·→ 3 + 1/x
2
. Then the limit of f at infinity
is 3, since given > 0 we take M = 1/

, and
x > M implies ¦ f (x) − 3¦ < .

Comment: These definitions are in full agreement with all previous definitions of
limits, if we adopt the idea that “neighborhoods of infinity” are sets of the form
(M, ∞).
Exercise 2.23 Prove directly from the definition that
lim
x→3
+
x
3
+ 2x + 1
x − 3
= ∞.
Exercise 2.24 Calculate the following limit using limit arithmetic,
lim
x→∞
_ √
x
2
+ 2x − x
_
.
Exercise 2.25 Prove that
lim
x→∞
f (x) exists if and only if lim
x→0
+
f (1/x) exist,
in which case they are equal.
80 Chapter 2
Figure 2.1: Infinite limit and limit at infinity.
Exercise 2.26 Let g be defined in a punctured neighborhood of a ∈ R, with
lim
a
g = ∞. Show that lim
a
1/g = 0.
Exercise 2.27
' Let f , g be defined in a punctured neighborhood of a ∈ R, with lim
a
f =
lim
a
g = −∞. Show that lim
a
( f g) = ∞.
^ Let f , g be defined in a punctured neighborhood of a ∈ R, with lim
a
f =
K < 0 and lim
a
g = −∞. Show that lim
a
( f /g) = 0.
(24 hrs, 2010)
2.8 Inverse functions
Suppose that f : A → B is one-to-one and onto. This means that for every b ∈ B
there exists a unique a ∈ A, such that f (a) = b. This property defines a function
from B to A. In fact, this function is also one-to-one and onto. We call this
function the function inverse to f (תיכפוה היצקנופ), and denote it by f
−1
(not to
be mistaken with 1/ f ). Thus, f
−1
: B → A,
f
−1
(y) = ¦x ∈ A : f (x) = y¦.
Functions 81
10
Note also that f
−1
◦ f is the identity in A whereas f ◦ f
−1
is the identity in B.
Comment: Recall that a function f : A → B can be defined as a subset of A × B,
Graph( f ) = ¦(a, b) : f (a) = b¦.
The inverse function can be defined as a subset of B × A: it is all the pairs (b, a)
for which (a, b) ∈ Graph( f ), or
Graph( f
−1
) = ¦(b, a) : f (a) = b¦.
Example: Let f : [0, 1] → R be given by f : x ·→ x
2
. This function is one-to-
one and its image is the segment [0, 1]. Thus we can define an inverse function
f
−1
: [0, 1] → [0, 1] by
f
−1
(y) = ¦x ∈ [0, 1] : x
2
= y¦,
i.e., it is the (positive) square root.
(22 hrs, 2011)
Theorem 2.11 If f : I → R is continuous and one-to-one (I is an interval), then
f is either monotonically increasing, or monotonically decreasing.
Comment: Continuity is crucial. The function f : [0, 2] → R,
f (x) =
_
¸
¸
_
¸
¸
_
x 0 ≤ x ≤ 1
6 − x 1 < x ≤ 2
is one-to-one with image [0, 1] ∪ [4, 5), but it is not monotonic.
10
Strictly speaking, the right-hand side is a set of real numbers, and not a real number. However,
the one-to-one and onto properties of f guarantee that it contains one and only one number, hence
we identify this singleton (ודיחי) with the unique element it contains.
82 Chapter 2
a b c
f!b"
f!a"
f!c"
Figure 2.2: Illustration of proof
Proof : We first prove that for every three points a < b < c, either
f (a) < f (b) < f (c), or f (a) > f (b) > f (c).
Suppose that f (a) < f (c) and that, by contradiction, f (b) < f (a) (see Figure 2.2).
By the intermediate value theorem, there exists a point x ∈ (b, c) such that f (x) =
f (a), contradicting the fact that f is one-to-one. Similarly, if f (b) > f (c), then
there exists a point y ∈ [a, b] such that f (y) = f (c). Thus, f (a) < f (c) implies that
f (a) < f (b) < f (c). We proceed similarly if f (a) > f (c).
It follows as once that for every four points a < b < c < d, either
f (a) < f (b) < f (c) < f (d) or f (a) > f (b) > f (c) > f (d).
Fix now a and b, and suppose w.l.o.g that f (a) < f (b) (they can’t be equal). Then,
for every x, y such that x < y, we apply the above arguments to the four points
a, b, x, y (whatever their order is) and conclude that f (x) < f (y).
B
Suppose that the function f : [a, b] → R is one-to-one and monotonically in-
creasing (without loss of generality). Then, its image is the interval [ f (a), f (b)].
Indeed, f (a) and f (b) are in the range, and by the intermediate value theorem,
so is any point in this interval. By monotonicity, no point outside this interval
Functions 83
can be in the range of f . It follows that the inverse function has range and im-
age f
−1
: [ f (a), f (b)] → [a, b]. We have proved, en passant, that continuous
one-to-one functions map closed segments into closed segments
11
.
Theorem 2.12 If f : [a, b] → R is continuous and one-to-one then so is f
−1
.
Proof : Without loss of generality, let us assume that f is monotonically increas-
ing, i.e.,
x < y implies f (x) < f (y).
Then, so is f
−1
. We need to show that for every y
0
∈ ( f (a), f (b)),
lim
y
0
f
−1
= f
−1
(y
0
).
Equivalently, we need to show that for every > 0 there exists a δ > 0, such that
¦ f
−1
(y) − f
−1
(y
0
)¦ < whenever ¦y − y
0
¦ < δ.
Now, every such y
0
is equal to f (x
0
) for a unique x
0
. Since f (and f
−1
) are both
monotonically increasing, it is clear how to set δ. Let
y
1
= f (x
0
− ) and y
2
= f (x
0
+ ).
By monotonicity, y
1
< y
0
< y
2
. We let then δ = min(y
0
− y
1
, y
2
− y
0
). Every
y
0
− δ < y < y
0
+ δ is in the interval (y
1
, y
2
), and therefore,
f
−1
(y
0
) − < f
−1
(y) < f
−1
(y
0
) + .
We can do it more formally. Let
δ = min( f ( f
−1
(y
0
) + ) − y
0
, y
0
− f ( f
−1
(y
0
) − )).
Then, ¦y − y
0
¦ < δ implies
y
0
− δ < y < y
0
+ δ,
or,
f ( f
−1
(y
0
)−) = y
0
−[y
0
−f ( f
−1
(y
0
)−)] < y < y
0
+[ f ( f
−1
(y
0
)+)−y
0
] = f ( f
−1
(y
0
)+).
11
More generally, a continuous one-to-one function maps connected sets into connected sets,
and retains the open/closed properties.
84 Chapter 2
Since f
−1
is monotonically increasing, we may apply f
−1
on all terms, giving
f
−1
(y
0
) − < f
−1
(y) < f
−1
(y
0
) + ,
or ¦ f
−1
(y
0
) − f
−1
(y)¦ < .
f!x
0
"=y
0
f
#1
!y
0
"=x
0
f!x
0
+!"
f!x
0
#!"
!
"
B
(20 hrs, 2009)
(26 hrs, 2010)
2.9 Uniform continuity
Recall the definition of a continuous function: a function f : (a, b) → R is contin-
uous on the interval (a, b), if it is continuous at every point in the interval. That is,
for every x ∈ (a, b) and every > 0, there exists a δ < 0, such that
¦ f (y) − f (x)¦ < whenever ¦y − x¦ < δ and y ∈ (a, b).
In general, we expect δ to depend both on x and on . In fact, the way we set
it here, this definition applies for continuity on a closed interval as well (i.e, it
includes one-sided continuity as well).
Let us re-examine two examples:
Functions 85
Example: Consider the function f : (0, 1) → R, f : x ·→ x
2
. This function is
continuous on (0, 1). Why? Because for every x ∈ (0, 1), and every y ∈ (0, 1),
¦ f (y) − f (x)¦ = ¦y
2
− x
2
¦ = ¦y − x¦¦y + x¦ ≤ (x + 1)¦y − x¦.
Thus, given x and > 0, if we choose δ = δ(, x) = /(x + 1), then
¦ f (y) − f (x)¦ < whenever ¦x − y¦ < δ(, x) and y ∈ (0, 1).
Thus, δ depends both on and x. However, it is always legitimate to replace δ by
a smaller number. If we take δ = /2, then the same δ fits all points x.
Example: Consider next the function f : (0, 1) → R, f : x ·→ 1/x. This function
is also continuous on (0, 1). Why? Given x ∈ (0, 1) and y ∈ (0, 1),
¦ f (x) − f (y)¦ =
¦x − y¦
xy
=
¦x − y¦
x[x + (y − x)]
.
If we take δ(, x) = min(x/2, x
2
/2), then
¦x − y¦ < δ implies ¦ f (x) − f (y)¦ <
δ
x(x − δ)
< .
Here, given , the closer x is to zero, the smaller is δ(, x). There is no way we
can find a δ = δ() that would fit all x.
These two examples motivate the following definition:
Definition 2.12 f : A → B is said to be uniformly continuous on A (הפיצר
הווש הדימב) if for every > 0 corresponds a δ = δ() > 0, such that
¦ f (y) − f (x)¦ < whenever ¦y − x¦ < δ and x, y ∈ A.
Note that x and y play here symmetric roles
12
.
What is the essential difference between the two above examples? The func-
tion x ·→ x
2
could have been defined on the closed interval [0, 1] and it would
have been continuous there too. In contrast, there is no way we could have de-
fined the function x ·→ 1/x as a continuous function on [0, 1], even if we took
care of its value at zero. We have already seen examples where continuity on a
12
The term “uniform” means that the same number can be used for all points.
86 Chapter 2
closed interval had strong implications (ensures boundedness and the existence of
a maximum). This is also the case here. We will prove that continuity on a closed
interval implies uniform continuity.
We proceed by steps:
Lemma 2.3 Let f : [a, c] → R be continuous. Let a < b < c, and suppose that
for a given > 0 there exist δ
1
, δ
2
> 0, such that
x, y ∈ [a, b] and ¦x − y¦ < δ
1
implies ¦ f (x) − f (y)¦ <
x, y ∈ [b, c] and ¦x − y¦ < δ
2
implies ¦ f (x) − f (y)¦ < .
Then there exists a δ > 0 such that
x, y ∈ [a, c] and ¦x − y¦ < δ implies ¦ f (x) − f (y)¦ < .
Proof : We have to take care at the “gluing” at the point b. Since the function is
continuous at b, then there exists a δ
3
> 0 such that
¦ f (x) − f (b)¦ <

2
whenever ¦x − b¦ < δ
3
.
Hence,
¦x − b¦ < δ
3
and ¦y − b¦ < δ
3
implies ¦ f (x) − f (y)¦ < . (2.1)
Take now δ = min(δ
1
, δ
2
, δ
3
). We claim that
¦ f (x) − f (y)¦ < whenever ¦x − y¦ < δ and x, y ∈ [a, c].
Why is that true? One of three: either x, y ∈ [a, b], in which case it follows from
the fact that δ ≤ δ
1
, or x, y ∈ [b, c], in which case it follows from the fact that
δ ≤ δ
2
. Remains the third case, where, without loss of generality, x ∈ [a, b] and
y ∈ [b, c]. Since ¦x −y¦ < δ
3
, it follows that ¦x −b¦ < δ
3
and ¦y −b¦ < δ
3
and we use
(2.1). B
(24 hrs, 2011)
Functions 87
Theorem 2.13 If f is continuous on [a, b] then it is uniformly continuous on that
interval.
Proof : Given > 0, we will say that f is -good on [a, b] if there exists a δ > 0,
such that
¦ f (x) − f (y)¦ < whenever ¦x − y¦ < δ and x, y ∈ [a, b]
(i.e., if for that particular there exists a corresponding δ). We need to show that
f is -good on [a, b] for all > 0. Note that we proved in Lemma 2.3 that if f is
-good on [a, b] and on [b, c], then it is -good on [a, c].
Given , define
A = ¦x ∈ [a, b] : f is -good on [a, x]¦.
The set A is non-empty, as it includes the point a (trivial) and it is bounded by b.
Hence it has a supremum, which we denote by α. Suppose that α < b. Since f is
continuous at α, then there exists a δ
0
> 0 such that
¦ f (y) − f (α)¦ <

2
whenever ¦y − α¦ < δ
0
and y ∈ [a, b].
Thus,
∀y, z ∈ B(α, δ
0
), ¦ f (y) − f (z)¦ ≤ ¦ f (y) − f (α)¦ + ¦ f (z) − f (α)¦ < ,
fromwhich follows that f is -good in (α−δ
0
, α+δ
0
)∩[a, b] and hence in the closed
interval [α − δ
0
/2, α + δ
0
/2] ∩ [a, b]. By the definition of α, it is also -good on
[a, α−δ
0
/2]∩[a, b], and by the previous lemma, f is -good on [a, α+δ
0
/2]∩[a, b],
contradicting the fact that α is an upper bound for A. Thus f is -good on [a, b].
Since this argument holds independently of we conclude that f is -good in [a, b]
for all > 0. B
Comment: In this course we will make use of uniform continuity only once, when
we study integration.
(27 hrs, 2010)
Examples:
88 Chapter 2
' Consider the function f (x) = sin(1/x) defined on (0, 1). Even though it is
continuous, it is not uniformly continuous.
^ Consider the function Id : R → R. It is easy to see that it is uniformly
continuous. The function Id Id, however, is continuous on R, but not uni-
formly continuous. Thus, the product of uniformly continuous functions is
not necessarily uniformly continuous.
· Finally the function f (x) = sin x
2
is continuous and bounded on R, but it is
not uniformly continuous.
(22 hrs, 2009)
Exercise 2.28 Let f : R → R be continuous. Prove or disprove:
' If
lim

f = lim
−∞
f = 0,
then f attains a maximum value.
^ If lim

f = ∞ and lim
−∞
f = −∞, then f assumes all real values; that is f
is on R.
· If on any interval [n, n + 1], n ∈ Z, f is uniformly continuous, then f is
uniformly continuous on R.
Exercise 2.29 Suppose A ⊆ R is non-empty.
' Prove that if f , g are uniformly continuous on A then αf + βg is also uni-
formly continuous on A, where α, β ∈ R.
^ Is in the above case f g necessarily uniformly continuous?
Exercise 2.30 Let f : (0, ∞) → R be given by f : x ·→ 1/

x:
' Let a > 0. Prove that f is uniformly continuous on [a, ∞).
^ Show that f is not uniformly continuous on (0, ∞).
Exercise 2.31 Let I be an open connected domain (i.e., an open segment, an
open ray, or the whole line) and let f , g : I → R be uniformly continuous on I.
Prove that f g is uniformly continuous on I. (Hint: read the proof about the limit
of a product of functions being the product of their limits.)
Functions 89
Exercise 2.32 Let f : A → B and g : B → C be uniformly continuous. Show
that g ◦ f is uniformly continuous.
Exercise 2.33 Let a > 0. Show that f (x) = sin(1/x) is not uniformly contin-
uous on (0, a).
Exercise 2.34 For which values of α > 0 is the function f (x) = x
α
uniformly
continuous on [0, ∞)?
Exercise 2.35 Let α > 0. We say that f : A → B is H¨ older continuous of
order α if there exists a constant M > 0, such that
¦ f (x) − f (y)¦ < M¦x − y¦
α
, ∀x, y ∈ A.
' Show that f (x) =

x is H¨ older continuous of order 1/2 on (0, ∞).
^ Prove that if f is H¨ older continuous of order α on A then it is uniformly
continuous on A.
· Prove that if f is H¨ older continuous of order α > 1, then it is a constant
function.
90 Chapter 2
Chapter 3
Derivatives
3.1 Definition
In this chapter we define and study the notion of differentiable functions. Deriva-
tives were historically introduced in order to answer the need of measuring the
“rate of change” of a function. We will actually adopt this approach as a prelude
to the formal definition of the derivative.
Before we start, one technical clarification. In the previous chapter, we introduced
the limit of a function f : A → B at an (interior) point a,
lim
a
f .
Consider the function g defined by
g(x) = f (a + x),
on the range,
¦x : a + x ∈ A¦.
In particular, zero is an interior point of this set. We claim that
lim
0
g = lim
a
f ,
provided, of course, that the right hand side exists. It is a good exercise to prove
it formally. Suppose that the right hand side equals . This means that
(∀ > 0)(∃δ > 0) : (∀x ∈ B

(a, δ))(¦ f (x) − ¦ < ).
92 Chapter 3
Setting x − a = y,
(∀ > 0)(∃δ > 0) : (∀y ∈ B

(0, δ))(¦ f (a + y) − ¦ < ),
and by the definition of g,
(∀ > 0)(∃δ > 0) : (∀y ∈ B

(0, δ))(¦g(y) − ¦ < ),
i.e., lim
0
g = . Often, one does not bother to define the function g and simply
writes
lim
x→0
f (a + x) = lim
a
f ,
although this in not in agreement with our notations; f (a + x) stands for the com-
posite function,
f ◦ (x ·→ a + x).
Let now f be a function defined on some interval I, and let a and x be two points
inside this interval. The variation in the value of f between these two points is
f (x) − f (a). We define the mean rate of change (עצוממ יוניש בצק) of f between
the points a and x to be the ratio
f (x) − f (a)
x − a
.
This quantity can be attributed with a number of interpretations. First, if we con-
sider the graph of f , then the mean rate of change is the slope of the secant line
(רתימ) that intersects the graph at the points a and x (see Figure 3.1). Second, it
has a meaning in many physical situations. For example, f can be the position
along a line (say, in meters relative to the origin) as function of time (say, in sec-
onds relative to an origin of time). Thus, f (a) is the distance from the origin in
meters a seconds after the time origin, and f (x) is the distance from the origin
in meters x seconds after the time origin. Then f (x) − f (a) is the displacement
(קתעה) between time a and time x, and [ f (x) − f (a)]/(x −a) is the mean displace-
ment per unit time, or the mean velocity.
This physical example is a good preliminary toward the definition of the deriva-
tive. The mean velocity, or mean rate of displacement, is an average quantity
between two instants, but there is nothing to guarantee that within this time in-
terval, the body was in a “fixed state”. For example, we could intersect this time
interval into two equal sub-intervals, and measure the mean velocity in each half.
Nothing guarantees that these mean velocities will equal the mean velocity over
Derivatives 93
f!a"
a a+h
f!a+h"
!
Figure 3.1: Relation between the mean rate of change of a function and the slope
of the corresponding secant.
the whole interval. Physicists aimed to define an “instantaneous velocity”, and
the way to do it was to make the length of the interval very small. Of course,
having h small (what does small mean?) does not change the average nature of
the measured velocity.
An instantaneous rate of change can be defined by using limits. Fixing the point
a, we define a function

f ,a
: x ·→
f (x) − f (a)
x − a
.
The function ∆
f ,a
can be defined on the same domain as f , but excluding the point
a. We say that f is differentiable at a (תיליבאיצנרפיד וא הריזג) if ∆
f ,a
has a limit
at a. We write
lim
a

f ,a
= f
·
(a).
We call the limit f
·
(a) the derivative (תרזגנ) of f at a.
Comment: In the more traditional notation,
f
·
(a) = lim
x→a
f (x) − f (a)
x − a
.
At this stage, f
·
(a) is nothing but a notation. It is not (yet!) a function evaluated
94 Chapter 3
at the point a. Another standard notation, due to Leibniz, is
d f
dx
(a).
Having identified the mean rate of change as the slope of the secant line, the
derivative has a simple geometrical interpretation. As x tends to a, the corre-
sponding family of secants tends to a line which is tangent to f at the point a.
For us, these are only hand-waving arguments, as we have never assigned any
meaning to the limit of a family of lines.
Example: Consider the constant function, f : R → R, f : x ·→ c. We are going to
calculate its derivative at the point a. We construct the function

f ,a
(y) =
f (y) − f (a)
y − a
=
c − c
y − a
= 0.
The limit of this function at a is clearly zero, hence f
·
(a) = 0.
In the above example, we could have computed the derivative at any point. In
other words, we can define a function that given a point x, returns the derivative of
f at that point, i.e., returns f
·
(x). A function f : A → B is called differentiable in
a subset U ⊆ A of its domain if it has a derivative at every point x ∈ U. We then
define the derivative function, f
·
: U → R, as the function,
f
·
(x) = lim
x

f ,x
.
Again, the more traditional notation is,
f
·
(x) = lim
y→x
f (y) − f (x)
y − x
.
Example: Consider the function f : x ·→ x
2
. We calculate its derivative at a point
x by first observing that

f ,x
(y) =
y
2
− x
2
y − x
= x + y or ∆
f ,x
= Id +x,
so that
f
·
(x) = lim
x

f ,x
= 2x.
Derivatives 95

(23 hrs, 2009)
(25 hrs, 2011)
Example: Consider now the function f : x ·→ ¦x¦. Let’s first calculate its deriva-
tive at a point x > 0. For x > 0,

f ,x
(y) =
¦y¦ − x
y − x
.
Since we are interested in the limit of ∆
f ,x
at x > 0 we may well assume that y > 0,
in which case ∆
f ,x
(y) = 1. Hence
f
·
(x) = lim
x

f ,x
= 1.
For negative x, ¦x¦ = −x and we may consider y < 0 as well, so that

f ,x
(y) =
(−y) − (−x)
y − a
= −1,
so that
f
·
(x) = lim
x

f ,x
= −1.
Remains the point 0 itself,

f ,0
(y) =
¦y¦ − ¦0¦
y − 0
= sgn(y).
The limit at zero does not exist since every neighborhood of zero has points where
this function equals one and points where this function equals minus one. Thus f
is differentiable everywhere except for the origin
1
.
(29 hrs, 2010)
One-sided differentiability This last example shows a case where a limit does
not exist, but one-sided limits do exist. This motivates the following definitions:
1
We use here a general principle, whereby lim
a
f does not exist if there exist
1

2
, such that
every neighborhood of a has points x, y, such that f (x) =
1
and f (y) =
2
.
96 Chapter 3
Definition 3.1 A function f is said to be differentiable on the right (ימימ הריזג)
at a if the one-sided limit
lim
a
+

f ,a
exists. We denote the right-hand derivative (תינמי תרזגנ) by f
·
(a
+
). A similar
definition holds for left-hand derivatives.
Example: Here is one more example that practices the calculation of the deriva-
tive directly from the definition. Consider first the function
f : x ·→
_
¸
¸
_
¸
¸
_
x sin
1
x
x 0
0 x = 0.
We have already seen that that this function is continuous at zero, but is it differ-
entiable at zero? We construct the function

f ,0
(y) =
f (y) − f (0)
y − 0
= sin
1
y
.
The function ∆
f ,0
does not have a limit at 0, because every neighborhood of zero
has both points where ∆
f ,0
= 1 and points where ∆
f ,0
= −1.
Example: In contrast, consider the function
f : x ·→
_
¸
¸
_
¸
¸
_
x
2
sin
1
x
x 0
0 x = 0.
We construct

f ,0
(y) =
f (y) − f (0)
y − 0
= y sin
1
y
.
Now lim
0

f ,0
= 0, hence f is differentiable at zero.
Differentiability and continuity The following theorem shows that differen-
tiable functions are a subclass (“better behaved”) of the continuous functions.
Theorem 3.1 If f is differentiable at a then it is continuous at a.
Derivatives 97
Proof : Note that for y a,
f (y) = f (a) + (y − a)
f (y) − f (a)
(y − a)
.
Viewing f (a) as a constant, we have a functional identity,
f = f (a) + (Id −a) ∆
f ,a
valid in a punctured neighborhood of a. By limit arithmetic,
lim
a
f = f (a) + lim
a
(Id −a) lim
a

f ,a
= f (a) + 0 f
·
(a) = f (a).
B
Exercise 3.1 Show, based on the definition of the derivative, that if f is dif-
ferentiable at a ∈ R with f (x) 0, then ¦ f ¦ is also differentiable at that point.
(Hint, use the fact that a function differentiable at a point is also continuous at that
point.)
Exercise 3.2 Let I be a connected domain, and suppose that f is differentiable
at every interior point of I and continuous on I. Suppose furthermore that ¦ f
·
(x)¦ <
M for all x ∈ I. Show that f is uniformly continuous on I.
Higher order derivatives If a function f is differentiable on an interval A, we
can construct a new function—its derivative, f
·
. The derivative f
·
can have vari-
ous properties, For example, it may be continuous, or not, and in particular, it may
be differentiable on A, or on a subset of A. Then, we can define a new function—
the derivative of the derivative, or the second derivative (הינש תרזגנ), which we
denote by f
··
. By definition
f
··
(x) = lim
x

f
·
,x
.
(Note that this is a limit of limits.) Likewise, the second derivative may be dif-
ferentiable, in which case we may define the third derivative f
···
, and so on. For
derivatives higher than the third, it is customary to use, for example, the notation
f
(4)
rather than f
····
. The k-th derivative, f
(k)
, is defined recursively,
f
(k)
(x) = lim
x

f
(k−1)
,x
.
For the recursion to hold from k = 1, we also set f
(0)
= f .
98 Chapter 3
3.2 Rules of differentiation
Recall that when we studied limits, we first calculated limits by using the defini-
tion of the limit, but very soon this became impractical, and we proved a number
of theorems (arithmetic of limits), with which we were able to easily calculate
a large variety of limits. The same exactly will happen with derivatives. After
having calculated the derivatives of a small number of functions, we develop tools
that will enable us to (easily) compute derivatives, without having to go back to
the definitions.
Theorem 3.2 If f is a constant function, then f
·
= 0 (by this we mean that f
·
:
x ·→ 0).
Proof : We proved this is in the previous section. B
Theorem 3.3 If f = Id then f
·
= 1 (by this we mean that f
·
: x ·→ 1).
Proof : Immediate from the definition. B
Theorem 3.4 Let f , g be functions defined on the same domain (it suffices that the
domains have a non-empty intersection). If both f and g are differentiable at a
then f + g is differentiable at a, and
( f + g)
·
(a) = f
·
(a) + g
·
(a).
Proof : We consider the function

f +g,a
=
( f + g) − ( f + g)(a)
Id −a
=
f + g − f (a) − g(a)
Id −a
= ∆
f ,a
+ ∆
g,a
,
Derivatives 99
and by the arithmetic of limits,
( f + g)
·
(a) = f
·
(a) + g
·
(a).
B
Theorem 3.5 (Leibniz rule) Let f , g be functions defined on the same domain. If
both f and g are differentiable at a then f g is differentiable at a, and
( f g)
·
(a) = f
·
(a) g(a) + f (a) g
·
(a).
Proof : We consider the function,

f g,a
=
f g − ( f g)(a)
Id −a
=
f g − f g(a) + f g(a) − f (a)g(a)
Id −a
= f ∆
g,a
+ g(a)∆
f ,a
,
and it remains to apply the arithmetic rules of limits. B
Corollary 3.1 The derivative is a linear operator,
(αf + βg)
·
= αf
·
+ βg
·
.
Proof : We only need to verify that if f is differentiable at a then so is αf and
(αf )
·
= αf
·
. This follows from the derivative of the product. B
Example: Let n ∈ N and consider the functions f
n
: x ·→ x
n
. Then,
f
·
n
(x) = n x
n−1
or equivalently f
·
n
= n f
n−1
.
We can show this inductively. We know already that this is true for n = 0, 1.
Suppose this were true for n = k. Then, f
k+1
= f
k
Id, and by the differentiation
rule for products,
f
·
k+1
= f
·
k
Id +f
k
Id
·
,
i.e.,
f
·
k+1
(x) = k x
k−1
x + x
k
1 = (k + 1) x
k
.

100 Chapter 3
Theorem 3.6 If g is differentiable at a and g(a) 0, then 1/g is differentiable at
a and
(1/g)
·
(a) = −
g
·
(a)
g
2
(a)
.
Proof : Since g is continuous at a (it is continuous since it is differentiable) and it
is non-zero, then there exists a neighborhood of a in which g does not vanish (by
Theorem 2.5). In this neighborhood we consider the function

1/g,a
=
1/g − (1/g)(a)
Id −a
=
1
Id −a
_
1
g

1
g(a)
_
=
−1
g g(a)

g,a
.
It remains to take the limit at a and apply the arithmetic laws of limits. B
(31 hrs, 2010)
Theorem 3.7 If f and g are differentiable at a and g(a) 0, then the function f /g
is differentiable at a and
( f /g)
·
=
f
·
g − g
·
f
g
2
.
Proof : Apply the last two theorems. B
(25 hrs, 2009)
With these theorems in hand, we can differentiate a large number of functions.
In particular there is no difficulty in differentiating a product of more than two
functions. For example,
( f gh)
·
= (( f g)h)
·
= ( f g)
·
h+( f g)h
·
= ( f
·
g+ f g
·
)h = ( f g)h
·
= f
·
gh+ f g
·
h+ f gh
·
.
Suppose we take for granted that
sin
·
= cos and cos
·
= −sin .
Then we have no problem calculating the derivative of, say, x ·→ sin
3
x, as a
product of three functions. But what about the derivative of x ·→ sin x
3
? Here we
need a rule for how to differentiate compositions.
Derivatives 101
Consider the composite function g ◦ f , i.e.,
(g ◦ f )(x) = g( f (x)).
Let’s try to calculate its derivative at a point a, assuming for the moment that both
f and g are differentiable everywhere. We then need to look at the function

g◦ f ,a
(y) =
(g ◦ f )(y) − (g ◦ f )(a)
y − a
=
g( f (y)) − g( f (a))
y − a
,
and calculate its limit at a. When y is close to a, we expect the arguments f (y)
and f (a) of g to be very close. This suggest the following treatment,

g◦ f ,a
(y) =
g( f (y)) − g( f (a))
f (y) − f (a)

f (y) − f (a)
y − a
=
g( f (y)) − g( f (a))
f (y) − f (a)

f ,a
(y).
It looks that as y → a, since f (y) − f (a) → 0, this product tends to g
·
( f (a)) f
·
(a).
The problem is that while the limit y → a means that the case y = a is not to be
considered, there is nothing to prevent the denominator f (y)−f (a) fromvanishing,
rendering this expression meaningless. Yet, the result is correct, and it only takes
a little more subtlety to prove it.
(27 hrs, 2011)
Theorem 3.8 Let f : A → B and g : B → R. If f is differentiable at a ∈ A, and g
is differentiable at f (a), then
(g ◦ f )
·
(a) = g
·
( f (a)) f
·
(a).
Proof : We introduce the following function, defined in a neighborhood of f (a),
ψ : z ·→
_
¸
¸
_
¸
¸
_

g, f (a)
(z) z f (a)
g
·
( f (a)) z = f (a).
The fact that g is differentiable at f (a) implies that ψ is continuous at f (a).
We next claim that for y a

g◦ f ,a
= (ψ ◦ f ) ∆
f ,a
.
102 Chapter 3
Why that? If f (y) f (a), then this equation reads
g( f (y)) − g( f (a))
y − a
=
g( f (y)) − g( f (a))
f (y) − f (a)

f (y) − f (a)
y − a
,
which holds indeed, whereas if f (y) = f (a) it reads,
g( f (y)) − g( f (a))
y − a
= g
·
( f (a))
f (y) − f (a)
y − a
,
and both sides are zero. Consider now the limit of the right hand side at a. By
limits arithmetic,
lim
a
_
(ψ ◦ f ) ∆
f ,a
_
= ψ( f (a)) lim
a

f ,a
= g
·
( f (a)) f
·
(a).
B
(32 hrs, 2010)
3.3 Another look at derivatives
In this short section we provide another (equivalent) characterization of differen-
tiability and the derivative. Its purpose is to offer a slightly different angle of view
on the subject, and how clever definitions can sometimes greatly simplify proofs.
Our definition of derivatives states that a function f is differentiable at an interior
point a, if the function ∆
f ,a
has a limit at a, and we denote this limit by f
·
(a).
The function ∆
f ,a
is not defined at a, hence not continuous at a, but this is a
removable discontinuity. We could say that f is differentiable at a if there exists a
real number, , such that the function
S
f ,x
(y) =
_
¸
¸
_
¸
¸
_

f ,a
(y) y a
y = a
is continuous at a, and S
f ,a
(a) = is called the derivative of f at a. For y a,
S
f ,a
(y) = ∆
f ,a
(y) =
f (x) − f (a)
x − a
,
or equivalently,
f (y) = f (a) + S
f ,a
(y)(y − a).
This equation holds also for y = a. This suggests the following alternative defini-
tion of the derivative:
Derivatives 103
Definition 3.2 A function f defined in on open neighborhood of a point a is said
to be differentiable at a if there exists a function S
f ,a
continuous at a, such that
f (y) = f (a) + S
f ,a
(y)(y − a).
S
f ,a
(a) is called the derivative of f at a and is denoted by f
·
(a).
Of course, S
f ,a
coincides with ∆
f ,a
in some punctured neighborhood of a.
A first example Take the function f : x → x
2
. Then
f (y) − f (a) = (y + a)(y − a),
or
f = f (a) + (Id +a)(Id −a).
Since the function Id +a is continuous at a, it follows that f is differentiable at a
and
f
·
(a) = lim
a
(Id +a) = 2a.
Leibniz’ rule Let’s now see how this alternative definition simplifies certain
proofs. Suppose for example that both f and g are differentiable at a. This implies
the existence of two continuous functions, S
f ,a
and S
g,a
, such that
f (y) = f (a) + S
f ,a
(y)(y − a)
g(y) = g(a) + S
g,a
(y)(y − a)
in some neighborhood of a and f
·
(a) = S
f ,a
(a) and g
·
(a) = S
g,a
(a). Then,
f (y)g(y) =
_
f (a) + S
f ,a
(y)(y − a)
_ _
g(a) + S
g,a
(y)(y − a)
_
= f (a)g(a) +
_
f (a)S
g,a
(y) + g(a) S
f ,a
(y) + S
f ,a
(y)S
g,a
(y)(y − a)
_
(y − a).
By limit arithmetic, the function in the brackets,
F = f (a)S
g,a
+ g(a) S
f ,a
+ S
f ,a
S
g,a
(Id −a)
is continuous at a, hence f g is differentiable at a and
( f g)
·
(a) = lim
a
F = f (a)g
·
(a) + g(a) f
·
(a).
104 Chapter 3
Derivative of a composition Suppose now that f is differentiable at a and g is
differentiable at f (a). Then, there exist continuous functions, S
f ,a
and S
g, f (a)
, such
that
f (y) = f (a) + S
f ,a
(y)(y − a)
g(z) = g( f (a)) + S
g, f (a)
(z)(z − f (a)).
Now,
g( f (y)) = g( f (a)) + S
g, f (a)
( f (y))( f (y) − f (a))
= g( f (a)) + S
g, f (a)
( f (y))S
f ,a
(y)(y − a),
or,
(g ◦ f )(y) = (g ◦ f )(a) + S
g, f (a)
( f (y))S
f ,a
(y)
.,,.
[
S
g, f (a)
◦ f
]
S
f ,a
(y)
(y − a).
By the properties of continuous functions, the function in the square brackets is
continuous at a, and
(g ◦ f )
·
(a) = lim
a
_
S
g, f (a)
◦ f
_
S
f ,a
= g
·
( f (a)) f
·
(a).
(34 hrs, 2010)
(28 hrs, 2011)
3.4 The derivative and extrema
In high-school calculus, one of the main uses of differential calculus is to find
extrema of functions. We are going to put this practice on solid grounds. First,
recall the definition,
Definition 3.3 Let f : A → B. A point a ∈ A (not necessarily an internal point)
is said to be a maximum point of f in A, if
f (x) ≤ f (a) ∀x ∈ A.
We define similarly a minimum point.
Comment: By no means a maximum point has to be unique, nor to exist. For
example, in a constant function all points are minima and maxima. On the other
Derivatives 105
hand, the function f : (0, 1) → R, f : x ·→ x
2
does not have a maximum in (0, 1).
Also, we proved that a continuous function defined on a closed interval always
has a maximum (and a minumum, both, not necessarily unique).
Here is a first connection between maximum (and minimum) points and deriva-
tives:
Theorem 3.9 Let f : (a, b) → R. If x ∈ (a, b) is a maximum point of f and f is
differentiable at x, then f
·
(x) = 0.
Proof : Since x is, by assumption, a maximum point, then for every y ∈ (a, b),
f (y) − f (x) ≤ 0.
In particular, for y x,

f ,x
(y) =
f (y) − f (x)
y − x
,
satisfies
_
¸
¸
_
¸
¸
_

f ,x
(y) ≤ 0 y > x

f ,x
(y) ≥ 0 y < x.
It is given that ∆
f ,x
has a limit at x (it is f
·
(x)). We will show that this limit is
necessarily zero.
Suppose the limit, , were positive. Then, taking = /2, there has to be a δ > 0,
such that
0 <

2
< ∆
f ,x
(y) <
3
2
whenever 0 < ¦y − x¦ < δ,
but this can’t hold, since every neighborhood of x has points y where ∆
f ,x
(y) ≤ 0.
Similarly, can’t be negative, hence = 0. B
Comment: This is a uni-directional theorem. It does not imply that if f
·
(x) = 0
then x is a maximum point (nor a minimum point).
Comment: The open interval cannot be replaced by the closed interval [a, b],
because at the end points we can only consider one-sided limits.
106 Chapter 3
Definition 3.4 Let f : A → B. An interior point a ∈ A is called a local maxi-
mum of f (ימוקמ ומיסקמ), if there exists a δ > 0 such that a is a maximum of f
in (a − δ, a + δ) (“local” always means “in a sufficiently small neighborhood”).
Theorem 3.10 (Pierre de Fermat) If a is a local maximum (or minimum) of f in
some open interval and f is differentiable at a then f
·
(a) = 0.
Proof : There is actually nothing to prove, as the previous theorem applies verba-
tim for f restricted to the interval (a − δ, a + δ). B
Comment: The converse is not true. Take for example the function f : x → x
3
. If
is differentiable in R and f
·
(0) = 0, but zero is not a local minimum, nor a local
maximum of f .
Comment: Although this theorem is due to Fermat, its proof is slightly shorter
than the proof of Fermat’s last theorem.
Definition 3.5 Let f : A → B be differentiable. A point a ∈ A is called a critical
point of f if f
·
(a) = 0. The value f (a) is called a critical value.
Locating extremal points Let f : [a, b] → R and suppose we want to find a
maximum point of f . If f is continuous, then a maximum is guaranteed to exist.
There are three types of candidates: (1) critical points of f in (a, b), (ii) the end
points a and b, and (iii) points where f is not differentiable. If f is differentiable,
then the task reduces to finding all the critical points and comparing all the critical
values,
¦ f (x) : f
·
(x) = 0¦.
Finally, the largest critical value has to be compared to f (a) and f (b).
Example: Consider the function f : [−1, 2] → R, f : x ·→ x
3
− x. This function
is differentiable everywhere, and its critical points satisfy
f
·
(x) = 3x
2
− 1 = 0,
i.e., x = ±1/

3, which are both in the domain. It is easily checked that
f (1/

3) = −
2
3

3
and f (−1/

3) =
2
3

3
.
Derivatives 107
Finally, f (−1) = 0 and f (2) = 6, hence 2 is the (unique) maximum point.
So far, we have derived properties of the derivative given the function. What about
the reverse direction? Take the following example: we know that the derivative
of a constant function is zero. Is it also true that if the derivative is zero then the
function is a constant? A priori, it is not clear how to show it. How can we go
from the knowledge that
lim
y→x
f (y) − f (x)
y − x
= 0,
to showing that f is a constant function. The following two theorems will provide
us with the necessary tools:
Theorem 3.11 (Michel Rolle, 1691) If f is continuous on [a, b], differentiable in
(a, b), and f (a) = f (b), then there exists a point c ∈ (a, b) where f
·
(c) = 0.
Proof : It follows from the continuity of f that it has a maximum and a minimum
point in [a, b]. If the maximum occurs at some c ∈ (a, b) then f
·
(c) = 0 and we are
done. If the minimum occurs at some c ∈ (a, b) then f
·
(c) = 0 and we are done.
The only remaining alternative is that a and b are both minima and maxima, in
which case f is a constant and its derivative vanishes at some interior point (well,
at all of them). B
Comment: The requirement that f be differentiable everywhere in (a, b) is imper-
ative, for consider the function
f (x) =
_
¸
¸
_
¸
¸
_
x 0 ≤ x ≤
1
2
1 − x
1
2
< x ≤ 1.
Even though f is continuous in [0, 1] and f (0) = f (1) = 0, there is no interior
point where f
·
(x) = 0.
Theorem 3.12 (Mean-value theorem (עצוממה רעה טפשמ)) If f is continu-
ous on [a, b] and differentiable in (a, b), then there exists a point c ∈ (a, b) where
f
·
(c) =
f (b) − f (a)
b − a
,
that is, a point at which the derivative equals to the mean rate of change of f
between a and b.
108 Chapter 3
Comment: Rolle’s theorem is a particular case.
Proof : This is almost a direct consequence of Rolle’s theorem. Define the function
g : [a, b] → R,
g(x) = f (x) −
f (b) − f (a)
b − a
(x − a).
g is continuous on [a, b] and differentiable in (a, b). Moreover, g(a) = g(b) = f (a).
Hence by Rolle’s theorem there exists a point c ∈ (a, b), such that
0 = g
·
(c) = f
·
(c) −
f (b) − f (a)
b − a
.
B
Corollary 3.2 Let f : [a, b] → R. If f
·
(x) = 0 for all x ∈ (a, b) then f is a
constant.
Proof : Let c, d ∈ (a, b), c < d. By the mean-value theorem there exists some
e ∈ (c, d), such that
f
·
(e) =
f (d) − f (c)
d − c
,
however f
·
(e) = 0, hence f (d) = f (c). Since this holds for all pair of points, then
f is a constant. B
Comment: This is only true if f is defined on an interval. Take a domain of
definition which is the union of two disjoint sets, and this is no longer true ( f is
only constant in every “connected component” (תורישק ביכר)).
(27 hrs, 2009)
Corollary 3.3 If f and g are differentiable on an interval with f
·
(x) = g
·
(x), then
there exists a real number c such that f = g + c on that interval.
Proof : Apply the previous corollary for f − g. B
Derivatives 109
Corollary 3.4 If f is continuous on a closed interval [a, b] and f
·
(x) > 0 in (a, b),
then f is increasing on that interval.
Proof : Let x < y belong to that interval. By the mean-value theorem there exists
a c ∈ (x, y) such that
f
·
(c) =
f (y) − f (x)
y − x
,
however f
·
(c) > 0, hence f (y) > f (x). B
Comment: The converse is not true. If f is increasing and differentiable then
f
·
(x) ≥ 0, but equality may hold, as in f (x) = x
3
at zero.
We have seen that at a local minimum (or maximum) the derivative (if it exists)
vanishes, but that the opposite is not true. The following theorem gives a sufficient
condition for a point to be a local minimum (with a corresponding theorem for a
local maximum).
Theorem 3.13 If f
·
(a) = 0 and f
··
(a) > 0 then a is a local minimum of f .
(30 hrs, 2011)
Comment: The fact that f
··
exists at a implies:
1. f
·
exists in some neighborhood of a.
2. f
·
is continuous at a.
3. f is continuous in some neighborhood of a.
Proof : By definition,
f
··
(a) = lim
a

f
·
,a
> 0 where ∆
f
·
,a
(y) =
f
·
(y) − f
·
(a)
y − a
=
f
·
(y)
y − a
.
Thus there exists a δ > 0 such that
f
·
(y)
y − a
> 0 whenever 0 < ¦y − a¦ < δ,
110 Chapter 3
or,
f
·
(y) > 0 whenever a < y < a + δ
f
·
(y) < 0 whenever a − δ < y < a.
It follows that f is increasing in a right-neighborhood of a and decreasing in a
left-neighborhood of a, i.e., a is a local minimum.
Another way of completing the argument is as follows: for every y ∈ (a, a + δ):
f (y) − f (a)
y − a
= f
·
(c
y
) > 0 i.e., f (y) > f (a),
where c
y
∈ (a, y). Similarly, for every y ∈ (a − δ, a):
f (y) − f (a)
y − a
= f
·
(c
y
) < 0 i.e., f (y) > f (a),
where here c
y
∈ (y, a).
B
Example: Consider the function f (x) = x
3
− x. There are three critical points
0, ±1/

3: one local maximum, one local minimum, and one which is neither.

Example: Consider the function f (x) = x
4
. Zero is a minimum point, even though
f
··
(0) = 0.
(36 hrs, 2010)
Theorem 3.14 If f has a local minimum at a and f
··
(a) exists, then f
··
(a) ≥ 0.
Proof : Suppose, by contradiction, that f
··
(a) < 0. By the previous theorem this
would imply that a is both a local minimum and a local maximum. This in turn
implies that there exists a neighborhood of a in which f is constant, i.e., f
·
(a) =
f
··
(a) = 0, a contradiction. B
The following theorem states that the derivative of a continuous function cannot
have a removable discontinuity.
Derivatives 111
Theorem 3.15 Suppose that f is continuous at a and
lim
a
f
·
exists,
then f
·
(a) exists and f
·
is continuous at a.
Proof : The assumption that f
·
has a limit at a implies that f
·
exists in some punc-
tured neighborhood U of a. Thus, for y ∈ U \ ¦a¦ the function f is continuous on
the closed segment that connects y and a and differentiable on the corresponding
open segment. By the mean-value theorem, there exists a point c
y
between y and
a, such that
f
·
(c
y
) =
f (y) − f (a)
y − a
,
and therefore ¸
¸
¸
¸
¸

f ,a
(y) − lim
a
f
·
¸
¸
¸
¸
¸
=
¸
¸
¸
¸
¸
f
·
(c
y
) − lim
a
f
·
¸
¸
¸
¸
¸
.
Since f
·
has a limit at a,
(∀ > 0)(∃δ > 0) : (∀y ∈ B

(a, δ))
_
¦ f
·
(y) − lim
a
f
·
¦ <
_
.
Since y ∈ B

(a, δ) implies that c
y
∈ B

(a, δ),
(∀ > 0)(∃δ > 0) : (∀y ∈ B

(a, δ))
_
¦ f
·
(c
y
) − lim
a
f
·
¦ <
_
,
which by the above identity gives
(∀ > 0)(∃δ > 0) : (∀y ∈ B

(a, δ))
_
¦∆
f ,a
(y) − lim
a
f
·
¦ <
_
,
which precisely means that
f
·
(a) = lim
a
f
·
.
B
Example: The function
f (x) =
_
¸
¸
_
¸
¸
_
x
2
x < 0
x
2
+ 5 x ≥ 0,
112 Chapter 3
is differentiable in a neighborhood of 0 and
lim
0
f
·
exists,
but it is not continuous, and therefore not differentiable at zero.
Example: The function
f (x) =
_
¸
¸
_
¸
¸
_
x
2
sin 1/x x 0
0 x = 0,
is continuous and differentiable in a neighborhood of zero. However, the deriva-
tive does not have a limit at zero, so that the theorem does not apply. Note, how-
ever, that f is differentiable at zero.
(29 hrs, 2009) (37 hrs, 2010)
Theorem 3.16 (Augustin Louis Cauchy; mean value theorem) Suppose that
f , g are continuous on [a, b] and differentiable in (a, b). Then, there exists a
c ∈ (a, b), such that
[ f (b) − f (a)]g
·
(c) = [g(b) − g(a)] f
·
(c).
Comment: If g(b) g(a) then this theorem states that
f (b) − f (a)
g(b) − g(a)
=
f
·
(c)
g
·
(c)
.
This may seem like a corollary of the mean-value theorem. We know that there
exist c,d, such that
f
·
(c) =
f (b) − f (a)
b − a
and g
·
(d) =
g(b) − g(a)
b − a
.
The problem is that there is no reason for the points c and d to coincide.
Derivatives 113
Proof : Define
h(x) = f (x)[g(b) − g(a)] − g(x)[ f (b) − f (a)],
and apply Rolle’s theorem. Namely, we verify that h is continuous on [a, b] and
differentiable in (a, b), and that h(a) = h(b), hence there exists a point c ∈ (a, b),
such that
h
·
(c) = f
·
(c)[g(b) − g(a)] − g
·
(c)[ f (b) − f (a)] = 0.
B
Theorem 3.17 (L’Hˆ opital’s rule) Suppose that
lim
a
f = lim
a
g = 0,
and that
lim
a
f
·
g
·
exists.
Then the limit lim
a
( f /g) exists and
lim
a
f
g
= lim
a
f
·
g
·
.
Comment: This is a very convenient tool for obtaining a limit of a fraction, when
both numerator and denominator vanish in the limit
2
.
Proof : It is implicitly assumed that f and g are differentiable in a neighborhood
of a (except perhaps at a) and that g
·
is non-zero in a neighborhood of a (except
perhaps at a); otherwise, the limit lim
a
( f
·
/g
·
) would not have existed. Note that f
and g are not necessarily defined at a; since they have a limit there, there will be
no harm assuming that they are continuous at a, i.e., f (a) = g(a) = 0
3
.
2
L’Hˆ opital’s rule was in fact derived by Johann Bernoulli who was giving to the French noble-
man, le Marquis de l’Hˆ opital, lessons in calculus. L’Hˆ opital was the one to publish this theorem,
acknowledging the help of Bernoulli.
3
What we really do is to define continuous functions
˜
f (x) =
_
¸
¸
_
¸
¸
_
f (x) x a
0 x = a
and ˜ g(x) =
_
¸
¸
_
¸
¸
_
g(x) x a
0 x = a
,
114 Chapter 3
First, we claim that g does not vanish in some neighborhood U of a, for by Rolle’s
theorem, it would imply that g
·
vanishes somewhere in this neighborhood. More
formally, if
g
·
(x) 0 whenever 0 < ¦x − a¦ < δ,
then
g(x) 0 whenever 0 < ¦x − a¦ < δ,
for if g(x) = 0, with, say, 0 < x < δ, then there exists a 0 < c
x
< x < δ, where
g
·
(c
x
) = 0.
Both f and g are continuous and differentiable in some neighborhood U that con-
tains a, hence by Cauchy’s mean-value theorem, there exists for every x ∈ U, a
point c
x
between x and a, such that
f (x) − 0
g(x) − 0
=
f (x)
g(x)
=
f
·
(c
x
)
g
·
(c
x
)
. (3.1)
Since the limit of f
·
/g
·
at a exists,
(∀ > 0)(∃δ > 0) : x ∈ B

(a, δ) implies
_
¸
¸
¸
¸
¸
f
·
(x)
g
·
(x)
− lim
a
f
·
g
·
¸
¸
¸
¸
¸
<
_
,
and since x ∈ B

(a, δ) implies c
x
∈ B

(a, δ),
(∀ > 0)(∃δ > 0) : x ∈ B

(a, δ) implies
_
¸
¸
¸
¸
¸
f
·
(c
x
)
g
·
(c
x
)
− lim
a
f
·
g
·
¸
¸
¸
¸
¸
<
_
,
and by (3.1),
(∀ > 0)(∃δ > 0) : x ∈ B

(a, δ) implies
_
¸
¸
¸
¸
¸
f (x)
g(x)
− lim
a
f
·
g
·
¸
¸
¸
¸
¸
<
_
,
which means that
lim
a
f
g
= lim
a
f
·
g
·
.
B
Example:
and prove the theorem for the “corrected” functions
˜
f and ˜ g. At the end, we observe that the result
must apply for f and g as well.
Derivatives 115
' The limit
lim
x→0
sin x
x
= 1.
^ The limit
lim
x→0
sin
2
x
x
2
= 1,
with two consecutive applications of l’Hˆ opital’s rule.

(32 hrs, 2011)
Comment: Purposely, we only prove one variant among many other of l’Hˆ opital’s
rule. Another variants is: Suppose that
lim

f = lim

g = 0,
and that
lim

f
·
g
·
exists.
Then
lim

f
g
= lim

f
·
g
·
.
Yet another one is: Suppose that
lim
a
f = lim
a
g = ∞,
and that
lim
a
f
·
g
·
exists.
Then
lim
a
f
g
= lim
a
f
·
g
·
.
The following theorem is very reminiscent of l’Hˆ opital’s rule, but note how dif-
ferent it is:
116 Chapter 3
Theorem 3.18 Suppose that f and g are differentiable at a, with
f (a) = g(a) = 0 and g
·
(a) 0.
Then,
lim
a
f
g
=
f
·
(a)
g
·
(a)
.
Proof : Without loss of generality, assume g
·
(a) > 0. That is,
lim
y→a
g(y) − g(a)
y − a
= lim
y→a
g(y)
y − a
= g
·
(a) > 0,
hence there exists a punctured neighborhood of a where g(x)/(x − a) > 0, and in
particular, in which the numerator g(x) is non-zero. Thus, we can divide by g(x)
in a punctured neighborhood of a, and
f (x)
g(x)
=
f (x) − f (a)
g(x) − g(a)
=
f (x)−f (a)
x−a
g(x)−g(a)
x−a
=

f ,a
(x)

g,a
(x)
.
Using the arithmetic of limits, we obtain the desired result. B
(33 hrs, 2011)
(30 hrs, 2009)
Exercise 3.3 Show that the function
f (x) =
_
¸
¸
_
¸
¸
_
1
2
x + x
2
sin
1
x
x 0
0 x = 0
satisfies f
·
(0) > 0, and yet, is not increasing in any neighborhood of 0 (i.e., there
is no neighborhood of zero in which x < y implies f (x) < f (y)).
Exercise 3.4 Let I be an open segment. Any suppose that f
·
(x) = f (x) for all
x ∈ I. Show that there exists a constant c, such that f (x) = c e
x
. (Hint: consider
the function g(x) = f (x)/e
x
.)
Exercise 3.5
Derivatives 117
' Let f : R → R be differentiable. Show that if f vanishes at m points, then
f
·
vanishes at at least m − 1 points.
^ Let f be n times continuously differentiable in [a, b], and n + 1 times dif-
ferentiable in (a, b). Suppose furthermore that f (b) = 0 and f
(k)
(a) = 0 for
k = 0, 1, . . . , n. Show that there exists a c ∈ (a, b) such that f
(n+1)
(c) = 0.
Exercise 3.6 Let
f (x) =
_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
_
sin(ωx) x < 0
γ x = 0
αx
2
+ βx x > 0.
Find all the values of α, β, γ, ω for which f is (i) continuous at zero, (ii) differen-
tiable at zero, (iii) twice differentiable at zero.
Exercise 3.7 Calculate the following limits:
' lim
x→0
sin(ax)
sin(bx) cos(cx)
.
^ lim
x→0
tan(x)−x
x−sin(x)
.
· lim
x→0
x
2
sin(1/x)
sin(x)
.
´ lim
x→1
(1 − x) tan
_
πx
2
_
.
Exercise 3.8 Suppose that f : (0, ∞) → R is differentiable on (0, ∞) and
lim

f
·
= L. Prove, without using l’Hopital’s theorem that lim
x→∞
f (x)/x = L.
Hint: write
f (x)
x
=
f (x) − f (x
0
)
x − x
0

x − x
0
x
,
and choose x
0
appropriately in order to use Lagrange’s theorem.
3.5 Derivatives of inverse functions
We have already shown that if f : A → B is one-to-one and onto, then it has an
inverse f
−1
: B → A. We then say that f is invertible (הכיפה). We have seen
that an invertible function defined on an interval must be monotonic, and that the
inverse of a continuous function is continuous. In this section, we study under
what conditions is the inverse of a differentiable function differentiable.
118 Chapter 3
Figure 3.2: The derivative of the inverse function
Recall also that
f
−1
◦ f = Id,
with Id : A → A. If both f and f
−1
were differentiable at a and f (a), respectively,
then by the composition rule,
( f
−1
)
·
( f (a)) f
·
(a) = 1.
Setting f (a) = b, or equivalently, a = f
−1
(b), we get
( f
−1
)
·
(b) =
1
f
·
( f
−1
(b))
.
There is one little flaw with this argument. We have assumed ( f
−1
)
·
to exist. If this
were indeed the case, then we have just derived an expression for the derivative of
the inverse.
Corollary 3.5 If f
·
( f
−1
(b)) = 0 then f
−1
is not differentiable at b.
Example: The function f (x) = x
3
+ 2 is continuous and invertible, with f
−1
(x) =
3

x − 2. Since, f
·
( f
−1
(2)) = 0, then f
−1
is not differentiable at 2.
(39 hrs, 2010)
Theorem 3.19 Let f : I → R be continuous and invertible. If f is differentiable
at f
−1
(b) and f
·
( f
−1
(b)) 0, then f
−1
is differentiable at b.
Derivatives 119
Proof : Let b = f (a), i.e., a = f
−1
(b). It is given
lim
a

f ,a
= f
·
(a) 0.
To each x a corresponds a y b such that
y = f (x) and x = f
−1
(y).
Then,

f ,a
(x) =
f (x) − f (a)
x − a
=
y − b
f
−1
(y) − f
−1
(b)
=
1

f
−1
,b
(y)
,
which we can rewrite as

f
−1
,b
(y) =
1

f ,a
(x)
=
1

f ,a
( f
−1
(y))
.
Since f
−1
is continuous at b and ∆
f ,a
is continuous at a = f
−1
(b), it follows from
limit artithmetic that
lim
b

f
−1
,b
=
1
f
·
( f
−1
(b))
.
B
Example: Consider the function f (x) = tan x defined on (−π/2, π/2). We denote
its inverse by arctan x. Then
4
,
( f
−1
)
·
(x) =
1
f
·
( f
−1
(x))
= cos
2
(arctan x) =
1
1 + tan
2
(arctan x)
=
1
1 + x
2
.

Example: Consider the function f (x) = x
n
, with n integer. This proves that the
derivative of f
−1
(x) = x
1/n
is
( f
−1
)
·
(x) =
1
n
x
1/n−1
,
and together with the composition rule gives the derivative for all rational powers.

4
We used the identity cos
2
= 1/(1 + tan
2
).
120 Chapter 3
Comment: So far, we have not defined real-valued powers. We will get to it after
we study integration theory.
Exercise 3.9 Let f : R → R be one-to-one and onto, and furthermore dif-
ferentiable with f
·
(x) 0 for all x ∈ R. Suppose that there exists a function
F : R → R such that F
·
= f . Define G(x) = x f
−1
(x) − F( f
−1
(x)). Prove that
G
·
(x) = f
−1
(x). Note that this statement says that if we know a function whose
derivative is one-to-one and onto, then we know the function whose derivative is
its inverse.
Exercise 3.10 Let f be invertible and twice differentiable in (a, b). Further-
more, f
·
(x) 0 in (a, b). Prove that ( f
−1
)
··
exists and find an expression for it in
terms of f and its derivatives.
Exercise 3.11 Suppose that f : R → R is continuous, one-to-one and onto,
and satisfies f = f
−1
. (i) Show that f has a fixed point, i.e., there exists an x ∈ R
such that f (x) = x. (ii) Find such an f that has a unique fixed point. Hint: start by
figuring out how do such functions look like.
3.6 Complements
Theorem 3.20 Suppose that f : A → B is monotonic (say, increasing), then it has
one-sided limits at every point that has a one-sided neighborhood that belongs to
A. (In particular, if A is a connected set then f has one-sided limits everywhere.)
Proof : Consider a segment (b, a] ∈ A. We need to show that f has a left-limit at
a. The set
K = ¦ f (x) : b < x < a¦
is non-empty and upper bounded by f (a) (since f is increasing). Hence we can
define = sup K. We are going to show that
= lim
a

f .
Let > 0. By the definition of the supremum, there exists an m ∈ K such that
m > − . Let y be such f (y) = m. By the monotonicity of f ,
− < f (x) ≤ whenever y < x < a.
Derivatives 121
This concludes the proof. We proceed similarly for right-limits. B
Theorem 3.21 Let f : [a, b] → R be monotonic (say, increasing). Then the image
of f is a segment if and only if f is continuous.
Proof : (i) Suppose first that f is continuous. By monotonicity,
Image( f ) ⊆ [ f (a), f (b)].
That every point of [ f (a), f (b)] is in the image of f follows from the intermediate-
value theorem.
(ii) Suppose then that Image( f ) = [ f (a), f (b)]. Suppose, by contradiction, that f
is not continuous. This implies the existence of a point c ∈ [a, b], where
f (c) > lim
c

f and/or f (c) < lim
c
+
f .
(We rely on the fact that one-sided derivatives exist.)
Say, for example, that f (c) < lim
c
+ f . Then,
f (c) <
f (c) + lim
c
+ f
2
< lim
c
+
f
is not in the image of f , which contradicts the fact that Image( f ) = [ f (a), f (b)].
B
Theorem 3.22 (Jean-Gaston Darboux) Let f : A → B be differentiable on (a, b),
including a right-derivative at a and a left-derivative at b. Then, f
·
has the
“intermediate-value property”, whereby for every t between f
·
(a
+
) and f
·
(b

)
there exists an x ∈ [a, b], such that f
·
(x) = t.
Comment: The statement is trivial when f is continuously differentiable (by
the intermediate-value theorem). It is nevertheless correct even if f
·
has dis-
continuities. This theorem thus limits the functions that are derivatives of other
functions—not every function can be a derivative.
122 Chapter 3
Proof : Without loss of generality, let us assume that f
·
(a
+
) > f
·
(b

). Let f
·
(a
+
) >
t > f
·
(b

), and consider the function
g(x) = f (x) − tx.
(Here t is a fixed parameter; for every t we can construct such a function.) We
have
g
·
(a
+
) = f
·
(a
+
) − t > 0 and g
·
(b

) = f
·
(b

) − t < 0,
and we wish to show the existence of an x ∈ [a, b] such that g
·
(x) = 0.
Since g is continuous on [a, b] then it must attain a maximum. The point a can-
not be a maximum point because g is increasing at a. Similarly, b cannot be a
maximum point because g is decreasing at b. Thus, there exists an interior point c
which is a maximum, and by Fermat’s theorem, g
·
(c) = f
·
(c) − t = 0. B
(32 hrs, 2009)
Example: The function
f (x) =
_
¸
¸
_
¸
¸
_
x
2
sin 1/x x 0
0 x = 0
is differentiable everywhere, and its derivative is
f
·
(x) =
_
¸
¸
_
¸
¸
_
2x sin 1/x − cos 1/x x 0
0 x = 0
.
The derivative is not continuous at zero, but nevertheless satisfes the Darboux
theorem.
Example: How crazy can a continuous function be? It was Weierstrass who
shocked the world by constructing a continuous function that it nowhere differ-
entiable! As instance of his class of example is
f (x) =

k=0
cos(21
k
πx)
3
k
.

Derivatives 123
3.7 Taylor’s theorem
Recall the mean-value theorem. If f : [a, b] → R is continuous on [a, b] and
differentiable in (a, b), then for every point x ∈ (a, b) there exists a point c
x
, such
that
f (x) − f (a)
x − a
= f
·
(c
x
).
We can write it alternatively as
f (x) = f (a) + f
·
(c
x
)(x − a).
Suppose, for example, that we knew that ¦ f
·
(y)¦ < M for all y. Then, we would
deduce that f (x) cannot differ from f (a) by more than M times the separation
¦x − a¦. In particular,
lim
a
[ f − f (a)] = 0.
It turns out that if we have more information, about higher derivatives of f , then
we can refine a lot the estimates we have on the variation of f as we get away
from the point a.
Theorem 3.23 Let f be a given function, and a an interior point of its domain.
Suppose that the derivatives
f
·
(a), f
··
(a), . . . , f
(n)
(a)
all exist. Let a
k
= f
(k)
(a)/k! and set the Taylor polynomial of degree n,
P
f ,n,a
(x) = a
0
+ a
1
(x − a) + a
2
(x − a)
2
+ + a
n
(x − a)
n
.
Then,
lim
a
f − P
f ,n,a
(Id −a)
n
= 0.
Comment: Loosely speaking, this theorem states that, as x approaches a, P
f ,n,a
(x)
is closer to f (x) than (x − a)
n
(or that f and P
f ,n,a
are “equal up to order n”; see
below). For n = 0 it states that
lim
a
[ f − f (a)] = 0,
124 Chapter 3
whereas for n = 1,
lim
a
_
f − f (a)
(Id −a)
− f
·
(a)
_
= lim
a

f ,a
− f
·
(a) = 0,
which holds by definition.
Proof : Consider the ratio
f − P
f ,n,a
(Id −a)
n
,
which is a ratio of n-times differentiable functions at a. Both numerator and de-
nominator vanish at a, so we will attempt to use l’Hˆ opital’s rule. For that we note
that
P
·
f ,n,a
= P
f
·
,n−1,a
.
Thus,
lim
a
f − P
f ,n,a
(Id −a)
n
= lim
a
f
·
− P
f
·
,n−1,a
n (Id −a)
n−1
,
provided that the right hand side exists. We then proceed to consider the right
hand side, which is the limit of the ratio of two differentiable functions that vanish
at a. Hence,
lim
a
f
·
− P
f
·
,n−1,a
n (Id −a)
n−1
= lim
a
f
··
− P
f
··
,n−2,a
n(n − 1) (Id −a)
n−2
,
provided that the right hand side exists. We proceed inductively, until we get that
lim
a
f − P
f ,n,a
(Id −a)
n
= lim
a
f
(n−1)
− P
f
(n−1)
,1,a
n! (Id −a)
,
provided that the right hand side exists. Now write the right hand side explicitly,
1
n!
lim
a
f
(n−1)
− f
(n−1)
(a) − f
(n)
(a)(Id −a)
Id −a
=
1
n!
_
lim
a

f
(n−1)
,a
− f
(n)
(a)
_
,
which vanishes by the definition of f
(n)
(a). This concludes the proof. B
(35 hrs, 2011)
The polynomial P
f ,n,a
(x) is known as the Taylor polynomial of degree n of f
about a. Its existence relies on f being differentiable n times at a. If we denote by
Π
n
the set of polynomials of degree at most n, then the defining property of P
f ,n,a
is:
Derivatives 125
1. P
f ,n,a
∈ Π
n
.
2. P
(k)
f ,n,a
(a) = f
(k)
(a) for k = 0, 1, . . . , n.
Definition 3.6 Let f , g be defined in a neighborhood of a. We say that f and g
are equal up to order n at a if
lim
a
f − g
(Id −a)
n
= 0.
It is easy to see that “equal up to order n” is an equivalence relation, which
defines an equivalence class.
Thus, the above theorem proves that f (x) and P
f ,n,a
(provided that the latter exists)
are equal up to order n at a.
The next task is to obtain an expression for the difference between the function f
and its Taylor polynomial. We define the remainder (תיראש) of order n, R
n
(x), by
f (x) = P
f ,n,a
(x) + R
f ,n,a
(x).
Proposition 3.1 Let P, Q ∈ Π
n
be equal up to order n at a. Then P = Q.
Proof : We can always express these polynomials as
P(x) = a
0
+ a
1
(x − a) + + a
n
(x − a)
n
Q(x) = b
0
+ b
1
(x − a) + + b
n
(x − a)
n
.
We know that
lim
a
P − Q
(Id −a)
k
= 0 for k = 0, 1, . . . , n.
For k = 0
0 = lim
a
(P − Q) = a
0
− b
0
,
i.e., a
0
= b
0
. We proceed similarly for each k to show that a
k
= b
k
for k =
0, 1, . . . , n. B
126 Chapter 3
Corollary 3.6 If f is n times differentiable at a and f is equal to Q ∈ Π
n
up to
order n at a, then
P
f ,n,a
= Q.
Example: We know from high-school math that for ¦x¦ < 1
f (x) =
1
1 − x
=

k=0
x
k
=
n

k=0
x
k
+
x
n+1
1 − x
.
Thus,
f (x) −
_
n
k=0
x
k
x
n
=
x
1 − x
.
Since the right hand side tends to zero as x → 0 it follows that f and
_
n
k=0
x
k
are
equal up to order n at 0, which means that
P
f ,n,0
(x) =
n

k=0
x
k
.

Example: Similarly for ¦x¦ < 1
f (x) =
1
1 + x
2
=

k=0
(−x
2
)
k
=
n

k=0
(−1)
k
x
2k
+ (−1)
n+1
x
2n+2
1 + x
2
.
Thus,
f (x) −
_
n
k=0
(−1)
k
x
2k
x
2n
=
x
2
1 + x
2
.
Since the right hand side tends to zero as x → 0 it follows that
P
f ,2n,0
(x) =
n

k=0
(−1)
k
x
2k
.

Derivatives 127
Example: Let f = arctan. Then,
f (x) =
_
x
0
dt
1 + t
2
=
_
x
0
_
¸
¸
¸
¸
¸
_
n

k=0
(−1)
k
t
2k
+ (−1)
n+1
t
2n+2
1 + t
2
_
¸
¸
¸
¸
¸
_
dt
=
n

k=0
(−1)
k
x
2k+1
2k + 1
+
_
x
0
(−1)
n+1
t
2n+2
1 + t
2
dt.
Since ¸
¸
¸
¸
¸
¸
¸
f (x) −
n

k=0
(−1)
k
x
2k+1
2k + 1
¸
¸
¸
¸
¸
¸
¸
≤ ¦x¦
2n+3
,
it follows that
P
f ,2n+2,0
=
n

k=0
(−1)
k
x
2k+1
2k + 1
.

(36 hrs, 2011)
Theorem 3.24 Let f , g be n times differentiable at a. Then,
P
f +g,n,a
= P
f ,n,a
+ P
g,n,a
,
and
P
f g,n,a
= [P
f ,n,a
P
g,n,a
]
n
,
where []
n
stands for a truncation of the polynomial in (Id −a)at the n-th power.
Proof : We use again the fact that if a polynomial “looks” like the Taylor polyno-
mial then it is. The sum P
f ,n,a
+ P
g,n,a
belongs to Π
n
and
lim
a
( f + g) − (P
f ,n,a
+ P
g,n,a
)
(Id −a)
n
= lim
a
f − P
f ,n,a
(Id −a)
n
+ lim
a
g − P
g,n,a
(Id −a)
n
= 0,
which proves the first statement.
Second, we note that P
f ,n,a
P
g,n,a
∈ Π
2n
, and that
f g − P
f ,n,a
P
g,n,a
(Id −a)
n
=
f g − P
f ,n,a
g + P
f ,n,a
g − P
f ,n,a
P
g,n,a
(Id −a)
n
= g
f − P
f ,n,a
(Id −a)
n
+ P
f ,n,a
g − P
g,n,a
(Id −a)
n
,
128 Chapter 3
whose limit at a is zero. All that remains it to truncate the product at the n-th
power of (Id −a). B
Theorem 3.25 (Taylor) Suppose that f is (n +1)-times differentiable on the inter-
val [a, x]. Then,
f (x) = P
f ,n,a
(x) + R
n, f ,a
(x),
where the remainder R
f ,n,a
(x) can be represented in the Lagrange form,
R
f ,n,a
(x) =
f
(n+1)
(c)
(n + 1)!
(x − a)
n+1
,
for some c ∈ (a, x). It can also be represented in the Cauchy form,
R
f ,n,a
(x) =
f
(n+1)
(ξ)
n!
(x − ξ)
n
(x − a),
for some ξ ∈ (a, x).
Comment: For n = 0 this is simply the mean-value theorem.
Proof : Fix x, and consider the function φ : [a, x] → R,
φ(z) = f (x) − P
f ,n,z
(x)
= f (x) − f (z) − f
·
(z)(x − z) −
f
··
(z)
2!
(x − z)
2
− −
f
(n)
(z)
n!
(x − z)
n
.
We note that
φ(a) = R
f ,n,a
(x) and φ(x) = 0.
Moreover, φ is differentiable, with
φ
·
(z) = −f
·
(z) −
_
f
··
(z)(x − z) − f
·
(z)
_

_
f
···
(z)
2!
(x − z)
2
− f
··
(z)(x − z)
_
− −
_
f
(n+1)
(z)
n!
(x − z)
n

f
(n)
(z)
(n − 1)!
(x − z)
n−1
_
= −
f
(n+1)
(z)
n!
(x − z)
n
.
Derivatives 129
Let now ψ be any function defined on [a, x], differentiable on (a, x) with non-
vanishing derivative. By Cauchy’s mean-value theorem, there exists a point a <
ξ < x, such that
φ(x) − φ(a)
ψ(x) − ψ(a)
=
φ
·
(ξ)
ψ
·
(ξ)
.
That is,
R
n
(x) =
ψ(x) − ψ(a)
ψ
·
(ξ)
f
(n+1)
(ξ)
n!
(x − ξ)
n
Set for example ψ(z) = (x − z)
p
for some p. Then,
R
n
(x) = (x − a)
p
f
(n+1)
(ξ)
p n!
(x − ξ)
n−p+1
For p = 1 we retrieve the Cauchy form. For p = n + 1 we retrieve the Lagrange
form. B
Comment: Later in this course we will study series and ask questions like “does
the Taylor polynomial P
n
(x) tend to f (x) as n → ∞?”. This is the notion of con-
verging series. Here we deal with a different beast: the degree of the polynomial
n is fixed and we consider how does P
n
(x) approach f (x) as x → a. The fact that
P
n
approaches f faster than (x − a)
n
as x → a means that P
n
is an asymptotic
series of f .
(34 hrs, 2009)
Exercise 3.12 Write the polynomial p(x) = x
3
−3x
2
+3x −6 as a polynomial
in x + 1. That is, find coefficients a
0
, a
1
, . . . , a
n
, such that p(x) =
_
n
k=0
a
k
(x + 1)
k
.
Solve this problem using two methods: (i) Newton’s binomial, and (ii) Taylor’s
polynomial.
Exercise 3.13 Let f : R → R be n times differentiable, such that f
(n)
(x) = 0
for all x ∈ R. Prove that f is a polynomial whose degree is at most n − 1.
Exercise 3.14 Write the Taylor polynomials of degree n at the point x
0
for
the following functions, as well as the corresponding remainders, both in the La-
grange and Cauchy forms:
' f (x) = 1/(1 + x), x
0
= 0, n ∈ N.
^ f (x) = cosh(x), x
0
= 0, n ∈ N.
130 Chapter 3
· f (x) = x
5
− x
4
+ 7x
3
− 2x
2
+ 3x + 1, x
0
= 1, n = 5.
´ f (x) = (x − 2)
6
+ 9(x + 1)
4
+ (x − 2)
2
+ 8, x
0
= 0, n = 6.
I f (x) = (1 + x)
α
, α N, x
0
= 0, n ∈ N. Use the notation
_
α
n
_
def
=
α(α − 1) . . . (α − n + 1)
n!
,
which generalizes Newton’s binomial.
I f (x) = sin(x), x
0
= π/2, n = 2k, k ∈ N.
Exercise 3.15 Show using different methods that of f
··
(a) exists, then
f
·
(a) = lim
h→0
f (a + h) + f (a − h) − 2 f (a)
h
.
' Using l’Hopital’s rule.
^ Using Taylor’s polynomial of order two f at a. Note that you can’t use
neither Lagrange or Cauchy’s form of the remainder, since it is only given
that f is twice differentiable at a.
· Show that if
f (x) =
_
¸
¸
_
¸
¸
_
x
2
x ≥ 0
−x
2
x < 0,
then f
··
(0) does not exist, but the limit of the above fraction does exist. Why
isn’t it a contradiction?
Exercise 3.16 Evaluate, using Taylor’s polynomial, the following expressions,
up to an accuracy of at least 0.01:
' e
1/2
.
^ cos(1/2).
Exercise 3.17
' For every polynomial P(x) =
_
n
k=0
a
k
x
k
and evert q ∈ N, we denote
P
q
(x) =
_
¸
¸
_
¸
¸
_
_
q
k=0
a
k
x
k
q < n
P(x) q ≥ n.
Derivatives 131
Suppose that f , g have Taylor polynomials of order n at zero, P and Q.
Suppose furthermore that g(0) = 0, and that g does not vanish in a punctured
neighborhood of zero. Show that the Taylor polynomial of order n at zero
of f ◦ g is given by (P ◦ Q)
n
. Hint: write f (g(x)) = f (g(x))/g(x) g(x).
^ Given the Taylor polynomial P of order n at zero of the function f , write
the Taylor polynomial of order n at zero of the function g(x) = f (2x).
· Find the Taylor polynomial of order three at zero of the function f (x) =
exp(sin(x)).
Exercise 3.18 Let f be n times differentiable at x
0
and let P(x) =
_
n
k=0
a
k
x
k
be
its Taylor polynomial at zero. What is the Taylor polynomial of f
·
of order n − 1
at zero. Justify your answer.
Exercise 3.19 Show that if f and g are n times differentiable at x
0
, then
( f g)
(n)
(x
0
) =
n

k=0
_
n
k
_
f
(k)
(x
0
) g
(n−k)
(x
0
).
Hint: think of the Taylor polynomial of a product of functions.
Exercise 3.20 Suppose that f and g have derivatives of all orders in R and
that P
n
(x) = Q
n
(x) for all n ∈ N and every x ∈ R, where P
n
and Q
n
are the
corresponding Taylor polynomials at zero. Does it imply that f (x) = g(x)?
132 Chapter 3
Chapter 4
Integration theory
4.1 Definition of the integral
Derivatives were about calculating the rate of change of a function. Integrals were
invented in order to calculate areas. The fact that the two are intimately related—
that derivatives are in a sense the inverse of integrals—is a wonder.
When we learn in school about the area of a region, it is defined as “the number
of squares with sides of length one that fit into the region”. It is of course hard
to make sense with such a definition. We will learn how to calculate the area of
a region bounded by the x-axis, the graph of a function f , and two vertical lines
passing though points (a, 0) and (b, 0). For the moment let us assume that f ≥ 0.
This region will be denoted by R( f , a, b), and its area (once defined) will be called
the integral of f on [a, b]
1
.
1
There are multiple non-equivalent ways to define integrals. We study integration theory in the
sense of Riemann; another classical integration theory is that of Lebesgue (studied in the context
of measure theory). Within the Riemann integration theory, we adopt a particular derivation due
to Darboux.
134 Chapter 4
a b
f
R!f,a,b"
Definition 4.1 A partition (הקולח) of the interval [a, b] is a partition (yes, I
know, you can’t use the word to be defined in a definition) of the interval into a
finite number of segments,
[a, b] = [x
0
, x
1
] ∪ [x
1
, x
2
] ∪ ∪ [x
n−1
, x
n
].
We will identify a partition with a finite number of points,
a = x
0
< x
1
< x
2
< < x
n
= b.
Definition 4.2 Let f : [a, b] → Rbe a bounded function, and let P = ¦x
0
, x
1
, . . . , x
n
¦
be a partition of [a, b]. Let
m
i
= inf¦ f (x) : x
i−1
≤ x ≤ x
i
¦
M
i
= sup¦ f (x) : x
i−1
≤ x ≤ x
i
¦.
We define the lower sum (ותחת וכס) of f for P to be
L( f , P) =
n

i=1
m
i
(x
i
− x
i−1
).
We define the upper sum (וילע וכס) of f for P to be
U( f , P) =
n

i=1
M
i
(x
i
− x
i−1
).
Clearly, any sensible definition of the area A of R( f , a, b) must satisfy
L( f , P) ≤ A ≤ U( f , P)
for all partitions P.
Integration theory 135
a b
f
x
0
x
1
x
2
x
3
Figure 4.1: The lower and upper sums of f for P.
Comments:
' The boundedness of f is essential in these definitions.
^ It is always true that L( f , P) ≤ U( f , P) because m
i
≤ M
i
for all i.
· Is it however true that L( f , P
1
) ≤ U( f , P
2
) where P
1
and P
2
are arbitrary
partitions? It ought to be true.
(35 hrs, 2009)
(38 hrs, 2011)
Exercise 4.1 Let f : [−1, 1] → R be given by f : x ·→ x
2
and let P =
¦−1, −1/2, −1/3, 2/3, 8/9, 1¦ be a partition of [−1, 1]. Calculate L( f , P) and U( f , P).
Definition 4.3 A partition Q is called a refinement (ודיע) of a partition P if
P ⊂ Q.
Lemma 4.1 Let Q be a refinement of P. Then
L( f , Q) ≥ L( f , P) and U( f , Q) ≤ U( f , P).
136 Chapter 4
Proof : Suppose first that Q differs from P by exactly one point. That is,
P = ¦x
0
, x
1
, . . . , x
n
¦
Q = ¦x
0
, x
1
, . . . , x
j−1
, y, x
j
, . . . , x
n
¦.
Then,
L( f , P) =
n

i=1
m
i
(x
i
− x
i−1
)
and
L( f , Q) =
j−1

i=1
m
i
(x
i
− x
i−1
) + ˜ m
j
(y − x
j−1
) + ˆ m
j
(x
j
− y) +
n

i=j+1
m
i
(x
i
− x
i−1
),
where
˜ m
j
= inf¦ f (x) : x
j−1
≤ x ≤ y¦
ˆ m
j
= inf¦ f (x) : y ≤ x ≤ x
j
¦.
Thus,
L( f , Q) − L( f , P) = ( ˜ m
j
− m
j
)(y − x
j−1
) + ( ˆ m
j
− m
j
)(x
j
− y).
It remains to note that ˜ m
j
≥ m
j
and ˆ m
j
≥ m
j
, since both are infimums over a
smaller set
2
. Thus, L( f , Q) − L( f , P) ≥ 0.
Consider now two partitions, P ⊂ Q. There exists a sequence of partitions,
P = P
0
⊂ P
1
⊂ ⊂ P
k
= Q,
such that each P
j
differs from P
j−1
by one point. Hence,
L( f , P) ≤ L( f , P
1
) ≤ ≤ L( f , Q).
A similar argument works for the upper sums. B
Theorem 4.1 For every two partitions P
1
,P
2
of [a, b],
L( f , P
1
) ≤ U( f , P
2
).
2
It is always true that if A ⊂ B then
inf A ≥ inf B and sup A ≤ sup B.
Integration theory 137
Proof : Let Q = P
1
∪ P
2
. Since Q is finer than both P
1
and P
2
it follows that
L( f , P
1
) ≤ L( f , Q) ≤ U( f , Q) ≤ U( f , P
2
).
B
Exercise 4.2 Prove or disprove:
' A partition Q is a refinement of a partition P if and only if every segment in
Q is contained in a segment of P.
^ If a partition P is a refinement of a partition Q then every segment in Q is a
segment in P.
· If R is a partition of both P and Q then it is also a refinement of P ∪ Q.
Exercise 4.3 Characterize the functions f for which there exists a partition P
such that L( f , P) = U( f , P).
We may now consider the collection of lower and upper sums,
L( f ) = ¦L( f , P) : P is a partition of [a, b]¦
U ( f ) = ¦U( f , P) : P is a partition of [a, b]¦.
Note that both L( f ) and U ( f ) are sets of real numbers, and not sets of partitions.
The set L( f ) is non-empty and bounded from above by any element of U ( f ).
Similarly, the set U ( f ) is non-empty and bounded from below by any element of
L( f ). Thus, we may define the least upper bound and greatest lower bound,
L( f ) = sup L( f ) and U( f ) = inf U ( f ).
The real number U( f ) is called the upper integral (וילע לרגטניא) of f on [a, b]
and L( f ) is called the lower integral (ותחת לרגטניא) of f on [a, b]
We once proved a theorem stating that the following assertions are equivalent:
' L( f ) = U( f ).
^ There is a unique number c, such that ≤ c ≤ u for all ∈ L( f ) and
u ∈ U ( f ).
· For every > 0 there exist an ∈ L( f ) and a u ∈ U ( f ), such that u− < .
138 Chapter 4
Definition 4.4 If L( f ) = U( f ) we say that f is integrable on [a, b]. We call this
number the integral of f on [a, b], and denote it by
_
b
a
f .
That is, f is integrable on [a, b] if the upper and lower integrals coincide.
Comment: The integral of f is the unique number such that
L( f , P) ≤
_
b
a
f ≤ U( f , Q),
for all partitions P and Q.
Comment: You are all accustomed to the notation
_
b
a
f (x) dx.
This notation is indeed very suggestive about the meaning of the integral, however,
it is very important to realize that the integral only involves f , a and b, and it is
not a function of any x.
Comment: At this stage we only consider integrals of bounded functions over
bounded segments—so-called proper integrals. Integrals like
_

a
f and
_
1
0
log x dx,
are instances of improper integrals. We will only briefly talk about them (only
due to a lack of time).
Comment: What about the case where the sign of f is not everywhere positive?
The definition remains the same. The geometric interpretation is that areas under
the x-axis count as “negative areas”.
Two questions arise:
' How do we know whether a function is integrable?
Integration theory 139
^ How to compute the integral of an integrable function?
(40 hrs, 2011)
The following lemma will prove useful in checking whether a function is inte-
grable:
Theorem 4.2 A bounded function f is integrable if and only if there exists for
every > 0 a partition P, such that
U( f , P) < L( f , P) + .
Proof : If for every > 0 there exists a partition P, such that
U( f , P) < L( f , P) + ,
then f is integrable, by the equivalent criteria described above.
Conversely, suppose f is integrable. Then, given > 0, there exist partitions P, P
·
,
such that
U( f , P) − L( f , P
·
) < .
Let P
··
= P ∪ P
·
. Since P
··
is a refinement of both, then
U( f , P
··
) − L( f , P
··
) ≤ U( f , P) − L( f , P
·
) < .
B
Example: Let f : [a, b] → R, f : x ·→ c. Then for every partition P,
L( f , P) =
n

i=1
m
i
(x
i
− x
i−1
) = c(b − a)
U( f , P) =
n

i=1
M
i
(x
i
− x
i−1
) = c(b − a).
It follows that L( f ) = U( f ) = c(b − a), hence f is integrable and
_
b
a
f = c(b − a).

140 Chapter 4
Example: Let f : [0, 1] → R,
f : x ·→
_
¸
¸
_
¸
¸
_
0 x Q
1 x ∈ Q.
For every partition and every j,
m
j
= 0 and M
j
= 1,
hence
L( f , P) = 0 and U( f , P) = 1,
and L( f ) = 0 and U( f ) = 1. Since L( f ) U( f ), f is not integrable.
(41 hrs, 2010)
Example: Consider next the function f : [0, 2] → R,
f : x ·→
_
¸
¸
_
¸
¸
_
0 x 1
1 x = 1.
Let P
n
, n ∈ N denote the family of uniform partitions,
x
i
=
2i
n
.
For every such partition,
L( f , P) = 0 and U( f , P) = 2/n.
Set now,
L
unif
( f ) = ¦L( f , P
n
) : n ∈ N¦
U
unif
( f ) = ¦U( f , P
n
) : n ∈ N¦
Clearly, L
unif
( f ) ⊆ L( f ) and U
unif
( f ) ⊆ U ( f ), hence
L( f ) = sup L( f ) ≥ sup L
unif
( f ) = 0
U( f ) = inf U ( f ) ≤ inf U
unif
( f ) = 0,
thus
0 ≤ L( f ) ≤ U( f ) ≤ 0,
Integration theory 141
which implies that f is integrable, and
_
2
0
f = 0.
This example shows that integrals are not affected by the value of a function at an
isolated point.
We are next going to examine a number of examples, which on the one hand, will
practice our understanding of integrability, and on the other hand, will convince
us that we must develop tools to deal with more complicated functions.
Example: Let f : [0, b] → R, f = Id. Let P be any partition, then m
i
= x
i−1
and
M
i
= x
i
, hence
L( f , P) =
n

i=1
x
i−1
(x
i
− x
i−1
)
U( f , P) =
n

i=1
x
i
(x
i
− x
i−1
).
Using again uniform partition,
P
n
= ¦0, b/n, 2b/n, . . . , b¦,
L( f , P
n
) =
n

i=1
(i − 1)b
n
b
n
=
b
2
n
2
n(n − 1)
2
=
b
2
2
_
1 −
1
n
_
U( f , P
n
) =
n

i=1
ib
n
b
n
=
b
2
n
2
n(n + 1)
2
=
b
2
2
_
1 +
1
n
_
.
As in the previous example,
L( f ) = sup L( f ) ≥ sup L
unif
( f ) =
b
2
2
U( f ) = inf U ( f ) ≤ inf U
unif
( f ) =
b
2
2
,
hence f is integrable and
_
b
0
f =
b
2
2
.

(37 hrs, 2009)
142 Chapter 4
Example: Let f : [0, b] → R, f (x) = x
2
. Let P be any partition, then m
i
= x
2
i−1
and M
i
= x
2
i
, hence
L( f , P) =
n

i=1
x
2
i−1
(x
i
− x
i−1
)
U( f , P) =
n

i=1
x
2
i
(x
i
− x
i−1
).
Take the same uniform partition, then
3
L( f , P
n
) =
n

i=1
(i − 1)
2
b
2
n
2
b
n
=
b
3
n
3
(n − 1)n(2n − 1)
6
=
b
3
3
_
1 −
1
n
_ _
1 −
1
2n
_
U( f , P
n
) =
n

i=1
i
2
b
2
n
2
b
n
=
b
3
n
3
n(n + 1)(2n + 1)
6
=
b
3
3
_
1 +
1
n
_ _
1 +
2
n
_
.
Since
L( f ) ≥ sup L
unif
( f ) = b
3
/3 and U( f ) ≤ inf U
unif
( f ) = b
3
/3,
it follows that
_
b
0
f =
b
3
3
.

This game can’t go much further, i.e., we need stronger tools. We start by verify-
ing which classes of functions are integrable.
(42 hrs, 2010)
4.2 Integration theorems
Theorem 4.3 If f [a, b] → R is monotone, then it is integrable.
3
n

k=1
k
2
=
1
6
n(n + 1)(2n + 1).
Integration theory 143
Proof : Assume, without loss of generality, that f is monotonically increasing.
Then, for every partition P,
L( f , P) =
n

i=1
f (x
i−1
)(x
i
− x
i−1
) and U( f , P) =
n

i=1
f (x
i
)(x
i
− x
i−1
),
hence,
U( f , P) − L( f , P) =
n

i=1
[ f (x
i
) − f (x
i−1
)](x
i
− x
i−1
).
Take now a family of equally-spaced partitions, P
n
= ¦a + j(b −a)/n¦
n
j=0
. For such
partitions,
U( f , P
n
) − L( f , P
n
) =
b − a
n
n

i=1
[ f (x
i
) − f (x
i−1
)] =
(b − a)[ f (b) − f (a)]
n
.
Thus, for every > 0 we can choose n > (b −a)[ f (b) − f (a)]/, which guarantees
that U( f , P
n
) − L( f , P
n
) < (we used the fact that the sum was “telescopic”). B
Theorem 4.4 If f [a, b] → R is continuous, then it is integrable.
Proof : First, a continuous function on a segment is bounded. We want to show
that for every > 0 there exists a partition P, such that
U( f , P) < L( f , P) + .
We need to somehow control the difference between m
i
and M
i
. Recall that a
function that is continuous on a closed interval is uniformly continuous. That is,
there exists a δ > 0, such that
¦ f (x) − f (y)¦ <

b − a
whenever ¦x − y¦ < δ.
Take then a partition P such that
x
i
− x
i−1
< δ for all i = 1, . . . , n.
Since continuous functions on closed intervals assume minima and maxima in the
interval, it follows that M
i
− m
i
< /(b − a), and
U( f , P) − L( f , P) =
n

i=1
(M
i
− m
i
)(x
i
− x
i−1
) <

b − a
n

i=1
(x
i
− x
i−1
) = .
144 Chapter 4
B
(42 hrs, 2011)
Theorem 4.5 Let f be a bounded function on [a, b], and let a < c < b. Then, f is
integrable on [a, b] if and only if it is integrable on both [a, c] and [c, b], in which
case
_
b
a
f =
_
c
a
f +
_
b
c
f .
Proof : (1) Suppose first that f is integrable on [a, b]. It is sufficient to show that it
is integrable on [a, c]. For every > 0 there exists a partition P of [a, b], such that
U( f , P) − L( f , P) < .
If the partition P does not include the point c then let Q = P ∪ ¦c¦, and
U( f , Q) − L( f , Q) ≤ U( f , P) − L( f , P) < .
Suppose that x
j
= c, then P
·
= ¦x
0
, . . . , x
j
¦ is a partition of [a, c], and P
··
=
¦x
j
, . . . , x
n
¦ is a partition of [c, b]. We have
L( f , Q) = L( f , P
·
) + L( f , P
··
) and U( f , Q) = U( f , P
·
) + U( f , P
··
),
from which we deduce that
[U( f , P
·
) − L( f , P
·
)] = [U( f , Q) − L( f , Q)]
.,,.
<
−[U( f , P
··
) − L( f , P
··
)]
.,,.
≥0
< ,
hence f is integrable on [a, c].
(2) Conversely, suppose that f is integrable on both [a, c] and [c, b]. Then there a
exists a partition P
·
of [a, c] and a partition P
··
of [c, b], such that
U( f , P
·
) − L( f , P
·
) <

2
and U( f , P
··
) − L( f , P
··
) <

2
.
Since P = P
·
∪P
··
is a partition of [a, b], summing up, we obtain the desired result.
Integration theory 145
Thus, we have shown that f is integrable on [a, b] if and only if it is integrable on
both [a, c] and [c, b]. Next, since
L( f , P
·
) ≤
_
c
a
f ≤ U( f , P
·
) for every partition P
·
of [a, c]
L( f , P
··
) ≤
_
b
c
f ≤ U( f , P
··
) for every partition P
··
of [c, b],
it follows that
L( f , P) ≤
_
c
a
f +
_
b
c
f ≤ U( f , P) for every partition P of [a, b].
(More precisely, for every partition P that is a union of partitions of P
·
and P
··
,
but it is easy to see that this implies for every partition.) Since there is only one
such number, it is equal to
_
b
a
f . B
(39 hrs, 2009)
Comment: So far, we have only defined the integral on [a, b] with b > a. We now
add the following conventions,
_
a
b
f = −
_
b
a
f and
_
a
a
f = 0,
so that (as can easily be checked) the addition rule always holds.
Theorem 4.6 If f and g are integrable on [a, b], then so is f + g, and
_
b
a
( f + g) =
_
b
a
f +
_
b
a
g.
Proof : First of all, we observe that if f and g are bounded functions on a set A,
then
4
inf¦ f (x) + g(x) : x ∈ A¦ ≥ inf¦ f (x) : x ∈ A¦ + inf¦g(x) : x ∈ A¦
sup¦ f (x) + g(x) : x ∈ A¦ ≤ sup¦ f (x) : x ∈ A¦ + sup¦g(x) : x ∈ A¦.
4
This is because for every x,
f (x) + g(x) ≥ inf¦ f (x) : x ∈ A¦ + inf¦g(x) : x ∈ A¦,
so that the right-hand side is a lower bound of f + g.
146 Chapter 4
This implies that for every partition P,
L( f + g, P) ≥ L( f , P) + L(g, P)
U( f + g, P) ≤ U( f , P) + U(g, P),
hence,
U( f + g, P) − L( f + g, P) ≤ [U( f , P) − L( f , P)] + [U(g, P) − L(g, P)]. (4.1)
We are practically done. Given > 0 there exist partitions P
·
, P
··
, such that
U( f , P
·
) − L( f , P
·
) <

2
and U(g, P
··
) − L(g, P
··
) <

2
.
Take P = P
·
∪ P
··
, then
U( f , P) − L( f , P) ≤ U( f , P
·
) − L( f , P
·
) <

2
U(g, P) − L(g, P) ≤ U(g, P
··
) − L(g, P
··
) <

2
,
and it remains to substitute into (4.1), to prove that f + g is integrable.
Now, for every partition P,
L( f , P) + L(g, P) ≤ L( f + g, P) ≤
_
b
a
( f + g) ≤ U( f + g, P) ≤ U( f , P) + U(g, P),
and also
L( f , P) + L(g, P) ≤
_
b
a
f +
_
b
a
g ≤ U( f , P) + U(g, P),
It follows that
¸
¸
¸
¸
¸
¸
_
b
a
( f + g) −
_
b
a
f −
_
b
a
g
¸
¸
¸
¸
¸
¸
≤ [U( f , P) + U(g, P)] − [L( f , P) + L(g, P)].
Since the right-hand side can be made less than for every , it follows that it is
zero
5
. B
5
If ¦x¦ < for every > 0 then x = 0.
Integration theory 147
Corollary 4.1 If f is integrable on [a, b] and g differs from f only at a finite num-
ber of points then
_
b
a
f =
_
b
a
g.
Proof : We saw that (g − f ), which is non-zero only at a finite number of points, is
integrable and its integral is zero. Then we only need to use the additivity of the
integral for g = f + (g − f ). B
Exercise 4.4 Denote by
_
the upper integral. Let f , g be bounded on [a, b].
Show that
_ b
a
( f + g) ≤
_ b
a
f +
_ b
a
g.
Also, find two functions f , g for which the inequality is strong.
Exercise 4.5 Recall the Riemann function,
R(x) =
_
¸
¸
_
¸
¸
_
1/q x = p/q ∈ Q
0 x Q.
Let [a, b] be a gievn interval.
' Show that
_
b
a
R = 0 (the lower integral).
^ Let > 0 and define
Z

= ¦x ∈ [a, b] : R(x) > ¦.
Show that Z

is a finite set.
· Show that there exists a partition P such that U(R, P) < .
´ Conclude that R is integrable on [a, b] and that
_
b
a
R = 0.
(44 hrs, 2010)
148 Chapter 4
Theorem 4.7 If f is integrable on [a, b] and c ∈ R, then c f is integrable on [a, b]
and
_
b
a
(c f ) = c
_
b
a
f .
Proof : Start with the case where c > 0. For every partition P,
L(c f , P) = c L( f , P) and U(c f , P) = c U( f , P).
Since f is integrable on [a, b], then there exists for every > 0 a partition P, such
that
U( f , P) − L( f , P) <

c
,
hence
U(c f , P) − L(c f , P) < ,
which proves that c f is integrable on [a, b]. Then, we note that for every partition
P,
L(c f , P) = c L( f , P) ≤ c
_
b
a
f ≤ c U( f , P) = U(c f , P).
Since there is only one number such that L(c f , P) ≤ ≤ U(c f , P) for all parti-
tions P, and this number is by definition
_
b
a
(c f ), then
_
b
a
(c f ) = c
_
b
a
f .
Consider then the case where c < 0; in fact, it is sufficient to consider the case of
c = −1, since for arbitrary c < 0 we write c f = (−1)¦c¦ f . For every partition P,
L(−f , P) = −U( f , P) and U(−f , P) = −L( f , P).
Since f is integrable on [a, b], then there exists for every > 0 a partition P, such
that
U( f , P) − L( f , P) < ,
hence
U(−f , P) − L(−f , P) = −(L( f , P) − U( f , P)) < ,
Integration theory 149
which proves that (−f ) is integrable on [a, b]. Then, we note that for every parti-
tion P,
L(−f , P) = −U( f , P) ≤ −
_
b
a
f ≤ −L( f , P) = U(−f , P),
and the rest is as above. B
(45 hrs, 2010)
This last theorem is in fact a special case of a more general theorem:
Theorem 4.8 Suppose that both f and g are integrable on [a, b], then so is their
product f g.
Proof : We will prove this theorem for the case where f , g ≥ 0. The generalization
to arbitrary functions is not hard (for example, since f and g are bounded, we
can add to each a constant to make it positive and then use the “arithmetic of
integrability”).
Let P be a partition. It is useful to introduce the following notations:
m
f
j
= inf¦ f (x) : x
j−1
≤ x ≤ x
j
¦
m
g
j
= inf¦g(x) : x
j−1
≤ x ≤ x
j
¦
m
f g
j
= inf¦ f (x)g(x) : x
j−1
≤ x ≤ x
j
¦
M
f
j
= sup¦ f (x) : x
j−1
≤ x ≤ x
j
¦
M
g
j
= sup¦g(x) : x
j−1
≤ x ≤ x
j
¦
M
f g
j
= sup¦ f (x)g(x) : x
j−1
≤ x ≤ x
j
¦.
We also define
M
f
= sup¦¦ f (x)¦ : a ≤ x ≤ b¦
M
g
= sup¦¦g(x)¦ : a ≤ x ≤ b¦.
For every x
j−1
≤ x ≤ x
j
,
m
f
j
m
g
j
≤ f (x)g(x) ≤ M
f
j
M
g
j
,
hence
m
f
j
m
g
j
≤ m
f g
j
and M
f g
j
≤ M
f
j
M
g
j
.
150 Chapter 4
It further follows that
M
f g
j
−m
f g
j
≤ M
f
j
M
g
j
−m
f
j
m
g
j
≤ (M
f
j
−m
f
j
)M
g
j
+m
f
j
(M
g
j
−m
g
j
) ≤ M
g
(M
f
j
−m
f
j
)+M
f
(M
g
j
−m
g
j
).
This inequality implies that
U( f g, P) − L( f g, P) ≤ M
g
[U( f , P) − L( f , P)] + M
f
[U(g, P) − L(g, P)].
It is then a simple task to show that the integrability of f and g implies the inte-
grability of f g. B
Theorem 4.9 If f is integrable on [a, b] and m ≤ f (x) ≤ M for all x, then
m(b − a) ≤
_
b
a
f ≤ M(b − a).
Proof : Let P
0
= ¦x
0
, x
n
¦, then
L( f , P
0
) ≥ m(b − a) and U( f , P
0
) ≤ M(b − a),
and the proof follows from the fact that
L( f , P
0
) ≤
_
b
a
f ≤ U( f , P
0
).
B
Suppose now that f is integrable on [a, b]. We proved that it is integrable on every
sub-segment. Define for every x,
F(x) =
_
x
a
f .
Theorem 4.10 The function F defined above is continuous on [a, b].
Integration theory 151
Proof : Since f is integrable it is bounded. Define
M = sup¦¦ f (x)¦ : a ≤ x ≤ b¦.
Given > 0 take δ = /M. Then for ¦x − y¦ < δ (suppose wlog that y > x),
F(y) − F(x) =
_
y
a
f −
_
x
a
f =
_
y
x
f .
By the previous theorem, since −M ≤ f (x) ≤ M for all x,
−M(y − x) ≤ F(y) − F(x) ≤ M(y − x),
i.e., ¦F(y) − F(x)¦ ≤ M(y − x) < . That is, given > 0, we take δ = /M, and then
¦F(y) − F(x)¦ < whenever ¦x − y¦ < δ,
which proves that F is (uniformly) continuous on [a, b]. B
(41 hrs, 2009)
Exercise 4.6 Let f be bounded and integrable in [a, b] and denote M = sup¦¦ f (x)¦ :
a ≤ x ≤ b¦. Show that for every p > 0,
__
b
a
¦ f ¦
p
_
1/p
≤ M¦b − a¦
1/p
,
and in particular, that the integral on the left hand side exists.
Exercise 4.7 Show that if f is bounded and integrable on [a, b], then ¦ f ¦ is
integrable on [a, b] and
¸
¸
¸
¸
¸
¸
_
b
a
f
¸
¸
¸
¸
¸
¸

_
b
a
¦ f ¦.
Exercise 4.8 Let a < b and c > 0 and let f be integrable on [ca, cb]. Show
that
_
cb
ca
f = c
_
b
a
g,
where g(x) = f (cx).
152 Chapter 4
Exercise 4.9 Let f be integrable on [a, b] and g(x) = f (x − c). Show that g is
integrable on [a + c, b + c] and
_
b
a
f =
_
b+c
a+c
g.
Exercise 4.10 Let f : [a, b] → R be bounded, and define the variation of f in
[a, b] as
ω
f
= sup¦ f (x) : a ≤ x ≤ b¦ − inf¦ f (x) : a ≤ x ≤ b¦.
' Prove that
ω
f
= sup¦ f (x
1
)−f (x
2
) : a ≤ x
1
, x
2
≤ b¦ = sup¦¦ f (x
1
)−f (x
2
)¦ : a ≤ x
1
, x
2
≤ b¦.
^ Show that f is integrable on [a, b] if and only if there exists for every > 0
a partition P = ¦x
0
, . . . , x
n
¦, such that
_
n
i=1
ω
i
< , where ω
i
is the variation
of f in the segment [x
i−1
, x
i
].
· Prove that if f is integrable on [a, b], and there exists an m > 0 such that
¦ f (x)¦ ≥ m for all x ∈ [a, b], then 1/ f is integrable on [a, b].
Exercise 4.11 A function s : [a, b] → R is called a step function (תיצקנופ
תוגרדמ) if there exists a partition p such that s is constant on every open interval
(we do not care for the values at the nodes),
' Show that if f is integrable on [a, b] then there exist for every > 0 two
step functions, s
1
≤ f and s
2
≥ f , such that
_
b
a
f −
_
b
a
s
1
< and
_
b
a
s
2

_
b
a
f < .
^ Show that if f is integrable on [a, b] then there exists for every > 0 a
continuous function g, g ≤ f , such that
_
b
a
f −
_
b
a
g < .
Discussion: A question that occupied many mathematicians in the second half
of the 19th century is what characterizes integrable functions. Hermann Hankel
(1839-1873) proved that if a function is integrable, then it is necessarily continu-
ous on a dense set of points (he also proved the opposite, but had an error in his
Integration theory 153
proof). The ultimate answer is due to Henri Lebesgue (1875-1941) who proved
that a bounded function is integrable if and only if its points of discontinuity have
measure zero (a notion not defined in this course).
(44 hrs, 2011)
4.3 The fundamental theorem of calculus
Theorem 4.11 ((ילאיצנרפידה ובשחה לש ידוסיה טפשמה)) Let f be inte-
grable on [a, b] and define for every x ∈ [a, b],
F(x) =
_
x
a
f .
If f is continuous at c ∈ (a, b), then F is differentiable at c and
F
·
(c) = f (c).
Comment: The theorem states that in a certain way the integral is an operation
inverse to the derivative. Note the condition on f being continuous. For example,
if f (x) = 0 everywhere except for the point x = 1 where it is equal to one, then
F(x) = 0, and it is not true that F
·
(1) = f (1).
Proof : We need to show that
lim
c

F,c
= f (c).
We are going to show that this holds for the right derivative; the same arguments
can then be applied for the left derivative.
For x > c,
F(x) − F(c) =
_
c
x
f .
If we define
m(c, x) = inf¦ f (y) : x < y < c¦
M(c, x) = sup¦ f (y) : x < y < c¦,
154 Chapter 4
then,
m(c, x)(x − c) ≤ F(x) − F(c) ≤ M(c, x)(x − c),
or
m(c, x) ≤ ∆
F,c
(x) ≤ M(c, x).
Then,
¸
¸
¸∆
F,c
(x) − f (c)
¸
¸
¸ ≤ max ¦¦m(c, x) − f (y)¦, ¦M(c, x) − f (y)¦¦ .
It remains that the right-hand side as a limit zero at c. Since f is continuous at c,
(∀ > 0)(∃δ > 0) : (∀c < y < c + δ)(¦ f (y) − f (c)¦ < /2).
For c < x < c + δ,
(∀ > 0)(∃δ > 0) : (∀c < y < c + x)(¦ f (y) − f (c)¦ < /2),
hence
¦m(c, x) − f (c)¦ ≤ /2 < e and ¦M(c, x) − f (c)¦ ≤ /2 < e.
To conclude, for every > 0 there exists a δ > 0 such that for all c < x < c + δ,
¸
¸
¸∆
F,c
(x) − f (c)
¸
¸
¸ < ,
which concludes the proof. B
Corollary 4.2 (Newton-Leibniz formula) If f is continuous on [a, b] and f = g
·
for some function g, then
_
b
a
f = g(b) − g(a).
Proof : We have seen that
F(x) =
_
x
a
f
satisfies the identity F
·
(x) = f (x) for all x (since f is continuous on [a, b]). We
have also seen that it implies that F = g + c, for some constant c, hence
_
x
a
f = F(x) = g(x) + c,
Integration theory 155
but since F(a) = 0 it follows that c = −g(a). Setting x = b we get the desired
result. B
This is a very useful result. It means that whenever we know a pair of functions
f , g, such that g
·
= f , we have an easy way to calculate the integral of f . On the
other hand, it is totally useless if we want to calculate the integral of f , but no
primitive function (המודק היצקנופ) g is known.
(43 hrs, 2009) (47 hrs, 2010)
Exercise 4.12 Let f : R → R be continuous and g, h : (a, b) → R differen-
tiable. Define F : (a, b) → R by
F(x) =
_
h(x)
g(x)
f .
' Show that F is well-defined for all x ∈ (a, b).
^ Show that F is differentiable, and calculate its derivative.
Exercise 4.13 Differentiate the following functions:
'
_
1
0
cos(t
2
) dt.
^
_
e
x
0
cos(t) dt.
·
_
x
2
x
e
sin(t)
dt.
Exercise 4.14 Let f : R → R be continuous. For every δ > 0 we define
F
δ
(x) =
1

_
x+δ
x−δ
f .
Show that for every x ∈ R,
f (x) = lim
δ→0
+
F
δ
(x).
Exercise 4.15 Let f , g : [a, b] → R be integrable, such that g(x) ≥ 0 for all
x ∈ [a, b].
' Show that there exists a λ ∈ R, such that
inf¦ f (x) : a ≤ x ≤ b¦ ≤ λ ≤ sup¦ f (x) : a ≤ x ≤ b¦,
156 Chapter 4
and
_
b
a
( f g) = λ
_
b
a
g.
Hint: distinguish between the cases where
_
b
a
g > 0 and
_
b
a
g = 0.
^ Show that if f is continuous, then there exists a c ∈ [a, b], such that
_
b
a
( f g) = f (c)
_
b
a
g.
(This is known as the integral mean value theorem.)
Exercise 4.16 Let f : [a, b] → R be monotonic and let g : [a, b] → R be
non-negative and integrable. Show that there exists a ξ ∈ [a, b], such that
_
b
a
( f g) = f (a)
_
ξ
a
g + f (b)
_
b
ξ
g.
Exercise 4.17 Let f : [a, b] → R be integrable, c ∈ (a, b), and F(x) =
_
x
a
f .
Prove or disprove:
' If f is differentiable at c then F is differentiable at c.
^ If f is differentiable at c then F
·
is continuous at c.
· If f is differentiable in a neighborhood of c and f
·
is continuous at c then F
is twice differentiable at c.
4.4 Riemann sums
Let’s ask a practical question. Let f be a given integrable function on [a, b], and
suppose we want to calculate the integral of f on [a, b], however we do not know
any primitive function of f . Suppose that we are content with an approximation
of this integral within a maximum error of . Take a partition P: since
L( f , P) ≤
_
b
a
f ≤ U( f , P),
it follows that if the difference between the upper and lower sum is less than ,
then their average differs from the integral by less that /2. If the difference is
Integration theory 157
greater than , then it can be reduced by refining the partition. Thus, up to the fact
that we have to compute lower and upper sums, we can keep refining until they
differ by less than , and so we obtain an approximation to the integral within the
desired accuracy.
There is one drawback in this plan. Infimums and supremums may be hard to
calculate. On the other hand, since m
i
≤ f (x) ≤ M
i
for all x
i−1
≤ x ≤ x
i
, it follows
that if in every interval in the partition we take some point t
i
(a sample point),
then
L( f , P) ≤
n

i=1
f (t
i
)(x
i−1
− x
i
) ≤ U( f , P).
The sum in the middle is called a Riemann sum. Its evaluation only requires
function evaluations, without any inf or sup. If the partition is sufficiently fine,
such that U( f , P) − L( f , P) < , then
¸
¸
¸
¸
¸
¸
¸
n

i=1
f (t
i
)(x
i−1
− x
i
) −
_
b
a
f
¸
¸
¸
¸
¸
¸
¸
< .
Note that we don’t know a priori whether a Riemann is greater or smaller than the
integral. What we are going to show is that Riemann sums converge to the integral
as we refine the partition, irrespectively on how we sample the points t
i
.
Definition 4.5 The diameter of a partition P is
diamP = max¦x
i
− x
i−1
: 1 ≤ i ≤ n¦.
Theorem 4.12 Let f be an integrable function on [a, b]. For every > 0 there
exists a δ > 0, such that
¸
¸
¸
¸
¸
¸
¸
n

i=1
f (t
i
)(x
i−1
− x
i
) −
_
b
a
f
¸
¸
¸
¸
¸
¸
¸
<
for every partition P such that diamP < δ and any choice of points t
i
.
Proof : It is enough to show that given > 0 there exists a δ > 0 such that
¦U( f , P) − L( f , P)¦ < whenever diamP < δ,
158 Chapter 4
for any Riemann sum with the same partition will differ from the integral by less
than .
Let
M = sup¦¦ f (x)¦ : a ≤ x ≤ b¦,
and take some partition P
·
= ¦y
0
, . . . , y
k
¦ for which
¦U( f , P
·
) − L( f , P
·
)¦ <

2
.
We set
δ <

4Mk
.
For every partition P of diameter less than δ,
U( f , P) − L( f , P) =
n

i=1
(M
i
− m
i
)(x
i
− x
i−1
)
can be divided into two sums. One for which each of the intervals [x
i−1
, x
i
] is
contained in an interval [y
j−1
, y
j
]. The contribution of these terms is at most /2.
The second sum includes those intervals for which x
i−1
< y
j
< x
i
. There are at
most k such terms and each contributes at most Mδ. This concludes the proof. B
4.5 The trigonometric functions
Discussion: Describe the way the trigonometric functions are defined in geome-
try, and explain that we want definitions that only rely on the 13 axioms of real
numbers.
Definition 4.6 The number π is defined as
π = 2
_
1
−1

1 − x
2
dx.
Comment: We have used here the standard notation of integrals. In our notation,
we have a function ϕ : [−1, 1] → R defined as ϕ : x ·→

1 − x
2
, and we define
π = 2
_
1
−1
ϕ.
Integration theory 159
This is well defined because ϕ is continuous, hence integrable.
Consider now a unit circle. For −1 ≤ x ≤ 1 we look at the point (x,

1 − x
2
),
connect it to the origin, and express the area of the sector bounded by this seg-
ment and the positive x-axis. This geometric construction is just the motivation;
formally, we define a function A : [−1, 1] → R,
A(x) =
x

1 − x
2
2
+
_
1
x
ϕ.
If θ is the angle of this sector, then we know that x = cos θ, and that A(x) = θ/2.
Thus,
A(cos θ) =
θ
2
,
or,
cos θ = A
−1
_
θ
2
_
.
This observation motivates the way we are going to define the cosine function.
x
A!x"
We now examine some properties of the function A. First, A(−1) = π/2 and
A(1) = 0. Second,
A
·
(x) = −
1
2

1 − x
2
,
i.e., the function A is monotonically decreasing. It follows that it is invertible,
with its inverse defined on [0, π/2].
Definition 4.7 We define the functions cos : [0, π] → R and sin : [0, π] → R by
cos x = A
−1
(x/2) and sin x =

1 − cos
2
x.
160 Chapter 4
With these definitions in hand we define the other trigonometric functions, tan,
cot, etc.
We now need to show that the sine and cosine functions satisfy all known differ-
ential and algebraic properties.
Theorem 4.13 For 0 < x < π,
sin
·
x = cos x and cos
·
x = −sin x.
Proof : Since A is differentiable and A
·
(x) 0 for 0 < x < π, we have by the chain
rule and the derivative of the inverse function formula,
cos
·
x =
1
2
(A
−1
)
·
(x/2) =
1
2
1
A
·
(A
−1
(x/2))
= −
1
2
1
1
2

1−(A
−1
(x/2))
2
= = −sin x.
The derivative of the sine function is then obtained by the composition rule. B
(45 hrs, 2009)
We are now in measure to start picturing the graphs of these two functions. First,
by definition,
sin x > 0 0 < x < π.
It thus follows that cos is decreasing in this domain. We also have
cos π = A
−1
(π/2) = −1 and cos 0 = A
−1
(0) = 1,
from which follows that
sin π = 0 and sin 0 = 0.
By the intermediate value theorem, cos must vanish for some 0 < y < π. Since
A(cos y) =
y
2
,
it follows that
y = 2A(0) = 2
_
1
0
ϕ =
π
2
,
Integration theory 161
where we have used the fact that by symmetry,
_
0
−1
ϕ =
_
1
0
ϕ.
Thus,
cos
π
2
= 0 and sin
π
2
= 1.
We can then extend the definition of the sine and cosine to all of R by setting
sin x = −sin(2π − x) and cos x = cos(2π − x) for π ≤ x ≤ 2π,
and
sin(x + 2πk) = sin x and cos(x + 2πk) = cos x for all k ∈ Z,
It is easy to verify that the expressions for the derivatives of sine and cosine remain
the same for all x ∈ R.
Since we defined the trigonometric functions (more precisely their inverses) via
integrals, it is not surprising that it was straightforward to differentiate them. A
little more subtle is to derive their algebraic properties.
Lemma 4.2 Let f : R → R be twice differentiable, and satisfy the differential
equation
f
··
+ f = 0,
with f (0) = f
·
(0) = 0. Then, f is identically zero.
Proof : Multiply the equation by 2f
·
, then
2 f
··
f
·
+ 2f f
·
= [( f
·
)
2
+ f
2
]
·
= 0,
from which we deduce that
( f
·
)
2
+ f
2
= c,
for some c ∈ R. The conditions at zero imply that c = 0, which implies that f is
identically zero. B
162 Chapter 4
Theorem 4.14 Let f : R → R be twice differentiable, and satisfy the differential
equation
f
··
+ f = 0,
with f (0) = a and f
·
(0) = b. Then,
f = a cos +b sin .
Proof : Define
g = f − a cos −b sin,
then,
g
··
+ g = 0,
and g(0) = g
·
(0) = 0. It follows from the lemma that g is identically zero, hence
f = a cos +b sin. B
Theorem 4.15 For every x, y ∈ R,
sin(x + y) = sin x cos y + cos x sin y
cos(x + y) = cos x cos y − sin x sin y.
Proof : Fix y and define
f : x ·→ sin(x + y),
from which follows that
f
·
: x ·→ cos(x + y)
f
··
: x ·→ −sin(x + y).
Thus, f
··
+ f = 0 and
f (0) = sin y and f
·
(0) = cos y.
It follows from the previous theorem that
f (x) = sin y cos x + cos y sin x.
We proceed similarly for the cosine function. B
Integration theory 163
4.6 The logarithm and the exponential
Our (starting) goal is to define the function
f : x ·→ 10
x
for x ∈ R, along with its inverse, which we denote by log
10
.
For n ∈ Nwe have an inductive definition, which in particular, satisfies the identity
10
n
10
m
= 10
n+m
.
We then extend the definition of the exponential (with base 10) to integers, by
requiring the above formula to remain correct. Thus, we must set
10
0
= 1 and 10
−n
=
1
10
n
.
The next extension is to fractional powers of the form 10
1/n
. Since we want
10
1/n
× 10
1/n
× × 10
1/n
.,,.
n times
= 10
1
= 10,
this forces the definition 10
1/n
=
n

10. Then, we extend the definition to rational
powers, by
10
1/n
× 10
1/n
× × 10
1/n
.,,.
m times
= 10
m/n
.
We have proved in the beginning of this course that the definition of a rational
power is independent of the representation of that fraction.
The question is how to extend the definition of 10
x
when x is a real number?
This can be done with lots of “sups” and “infs”, but here we shall adopt another
approach. We are looking for a function f with the property that
f (x + y) = f (x) f (y),
and f (1) = 10. Can we find a differentiable function that satisfies these properties?
If there was such a function, then
f
·
(x) = lim
h→0
f (x + h) − f (x)
h
= lim
h→0
f (x) f (h) − f (x)
h
=
_
lim
h→0
f (h) − 1
h
_
f (x).
164 Chapter 4
Suppose that the limit on the right hand side exists, and denote it by α. Then,
f
·
(x) = α f (x) and f (1) = 10.
This is not very helpful, as the derivative of f is expressed in terms of the (un-
known) f .
It is clear that f should be monotonically increasing (it is so for rational powers),
hence invertible. Then,
( f
−1
)
·
(x) =
1
f
·
( f
−1
(x))
=
1
αf ( f
−1
(x))
=
1
αx
.
From this we deduce that
f
−1
(x) =
_
x
1
dt
αt
= log
10
(x).
We seem to have succeeded in defining (using integration) the logarithm-base-10
function, but there is a caveat. We do not know the value of α. Since, in any case,
the so far distinguished role of the number 10 is unnatural, we define an alternative
logarithm function:
Definition 4.8 For all x > 0,
log(x) =
_
x
1
dt
t
.
Comment: Since 1/x is continuous on (0, ∞), this integral is well-defined for all
0 < x < ∞.
Theorem 4.16 For all x, y > 0,
log(xy) = log(x) + log(y).
Proof : Fix y and consider the function
f : x ·→ log(xy).
Integration theory 165
Then,
f
·
(x) =
1
xy
y =
1
x
.
Since log
·
(x) = 1/x, it follows that
log(xy) = log(x) + c.
The value of the constant is found by setting x = 1. B
Corollary 4.3 For integer n,
log(x
n
) = n log(x).
Proof : By induction on n. B
Corollary 4.4 For all x, y > 0,
log
_
x
y
_
= log(x) − log(y).
Proof :
log(x) = log
_
x
y
y
_
= log
_
x
y
_
+ log(y).
B
Definition 4.9 The exponential function is defined as
exp(x) = log
−1
(x).
Comment: The logarithm is invertible because it is monotonically increasing, as
its derivative is strictly positive. What is less easy to see is that the domain of the
exponential is the whole of R. Indeed, since
log(2
n
) = n log(2),
166 Chapter 4
it follows that the logarithm is not bounded above. Similarly, since
log(1/2
n
) = −n log(2),
it is neither bounded below. Thus, its range is R.
Theorem 4.17
exp
·
= exp .
Proof : We proceed as usual,
exp
·
(x) = (log
−1
)
·
(x) =
1
log
·
(log
−1
(x))
= log
−1
(x) = exp(x).
B
Theorem 4.18 For every x, y,
exp(x + y) = exp(x) exp(y).
Proof : Set t = exp(x) and s = exp(y). Thus, x = log(t) and y = log(s), and
x + y = log(t) + log(s) = log(ts).
Exponentiating both sides, we get the desired result. B
Definition 4.10
e = exp(1).
There is plenty more to be done with exponentials and logarithms, like for exam-
ple, show that for rational numbers r, e
r
= exp(r), and the passage between bases.
Due to the lack of time, we will stop at this point.
Integration theory 167
4.7 Integration methods
Given a function f , a question of practical interest is how to calculate integrals,
_
b
a
f .
If we know a function F, which is a primitive of f , in the sense that F
·
= f , then
the answer is known,
_
b
a
f = F(b) − F(a)
def
= F¦
b
a
.
To effectively use this formula, we must know many pairs of functions f , F. Here
is a table of pairs f , F, which we already know:
f (x) F(x)
1 x
x
n
x
n+1
/(n + 1)
cos x sin x
sin x −cos x
1/x log(x)
exp(x) exp(x)
sec
2
(x) tan(x)
1/(1 + x
2
) arctan(x)
1/

1 − x
2
arcsin(x)
Sums of such functions and multiplication by constants are easily treated with our
integration theorems, for
_
b
a
( f + g) =
_
b
a
f +
_
b
a
g and
_
b
a
(c f ) = c
_
b
a
f .
Functions that can be expressed in terms of powers, trigonometric, exponential
and logarithmic functions and called elementary. In this section we study tech-
niques that in some cases allow to express a primitive function in terms of ele-
mentary functions. There are basically two techniques—integration by parts, and
substitution—both we will study.
168 Chapter 4
Comment: Every continuous function f has a primitive,
F(x) =
_
x
a
f ,
where a is arbitrary. It is however not always the case that the primitive of an
elementary function is itself and elementary function.
Example: Consider the function
F(x) = x arctan(x) −
1
2
log(1 + x
2
).
It is easily checked that
f (x) = F
·
(x) = arctan(x).
As a result of this observation,
_
b
a
arctan(x) dx = F¦
b
a
.
The question is how could have we guessed this answer if the function F had not
been given to us.
Theorem 4.19 (Integration by parts (יקלחב היצרגטניא)) Let f , g have con-
tinuous first derivatives, then
_
b
a
f g
·
= ( f g)¦
b
a

_
b
a
f
·
g.
Comment: When we use this formula, f and g
·
are given. Then, g on the right-
hand side may be any primitive of g
·
. Also, setting b = x and noting that primitives
are defined up to constants, we have that
Primitive( f g
·
) = f Primitive(g
·
) − Primitive( f
·
Primitive(g)) + const.
Integration theory 169
Example:
_
b
a
x e
x
dx.

Example: The 1-trick:
_
b
a
log(x) dx.

Example: The back-to-original-problem-trick
_
b
a
(1/x) log(x) dx.

Once we have used this method to get new f , F pairs, we use this knowledge to
obtain new such pairs:
Example: Using the fact that a primitive of log(x) is x log(x) − x, we solve
_
b
a
log
2
(x) dx.

Theorem 4.20 (Substitution formula (הבצהה תטיש)) For continuously differ-
entiable f and g,
_
b
a
(g ◦ f ) f
·
=
_
f (b)
f (a)
g.
Proof : Let G be a primitive of g, then
(G ◦ f )
·
= (g ◦ f ) f
·
,
170 Chapter 4
hence
_
b
a
(g ◦ f ) f
·
=
_
b
a
(G ◦ f )
·
= (G ◦ f )¦
b
a
= G¦
f (b)
f (a)
=
_
f (b)
f (a)
g.
B
The simplest use of this formula is by recognizing that a function is of the form
(g ◦ f ) f
·
.
Example: Take the integral
_
b
a
sin
5
(x) cos(x) dx.
If f (x) = sin(x) and g(x) = x
5
, then
sin
5
(x) cos(x) = g( f (x)) f
·
(x),
hence
_
b
a
sin
5
(x) cos(x) dx =
_
sin(b)
sin(a)
t
5
dt =
t
6
6
¸
¸
¸
¸
¸
¸
sin(b)
sin(a)
.

(46 hrs, 2009)
Chapter 5
Sequences
5.1 Basic definitions
An (infinite) sequence (הרדס) is an infinite ordered set of real numbers. We denote
sequences by a letter which labels the sequence, and a subscript which labels the
position in the sequence, for example,
a
1
, a
2
, a
3
, . . . .
The essential fact is that to each natural number corresponds a real number. Hav-
ing said that, we can define:
Definition 5.1 An infinite sequence is a function N → R.
Comment: Let a : N → R. Rather than writing a(3), a(17), we write a
3
, a
17
.
Comment: As for functions, where we often refer to “the function x
2
”, we often
refer to “the sequence 1/n”.
Notation: We usually denote a sequence a : N → R by (a
n
). Another ubiquitous
notation is (a
n
)

n=1
.
Examples:
' a : n ·→ n or a
n
= n.
172 Chapter 5
^ b : n ·→ (−1)
n
or b
n
= (−1)
n
.
· c : n ·→ 1/n or c
n
= 1/n.
´ (d
n
) = ¦2, 3, 5, 7, 11, . . . ¦, the sequence of primes.
In the first and fourth cases we would say that “the sequence goes (תפאוש) to
infinity”. In the second case we would say that “it jumps between −1 and 1”. In
the third case we would say that “it goes to zero”. We will make such statements
precise momentarily.
Like functions, sequences can be graphed. More useful often is to mark the ele-
ments of the sequence on the number axis.
a
1
a
4
a
5
a
3
a
2
5.2 Limits of sequences
Definition 5.2 A sequence (a
n
) converges (תסנכתמ) to , denoted
lim
n→∞
a
n
= ,
if
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a
n
− ¦ < ).
Definition 5.3 A sequence is called convergent (תסנכתמ) if it converges to some
(real) number; otherwise it is called divergent (תרדבתמ) (there is no such thing
as convergence to infinity).
Comment: There is a close connection between the limit of a sequence and the
limit of a function at infinity. We will elaborate on that later on.
Notation: Like for functions, a “cleaner” notation for the limit of a sequence (a
n
)
would be
lim

a.
Sequences 173
Example: Let a : n ·→ 1/n, or
(a
n
) = 1,
1
2
,
1
3
,
1
4
, . . .
It is easy to show that
lim

a = 0,
since,
(∀ > 0)(set N = ¦1/¦) then (∀n > N)
_
¦a
n
¦ =
1
n
<
1
N
<
_
.

Example: Let a : n ·→

n + 1 −

n, or
(a
n
) =

2 −

1,

3 −

2,

4 −

3, . . . .
It seems reasonable that
lim

a = 0,
however showing it requires some manipulations. One way is by an algebraic
trick,
a
n
=
(

n + 1 −

n)(

n + 1 +

n)

n + 1 +

n
=
1

n + 1 +

n
<
1
2

n
,
from which it is easy to proceed. The same inequality can be obtained by defining
a function f : x ·→

x, and using the mean-value theorem,

y −

x =
y − x
2

ξ
,
for some x < ξ < y. Substituting x = n and y = n + 1, we obtain the desired
inequality.
Example: Let now
a : n ·→
3n
3
+ 7n
2
+ 1
4n
3
− 8n + 63
.
174 Chapter 5
It seems reasonable that
lim

a =
3
4
,
but once again, proving it is not totally straightforward. One way to proceed is to
note that
a
n
=
3 + 7/n + 1/n
3
4 − 8/n
2
+ 63/n
3
,
and use limit arithmetic as stated below.
Exercise 5.1 For each of the following sequence determine whether it is con-
verging or diverging. Prove your claim using the definition of the limit:
' a
n
=
7n
2
+

n
n
2
+3n
.
^ a
n
=
n
4
+1
n
3
−3
.
· a
n
=

n
2
+ 1 − n.
Theorem 5.1 Let a and b be convergent sequences. Then the sequences (a+b)
n
=
a
n
+ b
n
and (a b)
n
= a
n
b
n
are also convergent, and
lim

(a + b) = lim

a + lim

b
lim

(a b) = lim

a lim

b.
If furthermore lim

b 0, then there exists an N ∈ N such that b
n
0 for all
n > N, the sequence (a/b) converges and
lim

a
b
=
lim

a
lim

b
.
Comment: Since the limit of a sequence is only determined by the “tail” of the
sequence, we do not care if a finite number of elements is not defined.
Proof : Easy once you’ve done it for functions. B
Sequences 175
Exercise 5.2 Show that if a and b are converging sequences, such that
lim

a = L and lim

b = M,
then
lim

(a + b) = L + M.
We now establish the analogy between limits of sequences and limits of functions.
Let a be a sequence, and define a function f : [1, ∞) → R by
f (x) = a
n
+ (a
n+1
− a
n
)(x − n) n ≤ x ≤ n + 1.
Proposition 5.1
lim

a = if and only if lim

f = .
Proof : Suppose first that the sequence converges. Thus,
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a
n
− ¦ < ).
Since for every x, f (x) lies between f (¦x¦) and f ([x|), it follows that for all x > N,
¦ f (x) − ¦ ≤ max
_
¦a
¦x¦
− ¦, ¦a
[x|
− ¦
_
< ,
hence
(∀ > 0)(∃N ∈ N) : (∀x > N)(¦ f (x) − ¦ < ).
Conversely, if f converges to as x → ∞, then,
(∀ > 0)(∃N ∈ N) : (∀x > N)(¦ f (x) − ¦ < ),
and this is in particular true for x ∈ N. B
Conversely, given a function f : R → R, it is often useful to define a sequence
a
n
= f (n). If f (x) converges as x → ∞, then so does the sequence (a
n
).
176 Chapter 5
Example: Let 0 < a < 1, then
lim
n→∞
a
n
=?
We define f (x) = a
x
. Then
f (x) = e
x log a
,
and a
n
= f (n). Since log a < 0 it follows that lim
x→∞
f (x) = 0, hence the sequence
converges to zero.
Comment: Like for functions, we may define convergence of sequences to ±∞,
only that we don’t use the term convergence, but rather say that a sequence ap-
proaches (תפאוש) infinity.
Theorem 5.2 Let f be defined on a (punctured) neighborhood of a point c. Sup-
pose that
lim
c
f = .
If a is a sequence whose entries belong to the domain of f , such that a
n
c, and
lim

a = c,
then
lim

f (a) = ,
where the composition of a function and a sequence is defined by
[ f (a)]
n
= f (a
n
).
Conversely, if lim

f (a) = for every sequence a that converges to c, then
lim
c
f = .
Comment: This theorem provides a characterization of the limit of a function at
a point in terms of sequences. This is known as Heine’s characterization of the
limit.
Proof : Suppose first that
lim
c
f = .
Sequences 177
Thus,
(∀ > 0)(∃δ > 0) : (∀x : 0 < ¦x − c¦ < δ)(¦ f (x) − ¦ < ).
Let a be a sequence as specified in the theorem, namely, a
n
c and lim

a = c.
Then
(∀δ > 0)(∃N ∈ N) : (∀n > N)(0 < ¦a
n
− c¦ < δ).
Combining the two,
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦ f (a
n
) − ¦ < ),
i.e., lim

f (a) = .
Conversely, suppose that for any sequence a as above lim

f (a) = . Suppose, by
contradiction that
lim
c
f
(i.e., either the limit does not exist or it is not equal to c). This means that
(∃ > 0) : (∀δ > 0)(∃x : 0 < ¦x − c¦ < δ) : (¦ f (x) − ¦ ≥ ).
In particular, setting δ
n
= 1/n,
(∃ > 0) : (∀n ∈ N)(∃a
n
: 0 < ¦a
n
− c¦ < δ
n
) : (¦ f (a
n
) − ¦ ≥ ).
The sequence a converges to c but f (a) does not converge to , which contradicts
our assumptions. B
Example: Consider the sequence
a : n ·→ sin
_
13 +
1
n
2
_
.
This sequence can be written as a = sin(b), where b
n
= 13 + 1/n
2
. Since
lim
13
sin = sin(13) and lim

b = 13,
it follows that
lim

a = sin(13).

(48 hrs, 2009)
178 Chapter 5
Theorem 5.3 A convergent sequence is bounded.
Proof : Let a be a sequence that converges to a limit α. We need to show that there
exists L
1
, L
2
such that
L
1
≤ a
n
≤ L
2
∀n ∈ N.
By definition, setting = 1,
(∃N ∈ N) : (∀n > N)(¦a
n
− α¦ < 1),
or,
(∃N ∈ N) : (∀n > N)(α − 1 < a
n
< α + 1).
Set
M = max¦a
n
: 1 ≤ n ≤ N¦
m = min¦a
n
: 1 ≤ n ≤ N¦.
then for all n,
min(m, α − 1) ≤ a
n
≤ max(M, α + 1).
B
Theorem 5.4 Let a be a bounded sequence and let b be a sequence that converges
to zero. Then
lim
n→∞
(ab) = 0.
Proof : Let M be a bound on a, namely,
¦a
n
¦ ≤ M ∀n ∈ N.
Since b converges to zero,
(∀ > 0)(∃N ∈ N) : (∀n > N)
_
¦b
n
¦ <

M
_
.
Thus,
(∀ > 0)(∃N ∈ N) : (∀n > N) (¦a
n
b
n
¦ ≤ M¦b
n
¦ < ) ,
which implies that the sequence ab converges to zero. B
Sequences 179
Example: The sequence
a : n ·→
sin n
n
converges to zero.
Theorem 5.5 Let a be a sequence with non-zero elements, such that
lim

a = 0.
Then
lim

1
¦a¦
= ∞.
Proof : By definition,
(∀M > 0)(∃N ∈ N) : (∀n > N)(0 < ¦a
n
¦ < 1/M),
hence
(∀M > 0)(∃N ∈ N) : (∀n > N)(1/¦a
n
¦ > M),
which concludes the proof. B
Theorem 5.6 Let a be a sequence such that
lim

¦a¦ = ∞.
Then
lim

1
a
= 0.
Proof : By definition,
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a
n
¦ > 1/),
hence
(∀ > 0)(∃N ∈ N) : (∀n > N)(0 < 1/¦a
n
¦ < ),
which concludes the proof. B
180 Chapter 5
Theorem 5.7 Suppose that a and b are convergent sequences,
lim

a = α and lim

b = β,
and β > α. Then there exists an N ∈ N, such that
b
n
> a
n
for all n > N,
i.e., the sequence b is eventually greater (term-by-term) than the sequence a.
Proof : Since (β − α)/2 > 0,
(∃N
1
∈ N) : (∀n > N
1
)
_
a
n
< α +
1
2
(β − α)
_
,
and
(∃N
2
∈ N) : (∀n > N
2
)
_
b
n
> β −
1
2
(β − α)
_
.
Thus,
(∃N
1
, N
2
∈ N) : (∀n > max(N
1
, N
2
)) (b
n
− a
n
> 0) .
B
Corollary 5.1 Let a be a sequence and α, β ∈ R. If the sequence a converges to α
and α > β then eventually a
n
> β.
Theorem 5.8 Suppose that a and b are convergent sequences,
lim

a = α and lim

b = β,
and there exists an N ∈ N, such that a
n
≤ b
n
for all n > N. Then α ≤ β.
Proof : By contradiction, using the previous theorem. B
Comment: If instead a
n
< b
n
for all n > N, then we still only have α ≤ β. Take
for example the sequences a
n
= 1/n and b
n
= 2/n. Even though a
n
< b
n
for all n,
both converge to the same limit.
Sequences 181
Theorem 5.9 (Sandwich) Suppose that a and b are sequences that converge to
the same limit . Let c be a sequence for which there exists an N ∈ N such that
a
n
≤ c
n
≤ b
n
for all n > N.
Then
lim

c = .
Proof : It is given that
(∀ > 0)(∃N
1
∈ N) : (∀n > N
1
)(¦a
n
− ¦ < ),
and
(∀ > 0)(∃N
2
∈ N) : (∀n > N
2
)(¦b
n
− ¦ < ).
Thus,
(∀ > 0)(∃N
1
, N
2
∈ N) : (∀n > max(N
1
, N
2
))(− < a
n
− ≤ c
n
− ≤ b
n
− < ),
or
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦c
n
− ¦ < ).
B
Example: Since for all n,
1 <
_
1 + 1/n < 1 + 1/n,
it follows that
lim
n→∞
_
1 + 1/n = 1.
We can also deduce this from the fact that
lim
x→1

x = 1 and lim
n→∞
(1 + 1/n) = 1.

Exercise 5.3
182 Chapter 5
' Find sequences a and b, such that a converges to zero, b is unbounded, and
ab converges to zero.
^ Find sequences a and b, such that a converges to zero, b is unbounded, and
ab converges to a limit other then zero.
· Find sequences a and b, such that a converges to zero, b is unbounded, and
ab is bounded but does not converge.
Definition 5.4 A sequence a is called increasing (הלוע) if a
n+1
> a
n
for all n. It
is called non-decreasing (תדרוי אל) if a
n+1
≥ a
n
for all n. We define similarly a
decreasing and a non-increasing sequence.
Theorem 5.10 (Bounded + monotonic = convergent) Let a be a non-decreasing
sequence bounded from above. Then it is convergent.
Proof : Set
α = sup¦a
n
: n ∈ N¦.
(Note that a supremum is a property of a set, i.e., the order in the set does not
matter.) By the definition of the supremum,
(∀ > 0)(∃N ∈ N) : (α − < a
N
≤ α).
Since the sequence is non-decreasing,
(∀ > 0)(∃N ∈ N) : (∀n > N)(α − < a
N
≤ a
n
≤ α),
and in particular,
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a
n
− α¦ < ).
B
Example: Consider the sequence
a : n ·→
_
1 +
1
n
_
n
.
Sequences 183
We will first show that a
n
< 3 for all n. Indeed
1
,
_
1 +
1
n
_
n
= 1 +
_
n
1
_
1
n
+
_
n
2
_
1
n
2
+ +
_
n
n
_
1
n
n
= 1 + 1 +
n(n − 1)
2! n
2
+ +
n!
n! n
n
≤ 1 + 1 +
1
2!
+ +
1
n!
≤ 1 + 1 +
1
2
+ +
1
2
n−1
< 3.
Second, we claim that this sequence is increasing, as
a
n
= 1 + 1 +
1
2!
_
1 −
1
n
_
+
1
3!
_
1 −
1
n
_ _
1 −
2
n
_
+ .
When n grows, both the number of elements grows as well as their values. It
follows that
lim
n→∞
_
1 +
1
n
_
n
exists
(and equals to 2.718 ≡ e).
Definition 5.5 Let a be a sequence. A subsequence (הרדס תת) of a is any
sequence
a
n
1
, a
n
2
, . . . ,
such that
n
1
< n
2
< .
More formally, b is a subsequence of a if there exists a monotonically increasing
sequence of natural numbers (n
k
), such that b
k
= a
n
k
for all k ∈ N.
Example: Let a by the sequence of natural numbers, namely a
n
= n. The subse-
quence b of all even numbers is
b
k
= a
2k
,
i.e., n
k
= 2k.
1
The binomial formula is
(a + b)
n
=
n

k=0
_
n
k
_
a
k
b
n−k
.
184 Chapter 5
Lemma 5.1 Any sequence contains a subsequence which is either non-decreasing
or non-increasing.
Proof : Let a be a sequence. Let’s call a number n a peak point (איש תדוקנ) of the
sequence a if a
m
< a
n
for all m > n.
a peak poin!
a peak poin!
There are now two possibilities.
There are infinitely many peak points: If n
1
< n
2
< are a sequence of peak
points, then the subsequence a
n
k
is decreasing.
There are finitely many peak points: Then let n
1
be greater than all the peak points.
Since it is not a peak point, there exists an n
2
, such that a
n
2
≥ a
n
1
. Continuing this
way, we obtain a non-decreasing subsequence. B
Corollary 5.2 (Bolzano-Weierstraß) Every bounded sequence has a convergent
subsequence.
(50 hrs, 2009)
In many cases, we would like to know whether a sequence is convergent even
if we do not know what the limit is. We will now provide such a convergence
criterion.
Definition 5.6 A sequence a is called a Cauchy sequence if
(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a
n
− a
m
¦ < ).
Sequences 185
Theorem 5.11 A sequence converges if and only if it is a Cauchy sequence.
Proof : One direction is easy
2
. If a sequence a converges to a limit α, then
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a
n
− α¦ < /2),
hence
(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a
n
− a
m
¦ ≤ ¦a
n
− α¦ + ¦a
m
− α¦ < ),
i.e., the sequence is a Cauchy sequence.
Suppose now that a is a Cauchy sequence. We first show that the sequence is
bounded. Taking = 1,
(∃N ∈ N) : (∀n > N)(¦a
n
− a
N+1
¦ < 1),
which proves that the sequence is bounded.
By the Bolzano-Weierstraß theorem, it follows that a has a converging subse-
quence. Denote this subsequence by b with b
k
= a
n
k
and its limit by β. We will
show that the whole sequence converges to β.
Indeed, by the Cauchy property
(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a
n
− a
m
¦ < /2),
and whereas by the convergence of the sequence b,
(∀N ∈ N)(∃K ∈ N : n
K
> N) : (∀k > K)(¦b
k
− β¦ < /2).
Combining the two,
(∀ > 0)(∃N, K ∈ N : n
K
> N) : (∀n > N)(¦a
n
− β¦ ≤ ¦a
n
− b
k
¦ + ¦b
k
− β¦ < ).
This concludes the proof. B
(51 hrs, 2009)
2
There is something amusing about calling sequences satisfying this property a Cauchy se-
quence. Cauchy assumed that sequences that get eventually arbitrarily close converge, without
being aware that this is something that ought to be proved.
186 Chapter 5
5.3 Infinite series
Let a be a sequence. We define the sequence of partial sums (ייקלח ימוכס) of
a by
S
a
n
=
n

k=1
a
k
.
Note that S
a
does not depend on the order of summation (a finite summation). If
the sequence S
a
converges, we will interpret its limit as the infinite sum of the
sequence.
Definition 5.7 A sequence a is called summable (המיכס) if the sequence of par-
tial sums S
a
converges. In this case we write

k=1
a
k
def
= lim

S
a
.
The sequence of partial sum is called a series (רוט). Often, the term series refers
also to the limit of the sequence of partial sums.
Example: The canonical example of a series is the geometric series. Let a be a
geometric sequence
a
n
= q
n
¦q¦ < 1.
The geometric series is the sequence of partial sums, for which we have an explicit
expresion:
S
a
n
= q
1 − q
n
1 − q
This series converges, hence this geometric sequence is summable and

n=1
q
n
=
q
1 − q
.

Comment: Note the terminology:
sequence is summable = series is convergent.
Sequences 187
Theorem 5.12 Let a and b be summable sequences and γ ∈ R. Then the sequences
a + b and γa are summable and

n=1
(a
n
+ b
n
) =

n=1
a
n
+

n=1
b
n
,
and

n=1
(γa
n
) = γ

n=1
a
n
.
Proof : Very easy. For example,
S
a+b
n
=
n

k=1
(a
k
+ b
k
) = S
a
n
+ S
b
n
,
and the rest follows from limits arithmetic. B
An important question is how to identify summable sequences, even in cases
where the sum is not known. This brings us to discuss convergence criteria.
Theorem 5.13 (Cauchy’s criterion) The sequence a is summable if and only if for
every > 0 there exists an N ∈ N, such that
¦a
n+1
+ a
n+2
+ + a
m
¦ < for all m > n > N.
Proof : This is just a restatement of Cauchy’s convergence criterion for sequences.
We know that the series converges if and only if
(∀ > 0)(∃N ∈ N) : (∀m > n > N)(¦S
a
m
− S
a
n
¦ < ),
but
S
a
m
− S
a
n
= a
n+1
+ a
n+2
+ + a
m
.
B
188 Chapter 5
Corollary 5.3 If a is summable then
lim

a = 0.
Proof : We apply Cauchy’s criterion with m = n + 1. That is,
(∀ > 0)(∃N ∈ N) : (∀n > N)(¦a
n+1
¦ < ).
B
The vanishing of the sequence is only a necessary condition for its summability.
It is not sufficient as proves the harmonic series, a
n
= 1/n,
S
a
n
= 1 +
1
2
+
1
3
+
1
4
.,,.
≥1/2
+
1
5
+
1
6
+
1
7
+
1
8
.,,.
≥1/2
+ .
This grouping of terms shows that the sequence of partial sums is unbounded.
For a while we will deal with non-negative sequences. The reason will be made
clear later on.
Theorem 5.14 A non-negative sequence is summable if and only if the series (i.e.,
the sequence of partial sums) is bounded.
Proof : Consider the sequence of partial sums, which is monotonically increas-
ing. Convergence implies boundedness. Monotonicity and boundedness imply
convergence. B
Theorem 5.15 (Comparison test) If 0 ≤ a
n
≤ b
n
for every n and the sequence b is
summable, then the sequence a is summable.
Proof : Since b is summable then the corresponding series S
b
is bounded, namely.
(∃M > 0) : (∀n ∈ N)
_
0 ≤ S
b
n
≤ M
_
.
The bound M bounds the series S
a
, which therefore converges. B
Sequences 189
Example: Consider the following sequence:
a
n
=
2 + sin
3
(n + 1)
2
n
+ n
2
.
Since
0 < a
n

3
2
n
def
= b
n
,
and the series associated with b converges, it follows that the series of a converges.

Example: Consider now the sequence
a
n
=
1
2
n
− 1 + sin
3
(n
2
)
.
Here we have
0 <
1
2
n
≤ a
n

1
2
n
− 2

2
2
n
, n ≥ 2,
so again, based on the comparison test and the convergence of the geometric series
we conclude that the series of a converges.
Example: The next example is
a
n
=
n + 1
n
2
+ 1
.
It is clear that the corresponding series diverges, since a
n
“roughly behaves” like
1/n. This can be shown by the following inequality
1
n
=
n + 1
n
2
+ n
≤ a
n
,
which proves that the series S
a
is unbounded. The following theorem provides an
even easier way to show that.
(53 hrs, 2009)
Theorem 5.16 (Limit comparison) Let a and b be positive sequences, such that
lim

a
b
= γ 0.
Then a is summable if and only if b is summable.
190 Chapter 5
Example: Back to the previous example, setting b
n
= 1/n we find
lim

a
b
= lim
n→∞
n
2
+ n
n
2
+ 1
= 1,
hence the series of a diverges.
Proof : Suppose that b is summable. Since
lim

a
γ b
= 1,
it follows that
(∃N ∈ N) : (∀n > N)
_
a
n
γ b
n
< 2
_
,
or
(∃N ∈ N) : (∀n > N) (a
n
< 2γb
n
) .
Now,
S
a
n
= S
a
N
+
n

k=N+1
a
k
< S
a
N
+ 2γ
n

k=N+1
b
k
≤ S
a
N
+ 2γS
b
n
.
Since the right hand sides is uniformly bounded, then a is summable. The roles of
a and b can then be interchanged. B
Theorem 5.17 (Ratio test (הנמה חבמ)) Let a
n
> 0 and suppose that
lim
n→∞
a
n+1
a
n
= r.
(i) If r < 1 then (a
n
) is summable. (ii) If r > 1 then (a
n
) is not summable.
Proof : Suppose first that r < 1. Then,
(∃s : r < s < 1)(∃N ∈ N) : (∀n > N)
_
a
n+1
a
n
≤ s
_
.
In particular, a
N+1
≤ s a
N
, a
N+2
≤ s
2
a
N
, and so on. Thus,
n

k=1
a
k
=
N

k=1
a
k
+ a
N
n

k=N+1
s
k−N
.
Sequences 191
The second term on the right hand side is a partial sum of a convergent geometric
series, hence the partial sums of (a
n
) are bounded, which proves their convergence.
Conversely, if r > 1, then
(∃s : 1 < s < r)(∃N ∈ N) : (∀n > N)
_
a
n+1
a
n
≥ s
_
,
from which follows that a
n
≥ s
n−N
a
N
for all n > N, i.e., the sequence (a
n
) is not
even bounded (and certainly not summable). B
Examples: Apply the ratio test for
a
n
=
1
n!
b
n
=
r
n
n!
c
n
= n r
n
.
Comment: What about the case r = 1 in the above theorem? This does not provide
any information about convergence, as show the two examples
3
,

n=1
1
n
2
< ∞ and

n=1
1
n
= ∞.
These examples motivate, however, another comparison test. Note that
lim
x→∞
_
x
1
dt
t
= ∞ whereas lim
x→∞
_
x
1
dt
t
2
< ∞.
Indeed, there is a close relation between the convergence of the series and the
convergence of the corresponding integrals.
Definition 5.8 Let f be integrable on any segment [a, b] for some a and b > a.
Then
_

a
f
def
= lim
x→∞
_
x
a
f .
3
This notation means that the sequence a
n
= 1/n
2
is summable, whereas the sequence b
n
= 1/n
is not.
192 Chapter 5
Theorem 5.18 (Integral test) Let f be a positive decreasing function on [1, ∞)
and set a
n
= f (n). Then
_

n=1
a
n
converges if and only if
_

1
f exists.
Proof : The proof hinges on showing that
[x|

k=2
a
k

_
x
1
f ≤
¦x¦

k=1
a
k
.
TO COMPLETE. B
Example: Consider the series

n=1
1
n
p
for p > 0. We apply the integral text, and consider the integral
_
x
1
dt
t
p
=
_
¸
¸
_
¸
¸
_
log x p = 1
(1 − p)
−1
(x
−p+1
− 1) p 1.
This integral converges as x → ∞ if and only if p > 1, hence so does the series.

(55 hrs, 2009)
Example: One can take this further and prove that

n=1
1
n(log n)
p
converges if and only if p > 1, and so does

n=1
1
n log n(log log n)
p
,
and so on. This is known as Bertrand’s ladder.
We have dealt with series of non-negative sequences. The case of non-positive
sequences is the same as we can define b
n
= (−a
n
). Sequences of non-fixed sign
are a different story.
Sequences 193
Definition 5.9 A series
_

n=1
a
n
is said to be absolutely convergent (תסנכתמ
טלחהב) if
_

n=1
¦a
n
¦ converges. A series that converges but does not converge
absolutely is called conditionally convergent (יאנתב תסנכתמ).
Example: The series
1 −
1
2
+
1
3

1
4
+ ,
converges conditionally.
Example: The series
1 −
1
3
+
1
5

1
7
+ ,
converges conditionally as well. It is a well-known alternating series, which Leib-
niz showed (without any rigor) to be equal to π/4. It is known as the Leibniz
series.
Theorem 5.19 Every absolutely convergent series is convergent. Also, a series is
absolutely convergent, if and only if the two series formed by its positive elements
and its negative elements both converge.
Proof : The fact that an absolutely convergent series converges follows fromCauchy’s
criterion. Indeed, for every > 0 there exists an N ∈ N such that
¦a
n+1
+ + a
m
¦ ≤ ¦a
n+1
¦ + + ¦a
m
¦ <
whenever m, n > N.
Define now
4
a
+
n
=
_
¸
¸
_
¸
¸
_
a
n
a
n
> 0
0 a
n
≤ 0
and a

n
=
_
¸
¸
_
¸
¸
_
a
n
a
n
< 0
0 a
n
≥ 0
.
4
For example, if a
n
= (−1)
n+1
/n, then
a
+
n
= 1, 0,
1
3
, 0,
1
5
, 0,
a

n
= 0, −
1
2
, 0, −
1
4
, 0, −
1
6
, .
194 Chapter 5
Then, a
n
= a
+
n
+ a

n
, ¦a
n
¦ = a
+
n
− a

n
, and conversely,
a
+
n
=
1
2
(a
n
+ ¦a
n
¦) and a

n
=
1
2
(a
n
− ¦a
n
¦) .
Thus,
n

k=1
¦a
k
¦ =
n

k=1
a
+
n

n

k=1
a

n
,
hence if the two series of positive and negative terms converge so does the series
of absolute values. On the other hand,
n

k=1
a
+
n
=
1
2
n

k=1
a
n
+
1
2
n

k=1
¦a
n
¦
n

k=1
a

n
=
1
2
n

k=1
a
n

1
2
n

k=1
¦a
n
¦
so that the other direction holds as well. B
(56 hrs, 2009)
Theorem 5.20 (Leibniz) Suppose that a
n
is a non-increasing sequence of non-
negative numbers, i.e.,
a
1
≥ a
2
≥ ≥ 0,
and
lim
n→∞
a
n
= 0.
Then the series

n=1
(−1)
n+1
a
n
= a
1
− a
2
+ a
3
− a
4
+
converges.
Proof : Simple considerations show that
s
2
≤ s
4
≤ s
6
≤ ≤ s
5
≤ s
3
≤ s
1
.
Indeed,
s
2n
≤ s
2n
+ (a
2n+1
− a
2n+2
) = s
2n+2
= s
2n+1
− a
2n+2
≤ s
2n+1
.
Sequences 195
Thus, the sequence b
n
= s
2n
is non-decreasing and bounded from above, whereas
the sequence c
n
= s
2n+1
is non-increasing and bounded from below. It follows that
both subsequences converge,
lim
n→∞
b
n
= b and lim
n→∞
c
n
= c.
By limit arithmetic
lim
n→∞
(c
n
− b
n
) = lim
n→∞
(s
2n+1
− s
2n
) = lim
n→∞
a
2n+1
= c − b,
however a
2n+1
converges to zero, hence b = c. It remains to show that if both the
odd and even elements of a sequence converge to the same limit then the whole
sequence converges to this limit.
Indeed,
(∀ > 0)(∃N
1
∈ N) : (∀n > N
1
)(¦s
2n+1
− b¦ < )
and
(∀ > 0)(∃N
2
∈ N) : (∀n > N
2
)(¦s
2n
− b¦ < ),
i.e.,
(∀ > 0)(∃N
1
, N
2
∈ N) : (∀n > max(2N
1
+ 1, 2N
2
))(¦s
n
− b¦ < ).
B
Definition 5.10 A rearrangement (שדחמ רודיס) of a sequence a
n
is a sequence
b
n
= a
f (n)
,
where f : N → N is one-to-one and onto.
Theorem 5.21 (Riemann) If
_

n=1
a
n
is conditionally convergent then for every
α ∈ R there exists a rearrangement of the series that converges to α.
Proof : I will not give a formal proof, but only a sketch. B
196 Chapter 5
Example: Consider the series
S =

n=1
(−1)
n+1
n
= 1 −
1
2
+
1
3

1
4
+
1
5
+ .
We proved that this series does indeed converge to some
1
2
< S < 1. By limit
arithmetic for sequences (or series),
1
2
S =

n=1
(−1)
n+1
2n
=
1
2

1
4
+
1
6

1
8
+
1
10
+ ,
which we can also pad with zeros,
1
2
S = 0 +
1
2
+ 0 −
1
4
+ 0 +
1
6
+ 0 −
1
8
+ 0 +
1
10
+ .
Using once more sequence arithmetic,
3
2
S = 1 +
1
3

1
2
+
1
5
+
1
7

1
4
+ . . . ,
which directly proves that a rearrangement of the alternating harmonic series con-
verges to three halves of its value. In this example the rearrangement is
b
3n+1
= a
4n+1
b
3n+2
= a
4n+3
b
3n
= a
2n
.

Comment: Here is an important fact about Cauchy sequences: let (a
n
) be a Cauchy
sequence with limit a. Then,
(∀ > 0)(∃N ∈ N) : (∀m, n > N)(¦a
n
− a
m
¦ < ).
Think now of the left hand side as a sequence with index m with n fixed. As
m → ∞ this sequence converges to ¦a
n
− a¦. This implies that
¦a
n
− a¦ ≤ for all n > N.
Sequences 197
Theorem 5.22 If
_

n=1
a
n
converges absolutely then any rearrangement
_

n=1
b
n
converges absolutely, and

n=1
a
n
=

n=1
b
n
.
In other word, the order of summation does not matter in an absolutely convergent
series.
Proof : Let b
n
= a
f (n)
and set
s
n
=
n

k=1
a
k
and t
n
=
n

k=1
b
k
.
Let > 0 be given. Since
_

n=1
a
n
converges, then there exists an N such that
¸
¸
¸
¸
¸
¸
¸
s
N

k=1
a
k
¸
¸
¸
¸
¸
¸
¸
<

2
.
Moreover, since the sequence is absolutely convergent, we can choose N large
enough such that

k=1
¦a
k
¦ −
N

k=1
¦a
k
¦ =

k=N+1
¦a
k
¦ <

2
.
Take now M be sufficiently large, so that each of the a
1
, . . . , a
N
appears among the
b
1
, . . . , b
M
. For all m > M,
¦t
m
− s
N
¦ =
¸
¸
¸
¸
¸
¸
¸

k>N, f (k)≤m
a
k
¸
¸
¸
¸
¸
¸
¸

k=N+1
¦a
k
¦ <

2
,
i.e.,
¸
¸
¸
¸
¸
¸
¸

k=1
a
k
− t
m
¸
¸
¸
¸
¸
¸
¸

¸
¸
¸
¸
¸
¸
¸

k=1
a
k
− s
N
¸
¸
¸
¸
¸
¸
¸
+ ¦s
N
− t
m
¦ ≤ .
This proves that
lim
m→∞
t
m
=

k=1
a
k
,
198 Chapter 5
i.e.,

n=1
b
n
=

n=1
a
n
.
That
_

n=1
b
n
converges absolutely follows formthe fact that ¦b
n
¦ is a rearrangement
of ¦a
n
¦ and the first part of this proof. B
Absolute convergence is also necessary in order to multiply two series. Consider
the product
_
¸
¸
¸
¸
¸
_

n=1
a
n
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_

k=1
b
k
_
¸
¸
¸
¸
¸
_
= (a
1
+ a
2
+ )(b
1
+ b
2
+ ).
This product seems to be a sum of all products a
n
b
k
, i.e.,

k,n=1
(a
n
b
k
),
except that these numbers to be added form a “matrix”, i.e., a set that can be
ordered in different ways.
Theorem 5.23 Suppose that
_

n=1
a
n
and
_

n=1
b
n
converge absolutely, and that c
n
is any sequence that contains all terms a
n
b
k
. Then

n=1
c
n
=
_
¸
¸
¸
¸
¸
_

n=1
a
n
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_

k=1
b
k
_
¸
¸
¸
¸
¸
_
.
Proof : Consider the sequence
s
n
=
_
¸
¸
¸
¸
¸
_

k=1
¦a
k
¦
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_

=1
¦b

¦
_
¸
¸
¸
¸
¸
_
.
This sequence converges as n → ∞ (by limit arithmetic), which implies that it is
a Cauchy sequence. For every > 0 there exists an N ∈ N, such that
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
_
n

k=1
¦a
k
¦
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_
n

=1
¦b

¦
_
¸
¸
¸
¸
¸
_

_
¸
¸
¸
¸
¸
_
m

k=1
¦a
k
¦
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_
m

=1
¦b

¦
_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
<

2
Sequences 199
for all m, n > N. By the comment above,
¸
¸
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
_
m

k=1
¦a
k
¦
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_
m

=1
¦b

¦
_
¸
¸
¸
¸
¸
_

_
¸
¸
¸
¸
¸
_

k=1
¦a
k
¦
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_

=1
¦b

¦
_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸


2
for all m > N. Thus,

k or >N
¦a
k
¦¦b

¦ ≤

2
< .
Let now (c
n
) be a sequence comprising of all pairs a
n
b

. There exists an M suffi-
ciently large, such that m > M implies that either k or is greater than N, where
c
m
= a
k
b

. I.e.,
M

n=1
c
n

_
¸
¸
¸
¸
¸
_
N

k=1
a
k
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_
N

=1
b

_
¸
¸
¸
¸
¸
_
only consists of terms a
k
b

with either k or greater than N, hence for all m > M,
¸
¸
¸
¸
¸
¸
¸
m

n=1
c
n

_
¸
¸
¸
¸
¸
_
N

k=1
a
k
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_
N

=1
b

_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸

k or >N
¦a
k
¦¦b

¦ < .
Letting N → ∞,
¸
¸
¸
¸
¸
¸
¸
m

n=1
c
n

_
¸
¸
¸
¸
¸
_

k=1
a
k
_
¸
¸
¸
¸
¸
_
_
¸
¸
¸
¸
¸
_

=1
b

_
¸
¸
¸
¸
¸
_
¸
¸
¸
¸
¸
¸
¸
≤ ,
for all m > M, which proves the desires result. B
200 Chapter 5

2

Contents
1 Real numbers 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 Axioms of field . . . . . . . . . . . . . . . . . . . . . . . . . . . Axioms of order (as taught in 2009) . . . . . . . . . . . . . . . . Axioms of order (as taught in 2010, 2011) . . . . . . . . . . . . . Absolute values . . . . . . . . . . . . . . . . . . . . . . . . . . . Special sets of numbers . . . . . . . . . . . . . . . . . . . . . . . The Archimedean property . . . . . . . . . . . . . . . . . . . . . Axiom of completeness . . . . . . . . . . . . . . . . . . . . . . . Rational powers . . . . . . . . . . . . . . . . . . . . . . . . . . . Real-valued powers . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 7 10 13 16 18 22 36 40 40 43 43 49 50 62 66 70 78 80 84

1.10 Addendum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Functions 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Limits and order . . . . . . . . . . . . . . . . . . . . . . . . . . . Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Theorems about continuous functions . . . . . . . . . . . . . . . Infinite limits and limits at infinity . . . . . . . . . . . . . . . . . Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . Uniform continuity . . . . . . . . . . . . . . . . . . . . . . . . .

ii 3 Derivatives 3.1 3.2 3.3 3.4 3.5 3.6 3.7 4

CONTENTS 91 91 98

Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rules of differentiation . . . . . . . . . . . . . . . . . . . . . . .

Another look at derivatives . . . . . . . . . . . . . . . . . . . . . 102 The derivative and extrema . . . . . . . . . . . . . . . . . . . . . 104 Derivatives of inverse functions . . . . . . . . . . . . . . . . . . . 117 Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 Taylor’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 123 133

Integration theory 4.1 4.2 4.3 4.4 4.5 4.6 4.7

Definition of the integral . . . . . . . . . . . . . . . . . . . . . . 133 Integration theorems . . . . . . . . . . . . . . . . . . . . . . . . 142 The fundamental theorem of calculus . . . . . . . . . . . . . . . . 153 Riemann sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 The trigonometric functions . . . . . . . . . . . . . . . . . . . . 158 The logarithm and the exponential . . . . . . . . . . . . . . . . . 163 Integration methods . . . . . . . . . . . . . . . . . . . . . . . . . 167 171

5

Sequences 5.1 5.2 5.3

Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 Limits of sequences . . . . . . . . . . . . . . . . . . . . . . . . . 172 Infinite series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

iii .CONTENTS Foreword 1. Hochman. Meizler. Texbooks: Spivak.

iv CONTENTS .

they were sometimes very vague about definitions and their theory often laid on shaky grounds. which was developed during more than two centuries. Some of their followers who will be mentioned along this course are Jakob Bernoulli (1654-1705). Joseph Liouville (1804-1882). 1. solve many differential equations. JeanGaston Darboux (1842-1917). The pioneers were Isaac Newton (1642-1737) and Gottfried Wilelm Leibniz (1646-1716). and Karl Weierstrass (1815-1897). they could integrate many functions.Chapter 1 Real numbers In this course we will cover the calculus of real univariate functions. notably by Joseph-Louis Lagrange (1736-1813). and as we are going to learn it in this course was mostly developed throughout the 19th century. Peter Gustav Lejeune-Dirichlet (1805-1859). Augustin Louis Cauchy (1789-1857). the theory of functions needed a thorough revision of the concept of real numbers. Georg Friedrich Bernhard Riemann (1826-1866). The sound theory of calculus as we know it today. a rigorous course of calculus should start by putting even such basic con- . we should mention Georg Cantor (1845-1918) and Richard Dedekind (1831-1916). As calculus was being established on firmer grounds. Yet.1 Axioms of field Even though we have been using “numbers” as an elementary notion since first grade. and sum up a large number of infinite series using a wealth of sophisticated analytical techniques. In this context. These first two generations of mathematicians developed most of the practice of calculus as we know it. Johann Bernoulli (1667-1748) and Leonhard Euler (1707-1783).

In formal notation. Chapter 1 The set of real numbers will be defined as an instance of a complete. and then add the sum to a. What about the addition of four elements. As such. a + b + c + d? Does it require a separate axiom of equivalence? We would like the following additions ((a + b) + c) + d (a + (b + c)) + d a + ((b + c) + d) a + (b + (c + d)) (a + b) + (c + d) to be equivalent. c ∈ F. there is no a-priori meaning to a + b + c. b ∈ F)(∃!c ∈ F) : (c = a + b). 1 . Likewise (although it requires some non-trivial work). the meaning is ambiguous. The first axiom states that in either case. The operations on elements of a field satisfy nine defining properties. b. as in a · b = ab). b ∈ F corresponds one and only one a + b ∈ F and a · b ∈ F. Addition is associative ( ): for all a. and an operation which we call multiplication and denote by · (or by nothing. and denote by +. A binary operation in F is a function that “takes” an ordered pair of elements in F and “returns” an element in F. or conversely. ordered field ( ). (a + b) + c = a + (b + c). as we could first add a + b and then add c to the sum. we can show that the n-fold addition a1 + a2 + · · · + an is defined unambiguously. to every a. In fact. 1. which we list now. add b + c. (∀a. That is. It is easy to see that the equivalence follows from the axiom of associativity. It should be noted that addition was defined as a binary operation. Definition 1.1 A field F ( ) is a non-empty set on which two binary operations are defined1 : an operation which we call addition. the result is the same.2 cept on axiomatic grounds.

This is not an assumption. (existence of inverse) (associativity) (property of inverse) (0 is the neutral element) It follows at once that every element a ∈ F has a unique additive inverse.2 With the first three axioms there are a few things we can show. that a+b=a+c implies b = c. Suppose there were a. 3. which justifies the notation (−a). Existence of an additive-neutral element ( exists an element 0 ∈ F such that for all a ∈ F a + 0 = 0 + a = a. b ∈ F such that a + b = a. for if b and c were both additive inverses of a. such that 3 ): there ): for all a ∈ F there exists an a + b = b + a = 0. Existence of an additive inverse ( element b ∈ F. 2 . The notation (−a) assumes implicitly that the inverse is unique. For example.Real numbers 2. then a + b = 0 = a + c. It is customary to denote this additive inverse by (−a). The proof requires all three axioms: a+b=a+c (−a) + (a + b) = (−a) + (a + c) ((−a) + a)) + b = ((−a) + a) + c 0+b=0+c b = c. Hence we can refer to the additive inverse of a. from which follows that b = c. but follows from the first three axioms. We can also show that there is a unique element in F satisfying the neutral property (it is not an a priori fact). Then adding (−a) (on the left!) and using the law of associativity we get that b = 0.

b. (given) (commutativity) (associativity) (property of additive inverse) (property of neutral element) . Subtraction is a short-hand notation for the addition of the additive inverse. There is no way we can prove it using only the first four axioms! Indeed. By themselves. our experience with numbers tells us that a − b = b − a implies that a = b. Multiplication is associative: for all a. all we obtain from it that a + (−b) = b + (−a) (a + (−b)) + (b + a) = (b + (−a)) + (a + b) a + ((−b) + b) + a = b + ((−a) + a) + b a+0+a=b+0+b a + a = b + b. but how can we deduce from that that a = b? With very little comments. 4. a · 1 = 1 · a = a. Existence of a multiplicative-neutral element: there exists an element 1 ∈ F such that for all a ∈ F. def Comment: A set satisfying the first three axioms is called a group ( ). b ∈ F a + b = b + a. a · (b · c) = (a · b) · c. we can state now the corresponding laws for multiplication: 5. Addition is commutative ( ): for all a. c ∈ F.4 Chapter 1 Comment: We only postulate the operations of addition and multiplication. 6. a − b = a + (−b). which can fill an entire course. On the other hand. those three axioms have many implications. With this law we finally obtain that any (finite!) summation of elements of F can be re-arranged in any order.

We would be done if we knew that 1 + 1 0. f } with the following properties. ): for every a 5 0 there The condition that a 0 has strong implications. a+a=b+b 1·a+1·a=1·b+1·b (1 + 1) · a = (1 + 1) · b (property of neutral element) (distributive law). the uniqueness of the multiplicative inverse has to be proved. We can now proceed. b ∈ F. for we would multiply both sides by (1 + 1)−1 . a · b = b · a. We can revisit now the a − b = b − a problem. c ∈ F. Existence of a multiplicative inverse ( ¯ exists an element a−1 ∈ F such that3 a · a−1 = a−1 · a = 1. this does not follow from the axioms of field! Example: Consider the following set F = {e. Multiplication is commutative: for every a. Distributive law ( ): for all a. For example. The last axiom relates the addition and multiplication operations: 9. without which there is no justification to the notation a−1 . Comment: Division is defined as multiplication by the inverse. a/b = a · b−1 . b.Real numbers 7. from the fact that a · b = a · c we can deduce that b = c only if a 0. However. . a · (b + c) = a · b + a · c. + e f 3 e e f f f e · e f e e e f e f Here again. def 8.

It would follows that for every a ∈ F. we will not bother to prove that a − b = −(b − a) (even though the proof is very easy). The only exception is (of course) if you get to prove an algebraic identity as an assignment. we rule it out and require the field to satisfy 0 1. Proof : Using sequentially the property of the neutral element and the distributive law: a · 0 = a · (0 + 0) = a · 0 + a · 0. i. a = a · 1 = a · 0 = 0. I Comment: Suppose it were the case that 1 = 0.. and yet e f . a · 0 = 0. every algebraic identity should be proved from the axioms of field.1 For every a ∈ F. In practice.6 Chapter 1 It takes some explicit verification to check that this is indeed a field (in fact the smallest possible field). that f − e = e − f implies that e = f . Comment: In principle. Thus. Note that e − f = e + (− f ) = e + f = f + (−e) = f − e. For example. we will assume henceforth that all known algebraic manipulations are valid. however a very boring one. Proposition 1. which is then no wonder that we can’t prove.e. that the same element of F is both the additive neutral and the multiplicative neutral. This is indeed a field according to the axioms. . based only on the field axioms. Adding to both sides −(a · 0) we obtain a · 0 = 0. with e being the additive neutral and f being the multiplicative neutral (do you recognize this field?). which means that 0 is the only element of F.

that a − 0 = a ∈ P. a3 − b3 = (a − b)(a2 + ab + b2 ). Either (i) a ∈ P. b ∈ P then a · b ∈ P. b ∈ P then a + b ∈ P. d 0 then a c ad + bc + = . Closure under multiplication: if a. such that the following three properties are satisfied: 1. Closure under addition: if a. We can then show a number of well-known properties of real numbers: . or (ii) (−a) ∈ P. or (iii) a = 0.1 Let F be a field.2 ( ) A field F is said to be ordered if it has a distinguished subset P ⊂ F. then b = 1. The property of being ordered can be formalized by three axioms: Definition 1. Comment: a > 0 means. Similarly. that Œ  Ž  If ab = a for some field element a 0. that 0 − a = (−a) ∈ P. by definition. Prove. which we call the positive elements. If b. 2. They also form an ordered set. a < 0 means. If c 0 then a/b = (ac)/(bc). by definition. We supplement these axioms with the following definitions: a<b a>b a≤b a≥b means means means means b−a∈ P b<a a < b or a = b a > b or a = b.Real numbers 7  Exercise 1. b d bd Justify every step in your proof! 1.2 Axioms of order (as taught in 2009) The real numbers contain much more structure than just being a field. based on the field axioms. 3. Trichotomy: every element a ∈ F satisfies one and only one of the following properties.

(−b) ∈ P. which is an axiom.4 (Transitivity) If a < b and b < c then a < c.8 Chapter 1 Proposition 1. we are asked to show that a − b satisfies the trichotomy.2 For every a. Proof : c−a= (c − b) + (b − a) . That is. Proof : (b + c) − (a + c) = b − a ∈ P. I Proposition 1. or (iii) that a − b = 0.3 If a < b then a + c < b + c. I Proposition 1. . We are asked to show that either (i) a − b ∈ P. or (iii) a = b. along with our definitions. or (ii) b − a = −(a − b) ∈ P. it follows that (−a)(−b) > 0. b ∈ F either (i) a < b. By the closure under multiplication. Proof : This is an immediate consequence of the trichotomy law. Proof : We have seen above that a. It remains to check that the axioms of field imply that ab = (−a)(−b).1 If a 0 then a · a ≡ a2 > 0. I Corollary 1.5 If a < 0 and b < 0 then ab > 0. or (ii) a > b. in P in P in P by closure under addition I Proposition 1. b < 0 is synonymous to (−a).

which implies that (−1) has to be positive.2 Let F be an ordered field. this is not within the scope of the present course) cannot be ordered.Real numbers 9 Proof : If a 0. or a < 0 in which case a2 > 0 follows from the previous proposition. Prove based on the axioms that Œ  Ž   ‘ n that a < b if and only if (−b) < (−a). which violates the previous corollary. If a > 1 then a2 > a. I Corollary 1.2 1 > 0. Proof : This follows from the fact that 1 = 12 . in P by closure under multiplication I  Exercise 1. Then. I Another corollary is that the field of complex numbers (yes. Proof : It is given that b − a ∈ P and c ∈ P. then a − c < b − d. hence 1 has to be negative. If 0 < a < 1 then a2 < a.6 If a < b and c > 0 then ac < bc.3 Prove using the axioms of ordered fields and using induction on . If a < b and c > d. Proposition 1. If 0 ≤ a < b then a2 < b2 . since i2 = (−1). If 0 ≤ a < b and 0 ≤ c < d then ac < bd. with a = b. then by the trichotomy either a > 0. bc − ac = (b − a) · in P c in P . in which case a2 > 0 follows from the closure under multiplication.  Exercise 1.

 If xn = yn and n is even then either x = y of x = −y.10 Œ If 0 ≤ x < y then xn < yn . with multiplicative neutral term 1. O3 Invariance under addition: if a < b then a + c < b + c for all c ∈ F.5 Show that there can be no ordered field that has a finite number 1. such that the following four properties are satisfied: Definition 1.  Conclude that xn = yn if and only if x = y or x = −y. Prove that (a + b)/2 is well defined for all a. or (iii) a = b. The property of being ordered can be formalized by four axioms: tion < (a relation is a property that every ordered pair of elements either satisfy or not). a< 2 of elements. then a+b < b. or (ii) b < a. Either (i) a < b. b ∈ F satisfies one and only one of the following properties.4 Let F be an ordered field. and that if a < b. b ∈ F. 2011) The real numbers contain much more structure than just being a field.  Exercise 1.3 Axioms of order (as taught in 2010. O4 Invariance under multiplication by positive numbers: if a < b then ac < bc for all 0 < c. O2 Transitivity: if a < b and b < c then a < c. They also form an ordered set.  Exercise 1. Chapter 1  If x < y and n is odd then xn < yn . Ž If xn = yn and n is odd then x = y.3 ( ) A field F is said to be ordered if there exists a rela- O1 Trichotomy: every pair of elements a. . we also denote 1 + 1 = 2.

b then 0<a+b and 0 < ab. We supplement these axioms with the following definitions: a>b a≤b a≥b means means means b<a a < b or a = b a > b or a = b.e.Real numbers 11 Comment: The set of numbers a for which 0 < a are called the positive numbers. Closure under multiplication follows from the fact that 0=0·a < b · a. Proof : Closure under addition follows from the fact that 0 < a implies that b=0+b < a + b. a number is positive if and only if its additive inverse is negative. namely If 0 < a. Proof : Suppose 0 < a.8 0 < a if and only if (−a) < 0. The set of numbers a for which a < 0 are called the negative numbers. 2010) (2 hrs. and b < a + b follows from transitivity. 2011) (neutral element) (O3) Proposition 1. Proposition 1. i.. (proved) (O4) I (2 hrs.7 The set of positive numbers is closed under addition and multiplication. then (−a) = 0 + (−a) < a + (−a) =0 (neutral element) (O3) (additive inverse) .

a < 0 which violates the trichotomy axiom. 1 = a · a−1 < a · 0 = 0. In the latter case 0 < (−1)..10 0 < a if and only if 0 < a−1 . Then for every 0 < a (and we know that at least one such exists). a number is positive if and only if its multiplicative inverse is also positive.e. i.12 Chapter 1 i. it follows by O1 that either 0 < 1 or 1 < 0. either 0 < 1..3 The set of positive numbers is not empty.. Conversely. Then.e. I Proposition 1. i. a = a · 1 < a · 0 = 0. hence either 1 or (−1) is positive. Hence 0 < 1.9 0 < 1. if (−a) < 0 then 0 = (−a) + a <0+a =a (additive inverse) (O3) (neutral element) I Corollary 1. by contradiction. or 1 < 0. (−a) < 0. Proof : Since by assumption 1 0. which is a contradiction. Proof : Let 0 < a and suppose by contradiction that a−1 < 0 (it can’t be zero because a−1 a = 1). I Proposition 1. Suppose then.e. The third possibility is ruled out by assumption. Proof : By trichotomy. that 1 < 0. I . or 0 = 1.

1.  a  a≥0  |a| =  (−a) a < 0. hence 1 has to be negative. then by the trichotomy either a > 0.12 (Triangle inequality ( |a + b| ≤ |a| + |b|. I Another corollary is that the field of complex numbers (which is not within the scope of the present course) cannot be ordered. Proposition 1.4 value ( Absolute values ). . since i2 = (−1).Real numbers 13 Proposition 1. I Corollary 1. or a < 0 in which case a2 > 0 follows from the previous proposition. in which case a2 > 0 follows from the closure under multiplication. and otherwise |a| > 0. It remains to check that the axioms of field imply that ab = (−a)(−b). By the closure under multiplication.4 For every element a in an ordered field F we define the absolute We see right away that |a| = 0 if and only if a = 0. Proof : We have seen above that a. )) For every a. which implies that (−1) has to be positive. it follows that (−a)(−b) > 0. (−b). a ≥ 0 and b ≥ 0. b < 0 implies that 0 < (−a). which is a violation.  Definition 1. with a = b. Proof : We need to examine four cases (by trichotomy): 1.4 If a 0 then a · a ≡ a2 > 0. Proof : If a 0.11 If a < 0 and b < 0 then ab > 0. b ∈ F.

and 0 < −(a + b). If a + b < 0. i. hence |a + b| = −(a + b) = (−a) + (−b) = |a| + |b|. then |a + b| = −(a + b) = (−a) + (−b) ≤ (−a) + b = |a| + |b|. |a| = (−a). |b| = (−b). We can divide this case into two sub-cases. a ≤ 0 and b ≥ 0.e. (Hey. and (−a) < a follows from transitivity. in which case |a + b| = a + b ≤ (−a) + b = |a| + |b|. We need to show that |a + b| ≤ (−a) + b. |a| = (−a) and |b| = b. which completes the proof4 . hence |a + b| = a + b = |a| + |b|. a > 0 =⇒ a + (−a) > 0 + (−a) =⇒ 0 > (−a). Either a + b ≥ 0. For example. a ≤ 0 and b ≤ 0. Let a ≤ 0 and b ≥ 0. This is very easy to show. Chapter 1 In the first case. I Comment: By replacing b by (−b) we obtain that |a − b| ≤ |a| + |b|. 3. a ≥ 0 and b ≤ 0. 4.14 2. The following version of the triangle inequality is also useful: We have used the fact that a ≥ 0 implies (−a) ≤ a and a ≤ 0 implies a ≤ (−a).. |a| = a and |b| = b. 4 . does this mean that |a + b| ≤ |a| + |b|?) In the fourth case. as the third case follows by interchanging the roles of a and b. It remains to show the second case.

y) = 1 (x + y − |x − y|) . By interchanging the roles of a and b. b ∈ F. 2009)  Exercise 1. c ∈ F.14 For every a. . Show that max(x. c ∈ F. It follows that5 ||a| − |b|| ≤ |a + b|. |a| − |b| ≤ |a + b|. or |b| − |a| ≤ |b + a|. I (2 hrs. y) and max(x. b. |a + b + c| = |a + (b + c)| ≤ |a| + |b + c| ≤ |a| + |b| + |c|. ||a| − |b|| ≤ |a + b|. respectively. Proof : By the triangle inequality.Real numbers 15 Proposition 1. |a + b + c| ≤ |a| + |b| + |c|. ∀a. Proof : Use the triangle inequality twice. in which case |b| ≤ |a| + |b + a|. y) = 5 1 (x + y + |x − y|) 2 and min(x.13 (Reverse triangle inequality) For every a. Set c = b + a. I Proposition 1. 2 Here we used the fact that any property satisfied by a and (−a) is also satisfied by |a|. |a − c| ≤ |a| + |c|.6 Let min(x. y) denote the smallest and the largest of the two arguments.

which we denote by Q. 4. 1 + 1 + 1. and naming the numbers 1 + 1. then 1 + 1 = (1 + 1) · 1 < (1 + 1) · (1 + 1)−1 = 1. there are no additive inverses because all the numbers are positive). we call this set S modulo the equivalence relation ∼. We can then augment the set of natural numbers by adding the numbers 0. We denote the set of natural numbers by N. such as 2. n 0. etc.5 Special sets of numbers What kind of sets can have the properties of an ordered field? We start by introducing the natural numbers ( ). which is {b : b ∼ a}. which is a violation of the axioms of order. with every element a ∈ S we can associate an equivalence class ( ). and may represent them using a decimal system is immaterial. etc. This forms the set of integers ( ). 6 How do we know that (1 + 1)−1 . b have this property. That we give them names. 3. m/n.16 Chapter 1 1. . is not an integer? We know that it is positive and non-zero. reflexive (a ∼ a for all a) and transitive (a ∼ b and b ∼ c implies a ∼ c). The set of numbers can be further augmented by adding all integer quotients. The integers form a commutative group with respect to addition. Suppose that it were true that 1 < (1 + 1)−1 . An equivalence relation partitions S into a collection of disjoint equivalence classes. −1. 7 A few words about equivalence relations ( ). we say that a is equivalent to b. An equivalence relation has to be symmetric (a ∼ b implies b ∼ a). That these numbers are all distinct follows from the axioms of order (each number is greater than its predecessor because 0 < 1). It should be noted that the rational numbers are the set of integer quotients modulo an equivalence relation7 . An equivalence relation on a set S is a property that any two elements either have or don’t. Thus. 1 + 1 + 1 + 1. Clearly the natural numbers do not form a field (for example. for example. A rational number has infinitely many representations as the quotient of two integers. −(1 + 1 + 1). by taking the element 1 (which exists by the axioms). Every element belongs to one and only one equivalence class. This forms the set of rational numbers. which we denote by Z. −(1 + 1). but are not a field since multiplicative inverses do not exist6 . and denote it by a ∼ b. If two elements a. and denote it S / ∼.

Real numbers

17

Proposition 1.15 Let b, d
a c = b d

0. Then if and only if ad = bc.

Proof : Multiply/divide both sides by bd (which cannot be zero if b, d It can be checked that the rational numbers form an ordered field. (4 hrs, 2010) (4 hrs, 2011)

0).

I

In this course we will not teach mathematical induction, nor inductive definitions; we will assume these concepts to be understood. As an example of an inductive definition, we define for all a ∈ Q and n ∈ N, the n-th power, by setting, a1 = a and ak+1 = a · ak .

As an example of an inductive proof, consider the following proposition:

Proposition 1.16 Let a, b > 0 and n ∈ N. Then
a>b if and only if an > bn .

Proof : We proceed by induction. This is certainly true for n = 1. Suppose this were true for n = k. Then a>b implies ak > bk implies ak+1 = a · ak > a · bk > b · bk = bk+1 .

Conversely, if ak+1 > bk+1 then a ≤ b would imply that ak+1 ≤ bk+1 , hence a > b. I Another inequality, which we will need occasionally is:

Proposition 1.17 (Bernoulli inequality) For all x > (−1) and n ∈ N,
(1 + x)n ≥ 1 + nx.

18

Chapter 1

Proof : The proof is by induction. For n = 1 both sides are equal. Suppose this were true for n = k. Then, (1 + x)k+1 = (1 + x)(1 + x)k ≥ (1 + x)(1 + kx) = 1 + (k + 1)x + kx2 ≥ 1 + (k + 1)x, where in the left-most inequality we have used explicitly the fact that 1 + x > 0. I 

Exercise 1.7 The following exercise deal with logical assertions, and is in
preparation to the type of assertions we will be dealing with all along this course. For each of the following statements write its negation in hebrew, without using the word “no”. Then write both the statement and its negation, using logical quantifiers like ∀ and ∃. Œ For every integer n, n2 ≥ n holds.  There exists a natural number M such that n < M for all natural numbers n. Ž For every integer n, m, either n ≥ m or −n ≥ −m.  For every natural number n there exist natural numbers a, b,, such that b < n, 1 < a and n = ab. (3 hrs, 2009)

1.6

The Archimedean property

We are aiming at constructing an entity that matches our notions and experiences associated with numbers. Thus far, we defined an ordered field, and showed that it must contain a set, which we called the integers, along with all numbers expressible at ratios of integers—the rational numbers Q. Whether it contains additional elements is left for the moment unanswered. The point is that Q already has the property of being an ordered field. Before proceeding, we will need the following definitions:

Definition 1.5 A set of elements A ⊂ F is said to be bounded from above (
) if there exists an element M ∈ F, such that a ≤ M for all a ∈ A, i.e., (∃M ∈ F) : (∀a ∈ A)(a ≤ M).

Real numbers Such an element M is called an upper bound ( bounded from below ( ) if (∃m ∈ F) : (∀a ∈ A)(a ≥ m).

19 ) for A. A is said to be

Such an element m is called a lower bound ( ) for A. A set is said to be bounded ( ) if it is bounded both from above and below. Note that if a set is upper bounded, then the upper bound is not unique, for if M is an upper bound, so are M + 1, M + 2, and so on.

Proposition 1.18 A set A is bounded if and only if
(∃M ∈ F) : (∀a ∈ A)(|a| ≤ M).

Proof : 8 Suppose A is bounded, then (∃M1 ∈ F) : (∀a ∈ A)(a ≤ M1 ) (∃M2 ∈ F) : (∀a ∈ A)(a ≥ M2 ). That is, (∃M1 ∈ F) : (∀a ∈ A)(a ≤ M1 ) (∃M2 ∈ F) : (∀a ∈ A)((−a) ≤ (−M2 )). By taking M = max(M1 , −M2 ) we obtain (∀a ∈ A)(a ≤ M ∧ (−a) ≤ M), i.e., (∀a ∈ A)(|a| ≤ M). The other direction isn’t much different.
8

I

Once and for all: P if Q means that “if Q holds then P holds”, or Q implies P. P only if Q means that “if Q does not hold, then P does not hold either”, which implies that P implies Q. Thus, “if and only if” means that each one implies the other.

20

Chapter 1

A few notational conventions: for a < b we use the following notations for an open ( ), closed ( ), and semi-open ( ) segment, (a, b) = {x ∈ F : [a, b] = {x ∈ F : (a, b] = {x ∈ F : [a, b) = {x ∈ F : a< a≤ a< a≤ x < b} x ≤ b} x ≤ b} x < b}.

Each of these sets is bounded (above and below). We also use the following notations for sets that are bounded on one side, (a, ∞) = {x ∈ F : [a, ∞) = {x ∈ F : (−∞, a) = {x ∈ F : (−∞, a] = {x ∈ F : x > a} x ≥ a} x < a} x ≤ a}.

The first two sets are bounded from below, whereas the last two are bounded from above. It should be emphasized that ±∞ are not members of F! The above is purely a notation. Thus, “being less than infinity” does not mean being less than a field element called infinity. (5 hrs, 2011) Consider now the set of natural numbers, N. Does it have an upper bound? Our “geometrical picture” of the number axis clearly indicates that this set is unbounded, but can we prove it? We can easily prove the following:

Proposition 1.19 The natural numbers are not upper bounded by a natural number. Proof : Suppose n ∈ N were an upper bound for N, i.e., (∀k ∈ N) : (k ≤ n). However, m = n + 1 ∈ N and m > n, i.e., (∃m ∈ N) : (n < m), which is a contradiction. Similarly, I

Real numbers

21

Corollary 1.5 The natural numbers are not bounded from above by any rational
number

Proof : Suppose that p/q ∈ Q was an upper bound for N. Since, by the axioms of order, p/q ≤ p, it would imply the existence of a natural number p that is upper bound for N.9 I (5 hrs, 2010) It may however be possible that an irrational element in F be an upper bound for N. Can we rule out this possibility? It turns out that this cannot be proved from the axioms of an ordered field. Since, however, the boundedness of the naturals is so basic to our intuition, we may impose it as an additional axiom, known as the axiom of the Archimedean field: the set of natural numbers is unbounded from above10 . This axiom has a number of immediate consequences:

Proposition 1.20 In an Archimedean ordered field F:
(∀a ∈ F)(∃n ∈ N) : (a < n).

Proof : If this weren’t the case, then N would be bounded. Indeed, the negation of the proposition is (∃a ∈ F) : (∀n ∈ N)(n ≤ a), i.e., a is an upper bound for N. I

Corollary 1.6 In an Archimedean ordered field F:
(∀ > 0)(∃n ∈ N) : (1/n < ).

Convince yourself that p/q ≤ p for all p, q ∈ N. This additional assumption is temporary. The Archimedean property will be provable once we add our last axiom.
10

9

22

Chapter 1

Proof : By the previous proposition, (∀ > 0)(∃n ∈ N) : (0 < 1/ < n) .
(0<1/n< )

I

Corollary 1.7 In an Archimedean ordered field F:
(∀x, y > 0)(∃n ∈ N) : (y < nx).

Comment: This is really what is meant by the Archimedean property. For every
x, y > 0, a segment of length y can be covered by a finite number of segments of length x.
y x x x x x x

Proof : By the previous corollary with = y/x, (∀x, y > 0)(∃n ∈ N) : (y/x < n) .
(y<nx)

I

1.7

Axiom of completeness

The Greeks knew already that the field of rational numbers is “incomplete”, in the sense that there is no rational number whose square equals 2 (whereas they knew by Pythagoras’ theorem that this should be the length of the diagonal of a unit square). In fact, let’s prove it:

Proposition 1.21 There is no r ∈ Q such that r2 = 2.

Real numbers

23

Proof : Suppose, by contradiction, that r = n/m, where n, m ∈ N (we can assume that r is positive) satisfies r2 = 2. Although it requires some number theoretical knowledge, we will assume that it is known that any rational number can be brought into a form where m and n have no common divisor. By assumption, m2 /n2 = 2, i.e., m2 = 2n2 . This means that m2 is even, from which follows that m is even, hence m = 2k for some k ∈ N. Hence, 4k2 = 2n2 , or n2 = 2k2 , from which follows that n is even, contradicting the assumption that m and n have no common divisor. I Proof : Another proof: suppose again that r = n/m is irreducible and r2 = 2. Since, n2 = 2m2 , then m2 < n2 < 4m2 , it follows that m < n < 2m < 2n, and 0<n−m<m Consider now the ratio q= 2m − n , n−m and 0 < 2m − n < n.

whose numerator and denominator are both smaller than that of r. By elementary arithmetic q2 = 4m2 − 4mn + n2 4 − 4n/m + n2 /m2 6 − 4n/m = 2 2 = 2, = n2 − 2nm + m2 n /m − 2n/m + 1 3 − 2n/m I

which is a contradiction. 

Exercise 1.8 Prove that there is no r ∈ Q such that r2 = 3.  Exercise 1.9 What fails if we try to apply the same arguments to prove that
there is no r ∈ Q such that r2 = 4 (an assertion that happens to be false)? √ A way to cope with this “missing number” would be to add 2 “by hand” to the set of rational numbers, along with all the numbers obtained by field operations √ involving 2 and rational numbers (this is called in algebra a field extension)

. This observation motivates the following definition: Definition 1. known as the axiom of completeness ( ¯ ). what about 3? We could √ all the square roots of add 3 all positive rational numbers. B ⊆ F satisfying (∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). what about a solution to the equation x5 + x + 1 = 0 (it cannot be expressed in terms of roots as a result of Galois’ theory)? It turns out that a single additional axiom. Does there exist a rational number that “separates” the two sets.6 An ordered field F is said to be complete if for every two nonempty sets. We next introduce more definitions. A = {x ∈ Q : 0 < x. Let us try to look in more detail in what sense is the field of rational numbers “incomplete”.24 Chapter 1 √ ( )11 .e. (∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). x2 ≤ 2} B = {y ∈ Q : 0 < y. And then. We will soon see how this axiom “completes” the field Q. Every element in B is greater or equal than every element in A (by transitivity and by the fact that 0 < x ≤ y implies x2 ≤ y2 ). completes the set of rational numbers in one fell swoop. But then.. i. b ∈ Q}. 2 ≤ y2 }. there exists a c ∈ F such that (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b). hence such a c does not exist. what about 2? Shall we add all n-th roots? But then. such to provide solutions to all those questions. Formally. does there exist an element c ∈ Q such that (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b) ? It can be shown that if there existed such a c it would satisfy c2 = 2. Consider the following two sets. A. Let’s start with a motivating example: 11 It can be shown that this field extension of Q consists of all elements of the form √ {a + b 2 : a.

g. I We call the least upper bound of a set (if it exists!) a supremum. if it exists. (∀a ∈ A)(a ≤ M) (∀M ∈ F)( if (∀a ∈ A)(a ≤ M ) then (M ≤ M )). it is clear that b is the least upper bound.7 Let A ⊆ F be a set.Real numbers 25 Example: For any set [a. hence M ≤ M and M ≤ M. e. 2011) We can provide an equivalent definition of the least upper bound: Proposition 1. and it is denoted by inf A. and (ii) (∀ > 0)(∃a ∈ A) : (a > M − ). If M and M are both least upper bounds.. and denote it by sup A. is unique: Proposition 1. Definition 1. An element M ∈ F is called a least upper bound ( ) for A if (i) it is an upper bound for A. We can see right away that a least upper bound. In fact. b is an upper bound. In some books the notations lub (least upper bound) and glb (greatest lower bound) are used instead of sup and inf. (7 hrs. If M is also a least upper bound for A then M = M .22 Let A be a set and let M be a least upper bound for A. 12 . but so is any larger element. Proof : It follows immediately from the definition of the least upper bound.23 Let A be a set. which implies that M = M . That is. b + 1. Similarly. and (ii) if M is also an upper bound for A then M ≤ M . b) = {x : a ≤ x < b}. the greatest lower bound ( ) of a set is called the 12 infimum. A number M is a least upper bound if and only if (i) it is an upper bound. then both are in particular upper bounds.

or in formal notation. 2. which is a contradiction. a ≤ M = M − for all a ∈ A. which is a contradiction. Then there exists a smaller upper bound M < M. (∀a ∈ A)(a ≤ M) and (∀M ∈ F)(if (∀a ∈ A)(a ≤ M ) then (M ≤ M )).. that (∃ > 0) : (∀a ∈ A)(a < M − ). Suppose first that M were a least upper bound for A. suppose that M is an upper bound and (∀ > 0)(∃a ∈ A) : (a > M − ). Suppose. 15] (answer: 15). smaller than the least upper bound. that there exists an > 0 such that a ≤ M − for all a ∈ A. Take = M − M . .26 Chapter 1 A a M !! Proof : There are two directions to be proved. i. Suppose that M was not a least upper bound. 1. then M − is an upper bound for A.. i. (∃ > 0) : (∀a ∈ A)(a < M − ). Conversely. Then. I Examples: What are the least upper bounds (if they exist) in the following examples: Œ [−5.e.e. by contradiction.

and A ⊆ F. ∞) (answer: none).11 Show that the axiom of completeness is equivalent to the following statement: for every two non-empty sets A. 15] ∪ {20} (answer: 20). We define A + B = {a + b : a ∈ A. {1/n + (−1)n : n ∈ N}. −1/2. 1/2}. there exists an element c ∈ F.Real numbers  Ž   ‘ [−5. and if they are find their infimum and/or supremum: Œ  Ž   ‘ {(−2)n : n ∈ N}. [−5. [−5.  A = B = [0. determine whether it is upper bounded. Prove that M = inf A (∀ > 0)(∃a ∈ A) : (a < M + ). [−5. 2. B be non-empty sets of real numbers.10 Let F be an ordered field. 3} and B = {−1. 1]. {1/n : n ∈ Z. A · B and A · A in each of the following cases: Œ A = {1.12 For each of the following sets. b ∈ B} A · B = {a · b : a ∈ A. and ∅ if and only if M is a lower bound for A. {1 + (−1)n : n ∈ N}.  Exercise 1. satisfying that a ∈ A and b ∈ B implies a ≤ b. 15] ∪ (17. Find A + B. b ∈ B} A − B = {a − b : a ∈ A. lower bounded.13 Let A. 15) (answer: 15). . n 0}. {n/(2n + 1) : n ∈ N}. 27  Exercise 1.  Exercise 1. {x : |x2 − 1| < 3}. −2. b ∈ B}. 18) (answer: 18). B ⊂ F. {1 − 1/n : n ∈ N} (answer: 1). such that a≤c≤b for all a∈A and b ∈ B.  Exercise 1.

Ž Show that if A. Œ Suppose that A is lower bounded and set B = {x : −x ∈ A}. We denote m = min A.14 Let A be a non-empty set of real numbers. and sup B = − inf A..  Show that if A is upper bounded and B is lower bounded. if A has a minimum. A. i.28 Chapter 1  Exercise 1.  Suppose that A. Definition 1. x ≥ 0. B are upper bounded.  Exercise 1. Ž ∀x ∈ A · A.  A − A = {0}. inf A = min A. that A has an infimum. B are lower bounded then inf(A + B) = inf A + inf B. then sup(A · B) = sup A · sup B. B only contain non-negative terms and are upper bounded.24 If a set A has a maximum. It is said to have a minimum if there exists an element m ∈ A which is a lower bound for A. then the maximum is the least upper bound. Similarly. 2010) Proposition 1. For the moment. Prove that B is upper bounded. you can only use the axiom of completeness for the least upper bound. Is each of the following statements necessarity true? Prove it or give a counter example: Œ A + A = {2} · A. (7 hrs. It is necessarily true that sup(A · B) = sup A · sup B?  Show that if A.8 Let A ⊂ F be a subset of an ordered field. We denote M = max A. then sup(A − B) = sup A − inf B.15 In each of the following items. B are non-empty subsets of a complete ordered field.e. sup A = max A. . It is said to have a maximum if there exists an element M ∈ A which is an upper bound for A. then the minimum is the greatest lower bound.

[−5. The statement is that if a maximum exists. and for every > 0. 2009) Proposition 1. is an upper bound. for example. which proves that M is the least upper bound. A M>M− .8 Every finite set in an ordered field has a least upper bound and a greatest lower bound. 15] ∪ (17. I Corollary 1. [−5. We emphasize that not every (non-empty) set has a maximum. as 2. 15) (answer: none). [−5. [−5. that is (∀ > 0)(∃a ∈ A)(a > M − ). Consider now the field Q of rational numbers. {1 − 1/n : n ∈ N} (answer: none). then x 2 < 2 < 22 . if x ∈ A.13 Then M is an upper bound for A. (5 hrs. Indeed. then it is also the supremum. 15] ∪ {20} (answer: 20). I Examples: What are the maxima (if they exist) in the following examples: Œ  Ž   ‘ [−5. x2 < 2}. This set is bounded from above.25 Every finite set in an ordered field has a minimum and a maximum. 15] (answer: 15). 13 . ∞) (answer: none).Real numbers 29 Proof : Let M = max A. Proof : The proof is by induction on the size of the set. 18) (answer: none). and its subset A = {x ∈ Q : 0 < x.

(use 1/n < α) (use α < 2) (use 6/n < 2 − α2 ) . Relying on the Archimedean property.30 which implies that x < 2. Suppose. by contradiction. . such that n > max 1 6 . such that r2 < 2. Clearly. α 2 − α2 We are going to show that there exists a rational number r > α. Proof : We assume that A has a least upper bound.26 Suppose that A has a least upper bound. Chapter 1 A 0 But does A have a least upper bound? an upper bound 1 2 Proposition 1. then (sup A)2 = 2. which means that α is not even an upper bound for A—contradiction: 2 1 (α + 1/n)2 = (α)2 + α + 2 n n 3 < α2 + α n 6 < α2 + n 2 < α + 2 − α2 = 2. we choose n to be an integer large enough. which means that α + 1/n ∈ A. that α2 < 2. 1 < α < 2. and denote α = sup A.

We will now see that completeness guarantees the existence of a lowest upper bound: Proposition 1.Real numbers Similarly. α − 1/n is an upper bound for A. indeed. 2 (omit 1/n2 ) (use n/2α > α2 − 2) which means that. suppose that α2 > 2.27 In a complete ordered field every non-empty set that is upperbounded has a least upper bound. hence c = sup A. By the axiom of completeness. Clearly. Clearly c is an upper bound for A smaller or equal than every other upper bound. it follows that there is no supremum to this set in Q. I (8 hrs. Define B = {b ∈ F : b is an upper bound for A}. a contradiction: 2 1 (α − 1/n)2 = α2 − α + 2 n n 2 > α2 − α n > α2 + 2 − α2 = 2. which by assumption is non-empty. This time we set n> α2 2α . such that r2 > 2. (∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). −2 31 We are going to show that there exists a rational number r < α. (∃c ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ c ≤ b). I Since we saw that there is no r ∈ Q such that r = 2. 2010) The following proposition is an immediate consequence: . Proof : Let A ⊂ F be a non-empty set that is upper-bounded. which means that r is an upper bound for A smaller than α—again.

to read about it.32 Chapter 1 Proposition 1. Proof : Let m be a lower bound for A. It turns out that this defines the set uniquely.. and (∀ > 0)(∃a ∈ A) : (a < (−M) + ). I and (∀ > 0)(∃a ∈ A) : ((−a) > M − ).29 (Archimedean property) N is not bounded from above. In other words. You are strongly recommended. which we denote by R. which we denote by M.28 In a complete ordered field every non-empty set that is bounded from below has a greatest lower bound (an infimum). 14 The axiom of completeness can be used to prove the Archimedean property. By the axiom of completeness. Consider the set B = {(−a) : a ∈ A} . we define the set of real numbers as a complete ordered field. Since (∀a ∈ A)((−m) ≥ (−a)). i. up to an isomorphism).e. up to a relabeling of its elements (i. Proposition 1. however. Strictly speaking. With that.e. the axiom of completeness solves at the same time the “problems” pointed out in the previous section. See for example the construction of Dedekind cuts. 14 . (∀a ∈ A)(m ≤ a). it follows that (−m) is a upper bound for B. Thus. we should prove that such an animal exists.. Constructing the set of real numbers from the set of rational numbers is beyond the scope of this course. (∀a ∈ A)(M ≥ (−a)) This means that (∀a ∈ A)((−M) ≤ a) This means that (−M) = inf A. B has a least upper bound.

Proof : Since the natural numbers are not bounded. Proposition 1.30 (The rational numbers are dense in the reals) Let x. Consider now the sequence of rational numbers. there exists an m ∈ Z ⊂ Q such that m < x. y ∈ R. k rk = m + . (∃M ∈ R) : (∀n ∈ N) (n ≤ M). y ∈ R.e. n k = 0.e. .. . Then it would have a least upper bound M. (∀n ∈ N) ((n + 1) ≤ M). contradicting the fact that M is the least upper bound. n Let k be the smallest such number (the exists a smallest k because every finite set has a minimum).Real numbers Proof : Suppose N was bounded. In formal notation. x < y)(∃q ∈ Q) : (x < q < y). 1. I (9 hrs. i. (∀x. . 33 This means that M − 1 is an upper bound. ) of rational numbers such that x < y. From the Archimedean property follows that there exists a k such that rk = m + k > x. 2. . (∀n ∈ N) (n ≤ (M − 1)). . Also there exists a natural number n ∈ N such that 1/n < y − x. i. rk > x and rk−1 ≤ x. There exists a rational number q ∈ Q such that x < q < y. Since n ∈ N implies n + 1 ∈ N.. 2011) We next prove an important facts about the density ( within the reals.

b ∈ B.34 Hence. . Ž (∀ > 0)(∃a ∈ A. then w) reach a rational point between x and y. Chapter 1 x < rk = rk−1 + which completes the proof. then any number m in between would satisfy a < m < b for all a ∈ A. Then the following three statements are equivalent: Œ (∃!M ∈ F) : (∀a ∈ A ∧ ∀b ∈ B)(a ≤ M ≤ b). Proof : Suppose that the first statement holds. The claim is that each statement implies the two other. n n 1/" q If we start #om an integer poin$ on the le% of x and proceed by rational steps of size less than &y'x(. Proposition 1. We have already seen that sup A is a lower bound for B. contradicting the assumption. such that )) Let A. 1 1 ≤ x + < y. This implies at once that sup A ≤ inf B. If sup A < inf B. Comment: The claim is not that the three items follow from the given data. ! x y I The following proposition provides a useful fact about complete ordered fields.  sup A = inf B.31 (( ¯ plete ordered field. B ⊂ F be two non-empty sets in a com(∀a ∈ A ∧ ∀b ∈ B)(a ≤ b). b ∈ B) : (b − a < ). hence x implies y. We will use it later in the course.

y implies z. where a.. b ∈ B) (a ≤ m − /2.  Show that the set √ {a + b 2 : a. where a. d ∈ Q. then b = 0.  Show that the irrational numbers are dense. (7 hrs. Then.e. 2009) √ I  Exercise 1. It remains to show that z implies x. √ Œ Show that if a + b 2 = 0. c. b−a≥(m+ /2)−(m− /2)= i. i. √ √  Conclude that a + b 2 = c + d 2. Thus. 35 i. . there were M1 < M2 such that (∀a ∈ A ∧ ∀b ∈ B)(a ≤ M1 < M2 ≤ b). Set m = (M1 + M2 )/2 and = (M2 − M1 ). then a = b = 0...16 In this exercise you may assume that 2 is irrational. b. (∀ > 0)(∃a ∈ A) : (a > sup A − /2) (∀ > 0)(∃b ∈ B) : (b < inf B + /2). b ∈ Q. (∀ > 0)(∃a ∈ A. b ∈ Q} with the standard addition and multiplication operations is a field.e. (∀ > 0)(∃a ∈ A) : (a + /2 > sup A) (∀ > 0)(∃b ∈ B) : (b − /2 < inf B). where a. √ Ž Show that if a + b 2 ∈ Q. (∀a ∈ A. b ∈ Q. This concludes the proof. b ≥ m + /2). b ∈ B) such that b − a < (inf B + /2) − (sup A − /2) = . That is. if x does not hold then z does not hold.e. then a = c and b = d. namely that there exists an irrational number between every two. Suppose M was not unique.Real numbers By the properties of the infimum and the supremum.

such that a < t < s. or none of the above: Œ  Ž   a ≤ s for all a ∈ A. n ∈ Z. We defined the integer powers recursively. n ∈ N} is dense. B ⊆ R be non-empty sets. For every finite subset B ⊂ A.32 (Properties of integer powers) Let x.18 Let A. Determine for each of the following statements whether it is equivalent. s is the supremum of A ∪ {s}.8 Rational powers x1 = x xk+1 = x · xk . Prove or disprove each of the following statements: Œ If A ⊆ B and B is bounded from above then A is bounded from above. Proposition 1. or implying the statement that s = sup A. For every a ∈ A there exists a r ∈ R.  Exercise 1.36 Chapter 1  Exercise 1.  Exercise 1. Ž If every b ∈ B is an upper bound for A then every a ∈ A is a lower bound form B. y 0 and m. implied by. We also define for x 0. x0 = 1 and x−n = 1/xn . .  A is bounded from above if and only if A ∩ Z is bounded from above. 1. For every x ∈ R such that x > s there exists an a ∈ A such that s < a < x. contradictory. s ≥ max B.19 Let A ⊆ R be a non-empty set. then Œ xm xn = xm+n .  If A ⊆ B and B is bounded from below then A is bounded from below.17 Show that the set {m/2n : m ∈ Z.

and the latter is hence an upper bound for S . An n-th root of x is a positive number y. (10 hrs. . we denote it by either n x. z ∈ S implies that z ≤ max(1.9 Let x > 0. which implies that z ≤ 1 if x > 1 and z ∈ S then zn < x.Real numbers 37  Ž   ‘ ’ (xm )n = xmn . It follows that regardless of the value of x. x). This is a non-empty set (it contains zero) and bounded from above. would it imply that y = z). By the axiom of completeness. Thus. which implies that z ≤ x. there exists a unique y = sup S . 2010) Proof : Consider the set S = {z ≥ 0 : zn < x}. 0 < α < β and n < 0 implies αn > βn . (xy)n = xn yn . 0 < α < β and n > 0 implies αn < βn . 2011) I Definition 1. since if x ≤ 1 and z ∈ S then zn < x ≤ 1. Proof : Do it. Theorem 1. we will show that yn = x. or by x1/n . It is easy to see that if x has an n-th root then √ is unique (since yn = zn .1 (Existence and uniqueness of roots) (∀x > 0 ∧ ∀n ∈ N)(∃!y > 0) : (yn = x). (10 hrs. Indeed. such that yn = x. α > 1 and n > m implies αn > αm . 0 < α < 1 and n > m implies αn < αm . which is the “natural suspect” for being an n-th root of x.

Chapter 1 0< hence x < 1. where we used the Bernoulli inequality. and yn < x(1 − n) ≤ x(1 − )n . Thus. by contradiction that yn < x. by contradiction that yn > x. yn − x . 1+x x n x < < x. nyn < yn − x. we found a number less than y. whose n-th power is greater than x. nx < x − yn .. This contradicts the assumption that y is an upper bound for S . we found a number greater than y. x − yn . Claim: yn ≤ x Suppose. in S . Set 0< < min 1. whose n-th power is less than x. which implies that y > 0. i. nyn . < min 1.e. it is an upper bound for S . This finally implies that y 1− n < x. Claim: yn ≥ x Suppose. i. where we used once more the Bernoulli inequality. Thus. 1+x 1+x Thus. This contradicts the assumption that y is the least upper bound for S . and [(1 − )y]n = (1 − )n yn ≥ (1 − n)yn > x. nx Then..38 Claim: y is positive Indeed.e. x/(1 + x) ∈ S . Set 0< Then.

b. d 0. (8 hrs. α > 1 and r > s implies αr > α s .Real numbers From the two inequalities follows (by trichotomy) that yn = x. by saying that a rational number is even if its numerator is even. This is a non-trivial fact. We thus need to show that the above definition is independent of the representation15 . The arguments goes as follows: (xa/b )bd = (((xa )1/b )b )d = (xa )d = xad = xbc = (xc )b = (((xc )1/d )d )b = (xc/d )bd . c. Such as definition would imply that 2/6 is even. Suppose we wanted to define the notion of even/odd for rational numbers. Definition 1. y > 0 and r. 2009) Proposition 1. (xr ) s = xrs . 39 I Having shown that every positive number has a unique n-th root. (xy)r = xr yr . we may proceed to define rational powers. We need to prove that ad = bc implies that sup{z > 0 : zb < xa } = sup{z > 0 : zd < xc }. Recall that rational numbers do not have a unique representation as the ratio of two integers. hence xa/b = xc/d . 0 < α < β and r < 0 implies αr > βr . but 1/3 (which is the same number!) is odd. then Œ  Ž   ‘ ’ 15 xr x s = xr+s . s ∈ Q. This is not obvious. In other words. d ∈ Z. . then for all x > 0 xr = (xm )1/n . with a.10 Let r = m/n ∈ Q. b. There is one little delicacy with this definition. if ad = bc. then for all x > 0 xa/b = xc/d . 0 < α < 1 and r > s implies αr < α s . 0 < α < β and r > 0 implies αr < βr .33 (Properties of rational powers) Let x.

We will give one example. Let r = a/b.10 Addendum Existence of non-Archimedean ordered fields Does there exist ordered fields that do not satisfy the Archimedean property? The answer is positive. Start with the first. from which follows that xa/b xc/d = xa/b+c/d . y ∈ A and x < z < y implies z ∈ A (that is. (xa/b xc/d )bd = (xa/b )bd (xc/d )bd = · · · = xad xbc = xad+bc = (xa/b+c/d )bd . The ”natural elements” are the constant functions. Thus it remains to show α1/b < β1/b . This has to be because α1/b ≥ β1/b would have implied (from the rules for integer powers!) that α ≥ β.9 Real-valued powers Delayed to much later in the course. every point between two elements in the set is . etc. with f (x) ≡ 0 for additive neutral element and f (x) ≡ 1 for multiplicative neutral element. say. It is easy to see that this set forms a field with respect to function addition and multiplication. Connected set A set A ⊆ R will be said to be connected ( ) if x. We already know that αa < βa . We hace f > g if f (x) > g(x) for x sufficiently large. Let r = a/b and s = c/d. f (x) ≡ 2.40 Chapter 1 Proof : We are going to prove only two items. by the element f (x) = x (which obviously belongs to the set). Consider the set of rational functions. Using the (proved!) laws for integer powers. bearing in mind that some of the arguments rely on later material. I 1. 1. It is easy to see that the set of “natural elements” are bounded. f (x) ≡ 1. Ignore the fact that a rational function may be undefined at a finite collection of points. Take then the fourth item. We endow this set with an order relation by defining the set P of positive elements as the rational functions f that are eventually positive for x sufficiently large.

[a. b). [a. (a. (a. b] (−∞. b). (−∞. (a. b). it can only be a single point. an open.  Exercise 1.20 Prove it. an open or closed ray.Real numbers 41 also n the set). closed or semi-open segments. . ∞). (−∞. [a. or the whole line. That is. ∞). b]. b]. ∞). We state that a (non-empty) connected set can only be of one of the following forms: {a}.

42 Chapter 1 .

To avoid ambiguities. we write f : A → B ( f maps the set A into the set B). we may denote a function by the letter f . The numbers which may be “fed into the machine”. returns a number (only one. For example. We normally denote functions by letters. We only require that every number returned by the function belongs to its range. Numbers that may be “emitted by the machine”. We do not exclude the possibility that some of these numbers may never be returned. the function has to be defined properly. and define a function as a “machine”. The crucial point is that to every number in its domain corresponds one and only one number in its range ( ¯ ). Ž A transformation rule ( ). a function is a rule that assigns real numbers to real numbers. but we will deliberately be a little less formal than perhaps we should. which when provided with a number. In other words. like we do for real numbers (and for any other mathematical entity). and always the same for the same input).  A range ( ): another subset of R. The transformation rule has to specify what number in B is assigned by the .Chapter 2 Functions 2. and B ⊆ R is its range. If A ⊆ R is its domain.1 Basic definitions What is a function? There is a standard way of defining functions. A function is defined by three elements: Œ A domain ( ): a subset of R.

(∀y ∈ Image( f ))(∃!x ∈ A) : ( f (x) = y).  A function that assigns to every w ±1 the number (w3 + 3w + 5)/(w2 − 1). and of f : x → f (x) as defining the “action” of the function. defined by the transformation rules f (x) = x2 . and g:w→ w3 + 3w + 5 . f (t) = t2 . We denote the assignment by f (x) (the function f evaluated at x). 2]. 2 It takes a great skill not to confuse the Greek letters ζ (zeta) and ξ (xi).. Also. Examples: Œ A function that assigns to every real number its square. Its image ( of numbers that are actually assigned by the function. the functions f : R → R. ) is the subset of B The function f is said to be onto B ( ) if B is its image. i.44 Chapter 2 function to each number x ∈ A.1 Let f : A → B be a function. however. that “the function f is x2 ”. then g : R \ {±1} → R 1 and f : x → x2 . Thus. Definition 2. There is however nothing wrong with the definition of the range as the whole of R. then f :R→R We may also write f (x) = x2 . f (α) = α2 and f (ξ) = ξ2 are identical2 . w2 − 1 Programmers: think of f : A → B as defining the “type” or “syntax” of the function. In particular.e. it turns out that the function f only returns non-negative numbers. ∞). That is. Image( f ) = {y ∈ B : ∃x ∈ A : f (x) = y}. We could limit the range to be the set [0. but not to the set [1. we may use any letter other than x as an argument for f . If we denote this function by g. . That the assignment rule is “assign f (x) to x” is denoted by f : x → f (x) (pronounced “ f maps x to f (x)”)1 . f is said to be one-toone ( ¯ ) if to each number in its image corresponds a unique number in its domain. If we denote the function by f . We should not say.

often denoted by Id. . 0. 28. π2 /17. We have f : R → {0. ’ For every n ∈ N we may define the n-th power function fn : R → R. The function f1 : x → x is known as the identity function. 17. This function differs from the function in the first example because the two functions do not have the same domain (different “syntax” but same “routine”). If this number is infinite.  A function that assigns to every real numbers the value zero if it is irrational and one if it is rational. Id : x → x.Functions 45 Ž A function that assigns to every −17 ≤ x ≤ π/3 its square. This example differs from the previous ones in that we do not have an assignment rule in closed form (how the heck do we compute f (x)?). such that  5      36/π   x→ 28      16  x=2 x = 17 x = π2 /17 or 36/π otherwise. This function is known as the Dirichlet function. e. 36/π}. by fn : x → xn . 36/π} ∪ {a + b 2 : a. namely Id : R → R. 3 . which assigns to x the number of 7’s in its decimal expansion.. ‘ A function defined on R \ Q (the irrational numbers). . We limited the range of the function to the irrationals because rational numbers may have a non-unique decimal representation. The range may be taken to be R. Here again. Nevertheless it provides a legal assignment rule3 . then it returns −π. with  0 x is irrational   f :x→ 1 x is rational. but the image is {6.g. 1}. we will avoid referring to “the function xn ”.6999999 . .   A function defined on the domain √ A = {2. 16. b ∈ Q}. if this number is finite.7 = 0.

Let f : A → B and g : B → C. b ∈ R. then5 g ◦ f : ξ → sin2 ξ and f ◦ g : ζ → sin ζ 2 . f /g : x → f (x)/g(x). It is definitely a habit to use the letter x as generic argument for functions. 0} → R. if f is the sine function and g is the square function. Beware of fixations! Any letter including Ξ is as good. We are adding “machines”. x → 0.. We may define a new function4 a f + b g : A ∩ B → R. Let f : A → R and g : B → R be given functions. its inverse is − f = (−1) f . For example. sometimes complicated. f · g : x → f (x)g(x). not numbers. It is our choice in the present course to assume that the meaning of these functions (along with their domains and ranges) are well-understood. and a. This is a vector field whose zero element is the zero function. We may define the product of two functions. Given several functions. 2011) A third operation that can combine two functions is composition. On the other hand. i. functions form a vector space over the reals. functions form an algebra. This is not trivial. 5 4 . the exponential. Moreover. the cosine. ( f ◦ g) ◦ h = f ◦ (g ◦ h). f /g : A ∩ {z ∈ B : g(z) (12 hrs. such as the sine. given a function f . as well as their quotient. f · g : A ∩ B → R.e. g ◦ f : x → g( f (x)). composition is associative.46 Chapter 2 “ There are many functions that you all know since high school. These function require a careful. We define g ◦ f : A → C. they can be combined together to form new functions. a f + b g : x → a f (x) + b g(x). definition. For example. and the logarithm. namely. composition is non-commutative.

2009)  Exercise 2.Functions Note that for every function f . (10 hrs. The union over all n’s of polynomials of degree n is the set of polynomials. with an i=0 n 0. A function P is called a polynomial of degree n if there exist real numbers (ai )n . . Example: Consider the function f that assigns the rule f :x→ This function can be written as f = Id + Id · Id + Id · sin ◦(Id · Id) .1 The following exercise deals with the algebra of functions. P(x) = 2 x and S (x) = x2 . Id · sin + Id · sin · sin x + x2 + x sin x2 . such that P= k=0 ak fk . s(x) = sin x all three defined on R. P. Consider the three following functions. Id ◦ f = f ◦ Id = f. and S : (i) (s ◦ P)(y). x sin x + x sin2 x Example: Recall the n-th power functions fn . Œ Express the following functions without using the symbols s. 47 so that the identity is the neutral element with respect to function composition. This should not be confused with the fact that x → 1 is the neutral element with respect to function multiplication. (ii) (S ◦ s)(ξ). A function is called rational if it is the ratio of two polynomials. and (iii) (S ◦ P ◦ s)(t) + (s ◦ P)(t).

 Exercise 2. b.  Exercise 2. (iii) 2 sin y 2 f (u) = sin(2u + 2u ).2 An open segment I is called symmetric if there exists an a > 0 such that I = (−a.  Prove that if f is a polynomial satisfying f (a) = 0. and let f : A → B be given by f : x → ax2 + bx + c. Let f : I → R where I is a symmetric segment. That is. where I is a symmetric segments. if f = E + O = E + O . where I and J are symmetric segments). and odd if f (−x) = − f (x) for every x ∈ I. along x with the binary operators +. then there are at most n real numbers a. and where applicable give a counter example: Œ f + g. and (v) f (a) = 2sin a + 2 +sin a) sin(a2 ) + 2sin(a . determine whether the specified function is even. (ii) f (t) = 22 . such that f (a) = 0 (at most n roots). or not necessarily either two. a). g : I → R. Let f.  Exercise 2. then E = E and O = O . odd. Prove then that this decomposition is unique. g being either even or odd. (iv) f (y) = sin(sin(sin(22 ))).48 Chapter 2  Write the following functions using only the symbols s. such that f (x) = (x − a)g(x) + b (hint: use induction on the degree of the polynomial). the function f is called even if f (−x) = f (x) for every x ∈ I. ·. and S .5 Let a.  Exercise 2.3 Show that any function f : R → R can be written as f = E + O. c ∈ R and a 0. P. where E is even and O is odd. Œ Prove that for every polynomial of degree ≥ 1 and for every real number a there exists a polynomial g and a real number b. In each case.  f · g. Explain your statement. and consider the four possibilities of f. .4 The following exercise deals with polynomials. then there exists a polynomial g such that f (x) = (x − a)g(x). Ž f ◦ g (here we assume that g : I → J and f : J → R. and ◦: (i) f (x) = 2sin x . Ž Conclude that if f is a polynomial of degree n.

y) : x ∈ A. A function is a graph—a subset of the Cartesian product space A × B.  Exercise 2. (x.6 Let f satisfy the relation f (x + y) = f (x) + f (y) for all x. and (x. The definition we gave to a function as an assignment rule is. xn ∈ R..2 Graphs As you know. not a formal one. What is a graph? A drawing? A graph has to be thought of as a subset of the plane. . we define the graph of f to be the set Graph( f ) = {(x. ) 2. y = f (x)} ⊂ A × B. For a function f : A → B. z) ∈ Graph( f ) implies y = z. y) ∈ Graph( f ) The function is one-to-one if (x. . . i.  Suppose A = R. . The defining property of the function is that it is uniquely defined. (Note that we claim nothing for irrational arguments. strictly speaking. Œ Prove inductively that f (x1 + x2 + · · · + xn ) = f (x1 ) + f (x2 ) + · · · + f (xn ) for all x1 . such that f (x) = cx for all x ∈ Q. .Functions Œ Suppose B = R. The standard way to define a function is via its graph. Find the largest segment A for which f is one-to-one. y) ∈ Graph( f )). Find the largest segment B for which f is onto. 49 Ž Find segments A and B as large as possible for which f is one-to-one and onto. y ∈ R.  Prove the existence of a real number c.e. we can associate with every function a graph. y) ∈ Graph( f ) implies x = w. y) ∈ Graph( f ) It is onto B if (∀y ∈ B)(∃x ∈ A) : ((x. and (w.

A point a ∈ A is called an interior point of A ( ) if it had a neighborhood contained in A. δ) = {x : 0 < |x − a| < δ}. neighborhoods of a of the form {x : |x − a| < δ} for some δ > 0. Notation: We will mostly deal with symmetric neighborhoods (whether punctured or not). satisfying (∀a ∈ A)(∃!b ∈ B) : ((a.3 Limits Definition 2. B(a. Let A and B be two sets (of anything!).3 Let A ⊂ R.2 Let x ∈ R.50 Chapter 2 Comment: In fact. i. 2010) 2. b ∈ B}. We will introduce the following notations for neighborhoods. δ) = (a − δ. x cannot be a boundary point). b) ∈ Graph( f )). punctured neighborhood neighborhood a a Definition 2. (12 hrs. A function f : A → B is a subset of the Cartesian product Graph( f ) ⊂ A × B = {(a. b) : a ∈ A. . b) that contains the point x (note that since the segment is open.. this is the general way to define functions. b) \ {x} where a < x < b. A punctured neighborhood of x ( ) is a set (a.e. a + δ) B◦ (a. A neighborhood of x ( ) is an open segment (a.

In this section we explain the meaning of the limit of a function at a point. the point 1/2 is an interior point. δ) = [a. δ) = (a − δ. by making x sufficiently close to 5. − 51 Example: For A = [0. Suppose you want f (x) to differ from 15 by less than 1/100. a). in fact. and neither does its complement. This means that we can make f (x) be as close to 15 as we wish.Functions We will also define one-sided neighborhoods. B+ (a. Note that the function does not need to be equal to at the point a. Example: Consider the function f :R→R f : x → 3x. a + δ) + B− (a. with 5 itself being excluded. a] B◦ (a. 300 300 . we say that: The limit of a function f at a point a is . We claim that the limit of this function at the point 5 is 15. δ) = (a. but not the point 0. δ) = (a − δ. by making its argument sufficiently close to a (excluding the value a itself). 100 100 This requirement is guaranteed if 5− 1 1 < x<5+ . it does not even need to be defined at the point a. Informally. Example: The set Q ⊂ R has no interior points. 1]. if we can make f assign a value as close to as we wish. This means that you want 15 − 1 1 < f (x) = 3x < 15 + . R \ Q. a + δ) B◦ (a.

you want | f (x) − 15| = |3x − 15| < . for some > 0 of your choice.4 Let f : A → B with a ∈ A an interior point. choosing x within a symmetric punctured neighborhood of 5 of length 2 /3 guarantees that f (x) is within a distance of from 15. .52 Chapter 2 Thus. In other words. Suppose you want f (x) to differ from 15 by less than . We write. Since we can repeat this construction for any number other than 1/100. We say that the limit of f at a is . such that7 | f (x) − | < In formal notation. x→5 We can be more precise. This example insinuates what would be a formal definition of the limit of a function at a point: Definition 2. δ). if we take x to differ from 5 by less than 1/300 (but more than zero!). lim f = 15. if for every > 0. 7 The game is “you give me and I give you δ in return”. This is guaranteed if |x − 5| < /3. δ))(| f (x) − | < ). then we conclude that the limit of f at 5 is 15. 5 although the more common notation is6 lim f (x) = 15. The choice of not using the common notation becomes clear when you realize that you could as well write ℵ→5 6 for all x ∈ B◦ (a. thus given > 0. there exists a δ > 0. lim f (ℵ) = 15 or ξ→5 lim f (ξ) = 15. (∀ > 0)(∃δ > 0) : (∀x ∈ B◦ (a. then we are guaranteed to have f (x) within the desired range.

Player B: No problem. If we impose that |x − 3| < δ then | f (x) − 9| < |x + 3| δ. we need to show that for every > 0 we can find a δ > 0. we note that by the triangle inequality. To find the appropriate δ. Instead. δ). If the |x + 3| wasn’t there we would have set δ = and we would be done. |x + 3| = |x − 3 + 6| ≤ |x − 3| + 6. such that | f (x) − 9| < whenever x ∈ B◦ (3. We want to show that lim f = 9. Example: Consider the square function f : R → R.it fo!ows tha" the limit of f at a is l. we examine the condition that needs to be satisfied: | f (x) − 9| = |x − 3| |x + 3| < . so that for all |x − 3| < δ. Thus. 3 By definition.Functions 53 Player A: This is my !. l ! l ! " a a Since Player B can find a " for every choice of ! made by Player A. f : x → x2 . | f (x) − 9| ≤ |x − 3| (|x − 3| + 6) < (6 + δ)δ. . Here is my ". the game it to respond with an appropriate δ for every .

To summarize. > 0 we take δ = min(1. Example: The next example is the function f : (0. given |x − 3| < δ). . hence f (x) − 2|x − a| 1 < . then | f (x) − 1/a| < . We may freely require for example that d ≤ 1. We start by requiring that |x − a| < a/2. ∞). which at once implies that x > a/2. then we reach the desired goal. If we furthermore take δ ≤ /7. We are going 1 lim f = . f : x → 1/x. f (x) − for δ = min(a/2. it is not a variable. We can show. in which case (∀x : 0 < | f (x) − 9| = |x − 3| |x + 3| ≤ |x − 3| (|x − 3| + 6) < (6 + δ)δ ≤ (6 + 1) = . a a2 If we further require that |x − a| < a2 /2. in general. in which case | f (x) − 9| < (6 + δ)δ ≤ 7δ. δ).54 Chapter 2 Now recall: given we have the freedom to choose δ > 0 such to make the righthand side less or equal than . To conclude. /7). (14 hrs. Let > 0 be given. 2011) 1 < a whenever x ∈ B◦ (a. We first observe that f (x) − 1 1 1 |x − a| . that lim f = a2 a for all a ∈ R. 7 This proves (by definition) that lim3 f = 9. a a First fix a. = − = a x a ax We need to be careful that the domain of x does not include zero. a2 /2). [do it!] (13 hrs. 2010) to show that for every a > 0.

δ)) : ( f (x) (12 hrs. To show that for all . Indeed..  0   f :x→ 1  x is irrational x is rational. in contrast.e. δ)) : (| f (x) − | ≥ ). 4 1 4 and 1 |1 − | < . the function f : R → R. δ)) : ( f (x) = 0. we need to show the negation of (∃ ∈ R) : (∀ > 0)(∃δ > 0) : (∀x ∈ B◦ (0. 4 Comment: It is important to stress what is the negation that the limit f at a is : There exists an > 0. i.Functions 55 Example: Consider next the Dirichlet function. which satisfies 0 < |x − a| < δ. )). y ∈ B◦ (0. f : R → R. (∀ ∈ R. No can satisfy both |0 − | < I. but not | f (x) − | < . 1 )). δ))( f (x) ∈ B( ... We will show that f does not have a limit at zero.e. f (y) = 1). (∀δ > 0)(∃x. 2009) B( . Then no matter what δ is. ∀δ > 0)(∃x ∈ B◦ (0. . there exists an x. (∀ ∈ R)(∃ > 0) : (∀δ > 0)(∃x ∈ B◦ (0.  0   f :x→ x  x is irrational x is rational. take = 1/4. Example: Consider. lim f 0 . there exist points 0 < |x| < δ for which f (x) = 0 and points for which f (x) = 1. i.e. such that for all δ > 0.

let > 0 be given. there exists a δ1 > 0 such that 1 | f (x) − | < | − m| 4 whenever x ∈ B◦ (a. Theorem 2. The first theorem states that a limit. Then. Proof : Suppose. we are in measure to prove general theorems about limits. . δ). a with m. that lim f = a and lim f = m. and having seen a number of example. if it exists. is unique. then by taking δ = .56 Here we show that Chapter 2 lim f = 0. Having a formal definition of a limit. and there exists a δ2 > 0 such that 1 | f (x) − m| < | − m| 4 whenever x ∈ B◦ (a. whenever x ∈ B◦ (a. Take δ = min(δ1 . δ2 ). δ) implies | f (x) − 0| < . 1 | − m| = |( f (x) − m) − ( f (x) − )| ≤ | f (x) − | + | f (x) − m| < | − m|. δ2 ). By definition.1 (Uniqueness of the limit) A function f : A → B has at most one limit at any interior point a ∈ A. δ1 ). by contradiction. 0 Indeed. 2 which is a contradiction. x ∈ B◦ (0.

it fo!ows tha" the limit of f at a cannot be both l and #. where f (x) = 1/x. and if they do exist they are equal. Player B: I give up.. Show that lima f exists if and only if lima g exists. No " fits. where f (x) = |x|. I  Exercise 2. where f (x) = x2 + 3. where f (x) = 7x. You have to calculate the limit directly from its definition (without using. (This exercise shows that the limit of a function at a point only depends on its properties in a neighborhood of that point). f . with a ∈ I.7 In each of the following items. # l ! ! # l ! ! " a a Since Player B cannot find a " for a particular choice of ! made by Player A. for example.Functions 57 Player A: This is my !. determine whether the limit exists. defined on a punctured neighborhood of a). Let J ⊂ I be an open interval containing the point a. where f (x) = |x|. √ ‘ lima f . x>0 x=0 x < 0.  Exercise 2. g : I \ {a} → R (i.e. Œ  Ž  lim2 lim0 lim0 lima f . Let f. such that f (x) = g(x) for all x ∈ J \ {a}.8 Let I be an open interval in R. and if it does calculate its value.  lima f . . where  1      f (x) = 0    −1  and a ∈ R. f . limits arithmetic—see below). f .

We have seen above various examples of limits. Œ |x − | <  |x − | < min 1. imply |xy − m| < . 2011) . by showing that the definition was satisfied.58 Chapter 2  Exercise 2.1 Let . where the representation p/q is in reduced form (for example R(4/6) = R(2/3) = 1/3). 2 2 . 2(| | + 1) 2 and |y − m| < 2 imply |(x + y) − ( + m)| < . m ∈ R. y m |m| |m|2 . Ž If m 0 and |y − m| < min then y 0 and 1 1 − < . we had to “work hard” to prove what the limit was. a lemma: Lemma 2. 2(|m| + 1) and |y − m| < min 1. Then.9 The Riemann function R : R → R is defined as  1/q   R(x) =  0  x = p/q ∈ Q x Q. Show that lima R exists everywhere and is equal to zero. But first. In each case. This becomes impractical when the functions are more complex. we need to develop theorems that will make our task easier. (15 hrs. Note that R(0) = R(0/1) = 1. Thus.

|xy − m| = |x(y − m) + m(x − )| ≤ |x||y − m| + |m||x − | | |+1 |m| < + < . lim( f + g) = + m a and 1 1 = . 2(| | + 1) 2(|m| + 1) For the third item it is clear that |y − m| < |m|/2 implies that |y| > |m|/2. 2009) (15 hrs. y m |y||m| 2|m|(|m|/2) I (13 hrs. we know that there exists a δ > 0. hence. This δ > 0 is already the smallest of the two δ’s corresponding to each of the functions.Functions 59 Proof : The first item is obvious (the triangle inequality). Then |y − m| |m|2 1 1 − = < = . If the two conditions are satisfied then |x| = |x − + | < |x − | + | | < | | + 1. Consider the second item. 2010) Theorem 2. δ). y 0. and in particular. a Then. moreover. 0. then lim a Proof : This is an immediate consequence of the above lemma. f lim( f · g) = · m. such that lim f = a and lim g = m.2 (Limits arithmetic) Let f and g be two functions defined in a neighborhood of a. a If. such that8 | f (x) − | < 8 2 and |g(x) − m| < 2 whenever x ∈ B◦ (a. Since. given > 0. .

2(|m| + 1) and whenever x ∈ B◦ (a. δ). x→a which follows by induction once we show that for k = 1. |g(x) − m| < min 1. such that | f (x) − | < min 1. δ). δ). | f (x) − | < min 2 2 It follows that 1 1 − < f (x) whenever x ∈ B◦ (a. 2(| | + 1) whenever x ∈ B◦ (a. lim xk = ak . whenever x ∈ B◦ (a. 9 This is the standard notation to what we write in these notes as f :ξ→ ξ3 + 7ξ5 ξ2 + 1 and lim f = a a3 + 7a5 . a2 + 1 . x→a x2 + 1 a +1 We need to show that for all c ∈ R lim lim c = c. such that | | | |2 . It follows that | f (x)g(x) − m| < Finally. take for example the functions f (x) = 5 + 1/x and g(x) = 6 − 1/x at x = 0. δ). I Comment: It is important to point out that the fact that f + g has a limit at a does not imply that either f or g has a limit at that point.60 It follows that Chapter 2 |( f (x) + g(x)) − ( + m)| < There also exists a δ > 0. Example: What does it take now to show that9 x3 + 7x5 a3 + 7a5 = 2 . a and that for all k ∈ N. whenever x ∈ B◦ (a. there exists a δ > 0. δ).

Functions

61 

Exercise 2.10 Show that
lim xn = an ,
x→a

a ∈ R, n ∈ N,

directly from the definition of the limit, i.e., without using limits arithmetic. 

Exercise 2.11 Let f, g be defined in a punctured neighborhood of a point a.
Prove or disprove: Œ If lima f and lima g do not exist, then lima ( f + g) does not exist either.  If lima f and lima g do not exist, then lima ( f · g) does not exist either Ž If both lima f and lima ( f + g) exist, then lima g also exists.  If both lima f and lima ( f · g) exist, then lima g also exists.  If lima f exists and lima ( f + g) does not exist, then lima ( f + g) does no exist. ‘ If g is bounded in a punctured neighborhood of a and lima f = 0, then lima ( f · g) = 0. We conclude this section by extending the notion of a limit to that of a one-sided limit.

Definition 2.5 Let f : A → B with a ∈ A an interior point. We say that the limit
on the right ( that

) of f at a is , if for every > 0, there exists a δ > 0, such whenever x ∈ B◦ (a, δ). +

| f (x) − | < We write,

lim f = . +
a

An analogous definition is given for the limit on the left (

).

62

Chapter 2

Player A: This is my !.

Player B: No problem. Here is my ".

l

!

l

! "

a

a

Since Player B can find a " for every choice of ! made by Player A,it fo!ows tha" the right#limit of f at a is l. 

Exercise 2.12 We denote by f : x → x the lower integer value of x.
Œ Draw the graph of this function.  Find lim f −
a

and

lim f +
a

for a ∈ R. Prove it based on the definition of one-sided limits. 

Exercise 2.13 Show that f : (0, ∞) → R, f : x → x satisfies
lim f =
a

a,

for all a > 0.

2.4

Limits and order

In this section we will prove a number of properties pertinent to limits and order. First a definition:

Definition 2.6 Let f : A → B and a ∈ A an interior point. We say that f is
locally bounded ( U of a, such that the set ) near a if there exists a punctured neighborhood { f (x) : x ∈ U}

Functions is bounded. Equivalently, there exists a δ > 0 such that the set { f (x) : x ∈ B◦ (a, δ)} is bounded.

63

Comment: As in the previous section, we consider punctured neighborhoods of a
point, and we don’t even care if the function is defined at the point.

Proposition 2.1 Let f and g be two functions defined in a punctured neighborhood U of a point a. Suppose that f (x) < g(x) and that lim f =
a

∀x ∈ U, lim g = m.
a

and

Then ≤ m.

Comment: The very fact that f and g have limits at a is an assumption. Comment: Note that even though f (x) < g(x) is a strict inequality, the resulting
inequality of the limits is in a weak sense. To see what this must be the case, consider the example f (x) = |x| and g(x) = 2|x|. Even though f (x) < g(x) in an open neighborhood of 0, lim f = lim g = 0.
0 0

Proof : Suppose, by contradiction, that exists a δ > 0 such that (∀x ∈ B◦ (a, δ)) : (| f (x) − | < In particular, for every such x, f (x) > − = − which is a contradiction.

> g and set and

= 1 ( − m). Thus, there 2

|g(x) − m| < ).

−m −m =m+ = m + > g(x), 2 2 I

64

Chapter 2

Proposition 2.2 Let f and g be two functions defined in a punctured neighborhood U of a point a. Suppose that lim f =
a

and

lim g = m,
a

with < m. Then there exists a δ > 0 such that (∀x ∈ B◦ (a, δ)) : ( f (x) < g(x)).

Comment: This time both inequalities are strong.
Proof : Let = 1 (m − ). There exists a δ > 0 such that 2 (∀x ∈ B◦ (a, δ)) : (| f (x) − | < For every such x, f (x) < + < + + (g(x) − m + ) = g(x). I and |g(x) − m| < ).

Proposition 2.3 Let f, g, h be defined in a punctured neighborhood U of a. Suppose that (∀x ∈ U) : ( f (x) ≤ g(x) ≤ h(x)), and lim f = lim h = .
a a

Then lim g = .
a

Proof : It is given that for every > 0 there exists a δ > 0 such that (∀x ∈ B◦ (a, δ)) : ( − < f (x) and h(x) < + ).

− < f (x) ≤ g(x) ≤ h(x) < + . lim( f g) = 0. then f is locally bounded near a. i.6 Let f and g be defined in a punctured neighborhood of a point a. lim f = a if and only if lim( f − ) = 0.Functions Then for every such x. 2011) Proposition 2. Proof : Immediate from the definition of the limit. Then. Then.e. I . I Proposition 2. (∀x ∈ B◦ (a. 65 I (17 hrs.5 Let f be defined in a punctured neighborhood of a point a.. a Proof : Very easy.4 Let f be defined in a punctured neighborhood U of a. I Proposition 2. If the limit lim f = a exists. Suppose that lim f = 0 a whereas g is locally bounded near a. a Proof : Very easy. δ)) : (|g(x) − | < ).

a Then lim( f g) = m. product) Let f. a Proof : Write fg − m = (f − ) g − (g − m) . if it has a limit at a. Hence. ) Examples: Œ We saw that the function f (x) = x2 has a limit at 3. Proposition 2. a Comment: If f has a limit at a and the limit differs from f (a) (or that f (a) is undefined) . then we say that f has a removable discontinuity ( at a. here is another way of proving the product property of the arithmetic of limits (this time without and δ).7 (Arithmetic of limits. and lim f = a and lim g = m. . bdd limit zero I (17 hrs.7 A function f : A → R is said to be continuous at an inner point a ∈ A. and that this limit was equal 9. and lim f = f (a). f is continuous at x = 3.5 Continuity Definition 2.66 Chapter 2 Comment: Note that we do not require g to have a limit at a. With the above tools. limit zero limit zero limit zero loc. g be defined in a punctured neighborhood of a point a. 2010) 2.

Continuity is defined as that for every > 0 there exists a δ > 0 such that | f (x) − f (a)| < whenever x ∈ B◦ (a. Our theorems on limit arithmetic imply right away similar theorems for continuity: Theorem 2. we said that the limit of f at a x ∈ B◦ (a. Ž The function f (x) = x sin 1/x (it was seen in the tutoring session) is continuous at all x 0. because the limit of f at a 0 does not exist. I Comment: Recall that in the definition of the limit. but if we define f (0) = 0. To prove that sin is continuous. if f (a) Proof : Obvious. f is continuous at all x 0. Moreover. By a similar calculation we could have shown that it has a limit at any a < 0 and that this limit also equals 1/a. At zero it is not defined.  The elementary functions sin and cos are continuous everywhere. δ). δ). Since f is not even defined at x = 0. then it is continuous for all x ∈ R (by the bounded times limit zero argument). . Hence.Functions 67  We saw that the function f (x) = 1/x has a limit at any a > 0 and that this limit equals 1/a.  is continuous at x = 0. then it is not continuous at that point. δ). however it is not continuous at any other point.3 If f : A → R and g : A → R are continuous at a ∈ A then f + g and f · g are continuous at a. There was an emphasis on the fact that the point x itself is excluded. is if for every > 0 there exists a δ > 0 such that | f (x) − | < whenever 0 then 1/ f is continuous at a. we need to prove that for every > 0 there exists a δ > 0 such that | sin x − sin a| < whenever x ∈ B◦ (a.  The function  x x ∈ Q   f :x→ 0 x Q. We will take it for granted at the moment.

(19 hrs. δ1 ). Proof : Let that Since g is continuous at a. Do we have the tools to show that it is continuous. δ1 ) Combining the two we get that | f (g(x)) − f (g(a))| < whenever B(a. δ2 ). We don’t yet have a theorem for the composition of continuous functions. There is no need to exclude the point x = a. f ◦ g is continuous at a. Suppose that g is continuous at an inner point a ∈ A. > 0. But what about a function like sin x2 . Since f is continuous at g(a). Usually. we have all the tools to show that a function like. No. and that f is continuous at g(a) ∈ B. 2010) With that. .68 Chapter 2 Note that we may require that this be true whenever x ∈ B(a. Definition 2. we are interested in continuity on intervals. Then. I whenever x ∈ B(a. we have only dealt with continuity at points. δ).4 Let g : A → B and f : B → C.8 A function f is said to be right-continuous ( lim f = f (a). say. Theorem 2. such | f (y) − f (g(a))| < whenever y ∈ B(g(a). δ2 ). So far. such that |g(x) − g(a)| < δ1 or g(x) ∈ B(g(a). + a ) at a if A similar definition is given for left-continuity ( ). sin2 x + x2 + x4 sin x f (x) = 1 + sin2 x is continuous everywhere in R. then there also exists a δ2 > 0. then there exists a δ1 > 0. which is an inner point.

It is said to be continuous on a closed interval [a. h : R → R are continuous at a and satisfy g(a) = h(a). then so is | f | (where | f |(x) = | f (x)|).Functions 69 Definition 2. g).) Suppose that in addition f is continuous at zero. for some constant c for all x ∈ Q. We define a new function that “glues” g and h at a:  g(x) x ≤ a   f (x) =  h(x) x > a. One should be careful with such interpretations.16 Suppose that g. b]   f (x) =  h(x) x ∈ [b.  Exercise 2. b] if it is continuous on the open interval.  Exercise 2.  Exercise 2.14 Suppose f : R → R satisfies f (a + b) = f (a) + f (b) for all a. c]. and g(b) = h(b). Define  g(x) x ∈ [a. c].17 Here is another “gluing” problem: Suppose that g is continuous in [a.  Prove that f is continuous at a.9 A function f is said to be continuous on an open interval (a. )We have already seen that this implies that f (x) = cx.  Exercise 2. Prove that f is continuous everywhere. see for example   x sin 1 x 0   x f (x) =  0  x = 0. b) if it is continuous at all x ∈ (a. and in addition it is right-continuous at a and left-continuous at b. g) and min( f. Comment: We usually think of continuous function as “well-behaved”. b). b ∈ R. b]. def  Prove that if f : A → R and g : A → R are continuous then so are max( f.15 Œ Prove that if f : A → B is continuous in A. c]. .  Prove that f is continuous in [a. that h is continuous is [b.

6 Theorems about continuous functions Theorem 2. That is. Then there exists a neighborhood of a in which f (x) > 0. This function has the wonderful property of being continuous at all x Q and discontinuous at all x ∈ Q (this is because its limit is everywhere zero).5 Suppose that f is continuous at a and f (a) > 0. there exists a δ > 0 such that | f (x) − f (a)| < But. where x = p/q assumes that x is rational in reduced form. δ). Proof : Since f is continuous at a. | f (x) − f (a)| < f (a) 2 f (a) 2 whenever x ∈ B(a. a one-sided version can be proved. whereby if f is right-continuous at a and f (a) > 0. It took some more time until Vito Volterra proved in 1881 that there can be no function that is continuous on Q and discontinuous on R \ Q. δ). there exists a δ > 0 such that f (x) > 0 whenever x ∈ B(a. δ). Also. then f (x) > 0 whenever x ∈ B+ (a. 2 . Comment: Note that continuity is only required at the point a. Comment: An analogous theorem holds if f (a) < 0. 2. implies f (a) − f (x) ≤ | f (a) − f (x)| < f (a) .70 Chapter 2 Example: Here is another “crazy function” due to Johannes Karl Thomae (18401921):  1/q   r(x) =  0  x = p/q x Q.

2 Comment: If f (a) > 0 and f (b) < 0 the same conclusion holds for just replace f by (− f ). 2011) Comment: Continuity is required on the whole intervals. x]}. b). b]. Proof : Consider the set A = {x ∈ [a. there exists a point c ∈ (a.6 (Intermediate value theorem ) Suppose f is continuous on an interval [a. such that f (c) = 0. b] : f (y) < 0 for all y ∈ [a.e. (19 hrs. 2010) Theorem 2. f (x) > f (a)/2 > 0. 1] → R:  −1   f (x) =  +1  1 0≤x< 2 1 ≤ x ≤ 1.Functions i. Then. 2009) (20 hrs. for consider f : [0. with f (a) < 0 and f (b) > 0. .. 71 3/2 f!a" f!a" 1/2 f!a" a ! I This theorem can be viewed as a lemma for the following important theorem: (15 hrs.

This set is also bounded by b. By the previous theorem. The previous theorem guarantees even the existence of a δ > 0 such that b − δ is an upper bound for A. such that f (a) < α < f (b). i. i. c + δ]. f!b" A a f!a" c b Suppose it were true that f (c) < 0.72 Chapter 2 The set is non empty for it contains a. b) such that c = sup A. a + δ) ⊆ A. Actually. the previous theorems guarantees the existence of a δ > 0 such that [a. which implies that c − δ is an upper bound for A. there exists a number c ∈ (a.e. By the previous theorem. b]. .e. By the axiom of completeness.. I Corollary 2. c cannot be the least upper bound of A. we conclude that f (c) = 0. f (x) < 0 for all x ∈ [a. b) at which f (c) = α.. contradicting the fact that c is an upper bound for A. there exists a δ > 0 such that f (x) > 0 whenever c − δ ≤ x ≤ c.1 If f is continuous on a closed interval [a. By the trichotomy property. Suppose it were true that f (c) > 0. Because c is the least upper bound of a it follows that also f (x) < 0 whenever x < c. there exists a δ > 0 such that f (x) < 0 whenever c ≤ x ≤ c + δ. then there exists a point c ∈ (a. which means that c + δ ∈ A.

. Show that there exists a point c ∈ [a. . b] → R are continuous such that f (a) < g(a) and f (b) > g(b). a + δ). n bound on the interval (a − δ. b] such that f (c) = g(c). (ii) A fly flies from Tel-Aviv to Or Akiva (along the shortest path). 1 ]. What can be said about f ? 73 I  Exercise 2. 1 − 1 ]. The following day are makes his way back along the very same path.2 If f is continuous at a.) (ii) Show that there exists an x0 ∈ [0. He starts its way at 10:00 and finishes at 16:00. i.e. continuity is needed on the whole interval. b].18 Suppose that f is continuous on [a. b] with f (x) ∈ Q for all  Exercise 2.20 (i) Let f : [0. δ). I Theorem 2. f (x) < f (a) + 1 on this interval. b]. b] then it is bounded from above of that interval. for look at the “counter example” f : [−1. such that f (x) < M for all x ∈ [a. such that f (x0 ) = f (x0 + 1 ). Lemma 2. Show that there exists an x0 ∈ [0. for by the definition of continuity. there exists a δ > 0. that is.  Exercise 2.19 (i) Suppose that f. g : [a. 1) whenever x ∈ B(a. then there is a δ > 0 such that f has an upper Proof : This is obvious. Comment: Here too. 1] → R. there exists a number M. Show that there will be a point along the way that he will cross a the exact same time on both days. 1] → R be continuous.7 (Karl Theodor Wilhelm Weierstraß) If f is continuous on [a.Functions Proof : Apply the previous theorem for g(x) = f (x) − α. such that n f (x0 ) = f (x0 + 1 ). (Hint: consider 2 2 1 g(x) = f (x + 2 ) − f (x). f (0) = f (1).  1/x   f (x) =  0  x 0 x = 0. starting again at 10:00 and finishing at 16:00. such that f (x) ∈ B( f (a). x ∈ [a.

b] be closed. b]. x]}. there exists a neighborhood of c in which f is upper bounded. I Comment: In essence.8 (Weierstraß. then there exists a point c ∈ [a. then it is bounded up to a little farther. 1] → R. b] : f is bounded from above on [a. The function f : (0. i. 2009) (22 hrs. By the previous lemma. To be able to take such increments up to b we need the axiom of completeness.e. the set A = { f (x) : a ≤ x ≤ b} . b] such that f (x) ≤ f (c) for all x ∈ [a. f : x → 1/x is continuous on the semi-open interval. We claim that c = b. b]. for if c < b.. 2011) Theorem 2. Proof : Let A = {x ∈ [a.74 Chapter 2 Comment: It is crucial that the interval [a. and c cannot be an upper bound for A. such that c = sup A. ) If f is con- Comment: Of course. (17 hrs. there is a corresponding minimum principle. hence there exists a c ∈ (a. b]. Also A is upper bounded by b. this theorem is based on the fact that if a continuous function is bounded up to a point. Proof : We have just proved that f is upper bounded on [a. 2010) (20 hrs. there exists a δ > 0 such that a + δ ∈ A. then by the previous lemma. b]. Maximum principle tinuous on [a. but it is not bounded from above.

. 1) → R and f : [0. Prove that for every 0 < c < M.   x2   f (x) =  0  x<1 . for every M > 0 there exists a point in [a. b] (since we assumed that the denominator does not vanish). 1]. 1 = M. We will show that g is not upper bounded on [a. We define then a new function g : [a. for which f (c) = α. I g(y) > Comment: Here too we needed continuity on a closed interval. Indeed.21 Let f : [a. the set {x ∈ [a. that there exists a point c ∈ [a. b].e. . 1] → R f (x) = x2 .. b] → R. b] at which f takes a value greater than M. b] : f (x) = c} has at least two elements. This set is non-empty for it contains the point f (a). b]. We need to show that this supremum is in fact a maximum. since α = sup A: For all M > 0 there exists a y ∈ [a. i. by contradiction.Functions 75 is upper bounded. b] → R be continuous. For this y. which we denote by α = sup A. α− f This function is defined everywhere on [a. contradicting thus the previous theorem.e.  Exercise 2. α − (α − 1/M) i. b] such that f (y) > α − 1/M. Suppose. it is continuous and positive. that this were not the case. x=1 These functions do not attain a maximum in [0. g= 1 . with f (a) = f (b) = 0 and M = sup{ f (x) : a ≤ x ≤ b} > 0. that f (x) < α for all x ∈ [a. b]. By the axiom of completeness it has a least upper bound. for consider the “counter examples” f : [0.

and that for every > 0 there exists an x ∈ [a. b in a way that works for all choices of a0 . f (x) a0 an−1 + ··· + n =1+ n x x x |an−1 | |a0 | (u + v ≥ u − |v|) ≥1− − ··· − n |x| |x | |an−1 | |a0 | (|xn | > |x|) ≥1− − ··· − |x| |x| |a0 | |an−1 | (|x| > |ai |) ≥1− − ··· − 2n|an−1 | 2n|a0 | 1 =1−n· > 0. b] such that for every [a.9 If n ∈ N is odd. Let M = max(1. a1 x + a0 = 0 has a root (a solution) for any set of constants a0 . .76 Chapter 2  Exercise 2. 2009) The next theorem deals with the case where n is even: . based on the fact that polynomials are continuous. b] x x0 . . Proof : The idea it to show that existence of points a. Prove or disprove each of the following statements: Œ It is possible that f (x) > 0 for all x ∈ [a. b]. . . . I (18 hrs. . an−1 . . . b for which f (b) > 0 and f (a) < 0. Theorem 2. it follows that f (x) is positive for x ≥ M and negative for x ≤ −M. then the equation f (x) = xn + an−1 xn−1 + . . Then for |x| > M. f (x) < f (x0 ). . . b] for which f (x) < . hence there exists a a < c < b such that f (c) = 0. and apply the intermediate value theorem. .22 Let f : [a. The only technical issue is to find such points a. an−1 . . 2n Since the sign of xn is the same as the sign of x. 2n|an−1 |). . 2n|a0 |.  There exists an x0 ∈ [a. b] → R be continuous.

Functions 77 Theorem 2. f (x) = xn 1 + ≥ ≥ ≥ = an−1 a0 + ··· + n x x |a0 | |an−1 | − ··· − n xn 1 − |x| |x | |a0 | |an−1 | xn 1 − − ··· − |x| |x| |an−1 | |a0 | − ··· − xn 1 − 2n|an−1 | 2n|a0 | 1 n x. 2 Let then b > max(M. b]. Then. using the fact that n is even. a1 x + a0 = α with n even. n 2| f (0)|). Then there exists a number m such that this equation has a solution for all α ≥ m but has no solution for α < m. b]. . It remains to find such a b. f (c) ≤ f (0). It follows that f (c) ≤ f (x) for all x ∈ R. b].e. we know that it assumes a minimum in this interval.. Let M be defined as in the previous theorem.10 Let n be even. I Corollary 2. . from which follows that f (x) > f (0) for |x| > b.2 Consider the equation f (x) = xn + an−1 xn−1 + . . Proof : The idea is simple. such that f (c) ≤ f (x) for all x ∈ R. that there exists a c ∈ [−b. for all |x| > M. . Namely. b] such that f (c) ≤ f (x) for all x ∈ [−b. Then. which in turn is less than f (x) for all x [−b. This concludes the proof. i. looking at the function f in the interval [−b. a1 x + a0 = 0 has a minimum. We are going to show that there exists a b > 0 for which f (x) > f (0) for all |x| ≥ b. there exists a c ∈ R. In particular. Then the function f (x) = xn + an−1 xn−1 + . .

such that f (x) < M whenever 0 < |x − a| < δ. c is a solution. a whenever 0 < |x − a| < δ. f (x) > α. by the intermediate value theorem a root exists. and (ii) the limit of a function at infinity. If α < f (c) then for all x. We say that the limit of f at a is infinity.7 Infinite limits and limits at infinity In this section we extend the notions of limits discussed in previous sections to two cases: (i) the limit of a function at a point is infinite. . f (x) ≥ f (c) > α. Before we start recall: infinity is not a real number!.e. hence. then 0 < |x| < δ implies f (x) > M. For α > f (c) we have f (c) < α and for large enough x.. lim f = −∞. Indeed given M we take δ = 1/M. denoted lim f = ∞. Definition 2.78 Chapter 2 Proof : According to the previous theorem there exists a c ∈ R such that f (c) is the minimum of f . a if for every M there exists a δ > 0. there is no solution.10 Let f : A → B be a function. such that f (x) > M Similarly. The limit of f at zero is infinity. Example: Consider the function  1/|x|   f :x→ 17  x 0 x = 0. i. I 2. if for every M there exists a δ > 0. For α = f (c).

since given > 0 we take M = 1/ x>M . such that | f (x) − | < Similarly.25 Prove that x→∞ lim f (x) exists if and only if x→0+ lim f (1/x) exist.  Exercise 2. x→∞ lim x2 + 2x − x . such that | f (x) − | < whenever x < M. . x−3 √  Exercise 2. ∞). We say that lim f = . if we adopt the idea that “neighborhoods of infinity” are sets of the form (M. if for every > 0 there exists an M.  Exercise 2.23 Prove directly from the definition that x→3 lim+ x3 + 2x + 1 = ∞. Comment: These definitions are in full agreement with all previous definitions of limits. and implies | f (x) − 3| < .Functions 79 Definition 2.24 Calculate the following limit using limit arithmetic. Example: Consider the function f : x → 3 + 1/x2 . −∞ whenever x > M. ∞ if for every > 0 there exists an M.11 Let f : R → R. in which case they are equal. Then the limit of f at infinity √ is 3. lim f = .

Show that lima ( f /g) = 0. with lima f = lima g = −∞. Show that lima ( f · g) = ∞.26 Let g be defined in a punctured neighborhood of a ∈ R. Show that lima 1/g = 0. g be defined in a punctured neighborhood of a ∈ R. g be defined in a punctured neighborhood of a ∈ R.  Let f. . This means that for every b ∈ B there exists a unique a ∈ A. and denote it by f −1 (not to −1 be mistaken with 1/ f ). this function is also one-to-one and onto. (24 hrs. In fact.80 Chapter 2 Figure 2. This property defines a function from B to A. We call this function the function inverse to f ( ¯ ). with lima g = ∞. Thus.1: Infinite limit and limit at infinity. 2010) 2.  Exercise 2.27 Œ Let f. with lima f = K < 0 and lima g = −∞. f −1 (y) = {x ∈ A : f (x) = y}.  Exercise 2.8 Inverse functions Suppose that f : A → B is one-to-one and onto. f : B → A. such that f (a) = b.

Example: Let f : [0. 1] → [0. or Graph( f −1 ) = {(b. However. and not a real number. 1] by f −1 (y) = {x ∈ [0. 1] → R be given by f : x → x2 . The inverse function can be defined as a subset of B × A: it is all the pairs (b. but it is not monotonic. This function is one-toone and its image is the segment [0. Strictly speaking.. or monotonically decreasing. a) : f (a) = b}. Thus we can define an inverse function f −1 : [0. 5). 2011) Theorem 2. 1] : x2 = y}. The function f : [0.  x   f (x) =  6 − x  0≤x≤1 1<x≤2 is one-to-one with image [0. 2] → R. a) for which (a.e. 10 . b) ∈ Graph( f ). b) : f (a) = b}. (22 hrs. Graph( f ) = {(a. then f is either monotonically increasing. the one-to-one and onto properties of f guarantee that it contains one and only one number.11 If f : I → R is continuous and one-to-one (I is an interval). i. 1] ∪ [4. Comment: Recall that a function f : A → B can be defined as a subset of A × B. hence we identify this singleton ( ) with the unique element it contains. the right-hand side is a set of real numbers. 1]. it is the (positive) square root.Functions 10 81 Note also that f −1 ◦ f is the identity in A whereas f ◦ f −1 is the identity in B. Comment: Continuity is crucial.

By monotonicity. so is any point in this interval. b] → R is one-to-one and monotonically increasing (without loss of generality). contradicting the fact that f is one-to-one. Suppose that f (a) < f (c) and that. and by the intermediate value theorem. b. It follows as once that for every four points a < b < c < d. f (b) < f (a) (see Figure 2. its image is the interval [ f (a). f (a) < f (c) implies that f (a) < f (b) < f (c). Similarly. y (whatever their order is) and conclude that f (x) < f (y). Then. by contradiction.l. either f (a) < f (b) < f (c). either f (a) < f (b) < f (c) < f (d) or f (a) > f (b) > f (c) > f (d). for every x.o. By the intermediate value theorem. there exists a point x ∈ (b. if f (b) > f (c). We proceed similarly if f (a) > f (c). f (a) and f (b) are in the range. or f (a) > f (b) > f (c). y such that x < y. Indeed. we apply the above arguments to the four points a. Fix now a and b.2: Illustration of proof Proof : We first prove that for every three points a < b < c. no point outside this interval . x.2). then there exists a point y ∈ [a. Then. f (b)]. Thus. and suppose w. c) such that f (x) = f (a). I Suppose that the function f : [a.g that f (a) < f (b) (they can’t be equal). b] such that f (y) = f (c).82 Chapter 2 f!c" f!a" f!b" a b c Figure 2.

let us assume that f is monotonically increasing. so is f −1 . en passant. such that | f −1 (y) − f −1 (y0 )| < whenever |y − y0 | < δ. By monotonicity. lim f −1 = f −1 (y0 ). that continuous one-to-one functions map closed segments into closed segments11 . i. y0 − f ( f −1 (y0 ) − )). Every y0 − δ < y < y0 + δ is in the interval (y1 . f ( f −1 (y0 )− ) = y0 −[y0 − f ( f −1 (y0 )− )] < y < y0 +[ f ( f −1 (y0 )+ )−y0 ] = f ( f −1 (y0 )+ ).Functions 83 can be in the range of f . We have proved. More generally. Then. Theorem 2. Let δ = min( f ( f −1 (y0 ) + ) − y0 . y2 ). and therefore. every such y0 is equal to f (x0 ) for a unique x0 . We can do it more formally. y0 Equivalently. f (b)] → [a. x<y implies f (x) < f (y). Let y1 = f (x0 − ) and y2 = f (x0 + ).. y1 < y0 < y2 . f (b)). It follows that the inverse function has range and image f −1 : [ f (a). a continuous one-to-one function maps connected sets into connected sets. we need to show that for every > 0 there exists a δ > 0. y2 − y0 ). Now. f −1 (y0 ) − < f −1 (y) < f −1 (y0 ) + . it is clear how to set δ.e. and retains the open/closed properties. . We need to show that for every y0 ∈ ( f (a). 11 y0 − δ < y < y0 + δ. b]. b] → R is continuous and one-to-one then so is f −1 .12 If f : [a. We let then δ = min(y0 − y1 . |y − y0 | < δ implies or. Then. Proof : Without loss of generality. Since f (and f −1 ) are both monotonically increasing.

b).e. Let us re-examine two examples: . That is. 2010) 2. such that | f (y) − f (x)| < whenever |y − x| < δ and y ∈ (a. if it is continuous at every point in the interval. 2009) (26 hrs. b) and every > 0.9 Uniform continuity Recall the definition of a continuous function: a function f : (a.84 Chapter 2 Since f −1 is monotonically increasing. In fact. it includes one-sided continuity as well). b). b) → R is continuous on the interval (a. this definition applies for continuity on a closed interval as well (i. for every x ∈ (a. the way we set it here. there exists a δ < 0. In general. or | f −1 (y0 ) − f −1 (y)| < . f!x0 +!" f!x0 "=y0 f!x0 #!" ! " f#1!y0"=x0 I (20 hrs. we may apply f −1 on all terms. giving f −1 (y0 ) − < f −1 (y) < f −1 (y0 ) + . we expect δ to depend both on x and on .

Note that x and y play here symmetric roles12 . xy x[x + (y − x)] If we take δ( . This function is continuous on (0. Thus. Example: Consider next the function f : (0. In contrast. then the same δ fits all points x. even if we took care of its value at zero. 1) → R. These two examples motivate the following definition: Definition 2. if we choose δ = δ( . 1) → R. 1). | f (y) − f (x)| = |y2 − x2 | = |y − x||y + x| ≤ (x + 1)|y − x|. There is no way we can find a δ = δ( ) that would fit all x. Why? Given x ∈ (0. 1). We have already seen examples where continuity on a 12 The term “uniform” means that the same number can be used for all points. 1). x) = min(x/2. x) and y ∈ (0. it is always legitimate to replace δ by a smaller number. Why? Because for every x ∈ (0. the smaller is δ( . f : x → 1/x. given x and > 0. there is no way we could have defined the function x → 1/x as a continuous function on [0. 1].12 f : A → B is said to be uniformly continuous on A ( ) if for every > 0 corresponds a δ = δ( ) > 0. . 1). the closer x is to zero. x(x − δ) Here.Functions 85 Example: Consider the function f : (0. x). However. x) = /(x + 1). given . 1) and y ∈ (0. and every y ∈ (0. This function is also continuous on (0. y ∈ A. Thus. If we take δ = /2. | f (x) − f (y)| = |x − y| |x − y| = . 1). then | f (y) − f (x)| < whenever |x − y| < δ( . δ depends both on and x. 1] and it would have been continuous there too. x2 /2). What is the essential difference between the two above examples? The function x → x2 could have been defined on the closed interval [0. 1). such that whenever |y − x| < δ and | f (y) − f (x)| < x. then |x − y| < δ implies | f (x) − f (y)| < δ < . f : x → x2 .

I (24 hrs. b] and y ∈ [b. δ2 . We will prove that continuity on a closed interval implies uniform continuity. it follows that |x − b| < δ3 and |y − b| < δ3 and we use (2. |x − b| < δ3 and |y − b| < δ3 implies | f (x) − f (y)| < . We claim that | f (x) − f (y)| < whenever |x − y| < δ and x. then there exists a δ3 > 0 such that | f (x) − f (b)| < Hence. in which case it follows from the fact that δ ≤ δ1 . Since the function is continuous at b. We proceed by steps: Lemma 2. Remains the third case. y ∈ [a. y ∈ [a. such that x. where. (2. Proof : We have to take care at the “gluing” at the point b.3 Let f : [a. δ2 > 0. Why is that true? One of three: either x. y ∈ [a. in which case it follows from the fact that δ ≤ δ2 . c]. This is also the case here. c]. c].1). Since |x − y| < δ3 . Then there exists a δ > 0 such that x. y ∈ [b. c] and |x − y| < δ implies | f (x) − f (y)| < . b]. or x. 2011) . x ∈ [a. c] and and |x − y| < δ1 |x − y| < δ2 implies implies | f (x) − f (y)| < | f (x) − f (y)| < . c] → R be continuous. Let a < b < c. y ∈ [b.1) 2 whenever |x − b| < δ3 . b] x. without loss of generality.86 Chapter 2 closed interval had strong implications (ensures boundedness and the existence of a maximum). y ∈ [a. and suppose that for a given > 0 there exist δ1 . Take now δ = min(δ1 . δ3 ).

Functions 87 Theorem 2. (27 hrs. if for that particular there exists a corresponding δ). α−δ0 /2]∩[a. b] for all > 0. x]}. b] for all > 0. > 0. b]. b]. then there exists a δ0 > 0 such that | f (y) − f (α)| < Thus. The set A is non-empty. contradicting the fact that α is an upper bound for A.. I Comment: In this course we will make use of uniform continuity only once. z ∈ B(α. b] and hence in the closed interval [α − δ0 /2. b] (i. 2010) Examples: . By the definition of α. b] then it is uniformly continuous on that interval. δ0 ). when we study integration. Thus f is -good on [a. from which follows that f is -good in (α−δ0 . Suppose that α < b. f is -good on [a. We need to show that f is -good on [a. b] and on [b. Given . b]. 2 whenever |y − α| < δ0 and y ∈ [a. Since this argument holds independently of we conclude that f is -good in [a. c].3 that if f is -good on [a. | f (y) − f (z)| ≤ | f (y) − f (α)| + | f (z) − f (α)| < . α+δ0 )∩[a. Since f is continuous at α. we will say that f is -good on [a. α+δ0 /2]∩[a. |x − y| < δ Proof : Given such that | f (x) − f (y)| < whenever and x. as it includes the point a (trivial) and it is bounded by b. then it is -good on [a. c]. ∀y. define A = {x ∈ [a. y ∈ [a. b] if there exists a δ > 0. b]. α + δ0 /2] ∩ [a. it is also -good on [a.13 If f is continuous on [a. and by the previous lemma. Hence it has a supremum. b] : f is -good on [a. which we denote by α. Note that we proved in Lemma 2.e. b].

Ž If on any interval [n. the product of uniformly continuous functions is not necessarily uniformly continuous. Ž Finally the function f (x) = sin x2 is continuous and bounded on R.30 Let f : (0.28 Let f : R → R be continuous. then f is uniformly continuous on R.88 Chapter 2 Œ Consider the function f (x) = sin(1/x) defined on (0. n + 1]. ∞). (22 hrs. 1). an open segment. (Hint: read the proof about the limit of a product of functions being the product of their limits. g : I → R be uniformly continuous on I. Prove that f · g is uniformly continuous on I. ∞ −∞ then f attains a maximum value. It is easy to see that it is uniformly continuous. Prove that f is uniformly continuous on [a.  Is in the above case f · g necessarily uniformly continuous?  Exercise 2.. Thus.) . n ∈ Z.  Consider the function Id : R → R. The function Id · Id. Prove or disprove: Œ If lim f = lim f = 0. that is f is on R. ∞) → R be given by f : x → 1/ x: Œ Let a > 0. where α. then f assumes all real values. Œ Prove that if f. but not uniformly continuous. 2009)  Exercise 2. g are uniformly continuous on A then α f + βg is also uniformly continuous on A. it is not uniformly continuous.  Exercise 2. is continuous on R.29 Suppose A ⊆ R is non-empty. f is uniformly continuous. an open ray. however.31 Let I be an open connected domain (i. or the whole line) and let f. Even though it is continuous. but it is not uniformly continuous. ∞).e.  If lim∞ f = ∞ and lim−∞ f = −∞. β ∈ R.  Show that f is not uniformly continuous on (0. √  Exercise 2.

 Exercise 2.Functions 89  Exercise 2. Ž Prove that if f is H¨ lder continuous of order α > 1. ∞).  Prove that if f is H¨ lder continuous of order α on A then it is uniformly o continuous on A. y ∈ A.35 Let α > 0. Show that f (x) = sin(1/x) is not uniformly continuous on (0. a). We say that f : A → B is H¨ lder continuous of o order α if there exists a constant M > 0.32 Let f : A → B and g : B → C be uniformly continuous.  Exercise 2. ∀x.34 For which values of α > 0 is the function f (x) = xα uniformly continuous on [0. √ o Œ Show that f (x) = x is H¨ lder continuous of order 1/2 on (0. ∞)?  Exercise 2. then it is a constant o function.33 Let a > 0. such that | f (x) − f (y)| < M|x − y|α . Show that g ◦ f is uniformly continuous. .

90 Chapter 2 .

This means that (∀ > 0)(∃δ > 0) : (∀x ∈ B◦ (a. We claim that lim g = lim f. we introduced the limit of a function f : A → B at an (interior) point a. Derivatives were historically introduced in order to answer the need of measuring the “rate of change” of a function. Suppose that the right hand side equals . one technical clarification. lim f. a Consider the function g defined by g(x) = f (a + x). zero is an interior point of this set. 0 a provided. that the right hand side exists. It is a good exercise to prove it formally. We will actually adopt this approach as a prelude to the formal definition of the derivative. δ))(| f (x) − | < ). of course. In particular.1 Definition In this chapter we define and study the notion of differentiable functions.Chapter 3 Derivatives 3. In the previous chapter. Before we start. on the range. {x : a + x ∈ A}. .

if we consider the graph of f . x→0 a although this in not in agreement with our notations. and by the definition of g. For example. Then f (x) − f (a) is the displacement ( ) between time a and time x. δ))(|g(y) − | < ). First. f ◦ (x → a + x). f (a + x) stands for the composite function. The mean velocity. then the mean rate of change is the slope of the secant line ( ) that intersects the graph at the points a and x (see Figure 3. Often. δ))(| f (a + y) − | < ). and let a and x be two points inside this interval. Let now f be a function defined on some interval I. (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (0. it has a meaning in many physical situations. but there is nothing to guarantee that within this time interval. we could intersect this time interval into two equal sub-intervals. is an average quantity between two instants. x−a This quantity can be attributed with a number of interpretations. the body was in a “fixed state”. Thus. For example. in meters relative to the origin) as function of time (say. We define the mean rate of change ( ) of f between the points a and x to be the ratio f (x) − f (a) . one does not bother to define the function g and simply writes lim f (a + x) = lim f. i. Chapter 3 (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (0. This physical example is a good preliminary toward the definition of the derivative. f (a) is the distance from the origin in meters a seconds after the time origin. The variation in the value of f between these two points is f (x) − f (a). and [ f (x) − f (a)]/(x − a) is the mean displacement per unit time.. or mean rate of displacement.92 Setting x − a = y. Second.1). or the mean velocity.e. lim0 g = . and measure the mean velocity in each half. and f (x) is the distance from the origin in meters x seconds after the time origin. in seconds relative to an origin of time). Nothing guarantees that these mean velocities will equal the mean velocity over . f can be the position along a line (say.

a = f (a).a can be defined on the same domain as f . f (a) is nothing but a notation. Physicists aimed to define an “instantaneous velocity”. Fixing the point a.a has a limit at a.a : x → x−a The function ∆ f. x−a At this stage. f (a) = lim x→a f (x) − f (a) .Derivatives 93 f!a+h" f!a" ! a a+h Figure 3. Of course. a We call the limit f (a) the derivative ( ) of f at a. the whole interval. but excluding the point a. we define a function f (x) − f (a) . We say that f is differentiable at a ( ) if ∆ f.1: Relation between the mean rate of change of a function and the slope of the corresponding secant. having h small (what does small mean?) does not change the average nature of the measured velocity. It is not (yet!) a function evaluated . ∆ f. Comment: In the more traditional notation. We write lim ∆ f. An instantaneous rate of change can be defined by using limits. and the way to do it was to make the length of the interval very small.

we could have computed the derivative at any point. is df (a).x . f : x → c.e. f (x) = lim y→x f (y) − f (x) .x = 2x. . Another standard notation. we can define a function that given a point x.94 Chapter 3 at the point a. We construct the function ∆ f.a (y) = f (y) − f (a) c − c = = 0. as the function. these are only hand-waving arguments. x y2 − x2 = x+y y−x or ∆ f. y−x Example: Consider the function f : x → x2 . Example: Consider the constant function. x Again. We are going to calculate its derivative at the point a. due to Leibniz. In the above example. f : U → R.x (y) = so that f (x) = lim ∆ f. the derivative has a simple geometrical interpretation.. hence f (a) = 0. We calculate its derivative at a point x by first observing that ∆ f. For us. dx Having identified the mean rate of change as the slope of the secant line. In other words. As x tends to a. returns f (x). the corresponding family of secants tends to a line which is tangent to f at the point a. the more traditional notation is. returns the derivative of f at that point. A function f : A → B is called differentiable in a subset U ⊆ A of its domain if it has a derivative at every point x ∈ U. f (x) = lim ∆ f. We then define the derivative function. f : R → R.x = Id +x. i. y−a y−a The limit of this function at a is clearly zero. as we have never assigned any meaning to the limit of a family of lines.

x (y) = 1.x at x > 0 we may well assume that y > 0. 2009) (25 hrs.x (y) = so that f (x) = lim ∆ f. ∆ f. in which case ∆ f. x For negative x. ∆ f. Thus f is differentiable everywhere except for the origin1 . This motivates the following definitions: We use here a general principle. but one-sided limits do exist. 2011) Example: Consider now the function f : x → |x|. 1 1 2. For x > 0. |x| = −x and we may consider y < 0 as well. Let’s first calculate its derivative at a point x > 0. such that . whereby lima f does not exist if there exist every neighborhood of a has points x.0 (y) = |y| − |0| = sgn(y). Hence f (x) = lim ∆ f. x (−y) − (−x) = −1.x = −1. 2010) One-sided differentiability This last example shows a case where a limit does not exist.x = 1.x (y) = |y| − x . y−a Remains the point 0 itself.Derivatives 95 (23 hrs. so that ∆ f. y. y−0 The limit at zero does not exist since every neighborhood of zero has points where this function equals one and points where this function equals minus one. (29 hrs. y−x Since we are interested in the limit of ∆ f. such that f (x) = 1 and f (y) = 2 .

0 does not have a limit at 0. f (y) − f (0) 1 = y sin . ) by f (a+ ). because every neighborhood of zero has both points where ∆ f.96 Chapter 3 ) Definition 3.1 If f is differentiable at a then it is continuous at a.0 (y) = f (y) − f (0) 1 = sin .0 = 0. We have already seen that that this function is continuous at zero. consider the function   x2 sin 1   x f :x→ 0  We construct ∆ f. We denote the right-hand derivative ( definition holds for left-hand derivatives. A similar Example: Here is one more example that practices the calculation of the derivative directly from the definition. but is it differentiable at zero? We construct the function ∆ f.0 = 1 and points where ∆ f. Example: In contrast. Differentiability and continuity The following theorem shows that differentiable functions are a subclass (“better behaved”) of the continuous functions. . Consider first the function   x sin 1 x 0   x f :x→ 0  x = 0.1 A function f is said to be differentiable on the right ( at a if the one-sided limit lim ∆ f.0 (y) = x 0 x = 0. hence f is differentiable at zero.0 = −1. Theorem 3. y−0 y The function ∆ f.a + a exists. y−0 y Now lim0 ∆ f.

For derivatives higher than the third. use the fact that a function differentiable at a point is also continuous at that point. f = f (a) + (Id −a) ∆ f. in which case we may define the third derivative f . By definition f (x) = lim ∆ f . we have a functional identity. f . f (k) . or on a subset of A. then | f | is also differentiable at that point. is defined recursively.)  Exercise 3.Derivatives Proof : Note that for y a.) Likewise.x . or the second derivative ( ).x . or not. we can construct a new function—its derivative. for example. x (Note that this is a limit of limits. Then.2 Let I be a connected domain. By limit arithmetic. Higher order derivatives If a function f is differentiable on an interval A. The k-th derivative. the second derivative may be differentiable. The derivative f can have various properties. we also set f (0) = f . lim f = f (a) + lim(Id −a) lim ∆ f. a a a I  Exercise 3.a = f (a) + 0 · f (a) = f (a). and in particular. f (k) (x) = lim ∆ f (k−1) . Suppose furthermore that | f (x)| < M for all x ∈ I. it may be differentiable on A. we can define a new function— the derivative of the derivative. f (y) = f (a) + (y − a) · f (y) − f (a) . Show that f is uniformly continuous on I. which we denote by f . (y − a) 97 Viewing f (a) as a constant. the notation f (4) rather than f . and suppose that f is differentiable at every interior point of I and continuous on I. (Hint. For example. it is customary to use. it may be continuous. that if f is differentiable at a ∈ R with f (x) 0. x For the recursion to hold from k = 1.a valid in a punctured neighborhood of a. and so on. .1 Show. based on the definition of the derivative.

we develop tools that will enable us to (easily) compute derivatives.a + ∆g.a .4 Let f. we first calculated limits by using the definition of the limit. Proof : Immediate from the definition. then f = 0 (by this we mean that f : x → 0). I Theorem 3.a = ( f + g) − ( f + g)(a) Id −a f + g − f (a) − g(a) = Id −a = ∆ f. and we proved a number of theorems (arithmetic of limits). without having to go back to the definitions. I Theorem 3.3 If f = Id then f = 1 (by this we mean that f : x → 1).2 Rules of differentiation Recall that when we studied limits.2 If f is a constant function. Proof : We consider the function ∆ f +g. After having calculated the derivatives of a small number of functions. and ( f + g) (a) = f (a) + g (a). g be functions defined on the same domain (it suffices that the domains have a non-empty intersection). with which we were able to easily calculate a large variety of limits. Theorem 3. The same exactly will happen with derivatives. Proof : We proved this is in the previous section.98 Chapter 3 3. but very soon this became impractical. . If both f and g are differentiable at a then f + g is differentiable at a.

Suppose this were true for n = k.a + g(a)∆ f. Id −a Id −a and it remains to apply the arithmetic rules of limits. and ( f g) (a) = f (a) g(a) + f (a) g (a). and by the differentiation rule for products. g be functions defined on the same domain. . Then.a .e. We can show this inductively.1 The derivative is a linear operator. 99 I Theorem 3.. fn (x) = n xn−1 or equivalently fn = n fn−1 .Derivatives and by the arithmetic of limits. I Example: Let n ∈ N and consider the functions fn : x → xn . ( f + g) (a) = f (a) + g (a). If both f and g are differentiable at a then f · g is differentiable at a. i. We know already that this is true for n = 0. fk+1 (x) = k xk−1 · x + xk · 1 = (k + 1) xk .5 (Leibniz rule) Let f. Then. f g − ( f g)(a) f g − f g(a) + f g(a) − f (a)g(a) = = f ∆g. fk+1 = fk · Id. Proof : We consider the function. (α f + βg) = α f + βg . 1. This follows from the derivative of the product.a = Corollary 3. Proof : We only need to verify that if f is differentiable at a then so is α f and (α f ) = α f . fk+1 = fk · Id + fk · Id . I ∆ f g.

a .5). (31 hrs.100 Chapter 3 Theorem 3. But what about the derivative of x → sin x3 ? Here we need a rule for how to differentiate compositions. say. as a product of three functions.a = 1 1 1 1/g − (1/g)(a) −1 = − · ∆g. then 1/g is differentiable at Proof : Since g is continuous at a (it is continuous since it is differentiable) and it is non-zero. (25 hrs. 2010) Theorem 3. then the function f /g is differentiable at a and ( f /g) = f g−g f . g2 (a) 0. x → sin3 x. Then we have no problem calculating the derivative of. .6 If g is differentiable at a and g(a) a and (1/g) (a) = − g (a) .7 If f and g are differentiable at a and g(a) 0. Suppose we take for granted that sin = cos and cos = − sin . g2 Proof : Apply the last two theorems. For example. In this neighborhood we consider the function ∆1/g. ( f gh) = (( f g)h) = ( f g) h + ( f g)h = ( f g + f g )h = ( f g)h = f gh + f g h + f gh . = Id −a Id −a g g(a) g · g(a) I It remains to take the limit at a and apply the arithmetic laws of limits. then there exists a neighborhood of a in which g does not vanish (by Theorem 2. 2009) I With these theorems in hand. In particular there is no difficulty in differentiating a product of more than two functions. we can differentiate a large number of functions.

This suggest the following treatment. i. and it only takes a little more subtlety to prove it. the result is correct.  The fact that g is differentiable at f (a) implies that ψ is continuous at f (a). rendering this expression meaningless. y−a y−a and calculate its limit at a. then (g ◦ f ) (a) = g ( f (a)) · f (a). this product tends to g ( f (a)) · f (a). Proof : We introduce the following function. we expect the arguments f (y) and f (a) of g to be very close. (27 hrs. 2011) Theorem 3. ∆g◦ f.a (y) = (g ◦ f )(y) − (g ◦ f )(a) g( f (y)) − g( f (a)) = .Derivatives Consider the composite function g ◦ f . f (y) − f (a) y−a f (y) − f (a) It looks that as y → a. 101 Let’s try to calculate its derivative at a point a.a (y) = g( f (y)) − g( f (a)) f (y) − f (a) g( f (y)) − g( f (a)) · = ∆ f. assuming for the moment that both f and g are differentiable everywhere. f (a) (z) z f (a)  ψ:z→ g ( f (a)) z = f (a).a = (ψ ◦ f ) · ∆ f. defined in a neighborhood of f (a). The problem is that while the limit y → a means that the case y = a is not to be considered.e. Yet. When y is close to a. We next claim that for y a ∆g◦ f.  ∆  g. .8 Let f : A → B and g : B → R. If f is differentiable at a ∈ A. since f (y) − f (a) → 0. and g is differentiable at f (a). there is nothing to prevent the denominator f (y)− f (a) from vanishing.a (y). We then need to look at the function ∆g◦ f..a . (g ◦ f )(x) = g( f (x)).

. This suggests the following alternative definition of the derivative: f (x) − f (a) .a = ψ( f (a)) · lim ∆ f. The function ∆ f. and how clever definitions can sometimes greatly simplify proofs. a a I (32 hrs. x−a a. then this equation reads g( f (y)) − g( f (a)) g( f (y)) − g( f (a)) f (y) − f (a) = · . . and S f.a has a limit at a.x (y) =    y=a is continuous at a. 2010) 3. hence not continuous at a. and we denote this limit by f (a). f (y) = f (a) + S f. We could say that f is differentiable at a if there exists a real number. By limits arithmetic. Consider now the limit of the right hand side at a. y−a f (y) − f (a) y−a which holds indeed. Its purpose is to offer a slightly different angle of view on the subject. if the function ∆ f. Our definition of derivatives states that a function f is differentiable at an interior point a.a (y)(y − a).102 Why that? If f (y) Chapter 3 f (a). For y S f.a is not defined at a. whereas if f (y) = f (a) it reads. This equation holds also for y = a. f (y) − f (a) g( f (y)) − g( f (a)) = g ( f (a)) · .a  S f. y−a y−a and both sides are zero. lim (ψ ◦ f ) · ∆ f.a (a) = is called the derivative of f at a.a (y) = or equivalently.a (y) = ∆ f.3 Another look at derivatives In this short section we provide another (equivalent) characterization of differentiability and the derivative. such that the function  ∆ (y) y a  f. but this is a removable discontinuity.a = g ( f (a)) · f (a).

a . Of course.a and Sg. Then f (y) − f (a) = (y + a)(y − a).a (y)(y − a) in some neighborhood of a and f (a) = S f.a + g(a) S f.2 A function f defined in on open neighborhood of a point a is said to be differentiable at a if there exists a function S f. F = f (a)Sg.a (y)(y − a) g(a) + Sg.a (y)(y − a).a (y)(y − a) g(y) = g(a) + Sg.Derivatives 103 Definition 3.a (y)(y − a) = f (a)g(a) + f (a)Sg.a (y) + S f. hence f g is differentiable at a and ( f g) (a) = lim F = f (a)g (a) + g(a) f (a).a Sg.a coincides with ∆ f. such that f (y) = f (a) + S f.a continuous at a.a (a) is called the derivative of f at a and is denoted by f (a).a (a). the function in the brackets. a Leibniz’ rule Let’s now see how this alternative definition simplifies certain proofs.a (Id −a) is continuous at a. it follows that f is differentiable at a and f (a) = lim(Id +a) = 2a. S f.a (y) + g(a) S f.a (y)(y − a) (y − a). This implies the existence of two continuous functions. f (y)g(y) = f (a) + S f. S f. A first example Take the function f : x → x2 . Since the function Id +a is continuous at a. or f = f (a) + (Id +a)(Id −a).a (y)Sg. such that f (y) = f (a) + S f.a . S f. By limit arithmetic.a in some punctured neighborhood of a.a + S f.a (a) and g (a) = Sg. Suppose for example that both f and g are differentiable at a. Then.

Definition 3. We are going to put this practice on solid grounds. A point a ∈ A (not necessarily an internal point) is said to be a maximum point of f in A. 2010) (28 hrs. recall the definition. nor to exist. Then. f (a) ◦ f · S f. For example. f (a) ( f (y))( f (y) − f (a)) = g( f (a)) + Sg.a (y)(y − a) g(z) = g( f (a)) + Sg. there exist continuous functions. one of the main uses of differential calculus is to find extrema of functions. in a constant function all points are minima and maxima.a and Sg. (g ◦ f )(y) = (g ◦ f )(a) + Sg.a (y)(y − a). Comment: By no means a maximum point has to be unique.4 The derivative and extrema In high-school calculus. S f. f (a) ◦ f ]·S f. Now.3 Let f : A → B. f (a) ( f (y))S f. a (34 hrs. f (a) (z)(z − f (a)). f (a) ( f (y))S f. ∀x ∈ A.a (y) By the properties of continuous functions. f (a) . or. First. such that f (y) = f (a) + S f. and (g ◦ f ) (a) = lim Sg. if f (x) ≤ f (a) We define similarly a minimum point. 2011) 3. g( f (y)) = g( f (a)) + Sg.104 Chapter 3 Derivative of a composition Suppose now that f is differentiable at a and g is differentiable at f (a).a = g ( f (a)) f (a). [Sg.a (y)(y − a). On the other . the function in the square brackets is continuous at a.

since every neighborhood of x has points y where ∆ f.x (y) ≤ 0.x It is given that ∆ f.x (y) = satisfies f (y) − f (x) .x has a limit at x (it is f (x)). 1). both. . for y x. but this can’t hold. taking such that 0< 2 < ∆ f. Similarly. the function f : (0. a maximum point. b) is a maximum point of f and f is differentiable at x. Suppose the limit. We will show that this limit is necessarily zero.  ∆ (y) ≤ 0  f. . can’t be negative. b). then f (x) = 0. by assumption. In particular. f (y) − f (x) ≤ 0. Then. ∆ f.x   ∆ (y) ≥ 0  f. were positive.9 Let f : (a. f : x → x2 does not have a maximum in (0. b]. then for every y ∈ (a. not necessarily unique). 1) → R. b) → R. Here is a first connection between maximum (and minimum) points and derivatives: Theorem 3. there has to be a δ > 0.Derivatives 105 hand. Also. 0 < |y − x| < δ.x (y) < 3 2 whenever = /2. Proof : Since x is. hence = 0. y−x y>x y < x. because at the end points we can only consider one-sided limits. we proved that a continuous function defined on a closed interval always has a maximum (and a minumum. Comment: The open interval cannot be replaced by the closed interval [a. It does not imply that if f (x) = 0 then x is a maximum point (nor a minimum point). If x ∈ (a. I Comment: This is a uni-directional theorem.

A point a ∈ A is called a critical point of f if f (a) = 0. a + δ). There are three types of candidates: (1) critical points of f in (a.4 Let f : A → B. 3 3 3 3 . x = ±1/ 3. (ii) the end points a and b. If Comment: Although this theorem is due to Fermat. Theorem 3.. which are both in the domain.10 (Pierre de Fermat) If a is a local maximum (or minimum) of f in some open interval and f is differentiable at a then f (a) = 0. √ i. If f is differentiable. If f is continuous. if there exists a δ > 0 such that a is a maximum of f in (a − δ. An interior point a ∈ A is called a local maximum of f ( ). Definition 3. a + δ) (“local” always means “in a sufficiently small neighborhood”). then the task reduces to finding all the critical points and comparing all the critical values. It is easily checked that √ √ 2 2 f (1/ 3) = − √ and f (−1/ 3) = √ .e. the largest critical value has to be compared to f (a) and f (b). Proof : There is actually nothing to prove. but zero is not a local minimum. its proof is slightly shorter than the proof of Fermat’s last theorem. b). 2] → R. b] → R and suppose we want to find a maximum point of f . and its critical points satisfy f (x) = 3x2 − 1 = 0. Comment: The converse is not true. The value f (a) is called a critical value. Take for example the function f : x → x3 . { f (x) : f (x) = 0}.106 Chapter 3 Definition 3. as the previous theorem applies verbatim for f restricted to the interval (a − δ.5 Let f : A → B be differentiable. nor a local maximum of f . Locating extremal points Let f : [a. Finally. then a maximum is guaranteed to exist. I is differentiable in R and f (0) = 0. Example: Consider the function f : [−1. f : x → x3 − x. This function is differentiable everywhere. and (iii) points where f is not differentiable.

there is no interior point where f (x) = 0. b). I Comment: The requirement that f be differentiable everywhere in (a. b−a that is. What about the reverse direction? Take the following example: we know that the derivative of a constant function is zero. b) is imperative. If the maximum occurs at some c ∈ (a. f (−1) = 0 and f (2) = 6. we have derived properties of the derivative given the function. differentiable in (a. If the minimum occurs at some c ∈ (a. b] and differentiable in (a. b) then f (c) = 0 and we are done. b]. 1] and f (0) = f (1) = 0. b]. 107 So far. b) where f (b) − f (a) . b) then f (c) = 0 and we are done. 2 Even though f is continuous in [0. . Proof : It follows from the continuity of f that it has a maximum and a minimum point in [a. at all of them). it is not clear how to show it. How can we go from the knowledge that f (y) − f (x) = 0. b) where f (c) = 0. then there exists a point c ∈ (a. The following two theorems will provide us with the necessary tools: Theorem 3.Derivatives Finally. Is it also true that if the derivative is zero then the function is a constant? A priori.12 (Mean-value theorem ( f (c) = )) If f is continuous on [a. The only remaining alternative is that a and b are both minima and maxima.11 (Michel Rolle. a point at which the derivative equals to the mean rate of change of f between a and b. in which case f is a constant and its derivative vanishes at some interior point (well. then there exists a point c ∈ (a. Theorem 3. hence 2 is the (unique) maximum point. lim y→x y−x to showing that f is a constant function. for consider the function  x   f (x) =  1 − x  1 0≤x≤ 2 1 < x ≤ 1. and f (a) = f (b). 1691) If f is continuous on [a. b).

Hence by Rolle’s theorem there exists a point c ∈ (a. Moreover. Proof : Apply the previous corollary for f − g. f (e) = d−c however f (e) = 0. g(a) = g(b) = f (a). d ∈ (a. hence f (d) = f (c). such that f (d) − f (c) . Proof : This is almost a direct consequence of Rolle’s theorem. Proof : Let c. b). b) then f is a constant. Take a domain of definition which is the union of two disjoint sets. then there exists a real number c such that f = g + c on that interval.108 Chapter 3 Comment: Rolle’s theorem is a particular case. d). g(x) = f (x) − b−a g is continuous on [a. and this is no longer true ( f is only constant in every “connected component” ( ¯ )). b] → R. b] and differentiable in (a. f (b) − f (a) (x − a). Since this holds for all pair of points. b). Define the function g : [a. c < d. (27 hrs. I . then f is a constant.2 Let f : [a. b−a I Corollary 3. such that 0 = g (c) = f (c) − f (b) − f (a) . 2009) Corollary 3. If f (x) = 0 for all x ∈ (a. By the mean-value theorem there exists some e ∈ (c. b] → R.3 If f and g are differentiable on an interval with f (x) = g (x). b). I Comment: This is only true if f is defined on an interval.

We have seen that at a local minimum (or maximum) the derivative (if it exists) vanishes.13 If f (a) = 0 and f (a) > 0 then a is a local minimum of f . Proof : Let x < y belong to that interval. y) such that f (y) − f (x) f (c) = . The following theorem gives a sufficient condition for a point to be a local minimum (with a corresponding theorem for a local maximum). but that the opposite is not true. as in f (x) = x3 at zero. . hence f (y) > f (x). If f is increasing and differentiable then f (x) ≥ 0. f exists in some neighborhood of a.4 If f is continuous on a closed interval [a. Proof : By definition. f (a) = lim ∆ f a . (30 hrs. b] and f (x) > 0 in (a. 3. 2011) Comment: The fact that f exists at a implies: 1. I Comment: The converse is not true. b). Theorem 3. y−x however f (c) > 0. y−a y−a Thus there exists a δ > 0 such that f (y) >0 y−a whenever 0 < |y − a| < δ.a (y) = f (y) − f (a) f (y) = . but equality may hold.a >0 where ∆ f . 2. By the mean-value theorem there exists a c ∈ (x. f is continuous in some neighborhood of a. then f is increasing on that interval. f is continuous at a.Derivatives 109 Corollary 3.

0. and one which is neither. Proof : Suppose. even though f (0) = 0.. Example: Consider the function f (x) = x3 − x. one local minimum..14 If f has a local minimum at a and f (a) exists. y). then f (a) ≥ 0. a): f (y) − f (a) = f (cy ) < 0 y−a where here cy ∈ (y. Another way of completing the argument is as follows: for every y ∈ (a. It follows that f is increasing in a right-neighborhood of a and decreasing in a left-neighborhood of a. (36 hrs.110 or. f (y) > f (a). I i. By the previous theorem this would imply that a is both a local minimum and a local maximum. 2010) Theorem 3. for every y ∈ (a − δ. by contradiction.e.e. a is a local minimum. f (a) = f (a) = 0. i. ±1/ 3: one local maximum. I The following theorem states that the derivative of a continuous function cannot have a removable discontinuity. Zero is a minimum point. Chapter 3 f (y) > 0 f (y) < 0 whenever whenever a<y<a+δ a − δ < y < a. f (y) > f (a)..e.e. Similarly. a). . a + δ): f (y) − f (a) = f (cy ) > 0 y−a i. a contradiction. where cy ∈ (a.. There are three critical points √ Example: Consider the function f (x) = x4 . i. that f (a) < 0. This in turn implies that there exists a neighborhood of a in which f is constant.

(∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (a. y−a and therefore ∆ f. δ). a a Since f has a limit at a.a (y) − lim f = f (cy ) − lim f . Proof : The assumption that f has a limit at a implies that f exists in some punctured neighborhood U of a.a (y) − lim f | < a . such that f (y) − f (a) f (cy ) = . a then f (a) exists and f is continuous at a. there exists a point cy between y and a.Derivatives 111 Theorem 3.15 Suppose that f is continuous at a and lim f exists. a I Example: The function   x2   f (x) =  2 x + 5  x<0 x ≥ 0. δ)) |∆ f. δ)) | f (y) − lim f | < a . δ)) | f (cy ) − lim f | < a . for y ∈ U \ {a} the function f is continuous on the closed segment that connects y and a and differentiable on the corresponding open segment. By the mean-value theorem. which precisely means that f (a) = lim f . which by the above identity gives (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (a. Thus. . (∀ > 0)(∃δ > 0) : (∀y ∈ B◦ (a. δ) implies that cy ∈ B◦ (a. Since y ∈ B◦ (a.

b). Example: The function   x2 sin 1/x   f (x) =  0  x 0 x = 0. b] and differentiable in (a. b−a The problem is that there is no reason for the points c and d to coincide. so that the theorem does not apply. b). the derivative does not have a limit at zero.16 (Augustin Louis Cauchy. mean value theorem) Suppose Comment: If g(b) g(a) then this theorem states that f (c) f (b) − f (a) = .112 Chapter 3 is differentiable in a neighborhood of 0 and lim f exists. g(b) − g(a) g (c) This may seem like a corollary of the mean-value theorem. (29 hrs. . and therefore not differentiable at zero. Note. is continuous and differentiable in a neighborhood of zero. g are continuous on [a. Then. that f is differentiable at zero. 2009) (37 hrs. such that [ f (b) − f (a)]g (c) = [g(b) − g(a)] f (c). 0 but it is not continuous. such that f (c) = f (b) − f (a) b−a and g (d) = g(b) − g(a) . Theorem 3. 2010) that f. We know that there exist c.d. However. there exists a c ∈ (a. however.

f (a) = g(a) = 03 . a g g Comment: This is a very convenient tool for obtaining a limit of a fraction. otherwise.17 (L’Hˆ pital’s rule) Suppose that o lim f = lim g = 0. le Marquis de l’Hˆ pital. when both numerator and denominator vanish in the limit2 .Derivatives Proof : Define h(x) = f (x)[g(b) − g(a)] − g(x)[ f (b) − f (a)]. b). the limit lima ( f /g ) would not have existed.   0 0 x=a x=a 2 .e. Namely. 3 What we really do is to define continuous functions    f (x) x a g(x) x a     f˜(x) =  and g(x) =  ˜ . such that h (c) = f (c)[g(b) − g(a)] − g (c)[ f (b) − f (a)] = 0. L’Hˆ pital’s rule was in fact derived by Johann Bernoulli who was giving to the French nobleo man. lessons in calculus. b] and differentiable in (a. hence there exists a point c ∈ (a.. and that h(a) = h(b). 113 and apply Rolle’s theorem. I Theorem 3. a a and that lim a f exists. g Then the limit lima ( f /g) exists and lim a f f = lim . b). Note that f and g are not necessarily defined at a. o o acknowledging the help of Bernoulli. L’Hˆ pital was the one to publish this theorem. we verify that h is continuous on [a. there will be no harm assuming that they are continuous at a. i. since they have a limit there. Proof : It is implicitly assumed that f and g are differentiable in a neighborhood of a (except perhaps at a) and that g is non-zero in a neighborhood of a (except perhaps at a).

then g(x) 0 whenever 0 < |x − a| < δ. δ).1) f f (x) − lim < a g g (x) . for by Rolle’s theorem. f f = lim . with.1). . say. a point c x between x and a. (∀ > 0)(∃δ > 0) : x ∈ B◦ (a. (∀ > 0)(∃δ > 0) : x ∈ B◦ (a. then there exists a 0 < c x < x < δ. Both f and g are continuous and differentiable in some neighborhood U that contains a. At the end. where g (c x ) = 0. δ) implies c x ∈ B◦ (a. we claim that g does not vanish in some neighborhood U of a. there exists for every x ∈ U. g(x) − 0 g(x) g (c x ) Since the limit of f /g at a exists.114 Chapter 3 First. such that f (x) − 0 f (x) f (c x ) = = . if g (x) 0 whenever 0 < |x − a| < δ. δ) implies and since x ∈ B◦ (a. More formally. δ) implies which means that lim a (3. a g g I Example: and prove the theorem for the “corrected” functions f˜ and g. hence by Cauchy’s mean-value theorem. δ) implies and by (3. it would imply that g vanishes somewhere in this neighborhood. (∀ > 0)(∃δ > 0) : x ∈ B◦ (a. 0 < x < δ. f f (c x ) − lim < a g g (c x ) . f f (x) − lim < a g g(x) . for if g(x) = 0. we observe that the result ˜ must apply for f and g as well.

we only prove one variant among many other of l’Hˆ pital’s o rule. ∞ g g Yet another one is: Suppose that lim f = lim g = ∞. but note how difo ferent it is: . x→0 x sin2 x = 1. a a and that lim a f exists. g Then lim a f f = lim .Derivatives Œ The limit 115 lim  The limit sin x = 1. g Then lim ∞ f f = lim . 2011) Comment: Purposely. ∞ ∞ and that lim ∞ f exists. x→0 x2 with two consecutive applications of l’Hˆ pital’s rule. Another variants is: Suppose that lim f = lim g = 0. o lim (32 hrs. a g g The following theorem is very reminiscent of l’Hˆ pital’s rule.

e. Show that there exists a constant c. Any suppose that f (x) = f (x) for all  Exercise 3. and yet. ∆g. x ∈ I. Thus. g g (a) g (a) 0.5 . in which the numerator g(x) is non-zero. and in particular. y→a y→a y − a y−a hence there exists a punctured neighborhood of a where g(x)/(x − a) > 0. and f (x) − f (a) f (x) = = g(x) g(x) − g(a) f (x)− f (a) x−a g(x)−g(a) x−a = ∆ f. we can divide by g(x) in a punctured neighborhood of a. with f (a) = g(a) = 0 Then. is not increasing in any neighborhood of 0 (i.a (x) I Using the arithmetic of limits. we obtain the desired result. such that f (x) = c e x ..a (x) . Proof : Without loss of generality. there is no neighborhood of zero in which x < y implies f (x) < f (y)).116 Chapter 3 Theorem 3. (Hint: consider the function g(x) = f (x)/e x . lim g(y) − g(a) g(y) = lim = g (a) > 0. 2011) (30 hrs. That is.4 Let I be an open segment. (33 hrs. lim a and f f (a) = . assume g (a) > 0.18 Suppose that f and g are differentiable at a.)  Exercise 3. 2009)  Exercise 3.3 Show that the function   1 x + x2 sin 1   x f (x) =  2 0  x 0 x=0 satisfies f (0) > 0.

7 Calculate the following limits: Œ lim x→0  sin(ax) . Show that if f vanishes at m points. . 1.5 Derivatives of inverse functions We have already shown that if f : A → B is one-to-one and onto. sin(bx) cos(cx) tan(x)−x lim x→0 x−sin(x) . then it has an inverse f −1 : B → A. x x − x0 x and choose x0 appropriately in order to use Lagrange’s theorem. Hint: write f (x) − f (x0 ) x − x0 f (x) = · . .  Exercise 3. Find all the values of α. and n + 1 times differentiable in (a.6 Let  sin(ωx)      f (x) = γ    2 αx + βx  x<0 x=0 x > 0. We then say that f is invertible ( ¯ ). In this section. sin(x) πx 2 Ž lim x→0  lim x→1 (1 − x) tan . ∞) and lim∞ f = L. and that the inverse of a continuous function is continuous. Show that there exists a c ∈ (a. (ii) differentiable at zero. . Prove. β. b). then f vanishes at at least m − 1 points.  Exercise 3.  Let f be n times continuously differentiable in [a. 3. we study under what conditions is the inverse of a differentiable function differentiable. Suppose furthermore that f (b) = 0 and f (k) (a) = 0 for k = 0. . γ. ω for which f is (i) continuous at zero. x2 sin(1/x) . b]. . without using l’Hopital’s theorem that lim x→∞ f (x)/x = L.8 Suppose that f : (0. b) such that f (n+1) (c) = 0. We have seen that an invertible function defined on an interval must be monotonic. ∞) → R is differentiable on (0. n.Derivatives 117 Œ Let f : R → R be differentiable. (iii) twice differentiable at zero.  Exercise 3.

2: The derivative of the inverse function Recall also that f −1 ◦ f = Id. Corollary 3. If f is differentiable at f −1 (b) and f ( f −1 (b)) 0. then f −1 is not differentiable at 2. There is one little flaw with this argument. Example: The function f (x) = x3 + 2 is continuous and invertible. ( f −1 ) ( f (a)) f (a) = 1.118 Chapter 3 Figure 3. Since. then by the composition rule. 2010) Theorem 3. then f −1 is differentiable at b. We have assumed ( f −1 ) to exist. or equivalently. a = f −1 (b). we get ( f −1 ) (b) = 1 f ( f −1 (b)) . with Id : A → A.5 If f ( f −1 (b)) = 0 then f −1 is not differentiable at b. . If both f and f −1 were differentiable at a and f (a).19 Let f : I → R be continuous and invertible. f ( f −1 (2)) = 0. with f −1 (x) = √ 3 x − 2. Setting f (a) = b. respectively. (39 hrs. then we have just derived an expression for the derivative of the inverse. If this were indeed the case.

Then4 . ( f −1 ) (x) = 1 f ( f −1 (x)) = cos2 (arctan x) = 1 1+ tan2 (arctan x) = 1 ..a (x) = which we can rewrite as ∆ f −1 . a = f −1 (b). Then. 4 We used the identity cos2 = 1/(1 + tan2 ). −1 (b) x−a f (y) − f ∆ f −1 .e. To each x a corresponds a y y = f (x) b such that and x = f −1 (y). It is given lim ∆ f. 1 + x2 derivative of f −1 (x) = x1/n is Example: Consider the function f (x) = xn .b = . it follows from limit artithmetic that 1 lim ∆ f −1 .a = f (a) a 119 0. −1 (b)) b f (f I Example: Consider the function f (x) = tan x defined on (−π/2. n and together with the composition rule gives the derivative for all rational powers.Derivatives Proof : Let b = f (a).a is continuous at a = f −1 (b). i. . This proves that the ( f −1 ) (x) = 1 1/n−1 x .b (y) Since f −1 is continuous at b and ∆ f. with n integer.b (y) = 1 1 = . ∆ f. ∆ f.a (x) ∆ f. We denote its inverse by arctan x. π/2).a ( f −1 (y)) y−b 1 f (x) − f (a) = −1 = .

and satisfies f = f −1 .) Proof : Consider a segment (b. Further- more. By the definition of the supremum. − < f (x) ≤ whenever y < x < a.  Exercise 3. if A is a connected set then f has one-sided limits everywhere.11 Suppose that f : R → R is continuous.e.  Exercise 3. Hence we can define = sup K. . there exists an m ∈ K such that m > − . then we know the function whose derivative is its inverse. Note that this statement says that if we know a function whose derivative is one-to-one and onto. a] ∈ A. Prove that ( f −1 ) exists and find an expression for it in terms of f and its derivatives.10 Let f be invertible and twice differentiable in (a.. (i) Show that f has a fixed point. By the monotonicity of f . b). 3. b). Suppose that there exists a function F : R → R such that F = f .9 Let f : R → R be one-to-one and onto. then it has one-sided limits at every point that has a one-sided neighborhood that belongs to A. We need to show that f has a left-limit at a. i. Hint: start by figuring out how do such functions look like.  Exercise 3. Prove that G (x) = f −1 (x). (ii) Find such an f that has a unique fixed point. Let y be such f (y) = m. increasing).20 Suppose that f : A → B is monotonic (say. (In particular. Define G(x) = x f −1 (x) − F( f −1 (x)). − a Let > 0. We will get to it after we study integration theory. We are going to show that = lim f. there exists an x ∈ R such that f (x) = x.120 Chapter 3 Comment: So far. one-to-one and onto. we have not defined real-valued powers. The set K = { f (x) : b < x < a} is non-empty and upper bounded by f (a) (since f is increasing).6 Complements Theorem 3. and furthermore differentiable with f (x) 0 for all x ∈ R. f (x) 0 in (a.

where f (c) > lim f − c and/or f (c) < lim f. This implies the existence of a point c ∈ [a. f (b)] is in the image of f follows from the intermediatevalue theorem. It is nevertheless correct even if f has discontinuities.21 Let f : [a. . b). f (b)]. + c (We rely on the fact that one-sided derivatives exist. Image( f ) ⊆ [ f (a). f (b)].) Say. that f is not continuous. f (c) < f (c) + limc+ f < lim f c+ 2 is not in the image of f . such that f (x) = t. including a right-derivative at a and a left-derivative at b. whereby for every t between f (a+ ) and f (b− ) there exists an x ∈ [a. We proceed similarly for right-limits. that f (c) < limc+ f . Then the image of f is a segment if and only if f is continuous. b]. This theorem thus limits the functions that are derivatives of other functions—not every function can be a derivative. Then. Proof : (i) Suppose first that f is continuous.22 (Jean-Gaston Darboux) Let f : A → B be differentiable on (a. I Theorem 3. for example. Comment: The statement is trivial when f is continuously differentiable (by the intermediate-value theorem). (ii) Suppose then that Image( f ) = [ f (a). Then. 121 I Theorem 3. f has the “intermediate-value property”. b] → R be monotonic (say. by contradiction. f (b)]. b]. which contradicts the fact that Image( f ) = [ f (a). Suppose. increasing).Derivatives This concludes the proof. By monotonicity. That every point of [ f (a).

The point a cannot be a maximum point because g is increasing at a. b cannot be a maximum point because g is decreasing at b. x=0 The derivative is not continuous at zero. (Here t is a fixed parameter. and we wish to show the existence of an x ∈ [a. g (c) = f (c) − t = 0. Similarly. Let f (a+ ) > t > f (b− ).122 Chapter 3 Proof : Without loss of generality. there exists an interior point c which is a maximum. but nevertheless satisfes the Darboux theorem. Since g is continuous on [a. and by Fermat’s theorem. 2009) Example: The function   x2 sin 1/x   f (x) =  0  x 0 x=0 is differentiable everywhere. and its derivative is  2x sin 1/x − cos 1/x   f (x) =  0  x 0 . and consider the function g(x) = f (x) − tx. let us assume that f (a+ ) > f (b− ). Example: How crazy can a continuous function be? It was Weierstrass who shocked the world by constructing a continuous function that it nowhere differentiable! As instance of his class of example is ∞ f (x) = k=0 cos(21k πx) . b] such that g (x) = 0. b] then it must attain a maximum. 3k .) We have g (a+ ) = f (a+ ) − t > 0 and g (b− ) = f (b− ) − t < 0. I (32 hrs. Thus. for every t we can construct such a function.

b). Suppose. b] → R is continuous on [a. that we knew that | f (y)| < M for all y. then we can refine a lot the estimates we have on the variation of f as we get away from the point a. Then. . such that f (x) − f (a) = f (c x ). a It turns out that if we have more information. lim a f − P f. (Id −a)n Comment: Loosely speaking. f (n) (a) all exist. for example. f (a).23 Let f be a given function. as x approaches a. b) there exists a point c x . . Theorem 3. For n = 0 it states that lim[ f − f (a)] = 0. P f.a (x) is closer to f (x) than (x − a)n (or that f and P f. .n. If f : [a. In particular. a . then for every point x ∈ (a. Then.a = 0. Suppose that the derivatives f (a).7 Taylor’s theorem Recall the mean-value theorem.a (x) = a0 + a1 (x − a) + a2 (x − a)2 + · · · + an (x − a)n .n. about higher derivatives of f .n. this theorem states that. b] and differentiable in (a. x−a We can write it alternatively as f (x) = f (a) + f (c x )(x − a). lim[ f − f (a)] = 0.n. and a an interior point of its domain.Derivatives 123 3. we would deduce that f (x) cannot differ from f (a) by more than M times the separation |x − a|. .a are “equal up to order n”. Let ak = f (k) (a)/k! and set the Taylor polynomial of degree n. see below). P f.

(35 hrs. lim n−1 a n(n − 1) (Id −a)n−2 a n (Id −a) provided that the right hand side exists. For that we note o that P f. Now write the right hand side explicitly.n. a (Id −a) which holds by definition.n.a . f (n−1) − P f (n−1) . This concludes the proof.n.a − f (n) (a) .n. (Id −a)n which is a ratio of n-times differentiable functions at a. lim a Chapter 3 f − f (a) − f (a) = lim ∆ f. 2011) The polynomial P f.n−2.a is: I . so we will attempt to use l’Hˆ pital’s rule.n. = lim a (Id −a)n n! (Id −a) provided that the right hand side exists. f − P f . If we denote by Πn the set of polynomials of degree at most n.n−1. then the defining property of P f.a = lim . f − P f .a f − P f. a n (Id −a)n−1 a (Id −a)n provided that the right hand side exists. Both numerator and denominator vanish at a.a . which is the limit of the ratio of two differentiable functions that vanish at a.a − f (a) = 0.a = lim . Its existence relies on f being differentiable n times at a.n−1. We then proceed to consider the right hand side.a f − P f. 1 f (n−1) − f (n−1) (a) − f (n) (a)(Id −a) 1 lim = lim ∆ f (n−1) .a (x) is known as the Taylor polynomial of degree n of f about a. Proof : Consider the ratio f − P f.a = P f . We proceed inductively. until we get that lim lim a Thus.124 whereas for n = 1. Hence.n−1.a .a f − P f .n.1. n! a Id −a n! a which vanishes by the definition of f (n) (a).

n.a (x) + R f.a (provided that the latter exists) are equal up to order n at a. which defines an equivalence class. 1. .n. 2. The next task is to obtain an expression for the difference between the function f and its Taylor polynomial. (Id −a)n It is easy to see that “equal up to order n” is an equivalence relation. by f (x) = P f. Then P = Q. a0 = b0 .a ∈ Πn . g be defined in a neighborhood of a. n.. We say that f and g are equal up to order n at a if lim a f −g = 0. We proceed similarly for each k to show that ak = bk for k = 0.n. Proposition 3. . Q ∈ Πn be equal up to order n at a.1 Let P.Derivatives 1. . n. Rn (x).6 Let f.a (x). P f. . n. 1. 1. I . . .a 125 Definition 3. P(k) (a) = f (k) (a) for k = 0. f. . . We know that lim a P−Q =0 (Id −a)k for k = 0.n. For k = 0 0 = lim(P − Q) = a0 − b0 . the above theorem proves that f (x) and P f.e. . . .n. Thus. . Proof : We can always express these polynomials as P(x) = a0 + a1 (x − a) + · · · + an (x − a)n Q(x) = b0 + b1 (x − a) + · · · + bn (x − a)n . a i. We define the remainder ( ) of order n.

1 + x2 P f.6 If f is n times differentiable at a and f is equal to Q ∈ Πn up to order n at a.0 (x) = k=0 xk . then P f. which means that n xk are P f. f (x) − xn n k=0 ∞ n x = k k=0 k=0 xk + xn+1 . Example: We know from high-school math that for |x| < 1 1 f (x) = = 1−x Thus.2n.n.126 Chapter 3 Corollary 3. . 1−x xk = x . Example: Similarly for |x| < 1 1 f (x) = = 1 + x2 Thus.a = Q.n. 1−x n k=0 Since the right hand side tends to zero as x → 0 it follows that f and equal up to order n at 0. x2 .0 (x) = k=0 (−1)k x2k . 1 + x2 Since the right hand side tends to zero as x → 0 it follows that f (x) − = n n k 2k k=0 (−1) x x2n ∞ n (−x ) = 2 k k=0 k=0 (−1)k x2k + (−1)n+1 x2n+2 .

2011) Theorem 3. Proof : We use again the fact that if a polynomial “looks” like the Taylor polynomial then it is.a = [P f. P f +g.a =g + P f.n.n. Then.n.a + Pg.n.a g + P f. 1 + t2 Since x2k+1 (−1) f (x) − ≤ |x|2n+3 .n.a Pg. Then.a = P f. 2k + 1 (36 hrs.n. n (Id −a) (Id −a)n . where []n stands for a truncation of the polynomial in (Id −a)at the n-th power.0 = n (−1)k k=0 x2k+1 .a Pg. The sum P f.n. Second.n.a Pg.a .a f g − P f.n. g be n times differentiable at a.a ) = lim + lim = 0.n. x f (x) = 0 n dt = 1 + t2 x 0        n (−1) t + (−1) k 2k k=0 x n+1  t2n+2    dt   2 1+t = k=0 (−1)k x2k+1 + 2k + 1 n (−1)n+1 0 t2n+2 dt.a Pg.a g − Pg. we note that P f.n.n. 2k + 1 k=0 k it follows that P f.a ]n .n.n.a + Pg.a ( f + g) − (P f.n.n.24 Let f.a g − P f.2n+2.a .a belongs to Πn and lim a f − P f. n n a a (Id −a)n (Id −a) (Id −a) which proves the first statement. and that f g − P f.n.a ∈ Π2n .n.n.a + Pg.a g − Pg.n.Derivatives 127 Example: Let f = arctan.n.n.a = n (Id −a) (Id −a)n f − P f. and P f g.n.

I Theorem 3. f (x) = P f.a (x) = f (n+1) (c) (x − a)n+1 . x).128 Chapter 3 whose limit at a is zero. f (z) f (n) (z) (x − z)2 − · · · − (x − z)n . φ is differentiable. All that remains it to truncate the product at the n-th power of (Id −a).n.z (x) = f (x) − f (z) − f (z)(x − z) − We note that φ(a) = R f.a (x) can be represented in the Lagrange form.n. n! Comment: For n = 0 this is simply the mean-value theorem.a (x). (n + 1)! for some c ∈ (a. It can also be represented in the Cauchy form. where the remainder R f. 2! n! f (n+1) (z) f (n) (z) (x − z)n − (x − z)n−1 n! (n − 1)! f (n+1) (z) =− (x − z)n .a (x) = for some ξ ∈ (a. and consider the function φ : [a.n. x] → R. x]. with φ (z) = − f (z) − f (z)(x − z) − f (z) − − ··· − f (z) (x − z)2 − f (z)(x − z) 2! and φ(x) = 0. f (n+1) (ξ) (x − ξ)n (x − a). f.n. x). φ(z) = f (x) − P f.a (x) Moreover. Then.a (x) + Rn. R f. n! .n.n. R f. Proof : Fix x.25 (Taylor) Suppose that f is (n + 1)-times differentiable on the interval [a.

x0 = 0. That is. both in the Lagrange and Cauchy forms: Œ f (x) = 1/(1 + x). x) with nonvanishing derivative.  f (x) = cosh(x). find coefficients a0 . Prove that f is a polynomial whose degree is at most n − 1. such that p(x) = n ak (x + 1)k . Rn (x) = (x − a) p f (n+1) (ξ) (x − ξ)n−p+1 p n! For p = 1 we retrieve the Cauchy form. .13 Let f : R → R be n times differentiable. a1 .14 Write the Taylor polynomials of degree n at the point x0 for the following functions. . n ∈ N. differentiable on (a. For p = n + 1 we retrieve the Lagrange form.Derivatives 129 Let now ψ be any function defined on [a. an .  Exercise 3. Rn (x) = ψ(x) − ψ(a) f (n+1) (ξ) (x − ξ)n ψ (ξ) n! Set for example ψ(z) = (x − z) p for some p. there exists a point a < ξ < x. such that φ(x) − φ(a) φ (ξ) = . n ∈ N. x0 = 0. such that f (n) (x) = 0 for all x ∈ R. By Cauchy’s mean-value theorem.12 Write the polynomial p(x) = x3 − 3x2 + 3x − 6 as a polynomial  Exercise 3. I Comment: Later in this course we will study series and ask questions like “does the Taylor polynomial Pn (x) tend to f (x) as n → ∞?”. ψ(x) − ψ(a) ψ (ξ) That is. k=0 Solve this problem using two methods: (i) Newton’s binomial. as well as the corresponding remainders. Then. 2009) in x + 1. (34 hrs. Here we deal with a different beast: the degree of the polynomial n is fixed and we consider how does Pn (x) approach f (x) as x → a. . . and (ii) Taylor’s polynomial. This is the notion of converging series.  Exercise 3. x]. The fact that Pn approaches f faster than (x − a)n as x → a means that Pn is an asymptotic series of f . .

‘ f (x) = sin(x). Use the notation α def α(α − 1) .  then f (0) does not exist.  cos(1/2).17 Œ For every polynomial P(x) = n k=0 ak xk and evert q ∈ N. x0 = 0. then f (a) = lim h→0 f (a + h) + f (a − h) − 2 f (a) . the following expressions. x0 = 1.  Exercise 3. h Œ Using l’Hopital’s rule. .16 Evaluate. α N. we denote q<n q ≥ n.  f (x) = (x − 2)6 + 9(x + 1)4 + (x − 2)2 + 8.15 Show using different methods that of f (a) exists. but the limit of the above fraction does exist. n = 6. . Why isn’t it a contradiction?  Exercise 3. using Taylor’s polynomial.  Using Taylor’s polynomial of order two f at a. since it is only given that f is twice differentiable at a. k ∈ N. x0 = π/2. n = 5. up to an accuracy of at least 0. n ∈ N. = n! n which generalizes Newton’s binomial. Note that you can’t use neither Lagrange or Cauchy’s form of the remainder. n = 2k.130 Chapter 3 Ž f (x) = x5 − x4 + 7x3 − 2x2 + 3x + 1.  q   ak x k  Pq (x) =  k=0 P(x)  .  Exercise 3. (α − n + 1) . Ž Show that if   x2  x≥0  f (x) =  2 −x x < 0.  f (x) = (1 + x)α .01: Œ e1/2 . x0 = 0.

g have Taylor polynomials of order n at zero. write the Taylor polynomial of order n at zero of the function g(x) = f (2x). where Pn and Qn are the corresponding Taylor polynomials at zero.Derivatives 131 Suppose that f. Ž Find the Taylor polynomial of order three at zero of the function f (x) = exp(sin(x)). P and Q. k Hint: think of the Taylor polynomial of a product of functions.18 Let f be n times differentiable at x0 and let P(x) = ak xk be its Taylor polynomial at zero. Hint: write f (g(x)) = f (g(x))/g(x) · g(x). n k=0  Exercise 3.20 Suppose that f and g have derivatives of all orders in R and . and that g does not vanish in a punctured neighborhood of zero. then n ( f · g)(n) (x0 ) = k=0 n (k) f (x0 ) g(n−k) (x0 ).  Given the Taylor polynomial P of order n at zero of the function f . Show that the Taylor polynomial of order n at zero of f ◦ g is given by (P ◦ Q)n . that Pn (x) = Qn (x) for all n ∈ N and every x ∈ R.19 Show that if f and g are n times differentiable at x0 .  Exercise 3. Justify your answer. Does it imply that f (x) = g(x)?  Exercise 3. Suppose furthermore that g(0) = 0. What is the Taylor polynomial of f of order n − 1 at zero.

132 Chapter 3 .

Within the Riemann integration theory. We will learn how to calculate the area of a region bounded by the x-axis. it is defined as “the number of squares with sides of length one that fit into the region”. For the moment let us assume that f ≥ 0. This region will be denoted by R( f. 0) and (b. When we learn in school about the area of a region. Integrals were invented in order to calculate areas. and its area (once defined) will be called the integral of f on [a. and two vertical lines passing though points (a. b).Chapter 4 Integration theory 4. 0). There are multiple non-equivalent ways to define integrals. It is of course hard to make sense with such a definition. 1 . a. b]1 . We study integration theory in the sense of Riemann. The fact that the two are intimately related— that derivatives are in a sense the inverse of integrals—is a wonder. the graph of a function f . another classical integration theory is that of Lebesgue (studied in the context of measure theory). we adopt a particular derivation due to Darboux.1 Definition of the integral Derivatives were about calculating the rate of change of a function.

a.2 Let f : [a. . P) ≤ A ≤ U( f. . We define the lower sum ( ¯ ) of f for P to be n L( f. . you can’t use the word to be defined in a definition) of the interval into a finite number of segments. a = x0 < x1 < x2 < · · · < xn = b. [a. xn } be a partition of [a. . b) must satisfy L( f. xn ]. b]. b] → R be a bounded function.b" a b Definition 4. P) for all partitions P.1 A partition ( ) of the interval [a. x1 ] ∪ [x1 .a. Definition 4. and let P = {x0 . any sensible definition of the area A of R( f. Clearly. We define the upper sum ( ¯ ) of f for P to be n U( f. P) = i=1 Mi (xi − xi−1 ). P) = i=1 mi (xi − xi−1 ). I know. .134 Chapter 4 f R!f. b] is a partition (yes. x1 . x2 ] ∪ · · · ∪ [xn−1 . b] = [x0 . We will identify a partition with a finite number of points. Let mi = inf{ f (x) : xi−1 ≤ x ≤ xi } Mi = sup{ f (x) : xi−1 ≤ x ≤ xi }.

−1/3. 1]. 2011)  Exercise 4. 2/3. P) and U( f. Q) ≥ L( f. P2 ) where P1 and P2 are arbitrary partitions? It ought to be true. 8/9.Integration theory 135 f a x0 x1 x2 x3 b Figure 4. Ž Is it however true that L( f.  It is always true that L( f.3 A partition Q is called a refinement ( P ⊂ Q. P1 ) ≤ U( f.1 Let f : [−1. P) and U( f. 1} be a partition of [−1. 2009) (38 hrs. Then L( f. (35 hrs. P) because mi ≤ Mi for all i. Definition 4.1: The lower and upper sums of f for P. ) of a partition P if Lemma 4. P). P). Comments: Œ The boundedness of f is essential in these definitions. Q) ≤ U( f. 1] → R be given by f : x → x2 and let P = {−1. −1/2. .1 Let Q be a refinement of P. Calculate L( f. P) ≤ U( f.

such that each P j differs from P j−1 by one point. . x1 . . L( f. Q) = i=1 mi (xi − xi−1 ) + m j (y − x j−1 ) + m j (x j − y) + ˜ ˆ i= j+1 mi (xi − xi−1 ). since both are infimums over a ˜ ˆ 2 smaller set . x j−1 . . Q) − L( f. . Thus. P) ≥ 0. I Theorem 4. That is. xn } Q = {x0 . L( f. P1 ) ≤ · · · ≤ L( f. . There exists a sequence of partitions.P2 of [a. xn }. Consider now two partitions. . P) = i=1 n mi (xi − xi−1 ) and j−1 n L( f. x j . y. b]. A similar argument works for the upper sums. P = {x0 . .136 Chapter 4 Proof : Suppose first that Q differs from P by exactly one point. L( f. P1 ) ≤ U( f. P2 ). ˜ ˆ Thus. . It remains to note that m j ≥ m j and m j ≥ m j . P ⊂ Q. . . L( f. Then. P) ≤ L( f. 2 It is always true that if A ⊂ B then inf A ≥ inf B and sup A ≤ sup B. where m j = inf{ f (x) : x j−1 ≤ x ≤ y} ˜ m j = inf{ f (x) : y ≤ x ≤ x j }. . Q). . P = P0 ⊂ P1 ⊂ · · · ⊂ Pk = Q. Q) − L( f. P) = (m j − m j )(y − x j−1 ) + (m j − m j )(x j − y). Hence. .1 For every two partitions P1 . ˆ L( f. x1 .

L ( f ) = {L( f. Q) ≤ U( f. 137 I  Exercise 4. ≤ c ≤ u for all ∈ L ( f ) and Ž For every > 0 there exist an ∈ L ( f ) and a u ∈ U ( f ). Ž If R is a partition of both P and Q then it is also a refinement of P ∪ Q. P) : P is a partition of [a.  Exercise 4. We may now consider the collection of lower and upper sums. b]}. Similarly. b] We once proved a theorem stating that the following assertions are equivalent: Œ L( f ) = U( f ). Q) ≤ U( f. we may define the least upper bound and greatest lower bound.2 Prove or disprove: Œ A partition Q is a refinement of a partition P if and only if every segment in Q is contained in a segment of P.3 Characterize the functions f for which there exists a partition P such that L( f. and not sets of partitions. L( f ) = sup L ( f ) and U( f ) = inf U ( f ). The set L ( f ) is non-empty and bounded from above by any element of U ( f ).Integration theory Proof : Let Q = P1 ∪ P2 . such that u − < . Thus. The real number U( f ) is called the upper integral ( ) of f on [a. b]} U ( f ) = {U( f. P) : P is a partition of [a. b] and L( f ) is called the lower integral ( ) of f on [a. .  If a partition P is a refinement of a partition Q then every segment in Q is a segment in P. P). the set U ( f ) is non-empty and bounded from below by any element of L ( f ). P1 ) ≤ L( f. P2 ). Since Q is finer than both P1 and P2 it follows that L( f. Note that both L ( f ) and U ( f ) are sets of real numbers. P) = U( f.  There is a unique number c. such that u ∈ U ( f ).

We will only briefly talk about them (only due to a lack of time). it is very important to realize that the integral only involves f .138 Chapter 4 Definition 4. b]. We call this number the integral of f on [a. The geometric interpretation is that areas under the x-axis count as “negative areas”.4 If L( f ) = U( f ) we say that f is integrable on [a. Comment: At this stage we only consider integrals of bounded functions over bounded segments—so-called proper integrals. and denote it by b f. b] if the upper and lower integrals coincide. b]. Two questions arise: Œ How do we know whether a function is integrable? . Comment: The integral of f is the unique number such that b L( f. are instances of improper integrals. a This notation is indeed very suggestive about the meaning of the integral. f is integrable on [a. P) ≤ a f ≤ U( f. however. Comment: What about the case where the sign of f is not everywhere positive? The definition remains the same. a and b. Integrals like ∞ 1 f a and 0 log x dx. for all partitions P and Q. Comment: You are all accustomed to the notation b f (x) dx. a That is. and it is not a function of any x. Q).

P) − L( f. such that U( f. by the equivalent criteria described above. suppose f is integrable. P ) ≤ U( f. Proof : If for every > 0 there exists a partition P. n L( f. P) = It follows that L( f ) = U( f ) = c(b − a). a . P) < L( f.2 A bounded function f is integrable if and only if there exists for every > 0 a partition P. P) = i=1 n mi (xi − xi−1 ) = c(b − a) Mi (xi − xi−1 ) = c(b − a). such that U( f. P . i=1 U( f. P) − L( f. P ) < . 2011) 139 The following lemma will prove useful in checking whether a function is integrable: Theorem 4. Let P = P ∪ P . given > 0. Since P is a refinement of both. such that U( f. P) + . then U( f. P) + . P) < L( f. b] → R.Integration theory  How to compute the integral of an integrable function? (40 hrs. then f is integrable. Then. there exist partitions P. f : x → c. P ) < . P ) − L( f. Then for every partition P. I Example: Let f : [a. Conversely. hence f is integrable and b f = c(b − a).

n Lunif ( f ) = {L( f.  0   f :x→ 1  x 1 x = 1. Let Pn . and U( f. Since L( f ) and M j = 1. P) = 0 (41 hrs. hence L( f ) = sup L ( f ) ≥ sup Lunif ( f ) = 0 U( f ) = inf U ( f ) ≤ inf Uunif ( f ) = 0. n ∈ N denote the family of uniform partitions. thus 0 ≤ L( f ) ≤ U( f ) ≤ 0. 2010) and U( f. P) = 1. and L( f ) = 0 and U( f ) = 1. 1] → R. Pn ) : n ∈ N} Clearly. Example: Consider next the function f : [0. P) = 2/n. Pn ) : n ∈ N} Uunif ( f ) = {U( f. .140 Chapter 4 Example: Let f : [0. mj = 0 hence L( f.  0   f :x→ 1  For every partition and every j. xi = For every such partition. x Q x ∈ Q. 2] → R. Lunif ( f ) ⊆ L ( f ) and Uunif ( f ) ⊆ U ( f ). P) = 0 Set now. U( f ). 2i . f is not integrable. L( f.

Pn = {0. then mi = xi−1 and Mi = xi . will convince us that we must develop tools to deal with more complicated functions. n L( f. We are next going to examine a number of examples. b}. . hence n L( f. Let P be any partition. and 2 141 f = 0. . Pn ) = i=1 As in the previous example.Integration theory which implies that f is integrable. b] → R. will practice our understanding of integrability. i=1 U( f. f = Id. b/n. . P) = Using again uniform partition. 2 (37 hrs. which on the one hand. and on the other hand. Example: Let f : [0. 2b/n. L( f ) = sup L ( f ) ≥ sup Lunif ( f ) = U( f ) = inf U ( f ) ≤ inf Uunif ( f ) = hence f is integrable and b 0 b2 2 b2 . 0 This example shows that integrals are not affected by the value of a function at an isolated point. 2 b2 f = . P) = i=1 n xi−1 (xi − xi−1 ) xi (xi − xi−1 ). nn n 2 2 n U( f. Pn ) = i=1 n 1 (i − 1)b b b2 n(n − 1) b2 = 2 = 1− n n n 2 2 n ib b b2 n(n + 1) b2 1 = 2 = 1+ . 2009) . .

We start by verifying which classes of functions are integrable.3 If f [a. b] → R. 6 . we need stronger tools. 3 n k2 = k=1 1 n(n + 1)(2n + 1)..2 Integration theorems Theorem 4.142 Chapter 4 2 Example: Let f : [0. Take the same uniform partition. Pn ) = i=1 Since L( f ) ≥ sup Lunif ( f ) = b3 /3 it follows that 0 b and b3 . P) = i=1 n 2 xi−1 (xi − xi−1 ) U( f. i. f (x) = x2 .e. Pn ) = i=1 n (i − 1)2 b2 b b3 (n − 1)n(2n − 1) b3 1 = 3 = 1− 2 n n n 6 3 n i2 b2 b b3 n(n + 1)(2n + 1) b3 1 = 3 = 1+ 2 n n n 6 3 n 1+ 1− 2 . Let P be any partition. 2010) 4. b] → R is monotone. then it is integrable. 3 U( f ) ≤ inf Uunif ( f ) = b3 /3. hence n L( f. then3 n L( f. P) = i=1 xi2 (xi − xi−1 ). (42 hrs. n 1 2n U( f. then mi = xi−1 and Mi = xi2 . f = This game can’t go much further.

We want to show that for every > 0 there exists a partition P. P) = i=1 (Mi − mi )(xi − xi−1 ) < b−a (xi − xi−1 ) = . n n L( f. which guarantees that U( f. Then. b] → R is continuous. P) − L( f. P) − L( f. n Thus. that f is monotonically increasing.4 If f [a. n. such that b−a Take then a partition P such that xi − xi−1 < δ | f (x) − f (y)| < whenever |x − y| < δ. then it is integrable. P) = i=1 f (xi−1 )(xi − xi−1 ) and U( f. without loss of generality. U( f. That is. We need to somehow control the difference between mi and Mi . P) = n [ f (xi ) − f (xi−1 )](xi − xi−1 ). P) < L( f. such that U( f. Pn ) = n n [ f (xi ) − f (xi−1 )] = i=1 (b − a)[ f (b) − f (a)] . . Pn ) − L( f. b−a U( f. Recall that a function that is continuous on a closed interval is uniformly continuous. for all i = 1. i=1 Take now a family of equally-spaced partitions. . there exists a δ > 0. for every partition P. Proof : First. . I Theorem 4. Pn ) − L( f. and n n U( f. P) + . i=1 . it follows that Mi − mi < /(b − a). Since continuous functions on closed intervals assume minima and maxima in the interval. P) = i=1 f (xi )(xi − xi−1 ). .Integration theory 143 Proof : Assume. for every > 0 we can choose n > (b − a)[ f (b) − f (a)]/ . Pn ) < (we used the fact that the sum was “telescopic”). For such j=0 partitions. Pn = {a + j(b − a)/n}n . a continuous function on a segment is bounded. hence.

Q)] − [U( f. P ) < 2 and U( f. P ) − L( f. c] and [c. P ) + U( f. . P ) − L( f. b]. c] and a partition P of [c. P ) − L( f. b]. hence f is integrable on [a. c]. such that U( f. . We have L( f. P ). b].5 Let f be a bounded function on [a. we obtain the desired result. suppose that f is integrable on both [a. P) − L( f. c]. For every > 0 there exists a partition P of [a. P) < . b]. . P ) − L( f. and let a < c < b. xn } is a partition of [c. b]. If the partition P does not include the point c then let Q = P ∪ {c}. b] if and only if it is integrable on both [a. . It is sufficient to show that it is integrable on [a. P ) < . < ≥0 = and U( f. . . b]. Proof : (1) Suppose first that f is integrable on [a. P )] = [U( f. P ) + L( f. . x j } is a partition of [a. in which case b c b f = a a f+ c f. c] and [c. 2 Since P = P ∪P is a partition of [a. Q) ≤ U( f. and U( f. P )] < . . Then there a exists a partition P of [a.144 Chapter 4 I (42 hrs. Q) = U( f. c]. b]. . and P {x j . then P = {x0 . Then. Q) = L( f. Q) − L( f. 2011) Theorem 4. b]. f is integrable on [a. Q) − L( f. Suppose that x j = c. P) − L( f. P ) from which we deduce that [U( f. P) < . summing up. (2) Conversely. such that U( f.

a b a f =− b a f and a f = 0. f (x) + g(x) ≥ inf{ f (x) : x ∈ A} + inf{g(x) : x ∈ A}.6 If f and g are integrable on [a. for every partition P that is a union of partitions of P and P . P) for every partition P of [a. b] if and only if it is integrable on both [a. so that (as can easily be checked) the addition rule always holds. since c L( f. b]. Theorem 4.Integration theory 145 Thus. b]. c] for every partition P of [c. We now add the following conventions. b] with b > a. L( f. c] and [c. then so is f + g. we have shown that f is integrable on [a. . 4 This is because for every x. b]. P ) c for every partition P of [a. we observe that if f and g are bounded functions on a set A. P ) f ≤ U( f. then4 inf{ f (x) + g(x) : x ∈ A} ≥ inf{ f (x) : x ∈ A} + inf{g(x) : x ∈ A} sup{ f (x) + g(x) : x ∈ A} ≤ sup{ f (x) : x ∈ A} + sup{g(x) : x ∈ A}. (More precisely. and b b b ( f + g) = a a f+ a g. we have only defined the integral on [a. P ) ≤ it follows that c b L( f. Next.) Since there is only one b such number. it is equal to a f . but it is easy to see that this implies for every partition. b]. I (39 hrs. 2009) Comment: So far. P) ≤ a f+ c f ≤ U( f. so that the right-hand side is a lower bound of f + g. Proof : First of all. P ) ≤ a b f ≤ U( f.

for every partition P. P ) < Take P = P ∪ P . P) + U(g. P) + L(g. P) − L(g. 5 If |x| < for every > 0 then x = 0. P ) − L( f. P) + U(g. b (4.1). P ) − L(g. P) ≤ a ( f + g) ≤ U( f + g. P) ≤ U( f. P) ≤ U(g. P) ≤ [U( f. P) − L( f. P)] − [L( f. Now. then 2 U(g. P) − L(g. L( f + g. such that U( f. 2 and it remains to substitute into (4. P) + L(g. 2 U( f. U( f + g. It follows that b b b ( f + g) − a a f− a g ≤ [U( f. P) + U(g. P) ≤ U( f. Given > 0 there exist partitions P . P ) − L( f. P ) < . P ) − L(g. P) ≥ L( f. P)] + [U(g. P) + U(g. and also L( f. it follows that it is I Since the right-hand side can be made less than zero5 . P) − L( f + g. P). P ) < L( f. P ) < .1) 2 and U(g. P) ≤ U( f. P)]. P) U( f + g. P . . P) ≤ L( f + g. to prove that f + g is integrable. We are practically done. hence. P) + L(g. P) + L(g. P).146 Chapter 4 This implies that for every partition P. P)]. P) ≤ a b b f+ a g ≤ U( f. P) − L( f. P). for every .

b] and that (44 hrs.5 Recall the Riemann function.4 Denote by Show that b the upper integral. R = 0 (the lower integral). Proof : We saw that (g − f ). Then we only need to use the additivity of the integral for g = f + (g − f ). which is non-zero only at a finite number of points.  Conclude that R is integrable on [a. 2010) b a R = 0. b]. Œ Show that b a x = p/q ∈ Q x Q. b] be a gievn interval. Also. Show that Z is a finite set. is integrable and its integral is zero.1 If f is integrable on [a. I  Exercise 4. b] : R(x) > }. . Let f. find two functions f.  1/q   R(x) =  0  Let [a.  Let > 0 and define Z = {x ∈ [a.Integration theory 147 Corollary 4.  Exercise 4. b] and g differs from f only at a finite number of points then b b f = a a g. g be bounded on [a. b b ( f + g) ≤ a a f+ a g. P) < . Ž Show that there exists a partition P such that U(R. g for which the inequality is strong.

such that U( f. and this number is by definition a (c f ). L(− f. ≤ U(c f. Consider then the case where c < 0. P) = −U( f. P) = c L( f. . For every partition P. P) = c L( f. Since f is integrable on [a. Since f is integrable on [a. P) < . Proof : Start with the case where c > 0. P) = −(L( f. P). P). then there exists for every > 0 a partition P. P) = −L( f. P) − L(c f. then b b (c f ) = c a a f. P). in fact.7 If f is integrable on [a. hence U(− f. we note that for every partition P. such that U( f. P) − L(− f. b]. P) − L( f. it is sufficient to consider the case of c = −1. b L(c f. then c f is integrable on [a. b] and a b b (c f ) = c a f. P) − L( f. P)) < . P) < . P) − U( f. P) and U(− f. P) ≤ b tions P. c hence U(c f. b].148 Chapter 4 Theorem 4. P) for all parti- Since there is only one number such that L(c f. b] and c ∈ R. which proves that c f is integrable on [a. Then. L(c f. since for arbitrary c < 0 we write c f = (−1)|c| f . then there exists for every > 0 a partition P. P) = c U( f. P) and U(c f. P) ≤ c a f ≤ c U( f. P) = U(c f. b]. P) < . For every partition P.

j . we can add to each a constant to make it positive and then use the “arithmetic of integrability”). (45 hrs. j j hence m fj mg ≤ m fj g j and M jf g ≤ M jf M g . Proof : We will prove this theorem for the case where f. we note that for every partition P. P) ≤ − a f ≤ −L( f. Let P be a partition. The generalization to arbitrary functions is not hard (for example. g ≥ 0. b]. We also define M f = sup{| f (x)| : a ≤ x ≤ b} M g = sup{|g(x)| : a ≤ x ≤ b}. Then. P) = U(− f. b L(− f. then so is their product f · g.Integration theory 149 which proves that (− f ) is integrable on [a. 2010) This last theorem is in fact a special case of a more general theorem: Theorem 4. I and the rest is as above. m fj mg ≤ f (x)g(x) ≤ M jf M g . It is useful to introduce the following notations: m fj = inf{ f (x) : x j−1 ≤ x ≤ x j } mg = inf{g(x) : x j−1 ≤ x ≤ x j } j m fj g = inf{ f (x)g(x) : x j−1 ≤ x ≤ x j } M jf = sup{ f (x) : x j−1 ≤ x ≤ x j } M g = sup{g(x) : x j−1 ≤ x ≤ x j } j M jf g = sup{ f (x)g(x) : x j−1 ≤ x ≤ x j }. P). b]. since f and g are bounded. P) = −U( f.8 Suppose that both f and g are integrable on [a. For every x j−1 ≤ x ≤ x j .

P0 ) ≥ m(b − a) and U( f. b]. then b m(b − a) ≤ a f ≤ M(b − a). P0 ) ≤ a f ≤ U( f. P)] + M f [U(g. P) − L( f.10 The function F defined above is continuous on [a. It is then a simple task to show that the integrability of f and g implies the integrability of f g. Proof : Let P0 = {x0 . b] and m ≤ f (x) ≤ M for all x. Theorem 4. j j j j j j j This inequality implies that U( f g. P0 ). P) ≤ M g [U( f. then L( f. b]. P) − L(g. xn }. I Suppose now that f is integrable on [a. P)]. P0 ) ≤ M(b − a). Define for every x. I Theorem 4. .9 If f is integrable on [a. x F(x) = a f. P) − L( f g. We proved that it is integrable on every sub-segment.150 It further follows that Chapter 4 M jf g −m fj g ≤ M jf M g −m fj mg ≤ (M jf −m fj )M g +m fj (M g −mg ) ≤ M g (M jf −m fj )+M f (M g −mg ). and the proof follows from the fact that b L( f.

we take δ = /M.7 Show that if f is bounded and integrable on [a.e. Show that cb b f =c ca a g. Define M = sup{| f (x)| : a ≤ x ≤ b}.8 Let a < b and c > 0 and let f be integrable on [ca. Show that for every p > 0. then | f | is integrable on [a. By the previous theorem. and then |F(y) − F(x)| < whenever |x − y| < δ. .6 Let f be bounded and integrable in [a. 2009)  Exercise 4.. i. and in particular. b 1/p | f |p a ≤ M|b − a|1/p . That is. b].  Exercise 4. −M(y − x) ≤ F(y) − F(x) ≤ M(y − x). that the integral on the left hand side exists. cb]. Then for |x − y| < δ (suppose wlog that y > x). (41 hrs. Given > 0 take δ = /M. |F(y) − F(x)| ≤ M(y − x) < . given > 0. b].  Exercise 4. y x y 151 F(y) − F(x) = a f− a f = x f.Integration theory Proof : Since f is integrable it is bounded. b] and b b f ≤ a a | f |. where g(x) = f (cx). b] and denote M = sup{| f (x)| : a ≤ x ≤ b}. I which proves that F is (uniformly) continuous on [a. since −M ≤ f (x) ≤ M for all x.

then 1/ f is integrable on [a. b].  Exercise 4. .9 Let f be integrable on [a. x2 ≤ b} = sup{| f (x1 )− f (x2 )| : a ≤ x1 . b] and g(x) = f (x − c). Ž Prove that if f is integrable on [a. Hermann Hankel (1839-1873) proved that if a function is integrable. but had an error in his .11 A function s : [a. such that b b f− a a g< . where ωi is the variation i=1 of f in the segment [xi−1 . > 0a  Show that if f is integrable on [a. Œ Show that if f is integrable on [a. b]. Show that g is b b+c f = a a+c g.  Show that f is integrable on [a.10 Let f : [a. Discussion: A question that occupied many mathematicians in the second half of the 19th century is what characterizes integrable functions. xi ]. b] if and only if there exists for every > 0 a partition P = {x0 . Œ Prove that ω f = sup{ f (x1 )− f (x2 ) : a ≤ x1 . such that b b b b > 0 two f− a a s1 < and a s2 − a f < . b]. b] as ω f = sup{ f (x) : a ≤ x ≤ b} − inf{ f (x) : a ≤ x ≤ b}. b] → R be bounded. b] then there exists for every continuous function g. b] → R is called a step function ( ) if there exists a partition p such that s is constant on every open interval (we do not care for the values at the nodes). xn }. and there exists an m > 0 such that | f (x)| ≥ m for all x ∈ [a. b] then there exist for every step functions. such that n ωi < . g ≤ f . b + c] and  Exercise 4. then it is necessarily continuous on a dense set of points (he also proved the opposite. s1 ≤ f and s2 ≥ f . . . and define the variation of f in [a.  Exercise 4. x2 ≤ b}.152 Chapter 4 integrable on [a + c. .

x) = inf{ f (y) : x < y < c} M(c. Comment: The theorem states that in a certain way the integral is an operation inverse to the derivative. b). Proof : We need to show that lim ∆F. b]. (44 hrs. For example. If we define m(c. For x > c. b] and define for every x ∈ [a. if f (x) = 0 everywhere except for the point x = 1 where it is equal to one.Integration theory 153 proof).c = f (c). c We are going to show that this holds for the right derivative. F(x) − F(c) = x c f. x) = sup{ f (y) : x < y < c}. then F is differentiable at c and F (c) = f (c).3 The fundamental theorem of calculus )) Let f be intex Theorem 4. Note the condition on f being continuous. the same arguments can then be applied for the left derivative. If f is continuous at c ∈ (a. 2011) 4. then F(x) = 0. The ultimate answer is due to Henri Lebesgue (1875-1941) who proved that a bounded function is integrable if and only if its points of discontinuity have measure zero (a notion not defined in this course). F(x) = a f. and it is not true that F (1) = f (1).11 (( grable on [a. .

a .c (x) ≤ M(c. (∀ > 0)(∃δ > 0) : (∀c < y < c + x)(| f (y) − f (c)| < /2). x) ≤ ∆F. Chapter 4 m(c. hence x f = F(x) = g(x) + c. It remains that the right-hand side as a limit zero at c. (∀ > 0)(∃δ > 0) : (∀c < y < c + δ)(| f (y) − f (c)| < /2). or m(c. I Corollary 4.c (x) − f (c) < . x) − f (y)|.2 (Newton-Leibniz formula) If f is continuous on [a. x) − f (c)| ≤ /2 < e and |M(c. We have also seen that it implies that F = g + c. x) − f (y)|} . x)(x − c). b] and f = g for some function g. x)(x − c) ≤ F(x) − F(c) ≤ M(c. then b f = g(b) − g(a). x) − f (c)| ≤ /2 < e. ∆F. Then. b]).154 then. To conclude. |M(c. Since f is continuous at c. ∆F.c (x) − f (c) ≤ max {|m(c. which concludes the proof. for some constant c. a Proof : We have seen that F(x) = a x f satisfies the identity F (x) = f (x) for all x (since f is continuous on [a. x). for every > 0 there exists a δ > 0 such that for all c < x < c + δ. For c < x < c + δ. hence |m(c.

(43 hrs. esin(t) dt. . such that inf{ f (x) : a ≤ x ≤ b} ≤ λ ≤ sup{ f (x) : a ≤ x ≤ b}. b) → R by h(x) F(x) = g(x) f.14 Let f : R → R be continuous. Setting x = b we get the desired result. I This is a very useful result. we have an easy way to calculate the integral of f . 0 x2 x  Exercise 4. b). b]. 1 2δ x+δ f. h : (a. 2009) (47 hrs. For every δ > 0 we define Fδ (x) = Show that for every x ∈ R. cos(t) dt. It means that whenever we know a pair of functions f. Œ Show that there exists a λ ∈ R. but no primitive function ( ) g is known. On the other hand.  Show that F is differentiable. + δ→0  Exercise 4. 2010)  Exercise 4.15 Let f.  Exercise 4.12 Let f : R → R be continuous and g. b] → R be integrable. x−δ f (x) = lim Fδ (x). it is totally useless if we want to calculate the integral of f . such that g = f . and calculate its derivative. such that g(x) ≥ 0 for all x ∈ [a. g.Integration theory 155 but since F(a) = 0 it follows that c = −g(a).13 Differentiate the following functions: Œ  Ž 1 0 ex cos(t2 ) dt. b) → R differentiable. Define F : (a. Œ Show that F is well-defined for all x ∈ (a. g : [a.

 Show that if f is continuous. If the difference is . ξ x a  Exercise 4. P). b]. b]. b]. then there exists a c ∈ [a. P) ≤ a f ≤ U( f. Show that there exists a ξ ∈ [a. b] → R be non-negative and integrable. b]. b] → R be integrable. 4.)  Exercise 4. b] → R be monotonic and let g : [a. Œ If f is differentiable at c then F is differentiable at c.156 and Chapter 4 b b ( f g) = λ a a g.16 Let f : [a. and F(x) = Prove or disprove: f.  If f is differentiable at c then F is continuous at c. c ∈ (a. b). it follows that if the difference between the upper and lower sum is less than . however we do not know any primitive function of f . Let f be a given integrable function on [a. such that b ξ b ( f g) = f (a) a a g + f (b) g. such that b b ( f g) = f (c) a a g. then their average differs from the integral by less that /2. (This is known as the integral mean value theorem. Suppose that we are content with an approximation of this integral within a maximum error of . Take a partition P: since b L( f. b b Hint: distinguish between the cases where a g > 0 and a g = 0.4 Riemann sums Let’s ask a practical question. and suppose we want to calculate the integral of f on [a.17 Let f : [a. Ž If f is differentiable in a neighborhood of c and f is continuous at c then F is twice differentiable at c.

Proof : It is enough to show that given > 0 there exists a δ > 0 such that |U( f. Its evaluation only requires function evaluations. If the partition is sufficiently fine. The sum in the middle is called a Riemann sum. P) < .12 Let f be an integrable function on [a. without any inf or sup. then n b f (ti )(xi−1 − xi ) − i=1 a f < . it follows that if in every interval in the partition we take some point ti (a sample point). such that n b > 0 there f (ti )(xi−1 − xi ) − i=1 a f < for every partition P such that diam P < δ and any choice of points ti . we can keep refining until they differ by less than . since mi ≤ f (x) ≤ Mi for all xi−1 ≤ x ≤ xi . What we are going to show is that Riemann sums converge to the integral as we refine the partition. then n L( f. For every exists a δ > 0. up to the fact that we have to compute lower and upper sums. such that U( f.Integration theory 157 greater than . P) − L( f. P) − L( f. then it can be reduced by refining the partition. and so we obtain an approximation to the integral within the desired accuracy. P)| < whenever diam P < δ. irrespectively on how we sample the points ti . b]. Definition 4. Note that we don’t know a priori whether a Riemann is greater or smaller than the integral. Infimums and supremums may be hard to calculate. . P). There is one drawback in this plan. Thus.5 The diameter of a partition P is diam P = max{xi − xi−1 : 1 ≤ i ≤ n}. Theorem 4. On the other hand. P) ≤ i=1 f (ti )(xi−1 − xi ) ≤ U( f.

. The contribution of these terms is at most /2. and take some partition P = {y0 . n δ< . 1] → R defined as ϕ : x → 1 1 − x2 . P) = i=1 (Mi − mi )(xi − xi−1 ) can be divided into two sums. y j ].6 The number π is defined as 1 π=2 −1 √ 1 − x2 dx. P )| < . U( f. yk } for which |U( f. √ we have a function ϕ : [−1. This concludes the proof. One for which each of the intervals [xi−1 .158 Chapter 4 for any Riemann sum with the same partition will differ from the integral by less than . . Definition 4. P ) − L( f.5 The trigonometric functions Discussion: Describe the way the trigonometric functions are defined in geometry. Let M = sup{| f (x)| : a ≤ x ≤ b}. xi ] is contained in an interval [y j−1 . I 4. There are at most k such terms and each contributes at most Mδ. Comment: We have used here the standard notation of integrals. and explain that we want definitions that only rely on the 13 axioms of real numbers. . . 2 We set 4Mk For every partition P of diameter less than δ. and we define π=2 −1 ϕ. . In our notation. P) − L( f. The second sum includes those intervals for which xi−1 < y j < xi .

and express the area of the sector bounded by this segment and the positive x-axis. 159 √ Consider now a unit circle. with its inverse defined on [0. A(−1) = π/2 and A(1) = 0. θ . Second. 1 A (x) = − √ . For −1 ≤ x ≤ 1 we look at the point (x. √ 1 x 1 − x2 + ϕ. First. . 2 or. It follows that it is invertible. This geometric construction is just the motivation.Integration theory This is well defined because ϕ is continuous. and that A(x) = θ/2. formally. hence integrable.7 We define the functions cos : [0.. connect it to the origin. 2 1 − x2 i. cos θ = A−1 2 This observation motivates the way we are going to define the cosine function. then we know that x = cos θ. A!x" x We now examine some properties of the function A. A(x) = 2 x If θ is the angle of this sector. we define a function A : [−1. 1] → R. π] → R by cos x = A−1 (x/2) and sin x = √ 1 − cos2 x.e. 1 − x2 ). Thus. π/2]. Definition 4. the function A is monotonically decreasing. π] → R and sin : [0. θ A(cos θ) = .

13 For 0 < x < π. We now need to show that the sine and cosine functions satisfy all known differential and algebraic properties. Since y A(cos y) = .160 Chapter 4 With these definitions in hand we define the other trigonometric functions. etc. sin x = cos x and cos x = − sin x. 2009) We are now in measure to start picturing the graphs of these two functions. (45 hrs. tan. 2 it follows that y = 2A(0) = 2 0 1 ϕ= π . 2 . I 1−(A−1 (x/2))2 The derivative of the sine function is then obtained by the composition rule. cos must vanish for some 0 < y < π. we have by the chain rule and the derivative of the inverse function formula. Proof : Since A is differentiable and A (x) 0 for 0 < x < π. by definition. Theorem 4. We also have cos π = A−1 (π/2) = −1 from which follows that sin π = 0 and sin 0 = 0. First. and cos 0 = A−1 (0) = 1. 1 1 1 1 cos x = (A−1 ) (x/2) = =− −1 (x/2)) 2 2 A (A 2 √ 2 1 1 = · · · = − sin x. cot. By the intermediate value theorem. It thus follows that cos is decreasing in this domain. sin x > 0 0 < x < π.

which implies that f is identically zero. It is easy to verify that the expressions for the derivatives of sine and cosine remain the same for all x ∈ R.2 Let f : R → R be twice differentiable. Then. Proof : Multiply the equation by 2 f . Since we defined the trigonometric functions (more precisely their inverses) via integrals. The conditions at zero imply that c = 0. I . π = 1. and cos x = cos(2π − x) for π ≤ x ≤ 2π. for some c ∈ R. cos π =0 2 and sin We can then extend the definition of the sine and cosine to all of R by setting sin x = − sin(2π − x) and sin(x + 2πk) = sin x and cos(x + 2πk) = cos x for all k ∈ Z. then 2 f f + 2 f f = [( f )2 + f 2 ] = 0. it is not surprising that it was straightforward to differentiate them.Integration theory where we have used the fact that by symmetry. 2 Thus. f is identically zero. with f (0) = f (0) = 0. 0 1 161 ϕ= −1 0 ϕ. from which we deduce that ( f )2 + f 2 = c. A little more subtle is to derive their algebraic properties. and satisfy the differential equation f + f = 0. Lemma 4.

Then.15 For every x. f + f = 0 and f (0) = sin y and It follows from the previous theorem that f (x) = sin y cos x + cos y sin x. I .14 Let f : R → R be twice differentiable. Proof : Fix y and define f : x → sin(x + y). Proof : Define g = f − a cos −b sin. hence f = a cos +b sin. sin(x + y) = sin x cos y + cos x sin y cos(x + y) = cos x cos y − sin x sin y. It follows from the lemma that g is identically zero. I Theorem 4. f (0) = cos y. with f (0) = a and f (0) = b. then.162 Chapter 4 Theorem 4. g + g = 0. from which follows that f : x → cos(x + y) f : x → − sin(x + y). Thus. We proceed similarly for the cosine function. and satisfy the differential equation f + f = 0. f = a cos +b sin . y ∈ R. and g(0) = g (0) = 0.

Integration theory

163

4.6

The logarithm and the exponential

Our (starting) goal is to define the function f : x → 10 x for x ∈ R, along with its inverse, which we denote by log10 . For n ∈ N we have an inductive definition, which in particular, satisfies the identity 10n 10m = 10n+m . We then extend the definition of the exponential (with base 10) to integers, by requiring the above formula to remain correct. Thus, we must set 100 = 1 and 10−n = 1 . 10n

The next extension is to fractional powers of the form 101/n . Since we want 101/n × 101/n × · · · × 101/n = 101 = 10,
n times

√ n this forces the definition 101/n = 10. Then, we extend the definition to rational powers, by 101/n × 101/n × · · · × 101/n = 10m/n .
m times

We have proved in the beginning of this course that the definition of a rational power is independent of the representation of that fraction. The question is how to extend the definition of 10 x when x is a real number? This can be done with lots of “sups” and “infs”, but here we shall adopt another approach. We are looking for a function f with the property that f (x + y) = f (x) f (y), and f (1) = 10. Can we find a differentiable function that satisfies these properties? If there was such a function, then f (x) = lim
h→0

f (h) − 1 f (x + h) − f (x) f (x) f (h) − f (x) = lim = lim f (x). h→0 h→0 h h h

164

Chapter 4

Suppose that the limit on the right hand side exists, and denote it by α. Then, f (x) = α f (x) and f (1) = 10.

This is not very helpful, as the derivative of f is expressed in terms of the (unknown) f . It is clear that f should be monotonically increasing (it is so for rational powers), hence invertible. Then, ( f −1 ) (x) = From this we deduce that
x

1 f ( f −1 (x))

=

1 α f ( f −1 (x))

=

1 . αx

f −1 (x) =
1

dt = log10 (x). αt

We seem to have succeeded in defining (using integration) the logarithm-base-10 function, but there is a caveat. We do not know the value of α. Since, in any case, the so far distinguished role of the number 10 is unnatural, we define an alternative logarithm function:

Definition 4.8 For all x > 0,
x

log(x) =
1

dt . t

0 < x < ∞.

Comment: Since 1/x is continuous on (0, ∞), this integral is well-defined for all

Theorem 4.16 For all x, y > 0,
log(xy) = log(x) + log(y).

Proof : Fix y and consider the function f : x → log(xy).

Integration theory Then, f (x) = Since log (x) = 1/x, it follows that log(xy) = log(x) + c. The value of the constant is found by setting x = 1. 1 1 ·y= . xy x

165

I

Corollary 4.3 For integer n,
log(xn ) = n log(x).

Proof : By induction on n.

I

Corollary 4.4 For all x, y > 0,
log x = log(x) − log(y). y

Proof : log(x) = log x x y = log + log(y). y y I

Definition 4.9 The exponential function is defined as
exp(x) = log−1 (x).

Comment: The logarithm is invertible because it is monotonically increasing, as
its derivative is strictly positive. What is less easy to see is that the domain of the exponential is the whole of R. Indeed, since log(2n ) = n log(2),

166

Chapter 4

it follows that the logarithm is not bounded above. Similarly, since log(1/2n ) = −n log(2), it is neither bounded below. Thus, its range is R.

Theorem 4.17

exp = exp .

Proof : We proceed as usual, exp (x) = (log−1 ) (x) = 1 = log−1 (x) = exp(x). log (log−1 (x)) I

Theorem 4.18 For every x, y,
exp(x + y) = exp(x) · exp(y).

Proof : Set t = exp(x) and s = exp(y). Thus, x = log(t) and y = log(s), and x + y = log(t) + log(s) = log(ts). Exponentiating both sides, we get the desired result. I

Definition 4.10
e = exp(1). There is plenty more to be done with exponentials and logarithms, like for example, show that for rational numbers r, er = exp(r), and the passage between bases. Due to the lack of time, we will stop at this point.

. in the sense that F = f .Integration theory 167 4. b f. trigonometric.7 Integration methods Given a function f . and substitution—both we will study. a question of practical interest is how to calculate integrals. we must know many pairs of functions f. F. then the answer is known. In this section we study techniques that in some cases allow to express a primitive function in terms of elementary functions. There are basically two techniques—integration by parts. a To effectively use this formula. for b b b b b ( f + g) = a a f+ a g and a (c f ) = c a f. b a f = F(b) − F(a) = F|b . which is a primitive of f . Functions that can be expressed in terms of powers. a If we know a function F. Here is a table of pairs f. F. exponential and logarithmic functions and called elementary. which we already know: f (x) 1 xn cos x sin x 1/x exp(x) sec2 (x) 1/(1 + x2 ) √ 1/ 1 − x2 F(x) x n+1 x /(n + 1) sin x − cos x log(x) exp(x) tan(x) arctan(x) arcsin(x) def Sums of such functions and multiplication by constants are easily treated with our integration theorems.

we have that Primitive( f g ) = f Primitive(g ) − Primitive( f Primitive(g)) + const. Example: Consider the function F(x) = x arctan(x) − It is easily checked that f (x) = F (x) = arctan(x). It is however not always the case that the primitive of an elementary function is itself and elementary function. setting b = x and noting that primitives are defined up to constants. Theorem 4. a a The question is how could have we guessed this answer if the function F had not been given to us. b 1 log(1 + x2 ).168 Chapter 4 Comment: Every continuous function f has a primitive. x F(x) = a f. g on the righthand side may be any primitive of g . f and g are given. Then. where a is arbitrary. then b b )) Let f. g have con- f g = ( f g)|b − a a a f g. Comment: When we use this formula. Also. . 2 arctan(x) dx = F|b . As a result of this observation.19 (Integration by parts ( tinuous first derivatives.

a Example: The back-to-original-problem-trick b (1/x) log(x) dx. a Theorem 4. a Once we have used this method to get new f. F pairs. f (a) (g ◦ f ) f = a Proof : Let G be a primitive of g. b f (b) )) For continuously differg. a Example: The 1-trick: b log(x) dx. then (G ◦ f ) = (g ◦ f ) f . . we solve b log2 (x) dx.Integration theory 169 Example: b x e x dx.20 (Substitution formula ( entiable f and g. we use this knowledge to obtain new such pairs: Example: Using the fact that a primitive of log(x) is x log(x) − x.

170 hence a Chapter 4 b b f (b) (g ◦ f ) f = a (G ◦ f ) = (G ◦ f )|b = G| ff (b) = a (a) g. f (a) I The simplest use of this formula is by recognizing that a function is of the form (g ◦ f ) f . hence b sin(b) sin5 (x) cos(x) dx = a sin(a) t5 dt = t6 6 sin(b) . a If f (x) = sin(x) and g(x) = x5 . Example: Take the integral b sin5 (x) cos(x) dx. sin(a) (46 hrs. 2009) . then sin5 (x) cos(x) = g( f (x)) f (x).

Comment: Let a : N → R. . . The essential fact is that to each natural number corresponds a real number. Rather than writing a(3). for example. Notation: We usually denote a sequence a : N → R by (an ).1 Basic definitions An (infinite) sequence ( ) is an infinite ordered set of real numbers. n=1 Examples: Œ a : n → n or an = n. Comment: As for functions. we often refer to “the sequence 1/n”. . a1 . where we often refer to “the function x2 ”. a2 . . a(17). we write a3 . a3 . . Another ubiquitous notation is (an )∞ . We denote sequences by a letter which labels the sequence. Having said that.Chapter 5 Sequences 5.1 An infinite sequence is a function N → R. and a subscript which labels the position in the sequence. a17 . we can define: Definition 5.

a “cleaner” notation for the limit of a sequence (an ) would be lim a. . ∞ . In the first and fourth cases we would say that “the sequence goes ( ) to infinity”. 11. More useful often is to mark the elements of the sequence on the number axis.172 Chapter 5  b : n → (−1)n or bn = (−1)n . the sequence of primes. . 5. ¯ ) if it converges to some ) (there is no such thing Comment: There is a close connection between the limit of a sequence and the limit of a function at infinity. Ž c : n → 1/n or cn = 1/n. 7. Like functions. otherwise it is called divergent ( as convergence to infinity). In the third case we would say that “it goes to zero”.2 A sequence (an ) converges ( n→∞ if (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − | < ). 3. Definition 5. Definition 5. sequences can be graphed.2 Limits of sequences ¯ ) to . We will make such statements precise momentarily. We will elaborate on that later on. }.  (dn ) = {2. denoted lim an = . . Notation: Like for functions. In the second case we would say that “it jumps between −1 and 1”. a1 a5 a3 a2 a4 5.3 A sequence is called convergent ( (real) number.

One way is by an algebraic trick. Example: Let a : n → n + 1 − n. . . 4n3 − 8n + 63 . ∞ since. . . 2 ξ for some x < ξ < y. Example: Let now a:n→ 3n3 + 7n2 + 1 . 3 − 2. . . The same inequality can be obtained by defining √ a function f : x → x. √ y−x √ y− x= √ . however showing it requires some manipulations. and using the mean-value theorem. . 2 3 4 It is easy to show that lim a = 0.Sequences 173 Example: Let a : n → 1/n. Substituting x = n and y = n + 1. . (∀ > 0)(set N = 1/ ) then (∀n > N) |an | = 1 1 < < n N . or (an ) = It seems reasonable that lim a = 0. n+1+ n n+1+ n 2 n from which it is easy to proceed. . 4 − 3. . √ √ √ √ ( n + 1 − n)( n + 1 + n) 1 1 an = = √ √ √ √ < √ . or 1 1 1 (an ) = 1. ∞ √ √ √ √ √ √ √ √ 2 − 1. we obtain the desired inequality.

n n4 +1 . and ∞ ∞ ∞ lim(a + b) = lim a + lim b lim(a · b) = lim a · lim b. n3 −3 √ n2 + 1 − n.1 For each of the following sequence determine whether it is converging or diverging.  Exercise 5. Then the sequences (a + b)n = an + bn and (a · b)n = an bn are also convergent. 4 − 8/n2 + 63/n3 and use limit arithmetic as stated below. proving it is not totally straightforward. Theorem 5. we do not care if a finite number of elements is not defined. b lim∞ b Comment: Since the limit of a sequence is only determined by the “tail” of the sequence. ∞ ∞ ∞ If furthermore lim∞ b 0. Prove your claim using the definition of the limit: Œ an =  an = Ž an = √ 7n2 + n 2 +3n . the sequence (a/b) converges and lim ∞ 0 for all a lim∞ a = .1 Let a and b be convergent sequences. One way to proceed is to note that 3 + 7/n + 1/n3 an = .174 It seems reasonable that Chapter 5 3 lim a = . Proof : Easy once you’ve done it for functions. ∞ 4 but once again. I . then there exists an N ∈ N such that bn n > N.

| f (x) − | ≤ max |a x − |. ∞ We now establish the analogy between limits of sequences and limits of functions. |a x − | < . ∞ then lim(a + b) = L + M. and this is in particular true for x ∈ N. I Conversely. then. Conversely. such that lim a = L ∞ and lim b = M. Let a be a sequence. ∞) → R by f (x) = an + (an+1 − an )(x − n) n ≤ x ≤ n + 1. ∞ Proof : Suppose first that the sequence converges. f (x) lies between f ( x ) and f ( x ). Thus. it follows that for all x > N. .1 lim a = ∞ if and only if lim f = . Since for every x.2 Show that if a and b are converging sequences. (∀ > 0)(∃N ∈ N) : (∀x > N)(| f (x) − | < ). Proposition 5. If f (x) converges as x → ∞. given a function f : R → R. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − | < ). hence (∀ > 0)(∃N ∈ N) : (∀x > N)(| f (x) − | < ). and define a function f : [1. then so does the sequence (an ). it is often useful to define a sequence an = f (n). if f converges to as x → ∞.Sequences 175  Exercise 5.

and an = f (n). Theorem 5. Since log a < 0 it follows that lim x→∞ f (x) = 0.176 Chapter 5 Example: Let 0 < a < 1. Proof : Suppose first that lim f = . Then n→∞ lim an =? f (x) = e x log a . ∞ where the composition of a function and a sequence is defined by [ f (a)]n = f (an ).2 Let f be defined on a (punctured) neighborhood of a point c. such that an lim a = c. we may define convergence of sequences to ±∞. for every sequence a that converges to c. and then lim f (a) = . Comment: Like for functions. then Comment: This theorem provides a characterization of the limit of a function at a point in terms of sequences. c If a is a sequence whose entries belong to the domain of f . only that we don’t use the term convergence. Suppose that lim f = . This is known as Heine’s characterization of the limit. if lim∞ f (a) = limc f = . c . Conversely. ∞ c. hence the sequence converges to zero. then We define f (x) = a x . but rather say that a sequence approaches ( ) infinity.

(∃ > 0) : (∀n ∈ N)(∃an : 0 < |an − c| < δn ) : (| f (an ) − | ≥ ). namely. Suppose. (∀ > 0)(∃δ > 0) : (∀x : 0 < |x − c| < δ)(| f (x) − | < ). 177 Let a be a sequence as specified in the theorem. Conversely.. In particular. This means that (∃ > 0) : (∀δ > 0)(∃x : 0 < |x − c| < δ) : (| f (x) − | ≥ ). ∞ (48 hrs. ∞ it follows that lim a = sin(13). i. which contradicts our assumptions. (∀ > 0)(∃N ∈ N) : (∀n > N)(| f (an ) − | < ). where bn = 13 + 1/n2 .e. either the limit does not exist or it is not equal to c). 2009) . an c and lim∞ a = c. I Example: Consider the sequence a : n → sin 13 + 1 . n2 This sequence can be written as a = sin(b).. Then (∀δ > 0)(∃N ∈ N) : (∀n > N)(0 < |an − c| < δ). suppose that for any sequence a as above lim∞ f (a) = . setting δn = 1/n. The sequence a converges to c but f (a) does not converge to . lim∞ f (a) = .e. Combining the two.Sequences Thus. by contradiction that lim f c (i. Since lim sin = sin(13) 13 and lim b = 13.

α + 1). |an | ≤ M Since b converges to zero. (∃N ∈ N) : (∀n > N)(α − 1 < an < α + 1). We need to show that there exists L1 . Set M = max{an : 1 ≤ n ≤ N} m = min{an : 1 ≤ n ≤ N}. Theorem 5. namely.178 Chapter 5 Theorem 5. Then n→∞ lim (ab) = 0. . (∀ > 0)(∃N ∈ N) : (∀n > N) (|an bn | ≤ M|bn | < ) . min(m.3 A convergent sequence is bounded.4 Let a be a bounded sequence and let b be a sequence that converges to zero. α − 1) ≤ an ≤ max(M. I M . Proof : Let M be a bound on a. (∀ > 0)(∃N ∈ N) : (∀n > N) |bn | < Thus. ∀n ∈ N. setting = 1. Proof : Let a be a sequence that converges to a limit α. By definition. (∃N ∈ N) : (∀n > N)(|an − α| < 1). L2 such that L1 ≤ an ≤ L2 ∀n ∈ N. which implies that the sequence ab converges to zero. or. I then for all n.

a Proof : By definition. I Theorem 5. ∞ Then lim ∞ 1 = 0.Sequences 179 Example: The sequence a:n→ converges to zero.6 Let a be a sequence such that lim |a| = ∞. which concludes the proof. which concludes the proof. (∀M > 0)(∃N ∈ N) : (∀n > N)(0 < |an | < 1/M). ∞ Then lim ∞ 1 = ∞. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an | > 1/ ). I . hence (∀ > 0)(∃N ∈ N) : (∀n > N)(0 < 1/|an | < ).5 Let a be a sequence with non-zero elements. such that lim a = 0. sin n n Theorem 5. |a| Proof : By definition. hence (∀M > 0)(∃N ∈ N) : (∀n > N)(1/|an | > M).

I Corollary 5. then we still only have α ≤ β. Then there exists an N ∈ N. Proof : By contradiction.8 Suppose that a and b are convergent sequences. Then α ≤ β. Even though an < bn for all n. ∞ and there exists an N ∈ N. the sequence b is eventually greater (term-by-term) than the sequence a. 2 Thus. such that an ≤ bn for all n > N. β ∈ R. such that bn > an for all n > N. I Comment: If instead an < bn for all n > N..e. (∃N1 ∈ N) : (∀n > N1 ) an < α + 1 (β − α) . Theorem 5.180 Chapter 5 Theorem 5. Proof : Since (β − α)/2 > 0. i.1 Let a be a sequence and α. both converge to the same limit. ∞ and β > α. . 2 and (∃N2 ∈ N) : (∀n > N2 ) bn > β − 1 (β − α) . using the previous theorem. (∃N1 .7 Suppose that a and b are convergent sequences. Take for example the sequences an = 1/n and bn = 2/n. N2 )) (bn − an > 0) . If the sequence a converges to α and α > β then eventually an > β. lim a = α ∞ and lim b = β. N2 ∈ N) : (∀n > max(N1 . lim a = α ∞ and lim b = β.

N2 ))(− < an − ≤ cn − ≤ bn − < ).Sequences 181 Theorem 5. (∀ > 0)(∃N1 . N2 ∈ N) : (∀n > max(N1 . 1< it follows that n→∞ 1 + 1/n < 1 + 1/n. x→1 n→∞  Exercise 5. Proof : It is given that (∀ > 0)(∃N1 ∈ N) : (∀n > N1 )(|an − | < ). or (∀ > 0)(∃N ∈ N) : (∀n > N)(|cn − | < ). ∞ for all n > N.3 . 1 + 1/n = 1.9 (Sandwich) Suppose that a and b are sequences that converge to the same limit . Thus. and (∀ > 0)(∃N2 ∈ N) : (∀n > N2 )(|bn − | < ). Let c be a sequence for which there exists an N ∈ N such that an ≤ cn ≤ bn Then lim c = . lim We can also deduce this from the fact that √ lim x = 1 and lim (1 + 1/n) = 1. I Example: Since for all n.

and ab is bounded but does not converge. b is unbounded. It ≥ an for all n. such that a converges to zero. the order in the set does not matter. We define similarly a Theorem 5. . I Example: Consider the sequence 1 a:n→ 1+ n n . and in particular. i. (Note that a supremum is a property of a set.. b is unbounded. and ab converges to a limit other then zero.4 A sequence a is called increasing ( is called non-decreasing ( ) if an+1 decreasing and a non-increasing sequence.10 (Bounded + monotonic = convergent) Let a be a non-decreasing sequence bounded from above. and ab converges to zero.) By the definition of the supremum. Then it is convergent. b is unbounded. Ž Find sequences a and b. such that a converges to zero. such that a converges to zero.e. (∀ > 0)(∃N ∈ N) : (∀n > N)(α − < aN ≤ an ≤ α).  Find sequences a and b.182 Chapter 5 Œ Find sequences a and b. ) if an+1 > an for all n. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − α| < ). Since the sequence is non-decreasing. Definition 5. (∀ > 0)(∃N ∈ N) : (α − < aN ≤ α). Proof : Set α = sup{an : n ∈ N}.

i. . 1 The binomial formula is (a + b)n = n k=0 n k n−k ab . Definition 5. namely an = n. both the number of elements grows as well as their values. . we claim that this sequence is increasing. nk = 2k. It follows that n 1 exists lim 1 + n→∞ n (and equals to 2. such that n1 < n2 < · · · .Sequences We will first show that an < 3 for all n.e. an2 .718 · · · ≡ e). Indeed1 . as an = 1 + 1 + 1 1 1 1 1− + 1− 2! n 3! n 1− 2 + ··· . . ) of a is any More formally. A subsequence ( sequence an1 .. such that bk = ank for all k ∈ N. b is a subsequence of a if there exists a monotonically increasing sequence of natural numbers (nk ). k . Example: Let a by the sequence of natural numbers. . 1 1+ n n 183 n 1 n 1 n 1 + + ··· + 2 1 n 2 n n nn n(n − 1) n! =1+1+ + ··· + 2 2! n n! nn 1 1 ≤ 1 + 1 + + ··· + 2! n! 1 1 ≤ 1 + 1 + + · · · + n−1 < 3. n When n grows. 2 2 =1+ Second. The subsequence b of all even numbers is bk = a2k .5 Let a be a sequence.

then the subsequence ank is decreasing.1 Any sequence contains a subsequence which is either non-decreasing or non-increasing. There are finitely many peak points: Then let n1 be greater than all the peak points. there exists an n2 . Definition 5. (50 hrs. 2009) In many cases. Since it is not a peak point. a peak poin! a peak poin! ) of the There are now two possibilities.6 A sequence a is called a Cauchy sequence if (∀ > 0)(∃N ∈ N) : (∀m. .2 (Bolzano-Weierstraß) Every bounded sequence has a convergent subsequence. such that an2 ≥ an1 .184 Chapter 5 Lemma 5. We will now provide such a convergence criterion. Continuing this way. n > N)(|an − am | < ). Proof : Let a be a sequence. we obtain a non-decreasing subsequence. Let’s call a number n a peak point ( sequence a if am < an for all m > n. There are infinitely many peak points: If n1 < n2 < · · · are a sequence of peak points. we would like to know whether a sequence is convergent even if we do not know what the limit is. I Corollary 5.

This concludes the proof. which proves that the sequence is bounded. Cauchy assumed that sequences that get eventually arbitrarily close converge. (∃N ∈ N) : (∀n > N)(|an − aN+1 | < 1). 2009) There is something amusing about calling sequences satisfying this property a Cauchy sequence. Proof : One direction is easy2 . it follows that a has a converging subsequence. 2 I . n > N)(|an − am | < /2). By the Bolzano-Weierstraß theorem.Sequences 185 Theorem 5. We first show that the sequence is bounded. Denote this subsequence by b with bk = ank and its limit by β. Combining the two. without being aware that this is something that ought to be proved. (∀ > 0)(∃N. i. n > N)(|an − am | ≤ |an − α| + |am − α| < ).e.. We will show that the whole sequence converges to β.11 A sequence converges if and only if it is a Cauchy sequence. hence (∀ > 0)(∃N ∈ N) : (∀m. and whereas by the convergence of the sequence b. (51 hrs. If a sequence a converges to a limit α. by the Cauchy property (∀ > 0)(∃N ∈ N) : (∀m. Indeed. (∀N ∈ N)(∃K ∈ N : nK > N) : (∀k > K)(|bk − β| < /2). Suppose now that a is a Cauchy sequence. K ∈ N : nK > N) : (∀n > N)(|an − β| ≤ |an − bk | + |bk − β| < ). then (∀ > 0)(∃N ∈ N) : (∀n > N)(|an − α| < /2). the sequence is a Cauchy sequence. Taking = 1.

1−q Comment: Note the terminology: sequence is summable = series is convergent. In this case we write ∞ a ¯ ) if the sequence of par- ak = lim S a . Note that S does not depend on the order of summation (a finite summation). Let a be a geometric sequence an = qn |q| < 1. Definition 5. . Often. hence this geometric sequence is summable and ∞ qn = n=1 q . We define the sequence of partial sums ( a by n a Sn = k=1 a ak . k=1 ∞ def The sequence of partial sum is called a series ( also to the limit of the sequence of partial sums.3 Infinite series ¯ ) of Let a be a sequence. for which we have an explicit expresion: 1 − qn a Sn = q 1−q This series converges. If the sequence S a converges. the term series refers Example: The canonical example of a series is the geometric series. The geometric series is the sequence of partial sums.186 Chapter 5 5. ). we will interpret its limit as the infinite sum of the sequence.7 A sequence a is called summable ( tial sums S converges.

I An important question is how to identify summable sequences. Proof : This is just a restatement of Cauchy’s convergence criterion for sequences. and the rest follows from limits arithmetic. even in cases where the sum is not known. such that |an+1 + an+2 + · · · + am | < for all m > n > N.12 Let a and b be summable sequences and γ ∈ R. For example. and ∞ ∞ (γan ) = γ n=1 n=1 an . Then the sequences a + b and γa are summable and ∞ ∞ ∞ (an + bn ) = n=1 n=1 an + n=1 bn . but a a S m − S n = an+1 + an+2 + · · · + am . n a+b Sn = k=1 a b (ak + bk ) = S n + S n . We know that the series converges if and only if a a (∀ > 0)(∃N ∈ N) : (∀m > n > N)(|S m − S n | < ). I . Theorem 5.Sequences 187 Theorem 5. This brings us to discuss convergence criteria. Proof : Very easy.13 (Cauchy’s criterion) The sequence a is summable if and only if for every > 0 there exists an N ∈ N.

b (∃M > 0) : (∀n ∈ N) 0 ≤ S n ≤ M . 2 3 4 5 6 7 8 ≥1/2 ≥1/2 This grouping of terms shows that the sequence of partial sums is unbounded. which therefore converges. Proof : Consider the sequence of partial sums. Convergence implies boundedness.14 A non-negative sequence is summable if and only if the series (i. an = 1/n. which is monotonically increasing. I The vanishing of the sequence is only a necessary condition for its summability. namely..3 If a is summable then lim a = 0. Monotonicity and boundedness imply convergence. For a while we will deal with non-negative sequences.e. The bound M bounds the series S a . ∞ Proof : We apply Cauchy’s criterion with m = n + 1. Proof : Since b is summable then the corresponding series S b is bounded. (∀ > 0)(∃N ∈ N) : (∀n > N)(|an+1 | < ). then the sequence a is summable. That is. I . The reason will be made clear later on. It is not sufficient as proves the harmonic series.15 (Comparison test) If 0 ≤ an ≤ bn for every n and the sequence b is summable. the sequence of partial sums) is bounded.188 Chapter 5 Corollary 5. a Sn = 1 + 1 1 1 1 1 1 1 + + + + + + +··· . Theorem 5. I Theorem 5.

2n 2 −2 2 so again. . based on the comparison test and the convergence of the geometric series we conclude that the series of a converges.Sequences 189 Example: Consider the following sequence: an = Since 2 + sin3 (n + 1) . Then a is summable if and only if b is summable. ≤ an ≤ n n ≥ 2. This can be shown by the following inequality an = 1 n+1 = 2 ≤ an . 0 < an ≤ Example: Consider now the sequence an = Here we have 2n 1 . it follows that the series of a converges. 2n and the series associated with b converges. − 1 + sin3 (n2 ) 1 2 1 ≤ n. 2n + n2 3 def = bn . 2009) Theorem 5. 0< Example: The next example is n+1 . The following theorem provides an even easier way to show that. since an “roughly behaves” like 1/n. such that lim ∞ a =γ b 0. n2 + 1 It is clear that the corresponding series diverges.16 (Limit comparison) Let a and b be positive sequences. (53 hrs. n n +n which proves that the series S a is unbounded.

b n→∞ n + 1 hence the series of a diverges. γ bn it follows that (∃N ∈ N) : (∀n > N) or (∃N ∈ N) : (∀n > N) (an < 2γbn ) . Now. setting bn = 1/n we find lim ∞ n2 + n a = lim 2 = 1. . The roles of a and b can then be interchanged. n→∞ an (i) If r < 1 then (an ) is summable.17 (Ratio test ( lim )) Let an > 0 and suppose that an+1 = r. γb an <2 . aN+1 ≤ s aN . n N n ak = k=1 k=1 ak + a N k=N+1 sk−N . a Sn n n = Sa N + k=N+1 ak < Sa N + 2γ k=N+1 b bk ≤ S a + 2γS n . an In particular. (∃s : r < s < 1)(∃N ∈ N) : (∀n > N) an+1 ≤s . Proof : Suppose first that r < 1. Proof : Suppose that b is summable. Since lim ∞ a = 1. then a is summable.190 Chapter 5 Example: Back to the previous example. I Theorem 5. Thus. N Since the right hand sides is uniformly bounded. Then. aN+2 ≤ s2 aN . and so on. (ii) If r > 1 then (an ) is not summable.

Note that x x→∞ lim 1 dt =∞ t x whereas x→∞ lim 1 dt < ∞. t2 Indeed. n These examples motivate. Conversely. the sequence (an ) is not even bounded (and certainly not summable). Then a 3 ∞ f = lim def x x→∞ f. . b] for some a and b > a. as show the two examples3 . ∞ n=1 1 <∞ n2 ∞ and n=1 1 = ∞.Sequences 191 The second term on the right hand side is a partial sum of a convergent geometric series. hence the partial sums of (an ) are bounded. i. there is a close relation between the convergence of the series and the convergence of the corresponding integrals. another comparison test. an = Comment: What about the case r = 1 in the above theorem? This does not provide any information about convergence.e. whereas the sequence bn = 1/n is not. which proves their convergence. then (∃s : 1 < s < r)(∃N ∈ N) : (∀n > N) an+1 ≥s .8 Let f be integrable on any segment [a. if r > 1. a This notation means that the sequence an = 1/n2 is summable. Definition 5. I Examples: Apply the ratio test for 1 n! rn bn = n! cn = n r n . however. an from which follows that an ≥ sn−N aN for all n > N..

Example: Consider the series ∞ n=1 1 np for p > 0. hence so does the series. Proof : The proof hinges on showing that x x x ak ≤ k=2 1 f ≤ k=1 ak . . p  1 t This integral converges as x → ∞ if and only if p > 1. Sequences of non-fixed sign are a different story.192 Chapter 5 Theorem 5.18 (Integral test) Let f be a positive decreasing function on [1. 2009) Example: One can take this further and prove that ∞ n=1 1 n(log n) p converges if and only if p > 1. and so does ∞ n=1 1 . (55 hrs. Then ∞ n=1 an converges if and only if ∞ 1 f exists. This is known as Bertrand’s ladder. and consider the integral  x  p=1 dt log x  = (1 − p)−1 (x−p+1 − 1) p 1. n log n(log log n) p and so on. I TO COMPLETE. ∞) and set an = f (n). We apply the integral text. The case of non-positive sequences is the same as we can define bn = (−an ). We have dealt with series of non-negative sequences.

if and only if the two series formed by its positive elements and its negative elements both converge. n 2 4 6 . Define now4  a  n  a+ =  n 0  4 an > 0 an ≤ 0 and  a  n  a− =  n 0  an < 0 . for every > 0 there exists an N ∈ N such that |an+1 + · · · + am | ≤ |an+1 | + · · · + |am | < whenever m. Indeed. 0.19 Every absolutely convergent series is convergent. − . Also. . n > N. A series that converges but does not converge absolutely is called conditionally convergent ( ¯ ). − . 2 3 4 Example: The series 1 1 1 + − + ··· . 0. − . It is a well-known alternating series. 0. 1− Theorem 5. Proof : The fact that an absolutely convergent series converges follows from Cauchy’s criterion. It is known as the Leibniz series. · · · . 3 5 7 converges conditionally as well. 0. then 1 1 a+ = 1. which Leibniz showed (without any rigor) to be equal to π/4. a series is absolutely convergent. 1 1 1 + − + ··· . if an = (−1)n+1 /n. 0. . · · · n 3 5 1 1 1 a− = 0. an ≥ 0 For example.Sequences 193 an is said to be absolutely convergent ( ¯ ) if |an | converges. ∞ n=1 Definition 5.9 A series ∞ n=1 Example: The series 1− converges conditionally.

an = a+ + a− . 2009) I Theorem 5.20 (Leibniz) Suppose that an is a non-increasing sequence of nonnegative numbers. n n n n a+ = n Thus. n hence if the two series of positive and negative terms converge so does the series of absolute values. Indeed. Proof : Simple considerations show that s2 ≤ s4 ≤ s6 ≤ · · · ≤ s5 ≤ s3 ≤ s1 . On the other hand. and conversely. a1 ≥ a2 ≥ · · · ≥ 0. Then the series ∞ (−1)n+1 an = a1 − a2 + a3 − a4 + · · · n=1 converges. s2n ≤ s2n + (a2n+1 − a2n+2 ) = s2n+2 = s2n+1 − a2n+2 ≤ s2n+1 . 2 |ak | = k=1 k=1 a+ − n k=1 a− . i. and n→∞ lim an = 0.e. . (56 hrs. n a+ = n k=1 n 1 2 1 2 n an + k=1 n 1 2 1 2 n |an | k=1 n a− = n k=1 an − k=1 |an | k=1 so that the other direction holds as well.194 Chapter 5 Then.. 1 (an + |an |) 2 n and n a− = n n 1 (an − |an |) . |an | = a+ − a− .

n→∞ n→∞ however a2n+1 converges to zero. n→∞ lim bn = b and n→∞ lim cn = c. N2 ∈ N) : (∀n > max(2N1 + 1. hence b = c. By limit arithmetic n→∞ lim (cn − bn ) = lim (s2n+1 − s2n ) = lim a2n+1 = c − b. ) of a sequence an is a sequence Theorem 5. but only a sketch. where f : N → N is one-to-one and onto. 2N2 ))(|sn − b| < ). I Definition 5. It follows that both subsequences converge.10 A rearrangement ( bn = a f (n) . ∞ n=1 Proof : I will not give a formal proof. i. whereas the sequence cn = s2n+1 is non-increasing and bounded from below. the sequence bn = s2n is non-decreasing and bounded from above.21 (Riemann) If an is conditionally convergent then for every α ∈ R there exists a rearrangement of the series that converges to α. (∀ > 0)(∃N1 .e.. It remains to show that if both the odd and even elements of a sequence converge to the same limit then the whole sequence converges to this limit.Sequences 195 Thus. Indeed. (∀ > 0)(∃N1 ∈ N) : (∀n > N1 )(|s2n+1 − b| < ) and (∀ > 0)(∃N2 ∈ N) : (∀n > N2 )(|s2n − b| < ). I .

.. 1 S = 2 ∞ < S < 1. 2 3 2 5 7 4 which directly proves that a rearrangement of the alternating harmonic series converges to three halves of its value. In this example the rearrangement is b3n+1 = a4n+1 b3n+2 = a4n+3 b3n = a2n . As m → ∞ this sequence converges to |an − a|. Then. 2n 2 4 6 8 10 which we can also pad with zeros. By limit n=1 (−1)n+1 1 1 1 1 1 = − + − + + ··· . n 2 3 4 5 1 2 We proved that this series does indeed converge to some arithmetic for sequences (or series). Think now of the left hand side as a sequence with index m with n fixed.196 Chapter 5 Example: Consider the series ∞ S = n=1 (−1)n+1 1 1 1 1 = 1 − + − + + ··· . (∀ > 0)(∃N ∈ N) : (∀m. 1 1 1 1 1 1 S =0+ +0− +0+ +0− +0+ + ··· . 1 1 1 1 1 3 S = 1 + − + + − + . Comment: Here is an important fact about Cauchy sequences: let (an ) be a Cauchy sequence with limit a.. This implies that |an − a| ≤ for all n > N.. n > N)(|an − am | < ). 2 2 4 6 8 10 Using once more sequence arithmetic.

. since the sequence is absolutely convergent. the order of summation does not matter in an absolutely convergent series.Sequences 197 Theorem 5. ∞ |tm − sN | = k>N. For all m > M. 2 Take now M be sufficiently large. and ∞ ∞ ∞ n=1 ∞ n=1 bn an = n=1 n=1 bn . In other word. aN appears among the b1 . . . Proof : Let bn = a f (n) and set n n sn = k=1 ak ∞ n=1 and tn = k=1 bk . so that each of the a1 . we can choose N large enough such that ∞ N ∞ |ak | − k=1 k=1 |ak | = k=N+1 |ak | < .e.. ∞ This proves that m→∞ lim tm = k=1 ak . . . Let > 0 be given. Since an converges. . then there exists an N such that ∞ sN − k=1 ak < . b M . 2 Moreover. . 2 i. f (k)≤m ak ≤ k=N+1 |ak | < . . ∞ ∞ ak − tm ≤ k=1 k=1 ak − sN + |sN − tm | ≤ . .22 If an converges absolutely then any rearrangement converges absolutely.

such that  m   n  n   m              |a |  |b | −  |a |  |b | <           k  k        2 k=1 =1 k=1 =1 .e. Then  ∞  ∞  ∞          cn =  an   bk  .e. i. I Absolute convergence is also necessary in order to multiply two series. i. a set that can be ordered in different ways..       n=1 n=1 k=1 ∞ n=1 Proof : Consider the sequence     sn =    ∞ k=1     |ak |    ∞ =1     |b | . k. Theorem 5.23 Suppose that an and ∞ bn converge absolutely. Consider the product  ∞  ∞       a   b  = (a + a + · · · )(b + b + · · · ). That ∞ bn converges absolutely follows form the fact that |bn | is a rearrangement n=1 of |an | and the first part of this proof.. Chapter 5 ∞ ∞ bn = n=1 n=1 an . ∞ (an bk ).e. which implies that it is a Cauchy sequence.198 i.   This sequence converges as n → ∞ (by limit arithmetic).n=1 except that these numbers to be added form a “matrix”. For every > 0 there exists an N ∈ N. and that cn n=1 is any sequence that contains all terms an bk .       n  k 1 2 1 2    n=1 k=1 This product seems to be a sum of all products an bk ..

 N  N  M          cn −  ak   b        n=1 k=1 =1 only consists of terms ak b with either k or greater than N.. By the comment above.  m  m         |a |  |b | −            k      k=1 =1 199 ∞ k=1     |ak |    ∞ =1     |b | ≤   2 for all m > N. .Sequences for all m. Thus. n > N. I.   I for all m > M. Let now (cn ) be a sequence comprising of all pairs an b . hence for all m > M.e. There exists an M sufficiently large.  N  N  m          cn −  ak   b  ≤ |ak ||b | < . where cm = ak b . which proves the desires result. |ak ||b | ≤ k or >N 2 < .       n=1 k=1 =1 k or >N Letting N → ∞. such that m > M implies that either k or is greater than N. m n=1     cn −    ∞ k=1     ak     ∞ =1     b ≤ .

200 Chapter 5 .